Phase one of Unlocking Thesis Data is complete. You can read about the project's work over the past three months in our report:
Grace, Stephen and Whitton, Michael and Gould, Sara and Kotarski, Rachael Mapping the UK thesis landscape: Phase 1 project report for Unlocking Thesis Data. Project Report. University of East London, London. (10.15123/PUB.4307).
The report covers the background of community interest which led to the project, and analyses the survey responses from EThOS contacts we previously mentioned in the blog. It then summarises the six institutional case studies looking at thesis-related processes in detail at a range of universities (East London, Southampton, LSE, UAL, Bristol and Leicester). The case studies showed a wide variety of approaches to processing theses and making them available, and this insight will help us ensure that we consider solutions that work for the widest possible range of universities. Each of the case studies required interviews with staff involved in processing theses, carried out by a combination of Michael Whitton (University of Southampton), Sara Gould (The British Library), Rachael Kotarski (The British Library) and me. Many thanks to Michael, Sara and Rachael for working with me on the project.
Subject to further Jisc funding, we hope in the next phase of Unlocking Thesis Data to address the following five recommendations from the report:
- Hold at least three thesis clinics to investigate opportunities and barriers to assigning DOI and ORCiD identifiers in UK universities
- Engage with system suppliers/vendors to identify opportunities for enhancing software with required PIDs
- Consult with EThOS formally to understand what needs to change in EThOS systems and processes to harvest and display PIDs and related metadata for theses and their data
- Evaluate approaches to updating UKETD profile, initially in EPrints, before planning software enhancements
- Investigate requirements and solutions for those institutions that use EThOS as their first-point repository
You can find links to all the case studies, the survey and the phase one report at http://dx.doi.org/10.15123/PROJECT.15.
The UEL case study, written by Michael Whitton and Sara Gould, is now available at this DOI: 10.15123/PUB.4301. UEL currently require both print and electronic versions of theses. They currently have separate publications (ROAR) and data (data.uel) repositories, both using the EPrints software.
- DOIs for Theses can be minted in ROAR, using the DataCiteDOI plugin in the EPrints Bazaar.
- Using DOIs incorporating the Student number (e.g. 10.15123/THESIS.123456) would allow the DOI to be included in the Thesis itself.
- These could be assigned when the student registers, but only activated on publication of the thesis in ROAR.
- DOIs for Data are currently being minted in data.uel using the same plugin. The Repository links plugin in the EPrints Bazaar allows the Thesis to be linked to the associated data.
- ORCiDs can be promoted shortly after the student registers, so that students can use one if they publish.
- Registration forms could be sent to the Repository Manager, who would apply for an account on the student’s behalf.
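The student-number DOI pattern from the case study could be sketched roughly as follows. This is only an illustration of the idea of reserving a predictable DOI at registration and activating it on publication. The `ReservedDoi` class, the `reserve_thesis_doi` function and the digits-only check are hypothetical and are not taken from the case study; actual minting would still be done by the repository plugin.

```python
import re
from dataclasses import dataclass

# Prefix taken from the UEL DOIs quoted in this post; the "THESIS." suffix
# pattern follows the example 10.15123/THESIS.123456.
UEL_PREFIX = "10.15123"


@dataclass
class ReservedDoi:
    """Hypothetical record of a DOI reserved for a student's future thesis."""
    doi: str
    active: bool = False  # reserved at registration, not yet resolvable

    def activate(self) -> None:
        """Mark the DOI as live once the thesis is published in the repository."""
        self.active = True


def reserve_thesis_doi(student_number: str, prefix: str = UEL_PREFIX) -> ReservedDoi:
    """Build a predictable DOI from a student number (assumed to be digits only)."""
    if not re.fullmatch(r"\d+", student_number):
        raise ValueError(f"unexpected student number: {student_number!r}")
    return ReservedDoi(doi=f"{prefix}/THESIS.{student_number}")


reserved = reserve_thesis_doi("123456")  # at registration
print(reserved.doi, reserved.active)
reserved.activate()                      # on publication in ROAR
print(reserved.active)
```

Because the DOI depends only on the prefix and the student number, the student can print it in the thesis before the record exists in the repository.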
Yesterday at the DataCite UK client meeting held at the British Library, three universities attempted to assign a DOI to a sample thesis. As anyone who has tried a live demonstration will know, there is a risk that what should work doesn’t. Thankfully, it did work and all three DOIs were minted in front of the attendees.
Valerie McCutcheon of the University of Glasgow used the CoinDOI plugin to request and receive back a DOI for a thesis in Enlighten:Theses. The thesis is at http://dx.doi.org/10.5525/gla.thesis.6423.
Michael Whitton of the University of Southampton uploaded a small XML file directly to the DataCite Metadata Store, and received back the DOI http://dx.doi.org/10.5258/SOTON/374711 for ePrints Soton.
Finally, I used the same CoinDOI plugin to assign a DOI, http://dx.doi.org/10.15123/PUB.3929, to a thesis in ROAR – one which had related data objects (actually two full-length documentary films created as part of the PhD thesis) in data.uel, the data repository at UEL.
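For readers curious about the "small XML file" route Michael used, here is a minimal sketch of what building such a metadata record might look like. It is not the file used in the demonstration. It assumes the DataCite kernel-4 schema namespace (the schema version used on the day is not stated here), the function name `build_datacite_xml` is hypothetical, and all sample values are placeholders. The upload to the Metadata Store is not shown.

```python
import xml.etree.ElementTree as ET

# Assumption: DataCite kernel-4 namespace; the demonstration may have used another version.
DATACITE_NS = "http://datacite.org/schema/kernel-4"


def build_datacite_xml(doi, creator, title, publisher, year):
    """Return a minimal DataCite-style metadata record as an XML string."""
    ET.register_namespace("", DATACITE_NS)

    def q(tag):
        # Qualify a tag name with the DataCite namespace.
        return f"{{{DATACITE_NS}}}{tag}"

    resource = ET.Element(q("resource"))
    ET.SubElement(resource, q("identifier"), identifierType="DOI").text = doi

    creators = ET.SubElement(resource, q("creators"))
    creator_el = ET.SubElement(creators, q("creator"))
    ET.SubElement(creator_el, q("creatorName")).text = creator

    titles = ET.SubElement(resource, q("titles"))
    ET.SubElement(titles, q("title")).text = title

    ET.SubElement(resource, q("publisher")).text = publisher
    ET.SubElement(resource, q("publicationYear")).text = str(year)
    ET.SubElement(resource, q("resourceType"), resourceTypeGeneral="Text").text = "Thesis"

    return ET.tostring(resource, encoding="unicode")


# Placeholder values only; none of these describe a real thesis.
record = build_datacite_xml(
    doi="10.1234/EXAMPLE.1",
    creator="Doe, Jane",
    title="An example thesis",
    publisher="Example University",
    year=2015,
)
print(record)
```

A plugin such as CoinDOI handles this step on the repository's behalf, which is why the other two demonstrations needed no hand-written XML.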
Grateful thanks to DataCite UK for the chance to update on the Unlocking Thesis Data project ahead of the Jisc sandpit workshop next week, and to Valerie and Michael for agreeing to mint the DOIs in front of an audience. Look out for release of the six university case studies during the coming week.
The UTD survey carried out in April-May 2015 has been analysed by Sara Gould (The British Library) as part of the UTD project. You can read the details of the results and what they mean for the administration of theses in UK universities in the report available at this DOI: 10.15123/PUB.4274. There are six key findings:
- 49 institutions (35.5%) responded to the survey, indicating that Unlocking Thesis Data is of interest to a significant proportion of HE institutions. A list of respondents is provided in Appendix A.
- At the time of the survey, no institution assigned DOI identifiers for their theses, although DataCite DOIs were used by 33% of institutions for their research data.
- Around 59% of institutions require students to submit both print and e-copies of their final thesis, and this often results in double-handling, for example in creating separate records for the catalogue and repository. This may have implications for UTD.
- The most ‘typical’ scenario is an institution that uses an EPrints repository for its e-theses and supporting files, requires students to submit both print and electronic versions of their thesis, has repository staff upload the thesis and create the metadata (though students are a close second), and assigns DOIs for its datasets but not its theses. This suggests such a scenario might form the first case study or the core focus for UTD.
- In response to the question ‘How ready are you to begin assigning DOI identifiers to your theses?’, institutions varied from ‘Completely ready’ to ‘DOIs are not on our radar at all’. You can see the results in this pie chart, and our intention is to ask this question again at the end of the full UTD project; this will form a key indicator as to the success of the project.
Q16: How ready are you to begin assigning DOI identifiers to your theses?
- Twenty-four respondents (49%) volunteered their institution to be a case study for UTD. The aim is to deliver just six case studies under Phase 1, and we hope those institutions not selected will be willing to host UTD clinics, become early adopters or have other opportunities to be closely involved.
If you are interested in looking at the survey data in more detail, you will find it in XLSX and CSV formats at http://dx.doi.org/10.15123/DATA.12. We will soon be releasing case studies, looking at procedures in more detail in six different universities. These will complement the survey findings with a range of different practices, so that we can ensure subsequent work in the project will take account of real-world situations. And we hope to repeat the survey later in the project to see if there have been any actual or planned changes in procedures.
We had a great response to the baseline survey to gather data on thesis workflows, levels of DOI planning and interest in UTD – 51 responses or nearly 40% of all institutions. A full report will follow, but for now:
- Most institutions currently require both print and e- deposit, but lots are in transition
- Handling print and e- often means creating two sets of metadata, one in the repository and one for the catalogue: “Cataloguing staff create record for print copy; repository staff create record for e-copy”
- 67% of institutions store supplementary data files in the same repository as the thesis. But typical responses included “This is a work in progress”, “Decision yet to be made” and “At the moment we don’t know”.
- A range of identifiers are used for data:
- And finally we asked how ready you are to begin assigning DOIs for theses:
Lots of work for UTD to do then. Next task, case studies starting with University of East London.
UTD is underway! Every EThOS contact in the UK has just been invited to complete a short survey of their current practice with regard to theses. The answers will help us understand the current landscape, and form a baseline to compare with in a year’s time at the end of the project. At the end of the survey, there is an invitation to email the project any workflows or other details of how your institution handles theses.
And don’t forget to vote for Unlocking Thesis Data in Jisc’s Research at Risk campaign at http://researchatrisk.ideascale.com/a/dtd/101964-31525.
Unlocking Thesis Data (UTD) is the short name for a project with a long title and a big ambition. “Unlocking the UK’s thesis data through persistent identifiers” will explore how the application of persistent identifiers, software and metadata enhancements, and guidance to institutions would kick-start a more widespread sharing of data generated in doctoral-level research in the UK. Here’s our project summary:
Unlocking Thesis Data (UTD) is a community-driven project to promote the use of persistent identifiers for theses, their underlying data and their authors. By their very nature, PhD theses break new ground and advance scholarly knowledge. Most make use of newly-created data but these data can be trapped in an appendix or DVD – either unavailable or not suited for reuse. UTD will make data more discoverable and citeable, thereby offering incentives to students to share their data in more appropriate formats, in the context of a sustainable national thesis framework.
Funded by Jisc, UTD is led by the Universities of East London and Southampton and EThOS (the UK’s national thesis service at the British Library). Phase one will explore current thesis practice through an online survey to EThOS member institutions, and individual case studies looking at the issues in more detail – including how institutions might apply DOI and ORCID identifiers. The survey and case study findings will be combined into a report with recommendations for further phases of the project. These are expected to enhance metadata and software for applying DOI and ORCID identifiers, to test them in live settings, and to offer comprehensive advice for institutions to adopt them. By summer 2016 we expect to have a sustainable infrastructure covering the whole UK, driving the wider availability of research data and introducing doctoral students to new norms of scholarly communication.
The first phase of the project runs from late April to mid July 2015, and has three components:
- A survey of EThOS member institutions on their current practices with regard to PhD theses and their data
- Case studies in individual institutions digging into the details and seeing where DOI and ORCID identifiers could be assigned
- A summary report synthesising the survey and case studies, with recommendations for next steps
We look forward to building on the national EThOS network with a robust and sustainable thesis infrastructure adapted to the new norms of data sharing. Subject to continuing Jisc funding, two further phases will deliver working services that meet the needs of students, their institutions and all those interested in the data generated in PhD research.