SKOSifying an Archaeological Thesaurus

Matteo Romanello (German Archaeological Institute, Berlin / Department of Digital Humanities, King’s College London)

In this paper I present an interoperability use case developed in the framework of DARIAH-DE, the German branch of the EU-funded Digital Research Infrastructure for the Arts and Humanities (DARIAH). The use case consisted of transforming the openly available thesaurus of the German Archaeological Institute, currently encoded in MARC21 XML and accessible via an OAI-PMH endpoint, into an RDF representation of the same data encoded in SKOS, the W3C standard for publishing Knowledge Organization Systems on the Semantic Web.

On the technical side, the transformation was made possible by the Stellar Console, an open source tool developed by Ceri Binding and Doug Tudhope (May et al. 2012) in the framework of the AHRC-funded project “Semantic Technologies Enhancing Links and Linked Data for Archaeological Resources” (STELLAR). The 80,000 MARC21 records of the thesaurus, after being harvested via the OAI-PMH interface, were transformed into an intermediate CSV file, which was in turn fed into the Stellar Console to produce a SKOS/RDF output of slightly less than 1 million triples. Implementing this transformation took a Python script of approximately 150 lines, which ties the OAI-PMH interface and the Stellar Console together.
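The intermediate step of this pipeline, from harvested MARC21 XML to CSV, might be sketched as follows. This is not the script described above but a minimal illustration under stated assumptions: the sample response is invented, the choice of MARC21 authority fields (001 for the identifier, 150 for the preferred term, 550 with `$w` starting with `g` for a broader term) is a guess at how such a thesaurus could be encoded, and the CSV column names are hypothetical rather than those expected by the Stellar Console. A real harvester would also have to fetch the data over HTTP and follow OAI-PMH resumption tokens, which is omitted here.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace of MARC21 XML ("MARCXML slim") records.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Invented stand-in for one page of an OAI-PMH ListRecords response.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <controlfield tag="001">0002</controlfield>
          <datafield tag="150" ind1=" " ind2=" ">
            <subfield code="a">Amphora</subfield>
          </datafield>
          <datafield tag="550" ind1=" " ind2=" ">
            <subfield code="w">g</subfield>
            <subfield code="a">Vessel</subfield>
          </datafield>
        </record>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def marc_to_rows(oai_xml):
    """Extract identifier, preferred label and broader term from each MARC record."""
    root = ET.fromstring(oai_xml)
    rows = []
    for rec in root.iterfind(".//marc:record", NS):
        ident = rec.findtext("marc:controlfield[@tag='001']", default="", namespaces=NS)
        label = rec.findtext(
            "marc:datafield[@tag='150']/marc:subfield[@code='a']",
            default="", namespaces=NS,
        )
        broader = ""
        # Assumption: a 550 field whose $w starts with "g" names a broader term.
        for field in rec.iterfind("marc:datafield[@tag='550']", NS):
            control = field.findtext("marc:subfield[@code='w']", default="", namespaces=NS)
            if control.startswith("g"):
                broader = field.findtext("marc:subfield[@code='a']", default="", namespaces=NS)
                break
        rows.append({"id": ident, "label": label, "broader": broader})
    return rows


def rows_to_csv(rows):
    """Serialise the extracted rows as CSV text (hypothetical column layout)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "label", "broader"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(marc_to_rows(SAMPLE_RESPONSE)))
```

Keeping the CSV as an explicit intermediate product means the harvesting code and the SKOS generation stay decoupled: either side can be replaced without touching the other.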

What this paper aims to show is that some interoperability can be achieved, or at least enabled, by “simply” 1) providing machine-actionable interfaces, such as OAI-PMH, to collections of electronic resources; and 2) using open licenses, such as Creative Commons or the GNU General Public License, to publish data and software. Open licensing enables other people to manipulate the available data in various ways, including migrating them from one (less interoperable) format to another (more interoperable) one.


May, Keith, Ceri Binding, Douglas Tudhope, and Stewart Jeffrey. ‘Semantic Technologies Enhancing Links and Linked Data for Archaeological Resources’. In CAA Proceedings 2012, edited by Mingquan Zhou, Iza Romanowska, Wu Zhongke, Xu Pengfei, and Philip Verhagen, 261–272. Amsterdam University Press, 2012.

Video by Doug Rocks-Macqueen, originally posted on his blog.
