Hi all,

We are happy to announce version 2.4 of the Silk - Link Discovery Framework for 
the Web of Data.

The central idea of the Web of Data is to interlink data items using RDF links. 
However, in practice most data sources are not sufficiently interlinked with 
related data sources. The Silk Link Discovery Framework addresses this problem 
by providing tools to generate links between data items based on user-provided 
link specifications. It can be used by data publishers to generate links 
between datasets as well as by Linked Data consumers to augment Web data with 
additional RDF links.

Link specifications can either be written manually or developed using the new 
Silk Workbench. The Silk Workbench is a web application which guides the user 
through the process of interlinking different data sources. It is shipped 
with version 2.4 of Silk.
The Silk Workbench offers the following features:
- It enables the user to manage different sets of data sources and linking 
tasks.
- It offers a graphical editor which enables the user to easily create and edit 
link specifications.
- As finding good linking heuristics is usually an iterative process, the 
Silk Workbench makes it possible for the user to quickly evaluate the links 
which are generated by the current link specification.
- It allows the user to create and edit a set of reference links used to 
evaluate the current link specification.
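
To illustrate how a set of reference links can be used to evaluate generated 
links, here is a minimal Python sketch. It assumes that links are plain 
(source URI, target URI) pairs and that quality is measured by precision, 
recall and F1. The function name evaluate_links and the example URIs are 
hypothetical and are not part of Silk; the Workbench may compute its 
evaluation differently.

```python
def evaluate_links(generated, reference):
    """Compare generated links against positive reference links.

    Both arguments are iterables of (source_uri, target_uri) pairs.
    Returns a tuple (precision, recall, f1).
    """
    generated = set(generated)
    reference = set(reference)
    correct = generated & reference  # links confirmed by the reference set

    precision = len(correct) / len(generated) if generated else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example data, not taken from any real dataset.
reference_links = {
    ("ex:a1", "ex:b1"),
    ("ex:a2", "ex:b2"),
    ("ex:a3", "ex:b3"),
}
generated_links = {
    ("ex:a1", "ex:b1"),
    ("ex:a2", "ex:b2"),
    ("ex:a4", "ex:b9"),
    ("ex:a5", "ex:b7"),
}

precision, recall, f1 = evaluate_links(generated_links, reference_links)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

After each change to the link specification, rerunning such a comparison 
shows whether the change improved or degraded the generated links.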

The Silk Link Discovery Framework includes three applications for executing 
link specifications, each addressing a different use case:
1. Silk Single Machine is used to generate RDF links on a single machine. The 
datasets that should be interlinked can either reside on the same machine or on 
remote machines which are accessed via the SPARQL protocol. Silk Single Machine 
provides multithreading and caching. In addition, the performance can be 
further enhanced using an optional blocking feature.
2. Silk Server can be used as an identity resolution component within 
applications that consume Linked Data from the Web. Silk Server provides an 
HTTP API for matching instances from an incoming stream of RDF data while 
keeping track of known entities. It can be used, for instance, together with a 
Linked Data crawler to populate a local duplicate-free cache with data from the 
Web.
3. Silk MapReduce is used to generate RDF links between datasets using a 
cluster of multiple machines. Silk MapReduce is based on Hadoop and can for 
instance be run on Amazon Elastic MapReduce. Silk MapReduce enables Silk to 
scale out to very big datasets by distributing the link generation to multiple 
machines.
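
The general idea behind a blocking step, as mentioned for Silk Single 
Machine, can be sketched in Python as follows. This is a generic 
illustration and not Silk's actual implementation: the blocking key (the 
lowercased first character of a label), the function names and the example 
data are all assumptions made for this sketch.

```python
from collections import defaultdict


def blocking_key(label):
    # Assumed key for this sketch: lowercased first character of the label.
    return label.strip().lower()[:1]


def candidate_pairs(source, target):
    """Yield (source_uri, target_uri) pairs that share a blocking key.

    source and target are dicts mapping an entity URI to its label.
    Only pairs inside the same block are yielded, so the expensive
    similarity comparison can skip all other pairs.
    """
    blocks = defaultdict(list)
    for uri, label in target.items():
        blocks[blocking_key(label)].append(uri)

    for s_uri, s_label in source.items():
        for t_uri in blocks.get(blocking_key(s_label), []):
            yield s_uri, t_uri


# Hypothetical example data.
source = {"ex:s1": "Berlin", "ex:s2": "Paris"}
target = {
    "ex:t1": "berlin",
    "ex:t2": "Bern",
    "ex:t3": "Prague",
    "ex:t4": "Rome",
}

pairs = sorted(candidate_pairs(source, target))
print(len(pairs), "candidate pairs instead of", len(source) * len(target))
```

The trade-off is that a key this simple can miss correct links whose labels 
fall into different blocks, so the choice of blocking key matters.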

More information about the Silk framework, the Silk Link Specification 
Language, as well as several examples that demonstrate how Silk is used to set 
links between different data sources in the LOD cloud can be found at:

http://www4.wiwiss.fu-berlin.de/bizer/silk/

The Silk framework is provided under the terms of the Apache License, Version 
2.0 and can be downloaded from

http://www4.wiwiss.fu-berlin.de/bizer/silk/releases/

The development of Silk was supported by Vulcan Inc. as part of its Project 
Halo (www.projecthalo.com) and by the EU FP7 project LOD2 - Creating Knowledge 
out of Interlinked Data (http://lod2.eu/, Ref. No. 257943).

Thanks to Christian Becker, Michal Murawicki and Andrea Matteini for 
contributing to the Silk Workbench.

Happy linking,

Robert Isele, Anja Jentzsch and Chris Bizer
