Re: automatic data interlinking
Dear François,

This is a great initiative in a crucial area. I am wondering if there is anything we (rkbexplorer.com and sameas.org) can do to help. Clearly we have a lot of datasets which our tools have been grinding over and aligning for many years, and we would be happy to offer anything you would find useful.

However, there may also be other things. I looked at taking the outputs of last year's exercise into a sameas store, but found that the URIs (at least the few I tried) were not Linked Data, so I backed off. So perhaps the first suggestion would be that whatever datasets you choose, they should use LD URIs. Another suggestion would be that the outputs of the exercise should be published in such a way that they will be useful to the LD world; not least, this would be more motivating for the participants. We would be happy to bring up a sameas store for this, or indeed a separate sameas store for each of the participants, where they can post their data and where they and others can then access it. And of course results with high precision can safely be put into sameas.org, which would be very exciting for me. (Of course the level of help we can give will be constrained by our resources, which are limited.)

In choosing datasets, perhaps an obvious place to start is something like the geographical data in the data.gov.uk world?

Best
Hugh

PS In fact, there are some datasets that would help me personally, although you may feel they are too close to last year's topics: for example, we have LD datasets of the NSF (National Science Foundation) project data and of the OAI (Open Archives Initiative) bibliographic data, and aligning these would be challenging but very interesting.

On 21/05/2010 14:51, "François Scharffe" wrote:
> Hello,
>
> As part of the Ontology Alignment Evaluation Initiative [1], we will have,
> for the 2nd year, a data interlinking evaluation.
>
> In this track we propose to evaluate systems able to *automatically*
> find interlinks between Web datasets, in contrast to semi-automatic
> tools. This year we will focus on large datasets. Two datasets are given
> as input, and a set of links between equivalent resources has to be
> produced as output.
>
> We're looking for systems to participate in the evaluation. We're also
> looking for datasets that may be used for the evaluation, that is,
> datasets that have a nicely curated linkset to serve as a reference.
>
> By the way, I also invite you to look at the results of last year's
> evaluation [2].
>
> Cheers
>
> François
>
> [1] http://oaei.ontologymatching.org
> [2] http://oaei.ontologymatching.org/2009/instances/
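Hugh's complaint that last year's output URIs "were not LD" comes down to a concrete test: does the URI dereference, under content negotiation, to some RDF serialization? Neither message includes code, but a minimal sketch of that kind of check, written here in Python with the requests library (my choice; the DBpedia URI in the example is just a placeholder, not one of the evaluation URIs), could look like this:

    # Rough check of whether a URI behaves as a Linked Data URI: does it
    # dereference, via content negotiation, to an RDF serialization?
    # Illustration only; the URI used below is a placeholder.
    import requests

    RDF_TYPES = (
        "application/rdf+xml",
        "text/turtle",
        "application/n-triples",
    )

    def looks_like_linked_data(uri: str, timeout: float = 10.0) -> bool:
        """Return True if dereferencing `uri` with an RDF Accept header
        yields an RDF content type (possibly after a 303 redirect)."""
        try:
            response = requests.get(
                uri,
                headers={"Accept": ", ".join(RDF_TYPES)},
                allow_redirects=True,   # follow 303s to the data document
                timeout=timeout,
            )
        except requests.RequestException:
            return False
        content_type = response.headers.get("Content-Type", "").split(";")[0].strip()
        return response.ok and content_type in RDF_TYPES

    if __name__ == "__main__":
        print(looks_like_linked_data("http://dbpedia.org/resource/Berlin"))

A pass on a sample of candidate URIs along these lines is roughly the precondition Hugh is asking the organisers to guarantee before the links can go into a sameas store.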
Cool URIs (was: Re: Java Framework for Content Negotiation)
On 27.05.2010 15:51, Richard Cyganiak wrote:
> On 27 May 2010, at 10:47, Angelo Veltens wrote:
>> What I am going to implement is this:
>> http://www.w3.org/TR/cooluris/#r303uri
>> I think this is the way DBpedia works, and it seems a good solution to me.
>
> It's the way DBpedia works, but it's by far the worst solution of the three
> presented in the document. DBpedia has copied the approach from D2R Server.
> The person who came up with it and designed and implemented it for D2R
> Server is me. This was back in 2006, before the term Linked Data was even
> coined, so I didn't exactly have a lot of experience to rely on. With what I
> know today, I would never, ever again choose that approach. Use 303s if you
> must; but please do me a favour and add that generic document, and please do
> me a favour and name the different variants and rather than and .

Thanks a lot for sharing your experience with me. I will follow your advice.

So if I'm going to implement what is described in section 4.2, I have to:

- serve HTML at http://www.example.org/doc/alice if text/html wins content negotiation, and set the Content-Location header to http://www.example.org/doc/alice.html
- serve RDF/XML at http://www.example.org/doc/alice if application/rdf+xml wins content negotiation, and set the Content-Location header to http://www.example.org/doc/alice.rdf
- always serve HTML at http://www.example.org/doc/alice.html
- always serve RDF/XML at http://www.example.org/doc/alice.rdf

Right?

By the way: is there any defined behaviour for the client regarding what to do with the Content-Location information? Do browsers take account of it?

> The DBpedia guys are probably stuck with my stupid design forever because
> changing it now would break all sorts of links. But the thing that really
> kills me is how lots of newbies copy that design just because they saw it on
> DBpedia and therefore think that it must be good.

I think the problem is not only that DBpedia uses that design, but that it is described in many examples as a possible or even "cool" solution, e.g. http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/ (one of the first documents I stumbled upon). If we want to prevent people from using that design, it should be clarified that, and why, it is a bad choice.

Kind regards and thanks for your patience,
Angelo
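Angelo's four bullet points map onto a very small web application. The following sketch is not from the thread: Flask is an arbitrary framework choice, the example.org URIs simply mirror the ones above, the hard-coded HTML/RDF bodies are placeholders, and the /id/alice route follows the 303 pattern of the Cool URIs document rather than anything Angelo spelled out.

    # Minimal sketch of the "generic document" pattern discussed above:
    # /doc/alice negotiates between HTML and RDF/XML and points at the
    # chosen variant via Content-Location; /doc/alice.html and
    # /doc/alice.rdf always serve one fixed format; /id/alice (the URI
    # for Alice herself) 303-redirects to the generic document.
    from flask import Flask, Response, redirect, request

    app = Flask(__name__)

    HTML_BODY = "<html><body><h1>Alice</h1></body></html>"
    RDF_BODY = """<?xml version="1.0"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:foaf="http://xmlns.com/foaf/0.1/">
      <foaf:Person rdf:about="http://www.example.org/id/alice">
        <foaf:name>Alice</foaf:name>
      </foaf:Person>
    </rdf:RDF>
    """

    @app.route("/id/alice")
    def thing_uri():
        # The real-world object URI: 303 See Other to the generic document.
        return redirect("http://www.example.org/doc/alice", code=303)

    @app.route("/doc/alice")
    def generic_document():
        # Let the Accept header decide which variant wins.
        best = request.accept_mimetypes.best_match(
            ["text/html", "application/rdf+xml"], default="text/html")
        if best == "application/rdf+xml":
            resp = Response(RDF_BODY, mimetype="application/rdf+xml")
            resp.headers["Content-Location"] = "http://www.example.org/doc/alice.rdf"
        else:
            resp = Response(HTML_BODY, mimetype="text/html")
            resp.headers["Content-Location"] = "http://www.example.org/doc/alice.html"
        return resp

    @app.route("/doc/alice.html")
    def html_variant():
        return Response(HTML_BODY, mimetype="text/html")

    @app.route("/doc/alice.rdf")
    def rdf_variant():
        return Response(RDF_BODY, mimetype="application/rdf+xml")

    if __name__ == "__main__":
        app.run()

The Content-Location header is what lets a client record which format-specific document it actually received, without a second round of negotiation; that is the point of adding the two named variants alongside the generic document.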
Semantic Web Challenge @ ISWC 2010 - Call for Participation
Dear all,

We are happy to announce the Semantic Web Challenge 2010! The Semantic Web Challenge 2010 is collocated with the 9th International Semantic Web Conference (ISWC 2010) in Shanghai, China. As last year, the challenge consists of two tracks: the Open Track and the Billion Triples Track, which requires participants to make use of the data set that has been crawled from the public Semantic Web. The data set consists of 3.2 billion triples this year and can be downloaded from the challenge's website.

The Call for Participation is found below. More information about the Challenge is provided at http://challenge.semanticweb.org/

We are looking forward to your submissions, which we hope will make the Semantic Web Challenge again one of the most exciting events at ISWC.

Best regards,
Diana and Chris

--

Call for Participation for the 8th Semantic Web Challenge
at the 9th International Semantic Web Conference ISWC 2010
Shanghai, China, November 7-11, 2010
http://challenge.semanticweb.org/

--

Introduction

Submissions are now invited for the 8th annual Semantic Web Challenge, the premier event for demonstrating practical progress towards achieving the vision of the Semantic Web. The central idea of the Semantic Web is to extend the current human-readable Web by encoding some of the semantics of resources in a machine-processable form. Moving beyond syntax opens the door to more advanced applications and functionality on the Web. Computers will be better able to search, process, integrate and present the content of these resources in a meaningful, intelligent manner.

As the core technological building blocks are now in place, the next challenge is to demonstrate the benefits of semantic technologies by developing integrated, easy-to-use applications that can provide new levels of Web functionality for end users on the Web or within enterprise settings. Applications submitted should give evidence of clear practical value that goes above and beyond what is possible with conventional web technologies alone.

As in previous years, the Semantic Web Challenge 2010 will consist of two tracks: the Open Track and the Billion Triples Track. The key difference between the two tracks is that the Billion Triples Track requires the participants to make use of the data set (consisting of 3.2 billion triples this year) that has been crawled from the Web and is provided by the organizers; a minimal streaming sketch for working with data at that scale follows after this call. The Open Track has no such restrictions. As before, the Challenge is open to everyone from industry and academia. The authors of the best applications will be awarded prizes and featured prominently at special sessions during the conference.

The overall goal of this event is to advance our understanding of how Semantic Web technologies can be exploited to produce useful applications for the Web. Semantic Web applications should integrate, combine, and deduce information from various sources to assist users in performing specific tasks.

---

Challenge Criteria

The Challenge is defined in terms of minimum requirements and additional desirable features that submissions should exhibit. The minimum requirements and the additional desirable features are listed below per track.

Open Track

Minimal requirements

1. The application has to be an end-user application, i.e. an application that provides a practical value to general Web users or, if this is not the case, at least to domain experts.
2. The information sources used should be under diverse ownership or control, should be heterogeneous (syntactically, structurally, and semantically), and should contain substantial quantities of real-world data (i.e. not toy examples). The meaning of data has to play a central role.
3. Meaning must be represented using Semantic Web technologies.
4. Data must be manipulated/processed in interesting ways to derive useful information, and this semantic information processing has to play a central role in achieving things that alternative technologies cannot do as well, or at all.

Additional Desirable Features

In addition to the above minimum requirements, we note other desirable features that will be used as criteria to evaluate submissions.

1. The application provides an attractive and functional Web interface (for human users).
2. The application should be scalable (in terms of the amount of data used and in terms of distributed components working together). Ideally, the application should use all data that is currently published on the Semantic Web.
3. Rigorous evaluations have taken place that demonstrate the benefits of semantic technologies, or validate the results obtained.
4. Novelty, in applying semantic technology to a domain or task that has not been considered before.
5. Functionality is different from or goes beyond pure information retrieval.
6. The application has clear commercial potential.
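The call itself does not prescribe tooling for the Billion Triples Track; it only says the crawl can be downloaded from the challenge website. Assuming the dump is distributed as gzipped, line-oriented N-Quads or N-Triples files, which is common for Web crawls of this size but is an assumption here rather than something stated above, a first scalable pass over the data can be a single streaming scan. The Python sketch below tallies predicate usage without loading anything into memory; the file names are hypothetical.

    # Hypothetical single-pass scan over a large triple/quad dump.
    # Assumes gzipped, line-oriented N-Quads/N-Triples input, which is an
    # assumption about the challenge data, not something stated in the call.
    import gzip
    import sys
    from collections import Counter

    def predicate_counts(path: str) -> Counter:
        """Stream one gzipped N-Quads/N-Triples file and count predicates."""
        counts = Counter()
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
            for line in handle:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                parts = line.split(None, 3)  # subject, predicate, rest
                if len(parts) >= 3:
                    counts[parts[1]] += 1
        return counts

    if __name__ == "__main__":
        totals = Counter()
        for file_name in sys.argv[1:]:  # e.g. chunk-*.nq.gz (hypothetical names)
            totals.update(predicate_counts(file_name))
        for predicate, count in totals.most_common(20):
            print(count, predicate)

Splitting on whitespace is safe here only because the predicate position of an N-Triples/N-Quads line is always an IRI without spaces; anything more ambitious (objects, literals, graph names) would call for a proper parser.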
CFP: Workshop on Knowledge Injecting into and Extraction from Linked Data (KIELD2010)
[Apologies for cross-posting]

CALL FOR PAPERS - Workshop on Knowledge Injecting into and Extraction from Linked Data (KIELD2010)

Co-located with EKAW 2010 - 11th to 15th October 2010
Website: http://ontologydesignpatterns.org/wiki/Odp:KIELD2010
Submissions:

KEY DATES
Submission deadline: July 15th
Authors notified: August 9th
Camera-ready version: August 25th
Workshop: October 11 or 15

The KIELD workshop aims at gathering three prominent sub-communities of Knowledge Engineering and Management: Knowledge Modelling, Knowledge Discovery, and Linked Data. The rapid growth of the Linked Data cloud, in parallel with on-the-fly design of relevant vocabularies, presents new opportunities for traditional research disciplines.

The workshop welcomes research, application, and position papers on the following topics (non-exclusive):

* Ontology engineering for Linked Data
  - Methodologies
  - Ontology pattern extraction
  - Ontology pattern identification and discovery
  - Pattern-based triplification
  - Anti-patterns or worst practices
* Data mining from Linked Data
  - Entity recognition
  - Link prediction
  - Pattern mining
  - Sequential patterns
  - Rule mining
* Linked Data in use
  - Domain applications based on linked data
  - Linked data exploitation
  - Interaction with linked data

There will be two categories of papers:

* Regular research and application papers of up to 12 pages
* Position papers of up to 6 pages

All submissions should indicate into which category they fall.

N.B. The workshop is organised in association with EKAW (http://ekaw2010.inesc-id.pt/). In order to register for KIELD2010, it is also necessary to register for EKAW 2010.