Re: Fresnel: State of the Art?
The Fresnel Path Language was submitted as a note to the W3C a while back: http://www.w3.org/2005/04/fresnel-info/fsl/

I implemented that in PHP as part of the moriarty library: http://code.google.com/p/moriarty/source/browse/trunk/graphpath.class.php

I think FSL is very interesting (having looked at many path languages for RDF over the past 5 or 6 years) and I'd like to see more implementations.

Ian

On Mon, Feb 1, 2010 at 1:44 PM, Aldo Bucchi aldo.buc...@gmail.com wrote: Hi, I was looking at the current JFresnel codebase and the project seems to have little movement. I was wondering if this is the state of the art regarding Declarative Presentation Knowledge for RDF, or have efforts moved elsewhere and I have missed it? Thanks! A -- Aldo Bucchi skype:aldo.bucchi http://www.univrz.com/ http://aldobucchi.com/
DBpedia-based entity recognition service / tool?
Dear LOD community,

I would be glad to hear your advice on how to best accomplish a simple task: extracting DBpedia entities (identified with DBpedia URIs) from a string of text. With good accuracy and recall, possibly with some options to constrain the recognized entities to some subset of DBpedia, based on categories. The tool or service should be performant enough to process large numbers of strings in a reasonable amount of time.

Given the prolific creation of tiny tools and services in this community, I am puzzled by my inability to find anything that accomplishes this task. Could you point me to something like that? Are there tools/services for Wikipedia that I could use? Zemanta seems to be too much geared towards 'enhanced blogging', while OpenCalais does not return Wikipedia/DBpedia identifiers. Please correct me if I am wrong.

Cheers, Matthias
Re: DBpedia-based entity recognition service / tool?
Hi Matthias, have you ever tried this http://lupedia.ontotext.com/ ? Perhaps it may help.

cheers, Davide

On Tue, Feb 2, 2010 at 1:26 PM, Matthias Samwald samw...@gmx.at wrote: [...]
Re: Fresnel: State of the Art?
On Feb 2, 2010, at 2:01 PM, Ian Davis wrote: The Fresnel Path Language was submitted as a note to the W3C a while back: http://www.w3.org/2005/04/fresnel-info/fsl/

Correction: we used a template that looks like the one used for notes, but it was not officially submitted as a W3C Note. It is hosted on w3.org in date space.

I implemented that in PHP as part of the moriarty library: http://code.google.com/p/moriarty/source/browse/trunk/graphpath.class.php I think FSL is very interesting (having looked at many path languages for RDF over the past 5 or 6 years) and I'd like to see more implementations.

Great! I'm glad there is another implementation of it. The Java implementation of FSL made available as part of JFresnel is actually standalone (w.r.t. JFresnel). So you can get just two small JAR files for the FSL engine, available through Maven, that will work either with Jena 2 or Sesame 2. See [1] for more info.

[1] http://jfresnel.gforge.inria.fr/doc/dependencies.html

-- Emmanuel Pietriga INRIA Saclay - Projet In Situ http://www.lri.fr/~pietriga
Re: DBpedia-based entity recognition service / tool?
Hi Davide,

Thanks for the hint. I remember trying LUPedia a few months ago -- now it has a defined API, which is a good addition. Unfortunately, the quality of results could be improved quite a bit. Here is a scientific statement that I would like to see annotated:

Albizia julibrissin has anxiolytic-like effects that are mediated by the changes of the serotonergic nervous system, especially 5-HT1A receptors.

LUPedia is unable to identify any entities in this string, although DBpedia would contain them:

http://dbpedia.org/resource/Albizia_julibrissin
http://dbpedia.org/resource/Anxiolytic
http://dbpedia.org/page/5-HT1A_receptor

et cetera. It seems to recognize person names, as for the string Michael Jackson the following URIs are returned:

# http://dbpedia.org/resource/Parademon
# http://dbpedia.org/resource/Michael_Jackson

The first result is a bit puzzling (DBpedia tells me that 'In the DC Universe, Parademons are monstrous shock troops of Apokolips used by Darkseid to maintain the order of Apokolips.'). LUPedia does not seem to do any kind of stemming either, as submitting the string Michael Jacksons reduces the list of extracted URIs to:

# http://dbpedia.org/resource/Parademon

LUPedia in its current form will not perform too well in practical settings.

Cheers, Matthias Samwald

-- From: Davide Palmisano dav...@asemantics.com Sent: Tuesday, February 02, 2010 2:27 PM To: Matthias Samwald samw...@gmx.at Cc: public-lod@w3.org Subject: Re: DBpedia-based entity recognition service / tool? [...]
Re: DBpedia-based entity recognition service / tool?
Hi Matthias,

this is quite strange. To be honest, they are currently working on tuning Lupedia, and I'm sure your mail could be useful to them. I'm forwarding it to them.

BTW: and what about http://www.alchemyapi.com ? Have you tried it?

Davide

On Tue, Feb 2, 2010 at 3:21 PM, Matthias Samwald samw...@gmx.at wrote: [...]
Re: [fresnel] Re: Fresnel: State of the Art?
Fresnel is the state of the art.

* It is also supported by http://less.aksw.org/browse
* We use it in production for configuration editors for aperture.sourceforge.net; the ontologies include Fresnel lenses.
* You won't find anything else that really fits RDF, because of the subclass/multiclass/missing-properties/too-many-properties dynamics you have in RDF. Templating languages are not good for this. Also, Fresnel data can spread and grow on the web like RDF -- there are no security problems associated with it (as there might be with templating).

Sure, it is bad for many cases and could be improved, but the general concept of lenses/views, showing/hiding properties and ordering properties is essential and working.

best
Leo

It was Emmanuel Pietriga who said at the right time 02.02.2010 14:39 the following words: [...]

-- Dr. Leo Sauermann http://www.dfki.de/~sauermann Deutsches Forschungszentrum fuer Kuenstliche Intelligenz DFKI GmbH, Kaiserslautern, Germany Mail: leo.sauerm...@dfki.de
Re: DBpedia-based entity recognition service / tool?
I should probably be replying here, as I've been doing this, and working on this, for the past few months. I've found from experience that the only viable way to address this need is as follows:

1: Pass content through to both OpenCalais and Zemanta
2: Combine the results to provide a list of string terms to be associated with dbpedia resources (where zemanta hasn't already done it)
3: Look up each string term and try to associate it with a dbpedia resource
4: Return all matches with results to the end user in order for them to manually confirm the results.

Steps 3 and 4 are the killers here, because no matter how good the service is, you can't always match to exact URIs (sometimes you can only determine that you may mean one of X many ambiguous URIs); and in other cases (approx 10% of the time) the wrong term is extracted by zemanta / opencalais, which skews results. For instance Mrs London may come back as simply London, which you'd take to mean http://dbpedia.org/resource/London. Similarly, ambiguous links often mean different things in different contexts, which you can to some extent infer by correlating the other extracted terms, but again you can never get it perfect. Thus, as far as I can see, even when cutting out any ambiguous lookups, this is always going to be a process that requires user confirmation.

On the lookup side of things I'd been using the API of lookup.dbpedia.org written by Georgi; however I've recently found that it delivers multiple results per lookup term more often than not. Hence I've since been on a drive to create an alternative string-based lookup which can return single unambiguous links more often than not. Something I finally achieved over the weekend :-)

In reality (regardless of client restrictions on some of the code) I don't think this is an API I could ever release, or that anybody could release yet(?); what I may well be able to do though is open source the lookup classes I've made and the sparql queries behind them, allowing people to run their own dbpedia URI lookup service.

To clarify why this per-application lookup is probably the best approach: string to resource matching is very much domain specific. In one case, when we say FOAF we all mean the ontology http://dbpedia.org/resource/FOAF_(software), whereas in many places they mean http://dbpedia.org/resource/Friend_of_a_friend, which means that we end up with the scenario:

sitea:FOAF owl:sameAs dbp:FOAF_(software) ; rdfs:label "FOAF"@en .
siteb:FOAF owl:sameAs dbp:Friend_of_a_friend ; rdfs:label "FOAF"@en .

Combine this with the fact that, to provide anything near a usable service, you'll need to cache look-ups and hit your own RDF store first before querying dbpedia, and we end up with a tricky situation. On a world-open API level it makes sense to reply ambiguously, saying that FOAF could mean either resource, but on a domain level it makes sense to say we generally mean x and not y. If you need further convincing I can supply literally hundreds of use-cases where the resource implied by a string seems obvious but is in fact ambiguous - even RDF has 15+ meanings, but when I say RDF I always mean Resource Description Framework.

Further, it also allows for domain-specific string-to-resource triples, such as Linked Open Data, Linking Open Data, LOD, Linked Data (and case variations) all meaning the same thing. Also, it allows for terminology not yet in dbpedia; for instance iPad is used daily, but isn't known over on dbpedia (or recognised by zemanta yet; only open calais can return it, as they assign no meaning / resource association to the strings - it's just a string).
Hope that helped a bit, and if you have any questions or would like the resource lookup code / sparql queries do let me know.

Regards, Nathan

Ivan Herman wrote: Not providing an answer, but... if such tools are around, I would love to see them added to the SWSWiki[1]. At the moment, there is a generic category 'Tagging', with the following input: http://www.w3.org/2001/sw/wiki/Category:Tagging More would be good... Ivan [1] http://www.w3.org/2001/sw/wiki/ On 2010-2-2 13:26 , Matthias Samwald wrote: [...]
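[Editor's note] The four-step pipeline Nathan describes can be sketched roughly as follows. This is a minimal illustration, not his actual code: the extractor outputs are mocked as plain lists (real code would call the OpenCalais and Zemanta APIs), and the lookup table stands in for the cached, per-domain dbpedia lookup he mentions. Ambiguous terms are handed back for manual confirmation, matching step 4.

```python
# Sketch of the combined OpenCalais/Zemanta + dbpedia lookup pipeline.
# All data here is mocked; names and structure are illustrative only.

# Stand-in for a per-domain lookup cache (term -> candidate DBpedia URIs).
DBPEDIA_LOOKUP = {
    "FOAF": ["http://dbpedia.org/resource/FOAF_(software)",
             "http://dbpedia.org/resource/Friend_of_a_friend"],
    "London": ["http://dbpedia.org/resource/London"],
}

def combine_extractors(calais_terms, zemanta_terms):
    """Step 2: union the string terms returned by both services."""
    return sorted(set(calais_terms) | set(zemanta_terms))

def lookup(term):
    """Step 3: map a term to candidate DBpedia URIs (cache first)."""
    return DBPEDIA_LOOKUP.get(term, [])

def annotate(terms):
    """Steps 3-4: resolve terms; ambiguous ones go back to the user."""
    resolved, needs_confirmation = {}, {}
    for term in terms:
        candidates = lookup(term)
        if len(candidates) == 1:
            resolved[term] = candidates[0]
        elif candidates:
            needs_confirmation[term] = candidates
    return resolved, needs_confirmation

terms = combine_extractors(["London"], ["FOAF", "London"])
resolved, ambiguous = annotate(terms)
# 'London' resolves to a single URI; 'FOAF' has two candidates and is
# returned for manual confirmation (step 4).
```

The key design point, per the mail above, is that DBPEDIA_LOOKUP is domain-specific: a site about ontologies would prune the Friend_of_a_friend entry so FOAF resolves unambiguously.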
Re: DBpedia-based entity recognition service / tool?
Davide Palmisano wrote: On Tue, Feb 2, 2010 at 3:39 PM, Matthias Samwald samw...@gmx.at wrote: Davide wrote: BTW: and what about http://www.alchemyapi.com ? have you tried it? AlchemyAPI does not seem to return DBpedia / Wikipedia identifiers (?)

yes, read here: http://www.alchemyapi.com/api/entity/textc.html - you need to specify a parameter to enable this feature. I'm using this tool successfully.

Whilst I do like alchemy, I've found you can extract much, much more information, of a much higher standard, by combining OpenCalais and Zemanta in the process outlined in a previous mail. To illustrate, I'll quickly hook in with alchemy again and post a few results for comparison shortly.

Many Regards, Nathan
RE: DBpedia-based entity recognition service / tool?
Hi Matthias,

So you're asking for the perfect entity recognition service, applicable to the easy domain of scientific texts? Sure, I developed one in my spare time, it's much better than OpenCalais, I was just too lazy to publish it yet... ;-)

Cheers, Georgi

-Original Message- From: public-lod-requ...@w3.org [mailto:public-lod-requ...@w3.org] On Behalf Of Matthias Samwald Sent: Tuesday, February 02, 2010 1:26 PM To: public-lod@w3.org Subject: DBpedia-based entity recognition service / tool? [...]
Re: DBpedia-based entity recognition service / tool?
Georgi wrote: So you're asking for the perfect entity recognition service, applicable to the easy domain of scientific texts?

No, just something that does not suck.

Cheers, Matthias (I have registered for AlchemyAPI but have not tested it yet)
Re: DBpedia-based entity recognition service / tool?
On Tue, Feb 2, 2010 at 4:47 PM, Georgi Kobilarov georgi.kobila...@gmx.de wrote: Hi Matthias, So you're asking for the perfect entity recognition service, applicable to the easy domain of scientific texts? Sure, I developed one in my spare time, it's much better than OpenCalais, I was just too lazy to publish it yet... ;-) Yes please, I'll take two :) Seriously, I think it might be time to look at having common REST APIs for these things, so we have a more fluid marketplace where servers can be swapped and composed. How similar are the existing interfaces? I have no idea... One idea I had on NoTube that is implemented experimentally in http://lupedia.ontotext.com/ is to use RDFa as an interop point. So one of the interfaces from the Ontotext demo there is to return RDFa markup - http://lupedia.ontotext.com/test-page4rdfa.html ... however this doesn't leave much scope for including confidence measures etc in the output. cheers, Dan
Re: DBpedia-based entity recognition service / tool?
Nathan wrote: [...] To illustrate I'll quickly hook in with alchemy again and post a few results for comparison shortly.

For a quick comparison I've run two documents through both alchemy and the opencalais/zemanta/lookup combination system to see how they compare; note that with the alchemy results I've also included the non-linked-data terms, so you can see why I've not used it in my own system.

= TEST 1: source document: http://webr3.org/__play/optimal/webr3.html

Alchemy Results
=
Linked Data:
http://dbpedia.org/resource/England : England
http://dbpedia.org/resource/Google : Google

Generic Terms:
FieldTerminology : web 3.0
Company : wikipedia
City : London
FieldTerminology : URIs
Technology : HTML5
StateOrCounty : DC
City : Dublin
FieldTerminology : Web Developers
FieldTerminology : HTML

Notes: Both DC and Dublin are incorrect, as we mentioned Dublin Core.

Combined OpenCalais / Zemanta + dbpedia lookup system:
=
Linked Data:
http://dbpedia.org/resource/Linked_Data : Linked Open Data, LOD
http://dbpedia.org/resource/RDFa : RDFa
http://dbpedia.org/resource/Semantic_Web : Semantic Web
http://dbpedia.org/resource/HTML : HTML4
http://dbpedia.org/resource/Dublin_Core : Dublin Core
http://dbpedia.org/resource/Resource_Description_Framework : RDF
http://dbpedia.org/resource/Web_page : web pages
http://dbpedia.org/resource/HTML5 : HTML5
http://dbpedia.org/resource/London : London
http://dbpedia.org/resource/Web_design : web designer
http://dbpedia.org/resource/United_Kingdom : United Kingdom
http://dbpedia.org/resource/Web_search_engine : search engine
http://dbpedia.org/resource/Web_developer : Web developer
http://dbpedia.org/resource/Joe_Bloggs : Joe Blogs
http://dbpedia.org/resource/XHTML : XHTML
http://dbpedia.org/resource/Web_2.0 : Web 2.0
http://dbpedia.org/resource/Open_Data : Open Data
http://dbpedia.org/resource/Web_standards : Web standards
http://dbpedia.org/resource/FOAF_%28software%29 : FOAF
http://dbpedia.org/resource/Computing : Computing
http://dbpedia.org/resource/World_Wide_Web : World Wide Web

= TEST 2: source document: http://news.bbc.co.uk/1/hi/world/asia-pacific/8492608.stm

Alchemy Results
=
Linked Data:
http://dbpedia.org/resource/People's_Republic_of_China : China
http://dbpedia.org/resource/United_States : United States
http://dbpedia.org/resource/Republic_of_China : Taiwan
http://dbpedia.org/resource/Beijing : Beijing
http://dbpedia.org/resource/Communist_Party_of_China : Chinese Communist Party
http://dbpedia.org/resource/Washington,_D.C. : Washington DC
http://dbpedia.org/resource/White_House : White House
http://dbpedia.org/resource/Barack_Obama : Barack Obama
http://dbpedia.org/resource/Google : Google
http://dbpedia.org/resource/Ministry_of_Foreign_Affairs_(People's_Republic_of_China) : Chinese Foreign Ministry
http://dbpedia.org/resource/Boeing : Boeing
http://dbpedia.org/resource/Iran : Iran
http://dbpedia.org/resource/Tehran : Tehran

Generic Terms:
Person : Dalai Lama
Person : Mr Zhu
Country : Tibet
City : Washington
Person : Mr Obama
Person : Zhu Weiqun
Technology : aerospace
Person : Obama
Company : BBC
GeographicFeature : Himalayan
Person : Ma Zhaoxu
Person : Paul Reynolds
Person : Kasur Lodi Gyarit

Combined OpenCalais / Zemanta + dbpedia lookup system:
=
Linked Data:
http://dbpedia.org/resource/Communist_Party_of_China : Chinese Communist Party
http://dbpedia.org/resource/Barack_Obama : Barack Obama
http://dbpedia.org/resource/United_States : United States
http://dbpedia.org/resource/People%27s_Republic_of_China : China
http://dbpedia.org/resource/Washington%2C_D.C. : DC, Washington DC
http://dbpedia.org/resource/China : Sino
http://dbpedia.org/resource/President_of_the_United_States : US President
http://dbpedia.org/resource/Dalai_Lama : Dalai Lama
http://dbpedia.org/resource/Republic_of_China : Taiwan
http://dbpedia.org/resource/Arms_industry : arms sales
http://dbpedia.org/resource/Official : Official
http://dbpedia.org/resource/Ma_Zhaoxu : Ma Zhaoxu
Re: DBpedia-based entity recognition service / tool?
Danbri wrote: One idea I had on NoTube that is implemented experimentally in http://lupedia.ontotext.com/ is to use RDFa as an interop point. So one of the interfaces from the Ontotext demo there is to return RDFa markup - http://lupedia.ontotext.com/test-page4rdfa.html I had a similar idea when I created a text annotation service for the biomedical domain. For example: http://whatizit.neurocommons.org/index.py/pmid?pipeline=whatizitEBIMedDiseaseChemicalsquery=17477962format=atag (can be quite slow to respond) This contains recognized entities embedded via RDFa. In this case, each sentence is represented as a sioc:Item and the entities within the sentence are represented as sioc:topics of the sentence. The RDF of the example above looks like this: http://tinyurl.com/ydayqr3 However, the entities are taken from ontologies in the Open Biomedical Ontologies (OBO) collection, and not DBpedia. Entity recognition is based on another bioinformatics server, and I cannot just add DBpedia as a vocabulary. however this doesn't leave much scope for including confidence measures etc in the output. Why? You could include it as RDFa markup around pieces of the text that are not visible in the browser? Cheers, Matthias Samwald
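[Editor's note] Matthias's suggestion of carrying confidence scores in non-visible RDFa markup could be sketched like this; the helper and the ex:confidence property are made up for illustration, since the thread notes no agreed vocabulary for annotation confidence exists:

```python
# Sketch: wrap a recognized entity mention in RDFa, carrying the
# extractor's confidence score in a span that the browser won't render.
# "ex:confidence" is a hypothetical property, not a standard term.

def rdfa_annotate(mention, uri, confidence):
    """Return RDFa markup linking a text mention to a DBpedia URI."""
    hidden = ('<span property="ex:confidence" content="%.2f" '
              'style="display:none"></span>' % confidence)
    return '<span about="%s">%s%s</span>' % (uri, mention, hidden)

html = rdfa_annotate(
    "Albizia julibrissin",
    "http://dbpedia.org/resource/Albizia_julibrissin",
    0.87)
```

The visible text stays unchanged for readers, while an RDFa-aware consumer can extract both the entity URI and the score from the same page.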
Re: DBpedia-based entity recognition service / tool?
Matthias Samwald wrote: [...] Here is a scientific statement that I would like to see annotated: Albizia julibrissin has anxiolytic-like effects that are mediated by the changes of the serotonergic nervous system, especially 5-HT1A receptors. [...]

Since MarkMail hasn't indexed this page, could you make an HTML page somewhere, with the excerpt above in a paragraph, then reply with the doc URL, so I can quickly test to see how close we can get to what you seek via our Sponger Middleware.

Kingsley

-- Regards, Kingsley Idehen President CEO OpenLink Software Web: http://www.openlinksw.com Weblog: http://www.openlinksw.com/blog/~kidehen Twitter: kidehen
Re: DBpedia-based entity recognition service / tool?
Matthias Samwald wrote: Nathan wrote: Quite sure the results speak for themselves + glad that so much useful information can be extracted from text already. The results look good indeed. It even passed the FOAF test! Can you estimate the ratio of contributions from Zemanta / contributions from OpenCalais? Does one source add more than the other? Does the ratio vary significantly between different texts?

Zemanta is more precise, OpenCalais is more verbose; results vary depending on the subject matter and each document's content - in all honesty there is no way to say one is better than the other, but I can say that both combined is as good as you can get for now.

Noted that Kingsley mentioned the Sponger Middleware for virtuoso; this would allow you to do the same afaik, but faster and with the option of adding in more sponger cartridges for virtually any third-party APIs. With the extensive list of cartridges already included it's definitely an option worth looking into - and ultimately the fastest / most reliable.

fyi: I ran your source text through the combination system and here's what it brings back:

http://dbpedia.org/resource/Nervous_system
http://dbpedia.org/resource/Albizia
http://dbpedia.org/resource/5-HT1A_receptor
http://dbpedia.org/resource/Serotonin
http://dbpedia.org/resource/Albizia_julibrissin
http://dbpedia.org/resource/Serotonergic
http://dbpedia.org/resource/Parasympathetic_nervous_system
http://dbpedia.org/resource/Neurochemistry
http://dbpedia.org/resource/Biochemistry
http://dbpedia.org/resource/Neurotransmitter
http://dbpedia.org/resource/Physiology
http://dbpedia.org/resource/Biology

regards, Nathan
Re: DBpedia-based entity recognition service / tool?
Kingsley wrote: Since MarkMail hasn't indexed this page, could you make an HTML page somewhere, with the excerpt above in a paragraph, then reply with the doc URL, so I can quickly test to see how close we can get to what you seek via our Sponger Middleware.

Sure. I uploaded it to http://hcls.deri.org/res/albizia_test_document.html

The sentence can be found in context at http://www.ncbi.nlm.nih.gov/pubmed/15894080

It has also been manually annotated with SIOC/RDFa as part of a collection at http://hcls.deri.org/atag/data/tcm_atags.html

Cheers, Matthias Samwald
CFP, Ontology Repositories and Editors for the Semantic Web
ESWC 2010 Workshop on Ontology Repositories and Editors for the Semantic Web
ORES 2010 - Call for papers and system descriptions
http://www.ontologydynamics.org/od/index.php/ores2010/
Heraklion, Greece - Deadline: March 1, 2010

The growing number of online ontologies makes the availability of ontology repositories, in which ontology practitioners can easily find, select and retrieve reusable components, a crucial issue. The recent emergence of several ontology repository systems is a further sign of this. However, in order for these systems to be successful, it is necessary to provide a forum for researchers and developers to discuss features and exchange ideas on the realization of ontology repositories in general and to consider explicitly their role in the ontology lifecycle. In addition, it is now critical to achieve interoperability between ontology repositories, through common interfaces, standard metadata formats, etc. ORES 2010 intends to provide such a forum. Illustrating the importance of the problem, significant initiatives are now emerging. One example is the Open Ontology Repositories (OOR) working group set up by the Ontolog community. Within this effort regular virtual meetings are organized and actively attended by ontology experts from around the world; the Ontolog OOR 2008 meeting was held at the National Institute of Standards and Technology (NIST), generating a joint communiqué outlining requirements and paving the way for collaborations. Another example is the Ontology Metadata Vocabulary (OMV) Consortium, addressing metadata for describing ontologies. Despite these initial efforts, ontology repositories are hardly interoperable amongst themselves. Although sharing similar aims (providing easy access to Semantic Web resources), they diverge in the methods and techniques employed for gathering these documents and making them available; each interprets and uses metadata in a different manner.
Furthermore, many features are still poorly supported, such as modularization and versioning, as well as the relationship between ontology repositories and ontology engineering environments (editors) to support the entire ontology lifecycle.

Submitting papers and system descriptions

We want to bring together researchers and practitioners active in the design, development and application of ontology repositories, repository-aware editors, modularization techniques, versioning systems and issues around federated ontology systems. We therefore encourage the submission of research papers, position papers and system descriptions discussing some of the following questions:

* How can ontology repositories talk to each other?
* How can the abundant and complex knowledge contained in an ontology repository be made comprehensible for users?
* What is the role of ontology repositories in the ontology lifecycle?
* How can branching and versioning be managed in and across ontology repositories?
* How can ontology repositories interoperate with ontology editors, and other applications and legacy systems?
* How can connections across ontologies be managed within and across ontology repositories?
* How can modularity be better supported in ontology repositories and editors?
* How can ontology repositories and editors use distributed reasoning?
* How can ontology repositories support corporate, national and domain-specific semantic infrastructures?
* How do ontology repositories support novel semantic applications?
* What measurements for describing and comparing ontologies can we use? How could ontology repositories use these?

Research papers are limited to 12 pages and position papers to 5 pages. For system descriptions, a 5-page paper should be submitted. All papers and system descriptions should be formatted according to the LNCS format (http://www.springer.com/computer/lncs?SGWID=0-164-2-72376-0). Proceedings of the workshop will be published online.
Depending on the number and quality of the submissions, authors might be invited to present their papers during a poster session. Submissions can be made through the EasyChair system at http://www.easychair.org/conferences/?conf=ores2010 .

Important dates

Papers and demo submission: March 1, 2010 (23:59 Hawaii Time)
Notification: April 5, 2010
Camera-ready version: April 18, 2010
Workshop: May 30 or 31, 2010

Organizing committee

Mathieu d'Aquin, The Open University, UK
Alexander García Castro, Bremen University, Germany
Christoph Lange, Jacobs University Bremen, Germany
Kim Viljanen, Aalto University, Finland

Program committee

Ken Baclawski, Northeastern University, USA
Leo J. Obrst, MITRE Corporation, USA
Mark Musen, Stanford University, USA
Natasha Noy, Stanford University, USA
Li Ding, Rensselaer Polytechnic Institute, USA
Mike Dean, BBN, USA
John Bateman, Universität Bremen, Germany
Michael Kohlhase, Jacobs University, Germany
Tomi Kauppinen, University of Muenster, Germany
Peter Haase, Fluid Operations, Germany
Raul
Conferences about Semantics
Hi, everyone. I would like to know some conferences regarding Semantics and linked data to which I could submit papers. Any hint/advice? Best regards, Luciano - Luciano B. de Paula DCA - FEEC - Unicamp www.dca.fee.unicamp.br/~luciano - See the topics of the moment on Yahoo! +Buscados http://br.maisbuscados.yahoo.com
Re: DBpedia-based entity recognition service / tool?
Nathan wrote: Matthias Samwald wrote: Nathan wrote: Quite sure the results speak for themselves + glad that so much useful information can be extracted from text already. The results look good indeed. It even passed the FOAF test! Can you estimate the ratio of contributions from Zemanta / contributions from OpenCalais? Does one source add more than the other? Does the ratio vary significantly between different texts? Zemanta is more precise, OpenCalais is more verbose; results vary depending on the subject matter and each document's content - in all honesty there is no way to say one is better than the other, but I can say that both combined is as good as you can get for now. Noted that Kingsley mentioned the Sponger Middleware for Virtuoso; this would allow you to do the same AFAIK, but faster and with the option of adding in more Sponger cartridges for virtually any third-party APIs + with the extensive list of cartridges already included it's definitely an option worth looking into - and ultimately the fastest / most reliable. FYI: I ran your source text through the combination system and here's what it brings back: http://dbpedia.org/resource/Nervous_system http://dbpedia.org/resource/Albizia http://dbpedia.org/resource/5-HT1A_receptor http://dbpedia.org/resource/Serotonin http://dbpedia.org/resource/Albizia_julibrissin http://dbpedia.org/resource/Serotonergic http://dbpedia.org/resource/Parasympathetic_nervous_system http://dbpedia.org/resource/Neurochemistry http://dbpedia.org/resource/Biochemistry http://dbpedia.org/resource/Neurotransmitter http://dbpedia.org/resource/Physiology http://dbpedia.org/resource/Biology regards, Nathan Nathan, Re. Matthias's doc (we don't pick up anything useful via OpenCalais, Zemanta, Alchemy or any of the other Meta Cartridges), what did you use for the entity extraction? I ask because we have Bio2RDF and the Linked Open Drug Data etc. loaded at: http://lod.openlinksw.com .
If the life science realm entities are extracted, we can get FCT to locate associated entities; all that is required is an extractor cartridge that is placed ahead of the LOD lookup re. the Sponger workflow. Note: Meta Cartridges are about doing lookups on the graphs produced by the Extractor Cartridges; basically, it's about augmenting the graph with URIs culled from a variety of Linked Data space lookups. I am very intrigued re. your extractor :-) Re. your doc: URIBurner [1] or our Live Demo Server [2], both include the Sponger with fully loaded Cartridges. The Sponger generates proxy/wrapper Linked Data URIs and it also makes a local Graph IRI using the URL of the sponged resource. Thus, based on our AlchemyAPI meta cartridge, you can do the following using the /sparql [3] or /isparql [4] (*this one lets you share results pages or query definition pages via URLs*) services associated with either instance.

If you want to forcefully clear out the cache when your query is executed:

DEFINE get:soft "replace"
SELECT ?o2 ?o3
FROM <http://webr3.org/__play/optimal/webr3.html>
WHERE { ?s rdfs:seeAlso ?o .
        ?o <http://rdf.alchemyapi.com/rdf/v1/s/aapi-schema#Disambiguation> ?o2 .
        ?o2 ?p ?o3 }
LIMIT 50

If you are happy to work with a warm cache, or to let the Virtuoso instance deal with cache invalidation, then:

SELECT ?o2 ?o3
FROM <http://webr3.org/__play/optimal/webr3.html>
WHERE { ?s rdfs:seeAlso ?o .
        ?o <http://rdf.alchemyapi.com/rdf/v1/s/aapi-schema#Disambiguation> ?o2 .
        ?o2 ?p ?o3 }
LIMIT 50

## For just distinct DBpedia URIs

DEFINE get:soft "replace"
SELECT DISTINCT ?o ?dbp
FROM <http://webr3.org/__play/optimal/webr3.html>
WHERE { ?s rdfs:seeAlso ?o .
        ?o ?p ?dbp
        FILTER ( regex(str(?dbp), ".*dbpedia.org") ) }

Links
1. http://uriburner.com/sparql
2. http://uriburner.com/isparql
3. http://demo.openlinksw.com/sparql
4. http://demo.openlinksw.com/isparql
5. http://bit.ly/bAY8Uy -- SPARQL Protocol URL for the first query above
6.
http://bit.ly/amqdfE -- ditto, but seeking all DBpedia URIs from the sponged resource -- Regards, Kingsley Idehen President CEO OpenLink Software Web: http://www.openlinksw.com Weblog: http://www.openlinksw.com/blog/~kidehen Twitter: kidehen
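Queries like the ones above can be consumed programmatically via the SPARQL Protocol, which returns a standard JSON results document when asked for `application/sparql-results+json`. A rough sketch with the Python standard library follows; the endpoint URL comes from link [1] above, but the sample response parsed here is illustrative, not actual endpoint output:

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://uriburner.com/sparql"  # link [1] above

QUERY = """
SELECT DISTINCT ?o ?dbp FROM <http://webr3.org/__play/optimal/webr3.html>
WHERE { ?s rdfs:seeAlso ?o . ?o ?p ?dbp
        FILTER ( regex(str(?dbp), ".*dbpedia.org") ) }
"""

def build_request(endpoint, query):
    """SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"})

def dbpedia_uris(results_json):
    """Pull DBpedia URIs out of a SPARQL JSON results document."""
    out = []
    for binding in results_json["results"]["bindings"]:
        for var in results_json["head"]["vars"]:
            b = binding.get(var)
            if b and b["type"] == "uri" and "dbpedia.org" in b["value"]:
                out.append(b["value"])
    return out

# A live call would be:
#   json.load(urllib.request.urlopen(build_request(ENDPOINT, QUERY)))
# Here we parse a minimal illustrative response instead:
sample = json.loads("""{"head": {"vars": ["o", "dbp"]},
 "results": {"bindings": [
   {"o":   {"type": "uri", "value": "http://example.org/x"},
    "dbp": {"type": "uri", "value": "http://dbpedia.org/resource/Serotonin"}}]}}""")
print(dbpedia_uris(sample))
```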
Re: Conferences about Semantics
On 2 Feb 2010, at 18:23, Luciano Bernardes de Paula wrote: I would like to know some conferences regarding Semantics and linked data to submit papers.

ISWC
ESWC
LDOW
I-Semantics

There are many more. Best, Richard Any hint/advice? Best regards, Luciano - Luciano B. de Paula DCA - FEEC - Unicamp www.dca.fee.unicamp.br/~luciano
Re: [fresnel] Fresnel: State of the Art?
Disclaimer: I am only a user, so not into understanding it! We use Fresnel for our details panes in rkbexplorer, and for the "why?" pages and Google gadgets as well (have been using it for quite a few years now). It has been pretty effective, and productive; it has been relatively easy to add the new ontological synonyms as they have come along, so we can really use the open web of data. For example the homepage things. It may also be that the version we use is rather old, as I think there are things like f:alternateProperties or maybe it is f:mergeProperties that we can't use. However, it is now a bit fragile to use - not because of the software (we use JFresnel), but by the time you have over 800 lines of Fresnel N3 with terms coming from more than 15 ontologies, it becomes a bit like writing machine code. And as hard to debug. I keep wanting to write a system to generate or maintain it, but can't find the time. Mind you, not sure what it would look like in Protégé - maybe that is the answer? But then I would need to find the time to investigate, and in the end it ain't broke so I haven't fixed it. :-) But it is certainly an appropriate component in the scheme of the Web of Data, and a polishing might be beneficial, especially if it resulted in support tools. Best Hugh On 01/02/2010 14:09, Axel Rauschmayer a...@rauschma.de wrote: I think it would make sense at some point in time to work on Fresnel 2. My experiences (while implementing editing extensions for Fresnel for Hyena [1]) were as follows: - Fresnel works great for editing, with a few extensions. I've found some things to be too complicated (mainly formats and the rules for applying them) for my taste, so I would simplify those for Fresnel 2. - For HTML *display*, I now prefer templating (with ideas similar to JSP). It gives you more control and is conceptually very simple. RDF templating would benefit from standardizing, too; I've just recently seen a paper somewhere that describes (yet another...) 
RDF templating mechanism. - Fresnel is still useful for editing and for targeting multiple display architectures (e.g. HTML and PDF, e.g. via iText). It is perfect when a form is all you need. [1] http://hypergraphs.de/hyena/ Does this make sense? Does anyone (dis)agree (possibly vehemently ;-) ? Axel On Feb 1, 2010, at 14:44 , Aldo Bucchi wrote: Hi, I was looking at the current JFresnel codebase and the project seems to have little movement. I was wondering if this is the state of the art regarding Declarative Presentation Knowledge for RDF or have efforts moved elsewhere and I have missed it? Thanks! A -- Aldo Bucchi skype:aldo.bucchi http://www.univrz.com/ http://aldobucchi.com/ PRIVILEGED AND CONFIDENTIAL INFORMATION This message is only for the use of the individual or entity to which it is addressed and may contain information that is privileged and confidential. If you are not the intended recipient, please do not distribute or copy this communication, by e-mail or otherwise. Instead, please notify us immediately by return e-mail.
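For readers who have not seen the Fresnel N3 Hugh mentions, a minimal lens in the style of the Fresnel note's examples looks like the following. This is an illustrative sketch (the `:` namespace and lens name are made up); the `fresnel:alternateProperties` construct is the kind of thing used for the "ontological synonyms" discussed above:

```turtle
@prefix fresnel: <http://www.w3.org/2004/09/fresnel#> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix :        <http://example.org/lenses#> .

# Show a person's name (trying synonymous properties in order),
# then their homepage; other properties are simply not shown.
:personLens a fresnel:Lens ;
    fresnel:classLensDomain foaf:Person ;
    fresnel:showProperties (
        [ a fresnel:PropertyDescription ;
          fresnel:alternateProperties ( foaf:name rdfs:label ) ]
        foaf:homepage ) .
```

One can see how 800 lines of this across 15 ontologies would get hard to maintain by hand, which is the tooling gap Hugh points at.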
Re: Conferences about Semantics
Luciano Bernardes de Paula wrote: Hi, everyone. I would like to know some conferences regarding Semantics and linked data to submit papers. Any hint/advice? Best regards, Luciano - Luciano B. de Paula DCA - FEEC - Unicamp www.dca.fee.unicamp.br/~luciano Try: http://lod.openlinksw.com/fct/facet.vsp?cmd=load&fsq_id=2 Just filter to your needs across the Type or Property dimensions. Basically, once set, click on the "Distinct values with Counts" or "Show values" links, and after that click on the URI of those entities that interest you. -- Regards, Kingsley Idehen President & CEO OpenLink Software Web: http://www.openlinksw.com Weblog: http://www.openlinksw.com/blog/~kidehen Twitter: kidehen
Re: Conferences about Semantics
On 02/02/10 18:15, Kingsley Idehen wrote: Luciano Bernardes de Paula wrote: Hi, everyone. I would like to know some conferences regarding Semantics and linked data to submit papers. Any hint/advice? Best regards, Luciano - Luciano B. de Paula DCA - FEEC - Unicamp www.dca.fee.unicamp.br/~luciano Try: http://lod.openlinksw.com/fct/facet.vsp?cmd=load&fsq_id=2 Just filter to your needs across the Type or Property dimensions. Basically, once set, click on the "Distinct values with Counts" or "Show values" links, and after that click on the URI of those entities that interest you. Interestingly, although very relevant, the Semantic Web tracks at WWW XXX (e.g. WWW 2010 or WWW 2009) don't show up... Cheers D --- Prof. Daniel Schwabe Dep. de Informática, PUC-Rio R. M. de S. Vicente, 225, Rio de Janeiro, RJ 22453-900 Tel. +55 21 3114 1500 x. 4356
Re: Conferences about Semantics
On 02/02/10 16:23, Luciano Bernardes de Paula wrote: Hi, everyone. I would like to know some conferences regarding Semantics and linked data to submit papers. Any hint/advice? Also relevant: http://semanticweb.org/wiki/Events. The Dogfood server at SemanticWeb.org has a collection of the main Semantic Web conference papers for the past few years, and is being kept up-to-date. Cheers D --- Prof. Daniel Schwabe Dep. de Informática, PUC-Rio R. M. de S. Vicente, 225, Rio de Janeiro, RJ 22453-900 Tel. +55 21 3114 1500 x. 4356
Re: DBpedia-based entity recognition service / tool?
On 2/2/10 7:26 AM, Matthias Samwald wrote: I would be glad to hear your advice on how to best accomplish a simple task: extracting DBpedia entities (identified with DBpedia URIs) from a string of text. With good accuracy and recall, possibly with some options to constraint the recognized entities to some subset of DBpedia, ... This is closely related to the task of the Knowledge Base Population track [1] that was run as part of the NIST 2009 Text Analysis Conference [2]. The KBP track required systems to do two tasks: entity linking and slot filling. For entity linking, participants had to take an entity mention (e.g., CDC) and a document in which it appeared, and decide which of the ~800K entities in a reference KB derived from Wikipedia it referred to, or NIL if it was thought to refer to none of them. For the slot filling part, participants started with a KB entity and had to fill in as many of the unknown slots (i.e., properties) as possible by finding answers in a collection of 1.3 million English newswire articles. Each value had to be linked to evidence -- a pointer to the part of a document from which it was derived. If the slot values were also entities in the KB, then a link to them was supposed to also be found. The competition and workshop had 13 groups participating. Papers describing the systems and the results should be available online later this month. Several of them did very well on the entity linking task even when a large proportion of the queries were not in the KB and should resolve to NIL. The systems generally did much worse on the more difficult slot filling task. There will be another, revised version of the KBP track [3] held in 2010. 
[1] http://apl.jhu.edu/~paulmac/kbp.html [2] http://www.nist.gov/tac/ [3] http://nlp.cs.qc.cuny.edu/kbp/2010/ -- Tim Finin, Computer Science and Electrical Engineering, University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore MD 21250 http://umbc.edu/~finin fi...@umbc.edu 410-455-3522 fax:-3969 http://ebiquity.umbc.edu/ tfi...@gmail.com
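The entity-linking task Tim describes can be illustrated with a deliberately naive baseline: resolve a mention by exact (case-insensitive) name lookup against a reference KB, answering NIL when the mention is unknown. The KB contents and entity IDs below are made up for illustration; real KBP systems add context-based disambiguation on top of candidate lookup:

```python
# Toy entity-linking baseline for a KBP-style task: exact name match
# against a small illustrative KB, with NIL for unknown mentions.

KB = {
    "cdc":  "E0001",  # hypothetical ID: Centers for Disease Control and Prevention
    "nist": "E0002",  # hypothetical ID: National Institute of Standards and Technology
}

def link_entity(mention, kb=KB):
    """Return the KB entity ID for a mention, or "NIL" if absent."""
    return kb.get(mention.strip().lower(), "NIL")

print(link_entity("CDC"))          # matches via case-normalized lookup
print(link_entity("Unknown Org"))  # not in the KB, so NIL
```

The hard part the KBP systems compete on is exactly what this sketch omits: using the surrounding document to decide between candidates (is "CDC" the health agency or something else?) and recognizing true NILs.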
Re: Conferences about Semantics
Daniel Schwabe wrote: On 02/02/10 18:15, Kingsley Idehen wrote: Luciano Bernardes de Paula wrote: Hi, everyone. I would like to know some conferences regarding Semantics and linked data to submit papers. Any hint/advice? Best regards, Luciano - Luciano B. de Paula DCA - FEEC - Unicamp www.dca.fee.unicamp.br/~luciano Try: http://lod.openlinksw.com/fct/facet.vsp?cmd=load&fsq_id=2 Just filter to your needs across the Type or Property dimensions. Basically, once set, click on the "Distinct values with Counts" or "Show values" links, and after that click on the URI of those entities that interest you. Interestingly, although very relevant, the Semantic Web tracks at WWW XXX (e.g. WWW 2010 or WWW 2009) don't show up... Worst case, the data sets haven't been loaded. Give me a URL for the RDF dumps, if such exist. Kingsley Cheers D --- Prof. Daniel Schwabe Dep. de Informática, PUC-Rio R. M. de S. Vicente, 225, Rio de Janeiro, RJ 22453-900 Tel. +55 21 3114 1500 x. 4356 -- Regards, Kingsley Idehen President & CEO OpenLink Software Web: http://www.openlinksw.com Weblog: http://www.openlinksw.com/blog/~kidehen Twitter: kidehen
foaf dataset
Dear LODers, We are looking for a FOAF dataset. UMBC collected one some years ago [1]. Does anyone know of a newer or bigger dataset? Thanks! [1] http://ebiquity.umbc.edu/blogger/2005/01/25/foaf-dataset-available/ - Jie Bao http://www.cs.rpi.edu/~baojie
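FOAF datasets like the UMBC one are typically gathered by crawling rdfs:seeAlso links between profiles ("scuttering"). A minimal sketch of the extraction step, using only the standard library and run on an inline sample document rather than a live fetch (the sample profile and URL are made up):

```python
# Sketch: pull foaf:name values and rdfs:seeAlso crawl links out of a
# FOAF profile serialized as RDF/XML. A real crawler would fetch each
# seeAlso URL and repeat; here we parse an inline illustrative sample.
import xml.etree.ElementTree as ET

RDF  = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RDFS = "{http://www.w3.org/2000/01/rdf-schema#}"
FOAF = "{http://xmlns.com/foaf/0.1/}"

SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Jie Bao</foaf:name>
    <rdfs:seeAlso rdf:resource="http://example.org/friend.rdf"/>
  </foaf:Person>
</rdf:RDF>"""

def names_and_links(rdfxml):
    """Return (person names, rdfs:seeAlso URLs) from a FOAF RDF/XML doc."""
    root = ET.fromstring(rdfxml)
    names, links = [], []
    for person in root.iter(FOAF + "Person"):
        for name in person.iter(FOAF + "name"):
            names.append(name.text)
        for also in person.iter(RDFS + "seeAlso"):
            links.append(also.get(RDF + "resource"))
    return names, links

print(names_and_links(SAMPLE))
```

Note this only handles the common RDF/XML shape shown; a robust crawler would use a full RDF parser, since FOAF in the wild comes in many serializations.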