Hi Tim, Jörg, Jens and Georgi
> 1. In
>
> http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FTim_Berners-Lee%3E
> done
>
> It says my foaf:name is "Berners-Lee, Tim". I think that that is a
> bug --- the foaf:name is normally in the order one would use it in
> conversation.
>
> I get things like "You know Berners-Lee, Tim (unconfirmed). You and
> Berners-Lee, Tim know no people in common" which don't read very well.
The foaf:name value "Berners-Lee, Tim" is extracted from the Personendaten
template in the German Wikipedia, shown below:
{{Personendaten
|NAME=Berners-Lee, Tim
|ALTERNATIVNAMEN=Timothy John Berners-Lee
|KURZBESCHREIBUNG=Erfinder des [[World Wide Web]]
|GEBURTSDATUM=8. Juni 1955
|GEBURTSORT=[[London]]
|STERBEDATUM=
|STERBEORT=
}}
Therefore "Berners-Lee, Tim" is not really a bug, but rather raises the
question of how much data normalization and data cleansing we should do in
the extraction process.
Currently, we extract data without changing it. What we would actually need
is knowledge about how to change and normalize the values of each individual
template. As there are lots of different templates and new ones are
appearing regularly, collecting this knowledge is a huge task.
In many cases it is also an open question whether we should fine-tune our
extraction code to deal with irregularities within Wikipedia or if it would
be better to use some robots to normalize the data within the Wikipedia
templates.
As the FOAF properties are generated by the template-specific
PersondataExtractor (
http://dbpedia.svn.sourceforge.net/viewvc/dbpedia/extraction/extractors/PersondataExtractor.php?revision=421&view=markup)
it would not be a big deal to look for the comma within the string and
swap the surname and given name accordingly.
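For illustration, the comma heuristic could look roughly like the Python sketch below. The real extractor is written in PHP, and the function name normalize_person_name is made up for this example. I am also assuming that we only swap when the value contains exactly one comma, so that values with suffixes or titles are left untouched.

```python
def normalize_person_name(name: str) -> str:
    """Turn "Surname, Given" into "Given Surname".

    Hypothetical helper, not the actual DBpedia extractor code.
    Values without exactly one comma are returned unchanged.
    """
    parts = name.split(",")
    if len(parts) == 2:
        surname, given = (part.strip() for part in parts)
        # Swap only if both halves are non-empty.
        if surname and given:
            return f"{given} {surname}"
    return name
```

With this sketch, normalize_person_name("Berners-Lee, Tim") yields "Tim Berners-Lee", while a value such as "King, Martin Luther, Jr." is passed through as it is.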
Georgi, Jens, Jörg: Do you think we would run into any other problems with
this heuristic?
If not, could you change the PersondataExtractor for the January release
of the dataset?
> 2. Currently each of the terms in the ontology is looked up separately
> with two fetches (303 and 200). This takes an intolerable amount of time.
> Appended is a list of some of the fetches from a typical session.
>
> Could we just have a single ontology file, possibly? Or a few, for
> different aspects?
Hmm, this touches on the general scalability problems of Linked Data when
datasets are heavily interlinked (all people in New York) or use a lot of
different properties.
The DBpedia ontology consists of 57 000 triples
(http://dbpedia.org/docs/downloads/2007-08-30/infoboxes.properties.nt.bz2)
and, as I guess Tabulator does not want to load all of them, the single
ontology file solution does not work for DBpedia.
Separating them into several smaller files does not really work either,
because of the distribution of properties. For instance,
dbpedia.org/property/abstract occurs together with lots of different
properties.
A workaround for DBpedia could be to put all properties that occur
frequently into one file and then have another ontology file for each
template in Wikipedia, which would contain the definitions of all properties
that are generated by this template. In order to check whether this would
help in the DBpedia case, we have to investigate how big the single file
with the frequent properties would be.
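To make the idea a bit more concrete, here is a rough Python sketch of such a split. Everything in it is an assumption for illustration: the function name split_ontology, the input format (one (template, property) pair per extracted triple) and the min_count threshold that decides what counts as "frequent". I also assume that the per-template files leave out the properties that already sit in the common file.

```python
from collections import Counter, defaultdict


def split_ontology(usage, min_count):
    """Split properties into one common file and per-template files.

    usage: iterable of (template, property) pairs, one per extracted triple
           (hypothetical input format).
    min_count: a property used at least this often goes into the common file.
    Returns (common, per_template), where per_template maps a template name
    to the set of its properties that are not in the common file.
    """
    counts = Counter()
    by_template = defaultdict(set)
    for template, prop in usage:
        counts[prop] += 1
        by_template[template].add(prop)

    common = {prop for prop, n in counts.items() if n >= min_count}

    per_template = {}
    for template, props in by_template.items():
        rest = props - common
        if rest:  # skip templates that only use common properties
            per_template[template] = rest
    return common, per_template
```

The size of the first return value, len(common), is the number we would have to look at for different thresholds in order to see how big the file with the frequent properties gets.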
But I think that it would also be very useful for the Semantic Web as a
whole to think about the general case, where datasets contain 100 000s of
properties which have no specific distribution.
Does anybody have ideas for this?
Cheers
Chris
--
Chris Bizer
Freie Universität Berlin
+49 30 838 54057
[EMAIL PROTECTED]
www.bizer.de
----- Original Message -----
From: Tim Berners-Lee
To: Chris Bizer
Cc: Richard Cyganiak
Sent: Saturday, December 29, 2007 5:16 PM
Subject: DBPedia bugs: Property lookups take too long, names in unusual
order
Hi Chris,
I hope the new year finds you well. Here are a couple of requests for
DBpedia.
1. In
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FTim_Berners-Lee%3E
done
It says my foaf:name is "Berners-Lee, Tim". I think that that is a bug ---
the foaf:name is normally in the order one would use it in conversation.
I get things like "You know Berners-Lee, Tim (unconfirmed). You and
Berners-Lee, Tim know no people in common" which don't read very well.
2. Currently each of the terms in the ontology is looked up separately
with two fetches (303 and 200). This takes an intolerable amount of time.
Appended is a list of some of the fetches from a typical session.
Could we just have a single ontology file, possibly? Or a few, for different
aspects?
I think that the next year will be a very interesting one for linked data.
In the break, I have been playing with user-friendly interfaces in
Tabulator for social networking. Apologies that it is totally
English-based.
Tim
http://www.w3.org/2004/02/skos/core done
http://dbpedia.org/property/reference done
http://dbpedia.org/property/name done
http://dbpedia.org/property/abstract done
http://xmlns.com/foaf/0.1/surname done
http://xmlns.com/foaf/0.1/givenname done
http://dbpedia.org/class/yago/Laureate110249011 done
http://dbpedia.org/property/wikipage-zh done
http://dbpedia.org/class/yago/Academician109759069 done
http://dbpedia.org/class/yago/Entity100001740 done
http://dbpedia.org/property/wikipage-es done
http://dbpedia.org/property/wikipage-ru done
http://dbpedia.org/property/birthDate done
http://dbpedia.org/class/yago/Scientist110560637 done
http://dbpedia.org/property/caption done
http://dbpedia.org/class/yago/Colleague109935990 done
http://dbpedia.org/class/yago/Inventor110214637 done
http://dbpedia.org/property/wikipage-sv done
http://dbpedia.org/property/birthPlace done
http://dbpedia.org/property/placeOfBirth done
http://dbpedia.org/property/wikipage-ja done
http://dbpedia.org/property/founder done
http://dbpedia.org/property/wikiPageUsesTemplate done
http://dbpedia.org/property/wikipage-de done
http://dbpedia.org/property/alternativeNames done
http://dbpedia.org/property/birth done
http://dbpedia.org/property/wikipage-pt done
http://dbpedia.org/class/yago/Person100007846 done
http://xmlns.com/foaf/0.1/page done
http://dbpedia.org/class/yago/Programmer110481268 done
http://dbpedia.org/property/dateOfBirth done
http://dbpedia.org/property/developer done
http://xmlns.com/foaf/0.1/depiction done
http://dbpedia.org/class/yago/Writer110794014 done
http://dbpedia.org/property/wikipage-fi done
http://dbpedia.org/property/wikipage-no done
http://dbpedia.org/property/wikipage-it done
http://dbpedia.org/property/wikipage-fr done
http://dbpedia.org/property/wikipage-pl done
http://www4.wiwiss.fu-berlin.de/dblp/resource/person/100007 done
http://dbpedia.org/property/hasPhotoCollection done
http://dbpedia.org/property/wikipage-nl done
http://dbpedia.org/property/shortDescription done
http://dbpedia.org/class/yago/Blogger109860415 done
http://dbpedia.org/property/image
remove
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Freference%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fabstract%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fname%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fclass%2Fyago%2FLaureate110249011%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fwikipage-zh%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fclass%2Fyago%2FAcademician109759069%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fclass%2Fyago%2FEntity100001740%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fwikipage-es%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fwikipage-ru%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2FbirthDate%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fclass%2Fyago%2FScientist110560637%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fcaption%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fclass%2Fyago%2FColleague109935990%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fclass%2Fyago%2FInventor110214637%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fwikipage-sv%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2FbirthPlace%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2FplaceOfBirth%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fwikipage-ja%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Ffounder%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2FwikiPageUsesTemplate%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fwikipage-de%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2FalternativeNames%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fbirth%3E
done
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fwikipage-pt%3E
requestTimeout
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fclass%2Fyago%2FPerson100007846%3E
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fclass%2Fyago%2FProgrammer110481268%3E
requestTimeout
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2FdateOfBirth%3E
requestTimeout
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fdeveloper%3E
requestTimeout
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fclass%2Fyago%2FWriter110794014%3E
requestTimeout
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fwikipage-fi%3E
requestTimeout
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fwikipage-no%3E
requestTimeout
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fwikipage-it%3E
requestTimeout
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fwikipage-fr%3E
requestTimeout
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fwikipage-pl%3E
requestTimeout
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2FhasPhotoCollection%3E
requestTimeout
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fwikipage-nl%3E
requestTimeout
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2FshortDescription%3E
requestTimeout
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fclass%2Fyago%2FBlogger109860415%3E
requestTimeout
http://dbpedia.openlinksw.com:8890/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=DESCRIBE+%3Chttp%3A%2F%2Fdbpedia.org%2Fproperty%2Fimage%3E
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion