Hi all,
attached are some more problems from the English N-Quads dumps that my
scripts were able to filter out. What they were not able to filter out are
things like this:
<http://dbpedia.org/resource/Sean_Gustus>
<http://dbpedia.org/property/coachingyears>
"200220032004"^^<http://www.w3.org/2001/XMLSchema#int>
<http://en.wikipedia.org/wiki/Sean_Gustus#absolute-line=11> .
<http://dbpedia.org/resource/Rui_Miguel_Silva>
<http://dbpedia.org/property/titleYears>
"2001200220032004"^^<http://www.w3.org/2001/XMLSchema#int>
<http://en.wikipedia.org/wiki/Rui_Miguel_Silva#absolute-line=28> .
xsd:int is a signed 32-bit integer with a maximum value of 2147483647, so
neither of these values is a valid xsd:int.
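A minimal sketch of a check for this case, assuming one quad per line as in the dump files. The name has_bad_xsd_int is made up for illustration and is not part of my existing scripts; the lexical check is deliberately strict (optional sign plus digits only).

```python
import re

XSD_INT = "http://www.w3.org/2001/XMLSchema#int"
INT_MIN, INT_MAX = -2**31, 2**31 - 1  # value range of a signed 32-bit integer

# Matches a typed literal such as "123"^^<http://www.w3.org/2001/XMLSchema#int>
# and captures the lexical form between the quotes.
INT_LITERAL = re.compile(r'"([^"\\]*)"\^\^<' + re.escape(XSD_INT) + r'>')


def has_bad_xsd_int(line: str) -> bool:
    """Return True if the line holds an xsd:int literal that is malformed or out of range."""
    for lexical in INT_LITERAL.findall(line):
        # Hypothetical strictness: anything but an optionally signed digit string is rejected.
        if not re.fullmatch(r"[+-]?[0-9]+", lexical):
            return True
        if not INT_MIN <= int(lexical) <= INT_MAX:
            return True
    return False


if __name__ == "__main__":
    quad = (
        "<http://dbpedia.org/resource/Sean_Gustus> "
        "<http://dbpedia.org/property/coachingyears> "
        '"200220032004"^^<http://www.w3.org/2001/XMLSchema#int> '
        "<http://en.wikipedia.org/wiki/Sean_Gustus#absolute-line=11> ."
    )
    print(has_bad_xsd_int(quad))
```

Lines for which the function returns True could be written to a separate file for inspection rather than silently dropped.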
Regarding the problems with case in the YAGO URIs: I will not try to filter
out the problematic quads but will instead stick with DBpedia 3.6.
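For anyone who would rather repair the URIs than downgrade, here is a minimal sketch. It assumes that the class URIs from the 3.6 files are available as a reference and that the broken URIs differ from them only in letter case; neither assumption has been checked against the dumps. The function names are hypothetical.

```python
def build_case_map(reference_uris):
    """Map each lowercased URI to its spelling in the reference set.

    The reference set is assumed to come from DBpedia 3.6. Keys that
    correspond to more than one spelling are dropped, because they
    cannot be repaired unambiguously.
    """
    case_map = {}
    ambiguous = set()
    for uri in reference_uris:
        key = uri.lower()
        if key in case_map and case_map[key] != uri:
            ambiguous.add(key)
        case_map[key] = uri
    for key in ambiguous:
        del case_map[key]
    return case_map


def repair_case(uri, case_map):
    """Return the reference spelling of uri, or uri unchanged if it is unknown."""
    return case_map.get(uri.lower(), uri)


if __name__ == "__main__":
    # Presumed correct spelling; an assumption, not taken from the 3.6 files.
    reference = ["http://dbpedia.org/class/yago/Conductor109952539"]
    case_map = build_case_map(reference)
    print(repair_case("http://dbpedia.org/class/yago/ConduCtor109952539", case_map))
```

URIs that are missing from the reference set pass through unchanged, so classes that are new since 3.6 would still need a manual look.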
Regards,
Michael Brunnbauer
On Mon, Oct 03, 2011 at 07:04:20PM -0400, Kingsley Idehen wrote:
> On 10/3/11 6:57 PM, David Butler wrote:
> >Thanks Kingsley, much appreciated!
> >
> >Do you have any idea how soon the data is planned to be cleaned up?
>
> The extractors need to be fixed first, then the dumps regenerated.
> Alternatively, the dumps can also be tweaked via text processing and
> transformation. Once this is done, we just load the data, etc.
>
> Thus, for now it's more about fixing the dumps.
>
> Kingsley
> >
> >Thanks,
> >David
> >
> >On Mon, Oct 3, 2011 at 1:05 PM, Kingsley Idehen <[email protected]> wrote:
> >
> > On 10/3/11 3:28 PM, David Butler wrote:
> >> This is related to the owl:suBClassOf typo mentioned in another
> >> thread. I noticed this as well and fixed it manually in my local
> >> instance, BUT...
> >>
> >> It turns out that lots of YAGO type names are also messed up. For
> >> example:
> >>
> >> http://dbpedia.org/class/yago/ConduCtor109952539
> >> http://dbpedia.org/class/yago/TheatricalProduCEr110705448
> >> http://dbpedia.org/class/yago/StuDEntTeacher110666259
> >> http://dbpedia.org/class/yago/EduCAtor110045713
> >> http://dbpedia.org/class/yago/PrisonGuArd110149867
> >> etc.
> >>
> >> At first I saw no pattern, but now my theory is that the type
> >> names were post-processed to capitalize common abbreviations
> >> (such as for U.S. states, countries, elements on the periodic
> >> table, and AD/BC/CE).
> >>
> >> If anyone is relying heavily on the YAGO types, they will be
> >> forced to revert back to the 3.6 version of yago_links.nt if this
> >> isn't repaired. My recommendation/request would be to fix and
> >> release a new version of this file.
> >>
> >> Thanks,
> >> David
> >>
> >>
> >>
> >> ------------------------------------------------------------------------------
> >> All the data continuously generated in your IT infrastructure
> >> contains a
> >> definitive record of customers, application performance, security
> >> threats, fraudulent activity and more. Splunk takes this data and
> >> makes
> >> sense of it. Business sense. IT sense. Common sense.
> >> http://p.sf.net/sfu/splunk-d2dcopy1
> >>
> >>
> >> _______________________________________________
> >> Dbpedia-discussion mailing list
> >> [email protected]
> >> <mailto:[email protected]>
> >> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
> >
> > Once all the broken items are fixed, we can just reload or update
> > the DBMS. I don't want this to happen without a serious amount of
> > cleanups being completed first. Thus, we will need to know when
> > all the issues have been resolved along these lines.
> >
> > --
> >
> > Regards,
> >
> > Kingsley Idehen
> > President & CEO
> > OpenLink Software
> > Web: http://www.openlinksw.com
> > Weblog: http://www.openlinksw.com/blog/~kidehen
> > Twitter/Identi.ca: kidehen
> >
> >
> >
>
>
> --
>
> Regards,
>
> Kingsley Idehen
> President & CEO
> OpenLink Software
> Web: http://www.openlinksw.com
> Weblog: http://www.openlinksw.com/blog/~kidehen
> Twitter/Identi.ca: kidehen
>
--
++ Michael Brunnbauer
++ netEstate GmbH
++ Geisenhausener Straße 11a
++ 81379 München
++ Tel +49 89 32 19 77 80
++ Fax +49 89 32 19 77 89
++ E-Mail [email protected]
++ http://www.netestate.de/
++
++ Sitz: München, HRB Nr.142452 (Handelsregister B München)
++ USt-IdNr. DE221033342
++ Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++ Prokurist: Dipl. Kfm. (Univ.) Markus Hendel
<http://dbpedia.org/resource/Stair_Hole>
<http://dbpedia.org/ontology/thumbnail>
<http://upload.wikimedia.org/wikipedia/commons/thumb/c/c1/Stair_Hole_%26_the_Lulworth_Crumple,_Lulworth_Cove,_UK_-_180°_Panorama.jpg/200px-Stair_Hole_%26_the_Lulworth_Crumple,_Lulworth_Cove,_UK_-_180°_Panorama.jpg>
<http://en.wikipedia.org/wiki/Stair_Hole#absolute-line=5> .
<http://dbpedia.org/resource/Stair_Hole> <http://xmlns.com/foaf/0.1/depiction>
<http://upload.wikimedia.org/wikipedia/commons/c/c1/Stair_Hole_%26_the_Lulworth_Crumple,_Lulworth_Cove,_UK_-_180°_Panorama.jpg>
<http://en.wikipedia.org/wiki/Stair_Hole#absolute-line=5> .
<http://upload.wikimedia.org/wikipedia/commons/c/c1/Stair_Hole_%26_the_Lulworth_Crumple,_Lulworth_Cove,_UK_-_180°_Panorama.jpg>
<http://purl.org/dc/elements/1.1/rights>
<http://en.wikipedia.org/wiki/File:Stair_Hole_%26_the_Lulworth_Crumple,_Lulworth_Cove,_UK_-_180°_Panorama.jpg>
<http://en.wikipedia.org/wiki/Stair_Hole#absolute-line=5> .
<http://upload.wikimedia.org/wikipedia/commons/c/c1/Stair_Hole_%26_the_Lulworth_Crumple,_Lulworth_Cove,_UK_-_180°_Panorama.jpg>
<http://xmlns.com/foaf/0.1/thumbnail>
<http://upload.wikimedia.org/wikipedia/commons/thumb/c/c1/Stair_Hole_%26_the_Lulworth_Crumple,_Lulworth_Cove,_UK_-_180°_Panorama.jpg/200px-Stair_Hole_%26_the_Lulworth_Crumple,_Lulworth_Cove,_UK_-_180°_Panorama.jpg>
<http://en.wikipedia.org/wiki/Stair_Hole#absolute-line=5> .
<http://upload.wikimedia.org/wikipedia/commons/thumb/c/c1/Stair_Hole_%26_the_Lulworth_Crumple,_Lulworth_Cove,_UK_-_180°_Panorama.jpg/200px-Stair_Hole_%26_the_Lulworth_Crumple,_Lulworth_Cove,_UK_-_180°_Panorama.jpg>
<http://purl.org/dc/elements/1.1/rights>
<http://en.wikipedia.org/wiki/File:Stair_Hole_%26_the_Lulworth_Crumple,_Lulworth_Cove,_UK_-_180°_Panorama.jpg>
<http://en.wikipedia.org/wiki/Stair_Hole#absolute-line=5> .
<http://dbpedia.org/resource/D%C3%A9ville-l%C3%A8s-Rouen>
<http://dbpedia.org/ontology/thumbnail>
<http://upload.wikimedia.org/wikipedia/commons/thumb/f/fc/Blason_ville_fr_Déville-lès-Rouen_%28Seine-Maritime%29.svg/200px-Blason_ville_fr_Déville-lès-Rouen_%28Seine-Maritime%29.svg.png>
<http://en.wikipedia.org/wiki/D%C3%A9ville-l%C3%A8s-Rouen#section=Heraldr&relative-line=3&absolute-line=28>
.
<http://dbpedia.org/resource/D%C3%A9ville-l%C3%A8s-Rouen>
<http://xmlns.com/foaf/0.1/depiction>
<http://upload.wikimedia.org/wikipedia/commons/f/fc/Blason_ville_fr_Déville-lès-Rouen_%28Seine-Maritime%29.svg>
<http://en.wikipedia.org/wiki/D%C3%A9ville-l%C3%A8s-Rouen#section=Heraldr&relative-line=3&absolute-line=28>
.
<http://upload.wikimedia.org/wikipedia/commons/f/fc/Blason_ville_fr_Déville-lès-Rouen_%28Seine-Maritime%29.svg>
<http://purl.org/dc/elements/1.1/rights>
<http://en.wikipedia.org/wiki/File:Blason_ville_fr_Déville-lès-Rouen_%28Seine-Maritime%29.svg>
<http://en.wikipedia.org/wiki/D%C3%A9ville-l%C3%A8s-Rouen#section=Heraldr&relative-line=3&absolute-line=28>
.
<http://upload.wikimedia.org/wikipedia/commons/f/fc/Blason_ville_fr_Déville-lès-Rouen_%28Seine-Maritime%29.svg>
<http://xmlns.com/foaf/0.1/thumbnail>
<http://upload.wikimedia.org/wikipedia/commons/thumb/f/fc/Blason_ville_fr_Déville-lès-Rouen_%28Seine-Maritime%29.svg/200px-Blason_ville_fr_Déville-lès-Rouen_%28Seine-Maritime%29.svg.png>
<http://en.wikipedia.org/wiki/D%C3%A9ville-l%C3%A8s-Rouen#section=Heraldr&relative-line=3&absolute-line=28>
.
<http://upload.wikimedia.org/wikipedia/commons/thumb/f/fc/Blason_ville_fr_Déville-lès-Rouen_%28Seine-Maritime%29.svg/200px-Blason_ville_fr_Déville-lès-Rouen_%28Seine-Maritime%29.svg.png>
<http://purl.org/dc/elements/1.1/rights>
<http://en.wikipedia.org/wiki/File:Blason_ville_fr_Déville-lès-Rouen_%28Seine-Maritime%29.svg>
<http://en.wikipedia.org/wiki/D%C3%A9ville-l%C3%A8s-Rouen#section=Heraldr&relative-line=3&absolute-line=28>
.
<http://dbpedia.org/resource/%C4%B0zmir_Atat%C3%BCrk_Anadolu_Teknik_Lisesi>
<http://dbpedia.org/ontology/thumbnail>
<http://upload.wikimedia.org/wikipedia/commons/thumb/9/90/İzmir_Atatürk_Anadolu_Teknik%2C_Teknik_ve_Endüstri_Meslek_Lisesi_Logosu.jpg/200px-İzmir_Atatürk_Anadolu_Teknik%2C_Teknik_ve_Endüstri_Meslek_Lisesi_Logosu.jpg>
<http://en.wikipedia.org/wiki/%C4%B0zmir_Atat%C3%BCrk_Anadolu_Teknik_Lisesi#absolute-line=5>
.
<http://dbpedia.org/resource/%C4%B0zmir_Atat%C3%BCrk_Anadolu_Teknik_Lisesi>
<http://xmlns.com/foaf/0.1/depiction>
<http://upload.wikimedia.org/wikipedia/commons/9/90/İzmir_Atatürk_Anadolu_Teknik%2C_Teknik_ve_Endüstri_Meslek_Lisesi_Logosu.jpg>
<http://en.wikipedia.org/wiki/%C4%B0zmir_Atat%C3%BCrk_Anadolu_Teknik_Lisesi#absolute-line=5>
.
<http://upload.wikimedia.org/wikipedia/commons/9/90/İzmir_Atatürk_Anadolu_Teknik%2C_Teknik_ve_Endüstri_Meslek_Lisesi_Logosu.jpg>
<http://purl.org/dc/elements/1.1/rights>
<http://en.wikipedia.org/wiki/File:İzmir_Atatürk_Anadolu_Teknik%2C_Teknik_ve_Endüstri_Meslek_Lisesi_Logosu.jpg>
<http://en.wikipedia.org/wiki/%C4%B0zmir_Atat%C3%BCrk_Anadolu_Teknik_Lisesi#absolute-line=5>
.
<http://upload.wikimedia.org/wikipedia/commons/9/90/İzmir_Atatürk_Anadolu_Teknik%2C_Teknik_ve_Endüstri_Meslek_Lisesi_Logosu.jpg>
<http://xmlns.com/foaf/0.1/thumbnail>
<http://upload.wikimedia.org/wikipedia/commons/thumb/9/90/İzmir_Atatürk_Anadolu_Teknik%2C_Teknik_ve_Endüstri_Meslek_Lisesi_Logosu.jpg/200px-İzmir_Atatürk_Anadolu_Teknik%2C_Teknik_ve_Endüstri_Meslek_Lisesi_Logosu.jpg>
<http://en.wikipedia.org/wiki/%C4%B0zmir_Atat%C3%BCrk_Anadolu_Teknik_Lisesi#absolute-line=5>
.
<http://upload.wikimedia.org/wikipedia/commons/thumb/9/90/İzmir_Atatürk_Anadolu_Teknik%2C_Teknik_ve_Endüstri_Meslek_Lisesi_Logosu.jpg/200px-İzmir_Atatürk_Anadolu_Teknik%2C_Teknik_ve_Endüstri_Meslek_Lisesi_Logosu.jpg>
<http://purl.org/dc/elements/1.1/rights>
<http://en.wikipedia.org/wiki/File:İzmir_Atatürk_Anadolu_Teknik%2C_Teknik_ve_Endüstri_Meslek_Lisesi_Logosu.jpg>
<http://en.wikipedia.org/wiki/%C4%B0zmir_Atat%C3%BCrk_Anadolu_Teknik_Lisesi#absolute-line=5>
.
<http://dbpedia.org/resource/A_Rose_for_the_Dead>
<http://dbpedia.org/ontology/thumbnail>
<http://upload.wikimedia.org/wikipedia/commons/thumb/d/dc/Theatre_of_Tragedy_â_Ð_rose_for_the_death_%28Album_cover%29.jpg/200px-Theatre_of_Tragedy_â_Ð_rose_for_the_death_%28Album_cover%29.jpg>
<http://en.wikipedia.org/wiki/A_Rose_for_the_Dead#absolute-line=5> .
<http://dbpedia.org/resource/A_Rose_for_the_Dead>
<http://xmlns.com/foaf/0.1/depiction>
<http://upload.wikimedia.org/wikipedia/commons/d/dc/Theatre_of_Tragedy_â_Ð_rose_for_the_death_%28Album_cover%29.jpg>
<http://en.wikipedia.org/wiki/A_Rose_for_the_Dead#absolute-line=5> .
<http://upload.wikimedia.org/wikipedia/commons/d/dc/Theatre_of_Tragedy_â_Ð_rose_for_the_death_%28Album_cover%29.jpg>
<http://purl.org/dc/elements/1.1/rights>
<http://en.wikipedia.org/wiki/File:Theatre_of_Tragedy_â_Ð_rose_for_the_death_%28Album_cover%29.jpg>
<http://en.wikipedia.org/wiki/A_Rose_for_the_Dead#absolute-line=5> .
<http://upload.wikimedia.org/wikipedia/commons/d/dc/Theatre_of_Tragedy_â_Ð_rose_for_the_death_%28Album_cover%29.jpg>
<http://xmlns.com/foaf/0.1/thumbnail>
<http://upload.wikimedia.org/wikipedia/commons/thumb/d/dc/Theatre_of_Tragedy_â_Ð_rose_for_the_death_%28Album_cover%29.jpg/200px-Theatre_of_Tragedy_â_Ð_rose_for_the_death_%28Album_cover%29.jpg>
<http://en.wikipedia.org/wiki/A_Rose_for_the_Dead#absolute-line=5> .
<http://upload.wikimedia.org/wikipedia/commons/thumb/d/dc/Theatre_of_Tragedy_â_Ð_rose_for_the_death_%28Album_cover%29.jpg/200px-Theatre_of_Tragedy_â_Ð_rose_for_the_death_%28Album_cover%29.jpg>
<http://purl.org/dc/elements/1.1/rights>
<http://en.wikipedia.org/wiki/File:Theatre_of_Tragedy_â_Ð_rose_for_the_death_%28Album_cover%29.jpg>
<http://en.wikipedia.org/wiki/A_Rose_for_the_Dead#absolute-line=5> .
<http://dbpedia.org/resource/%C4%8CD_Class_471>
<http://dbpedia.org/ontology/thumbnail>
<http://upload.wikimedia.org/wikipedia/commons/thumb/b/ba/Praha%2C_SmÃchov%2C_SmÃchovské_nádražÃ%2C_City_Elephant_(nový).jpg/200px-Praha%2C_SmÃchov%2C_SmÃchovské_nádražÃ%2C_City_Elephant_(nový).jpg>
<http://en.wikipedia.org/wiki/%C4%8CD_Class_471#absolute-line=3> .
<http://dbpedia.org/resource/%C4%8CD_Class_471>
<http://xmlns.com/foaf/0.1/depiction>
<http://upload.wikimedia.org/wikipedia/commons/b/ba/Praha%2C_SmÃchov%2C_SmÃchovské_nádražÃ%2C_City_Elephant_(nový).jpg>
<http://en.wikipedia.org/wiki/%C4%8CD_Class_471#absolute-line=3> .
<http://upload.wikimedia.org/wikipedia/commons/b/ba/Praha%2C_SmÃchov%2C_SmÃchovské_nádražÃ%2C_City_Elephant_(nový).jpg>
<http://purl.org/dc/elements/1.1/rights>
<http://en.wikipedia.org/wiki/File:Praha%2C_SmÃchov%2C_SmÃchovské_nádražÃ%2C_City_Elephant_(nový).jpg>
<http://en.wikipedia.org/wiki/%C4%8CD_Class_471#absolute-line=3> .
<http://upload.wikimedia.org/wikipedia/commons/b/ba/Praha%2C_SmÃchov%2C_SmÃchovské_nádražÃ%2C_City_Elephant_(nový).jpg>
<http://xmlns.com/foaf/0.1/thumbnail>
<http://upload.wikimedia.org/wikipedia/commons/thumb/b/ba/Praha%2C_SmÃchov%2C_SmÃchovské_nádražÃ%2C_City_Elephant_(nový).jpg/200px-Praha%2C_SmÃchov%2C_SmÃchovské_nádražÃ%2C_City_Elephant_(nový).jpg>
<http://en.wikipedia.org/wiki/%C4%8CD_Class_471#absolute-line=3> .
<http://upload.wikimedia.org/wikipedia/commons/thumb/b/ba/Praha%2C_SmÃchov%2C_SmÃchovské_nádražÃ%2C_City_Elephant_(nový).jpg/200px-Praha%2C_SmÃchov%2C_SmÃchovské_nádražÃ%2C_City_Elephant_(nový).jpg>
<http://purl.org/dc/elements/1.1/rights>
<http://en.wikipedia.org/wiki/File:Praha%2C_SmÃchov%2C_SmÃchovské_nádražÃ%2C_City_Elephant_(nový).jpg>
<http://en.wikipedia.org/wiki/%C4%8CD_Class_471#absolute-line=3> .
<http://dbpedia.org/resource/Kurt_Maetzig>
<http://dbpedia.org/ontology/thumbnail>
<http://upload.wikimedia.org/wikipedia/commons/thumb/3/3f/Bundesarchiv_Bild_183-33557-0001%2C_Heusdorf%2C_Besuch_von_Künstlern.jpg/200px-Bundesarchiv_Bild_183-33557-0001%2C_Heusdorf%2C_Besuch_von_Künstlern.jpg>
<http://en.wikipedia.org/wiki/Kurt_Maetzig#absolute-line=3> .
<http://dbpedia.org/resource/Kurt_Maetzig>
<http://xmlns.com/foaf/0.1/depiction>
<http://upload.wikimedia.org/wikipedia/commons/3/3f/Bundesarchiv_Bild_183-33557-0001%2C_Heusdorf%2C_Besuch_von_Künstlern.jpg>
<http://en.wikipedia.org/wiki/Kurt_Maetzig#absolute-line=3> .
<http://upload.wikimedia.org/wikipedia/commons/3/3f/Bundesarchiv_Bild_183-33557-0001%2C_Heusdorf%2C_Besuch_von_Künstlern.jpg>
<http://purl.org/dc/elements/1.1/rights>
<http://en.wikipedia.org/wiki/File:Bundesarchiv_Bild_183-33557-0001%2C_Heusdorf%2C_Besuch_von_Künstlern.jpg>
<http://en.wikipedia.org/wiki/Kurt_Maetzig#absolute-line=3> .
<http://upload.wikimedia.org/wikipedia/commons/3/3f/Bundesarchiv_Bild_183-33557-0001%2C_Heusdorf%2C_Besuch_von_Künstlern.jpg>
<http://xmlns.com/foaf/0.1/thumbnail>
<http://upload.wikimedia.org/wikipedia/commons/thumb/3/3f/Bundesarchiv_Bild_183-33557-0001%2C_Heusdorf%2C_Besuch_von_Künstlern.jpg/200px-Bundesarchiv_Bild_183-33557-0001%2C_Heusdorf%2C_Besuch_von_Künstlern.jpg>
<http://en.wikipedia.org/wiki/Kurt_Maetzig#absolute-line=3> .
<http://upload.wikimedia.org/wikipedia/commons/thumb/3/3f/Bundesarchiv_Bild_183-33557-0001%2C_Heusdorf%2C_Besuch_von_Künstlern.jpg/200px-Bundesarchiv_Bild_183-33557-0001%2C_Heusdorf%2C_Besuch_von_Künstlern.jpg>
<http://purl.org/dc/elements/1.1/rights>
<http://en.wikipedia.org/wiki/File:Bundesarchiv_Bild_183-33557-0001%2C_Heusdorf%2C_Besuch_von_Künstlern.jpg>
<http://en.wikipedia.org/wiki/Kurt_Maetzig#absolute-line=3> .
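The quads above all carry raw non-ASCII characters (such as °, é, İ or ü) inside an IRI. A minimal sketch of a check for that, assuming one quad per line in the dump; non_ascii_iris is a hypothetical helper, and whether the filter scripts mentioned at the top work this way is not stated here.

```python
import re

# Captures the content of every <...> term on a line.
IRI = re.compile(r"<([^<>]*)>")


def non_ascii_iris(line):
    """Return the IRIs on the line that contain at least one non-ASCII character."""
    return [
        iri
        for iri in IRI.findall(line)
        if any(ord(ch) > 127 for ch in iri)
    ]


if __name__ == "__main__":
    quad = (
        "<http://dbpedia.org/resource/Kurt_Maetzig> "
        "<http://xmlns.com/foaf/0.1/depiction> "
        "<http://upload.wikimedia.org/wikipedia/commons/3/3f/"
        "Bundesarchiv_Bild_183-33557-0001%2C_Heusdorf%2C_Besuch_von_Künstlern.jpg> "
        "<http://en.wikipedia.org/wiki/Kurt_Maetzig#absolute-line=3> ."
    )
    for iri in non_ascii_iris(quad):
        print(iri)
```

Percent-encoded IRIs such as the dbpedia.org/resource ones are plain ASCII and are therefore not reported.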