On Apr 15, 2014, at 5:40 AM, Jona Christopher Sahnwaldt <j...@sahnwaldt.de> wrote:

> On 15 April 2014 03:32, Patel-Schneider, Peter
> <peter.patel-schnei...@nuance.com> wrote:
>> 
>> […]
>> 
>> peter
>> 
>> PS:  Looking at intermediate results in this pipeline shows that DBpedia has
>> some issues with newlines in resource names.
>> 
> 
> That would be a bug. Could you provide specific examples?



I'm not sure that this is a bug. Perhaps it would be better to call it a hole
in some DBpedia algorithm.

There are (or were) Wikipedia pages where whatever DBpedia wants to use as the
IRI contains a newline (or, in one case below, a tab). The code treats this as
an illegal character and writes the statement out as a "# <BAD URI: ...>"
comment instead.

Below are the examples from DBpedia 3.9's instance_types_en.tql. I noticed
them because I sorted the file during processing and these lines came to the
top (lines starting with '#' sort before lines starting with '<').

peter

# <BAD URI: Illegal character in path at index 106: 
http://en.dbpedia.org/resource/St._Paul's_Episcopal_Church,_South_Bass_Island__St._Paul's_Episcopal_Church\nSince_1864\nSouth_Bass__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/HistoricBuilding> 
<http://en.wikipedia.org/wiki/St._Paul's_Episcopal_Church,_South_Bass_Island?oldid=544344040#absolute-line=59>
 .
# <BAD URI: Illegal character in path at index 116: 
http://en.dbpedia.org/resource/The_Mental_and_Social_Life_of_Babies__Jean_Piaget,_Lev_Vygotsky,_Martin_P.M._Richards\n,__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/Writer> 
<http://en.wikipedia.org/wiki/The_Mental_and_Social_Life_of_Babies?oldid=547594300#absolute-line=12>
 .
# <BAD URI: Illegal character in path at index 45: 
http://en.dbpedia.org/resource/Morgan's_Run__\nThe_Musical__1> 
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/Musical> 
<http://en.wikipedia.org/wiki/Morgan's_Run?oldid=544902864#absolute-line=25> .
# <BAD URI: Illegal character in path at index 53: 
http://en.dbpedia.org/resource/Aculco__Town_of_Aculco\nCamino_Real_de_Tierra_Adentro__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/WorldHeritageSite> 
<http://en.wikipedia.org/wiki/Aculco?oldid=540391608#absolute-line=121> .
# <BAD URI: Illegal character in path at index 54: 
http://en.dbpedia.org/resource/Mak_and_the_Dudes__2007\nStar_Records__1> 
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/Sales> 
<http://en.wikipedia.org/wiki/Mak_and_the_Dudes?oldid=545031280#section=Discograph&relative-line=9&absolute-line=33>
 .
# <BAD URI: Illegal character in path at index 56: 
http://en.dbpedia.org/resource/Carpi_F.C._1909__Leonardo\tTerigi__1> 
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/SportsTeamMember> 
<http://en.wikipedia.org/wiki/Carpi_F.C._1909?oldid=544461894#section=Current_squa&relative-line=10&absolute-line=48>
 .
# <BAD URI: Illegal character in path at index 58: 
http://en.dbpedia.org/resource/Eri-TV__Eritrean_Television\nEriTV__1> 
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/TelevisionStation> 
<http://en.wikipedia.org/wiki/Eri-TV?oldid=541278492#absolute-line=16> .
# <BAD URI: Illegal character in path at index 58: 
http://en.dbpedia.org/resource/Mazda_MPV__Third_generation\n_Asia-Pacific__1> 
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/Automobile> 
<http://en.wikipedia.org/wiki/Mazda_MPV?oldid=548389314#section=Third_generation_2006–Present_(FWD/4WD&relative-line=2&absolute-line=114>
 .
# <BAD URI: Illegal character in path at index 58: 
http://en.dbpedia.org/resource/SRT_Viper__First_generation\nViper_RT/10__1> 
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/Automobile> 
<http://en.wikipedia.org/wiki/SRT_Viper?oldid=546767595#section=First_generation_RT/10_(1992–1995)&relative-line=2&absolute-line=24>
 .
# <BAD URI: Illegal character in path at index 61: 
http://en.dbpedia.org/resource/Mazdaspeed3__Second_generation\nthumb%7C2nd_gen_MS3__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/Automobile> 
<http://en.wikipedia.org/wiki/Mazdaspeed3?oldid=547450268#section=2010-presen&relative-line=3&absolute-line=54>
 .
# <BAD URI: Illegal character in path at index 62: 
http://en.dbpedia.org/resource/RAF_Merryfield__RNAS_Merryfield\n60px__1> 
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/Airport> 
<http://en.wikipedia.org/wiki/RAF_Merryfield?oldid=544963386#section=RAF_Transport_Command/Royal_Navy_us&relative-line=2&absolute-line=95>
 .
# <BAD URI: Illegal character in path at index 64: 
http://en.dbpedia.org/resource/Simpang_Airport__Simpang_Air_Base\n60px__1> 
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/MilitaryStructure> 
<http://en.wikipedia.org/wiki/Simpang_Airport?oldid=545027722#absolute-line=22> 
.
# <BAD URI: Illegal character in path at index 65: 
http://en.dbpedia.org/resource/Los_Angeles_Metro_bus_fleet__150px\n150px__1> 
<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/AutomobileEngine> 
<http://en.wikipedia.org/wiki/Los_Angeles_Metro_bus_fleet?oldid=548541695#section=Activ&relative-line=137&absolute-line=154>
 .
# <BAD URI: Illegal character in path at index 66: 
http://en.dbpedia.org/resource/Charles_Graner__Pvt._Charles_Graner\nUnited_States_Army__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/MilitaryPerson> 
<http://en.wikipedia.org/wiki/Charles_Graner?oldid=546443361#absolute-line=19> .
# <BAD URI: Illegal character in path at index 73: 
http://en.dbpedia.org/resource/1992–93_Reading_F.C._season__Jeff_Hopkins_\n_loan_from__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/SportsTeamMember> 
<http://en.wikipedia.org/wiki/1992–93_Reading_F.C._season?oldid=534376725#section=Squa&relative-line=6&absolute-line=32>
 .
# <BAD URI: Illegal character in path at index 76: 
http://en.dbpedia.org/resource/Bezaleel_Taft,_Sr.__Hon._Bezaleel_Taft,_Sr.,_\nHouse_\n_Georgian_Style_a__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/Building> 
<http://en.wikipedia.org/wiki/Bezaleel_Taft,_Sr.?oldid=450071326#section=His_death_and_later_uses_of_the_mansio&relative-line=4&absolute-line=42>
 .
# <BAD URI: Illegal character in path at index 76: 
http://en.dbpedia.org/resource/Los_Angeles_Metro_bus_fleet__Flxible_40102-6T\nMetro_'B'__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/AutomobileEngine> 
<http://en.wikipedia.org/wiki/Los_Angeles_Metro_bus_fleet?oldid=548541695#section=Retire&relative-line=438&absolute-line=808>
 .
# <BAD URI: Illegal character in path at index 77: 
http://en.dbpedia.org/resource/Los_Angeles_Metro_bus_fleet__Flxible_40102-6C_\nMetro_'B'__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/AutomobileEngine> 
<http://en.wikipedia.org/wiki/Los_Angeles_Metro_bus_fleet?oldid=548541695#section=Retire&relative-line=368&absolute-line=738>
 .
# <BAD URI: Illegal character in path at index 81: 
http://en.dbpedia.org/resource/RAF_Sculthorpe__Royal_Air_Force_Station_Sculthorpe\n90px\n60px\n60px__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/MilitaryStructure> 
<http://en.wikipedia.org/wiki/RAF_Sculthorpe?oldid=543944630#absolute-line=53> .
# <BAD URI: Illegal character in path at index 82: 
http://en.dbpedia.org/resource/Tepotzotlán__Former_College_of_San_Francisco_Javier\nCamino_Real__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/WorldHeritageSite> 
<http://en.wikipedia.org/wiki/Tepotzotlán?oldid=545090583#absolute-line=121> .
# <BAD URI: Illegal character in path at index 93: 
http://en.dbpedia.org/resource/St._Paul_Church_South_Bass_Island__St._Paul's_Episcopal_Church\nSince_1864\nSouth_Bass__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/HistoricBuilding> 
<http://en.wikipedia.org/wiki/St._Paul_Church_South_Bass_Island?oldid=544286196#absolute-line=59>
 .
# <BAD URI: Illegal character in path at index 96: 
http://en.dbpedia.org/resource/Ironstone,_Massachusetts__Ironstone_Mill_Housing_and_Cellar_Hole_\nUxbridge,__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/Building> 
<http://en.wikipedia.org/wiki/Ironstone,_Massachusetts?oldid=531679439#absolute-line=69>
 .
# <BAD URI: Illegal character in path at index 99: 
http://en.dbpedia.org/resource/Aguascalientes,_Aguascalientes__Historic_Ensemble_of_Aguascalientes_\nCamino_Real_d__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/WorldHeritageSite> 
<http://en.wikipedia.org/wiki/Aguascalientes,_Aguascalientes?oldid=547303197#section=IT_and_Software_Development&relative-line=6&absolute-line=255>
 .
# <BAD URI: Illegal character in path at index 99: 
http://en.dbpedia.org/resource/Leeds_Bradford_International_Airport__Royal_Air_Force_Station_Yeadon\n90px__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/MilitaryStructure> 
<http://en.wikipedia.org/wiki/Leeds_Bradford_International_Airport?oldid=548453871#section=RAF_Yeado&relative-line=2&absolute-line=105>
 .
# <BAD URI: Illegal character in path at index 99: 
http://en.dbpedia.org/resource/San_Luis_Potosí,_San_Luis_Potosí__Historic_Center_of_San_Luis_Potosí\nCamino_Real_de__1>
 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://dbpedia.org/ontology/WorldHeritageSite> 
<http://en.wikipedia.org/wiki/San_Luis_Potosí,_San_Luis_Potosí?oldid=544034433#section=The_city_toda&relative-line=15&absolute-line=235>
 .
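
The commented-out records above all have the same shape, so they can be picked
out of the file mechanically. Here is a minimal Python sketch of that. It
assumes that each record sits on a single line in the actual file (the
wrapping above comes from the mail client) and that the newline or tab appears
as a backslash escape, as it does in this message. The names BAD_URI and
parse_bad_uri are made up for illustration and are not part of any DBpedia
tool.

```python
import re

# Assumed shape of a rejected record, based on the examples in this message:
#   # <BAD URI: Illegal character in path at index N: IRI> <p> <o> <context> .
BAD_URI = re.compile(
    r"^# <BAD URI: Illegal character in path at index (\d+):\s*(\S+?)>"
)


def parse_bad_uri(line):
    """Return (index, iri, offending_text) for a BAD URI line, else None."""
    m = BAD_URI.match(line)
    if m is None:
        return None
    index = int(m.group(1))
    iri = m.group(2)
    # Take two characters because the escape is written as backslash + letter.
    return index, iri, iri[index:index + 2]


# One of the records above, joined back into a single line.
sample = (
    "# <BAD URI: Illegal character in path at index 45: "
    "http://en.dbpedia.org/resource/Morgan's_Run__\\nThe_Musical__1> "
    "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
    "<http://dbpedia.org/ontology/Musical> "
    "<http://en.wikipedia.org/wiki/Morgan's_Run?oldid=544902864#absolute-line=25> ."
)

print(parse_bad_uri(sample))
```

In these examples the reported index is the zero-based position of the
backslash within the IRI, so slicing at that index shows which escape
triggered the rejection.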

