Marvin,

yes, it's a bug in our dataset, specifically in the Yago dataset, which
was contributed externally and wasn't created with the DBpedia
framework (but hey, we've got many similar bugs in datasets created by
our framework ;))

The Yago URIs have not been URL-encoded. As a workaround, you can
URL-encode all URIs starting with http://dbpedia.org/class/yago/ in the
yago_en.nt file before loading it into your Jena model. That should do
it.
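
A minimal sketch of that workaround in plain Java, in case it helps. The
class name YagoUriEncoder, the method encodeLine and the output file name
are made up for illustration; only the URI prefix and yago_en.nt come from
the text above. It assumes each URI in the file is written between angle
brackets, as in N-Triples, and that the URIs do not already contain
percent-escapes (those would be encoded a second time). Note that
URLEncoder does form-style encoding, so the sketch turns "+" back into
"%20", and any "/" inside the local name becomes "%2F".

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of DBpedia or Jena.
class YagoUriEncoder {
    static final String YAGO_PREFIX = "http://dbpedia.org/class/yago/";

    // Matches <http://dbpedia.org/class/yago/LOCALNAME> and captures LOCALNAME.
    private static final Pattern YAGO_URI =
            Pattern.compile("<" + Pattern.quote(YAGO_PREFIX) + "([^>]*)>");

    // Percent-encodes the local name of every Yago URI found in one line.
    static String encodeLine(String line) {
        Matcher m = YAGO_URI.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String encoded;
            try {
                // URLEncoder writes a space as "+", so map it to "%20".
                encoded = URLEncoder.encode(m.group(1), "UTF-8").replace("+", "%20");
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException("UTF-8 is always supported", e);
            }
            m.appendReplacement(out,
                    Matcher.quoteReplacement("<" + YAGO_PREFIX + encoded + ">"));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Usage: java YagoUriEncoder yago_en.nt yago_en.encoded.nt
    public static void main(String[] args) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(encodeLine(line));
                out.newLine();
            }
        }
    }
}
```

With this, the URI from your mail would be rewritten to
http://dbpedia.org/class/yago/Bill%26MelindaGatesFoundationPeople, and
lines without a Yago URI pass through unchanged. You would then load the
encoded copy of the file instead of the original.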

And we'll fix that bug for the future.

Best,
Georgi

--
Georgi Kobilarov
Freie Universität Berlin
www.georgikobilarov.com

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:dbpedia-[EMAIL PROTECTED]
> On Behalf Of Marvin Lugair
> Sent: Wednesday, August 20, 2008 12:57 AM
> To: [email protected]
> Subject: [Dbpedia-discussion] Ampersand in dbpedia returned URI
> breaking Jena code
> 
> 
> Hi,
> 
> The following SPARQL query:
> select distinct ?Concept where {[] a ?Concept}
> 
> is the default query at the DBpedia endpoint http://dbpedia.org/sparql
> It returns several URIs, including the following one (notice the
> ampersand):
> 
> http://dbpedia.org/class/yago/Bill&MelindaGatesFoundationPeople
> 
> So DBpedia is returning URIs containing an ampersand. This is causing
> an exception in the Jena parser.
> 
> How do I fix this? None of Jena's methods work: I can't transform
> the result set into a model or even print it with the result formatter.
> If I iterate over it, I can print the results one by one until I get to
> the malformed URI. How do I check in my code for malformed URIs?
> 
> 
> Any ideas?
> Thanks!
> Marv
> -------------
> 
> The code below works until I get a URI with an ampersand.
> The exception is coming from results.nextSolution(). Other Jena
> methods that convert the retrieved result set to a model directly or
> format it produce the same exception (I assume they have a similar
> iterator inside).
> 
> 
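> import com.hp.hpl.jena.query.QueryExecution;
> import com.hp.hpl.jena.query.QueryExecutionFactory;
> import com.hp.hpl.jena.query.QuerySolution;
> import com.hp.hpl.jena.query.ResultSet;
> 
> // Send the query to the public DBpedia SPARQL endpoint.
> QueryExecution qexec = QueryExecutionFactory.sparqlService(
>     "http://DBpedia.org/sparql",
>     "select distinct ?Concept where {[] a ?Concept}");
> 
> try {
>     ResultSet results = qexec.execSelect();
>     while (results.hasNext()) {
>         // The exception surfaces here once the URI with '&' is reached.
>         QuerySolution soln = results.nextSolution();
>         String x = soln.get("Concept").toString();
>         System.out.println(x);
>     }
> } finally {
>     System.out.println("closing!");
>     qexec.close();
> }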
> 
> 
> This will result in the following error:
> 
> 
> [com.ctc.wstx.exc.WstxLazyException]
> com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character '<'
> (code 60); expected a semi-colon after the reference for entity
> 'MelindaGatesFoundationPeople'
> at [row,col {unknown-source}]: [2609,96]
> at com.ctc.wstx.exc.WstxLazyException.throwLazily(WstxLazyException.java:45)
> at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:671)
> at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3505)
> at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:804)
> at com.ctc.wstx.sr.BasicStreamReader.getElementText(BasicStreamReader.java:674)
> at com.hp.hpl.jena.sparql.resultset.XMLInputStAX$ResultSetStAX.getOneSolution(XMLInputStAX.java:472)
> at com.hp.hpl.jena.sparql.resultset.XMLInputStAX$ResultSetStAX.hasNext(XMLInputStAX.java:213)
> 
> 
> 
> I also posted this on the Jena group but some seem to suggest it is a
> dbpedia issue:
> http://tech.groups.yahoo.com/group/jena-dev/message/36210
> 
> 
> 
> 
> 
> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
> Build the coolest Linux based applications with Moblin SDK & win great prizes
> Grand prize is a trip for two to an Open Source event anywhere in the world
> http://moblin-contest.org/redirect.php?banner_id=100&url=/
> _______________________________________________
> Dbpedia-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
