On 30/09/11 16:17, Alexandru Todor wrote:
Hi,
I maintain the German language DBpedia endpoint, and have gotten some
mails from users complaining that they don't get any results from the
endpoint when they query for resources like:
http://de.dbpedia.org/resource/München
This message and your message are both in ISO-8859-1.
ü is 0xFC in ISO-8859-1, which is the same value as its Unicode code point
(U+00FC), and 0xC3 0xBC in UTF-8.
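
To make the byte values concrete, here is a minimal sketch that prints the bytes of ü in both encodings. The class and method names (UmlautBytes, hexBytes) are made up for illustration and are not part of Jena or of the code under discussion.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class UmlautBytes {
    // Render the bytes of s in the given charset as space-separated hex.
    static String hexBytes(String s, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // ü written as an escape so the source file encoding does not matter.
        String u = "\u00FC";
        System.out.println(hexBytes(u, StandardCharsets.ISO_8859_1)); // FC
        System.out.println(hexBytes(u, StandardCharsets.UTF_8));      // C3 BC
    }
}
```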
I tried http://de.dbpedia.org/resource/München in my browser and got redirected
to http://de.dbpedia.org/data/M%C3%BCnchen.xml
which returns RDF/XML in UTF-8, but it contains e.g. on line 3:
rdf:resource="http://de.dbpedia.org/resource/München"
in Firefox. That looks corrupt to me.
This is the code they sent me:
// Classes are from Jena's query package (com.hp.hpl.jena.query in Jena 2.x).
String queryString = "SELECT ?o WHERE "
    + "{ <http://de.dbpedia.org/resource/München> "
    + "<http://purl.org/dc/terms/subject> ?o }";
Query query = QueryFactory.create(queryString);
// Send the query to the remote German DBpedia SPARQL endpoint.
QueryExecution qexec =
    QueryExecutionFactory.sparqlService("http://de.dbpedia.org/sparql", query);
try {
    ResultSet results = qexec.execSelect();
    while (results.hasNext()) {
        QuerySolution s = results.nextSolution();
        System.out.println(s.toString());
    }
} finally {
    // Always release the underlying HTTP connection.
    qexec.close();
}
I tried the code and it works for any IRI that contains only ASCII characters
(so only for plain URIs), but as soon as the IRI contains non-ASCII characters
it returns no results. I've tried a couple of variations; each returns no
results but also doesn't throw any kind of exception. It's just as if the data
wasn't there.
Then I tried an alternative method and used QueryEngineHTTP to execute the
query, and it worked. However, QueryEngineHTTP messes up the UTF-8 encoding,
so for example in the returned results you get MÃ¼nchen instead of München.
My guess is that QueryEngineHTTP handles the SPARQL results as ISO-8859-1
instead of UTF-8, so decoding the strings as ISO-8859-1 and re-encoding them
as UTF-8 fixed this.
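
One possible reading of that workaround, as a sketch only: if UTF-8 bytes were decoded as ISO-8859-1 somewhere along the way, the original bytes can be recovered from the garbled string and decoded again as UTF-8. The class and method names (MojibakeRepair, garble, repair) are hypothetical, and garble only simulates the suspected fault; it is not what QueryEngineHTTP is known to do.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeRepair {
    // Simulate the suspected fault: UTF-8 bytes wrongly decoded as ISO-8859-1.
    static String garble(String correct) {
        byte[] utf8 = correct.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undo it: recover the original bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String city = "M\u00FCnchen";      // München
        String garbled = garble(city);     // M, U+00C3, U+00BC, nchen
        System.out.println(repair(garbled).equals(city)); // true
    }
}
```

This only round-trips cleanly because ISO-8859-1 maps every byte value to a character; it treats the symptom and says nothing about where the wrong charset is chosen.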
The code seems to do:
URLEncoder.encode(s, "UTF-8")
but at that point it's still working with strings. Something lower level (Sun
networking) does the string-to-bytes conversion.
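
For reference, a small sketch of what that call produces for the IRI fragment in question. The wrapper class and method (EncodeDemo, encode) are hypothetical; only java.net.URLEncoder.encode(String, String) is the real API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodeDemo {
    // Percent-encode s using the named charset for non-ASCII characters.
    static String encode(String s, String charsetName) {
        try {
            return URLEncoder.encode(s, charsetName);
        } catch (UnsupportedEncodingException e) {
            // Not expected for the charset names used in this sketch.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String city = "M\u00FCnchen"; // München
        System.out.println(encode(city, "UTF-8"));      // M%C3%BCnchen
        System.out.println(encode(city, "ISO-8859-1")); // M%FCnchen
    }
}
```

The result is still a java.lang.String of ASCII characters, so any charset problem after this point would have to come from a later string-to-bytes step.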
Andy
Kind Regards,
Alexandru Todor
Research Associate
AG Corporate Semantic Web
Freie Universität Berlin