This also affects URLs containing parentheses, and there also seems to be a bug in the actual triples.
Compare:
http://dbpedia.org/page/The_Good_Shepherd_%28film%29
and
http://dbpedia.org/page/The_Good_Shepherd_(film)

The one without encoding has the yago:blah triples, the other the normal dbpedia stuff.

- Gunnar

On 06/05/10 15:06, Malte Kiesel wrote:
> Mitko Iliev wrote:
>
>> As far as i remember colon is not valid in URI local part, right?
>
> http://tools.ietf.org/html/rfc3986#page-22
> seems not to disallow colons in URI paths for HTTP at least:
>
> path = path-abempty ; begins with "/" or is empty
> ...
> path-abempty = *( "/" segment )
> ...
> segment = *pchar
> ...
> pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
>
> I'm no expert on this matter though. DBpedia *does* use colons in URIs
> anyways...
>
> I did some quick testing with Firefox; it looks like there's no
> URLDecoding/URLEncoding going on when following Location: headers in 303
> redirects there, so Firefox behaves just like Java does.
>
> Also interesting:
> $ curl -v http://dbpedia.org/resource/X-Men:_Evolution
> (just normal HTML!)
> < HTTP/1.1 303 See Other
> < Location: http://dbpedia.org/page/X-Men:_Evolution
>
> No escaping going on here when doing the normal HTML request. So I guess
> this is a bug in Virtuoso when requesting "application/rdf+xml" (and a
> somewhat strange bug in curl perhaps).
>
> Regards
> Malte
>
>> On May 5, 2010, at 6:05 PM, Malte Kiesel wrote:
>>
>>> Hi!
>>>
>>> Apparently there's something odd with the 303 redirects for resources
>>> with ":" in their title. Basically, that seems to work from for example
>>> curl, but it fails from Java. I'm not sure what component is buggy there.
>>>
>>> Example:
>>>
>>> $ curl -v -H "Accept: application/rdf+xml" http://dbpedia.org/resource/X-Men:_Evolution
>>> ...
>>> < HTTP/1.1 303 See Other
>>> < Content-Location: /data/X-Men%3A_Evolution.xml
>>>
>>> $ curl -H "Accept: application/rdf+xml" http://dbpedia.org/data/X-Men:_Evolution
>>> ...is fine.
>>>
>>> $ curl -H "Accept: application/rdf+xml" http://dbpedia.org/data/X-Men%3A_Evolution
>>> ...isn't - that strangely returns some foaf triples though (seems these
>>> are returned for whatever data/ URI you request).
>>>
>>> Java seems to get redirected to the latter (broken) URI:
>>>
>>> String url = "http://dbpedia.org/resource/X-Men:_Evolution";
>>> URL urlU = new URL(url);
>>> HttpURLConnection uc = (HttpURLConnection) urlU.openConnection();
>>> uc.setInstanceFollowRedirects(true);
>>> uc.setRequestProperty("Accept", "application/rdf+xml");
>>> uc.connect();
>>> InputStream is = uc.getInputStream();
>>> int read;
>>> while ((read = is.read()) != -1) { System.out.write(read); }
>>> ...outputs the triples the last (broken) curl command also fetches.
>>>
>>> Bug in Java? Bug in Virtuoso?
>>>
>>> I found a related discussion at [1] but that didn't cover the ":" case.
>>>
>>> Regards
>>> Malte
>>>
>>> [1]
>>> http://www.mail-archive.com/dbpedia-discussion@lists.sourceforge.net/msg00776.html
>>>
>>> --
>>> Malte Kiesel, DFKI GmbH
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Dbpedia-discussion mailing list
> Dbpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
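
For what it's worth, the RFC 3986 grammar quoted in this thread (segment = *pchar) can be checked mechanically. Below is a minimal sketch in Java, assuming only the RFC's definitions of unreserved, sub-delims and pct-encoded. The class and method names (PathSegmentCheck, isValidSegment) are made up for illustration and are not part of DBpedia, Virtuoso or any library. It only tests whether a string is a syntactically valid path segment; it says nothing about what the server does with it.

```java
// Hypothetical helper, not taken from any existing code base.
class PathSegmentCheck {
    // sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
    private static final String SUB_DELIMS = "!$&'()*+,;=";
    // The non-alphanumeric part of: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    private static final String UNRESERVED_MARKS = "-._~";

    static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'F') || (c >= 'a' && c <= 'f');
    }

    static boolean isUnreserved(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9') || UNRESERVED_MARKS.indexOf(c) >= 0;
    }

    // True if s matches: segment = *pchar, where
    // pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
    static boolean isValidSegment(String s) {
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%') {
                // pct-encoded = "%" HEXDIG HEXDIG
                if (i + 2 >= s.length()
                        || !isHex(s.charAt(i + 1)) || !isHex(s.charAt(i + 2))) {
                    return false;
                }
                i += 3;
            } else if (isUnreserved(c) || SUB_DELIMS.indexOf(c) >= 0
                    || c == ':' || c == '@') {
                i++;
            } else {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] segments = {
            "X-Men:_Evolution", "X-Men%3A_Evolution",
            "The_Good_Shepherd_(film)", "The_Good_Shepherd_%28film%29"
        };
        for (String s : segments) {
            System.out.println(s + " valid segment: " + isValidSegment(s));
        }
    }
}
```

Under these rules both the raw and the percent-encoded spellings are valid segments, for the colon as well as for the parentheses. Note, though, that ":" , "(" and ")" are reserved characters in RFC 3986, and section 2.2 says that URIs differing only in the percent-encoding of a reserved character are not equivalent. So a component that rewrites ":" to "%3A" (or "(" to "%28") may end up addressing a different resource, which would fit the behaviour described in this thread.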