This also affects URLs containing parentheses, and there also seems to be a
bug in the triples that are actually returned.

Compare:

http://dbpedia.org/page/The_Good_Shepherd_%28film%29
and
http://dbpedia.org/page/The_Good_Shepherd_(film)

The one without encoding returns the yago:blah triples, the other one the
normal DBpedia data.
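
The two spellings differ only in percent-encoding, so one would expect them to name the same page. A minimal Java sketch with java.net.URI makes that visible; the class name EncodedPathCheck and the helper sameDecodedPath are made up for illustration and are not part of DBpedia or Virtuoso:

```java
import java.net.URI;

// Hypothetical helper class, for illustration only.
class EncodedPathCheck {
    // True when both URIs have the same path once percent-escapes are decoded.
    static boolean sameDecodedPath(String a, String b) {
        // URI.getPath() returns the decoded path; getRawPath() keeps the escapes.
        return URI.create(a).getPath().equals(URI.create(b).getPath());
    }

    public static void main(String[] args) {
        String encoded = "http://dbpedia.org/page/The_Good_Shepherd_%28film%29";
        String plain = "http://dbpedia.org/page/The_Good_Shepherd_(film)";
        System.out.println(URI.create(encoded).getRawPath());
        System.out.println(URI.create(encoded).getPath());
        System.out.println(sameDecodedPath(encoded, plain));
    }
}
```

This only compares the strings on the client side; it says nothing about why the server returns different triples for the two forms.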

- Gunnar


On 06/05/10 15:06, Malte Kiesel wrote:
> Mitko Iliev wrote:
>
>> As far as I remember, a colon is not valid in the local part of a URI, right?
>
> http://tools.ietf.org/html/rfc3986#page-22
> seems not to disallow colons in URI paths for HTTP at least:
>
> path          = path-abempty    ; begins with "/" or is empty
> ...
> path-abempty  = *( "/" segment )
> ...
> segment       = *pchar
> ...
> pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
>
> I'm no expert on this matter though. DBpedia *does* use colons in URIs
> anyways...
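
The grammar quoted above can be turned into a small check. This is only a sketch: the character sets for unreserved and sub-delims are copied from RFC 3986, while the class name PathSegmentCheck and the method isValidSegment are hypothetical names chosen for the example:

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class PathSegmentCheck {
    // segment = *pchar, with pchar spelled out from RFC 3986:
    // unreserved / pct-encoded / sub-delims / ":" / "@"
    private static final Pattern SEGMENT = Pattern.compile(
        "(?:[A-Za-z0-9\\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})*");

    // True when the string is a valid path segment under that grammar.
    static boolean isValidSegment(String segment) {
        return SEGMENT.matcher(segment).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidSegment("X-Men:_Evolution"));
        System.out.println(isValidSegment("X-Men%3A_Evolution"));
        System.out.println(isValidSegment("The_Good_Shepherd_(film)"));
    }
}
```

Under this reading both the literal ":" and its escaped form "%3A" are acceptable in a path segment that follows the host, so neither spelling is invalid by itself.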
>
> I did some quick testing with Firefox; it looks like there's no
> URLDecoding/URLEncoding going on when following Location: headers in 303
> redirects there, so Firefox behaves just like Java does.
>
> Also interesting:
> $ curl -v http://dbpedia.org/resource/X-Men:_Evolution
> (just normal HTML!)
> <  HTTP/1.1 303 See Other
> <  Location: http://dbpedia.org/page/X-Men:_Evolution
>
> No escaping going on here when doing the normal HTML request. So I guess
> this is a bug in Virtuoso when requesting "application/rdf+xml" (and a
> somewhat strange bug in curl perhaps).
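
To look at the redirect from Java without the client following it, something like the sketch below could be used. It assumes the target is announced in a Location header; the class RedirectProbe and its methods are hypothetical names, and the network request only runs when a URL is passed on the command line:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper class, for illustration only.
class RedirectProbe {
    // Resolve a Location-style header value against the URL that was requested.
    // The result keeps whatever percent-escapes the header value contained.
    static String resolveTarget(String requestUrl, String headerValue) {
        return URI.create(requestUrl).resolve(headerValue).toString();
    }

    // Send one request without following redirects and report the target.
    // Returns null when the response carries no Location header.
    static String probe(String requestUrl, String accept) throws IOException {
        HttpURLConnection uc =
            (HttpURLConnection) URI.create(requestUrl).toURL().openConnection();
        uc.setInstanceFollowRedirects(false);
        uc.setRequestProperty("Accept", accept);
        try {
            int status = uc.getResponseCode();
            String location = uc.getHeaderField("Location");
            System.out.println(status + " Location: " + location);
            return location == null ? null : resolveTarget(requestUrl, location);
        } finally {
            uc.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Example: java RedirectProbe http://dbpedia.org/resource/X-Men:_Evolution
        if (args.length > 0) {
            System.out.println(probe(args[0], "application/rdf+xml"));
        }
    }
}
```

Comparing the printed header for "text/html" and "application/rdf+xml" should show whether the escaping happens on the server side or in the client.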
>
> Regards
> Malte
>
>> On May 5, 2010, at 6:05 PM, Malte Kiesel wrote:
>>
>>> Hi!
>>>
>>> Apparently there's something odd with the 303 redirects for resources
>>> with ":" in their title. Basically, this seems to work from curl, for
>>> example, but it fails from Java. I'm not sure which component is buggy.
>>>
>>> Example:
>>>
>>> $ curl -v -H "Accept: application/rdf+xml" \
>>>     http://dbpedia.org/resource/X-Men:_Evolution
>>> ...
>>> <  HTTP/1.1 303 See Other
>>> <  Content-Location: /data/X-Men%3A_Evolution.xml
>>>
>>> $ curl -H "Accept: application/rdf+xml" \
>>>     http://dbpedia.org/data/X-Men:_Evolution
>>> ...is fine.
>>>
>>> $ curl -H "Accept: application/rdf+xml" \
>>>     http://dbpedia.org/data/X-Men%3A_Evolution
>>> ...isn't - that strangely returns some foaf triples though (seems these
>>> are returned for whatever data/ URI you request).
>>>
>>> Java seems to get redirected to the latter (broken) URI:
>>>
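>>>         // needs java.io.InputStream, java.net.HttpURLConnection, java.net.URL
>>>         String url = "http://dbpedia.org/resource/X-Men:_Evolution";
>>>         URL urlU = new URL(url);
>>>         HttpURLConnection uc = (HttpURLConnection) urlU.openConnection();
>>>         // follow the 303 redirect automatically
>>>         uc.setInstanceFollowRedirects(true);
>>>         uc.setRequestProperty("Accept", "application/rdf+xml");
>>>         uc.connect();
>>>         InputStream is = uc.getInputStream();
>>>         int read;
>>>         // copy the response body to stdout byte by byte
>>>         while ((read = is.read()) != -1) { System.out.write(read); }
>>>         System.out.flush();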
>>> ...outputs the triples the last (broken) curl command also fetches.
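
Until the cause is known, one possible client-side workaround is to follow the redirect by hand and turn "%3A" in the path back into ":" before requesting it. The sketch below covers only the string rewriting; the class ColonUnescape and the method unescapeColonInPath are hypothetical names, and whether the server then returns the right document is an assumption based on the unencoded /data/ request above working:

```java
import java.net.URI;

// Hypothetical helper class, for illustration only.
class ColonUnescape {
    // Replace "%3A" (any case) with ":" in the path of an absolute URL that
    // has a host part. Other escapes, the query and the fragment are kept.
    static String unescapeColonInPath(String url) {
        URI u = URI.create(url);
        String rawPath = u.getRawPath();
        if (rawPath == null || u.getRawAuthority() == null) {
            return url; // not a hierarchical URL with a host, leave untouched
        }
        StringBuilder sb = new StringBuilder();
        sb.append(u.getScheme()).append("://").append(u.getRawAuthority());
        sb.append(rawPath.replaceAll("(?i)%3A", ":"));
        if (u.getRawQuery() != null) {
            sb.append('?').append(u.getRawQuery());
        }
        if (u.getRawFragment() != null) {
            sb.append('#').append(u.getRawFragment());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(
            unescapeColonInPath("http://dbpedia.org/data/X-Men%3A_Evolution.xml"));
    }
}
```

This rewrites only the colon because ":" is allowed unescaped in a path segment; it deliberately leaves every other escape alone.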
>>>
>>> Bug in Java? Bug in Virtuoso?
>>>
>>> I found a related discussion at [1] but that didn't cover the ":" case.
>>>
>>> Regards
>>> Malte
>>>
>>> [1]
>>> http://www.mail-archive.com/dbpedia-discussion@lists.sourceforge.net/msg00776.html
>>>
>>> --
>>> Malte Kiesel, DFKI GmbH
>
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Dbpedia-discussion mailing list
> Dbpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion

