Dear all,

I just checked a few specs to figure out the best policy for DBpedia regarding URI encoding.

In summary, I think DBpedia should encode as few characters as possible, e.g. use '&', not '%26'.

The URI spec [1] has a lot of special cases, but in the end it is quite clear that in our case we do not HAVE to encode most special characters like '&'. See section 3.3, "Path Component". More importantly, the RDF spec includes the following note [2]:

  Because of the risk of confusion between RDF URI references that
  would be equivalent if derefenced, the use of %-escaped characters
  in RDF URI references is strongly discouraged.

Could hardly be clearer...

A related, but different, issue is how Wikipedia and Virtuoso dereference URIs. Wikipedia is very lenient: "&_(EP)" [3] is equivalent to "%26_%28EP%29" [4]. Even "OS%2F2" [5] is treated as equivalent to "OS/2" [6]. (I am not sure which of these behaviors does or does not violate the URI spec.) Virtuoso on dbpedia.org is very strict: it only returns data for "OS/2" [7] and "&_%28EP%29" [8], and returns empty pages for all other encoding variants.

Christopher

[1] http://www.ietf.org/rfc/rfc2396.txt
[2] http://www.w3.org/TR/rdf-concepts/#dfn-URI-reference
[3] http://en.wikipedia.org/wiki/&_(EP)
[4] http://en.wikipedia.org/wiki/%26_%28EP%29
[5] http://en.wikipedia.org/wiki/OS%2F2
[6] http://en.wikipedia.org/wiki/OS/2
[7] http://dbpedia.org/resource/OS/2
[8] http://dbpedia.org/resource/&_%28EP%29

On Tue, Feb 21, 2012 at 15:04, Jimmy O'Regan <jore...@gmail.com> wrote:
> On 21 February 2012 13:47, Richard Light <rich...@light.demon.co.uk> wrote:
>> Jimmy,
>>
>> No, I'm not confused. :-)
>>
>
> Fair enough.
>
>> I just thought that if the "&" were URLencoded it wouldn't need to be XML
>> escaped, because as you say it would then read "%26", and so wouldn't cause
>> problems to the XML parser. And I thought URLencoding should happen here.
>> To quote a random Web source [1]:
>
> That the URL isn't XML escaped in RDF/XML is clearly and unambiguously
> a bug; that it isn't URL escaped is more a matter for discussion, but
> the general consensus will probably be 'do what Wikipedia do', which
> is to not escape ampersands.
>
> --
> <Sefam> Are any of the mentors around?
> <jimregan> yes, they're the ones trolling you
>
> _______________________________________________
> Dbpedia-discussion mailing list
> Dbpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
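As a footnote to the encoding variants discussed in this message, here is a minimal Python sketch of the two behaviors: encoding as few characters as possible, and a lenient, Wikipedia-style comparison that treats variants as equal once they are percent-decoded. The helper names (minimal_encode, same_resource) and the SAFE character set are assumptions made for illustration; this is not the DBpedia extraction code and not how Virtuoso or MediaWiki are actually implemented.

```python
from urllib.parse import quote, unquote

# Assumption: the set of characters left unescaped in the path. It is picked
# for illustration only and is not taken from any DBpedia configuration.
SAFE = "/:@&=+$,()!*'-._~;"


def minimal_encode(title: str) -> str:
    """Hypothetical helper: percent-encode only characters outside SAFE.

    Spaces become underscores first, as in Wikipedia page names.
    """
    return quote(title.replace(" ", "_"), safe=SAFE)


def same_resource(a: str, b: str) -> bool:
    """Hypothetical helper: lenient comparison after percent-decoding."""
    return unquote(a) == unquote(b)


if __name__ == "__main__":
    for title in ["&_(EP)", "OS/2"]:
        print(title, "->", minimal_encode(title))

    # Lenient matching accepts every variant listed in the message.
    print(same_resource("%26_%28EP%29", "&_(EP)"))
    print(same_resource("OS%2F2", "OS/2"))

    # Strict matching is plain string equality, so the variants differ.
    print("OS%2F2" == "OS/2")
```

With this SAFE set, '&', '/', '(' and ')' pass through untouched, while a literal '%' or a non-ASCII character is still escaped. Note that the lenient comparison also decodes reserved characters such as '/', which is what Wikipedia appears to do for "OS%2F2"; whether that is permitted by the URI spec is the open question raised above.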