Richard,

> is that URLs containing '&' lead to broken (non-well-formed) RDF/XML.
Only if the XML serializer is broken: '&' must be
escaped as '&amp;', which is standard practice in XML.
There was a problem in Virtuoso, but that has been fixed:
http://sourceforge.net/mailarchive/message.php?msg_id=28876318

In other words: changing DBpedia URIs is not the
right way to fix a broken XML serializer. :-)
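
To illustrate the point, here is a minimal Python sketch (not the Virtuoso code; the helper names describe() and is_well_formed() are made up for this example). It assumes a serializer that writes the URI into an rdf:about attribute, and shows that XML-escaping '&' keeps the document well-formed while the parsed URI stays exactly the same:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def describe(uri: str) -> str:
    """Return a tiny RDF/XML document about `uri` (hypothetical helper)."""
    # quoteattr() adds the surrounding quotes and escapes '&' as '&amp;'.
    return (
        f'<rdf:RDF xmlns:rdf="{RDF_NS}">'
        f"<rdf:Description rdf:about={quoteattr(uri)}/>"
        f"</rdf:RDF>"
    )


def is_well_formed(xml_text: str) -> bool:
    """Report whether an XML parser accepts the text (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


uri = "http://dbpedia.org/resource/&_%28EP%29"
doc = describe(uri)

# What a broken serializer would emit: the raw '&' left unescaped.
broken = doc.replace("&amp;", "&")

# The parser undoes the XML escaping, so the URI round-trips unchanged.
about = ET.fromstring(doc)[0].get(f"{{{RDF_NS}}}about")

print(is_well_formed(doc), is_well_formed(broken), about == uri)
```

The escaping lives only in the XML serialization; the URI itself never needs to change.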

Christopher

On Tue, Mar 6, 2012 at 09:22, Richard Light <rich...@light.demon.co.uk> wrote:
> Christopher,
>
> I don't have strong views about the details of URI encoding.  (In my view,
> both Wikipedia and dbpedia should use simple numeric identifiers for each
> concept, rather than these stupid and mutable made-up-from-the-page-title
> ones.  But that's maybe a separate thread.)
>
> However, I think I need to point out that the reason I started this thread
> is that URLs containing '&' lead to broken (non-well-formed) RDF/XML.  So I
> think that '%26' is mandatory, whatever happens to other characters.
>
> Richard
>
>
> On 06/03/2012 00:14, Jona Christopher Sahnwaldt wrote:
>
> Dear all,
>
> I just checked a few specs to figure out what would be the best policy
> for DBpedia regarding URI encoding.
>
> In summary, I think DBpedia should encode as few characters as
> possible, e.g. use '&', not '%26'.
>
> The URI spec [1] has a lot of special cases, but in the end it's quite
> clear that in our case we do not HAVE to encode most special
> characters like '&'. See 3.3 Path Component.
>
> More importantly, the RDF spec includes the following note [2]:
>
> Because of the risk of confusion between RDF URI references that would
> be equivalent if derefenced, the use of %-escaped characters in RDF
> URI references is strongly discouraged.
>
> Could hardly be clearer...
>
>
> A related, but different issue is how Wikipedia and Virtuoso dereference
> URIs.
>
> Wikipedia is very lenient: "&_(EP)" [3] is equivalent to
> "%26_%28EP%29" [4]. Even "OS%2F2" [5] is treated as equivalent to
> "OS/2" [6]. (Not sure which of these behaviors does or doesn't violate
> the URI spec.)
>
> Virtuoso on dbpedia.org is very strict: it only returns data for
> "OS/2" [7] and "&_%28EP%29" [8], but empty pages for all other
> encoding variants.
>
>
> Christopher
>
> [1] http://www.ietf.org/rfc/rfc2396.txt
> [2] http://www.w3.org/TR/rdf-concepts/#dfn-URI-reference
> [3] http://en.wikipedia.org/wiki/&_(EP)
> [4] http://en.wikipedia.org/wiki/%26_%28EP%29
> [5] http://en.wikipedia.org/wiki/OS%2F2
> [6] http://en.wikipedia.org/wiki/OS/2
> [7] http://dbpedia.org/resource/OS/2
> [8] http://dbpedia.org/resource/&_%28EP%29
>
>
> On Tue, Feb 21, 2012 at 15:04, Jimmy O'Regan <jore...@gmail.com> wrote:
>
> On 21 February 2012 13:47, Richard Light <rich...@light.demon.co.uk> wrote:
>
> Jimmy,
>
> No, I'm not confused.  :-)
>
> Fair enough.
>
> I just thought that if the "&" were URLencoded it wouldn't need to be XML
> escaped, because as you say it would then read "%26", and so wouldn't cause
> problems to the XML parser.  And I thought URLencoding should happen here.
> To quote a random Web source [1]:
>
> That the URL isn't XML escaped in RDF/XML is clearly and unambiguously
> a bug; that it isn't URL escaped is more a matter for discussion, but
> the general consensus will probably be 'do what Wikipedia do', which
> is to not escape ampersands.
>
> --
> <Sefam> Are any of the mentors around?
> <jimregan> yes, they're the ones trolling you
>
> ------------------------------------------------------------------------------
> Keep Your Developer Skills Current with LearnDevNow!
> The most comprehensive online learning library for Microsoft developers
> is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3,
> Metro Style Apps, more. Free future releases when you subscribe now!
> http://p.sf.net/sfu/learndevnow-d2d
> _______________________________________________
> Dbpedia-discussion mailing list
> Dbpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>
>
>
> --
> Richard Light
>
>

