The following issue has been SUBMITTED.
======================================================================
http://bugs.librdf.org/mantis/view.php?id=495
======================================================================
Reported By:                normang
Assigned To:
======================================================================
Project:                    Raptor RDF Syntax Library
Issue ID:                   495
Category:                   api
Reproducibility:            always
Severity:                   major
Priority:                   normal
Status:                     new
Syntax Name:                RDFa & Turtle
======================================================================
Date Submitted:             2012-02-19 21:21
Last Modified:              2012-02-19 21:21
======================================================================
Summary:                    RDFa parser produces unexpected results with CDATA sections and entity references
Description:
Consider the examples below.

Tests content1, 2, 4 and 5 are, I think, wrong.

For content1, 2, 4 and 5, the CDATA marked section is simply omitted.
Although http://www.w3.org/TR/rdfa-syntax/ doesn't mention CDATA marked
sections, there's nothing there that seems to warrant ignoring them.

Tests content1, 2 and 5 produce XMLLiteral data which includes both
elements and entities. However, in each of the three cases, the Turtle
output has the characters denoted by entities (the &<>) appearing
literally in the rdf:XMLLiteral, so that it is no longer well-formed XML.
I.e., they're not escaped in any way. I can't find anything, in either
http://www.w3.org/TR/REC-rdf-syntax/ (which I suppose is the definition
of rdf:XMLLiteral) or http://www.w3.org/TeamSubmission/turtle/, which
spells out what the content of an rdf:XMLLiteral should be, but I would
be surprised if malformed XML were allowed.

I don't know whether this is an RDFa parsing error or a Turtle
serialisation error.

Steps to Reproduce:
% cat /tmp/try.xml
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML+RDFa 1.0//EN' 'http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd'>
<html xmlns='http://www.w3.org/1999/xhtml' xmlns:ns='urn:ns#' xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<head>
<title property='ns:title'>T</title>
<meta about='' property='ns:abstract' content='Abstract &lt;&gt;&amp;%' />
</head>
<body>

<!-- for cases below, see http://www.w3.org/TR/rdfa-syntax/ Sect. 6.3.1.3 -->

<!-- explicit XMLLiteral @datatype -->
<p property='ns:content1' datatype='rdf:XMLLiteral' >content1: <![CDATA[cdata<>&]]> <span>not</span>&amp;&lt;&gt;</p>

<!-- no @datatype, presence of elements implies it -->
<p property='ns:content2' >content2: <![CDATA[cdata<>&]]> <span>not</span>&amp;&lt;&gt;</p>

<!-- no @datatype, but no XML elements, so plain literal -->
<p property='ns:content3' >content3: plain content</p>

<!-- explicit empty @datatype, so interpreted as a plain literal -->
<p property='ns:content4' datatype='' >content4: <![CDATA[cdata<>&]]> <span>not</span>&amp;&lt;&gt;</p>

<!-- basically same as content2 above -->
<div property='ns:content5' ><p>content5: <![CDATA[cdata<>&]]> <span>not</span>&amp;&lt;&gt;</p></div>

</body></html>
% rapper -irdfa -oturtle /tmp/try.xml
rapper: Parsing URI file:///tmp/try.xml with parser rdfa
rapper: Serializing with serializer turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://www.w3.org/1999/xhtml> .
@prefix ns: <urn:ns#> .

<file:///tmp/try.xml>
    ns:abstract "Abstract <>&%" ;
    ns:content1 "content1:  <span xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:ns=\"urn:ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">not</span>&<>"^^rdf:XMLLiteral ;
    ns:content2 "content2:  <span xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:ns=\"urn:ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">not</span>&<>"^^rdf:XMLLiteral ;
    ns:content3 "content3: plain content" ;
    ns:content4 "content4:  not&<>" ;
    ns:content5 "<p xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:ns=\"urn:ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">content5:  <span>not</span>&<></p>"^^rdf:XMLLiteral ;
    ns:title "T" .

rapper: Parsing returned 7 triples
% rapper --version
2.0.4
%
======================================================================

Issue History
Date Modified    Username       Field                    Change
======================================================================
2012-02-19 21:21 normang        New Issue
======================================================================
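
For comparison with the results reported above, here is a minimal sketch of how a generic XML parser treats the content1 markup. It uses Python's standard xml.etree.ElementTree, not Raptor, so it says nothing about Raptor's internals. It assumes the entity references in /tmp/try.xml are the predefined &amp;amp;, &amp;lt; and &amp;gt;, omits the namespace declarations for brevity, and does not attempt the canonicalisation a real rdf:XMLLiteral would need. The helper names plain_text and inner_xml are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Simplified version of the content1 paragraph (namespace declarations
# omitted; entity references assumed to be the predefined XML ones).
SRC = (
    "<p>content1: <![CDATA[cdata<>&]]> "
    "<span>not</span>&amp;&lt;&gt;</p>"
)


def plain_text(src: str) -> str:
    """Concatenate all character data, as one might for a plain literal."""
    return "".join(ET.fromstring(src).itertext())


def inner_xml(src: str) -> str:
    """Serialize the children of the root element, re-escaping text."""
    root = ET.fromstring(src)
    parts = [escape(root.text or "")]
    # ET.tostring() on a child includes the child's tail text and
    # escapes '&', '<' and '>' in character data.
    parts.extend(ET.tostring(child, encoding="unicode") for child in root)
    return "".join(parts)


if __name__ == "__main__":
    print(plain_text(SRC))
    print(inner_xml(SRC))
    # The re-escaped fragment parses again once wrapped in an element.
    ET.fromstring("<p>" + inner_xml(SRC) + "</p>")
```

Under these assumptions the CDATA text is kept in both forms: plain_text yields "content1: cdata<>& not&<>", and inner_xml yields markup in which every '&', '<' and '>' from character data is escaped, so the fragment can be parsed again.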
