On Fri, Oct 11, 2013 at 9:22 PM, Field Tian (fitian) <fit...@cisco.com> wrote:
> Yea, cdata is a traditional way in XML. XMPP seems not prefer CDATA. I don’t 
> know what's the detailed reason. But I think CDATA has its advantage 
> sometimes.

There is a common assumption that CDATA allows adding JSON to XML
without any further escaping. This isn't true.

XML 1.0 defines:
CData   ::=   (Char* - (Char* ']]>' Char*))

So JSON containing "]]>" cannot be placed in a CDATA section without
escaping it in some way.
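
To illustrate, here is a minimal Python sketch (standard library only).
The <msg> element and the helper names are made up for this example.
It wraps a JSON string in CDATA naively, then again using the usual
workaround of splitting "]]>" across two CDATA sections:

```python
import json
import xml.etree.ElementTree as ET

def wrap_naive(text):
    # Assumes the text never contains "]]>", which JSON does not guarantee.
    return "<msg><![CDATA[" + text + "]]></msg>"

def wrap_split(text):
    # Close the section after "]]" and reopen a new one before ">".
    return "<msg><![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]></msg>"

def roundtrips(xml, text):
    # True only if the document parses and yields the original text back.
    try:
        return ET.fromstring(xml).text == text
    except ET.ParseError:
        return False

payload = json.dumps({"note": "a]]>b"})
print(roundtrips(wrap_naive(payload), payload))
print(roundtrips(wrap_split(payload), payload))
```

The naive wrapper does not round-trip (first line prints False), while
the split version does (second line prints True). The split is itself a
form of escaping, which is the point.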

Also, XML 1.0 defines valid characters (all characters, whether in
PCDATA, in CDATA, or written as character references) as:
Char   ::=   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
[#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate
blocks, FFFE, and FFFF. */

While JSON (RFC 4627) defines unescaped characters as:
unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

These are not the same. JSON can contain unescaped characters that are
not allowed anywhere in XML, CDATA sections included (U+FFFE and
U+FFFF, for example). So you have to escape those anyway.
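
The gap between the two productions can be checked mechanically. A
small Python sketch, with function names invented for this example,
that transcribes both grammars and lists the code points JSON accepts
unescaped but XML rejects:

```python
def is_xml_char(cp):
    # XML 1.0 "Char" production.
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)

def is_json_unescaped(cp):
    # RFC 4627 "unescaped" production.
    return 0x20 <= cp <= 0x21 or 0x23 <= cp <= 0x5B or 0x5D <= cp <= 0x10FFFF

# Code points allowed unescaped in a JSON string but not allowed in XML.
json_only = [cp for cp in range(0x110000)
             if is_json_unescaped(cp) and not is_xml_char(cp)]

print(len(json_only), hex(json_only[0]), hex(json_only[-1]))
```

This prints "2050 0xd800 0xffff": the surrogate range U+D800 to U+DFFF
(which the JSON ABNF does not exclude) plus U+FFFE and U+FFFF.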

As a general rule, if you embed one textual language inside another,
escaping is pretty much mandatory. If you see a way around it, you are
probably mistaken and missing the edge cases.
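
For comparison, a sketch of simply escaping instead of using CDATA, in
Python with the standard library. The <msg> element and the function
name are assumptions for this example. It relies on json.dumps
defaulting to ensure_ascii=True, so non-ASCII characters come out as
\uXXXX escapes, and on xml.sax.saxutils.escape handling &, < and >:

```python
import json
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def json_to_xml_text(value):
    # ensure_ascii=True (the default) keeps the JSON text pure ASCII,
    # so characters such as U+FFFF never appear literally in the XML.
    text = json.dumps(value)
    # Escape the XML-significant characters: & < >
    return escape(text)

value = {"note": "a]]>b <tag> & \uffff"}
xml = "<msg>" + json_to_xml_text(value) + "</msg>"

# Parse the XML, then parse the JSON found in the element text.
decoded = json.loads(ET.fromstring(xml).text)
print(decoded == value)
```

Here the "]]>" sequence, the markup characters, and U+FFFF all survive
the round trip, because each layer escapes for the layer that wraps it.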

--
Waqas Hussain
