On Fri, Oct 11, 2013 at 9:22 PM, Field Tian (fitian) <fit...@cisco.com> wrote:
> Yea, cdata is a traditional way in XML. XMPP seems not prefer CDATA. I don’t
> know what's the detailed reason. But I think CDATA has its advantage
> sometimes.
There is generally this assumption that CDATA would allow adding JSON to XML
without any further escaping. This isn't true. XML 1.0 defines:

    CData ::= (Char* - (Char* ']]>' Char*))

So JSON containing "]]>" inside cannot be added to XML without escaping in
some way.

Also, XML 1.0 defines valid characters (all characters, in PCDATA, CDATA,
even escaped via entities) as:

    Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
             /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */

While JSON (RFC 4627) defines unescaped characters as:

    unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

Which are not the same. JSON can contain unescaped characters that are not
allowed in a CDATA section. So you have to escape those anyway.

As a general rule, if you embed one textual language inside another, escaping
is pretty much mandatory. If you see a way around it, you are probably
mistaken and missing the edge cases.

--
Waqas Hussain
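
To make the two edge cases above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how any XMPP library does it: the helper names cdata_problems and json_to_xml_text and the <payload> element are hypothetical. The first helper reports why a JSON text could not be pasted verbatim into a CDATA section; the second shows one possible way around it, letting the JSON serializer emit \uXXXX escapes and then applying ordinary XML escaping.

```python
import json
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Matches any character outside the XML 1.0 Char production quoted above.
_NOT_XML_CHAR = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def cdata_problems(text):
    """Hypothetical helper: why `text` cannot go verbatim into a CDATA section."""
    problems = []
    if "]]>" in text:
        problems.append("contains ']]>'")
    if _NOT_XML_CHAR.search(text):
        problems.append("contains a character outside XML 1.0 Char")
    return problems


def json_to_xml_text(value):
    """Hypothetical helper: serialize `value` as JSON, escaped for XML text."""
    # ensure_ascii=True makes json.dumps write \uXXXX for every non-ASCII
    # character, so code points such as U+FFFF never appear raw.
    ascii_json = json.dumps(value, ensure_ascii=True)
    # escape() replaces &, < and >, which also neutralizes any "]]>".
    return escape(ascii_json)


value = {"k": "a]]>b\uffff"}

# Raw JSON text with the U+FFFF character left unescaped by the serializer.
raw = json.dumps(value, ensure_ascii=False)
print(cdata_problems(raw))

# Escaped form, embedded as ordinary element text and read back.
doc = "<payload>" + json_to_xml_text(value) + "</payload>"
print(json.loads(ET.fromstring(doc).text) == value)
```

The point of the second helper is that escaping happens twice, once at the JSON layer and once at the XML layer, which is the general rule above applied literally.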