Title: 0x1A Character
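The spec is hard to read on this point, but so-called "restricted" characters are not allowed to appear literally in a document. In XML 1.1 they may only be written as character references (for example, &#x1A;), and in XML 1.0 they are not allowed at all. See the discussion beginning at http://www.stylusstudio.com/xmldev/200410/post30210.html for an explanation.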
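
For reference, the character classes involved can be written down directly from the Char production of XML 1.0 and the Char and RestrictedChar productions of XML 1.1. The following is only a sketch for illustration: the function names are made up here and are not part of the Xerces API, and the code works on Unicode code points rather than on Xerces' XMLCh units.

```cpp
#include <cstdint>

// XML 1.0 Char:
//   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
bool isXml10Char(std::uint32_t c) {
    return c == 0x9 || c == 0xA || c == 0xD ||
           (c >= 0x20 && c <= 0xD7FF) ||
           (c >= 0xE000 && c <= 0xFFFD) ||
           (c >= 0x10000 && c <= 0x10FFFF);
}

// XML 1.1 Char:
//   [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
bool isXml11Char(std::uint32_t c) {
    return (c >= 0x1 && c <= 0xD7FF) ||
           (c >= 0xE000 && c <= 0xFFFD) ||
           (c >= 0x10000 && c <= 0x10FFFF);
}

// XML 1.1 RestrictedChar:
//   [#x1-#x8] | [#xB-#xC] | [#xE-#x1F] | [#x7F-#x84] | [#x86-#x9F]
// These are Chars in XML 1.1, but must not appear literally in a document.
bool isXml11RestrictedChar(std::uint32_t c) {
    return (c >= 0x1 && c <= 0x8) ||
           (c >= 0xB && c <= 0xC) ||
           (c >= 0xE && c <= 0x1F) ||
           (c >= 0x7F && c <= 0x84) ||
           (c >= 0x86 && c <= 0x9F);
}
```

With these definitions, 0x1A is not a Char in XML 1.0, and in XML 1.1 it is a Char that is also a RestrictedChar, which is why a literal 0x1A is rejected under either version.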


From: [EMAIL PROTECTED]
Sent: Thursday, February 03, 2005 9:39 AM
To: xerces-c-dev@xml.apache.org
Subject: 0x1A Character

Hi All,

I have a question regarding an invalid XML character and how Xerces handles it.  I apologize, since my questions are not entirely Xerces-specific.

Recently, we found out that some of our XML text nodes contain the 0x1A character.  This causes the Xerces parser to throw an "Invalid character (Unicode: 0x1A)" error.

Looking at the XML specs, I found that the XML 1.0 spec does not include 0x1A in its list of valid characters.  However, the XML 1.1 spec lists it as valid but "restricted" (I am not sure what that means).

On reading this, I changed my XML prolog version to 1.1.  The DOMCount sample that I used still throws an exception.  I used Xerces 2.5.0 for my testing.

Questions I have:

1. What does a "restricted" character mean?

2. Will this be supported in the future?  Is that part of 1.1 spec not supported by Xerces yet?

3.  Xerces does not throw an exception when I create a DOM document and add a text node with that character.  Is there a method for me to check validity of my data before adding it to the text node?

4.  Any suggestions on getting around this issue?


Thanks,

Ravin
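
One possible workaround for questions 3 and 4 is to filter the data before it is handed to the DOM. The sketch below is an assumption-laden illustration, not a Xerces facility: stripInvalidXml10Chars is a hypothetical helper, it takes text as Unicode code points (std::u32string) rather than XMLCh, and it simply drops anything outside the XML 1.0 Char production. Whether dropping, replacing, or rejecting such characters is appropriate depends on the application.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: returns a copy of `text` with every code point
// that is not an XML 1.0 Char removed (0x1A, for example, is dropped).
std::u32string stripInvalidXml10Chars(std::u32string text) {
    auto isInvalid = [](char32_t c) {
        const bool ok = c == 0x9 || c == 0xA || c == 0xD ||
                        (c >= 0x20 && c <= 0xD7FF) ||
                        (c >= 0xE000 && c <= 0xFFFD) ||
                        (c >= 0x10000 && c <= 0x10FFFF);
        return !ok;
    };
    // Erase-remove idiom: compact the valid characters, then trim the tail.
    text.erase(std::remove_if(text.begin(), text.end(), isInvalid),
               text.end());
    return text;
}
```

The cleaned string would then be converted to the encoding the DOM expects before creating the text node. The same predicate can be used with std::any_of to check data and report an error instead of silently removing characters.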
