It is worth noting that some characters are forbidden in XML altogether, such as "chr(0)" (the NUL character), which cannot appear even as a character reference. When dealing with external text input, some cleanup may be necessary to avoid breaking indexing. For example, you could replace each forbidden XML character with a space (" ").
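As a rough illustration of that cleanup, here is a minimal Python sketch. It assumes the document is XML 1.0 and uses the character ranges allowed by that specification; the function name `sanitize_for_xml` and the `replacement` parameter are made up for this example and are not part of any library.

```python
import re

# Characters allowed by XML 1.0 (the "Char" production):
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# The negated character class below matches everything outside those ranges.
_FORBIDDEN_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def sanitize_for_xml(text, replacement=" "):
    """Replace every character that XML 1.0 forbids (hypothetical helper)."""
    return _FORBIDDEN_XML_CHARS.sub(replacement, text)


if __name__ == "__main__":
    # chr(0) is forbidden, so it becomes a space.
    print(sanitize_for_xml("foo\x00bar"))  # prints "foo bar"
```

This only removes characters that XML cannot represent at all. Characters such as "<", ">" and "&" are legal but still need escaping before they go into the document; in Python, `xml.sax.saxutils.escape` handles those three.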
André

On 01/15/2013 09:55 PM, Alexandre Rafalovitch wrote:
Interesting point. Looks like CDATA is more limiting than I thought: http://en.wikipedia.org/wiki/CDATA#Issues_with_encoding. Basically, the recommendation is to avoid CDATA and automatically encode characters such as yours, as well as the less-than and greater-than signs and the ampersand.
Regards,
Alex.

--
André Bois-Crettez
Search technology, Kelkoo
http://www.kelkoo.com/