If you can read the original file but not the one you edited, I would bet
the cause is in the way you edit your XML files (and dump them from the
database). What are you using? Could you attach a small sample file?
Alberto
jinesh kj wrote:
hi,
I tried reading the file you sent. It didn't give any error, which I take
to mean it was read correctly. I don't know how to check in the debugger,
so I don't know whether it read the 200D or not. But if I edit the XML
file to add some text data alongside it, the text is not read. Do I have
to do anything for that? Basically I am trying to read an XML file that is
a dump of a MySQL database. It has many ZWJs. I don't know whether it
conforms to the specified encoding. But since it was dumped from the
database using the built-in function, I think the chance of error is low.
I am using a similar function in my program, but it returns nothing when
there is a ZWJ in my data.
I hope I am clear. I am able to read XML files without a ZWJ easily.
regards
Jinesh K J
On Nov 28, 2007 4:02 PM, Alberto Massari <[EMAIL PROTECTED]> wrote:
I am attaching a sample XML file that contains a U+200D character between
a --| and a |-- pattern; I modified DOMPrint to call
const XMLCh* data=doc->getDocumentElement()->getTextContent();
and in the debugger I see that data[4] is \x200D
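
The same check can be done in code instead of in a debugger. What follows is only a minimal sketch: it assumes the text is held as null-terminated UTF-16 code units (char16_t here; the actual XMLCh typedef depends on the Xerces-C build), and findZwj is a made-up helper name, not a Xerces-C function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of Xerces-C): returns the index of the
// first U+200D code unit in a null-terminated UTF-16 string, or -1 if
// the string contains none.
long findZwj(const char16_t* text) {
    assert(text != nullptr);
    for (std::size_t i = 0; text[i] != 0; ++i) {
        if (text[i] == 0x200D) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

The index it returns depends on the exact text content, including any leading whitespace in the element.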
Have you checked that your source XML really has that character? Also, is
the representation of the ZWJ character in the XML file valid according
to the specified encoding (e.g. in UTF-8, it's 0xE2 0x80 0x8D)?
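
To show where those three bytes come from, and how the raw file contents could be checked for them, here is a rough sketch. Both encodeUtf8 and containsUtf8Zwj are hypothetical helper names introduced for illustration; neither belongs to Xerces-C, and the check only makes sense if the file really is declared as UTF-8.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
std::string encodeUtf8(std::uint32_t cp) {
    // Reject values outside the Unicode range and UTF-16 surrogates.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // U+200D falls in this three-byte range.
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: true if the raw bytes contain the UTF-8 form of
// U+200D (0xE2 0x80 0x8D).
bool containsUtf8Zwj(const std::string& bytes) {
    return bytes.find(encodeUtf8(0x200D)) != std::string::npos;
}
```

Reading the XML file into a std::string in binary mode and passing it to containsUtf8Zwj would tell you whether the character survived the dump and the later edit.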
Alberto
jinesh kj wrote:
hi,
Actually, getTextContent is not returning any value when there is a zero
width joiner.
cheers
Jinesh K J
On Nov 28, 2007 3:28 PM, Alberto Massari <[EMAIL PROTECTED]> wrote:
Hi Jinesh,
Which kind of issues are you having? The text returned by getTextContent
should contain a \x200D value inside. Or have you transcoded it into
chars?
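
The reason for asking about transcoding: a single-byte encoding has no slot for U+200D, so the character cannot survive a conversion into plain chars. The sketch below illustrates that with a hand-written Latin-1 narrowing. It is an assumption-laden illustration only: narrowToLatin1 is a made-up function, and the '?' replacement is chosen here for clarity, not taken from how Xerces-C or any particular code page transcoder behaves.

```cpp
#include <cassert>
#include <string>

// Hypothetical illustration (not Xerces-C code): narrow UTF-16 code units
// to Latin-1 bytes, replacing every unit above 0xFF with '?'.
std::string narrowToLatin1(const std::u16string& text) {
    std::string out;
    out.reserve(text.size());
    for (char16_t unit : text) {
        out += (unit <= 0xFF) ? static_cast<char>(unit) : '?';
    }
    // Exactly one output byte is produced per input code unit.
    assert(out.size() == text.size());
    return out;
}
```

If your program looks at the text only after such a conversion, inspecting the XMLCh data directly is the safer test.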
Alberto
jinesh kj wrote:
hi all,
I was trying to read from an XML file where some data has a zero width
joiner in it. I used getTextContent in DOMNode. I was able to read the
contents without a zero width joiner, but there are some issues with these
special characters. What do I have to change? Do I have to make any
special settings? Or do I have to use another function instead?
cheers
Jinesh K J