Hi,

I created the dump with the mysql -X command, which writes the output as an
XML file. I don't know whether there is a problem with my XML files. Is
there a specific notation for representing ZWJ and ZWNJ in XML files?

I am attaching one of my XML files.

Thank you for your help. If you have a better idea of what to do with the
XML file when it contains characters like these, or any links with more
details, please point me to them.

regards

Jinesh K J

On Nov 28, 2007 4:46 PM, Alberto Massari <[EMAIL PROTECTED]> wrote:

> If you can read the original file, but not when you edit it, I would bet
> the reason is in the way you edit your XML files (and dump from the
> database). What are you using? Could you attach a small sample file?
>
> Alberto
>
> jinesh kj wrote:
> > hi,
> >
> > I tried reading the file you sent. It didn't give any error, which means
> > it was read correctly. I don't know how to check in the debugger, so I
> > don't know whether it read 200D or not. But if I edit the XML file and
> > add some text data to it, the text is not read. Do I have to do anything
> > for that? Basically, I am trying to read through an XML file that is a
> > dump of a MySQL database. It has many ZWJs. I don't know whether they
> > are valid according to the specified encoding. But since it was dumped
> > from the database using the built-in function, I think the chance of an
> > error is very low.
> >
> > I am using a similar function in my program; it returns nothing when
> > there is a ZWJ in my data.
> >
> > I hope I am clear. I am able to read XML files without ZWJ easily.
> >
> > regards
> >
> > Jinesh K J
> >
> > On Nov 28, 2007 4:02 PM, Alberto Massari <[EMAIL PROTECTED]> wrote:
> >
> >
> >> I am attaching a sample XML that contains a U+200D character between a
> >> --| and |-- pattern; I modified DOMPrint to issue a
> >>
> >>            const XMLCh* data = doc->getDocumentElement()->getTextContent();
> >>
> >> and in the debugger I see that data[4] is \x200D.
> >> Have you checked that your source XML really has that character? Also,
> >> is the representation of the ZWJ character in the XML file valid
> >> according to the specified encoding (e.g. in UTF-8, it's 0xE2 0x80 0x8D)?
> >>
> >> Alberto
> >>
> >> jinesh kj wrote:
> >>
> >>> hi,
> >>>
> >>> Actually, getTextContent is not returning any value when there is a
> >>> zero-width joiner.
> >>>
> >>> cheers
> >>>
> >>> Jinesh K J
> >>>
> >>> On Nov 28, 2007 3:28 PM, Alberto Massari <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> Hi Jinesh,
> >>>> which kind of issues are you having? The text returned by
> >>>> getTextContent should contain a \x200D value inside. Or have you
> >>>> transcoded it into chars?
> >>>>
> >>>> Alberto
> >>>>
> >>>> jinesh kj wrote:
> >>>>
> >>>>
> >>>>> hi all,
> >>>>>
> >>>>> I was trying to read from an XML file where some of the data contains
> >>>>> a zero-width joiner. I used getTextContent in DOMNode. I was able to
> >>>>> read the contents without a zero-width joiner, but there are some
> >>>>> issues with these special characters. What do I have to change? Do I
> >>>>> have to make any special settings? Or do I have to use another
> >>>>> function instead?
> >>>>>
> >>>>> cheers
> >>>>> Jinesh K J


-- 
My Feelings,Expressions-
http://logbookofanobserver.blogspot.com

SMC : My computer, My language http://smc.org.in
Swathanthra Malayalam Computing: my language for my computer
<?xml version="1.0"?>

<resultset statement="select * from TEXTS where BookCode=0009
">
  <row>
	<field name="BookCode">0009</field>
	<field name="PageNo">6</field>
	<field name="ImageLoc">/test/extra1/garnome-2.20.0/0009_Meinkamph_Img_600_Deskew/0009_MeinKamph_Img_600_Deskew_Page_0006.tif</field>
	<field name="Text">ut'amayaayirunnu. mahaajanasamuuhatte aat'iyulaykkaan‍ poonna prabhaashhand-a
ng-ng-al'il‍kkuut'i addeihn' tanr'e bhaashhaaparamaaya shaili muur‍chchakuut't'iyet'uttirunnu.
etiraal'iyil‍ninnu vashamaakkeind-t'a at'avukal'‍ manassilaakkikkond-t'utanne
manakkaruttoot'e ayaal'e neirit'unnatil‍ hir'r'lar‍ kaand-ikkunna dhairyavun' 
sthairyavun' lookananmaykkuveind-t'i upayoogichchirunneng-kil‍ ennu naan'
praar‍tthichchupookun'. charitrn' srxshht'ichcha oru charitrapurushhan‍ rachikkunna
samakaaliinacharitrn' enna nilayil‍- naasi prasthaanattinr'e aantara
muulyavun' atinr'e veirukal'un' pat'hikkaan‍ utakunna muulagranthamenna nila
yil‍- ii krxti shraddheiyamaand-ennatinu sn'shayamilla. uttamamaaya oru
saahityarachana ennatilupari ii irupataan' nuur'r'aand-t'ile ativikasita
raajyang-ng-al'il‍ onnaaya jar‍maniyil‍ parishuddha aaryan‍raktattinr'e mahattvn'
uyar‍ttippit'ikkaan‍ vempiya oru raashht'ratantrajnj-anr'e vikalamaaya 
antar‍dar‍shanattinr'e aavishhkaarn' enna nilaykk ii krxti shraddha
ar‍hikkunnu. rachayitaavinr'e aatmavatta suukshhmamaayi pratiphalippikkunna
ii krxti phaasisatteyun' sar‍vaadhipatyatteyun' etir‍kkaanaagrahikkunna
ellaa varun' paat'hapustakn'poole pat'hikkeind-t'ataand-. vit't'uviizchayillaatta oru
vikat'a vishvaasattinr'e vaktaavaaya addeihn' etra lakshhn' nissahaayaraaya
juutanmaareyun' mar'r'ul'l'avareyun' gyaas choon'bar'ukal'ilit't'u konnu ennat innun'
oru du:svapnamaayi irupataan'nuur'r'aand-t'inr'e kallichcha man:saakshhiyeppoolun'
aloosarappet'uttikkond-t'irikkunnu. juutaviroodhn', svavn'shaaraadhana,
adhikaaradaahn', vit't'uviizchayillaayma, aashayang-ng-al'ilul'l'a kat'un'pit'uttn'
enning-ng-ane pala duushhyavashang-ng-al'umul'l'a hir'r'lar'ut'e svabhaavattinr'e mar'uvashn'
ii aatmakathayil‍ ang-ng-ing-ng- sphurikkunnund-t'. hir'r'lar‍kk jar‍maniyil‍
kit't'iya vampichcha pintund-aykkul'l'a at'isthaanakaarand-avun' ii krxtiyil‍ninnu
kur'eyokke manassilaakkaan'.

ad'ool'‍ph hir'r'lar‍ vishvacharitrattile orapuur‍vvapratibhaasamaand-.
deishiiyatayut'eyun' saamyavaadattinr'eyun' meilang-kiyand-inj-nj-u manushhyavar‍ggatte
vn'shaat'isthaanattil‍ maatrn' nookkikkond-t' raktattinr'e parishuddhi parigand-ichch
aaryavar‍ggattinuveind-t'i lookaadhipatyn' neit'iyet'ukkaan‍ shramichcha oru van‍kit'a
kalaapakaariyun' manushhyavidveishhiyun' hin'saamuur‍ttiyumaayirunnu addeiha
menn janang-ng-al'‍ potuve vishvasikkunnu. hir'r'lar'ut'e cheytikal'ut'e 
duushhyaphalang-ng-al'‍ neirit't' anubhavichchavaril‍ innavasheishhichchit't'ul'l'avar‍kk dashaabda 
ng-ng-al'‍kkusheishhavun' nj-et't'alund-t'aakkunna oru bhiikarasatvamaand- addeihn'. onnaan'
lookayuddhattinusheishhn' vijayiraajyang-ng-al'‍ jar‍maniyut'emeil‍ eilpichcha
saampattikavun' bhuumishaastraparavun' maanasikavumaaya aaghaatang-ng-al'‍ und-ar‍tti
vit't'a pratikaaradaahn' aakaarn'puund-t'ataayirunnu addeihn' ennu vichaarikku
nnavarun' kur'avalla. lookaavasaanakaalatte sar‍vavinaashashaktiyaaya kal‍kki</field>
  </row>

  <row>
	<field name="BookCode">0009</field>
	<field name="PageNo">1</field>
	<field name="ImageLoc">/extra2/Annotation/0009_Meinkamph_Img_600_Deskew/0009_MeinKamph_Img_600_Deskew_Page_0001.tif</field>
	<field name="Text">ad'ool'‍ph hir'r'lar‍

(1889-1945)

pragalbhanaaya jar‍man‍ seichchhaadhikaari. rand-t'aan' lookamahaayuddhattinr'e kaarand-a
kkaaran‍. 1889-l‍ aastriyayil‍ janichchu. skuul'‍pat'hann' kazinj-nj- kur'enaal'‍ chitra
ng-ng-al'‍ varachchun' vir'r'un' nat'annu. 1913-l‍ sainyattil‍ cheir‍nnu. tut'ar‍nn raashht'riiyatti
lit'apet't', jar‍mman‍ naashhand-al‍ sooshhyalisr' (naasi) paar‍t't'iyut'e talavanaayi. krameind-a
jar‍mmaniyut'e chaan‍salar‍, prasid'anr' ennii padavikal'‍ neit'i(1933-’34). 1923-l‍
jayilil‍vachch main‍ kaan'ph (enr'e pooraat't'n') ezuti. it hir'r'lar'ut'e aatmakatha
yaand-. charitrapraadhaanyamul'l'a ii krxti rand-t'u bhaagamaayi 1925-”'26-laand- prasiddhiika
richchat. rand-t'aan' lookamahaayuddhattil‍ paraajayappet't'appool'‍ 1945 eipril‍ 30-n
aatmahatya cheytu.
</field>
  </row>
</resultset>
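
The check suggested in the thread (whether the file really contains ZWJ, and whether it is encoded as 0xE2 0x80 0x8D in UTF-8) can be sketched in plain C++ without Xerces. The helper names `countOccurrences` and `escapeJoiners` below are hypothetical and not part of any library. The sketch assumes the dump is UTF-8, which is what an XML parser assumes when the declaration names no encoding, as in the attached file. The second helper rewrites the raw bytes as the standard XML numeric character references `&#x200D;` (ZWJ) and `&#x200C;` (ZWNJ), which is one notation for these characters in XML. Whether this helps with the getTextContent problem is not established in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// UTF-8 byte sequences of the two joiner characters.
// U+200D ZERO WIDTH JOINER     -> E2 80 8D (the bytes quoted in the thread)
// U+200C ZERO WIDTH NON-JOINER -> E2 80 8C
static const std::string kZwj  = "\xE2\x80\x8D";
static const std::string kZwnj = "\xE2\x80\x8C";

// Hypothetical helper: count non-overlapping occurrences of `needle` in `text`.
std::size_t countOccurrences(const std::string& text, const std::string& needle) {
    if (needle.empty()) {
        return 0;  // avoid looping forever on an empty pattern
    }
    std::size_t count = 0;
    for (std::size_t pos = text.find(needle); pos != std::string::npos;
         pos = text.find(needle, pos + needle.size())) {
        ++count;
    }
    return count;
}

// Hypothetical helper: replace raw ZWJ/ZWNJ bytes in UTF-8 text with XML
// numeric character references. All other bytes are copied unchanged.
std::string escapeJoiners(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    std::size_t i = 0;
    while (i < text.size()) {
        if (text.compare(i, kZwj.size(), kZwj) == 0) {
            out += "&#x200D;";
            i += kZwj.size();
        } else if (text.compare(i, kZwnj.size(), kZwnj) == 0) {
            out += "&#x200C;";
            i += kZwnj.size();
        } else {
            out += text[i++];
        }
    }
    return out;
}
```

Reading the dump into a `std::string` in binary mode and calling `countOccurrences(contents, kZwj)` would show whether the joiner survived the dump and any later editing. A count of zero after editing would point at the editor rather than the parser.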
