Hello,

Thanks for all the replies. I feel quite safe now that I know how it works. All our important data is held within <xxx>...</xxx>, so the extra line breaks and spaces added for readability will have no impact.

Just one follow-up question. XML_PARSE_NOBLANKS fixed everything, and I can now mix CR and CRLF in the file. But when I read about the option, it says it handles whitespace and does not mention CRLF. Is that part missing from the documentation, or am I just misreading it? Is the line feed (0x0A, decimal 10) handled as a blank, and are there any other characters I might be missing?
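To make the question concrete, here is a minimal sketch of the two rules from the XML 1.0 spec as I understand them: line ends are normalized before parsing (section 2.11: CRLF and a lone CR both become a single LF), and "whitespace" means exactly four characters (space 0x20, tab 0x09, CR 0x0D, LF 0x0A). The function names are hypothetical helpers for illustration only; they are not libxml2 API and not libxml2's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/* XML 1.0 whitespace (production S): space, tab, CR, LF. */
static int is_xml_blank(unsigned char c)
{
    return c == 0x20 || c == 0x09 || c == 0x0D || c == 0x0A;
}

/* Line-end normalization in the spirit of XML 1.0 section 2.11,
 * done in place on a NUL-terminated string:
 * CRLF -> LF, lone CR -> LF. Returns the new length. */
static size_t normalize_line_ends(char *s)
{
    size_t r = 0, w = 0;

    assert(s != NULL);
    while (s[r] != '\0') {
        if (s[r] == '\r') {
            s[w++] = '\n';
            r++;
            if (s[r] == '\n')
                r++;            /* swallow the LF of a CRLF pair */
        } else {
            s[w++] = s[r++];
        }
    }
    s[w] = '\0';
    return w;
}

/* Returns 1 when every character is XML whitespace, i.e. the text
 * would be a whitespace-only text node; returns 0 otherwise. */
static int is_blank_text(const char *s)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; s[i] != '\0'; i++) {
        if (!is_xml_blank((unsigned char)s[i]))
            return 0;
    }
    return 1;
}
```

With these helpers, the 6-byte string "\r\n  \r\t" normalizes to 5 bytes (LF, two spaces, LF, tab) and is_blank_text() reports it as blank-only. is_xml_blank(0x0A) is true, while is_xml_blank(0x10) is false, since 0x10 is a different control character from the line feed.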
/James

2010/6/23 Michael Ludwig <[email protected]>

> James Ytterstene wrote on 23.06.2010 at 14:41 (+0200):
>
> > If I have the file unchanged from any Windows editor the line ending
> > is CR only, but if someone edits the file it will be changed to CRLF
> > (stupid Windows editors, but we must use them). If I now try to read the
> > file back in libxml2 I will get an extra node at each line only
> > containing 0x0A.
>
> Most serious editors have an option to go with DOS or UNIX or Mac line
> endings. Maybe yours do, too.
>
> > If I change the xmlReadFile and add the option XML_PARSE_NOBLANKS I
> > can read the file back OK. But when reading about that option I find
> > many posts about not to use it, so I'm confused here.
>
> The question you have to answer: Are whitespace-only text nodes in your
> XML significant or not? If they're not significant, nothing wrong with
> stripping them. Unless, of course, your output is intended for human
> consumption. In that case, you have to keep them, or apply automatic
> output indenting.
>
> > When I read about libxml2 and how files should be parsed I get the
> > feeling that the parser should handle the CRLF when reading files and
> > always save the new files with CR only. So the extra CRLF shouldn't be
> > any issue, but I can be wrong here.
>
> It's a requirement of the XML spec:
>
> http://www.w3.org/TR/REC-xml/#sec-line-ends
>
> > Is there any general solution for the parsing of files so the CR CRLF
> > doesn't add any extra nodes?
>
> Well yes, the one you already found. Strip whitespace-only text nodes on
> parsing, using the appropriate parser or processor option, like in this
> case XML_PARSE_NOBLANKS.
>
> --
> Michael Ludwig
> _______________________________________________
> xml mailing list, project page http://xmlsoft.org/
> [email protected]
> http://mail.gnome.org/mailman/listinfo/xml
>
