Hi Nahid,
there is no standalone tutorial for XMLFormatter, but you can look at how DOMWriterImpl (in 2.8) and DOMLSSerializerImpl (in 3.0) use it to
understand its features. In short:
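// Note: operator<< and formatBuf() take XMLCh strings; narrow literals are
// shown for brevity and would need transcoding (e.g. XMLString::transcode).
LocalFileFormatTarget fileTarget("c:\\tmp\\tmp.xml");   // output file
XMLFormatter formatter("utf-8", &fileTarget);           // encode output as UTF-8
// Markup is written verbatim; the attribute value uses attribute escaping
formatter << XMLFormatter::NoEscapes << "<root attr=\"" <<
XMLFormatter::AttrEscapes << "value" << XMLFormatter::NoEscapes << "\">";
// formatBuf() takes an explicit length and escape mode for character data
formatter.formatBuf("literal", 7, XMLFormatter::CharEscapes);
formatter << XMLFormatter::NoEscapes << "</root>";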
Alberto
Nahid wrote:
Thank you, Alberto. Is it possible to give me a tutorial link for XMLFormatter?
-Nahid

Alberto Massari wrote:
Hi Nahid,
if you are writing the text directly to a final XML file, you can use an XMLFormatter object, which takes care of the conversion more efficiently (instead of scanning the input string 5 times and then reallocating it, as you are doing now). Depending on the type of filtering you are doing, you could get better performance with a grep-like tool, if what you are filtering is not XML-aware.
Alberto

Nahid wrote:
Hi Alberto,
Thanks for your reply. Actually my program is a filter, so I'm just taking an XML file and outputting another XML file. So it would be better if I could output the text as it was, that is, &.amp; instead of &. Right now I'm using this function to convert it back before writing the output:

string escape(string str) {
    string a[] = {"&", "<", ">", "\"", "'"};
    string b[] = {"&amp;", "&lt;", "&gt;", "&quot;", "&apos;"};
    for (int i=0; i<5; i++) {
        size_t pos=0;
        while ( ( pos = str.find(a[i], pos ) ) != std::string::npos ) {
            str.replace( pos, a[i].length(), b[i] );
            pos += b[i].length();
        }
    }
    return str;
}

But the problem is, as I said before, that I have to parse a huge file (around 100 GB to 1 TB), so doing the conversion twice is costly (this function adds 10% to the running time :( ). So it would be better if I could tell SAX2XMLReader not to convert &.amp; to &, which would save the double conversion time.
Thank you again
-Nahid

Alberto Massari wrote:
Hi Nahid,
an XML document cannot contain a & character, as it has a special meaning (beginning of an entity reference); why do you need to see the raw text instead of its meaning?
Alberto

Nahid wrote:
Hi,
Before posting, I searched for a solution but couldn't find any. Maybe it has a trivial solution. I'm using SAX2XMLReader to parse a huge XML file which contains entity references (&gt;, &lt;, etc.). For example, "<title>abc &amp; cde</title>" is converted to "<title>abc & cde</title>". I don't want this automatic conversion. I just want the actual text. Do you have any idea how I can do it?
Thanks
Regards
-Nahid
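
For readers following the performance point in this thread (the escape() function scans the string once per special character and reallocates on every replacement), here is a minimal sketch of a single-pass alternative that streams straight to an output. It uses only the C++ standard library and is not Xerces-C++ code; the names writeEscaped and escapeToString are hypothetical helpers made up for this illustration, and it is a sketch of the idea rather than a drop-in replacement for XMLFormatter.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Xerces-C++): writes `text` to `out`,
// replacing the five XML special characters in one left-to-right pass.
// Runs of ordinary characters are written in bulk, so the input string
// is never modified or reallocated.
void writeEscaped(std::ostream& out, const std::string& text) {
    std::size_t runStart = 0;  // start of the pending run of ordinary characters
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char* entity = nullptr;
        switch (text[i]) {
            case '&':  entity = "&amp;";  break;
            case '<':  entity = "&lt;";   break;
            case '>':  entity = "&gt;";   break;
            case '"':  entity = "&quot;"; break;
            case '\'': entity = "&apos;"; break;
            default:   break;
        }
        if (entity != nullptr) {
            // Flush the ordinary characters seen so far, then the entity.
            out.write(text.data() + runStart,
                      static_cast<std::streamsize>(i - runStart));
            out << entity;
            runStart = i + 1;
        }
    }
    // Flush whatever follows the last special character.
    out.write(text.data() + runStart,
              static_cast<std::streamsize>(text.size() - runStart));
}

// Hypothetical convenience wrapper returning the escaped text as a string.
std::string escapeToString(const std::string& text) {
    std::ostringstream out;
    writeEscaped(out, text);
    return out.str();
}
```

In a filter like the one described, writeEscaped could be called with the output file stream from the character-data callback, so that each chunk of text is visited once on its way out.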
