Thank you, Alberto.
Could you point me to a tutorial or example for XMLFormatter?
-Nahid
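
For anyone reading this thread later: the single-pass idea Alberto describes in the quoted message below can be sketched in plain C++ without Xerces. This is only an illustration, not XMLFormatter itself. The name escape_once and the reserve() heuristic are my own assumptions. As far as I know, the SAX2Print sample shipped with Xerces-C shows XMLFormatter in use, so that is probably the closest thing to a tutorial.

```cpp
#include <string>

// Hypothetical helper (not part of Xerces-C): escape the five XML special
// characters in a single pass, building the result in a new string instead
// of calling find/replace on the input once per character.
std::string escape_once(const std::string& in) {
    std::string out;
    // Assumption: most text needs few escapes, so reserve slightly more
    // than the input size to reduce reallocations while appending.
    out.reserve(in.size() + in.size() / 8);
    for (char c : in) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;        break;  // copy everything else unchanged
        }
    }
    return out;
}
```

Calling escape_once("abc & cde") returns "abc &amp; cde". Each input character is visited once, so the cost no longer depends on how many different characters are being escaped.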


Alberto Massari wrote:
> 
> Hi Nahid,
> if you are writing the text directly to the final XML file, you can use an 
> XMLFormatter object, which takes care of the conversion more 
> efficiently (instead of scanning the input string 5 times and 
> reallocating it, as you are doing now). Depending on the type of 
> filtering you are doing, you could get better performance from 
> a grep-like tool, if the filtering does not need to be XML-aware.
> 
> Alberto
> 
> Nahid wrote:
>> Hi Alberto,
>> Thanks for your reply. 
>> Actually, my program is a filter: it takes one XML file and outputs
>> another.
>> So it would be better if I could output the text as it was, that is,
>> &amp; instead of &.
>>
>> Right now I'm using this function to convert it back before writing the
>> output.
>>
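>> // Escape the five XML special characters. "&" must come first so that
>> // the ampersands introduced by later replacements are not escaped again.
>> string escape(string str) {
>>   const string a[] = {"&", "<", ">", "\"", "'"};
>>   const string b[] = {"&amp;", "&lt;", "&gt;", "&quot;", "&apos;"};
>>   for (int i = 0; i < 5; i++) {
>>     size_t pos = 0;
>>     while ((pos = str.find(a[i], pos)) != std::string::npos) {
>>       str.replace(pos, a[i].length(), b[i]);
>>       pos += b[i].length();  // skip past the entity just inserted
>>     }
>>   }
>>   return str;
>> }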
>>
>> But the problem is that, as I said before, I have to parse a huge file
>> (around 100 GB to 1 TB), so doing the conversion twice is costly (this
>> function adds 10% to the running time :( ).
>>
>> So it would be better if I could tell SAX2XMLReader not to convert
>> &amp; to &, which would save the double conversion.
>>
>> Thank you again
>> -Nahid
>>
>>
>> Alberto Massari wrote:
>>   
>>> Hi Nahid,
>>> an XML document cannot contain a & character, as it has a special 
>>> meaning (beginning of an entity reference); why do you need to see the 
>>> raw text instead of its meaning?
>>>
>>> Alberto
>>>
>>> Nahid wrote:
>>>     
>>>> Hi,
>>>> Before posting, I searched for a solution but couldn't find one. Maybe
>>>> it has a trivial solution.
>>>> I'm using SAX2XMLReader to parse a huge XML file that contains entity
>>>> references (&gt;, &lt;, etc.).
>>>> For example, "<title>abc &amp; cde</title>" is converted to
>>>> "<title>abc & cde</title>".
>>>> I don't want this automatic conversion. I just want the actual text.
>>>> Do you have any idea how I can do this?
>>>> Thanks
>>>> Regards
>>>> -Nahid

-- 
View this message in context: 
http://www.nabble.com/xerces-translates-entity-characters%28-gt...%29-automatically-but-I-don%27t-want-to-tp20002547p20022739.html
Sent from the Xerces - C - Users mailing list archive at Nabble.com.
