Reopen issue: Character entities are escaped too aggressively
-------------------------------------------------------------

         Key: AXIS-2342
         URL: http://issues.apache.org/jira/browse/AXIS-2342
     Project: Apache Axis
        Type: Bug
  Components: Serialization/Deserialization  
    Versions: 1.0    
 Environment: Operating System: All
              Platform: All
    Reporter: Thiago Jung Bauermann
 Assigned to: Axis Developers Mailing List 


We are using SOAP to send XML documents from client to server and back. The 
documents contain a lot of non-ASCII data, which we encode as UTF-8. 
However, when the documents are retrieved from an Axis server, Axis escapes 
almost all of our non-ASCII characters into character entities (&#...;). This 
means messages become about three times as big as they need to be for 
'international' documents, which for us is a large performance problem. I 
narrowed down the problem to
  XMLUtils::xmlEncodeString
that has the code:
                if (((int)chars[i]) > 127) {
                        strBuf.append("&#");
                        strBuf.append((int)chars[i]);
                        strBuf.append(";");
This seems unnecessary to me, as Axis sends all messages in UTF-8 anyway, 
which can represent these characters directly, so no escaping is needed (and 
if the encoding were made configurable, I feel the escaping should be handled 
elsewhere).
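
To illustrate the size difference, here is a minimal sketch. It is not Axis 
code: the class EscapeSketch and its methods are hypothetical names made up 
for this example. escapeNonAscii applies the same "> 127" check as the snippet 
quoted above, while escapeMarkupOnly shows what I would expect to be enough 
for UTF-8 output, assuming only markup characters need escaping.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of Axis.
class EscapeSketch {

    // Escapes only the characters that are significant in XML markup and
    // leaves every other character (including non-ASCII) untouched.
    static String escapeMarkupOnly(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    // Same as above, but additionally turns every char above 127 into a
    // numeric character reference, like the check quoted in this report.
    static String escapeNonAscii(String s) {
        String markupEscaped = escapeMarkupOnly(s);
        StringBuilder out = new StringBuilder(markupEscaped.length());
        for (int i = 0; i < markupEscaped.length(); i++) {
            char c = markupEscaped.charAt(i);
            if (c > 127) {
                out.append("&#").append((int) c).append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Number of bytes the string occupies on the wire when sent as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Three accented Latin letters: U+00E9, U+00E8, U+00FC.
        String text = "\u00e9\u00e8\u00fc";
        System.out.println("markup only: " + utf8Length(escapeMarkupOnly(text)) + " bytes");
        System.out.println("non-ASCII escaped: " + utf8Length(escapeNonAscii(text)) + " bytes");
    }
}
```

For accented Latin letters such as these, each character takes 2 bytes in 
UTF-8 but 6 bytes as a numeric character entity, which matches the factor of 
about three that we observe.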

Is there any reason for this code? I commented it out and it seemed to have no 
adverse effect on our application (apart from reduced network traffic).

Tested with 1.0; I also found the same code in the sources of 1.1-rc2.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
