[ 
http://issues.apache.org/jira/browse/AXIS-2342?page=comments#action_12360513 ] 

Thiago Jung Bauermann commented on AXIS-2342:
---------------------------------------------

I am reopening this issue because the fix for this problem appears to have 
been removed from the source code. From what I can tell from the Subversion 
repository, revision 257917 restored the old, buggy code.

This affects me because my application must talk to a web service that 
doesn't understand XML character entities (I know, it should, but fixing the 
web service is not an option). The only way I can send non-ASCII characters is 
as raw UTF-8 or ISO-8859-1 bytes, which is not possible with Axis.

I tested with Axis 1.2.1 and 1.3. I didn't test with the trunk version, but 
looking at the code with ViewCVS, the problem is still there (class 
UTF8Encoder).
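
To illustrate the behaviour I would expect, here is a minimal sketch of 
encoding-aware escaping: a character is written as a numeric entity only when 
the target charset cannot represent it. This is not the Axis code; the class 
name CharsetAwareEscaper and the method escape are hypothetical, and the 
sketch ignores attribute quoting, characters that are illegal in XML, and 
unpaired surrogates.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Axis: escapes only what the target
// charset cannot encode, plus the XML markup characters.
public class CharsetAwareEscaper {

    public static String escape(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            int len = Character.charCount(cp);
            switch (cp) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:
                    if (encoder.canEncode(text.subSequence(i, i + len))) {
                        // Representable in the target charset: send as-is.
                        out.appendCodePoint(cp);
                    } else {
                        // Not representable: fall back to a numeric entity.
                        out.append("&#").append(cp).append(';');
                    }
            }
            i += len;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String s = "caf\u00e9 \u20ac"; // e-acute and the euro sign
        System.out.println(escape(s, StandardCharsets.UTF_8));
        System.out.println(escape(s, StandardCharsets.ISO_8859_1));
        System.out.println(escape(s, StandardCharsets.US_ASCII));
    }
}
```

With UTF-8 as the target nothing above 127 is escaped; with ISO-8859-1 only 
the euro sign becomes &#8364;; with US-ASCII both characters are escaped, 
which matches what Axis does today regardless of the encoding.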

> Reopen issue: Character entities are escaped too aggressively
> -------------------------------------------------------------
>
>          Key: AXIS-2342
>          URL: http://issues.apache.org/jira/browse/AXIS-2342
>      Project: Apache Axis
>         Type: Bug
>   Components: Serialization/Deserialization
>     Versions: 1.0
>  Environment: Operating System: All
> Platform: All
>     Reporter: Thiago Jung Bauermann
>     Assignee: Axis Developers Mailing List
>
> We are using SOAP to send XML documents from client to server and back. The 
> documents contain a lot of non-ASCII data, which we encode as UTF-8. 
> However, when the documents are retrieved from an Axis server, Axis escapes 
> almost all of our characters into character entities (so &#...;). This means 
> messages become about three times as big as they have to be for 
> 'international' documents, which for us is a large performance problem. I 
> narrowed down the problem to
>   XMLUtils::xmlEncodeString
> which has the code:
>                 if (((int)chars[i]) > 127) {
>                         strBuf.append("&#");
>                         strBuf.append((int)chars[i]);
>                         strBuf.append(";");
> This seems unnecessary to me, as Axis will send all messages in UTF-8 anyway, 
> for which no escaping is necessary (and should the encoding be configurable, 
> I feel this should be escaped elsewhere).
> Is there any reason for this code? I commented it out and it seemed to have 
> no adverse effect on our application (apart from reduced network traffic).
> Tested with 1.0, also looked up in the sources of 1.1-rc2.
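
The "about three times as big" estimate in the quoted report can be checked 
with a small sketch that compares the UTF-8 byte count of a character with 
the size of its numeric entity. The class EntitySizeDemo and its methods are 
hypothetical; entitySize mirrors the per-char test from the quoted snippet 
rather than calling any Axis code.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo, not part of Axis.
public class EntitySizeDemo {

    // Bytes needed when the text is sent as raw UTF-8.
    public static int utf8Size(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes needed when every char above 127 is written as &#NNN;
    // (same condition as the quoted xmlEncodeString excerpt).
    public static int entitySize(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c > 127) {
                sb.append("&#").append((int) c).append(';');
            } else {
                sb.append(c);
            }
        }
        // The result is pure ASCII, so one byte per char.
        return sb.length();
    }

    public static void main(String[] args) {
        String latin = "\u00e9"; // e-acute: 2 bytes in UTF-8, "&#233;" is 6
        String cjk = "\u4e2d";   // U+4E2D: 3 bytes in UTF-8, "&#20013;" is 8
        System.out.println(utf8Size(latin) + " vs " + entitySize(latin));
        System.out.println(utf8Size(cjk) + " vs " + entitySize(cjk));
    }
}
```

For accented Latin characters the entity form is exactly three times the 
UTF-8 size, and for a typical CJK character it is a little under three 
times, so the estimate holds for text that is mostly non-ASCII.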
