[ http://issues.apache.org/jira/browse/AXIS-2025?page=comments#action_12314621 ]

Shankar Unni commented on AXIS-2025:
------------------------------------

> With less jumping up and down and more providing of a patch, I would think 
> that this is a valid bug and be inclined to apply the patch.

I apologize. It's been a bit frustrating dealing with this bug.

Yes, Axis is a SOAP processor, and not an RPC mechanism. And yes, SOAP lays out 
rules for what is a valid message. The question is what happens when you try to 
implement an RPC mechanism on top of SOAP - the bug is in *that* layer, which 
the Axis library also supports.

I don't entirely agree that a String is bad simply because it contains an 
occasional binary character. That would be a totally novel definition of 
String for just about any language. Every language has rules about what it 
allows in Strings, and in every case (C, C++, Java, VB, Pascal), unprintable 
characters are allowed subject to certain (minimally restrictive) rules (e.g. 
usually, no NULs, etc.).

Also, it's not like the entire String is, e.g., the contents of some binary 
file, or something like that (which would, of course, be more appropriately 
handled as an attachment - we do have certain pieces of data that are truly 
"binary", and we do encode them as byte[] for proper handling). Sometimes, it's 
just a String that has an occasional "unprintable" character in it (think: 
"ESC" or "BEL"(^G)).

(In fact, one of the situations where we ran into this was a SOAP interface to 
a systems monitoring layer that read the contents of log files and sent back 
events with the log messages in them. And these log messages often contain ESC 
and ^G. It's absurd to make every String in this API an attachment; and the use 
of these so-called "unprintable" characters is entirely valid in the context - 
they *are* legal things to spit out on screens, and they are legal to put into 
strings).

So yes, it's a larger issue - *if* you're building an RPC layer on top of the 
SOAP infrastructure, and given the restrictions in the SOAP layer's use of 
<xsd:string>, how do you safely transport Strings in general?
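
To make the restriction concrete, here is a minimal sketch of a check for the characters XML 1.0 allows in a document (the "Char" production: tab, LF, CR, and the ranges 0x20-0xD7FF, 0xE000-0xFFFD, 0x10000-0x10FFFF). The class and method names are hypothetical and not part of Axis. As far as I understand the XML 1.0 rules, a numeric character reference to a code point outside this set is not well-formed either, so simple entity escaping would not get such a string through a conforming parser.

```java
// Hypothetical helper, not an Axis API: reports whether a String can be
// carried as-is in XML 1.0 character data.
final class XmlCharCheck {
    private XmlCharCheck() {}

    /** True if the code point matches the XML 1.0 "Char" production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Index of the first char that XML 1.0 does not allow, or -1 if none. */
    static int firstIllegalIndex(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);          // unpaired surrogates come back as-is
            if (!isXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp);       // step over surrogate pairs
        }
        return -1;
    }
}
```

A caller could run firstIllegalIndex() on a String before handing it to the SOAP layer and decide, per value, whether it needs an alternate encoding. ESC (0x1B) and BEL (0x07) both fail this check.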

This is definitely an interoperability issue. But in the meantime, is there 
some other type that can be used to transport such strings? For instance, is it 
possible to use a custom type to map Strings into SOAP (i.e. avoid 
<xsd:string>)? Would such a thing be portable? I could see a custom mapping to 
some base64-type representation for the string body, but both sides need to 
agree that it's a String, and have it be mapped back to a String upon decoding.
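
The base64 idea could look roughly like the sketch below. It assumes both sides agree out of band that the field holds UTF-8 text encoded as base64, and it uses java.util.Base64 from newer JDKs (Java 8 and later) for brevity. The class name is hypothetical; this is not an Axis type mapping, only the encode/decode pair such a mapping would need.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical codec, not part of Axis: wraps an arbitrary Java String in a
// base64 representation that is safe to place inside an XML element.
final class Base64StringCodec {
    private Base64StringCodec() {}

    /** Encodes the String's UTF-8 bytes as base64 text. */
    static String encode(String raw) {
        byte[] utf8 = raw.getBytes(StandardCharsets.UTF_8);
        return Base64.getEncoder().encodeToString(utf8);
    }

    /** Reverses encode(): base64 text back to the original String. */
    static String decode(String wire) {
        byte[] utf8 = Base64.getDecoder().decode(wire);
        return new String(utf8, StandardCharsets.UTF_8);
    }
}
```

Control characters such as ESC and BEL round-trip unchanged, since only letters, digits, '+', '/' and '=' appear on the wire. The cost is about a third more bytes and a payload that is no longer human-readable, and a non-Java peer has to know to decode the field back to a string. Unpaired surrogates would not survive the UTF-8 step.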

Even if ready-made solutions are not available, I would greatly appreciate some 
hints or suggestions on how this can be *reasonably* handled - something that 
doesn't involve trying to hunt down every string in every interface and 
possibly convert them to attachments, or hand-encoding and decoding every 
string everywhere. (I.e., reasonable workarounds / suggestions would be a 
great help!)


> Illegal XML characters in String arguments and return values cause XML 
> exceptions in Axis calls
> -----------------------------------------------------------------------------------------------
>
>          Key: AXIS-2025
>          URL: http://issues.apache.org/jira/browse/AXIS-2025
>      Project: Apache Axis
>         Type: Bug
>   Components: Serialization/Deserialization
>     Versions: 1.2
>  Environment: All (but reproduced on WinXP).
> Axis 1.1 and 1.2
>     Reporter: Shankar Unni
>     Assignee: Venkat Reddy
>  Attachments: Axis1.1badmsgAPI.log, Axis1.1echoAPI.log, Axis1.2badmsgAPI.log, 
> Axis1.2echoAPI.log
>
> Arguments and return values of Java type String are incorrectly handled if 
> they contain non-printing illegal ASCII characters.
> Example 1: bad return values:
> - - - - - - - - - - - - - - -
> E.g. the string 
>   "bad char: " + (char)3 + "."
> Trivial example:
> foo.jws:
>   public class foo {
>     public String badmsg()
>     {
>       return "bad: " + (char)3 + ".";
>     }
>   }
> When calling this method and the server is running on Axis 1.1, it returns 
> XML with the illegal character ASCII "3" in the text:
>    <badmsgReturn xsi:type="xsd:string">bad: ?.</badmsgReturn>  
> This causes an XML parse exception on the client side 
> ("org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x3) was 
> found in the element content of the document.")
> With Axis 1.2, the server doesn't even return a valid response: I get an HTTP 
> 200 OK with an empty content, causing a different XML parse error.
> Example 2: bad parameter values:
> - - - - - - - - - - - - - - - -
> A similar problem exists when passing such a string from the client side.
> If I have a method in foo.jws:
>   public class foo {
>     public String echo(String s)
>     {
>       return s;
>     }
>   }
> Then if I write an ordinary Java client to call this, and pass it a bad 
> string as in the beginning of this post, I get an exception thrown while the 
> call is being composed:
> java.lang.IllegalArgumentException: The char '0x3' in 'bad char: ?.' is not a 
> valid XML character.
> This is somewhat absurd: shouldn't the serialization layer be encoding these 
> illegal XML characters as entity escapes? They're entirely legal in the 
> current locale (US), and normal Java code handles this character quite 
> normally.  Why should it croak when passed by XML/RPC?

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
