On Tuesday, February 19, 2002, at 03:00 AM, James Bates wrote:
As you guys probably know, I'm really trying to get full Unicode support for documents in Xindice, and one of the reasons for switching to XML-RPC or SOAP or whatever is to get around some issues with CORBA on this front.
Now reading the XML-RPC book from O'Reilly, I discovered that String types in XML-RPC can be only ASCII! No other characters than those first 127 are guaranteed by the specification to be supported by XML-RPC software. As O'Reilly point out, this doesn't rule out one or the other implementation actually supporting other characters to a varying degree; it just isn't guaranteed.
Well, that's understandable; there's really no way to guarantee that anyway. There are an awful lot of languages around that don't support Unicode.
Now even if Apache XML-RPC does support the whole thing fully, this kind of thing gives me the willies, and I'd really prefer our interface being defined in some language that GUARANTEED that all our data needs will be addressed... I haven't read the SOAP book in detail yet, but as I understand it, SOAP types are based on XML Schema types, and xsd:string supports any string of Unicode characters, thus hopefully showing SOAP does address the issue correctly.
SOAP should have the same issue, as should any solution that is used from non-Unicode languages. I'm sure the CORBA impls for something like PHP would have the same problem (I'm assuming PHP still doesn't support Unicode but I really don't know). The question is, is it safe within the scope of Unicode languages?
Personally, I'm willing to trade issues in something like PHP, as long as we have fully robust support in Unicode languages. In fact this same problem applies to all of XML. If someone is working in a non-Unicode language we can't be expected to be able to deliver Unicode data to them. At the same time there's no way I'm willing to limit our API support to just languages that support Unicode.
That being said, I really prefer to take as simple an approach as possible to start with, and that is really XML-RPC. So I'm +1 for XML-RPC.
This is true: SOAP seems more complicated to get going. But to guarantee full support for strings in XML-RPC, we might need to do a fair bit of work on top to encode/decode the strings, negotiate encodings etc...
In addition, as far as I can make out, XML-RPC doesn't even support XML fragments as a datatype? How then would we pass search results, or documents across the API? As base64-encoded byte arrays of UTF-8 encoded XML? Or as XML-RPC strings with the necessary escapes of all non-ASCII characters?
Encoded as strings. Is there any reason that our XML has to be considered XML from the perspective of the transport protocol?
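
For what it's worth, here is a rough sketch of the two options raised above, written in Python only because it is short. The function names are made up for illustration and are not part of Xindice or any XML-RPC library. The first wraps the UTF-8 bytes in base64 (which would map to the XML-RPC base64 type); the second keeps the document as a string and replaces every non-ASCII character with a numeric character reference, so the payload stays within ASCII.

```python
import base64


def to_base64_payload(xml_text: str) -> str:
    """Option 1 (hypothetical helper): UTF-8 encode the XML, then base64 it.

    The result contains only ASCII characters.
    """
    return base64.b64encode(xml_text.encode("utf-8")).decode("ascii")


def from_base64_payload(payload: str) -> str:
    """Reverse of option 1: base64 decode, then interpret the bytes as UTF-8."""
    return base64.b64decode(payload).decode("utf-8")


def to_ascii_string(xml_text: str) -> str:
    """Option 2 (hypothetical helper): keep the document as a string, but
    replace each non-ASCII character with a reference such as &#246;.

    The receiving XML parser resolves the references when it parses the
    document, so no explicit decode step is shown here.
    """
    return xml_text.encode("ascii", "xmlcharrefreplace").decode("ascii")


doc = "<name>G\u00f6del</name>"
print(to_base64_payload(doc))
print(to_ascii_string(doc))
```

One caveat with the second option: character references are only legal in character data and attribute values, so a document with non-ASCII element or attribute names would still need the base64 route.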
Kimbro Staken XML Database Software, Consulting and Writing http://www.xmldatabases.org/