Laurent--
Thanks for the quick reply. See comments below:
On Dec 5, 2009, at 4:22 PM, Laurent Sansonetti wrote:
> Hi Steve,
>
> On Dec 5, 2009, at 1:45 PM, s.ross wrote:
>
>> My code receives XML data from a Web Service API call that is in UTF8
>> encoding. This winds up in a string.
>>
>> return_data = NSURLConnection.sendSynchronousRequest(@request,
>> returningResponse: response, error: error)
>> str = NSString.alloc.initWithData(return_data, encoding:
>> NSUTF8StringEncoding)
>> puts "******* response encoding is #{str.encoding}"
>>
>> The result of the puts above is 'MACINTOSH'.
>>
>> I suspect the encoding of the string is not UTF-8, because when I try to
>> parse the XML using REXML, I get:
>>
>> RegexpError: too short multibyte code
>>
>> This occurs way in REXML:
>>
>> /Library/Frameworks/MacRuby.framework/Versions/0.5/usr/lib/ruby/1.9.0/rexml/text.rb:132:in
>> `check:'
>>
>> In any case, my questions are:
>>
>> 1) If anyone has run across this what did you do?
>
> I don't believe REXML works. In any case, I would recommend not using it.
> Since you're already using Cocoa, why not give NSXMLDocument a try?
What I really want to use is Nokogiri. My main issue is that I'm having to
reimplement XML-RPC because the Ruby standard library version is broken over
SSL. Even if it weren't, it has never been thread-safe and thus can't operate
asynchronously. As a result, what I have is an XML document inside an XML-RPC
response envelope. That means I have to parse the document once to get the
contents of the envelope (which are HTML-escaped), then parse those contents to
get an XML document I can work with. I've been using XPath for that, and that's
why I haven't moved over to NSXMLDocument.
Maybe I'm missing a bet here and should shift my strategy. I'll do some more
reading...
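For reference, the two passes I described look roughly like this under MRI,
where REXML does work. This is only a sketch: the method name extract_payload,
the sample envelope, and the XPath expression are made up for illustration and
are not my actual code. Under MacRuby the same two steps would presumably go
through NSXMLDocument and nodesForXPath:error: instead.

```ruby
require 'rexml/document'

# Hypothetical helper: pull the escaped inner document out of an XML-RPC
# response envelope and parse it a second time.
def extract_payload(envelope_xml)
  # Pass 1: parse the envelope and locate the string payload.
  outer = REXML::Document.new(envelope_xml)
  node = REXML::XPath.first(outer, '/methodResponse/params/param/value/string')
  raise ArgumentError, 'no string payload in XML-RPC response' unless node

  # Element#text returns the text with entities such as &lt; already
  # unescaped, so it can be handed straight to the second parse.
  REXML::Document.new(node.text)
end

# Made-up sample response: the inner <order> document is escaped inside
# the <string> value.
envelope = <<~XML
  <?xml version="1.0"?>
  <methodResponse>
    <params>
      <param>
        <value><string>&lt;order id="42"&gt;&lt;item&gt;widget&lt;/item&gt;&lt;/order&gt;</string></value>
      </param>
    </params>
  </methodResponse>
XML

inner = extract_payload(envelope)
puts REXML::XPath.first(inner, '/order/item').text
```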
>> 2) Why might the encoding be MACINTOSH and not UTF-8, as specified in the
>> initWithData method call?
>
> #encoding returns the fastest encoding available for the receiver. You may
> specify UTF-8 during the string creation, but if Cocoa can pick a smaller
> encoding at runtime (like ASCII) it will.
>
> This is different from the Ruby 1.9 semantics and we have a plan to fix that
> in 0.6.
This is kind of surprising behavior. The 1.9 semantics are sufficiently
different from 1.8.x that code that works correctly on 1.8.7 breaks awkwardly on
1.9. OK, I fixed that in my MRI version, but the gotcha above broke my
MacRuby version. Now that I know this, I guess I can deal with it.
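To make the 1.9 semantics I'm relying on concrete, here is a small sketch for
MRI 1.9 or later. It does not apply to MacRuby 0.5, where #encoding behaves as
Laurent describes above. The helper name as_utf8 and the sample bytes are
illustrative assumptions, not part of my code.

```ruby
# Hypothetical helper: tag raw bytes as UTF-8 and fail early if they are
# not valid, rather than letting a regexp blow up deep inside a parser.
def as_utf8(raw)
  # force_encoding only changes the encoding tag; the bytes are untouched.
  str = raw.dup.force_encoding(Encoding::UTF_8)
  raise ArgumentError, 'data is not valid UTF-8' unless str.valid_encoding?
  str
end

# "caf" followed by the two-byte UTF-8 sequence for e-acute.
raw = [0x63, 0x61, 0x66, 0xC3, 0xA9].pack('C*')

puts raw.encoding        # pack('C*') yields a binary (ASCII-8BIT) string
text = as_utf8(raw)
puts text.encoding       # UTF-8: in 1.9 the string keeps the tag it was given
puts text.length         # 4 characters
puts text.bytesize       # 5 bytes
```

A string that ends in the middle of a multibyte sequence, for example
[0x63, 0xC3].pack('C*'), fails the valid_encoding? check and raises here.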
>
>> 3) Suggestions?
>
> See my comment in 1) :)
>
> Laurent
> _______________________________________________
> MacRuby-devel mailing list
> [email protected]
> http://lists.macosforge.org/mailman/listinfo.cgi/macruby-devel