On Oct 31, 2012, at 9:41 AM, Gerriet M. Denkmann wrote:

> I print strings like:
>       NSString *s = @"ร่วมรส";
>       fprintf(stderr, "%s\n", [s UTF8String]);
> and usually it just works.
> 
> But sometimes it does not and I get garbage like:
> ร่ว\340\270\241รส
> 
> Converting these numbers to hex, one gets 0xe0 0xb8 0xa1, which is the 
> UTF-8 encoding of THAI CHARACTER MO MA.
> So why does it not print (as it should):
> ร่วมรส ?
> 
> This is not really reproducible, but happens in about 3% of all lines.
> 
> Known error, or my mistake?

I have a couple of guesses:

* A bug in Terminal.app.  Does it happen in other terminal apps like iTerm (if 
you've tried)?  I assume it is never the case that the octal escape sequences 
get written out to a file if you redirect stderr.  Is that correct?  (That is, I 
don't think your program is actually writing out the octal sequence.  I think 
it's just a display issue.)

* Perhaps there's an issue with the stdio buffer breaking the sequence of UTF-8 
code units for a code point into two separate writes.  In your example code, 
that wouldn't happen because a) you're writing short lines and b) stderr is 
unbuffered by default.  However, if the stream in your real code that exhibits 
the problem is fully buffered (or line buffered and you're writing lines long 
enough to exceed the buffer size), then this might happen.  If the sequence does get 
split into multiple writes, that still shouldn't be a problem unless the 
program receiving the data doesn't handle that case.  So, this would really 
just be a specific explanation of a possible Terminal bug.
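
To illustrate the second guess, here is a minimal sketch in C of how a writer 
could avoid cutting a buffer in the middle of a UTF-8 sequence.  It assumes the 
buffer holds valid UTF-8.  The names is_utf8_continuation and 
utf8_safe_prefix_len are hypothetical helpers made up for this example; they 
are not part of stdio or Cocoa, and nothing here is known to be what your 
program or Terminal actually does.

```c
#include <stddef.h>

/* A UTF-8 continuation byte has the bit pattern 10xxxxxx. */
int is_utf8_continuation(unsigned char byte)
{
    return (byte & 0xC0) == 0x80;
}

/*
 * Hypothetical helper: return the largest length <= max_len at which
 * buf (len bytes, assumed to be valid UTF-8) can be cut without
 * splitting a multi-byte sequence across two writes.
 */
size_t utf8_safe_prefix_len(const char *buf, size_t len, size_t max_len)
{
    size_t cut = max_len < len ? max_len : len;

    /*
     * If the first byte after the cut is a continuation byte, the cut
     * falls inside a sequence, so back up until it lands on a lead byte.
     */
    while (cut > 0 && cut < len &&
           is_utf8_continuation((unsigned char)buf[cut]))
        cut--;

    return cut;
}
```

For the 18-byte string from your example, MO MA occupies bytes 9 through 11 
(0xe0 0xb8 0xa1), so asking for a cut at byte 10 or 11 backs up to 9, while a 
cut at byte 9 or 12 is left alone.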

Regards,
Ken


_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)