> If I say to_utf8() once more on it I expect to get a string containing
> 4 chars:
>
>     "\xC3\x82\xC2\xA9"
>
> and I expect from_utf8() to go the other way.

Ahhh, no wonder we differed.  My brain equates to_utf8(to_utf8())
with to_utf8().

> Your to_utf8() seems to be named after "turn-on-the-utf8-flag" which I

Nope.  It's "convert-bytes-to-utf8-and-turn-on-the-utf8-flag".
Maybe my documentation of the functions is somehow bad and confusing?

> think of as just an internal implementation detail.  What if we change
> the internal representation to be always UTF-32.  Do you want to
> rename your functions then?

No, why should I?  They would still convert bytes they expect not to
be/to be utf8 to something that is/is not utf8.

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
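
For anyone following the thread, the two readings of to_utf8() can be
sketched side by side. This is only an illustrative model in Python, not
the draft Encode code: encode_again(), Flagged and to_utf8_flagged() are
hypothetical names invented here, and the "flag" is modelled as a plain
boolean field. The first function always re-encodes (the quoted reading,
which yields 4 chars on the second call); the second converts only when
the flag is off, so applying it twice equals applying it once.

```python
from dataclasses import dataclass


def encode_again(raw: bytes) -> bytes:
    """Read every byte as a code point 0-255 and write it out as UTF-8.

    This is the "always convert" reading: calling it on data that is
    already UTF-8 encodes it a second time.
    """
    return raw.decode("latin-1").encode("utf-8")


@dataclass(frozen=True)
class Flagged:
    """Hypothetical stand-in for a string that carries a utf8 flag."""
    data: bytes
    is_utf8: bool = False


def to_utf8_flagged(s: Flagged) -> Flagged:
    """Convert bytes to UTF-8 and turn the flag on; no-op if already on."""
    if s.is_utf8:
        return s
    return Flagged(encode_again(s.data), True)


# "Always convert" reading: a second call double-encodes.
once = encode_again(b"\xa9")   # the copyright sign as a single byte
twice = encode_again(once)

# "Convert and flag" reading: a second call changes nothing.
flagged_once = to_utf8_flagged(Flagged(b"\xa9"))
flagged_twice = to_utf8_flagged(flagged_once)

print(once, twice)
print(flagged_once == flagged_twice)
```

Under these assumptions the first reading gives "\xC2\xA9" and then
"\xC3\x82\xC2\xA9", while the second gives "\xC2\xA9" both times.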