It also doesn't appear to choke on UTF-8 input or Unicode code points.
I say backport.
1> unicode:characters_to_binary("é").
<<"é">>
2> unicode:characters_to_binary("åéîøü and sometimes y").
<<"åéîøü and sometimes y">>
3> F = unicode:characters_to_binary("éúü").
<<"éúü">>
4> unicode:characters_to_binary(F).
<<"éúü">>
5> unicode:characters_to_binary([16#CFA8]).
<<"쾨">>
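
For context, here is a minimal sketch of why the old iolist_to_binary/1
call would crash where unicode:characters_to_binary/1 does not. The module
name and the sample message are made up for illustration and are not part
of couch_log.erl; the input is the same code point as in the shell session
above.

```erlang
%% Hypothetical demo module, not part of CouchDB.
-module(log_unicode_sketch).
-export([demo/0]).

demo() ->
    %% A deep character list containing a code point above 255 (U+CFA8).
    Chars = ["[info] ", [16#CFA8], $\n],
    %% iolist_to_binary/1 only accepts bytes (0..255) and binaries,
    %% so this call raises badarg; catch turns that into an 'EXIT' tuple.
    Old = (catch iolist_to_binary(Chars)),
    %% unicode:characters_to_binary/1 treats integers as code points
    %% and encodes them as UTF-8.
    New = unicode:characters_to_binary(Chars),
    {Old, New}.
```

Calling log_unicode_sketch:demo() should return an 'EXIT' tuple carrying
badarg for the old call and a UTF-8 encoded binary for the new one.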
On Wed, May 11, 2011 at 10:54 AM, Filipe David Manana
<[email protected]> wrote:
> On Wed, May 11, 2011 at 3:50 PM, Paul Davis <[email protected]>
> wrote:
>> That should be fine I think. Theoretically the only thing getting
>> through before was non-unicode stuff which should pass through
>> unicode:characters_to_binary just fine, right?
>
> Right (to my understanding at least).
> I also checked that unicode:characters_to_binary/1 accepts IOLists
> (return value of io_lib:format/2):
>
> Eshell V5.8.3 (abort with ^G)
> 1>
> 1> unicode:characters_to_binary("abc").
> <<"abc">>
> 2> unicode:characters_to_binary(["abc"]).
> <<"abc">>
> 3> unicode:characters_to_binary(["ab"m $c]).
> * 1: syntax error before: m
> 3> unicode:characters_to_binary(["ab", $c]).
> <<"abc">>
> 4> unicode:characters_to_binary([["a"], <<"b">>, $c]).
> <<"abc">>
> 5> unicode:characters_to_binary([[["a"], <<"b">>, $c]]).
> <<"abc">>
> 6> unicode:characters_to_binary([[["a"], <<"b">>, $c, []]]).
> <<"abc">>
> 7> unicode:characters_to_binary([[["a"], <<"b">>, $c, [[<<>>]]]]).
> <<"abc">>
> 8> unicode:characters_to_binary([[["a"], [<<"b">>], $c, [[<<>>]]]]).
> <<"abc">>
> 9> unicode:characters_to_binary([[["a"], [<<"b">>], [$c], [[<<>>]]]]).
> <<"abc">>
> 10> unicode:characters_to_binary([[["a"], [<<"b">>], [], [$c], [[<<>>]]]]).
>
>
>>
>> On Wed, May 11, 2011 at 10:27 AM, Filipe David Manana
>> <[email protected]> wrote:
>>> If no one has an objection, I would apply this to 1.1.x as well, since
>>> it suffers the same issue.
>>>
>>> On Wed, May 11, 2011 at 3:26 PM, <[email protected]> wrote:
>>>> Author: fdmanana
>>>> Date: Wed May 11 14:26:21 2011
>>>> New Revision: 1101896
>>>>
>>>> URL: http://svn.apache.org/viewvc?rev=1101896&view=rev
>>>> Log:
>>>> Fix logger crash when messages have unicode characters
>>>>
>>>> This closes COUCHDB-1158. Thanks Dale Harvey.
>>>>
>>>> Modified:
>>>> couchdb/trunk/src/couchdb/couch_log.erl
>>>>
>>>> Modified: couchdb/trunk/src/couchdb/couch_log.erl
>>>> URL:
>>>> http://svn.apache.org/viewvc/couchdb/trunk/src/couchdb/couch_log.erl?rev=1101896&r1=1101895&r2=1101896&view=diff
>>>> ==============================================================================
>>>> --- couchdb/trunk/src/couchdb/couch_log.erl (original)
>>>> +++ couchdb/trunk/src/couchdb/couch_log.erl Wed May 11 14:26:21 2011
>>>> @@ -167,10 +167,10 @@ log(#state{fd = Fd}, ConsoleMsg, FileMsg
>>>>      ok = io:put_chars(Fd, FileMsg).
>>>>
>>>>  get_log_messages(Pid, Level, Format, Args) ->
>>>> -    ConsoleMsg = io_lib:format(
>>>> -        "[~s] [~p] " ++ Format ++ "~n", [Level, Pid | Args]),
>>>> +    ConsoleMsg = unicode:characters_to_binary(io_lib:format(
>>>> +        "[~s] [~p] " ++ Format ++ "~n", [Level, Pid | Args])),
>>>>      FileMsg = ["[", httpd_util:rfc1123_date(), "] ", ConsoleMsg],
>>>> -    {iolist_to_binary(ConsoleMsg), iolist_to_binary(FileMsg)}.
>>>> +    {ConsoleMsg, iolist_to_binary(FileMsg)}.
>>>>
>>>>  read(Bytes, Offset) ->
>>>>      LogFileName = couch_config:get("log", "file"),
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Filipe David Manana,
>>> [email protected], [email protected]
>>>
>>> "Reasonable men adapt themselves to the world.
>>> Unreasonable men adapt the world to themselves.
>>> That's why all progress depends on unreasonable men."
>>>
>>