Re: horrible utf-8 performace in wc

2008-05-07 Thread Pádraig Brady
Jan Engelhardt wrote: https://bugzilla.novell.com/show_bug.cgi?id=381873 Forwarding this because it is a GNU issue, not specifically a Novell one. I reproduced this myself with the latest coreutils from git (BTW: You might want to repack that repo, counting objects during the clone was

Re: coreutils-6.11 released

2008-05-07 Thread Christophe LYON
On 25.04.2008 21:04, Jim Meyering wrote: Christophe LYON wrote: If I manually add -lm, I get: .../bin/../lib/gcc/sparc-sun-solaris2.8/4.1.0/crt1.o:(.plt+0x0): multiple definition of `_PROCEDURE_LINKAGE_TABLE_' /usr/lib/libm.so:(.plt+0x0): first defined here Is this a

Re: coreutils-6.11 released

2008-05-07 Thread Jim Meyering
Christophe LYON wrote: On 25.04.2008 21:04, Jim Meyering wrote: Christophe LYON wrote: If I manually add -lm, I get: .../bin/../lib/gcc/sparc-sun-solaris2.8/4.1.0/crt1.o:(.plt+0x0): multiple definition of `_PROCEDURE_LINKAGE_TABLE_'

Re: horrible utf-8 performace in wc

2008-05-07 Thread Bo Borgerson
Pádraig Brady wrote: canonically équivalent canonically équivalent Pádraig. p.s. I notice that gnome-terminal still doesn't handle combining characters correctly, and my mail client thunderbird is putting the accent on the q rather than the e, sigh. They both render correctly here

Re: horrible utf-8 performace in wc

2008-05-07 Thread Jan Engelhardt
On Wednesday 2008-05-07 13:11, Pádraig Brady wrote: Now that is a _lot_ of extra time. libiconv could probably be made more efficient. I've never actually looked at it. However wc calls mbrtowc() for each multibyte character. It would probably be a lot more efficient to use mbstowcs() to convert
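
For readers following the mbrtowc()/mbstowcs() point above, here is a minimal sketch of the two ways of counting characters. It is not the wc source and not a patch from this thread: the function names count_chars_mbrtowc and count_chars_mbstowcs are hypothetical, the input is assumed to be valid in the current locale, and the bulk version assumes a NUL-terminated buffer, which real wc input is not guaranteed to be.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count characters with one mbrtowc() call per
   multibyte character, the per-character pattern described above.
   Returns (size_t)-1 on an invalid or incomplete sequence. */
static size_t count_chars_mbrtowc(const char *s, size_t len)
{
    mbstate_t state;
    size_t count = 0;

    assert(s != NULL);
    memset(&state, 0, sizeof state);

    while (len > 0) {
        size_t n = mbrtowc(NULL, s, len, &state);
        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;
        if (n == 0)
            n = 1;              /* an embedded NUL occupies one byte */
        s += n;
        len -= n;
        count++;
    }
    return count;
}

/* Hypothetical helper: convert the whole NUL-terminated string with a
   single mbstowcs() call and use its return value as the count.
   Returns (size_t)-1 on an invalid sequence or allocation failure. */
static size_t count_chars_mbstowcs(const char *s)
{
    size_t cap;
    size_t n;
    wchar_t *buf;

    assert(s != NULL);
    cap = strlen(s) + 1;        /* never more wide chars than bytes */
    buf = malloc(cap * sizeof *buf);
    if (buf == NULL)
        return (size_t)-1;

    n = mbstowcs(buf, s, cap);
    free(buf);
    return n;
}
```

Both functions depend on the LC_CTYPE locale in effect, so a caller would normally run setlocale(LC_ALL, "") first. Whether the bulk call is actually faster is the open question in this thread; the sketch only shows the difference in call pattern.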

Re: horrible utf-8 performace in wc

2008-05-07 Thread Jim Meyering
Pádraig Brady wrote: Jan Engelhardt wrote: https://bugzilla.novell.com/show_bug.cgi?id=381873 Forwarding this because it is a GNU issue, not specifically a Novell one. I reproduced this myself with the latest coreutils from git (BTW: You might want to repack that repo,

Re: horrible utf-8 performace in wc

2008-05-07 Thread Bo Borgerson
Jim Meyering wrote: Bo Borgerson wrote: I may be misinterpreting your patch, but it seems to me that decrementing count for zero-width characters could potentially lead to confusion. Not all zero-width characters are combining characters, right? It looks ok to me, since

Re: horrible utf-8 performace in wc

2008-05-07 Thread Pádraig Brady
Bo Borgerson wrote: Pádraig Brady wrote: canonically équivalent canonically équivalent Pádraig. p.s. I notice that gnome-terminal still doesn't handle combining characters correctly, and my mail client thunderbird is putting the accent on the q rather than the e, sigh. They both

Re: horrible utf-8 performace in wc

2008-05-07 Thread Pádraig Brady
Bo Borgerson wrote: Jim Meyering wrote: Bo Borgerson wrote: I may be misinterpreting your patch, but it seems to me that decrementing count for zero-width characters could potentially lead to confusion. Not all zero-width characters are combining characters, right? It