Re: bug in busybox sed with non-ascii chars

2014-05-07 Thread Rich Felker
On Mon, May 05, 2014 at 08:08:32PM +0100, Sam Liddicott wrote:
> One of the advantages of utf-8 encoding was that it was easy to re-sync
> after an invalid sequence.
>
> It's a bit of a waste to then not do that. Minus points for musl.

An application can resync, although the C multibyte interface

Re: bug in busybox sed with non-ascii chars

2014-05-05 Thread Sam Liddicott
One of the advantages of utf-8 encoding was that it was easy to re-sync after an invalid sequence.

It's a bit of a waste to then not do that. Minus points for musl.

Can you not run sed with LANG=C or LANG=POSIX?

Sam

On 4 May 2014 15:57, "Rich Felker" wrote:
> On Sun, May 04, 2014 at 04:44:10PM

Re: bug in busybox sed with non-ascii chars

2014-05-04 Thread Rich Felker
On Sun, May 04, 2014 at 04:44:10PM +0200, Denys Vlasenko wrote:
> On Sat, May 3, 2014 at 5:07 PM, Rich Felker wrote:
> >> Lets refuse to find end of line if there is a non UTF-8 sequence inside
> >> that line?
> >> Sounds wrong to me...
> >
> > sed (also regcomp and regexec) requires text input.

Re: bug in busybox sed with non-ascii chars

2014-05-04 Thread Denys Vlasenko
On Sat, May 3, 2014 at 5:07 PM, Rich Felker wrote:
>> Lets refuse to find end of line if there is a non UTF-8 sequence inside that
>> line?
>> Sounds wrong to me...
>
> sed (also regcomp and regexec) requires text input. Byte streams with
> illegal sequences are not text.

Actually since the regex

Re: bug in busybox sed with non-ascii chars

2014-05-03 Thread Rich Felker
On Sat, May 03, 2014 at 03:17:49PM +0200, Denys Vlasenko wrote:
> On Saturday 03 May 2014 05:10, Rich Felker wrote:
> > On Wed, Apr 30, 2014 at 10:31:00AM +0200, Natanael Copa wrote:
> > > Hi,
> > >
> > > I came across a bug (or feature) in busybox sed when trying to build
> > > firefox-29.
> > >

Re: bug in busybox sed with non-ascii chars

2014-05-03 Thread Denys Vlasenko
On Saturday 03 May 2014 05:10, Rich Felker wrote:
> On Wed, Apr 30, 2014 at 10:31:00AM +0200, Natanael Copa wrote:
> > Hi,
> >
> > I came across a bug (or feature) in busybox sed when trying to build
> > firefox-29.
> >
> > Testcase based on what firefox's configure scripts does:
> >
> > ASCII=

Re: bug in busybox sed with non-ascii chars

2014-05-02 Thread Rich Felker
On Wed, Apr 30, 2014 at 10:31:00AM +0200, Natanael Copa wrote:
> Hi,
>
> I came across a bug (or feature) in busybox sed when trying to build
> firefox-29.
>
> Testcase based on what firefox's configure scripts does:
>
> ASCII='AA'
> NONASCII=$'\246\246'
>
> echo -e "($ASCII)\n($NONASCII)" | b

Re: bug in busybox sed with non-ascii chars

2014-05-01 Thread Natanael Copa
On Fri, 2 May 2014 07:34:57 +0200 Denys Vlasenko wrote:
> On Wednesday 30 April 2014 10:31, Natanael Copa wrote:
> > Hi,
> >
> > I came across a bug (or feature) in busybox sed when trying to build
> > firefox-29.
> >
> > Testcase based on what firefox's configure scripts does:
> >
> > ASCII=

Re: bug in busybox sed with non-ascii chars

2014-05-01 Thread Denys Vlasenko
On Wednesday 30 April 2014 10:31, Natanael Copa wrote:
> Hi,
>
> I came across a bug (or feature) in busybox sed when trying to build
> firefox-29.
>
> Testcase based on what firefox's configure scripts does:
>
> ASCII='AA'
> NONASCII=$'\246\246'
>
> echo -e "($ASCII)\n($NONASCII)" | busybox s

bug in busybox sed with non-ascii chars

2014-04-30 Thread Natanael Copa
Hi,

I came across a bug (or feature) in busybox sed when trying to build
firefox-29.

Testcase based on what firefox's configure scripts does:

ASCII='AA'
NONASCII=$'\246\246'

echo -e "($ASCII)\n($NONASCII)" | busybox sed 's/$/,/'

Expected result is a comma (,) after both lines. Actual result i
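
For anyone wanting to try the LANG=C suggestion raised in the thread, here is a minimal sketch that feeds the reported test input to sed under the C locale. It assumes a POSIX shell and whichever sed is first on PATH (not specifically busybox), and the function names are hypothetical. Whether the C locale really treats input as raw bytes depends on the libc in use, so this may not change the behaviour on every system.

```sh
#!/bin/sh
# Test input from the report: one ASCII line and one line holding two
# 0xA6 bytes (octal 246), which do not form a valid UTF-8 sequence.
# printf replaces `echo -e` and $'...' so the sketch runs in any POSIX shell.
make_input() {    # hypothetical helper name
    printf '(AA)\n(\246\246)\n'
}

# LC_ALL takes precedence over LANG and the other LC_* variables, so it
# forces the C locale for this one sed invocation only.
run_in_c_locale() {    # hypothetical helper name
    make_input | LC_ALL=C sed 's/$/,/'
}

# Dump the result byte by byte so the non-ASCII bytes stay visible.
run_in_c_locale | od -c
```

If sed handles the second line as the report expects, the dump shows a comma before each newline.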