Re: locate.mklocatedb broken with LC_ALL!=C
On Fri, Feb 15, 2019 at 07:58:48PM +0100, Giovanni Bechis wrote:
> ping...
> any possible issue with millert@ diff ?
>  Giovanni
>
> On Sun, Oct 07, 2018 at 08:35:28PM -0600, Todd C. Miller wrote:
> > On Sun, 07 Oct 2018 17:08:06 +0200, Marc Espie wrote:
> >
> > > Specifically, the only part that cares about
> > > locale is sort, and it's definitely correct in fixing
> > > it's not run on an utf-8 file.
> >
> > Agreed.  How about the following?
> >
> >  - todd
> >
> > Index: usr.bin/locate/locate/mklocatedb.sh
> > ===
> > RCS file: /cvs/src/usr.bin/locate/locate/mklocatedb.sh,v
> > retrieving revision 1.13
> > diff -u -p -u -r1.13 mklocatedb.sh
> > --- usr.bin/locate/locate/mklocatedb.sh  18 Mar 2007 20:13:49 -  1.13
> > +++ usr.bin/locate/locate/mklocatedb.sh  8 Oct 2018 02:34:52 -
> > @@ -66,7 +66,8 @@ filelist=`mktemp ${TMPDIR=/tmp}/_filelis
> >  }
> >  trap 'rm -f $bigrams $filelist' 0 1 2 3 5 10 15
> >
> > -if $sortcmd $sortopt > $filelist; then
> > +# Run sort in the C locale or binary data may be interpreted as UTF-8
> > +if LC_ALL=C $sortcmd $sortopt > $filelist; then
> >          $bigram < $filelist | $sort -nr |
> >          awk -Ft 'BEGIN { ORS = "" } NR <= 128 { print $2 }' > $bigrams &&
> >          $code $bigrams < $filelist

Oh, I thought it had been committed ages ago
Re: locate.mklocatedb broken with LC_ALL!=C
ping...
any possible issue with millert@ diff ?
 Giovanni

On Sun, Oct 07, 2018 at 08:35:28PM -0600, Todd C. Miller wrote:
> On Sun, 07 Oct 2018 17:08:06 +0200, Marc Espie wrote:
>
> > Specifically, the only part that cares about
> > locale is sort, and it's definitely correct in fixing
> > it's not run on an utf-8 file.
>
> Agreed.  How about the following?
>
>  - todd
>
> Index: usr.bin/locate/locate/mklocatedb.sh
> ===
> RCS file: /cvs/src/usr.bin/locate/locate/mklocatedb.sh,v
> retrieving revision 1.13
> diff -u -p -u -r1.13 mklocatedb.sh
> --- usr.bin/locate/locate/mklocatedb.sh  18 Mar 2007 20:13:49 -  1.13
> +++ usr.bin/locate/locate/mklocatedb.sh  8 Oct 2018 02:34:52 -
> @@ -66,7 +66,8 @@ filelist=`mktemp ${TMPDIR=/tmp}/_filelis
>  }
>  trap 'rm -f $bigrams $filelist' 0 1 2 3 5 10 15
>
> -if $sortcmd $sortopt > $filelist; then
> +# Run sort in the C locale or binary data may be interpreted as UTF-8
> +if LC_ALL=C $sortcmd $sortopt > $filelist; then
>          $bigram < $filelist | $sort -nr |
>          awk -Ft 'BEGIN { ORS = "" } NR <= 128 { print $2 }' > $bigrams &&
>          $code $bigrams < $filelist
Re: locate.mklocatedb broken with LC_ALL!=C
On Sun, Oct 07, 2018 at 08:35:28PM -0600, Todd C. Miller wrote:
> On Sun, 07 Oct 2018 17:08:06 +0200, Marc Espie wrote:
>
> > Specifically, the only part that cares about
> > locale is sort, and it's definitely correct in fixing
> > it's not run on an utf-8 file.
>
> Agreed.  How about the following?
>
works for me, ok giovanni@
 Cheers & Thanks
  Giovanni

>  - todd
>
> Index: usr.bin/locate/locate/mklocatedb.sh
> ===
> RCS file: /cvs/src/usr.bin/locate/locate/mklocatedb.sh,v
> retrieving revision 1.13
> diff -u -p -u -r1.13 mklocatedb.sh
> --- usr.bin/locate/locate/mklocatedb.sh  18 Mar 2007 20:13:49 -  1.13
> +++ usr.bin/locate/locate/mklocatedb.sh  8 Oct 2018 02:34:52 -
> @@ -66,7 +66,8 @@ filelist=`mktemp ${TMPDIR=/tmp}/_filelis
>  }
>  trap 'rm -f $bigrams $filelist' 0 1 2 3 5 10 15
>
> -if $sortcmd $sortopt > $filelist; then
> +# Run sort in the C locale or binary data may be interpreted as UTF-8
> +if LC_ALL=C $sortcmd $sortopt > $filelist; then
>          $bigram < $filelist | $sort -nr |
>          awk -Ft 'BEGIN { ORS = "" } NR <= 128 { print $2 }' > $bigrams &&
>          $code $bigrams < $filelist
Re: locate.mklocatedb broken with LC_ALL!=C
On Sun, 07 Oct 2018 17:08:06 +0200, Marc Espie wrote:

> Specifically, the only part that cares about
> locale is sort, and it's definitely correct in fixing
> it's not run on an utf-8 file.

Agreed.  How about the following?

 - todd

Index: usr.bin/locate/locate/mklocatedb.sh
===
RCS file: /cvs/src/usr.bin/locate/locate/mklocatedb.sh,v
retrieving revision 1.13
diff -u -p -u -r1.13 mklocatedb.sh
--- usr.bin/locate/locate/mklocatedb.sh  18 Mar 2007 20:13:49 -  1.13
+++ usr.bin/locate/locate/mklocatedb.sh  8 Oct 2018 02:34:52 -
@@ -66,7 +66,8 @@ filelist=`mktemp ${TMPDIR=/tmp}/_filelis
 }
 trap 'rm -f $bigrams $filelist' 0 1 2 3 5 10 15

-if $sortcmd $sortopt > $filelist; then
+# Run sort in the C locale or binary data may be interpreted as UTF-8
+if LC_ALL=C $sortcmd $sortopt > $filelist; then
         $bigram < $filelist | $sort -nr |
         awk -Ft 'BEGIN { ORS = "" } NR <= 128 { print $2 }' > $bigrams &&
         $code $bigrams < $filelist
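For readers following the thread, the idea behind the diff can be tried in isolation. The following is only a minimal sketch, not code from mklocatedb.sh: the helper name sort_bytes and the two-line sample input are made up for illustration. It relies on the standard shell rule that a variable assignment placed in front of a command applies to that one command only, so sort compares raw bytes while the caller's locale settings stay as they were.

```shell
#!/bin/sh
# Sketch only: sort_bytes is a hypothetical helper, not part of mklocatedb.sh.
# The LC_ALL=C prefix applies to this single sort invocation, so sort
# compares raw bytes and never tries to decode its input as UTF-8.
sort_bytes() {
	LC_ALL=C sort "$@"
}

# Two made-up "paths"; the first contains byte 0xff (octal 377),
# which can never appear in valid UTF-8.
printf 'b\377\na\n' | sort_bytes | od -An -tx1
```

The same prefix form is what the diff puts in front of $sortcmd; nothing else in the script needs to change its locale.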
Re: locate.mklocatedb broken with LC_ALL!=C
On Sun, Oct 07, 2018 at 09:43:05AM +0200, Giovanni Bechis wrote:
> Hi,
> after setting LC_ALL=en_US.UTF-8 on my env locate.mklocatedb seems broken,
> resetting LC_ALL=C is a workaround.
>
> $ export LC_ALL=en_US.UTF-8
> $ doas /usr/libexec/locate.updatedb
> sort: Illegal byte sequence
> locate.mklocatedb: cannot build locate database
> $ export LC_ALL=C
> $ doas /usr/libexec/locate.updatedb
>
> Should we run weekly(8) with LC_ALL=C to be sure that locate.updatedb runs
> correctly ?
>
>  Cheers
>   Giovanni

Fixing locate.mklocatedb looks much better.

Specifically, the only part that cares about
locale is sort, and it's definitely correct in fixing
it's not run on an utf-8 file.
Re: locate.mklocatedb broken with LC_ALL!=C
On Sun, Oct 07, 2018 at 10:06:35AM +0200, Stefan Sperling wrote:
> On Sun, Oct 07, 2018 at 09:43:05AM +0200, Giovanni Bechis wrote:
> > Hi,
> > after setting LC_ALL=en_US.UTF-8 on my env locate.mklocatedb seems broken,
> > resetting LC_ALL=C is a workaround.
> >
> > $ export LC_ALL=en_US.UTF-8
> > $ doas /usr/libexec/locate.updatedb
> > sort: Illegal byte sequence
> > locate.mklocatedb: cannot build locate database
> > $ export LC_ALL=C
> > $ doas /usr/libexec/locate.updatedb
> >
> > Should we run weekly(8) with LC_ALL=C to be sure that locate.updatedb runs
> > correctly ?
>
> Where did you set the LC_ALL variable? The UTF-8 locale should only be enabled
> on a per-user basis (e.g. in ~/.profile), not for the entire system.
> There are many programs in the base system which don't [yet] support UTF-8.

Thinking about it some more: I ran locate.updatedb as my own user (with LC_ALL
set in .profile) so that I would have a locate database as soon as possible
after a new install.
Maybe add a note to locate.updatedb(8)?
 Cheers
  Giovanni
Re: locate.mklocatedb broken with LC_ALL!=C
On Sun, Oct 07, 2018 at 09:43:05AM +0200, Giovanni Bechis wrote:
> Hi,
> after setting LC_ALL=en_US.UTF-8 on my env locate.mklocatedb seems broken,
> resetting LC_ALL=C is a workaround.
>
> $ export LC_ALL=en_US.UTF-8
> $ doas /usr/libexec/locate.updatedb
> sort: Illegal byte sequence
> locate.mklocatedb: cannot build locate database
> $ export LC_ALL=C
> $ doas /usr/libexec/locate.updatedb
>
> Should we run weekly(8) with LC_ALL=C to be sure that locate.updatedb runs
> correctly ?

Where did you set the LC_ALL variable? The UTF-8 locale should only be enabled
on a per-user basis (e.g. in ~/.profile), not for the entire system.
There are many programs in the base system which don't [yet] support UTF-8.
locate.mklocatedb broken with LC_ALL!=C
Hi,
after setting LC_ALL=en_US.UTF-8 on my env locate.mklocatedb seems broken,
resetting LC_ALL=C is a workaround.

$ export LC_ALL=en_US.UTF-8
$ doas /usr/libexec/locate.updatedb
sort: Illegal byte sequence
locate.mklocatedb: cannot build locate database
$ export LC_ALL=C
$ doas /usr/libexec/locate.updatedb

Should we run weekly(8) with LC_ALL=C to be sure that locate.updatedb runs
correctly ?

 Cheers
  Giovanni