On Sun, 07 Oct 2018 17:08:06 +0200, Marc Espie wrote:

> Specifically, the only part that cares about
> locale is sort, and it's definitely correct in fixing
> it's not run on an utf-8 file.

Agreed.  How about the following?

 - todd

Index: usr.bin/locate/locate/mklocatedb.sh
===================================================================
RCS file: /cvs/src/usr.bin/locate/locate/mklocatedb.sh,v
retrieving revision 1.13
diff -u -p -u -r1.13 mklocatedb.sh
--- usr.bin/locate/locate/mklocatedb.sh 18 Mar 2007 20:13:49 -0000      1.13
+++ usr.bin/locate/locate/mklocatedb.sh 8 Oct 2018 02:34:52 -0000
@@ -66,7 +66,8 @@ filelist=`mktemp ${TMPDIR=/tmp}/_filelis
 }
 trap 'rm -f $bigrams $filelist' 0 1 2 3 5 10 15
 
-if $sortcmd $sortopt > $filelist; then
+# Run sort in the C locale or binary data may be interpreted as UTF-8
+if LC_ALL=C $sortcmd $sortopt > $filelist; then
         $bigram < $filelist | $sort -nr | 
                 awk -Ft 'BEGIN { ORS = "" } NR <= 128 { print $2 }' > $bigrams &&
         $code $bigrams < $filelist 
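
For illustration only (not part of the patch): a minimal sketch of what the LC_ALL=C prefix does. The variable names and the sample input are made up for this example. The assignment applies only to the sort invocation, so the rest of the script keeps whatever locale it inherited, and sort compares raw bytes instead of decoding them as UTF-8.

```shell
# Remember the caller's LC_ALL (or "unset") to show it is left untouched.
before=${LC_ALL-unset}

# The VAR=value prefix sets LC_ALL in the environment of this one sort
# command only. In the C locale, sort orders by byte value, so
# 'B' (0x42) comes before 'a' (0x61) and 'b' (0x62).
sorted=$(printf 'b\nB\na\n' | LC_ALL=C sort)

after=${LC_ALL-unset}

printf '%s\n' "$sorted"
```

Since the comparison is bytewise, file names containing bytes that are not valid UTF-8 should be ordered by their raw values instead of being rejected or misordered by a multibyte collation.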
