On 26/11/10 18:01, DJ Lucas wrote:
> Sent to bug-coreutils too (no bug id currently AFAICT).
>
> Bug only affects multi-byte locales. Take the following samples:
>
> bash-4.1# zcat cracklib-words-20080507.gz | sort -u --debug > file && echo $?
> sort: using `en_US.UTF-8' sorting rules
> Segmentation fault
> bash-4.1# echo $?
> 139
> bash-4.1#
>
> bash-4.1# zcat cracklib-words-20080507.gz | sort -u --parallel=1 --debug > file && echo $?
> sort: using `en_US.UTF-8' sorting rules
> 0
> bash-4.1#
>
> bash-4.1# zcat cracklib-words-20080507.gz | LANG=C sort -u --debug > file && echo $?
> sort: using simple byte comparison
> 0
> bash-4.1#
>
> bash-4.1# gzip -d cracklib-words-20080507.gz
> bash-4.1# sort -u --debug cracklib-words-20080507 > file && echo $?
> sort: using `en_US.UTF-8' sorting rules
> 0
> bash-4.1#
>
> In the interim, for a quick and dirty hack, I've added an LC_COLLATE
> comparison and set nthreads to 1 in multibyte locales.
>
> Probably well known, but the test file that I used is available from:
> http://downloads.sourceforge.net/cracklib/cracklib-words-20080507.gz
>
> -- DJ Lucas
>

100% reproducible on a 32-bit F12 box (the first one I tried):

# zcat cracklib-words-20080507.gz | ./coreutils-8.7/src/sort -u --parallel=1 > /dev/null
# zcat cracklib-words-20080507.gz | ./coreutils-8.7/src/sort -u --parallel=2 > /dev/null
Segmentation fault (core dumped)

#0  0x00c4aecf in __strlen_ia32 () from /lib/libc.so.6
#1  0x00c4ecf1 in strcoll_l () from /lib/libc.so.6
#2  0x00c4a8b1 in strcoll () from /lib/libc.so.6
#3  0x0805682d in strcoll_loop (s1=<value optimized out>, s1size=<value optimized out>, s2=<value optimized out>, s2size=1852142453) at memcoll.c:39
#4  memcoll0 (s1=<value optimized out>, s1size=<value optimized out>, s2=<value optimized out>, s2size=1852142453) at memcoll.c:110
#5  0x08054263 in xmemcoll0 (s1=0xb5457008 "lauters", s1size=8, s2=0x6f626f79 <Address 0x6f626f79 out of bounds>, s2size=1852142453) at xmemcoll.c:71
#6  0x0804e091 in compare (a=0xb68cb558, b=0xb5c3e9b8) at sort.c:2653
#7  0x0804f320 in write_unique (line=0xb68cb558, tfp=0x9f2ebe0, temp_output=0x9f2e9c0 "/tmp/sortqd5BvL") at sort.c:3233
#8  0x0804f5b0 in mergelines_node (node=0xbfd06c0c, total_lines=822189, tfp=0x9f2ebe0, temp_output=0x9f2e9c0 "/tmp/sortqd5BvL") at sort.c:3279
#9  0x0804f8e4 in merge_loop (queue=0xbfd06ca4, total_lines=822189, tfp=0x9f2ebe0, temp_output=0x9f2e9c0 "/tmp/sortqd5BvL") at sort.c:3367
#10 0x0804fc52 in sortlines (lines=0xb7557038, dest=0xb68cb568, nthreads=1, total_lines=822189, parent=0xbfd06c0c, lo_child=true, merge_queue=0xbfd06ca4, tfp=0x9f2ebe0, temp_output=0x9f2e9c0 "/tmp/sortqd5BvL") at sort.c:3481
#11 0x0804f9a4 in sortlines_thread (data=0xbfd06be4) at sort.c:3404
#12 0x00d58ab5 in start_thread () from /lib/libpthread.so.0
#13 0x00caef1e in clone () from /lib/libc.so.6
(gdb) quit

Looks like ASCII data is being passed on the stack for the pointer and
the size in frame #5: s2=0x6f626f79 reads as "oboy" and
s2size=1852142453 (0x6e657375) reads as "nesu".

Hmm, seems like multiple threads are racing to update the
static "saved" variable in write_unique()?
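
To illustrate the suspicion above, here is a minimal sketch of the general hazard, not the actual sort.c code and not a proposed patch. A function-local static that remembers "the previous line" is shared by every caller, so two merge threads would see each other's value. One possible way around that is to have each caller own the state. The names unique_state and unique_should_write are hypothetical, and strcmp stands in for sort's locale-aware compare().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-caller state.  It plays the role that a
   function-local static "saved" pointer would otherwise play,
   but each caller (e.g. each thread) owns its own copy.  */
struct unique_state
{
  const char *saved;   /* last line accepted, or NULL if none yet */
};

/* Return true if LINE differs from the last line accepted through ST,
   and remember it; return false if it is a duplicate.
   Assumption: the string LINE points to outlives its use as
   ST->saved, because only the pointer is stored, not a copy.  */
bool
unique_should_write (struct unique_state *st, const char *line)
{
  assert (st != NULL && line != NULL);

  if (st->saved != NULL && strcmp (st->saved, line) == 0)
    return false;

  st->saved = line;
  return true;
}
```

With one unique_state per caller, interleaved calls from two callers no longer disturb each other's notion of the previous line. If all threads must deduplicate against a single shared output stream, separate state is not enough and the shared state would need a lock instead; which of the two applies to write_unique() is not established by the backtrace alone.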