Here are the values from another sort that has been running for over 12 hours now. This time the second argument of sortlines() (the number of threads) looks fine in all three calls (8, 4 and 2), and there are no zombie threads.

>: pstack 20632
20632: /usr/local/abacus/etsort/sort -tn -S 295063 --batch-size=100 -T /disk/
----------------- lwp# 1 / thread# 1 --------------------
 ffffffff7eadc810 lwp_wait (f2, ffffffff7fffea9c)
 ffffffff7ead4d74 _thrp_join (f2, 0, 0, 1, ffffffff7fffeca0, ffffffff7fffea9c) + 38
 000000010000f2f4 sortlines (110137e90, 8, 7194a, 11015bfe0, ffffffff7fffeca0, 100136240) + 174
 0000000100010144 sort (100137cd0, 1, ffffffff7ffff660, 8, ffffffff7fffeeac, ffffffff7ed00200) + 2f0
 0000000100012bf4 main (13, ffffffff7ffff1f8, ffffffff7ffff298, 100136ca8, 100000000, ffffffff7ed00200) + 21cc
 0000000100004ca4 _start (0, 0, 0, 0, 0, 0) + 7c
----------------- lwp# 242 / thread# 242 --------------------
 ffffffff7eadc810 lwp_wait (f4, ffffffff7e1fbd2c)
 ffffffff7ead4d74 _thrp_join (f4, 0, 0, 1, ffffffff7fffeca0, ffffffff7e1fbd2c) + 38
 000000010000f2f4 sortlines (110137e90, 4, 7194a, 11015c050, ffffffff7fffeca0, 100136240) + 174
 000000010000f168 sortlines_thread (ffffffff7fffeb60, 1fc000, 0, 0, 10000f104, 0) + 64
 ffffffff7ead8778 _lwp_start (0, 0, 0, 0, 0, 0)
----------------- lwp# 244 / thread# 244 --------------------
 ffffffff7ead8818 lwp_park (0, 0, 0)
 000000010000e710 lock_node (11015c360, 10f691fb0, ffffffff7ec4a300, ffffffff7fffecac, ffffffff7ed00a00, 0) + 14
 000000010000efbc queue_check_insert_parent (ffffffff7fffeca0, 11015c3d0, 100136240, 1101597dd, ffffffff7ed00a00, 1c00) + 2c
 000000010000f0e8 merge_loop (ffffffff7fffeca0, 7194a, 100136240, 1101597dd, ffffffff7eacff0c, 3) + 90
 000000010000f43c sortlines (110137e90, 2, 7194a, 11015c0c0, ffffffff7fffeca0, 100136240) + 2bc
 000000010000f168 sortlines_thread (ffffffff7e1fbdf0, 1fc000, 0, 0, 10000f104, 0) + 64
 ffffffff7ead8778 _lwp_start (0, 0, 0, 0, 0, 0)

>: truss -rall -wall -f -p 20632
20632/1: lwp_wait(242, 0xFFFFFFFF7FFFEA9C) (sleeping...)
20632/244: lwp_park(0x00000000, 0) (sleeping...)
20632/242: lwp_wait(244, 0xFFFFFFFF7E1FBD2C) (sleeping...)
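
In case it helps when comparing further hangs, the truss output above can be read as a wait-for chain: LWP 1 is joining 242 (f2 in the pstack output), 242 is joining 244 (f4), and 244 is parked inside lock_node. Below is a rough Python sketch that extracts that chain from pasted truss lines. The helper names (parse, wait_chain) are my own and purely illustrative, and it assumes truss prints the lwp_wait target in decimal, as it appears to here.

```python
import re

# Sample copied from the truss output in this message.
TRUSS = """\
20632/1: lwp_wait(242, 0xFFFFFFFF7FFFEA9C) (sleeping...)
20632/244: lwp_park(0x00000000, 0) (sleeping...)
20632/242: lwp_wait(244, 0xFFFFFFFF7E1FBD2C) (sleeping...)
"""

# "<pid>/<lwp>: <syscall>(<first argument>"
LINE = re.compile(r"^\d+/(\d+):\s+(\w+)\((\w+)")


def parse(text):
    """Map each LWP id to (syscall name, first argument as printed)."""
    state = {}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            state[int(m.group(1))] = (m.group(2), m.group(3))
    return state


def wait_chain(state, start=1):
    """Follow lwp_wait targets from `start` until an LWP is not joining."""
    chain, seen, cur = [start], {start}, start
    while cur in state and state[cur][0] == "lwp_wait":
        nxt = int(state[cur][1])  # assumption: target LWP id is decimal
        chain.append(nxt)
        if nxt in seen:  # a cycle would mean threads joining each other
            break
        seen.add(nxt)
        cur = nxt
    return chain


state = parse(TRUSS)
chain = wait_chain(state)
last_call = state.get(chain[-1], ("unknown", ""))[0]
print(" -> ".join(map(str, chain)), "blocked in", last_call)
```

For the sample above, the chain ends at LWP 244, which is the thread sitting in lwp_park under lock_node.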

-----Original Message-----
From: Bernhard Voelker [mailto:m...@bernhard-voelker.de]
Sent: Tuesday, March 12, 2013 12:27 PM
To: McFarland, Jeffrey
Cc: coreutils@gnu.org
Subject: Re: Multithreaded sort hangs on Solaris

On 03/11/2013 04:47 PM, McFarland, Jeffrey wrote:
> >: sudo pstack 16328
> 16328: /usr/local/abacus/etsort/sort -tn -S 295063 --batch-size=100 -T /disk/
> ----------------- lwp# 1 / thread# 1 --------------------
> ffffffff7d4d8818 lwp_park (0, 0, 0)
> 0000000100009c74 sortlines (111b56580, 111c56080, ffffffff7fffeab0, 10012a321, ffffffff7fffead0, 10012a328) + 514
> 000000010000a5cc sortlines (111558380, 2, ffffffff7fffeab0, 1121765e0, 0, ffffffff7fffeab0) + e6c
> 000000010000a5cc sortlines (111956f80, 4, ffffffff7fffeab0, 112176420, 0, ffffffff7fffeab0) + e6c
> 000000010000a5cc sortlines (112154760, 8, ffffffff7fffeab0, 1121760a0, 1, ffffffff7fffeab0) + e6c
> 000000010000c070 sort (10012a740, 0, ffffffff7fffead0, 23, 10012cddd, 112154760) + 350
> 000000010000e6e8 main (13, ffffffff7ffff148, 0, 10012c220, fffd, 10012b1e0) + 1ee8
> 00000001000041bc _start (0, 0, 0, 0, 0, 0) + 7c

Hi Jeffrey,

the value of the second argument of topmost sortlines() invocation looks strange (if pstack shows it right). Can you attach with GDB and give us the values of the function arguments?

Have a nice day,
Berny