Leo Butler wrote:
> -16 -2 -14 -5 1 1 0 0.3080808080808057 0 0.1540404040404028 0.3904338415207971
That should be fine.

> I have a dual processor machine, with each processor being an Intel Core 2
> Duo E6850, rated at 3GHz and cache 4096 kB, with 3.8GB total physical
> memory and 4GB swap space and two partitions on the hdd with 200GB and
> 140GB available space.

Sounds like a very nice machine.

> I am using sort v. 5.2.1 and v. 6.1 & v. 6.9. The former is installed as
> part of the RHEL OS and the latter two were compiled from the source at
> http://ftp.gnu.org/gnu/coreutils/ with the gcc v. 3.4.6 compiler.

All good so far.  To nail down two more details, could you provide the
output of these commands?

  uname -a
  ldd --version | head -n1
  file /usr/bin/sort ./sort

The first two will give us the kernel and libc versions.  The last will
report whether the sort binaries are 32-bit or 64-bit.

> When I attempt to sort the file, with a command like
>
> ./sort -S 250M -k 6,6n -k 7,7n -k 8,8n -k 9,9n -k 10,10n -k 11,11n -T /data
> -T /data2 -o out.sort in.txt
>
> sort rapidly chews up about 40-50% of total physical memory (=1.5-1.9GB) at
> which point the error message 'sort: memory exhausted' appears. This
> appears to be independent of the parameter passed through the -S option.
> ...
> Is this an idiosyncratic problem?

That is very strange.  If by "idiosyncratic" you mean particular to your
system, then probably yes, because I have routinely sorted large files
without problems.  But that doesn't mean it isn't a bug.

At 50GB the data file is very large compared to your 4GB of physical
memory, so sort cannot sort it in memory.  Instead it will open temporary
files and, as a first pass, split the input into many sorted chunks,
sorting one large chunk to one file, then another, and so on.  As a
second pass it will merge-sort the sorted chunks together into the output
file.  (See the postscript below for a manual illustration of this
two-pass scheme.)

What is the output of this command on your system?

  sysctl vm.overcommit_memory

I am asking because by default the Linux kernel overcommits memory and
does not return out-of-memory conditions; instead the offending process
(or some other one) is killed by the Linux out-of-memory killer.  But
enterprise systems are often configured with overcommit disabled for
reliability reasons, and that appears to be how your system is set up,
because otherwise you would not see a message from sort about being out
of memory.  (I always disable overcommit so as to avoid the out-of-memory
killer.)

Do you have user process limits active?  What is the output of this
command?

  ulimit -a

What does free say on your system?

  free

> I have read backlogs of the list and people report sort-ing 100GB
> files. Do you have any ideas?

Without doing a lot of debugging, I wonder whether your choice of locale
setting is affecting this.  I doubt it, because all of the sort fields
are numeric, but since it is easy enough to test, could you try sorting
with LC_ALL=C and see whether that makes a difference?

  LC_ALL=C sort -k 6,6n -k 7,7n -k 8,8n -k 9,9n -k 10,10n -k 11,11n \
    -T /data -T /data2 -o out.sort in.txt

Also, could you determine how large the sort process is at the moment it
reports running out of memory?  I am wondering whether it stops at a
magic size such as 2GB or 4GB, which could provide more insight into the
problem.  (The second postscript shows one way to capture this.)

Bob
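P.S. To make the two-pass scheme above concrete, here is a rough manual
equivalent built from split and "sort -m" (a sketch only -- the chunk
size is arbitrary and this is not how sort implements it internally;
sort manages its temporary chunks for you):

  # First pass: split the input into fixed-size chunks and sort each
  # chunk individually, writing the sorted chunks to the scratch disk.
  split -l 10000000 in.txt /data/chunk.
  for f in /data/chunk.*; do
    sort -k 6,6n -k 7,7n -k 8,8n -k 9,9n -k 10,10n -k 11,11n "$f" -o "$f.sorted"
  done

  # Second pass: sort -m merges the already-sorted chunks without
  # re-sorting them, so it needs very little memory.
  sort -m -k 6,6n -k 7,7n -k 8,8n -k 9,9n -k 10,10n -k 11,11n \
    /data/chunk.*.sorted -o out.sort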
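P.P.S. One way to capture the size of the sort process at the moment it
fails (a sketch; on Linux, ps reports vsz and rss in kilobytes):

  ./sort -S 250M -k 6,6n -k 7,7n -k 8,8n -k 9,9n -k 10,10n -k 11,11n \
    -T /data -T /data2 -o out.sort in.txt &
  pid=$!
  # Sample the virtual and resident sizes until the process exits;
  # the last line printed is roughly the size just before the failure.
  while kill -0 "$pid" 2>/dev/null; do
    ps -o vsz=,rss= -p "$pid"
    sleep 5
  done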