Pádraig Brady wrote:
tag 21000 wontfix
close 21000
stop
On 07/07/15 03:00, Christopher Samuel wrote:
Hi there,
When trying to sort with the -h option (--human-numeric-sort) it seems
to fail to get the ordering correct, for instance in a column of values
of memory usage from the Slurm HPC batch system you get this:
2868768K
2875504K
3278652K
3435484K
3461744K
4050208K
419.50M
421M
422M
447.50M
451M
467M
478.50M
479M
496M
998M
1.09G
1.31G
1.31G
1.31G
I.e., sort -h is not as general as you require.
You can leverage numfmt(1) though to do the required adjustments.
For example, copy/pasting this command:
g 420M
h 421M
i 422M
j 448M
k 451M
l 467M
m 479M
n 479M
o 496M
p 998M
q 1.1G
r 1.4G
s 1.4G
t 1.4G
a 2.9G
b 2.9G
c 3.3G
d 3.5G
e 3.5G
f 4.1G
But that's the wrong output. sort -h uses power-of-2 units.
And you expect users to use that as a workaround?
*puhshaw*... it's not that hard and this:
Paul Eggert wrote:
Looking at both would require arbitrary-precision arithmetic,
something that 'sort' doesn't do (it does only arbitrary-precision
comparison).
Arbitrary precision...that's a strawman argument IMHO...since the output
is targeted for 3-4 digits plus a suffix, one would just normalize them.
You can do up to exabytes in 64-bit integers (binary or decimal).
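A minimal sketch of that normalization, assuming power-of-2 units (K = 1024), non-negative values and at most one suffix letter. The names to_bytes and sort_by_bytes are made up for illustration; this is not how coreutils or hsort does it:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: turn "2868768K", "419.50M" or "1.09G" into bytes.
// Assumes K = 1024 and a result that fits in 64 bits; malformed input
// trips an assert instead of being handled.
uint64_t to_bytes(const std::string &text) {
    char *end = nullptr;
    const long double number = std::strtold(text.c_str(), &end);
    assert(end != text.c_str() && number >= 0);
    long double scale = 1;
    if (*end != '\0') {
        static const std::string suffixes = "KMGTPE";
        const std::string::size_type pos = suffixes.find(*end);
        assert(pos != std::string::npos);
        for (std::string::size_type i = 0; i <= pos; ++i) scale *= 1024;
    }
    return static_cast<uint64_t>(number * scale + 0.5L);  // round to nearest byte
}

// Sort the strings by the byte count they stand for, not by their suffix.
void sort_by_bytes(std::vector<std::string> &values) {
    std::stable_sort(values.begin(), values.end(),
                     [](const std::string &a, const std::string &b) {
                         return to_bytes(a) < to_bytes(b);
                     });
}
```

With values taken from the report, sort_by_bytes puts 419.50M first and 4050208K (about 3.9G) last.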
I added a file to mine as well: (file-{001-020}):
420M file-006
421M file-007
422M file-008
448M file-009
451M file-010
467M file-011
478M file-012
479M file-013
496M file-014
998M file-015
1.1G file-016
1.3G file-017
1.3G file-018
1.3G file-019
2.7G file-000
2.7G file-001
3.1G file-002
3.3G file-003
3.3G file-004
3.9G file-005
---- -----
~29G TOTAL
The above is from a 10-y/o horrid perl prog, "hsort", that started in perl4.
I've thought about refactoring it, but it works, so it sits on a back-shelf.
Admittedly, I just added the normalization, but I never had a case that
needed it before I saw the above input. I would presume that the O.P.
(Christopher) wouldn't find normalizing it to fewer digits to be a
problem -- as long as it sorts the numbers in the correct order...
My word... sort -h really does numeric compares within the same prefix?
It doesn't convert them to a number then sort, then reapply units?
Urg.....and I was so impressed that you guys finally added that to sort.
Oh well....*whistling innocently*....;-)
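For what it's worth, the coreutils manual describes -h as sorting by sign, then by suffix, and only then by numeric value. A rough sketch of such a suffix-first comparison (not the coreutils code; suffix_rank and suffix_first_less are hypothetical names, and negative numbers are ignored) reproduces the ordering from the report:

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Rank of the trailing unit letter: none < K < M < G < T < P < E.
int suffix_rank(const std::string &text) {
    static const std::string order = "KMGTPE";
    if (text.empty()) return 0;
    const std::string::size_type pos = order.find(text.back());
    return pos == std::string::npos ? 0 : static_cast<int>(pos) + 1;
}

// Suffix-first "less than": the suffix decides, and the number in front
// of it only breaks ties between values that carry the same suffix.
bool suffix_first_less(const std::string &a, const std::string &b) {
    const int ra = suffix_rank(a), rb = suffix_rank(b);
    if (ra != rb) return ra < rb;
    return std::strtod(a.c_str(), nullptr) < std::strtod(b.c_str(), nullptr);
}

// Usage check: 4050208K (about 3.9G) lands before 419.50M, as reported.
void check_reported_order() {
    assert(suffix_first_less("4050208K", "419.50M"));
    assert(suffix_first_less("998M", "1.09G"));
    assert(suffix_first_less("419.50M", "421M"));
}
```

That ordering is only correct when every value already uses the largest suffix that fits, which the mixed K/M/G column above does not.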
(it's not a completely trivial prog to write even in perl -- likely just
that much harder in C).
If you leave out support for Zetta and Yotta, you can do the rest in
64-bit integers...
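A quick way to check that limit, as a sketch with hypothetical helper names (max_unit_index, largest_suffix): count how many times 1 can be multiplied by the unit base before it would overflow an unsigned 64-bit integer.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Number of times 1 can be multiplied by `base` without overflowing
// uint64_t. Index 0 is plain bytes, 1 is K, 2 is M, and so on.
int max_unit_index(uint64_t base) {
    int index = 0;
    uint64_t value = 1;
    while (value <= std::numeric_limits<uint64_t>::max() / base) {
        value *= base;
        ++index;
    }
    return index;
}

// Largest suffix whose unit still fits; only meant for base 1000 or 1024.
char largest_suffix(uint64_t base) {
    static const char suffixes[] = " KMGTPEZY";
    assert(base == 1000 || base == 1024);
    return suffixes[max_unit_index(base)];
}
```

Both largest_suffix(1024) and largest_suffix(1000) come out as 'E': one exa unit (2^60 or 10^18) fits, while one zetta unit (2^70 or 10^21) does not.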
Hey...I just wrote the output portion of that (and it has normalization)
in C++ for another prog... It's in C++, but I wouldn't think it that
hard to reuse for C. Yeah, I know, the output's the easier part,
but for 36 lines, it seems to work ok.
/********#*********#*********#*********#*********#*********#*********#*********/
// Produce num+binary or SI suffix {{{
#include <cstdint>
#include <cstdio>
#include <string>
using std::string;

// Divide value by scale (1024 or 1000) until it fits in three digits, then
// format it with a unit suffix. buf must hold at least 10 bytes.
// (The local shorthands "CE" and "uns" are spelled out as const/unsigned.)
static char * Scale(char buf[], double value, const int scale = 1024) {
  static const char suffixes[] = { ' ', 'K', 'M', 'G', 'T', 'P', 'E' };
  const unsigned last_i = sizeof(suffixes) / sizeof(char) - 1;
  unsigned i;
  for (i = 0; value >= 999.5 && i < last_i; ++i ) value /= (double) scale;
  // One decimal place below 10 ("1.3G"), none from 10 up ("479M").
  snprintf(buf, 10, value == 0.0 ? "0"
                  : value < 9.95 ? "%.1f%c"
                  : "%.0f%c", value, suffixes[i]);
  return buf;
}

// Variants that return a std::string.
string Scale(double value, const int scale = 1024) {
  char buf[16];
  return string(Scale(buf, value, scale));
}
string Binary_Scale (double value)   { return Scale(value); }
string Binary_Scale (uint64_t value) { return Binary_Scale((double) value); }
string Binary_Scale (int64_t value)  { return Binary_Scale((double) value); }
string SI_Scale (uint64_t value)     { return Scale((double) value, 1000); }

// Variants that write into a caller-supplied buffer.
char * Binary_Scale (char buf[], double value) { return Scale(buf, value); }
char * Binary_Scale (char buf[], uint64_t value) {
  return Binary_Scale(buf, (double) value); }
char * Binary_Scale (char buf[], int64_t value) {
  return Binary_Scale(buf, (double) value); }
char * SI_Scale (char buf[], uint64_t value) {
  return Scale(buf, (double) value, 1000); }