Re: you are not going to be able to sort this by the fifth field.
EB> Except that you can specify overlapping keys. I find the idea of multiple
EB> separate lines of underscores, one per key, much easier to follow in

OK, any --debug=... is better than nothing.
Re: you are not going to be able to sort this by the fifth field.
According to jida...@jidanni.org on 3/4/2010 7:11 PM:
> Thanks. I see I neglected the -b.
> On the info page in the `--field-separator=SEPARATOR' discussion, do
> mention the effects of -b on ' foo' etc.
>
> PB> $ LC_CTYPE=C sort --debug -sb -k5,5 < taichung_county_atm.htm
> (Use .txt, not .htms in examples.)
>
> Anyway, your --debug stuff would be clearer with just pipes added
> inline:
> $ echo 'a b c'|sort --debug=show_fields
> a| b| c
> or something like that.

Except that you can specify overlapping keys. I find the idea of multiple separate lines of underscores, one per key, much easier to follow in understanding how each line is broken down into fields and keys, than I would in trying to parse inline notations that change the line's contents.

--
Eric Blake ebl...@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
Re: you are not going to be able to sort this by the fifth field.
Thanks. I see I neglected the -b.
On the info page in the `--field-separator=SEPARATOR' discussion, do mention the effects of -b on ' foo' etc.

PB> $ LC_CTYPE=C sort --debug -sb -k5,5 < taichung_county_atm.htm
(Use .txt, not .htms in examples.)

Anyway, your --debug stuff would be clearer with just pipes added inline:
$ echo 'a b c'|sort --debug=show_fields
a| b| c
or something like that.
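To illustrate the point about -b and fields like ' foo', here is a minimal sketch. The two sample lines are made up for illustration, and LC_ALL=C is an assumption chosen to force plain byte comparison (the thread itself only sets LC_CTYPE). With the default separator, a field includes its leading blanks, so without -b the key for the first line is '  b' and the key for the second is ' a'; a blank sorts before a letter, which puts 'x  b' first.

```shell
# Two hypothetical lines whose second field differs in leading blanks.
input=$(printf 'x  b\nx a\n')

# Without -b, the leading blanks are part of the key.
without_b=$(printf '%s\n' "$input" | LC_ALL=C sort -k2,2)

# With -b, leading blanks are skipped, so the keys are 'b' and 'a'.
with_b=$(printf '%s\n' "$input" | LC_ALL=C sort -b -k2,2)

printf 'without -b:\n%s\n' "$without_b"
printf 'with -b:\n%s\n' "$with_b"
```

Only the second invocation orders the lines by the visible text of the field.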
Re: you are not going to be able to sort this by the fifth field.
On 04/03/10 19:59, jida...@jidanni.org wrote:
> Try as you might, there is no way you are going to sort by this field,
> $ LC_CTYPE=zh_TW.UTF-8 w3m -dump \
>   http://www.tcb-bank.com.tw/tcb/servicesloc/atm_location/taichung_county_atm.htm |
>   perl -anlwe 'print $F[4] if exists $F[4]'|LC_CTYPE=C sort
> without ripping it out of the table first using perl.
> Go ahead, try -t ... -k ...,...
> You won't be able to order that field in the same way one can after
> ripping it out of the table.

This seems to work for me:

LC_CTYPE=C sort -sb -k5,5

I confirmed by extracting the field as perl above _after_ the sort using:

sed 's/^ *//; s/ +/ /g; s/\r$//' | cut -d ' ' -s -f5 | sed '/^$/d'

> sort (GNU coreutils) 8.4
> P.S., perhaps add a --debug-fields mode which adds field boundary
> | pipe symbols into the output.

Yes I agree it's very difficult to know exactly what's going on with the field processing in sort. I actually proposed and mostly implemented a --debug option. Here are some examples:

$ LC_CTYPE=C sort --debug -sb -k5,5 < taichung_county_atm.htm
** no match for key **
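The check described in the message above (sort the whole lines by the fifth field, then extract that field and compare with extract-then-sort) can be sketched on a tiny input. The three sample lines are invented for illustration, awk is swapped in for the perl and sed/cut extraction used in the thread, and LC_ALL=C is an assumption to force byte ordering.

```shell
# Hypothetical table rows with irregular spacing between columns.
rows=$(printf 'a b c d   pear\na  b c d apple\n a b c d    fig\n')

# Sort the whole lines on field 5, ignoring leading blanks, then pull out field 5.
sorted_then_cut=$(printf '%s\n' "$rows" | LC_ALL=C sort -s -b -k5,5 | awk '{print $5}')

# Pull out field 5 first, then sort it on its own.
cut_then_sorted=$(printf '%s\n' "$rows" | awk '{print $5}' | LC_ALL=C sort)

if [ "$sorted_then_cut" = "$cut_then_sorted" ]; then
    echo "same order"
else
    echo "orders differ"
fi
```

If the two results match, the in-place key sort orders the field the same way as sorting the extracted column.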
you are not going to be able to sort this by the fifth field.
Try as you might, there is no way you are going to sort by this field,

$ LC_CTYPE=zh_TW.UTF-8 w3m -dump \
  http://www.tcb-bank.com.tw/tcb/servicesloc/atm_location/taichung_county_atm.htm |
  perl -anlwe 'print $F[4] if exists $F[4]'|LC_CTYPE=C sort

without ripping it out of the table first using perl.
Go ahead, try -t ... -k ...,...
You won't be able to order that field in the same way one can after ripping it out of the table.

sort (GNU coreutils) 8.4

P.S., perhaps add a --debug-fields mode which adds field boundary | pipe symbols into the output.