bug#41518: Bug in od?

2020-05-29 Thread Yuan Cao
On Fri, May 29, 2020 at 1:20 AM Bob Proulx wrote:

> A little more information.
>
> Pádraig Brady wrote:
> > Yuan Cao wrote:
> > > I recently came across the following behavior.
> > >
> > > When using "--traditional x2" or "-x" option, it seems the order of hex
> > > code output for the characters is pairwise reversed (if that's the
> > > correct way of describing it).
>
> ‘-x’
>  Output as hexadecimal two-byte units.  Equivalent to ‘-t x2’.
>
> Outputs 16-bit integers in the *native byte order* of the machine,
> which may be either big-endian or little-endian.  Not portable: the
> output depends upon the machine it is run upon.
>
> > If you want to hexdump independently of endianness, you can:
> >
> >   od -Ax -tx1z -v
>
> The -tx1 option above is portable because it outputs 1-byte units
> instead of 2-byte units, so the output is independent of endianness.
>
> This is the FAQ entry for this topic.
>
>
> https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#The-_0027od-_002dx_0027-command-prints-bytes-in-the-wrong-order_002e
>
> Bob
>
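
To make the distinction above concrete, here is a minimal Python sketch
(not od itself). The helper names dump_bytes and dump_native16 are made
up for illustration, and the zero padding of an odd trailing byte is an
assumption based on the "000a" unit in the output from the original
report.

```python
import struct
import sys

def dump_bytes(data: bytes) -> str:
    """One-byte units in file order, in the spirit of 'od -tx1'."""
    return " ".join(f"{b:02x}" for b in data)

def dump_native16(data: bytes) -> str:
    """Two-byte units read in this machine's native byte order, in the spirit of 'od -x'."""
    if len(data) % 2:
        data += b"\x00"  # assumption: pad an odd trailing byte with zero
    count = len(data) // 2
    # "=" selects native byte order with standard sizes; "H" is an unsigned 16-bit integer
    words = struct.unpack(f"={count}H", data)
    return " ".join(f"{w:04x}" for w in words)

sample = b"12\n"
print(sys.byteorder)          # "little" or "big", depending on the machine
print(dump_bytes(sample))     # same on every machine: 31 32 0a
print(dump_native16(sample))  # depends on sys.byteorder
```

The one-byte dump never combines bytes into a larger integer, so there is
no byte order to apply. The two-byte dump does, which is why its output
differs between machines.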

Thanks for pointing me to this documentation.

It just feels strange because the order does not reflect the order of the
characters in the file.

I think that, historically, it might have been useful to get the "by word"
values of a file when working with binary data. One might have stored some
data as a list of shorts, which could then easily be viewed using "od -x
data_file_name".
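
As a rough sketch of that use case in Python (the values are made up, and
little-endian order is chosen explicitly here so the result does not
depend on the machine running it):

```python
import struct

# Hypothetical data: three unsigned 16-bit values stored as little-endian shorts.
values = [0x0001, 0x0102, 0x1234]
raw = struct.pack("<3H", *values)

# Byte-by-byte view: each short appears with its low byte first.
as_bytes = " ".join(f"{b:02x}" for b in raw)

# Word view with the same byte order the data was written in:
# the stored values come back directly.
as_words = " ".join(f"{w:04x}" for w in struct.unpack("<3H", raw))

print(as_bytes)  # 01 00 02 01 34 12
print(as_words)  # 0001 0102 1234
```

For data like this, the two-byte view is the readable one and the
byte-by-byte view is the one that looks swapped.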

Since memory is so cheap now, people are probably just using chars for
text, and 4 byte ints or 8 byte ints where they used to use 2 byte ints
(shorts) before. In this case, the "by word" order does not seem to me to
be as useful and violates the principle of least astonishment needlessly.

It might be interesting to change the option to print values by double word
or quadword instead, or to add another option that lets users choose to
print by double word or quadword if they want.
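
To illustrate what grouping by double word or quadword would look like,
here is a small Python sketch. The helper name group_words is
hypothetical, little-endian order is assumed, and a short final unit is
padded with zero bytes as an assumption.

```python
def group_words(data: bytes, size: int, byteorder: str = "little") -> list:
    """Split data into size-byte units and format each as a hex integer."""
    # Pad with zero bytes so the length is a multiple of the unit size.
    padded = data + b"\x00" * (-len(data) % size)
    return [
        f"{int.from_bytes(padded[i:i + size], byteorder):0{size * 2}x}"
        for i in range(0, len(padded), size)
    ]

data = b"1234567890\n"
print(group_words(data, 4))  # ['34333231', '38373635', '000a3039']
print(group_words(data, 8))  # ['3837363534333231', '00000000000a3039']
```

With larger units on a little-endian machine, the characters appear
reversed within each group of 4 or 8 bytes rather than pairwise.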

Best Regards,

Yuan


bug#41518: Bug in od?

2020-05-24 Thread Yuan Cao
Hello,

I recently came across the following behavior.

When using "--traditional x2" or "-x" option, it seems the order of hex
code output for the characters is pairwise reversed (if that's the correct
way of describing it).

For example, using "od -cx" on a test file that contains "1234567890\n", you
get the following output:

000   1   2   3   4   5   6   7   8   9   0  \n
 3231  3433  3635  3837  3039  000a
013

It seems like it should be the following instead:

000   1   2   3   4   5   6   7   8   9   0  \n
 3132  3334  3536  3738  3930  0a00
013
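
For reference, the two hex lines above can be reproduced with a short
Python sketch. The helper name words16 is made up, and the zero padding
of the odd final byte is an assumption taken from the "000a" and "0a00"
units shown above.

```python
def words16(data: bytes, byteorder: str) -> str:
    """Format data as 16-bit hex units read in the given byte order."""
    # Pad an odd trailing byte with zero so every unit is two bytes.
    padded = data + b"\x00" * (len(data) % 2)
    return " ".join(
        f"{int.from_bytes(padded[i:i + 2], byteorder):04x}"
        for i in range(0, len(padded), 2)
    )

data = b"1234567890\n"
print(words16(data, "little"))  # 3231 3433 3635 3837 3039 000a
print(words16(data, "big"))     # 3132 3334 3536 3738 3930 0a00
```

The first line matches the output actually observed, and the second
matches the output I expected.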

The version involved is od in GNU coreutils 8.28.

Best Regards,

Yuan