On Tue, 6 Dec 2016 02:17:03 -0800 Michael Forney <mfor...@mforney.org> wrote:
Hey Michael,

> POSIX says that -c specifies a number of bytes, not characters. This
> flag is commonly used by scripts that operate on binary files to do
> things like extract a header. Treating the offsets as character
> offsets will break things in mysterious ways.
>
> Instead, add a -m option (chosen to match `wc -m`, which also operates
> on characters) to handle character offsets.
> ---
> I'm tempted to just delete the character functionality instead of
> introducing a new non-standard option. I can see the use of tail with
> codepoints, but we definitely need to make -c work on bytes so that we
> don't break scripts.
>
> I'm also open to changing the option flag to something else. I just
> chose -m because that's what wc uses for characters.

well-spotted! Still, it's _very_ counterintuitive to call the flag
"-c". Instead of adding a non-portable m-flag, it would even sound
better to me to add a b-flag for byte-offsets.

It all depends on how many scripts rely on this behaviour. Can you give
an example? I thought cut(1) was the tool of choice for extracting
headers and such things.

Cheers

Laslo

-- 
Laslo Hunhold <d...@frign.de>