bug#54586: dd conv options doc

2022-04-04 Thread Bob Proulx
Karl Berry wrote:
>  'fdatasync'
>   Synchronize output data just before finishing.  This forces a
>   physical write of output data.
>
>  'fsync'
>   Synchronize output data and metadata just before finishing.
>   This forces a physical write of output data and metadata.
>
> Weirdly, these descriptions are inducing quite a bit of FUD in me.
>
> Why would I ever want the writes to be incomplete after running dd?
> Seems like that is dd's whole purpose.

Yes.  FUD.  The writes are not incomplete.  It is no different from
any other write.

echo "Hello, World!" > file1

Is that write complete?  It's no different.  If one is incomplete then
so is the other.  Note that the documentation does not say
"incomplete" but says "physical write".  As in, chiseled into stone.

The dd utility exists with a plethora of low level options not
typically available in other utilities such as cp.  That very low
level control of option flags is one of the distinguishing features
making dd useful in a very large number of cases where we would
otherwise use cp, rsync, or one of the others.
But just because options exist does not mean they should always be
used.  Most of the time they should not be used.

> Well, I suppose it is too late to make such a radical change as forcing
> a final sync.

Please, no.  Opposing this is the motivation for me writing this
response.  Things are wastefully slow already due to the number of
fsync() calls now coded into other programs all over the place (not
referring to the coreutils here).  Let's not make the problem worse
by adding them where they are not desired.  That is why it is an
option to dd and not on by default.  In those specific cases where it
is useful it can be specified as an option.  dd is exposing the
interface for when it is useful.

As a practical matter, with GNU dd's extensions I never use
conv=fsync or conv=fdatasync but instead would always in those same
cases use oflag=direct,sync.  Such as when writing a removable
storage device like a USB drive that I subsequently will want to
remove.  There is no benefit to caching the data since it will be
invalidated immediately.  Not using the buffer cache also avoids
flushing other data that would be useful to keep in the file system
buffer cache.  When the write is done the removable media can be
removed.  This avoids needing to run sync explicitly, which syncs
*everything*.
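
For illustration, a hypothetical invocation of that kind for writing
an image to a removable device (the image and device names here are
placeholders; double check the device name before running anything
like this):

$ dd if=disk.img of=/dev/sdX bs=4M oflag=direct,sync status=progress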

> In which case I suggest adding another sentence along the lines of
> "If these options are not specified, the data will be physically
> written when the system schedules the syncs, ordinarily every few
> seconds" (correct?).

Yes.  However the behavior might vary slightly between the different
kernels such as the Linux kernel, the BSD kernels, or even the HP-UX
kernel.  Therefore documenting it precisely would be kernel specific,
even if all of the kernels operate similarly.

> "You can also manually sync the output filesystem yourself
> afterwards (xref sync)." Otherwise it feels uncertain when or
> whether the data will be physically written, or how to look into it
> further.

Generally this is a task that the operating system should be handling.
A programmer taking explicit control and defeating the cache is almost
always going to be less efficient at it than the operating system.

However as you later mention writing an image to a removable storage
device like a USB thumbdrive needs to have the data flushed through
before removing the device.  GNU dd is good for this as I will
describe below but otherwise yes a "sync" (either the standalone or
the oflag) would be needed to ensure that the data has been flushed
through.

> As for "metadata", what does dd have to do with metadata?  My wild guess
> is that this is referring to filesystem metadata, not anything about dd
> specifically. Whatever the case, I suggest adding a word or two to the
> doc to give a clue.

It's not dd's fault.  The OS created it first!  It's a property given
meaning by the OS.  The OS defines the option flags.  The dd utility
is simply a thin layer giving access to the OS file option flags.

> Further, why would I want data to be synced and not metadata? Seems like
> fdatasync and fsync should both do both; or at least document that
> normally they'd be used together. Or, if there is a real-life case where
> a user would want one and not the other, how about documenting that? My
> imagination is failing me, but presumably these seemingly-undesirable
> options were invented for a reason.

The fdatasync() man page provides the information.

   The aim of fdatasync() is to reduce disk activity for applications
   that do not require all metadata to be synchronized with the disk.

In short fdatasync() is less heavy than fsync().
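
For completeness, here is how the two variants look on a dd command
line; the file and device names are placeholders:

$ dd if=in.img of=/dev/sdX bs=1M conv=fsync
$ dd if=in.img of=/dev/sdX bs=1M conv=fdatasync

Both flush the data before dd exits; the fdatasync variant may skip
flushing metadata, such as timestamps, that is not needed to retrieve
the data later.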

> BTW, I came across these options on a random page discussing dumping a
> .iso to a USB drive; the example was
>   dd if=foo.iso of=/dev/sde conv=fdatasync
> .. seems 

bug#53631: coreutils id(1) incorrect behavior

2022-01-29 Thread Bob Proulx
Vladimir D. Seleznev wrote:
> Expected behavior is:
>   # id user1
>   uid=1027(user1) gid=1027(user1) groups=1027(user1)
>   # id user2
>   uid=1027(user1) gid=1027(user1) groups=1027(user1),1028(somegroup)

I just tried a test on both FreeBSD and NetBSD and both behave as you
expect.  That gives weight to GNU Coreutils matching that behavior.

> Example:
>   # useradd user1
>   # groupadd somegroup
>   # useradd -o -u "$(id -u user1)" -g "$(id -G user1) -G somegroup user2

I'll just note that there is a missing ending quote character.  It's
also missing the -m option to create a home directory.  For those who
wish to recreate the test case.
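
With those corrections applied, a sketch of the reproduction would
look like the following.  Note that I use id -g here for the primary
group where the original used id -G, which prints all group ids.

# useradd -m user1
# groupadd somegroup
# useradd -m -o -u "$(id -u user1)" -g "$(id -g user1)" -G somegroup user2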

root@turmoil:~# tail -n2 /etc/passwd /etc/group /etc/shadow /etc/gshadow
==> /etc/passwd <==
user1:x:1001:1001::/home/user1:/bin/sh
user2:x:1001:1001::/home/user2:/bin/sh

==> /etc/group <==
user1:x:1001:
somegroup:x:1002:user2

==> /etc/shadow <==
user1:!:19022:0:9:7:::
user2:!:19022:0:9:7:::

==> /etc/gshadow <==
user1:!::
somegroup:!::user2

With the above, this is not really a valid configuration.  Therefore
I don't think it is surprising that the utilities don't "figure it
out" completely correctly.  I have never seen a user like user2 given
a different set of groups than the primary uid specifies.  I think in
practice that will be problematic, since the system will use the uid
for such things and the uid would map to a different set of auxiliary
groups.

Note that it is perfectly valid and long standing practice to allow
multiple passwd entries with the same uid number.  That's a technique
to allow multiple different passwords and login shells for the same
account.
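
As an illustration, two hypothetical passwd entries sharing uid 1001
but with different login names and shells:

user1:x:1001:1001::/home/user1:/bin/sh
user1r:x:1001:1001::/home/user1:/bin/rbash

Logging in as either name reaches the same uid, but each entry can
have its own password and login shell.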

[[ I'll further note that use of nscd completely breaks this useful
ability by hashing all duplicate uid entries together.  Like in The
Highlander, with nscd there can be only one.  It's why I never use
nscd anywhere as this makes it not suitable for purpose.  But that's
rather off this topic.  I'll bracket it as an aside. ]]

Bob





bug#53033: date has multiple "first saturday"s?

2022-01-10 Thread Bob Proulx
Darryl Okahata wrote:
> Bob Proulx wrote:
> Inconsistencies like this are why I wish it had never been implemented.  
> Best to avoid the syntax completely.
> 
> Thanks.  I'll avoid date and use either python or ruby to get this info.

To be clear, what I meant was that I would avoid the ordinal words
such as first, second, and third, because as documented the word
second is already used for the time unit.  I meant that instead it
would be better to use the actual numbers 1, 2, and 3, to avoid that
problem.

However reading your report again I now question whether I understand
what you were trying to report specifically.  Initially you wrote:

$ date -d "first saturday"
Sat Jan  8 00:00:00 PST 2022

Running it again today I get:

$ date -d "first saturday"
Sat Jan 15 12:00:00 AM MST 2022

$ date -d "next saturday"
Sat Jan 15 12:00:00 AM MST 2022

That's the first Saturday after now.  The debug output provides
valuable information.

$ date --debug -d 'first saturday'
date: parsed day part: next/first Sat (day ordinal=1 number=6)
date: input timezone: system default
date: warning: using midnight as starting time: 00:00:00
date: new start date: 'next/first Sat' is '(Y-M-D) 2022-01-15 00:00:00'
date: starting date/time: '(Y-M-D) 2022-01-15 00:00:00'
date: '(Y-M-D) 2022-01-15 00:00:00' = 1642230000 epoch-seconds
date: timezone: system default
date: final: 1642230000.0 (epoch-seconds)
date: final: (Y-M-D) 2022-01-15 07:00:00 (UTC)
date: final: (Y-M-D) 2022-01-15 00:00:00 (UTC-07)
Sat Jan 15 12:00:00 AM MST 2022

Is it useful to know the date, say, three Saturdays from now?  I am
sure there is a good case for it.  But it always leaves me scratching
my head wondering, because it is basically starting from the date of
today, at midnight, and then counting Saturdays.

$ date --debug -d 'third saturday'
date: parsed day part: third Sat (day ordinal=3 number=6)
date: input timezone: system default
date: warning: using midnight as starting time: 00:00:00
date: new start date: 'third Sat' is '(Y-M-D) 2022-01-29 00:00:00'
date: starting date/time: '(Y-M-D) 2022-01-29 00:00:00'
date: '(Y-M-D) 2022-01-29 00:00:00' = 1643439600 epoch-seconds
date: timezone: system default
date: final: 1643439600.0 (epoch-seconds)
date: final: (Y-M-D) 2022-01-29 07:00:00 (UTC)
date: final: (Y-M-D) 2022-01-29 00:00:00 (UTC-07)
Sat Jan 29 12:00:00 AM MST 2022

It seems to me that it would be just as clear to use numbers in that
position so as to avoid ambiguity.

$ date --debug -d '2 saturday'
date: parsed day part: (SECOND) Sat (day ordinal=2 number=6)
date: input timezone: system default
date: warning: using midnight as starting time: 00:00:00
date: new start date: '(SECOND) Sat' is '(Y-M-D) 2022-01-22 00:00:00'
date: starting date/time: '(Y-M-D) 2022-01-22 00:00:00'
date: '(Y-M-D) 2022-01-22 00:00:00' = 1642834800 epoch-seconds
date: timezone: system default
date: final: 1642834800.0 (epoch-seconds)
date: final: (Y-M-D) 2022-01-22 07:00:00 (UTC)
date: final: (Y-M-D) 2022-01-22 00:00:00 (UTC-07)
Sat Jan 22 12:00:00 AM MST 2022

There is no need for the word "second" in "second saturday" when the
relative time "2 saturday" produces the desired answer.

My wondering now is whether "2 saturday" was actually what was
desired at all.  Perhaps what was really wanted was the date of the
first Saturday of the month?  That's an entirely different problem.
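
For what it is worth, the first Saturday of the current month can be
found by looping over the first seven days; a sketch using GNU date,
where %u is the day of week and 6 means Saturday:

for d in 01 02 03 04 05 06 07; do
  date -d "$(date +%Y-%m)-$d" +'%u %F'
done | awk '$1 == 6 { print $2 }'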

Also, when working with dates I strongly encourage working with UTC.
I went along with the original example.  But I feel I should have been
producing examples like this instead with -uR.

$ date -uR --debug -d '2 saturday'
date: parsed day part: (SECOND) Sat (day ordinal=2 number=6)
date: input timezone: TZ="UTC0" environment value or -u
date: warning: using midnight as starting time: 00:00:00
date: new start date: '(SECOND) Sat' is '(Y-M-D) 2022-01-22 00:00:00'
date: starting date/time: '(Y-M-D) 2022-01-22 00:00:00'
date: '(Y-M-D) 2022-01-22 00:00:00' = 1642809600 epoch-seconds
date: timezone: Universal Time
date: final: 1642809600.0 (epoch-seconds)
date: final: (Y-M-D) 2022-01-22 00:00:00 (UTC)
date: final: (Y-M-D) 2022-01-22 00:00:00 (UTC+00)
Sat, 22 Jan 2022 00:00:00 +0000

Bob





bug#53145: "cut" can't segment Chinese characters correctly?

2022-01-09 Thread Bob Proulx
zendas wrote:
> Hello, I need to get Chinese characters from the string. I googled a
> lot of documents, it seems that the -c parameter of cut should be
> able to meet my needs, but I even directly execute the instructions
> on the web page, and the result is different from the
> demonstration. I have searched dozens of pages but the results are
> not the same as the demo, maybe this is a bug?

Unfortunately the example was attached as images instead of as plain
text.  Please in the future copy and paste the example as text rather
than as an image.  An image makes it impossible to reproduce the
example by copying and pasting, and impossible to search for the
strings.

The images were also lost somehow from the various steps in the
mailing list pipelines with this message.  First it was classified as
spam by the anti-spam robot (SpamAssassin-Bogofilter-CRM114).  I
caught it in review and re-sent the message.  That may have been the
problem specifically with images.

> For example:
> https://blog.csdn.net/xuzhangze/article/details/80930714
> [20180705173450701.png]
> the result of my attempt:
> [螢幕快照 2022-01-10 02:49:46.png]

One of the two images:


https://debbugs.gnu.org/cgi/bugreport.cgi?msg=5;bug=53145;att=3;filename=20180705173450701.png

A second problem is that the first image shows as corrupted.  I can
view the original however.  To my eye they are similar enough that
the one above is sufficient and I do not need to re-send the
corrupted image.

As to the problem you have reported, it is due to lack of
internationalization support for multi-byte characters.  At this
moment -c is the same as -b.


https://www.gnu.org/software/coreutils/manual/html_node/cut-invocation.html#cut-invocation

‘-c CHARACTER-LIST’
‘--characters=CHARACTER-LIST’
 Select for printing only the characters in positions listed in
 CHARACTER-LIST.  The same as ‘-b’ for now, but internationalization
 will change that.  Tabs and backspaces are treated like any other
 character; they take up 1 character.  If an output delimiter is
 specified, (see the description of ‘--output-delimiter’), then
 output that string between ranges of selected bytes.

As of the current version the -c option operates the same as the -b
option and is not suitable for dealing with multi-byte UTF-8
characters.

$ echo '螢幕快照'
螢幕快照
$ echo '螢幕快照' | cut -c 1
?
$ echo '螢幕快照' | cut -c 1-3
螢
$ echo '螢幕快照' | cut -b 1-3
螢

If the characters are known to be 3-byte multi-byte characters then I
might suggest using -b to work around the problem, assuming 3-byte
characters throughout.  Eventually, when -c is coded to handle
multi-byte characters, its handling of bytes will change.  Using -b
would avoid being affected by that change.
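
If a multibyte-aware tool is an acceptable workaround then in a UTF-8
locale GNU grep can split the string into characters regardless of
their byte width; a sketch selecting the second character:

$ echo '螢幕快照' | grep -o . | sed -n 2p
幕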

Some operating systems have locally patched their version of the
utilities to add multi-byte character handling.  But the patches have
not been found acceptable for inclusion upstream.  That is why there
are differences between different operating systems.

Bob





bug#53033: date has multiple "first saturday"s?

2022-01-07 Thread Bob Proulx
Darryl Okahata via GNU coreutils Bug Reports wrote:
> From coreutils 9.0 (note the difference between the "second" and "third"
> saturdays):
...
> $ src/date --debug -d "second saturday"
> date: parsed relative part: +1 seconds

Caution!  The date utility can't parse "second" as an ordinal because
second is already a unit of time.  The documentation says:

   A few ordinal numbers may be written out in words in some contexts.
This is most useful for specifying day of the week items or relative
items (see below).  Among the most commonly used ordinal numbers, the
word ‘last’ stands for -1, ‘this’ stands for 0, and ‘first’ and ‘next’
both stand for 1.  Because the word ‘second’ stands for the unit of time
there is no way to write the ordinal number 2, but for convenience
‘third’ stands for 3, ‘fourth’ for 4, ‘fifth’ for 5, ‘sixth’ for 6,
‘seventh’ for 7, ‘eighth’ for 8, ‘ninth’ for 9, ‘tenth’ for 10,
‘eleventh’ for 11 and ‘twelfth’ for 12.

Inconsistencies like this are why I wish it had never been
implemented.  Best to avoid the syntax completely.

Bob





bug#52481: chown of coreutils may delete the suid of file

2021-12-17 Thread Bob Proulx
21625039 wrote:
> [root@fedora ~]# ll test.txt
> -rwsr-x---. 1 root root 0 Dec 13 21:13 test.txt
> 
> [root@fedora ~]# chown root:root test.txt
> [root@fedora ~]# ll test.txt
> -rwxr-x---. 1 root root 0 Dec 13 21:13 test.txt

That is a feature of the Linux kernel, the OpenBSD kernel, and the
NetBSD kernel, and I presume of other kernels too.  I know that
traditional Unix systems did not do this.  But it is done by the
kernel as a security mitigation against some types of attack.

For example a user might have a file in their own directory tree
which is executable and setuid.  Then through a social engineering
attack they coerce root into copying the file or otherwise taking
ownership of the directory tree, hoping to make use of the newly
root-owned file that is executable and still setuid.

Therefore as a security mitigation implemented by the OS kernel the
setuid bit is removed when chown'ing files.  If this is truly desired
then the file can be chmod'd explicitly after chown'ing the file.
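
For example, using the reporter's file, a sketch of restoring the
setuid bit after the ownership change:

# chown root:root test.txt
# chmod u+s test.txt
# ll test.txt
-rwsr-x---. 1 root root 0 Dec 13 21:13 test.txt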

This is entirely a kernel behavior and is not specific to the
chown(1) command line utility at all.  You can verify that the same
kernel behavior exists from any programming language, without
Coreutils involved at all.  It will behave the same.

# ll test.txt
-rwsr-xr-x 1 rwp rwp 0 Dec 17 17:13 test.txt
# perl -e 'chown 0, 0, "test.txt" or die;'
# ll test.txt
-rwxr-xr-x 1 root root 0 Dec 17 17:13 test.txt

Bob





bug#52206: Bug: rm -rf /*/*

2021-11-30 Thread Bob Proulx
Bob Proulx wrote:
> Paul Eggert wrote:
> > Robert Swinford wrote:
> > > BTW, zsh globbing doesn’t exhibit this behavior!  It seems it is only a 
> > > problem in bash.
> >
> > In that case, the bug (whatever it is) wouldn't be a coreutils bug.
>
> I don't understand the comment that zsh doesn't expand the glob /*/*
> and I tried it and verified that it does indeed expand that glob
> sequence.

Lawrence Velazquez made sense of this on the bug-bash list.

https://lists.gnu.org/archive/html/bug-bash/2021-11/msg00193.html

Bob





bug#52115: Suggestion: LN command should swap TARGET and LINK_NAME if LINK_NAME already exists

2021-11-30 Thread Bob Proulx
Paul Eggert wrote:
> Bob Proulx wrote:
> > mv calls it SOURCE and DEST.  cp calls it SOURCE and DEST.  Perhaps ln
> > should also call it SOURCE and DEST too for consistency?
> 
> That's what ln did long ago, but that wording was deemed too confusing.
> Here's where we changed it to use something more like the current wording:
> 
> https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=519365bb089cf90bdc780c37292938f42019c7ea

This just proves that there is no perfect solution.  It's a flip-flop
with either state having imperfections.

My first thought was how humorous this situation is: due to
complaints about the documentation we would be led in a circle back
to the beginning, where this was changed previously due to complaints
about the documentation.

Bob





bug#52115: Suggestion: LN command should swap TARGET and LINK_NAME if LINK_NAME already exists

2021-11-30 Thread Bob Proulx
Andreas Schwab wrote:
> Bob Proulx wrote:
> > The more I think about it the more I think it should say CONTENT
> > rather than either TARGET or SOURCE.  Because it is actually setting
> > the content of the symbolic link.
> 
> A hard link doesn't have content.

But we are talking about symbolic links which do have content.

Bob





bug#52206: Bug: rm -rf /*/*

2021-11-30 Thread Bob Proulx
Paul Eggert wrote:
> Robert Swinford wrote:
> > This seems like a bug: 
> > https://twitter.com/nixcraft/status/1465599844299411458
> 
> I don't see a coreutils bug there: rm operated as specified.

Agreed.  It's not an rm bug.  It's definitely unfortunate.  It is
similar to the misfortune of riding a bicycle into a lake.  But it
isn't a defect in the bicycle that it could not prevent someone from
riding it into a lake.

> > Interestingly, however, rm -rf // only does the following:
> 
> Yes, that's a special feature of GNU rm.

And apparently Bryan Cantrill reports that Solaris has the same
feature as GNU rm does for "rm -rf /" protection.

> > I believe illumos has already solved this problem in a POSIX compliant 
> > fashion
> 
> Not sure what you're talking about here. Could you be specific? Don't have
> time to watch videos.

I watched the cited video.  It features an interview with Bryan
Cantrill who very dynamically and entertainingly tells a story about
a scripted "rm -rf $1/$2" run without checking whether $1 and $2 were
set, resulting in "rm -rf /" being run by accident.  He reports that
Solaris therefore implemented the prevention of running "rm -rf /".
This is said at time 1:27:00 in the video.  I note this is the same
protection as GNU rm provides.  So there isn't anything for GNU rm to
implement in order to match Solaris, as it appears to be the same by
this report.
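
For reference, this is what the GNU rm failsafe looks like; the
message wording is from a recent GNU coreutils and may differ between
versions:

$ rm -rf /
rm: it is dangerous to operate recursively on '/'
rm: use --no-preserve-root to override this failsafe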

However $var1/$var2 expanding to / when those variables are not set is
a different case than /*/* expansion which has no variables and is
simply an error of usage.

> > BTW, zsh globbing doesn’t exhibit this behavior!  It seems it is only a 
> > problem in bash.
> 
> In that case, the bug (whatever it is) wouldn't be a coreutils bug.

I don't understand the comment that zsh doesn't expand the glob /*/*
and I tried it and verified that it does indeed expand that glob
sequence.

Bob





bug#52115: Suggestion: LN command should swap TARGET and LINK_NAME if LINK_NAME already exists

2021-11-29 Thread Bob Proulx
Bob Proulx wrote:
> With symbolic links the symlink contains a string.  The string could
> be pretty much anything.

The more I think about it the more I think it should say CONTENT
rather than either TARGET or SOURCE.  Because it is actually setting
the content of the symbolic link.  Therefore that seems the most
accurate.  Although VALUE also seems to have merit.

ln [OPTION]... CONTENT DEST

Bob





bug#52176: Problem with email list tags

2021-11-29 Thread Bob Proulx
Ulf Zibis wrote:
> Currently we have:
> List-Post: GNU coreutils Bug Reports 
> 
> When using "reply list" to answer to a comment of bug 12345 in a email client 
> such as Thunderbird, my reply is sent to bug-coreutils@gnu.org, but it should 
> be sent to 12...@debbugs.gnu.org
> 
> So I think, we should have:
> List-Post: GNU coreutils Bug Reports <12...@debbugs.gnu.org>
> 
> Alternatively the following tag could be added:
> Reply-To: 12...@debbugs.gnu.org

Please send comments, complaints, gripes, or suggestions about the BTS
to the help-debbugs AT gnu.org mailing list instead.  GNU Coreutils is
a user of the BTS but not a maintainer of the BTS.

Note that if a reply to bug-coreutils (or any of the BTS bug lists)
contains a subject containing "bug#52176" then the BTS will route it
to 52176 AT debbugs.gnu.org automatically.  Or at least it is supposed
to be routing it automatically.  Therefore either should actually work
correctly.

Note that if someone sends several messages with the same subject then
there is also logic in the BTS to try to route later messages to the
same bug ticket as the first message.  This is defeated if all of the
messages arrive at once.  But works if there is enough delay for the
first message to be allowed to create a ticket before subsequent
messages arrive.  Usually when we see multiple tickets created from a
user sending multiple messages it is due to them arriving into the BTS
all at the same time.  Hint for people moderating spam through
Mailman, send the first one through but pause a moment or three before
sending the follow-ups through.

Bob





bug#52115: Suggestion: LN command should swap TARGET and LINK_NAME if LINK_NAME already exists

2021-11-29 Thread Bob Proulx
Chris Elvidge wrote:
> Paul Eggert wrote:
> > Ulf Zibis wrote:
> > > I think, for beginners it would be less confusing, if the most
> > > simple form would be the first.
> > 
> > Unfortunately the simple form "ln TARGET" is quite rarely used, so
> > putting it first is likely to confuse beginners even more than what we
> > have already. Come to think of it, perhaps we should put the simple form
> > last instead of 2nd.

+1 for putting it last.

> I use 'ln -s "source"' quite a lot for linking into e.g. /usr/local/bin from
> my own $HOME/bin.

It defaults to "." as the target in that case.  I never liked that it
was allowed to be optional as I think it makes things much more
confusing than the two characters saved.
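
For example, with hypothetical paths, these two forms are equivalent;
each creates ./tool in the current directory:

$ cd /usr/local/bin
$ ln -s ~/bin/tool .    # explicit "." target
$ ln -s ~/bin/tool      # same thing; the "." target is implied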

> The real problem could be with the terminology.
> 'ln [options] TARGET [LINK_NAME]'; the TARGET is really the source, which
> obviously must exist. A TARGET is really something you aim at.

Mostly agree.  With symbolic links the symlink contains a string.  The
string could be pretty much anything.  By convention it contains the
path to another file.  (Or to another special file.  Everything is a
file.)  But it is also used to contain a small bit of information in
other cases.  Such as for lockfiles and other uses.  Therefore source
isn't quite right.  But maybe it is good enough.  Because CONTENTS
seems less good even if perhaps more accurate.

> Perhaps it should be changed to 'ln [options] source [link]'

mv calls it SOURCE and DEST.  cp calls it SOURCE and DEST.  Perhaps ln
should also call it SOURCE and DEST too for consistency?

cp [OPTION]... [-T] SOURCE DEST
mv [OPTION]... [-T] SOURCE DEST
ln [OPTION]... [-T] SOURCE DEST

I like the consistency of that.

Although I don't like that -T is not apparently an OPTION.  It's not?
Why not?  Shouldn't that synopsis form simply be these?

cp [OPTION]... SOURCE DEST
mv [OPTION]... SOURCE DEST
ln [OPTION]... SOURCE DEST

Bob





bug#52115: Suggestion: LN command should swap TARGET and LINK_NAME if LINK_NAME already exists

2021-11-28 Thread Bob Proulx
Warren Parad wrote:
> except mv(1) and cp(1) are both "FROM" and then "TO", but ln is backwards
> from this, it is "TO" then "FROM", the least the command could do is put
> these in the correct order.

But that is not correct.  The order for ln is the same as for cp and
mv in that the target getting created is the right side argument.

(Unless the -t or -T option is used to do it differently by explicit
syntax request.  Unless no target is specified in which case dot is
assumed.  I admit those two "unless" cases complicate the original
simplicity.  But the normal case is to create the right side argument
as the target of the command.)

> >  it is a one-time effort to learn the order
> Opinion, do you want proof that people can't learn this, because they
> haven't.

The target getting created is the right side argument.  If that is not
clear from the documentation then improving the documentation is
always good.
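
A minimal illustration with hypothetical names, showing that in each
case the right side argument is what gets created:

$ cp file copy-of-file          # creates copy-of-file
$ mv copy-of-file renamed-file  # creates renamed-file
$ ln -s renamed-file a-symlink  # creates a-symlink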

Let me say with some confidence that if the order were changed to
create the left argument that people would be very upset that cp and
mv created the right side argument but ln created a left side
argument!

Bob





bug#51345: dd with conv=fsync sometimes returns when its writes are still cached

2021-10-25 Thread Bob Proulx
Sworddragon wrote:
> On Knoppix 9.1 with the Linux Kernel 5.10.10-64 x86_64 and GNU Coreutils
> 8.32 I wanted to overwrite my USB Thumb Drive a few times with random data
> via "dd if=/dev/random of=/dev/sdb bs=1M conv=fsync". While it usually
> takes ~2+ minutes to perform this action dd returned once after less than
> 60 seconds which made me a bit curious.

I suggest another try using oflag=direct instead of conv=fsync.

dd if=/dev/random of=/dev/sdb bs=1M oflag=direct

And the same with progress status to show the transfer rates.

dd if=/dev/random of=/dev/sdb bs=1M oflag=direct status=progress

Here is the documentation for it.

  ‘oflag=FLAG[,FLAG]...’

 ‘direct’
  Use direct I/O for data, avoiding the buffer cache.  Note that
  the kernel may impose restrictions on read or write buffer
  sizes.  For example, with an ext4 destination file system and
  a Linux-based kernel, using ‘oflag=direct’ will cause writes
  to fail with ‘EINVAL’ if the output buffer size is not a
  multiple of 512.

Bob






bug#47476: relative date of -1 month shows the wrong month

2021-04-04 Thread Bob Proulx
Lars Nooden wrote:
> On March 29, 2021, if a relative date of '-1 month' is passed to 'date',
> then the output shows March instead of February.

The date manual includes this section on relative months.

   The fuzz in units can cause problems with relative items.  For
example, ‘2003-07-31 -1 month’ might evaluate to 2003-07-01, because
2003-06-31 is an invalid date.  To determine the previous month more
reliably, you can ask for the month before the 15th of the current
month.  For example:

 $ date -R
 Thu, 31 Jul 2003 13:02:39 -0700
 $ date --date='-1 month' +'Last month was %B?'
 Last month was July?
 $ date --date="$(date +%Y-%m-15) -1 month" +'Last month was %B!'
 Last month was June!

This exactly covers the initial bug report.  March 29, 2021 minus 1
month results in the invalid date February 29, 2021, which, 2021 not
being a leap year, does not exist.  What _should_ the result be when
the date one month ago does not exist?  The answer will mostly depend
upon the purpose for which the question was being asked.

When dealing with time in months it also depends upon what you need
done.  Suppose you want a date that is the same day of the month, one
month later or earlier.  Going from March 7th back to February 7th is
a smaller difference in days than going from June 7th back to May
7th, due to the nature of different months having different numbers
of days.  But if that were what I wanted then I would determine the
prior month and generate a new datestamp using the current day of the
month.

[[Aside:
Off the top of my head and hopefully without a trivial bug.  I welcome
corrections if I made a mistake in this.  But this is still not
completely general purpose.

$ date "+%F %T"
2021-04-04 20:50:19

$ date "+%Y-$(date --date="$(date +%Y-%m-15) -1 month" +%m)-%d %H:%M:%S"
2021-03-04 20:50:54

*HOWEVER* that still does not handle the case of the original poster's
report about what happens on March 29, 2021 minus one month?  It can't
be February 29th!  Isn't that the same as March 1st?
]]

Perhaps instead of the code using 30 day months it should use the
number of days in the current month?  Then on March 31, 2021 -1 month
since March has 31 days that would calculate February 28, 2021.  Is
that better or worse?

$ date --date="2021-03-31 12:00 + -31 days" "+%F %T"
2021-02-28 05:00:00

Potentially worse!  What happens on March 1, 2021 then?

$ date --date="2021-03-01 12:00 + -31 days" "+%F %T"
2021-01-29 05:00:00

In that case we skip over February entirely!

Chris Elvidge wrote:
> Pádraig Brady wrote:
> > The current FAQ (linked below) suggests the workaround of:
> > 
> >date --date="$(date +%Y-%m-15) -1 month" +'Last month was %B.'
> > 
> > https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#The-date-command-is-not-working-right_002e
> 
> It's noticeable that (on my system, CYGWIN-NT and/or Raspbian) 'date -d"now
> -1month"' gives a definitely wrong answer, but 'dateadd now -1mo' gives a
> somewhat more reasonable answer. dateadd is from the dateutils package,
> sometimes dadd and/or dateutils.dadd.
> 
> $ date +"%Y-%m-%d %H:%M:%S"
> 2021-03-30 10:37:00
> 
> $ date -d"now -1 month" +"%Y-%m-%d %H:%M:%S"
> 2021-03-02 09:37:17

So...  Here is the problem with "now".  Using "now" is problematic
*some* of the time.  Not all of the time, and never when you are
trying it on the command line in the middle of the day.  But there
are windows of time around DST time changes when it is problematic.
If you are getting the time now and it is the middle of the day, say
around noon, then that is far away from time changes.  But almost
every seasonal time change there is a bug report from someone whose
automated process ran right at the time change and got a surprising
result, and they file a bug that it gave them the wrong answer,
because there was no 2am that day, or maybe there were two 2ams that
day, or something.

That's why it is better to test for days using noon as a reference,
and when checking for months to test away from the change of month.
Really the 10th or the 20th would be as good as the 15th, but the
15th is in the middle of every month, which is why it ended up in the
FAQ recommendation.
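
Following that advice, a sketch of a DST-safe "one day ago" anchored
at noon:

$ date -d "$(date +%F) 12:00 -1 day" +%F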

> $ dateadd now -1mo -f"%Y-%m-%d %H:%M:%S"
> 2021-02-28 09:37:27

I don't know anything about dateadd and it is not part of GNU Coreutils.

Bob





bug#47353: Numbered backups also need kept-new-versions else will grow out of control

2021-03-24 Thread Bob Proulx
tag 47353 + notabug
close 47353
thanks

Dan Jacobson wrote:
> Or (info "(coreutils) Backup options") should "admit" that "Numbered
> backups need to be trimmed occasionally by the user, lest they fill up
> the disk."

If the user has asked for them then any decision of the disposition of
them is up to the user.  If the user fills up their storage with them
then surely the user who created them will know what they did and will
be in the best position to decide what to do.
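
That said, trimming is easy enough to script with standard tools if
wanted.  A sketch that keeps only the five newest numbered backups of
a hypothetical file.txt, relying on GNU ls -v to sort the version
numbers numerically and assuming no whitespace in the names:

$ ls -v file.txt.~*~ | head -n -5 | xargs -r rm --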

This type of thing is really both too general to document in detail
and too specific to document in detail at the same time.  It targets a
very specific thing, filling up the disk, with a very general purpose
action, copying files.  Both of which are plain actions not hidden or
subtle.  Consuming storage space by making copies is the primary
purpose of the cp command.

> And also mention in the manual that e.g., emacs has methods to trim
> these automatically, but coreutils hasn't implemented them yet.

Although cp, mv, and ln may have used the same format as emacs for
the creation of backup files, that does not mean that they *are*
emacs, or that emacs is the preferred editor for users of cp and mv,
or that knowledge of emacs is needed to use them.

I use Emacs and find it a superior editor for creating customized
domain specific editors.  But I don't think it should be referenced
from cp because the Emacs documentation is *HUGELY* more complicated.
If a new user is reading documentation on how to use cp then being
directed to climb the learning curve of Emacs would be way too much to
ask!  There is a user who I think would file a bug that it is too much
to ask if it were done that way.

The better thing to mention in relation to cp would be rm as those
would be natural siblings.  But they are actually siblings already.
So there seems no further need to cross-reference them additionally
redundantly again redundantly.

I am marking the ticket as closed as there seems nothing to actually
do here.  But as always more discussion is welcome and if it is
determined that something should be done then the ticket may be opened
again to track it.

Bob





bug#45358: bootstrap fails due to a certificate mismatch

2021-03-09 Thread Bob Proulx
Erik Auerswald wrote:
> Grigoriy Sokolik wrote:
> > I've rechecked:
> 
> I cannot reproduce the problem, the certificate is trusted by my system:
> 
> # via IPv4
> $ gnutls-cli --verbose translationproject.org < /dev/null | grep -E 'Connecting|Status'
> Connecting to '80.69.83.146:443'...
> - Status: The certificate is trusted. 
> # via IPv6
> $ gnutls-cli --verbose translationproject.org < /dev/null | grep -E 'Connecting|Status'
> Connecting to '2a01:7c8:c037:6::20:443'...
> - Status: The certificate is trusted.

I have the same results here.  Everything looks okay in the inspection
of it.

> It seems to me as if your system does not trust the used root CA.
> 
> > [...]issuer `CN=DST Root CA X3,O=Digital Signature Trust Co.'[...]
> 
> On my Ubuntu 18.04 system, I find it via symlink from /etc/ssl/certs:
> 
> $ ls /etc/ssl/certs/DST_Root_CA_X3.pem -l
> lrwxrwxrwx 1 root root 53 Mai 28  2018 /etc/ssl/certs/DST_Root_CA_X3.pem -> /usr/share/ca-certificates/mozilla/DST_Root_CA_X3.crt
> $ certtool --certificate-info < /usr/share/ca-certificates/mozilla/DST_Root_CA_X3.crt | grep Subject:
>   Subject: CN=DST Root CA X3,O=Digital Signature Trust Co.

Again same here on my Debian system.  The root certificate store for
the trust anchor is in the ca-certificates package.

Looking at my oldest system I see this is distributed as package
version 20200601~deb9u1 and includes the above file.

$ apt-cache policy ca-certificates
ca-certificates:
  Installed: 20200601~deb9u1
  Candidate: 20200601~deb9u1
  Version table:
 *** 20200601~deb9u1 500
500 http://ftp.us.debian.org/debian stretch/main amd64 Packages
500 http://ftp.us.debian.org/debian stretch-updates/main amd64 Packages
100 /var/lib/dpkg/status

Verifying that the equivalent of the ca-certificates package is
installed on your system should provide the needed trust anchors.
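
On a Debian-style system that check, and a refresh of the trust
store, would look something like this; a sketch, since package
tooling differs on other systems:

$ dpkg -l ca-certificates
$ sudo apt-get install --reinstall ca-certificates
$ sudo update-ca-certificates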

As this seems not to be a bug in Coreutils I am marking the bug as
closed with this mail.  However more discussion is always welcome.

Bob





bug#45358: bootstrap fails due to a certificate mismatch

2021-03-08 Thread Bob Proulx
Is this problem still a problem?  Perhaps it has been fixed in the
time this has been under discussion?  Because it looks okay to me.

Grigoriy Sokolik wrote:
>$ curl -v https://translationproject.org/latest/coreutils/ -o /dev/null
...
>* Connected to translationproject.org (80.69.83.146) port 443 (#0)
...
>* successfully set certificate verify locations:
>*  CAfile: /etc/ssl/certs/ca-certificates.crt
>*  CApath: none

I suspect this last line is the root cause of the problem.  There is
no CApath and therefore no root anchoring certificates are trusted.
Without that I don't see how any certificates can be trusted.

I do the same test here and see this.

$ curl -v https://translationproject.org/latest/coreutils/ -o /dev/null
...
* Connected to translationproject.org (80.69.83.146) port 443 (#0)
...
* successfully set certificate verify locations:
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs

Note the inclusion of the trusted root path.

* Server certificate:
*  subject: CN=stats.vrijschrift.org
*  start date: Mar  1 10:34:36 2021 GMT
*  expire date: May 30 10:34:36 2021 GMT
*  subjectAltName: host "translationproject.org" matched cert's
*  "translationproject.org"
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  SSL certificate verify ok.

Note that the certificate validates as okay.

Also if I simply ask openssl to validate:

$ openssl s_client -connect translationproject.org:443 \
    -CApath /etc/ssl/certs -showcerts < /dev/null
...
Verify return code: 0 (ok)

If I download all of the certificates and validate using certtool,
since you mentioned certtool I will use your example:

$ openssl s_client -connect translationproject.org:443 \
    -CApath /etc/ssl/certs -showcerts < /dev/null \
  | sed -n '/^-----BEGIN CERTIFICATE-----/,/^-----END CERTIFICATE-----/p' \
  > /tmp/translationproject.org.certs
$ certtool --verbose --verify-profile=high --verify \
    --infile=/tmp/translationproject.org.certs
Loaded system trust (127 CAs available)
Subject: CN=R3,O=Let's Encrypt,C=US
Issuer: CN=DST Root CA X3,O=Digital Signature Trust Co.
Checked against: CN=DST Root CA X3,O=Digital Signature Trust Co.
Signature algorithm: RSA-SHA256
Output: Verified. The certificate is trusted. 

Subject: CN=stats.vrijschrift.org
Issuer: CN=R3,O=Let's Encrypt,C=US
Checked against: CN=R3,O=Let's Encrypt,C=US
Signature algorithm: RSA-SHA256
Output: Verified. The certificate is trusted. 

Chain verification output: Verified. The certificate is trusted. 

Then it again validates okay.

I note that the certificate is current as of now and just recently
renewed.  It's fresh.

$ openssl s_client -connect translationproject.org:443 \
    -CApath /etc/ssl/certs -showcerts < /dev/null \
  | sed -n '/^-----BEGIN CERTIFICATE-----/,/^-----END CERTIFICATE-----/p;/^-----END CERTIFICATE-----/q' \
  | openssl x509 -noout -dates
notBefore=Mar  1 10:34:36 2021 GMT
notAfter=May 30 10:34:36 2021 GMT

Therefore I think everything is okay as far as I can tell from the
above.  Perhaps something about the site has changed to resolve a
problem since then?  Perhaps an intermediate certificate was added?

Bob





bug#45182: mktemp not created other permissions

2021-03-08 Thread Bob Proulx
close 45182
tag 45182 + notabug
thanks

Vasanth M.Vasanth wrote:
> When I create a temp file from root users using mktemp command, then it is
> not able to access other users. If the same do in other users then the
> group and user came respectively.

I see no difference in behavior of GNU Coreutils mktemp when used as a
root user or as a non-root user.

# mktemp
/tmp/tmp.7smatw2ZW5

# ls -ld /tmp/tmp.7smatw2ZW5
-rw------- 1 root root 0 Mar  8 21:56 /tmp/tmp.7smatw2ZW5

$ mktemp
/tmp/tmp.nnyNVef0wB

$ ls -ld /tmp/tmp.nnyNVef0wB
-rw------- 1 rwp rwp 0 Mar  8 21:54 /tmp/tmp.nnyNVef0wB

Therefore I am at a loss to understand the report that there are differences.

Also the purpose and intent of mktemp is to create files that are
accessible by the creating user only and not by other users and not by
other groups.  This is documented in the manual as this following.

   When creating a file, the resulting file has read and write
permissions for the current user, but no permissions for the group or
others; these permissions are reduced if the current umask is more
restrictive.

Therefore if I read your question about permissions correctly, yes
this is documented and intended behavior.

> Is this default behaviour or any flags available?

No.  The files created will always be such that the current user has
read and write permissions but no permissions for group or others.

Regarding users and groups however: the default in most modern
operating systems is that non-root, non-privileged users cannot chown
files.  That is a kernel level restriction and not a restriction of
GNU Coreutils.  If the OS allows it then chown will do it.  If the OS
does not allow it then it is the kernel that is restricting it.  The
root superuser however always has full permission for chown actions.

If you desire less strict permissions then this may easily be
accomplished by chmod'ing the file afterward, as in this example.
tmpfile=$(mktemp) || exit 1
chmod g+w "$tmpfile"

And for a root user setting up a file or directory for another process
then the root user may chown and chgrp the file too.

tmpfile=$(mktemp) || exit 1
chmod g+w "$tmpfile"
chgrp somesharedgroup "$tmpfile"

This ordering is important.  A file that is created securely may have
its permissions relaxed afterward.  But a file created with relaxed
permissions can never safely be made restrictive after the fact.
Therefore files must be strict from the start and only relaxed if
that is the desire.

Thank you for your bug report.  However as the command is operating as
intended and documented I am going to close this bug ticket.  But
please if there is additional information feel free to add it to the
ticket.  It will be read and if there is a reason then the ticket will
be opened again.

Bob





bug#45695: Date does not work for dates before 1970

2021-01-12 Thread Bob Proulx
zed991 wrote:
> On linux, I can use  date +%s --date "31 Dec 1969"
> The result is -9
> A negative number

Which is correct for dates before the epoch:

Thu, 01 Jan 1970 00:00:00 +0000

https://en.wikipedia.org/wiki/Unix_time

> But when I try it on Windows (using GNUWin32) it gives me an error -
> "invalid date"
> 
> I downloaded date for windows from this link -
> http://gnuwin32.sourceforge.net/packages/coreutils.htm
> 
> Is there any fix for Windows?

According to that page the last update of the GnuWin project was
2015-05-20, therefore one might conclude that the project, now more
than five years later, is no longer updated.

Perhaps it would be good to look for a different MS-Windows port of
the software?  The usual recommendation is to install Cygwin, which
generally is a more reliable port, although I understand that it
might be a little heavy for many users.  But whichever MS-Windows
port you find, look to see that it has been updated in the last few
years.

The GNU Project is all about the source and use on Free(dom) Software
systems.  Most of us are not using Microsoft systems, which makes it
hard for us to help.  It really needs a Microsoft person to champion
the cause and to keep that port updated.

Since this is not a bug in the GNU Coreutils software itself but in
the windows port of it I am going to go ahead and close the ticket
with this message.  But if you have updates about this please send an
update to the bug ticket as it would help us know what to say in the
future to other Microsoft users.  And other people searching the
archive will benefit from your experience with it.

Bob





bug#43828: invalid date converting from UTC, near DST

2020-10-28 Thread Bob Proulx
Martin Fido wrote:
> I have tzdata version 2020a:
> 
> $ apt-cache policy tzdata
> tzdata:
>   Installed: 2020a-0ubuntu0.16.04
>   Candidate: 2020a-0ubuntu0.16.04
> ...
> 
> $ zdump -v Australia/Sydney | grep 2020
> Australia/Sydney  Sat Apr  4 15:59:59 2020 UT = Sun Apr  5 02:59:59 2020 AEDT isdst=1 gmtoff=39600
> Australia/Sydney  Sat Apr  4 16:00:00 2020 UT = Sun Apr  5 02:00:00 2020 AEST isdst=0 gmtoff=36000
> Australia/Sydney  Sat Oct  3 15:59:59 2020 UT = Sun Oct  4 01:59:59 2020 AEST isdst=0 gmtoff=36000
> Australia/Sydney  Sat Oct  3 16:00:00 2020 UT = Sun Oct  4 03:00:00 2020 AEDT isdst=1 gmtoff=39600

I see this is Ubuntu 16.04.  I found a 16.04 system and I was able to
recreate this exact problem there.  However trying this on an 18.04
system, the date is no longer invalid.

Bob





bug#43828: invalid date converting from UTC, near DST

2020-10-15 Thread Bob Proulx
Martin Fido wrote:
> I seem to have found a bug in the date utility, converting from UTC
> to Sydney time. It returns invalid date for what should be perfectly
> valid:
> 
> $ TZ='Australia/Sydney' date -d '2020-10-04T02:00:00Z'
> date: invalid date ‘2020-10-04T02:00:00Z’
> 
> $ TZ='Australia/Sydney' date -d '2020-10-04T02:59:59Z'
> date: invalid date ‘2020-10-04T02:59:59Z’

This is more likely to be in the tzdata zoneinfo database rather than
in date itself.  Could you please report what version of tzdata you
have on your system?  Current on my system is tzdata version 2020b-1.

And also this information too.

$ zdump -v Australia/Sydney | grep 2020
Australia/Sydney  Sat Apr  4 15:59:59 2020 UT = Sun Apr  5 02:59:59 2020 AEDT isdst=1 gmtoff=39600
Australia/Sydney  Sat Apr  4 16:00:00 2020 UT = Sun Apr  5 02:00:00 2020 AEST isdst=0 gmtoff=36000
Australia/Sydney  Sat Oct  3 15:59:59 2020 UT = Sun Oct  4 01:59:59 2020 AEST isdst=0 gmtoff=36000
Australia/Sydney  Sat Oct  3 16:00:00 2020 UT = Sun Oct  4 03:00:00 2020 AEDT isdst=1 gmtoff=39600

> Note DST in Sydney changed 10 hours earlier:
> 
> $ TZ='Australia/Sydney' date -d '2020-10-03T15:59:59Z'
> Sunday 4 October  01:59:59 AEST 2020
> 
> $ TZ='Australia/Sydney' date -d '2020-10-03T16:00:00Z'
> Sunday 4 October  03:00:00 AEDT 2020

Yes.  And I think that is suspicious.  Hopefully the zdump information
will show that database is in need of an update and that is the root
of the problem.  I suspect that DST was moved at some point in time.

> I have version 8.25:
> 
> $ date --version
> date (GNU coreutils) 8.25

I tried this on 8.13, 8.23, 8.26, and 8.32 and was unable to reproduce
the problem on any of those versions of date.  But I suspect the root
cause is in the tzdata zoneinfo database.

Bob





bug#43657: rm does not delete files

2020-10-15 Thread Bob Proulx
close 43657
thanks

Paul Eggert wrote:
> On 9/27/20 8:58 PM, Amit Rao wrote:
> > There's a limit? My first attempt didn't use a wildcard; i attempted to 
> > delete a directory.
> 
> 'rm dir' fails because 'rm' by default leaves directories alone.
> 
> > My second attempt was rm -rf dir/*
> 
> If "dir" has too many files that will fail due to shell limitations that
> have nothing to do with Coreutils. Use 'rm -rf dir' instead.

The only reason I can guess that rm -rf dir/* might fail would be
"argument list too long", which has an FAQ entry.  I feel confident
this was the problem you experienced.


https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Argument-list-too-long

In any case, in order to establish the background it is necessary to
post the exact command that you used and also the error message that
resulted.  Without that exact information it is not possible to
establish the root cause of the behavior, whether it is a bug, or
whether it is kernel behavior.

Also if rm were to fail then extremely useful would be strace
information so that we could see the exact reason for the failure.

If this is the ARG_MAX limitation then it does not need rm to
reproduce the issue.  One can use any command.  Using echo should be
safe enough.

echo dir/* >/dev/null

In any case the suggested strategy of using "rm -rf dir" is very good
and very simple here.  It avoids that problem entirely.

Because I feel very confident that the issue is the kernel limitation
of ARG_MAX I am going to close this ticket.  However if you have
further information please reply and add it to the ticket.  It can
always be opened again if further information points to a bug to be
tracked.

Bob





bug#43162: chgrp clears setgid even when group is not changed

2020-09-20 Thread Bob Proulx
Paul Eggert wrote:
> Karl Berry wrote:
> > I was on centos7.
> > 
> >  (I don't observe your problem on my Fedora 31 box, for example).
> > 
> > Maybe there is hope for a future centos, then.

Just another few data points...

I was able to recreate this issue on a CentOS 7 system running in a
tmpfs filesystem.  So that's pretty much pointing directly at the
Linux kernel behavior independent of file system type.

Meanwhile...  I can also recreate this on a Debian system with a Linux
4.9 kernel in 9 Stretch.  But not on 10 Buster Linux 4.19.  But once
again not on an earlier Linux 3.2 kernel.  3.2 good, 4.9 bad, 4.19 good.

Therefore this seems to be a Linux behavior that started out the
desired way, flipped to the annoying way, and then apparently flipped
back again later.  Anyway, just a few data points.

Bob






bug#43541: minor bug in GNU coreutils 8.30: pwd --version doesn't work

2020-09-20 Thread Bob Proulx
tag 43541 + notabug
close 43541
thanks

Nikolay wrote:
> GNU coreutils 8.30

Coreutils version 8.30.  Gotcha.

> $ pwd --version
> bash: pwd: --: invalid option
> pwd: usage: pwd [-LP]

But that is not the GNU Coreutils pwd program.  That is the shell
builtin pwd.  In this case it is bash.  And bash does not document
either a --version or --help option.

$ type pwd
pwd is a shell builtin

$ help pwd
pwd: pwd [-LP]
Print the name of the current working directory.

Options:
  -L    print the value of $PWD if it names the current working
        directory
  -P    print the physical directory, without any symbolic links

By default, `pwd' behaves as if `-L' were specified.

Exit Status:
Returns 0 unless an invalid option is given or the current directory
cannot be read.

Since this isn't a coreutils program I am going to attend to the
housekeeping and close the bug ticket.  But please let's continue
discussion here for additional questions or comments.

This is actually an FAQ.


https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#I-am-having-a-problem-with-kill-nice-pwd-sleep-or-test_002e


> $ man pwd
> 
> ...
> 
> --version
>   output version information and exit

That is the man page for Coreutils pwd.  And if you want to use the
external command then you must avoid the builtin.

$ type -a pwd
pwd is a shell builtin
pwd is /bin/pwd

$ env pwd --version
pwd (GNU coreutils) 8.32

Use of 'env' in this way forces searching PATH for the named program
regardless of shell and avoids builtins.
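
Invoking the binary by its full path works as well:

$ /bin/pwd --version
pwd (GNU coreutils) 8.32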

Hope this helps! :-)

Bob





bug#42440: bug with rm

2020-07-21 Thread Bob Proulx
tags 42440 + notabug
thanks

The bug reporter wrote:
> sometimes,rm can't delete the file.
> but when using rm -rf + file .
> the file can be deleted.

This does not sound like a bug in the rm command.  Therefore I am
tagging this as such.  If you have follow up information and this
turns out to be an actual bug then we can reopen the bug report.

Unfortunately there is not enough information in the report to know
exactly the case that you are talking about.  For example I don't know
if you are talking about a literal "+" in that line or not.  I will
assume that you are since it is there.

There are a couple of FAQ entries for rm.  Either of these might be
the problem.

  
https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#How-do-I-remove-files-that-start-with-a-dash_003f
  
https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Why-doesn_0027t-rm-_002dr-_002a_002epattern-recurse-like-it-should_003f

You might have experienced either of those problems.  Or a different
problem.  We can't tell.

> sometimes,rm can't delete the file.

There are two main cases.  One is that if the file is not writable by
the user then 'rm' will check for this and ask the user for
confirmation.

rwp@angst:/tmp/junk$ touch file1
rwp@angst:/tmp/junk$ chmod a-w file1
rwp@angst:/tmp/junk$ rm file1
rm: remove write-protected regular empty file 'file1'? n
rwp@angst:/tmp/junk$ ls -l file1
-r--r--r-- 1 bob bob 0 Jul 21 23:52 file1

The -f option will force it without prompting.

rwp@angst:/tmp/junk$ rm -f file1
rwp@angst:/tmp/junk$ ls -l file1
ls: cannot access 'file1': No such file or directory

This is a courtesy confirmation, because the permissions on the file
are not important when it comes to removing a directory entry.  A
file is really just an entry in the directory containing it.
Removing a file simply removes the entry from the directory.  When
the link count of the file reaches zero the file system reclaims the
storage.  The file system is a "garbage collection" system using
reference counting.

https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)
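
The reference count is visible as the link count in the second column
of ls -l output; a small demonstration with hypothetical files:

$ touch file-a
$ ln file-a file-b
$ ls -l file-a
-rw-rw-r-- 2 rwp rwp 0 Jul 22 00:10 file-a
$ rm file-b
$ ls -l file-a
-rw-rw-r-- 1 rwp rwp 0 Jul 22 00:10 file-a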

Therefore the only permission needed to remove a file is write
permission to the directory containing it.

rwp@angst:/tmp/junk$ touch file2
rwp@angst:/tmp/junk$ ls -ld . file2
drwxrwxr-x 3 rwp rwp 100 Jul 21 23:56 ./
-rw-rw-r-- 1 rwp rwp   0 Jul 21 23:56 file2
rwp@angst:/tmp/junk$ chmod a-w .
rwp@angst:/tmp/junk$ ls -ld . file2
dr-xr-xr-x 3 rwp rwp 100 Jul 21 23:56 ./
-rw-rw-r-- 1 rwp rwp   0 Jul 21 23:56 file2

This creates a file.  The file is writable.  But I have changed the
directory containing it not to be writable.  This prevents removing
the file, because the directory is not writable.

rwp@angst:/tmp/junk$ rm file2
rm: cannot remove 'file2': Permission denied
rwp@angst:/tmp/junk$ rm -f file2
rm: cannot remove 'file2': Permission denied
rwp@angst:/tmp/junk$ rm -rf file2
rm: cannot remove 'file2': Permission denied
rwp@angst:/tmp/junk$ ls -ld . file2
dr-xr-xr-x 3 rwp rwp 100 Jul 21 23:56 ./
-rw-rw-r-- 1 rwp rwp   0 Jul 21 23:56 file2

In order to remove the file we must have write permission to the
directory.  Adding write permission to the directory allows removing
the file.

rwp@angst:/tmp/junk$ chmod ug+w .
rwp@angst:/tmp/junk$ rm file2
rwp@angst:/tmp/junk$ ls -ld file2
ls: cannot access 'file2': No such file or directory

This problem expands when files are nested many directories deep and
the directories are not writable.

rwp@angst:/tmp/junk$ mkdir -p dir1 dir1/dir2 dir1/dir2/dir3
rwp@angst:/tmp/junk$ touch dir1/dir2/dir3/file3
rwp@angst:/tmp/junk$ chmod -R a-w dir1
rwp@angst:/tmp/junk$ find dir1 -ls
 69649132  0 dr-xr-xr-x   3 rwp  rwp  60 Jul 22 00:00 dir1
 69649133  0 dr-xr-xr-x   3 rwp  rwp  60 Jul 22 00:00 dir1/dir2
 69649134  0 dr-xr-xr-x   2 rwp  rwp  60 Jul 22 00:00 dir1/dir2/dir3
 69650655  0 -r--r--r--   1 rwp  rwp   0 Jul 22 00:00 dir1/dir2/dir3/file3

That sets up the test case.  None of the directories are writable.
Therefore we cannot remove any of them.  The directory holding the
entries must be writable.

rwp@angst:/tmp/junk$ rm -rf dir1
rm: cannot remove 'dir1/dir2/dir3/file3': Permission denied

Even using 'rm -rf' does not work, and should not work, because the
directories are not writable.

In order to remove these files the directories must be made writable.

rwp@angst:/tmp/junk$ chmod -R u+w dir1
rwp@angst:/tmp/junk$ rm -rf dir1
rwp@angst:/tmp/junk$ ls -ld dir1
ls: cannot access 'dir1': No such file or directory

Hopefully this helps you understand how directory entries work: that
the directory holding an entry (either a file or another directory)
must be writable in order to remove the entry, how to add write
permission, how to remove a single file, and how to remove a
directory tree.

bug#42034: option to truncate at end or should that be default?

2020-06-24 Thread Bob Proulx
L A Walsh wrote:
> I allocated a large file of contiguous space (~3.6T), the size of a disk
> image I was going to copy into it with 'dd'.  I have the disk image
> 'overwrite' the existing file, in place ...

It's possible that you might want to be rescuing data from a failing
disk or doing other surgery upon it.  Therefore I want to mention
ddrescue here.

  https://www.gnu.org/software/ddrescue/

Of course it all depends upon the use case but ddrescue is a good tool
to have in the toolbox.  It might be just the right tool.

Take for example a RAID1 image on two failing drives that should be
identical but both are reporting errors.  If the failures do not
overlap then ddrescue can be used to merge the successful reads from
those two images producing one fully correct image.
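
For that RAID1 case the merge might look something like the following
sketch (the device and file names are illustrative, not from the
original report; both passes share the same map file, so the second
pass only reads the sectors the first pass could not recover):

  # First pass: salvage everything readable from the first mirror.
  ddrescue /dev/sda rescued.img rescued.map
  # Second pass: same image and map file; only the still-missing
  # sectors are retried, this time from the second mirror.
  ddrescue /dev/sdb rescued.img rescued.map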

Bob





bug#41792: Acknowledgement (dd function – message: "No boot sector on USB device")

2020-06-24 Thread Bob Proulx
close 41792
thanks

Since the discussion has moved away from anything GNU Coreutils
related and doesn't seem to be reporting any bugs in any of the
utilities I am going to close the bug ticket.  But discussion may
continue here regardless.  If we see a dd bug we can re-open the
ticket.

Ricky Tigg wrote:
> The difference of device path is due to the fact that the USB media was
> plugged out after the write-operation was achieved on the Linux computer
> then plugged into a computer –Asus– whose Windows OS has to be restored,
> then plugged back to the same computer but to a *different* USB port. It's
> safe to open the present issue-ticket.

Hmm...  There is no reason that the Linux kernel would renumber the
device simply because it was removed and inserted again.  Therefore me
thinks that it was not cleanly removed.  Me thinks that something in
the system had mounted it keeping it busy preventing it from cleanly
being ejected.  This "something" may have been an automatic mounting
of it as many Desktop Environments unfortunately default to doing.
IMNHO automated mounting is a bad idea and should never be enabled by
default.

> *Source media*:
> https://www.microsoft.com/en-us/software-download/windows10ISO

The source media doesn't matter to GNU utilities.  The 'dd' utility
treats files as raw bytes and does not treat MS-Windows-10 ISO images
any differently than any other raw data.  It might be that or pictures
of your dog or random cosmic noise recorded from your radio.  It
doesn't matter.  It's just data.

Your Desktop Environment may take action, however.  It is possible that
your DE will probe the device, detect that it is an ISO image, and
automatically mount that ISO image.  That's bad.  But that's your
Desktop Environment and unrelated to 'dd'.  And it has always been a
bad idea, regardless of how many people do it.

> *Rufus v4.1.4* – I couldn't use it since The Windows OS installed is
> missing some system's files. Will convert it to fit on Fedora at release of
> version 33 which will update the uniformly mingw component and thus
> mingw64-headers which is old and is the cause of a known issue.
>
> I wrote the disc image as well using those tools then booted the USB device
> having the disc image written on.:
>
> *Fedora Media Writer v4.1.4* – Officially does not support Microsoft
> Windows disc images. I did not know that before writing.

My first thought was, huh?  Why would Fedora Media Writer not treat
files as raw files?  My second thought was that the question was for a
Fedora Media Writer mailing list as this bug ticket is not the place
to be discussing other random projects.

> *Unetbootin v677* – It writes partially the disc image thus the installer
> is operational partially. Issue was already reported by someone on Git.
>
> *Woeusb v3.3.1* – Installer is operational on BIOS but not on EFI systems.
> Issue was already reported by someone on Git.
>
> *Balena Etcher v1.5.98* x64 as AppImage format – The device is not listed
> at boot.

Gosh.  Reading your report makes MS-Windows seem like such a terrible
system!  I read about all of your pain of working on it.  You have
tried all of these tools and nothing is working for you.  Reading
these types of reports makes me thankful I am working on a Free(dom)
Software operating system where things Just Work!

Meanwhile...  Let's get back to your information about 'dd'.

> $ file -b Win10_2004_Finnish_x64.iso
> ISO 9660 CD-ROM filesystem data 'CCCOMA_X64FRE_FI-FI_DV9' (bootable)

That looks like you were successfully able to write the ISO image to
the device.  Looks okay.

> *Component*: coreutils.x86_64  8.32-4.fc32.; *OS*: Linux Fedora

Good.

> Source of file:
> https://www.microsoft.com/en-us/software-download/windows10ISO
>
> Disc image file
> - checked against its SHA-256 checksum was correct
> - written successfully with that command:
> # dd if=Win10_2004_Finnish_x64.iso of=/dev/sdc bs=4M oflag=direct 
> status=progress && sync

I don't see any error messages.  That's good.

The oflag=direct should use direct I/O.  Which means that the 'sync'
shouldn't matter since there should be no file system buffer to flush.
It will simply flush other unrelated buffers.  Won't hurt though.

The bs size of 4M seems very small to me, especially for use with a
NAND flash USB storage device.  I would select a much larger size.  I
would probably use 64M, which is likely to divide evenly into your
original ISO image, but that should be verified.
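
For illustration, the same command with a larger block size might look
like this sketch (verify that 64M actually suits your image and device):

  dd if=Win10_2004_Finnish_x64.iso of=/dev/sdc bs=64M oflag=direct status=progress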

> Once written, the partition is as follows:
> $ mount | fgrep /run/media/$USER
> /dev/sdb on /run/media/yk/CCCOMA_X64FRE_FI-FI_DV9 type udf
> (ro,nosuid,nodev,relatime,uid=1000,gid=1000,iocharset=utf8,uhelper=udisks2)

WHY is this mounted?  That seems like a problem.

You said that the device was removed and replaced and went from sdc to
sdb?!  Probably because it was mounted.

This feels like the root cause of all of your problems.  It feels to
me that something is automatically mounting the device.  That's bad.

bug#41657: md5sum: odd escaping for input filename \

2020-06-24 Thread Bob Proulx
close 41657
thanks

No one else has commented therefore I am closing the bug ticket.  But
the discussion may continue here.

Michael Coleman wrote:
> Thanks very much for your prompt reply.  Certainly, if this is
> documented behavior, it's not a bug.  I would have never thought to
> check the documentation as the behavior seems so strange.

I am not always so generous about documented behavior *never* being a
bug. :-)

> If I understand correctly, the leading backslash in the first field
> is an indication that the second field is escaped.  (The first field
> never needs escapes, as far as I can see.)

Right.  But it is available to clue in md5sum and other consumers that
the file name was an "unsafe" file name and has been escaped there.

> Not sure I would have chosen this, but it can't really be changed
> now.  But, I suspect that almost no real shell script would deal
> with this escaping correctly.  Really, I'd be surprised if there
> were even one example.  If so, perhaps it could be changed without
> trouble.

Let's talk about the shell scripting part.  Why would this ever need
to be parsed in a shell script?  And if it were, then that is precisely
where the escaping would matter, due to the file name!

Your own example was a file name that consisted of a single
backslash.  Since the backslash is the shell escape character then
handling that in a shell script would require escaping it properly
with a second backslash.

I will suggest that the primary use for the *sum utility output is as
input to the same utility later to check the content for differences.
That's arguably the primary use of it.

There are also cases where we will want to use the *sum utilities on a
single file.  That's fine.  I think the problematic case here might be
a usage like this usage.

  filename="\\"
  sum=$(md5sum "$filename" | awk '{print$1}')
  printf "%s\n" "$sum"
  \d41d8cd98f00b204e9800998ecf8427e

And then there is that extra backslash at the start of the hash.
Well, yes, that is unfortunate.  But in this case we already have the
filename in a variable and don't want the filename from md5sum.  This
is very similar to portability problems between different versions of
'wc' and other utilities too.  (Some 'wc' utils print leading spaces
and some do not.)

As you already deduced if md5sum does not have a file name then it
does not know if it is escaped or not.  Reading standard input instead
doesn't have a name and therefore "-" is used as a placeholder as per
the tradition.

  filename="\\"
  sum=$(md5sum < "$filename" | awk '{print$1}')
  printf "%s\n" "$sum"
  d41d8cd98f00b204e9800998ecf8427e

And because this is discussion I will note that the name is just one
of the possible names to a file.  Let's hard link it to a different
name.  And of course symbolic links are the same too.  A name is just
a pointer to a file.

  ln "$filename" foo
  md5sum foo
  d41d8cd98f00b204e9800998ecf8427e  foo

But I drift...

I think it likely you have already educated your people about the
problems and the solution was to read from stdin when the file name is
potentially untrusted "tainted" data.  (Programming languages often
refer to unknown untrusted data as "tainted" data for the purpose of
tracking which actions are safe upon it, when taint checking is
enabled.)  Therefore if the name is unknown then it is safer to avoid
the name and use standard input.

And I suggest the same with other utilities such as 'wc' too.
Fortunately wc is not used to read back its own input.  Otherwise I am
sure someone would suggest that it would need the same escaping done
there too.  Example that thankfully does not actually exist:

  $ wc -l \\
  \0 \\

I am sure that if such a change were made it would result in large,
widespread breakage.  Let's hope that never happens.

Bob





bug#41657: md5sum: odd escaping for input filename \

2020-06-01 Thread Bob Proulx
Hello Michael,

Michael Coleman wrote:
> $ true > \\
> $ md5sum \\
> \d41d8cd98f00b204e9800998ecf8427e  \\
> $ md5sum < \\
> d41d8cd98f00b204e9800998ecf8427e  -

Thank you for the extremely good example!  It's excellent.

> The checksum is not what I would expect, due to the leading
> backslash.  And in any case, the "\d" has no obvious interpretation.
> Really, I can't imagine ever escaping the checksum.

As it turns out this is documented behavior.  Here is what the manual says:

 For each FILE, ‘md5sum’ outputs by default, the MD5 checksum, a
  space, a flag indicating binary or text input mode, and the file name.
  Binary mode is indicated with ‘*’, text mode with ‘ ’ (space).  Binary
  mode is the default on systems where it’s significant, otherwise text
  mode is the default.  Without ‘--zero’, if FILE contains a backslash or
  newline, the line is started with a backslash, and each problematic
  character in the file name is escaped with a backslash, making the
  output unambiguous even in the presence of arbitrary file names.  If
  FILE is omitted or specified as ‘-’, standard input is read.

Specifically it is this sentence.

  Without ‘--zero’, if FILE contains a backslash or newline, the line
  is started with a backslash, and each problematic character in the
  file name is escaped with a backslash, making the output unambiguous
  even in the presence of arbitrary file names.

And so the program is behaving as expected.  Which I am sure you will
not be happy about, since you filed this bug report about it.

Someone will correct me but I think the thinking is that the output of
md5sum is most useful when it can be checked with md5sum -c and
therefore the filename problem needed to be handled.  The trigger for
this escapes my memory.  But if you were to check the output with -c
then you would find this result with your test case.

  $ md5sum \\ | md5sum -c
  \: OK
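
Conversely, when unescaped output is wanted, the --zero option from the
manual text quoted above disables the escaping entirely.  A small
sketch, with the NUL line terminator made visible by cat -v:

  $ md5sum --zero \\ | cat -v
  d41d8cd98f00b204e9800998ecf8427e  \^@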

And note that this applies to the other *sum programs too.

  The commands sha224sum, sha256sum, sha384sum and sha512sum compute
  checksums of various lengths (respectively 224, 256, 384 and 512
  bits), collectively known as the SHA-2 hashes. The usage and options
  of these commands are precisely the same as for md5sum and
  sha1sum. See md5sum invocation.

> (Yes, my users are a clever people.)

  I am so clever that sometimes I don't understand a single word of what I am saying -- Oscar Wilde

:-)

Bob





bug#37702: Suggestion for 'df' utility

2020-06-01 Thread Bob Proulx
Paul Eggert wrote:
> So I'd prefer having 'df' just do the "right" thing by default, and
> to have an option to override that. The "right" thing should be to
> ignore all these pseudofilesystems that hardly anybody cares about.

+1!  Which I thought I would say because often I am a status quo type
of person.  But this is clearly needed.  Hardly a day goes by that I
don't hear swearing from people about the current extremely noisy and
hard to use df output in the environment of dozens of pseudo file
systems.  And I don't think this will break legacy and scripted use.

Bob





bug#41554: chmod allows removing x bit on chmod without a force flag, which can be inconvenient to recover from

2020-05-29 Thread Bob Proulx
tag 41554 + notabug
close 41554
thanks

Will Rosecrans wrote:
> Based on an inane interview question that was discussed here on Twitter:
> https://twitter.com/QuinnyPig/status/1265286980859908102

It's an interview question.  The purpose of this type of question is
never a practical existing problem but is instead to create a unique,
unusual, and unlikely to have been previously experienced problem for
discussion with the candidate.  To see how the candidate thinks about
problems like this.  To see if they give up immediately or if they
persevere on.  To see if they try to use available resources such as
discussing the problem with the interviewer.  It's a method to see the
candidate's problem solving skills in action.  If the candidate says,
here is the canonical correct solution, then the interviewer knows
that the candidate has seen this question before, the interviewer will
have learned nothing about the candidates problem solving skills, and
will simply move on to another question continuing to try to assess this.

I am not particularly fond of interviewers that fish for a particular
answer.  Better when the interviewer knows they are looking for an
open ended discussion.  The goal is assessing the candidate's problem
solving ability not rote memorization of test prep questions and
answers.

It is easy to say, oh, we will simply have the program avoid changing
itself, since that would almost never be desirable.  But "almost never"
means it is sometimes desirable.  And though easy to say, it is
actually very hard to program without creating new bugs.  I might say
impossible.

If this particular case were to be modified in the program the only
results would be that the interviewer would need to look for a
different inane, unique, unusual, and unlikely to have been
experienced situation to put the candidate in.  But along the way the
program would have acquired a bit of cruft.  It would be an unnatural
growth on the program source.  It would forever need testing.  It adds
complexity.  It would likely be the source of an actual real world
bug.  As opposed to this thought-experiment situation.

> "chmod a-x $(which chmod)" not a particularly likely thing for a user to
> try to do directly, but it is conceivable for some sort of script to
> attempt it by accident because of a bug, and it would make the system
> inconvenient to recover.  Since it's almost never a desirable operation,
> chmodding chmod itself could simply fail unless something like --force is
> supplied.  The underlying safety logic is similar to that behind the
> existing "--(no-)preserve-root"

There are an infinite number of ways for someone to program a
mistake.  Trying to enumerate them all in a program to prevent them is
one of them.

Bob





bug#41518: Bug in od?

2020-05-29 Thread Bob Proulx
Yuan Cao wrote:
> > https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#The-_0027od-_002dx_0027-command-prints-bytes-in-the-wrong-order_002e
> 
> Thanks for pointing me to this documentation.
> 
> It just feels strange because the order does not reflect the order of the
> characters in the file.

It feels strange in the environment *today*.  But in the 1970's when
'od' was written it was perfectly natural on the PDP-11 to print
out the native machine word in the *native word order* of the PDP-11.
During that time most software operated on the native architecture and
the idea of being portable to other systems was not yet common.

The PDP-11 is a 16-bit word machine.  Therefore what you are seeing
with the 2-byte integer and the order it is printed is the order that
it was printed on the PDP-11 system.  And has remained unchanged to
the present day.  Because it can't change without breaking all
historical use.

For anyone using od today the best way to use -x is -tx1 which prints
bytes in a portable order.  Whenever you think to type in -x use -tx1
instead.  This avoids breaking historical use and produces the output
that you are wanting.
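
A quick sketch of the difference, assuming a little-endian machine such
as x86:

  $ printf 'AB' | od -x
  0000000 4241
  0000002
  $ printf 'AB' | od -tx1
  0000000 41 42
  0000002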

> I think it might have been useful to get the "by word" value of the file if
> you are working with a binary file historically. One might have stored some
> data as a list of shorts. Then, we can easily view the data using "od -x
> data_file_name".
> 
> Since memory is so cheap now, people are probably using just using chars
> for text, and 4 byte ints or 8 byte ints where they used to use 2 byte ints
> (shorts) before. In this case, the "by word" order does not seem to me to
> be as useful and violates the principle of least astonishment needlessly.

But changing the meaning of options to a command is a hard problem and
cannot be done without breaking a lot of existing use.  The better way
is not to try.  The options to head and tail changed an eon ago, and
yet just in the last week I ran across a posting where that option
change still bit someone.

And since there is no need for any breaking change it is better not to
do it.  Simply use the correct options for what you want.  -tx1 in
this case.

> It might be interesting to change the option to print values by double word
> or quadword instead or add another option to let the users choose to print
> by double word or quadword if they want.

And the size of 16 bits was a good value in yesteryear.  32 bits
has been a good size for some years.  Now 64 bits is the most common
size.  The only way to win is not to play.  Better to state the size
explicitly.  And IMNHO the best size is 1, regardless of architecture.

  od -Ax -tx1z -v

Each of those options have been added over the years and each changes
the behavior of the program.  Each of those would be a breaking change
if they were made the default.  Best to ask for what you want explicitly.

I strongly recommend https://www.ietf.org/rfc/ien/ien137.txt as
required reading.

Bob





bug#41518: Bug in od?

2020-05-28 Thread Bob Proulx
A little more information.

Pádraig Brady wrote:
> Yuan Cao wrote:
> > I recently came across the following behavior.
> > 
> > When using "--traditional x2" or "-x" option, it seems the order of hex
> > code output for the characters is pairwise reversed (if that's the correct
> > way of describing it).

‘-x’
 Output as hexadecimal two-byte units.  Equivalent to ‘-t x2’.

Outputs 16-bit integers in the *native byte order* of the machine.
Which may be either big-endian or little-endian depending on the
machine.  Not portable.  Depends upon the machine it is run upon.

> If you want to hexdump independently of endianess you can:
> 
>   od -Ax -tx1z -v

The -tx1 option above is portable because it outputs 1-byte units
instead of 2-byte units which is independent of endianess.

This is the FAQ entry for this topic.

  https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#The-_0027od-_002dx_0027-command-prints-bytes-in-the-wrong-order_002e

Bob





bug#41001: mkdir: cannot create directory ‘test’: File exists

2020-05-05 Thread Bob Proulx
taehwan jeoung wrote:
> Can this error message be clarified? The directory already exists, it is 
> not a file.

That is incorrect.  Directories are files.  FIFOs are files.  Device
nodes are files.  Symlinks are files.  Network sockets are files.
They are all files.  Therefore it is not incorrect to say that a file
already exists.  Directories are files.

We have all agreed that if a better error message were provided then
that would be an improvement.  We agree with you.  We would do it if
it were within the power of mkdir(1) to do it.  But it isn't.
Therefore we can't.

> lib/mkdir-p.c:200 contains this line of code that triggers below:-
> 
> error (0, mkdir_errno, _("cannot create directory %s"), quote (dir));
> 
> As it's easy enough to know that the reason mkdir fails is because 
> 'test' a directory that already exists.

That is also incorrect.  Since that information is not provided at the
time of the action it can only be inferred by implication later.  But
at the time of the failure return it cannot be known unless the kernel
provides that information.  Later in time things might have changed.

> Easy enough to check with stat() and S_ISDIR(sb.st_mode)

Incorrect.  Checking *later* with stat() does not provide the reason
that the earlier mkdir(2) failed.  It provides a guess of something
that might be the reason.  Maybe.  Or it maybe not.  Things may have
changed later in time and the guess made later might not be the
correct reason.  Reporting that as if it were would be a worse bug.
That checking later in time after the mkdir has failed is what
introduces the race condition that we have been talking about.  Please
do not ignore that critically important point.

> Can this be changed? Maybe I can make a patch for it.

Sigh.  Ignoring the reasons why this is a bad idea is not helpful.

Bob





bug#41001: mkdir: cannot create directory ‘test’: File exists

2020-05-04 Thread Bob Proulx
Jonny Grant wrote:
> Paul Eggert wrote:
> > Jonny Grant wrote:
> > > Is a more accurate strerror considered unreliable?
> > > 
> > > Current:
> > > mkdir: cannot create directory ‘test’: File exists
> > > 
> > > Proposed:
> > > mkdir: cannot create directory ‘test’: Is a directory
> > 
> > I don't understand this comment. As I understand it you're proposing a 
> > change to
> > the mkdir command not a change to the strerror library function, and the 
> > change
> > you're proposing would introduce a race condition to the mkdir command.
> 
> As the mkdir error returned to the shell is the same, I don't feel the
> difference between the words "File exists" and "Is a directory" on the
> terminal can be considered a race condition.

I read the message thread carefully and the proposal was to add an
additional non-atomic stat(2) call to the logic.  That sets up the
race condition.

The difference in the words of the error string is not the race
condition.  The race condition is created when trying to stat(2) the
file to see why it failed.  That can only be done as a separate
action.  That cannot be an atomic operation.  That can only create a
race condition.

For the low level utilities it is almost always a bad idea to layer in
additional system calls that are not otherwise there.  Doing so almost
always creates additional bugs.  And then there will be new bug
reports about those problems.  And those will be completely valid.

Try this experiment on your own.

  /tmp$ strace -e trace=mkdir mkdir foodir1
  mkdir("foodir1", 0777)  = 0
  +++ exited with 0 +++

  /tmp$ strace -e trace=mkdir mkdir foodir1
  mkdir("foodir1", 0777)  = -1 EEXIST (File exists)
  mkdir: cannot create directory ‘foodir1’: File exists
  +++ exited with 1 +++

The first mkdir("foodir1", 0777) call succeeded.  The second
mkdir("foodir1", 0777) call failed, returned -1, and set errno =
EEXIST; EEXIST is the error number for "File exists".

Note that this output line:

  mkdir("foodir1", 0777)  = -1 EEXIST (File exists)

That line was entirely reported by the 'strace' command and is not any
code related to the Coreutils mkdir command.  The strace command
reported the same "File exists" message as mkdir did later, due to the
EEXIST error code.

Let's try the same experiment with a file.  And also with a pipe and a
character device too.

  /tmp$ touch file1

  /tmp$ strace -e trace=mkdir mkdir file1
  mkdir("file1", 0777)= -1 EEXIST (File exists)
  mkdir: cannot create directory ‘file1’: File exists
  +++ exited with 1 +++

  /tmp$ mkfifo fifo1

  strace -e trace=mkdir mkdir fifo1
  mkdir("fifo1", 0777)= -1 EEXIST (File exists)
  mkdir: cannot create directory ‘fifo1’: File exists
  +++ exited with 1 +++

  /tmp$ sudo mknod char1 c 5 0

  /tmp$ strace -e trace=mkdir mkdir char1
  mkdir("char1", 0777)= -1 EEXIST (File exists)
  mkdir: cannot create directory ‘char1’: File exists
  +++ exited with 1 +++

And so we see that the kernel is returning the same EEXIST error code
for *all* cases where a file previously exists.  And it is correct
because all of those are files.  Because directories are files, pipes
are files, and files are files.  Everything is a file.  Therefore
EEXIST is a correct error message.

In order to correctly change the message being reported the change
should be made in the kernel so that the kernel, which has the
information at that time atomically, could report an error providing
more detail than simply EEXIST.

You have proposed that mkdir add a stat(2) system call to extract this
additional information.

> as it's easy enough to call stat() like other package maintainers
> do, as you can see in binutils.

*That* stat() addition creates the race condition.  Adding a stat()
call cannot be done atomically.

It would need to be done either before the mkdir(), after the mkdir(),
or both before and after.  Let's see how that can go wrong.  Let's say
we stat(), does not exist, we continue with mkdir(), fails with EEXIST
because another process got there first.  So then we stat() again and
by that time the other process has already finished processing and
removed the directory again.  A system call trace would look like
this.

  lstat("foodir1", 0x7ffcafc12800)   = -1 ENOENT (No such file or directory)
  mkdir("foodir1", 0777) = -1 EEXIST (File exists)
  lstat("foodir1", 0x7ffcafc12800)   = -1 ENOENT (No such file or directory)

Okay.  That's confusing.  The only value in hand being EEXIST then
that is the error to be reported.  If this were repeated many times
then sometimes we would catch it as an actual directory.

  lstat("foodir1", 0x7ffcafc12800)   = -1 ENOENT (No such file or directory)
  mkdir("foodir1", 0777) = -1 EEXIST (File exists)
  lstat("foodir1", {st_mode=S_IFDIR|0775, st_size=40, ...}) = 0

In that case the proposal is to report it as EISDIR.  But if we were
to set up two processes racing against each other, the reason found by
the later stat() would sometimes match reality and sometimes not.
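
That race is easy to observe from the shell.  A minimal sketch using a
hypothetical directory name, with the two loops running in separate
terminals:

  # Terminal 1: create and remove the directory as fast as possible.
  while :; do mkdir raced 2>/dev/null; rmdir raced 2>/dev/null; done

  # Terminal 2: when mkdir fails, a follow-up stat sometimes reports
  # a directory and sometimes finds nothing there at all.
  while :; do mkdir raced 2>/dev/null || stat -c %F raced; done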

bug#40958: date command give current time zone regardless of seconds since epoch requested.

2020-04-29 Thread Bob Proulx
tag 40958 + notabug
close 40958
thanks

GNAT via GNU coreutils Bug Reports wrote:
> I am going to hazard a guess and say this is the expected behaviour,
> but I cannot find anything though goog.

The FAQ gives the recipe to figure these types of problems out.

  https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#The-date-command-is-not-working-right_002e

And for the timezone and date in question.

  zdump -v Europe/London | grep 1970
  ...no output...

That would be a little confusing.  So let's look at it with a pager
such as less.  Browse and find the years of interest.

  zdump -v Europe/London | less
  ...
  Europe/London  Sun Feb 18 01:59:59 1968 UT = Sun Feb 18 01:59:59 1968 GMT isdst=0 gmtoff=0
  Europe/London  Sun Feb 18 02:00:00 1968 UT = Sun Feb 18 03:00:00 1968 BST isdst=1 gmtoff=3600
  Europe/London  Sat Oct 26 22:59:59 1968 UT = Sat Oct 26 23:59:59 1968 BST isdst=1 gmtoff=3600
  Europe/London  Sat Oct 26 23:00:00 1968 UT = Sun Oct 27 00:00:00 1968 BST isdst=0 gmtoff=3600
  Europe/London  Sun Oct 31 01:59:59 1971 UT = Sun Oct 31 02:59:59 1971 BST isdst=0 gmtoff=3600
  Europe/London  Sun Oct 31 02:00:00 1971 UT = Sun Oct 31 02:00:00 1971 GMT isdst=0 gmtoff=0
  ...

And therefore it is of course as Andreas Schwab wrote.  "This took
place between 27 October 1968 and 31 October 1971, ..."  An
interesting footnote of history!

The date command uses the Time Zone Database for this information.
The database is typically updated by the operating system software
distribution upon which GNU date is run.  The source of the
database is available here.

  https://www.iana.org/time-zones

GNU date operates upon the data from that database.  That database is
updated often, since it is a global compilation of every act of
governance touching time and must track the timezone rules as they
change.  In the Debian and derivative software distributions I know
this is packaged in the 'tzdata' package.

> The date command has a number of switches, one of which is -d where you give 
> it the number of seconds since epoch, as in "date -d@1234" or "date --date 
> @1234".
> 
> Additionally, you can get it to return as any string you want to, as in "date 
> -d@1234 "+%c %z %Z"
> 
> Both return "Thu Jan  1 01:20:34 BST 1970" or "Thu Jan  1 01:20:34 +0100 BST 
> 1970" for the UK.
> 
> /etc/localtime is set to /usr/share/zoneinfo/Europe/London.
> 
> That's wrong, it should give "Thu Jan  1 00:20:34 1970 + GMT".
> 
> After all, in January, the UK is not in daylight saving time at the beginning 
> of January.

And yet there it was!  By an Act of Governance daylight saving time
was in effect at that time!  No one is safe when the government is in
session. :-)

> It therefore gives you the current daylight saving time status,
> rather than what it should be at the time requested.
> 
> I assume currently, this will give erroneous results for any
> requests in daylight saving.

Because date appears to be operating correctly I am closing this bug
ticket.  But please, any further discussion is welcome in the bug
ticket.

Bob





bug#40904: listing multiple subdirectories places filenames in different columns between each subdirectory

2020-04-29 Thread Bob Proulx
tag 40904 + notabug
close 40904
thanks

Jim Clark wrote:
> When I list a hard drive "ls -AR > list.txt" and import it into Libreoffice
> Calc, then break the lines using "text-to-columns", I am not able to
> perform a fixed format break so that the filenames are placed in their own
> column.
> 
> It seems like, when listing all subdirectories the largest file size within
> the subdirectory places the filename at a column and all the other names in
> that subdirectory are at the same column, but other subdirectories will
> have their filenames at different columns depending on file size within
> that subdirectory.

File size?  Your example used "ls -AR" which does not include the file
size.  Therefore I am going to close the ticket for the purpose of
accounting.  Since there is no bug here.

But please let further discussion follow.  The ticket can be reopened
or reassigned easily if that is determined.

> It would be nice if all the filenames were at the same column in the
> directory and all subdirectories.

If you are trying to use "ls -lAR" then each directory is listed
individually and what you are saying is true.  However that is the way
the GNU ls program is designed to work.  Each directory is listed
individually with column spacing applied to that directory.

As Paul recommended, it is likely better for you to use find instead.
Since you apparently want the long listing format then perhaps:

find . -ls

That will produce a full recursive long listing all of the way down.
It will use a wide fixed spacing which is apparently what you want.

I am curious.  I can't imagine any reason to import a recursive file
listing into a spreadsheet...  What is the task goal you are trying to
do there?

Bob





bug#40220: date command set linux epoch time failed

2020-03-29 Thread Bob Proulx
Paul Eggert wrote:
> Bob Proulx wrote:
> > By reading the documentation for CLOCK_MONOTONIC in clock_gettime(2):
> 
> GNU 'date' doesn't use CLOCK_MONOTONIC, so why is CLOCK_MONOTONIC relevant
> to this bug report?

GNU date uses clock_settime() and settimeofday() on my Debian system.
Let me repeat the strace snippet from my previous message which shows this.

  TZ=UTC strace -o /tmp/out -v date -s "1970-01-01 00:00:00"

  ...
  clock_settime(CLOCK_REALTIME, {tv_sec=0, tv_nsec=0}) = -1 EINVAL (Invalid argument)
  settimeofday({tv_sec=0, tv_usec=0}, NULL) = -1 EINVAL (Invalid argument)
  ...

Both calls from GNU date are returning EINVAL.  Those are Linux kernel
system calls.  Those Linux kernel system calls are using
CLOCK_MONOTONIC.  Therefore GNU date, on Linux systems, is by
association with the Linux kernel, using CLOCK_MONOTONIC.

And the Linux kernel is returning EINVAL.  And according to the
documentation for both clock_settime() and settimeofday() the most
likely reason for the EINVAL is the application of CLOCK_MONOTONIC
preventing it, because that documentation says that one cannot set the
date earlier than the system uptime.  Why this is desirable I have no
idea as it does not seem desirable to me.  But I am just the
messenger, having read that in the documentation looking for the
reason for the EINVAL return.

> Is this some busybox thing? If so, user 'shy' needs to report it to the
> busybox people, not to bug-coreutils.

No.  It is only a busybox thing as much as it is a GNU date thing in
that both are making system calls to the Linux kernel and both are
failing with EINVAL.  The reference to busybox confused me at first
too.  But in the original report it was simply another case of the
same thing.  Which is actually a strong indication that it is not a
bug in either of the frontend implementations but something common to
both.  In this case the kernel is the common part.

This does not appear to be a bug in the sense that it is explicit
behavior.  It is working as the Linux kernel has coded it to behave.
According to the documentation.

If one were to take this anywhere it would be to the Linux kernel
mailing list to discover why they implemented this inconvenient
behavior.

Meanwhile...  Since I am writing this in this thread...  I might
mention to the original poster that if they are testing using old
clock times they might be able to get a good result by using
libfaketime https://github.com/wolfcw/libfaketime which is a user land
strategy for implementing different fake clock times for programs.
Very useful in testing.  And then there would be no need to set the
system time at all.

  $ faketime '1970-01-01 00:00:00 UTC' date -uR
  Thu, 01 Jan 1970 00:00:00 +0000

Bob





bug#40220: date command set linux epoch time failed

2020-03-29 Thread Bob Proulx
Paul Eggert wrote:
> Bob Proulx wrote:
> > I tested this in a victim system and if I was very quick I was able to
> > log in and set the time to :10 seconds but no earlier.
> 
> Sounds like some sort of atomic-time thing, since UTC and TAI differed by 10
> seconds when they started up in 1972. Perhaps the clock in question uses TAI
> internally?

By reading the documentation for CLOCK_MONOTONIC in clock_gettime(2):

   CLOCK_MONOTONIC
  Clock that cannot be set and represents monotonic time since--as
  described by POSIX--"some unspecified point in the past".  On
  Linux, that point corresponds to the number of seconds that the
  system has been running since it was booted.

  The CLOCK_MONOTONIC clock is not affected by discontinuous jumps
  in the system time (e.g., if the system administrator manually
  changes the clock), but is affected by the incremental
  adjustments performed by adjtime(3) and NTP.  This clock does
  not count time that the system is suspended.


It's the, "On Linux, that point corresponds to the number of seconds
that the system has been running since it was booted." part that seems
to apply here just by the reading of it.  To test this I can reboot a
VM, which boots quickly, and then as soon as I think it is available
by watching the console I can ssh into it as root from another
terminal.  And then in that other terminal logged in as root I try to
execute "date -s '1970-01-01 00:00:00 UTC'" as soon as possible.  I am
never able to do so due to EINVAL.

But if I reboot and repeat the experiment, trying to set a time a few
seconds later, then if I am quick I can sometimes catch "date -s
'1970-01-01 00:00:10 UTC'" and have it work.  Trying again just now I
was able to be quick, get logged in, and set it at :07 UTC.  But if I
then wait, let seconds tick by, and try setting to :10 UTC again, it
fails.  This matches the model described by the documentation:
CLOCK_MONOTONIC is the system uptime and the kernel does not allow
setting the clock to a value before the system uptime.

If I wait longer and try setting the date to various times then
experimentally the behavior matches: I cannot set the system time
earlier than the system uptime.

Personally I can't see an advantage for this behavior.  Because if
someone is doing an experiment and wants to reset the clock to time
zero then I don't see an advantage of blocking that from happening.
However doing so might avoid some accidental settings of the system
clock to an unintended zero time.  Just like rm --preserve-root.  But
how often does that actually happen?  And then I would want to see a
way to do it anyway for the experiment possibilities.  Here reading
the documentation it seems to be a new hard limitation coded into the
Linux kernel that is blocking this.

Bob






bug#40220: date command set linux epoch time failed

2020-03-27 Thread Bob Proulx
tag 40220 + notabug
close 40220
thanks

shy wrote:
> I use command date -s "1970-01-20 00:00:00" to set date, but it
>  failed.  there is error message "date: can't set date: Invalid
>  argument".
>  It's UTC time and no timezone.

This is most likely a limitation of your kernel.  I can recreate this
problem on a Linux 4.9 system for example.

  TZ=UTC strace -o /tmp/out -v date -s "1970-01-01 00:00:00"

  ...
  clock_settime(CLOCK_REALTIME, {tv_sec=0, tv_nsec=0}) = -1 EINVAL (Invalid argument)
  settimeofday({tv_sec=0, tv_usec=0}, NULL) = -1 EINVAL (Invalid argument)
  ...

And the documented possible returns of EINVAL for clock_settime().

   EINVAL The clk_id specified is not supported on this system.

   EINVAL (clock_settime()): tp.tv_sec is negative or tp.tv_nsec is
  outside the range [0..999,999,999].

   EINVAL (since Linux 4.3)
  A call to clock_settime() with a clk_id of CLOCK_REALTIME
  attempted to set the time to a value less than the current value
  of the CLOCK_MONOTONIC clock.

And for settimeofday().

   EINVAL (settimeofday()): timezone is invalid.

   EINVAL (settimeofday()): tv.tv_sec is negative or tv.tv_usec is outside
  the range [0..999,999].

   EINVAL (since Linux 4.3)
  (settimeofday()): An attempt was made to set the time to a value
  less than the current value of the CLOCK_MONOTONIC clock (see
  clock_gettime(2)).

   EPERM  The calling process has insufficient privilege to call
  settimeofday(); under Linux the CAP_SYS_TIME capability is
  required.

But this is not a bug in GNU date.  This is likely the effect of
CLOCK_MONOTONIC in the Linux kernel.

   CLOCK_MONOTONIC
  Clock that cannot be set and represents monotonic time since--as
  described by POSIX--"some unspecified point in the past".  On
  Linux, that point corresponds to the number of seconds that the
  system has been running since it was booted.

  The CLOCK_MONOTONIC clock is not affected by discontinuous jumps
  in the system time (e.g., if the system administrator manually
  changes the clock), but is affected by the incremental
  adjustments performed by adjtime(3) and NTP.  This clock does
  not count time that the system is suspended.

I am not familiar with CLOCK_MONOTONIC but reading the documentation
points me to it as being the most likely reason this is not allowing
that time to be set.

I tested this in a victim system and if I was very quick I was able to
log in and set the time to :10 seconds but no earlier.

> I test with stime or settimeofday to set seconds 0, they are all have the 
> problem.
> 1. I use buildroot-2013.05, the busybox is in 1.21.1, the linux kernel is in 
> version 4.4.39.

That multiple frontends, GNU date and busybox date, all have the same
problem indicates that the problem is not with the frontend but with
the kernel handling the system call.

> 3.When set date command, the busybox uses function "stime" to set
> time, I use stime to set time around linux epoch time,
>but the stime seems not work well.
>int ret = 0;
>time_t time = 20;
>ret = stime(&time);
>printf("ret %d %d\r\n",ret, errno);
>perror("stime:");
> and the results are as follows:
> ret -1 22
> stime:: Invalid argument

Your independent test also confirmed that the problem is not a bug in
GNU date.  Therefore I mark this GNU date bug ticket as closed for our
own accounting.  But please continue to discuss the issue here.

Bob





bug#39850: "du" command can not count some files

2020-03-02 Thread Bob Proulx
Hyunho Cho wrote:
> $ find /usr/bin -type f | wc -l
> 2234
> 
> $ find /usr/bin -type f -print0 | du -b --files0-from=- | wc -l
> 

Hard links.  Files that are hard linked are only counted once by du
since du is summing up the disk usage and hard linked files only use
disk on the first usage.

Add the du -l option if you want to count hard linked files multiple
times.

  find /usr/bin -type f -print0 | du -l -b --files0-from=- | wc -l

That will generate an incorrect total disk usage amount however as it
will report hard linked disk space for each hard link.  But it all
depends upon what you are trying to count.
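
Here is a minimal illustration with hypothetical file names.  Note
that without -l du prints no line at all for the second name:

  $ printf 'data' > f1
  $ ln f1 f2
  $ du -b f1 f2
  4       f1
  $ du -l -b f1 f2
  4       f1
  4       f2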

> $ du -b $( find /usr/bin -type f ) | wc -l
> 

  du -l -b $( find /usr/bin -type f ) | wc -l

> $ find /usr/bin -type f -exec stat -c %s {} + | awk '{sum+=$1} END{ print 
> sum}'
> 1296011570
> 
> $ find /usr/bin -type f -print0 | du -b --files0-from=- | awk '{sum+=$1} END{ 
> print sum}'
> 1282350388

  find /usr/bin -type f -print0 | du -l -b --files0-from=- | awk '{sum+=$1} END{ print sum}'

> $ diff <( find /usr/bin -type f | sort ) <( find /usr/bin -type f -print0 | 
> du --files0-from=-  | cut -f 2  | sort )

  diff <( find /usr/bin -type f | sort ) <( find /usr/bin -type f -print0 | du -l --files0-from=-  | cut -f 2  | sort )

I am surprised you didn't try du on each file in addition to stat -c %s
on each file when you were summing them up. :-)

  find /usr/bin -type f -exec du -b {} \; | awk '{sum+=$1} END{ print sum}'

Bob





bug#39135: Globbing with numbers does not allow me to specify order

2020-01-22 Thread Bob Proulx
Antti Savolainen wrote:
> When doing a shortcut to unmount in a specific order, I am unable to
> specify order with angle brackets. For example using 'umount /dev/sda[132]'
> will result in the system unmounting them in numerological order. First 1
> then 2 and finally 3. What I need it to do is to first unmount 1, then 3
> and finally 2. It would be nice for the glob to respect the order of
> numbers that it was given.

As Bernhard wrote this involves features that have nothing to do with
coreutils.  However I thought I might say some more too.

You say you would like character class expansion of file globbing to
preserve the order.  But that isn't something it has ever done before,
all of the way back 40 years.  The [...] brackets give the file
glob parser (it's called glob because wildcards can match a glob of
files) a list of characters to match.  These can be ranges such as A-Z
or 0-9 and so forth.
or 0-9 and so forth.  The collection effectively makes a set of
characters.  This is expanded by the command line shell.  To see the
expansion one can use the echo command to echo them out.  Try this to
see what a command like yours is doing.

  echo /dev/sda[132]

That shows what the umount command line arguments are going to be.
The command line shell expands the wildcards and then passes the
resulting expansion to the command.  The command never sees the wild
card itself.

Therefore your specific desire is that the command line shell would do
something different from what it is doing now.  And that would be
something different from what it has ever done in the past.  This
would be a new behavior and a change in historic behavior.  And almost
certainly one that would break someone who is now depending upon the
current behavior of sorting the arguments.  They would then file a bug
that the arguments were no longer being sorted.  And they were there
first by decades.  Therefore if I were maintaining a shell I would not
want to make changes to that ordering since it would certainly break
others and generate more bug reports.

Instead if you need to have things happen in a specific order then the
task is up to you to specify an explicit order.  Bernhard suggested
brace expansion, which is a GNU bash specific feature.

  echo /dev/sda{1,3,2}
  /dev/sda1 /dev/sda3 /dev/sda2

However I am not personally a fan of bash-isms in scripts.  They won't
work everywhere.  Therefore I personally would just explicitly specify
the order.

  umount /dev/sda1
  umount /dev/sda3
  umount /dev/sda2

Doing things that way is unambiguous.  And if that is the correct
order then it is the correct order.

If you need a command line short cut to make typing this in easier
then I personally would create a small shell script.

  #!/bin/sh
  # Unmount the devices in mount dependency order.
  umount /dev/sda1
  umount /dev/sda3
  umount /dev/sda2

Store this in /usr/local/bin/umount-sda or some such name that makes
sense to you and chmod a+x the file to make it executable.  Then it is
a documentable command to do exactly what is needed.  Typical command
line completion with TAB will help as a typing aid to expand the file
name for you.  That is the way I would do it.

Bob





bug#38621: gdu showing different sizes

2019-12-16 Thread Bob Proulx
TJ Luoma wrote:
> AHA! Ok, now I understand a little better. I have seen the difference
> between "size" and "size on disk" and did not realize that applied
> here.
>
> I'm still not 100% clear on _why_ two "identical" files would have
> different results for "size on disk" (it _seems_ like those should be
> identical) but I suspect that the answer is probably of a technical
> nature that would be "over my head" so to speak, and truthfully, all I
> really need to know is "sometimes that happens" rather than
> understanding the technical details of why.

I think at the start is where the confusion began.  Because the
commands are named to show that they were intended to show different
things.

  'du' is named for showing disk usage

  'ls' is named for listing files

And those are rather different things!  Let's dig into the details.

The long format for information says:

  ‘-l’
  ‘--format=long’
  ‘--format=verbose’
   In addition to the name of each file, print the file type, file
   mode bits, number of hard links, owner name, group name, size, and
   timestamp (*note Formatting file timestamps::), normally the
   modification timestamp (the mtime, *note File timestamps::).  Print
   question marks for information that cannot be determined.

So we know that ls lists the size of the file.  But let me
specifically say that this is tagged to the *file*.  It's file
centric.  There is also the -s option.

  ‘-s’
  ‘--size’
   Print the disk allocation of each file to the left of the file
   name.  This is the amount of disk space used by the file, which is
   usually a bit more than the file’s size, but it can be less if the
   file has holes.

This displays how much disk space the file consumes instead of the
size of the file.  The two being different things.

And then the 'du' documentation says:

  ‘du’ reports the amount of disk space used by the set of specified files

And so du is the disk used by the file.  But as we know the amount of
disk used is dependent upon the file system holding the file.
Different file systems will have different storage methods and the
amount of disk space being consumed by a file will be different and
somewhat unrelated to the size of the file.  Disk space consumed to
hold the file could be larger or smaller than the file size.

In particular if the file is sparse then there are "holes" in the
middle that are all zero data and do not need to be stored.  Thereby
saving the space.  In which case it will be smaller.  Or since files
are stored in blocks the final block will have some fragment of space
at the end that is past the end of the file but too small to be used
for other files.  In which case it will be larger.

Therefore it is not surprising that the numbers displayed for disk
usage is not the same as the file content size.  They would really
only line up exactly if the file content size is a multiple of the
file system storage block size and every block is fully represented on
disk.  Otherwise they will always be at least somewhat different in
number.

As long as I am here I should mention 'df' which shows disk free space
information.  One sometimes thinks that adding up the file content
size should add up to du disk usage size, but it doesn't.  And one
sometimes thinks that adding up all of the du disk usage sizes should
add up to the df disk free sizes, but it doesn't.  That is due to a
similar reason.  File systems reserve a min-free amount of space for
superuser level processes to ensure continued operation even if the
disk is fulling up from non-privileged processes.  Also file system
efficiency and performance drops dramatically as the file system fills
up.  Therefore the file system reports space with the min-free
reserved space in mind.  And once again this is different on different
file systems.
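
On ext2/3/4 file systems that reserved min-free space can be inspected
with tune2fs from e2fsprogs.  A sketch, with an illustrative device
name and count:

  # tune2fs -l /dev/sda1 | grep -i 'reserved block count'
  Reserved block count:     610406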

But let me return to your first bit of information.  The ls long
listing of the files.  Your version of ls gave an indication that
something was different about the second file.

> % command ls -l *pkg
> -rw-r--r--  1 tjluoma  staff  5047 Dec 15 00:00 StreamDeck-4.4.2.12189.pkg
> -rw-r--r--@ 1 tjluoma  staff  5047 Dec 15 00:02 Stream_Deck_4.4.2.12189.pkg

See that '@' in that position?  The GNU ls coreutils 8.30
documentation I am looking at says:

 Following the file mode bits is a single character that specifies
 whether an alternate access method such as an access control list
 applies to the file.  When the character following the file mode
 bits is a space, there is no alternate access method.  When it is a
 printing character, then there is such a method.

 GNU ‘ls’ uses a ‘.’ character to indicate a file with a security
 context, but no other alternate access method.

 A file with any other combination of alternate access methods is
 marked with a ‘+’ character.

I did not see anywhere that documented what an '@' means.  Therefore
it is likely something applied in a downstream patch or port.

bug#35685: Request

2019-05-14 Thread Bob Proulx
tag 35685 + notabug
close 35685
thanks

Safdar Iqbal wrote:
> Sir,Provide me to installation procedure of wien2k(14.2) on ubuntu
> (19.04)sir chmod command cannot excite on my workstation core i7sir
> please guide methanks

Hello!  You are asking about WIEN2k (http://www.wien2k.at/) and also
Ubuntu but this is the GNU Coreutils project.  We do not know anything
about WIEN2k here.  As such I can only close the ticket as there isn't
anything we can do about it.  I am sorry but you will need to contact
the WIEN2k people to ask for help about WIEN2k.

Good luck!
Bob





bug#35654: We've found a vulnerability of gnu chown, please check it and request a cve id for us.

2019-05-14 Thread Bob Proulx
The essence of this report appears to be an attack of the form, can we
get the root user to perform an unsafe operation, in this case can we
trick root into dereferencing a symbolic link, such as from ./poc to
/etc, in order to perform a further action through the symlink.

However this is not a bug in chown's -h implementation.  Nor is it
particular to chown as this could be any other command as the trick to
dereference the symlink first before performing whatever action.  For
example here is a recipe using the same attack but without chown.

  ln -s /etc /tmp/junk
  # Now we trick root into reaching through the symlink.
  # No way root will see this trick coming!
  rm -f /tmp/junk/*
  # This removes the files from /etc.

The above does not use chown -h but is essentially the same attack.
However again this is not a bug in 'rm' nor 'ln'.  It is simply trying
to trick the superuser into doing unsafe actions.  It requires
cooperation on the part of root in order to perform the action.

But why would the superuser do such silly things?  This is very much
like Coyote painting a black image on the side of the mountain hoping
the Road Runner will mistake it for a tunnel and run into the mountain
becoming dinner for Coyote.  But the Road Runner never fell for such
tricks and neither should the superuser.  That it might happen does
not make black paint a threat to the Road Runner.

The use of 'sudo' does not change the nature of the issue.  Only the
root user can install sudo and configure it to perform the unsafe
actions as you have described.  And it also requires a local user to
look the superuser in the eye and try to con them up close and
personal.

Note that this is essentially the same in legacy Unix and in *BSD
where symbolic links originated.  The community has had decades to
poke at them.  It is even more interesting to poke at systems that
allow environment variables in symbolic links in which case the target
is dependent upon the runtime environment variables!

The root user is the superuser and with great power comes great
responsibility.  Extraordinary claims require extraordinary proof.  In
order for symlinks to be considered as a security vulnerability a more
convincing case will need to be presented.

Bob





bug#35167: About chroot some question on centos6 kernel:

2019-04-21 Thread Bob Proulx
close 35167
thanks

Hello 往事随风,

往事随风 wrote:
> OS centos6.10
> kernel vmlinuz-2.6.32-754.el6.x86_64
> hello!
> grub-install  in a new disk /mnt/boot;copy /bin/bash and *.so ; chroot 
> /mnt/sysroot is ok!exit and ctrl+d

Sounds like 'chroot' worked correctly in the above sequence.

> use the new disk startup,
> "dracut warning can't mount root filesystemmount :/dev/sda3 already mounted 
> or /sysroot busy
> mount: according to mtab, /dev/sdb3 is already mounted on /mnt/sysroot"
> don't chroot /mnt/sysroot 
> startup ——success
> 
> why?!  I don't now! 

I have no idea either.

This does not look like a bug report for the 'chroot' command from the
GNU Coreutils project however.  It looks like a bug report against
'dracut'.  As such there isn't anything that we can do about it here.
I think that is why no one else on the team responded.  It didn't seem
like anything that anyone here could do anything about.

Also the chroot command line utility is simply a thin wrapper around
the chroot(2) kernel system call.  It does whatever the kernel does.
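
On a Linux system strace shows just how thin that wrapper is.  A
sketch, using /bin/true as a harmless command:

  # strace -e trace=chroot,chdir chroot /mnt/sysroot /bin/true
  chroot("/mnt/sysroot")                  = 0
  chdir("/")                              = 0
  +++ exited with 0 +++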

Therefore I am going to close the bug in our ticket system.  However
please do respond and add any further discussion.  We will see it.  If
something looks like a bug the ticket will be re-opened.

Bob





bug#34713: Files vanishing when moving to different FS

2019-03-04 Thread Bob Proulx
tags 34713 notabug
close 34713
thanks

Hello Christoph,

Christoph Michelbach wrote:
> To reproduce this bug, you need two different file systems. Adapt the
> paths to fit your system.

Thank you for making this bug report.  However what you are
experiencing is due to the race condition created by the non-atomic
nature of copying files from one file system to another, removing
files, and renaming files.  This is not a bug in mv but is an
intrinsic behavior.

> Set the experimental file structure up like this:
> 
> mkdir exp
> cd exp
> mkdir a
> cd a
touch a{1..100000}
> cd ..
> mkdir b
> cd b
touch b{1..10000}
> mkdir /t/ae # /t has to be on a different file system

Thank you for the very nice test case.

> Then have two terminals open in the exp directory created above.

This is a clue to the nature of the problem being a race condition.
It describes simultaneous parallel processes.

> In one, execute this command:
> 
> mv a /t/ae

Because /t is on a different file system mv cannot simply rename the
files but must perform the action in two steps.  It copies the file
from source to destination, then it removes the source file.  This is
documented in the mv documentation with:

'mv' can move any type of file from one file system to another.
Prior to version '4.0' of the fileutils, 'mv' could move only
regular files between file systems.  For example, now 'mv' can
move an entire directory hierarchy including special device files
from one partition to another.  It first uses some of the same
code that's used by 'cp -a' to copy the requested directories and
files, then (assuming the copy succeeded) it removes the
originals.  If the copy fails, then the part that was copied to
the destination partition is removed.  If you were to copy three
directories from one partition to another and the copy of the
first directory succeeded, but the second didn't, the first would
be left on the destination partition and the second and third
would be left on the original partition.

The mv a /t/ae action is similar to cp -a a /t/ae && rm -r a when the
action is successful.  Similar, because there are two steps happening:
a first step with the copy and a second step with the removal, and
there is a time skew between those actions.

> In the other, execute this one while the one in the first terminal
> still is running (hence the large number of files so you have time to
> do this):
> 
> mv b/* a

This is the second part of the race condition.  It is moving files
into the a directory at the same time that files are being copied out
of the directory and the directory itself is being removed.

> You will end up with 100 000 files in /t/ae. The 10 000 files beginning
> with the letter b will be gone.

Look at the two actions explicitly:

Process 1:
  cp -a a /t/ae
  rm -rf a

Process 2:
  mv b/* a

Now it is more obvious what happens: as soon as the copy in the first
process finishes, the first process removes the source directory, the
very directory into which the second process is still moving files.

Does that make it easier to understand what is happening?

The two-step copy and remove does not occur when both the source and
destination are on the same file system.  In that case the file can be
renamed atomically without doing a copy.  But when the action is
across two file systems this is not possible and it is simulated (or
perhaps emulated) by the copy and remove two step action.

Whenever tasks are moving files into and out of the same directory at
the same time this is always something to be aware of, because there
may be an overlap of actions in that directory.

In this particular example the problem can be avoided by renaming "a"
first and then transferring the files to the other file system.
Because it was removed then the second process can create it without
collision.  Something like this pseudo-code.  However safely using
temporary file names will require more code than this.  This is simply
for illustration purposes.

Process 1:
  mv a tmpdirname
  cp -a tmpdirname /t/ae
  rm -rf tmpdirname

Process 2:
  mkdir a
  mv b/* a
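
For example mktemp can pick a collision-free temporary name.  This is
still only an untested sketch, and the mvtmp.XXXXXX template is an
arbitrary choice:

Process 1:
  tmpdir=$(mktemp -d ./mvtmp.XXXXXX) || exit 1
  mv a "$tmpdir/a"
  cp -a "$tmpdir/a" /t/ae
  rm -rf "$tmpdir"

The mv into the temporary directory is an atomic rename because both
paths are on the same file system.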

I hope this helps.

Since this is not a bug in mv I am going to close the ticket in our
bug database.  But please we would like to hear back from you in this
ticket for any further discussion.

Bob





bug#34700: rm refuses to remove files owned by the user, even in force mode

2019-03-02 Thread Bob Proulx
Erik Auerswald wrote:
> Bob Proulx wrote:
> > However regardless of intentions and design if one really wants to
> > smash it then this is easily scripted.  No code modifications are
> > needed.
> > 
> >#!/bin/sh
> >chmod -R u+w $1
> >rm -rf $1
> 
> To everyone considering the above "script": do not use it! It does not even
> guard against spaces in file names. Besides being dangerously buggy, it does
> not even solve the problem of deleting a file inside a read-only directory.

Obviously I typed that in extemporaneously on the spur of the moment.
I should have put an "untested" tag upon it.

But regardless of that it does not change the fact that the entire
purpose of read-only directories is to prevent removing and renaming
of files within them.
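
For the record here is the same script with the basic quoting mistakes
fixed.  Still untested, and still subject to the larger objection that
it does not solve deleting a single file inside a read-only directory:

  #!/bin/sh
  chmod -R u+w -- "$1"
  rm -rf -- "$1"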

> I would suggest people with specific directories that inhibit deletion of
> files inside although they should not (e.g. a "cache") to deliberately change
> the permissions of said directories prior to deleting files inside. Using a
> script like the above, even without the basic mistakes in the script, is
> quite dangerous.

I don't think we are in disagreement here.

Bob





bug#34700: rm refuses to remove files owned by the user, even in force mode

2019-03-01 Thread Bob Proulx
Nicolas Mailhot wrote:
> For their own reasons, the Go maintainers have decided the user Go cache
> will now be read-only.
> https://github.com/golang/go/issues/27161#issuecomment-433098406

Not wise.

> That means cleaning up cache artefacts with rm does not work anymore
> https://github.com/golang/go/issues/30502

Users count upon non-writable directories to prevent files from being
deleted.  I am confident that changing rm to delete contents of
non-writable directories would produce bug reports.  And worse it
would have resulted in data loss in those cases.  Weigh data loss
against inconvenience intentionally created.

They have intentionally done this to prevent actions such as rm -rf on
the path.  That is the entire purpose of making directories read-only,
to prevent the contents from being removed or renamed.

However regardless of intentions and design if one really wants to
smash it then this is easily scripted.  No code modifications are
needed.

  #!/bin/sh
  chmod -R u+w $1
  rm -rf $1

Bob





bug#12400: rmdir runs "amok", users "curse" GNU...(as rmdir has no option to stay on 1 file system)...

2019-02-26 Thread Bob Proulx
L A Walsh wrote:
> Bob Proulx wrote:
> > Please provide an example.  Something small.  Something concrete.
> > Please include the version of rmdir.
> 
> The original bug stems from having to use wild cards to delete
> all files in a directory instead of '.', as in being told to use:
> 
> rm -fr --one-filesystem foo/*

When reporting bugs in command line utilities it is good to avoid
using file glob wildcards in the test case.  Because that involves the
shell.  Because that makes the test case dependent upon the contents
of the directory which will then be expanded by the shell.

> instead of 
> 
> rm -fr --one-filesystem foo/. or 
> cd foo && rm -fr --one-filesystem .

  rm: refusing to remove '.' or '..' directory: skipping '.'

I agree with your complaint about "rm -rf ." not working.  That is an
annoying nanny-state restriction.  It should fail removing '.' after
having removed all it can remove.  And it only took 16 messages in
order to get to this root cause!  It would have been so much easier if
you had started there.

But this report is about rmdir so let's get back to rmdir.  Any
reports about rm should be in a separate ticket.  Mixing multiple bugs
in any one bug ticket is confusing and bad.

Bob





bug#34524: wc: word count incorrect when words separated only by no-break space

2019-02-22 Thread Bob Proulx
vampyre...@gmail.com wrote:
> The man page for wc states: "A word is a... sequence of characters delimited 
> by white space."
> 
> But its concept of white space only seems to include ASCII white
> space.  U+00A0 NO-BREAK SPACE, for instance, is not recognized.

Indeed this is because wc, the other coreutils programs, and many
other programs besides use the libc locale definition.

  $ printf '\xC2\xA0\n' | env LC_ALL=en_US.UTF-8 od -tx1 -c
  0000000  c2  a0  0a
           302 240  \n
  0000003

  $ printf '\xC2\xA0\n' | env LC_ALL=en_US.UTF-8 grep '[[:space:]]' | wc -l
  0
  $ printf '\xC2\xA0 \n' | env LC_ALL=en_US.UTF-8 grep '[[:space:]]' | wc -l
  1

This shows that grep does not recognize \xC2\xA0 as a character in the
class of space characters either.

  $ printf '\xC2\xA0\n' | env LC_ALL=en_US.UTF-8 tr '[[:space:]]' x | od -tx1 -c
  0000000  c2  a0  78
           302 240   x
  0000003

And while the newline, a character in the space class, is translated
to 'x', the no-break space is not.
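
The same locale classification drives wc's own word count.  An
illustration, with counts as seen on a glibc based system:

  $ printf 'a\xC2\xA0b\n' | env LC_ALL=en_US.UTF-8 wc -w
  1
  $ printf 'a b\n' | env LC_ALL=en_US.UTF-8 wc -w
  2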

Since character classes are defined as part of the locale table there
isn't really anything we can do about it on the coreutils wc side of
things.  It would need to be redefined upstream in the locale tables.

Bob





bug#34447: `pwd` doesn't show real working directory if directory is renamed by another session

2019-02-22 Thread Bob Proulx
tag 34447 + notabug
close 34447
thanks

Hello Chris,

Chris Wright wrote:
> I found that if a session's working directory is renamed or moved,
> `pwd` doesn't show the real working directory.

Thank you for your bug report.  However I think the shell's built-in
pwd is being confused with the external pwd command.  The shell
internal command has the behavior your describe, intentionally.  The
external one in GNU Coreutils does not.

> ~/test $ pwd
> /Users//test

The above is using the internal shell builtin.

  $ type pwd
  pwd is a shell builtin

  $ type -a pwd
  pwd is a shell builtin
  pwd is /bin/pwd

The bash shell built-in has this to say about the internal pwd.

  $ help pwd
  pwd: pwd [-LP]
Print the name of the current working directory.

Options:
  -L        print the value of $PWD if it names the current working
            directory
  -P        print the physical directory, without any symbolic links

By default, `pwd' behaves as if `-L' were specified.

Therefore by default the shell's builtin pwd simply prints out the PWD
environment variable, which has not changed.  This is to preserve the
"logical" (not physical) directory tree based upon how the process got
there, intentionally tracking how they got there not where they are.
They got there by the path stored in PWD.

I hate that behavior.  But as with most things I was not consulted. :-}

In order to do what you want there are at least three options.  One is
to use the external coreutils version.  The idiom for forcing external
commands is to run them via 'env'.

  env pwd

Another is adding the -P option.  This ignores PWD and returns the
physical path.

  pwd -P

And the third (what I do) is to set the shell to always use physical
paths.  Which is how it behaved before they added logical path
tracking in the PWD variable.  I have this in my ~/.bashrc file.

  set -o physical
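
A quick demonstration of the difference.  This is only a sketch, the
/tmp/d1 and /tmp/d2 names are made up, and the output is as on a
typical Linux system running bash:

  $ mkdir /tmp/d1 && cd /tmp/d1
  $ mv /tmp/d1 /tmp/d2
  $ pwd
  /tmp/d1
  $ pwd -P
  /tmp/d2
  $ env pwd
  /tmp/d2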

Therefore I have closed this bug report for the purpose of triage of
the report in the coreutils tracker since this is really about bash
and not coreutils.  However please do reply as discussion may
continue.  We would love to continue the discussion.

Note that the coreutils 'pwd' defaults to -P, --physical unless -L,
--logical is given explicitly.  And that the documentation for the
coreutils pwd is subtly different from the bash version:

'-L'
'--logical'
 If the contents of the environment variable 'PWD' provide an
 absolute name of the current directory with no '.' or '..'
 components, but possibly with symbolic links, then output those
 contents.  Otherwise, fall back to default '-P' handling.

'-P'
'--physical'
 Print a fully resolved name for the current directory.  That is,
 all components of the printed name will be actual directory
 names—none will be symbolic links.

   If '-L' and '-P' are both given, the last one takes precedence.  If
neither option is given, then this implementation uses '-P' as the
default unless the 'POSIXLY_CORRECT' environment variable is set.

   Due to shell aliases and built-in 'pwd' functions, using an unadorned
'pwd' interactively or in a script may get you different functionality
than that described here.  Invoke it via 'env' (i.e., 'env pwd ...') to
avoid interference from the shell.

Hope this helps!

Bob





bug#34199: closed (Re: bug#34199: Small bug in cp (for win64))

2019-02-10 Thread Bob Proulx
Chris Kalish wrote:
> Hmmm ... not sure of the distribution, but the help file pointed me at this
> address:

> C:\> cp --version
> cp (GNU coreutils) 5.3.0

I always hate it when I am on your side of things and upstream says
this to me.  But here I am on the other side and going to say almost
exactly the thing I hate to hear.

Coreutils 5.3.0 was released on 2005-01-08 and today is 2019-02-10
making that version of the program you are running 14 years old!  That
is a very long time ago.  Since you are running on MS-Windows I will
say that was probably five whole versions of Microsoft ago!  It would
not be practically possible for most of us to recreate that version on
MS-Windows-XP of that era.  This makes it difficult to impossible to
do anything about even if we had an alive build system from 2005 still
running.  Plus here we are concerned about software on free(dom)
licensed platforms and Microsoft is a closed source proprietary
platform.  That was always supported by other teams doing ports to
non-free operating systems.  What's a developer to do? :-(

Perhaps I should ignore all of the years and simply say, yes, that is
a bug.  (I don't really know.  But I will say it.  Searching the
changelogs will show that 5.3.0 did introduce a number of bugs.)  And
we have fixed it!  The new version is v8.30 and that bug is fixed.
Eric reported that it was not a problem for Cygwin on MS-Windows.
Please upgrade to it and confirm with us that it is working for you
there.  Maybe that would be less unpleasant to hear? :-)

> C:\> cp --help
> Report bugs to .

We are happy to have bugs reported here.  But often in ports the
behavior is dependent upon the port environment.  That is outside of
our control.

Please do upgrade to a newer version.  Cygwin tends to be the most
capable version.  Although there are other ports too.  We would
appreciate hearing about how this worked out for you regardless.

And maybe along the way you might consider upgrading to a free(dom)
software licensed operating system?  Then you would have upgrades
available by default. :-)

Bob





bug#12400: rmdir runs "amok", users "curse" GNU...(as rmdir has no option to stay on 1 file system)...

2019-02-10 Thread Bob Proulx
L A Walsh wrote:
> >> If you want a recursive option why not use 'rm -rf'?
>
> rmdir already provides a recursive delete that can cross
> file system boundaries

Please provide an example.  Something small.  Something concrete.
Please include the version of rmdir.

Something like:

  mkdir testdir testdir/dir1 testdir/dir2 testdir/dir2/dir3
  rmdir --recursive testdir/dir2
  rmdir --version

Include all input and output verbatim.  For clarity do not use shell
file glob wildcards because that is a dependence upon a specific
command line shell and the shell's configuration.

> dir1->dir2->dir3
> 
> dir1 is on 1 file system, dir 2 is on another and dir 3 can be on another.

GNU Coreutils rmdir does not provide a recursive delete option.
Therefore one can only assume that the rmdir you are referring to is a
different rmdir from a different project.

I specifically asked if you were using the rmdir --parents option but
my message was the only mention of --parents in this entire ticket and
in subsequent responses your messages also did not mention it.
Therefore I can only assume that there is no --parents option being
used here.

> >> There is always 'find' with the -delete option.  But regardless there
> >> has been the find -exec option.
>
> true -- so why should 'rm' protect against crossing boundaries
> deleting '/' or everything under '.' when there is find?
> 
> find is the obvious solution you are saying, so all that checking in
> rm should be removed, as it is inconsistent with rmdir that can
> cross boundaries.

My mention of 'find' was really a simple statement about alternatives
when programmatic needs are desired.  Because 'find' is the swiss army
chainsaw for directory traversal.  I didn't mean to derail the
discussion there.  But if it is to be derailed then 'find' is the best
choice when needing a specific set of programmatic requirements for
directory traversal.  The other utilities that have simpler
capabilities are the distractions.  But in theory this bug ticket was
about 'rmdir'.

> As for closing something not addressed for 6 years while the problem
> has grown worse -- (rmdir didnt' used to have a recursive delete), doesn't
> seem a great way to judge whether or not a bug is valid or not .

GNU Coreutils rmdir does not provide a recursive delete option.

This bug report so far has contained conflicting complaints to the
point that it has not been useful.  It still is not clear if you are
complaining about 'rmdir' or 'rm' even after requests for
clarification.  Or possibly your shell's ** file glob expansion.
Probably some combination of them all that is unique to your
environment.

To be useful a bug report must be descriptive so that the reader can
understand it.  If the reader can't understand it then how can it be
useful?  The report must be as simple as possible.  Because extraneous
complexity is distracting.  Stay focused on the bug being reported and
not about other unrelated things.  Bugs about behavior should be
reproducible with a test case.  Because nothing is as useful as a
concrete example.

I have reviewed the reports in this ticket and there seems to be no
viable bug report to operate upon here.  At some point without a test
case it only makes sense to say enough is enough and move on since
this does not appear to be a bug in any program of the coreutils
project.  However even though a bug is closed discussion may continue
as we are doing here.  The bug state is simply a way to organize
reports for the purposes of triage.  Many thanks to Assaf for putting
in the work to triage these old bug tickets.

If you wish to report a bug in rmdir's recursive delete option then we
must insist on a test case.

Bob





bug#13738: Add --all option to 'users' command

2019-02-10 Thread Bob Proulx
anatoly techtonik wrote:
> Bob Proulx wrote:
> > > Human users have UIDs starting at 1000,
> >
> > That assumption is incorrect.  Many systems start users off at 100.
> > Many others start users at 500.  There isn't any universal standard.
> > It is a local system configuration option.
> 
> How to figure out at which number users UIDs start at a given system?

That is a system dependent problem.

On my Debian Stretch 9 system the /etc/login.defs file contains:

  # Min/max values for automatic uid selection in useradd
  #
  UID_MIN  1000
  UID_MAX 60000
  # System accounts
  #SYS_UID_MIN  100
  #SYS_UID_MAX  999

Other systems will be different.  It is a policy implemented by the OS.

> > > so you can use that fact to filter out the non-humans:
> > >
> > > cut -d: -f1,3 /etc/passwd | egrep ':[0-9]{4}$' | cut -d: -f1
> >
> > This assumes that /etc/passwd is the user database.  While true on a
> > typical standalone system it is incorrect when NIS/yp or LDAP or other
> > account system is in use.  That is why I used 'getent passwd' even
> > though it is not available on all systems.  When available it obeys
> > the local system configuration and returns the correct information.
> 
> If NIS/yp or LDAP are installed, they provide getent, right?

'getent' is actually AFAIK a glibc utility.  AFAIK any OS using glibc
will provide it.  However traditional systems not based on glibc may
or may not.  I only have limited access to other systems at this time
and have no easy way to check *BSD or HP-UX or others for example.

> So if there is no getent, then /etc/passwd is de-facto database and
> can be reliably used as a fallback. Is that correct?

The /etc/nsswitch.conf file determines this.  Certainly the lowest
level default is /etc/passwd.  But the nsswitch.conf file is where
modifications are configured for this.
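
For illustration only, here is a filter that hard-codes exactly the
local-policy assumptions discussed above, a UID_MIN of 1000 and the
traditional nobody uid of 65534.  It honors nsswitch.conf because it
goes through getent:

  getent passwd | awk -F: '$3 >= 1000 && $3 != 65534 { print $1 }'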

> Is there other way to distinguish user accounts other than matching
> "things that only seem to be true", like UID numbers?

There is no actual difference between user accounts and system
accounts.  The only real difference is that user accounts have a human
user associated with them but system accounts do not.  Other than that
they are the same.  Certainly to the OS they are simply a uid to hold
a running process.

> > Actually even that isn't sufficient.  The value for nobody 65534 is a
> > traditional value.  But uids may be larger on most modern systems.  It
> > isn't unusual to have the nobody id in the middle of the range with
> > real users having uid numbers both less than and greater than that
> > value.  Therefore in order to be completely correct additional filter
> > methods would be needed such as sets of ranges or block lists or
> > something similar.
> 
> Yes. I believe LXD has UID mapping for containers about 100000,
> and those are not human users in general case.

That is a good example.  And one of which I was not aware.  And I am
sure there are other cases too.

> I am getting the feeling that the approach of solving problems be using
> the tool for specific case is misleading in the case that it battles with
> effects and not the causes. The cause of the mess is UID mapping in
> Linux kernel, which is not about users at all. There is a concept of user
> space, but correct me if I wrong - daemons that run with different UIDs
> are run in their own userspace as well. The user concept is not defined
> by kernel, but rather by some concept of having home and being able to
> login into it either from console or remotely.

All processes have a uid.  Some uids are associated with a human.
Some are not.  The kernel doesn't know the difference.  The kernel is
applying permissions based upon nothing more than the integer number
of the process.  For example the uid can send a signal to another
process with the same uid.  Or the superuser process can send a signal
to any other process regardless of uid.  But a non-superuser process
cannot send a signal to another random process of a different uid.
None of which has any relation to whether a human can log into the
account or not.

> If this behavior of humans vs daemons was explicitly documented
> somewhere, it could lead to better understanding if solving this problem
> in general is real.

I don't think this is possible because there really is no difference
between system uids and non-system uids.  Whether something is a
system uid or a non-system uid is a color we paint on it by human
judgement and two different people might judge the same thing
differently and both would be right.

It is also a difference which makes no difference.

> > It would help if you could say a few words about the case in
> > whic

bug#33943: (omitted) ls directly uses filename as option parameter

2019-01-02 Thread Bob Proulx
tags 33943 notabug
close 33943
merge 33942
thanks

This message generated a new bug ticket.  I merged it with the
previous bug ticket.

westlake wrote:
>  I have omitted that I recently downgraded my coreutils to doublecheck
> behaviour for ls, and noticed immediately the same behaviour was occuring,

It was still occurring because this is not a new behavior of 'ls'.
This is the way Unix has operated since the beginning.

It seems that you missed seeing Assaf's explanation of it.  Let me
repeat some of it.

> $ touch 0 ./--a ./-a ./-_a ./--

> $ ls -lad  -* [^-]*

Here the first example nicely uses ./ in front of the problematic
characters.  But the second one did not have ./ there.  If it did then
there would be no problem.  But instead the "-*" above is trouble.
Don't do it!  Always put ./ in front of file globs (wildcards) like
that.  It should be:

  $ ls -lad  ./-* ./[^-]*

> .. however a period of time the behaviour is no longer exhibiting the same,

It was not a period of time.  It was the contents of the directory
upon which the commands were used.  It is data dependent.  It depends
upon the file names that exist.  If there are no file names that start
with a '-' then none will be mistaken for an option.  As you knew when
you created the test case using touch above.

> I suppose I did not wait long enough for the new "ls" or whatever it is to
> come into effect...

It is not a time issue.  It is only a matter of file glob wildcard
expansion as done by the command line shell.  Using 'echo' to see a
preview of the command will show this.
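
For example, a sketch in a directory containing a single troublesome
file name:

  $ touch ./-a
  $ echo ls -lad -*
  ls -lad -a
  $ echo ls -lad ./-*
  ls -lad ./-a

The first preview shows that ls would receive a bare "-a" and parse it
as options.  The second shows the safe "./-a".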

> but there's still oddities with ls, I guess it is the unprediction of
> "getopt".. and so I guess I should address any further concerns with the
> developers of getopt.

This is also not a getopt issue.  The best practice is to prefix all
wildcards with ./ such as ./*.txt.  Then the resulting text string
cannot be confused with an option, because even if the file name
starts with a '-' the resulting argument to ls will be "./-something",
starting with "./" instead of "-".

Bob





bug#33577: ls lacks null terminator option

2018-12-03 Thread Bob Proulx
積丹尼 Dan Jacobson wrote:
> For files with blanks in their names,
> one shouldn't need this workaround:
> $ ls -t | tr \\n \\0 | xargs -0 more > /tmp/z.txt
> Please add a --print0 option. like find(1) has.

I think that adding a --print0 option to 'ls' is not wise because it
would suggest to people seeing it that 'ls' should be used in scripts.
But 'ls' is a command designed for human interaction not for use in
scripts.  Using 'find' for scripted use is the desired utility.

Such a patch has previously been submitted.

  http://lists.gnu.org/archive/html/coreutils/2014-02/msg00005.html

Bob





bug#22195: deviation from POSIX in tee

2015-12-18 Thread Bob Proulx
Pádraig Brady wrote:
> Paul Eggert wrote:
> > trap '' PIPE
> 
> Generally you don't want to ignore SIGPIPE.
> http://pixelbeat/programming/sigpipe_handling.html
> as then you have to deal with EPIPE from write():

I wanted to add emphasis to this.  Ignoring SIGPIPE causes a cascade
of associated problems.  Best not to do it.

Bob

P.S. Typo alert:
  http://pixelbeat/programming/sigpipe_handling.html
Should be:
  http://www.pixelbeat.org/programming/sigpipe_handling.html





bug#22128: dirname enhancement

2015-12-10 Thread Bob Proulx
Nellis, Kenneth wrote:
> Still, my -f suggestion would be easier to type,
> but I welcome your alternatives.

Here is the problem.  You would like dirname to read a list from a
file.  Someone else will want it to read a list of file lists.
Another will want to skip one header line.  Another will want
to skip multiple header lines.  Another will want the exact same
feature in basename too.  Another will want file name modification so
that it can be used to rename directories.  And on and on and on.
Trying to put every possible combination of feature into every utility
leads to unmanageable code bloat.

What do all of those have in common?  They are all specific features
that are easily available by using the features of the operating
system.  That is the entire point of a Unix-like operating system.  It
already has all of the tools needed.  You tell it what you want it to
do using those features.  That is the way the operating system is
designed.  Utilities such as dirname are simply small pieces in the
complete solution.

In this instance the first thing I thought of when I read your dirname
-f request was a loop.

   while read dir; do dirname $dir; done < list

Pádraig suggested xargs which was even shorter.

  xargs dirname < filename

Both of those directly do exactly what you had asked to do.  The
technique works not only with dirname but with every other command on
the system too.  A technique that works with everything is much better
than something that only works in one small place.

Want to get the basename instead?

   while read dir; do basename $dir; done < list

Want to modify the result to add a suffix?

   while read dir; do echo $dir.myaddedsuffix; done < list

Want to modify the name in some custom way?

   while read dir; do echo $dir | sed 's/foo/bar/'; done < list

Want a sorted unique list modified in some custom way?

   while read dir; do echo $dir | sed 's/foo/bar/'; done < list | sort -u

The possibilities are endless and as they say limited only by your
imagination.  Anything you can think of doing you can tell the system
to do it for you.  Truly a marvelous thing to be so empowered.

Note that in order to be completely general and work with arbitrary
names that have embedded newlines, proper quoting is required and the
wisdom of today says always use null terminated strings.  But if
you are using a file of names then I assume you are operating on a
restricted and sane set of characters so this won't matter to you.
I do that all of the time.
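
For completeness, the same loop idea also works with null terminated
strings.  A bash-specific sketch, where "list0" stands for a
hypothetical file of NUL-terminated names:

  while IFS= read -r -d '' dir; do dirname -- "$dir"; done < list0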

Bob





bug#22128: dirname enhancement

2015-12-10 Thread Bob Proulx
Pádraig Brady wrote:
> Nellis, Kenneth wrote:
> > E.g., to get a list of directories that contain a specific file: 
> > 
> > find -name "xyz.dat" | dirname -f -
> 
> find -name "xyz.dat" -print0 | xargs -r0 dirname

Also if using GNU find one can use GNU find's -printf action and %h to
print the directory of the matching item.  Not portable to non-GNU
systems.

  find . -name xyz.dat -printf "%h\n"

Can generate null terminated string output for further xargs -0 use.

  find . -name xyz.dat -printf "%h\0" | xargs -0 ...otherstuff...

Bob





bug#22087: Problem with stdbuf configure test for 8.24 on Solaris with Studio C compiler.

2015-12-03 Thread Bob Proulx
Eric Blake wrote:
> Bob Proulx wrote:
> > Or is a return 0 already defaulted?  It stood out to me that the
> > previous return was unconditional and without an else or a
> > fallthrough this is a change from the previous control flow.
> > 
> >   -return !(stdbuf == 1);]])
> >   +if (stdbuf != 1)
> >   +  return 1;
> >   +return 0;]])
> 
> Explicitly listing 'return 0;' here would result in a doubled-up return
> 0 in the overall conftest.c file.

Gotcha!  That there is already a default return 0 answers my question.

Thanks,
Bob





bug#22087: Problem with stdbuf configure test for 8.24 on Solaris with Studio C compiler.

2015-12-03 Thread Bob Proulx
Paul Eggert wrote:
> How about the attached (untested) patch instead? It should fix the
> underlying problem, and thus avoid the need for fiddling with compiler
> flags.

> diff --git a/configure.ac b/configure.ac
> index 66c8cbe..3f546e9 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -475,7 +475,8 @@ AC_LINK_IFELSE(
>  {
>stdbuf = 1;
>  }]],[[
> -return !(stdbuf == 1);]])
> +if (stdbuf != 1)
> +  return 1;]])
>],
>[stdbuf_supported=yes])
>  AC_MSG_RESULT([$stdbuf_supported])

Fallthrough return 0?  Or is a return 0 already defaulted?  It stood
out to me that the previous return was unconditional and without an
else or a fallthrough this is a change from the previous control flow.

  -return !(stdbuf == 1);]])
  +if (stdbuf != 1)
  +  return 1;
  +return 0;]])

??

Bob





bug#22001: Is it possible to tab separate concatenated files?

2015-11-23 Thread Bob Proulx
Macdonald, Kim - BCCDC wrote:
> Sorry for the confusion - I wanted to add a tab (or even a new line)
> after each file that was concatenated. Actually a new line may be
> better.
>
> For Example:
> Concatenate the files like so:
> >gi|452742846|ref|NZ_CAFD01001.1| Salmonella enterica subsp., whole 
> >genome shotgun sequenceTTTCAGCATATATATAGGCCATCATACATAGCCATATAT
> >gi|452742846|ref|NZ_CAFD01002.1| Salmonella enterica subsp., whole 
> >genome shotgun 
> >sequenceCATAGCCATATATACTAGCTGACTGACGTCGCAGCTGGTCAGACTGACGTACGTCGACTGACGTC
> >gi|452742846|ref|NZ_CAFD01003.1| Salmonella enterica subsp., whole 
> >genome shotgun sequenceTATATAGATACATATATCGCGATATCAGACTGCATAGCGTCAG
> 
> Right now - Just using cat, they look , like:
> >gi|452742846|ref|NZ_CAFD01001.1| Salmonella enterica subsp., whole 
> >genome shotgun 
> >sequenceTTTCAGCATATATATAGGCCATCATACATAGCCATATAT>gi|452742846|ref|NZ_CAFD01002.1|
> > Salmonella enterica subsp., whole genome shotgun 
> >sequenceCATAGCCATATATACTAGCTGACTGACGTCGCAGCTGGTCAGACTGACGTACGTCGACTGACGTC>gi|452742846|ref|NZ_CAFD01003.1|
> > Salmonella enterica subsp., whole genome shotgun 
> >sequenceTATATAGATACATATATCGCGATATCAGACTGCATAGCGTCAG

That example shows a completely different problem.  It shows that your
input plain text files have no terminating newline, making them
officially not plain text files but binary files.  Because every plain
text line in a file must be terminated with a newline.  If they are
not then it isn't a text line.  Must be binary.

Why isn't there a newline at the end of the file?  Fix that and all of
your problems and many others go away.

Getting ahead of things 1...

If you just can't fix the lack of a newline at the end of those files
then you must handle it explicitly.

  for f in *.txt; do
cat "$f"
echo
  done

Getting ahead of things 2...

Sometimes people just want a separator between files.
Actually 'tail' will already do this rather well.

  tail -n+0 *.txt
  ==> 1.txt <==
  foo

  ==> 2.txt <==
  bar

Bob





bug#21916: sort -u drops unique lines with some locales

2015-11-16 Thread Bob Proulx
Pádraig Brady wrote:
> Christoph Anton Mitterer wrote:
> > Attached is a file, that, when sort -u'ed in my locale, loses lines
> > which are however unique.
> > 
> > I've also attached the locale, since it's a custom made one, but the
> > same seem to happen with "standard" locales as well, see e.g.
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=695489
> > 
> > PS: Please keep me CCed, as I'm writing off list.
> 
> If you compare at the byte level you'll get appropriate grouping:
> 
>   $ printf '%s\n' Ⅱ Ⅰ | LC_ALL=C sort
>   Ⅰ
>   Ⅱ

It is also possible to set only LC_COLLATE=C and not set everything to C.
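
For example, assuming LC_ALL itself is not set in the environment,
since LC_ALL would override LC_COLLATE:

  $ printf '%s\n' Ⅱ Ⅰ | LC_COLLATE=C sort
  Ⅰ
  Ⅱ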

> The same goes for other similar representations,
> like full width forms of latin numbers:
> 
>   $ printf '%s\n' ２ １ | ltrace -e strcoll sort
>   sort->strcoll("\357\274\222", "\357\274\221") = 0
>   ２
>   １
>
> That's a bit surprising, though maybe since only a limited
> number of these representations are provided, it was
> not thought appropriate to provide collation orders for them.

Hmm...  Seems questionable to me.

> There are details on the unicode representation at:
> https://en.wikipedia.org/wiki/Numerals_in_Unicode#Roman_numerals_in_Unicode
> Where it says "[f]or most purposes, it is preferable to compose the Roman 
> numerals
> from sequences of the appropriate Latin letters"
> 
> For example you could mix ISO 8859-1 and ISO 8859-5 to get appropriate 
> sorting:

One can transliterate them using 'iconv'.

  printf '%s\n' Ⅱ Ⅰ 2 1 | iconv -f UTF-8 -t ASCII//TRANSLIT | sort
  1
  2
  I
  II

Bob





bug#21760: timeout: Feature Request: --verbose ==> output if timeout was reached

2015-10-28 Thread Bob Proulx
Bernhard Voelker wrote:
> Pádraig Brady wrote:
> >Reopened until someone else votes.
> 
> I'm 50:50 regarding the usefulness of --verbose.
> Writing "killed PRG after N seconds elapsed" sounds like a useful
> message, yet I'm afraid that then other requests may follow soon
> which may really bloat the code, e.g.
> 
>   $ timeout --verbose 10 sleep 5
>   timeout: child process terminated after 5.012 seconds.

I am also 50:50 regarding the verbose message output.

For example I don't like the N seconds, or the more detailed N.012, output.
As soon as this is produced there will be other people trying to parse
it.  Having that variable data in the middle of the string to parse
makes it more complicated.  Does it matter?  Better would be just to
say, as concisely as possible, "timeout: terminated child process" and
stop there.

Having more verbosity usually sounds good.  But it often isn't.  In my
experience it tends to make the code much more crufty than it would be
otherwise.  Harder to read the actual code for needing to read around
the verbose output statements.  And then because internal code changes
will often change the verbose output it tends to break people who have
come to rely upon those messages.  Meaning it either freezes the
implementation just as it is because changes would change the visible
output or it risks breaking consumers of it if it changes.  It is not
a small problem.

>>Thomas Güttler wrote:
>>> A new team mate wasted some time to debug why
>>> a process was killed with signal 15.
>>>...
>>> This is important in environments where the one who reads
>>> the script failures is not the same person who writes the script.

IMNHO this is a common bad type of thinking.  The root cause of the
problem was created by a bug in a locally written script.  We haven't
seen that script but most likely it will have additional problems.
This bites someone.  Having been bitten, this becomes personal.
Regardless of how unique or general the problem is, once it has become
personal it becomes important to make sure that no one else ever
suffers this problem again.  Ever.  Even if it creates additional
problems.

One of the hardest things is to balance the creation of additional
problems with eliminating a previous problem.  The simpler of the two
is usually the best choice.

> As it's easy to have a wrapper for the original request, I'd rather
> not add it.

I will reiterate that I don't feel strongly either way.  As long as
the output message is as simple as possible.

> BTW: timeout shares stdout/stderr with its child; therefore,
> wouldn't the interleaved output be problematic?

A good example of a possible problem due to the law of unintended
consequences.  And if this leads to the request for --output-fd=N to
reroute file descriptors just to work around it then that is much too
much and shouldn't be done.

Bob





bug#21416: "--" syntax for ignoring flags doesn't seem to work right with GNU /bin/echo

2015-09-04 Thread Bob Proulx
Robert "Finny" Merrill wrote:
> ~/workspaces/diags-dev/s1 @bs360.sjc> /bin/echo --help
> Usage: /bin/echo [SHORT-OPTION]... [STRING]...
>   or:  /bin/echo LONG-OPTION
> Echo the STRING(s) to standard output.
> *snip*
> ~/workspaces/diags-dev/s1 @bs360.sjc> /bin/echo -- --help
> -- --help
> ~/workspaces/diags-dev/s1 @bs360.sjc>

Under what actual live conditions in the wild would someone be using
/bin/echo in this manner?

Most shell interpreters used for scripts will have a shell builtin
version of echo.

  $ ls -log /bin/sh
  lrwxrwxrwx 1 4 Nov  8  2014 /bin/sh -> dash
  $ /bin/sh -c 'echo --help'
  --help
  $ /bin/bash -c 'echo --help'
  --help
  $ /bin/ash -c 'echo --help'
  --help
  $ /bin/dash -c 'echo --help'
  --help
  $ /bin/ksh -c 'echo --help'
  --help
  $ /bin/csh -c 'echo --help'
  --help

I think this might be a problem that is purely academic as it can't
ever actually be hit in real life.  However if you provide an actual
example that would go a long way to making this problem clear.

> There doesn't seem to be a way to get /bin/echo to output the string "--help"

Woe is me for suggesting using -e or -E as they are terribly
non-portable options.  Don't use them!  Use printf instead.  But
having said that...

  $ /bin/echo -e --help
  --help
  $ /bin/echo -E --help
  --help

But please don't do it.  Use printf instead.  The shell printf command
has a standard syntax and may portably be used.

Bob





bug#21369: Coreutils RHEL 6.7 runuser

2015-08-28 Thread Bob Proulx
billy_k_woo...@homedepot.com wrote:
> Here is what we're running on RHEL6.7, and it's throwing 99 as the
> return code ($?)
> 
> test:/root#  /sbin/runuser -s /bin/ksh - tomcat -c whoami
> test:/root#  echo $?
> test:/root#  99

I cannot reproduce your problem on a RHEL 6.7 system here.  Therefore
I can only conclude that the problem must be in your local environment.

I am highly suspicious of the '-' option running the tomcat user's
profile and $ENV files.  What is in ~tomcat/.profile?

Does the command work without the '-'?

  /sbin/runuser -s /bin/ksh tomcat -c whoami

Does that command work for other users not tomcat?  Perhaps you have
'mysql' already installed?  That would make a good alternative test
case if it happens to be there and if you have tomcat then I think it
likely.

  /sbin/runuser -s /bin/ksh mysql -c whoami

Looking a little further into the strace log you submitted I see that pid
23575 appears to be ksh and it appears to have segfaulted.

  [pid 23575] --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
  [pid 23575] +++ killed by SIGSEGV +++

That isn't good.  Can you try a different shell?  (On my RHEL 6.7
system /bin/sh is symlinked to bash.)  I also have dash installed by
default and think it would be a good alternative test case.

  /sbin/runuser -s /bin/sh - tomcat -c whoami
  /sbin/runuser -s /bin/dash - tomcat -c whoami

If 'ksh' is segfaulting then that is likely a problem if not the problem.
Do look closely at the tomcat profile and $ENV (possibly .kshrc)
environment though as I think that is likely involved.

Bob





bug#21371: Coreutils RHEL 6.7 runuser

2015-08-28 Thread Bob Proulx
forcemerge 21369 21371
stop





bug#21369: Coreutils RHEL 6.7 runuser

2015-08-28 Thread Bob Proulx
tag 21369 + notabug moreinfo
thanks

Hello Billy,

Thank you for your bug report.  We always appreciate it when people
take the time to file reports about problems.  However your report
isn't as good as it could be.  Sorry.  I am not saying that it isn't
appreciated.  But it could be better.

billy_k_woo...@homedepot.com wrote:
> /sbin/runuser issue on RHEL 6.7 when executing `whoami` as tomcat.

For one you don't really say what you are expecting.  You say there is
a problem with runuser running whoami as tomcat.  Instead please show
us exactly what command you were running verbatim.  Without having
that we are left to guess.  Or to dig it out.

I deduce from the logfile that you ran this command.

  # /sbin/runuser -s /bin/ksh - tomcat -c whoami

What was the output of that command?  Please always include the
command and the output of the command verbatim.  Otherwise we don't
know what it was or we have to dig for it.  That is why I tagged this
bug as "moreinfo" needed.

> Strace -f attached.

Wow.  That is a huge attachment making the email around 839K on the
initial contact.  Please anytime you send something that large
compress the log file first.  Using gzip on the log reduces it to 51K
which is much lighter on the mailing lists and everyone's mailboxes.
Mailing lists at lists.gnu.org often have thousands of subscribers and
the bandwidth consumed is multiplied by every one.

Also before sending something that large it is useful to make contact
and make sure it is going to be useful.  Here I am not sure it is
useful.  Because you are stracing 'runuser' which is running 'ksh'
with the '-' option to source the profile and $ENV and then it is
running 'whoami'.  That is quite the long way around the universe to
get to the end.

For another you are reporting a problem about "Coreutils RHEL 6.7
runuser" but runuser is not a coreutils program.  AFAIK runuser comes
from the util-linux package.  That isn't something we over here in the
coreutils project have anything to do with.  Plus 'ksh'.  Plus the
entirety of the (unknown contents of) profile.  And then finally the
'whoami' command, which is a coreutils program.  Even if we fully
understand what you are reporting it is unlikely we can do anything
about it.  That is why I have initially tagged this report as
"notabug" concerning coreutils.

> The information in this Internet Email is confidential and may be
> legally privileged. It is intended solely for the addressee. Access
>...

And finally there is the useless email disclaimer.  We all know and
understand that it is attached by your company and there isn't
anything you can do about it.  But those are also useless, annoying,
and legally unenforceable.  The usual recommendation is that there are
many free(dom) respecting email providers available that don't abuse
the user's email this way.  It is good to use one of them instead.

So where are we?  What can we do to make this report better?  First
please tell us what output comes out of the command.  I scanned
through the strace log in some detail but I couldn't dig out of it
what output was actually produced.  Or not if none.  It is a huge log
file and I couldn't spend a huge amount of time on it.

Then I expect that the problem is related to the use of the runuser
'-' option which sources the (unknown contents of) profile and $ENV.
I have often seen problems due to the code contents of those files.

When you reply please keep the 21...@debbugs.gnu.org bug address in
the recipient list.  That is the bug log for this report.

Thanks,
Bob

READ CAREFULLY.  By reading this email, you agree, on behalf of your
employer, to release me from all obligations and waivers arising from
any and all NON-NEGOTIATED agreements, licences, terms-of-service,
shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure,
non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I
have entered into with your employer, its partners, licensors, agents
and assigns, in perpetuity, without prejudice to my ongoing rights and
privileges.  You further represent that you have the authority to
release me from any BOGUS AGREEMENTS on behalf of your employer.





bug#21349: who shows no users nowadays on Debian

2015-08-26 Thread Bob Proulx
Sven Joachim wrote:
> Bob Proulx wrote:
> > AFAIK it doesn't have anything to do with Debian changing anything.
> 
> It most probably has, the latest xterm version (319) only writes a utmp
> entry if you start a login shell (i.e. use the -ls option).  That's
> supposed to be a bug fix[1].
> 1. https://bugs.debian.org/794201

Hmm...  I guess I haven't logged out since that xterm arrived.

  $ who | grep -c pts/
  61
  ... counting all of the entries up to pts/65...

And so it appears that it has been logging for me.  But if I
explicitly start a new xterm:

  $ xterm &

And then in that window look:

  $ tty
  /dev/pts/61

  $ who | grep -c pts/61
  0

So I stand corrected.  It appears that very recently it was changed to
not log utmp.  In spite of the bug log reporting the bug to have been
fixed this still seems to be active with an up to date Sid.  It
appears to have arrived Sunday with version 319-1.  Even explicitly
giving +ut doesn't do anything.  I will further take the discussion to
the Debian bug log since it doesn't have anything to do with
coreutils.

Bob






bug#21349: who shows no users nowadays on Debian

2015-08-26 Thread Bob Proulx
Erik Auerswald wrote:
> This works on a current Debian/testing system (stable as well), so it might
> be a recent Debian/Sid (unstable) issue. Perhaps you want to open a bug
> report there?

Updating utmp depends upon the terminal emulator.  XTerm updates it.
Any XTerm user will be logged in utmp.  I use XTerms and all of my
terminals log as a user in utmp.  But most other terminal emulators
ignore utmp and don't log anything.  If they don't log it then there
isn't anything 'who' can do about it.

AFAIK it doesn't have anything to do with Debian changing anything.
It is all about the changing state of Desktop Environments.  For the
graphical mouse user today they have exactly one "login" now and that
is the entirety of the graphical environment and they never launch a
terminal emulator.

Bob





bug#21290: ln --relative bug?

2015-08-18 Thread Bob Proulx
Pádraig Brady wrote:
> Matteo Cerutti wrote:
> > # ln -s --relative b c
> > lrwxrwxrwx. 1 root root 20 Aug 18 13:03 c -> ../non_existent_file
> > shouldn't c point to b?
> > 
> > Mind that it doesn't happen when the --relative option is absent.
> 
> Are you referring to the non_existent_file
> or the fact that the symlink is dereferenced before the relative adjustment?

Regarding:

  ln -s --relative b c

I think the expectation of least surprise is that --relative would
restrict itself to changing only the relative path of "b".  Since "b"
is already a relative path, --relative would have nothing to do there.
And indeed when I saw the actual behavior it surprised me (until I
read the documentation), because it is order dependent.

  mkdir /tmp/test
  cd /tmp/test
  ln -s --relative b c
  ln -s /tmp/non_existent_file b
  ls -ldog b c
lrwxrwxrwx 1 22 Aug 18 16:30 b -> /tmp/non_existent_file
lrwxrwxrwx 1  1 Aug 18 16:29 c -> b
  ln -sfn --relative b c
lrwxrwxrwx 1 17 Aug 18 16:31 c -> non_existent_file
  rm -f c
  ln -s --relative b c
lrwxrwxrwx 1 1 Aug 18 16:38 c -> b

Without reading the documentation for --relative I found that surprising.
It isn't behavior that I would ever knowingly want to use.

> If the latter then that's expected as detailed at:
> 
>   http://www.gnu.org/software/coreutils/ln
> 
> Also included there is an example using `realpath`
> which gives more control over the dereferencing.

The documentation does clearly state that --relative does *both*: it
converts the value to a relative path and it dereferences using
realpath.

 Relative symbolic links are generated based on their canonicalized
 containing directory, and canonicalized targets.  I.E. all symbolic
 links in these file names will be resolved.  *Note realpath
 invocation::, which gives greater control over relative file name
 generation, as demonstrated in the following example:

In hindsight perhaps it should have been called --relative-realpath
instead of just --relative.  Because eventually it will be suggested
that --relative-only would be a useful option that only creates
relative paths.  Oh well.  (shrug)

Bob





bug#21218: ls -d

2015-08-12 Thread Bob Proulx
Sneeh, Eddie wrote:
>  The problem is the behavior of ls -d is so erratic it's difficult to
> describe in a concise way.

This statement pains me to read.  Because it shows that there is still
a misunderstanding of ls and ls -d.

> I haven't seen the code or the spec, so my suggestion is based
> solely on observation (which may or may not be complete).

Let me try to describe it.  In the beginning, as far as we care here,
was the Unix file system.  In this file system everything is a file[1].  I
am going to say that again.  Everything is a file.

Some files are special files.  Most files in /dev are special files
such as the block and character devices that immediately people think
about.  But directories are files too!  Directories are simply files
in the Unix file system.  Directories are another example of special
files.

The 'ls' command reads directories, which are special files, and lists
their contents.  The primary purpose of 'ls' is to list directories.
Focus on that task.  Read the directory file.  List it out.  Here is
an example.  In my example I am going to talk classically and simply
and acknowledge that some of the system and library calls have been
modernized but that doesn't matter because the behavior remains the
same.

  ls /

The 'ls' command will have one program argument.  That one program
argument is the "/" string.  The 'ls' program will stat(2) the
argument to see if it exists and if it is a directory.  If it is a
directory then it will treat it specially and call opendir(3) on that
string.  If that succeeds then it reads the directory and lists out
the content.  It only does this for directories.  In this way
directories are treated specially by 'ls'.  If the program argument is
not identified by stat(2) as a directory then it is simply printed
out normally.

If 'ls' is not given any program arguments then it defaults to listing
the "." directory.  The 'ls' command without arguments:

  ls

Is the same as:

  ls .

This is the same thing.  The 'ls' program opens the "." string and
gets a handle to the file.  It then reads the directory and lists the
contents found there.

What about shell wildcards?

First they are called *shell* wildcards.  They don't have anything to
do with 'ls' at all.  The shell, typically /bin/bash, scans the
command line and *replaces* any shell wildcard with the expansion of
the file glob.  The '*' is called a file glob because it replaces a
glob of characters.  There are also [...]'s too.  This replacement
happens at the command line level and happens before the shell
invocation of 'ls'.

  mkdir test
  cd test
  touch file1.txt file2.txt file3.txt
  ls ./*.txt
  ...the ./*.txt is expanded and replaced by the shell...
  ls file1.txt file2.txt file3.txt
  ...the 'ls' command has three program arguments provided by the shell...
file1.txt file2.txt file3.txt

The 'ls' program has no way of knowing it was invoked with a shell
wildcard.  The shell replaced that wildcard before ls was launched.
All the 'ls' program knows is that it has a list of program
arguments.  It will walk through each of them in turn and list them
out program argument by program argument.

Now consider this and this next example.

  mkdir dir1.d
  touch dir1.d/file11.txt
  touch dir1.d/file12.txt
  ls -log *.*
-rw-rw-r-- 1  0 Aug 12 11:33 file1.txt
-rw-rw-r-- 1  0 Aug 12 11:33 file2.txt
-rw-rw-r-- 1  0 Aug 12 11:33 file3.txt

dir1.d:
total 0
-rw-rw-r-- 1 0 Aug 12 11:42 file11.txt
-rw-rw-r-- 1 0 Aug 12 11:46 file12.txt

Note that the files file1.txt, file2.txt and file3.txt plus dir1.d were
what was listed.  The files file11.txt and file12.txt are part of the
dir1.d listing.  It is subtle but it is really dir1.d that is being
listed.

That is useful.  But it is also often not desired.  What has happened?
Let's use 'echo' to show what program arguments the shell is using
to invoke the 'ls' command.

  echo ls *.*
ls dir1.d file1.txt file2.txt file3.txt

And so here we see that 'ls' has four program arguments.  The dir1.d
argument is a directory.  Therefore 'ls' reads the directory and lists
the contents.  And in doing so the directory itself is skipped from
the listing.  What are the permissions on dir1.d?

  ls -log .
total 0
drwxrwxr-x 2 80 Aug 12 11:46 dir1.d
-rw-rw-r-- 1  0 Aug 12 11:33 file1.txt
-rw-rw-r-- 1  0 Aug 12 11:33 file2.txt
-rw-rw-r-- 1  0 Aug 12 11:33 file3.txt

Okay.  The "." directory was listed.  That included the entire
directory.  But I just want to list the dir1.d directory by itself.

  ls -log *.d

That is the same as just naming the directory on the command line.  I
just want to see the permissions on the directory and not the contents.

  ls -log dir1.d
total 0
-rw-rw-r-- 1 0 Aug 12 11:42 file11.txt
-rw-rw-r-- 1 0 Aug 12 11:46 file12.txt

That isn't listing the directory.  Because 'ls' has a program argument
and it is a directory it is listing the contents of the directory.
But I want to list the permissions on the directory itself!
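
That is exactly what the -d option is for.  It tells 'ls' to operate
on the directory argument itself instead of reading its contents.
The archived message is cut off above, so this continuation of the
example is reconstructed for illustration:

  ls -dlog dir1.d
drwxrwxr-x 2 80 Aug 12 11:46 dir1.d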


bug#21218: ls -d

2015-08-09 Thread Bob Proulx
Sneeh, Eddie wrote:
> I believe there is a problem with ls -d (unless the intent is to just list
> 1 directory).

Just noting this has an FAQ entry:

  http://www.gnu.org/software/coreutils/faq/coreutils-faq.html#ls-_002dd-does-not-list-directories_0021

Bob





bug#21051: direct/file deletion

2015-07-14 Thread Bob Proulx
Lee Sung wrote:
> [lsung@hw-lnx-shell3 ~/fpc_i2cs_cpld]$ ls -al
> total 28
> drwxr-xr-x   3 lsung ipg4096 Jul 13 11:23 .
> drwxr-xr-x  26 lsung qa-others 20480 Jul 13 11:40 ..
> drwxr-xr-x   2 lsung ipg4096 Jul 13 11:23 rtl
> 
> I want to delete  dir fpc_i2cs_cpld, but I cannot.

Easiest is to change directory to one level above and then delete it
with rm -r -f, which will perform a recursive removal of everything in
the directory depth first from the bottom up.

  cd ..
  rm -rf fpc_i2cs_cpld  # warning: recursive removal

But you could also delete it using the full path.

  rm -rf ~/fpc_i2cs_cpld  # warning: recursive removal

> How would I delete directory "." and ".."

Those entries are required infrastructure and should not be deleted.
The "." directory refers to the current directory.  The ".." refers to
the parent directory.  The ".." entry on some classic Unix file
systems may be unlinked but I don't believe that any current Linux
file system allows this.  This is a restriction imposed by the kernel
and not by coreutils.

Bob





bug#21011: df utility

2015-07-08 Thread Bob Proulx
erbenton wrote:
> df: ‘/sys/kernel/debug/tracing’: Permission denied
>...
> Still the same issue, using df from a non-root account always
> results in permission denied warning which messes with scripts that
> use df.  strace shows that even with -x debugfs or -x sysfs that df
> is still probing /sys

Could you also provide information on /etc/mtab and the contents
of it?  I expect /etc/mtab is a symlink to /proc/mounts but I am
curious about the /sys entries that are in it.

  ls -l /etc/mtab

  cat /etc/mtab

Thanks,
Bob





bug#20954: wc - linux

2015-07-05 Thread Bob Proulx
tele wrote:
> Maybe we did not understand.
> I don't want change old definitions but create new option for wc or echo,
> because this above examples not make logic sense,

What would such an option do?

> ( and it I want fix, however with sed is also fixed )

Your original message asked if "echo | wc -l" should count 0 lines
instead of 1 line.  But the echo is going to produce one line and
therefore it should be counted.

In a later message you wrote using sed to delete blank lines so that
only non-blank lines remained to be counted.

> $ a="" ; echo  "$a"  |  sed '/^\s*$/d' | wc -l
> 0
> 
> $ a="3" ; echo  "$a"  |  sed '/^\s*$/d' | wc -l
> 1
> 
> Can be added option to "wc" to fix this problem without use sed in future ?

This tells me that you did not yet understand things. :-(

I tried to explain this in more detail in my response to that message.
The sed command you pulled from stackoverflow.com deletes blank
lines.  That is a good way to avoid counting blank lines.

If I guess at what you are suggesting then it does not make sense to
add an option to wc to count only non-blank lines.  If you don't want
to count blank lines then delete them first.  There are an infinite
number of possible things to count.  There cannot be an infinite
number of options implemented.  And using sed to delete blank lines is
the Right Way To Do Things.

> however now I understand that they work correctly in accordance with
> accepted principles.

Yes.

> > What is a text line? A text line by definition ends with a
> > newline. This has been standardized to prevent different
> > implementations from implementing it differently and creating
> > portability problems. Therefore all standards compliant
> > implementations must implement it in the same way to prevent
> > portability problems.
> 
> " wc -l " in most examples working correct,

"most"?  No.  "wc -l" is working correctly in all examples. :-)

> because it " echo "  give's " \n " and "wc -l" count correct.

Yes.

> I mentioned about "wc", because for me build option "wc -a" for "echo"  or
> "echo -m"
> this is not important.
> Maybe exist hope for example create option "m" to echo  , " echo -m "
> which not will from new line, but first line if variable is empty
> and from new line if is not empty  ?
> 
> example:
> 
> echo -m "" | wc -l
> 0
> 
> echo -m "e" | wc -l
> 1

The shell is a programming language.  A very large number of
possibilities, if not an infinite number, may be implemented by
programming them in the shell.  Not all of those possibilities should
be coded into specific options.  Instead, if you have a specific
need, program it.  Simply write the code that says explicitly what
you want to do.  Millions of lines of code have been written for
various tasks.  All of those millions of lines should not be turned
into specific options.  If you want to delete blank lines then simply
delete blank lines.
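
For example, here is a sketch of the behavior you asked for,
programmed in plain shell ("emit" is just a hypothetical helper
name):

  emit() { test -n "$1" && printf '%s\n' "$1"; }
  emit ""  | wc -l    # prints 0
  emit "e" | wc -l    # prints 1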

This entire discussion feels like an XY problem.  Here is a collection
of explanations of the XY problem.

  http://www.perlmonks.org/?node_id=542341

The help-b...@gnu.org mailing list is the right place to follow up.
If you write there, say what you are trying to do, and ask how to do
it in the shell, people will try to help you.

Bob





bug#20981: BUG ABOUT XZ AND RZ COMMAND IN VMWARE UBUNTU

2015-07-04 Thread Bob Proulx
close 20981
thanks

Shubham Mishra wrote:
> when i tries to execute xz and rz commands on ubuntu 14 in my vmware they
> are not working properly.
> please resolve the error

You are reporting a bug in xz and rz however we are not the
maintainers of that software.  We can't help you.  Nothing to do with
us here.  Please report the problem to your Ubuntu distribution
maintainers.  Here is a link to more information.

  https://help.ubuntu.com/community/ReportingBugs

Unfortunately your bug report, while appreciated, contains no useful
information.  In order to be useful a bug report must say what you did
and what you expected.  What did you do?  Cut and paste it verbatim.
What did you expect?  Oftentimes the problem is one of expectations
and of misunderstanding the behavior.  However don't tell us; we are
not the xz or rz folks.  There is nothing we can do.  Tell Ubuntu
instead.

The xz and rz programs are completely unrelated.  It is very unlikely
that you are having a problem with both of them.  Very likely you have
confused one with the other.  xz is a compression program.  rz is a
file transfer program.

Bob






bug#20954: wc - linux

2015-07-02 Thread Bob Proulx
tele wrote:
> "echo" gives in new line,

Yes.

> "echo -n" subtracts 1 line,

echo -n is non-portable and shouldn't be used.

echo -n suppresses emitting a trailing newline.

Note that in both of these cases you are using the shell's internal
builtin echo and not the coreutils echo.  They behave the same.
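
You can check which echo a command name refers to.  For example in
bash:

  type -a echo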

> but "wc -l" can count only from new line, so if something exist
> inside first line "wc -l" can not count. :-(

"wc -l" counts newlines.  That is the task that it was constructed to
do.  That is exactly what it does.  No more and no less.

What is a text line?  A text line by definition ends with a newline.
This has been standardized to prevent different implementations from
implementing it differently and creating portability problems.
Therefore all standards compliant implementations must implement it in
the same way to prevent portability problems.

> example:
>
>   $ a="j" ; echo  "$a"  |  wc -l
>   1

I have been wondering.  Why are you using a variable here?  Using the
variable as you are doing is no different than not using the variable.

  echo "j" | od -tx1 -c
  0000000  6a  0a
            j  \n

There is one newline.  That counts as one text line.

>   $ a="" ; echo  "$a"  |  wc -l
>   1

  echo "" | od -tx1 -c
  0000000  0a
           \n

There is one newline.  That counts as one text line.

>   $ a="" ; echo -n "$a"  |  wc -l
>   0

  echo -n "" | od -tx1 -c
  0000000

Nothing was emitted.  No newlines, so it counts as zero lines.  In
fact zero characters were emitted, exactly as with /dev/null:

  od -tx1 -c < /dev/null
  0000000

>   $ a="j" ; echo -n "$a"  |  wc -l
>   0

  echo -n "j" | od -tx1 -c
  0000000  6a
            j

That emits one character, the 'j' character.  It emits no newlines.
Without any newlines at all that is not and cannot be a "text" line.
Without a newline that can only be interpreted as binary data.  In any
case there were no newlines to count and "wc -l" counted and reported
zero newlines.

Instead of echo -n it would be better, and portable, to use printf.

  printf "j" | od -tx1 -c
  0000000  6a
            j

Same action in a portable way using printf.  Avoid using echo with
options.
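
As general replacements (using your variable $a):

  printf '%s\n' "$a"    # portable equivalent of: echo "$a"
  printf '%s' "$a"      # portable equivalent of: echo -n "$a"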

> So,
> 
> $ a="" ; echo  "$a"  |  sed '/^\s*$/d' | wc -l
> 0

  echo "" | sed '/^\s*$/d' | od -tx1 -c
  0000000

As we previously saw, the echo emits one newline character.  This is
piped to the sed program, which deletes that line; deleting the line
is what the sed 'd' action does.  Therefore sed does not emit the
newline.  The text line is deleted.

> $ a="3" ; echo  "$a"  |  sed '/^\s*$/d' | wc -l
> 1

  echo "3" | sed '/^\s*$/d' | od -tx1 -c
  0000000  33  0a
            3  \n

Here the echo emitted two characters, a '3' and a newline.  The sed
program's pattern did not match and therefore sed did not delete the
line.  It passed the one text line through to wc, and "wc -l" counted
the one newline and reported one text line.

> Can be added option to "wc" to fix this problem without use sed in future ?
> Thanks for helping :-)

There is no problem to be fixed.  And therefore this isn't something
that can be "fixed" in wc.

Bob





bug#20954: wc - linux

2015-07-01 Thread Bob Proulx
tag 20954 + notabug
close 20954
thanks

tele wrote:
> Hi!

Hi! :-)

> From terminal:
> 
> $ a="" ; echo $s | wc -l
> 1

Do you mean $a instead of $s?  Either way the result is the same,
assuming $s is empty too.

> Should be 0 , yes ?

No.  Should be 1.  You have forgotten about the newline at the end of
the command.  The echo will terminate with a newline.  You can see
this with od.

  echo | od -tx1 -c
  0000000  0a
           \n

Since this appears to be a usage error I have closed the bug.  Please
feel free to follow up with more information.  We will read it.  And
we appreciate additional communication!  I am simply closing it to
keep the accounting straight.  :-)

Bob





bug#20679: A bug of pwd

2015-05-30 Thread Bob Proulx
Bernhard Voelker wrote:
> First of all, I want to mention that the invoked 'pwd' is a builtin
> in most shells, which means you have to e.g. specify the path like
> /bin/pwd to be sure to invoke the coreutils version of it.

A very, very small comment.  This is all true but the wording makes
it sound somewhat like a recommendation to use /bin/pwd in order to
get the coreutils program.  I don't think that was intended.  I think
it was intended only to point out that normally the user has called
the builtin pwd, that the builtin is not part of coreutils, and that
therefore there is nothing we would do about it here.  Reports about
the builtin would go to the shell.

One can test for differences by running both the plain "pwd" builtin
and the "/bin/pwd" coreutils version and comparing the results.  If
they work the same then it is very unlikely to be a bug, since they
are independent implementations.
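
For example in bash:

  type -a pwd      # shows the builtin and the file system program
  pwd              # runs the shell builtin
  /bin/pwd         # runs the coreutils program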

For typical scripting it would be normal to continue to use the plain
"pwd" and use the shell builtin version.  But proper shell quoting is
still needed.  :-)

Bob





bug#20603: Possible bug in cp

2015-05-19 Thread Bob Proulx
Chris Puttick wrote:
> The expansion & consequences of my typo understood! However, given the
> risks inherent in this edge case (directory only has 2 files in it)
> and the unlikelihood of someone wanting to change a directory
> containing 2 different files into a directory containing 2 identical
> but differently named files, it would be great if the cp command to
> check, when the source and destination directories are the same, the
> file count in that directory and issue a warning before continuing if
> the file count =2.

So I think you are telling me that if I were to do this:

  mkdir /tmp/junk
  touch /tmp/junk/file1 /tmp/junk/file2
  cp /tmp/junk/file1 /tmp/junk/file2

You are suggesting that the cp command above would fail in that
situation?  I am sorry but that is much too specific a thing to
single out as a special case in the code.  And it would fail in many
perfectly valid situations, like the one above, which is perfectly
valid and should not fail.

Additionally I often do things like this for example:

  cp foo.conf foo.conf.bak
  edit foo.conf
  cp foo.conf.bak foo.conf

It would be wrong if the above were to fail.

Additionally what about this case:

  mkdir /tmp/junk && cd /tmp/junk
  touch one.txt two.txt
  touch README
  cp *.txt

The shell will expand "cp *.txt" to be "cp one.txt two.txt".  The file
count for the directory will be three and therefore the suggested
special case rule would not be triggered.

I daresay that the suggested special case rule would only very rarely
be triggered when it would help and would be a false positive for most
valid cases.

I am sorry that you had a bad experience with file glob expansion.
But so far the suggested improvement hasn't made a compelling
argument.

Bob





bug#20603: Possible bug in cp

2015-05-18 Thread Bob Proulx
Chris Puttick wrote:
> In moment of tiredness issued a cp command similar to
> cp /path/to/files/*
> neglecting to add a destination (should have been ./)
> The files in the directory were 2 vdisks both ending .qcow2.

Ouch!  I am sorry for your loss.  I hope you had backups. :-(

> No error message was generated and the apparent result of running the
> command is that the file whose name came first alphabetically was
> copied over the other.

There was no error message because as far as the computer was
concerned it did exactly as you instructed it and no error occurred.

You are apparently not aware that the shell reads your command line,
parses it, expands it, and then the shell executes the resulting
command.  Many command line characters are "shell metacharacters".
Search for that term and you will find many references.  When I say
shell here I will assume the bash shell; however, this part applies
to all of the Unix-like command line shells such as ksh, zsh, csh,
sh, and so forth.

One of the file glob characters is the "*" character.  (It is called a
file glob because the star is expanded to match a glob of files.)
Whenever you use a '*' in a command line that is an instruction to the
shell.  It tells the shell to list the files and match them and
replace the star character with the list of matching file names.  Try
this exercise to understand a little bit about the '*' character and
what the shell does.

  $ mkdir /tmp/junk
  $ cd /tmp/junk
  $ touch file1
  $ echo *
  file1
  $ touch file2
  $ echo *
  file1 file2
  $ touch file3
  $ echo *
  file1 file2 file3
  $ echo *1
  file1
  $ echo *[23]
  file2 file3

As you can see the shell is replacing the '*' character with the files
that were expanded which match it.  And I threw in another couple of
file expansions too just to help push the concept home.

By this point you should know that your cp command had a '*' in the
command line.  The shell expanded that star.  There were only two
expansions.  Therefore the shell invoked cp with two arguments.

  cp /path/to/files/file1 /path/to/files/file2

That is the command that cp was invoked with after the shell expanded
the file globs on the command line.  As far as the cp command is
concerned it was given a command, cp executed the command, and the
command was completed without error.

The cp command has no way of knowing that you wanted to execute this
command instead.

  cp /path/to/files/file1 /path/to/files/file2 ./

How could it?  It can't.  One command looks just the same as the other
one to the cp command.  None of the commands ever see the '*'
character because it is expanded by the shell and replaced before the
utility is invoked.
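
One way to make the expansion visible before committing to it is to
put echo in front of the command.  The shell expands the glob exactly
as before but echo merely prints the result:

  echo cp /path/to/files/* ./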

Hope that expansion helps explain things.

Bob





bug#20575: possible bug with false?

2015-05-14 Thread Bob Proulx
Pádraig Brady wrote:
> Brian Walsh wrote:
> > 'false --help' and 'false --version' print nothing and return an
> error. I honestly don't know if it's working as intended. If not,
> the man page needs to be updated.
> 
> I think you're using the shell builtin?
> 
> Try: env false --help

Just noting that this also has an FAQ entry.

  https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#I-am-having-a-problem-with-kill-nice-pwd-sleep-or-test_002e

  bash$ type -a false
  false is a shell builtin
  false is /bin/false

Bob





bug#20523: GNU coreutils 8.4 date: wrong day shift calculation at the spring daylight savings time cutover

2015-05-07 Thread Bob Proulx
Markus Baur wrote:
> On one of my production systems I do daily database dumps between
> midnight and 1am every day. I noticed on March 9th this year is was
> dumping the wrong day. Digging further into this I found the shell
> wrapper script to be at fault and specifically the GNU date
> program. Here is a simplified version to reproduce the bug:

Thank you for the report.  However this appears to be a usage issue.

> echo YESTERDAY is `date -d 'yesterday' +%Y%m%d`
> echo 30 DAYS AGO is `date -d '30 days ago' +%Y%m%d`

Both of those are problematic when used near Daylight Saving Time
changes.

  $ zdump -v US/Pacific |grep 2015
  US/Pacific  Sun Mar  8 09:59:59 2015 UT = Sun Mar  8 01:59:59 2015 PST isdst=0 gmtoff=-28800
  US/Pacific  Sun Mar  8 10:00:00 2015 UT = Sun Mar  8 03:00:00 2015 PDT isdst=1 gmtoff=-25200
  US/Pacific  Sun Nov  1 08:59:59 2015 UT = Sun Nov  1 01:59:59 2015 PDT isdst=1 gmtoff=-25200
  US/Pacific  Sun Nov  1 09:00:00 2015 UT = Sun Nov  1 01:00:00 2015 PST isdst=0 gmtoff=-28800

As you can see, March 9th is right on top of the March 8th DST
change, so going back one day crosses it.  Instead use one of these.

Use UTC (with the UTC offset):

  echo YESTERDAY is `date -u -d 'yesterday' +%Y%m%d`
  echo 30 DAYS AGO is `date -u -d '30 days ago' +%Y%m%d`

Or use 12:00 noon localtime:

  echo YESTERDAY is `date -d 'yesterday 12:00' +%Y%m%d`
  echo 30 DAYS AGO is `date -d '12:00 30 days ago' +%Y%m%d`

> root@yoyo-01-64-lv$ date 03090059; ./yesterday.sh
> Mon Mar  9 00:59:00 PDT 2015
> NOW is Mon Mar 9 00:59:00 PDT 2015
> TODAY is 20150309
> YESTERDAY is 20150307
> 30 DAYS AGO is 20150206
> 
> root@yoyo-01-64-lv$ date 03090100; ./yesterday.sh
> Mon Mar  9 01:00:00 PDT 2015
> NOW is Mon Mar 9 01:00:00 PDT 2015
> TODAY is 20150309
> YESTERDAY is 20150308
> 30 DAYS AGO is 20150207

It is not necessary to set the system time.  Simply provide a full
time reference to date and then operate relative to it.

  $ TZ=US/Pacific date -d '2015-03-09 00:59:00 -0700' +%Y-%m-%d
  2015-03-09

  $ TZ=US/Pacific date -d '2015-03-09 00:59:00 -0700 yesterday' +%Y-%m-%d
  2015-03-07

  $ TZ=US/Pacific date -d '2015-03-09 00:59:00 -0700' '+%Y-%m-%d %T %z'
  2015-03-09 00:59:00 -0700

  $ TZ=US/Pacific date -d '2015-03-09 00:59:00 -0700 yesterday' '+%Y-%m-%d %T %z'
  2015-03-07 23:59:00 -0800

> As you can see, the "yesterday" as well as the "30 days ago"
> calculation are one day off at 00:59, but correct a minute later.

Actually not.  If you examine the times you will find that because
of DST the time springs forward in the Spring and falls back in the
Fall.  In the Spring, when time springs forward, an hour goes
missing.  One day ago (yesterday) means 24 hours earlier, and
subtracting 24 hours at 00:59 crosses the missing hour and lands a
calendar day earlier than expected.

The FAQ documents the issue in detail.

  http://www.gnu.org/software/coreutils/faq/coreutils-faq.html#The-date-command-is-not-working-right_002e

In summary the usage is to either use UTC which avoids DST changes or
to specify a time that is away from DST changes such as 12:00 noon.
Use one or the other as described above and this problem is avoided.

Bob





bug#20407: Doubt on md5 case sensitivity

2015-04-22 Thread Bob Proulx
Indranil Chowdhury wrote:
> Is the md5 checksum comparison case sensitive? Or is it not? I did not find
> the answer in your manuals. Could you please let me know in a short reply?

The comparison of the md5sum is case insensitive.  The md5sum is a
value encoded in hexadecimal.  Case is insignificant in a hexadecimal
value.

You can prove this to yourself by trying an experiment.

  /tmp$ echo a > /tmp/a
  /tmp$ md5sum /tmp/a > /tmp/a.md5sum
  /tmp$ cat /tmp/a.md5sum
  60b725f10c9c85c70d97880dfe8191b3  /tmp/a
  /tmp$ md5sum -c /tmp/a.md5sum
  /tmp/a: OK
  /tmp$ awk '{print toupper($1), $2}' /tmp/a.md5sum
  60B725F10C9C85C70D97880DFE8191B3 /tmp/a
  /tmp$ awk '{print toupper($1), $2}' /tmp/a.md5sum | md5sum -c -
  /tmp/a: OK

Please ask questions on the coreut...@gnu.org mailing list rather than
in the bug tracker.

Bob





bug#19760: [bug] "tail -f" with inotify fails to follow a file after a rename()

2015-04-01 Thread Bob Proulx
Pádraig Brady wrote:
> BTW given that -f was broken for so long (6 years)

In retrospect the inotify support, while appearing nifty, has been the
source of many bugs and problems for such a simple utility.  It
certainly destabilized it.

> it lends more weight to making -f behave like -F by default.
>
> Note POSIX allows (and even implies this),
> and openBSD -f behaves like -F for example.

I think the reason this problem isn't often noticed is that once a
file has been renamed it usually is not written to anymore.  This
makes the behavior mostly invisible in the typical case.  Except in
the cases where it is not, where people count on it working as
expected and it doesn't.  Then it is a shame that such a simple
utility is misbehaving.

> Not something appropriate for coreutils 8.x,
> and I'd be 60:40 against changing in a later major release,
> but it's worth mentioning.

I use the -F behavior often but I think changing to -F behavior by
default actually makes it more complicated.  For example the -f
behavior is that tail follows a file.  And if I rename the file?  Then
it follows the original file.  If I unlink the file?  Then it follows
the original file.  No complexity.  Simple.  Does what it says.  (Or
at least should.)  The -F is more complicated and needs explanation
for how each of those cases is handled.
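
To make the difference concrete (the log file name here is only an
example):

  tail -f /var/log/syslog    # follows the file it opened, even after a rename
  tail -F /var/log/syslog    # follows the name, reopening it when replaced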

Bob





bug#20091: mv command

2015-03-12 Thread Bob Proulx
Rogers, Charles (MAN-Corporate-CON) wrote:
> Thank you all so much for the explanation.   It is as you describe.
> 
> 1.  We had insufficient permissions on the source directory
> 2.  The destination directory was indeed on a different file system

Ah!  All is explained.

> So, our question is answered, and again thanks.

Very good.  I am glad it worked out for you.  I will close this bug
ticket with this message then.

Bob





bug#20094: cp --dry-run

2015-03-12 Thread Bob Proulx
積丹尼 Dan Jacobson wrote:
> Proposal:
> add a new "cp --dry-run"
> Reason: we want to know what the verbose output of e.g.,
> $ cp -i -av /mnt/usb/thumb/ tmp/androidBackup/SDCard
> will be without actually running the command first.
> We are about to commit to a megabyte copy and we want to be sure where
> all those files are exactly going even after understanding all the
> documentation, and without needing to do partial "wet" runs
> etc. etc. etc. etc.

For this functionality I suggest using rsync which already implements
this capability.

  rsync -n -av /mnt/usb/thumb/ tmp/androidBackup/SDCard/thumb/

I see you are using cp -i, but if you are going to be copying
megabytes and many files I assert that cp -i isn't a practical way
to ensure a good result.  The rsync --ignore-existing and --update
options are available for data merging.
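
For example, a sketch combining the dry run above with a
merge-friendly option:

  rsync -n -av --update /mnt/usb/thumb/ tmp/androidBackup/SDCard/thumb/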

Bob





bug#20091: mv command

2015-03-12 Thread Bob Proulx
Pádraig Brady wrote:
> Charles Rogers wrote:
> > Is it ever possible for the mv command ( without using the –u
> > option ) to leave the file(s) in the source directory, while also
> > copying to the destination directory?
> >...
> > Any  comments appreciated!

Your description implies that your destination is on a different file
system from your source.  Is this one source directory or multiple
source directories being copied at one time?

In order for mv to decide it needs to copy a file it would need to
detect that the destination directory is on a different file system
from the source directory.  If so then mv will copy the file to the
destination location and remove the file from the source location.
The mv documentation describes this in some detail.

   ‘mv’ can move any type of file from one file system to another.
Prior to version ‘4.0’ of the fileutils, ‘mv’ could move only
regular files between file systems.  For example, now ‘mv’ can
move an entire directory hierarchy including special device files
from one partition to another.  It first uses some of the same
code that’s used by ‘cp -a’ to copy the requested directories and
files, then (assuming the copy succeeded) it removes the
originals.  If the copy fails, then the part that was copied to
the destination partition is removed.  If you were to copy three
directories from one partition to another and the copy of the
first directory succeeded, but the second didn’t, the first would
be left on the destination partition and the second and third
would be left on the original partition.

Copying files across file systems and removing them from the source is
a non-atomic operation.  There is always the possibility that the
process will be stopped (possibly by being killed or other
possibilities) after it has copied a file to the destination but
before it has removed the file from the source location.  It is not
possible to perform an atomic move across different file systems.  Any
of those possibilities should result in mv exiting non-zero and
returning an error status to the caller.  Most possibilities (not
SIGKILL which cannot be trapped) will result in mv printing an error
message to stderr.  Generally if mv has an error there should be an
error message and a non-zero exit status returned to the caller.

Bob





bug#19605: cp -v vs LC_ALL vs. quote marks

2015-01-16 Thread Bob Proulx
Pádraig Brady wrote:
> Dan Jacobson wrote:
> > All I know is in xterm I click three times and all of '...' including
> > the quotes gets copied, which is fine with me. Just keep it all 0x27.
> 
> Ah right that's an xterm specific feature. See XTerm*on3Clicks here:
> http://lukas.zapletalovi.com/2013/07/hidden-gems-of-xterm.html

Actually no.  Triple click to select the full line is a standard
feature of X Windows since forever.  I use it all of the time in
Firefox and Chromium for instance.  Every X widget should support it
natively.

Since a triple click copies the entire line, the `...' being part of
the line gets copied too.  But I don't think the ` is the worst part
of that.  The worst part is the ->.  You wouldn't want to paste that
into a shell.  Selecting the entire line would be useful for pasting
as plain text, such as into an editor.  (I am not suggesting changing
the ->, just pointing it out.)

FWIW, regardless of the historical fonts (which I have never seen in
my lifetime) in which ` and ' were symmetrical, I have never liked
the use of `...' for quoting.  I would prefer to have the quoted
strings use '...' too.  I know several projects have gone that way.
It looks better and is more useful at the same time.

Bob





bug#19533: comm does not detect common lines -- Mac OS X 10.9.5

2015-01-08 Thread Bob Proulx
Ali Khanafer wrote:
> Thanks Eric and Bob. I had sorted the files before calling comm, but I
> think the problem is that I sorted them as numeric:
> 
> sort -n test1 -o test1
> 
> When I removed the "-n", which is equivalent to what Bob has done, comm
> worked like a charm.

Yes that would cause the problem.  comm is a simple program from years
and years ago and expects things to be sorted simply.  Sort options in
the various programs have come up for discussion every so often.  But
so far things have continued as they are.  The biggest changes in this
area have been having the tools produce diagnostic information when
the input is not as they expect.  Check out the sort --debug option
for more useful diagnostics about sorting.
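
For example, with your files (sort --debug is a GNU extension):

  sort test1 -o test1
  sort test2 -o test2
  comm test1 test2
  sort --debug test1 | head    # shows how each line was compared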

Glad things have been /sorted/ out! :-)

Bob






bug#19533: comm does not detect common lines -- Mac OS X 10.9.5

2015-01-07 Thread Bob Proulx
Eric Blake wrote:
> Ali Khanafer wrote:
> > I tried comm on test1.txt and test2.txt. The output I got is in
> > comm-test.txt. Comm found 11 common lines and missed 6 other lines.
> > 
> > Could you please explain why this is happening?
> 
> Using a newer version of coreutils would tell you why:
> ...
> Proper use of comm requires that you pre-sort both input files.  As
> such, this is not a bug in comm, so I'm closing this bug.  However, feel
> free to add further comments or questions.

If you are using bash then a bash specific feature is useful.  You can
sort them on the fly.

  comm <(sort test1) <(sort test2)

Or perhaps forcing a sort locale.

  env LC_ALL=C comm <(sort test1) <(sort test2)

I included LC_ALL=C to force a specific sort order which may or may
not be appropriate for all of your use cases.

Bob





bug#19476: Poor output from "help2man split"

2014-12-31 Thread Bob Proulx
Kevin O'Gorman wrote:
> Pádraig Brady wrote:
> > forcemerge 19228 19476
> > stop
>
> I have no clue what that means.

The coreutils project uses a Bug Tracking System (aka BTS) for
tracking bugs.  It is an instantiation of the Debian BTS.  A month ago
on Sun, 30 Nov 2014 09:06:32 -0800 you sent in a bug report.  It was
assigned bug number Bug#19228 at that time.

Today you sent in a duplicate of that previous report.  It was
assigned Bug#19476.  As far as I can tell the report today was
identical to the previous report.  Therefore Pádraig merged the two
bug ticket numbers together.  They are identical and therefore merging
duplicates is standard operating procedure for any bug tracking
system.

You can see the web interface to the bug log here:

  http://debbugs.gnu.org/19228
  http://debbugs.gnu.org/19476

To browse all of the bugs that are being tracked that are associated
with coreutils visit this URL:

  http://debbugs.gnu.org/cgi/pkgreport.cgi?pkg=coreutils

After opening a bug ticket if you have further comments simply
follow-up to the previous message and send those further comments to
the existing bug ticket rather than opening a new ticket.

Bob





bug#19447: chmod - problem

2014-12-26 Thread Bob Proulx
tag 19447 + notabug
close 19447
thanks

Hello Tom,

Thank you for your bug report.  Trying to help and improve the tools
is always appreciated.  However it seems there is a misunderstanding.
Therefore I am marking the bug closed as housekeeping for the ticket
system.  But please, let's discuss the problem in this bug ticket
until it all makes sense.  Others having the same confusion can read
it and it will help them later.

Tom wrote:
> chmod does not work recursively. The command
> 
> chmod --recursive --verbose a-x ./*.txt
> 
> only has effects in the actual working directory, but not in the
> subdirectories.

In the above you did not specify any subdirectories to chmod.  At
least I assume that all of your *.txt files are files and not
directories.  If a directory was named foo.txt and it was a directory
then of course that would be a named argument and it would be a
directory and it would recurse down it.  In order for --recursive to
make sense at least one of the arguments must be a directory.  You
can't recurse down through a file.

I could say much more but this is actually an FAQ.  Here is the FGA
(Frequently Given Answer) in response to it.  Please let us know if
this answers your questions.

  http://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Why-doesn_0027t-rm-_002dr-_002a_002epattern-recurse-like-it-should_003f

Meanwhile, this is perhaps the useful command you wanted.  Try it.
It is POSIX standard and would work on any POSIX system.

  find . -name '*.txt' -exec chmod a-x {} +

If you wanted --verbose as in your question then:

  find . -name '*.txt' -exec chmod --verbose a-x {} +

Again please let us know how we could improve the documentation or
whatever in order to make understanding what is happening easier.

Bob





bug#19240: cut 8.22 adds newline

2014-12-04 Thread Bob Proulx
John Kendall wrote:
> My goal was to bring up the differences between Solaris cut and gnu cut 
> and hear the justification.  And I've learned a lot.  I've been in the
> Solaris gated community for so long, imagine how much I have never
> had to think about!

At one time I was exactly the same way after years of using HP-UX! :-)
Well...  Maybe not because there were always other machines in the mix
too.

> But it was never my intention to have you solve the re-write for me.  I 
> only shared my code because Bob asked.  But I really appreciate you 
> solving it for me!
>
> Thanks again to all of you.

Thanks for sharing.  As I said I was curious as to the code issue
that was problematic for portability.  I already knew it wasn't
portable or it wouldn't have been a squeaky wheel.  So seeing
something unportable was simply expected.

And I will speak for the group and say you are most welcome.  We do
this because if you set a tangled ball of string in front of us we would
untangle it.  It is just our nature.

Bob





bug#19240: cut 8.22 adds newline

2014-12-04 Thread Bob Proulx
Eric Blake wrote:
> Be careful; the POSIX specification of '%.30s' does NOT work well with
> multibyte characters; it is specified as:
> ...
> which means that it CAN and WILL corrupt output if the number of bytes
> written falls in the middle of a multi-byte character.

Good point.  Which leads me back to thinking that printing a tag first
and then the filename second and letting it be as long as it needs to
be without truncation is the best solution.

But of course in the original application coming from a legacy
environment the file names would never be multibyte.

Bob





bug#19240: cut 8.22 adds newline

2014-12-04 Thread Bob Proulx
John Kendall wrote:
> Bob Proulx wrote:
> >> I came upon this while porting scripts from Solaris 10 to Centos 7.
> > 
> > Can you share with us the specific construct that caused this to
> > arise?  I have done a lot of script porting to and from HP-UX systems
> > and am curious as to the issue.
> 
> The construct in question if just for formatting the output 
> of a script that compares disc files to what's in a database.  
> 
>  echo "$FILE ===\c"| cut -c1-30
>  echo " matches =="

Eww...  Immediately I have a second immune reaction to the above.  The
reason is that the use of echo above is non-portable.  It uses the old
System V echo interface that interprets escape sequences by default.
This can be enabled in bash at build time with the
--enable-usg-echo-default configure flag but it is off by default
because the BSD echo does not interpret escapes by default.

The solution to this problem has been to recommend using 'printf'
everywhere anywhere that an escape sequence is needed or anywhere that
not having a newline is needed.  Since printf is POSIX standard and
avoids the echo unportability.  Use of echo can be very unportable and
the "\c" is one of those unportable things.

> The output on Solaris might look something like this (with 
> monospaced font on a terminal all the "matches" line up):
> ...

Cool.

> This can be re-written, of course.  (There is one corner case that 
> Solaris's cut handled nicely that I have not been able to come up 
> with a quick fix.) 

Immediately printf comes to mind.  Use a %s format with a precision
specifier.  Since printf is POSIX standard this should work anywhere.
The failure mode of not having printf available on really, really,
really old systems is trivially handled by providing a printf for that
system.  Much easier than dealing with other differences.

  printf "%.30s matches ==\n" "$FILE ==="

One thing I still don't like about the above is that any file name
longer than 30 characters will be truncated.  But of course
addressing that would require changes in the output format.  My
preference would be to have "matches" first and the file name second
and let the file name go as long as it needs to go.
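
Something like this, as a sketch of that preference:

  printf "matches == %s\n" "$FILE"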

Bob




