Re: !(.pattern) can match . and .. if dotglob is enabled

2021-06-19 Thread Chet Ramey

On 6/17/21 3:53 PM, Nora Platiel wrote:
> On 2021-06-15 10:19 Chet Ramey wrote:
> > > Or rather,
> > > to never generate . or .. as a pathname component via globbing.
> >
> > I don't think it's useful -- and it's certainly incompatible -- to make
> > an explicit pattern like `.?' ignore `..'.
>
> I think it would be most useful. A better design.

It's far too late to relitigate that decision, which dates back to at
least the original Bourne shell and probably much earlier (I do not
have earlier versions available for testing right now).

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: !(.pattern) can match . and .. if dotglob is enabled

2021-06-19 Thread Chet Ramey

On 6/17/21 3:41 PM, Nora Platiel wrote:
> On 2021-06-15 09:43 Chet Ramey wrote:
> > I can see how this would be more intuitive. Let's try it. I'll put support
> > in the next devel branch push.
>
> Thanks!
>
> > I'm leaning towards a general statement about how dotglob affects the set
> > of filenames that are tested against the extended patterns, rather than
> > calling out `!' specially.
>
> What about this:
> | The extended pattern matching operators cannot match the leading dot of
> | filenames `.' and `..' (or any filename, if dotglob is unset) unless the
> | _matching_ subpattern starts with a literal dot.


I decided on this:

When matching filenames, the \fBdotglob\fP shell option determines
the set of filenames that are tested:
when \fBdotglob\fP is enabled, the set of filenames includes all files
beginning with ``.'', but ``.'' and ``..'' must be matched by a
pattern or sub-pattern that begins with a dot;
when it is disabled, the set does not
include any filenames beginning with ``.'' unless the pattern
or sub-pattern begins with a ``.''.
As above, ``.'' only has a special meaning when matching filenames.
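
A minimal sketch of the wording above for the plain `*' pattern (not part of
the thread; the scratch directory and the file names .dot and normal are made
up for illustration):

```shell
#!/usr/bin/env bash
# Illustration only: scratch directory and file names are hypothetical.
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch .dot normal

shopt -u dotglob
without_dotglob=(*)   # dot files are not in the set tested against `*'

shopt -s dotglob
with_dotglob=(*)      # .dot is now tested; `.' and `..' still need a leading dot

printf 'dotglob unset: %s\n' "${without_dotglob[*]}"
printf 'dotglob set:   %s\n' "${with_dotglob[*]}"
```

The first line should list only normal, the second .dot and normal; neither
should contain `.' or `..'.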




Re: !(.pattern) can match . and .. if dotglob is enabled

2021-06-19 Thread Nora Platiel
Hello,
I just tried your commit of Tue Jun 15. I tested all the relevant patterns that 
came to mind, and they all behave as agreed.
I'll let you know if I find something unexpected but I'm satisfied with this 
solution.
Thanks for your work.
NP



Re: !(.pattern) can match . and .. if dotglob is enabled

2021-06-17 Thread Nora Platiel
On 2021-06-15 10:19 Chet Ramey wrote:
> > Or rather,
> > to never generate . or .. as a pathname component via globbing.
>
> I don't think it's useful -- and it's certainly incompatible -- to make
> an explicit pattern like `.?' ignore `..'.

I think it would be most useful. A better design.

Of course this is just for the sake of argument, because I always assumed that 
changing the behavior of patterns like `.?' is out of the question on the 
grounds of compatibility and standard compliance (as it should be! I'm not 
proposing to take any action here).

In 20 years, every single time one of my glob patterns expanded to `.' or `..', 
it was either an error or I would discard them later.
I challenge anyone to show me a piece of code where it is useful for a 
non-literal path component to expand to `.' or `..'.

If you get `.' or `..' by accident you can mess up things. They are special 
names with a special meaning; if you need one of them you don't "search" for 
it, you name it (or you can use brace expansion to add it to other results).
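
A small sketch of the brace-expansion approach mentioned above (not from the
thread; the scratch directory and file names are made up). The braces add `.'
as a literal word, and only the `*' alternative is globbed:

```shell
#!/usr/bin/env bash
# Illustration only: scratch directory and file names are hypothetical.
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch a b

# Brace expansion runs before filename expansion, so this becomes the
# literal word `.' followed by whatever `*' expands to.
names=({.,*})
printf '%s\n' "${names[@]}"
```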

> > Personally, I'd just want an option to always make . and .. hidden from
> > globs.
>
> You can use GLOBIGNORE for this, but you have to do a little more work.
[...]
> Yes, that's the `more work' part. You have to tune it to your needs.

But this `more work' part has to take place on a pattern by pattern basis, 
because there is no value of GLOBIGNORE that you can just set-and-forget.
For example a value of `./*:../*' is appropriate for the pattern `.*/foo.txt', 
but surely not for the pattern `./foo*.txt'.

If each pattern requires its own "mask", then it would defeat the intended 
purpose of the OP's desired option, which is to make patterns simpler and less 
error prone.
What we had in mind is some kind of on/off option that we could just keep 
turned on, and it would always prevent non-literal path components from 
expanding to `.' or `..'.

As I said, I can live without such an option, because I can reliably hide `.' and 
`..' by protecting any leading dot with brackets (e.g. `[.]*/foo.txt') as long 
as dotglob is set.
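
A minimal sketch of that bracket trick for a directory component (not from the
thread; the scratch directory and the names .hidden and foo.txt are made up):

```shell
#!/usr/bin/env bash
# Illustration only: scratch directory and file names are hypothetical.
demo_dir=$(mktemp -d)
cd "$demo_dir"
mkdir .hidden
touch foo.txt .hidden/foo.txt

shopt -s dotglob
# `[.]' matches a leading dot without being a literal dot, so the directory
# component can expand to .hidden but not to `.' (which also holds foo.txt).
matches=([.]*/foo.txt)
printf '%s\n' "${matches[@]}"
```

Only .hidden/foo.txt should be listed, even though ./foo.txt exists as well.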

Regards,
NP



Re: !(.pattern) can match . and .. if dotglob is enabled

2021-06-17 Thread Nora Platiel
On 2021-06-15 09:43 Chet Ramey wrote:
> I can see how this would be more intuitive. Let's try it. I'll put support
> in the next devel branch push.

Thanks!

> I'm leaning towards a general statement about how dotglob affects the set
> of filenames that are tested against the extended patterns, rather than
> calling out `!' specially.

What about this:
| The extended pattern matching operators cannot match the leading dot of
| filenames `.' and `..' (or any filename, if dotglob is unset) unless the
| _matching_ subpattern starts with a literal dot.

The important part is "the matching subpattern" (not any subpattern).
If there is no matching subpattern, or if the matching subpattern doesn't start 
with dot, then `.' and `..' are excluded (or all dot files, if dotglob is 
unset).

E.g. neither `@(?|.foo*)' nor `!(.foo*)' can match the filename `.': 1) in the 
first, the matching subpattern (`?') doesn't start with a dot; 2) in the second, 
there is no matching subpattern.

To further clarify we could also expand the definition of `!()' with something 
along these lines:

| !(pattern-list)
|     Matches anything except one of the given patterns.
|+    where anything is whatever `*' can expand to.

The other operators don't have a notion of "anything except" so they don't need 
clarification. This is not a special case, just a consequence of the first 
statement.

Regards,
NP



Re: !(.pattern) can match . and .. if dotglob is enabled

2021-06-15 Thread Chet Ramey

On 6/6/21 6:31 AM, Ilkka Virta wrote:


> > Can you write a set of rules that encapsulates what you would like to see?
> > Or can the group?
>
> I think it's a bit weird that !(.foo) can match . and .. when * doesn't.
> One means roughly "anything here", and the other means "anything but
> .foo here",
> so having the latter match things the former doesn't is surprising.


Yes, that's essentially where we ended up.

> Personally, I'd just want an option to always make . and .. hidden from
> globs.


You can use GLOBIGNORE for this, but you have to do a little more work.


> Or rather,
> to never generate . or .. as a pathname component via globbing.


I don't think it's useful -- and it's certainly incompatible -- to make
an explicit pattern like `.?' ignore `..'.


> For what it's worth, Zsh, mksh and fish seem to always hide . and .. , and
> at least Zsh does that even with (.|..) or @(.|..) .


Do they have the equivalent of `dotglob'?

And if we want to play this game, ksh93, dash, yash, and the BSD shells
all match `.' and `..' with patterns like `.*' and `.?'.

> I tried to achieve that via GLOBIGNORE=.:.. , but that has the problem that
> it forces dotglob on, and looks at the whole resulting path, so ./.* still
> gives ./. and ./.. . Unless you use GLOBIGNORE=.:..:*/.:*/.. etc., but
> repeating the same for all different path lengths gets a bit awkward.


Yes, that's the `more work' part. You have to tune it to your needs.




Re: !(.pattern) can match . and .. if dotglob is enabled

2021-06-15 Thread Chet Ramey

On 6/5/21 8:42 PM, Nora Platiel wrote:

> > The "matched explicitly" refers to the previous sentence, which talks about
> > the `.' at the start of a filename or path component needing to be matched
> > explicitly by a pattern beginning with a `.' or containing a `.' at the
> > right spot (after a `/'). I can add language to clarify that.
>
> What about this?
> The character `.' at the start of the
> | filenames `.' and `..' must always be matched explicitly, even if dotglob
> | is set.


I added something similar, with additional wording about a `.' at the start
of a pattern.



> Yes, it all depends on the "universal set" from which the matches of the inner
> `pattern-list' are subtracted.
> But in the current implementation, the inner matches are subtracted from:
> - all files, if dotglob is set
> - all except dot files, if dotglob is unset
>
> The inclusion of `.' and `..' when dotglob is set, seems inconsistent with the
> exclusion of dot files when dotglob is unset.
> I think it would be most intuitive and useful to define the universal set to be
> whatever `*' can expand to.


I can see how this would be more intuitive. Let's try it. I'll put support
in the next devel branch push.



> About the behavior of the extended operators ?,*,+,@ (with my proposed changes
> when dotglob is set), I'm not sure there's a need for explanation. I think it
> comes natural if you mentally translate the extended pattern into a sequence of
> non-extended patterns.
>
> About `!()', we could say:


I'm leaning towards a general statement about how dotglob affects the set
of filenames that are tested against the extended patterns, rather than
calling out `!' specially.

Chet




Re: !(.pattern) can match . and .. if dotglob is enabled

2021-06-06 Thread Nora Platiel
Hello,

> Personally, I'd just want an option to always make . and .. hidden from
> globs. [...]

If such an option existed, I would certainly use it.
As I already said, I can't imagine why anyone would ever want a pattern to 
match `.' or `..' (unless the entire path component is literal).
But even if I had such an option, I would still want the default behavior to be 
easy to understand and fully documented, therefore I maintain my proposed 
changes.
__

About dotglob, this is subjective but personally I would never prefer to have 
it unset.
It happened to me quite a few times that I forgot to include dot files by 
accident, but never the other way around (even after ten years of dotglob set).
When it is not a very conscious choice and I just think "I want all files", 
usually all means all.
When I don't want dot files (rarely), I'm usually very aware of it and I just 
use `[^.]*'.

And even more importantly, dotglob allows me to exclude `.' and `..' by 
protecting the leading dot with brackets. From your example:

$ touch .dot normal
$ echo [.]*
.dot
$ echo ./[.]*
./.dot

> For what it's worth, Zsh, mksh and fish seem to always hide . and ..

Just out of curiosity, is that still standard compliant? (I'm talking about 
non-extended patterns, like `.*')

> I tried to achieve that via GLOBIGNORE=.:.. , but that has the problem that
> it forces dotglob on, and looks at the whole resulting path, so ./.* still
> gives ./. and ./.. . Unless you use GLOBIGNORE=.:..:*/.:*/.. etc., but
> repeating the same for all different path lengths gets a bit awkward.

Yes, I also tried GLOBIGNORE a long time ago, and it won't cut the salami.
I think GLOBIGNORE=.:.. is error prone because you get used to it and then you 
get bitten when there is more than one path component.
If you want to cover all cases, there's no need to take into account all 
different path lengths because `/' is not special. You just have to cover 4 
cases (alone, at start, in the middle, at end):
GLOBIGNORE=.:..:./*:../*:*/./*:*/../*:*/.:*/..
But the effect is still not useful (e.g. `./*.txt' will fail to match 
`./foo.txt').
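
A minimal sketch of that side effect (not from the thread; the scratch
directory and file name are made up, and nullglob is used only so that "no
match" shows up as an empty list):

```shell
#!/usr/bin/env bash
# Illustration only: scratch directory and file name are hypothetical.
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch foo.txt

shopt -s nullglob   # report "no match" as an empty list
before=(./*.txt)

# The four-case value from the message above.
GLOBIGNORE='.:..:./*:../*:*/./*:*/../*:*/.:*/..'
after=(./*.txt)     # ./foo.txt matches the `./*' entry and is filtered out
unset GLOBIGNORE

printf 'without GLOBIGNORE: %d match(es)\n' "${#before[@]}"
printf 'with GLOBIGNORE:    %d match(es)\n' "${#after[@]}"
```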

While I'm at it, I will also point out that the docs suggest adding `.*' to 
GLOBIGNORE if the automatic activation of dotglob is undesirable.
I think such suggestion is misleading because the effect is very different 
(e.g. with dotglob unset `.bash*' still matches `.bashrc' etc. but with 
GLOBIGNORE=.* it won't).
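
The difference can be sketched like this (not from the thread; the scratch
directory is made up, and nullglob is used only so that "no match" shows up as
an empty list):

```shell
#!/usr/bin/env bash
# Illustration only: scratch directory and file name are hypothetical.
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch .bashrc

shopt -s nullglob   # report "no match" as an empty list
shopt -u dotglob
plain=(.bash*)      # explicit leading dot, so .bashrc matches

GLOBIGNORE='.*'
ignored=(.bash*)    # .bashrc matches `.*' and is dropped from the results
unset GLOBIGNORE

printf 'dotglob unset:  %d match(es)\n' "${#plain[@]}"
printf 'GLOBIGNORE=.*:  %d match(es)\n' "${#ignored[@]}"
```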

Regards,
NP



Re: !(.pattern) can match . and .. if dotglob is enabled

2021-06-06 Thread Nora Platiel
In my previous message, I wrote:
> Yes, it all depends on the "universal set" from which the matches of the inner
> `pattern-list' are subtracted.
> But in the current implementation, the inner matches are subtracted from:
> - all files, if dotglob is set
> - all except dot files, if dotglob is unset

Sorry, I got mixed up here. From the above it would look like `.' and `..' are 
never excluded from the matches of `!()' if dotglob is set.
Actually they are normally excluded (which is fine), but they are included back 
again if at least one of the inner patterns starts with a literal dot.

From this sentence:
> !(pattern-list)
>     Matches anything except one of the given patterns.

currently the definition of "anything" is affected not only by the value of 
dotglob but also by the content of `pattern-list'.
I'm proposing that "anything" should be `*' (i.e. affected by dotglob but not 
by `pattern-list').

The usual dot treatment is applied to the inner patterns themselves but it 
doesn't make a difference in the final match.
E.g. the two patterns `!(.?)' and `!([.]?)' should have the same effect:
  `.?'   matches `..'
  `[.]?' doesn't match `..'
but `*' minus `.?' is still equal to `*' minus `[.]?', because `*' matches 
neither `.' nor `..' (and the final result should never include them).

My description of the desired behavior should still be sufficient.



Re: !(.pattern) can match . and .. if dotglob is enabled

2021-06-06 Thread Ilkka Virta
On Sun, Jun 6, 2021 at 1:31 PM Ilkka Virta wrote:

> Personally, I'd just want an option to always make . and .. hidden from
> globs. Or rather,
> to never generate . or .. as a pathname component via globbing. But
> without affecting
> other behaviour, like dotglob, and without precluding the use of . or ..
> as static parts of the
> path.
>

Hmm, looking at the code, this already seems to exist, in lib/glob/glob.c:

  /* Global variable controlling whether globbing ever returns . or ..
     regardless of the pattern. If set to 1, no glob pattern will ever
     match `.' or `..'. Disabled by default. */
  int glob_always_skip_dot_and_dotdot = 1;

I didn't read all the code, but as far as I tested from the git version, that
seems to do what I just wanted and seems sensible to me with Nora's examples
too. (I changed the filenames from the previous since I started copying their
tests now.)

$ touch .foo .doo bar quux

With dotglob (the first is the same as just *):

$ shopt -s dotglob
$ echo @(.foo|*)
bar .doo .foo quux
$ echo !(.foo)
bar .doo quux
$ echo @(bar|.*)
bar .doo .foo

Without it:

$ shopt -u dotglob
$ echo @(.foo|*)
bar .foo quux
$ echo @(bar|.*)
bar .doo .foo

No match for . and .. even explicitly (with failglob here):

$ echo @(.|..)
bash: no match: @(.|..)

All files with dotglob unset:

$ echo @(.|)*
bar .doo .foo quux

Maybe I missed some more obscure case, though.


Re: !(.pattern) can match . and .. if dotglob is enabled

2021-06-06 Thread Ilkka Virta
> Can you write a set of rules that encapsulates what you would like to see?
> Or can the group?
>

I think it's a bit weird that !(.foo) can match . and .. when * doesn't.

One means roughly "anything here", and the other means "anything but .foo
here", so having the latter match things the former doesn't is surprising.

Personally, I'd just want an option to always make . and .. hidden from globs.
Or rather, to never generate . or .. as a pathname component via globbing. But
without affecting other behaviour, like dotglob, and without precluding the
use of . or .. as static parts of the path.

As in:
$ touch .dot normal
$ echo .*
.dot
$ echo ./.*
./.dot

And depending on dotglob,  echo *  should give either  .dot normal  or
just  normal .

So, somewhat similarly to how globbing hides pathname components starting with
a dot when dotglob is unset, just with another option to hide . and .. in
particular.

Frankly, I don't care if that would also mean that ./@(.|..)/ would match
nothing. I don't see much use for globbing . and .. in any situation, the
dangers of accidentally climbing up one level in the tree by a stray .* are
much worse. Someone else might disagree, of course, but if one really wants to
include those two, brace expansion should work since the two names are always
known to exist anyway. And of course if it's an option, one doesn't need to
use it if they don't like it.

For what it's worth, Zsh, mksh and fish seem to always hide . and .. , and at
least Zsh does that even with (.|..) or @(.|..) .


I tried to achieve that via GLOBIGNORE=.:.. , but that has the problem that it
forces dotglob on, and looks at the whole resulting path, so ./.* still gives
./. and ./.. . Unless you use GLOBIGNORE=.:..:*/.:*/.. etc., but repeating the
same for all different path lengths gets a bit awkward.


Re: !(.pattern) can match . and .. if dotglob is enabled

2021-06-05 Thread Nora Platiel
Thanks again for the info. Now I understand why `.' and `..' are handled 
separately, and I can imagine the complexity.

> The "matched explicitly" refers to the previous sentence, which talks about
> the `.' at the start of a filename or path component needing to be matched
> explicitly by a pattern beginning with a `.' or containing a `.' at the
> right spot (after a `/'). I can add language to clarify that.

What about this?
| When a pattern is used for filename expansion, the character `.' at the
| start of a filename or path component must be matched explicitly by a
| corresponding `.' at the start of the pattern or after a `/', unless the
| shell option dotglob is set. The character `.' at the start of the
| filenames `.' and `..' must always be matched explicitly, even if dotglob
| is set. In other cases, the `.' character is not treated specially.

> > $ echo !(.foo)
> > bar
>
> There is an equally compelling argument to be made that `.' and `..' should
> be included in the results from the second example, since they do not match
> the pattern `.foo'. The question is how much `not matching' you want.
> `dotglob' only affects the `matching' state. That's the essence of where we
> started with this.

Yes, it all depends on the "universal set" from which the matches of the inner 
`pattern-list' are subtracted.
But in the current implementation, the inner matches are subtracted from:
- all files, if dotglob is set
- all except dot files, if dotglob is unset

The inclusion of `.' and `..' when dotglob is set, seems inconsistent with the 
exclusion of dot files when dotglob is unset.
I think it would be most intuitive and useful to define the universal set to be 
whatever `*' can expand to.
I.e.
- all except `.' and `..', if dotglob is set
- all except dot files, if dotglob is unset

> I'm not averse to changing the current behavior. This is a niche case.
> Then instead of figuring out language to describe the current behavior,
> let's figure out language to describe the desired behavior.

I would want to limit as much as possible the cases where a path component can 
expand to `.' or `..', which is always undesirable, but also remain consistent 
with non-extended globbing and keep it simple.

With dotglob set, we can use `[.]' to match a starting dot without the risk of 
including `.' and `..'.
I always use `[.]*' instead of `.*', so it comes natural that I need to use 
`@([.]*|foo)' instead of `@(.*|foo)'.
But normally there's no need to protect the dot in `.foo*', so it doesn't come 
natural that I need to protect it in cases like `!([.]foo*)' or `@([.]foo*|??)'.

With dotglob unset, the current behavior seems fine.

> Your English is fine. You want to take a shot at a sentence or two
> describing your desired behavior? It should not take more than that.

About the behavior of the extended operators ?,*,+,@ (with my proposed changes 
when dotglob is set), I'm not sure there's a need for explanation. I think it 
comes natural if you mentally translate the extended pattern into a sequence of 
non-extended patterns.

About `!()', we could say:
| The `!()' operator will never match the character `.' at the start of a
| filename, unless the shell option dotglob is set. Even if dotglob is set,
| it will never match the character `.' at the start of the filenames `.'
| and `..'.

Or:
| The `!()' operator can only match strings that can be matched by `*', so
| the inclusion of filenames starting with the character `.' is affected by
| the shell option dotglob in the same way.

Regards,
NP



Re: !(.pattern) can match . and .. if dotglob is enabled

2021-06-02 Thread Chet Ramey

On 5/31/21 11:23 AM, Nora Platiel wrote:

> > How would you improve the wording? What do you think is most important to
> > cover?
>
> Here is the full paragraph for reference:
> > When a pattern is used for filename expansion, the character `.' at the
> > start of a filename or immediately following a slash must be matched
> > explicitly, unless the shell option dotglob is set. The filenames `.'
> > and `..' must always be matched explicitly, even if dotglob is set. In
> > other cases, the `.' character is not treated specially.
>
> First:
> > The filenames `.' and `..' must always be matched explicitly, even if
> > dotglob is set.
>
> I agree with gregrwm here.
> (https://lists.gnu.org/archive/html/bug-bash/2021-01/msg00251.html)
> The sentence seems to imply that you need 2 literal dots to match `..'.
>
> I read your answer and I understand: if that was the case, then the only
> (non-extended) pattern capable of matching `..' would be `..*'.
> But my understanding of the expression "matched explicitly" is: matched in its
> entirety via characters that stand for themselves (i.e. not via special pattern
> characters).


The "matched explicitly" refers to the previous sentence, which talks about
the `.' at the start of a filename or path component needing to be matched
explicitly by a pattern beginning with a `.' or containing a `.' at the
right spot (after a `/'). I can add language to clarify that.

The dotglob option basically eliminates that restriction for files that are
not named `.' and `..' (or it tries).

The intersection of dotglob and extended globbing is where the
implementation gets tricky.


> Next, there's nothing in the docs about dot treatment in the specific context
> of extended globbing.


That's true. It's one of the questions we're considering here. One option
is to say that the effect of dotglob on `.' and `..' in extended pattern
matching is ignored. (I am not advocating that.)



> I expect @(P1|P2) to expand to the union of the matches of the separate
> subpatterns P1 and P2.


This is not unreasonable. You can say similar things about the rest of the
extended globbing operators.

But let's talk specifically about the treatment of `.' and `..'.



> Example of expected results:
> $ touch .foo bar
> $ shopt -s dotglob
> $ echo @(.foo|*)
> .foo bar


I can see this. It's consistent with the policy that `.' and `..' can only
be matched by a pattern beginning with a literal `.'.


> $ echo !(.foo)
> bar


There is an equally compelling argument to be made that `.' and `..' should
be included in the results from the second example, since they do not match
the pattern `.foo'. The question is how much `not matching' you want.
`dotglob' only affects the `matching' state. That's the essence of where we
started with this.



> > It's not intuitive. The dotglob causes all files starting with `.' to be
> > in the list, the .foo pattern keeps `.' and `..' from being discarded,
> > and the `*' matches it (since dotglob disables the requirement that an
> > initial `.' be matched explicitly).
>
> Ok, if things happen in the way and order you described here, then I can
> understand it.


If you want to look at it from an implementation perspective, think of it
this way:

Given the POSIX fnmatch() interface used to match strings, turning on
`dotglob' causes calls to fnmatch *not* to include the FNM_PERIOD flag.
This removes all special treatment of `.'; it's mostly intended to be used
when not matching pathnames, so there's a special FNM_PATHNAME flag to
use with it.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/fnmatch.html#tag_16_154

There's no direct way to treat `.' and `..' specially here.

The bash extglob implementation uses its fnmatch-workalike internally to
match each pattern. Hence the use of heuristics to include and omit `.'
and `..'.


> But yes, it's not intuitive. It seems totally arbitrary to me that alternative
> subpatterns in a pattern-list influence each other's behavior concerning `.'
> and `..', but not concerning dot-files in general.
>
> Not being familiar with the actual implementation, I've never considered the
> exclusion of dot-files and the exclusion of `.' and `..' as two different
> mechanisms.


The special behavior regarding `.' and `..' is the special case. Using the
standard interfaces, you either have all files beginning with `.', or you
don't. You have to check for them separately, and you can do it at a couple
of different levels depending on your policy.



> I would still prefer the behavior I was expecting from the start. I'm having a
> hard time finding good words to document the current behavior, which is
> probably an indication that it is too complex (or that my English sucks :D).


I'm not averse to changing the current behavior. This is a niche case.
Then instead of figuring out language to describe the current behavior,
let's figure out language to describe the desired behavior.


> Another minor observation:
> it is not documented that the dot in the pattern must be also at the beginning
> to be able to "match explicitly".
> $ shopt -u dotglo

Re: !(.pattern) can match . and .. if dotglob is enabled

2021-05-31 Thread Nora Platiel
Thank you for your effort in understanding my problem.

> The actual change, captured in the `devel' branch that tracks bash
> development, happened sometime in 2011.

I see.

> How would you improve the wording? What do you think is most important to
> cover?

Here is the full paragraph for reference:
> When a pattern is used for filename expansion, the character `.' at the
> start of a filename or immediately following a slash must be matched
> explicitly, unless the shell option dotglob is set. The filenames `.'
> and `..' must always be matched explicitly, even if dotglob is set. In
> other cases, the `.' character is not treated specially.

First:
> The filenames `.' and `..' must always be matched explicitly, even if
> dotglob is set.

I agree with gregrwm here.
(https://lists.gnu.org/archive/html/bug-bash/2021-01/msg00251.html)
The sentence seems to imply that you need 2 literal dots to match `..'.

I read your answer and I understand: if that was the case, then the only 
(non-extended) pattern capable of matching `..' would be `..*'.
But my understanding of the expression "matched explicitly" is: matched in its 
entirety via characters that stand for themselves (i.e. not via special pattern 
characters).

It would be much clearer if we prepend this part:
"[The character `.' at the start of] the filenames `.' and `..' must always be 
matched explicitly, even if dotglob is set."

Next, there's nothing in the docs about dot treatment in the specific context 
of extended globbing. The only behavior that I would consider self-explanatory 
enough not to require a description is the following:

I expect @(P1|P2) to expand to the union of the matches of the separate 
subpatterns P1 and P2.
(I.e. if I take the expansion of P1 P2, sort it, remove duplicates and 
non-existing files, I expect to get the same result as @(P1|P2). )

And:
x@(P1|P2)y ->    xP1y xP2y
x?(P1|P2)y -> xy xP1y xP2y
x*(P1|P2)y -> xy xP1y xP2y xP1P1y xP1P2y xP2P1y xP2P2y ...
x+(P1|P2)y ->    xP1y xP2y xP1P1y xP1P2y xP2P1y xP2P2y ...
x!(P1|P2)y -> x*y excluding results where @(P1|P2) matches the part matched by 
the `*'
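
The table above can be tried out on names without any dots, where none of the
dotglob questions apply. This is only a sketch: the scratch directory and the
file names (x, y, a, b standing in for x, y, P1, P2) are made up.

```shell
#!/usr/bin/env bash
# Illustration only: scratch directory and file names are hypothetical.
shopt -s extglob    # must be enabled before the patterns below are parsed
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch xy xay xby xaby xcy

at=(x@(a|b)y)       # exactly one of a, b
opt=(x?(a|b)y)      # zero or one
star=(x*(a|b)y)     # zero or more
plus=(x+(a|b)y)     # one or more
not=(x!(a|b)y)      # x*y, minus names whose middle part matches @(a|b)

printf '@: %s\n' "${at[*]}"
printf '?: %s\n' "${opt[*]}"
printf '*: %s\n' "${star[*]}"
printf '+: %s\n' "${plus[*]}"
printf '!: %s\n' "${not[*]}"
```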

Example of expected results:
$ touch .foo bar
$ shopt -s dotglob
$ echo @(.foo|*)
.foo bar
$ echo !(.foo)
bar

I think this would be the least surprising and least error prone behavior. I'm 
wondering if the actual behavior is different by design, or if it was just 
easier to implement that way, or if the consequences were not considered.

> > So why it does here?
> > $ shopt -s dotglob; echo @(.foo|*)
> > . .. .a b
> > $ shopt -u dotglob; echo @(.foo|*)
> > b
>
> It's not intuitive. The dotglob causes all files starting with `.' to be
> in the list, the .foo pattern keeps `.' and `..' from being discarded,
> and the `*' matches it (since dotglob disables the requirement that an
> initial `.' be matched explicitly).

Ok, if things happen in the way and order you described here, then I can 
understand it.
But yes, it's not intuitive. It seems totally arbitrary to me that alternative 
subpatterns in a pattern-list influence each other's behavior concerning `.' 
and `..', but not concerning dot-files in general.

Not being familiar with the actual implementation, I've never considered the 
exclusion of dot-files and the exclusion of `.' and `..' as two different 
mechanisms. To me it was just that dotglob-unset has a stricter "filter" than 
dotglob-set has (filter activated by the absence of a literal dot at start of 
pattern).
I.e. `.' and `..' are to dotglob-set, as all dot-files (including `.' and `..') 
are to dotglob-unset.

I would still prefer the behavior I was expecting from the start. I'm having a 
hard time finding good words to document the current behavior, which is 
probably an indication that it is too complex (or that my English sucks :D).

Here is an attempt:
"If dotglob is set, and an extended pattern contains at least one subpattern 
beginning with a literal dot, then in the context of such extended pattern, 
there is no requirement for the first character of the filenames `.' or `..' to 
be matched explicitly."

Another minor observation:
it is not documented that the dot in the pattern must be also at the beginning 
to be able to "match explicitly".
$ shopt -u dotglob; echo *.foo  # doesn't match `.foo'
$ shopt -s dotglob; echo *..    # doesn't match `..'
Even though you could say that the dot is "matched explicitly" in both cases.
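
A sketch of both cases (not from the thread; the scratch directory and file
names are made up, and nullglob is used only so that "no match" shows up as an
empty list):

```shell
#!/usr/bin/env bash
# Illustration only: scratch directory and file names are hypothetical.
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch .foo a.foo

shopt -s nullglob   # report "no match" as an empty list

shopt -u dotglob
star_foo=(*.foo)    # the literal dot is not at the start of the pattern

shopt -s dotglob
star_dots=(*..)     # same here: `..' is not matched

printf '*.foo -> %s\n' "${star_foo[*]}"
printf '*..   -> %d match(es)\n' "${#star_dots[@]}"
```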

I find it also interesting that:
$ shopt -u dotglob; echo !(bar).foo  # doesn't match `.foo'
$ shopt -u dotglob; echo ?(bar).foo  # matches `.foo'
$ shopt -u dotglob; echo *(bar).foo  # matches `.foo'

Ok, this is not so likely to confuse people, so I'm not sure it's worth 
documenting if it makes the explanation too cumbersome.
Here is my (bad) attempt:
"If a path component in the pattern doesn't start with a literal dot, then such 
path component can never expand to a name beginning with dot, unless the shell 
option dotglob is set. If a path component in the pattern doesn't

Re: !(.pattern) can match . and .. if dotglob is enabled

2021-05-27 Thread Chet Ramey

On 5/26/21 7:36 PM, Nora Platiel wrote:

> Hello,
>
> > This is behavior that changed more than ten years ago.
>
> I thought it changed in this commit:
> https://git.savannah.gnu.org/cgit/bash.git/commit/?id=ac50fbac377e32b98d2de396f016ea81e8ee9961
> 2014-02-26 -> 7.2 years ago


That's the commit to the master branch when bash-4.3 was released. The
actual change, captured in the `devel' branch that tracks bash development,
happened sometime in 2011.


> But yes, I know it's old stuff and I was not implying a regression, just
> mentioning it FYI.
>
> > There was a relevant discussion back in January:
> > https://lists.gnu.org/archive/html/bug-bash/2021-01/msg00240.html
>
> Thanks, I did a search but I missed it.
> It *is* relevant, because I think that @(?|.?) matching '.' is consistent with
> !(.foo) matching '.' and '..'


OK, so we're saying the same thing at this point.


> The problem is, there is nothing in the docs that hints at such behavior.
> I.e. that the pattern beginning with a literal dot may not match at all, but it
> still signals that '.' and '..' should be accepted in the final match.


How would you improve the wording? What do you think is most important to
cover?



So why does it make a difference here?
$ shopt -s dotglob; echo @(.foo|*)
. .. .a b
$ shopt -u dotglob; echo @(.foo|*)
b


It's not intuitive. The dotglob causes all files starting with `.' to be
in the list, the .foo pattern keeps `.' and `..' from being discarded,
and the `*' matches it (since dotglob disables the requirement that an
initial `.' be matched explicitly).
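
A minimal sketch for reproducing this comparison in a scratch directory follows. It assumes bash is on PATH; the file names (.a and b) mirror the example above but the directory itself is made up. The exact lists printed depend on the bash release (the behavior described here is for 5.1), so the sketch only prints the two results side by side.

```shell
# Scratch directory holding one dot-file and one regular file.
demo_dir=$(mktemp -d)
touch "$demo_dir/.a" "$demo_dir/b"

# extglob must be on before the pattern is parsed, hence `-O extglob'.
with_dotglob=$(cd "$demo_dir" &&
  LC_ALL=C bash -O extglob -c 'shopt -s dotglob; echo @(.foo|*)')
without_dotglob=$(cd "$demo_dir" &&
  LC_ALL=C bash -O extglob -c 'shopt -u dotglob; echo @(.foo|*)')

printf 'dotglob set:   %s\n' "$with_dotglob"
printf 'dotglob unset: %s\n' "$without_dotglob"

rm -rf "$demo_dir"
```

Whether `.' and `..' show up in the first line is exactly the point under discussion, so treat that part of the output as version-dependent.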

Can you write a set of rules that encapsulates what you would like to see?
Or can the group?


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



Re: !(.pattern) can match . and .. if dotglob is enabled

2021-05-26 Thread Nora Platiel
Hello,

> This is behavior that changed more than ten years ago.

I thought it changed in this commit:
https://git.savannah.gnu.org/cgit/bash.git/commit/?id=ac50fbac377e32b98d2de396f016ea81e8ee9961
2014-02-26 -> 7.2 years ago
But yes, I know it's old stuff and I was not implying a regression, just 
mentioning it FYI.

> There was a relevant discussion back in January:
> https://lists.gnu.org/archive/html/bug-bash/2021-01/msg00240.html

Thanks, I did a search but I missed it.
It *is* relevant, because I think that @(?|.?) matching '.' is consistent with 
!(.foo) matching '.' and '..'

If you maintain that such behavior is correct, I have no problem with it.
I will just use [.]pattern even inside !(...) to get rid of '.' and '..'
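
As a minimal sketch of that workaround (assuming bash is on PATH, and using a scratch directory with made-up file names), the two spellings can be compared like this. What the !(.foo) line prints depends on the bash release, so it is shown only for comparison.

```shell
# Scratch directory with hypothetical names: two dot-files and one regular file.
demo_dir=$(mktemp -d)
touch "$demo_dir/.foo" "$demo_dir/.other" "$demo_dir/files"

# Literal leading dot inside the negation: may let `.' and `..' through.
literal_dot=$(cd "$demo_dir" &&
  LC_ALL=C bash -O extglob -O dotglob -c 'echo !(.foo)')

# Bracketed dot: the sub-pattern no longer starts with a literal dot,
# so `.' and `..' stay excluded while other dot-files still match.
bracket_dot=$(cd "$demo_dir" &&
  LC_ALL=C bash -O extglob -O dotglob -c 'echo !([.]foo)')

printf '!(.foo)   -> %s\n' "$literal_dot"
printf '!([.]foo) -> %s\n' "$bracket_dot"

rm -rf "$demo_dir"
```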

The problem is, there is nothing in the docs that hints at such behavior.
I.e. that the pattern beginning with a literal dot may not match at all, but it 
still signals that '.' and '..' should be accepted in the final match.
The docs only talk about the requirement of '.' and '..' to "be matched 
explicitly", which is vague and subject to interpretation, especially in the 
context of extended patterns.
I'm not the first person to complain about such wording.

> When dotglob is enabled, the shell's pattern matcher interprets an extglob
> pattern beginning with a literal `.' as specifying that files beginning
> with a `.' should match, so the negated pattern matches `.' and `..'.

Shouldn't this be the case with dotglob disabled?
And with dotglob enabled, the exclusion applies only to '.' and '..', instead 
of all dot-files.
If not, again the docs are wrong or incomplete.

dotglob enabled  -> '.' and '..' require pattern beginning with literal dot
dotglob disabled -> all dot-files require pattern beginning with literal dot

I.e. my understanding is that every pattern starting with a literal dot should 
*not* be influenced by dotglob, because the literal dot calls off every 
exclusion.

Of course dotglob makes no difference here:
$ shopt -s dotglob; echo @(.*)
. .. .a
$ shopt -u dotglob; echo @(.*)
. .. .a

$ shopt -s dotglob; echo @(.?)
.. .a
$ shopt -u dotglob; echo @(.?)
.. .a

So why does it make a difference here?
$ shopt -s dotglob; echo @(.foo|*)
. .. .a b
$ shopt -u dotglob; echo @(.foo|*)
b

$ shopt -s dotglob; echo !(.foo)
. .. .a b
$ shopt -u dotglob; echo !(.foo)
b

Regards,
NP



Re: !(.pattern) can match . and .. if dotglob is enabled

2021-05-26 Thread Chet Ramey

On 5/25/21 8:58 PM, Nora Platiel wrote:


Bash Version: 5.1
Patch Level: 8
Release Status: release

Hello,

Repeat-By:
$ shopt -s dotglob extglob
$ echo !(.foo)
. .. .other files

The doc says: "The filenames '.' and '..' must always be matched explicitly, even if 
dotglob is set."


When dotglob is enabled, the shell's pattern matcher interprets an extglob
pattern beginning with a literal `.' as specifying that files beginning
with a `.' should match, so the negated pattern matches `.' and `..'.


Also the pattern !(.foo) didn't match . and .. before version 4.3.0.


This is behavior that changed more than ten years ago.

There was a relevant discussion back in January:

https://lists.gnu.org/archive/html/bug-bash/2021-01/msg00240.html

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/



!(.pattern) can match . and .. if dotglob is enabled

2021-05-25 Thread Nora Platiel
Configuration Information [Automatically generated, do not change]:
Machine: i686
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -O2 -Wno-parentheses -Wno-format-security
uname output: Linux columbus 4.4.246-gentoo #2 SMP Thu Dec 31 17:31:16 -00 2020 
i686 AMD Athlon(tm) XP 2800+ AuthenticAMD GNU/Linux
Machine Type: i686-pc-linux-gnu

Bash Version: 5.1
Patch Level: 8
Release Status: release

Hello,

Repeat-By:
$ shopt -s dotglob extglob
$ echo !(.foo)
. .. .other files

The doc says: "The filenames '.' and '..' must always be matched explicitly, 
even if dotglob is set."
I would infer that a !(...) should never match '.' and '..', because you cannot 
match something literally using an operator that means "anything except".

The patterns .foo and [.]foo are equivalent under dotglob, therefore I would 
expect !(.foo) and !([.]foo) to be also equivalent. But . and .. are excluded 
here already:
$ shopt -s dotglob extglob
$ echo !([.]foo)
.other files
$ echo !(foo)
.other files

Personally I can't imagine how it can ever be useful for a pattern to match . 
or .. (unless the path component is literally . or ..).
Therefore I always work with dotglob enabled and use [.]pattern instead of 
.pattern to get rid of . and ..
But I don't think it makes much sense with the !(...) operator.
Also the pattern !(.foo) didn't match . and .. before version 4.3.0.

Regards,
NP