Bug#940464: grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)

2019-09-25 Thread Santiago Ruano Rincón
Control: tags -1 + upstream

On 16/09/19 at 12:09, Trent W. Buck wrote:
> Package: grep
> Version: 3.3-1
> Severity: wishlist
> 
> (Surely someone has already asked for this, but I can't see where.
> I may have already reported this myself, and forgotten.
> If so, sorry!)
> 
> Right now if you do
> 
> grep -eX -eY -eZ
> 
> You'll get lines that match *any of* X, Y, or Z.
> Quite often I want to search for lines that match *all of* X, Y, and Z,
> but in any order.
> For example,
> 
> # all 2TB 2.5-inch SATA products
> grep -Fwi -eSATA -e2TB -e2.5in products.csv
> 
> Below is a short discussion of the workarounds I know about.
> 
> Is "grep --and" something that has already been discussed and rejected?
> I looked through debbugs.gnu.org and the source tarball, but
> I couldn't find anything about this.
> 
> 
> PS: grep -v --and would intuitively mean "not all",
> i.e. "grep -v --and -eX -eY" would return lines matching X *or* Y, but
> omit lines matching *both* X and Y.
> 
> PS: I can't decide if "--and" or "--intersection" is a better name.
> I put both in the bug subject so people searching for either will find
> this ticket.
> I think "--all" is probably too confusing.
> 
> 
> 
> Workaround #1
> =============
> I can work around this by listing every possible order, but 1) this
> scales poorly with the number of patterns; and 2) it can't be used
> with -F.  For example,
> 
> grep --and -eX -eY -eZ input*.txt   # becomes
> 
> # patterns are quoted so the shell does not glob-expand the "*"
> grep -e 'Z.*Y.*X' \
>      -e 'Z.*X.*Y' \
>      -e 'Y.*Z.*X' \
>      -e 'Y.*X.*Z' \
>      -e 'X.*Z.*Y' \
>      -e 'X.*Y.*Z' \
>      input*.txt
> 
> 
> Workaround #2
> =============
> I can pipe greps together.  This is what I currently do.
> This is more convenient and feels faster than workaround #1, but
> I suspect the inter-process overhead is significant.
> 
> If grep implemented this internally, it could zero-copy.
> Being able to "grep -rnH --and"  would also be convenient.
> 
> For example,
> 
> grep --and -F -eX -eY -eZ input*.txt   # becomes
> 
> cat input*.txt |
> grep -F -eX |
> grep -F -eY |
> grep -F -eZ




Bug#940464: grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)

2019-09-25 Thread Santiago Ruano Rincón
Hi,

On 16/09/19 at 12:09, Trent W. Buck wrote:
> Package: grep
> Version: 3.3-1
> Severity: wishlist
> 
> (Surely someone has already asked for this, but I can't see where.
> I may have already reported this myself, and forgotten.
> If so, sorry!)
> 
> Right now if you do
> 
> grep -eX -eY -eZ
> 
> You'll get lines that match *any of* X, Y, or Z.
> Quite often I want to search for lines that match *all of* X, Y, and Z,
> but in any order.
> For example,
> 
> # all 2TB 2.5-inch SATA products
> grep -Fwi -eSATA -e2TB -e2.5in products.csv
> 
> Below is a short discussion of the workarounds I know about.
> 
> Is "grep --and" something that has already been discussed and rejected?
> I looked through debbugs.gnu.org and the source tarball, but
> I couldn't find anything about this.
> 
…

Could you please forward your request to the upstream developers at
bug-g...@gnu.org? I know I could do it by myself, but I'd prefer to keep
you in the loop directly.

Cheers,

 -- Santiago




Bug#940464: grep --and -eX -eY -eZ (X∩Y∩Z intersection, not X∪Y∪Z union)

2019-09-15 Thread Trent W. Buck
Package: grep
Version: 3.3-1
Severity: wishlist

(Surely someone has already asked for this, but I can't see where.
I may have already reported this myself, and forgotten.
If so, sorry!)

Right now if you do

grep -eX -eY -eZ

You'll get lines that match *any of* X, Y, or Z.
Quite often I want to search for lines that match *all of* X, Y, and Z,
but in any order.
For example,

# all 2TB 2.5-inch SATA products
grep -Fwi -eSATA -e2TB -e2.5in products.csv

Below is a short discussion of the workarounds I know about.

Is "grep --and" something that has already been discussed and rejected?
I looked through debbugs.gnu.org and the source tarball, but
I couldn't find anything about this.


PS: grep -v --and would intuitively mean "not all",
i.e. "grep -v --and -eX -eY" would return lines matching X *or* Y, but
omit lines matching *both* X and Y.
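
To make that reading concrete, here is a minimal sketch using awk rather than grep. The function name not_both is hypothetical, X and Y are placeholder patterns, and awk uses ERE syntax where grep defaults to BRE. A strict negation of the intersection would also keep lines matching neither pattern; the sketch follows the narrower reading described above.

```shell
# Hypothetical helper: emulate "grep -v --and -eX -eY" as described above.
# Reads stdin; keeps lines matching X or Y, drops lines matching both.
not_both() {
    awk '(/X/ || /Y/) && !(/X/ && /Y/)'
}
```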

PS: I can't decide if "--and" or "--intersection" is a better name.
I put both in the bug subject so people searching for either will find
this ticket.
I think "--all" is probably too confusing.



Workaround #1
=============
I can work around this by listing every possible order, but 1) this
scales poorly with the number of patterns; and 2) it can't be used
with -F.  For example,

grep --and -eX -eY -eZ input*.txt   # becomes

# patterns are quoted so the shell does not glob-expand the "*"
grep -e 'Z.*Y.*X' \
     -e 'Z.*X.*Y' \
     -e 'Y.*Z.*X' \
     -e 'Y.*X.*Z' \
     -e 'X.*Z.*Y' \
     -e 'X.*Y.*Z' \
     input*.txt
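
For comparison, a single-process sketch that avoids enumerating the orderings by combining one test per pattern with awk's && operator. This is an alternative not mentioned in the report: the function name match_all_xyz is hypothetical, X, Y and Z are placeholders, and awk interprets them as EREs rather than grep's default BREs. It grows by one term per pattern instead of one alternative per permutation.

```shell
# Hypothetical helper: print lines matching all of X, Y and Z, in any order.
# Arguments (if any) are input files; with none, awk reads stdin.
match_all_xyz() {
    awk '/X/ && /Y/ && /Z/' "$@"
}
```

Usage would look like "match_all_xyz input*.txt". Unlike the .* form, each pattern is tested independently, so matches that overlap on the line also count.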


Workaround #2
=============
I can pipe greps together.  This is what I currently do.
This is more convenient and feels faster than workaround #1, but
I suspect the inter-process overhead is significant.

If grep implemented this internally, it could zero-copy.
Being able to "grep -rnH --and"  would also be convenient.

For example,

grep --and -F -eX -eY -eZ input*.txt   # becomes

cat input*.txt |
grep -F -eX |
grep -F -eY |
grep -F -eZ
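
The pipeline above can be wrapped for any number of patterns. A minimal sketch, assuming a POSIX shell; grep_all is a hypothetical helper name, not an existing grep feature. It still spawns one grep per pattern, so it does not remove the inter-process overhead.

```shell
# Hypothetical helper: grep_all PATTERN...
# Reads stdin and prints lines containing every PATTERN as a fixed string,
# by chaining one "grep -F" per pattern.
grep_all() {
    if [ "$#" -eq 0 ]; then
        cat                                   # no patterns left: pass through
    else
        grep -F -e "$1" | (shift; grep_all "$@")
    fi
}
```

For example, "cat input*.txt | grep_all X Y Z" is equivalent to the three-stage pipeline above.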