
Re: reportbug: don't know: bug in apt [list] or in grep

2023-01-20 Thread Michael


another excellent example why your posts are almost always worth reading.

thank you! :)

Re: reportbug: don't know: bug in apt [list] or in grep

2023-01-19 Thread Greg Wooledge

On Thu, Jan 19, 2023 at 02:00:00PM +0100, DdB wrote:
> Am 19.01.2023 um 13:13 schrieb Greg Wooledge:
> > The fact that this *appears* to work is what causes so much confusion.
> > It will "work" some of the time, but not all of the time, and you'll
> > get different results depending on which directory you're in, on which
> > computer.
> > 
> > Bash has two other settings for handling unmatched globs.  The first one
> > is called "nullglob", and if it's turned on, an unmatched glob is simply
> > discarded from the command argument list.
> 
> I was really curious, how Greg would put words to this one. And i gotta
> applaud: Such unambiguous explanations, and so circumspect at the same
> time. Even understanding the basis for confusion, i could learn
> something new from this (other settings ...).

Regarding the nullglob and failglob settings:

failglob is pretty safe.  You can play around with that, and see how
it affects things.  Other shells have their equivalent enabled by
default, so it's basically a preference.

nullglob should not be used in an interactive shell.  There are far too
many tools that rely on the shell's normal behavior (passing along an
unmatched glob), and which will break if you turn on nullglob and then
type something innocuous.  The classic example, again, is ls.  If you
have nullglob enabled, and you do "ls *.ttx" or whatever typo, you'll
get the behavior of "ls" with no arguments, which shows all the files,
rather than an error message.

The only place where nullglob is useful is in a script, and even then,
it has to be a script where extra care is used.  I believe the intended
use case is something like this:

  for f in *.txt; do
    printf 'Processing "%s"...' "$f"
    process -- "$f" && echo ' done.'
  done

With nullglob enabled, if there are no matching *.txt files, the loop
is skipped.  Without nullglob, the *.txt will be unchanged, and the
loop body will be executed once, passing the literal '*.txt' to the
printf and process commands.
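
The difference can be checked without touching any real files.  What
follows is only a minimal sketch: it assumes bash and mktemp are
available, and the variable names (scratch, loop, default_runs,
nullglob_runs) are made up for illustration.  It counts how often the
loop body runs when *.txt matches nothing.

```shell
# Empty scratch directory, so that *.txt matches nothing.
scratch=$(mktemp -d)

# The same loop as above, reduced to a counter (illustrative only).
loop='n=0; for f in *.txt; do n=$((n+1)); done; echo "$n"'

# Default settings: the pattern stays literal, so the body runs once.
default_runs=$(cd "$scratch" && bash -c "$loop")

# "bash -O nullglob" starts the shell with nullglob enabled: the body is skipped.
nullglob_runs=$(cd "$scratch" && bash -O nullglob -c "$loop")

echo "default: $default_runs, nullglob: $nullglob_runs"
rm -rf "$scratch"
```

Running the child shells with "bash -c" keeps the experiment away from
the settings of your interactive shell.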

There are several reasons why nullglob isn't on by default in scripts,
and isn't commonly used.  One of them is the same as with interactive
shells -- there are programs that may behave in a surprising way if
an unmatched glob is removed, rather than passed along.

Another reason is that you probably want to perform a basic sanity
check on your filenames before calling your "process" command.  The
glob might match things that aren't regular files -- directories, for
example, or FIFOs, which could cause "process" to fail or hang.  So,
you might prefer something like this:

  for f in *.txt; do
    test -f "$f" || continue
    printf 'Processing "%s"...' "$f"
    process -- "$f" && echo ' done.'
  done

Adding that basic sanity check also prevents processing an unmatched
*.txt glob as a filename, so nullglob just isn't needed here.
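
To see the sanity check at work, here is a small sketch under stated
assumptions: the directory layout and the helper name list_regular_txt
are invented for the demonstration, and the "process" command is
replaced by a plain printf.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/empty" "$scratch/mixed" "$scratch/mixed/dir.txt"
: > "$scratch/mixed/notes.txt"      # the only regular *.txt file

# Hypothetical helper: list the *.txt entries of a directory that are
# regular files.  The ( ) body runs in a subshell, so the cd is local.
list_regular_txt() (
    cd "$1" || exit 1
    for f in *.txt; do
        test -f "$f" || continue    # skips directories and the literal '*.txt'
        printf '%s\n' "$f"
    done
)

empty_result=$(list_regular_txt "$scratch/empty")
mixed_result=$(list_regular_txt "$scratch/mixed")

echo "empty: [$empty_result] mixed: [$mixed_result]"
rm -rf "$scratch"
```

In the empty directory the literal '*.txt' fails the test -f check, and
in the mixed one the directory dir.txt is skipped, all without nullglob.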

Re: reportbug: don't know: bug in apt [list] or in grep

2023-01-19 Thread tomas

On Thu, Jan 19, 2023 at 02:00:00PM +0100, DdB wrote:

[...]

> I was really curious, how Greg would put words to this one. And i gotta
> applaud: Such unambiguous explanations, and so circumspect at the same
> time. Even understanding the basis for confusion, i could learn
> something new from this (other settings ...).

Greg knows bash. Really. If you don't know his bash pages, you should:

  http://mywiki.wooledge.org/

Cheers
-- 
t



Re: reportbug: don't know: bug in apt [list] or in grep

2023-01-19 Thread DdB

On 19.01.2023 at 13:13, Greg Wooledge wrote:
> The fact that this *appears* to work is what causes so much confusion.
> It will "work" some of the time, but not all of the time, and you'll
> get different results depending on which directory you're in, on which
> computer.
> 
> Bash has two other settings for handling unmatched globs.  The first one
> is called "nullglob", and if it's turned on, an unmatched glob is simply
> discarded from the command argument list.

I was really curious, how Greg would put words to this one. And i gotta
applaud: Such unambiguous explanations, and so circumspect at the same
time. Even understanding the basis for confusion, i could learn
something new from this (other settings ...).

Excellent post there!
Thank you
DdB

Re: reportbug: don't know: bug in apt [list] or in grep

2023-01-19 Thread tomas

On Thu, Jan 19, 2023 at 07:13:46AM -0500, Greg Wooledge wrote:

[...]

> So, just to add to the list of people who've already said it: always
> quote the patterns that you pass to apt list, because you want apt
> to use them directly, without your shell interfering.

And, if in doubt, just replace the original command with "echo". This
will let you "see" what your command is going to see. I think it's a
good way to get a feeling for what's going on. These examples are from
bash with its default settings:

  tomas@trotzki:~$ echo foo*
  foo foo.txt
  tomas@trotzki:~$ echo blarg*
  blarg*
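
A variation on the same idea, as a rough sketch only: printf with one
argument per line makes the argument boundaries visible, which plain
echo hides. The helper name show_args and the scratch directory are
assumptions made for this illustration.

```shell
scratch=$(mktemp -d)
: > "$scratch/foo"
: > "$scratch/foo.txt"

# Hypothetical helper: print every argument it receives on its own line.
show_args() { printf '<%s>\n' "$@"; }

matched=$(cd "$scratch" && show_args foo*)      # glob expands to two names
unmatched=$(cd "$scratch" && show_args blarg*)  # no match: passed literally
quoted=$(cd "$scratch" && show_args 'foo*')     # quoted: never expanded

printf '%s\n' "$matched" "$unmatched" "$quoted"
rm -rf "$scratch"
```

The quoted case is the one a command like apt list needs: the pattern
arrives untouched no matter what files are lying around.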

Cheers
-- 
t


Re: reportbug: don't know: bug in apt [list] or in grep

2023-01-19 Thread Greg Wooledge

On Thu, Jan 19, 2023 at 12:11:43PM +0100, Christoph Brinkhaus wrote:
> Out of curiosity I have done a small test, shown below.
> Unfortunately some of the output is in German. For this comparison
> the exact meaning of the German text is not important.
> 
> 1. The first command of the original poster:
> chris@lenovo ~> apt list sudo*
> fish: No matches for wildcard 'sudo*'. See `help expand`.
> apt list sudo*
>  ^
> 2. Create an empty file to see the effect:
> chris@lenovo ~> touch sudo
> 
> 3. The first command of the original poster:
> chris@lenovo ~> apt list sudo*
> Auflistung… Fertig
> sudo/stable-security,now 1.9.5p2-3+deb11u1 amd64  [Installiert,automatisch]
> N: Es gibt 1 zusätzliche Version. Bitte verwenden Sie die Option »-a«, um sie 
> anzuzeigen.

Your examples are excellent, but there's one more piece to this story.
The behavior of a glob that doesn't match any files (e.g. your 1.)
depends on the shell, and on the settings that are chosen within that
shell.

In bash, with default settings, a glob that doesn't match any files
is passed on literally as a command argument.  The classic example
of this is demonstrated by "ls":

unicorn:~$ ls *.ttx
ls: cannot access '*.ttx': No such file or directory

If I misspell "*.txt" as "*.ttx" I get this message.  Bash (my shell)
saw the *.ttx glob, and tried to expand it to the list of matching
filenames in my directory.  There aren't any, so it passed the glob
along without expanding it.  This allowed ls to see the original glob
just as I had typed it, and to include it in its error message.

If I did the same thing with apt list sudo* I would get this:

unicorn:~$ ls sudo*
ls: cannot access 'sudo*': No such file or directory
unicorn:~$ apt list sudo*
Listing... Done
sudo-ldap/stable-security 1.9.5p2-3+deb11u1 amd64
sudo-ldap/stable-security 1.9.5p2-3+deb11u1 i386
sudo/stable-security,now 1.9.5p2-3+deb11u1 amd64 [installed]
sudo/stable-security 1.9.5p2-3+deb11u1 i386
sudoku-solver/stable 1.0.1-2 amd64
sudoku-solver/stable 1.0.1-2 i386
sudoku/stable 1.0.5-2+b3 amd64
sudoku/stable 1.0.5-2+b3 i386

Since there are no files matching the sudo* glob in my directory, bash
passes it along untouched, and apt uses it as a matching pattern against
package names.

The fact that this *appears* to work is what causes so much confusion.
It will "work" some of the time, but not all of the time, and you'll
get different results depending on which directory you're in, on which
computer.

Bash has two other settings for handling unmatched globs.  The first one
is called "nullglob", and if it's turned on, an unmatched glob is simply
discarded from the command argument list.

unicorn:~$ bash
unicorn:~$ shopt | grep glob
dotglob         off
extglob         off
failglob        off
globasciiranges on
globstar        off
nocaseglob      off
nullglob        off
unicorn:~$ shopt -s nullglob
unicorn:~$ apt list sudo* | head

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

Listing...
0ad-data-common/stable,stable 0.0.23.1-1.1 all
0ad-data/stable,stable 0.0.23.1-1.1 all
0ad/stable 0.0.23.1-5+b1 amd64
0ad/stable 0.0.23.1-5+b1 i386
0install-core/stable 2.16-1 amd64
0install-core/stable 2.16-1 i386
0install/stable 2.16-1 amd64
0install/stable 2.16-1 i386
0x/stable 0.9-1 amd64

In this case, the unmatched sudo* glob is dropped altogether, and the
resulting command is simply "apt list".  I knew what would happen, so I
piped it to head, to shorten the output.

The other setting is called "failglob", and if it's turned on, an
unmatched glob causes an error at the shell level, and prevents execution
of the command.

unicorn:~$ shopt -u nullglob; shopt -s failglob
unicorn:~$ apt list sudo*
bash: no match: sudo*
unicorn:~$ exit
exit

This is very much like what your fish example did, and what (t)csh does
by default, if I remember correctly.

So, just to add to the list of people who've already said it: always
quote the patterns that you pass to apt list, because you want apt
to use them directly, without your shell interfering.
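
The three behaviors can be compared side by side without involving apt
at all.  This is only a sketch under a few assumptions: bash is
installed, printf stands in for the real command, and the variable
names are made up.  "bash -O option" starts a shell with that shopt
option enabled.

```shell
scratch=$(mktemp -d)    # empty directory: sudo* matches no file here

# printf puts each argument in brackets; with no arguments it prints "<>".
unquoted='printf "<%s>" sudo*'
quoted='printf "<%s>" "sudo*"'

default_out=$(cd "$scratch" && bash -c "$unquoted")
nullglob_out=$(cd "$scratch" && bash -O nullglob -c "$unquoted")

# With failglob the command is never run and bash reports an error.
failglob_status=0
failglob_out=$(cd "$scratch" && bash -O failglob -c "$unquoted" 2>/dev/null) ||
    failglob_status=$?

# A quoted pattern is passed through unchanged even with nullglob on.
quoted_out=$(cd "$scratch" && bash -O nullglob -c "$quoted")

echo "default=$default_out nullglob=$nullglob_out quoted=$quoted_out"
echo "failglob: output=[$failglob_out] status=$failglob_status"
rm -rf "$scratch"
```

Only the quoted form gives the same result under every setting, which
is the whole argument for quoting.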

Re: reportbug: don't know: bug in apt [list] or in grep

2023-01-19 Thread Christoph Brinkhaus

On Thu, Jan 19, 2023 at 11:40:50AM +0100, to...@tuxteam.de wrote:

Hello Tomas,

> On Thu, Jan 19, 2023 at 11:31:23AM +0100, Christoph Brinkhaus wrote:
> > Am Thu, Jan 19, 2023 at 09:10:30AM +0100 schrieb js-p...@online.de:
> > 
> > Hello Julian,
> > 
> > > Hello together,
> > > listing packages in apt with ”sudo“ in the title returns different output 
> > > (bash commands at the end of the email). I would fill a bug report, but 
> > > I'm not sure whether to address it to grep or apt. How do you see this?
> > > 
> > > Kind regards
> > > Julian Schreck
> > > --
> > > $ apt list sudo*   vs.  $ apt list | grep "^sudo[a-z-]"
> > > $ apt list *sudo   vs.  $ apt list | grep "[a-z-]sudo/"
> > > $ apt list *sudo*  vs.  $ apt list | grep "sudo"
> > 
> > It seems to be more a shell topic or how man 7 glob is handled.
> > Please try the first patterns with "" signs as
> > apt list "sudo*" instead of apt list sudo* and so on.
> > It made the difference for me using the fish shell.
> 
> Doing "apt list sudo*" without any quoting is asking for trouble anyway,
> regardless of the shell you use.

Yes, quoting is the correct term. And it is of utmost importance,
especially if the structure of the input is not predefined, e.g. if it
may contain special characters such as the asterisk *, whitespace, and
so on.
> 
> What the command apt will ultimately see will depend on what files are
> in your current directory.

Out of curiosity I have done a small test, shown below.
Unfortunately some of the output is in German. For this comparison
the exact meaning of the German text is not important.

1. The first command of the original poster:
chris@lenovo ~> apt list sudo*
fish: No matches for wildcard 'sudo*'. See `help expand`.
apt list sudo*
 ^
2. Create an empty file to see the effect:
chris@lenovo ~> touch sudo

3. The first command of the original poster:
chris@lenovo ~> apt list sudo*
Auflistung… Fertig
sudo/stable-security,now 1.9.5p2-3+deb11u1 amd64  [Installiert,automatisch]
N: Es gibt 1 zusätzliche Version. Bitte verwenden Sie die Option »-a«, um sie 
anzuzeigen.

4. Run apt list sudo
The result is the same as 3.

5. Rename the empty file so that it no longer matches a package name:
mv sudo sudolala

6. The first command of the original poster:
chris@lenovo ~> apt list sudo*
Auflistung… Fertig

There is no package named sudolala, so nothing is listed.

> Don't do that. It will drive you crazy.

Sure. Thanks for prompting me to run these small tests.
I did not want to hijack the thread, but I think it makes sense to
show the effect of missing quoting.

Kind regards,
Christoph
-- 
Ist die Katze gesund
schmeckt sie dem Hund.

Re: reportbug: don't know: bug in apt [list] or in grep

2023-01-19 Thread tomas

On Thu, Jan 19, 2023 at 11:31:23AM +0100, Christoph Brinkhaus wrote:
> Am Thu, Jan 19, 2023 at 09:10:30AM +0100 schrieb js-p...@online.de:
> 
> Hello Julian,
> 
> > Hello together,
> > listing packages in apt with ”sudo“ in the title returns different output 
> > (bash commands at the end of the email). I would fill a bug report, but I'm 
> > not sure whether to address it to grep or apt. How do you see this?
> > 
> > Kind regards
> > Julian Schreck
> > --
> > $ apt list sudo*   vs.  $ apt list | grep "^sudo[a-z-]"
> > $ apt list *sudo   vs.  $ apt list | grep "[a-z-]sudo/"
> > $ apt list *sudo*  vs.  $ apt list | grep "sudo"
> 
> It seems to be more a shell topic or how man 7 glob is handled.
> Please try the first patterns with "" signs as
> apt list "sudo*" instead of apt list sudo* and so on.
> It made the difference for me using the fish shell.

Doing "apt list sudo*" without any quoting is asking for trouble anyway,
regardless of the shell you use.

What the command apt will ultimately see will depend on what files are
in your current directory.

Don't do that. It will drive you crazy.

Cheers
-- 
t



Re: reportbug: don't know: bug in apt [list] or in grep

2023-01-19 Thread Christoph Brinkhaus

On Thu, Jan 19, 2023 at 09:10:30AM +0100, js-p...@online.de wrote:

Hello Julian,

> Hello together,
> listing packages in apt with ”sudo“ in the title returns different output 
> (bash commands at the end of the email). I would fill a bug report, but I'm 
> not sure whether to address it to grep or apt. How do you see this?
> 
> Kind regards
> Julian Schreck
> --
> $ apt list sudo*   vs.  $ apt list | grep "^sudo[a-z-]"
> $ apt list *sudo   vs.  $ apt list | grep "[a-z-]sudo/"
> $ apt list *sudo*  vs.  $ apt list | grep "sudo"

It seems to be more of a shell topic, i.e. how globbing (man 7 glob)
is handled. Please try the first patterns in double quotes, as in
apt list "sudo*" instead of apt list sudo*, and so on.
It made the difference for me using the fish shell.

Kind regards,
Christoph
-- 
Ist die Katze gesund
schmeckt sie dem Hund.

Re: reportbug: don't know: bug in apt [list] or in grep

2023-01-19 Thread Markus Schönhaber


19.01.23, 09:10 +0100, js-p...@online.de:


> Hello together,
> listing packages in apt with ”sudo“ in the title returns different output
> (bash commands at the end of the email). I would fill a bug report, but I'm
> not sure whether to address it to grep or apt. How do you see this?


To me it seems there's neither a bug in apt nor in grep but rather in 
your regular expressions.



> Kind regards
> Julian Schreck
> --
> $ apt list sudo*   vs.  $ apt list | grep "^sudo[a-z-]"


The former also matches "sudo", the latter RE does not - it matches e.g.
"sudoa" or "sudo-".



> $ apt list *sudo   vs.  $ apt list | grep "[a-z-]sudo/"


The former also matches "sudo", the latter RE does not - it matches e.g.
"bsudo" or "-sudo".



> $ apt list *sudo*  vs.  $ apt list | grep "sudo"


This might give the same results.

BTW: using unquoted wildcards in parameters to shell commands is most 
often a bad idea (unless they are really meant to be file name patterns).
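
The mismatch between the glob and the regular expressions can be
reproduced on a few sample lines. This is a sketch only: the lines
below are invented for illustration in the "name/suite version arch"
shape that apt list prints; they are not real apt output.

```shell
# Hypothetical sample lines, not taken from a real package list.
sample='sudo/stable 1.9.5p2-3 amd64
sudo-ldap/stable 1.9.5p2-3 amd64
sudoku/stable 1.0.5-2 amd64
gosa-plugin-sudo/stable 2.7.4 all'

# ^sudo[a-z-] needs one more name character after "sudo", so it skips "sudo/".
prefix_re=$(printf '%s\n' "$sample" | grep '^sudo[a-z-]' | cut -d/ -f1)

# [a-z-]sudo/ needs a character before "sudo", so it skips "sudo/" as well.
suffix_re=$(printf '%s\n' "$sample" | grep '[a-z-]sudo/' | cut -d/ -f1)

# ^sudo alone behaves like the glob sudo* and also counts "sudo" itself.
glob_like=$(printf '%s\n' "$sample" | grep -c '^sudo')

printf 'prefix RE:\n%s\nsuffix RE:\n%s\n^sudo count: %s\n' \
    "$prefix_re" "$suffix_re" "$glob_like"
```

So a closer regular-expression counterpart of the quoted glob 'sudo*'
would be "^sudo", without the extra bracket expression.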


--
Regards
  mks

Re: reportbug: don't know: bug in apt [list] or in grep

2023-01-19 Thread DdB

On 19.01.2023 at 09:10, js-p...@online.de wrote:
> Hello together,
> listing packages in apt with ”sudo“ in the title returns different output 
> (bash commands at the end of the email). I would fill a bug report, but I'm 
> not sure whether to address it to grep or apt. How do you see this?
> 
> Kind regards
> Julian Schreck
> --
> $ apt list sudo*   vs.  $ apt list | grep "^sudo[a-z-]"
> $ apt list *sudo   vs.  $ apt list | grep "[a-z-]sudo/"
> $ apt list *sudo*  vs.  $ apt list | grep "sudo"
> 
> 

I do not understand the issue, nor do I see the "bug".
I do get different outputs from different commands though, as they are
not identical.

reportbug: don't know: bug in apt [list] or in grep

2023-01-19 Thread js-priv

Hello all,
listing packages in apt with "sudo" in the name returns different output (bash
commands at the end of the email). I would file a bug report, but I'm not sure
whether to address it to grep or apt. How do you see this?

Kind regards
Julian Schreck
--
$ apt list sudo*   vs.  $ apt list | grep "^sudo[a-z-]"
$ apt list *sudo   vs.  $ apt list | grep "[a-z-]sudo/"
$ apt list *sudo*  vs.  $ apt list | grep "sudo"

Re: use of awk instead of complex multielement commands (was Re: 'grep -o -m' (was Re: Can't mount CD image of Win95 game))

2022-12-20 Thread David

On Wed, 21 Dec 2022 at 04:18, Lee  wrote:
> On 12/20/22, David  wrote:

> > $ echo -e '100:CD001\nXXX\n200:CD001' | awk 'BEGIN { FS=":" ; done=0 }
> > /CD001/ && done==0 { print $1 - 50 ; done=1 }'
> > 50
>
> You can do it without flags:
>
> $ echo -e '100:CD001\nXXX\n200:CD001' | awk -F: '/CD001/ { print $1 -
> 50 ; exit }'
> 50

That's better indeed. Thanks for sharing those improvements!
It really is worthwhile to know some basics of 'awk'.

Re: use of awk instead of complex multielement commands (was Re: 'grep -o -m' (was Re: Can't mount CD image of Win95 game))

2022-12-20 Thread Lee

On 12/20/22, David  wrote:
> On Tue, 20 Dec 2022 at 22:04, David  wrote:
>> On Tue, 20 Dec 2022 at 22:02, David  wrote:
>
>> > $ echo -e '100:CD001\n200:CD001' | awk 'BEGIN { FS=":" } /CD001/ &&
>> > NR==1 { print $1 - 50 }'
>> > 50
>>
>> Oops, my mistake, that's not the solution. Give me another minute and I
>> will post a better one one.
>
> The below does a better job:
> (command should be all on one line)
>
> $ echo -e '100:CD001\nXXX\n200:CD001' | awk 'BEGIN { FS=":" ; done=0 }
> /CD001/ && done==0 { print $1 - 50 ; done=1 }'
> 50

You can do it without flags:

$ echo -e '100:CD001\nXXX\n200:CD001' | awk -F: '/CD001/ { print $1 - 50 ; exit }'
50
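
Why "exit" is the more robust choice than testing NR==1 shows up as
soon as the first matching line is not the first input line. A small
sketch, with made-up input and variable names:

```shell
# The first line does not match; the first match is on line 2.
input=$(printf 'XXX\n100:CD001\n200:CD001\n')

# NR==1 only ever looks at input line 1, so nothing is printed here.
nr_version=$(printf '%s\n' "$input" | awk -F: '/CD001/ && NR==1 { print $1 - 50 }')

# exit stops awk right after the first matching line, wherever it is.
exit_version=$(printf '%s\n' "$input" | awk -F: '/CD001/ { print $1 - 50 ; exit }')

echo "NR==1: [$nr_version]  exit: [$exit_version]"
```

The "exit" form also stops reading as soon as the answer is known,
which matters when the input is large.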

Re: use of awk instead of complex multielement commands (was Re: 'grep -o -m' (was Re: Can't mount CD image of Win95 game))

2022-12-20 Thread Stefan Monnier

> Not that that is always important. But I just commented today
> because so often 'awk' is ignored as if its only capability is 'print $1'
>  when in fact it is actually very powerful but neglected.

FWIW, `sed` can also do that job.  Tho the subtraction part would take
a lot more work (`sed` doesn't know how to subtract, so you'd have to
write a chunk of `sed` code which implements subtraction by hand.
A fun exercise for the masochists out there who like to write code for
Turing machines).


Stefan

Re: use of awk instead of complex multielement commands (was Re: 'grep -o -m' (was Re: Can't mount CD image of Win95 game))

2022-12-20 Thread David

On Tue, 20 Dec 2022 at 22:04, David  wrote:
> On Tue, 20 Dec 2022 at 22:02, David  wrote:

> > $ echo -e '100:CD001\n200:CD001' | awk 'BEGIN { FS=":" } /CD001/ &&
> > NR==1 { print $1 - 50 }'
> > 50
>
> Oops, my mistake, that's not the solution. Give me another minute and I
> will post a better one one.

The below does a better job:

$ echo -e '100:CD001\nXXX\n200:CD001' | awk 'BEGIN { FS=":" ; done=0 } /CD001/ && done==0 { print $1 - 50 ; done=1 }'
50

Still quite clean and obvious for a one liner (for folks who know how 'awk'
works), and it will be significantly faster than pipelines and
subshell collections.

Not that that is always important. But I just commented today
because so often 'awk' is ignored as if its only capability is 'print $1'
 when in fact it is actually very powerful but neglected.

Re: 'grep -o -m' (was Re: Can't mount CD image of Win95 game)

2022-12-20 Thread Thomas Schmitt

Hi,

The Wanderer wrote:
> With the '-o' option, grep prints only the parts of the line that were
> matched - but the plural here is very relevant. If that guess is
> correct, then the "line" in question has *four* occurrences, so grep
> prints them all - each on a separate line of output.

The man page agrees:

  -o, --only-matching
 Print only the matched (non-empty) parts  of  a  matching  line,
 with each such part on a separate output line.

So -o is probably inappropriate in my pipe.
(It seems to be a development remnant. I got the pipe from a similar one
 in an older mail of mine about hacking ISO images as stress test for
 GRUB's ISO 9660 reader.)
On the other hand it curbs the length of the output.

David wrote:
> Short demo:
> $ echo 100:CD001 | awk 'BEGIN { FS=":" } /CD001/ { print $1 - 50 }'
> 50

Looks like a good alternative to sed and expr.
(I keep in memory the gesture "awk '{print $1}'" for picking words out of
 lines. New stuff does not fit easily into that memory.)

> I only write this because I just magine how poor old 'awk' feels:
> "don't embed me in this pipelines and subshells and unnecessary
> commands, I can do all that stuff myself without any help!!".

My apologies to the venerable Awk Programming Language from an old
procedural programmer.

Have a nice day :)

Thomas

Re: use of awk instead of complex multielement commands (was Re: 'grep -o -m' (was Re: Can't mount CD image of Win95 game))

2022-12-20 Thread David

On Tue, 20 Dec 2022 at 22:02, David  wrote:

> $ echo -e '100:CD001\n200:CD001' | awk 'BEGIN { FS=":" } /CD001/ &&
> NR==1 { print $1 - 50 }'
> 50

Oops, my mistake, that's not the solution. Give me another minute and I
will post a better one.

Re: use of awk instead of complex multielement commands (was Re: 'grep -o -m' (was Re: Can't mount CD image of Win95 game))

2022-12-20 Thread David

On Tue, 20 Dec 2022 at 21:53, The Wanderer  wrote:
> On 2022-12-20 at 05:37, David wrote:
> > On Tue, 20 Dec 2022 at 21:10, The Wanderer  wrote:
> >> On 2022-12-20 at 02:51, Thomas Schmitt wrote:

> >>> This contradicts the promises of man grep about option -m.

> >> It does seem to, at least at a glance - but I think I've figured
> >> out what's going on, and it's actually consistent with the option
> >> set you gave.
> >
> > [...]
> >
> > Hi,
> >
> > Slightly offtopic rambling ...
> >
> > I haven't looked at the 'grep' part of the above expression, but
> > I assume that its output lines look something like:
> > 100:CD001
> >
> > If that is the case, then awk does not need any assistance
> > from 'expr' or 'sed' (and even not from 'grep' if we were not
> > searching a binary file).
> >
> > Short demo:
> > $ echo 100:CD001 | awk 'BEGIN { FS=":" } /CD001/ { print $1 - 50 }'
> > 50
>
> If you replace the "echo 100:CD001" with "echo -e
> '100:CD001\n200:CD001'" (not sure if that syntax is portable to all
> shells, but it works in my version of bash), this does print '50' and
> '150' on consecutive lines - which (if I'm not mistaken) matches the
> behavior of the original pipeline, but is not what is actually desired
> here.
>
> > I only write this because I just magine how poor old 'awk' feels:
> > "don't embed me in this pipelines and subshells and unnecessary
> > commands, I can do all that stuff myself without any help!!".
>
> Because of the above, it looks like a pipeline may still be necessary
> here, to filter it down to just the first number being output. Unless
> awk has another feature that would let us do that limiting internally too?

Fair point. Thanks for noticing my laziness. :)

But 'awk' does indeed have vast powers, that are sadly very overlooked
in modern times:

$ echo -e '100:CD001\n200:CD001' | awk 'BEGIN { FS=":" } /CD001/ &&
NR==1 { print $1 - 50 }'
50

And I'm not anything like an 'awk' expert.

I just like to share what little knowledge I have because it is a bit
sad when cool tools fall out of fashion.

use of awk instead of complex multielement commands (was Re: 'grep -o -m' (was Re: Can't mount CD image of Win95 game))

2022-12-20 Thread The Wanderer

On 2022-12-20 at 05:37, David wrote:

> On Tue, 20 Dec 2022 at 21:10, The Wanderer 
> wrote:
> 
>> On 2022-12-20 at 02:51, Thomas Schmitt wrote:

>>> This contradicts the promises of man grep about option -m.
>>
>> It does seem to, at least at a glance - but I think I've figured
>> out what's going on, and it's actually consistent with the option
>> set you gave.
> 
> [...]
> 
> Hi,
> 
> Slightly offtopic rambling ...
> 
> I haven't looked at the 'grep' part of the above expression, but
> I assume that its output lines look something like:
> 100:CD001
> 
> If that is the case, then awk does not need any assistance
> from 'expr' or 'sed' (and even not from 'grep' if we were not
> searching a binary file).
> 
> Short demo:
> $ echo 100:CD001 | awk 'BEGIN { FS=":" } /CD001/ { print $1 - 50 }'
> 50

If you replace the "echo 100:CD001" with "echo -e
'100:CD001\n200:CD001'" (not sure if that syntax is portable to all
shells, but it works in my version of bash), this does print '50' and
'150' on consecutive lines - which (if I'm not mistaken) matches the
behavior of the original pipeline, but is not what is actually desired
here.

> I only write this because I just magine how poor old 'awk' feels:
> "don't embed me in this pipelines and subshells and unnecessary
> commands, I can do all that stuff myself without any help!!".

Because of the above, it looks like a pipeline may still be necessary
here, to filter it down to just the first number being output. Unless
awk has another feature that would let us do that limiting internally too?

-- 
   The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man. -- George Bernard Shaw


Re: 'grep -o -m' (was Re: Can't mount CD image of Win95 game)

2022-12-20 Thread David

On Tue, 20 Dec 2022 at 21:10, The Wanderer  wrote:
> On 2022-12-20 at 02:51, Thomas Schmitt wrote:

> >>>   offst=$( expr \
> >>>  $( grep -a -o -b -m 1 CD001 cdimage.iso \
> >>> | sed -e 's/:/ /' \
> >>> | awk '{ print $1 }' ) - 32769 )
> >
> > The Wanderer wrote:
> >> Cutting down the command line led me to discover that even with '-m 1',
> >> four different numbers are printed by the grep-pipeline subshell.
> >> (Without '-m 1', seven are printed.)
> >
> > This contradicts the promises of man grep about option -m.
>
> It does seem to, at least at a glance - but I think I've figured out
> what's going on, and it's actually consistent with the option set you
> gave.

[...]

Hi,

Slightly offtopic rambling ...

I haven't looked at the 'grep' part of the above expression, but
I assume that its output lines look something like:
100:CD001

If that is the case, then awk does not need any assistance
from 'expr' or 'sed' (and even not from 'grep' if we were not
searching a binary file).

Short demo:
$ echo 100:CD001 | awk 'BEGIN { FS=":" } /CD001/ { print $1 - 50 }'
50

I only write this because I just imagine how poor old 'awk' feels:
"don't embed me in these pipelines and subshells and unnecessary
commands, I can do all that stuff myself without any help!!".

'grep -o -m' (was Re: Can't mount CD image of Win95 game)

2022-12-20 Thread The Wanderer

On 2022-12-20 at 02:51, Thomas Schmitt wrote:

> Hi,
> 
> i wrote:

>>> To obtain the offset of the first occurence of "CD001", do
>>>
>>>   offst=$( expr \
>>>  $( grep -a -o -b -m 1 CD001 cdimage.iso \
>>> | sed -e 's/:/ /' \
>>> | awk '{ print $1 }' ) - 32769 )
> 
> The Wanderer wrote:
>> Cutting down the command line led me to discover that even with '-m 1',
>> four different numbers are printed by the grep-pipeline subshell.
>> (Without '-m 1', seven are printed.)
> 
> This contradicts the promises of man grep about option -m.

It does seem to, at least at a glance - but I think I've figured out
what's going on, and it's actually consistent with the option set you
gave.

If I pass the same ISO through 'strings' before piping to grep, I find
that there are four occurrences of 'CD001' in the first 25 lines that
strings printed, and the next doesn't happen until line 20290.

My guess would be that grep is treating a "line" as ending with a
newline, and that there isn't a newline character in the ISO in question
until after all four of those occurrences.

With the '-o' option, grep prints only the parts of the line that were
matched - but the plural here is very relevant. If that guess is
correct, then the "line" in question has *four* occurrences, so grep
prints them all - each on a separate line of output.

The key to realizing how this interaction is consistent with the
documentation is that the man page for '-m' doesn't promise that it will
stop processing after the first match, but rather the first matched
*line*. And since a "line" in a binary input file can be very long (a
fact I know from lengthy and painful experience), it's entirely possible
for the first matched line to contain multiple matches - each of which
will then get printed.
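
The interaction can be reproduced on a tiny file instead of an ISO
image. This is a sketch under assumptions: it relies on GNU grep
(where -b combined with -o reports the byte offset of each matched
part), and the test data and variable names are invented.

```shell
# One "line" with four occurrences of CD001, then a second line with one more.
data=$(mktemp)
printf 'xxCD001yyCD001zzCD001wwCD001\nCD001\n' > "$data"

# -m 1 stops after the first matching LINE, so all four parts are printed.
first_line_matches=$(grep -a -o -b -m 1 CD001 "$data")

# To get only the first offset, let awk quit after the first record.
first_offset=$(grep -a -o -b CD001 "$data" | awk -F: '{ print $1 ; exit }')

printf '%s\n' "$first_line_matches"
echo "first offset: $first_offset"
rm -f "$data"
```

With the offset isolated like this, the subtraction from the original
pipeline can be done in the same awk action.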

-- 
   The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man. -- George Bernard Shaw


Re: Exploring grep-dctrl

2022-11-30 Thread Yassine Chaouche


On 11/30/22 at 4:02 PM, Yassine Chaouche wrote:



But even then, translation files might contain more than one entry for
the same package, maybe one for each (version x architecture) combination.
For example:

$ grep-dctrl winbind -s Description-en /var/lib/apt/lists/security.ubuntu.com_ubuntu_dists_trusty-security_main_i18n_Translation-en
Description-en: Samba winbind client library - development files
  Samba is an implementation of the SMB/CIFS protocol for Unix systems,
  providing support for cross-platform file and printer sharing with
  Microsoft Windows, OS X, and other Unix systems.
  .
  This package provides the development files (static library and headers)
  required for building applications against libwbclient, a library for client
  applications that interact via the winbind pipe protocol with a Samba
  winbind server.
Description-en: Samba winbind client library
  Samba is an implementation of the SMB/CIFS protocol for Unix systems,
  providing support for cross-platform file and printer sharing with
  Microsoft Windows, OS X, and other Unix systems.
  .
  This package provides a library for client applications that interact
  via the winbind pipe protocol with a Samba winbind server.
Description-en: SMB/CIFS file, print, and login server for Unix
  Samba is an implementation of the SMB/CIFS protocol for Unix systems,
  providing support for cross-platform file and printer sharing with
  Microsoft Windows, OS X, and other Unix systems.  Samba can also function
  as an NT4-style domain controller, and can integrate with both NT4 domains
  and Active Directory realms as a member server.
  .
  This package provides the components necessary to use Samba as a stand-alone
  file and print server or as an NT4 or Active Directory domain controller.
  For use in an NT4 domain or Active Directory realm, you will also need the
  winbind package.
  .
  This package is not required for connecting to existing SMB/CIFS servers
  (see smbclient) or for mounting remote filesystems (see cifs-utils).
Description-en: service to resolve user and group information from Windows NT servers
  Samba is an implementation of the SMB/CIFS protocol for Unix systems,
  providing support for cross-platform file sharing with Microsoft Windows, OS X,
  and other Unix systems.  Samba can also function as a domain controller
  or member server in both NT4-style and Active Directory domains.
  .
  This package provides winbindd, a daemon which integrates authentication
  and directory service (user/group lookup) mechanisms from a Windows
  domain on a Linux system.
  .
  Winbind based user/group lookups via /etc/nsswitch.conf can be enabled via
  the libnss-winbind package. Winbind based Windows domain authentication can
  be enabled via the libpam-winbind package.
$



Whoops! I was missing the -PX flags; all good now.


$ grep-dctrl -PX winbind -s Description-en /var/lib/
Description-en: service to resolve user and group information from Windows NT se
 Samba is an implementation of the SMB/CIFS protocol for Unix systems,
 providing support for cross-platform file sharing with Microsoft Windows, OS X,
 and other Unix systems.  Samba can also function as a domain controller
 or member server in both NT4-style and Active Directory domains.
 .
 This package provides winbindd, a daemon which integrates authentication
 and directory service (user/group lookup) mechanisms from a Windows
 domain on a Linux system.
 .
 Winbind based user/group lookups via /etc/nsswitch.conf can be enabled via
 the libnss-winbind package. Winbind based Windows domain authentication can
 be enabled via the libpam-winbind package.
$



--
Yassine -- sysadm

Re: Exploring grep-dctrl

2022-11-30 Thread Yassine Chaouche


On 11/29/22 at 5:48 PM, David Wright wrote:

> Please don't post HTML, but text.


Sorry, I thought my messages were multipart.
I just realized they were not after your message.
I'll have to fine-tune my MUA (thunderbird).
For the moment, I changed it to text only.


> On Sun 27 Nov 2022 at 17:25:45 (+0100), Yassine Chaouche wrote:
> > I tried to achieve the same w/o using apt-cache, but couldn't.
> >
> > My failed attempts were :
> >
> > 1/
> > 16:37:50 ~ -1- $ grep-dctrl -PX syslog-summary
> > /var/lib/apt/lists/*_Packages
>
> Those are the wrong files for the descriptions; you want *_Translation-en
>
> Cheers,
> David.



Thanks!
Now I only need to find a way to exit grep-dctrl on first match.
Skimming through the manpage I didn't find anything related,
except for -q which also disables printing.

I'm thinking about turning a single grep-dctrl call with multiple files
into a loop that would end after first grep-dctrl success exit code,
something like :


function package.describe2 {
    # Try each translation file in turn; stop after the first one that matches.
    for file in /var/lib/apt/lists/*_Translation-en
    do
        grep-dctrl -s Description-en "$1" "$file" && printf "%64s : %s\n" "$file" "$?" && break
    done
}


But even then, translation files might contain more than one entry for the
same package, maybe one for each (version x architecture) product.
For example:

$ grep-dctrl winbind -s Description-en /var/lib/apt/lists/security.ubuntu.com_ubuntu_dists_trusty-security_main_i18n_Translation-en
Description-en: Samba winbind client library - development files
 Samba is an implementation of the SMB/CIFS protocol for Unix systems,
 providing support for cross-platform file and printer sharing with
 Microsoft Windows, OS X, and other Unix systems.
 .
 This package provides the development files (static library and headers)
 required for building applications against libwbclient, a library for client
 applications that interact via the winbind pipe protocol with a Samba
 winbind server.
Description-en: Samba winbind client library
 Samba is an implementation of the SMB/CIFS protocol for Unix systems,
 providing support for cross-platform file and printer sharing with
 Microsoft Windows, OS X, and other Unix systems.
 .
 This package provides a library for client applications that interact
 via the winbind pipe protocol with a Samba winbind server.
Description-en: SMB/CIFS file, print, and login server for Unix
 Samba is an implementation of the SMB/CIFS protocol for Unix systems,
 providing support for cross-platform file and printer sharing with
 Microsoft Windows, OS X, and other Unix systems.  Samba can also function
 as an NT4-style domain controller, and can integrate with both NT4 domains
 and Active Directory realms as a member server.
 .
 This package provides the components necessary to use Samba as a stand-alone
 file and print server or as an NT4 or Active Directory domain controller.
 For use in an NT4 domain or Active Directory realm, you will also need the
 winbind package.
 .
 This package is not required for connecting to existing SMB/CIFS servers
 (see smbclient) or for mounting remote filesystems (see cifs-utils).
Description-en: service to resolve user and group information from Windows NT servers
 Samba is an implementation of the SMB/CIFS protocol for Unix systems,
 providing support for cross-platform file sharing with Microsoft Windows, OS X,
 and other Unix systems.  Samba can also function as a domain controller
 or member server in both NT4-style and Active Directory domains.
 .
 This package provides winbindd, a daemon which integrates authentication
 and directory service (user/group lookup) mechanisms from a Windows
 domain on a Linux system.
 .
 Winbind based user/group lookups via /etc/nsswitch.conf can be enabled via
 the libnss-winbind package. Winbind based Windows domain authentication can
 be enabled via the libpam-winbind package.
$


Should I just continue hacking around until I get desired results
(maybe process output with awk)
or is there a better approach to this? (reaching grep-dctrl limits)
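
For the awk post-processing mentioned above, a minimal sketch might look like the following. It assumes the grep-dctrl output is a run of Description-en fields like the listing above; the function name first_description is made up for illustration. It prints the first field with its continuation lines and stops as soon as a second "Description-en:" header shows up.

```shell
# first_description (hypothetical helper): read grep-dctrl-style output on
# stdin and print only the first Description-en field, including its
# indented continuation lines.
first_description() {
    awk '
        /^Description-en:/ { if (seen) exit; seen = 1 }  # second header: stop
        seen { print }                                    # inside first field
    '
}

# Demo on canned input; in real use, pipe grep-dctrl into the function.
printf 'Description-en: first\n body one\nDescription-en: second\n body two\n' |
    first_description
```

In real use this would sit at the end of a pipeline, e.g. grep-dctrl -PX winbind -s Description-en /var/lib/apt/lists/*_Translation-en | first_description. Note that it only trims the output; it does not pick between versions or architectures.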


Best,
--
Yassine -- sysadm

Re: Exploring grep-dctrl

2022-11-29 Thread David Wright

Please don't post HTML, but text.

On Sun 27 Nov 2022 at 17:25:45 (+0100), Yassine Chaouche wrote:
> I tried to achieve the same w/o using apt-cache, but couldn't. 
>   
>   My failed attempts were : 
>   
>   
> 1/
> 16:37:50 ~ -1- $ grep-dctrl -PX syslog-summary
> /var/lib/apt/lists/*_Packages

Those are the wrong files for the descriptions; you want *_Translation-en

Cheers,
David.

Exploring grep-dctrl

2022-11-27 Thread Yassine Chaouche


  
  
Hello there,
  
  
I am exploring the possibilities of grep-dctrl.
My current experiment is to try and show the description of a
package that is not necessarily installed.
I have defined a package.describe function in my .bashrc
that does the following:

  $ package.describe ()
  { apt-cache show "$1" | grep-dctrl -s Description-en -
  }

I get the desired result:

  16:40:35 ~ -1- $ package.describe syslog-summary
  Description-en: summarize the contents of a syslog log file
   This program summarizes the contents of a log file written by syslog,
   by displaying each unique (except for the time) line once, and also
   the number of times such a line occurs in the input. The lines are
   displayed in the order they occur in the input.
   .
   It is also possible to define some "ignore rules" using regular
   expressions.
  16:42:15 ~ -1- $
  
  

  

I tried to achieve the same w/o using apt-cache, but couldn't. 
  
  My failed attempts were : 
  
  
1/
16:37:50 ~ -1- $ grep-dctrl -PX syslog-summary /var/lib/apt/lists/*_Packages
Package: syslog-summary
Priority: optional
Section: universe/admin
Installed-Size: 80
Maintainer: Ubuntu Developers
Original-Maintainer: David Paleino
Architecture: all
Version: 1.14-2
Depends: python (>= 2.5)
Recommends: python-magic
Filename: pool/universe/s/syslog-summary/syslog-summary_1.14-2_all.deb
Size: 9798
MD5sum: 1694a9b5722f7264f7fb9485c9367f8e
SHA1: c10d8bbf1fc65bcabd2094bd3510fe72925c8d2f

Re: grep replacement using sed is behaving oddly

2022-10-22 Thread Max Nikulin


On 22/10/2022 20:23, Gary Dale wrote:

>     sed -i '//d' *.html
>
> did the trick.

I would suggest using a more specific pattern, to avoid removing
meaningful text because of a lost newline character:


sed -i -e '/^\s*]*>\s*$/d'

"." in regexp may be a source of surprises (or catastrophic backtracking 
in more complex regexps)


printf 'A bc\n' | sed -e 's/<.*>/\n/'
A
c

printf 'A bc\n' | sed -e 's/<.[^>]>/\n/g'
A
b
c
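
The difference between the two patterns can be sketched on a made-up line with two tags. The input string and the "|" replacement are assumptions for illustration, not the data from the original post. A greedy "<.*>" runs from the first "<" to the last ">" on the line, while "<[^>]*>" cannot cross a ">" and so matches each tag on its own.

```shell
# Hypothetical input with two tags on one line.
line='A<br>b<br>c'

# Greedy: "<.*>" spans from the first "<" to the LAST ">" on the line,
# so the text between the two tags is swallowed as well.
greedy=$(printf '%s\n' "$line" | sed -e 's/<.*>/|/')

# Bounded: "<[^>]*>" stops at the first ">", so each tag matches separately.
bounded=$(printf '%s\n' "$line" | sed -e 's/<[^>]*>/|/g')

printf 'greedy:  %s\nbounded: %s\n' "$greedy" "$bounded"
```

Here the greedy form yields "A|c" and the bounded form "A|b|c".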

Re: grep replacement using sed is behaving oddly

2022-10-22 Thread Gary Dale


On 2022-10-21 15:14, David Wright wrote:

> On Fri 21 Oct 2022 at 14:15:01 (-0400), Greg Wooledge wrote:
> > On Fri, Oct 21, 2022 at 08:01:00PM +0200, to...@tuxteam.de wrote:
> > > On Fri, Oct 21, 2022 at 01:21:44PM -0400, Gary Dale wrote:
> > > > I'm hoping someone can tell me what I'm doing wrong. I have a line in a lot
> > > > of HTML files that I'd like to remove. The line is:
> > > >
> > > >     
> > > >
> > > > I'm testing the sed command to remove it on just one file. When it works,
> > > > I'll run it against *.html. My command is:
> > > >
> > > >  sed -i -s 's/\s*\//g' history.html
> > > >
> > > > Unfortunately, the replacement doesn't remove the line but rather leaves me
> > > > with:
> > > >
> > > >     <;">
> > >
> > > This looks as if the <> in the regexp were interpreted as left and right
> > > word boundaries (but that would only be the case if you'd given the -E
> > > (or -r) option).
> > >
> > > Try explicitly adding the --posix option, perhaps...
> >
> > Gary is using non-POSIX syntax (specifically the \s), so that's not going
> > to help unless he first changes his regular expression to be standard.
>
> The whitespace is tricky. I pasted the email into emacs, and I see
> that there are NO-BREAK SPACEs at the start, and one after "hr".
> Who knows whether they're really in the OP's files, or just put
> there by their MUA.
>
> > I think you might be on to something with the \< and \> here.  I can see
> > absolutely no reason why Gary put backslashes in front of spaces and
> > angle brackets here.
>
> I'm guessing the reason is guessing.
>
> > The backslashes in front of the spaces are probably
> > just noise, and can be ignored.  The \< and \> on the other hand might
> > be interpreted as something special, the same way \s is (because this is
> > GNU sed, which loves to do nonstandard things).
> >
> > unicorn:~$ echo 'abc  xyz' | sed 's/<.*>//'
> > abc  xyz
> > unicorn:~$ echo 'abc  xyz' | sed 's/\<.*\>//'
> >
> > unicorn:~$
> >
> > So... yeah, \< and/or \> clearly have some special meaning to GNU sed.
> > Good luck figuring out what that is.
>
> Word boundaries, as tomas said. The .*\> can be seen to have worked,
> as matching stopped after the end of the word "rem", leaving the
> punctuation behind.
>
> > For Gary's actual problem, simply removing the backslashes where they're
> > not wanted would be a good start.  Actually learning sed could be step 2.
>
> The man/info pages leave a lot to be desired. A table with columns that showed:
>
>   code  supported by  effect
>   \s    -e            match all whitespace except NON-BREAK or whatever
>         --posix
>         -E
>         --posix -E
>         or whatever
>
> might really help. As it is, unless you're looking at a real book,
> you get a table like:
>
>   '\s'
>  Matches whitespace characters (spaces and tabs).  Newlines embedded
>  in the pattern/hold spaces will also match:
>
>   '\S'
>  Matches non-whitespace characters.
>
>   '\<'
>  Matches the beginning of a word.
>
>   '\>'
>  Matches the end of a word.
>
> but it's next to impossible to keep track of whether you're in a
> section that's speaking POSIX, GNU, or some mid-20th century tradition.
>
> > I feel obliged at this point to mention that parsing HTML with regular
> > expressions is a fool's errand, and that sed should not be the tool of
> > choice here.  Nor should grep, nor any other RE-based tool.  This goes
> > triple when one doesn't even know the correct syntax for their RE.
> >
> > https://stackoverflow.com/q/1732348
>
> To be fair, I'm not sure whether the OP is really trying to parse
> HTML, or just remove some similar strings that they see as redundant.
>
> Cheers,
> David.


Thanks. This command

    sed -i '//d' *.html

did the trick.

I've gotten into the habit of escaping special characters rather than 
memorizing the full list of which ones need to be escaped. I do most of 
my editing in Kate but use sed from time to time when making the same 
change to all the files in a web site, as was the case here. Obviously 
I wasn't aware of the special meaning of \< and \> in sed...


Thanks again.

Re: grep replacement using sed is behaving oddly

2022-10-21 Thread tomas

On Sat, Oct 22, 2022 at 10:32:24AM +0700, Max Nikulin wrote:
> On 22/10/2022 02:09, The Wanderer wrote:
> > 
> > 'info sed', section 'sed regular expressions', subsection 'regular
> > expression extensions':
> 
> While a reader may find more interesting stuff lying around while traveling
> by this path, there is a shorthand
> 
> info "(sed) regexp extensions"

Yes, good point.

Cheers
-- 
t



Re: grep replacement using sed is behaving oddly

2022-10-21 Thread Max Nikulin


On 22/10/2022 02:09, The Wanderer wrote:


'info sed', section 'sed regular expressions', subsection 'regular
expression extensions':


While a reader may find more interesting stuff lying around while 
traveling by this path, there is a shorthand


info "(sed) regexp extensions"

and alternatives that might be more friendly:

- infobrowser "(sed) regexp extensions"
  (configurable through update-alternatives)
- tkinfo "(sed) regexp extensions"
- emacs -f info-standalone "(sed) regexp extensions"
  or
  M-: (info "(sed) regexp extensions") RET
  within a running emacs instance

https://www.gnu.org/software/sed/manual/html_node/regexp-extensions.html
https://www.gnu.org/software/sed/manual/sed.html#regexp-extensions

Re: grep replacement using sed is behaving oddly

2022-10-21 Thread The Wanderer

On 2022-10-21 at 16:16, Greg Wooledge wrote:

> On Fri, Oct 21, 2022 at 03:09:32PM -0400, The Wanderer wrote:
> 
>> IOW, each seems to be half of the usual '\b' (edge of a word) set.
>> With the default sed behavior (not sure whether that's basic
>> regular expressions or extended regular expressions, in the
>> nomenclature of the info document), you can replace the latter
>> with an alternation of both of the former:
> 
> The things you're discussing here, \< \> \b, are all GNU extensions. 
> They're *not* part of the POSIX standard BRE and ERE languages.
> 
> This doesn't mean you should stop using them.  Just that you should
> be aware that your script is not going to be portable if you use
> them.

Sure. My point was more that it's not terribly hard to find out what the
"special meaning to GNU sed" that this syntax has is; you just have to
do a bit of searching in the info document.

-- 
   The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man. -- George Bernard Shaw


Re: grep replacement using sed is behaving oddly

2022-10-21 Thread Greg Wooledge

On Fri, Oct 21, 2022 at 03:09:32PM -0400, The Wanderer wrote:
> IOW, each seems to be half of the usual '\b' (edge of a word) set. With
> the default sed behavior (not sure whether that's basic regular
> expressions or extended regular expressions, in the nomenclature of the
> info document), you can replace the latter with an alternation of
> both of the former:

The things you're discussing here, \< \> \b, are all GNU extensions.
They're *not* part of the POSIX standard BRE and ERE languages.

This doesn't mean you should stop using them.  Just that you should be
aware that your script is not going to be portable if you use them.

Re: grep replacement using sed is behaving oddly

2022-10-21 Thread David Wright

On Fri 21 Oct 2022 at 14:15:01 (-0400), Greg Wooledge wrote:
> On Fri, Oct 21, 2022 at 08:01:00PM +0200, to...@tuxteam.de wrote:
> > On Fri, Oct 21, 2022 at 01:21:44PM -0400, Gary Dale wrote:
> > > I'm hoping someone can tell me what I'm doing wrong. I have a line in a 
> > > lot
> > > of HTML files that I'd like to remove. The line is:
> > > 
> > >     
> > > 
> > > I'm testing the sed command to remove it on just one file. When it works,
> > > I'll run it against *.html. My command is:
> > > 
> > >  sed -i -s 's/\s*\//g' history.html
> > > 
> > > Unfortunately, the replacement doesn't remove the line but rather leaves 
> > > me
> > > with:
> > > 
> > >     <;">
> > 
> > This looks as if the <> in the regexp were interpreted as left and right
> > word boundaries (but that would only be the case if you'd given the -E
> > (or -r) option).
> > 
> > Try explicitly adding the --posix option, perhaps...
> 
> Gary is using non-POSIX syntax (specifically the \s), so that's not going
> to help unless he first changes his regular expression to be standard.

The whitespace is tricky. I pasted the email into emacs, and I see
that there are NO-BREAK SPACEs at the start, and one after "hr".
Who knows whether they're really in the OP's files, or just put
there by their MUA.

> I think you might be on to something with the \< and \> here.  I can see
> absolutely no reason why Gary put backslashes in front of spaces and
> angle brackets here.

I'm guessing the reason is guessing.

> The backslashes in front of the spaces are probably
> just noise, and can be ignored.  The \< and \> on the other hand might
> be interpreted as something special, the same way \s is (because this is
> GNU sed, which loves to do nonstandard things).
> 
> unicorn:~$ echo 'abc  xyz' | sed 's/<.*>//'
> abc  xyz
> unicorn:~$ echo 'abc  xyz' | sed 's/\<.*\>//'
> 
> unicorn:~$ 
> 
> So... yeah, \< and/or \> clearly have some special meaning to GNU sed.
> Good luck figuring out what that is.

Word boundaries, as tomas said. The .*\> can be seen to have worked,
as matching stopped after the end of the word "rem", leaving the
punctuation behind.

> For Gary's actual problem, simply removing the backslashes where they're
> not wanted would be a good start.  Actually learning sed could be step 2.

The man/info pages leave a lot to be desired. A table with columns that showed:

  code  supported by  effect
  \s    -e            match all whitespace except NON-BREAK or whatever
        --posix
        -E
        --posix -E
        or whatever

might really help. As it is, unless you're looking at a real book,
you get a table like:

  '\s'
 Matches whitespace characters (spaces and tabs).  Newlines embedded
 in the pattern/hold spaces will also match:

  '\S'
 Matches non-whitespace characters.

  '\<'
 Matches the beginning of a word.

  '\>'
 Matches the end of a word.

but it's next to impossible to keep track of whether you're in a
section that's speaking POSIX, GNU, or some mid-20th century tradition.

> I feel obliged at this point to mention that parsing HTML with regular
> expressions is a fool's errand, and that sed should not be the tool of
> choice here.  Nor should grep, nor any other RE-based tool.  This goes
> triple when one doesn't even know the correct syntax for their RE.
> 
> https://stackoverflow.com/q/1732348

To be fair, I'm not sure whether the OP is really trying to parse
HTML, or just remove some similar strings that they see as redundant.

Cheers,
David.

Re: grep replacement using sed is behaving oddly

2022-10-21 Thread The Wanderer

On 2022-10-21 at 14:15, Greg Wooledge wrote:

> So... yeah, \< and/or \> clearly have some special meaning to GNU
> sed. Good luck figuring out what that is.

'info sed', section 'sed regular expressions', subsection 'regular
expression extensions':

>> '\<'
>>  Matches the beginning of a word.
>> 
>>   $ echo "abc %-= def." | sed 's/\</X/g'
>>   Xabc %-= Xdef.
>> 
>> '\>'
>>  Matches the end of a word.
>> 
>>   $ echo "abc %-= def." | sed 's/\>/X/g'
>>   abcX %-= defX.

IOW, each seems to be half of the usual '\b' (edge of a word) set. With
the default sed behavior (not sure whether that's basic regular
expressions or extended regular expressions, in the nomenclature of the
info document), you can replace the latter with an alternation of
both of the former:

echo 'a b c' | sed 's/\(\<\|\>\)b\b//'
a  c
echo 'a b c' | sed 's/\bb\b//'
a  c

-- 
   The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man. -- George Bernard Shaw


Re: grep replacement using sed is behaving oddly

2022-10-21 Thread tomas

On Fri, Oct 21, 2022 at 02:15:01PM -0400, Greg Wooledge wrote:
> On Fri, Oct 21, 2022 at 08:01:00PM +0200, to...@tuxteam.de wrote:
> > On Fri, Oct 21, 2022 at 01:21:44PM -0400, Gary Dale wrote:
> > > I'm hoping someone can tell me what I'm doing wrong. I have a line in a 
> > > lot
> > > of HTML files that I'd like to remove. The line is:
> > > 
> > >     
> > > 
> > > I'm testing the sed command to remove it on just one file. When it works,
> > > I'll run it against *.html. My command is:
> > > 
> > >  sed -i -s 's/\s*\//g' history.html
> > > 
> > > Unfortunately, the replacement doesn't remove the line but rather leaves 
> > > me
> > > with:
> > > 
> > >     <;">
> > 
> > This looks as if the <> in the regexp were interpreted as left and right
> > word boundaries (but that would only be the case if you'd given the -E
> > (or -r) option).
> > 
> > Try explicitly adding the --posix option, perhaps...
> 
> Gary is using non-POSIX syntax (specifically the \s), so that's not going
> to help unless he first changes his regular expression to be standard.

Yes, but he's telling sed to use POSIX aka "obsolete", following the jargon
of man (7) regex (by not overriding the default, which is POSIX/obsolete).
Unless something else is at work (@Gary: does "which sed" say /bin/sed?)

> I think you might be on to something with the \< and \> here.  I can see
> absolutely no reason why Gary put backslashes in front of spaces and
> angle brackets here.

They shouldn't do anything for spaces, since they are ordinary characters.
But HEY! I got that the wrong way around: escaping the <> makes them special:
Gary -- take away the backslashes from the angle brackets. That should help.
And as Greg says -- also from the spaces, that should unobfuscate your
regexp a bit.

All that said, deleting the line with sed is what you want anyway, as
noted by Greg elsewhere in the thread.

> The backslashes in front of the spaces are probably
> just noise, and can be ignored.  The \< and \> on the other hand might
> be interpreted as something special, the same way \s is (because this is
> GNU sed, which loves to do nonstandard things).

No, you are absolutely correct, my mind had a twist. With -E, you can use
<> as word boundary matches, without the -E, those are \< and \>.

> 
> unicorn:~$ echo 'abc  xyz' | sed 's/<.*>//'
> abc  xyz
> unicorn:~$ echo 'abc  xyz' | sed 's/\<.*\>//'
> 
> unicorn:~$ 
> 
> So... yeah, \< and/or \> clearly have some special meaning to GNU sed.
> Good luck figuring out what that is.

Word boundaries: the zero-width string between the last non-word character
and the first word character ("<") and that one between the last word
character and the following non-word character (">"). PCRE has those,
too.
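
A small sketch of what that means in practice, assuming GNU sed (\< and \> are GNU extensions) and using made-up input strings rather than the line from the original post. The first command marks the boundaries; the other two show why escaping the angle brackets leaves them behind, while the unescaped form removes the whole tag.

```shell
# Assumes GNU sed: \< and \> are GNU extensions, not POSIX.
# The input strings are made up for illustration.

# Mark every start-of-word with "[" and every end-of-word with "]".
marked=$(printf '%s\n' 'abc %-= def.' | sed 's/\</[/g; s/\>/]/g')

# Escaped brackets are zero-width word boundaries, so only "hr" is
# removed and the literal "<" and ">" stay behind.
escaped=$(printf '%s\n' '<hr>' | sed 's/\<hr\>//')

# Unescaped brackets are ordinary characters, so the whole tag goes.
plain=$(printf '%s\n' '<hr>' | sed 's/<hr>//')

printf 'marked:  %s\nescaped: %s\nplain:   [%s]\n' "$marked" "$escaped" "$plain"
```

With GNU sed, "escaped" comes out as "<>", which is the same kind of leftover as in the original report.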

> For Gary's actual problem, simply removing the backslashes where they're
> not wanted would be a good start.  Actually learning sed could be step 2.

Exactly.

> I feel obliged at this point to mention that parsing HTML with regular
> expressions is a fool's errand, and that sed should not be the tool of
> choice here.  Nor should grep, nor any other RE-based tool.  This goes
> triple when one doesn't even know the correct syntax for their RE.

Definitely. As far as HTML is concerned, Gary's line

is totally equivalent to

(actually there must not be a whitespace between < and the hr).

Some day the monster generating the HTML becomes creative, and the
debugging session is interesting. I guess Gary knows that.

(I've got a nice anecdote about processing of XML line by line with
Perl and funny stuff, but I won't bore you with that :-)

Cheers
-- 
t


Re: grep replacement using sed is behaving oddly

2022-10-21 Thread Greg Wooledge

On Fri, Oct 21, 2022 at 08:01:00PM +0200, to...@tuxteam.de wrote:
> On Fri, Oct 21, 2022 at 01:21:44PM -0400, Gary Dale wrote:
> > I'm hoping someone can tell me what I'm doing wrong. I have a line in a lot
> > of HTML files that I'd like to remove. The line is:
> > 
> >     
> > 
> > I'm testing the sed command to remove it on just one file. When it works,
> > I'll run it against *.html. My command is:
> > 
> >  sed -i -s 's/\s*\//g' history.html
> > 
> > Unfortunately, the replacement doesn't remove the line but rather leaves me
> > with:
> > 
> >     <;">
> 
> This looks as if the <> in the regexp were interpreted as left and right
> word boundaries (but that would only be the case if you'd given the -E
> (or -r) option).
> 
> Try explicitly adding the --posix option, perhaps...

Gary is using non-POSIX syntax (specifically the \s), so that's not going
to help unless he first changes his regular expression to be standard.

I think you might be on to something with the \< and \> here.  I can see
absolutely no reason why Gary put backslashes in front of spaces and
angle brackets here.  The backslashes in front of the spaces are probably
just noise, and can be ignored.  The \< and \> on the other hand might
be interpreted as something special, the same way \s is (because this is
GNU sed, which loves to do nonstandard things).

unicorn:~$ echo 'abc  xyz' | sed 's/<.*>//'
abc  xyz
unicorn:~$ echo 'abc  xyz' | sed 's/\<.*\>//'

unicorn:~$ 

So... yeah, \< and/or \> clearly have some special meaning to GNU sed.
Good luck figuring out what that is.

For Gary's actual problem, simply removing the backslashes where they're
not wanted would be a good start.  Actually learning sed could be step 2.

I feel obliged at this point to mention that parsing HTML with regular
expressions is a fool's errand, and that sed should not be the tool of
choice here.  Nor should grep, nor any other RE-based tool.  This goes
triple when one doesn't even know the correct syntax for their RE.

https://stackoverflow.com/q/1732348

Re: grep replacement using sed is behaving oddly

2022-10-21 Thread tomas

On Fri, Oct 21, 2022 at 01:50:29PM -0400, Greg Wooledge wrote:
> On Fri, Oct 21, 2022 at 01:21:44PM -0400, Gary Dale wrote:
> >  sed -i -s 's/\s*\//g' history.html
> > 
> > Unfortunately, the replacement doesn't remove the line but rather leaves me
> > with:
> > 
> >     <;">
> 
> The 's' command in sed doesn't remove lines.  It performs a substitution
> within a line.
> 
> To remove lines, use the 'd' command instead.

This is even better. Still the substitution above looks funny,
doesn't it?

Cheers
-- 
t



Re: grep replacement using sed is behaving oddly

2022-10-21 Thread tomas

On Fri, Oct 21, 2022 at 01:21:44PM -0400, Gary Dale wrote:
> I'm hoping someone can tell me what I'm doing wrong. I have a line in a lot
> of HTML files that I'd like to remove. The line is:
> 
>     
> 
> I'm testing the sed command to remove it on just one file. When it works,
> I'll run it against *.html. My command is:
> 
>  sed -i -s 's/\s*\//g' history.html
> 
> Unfortunately, the replacement doesn't remove the line but rather leaves me
> with:
> 
>     <;">

This looks as if the <> in the regexp were interpreted as left and right
word boundaries (but that would only be the case if you'd given the -E
(or -r) option).

Try explicitly adding the --posix option, perhaps...

And oh, option "-i" implies "-s" according to the docs. But this shouldn't
matter (and much less explain the behaviour you're seeing).

Cheers
-- 
t


Re: grep replacement using sed is behaving oddly

2022-10-21 Thread Greg Wooledge

On Fri, Oct 21, 2022 at 01:21:44PM -0400, Gary Dale wrote:
>  sed -i -s 's/\s*\//g' history.html
> 
> Unfortunately, the replacement doesn't remove the line but rather leaves me
> with:
> 
>     <;">

The 's' command in sed doesn't remove lines.  It performs a substitution
within a line.

To remove lines, use the 'd' command instead.

unicorn:~$ printf 'line %s\n' {1..10} | sed '/7/d'
line 1
line 2
line 3
line 4
line 5
line 6
line 8
line 9
line 10

grep replacement using sed is behaving oddly

2022-10-21 Thread Gary Dale

I'm hoping someone can tell me what I'm doing wrong. I have a line in a 
lot of HTML files that I'd like to remove. The line is:


    


I'm testing the sed command to remove it on just one file. When it 
works, I'll run it against *.html. My command is:


 sed -i -s 's/\s*\//g' history.html

Unfortunately, the replacement doesn't remove the line but rather leaves 
me with:


    <;">

The leading spaces, angle brackets and some (but not all) of the 
punctuation are left behind. Moreover, if I try to remove the EOL by 
adding \n after the \>, the replace fails (and yes, the closing bracket 
is the last character on the line).


I get the same behaviour under both Bullseye and Bookworm so I assume 
this is how sed is supposed to operate. However, when I try the same 
regex in Kate, it works.


Is this a long-standing bug in sed or am I doing something wrong?

Thanks

Re: grep: show matching line from pattern file

2022-06-03 Thread DdB

Hello,

of course, there are different ways to solve this; I like the perl
approach. But since I am not all that familiar with the language
myself, I'd like to add 2 pointers:
The (M)AWK scripting language can do similar things (read syslog once,
loop over the regular expressions and output anything you want about them).
But if you can live with calling egrep repeatedly, I would suggest GNU
parallel, which works like a much enhanced xargs: it uses several cores
in parallel by default and also handles the command line in a much
improved way (special syntax, so to speak). I am certain it allows
coding your request as a one-liner, though probably not as efficiently
as perl or awk would.
BTW: GNU parallel is in the Debian repos, but in a quite outdated version.
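
The awk route could look roughly like this. It is only a sketch: the function name pattern_hits is made up, the patterns are interpreted by awk as EREs (close to, but not identical with, grep's), and the pattern file is assumed to be non-empty.

```shell
# pattern_hits PATTERNS LOG (hypothetical helper): for each line of PATTERNS,
# print "<pattern line number> TAB <number of matching LOG lines> TAB <pattern>".
# The log is read once; every log line is tested against every pattern.
pattern_hits() {
    awk '
        NR == FNR { pat[FNR] = $0; n = FNR; next }    # first file: patterns
        { for (i = 1; i <= n; i++) if ($0 ~ pat[i]) hits[i]++ }
        END {
            for (i = 1; i <= n; i++)
                printf "%d\t%d\t%s\n", i, hits[i] + 0, pat[i]
        }
    ' "$1" "$2"
}
```

It would be called as pattern_hits patterns.txt /var/log/syslog. Patterns with a count of 0 matched nothing in the log.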

Have fun, DdB

Re: grep: show matching line from pattern file

2022-06-03 Thread Richard Hector

On 3/06/22 07:17, Greg Wooledge wrote:

> On Thu, Jun 02, 2022 at 03:12:23PM -0400, duh wrote:
> > > > Jim Popovitch wrote on 28/05/2022 21:40:
> > > > > I have a file of regex patterns and I use grep like so:
> > > > >
> > > > >  ~$ grep -f patterns.txt /var/log/syslog
> > > > >
> > > > > What I'd like to get is a listing of all lines, specifically the line
> > > > > numbers of the regexps in patterns.txt, that match entries in
> > > > > /var/log/syslog.   Is there a way to do this?
> >
> > $cat -n /var/log/syslog | grep warn
> >
> > and it found "warn" in the syslog file and provided line numbers. I have
> > not used the -f option
>
> You're getting the line numbers from the log file.  The OP wanted the line
> numbers of the patterns in the -f pattern file.
>
> Why?  I have no idea.  There is no standard option to do this, because
> it's not a common requirement.  That's why I wrote one from scratch
> in perl.

I don't know what the OP's use case is, but here's an example I might use:

I have a bunch of custom ignore files for logcheck. After a software 
upgrade, I might want to check which patterns no longer match anything, 
and can be deleted or modified.

I'd really still want to check with real egrep, though, rather than 
using perl's re engine instead.
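
A sketch of that check with grep's own ERE engine, one grep call per pattern. The function name unused_patterns is made up for illustration, and a pattern that makes grep fail with an error is reported the same way as one that matches nothing.

```shell
# unused_patterns PATTERNS LOG (hypothetical helper): print "N:pattern" for
# every line N of PATTERNS that matches no line of LOG under grep -E.
unused_patterns() {
    n=0
    while IFS= read -r pat; do
        n=$((n + 1))
        # -q: exit status only; -e keeps a pattern starting with "-" from
        # being taken as an option.
        grep -Eq -e "$pat" -- "$2" || printf '%d:%s\n' "$n" "$pat"
    done < "$1"
}
```

This rereads the log once per pattern, so it is slow on large files, but the matching is done by the same engine that would later use the patterns.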

Cheers,
Richard

Re: Bash and the PS1 environment variable [was: grep: show matching line from pattern file]

2022-06-02 Thread David Christensen


On 6/2/22 22:50, Will Mengarini wrote:

> * David Christensen  [22-06/02=Th 19:18 -0700]:
> > [...]
> > Now I can almost match your prompt -- there is a dash before 'bash':
> >
> > 2022-06-02 19:05:10 dpchrist@laalaa ~
> > $ PS1="\\h/${TTY#/dev/} \\s$SHLVL \\w \\A \$?\\\$"
> > laalaa/pts/8 -bash1 ~ 19:08 0$
> >
> > The dash seems to be coming from the '\s' bash(1) -> PROMPTING ->
> > backslash-escaped special characters:
> >
> > 2022-06-02 19:12:58 dpchrist@laalaa ~
> > $ PS1="\\s"
> > -bash
>
> The dash indicates you're running a login shell.  That's useful
> information, because a login shell is initialized differently.
> See section INVOCATION in `man bash`.
>
> You see in the rendered prompt it says "-bash1"; "1" is $SHLVL.
> If you run another bash from inside that bash, and again set PS1
> as you did above, you should see "bash2" instead of "-bash1".



RTFM bash(1), I see:

  PROMPTING

  \s     the name of the shell, the basename of $0 (the portion
         following the final slash)


  INVOCATION

  A login shell is one whose first character of argument zero is a -,
  or one started with the --login option.



Thank you for the explanation of your PS1 and the subtleties of Bash, 
variable expansion, remove matching prefix pattern, $0, and PS1 '\s'.



David

Re: Bash and the PS1 environment variable [was: grep: show matching line from pattern file]

2022-06-02 Thread Will Mengarini

* David Christensen  [22-06/02=Th 19:18 -0700]:
> [...]
> Now I can almost match your prompt -- there is a dash before 'bash':
>
> 2022-06-02 19:05:10 dpchrist@laalaa ~
> $ PS1="\\h/${TTY#/dev/} \\s$SHLVL \\w \\A \$?\\\$"
> laalaa/pts/8 -bash1 ~ 19:08 0$
>
> The dash seems to be coming from the '\s' bash(1) -> PROMPTING ->
> backslash-escaped special characters:
>
> 2022-06-02 19:12:58 dpchrist@laalaa ~
> $ PS1="\\s"
> -bash

The dash indicates you're running a login shell.  That's useful
information, because a login shell is initialized differently.
See section INVOCATION in `man bash`.

You see in the rendered prompt it says "-bash1"; "1" is $SHLVL.
If you run another bash from inside that bash, and again set PS1
as you did above, you should see "bash2" instead of "-bash1".
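
A small sketch of the two checks described above. The helper names are
made up for illustration; "shopt -q login_shell" is Bash's own read-only
option and also covers shells started with --login.

```bash
#!/usr/bin/env bash
# Hypothetical helper: does this argv[0] follow the login-shell naming convention?
is_login_name() {
    case $1 in
        -*) return 0 ;;   # leading dash, e.g. "-bash"
        *)  return 1 ;;   # plain name, e.g. "bash"
    esac
}

# Hypothetical helper: report what the running Bash thinks of itself.
describe_shell() {
    if shopt -q login_shell; then
        printf 'login shell, $0=%s, SHLVL=%s\n' "$0" "$SHLVL"
    else
        printf 'non-login shell, $0=%s, SHLVL=%s\n' "$0" "$SHLVL"
    fi
}
```

Running describe_shell in the outer shell and again in a nested bash
should show the login status and SHLVL changing as described.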

Re: Bash and the PS1 environment variable [was: grep: show matching line from pattern file]

2022-06-02 Thread David Christensen


On 6/2/22 19:25, Greg Wooledge wrote:

On Thu, Jun 02, 2022 at 06:01:11PM -0700, David Christensen wrote:

This is my PS1.  '\u' does not work on all of Debian, FreeBSD, Cygwin, and
macOS, so the expansion of ${USER} is inserted between two string literals
when .profile runs and sets PS1:

2022-06-02 17:39:09 dpchrist@laalaa ~
$ grep PS1 .profile
export PS1='\n\D{%Y-%m-%d %H:%M:%S} '${USER}'@\h \w\n\$ '


Variable expansions *are* performed when PS1 is evaluated.  So, you
could simply do:

PS1='stuff $USER more stuff'

That will delay the expansion of USER (which by the way is a BSD-ism)
until the prompt is drawn.  The way you've written it, $USER is
expanded at the time PS1 is assigned.  Which is not wrong, so long as
the value of USER cannot change in the middle of a shell session...
but it's not how most of the experienced people would do it, I think.

I'm rather curious how you managed to find a system + bash version
where \u doesn't work.  That sounds like something you'd want to report
to the bug-bash mailing list... or, possibly, a user error.  \u should
work on any Unix-like system.



I found the reason for not using '\u' in Bash PS1 -- the 'toor' user on 
FreeBSD:


2022-06-02 21:30:08 toor@f3 ~
# freebsd-version ; uname -a ; bash --version
12.3-RELEASE-p5
FreeBSD f3.tracy.holgerdanske.com 12.3-RELEASE-p5 FreeBSD 
12.3-RELEASE-p5 GENERIC  amd64

GNU bash, version 5.1.16(0)-release (amd64-portbld-freebsd12.3)
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>


This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

2022-06-02 21:30:16 toor@f3 ~
# egrep '^(root|toor)' /etc/passwd
root:*:0:0:Charlie &:/root:/bin/csh
toor:*:0:0:Bourne-again Superuser:/root:/usr/local/bin/bash

2022-06-02 21:30:28 toor@f3 ~
# whoami
root

2022-06-02 21:30:30 toor@f3 ~
# echo $USER
toor

2022-06-02 21:30:34 toor@f3 ~
# PS1='\n\D{%Y-%m-%d %H:%M:%S} \u@\h \w\n\$ '

2022-06-02 21:30:41 root@f3 ~
#


Your suggestion produces the desired prompt:

2022-06-02 21:30:41 root@f3 ~
# PS1='\n\D{%Y-%m-%d %H:%M:%S} $USER@\h \w\n\$ '

2022-06-02 21:31:29 toor@f3 ~
#


But, I think I will keep the braces:

2022-06-02 21:55:38 toor@f3 ~
# vi .profile

export PS1='\n\D{%Y-%m-%d %H:%M:%S} ${USER}@\h \w\n\$ '

2022-06-02 21:56:04 toor@f3 ~
# . .profile

2022-06-02 21:56:06 toor@f3 ~
# echo $PS1
\n\D{%Y-%m-%d %H:%M:%S} ${USER}@\h \w\n\$

2022-06-02 21:56:10 toor@f3 ~
#


(It works on all of my other platforms.)
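
One plausible reading of the transcript above is that \u, like whoami, is
resolved from the numeric UID, so it reports the first passwd entry for
UID 0, while USER keeps the name used at login. A sketch for listing every
account name that shares a UID; the function name is hypothetical, and
systems that use other user databases would need getent or similar instead:

```bash
#!/usr/bin/env bash
# Print every login name whose passwd entry carries the given numeric UID.
# $1 = UID, $2 = passwd-format file (defaults to /etc/passwd).
names_for_uid() {
    awk -F: -v uid="$1" '$3 == uid { print $1 }' "${2:-/etc/passwd}"
}
```

On the FreeBSD box shown above, names_for_uid 0 would be expected to list
both root and toor.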


Thank you,

David

Re: Bash and the PS1 environment variable [was: grep: show matching line from pattern file]

2022-06-02 Thread Greg Wooledge

On Thu, Jun 02, 2022 at 06:01:11PM -0700, David Christensen wrote:
> This is my PS1.  '\u' does not work on all of Debian, FreeBSD, Cygwin, and
> macOS, so the expansion of ${USER} is inserted between two string literals
> when .profile runs and sets PS1:
> 
> 2022-06-02 17:39:09 dpchrist@laalaa ~
> $ grep PS1 .profile
> export PS1='\n\D{%Y-%m-%d %H:%M:%S} '${USER}'@\h \w\n\$ '

Variable expansions *are* performed when PS1 is evaluated.  So, you
could simply do:

PS1='stuff $USER more stuff'

That will delay the expansion of USER (which by the way is a BSD-ism)
until the prompt is drawn.  The way you've written it, $USER is
expanded at the time PS1 is assigned.  Which is not wrong, so long as
the value of USER cannot change in the middle of a shell session...
but it's not how most of the experienced people would do it, I think.

I'm rather curious how you managed to find a system + bash version
where \u doesn't work.  That sounds like something you'd want to report
to the bug-bash mailing list... or, possibly, a user error.  \u should
work on any Unix-like system.
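
A sketch of the difference between expanding at assignment time and at
prompt time. It assumes Bash 4.4 or later, where ${var@P} expands a string
the way a prompt would be expanded; demo_name is a made-up variable
standing in for USER.

```bash
#!/usr/bin/env bash
# Needs bash 4.4 or later for the ${var@P} prompt-style expansion.
render_prompt() {
    local template=$1
    printf '%s\n' "${template@P}"
}

demo_name=alice                    # hypothetical variable standing in for USER
early='stuff '${demo_name}' more'  # expanded right here, at assignment time
late='stuff ${demo_name} more'     # kept literal until the string is rendered
demo_name=bob                      # the value changes mid-session
render_prompt "$early"             # still carries the old value
render_prompt "$late"              # picks up the new value
```

The first line printed keeps "alice" and the second shows "bob", which is
the delayed expansion described above.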

Re: Bash and the PS1 environment variable [was: grep: show matching line from pattern file]

2022-06-02 Thread David Christensen


On 6/2/22 18:35, Will Mengarini wrote:

* David Christensen  [22-06/02=Th 18:01 -0700]:

On 6/2/22 17:12, Will Mengarini wrote:

* David Christensen [22-06/02=Th 15:50 -0700]:

On 6/2/22 15:13, Will Mengarini wrote:



In this transcript, the number before the prompt-ending '$' is $?:

debian/pts/4 bash3 ~ 14:56 0$perl -e 'open "gweeblefleep" || die'
debian/pts/4 bash3 ~ 14:57 0$perl -e 'open "gweeblefleep" or die'
Died at -e line 1.
debian/pts/4 bash3 ~ 14:57 2$



What is your shell?  PS1?


The shell is Bash 5.1.4.
PS1="\\h/${TTY#/dev/} \\s$SHLVL \\w \\A \$?\\\$"



The snippet '${TTY#/dev/}' seems to produce ' -' on my computer.
How does your computer produce 'pts/4 ' and what does it mean?


'pts/4' is an abbreviation of '/dev/pts/4', pseudoterminal 4.

TTY is `tty`; it's been so long I'd forgotten that's not available in
all shells.  You should have the 'tty' program; it's in coreutils.


Is there a reason why you are using
double quotes, rather than single quotes?


So I can interpolate stuff like ${TTY#/dev/}.  In your case,
you'll need to set TTY=`tty` before setting PS1, so Bash
can use string substitution to remove '/dev/' from it.



Okay.  That explains TTY and bash(1) -> Parameter Expansion -> When not 
performing substring expansion -> Remove matching prefix pattern:


2022-06-02 19:04:45 dpchrist@laalaa ~
$ TTY=`tty`

2022-06-02 19:04:53 dpchrist@laalaa ~
$ echo "'$TTY'"
'/dev/pts/8'

2022-06-02 19:04:58 dpchrist@laalaa ~
$ echo "'${TTY#/dev/}'"
'pts/8'


Now I can almost match your prompt -- there is a dash before 'bash':

2022-06-02 19:05:10 dpchrist@laalaa ~
$ PS1="\\h/${TTY#/dev/} \\s$SHLVL \\w \\A \$?\\\$"
laalaa/pts/8 -bash1 ~ 19:08 0$


The dash seems to be coming from the '\s' bash(1) -> PROMPTING -> 
backslash-escaped special characters:


2022-06-02 19:12:58 dpchrist@laalaa ~
$ PS1="\\s"
-bash


David

Re: Bash and the PS1 environment variable [was: grep: show matching line from pattern file]

2022-06-02 Thread Will Mengarini

* David Christensen  [22-06/02=Th 18:01 -0700]:
>On 6/2/22 17:12, Will Mengarini wrote:
>> * David Christensen [22-06/02=Th 15:50 -0700]:
>>> On 6/2/22 15:13, Will Mengarini wrote:
>
>>>> In this transcript, the number before the prompt-ending '$' is $?:
>>>> 
>>>> debian/pts/4 bash3 ~ 14:56 0$perl -e 'open "gweeblefleep" || die'
>>>> debian/pts/4 bash3 ~ 14:57 0$perl -e 'open "gweeblefleep" or die'
>>>> Died at -e line 1.
>>>> debian/pts/4 bash3 ~ 14:57 2$
>>>> 
>>>
>>> What is your shell?  PS1?
>>
>> The shell is Bash 5.1.4.
>> PS1="\\h/${TTY#/dev/} \\s$SHLVL \\w \\A \$?\\\$"
>
> Interesting.
>
> This is my daily driver:
>
> 2022-06-02 17:38:55 dpchrist@laalaa ~
> $ cat /etc/debian_version ; uname -a ; dpkg-query -W bash
> 11.3
> Linux laalaa 5.10.0-14-amd64 #1 SMP Debian 5.10.113-1 (2022-04-29) x86_64
> GNU/Linux
> bash  5.1-2+b3
>
> This is my PS1. '\u' does not work on all of Debian, FreeBSD,
> Cygwin, and macOS, so the expansion of ${USER} is inserted
> between two string literals when .profile runs and sets PS1:
>
> 2022-06-02 17:39:09 dpchrist@laalaa ~
> $ grep PS1 .profile
> export PS1='\n\D{%Y-%m-%d %H:%M:%S} '${USER}'@\h \w\n\$ '
> #export PS1='\n\D{%Y-%m-%d %H:%M:%S} \u@\h \w\n\$ '
>
> Testing your PS1:
>
> 2022-06-02 17:45:03 dpchrist@laalaa ~
> $ PS1="\\h/${TTY#/dev/} \\s$SHLVL \\w \\A \$?\\\$"
> laalaa/ -bash1 ~ 17:45 0$
>
> The snippet '${TTY#/dev/}' seems to produce ' -' on my computer.
> How does your computer produce 'pts/4 ' and what does it mean?

'pts/4' is an abbreviation of '/dev/pts/4', pseudoterminal 4.

TTY is `tty`; it's been so long I'd forgotten that's not available in
all shells.  You should have the 'tty' program; it's in coreutils.

> Is there a reason why you are using
> double quotes, rather than single quotes?

So I can interpolate stuff like ${TTY#/dev/}.  In your case,
you'll need to set TTY=`tty` before setting PS1, so Bash
can use string substitution to remove '/dev/' from it.
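
A minimal sketch of the prefix removal, wrapped in a function so it can be
tried on any path. The function name is made up for illustration.

```bash
#!/usr/bin/env bash
# ${var#pattern} removes the shortest match of pattern from the start of var.
strip_dev_prefix() {
    local path=$1
    printf '%s\n' "${path#/dev/}"
}

# Typical use before setting PS1 (commented out: needs a real terminal):
#   TTY=$(tty)
#   PS1="\\h/${TTY#/dev/} \\s$SHLVL \\w \\A \$?\\\$"
```

If TTY is unset, ${TTY#/dev/} simply expands to nothing, which matches the
"laalaa/ -bash1" prompt seen earlier.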

Bash and the PS1 environment variable [was: grep: show matching line from pattern file]

2022-06-02 Thread David Christensen


On 6/2/22 17:12, Will Mengarini wrote:
> * David Christensen [22-06/02=Thu 15:50 -0700]:
>> On 6/2/22 15:13, Will Mengarini wrote:

>>> In this transcript, the number before the prompt-ending '$' is $?:
>>> 
>>> debian/pts/4 bash3 ~ 14:56 0$perl -e 'open "gweeblefleep" || die'
>>> debian/pts/4 bash3 ~ 14:57 0$perl -e 'open "gweeblefleep" or die'
>>> Died at -e line 1.
>>> debian/pts/4 bash3 ~ 14:57 2$
>>> 
>>
>> What is your shell?  PS1?
>
> The shell is Bash 5.1.4.
> PS1="\\h/${TTY#/dev/} \\s$SHLVL \\w \\A \$?\\\$"


Interesting.


This is my daily driver:

2022-06-02 17:38:55 dpchrist@laalaa ~
$ cat /etc/debian_version ; uname -a ; dpkg-query -W bash
11.3
Linux laalaa 5.10.0-14-amd64 #1 SMP Debian 5.10.113-1 (2022-04-29) 
x86_64 GNU/Linux

bash  5.1-2+b3


This is my PS1.  '\u' does not work on all of Debian, FreeBSD, Cygwin, 
and macOS, so the expansion of ${USER} is inserted between two string 
literals when .profile runs and sets PS1:


2022-06-02 17:39:09 dpchrist@laalaa ~
$ grep PS1 .profile
export PS1='\n\D{%Y-%m-%d %H:%M:%S} '${USER}'@\h \w\n\$ '
#export PS1='\n\D{%Y-%m-%d %H:%M:%S} \u@\h \w\n\$ '


Testing your PS1:

2022-06-02 17:45:03 dpchrist@laalaa ~
$ PS1="\\h/${TTY#/dev/} \\s$SHLVL \\w \\A \$?\\\$"
laalaa/ -bash1 ~ 17:45 0$


The snippet '${TTY#/dev/}' seems to produce ' -' on my computer.   How 
does your computer produce 'pts/4 ' and what does it mean?



Is there a reason why you are using double quotes, rather than single 
quotes?



David

Re: grep: show matching line from pattern file

2022-06-02 Thread Will Mengarini

* David Christensen  [22-06/02=Thu 15:50 -0700]:
> On 6/2/22 15:13, Will Mengarini wrote:
>> * Greg Wooledge  [22-05/28=Sa 17:11 -0400]:
>>> [...]
>>> #!/usr/bin/perl
>>> use strict; use warnings;
>>> [...]
>>> open PATS, "<patterns.txt" || die [...]
>>> [...]
>>
>> You need "or die", not "|| die", because of precedence: what you coded
>> checks whether "<patterns.txt" is logically true, whereas you
>> wanted to check whether the result of open() is logically true.
>
> +1  That is a good explanation of a Perl fine point/ gotcha.
>
>> In this transcript, the number before the prompt-ending '$' is $?:
>> 
>> debian/pts/4 bash3 ~ 14:56 0$perl -e 'open "gweeblefleep" || die'
>> debian/pts/4 bash3 ~ 14:57 0$perl -e 'open "gweeblefleep" or die'
>> Died at -e line 1.
>> debian/pts/4 bash3 ~ 14:57 2$
>> 
>
> What is your shell?  PS1?

The shell is Bash 5.1.4.  My PS1 is constructed by an elaborate script
that's old enough to have sex in Thailand, but you can get the effect
of what I posted by setting PS1 with the line

PS1="\\h/${TTY#/dev/} \\s$SHLVL \\w \\A \$?\\\$"

assuming you're running at least Bash 2.05a.  You may prefer

PS1="\\h/${TTY#/dev/} \\s^$SHLVL \\w \\A \$?\\\$"
.

My original script was coded for Bash 1.4.7, and had to do

PS1="\\h${TTY#/dev/} \\s$SHLVL \\w \`s=\$?;date +%H:%M;exit \$s\` \$?\\\$"

because \A wasn't available, so 'date' had to be run in a subshell
that needed to take care to save and restore $?.  (The variable
it uses for that, s, goes away when the subshell does; and that
scary-looking exit just exits the subshell, resetting $?.)
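
The save-and-restore trick can be tried outside a prompt. In this sketch
the function name is hypothetical, and "( exit N )" stands in for whatever
command ran before the prompt was drawn.

```bash
#!/usr/bin/env bash
# Print "HH:MM status", showing that $? survives the command substitution.
stamp_keeping_status() {
    local stamp rc
    ( exit "$1" )                           # pretend the previous command exited with $1
    stamp=$(s=$?; date +%H:%M; exit "$s")   # subshell saves $?, runs date, exits with the saved value
    rc=$?                                   # an assignment takes the substitution's exit status
    printf '%s %s\n' "$stamp" "$rc"
}
```

Without the "exit $s", rc would be the exit status of date, normally 0,
and the original status would be lost.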

-- 
 Will Mengarini  
 Free software: the Source will be with you, always.
   sh -c 'echo -n MENGARINI|sum -s|colrm 4'

Re: grep: show matching line from pattern file

2022-06-02 Thread David Christensen


On 6/2/22 15:13, Will Mengarini wrote:

* Greg Wooledge  [22-05/28=Sa 17:11 -0400]:

[...]
#!/usr/bin/perl
use strict; use warnings;
[...]
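open PATS, "<patterns.txt" || die [...]
[...]

You need "or die", not "|| die", because of precedence: what you coded
checks whether "<patterns.txt" is logically true, whereas you
wanted to check whether the result of open() is logically true.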


+1  That is a good explanation of a Perl fine point/ gotcha.



In this transcript, the number before the prompt-ending '$' is $?:

debian/pts/4 bash3 ~ 14:56 0$perl -e 'open "gweeblefleep" || die'
debian/pts/4 bash3 ~ 14:57 0$perl -e 'open "gweeblefleep" or die'
Died at -e line 1.
debian/pts/4 bash3 ~ 14:57 2$




What is your shell?  PS1?


David

Re: grep: show matching line from pattern file

2022-06-02 Thread Will Mengarini

* Greg Wooledge  [22-05/28=Sa 17:11 -0400]:
> [...] 
> #!/usr/bin/perl
> use strict; use warnings;
> [...] 
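> open PATS, "<patterns.txt" || die [...]
> [...]

You need "or die", not "|| die", because of precedence: what you coded
checks whether "<patterns.txt" is logically true, whereas you
wanted to check whether the result of open() is logically true.

In this transcript, the number before the prompt-ending '$' is $?:

debian/pts/4 bash3 ~ 14:56 0$perl -e 'open "gweeblefleep" || die'
debian/pts/4 bash3 ~ 14:57 0$perl -e 'open "gweeblefleep" or die'
Died at -e line 1.
debian/pts/4 bash3 ~ 14:57 2$

perl -le"print unpack '%C*',MENGARINI"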

Re: grep: show matching line from pattern file

2022-06-02 Thread Greg Wooledge

On Thu, Jun 02, 2022 at 03:12:23PM -0400, duh wrote:

> > > Jim Popovitch wrote on 28/05/2022 21:40:
> > > > I have a file of regex patterns and I use grep like so:
> > > > 
> > > >  ~$ grep -f patterns.txt /var/log/syslog
> > > > 
> > > > What I'd like to get is a listing of all lines, specifically the line
> > > > numbers of the regexps in patterns.txt, that match entries in
> > > > /var/log/syslog.   Is there a way to do this?

> $cat -n /var/log/syslog | grep warn
> 
> and it found "warn" in the syslog file and provided line numbers. I have
> not used the -f option

You're getting the line numbers from the log file.  The OP wanted the line
numbers of the patterns in the -f pattern file.

Why?  I have no idea.  There is no standard option to do this, because
it's not a common requirement.  That's why I wrote one from scratch
in perl.

Re: grep: show matching line from pattern file

2022-06-02 Thread duh




On 5/29/22 9:44 AM, David Wright wrote:

On Sun 29 May 2022 at 15:02:35 (+0200), Jörg-Volker Peetz wrote:

Jim Popovitch wrote on 28/05/2022 21:40:

Not exactly Debian specific, but hoping that someone here can help.

I have a file of regex patterns and I use grep like so:

 ~$ grep -f patterns.txt /var/log/syslog

What I'd like to get is a listing of all lines, specifically the line
numbers of the regexps in patterns.txt, that match entries in
/var/log/syslog.   Is there a way to do this?

How about this:

$ grep -of patterns.txt /var/log/syslog.1 | grep -n -f - patterns.txt

That will only work for literal patterns, not regex ones.

Cheers,
David.



I may be missing a lot, but I will risk throwing in my 2 cents (which is
no longer worth the proverbial amount with the current inflation).

$ cat -n /var/log/syslog | grep warn

and it found "warn" in the syslog file and provided line numbers. I have
not used the -f option (but am now aware of it from your post -- thank you.
Might file that away for future use but will not press my luck at the
moment).

My approach saves me the effort of writing a script. Ahh, the Perl
philosophy of "there is more than one way to do it".

Re: grep: show matching line from pattern file

2022-05-29 Thread Jim Popovitch

On Sat, 2022-05-28 at 17:11 -0400, Greg Wooledge wrote:
> On Sat, May 28, 2022 at 04:02:39PM -0400, The Wanderer wrote:
> > On 2022-05-28 at 15:40, Jim Popovitch wrote:
> > > I have a file of regex patterns and I use grep like so:
> > > 
> > >    ~$ grep -f patterns.txt /var/log/syslog 
> > > 
> > > What I'd like to get is a listing of all lines, specifically the line
> > > numbers of the regexps in patterns.txt, that match entries in
> > > /var/log/syslog.   Is there a way to do this?
> > 
> > I don't know of a standardized way to do that (if anyone else wants to
> > suggest one, I'm open to learn), but of course it *can* be done, via
> > scripting. Off the top of my head, I came up with the following
> > 
> > for line in $(seq 1 $(wc -l patterns.txt | cut -d ' ' -f 1)) ; do
> >   if grep $(head -n $line patterns.txt | tail -n 1) /var/log/syslog > /dev/null ; then
> >     echo $line ;
> >   fi
> > done
> 
> The quoting here is... completely absent (and that's extremely bad), but
> also importantly, one would ideally like to avoid running grep a thousand
> times, especially if the target logfile is large.
> 
> I believe this is the kind of job for which perl is well-suited.  I'm not
> great at perl, but I'll give it a shot.
> 
> Here's a version with some extra information as output, so I can verify
> that it's doing something reasonably close to correct:
> 
> 
> #!/usr/bin/perl
> use strict; use warnings;
> 
> my @patlist;
> open PATS, "<patterns.txt" || die "[...]";
> chomp(@patlist = <PATS>);
> close PATS;
> 
> while (<>) {
>     chomp;
>     for (my $i = 0; $i <= $#patlist; $i++) {
>         print "$i|$patlist[$i]|$_\n" if /$patlist[$i]/;
>     }
> }
> 
> 
> Now, to test it, we need a patterns.txt file:
> 
> 
> unicorn:~$ cat patterns.txt 
> PATH
> HOME|~
> a...e
> 
> 
> And an input (log) file:
> 
> 
> unicorn:~$ cat file
> zebra
> Home, home on the range.
> Oops, I meant HOME on the range.
> 
> applesauce
> 
> 
> And here's what it does:
> 
> 
> unicorn:~$ ./foo file
> 1|HOME|~|Oops, I meant HOME on the range.
> 2|a...e|applesauce
> 
> 
> Pattern numbers 1 and 2 (the second and third, since it starts at 0) were
> matched, so we have a line for each of those.
> 
> If that's kinda what you wanted, then you can adjust this to do precisely
> what you wanted.  It shouldn't take a lot of work, I hope.  Well, I guess
> that depends on what you really want.
> 
> Bash is not well-suited to this task, and even if we were to take The
> Wanderer's script and fix all the issues in it, it would still be a
> vastly inferior solution.  Some tools are just not meant for some jobs.
> 

Thanks Greg, that is exactly what I needed, and double thanks for the
details in explaining it, etc. 

-Jim P.

Re: grep: show matching line from pattern file

2022-05-29 Thread David Wright

On Sun 29 May 2022 at 15:02:35 (+0200), Jörg-Volker Peetz wrote:
> Jim Popovitch wrote on 28/05/2022 21:40:
> > Not exactly Debian specific, but hoping that someone here can help.
> > 
> > I have a file of regex patterns and I use grep like so:
> > 
> > ~$ grep -f patterns.txt /var/log/syslog
> > 
> > What I'd like to get is a listing of all lines, specifically the line
> > numbers of the regexps in patterns.txt, that match entries in
> > /var/log/syslog.   Is there a way to do this?
> 
> How about this:
> 
> $ grep -of patterns.txt /var/log/syslog.1 | grep -n -f - patterns.txt

That will only work for literal patterns, not regex ones.

Cheers,
David.
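
A sketch that shows why: the first grep prints the matched text, and the
second grep then looks for that text in the pattern file, so a pattern
line is only found when the text it matched also appears in it verbatim.
The function name is hypothetical, and reading patterns from standard
input with "-f -" is assumed to work as it does in GNU grep.

```bash
#!/usr/bin/env bash
# $1 = pattern file, $2 = log file; same pipeline as suggested above.
literal_pattern_lines() {
    grep -o -f "$1" -- "$2" | grep -n -f - -- "$1"
}
```

With a pattern file holding "zebra" and "a...e" and a log holding "zebra"
and "applesauce", only "1:zebra" is reported: the second pattern matched
"apple", and "apple" does not occur in the line "a...e".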

Re: grep: show matching line from pattern file

2022-05-29 Thread Jörg-Volker Peetz


Jim Popovitch wrote on 28/05/2022 21:40:

Not exactly Debian specific, but hoping that someone here can help.

I have a file of regex patterns and I use grep like so:

~$ grep -f patterns.txt /var/log/syslog

What I'd like to get is a listing of all lines, specifically the line
numbers of the regexps in patterns.txt, that match entries in
/var/log/syslog.   Is there a way to do this?

-Jim P.


How about this:

$ grep -of patterns.txt /var/log/syslog.1 | grep -n -f - patterns.txt

Regards,
Jörg.

Re: grep: show matching line from pattern file

2022-05-28 Thread The Wanderer

On 2022-05-28 at 17:11, Greg Wooledge wrote:

> On Sat, May 28, 2022 at 04:02:39PM -0400, The Wanderer wrote:
>
>> On 2022-05-28 at 15:40, Jim Popovitch wrote:
>> > I have a file of regex patterns and I use grep like so:
>> > 
>> >    ~$ grep -f patterns.txt /var/log/syslog
>> > 
>> > What I'd like to get is a listing of all lines, specifically the line
>> > numbers of the regexps in patterns.txt, that match entries in
>> > /var/log/syslog.   Is there a way to do this?
>> 
>> I don't know of a standardized way to do that (if anyone else wants to
>> suggest one, I'm open to learn), but of course it *can* be done, via
>> scripting. Off the top of my head, I came up with the following
>> 
>> for line in $(seq 1 $(wc -l patterns.txt | cut -d ' ' -f 1)) ; do
>>   if grep $(head -n $line patterns.txt | tail -n 1) /var/log/syslog > /dev/null ; then
>>     echo $line ;
>>   fi
>> done
> 
> The quoting here is... completely absent (and that's extremely bad), but
> also importantly, one would ideally like to avoid running grep a thousand
> times, especially if the target logfile is large.

A brother of mine has schooled me on several things I did wrong here
already, and I am now going over my long-tail stockpile of scripts with
shellcheck, seeing what I can learn. (I normally do quote variables, but
for some reason it slipped my mind this time until after I'd already hit
Send. Some of these are known-safe values which won't need quoting, but
others aren't, so the principle remains valid to cite.)

This wasn't especially intended as a great solution, in any case; it
started out as me trying to write shell-like pseudocode, but then I
couldn't see any obvious reason why it wouldn't work, and when I tried
it out I only needed a few tweaks before it did. There's a *reason* I
had the "YMMV" on that message.

-- 
   The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man. -- George Bernard Shaw


Re: grep: show matching line from pattern file

2022-05-28 Thread Greg Wooledge

On Sat, May 28, 2022 at 04:02:39PM -0400, The Wanderer wrote:
> On 2022-05-28 at 15:40, Jim Popovitch wrote:
> > I have a file of regex patterns and I use grep like so:
> > 
> >    ~$ grep -f patterns.txt /var/log/syslog
> > 
> > What I'd like to get is a listing of all lines, specifically the line
> > numbers of the regexps in patterns.txt, that match entries in
> > /var/log/syslog.   Is there a way to do this?
> 
> I don't know of a standardized way to do that (if anyone else wants to
> suggest one, I'm open to learn), but of course it *can* be done, via
> scripting. Off the top of my head, I came up with the following
> 
> for line in $(seq 1 $(wc -l patterns.txt | cut -d ' ' -f 1)) ; do
>   if grep $(head -n $line patterns.txt | tail -n 1) /var/log/syslog > /dev/null ; then
>     echo $line ;
>   fi
> done

The quoting here is... completely absent (and that's extremely bad), but
also importantly, one would ideally like to avoid running grep a thousand
times, especially if the target logfile is large.

I believe this is the kind of job for which perl is well-suited.  I'm not
great at perl, but I'll give it a shot.

Here's a version with some extra information as output, so I can verify
that it's doing something reasonably close to correct:

#!/usr/bin/perl
use strict; use warnings;

my @patlist;
open PATS, "<patterns.txt" || die "[...]";
chomp(@patlist = <PATS>);
close PATS;

while (<>) {
    chomp;
    for (my $i = 0; $i <= $#patlist; $i++) {
        print "$i|$patlist[$i]|$_\n" if /$patlist[$i]/;
    }
}

Now, to test it, we need a patterns.txt file:

unicorn:~$ cat patterns.txt 
PATH
HOME|~
a...e

And an input (log) file:

unicorn:~$ cat file
zebra
Home, home on the range.
Oops, I meant HOME on the range.

applesauce

And here's what it does:

unicorn:~$ ./foo file
1|HOME|~|Oops, I meant HOME on the range.
2|a...e|applesauce

Pattern numbers 1 and 2 (the second and third, since it starts at 0) were
matched, so we have a line for each of those.

If that's kinda what you wanted, then you can adjust this to do precisely
what you wanted.  It shouldn't take a lot of work, I hope.  Well, I guess
that depends on what you really want.

Bash is not well-suited to this task, and even if we were to take The
Wanderer's script and fix all the issues in it, it would still be a
vastly inferior solution.  Some tools are just not meant for some jobs.

Re: grep: show matching line from pattern file

2022-05-28 Thread The Wanderer

On 2022-05-28 at 15:40, Jim Popovitch wrote:

> Not exactly Debian specific, but hoping that someone here can help.
> 
> I have a file of regex patterns and I use grep like so:
> 
>    ~$ grep -f patterns.txt /var/log/syslog
> 
> What I'd like to get is a listing of all lines, specifically the line
> numbers of the regexps in patterns.txt, that match entries in
> /var/log/syslog.   Is there a way to do this?

I don't know of a standardized way to do that (if anyone else wants to
suggest one, I'm open to learn), but of course it *can* be done, via
scripting. Off the top of my head, I came up with the following

for line in $(seq 1 $(wc -l patterns.txt | cut -d ' ' -f 1)) ; do
  if grep $(head -n $line patterns.txt | tail -n 1) /var/log/syslog > /dev/null ; then
    echo $line ;
  fi
done

I just tested that on my own system, with a different file (since I'm
not root right now) and a couple of exact-string patterns found by
examining that file, and it seems to work as intended. YMMV.

-- 
   The Wanderer

The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore all
progress depends on the unreasonable man. -- George Bernard Shaw


grep: show matching line from pattern file

2022-05-28 Thread Jim Popovitch

Not exactly Debian specific, but hoping that someone here can help.

I have a file of regex patterns and I use grep like so:

   ~$ grep -f patterns.txt /var/log/syslog 

What I'd like to get is a listing of all lines, specifically the line
numbers of the regexps in patterns.txt, that match entries in
/var/log/syslog.   Is there a way to do this?

-Jim P.

Re: sometimes i go huh (grep result)

2018-08-27 Thread songbird

Greg Wooledge wrote:
> On Mon, Aug 27, 2018 at 10:36:12AM -0400, songbird wrote:
>> me@ant(25)$ env | grep -F "-g"
>> grep: invalid option -- 'g'
>
> You want either -- or -e.
>
> grep -F -- -g
> grep -F -e -g

  i just found it interesting that after this many years
of linux/unix i'd not remembered this issue.  my golden
years are ahead a bit yet so i sure hope this isn't a
mental issue!  lol

  songbird

Re: sometimes i go huh (grep result)

2018-08-27 Thread Joe Pfeiffer

What's tripping you up is that some processing is being done by the
shell before grep ever sees your pattern.  Taking that into account,
what grep is seeing is:

songbird  writes:

> me@ant(25)$ env | grep -F "-g"
grep -F -g
> grep: invalid option -- 'g'
> Usage: grep [OPTION]... PATTERN [FILE]...
> Try 'grep --help' for more information.
> me@ant(26)$ env | grep -F '-g'
grep -F -g
> grep: invalid option -- 'g'
> Usage: grep [OPTION]... PATTERN [FILE]...
> Try 'grep --help' for more information.
> me@ant(27)$ env | grep -F 'CFL'
> CFLAGS=-g
grep -F CFL
> me@ant(28)$ env | grep '\-g'
> CFLAGS=-g
grep \-g
(I'll note that in this case you need the quotes or the shell would have
stripped the \ .  I'm guessing this one is doing what you want)
> me@ant(29)$ env | grep '-g'
grep -g
> grep: invalid option -- 'g'
> Usage: grep [OPTION]... PATTERN [FILE]...
> Try 'grep --help' for more information.
> me@ant(30)$ env | grep "-g"
grep -g
> grep: invalid option -- 'g'
> Usage: grep [OPTION]... PATTERN [FILE]...
> Try 'grep --help' for more information.
>
>
> songbird

Re: sometimes i go huh (grep result)

2018-08-27 Thread Roberto C . Sánchez

On Mon, Aug 27, 2018 at 11:26:12AM -0400, Greg Wooledge wrote:
> On Mon, Aug 27, 2018 at 11:20:42AM -0400, Roberto C. Sánchez wrote:
> > env |grep [-]g
> 
> Fails if there is a file named -g in the current directory, as that
> matches the unquoted glob and causes it to expand.  Also fails if failglob
> is turned on, whether the file exists or not (fails differently in the
> two cases).
> 
> Also fails if nullglob is turned on, but that is definitely not
> recommended in interactive shells.
> 
Quite right.  In my haste I forgot the quotes:

env |grep '[-]g'

-- 
Roberto C. Sánchez

Re: sometimes i go huh (grep result)

2018-08-27 Thread tomas

On Mon, Aug 27, 2018 at 11:19:25AM -0400, Greg Wooledge wrote:
> On Mon, Aug 27, 2018 at 10:36:12AM -0400, songbird wrote:
> > me@ant(25)$ env | grep -F "-g"
> > grep: invalid option -- 'g'
> 
> You want either -- or -e.
> 
> grep -F -- -g
> grep -F -e -g

More generally, '--' is convention for "end of option arguments,
normal arguments from here on". Most utilities nowadays stick to
that convention. It was introduced precisely for this case.

Note that quoting, as you do (i.e. "-g") can't work, because the
shell unwraps that level of quotes; grep will still see -g and
think it's an option. This quoting will help to "protect" whitespace:

  grep foo bar

will see two arguments, foo and bar, whereas

  grep "foo bar"

will see one, "foo bar".
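
A quick way to see what a command actually receives is to replace it with
a function that reports its arguments. Both helper names here are made up
for illustration.

```bash
#!/usr/bin/env bash
# Report how many arguments arrived after the shell removed the quotes.
count_args() {
    printf '%s\n' "$#"
}

# Report the first argument exactly as the command sees it.
first_arg() {
    printf '%s\n' "$1"
}
```

count_args foo bar reports 2 and count_args "foo bar" reports 1, while
first_arg "-g" and first_arg -g print the same thing, which is why quoting
alone cannot hide the dash from grep.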

Cheers
-- tomás

Re: sometimes i go huh (grep result)

2018-08-27 Thread Nicolas George

songbird (2018-08-27):
> me@ant(25)$ env | grep -F "-g"
> grep: invalid option -- 'g'

Maybe what you want is an explanation rather than just a solution.

Quotes are for the shell: they protect arguments that contain special
characters, so that commands get them as is.

For example, you need to write:

echo "Fire*Wolf"

because without the quotes, the shell would try to find all the files in
the current directory with a name that matches the pattern.

Since the dash is not special for the shell, the quotes are unnecessary.
They do no harm, but have no consequences here:

grep "-g"
grep '-g'
grep -g
grep ""''""-"g"''

all invoke grep with one extra argument "-g".

The dash is special for programs that understand options (some do not;
some do with a different syntax, for example key=value), and it needs to
be escaped the way the program expects. The usual escaping is that an
argument "--" means all following arguments are not options, even if
they start with a dash.

> me@ant(26)$ env | grep -F '-g'
> grep: invalid option -- 'g'

Same as above.

> me@ant(28)$ env | grep '\-g'
> CFLAGS=-g

grep sees the argument starting with a backslash, so it is not an option,
and therefore it is the regexp. But backslash-dash could have had a special
meaning, like backslash-parenthesis.

> me@ant(29)$ env | grep '-g'
> grep: invalid option -- 'g'

Same as above.

> me@ant(30)$ env | grep "-g"
> grep: invalid option -- 'g'

Same as above.

Regards,

-- 
  Nicolas George



Re: sometimes i go huh (grep result)

2018-08-27 Thread Greg Wooledge

On Mon, Aug 27, 2018 at 11:20:42AM -0400, Roberto C. Sánchez wrote:
> env |grep [-]g

Fails if there is a file named -g in the current directory, as that
matches the unquoted glob and causes it to expand.  Also fails if failglob
is turned on, whether the file exists or not (fails differently in the
two cases).

Also fails if nullglob is turned on, but that is definitely not
recommended in interactive shells.
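
A sketch that reproduces the first failure mode in a scratch directory.
The function name is made up, and nullglob and failglob are switched off
explicitly inside the subshell so that Bash's default behaviour is what
gets shown.

```bash
#!/usr/bin/env bash
# Show what the unquoted word [-]g turns into before and after a file
# named "-g" exists in the current directory.
glob_demo() {
    local dir
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        shopt -u nullglob failglob        # make sure the defaults are in effect
        printf 'no file: %s\n' [-]g       # nothing matches: the glob is passed along as typed
        : > ./-g                          # create a file literally named -g
        printf 'with file: %s\n' [-]g     # now the glob expands to the filename
    )
    rm -rf -- "$dir"
}
```

In the second case the command receives -g rather than [-]g, so grep would
again treat it as an option.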

Re: sometimes i go huh (grep result)

2018-08-27 Thread Roberto C . Sánchez

On Mon, Aug 27, 2018 at 10:36:12AM -0400, songbird wrote:
> me@ant(25)$ env | grep -F "-g"
> grep: invalid option -- 'g'
> Usage: grep [OPTION]... PATTERN [FILE]...
> Try 'grep --help' for more information.
> me@ant(26)$ env | grep -F '-g'
> grep: invalid option -- 'g'
> Usage: grep [OPTION]... PATTERN [FILE]...
> Try 'grep --help' for more information.
> me@ant(27)$ env | grep -F 'CFL'
> CFLAGS=-g
> me@ant(28)$ env | grep '\-g'
> CFLAGS=-g
> me@ant(29)$ env | grep '-g'
> grep: invalid option -- 'g'
> Usage: grep [OPTION]... PATTERN [FILE]...
> Try 'grep --help' for more information.
> me@ant(30)$ env | grep "-g"
> grep: invalid option -- 'g'
> Usage: grep [OPTION]... PATTERN [FILE]...
> Try 'grep --help' for more information.
> 
> 
try this:

env |grep [-]g

Regards,

-Roberto

-- 
Roberto C. Sánchez

Re: sometimes i go huh (grep result)

2018-08-27 Thread Greg Wooledge

On Mon, Aug 27, 2018 at 10:36:12AM -0400, songbird wrote:
> me@ant(25)$ env | grep -F "-g"
> grep: invalid option -- 'g'

You want either -- or -e.

grep -F -- -g
grep -F -e -g
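
Both forms can be tried without touching the real environment. A minimal sketch, assuming made-up sample input in place of the output of env:

```shell
# Made-up input standing in for the output of env.
sample='CFLAGS=-g
HOME=/home/me'

# -- ends option parsing, so -g is taken as the pattern.
with_dashes=$(printf '%s\n' "$sample" | grep -F -- -g)

# -e says explicitly that the next argument is the pattern.
with_e=$(printf '%s\n' "$sample" | grep -F -e -g)

echo "$with_dashes"   # CFLAGS=-g
echo "$with_e"        # CFLAGS=-g
```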

sometimes i go huh (grep result)

2018-08-27 Thread songbird

me@ant(25)$ env | grep -F "-g"
grep: invalid option -- 'g'
Usage: grep [OPTION]... PATTERN [FILE]...
Try 'grep --help' for more information.
me@ant(26)$ env | grep -F '-g'
grep: invalid option -- 'g'
Usage: grep [OPTION]... PATTERN [FILE]...
Try 'grep --help' for more information.
me@ant(27)$ env | grep -F 'CFL'
CFLAGS=-g
me@ant(28)$ env | grep '\-g'
CFLAGS=-g
me@ant(29)$ env | grep '-g'
grep: invalid option -- 'g'
Usage: grep [OPTION]... PATTERN [FILE]...
Try 'grep --help' for more information.
me@ant(30)$ env | grep "-g"
grep: invalid option -- 'g'
Usage: grep [OPTION]... PATTERN [FILE]...
Try 'grep --help' for more information.


songbird

Re: Little question grep

2017-01-08 Thread John L. Ries

That strikes me as being just a touch too complex for grep.  It may well
be doable, but you'll probably have an easier time using AWK (possibly not
what you wanted to hear, but it's well worth learning).  The object of the
game would be to count the number of signs on each line and print only
those with the specified number.  Then in a shell script (my preferred
poison is ksh) set up a loop like so:

#Not guaranteed to be syntactically correct
for ((i=5; i<=10; i++)); do
  awk -f ctsign.awk N="$i" biglist.txt >"${i}signs.txt"
done

Or one can use such scripting languages as Perl or Python to do the whole
job.
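
ctsign.awk itself is not shown in this message. A minimal sketch of the same idea, assuming the script only has to keep lines of exactly N characters; the inline awk program and the sample words are made up for the illustration:

```shell
# Made-up word list; the real one would be biglist.txt.
words='abcde
ab^]e
abcdef
abcd'

# Keep only the lines whose length equals n.
five=$(printf '%s\n' "$words" | awk -v n=5 'length($0) == n')
six=$(printf '%s\n' "$words" | awk -v n=6 'length($0) == n')

echo "$five"   # abcde and ab^]e, one per line
echo "$six"    # abcdef
```

Because length() does not care which characters are on the line, signs like "^" and "]" need no special handling.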

Hope it helps...

--|
John L. Ries  |
Salford Systems   |
Phone: (619)543-8880 x107 |
or (435)867-8885  |
--|

On Sun, 8 Jan 2017, Hans wrote:

> Hi all,
>
> I have a little problem with using grep.
>
> The problem:
>
> I have a wordlist with 3.5 million words in ASCII. Now I want to filter out
> all words with 5, 6, 7, 8, 9 and 10 signs into separate lists. The wordlist
> contains all sorts of signs: alphanumeric ones, control signs like "^", "]"
> and others.
>
> So it must work the same whatever sign grep reads. I found this:
>
> grep -o -w -E '^[[:alnum:]]{5}' file1
>
> But it looks like it is only grepping text. I read the manual of grep, and I
> see there are more options to choose from. But I did not completely
> understand whether I have to add every option or whether there is one option
> which covers every kind of sign.
>
> It would be nice if someone could make this a little bit clearer for me.
>
> Thank you for any hints.
>
> Best regards
>
> Hans

Re: Little question grep

2017-01-08 Thread Reco

On Sun, 8 Jan 2017 12:40:34 +0300
Reco  wrote:

> or, if you need whole words (i.e. need to exclude spaces):
> 
> egrep '^[^ ]$' file1

Self-edit. Of course it's:

egrep '^[^ ]{5}$' file1
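
Extended to all the requested lengths, this could look roughly like the loop below. It is a sketch only: the scratch directory, the sample words and the len*.txt output names are made up, and grep -E is used as the equivalent of egrep.

```shell
scratch=$(mktemp -d)
printf '%s\n' 'abcde' 'ab^]e' 'abcdef' 'abcd' 'ab de' > "$scratch/file1"

for n in 5 6 7 8 9 10; do
  # ^ and $ anchor the match, so the whole line must be exactly n non-space characters.
  # grep exits non-zero when nothing matches; that is not an error here.
  grep -E "^[^ ]{$n}\$" "$scratch/file1" > "$scratch/len$n.txt" || true
done

len5=$(cat "$scratch/len5.txt")
len6=$(cat "$scratch/len6.txt")
echo "$len5"   # abcde and ab^]e, one per line
echo "$len6"   # abcdef

rm -rf "$scratch"
```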

Reco

Re: Little question grep

2017-01-08 Thread Reco

Hi.

On Sun, 08 Jan 2017 10:11:26 +0100
Hans  wrote:

> Hi all,
>
> I have a little problem with using grep.
>
> The problem:
>
> I have a wordlist with 3.5 million words in ASCII. Now I want to filter out
> all words with 5, 6, 7, 8, 9 and 10 signs into separate lists. The wordlist
> contains all sorts of signs: alphanumeric ones, control signs like "^", "]"
> and others.
> So it must work the same whatever sign grep reads. I found this:
>
> grep -o -w -E '^[[:alnum:]]{5}' file1
>
>
> But it looks like it is only grepping text. I read the manual of grep, and I
> see there are more options to choose from. But I did not completely
> understand whether I have to add every option or whether there is one option
> which covers every kind of sign.

As it should be. regex(7) specifies that character classes are defined
in wctype(3), which states that '[[:alnum:]]' merely implements
isalnum(3), which in turn is defined as (isalpha(c) || isdigit(c)).

So what you really need, for exactly five characters, is (note the final '$'):

egrep '^.{5}$' file1

or, if you need whole words (i.e. need to exclude spaces):

egrep '^[^ ]$' file1

Reco

Little question grep

2017-01-08 Thread Hans

Hi all,

I have a little problem with using grep.

The problem:

I have a wordlist with 3.5 million words in ASCII. Now I want to filter out
all words with 5, 6, 7, 8, 9 and 10 signs into separate lists. The wordlist
contains all sorts of signs: alphanumeric ones, control signs like "^", "]"
and others.
So it must work the same whatever sign grep reads. I found this:

grep -o -w -E '^[[:alnum:]]{5}' file1


But it looks like it is only grepping text. I read the manual of grep, and I
see there are more options to choose from. But I did not completely
understand whether I have to add every option or whether there is one option
which covers every kind of sign.

It would be nice if someone could make this a little bit clearer for me.

Thank you for any hints.

Best regards

Hans

Re: When do I use perl, awk, sed, grep, cut etc.. Was[OT] get all devices from a vendor from pci.ids

2017-01-06 Thread Javier Barroso

Hello,

On Fri, Jan 6, 2017 at 10:46 PM, Floris  wrote:
>>> So every Debian user has the perl command?
>>
>> Not only Debian users, the vast majority of linux / unix users have
>> perl installed (maybe now that android is here, this statement is not
>> true any more ...
>>
>> With awk:
>> awk -v vendor=0e11 'p == 1 && /^[^[:space:]]/ { p=0; } $0 ~ "^"vendor" " {p=1;} p' /usr/share/misc/pci.ids
>>
>> With sed:
>> sed -ne '/^0e11/p' -e '/^0e11/,/^[^[:space:]]/ { /^[^[:space:]]/d ; p }' /usr/share/misc/pci.ids
>>
>> Wrapping to script which get an argument is easy
>>
>> Regards
>>
>
> Thanks for your answer, and the next question pops in.
>
> Is there a "rule" when I use perl, awk, sed, grep, cut etc. in a script?
> Often I find multiple solutions on the Internet to achieve the same output.
>
> A simple example:
>
> cut -d: -f2
> or
> awk -F: '{ print $2 }'
>
> Is one better than the other (CPU/RAM usage), or is it just "taste"?

I'm pretty sure that there are tools to help measure the RAM / CPU used
by a shell script and then present the results (max / min / mean ...), but
I don't know of any (maybe memusg, from a quick Google search). I
would limit memory through ulimit -m to the memory that I would have
available to run the script.

If a tool will not be executed thousands of times, it is irrelevant whether
it uses cut / awk or another basic shell command. Why would you spend
more time optimizing a script if it won't run many times?
Even if it is executed by a cron task, it is not important whether it
takes one second or ten seconds. Of course, if it is a public script it
matters: you don't know how many times it will be executed, and the
less time the script takes, the less time future users lose ...
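
For the cut / awk example from the question, the two commands can be checked side by side. A minimal sketch, assuming a made-up colon-separated sample line:

```shell
# Made-up sample line in /etc/passwd style.
line='ykhan:x:19000:19000::/home/ykhan:/bin/bash'

with_cut=$(printf '%s\n' "$line" | cut -d: -f2)
with_awk=$(printf '%s\n' "$line" | awk -F: '{ print $2 }')

echo "$with_cut"   # x
echo "$with_awk"   # x
```

Both print the second field, so for a single run like this the choice comes down to which one you find easier to read.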

For a simple CPU benchmark, sorted by real and user time (slowest first):

$ time for i in {1..1000} ; do ./test_awk > /dev/null ; done

real    0m33,671s
user    0m28,384s
sys     0m1,228s
$ time for i in {1..1000} ; do ./test_bash > /dev/null ; done

real    0m29,404s
user    0m26,708s
sys     0m1,344s
$ time for i in {1..1000} ; do ./test_sed > /dev/null ; done

real    0m16,765s
user    0m12,452s
sys     0m0,820s
$ time for i in {1..1000} ; do ./test_perl > /dev/null ; done

real    0m15,932s
user    0m11,344s
sys     0m0,564s

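# test_bash:

#!/bin/bash
# Print the pci.ids block for one vendor ID (default 0e11).
id="${1:-0e11}"
p=0
# IFS= and -r keep each line intact (leading tabs, backslashes).
while IFS= read -r line
do
    # Vendor header line: print it and start printing its devices.
    if [[ "$line" =~ ^$id[[:space:]] ]]
    then
        echo "$line"
        p=1
        continue
    fi
    # Indented lines after the header belong to the vendor.
    if [ "$p" = 1 ] && [[ "$line" =~ ^[[:space:]] ]]
    then
        echo "$line"
    fi
    # The next non-indented line ends the block.
    if [[ "$line" =~ ^[^[:space:]] ]] && [ "$p" = 1 ]
    then
        exit 0
    fi
done < /usr/share/misc/pci.ids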

Regards,

When do I use perl, awk, sed, grep, cut etc.. Was[OT] get all devices from a vendor from pci.ids

2017-01-06 Thread Floris


>> So every Debian user has the perl command?
>
> Not only Debian users, the vast majority of linux / unix users have
> perl installed (maybe now that android is here, this statement is not
> true any more ...
>
> With awk:
> awk -v vendor=0e11 'p == 1 && /^[^[:space:]]/ { p=0; } $0 ~ "^"vendor" " {p=1;} p' /usr/share/misc/pci.ids
>
> With sed:
> sed -ne '/^0e11/p' -e '/^0e11/,/^[^[:space:]]/ { /^[^[:space:]]/d ; p }' /usr/share/misc/pci.ids
>
> Wrapping to script which get an argument is easy
>
> Regards



Thanks for your answer, and the next question pops in.

Is there a "rule" when I use perl, awk, sed, grep, cut etc. in a script?
Often I find multiple solutions on the Internet to achieve the same output.

A simple example:

cut -d: -f2
or
awk -F: '{ print $2 }'

Is one better than the other (CPU/RAM usage), or is it just "taste"?

Floris

Re: Charsets v grep

2015-11-15 Thread Martin Strömberg

R. Clayton wrote:
> and I've been getting a lot of this lately:

>   $ grep ^Subject: cbtm 
>   Binary file cbtm matches

> whereas before (a month or so ago) I used to get actual matches on std-out.
> It's easy enough to work around like so

>   $ sed -n -e '/^Subject:/p' < cbtm 
>   Subject: Re: PTFACULTY: FTFACULTY: When saying "Nous sommes Paris" is not
>   Subject: FTFACULTY: When saying "Nous sommes Paris" is not enough
>   Subject: Lowered Reserve Prices

> but I'd like to grep working like it used to.  What is the way for me to get
> grep back?  Some other points that may be useful:

>   $ file cbtm 
>   cbtm: ISO-8859 text, with very long lines

>   $ ba env | grep -i utf
>   LANG=en_US.UTF-8
>   XTERM_LOCALE=en_US.UTF-8

Trying "grep -a ..." might work. Or "LANG=C grep ...".
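
Both suggestions can be tried on a small test file. This is a sketch under assumptions: the scratch file stands in for cbtm, and LC_ALL=C is used instead of LANG=C because LC_ALL overrides LANG when both are set. How a given grep version reacts without these options depends on its build and on the locale.

```shell
scratch=$(mktemp -d)
# \351 is the ISO-8859-1 byte for e-acute, which is not valid UTF-8 on its own.
printf 'Subject: caf\351\nBody text\n' > "$scratch/cbtm"

# -a makes grep treat the file as text even if it looks binary.
as_text=$(grep -a '^Subject:' "$scratch/cbtm")

# In the C locale there is no multibyte encoding for the byte to violate.
c_locale=$(LC_ALL=C grep '^Subject:' "$scratch/cbtm")

# Print only the first word, to keep the raw byte off the terminal.
echo "${as_text%% *}"   # Subject:

rm -rf "$scratch"
```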


-- 
MartinS

Charsets v grep

2015-11-15 Thread R. Clayton

I'm running this

  $ bash --version
  GNU bash, version 4.3.42(1)-release (i586-pc-linux-gnu)
  Copyright (C) 2013 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

  This is free software; you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.

  $

on this

  $ lsb_release -a
  No LSB modules are available.
  Distributor ID:   Debian
  Description:  Debian GNU/Linux testing-updates (sid)
  Release:  testing-updates
  Codename: sid

  $

and I've been getting a lot of this lately:

  $ grep ^Subject: cbtm 
  Binary file cbtm matches

  $

whereas before (a month or so ago) I used to get actual matches on std-out.
It's easy enough to work around like so

  $ sed -n -e '/^Subject:/p' < cbtm 
  Subject: Re: PTFACULTY: FTFACULTY: When saying "Nous sommes Paris" is not
  Subject: FTFACULTY: When saying "Nous sommes Paris" is not enough
  Subject: Lowered Reserve Prices

  $

but I'd like to get grep working like it used to.  What is the way for me to get
grep back?  Some other points that may be useful:

  $ file cbtm 
  cbtm: ISO-8859 text, with very long lines

  $ ba env | grep -i utf
  LANG=en_US.UTF-8
  XTERM_LOCALE=en_US.UTF-8

  $

Re: Usage of grep - Was: no .bash_hostory file was found in user home folder

2013-10-20 Thread Ralf Mardorf

On Sun, 2013-10-20 at 03:13 +0200, Markus Falb wrote:
> I find myself doing this on occasion.
> Sometimes it seems quicker to add the pipe to the previous command than to 
> modify the whole thing.

I agree with this! _But_ unfortunately many people are not aware of how to
use some commands "better", and it can matter. I guess I used ls together
with grep too often, instead of ls alone, until somebody mentioned
it. I think such "mistakes" should be pointed out on open mailing lists.
The next time somebody googles for grep, s/he shouldn't find a poorer
example in the original thread, but my hint in this thread.

-- 
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org 
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1382255934.705.48.camel@archlinux

Re: Usage of grep - Was: no .bash_hostory file was found in user home folder

2013-10-19 Thread Markus Falb

On 20 Oct 2013, at 00:51, Ralf Mardorf wrote:

> 
> 
> On Sun, 2013-10-20 at 03:42 +0500, Muhammad Yousuf Khan wrote:
>>cat /etc/passwd | grep ykhan
>>ykhan:x:19000:19000:ykhan,,,:/home/ykhan:/bin/bash
> 
> [rocketmouse@archlinux ~]$ grep rocketmouse /etc/passwd
> rocketmouse:x:1000:1000::/home/rocketmouse:/bin/bash
> 
> IOW if you use grep, then you don't need to use cat first.

I find myself doing this on occasion.
Sometimes it seems quicker to add the pipe to the previous command than to 
modify the whole thing.

Knowing your shell's command line shortcuts helps with that, at least maybe.
I recommend the emacs tutorial (bash command line navigation is emacs-ish by
default).

-- 
Markus


Usage of grep - Was: no .bash_hostory file was found in user home folder

2013-10-19 Thread Ralf Mardorf



On Sun, 2013-10-20 at 03:42 +0500, Muhammad Yousuf Khan wrote:
> cat /etc/passwd | grep ykhan
> ykhan:x:19000:19000:ykhan,,,:/home/ykhan:/bin/bash

[rocketmouse@archlinux ~]$ grep rocketmouse /etc/passwd
rocketmouse:x:1000:1000::/home/rocketmouse:/bin/bash

IOW if you use grep, then you don't need to use cat first.




Re: apt-cache --names-only search apm | grep sleepd. Is there a bug?

2011-05-05 Thread Camaleón

On Thu, 05 May 2011 05:10:53 -0700, Regid Ichira wrote:

> $ apt-cache --names-only search apm | grep sleepd
> sleepd - puts an inactive or low battery laptop to sleep
> 
> Am I right that, according to man apt-cache, mentioning sleepd is a bug?

(...)

Yep, well... kind of. Already reported ;-)

"apt-cache search --names-only" also searchs for "Provides", man page 
should be updated
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=618017

Greetings,

-- 
Camaleón



apt-cache --names-only search apm | grep sleepd. Is there a bug?

2011-05-05 Thread Regid Ichira

$ apt-cache --names-only search apm | grep sleepd
sleepd - puts an inactive or low battery laptop to sleep

Am I right that, according to man apt-cache, mentioning sleepd is a bug?

$ man apt-cache | grep -A20 ' search regex' | head
   search regex [ regex ... ]
   search performs a full text search on all available package
   lists for the POSIX regex pattern given, see regex(7). It
   searches the package names and the descriptions for an
   occurrence of the regular expression and prints out the package
   name and the short description, including virtual package names.
   If --full is given then output identical to show is produced for
   each matched package, and if --names-only is given then the long
   description is not searched, only the package name is.

$ dpkg -S /usr/bin/apt-cache
apt: /usr/bin/apt-cache
$ dpkg -l apt
ii  apt            0.8.14.1       Advanced front-end for dpkg



Re: Off topic question about grep

2010-11-09 Thread ~Stack~

On 11/09/2010 12:26 PM, Bob McGowan wrote:
> On 11/09/2010 06:00 AM, Jochen Schulz wrote:
...
>> What was your exact command line? Did you quote the regular expression?
>> My guess is that the shell interpreted the '*' character for you and you
>> ended up with a command line like this:
>>
>> $ grep [_a-zA-Z][_a-zA-Z0-9]file1 file2 file3
...

> The shell will expand the above into space separated values, based on
> matches to the glob pattern.  The first match will become the pattern
> used by grep, searched for in the remaining file names.  Try this:
> 
>   echo grep [_a-zA-Z][_a-zA-Z0-9]*
> 
> to see what the shell does in any particular case.

Yeah. I feel really silly now.

I was so focused on getting the regular expression right that I
completely forgot to consider the shell interpreting things on my
behalf. Couldn't see the forest because of the tree in my way I guess.

Thanks! I do appreciate it.




Re: Off topic question about grep

2010-11-09 Thread Bob McGowan

On 11/09/2010 06:00 AM, Jochen Schulz wrote:
> ~Stack~:
>>
>> But that would match against 9_asD which begins with a number (not what
>> I wanted). So I tried:
>> [_a-zA-Z][_a-zA-Z0-9]*
>>
>> I realize that the expression won't do what I mistakenly thought I
>> wanted it to do. What is puzzling to me is that my hard disk usage
>> peaked, my cpu jumped, and grep took almost two minutes to return an
>> exit code of 1 (no match). :-/
> 
> What was your exact command line? Did you quote the regular expression?
> My guess is that the shell interpreted the '*' character for you and you
> ended up with a command line like this:
> 
> $ grep [_a-zA-Z][_a-zA-Z0-9]file1 file2 file3
> 
> where file1 etc. are the files in your current directory. That's why
> grep took so long to finish and it didn't find anything because file1 is
> part of your regexp.
> 
> J.

To be pedantically correct ;)

grep [_a-zA-Z][_a-zA-Z0-9]*

The shell will expand the above into space separated values, based on
matches to the glob pattern.  The first match will become the pattern
used by grep, searched for in the remaining file names.  Try this:

  echo grep [_a-zA-Z][_a-zA-Z0-9]*

to see what the shell does in any particular case.  For example, I got:

  grep 00firefox-files_before 01cache.list ... xxyy

The ... in the above is 57 files.  58 files counting the xxyy were
searched for the "pattern" 00firefox-files_before, which is actually a
file name and so not likely to be found in any of the files searched.
If you want to prove this, try:

  ls > files
  grep *
  files:00firefox-files_before
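
The same effect can be reproduced in a scratch directory. A minimal sketch, assuming nullglob and failglob are off; the directory and the two file names are made up, with test.txt holding the three example words from the original question:

```shell
scratch=$(mktemp -d)
printf 'Asdf1\n_aSD1\n9_asD\n' > "$scratch/test.txt"
: > "$scratch/aaa.log"

# Unquoted: the shell replaces the glob with matching file names before grep runs.
unquoted=$(cd "$scratch" && echo grep [_a-zA-Z][_a-zA-Z0-9]*)

# Quoted: grep receives the regular expression itself.
quoted=$(cd "$scratch" && grep -c '^[_a-zA-Z][_a-zA-Z0-9]*$' test.txt)

echo "$unquoted"   # grep aaa.log test.txt
echo "$quoted"     # 2

rm -rf "$scratch"
```

In the unquoted case grep would search test.txt for the "pattern" aaa.log, which is the situation described above.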

-- 
Bob McGowan


Re: Off topic question about grep

2010-11-09 Thread Paul E Condon

On 20101109_071001, ~Stack~ wrote:
> Hello everyone!
> 
> I ran into a strange issue with grep and I was hoping someone could
> explain what I feel is an oddity.
> 
> I was trying to match a word that starts with either a _ or a letter
> followed by any number of _, letters, or numbers. (eg: Good = Asdf1,
> _aSD1. Bad: 9_asD ). My test text file is just those three examples,
> each on a new line.
> 
> I first tested with this:
> [_a-zA-Z][_a-zA-Z0-9]
> 
> But that would match against 9_asD which begins with a number (not what
> I wanted). So I tried:
> [_a-zA-Z][_a-zA-Z0-9]*
> 
> I realize that the expression won't do what I mistakenly thought I
> wanted it to do. What is puzzling to me is that my hard disk usage
> peaked, my cpu jumped, and grep took almost two minutes to return an
> exit code of 1 (no match). :-/
> 
> At first I thought it may be an issue with Debian Squeeze (current box)
> so I tried it on Debian Lenny with similar results. Same for an Ubuntu
> Lucid and Fedora 10. So I am pretty sure it is something with grep and
> not just the version of grep.
> 
> I was hoping someone might know why grep behaves so oddly with that
> expression. If it was a monster file or something I could understand
> the system utilization peak, but it is just three lines in a text file.
> 
> Just so you know, I have a working solution. In my case, every instance
> is on a new line so I have a working expression using:
> ^[_a-zA-Z][_a-zA-Z0-9]*$

This last expression anchors the expression to the beginning of a line.
To anchor an expression to the beginning of a word you need:

\<[_a-zA-Z][_a-zA-Z0-9]*$

but this will only work if you agree with the implementers of grep as to what
it is that defines the beginning of a word. What is your definition?

Look in 'man grep' for clues as to where you can find the grep
implementers' official definition. I found '\<' in 'man grep' under
'The Backslash Character and Special Expressions'
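
A quick way to see what grep counts as the beginning of a word is to feed it a few test lines. This sketch assumes GNU grep, where '\<' is supported; the fourth line is made up to show a match that starts in the middle of a line.

```shell
# The first three lines are the examples from the original question.
lines='Asdf1
_aSD1
9_asD
foo bar9'

matches=$(printf '%s\n' "$lines" | grep '\<[_a-zA-Z][_a-zA-Z0-9]*$')
echo "$matches"   # Asdf1, _aSD1 and foo bar9, one per line
```

9_asD is rejected because digits and underscores are word characters, so the only word start on that line is in front of the 9.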

HTH
-- 
Paul E Condon   
pecon...@mesanetworks.net



Re: Off topic question about grep

2010-11-09 Thread Jochen Schulz

~Stack~:
> 
> But that would match against 9_asD which begins with a number (not what
> I wanted). So I tried:
> [_a-zA-Z][_a-zA-Z0-9]*
>
> I realize that the expression won't do what I mistakenly thought I
> wanted it to do. What is puzzling to me is that my hard disk usage
> peaked, my cpu jumped, and grep took almost two minutes to return an
> exit code of 1 (no match). :-/

What was your exact command line? Did you quote the regular expression?
My guess is that the shell interpreted the '*' character for you and you
ended up with a command line like this:

$ grep [_a-zA-Z][_a-zA-Z0-9]file1 file2 file3

where file1 etc. are the files in your current directory. That's why
grep took so long to finish and it didn't find anything because file1 is
part of your regexp.

J.
-- 
Scientists know what they are talking about.
[Agree]   [Disagree]
 <http://www.slowlydownward.com/NODATA/data_enter2.html>



Off topic question about grep

2010-11-09 Thread ~Stack~

Hello everyone!

I ran into a strange issue with grep and I was hoping someone could
explain what I feel is an oddity.

I was trying to match a word that starts with either a _ or a letter
followed by any number of _, letters, or numbers. (eg: Good = Asdf1,
_aSD1. Bad: 9_asD ). My test text file is just those three examples,
each on a new line.

I first tested with this:
[_a-zA-Z][_a-zA-Z0-9]

But that would match against 9_asD which begins with a number (not what
I wanted). So I tried:
[_a-zA-Z][_a-zA-Z0-9]*

I realize that the expression won't do what I mistakenly thought I
wanted it to do. What is puzzling to me is that my hard disk usage
peaked, my cpu jumped, and grep took almost two minutes to return an
exit code of 1 (no match). :-/

At first I thought it may be an issue with Debian Squeeze (current box)
so I tried it on Debian Lenny with similar results. Same for an Ubuntu
Lucid and Fedora 10. So I am pretty sure it is something with grep and
not just the version of grep.

I was hoping someone might know why grep behaves so oddly with that
expression. If it was a monster file or something I could understand
the system utilization peak, but it is just three lines in a text file.

Just so you know, I have a working solution. In my case, every instance
is on a new line so I have a working expression using:
^[_a-zA-Z][_a-zA-Z0-9]*$

I am just curious about the odd behavior.

Thanks!




Re: searching inside files with find, cat and grep as a oneliner ...

2010-09-18 Thread Alexander Batischev

On Fri, Sep 17, 2010 at 11:07:53PM -0600, Bob Proulx wrote:
> Albretch Mueller wrote:

> But newer POSIX standard find can use a {} + to launch grep once and
> to pass as many files on the command line as the system allows.  That
> is faster since grep is launched only as many times as needed.
> Usually only once.
> 
>   $ find -name '*.extension' -exec grep -H 'pattern' {} +

Wow! I thought that such things can be done only by xargs. Thank you very much!

-- 
Regards,
Alexander Batischev

1024D/69093C81
F870 A381 B5F5 D2A1 1B35  4D63 A1A7 1C77 6909 3C81



Re: searching inside files with find, cat and grep as a oneliner ...

2010-09-17 Thread Bob Proulx

Albretch Mueller wrote:
> $ find -name '*.extension' -exec grep -H 'pattern' {} \;

Using {} \; is the old way.  That invokes grep once per file.  That
works but is slower and less efficient than it could be because it
takes a little bit of time to launch grep.

But newer POSIX standard find can use a {} + to launch grep once and
to pass as many files on the command line as the system allows.  That
is faster since grep is launched only as many times as needed.
Usually only once.

  $ find -name '*.extension' -exec grep -H 'pattern' {} +

As a benefit you don't need to escape the + since it isn't a special
character to the shell.
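
The two forms can be compared on a few test files. A minimal sketch, assuming a made-up scratch directory; the output is sorted because find does not promise any particular order.

```shell
scratch=$(mktemp -d)
printf 'needle\n' > "$scratch/a.txt"
printf 'hay\n'    > "$scratch/b.txt"
printf 'needle\n' > "$scratch/c.log"

# One grep process per file.
per_file=$(find "$scratch" -name '*.txt' -exec grep -H 'needle' {} \; | sort)

# As few grep processes as possible, each given many file names.
batched=$(find "$scratch" -name '*.txt' -exec grep -H 'needle' {} + | sort)

echo "$per_file"
echo "$batched"

rm -rf "$scratch"
```

Both variables end up holding the same single line, the path of a.txt followed by ":needle".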

Bob



Re: searching inside files with find, cat and grep as a oneliner ...

2010-09-17 Thread Stephen Powell

On Fri, 17 Sep 2010 14:08:18 -0400 (EDT), Albretch Mueller wrote:
> 
>  I need to:
>~ search for files using a pattern (say all files with a certain extension)
>~ then search inside each of the found files for a word or regexp pattern
>~ You could do this using find, cat and grep in a script, but I was
>  wondering about how could you do it with a oneliner

Search all files under the home directory (recursively) with an extension
of .txt for the keyword "xorg" (--include is a GNU grep option):

grep -r --include='*.txt' xorg ~

-- 
  .''`. Stephen Powell
 : :'  :
 `. `'`
   `-



Re: searching inside files with find, cat and grep as a oneliner ...

2010-09-17 Thread Alexander Batischev

On Fri, Sep 17, 2010 at 06:32:06PM +, Albretch Mueller wrote:
> > if you need certain extension, you don't even need cat and find, it's all 
> > about grep:
> 
> > $ grep 'string' *.extension
> ~
>  The thing is that I need to know in which file the pattern was found
> and as you guys suggested:
> ~
> $ find -name '*.extension' -exec grep -H 'pattern' {} \;
> ~
>  does it

Um... Well, a single grep will show you the filename as well, because (quoting
the manpage):

-H, --with-filename
Print the file name for each match.  This is the default when there is more
than one file to search.

-- 
Regards,
Alexander Batischev

1024D/69093C81
F870 A381 B5F5 D2A1 1B35  4D63 A1A7 1C77 6909 3C81



Re: searching inside files with find, cat and grep as a oneliner ...

2010-09-17 Thread Joe Brenner


Albretch Mueller  wrote:

>  I need to:
> ~
>  search for files using a pattern (say all files with a certain extension)
> ~
>  then search inside each of the found files for a word or regexp pattern
> ~
>  You could do this using find, cat and grep in a script, but I was
> wondering about how could you do it with a oneliner

http://www.athabascau.ca/html/depts/compserv/webunit/HOWTO/find.htm
Look down to "How to find a string in a selection of files".

http://en.wikipedia.org/wiki/Find
Look down to "Search for a string"

But the newer style is to pipe find into xargs:

http://blog.endpoint.com/2010/07/efficiency-of-find-exec-vs-find-xargs.html
http://www.sunmanagers.org/pipermail/summaries/2005-March/006255.html
http://www.unix.com/unix-dummies-questions-answers/19217-difference-between-xargs-exec.html
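
The find-into-xargs style might look like the sketch below. It assumes a find and xargs that support -print0 and -0 (GNU and BSD versions do); the scratch directory and file names are made up.

```shell
scratch=$(mktemp -d)
printf 'needle\n' > "$scratch/with space.txt"
printf 'hay\n'    > "$scratch/other.txt"

# -print0 and -0 separate file names with NUL bytes, so spaces in names are safe.
found=$(find "$scratch" -type f -name '*.txt' -print0 | xargs -0 grep -H 'needle')
echo "$found"

rm -rf "$scratch"
```

Without -print0 and -0, xargs would split "with space.txt" into two arguments.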



Re: searching inside files with find, cat and grep as a oneliner ...

2010-09-17 Thread Albretch Mueller

> if you need certain extension, you don't even need cat and find, it's all 
> about grep:

> $ grep 'string' *.extension
~
 The thing is that I need to know in which file the pattern was found
and as you guys suggested:
~
$ find -name '*.extension' -exec grep -H 'pattern' {} \;
~
 does it
~
 Thanks
 lbrtchx



Re: searching inside files with find, cat and grep as a oneliner ...

2010-09-17 Thread Camaleón

On Fri, 17 Sep 2010 18:08:18 +, Albretch Mueller wrote:

> I need to:
> ~
>  search for files using a pattern (say all files with a certain
>  extension)
> ~
>  then search inside each of the found files for a word or regexp pattern
> ~
>  You could do this using find, cat and grep in a script, but I was
> wondering about how could you do it with a oneliner ~

How about?

***
find /path/to/search/ -type f -iname \*.ext -exec grep -H 'text to search' {} \;
***

Greetings,

-- 
Camaleón



Re: searching inside files with find, cat and grep as a oneliner ...

2010-09-17 Thread Alexander Batischev

On Fri, Sep 17, 2010 at 06:08:18PM +, Albretch Mueller wrote:
>  I need to:
> ~
>  search for files using a pattern (say all files with a certain extension)
Is this part so complicated that bash can't handle it? I mean, if you need
certain extension, you don't even need cat and find, it's all about grep:

$ grep 'string' *.extension

> ~
>  then search inside each of the found files for a word or regexp pattern

But if you really want to use find, here's something you may try:

$ find -name '*.extension' -exec grep -H 'pattern' {} \;

(-name may be substituted by -iname if you don't want case sensitivity).

This oneliner would work exactly the same as the first one: each match would be
preceded by the filename, separated from the matching line itself by a colon
(:)

Hope it helps.

-- 
Regards,
Alexander Batischev



Re: searching inside files with find, cat and grep as a oneliner ...

2010-09-17 Thread Axel Freyn

Hi Albrecht,
On Fri, Sep 17, 2010 at 06:08:18PM +, Albretch Mueller wrote:
>  search for files using a pattern (say all files with a certain extension)
>  then search inside each of the found files for a word or regexp pattern
>  You could do this using find, cat and grep in a script, but I was
> wondering about how could you do it with a oneliner

You could do it with a line like
find / -name filename -exec grep word {} \;

this will search for all files with the name "filename" and then use
"grep" in order to search for the expression "word" in this file.
The line outputs the grep-results.
You could also just output the filenames by using
find / -name filename -exec grep -l word {} \;

For more details, see "man find" and "man grep"  ,-)

Axel



searching inside files with find, cat and grep as a oneliner ...

2010-09-17 Thread Albretch Mueller

 I need to:
~
 search for files using a pattern (say all files with a certain extension)
~
 then search inside each of the found files for a word or regexp pattern
~
 You could do this using find, cat and grep in a script, but I was
wondering about how could you do it with a oneliner
~
 Thanks
 lbrtchx



Re: match across line using grep

2010-08-19 Thread Bob Proulx

Zhang Weiwu wrote:
> Bob McGowan wrote:
> > My point is that changing only the LANG environment variable changed the
> > way 'grep' dealt with the newline character.  
> 
> You are right this really look like a problem. Where should I file the
> bug? The gnu projects management looks mysterious to me, unlike other
> foss projects where there is a bug tracker open for every product. Or,
> should I file it as a Debian bug?

Personally for something like this I would file it upstream.  But the
official policy is that you can always file a bug in the Debian BTS
and put the burden on the Debian maintainer to forward it upstream.
And in many cases I think that is good when Debian's version may be
different from the upstream or older than the upstream.  Upstreams are
sometimes annoyed with a distro's "stable" release that doesn't track
their upstream daily builds.  Reporting to Debian first is never wrong
and frequently the only right thing to do.

But I think for cases like the one you have here you would be better
served talking directly to the upstream maintainers.  The core
functionality of grep is the same everywhere, and besides, what would a
Debian maintainer actually be able to do in this case?  You wouldn't
really want Debian's grep to behave differently from other distros' grep,
so Debian-specific patches aren't a good thing.

To file a bug against GNU grep send an email to:

  bug-g...@gnu.org

Please choose a good subject line.  (The current one seems reasonable.
I am often annoyed enough to point out that a subject line like "bug"
or "doesn't work" is terrible but is often seen in bug reports.)

Be sure to mention the version of grep that you are using.

  grep --version
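
Since the behavior in question changed with the LANG environment variable,
it probably helps to include the locale settings as well. A minimal sketch
for gathering both follows; the variable names are made up for this
example, and it assumes a grep that accepts --version (GNU grep does).

```shell
#!/bin/sh
# Sketch only: collect details worth pasting into a grep bug report.
# First line of the version banner identifies the grep build.
grep_version=$(grep --version | head -n 1)
# Full locale settings, if the locale command is available.
locale_settings=$(locale 2>/dev/null)

printf 'grep version: %s\n' "$grep_version"
printf 'LANG=%s LC_ALL=%s\n' "${LANG:-unset}" "${LC_ALL:-unset}"
printf '%s\n' "$locale_settings"
```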

You can review previous discussions in the archive here.

  http://lists.gnu.org/archive/html/bug-grep/

To file a bug against a Debian package use 'reportbug' and follow the
prompts and instructions.

  $ reportbug grep

You can review the Debian bug tracking system reports for grep here:

  http://bugs.debian.org/cgi-bin/pkgreport.cgi?package=grep

Bob


Re: match across line using grep

2010-08-19 Thread Zhang Weiwu

On 2010-08-07 06:41, Bob McGowan wrote:
> My point is that changing only the LANG environment variable changed the
> way 'grep' dealt with the newline character.  

You are right, this really looks like a problem. Where should I file the
bug? The GNU project's management looks mysterious to me, unlike other
FOSS projects where there is a bug tracker open for every product. Or
should I file it as a Debian bug?

