subject:"a question about GREP"

Re: ack (was: a simple question about grep)

2007-10-19 Thread Kent Johnson

Bill Ricker wrote:

 The Andy and the ack project have built a better grep with perl.
 http://perladvent.pm.org/2006/5/
 search.cpan.org/~petdance/ack/ack
 petdance.com/ack/

Thank you again for pointing this out! I use ack several times a week, 
if not daily. It has saved me from having to learn/remember how to use 
'find' for which I am exceedingly grateful.

Kent
___
gnhlug-discuss mailing list
gnhlug-discuss@mail.gnhlug.org
http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/

Re: a simple question about grep

2007-09-10 Thread Jerry

Thank you for all the great solutions!

Because of my extremely limited *nix knowledge, I'd use the approach of two
greps in a pipeline, such as the one suggested by Michael,
grep '^\*' yourFile | grep -v '^\*INDICATOR', as it's simple to understand
and easy to memorize.

Thank you  again.

Zhao

Re: a simple question about grep

2007-09-07 Thread Tom Buskey

On 9/6/07, Kent Johnson [EMAIL PROTECTED] wrote:

 Tom Buskey wrote:
 
 
  On 9/6/07, G.O. [EMAIL PROTECTED] wrote:
 
  egrep ^\*[^INDICATOR] filename.txt

 That excludes lines beginning with * and any of the characters INDCATOR,
 i.e. *N, *D, etc will all be excluded.

  That didn't work for me, but this did:
 
 egrep '^\*[^I][^N][^I][^D][^I][^C][^A][^T][^O][^R]' filename.txt

 That will exclude a line that matches INDICATOR at any character, for
 example *aN



You're right.

perhaps this:

 egrep -P ^\*(?!INDICATOR) filename.txt


GNU egrep  2.5.1 doesn't work:
$ cat z
*INDICATOR name1 zip1
geoid gender location
*INDICATOR name2 zip2
*geoid gender location
INDICATOR name3 zip3
*district court
$ egrep  '^\*(?!INDICATOR)' z
$

No output.

Re: a simple question about grep

2007-09-07 Thread Kent Johnson

Bill Ricker wrote:
   Or, if you only have an old grep, but do have Perl, the following should 
 work:
 
 The Andy and the ack project have built a better grep with perl.

Cool. By default ack ignores plain text files, so you have to tell it to 
include them even when explicitly specifying the file. Here is an ack 
command that solves the OP's problem:

ack --text '^\*(?!INDICATOR)' myfile.txt

Kent

Re: a simple question about grep

2007-09-07 Thread Ben Scott

On 9/7/07, Tom Buskey [EMAIL PROTECTED] wrote:
 egrep -P ^\*(?!INDICATOR) filename.txt

 GNU egrep  2.5.1 doesn't work:
 $ egrep  '^\*(?!INDICATOR)' z
 $

  You need to specify -P (or --perl-regexp) to turn on support for Perl
regular expression extensions.  Otherwise it will interpret the (?
as... hmmm, to tell the truth, I'm not sure what that'll do.  I don't
think that's valid traditional regexp syntax.  In any event, it won't
work.

  Hmmm, for that matter, it doesn't seem to like egrep -P.  I guess
that's because egrep is basically just the same thing as grep -E,
and grep -E -P is invalid.  So try grep -P.  On a CentOS 5.0 box:

$ grep '^\*(?!INDICATOR)' sample
$ grep -P '^\*(?!INDICATOR)' sample
*geoid gender location
*district court
$ rpm -q grep
grep-2.5.1-54.2.el5
$

-- Ben

Re: a simple question about grep

2007-09-07 Thread Shawn K. O'Shea

Well, if you're going to go into 3-letter tools that start with 'a'
that can do the requested task, then I'm just going to have to tell
everyone how to do it with awk:

 awk '/^\*/ && !/^\*INDICATOR/ { print $0 }' file

awk takes a pattern and then a set of things to do with lines that
match that pattern. So my pattern says the line starts with '*' AND the
line does NOT start with '*INDICATOR'. Lines that match get processed by
the curly braces, which in this case print out the entire line ($0 in
awk parlance).
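
For anyone who wants to try the awk approach without touching real data,
here is a minimal sketch. It assumes a POSIX shell and awk; the temporary
file and the function name star_not_indicator are made up for illustration,
and the sample lines are the ones from the original question.

```shell
# Sample data copied from the original question; the temp file is an assumption.
sample=$(mktemp)
cat > "$sample" <<'EOF'
*INDICATOR name1 zip1
geoid gender location
*INDICATOR name2 zip2
*geoid gender location
INDICATOR name3 zip3
*district court
EOF

# Pattern: the line starts with '*' AND does not start with '*INDICATOR'.
# With no action block, awk prints each matching line by default,
# so this behaves the same as writing { print $0 }.
star_not_indicator() {
  awk '/^\*/ && !/^\*INDICATOR/' "$1"
}

star_not_indicator "$sample"
```

On the sample above this should print only the "*geoid gender location" and
"*district court" lines.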

-Shawn


a simple question about grep

2007-09-06 Thread Jerry

Hi,

I have a text file whose content looks like below:

*INDICATOR name1 zip1
geoid gender location
*INDICATOR name2 zip2
*geoid gender location
INDICATOR name3 zip3
*district court

I want to pick up all lines that start with * but are not followed by
INDICATOR.

So for the example above, I want to pick up the following 2 lines:

(the 4th line) *geoid gender location
(the last line) *district court

How do I construct a regular expression with grep as a one-line command to
achieve this goal? Or are there any other simple solutions?
Thank you!

Zhao

Re: a simple question about grep

2007-09-06 Thread Michael ODonnell


grep '^\*'  yourFile | grep -v '^\*INDICATOR'


Re: a simple question about grep

2007-09-06 Thread Tom Buskey

egrep '^\*' FILE | egrep -v '^\*INDICATOR'

I'm not sure how you'd combine them into one regexp.
I'm sure there's a better way in Perl (GNU grep will do Perl regexps with -P).


Re: a simple question about grep

2007-09-06 Thread Star

 I want to pick up all lines starting with * but no INDICATOR
 followed.

I'd double-grep it, but I'm not in front of a *nix box to check:

grep '^\*' filename | grep -v '^\*INDICATOR'

or something to that effect.

-- 
~ *

Re: a simple question about grep

2007-09-06 Thread G.O.

egrep ^\*[^INDICATOR] filename.txt

gurhan


Re: a simple question about grep

2007-09-06 Thread Tom Buskey

On 9/6/07, G.O. [EMAIL PROTECTED] wrote:

 egrep ^\*[^INDICATOR] filename.txt

 gurhan


That didn't work for me, but this did:

   egrep '^\*[^I][^N][^I][^D][^I][^C][^A][^T][^O][^R]' filename.txt




Re: a simple question about grep

2007-09-06 Thread Kent Johnson

Tom Buskey wrote:
 
 
 On 9/6/07, G.O. [EMAIL PROTECTED] wrote:
 
 egrep ^\*[^INDICATOR] filename.txt

That excludes every line where the * is followed by any one of the
characters I, N, D, C, A, T, O, R, i.e. lines starting with *N, *D, etc.
will all be excluded.

 That didn't work for me, but this did:
 
egrep '^\*[^I][^N][^I][^D][^I][^C][^A][^T][^O][^R]' filename.txt

That will exclude any line that matches INDICATOR at even one character
position, for example a line starting with *aN.

perhaps this:

egrep -P ^\*(?!INDICATOR) filename.txt
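
To make the two pitfalls concrete, here is a small sketch that needs only
plain grep (no -P). The test lines and the function names are hypothetical,
chosen so that each bracket-expression attempt drops a line it should keep;
the two-grep pipeline is included as a lookahead-free baseline.

```shell
# Hypothetical data: every line except the first should be selected.
data=$(mktemp)
cat > "$data" <<'EOF'
*INDICATOR name1 zip1
*NAME should be kept
*aN also should be kept
*geoid gender location
EOF

# [^INDICATOR] is ONE character class: it rejects any line whose first
# character after '*' is one of I, N, D, C, A, T, O, R.
one_class() { grep -E '^\*[^INDICATOR]' "$1"; }

# Ten one-character classes reject a line as soon as any single position
# happens to hold the corresponding letter of INDICATOR.
ten_classes() { grep -E '^\*[^I][^N][^I][^D][^I][^C][^A][^T][^O][^R]' "$1"; }

# Two plain greps in a pipeline select the intended lines.
two_greps() { grep '^\*' "$1" | grep -v '^\*INDICATOR'; }
```

Running one_class on this data loses the *NAME line, ten_classes loses the
*aN line, and two_greps keeps all three non-INDICATOR lines.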

Kent

Re: a simple question about grep

2007-09-06 Thread Ben Scott

On 9/6/07, Kent Johnson [EMAIL PROTECTED] wrote:
 perhaps this:

 egrep -P ^\*(?!INDICATOR) filename.txt

  Assuming your grep supports the Perl regular expression extensions
(a useful thing to have), that should work.

  Or, if you only have an old grep, but do have Perl, the following should work:

perl -ne '/^\*(?!INDICATOR)/ and print' filename.txt

  Otherwise, I agree with what others suggested: using two greps in a pipeline.

-- Ben

Re: a simple question about grep

2007-09-06 Thread Bill Ricker

   Or, if you only have an old grep, but do have Perl, the following should 
 work:

The Andy and the ack project have built a better grep with perl.
http://perladvent.pm.org/2006/5/
search.cpan.org/~petdance/ack/ack
petdance.com/ack/

ack is pure Perl, so consistent across all platforms. Command name is
25% shorter. :-) Heck, it's 50% shorter compared to grep -r. 
use.perl.org/~petdance/journal/31763

http://www.youtube.com/watch?v=G1ynTV_E-5s [Andy petdance giving
ack Lightning talk at OSCON 2007, 9min]
http://www.perlfoundation.org/perl5/index.cgi?ack

Disclaimer - I have been known to contribute a patch to ack once in
a blue moon.

-- 
Bill
[EMAIL PROTECTED] [EMAIL PROTECTED]

Re: a question about GREP

2007-03-27 Thread Jerry


Mike,

You are right. Some files on our server which I believed were plain text files
turn out to be "data", based on what the file command shows. Weird! These
files were moved from an AIX system to the current Red Hat system; could this
have something to do with the file type?

Thank you.

Zhao

On 3/26/07, mike ledoux [EMAIL PROTECTED] wrote:


 Steven's solution (listed below) only partially works, for reasons I
 don't know. By partially, I mean his solution can only find SOME files
 matching the search criteria.

 find . -type f -name \*out\* | \
 xargs file | \
 awk '/ASCII/ { sub(/:/, ""); print $1}' | \
 xargs grep -l zip > zip.txt

If you run 'find . -type f -name '*out*' -print0 | xargs -0 file'
I bet some of the files you are calling "plain text files" are not
ASCII text files, which is what the above is looking for.  For
example, a file that 'file' reports as "ISO-8859 English text" will
almost certainly meet *your* criteria for "plain text", but doesn't
include "ASCII" anywhere in the output of 'file'.

--
[EMAIL PROTECTED]  OpenPGP KeyID 0x57C3430B
Holder of Past Knowledge   CS, O-
Working on Megatokyo is a lot like trying to fix the engine on a bus
while
it cruises down a bumpy highway at 75 mph with two monkeys fighting over
the steering wheel and a brick on the accelerator.  Piro


Re: a question about GREP

2007-03-27 Thread Ben Scott


On 3/27/07, Jerry [EMAIL PROTECTED] wrote:

You are right. Some files on our server which I believe are plain text files
turn out to be data, based on what file command shows.


 The file(1) command just looks at the contents of a file, and
looks for known patterns (also called "magic numbers").  For example,
all GIF image files begin with the characters "GIF87a" or "GIF89a".  So
it looks for patterns like that, and guesses at what the file is supposed
to be.  There isn't any actual "file type" metadata stored in a standard
*nix filesystem.


Weird! These files were moved from AIX system to the current Red Hat system,
could this have something to with the file type?


 It's more likely they happen to contain some characters which are
outside the strict ASCII standard printable character set (A-Z, a-z,
0-9, space, keyboard punctuation).  For example, maybe it contains
some so-called high ASCII (8th bit set), which isn't a standard at
all, but various platform-specific and mutually-incompatible
extensions to ASCII.

 Older versions of the file(1) command would probably identify a
UTF-16 encoded Unicode text file as data.

-- Ben

Re: a question about GREP

2007-03-26 Thread Michael ODonnell



Do you have a source tree that has already proven
to be buildable for the machine in question,
independent of these new drivers.  That would
help during triage of this problem...
 

Re: a question about GREP

2007-03-26 Thread Jerry


Hi,

Thank you for all your help and time. I really appreciate it.



On our server, which runs Red Hat Enterprise Linux AS release 3 (Taroon
Update 8)

Lloyd ([EMAIL PROTECTED])'s solution works:

find -type f -name '*out*' | xargs grep -wli zip > zip.txt

Question: -type f limits the search to "regular files"; does the so-called
"regular file" strictly mean a plain text file?


Also, the solution from Ben (after adding the search pattern, which is zip)
and Bill (after moving zip ahead of .) works:

grep -lwir --include=\*out\* zip . > zip.txt

---

Steven's solution (listed below) only partially works, for reasons I don't
know. By partially, I mean his solution can only find SOME files matching
the search criteria.

find . -type f -name \*out\* | \
xargs file | \
awk '/ASCII/ { sub(/:/, ""); print $1}' | \
xargs grep -l zip > zip.txt


---
Kevin, in your solution (listed below), why are 2 directory names
used? Could you please explain a bit to me? Thank you.


find your-dirname1 your-dirname2 -name \*out\* \
 -exec perl -e 'undef $/;
$filename=$ARGV[0];
$_=<>;
exit(!(-T $filename && /\bzip\b/))' \{\} \; -print \
   > zip.txt


BTW, yes, I'm serious about the "plain text files" part.

And thank you for your favorite aliases; I've not tested them yet, though.

--
Bill, your other solution (listed below), based on Steven's, doesn't work
:-(

find . -name '*out*' -exec file '{}' '|' grep -q ASCII ';' -print0 \
   | xargs -0 grep -wli zip > zip.txt



Again, thank you guys all!

Zhao

Re: a question about GREP

2007-03-26 Thread Ben Scott


[off-list]

On 3/26/07, Michael ODonnell [EMAIL PROTECTED] wrote:

Do you have a source tree that has already proven
to be buildable for the machine in question,
independent of these new drivers.  That would
help during triage of this problem...


 FYI, I think you replied to the wrong thread.

--
One day I feel I'm ahead of the wheel / And the next it's rolling over me
 -- Rush, Far Cry

Re: a question about GREP

2007-03-26 Thread Kevin D. Clark


Jerry writes:

 Kevin, in your solution (listed below), why are there 2 directory names are
 used? Could you please explain a bit to me? Thank you.
 
 
 find your-dirname1 your-dirname2 -name \*out\* \
   -exec perl -e 'undef $/;
  $filename=$ARGV[0];
  $_=<>;
  exit(!(-T $filename && /\bzip\b/))' \{\} \; -print \
 > zip.txt
 
 
 BTW, yes, I'm serious about the plain text files part.

By "your-dirname1" and "your-dirname2" I mean the directories *you*
are interested in searching.

For example:

  find /usr/src /media/usbdrive /home/jerry/src/foo -exec perl ...


In your case, you might want to begin searching from the current
directory, which is ".":

 find . -exec perl ...

Regards,

--kevin
-- 
GnuPG ID: B280F24E  Never could stand that dog.
alumni.unh.edu!kdc   -- Tom Waits




Re: a question about GREP

2007-03-26 Thread Ben Scott


On 3/26/07, Jerry [EMAIL PROTECTED] wrote:

Question: -type f limits to regular file, does the so-called regular
file strictly mean plain text files?


 No.  "find -type f" will include binary files, executables, and
such.  The "regular file" part means that it is just a file containing
user data -- "a bag of bytes", as one person put it.  As opposed to a
directory, a symbolic link, a named pipe (FIFO), or a device node.
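
A quick way to see this for yourself is the sketch below. It assumes a
POSIX shell with find, mktemp and ln; every file name in it is invented for
the demonstration, and regular_files is just a hypothetical helper.

```shell
# Build a throwaway directory (all names are made up for illustration).
demo=$(mktemp -d)
printf 'plain text\n' > "$demo/notes.txt"
printf '\000\001\002' > "$demo/blob.bin"   # binary, but still a regular file
mkdir "$demo/subdir"                       # a directory is not a regular file
ln -s notes.txt "$demo/link-to-notes"      # neither is a symbolic link

# -type f keeps regular files whatever their content, and skips the
# directory and the symlink.
regular_files() { (cd "$1" && find . -type f | sort); }

regular_files "$demo"
```

Both notes.txt and blob.bin are listed, which is why -type f alone is not
enough to restrict a search to text files.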


Also, solution from Ben (w/ adding search pattern, which is zip) and Bill
(w/ moving zip ahead of .) works:

grep -lwir --include=\*out\* zip . > zip.txt


 You may want to add -I to that as well, to exclude binary files,
as Kevin Clark suggested.  That is:

grep -lwirI --include=\*out\* zip . > zip.txt
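
Here is a minimal sketch of what each option contributes, assuming GNU grep
(--include and -I are GNU extensions and may be missing elsewhere). The
directory layout, file names and the list_matches helper are all invented
for the example.

```shell
tree=$(mktemp -d)
mkdir -p "$tree/a/b"
printf 'the zip code\n' > "$tree/a/report.out"       # name and content match
printf 'unzip only\n'   > "$tree/a/b/other_out.log"  # "zip" is not a whole word
printf 'zip here too\n' > "$tree/a/b/readme.txt"     # name does not contain "out"
printf 'zip\000\001\n'  > "$tree/a/b/out.bin"        # NUL bytes, so grep sees binary

# -r recurses, --include filters on the file NAME with a shell glob,
# -w wants whole words, -i ignores case, -l lists names only,
# -I treats binary files as if they contained no match.
list_matches() {
  (cd "$1" && grep -lwirI --include='*out*' zip . | sort)
}

list_matches "$tree"
```

Only ./a/report.out should be listed. Note that the glob needs the
surrounding asterisks: --include=out would only accept a file named
exactly "out".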


Kevin, in your solution (listed below), why are there 2 directory names are
used? Could you please explain a bit to me?


 I think he's just demonstrating that you can specify multiple
directory names on some implementations of find(1).  You can specify
only one, if you prefer.  On some find(1) implementations, you can
specify no directories at all, which implies the current directory.

--
One day I feel I'm ahead of the wheel / And the next it's rolling over me
 -- Rush, Far Cry

Re: a question about GREP

2007-03-26 Thread mike ledoux

On Mon, Mar 26, 2007 at 11:38:31AM -0400, Jerry wrote:
 Lloyd ([EMAIL PROTECTED])'s solution works:
 
 find -type f -name '*out*' | xargs grep -wli zip > zip.txt
 
 Question: -type f limits to regular file, does the so-called regular
 file strictly mean plain text files?

It does not.  "Regular file" means not a special file, directory,
named pipe, symbolic link, or socket.  "Plain text files" are a
subset of regular files.  If you just want to omit non-text files
from the output, something like:

  find . -type f -name '*out*' -print0 | xargs -0 grep -wliI zip > zip.txt

will probably do what you want.  The -I option to GNU grep tells it
to treat binary files as if they contain no matches.  The -print0
to find and -0 to xargs improve handling of file names that contain
whitespace.
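
The whitespace point can be checked with a short sketch like this one. It
assumes a find that supports -print0 and an xargs that supports -0 (GNU and
BSD tools do); the file names and the safe_search helper are hypothetical.

```shell
dir=$(mktemp -d)
printf 'zip inside\n' > "$dir/my out file.txt"   # name contains spaces
printf 'zip inside\n' > "$dir/out.txt"

# Without -print0/-0, xargs would split "my out file.txt" into three
# separate, nonexistent names. With them, names are separated by NUL
# bytes and reach grep intact.
safe_search() {
  (cd "$1" && find . -type f -name '*out*' -print0 | xargs -0 grep -wl zip | sort)
}

safe_search "$dir"
```

Both files are reported, including the one with spaces in its name.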

 Steven's solution (listed below) only partially works, for reasons I don't
 know. By partially, I mean his solution can only find SOME files matching
 the search criteria.
 
 find . -type f -name \*out\* | \
 xargs file | \
 awk '/ASCII/ { sub(/:/, ""); print $1}' | \
 xargs grep -l zip > zip.txt

If you run 'find . -type f -name '*out*' -print0 | xargs -0 file'
I bet some of the files you are calling "plain text files" are not
ASCII text files, which is what the above is looking for.  For
example, a file that 'file' reports as "ISO-8859 English text" will
almost certainly meet *your* criteria for "plain text", but doesn't
include "ASCII" anywhere in the output of 'file'.

-- 
[EMAIL PROTECTED]  OpenPGP KeyID 0x57C3430B
Holder of Past Knowledge   CS, O-
Working on Megatokyo is a lot like trying to fix the engine on a bus while
 it cruises down a bumpy highway at 75 mph with two monkeys fighting over
 the steering wheel and a brick on the accelerator.  Piro


Re: a question about GREP

2007-03-25 Thread Ben Scott


On 23 Mar 2007 22:20:24 -0400, Kevin D. Clark [EMAIL PROTECTED] wrote:

   Holy crap!  Where's Perl's oft-decried extreme conciseness?  ;-)

My solution comes from my experience, and I was going for correctness,
portability, and clarity, in that order.


 I can't resist pointing out that Perl isn't guaranteed on Unix
systems, either.  ;-)


By the way, did you forget to add --binary-files=without-match to
your solution?  The original poster asked for text files only.


 Yes.  As Bill Freeman pointed out, I also left out the search
pattern!  (I was cut-and-pasting from an xterm where I was actually
testing things, so I'm not sure how I managed to do that, but I guess
I found a way.)  They need to build a script interpreter into email.
;-)  Oh wait, Microsoft already did, it was called Outlook 2000 and
we know how that turned out..

--
One day I feel I'm ahead of the wheel / And the next it's rolling over me
 -- Rush, Far Cry

Re: a question about GREP

2007-03-25 Thread Kevin D. Clark


Ben Scott writes:

  They need to build a script interpreter into email.
 ;-)  Oh wait, Microsoft already did, it was called Outlook 2000 and
 we know how that turned out..

Ho ho ho...true enough.  (-:

--kevin
-- 
GnuPG ID: B280F24E  Never could stand that dog.
alumni.unh.edu!kdc   -- Tom Waits

Re: a question about GREP

2007-03-24 Thread Bill Freeman

Ben Scott writes:
  grep -lwir --include=\*out\* . > zip.txt

Close.  You've left out what he's searching for:

   grep -lwir --include=\*out\* . zip > zip.txt

Of course, this doesn't have the subtlety that Steven W. Orr added to
limit it to text files, as Jerry mentioned, but didn't seem to be
trying to do.

Note that all the versions using find and xargs have issues in
modern times when it is highly likely that there are files whose names
include spaces.  find's -print0 option combined with xargs's
-0 (zero, not a capital letter O) option take care of this.

Steven's awk script would need further work to extract filenames
containing spaces.  Or, you can use find's -exec:

   find . -name '*out*' -exec file '{}' '|' grep -q ASCII ';' -print0 \
 | xargs -0 grep -wli zip > zip.txt

[ -exec treats everything up to the next semicolon (which must be
quoted so that the shell will pass it to find) as a command (pipe) to
run, except that an argument consisting of a matched pair of curly
braces (which must be quoted against shell interpretation) is replaced,
in the command run, by the name of the file under consideration.
Things in the command, like the pipe symbol, that are special to the
shell, must be quoted.  (find never sees the quotes, so they're not
quoted in the sub-shell running the command.)  If the command fails
(returns non-zero status) -exec moves on to the next filename, so this
one doesn't get printed. ]
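
One caveat worth testing: find hands the -exec arguments straight to the
program without going through a shell, so a quoted '|' arrives as a literal
argument instead of starting a pipeline. A sketch of one way around that is
to run the pipeline inside sh -c. The text check below is a hypothetical
stand-in for "file ... | grep -q ASCII" (it counts bytes that are neither
printable nor whitespace), and the file names and helper name are made up.

```shell
work=$(mktemp -d)
printf 'zip as text\n'     > "$work/text_out.txt"
printf 'zip\000\001\002\n' > "$work/binary_out.dat"

# The pipeline lives inside sh -c, so a real shell interprets the '|'.
# "$1" inside the quoted script is the file name that find substitutes
# for {}. The test succeeds only when no odd bytes remain after tr.
texty_out_files() {
  (cd "$1" && find . -type f -name '*out*' \
     -exec sh -c '[ $(LC_ALL=C tr -d "[:print:][:space:]" < "$1" | wc -c) -eq 0 ]' sh {} \; \
     -print | sort)
}

texty_out_files "$work"
```

Here only ./text_out.txt is printed, because the -exec test fails for the
file containing control bytes and find then skips -print for it.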

Bill


a question about GREP

2007-03-23 Thread Jerry


Hi,

The manual of grep command on Red Hat states that:

-R, -r, --recursive
read all files in each directory, recursively, this is
equivalent to -d recurse option

 --include=PATTERN recurse in directories only searching file
matching PATTERN
 --exclude=PATTERN recurse in directories skip file matching
PATTERN

For the --include or --exclude option, what is "file matching PATTERN"
supposed to mean? I suppose it means the file name matches PATTERN, not
that the file content matches the pattern; am I right?

I'm asking this question, because I'm trying to do the following thing:

Find all plain text files whose file names contain "out" and whose
contents contain "zip" (as a whole word), and then output
these file names to a file called zip.txt. (These plain text files are
located in sub-directories at different levels.)

I tried the following 2 commands to achieve the goal above,
but neither worked. Does anyone care to spot the error? I suspect most likely
it's because my usage/understanding of the --include option is wrong.

grep -Hwli -r --include=out zip * > zip.txt

grep -Hwli --include=out zip * > zip.txt

Sorry if this question sounds stupid.

Thank you for your time.

Zhao

Re: a question about GREP

2007-03-23 Thread Jerry


Scott,

Thank you for your solution. But it didn't work on my system. :-(

Also, doesn't grep stand for "global regular expression print"?

Zhao

On 3/23/07, Scott A. Valcourt [EMAIL PROTECTED] wrote:


Zhao-

Grep stands for global replace, though it is most often used as a global
find of a text pattern in UNIX.

I'm asking this question, because I'm trying to do the following thing:

Find out all plain text files whose file names contain out and whose
contents containing zip (in the form of whole word),  and then output
these files names to a file called zip.txt. (These plain text files are
located in the sub-directories at different levels)

Well, one way to do this in UNIX is really the following:

grep -r zip *out*.* > zip.txt

I think this is what you want to do.

-Scott


-Scott Valcourt              email:  [EMAIL PROTECTED]
Computer Science Department  phone:  (603) 862-4489
University of New Hampshire  fax:    (603) 862-3493
310 Nesmith Hall
Durham, NH 03824



Re: a question about GREP

2007-03-23 Thread Kevin D. Clark


Jerry writes:

 Find out all plain text files whose file names contain out and whose
 contents containing zip (in the form of whole word),  and then output
 these files names to a file called zip.txt. (These plain text files are
 located in the sub-directories at different levels)

Here is how I would do this:

find your-dirname1 your-dirname2 -name \*out\* \
   -exec perl -e 'undef $/;
  $filename=$ARGV[0];
  $_=<>;
  exit(!(-T $filename && /\bzip\b/))' \{\} \; -print \
 > zip.txt


Notes:

1:  I assume you were serious about the "plain text files" part.  This
is what the "-T" bit in the Perl program looks for.  No binary
files, right?

2:  I assume you were serious about the "zip" part, so a word like
"unzip" would not qualify.

3:  The Perl code has some warts, but I was trying for clarity here.

4:  The find program is very powerful and you can never go wrong
learning about its features.

Regards,

--kevin


PS  I thought you might like some of my favorite aliases:

# Author: kevin d. clark

# Finds text files in the specified directories.  These use Perl's -T
# and -B tests.  Here's some relevant documentation from the perlfunc 
# page:
#
#The -T and -B switches work as follows.  The first block or
#so of the file is examined for odd characters such as strange
#control codes or characters with the high bit set.  If too many
#strange characters (>30%) are found, it's a -B file, otherwise
#it's a -T file.  Also, any file containing null in the
#first block is considered a binary file. [...]  Both -T and
#-B return true on a null file...
#
# Caveat programmer.
# 

# Find text files
txtfind () {
  if [ $# -eq 0 ] ; then
    txtfind .
  else
    perl -MFile::Find -e 'find(sub{print "$File::Find::name\n" if (-f &&
     -T);}, @ARGV);' "${@}"
  fi
}

# Find DOS-formatted text files
dostxtfind () {
  if [ $# -eq 0 ] ; then
    dostxtfind .
  else
    perl -MFile::Find -e 'find(sub{
     $crlf = 0;
     if (($f = -f) && ($T = -T)) {
       @ARGV=($_);
       binmode(ARGV);
       (/\r\n/ && $crlf++) while(<>);
     }
     print "$File::Find::name\n"
       if ($f && $T && $crlf);
       }, @ARGV)' "${@}"
  fi
}

# Find binary files
binfind () {
  if [ $# -eq 0 ] ; then
    binfind .
  else
    perl -MFile::Find -e 'find(sub{print "$File::Find::name\n" if (-f &&
     -B);}, @ARGV);' "${@}"
  fi
}




--
GnuPG ID: B280F24E  Never could stand that dog.
alumni.unh.edu!kdc   -- Tom Waits

Re: a question about GREP

2007-03-23 Thread Steven W. Orr

On Friday, Mar 23rd 2007 at 15:41 -0400, quoth Jerry:

=The manual of grep command on Red Hat states that:
=
=-R, -r, --recursive
=read all files in each directory, recursively, this is
=equivalent to -d recurse option
=
= --include=PATTERN recurse in directories only searching file
=matching PATTERN
= --exclude=PATTERN recurse in directories skip file matching
=PATTERN
=
=For the --include or --exclude option, what is file matching PATTERN
=supposed to mean? I supposed it means file name match PATTERN, not file
=content match pattern, am I right?
=
=I'm asking this question, because I'm trying to do the following thing:
=
=Find out all plain text files whose file names contain "out" and whose
=contents containing "zip" (in the form of whole word),  and then output
=these files names to a file called zip.txt. (These plain text files are
=located in the sub-directories at different levels)
=
=I tried the following 2 lines of commands to try to achieve the goal above,
=but neither worked. Anyone cares to spot the error? I suspect most likely
=it's because my usage/understanding of --include option is wrong.
=
=grep -Hwli -r --include=out "zip" * > zip.txt
=
=grep -Hwli --include=out "zip" * > zip.txt
=
=Sorry if this question sounds stupid.

That's the dumbest question I ever heard! (just kidding)

It seems to me that you need grep find awk xargs etc...

Tell me if this helps:

find . -type f -name \*out\* | \
 xargs file | \
 awk '/ASCII/ { sub(/:/, ""); print $1}' | \
 xargs grep -l zip > zip.txt

Line 1 gets the list of files whose names contain the word "out".
Line 2 takes that list and runs the file command on each name.
Line 3 takes the output of file and prints column 1 (without the colon
   at the end) if the word ASCII is found.
Line 4 takes the previous output and searches those files for the word "zip";
   the output then goes into zip.txt.
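
To make the awk step concrete, here is a small sketch that feeds awk some made-up lines in the "name: description" shape that file prints. The paths and the two helper names (sample_file_output, ascii_names) are invented for illustration; real file output will differ from system to system.

```shell
# Hypothetical sample of what `file` might print for three files.
sample_file_output() {
  printf '%s\n' \
    './logs/checkout.txt: ASCII text' \
    './bin/layout.o: ELF 64-bit LSB relocatable' \
    './notes/without.txt: ASCII text, with CRLF line terminators'
}

# Keep only lines mentioning ASCII, delete the first colon,
# and print the first whitespace-separated field (the file name).
ascii_names() {
  awk '/ASCII/ { sub(/:/, ""); print $1 }'
}

sample_file_output | ascii_names
```

Because awk splits fields on whitespace, a file name containing a space would be cut short at this step, so the pipeline is best suited to trees with simple names.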

Easy peasy japaneezy

(I shall now buff my nails.)

-- 
Time flies like the wind. Fruit flies like a banana. Stranger things have  .0.
happened but none stranger than this. Does your driver's license say Organ ..0
Donor?Black holes are where God divided by zero. Listen to me! We are all- 000
individuals! What if this weren't a hypothetical question?
steveo at syslang.net

Re: a question about GREP

2007-03-23 Thread Steven W. Orr

On Friday, Mar 23rd 2007 at 16:33 -0400, quoth Jerry:

=Also, doesn't Grep stand for global regular expression print?

General Regular Expression Processor

-- 
Time flies like the wind. Fruit flies like a banana. Stranger things have  .0.
happened but none stranger than this. Does your driver's license say Organ ..0
Donor?Black holes are where God divided by zero. Listen to me! We are all- 000
individuals! What if this weren't a hypothetical question?
steveo at syslang.net

Re: a question about GREP

2007-03-23 Thread Python

On Fri, 2007-03-23 at 15:41 -0400, Jerry wrote:
 Hi,
 
 The manual of grep command on Red Hat states that:
 
  -R, -r, --recursive   
  read all files in each directory, recursively, this is
 equivalent to -d recurse option
 
   --include=PATTERN recurse in directories only searching file
 matching PATTERN
   --exclude=PATTERN recurse in directories skip file matching
 PATTERN
 
 For the --include or --exclude option, what is file matching PATTERN
 supposed to mean? I supposed it means file name match PATTERN, not
 file content match pattern, am I right?
 
 I'm asking this question, because I'm trying to do the following
 thing:
 
 Find out all plain text files whose file names contain "out" and whose
 contents containing "zip" (in the form of whole word),  and then
 output these files names to a file called zip.txt. (These plain text
 files are located in the sub-directories at different levels)

Would this approach work?

find -type f -name '*out*' | xargs grep -wli zip > zip.txt


use find to recurse through directories and create a list of files.

xargs feeds the file list as arguments to grep.

grep examines the files looking for the word zip ignoring case
and writes the filenames

which get directed into zip.txt
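
If some of the file names contain spaces, a variant that passes names NUL-terminated may be safer. The sketch below assumes a find and xargs that support -print0 and -0 (GNU and BSD versions do; they are not in older POSIX). The temporary tree and the helper name find_out_zip are made up for the demonstration.

```shell
# Build a throwaway tree; every name here is invented for the demo.
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
printf 'the zip code\n' > "$demo/sub dir/checkout log.txt"
printf 'zipper only\n'  > "$demo/sub dir/layout.txt"
printf 'ZIP here too\n' > "$demo/output.txt"
printf 'zip\n'          > "$demo/readme.txt"

find_out_zip() {
  # -print0 and -0 hand names over NUL-terminated, so spaces survive.
  # -w: whole word, -l: list file names only, -i: ignore case.
  ( cd "$1" && find . -type f -name '*out*' -print0 | xargs -0 grep -wli zip | sort )
}

find_out_zip "$demo"
```

Here layout.txt is skipped because "zipper" is not the whole word "zip", and readme.txt is skipped because its name does not contain "out". The scratch directory can be removed afterwards with rm -rf "$demo".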

 
 I tried the following 2 lines of commands to try to achieve the goal
 above, but neither worked. Anyone cares to spot the error? I suspect
 most likely it's because my usage/understanding of --include option is
 wrong. 
 
 grep -Hwli -r --include=out "zip" * > zip.txt
 
 grep -Hwli --include=out "zip" * > zip.txt
 
 Sorry if this question sounds stupid.
 
 Thank you for your time.
 
 Zhao 
-- 
Lloyd Kvam
Venix Corp


Re: a question about GREP

2007-03-23 Thread Shawn K. O'Shea


because my usage/understanding of --include option is wrong.


grep -Hwli -r --include=out "zip" * > zip.txt

grep -Hwli --include=out "zip" * > zip.txt



It seems to be more of a glob pattern. I played around a little on one
of my boxes and I believe something more like
--include=*out*
for the include option will work.

-Shawn

Re: a question about GREP

2007-03-23 Thread Ben Scott


On 3/23/07, Jerry [EMAIL PROTECTED] wrote:

For the --include or --exclude option, what is file matching PATTERN
supposed to mean?


 Typically, it's a shell glob.  My testing appears to confirm that.


I supposed it means file name match PATTERN, not file
content match pattern, am I right?


 Yah.


Find out all plain text files whose file names contain "out" and whose
contents containing "zip" (in the form of whole word),  and then output
these files names to a file called zip.txt.


I think this should work:

grep -lwir --include=\*out\* zip . > zip.txt

Using long options:

grep  --files-with-matches --word-regexp --ignore-case \
--recursive --include=\*out\* zip . > zip.txt

 The backslashes before the stars (\*out\*) are needed because
otherwise the shell will try to expand them, which may prevent grep
from seeing them.


grep -Hwli -r --include=out "zip" * > zip.txt


 The biggest problem there is that the include PATTERN is just "out",
which means the filename would have to be just "out".  Not "without"
or "outside".  By putting the stars around it, as I did, it will match
anything (including nothing) on either side, as well.

 The "*" you give for the file name will be expanded by the shell,
which may or may not give you what you want.  I used just ".", which
is the current directory.  Let grep handle getting the file list from
the current directory, since you're using a recursive file search (-r)
anyway.

 Also, a few minor superfluous things: -l prints only file names, so
-H adds nothing and you don't need both.  And you don't need to quote
"zip", since "zip" does not contain any shell meta-characters.
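
 A quick way to see the difference between the two include patterns is to try both on a scratch directory. This is only a sketch: it assumes a reasonably recent GNU grep (for --include with -r), and the file names below are invented for the test.

```shell
# Scratch tree; all four files contain the word "zip".
demo=$(mktemp -d)
mkdir "$demo/sub"
printf 'zip\n' > "$demo/out"
printf 'zip\n' > "$demo/without.txt"
printf 'zip\n' > "$demo/sub/outside.log"
printf 'zip\n' > "$demo/sub/readme.txt"

# --include=out only accepts a file whose name is exactly "out".
exact=$(cd "$demo" && grep -lwir --include=out zip . | sort)

# --include='*out*' accepts any file name containing "out".
# Single quotes protect the stars from the shell, like \*out\* does.
glob=$(cd "$demo" && grep -lwir --include='*out*' zip . | sort)

printf 'exact:\n%s\nglob:\n%s\n' "$exact" "$glob"
```

 The first search should list only ./out, while the second should also pick up without.txt and sub/outside.log; readme.txt is skipped by both because its name never matches.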

--
One day I feel I'm ahead of the wheel / And the next it's rolling over me
 -- Rush, Far Cry

Re: a question about GREP

2007-03-23 Thread Ben Scott


On 23 Mar 2007 17:01:40 -0400, Kevin D. Clark [EMAIL PROTECTED] wrote:

Here is how I would do this:

find your-dirname1 your-dirname2 -name \*out\* \
   -exec perl -e 'undef $/;
  $filename=$ARGV[0];
  $_=<>;
  exit(!(-T $filename && /\bzip\b/))' \{\} \; -print \
  > zip.txt


 Holy crap!  Where's Perl's oft-decried extreme conciseness?  ;-)

 I much prefer the all-in-one approach:

grep -lwir --include=\*out\* zip . > zip.txt

 Yah, the find command is very useful, since it's generic, and thus
works in very complicated situations, when nothing else will.  But for
more common cases, the convenience features of modern *nix tools
really do save a lot of work.  :-)

--
One day I feel I'm ahead of the wheel / And the next it's rolling over me
 -- Rush, Far Cry

Re: a question about GREP

2007-03-23 Thread mike ledoux

On Fri, Mar 23, 2007 at 05:12:04PM -0400, Steven W. Orr wrote:
 On Friday, Mar 23rd 2007 at 16:33 -0400, quoth Jerry:
 
 =Also, doesn't Grep stand for global regular expression print?
 
 General Regular Expression Processor

Jerry is correct.  The name grep comes from the ed command g/regex/p:
(search) global(ly for lines matching the) regular expression(, and) print.

-- 
[EMAIL PROTECTED]  OpenPGP KeyID 0x57C3430B
Holder of Past Knowledge   CS, O-
Touch passion when it comes your way Stephen.  It's rare enough as it is,
 don't walk away when it calls you by name.  Marcus Cole


Re: a question about GREP

2007-03-23 Thread Kevin D. Clark


Ben Scott writes:

   Holy crap!  Where's Perl's oft-decried extreme conciseness?  ;-)

From my perspective, I deal with unix-flavored systems all the time
with feature-lacking grep implementations.  As recently as three weeks
ago, I was working on a system without any fancy GNU grep.  This
system would happily grep through binary files and display the output on
your screen, thus hosing your terminal.

My solution comes from my experience, and I was going for correctness,
portability, and clarity, in that order.  I realize this is a Linux
list, but I don't always live in that world.

By the way, did you forget to add --binary-files=without-match to
your solution?  The original poster asked for text files only.
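
For anyone who wants to see what that option changes, here is a small sketch. It assumes GNU grep, which treats a file containing NUL bytes as binary; the two file names are made up.

```shell
# One plain text file and one file with NUL and control bytes.
demo=$(mktemp -d)
printf 'zip code list\n'   > "$demo/text_out.txt"
printf 'zip\000\001\002\n' > "$demo/blob_out.dat"

# Without the option, a matching binary file is listed too.
all=$(cd "$demo" && grep -lwr --include='*out*' zip . | sort)

# With it, grep assumes binary files never match and skips them.
text_only=$(cd "$demo" && grep -lwr --binary-files=without-match --include='*out*' zip . | sort)

printf 'all:\n%s\ntext only:\n%s\n' "$all" "$text_only"
```

GNU grep also accepts -I as a short form of --binary-files=without-match. Its idea of "binary" is not the same heuristic as Perl's -T test, so the two approaches can disagree on borderline files.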

Kind Regards,

--kevin
-- 
GnuPG ID: B280F24E  Never could stand that dog.
alumni.unh.edu!kdc   -- Tom Waits

Re: a question about GREP

2007-03-23 Thread Kevin D. Clark


Here is another copy of my favorite shell functions, since I kind of
sent out garbled versions the first time.

I hope others find these to be useful.

--kevin



# txtfind, dostxtfind, and binfind all use Perl's -B and -T file 
# test operations.
#
# Here are some relevant sections from the perlfunc documentation:
#
#  The -T and -B switches work as follows.  The first block or
#  so of the file is examined for odd characters such as strange
#  control codes or characters with the high bit set.  If too many
#  strange characters (>30%) are found, it is a -B file, otherwise
#  it is a -T file.  Also, any file containing null in the
#  first block is considered a binary file
#  ...
#  Both -T and -B return true on a null file.
#
# Caveat programmer.

txtfind () {
  if [ $# -eq 0 ] ; then
    txtfind .
  else
    perl -MFile::Find -e 'find(sub{print "$File::Find::name\n" if (-f && -T);}, @ARGV);' "$@"
  fi
}

dostxtfind () {
  if [ $# -eq 0 ] ; then
    dostxtfind .
  else
    perl -MFile::Find -e 'find(sub{
        $crlf = 0;
        $f = -f;
        $T = -T;
        @ARGV=($_);
        binmode(ARGV);
        ((/\r\n/) && $crlf++) while(<>);
        print "$File::Find::name\n"
          if ($f && $T && $crlf);
      }, @ARGV)' "$@"
  fi
}


binfind () {
  if [ $# -eq 0 ] ; then
    binfind .
  else
    perl -MFile::Find -e 'find(sub{print "$File::Find::name\n" if (-f && -B);}, @ARGV);' "$@"
  fi
}



-- 
GnuPG ID: B280F24E  Never could stand that dog.
alumni.unh.edu!kdc   -- Tom Waits
