Re: ack (was: a simple question about grep)
Bill Ricker wrote: The Andy and the ack project have built a better grep with perl. http://perladvent.pm.org/2006/5/ search.cpan.org/~petdance/ack/ack petdance.com/ack/ Thank you again for pointing this out! I use ack several times a week, if not daily. It has saved me from having to learn/remember how to use 'find' for which I am exceedingly grateful. Kent ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
Re: a simple question about grep
Thank you for all the great solutions! Because of my extremely limited *nix knowledge, I'd use the approach of two grep's in a pipeline, such as the one grep '^\*' yourFile | grep -v '^\*INDICATOR' suggested by Michael, as it's simple to understand and easy to memorize. Thank you again. Zhao ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
Re: a simple question about grep
On 9/6/07, Kent Johnson [EMAIL PROTECTED] wrote: Tom Buskey wrote: On 9/6/07, *G.O.* [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: egrep ^\*[^INDICATOR] filename.txt That excludes lines beginning with * and any of the characters INDCATOR, i.e. *N, *D, etc will all be excluded. That didn't work for me, but this did: egrep '^\*[^I][^N][^I][^D][^I][^C][^A][^T][^O][^R]' filename.txt That will exclude a line that matches INDICATOR at any character, for example *aN You're right. perhaps this: egrep -P ^\*(?!INDICATOR) filename.txt GNU egrep 2.5.1 doesn't work: $ cat z *INDICATOR name1 zip1 geoid gender location *INDICATOR name2 zip2 *geoid gender location INDICATOR name3 zip3 *district court $ egrep '^\*(?!INDICATOR)' z $ No output. ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
Re: a simple question about grep
Bill Ricker wrote: Or, if you only have an old grep, but do have Perl, the following should work: The Andy and the ack project have built a better grep with perl. Cool. By default ack ignores plain text files, so you have to tell it to include them even when explicitly specifying the file. Here is an ack command that solves the OP's problem: ack --text '^\*(?!INDICATOR)' myfile.txt Kent ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
Re: a simple question about grep
On 9/7/07, Tom Buskey [EMAIL PROTECTED] wrote:
>> egrep -P ^\*(?!INDICATOR) filename.txt
>
> GNU egrep 2.5.1 doesn't work:
>
> $ egrep '^\*(?!INDICATOR)' z
> $

You need to specify -P (or --perl-regexp) to turn on support for Perl regular expression extensions. Otherwise it will interpret the (? as... hmmm, to tell the truth, I'm not sure what that'll do. I don't think that's valid traditional regexp syntax. In any event, it won't work.

Hmmm, for that matter, it doesn't seem to like egrep -P. I guess that's because egrep is basically just the same thing as grep -E, and grep -E -P is invalid. So try grep -P. On a CentOS 5.0 box:

    $ grep '^\*(?!INDICATOR)' sample
    $ grep -P '^\*(?!INDICATOR)' sample
    *geoid gender location
    *district court
    $ rpm -q grep
    grep-2.5.1-54.2.el5
    $

-- Ben
Re: a simple question about grep
Well, if you're going to go into 3-letter tools that start with 'a' that can do the requested task, then I'm just going to have to tell everyone how to do it with awk:

    awk '/^\*/ && !/^\*INDICATOR/ { print $0 }' file

awk takes a pattern and then a set of things to do with lines that match that pattern. So my pattern says the line starts with '*' AND the line does NOT start with '*INDICATOR'. Lines that match get processed by the curly braces, which in this case print out the entire line ($0 in awk parlance).

-Shawn

On 9/7/07, Bill Ricker [EMAIL PROTECTED] wrote:
>> Or, if you only have an old grep, but do have Perl, the following should work:
>
> The Andy and the ack project have built a better grep with perl.
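The pattern-then-action idea described above can be tried on a copy of the sample data. This is only a sketch: the line breaks in the sample file are an assumption (the archive flattened them), and the `{ print $0 }` action is left out because printing the line is awk's default action.

```shell
# Recreate the sample file from the thread (line breaks are assumed).
sample=$(mktemp)
cat > "$sample" <<'EOF'
*INDICATOR name1 zip1 geoid gender location
*INDICATOR name2 zip2
*geoid gender location
INDICATOR name3 zip3
*district court
EOF

# Two regex tests joined with && form one awk pattern;
# with no action block, awk prints every line the pattern selects.
result=$(awk '/^\*/ && !/^\*INDICATOR/' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Without the `&&`, awk would see two separate expressions side by side and treat them as string concatenation, which is not what the explanation describes.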
a simple question about grep
Hi, I have a text file whose content looks like below: *INDICATOR name1 zip1 geoid gender location *INDICATOR name2 zip2 *geoid gender location INDICATOR name3 zip3 *district court I want to pick up all lines starting with * but no INDICATOR followed. So for the example above, I want to pick up the following 2 lines: (the 3rd line) *geoid gender location (the last line) *district court How to construct regular expression with grep as a one-line command to achieve this goal? Or any other simple solutions? Thank you! Zhao ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
Re: a simple question about grep
grep '^\*' yourFile | grep -v '^\*INDICATOR' ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
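A quick way to check the pipeline above is to run it against a reconstruction of the sample data. The file layout below is an assumption, since the archive lost the original line breaks; `yourFile` is replaced by a temporary file.

```shell
# Recreate the sample file from the thread (line breaks are assumed).
sample=$(mktemp)
cat > "$sample" <<'EOF'
*INDICATOR name1 zip1 geoid gender location
*INDICATOR name2 zip2
*geoid gender location
INDICATOR name3 zip3
*district court
EOF

# Stage 1 keeps lines that start with a literal '*';
# stage 2 (-v inverts the match) drops those that continue with INDICATOR.
result=$(grep '^\*' "$sample" | grep -v '^\*INDICATOR')
printf '%s\n' "$result"

rm -f "$sample"
```

The single quotes matter: they stop the shell from expanding `*` as a filename glob before grep sees it.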
Re: a simple question about grep
egrep '^\*' FILE | egrep -v '^\*INDICATOR' I'm not sure how you'd combine them into one REGXP. I'm sure there's a better way in perl (GNU egrep will do perl with -P) On 9/6/07, Jerry [EMAIL PROTECTED] wrote: Hi, I have a text file whose content looks like below: *INDICATOR name1 zip1 geoid gender location *INDICATOR name2 zip2 *geoid gender location INDICATOR name3 zip3 *district court I want to pick up all lines starting with * but no INDICATOR followed. So for the example above, I want to pick up the following 2 lines: (the 3rd line) *geoid gender location (the last line) *district court How to construct regular expression with grep as a one-line command to achieve this goal? Or any other simple solutions? Thank you! Zhao ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/ ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
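On combining the two tests into one command without Perl-style lookahead: one option, not proposed in the thread itself, is sed, which can apply two rules in order. A minimal sketch, again assuming the sample file's line breaks:

```shell
# Recreate the sample file from the thread (line breaks are assumed).
sample=$(mktemp)
cat > "$sample" <<'EOF'
*INDICATOR name1 zip1 geoid gender location
*INDICATOR name2 zip2
*geoid gender location
INDICATOR name3 zip3
*district court
EOF

# -n suppresses automatic printing.
# Rule 1 discards lines starting with *INDICATOR and moves to the next line;
# rule 2 prints whatever remains that starts with '*'.
result=$(sed -n -e '/^\*INDICATOR/d' -e '/^\*/p' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

This is still two regular expressions, just inside one process; a true single-regex answer needs negative lookahead, which plain egrep does not provide.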
Re: a simple question about grep
> I want to pick up all lines starting with * but no INDICATOR followed.

I'd double-grep it, but I'm not in front of a *nix box to check:

    grep '^\*' filename | grep -v '^\*INDICATOR'

or something to that effect.
Re: a simple question about grep
egrep ^\*[^INDICATOR] filename.txt gurhan On 9/6/07, Jerry [EMAIL PROTECTED] wrote: Hi, I have a text file whose content looks like below: *INDICATOR name1 zip1 geoid gender location *INDICATOR name2 zip2 *geoid gender location INDICATOR name3 zip3 *district court I want to pick up all lines starting with * but no INDICATOR followed. So for the example above, I want to pick up the following 2 lines: (the 3rd line) *geoid gender location (the last line) *district court How to construct regular expression with grep as a one-line command to achieve this goal? Or any other simple solutions? Thank you! Zhao ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/ ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
Re: a simple question about grep
On 9/6/07, G.O. [EMAIL PROTECTED] wrote: egrep ^\*[^INDICATOR] filename.txt gurhan That didn't work for me, but this did: egrep '^\*[^I][^N][^I][^D][^I][^C][^A][^T][^O][^R]' filename.txt On 9/6/07, Jerry [EMAIL PROTECTED] wrote: Hi, I have a text file whose content looks like below: *INDICATOR name1 zip1 geoid gender location *INDICATOR name2 zip2 *geoid gender location INDICATOR name3 zip3 *district court I want to pick up all lines starting with * but no INDICATOR followed. So for the example above, I want to pick up the following 2 lines: (the 3rd line) *geoid gender location (the last line) *district court How to construct regular expression with grep as a one-line command to achieve this goal? Or any other simple solutions? Thank you! Zhao ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/ ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/ ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
Re: a simple question about grep
Tom Buskey wrote: On 9/6/07, *G.O.* [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] wrote: egrep ^\*[^INDICATOR] filename.txt That excludes lines beginning with * and any of the characters INDCATOR, i.e. *N, *D, etc will all be excluded. That didn't work for me, but this did: egrep '^\*[^I][^N][^I][^D][^I][^C][^A][^T][^O][^R]' filename.txt That will exclude a line that matches INDICATOR at any character, for example *aN perhaps this: egrep -P ^\*(?!INDICATOR) filename.txt Kent ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
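The two objections above are easy to reproduce. The sketch below uses a small made-up file (the lines `*Next item here` and `*gNome desktop notes` are hypothetical test data, not from the original post) to show which wanted lines each bracket-expression attempt loses.

```shell
sample=$(mktemp)
cat > "$sample" <<'EOF'
*INDICATOR name1
*Next item here
*geoid gender location
*gNome desktop notes
EOF

# [^INDICATOR] is ONE character that is none of I, N, D, C, A, T, O, R,
# so "*Next item here" is wrongly dropped because it starts with N.
one_class=$(grep -E '^\*[^INDICATOR]' "$sample")

# Ten separate classes reject a line if ANY position matches the
# corresponding letter, so "*gNome ..." is wrongly dropped (N in position 2).
per_position=$(grep -E '^\*[^I][^N][^I][^D][^I][^C][^A][^T][^O][^R]' "$sample")

printf 'one class:\n%s\n\nper position:\n%s\n' "$one_class" "$per_position"
rm -f "$sample"
```

The per-position form has a further limitation: it needs at least ten characters after the `*`, so short lines such as `*abc` can never match.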
Re: a simple question about grep
On 9/6/07, Kent Johnson [EMAIL PROTECTED] wrote:
> perhaps this:
> egrep -P ^\*(?!INDICATOR) filename.txt

Assuming your grep supports the Perl regular expression extensions (a useful thing to have), that should work.

Or, if you only have an old grep, but do have Perl, the following should work:

    perl -ne '/^\*(?!INDICATOR)/ and print' filename.txt

(Note -n rather than -p: with -p, Perl prints every input line regardless of the match.)

Otherwise, I agree with what others suggested, using two greps in a pipeline.

-- Ben
Re: a simple question about grep
Or, if you only have an old grep, but do have Perl, the following should work: The Andy and the ack project have built a better grep with perl. http://perladvent.pm.org/2006/5/ search.cpan.org/~petdance/ack/ack petdance.com/ack/ ack is pure Perl, so consistent across all platforms. Command name is 25% shorter. :-) Heck, it's 50% shorter compared to grep -r. use.perl.org/~petdance/journal/31763 http://www.youtube.com/watch?v=G1ynTV_E-5s [Andy petdance giving ack Lighting talk at OSCON 2007, 9min] http://www.perlfoundation.org/perl5/index.cgi?ack Disclaimer - I have been known to contribute a patch to ack once in a blue moon. -- Bill [EMAIL PROTECTED] [EMAIL PROTECTED] ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
Re: a question about GREP
Mike,

You are right. Some files on our server which I believed were plain text files turn out to be "data", based on what the file command shows. Weird! These files were moved from an AIX system to the current Red Hat system; could this have something to do with the file type?

Thank you.

Zhao

On 3/26/07, mike ledoux [EMAIL PROTECTED] wrote:
>> Steven's solution (listed below) only partially works, for reasons I don't know. By partially, I mean his solution can only find SOME files matching the search criteria.
>>
>>     find . -type f -name \*out\* | \
>>       xargs file | \
>>       awk '/ASCII/ { sub(/:/, ""); print $1}' | \
>>       xargs grep -l zip > zip.txt
>
> If you run 'find . -type f -name '*out*' -print0 | xargs -0 file' I bet some of the files you are calling plain text files are not ASCII text files, which is what the above is looking for. For example, a file that 'file' reports as "ISO-8859 English text" will almost certainly meet *your* criteria for plain text, but doesn't include "ASCII" anywhere in the output of 'file'.
Re: a question about GREP
On 3/27/07, Jerry [EMAIL PROTECTED] wrote:
> You are right. Some files on our server which I believe are plain text files turn out to be data, based on what file command shows.

The file(1) command just looks at the contents of a file, and looks for known patterns (also called magic numbers). For example, all GIF image files begin with the characters GIF87a or GIF89a. So it looks for patterns like that, and guesses at what the file is supposed to be. There isn't any actual file type metadata stored in a standard *nix filesystem.

> Weird! These files were moved from AIX system to the current Red Hat system, could this have something to with the file type?

It's more likely they happen to contain some characters which are outside the strict ASCII standard printable character set (A-Z, a-z, 0-9, space, keyboard punctuation). For example, maybe they contain some so-called high ASCII (8th bit set), which isn't a standard at all, but various platform-specific and mutually-incompatible extensions to ASCII. Older versions of the file(1) command would probably identify a UTF-16 encoded Unicode text file as data.

-- Ben
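To make the magic-number idea concrete, here is a toy content check written in shell. It is only an illustration of the principle, not how file(1) is implemented; `is_gif` is a hypothetical helper name, and the two test files are made up.

```shell
# is_gif: hypothetical helper. Succeeds when the first six bytes of the
# file are one of the two GIF signatures, whatever the file is called.
is_gif() {
    case "$(head -c 6 "$1" 2>/dev/null)" in
        GIF87a|GIF89a) return 0 ;;
        *)             return 1 ;;
    esac
}

tmp=$(mktemp -d)
printf 'GIF89a\001\002\003' > "$tmp/picture.txt"  # GIF header, misleading name
printf 'plain text\n'       > "$tmp/notes.gif"    # text, misleading name

if is_gif "$tmp/picture.txt"; then first=gif; else first=other; fi
if is_gif "$tmp/notes.gif";   then second=gif; else second=other; fi
printf 'picture.txt: %s\nnotes.gif: %s\n' "$first" "$second"

rm -rf "$tmp"
```

The file name plays no part in the decision, which is why moving files between systems does not by itself change what file(1) reports.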
Re: a question about GREP
Do you have a source tree that has already proven to be buildable for the machine in question, independent of these new drivers. That would help during triage of this problem... ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
Re: a question about GREP
Hi,

Thank you for all your help and time. I really appreciate it.

On our server, which runs Red Hat Enterprise Linux AS release 3 (Taroon Update 8), Lloyd ([EMAIL PROTECTED])'s solution works:

    find -type f -name '*out*' | xargs grep -wli zip > zip.txt

Question: -type f limits to regular files; does the so-called regular file strictly mean plain text files?

Also, the solution from Ben (w/ adding the search pattern, which is zip) and Bill (w/ moving zip ahead of .) works:

    grep -lwir --include=\*out\* zip . > zip.txt

---

Steven's solution (listed below) only partially works, for reasons I don't know. By partially, I mean his solution can only find SOME files matching the search criteria.

    find . -type f -name \*out\* | \
      xargs file | \
      awk '/ASCII/ { sub(/:/, ""); print $1}' | \
      xargs grep -l zip > zip.txt

---

Kevin, in your solution (listed below), why are 2 directory names used? Could you please explain a bit to me? Thank you.

    find your-dirname1 your-dirname2 -name \*out\* \
      -exec perl -e 'undef $/; $filename=$ARGV[0]; $_=<>; exit(!(-T $filename && /\bzip\b/))' \{\} \; -print \
      > zip.txt

BTW, yes, I'm serious about the plain text files part. And thank you for your favorite aliases; I've not tested them though.

---

Bill, your other solution (listed below), based on Steve's, doesn't work :-(

    find . -name '*out*' -exec file '{}' '|' grep -q ASCII ';' -print0 \
      | xargs -0 grep -wli zip > zip.txt

Again, thank you guys all!

Zhao
Re: a question about GREP
[off-list] On 3/26/07, Michael ODonnell [EMAIL PROTECTED] wrote: Do you have a source tree that has already proven to be buildable for the machine in question, independent of these new drivers. That would help during triage of this problem... FYI, I think you replied to the wrong thread. -- One day I feel I'm ahead of the wheel / And the next it's rolling over me -- Rush, Far Cry ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
Re: a question about GREP
Jerry writes:
> Kevin, in your solution (listed below), why are there 2 directory names are used? Could you please explain a bit to me? Thank you.
>
>     find your-dirname1 your-dirname2 -name \*out\* \
>       -exec perl -e 'undef $/; $filename=$ARGV[0]; $_=<>; exit(!(-T $filename && /\bzip\b/))' \{\} \; -print \
>       > zip.txt
>
> BTW, yes, I'm serious about the plain text files part.

By your-dirname1 and your-dirname2 I mean the directories *you* are interested in searching. For example:

    find /usr/src /media/usbdrive /home/jerry/src/foo -exec perl ...

In your case, you might want to begin searching from the current directory, which is ".":

    find . -exec perl ...

Regards,

--kevin
-- 
GnuPG ID: B280F24E    Never could stand that dog.
alumni.unh.edu!kdc    -- Tom Waits
Re: a question about GREP
On 3/26/07, Jerry [EMAIL PROTECTED] wrote:
> Question: -type f limits to regular file, does the so-called regular file strictly mean plain text files?

No. find -type f will include binary files, executables, and such. The regular file part means that it is just a file containing user data -- a bag of bytes, as one person put it. As opposed to a directory, a symbolic link, a named pipe (FIFO), or a device node.

> Also, solution from Ben (w/ adding search pattern, which is zip) and Bill (w/ moving zip ahead of .) works:
>
>     grep -lwir --include=\*out\* zip . > zip.txt

You may want to add -I to that as well, to exclude binary files, as Kevin Clark suggested. That is:

    grep -lwirI --include=\*out\* zip . > zip.txt

> Kevin, in your solution (listed below), why are there 2 directory names are used? Could you please explain a bit to me?

I think he's just demonstrating that you can specify multiple directory names to find(1). You can specify only one, if you prefer. On some find(1) implementations, you can specify no directories at all, which implies the current directory.

-- 
One day I feel I'm ahead of the wheel / And the next it's rolling over me -- Rush, Far Cry
Re: a question about GREP
On Mon, Mar 26, 2007 at 11:38:31AM -0400, Jerry wrote:
> Lloyd ([EMAIL PROTECTED])'s solution works:
>
>     find -type f -name '*out*' | xargs grep -wli zip > zip.txt
>
> Question: -type f limits to regular file, does the so-called regular file strictly mean plain text files?

It does not. "Regular file" means not a special file, directory, named pipe, symbolic link, or socket. Plain text files are a subset of regular files. If you just want to omit non-text files from the output, something like:

    find . -type f -name '*out*' -print0 | xargs -0 grep -wliI zip > zip.txt

will probably do what you want. The -I option to GNU grep tells it to treat binary files as if they contain no matches. The -print0 to find and -0 to xargs improve handling of file names that contain whitespace.

> Steven's solution (listed below) only partially works, for reasons I don't know. By partially, I mean his solution can only find SOME files matching the search criteria.
>
>     find . -type f -name \*out\* | \
>       xargs file | \
>       awk '/ASCII/ { sub(/:/, ""); print $1}' | \
>       xargs grep -l zip > zip.txt

If you run 'find . -type f -name '*out*' -print0 | xargs -0 file' I bet some of the files you are calling plain text files are not ASCII text files, which is what the above is looking for. For example, a file that 'file' reports as "ISO-8859 English text" will almost certainly meet *your* criteria for plain text, but doesn't include "ASCII" anywhere in the output of 'file'.

-- 
[EMAIL PROTECTED] OpenPGP KeyID 0x57C3430B
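The whitespace point above can be seen with a throwaway directory tree. The file and directory names below are invented for the demonstration, and -I is left out only to keep the sketch to options that more grep implementations accept; it can be added back on GNU grep.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/sub dir"
printf 'my zip code\n' > "$tmp/sub dir/first out file.txt"
printf 'unzip this\n'  > "$tmp/sub dir/second out.txt"  # "unzip" is not the word "zip"
printf 'zip\n'         > "$tmp/other.txt"               # name does not contain "out"

# -print0 ends each name with a NUL byte and xargs -0 splits only on NUL,
# so the spaces inside the names survive intact.
result=$(cd "$tmp" && find . -type f -name '*out*' -print0 | xargs -0 grep -wli zip)
printf '%s\n' "$result"

rm -rf "$tmp"
```

With plain `find ... | xargs grep`, the name `./sub dir/first out file.txt` would be split at each space into pieces that do not exist, and grep would report errors instead of a match.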
Re: a question about GREP
On 23 Mar 2007 22:20:24 -0400, Kevin D. Clark [EMAIL PROTECTED] wrote:
>> Holy crap! Where's Perl's oft-decried extreme conciseness? ;-)
>
> My solution comes from my experience, and I was going for correctness, portability, and clarity, in that order.

I can't resist pointing out that Perl isn't guaranteed on Unix systems, either. ;-)

> By the way, did you forget to add --binary-files=without-match to your solution? The original poster asked for text files only.

Yes. As Bill Freeman pointed out, I also left out the search pattern! (I was cut-and-pasting from an xterm where I was actually testing things, so I'm not sure how I managed to do that, but I guess I found a way.)

They need to build a script interpreter into email. ;-) Oh wait, Microsoft already did, it was called Outlook 2000 and we know how that turned out..

-- 
One day I feel I'm ahead of the wheel / And the next it's rolling over me -- Rush, Far Cry
Re: a question about GREP
Ben Scott writes: They need to build a script interpreter into email. ;-) Oh wait, Microsoft already did, it was called Outlook 2000 and we know how that turned out.. Ho ho ho...true enough. (-: --kevin -- GnuPG ID: B280F24E Never could stand that dog. alumni.unh.edu!kdc -- Tom Waits ___ gnhlug-discuss mailing list gnhlug-discuss@mail.gnhlug.org http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/
Re: a question about GREP
Ben Scott writes:
> grep -lwir --include=\*out\* . > zip.txt

Close. You've left out what he's searching for:

    grep -lwir --include=\*out\* . zip > zip.txt

Of course, this doesn't have the subtlety that Steven W. Orr added to limit it to text files, as Jerry mentioned, but didn't seem to be trying to do.

Note that all the versions using find and xargs have issues in modern times, when it is highly likely that there are files whose names include spaces. find's -print0 option combined with xargs's -0 (zero, not a capital letter) option take care of this. Steven's awk script would need further work to extract filenames containing spaces. Or, you can use find's -exec:

    find . -name '*out*' -exec file '{}' '|' grep -q ASCII ';' -print0 \
      | xargs -0 grep -wli zip > zip.txt

[ -exec treats everything up to the next semicolon (which must be quoted so that the shell will pass it to find) as a command (pipe) to run, except that an argument consisting of a matched pair of curly braces (which must be quoted against shell interpretation) is replaced, in the command run, by the name of the file under consideration. Things in the command, like the pipe symbol, that are special to the shell, must be quoted. (find never sees the quotes, so they're not quoted in the sub-shell running the command.) If the command fails (returns non-zero status) -exec moves on to the next filename, so this one doesn't get printed. ]

Bill
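One caveat on the -exec idea: find runs the command directly rather than through a shell, so a quoted '|' reaches the program as an ordinary argument and no pipeline is formed. A common workaround is to hand the pipeline to `sh -c` explicitly. The sketch below assumes that approach; because file(1) may not be installed everywhere, it swaps in a crude stand-in test (no non-printable bytes in the first 512 bytes) where the thread used `file ... | grep -q ASCII`. All file names are invented.

```shell
tmp=$(mktemp -d)
printf 'zip code\n'        > "$tmp/a out.txt"      # text, contains the word zip
printf 'zip \001\002\n'    > "$tmp/bin_out.dat"    # contains control bytes
printf 'nothing here\n'    > "$tmp/out_nozip.txt"  # text, no match

# sh -c receives the pipeline as one string; the trailing "sh" becomes $0
# and {} becomes $1. With file(1) available, the test could instead be:
#     sh -c 'file "$1" | grep -q ASCII' sh {} \;
result=$(cd "$tmp" && find . -type f -name '*out*' \
    -exec sh -c '[ "$(head -c 512 "$1" | LC_ALL=C tr -d "[:print:][:space:]" | wc -c)" -eq 0 ]' sh {} \; \
    -print0 | xargs -0 grep -wli zip)
printf '%s\n' "$result"

rm -rf "$tmp"
```

Passing the file name as `$1` rather than splicing `{}` into the script text keeps names with spaces or quotes from being reinterpreted by the inner shell.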
a question about GREP
Hi,

The manual of the grep command on Red Hat states that:

    -R, -r, --recursive
        read all files in each directory, recursively; this is equivalent to the -d recurse option
    --include=PATTERN
        recurse in directories only searching file matching PATTERN
    --exclude=PATTERN
        recurse in directories skip file matching PATTERN

For the --include or --exclude option, what is "file matching PATTERN" supposed to mean? I suppose it means the file name matches PATTERN, not the file content matches PATTERN, am I right?

I'm asking this question because I'm trying to do the following thing:

Find all plain text files whose file names contain "out" and whose contents contain "zip" (in the form of a whole word), and then output these file names to a file called zip.txt. (These plain text files are located in sub-directories at different levels.)

I tried the following 2 commands to achieve the goal above, but neither worked. Anyone care to spot the error? I suspect most likely it's because my usage/understanding of the --include option is wrong.

    grep -Hwli -r --include=out zip * > zip.txt
    grep -Hwli --include=out zip * > zip.txt

Sorry if this question sounds stupid. Thank you for your time.

Zhao
Re: a question about GREP
Scott,

Thank you for your solution. But it didn't work on our system. :-(

Also, doesn't grep stand for global regular expression print?

Zhao

On 3/23/07, Scott A. Valcourt [EMAIL PROTECTED] wrote:
> Zhao-
>
> Grep stands for global replace, though it is most often used as a global find of a text pattern in UNIX.
>
>> I'm asking this question, because I'm trying to do the following thing: Find out all plain text files whose file names contain out and whose contents containing zip (in the form of whole word), and then output these files names to a file called zip.txt. (These plain text files are located in the sub-directories at different levels)
>
> Well, one way to do this in UNIX is the following:
>
>     grep -r zip *out*.* > zip.txt
>
> I think this is what you want to do.
>
> -Scott
>
> Scott Valcourt                   email: [EMAIL PROTECTED]
> Computer Science Department      phone: (603) 862-4489
> University of New Hampshire      fax:   (603) 862-3493
> 310 Nesmith Hall
> Durham, NH 03824
Re: a question about GREP
Jerry writes:
> Find out all plain text files whose file names contain out and whose contents containing zip (in the form of whole word), and then output these files names to a file called zip.txt. (These plain text files are located in the sub-directories at different levels)

Here is how I would do this:

    find your-dirname1 your-dirname2 -name \*out\* \
      -exec perl -e 'undef $/; $filename=$ARGV[0]; $_=<>; exit(!(-T $filename && /\bzip\b/))' \{\} \; -print \
      > zip.txt

Notes:

1: I assume you were serious about the plain text files part. This is what the -T bit in the Perl program looks for. No binary files, right?

2: I assume you were serious about the zip part, so a word like unzip would not qualify.

3: The Perl code has some warts, but I was trying for clarity here.

4: The find program is very powerful and you can never go wrong learning about its features.

Regards,

--kevin

PS I thought you might like some of my favorite aliases:

    # Author: kevin d. clark
    # Finds text files in the specified directories. These use Perl's -T
    # and -B tests. Here's some relevant documentation from the perlfunc
    # page:
    #
    #    The -T and -B switches work as follows. The first block or
    #    so of the file is examined for odd characters such as strange
    #    control codes or characters with the high bit set. If too many
    #    strange characters (>30%) are found, it's a -B file, otherwise
    #    it's a -T file. Also, any file containing null in the
    #    first block is considered a binary file. [...] Both -T and
    #    -B return true on a null file...
    #
    # Caveat programmer.
    #
    # Find text files
    txtfind () {
      if [ $# -eq 0 ] ; then
        txtfind .
      else
        perl -MFile::Find -e 'find(sub{print "$File::Find::name\n" if (-f && -T);}, @ARGV);' "${@}"
      fi
    }

    # Find DOS-formatted text files
    dostxtfind () {
      if [ $# -eq 0 ] ; then
        dostxtfind .
      else
        perl -MFile::Find -e 'find(sub{
            $crlf = 0;
            if (($f = -f) && ($T = -T)) {
              @ARGV=($_);
              binmode(ARGV);
              (/\r\n/ && $crlf++) while(<>);
            }
            print "$File::Find::name\n" if ($f && $T && $crlf);
          }, @ARGV)' "${@}"
      fi
    }

    # Find binary files
    binfind () {
      if [ $# -eq 0 ] ; then
        binfind .
      else
        perl -MFile::Find -e 'find(sub{print "$File::Find::name\n" if (-f && -B);}, @ARGV);' "${@}"
      fi
    }

-- 
GnuPG ID: B280F24E    Never could stand that dog.
alumni.unh.edu!kdc    -- Tom Waits
Re: a question about GREP
On Friday, Mar 23rd 2007 at 15:41 -0400, quoth Jerry:

=The manual of grep command on Red Hat states that:
=
=-R, -r, --recursive
=    read all files in each directory, recursively, this is equivalent to -d recurse option
=--include=PATTERN
=    recurse in directories only searching file matching PATTERN
=--exclude=PATTERN
=    recurse in directories skip file matching PATTERN
=
=For the --include or --exclude option, what is file matching PATTERN
=supposed to mean? I supposed it means file name match PATTERN, not file
=content match patten, am I right?
=
=I'm asking this question, because I'm trying to do the following thing:
=
=Find out all plain text files whose file names contain out and whose
=contents containing zip (in the form of whole word), and then output
=these files names to a file called zip.txt. (These plain text files are
=located in the sub-directories at different levels)
=
=I tried the following 2 lines of commands to try to achieve the goal above,
=but neither worked. Anyone cares to spot the error? I suspect most likely
=it's because my usage/understanding of --include option is wrong.
=
=grep -Hwli -r --include=out zip * > zip.txt
=
=grep -Hwli --include=out zip * > zip.txt
=
=Sorry if this question sounds stupid.

That's the dumbest question I ever heard! (just kidding)

It seems to me that you need grep find awk xargs etc... Tell me if this helps:

    find . -type f -name \*out\* | \
      xargs file | \
      awk '/ASCII/ { sub(/:/, ""); print $1}' | \
      xargs grep -l zip > zip.txt

Line 1 gets the list of files whose name contains the word out.
Line 2 takes that list and runs the file command on each file.
Line 3 takes the output of file and prints out column 1 (without the colon at the end) if the word ASCII is found.
Line 4 takes the previous output and searches those files for the word zip; the output then goes into zip.txt.

Easy peasy japaneezy (I shall now buff my nails.)

-- 
steveo at syslang.net
Re: a question about GREP
On Friday, Mar 23rd 2007 at 16:33 -0400, quoth Jerry:

=Also, doesn't Grep stand for global regular expression print?

General Regular Expression Processor

-- 
steveo at syslang.net
Re: a question about GREP
On Fri, 2007-03-23 at 15:41 -0400, Jerry wrote:
> Hi,
> The manual of grep command on Red Hat states that:
>
>   -R, -r, --recursive
>       read all files in each directory, recursively, this is
>       equivalent to -d recurse option
>   --include=PATTERN
>       recurse in directories only searching file matching PATTERN
>   --exclude=PATTERN
>       recurse in directories skip file matching PATTERN
>
> For the --include or --exclude option, what is "file matching PATTERN"
> supposed to mean? I supposed it means file name match PATTERN, not file
> content match pattern, am I right?
>
> I'm asking this question, because I'm trying to do the following thing:
>
> Find out all plain text files whose file names contain "out" and whose
> contents containing "zip" (in the form of whole word), and then output
> these files names to a file called zip.txt. (These plain text files are
> located in the sub-directories at different levels)

Would this approach work?

    find -type f -name '*out*' | xargs grep -wli zip > zip.txt

find recurses through the directories and creates a list of files. xargs feeds that list to grep as arguments. grep examines the files looking for the word zip, ignoring case, and writes the names of the matching files, which get redirected into zip.txt.

> I tried the following 2 lines of commands to try to achieve the goal above,
> but neither worked. Anyone cares to spot the error? I suspect most likely
> it's because my usage/understanding of --include option is wrong.
>
> grep -Hwli -r --include=out zip * > zip.txt
>
> grep -Hwli --include=out zip * > zip.txt
>
> Sorry if this question sounds stupid. Thank you for your time.
>
> Zhao

-- 
Lloyd Kvam
Venix Corp
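One caveat with a plain find | xargs pipeline is that xargs splits its input on whitespace, so a file name containing a space reaches grep as two separate arguments. The following is a minimal sketch of a NUL-delimited variant, assuming a find and xargs that support -print0 and -0 (GNU and BSD versions do). The demo directory and file names are hypothetical and exist only to make the example self-contained.

```shell
#!/bin/sh
# Hypothetical demo tree; all names are illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
printf 'the zip code\n' > "$demo/sub dir/my output.txt"   # name and content match
printf 'zipper only\n'  > "$demo/checkout.log"            # name matches, no whole word "zip"
printf 'zip here too\n' > "$demo/notes.txt"               # content matches, name does not

# -print0 / -0 delimit file names with NUL bytes, so spaces in names are safe.
# -w matches "zip" as a whole word, -l lists file names, -i ignores case.
(cd "$demo" && find . -type f -name '*out*' -print0 | xargs -0 grep -wli zip > zip.txt)

found=$(cat "$demo/zip.txt")
printf '%s\n' "$found"
```

Only the file whose name contains "out" and whose contents contain the whole word "zip" ends up in zip.txt, even though its path contains spaces.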
Re: a question about GREP
> because my usage/understanding of --include option is wrong.
>
> grep -Hwli -r --include=out zip * > zip.txt
>
> grep -Hwli --include=out zip * > zip.txt

It seems to be more of a glob pattern. I played around a little on one of my boxes, and I believe something more like --include=*out* for the include option will work.

-Shawn
Re: a question about GREP
On 3/23/07, Jerry [EMAIL PROTECTED] wrote:
> For the --include or --exclude option, what is "file matching PATTERN"
> supposed to mean?

Typically, it's a shell glob. My testing appears to confirm that.

> I supposed it means file name match PATTERN, not file content match
> pattern, am I right?

Yah.

> Find out all plain text files whose file names contain "out" and whose
> contents containing "zip" (in the form of whole word), and then output
> these files names to a file called zip.txt.

I think this should work:

    grep -lwir --include=\*out\* zip . > zip.txt

Using long options:

    grep --files-with-matches --word-regexp --ignore-case \
         --recursive --include=\*out\* zip . > zip.txt

The backslashes before the stars (\*out\*) are needed because otherwise the shell will try to expand them, which may prevent grep from seeing them.

> grep -Hwli -r --include=out zip * > zip.txt

The biggest problem there is that the include PATTERN is just "out", which means the filename would have to be exactly "out", not "without" or "outside". With the stars around it, as I did, it will match anything (including nothing) on either side as well.

The * you give for the file name will be expanded by the shell, which may or may not give you what you want. I used just ".", which is the current directory. Let grep handle getting the file list from the current directory, since you're using a recursive file search (-r) anyway.

Also, a few minor superfluous things: -l implies -H, so you don't need to specify both. And you don't need to quote zip, since zip does not contain any shell meta-characters.

-- 
One day I feel I'm ahead of the wheel / And the next it's rolling over me
  -- Rush, Far Cry
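The difference between --include=out and --include='*out*' is easy to check on a scratch directory. This is a small sketch, assuming GNU grep (the --include option is a GNU extension); the directory layout and file names are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical demo tree; requires GNU grep for --include.
demo=$(mktemp -d)
mkdir -p "$demo/logs/deep"
printf 'ZIP code 03060\n' > "$demo/logs/deep/without.txt"  # *out* name, whole word (any case)
printf 'zip\n'            > "$demo/out"                    # name is exactly "out"
printf 'zip\n'            > "$demo/readme.txt"             # name has no "out"
printf 'zipped archive\n' > "$demo/outside.txt"            # no whole word "zip"

cd "$demo" || exit 1

# Quoted glob: the shell passes it to grep untouched (same effect as \*out\*).
globbed=$(grep -lwir --include='*out*' zip . | LC_ALL=C sort)

# Bare "out": only a file named exactly "out" is searched.
exact=$(grep -lwir --include=out zip . | LC_ALL=C sort)

printf 'with *out*:\n%s\n' "$globbed"
printf 'with out:\n%s\n' "$exact"
```

With the glob, both ./logs/deep/without.txt and ./out are listed; with the bare word, only ./out is.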
Re: a question about GREP
On 23 Mar 2007 17:01:40 -0400, Kevin D. Clark [EMAIL PROTECTED] wrote:
> Here is how I would do this:
>
> find your-dirname1 your-dirname2 -name \*out\* \
>   -exec perl -e 'undef $/; $filename=$ARGV[0]; $_=<>; exit(!(-T $filename && /\bzip\b/))' \{\} \; -print \
>   > zip.txt

Holy crap! Where's Perl's oft-decried extreme conciseness? ;-)

I much prefer the all-in-one approach:

    grep -lwir --include=\*out\* zip . > zip.txt

Yah, the find command is very useful, since it's generic, and thus works in very complicated situations when nothing else will. But for more common cases, the convenience features of modern *nix tools really do save a lot of work. :-)
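For systems whose grep lacks -r and --include, there is a middle ground between the two commands above: let find do the recursion and name matching, and use grep -q purely as a test inside -exec. The sketch below assumes a grep that supports -w (not required by POSIX, but widely available); unlike the Perl one-liner it does not check that the file is text. The demo files are hypothetical.

```shell
#!/bin/sh
# Hypothetical demo tree; names are illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/a"
printf 'zip\n'     > "$demo/a/out1.txt"   # matches name and content
printf 'nothing\n' > "$demo/a/out2.txt"   # matches name only

# grep -q prints nothing and only sets its exit status;
# find runs -print only when the -exec command succeeded.
found=$(cd "$demo" && find . -type f -name '*out*' -exec grep -qwi zip {} \; -print)

printf '%s\n' "$found"
```

This starts one grep process per candidate file, so it is slower than xargs on large trees, but it copes with any file name.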
Re: a question about GREP
On Fri, Mar 23, 2007 at 05:12:04PM -0400, Steven W. Orr wrote:
> On Friday, Mar 23rd 2007 at 16:33 -0400, quoth Jerry:
>
> =Also, doesn't Grep stand for global regular expression print?
>
> General Regular Expression Processor

Jerry is correct. The name grep comes from the ed command g/regex/p: (search) global(ly for lines matching the) regular expression(, and) print.

-- 
[EMAIL PROTECTED] OpenPGP KeyID 0x57C3430B
Holder of Past Knowledge CS, O-
"Touch passion when it comes your way, Stephen. It's rare enough as it is, don't walk away when it calls you by name." (Marcus Cole)
Re: a question about GREP
Ben Scott writes:
> Holy crap! Where's Perl's oft-decried extreme conciseness? ;-)

From my perspective, I deal with unix-flavored systems all the time that have feature-lacking grep implementations. As recently as three weeks ago, I was working on a system without any fancy GNU grep. This system would happily grep through binary files and display the output on your screen, thus hosing your terminal.

My solution comes from my experience, and I was going for correctness, portability, and clarity, in that order. I realize this is a Linux list, but I don't always live in that world.

By the way, did you forget to add --binary-files=without-match to your solution? The original poster asked for text files only.

Kind Regards,

--kevin
-- 
GnuPG ID: B280F24E                Never could stand that dog.
alumni.unh.edu!kdc                  -- Tom Waits
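To illustrate the --binary-files=without-match point, here is a rough sketch assuming GNU grep, where -I is the short form of that option. The two demo files are hypothetical; the second contains a NUL byte, which GNU grep treats as a sign of binary data.

```shell
#!/bin/sh
# Hypothetical demo files; requires GNU grep.
demo=$(mktemp -d)
printf 'zip in plain text\n'       > "$demo/text_out.txt"
printf 'zip\000binary payload\n'   > "$demo/binary_out.dat"   # NUL byte marks it as binary

cd "$demo" || exit 1

# Without the option, a binary file that contains the word can still be listed.
all=$(grep -lwir --include='*out*' zip . | LC_ALL=C sort)

# With the option, grep assumes that binary files never match.
text_only=$(grep -lwir --binary-files=without-match --include='*out*' zip . | LC_ALL=C sort)

printf 'all candidates:\n%s\n' "$all"
printf 'text files only:\n%s\n' "$text_only"
```

Note that grep's idea of "binary" is a heuristic of its own and need not agree with what the file command or Perl's -T test would report.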
Re: a question about GREP
Here is another copy of my favorite shell functions, since I kind of sent out garbled versions the first time. I hope others find these to be useful.

--kevin

# txtfind, dostxtfind, and binfind all use Perl's -B and -T file
# test operations.
#
# Here are some relevant sections from the perlfunc documentation:
#
#   The -T and -B switches work as follows. The first block or
#   so of the file is examined for odd characters such as strange
#   control codes or characters with the high bit set. If too many
#   strange characters (>30%) are found, it's a -B file, otherwise
#   it's a -T file. Also, any file containing null in the
#   first block is considered a binary file
#   ...
#   Both -T and -B return true on a null file.
#
# Caveat programmer.

# List regular files that Perl considers text.
txtfind () {
    if [ $# -eq 0 ] ; then
        txtfind .
    else
        perl -MFile::Find -e 'find(sub{print "$File::Find::name\n" if (-f && -T);}, @ARGV);' "$@"
    fi
}

# List text files that contain at least one CRLF line ending.
dostxtfind () {
    if [ $# -eq 0 ] ; then
        dostxtfind .
    else
        perl -MFile::Find -e 'find(sub{
            $crlf = 0; $f = -f; $T = -T;
            @ARGV=($_); binmode(ARGV);
            ((/\r\n/) && $crlf++) while(<>);
            print "$File::Find::name\n" if ($f && $T && $crlf);
        }, @ARGV)' "$@"
    fi
}

# List regular files that Perl considers binary.
binfind () {
    if [ $# -eq 0 ] ; then
        binfind .
    else
        perl -MFile::Find -e 'find(sub{print "$File::Find::name\n" if (-f && -B);}, @ARGV);' "$@"
    fi
}

-- 
GnuPG ID: B280F24E                Never could stand that dog.
alumni.unh.edu!kdc                  -- Tom Waits