recursive grep and openoffice
Thanks for all of your replies. I didn't know that tools such as tracker would search within OpenOffice documents.

With respect to the command line, I have fixed on

    find . -name *.odt -exec sh -c 'unzip -c "{}" content.xml | grep "string-being sought" > /dev/null' \; -print

but it returns immediately and seems to do no searching. To my amateur eyes, it looks like it should work. I've done some searching, but to no avail.

Anybody got ideas?

John
Re: recursive grep and openoffice
John O Laoi wrote:
> find . -name *.odt -exec sh -c 'unzip -c "{}" content.xml | grep "string-being sought" > /dev/null' \; -print

It works for me. Maybe you should quote *.odt: '*.odt'. And try just

    find . -name '*.odt'

to see whether the .odt files are found.

Rainer

--
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Re: recursive grep and openoffice
In 1f1816a90903180556k56e3e592qa14c55d1c3193...@mail.gmail.com, John O Laoi wrote:
> Thanks for all of your replies. I didn't know that tools such as tracker would search with openoffice document.
>
> With respect to the command line, I have fixed on
>
>     find . -name *.odt -exec sh -c 'unzip -c "{}" content.xml | grep "string-being sought" > /dev/null' \; -print

I think I'd rewrite it as:

    find . \
        -name '*.odt' \
        -exec sh -c 'unzip -c $1 content.xml | grep -q regex' \{} \; \
        -print

I'm not sure what the rules are for find substituting {} within another argument, so it seems best to write it as a separate argument. If you have anything that matches *.odt in the current directory, the find won't work[1] unless you quote it. You might also need to throw double-quotes around the regex, depending on its contents.

--
Boyd Stephen Smith Jr.
b...@iguanasuicide.net
ICQ: 514984  YM/AIM: DaTwinkDaddy
http://iguanasuicide.net/

[1] It may work, but it won't actually be searching for files with names matching the glob *.odt.
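One detail matters when the filename is passed as a separate argument: `sh -c` assigns the first argument after the script string to `$0`, and only the next one to `$1`. The following is a minimal sketch of that behaviour, not anything from the original post; the helper name `show_args` and the placeholder word `sh` are made up for illustration.

```shell
# show_args is a hypothetical helper: it prints how sh -c numbers the
# arguments that follow the script string.
show_args() {
    sh -c 'printf "0=%s 1=%s\n" "$0" "$1"' "$@"
}

# With only the filename passed, it lands in $0 and $1 stays empty.
show_args 'my file.odt'

# With a placeholder first, the filename lands in $1 as intended.
show_args sh 'my file.odt'
```

Applied to the find command, this suggests inserting a placeholder before the braces, along the lines of `-exec sh -c 'unzip -c "$1" content.xml | grep -q regex' sh {} \;`, so that `$1` really holds the filename.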
Re: recursive grep and openoffice
On Wed, Mar 18, 2009 at 11:19:20AM -0500, Boyd Stephen Smith Jr. wrote:
> In 1f1816a90903180556k56e3e592qa14c55d1c3193...@mail.gmail.com, John O Laoi wrote:
>> With respect to the command line, I have fixed on
>>
>>     find . -name *.odt -exec sh -c 'unzip -c "{}" content.xml | grep "string-being sought" > /dev/null' \; -print
>
> I think I'd rewrite it as:
>
>     find . \
>         -name '*.odt' \
>         -exec sh -c 'unzip -c $1 content.xml | grep -q regex' \{} \; \
>         -print
>
> I'm not sure what the rules are for find substituting {} within another argument, so it seems best to write it as a separate argument. If you have anything that matches *.odt in the current directory, the find won't work[1] unless you quote it. You might also need to throw double-quotes around the regex, depending on its contents.
> ...
> [1] It may work, but it won't actually be searching for files with names matching the glob *.odt.

It _may_ also work if unquoted, since bash will leave the literal * as is if there's no match in the current directory, and so find will see it as intended. I guess that's a good thing, but it can be confusing.

Bash will also leave {} untouched since it doesn't expand to anything, so I don't see any point in quoting it as shown.

Ken

--
Ken Irving
Re: recursive grep and openoffice
In 20090318164208.ga14...@localhost, Ken Irving wrote:
> On Wed, Mar 18, 2009 at 11:19:20AM -0500, Boyd Stephen Smith Jr. wrote:
>> I think I'd rewrite it as:
>>
>>     find . \
>>         -name '*.odt' \
>>         -exec sh -c 'unzip -c $1 content.xml | grep -q regex' \{} \; \
>>         -print
>>
>> I'm not sure what the rules are for find substituting {} within another argument, so it seems best to write it as a separate argument. If you have anything that matches *.odt in the current directory, the find won't work[1] unless you quote it. You might also need to throw double-quotes around the regex, depending on its contents.
>>
>> [1] It may work, but it won't actually be searching for files with names matching the glob *.odt.
>
> It _may_ also work if unquoted, since bash will leave the literal * as is if there's no match in the current directory, and so find will see it as intended. I guess that's a good thing, but it can be confusing.

I did qualify my statement with "if you have anything that matches *.odt in the current directory". Also, I think it might depend on your glob expansion settings in bash. ISTR an option to expand non-matching globs to either an empty argument or no argument at all.

> Bash will also leave {} untouched since it doesn't expand to anything, so I don't see any point in quoting it as shown.

Bash will, but I've heard other shells will not--something about considering {} an empty command group. I'm pretty sure the relevant standards require {} to NOT be recognized as anything special since neither '{' nor '}' is an operator character, but I'd have to read them again to be sure.

--
Boyd Stephen Smith Jr.
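The glob setting recalled above is most likely bash's `nullglob` option, which removes a non-matching pattern entirely (bash also has `failglob`, which turns it into an error). A small sketch, assuming bash is installed; `count_matches` is a hypothetical helper that counts the words an unquoted `*.odt` becomes in an empty scratch directory.

```shell
# count_matches (hypothetical): report how many words the unquoted glob
# *.odt turns into inside an empty scratch directory.
# $1 is "-u" (nullglob unset, the default) or "-s" (nullglob set).
count_matches() {
    scratch=$(mktemp -d) || return 1
    (cd "$scratch" && bash -c 'shopt "$1" nullglob; set -- *.odt; echo "$#"' bash "$1")
    rm -rf "$scratch"
}

count_matches -u   # default: the literal *.odt survives as one word
count_matches -s   # nullglob: the pattern vanishes, leaving no words
```

With `nullglob` set and no matching file in the current directory, an unquoted `find . -name *.odt` would hand find a bare `-name` with no pattern at all, which is one more reason to quote it.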
Re: recursive grep and openoffice
On Wed, Mar 18, 2009 at 01:45:42PM -0500, Boyd Stephen Smith Jr. wrote:
> In 20090318164208.ga14...@localhost, Ken Irving wrote:
>> On Wed, Mar 18, 2009 at 11:19:20AM -0500, Boyd Stephen Smith Jr. wrote:
>>> I think I'd rewrite it as:
>>>
>>>     find . \
>>>         -name '*.odt' \
>>>         -exec sh -c 'unzip -c $1 content.xml | grep -q regex' \{} \; \
>>>         -print
>>>
>>> I'm not sure what the rules are for find substituting {} within another argument, so it seems best to write it as a separate argument. If you have anything that matches *.odt in the current directory, the find won't work[1] unless you quote it. You might also need to throw double-quotes around the regex, depending on its contents.
>>>
>>> [1] It may work, but it won't actually be searching for files with names matching the glob *.odt.
>>
>> It _may_ also work if unquoted, since bash will leave the literal * as is if there's no match in the current directory, and so find will see it as intended. I guess that's a good thing, but it can be confusing.
>
> I did qualify my statement with "if you have anything that matches *.odt in the current directory". Also, I think it might depend on your glob expansion settings in bash. ISTR an option to expand non-matching globs to either an empty argument or no argument at all.

Understood. My point is just that folks may have success using bare glob patterns in some cases, but then get nailed when they end up matching something. My recommendation is to escape the glob metachar itself, e.g.,

    find . -name \*.odt

as I find that this sort of hints at what's going on, but YMMV.

>> Bash will also leave {} untouched since it doesn't expand to anything, so I don't see any point in quoting it as shown.
>
> Bash will, but I've heard other shells will not--something about considering {} an empty command group. I'm pretty sure the relevant standards require {} to NOT be recognized as anything special since neither '{' or '}' is an operator character, but I'd have to read them again to be sure.
Another poster mentioned enclosing the braces in quotes, "{}", to accommodate filenames with spaces, which I've never encountered or thought about, but then those quotes would need to be protected from the initial shell expansion.

This all gets even more fun if you're doing the find on a remote system using ssh, and the shell, ssh, the remote shell, etc., are all in there expanding and removing quotes and escapes.

Ken

--
Ken Irving
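The bare-glob pitfall discussed in this exchange can be reproduced in a scratch directory. This is only an illustrative sketch: the helper name `glob_demo` and the file names `top.odt` and `sub/deep.odt` are invented for the demonstration.

```shell
# glob_demo (hypothetical): build a scratch tree with one .odt at the top
# level and one in a subdirectory, then compare the two spellings.
glob_demo() {
    scratch=$(mktemp -d) || return 1
    mkdir "$scratch/sub"
    : > "$scratch/top.odt"
    : > "$scratch/sub/deep.odt"
    (
        cd "$scratch" || exit 1
        echo "unquoted:"
        # The shell rewrites this to: find . -name top.odt
        find . -name *.odt | sort
        echo "escaped:"
        # find receives the pattern itself and applies it at every level.
        find . -name \*.odt | sort
    )
    rm -rf "$scratch"
}

glob_demo
```

The unquoted form silently misses `sub/deep.odt` because the shell has already replaced the pattern with the one name that matched in the current directory. With two or more matches there, find would receive extra arguments and typically stop with an error instead.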
recursive grep and openoffice
Hello,

I sometimes need to find a file, and I only know of some text contained therein. So I launch a search as follows:

    $ grep -r "text i am looking for" /home/john

OR

    $ find /home/john -type f -exec grep -i "text i am looking for" '{}' \; -print

where /home/john is my home directory.

The problem is that this does not search within .odt OpenOffice files. It will locate any .doc files that contain the string, but not OpenOffice files. This is a big nuisance, as most of my files are now .odt files. Some research has let me know that OpenOffice files are zip files that contain other files.

I am using etch (soon to upgrade).

Has anybody got a solution that will recursively search a directory looking for a file that contains some specified text, and will search within OpenOffice files?

John
Re: recursive grep and openoffice
John O Laoi wrote:
> Hello,
> I sometimes need to find a file, and I only know of some text contained therein.
> The problem is that this does not search within .odt openoffice files.
> It will locate any .doc files that contain the string, but not openoffice files.

You mean MS Word? How do you do that?

> Has anybody got a solution that will recursively search a directory looking for a file that contains some specified text, and will search within openoffice file?

What about

    find . -name *.odt -exec unzip -c {} content.xml | grep "what you want to find"\; -print

Sjoerd

--
() ascii ribbon campaign - against html e-mail
/\ www.asciiribbon.org - against proprietary attachments
Re: recursive grep and openoffice
Sjoerd Hardeman wrote:
> What about
>
>     find . -name *.odt -exec unzip -c {} content.xml | grep "what you want to find"\; -print

This one is not working; use

    find . -name *.odt -exec sh -c 'unzip -c {} content.xml | grep "what you want to find"' \; -print

instead.

Sjoerd
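The first form fails because the invoking shell splits the command line at `|` before find ever runs: find is left with an `-exec` that has no terminator, and the trailing `\;` and `-print` go to grep instead. Wrapping the pipeline in `sh -c` keeps it as a single argument that find runs once per file. Below is a sketch of the same pattern on plain text files, so it runs without unzip; `files_matching` is a hypothetical helper and the `tr | grep` pipeline merely stands in for `unzip | grep`.

```shell
# files_matching (hypothetical): list files under directory $1 whose
# upper-cased contents contain the string $2.
files_matching() {
    # Inside the single quotes, $1 and $2 belong to the inner sh:
    # $1 is the file find substituted for {}, $2 is the search string.
    find "$1" -type f -exec sh -c 'tr a-z A-Z < "$1" | grep -q -- "$2"' sh {} "$2" \; -print
}

# Small demonstration in a scratch directory.
demo_dir=$(mktemp -d)
printf 'hello world\n' > "$demo_dir/a.txt"
printf 'nothing here\n' > "$demo_dir/b.txt"
files_matching "$demo_dir" 'HELLO'   # prints only the path of a.txt
rm -rf "$demo_dir"
```

Because `-exec` acts as a test, `-print` fires only for files where the pipeline's last command (grep) succeeded.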
Re: recursive grep and openoffice
On Mon, Mar 16, 2009 at 15:29:50 +0100, Sjoerd Hardeman (sjo...@lorentz.leidenuniv.nl) wrote:
> Sjoerd Hardeman wrote:
>> What about
>>
>>     find . -name *.odt -exec unzip -c {} content.xml | grep "what you want to find"\; -print
>
> This one is not working; use
>
>     find . -name *.odt -exec sh -c 'unzip -c {} content.xml | grep "what you want to find"' \; -print

Ingenious - but I think the *.odt needs to be within quotation marks for this to work (i.e. '*.odt').

--
Bob Cox. Stoke Gifford, near Bristol, UK.
Please reply to the list only. Do NOT send copies directly to me.
Debian on the NSLU2: http://bobcox.com/slug/
Re: recursive grep and openoffice
Bob Cox wrote:
> On Mon, Mar 16, 2009 at 15:29:50 +0100, Sjoerd Hardeman (sjo...@lorentz.leidenuniv.nl) wrote:
>> Sjoerd Hardeman wrote:
>>> What about
>>>
>>>     find . -name *.odt -exec unzip -c {} content.xml | grep "what you want to find"\; -print
>>
>> This one is not working; use
>>
>>     find . -name *.odt -exec sh -c 'unzip -c {} content.xml | grep "what you want to find"' \; -print
>
> Ingenious - but I think the *.odt needs to be within quotation marks for this to work (i.e. '*.odt').

{} should also be quoted, in case there are file names with spaces or other special characters. And the output of grep (or sh) should be redirected to /dev/null:

    find . -name *.odt -exec sh -c 'unzip -c "{}" content.xml | grep "what you want to find" > /dev/null' \; -print
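Putting the suggestions from this thread together (quoted glob, filename passed as a separate quoted argument, silent grep), one possible shape is sketched below. It is an assumption-laden sketch rather than anyone's posted command: `odt_grep` is a hypothetical name, `grep -q` replaces the redirect to /dev/null, and `unzip -p` is swapped in for `unzip -c` so that unzip's own header lines are not searched along with the document.

```shell
# odt_grep (hypothetical): print the .odt files under directory $2
# (default: current directory) whose content.xml contains the fixed
# string $1.
odt_grep() {
    # Inside the single quotes, $1 is the search string and $2 is the
    # file that find substituted for {}; both stay quoted.
    find "${2:-.}" -type f -name '*.odt' \
        -exec sh -c 'unzip -p "$2" content.xml 2>/dev/null | grep -qF -- "$1"' sh "$1" {} \; \
        -print
}

# Example call (the path is the home directory mentioned in the thread):
# odt_grep 'text I am looking for' /home/john
```

Since content.xml is XML, a phrase that is interrupted by formatting tags inside the document may not match as one contiguous string, so short, distinctive search strings are likely to work best.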
Re: recursive grep and openoffice
Sjoerd Hardeman wrote:
> Sjoerd Hardeman wrote:
>> What about
>>
>>     find . -name *.odt -exec unzip -c {} content.xml | grep "what you want to find"\; -print
>
> This one is not working; use
>
>     find . -name *.odt -exec sh -c 'unzip -c {} content.xml | grep "what you want to find"' \; -print
>
> instead.
>
> Sjoerd

How about the various desktop search tools that Linux has (tracker, beagle, kerry)?

--
Please reply to this list only. I read this list on its corresponding newsgroup on gmane.org. Replies sent to my email address are just filtered to a folder in my mailbox and get periodically deleted without ever having been read.