recursive grep and openoffice

2009-03-18 Thread John O Laoi
Thanks for all of your replies.
I didn't know that tools such as tracker would search within OpenOffice
documents.

With respect to the command line, I have fixed on

 find . -name *.odt -exec sh -c 'unzip -c {} content.xml | grep "string-being sought" > /dev/null' \; -print


but it returns immediately, and seems to do no searching.

To my amateur eyes, it looks like it should work. I've done some searching,
but to no avail.

Anybody got ideas?

John


Re: recursive grep and openoffice

2009-03-18 Thread Rainer Kluge
John O Laoi wrote:
>
> find . -name *.odt -exec sh -c 'unzip -c {} content.xml | grep "string-being sought" > /dev/null' \; -print
>

It works for me. Maybe you should quote *.odt as '*.odt'. Also try just

   find . -name '*.odt'

to see if the odt files are found.
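
A small experiment makes the effect of the missing quotes visible. This is
only a sketch: the scratch directory and the file names top.odt and
sub/nested.odt are made up for the demonstration.

```shell
# Hypothetical scratch directory, used only for this demonstration.
demo=$(mktemp -d)
start=$PWD
mkdir "$demo/sub"
touch "$demo/top.odt" "$demo/sub/nested.odt"
cd "$demo" || exit 1

# Unquoted: the shell expands *.odt to top.odt before find starts,
# so find only looks for files named exactly top.odt.
unquoted=$(find . -name *.odt | sort)

# Quoted: find receives the pattern itself and matches at every depth.
quoted=$(find . -name '*.odt' | sort)

printf 'unquoted:\n%s\n' "$unquoted"
printf 'quoted:\n%s\n' "$quoted"
cd "$start" || exit 1
```

With the unquoted pattern the nested file is silently missed, which can look
like find "doing no searching".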

Rainer


-- 
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org 
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Re: recursive grep and openoffice

2009-03-18 Thread Boyd Stephen Smith Jr.
In 1f1816a90903180556k56e3e592qa14c55d1c3193...@mail.gmail.com, John O
Laoi wrote:
> Thanks for all of your replies.
> I didn't know that tools such as tracker would search within OpenOffice
> documents.
>
> With respect to the command line, I have fixed on
>
> find . -name *.odt -exec sh -c 'unzip -c {} content.xml | grep "string-being sought" > /dev/null' \; -print

I think I'd rewrite it as:
find . \
-name '*.odt' \
-exec sh -c 'unzip -c $1 content.xml | grep -q regex' \{} \; \
-print

I'm not sure what the rules are for find substituting {} within another 
argument, so it seems best to write it as a separate argument.  If you have 
anything that matches *.odt in the current directory, the find won't work[1] 
unless you quote it.  You might also need to throw double-quotes around the 
regex, depending on its contents. 
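
One detail about passing {} as a separate argument is worth checking by
hand: with sh -c, the first argument after the command string becomes $0,
and only the second becomes $1. A command of the form sh -c '... $1 ...' {}
would therefore leave $1 empty. The sketch below uses a made-up file name
(my file.odt) and a placeholder word (sh) for $0; basename stands in for the
unzip/grep pipeline.

```shell
# $0 receives the first argument after the command string, $1 the second.
without_placeholder=$(sh -c 'printf "%s|%s" "$0" "$1"' 'my file.odt')
with_placeholder=$(sh -c 'printf "%s|%s" "$0" "$1"' sh 'my file.odt')
echo "$without_placeholder"
echo "$with_placeholder"

# The same idea inside find: the placeholder "sh" fills $0, {} arrives as $1.
# Quoting "$1" keeps a file name with spaces in one piece.
demo=$(mktemp -d)
touch "$demo/my file.odt"
listed=$(find "$demo" -name '*.odt' -exec sh -c 'basename "$1"' sh {} \;)
echo "$listed"
```

Applied to the command above, this would mean writing
sh -c 'unzip -c "$1" content.xml | grep -q regex' sh {} \; so that the file
name really ends up in $1.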
-- 
Boyd Stephen Smith Jr.   ,= ,-_-. =.
b...@iguanasuicide.net  ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy `-'(. .)`-'
http://iguanasuicide.net/\_/

[1] It may work, but it won't actually be searching for files with names 
matching the glob *.odt.





Re: recursive grep and openoffice

2009-03-18 Thread Ken Irving
On Wed, Mar 18, 2009 at 11:19:20AM -0500, Boyd Stephen Smith Jr. wrote:
> In 1f1816a90903180556k56e3e592qa14c55d1c3193...@mail.gmail.com, John O
> Laoi wrote:
> > With respect to the command line, I have fixed on
> >
> > find . -name *.odt -exec sh -c 'unzip -c {} content.xml | grep "string-being sought" > /dev/null' \; -print
>
> I think I'd rewrite it as:
> find . \
> -name '*.odt' \
> -exec sh -c 'unzip -c $1 content.xml | grep -q regex' \{} \; \
> -print
>
> I'm not sure what the rules are for find substituting {} within another
> argument, so it seems best to write it as a separate argument.  If you have
> anything that matches *.odt in the current directory, the find won't work[1]
> unless you quote it.  You might also need to throw double-quotes around the
> regex, depending on its contents.
> ...
>
> [1] It may work, but it won't actually be searching for files with names
> matching the glob *.odt.

It _may_ also work if unquoted, since bash will leave the literal *
as is if there's no match in the current directory, and so find will
see it as intended.  I guess that's a good thing, but it can be 
confusing.
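
The two cases can be seen directly in a scratch directory. This is a small
sketch assuming a POSIX sh or bash with default settings; report.odt is a
made-up file name.

```shell
work=$(mktemp -d)
start=$PWD
cd "$work" || exit 1

# No match in the current directory: the pattern is passed through unchanged.
no_match=$(printf '%s\n' *.odt)

touch report.odt
# A match in the current directory: the pattern is replaced by the file name.
with_match=$(printf '%s\n' *.odt)

echo "no match:   $no_match"
echo "with match: $with_match"
cd "$start" || exit 1
```

The same command line thus hands find two different arguments depending on
what happens to be in the directory it is started from.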

Bash will also leave {} untouched since it doesn't expand to anything, so
I don't see any point in quoting it as shown.

Ken
-- 
Ken Irving





Re: recursive grep and openoffice

2009-03-18 Thread Boyd Stephen Smith Jr.
In 20090318164208.ga14...@localhost, Ken Irving wrote:
> On Wed, Mar 18, 2009 at 11:19:20AM -0500, Boyd Stephen Smith Jr. wrote:
> > I think I'd rewrite it as:
> > find . \
> > -name '*.odt' \
> > -exec sh -c 'unzip -c $1 content.xml | grep -q regex' \{} \; \
> > -print
> >
> > I'm not sure what the rules are for find substituting {} within
> > another argument, so it seems best to write it as a separate argument.
> > If you have anything that matches *.odt in the current directory, the
> > find won't work[1] unless you quote it.  You might also need to throw
> > double-quotes around the regex, depending on its contents.
> >
> > [1] It may work, but it won't actually be searching for files with names
> > matching the glob *.odt.
> It _may_ also work if unquoted, since bash will leave the literal *
> as is if there's no match in the current directory, and so find will
> see it as intended.  I guess that's a good thing, but it can be
> confusing.

I did qualify my statement with "if you have anything that matches *.odt in
the current directory".  Also, I think it might depend on your glob
expansion settings in bash.  ISTR an option to expand non-matching globs to
either an empty argument or no argument at all.
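
In bash, the option that drops a non-matching glob entirely is nullglob (set
with shopt -s nullglob). The sketch below counts the arguments produced by
*.odt in an empty scratch directory, with and without the option. It assumes
bash is installed and runs each case in its own bash process.

```shell
work=$(mktemp -d)

# Default bash behaviour: the unmatched pattern stays as one literal argument.
default_count=$(cd "$work" && bash -c 'set -- *.odt; echo $#')

# With nullglob the unmatched pattern disappears, leaving no argument at all.
nullglob_count=$(cd "$work" && bash -c 'shopt -s nullglob; set -- *.odt; echo $#')

echo "default:  $default_count argument(s)"
echo "nullglob: $nullglob_count argument(s)"
```

With nullglob active, an unquoted find . -name *.odt would reach find as
-name with no pattern after it, which is one more reason to quote.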

> Bash will also leave {} untouched since it doesn't expand to anything, so
> I don't see any point in quoting it as shown.

Bash will, but I've heard other shells will not--something about considering
{} an empty command group.  I'm pretty sure the relevant standards require
{} to NOT be recognized as anything special since neither '{' nor '}' is an
operator character, but I'd have to read them again to be sure.
-- 
Boyd Stephen Smith Jr.   ,= ,-_-. =.
b...@iguanasuicide.net  ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy `-'(. .)`-'
http://iguanasuicide.net/\_/





Re: recursive grep and openoffice

2009-03-18 Thread Ken Irving
On Wed, Mar 18, 2009 at 01:45:42PM -0500, Boyd Stephen Smith Jr. wrote:
> In 20090318164208.ga14...@localhost, Ken Irving wrote:
> > On Wed, Mar 18, 2009 at 11:19:20AM -0500, Boyd Stephen Smith Jr. wrote:
> > > I think I'd rewrite it as:
> > > find . \
> > > -name '*.odt' \
> > > -exec sh -c 'unzip -c $1 content.xml | grep -q regex' \{} \; \
> > > -print
> > >
> > > I'm not sure what the rules are for find substituting {} within
> > > another argument, so it seems best to write it as a separate argument.
> > > If you have anything that matches *.odt in the current directory, the
> > > find won't work[1] unless you quote it.  You might also need to throw
> > > double-quotes around the regex, depending on its contents.
> > >
> > > [1] It may work, but it won't actually be searching for files with names
> > > matching the glob *.odt.
> > It _may_ also work if unquoted, since bash will leave the literal *
> > as is if there's no match in the current directory, and so find will
> > see it as intended.  I guess that's a good thing, but it can be
> > confusing.
>
> I did qualify my statement with "if you have anything that matches *.odt in
> the current directory".  Also, I think it might depend on your glob
> expansion settings in bash.  ISTR an option to expand non-matching globs to
> either an empty argument or no argument at all.

Understood.  My point is just that folks may have success using bare
glob patterns in some cases, but then get nailed when they end up matching
something.   My recommendation is to escape the glob metachar itself, e.g.,

find . -name \*.odt

as I find that this sort of hints at what's going on, but YMMV.
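
As far as the argument that reaches find is concerned, the backslash form and
the quoted forms are interchangeable. A minimal sketch, using a hypothetical
helper named show that simply prints its first argument:

```shell
# show is a made-up helper that prints the argument exactly as received.
show() { printf '%s\n' "$1"; }

escaped=$(show \*.odt)
single=$(show '*.odt')
double=$(show "*.odt")

echo "$escaped $single $double"
```

All three spellings deliver the same pattern, so the choice between them is
a matter of taste and readability.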

> > Bash will also leave {} untouched since it doesn't expand to anything, so
> > I don't see any point in quoting it as shown.
>
> Bash will, but I've heard other shells will not--something about considering
> {} an empty command group.  I'm pretty sure the relevant standards require
> {} to NOT be recognized as anything special since neither '{' nor '}' is an
> operator character, but I'd have to read them again to be sure.

Another poster mentioned enclosing the braces in quotes, "{}", to
accommodate filenames with spaces, which I've never encountered or thought
about, but then those quotes would need to be protected from the initial
shell expansion.

This all gets even more fun if you're doing the find on a remote system
using ssh, and the shell, ssh, the remote shell, etc., are all in there 
expanding and removing quotes and escapes.

Ken
-- 
Ken Irving





recursive grep and openoffice

2009-03-16 Thread John O Laoi
Hello,

 I sometimes need to find a file, and I only know of some text contained
therein.

So I launch a search as follows:

 $ grep -r "text i am looking for" /home/john

 OR

 $ find /home/john -type f -exec grep -i "text i am looking for" '{}' \; -print

 where /home/john is my home directory.

 The problem is that this does not search within  .odt  openoffice files.

It will locate any  .doc  files that contain the string, but not OpenOffice
files.

This is a big nuisance, as most of my files are now .odt files.

 Some research has let me know that openoffice files are zip files that
contain other files.

 I am using etch (soon to upgrade).


 Has anybody got a solution that will recursively search a directory looking
for a file that contains some specified text, and will also search within
OpenOffice files?
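
One possible shape for such a search, offered as a sketch rather than a
finished tool: since an .odt file is a zip archive whose text lives in
content.xml, each candidate can be piped through unzip and grep. The function
name search_odt is made up here, and the sketch assumes Info-ZIP's unzip is
installed (its -p option writes the member to standard output).

```shell
# search_odt DIR PATTERN
# Print every .odt file under DIR whose content.xml matches PATTERN.
# search_odt is a hypothetical name; unzip is assumed to be available.
search_odt() {
    find "$1" -type f -name '*.odt' -exec sh -c '
        # $1 is the file found by find, $2 is the pattern to look for.
        unzip -p "$1" content.xml 2>/dev/null | grep -q -e "$2"
    ' sh {} "$2" \; -print
}
```

It could be called as search_odt /home/john "text i am looking for". Because
content.xml carries XML markup between runs of text, a phrase that spans a
formatting change may not match.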


 John


Re: recursive grep and openoffice

2009-03-16 Thread Sjoerd Hardeman

John O Laoi wrote:

> Hello,
>
> I sometimes need to find a file, and I only know of some text contained
> therein.
>
> The problem is that this does not search within  .odt  openoffice files.
>
> It will locate any  .doc  files that contain the string, but not
> openoffice files.

You mean MS-word? How do you do that?

> Has anybody got a solution that will recursively search a directory
> looking for a file that contains some specified text, and will search
> within openoffice file?

What about
 find . -name *.odt -exec unzip -c {} content.xml | grep "what you want to find"\; -print


Sjoerd

--
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments





Re: recursive grep and openoffice

2009-03-16 Thread Sjoerd Hardeman

Sjoerd Hardeman wrote:

> What about
>  find . -name *.odt -exec unzip -c {} content.xml | grep "what you want to find"\; -print

This one is not working, use
 find . -name *.odt -exec sh -c 'unzip -c {} content.xml | grep "what you want to find"' \; -print
instead.

Sjoerd


--
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments





Re: recursive grep and openoffice

2009-03-16 Thread Bob Cox
On Mon, Mar 16, 2009 at 15:29:50 +0100, Sjoerd Hardeman 
(sjo...@lorentz.leidenuniv.nl) wrote: 

> Sjoerd Hardeman wrote:
> > What about
> >  find . -name *.odt -exec unzip -c {} content.xml | grep "what you want to find"\; -print
> This one is not working, use
>  find . -name *.odt -exec sh -c 'unzip -c {} content.xml | grep "what you want to find"' \; -print

Ingenious - but I think the *.odt needs to be within quotation marks for
this to work, (i.e. '*.odt'). 

-- 
Bob Cox.  Stoke Gifford, near Bristol, UK.
Please reply to the list only.  Do NOT send copies directly to me.
Debian on the NSLU2: http://bobcox.com/slug/





Re: recursive grep and openoffice

2009-03-16 Thread Rainer Kluge
Bob Cox wrote:
> On Mon, Mar 16, 2009 at 15:29:50 +0100, Sjoerd Hardeman
> (sjo...@lorentz.leidenuniv.nl) wrote:
>
> > Sjoerd Hardeman wrote:
> > > What about
> > >  find . -name *.odt -exec unzip -c {} content.xml | grep "what you want to find"\; -print
> > This one is not working, use
> >  find . -name *.odt -exec sh -c 'unzip -c {} content.xml | grep "what you want to find"' \; -print
>
> Ingenious - but I think the *.odt needs to be within quotation marks for
> this to work, (i.e. '*.odt').
>

{} should also be quoted, in case there are file names with spaces or other
special characters. And the output of grep (or sh) should be redirected to
/dev/null:

find . -name "*.odt" -exec sh -c 'unzip -c "{}" content.xml | grep "what you want to find" > /dev/null' \; -print
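
The effect of discarding grep's output can be tried on plain text files, with
grep alone standing in for the unzip pipeline. This is only a sketch: the
scratch directory, the file names and the word "needle" are made up.

```shell
demo=$(mktemp -d)
printf 'the needle is here\n' > "$demo/a.txt"
printf 'nothing to see\n' > "$demo/b.txt"

# Without redirection, the matching line is printed as well as the path.
noisy=$(find "$demo" -name '*.txt' -exec grep needle {} \; -print)

# With the output discarded, only grep's exit status matters:
# find runs -print only for files where grep succeeded.
quiet=$(find "$demo" -name '*.txt' \
    -exec sh -c 'grep needle "$1" > /dev/null' sh {} \; -print)

printf 'noisy:\n%s\n' "$noisy"
printf 'quiet:\n%s\n' "$quiet"
```

grep -q has the same effect as the redirection of standard output and saves
the extra shell when no pipeline is involved.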





Re: recursive grep and openoffice

2009-03-16 Thread H.S.
Sjoerd Hardeman wrote:
> Sjoerd Hardeman wrote:
> > What about
> >  find . -name *.odt -exec unzip -c {} content.xml | grep "what you want to find"\; -print
> This one is not working, use
>  find . -name *.odt -exec sh -c 'unzip -c {} content.xml | grep "what you want to find"' \; -print
> instead.
>
> Sjoerd
>

How about various desktop search tools that Linux has (tracker, beagle,
kerry)?


-- 

Please reply to this list only. I read this list on its corresponding
newsgroup on gmane.org. Replies sent to my email address are just
filtered to a folder in my mailbox and get periodically deleted without
ever having been read.

