Re: PDF inventory software

2009-06-10 Thread ill...@gmail.com
2009/6/9 Daniel Underwood djuatde...@gmail.com: When I enter: $ find *.pdf -print0 | xargs -0 pdftotext nothing seems to happen.  Although there is no error message, the text files are not created. Any idea why? Ah, apologies. I was just testing with $ find *.pdf -print0 | xargs -0 cat to
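One possible reason for the silence, assuming the usual `pdftotext PDF-file [text-file]` calling convention: xargs hands every name to a single pdftotext run, which takes only one input PDF. A minimal sketch that runs the converter once per file follows; `convert_all` and the `PDFTOTEXT` override variable are names made up for illustration, not part of any tool.

```shell
# convert_all DIR: run the converter once for each PDF found under DIR.
# PDFTOTEXT is a hypothetical override so the converter command can be swapped out.
convert_all() {
    # The pattern is quoted so find, not the shell, expands it.
    # -print0 / -0 keep names with spaces and parentheses intact.
    # -n 1 makes xargs start one converter process per file name.
    find "$1" -type f -name '*.pdf' -print0 |
        xargs -0 -n 1 "${PDFTOTEXT:-pdftotext}"
}

# Usage (from the directory holding the PDFs):
#   convert_all .
```

With a single argument, pdftotext should write the text next to the PDF, using the same base name and a .txt extension.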

Re: PDF inventory software

2009-06-10 Thread b. f.
Hmm.. The command find *.pdf -exec pdftotext {} \; works in directories in which no PDF file returns the Document has not the mandatory ending %EOF error. When a directory contains one of these files, none of the files get converted. Is there some way to ignore or skip over this %EOF problem

Re: PDF inventory software

2009-06-09 Thread Christopher Illies
On Mon, Jun 08, 2009 at 05:17:29PM -0400, Daniel Underwood wrote: I'm looking for a way to manage my personal collection of research articles. Ideally I'd like some way to keep records on authors, keywords, journals, and publication years of articles (PDF files) downloaded onto my local

Re: PDF inventory software

2009-06-09 Thread Polytropon
On Mon, 8 Jun 2009 22:37:01 -0400 (EDT), vogelke+u...@pobox.com (Karl Vogel) wrote: Are these PDF files generated by scanning journal pages, or do they contain text? If the latter, you could use something like xapian or hyperestraier to make a full-text index of your files. On a

Re: PDF inventory software

2009-06-09 Thread Polytropon
On Mon, 8 Jun 2009 23:11:50 -0400, Daniel Underwood djuatde...@gmail.com wrote: Since all the PDFs contain text (none are scanned images), can I simply use some command like grep to search for text within the collection? If so, how would I do this? Can grep read text from within PDFs? I

Re: PDF inventory software

2009-06-09 Thread Grünewald Michaël
On 8 June 09 at 23:17, Daniel Underwood wrote: I'm looking for a way to manage my personal collection of research articles. Ideally I'd like some way to keep records on authors, keywords, journals, and publication years of articles (PDF files) downloaded onto my local drive. Hi Daniel, I

Re: PDF inventory software

2009-06-09 Thread Daniel Underwood
I'm trying to convert all PDF files in a directory to text using pdftotext. I tried the following command: $ find *.pdf | xargs -0 pdftotext Error: Couldn't open file 'Ross-JAMA-2007 (Prostate Screening Strategies).pdf Sanda-JAMA-2009 (Prostate Cancer Treatment).pdf ' Why is this not working?
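The error shows both file names inside one quoted string, which is consistent with `xargs -0` reading newline-separated find output as a single NUL-terminated argument. A small sketch that shows the difference without touching pdftotext; `count_args` is a hypothetical helper and the file names are invented:

```shell
# count_args: hypothetical helper; prints how many arguments xargs -0
# passes to the command it runs.
count_args() {
    xargs -0 sh -c 'echo "$#"' sh
}

# Newline-separated names, as plain find prints them: one large argument.
printf 'a b.pdf\nc d.pdf\n' | count_args    # prints 1

# NUL-separated names, as find -print0 prints them: one argument per file.
printf 'a b.pdf\0c d.pdf\0' | count_args    # prints 2
```

So `-0` on the xargs side only makes sense together with `-print0` on the find side.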

Re: PDF inventory software

2009-06-09 Thread John Almberg
On Jun 8, 2009, at 5:17 PM, Daniel Underwood wrote: I'm looking for a way to manage my personal collection of research articles. Ideally I'd like some way to keep records on authors, keywords, journals, and publication years of articles (PDF files) downloaded onto my local drive. In the

Re: PDF inventory software

2009-06-09 Thread ill...@gmail.com
2009/6/9 Daniel Underwood djuatde...@gmail.com: I'm trying to convert all PDF files in a directory to text using pdftotext.  I tried the following command: $ find *.pdf | xargs -0 pdftotext Error: Couldn't open file 'Ross-JAMA-2007 (Prostate Screening Strategies).pdf Sanda-JAMA-2009

Re: PDF inventory software

2009-06-09 Thread LoH
ill...@gmail.com wrote: 2009/6/9 Daniel Underwood djuatde...@gmail.com: I'm trying to convert all PDF files in a directory to text using pdftotext. I tried the following command: $ find *.pdf | xargs -0 pdftotext Error: Couldn't open file 'Ross-JAMA-2007 (Prostate Screening Strategies).pdf

Re: PDF inventory software

2009-06-09 Thread Daniel Underwood
$ find *.pdf -exec pdftotext {} \; Error: Document has not the mandatory ending %EOF ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to

Re: PDF inventory software

2009-06-09 Thread LoH
Daniel Underwood wrote: $ find *.pdf -exec pdftotext {} \; Error: Document has not the mandatory ending %EOF Have you run pdftotext on a single file in your archive as a test? --Joseph Lenox

Re: PDF inventory software

2009-06-09 Thread Daniel Underwood
Hmm.. The command find *.pdf -exec pdftotext {} \; works in directories in which no PDF file returns the Document has not the mandatory ending %EOF error. When a directory contains one of these files, none of the files get converted. Is there some way to ignore or skip over this %EOF problem
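One way to skip over a bad file is to convert the PDFs one at a time and only report the ones that fail. This is a sketch, assuming pdftotext exits non-zero for a PDF it cannot open; `convert_each` and the `PDFTOTEXT` override variable are invented names.

```shell
# convert_each DIR: try every PDF directly inside DIR, report failures, keep going.
# PDFTOTEXT is a hypothetical override for the converter command.
convert_each() {
    failed=0
    for pdf in "$1"/*.pdf; do
        [ -e "$pdf" ] || continue          # the glob matched nothing
        if ! "${PDFTOTEXT:-pdftotext}" "$pdf"; then
            echo "skipped: $pdf" >&2       # note the bad file and move on
            failed=$((failed + 1))
        fi
    done
    echo "$failed file(s) could not be converted"
}

# Usage:
#   convert_each .
```

The names printed as skipped are the candidates for downloading again.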

Re: PDF inventory software

2009-06-09 Thread LoH
Daniel Underwood wrote: Yes, it works fine on most PDFs. There are a couple that give me: $ pdftotext Sanda-JAMA-2009\ \(Prostate\ Cancer\ Treatment\).pdf Error: Document has not the mandatory ending %EOF It's probably an issue with the PDF itself, not with the program. --Joseph Lenox

Re: PDF inventory software

2009-06-09 Thread Polytropon
On Tue, 09 Jun 2009 16:07:03 -0500, LoH lordofhyph...@gmail.com wrote: Daniel Underwood wrote: Yes, it works fine on most PDFs. There are a couple that give me: $ pdftotext Sanda-JAMA-2009\ \(Prostate\ Cancer\ Treatment\).pdf Error: Document has not the mandatory ending %EOF

Re: PDF inventory software

2009-06-09 Thread Daniel Underwood
I retrieved a fresh copy of the error-causing PDF, and now all is well. Thanks for all the excellent help!

Re: PDF inventory software

2009-06-09 Thread Olivier Nicole
Daniel, I'm trying to convert all PDF files in a directory to text using pdftotext. I tried the following command: Aside from the syntax of the command find(1) and some article that may be in corrupted PDF, you may consider hacking pdftotext to skip the do not print flag in some of the PDF

Re: PDF inventory software

2009-06-08 Thread LoH
Daniel Underwood wrote: I'm looking for a way to manage my personal collection of research articles. Ideally I'd like some way to keep records on authors, keywords, journals, and publication years of articles (PDF files) downloaded onto my local drive. In the course of reading literature for

Re: PDF inventory software

2009-06-08 Thread Polytropon
On Mon, 8 Jun 2009 17:17:29 -0400, Daniel Underwood djuatde...@gmail.com wrote: I'm looking for a way to manage my personal collection of research articles. Ideally I'd like some way to keep records on authors, keywords, journals, and publication years of articles (PDF files) downloaded onto

PDF inventory software

2009-06-08 Thread Daniel Underwood
I'm looking for a way to manage my personal collection of research articles. Ideally I'd like some way to keep records on authors, keywords, journals, and publication years of articles (PDF files) downloaded onto my local drive. In the course of reading literature for research, it often happens

Re: PDF inventory software

2009-06-08 Thread Daniel Underwood
Poly and LoH: Thanks, these are great ideas!

Re: PDF inventory software

2009-06-08 Thread Yuri Pankov
On Mon, Jun 08, 2009 at 05:17:29PM -0400, Daniel Underwood wrote: I'm looking for a way to manage my personal collection of research articles. Ideally I'd like some way to keep records on authors, keywords, journals, and publication years of articles (PDF files) downloaded onto my local

Re: PDF inventory software

2009-06-08 Thread Bill Moran
Daniel Underwood djuatde...@gmail.com wrote: I'm looking for a way to manage my personal collection of research articles. Ideally I'd like some way to keep records on authors, keywords, journals, and publication years of articles (PDF files) downloaded onto my local drive. In the course

Re: PDF inventory software

2009-06-08 Thread Polytropon
On Mon, 8 Jun 2009 17:45:38 -0400, Daniel Underwood djuatde...@gmail.com wrote: Poly and LoH: Thanks, these are great ideas! I'd like to add that if you define your data fields well, you can use it to generate BibTeX and other LaTeX entries from your records. You can even easily turn it into

Re: PDF inventory software

2009-06-08 Thread FRLinux
On Mon, Jun 8, 2009 at 10:17 PM, Daniel Underwood djuatde...@gmail.com wrote: I'm looking for a way to manage my personal collection of research articles. Ideally I'd like some way to keep records on authors, keywords, journals, and publication years of articles (PDF files) downloaded onto my

Re: PDF inventory software

2009-06-08 Thread Olivier Nicole
Hi, I'm looking for a way to manage my personal collection of research articles. Ideally I'd like some way to keep records on authors, keywords, journals, and publication years of articles (PDF files) downloaded onto my local drive. Certainly overkill, but dspace(.org) can keep up a digital

Re: PDF inventory software

2009-06-08 Thread Karl Vogel
On Mon, 8 Jun 2009 17:17:29 -0400, Daniel Underwood djuatde...@gmail.com said: In the course of reading literature for research, it often happens that I find myself wanted to return to something I have previously read, but I only recall a few things about the article, often the author

Re: PDF inventory software

2009-06-08 Thread Daniel Underwood
Since all the PDFs contain text (none are scanned images), can I simply use some command like grep to search for text within the collection? If so, how would I do this? Can grep read text from within PDFs?

Re: PDF inventory software

2009-06-08 Thread Olivier Nicole
Since all the PDFs contain text (none are scanned images), can I simply use some command like grep to search for text within the collection? If so, how would I do this? Can grep read text from within PDFs? pdftotext, comes with the port xpdf I think Olivier
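A rough sketch of how pdftotext and grep can be combined, relying on pdftotext writing to standard output when the output name is `-`. The function name `pdf_search` and the `PDFTOTEXT` override variable are made up for illustration.

```shell
# pdf_search PATTERN DIR: print the names of PDFs directly inside DIR
# whose extracted text matches PATTERN (case-insensitive).
# PDFTOTEXT is a hypothetical override for the converter command.
pdf_search() {
    pattern=$1
    for pdf in "$2"/*.pdf; do
        [ -e "$pdf" ] || continue          # the glob matched nothing
        # "-" as the output name sends the extracted text to stdout.
        if "${PDFTOTEXT:-pdftotext}" "$pdf" - 2>/dev/null |
            grep -qi -- "$pattern"; then
            printf '%s\n' "$pdf"
        fi
    done
}

# Usage:
#   pdf_search 'screening' .
```

This extracts the text again on every search, so for a large collection it is probably better to convert once and grep the resulting .txt files, or to use one of the full-text indexers mentioned elsewhere in the thread.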

Re: PDF inventory software

2009-06-08 Thread Lord Of Hyphens
On Mon, Jun 8, 2009 at 10:21 PM, Olivier Nicole o...@cs.ait.ac.th wrote: Since all the PDFs contain text (none are scanned images), can I simply use some command like grep to search for text within the collection? If so, how would I do this? Can grep read text from within PDFs?

Re: PDF inventory software

2009-06-08 Thread LoH
Daniel Underwood wrote: A partial solution would also to do a search on someone else's index (google scholar, IEEEXplore, etc) to get the title of what you're looking for. True, but in this situation, I want to find something within a local collection of literature. E.g., find a table of