Re: [Trisquel-users] finding particular pages within PDFs

adel . afzal Fri, 29 Aug 2014 11:30:57 -0700

MB, having the text would be way more useful than the PDF pages! Thanks for recommending pdftotext and the -layout option.

I have some questions -- could you help me break this process down into smaller steps?

I looked up pdfjam's split command online -- I think that it may be a little time-consuming (my PDFs are a few thousand pages long):


http://0x2a.at/blog/2011/02/pdf_manipulation_on_the_cli/

http://tex.stackexchange.com/questions/79623/quickly-extracting-individual-pages-from-a-document

I looked at PDF Shuffler (the GUI one) and that can only split files one-by-one. Are there other options?

Once I split the files into single pages, I'll need the shell 'for file in pages/*' loop. I don't understand what this step will do. Could you please explain this step too?

About this step: 'if pdftotext "$file" - | grep -i regexps' -- does this copy all the PDF text to one text file and then search (grep) that text file? Does this command take text from many single-page PDFs, or is it run only after the "hit" pages are joined up into one document?

What does it mean to "append the file to a Shell variable"? What is the goal in this step? Could you please explain how I can do this step too?
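
To make the questions above concrete, here is a rough sketch of how the loop, the pdftotext/grep test, and the "append to a shell variable" step might fit together. It is only a guess at what was meant, not MB's actual script. The function names extract_text and find_hit_pages are made up for illustration, and it assumes the single-page PDFs already sit in one directory and have no spaces in their names.

```shell
# Sketch only: extract_text and find_hit_pages are hypothetical names.

# Print the text of one single-page PDF on standard output.
# The trailing "-" makes pdftotext write to stdout rather than to a .txt file.
extract_text() {
    pdftotext -layout "$1" -
}

# $1 = directory holding the single-page PDFs
# $2 = extended regular expression to look for (case-insensitive)
find_hit_pages() {
    hits=""
    # The loop runs the body once per file, with $file set to each name in turn.
    for file in "$1"/*; do
        # The pipe hands the text of this one page straight to grep;
        # no text file is written. grep -q prints nothing and only
        # reports whether a match was found.
        if extract_text "$file" | grep -qiE -- "$2"; then
            # "Appending to a shell variable": add this file name to the
            # space-separated list of pages that matched.
            hits="${hits:+$hits }$file"
        fi
    done
    printf '%s\n' "$hits"
}
```

Calling it as find_hit_pages pages 'some|regexps' would print the names of the matching pages on one line. That list could then be passed to whichever tool joins the "hit" pages back into a single PDF.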
