Re: [plug] how to extract index information from PDF file

Anuerin Diaz Fri, 18 Dec 2009 08:00:00 -0800

maybe it's really a generated content field. you can try converting it
to HTML or Word and then filtering the lines styled as Heading 1, or
whatever style is used to mark a line as a section heading.
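
a rough sketch of the filtering step, in case it helps. it assumes the
PDF has already been converted to HTML by some tool, and that the
converter actually emits h1..h6 tags for section titles (many converters
only emit styled spans, in which case you would have to match on a class
or font size instead). the names HeadingCollector and extract_headings
are made up for this example; only Python's standard html.parser module
is used.

```python
from html.parser import HTMLParser

# Tags treated as section headings. This is an assumption about the
# converter's output, not something every PDF-to-HTML tool guarantees.
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Collects (level, text) pairs for every heading tag in the input."""

    def __init__(self):
        super().__init__()
        self.headings = []    # results, in document order
        self._current = None  # heading tag currently open, if any
        self._parts = []      # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this.
        if tag in HEADING_TAGS and self._current is None:
            self._current = tag
            self._parts = []

    def handle_data(self, data):
        if self._current is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Join fragments and collapse runs of whitespace.
            text = " ".join("".join(self._parts).split())
            if text:
                self.headings.append((int(tag[1]), text))
            self._current = None


def extract_headings(html_text):
    """Return a list of (heading_level, heading_text) from an HTML string."""
    parser = HeadingCollector()
    parser.feed(html_text)
    parser.close()
    return parser.headings


if __name__ == "__main__":
    sample = "<h1>Intro</h1><p>body text</p><h2>Getting <b>started</b></h2>"
    for level, text in extract_headings(sample):
        # Indent by heading level to get an outline-like listing.
        print("  " * (level - 1) + text)
```

if the converted file uses something other than heading tags, only
HEADING_TAGS and the start-tag check need to change.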

ciao!

On Fri, Dec 18, 2009 at 11:56 AM, Erwin Olario <[email protected]> wrote:
> A PDF's index information doesn't seem to be part of its metadata.
>
> On Fri, Dec 18, 2009 at 11:32 AM, Erwin Olario <[email protected]> wrote:
>>
>> Your google-fu is better than mine. Thanks, will check it out.
>>
>> On Fri, Dec 18, 2009 at 10:31 AM, Anuerin Diaz <[email protected]>
>> wrote:
>>>
>>> have you tried libextractor
>>> [http://www.gnu.org/software/libextractor/]? that was one of the hits
>>> i got from google.
>>>
>>>
>>>
>>> On Thu, Dec 17, 2009 at 11:51 PM, Erwin Olario <[email protected]> wrote:
>>> > Hi list.
>>> > Are there any tools I can use to extract the index information from PDF
>>> > files?
>>>

-- 
"Programming, an artform that fights back"

Anuerin G. Diaz
Registered Linux User #246176
Friendly Linux Board @ http://mandrivausers.org/index.php
http://ramfree17.net/capsule , when you absolutely have nothing else
better to do
_________________________________________________
Philippine Linux Users' Group (PLUG) Mailing List
http://lists.linux.org.ph/mailman/listinfo/plug
Searchable Archives: http://archives.free.net.ph
