Extracting layout information and text from searchable PDF

viraf.bankwalla Mon, 27 Feb 2017 10:36:19 -0800

I have a number of searchable PDF documents from which I want to extract layout 
information and text.  The documents are mixed: some pages are structured 
(e.g. forms), while others are unstructured free-form text (e.g. letters, 
reports, etc.).
I was wondering whether there are any projects that provide such capabilities.  
I am familiar with PdfTextExtractor, and it would probably be my starting point 
if I were to build this functionality myself.
Thanks
- viraf
