Re: Text Extraction with Formatting

Maruan Sahyoun Tue, 07 May 2013 00:40:15 -0700

Hi,

you can either extract to HTML (for example, run ExtractText with the -html
option) or write your own logic. Take a look at
org.apache.pdfbox.util.PDFText2HTML as a starting point.
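
To give a rough idea of what "your own logic" could look like, here is a
minimal sketch of the last step only: turning text runs that already carry
style flags into HTML. The Run class and the bold/italic flags are
hypothetical and not part of PDFBox. In a real program you would have to
fill them yourself, for example by subclassing PDFTextStripper and looking
at the font of each TextPosition, and deciding what counts as bold is
usually a guess based on the font name.

```java
import java.util.Arrays;
import java.util.List;

public class StyledRunsToHtml {

    /** Hypothetical holder for one run of text and its style; not a PDFBox type. */
    static final class Run {
        final String text;
        final boolean bold;
        final boolean italic;

        Run(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }
    }

    /** Escapes the characters that would otherwise be read as markup. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Wraps each run in <b>/<i> tags according to its flags, inside one paragraph. */
    static String toHtml(List<Run> runs) {
        StringBuilder sb = new StringBuilder("<p>");
        for (Run r : runs) {
            if (r.bold) sb.append("<b>");
            if (r.italic) sb.append("<i>");
            sb.append(escape(r.text));
            // Close tags in reverse order so the markup stays properly nested.
            if (r.italic) sb.append("</i>");
            if (r.bold) sb.append("</b>");
        }
        return sb.append("</p>").toString();
    }

    public static void main(String[] args) {
        List<Run> runs = Arrays.asList(
                new Run("Total: ", false, false),
                new Run("42 < 50", true, false));
        System.out.println(toHtml(runs));
    }
}
```

PDFText2HTML does something similar in spirit, so reading its source is
probably still the quickest way to see how the style information can be
collected from the PDF.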


There is also a project that converts PDF to SVG using PDFBox as a basis, which
might also serve as an example (https://bitbucket.org/petermr/pdftosvg).

BR
Maruan Sahyoun

On 07.05.2013 at 08:39, rahul bhalla <[email protected]> wrote:

> Is it possible to extract text from a PDF without ignoring the formatting?
> Or, when the text is extracted, can it insert the tags we use in HTML?
> 
> Thanks
> 
> -- 
> Regards
> Rahul Bhalla
