Hi, you can either extract to HTML (for example, call ExtractText with the -html option) or create your own logic. You can take a look at org.apache.pdfbox.util.PDFText2HTML as a starting point.
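
If you go the "own logic" route, the core of it is turning runs of styled text into tagged output. Below is a minimal sketch of that step in plain Java, not PDFBox code: the Run class, its bold/italic flags, and the toHtml helper are hypothetical names made up for illustration. How you decide that a run is bold or italic from the PDF's font information is an assumption left out here.

```java
import java.util.ArrayList;
import java.util.List;

public class StyledTextToHtml {

    /** Hypothetical holder for one run of text with uniform styling. */
    static final class Run {
        final String text;
        final boolean bold;
        final boolean italic;

        Run(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }
    }

    /** Escapes the characters that are special in HTML text content. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Wraps each run in <b>/<i> tags as needed and joins them into one paragraph. */
    static String toHtml(List<Run> runs) {
        StringBuilder sb = new StringBuilder("<p>");
        for (Run r : runs) {
            if (r.bold) sb.append("<b>");
            if (r.italic) sb.append("<i>");
            sb.append(escape(r.text));
            if (r.italic) sb.append("</i>");
            if (r.bold) sb.append("</b>");
        }
        return sb.append("</p>").toString();
    }

    public static void main(String[] args) {
        List<Run> runs = new ArrayList<Run>();
        runs.add(new Run("Total: ", false, false));
        runs.add(new Run("a < b", true, false));
        // prints: <p>Total: <b>a &lt; b</b></p>
        System.out.println(toHtml(runs));
    }
}
```

In a real converter the runs would be built from the text and font data your text stripper sees while processing a page; PDFText2HTML is the place to look for how PDFBox itself does that.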
There is also a project that converts PDF to SVG using PDFBox as a basis, which might also serve as an example (https://bitbucket.org/petermr/pdftosvg).

BR
Maruan Sahyoun

On 07.05.2013 at 08:39, rahul bhalla <[email protected]> wrote:

> Is it possible to extract text from a PDF without ignoring the formatting?
> Or, when the text is extracted, can it insert the tags we use in HTML?
>
> Thanks
>
> --
> Regards
> Rahul Bhalla

