Unfortunately, ExtractText doesn't provide the same functionality as pdf2htmlEX. pdf2htmlEX is intended to produce a picture-perfect representation of the PDF that can be embedded in the browser, including images, graphics, etc. From what I've seen, the ExtractText "-html" option outputs only the text and doesn't preserve any of the positioning or alignment.
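To make the difference concrete, here is a rough, self-contained sketch in Java. It is not PDFBox or pdf2htmlEX code; the TextRun class and the textOnlyHtml/positionedHtml methods are names I made up purely for illustration, and the real tools are far more involved. It only shows why text-only HTML loses the page layout while positioned HTML keeps it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class HtmlLayoutSketch {

    /** Hypothetical text run: a string plus its position on the page, in points. */
    public static final class TextRun {
        final String text;
        final double x;
        final double y;

        public TextRun(String text, double x, double y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** Escape the characters that are special in HTML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Text-only output: keeps the words in order, throws the coordinates away. */
    public static String textOnlyHtml(List<TextRun> runs) {
        StringBuilder sb = new StringBuilder("<p>");
        for (int i = 0; i < runs.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(escape(runs.get(i).text));
        }
        return sb.append("</p>").toString();
    }

    /** Layout-preserving output: every run is absolutely positioned at its page coordinates. */
    public static String positionedHtml(List<TextRun> runs) {
        StringBuilder sb = new StringBuilder("<div style=\"position:relative\">");
        for (TextRun r : runs) {
            sb.append(String.format(Locale.ROOT,
                    "<span style=\"position:absolute;left:%.1fpt;top:%.1fpt\">%s</span>",
                    r.x, r.y, escape(r.text)));
        }
        return sb.append("</div>").toString();
    }

    public static void main(String[] args) {
        // Two runs on the same line of a page, far apart horizontally.
        List<TextRun> runs = Arrays.asList(
                new TextRun("Invoice", 72, 40),
                new TextRun("Total: 10 < 20", 400, 40));
        System.out.println(textOnlyHtml(runs));
        System.out.println(positionedHtml(runs));
    }
}
```

In the text-only version the two runs end up next to each other in one paragraph; in the positioned version each keeps its place on the page, which is roughly the kind of output a layout-faithful converter has to produce (plus fonts, images and graphics, which this sketch ignores).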
AFAIK, PDFBox does not intend to provide this kind of functionality anywhere.

Hope that helps,
Branden

On Tue, Jan 19, 2016 at 1:19 AM, Tilman Hausherr <[email protected]> wrote:
> On 19.01.2016 at 03:54, admin wrote:
>>
>> Hi guys!
>>
>> I would like to ask PdfBox can provide pdf2html convert? like
>> pdf2htmlEx.
>> pdf2htmlEx is dependent on the operating system too much
>> If PDFbox provide this functionality would be great.
>>
>> --
>> tianyi li
>>
>
> Just use the ExtractText command line utility, with the "-html" option.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]

