Thanks, will do

On Thu, Jul 23, 2009 at 11:51 PM, Daniel Wilson <[email protected]> wrote:
> Iain is absolutely right. The latest code in SVN has LOTS of improvements
> -- and is pretty stable in its own right.
> Daniel
>
> On Thu, Jul 23, 2009 at 5:01 PM, Iain Clapham <[email protected]> wrote:
>
> > Mark,
> >
> > Have you upgraded to the latest FontBox ? now at 0.8
> >
> > I think it is a good idea to pull the latest SVN ( then you can hack away
> > at the nice Java code :~))
> >
> > Cheers +++ Iain
> >
> >
> > Mark Kerzner wrote:
> >
> >> Hi,
> >> I have compared the PDFBox-to-text to the pdftohtml (in Linux) - then to
> >> text conversion, and I found the second one a little clearer. For example,
> >> the bottom lines in a PDF (Copyrights, etc) were combined into one line by
> >> the PDFBox conversion, and had three separate pieces in the other way.
> >>
> >> I am using the last stable PDFBox jar, which dates back to 2006, and the
> >> pdftohtml utility is from about the same time, so I can understand this.
> >>
> >> My question then is twofold: does the comparison make sense, and should I
> >> use the pdftohtml combined with text converter, or should I try to build
> >> the latest from SVN?
> >>
> >> Thank you,
> >> Mark
