Re: pdfbox.io - which should I use

Maruan Sahyoun Tue, 18 Feb 2014 12:52:32 -0800

Yes, we could use RandomAccessRead as the base interface, with subclasses 
wrapping NIO and other sources.

The parsers would then depend only on RandomAccessRead.
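
A rough sketch of how that could look, assuming a read-only contract with 
read/seek/position/length. The names SeekableSource and ByteBufferSource are 
made up for illustration and are not existing PDFBox classes; the actual 
RandomAccessRead interface may have a different method set. One 
ByteBuffer-backed class would cover both a memory-mapped file (via 
FileChannel.map) and a byte array for tests:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for the read-only random access contract
// that the parsers would be written against.
interface SeekableSource extends AutoCloseable {
    int read() throws IOException;               // next byte as 0..255, or -1 at end
    void seek(long position) throws IOException; // absolute position from the start
    long position() throws IOException;
    long length() throws IOException;
    @Override
    void close() throws IOException;
}

// One implementation over any ByteBuffer: heap-backed or memory-mapped.
class ByteBufferSource implements SeekableSource {
    private final ByteBuffer buffer;

    ByteBufferSource(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    // In-memory variant, e.g. for unit tests.
    static ByteBufferSource wrap(byte[] data) {
        return new ByteBufferSource(ByteBuffer.wrap(data));
    }

    // Memory-mapped variant. A single mapping is limited to
    // Integer.MAX_VALUE bytes, so larger files would need several mappings.
    static ByteBufferSource map(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // The mapping remains valid after the channel is closed.
            return new ByteBufferSource(mapped);
        }
    }

    @Override
    public int read() {
        return buffer.hasRemaining() ? (buffer.get() & 0xFF) : -1;
    }

    @Override
    public void seek(long position) throws IOException {
        if (position < 0 || position > buffer.limit()) {
            throw new IOException("Position out of range: " + position);
        }
        buffer.position((int) position);
    }

    @Override
    public long position() {
        return buffer.position();
    }

    @Override
    public long length() {
        return buffer.limit();
    }

    @Override
    public void close() {
        // Nothing to release explicitly; a mapped region is freed
        // when the buffer is garbage collected.
    }
}
```

With something like this, a parser could peek ahead by remembering 
position() and calling seek() to go back, instead of pushing bytes back 
onto a stream.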

WDYT

Maruan Sahyoun

> On 18 Feb 2014, at 21:42, John Hewson <j...@jahewson.com> wrote:
> 
> The streams used by BaseParser and PDFParser are sequential, so you can 
> ignore them.
> Use of PushBackInputStream in the non-sequential parser seems a little odd. 
> 
> We might want to think about getting rid of the classes in 
> org.apache.pdfbox.io and replacing
> them with classes from java.nio.channels. It looks like the PDFBox classes 
> pre-date NIO.
> With NIO we could use memory-mapped files, which for large PDF files will 
> perform better
> than an InputStream.
> 
> -- John
> 
>> On 18 Feb 2014, at 03:53, Maruan Sahyoun <sahy...@fileaffairs.de> wrote:
>> 
>> Hi,
>> 
>> there are currently a number of different options to use as a base for a 
>> potential new parser/lexer. The ones currently in use are
>> 
>> BaseParser: 
>> import org.apache.pdfbox.io.PushBackInputStream;
>> import org.apache.pdfbox.io.RandomAccess;
>> 
>> PDFParser (additional):
>> import org.apache.pdfbox.io.RandomAccess;
>> 
>> NonSequentialParser:
>> import org.apache.pdfbox.io.PushBackInputStream;
>> import org.apache.pdfbox.io.RandomAccess;
>> import org.apache.pdfbox.io.RandomAccessBuffer;
>> import org.apache.pdfbox.io.RandomAccessBufferedFileInputStream;
>> 
>> There are some additional classes/interfaces in the io package, e.g. 
>> RandomAccessBufferedFileInputStream implementing RandomAccessRead.
>> 
>> Any preferences or ideas for consolidating this?
>> 
>> Currently I’m using RandomAccessBufferedFileInputStream (with some additional 
>> implementations of RandomAccessRead to support reading from a byte array for 
>> testing purposes).
>> 
>> BR
>> 
>> Maruan Sahyoun
>
