Yes, we could use RandomAccessRead as a base, with subclasses that wrap NIO and other sources.
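
To make that concrete, here is a rough sketch, not a finished design. SimpleRandomAccessRead is a hypothetical, cut-down stand-in for RandomAccessRead (the real interface in org.apache.pdfbox.io may declare different methods), and ByteBufferRandomAccessRead, wrap and mapFile are names invented for illustration. Because it wraps a plain java.nio.ByteBuffer, the same class would cover both a memory-mapped file and a byte array for tests.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical, simplified stand-in for org.apache.pdfbox.io.RandomAccessRead.
interface SimpleRandomAccessRead {
    int read() throws IOException;               // next byte as 0..255, or -1 at end
    void seek(long position) throws IOException; // absolute position
    long getPosition() throws IOException;
    long length() throws IOException;
}

// One possible subclass: wraps any NIO ByteBuffer (heap or memory-mapped).
final class ByteBufferRandomAccessRead implements SimpleRandomAccessRead {
    private final ByteBuffer buffer;

    ByteBufferRandomAccessRead(ByteBuffer source) {
        // duplicate() shares the bytes but keeps an independent position
        this.buffer = source.duplicate();
        this.buffer.rewind();
    }

    // In-memory variant, e.g. for unit tests.
    static ByteBufferRandomAccessRead wrap(byte[] data) {
        return new ByteBufferRandomAccessRead(ByteBuffer.wrap(data));
    }

    // Memory-mapped variant; a single mapping is limited to Integer.MAX_VALUE bytes.
    static ByteBufferRandomAccessRead mapFile(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return new ByteBufferRandomAccessRead(
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size()));
        }
    }

    @Override
    public int read() {
        return buffer.hasRemaining() ? (buffer.get() & 0xFF) : -1;
    }

    @Override
    public void seek(long position) {
        if (position < 0 || position > buffer.limit()) {
            throw new IllegalArgumentException("position out of range: " + position);
        }
        buffer.position((int) position);
    }

    @Override
    public long getPosition() {
        return buffer.position();
    }

    @Override
    public long length() {
        return buffer.limit();
    }
}

class RandomAccessSketch {
    public static void main(String[] args) throws IOException {
        // A parser would only ever see the interface type.
        SimpleRandomAccessRead in = ByteBufferRandomAccessRead.wrap(
                "%PDF-1.4".getBytes(StandardCharsets.US_ASCII));
        in.seek(5);                           // jump to the major version digit
        System.out.println((char) in.read()); // prints 1
    }
}
```

Files larger than a single mapping allows would need several mapped regions or a fallback to a buffered file reader, which this sketch leaves out.
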
Then the parsers would use RandomAccessRead.

WDYT?

Maruan Sahyoun

> On 18.02.2014 at 21:42, John Hewson <j...@jahewson.com> wrote:
>
> The streams used by BaseParser and PDFParser are sequential, so you can
> ignore them. Use of PushBackInputStream in the non-sequential parser
> seems a little odd.
>
> We might want to think about getting rid of the classes in
> org.apache.pdfbox.io and replacing them with classes from
> java.nio.channels. It looks like the PDFBox classes pre-date NIO.
> With NIO we could use memory-mapped files, which for large PDF files
> will perform better than an InputStream.
>
> -- John
>
>> On 18 Feb 2014, at 03:53, Maruan Sahyoun <sahy...@fileaffairs.de> wrote:
>>
>> Hi,
>>
>> there are currently a number of different options to use as a base for
>> a potential new parser/lexer. The ones currently in use are:
>>
>> BaseParser:
>> import org.apache.pdfbox.io.PushBackInputStream;
>> import org.apache.pdfbox.io.RandomAccess;
>>
>> PDFParser (additional):
>> import org.apache.pdfbox.io.RandomAccess;
>>
>> NonSequentialParser:
>> import org.apache.pdfbox.io.PushBackInputStream;
>> import org.apache.pdfbox.io.RandomAccess;
>> import org.apache.pdfbox.io.RandomAccessBuffer;
>> import org.apache.pdfbox.io.RandomAccessBufferedFileInputStream;
>>
>> There are some additional classes/interfaces in the io package, e.g.
>> RandomAccessBufferedFileInputStream implementing RandomAccessRead.
>>
>> Any preferences or ideas for consolidating this?
>>
>> Currently I'm using RandomAccessBufferedFileInputStream with some
>> additional implementations of RandomAccessRead to support reading from
>> a byte array for testing purposes.
>>
>> BR
>>
>> Maruan Sahyoun