Hi,
On 19.07.2012 13:02, Maruan Sahyoun wrote:
Resuming work on PDFBOX-1000, I came across the question of how to maintain some
state within the base components PDFLexer and SimpleParser (which has yet to
come).
E.g. in order to distinguish a number from an indirect object I potentially
have to read three tokens, {num} {gen} obj, to check whether {num} is a standalone
number or the start of an indirect object. There are two ways to recover if
I've read too many tokens and the number was in fact a standalone object:
a) depend on the file position, e.g. filePointer and seek
b) maintain some internal state
I currently tend towards b), as this would remove the dependency on
filePointer() and seek() or similar methods. But it means that if parsing has
to start from a new point within the file, object etc., there needs to be some
reset() call to reset the state. Also, the caller, e.g. ConformingParser, has to
make sure that there is some way to reposition the cursor. On the other hand,
not being dependent on a specific position would enable the PDFLexer and
SimpleParser to be extended to work on byte[] and similar.
WDYT?
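For illustration, option b) could look roughly like the sketch below. It is not PDFBox code: PushbackTokenizer, NumberOrObject and all method names are hypothetical, and the token source is simplified to a forward-only Iterator<String>. The point is only that tokens read too far are kept in an internal buffer instead of being recovered via seek(), and that reset() discards that buffer when the caller repositions.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

/** Hypothetical tokenizer over a forward-only source; keeps over-read tokens internally. */
class PushbackTokenizer {
    private final Iterator<String> source;
    private final Deque<String> pushedBack = new ArrayDeque<>();

    PushbackTokenizer(Iterator<String> source) {
        this.source = source;
    }

    /** Returns the next token, preferring pushed-back ones; null at end of input. */
    String nextToken() {
        if (!pushedBack.isEmpty()) {
            return pushedBack.pop();
        }
        return source.hasNext() ? source.next() : null;
    }

    /** Returns a token to the tokenizer; the last token pushed back is read first. */
    void pushBack(String token) {
        if (token != null) {
            pushedBack.push(token);
        }
    }

    /** Drops buffered lookahead; to be called when the caller repositions the input. */
    void reset() {
        pushedBack.clear();
    }
}

/** Hypothetical parser step: is {num} a standalone number or the start of "{num} {gen} obj"? */
class NumberOrObject {
    static boolean isInt(String token) {
        return token != null && token.matches("\\d+");
    }

    static String read(PushbackTokenizer lexer) {
        String num = lexer.nextToken();
        String gen = lexer.nextToken();
        String keyword = lexer.nextToken();
        if (isInt(num) && isInt(gen) && "obj".equals(keyword)) {
            return "indirect " + num + " " + gen;
        }
        // Read too far: hand the extra tokens back in reverse order,
        // so that gen is delivered before keyword on the next reads.
        lexer.pushBack(keyword);
        lexer.pushBack(gen);
        return "number " + num;
    }
}
```

With the tokens 12 0 obj 5 6 7, the first call to NumberOrObject.read() consumes all of "12 0 obj", while the second consumes only "5" and leaves "6" and "7" for later reads.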
Why not use o.a.p.io.RandomAccessRead? This interface can be
implemented for all kinds of input material, byte[] included.
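To make the idea concrete, here is a rough sketch of position-based lookahead over a byte[]. Note that SeekableSource below is a stand-in I made up for this mail, not the actual RandomAccessRead interface; its three methods (read, getPosition, seek) and the classes ByteArraySource and SeekLookahead are assumptions for illustration only.

```java
/** Hypothetical minimal random-access abstraction (not the PDFBox interface). */
interface SeekableSource {
    int read();               // next byte as 0..255, or -1 at end of input
    long getPosition();       // current offset from the start
    void seek(long position); // reposition to an absolute offset
}

/** In-memory implementation, showing that seek() does not require a file. */
class ByteArraySource implements SeekableSource {
    private final byte[] data;
    private int pos = 0;

    ByteArraySource(byte[] data) {
        this.data = data;
    }

    public int read() {
        return pos < data.length ? (data[pos++] & 0xFF) : -1;
    }

    public long getPosition() {
        return pos;
    }

    public void seek(long position) {
        if (position < 0 || position > data.length) {
            throw new IllegalArgumentException("position out of range: " + position);
        }
        pos = (int) position;
    }
}

class SeekLookahead {
    /** Reads one whitespace-delimited token, or returns null at end of input. */
    static String readToken(SeekableSource src) {
        int c = src.read();
        while (c != -1 && Character.isWhitespace(c)) {
            c = src.read();
        }
        if (c == -1) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        while (c != -1 && !Character.isWhitespace(c)) {
            sb.append((char) c);
            c = src.read();
        }
        return sb.toString();
    }

    static boolean isInt(String token) {
        return token != null && token.matches("\\d+");
    }

    /** Looks ahead for "{num} {gen} obj" and always rewinds to the starting position. */
    static boolean startsIndirectObject(SeekableSource src) {
        long mark = src.getPosition();
        String num = readToken(src);
        String gen = readToken(src);
        String keyword = readToken(src);
        src.seek(mark);
        return isInt(num) && isInt(gen) && "obj".equals(keyword);
    }
}
```

Here the lexer itself holds no lookahead state; the position lives in the source, so restarting at a new point in the file is just another seek() and no reset() is needed.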
Best regards,
Timo
--
Timo Boehme
OntoChem GmbH
H.-Damerow-Str. 4
06120 Halle/Saale
T: +49 345 4780474
F: +49 345 4780471
timo.boe...@ontochem.com