https://issues.apache.org/bugzilla/show_bug.cgi?id=51318

             Bug #: 51318
           Summary: Exceptions in NDocumentInputStream preventing
                    streaming of data out of MS Publisher files
           Product: POI
           Version: 3.2-FINAL
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: critical
          Priority: P2
         Component: HPBF
        AssignedTo: [email protected]
        ReportedBy: [email protected]
    Classification: Unclassified


Related to 51317 - Need ability to stream and chunk data out of MS Publisher
documents.

I attempted to implement streaming and chunking of data out of .pub files and
got the errors below.

Basically, I tried to read from DocumentInputStream in successive chunks,
rather than reading the whole stream into one large preallocated byte array.

    byte[] filler = new byte[25]; 

    byte[] bytes = new byte[8];
    int read = dis.read(bytes, 0, 8);

    if (read <= 0) {
      // 
    } else {
      String f8 = new String(bytes);
      if (!f8.equals("CHNKINK ")) {
        throw new IllegalArgumentException("Expecting 'CHNKINK ' but was '" + f8 + "'");
      }
      // Ignore the next 24, for now at least

      dis.read(filler, 8, 24);

      for (int i = 0; i < 20; i++) {
        int offset = 0x20 + i * 24;

        bytes = new byte[25];
        read = dis.read(bytes, offset, bytes.length);

Note the line that tries to read the 24 filler bytes so we can get to the
bits after them. I had to try it there because I was getting an error simply
trying to do read(bytes, offset, bytes.length).
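
For comparison, here is a minimal sketch of the same access pattern under the
standard java.io.InputStream contract, where the off argument of
read(byte[], int, int) is an index into the destination array and the stream
position advances on its own. The class and helper names (ChunkedReadSketch,
readChunk, skipFully) are hypothetical, a ByteArrayInputStream stands in for
DocumentInputStream, and the 24-byte entry size is an assumption taken from
the i * 24 stride in the snippet above.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ChunkedReadSketch {

    /** Reads exactly len bytes from the current stream position into a new array. */
    public static byte[] readChunk(InputStream in, int len) throws IOException {
        byte[] chunk = new byte[len];
        int filled = 0;
        while (filled < len) {
            // 'filled' is an index into chunk, not a position in the stream.
            int n = in.read(chunk, filled, len - filled);
            if (n < 0) {
                throw new EOFException("stream ended after " + filled + " bytes");
            }
            filled += n;
        }
        return chunk;
    }

    /** Advances the stream by exactly n bytes without keeping them. */
    public static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                // skip() may legally return 0; fall back to consuming one byte.
                if (in.read() < 0) {
                    throw new EOFException("stream ended while skipping");
                }
                skipped = 1;
            }
            n -= skipped;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data: 8-byte magic, 24 filler bytes, one 24-byte entry.
        byte[] data = new byte[8 + 24 + 24];
        byte[] magic = "CHNKINK ".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(magic, 0, data, 0, magic.length);
        Arrays.fill(data, 32, 56, (byte) 7);

        InputStream in = new ByteArrayInputStream(data);
        String header = new String(readChunk(in, 8), StandardCharsets.US_ASCII);
        skipFully(in, 24);                // the filler is skipped, not buffered
        byte[] entry = readChunk(in, 24); // starts at stream offset 0x20
        System.out.println(header + "|" + entry[0]);
    }
}
```

With this pattern every destination array is exactly as large as the chunk
being read, so off + len never exceeds b.length.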

The first error is always like this:
Exception in thread "main" java.lang.IndexOutOfBoundsException: can't read past buffer boundaries
    at org.apache.poi.poifs.filesystem.NDocumentInputStream.read(NDocumentInputStream.java:142)
    at org.apache.poi.poifs.filesystem.DocumentInputStream.read(DocumentInputStream.java:118)

Now, if we examine NDocumentInputStream.read(byte[], int, int), there is this
conditional:

    if (off < 0 || len < 0 || b.length < off + len) {

This assumes that the byte array is large and that you fill it in sequence. If
you want to jump around, you'd presumably want to check b.length < len instead.

I tried that and got the next error:
Exception in thread "main" java.lang.IndexOutOfBoundsException
    at java.nio.Buffer.checkBounds(Buffer.java:530)
    at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:125)
    at org.apache.poi.poifs.filesystem.NDocumentInputStream.readFully(NDocumentInputStream.java:250)
    at org.apache.poi.poifs.filesystem.NDocumentInputStream.read(NDocumentInputStream.java:151)
    at org.apache.poi.poifs.filesystem.DocumentInputStream.read(DocumentInputStream.java:118)
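
Both traces look consistent with the destination array being too small for
the requested offset and length: a 25-byte array written at offset 8 with
length 24 would need 32 bytes. The sketch below reproduces the two checks in
isolation. The class and method names are hypothetical, and a plain
ByteBuffer stands in for whatever buffer POI uses internally (the trace only
shows that HeapByteBuffer.get is reached).

```java
import java.nio.ByteBuffer;

final class BoundsCheckSketch {

    /** The condition quoted from NDocumentInputStream.read, inverted: true if the write fits in b. */
    public static boolean fits(byte[] b, int off, int len) {
        return !(off < 0 || len < 0 || b.length < off + len);
    }

    /** True if ByteBuffer.get(dst, off, len) accepts the same arguments. */
    public static boolean bufferAccepts(byte[] dst, int off, int len) {
        // 64 zero bytes as a stand-in for a block of document data.
        ByteBuffer source = ByteBuffer.allocate(64);
        try {
            source.get(dst, off, len);
            return true;
        } catch (IndexOutOfBoundsException e) {
            // Thrown when off + len does not fit inside dst.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] filler = new byte[25];

        // The call from the report: offset 8, length 24, array of 25.
        System.out.println(fits(filler, 8, 24));
        System.out.println(bufferAccepts(filler, 8, 24));

        // Same length written from the start of the array.
        System.out.println(fits(filler, 0, 24));
        System.out.println(bufferAccepts(filler, 0, 24));
    }
}
```

Relaxing the first check to b.length < len lets the offset-8 call through,
but ByteBuffer.get still enforces off + len <= dst.length, which would
explain the second exception.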
