[ https://issues.apache.org/jira/browse/ARROW-17107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17568056#comment-17568056 ]
James Henderson commented on ARROW-17107:
-----------------------------------------

> All vectors that use offsets must have at least one offset (or more
> specifically: the number of offsets is always the number of values + 1, see
> [the
> spec|https://arrow.apache.org/docs/format/Columnar.html#variable-size-binary-layout])

mm, although this isn't the case for dense unions - I guess their 'offsets' are conceptually different from the offsets in the variable-width vectors?

> possibly empty vectors may not have allocated any memory as a
> micro-optimization?

that was my assumption too, yeah :)

> [Java] JSONFileWriter throws IOOBE writing an empty list
> --------------------------------------------------------
>
>                 Key: ARROW-17107
>                 URL: https://issues.apache.org/jira/browse/ARROW-17107
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>    Affects Versions: 8.0.0
>            Reporter: James Henderson
>            Priority: Minor
>
> Hey folks,
> I'm trying to write an empty ListVector out through the `JsonFileWriter`, and
> am getting an IOOBE. Stack trace is as follows:
>
> ```
> java.lang.IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))
>  at org.apache.arrow.memory.ArrowBuf.checkIndexD (ArrowBuf.java:318)
>     org.apache.arrow.memory.ArrowBuf.chk (ArrowBuf.java:305)
>     org.apache.arrow.memory.ArrowBuf.getInt (ArrowBuf.java:424)
>     org.apache.arrow.vector.ipc.JsonFileWriter.writeValueToGenerator (JsonFileWriter.java:270)
>     org.apache.arrow.vector.ipc.JsonFileWriter.writeFromVectorIntoJson (JsonFileWriter.java:237)
>     org.apache.arrow.vector.ipc.JsonFileWriter.writeFromVectorIntoJson (JsonFileWriter.java:253)
>     org.apache.arrow.vector.ipc.JsonFileWriter.writeFromVectorIntoJson (JsonFileWriter.java:253)
>     org.apache.arrow.vector.ipc.JsonFileWriter.writeFromVectorIntoJson (JsonFileWriter.java:253)
>     org.apache.arrow.vector.ipc.JsonFileWriter.writeBatch (JsonFileWriter.java:200)
>     org.apache.arrow.vector.ipc.JsonFileWriter.write (JsonFileWriter.java:190)
> ```
> It's trying to write the offset buffer of the list, which is empty. L224 of
> JFW.java sets `bufferValueCount` to 1 (because we're not a DUV), so we enter
> the `for` loop. We don't hit the `valueCount=0` condition in L230 (because
> we're not a varbinary or a varchar vector). So we fall into the `else`, which
> tries to write the 0th element in the offset vector, and IOOBE.
> Could we include 'list' in either the L224 or the L230 checks?
> Admittedly, I'm not aware of the history of this section, but it seems that,
> by the time we hit L230 (i.e. excluding DUV), any empty vector should yield a
> single 0?
> Let me know if there's any more info I can provide!
> Cheers,
> James

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
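
For illustration, the control flow described in the issue can be sketched without Arrow itself. This is a minimal, hypothetical model and not the actual `JsonFileWriter` code: `EmptyOffsetSketch`, `offsetsToWrite`, and `OFFSET_WIDTH` are invented names, and a plain `java.nio.ByteBuffer` stands in for the offset `ArrowBuf`. It assumes, as suggested in the issue, that any empty offset-based vector whose offset buffer was never allocated should yield a single 0.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the offset-writing step; not Arrow's real API.
final class EmptyOffsetSketch {
    // Assumption: 32-bit offsets, as in the list / varbinary / varchar layouts.
    static final int OFFSET_WIDTH = 4;

    // Returns the offsets that would be written for a vector with
    // `valueCount` values backed by `offsetBuffer`.
    static List<Integer> offsetsToWrite(ByteBuffer offsetBuffer, int valueCount) {
        List<Integer> out = new ArrayList<>();
        // Per the columnar spec: number of offsets = number of values + 1.
        int bufferValueCount = valueCount + 1;
        for (int i = 0; i < bufferValueCount; i++) {
            if (valueCount == 0 && offsetBuffer.capacity() < OFFSET_WIDTH) {
                // Empty vector with an unallocated offset buffer: emit the
                // single 0 instead of reading past the end of the buffer.
                out.add(0);
            } else {
                // Absolute read; throws IndexOutOfBoundsException when the
                // buffer is too small, which mirrors the reported failure.
                out.add(offsetBuffer.getInt(i * OFFSET_WIDTH));
            }
        }
        return out;
    }
}
```

Calling `EmptyOffsetSketch.offsetsToWrite(ByteBuffer.allocate(0), 0)` takes the guarded branch and returns a one-element list containing 0. Without the `valueCount == 0` guard, the same call would fall through to `getInt` on a zero-capacity buffer and throw.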