Voon Hou created FLINK-38138:
--------------------------------

             Summary: Array OOB error when trying to get a binary from a non-zero offset from a vector
                 Key: FLINK-38138
                 URL: https://issues.apache.org/jira/browse/FLINK-38138
             Project: Flink
          Issue Type: Bug
          Components: API / Type Serialization System
    Affects Versions: 1.20.0, 1.18.0, 1.17.0
            Reporter: Voon Hou

We have noticed that an OOB error is thrown when invoking {{getBinary}} under the following conditions:
# The {{HeapArrayVector}} contains multiple elements
# An element is read from a non-zero offset of the array
# The array obtained this way is converted to a binary

{code:java}
Caused by: java.lang.IllegalArgumentException: 72 > 36
	at java.util.Arrays.copyOfRange(Arrays.java:3519)
	at org.apache.flink.table.data.columnar.ColumnarArrayData.getBinary(ColumnarArrayData.java:138)
	at org.apache.hudi.table.format.cow.vector.ColumnarGroupRowData.getBinary(ColumnarGroupRowData.java:121)
	at org.apache.flink.table.data.RowData.lambda$createFieldGetter$245ca7d1$3(RowData.java:228)
	at org.apache.flink.table.runtime.typeutils.RowDataSerializer.toBinaryRow(RowDataSerializer.java:207)
	at org.apache.flink.table.data.writer.AbstractBinaryWriter.writeRow(AbstractBinaryWriter.java:147)
	at org.apache.flink.table.data.writer.BinaryArrayWriter.writeRow(BinaryArrayWriter.java:30)
	at org.apache.flink.table.data.writer.BinaryWriter.write(BinaryWriter.java:155)
	at org.apache.flink.table.runtime.typeutils.MapDataSerializer.toBinaryMap(MapDataSerializer.java:179)
	at org.apache.flink.table.runtime.typeutils.MapDataSerializer.copy(MapDataSerializer.java:113)
	at org.apache.flink.table.runtime.typeutils.MapDataSerializer.copy(MapDataSerializer.java:44)
	at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copyRowData(RowDataSerializer.java:170)
	at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:131)
{code}

The error above is caused by the code below, which passes {{byteArray.len}} as the end index of the copy:
{code:java}
@Override
public byte[] getBinary(int pos) {
    BytesColumnVector.Bytes byteArray = getByteArray(pos);
    if (byteArray.len == byteArray.data.length) {
        return byteArray.data;
    } else {
        return Arrays.copyOfRange(byteArray.data, byteArray.offset, byteArray.len);
    }
}
{code}

The signature of {{Arrays.copyOfRange}} is:
{code:java}
public static byte[] copyOfRange(byte[] original, int from, int to) {
{code}

The {{java.util.Arrays.copyOfRange(original, from, to)}} method copies the specified range of the {{original}} array into a new array.
* {{from}}: The initial index of the range to be copied, inclusive.
* {{to}}: The final index of the range to be copied, *exclusive*.

The third argument is therefore an end index, not a length. With a non-zero offset, passing {{byteArray.len}} yields a range that is too short, or an {{IllegalArgumentException}} once {{from}} exceeds {{to}}, as in the trace above ({{72 > 36}}).

Hence, the correct way of invoking {{Arrays.copyOfRange}} should be:
{code:java}
return Arrays.copyOfRange(
    byteArray.data,
    byteArray.offset,
    byteArray.offset + byteArray.len);
{code}

I will submit a fix for this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
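
For reference, the failure described in this report can be reproduced without Flink. The following is a minimal sketch, not the actual fix: the class {{CopyOfRangeOffsetDemo}} and its methods {{buggySlice}} and {{fixedSlice}} are hypothetical names, the buffer layout (three 36-byte elements packed back to back) is an assumption chosen to match the {{72 > 36}} message in the trace, and the {{len == data.length}} shortcut from {{getBinary}} is left out for brevity.

```java
import java.util.Arrays;

/** Hypothetical stand-alone demo; not part of Flink. */
class CopyOfRangeOffsetDemo {

    // Mirrors the reported call: 'len' is passed where an exclusive end index is expected.
    static byte[] buggySlice(byte[] data, int offset, int len) {
        return Arrays.copyOfRange(data, offset, len);
    }

    // Mirrors the proposed fix: the end index is offset + len.
    static byte[] fixedSlice(byte[] data, int offset, int len) {
        return Arrays.copyOfRange(data, offset, offset + len);
    }

    public static void main(String[] args) {
        // Assumed layout: three 36-byte elements in one 108-byte buffer.
        int len = 36;
        byte[] data = new byte[3 * len];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i / len); // element k is filled with the value k
        }

        // Element 0 (offset 0): the buggy call happens to be correct.
        System.out.println(buggySlice(data, 0, len).length);  // 36

        // Element 1 (offset 36): from == to, so an empty array is returned silently.
        System.out.println(buggySlice(data, 36, len).length); // 0

        // Element 2 (offset 72): from > to, so copyOfRange throws.
        try {
            buggySlice(data, 72, len);
        } catch (IllegalArgumentException e) {
            System.out.println("buggy call failed: " + e.getMessage());
        }

        // The fixed call returns the full 36-byte element.
        System.out.println(fixedSlice(data, 72, len).length); // 36
    }
}
```

Note that the element at offset 36 does not fail at all with the buggy call; it comes back as an empty array. The exception only appears once the offset exceeds the element length, so shorter offsets may silently return truncated data.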