Voon Hou created FLINK-38138:
--------------------------------
Summary: Array OOB error when trying to get a binary from a
non-zero offset from a vector
Key: FLINK-38138
URL: https://issues.apache.org/jira/browse/FLINK-38138
Project: Flink
Issue Type: Bug
Components: API / Type Serialization System
Affects Versions: 1.20.0, 1.18.0, 1.17.0
Reporter: Voon Hou
We have noticed that an out-of-bounds (OOB) error is thrown when invoking
{{getBinary}} under the following conditions:
# The HeapArrayVector contains multiple elements.
# An element is read from a non-zero offset of the array.
# The array obtained this way is converted to a binary.
{code:java}
Caused by: java.lang.IllegalArgumentException: 72 > 36
    at java.util.Arrays.copyOfRange(Arrays.java:3519)
    at org.apache.flink.table.data.columnar.ColumnarArrayData.getBinary(ColumnarArrayData.java:138)
    at org.apache.hudi.table.format.cow.vector.ColumnarGroupRowData.getBinary(ColumnarGroupRowData.java:121)
    at org.apache.flink.table.data.RowData.lambda$createFieldGetter$245ca7d1$3(RowData.java:228)
    at org.apache.flink.table.runtime.typeutils.RowDataSerializer.toBinaryRow(RowDataSerializer.java:207)
    at org.apache.flink.table.data.writer.AbstractBinaryWriter.writeRow(AbstractBinaryWriter.java:147)
    at org.apache.flink.table.data.writer.BinaryArrayWriter.writeRow(BinaryArrayWriter.java:30)
    at org.apache.flink.table.data.writer.BinaryWriter.write(BinaryWriter.java:155)
    at org.apache.flink.table.runtime.typeutils.MapDataSerializer.toBinaryMap(MapDataSerializer.java:179)
    at org.apache.flink.table.runtime.typeutils.MapDataSerializer.copy(MapDataSerializer.java:113)
    at org.apache.flink.table.runtime.typeutils.MapDataSerializer.copy(MapDataSerializer.java:44)
    at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copyRowData(RowDataSerializer.java:170)
    at org.apache.flink.table.runtime.typeutils.RowDataSerializer.copy(RowDataSerializer.java:131)
{code}
The error above is caused by the following code:
{code:java}
@Override
public byte[] getBinary(int pos) {
    BytesColumnVector.Bytes byteArray = getByteArray(pos);
    if (byteArray.len == byteArray.data.length) {
        return byteArray.data;
    } else {
        return Arrays.copyOfRange(byteArray.data, byteArray.offset, byteArray.len);
    }
}
{code}
The function signature of {{Arrays.copyOfRange}} is as follows:
{code:java}
public static byte[] copyOfRange(byte[] original, int from, int to)
{code}
The {{java.util.Arrays.copyOfRange(original, from, to)}} method copies the
specified range of the {{original}} array into a new array.
* {{from}}: The initial index of the range to be copied, inclusive.
* {{to}}: The final index of the range to be copied, *exclusive*.
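The current code passes {{byteArray.len}} as {{to}}, i.e. a length where an end
index is expected, so the call fails whenever {{byteArray.offset}} is greater
than {{byteArray.len}} (in the trace above, offset 72 and length 36). Hence, the
correct way of invoking {{Arrays.copyOfRange}} should be: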
{code:java}
return Arrays.copyOfRange(
byteArray.data, byteArray.offset, byteArray.offset + byteArray.len);
{code}
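To illustrate the difference outside of Flink, here is a minimal standalone sketch. The class {{SliceDemo}}, its two helper methods, and the 108-byte buffer holding three 36-byte elements are hypothetical and chosen only to match the numbers in the stack trace; they are not part of the Flink code base.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of Flink.
final class SliceDemo {

    // Mirrors the reported call: a length is passed where an end index is expected.
    static byte[] buggySlice(byte[] data, int offset, int len) {
        return Arrays.copyOfRange(data, offset, len);
    }

    // Proposed form: the end index is offset + len (exclusive).
    static byte[] fixedSlice(byte[] data, int offset, int len) {
        return Arrays.copyOfRange(data, offset, offset + len);
    }

    public static void main(String[] args) {
        // Assumed layout: three 36-byte elements stored back to back.
        byte[] data = new byte[108];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }

        // Third element: offset 72, length 36.
        byte[] third = fixedSlice(data, 72, 36);
        System.out.println(third.length + " bytes, first byte = " + third[0]);

        try {
            buggySlice(data, 72, 36);
        } catch (IllegalArgumentException e) {
            // Same failure mode as in the stack trace: from > to.
            System.out.println("buggy call failed: " + e.getMessage());
        }
    }
}
```

Under the same assumptions, the buggy form does not always throw: for the second element (offset 36, length 36) it returns an empty array, because {{from}} equals {{to}}.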
I will submit a fix for this.