Syed Shameerur Rahman created HIVE-25443:
--------------------------------------------

             Summary: Arrow SerDe Cannot serialize/deserialize complex data types When there are more than 1024 values
                 Key: HIVE-25443
                 URL: https://issues.apache.org/jira/browse/HIVE-25443
             Project: Hive
          Issue Type: Bug
          Components: Serializers/Deserializers
    Affects Versions: 3.1.2, 3.1.1, 3.0.0, 3.1.0
            Reporter: Syed Shameerur Rahman
            Assignee: Syed Shameerur Rahman
             Fix For: 4.0.0


Complex data types such as MAP and STRUCT cannot be serialized/deserialized using the Arrow SerDe when there are more than 1024 values. This happens because the ColumnVector is always initialized with a size of 1024.

Issue #1 : https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/arrow/ArrowColumnarBatchSerDe.java#L213

Issue #2 : https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/arrow/ArrowColumnarBatchSerDe.java#L215

Sample unit test to reproduce the case in TestArrowColumnarBatchSerDe :

{code:java}
@Test
public void testListBooleanWithMoreThan1024Values() throws SerDeException {
  String[][] schema = {
      {"boolean_list", "array<boolean>"},
  };

  Object[][] rows = new Object[1025][1];
  for (int i = 0; i < 1025; i++) {
    rows[i][0] = new BooleanWritable(true);
  }

  initAndSerializeAndDeserialize(schema, toList(rows));
}
{code}
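
To illustrate the root cause described in this report, here is a minimal, self-contained sketch of a vector that is always allocated with 1024 slots and then asked to hold 1025 values. Everything in it is a toy stand-in and not Hive code: FixedBatchSketch, ToyBooleanVector, writeUnchecked and writeWithResize are hypothetical names, and the ArrayIndexOutOfBoundsException is only how the toy fails, since the report does not state which exception the real SerDe throws. Growing the vector before writing is one plausible remedy, not necessarily the fix that was committed.

```java
/** Toy model of a fixed-size column vector; NOT Hive's ColumnVector. */
class FixedBatchSketch {
    // Mirrors the size of 1024 named in the report.
    static final int DEFAULT_SIZE = 1024;

    /** Hypothetical child vector holding the flattened element values. */
    static final class ToyBooleanVector {
        boolean[] values;

        ToyBooleanVector(int size) {
            values = new boolean[size];
        }

        /** Grows the backing array if needed, optionally keeping old data. */
        void ensureSize(int size, boolean preserveData) {
            if (size <= values.length) {
                return;
            }
            boolean[] old = values;
            values = new boolean[size];
            if (preserveData) {
                System.arraycopy(old, 0, values, 0, old.length);
            }
        }
    }

    /** Writes count values without checking capacity (the buggy pattern). */
    static void writeUnchecked(ToyBooleanVector v, int count) {
        for (int i = 0; i < count; i++) {
            v.values[i] = true;
        }
    }

    /** Grows the vector to the required size first, then writes. */
    static void writeWithResize(ToyBooleanVector v, int count) {
        v.ensureSize(count, false);
        writeUnchecked(v, count);
    }

    public static void main(String[] args) {
        ToyBooleanVector v = new ToyBooleanVector(DEFAULT_SIZE);
        try {
            writeUnchecked(v, 1025); // index 1024 is past the last slot
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("toy vector overflowed");
        }
        writeWithResize(v, 1025); // succeeds once capacity is grown
        System.out.println("capacity after resize: " + v.values.length);
    }
}
```

In the toy, the 1025th write fails only because capacity was fixed at allocation time; sizing the vector from the actual number of values removes the limit.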