Jingsong Lee created FLINK-12796:
------------------------------------
Summary: Introduce BaseArray and BaseMap to reduce conversion
overhead to blink
Key: FLINK-12796
URL: https://issues.apache.org/jira/browse/FLINK-12796
Project: Flink
Issue Type: Bug
Components: Table SQL / Planner
Reporter: Jingsong Lee
Assignee: Jingsong Lee
Currently, in Flink's internal data format, arrays are represented only by
BinaryArray and maps only by BinaryMap. If a user writes a UDAF that takes
arrays as parameters and returns arrays, this leads to frequent conversions
between Java arrays and BinaryArrays (each conversion copies the entire array),
which is very time-consuming.
To avoid copying during conversion, BaseArray and BaseMap are introduced as
internal formats.
BaseArray is the parent of GenericArray and BinaryArray, providing various read
and write operations on an array.
GenericArray is a wrapper class that internally holds a Java array. The elements
of this array are stored in the internal data format.
Conversion can be avoided when the element type is a primitive type or a type
whose internal and external formats are identical. (For details, see
DataFormatConverters.)
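
The relationship described above can be sketched roughly as follows. This is
only an illustration under simplifying assumptions, not the actual Flink code:
the interface and classes below are hypothetical, reduced stand-ins that reuse
the names from this issue, support only int elements, and model the binary
layout with a plain ByteBuffer. The method names (numElements, getInt,
toIntArray, fromPrimitiveArray, sum) are assumptions chosen for the sketch.

```java
import java.nio.ByteBuffer;

// Hypothetical, simplified stand-in for the proposed common parent type.
interface BaseArray {
    int numElements();
    int getInt(int ordinal);
}

// Wraps an existing Java array. Construction copies nothing.
final class GenericArray implements BaseArray {
    private final int[] values;

    GenericArray(int[] values) {
        this.values = values;
    }

    public int numElements() {
        return values.length;
    }

    public int getInt(int ordinal) {
        return values[ordinal];
    }

    // Hands back the wrapped array itself, so the external form is free.
    int[] toIntArray() {
        return values;
    }
}

// Simplified stand-in for a binary layout. Construction copies every element.
final class BinaryArray implements BaseArray {
    private final ByteBuffer buffer;
    private final int size;

    private BinaryArray(ByteBuffer buffer, int size) {
        this.buffer = buffer;
        this.size = size;
    }

    static BinaryArray fromPrimitiveArray(int[] values) {
        ByteBuffer buffer = ByteBuffer.allocate(values.length * Integer.BYTES);
        for (int value : values) {
            buffer.putInt(value); // one write per element: a full copy
        }
        return new BinaryArray(buffer, values.length);
    }

    public int numElements() {
        return size;
    }

    public int getInt(int ordinal) {
        return buffer.getInt(ordinal * Integer.BYTES); // absolute read
    }
}

final class ArraySketch {
    // Code written against BaseArray works with either representation.
    static long sum(BaseArray array) {
        long total = 0;
        for (int i = 0; i < array.numElements(); i++) {
            total += array.getInt(i);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        System.out.println(sum(new GenericArray(data)));
        System.out.println(sum(BinaryArray.fromPrimitiveArray(data)));
    }
}
```

In this sketch, a function that receives a GenericArray over a primitive array
can read it, or hand the underlying array back, without any per-element copy,
while the binary variant pays for a full copy on every conversion.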
In our benchmark, the performance of a UDAF using primitive arrays improved
10-fold.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)