Liya Fan created ARROW-6172:
-------------------------------

             Summary: [Java] Avoid creating value holders repeatedly when reading data from JDBC
                 Key: ARROW-6172
                 URL: https://issues.apache.org/jira/browse/ARROW-6172
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Java
            Reporter: Liya Fan
            Assignee: Liya Fan

When converting JDBC data to Arrow data, a value holder is created for every single value. The following code snippet gives an example:

    NullableSmallIntHolder holder = new NullableSmallIntHolder();
    holder.isSet = isNonNull ? 1 : 0;
    if (isNonNull) {
      holder.value = (short) value;
    }
    smallIntVector.setSafe(rowCount, holder);
    smallIntVector.setValueCount(rowCount + 1);

This is inefficient in terms of both memory usage and computation. For most types, we can improve performance by setting the value directly. For example, benchmarks on IntVector show that setting the int value directly lowers the average time per operation by about 20%:

    Benchmark                         Mode  Cnt   Score   Error  Units
    IntBenchmarks.setIntDirectly      avgt    5  15.397 ± 0.018  us/op
    IntBenchmarks.setWithValueHolder  avgt    5  19.198 ± 0.789  us/op
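
To make the difference between the two write paths concrete, here is a minimal, self-contained sketch. It does not use the Arrow classes: ToyHolder and ToySmallIntVector are hypothetical stand-ins invented for illustration, and the class and method names (DirectSetSketch, fillWithHolders, fillDirectly) are assumptions as well. The sketch only shows that both paths produce the same vector contents while the direct path allocates no per-value object. It says nothing about how the actual change in Arrow is implemented.

```java
import java.util.Arrays;

final class DirectSetSketch {

    /** Hypothetical stand-in for a nullable value holder; not an Arrow class. */
    static final class ToyHolder {
        int isSet;
        short value;
    }

    /** Hypothetical stand-in for a nullable fixed-width vector; not an Arrow class. */
    static final class ToySmallIntVector {
        private short[] values = new short[4];
        private boolean[] valid = new boolean[4];
        private int valueCount;

        /** Grow the backing arrays so that index is writable. */
        private void ensureCapacity(int index) {
            if (index >= values.length) {
                int newSize = Math.max(values.length * 2, index + 1);
                values = Arrays.copyOf(values, newSize);
                valid = Arrays.copyOf(valid, newSize);
            }
        }

        /** Holder-based write: the caller has to populate a holder first. */
        void setSafe(int index, ToyHolder holder) {
            ensureCapacity(index);
            valid[index] = holder.isSet != 0;
            values[index] = holder.isSet != 0 ? holder.value : 0;
        }

        /** Direct write of a non-null value. */
        void setSafe(int index, short value) {
            ensureCapacity(index);
            valid[index] = true;
            values[index] = value;
        }

        /** Direct write of a null. */
        void setNull(int index) {
            ensureCapacity(index);
            valid[index] = false;
            values[index] = 0;
        }

        void setValueCount(int count) { valueCount = count; }
        int getValueCount() { return valueCount; }
        boolean isNull(int index) { return !valid[index]; }
        short get(int index) { return values[index]; }
    }

    /** Mirrors the pattern from the issue: one holder allocated per value. */
    static ToySmallIntVector fillWithHolders(Short[] column) {
        ToySmallIntVector vector = new ToySmallIntVector();
        for (int row = 0; row < column.length; row++) {
            boolean isNonNull = column[row] != null;
            ToyHolder holder = new ToyHolder(); // allocation on every row
            holder.isSet = isNonNull ? 1 : 0;
            if (isNonNull) {
                holder.value = column[row];
            }
            vector.setSafe(row, holder);
            vector.setValueCount(row + 1);
        }
        return vector;
    }

    /** Proposed pattern: write the value (or the null) directly, no holder. */
    static ToySmallIntVector fillDirectly(Short[] column) {
        ToySmallIntVector vector = new ToySmallIntVector();
        for (int row = 0; row < column.length; row++) {
            if (column[row] != null) {
                vector.setSafe(row, column[row].shortValue());
            } else {
                vector.setNull(row);
            }
            vector.setValueCount(row + 1);
        }
        return vector;
    }

    public static void main(String[] args) {
        Short[] column = {(short) 7, null, (short) -3};
        ToySmallIntVector direct = fillDirectly(column);
        for (int i = 0; i < direct.getValueCount(); i++) {
            System.out.println(direct.isNull(i) ? "null" : String.valueOf(direct.get(i)));
        }
    }
}
```

In the real JDBC adapter the equivalent change would presumably rely on the direct setters of the Arrow vectors, such as the setSafe overload that takes a primitive value and setNull for missing values. Whether every vector type offers a suitable overload should be checked against the Arrow Java API before relying on it.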