Tianshuo Deng created PARQUET-341:
-------------------------------------
Summary: Improve write performance with wide schema sparse data
Key: PARQUET-341
URL: https://issues.apache.org/jira/browse/PARQUET-341
Project: Parquet
Issue Type: Improvement
Reporter: Tianshuo Deng
Assignee: Tianshuo Deng
In the write path, when the data is very sparse, most of the time is spent
writing nulls.
Currently, writing nulls follows the same code path as writing values: when a
group is null, the writer recursively traverses all the leaves beneath it.
Because all the leaves beneath a null group must be written as null with the
same repetition level and definition level, we can eliminate the recursive
calls needed to reach the leaves.
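
The idea can be illustrated with a minimal sketch. This is not the parquet-mr
code; the Node class, the function names, and the example schema are all
hypothetical and only show the technique: compute the flat list of leaves under
each group once per schema, then write a null group by iterating that list
instead of recursing through the tree.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical schema node; not a parquet-mr class.
@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)
    leaf_cache: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def build_leaf_cache(node: Node) -> List[Node]:
    """Run once per schema: store on every node the flat list of leaves under it."""
    if node.is_leaf():
        node.leaf_cache = [node]
    else:
        node.leaf_cache = [
            leaf for child in node.children for leaf in build_leaf_cache(child)
        ]
    return node.leaf_cache


def write_null_recursive(node: Node, r: int, d: int, out: List[Tuple[str, int, int]]) -> None:
    """Current behaviour as described: walk the subtree for every null group."""
    if node.is_leaf():
        out.append((node.name, r, d))
    else:
        for child in node.children:
            write_null_recursive(child, r, d, out)


def write_null_cached(node: Node, r: int, d: int, out: List[Tuple[str, int, int]]) -> None:
    """Proposed behaviour: every leaf gets the same (r, d), so use the cached list."""
    for leaf in node.leaf_cache:
        out.append((leaf.name, r, d))


# Example schema (assumed for illustration): root { id, address { street, geo { lat, lon } } }
geo = Node("geo", [Node("lat"), Node("lon")])
address = Node("address", [Node("street"), geo])
schema = Node("root", [Node("id"), address])
build_leaf_cache(schema)

recursive_out: List[Tuple[str, int, int]] = []
cached_out: List[Tuple[str, int, int]] = []
write_null_recursive(address, 0, 1, recursive_out)
write_null_cached(address, 0, 1, cached_out)
```

Both functions emit the same null entries for the leaves under "address"; the
cached version does it with a single flat loop per null group, which matters
when a wide schema has many null groups per record.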