Gang Wu created PARQUET-2253:
--------------------------------

             Summary: Postpone dictionary encoding decision for starting null pages.
                 Key: PARQUET-2253
                 URL: https://issues.apache.org/jira/browse/PARQUET-2253
             Project: Parquet
          Issue Type: Improvement
          Components: parquet-mr
            Reporter: Gang Wu
            Assignee: Gang Wu

Discussion from the dev mailing list: [Fallback Encoding for Very Sparse or Sorted Datasets-Apache Mail Archives|https://lists.apache.org/thread/jkkxhknsxw8cdv8fxf1bstqdn5fzw2pl]

If the leading values of a column are all nulls, dictionary encoding is disabled when the first data page is created. This loses the benefit of dictionary encoding even when the values that follow are a good fit for it.
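
To illustrate the behaviour described above, here is a minimal sketch in Java. It is not parquet-mr code: the class ToyColumnWriter, the method keepsDictionary, the postponeOnAllNullPage flag, and the size check (one byte per dictionary index) are all hypothetical and only model the idea. The assumption is that the writer judges dictionary encoding once, at the end of the first data page, and that a page holding only nulls gives it nothing to judge.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of postponing the dictionary decision; not parquet-mr code. */
public class DeferredDictionaryDecision {

    /** Toy writer for a string column; a null element stands for a null value. */
    static final class ToyColumnWriter {
        private final boolean postponeOnAllNullPage;
        private final Set<String> dictionary = new HashSet<>();
        private long dictionaryBytes = 0; // size of all distinct values seen so far
        private long pageRawBytes = 0;    // plain-encoded size of the current page
        private long pageValueCount = 0;  // non-null values in the current page
        private boolean decided = false;
        private boolean useDictionary = true;

        ToyColumnWriter(boolean postponeOnAllNullPage) {
            this.postponeOnAllNullPage = postponeOnAllNullPage;
        }

        void write(String value) {
            if (value == null) {
                return; // nulls add nothing to the value sizes in this model
            }
            pageValueCount++;
            pageRawBytes += value.length();
            if (dictionary.add(value)) {
                dictionaryBytes += value.length();
            }
        }

        void finishPage() {
            boolean nothingToJudge = pageValueCount == 0;
            // Decide once. With postponement on, skip pages that hold no values.
            if (!decided && !(postponeOnAllNullPage && nothingToJudge)) {
                long encodedBytes = pageValueCount; // assumed: 1 byte per index
                useDictionary = dictionaryBytes + encodedBytes < pageRawBytes;
                decided = true;
            }
            pageRawBytes = 0;
            pageValueCount = 0;
        }

        boolean usesDictionary() {
            return useDictionary;
        }
    }

    /** Writes the given pages and reports whether dictionary encoding survived. */
    public static boolean keepsDictionary(boolean postpone, String[][] pages) {
        ToyColumnWriter writer = new ToyColumnWriter(postpone);
        for (String[] page : pages) {
            for (String value : page) {
                writer.write(value);
            }
            writer.finishPage();
        }
        return writer.usesDictionary();
    }

    public static void main(String[] args) {
        String[][] pages = {
            {null, null, null}, // first page: nulls only
            {"north", "north", "south", "north", "south", "south"},
        };
        System.out.println("eager decision keeps dictionary: " + keepsDictionary(false, pages));
        System.out.println("postponed decision keeps dictionary: " + keepsDictionary(true, pages));
    }
}
```

In this model the eager writer compares 0 bytes against 0 bytes on the all-null page, finds no saving, and gives up on the dictionary for good. The postponing writer waits for the first page that contains values, where two distinct strings repeated six times clearly favour the dictionary.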