benj created DRILL-7395:
---------------------------

Summary: Partial Partition By to CTAS Parquet files
Key: DRILL-7395
URL: https://issues.apache.org/jira/browse/DRILL-7395
Project: Apache Drill
Issue Type: Improvement
Components: Storage - Parquet
Affects Versions: 1.16.0
Reporter: benj

For a data set in which a few values dominate while most values occur only rarely, it would be useful to be able to create Parquet files with a partial _PARTITION BY_. It would then be possible to group all the rarely occurring values together without being "impacted" by the "too" common values.

It is not exactly the same thing, but some databases offer partial indexes (https://www.postgresql.org/docs/current/indexes-partial.html).

--
This message was sent by Atlassian Jira (v8.3.4#803005)
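
To make the request above more concrete, here is a minimal Python sketch of one possible reading of a partial partition: values that occur often keep their own partition, and all rarely occurring values share a single catch-all partition. This is not Drill code and not an existing Drill feature. The function name `partial_partition_keys`, the `min_count` threshold and the `__OTHER__` bucket label are all hypothetical names chosen for illustration.

```python
from collections import Counter


def partial_partition_keys(values, min_count, other_bucket="__OTHER__"):
    """Return one partition key per input value.

    Values seen at least `min_count` times keep their own key;
    every other value is mapped to the shared `other_bucket` key.
    (Hypothetical helper, not part of Apache Drill.)
    """
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_bucket for v in values]


# Two dominant values and three rare ones.
rows = ["FR", "FR", "FR", "US", "US", "US", "BE", "CH", "LU"]
keys = partial_partition_keys(rows, min_count=3)
```

With this input, the dominant values "FR" and "US" each get their own partition key, while "BE", "CH" and "LU" end up together under the shared key, so three partitions are produced instead of five.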