Re: [DISCUSS][SPIP][SPARK-29031] Materialized columns

2019-09-15 Thread Wenchen Fan
> 1. It is a waste of IO. The whole column (in Map format) has to be read, and Spark extracts the required keys from the map, even though the query requires only one or a few keys in the map.

This sounds like a similar use case to nested column pruning. We should push down the map key extraction to t

[DISCUSS][SPIP][SPARK-29031] Materialized columns

2019-09-10 Thread Jason Guo
Hi, I'd like to propose a feature named materialized columns. This feature will speed up queries on complex-type columns. https://docs.google.com/document/d/186bzUv4CRwoYY_KliNWTexkNCysQo3VUTLQVrVijyl4/edit?usp=sharing *Background* In the data warehouse domain, there is a common requirement to add new f