Hi dev,


We would like to raise a discussion about supporting `DataFilter pushed down 
dynamically`. The idea is to push the data filters down together with the 
partition filters, and then prune the partition filters at runtime before 
pushing to Parquet, so that a smaller filter is pushed to Parquet. More 
details can be found in 35669.



Before this patch, the physical plan is:

Filter:
(((id#14 < 2) AND (part#15 = 0)) OR ((id#14 > 3) AND (part#15 = 1)))

ParquetScan DataFilters: 
[(((id#14 < 2) AND (part#15 = 0)) OR ((id#14 > 3) AND (part#15 = 1)))]

PartitionFilters: 
[((part#15 = 0) OR (part#15 = 1))]

PushedFilters: 
[Or(LessThan(id,2),GreaterThan(id,3))]


After this patch, the physical plan is:

Filter:
(((id#14 < 2) AND (part#15 = 0)) OR ((id#14 > 3) AND (part#15 = 1)))

ParquetScan DataFilters: 
[(((id#14 < 2) AND (part#15 = 0)) OR ((id#14 > 3) AND (part#15 = 1)))]

PartitionFilters: 
[((part#15 = 0) OR (part#15 = 1))]

PushedDynamicalFilters: 
[Or(And(LessThan(id,2),EqualTo(part,0)),And(GreaterThan(id,3),EqualTo(part,1)))]


Please note that PushedFilters is replaced by PushedDynamicalFilters. The value
[Or(And(LessThan(id,2),EqualTo(part,0)),And(GreaterThan(id,3),EqualTo(part,1)))]
means that the data filter id < 2 is dynamically pushed to partition 0 and the 
data filter id > 3 is dynamically pushed to partition 1.
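
To make the pruning step concrete, here is a minimal Python sketch of how the combined filter could be reduced for one partition at runtime. It is only an illustration and not the code in the patch: the classes merely mirror the filter names shown in the plan, while `AlwaysTrue`, `AlwaysFalse`, and `prune` are hypothetical helpers introduced for this example. The sketch only resolves `EqualTo` predicates on partition columns and leaves every other predicate untouched.

```python
from dataclasses import dataclass


# Tiny stand-ins for the filter nodes shown in the plan (not Spark classes).
@dataclass(frozen=True)
class LessThan:
    attribute: str
    value: int


@dataclass(frozen=True)
class GreaterThan:
    attribute: str
    value: int


@dataclass(frozen=True)
class EqualTo:
    attribute: str
    value: int


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


# Hypothetical constants used while simplifying.
@dataclass(frozen=True)
class AlwaysTrue:
    pass


@dataclass(frozen=True)
class AlwaysFalse:
    pass


def prune(f, partition_values):
    """Resolve EqualTo predicates on partition columns, then simplify And/Or."""
    if isinstance(f, EqualTo) and f.attribute in partition_values:
        matches = partition_values[f.attribute] == f.value
        return AlwaysTrue() if matches else AlwaysFalse()
    if isinstance(f, And):
        l, r = prune(f.left, partition_values), prune(f.right, partition_values)
        if AlwaysFalse() in (l, r):
            return AlwaysFalse()
        if l == AlwaysTrue():
            return r
        if r == AlwaysTrue():
            return l
        return And(l, r)
    if isinstance(f, Or):
        l, r = prune(f.left, partition_values), prune(f.right, partition_values)
        if AlwaysTrue() in (l, r):
            return AlwaysTrue()
        if l == AlwaysFalse():
            return r
        if r == AlwaysFalse():
            return l
        return Or(l, r)
    return f  # data-column predicates are kept as they are


pushed = Or(
    And(LessThan("id", 2), EqualTo("part", 0)),
    And(GreaterThan("id", 3), EqualTo("part", 1)),
)

filter_for_part_0 = prune(pushed, {"part": 0})
filter_for_part_1 = prune(pushed, {"part": 1})
```

Under these assumptions, files in partition 0 would receive only the id < 2 filter and files in partition 1 only the id > 3 filter, instead of the weaker Or(LessThan(id,2),GreaterThan(id,3)) that is pushed today.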

We would like to start a discussion about this PR and whether we can use 
PushedDynamicalFilters to represent dynamic filter pushdown. Any feedback is 
welcome.

Thanks!

Jacky Lee

