Dandandan commented on PR #19639:
URL: https://github.com/apache/datafusion/pull/19639#issuecomment-3722233173

   So I guess the main factor is that expressions like this are super expensive to evaluate:
   ```
   predicate=DynamicFilter [ l_partkey@1 >= 3 AND l_partkey@1 <= 199962 AND 
hash_lookup ] AND DynamicFilter [ l_suppkey@2 >= 1 AND l_suppkey@2 <= 10000 AND 
hash_lookup ] AND DynamicFilter [ CASE hash_repartition % 10 WHEN 0 THEN 
l_suppkey@2 >= 1 AND l_suppkey@2 <= 10000 AND l_partkey@1 >= 7 AND l_partkey@1 
<= 199998 AND hash_lookup WHEN 1 THEN l_suppkey@2 >= 1 AND l_suppkey@2 <= 10000 
AND l_partkey@1 >= 2 AND l_partkey@1 <= 199996 AND hash_lookup WHEN 2 THEN 
l_suppkey@2 >= 1 AND l_suppkey@2 <= 10000 AND l_partkey@1 >= 3 AND l_partkey@1 
<= 200000 AND hash_lookup WHEN 3 THEN l_suppkey@2 >= 1 AND l_suppkey@2 <= 10000 
AND l_partkey@1 >= 1 AND l_partkey@1 <= 200000 AND hash_lookup WHEN 4 THEN 
l_suppkey@2 >= 1 AND l_suppkey@2 <= 10000 AND l_partkey@1 >= 3 AND l_partkey@1 
<= 199998 AND hash_lookup WHEN 5 THEN l_suppkey@2 >= 1 AND l_suppkey@2 <= 10000 
AND l_partkey@1 >= 3 AND l_partkey@1 <= 199998 AND hash_lookup WHEN 6 THEN 
l_suppkey@2 >= 1 AND l_suppkey@2 <= 10000 AND l_partkey@1 >= 9 AND 
l_partkey@1 <= 200000 AND hash_lookup WHEN 7 THEN l_suppkey@2 >= 1 AND 
l_suppkey@2 <= 10000 AND l_partkey@1 >= 1 AND l_partkey@1 <= 200000 AND 
hash_lookup WHEN 8 THEN l_suppkey@2 >= 1 AND l_suppkey@2 <= 10000 AND 
l_partkey@1 >= 2 AND l_partkey@1 <= 199998 AND hash_lookup WHEN 9 THEN 
l_suppkey@2 >= 1 AND l_suppkey@2 <= 10000 AND l_partkey@1 >= 8 AND l_partkey@1 
<= 199999 AND hash_lookup ELSE false END ] AND DynamicFilter [ CASE 
hash_repartition % 10 WHEN 0 THEN l_orderkey@0 >= 5 AND l_orderkey@0 <= 5999847 
AND hash_lookup WHEN 1 THEN l_orderkey@0 >= 6 AND l_orderkey@0 <= 5999970 AND 
hash_lookup WHEN 2 THEN l_orderkey@0 >= 37 AND l_orderkey@0 <= 5999975 AND 
hash_lookup WHEN 3 THEN l_orderkey@0 >= 1 AND l_orderkey@0 <= 5999971 AND 
hash_lookup WHEN 4 THEN l_orderkey@0 >= 131 AND l_orderkey@0 <= 5999969 AND 
hash_lookup WHEN 5 THEN l_orderkey@0 >= 66 AND l_orderkey@0 <= 5999941 AND 
hash_lookup WHEN 6 THEN l_orderkey@0 >= 34 AND l_orderkey@0 <= 5999974 AND 
hash_lookup WHEN 7 THEN l_orderkey@0 >= 4 AND l_orderkey@0 <= 5999940 AND 
hash_lookup WHEN 8 THEN l_orderkey@0 
>= 3 AND l_orderkey@0 <= 5999879 AND hash_lookup WHEN 9 THEN l_orderkey@0 >= 71 
AND l_orderkey@0 <= 6000000 AND hash_lookup ELSE false END ], 
pruning_predicate=l_partkey_null_count@1 != row_count@2 AND l_partkey_max@0 >= 
3 AND l_partkey_null_count@1 != row_count@2 AND l_partkey_min@3 <= 199962 AND 
l_suppkey_null_count@5 != row_count@2 AND l_suppkey_max@4 >= 1 AND 
l_suppkey_null_count@5 != row_count@2 AND l_suppkey_min@6 <= 10000, 
required_guarantees=[], metrics=[output_rows=319.4 K, elapsed_compute=10ns, 
output_bytes=21.9 MB, output_batches=733, files_ranges_pruned_statistics=10 
total → 10 matched, row_groups_pruned_statistics=6 total → 6 matched, 
row_groups_pruned_bloom_filter=6 total → 6 matched, page_index_rows_pruned=6.00 
M total → 6.00 M matched, batches_split=0, bytes_scanned=66.32 M, 
file_open_errors=0, file_scan_errors=0, num_predicate_creation_errors=0, 
predicate_cache_inner_records=0, predicate_cache_records=0, 
predicate_evaluation_errors=0, 
pushdown_rows_matched=0, pushdown_rows_pruned=0, 
bloom_filter_eval_time=451.85µs, filter_apply_time=1.13s, 
metadata_load_time=246.88ms, page_index_eval_time=602.59µs, 
row_pushdown_eval_time=20ns, statistics_eval_time=737.84µs, 
time_elapsed_opening=422.26ms, time_elapsed_processing=21.54s, 
time_elapsed_scanning_total=21.33s
   ```
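
   To make the shape of that predicate concrete, here is a rough Python model of what one `CASE hash_repartition % N WHEN i THEN ... ELSE false END` filter asks for on every row: pick a partition, check that partition's min/max bounds, then probe its hash table. This is only an illustrative sketch, not DataFusion's implementation. `PartitionFilter`, `route` and `evaluate_case_filter` are made-up names, and `key % num_partitions` merely stands in for the real repartition hash.

   ```python
   # Illustrative model only: NOT DataFusion's implementation.
   # All names here are hypothetical and exist just for this sketch.
   from dataclasses import dataclass


   @dataclass(frozen=True)
   class PartitionFilter:
       lo: int          # lower bound, like `l_orderkey@0 >= lo`
       hi: int          # upper bound, like `l_orderkey@0 <= hi`
       keys: frozenset  # stand-in for the build-side table behind `hash_lookup`


   def route(key: int, num_partitions: int) -> int:
       # Stand-in for `hash_repartition % N`; the real hash function differs.
       return key % num_partitions


   def evaluate_case_filter(keys, filters, num_partitions):
       """Return one boolean per input key, mimicking the CASE ... ELSE false END shape."""
       out = []
       for key in keys:
           f = filters.get(route(key, num_partitions))
           if f is None:
               out.append(False)  # the ELSE false branch
           else:
               # Bounds check first, then the membership probe.
               out.append(f.lo <= key <= f.hi and key in f.keys)
       return out
   ```

   Every row pays for the routing step, two comparisons and a probe, and the plan above stacks several such filters on the same scan.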

