Dandandan commented on issue #20325:
URL: https://github.com/apache/datafusion/issues/20325#issuecomment-3905282899

   In addition to some of the overhead you already mentioned that could be reduced (`CachedArrayReader` / skips / filter + concat), I think a lot of it is actually the IO pattern.
   
   Some observations/thoughts from running with `target_partitions` set to the number of CPU cores on my machine:
   
   * CPU usage is not very high for the parquet decoding / DataFusion exec threads; ~90% of their time is spent waiting for data to arrive.
   * This can be demonstrated by doubling the parallelism (2x cores): on my machine the query then finishes ~20% faster, because more IO can run in parallel and latency is hidden (a config sketch follows the screenshot below).
   * In this case the IO pattern (skipping data) and the `LocalFileSystem` object store also add overhead (open / close / allocation / metadata), since many smaller ranges are read and there is probably more context switching going on:
   
   <img width="708" height="397" alt="Image" src="https://github.com/user-attachments/assets/94e9eca3-2450-41f1-8cd8-d7bd3b8afdb3" />
   
   * I think it might help in this case to cache / remember the file handles, since open / close is called many times for each range (see the first sketch after this list): https://github.com/apache/arrow-rs-object-store/issues/18
   * Tokio (with this query) spreads the IO out over ~30 blocking threads (and keeps them around / spreads the work across them), since the object store uses `spawn_blocking`. Perhaps we can bring our own thread pool to reduce context switching (the second sketch below shows one way to bound it).
   
   * Perhaps we can consider prefetching inside the parquet reader?
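   A rough sketch of what that prefetch could look like, assuming `ObjectStore::get_ranges` (which also coalesces nearby ranges; recent `object_store` versions take `Range<u64>`): fetch the byte ranges of row group N+1 while row group N is being decoded.

   ```rust
   use std::ops::Range;
   use std::sync::Arc;

   use bytes::Bytes;
   use object_store::{path::Path, ObjectStore};

   /// Fetch all byte ranges of one row group in a single call;
   /// get_ranges coalesces nearby ranges into fewer IO requests.
   async fn fetch_row_group(
       store: Arc<dyn ObjectStore>,
       location: Path,
       ranges: Vec<Range<u64>>,
   ) -> object_store::Result<Vec<Bytes>> {
       store.get_ranges(&location, &ranges).await
   }

   /// Pipeline IO and decode: the fetch for row group N+1 is in flight
   /// while row group N is decoded.
   async fn read_all_row_groups(
       store: Arc<dyn ObjectStore>,
       location: Path,
       row_groups: Vec<Vec<Range<u64>>>,
   ) -> object_store::Result<()> {
       let mut pending = None;
       for ranges in row_groups {
           let next = tokio::spawn(fetch_row_group(
               Arc::clone(&store),
               location.clone(),
               ranges,
           ));
           if let Some(prev) = pending.take() {
               let _bytes: Vec<Bytes> = prev.await.unwrap()?;
               // ... decode the previous row group while `next` is in flight ...
           }
           pending = Some(next);
       }
       if let Some(last) = pending {
           let _bytes = last.await.unwrap()?;
           // ... decode the final row group ...
       }
       Ok(())
   }
   ```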

