darmie commented on issue #20324:
URL: https://github.com/apache/datafusion/issues/20324#issuecomment-3941326780

   > One other direction I am exploring is to see if morsel-driven execution 
can help here.
   > 
   > One hypothesis is that filter pushdown pushes more CPU work (especially in 
the case of dynamic queries) and serial IO (i.e. each individual RowFilter) + 
some additional overhead so slow / skewed partitions will become even more slow.
   > 
   > With morsel-driven execution we might be able to mitigate this effect, as 
we can distribute the work better by planning the work using a queue (and so 
any overhead or file IO latencies will be spread out more).
   > 
   > PoC is here [#20477](https://github.com/apache/datafusion/pull/20477) - it 
seems it gives quite a bit of speedups on Clickbench(!) (without filter 
pushdown) though I see some large slowdowns on TPCH SF10 as well, probably as 
it doesn't benefit much (as far as I remember data / filters are perfectly 
distributed and files seem to contain many row groups) and probably hurts 
locality as implemented.
   
   Complementary to solving the scheduling problem using Morsel driven approach 
is to use JIT native code to solve per-morsel execution cost.  Cache locality 
loss from morsel reassignment can be resolved via JIT code cache. For example, 
if the compiled decoder for `(DICT, Int32, bit_width=12, filter=neq_empty)` is 
cached and reused across morsels regardless of which thread runs them, the code 
stays hot even when the data moves between threads. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to