zhztheplayer commented on issue #7810:
URL: https://github.com/apache/incubator-gluten/issues/7810#issuecomment-2456411668

   > @zhztheplayer Is there any specific consideration on memory management 
side?
   
   There should be no hard limitation in Spark's / Gluten's memory management 
system against using multiple threads in the same Spark task. There were only a 
few lock issues, which have already been resolved. The only rule is that we 
must report allocations to the task's memory manager, and that rule conflicts 
with the approach of running one background execution thread pool shared by all 
of an executor's tasks.
   
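To make the rule concrete, here is a minimal, self-contained sketch of per-task allocation reporting: every worker thread reserves against the owning task's budget instead of a process-global pool. `TaskMemoryTarget` is a hypothetical stand-in for illustration, not a real Spark or Gluten class.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: threads working for one Spark task report their
// allocations to that task's memory target, so the task-level limit is
// enforced even with multiple native threads.
final class TaskMemoryTarget {
    private final AtomicLong reserved = new AtomicLong();
    private final long limitBytes;

    TaskMemoryTarget(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    // Called by any thread allocating on behalf of this task.
    boolean tryReserve(long bytes) {
        long prev = reserved.getAndAdd(bytes);
        if (prev + bytes > limitBytes) {  // over the task's budget
            reserved.addAndGet(-bytes);   // roll back; caller should spill
            return false;
        }
        return true;
    }

    void release(long bytes) {
        reserved.addAndGet(-bytes);
    }

    long reservedBytes() {
        return reserved.get();
    }
}
```

A background pool shared by all tasks has no single `TaskMemoryTarget` to report to, which is why that design sits uneasily with Spark's per-task accounting.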
   > We can set task.cores to more than 1 then set to Velox's backend threads.
   
   Velox's top-level threading model was designed for Presto's push model, 
which doesn't fit Spark's use case of creating an iterator over a Velox task. 
The parallelism we can easily adopt from Velox is the in-pipeline background 
thread pools, for example, parallel join build or scan prefetch. But again, 
using those thread pools requires us to handle memory allocation carefully, 
given Spark's memory manager model.
   
   Having said that, there could be a hacky way to combine Spark's pull model 
with Velox's push model, which requires creating a queue between them as a 
data broker. We can do some research and PoCs to see whether this approach 
really brings performance advantages; if it does, we could consider 
refactoring our memory management code and other architectural code to switch 
to it. But it's a big deal and more or less violates Spark's design, so we may 
need to think more before coding.
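The queue-as-data-broker idea can be sketched as follows. A producer thread stands in for Velox pushing output batches; the iterator side is what Spark's task thread would pull from. All names here (`PushPullBroker`, the sentinel) are illustrative assumptions, not Gluten code.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical bridge between a push-style producer and a pull-style
// consumer: the producer blocks when the bounded queue is full (back
// pressure), and a sentinel object signals end of stream.
final class PushPullBroker<T> implements Iterator<T> {
    private static final Object POISON = new Object();
    private final BlockingQueue<Object> queue;
    private Object next;

    PushPullBroker(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Push side: called from the producer thread driving the native task.
    void push(T batch) throws InterruptedException {
        queue.put(batch);
    }

    void finish() throws InterruptedException {
        queue.put(POISON);
    }

    // Pull side: Spark's task thread iterates as usual.
    @Override
    public boolean hasNext() {
        if (next == null) {
            try {
                next = queue.take(); // blocks until a batch or POISON arrives
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return next != POISON;
    }

    @SuppressWarnings("unchecked")
    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        T out = (T) next;
        next = null;
        return out;
    }
}
```

Note the extra thread in this sketch is exactly what would have to report its allocations to the task's memory manager, per the rule above.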


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

