[ 
https://issues.apache.org/jira/browse/SYSTEMML-1309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1309:
-------------------------------------
    Description: In contrast to parfor MR jobs, where every task has its own 
process-local buffer pool, on Spark with multi-threaded executors multiple 
tasks share a common buffer pool. This is advantageous because common inputs 
are read only once. However, it also requires synchronized buffer pool 
initialization and cleanup per executor. The cleanup (e.g., of created cache 
directories) is especially tricky because Spark does not provide an executor 
close call. Hence, our approach is to use a robust version of deleteOnExit that 
is independent of the exit code and also removes remaining files that were 
not yet known at delete registration.    (was: In contrast to parfor mr jobs, 
where every task has its own, process-local buffer pool, on spark with 
multi-threaded executors, multiple tasks share a common buffer pool. This is 
advantageous because common inputs are just read once. However, it also 
requires a synchronized buffer pool initialization and cleanup per executor. 
Especially the cleanup (e.g., of created cache directories) is tricky because 
spark does not provide an executor close call. Hence, our approach is to use a 
robust version of deleteOnExit that is independent of the exit code.  )
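
The synchronized per-executor initialization mentioned above can be sketched 
as follows. This is only an illustration, not the SystemML implementation: the 
class SharedBufferPool, its methods, and the "cache_" directory naming are 
hypothetical. It assumes that all tasks of one executor run as threads in a 
single JVM, so a static lock is enough to make the first task create the cache 
directory and every later task reuse it.

```java
import java.io.File;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a per-executor (per-JVM) buffer pool. */
final class SharedBufferPool {
    private static final Object LOCK = new Object();
    private static final AtomicInteger INIT_COUNT = new AtomicInteger();
    private static File cacheDir; // guarded by LOCK

    private SharedBufferPool() {}

    /**
     * Called by every task. Only the first caller in this JVM creates the
     * cache directory; all later callers get the same directory back.
     */
    static File init(File baseDir) {
        synchronized (LOCK) {
            if (cacheDir == null) {
                File dir = new File(baseDir, "cache_" + System.nanoTime());
                if (!dir.mkdirs() && !dir.isDirectory()) {
                    throw new IllegalStateException("Cannot create " + dir);
                }
                cacheDir = dir;
                INIT_COUNT.incrementAndGet();
            }
            return cacheDir;
        }
    }

    /** Number of times the pool was actually initialized (at most 1). */
    static int initCount() {
        return INIT_COUNT.get();
    }
}
```

Each task would call SharedBufferPool.init(...) before touching cached data. 
The lock is held only during the one-time setup check, so concurrent tasks do 
not serialize on later reads.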

> Parfor spark buffer pool handling
> ---------------------------------
>
>                 Key: SYSTEMML-1309
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1309
>             Project: SystemML
>          Issue Type: Sub-task
>          Components: APIs, Runtime
>            Reporter: Matthias Boehm
>            Assignee: Matthias Boehm
>
> In contrast to parfor MR jobs, where every task has its own process-local 
> buffer pool, on Spark with multi-threaded executors multiple tasks share a 
> common buffer pool. This is advantageous because common inputs are read only 
> once. However, it also requires synchronized buffer pool initialization and 
> cleanup per executor. The cleanup (e.g., of created cache directories) is 
> especially tricky because Spark does not provide an executor close call. 
> Hence, our approach is to use a robust version of deleteOnExit that is 
> independent of the exit code and also removes remaining files that were 
> not yet known at delete registration.  
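
One way to approximate such a robust deleteOnExit is a JVM shutdown hook that 
walks the cache directory at exit time. The sketch below is an assumption about 
how this could look, not the code attached to this issue; the class 
RobustDeleteOnExit and its methods are hypothetical names. It relies on two 
documented JDK behaviors: File.deleteOnExit only covers paths that were 
explicitly registered and cannot remove a non-empty directory, while shutdown 
hooks run on System.exit with any status code. Hooks still do not run if the 
JVM is killed forcibly (e.g., SIGKILL or Runtime.halt).

```java
import java.io.File;
import java.nio.file.Files;

/** Hypothetical helper: deletes a directory tree when the JVM shuts down. */
final class RobustDeleteOnExit {
    private RobustDeleteOnExit() {}

    /**
     * Registers a shutdown hook for the given directory and returns it.
     * The directory contents are listed when the hook runs, not now, so
     * files created after registration are removed as well.
     */
    static Thread register(File dir) {
        Thread hook = new Thread(() -> deleteRecursively(dir),
                                 "cleanup-" + dir.getName());
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    /** Deletes children first, then the file or directory itself. */
    static boolean deleteRecursively(File f) {
        // Do not descend into symbolic links; only the link itself is removed.
        if (!Files.isSymbolicLink(f.toPath())) {
            File[] children = f.listFiles(); // null for regular files
            if (children != null) {
                for (File child : children) {
                    deleteRecursively(child);
                }
            }
        }
        // Treat "already gone" as success to tolerate concurrent cleanup.
        return f.delete() || !f.exists();
    }
}
```

Combined with a synchronized one-time initialization, register(...) would be 
called once per executor JVM right after the cache directory is created.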



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
