[ 
https://issues.apache.org/jira/browse/SYSTEMML-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15691648#comment-15691648
 ] 

Felix Schüler commented on SYSTEMML-1127:
-----------------------------------------

So I see two ways to go here, and I would need some more info on what's going 
on to decide which one to choose:

1) Give each thread its own cache directory
2) Synchronize the LocalFileUtils.createLocalFileIfNotExist() method and have 
threads share the cache

It seems like the parfor workers use the folder created in 
/tmp/systemml/pid_host as their cache. Is this a cache per process or per 
thread? If a worker spawns multiple threads, they all run in the same process, 
so concurrent calls to create this directory race with each other and one of 
them throws an error. [~mboehm7] could you give me some advice on this?
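
To make option 2 concrete, here is a minimal sketch of directory creation 
that tolerates concurrent callers, so that several threads can share one 
cache folder. It is not the SystemML code: SharedCacheDir, ensureDir and 
createConcurrently are hypothetical names, and the sketch assumes that 
java.nio.file.Files.createDirectories, which does not fail when the directory 
already exists, would be an acceptable stand-in for the current creation 
logic.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper for illustration; not SystemML's LocalFileUtils.
final class SharedCacheDir {

    // Creates dir (and missing parents) if needed. An already existing
    // directory is not an error, so losing the creation race is harmless.
    static Path ensureDir(Path dir) {
        try {
            return Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Lets numThreads threads request the same directory at once and
    // returns how many of them completed without an exception.
    static int createConcurrently(Path dir, int numThreads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Callable<Path>> tasks = new ArrayList<>();
            for (int i = 0; i < numThreads; i++) {
                tasks.add(() -> ensureDir(dir));
            }
            int succeeded = 0;
            for (Future<Path> result : pool.invokeAll(tasks)) {
                result.get(); // rethrows a failure from the worker thread
                succeeded++;
            }
            return succeeded;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path base = Files.createTempDirectory("cache-demo");
        Path shared = base.resolve("pid_host"); // placeholder folder name
        int succeeded = createConcurrently(shared, 8);
        System.out.println(succeeded + " threads share " + shared);
    }
}
```

Synchronizing the existing method would also serialize the callers, but an 
idempotent create needs no lock and covers the case where another process on 
the same host creates the folder first.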

> Distributed unique IDs are not unique
> -------------------------------------
>
>                 Key: SYSTEMML-1127
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1127
>             Project: SystemML
>          Issue Type: Bug
>          Components: ParFor
>            Reporter: Felix Schüler
>
> When executing a Spark parfor, the SparkParforWorker throws an exception 
> which states that the localtmpdir could not be created. This is due to the 
> fact that multiple executors are running multithreaded on the same worker. 
> The createDistributedUniqueID() method in IDHandler.java creates unique 
> IDs only per pid and host, not per thread. This could potentially be solved 
> by adding the threadID to the unique ID. The question is whether every thread 
> should have its own cache or whether the logic should be changed so that the 
> first creation succeeds and the threads then share one cache.
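
The thread-ID idea from the description (one cache per thread) could look 
roughly like the sketch below. PerThreadId, uniqueId and collectIds are 
hypothetical names, the pid_host_threadId layout is an assumption, and the 
real createDistributedUniqueID() may build its IDs differently.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for illustration; not SystemML's IDHandler.
final class PerThreadId {

    // Appends the calling thread's ID to an assumed pid_host prefix.
    static String uniqueId(String pid, String host) {
        return pid + "_" + host + "_" + Thread.currentThread().getId();
    }

    // Starts numThreads threads and collects the ID each one generates.
    // The latch keeps every thread alive until all have produced an ID,
    // because the JVM may reuse the ID of a terminated thread.
    static Set<String> collectIds(String pid, String host, int numThreads)
            throws InterruptedException {
        Set<String> ids = ConcurrentHashMap.newKeySet();
        CountDownLatch allAlive = new CountDownLatch(numThreads);
        Thread[] threads = new Thread[numThreads];
        for (int i = 0; i < numThreads; i++) {
            threads[i] = new Thread(() -> {
                ids.add(uniqueId(pid, host));
                allAlive.countDown();
                try {
                    allAlive.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return ids;
    }

    public static void main(String[] args) throws InterruptedException {
        // "4711" and "worker1" are placeholder values.
        Set<String> ids = collectIds("4711", "worker1", 4);
        System.out.println(ids.size() + " distinct IDs: " + ids);
    }
}
```

Because thread IDs are only guaranteed to be unique among live threads, a 
per-thread cache folder would have to be cleaned up when its thread ends, or 
a later thread with a reused ID could find stale files in it.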



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
