[ https://issues.apache.org/jira/browse/SYSTEMML-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15691648#comment-15691648 ]
Felix Schüler commented on SYSTEMML-1127:
-----------------------------------------

I see two ways to go here and need some more information on what is going on to decide which one to choose:
1) Give each thread its own cache directory
2) Synchronize the LocalFileUtils.createLocalFileIfNotExist() method and have the threads share the cache

It seems that the parfor workers use the folder created in /tmp/systemml/pid_host as their cache. Is this a cache per process or per thread? If a worker spawns multiple threads, they run in the same process, and concurrent calls to create this directory produce a race condition and throw an error.

[~mboehm7] could you give me some advice on this?

> Distributed unique IDs are not unique
> -------------------------------------
>
>                 Key: SYSTEMML-1127
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1127
>             Project: SystemML
>          Issue Type: Bug
>          Components: ParFor
>            Reporter: Felix Schüler
>
> When executing a Spark parfor, the SparkParforWorker throws an exception
> which states that the localtmpdir could not be created. This is due to the
> fact that multiple executors are running multithreaded on the same worker.
> The createDistributedUniqueID() method in IDHandler.java creates unique
> IDs only per pid and host, not per thread. This could potentially be solved
> by adding the threadID to the unique ID. The question is whether every thread
> should have its own cache or whether the logic should be changed so that the
> first creation succeeds and the threads then share one cache.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
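
To make the two options from the comment concrete, here is a minimal sketch in plain Java. It is not SystemML code: the class CacheDirSketch and the methods perThreadId and ensureSharedDir are hypothetical names introduced for illustration, and the sketch assumes the existing ID is a "pid_host" string as described in the issue. Option 1 appends the thread ID to that string. Option 2 keeps one shared directory and relies on java.nio.file.Files.createDirectories, which does not throw when the directory already exists, so no explicit synchronization is needed for the creation step.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; none of these names exist in SystemML.
final class CacheDirSketch {

    // Option 1: derive a per-thread ID from the existing per-process ID.
    // Assumption: pidHostId is the "pid_host" string mentioned in the issue.
    // Note: the JVM may reuse a thread ID after that thread has terminated.
    static String perThreadId(String pidHostId) {
        return pidHostId + "_" + Thread.currentThread().getId();
    }

    // Option 2: all threads share one directory. createDirectories succeeds
    // if the directory already exists, so concurrent callers do not fail
    // merely because another thread created it first.
    static Path ensureSharedDir(Path dir) throws IOException {
        return Files.createDirectories(dir);
    }
}
```

With option 1 each thread would use something like /tmp/systemml/<pid_host>_<threadId>, which avoids the collision but multiplies the number of cache directories. With option 2 the directory layout stays as it is, but any code that cleans up or writes into the shared cache must then be safe for concurrent use.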