[ https://issues.apache.org/jira/browse/SYSTEMML-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968740#comment-15968740 ]

Matthias Boehm commented on SYSTEMML-1518:
------------------------------------------

And just to scope the affected versions: I introduced this bug about 7 months ago.

> Corrupted input file names in old and new mlcontext apis
> --------------------------------------------------------
>
>                 Key: SYSTEMML-1518
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1518
>             Project: SystemML
>          Issue Type: Bug
>            Reporter: Matthias Boehm
>            Priority: Blocker
>
> Both the new and old mlcontext APIs call
> {{OptimizerUtils.getUniqueTempFileName()}} to create HDFS filenames for
> registered input frames or matrices. This call simply forwards the request to
> {{Dag}} for consistency with HDFS filenames of intermediates and to ensure
> isolation with regard to concurrently running scripts (from different client
> processes on a shared cluster).
> However, for this code path the internal scratch space configuration is
> always uninitialized, leading to corrupt filenames such as
> {{/_p1234_1.2.345.678//_t0/temp1_0}}. The missing scratch_space prefix is
> problematic because the remainder is interpreted as an absolute file path,
> often leading to permission issues because typical users are not granted
> write access on the HDFS root.
> Note that this issue might not be immediately visible in all scenarios
> because it only affects input variables that are exported to HDFS (e.g.,
> during guarded collect or as specific inputs to remote parfor).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
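
To make the failure mode in the description concrete, here is a minimal Java sketch of how an empty scratch-space prefix turns a relative temp file name into an absolute one. It is not SystemML's actual code: the class `ScratchPathSketch`, the helpers `uniqueTempFileName` and `isAbsolute`, and the exact layout of the name are hypothetical and only mirror the example filename quoted in the issue. It also assumes that an uninitialized scratch space behaves like an empty string and that `scratch_space` is the configured prefix.

```java
// Illustrative sketch only; names and path layout are assumptions,
// modeled on the corrupted filename quoted in the issue description.
class ScratchPathSketch {

    // Hypothetical stand-in for the temp file name construction.
    // A null scratchSpace models the uninitialized configuration
    // (assumption: it degrades to an empty prefix).
    static String uniqueTempFileName(String scratchSpace, String pid, String ip, long seq) {
        String prefix = (scratchSpace == null) ? "" : scratchSpace;
        return prefix + "/_p" + pid + "_" + ip + "//_t0/temp" + seq + "_0";
    }

    // A path with a leading slash is resolved from the file system root,
    // not relative to the user's working or home directory.
    static boolean isAbsolute(String path) {
        return path.startsWith("/");
    }

    public static void main(String[] args) {
        // Uninitialized scratch space: the name starts with '/'.
        String broken = uniqueTempFileName(null, "1234", "1.2.345.678", 1);
        // Initialized scratch space: the name stays relative.
        String ok = uniqueTempFileName("scratch_space", "1234", "1.2.345.678", 1);

        System.out.println(broken + " absolute=" + isAbsolute(broken));
        System.out.println(ok + " absolute=" + isAbsolute(ok));
    }
}
```

With the prefix missing, the first name begins with a slash and therefore points below the file system root, where typical users have no write access. With the prefix present, the name stays relative and resolves under a directory the user can write to.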