A recent issue, described in SYSTEMML-1466, made me think about the cleanup semantics of our temporary scratch_space when coming through the new MLContext API. For our main compilation chain (hadoop/spark_submit), the semantics are very clear: we delete the entire script-specific directory before and after execution. For MLContext, however, it is not as simple: temporary variables are potentially handed out as results, yet we still need the cleanup because otherwise temporary writes fail. Simply checking for existing files is not an option either, as it might even lead to incorrect results. Could somebody please clarify the current cleanup semantics and point me to the relevant code?
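
For context, the main-chain semantics I mean are roughly the following delete-before/delete-after pattern. This is only a minimal illustrative sketch, not the actual SystemML code; the class, the cleanupScratchSpace helper, and the Runnable hook are hypothetical names I made up for this example:

    import java.io.File;
    import java.io.IOException;

    public class ScratchSpaceCleanup {

        // Recursively delete the script-specific scratch directory (no-op if absent).
        static void cleanupScratchSpace(File dir) throws IOException {
            if (!dir.exists())
                return;
            File[] children = dir.listFiles();
            if (children != null)
                for (File child : children)
                    cleanupScratchSpace(child);
            if (!dir.delete())
                throw new IOException("Failed to delete " + dir);
        }

        // Hypothetical driver: clean before execution (stale files could
        // otherwise be mistaken for valid intermediates) and clean again
        // after execution, once all results have been exported.
        static void runScript(File scratchDir, Runnable scriptExecution) throws IOException {
            cleanupScratchSpace(scratchDir);      // delete BEFORE execution
            try {
                scriptExecution.run();            // temporary writes go into scratchDir
            }
            finally {
                cleanupScratchSpace(scratchDir);  // delete AFTER execution
            }
        }
    }

The tension with MLContext is exactly the second cleanup call: if handed-out result variables still point into scratchDir, deleting it after execution would invalidate them, while skipping it makes subsequent temporary writes fail.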
Regards, Matthias