[ 
https://issues.apache.org/jira/browse/TEZ-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010430#comment-14010430
 ] 

Hitesh Shah commented on TEZ-1106:
----------------------------------

Comments:

In getTezBaseStagingPath(), is there a reason that the resolved path is not 
being updated back into the config? 
In getTezSystemStagingPath(), why is getTezBaseStagingPath() being called? Why 
are 2 public functions needed instead of just exposing a single one? 

Also, for getTezSystemStagingPath(), is there a reason why it always has to do 
a create? To clarify, are there any cases that you believe the directory should 
always exist and not be silently created? 

{code}
+      if (!fs.exists(tezStagingDir)) {
+        mkDirForAM(fs, tezStagingDir);
+      }
+      tezStagingDir = new Path(tezStagingDir, strAppId);
+      if (!fs.exists(tezStagingDir)) {
+        mkDirForAM(fs, tezStagingDir);
+      }
{code}
   - why invoke mkdirForAM twice? It internally calls mkdirs() anyway. 

Javadocs nit - a lot of comments seem to have a ":" in them. Such as ":TEZ", 
":Dag ID", etc. Not sure what the ":" signifies.

mkDirForAM() and createFileForAM() need a function description.

{code}
+    Path recoveryPath = TezCommonUtils.getRecoveryPath(tezSystemStagingDir, 
conf, strAppId);
     recoveryFS = recoveryPath.getFileSystem(conf);
     recoveryDataDir = recoveryFS.makeQualified(recoveryPath);
{code}
   - Are some of the steps now redundant due to the re-factor?

Rest looks good. Think we are getting close to a final patch. Thanks for 
patiently addressing the review comments [~kamrul] 



 







> Tez framework should use a unique subdir when creating new files in staging  
> -----------------------------------------------------------------------------
>
>                 Key: TEZ-1106
>                 URL: https://issues.apache.org/jira/browse/TEZ-1106
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Mohammad Kamrul Islam
>            Assignee: Mohammad Kamrul Islam
>         Attachments: TEZ-1106.1.patch, TEZ-1106.2.patch, TEZ-1106.3.patch, 
> TEZ-1106.4.patch
>
>
> Currently the files are created in different sub-directories. It is hard to 
> manage and cleanup at the end.
> The proposal is to create a new subdir  : $STAGE_DIR/<APP_ID>/
> All recovery files will go under  : $STAGE_DIR/<APP_ID>/recovery/<attemp_num>/
> All confs will go under:  $STAGE_DIR/<APP_ID>/conf/
> All dagplans will go:  $STAGE_DIR/<APP_ID>/dag_id/plan/



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to