[ 
https://issues.apache.org/jira/browse/TEZ-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000516#comment-14000516
 ] 

Hitesh Shah commented on TEZ-1106:
----------------------------------

Comments:

  - set permissions on staging dir related code in TezClientUtils.java could be 
moved into the same class as other helper functions
     - it is questionable whether we should be doing the set permissions for 
the top level dir in the first place. May want to file a separate jira to 
remove this code. Setting permissions is especially not needed if all tez data 
generated by the framework is going into a special sub dir. + 
 
{code}
+      if (LOG.isDebugEnabled()) {
+        LOG.debug("Stage directory information for AppId :" + appId + " 
tezSysStagingPath :"
+            + tezSysStagingPath + " binaryConfPath :" + binaryConfPath + " 
sessionJarsPath :"
+            + sessionJarsPath + " binaryPlanPath :" + binaryPath);
+      }
{code}
   - can any of these values ever be null? 

  - TezCommonUtils:: getTezBaseStagingPath() vs getTezSystemStagingPath()
     - What is the expectation for a user when configuring the staging dir?
     - When the staging dir is deleted by the AM, what is deleted? the .tez 
subdir or the base dir? 

 - for all the get*Path functions, it might be simpler to just do "return new 
Path(..." 
 
 - In some get*Path functions, we have String dagId. In most, we have 
ApplicationId as a param even though what is really needed is appIdStr. Maybe 
accept appIdStr instead of converting to string multiple times? 

 - why do some functions do getFileSystem() and resolvePath() whereas others 
just do simple string appends to create a path? What is the general behavior? 
Also, please add javadocs for all newly added functions. 
 
 - all functions are public - do some need to change to private? Also are any 
functions in the TezCommonUtils class meant to be used by user code or just by 
internal framework code? Please use private/public annotations as needed.

 - AppContext:: getTezSystemStagingPath() - is this being used anywhere? 

 - getTezSystemStagingPath() - always calls mkdirs without checking existence 
of dir. Is this intentional?




> Tez framework should use a unique subdir when creating new files in staging  
> -----------------------------------------------------------------------------
>
>                 Key: TEZ-1106
>                 URL: https://issues.apache.org/jira/browse/TEZ-1106
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Mohammad Kamrul Islam
>            Assignee: Mohammad Kamrul Islam
>         Attachments: TEZ-1106.1.patch, TEZ-1106.2.patch, TEZ-1106.3.patch
>
>
> Currently the files are created in different sub-directories. It is hard to 
> manage and cleanup at the end.
> The proposal is to create a new subdir  : $STAGE_DIR/<APP_ID>/
> All recovery files will go under  : $STAGE_DIR/<APP_ID>/recovery/<attemp_num>/
> All confs will go under:  $STAGE_DIR/<APP_ID>/conf/
> All dagplans will go:  $STAGE_DIR/<APP_ID>/dag_id/plan/



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to