[
https://issues.apache.org/jira/browse/TEZ-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010430#comment-14010430
]
Hitesh Shah commented on TEZ-1106:
----------------------------------
Comments:
In getTezBaseStagingPath(), is there a reason that the resolved path is not
being updated back into the config?
In getTezSystemStagingPath(), why is getTezBaseStagingPath() being called? Why
are 2 public functions needed instead of just exposing a single one?
Also, for getTezSystemStagingPath(), is there a reason why it always has to do
a create? To clarify, are there any cases that you believe the directory should
always exist and not be silently created?
{code}
+ if (!fs.exists(tezStagingDir)) {
+ mkDirForAM(fs, tezStagingDir);
+ }
+ tezStagingDir = new Path(tezStagingDir, strAppId);
+ if (!fs.exists(tezStagingDir)) {
+ mkDirForAM(fs, tezStagingDir);
+ }
{code}
- why invoke mkdirForAM twice? It internally calls mkdirs() anyway.
Javadocs nit - a lot of comments seem to have a ":" in them. Such as ":TEZ",
":Dag ID", etc. Not sure what the ":" signifies.
mkDirForAM() and createFileForAM() need a function description.
{code}
+ Path recoveryPath = TezCommonUtils.getRecoveryPath(tezSystemStagingDir,
conf, strAppId);
recoveryFS = recoveryPath.getFileSystem(conf);
recoveryDataDir = recoveryFS.makeQualified(recoveryPath);
{code}
- Are some of the steps now redundant due to the re-factor?
Rest looks good. Think we are getting close to a final patch. Thanks for
patiently addressing the review comments [~kamrul]
> Tez framework should use a unique subdir when creating new files in staging
> -----------------------------------------------------------------------------
>
> Key: TEZ-1106
> URL: https://issues.apache.org/jira/browse/TEZ-1106
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Mohammad Kamrul Islam
> Assignee: Mohammad Kamrul Islam
> Attachments: TEZ-1106.1.patch, TEZ-1106.2.patch, TEZ-1106.3.patch,
> TEZ-1106.4.patch
>
>
> Currently the files are created in different sub-directories. It is hard to
> manage and cleanup at the end.
> The proposal is to create a new subdir : $STAGE_DIR/<APP_ID>/
> All recovery files will go under : $STAGE_DIR/<APP_ID>/recovery/<attemp_num>/
> All confs will go under: $STAGE_DIR/<APP_ID>/conf/
> All dagplans will go: $STAGE_DIR/<APP_ID>/dag_id/plan/
--
This message was sent by Atlassian JIRA
(v6.2#6252)