[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382690#comment-15382690
 ] 

Varun Saxena edited comment on MAPREDUCE-6380 at 7/18/16 5:45 PM:
------------------------------------------------------------------

bq. This patch tries to check the existence of log files under user log dir
instead of only checking the existence of user log dir.  So it is necessary to
list up files under user log dir. Extraneous log dir we call here is the user
log dir which exists but has not actual log files.

FileSystem#listStatus does exactly that. It returns an empty array if no
app log directories are found under the user log dir (a dir like
/tmp/logs/user/logs), so the for loop in deleteOldLogDirsFrom will have no
iterations in this case. Were you talking about this case?

The default config for the remote app log dir is /tmp/logs and the default
config for the suffix is logs, so application logs are typically found in
/tmp/logs/user/logs.

IIUC, the issue in this JIRA arises when there are extraneous folders/files
directly inside the remote log dir, i.e. at the level of /tmp/logs/user.
Listing "extraneous-dir + suffix" then throws a FileNotFoundException, which
is printed in the logs. We can just catch this exception in the
deleteOldLogDirsFrom method and log a message without the exception trace.

We do not need the additional call to listStatusIterator() in the run()
method, because we are already listing the same directory and checking its
contents inside the deleteOldLogDirsFrom method.
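
To illustrate the suggestion of catching the exception where the directory is
listed, here is a minimal sketch. It is not the patch and does not use the
Hadoop FileSystem API: it uses java.nio.file as a stand-in so that it runs on
its own, with NoSuchFileException playing the role of FileNotFoundException.
The class name LogDirListingSketch and the method listAppLogDirs are
hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.NotDirectoryException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class LogDirListingSketch {

    /**
     * Lists the entries of userDir/suffix (e.g. /tmp/logs/user/logs).
     * If that directory is missing, or userDir is not a real user dir,
     * one line is logged without a stack trace and an empty list is returned.
     */
    static List<Path> listAppLogDirs(Path userDir, String suffix) throws IOException {
        Path userLogDir = userDir.resolve(suffix);
        List<Path> appDirs = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(userLogDir)) {
            for (Path appDir : stream) {
                appDirs.add(appDir);
            }
        } catch (NoSuchFileException | NotDirectoryException e) {
            // Extraneous entry in the remote log dir: skip it quietly.
            System.err.println("Skipping " + userDir + ": no usable " + suffix + " dir");
        }
        return appDirs;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("remote-logs");
        // A regular user dir with one app log dir under the suffix dir.
        Files.createDirectories(root.resolve("user").resolve("logs").resolve("application_1_0001"));
        // An extraneous dir that has no suffix dir under it.
        Files.createDirectories(root.resolve("extraneous"));

        System.out.println(listAppLogDirs(root.resolve("user"), "logs").size());
        System.out.println(listAppLogDirs(root.resolve("extraneous"), "logs").size());
    }
}
```

Because the listing and the handling of a missing directory sit in one place,
the caller needs no separate existence check before iterating.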

BTW, is the only issue that a stack trace is printed? I do not see any other
problem. Ideally, extraneous directories should not exist in this path at all.

If extraneous directories have to be considered, I see a bigger issue when
listing app directories. A spurious directory inside /tmp/logs/user/logs whose
name is not in app id format (e.g. /tmp/logs/user/logs/dummy) is a bigger
problem, because the resulting IllegalArgumentException is not caught and the
timer thread associated with log deletion would exit.
This issue can be handled in this JIRA.
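
A rough sketch of how such directories could be skipped follows. It makes
several assumptions: parseAppId is a hypothetical stand-in for the real app id
parsing (it is not a Hadoop method), the id layout
application_<clusterTimestamp>_<sequence> is assumed, and the class name
AppDirFilterSketch is made up. The point is only that catching
IllegalArgumentException per directory keeps the loop, and so the timer task
that runs it, alive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AppDirFilterSketch {

    /**
     * Hypothetical stand-in for app id parsing. Throws
     * IllegalArgumentException for names that do not look like
     * application_<clusterTimestamp>_<sequence>.
     */
    static long[] parseAppId(String name) {
        String[] parts = name.split("_");
        if (parts.length != 3 || !parts[0].equals("application")) {
            throw new IllegalArgumentException("Invalid app id: " + name);
        }
        // Long.parseLong throws NumberFormatException, which is a subclass
        // of IllegalArgumentException, for non-numeric parts.
        return new long[] { Long.parseLong(parts[1]), Long.parseLong(parts[2]) };
    }

    /** Returns only the names that parse as app ids; logs and skips the rest. */
    static List<String> validAppDirs(List<String> dirNames) {
        List<String> valid = new ArrayList<>();
        for (String name : dirNames) {
            try {
                parseAppId(name);
                valid.add(name);
            } catch (IllegalArgumentException e) {
                // Without this catch the exception would propagate out of
                // the loop and, in a TimerTask, terminate the timer thread.
                System.err.println("Ignoring dir not in app id format: " + name);
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("application_1468860300000_0001", "dummy");
        System.out.println(validAppDirs(names));
    }
}
```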


> AggregatedLogDeletionService will throw exception when there are some other 
> directories in remote-log-dir
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6380
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6380
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver
>            Reporter: Zhang Wei
>            Assignee: Kai Sasaki
>            Priority: Trivial
>         Attachments: MAPREDUCE-6380.01.patch, MAPREDUCE-6380.02.patch, 
> MAPREDUCE-6380.03.patch, MAPREDUCE-6380.04.patch, MAPREDUCE-6380.05.patch, 
> MAPREDUCE-6380.06.patch, MAPREDUCE-6380.07.patch
>
>
> AggregatedLogDeletionService will throw FileNotFoundException when there are
> some extraneous directories put in remote-log-dir. The deletion function will
> try to call listStatus against the "extraneous-dir + suffix" dir. I think it
> would be better if the function could ignore these directories.


