[jira] [Comment Edited] (YARN-71) Ensure/confirm that the NodeManager cleans up local-dirs on restart

2013-03-21 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609803#comment-13609803
 ] 

Siddharth Seth edited comment on YARN-71 at 3/22/13 2:07 AM:
-

Couple more comments. Hopefully the last set.
- "//queue deletions here, rather than NM init?" - this code comment isn't 
required anymore.
- The log messages in the delete method should not suppress the exception; 
pass it to the logger 
(LOG.warn("Failed to delete localDir: " + localDir, e);)
- There's a log message in the test which says "named nobody" - this likely 
needs to be user
- There are a bunch of System.out and printStackTrace() calls which need to be 
removed
- ContainerManager.getContainerStatus does not imply much - any non-final 
state will show up as RUNNING. What should be verified is the internal state 
of the Container - via container.getContainerState(). You'll need access to 
the nmcontext in the test to access this
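
To illustrate the logging point above, here is a minimal sketch of a delete 
helper that passes the caught exception to the logger instead of dropping it. 
It uses java.util.logging so that it runs without extra dependencies (the 
patch itself uses LOG.warn); the class and method names (DeleteLogging, 
deleteQuietly) are made up for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.Level;
import java.util.logging.Logger;

public class DeleteLogging {
    private static final Logger LOG =
        Logger.getLogger(DeleteLogging.class.getName());

    /**
     * Hypothetical helper: deletes localDir and returns true on success.
     * On failure it logs the message together with the exception.
     */
    static boolean deleteQuietly(Path localDir) {
        try {
            Files.delete(localDir);
            return true;
        } catch (IOException e) {
            // Passing e as the last argument keeps the cause and stack
            // trace in the log; logging only the message would hide them.
            LOG.log(Level.WARNING, "Failed to delete localDir: " + localDir, e);
            return false;
        }
    }
}
```

With the exception attached, the log shows why the delete failed (missing 
path, permissions, non-empty directory) rather than only that it failed.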

When testing, did you try different users with the LCE (LinuxContainerExecutor)?

 Ensure/confirm that the NodeManager cleans up local-dirs on restart
 ---

 Key: YARN-71
 URL: https://issues.apache.org/jira/browse/YARN-71
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-71.10.patch, YARN-71.11.patch, YARN-71.12.patch, 
 YARN-71.1.patch, YARN-71.2.patch, YARN-71.3.patch, YARN.71.4.patch, 
 YARN-71.5.patch, YARN-71.6.patch, YARN-71.7.patch, YARN-71.8.patch, 
 YARN-71.9.patch


 We have to make sure that NodeManagers cleanup their local files on restart.
 It may already be working like that in which case we should have tests 
 validating this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (YARN-71) Ensure/confirm that the NodeManager cleans up local-dirs on restart

2013-03-13 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13602042#comment-13602042
 ] 

Siddharth Seth edited comment on YARN-71 at 3/14/13 5:20 AM:
-

Comments on the latest patch.

- The timestamp can be moved out, so that the same ts is used across all local 
dirs.
- Instead of scheduling deletion of the old files, then renaming the current 
files and scheduling additional deletes, this could change to just rename the 
current files and schedule deletion once.
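
A rough sketch of the two suggestions above, under stated assumptions: the 
timestamp is computed once and shared by all local dirs, each dir is renamed, 
and deletion is scheduled in a single pass. The Deleter interface is a 
hypothetical stand-in for the NodeManager's deletion service, and the 
"_DEL_" + ts suffix is only an illustrative naming choice, not necessarily 
what the patch uses.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class LocalDirCleanup {
    /** Hypothetical stand-in for the deletion service. */
    interface Deleter {
        void delete(File path);
    }

    /**
     * Renames every dir with one shared timestamp, then schedules each
     * renamed dir for deletion once. Returns the renamed dirs.
     */
    static List<File> renameAndScheduleDeletion(List<File> localDirs,
                                                Deleter deleter) {
        // Computed once, outside the loop, so all dirs get the same ts.
        long ts = System.currentTimeMillis();
        List<File> renamed = new ArrayList<>();
        for (File dir : localDirs) {
            File target =
                new File(dir.getParentFile(), dir.getName() + "_DEL_" + ts);
            if (dir.renameTo(target)) {
                renamed.add(target);
            }
        }
        // Single scheduling pass; nothing is scheduled before the rename.
        for (File f : renamed) {
            deleter.delete(f);
        }
        return renamed;
    }
}
```

Dirs that fail to rename are simply skipped here; real code would at least 
log them.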
In the unit test:
- There are a couple of races. One is when asserting the state as RUNNING, 
since the events may not have been processed yet. The second is when asserting 
the file delete, since that also happens on a separate thread.
- Also, the test should verify that the correct user is used for deletion; spy 
on the deletion service.
- Minor: use Records instead of RecordFactory.
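
One common way to avoid both races is to poll for the expected condition with 
a timeout instead of asserting it immediately. The helper below is a minimal, 
self-contained sketch of that idea; WaitFor and its parameters are names 
invented for this example, not something from the patch.

```java
import java.util.function.BooleanSupplier;

public class WaitFor {
    /**
     * Polls condition every intervalMs until it holds or timeoutMs elapses.
     * Returns true if the condition became true, false on timeout or
     * interruption.
     */
    static boolean waitFor(BooleanSupplier condition,
                           long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

In the test, the condition could check the container's internal state or 
whether the renamed directory still exists, and the assertion would then be 
on the boolean returned by waitFor.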

Also, can you please mention how you've tested the patch?

  was (Author: sseth):
Comments on the latest patch.

- timestamp can moce out - so that the same ts is used across all local dirs.
- Instead of scheduling old files, then renaming the current files and 
scheduling additional deletes - this could change to just rename the current 
files, and schedule deletion once.
In the unit test
-  
  
 Ensure/confirm that the NodeManager cleans up local-dirs on restart
 ---

 Key: YARN-71
 URL: https://issues.apache.org/jira/browse/YARN-71
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-71.1.patch, YARN-71.2.patch, YARN-71.3.patch, 
 YARN.71.4.patch, YARN-71.5.patch, YARN-71.6.patch, YARN-71.7.patch


 We have to make sure that NodeManagers cleanup their local files on restart.
 It may already be working like that in which case we should have tests 
 validating this.
