[ https://issues.apache.org/jira/browse/YARN-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203720#comment-16203720 ]
Jason Lowe commented on YARN-7244: ---------------------------------- Thanks for updating the patch! The test failure appears to be related, and it would be good to cleanup the checkstyle issues. In TestShuffleHandler the code to build absLogDir is replicated a lot of times and is always the same value. This should be a final static computed once in TestShuffleHandler and referenced in the places that need it rather than copy-n-paste of the code. This also removes the need for the getAbsLogDir method. When computing absLogDir the code should not assume the target directory is the right place but rather look at the environment for direction where to go. GenericTestUtils#getTestDir would help here. Rather than needing a TestShuffleHandler1 class that overrides getAuxilaryLocalPathHandler, I think we can just use ShuffleHandler directly and call setAuxiliaryLocalPathHandler after creating it. That's what the nodemanager is going to do anyway during normal execution, and the unit test would be closer to the real-world scenario. Do we really need to provide both a TestAuxiliaryLocalPathHandler and TestAuxiliaryLocalPathHandler1? The original test code used a LocalDirAllocator directly, and it seems like we could use one test handler class that passes through to the same local dir allocator that was originally setup in the unit test. AuxiliaryService#getAuxiliaryLocalPathHandler and AuxiliaryService#setAuxiliaryLocalPathHandler need javadocs since this is a public class that we expect app framework providers to use. AuxServices should take the path handler in its constructor since it's not legal to call any other methods before it has been set. This allows the handler field to be marked final and removes the need for get/set methods. In BaseContainerManagerTest it's more appropriate to start the node health checker service rather than the dirs handler, since the health checker is responsible for initing and starting the dirs handler. Otherwise it looks weird that we started a service without calling init first. TestAuxServices should use Mockito to build the AuxiliaryLocalPathHandler mock object. AuxService#getServices returns a collection of AuxiliaryService objects, so it would be cleaner to iterate on AuxiliaryService objects directly rather than iterate on Service objects and cast them in the loop. I'm not a fan of testContainerAuxPathHandler since it uses long sleeps to essentially test that the aux path handler is hooked up to the nodemanager's dir handler. It would be much better if we could eliminate the need for sleeps and instead set a mocked/controllable dirs handler in the containermanager then change the behavior of the mocked object and verify that change is reflected in the aux path handler is well. Or even simpler, AuxiliaryLocalPathHandlerImpl could take a dir handler as a constructor argument and change to a static class so we can instantiate it and test it directly rather than needing to bring up the entire container manager for this test. > ShuffleHandler is not aware of disks that are added > --------------------------------------------------- > > Key: YARN-7244 > URL: https://issues.apache.org/jira/browse/YARN-7244 > Project: Hadoop YARN > Issue Type: Bug > Reporter: Kuhu Shukla > Assignee: Kuhu Shukla > Attachments: YARN-7244.001.patch, YARN-7244.002.patch, > YARN-7244.003.patch, YARN-7244.004.patch, YARN-7244.005.patch > > > The ShuffleHandler permanently remembers the list of "good" disks on NM > startup. If disks later are added to the node then map tasks will start using > them but the ShuffleHandler will not be aware of them. The end result is that > the data cannot be shuffled from the node leading to fetch failures and > re-runs of the map tasks. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org