----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/43350/ -----------------------------------------------------------
(Updated Feb. 18, 2016, 10:29 p.m.) Review request for samza, Navina Ramesh, Jagadish Venkatraman, and Yi Pan (Data Infrastructure). Repository: samza Description (updated) ------- * Throw a SamzaContainerStartException whenever the container start request to the NM fails. * Catch the exception in the Allocator and re-request for a container on ANY-HOST * Release the failed container * Also, refactored some methods in the ContainerUtil and ContainerAllocators to simplify the code. * Finally, fix a logic bug in the HostAwareContainerAllocator wherein the expiration logic was flipped. Diffs ----- samza-core/src/main/scala/org/apache/samza/config/TaskConfig.scala 51e9e99e506c065938b18d081df6fe1ff24f79ec samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java 2e192eecf412525f45846edb62b23d261d35cbda samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerAllocator.java 31fcc572d499d4e17cdf1340d7072c1562ccfdb7 samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java 54db5e5b2b1b109d202e814809adbfd2bc84fb4b samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerUtil.java 91fae98c074e1648e7168fb8e76d6e1e656816fc samza-yarn/src/main/java/org/apache/samza/job/yarn/HostAwareContainerAllocator.java 8e1db77db74e1dc01bb0ec64e102befd225123e9 samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaAppState.java bc5b606394351e4039d084914fe5898e6ff148a1 samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaContainerLaunchException.java PRE-CREATION samza-yarn/src/main/java/org/apache/samza/job/yarn/SamzaTaskManager.java a3562a1155cbf24989200063b3ebd472b295db37 samza-yarn/src/test/java/org/apache/samza/job/yarn/TestContainerAllocator.java e2b45d7577dea5a4a71af22521c93a7fd75eaefc samza-yarn/src/test/java/org/apache/samza/job/yarn/TestHostAwareContainerAllocator.java 269d82479650b3bc2890d250da0391d34104b1eb samza-yarn/src/test/java/org/apache/samza/job/yarn/TestSamzaTaskManager.java 88d9f24d16fc3d9842b387cfc22edaf1dfa6fd06 samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerAllocator.java 5fcad82fc10f87510e994896ed7380f1e636a18f samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerListener.java 8fc0b9898ed4484b9db98da7f622ab61f35cd4b1 samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerRequestState.java e7441e512e73b5d09740f252dc52efece640bea4 samza-yarn/src/test/java/org/apache/samza/job/yarn/util/MockContainerUtil.java 4426ce6ac8808257dc00df50452df3d7555d6d0b samza-yarn/src/test/java/org/apache/samza/job/yarn/util/TestAMRMClientImpl.java 951e0f9504b87263eba891d367c9806248241c7d Diff: https://reviews.apache.org/r/43350/diff/ Testing ------- Added 2 unit tests and I'm doing some manual testing in parallel with this review. The manual test is to kill the NM for one of the containers of a job that uses Host Affinity. When the RM detects the outage, it should notify the AM which will try to restart the container on the same host. It will get a connection error and at that point should retry on a DIFFERENT host. Thanks, Jake Maes