-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43967/#review121304
-----------------------------------------------------------

Ship it!


Ship It!

- Sebastian Toader


On Feb. 29, 2016, 8:26 p.m., Jonathan Hurley wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43967/
> -----------------------------------------------------------
>
> (Updated Feb. 29, 2016, 8:26 p.m.)
>
>
> Review request for Ambari, Alejandro Fernandez, Nate Cole, Sumit Mohanty,
> Sebastian Toader, and Sid Wagle.
>
>
> Bugs: AMBARI-15173
>     https://issues.apache.org/jira/browse/AMBARI-15173
>
>
> Repository: ambari
>
>
> Description
> -------
>
> Seen while performing an upgrade: it's possible that the status of a
> request/stage does not match that of its tasks. Essentially, the task could
> be {{HOLDING}} while the request is still {{IN_PROGRESS}}.
>
> I believe that AMBARI-15011 is responsible for this issue. AMBARI-15011
> introduced, among other things, a cache for the
> {{HostRoleCommandStatusSummaryDTO}}, which is an aggregation of the number of
> tasks a stage has in each state (PENDING, HOLDING, etc).
>
> This {{HostRoleCommandStatusSummaryDTO}} is used by {{CalculatedState}} to
> calculate a stage's and request's status based on the tasks.
>
> The problem is that {{ServerActionExecutor}} is moving a task's state to
> {{HOLDING}} (reflected in the database correctly) but the cache invalidation
> happens inside the uncommitted transaction. This causes stale data to be
> re-cached. So, when we go to calculate the request and stage status, we get
> {{IN_PROGRESS}} instead of {{HOLDING}}.
>
> {code}
> {
>   "href": "http://172.22.72.13:8080/api/v1/clusters/cl1/requests/61/stages/1?fields=*,tasks/*",
>   "Stage": {
>     "cluster_name": "cl1",
>     "context": "Stop YARN Queues",
>     "display_status": "IN_PROGRESS",
>     "end_time": -1,
>     "progress_percent": 35,
>     "request_id": 61,
>     "skippable": true,
>     "stage_id": 1,
>     "start_time": 1456227329191,
>     "status": "IN_PROGRESS"
>   },
>   "tasks": [
>     {
>       "href": "http://172.22.72.13:8080/api/v1/clusters/cl1/requests/61/stages/1/tasks/754",
>       "Tasks": {
>         "attempt_cnt": 1,
>         "cluster_name": "cl1",
>         "command": "EXECUTE",
>         "command_detail": "Before continuing, please stop all YARN queues. If yarn-site's yarn.resourcemanager.work-preserving-recovery.enabled is set to true, then you can skip this step since the clients will retry on their own.",
>         "custom_command_name": "org.apache.ambari.server.serveraction.upgrades.ManualStageAction",
>         "end_time": -1,
>         "error_log": "errors-754.txt",
>         "exit_code": 0,
>         "host_name": "os-r6-mkqzcs-c10tom21unsecha-6.novalocal",
>         "id": 754,
>         "output_log": "output-754.txt",
>         "request_id": 61,
>         "role": "AMBARI_SERVER_ACTION",
>         "stage_id": 1,
>         "start_time": 1456227329191,
>         "status": "HOLDING",
>         "stderr": "",
>         "stdout": "",
>         "structured_out": {}
>       }
>     }
>   ]
> }
> {code}
>
>
> Diffs
> -----
>
>   ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionDBAccessorImpl.java 003e2e6
>   ambari-server/src/main/java/org/apache/ambari/server/orm/AmbariJpaLocalTxnInterceptor.java b5442c2
>   ambari-server/src/main/java/org/apache/ambari/server/orm/TransactionalLocks.java 1768dd8
>   ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostRoleCommandDAO.java c2ded2f
>   ambari-server/src/test/java/org/apache/ambari/annotations/LockAreaTest.java PRE-CREATION
>   ambari-server/src/test/java/org/apache/ambari/annotations/TransactionalLockInterceptorTest.java 6ebdc0b
>   ambari-server/src/test/java/org/apache/ambari/annotations/TransactionalLockTest.java 1862088
>
> Diff: https://reviews.apache.org/r/43967/diff/
>
>
> Testing
> -------
>
> Pending unit tests...
>
>
> Thanks,
>
> Jonathan Hurley
>
>
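
For readers following the description above, the ordering problem can be reproduced in a few lines. The sketch below is a simplified, single-threaded model and is not Ambari code: the class StaleCacheSketch and all of its methods are hypothetical names, a plain map stands in for the summary cache, and the "concurrent reader" is simulated by an ordinary call made between the invalidation and the commit. It also shows invalidation after commit as one possible remedy, which is not necessarily the approach taken in this patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the race; none of these names exist in Ambari. */
class StaleCacheSketch {
    // Committed rows (task id -> status), the only data other transactions can see.
    private final Map<Long, String> committed = new HashMap<>();
    // Writes made inside the open transaction, invisible to readers until commit().
    private final Map<Long, String> pending = new HashMap<>();
    // Read-through cache, standing in for the cached status summary.
    private final Map<Long, String> cache = new HashMap<>();

    StaleCacheSketch(long taskId, String initialStatus) {
        committed.put(taskId, initialStatus);
    }

    void writeInTransaction(long taskId, String status) {
        pending.put(taskId, status);
    }

    void commit() {
        committed.putAll(pending);
        pending.clear();
    }

    void invalidate(long taskId) {
        cache.remove(taskId);
    }

    /** A reader outside the transaction: a cache miss reloads committed data only. */
    String readThroughCache(long taskId) {
        return cache.computeIfAbsent(taskId, committed::get);
    }

    /** Problem ordering: invalidate while the write is still uncommitted. */
    static String invalidateBeforeCommit() {
        StaleCacheSketch s = new StaleCacheSketch(754L, "IN_PROGRESS");
        s.readThroughCache(754L);              // cache is warm with IN_PROGRESS
        s.writeInTransaction(754L, "HOLDING"); // not yet visible to readers
        s.invalidate(754L);                    // too early: commit has not happened
        s.readThroughCache(754L);              // simulated concurrent reader re-caches the old row
        s.commit();                            // the database now says HOLDING
        return s.readThroughCache(754L);       // the cache still answers with the old value
    }

    /** One possible remedy: invalidate only after the commit has completed. */
    static String invalidateAfterCommit() {
        StaleCacheSketch s = new StaleCacheSketch(754L, "IN_PROGRESS");
        s.readThroughCache(754L);
        s.writeInTransaction(754L, "HOLDING");
        s.commit();
        s.invalidate(754L);                    // the next miss reloads the committed row
        return s.readThroughCache(754L);
    }

    public static void main(String[] args) {
        System.out.println("invalidate before commit: " + invalidateBeforeCommit());
        System.out.println("invalidate after commit:  " + invalidateAfterCommit());
    }
}
```

In this model the first ordering returns IN_PROGRESS even though the committed row is HOLDING, which matches the mismatch in the JSON above; the second ordering returns HOLDING. A real server would also have to handle readers that load the old row just before the commit and cache it just after, which this single-threaded sketch does not cover.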