GitHub user narendly opened a pull request: https://github.com/apache/helix/pull/284
PR

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/narendly/helix master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/helix/pull/284.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #284

----
commit 6090732be6b88863017a93106fa692dc7350520b
Author: Hunter Lee <hulee@...>
Date:   2018-10-31T21:20:18Z

    [HELIX-776] REST2.0: Add delete command to updateInstanceConfig

    For instance configs, REST2.0 did not expose the REST API for deletion of fields. This RB adds update and delete commands to updateInstanceConfig and an integration test thereof.
    Changelist:
    1. Add delete command to updateInstanceConfig in InstanceAccessor
    2. Add integration tests

commit 5d24ed544898ff69f289f54be71a04413735d118
Author: Hunter Lee <hulee@...>
Date:   2018-10-31T21:21:49Z

    [HELIX-777] TASK: Handle null currentState for unscheduled tasks

    It was observed that when a workflow is submitted and the Controller attempts to schedule its tasks, the ZK read fails to read the appropriate job's context, causing the job to be stuck in an unscheduled state. The job remained unscheduled because it had no currentStates, and its job context did not contain any assignment/state information. This RB fixes such stuck states by detecting null currentStates.
    Changelist:
    1. Check if currentState is null and, if it is, manually assign an INIT state

commit ceba1a55ae351090144c001324f908f2364212a4
Author: Hunter Lee <hulee@...>
Date:   2018-11-01T00:20:37Z

    [HELIX-778] TASK: Fix a race condition in updatePreviousAssignedTasksStatus

    It was observed that TestUnregisteredCommand is very unstable. The cause was identified as a race condition: when a task fails, a pending message for that task (from INIT to RUNNING) was sometimes not cleaned up in time, so AbstractTaskDispatcher's updatePreviousAssignedTasksStatus would try to process that message and skip the status update of that task (such as updating its status and the NUM_ATTEMPTS field in JobContext). A short, temporary fix is to call markPartitionError() prior to checking the pending message, but over the long haul, we would need to revisit the design of the task status update to avoid this type of race condition.
    Changelist:
    1. Move markPartitionError() up before checking for a pending message on the task
    2. Fix TestUnregisteredCommand's instability

----

---
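
The two task-framework fixes described in HELIX-777 and HELIX-778 can be illustrated with a minimal, self-contained sketch. It is not the Helix implementation: TaskStatusSketch, JobContextSketch, State, resolveCurrentState, updateFailedTask and the markErrorFirst flag are hypothetical names invented for this example, and the markPartitionError() shown here is a simplified stand-in whose signature is assumed rather than taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: these types are hypothetical stand-ins, not Helix classes.
class TaskStatusSketch {
  enum State { INIT, RUNNING, TASK_ERROR }

  // Simplified stand-in for a job context: per-partition state and attempt count.
  static class JobContextSketch {
    final Map<Integer, State> states = new HashMap<>();
    final Map<Integer, Integer> numAttempts = new HashMap<>();
  }

  // HELIX-777 idea: a task with no current state is treated as INIT
  // so that it does not stay stuck as unscheduled.
  static State resolveCurrentState(State currentState) {
    return currentState == null ? State.INIT : currentState;
  }

  // Simplified helper (assumed signature): record the error state and count the attempt.
  static void markPartitionError(JobContextSketch ctx, int partition) {
    ctx.states.put(partition, State.TASK_ERROR);
    ctx.numAttempts.merge(partition, 1, Integer::sum);
  }

  // HELIX-778 idea: record the failure before the pending-message check, so a
  // stale INIT->RUNNING message cannot cause the status update to be skipped.
  // markErrorFirst=false reproduces the old ordering for comparison.
  static void updateFailedTask(JobContextSketch ctx, int partition,
      boolean hasPendingMessage, boolean markErrorFirst) {
    if (markErrorFirst) {
      markPartitionError(ctx, partition);
    }
    if (hasPendingMessage) {
      return; // pending message found: skip the rest of this task's processing
    }
    if (!markErrorFirst) {
      markPartitionError(ctx, partition);
    }
  }

  public static void main(String[] args) {
    JobContextSketch oldOrder = new JobContextSketch();
    updateFailedTask(oldOrder, 0, true, false);
    System.out.println("old ordering, attempt recorded: " + oldOrder.numAttempts.containsKey(0));

    JobContextSketch newOrder = new JobContextSketch();
    updateFailedTask(newOrder, 0, true, true);
    System.out.println("new ordering, state: " + newOrder.states.get(0));
  }
}
```

With the old ordering, a lingering pending message causes the early return before any error bookkeeping happens, which matches the skipped status and NUM_ATTEMPTS update described in the commit message. Marking the error first makes the bookkeeping independent of whether the message has been cleaned up, which is why the commit calls it a short-term fix rather than a redesign.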