GitHub user narendly opened a pull request:

    https://github.com/apache/helix/pull/288

    PR

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/narendly/helix master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/helix/pull/288.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #288
    
----
commit 3844ad60034b029f3bbd916f629a7969117c1b26
Author: narendly <narendly@...>
Date:   2018-11-01T23:54:48Z

    [HELIX-782] TASK: Make TaskDriver use ZKClient's create when creating workflows
    
    TaskDriver should use create() but currently uses set(), which simply
    overwrites ZNodes that already exist in ZK. This is undesirable and needs
    to be fixed, especially in the wake of ZNode restructuring.
    
    AC:
    1. Make TaskDriver use create() instead of set()
    2. Add an integration test:
       TestWorkflowCreation:testWorkflowCreationNoDuplicates()
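
The difference between the two write modes described above can be sketched as follows. This is a minimal illustration only: `InMemoryStore`, `WorkflowCreationSketch`, and the `/workflows/` path prefix are hypothetical stand-ins backed by an in-memory map, not the Helix TaskDriver or ZkClient API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a ZooKeeper-backed store (assumption, not the ZkClient API).
class InMemoryStore {
    private final Map<String, String> nodes = new ConcurrentHashMap<>();

    // set(): writes unconditionally and silently replaces any existing node.
    void set(String path, String data) {
        nodes.put(path, data);
    }

    // create(): succeeds only if nothing exists at the path yet.
    boolean create(String path, String data) {
        return nodes.putIfAbsent(path, data) == null;
    }

    String get(String path) {
        return nodes.get(path);
    }
}

class WorkflowCreationSketch {
    // Returns true if the workflow was newly created, false if the name was already taken.
    static boolean createWorkflow(InMemoryStore store, String name, String config) {
        return store.create("/workflows/" + name, config);
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        System.out.println(createWorkflow(store, "wf1", "v1")); // first creation
        System.out.println(createWorkflow(store, "wf1", "v2")); // duplicate name is rejected
        System.out.println(store.get("/workflows/wf1"));        // original data is preserved
    }
}
```

With create semantics, a second submission under the same name fails visibly instead of replacing the existing workflow's data.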

commit 3d9c03064a5c26a9ed9ad674567674f2d9eca160
Author: narendly <narendly@...>
Date:   2018-11-01T23:55:59Z

    TASK: Fix JobQueue's job state-related bug
    
    The bug was observed in
    TestTaskRebalancerStopResume:stopAndResumeNamedQueue(), which was unstable.
    For JobQueues with multiple jobs, the second job would get marked as
    IN_PROGRESS even though the first job had not completed or failed,
    especially when the queue was being stopped and resumed. The cause was a
    bug in getIncompleteJobCount(): it did not count jobs in the STOPPING
    state. This was fixed, and another check was added right before
    JobDispatcher marks a job as STOPPED so that it does not mark the job
    STOPPED if the job state is NOT_STARTED.
    
    Changelist:
    1. Fix getIncompleteJobCount()
    2. Add a check so that we don't mark NOT_STARTED jobs as STOPPED
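
The two changes above can be sketched roughly as follows. This is an illustration under assumptions: the `JobState` enum is a reduced, hypothetical version of the real task states, and treating IN_PROGRESS and STOPPED as incomplete alongside STOPPING is an assumption; the commit message only states that STOPPING was missing from the count.

```java
import java.util.Arrays;
import java.util.Collection;

class JobQueueStateSketch {
    // Hypothetical, reduced set of job states for illustration.
    enum JobState { NOT_STARTED, IN_PROGRESS, STOPPING, STOPPED, COMPLETED, FAILED }

    // Counts jobs that have started but not finished. Including STOPPING is the fix;
    // which other states count as incomplete is an assumption of this sketch.
    static int getIncompleteJobCount(Collection<JobState> states) {
        int count = 0;
        for (JobState state : states) {
            if (state == JobState.IN_PROGRESS
                    || state == JobState.STOPPING
                    || state == JobState.STOPPED) {
                count++;
            }
        }
        return count;
    }

    // Guard applied before marking a job STOPPED: a job that never started is left alone.
    static boolean shouldMarkStopped(JobState current) {
        return current != JobState.NOT_STARTED;
    }

    public static void main(String[] args) {
        // First job is still stopping, second job has not started.
        System.out.println(getIncompleteJobCount(
                Arrays.asList(JobState.STOPPING, JobState.NOT_STARTED)));
        System.out.println(shouldMarkStopped(JobState.NOT_STARTED));
    }
}
```

If STOPPING jobs are not counted, a queue whose first job is still stopping looks as if it has no incomplete job, which is how the next job could be scheduled too early.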

commit befb1036f8d8be2729a800d3dde88fc1362a6489
Author: narendly <narendly@...>
Date:   2018-11-01T23:57:33Z

    [HELIX-784] TASK: Fix a bug in getExpiredJobs
    
    When the job config is null, getExpiredJobs() would just continue instead
    of adding the job to expiredJobs so that the job cleanup/purge would be
    retried. As a result, a failed purge could leave many jobs un-purged in ZK
    with only the job config missing. This RB fixes the issue.
    
    Changelist:
    1. Add the job name to expiredJobs if the job config does not exist in ZK
    2. Add a more detailed description in the error log
    3. Add an integration test for two task-related stages:
       TaskPersistDataStage and TaskGarbageCollectionStage in TestTaskStage.java
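
The behavior change in getExpiredJobs() can be sketched as below. This is a simplified illustration, not the Helix implementation: the `JobConfig` class, its `expiresAtMillis` field, and the method signature are hypothetical, and the real expiry logic is reduced here to a single timestamp comparison.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class ExpiredJobsSketch {
    // Hypothetical, simplified job config that carries only an expiry timestamp.
    static final class JobConfig {
        final long expiresAtMillis;

        JobConfig(long expiresAtMillis) {
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    // Returns the names of jobs that should be purged. A job whose config is missing
    // is treated as expired so that a partially failed purge is retried.
    static Set<String> getExpiredJobs(Map<String, JobConfig> configsByJob, long nowMillis) {
        Set<String> expiredJobs = new LinkedHashSet<>();
        for (Map.Entry<String, JobConfig> entry : configsByJob.entrySet()) {
            JobConfig config = entry.getValue();
            if (config == null) {
                System.err.println("Job config missing for " + entry.getKey()
                        + "; adding it to expired jobs so cleanup is retried");
                expiredJobs.add(entry.getKey());
                continue;
            }
            if (config.expiresAtMillis <= nowMillis) {
                expiredJobs.add(entry.getKey());
            }
        }
        return expiredJobs;
    }

    public static void main(String[] args) {
        Map<String, JobConfig> configs = new HashMap<>();
        configs.put("jobWithMissingConfig", null);
        configs.put("jobStillValid", new JobConfig(2_000L));
        System.out.println(getExpiredJobs(configs, 1_000L));
    }
}
```

Before the fix, the null-config branch would only `continue`, so such a job was never selected for cleanup again.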

----

