GitHub user kl0u opened a pull request:

    https://github.com/apache/flink/pull/2797

    [FLINK-5056] Makes the BucketingSink rescalable.

    This PR makes the BucketingSink rescalable, fixes a bug that could lead to deleting
    valid data, and improves the Javadocs of the class.
    
    In the process of making the sink rescalable, we also stop deleting lingering files
    upon restoring. This avoids possible race conditions in which one task deletes files
    that another task still uses.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kl0u/flink bucket-ref-fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2797.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2797
    
----
commit d233d807d91e4438a07d3e38192a65e9f2c302bc
Author: kl0u <kklou...@gmail.com>
Date:   2016-11-06T19:44:53Z

    [FLINK-5054] Make the BucketingSink rescalable.
    
    Refactors the BucketingSink to be able to change
    parallelism after restoring from a savepoint. To
    do so, this commit changes the following:
    1) the sink does not clean up lingering files upon
       restoring
    2) the previous snapshot/restore cycle is replaced
       by the new initializeState/snapshotState one.
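
    To illustrate why list-style state makes a change of parallelism possible,
    here is a minimal sketch in plain Java. It does not use any Flink API. The
    class name, the method name, and the round-robin assignment are assumptions
    made for illustration only, and are not taken from this PR or from Flink's
    actual redistribution logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RescaleSketch {

    /**
     * Hypothetical model: each old subtask snapshots a list of bucket
     * entries; on restore the entries are spread over the new subtasks.
     * Round-robin is an assumption, not Flink's documented scheme.
     */
    static List<List<String>> redistribute(List<List<String>> snapshots,
                                           int newParallelism) {
        // Flatten the per-subtask lists taken at snapshot time.
        List<String> all = new ArrayList<>();
        for (List<String> perSubtask : snapshots) {
            all.addAll(perSubtask);
        }

        // One (possibly empty) list per new subtask.
        List<List<String>> restored = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            restored.add(new ArrayList<>());
        }

        // Hand out entries in turn; every entry lands on exactly one subtask.
        for (int i = 0; i < all.size(); i++) {
            restored.get(i % newParallelism).add(all.get(i));
        }
        return restored;
    }

    public static void main(String[] args) {
        // Two old subtasks, restored with three new ones.
        List<List<String>> snapshots = Arrays.asList(
                Arrays.asList("bucket-a", "bucket-b"),
                Arrays.asList("bucket-c"));
        System.out.println(redistribute(snapshots, 3));
    }
}
```

    Because a restored subtask only sees the entries assigned to it, it cannot
    tell whether an unknown file belongs to another subtask, which is
    consistent with point 1) above.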

commit fbf5c8699ee8c2e2c3b108ba6ec5051ff8d06f2a
Author: kl0u <kklou...@gmail.com>
Date:   2016-11-06T19:44:53Z

    [FLINK-5056] BucketingSink: Clear state only after committing all pending data.
    
    Before clearing the state of the sink upon receiving a notification
    about a successful checkpoint, we also check whether all pending buckets
    for previous checkpoints have already been committed.
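
    The check described above can be sketched in plain Java as follows. This is
    an illustration under assumptions, not the code in this PR: the class, its
    fields, and the use of a TreeMap keyed by checkpoint id are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PendingCommitSketch {

    // Hypothetical: pending file names, keyed by the checkpoint that recorded them.
    final TreeMap<Long, List<String>> pendingByCheckpoint = new TreeMap<>();
    final List<String> committed = new ArrayList<>();
    boolean stateCleared = false;

    void addPending(long checkpointId, String file) {
        pendingByCheckpoint
                .computeIfAbsent(checkpointId, k -> new ArrayList<>())
                .add(file);
    }

    /** Commits everything recorded up to and including the completed checkpoint. */
    void notifyCheckpointComplete(long completedId) {
        Iterator<Map.Entry<Long, List<String>>> it =
                pendingByCheckpoint.headMap(completedId, true).entrySet().iterator();
        while (it.hasNext()) {
            committed.addAll(it.next().getValue());
            it.remove(); // also removes the entry from the backing map
        }
        // Clear the remaining state only when nothing is left to commit.
        if (pendingByCheckpoint.isEmpty()) {
            stateCleared = true;
        }
    }

    public static void main(String[] args) {
        PendingCommitSketch sink = new PendingCommitSketch();
        sink.addPending(1L, "part-0");
        sink.addPending(2L, "part-1");
        sink.notifyCheckpointComplete(1L);
        System.out.println(sink.committed + " cleared=" + sink.stateCleared);
        sink.notifyCheckpointComplete(2L);
        System.out.println(sink.committed + " cleared=" + sink.stateCleared);
    }
}
```

    In this sketch a notification for checkpoint 1 commits only the files
    recorded for checkpoint 1, and the state survives until the files recorded
    for checkpoint 2 are committed as well.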

commit d2d638eee240848c632ee54769ca844a131f216b
Author: kl0u <kklou...@gmail.com>
Date:   2016-11-13T22:21:46Z

    [FLINK-5056] Improve documentation of the BucketingSink.
    
    This commit also removes an unused method that would replace
    the close() when a dispose() is introduced in the RichFunction.

----


