I have found another problem: Under certain circumstances Flink can lose
state data by completing an invalid checkpoint.
https://issues.apache.org/jira/browse/FLINK-5667.

Cheers,
Till

On Thu, Jan 26, 2017 at 6:27 PM, Till Rohrmann <trohrm...@apache.org> wrote:

> Robert also found an issue that pending checkpoint files are not properly
> cleaned up: https://issues.apache.org/jira/browse/FLINK-5660. To my
> surprise, the issue was already fixed in 1.1.4 so I guess I've forgotten to
> forward port the fix. There is a pending PR to fix it. The fix could also
> be part of a 1.2.1 release.
>
> Cheers,
> Till
>
> On Thu, Jan 26, 2017 at 6:04 PM, Ufuk Celebi <u...@apache.org> wrote:
>
>> I ran some tests and found the following issues:
>>
>> https://issues.apache.org/jira/browse/FLINK-5663: Checkpoint fails
>> because of closed registry
>> => This happened a couple of times for the first checkpoints after
>> submitting a job. If it happened on every submission I would
>> definitely make this a blocker, but I ran into it in about 3 out of
>> 10 job submissions. What do we make of this?
>>
>> https://issues.apache.org/jira/browse/FLINK-5665: When the failures
>> happened, I also had some lingering 0-byte files.
>>
>> https://issues.apache.org/jira/browse/FLINK-5664: I also found the
>> logging of the RocksDB backend a little noisy (at least for my local
>> setup, with many tasks per TM and a low checkpointing interval).
>>
>> All in all, I'm not sure whether we want to make these blockers or not.
>> I'm fine either way, with a follow-up 1.2.1 release.
>>
>> ===
>>
>> - Verified signatures and checksums
>> - Checked out the Java quickstarts and ran the jobs
>> - All poms point to 1.2.0
>> - Migrated multiple jobs via savepoint from 1.1.4 to 1.2.0 with Kryo
>> types, session windows (w/o lateness), operator and keyed state for
>> all three backends
>> - Rescaled the same jobs from 1.2.0 savepoints with all three backends
>> - Verified the "migration namespace serializer" fix
>> - Ran streaming state machine with Kafka source, RocksDB backend and
>> master and worker failures (standalone cluster)
>>
>> On Wed, Jan 25, 2017 at 9:14 PM, Robert Metzger <rmetz...@apache.org>
>> wrote:
>> > Dear Flink community,
>> >
>> > Please vote on releasing the following candidate as Apache Flink version
>> > 1.2.0.
>> >
>> > The commit to be voted on:
>> > 8b5b6a8b (http://git-wip-us.apache.org/repos/asf/flink/commit/8b5b6a8b)
>> >
>> > Branch:
>> > release-1.2.0-rc2
>> > (https://git1-us-west.apache.org/repos/asf/flink/repo?p=flink.git;a=shortlog;h=refs/heads/release-1.2.0-rc2)
>> >
>> > The release artifacts to be voted on can be found at:
>> > http://people.apache.org/~rmetzger/flink-1.2.0-rc2/
>> >
>> > The release artifacts are signed with the key with fingerprint D9839159:
>> > http://www.apache.org/dist/flink/KEYS
>> >
>> > The staging repository for this release can be found at:
>> > https://repository.apache.org/content/repositories/orgapacheflink-1113
>> >
>> > -------------------------------------------------------------
>> >
>> > I would like to keep Friday as the target release time. Please let me
>> > know if you need more time for testing and want me to move the
>> > deadline to Monday.
>> >
>> > The vote ends on Friday, January 27, 2017, 6pm CET.
>> >
>> > Please test the release now rather than on Friday morning, so that we
>> > can cancel it as early as possible.
>> > To make testing easier, I've created this document to track what has
>> > already been tested and what still needs to be tested:
>> > https://docs.google.com/document/d/1MX-8l9RrLly3UmZMODHBnuZUrK_n-DGIBLjFKyCrTAs/edit?usp=sharing
>> > Feel free to add more tests or change existing ones.
>> >
>> > [ ] +1 Release this package as Apache Flink 1.2.0
>> > [ ] -1 Do not release this package, because ...
>>
>
>
