Thanks Gyula!

The current state of things is:
- Stefan is working on a fix for https://issues.apache.org/jira/browse/FLINK-5663.
- Till is working on https://issues.apache.org/jira/browse/FLINK-5667.

As far as I can tell, these will be fixed today and we will then be ready to go for RC3.

I resolved the other issues I created.

– Ufuk

On 26 January 2017 at 22:16:26, Gyula Fóra (gyf...@apache.org) wrote:
> Hi,
>  
> Aside from the issues mentioned above I have some good news as well.
>  
> I have finished porting and started testing one of our major production
> jobs (RBea) on 1.2 and everything seems to run well so far, with
> savepoints, rescaling, externalized checkpoints, metrics etc. on YARN.
>  
> In this job I use windowing, RocksDB state, iterations, timers, broadcast
> states, repartitionable operator states etc. and everything seems to be
> working extremely well under normal circumstances.
>  
> So far I mostly ran sunny day tests but I will continue testing with larger
> load and some failure scenarios. I will keep you posted.
>  
> Great job!
> Gyula
>  
>  
>  
> Robert Metzger wrote (on Thu, 26 Jan 2017, 21:28):
>  
> Damn. I really hoped that this RC goes through.
>  
> I propose to keep the RC2 open until we've fixed all issues mentioned here
> and to get some more testing feedback.
>  
>  
>  
> On Thu, Jan 26, 2017 at 8:06 PM, Stephan Ewen wrote:
>  
> > @Till - I think that FLINK-5667 is a blocker
> >
> > Good catch finding it!
> >
> > On Thu, Jan 26, 2017 at 7:51 PM, Till Rohrmann wrote:
> >
> > > I have found another problem: Under certain circumstances Flink can lose
> > > state data by completing an invalid checkpoint.
> > > https://issues.apache.org/jira/browse/FLINK-5667.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Thu, Jan 26, 2017 at 6:27 PM, Till Rohrmann wrote:
> > >
> > > > Robert also found an issue that pending checkpoint files are not properly
> > > > cleaned up: https://issues.apache.org/jira/browse/FLINK-5660. To my
> > > > surprise, the issue was already fixed in 1.1.4 so I guess I've forgotten to
> > > > forward port the fix. There is a pending PR to fix it. The fix could also
> > > > be part of a 1.2.1 release.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Thu, Jan 26, 2017 at 6:04 PM, Ufuk Celebi wrote:
> > > >
> > > >> I ran some tests and found the following issues:
> > > >>
> > > >> https://issues.apache.org/jira/browse/FLINK-5663: Checkpoint fails
> > > >> because of closed registry
> > > >> => This happened a couple of times for the first checkpoints after
> > > >> submitting a job. If it happened on every submission I would
> > > >> definitely make this a blocker, but I happen to run into it in like 3
> > > > >> out of 10 job submissions. What do we make of this?
> > > >>
> > > >> https://issues.apache.org/jira/browse/FLINK-5665: When the failures
> > > >> happened, I also had some lingering 0-byte files.
> > > >>
> > > >> https://issues.apache.org/jira/browse/FLINK-5664: I also found the
> > > >> logging of the RocksDB backend a little noisy (for my local setup at
> > > >> least with many tasks per TM and low checkpointing interval.)
> > > >>
> > > >> All in all, I'm not sure if we want to make these a blocker or not.
> > > >> I'm fine both ways with a follow up 1.2.1 release.
> > > >>
> > > >> ===
> > > >>
> > > >> - Verified signatures and checksums
> > > >> - Checked out the Java quickstarts and ran the jobs
> > > >> - All poms point to 1.2.0
> > > >> - Migrated multiple jobs via savepoint from 1.1.4 to 1.2.0 with Kryo
> > > >> types, session windows (w/o lateness), operator and keyed state for
> > > >> all three backends
> > > > >> - Rescaled the same jobs from 1.2.0 savepoints with all three backends
> > > >> - Verified the "migration namespace serializer" fix
> > > >> - Ran streaming state machine with Kafka source, RocksDB backend and
> > > >> master and worker failures (standalone cluster)
> > > >>
> > > > >> On Wed, Jan 25, 2017 at 9:14 PM, Robert Metzger wrote:
> > > >> > Dear Flink community,
> > > >> >
> > > > >> > Please vote on releasing the following candidate as Apache Flink version
> > > > >> > 1.2.0.
> > > >> >
> > > >> > The commit to be voted on:
> > > > >> > 8b5b6a8b (http://git-wip-us.apache.org/repos/asf/flink/commit/8b5b6a8b)
> > > >> >
> > > >> > Branch:
> > > >> > release-1.2.0-rc2
> > > > >> > (https://git1-us-west.apache.org/repos/asf/flink/repo?p=flink.git;a=shortlog;h=refs/heads/release-1.2.0-rc2)
> > > >> >
> > > >> > The release artifacts to be voted on can be found at:
> > > > >> > http://people.apache.org/~rmetzger/flink-1.2.0-rc2/
> > > >> >
> > > > >> > The release artifacts are signed with the key with fingerprint D9839159:
> > > >> > http://www.apache.org/dist/flink/KEYS
> > > >> >
> > > >> > The staging repository for this release can be found at:
> > > > >> > https://repository.apache.org/content/repositories/orgapacheflink-1113
> > > >> >
> > > >> > -------------------------------------------------------------
> > > >> >
> > > > >> > I would like to keep Friday as the target release time. Please let me know
> > > > >> > if you want me to move the deadline to Monday because you need more time
> > > > >> > for testing.
> > > >> >
> > > >> > The vote ends on Friday, January 27, 2017, 6pm CET.
> > > >> >
> > > > >> > Please test the release now rather than on Friday morning, to be able to
> > > > >> > cancel it as early as possible.
> > > > >> > To make testing easier, I've created this document to track what has
> > > > >> > already been tested and what needs to be tested:
> > > > >> > https://docs.google.com/document/d/1MX-8l9RrLly3UmZMODHBnuZUrK_n-DGIBLjFKyCrTAs/edit?usp=sharing
> > > >> > Feel free to add more tests or change existing ones.
> > > >> >
> > > >> > [ ] +1 Release this package as Apache Flink 1.2.0
> > > >> > [ ] -1 Do not release this package, because ...
> > > >>
> > > >
> > > >
> > >
> >
>  
