I have created a custom build of RocksDB 4.11.2 that fixes a significant performance problem with append operations. I think this should definitely be part of the 1.2.1 release because the problem is already blocking some users. What is still missing is uploading the jar to Maven Central and a test run, e.g. with a misbehaving job that has large state.
> On 04.04.2017 at 11:57, Robert Metzger <rmetz...@apache.org> wrote:
>
> Thank you for opening a PR for this.
>
> Chesnay, do you need more reviews for the metrics changes / backports?
>
> Are there any other release blockers for 1.2.1, or are we good to go?
>
> On Mon, Apr 3, 2017 at 6:48 PM, Aljoscha Krettek <aljos...@apache.org> wrote:
>
>> I created a PR for the revert: https://github.com/apache/flink/pull/3664
>>
>>> On 3. Apr 2017, at 18:32, Stephan Ewen <se...@apache.org> wrote:
>>>
>>> +1 for options (1), but also invest the time to fix it properly for 1.2.2
>>>
>>> On Mon, Apr 3, 2017 at 9:10 AM, Kostas Kloudas <k.klou...@data-artisans.com> wrote:
>>>
>>>> +1 for 1
>>>>
>>>>> On Apr 3, 2017, at 5:52 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>>>>>
>>>>> +1 for option 1)
>>>>>
>>>>> On Mon, Apr 3, 2017 at 5:48 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>>>>>
>>>>>> +1 to option 1)
>>>>>>
>>>>>> 2017-04-03 16:57 GMT+02:00 Ted Yu <yuzhih...@gmail.com>:
>>>>>>
>>>>>>> Looks like #1 is better - 1.2.1 would be at least as stable as 1.2.0
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> On Mon, Apr 3, 2017 at 7:39 AM, Aljoscha Krettek <aljos...@apache.org> wrote:
>>>>>>>
>>>>>>>> Just so we’re all on the same page. ;-)
>>>>>>>>
>>>>>>>> There was https://issues.apache.org/jira/browse/FLINK-5808 which was a bug that we initially discovered in Flink 1.2 which was/is about missing verification for the correctness of the combination of parallelism and max-parallelism. Due to lacking test coverage this introduced two more bugs:
>>>>>>>> - https://issues.apache.org/jira/browse/FLINK-6188: Some setParallelism() methods can't cope with default parallelism
>>>>>>>> - https://issues.apache.org/jira/browse/FLINK-6209: StreamPlanEnvironment always has a parallelism of 1
>>>>>>>>
>>>>>>>> IMHO, the options are:
>>>>>>>> 1) revert the changes made for FLINK-5808 on the release-1.2 branch and live with the bug still being present
>>>>>>>> 2) put in more work to fix FLINK-5808 which requires fixing some problems that have existed for a long time with how the parallelism is set in streaming programs
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Aljoscha
>>>>>>>>
>>>>>>>>> On 31. Mar 2017, at 21:34, Robert Metzger <rmetz...@apache.org> wrote:
>>>>>>>>>
>>>>>>>>> I don't know what is best to do, but I think releasing 1.2.1 with potentially more bugs than 1.2.0 is not a good option.
>>>>>>>>> I suspect a good workaround for FLINK-6188 <https://issues.apache.org/jira/browse/FLINK-6188> is setting the parallelism manually for operators that can't cope with the default -1 parallelism.
>>>>>>>>>
>>>>>>>>> On Fri, Mar 31, 2017 at 9:06 PM, Aljoscha Krettek <aljos...@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> You mean reverting the changes around FLINK-5808 [1]? This is what introduced the follow-up FLINK-6188 [2].
>>>>>>>>>>
>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-5808
>>>>>>>>>> [2] https://issues.apache.org/jira/browse/FLINK-6188
>>>>>>>>>>
>>>>>>>>>> On Fri, Mar 31, 2017, at 19:10, Robert Metzger wrote:
>>>>>>>>>>> I think reverting FLINK-6188 for the 1.2 branch might be a good idea.
>>>>>>>>>>> FLINK-6188 introduced two new bugs, so undoing the FLINK-6188 fix will lead only to one known bug in 1.2.1, instead of an uncertain number of issues.
>>>>>>>>>>> So 1.2.1 is not going to be worse than 1.2.0
>>>>>>>>>>>
>>>>>>>>>>> The fix will hopefully make it into 1.2.2 then.
>>>>>>>>>>>
>>>>>>>>>>> Any other thoughts on this?
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Mar 31, 2017 at 6:46 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I merged the fix for FLINK-6044 to the release-1.2 and release-1.1 branch.
>>>>>>>>>>>>
>>>>>>>>>>>> 2017-03-31 15:02 GMT+02:00 Fabian Hueske <fhue...@gmail.com>:
>>>>>>>>>>>>
>>>>>>>>>>>>> We should also backport the fix for FLINK-6044 to Flink 1.2.1.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll take care of that.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2017-03-30 18:50 GMT+02:00 Aljoscha Krettek <aljos...@apache.org>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-6188 turns out to be a bit more involved, see my comments on the PR: https://github.com/apache/flink/pull/3616.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As I said there, maybe we should revert the commits regarding parallelism/max-parallelism changes and release and then fix it later.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Mar 29, 2017, at 23:08, Aljoscha Krettek wrote:
>>>>>>>>>>>>>>> I commented on FLINK-6214: I think it's working as intended, although we could fix the javadoc/doc.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Mar 29, 2017, at 17:35, Timo Walther wrote:
>>>>>>>>>>>>>>>> A user reported that all tumbling and sliding window assigners contain a pretty obvious bug about offsets.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-6214
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think we should also fix this for 1.2.1. What do you think?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 29/03/17 at 11:30, Robert Metzger wrote:
>>>>>>>>>>>>>>>>> Hi Haohui,
>>>>>>>>>>>>>>>>> I agree that we should fix the parallelism issue. Otherwise, the 1.2.1 release would introduce a new bug.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Mar 28, 2017 at 11:59 PM, Haohui Mai <ricet...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -1 (non-binding)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> We recently found out that all jobs submitted via UI will have a parallelism of 1, potentially due to FLINK-5808.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Filed FLINK-6209 to track it.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> ~Haohui
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Mar 27, 2017 at 2:59 AM Chesnay Schepler <ches...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> If possible I would like to include FLINK-6183 & FLINK-6184 as well.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> They fix 2 metric-related issues that could arise when a Task is cancelled very early. (like, right away)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> FLINK-6183 fixes a memory leak where the TaskMetricGroup was never closed
>>>>>>>>>>>>>>>>>>> FLINK-6184 fixes a NullPointerException in the buffer metrics
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> PR here: https://github.com/apache/flink/pull/3611
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 26.03.2017 12:35, Aljoscha Krettek wrote:
>>>>>>>>>>>>>>>>>>>> I opened a PR for FLINK-6188: https://github.com/apache/flink/pull/3616
>>>>>>>>>>>>>>>>>>>> This improves the previously very sparse test coverage for timestamp/watermark assigners and fixes the bug.
>>>>>>>>>>>>>>>>>>>>> On 25 Mar 2017, at 10:22, Ufuk Celebi <u...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I agree with Aljoscha.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> -1 because of FLINK-6188
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sat, Mar 25, 2017 at 9:38 AM, Aljoscha Krettek <aljos...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>> I filed this issue, which was observed by a user: https://issues.apache.org/jira/browse/FLINK-6188
>>>>>>>>>>>>>>>>>>>>>> I think that’s blocking for 1.2.1.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On 24 Mar 2017, at 18:57, Ufuk Celebi <u...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> RC1 doesn't contain Stefan's backport for the Asynchronous snapshots for heap-based keyed state that has been merged. Should we create RC2 with that fix since the voting period only starts on Monday? I think it would only mean rerunning the scripts on your side, right?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> – Ufuk
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Fri, Mar 24, 2017 at 3:05 PM, Robert Metzger <rmetz...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> Dear Flink community,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Please vote on releasing the following candidate as Apache Flink version 1.2.1.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> The commit to be voted on:
>>>>>>>>>>>>>>>>>>>>>>>> 732e55bd (http://git-wip-us.apache.org/repos/asf/flink/commit/732e55bd)
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Branch:
>>>>>>>>>>>>>>>>>>>>>>>> release-1.2.1-rc1
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> The release artifacts to be voted on can be found at:
>>>>>>>>>>>>>>>>>>>>>>>> http://people.apache.org/~rmetzger/flink-1.2.1-rc1/
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> The release artifacts are signed with the key with fingerprint D9839159:
>>>>>>>>>>>>>>>>>>>>>>>> http://www.apache.org/dist/flink/KEYS
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> The staging repository for this release can be found at:
>>>>>>>>>>>>>>>>>>>>>>>> https://repository.apache.org/content/repositories/orgapacheflink-1116
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> -------------------------------------------------------------
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> The vote ends on Wednesday, March 29, 2017, 3pm CET.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> [ ] +1 Release this package as Apache Flink 1.2.1
>>>>>>>>>>>>>>>>>>>>>>>> [ ] -1 Do not release this package, because ...