Hi,

I’m hereby canceling the vote for RC2 of Flink 1.8.0 because of various issues 
mentioned in the vote thread.

Best,
Aljoscha

> On 19. Mar 2019, at 11:48, Stephan Ewen <se...@apache.org> wrote:
> 
> @Gordon The tupleKeyBy and benchmarkCount regressions are most likely not
> caused by serializers, but more probably by a network stack change.
> I would look at the AvroSerializer issue independently of those benchmarks.
> 
> On Tue, Mar 19, 2019 at 8:23 AM Piotr Nowojski <piotr.nowoj...@gmail.com>
> wrote:
> 
>> Hi all,
>> 
>> The regression from mid-February looks like it happened in the commit
>> range 3d39cb0..a9eb6d7
>> 
>> I'm investigating the regression from January 29th. It happened in the
>> commit range 35fa2b7..81acd0a (I think I managed to reproduce the results
>> locally for it)
>> 
>> Piotrek
>> 
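Narrowing a regression down within a commit range such as the ones above is usually done by bisection. The following is only a minimal sketch of that idea, not the procedure Piotrek actually used: `first_bad_commit` and `is_bad` are hypothetical names, the commit list is assumed to be ordered oldest to newest with a known-good first entry and a known-bad last entry, and the predicate is assumed to be monotonic (once a commit is bad, all later ones are bad too).

```python
def first_bad_commit(commits, is_bad):
    """Return the first commit for which is_bad() is true.

    Assumptions (hypothetical helper, not part of any Flink tooling):
    - commits is ordered oldest -> newest and has at least two entries
    - commits[0] is known good, commits[-1] is known bad
    - is_bad is monotonic over the list
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]
```

In practice `git bisect` automates the same search on a real repository. With noisy benchmarks, `is_bad` would probably need to compare several runs per commit rather than a single measurement.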
>> On Tue, Mar 19, 2019 at 07:20, jincheng sun <sunjincheng...@gmail.com>
>> wrote:
>> 
>>> Hi Aljoscha,
>>> 
>>> I have merged the following issues found in RC1 and RC2 into the
>>> release-1.8 branch.
>>> 
>>> - Add `frocksdbjni` dependency in NOTICE - FLINK-11950
>>> - Improve end-to-end test - FLINK-11892
>>> - Deprecated Window API - FLINK-11918
>>> 
>>> Currently, I am performing functional testing of YARN cluster mode and
>>> multiple operating systems. I think these test results will be valid for
>>> the next RC as well.
>>> 
>>> Best,
>>> Jincheng
>>> 
>>> On Tue, Mar 19, 2019 at 11:45 AM Shaoxuan Wang <wshaox...@gmail.com> wrote:
>>> 
>>>> I tested RC2 with the following items:
>>>> - Maven Central Repository contains all artifacts
>>>> - Built the source with Maven (ensured all source files have Apache
>>>> headers)
>>>> - Checked that the checksums and GPG files (for instance, for
>>>> flink-core-1.8.0.jar) match the corresponding release files
>>>> - Verified that the source archives do not contain any binaries
>>>> - Manually executed the tests in the IDE
>>>> 
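The checksum step in the list above can be sketched roughly as follows. This is an illustration only, not the procedure Shaoxuan ran: it assumes the checksum is SHA-512 and that the checksum file begins with the hex digest (optionally followed by the file name). The helper names `sha512_of` and `matches_checksum_file` are hypothetical, and GPG signature verification is not covered.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum_file(artifact, checksum_file):
    """Compare an artifact against a checksum file.

    Assumption: the first whitespace-separated token of the checksum
    file is the expected hex digest.
    """
    expected = Path(checksum_file).read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

A call such as `matches_checksum_file("flink-core-1.8.0.jar", "flink-core-1.8.0.jar.sha512")` would then return `True` only if the digests agree; the `.sha512` file name here is an assumption about how the checksum was stored.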
>>>> @Aljoscha, per the discussion in RC1, we should consider sending the
>>>> release vote to the user group to gather more feedback.
>>>> @Gordon and @Yu, I noticed some perf regressions that occurred on
>>>> Jan. 29 (and have persisted since then) in the stateBackends.FS and
>>>> stateBackends.ROCKS_INC tests.
>>>> 
>>>> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=stateBackends.FS&env=2&revs=200&equid=off&quarts=on&extr=on
>>>> 
>>>> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tumblingWindow&env=2&revs=200&equid=off&quarts=on&extr=on
>>>> @Chesnay, how did you notice and capture the license Notice issue? It
>>>> seems very difficult to track. I am trying to understand how we organize
>>>> the license Notice. For this case, why do we only need to add the
>>>> dependency on 5.17.2-artisans-1.0 to the Notice file of flink-dist? It
>>>> seems there are other modules that bundle the flink-statebackend
>>>> dependency.
>>>> 
>>>> Regards,
>>>> Shaoxuan
>>>> 
>>>> 
>>>> 
>>>> On Tue, Mar 19, 2019 at 10:49 AM Tzu-Li (Gordon) Tai <tzuli...@apache.org>
>>>> wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> The regressions in the benchmark were also brought up earlier in this
>>>>> thread by Yu.
>>>>> From the previous investigations, these are the commits that touched
>>>>> relevant serializers (TupleSerializer, AvroSerializer, RowSerializer)
>>>>> around Jan / Feb:
>>>>> 
>>>>> TupleSerializer -
>>>>> 73e4d0ecfd (Thu Feb 14 11:56:51 2019 +0800) [FLINK-10493] Migrate all
>>>>> subclasses of TupleSerializerBase to use new serialization compatibility
>>>>> abstractions
>>>>> 
>>>>> AvroSerializer -
>>>>> 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
>>>>> from TypeSerializer
>>>>> 479ebd5987 (Tue Jan 29 15:06:09 2019 +0800) [FLINK-11436] [avro] Manually
>>>>> Java-deserialize AvroSerializer for backwards compatibility
>>>>> 
>>>>> RowSerializer -
>>>>> 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
>>>>> from TypeSerializer
>>>>> b434b32c08 (Wed Jan 30 22:53:27 2019 +0800) [FLINK-11329] [table] Migrating
>>>>> the RowSerializer to use new compatibility API
>>>>> 
>>>>> The odd thing is, the times of these commits don't really match the
>>>>> drops in their respective benchmark result timelines.
>>>>> For the tupleKeyBy benchmark, the drop started around the end of January,
>>>>> whereas the TupleSerializer was only last touched in mid-February.
>>>>> For the serializerRow and serializerAvro benchmarks, the drop occurred
>>>>> around mid-February, whereas the only commit around that time was
>>>>> 09bb7bbc0f ([FLINK-9803] Drop canEqual() from TypeSerializer).
>>>>> 
>>>>> The only possible explanation that I can offer for the AvroSerializer
>>>>> benchmark drop for now is 479ebd5987 (FLINK-11436).
>>>>> That commit had to touch the `readObject` method of the AvroSerializer,
>>>>> which introduced some type checks / casts.
>>>>> This may have caused a regression in deserializing the AvroSerializer
>>>>> itself, which would be accounted for in the job initialization phase of
>>>>> the serializerAvro benchmark.
>>>>> The commit should not have affected the per-record performance of the
>>>>> AvroSerializer.
>>>>> However, again, the commit time for 479ebd5987 was the end of January,
>>>>> whereas the benchmark result drop for the serializerAvro benchmark
>>>>> occurred around mid-February.
>>>>> 
>>>>> We haven't managed to identify any solid causes so far, only the above
>>>>> speculations.
>>>>> 
>>>>> Cheers,
>>>>> Gordon
>>>>> 
>>>>> 
>>>>> On Tue, Mar 19, 2019 at 1:36 AM Stephan Ewen <se...@apache.org> wrote:
>>>>> 
>>>>>> Piotr and I discovered a possible issue in the benchmarks.
>>>>>> 
>>>>>> Looking at the time graphs, there seems to be one issue that appeared
>>>>>> around the end of January. It increased network throughput, but decreased
>>>>>> overall performance and added more variation in time (possibly through
>>>>>> GC). Check the trend in these graphs:
>>>>>> 
>>>>>> Increased Throughput:
>>>>>> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=networkThroughput.1000,100ms&env=2&revs=200&equid=off&quarts=on&extr=on
>>>>>> 
>>>>>> Higher variance in count benchmark:
>>>>>> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=benchmarkCount&env=2&revs=200&equid=off&quarts=on&extr=on
>>>>>> 
>>>>>> Drop in tuple-key-by performance trend:
>>>>>> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tupleKeyBy&env=2&revs=200&equid=off&quarts=on&extr=on
>>>>>> 
>>>>>> In addition, the Avro and Row serializers seem to have a performance
>>>>>> drop since mid-February:
>>>>>> 
>>>>>> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerAvro&env=2&revs=200&equid=off&quarts=on&extr=on
>>>>>> 
>>>>>> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerRow&env=2&revs=200&equid=off&quarts=on&extr=on
>>>>>> 
>>>>>> @Gordon any idea what could be the cause of this?
>>>>>> 
>>>>>> 
>>>>>> On Mon, Mar 18, 2019 at 3:08 PM Yu Li <car...@gmail.com> wrote:
>>>>>> 
>>>>>>> I have been watching the benchmark data for days, and indeed it has
>>>>>>> normalized for the time being. However, the results seem to be
>>>>>>> unstable. I also ran the benchmark locally and observed obvious
>>>>>>> fluctuations even with the same commit...
>>>>>>> 
>>>>>>> I guess we may need to improve it, for example by increasing
>>>>>>> RECORDS_PER_INVOCATION, to get reproducible results. IMHO a stable
>>>>>>> micro benchmark is important for verifying perf-related improvements
>>>>>>> (and I think the benchmark and website are already great, they just
>>>>>>> need some love). Let me add this to my backlog; I will open a JIRA
>>>>>>> when prepared.
>>>>>>> 
>>>>>>> Anyway, good to know it's not a regression, and thanks for the effort
>>>>>>> spent on checking it over! @Gordon @Chesnay
>>>>>>> 
>>>>>>> Best Regards,
>>>>>>> Yu
>>>>>>> 
>>>>>>> 
>>>>>>> On Fri, 15 Mar 2019 at 19:20, Chesnay Schepler <ches...@apache.org>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> The regression is already normalizing again. I'd observe it further
>>>>>>>> before doing anything.
>>>>>>>> 
>>>>>>>> The same applies to the benchmarkCount, which tanked even more in
>>>>>>>> that same run.
>>>>>>>> 
>>>>>>>> On 15.03.2019 06:02, Tzu-Li (Gordon) Tai wrote:
>>>>>>>>> @Yu
>>>>>>>>> Thanks for reporting that Yu, great that this was noticed.
>>>>>>>>> 
>>>>>>>>> The serializerAvro case seems to only be testing on-wire serialization.
>>>>>>>>> I checked the changes to the `AvroSerializer`, and it seems like
>>>>>>>>> FLINK-11436 [1] with commit 479ebd59 was the only change that may have
>>>>>>>>> affected that.
>>>>>>>>> That commit wasn't introduced exactly around the time when the indicated
>>>>>>>>> performance regression occurred, but was still before the regression.
>>>>>>>>> The commit introduced some instanceof type checks / type casting in the
>>>>>>>>> readObject of the AvroSerializer, which may have caused this.
>>>>>>>>> 
>>>>>>>>> Currently investigating further.
>>>>>>>>> 
>>>>>>>>> Cheers,
>>>>>>>>> Gordon
>>>>>>>>> 
>>>>>>>>> On Fri, Mar 15, 2019 at 11:45 AM Yu Li <car...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> Hi Aljoscha and all,
>>>>>>>>>> 
>>>>>>>>>> From our performance benchmark web site (
>>>>>>>>>> http://codespeed.dak8s.net:8000/changes/) I observed a noticeable
>>>>>>>>>> regression (-6.92%) on the serializerAvro case when comparing the
>>>>>>>>>> latest 100 revisions, which may need some attention. Thanks.
>>>>>>>>>> 
>>>>>>>>>> Best Regards,
>>>>>>>>>> Yu
>>>>>>>>>> 
>>>>>>>>>> 
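A figure like the -6.92% quoted above is a relative change between two benchmark scores. The sketch below shows one way such a number could be computed and compared against a noise threshold. It is not how the Codespeed site computes its values: the function names are hypothetical, a higher-is-better score is assumed, and the scores in the usage note are made up for illustration.

```python
def relative_change_percent(baseline, current):
    """Relative change of `current` against `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline * 100.0


def is_regression(baseline, current, noise_percent):
    """Flag a drop only if it exceeds the assumed run-to-run noise.

    Assumption: a higher score is better, so a regression is a
    negative relative change larger in magnitude than noise_percent.
    """
    return relative_change_percent(baseline, current) < -abs(noise_percent)
```

For example, with a made-up baseline score of 1000.0 and a current score of 930.8, `relative_change_percent` gives roughly -6.92, and `is_regression(1000.0, 930.8, 5.0)` flags it. Given the instability discussed elsewhere in this thread, the threshold would have to be chosen from repeated runs of the same commit.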
>>>>>>>>>> On Thu, 14 Mar 2019 at 20:42, Aljoscha Krettek <aljos...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi everyone,
>>>>>>>>>>> Please review and vote on the release candidate 2 for Flink 1.8.0,
>>>>>>>>>>> as follows:
>>>>>>>>>>> [ ] +1, Approve the release
>>>>>>>>>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> The complete staging area is available for your review, which
>>>>>>>>>>> includes:
>>>>>>>>>>> * JIRA release notes [1],
>>>>>>>>>>> * the official Apache source release and binary convenience releases
>>>>>>>>>>> to be deployed to dist.apache.org [2], which are signed with the key
>>>>>>>>>>> with fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
>>>>>>>>>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>>>>>>>>>> * source code tag "release-1.8.0-rc2" [5],
>>>>>>>>>>> * website pull request listing the new release [6]
>>>>>>>>>>> * website pull request adding announcement blog post [7].
>>>>>>>>>>> 
>>>>>>>>>>> The vote will be open for at least 72 hours. It is adopted by
>>>>>>>>>>> majority approval, with at least 3 PMC affirmative votes.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Aljoscha
>>>>>>>>>>> 
>>>>>>>>>>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
>>>>>>>>>>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/
>>>>>>>>>>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>>>>>>>>>>> [4] https://repository.apache.org/content/repositories/orgapacheflink-1213
>>>>>>>>>>> <https://repository.apache.org/content/repositories/orgapacheflink-1210/>
>>>>>>>>>>> [5] https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c77a329b71e3068bfde965ae91921ad5c47246dd
>>>>>>>>>>> <https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=2d00b1c26d7b4554707063ab0d1d6cc236cfe8a5>
>>>>>>>>>>> [6] https://github.com/apache/flink-web/pull/180
>>>>>>>>>>> [7] https://github.com/apache/flink-web/pull/179
>>>>>>>>>>> 
>>>>>>>>>>> P.S. The difference to the previous RC1 is very small; you can fetch
>>>>>>>>>>> the two tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc2"
>>>>>>>>>>> to see the difference in commits. It consists of fixes for the issues
>>>>>>>>>>> that led to the cancellation of the previous RC plus smaller fixes.
>>>>>>>>>>> Most verification/testing that was carried out should apply as is to
>>>>>>>>>>> this RC.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
