@Gordon The tupleKeyBy and benchmarkCount regressions are most likely not
caused by serializers; a network stack change is the more probable cause.
I would look at the AvroSerializer issue independently of those benchmarks.
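
As a rough illustration of the `readObject` point Gordon makes in the quoted mail further down, here is a minimal sketch. It assumes a hypothetical `StubSerializer` class (not Flink's actual AvroSerializer) whose custom `readObject` does an instanceof check and a cast. It only shows that such a hook runs once per deserialized serializer instance and not on the per-record path; it makes no claim about the real cause of the benchmark drop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ReadObjectDemo {

    /** Hypothetical stand-in for a serializer; NOT Flink's AvroSerializer. */
    static class StubSerializer implements Serializable {
        private static final long serialVersionUID = 1L;

        // Counts how often the custom readObject hook ran in this JVM.
        static int readObjectCalls = 0;

        // Written and read manually by the hooks below.
        private transient Object typeInfo;

        StubSerializer(Object typeInfo) {
            this.typeInfo = typeInfo;
        }

        Object typeInfo() {
            return typeInfo;
        }

        // Per-record path: never goes through readObject.
        int serialize(String record) {
            return record.length();
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.writeObject(typeInfo);
        }

        // Called by Java serialization once per deserialized instance.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            Object first = in.readObject();
            if (first instanceof Class) {
                // Type check plus cast, similar in spirit to a compatibility path.
                typeInfo = (Class<?>) first;
            } else {
                typeInfo = String.valueOf(first);
            }
            readObjectCalls++;
        }
    }

    /** Java-serializes the serializer itself and reads it back. */
    static StubSerializer roundTrip(StubSerializer serializer)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(serializer);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (StubSerializer) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        StubSerializer restored = roundTrip(new StubSerializer(String.class));
        for (int i = 0; i < 1000; i++) {
            restored.serialize("record-" + i);
        }
        System.out.println("readObject calls: " + StubSerializer.readObjectCalls);
    }
}
```

If the real serializer behaves like this stub, extra checks in `readObject` would only show up in job initialization and not in per-record throughput, which is one more reason to treat the AvroSerializer question separately.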

On Tue, Mar 19, 2019 at 8:23 AM Piotr Nowojski <piotr.nowoj...@gmail.com>
wrote:

> Hi all,
>
> The regression from mid-February looks like it happened in the commit
> range 3d39cb0..a9eb6d7.
>
> I'm investigating the regression from January 29th. It happened in the
> commit range 35fa2b7..81acd0a (I think I managed to reproduce the results
> locally for it)
>
> Piotrek
>
> On Tue, Mar 19, 2019 at 07:20, jincheng sun <sunjincheng...@gmail.com>
> wrote:
>
>> Hi Aljoscha,
>>
>> I have merged the following issues found in RC1 and RC2 into the
>> release-1.8 branch.
>>
>> - Add `frocksdbjni` dependency in NOTICE - FLINK-11950
>> - Improve end-to-end test - FLINK-11892
>> - Deprecated Window API - FLINK-11918
>>
>> Currently, I am performing functional testing of YARN cluster mode and
>> multiple operating systems. I think these test results will be valid for
>> the next RC as well.
>>
>> Best,
>> Jincheng
>>
>> On Tue, Mar 19, 2019 at 11:45 AM, Shaoxuan Wang <wshaox...@gmail.com> wrote:
>>
>>> I tested RC2 with the following items:
>>> - Maven Central Repository contains all artifacts
>>> - Built the source with Maven (ensured all source files have Apache
>>> headers)
>>> - Checked that the checksums and GPG files (for instance, for
>>> flink-core-1.8.0.jar) match the corresponding release files
>>> - Verified that the source archives do not contain any binaries
>>> - Manually executed the tests in the IDE
>>>
>>> @Aljoscha, per the discussion in RC1, we should consider sending the
>>> release vote to the user group to gather more feedback.
>>> @Gordon and @Yu, I noticed some perf regressions that occurred on
>>> Jan. 29 (and have persisted since) in the stateBackends.FS and
>>> stateBackends.ROCKS_INC tests.
>>>
>>> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=stateBackends.FS&env=2&revs=200&equid=off&quarts=on&extr=on
>>>
>>> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tumblingWindow&env=2&revs=200&equid=off&quarts=on&extr=on
>>> @Chesnay, how did you notice and capture the license Notice issue? It
>>> seems very difficult to track. I am trying to understand how we organize
>>> the license Notice. For this case, why do we only need to add the
>>> dependency of 5.17.2-artisans-1.0 to the Notice file of flink-dist? It
>>> seems there are other modules that bundle the dependency of the
>>> flink-statebackend.
>>>
>>> Regards,
>>> Shaoxuan
>>>
>>>
>>>
>>> On Tue, Mar 19, 2019 at 10:49 AM Tzu-Li (Gordon) Tai <tzuli...@apache.org>
>>> wrote:
>>>
>>> > Hi,
>>> >
>>> > The regressions in the benchmark were also brought up earlier in this
>>> > thread by Yu.
>>> > From the previous investigations, these are the commits that touched
>>> > relevant serializers (TupleSerializer, AvroSerializer, RowSerializer)
>>> > around Jan / Feb:
>>> >
>>> > TupleSerializer -
>>> > 73e4d0ecfd (Thu Feb 14 11:56:51 2019 +0800) [FLINK-10493] Migrate all
>>> > subclasses of TupleSerializerBase to use new serialization
>>> > compatibility abstractions
>>> >
>>> > AvroSerializer -
>>> > 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop
>>> > canEqual() from TypeSerializer
>>> > 479ebd5987 (Tue Jan 29 15:06:09 2019 +0800) [FLINK-11436] [avro]
>>> > Manually Java-deserialize AvroSerializer for backwards compatibility
>>> >
>>> > RowSerializer -
>>> > 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop
>>> > canEqual() from TypeSerializer
>>> > b434b32c08 (Wed Jan 30 22:53:27 2019 +0800) [FLINK-11329] [table]
>>> > Migrating the RowSerializer to use new compatibility API
>>> >
>>> > The odd thing is, the times of these commits don't really match the
>>> > drops in their respective benchmark result timelines.
>>> > For the tupleKeyBy benchmark, the drop started around the end of
>>> > January, whereas the TupleSerializer was last touched only in
>>> > mid-February.
>>> > For the serializerRow and serializerAvro benchmarks, the drop occurred
>>> > around mid-February, whereas the only commit around that time was
>>> > 09bb7bbc0f ([FLINK-9803] Drop canEqual() from TypeSerializer).
>>> >
>>> > The only possible explanation that I can provide for the AvroSerializer
>>> > benchmark drop for now is 479ebd5987 (FLINK-11436).
>>> > That commit had to touch the `readObject` method of the AvroSerializer,
>>> > which introduced some type checks / casts.
>>> > This may have caused a regression in deserializing the AvroSerializer
>>> > itself, which would have been accounted for in the job initialization
>>> > phase of the serializerAvro benchmark.
>>> > The commit should not have affected per-record performance of the
>>> > AvroSerializer.
>>> > However, again, the commit time for 479ebd5987 was the end of January,
>>> > whereas the benchmark result drop occurred around mid-February for the
>>> > serializerAvro benchmark.
>>> >
>>> > We haven't managed to identify any solid causes so far, only the above
>>> > speculations.
>>> >
>>> > Cheers,
>>> > Gordon
>>> >
>>> >
>>> > On Tue, Mar 19, 2019 at 1:36 AM Stephan Ewen <se...@apache.org> wrote:
>>> >
>>> > > Piotr and I discovered a possible issue in the benchmarks.
>>> > >
>>> > > Looking at the time graphs, there seems to be one issue coming up
>>> > > around the end of January. It increased network throughput, but
>>> > > decreased overall performance and added more variation in time
>>> > > (possibly through GC). Check the trend in these graphs:
>>> > >
>>> > > Increased Throughput:
>>> > > http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=networkThroughput.1000,100ms&env=2&revs=200&equid=off&quarts=on&extr=on
>>> > >
>>> > > Higher variance in count benchmark:
>>> > > http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=benchmarkCount&env=2&revs=200&equid=off&quarts=on&extr=on
>>> > >
>>> > > Drop in tuple-key-by performance trend:
>>> > > http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tupleKeyBy&env=2&revs=200&equid=off&quarts=on&extr=on
>>> > >
>>> > > In addition, the Avro and Row serializers seem to have had a
>>> > > performance drop since mid-February:
>>> > >
>>> > > http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerAvro&env=2&revs=200&equid=off&quarts=on&extr=on
>>> > >
>>> > > http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerRow&env=2&revs=200&equid=off&quarts=on&extr=on
>>> > >
>>> > > @Gordon any idea what could be the cause of this?
>>> > >
>>> > >
>>> > > On Mon, Mar 18, 2019 at 3:08 PM Yu Li <car...@gmail.com> wrote:
>>> > >
>>> > > > I have been watching the benchmark data for days, and indeed it has
>>> > > > normalized for the time being. However, the result seems to be
>>> > > > unstable. I also tried the benchmark locally and observed obvious
>>> > > > fluctuations even with the same commit...
>>> > > >
>>> > > > I guess we may need to improve it, for example by increasing
>>> > > > RECORDS_PER_INVOCATION to generate a reproducible result. IMHO a
>>> > > > stable micro benchmark is important to verify perf-related
>>> > > > improvements (and I think the benchmark and website are already great
>>> > > > ones but just need some love). I'll add this to my backlog and open a
>>> > > > JIRA when ready.
>>> > > >
>>> > > > Anyway, good to know it's not a regression, and thanks for the efforts
>>> > > > spent on checking it over! @Gordon @Chesnay
>>> > > >
>>> > > > Best Regards,
>>> > > > Yu
>>> > > >
>>> > > >
>>> > > > On Fri, 15 Mar 2019 at 19:20, Chesnay Schepler <ches...@apache.org>
>>> > > > wrote:
>>> > > >
>>> > > > > The regression is already normalizing again. I'd observe it
>>> > > > > further before doing anything.
>>> > > > >
>>> > > > > The same applies to the benchmarkCount, which tanked even more in
>>> > > > > that same run.
>>> > > > >
>>> > > > > On 15.03.2019 06:02, Tzu-Li (Gordon) Tai wrote:
>>> > > > > > @Yu
>>> > > > > > Thanks for reporting that Yu, great that this was noticed.
>>> > > > > >
>>> > > > > > The serializerAvro case seems to only be testing on-wire
>>> > > > > > serialization.
>>> > > > > > I checked the changes to the `AvroSerializer`, and it seems like
>>> > > > > > FLINK-11436 [1] with commit 479ebd59 was the only change that may
>>> > > > > > have affected that.
>>> > > > > > That commit wasn't introduced exactly around the time when the
>>> > > > > > indicated performance regression occurred, but was still before the
>>> > > > > > regression.
>>> > > > > > The commit introduced some instanceof type checks / type casting in
>>> > > > > > the readObject of the AvroSerializer, which may have caused this.
>>> > > > > >
>>> > > > > > Currently investigating further.
>>> > > > > >
>>> > > > > > Cheers,
>>> > > > > > Gordon
>>> > > > > >
>>> > > > > > On Fri, Mar 15, 2019 at 11:45 AM Yu Li <car...@gmail.com> wrote:
>>> > > > > >
>>> > > > > >> Hi Aljoscha and all,
>>> > > > > >>
>>> > > > > >> From our performance benchmark web site (
>>> > > > > >> http://codespeed.dak8s.net:8000/changes/) I observed a noticeable
>>> > > > > >> regression (-6.92%) on the serializerAvro case comparing the
>>> > > > > >> latest 100 revisions, which may need some attention. Thanks.
>>> > > > > >>
>>> > > > > >> Best Regards,
>>> > > > > >> Yu
>>> > > > > >>
>>> > > > > >>
>>> > > > > >> On Thu, 14 Mar 2019 at 20:42, Aljoscha Krettek <aljos...@apache.org>
>>> > > > > >> wrote:
>>> > > > > >>
>>> > > > > >>> Hi everyone,
>>> > > > > >>> Please review and vote on the release candidate 2 for Flink 1.8.0,
>>> > > > > >>> as follows:
>>> > > > > >>> [ ] +1, Approve the release
>>> > > > > >>> [ ] -1, Do not approve the release (please provide specific
>>> > > > > >>> comments)
>>> > > > > >>>
>>> > > > > >>>
>>> > > > > >>> The complete staging area is available for your review, which
>>> > > > > >>> includes:
>>> > > > > >>> * JIRA release notes [1],
>>> > > > > >>> * the official Apache source release and binary convenience
>>> > > > > >>> releases to be deployed to dist.apache.org [2], which are
>>> > > > > >>> signed with the key with fingerprint
>>> > > > > >>> F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
>>> > > > > >>> * all artifacts to be deployed to the Maven Central Repository [4],
>>> > > > > >>> * source code tag "release-1.8.0-rc2" [5],
>>> > > > > >>> * website pull request listing the new release [6]
>>> > > > > >>> * website pull request adding announcement blog post [7].
>>> > > > > >>>
>>> > > > > >>> The vote will be open for at least 72 hours. It is adopted by
>>> > > > > >>> majority approval, with at least 3 PMC affirmative votes.
>>> > > > > >>>
>>> > > > > >>> Thanks,
>>> > > > > >>> Aljoscha
>>> > > > > >>>
>>> > > > > >>> [1]
>>> > > > > >>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
>>> > > > > >>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/
>>> > > > > >>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>>> > > > > >>> [4]
>>> > > > > >>> https://repository.apache.org/content/repositories/orgapacheflink-1213
>>> > > > > >>> <https://repository.apache.org/content/repositories/orgapacheflink-1210/>
>>> > > > > >>> [5]
>>> > > > > >>> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c77a329b71e3068bfde965ae91921ad5c47246dd
>>> > > > > >>> <https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=2d00b1c26d7b4554707063ab0d1d6cc236cfe8a5>
>>> > > > > >>> [6] https://github.com/apache/flink-web/pull/180
>>> > > > > >>> [7] https://github.com/apache/flink-web/pull/179
>>> > > > > >>>
>>> > > > > >>> P.S. The difference to the previous RC1 is very small; you can
>>> > > > > >>> fetch the two tags and do a "git log
>>> > > > > >>> release-1.8.0-rc1..release-1.8.0-rc2" to see the difference in
>>> > > > > >>> commits. It's fixes for the issues that led to the cancellation of
>>> > > > > >>> the previous RC plus smaller fixes. Most verification/testing that
>>> > > > > >>> was carried out should apply as is to this RC.
