Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-19 Thread Stephan Ewen
@Gordon The tupleKeyBy and benchmarkCount regressions are most likely not
caused by serializers; a network stack change is the more probable cause.
I would look at the AvroSerializer issue independently of those benchmarks.

On Tue, Mar 19, 2019 at 8:23 AM Piotr Nowojski 
wrote:

> Hi all,
>
> Regarding the regression from mid February looks like happened in this
> commit range 3d39cb0..a9eb6d7
>
> I'm investigating the regression from January 29th. It happened in the
> commit range 35fa2b7..81acd0a (I think I managed to reproduce the results
> locally for it)
>
> Piotrek
>
> wt., 19 mar 2019 o 07:20 jincheng sun 
> napisał(a):
>
>> Hi Alijoscha,
>>
>> I have merged the following issues found in RC1 and RC2 into the
>> release-1.8 branch.
>>
>> - Add `frocksdbjni` dependency in NOTICE - FLINK-11950
>> - Improve end-to-end test  - FLINK-11892
>> - Deprecated Window API - FLINK-11918
>>
>> Currently, I am performing functional testing of YARN cluster mode and
>> multiple operating systems. I think these tests result will be valid for
>> the next RC as well.
>>
>> Best,
>> Jincheng
>>
>> Shaoxuan Wang  于2019年3月19日周二 上午11:45写道:
>>
>>> I tested RC2 with the following items:
>>> - Maven Central Repository contains all artifacts
>>> - Built the source with Maven (ensured all source files have Apache
>>> headers)
>>> - Checked checksums and GPG files (for instance, flink-core-1.8.0.jar)
>>> that
>>> match the corresponding release files
>>> - Verified that the source archives do not contains any binaries
>>> - Manually executed the tests in IDE
>>>
>>> @Alijoscha, per the discussion in RC1, we should consider sending the
>>> release vote to the user group to gather more feedbacks.
>>> @Gordon and @Yu, I noticed there are some perf regressions occurred on
>>> Jan.29 (and consistently exist after that) for the tests
>>> of stateBackends.FS and stateBackends.ROCKS_INC.
>>>
>>> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=stateBackends.FS&env=2&revs=200&equid=off&quarts=on&extr=on
>>>
>>> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tumblingWindow&env=2&revs=200&equid=off&quarts=on&extr=on
>>> @Chesnay, how did you notice and capture the license Notice issue? It
>>> seems
>>> very difficult to track. I am trying to understand the way how we
>>> organized
>>> the license Notice. For this case, why do we only need to add the
>>> dependency of 5.17.2-artisans-1.0 to the Notice file of flink-dist? It
>>> seems there are other modules that bundles dependency of the
>>> flink-statebackend.
>>>
>>> Regards,
>>> Shaoxuan
>>>
>>>
>>>
>>> On Tue, Mar 19, 2019 at 10:49 AM Tzu-Li (Gordon) Tai <
>>> tzuli...@apache.org>
>>> wrote:
>>>
>>> > Hi,
>>> >
>>> > The regressions in the benchmark were also brought up earlier in this
>>> > thread by Yu.
>>> > From the previous investigations, these are the commits that touched
>>> > relevant serializers (TupleSerializer, AvroSerializer, RowSerializer)
>>> > around Jan / Feb:
>>> >
>>> > TupleSerializer -
>>> > 73e4d0ecfd (Thu Feb 14 11:56:51 2019 +0800) [FLINK-10493] Migrate all
>>> > subclasses of TupleSerializerBase to use new serialization
>>> compatibility
>>> > abstractions
>>> >
>>> > AvroSerializer -
>>> > 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop
>>> canEqual()
>>> > from TypeSerializer
>>> > 479ebd5987 (Tue Jan 29 15:06:09 2019 +0800) [FLINK-11436] [avro]
>>> Manually
>>> > Java-deserialize AvroSerializer for backwards compatibility
>>> >
>>> > RowSerializer -
>>> > 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop
>>> canEqual()
>>> > from TypeSerializer
>>> > b434b32c08 (Wed Jan 30 22:53:27 2019 +0800) [FLINK-11329] [table]
>>> Migrating
>>> > the RowSerializer to use new compatibility API
>>> >
>>> > The odd thing is, the times of these commits don't really match the
>>> drops
>>> > in their respective benchmark result timeline.
>>> > For tupleKeyBy benchmark, the drop started around end of January,
>>> where as
>>> > the TupleSerializer was only last touched mid February.
>>> > For the serializerRow and serializerAvro benchmarks, the drop occurred
>>> > around mid February, where as the only commit around that time was
>>> > 09bb7bbc0f ([FLINK-9803] Drop canEqual() from TypeSerializer).
>>> >
>>> > The only possible explanation that I can provide for the AvroSerializer
>>> > benchmark drop for now, is due to 479ebd5987 (FLINK-11436).
>>> > That commit had to touch the `readObject` method of the AvroSerializer,
>>> > which introduced some type checks / casts.
>>> > This may have caused regression in deserializing the AvroSerializer
>>> itself,
>>> > which would have been accounted for in the job initialization phase of
>>> the
>>> > serializerAvro benchmark.
>>> > The commit should not have affected per-record performance of the
>>> > AvroSerializer.
>>> > However, again, the commit time for 479ebd5987 was end of January,
>>> where as
>>> > the benchmark result drop occurred around mid February for the
>>> > serializerAvro benchmark.
>>> >
>>> > We hav

Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-19 Thread Piotr Nowojski
Hi all,

The regression from mid February looks like it happened in the commit
range 3d39cb0..a9eb6d7.

I'm investigating the regression from January 29th. It happened in the
commit range 35fa2b7..81acd0a (I think I managed to reproduce the results
locally for it).

Piotrek

On Tue, Mar 19, 2019 at 07:20 jincheng sun wrote:

> Hi Alijoscha,
>
> I have merged the following issues found in RC1 and RC2 into the
> release-1.8 branch.
>
> - Add `frocksdbjni` dependency in NOTICE - FLINK-11950
> - Improve end-to-end test  - FLINK-11892
> - Deprecated Window API - FLINK-11918
>
> Currently, I am performing functional testing of YARN cluster mode and
> multiple operating systems. I think these tests result will be valid for
> the next RC as well.
>
> Best,
> Jincheng
>
> Shaoxuan Wang  于2019年3月19日周二 上午11:45写道:
>
>> I tested RC2 with the following items:
>> - Maven Central Repository contains all artifacts
>> - Built the source with Maven (ensured all source files have Apache
>> headers)
>> - Checked checksums and GPG files (for instance, flink-core-1.8.0.jar)
>> that
>> match the corresponding release files
>> - Verified that the source archives do not contains any binaries
>> - Manually executed the tests in IDE
>>
>> @Alijoscha, per the discussion in RC1, we should consider sending the
>> release vote to the user group to gather more feedbacks.
>> @Gordon and @Yu, I noticed there are some perf regressions occurred on
>> Jan.29 (and consistently exist after that) for the tests
>> of stateBackends.FS and stateBackends.ROCKS_INC.
>>
>> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=stateBackends.FS&env=2&revs=200&equid=off&quarts=on&extr=on
>>
>> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tumblingWindow&env=2&revs=200&equid=off&quarts=on&extr=on
>> @Chesnay, how did you notice and capture the license Notice issue? It
>> seems
>> very difficult to track. I am trying to understand the way how we
>> organized
>> the license Notice. For this case, why do we only need to add the
>> dependency of 5.17.2-artisans-1.0 to the Notice file of flink-dist? It
>> seems there are other modules that bundles dependency of the
>> flink-statebackend.
>>
>> Regards,
>> Shaoxuan
>>
>>
>>
>> On Tue, Mar 19, 2019 at 10:49 AM Tzu-Li (Gordon) Tai > >
>> wrote:
>>
>> > Hi,
>> >
>> > The regressions in the benchmark were also brought up earlier in this
>> > thread by Yu.
>> > From the previous investigations, these are the commits that touched
>> > relevant serializers (TupleSerializer, AvroSerializer, RowSerializer)
>> > around Jan / Feb:
>> >
>> > TupleSerializer -
>> > 73e4d0ecfd (Thu Feb 14 11:56:51 2019 +0800) [FLINK-10493] Migrate all
>> > subclasses of TupleSerializerBase to use new serialization compatibility
>> > abstractions
>> >
>> > AvroSerializer -
>> > 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
>> > from TypeSerializer
>> > 479ebd5987 (Tue Jan 29 15:06:09 2019 +0800) [FLINK-11436] [avro]
>> Manually
>> > Java-deserialize AvroSerializer for backwards compatibility
>> >
>> > RowSerializer -
>> > 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
>> > from TypeSerializer
>> > b434b32c08 (Wed Jan 30 22:53:27 2019 +0800) [FLINK-11329] [table]
>> Migrating
>> > the RowSerializer to use new compatibility API
>> >
>> > The odd thing is, the times of these commits don't really match the
>> drops
>> > in their respective benchmark result timeline.
>> > For tupleKeyBy benchmark, the drop started around end of January, where
>> as
>> > the TupleSerializer was only last touched mid February.
>> > For the serializerRow and serializerAvro benchmarks, the drop occurred
>> > around mid February, where as the only commit around that time was
>> > 09bb7bbc0f ([FLINK-9803] Drop canEqual() from TypeSerializer).
>> >
>> > The only possible explanation that I can provide for the AvroSerializer
>> > benchmark drop for now, is due to 479ebd5987 (FLINK-11436).
>> > That commit had to touch the `readObject` method of the AvroSerializer,
>> > which introduced some type checks / casts.
>> > This may have caused regression in deserializing the AvroSerializer
>> itself,
>> > which would have been accounted for in the job initialization phase of
>> the
>> > serializerAvro benchmark.
>> > The commit should not have affected per-record performance of the
>> > AvroSerializer.
>> > However, again, the commit time for 479ebd5987 was end of January,
>> where as
>> > the benchmark result drop occurred around mid February for the
>> > serializerAvro benchmark.
>> >
>> > We haven't managed to identify any solid causes so far, only the above
>> > speculations.
>> >
>> > Cheers,
>> > Gordon
>> >
>> >
>> > On Tue, Mar 19, 2019 at 1:36 AM Stephan Ewen  wrote:
>> >
>> > > Piotr and me discovered a possible issue in the benchmarks.
>> > >
>> > > Looking at the time graphs, there seems to be one issue coming around
>> end
>> > > of January. It increased network throughput, but decreased overall
>> >

Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread jincheng sun
Hi Chesnay, thanks for sharing your long-term plan for
`flink-shaded-hadoop`!
Got your points. I also think we need a JIRA for that, at the right time.
Best Regards,
Jincheng

On Mon, Mar 18, 2019 at 8:13 PM Chesnay Schepler wrote:

> Long term plan _is_ to move flink-shaded-hadoop to flink-shaded, I believe
> there's even a JIRA for that.
>
> Until that is in place they _must_ have retain the flink version as
> otherwise we'd be unable to change them in follow-up releases without
> changing the version scheme again.
>
> And even after the move they will retain the flink-shaded version like all
> other flink-shaded modules, for the above reason.
>
> On 18.03.2019 12:10, jincheng sun wrote:
>
> Hi Chesnay,
>
> The artifacts to be released do not have a SNAPSHOT suffix:
>>
>> https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/
>
> Thank you for providing this link. It's very useful for contributors who
> want to check the RC on YARN.
>
> My suggestion may not describe being clear, let me explain:
>
> 1. Since 1.8.0, Flink's release package will not contain the corresponding
> Hadoop dependency, then the user has two ways to get the required hadoop
> dependency:
>
>1). Download the existing Hadoop version on the Flink download page.
>2). Generate the version required by the user from the source code (see
> https://ci.apache.org/projects/flink/flink-docs-master/flinkDev/building.html#hadoop-versions)
> For example, version 2.6.1 is required: `mvn clean install -DskipTests
> -Dhadoop.version=2.6.1`.
>
> 2. About how to manage the JARs release of Hadoop dependencies:
>
>1). The name of Hadoop shaded version should not include Flink
> version,  take your link as an example:
>`.../flink-shaded-hadoop2-uber/2.4.1-1.8.0/xx.jar`
>`.../flink-shaded-hadoop2-uber/2.6.5-1.8.0/xx.jar`
>`.../flink-shaded-hadoop2-uber/2.7.5-1.8.0/xx.jar`
>`.../flink-shaded-hadoop2-uber/2.8.3-1.8.0/xx.jar`
> The above version name I think it is possible to change `2.4.1-1.8.0` to
> `2.4.1`. That is, the same version of `Hadoop` shade can be used in many
> Flink versions, such as 2.8.3 Hadoop is not only available for Flink-1.8.0,
> it can be used by Flink-1.8.x or it can be used by Flink-1.9.x. etc.
>
>2). Release the shaded-Hadoop independently:
>For a long-term,  we can release the shaded JARs independently and move
> `flink-shaded-hadoop` into `https://github.com/apache/flink-shaded`,  So
> I suggest that we can publish Hadoop versions independently,  and share
> them in multiple Flink versions.
>
> What do you think?
>
> Best,
> Jincheng
>
>
> Chesnay Schepler  于2019年3月18日周一 下午4:15写道:
>
>> We release SNAPSHOT artifacts for all module, see
>>
>> https://repository.apache.org/content/groups/public/org/apache/flink/flink-core/
>> .
>>
>> The artifacts to be released do not have a SNAPSHOT suffix:
>>
>> https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/
>>
>> Finally, we are already adding flink-shaded-hadoop to the optional
>> components section in this PR:
>> https://github.com/apache/flink-web/pull/180
>>
>> On 18.03.2019 08:55, jincheng sun wrote:
>> > -1
>> >
>> > Currently, we have released the Hadoop-related JRA as a snapshot
>> > version(such as  flink-shaded-hadoop2-uber/2.4.1-1.8-SNAPSHOT
>> > <
>> https://repository.apache.org/content/groups/public/org/apache/flink/flink-shaded-hadoop2-uber/
>> >),
>> > I think we should release a stable version.
>> > When testing the release code on YARN, currently user cannot find out
>> the
>> > Hadoop dependency.  Although there is a download explanation for Hadoop
>> in
>> > PR [`Update Downloads page for Flink 1.8
>> > `], a 404 error
>> occurs
>> > when you click Download ( I had left detail comments in the PR).
>> >
>> > So, I suggest as follows:
>> >
>> >1. It would be better to add the changes for
>> > `downloads.html#optional-components`, add the Hadoop relation JARs
>> download
>> > link first.
>> >2. Then add instructions on how to get the dependencies of the
>> Hadoop or
>> > add the correct download link directly in the next VOTE mail, due to we
>> do
>> > not include Hadoop in `flink-dist`.
>> >3.  Release a stable version Hadoop-related JRAs.
>> >
>> > Then, contributors can test it more easily on YARN.  What do you think?
>> >
>> > Best,
>> > Jincheng
>> >
>> >
>> > Chesnay Schepler  于2019年3月15日周五 下午10:35写道:
>> >
>> >> -1
>> >>
>> >> Missing dependencies in NOTICE file of flink-dist (and by extension the
>> >> binary distribution).
>> >> * com.data-artisans:frocksdbjni:jar:5.17.2-artisans-1.0
>> >>
>> >> On 14.03.2019 13:42, Aljoscha Krettek wrote:
>> >>> Hi everyone,
>> >>> Please review and vote on the release candidate 2 for Flink 1.8.0, as
>> >> follows:
>> >>> [ ] +1, Approve the release
>> >>> [ ] -1, Do not approve the release (please provide specific 

Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread jincheng sun
Hi Aljoscha,

I have merged the fixes for the following issues found in RC1 and RC2 into
the release-1.8 branch.

- Add `frocksdbjni` dependency in NOTICE - FLINK-11950
- Improve end-to-end test - FLINK-11892
- Deprecated Window API - FLINK-11918

Currently, I am performing functional testing on YARN cluster mode and on
multiple operating systems. I think these test results will be valid for
the next RC as well.

Best,
Jincheng

On Tue, Mar 19, 2019 at 11:45 AM Shaoxuan Wang wrote:

> I tested RC2 with the following items:
> - Maven Central Repository contains all artifacts
> - Built the source with Maven (ensured all source files have Apache
> headers)
> - Checked checksums and GPG files (for instance, flink-core-1.8.0.jar) that
> match the corresponding release files
> - Verified that the source archives do not contains any binaries
> - Manually executed the tests in IDE
>
> @Alijoscha, per the discussion in RC1, we should consider sending the
> release vote to the user group to gather more feedbacks.
> @Gordon and @Yu, I noticed there are some perf regressions occurred on
> Jan.29 (and consistently exist after that) for the tests
> of stateBackends.FS and stateBackends.ROCKS_INC.
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=stateBackends.FS&env=2&revs=200&equid=off&quarts=on&extr=on
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tumblingWindow&env=2&revs=200&equid=off&quarts=on&extr=on
> @Chesnay, how did you notice and capture the license Notice issue? It seems
> very difficult to track. I am trying to understand the way how we organized
> the license Notice. For this case, why do we only need to add the
> dependency of 5.17.2-artisans-1.0 to the Notice file of flink-dist? It
> seems there are other modules that bundles dependency of the
> flink-statebackend.
>
> Regards,
> Shaoxuan
>
>
>
> On Tue, Mar 19, 2019 at 10:49 AM Tzu-Li (Gordon) Tai 
> wrote:
>
> > Hi,
> >
> > The regressions in the benchmark were also brought up earlier in this
> > thread by Yu.
> > From the previous investigations, these are the commits that touched
> > relevant serializers (TupleSerializer, AvroSerializer, RowSerializer)
> > around Jan / Feb:
> >
> > TupleSerializer -
> > 73e4d0ecfd (Thu Feb 14 11:56:51 2019 +0800) [FLINK-10493] Migrate all
> > subclasses of TupleSerializerBase to use new serialization compatibility
> > abstractions
> >
> > AvroSerializer -
> > 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
> > from TypeSerializer
> > 479ebd5987 (Tue Jan 29 15:06:09 2019 +0800) [FLINK-11436] [avro] Manually
> > Java-deserialize AvroSerializer for backwards compatibility
> >
> > RowSerializer -
> > 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
> > from TypeSerializer
> > b434b32c08 (Wed Jan 30 22:53:27 2019 +0800) [FLINK-11329] [table]
> Migrating
> > the RowSerializer to use new compatibility API
> >
> > The odd thing is, the times of these commits don't really match the drops
> > in their respective benchmark result timeline.
> > For tupleKeyBy benchmark, the drop started around end of January, where
> as
> > the TupleSerializer was only last touched mid February.
> > For the serializerRow and serializerAvro benchmarks, the drop occurred
> > around mid February, where as the only commit around that time was
> > 09bb7bbc0f ([FLINK-9803] Drop canEqual() from TypeSerializer).
> >
> > The only possible explanation that I can provide for the AvroSerializer
> > benchmark drop for now, is due to 479ebd5987 (FLINK-11436).
> > That commit had to touch the `readObject` method of the AvroSerializer,
> > which introduced some type checks / casts.
> > This may have caused regression in deserializing the AvroSerializer
> itself,
> > which would have been accounted for in the job initialization phase of
> the
> > serializerAvro benchmark.
> > The commit should not have affected per-record performance of the
> > AvroSerializer.
> > However, again, the commit time for 479ebd5987 was end of January, where
> as
> > the benchmark result drop occurred around mid February for the
> > serializerAvro benchmark.
> >
> > We haven't managed to identify any solid causes so far, only the above
> > speculations.
> >
> > Cheers,
> > Gordon
> >
> >
> > On Tue, Mar 19, 2019 at 1:36 AM Stephan Ewen  wrote:
> >
> > > Piotr and me discovered a possible issue in the benchmarks.
> > >
> > > Looking at the time graphs, there seems to be one issue coming around
> end
> > > of January. It increased network throughput, but decreased overall
> > > performance and added more variation in time (possibly through GC).
> Check
> > > the trend in these graphs:
> > >
> > > Increased Throughput:
> > >
> > >
> >
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=networkThroughput.1000,100ms&env=2&revs=200&equid=off&quarts=on&extr=on
> > > Higher variance in count benchmark:
> > >
> > >
> >
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=benchmarkCount&env=2&revs=200&equid=off&quarts=on&extr=on
> > > Drop in tuple

Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Shaoxuan Wang
I tested RC2 with the following items:
- Maven Central Repository contains all artifacts
- Built the source with Maven (ensured all source files have Apache headers)
- Checked that the checksums and GPG signatures (for instance, for
flink-core-1.8.0.jar) match the corresponding release files
- Verified that the source archives do not contain any binaries
- Manually executed the tests in the IDE
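
For anyone who wants to script the checksum part of the list above, here is a minimal Java sketch. It assumes the published checksum file holds a hex digest as its first whitespace-separated token; the class and method names (`ChecksumCheck`, `hexDigest`, `matches`) are made up for illustration, and it does not cover the GPG signature check.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative helper only; the names here are hypothetical, not part of any release tooling. */
final class ChecksumCheck {

    /** Returns the lowercase hex digest of the data for the given algorithm, e.g. "SHA-512". */
    static String hexDigest(String algorithm, byte[] data) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown digest algorithm: " + algorithm, e);
        }
        byte[] hash = digest.digest(data);
        StringBuilder hex = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            // Mask to 0..255 so negative bytes are printed as two hex digits.
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    /**
     * Compares the computed digest with the first token of the checksum file content.
     * Assumption: the file looks like "<hex digest>  <file name>".
     */
    static boolean matches(String algorithm, byte[] data, String checksumFileContent) {
        String expected = checksumFileContent.trim().split("\\s+")[0];
        return hexDigest(algorithm, data).equalsIgnoreCase(expected);
    }

    public static void main(String[] args) {
        // Stand-in bytes; in practice these would be the bytes of the downloaded artifact.
        byte[] artifact = "abc".getBytes(StandardCharsets.UTF_8);
        String published = hexDigest("SHA-512", artifact) + "  flink-core-1.8.0.jar";
        System.out.println(matches("SHA-512", artifact, published));
    }
}
```

In a real check the artifact bytes and the checksum text would be read from the staged release files rather than built in memory.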

@Aljoscha, per the discussion in RC1, we should consider sending the
release vote to the user group to gather more feedback.
@Gordon and @Yu, I noticed some perf regressions that occurred on
Jan. 29 (and have persisted since then) in the stateBackends.FS and
stateBackends.ROCKS_INC benchmarks.
http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=stateBackends.FS&env=2&revs=200&equid=off&quarts=on&extr=on
http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tumblingWindow&env=2&revs=200&equid=off&quarts=on&extr=on
@Chesnay, how did you notice and catch the license NOTICE issue? It seems
very difficult to track. I am trying to understand how we organize the
NOTICE files. In this case, why do we only need to add the
5.17.2-artisans-1.0 dependency to the NOTICE file of flink-dist? It
seems there are other modules that bundle the flink-statebackend
dependency.

Regards,
Shaoxuan



On Tue, Mar 19, 2019 at 10:49 AM Tzu-Li (Gordon) Tai 
wrote:

> Hi,
>
> The regressions in the benchmark were also brought up earlier in this
> thread by Yu.
> From the previous investigations, these are the commits that touched
> relevant serializers (TupleSerializer, AvroSerializer, RowSerializer)
> around Jan / Feb:
>
> TupleSerializer -
> 73e4d0ecfd (Thu Feb 14 11:56:51 2019 +0800) [FLINK-10493] Migrate all
> subclasses of TupleSerializerBase to use new serialization compatibility
> abstractions
>
> AvroSerializer -
> 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
> from TypeSerializer
> 479ebd5987 (Tue Jan 29 15:06:09 2019 +0800) [FLINK-11436] [avro] Manually
> Java-deserialize AvroSerializer for backwards compatibility
>
> RowSerializer -
> 09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
> from TypeSerializer
> b434b32c08 (Wed Jan 30 22:53:27 2019 +0800) [FLINK-11329] [table] Migrating
> the RowSerializer to use new compatibility API
>
> The odd thing is, the times of these commits don't really match the drops
> in their respective benchmark result timeline.
> For tupleKeyBy benchmark, the drop started around end of January, where as
> the TupleSerializer was only last touched mid February.
> For the serializerRow and serializerAvro benchmarks, the drop occurred
> around mid February, where as the only commit around that time was
> 09bb7bbc0f ([FLINK-9803] Drop canEqual() from TypeSerializer).
>
> The only possible explanation that I can provide for the AvroSerializer
> benchmark drop for now, is due to 479ebd5987 (FLINK-11436).
> That commit had to touch the `readObject` method of the AvroSerializer,
> which introduced some type checks / casts.
> This may have caused regression in deserializing the AvroSerializer itself,
> which would have been accounted for in the job initialization phase of the
> serializerAvro benchmark.
> The commit should not have affected per-record performance of the
> AvroSerializer.
> However, again, the commit time for 479ebd5987 was end of January, where as
> the benchmark result drop occurred around mid February for the
> serializerAvro benchmark.
>
> We haven't managed to identify any solid causes so far, only the above
> speculations.
>
> Cheers,
> Gordon
>
>
> On Tue, Mar 19, 2019 at 1:36 AM Stephan Ewen  wrote:
>
> > Piotr and me discovered a possible issue in the benchmarks.
> >
> > Looking at the time graphs, there seems to be one issue coming around end
> > of January. It increased network throughput, but decreased overall
> > performance and added more variation in time (possibly through GC). Check
> > the trend in these graphs:
> >
> > Increased Throughput:
> >
> >
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=networkThroughput.1000,100ms&env=2&revs=200&equid=off&quarts=on&extr=on
> > Higher variance in count benchmark:
> >
> >
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=benchmarkCount&env=2&revs=200&equid=off&quarts=on&extr=on
> > Drop in tuple-key-by performance trend:
> >
> >
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tupleKeyBy&env=2&revs=200&equid=off&quarts=on&extr=on
> >
> > In addition, the Avro and Row serializers seem to have a performance drop
> > since mid February:
> >
> >
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerAvro&env=2&revs=200&equid=off&quarts=on&extr=on
> >
> >
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerRow&env=2&revs=200&equid=off&quarts=on&extr=on
> >
> > @Gordon any idea what could be the cause of this?
> >
> >
> > On Mon, Mar 18, 2019 at 3:08 PM Yu Li  wrote:
> >
> > > Watching the benchmark data for days and indeed it's normal

Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Tzu-Li (Gordon) Tai
Hi,

The regressions in the benchmark were also brought up earlier in this
thread by Yu.
From the previous investigations, these are the commits that touched
relevant serializers (TupleSerializer, AvroSerializer, RowSerializer)
around Jan / Feb:

TupleSerializer -
73e4d0ecfd (Thu Feb 14 11:56:51 2019 +0800) [FLINK-10493] Migrate all
subclasses of TupleSerializerBase to use new serialization compatibility
abstractions

AvroSerializer -
09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
from TypeSerializer
479ebd5987 (Tue Jan 29 15:06:09 2019 +0800) [FLINK-11436] [avro] Manually
Java-deserialize AvroSerializer for backwards compatibility

RowSerializer -
09bb7bbc0f (Wed Feb 20 09:52:57 2019 +0100) [FLINK-9803] Drop canEqual()
from TypeSerializer
b434b32c08 (Wed Jan 30 22:53:27 2019 +0800) [FLINK-11329] [table] Migrating
the RowSerializer to use new compatibility API

The odd thing is, the times of these commits don't really match the drops
in their respective benchmark result timelines.
For the tupleKeyBy benchmark, the drop started around the end of January,
whereas the TupleSerializer was last touched only in mid February.
For the serializerRow and serializerAvro benchmarks, the drop occurred
around mid February, whereas the only commit around that time was
09bb7bbc0f ([FLINK-9803] Drop canEqual() from TypeSerializer).

The only possible explanation that I can offer for the AvroSerializer
benchmark drop for now is 479ebd5987 (FLINK-11436).
That commit had to touch the `readObject` method of the AvroSerializer,
and introduced some type checks / casts there.
This may have caused a regression in deserializing the AvroSerializer itself,
which would be accounted for in the job initialization phase of the
serializerAvro benchmark.
The commit should not have affected the per-record performance of the
AvroSerializer.
However, again, the commit time for 479ebd5987 was the end of January,
whereas the benchmark result drop occurred around mid February for the
serializerAvro benchmark.
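
To illustrate the distinction between the two code paths, here is a minimal Java sketch. It is not the actual AvroSerializer code: `SketchSerializer`, its fields, and the "older layout" branch are hypothetical. It only shows that type checks and casts inside a custom `readObject` run once per Java-deserialization of the serializer object, while the per-record path never touches them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a serializer; this is NOT Flink's AvroSerializer. */
final class SketchSerializer implements Serializable {
    private static final long serialVersionUID = 1L;

    /** Counts readObject invocations so the effect is observable (illustration only). */
    static int readObjectCalls = 0;

    private transient Class<?> type;

    SketchSerializer(Class<?> type) {
        this.type = type;
    }

    Class<?> type() {
        return type;
    }

    /** Per-record path: it never goes through readObject. */
    byte[] serialize(Object record) {
        Object checked = type.cast(record);
        return String.valueOf(checked).getBytes(StandardCharsets.UTF_8);
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeObject(type);
    }

    /** Runs once per deserialization of the serializer object itself. */
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        Object first = in.readObject();
        if (first instanceof Class) {
            type = (Class<?>) first;
        } else if (first instanceof String) {
            // Assumed "older layout" that stored a class name; purely illustrative.
            type = Class.forName((String) first);
        } else {
            throw new IOException("Unexpected serialized form: " + first);
        }
        readObjectCalls++;
    }

    /** Java-serializes and deserializes the serializer, as happens when a job is shipped. */
    static SketchSerializer roundTrip(SketchSerializer original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SketchSerializer) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SketchSerializer copy = roundTrip(new SketchSerializer(String.class));
        for (int i = 0; i < 1_000; i++) {
            copy.serialize("record-" + i);
        }
        System.out.println("readObject calls: " + readObjectCalls);
    }
}
```

If the real change behaves like this sketch, its cost would show up only where the serializer object is deserialized, which is consistent with the per-record argument above.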

We haven't managed to identify any solid causes so far, only the above
speculations.

Cheers,
Gordon


On Tue, Mar 19, 2019 at 1:36 AM Stephan Ewen  wrote:

> Piotr and me discovered a possible issue in the benchmarks.
>
> Looking at the time graphs, there seems to be one issue coming around end
> of January. It increased network throughput, but decreased overall
> performance and added more variation in time (possibly through GC). Check
> the trend in these graphs:
>
> Increased Throughput:
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=networkThroughput.1000,100ms&env=2&revs=200&equid=off&quarts=on&extr=on
> Higher variance in count benchmark:
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=benchmarkCount&env=2&revs=200&equid=off&quarts=on&extr=on
> Drop in tuple-key-by performance trend:
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tupleKeyBy&env=2&revs=200&equid=off&quarts=on&extr=on
>
> In addition, the Avro and Row serializers seem to have a performance drop
> since mid February:
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerAvro&env=2&revs=200&equid=off&quarts=on&extr=on
>
> http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerRow&env=2&revs=200&equid=off&quarts=on&extr=on
>
> @Gordon any idea what could be the cause of this?
>
>
> On Mon, Mar 18, 2019 at 3:08 PM Yu Li  wrote:
>
> > Watching the benchmark data for days and indeed it's normalized for the
> > time being. However, the result seems to be unstable. I also tried the
> > benchmark locally and observed obvious wave even with the same commit...
> >
> > I guess we may need to improve it such as increasing the
> > RECORDS_PER_INVOCATION to generate a reproducible result. IMHO a stable
> > micro benchmark is important to verify perf-related improvements (and I
> > think the benchmark and website are already great ones but just need some
> > love). Let me mark this as one of my backlog and will open a JIRA when
> > prepared.
> >
> > Anyway good to know it's not a regression, and thanks for the efforts
> spent
> > on checking it over! @Gordon @Chesnay
> >
> > Best Regards,
> > Yu
> >
> >
> > On Fri, 15 Mar 2019 at 19:20, Chesnay Schepler 
> wrote:
> >
> > > The regressions is already normalizing again. I'd observer it further
> > > before doing anything.
> > >
> > > The same applies to the benchmarkCount which tanked even more in that
> > > same run.
> > >
> > > On 15.03.2019 06:02, Tzu-Li (Gordon) Tai wrote:
> > > > @Yu
> > > > Thanks for reporting that Yu, great that this was noticed.
> > > >
> > > > The serializerAvro case seems to only be testing on-wire
> serialization.
> > > > I checked the changes to the `AvroSerializer`, and it seems like
> > > > FLINK-11436 [1] with commit 479ebd59 was the only change that may
> have
> > > > affected that.
> > > > That commit wasn't introduced exactly around the time when the
> > indicated
> > > > performance regression occurred, but was still before 

Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Stephan Ewen
Piotr and I discovered a possible issue in the benchmarks.

Looking at the time graphs, there seems to be one issue that appeared around
the end of January. It increased network throughput, but decreased overall
performance and added more variation in time (possibly through GC). Check
the trend in these graphs:

Increased Throughput:
http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=networkThroughput.1000,100ms&env=2&revs=200&equid=off&quarts=on&extr=on
Higher variance in count benchmark:
http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=benchmarkCount&env=2&revs=200&equid=off&quarts=on&extr=on
Drop in tuple-key-by performance trend:
http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=tupleKeyBy&env=2&revs=200&equid=off&quarts=on&extr=on

In addition, the Avro and Row serializers seem to have a performance drop
since mid February:
http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerAvro&env=2&revs=200&equid=off&quarts=on&extr=on
http://codespeed.dak8s.net:8000/timeline/#/?exe=1&ben=serializerRow&env=2&revs=200&equid=off&quarts=on&extr=on
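
One way to put a number on "more variation" is to compare the coefficient of variation (standard deviation divided by mean) of the scores before and after a suspected commit. The Java sketch below is only an illustration: the class name `BenchmarkSpread` and the sample scores are made up, and it is not how the codespeed site computes its trends.

```java
/** Illustrative statistics helper; names and sample values are hypothetical. */
final class BenchmarkSpread {

    static double mean(double[] scores) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("need at least one score");
        }
        double sum = 0.0;
        for (double s : scores) {
            sum += s;
        }
        return sum / scores.length;
    }

    /** Sample standard deviation (divides by n - 1). */
    static double sampleStdDev(double[] scores) {
        if (scores.length < 2) {
            throw new IllegalArgumentException("need at least two scores");
        }
        double m = mean(scores);
        double squared = 0.0;
        for (double s : scores) {
            squared += (s - m) * (s - m);
        }
        return Math.sqrt(squared / (scores.length - 1));
    }

    /** Relative spread: standard deviation as a fraction of the mean. */
    static double coefficientOfVariation(double[] scores) {
        return sampleStdDev(scores) / mean(scores);
    }

    public static void main(String[] args) {
        // Made-up scores for the runs before and after a suspected commit.
        double[] before = {100.0, 101.0, 99.0, 100.0};
        double[] after = {90.0, 110.0, 95.0, 105.0};
        System.out.printf("before: mean=%.1f cv=%.4f%n", mean(before), coefficientOfVariation(before));
        System.out.printf("after:  mean=%.1f cv=%.4f%n", mean(after), coefficientOfVariation(after));
    }
}
```

A higher coefficient of variation with a similar mean would point at added noise (for example GC pauses) rather than a plain slowdown.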

@Gordon any idea what could be the cause of this?



Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Yu Li
I have been watching the benchmark data for days, and indeed it has
normalized for the time being. However, the result seems to be unstable. I
also ran the benchmark locally and observed obvious fluctuation even with
the same commit...

I guess we may need to improve it, for example by increasing
RECORDS_PER_INVOCATION, to generate a reproducible result. IMHO a stable
micro benchmark is important to verify perf-related improvements (and I
think the benchmark and website are already great ones but just need some
love). Let me add this to my backlog; I will open a JIRA when it is
prepared.
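
To illustrate why a larger RECORDS_PER_INVOCATION could make the results more
reproducible, here is a minimal Python sketch. It is not the Flink benchmark
code: the per-record cost model, the noise level, and the names
`simulate_invocation` and `spread` are assumptions made up for illustration.
It only shows that the run-to-run spread of the measured mean shrinks when
more records are averaged per invocation.

```python
import random
import statistics


def simulate_invocation(records, rng, base_cost=1.0, noise=0.5):
    """Return the mean per-record cost of one simulated benchmark invocation.

    Each record's cost is drawn independently from a normal distribution;
    this is an assumed toy model, not a measurement of Flink.
    """
    total = sum(rng.gauss(base_cost, noise) for _ in range(records))
    return total / records


def spread(records, invocations=100, seed=42):
    """Standard deviation of the per-invocation mean across repeated runs."""
    rng = random.Random(seed)
    means = [simulate_invocation(records, rng) for _ in range(invocations)]
    return statistics.stdev(means)


if __name__ == "__main__":
    # Compare a small and a large (hypothetical) records-per-invocation value.
    for records in (100, 2500):
        print(records, spread(records))
```

Note that this only models independent per-record noise. Effects that hit a
whole invocation at once (for example a GC pause or JIT warm-up) would not
average out this way, so a larger record count may help but is not guaranteed
to be sufficient.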

Anyway good to know it's not a regression, and thanks for the efforts spent
on checking it over! @Gordon @Chesnay

Best Regards,
Yu


On Fri, 15 Mar 2019 at 19:20, Chesnay Schepler  wrote:

> The regression is already normalizing again. I'd observe it further
> before doing anything.
>
> The same applies to the benchmarkCount which tanked even more in that
> same run.
>
> On 15.03.2019 06:02, Tzu-Li (Gordon) Tai wrote:
> > @Yu
> > Thanks for reporting that Yu, great that this was noticed.
> >
> > The serializerAvro case seems to only be testing on-wire serialization.
> > I checked the changes to the `AvroSerializer`, and it seems like
> > FLINK-11436 [1] with commit 479ebd59 was the only change that may have
> > affected that.
> > That commit wasn't introduced exactly around the time when the indicated
> > performance regression occurred, but was still before the regression.
> > The commit introduced some instanceof type checks / type casting in the
> > readObject of the AvroSerializer, which may have caused this.
> >
> > Currently investigating further.
> >
> > Cheers,
> > Gordon
> >
> > On Fri, Mar 15, 2019 at 11:45 AM Yu Li  wrote:
> >
> >> Hi Aljoscha and all,
> >>
> >>  From our performance benchmark web site (
> >> http://codespeed.dak8s.net:8000/changes/) I observed a noticeable
> >> regression (-6.92%) on the serializerAvro case comparing the latest 100
> >> revisions, which may need some attention. Thanks.
> >>
> >> Best Regards,
> >> Yu
> >>
> >>
> >> On Thu, 14 Mar 2019 at 20:42, Aljoscha Krettek 
> >> wrote:
> >>
> >>> Hi everyone,
> >>> Please review and vote on the release candidate 2 for Flink 1.8.0, as
> >>> follows:
> >>> [ ] +1, Approve the release
> >>> [ ] -1, Do not approve the release (please provide specific comments)
> >>>
> >>>
> >>> The complete staging area is available for your review, which includes:
> >>> * JIRA release notes [1],
> >>> * the official Apache source release and binary convenience releases to
> >>> be deployed to dist.apache.org [2], which are signed with the key with
> >>> fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
> >>> * all artifacts to be deployed to the Maven Central Repository [4],
> >>> * source code tag "release-1.8.0-rc2" [5],
> >>> * website pull request listing the new release [6]
> >>> * website pull request adding announcement blog post [7].
> >>>
> >>> The vote will be open for at least 72 hours. It is adopted by majority
> >>> approval, with at least 3 PMC affirmative votes.
> >>>
> >>> Thanks,
> >>> Aljoscha
> >>>
> >>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> >>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/
> >>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> >>> [4] https://repository.apache.org/content/repositories/orgapacheflink-1213
> >>> [5] https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c77a329b71e3068bfde965ae91921ad5c47246dd
> >>> [6] https://github.com/apache/flink-web/pull/180
> >>> [7] https://github.com/apache/flink-web/pull/179
> >>>
> >>> P.S. The difference to the previous RC1 is very small. You can fetch the
> >>> two tags and run "git log release-1.8.0-rc1..release-1.8.0-rc2" to see
> >>> the difference in commits. It consists of fixes for the issues that led
> >>> to the cancellation of the previous RC plus smaller fixes. Most
> >>> verification/testing that was carried out should apply as is to this RC.
>
>
>


Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Yu Li
@Aljoscha I see, thanks for the quick response!

Best Regards,
Yu


On Mon, 18 Mar 2019 at 19:11, Aljoscha Krettek  wrote:

> @Yu Thanks for the pointer. This is because I didn’t yet update the
> buildbot configuration for the new release. It’s a point that is very low
> in the release guide but I think I’ll do that now.
>
> > On 18. Mar 2019, at 09:37, Yu Li  wrote:
> >
> > One supplement for point #2: there's a Note on the doc for the error, but
> > I'm wondering why we don't directly remove the -DarchetypeCatalog option
> > in the command and tell users to specify the catalog in settings.xml if
> > they prefer to. I mean, users tend to try the command first before
> > checking the note and will get the error. Thanks.
> >
> > Best Regards,
> > Yu
> >
> >
> > On Mon, 18 Mar 2019 at 16:30, Yu Li  wrote:
> >
> >> Issues observed when checking quick start:
> >>
> >> 1. The versions on the document
> >> <https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/projectsetup/java_api_quickstart.html>
> >> are still "1.9-SNAPSHOT" instead of "1.8.0"
> >>
> >> 2. The "Use Maven archetypes" command failed with below error:
> >> [ERROR] Failed to execute goal
> >> org.apache.maven.plugins:maven-archetype-plugin:3.0.1:generate
> >> (default-cli) on project standalone-pom: archetypeCatalog '
> >> https://repository.apache.org/content/repositories/snapshots/' is not
> >> supported anymore. Please read the plugin documentation for details.
> >>
> >> Best Regards,
> >> Yu
> >>
> >>

Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Chesnay Schepler
The long-term plan _is_ to move flink-shaded-hadoop to flink-shaded; I
believe there's even a JIRA for that.


Until that is in place they _must_ retain the Flink version, as
otherwise we'd be unable to change them in follow-up releases without
changing the version scheme again.


And even after the move they will retain the flink-shaded version like 
all other flink-shaded modules, for the above reason.



Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Chesnay Schepler
Additionally, which I unfortunately did not realize earlier, we must also
remove the "org.rocksdb:rocksdbjni" entries from the NOTICE files (i.e.
replace them with frocksdb).


On 15.03.2019 15:35, Chesnay Schepler wrote:

-1

Missing dependencies in NOTICE file of flink-dist (and by extension 
the binary distribution).

* com.data-artisans:frocksdbjni:jar:5.17.2-artisans-1.0


Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread jincheng sun
Hi Chesnay,

> The artifacts to be released do not have a SNAPSHOT suffix:
>
> https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/

Thank you for providing this link. It's very useful for contributors who
want to check the RC on YARN.

My suggestion may not have been clear, so let me explain:

1. Since 1.8.0, Flink's release package will not contain the corresponding
Hadoop dependency, so the user has two ways to get the required Hadoop
dependency:

   1). Download one of the existing Hadoop versions from the Flink download
page.
   2). Build the required version from the source code (see
https://ci.apache.org/projects/flink/flink-docs-master/flinkDev/building.html#hadoop-versions).
For example, if version 2.6.1 is required: `mvn clean install -DskipTests
-Dhadoop.version=2.6.1`.

2. About how to manage the release of the Hadoop dependency JARs:

   1). The version name of the shaded Hadoop artifact should not include the
Flink version. Take your link as an example:
   `.../flink-shaded-hadoop2-uber/2.4.1-1.8.0/xx.jar`
   `.../flink-shaded-hadoop2-uber/2.6.5-1.8.0/xx.jar`
   `.../flink-shaded-hadoop2-uber/2.7.5-1.8.0/xx.jar`
   `.../flink-shaded-hadoop2-uber/2.8.3-1.8.0/xx.jar`
I think it is possible to change the above version name `2.4.1-1.8.0` to
`2.4.1`. That is, the same shaded `Hadoop` version can be used with many
Flink versions. For example, Hadoop 2.8.3 is not only usable with Flink
1.8.0; it can also be used with Flink 1.8.x, Flink 1.9.x, etc.

   2). Release the shaded Hadoop independently:
   In the long term, we can release the shaded JARs independently and move
`flink-shaded-hadoop` into `https://github.com/apache/flink-shaded`. So I
suggest that we publish the shaded Hadoop versions independently and share
them across multiple Flink versions.
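
The difference between the two naming schemes above can be sketched in
Python. This is only an illustration of the naming, not of actual binary
compatibility: the helper names `split_shaded_version` and `published_for`
are hypothetical, and the assumption that a version string is always
`<hadoop>-<flink>` is taken solely from the four example paths.

```python
from typing import NamedTuple, Optional


class ShadedVersion(NamedTuple):
    hadoop: str
    flink: Optional[str]  # None under the proposed Flink-independent scheme


def split_shaded_version(version: str) -> ShadedVersion:
    """Split e.g. '2.4.1-1.8.0' into its Hadoop and Flink parts.

    Assumes the first '-' separates the Hadoop version from the Flink
    version; a string without '-' is treated as Hadoop-only.
    """
    hadoop, sep, flink = version.partition("-")
    return ShadedVersion(hadoop, flink if sep else None)


def published_for(version: str, flink_version: str) -> bool:
    """Whether an artifact name is tied to, or shared with, a Flink version.

    A name without a Flink part is considered shared by all Flink versions.
    """
    parsed = split_shaded_version(version)
    return parsed.flink is None or parsed.flink == flink_version


if __name__ == "__main__":
    print(split_shaded_version("2.4.1-1.8.0"))
    print(published_for("2.8.3", "1.9.0"))
```

Under the current scheme every Flink release would publish its own set of
names, while under the proposed scheme one name such as `2.8.3` would be
shared.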

What do you think?

Best,
Jincheng



Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Aljoscha Krettek
@Yu Thanks for the pointer. This is because I didn’t yet update the buildbot 
configuration for the new release. It’s a point that is very low in the release 
guide but I think I’ll do that now.


Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Yu Li
One supplement for point #2: there's a Note on the doc for the error, but
I'm wondering why we don't directly remove the -DarchetypeCatalog option in
the command and tell users to specify the catalog in settings.xml if they
prefer to. I mean, users tend to try the command first before checking the
note and will get the error. Thanks.

Best Regards,
Yu


On Mon, 18 Mar 2019 at 16:30, Yu Li  wrote:

> Issues observed when checking quick start:
>
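> 1. The versions on the document
> <https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/projectsetup/java_api_quickstart.html>
> are still "1.9-SNAPSHOT" instead of "1.8.0"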
>
> 2. The "Use Maven archetypes" command failed with below error:
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-archetype-plugin:3.0.1:generate
> (default-cli) on project standalone-pom: archetypeCatalog '
> https://repository.apache.org/content/repositories/snapshots/' is not
> supported anymore. Please read the plugin documentation for details.
>
> Best Regards,
> Yu
>
>
> On Mon, 18 Mar 2019 at 16:15, Chesnay Schepler  wrote:
>
>> We release SNAPSHOT artifacts for all modules, see
>>
>> https://repository.apache.org/content/groups/public/org/apache/flink/flink-core/
>> .
>>
>> The artifacts to be released do not have a SNAPSHOT suffix:
>>
>> https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/
>>
>> Finally, we are already adding flink-shaded-hadoop to the optional
>> components section in this PR:
>> https://github.com/apache/flink-web/pull/180
>>
>> On 18.03.2019 08:55, jincheng sun wrote:
>> > -1
>> >
>> > Currently, we have released the Hadoop-related JARs as a snapshot
>> > version (such as flink-shaded-hadoop2-uber/2.4.1-1.8-SNAPSHOT
>> > <https://repository.apache.org/content/groups/public/org/apache/flink/flink-shaded-hadoop2-uber/>),
>> > I think we should release a stable version.
>> > When testing the release code on YARN, users currently cannot find
>> > the Hadoop dependency. Although there is a download explanation for
>> > Hadoop in PR [`Update Downloads page for Flink 1.8`], a 404 error occurs
>> > when you click Download (I left detailed comments in the PR).
>> >
>> > So, I suggest as follows:
>> >
>> >    1. It would be better to add the changes for
>> > `downloads.html#optional-components` first, adding the download links
>> > for the Hadoop-related JARs.
>> >    2. Then add instructions on how to get the Hadoop dependencies, or
>> > add the correct download link directly in the next VOTE mail, since we
>> > do not include Hadoop in `flink-dist`.
>> >    3. Release a stable version of the Hadoop-related JARs.
>> >
>> > Then, contributors can test it more easily on YARN.  What do you think?
>> >
>> > Best,
>> > Jincheng
>> >
>> >
>> > Chesnay Schepler  于2019年3月15日周五 下午10:35写道:
>> >
>> >> -1
>> >>
>> >> Missing dependencies in NOTICE file of flink-dist (and by extension the
>> >> binary distribution).
>> >> * com.data-artisans:frocksdbjni:jar:5.17.2-artisans-1.0
>> >>
>> >> On 14.03.2019 13:42, Aljoscha Krettek wrote:
>> >>> Hi everyone,
>> >>> Please review and vote on the release candidate 2 for Flink 1.8.0, as
>> >> follows:
>> >>> [ ] +1, Approve the release
>> >>> [ ] -1, Do not approve the release (please provide specific comments)
>> >>>
>> >>>
>> >>> The complete staging area is available for your review, which
>> includes:
>> >>> * JIRA release notes [1],
>> >>> * the official Apache source release and binary convenience releases
>> to
>> >> be deployed to dist.apache.org  [2], which
>> are
>> >> signed with the key with fingerprint
>> >> F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
>> >>> * all artifacts to be deployed to the Maven Central Repository [4],
>> >>> * source code tag "release-1.8.0-rc2" [5],
>> >>> * website pull request listing the new release [6]
>> >>> * website pull request adding announcement blog post [7].
>> >>>
>> >>> The vote will be open for at least 72 hours. It is adopted by majority
>> >> approval, with at least 3 PMC affirmative votes.
>> >>> Thanks,
>> >>> Aljoscha
>> >>>
>> >>> [1]
>> >>
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
>> >> <
>> >>
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
>> >>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/ <
>> >> https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/>
>> >>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS <
>> >> https://dist.apache.org/repos/dist/release/flink/KEYS>
>> >>> [4]
>> >> https://repository.apache.org/content/repositories/orgapacheflink-1213
>> <
>> >>
>> https://repository.apache.org/content/repositories/orgapacheflink-1210/>
>> >>> [5]
>> >>
>> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c77a329b71e3068bfde965ae91921ad5c47246dd

Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Yu Li
Issues observed when checking quick start:

1. The versions in the document are still "1.9-SNAPSHOT" instead of "1.8.0".

2. The "Use Maven archetypes" command failed with the error below:
[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-archetype-plugin:3.0.1:generate
(default-cli) on project standalone-pom: archetypeCatalog '
https://repository.apache.org/content/repositories/snapshots/' is not
supported anymore. Please read the plugin documentation for details.

Best Regards,
Yu


On Mon, 18 Mar 2019 at 16:15, Chesnay Schepler  wrote:

> We release SNAPSHOT artifacts for all module, see
>
> https://repository.apache.org/content/groups/public/org/apache/flink/flink-core/
> .
>
> The artifacts to be released do not have a SNAPSHOT suffix:
>
> https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/
>
> Finally, we are already adding flink-shaded-hadoop to the optional
> components section in this PR:
> https://github.com/apache/flink-web/pull/180
>
> On 18.03.2019 08:55, jincheng sun wrote:
> > -1
> >
> > Currently, we have released the Hadoop-related JRA as a snapshot
> > version(such as  flink-shaded-hadoop2-uber/2.4.1-1.8-SNAPSHOT
> > <
> https://repository.apache.org/content/groups/public/org/apache/flink/flink-shaded-hadoop2-uber/
> >),
> > I think we should release a stable version.
> > When testing the release code on YARN, currently user cannot find out the
> > Hadoop dependency.  Although there is a download explanation for Hadoop
> in
> > PR [`Update Downloads page for Flink 1.8
> > `], a 404 error
> occurs
> > when you click Download ( I had left detail comments in the PR).
> >
> > So, I suggest as follows:
> >
> >1. It would be better to add the changes for
> > `downloads.html#optional-components`, add the Hadoop relation JARs
> download
> > link first.
> >2. Then add instructions on how to get the dependencies of the Hadoop
> or
> > add the correct download link directly in the next VOTE mail, due to we
> do
> > not include Hadoop in `flink-dist`.
> >3.  Release a stable version Hadoop-related JRAs.
> >
> > Then, contributors can test it more easily on YARN.  What do you think?
> >
> > Best,
> > Jincheng
> >
> >
> > Chesnay Schepler  于2019年3月15日周五 下午10:35写道:
> >
> >> -1
> >>
> >> Missing dependencies in NOTICE file of flink-dist (and by extension the
> >> binary distribution).
> >> * com.data-artisans:frocksdbjni:jar:5.17.2-artisans-1.0
> >>
> >> On 14.03.2019 13:42, Aljoscha Krettek wrote:
> >>> Hi everyone,
> >>> Please review and vote on the release candidate 2 for Flink 1.8.0, as
> >> follows:
> >>> [ ] +1, Approve the release
> >>> [ ] -1, Do not approve the release (please provide specific comments)
> >>>
> >>>
> >>> The complete staging area is available for your review, which includes:
> >>> * JIRA release notes [1],
> >>> * the official Apache source release and binary convenience releases to
> >> be deployed to dist.apache.org  [2], which are
> >> signed with the key with fingerprint
> >> F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
> >>> * all artifacts to be deployed to the Maven Central Repository [4],
> >>> * source code tag "release-1.8.0-rc2" [5],
> >>> * website pull request listing the new release [6]
> >>> * website pull request adding announcement blog post [7].
> >>>
> >>> The vote will be open for at least 72 hours. It is adopted by majority
> >> approval, with at least 3 PMC affirmative votes.
> >>> Thanks,
> >>> Aljoscha
> >>>
> >>> [1]
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> >> <
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> >>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/ <
> >> https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/>
> >>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS <
> >> https://dist.apache.org/repos/dist/release/flink/KEYS>
> >>> [4]
> >> https://repository.apache.org/content/repositories/orgapacheflink-1213
> <
> >> https://repository.apache.org/content/repositories/orgapacheflink-1210/
> >
> >>> [5]
> >>
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c77a329b71e3068bfde965ae91921ad5c47246dd
> >> <
> >>
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=2d00b1c26d7b4554707063ab0d1d6cc236cfe8a5
> >>> [6] https://github.com/apache/flink-web/pull/180 <
> >> https://github.com/apache/flink-web/pull/180>
> >>> [7] https://github.com/apache/flink-web/pull/179 <
> >> https://github.com/apache/flink-web/pull/179>
> >>> P.S. The difference to the previous RC1 is very small, you can fetch
> the
> >> two tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc2” to see
> the
> >> difference in commits. Its fixes for the issues th

Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread Chesnay Schepler
We release SNAPSHOT artifacts for all modules, see:
https://repository.apache.org/content/groups/public/org/apache/flink/flink-core/


The artifacts to be released do not have a SNAPSHOT suffix: 
https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/
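
The distinction above comes down to the version suffix, which can be checked mechanically. This is only a minimal sketch: the helper name `is_snapshot` is hypothetical, and the release-style version `2.4.1-1.8.0` is an assumed example (only `2.4.1-1.8-SNAPSHOT` is quoted in this thread).

```python
def is_snapshot(version: str) -> bool:
    """Return True if a Maven version string ends with the -SNAPSHOT suffix."""
    return version.endswith("-SNAPSHOT")


# Version string mentioned in this thread.
print(is_snapshot("2.4.1-1.8-SNAPSHOT"))
# Assumed example of a release-style version (not taken from the thread).
print(is_snapshot("2.4.1-1.8.0"))
```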


Finally, we are already adding flink-shaded-hadoop to the optional 
components section in this PR: https://github.com/apache/flink-web/pull/180


On 18.03.2019 08:55, jincheng sun wrote:

-1

Currently, we have released the Hadoop-related JARs as snapshot
versions (such as flink-shaded-hadoop2-uber/2.4.1-1.8-SNAPSHOT).
I think we should release a stable version.
When testing the release code on YARN, users currently cannot find the
Hadoop dependency. Although the PR [`Update Downloads page for Flink 1.8`]
explains how to download Hadoop, clicking Download returns a 404 error
(I left detailed comments in the PR).

So, I suggest the following:

   1. First add the changes for `downloads.html#optional-components`,
including the download links for the Hadoop-related JARs.
   2. Then, because we do not include Hadoop in `flink-dist`, add
instructions on how to get the Hadoop dependencies, or add the correct
download link directly in the next VOTE mail.
   3. Release stable versions of the Hadoop-related JARs.

Then, contributors can test it more easily on YARN.  What do you think?

Best,
Jincheng


Chesnay Schepler  于2019年3月15日周五 下午10:35写道:


-1

Missing dependencies in NOTICE file of flink-dist (and by extension the
binary distribution).
* com.data-artisans:frocksdbjni:jar:5.17.2-artisans-1.0

On 14.03.2019 13:42, Aljoscha Krettek wrote:

Hi everyone,
Please review and vote on the release candidate 2 for Flink 1.8.0, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to
be deployed to dist.apache.org [2], which are signed with the key with
fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.8.0-rc2" [5],
* website pull request listing the new release [6]
* website pull request adding announcement blog post [7].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Aljoscha

[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1213 <
https://repository.apache.org/content/repositories/orgapacheflink-1210/>
[5] https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c77a329b71e3068bfde965ae91921ad5c47246dd <
https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=2d00b1c26d7b4554707063ab0d1d6cc236cfe8a5>
[6] https://github.com/apache/flink-web/pull/180
[7] https://github.com/apache/flink-web/pull/179

P.S. The difference to the previous RC1 is very small; you can fetch the
two tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc2" to see the
difference in commits. It contains fixes for the issues that led to the
cancellation of the previous RC plus smaller fixes. Most
verification/testing that was carried out should apply as is to this RC.







Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-18 Thread jincheng sun
-1

Currently, we have released the Hadoop-related JARs as snapshot
versions (such as flink-shaded-hadoop2-uber/2.4.1-1.8-SNAPSHOT).
I think we should release a stable version.
When testing the release code on YARN, users currently cannot find the
Hadoop dependency. Although the PR [`Update Downloads page for Flink 1.8`]
explains how to download Hadoop, clicking Download returns a 404 error
(I left detailed comments in the PR).

So, I suggest the following:

  1. First add the changes for `downloads.html#optional-components`,
including the download links for the Hadoop-related JARs.
  2. Then, because we do not include Hadoop in `flink-dist`, add
instructions on how to get the Hadoop dependencies, or add the correct
download link directly in the next VOTE mail.
  3. Release stable versions of the Hadoop-related JARs.

Then, contributors can test it more easily on YARN.  What do you think?

Best,
Jincheng


Chesnay Schepler  于2019年3月15日周五 下午10:35写道:

> -1
>
> Missing dependencies in NOTICE file of flink-dist (and by extension the
> binary distribution).
> * com.data-artisans:frocksdbjni:jar:5.17.2-artisans-1.0
>
> On 14.03.2019 13:42, Aljoscha Krettek wrote:
> > Hi everyone,
> > Please review and vote on the release candidate 2 for Flink 1.8.0, as
> follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> > * the official Apache source release and binary convenience releases to
> be deployed to dist.apache.org  [2], which are
> signed with the key with fingerprint
> F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag "release-1.8.0-rc2" [5],
> > * website pull request listing the new release [6]
> > * website pull request adding announcement blog post [7].
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
> >
> > Thanks,
> > Aljoscha
> >
> > [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> <
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> >
> > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/ <
> https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/>
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS <
> https://dist.apache.org/repos/dist/release/flink/KEYS>
> > [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1213 <
> https://repository.apache.org/content/repositories/orgapacheflink-1210/>
> > [5]
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c77a329b71e3068bfde965ae91921ad5c47246dd
> <
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=2d00b1c26d7b4554707063ab0d1d6cc236cfe8a5
> >
> > [6] https://github.com/apache/flink-web/pull/180 <
> https://github.com/apache/flink-web/pull/180>
> > [7] https://github.com/apache/flink-web/pull/179 <
> https://github.com/apache/flink-web/pull/179>
> >
> > P.S. The difference to the previous RC1 is very small, you can fetch the
> two tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc2” to see the
> difference in commits. Its fixes for the issues that led to the
> cancellation of the previous RC plus smaller fixes. Most
> verification/testing that was carried out should apply as is to this RC.
>
>
>


Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-15 Thread Chesnay Schepler

-1

Missing dependencies in NOTICE file of flink-dist (and by extension the 
binary distribution).

* com.data-artisans:frocksdbjni:jar:5.17.2-artisans-1.0

On 14.03.2019 13:42, Aljoscha Krettek wrote:

Hi everyone,
Please review and vote on the release candidate 2 for Flink 1.8.0, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to be deployed 
to dist.apache.org  [2], which are signed with the key 
with fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.8.0-rc2" [5],
* website pull request listing the new release [6]
* website pull request adding announcement blog post [7].

The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.
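
The adoption rule stated above can be written down as a small check. This is a sketch under assumptions, not the project's tooling: the function name `vote_passes` is hypothetical, and "majority approval" is assumed here to be counted over PMC votes only.

```python
from datetime import datetime, timedelta

MIN_OPEN = timedelta(hours=72)   # "open for at least 72 hours"
MIN_PMC_APPROVALS = 3            # "at least 3 PMC affirmative votes"


def vote_passes(opened_at, now, pmc_votes):
    """Check the rule for a list of PMC votes given as +1 / -1 integers.

    Assumption: the majority is counted over PMC votes only.
    """
    votes = list(pmc_votes)
    approvals = sum(1 for v in votes if v > 0)
    rejections = sum(1 for v in votes if v < 0)
    open_long_enough = (now - opened_at) >= MIN_OPEN
    return (
        open_long_enough
        and approvals >= MIN_PMC_APPROVALS
        and approvals > rejections
    )
```

For example, three +1 votes and one -1 vote pass once 72 hours have elapsed, while two +1 votes never do.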

Thanks,
Aljoscha

[1] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
 

[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/ 

[3] https://dist.apache.org/repos/dist/release/flink/KEYS 

[4] https://repository.apache.org/content/repositories/orgapacheflink-1213 

[5] 
https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c77a329b71e3068bfde965ae91921ad5c47246dd
 

[6] https://github.com/apache/flink-web/pull/180 

[7] https://github.com/apache/flink-web/pull/179 


P.S. The difference to the previous RC1 is very small; you can fetch the two tags
and do a "git log release-1.8.0-rc1..release-1.8.0-rc2" to see the difference
in commits. It contains fixes for the issues that led to the cancellation of the
previous RC plus smaller fixes. Most verification/testing that was carried out
should apply as is to this RC.
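
The tag comparison from the P.S. can also be scripted. A minimal sketch, assuming a local clone in which both tags have already been fetched; the helper names `rc_log_command` and `commits_between` are hypothetical, and `--oneline` is added only to shorten the output.

```python
import subprocess


def rc_log_command(previous_tag, current_tag):
    """Build the git command listing commits in current_tag but not in previous_tag."""
    return ["git", "log", "--oneline", f"{previous_tag}..{current_tag}"]


def commits_between(previous_tag, current_tag, repo_dir="."):
    """Run the command in repo_dir and return one line per commit.

    Assumes repo_dir is a clone that contains both tags.
    """
    result = subprocess.run(
        rc_log_command(previous_tag, current_tag),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


# Usage (inside a Flink clone):
#   commits_between("release-1.8.0-rc1", "release-1.8.0-rc2")
```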





Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-15 Thread Chesnay Schepler
I went over the release notes and made quite a few modifications. The 
website PR will need an update.


On 14.03.2019 13:42, Aljoscha Krettek wrote:

Hi everyone,
Please review and vote on the release candidate 2 for Flink 1.8.0, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to be deployed 
to dist.apache.org  [2], which are signed with the key 
with fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.8.0-rc2" [5],
* website pull request listing the new release [6]
* website pull request adding announcement blog post [7].

The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.

Thanks,
Aljoscha

[1] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
 

[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/ 

[3] https://dist.apache.org/repos/dist/release/flink/KEYS 

[4] https://repository.apache.org/content/repositories/orgapacheflink-1213 

[5] 
https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c77a329b71e3068bfde965ae91921ad5c47246dd
 

[6] https://github.com/apache/flink-web/pull/180 

[7] https://github.com/apache/flink-web/pull/179 


P.S. The difference to the previous RC1 is very small; you can fetch the two tags
and do a "git log release-1.8.0-rc1..release-1.8.0-rc2" to see the difference
in commits. It contains fixes for the issues that led to the cancellation of the
previous RC plus smaller fixes. Most verification/testing that was carried out
should apply as is to this RC.





Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-15 Thread Chesnay Schepler
The regression is already normalizing again. I'd observe it further
before doing anything.


The same applies to benchmarkCount, which tanked even more in that
same run.


On 15.03.2019 06:02, Tzu-Li (Gordon) Tai wrote:

@Yu
Thanks for reporting that Yu, great that this was noticed.

The serializerAvro case seems to only be testing on-wire serialization.
I checked the changes to the `AvroSerializer`, and it seems like
FLINK-11436 [1] with commit 479ebd59 was the only change that may have
affected that.
That commit wasn't introduced exactly around the time when the indicated
performance regression occurred, but was still before the regression.
The commit introduced some instanceof type checks / type casting in the
readObject of the AvroSerializer, which may have caused this.

Currently investigating further.

Cheers,
Gordon

On Fri, Mar 15, 2019 at 11:45 AM Yu Li  wrote:


Hi Aljoscha and all,

 From our performance benchmark web site (
http://codespeed.dak8s.net:8000/changes/) I observed a noticeable
regression (-6.92%) on the serializerAvro case comparing the latest 100
revisions, which may need some attention. Thanks.

Best Regards,
Yu


On Thu, 14 Mar 2019 at 20:42, Aljoscha Krettek 
wrote:


Hi everyone,
Please review and vote on the release candidate 2 for Flink 1.8.0, as
follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to be
deployed to dist.apache.org [2], which are
signed with the key with fingerprint
F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.8.0-rc2" [5],
* website pull request listing the new release [6]
* website pull request adding announcement blog post [7].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Aljoscha

[1]


https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274

<


https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274

[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/ <
https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/>
[3] https://dist.apache.org/repos/dist/release/flink/KEYS <
https://dist.apache.org/repos/dist/release/flink/KEYS>
[4]

https://repository.apache.org/content/repositories/orgapacheflink-1213


[7] https://github.com/apache/flink-web/pull/179 <
https://github.com/apache/flink-web/pull/179>

P.S. The difference to the previous RC1 is very small; you can fetch the
two tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc2" to see the
difference in commits. It contains fixes for the issues that led to the
cancellation of the previous RC plus smaller fixes. Most
verification/testing that was carried out should apply as is to this RC.





Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-14 Thread Tzu-Li (Gordon) Tai
@Yu
Thanks for reporting that Yu, great that this was noticed.

The serializerAvro case seems to only be testing on-wire serialization.
I checked the changes to the `AvroSerializer`, and it seems like
FLINK-11436 [1] with commit 479ebd59 was the only change that may have
affected that.
That commit wasn't introduced exactly around the time when the indicated
performance regression occurred, but was still before the regression.
The commit introduced some instanceof type checks / type casting in the
readObject of the AvroSerializer, which may have caused this.

Currently investigating further.

Cheers,
Gordon

On Fri, Mar 15, 2019 at 11:45 AM Yu Li  wrote:

> Hi Aljoscha and all,
>
> From our performance benchmark web site (
> http://codespeed.dak8s.net:8000/changes/) I observed a noticeable
> regression (-6.92%) on the serializerAvro case comparing the latest 100
> revisions, which may need some attention. Thanks.
>
> Best Regards,
> Yu
>
>
> On Thu, 14 Mar 2019 at 20:42, Aljoscha Krettek 
> wrote:
>
> > Hi everyone,
> > Please review and vote on the release candidate 2 for Flink 1.8.0, as
> > follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> > * the official Apache source release and binary convenience releases to
> be
> > deployed to dist.apache.org  [2], which are
> > signed with the key with fingerprint
> > F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag "release-1.8.0-rc2" [5],
> > * website pull request listing the new release [6]
> > * website pull request adding announcement blog post [7].
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> > Thanks,
> > Aljoscha
> >
> > [1]
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> > <
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> > >
> > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/ <
> > https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/>
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS <
> > https://dist.apache.org/repos/dist/release/flink/KEYS>
> > [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1213
> >  >
> > [5]
> >
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c77a329b71e3068bfde965ae91921ad5c47246dd
> > <
> >
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=2d00b1c26d7b4554707063ab0d1d6cc236cfe8a5
> > >
> > [6] https://github.com/apache/flink-web/pull/180 <
> > https://github.com/apache/flink-web/pull/180>
> > [7] https://github.com/apache/flink-web/pull/179 <
> > https://github.com/apache/flink-web/pull/179>
> >
> > P.S. The difference to the previous RC1 is very small, you can fetch the
> > two tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc2” to see
> the
> > difference in commits. Its fixes for the issues that led to the
> > cancellation of the previous RC plus smaller fixes. Most
> > verification/testing that was carried out should apply as is to this RC.
>


Re: [VOTE] Release 1.8.0, release candidate #2

2019-03-14 Thread Yu Li
Hi Aljoscha and all,

From our performance benchmark web site (
http://codespeed.dak8s.net:8000/changes/) I observed a noticeable
regression (-6.92%) on the serializerAvro case when comparing the latest
100 revisions, which may need some attention. Thanks.
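
For reference, a percentage like the one above is a relative change between two benchmark scores. A minimal sketch: the function name `percent_change` is hypothetical, the two scores are made-up numbers chosen to reproduce -6.92%, and it assumes a higher score is better, so a negative change is a regression.

```python
def percent_change(baseline, current):
    """Relative change of current versus baseline, in percent."""
    return (current - baseline) / baseline * 100.0


# Made-up scores for illustration only (not taken from the benchmark site).
baseline_score = 1000.0
current_score = 930.8
print(round(percent_change(baseline_score, current_score), 2))
```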

Best Regards,
Yu


On Thu, 14 Mar 2019 at 20:42, Aljoscha Krettek  wrote:

> Hi everyone,
> Please review and vote on the release candidate 2 for Flink 1.8.0, as
> follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org  [2], which are
> signed with the key with fingerprint
> F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.8.0-rc2" [5],
> * website pull request listing the new release [6]
> * website pull request adding announcement blog post [7].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Aljoscha
>
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> <
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> >
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/ <
> https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc2/>
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS <
> https://dist.apache.org/repos/dist/release/flink/KEYS>
> [4] https://repository.apache.org/content/repositories/orgapacheflink-1213
> 
> [5]
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c77a329b71e3068bfde965ae91921ad5c47246dd
> <
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=2d00b1c26d7b4554707063ab0d1d6cc236cfe8a5
> >
> [6] https://github.com/apache/flink-web/pull/180 <
> https://github.com/apache/flink-web/pull/180>
> [7] https://github.com/apache/flink-web/pull/179 <
> https://github.com/apache/flink-web/pull/179>
>
> P.S. The difference to the previous RC1 is very small, you can fetch the
> two tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc2” to see the
> difference in commits. Its fixes for the issues that led to the
> cancellation of the previous RC plus smaller fixes. Most
> verification/testing that was carried out should apply as is to this RC.