On Mon, Oct 29, 2018 at 12:40 PM, Ismaël Mejía <ieme...@gmail.com> wrote:
> From the Apache point of view nothing impedes anyone from doing
> intermediate releases for non LTS releases, only needed thing is
> someone willing to do the release and the due vote process.

Agreed. I was not suggesting not doing a release. I wanted to understand cost benefit.

> I don’t know however how will we decide this, we are exactly in the
> middle of the release cycle and in 3 weeks we will be cutting the next
> version so not sure if it is worth, any thoughts?

My suggestion is to look from a user perspective. Are we affecting a significant chunk of users? And could those stay on 2.7 until we release 2.9? From there we can decide whether this warrants a patch release or not.

I do not have information on how large of a user base we are affecting. I assume the answer to the second question is yes and we can suggest them to stay on 2.7 until a new release is out. From that perspective, I would suggest skipping a patch release and waiting for the next regular release.

> On Mon, Oct 29, 2018 at 6:08 PM Ahmet Altay <al...@google.com> wrote:
> >
> > On Mon, Oct 29, 2018 at 8:55 AM, Kenneth Knowles <k...@apache.org> wrote:
> >>
> >> I think definitely open a cherry pick PR to a 2.8.x branch. I think we must not corrupt maven central, so if it is published to users this has to be 2.8.1. Ahmet - we are to this point, right?
> >
> > Yes, if someone is willing to make a new release this would be 2.8.1 release. (2.8.0 is already on Maven central.)
> >
> > Side question about the initial LTS discussion. We have decided to not make 2.8.0 a LTS release. Should we wait until next release to patch this issue? What is the cost/benefit of maintaining this branch?
> >
> >> Kenn
> >>
> >> On Mon, Oct 29, 2018 at 8:40 AM Ismaël Mejía <ieme...@gmail.com> wrote:
> >>>
> >>> First thanks Etienne and Kenn for noting the performance issue.
> >>> I reviewed the discussed PR. It introduced a new ‘@Experimental’ option
> >>> to the Spark runner to change the default source partitioning and
> >>> enable users to control it via a predefined size (a prerequisite for
> >>> Spark’s dynamicAllocation).
> >>>
> >>> This however must not be the default behavior, it seems after looking
> >>> at the PR that things are not as expected and the default is now the
> >>> new behavior. I will provide a PR to fix this quickly. However the
> >>> question is, should I cherry-pick it and we do a new RC (since the
> >>> release was already 'passed')?
> >>>
> >>> On Mon, Oct 29, 2018 at 2:51 PM Kenneth Knowles <k...@apache.org> wrote:
> >>> >
> >>> > I didn't isolate it to a cause and commit, so that is extremely useful to know. To bring some details on thread:
> >>> >
> >>> > query 4: a single aggregation in sliding windows
> >>> > query 8: a single join with no other interesting logic
> >>> > query 9 (prefix of query 6*): find the winning bid for each auction
> >>> > query 6: query 9 followed by a single aggregation
> >>> >
> >>> > Kenn
> >>> >
> >>> > * they seem out of order because the original queries were 1-8 and we added 9 later to benchmark the baseline without the aggregation
> >>> >
> >>> > On Mon, Oct 29, 2018 at 3:28 AM Etienne Chauchot <echauc...@apache.org> wrote:
> >>> >>
> >>> >> Oops, just saw that Kenn already mentioned spark perf degradation on spark runner around 10/05. Sorry for the repetition.
> >>> >> Nevertheless, IMHO, I think it will be still worth checking PR #6181.
> >>> >>
> >>> >> Etienne
> >>> >>
> >>> >> On Monday, 29 October 2018 at 10:42 +0100, Etienne Chauchot wrote:
> >>> >>
> >>> >> Hey,
> >>> >> I would vote -0 : here is the explanation:
> >>> >>
> >>> >> I took a look at Nexmark dashboards for output size and performance for all the runners in all the modes around the date of the release cut to search for regressions.
> >>> >>
> >>> >> I noted a regression on the performance of the spark runner. Query4, Query6, Query8 and Query9 running times were multiplied by 2 to 3 around the date of 10/05/18. See https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> >>> >> So I searched in the commit history of the spark runner module for what happened around 10/05/18. And I found this commit
> >>> >>
> >>> >> e4a1ccbaa10808d88c6ad2a687fe9f6d52392d90: Merge pull request #6181: [BEAM-4783] Add bundleSize for splitting BoundedSources
> >>> >>
> >>> >> I don't know if it should be considered a blocker but we should definitely take another look at pull request #6181 that seems to change the way we split on spark runner.
> >>> >>
> >>> >> Best
> >>> >> Etienne
> >>> >>
> >>> >> On Friday, 26 October 2018 at 18:20 +0200, Maximilian Michels wrote:
> >>> >>
> >>> >> +1 (binding)
> >>> >>
> >>> >> On 26.10.18 17:45, Kenneth Knowles wrote:
> >>> >>
> >>> >> Nice. Thanks.
> >>> >>
> >>> >> +1
> >>> >>
> >>> >> On Fri, Oct 26, 2018 at 8:44 AM Robert Bradshaw <rober...@google.com> wrote:
> >>> >>
> >>> >> Thanks Tim!
> >>> >>
> >>> >> This was my only hesitation, and sounds like we're in the clear here.
> >>> >>
> >>> >> +1 (binding)
> >>> >>
> >>> >> On Fri, Oct 26, 2018 at 5:05 PM Tim Robertson <timrobertson...@gmail.com> wrote:
> >>> >> >
> >>> >> > A colleague and I tested on 2.7.0 and 2.8.0RC1:
> >>> >> >
> >>> >> > 1. Quickstart on Spark/YARN/HDFS (CDH 5.12.0) (commented in spreadsheet)
> >>> >> > 2. Our Avro to Avro pipelines on Spark/YARN/HDFS (note we backport the un-merged BEAM-5036 fix in our code)
> >>> >> > 3. Our Avro to Elasticsearch pipelines on Spark/YARN/HDFS
> >>> >> >
> >>> >> > Everything worked, and performance was similar on both.
> >>> >> > We built using maven pointing at https://repository.apache.org/content/repositories/orgapachebeam-1049/
> >>> >> >
> >>> >> > Based on this limited testing: +1
> >>> >> >
> >>> >> > Thank you to the release managers,
> >>> >> > Tim
> >>> >> >
> >>> >> > On Thu, Oct 25, 2018 at 7:21 PM Tim <timrobertson...@gmail.com> wrote:
> >>> >> >>
> >>> >> >> I can do some tests on Spark / YARN tomorrow (CEST timezone). Sorry I’ve just been too busy to assist.
> >>> >> >>
> >>> >> >> Tim
> >>> >> >>
> >>> >> >> On 25 Oct 2018, at 18:59, Kenneth Knowles <k...@apache.org> wrote:
> >>> >> >>
> >>> >> >> I tried to do a more thorough job on this.
> >>> >> >>
> >>> >> >> - I could not reproduce the slowdown in Query 9. I believe the variance was simply high given the parameters and environment
> >>> >> >> - I saw the same slowdown in Query 8 when running as part of the suite, but it vanished when I ran repeatedly on its own, so again it is not good methodology probably
> >>> >> >>
> >>> >> >> We do have the dashboard at https://apache-beam-testing.appspot.com/dashboard-admin though no anomaly detection set up AFAIK.
> >>> >> >>
> >>> >> >> - There is no issue easily visible in DirectRunner: https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> >>> >> >> - There is a notable degradation in Spark runner on 10/5 for many queries. https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> >>> >> >> - Something minor happened for Dataflow around 10/1: https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
> >>> >> >> - Flink runner seems to have had some fantastic improvements :-) https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> >>> >> >>
> >>> >> >> So if there is a blocker it would really be the Spark runner perf changes. Of course, all these except Dataflow are using local instances so may not be representative of larger scale AFAIK.
> >>> >> >>
> >>> >> >> Kenn
> >>> >> >>
> >>> >> >> On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels <m...@apache.org> wrote:
> >>> >> >>>
> >>> >> >>> I've run WordCount using Quickstart with the FlinkRunner (locally and against a Flink cluster).
> >>> >> >>>
> >>> >> >>> Would give a +1 but waiting what Kenn finds.
> >>> >> >>>
> >>> >> >>> -Max
> >>> >> >>>
> >>> >> >>> On 23.10.18 07:11, Ahmet Altay wrote:
> >>> >> >>> >
> >>> >> >>> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles <k...@apache.org> wrote:
> >>> >> >>> >
> >>> >> >>> > You two did so much verification I had a hard time finding something where my help was meaningful! :-)
> >>> >> >>> >
> >>> >> >>> > I did run the Nexmark suite on the DirectRunner against 2.7.0 and 2.8.0 following
> >>> >> >>> > https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local
> >>> >> >>> >
> >>> >> >>> > It is admittedly a very silly test - the instructions leave immutability enforcement on, etc. But it does appear that there is a 30% degradation in query 8 and 15% in query 9. These are the pure Java tests, not the SQL variants. The rest of the queries are close enough that differences are not meaningful.
> >>> >> >>> >
> >>> >> >>> > (It would be a good improvement for us to have alerts on daily benchmarks if we do not have such a concept already.)
> >>> >> >>> >
> >>> >> >>> > I would ask a little more time to see what is going on here - is it a real performance issue or an artifact of how the tests are invoked, or ...?
> >>> >> >>> >
> >>> >> >>> > Thank you! Much appreciated. Please let us know when you are done with your investigation.
> >>> >> >>> >
> >>> >> >>> > Kenn
> >>> >> >>> >
> >>> >> >>> > On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay <al...@google.com> wrote:
> >>> >> >>> >
> >>> >> >>> > Hi all,
> >>> >> >>> >
> >>> >> >>> > Did you have a chance to review this RC? Between me and Robert we ran a significant chunk of the validations. Let me know if you have any questions.
> >>> >> >>> >
> >>> >> >>> > Ahmet
> >>> >> >>> >
> >>> >> >>> > On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay <al...@google.com> wrote:
> >>> >> >>> >
> >>> >> >>> > Hi everyone,
> >>> >> >>> >
> >>> >> >>> > Please review and vote on the release candidate #1 for the version 2.8.0, as follows:
> >>> >> >>> > [ ] +1, Approve the release
> >>> >> >>> > [ ] -1, Do not approve the release (please provide specific comments)
> >>> >> >>> >
> >>> >> >>> > The complete staging area is available for your review, which includes:
> >>> >> >>> > * JIRA release notes [1],
> >>> >> >>> > * the official Apache source release to be deployed to dist.apache.org [2], which is signed with the key with fingerprint 6096FA00 [3],
> >>> >> >>> > * all artifacts to be deployed to the Maven Central Repository [4],
> >>> >> >>> > * source code tag "v2.8.0-RC1" [5],
> >>> >> >>> > * website pull request listing the release and publishing the API reference manual [6].
> >>> >> >>> > * Python artifacts are deployed along with the source release to the dist.apache.org [2].
> >>> >> >>> > * Validation sheet with a tab for 2.8.0 release to help with validation [7].
> >>> >> >>> >
> >>> >> >>> > The vote will be open for at least 72 hours. It is adopted by majority approval, with at least 3 PMC affirmative votes.
> >>> >> >>> >
> >>> >> >>> > Thanks,
> >>> >> >>> > Ahmet
> >>> >> >>> >
> >>> >> >>> > [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
> >>> >> >>> > [2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
> >>> >> >>> > [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
> >>> >> >>> > [4] https://repository.apache.org/content/repositories/orgapachebeam-1049/
> >>> >> >>> > [5] https://github.com/apache/beam/tree/v2.8.0-RC1
> >>> >> >>> > [6] https://github.com/apache/beam-site/pull/583 and https://github.com/apache/beam/pull/6745
> >>> >> >>> > [7] https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816