Thank you, JB, for your work! I tested simple streaming (KafkaIO) and batch (TextIO / HDFS) pipelines with the SparkRunner on a YARN cluster, and they work fine.
WBR,
Alexey

> On 8 Jun 2018, at 10:00, Etienne Chauchot <[email protected]> wrote:
>
> I forgot to vote:
> +1 (non-binding).
> What I tested:
> - no functional or performance regression compared to v2.4
> - dependencies in the poms are OK
>
> Etienne
>
> On Friday, 8 June 2018 at 08:27 +0200, Romain Manni-Bucau wrote:
>> +1 (non-binding): mainstream usage is not broken by the pom changes, and the
>> runtime has no known regression compared to 2.4.0.
>>
>> (Side note: kudos to JB for this build tool change release, I know how it can
>> hurt ;))
>>
>> Romain Manni-Bucau
>> @rmannibucau <https://twitter.com/rmannibucau> | Blog
>> <https://rmannibucau.metawerx.net/> | Old Blog
>> <http://rmannibucau.wordpress.com/> | Github
>> <https://github.com/rmannibucau> | LinkedIn
>> <https://www.linkedin.com/in/rmannibucau> | Book
>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>
>> On Thu, 7 June 2018 at 16:17, Jean-Baptiste Onofré <[email protected]
>> <mailto:[email protected]>> wrote:
>>> Thanks for the details, Etienne!
>>>
>>> The good news is that the artifacts seem OK and the overall Nexmark
>>> results are consistent with the 2.4.0 release ones.
>>>
>>> I'm starting a complete review using the beam-samples as well.
>>>
>>> Regards
>>> JB
>>>
>>> On 07/06/2018 16:14, Etienne Chauchot wrote:
>>> > Hi,
>>> > I've just run the Nexmark queries on the v2.5.0-RC1 tag.
>>> > What we can notice:
>>> > - query 3 (exercises CoGroupByKey, state and timer) shows different
>>> > output with DR between batch and streaming, and compared with the other
>>> > runners => I compared with v2.4: these differences were already there,
>>> > but with different output size numbers
>>> >
>>> > - query 6 (exercises specialized combiner) shows different output
>>> > between the runners => the correct output is 401. Strange that in batch
>>> > mode some runners output fewer Sellers.
>>> > I compared with v2.4: same output
>>> >
>>> > - response time of query 7 (exercises Max transform, fanout and side
>>> > input) is very slow on DR => I compared with v2.4: comparable execution
>>> > times
>>> >
>>> > I'm not comparing q10 because it is a write to GCS, so it is very specific.
>>> >
>>> > => Basically no regression compared to v2.4
>>> >
>>> > For the record, here is the output (waiting for ongoing perfkit
>>> > integration):
>>> >
>>> > 1. DR batch
>>> >
>>> > Performance:
>>> >
>>> > Conf  Runtime(sec)  (Baseline)  Events(/sec)  (Baseline)  Results  (Baseline)
>>> > 0000           5,8                   17283,1               100000
>>> > 0001           3,2                   31104,2                92000
>>> > 0002           1,2                   82918,7                  351
>>> > 0003           2,2                   46210,7                  458
>>> > 0004           1,2                    8503,4                   40
>>> > 0005           4,0                   25220,7                   12
>>> > 0006           0,9                   11148,3                  401
>>> > 0007          13,2                    7580,9                    1
>>> > 0008           1,5                   67340,1                 6000
>>> > 0009           0,7                   14025,2                  298
>>> > 0010          12,8                    7793,0                    1
>>> > 0011           2,4                   42319,1                 1919
>>> > 0012           1,6                   61462,8                 1919
>>> > =============================================================================
>>> >
>>> > 2. DR streaming
>>> >
>>> > Performance:
>>> >
>>> > Conf  Runtime(sec)  (Baseline)  Events(/sec)  (Baseline)  Results  (Baseline)
>>> > 0000           6,5                   15285,8               100000
>>> > 0001           3,7                   27397,3                92000
>>> > 0002           1,4                   69108,5                  351
>>> > 0003           3,2                   31181,8                  447
>>> > 0004           1,2                    8361,2                   40
>>> > 0005           5,3                   18903,6                   12
>>> > 0006           0,9                   11111,1                  401
>>> > 0007          82,5                    1212,2                    1
>>> > 0008           2,0                   51072,5                 6000
>>> > 0009           0,8                   12903,2                  298
>>> > 0010          49,5                    2021,8                    1
>>> > 0011           3,9                   25667,4                 1919
>>> > 0012           2,4                   41067,8                 1919
>>> > =============================================================================
>>> >
>>> > 3. Flink batch
>>> >
>>> > Performance:
>>> >
>>> > Conf  Runtime(sec)  (Baseline)  Events(/sec)  (Baseline)  Results  (Baseline)
>>> > 0000           1,0                   97656,3               100000
>>> > 0001           0,7                  141643,1                92000
>>> > 0002           0,4                  228310,5                  351
>>> > 0003           1,6                   64020,5                  580
>>> > 0004           0,7                   13831,3                   40
>>> > 0005           1,4                   72939,5                   12
>>> > 0006           0,5                   20491,8                  103
>>> > 0007           1,3                   74239,0                    1
>>> > 0008           0,8                  121506,7                 6000
>>> > 0009           0,6                   17953,3                  298
>>> > 0010           1,3                   74682,6                    1
>>> > 0011           1,1                   92936,8                 1919
>>> > 0012           0,8                  123001,2                 1919
>>> > =============================================================================
>>> >
>>> > 4. Flink streaming
>>> >
>>> > Performance:
>>> >
>>> > Conf  Runtime(sec)  (Baseline)  Events(/sec)  (Baseline)  Results  (Baseline)
>>> > 0000           5,4                   18677,6               100000
>>> > 0001           2,8                   35511,4                92000
>>> > 0002           1,8                   54318,3                  351
>>> > 0003           2,4                   41614,6                  580
>>> > 0004           1,0                   10341,3                   40
>>> > 0005           3,4                   29568,3                   12
>>> > 0006           0,7                   13369,0                  401
>>> > 0007           2,8                   36192,5                    1
>>> > 0008           1,8                   54854,6                 6000
>>> > 0009           0,7                   13369,0                  298
>>> > 0010           3,4                   29841,8                    2
>>> > 0011           5,0                   19932,2                 1919
>>> > 0012           2,6                   38835,0                 1919
>>> > =============================================================================
>>> >
>>> > 5. Spark batch
>>> >
>>> > Performance:
>>> >
>>> > Conf  Runtime(sec)  (Baseline)  Events(/sec)  (Baseline)  Results  (Baseline)
>>> > 0000           1,5                   65445,0               100000
>>> > 0001           1,3                   79491,3                92000
>>> > 0002           0,9                  112107,6                  351
>>> > 0003           2,0                   48804,3                  580
>>> > 0004           1,2                    8382,2                   40
>>> > 0005           2,0                   50838,8                   12
>>> > 0006           1,0                    9699,3                  103
>>> > 0007           2,3                   43308,8                    1
>>> > 0008           2,1                   46794,6                 6000
>>> > 0009           1,1                    8976,7                  298
>>> > 0010           1,6                   62111,8                    1
>>> > 0011           2,1                   46598,3                 1919
>>> > 0012           2,3                   43687,2                 1919
>>> > =============================================================================
>>> >
>>> > On Wednesday, 6 June 2018 at 10:50 +0200, Etienne Chauchot wrote:
>>> >> Thanks, JB, for all your work! I believe doing the first Gradle release
>>> >> must have been hard.
>>> >> I'll run Nexmark on the release and keep you posted.
>>> >>
>>> >> Best
>>> >> Etienne
>>> >>
>>> >>
>>> >> On Wednesday, 6 June 2018 at 10:44 +0200, Jean-Baptiste Onofré wrote:
>>> >>> Hi everyone,
>>> >>>
>>> >>> Please review and vote on the release candidate #1 for the version
>>> >>> 2.5.0, as follows:
>>> >>>
>>> >>> [ ] +1, Approve the release
>>> >>> [ ] -1, Do not approve the release (please provide specific comments)
>>> >>>
>>> >>> NB: this is the first release using Gradle, so don't be too harsh ;) A
>>> >>> PR about the release guide will follow thanks to this release.
>>> >>>
>>> >>> The complete staging area is available for your review, which includes:
>>> >>> * JIRA release notes [1],
>>> >>> * the official Apache source release to be deployed to dist.apache.org
>>> >>> [2], which is signed with the key with fingerprint C8282E76 [3],
>>> >>> * all artifacts to be deployed to the Maven Central Repository [4],
>>> >>> * source code tag "v2.5.0-RC1" [5],
>>> >>> * website pull request listing the release and publishing the API
>>> >>> reference manual [6],
>>> >>> * Java artifacts were built with Gradle 4.7 (wrapper) and OpenJDK/Oracle
>>> >>> JDK 1.8.0_172 (Oracle Corporation 25.172-b11),
>>> >>> * Python artifacts are deployed along with the source release to
>>> >>> dist.apache.org [2].
>>> >>>
>>> >>> The vote will be open for at least 72 hours. It is adopted by majority
>>> >>> approval, with at least 3 PMC affirmative votes.
>>> >>>
>>> >>> Thanks,
>>> >>> JB
>>> >>>
>>> >>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12342847
>>> >>> [2] https://dist.apache.org/repos/dist/dev/beam/2.5.0/
>>> >>> [3] https://dist.apache.org/repos/dist/release/beam/KEYS
>>> >>> [4] https://repository.apache.org/content/repositories/orgapachebeam-1041/
>>> >>> [5] https://github.com/apache/beam/tree/v2.5.0-RC1
>>> >>> [6] https://github.com/apache/beam-site/pull/463
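
The output-size differences mentioned in the Nexmark report quoted above can be cross-checked mechanically. Below is a minimal sketch, assuming the "Results" column is transcribed by hand from the five tables in this thread. The names RESULTS and differing_queries are illustrative only and are not part of Nexmark or Beam. Note that it also flags query 10, which the report deliberately leaves out of the comparison because it writes to GCS.

```python
# Result counts per query (0000 to 0012, in order), transcribed by hand from
# the "Results" column of the five tables quoted in this thread.
RESULTS = {
    "DR batch":        [100000, 92000, 351, 458, 40, 12, 401, 1, 6000, 298, 1, 1919, 1919],
    "DR streaming":    [100000, 92000, 351, 447, 40, 12, 401, 1, 6000, 298, 1, 1919, 1919],
    "Flink batch":     [100000, 92000, 351, 580, 40, 12, 103, 1, 6000, 298, 1, 1919, 1919],
    "Flink streaming": [100000, 92000, 351, 580, 40, 12, 401, 1, 6000, 298, 2, 1919, 1919],
    "Spark batch":     [100000, 92000, 351, 580, 40, 12, 103, 1, 6000, 298, 1, 1919, 1919],
}


def differing_queries(results):
    """Return {query_number: {runner: count}} for queries whose counts differ."""
    diffs = {}
    n_queries = len(next(iter(results.values())))
    for q in range(n_queries):
        counts = {runner: values[q] for runner, values in results.items()}
        # More than one distinct count means the runners disagree on this query.
        if len(set(counts.values())) > 1:
            diffs[q] = counts
    return diffs


if __name__ == "__main__":
    for q, counts in differing_queries(RESULTS).items():
        print(f"query {q}: {counts}")
```

Running the same check on the v2.4 numbers (not included in this thread) would show whether the set of differing queries is unchanged between the two releases.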
