Romain hinted that this was a dependency issue, but comparing the two dependency trees shows very little difference:

lcwik@lcwik0: ~$ diff /tmp/260 /tmp/270
< [INFO] +- org.apache.beam:beam-runners-spark:jar:2.6.0:compile
< [INFO] |  +- org.apache.beam:beam-model-pipeline:jar:2.6.0:compile
---
> [INFO] +- org.apache.beam:beam-runners-spark:jar:2.7.0:compile
> [INFO] |  +- org.apache.beam:beam-model-pipeline:jar:2.7.0:compile
5c6
< [INFO] |  +- org.apache.beam:beam-sdks-java-core:jar:2.6.0:compile
---
> [INFO] |  +- org.apache.beam:beam-sdks-java-core:jar:2.7.0:compile
14,18c15,19
< [INFO] |  |  \- org.tukaani:xz:jar:1.5:compile
< [INFO] |  +- org.apache.beam:beam-runners-core-construction-java:jar:2.6.0:compile
< [INFO] |  |  \- org.apache.beam:beam-model-job-management:jar:2.6.0:compile
< [INFO] |  +- org.apache.beam:beam-runners-core-java:jar:2.6.0:compile
< [INFO] |  |  \- org.apache.beam:beam-model-fn-execution:jar:2.6.0:compile
---
> [INFO] |  |  \- org.tukaani:xz:jar:1.8:compile
> [INFO] |  +- org.apache.beam:beam-runners-core-construction-java:jar:2.7.0:compile
> [INFO] |  |  \- org.apache.beam:beam-model-job-management:jar:2.7.0:compile
> [INFO] |  +- org.apache.beam:beam-runners-core-java:jar:2.7.0:compile
> [INFO] |  |  \- org.apache.beam:beam-model-fn-execution:jar:2.7.0:compile

Other than the Beam package version changes, the only other change is xz (1.5 to 1.8), which I don't believe could be causing the issue.

On Tue, Sep 18, 2018 at 8:38 AM Jean-Baptiste Onofré <[email protected]> wrote:

> Thanks, let me take a look.
>
> Regards
> JB
>
> On 18/09/2018 17:36, Romain Manni-Bucau wrote:
> >
> >
> >
> > On Tue, Sep 18, 2018 at 16:44, Jean-Baptiste Onofré <[email protected]> wrote:
> >
> >     Hi,
> >
> >     I don't have the issue ;)
> >
> >     As said in my vote, I tested 2.7.0 RC1 on beam-samples with Spark
> >     without problem.
> >
> >     I can't reproduce Romain's issue either.
> >
> >     @Romain can you provide some details to reproduce the issue?
> >
> >
> > Sure, you can use this reproducer:
> > https://github.com/rmannibucau/beam-2.7.0-fails
> > It shows that it succeeds on 2.6 and fails on 2.7.
> >
> >
> >
> >     Regards
> >     JB
> >
> >     On 17/09/2018 19:17, Charles Chen wrote:
> > > Luke, Maximillian, Raghu, can you please propose cherry-pick PRs to the
> > > release-2.7.0 for your issues and add me as a reviewer (@charlesccychen)?
> > >
> > > Romain, JB: is there any way I can help with debugging the issue you're
> > > facing so we can unblock the release?
> > >
> > > On Fri, Sep 14, 2018 at 1:49 PM Raghu Angadi <[email protected]> wrote:
> > >
> > >     I would like to propose one more cherry-pick for RC2:
> > >     https://github.com/apache/beam/pull/6391
> > >     This is a KafkaIO bug fix. Once a user hits this bug, there is no
> > >     easy workaround for them, especially on Dataflow. The only workaround
> > >     in Dataflow is to restart or reload the job.
> > >
> > >     The fix itself is fairly safe and is tested.
> > >     Raghu.
> > >
> > >     On Fri, Sep 14, 2018 at 12:52 AM Alexey Romanenko
> > >     <[email protected]> wrote:
> > >
> > >         Perhaps it could help, but I ran a simple WordCount (built with
> > >         Beam 2.7) on a YARN/Spark (HDP Sandbox) cluster and it worked
> > >         fine for me.
> > >
> > >> On 14 Sep 2018, at 06:56, Romain Manni-Bucau <[email protected]> wrote:
> > >>
> > >> Hi Charles,
> > >>
> > >> I didn't get enough time to check deeply but it is clearly a
> > >> dependency issue and it is not in beam spark runner itself but
> > >> in another transitive module of beam.
> > >> It does not happen in the existing Spark tests because none of them
> > >> run in a cluster (even with just 1 worker), but this seems to be a
> > >> regression since 2.6 works OOTB.
> > >>
> > >> Romain Manni-Bucau
> > >> @rmannibucau <https://twitter.com/rmannibucau> | Blog
> > >> <https://rmannibucau.metawerx.net/> | Old Blog
> > >> <http://rmannibucau.wordpress.com/> | Github
> > >> <https://github.com/rmannibucau> | LinkedIn
> > >> <https://www.linkedin.com/in/rmannibucau> | Book
> > >> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
> > >>
> > >>
> > >> On Thu, Sep 13, 2018 at 22:15, Charles Chen <[email protected]> wrote:
> > >>
> > >>     Romain and JB, can you please add the results of your
> > >>     investigations into the errors you've seen above? Given
> > >>     that the existing SparkRunner tests pass for this RC, and
> > >>     that the integration test you ran is in another repo that
> > >>     is not continuously tested with Beam, it is not clear how
> > >>     we should move forward and whether this is a blocking
> > >>     issue, unless we can find a root cause in Beam.
> > >>
> > >>     On Wed, Sep 12, 2018 at 2:08 AM Etienne Chauchot
> > >>     <[email protected]> wrote:
> > >>
> > >>         Hi all,
> > >>
> > >>         From a performance and functional regression standpoint
> > >>         I see no regression:
> > >>
> > >>         I looked at the nexmark graphs "output pcollection size"
> > >>         and "execution time" around the release cut date on the
> > >>         dataflow, spark, flink and direct runners in batch and
> > >>         streaming modes. There seems to be no regression.
> > >>
> > >>         Etienne
> > >>
> > >>         On Tuesday, September 11, 2018 at 12:25 -0700, Charles Chen wrote:
> > >>> The SparkRunner validation test
> > >>> (here: https://beam.apache.org/contribute/release-guide/#run-validation-tests)
> > >>> passes on my machine. It looks like we are likely
> > >>> missing test coverage where Romain is hitting issues.
> > >>>
> > >>> On Tue, Sep 11, 2018 at 12:15 PM Ahmet Altay <[email protected]> wrote:
> > >>>> Could anyone else help with looking at these issues earlier?
> > >>>>
> > >>>> On Tue, Sep 11, 2018 at 12:03 PM, Romain Manni-Bucau <[email protected]> wrote:
> > >>>>> I'm running this main [1] through this IT [2]. It had been
> > >>>>> working fine for ~1 year but 2.7.0 broke it.
> > >>>>> I didn't investigate more but can have a look later
> > >>>>> this month if it helps.
> > >>>>>
> > >>>>> [1] https://github.com/Talend/component-runtime/blob/master/component-runtime-beam/src/it/serialization-over-cluster/src/main/java/org/talend/sdk/component/beam/it/clusterserialization/Main.java
> > >>>>> [2] https://github.com/Talend/component-runtime/blob/master/component-runtime-beam/src/it/serialization-over-cluster/src/test/java/org/talend/sdk/component/beam/it/SerializationOverClusterIT.java
> > >>>>>
> > >>>>> On Tue, Sep 11, 2018 at 20:54, Charles Chen <[email protected]> wrote:
> > >>>>>> Romain: can you give more details on the failure
> > >>>>>> you're encountering, i.e. how you are performing
> > >>>>>> this validation?
> > >>>>>>
> > >>>>>> On Tue, Sep 11, 2018 at 9:36 AM Jean-Baptiste Onofré <[email protected]> wrote:
> > >>>>>>> Hi,
> > >>>>>>>
> > >>>>>>> weird, I didn't have it on Beam samples. Let me
> > >>>>>>> try to reproduce and I will create the Jira.
> > >>>>>>>
> > >>>>>>> Regards
> > >>>>>>> JB
> > >>>>>>>
> > >>>>>>> On 11/09/2018 11:44, Romain Manni-Bucau wrote:
> > >>>>>>> > -1, seems spark integration is broken (tested with spark 2.3.1 and 2.2.1):
> > >>>>>>> >
> > >>>>>>> > 18/09/11 11:33:29 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, RMANNIBUCAU, executor 0): java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
> > >>>>>>> >     at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2233)
> > >>>>>>> >
> > >>>>>>> >
> > >>>>>>> > Also the issue Lukasz identified is important even if workarounds can be
> > >>>>>>> > put in place so +1 to fix it as well if possible.
> > >>>>>>> >
> > >>>>>>> > Romain Manni-Bucau
> > >>>>>>> > @rmannibucau <https://twitter.com/rmannibucau> | Blog
> > >>>>>>> > <https://rmannibucau.metawerx.net/> | Old Blog
> > >>>>>>> > <http://rmannibucau.wordpress.com/> | Github
> > >>>>>>> > <https://github.com/rmannibucau> | LinkedIn
> > >>>>>>> > <https://www.linkedin.com/in/rmannibucau> | Book
> > >>>>>>> > <https://www.packtpub.com/application-development/java-ee-8-high-performance>
> > >>>>>>> >
> > >>>>>>> >
> > >>>>>>> > On Mon, Sep 10, 2018 at 20:48, Lukasz Cwik <[email protected]> wrote:
> > >>>>>>> >
> > >>>>>>> >     I found an issue where we are no longer packaging the pom.xml within
> > >>>>>>> >     the artifact jars at META-INF/maven/groupId/artifactId. More details
> > >>>>>>> >     in https://issues.apache.org/jira/browse/BEAM-5351. I wouldn't
> > >>>>>>> >     consider this a blocker but it was an easy fix
> > >>>>>>> >     (https://github.com/apache/beam/pull/6358) and users may rely on the
> > >>>>>>> >     pom.xml.
> > >>>>>>> >
> > >>>>>>> >     Should we recut the release candidate to include this?
> > >>>>>>> >
> > >>>>>>> >     On Mon, Sep 10, 2018 at 4:58 AM Jean-Baptiste Onofré
> > >>>>>>> >     <[email protected]> wrote:
> > >>>>>>> >
> > >>>>>>> >         +1 (binding)
> > >>>>>>> >
> > >>>>>>> >         Tested successfully on Beam Samples.
> > >>>>>>> >
> > >>>>>>> >         Thanks!
> > >>>>>>> >
> > >>>>>>> >         Regards
> > >>>>>>> >         JB
> > >>>>>>> >
> > >>>>>>> >         On 07/09/2018 23:56, Charles Chen wrote:
> > >>>>>>> > > Hi everyone,
> > >>>>>>> > >
> > >>>>>>> > > Please review and vote on the release candidate #1 for the version
> > >>>>>>> > > 2.7.0, as follows:
> > >>>>>>> > > [ ] +1, Approve the release
> > >>>>>>> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >>>>>>> > >
> > >>>>>>> > > The complete staging area is available for your review, which includes:
> > >>>>>>> > > * JIRA release notes [1],
> > >>>>>>> > > * the official Apache source release to be deployed to
> > >>>>>>> > >   dist.apache.org [2], which is signed with the key with
> > >>>>>>> > >   fingerprint 45C60AAAD115F560 [3],
> > >>>>>>> > > * all artifacts to be deployed to the Maven Central Repository [4],
> > >>>>>>> > > * source code tag "v2.7.0-RC1" [5],
> > >>>>>>> > > * website pull request listing the release and publishing the API
> > >>>>>>> > >   reference manual [6].
> > >>>>>>> > > * Java artifacts were built with Gradle 4.8 and OpenJDK
> > >>>>>>> > >   1.8.0_181-8u181-b13-1~deb9u1-b13.
> > >>>>>>> > > * Python artifacts are deployed along with the source release to the
> > >>>>>>> > >   dist.apache.org [2].
> > >>>>>>> > >
> > >>>>>>> > > The vote will be open for at least 72 hours.
> > >>>>>>> > > It is adopted by majority approval, with at least 3 PMC affirmative votes.
> > >>>>>>> > >
> > >>>>>>> > > Thanks,
> > >>>>>>> > > Charles
> > >>>>>>> > >
> > >>>>>>> > > [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343654
> > >>>>>>> > > [2] https://dist.apache.org/repos/dist/dev/beam/2.7.0
> > >>>>>>> > > [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
> > >>>>>>> > > [4] https://repository.apache.org/content/repositories/orgapachebeam-1046/
> > >>>>>>> > > [5] https://github.com/apache/beam/tree/v2.7.0-RC1
> > >>>>>>> > > [6] https://github.com/apache/beam-site/pull/549
> > >>>>>>> >
> > >>>>>>> >         --
> > >>>>>>> >         Jean-Baptiste Onofré
> > >>>>>>> >         [email protected]
> > >>>>>>> >         http://blog.nanthrax.net
> > >>>>>>> >         Talend - http://www.talend.com
> > >>>>>>> >
> > >>>>
> > >
> >
> >     --
> >     Jean-Baptiste Onofré
> >     [email protected]
> >     http://blog.nanthrax.net
> >     Talend - http://www.talend.com
> >
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
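
For anyone who wants to repeat the dependency-tree comparison from the top of this message without reading the raw diff, here is a minimal sketch. It assumes the two files hold `mvn dependency:tree` style output (the `[INFO]` prefix suggests so, but that is an assumption), and the helper names `parse_tree` and `non_beam_changes` are made up for illustration. It only reports artifacts outside `org.apache.beam` whose version differs, which is the check described above.

```python
def parse_tree(text):
    """Map 'group:artifact' to its version for each dependency line."""
    versions = {}
    for line in text.splitlines():
        # Drop the "[INFO]" prefix and the tree-drawing characters.
        coord = line.replace("[INFO]", "").lstrip(" |+-\\").strip()
        parts = coord.split(":")
        if len(parts) == 5:    # group:artifact:type:version:scope
            group, artifact, _type, version, _scope = parts
        elif len(parts) == 6:  # group:artifact:type:classifier:version:scope
            group, artifact, _type, _classifier, version, _scope = parts
        else:
            continue           # not a dependency line (e.g. the project root)
        versions[f"{group}:{artifact}"] = version
    return versions


def non_beam_changes(old, new, ignored_group="org.apache.beam"):
    """Return {coordinate: (old_version, new_version)} outside ignored_group."""
    changes = {}
    for key in sorted(old.keys() | new.keys()):
        if key.startswith(ignored_group + ":"):
            continue
        if old.get(key) != new.get(key):
            changes[key] = (old.get(key), new.get(key))
    return changes


# Small samples taken from the diff above; in practice the text would be
# read from the two files, e.g. open("/tmp/260").read().
OLD = r"""[INFO] +- org.apache.beam:beam-runners-spark:jar:2.6.0:compile
[INFO] |  +- org.apache.beam:beam-sdks-java-core:jar:2.6.0:compile
[INFO] |  |  \- org.tukaani:xz:jar:1.5:compile"""
NEW = r"""[INFO] +- org.apache.beam:beam-runners-spark:jar:2.7.0:compile
[INFO] |  +- org.apache.beam:beam-sdks-java-core:jar:2.7.0:compile
[INFO] |  |  \- org.tukaani:xz:jar:1.8:compile"""

result = non_beam_changes(parse_tree(OLD), parse_tree(NEW))
print(result)
```

On the sample lines this leaves only the xz bump. Note that it compares resolved versions only, so it would not catch a problem caused by classpath ordering or shading.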
