Romain hinted that this was a dependency issue, but when comparing the two
dependency trees I don't see much of a difference:

lcwik@lcwik0: ~$ diff /tmp/260 /tmp/270
< [INFO] +- org.apache.beam:beam-runners-spark:jar:2.6.0:compile
< [INFO] |  +- org.apache.beam:beam-model-pipeline:jar:2.6.0:compile
---
> [INFO] +- org.apache.beam:beam-runners-spark:jar:2.7.0:compile
> [INFO] |  +- org.apache.beam:beam-model-pipeline:jar:2.7.0:compile
5c6
< [INFO] |  +- org.apache.beam:beam-sdks-java-core:jar:2.6.0:compile
---
> [INFO] |  +- org.apache.beam:beam-sdks-java-core:jar:2.7.0:compile
14,18c15,19
< [INFO] |  |  \- org.tukaani:xz:jar:1.5:compile
< [INFO] |  +- org.apache.beam:beam-runners-core-construction-java:jar:2.6.0:compile
< [INFO] |  |  \- org.apache.beam:beam-model-job-management:jar:2.6.0:compile
< [INFO] |  +- org.apache.beam:beam-runners-core-java:jar:2.6.0:compile
< [INFO] |  |  \- org.apache.beam:beam-model-fn-execution:jar:2.6.0:compile
---
> [INFO] |  |  \- org.tukaani:xz:jar:1.8:compile
> [INFO] |  +- org.apache.beam:beam-runners-core-construction-java:jar:2.7.0:compile
> [INFO] |  |  \- org.apache.beam:beam-model-job-management:jar:2.7.0:compile
> [INFO] |  +- org.apache.beam:beam-runners-core-java:jar:2.7.0:compile
> [INFO] |  |  \- org.apache.beam:beam-model-fn-execution:jar:2.7.0:compile

Other than the Beam package version changes, the only other change is xz
(1.5 to 1.8), which I don't believe could be causing the issue.
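
For anyone who wants to repeat the comparison without the Beam version
noise, here is a minimal sketch, not something that was run for this thread.
It assumes the inputs are plain `mvn dependency:tree` text with coordinates
of the form groupId:artifactId:type:version:scope (no classifier); the
function names are hypothetical.

```python
import re

# Matches org.apache.beam coordinates of the assumed form
# groupId:artifactId:type:version:scope and captures everything but the version.
BEAM_COORD = re.compile(r"(org\.apache\.beam:[^:\s]+:[^:\s]+:)[^:\s]+(:[a-z]+)")


def normalize(line):
    """Replace the version of org.apache.beam artifacts with a placeholder."""
    return BEAM_COORD.sub(r"\1<beam>\2", line.strip())


def non_beam_changes(old_tree, new_tree):
    """Return (removed, added) lines once Beam's own version bump is ignored."""
    old = {normalize(l) for l in old_tree.splitlines() if l.strip()}
    new = {normalize(l) for l in new_tree.splitlines() if l.strip()}
    return sorted(old - new), sorted(new - old)
```

Fed the contents of /tmp/260 and /tmp/270, this would be expected to report
only the xz lines, since every other difference in the diff above is a Beam
version bump. Note that it compares sets of lines, so it ignores ordering and
duplicates in the tree.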

On Tue, Sep 18, 2018 at 8:38 AM Jean-Baptiste Onofré <[email protected]>
wrote:

> Thanks, let me take a look.
>
> Regards
> JB
>
> On 18/09/2018 17:36, Romain Manni-Bucau wrote:
> >
> >
> >
> > On Tue, Sep 18, 2018 at 16:44, Jean-Baptiste Onofré <[email protected]> wrote:
> >
> >     Hi,
> >
> >     I don't have the issue ;)
> >
> >     As said in my vote, I tested 2.7.0 RC1 on beam-samples with Spark
> >     without problem.
> >
> >     I can't reproduce Romain's issue either.
> >
> >     @Romain can you provide some details to reproduce the issue?
> >
> >
> > Sure, you can use this
> > reproducer: https://github.com/rmannibucau/beam-2.7.0-fails
> > It shows that it succeeds on 2.6 and fails on 2.7.
> >
> >
> >
> >     Regards
> >     JB
> >
> >     On 17/09/2018 19:17, Charles Chen wrote:
> >     > Luke, Maximillian, Raghu, can you please propose cherry-pick PRs to
> >     > the release-2.7.0 branch for your issues and add me as a reviewer
> >     > (@charlesccychen)?
> >     >
> >     > Romain, JB: is there any way I can help with debugging the issue
> >     > you're facing so we can unblock the release?
> >     >
> >     > On Fri, Sep 14, 2018 at 1:49 PM Raghu Angadi <[email protected]> wrote:
> >     >
> >     >     I would like to propose one more cherry-pick for RC2:
> >     >     https://github.com/apache/beam/pull/6391
> >     >     This is a KafkaIO bug fix. Once a user hits this bug, there is
> >     >     no easy workaround for them, especially on Dataflow. The only
> >     >     workaround in Dataflow is to restart or reload the job.
> >     >
> >     >     The fix itself is fairly safe and is tested.
> >     >     Raghu.
> >     >
> >     >     On Fri, Sep 14, 2018 at 12:52 AM Alexey Romanenko
> >     >     <[email protected]> wrote:
> >     >
> >     >         Perhaps this helps: I ran a simple WordCount (built with
> >     >         Beam 2.7) on a YARN/Spark (HDP Sandbox) cluster and it
> >     >         worked fine for me.
> >     >
> >     >>         On 14 Sep 2018, at 06:56, Romain Manni-Bucau
> >     >>         <[email protected]> wrote:
> >     >>
> >     >>         Hi Charles,
> >     >>
> >     >>         I didn't get enough time to check deeply, but it is clearly
> >     >>         a dependency issue, and it is not in the Beam Spark runner
> >     >>         itself but in another transitive module of Beam. It does
> >     >>         not happen in the existing Spark tests because none of them
> >     >>         run in a cluster (even with just 1 worker), but this seems
> >     >>         to be a regression since 2.6 works out of the box.
> >     >>
> >     >>         Romain Manni-Bucau
> >     >>         @rmannibucau <https://twitter.com/rmannibucau> |  Blog
> >     >>         <https://rmannibucau.metawerx.net/> | Old Blog
> >     >>         <http://rmannibucau.wordpress.com/> | Github
> >     >>         <https://github.com/rmannibucau> | LinkedIn
> >     >>         <https://www.linkedin.com/in/rmannibucau> | Book
> >     >>         <https://www.packtpub.com/application-development/java-ee-8-high-performance>
> >     >>
> >     >>
> >     >>         On Thu, Sep 13, 2018 at 22:15, Charles Chen
> >     >>         <[email protected]> wrote:
> >     >>
> >     >>             Romain and JB, can you please add the results of your
> >     >>             investigations into the errors you've seen above? Given
> >     >>             that the existing SparkRunner tests pass for this RC, and
> >     >>             that the integration test you ran is in another repo that
> >     >>             is not continuously tested with Beam, it is not clear how
> >     >>             we should move forward and whether this is a blocking
> >     >>             issue, unless we can find a root cause in Beam.
> >     >>
> >     >>             On Wed, Sep 12, 2018 at 2:08 AM Etienne Chauchot
> >     >>             <[email protected]> wrote:
> >     >>
> >     >>                 Hi all,
> >     >>
> >     >>                 From a performance and functional regression
> >     >>                 standpoint, I see no regression:
> >     >>
> >     >>                 I looked at the Nexmark graphs "output pcollection
> >     >>                 size" and "execution time" around the release cut
> >     >>                 date on the Dataflow, Spark, Flink and direct runners
> >     >>                 in batch and streaming modes. There seems to be no
> >     >>                 regression.
> >     >>
> >     >>                 Etienne
> >     >>
> >     >>                 On Tuesday, September 11, 2018 at 12:25 -0700,
> >     >>                 Charles Chen wrote:
> >     >>>                 The SparkRunner validation test (here:
> >     >>>                 https://beam.apache.org/contribute/release-guide/#run-validation-tests)
> >     >>>                 passes on my machine.  It looks like we are likely
> >     >>>                 missing test coverage where Romain is hitting
> >     >>>                 issues.
> >     >>>
> >     >>>                 On Tue, Sep 11, 2018 at 12:15 PM Ahmet Altay
> >     >>>                 <[email protected]> wrote:
> >     >>>>                 Could anyone else help with looking at these issues
> >     >>>>                 earlier?
> >     >>>>
> >     >>>>                 On Tue, Sep 11, 2018 at 12:03 PM, Romain Manni-Bucau
> >     >>>>                 <[email protected]> wrote:
> >     >>>>>                 I'm running this main [1] through this IT [2]. It
> >     >>>>>                 had been working fine for ~1 year, but 2.7.0 broke
> >     >>>>>                 it. I didn't investigate further but can have a
> >     >>>>>                 look later this month if it helps.
> >     >>>>>
> >     >>>>>                 [1] https://github.com/Talend/component-runtime/blob/master/component-runtime-beam/src/it/serialization-over-cluster/src/main/java/org/talend/sdk/component/beam/it/clusterserialization/Main.java
> >     >>>>>                 [2] https://github.com/Talend/component-runtime/blob/master/component-runtime-beam/src/it/serialization-over-cluster/src/test/java/org/talend/sdk/component/beam/it/SerializationOverClusterIT.java
> >     >>>>>
> >     >>>>>                 On Tue, Sep 11, 2018 at 20:54, Charles Chen
> >     >>>>>                 <[email protected]> wrote:
> >     >>>>>>                 Romain: can you give more details on the failure
> >     >>>>>>                 you're encountering, i.e. how you are performing
> >     >>>>>>                 this validation?
> >     >>>>>>
> >     >>>>>>                 On Tue, Sep 11, 2018 at 9:36 AM Jean-Baptiste
> >     >>>>>>                 Onofré <[email protected]> wrote:
> >     >>>>>>>                 Hi,
> >     >>>>>>>
> >     >>>>>>>                 weird, I didn't have it on Beam samples. Let me
> >     >>>>>>>                 try to reproduce and I will create the Jira.
> >     >>>>>>>
> >     >>>>>>>                 Regards
> >     >>>>>>>                 JB
> >     >>>>>>>
> >     >>>>>>>                 On 11/09/2018 11:44, Romain Manni-Bucau wrote:
> >     >>>>>>>                 > -1, seems spark integration is broken (tested
> >     >>>>>>>                 > with spark 2.3.1 and 2.2.1):
> >     >>>>>>>                 >
> >     >>>>>>>                 > 18/09/11 11:33:29 WARN TaskSetManager: Lost
> >     >>>>>>>                 > task 0.0 in stage 0.0 (TID 0, RMANNIBUCAU,
> >     >>>>>>>                 > executor 0): java.lang.ClassCastException:
> >     >>>>>>>                 > cannot assign instance of
> >     >>>>>>>                 > scala.collection.immutable.List$SerializationProxy
> >     >>>>>>>                 > to field
> >     >>>>>>>                 > org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_
> >     >>>>>>>                 > of type scala.collection.Seq in instance of
> >     >>>>>>>                 > org.apache.spark.rdd.MapPartitionsRDD
> >     >>>>>>>                 >       at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2233)
> >     >>>>>>>                 >
> >     >>>>>>>                 >
> >     >>>>>>>                 > Also the issue Lukasz identified is important
> >     >>>>>>>                 > even if workarounds can be put in place so +1
> >     >>>>>>>                 > to fix it as well if possible.
> >     >>>>>>>                 >
> >     >>>>>>>                 > Romain Manni-Bucau
> >     >>>>>>>                 > @rmannibucau <https://twitter.com/rmannibucau> | Blog
> >     >>>>>>>                 > <https://rmannibucau.metawerx.net/> | Old Blog
> >     >>>>>>>                 > <http://rmannibucau.wordpress.com/> | Github
> >     >>>>>>>                 > <https://github.com/rmannibucau> | LinkedIn
> >     >>>>>>>                 > <https://www.linkedin.com/in/rmannibucau> | Book
> >     >>>>>>>                 > <https://www.packtpub.com/application-development/java-ee-8-high-performance>
> >     >>>>>>>                 >
> >     >>>>>>>                 >
> >     >>>>>>>                 > On Mon, Sep 10, 2018 at 20:48, Lukasz Cwik
> >     >>>>>>>                 > <[email protected]> wrote:
> >     >>>>>>>                 >
> >     >>>>>>>                 >     I found an issue where we are no longer
> >     >>>>>>>                 >     packaging the pom.xml within the artifact
> >     >>>>>>>                 >     jars at META-INF/maven/groupId/artifactId.
> >     >>>>>>>                 >     More details in
> >     >>>>>>>                 >     https://issues.apache.org/jira/browse/BEAM-5351.
> >     >>>>>>>                 >     I wouldn't consider this a blocker but it
> >     >>>>>>>                 >     was an easy fix
> >     >>>>>>>                 >     (https://github.com/apache/beam/pull/6358)
> >     >>>>>>>                 >     and users may rely on the pom.xml.
> >     >>>>>>>                 >
> >     >>>>>>>                 >     Should we recut the release candidate to
> >     >>>>>>>                 >     include this?
> >     >>>>>>>                 >
> >     >>>>>>>                 >     On Mon, Sep 10, 2018 at 4:58 AM
> >     >>>>>>>                 >     Jean-Baptiste Onofré <[email protected]> wrote:
> >     >>>>>>>                 >
> >     >>>>>>>                 >         +1 (binding)
> >     >>>>>>>                 >
> >     >>>>>>>                 >         Tested successfully on Beam Samples.
> >     >>>>>>>                 >
> >     >>>>>>>                 >         Thanks !
> >     >>>>>>>                 >
> >     >>>>>>>                 >         Regards
> >     >>>>>>>                 >         JB
> >     >>>>>>>                 >
> >     >>>>>>>                 >         On 07/09/2018 23:56, Charles Chen wrote:
> >     >>>>>>>                 >          > Hi everyone,
> >     >>>>>>>                 >          >
> >     >>>>>>>                 >          > Please review and vote on the release candidate #1
> >     >>>>>>>                 >          > for the version 2.7.0, as follows:
> >     >>>>>>>                 >          > [ ] +1, Approve the release
> >     >>>>>>>                 >          > [ ] -1, Do not approve the release (please provide
> >     >>>>>>>                 >          > specific comments)
> >     >>>>>>>                 >          >
> >     >>>>>>>                 >          > The complete staging area is available for your
> >     >>>>>>>                 >          > review, which includes:
> >     >>>>>>>                 >          > * JIRA release notes [1],
> >     >>>>>>>                 >          > * the official Apache source release to be deployed
> >     >>>>>>>                 >          > to dist.apache.org [2], which is signed with the key
> >     >>>>>>>                 >          > with fingerprint 45C60AAAD115F560 [3],
> >     >>>>>>>                 >          > * all artifacts to be deployed to the Maven Central
> >     >>>>>>>                 >          > Repository [4],
> >     >>>>>>>                 >          > * source code tag "v2.7.0-RC1" [5],
> >     >>>>>>>                 >          > * website pull request listing the release and
> >     >>>>>>>                 >          > publishing the API reference manual [6].
> >     >>>>>>>                 >          > * Java artifacts were built with Gradle 4.8 and
> >     >>>>>>>                 >          > OpenJDK 1.8.0_181-8u181-b13-1~deb9u1-b13.
> >     >>>>>>>                 >          > * Python artifacts are deployed along with the
> >     >>>>>>>                 >          > source release to the dist.apache.org [2].
> >     >>>>>>>                 >          >
> >     >>>>>>>                 >          > The vote will be open for at least 72 hours. It is
> >     >>>>>>>                 >          > adopted by majority approval, with at least 3 PMC
> >     >>>>>>>                 >          > affirmative votes.
> >     >>>>>>>                 >          >
> >     >>>>>>>                 >          > Thanks,
> >     >>>>>>>                 >          > Charles
> >     >>>>>>>                 >          >
> >     >>>>>>>                 >          > [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343654
> >     >>>>>>>                 >          > [2] https://dist.apache.org/repos/dist/dev/beam/2.7.0
> >     >>>>>>>                 >          > [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
> >     >>>>>>>                 >          > [4] https://repository.apache.org/content/repositories/orgapachebeam-1046/
> >     >>>>>>>                 >          > [5] https://github.com/apache/beam/tree/v2.7.0-RC1
> >     >>>>>>>                 >          > [6] https://github.com/apache/beam-site/pull/549
> >     >>>>>>>                 >
> >     >>>>>>>                 >         --
> >     >>>>>>>                 >         Jean-Baptiste Onofré
> >     >>>>>>>                 >         [email protected]
> >     >>>>>>>                 >         http://blog.nanthrax.net
> >     >>>>>>>                 >         Talend - http://www.talend.com
> >     >>>>>>>                 >
> >     >>>>
> >     >
> >
> >     --
> >     Jean-Baptiste Onofré
> >     [email protected]
> >     http://blog.nanthrax.net
> >     Talend - http://www.talend.com
> >
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
