Can you send the specific command / config you are using to reproduce?

On Mon, Sep 17, 2018 at 10:28 AM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
> @Charles: I guess you can download the
> https://github.com/Talend/component-runtime/tree/master/component-runtime-beam/src/it/serialization-over-cluster
> subproject, replace project.version with 1.0.4 (the other placeholders are
> better known and can be found online), and you should be able to reproduce
> the issue by forcing Beam to v2.7.0. I don't have much time this week to look
> into this particular issue, but next week should be more doable if the issue
> is still pending.
>
> Romain Manni-Bucau
> @rmannibucau <https://twitter.com/rmannibucau> | Blog
> <https://rmannibucau.metawerx.net/> | Old Blog
> <http://rmannibucau.wordpress.com> | Github
> <https://github.com/rmannibucau> | LinkedIn
> <https://www.linkedin.com/in/rmannibucau> | Book
> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>
> On Mon, Sep 17, 2018 at 19:18, Charles Chen <c...@google.com> wrote:
>
>> Luke, Maximillian, Raghu, can you please propose cherry-pick PRs to the
>> release-2.7.0 branch for your issues and add me as a reviewer
>> (@charlesccychen)?
>>
>> Romain, JB: is there any way I can help with debugging the issue you're
>> facing so we can unblock the release?
>>
>> On Fri, Sep 14, 2018 at 1:49 PM Raghu Angadi <rang...@google.com> wrote:
>>
>>> I would like to propose one more cherry-pick for RC2:
>>> https://github.com/apache/beam/pull/6391
>>> This is a KafkaIO bug fix. Once a user hits this bug, there is no easy
>>> workaround for them, especially on Dataflow. The only workaround on
>>> Dataflow is to restart or reload the job.
>>>
>>> The fix itself is fairly safe and is tested.
>>> Raghu.
>>>
>>> On Fri, Sep 14, 2018 at 12:52 AM Alexey Romanenko <aromanenko....@gmail.com> wrote:
>>>
>>>> In case it helps: I ran a simple WordCount (built with Beam 2.7)
>>>> on a YARN/Spark (HDP Sandbox) cluster and it worked fine for me.
>>>>
>>>> On 14 Sep 2018, at 06:56, Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>>>
>>>> Hi Charles,
>>>>
>>>> I didn't get enough time to check deeply, but it is clearly a dependency
>>>> issue, and it is not in the Beam Spark runner itself but in another
>>>> transitive module of Beam. It does not happen in the existing Spark tests
>>>> because none of them run in a cluster (even one with just 1 worker), but
>>>> this seems to be a regression since 2.6 works OOTB.
>>>>
>>>> Romain Manni-Bucau
>>>>
>>>> On Thu, Sep 13, 2018 at 22:15, Charles Chen <c...@google.com> wrote:
>>>>
>>>>> Romain and JB, can you please add the results of your investigations
>>>>> into the errors you've seen above? Given that the existing SparkRunner
>>>>> tests pass for this RC, and that the integration test you ran is in
>>>>> another repo that is not continuously tested with Beam, it is not clear
>>>>> how we should move forward and whether this is a blocking issue, unless
>>>>> we can find a root cause in Beam.
>>>>>
>>>>> On Wed, Sep 12, 2018 at 2:08 AM Etienne Chauchot <echauc...@apache.org> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> From a performance and functional regression standpoint, I see no
>>>>>> regression:
>>>>>>
>>>>>> I looked at the Nexmark graphs "output pcollection size" and "execution
>>>>>> time" around the release cut date on the Dataflow, Spark, Flink and
>>>>>> direct runners in batch and streaming modes. There seems to be no
>>>>>> regression.
>>>>>>
>>>>>> Etienne
>>>>>>
>>>>>> On Tuesday, September 11, 2018 at 12:25 -0700, Charles Chen wrote:
>>>>>>
>>>>>> The SparkRunner validation test (here:
>>>>>> https://beam.apache.org/contribute/release-guide/#run-validation-tests)
>>>>>> passes on my machine. It looks like we are likely missing test coverage
>>>>>> where Romain is hitting issues.
>>>>>>
>>>>>> On Tue, Sep 11, 2018 at 12:15 PM Ahmet Altay <al...@google.com> wrote:
>>>>>>
>>>>>> Could anyone else help with looking at these issues earlier?
>>>>>>
>>>>>> On Tue, Sep 11, 2018 at 12:03 PM, Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>>>>>
>>>>>> I'm running this main [1] through this IT [2]. It had been working fine
>>>>>> for ~1 year, but 2.7.0 broke it. I didn't investigate further but can
>>>>>> have a look later this month if it helps.
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/Talend/component-runtime/blob/master/component-runtime-beam/src/it/serialization-over-cluster/src/main/java/org/talend/sdk/component/beam/it/clusterserialization/Main.java
>>>>>> [2]
>>>>>> https://github.com/Talend/component-runtime/blob/master/component-runtime-beam/src/it/serialization-over-cluster/src/test/java/org/talend/sdk/component/beam/it/SerializationOverClusterIT.java
>>>>>>
>>>>>> On Tue, Sep 11, 2018 at 20:54, Charles Chen <c...@google.com> wrote:
>>>>>>
>>>>>> Romain: can you give more details on the failure you're encountering,
>>>>>> i.e. how you are performing this validation?
>>>>>>
>>>>>> On Tue, Sep 11, 2018 at 9:36 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Weird, I didn't see it on Beam Samples. Let me try to reproduce, and I
>>>>>> will create the Jira.
>>>>>>
>>>>>> Regards
>>>>>> JB
>>>>>>
>>>>>> On 11/09/2018 11:44, Romain Manni-Bucau wrote:
>>>>>> > -1, seems the Spark integration is broken (tested with Spark 2.3.1 and 2.2.1):
>>>>>> >
>>>>>> > 18/09/11 11:33:29 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, RMANNIBUCAU, executor 0): java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
>>>>>> >     at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2233)
>>>>>> >
>>>>>> > Also, the issue Lukasz identified is important even if workarounds can
>>>>>> > be put in place, so +1 to fixing it as well if possible.
>>>>>> >
>>>>>> > Romain Manni-Bucau
>>>>>> >
>>>>>> > On Mon, Sep 10, 2018 at 20:48, Lukasz Cwik <lc...@google.com> wrote:
>>>>>> >
>>>>>> > I found an issue where we are no longer packaging the pom.xml within
>>>>>> > the artifact jars at META-INF/maven/groupId/artifactId. More details
>>>>>> > in https://issues.apache.org/jira/browse/BEAM-5351. I wouldn't
>>>>>> > consider this a blocker, but it was an easy fix
>>>>>> > (https://github.com/apache/beam/pull/6358) and users may rely on the
>>>>>> > pom.xml.
>>>>>> >
>>>>>> > Should we recut the release candidate to include this?
>>>>>> >
>>>>>> > On Mon, Sep 10, 2018 at 4:58 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>>>>>> >
>>>>>> > +1 (binding)
>>>>>> >
>>>>>> > Tested successfully on Beam Samples.
>>>>>> >
>>>>>> > Thanks!
>>>>>> >
>>>>>> > Regards
>>>>>> > JB
>>>>>> >
>>>>>> > On 07/09/2018 23:56, Charles Chen wrote:
>>>>>> > > Hi everyone,
>>>>>> > >
>>>>>> > > Please review and vote on the release candidate #1 for the version
>>>>>> > > 2.7.0, as follows:
>>>>>> > > [ ] +1, Approve the release
>>>>>> > > [ ] -1, Do not approve the release (please provide specific comments)
>>>>>> > >
>>>>>> > > The complete staging area is available for your review, which includes:
>>>>>> > > * JIRA release notes [1],
>>>>>> > > * the official Apache source release to be deployed to
>>>>>> > >   dist.apache.org [2], which is signed with the key with
>>>>>> > >   fingerprint 45C60AAAD115F560 [3],
>>>>>> > > * all artifacts to be deployed to the Maven Central Repository [4],
>>>>>> > > * source code tag "v2.7.0-RC1" [5],
>>>>>> > > * website pull request listing the release and publishing the API
>>>>>> > >   reference manual [6].
>>>>>> > > * Java artifacts were built with Gradle 4.8 and OpenJDK
>>>>>> > >   1.8.0_181-8u181-b13-1~deb9u1-b13.
>>>>>> > > * Python artifacts are deployed along with the source release to
>>>>>> > >   dist.apache.org [2].
>>>>>> > >
>>>>>> > > The vote will be open for at least 72 hours. It is adopted by
>>>>>> > > majority approval, with at least 3 PMC affirmative votes.
>>>>>> > >
>>>>>> > > Thanks,
>>>>>> > > Charles
>>>>>> > >
>>>>>> > > [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343654
>>>>>> > > [2] https://dist.apache.org/repos/dist/dev/beam/2.7.0
>>>>>> > > [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
>>>>>> > > [4] https://repository.apache.org/content/repositories/orgapachebeam-1046/
>>>>>> > > [5] https://github.com/apache/beam/tree/v2.7.0-RC1
>>>>>> > > [6] https://github.com/apache/beam-site/pull/549
>>>>>> >
>>>>>> > --
>>>>>> > Jean-Baptiste Onofré
>>>>>> > jbono...@apache.org
>>>>>> > http://blog.nanthrax.net
>>>>>> > Talend - http://www.talend.com