[ https://issues.apache.org/jira/browse/BEAM-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15997742#comment-15997742 ]
Aviem Zur edited comment on BEAM-1582 at 5/5/17 3:01 AM:
---------------------------------------------------------

I don't think we should disable this test, as it is a very important one.

In the last 30 builds for which Jenkins keeps history, this test has not failed once: https://builds.apache.org/view/Beam/job/beam_PostCommit_Java_ValidatesRunner_Spark/

So I'm not sure why this is such an issue.

(P.S. The {{@ValidatesRunner}} suggestion was meant to make this test run in PostCommit rather than PreCommit, so that if it ever does flake it won't cause friction for contributors working on something unrelated. But I see this is already the case: this test already runs in PostCommit only, using a different annotation, {{@UsesCheckpointRecovery}}.)


was (Author: aviemzur):
I don't think we should disable this test, as it is a very important one.

In the last 30 builds for which Jenkins keeps history, this test has not failed once: https://builds.apache.org/view/Beam/job/beam_PostCommit_Java_ValidatesRunner_Spark/

So I'm not sure why this is such an issue.

> ResumeFromCheckpointStreamingTest flakes with what appears as a second firing.
> ------------------------------------------------------------------------------
>
> Key: BEAM-1582
> URL: https://issues.apache.org/jira/browse/BEAM-1582
> Project: Beam
> Issue Type: Bug
> Components: runner-spark
> Reporter: Amit Sela
> Assignee: Amit Sela
> Labels: flake
> Fix For: First stable release
>
>
> See:
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Java_MavenInstall/org.apache.beam$beam-runners-spark/2788/testReport/junit/org.apache.beam.runners.spark.translation.streaming/ResumeFromCheckpointStreamingTest/testWithResume/
> After some digging, it appears that a second firing occurs (though only one
> is expected) but it doesn't come from a stale state (state is empty before it
> fires).
> Might be a retry happening for some reason, which is OK in terms of
> fault-tolerance guarantees (at-least-once), but not so much in terms of flaky
> tests.
> I'm looking into this hoping to fix this ASAP.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)