[ 
https://issues.apache.org/jira/browse/SPARK-28770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922771#comment-16922771
 ] 

Imran Rashid commented on SPARK-28770:
--------------------------------------

I'm not sure I follow ... the metrics from the driver will be in the log file 
if and only if {{spark.eventLog.logStageExecutorMetrics.enabled=true}} and there 
is a stage completed event.  But the same criteria should apply on replay, 
so the EventMonster would also "log" the metrics update (meaning, record it in 
its internal buffer) if and only if the event log file also contained the stage 
completed event.

[~bzhaoopenstack] since you can reproduce this consistently, could you share 
the event log file from a failure?  I think you just need to delete this block 
in ReplayListenerSuite:

{code}
  after {
    Utils.deleteRecursively(testDir)
  }
{code}

and maybe print out that dir somewhere so you can grab the file.
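
A minimal sketch of that temporary change, assuming {{testDir}} is the suite's temp-directory field and is a {{java.io.File}}. This is a fragment to swap in for the block above, not a standalone program, and the {{println}} is just one hypothetical way to surface the path:

{code}
  after {
    // Debugging only: keep the directory so the event log file survives the test.
    // Utils.deleteRecursively(testDir)
    println(s"Keeping event log dir for inspection: ${testDir.getAbsolutePath}")
  }
{code}

Restore the original cleanup once the file has been copied out.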

> Flaky Tests: Test ReplayListenerSuite.End-to-end replay with compression 
> failed
> -------------------------------------------------------------------------------
>
>                 Key: SPARK-28770
>                 URL: https://issues.apache.org/jira/browse/SPARK-28770
>             Project: Spark
>          Issue Type: Test
>          Components: Spark Core
>    Affects Versions: 2.4.3
>         Environment: Community jenkins and our arm testing instance.
>            Reporter: huangtianhua
>            Priority: Major
>
> Test
> org.apache.spark.scheduler.ReplayListenerSuite.End-to-end replay with 
> compression failed, see 
> [https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-3.2/267/testReport/junit/org.apache.spark.scheduler/ReplayListenerSuite/End_to_end_replay_with_compression/]
>  
> The test also fails on our arm instance. I sent an email to spark-dev 
> before, and we suspect it is related to the commit 
> [https://github.com/apache/spark/pull/23767]; we tried reverting it and the 
> tests passed:
> ReplayListenerSuite:
>        - ...
>        - End-to-end replay *** FAILED ***
>          "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622)
>        - End-to-end replay with compression *** FAILED ***
>          "[driver]" did not equal "[1]" (JsonProtocolSuite.scala:622) 
>  
> Not sure what's wrong, hope someone can help to figure it out, thanks very 
> much.



