Prashant Sharma created SPARK-33668:
---------------------------------------

             Summary: Fix flaky test "Verify logging configuration is picked 
from the provided SPARK_CONF_DIR/log4j.properties."
                 Key: SPARK-33668
                 URL: https://issues.apache.org/jira/browse/SPARK-33668
             Project: Spark
          Issue Type: Bug
          Components: Kubernetes, Tests
    Affects Versions: 3.1.0
            Reporter: Prashant Sharma


The test is flaky: it has failed on more than one occasion, and the reason for 
the failure is
{code:java}
  The code passed to eventually never returned normally. Attempted 109 times 
over 3.0079882413999997 minutes. Last failure message: Failure executing: GET 
at: 
https://192.168.39.167:8443/api/v1/namespaces/b37fc72a991b49baa68a2eaaa1516463/pods/spark-pi-97a9bc76308e7fe3-exec-1/log?pretty=false.
 Message: pods "spark-pi-97a9bc76308e7fe3-exec-1" not found. Received status: 
Status(apiVersion=v1, code=404, details=StatusDetails(causes=[], group=null, 
kind=pods, name=spark-pi-97a9bc76308e7fe3-exec-1, retryAfterSeconds=null, 
uid=null, additionalProperties={}), kind=Status, message=pods 
"spark-pi-97a9bc76308e7fe3-exec-1" not found, metadata=ListMeta(_continue=null, 
remainingItemCount=null, resourceVersion=null, selfLink=null, 
additionalProperties={}), reason=NotFound, status=Failure, 
additionalProperties={}).. (KubernetesSuite.scala:402)
{code}

From the above failure, it seems that the executor finishes too quickly and is 
removed by Spark before the test can read its log.

One way to mitigate this is to set the following flag to false, so that 
executor pods are kept after they terminate and their logs stay available:

{code}
   "spark.kubernetes.executor.deleteOnTermination"
{code}
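
As a minimal sketch of the mitigation, assuming the test configuration is built on a plain SparkConf (the variable name conf is hypothetical, and the actual test suite may use its own wrapper): the Spark on Kubernetes documentation lists this flag with a default of true, meaning executor pods are deleted on termination, so keeping them around for log inspection would mean setting it to false.

{code:scala}
import org.apache.spark.SparkConf

// Sketch only: keep terminated executor pods so that their logs can still be
// fetched after the executor has finished. Requires Spark on the classpath.
val conf = new SparkConf()
  .set("spark.kubernetes.executor.deleteOnTermination", "false")
{code}

With this setting, pods that are kept would need to be cleaned up separately, for example when the test namespace is deleted.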


