[jira] [Created] (SPARK-37488) With enough resources, the task may still be permanently pending

Yiqun Zhang (Jira) Mon, 29 Nov 2021 05:57:05 -0800

Yiqun Zhang created SPARK-37488:
-----------------------------------

             Summary: With enough resources, the task may still be permanently 
pending
                 Key: SPARK-37488
                 URL: https://issues.apache.org/jira/browse/SPARK-37488
             Project: Spark
          Issue Type: Bug
          Components: Scheduler, Spark Core
    Affects Versions: 3.2.0, 3.1.2, 3.0.3
         Environment: Spark 3.1.2, Default Configuration
            Reporter: Yiqun Zhang



{code:java}
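import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.storage.StorageLevel;

// In production this job imports Hive partition data into TiDB;
// the logic can be simplified as follows.
    SparkSession testApp = SparkSession.builder()
        .master("local[*]")
        .appName("test app")
        .enableHiveSupport()
        .getOrCreate();
    // Read a single Hive partition.
    Dataset<Row> dataset = testApp.sql("select * from default.test where dt = '20211129'");
    // Persisting the dataset is what gives the tasks host-level locality preferences.
    dataset.persist(StorageLevel.MEMORY_AND_DISK());
    // The tasks launched by this action are the ones that stay pending.
    dataset.count();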
{code}

I have observed that the tasks stay pending permanently, and the problem 
reproduces on every rerun.

Since it is only reproducible in the production environment, I used Arthas at 
runtime to watch the entries and return values of functions within TaskSetManager:
https://gist.github.com/guiyanakuang/431584f191645513552a937d16ae8fbd

At the NODE_LOCAL level, because persist was called, pendingTasks.forHost 
holds a collection of pending tasks, but it is keyed by the machine where the 
blocks of the partition data are located. Since the only resource Spark gets 
is the driver, these tasks cannot be scheduled at that level. 
getAllowedLocalityLevel returns the wrong locality level, so the tasks cannot 
be run with TaskLocality.ANY either.

The tasks stay pending permanently because each scheduling attempt is very 
short, so the timeout that should raise the locality level never takes effect.
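
To make the expected behavior concrete, here is a minimal, self-contained toy 
model of delay scheduling. It is not Spark's TaskSetManager code: the class 
LocalityModel, its methods, and the host names "datanode-1" and "driver-host" 
are all hypothetical, and the 3000 ms wait only mirrors the documented default 
of spark.locality.wait. It sketches what I would expect to happen: a task that 
prefers a host with no executor waits at NODE_LOCAL, and once the wait has 
elapsed it is allowed to run anywhere.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of delay scheduling. NOT Spark's TaskSetManager; all names are hypothetical.
final class LocalityModel {
    enum Locality { NODE_LOCAL, ANY }

    // Pending task indexes keyed by preferred host (stand-in for pendingTasks.forHost).
    private final Map<String, List<Integer>> pendingForHost = new HashMap<>();
    private final long localityWaitMs;
    private long lastLaunchTimeMs;

    LocalityModel(long localityWaitMs, long startTimeMs) {
        this.localityWaitMs = localityWaitMs;
        this.lastLaunchTimeMs = startTimeMs;
    }

    void addPendingTask(int taskIndex, String preferredHost) {
        pendingForHost.computeIfAbsent(preferredHost, h -> new ArrayList<>()).add(taskIndex);
    }

    // Stay at NODE_LOCAL while host-preferring tasks are pending and the wait has not elapsed.
    Locality allowedLevel(long nowMs) {
        boolean hasHostPreferences =
            pendingForHost.values().stream().anyMatch(tasks -> !tasks.isEmpty());
        if (hasHostPreferences && nowMs - lastLaunchTimeMs < localityWaitMs) {
            return Locality.NODE_LOCAL;
        }
        return Locality.ANY;
    }

    // Offer one slot on the given host; returns the launched task index, or -1 if nothing runs.
    int offer(String host, long nowMs) {
        List<Integer> local = pendingForHost.get(host);
        if (local != null && !local.isEmpty()) {
            lastLaunchTimeMs = nowMs;
            return local.remove(local.size() - 1);
        }
        if (allowedLevel(nowMs) == Locality.ANY) {
            // Locality no longer matters: take any pending task.
            for (List<Integer> tasks : pendingForHost.values()) {
                if (!tasks.isEmpty()) {
                    lastLaunchTimeMs = nowMs;
                    return tasks.remove(tasks.size() - 1);
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        LocalityModel model = new LocalityModel(3000L, 0L);
        model.addPendingTask(0, "datanode-1");                 // data lives on a host with no executor
        System.out.println(model.offer("driver-host", 100L));  // prints -1: still waiting at NODE_LOCAL
        System.out.println(model.allowedLevel(100L));          // prints NODE_LOCAL
        System.out.println(model.offer("driver-host", 3000L)); // prints 0: wait elapsed, ANY allowed
    }
}
```

In this toy model the task is launched on the driver host as soon as the wait 
has elapsed. In the environment described above the equivalent step never 
happens, which is the behavior this issue reports.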




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

