Vladislav Sterkhov created SPARK-33620:
------------------------------------------

             Summary: Task not started after filtering
                 Key: SPARK-33620
                 URL: https://issues.apache.org/jira/browse/SPARK-33620
             Project: Spark
          Issue Type: Question
          Components: Spark Core
    Affects Versions: 2.4.7
            Reporter: Vladislav Sterkhov


Hello, I have a problem with high memory usage: roughly 2000 GB of data on the HDFS stack. With 300 GB of memory used the task starts and completes, but we need to work with an unlimited stack. Please help.

 

!image-2020-12-01-13-34-17-283.png!


!image-2020-12-01-13-34-31-288.png!

 

This is my code:

 

{code:scala}
var allTrafficRDD = sparkContext.emptyRDD[String]
for (traffic <- trafficBuffer) {
  logger.info("Load traffic path - " + traffic)
  val trafficRDD = sparkContext.textFile(traffic)
  if (isValidTraffic(trafficRDD, isMasterData)) {
    allTrafficRDD = allTrafficRDD.++(filterTraffic(trafficRDD))
  }
}

hiveService.insertTrafficRDD(allTrafficRDD.repartition(beforeInsertPartitionsNum),
  outTable, isMasterData)
{code}
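One thing worth noting about the loop above: chaining `++` (union) once per input path builds a very deep RDD lineage and accumulates one partition set per file, which can stall the driver before any task launches. A common alternative is to collect the filtered RDDs and union them in a single `SparkContext.union` call. A minimal sketch, assuming the same `isValidTraffic` and `filterTraffic` helpers as in the snippet above:

{code:scala}
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Sketch only: build all filtered RDDs first, then union them once.
// isValidTraffic and filterTraffic are assumed to be the helpers
// already defined in the original code.
def loadAllTraffic(sc: SparkContext,
                   trafficBuffer: Seq[String],
                   isMasterData: Boolean): RDD[String] = {
  val filtered: Seq[RDD[String]] = for {
    path <- trafficBuffer
    rdd = sc.textFile(path)          // lazy; nothing is read yet
    if isValidTraffic(rdd, isMasterData)
  } yield filterTraffic(rdd)

  // A single union keeps the lineage flat instead of one ++ per file.
  if (filtered.isEmpty) sc.emptyRDD[String]
  else sc.union(filtered)
}
{code}

The result can then be passed to `hiveService.insertTrafficRDD(loadAllTraffic(...).repartition(beforeInsertPartitionsNum), outTable, isMasterData)` as before; the `repartition` also caps the partition count regardless of how many input files were unioned.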



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
