Vladislav Sterkhov created SPARK-33620:
------------------------------------------
             Summary: Task not started after filtering
                 Key: SPARK-33620
                 URL: https://issues.apache.org/jira/browse/SPARK-33620
             Project: Spark
          Issue Type: Question
          Components: Spark Core
    Affects Versions: 2.4.7
            Reporter: Vladislav Sterkhov

Hello, I have a problem with a large input read from HDFS, about 2000 GB. With a 300 GB input the tasks start and complete, but with the full 2000 GB the tasks never start, and we need to handle input of unlimited size. Please help!

!image-2020-12-01-13-34-17-283.png!
!image-2020-12-01-13-34-31-288.png!

This is my code:

{code:scala}
var allTrafficRDD = sparkContext.emptyRDD[String]
for (traffic <- trafficBuffer) {
  logger.info("Load traffic path - " + traffic)
  val trafficRDD = sparkContext.textFile(traffic)
  if (isValidTraffic(trafficRDD, isMasterData)) {
    allTrafficRDD = allTrafficRDD.++(filterTraffic(trafficRDD))
  }
}
hiveService.insertTrafficRDD(allTrafficRDD.repartition(beforeInsertPartitionsNum), outTable, isMasterData)
{code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
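One plausible cause worth checking: each `rdd1.++(rdd2)` in the loop wraps the accumulated RDD in another nested `UnionRDD`, so with many input paths the driver must plan a very deep lineage before any task can launch. A common restructuring is to collect the filtered RDDs into a sequence and combine them with a single `SparkContext.union(Seq[RDD[T]])` call, which produces one flat union instead of a nested chain. Below is a minimal pure-Scala sketch of that shape (no Spark required to run it); the `Seq`-based `isValidTraffic` and `filterTraffic` here are hypothetical stand-ins for the RDD helpers in the issue, kept only to mirror the loop's structure.

```scala
// Pure-Scala analogy of the restructuring: gather all valid,
// filtered inputs first, then combine them in ONE step rather
// than folding them together with repeated ++ in a loop.
// In Spark the analogue is sparkContext.union(filteredRdds)
// in place of allTrafficRDD = allTrafficRDD.++(filtered).
object UnionSketch {
  // Hypothetical stand-ins for the issue's helpers, operating on
  // plain Seq[String] instead of RDD[String].
  def isValidTraffic(lines: Seq[String]): Boolean = lines.nonEmpty
  def filterTraffic(lines: Seq[String]): Seq[String] = lines.filter(_.nonEmpty)

  def combineTraffic(inputs: Seq[Seq[String]]): Seq[String] = {
    // Filter and transform every input first...
    val filtered = inputs.filter(isValidTraffic).map(filterTraffic)
    // ...then combine with a single flatten (one flat union).
    filtered.flatten
  }

  def main(args: Array[String]): Unit = {
    val inputs = Seq(Seq("a", "", "b"), Seq.empty, Seq("c"))
    println(combineTraffic(inputs)) // prints List(a, b, c)
  }
}
```

This only changes how the union is expressed, not what is computed; whether it resolves the 2000 GB case would still need to be verified against the actual job.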