I've filed a JIRA about this problem. 
https://issues.apache.org/jira/browse/SPARK-19532

I've tried setting `spark.speculation` to `false`, but the off-heap memory
still exceeds the limit by about 10 GB even after triggering a Full GC on the
Executor process (--executor-memory 30G), as follows:

test@test Online ~ $ ps aux | grep CoarseGrainedExecutorBackend
test      105371  106 21.5 67325492 42621992 ?   Sl   15:20  55:14
/home/test/service/jdk/bin/java -cp
/home/test/service/hadoop/share/hadoop/common/hadoop-lzo-0.4.20-SNAPSHOT.jar:/home/test/service/hadoop/share/hadoop/common/hadoop-lzo-0.4.20-SNAPSHOT.jar:/home/test/service/spark/conf/:/home/test/service/spark/jars/*:/home/test/service/hadoop/etc/hadoop/
-Xmx30720M -Dspark.driver.port=9835 -Dtag=spark_2_1_test -XX:+PrintGCDetails
-XX:+PrintGCDateStamps -Xloggc:./gc.log -verbose:gc
org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url
spark://CoarseGrainedScheduler@172.16.34.235:9835 --executor-id 4 --hostname
test-192 --cores 36 --app-id app-20170213152037-0043 --worker-url
spark://Worker@test-192:33890
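
For reference, here is roughly how speculation was disabled for that run. This
is just a sketch assuming the job goes through spark-submit; the master URL,
class name, and jar below are placeholders, and only --executor-memory 30G and
spark.speculation=false reflect the actual test:

# Minimal sketch: disabling speculation at submit time. The master URL,
# class name, and application jar are placeholders; only
# --executor-memory 30G and spark.speculation=false reflect the test above.
spark-submit \
  --master spark://<master-host>:7077 \
  --executor-memory 30G \
  --conf spark.speculation=false \
  --class com.example.TestJob \
  test-job.jar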

So I think there are other reasons for this problem as well.

We have been trying to upgrade our Spark cluster since the release of Spark 2.1.0.

This version is unstable and unusable for us because of these memory
problems; we should pay attention to this.


StanZhai wrote
> From the thread dump page of the Executor in the WebUI, I found that there
> are about 1300 threads named "DataStreamer for file
> /test/data/test_temp/_temporary/0/_temporary/attempt_20170207172435_80750_m_000069_1/part-00069-690407af-0900-46b1-9590-a6d6c696fe68.snappy.parquet"
> in the TIMED_WAITING state, like this (screenshot):
> <http://apache-spark-developers-list.1001551.n3.nabble.com/file/n20881/QQ20170207-212340.png>
>
> The excess off-heap memory may be caused by these abnormal threads. 
> 
> This problem occurs only when writing data to Hadoop (tasks may be
> killed by the Executor during writing).
> 
> Could this be related to
> https://issues.apache.org/jira/browse/HDFS-9812 ?
> 
> It may be a bug in how Spark kills tasks while data is being written. What's
> the difference between Spark 1.6.x and 2.1.0 in how tasks are killed?
> 
> This is a critical issue; I've been working on it for days.
> 
> Any help?
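
For anyone else hitting this, a quick way to confirm the leaked DataStreamer
threads described above is to take a thread dump of the running executor. A
rough sketch, assuming jstack is available on the host and <executor-pid> is
the CoarseGrainedExecutorBackend pid from the ps output:

# Count the DataStreamer threads on a running executor. Replace
# <executor-pid> with the CoarseGrainedExecutorBackend pid from `ps aux`.
jstack <executor-pid> | grep -c '"DataStreamer for file'

# Optionally confirm they are parked in TIMED_WAITING:
jstack <executor-pid> | grep -A 1 '"DataStreamer for file' | grep -c 'TIMED_WAITING'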




