---------- Forwarded message ----------
From: Renu Yadav <yren...@gmail.com>
Date: Mon, Sep 14, 2015 at 4:51 PM
Subject: Spark job failed
To: d...@spark.apache.org
I am getting the below error while running a Spark job:

storage.DiskBlockObjectWriter: Uncaught exception while reverting partial writes to file /data/vol5/hadoop/yarn/local/usercache/renu_yadav/appcache/application_1438196554863_31545/spark-4686a622-82be-418e-a8b0-1653458bc8cb/22/temp_shuffle_8c437ba7-55d2-4520-80ec-adcfe932b3bd
java.io.FileNotFoundException: /data/vol5/hadoop/yarn/local/usercache/renu_yadav/appcache/application_1438196554863_31545/spark-4686a622-82be-418e-a8b0-1653458bc8cb/22/temp_shuffle_8c437ba7-55d2-4520-80ec-adcfe932b3bd (No such file or directory)

I am processing 1.3 TB of data. The transformations are: read from Hadoop -> map to (key, value) -> coalesce(2000) -> groupByKey, then sort the records for each key by server_ts and select the most recent, saving the result as Parquet (a simplified sketch of the job is at the end of this mail).

Following is the command:

spark-submit --class com.test.Myapp \
  --master yarn-cluster \
  --driver-memory 16g \
  --executor-memory 20g \
  --executor-cores 5 \
  --num-executors 150 \
  --files /home/renu_yadav/fmyapp/hive-site.xml \
  --conf spark.yarn.preserve.staging.files=true \
  --conf spark.shuffle.memoryFraction=0.6 \
  --conf spark.storage.memoryFraction=0.1 \
  --conf SPARK_SUBMIT_OPTS="-XX:MaxPermSize=768m" \
  --conf spark.akka.timeout=400000 \
  --conf spark.locality.wait=10 \
  --conf spark.yarn.executor.memoryOverhead=8000 \
  --conf SPARK_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  --conf spark.reducer.maxMbInFlight=96 \
  --conf spark.shuffle.file.buffer.kb=64 \
  --conf spark.core.connection.ack.wait.timeout=120 \
  --jars /usr/hdp/2.2.6.0-2800/hive/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/2.2.6.0-2800/hive/lib/datanucleus-core-3.2.10.jar,/usr/hdp/2.2.6.0-2800/hive/lib/datanucleus-rdbms-3.2.9.jar \
  myapp_2.10-1.0.jar

Cluster configuration:
20 nodes
32 cores per node
125 GB RAM per node

Please help.

Thanks & Regards,
Renu Yadav
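P.S. As mentioned above, here is a simplified sketch of the job. The input path, the line parsing, and the Event record are illustrative placeholders, not the actual code; only the coalesce(2000), groupByKey, select-latest-by-server_ts, and Parquet-save steps reflect what the job really does. maxBy stands in for the sort-then-take-most-recent step, since both just pick the latest record per key.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Illustrative record type; the real schema differs.
case class Event(key: String, serverTs: Long, payload: String)

object Myapp {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("Myapp"))
    val sqlContext = new SQLContext(sc)

    val latest = sc.textFile("hdfs:///path/to/input")   // read from Hadoop (placeholder path)
      .map { line =>
        val f = line.split('\t')                        // parse each line into (key, event)
        (f(0), Event(f(0), f(1).toLong, f(2)))
      }
      .coalesce(2000)                                   // shrink partition count before the shuffle
      .groupByKey()                                     // gather all events for a key together
      .mapValues(_.maxBy(_.serverTs))                   // keep only the most recent event per key
      .values

    // Save as Parquet (Spark 1.3-era API).
    sqlContext.createDataFrame(latest).saveAsParquetFile("hdfs:///path/to/output")
  }
}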