AM restart on another node puts SparkSQL job into a state of feign death

2017-12-20 Thread Bang Xiao
I run "spark-sql --master yarn --deploy-mode client -f 'SQLs' " in a shell. The application gets stuck when the AM goes down and restarts on another node; it seems the driver just waits for the next SQL statement. Is this a bug? In my opinion, the application should either re-execute the failed SQL or exit with a failure when
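One knob worth checking here (an assumption on my part, not something the poster confirmed applies): YARN retries the AM by default, and capping the attempts makes the application fail fast instead of hanging after an AM restart. A minimal sketch, with an illustrative SQL file name:

```shell
# Fail the application after a single AM attempt instead of letting
# YARN restart the AM on another node and leave the driver waiting.
spark-sql --master yarn --deploy-mode client \
  --conf spark.yarn.maxAppAttempts=1 \
  -f my_queries.sql
```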

Exception in Shutdown-thread, bad file descriptor

2017-12-20 Thread Noorul Islam Kamal Malmiyoda
Hi all, We are getting the following exception and this somehow blocks the parent thread from proceeding further. 17/11/14 16:50:09 SPARK_APP WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 17/11/14 16:50:17 SPARK_APP

Re: Help Required on Spark - Convert DataFrame to List without using collect

2017-12-20 Thread Sunitha Chennareddy
Hi, Thank you all. Here is my requirement: I have a DataFrame that contains a list of rows retrieved from an Oracle table. I need to iterate over the DataFrame, fetch each record, and call a common function, passing a few parameters. The issue I am facing is that I am not able to call the common function JavaRDD

Re: Can spark shuffle leverage Alluxio to obtain higher stability?

2017-12-20 Thread chopinxb
In my practice with Spark applications (almost all Spark SQL), when there is a complete node failure in my cluster, jobs that have shuffle blocks on the node fail completely after 4 task retries. It seems that data lineage didn't work. What's more, our applications use multiple SQL statements for

Keep SparkContext alive and wait for the next job, just like spark-shell

2017-12-20 Thread CondyZhou
Hi all, I am confused about how to keep a SparkContext alive. Our situation is that we write a SQL query on a web page, and on the backend we init a SparkContext and then submit the Spark jobs. However, the problem is that every time we run the query string, Spark requests resources from YARN again. It is
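One existing option for this web-SQL pattern (an assumption about fit, not something discussed in the thread) is the Spark Thrift Server: it holds a single long-lived SparkContext and serves SQL over JDBC/ODBC, so YARN resources are requested once at startup rather than per query. A sketch, with the default port shown for illustration:

```shell
# Start a long-running SQL service backed by one SparkContext on YARN.
$SPARK_HOME/sbin/start-thriftserver.sh --master yarn

# Web backend then submits queries over JDBC instead of spawning contexts:
$SPARK_HOME/bin/beeline -u jdbc:hive2://localhost:10000
```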

Re: /tmp fills up to 100GB when using a window function

2017-12-20 Thread Vadim Semenov
Ah, yes, I missed that part. It's `spark.local.dir`: spark.local.dir /tmp Directory to use for "scratch" space in Spark, including map output files and RDDs that get stored on disk. This should be on a fast, local disk in your system. It can also be a comma-separated list of multiple directories
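A minimal sketch of pointing the scratch space at larger volumes (the paths are illustrative, and note that on YARN this setting is typically superseded by the NodeManager's `yarn.nodemanager.local-dirs`):

```shell
# Spread scratch/shuffle space across larger local disks instead of /tmp.
spark-submit \
  --conf spark.local.dir=/mnt/disk1/spark,/mnt/disk2/spark \
  ...
```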

Re: /tmp fills up to 100GB when using a window function

2017-12-20 Thread Gourav Sengupta
I do think that there is an option to set the temporary shuffle location to a particular directory. While working with EMR I set it to /mnt1/. Let me know in case you are not able to find it. On Mon, Dec 18, 2017 at 8:10 PM, Mihai Iacob wrote: > This code generates files

Fwd: ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM

2017-12-20 Thread Vishal Verma
Hi all, please help me with this error: 17/12/20 11:07:16 INFO executor.CoarseGrainedExecutorBackend: Started daemon with process name: 19581@ddh-dev-dataproc-sw-hdgx 17/12/20 11:07:16 INFO util.SignalUtils: Registered signal handler for TERM 17/12/20 11:07:16 INFO util.SignalUtils: Registered

Re: Can spark shuffle leverage Alluxio to obtain higher stability?

2017-12-20 Thread vincent gromakowski
The probability of a complete node failure is low. I would rely on data lineage and accept the reprocessing overhead. Another option would be to write to a distributed FS, but it will drastically reduce the speed of all your jobs. On 20 Dec 2017 at 11:23, "chopinxb" wrote: > Yes,shuffle

Re: Can spark shuffle leverage Alluxio to obtain higher stability?

2017-12-20 Thread chopinxb
Yes, the shuffle service was already started on each NodeManager. What I mean by a node failure is that the machine is down: all services on the machine, including the NodeManager process, are down. So in this case the shuffle service is no longer helpful.

Re: Can spark shuffle leverage Alluxio to obtain higher stability?

2017-12-20 Thread vincent gromakowski
In your case you need to externalize the shuffle files to a component outside of your Spark cluster so that they persist after Spark workers die. https://spark.apache.org/docs/latest/running-on-yarn.html#configuring-the-external-shuffle-service 2017-12-20 10:46 GMT+01:00 chopinxb
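The linked setup boils down to two pieces of configuration — a sketch abbreviated from the linked docs (property names as in the Spark-on-YARN documentation):

```shell
# yarn-site.xml, on every NodeManager:
#   yarn.nodemanager.aux-services                      -> add "spark_shuffle"
#   yarn.nodemanager.aux-services.spark_shuffle.class  -> org.apache.spark.network.yarn.YarnShuffleService
# (the Spark YARN shuffle jar must be on the NodeManager classpath)

# Spark side, so executors register their shuffle files with the service:
spark-submit --conf spark.shuffle.service.enabled=true ...
```

As the thread notes, this only survives executor death; if the whole machine goes down, the NodeManager and its shuffle service go with it.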

Can spark shuffle leverage Alluxio to obtain higher stability?

2017-12-20 Thread chopinxb
In my use case, I run Spark in yarn-client mode with dynamicAllocation enabled. When a node shuts down abnormally, my Spark application fails because tasks fail to fetch shuffle blocks from that node 4 times. Why doesn't Spark leverage Alluxio (a distributed in-memory filesystem) to write
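The "4 times" likely comes from the default of `spark.task.maxFailures` (4); fetch failures can also count against stage retry limits, so this is a workaround for lost shuffle blocks rather than a fix. A sketch of raising the limit to give lineage-based recomputation more chances:

```shell
spark-sql --master yarn \
  --conf spark.task.maxFailures=8 \
  ...
```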