Re: Where does the Driver run?

2019-03-23 Thread Akhil Das
If you are starting your "my-app" on your local machine, that's where the driver is running. Hope this helps. On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel wrote: > I have researched this for a significant amount of
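The reply above describes client deploy mode, which is the default. A minimal sketch of how the `--deploy-mode` flag changes where the driver runs (the master URL and jar name are placeholders, not from the thread):

```shell
# client mode (the default): the driver JVM runs on the machine
# where spark-submit is invoked, e.g. your laptop or an edge node.
spark-submit --master spark://master-address:7077 \
  --deploy-mode client my-app.jar

# cluster mode: the driver is launched on one of the workers,
# and spark-submit returns after handing the application off.
spark-submit --master spark://master-address:7077 \
  --deploy-mode cluster my-app.jar
```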

Re: Spark - Hadoop custom filesystem service loading

2019-03-23 Thread Felix Cheung
Hmm thanks. Do you have a proposed solution? From: Jhon Anderson Cardenas Diaz Sent: Monday, March 18, 2019 1:24 PM To: user Subject: Spark - Hadoop custom filesystem service loading Hi everyone, On Spark 2.2.0, if you wanted to create a custom file system
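For background: Hadoop discovers `FileSystem` implementations via the JDK's `java.util.ServiceLoader`, which scans `META-INF/services/org.apache.hadoop.fs.FileSystem` entries on the classpath. A minimal sketch of that same mechanism, using the JDK's own `FileSystemProvider` service as a stand-in so it runs without a Hadoop dependency:

```scala
import java.nio.file.spi.FileSystemProvider
import java.util.ServiceLoader
import scala.collection.JavaConverters._

object ServiceLoaderDemo {
  def main(args: Array[String]): Unit = {
    // ServiceLoader reads META-INF/services/<interface-name> files on the
    // classpath; Hadoop uses the same call to find FileSystem subclasses.
    val discovered = ServiceLoader.load(classOf[FileSystemProvider]).asScala.toList
    discovered.foreach(p => println(s"discovered scheme: ${p.getScheme}"))

    // installedProviders() is the JDK convenience wrapper around it; the
    // first entry is always the default "file" provider.
    val installed = FileSystemProvider.installedProviders().asScala.toList
    println(s"default scheme: ${installed.head.getScheme}")
  }
}
```

For a custom Hadoop filesystem, the analogous options are shipping a `META-INF/services/org.apache.hadoop.fs.FileSystem` entry in the jar, or setting the `fs.<scheme>.impl` configuration key explicitly (assumption: the thread concerns this discovery path, not a different loading issue).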

Apache Spark Newsletter Issue 2

2019-03-23 Thread Ankur Gupta
Hello, Issue two of the newsletter https://newsletterspot.com/apache-spark/2/ Feel free to submit articles to the newsletter https://newsletterspot.com/apache-spark/submit/ Next issue onwards will be adding * Spark Events / User Meetups * Tags to identifying content e.g. videos,

Where does the Driver run?

2019-03-23 Thread Pat Ferrel
I have researched this for a significant amount of time and find answers that seem to be for a slightly different question than mine. The Spark 2.3.3 cluster is running fine. I see the GUI on "http://master-address:8080", there are 2 idle workers, as configured. I have a Scala application that

JavaRDD and WrappedArrays type iterate

2019-03-23 Thread 1266
Hi everyone, I have encountered some problems when using Spark: I ran into the type WrappedArray(), and I don't know how to iterate over this type, or how to convert it to a List or an Array. Also, after I convert the dataframe type data into JavaRDD, the
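For reference, on Scala 2.11/2.12 (the versions Spark 2.x builds against) wrapping an `Array` as a `Seq` produces a `scala.collection.mutable.WrappedArray`, which is what `row.getAs[Seq[_]]` hands back for ArrayType columns. A minimal sketch of iterating and converting one, built from a plain array here as a stand-in for a Row lookup:

```scala
object WrappedArrayDemo {
  def main(args: Array[String]): Unit = {
    // Implicit wrapping: on Scala 2.11/2.12 this Seq is a WrappedArray
    // (on 2.13 it is an immutable ArraySeq); either way it is a Seq.
    val wrapped: Seq[Int] = Array(1, 2, 3)

    // Iterate it like any other Seq...
    wrapped.foreach(println)

    // ...or convert it once and work with a concrete collection.
    val asList: List[Int]   = wrapped.toList
    val asArray: Array[Int] = wrapped.toArray
    println(asList.mkString(","))   // 1,2,3
    println(asArray.sum)            // 6
  }
}
```

On the Java side, `Row.getList(i)` returns a `java.util.List` directly, which sidesteps the Scala wrapper entirely.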

spark core / spark sql -- unexpected disk IO activity after all the spark tasks finished but spark context has not stopped.

2019-03-23 Thread Chenghao
Hi, I detected an unexpected disk IO spike (DISKBUSY) after all my Spark tasks finished but the Spark context had not stopped -- as shown in the figure, case 2, at 21:56:47. Could anyone help explain it and give suggestions on how to avoid or postpone it? Or does the Spark context have some periodical async
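One plausible source of disk activity after tasks finish but before `sc.stop()` is Spark's own background housekeeping: periodic cleanup of shuffle and broadcast files, and event-log flushing. These are assumptions to check, not a diagnosis; settings worth inspecting, as a sketch:

```
# The ContextCleaner triggers a periodic JVM GC so shuffle files and
# broadcast blocks with no remaining references get deleted (default 30min).
spark.cleaner.periodicGC.interval  30min

# If event logging is enabled, the history file is written while the
# context is up; disabling it isolates whether it causes the IO.
spark.eventLog.enabled  false

# Where shuffle spill and block data land on disk.
spark.local.dir  /tmp
```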

[spark context / spark sql] unexpected disk IO activity after spark job finished but spark context has not

2019-03-23 Thread Chenghao
Hi, I have a SparkSQL workload and ran it as a batch job in two cases. In the first case, I execute the workload and stop the batch job after `.show()` finishes. In the second case, I execute the same workload and call a 1-minute sleep `Thread.sleep(60000)` before I stop its Spark context
