Thanks, Xiao, for the info. I was looking for this, too. This page
wasn't linked from anywhere on the main doc page (Overview) or any of
the pull-down menus. Someone should remind the doc team to update the
table of contents on the Overview page.
-- ND
On 7/19/20 10:30 PM, Xiao Li wrote:
htt
Is there any way to make the Spark process visible via the Spark UI when
running Spark 3.0 on a Hadoop YARN cluster? The Spark documentation
talks about replacing the Spark UI with the Spark history server, but
doesn't give much detail. Therefore I would assume it is still possible
to use Spark UI wh
I've been trying to set up the latest stable version of Spark 3.0 on a
Hadoop cluster using YARN. When running spark-submit in client mode, I
always got an error saying that
org.apache.spark.deploy.yarn.ExecutorLauncher was not found. This
happened when I preloaded the Spark jar files onto HDFS and
specifie
Thank you all for the responses. I believe the user shouldn't have to
worry about creating the log dir explicitly. Event logging should
behave like the other logs (e.g. master or slave): the directory should
be created automatically if it does not exist.
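
As long as the directory is not created automatically, a small
pre-launch step can work around it. The following is only a sketch, not
something Spark provides: the helper names ensure_event_log_dir and
event_log_conf are hypothetical, and it assumes a Unix-style local path
such as the /tmp/spark-events default that the event log falls back to.
The two property names, spark.eventLog.enabled and spark.eventLog.dir,
are the standard Spark settings.

```python
import os


def ensure_event_log_dir(path="/tmp/spark-events"):
    """Create the event log directory if it is missing and return its path.

    Hypothetical helper: meant to run on the machine that hosts the
    driver, before the Spark application starts.
    """
    os.makedirs(path, exist_ok=True)  # no error if it already exists
    return path


def event_log_conf(path):
    """Build the matching Spark properties as a plain dict.

    Assumes a local filesystem path, turned into a file:// URI.
    """
    return {
        "spark.eventLog.enabled": "true",
        "spark.eventLog.dir": "file://" + os.path.abspath(path),
    }


if __name__ == "__main__":
    log_dir = ensure_event_log_dir()
    for key, value in event_log_conf(log_dir).items():
        # Each pair can be passed to spark-submit as --conf key=value
        print(f"--conf {key}={value}")
```

The printed pairs can be passed to spark-submit or copied into
spark-defaults.conf; either way the directory has to exist before the
application starts.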
-- ND
On 7/2/20 9:19 AM, Zero wrote:
This
Could you share your code? Are you sure your Spark 2.4 cluster actually
read anything? It looks like the Input Size field is empty under 2.4.
-- ND
On 6/27/20 7:58 PM, Sanjeev Mishra wrote:
I have a large number of JSON files that Spark can read in 36 seconds
but Spark 3.0 takes almost 33 minu
While launching a Spark job from Zeppelin against a standalone Spark
cluster (Spark 3.0 with multiple workers, without Hadoop), we
encountered a Spark interpreter exception caused by an I/O "file not
found" exception, because the /tmp/spark-events directory did not exist.
We had to creat
If you are using Maven to manage your jar dependencies, the jar files
are located in the local Maven repository in your home directory. It is
usually under the .m2 directory.
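
If it helps to locate a particular jar, here is a rough sketch in
Python. It assumes the default repository location,
~/.m2/repository (a custom localRepository in Maven's settings.xml
would change that), and find_jars is just an illustrative helper name,
not part of any tool.

```python
from pathlib import Path


def find_jars(repo=None, name_contains=""):
    """Return sorted paths of jar files under a local Maven repository.

    repo defaults to ~/.m2/repository (assumed default location).
    name_contains optionally filters by a substring of the file name.
    """
    root = Path(repo) if repo is not None else Path.home() / ".m2" / "repository"
    if not root.is_dir():
        return []  # no local repository at this location
    return sorted(p for p in root.rglob("*.jar") if name_contains in p.name)


if __name__ == "__main__":
    # Example: list every jar whose file name mentions "spark"
    for jar in find_jars(name_contains="spark"):
        print(jar)
```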
Hope this helps.
-ND
On 6/23/20 3:21 PM, Anwar AliKhan wrote:
Hi,
I prefer to do most of my projects in Python and for that I
We were trying to use Structured Streaming with a file source, but had
problems getting Spark to read the files properly. We have another
process generating data files in the Spark data source directory on
a continuous basis. What we observed was that the moment a data
file is created