Hi,
I am running Spark-on-YARN on a secure cluster with YARN log aggregation set
up. Once a job completes, viewing the stdout/stderr executor logs in the Spark
history server UI redirects me to the local NodeManager, where a page appears
for a second saying ‘Redirecting to log server…’ and
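For context, that redirect generally only resolves when the NodeManagers know where the aggregated logs are served from. A sketch of the yarn-site.xml entries involved (the host name is a placeholder, not taken from this thread):

```xml
<!-- yarn-site.xml sketch: assumed settings for serving aggregated logs.
     The host/port below are placeholders for the JobHistory server. -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.log.server.url</name>
  <value>http://historyserver.example.com:19888/jobhistory/logs</value>
</property>
```

If yarn.log.server.url is unset, the NodeManager has nowhere to redirect to once the local log files have been aggregated away.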
Hi,
I am having issues with /tmp space filling up during Spark jobs because
Spark-on-YARN uses the yarn.nodemanager.local-dirs for shuffle space. I noticed
this message appears when submitting Spark-on-YARN jobs:
WARN SparkConf: In Spark 1.0 and later spark.local.dir will be overridden by the
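That warning reflects how, on YARN, the executors' scratch space comes from the NodeManager's configured local dirs rather than spark.local.dir. A hedged yarn-site.xml sketch (the paths are placeholders) for moving shuffle spill off /tmp:

```xml
<!-- yarn-site.xml sketch: point NodeManager local dirs at volumes with
     room for shuffle spill, instead of the default under /tmp.
     The /data/... paths below are example placeholders. -->
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/data/1/yarn/local,/data/2/yarn/local</value>
</property>
```

Multiple comma-separated directories also let YARN spread shuffle I/O across disks.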
That’s really useful, thanks.
From: Andrew Ash [mailto:and...@andrewash.com]
Sent: 09 January 2015 22:42
To: England, Michael (IT/UK)
Cc: raghavendra.pan...@gmail.com; user
Subject: Re: Cleaning up spark.local.dir automatically
That's a worker setting which cleans up the files left behind by
Thanks, I imagine this will kill any cached RDDs if their files are beyond the
TTL?
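For readers following along: the worker-side cleanup being discussed is configured on standalone workers via SPARK_WORKER_OPTS. A sketch of a spark-env.sh entry (the interval and TTL values below are examples, not recommendations from this thread):

```
# spark-env.sh sketch: enable the standalone worker's periodic cleanup of
# finished applications' work dirs. Values are example placeholders;
# interval and appDataTtl are in seconds.
SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true \
  -Dspark.worker.cleanup.interval=1800 \
  -Dspark.worker.cleanup.appDataTtl=604800"
```

Note this only cleans up directories of stopped applications, and it does not apply on YARN, where the NodeManager handles container cleanup.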
Thanks
From: Raghavendra Pandey [mailto:raghavendra.pan...@gmail.com]
Sent: 09 January 2015 15:29
To: England, Michael (IT/UK); user@spark.apache.org
Subject: Re: Cleaning up spark.local.dir automatically
You
Hi Marcelo,
On MapR, the mapr user can read the files using the NFS mount, however using
the normal hadoop fs -cat /... command, I get permission denied. As the history
server is pointing to a location on MapR-FS, not the NFS mount, I'd imagine the
Spark history server is trying to read the
Hi,
Is there a way of automatically cleaning up the spark.local.dir after a job has
been run? I have noticed that a large number of temporary files are stored
there and never cleaned up. The only solution I can think of is to run some
sort of cron job to delete files older than a few days. I
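Pending a built-in option, the cron approach mentioned above is a workable stopgap. A minimal sketch, assuming the scratch files live under a hypothetical /tmp/spark-local and a 3-day cutoff; both the path and the age are assumptions, and sweeping the dirs of a still-running job would break it:

```shell
# Hypothetical cleanup sketch: delete Spark scratch files older than 3 days,
# then prune any directories that are left empty.
find /tmp/spark-local -type f -mtime +3 -delete 2>/dev/null
find /tmp/spark-local -type d -empty -delete 2>/dev/null
```

Scheduling it off-peak (e.g. a nightly crontab entry) reduces the chance of racing an active job.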
Hi Vanzin,
I am using the MapR distribution of Hadoop. The history server logs are created
by a job with the permissions:
drwxrwx--- - myusername mygroup 2 2015-01-08 09:14
/apps/spark/historyserver/logs/spark-1420708455212
However, the permissions of the higher directories
Hi,
I am currently running pyspark jobs against Spark 1.1.0 on YARN. When I run
example Java jobs such as spark-pi, the following files get created:
bash-4.1$ tree spark-pi-1420624364958
spark-pi-1420624364958
├── APPLICATION_COMPLETE
├── EVENT_LOG_1
└── SPARK_VERSION_1.1.0
0 directories, 3 files
Thanks Andrew, simple fix ☺.
From: Andrew Ash [mailto:and...@andrewash.com]
Sent: 07 January 2015 15:26
To: England, Michael (IT/UK)
Cc: user
Subject: Re: FW: No APPLICATION_COMPLETE file created in history server log
location upon pyspark job success
Hi Michael,
I think you need to
Hi,
When I run jobs and save the event logs, they are saved with the permissions of
the Unix user and group that ran the Spark job. The history server runs as a
service account and therefore can’t read the files:
Extract from the History server logs:
2015-01-07 15:37:24,3021 ERROR Client
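One way this kind of mismatch is often worked around is to make the service account's group the group owner of the log tree and grant it group read access. A hedged sketch (the group name and path below are placeholders for whatever the history server actually runs under):

```
# Ops sketch, placeholders throughout: let the history server's group
# traverse and read the event log directories.
hadoop fs -chgrp -R sparkhistory /apps/spark/historyserver/logs
hadoop fs -chmod -R 750 /apps/spark/historyserver/logs
```

Newly written logs will still get the submitting user's group unless the job's umask or the directory's group inheritance is also addressed.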