Hello Neelesh,
Thank you for the checklist for determining the correct configuration of
Spark. I will go through these and let you know if I have further questions.
Regards,
Chris
Hi Chris,
Thank you for posting the question.
Tuning Spark configurations is a tricky task since there are a lot of factors
to consider. The configurations that you listed cover most of them.
Some questions about your situation that can guide a tuning decision:
1) What kind of Spark application:
BDA v3 server : SUN SERVER X4-2L
Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
CPU cores : 32
GB of memory (>=63): 63
number of disks : 12

spark-defaults.conf:
spark.driver.memory 20g
spark.executor.memory 40g
spark.executor.extraJavaOptions -XX:+PrintGCDetails
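For reference, spark-defaults.conf is a plain-text file of whitespace-separated key/value pairs, one per line, with `#` starting a comment. A minimal illustrative parser for that shape (a sketch only, not Spark's actual loader):

```python
def parse_spark_defaults(text):
    """Parse spark-defaults.conf-style text: one key/value pair per
    line separated by whitespace; blank lines and # comments skipped.
    Illustrative sketch only, not Spark's actual loader."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on first run of whitespace
        if len(parts) == 2:
            conf[parts[0]] = parts[1].strip()
    return conf

sample = """
# executor sizing
spark.driver.memory 20g
spark.executor.memory 40g
spark.executor.extraJavaOptions -XX:+PrintGCDetails
"""
print(parse_spark_defaults(sample)["spark.executor.memory"])  # 40g
```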
I think it's a missing feature.
On Wed, Sep 2, 2015 at 10:58 PM, Axel Dahl <a...@whisperstream.com> wrote:
> So a bit more investigation, shows that:
>
> if I have configured spark-defaults.conf with:
>
> "spark.files library.py"
>
> then if I call
>
> "spark-submit.py -v test.py"
>
> I see that my "spark.files" default option has been replaced with
> "spark.files
in my spark-defaults.conf I have:
spark.files file1.zip, file2.py
spark.master spark://master.domain.com:7077
If I execute:
bin/pyspark
I can see it adding the files correctly.
However if I execute
bin/spark-submit test.py
where test.py relies on the file1.zip, I get
This should be a bug, could you create a JIRA for it?
On Wed, Sep 2, 2015 at 4:38 PM, Axel Dahl <a...@whisperstream.com> wrote:
> in my spark-defaults.conf I have:
> spark.files file1.zip, file2.py
> spark.master spark://master.domain.com:7077
So a bit more investigation, shows that:
if I have configured spark-defaults.conf with:
"spark.files library.py"
then if I call
"spark-submit.py -v test.py"
I see that my "spark.files" default option has been replaced with
"spark.files
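The behavior being reported can be sketched with plain dicts: a submit-time value simply replaces the default under the same key, whereas the poster expected file lists to be appended. The merge helper below is hypothetical, showing the append behavior they wanted rather than what Spark does:

```python
def resolve(defaults, submitted, merge_keys=()):
    """Illustrative sketch: combine defaults with submit-time values.
    Plain assignment replaces a default (the reported behavior); for
    keys listed in merge_keys, comma-separated lists are concatenated
    instead (the behavior the posters expected)."""
    out = dict(defaults)
    for key, value in submitted.items():
        if key in merge_keys and key in out:
            out[key] = out[key] + "," + value
        else:
            out[key] = value
    return out

defaults = {"spark.files": "library.py"}
submitted = {"spark.files": "test.py"}
# What the thread observed: the default is replaced outright.
print(resolve(defaults, submitted)["spark.files"])  # test.py
# What the posters expected: both files retained.
print(resolve(defaults, submitted,
              merge_keys=("spark.files",))["spark.files"])  # library.py,test.py
```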
Does anybody have any idea how to solve this problem?
Ningjun
From: Wang, Ningjun (LNG-NPV)
Sent: Thursday, July 30, 2015 11:06 AM
To: user@spark.apache.org
Subject: How to register array class with Kryo in spark-defaults.conf
I register my class with Kryo in spark-defaults.conf as follows:
Sent: Friday, July 31, 2015 11:49 AM
To: Wang, Ningjun (LNG-NPV)
Cc: user@spark.apache.org
Subject: Re: How to register array class with Kryo in spark-defaults.conf
For the second exception, was there anything following SparkException which
would give us more of a clue?
Can you tell us how EsDoc
I register my class with Kryo in spark-defaults.conf as follows:
spark.serializer
org.apache.spark.serializer.KryoSerializer
spark.kryo.registrationRequired true
spark.kryo.classesToRegister ltn.analytics.es.EsDoc
But I got the following
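One likely cause of failures like this (an assumption here, since the error text is cut off above) is that with spark.kryo.registrationRequired=true the array class EsDoc[] must be registered separately from EsDoc itself, under its JVM binary name. The JVM names an array of an object class by prefixing "[L" and appending ";", with one extra "[" per dimension, which is the form java.lang.Class.getName() reports. A small helper to build that name:

```python
def jvm_array_name(class_name, dims=1):
    """Return the JVM binary name of an array of an object class, in
    the form java.lang.Class.getName() reports: '[L' + name + ';',
    with one leading '[' per array dimension."""
    return "[" * dims + "L" + class_name + ";"

# The class from the post above; the registration entry for EsDoc[]
# would use this name alongside the plain class name.
print(jvm_array_name("ltn.analytics.es.EsDoc"))     # [Lltn.analytics.es.EsDoc;
print(jvm_array_name("ltn.analytics.es.EsDoc", 2))  # [[Lltn.analytics.es.EsDoc;
```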
to pass both
-c spark.executor.instances (or --num-executors) *and* -c
spark.dynamicAllocation.enabled=true to spark-submit on the command line (as
opposed to having one of them in spark-defaults.conf and one of them in the
spark-submit args), but currently there doesn't seem to be any way in
spark-defaults.conf
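The conflict the thread describes can be sketched as a validation over the merged configuration: a fixed executor count and dynamic allocation contradict each other regardless of whether each was set in spark-defaults.conf or on the command line. The helper below is hypothetical, not Spark's actual check (as far as I recall, later Spark versions log a warning and disable dynamic allocation in this case):

```python
def check_dynamic_allocation(conf):
    """Illustrative validation over a merged config dict: dynamic
    allocation conflicts with a fixed executor count, wherever each
    value was set."""
    dynamic = conf.get("spark.dynamicAllocation.enabled", "false") == "true"
    fixed = "spark.executor.instances" in conf
    if dynamic and fixed:
        return ("warning: spark.executor.instances is set; "
                "dynamic allocation will be disabled")
    return "ok"

merged = {"spark.executor.instances": "10",        # from spark-defaults.conf
          "spark.dynamicAllocation.enabled": "true"}  # from -c on the command line
print(check_dynamic_allocation(merged))
```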
I've set up my cluster with a pre-calculated value for spark.executor.instances
in spark-defaults.conf such that I can run a job and have it maximize the
utilization of the cluster resources by default. However, if I want to run a
job with dynamicAllocation (by passing -c
the exception would be helpful if, say, you tried to pass
both -c spark.executor.instances (or --num-executors) *and* -c
spark.dynamicAllocation.enabled=true to spark-submit on the command line
(as opposed to having one of them in spark-defaults.conf and one of them in
the spark-submit args
From: ...@amazon.com
Cc: user@spark.apache.org
Subject: Re: Unable to use dynamicAllocation if spark.executor.instances
is set in spark-defaults.conf
Hi Jonathan,
This is a problem that has come up for us as well, because we'd like
dynamic allocation to be turned on by default
I've set up my cluster with a pre-calculated value for spark.executor.instances
in spark-defaults.conf such that I can run a job and have it maximize the
utilization of the cluster resources by default. However, if I want to run a
job with dynamicAllocation (by passing -c
-------- Original message --------
From: Akhil Das ak...@sigmoidanalytics.com
Date: 07/01/2015 2:27 AM (GMT-05:00)
To: Yana Kadiyska yana.kadiy...@gmail.com
Cc: user@spark.apache.org
Subject: Re: Difference between spark-defaults.conf and SparkConf.set
spark.driver.extraClassPath
to point to some external JARs. If I set them in spark-defaults.conf
everything works perfectly.
However, if I remove spark-defaults.conf and just create a SparkConf and
call
.set("spark.executor.extraClassPath", ...)
.set("spark.driver.extraClassPath", ...)
I get ClassNotFound
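A plausible explanation for the difference (an assumption, since the thread doesn't confirm it): some settings, notably the driver-side classpath, must be known before the driver JVM is launched, so spark-defaults.conf or spark-submit flags work, while setting them from application code via SparkConf comes too late. A sketch of that distinction; the set below is illustrative, not an exhaustive list:

```python
# Hypothetical illustration: settings that must be fixed before the
# driver JVM starts cannot take effect when set from already-running
# application code via SparkConf.set().
LAUNCH_TIME_SETTINGS = {
    "spark.driver.extraClassPath",
    "spark.driver.extraJavaOptions",
    "spark.driver.memory",
}

def settable_at_runtime(key):
    """True if (under this sketch's assumption) setting the key from
    application code can still take effect."""
    return key not in LAUNCH_TIME_SETTINGS

print(settable_at_runtime("spark.driver.extraClassPath"))  # False
print(settable_at_runtime("spark.app.name"))               # True
```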
Hi folks, running into a pretty strange issue:
I'm setting
spark.executor.extraClassPath
spark.driver.extraClassPath
to point to some external JARs. If I set them in spark-defaults.conf
everything works perfectly.
However, if I remove spark-defaults.conf and just create a SparkConf and
call
.set
So no takers regarding why spark-defaults.conf is not being picked up.
Here is another one:
If Zookeeper is configured in Spark why do we need to start a slave like
this:
spark-1.3.0-bin-hadoop2.4/sbin/start-slave.sh 1 spark://somemaster:7077
i.e. why do we need to specify the master url
I renamed spark-defaults.conf.template to spark-defaults.conf
and invoked
spark-1.3.0-bin-hadoop2.4/sbin/start-slave.sh
But I still get
failed to launch org.apache.spark.deploy.worker.Worker:
--properties-file FILE Path to a custom Spark properties file.
Default is conf/spark-defaults.conf.
But I'm thinking
Thanks.
I've set SPARK_HOME and SPARK_CONF_DIR appropriately in .bash_profile
But when I start worker like this
spark-1.3.0-bin-hadoop2.4/sbin/start-slave.sh
I still get
failed to launch org.apache.spark.deploy.worker.Worker:
Default is conf/spark-defaults.conf
Here is a related problem:
http://apache-spark-user-list.1001560.n3.nabble.com/Launching-history-server-problem-td12574.html
but no answer.
What I'm trying to do: wrap spark-history with an /etc/init.d script.
Problems I have: can't make it read spark-defaults.conf.
I've put this file here:
/etc/spark
Hi all,
Can a value in spark-defaults.conf contain system variables?
Such as mess = ${user.home}/${user.name}.
Best Regards
Zhanfeng Huo
No, not currently.
2014-09-01 2:53 GMT-07:00 Zhanfeng Huo huozhanf...@gmail.com:
Hi all,
Can a value in spark-defaults.conf contain system variables?
Such as mess = ${user.home}/${user.name}.
Best Regards
--
Zhanfeng Huo
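Since spark-defaults.conf does not expand ${...} placeholders itself, one workaround is to expand them in a wrapper script before the file is written or the value is passed to spark-submit. A minimal sketch (the property values below are hypothetical stand-ins for Java's user.home/user.name system properties):

```python
import re

def expand_java_properties(value, props):
    """Replace ${name} placeholders with values from props, leaving
    unknown placeholders untouched. Workaround sketch: run this in a
    wrapper before handing the value to Spark, since
    spark-defaults.conf does not expand variables itself."""
    return re.sub(r"\$\{([^}]+)\}",
                  lambda m: props.get(m.group(1), m.group(0)),
                  value)

# Hypothetical values standing in for the JVM system properties.
props = {"user.home": "/home/spark", "user.name": "spark"}
print(expand_java_properties("${user.home}/${user.name}", props))  # /home/spark/spark
```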
Hi All,
Not sure if anyone has run into this problem, but this exists in Spark 1.0.0
when you specify the location in conf/spark-defaults.conf for
spark.eventLog.dir hdfs:///user/$USER/spark/logs
to use the $USER env variable.
For example, I'm running the command with user 'test'.
In spark-submit
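Along the same lines, environment variables like $USER can be expanded before the value reaches Spark, e.g. in a launcher script that then passes the result on the spark-submit command line rather than relying on spark-defaults.conf to expand it. A sketch (the wrapper convention is assumed, not a Spark feature):

```python
import os

# Expand $USER up front, then pass the result to spark-submit, e.g. as
# --conf spark.eventLog.dir=<expanded value>, instead of putting the
# unexpanded form in spark-defaults.conf.
os.environ["USER"] = "test"  # the example user from the post
log_dir = os.path.expandvars("hdfs:///user/$USER/spark/logs")
print(log_dir)  # hdfs:///user/test/spark/logs
```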
Hi Andrew,
It's definitely not bad practice to use spark-shell with HistoryServer. The
issue here is not with spark-shell, but the way we pass Spark configs to
the application. spark-defaults.conf does not currently support embedding
environment variables, but instead interprets everything
them to create their own
spark-defaults.conf since this is set to read-only. A workaround is to set it
to a shared folder, e.g. /user/spark/logs, with permission 1777. This isn't
really ideal since other people can see what other jobs are running on the
shared cluster.
It would be nice to have
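The 1777 mode mentioned above is world-writable plus the sticky bit, as on /tmp: every user can write their own event logs, but only a file's owner can delete it. A small local demonstration of those permission bits (the directory here is a local stand-in for the shared HDFS log folder; it illustrates the bits only):

```python
import os
import stat
import tempfile

# Stand-in for the shared log directory /user/spark/logs.
shared = tempfile.mkdtemp()
os.chmod(shared, 0o1777)

mode = os.stat(shared).st_mode
print(oct(stat.S_IMODE(mode)))   # 0o1777
print(bool(mode & stat.S_ISVTX)) # True: sticky bit is set
```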