Hi,
I'm building a Spark application in which I load some data from an
Elasticsearch cluster (using the latest elasticsearch-hadoop connector) and
then perform some calculations on the Spark cluster.
In one case, I call collect on the RDD as soon as it is created (loaded from
ES).
However, it
Thanks Andrew.
On Sun, Aug 16, 2015 at 1:53 PM, Andrew Or and...@databricks.com wrote:
Hi Canan, TestSQLContext is no longer a singleton but now a class. It is
never meant to be a fully public API, but if you wish to use it you can
just instantiate a new one:
val sqlContext = new TestSQLContext
JavaPairReceiverInputDStream<String, byte[]> messages =
KafkaUtils.createStream(...);
JavaPairDStream<String, byte[]> filteredMessages =
filterValidMessages(messages);
JavaDStream<String> useCase1 = calculateUseCase1(filteredMessages);
JavaDStream<String> useCase2 = calculateUseCase2(filteredMessages);
Try --jars rather than --class to submit the jar.
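Sketched as a spark-submit invocation (the class and jar names here are hypothetical): dependency jars go through --jars, the application jar itself is the positional argument, and --class only names the main class inside it.

```shell
# Hypothetical names: com.example.MyApp and the jar paths are placeholders.
# --jars takes a comma-separated list of dependency jars; the application
# jar is passed last, as a positional argument.
spark-submit \
  --master spark://localhost:7077 \
  --class com.example.MyApp \
  --jars lib/dep-one.jar,lib/dep-two.jar \
  target/my-app.jar
```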
On Fri, Aug 14, 2015 at 6:19 AM, Stephen Boesch java...@gmail.com wrote:
NoClassDefFoundError differs from ClassNotFoundException: it
indicates an error while initializing that class, even though the class is found on
the classpath. Please
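A minimal, self-contained demonstration of that distinction (not Spark-specific): a class whose static initializer throws produces an ExceptionInInitializerError on first use, and a NoClassDefFoundError on every later use, even though the class file is on the classpath the whole time.

```java
// Demonstrates NoClassDefFoundError caused by a failed static initializer:
// the class is on the classpath, but its initialization already failed once.
public class InitFailureDemo {
    static class Broken {
        // The if (true) guard keeps the compiler happy; the initializer always throws.
        static { if (true) throw new RuntimeException("static init failed"); }
    }

    static String firstFailure;   // error type seen on the first use of Broken
    static String secondFailure;  // error type seen on any later use

    static String tryUse() {
        try { new Broken(); return "none"; }
        catch (Throwable t) { return t.getClass().getSimpleName(); }
    }

    public static void trigger() {
        if (firstFailure != null) return;  // idempotent: run the experiment once
        firstFailure = tryUse();           // ExceptionInInitializerError
        secondFailure = tryUse();          // NoClassDefFoundError
    }

    public static void main(String[] args) {
        trigger();
        System.out.println(firstFailure + " then " + secondFailure);
    }
}
```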
Hi,
I have been trying to run a standalone application using spark-submit.
Spark started the HTTP server and added the jar file to it, but it is
unable to fetch the jar file. I am running the Spark cluster on localhost.
If anyone can help me find what I am missing here, thanks in advance.
I am building Spark with the following options - most notably
**scala-2.11**:
. dev/switch-to-scala-2.11.sh
mvn -Phive -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.2 -Pscala-2.11 -DskipTests
-Dmaven.javadoc.skip=true clean package
The build goes pretty far but fails in one of the minor modules
Hi,
I am trying to run SparkPi in IntelliJ and am getting a NoClassDefFoundError.
Has anyone else seen this issue before?
Exception in thread "main" java.lang.NoClassDefFoundError:
scala/collection/Seq
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
at
I did check it out, and while I got a general understanding of the
various classes used to implement the sort and hash shuffles, the
slides lack details about how they are implemented and why sort generally
has better performance than hash.
On Sun, Aug 16, 2015 at 4:31 AM, Ravi Kiran
Hi, I have written a Spark job which seems to work fine for almost an hour,
and after that executors start getting lost because of timeouts. I see the
following in the log:
15/08/16 12:26:46 WARN spark.HeartbeatReceiver: Removing executor 10 with no
recent heartbeats: 1051638 ms exceeds
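When executors are dropped for missed heartbeats (often a symptom of long GC pauses), the relevant knobs are the heartbeat interval and the network timeout. A spark-defaults.conf sketch with illustrative values (the property names are from the Spark configuration docs; tune the values to your workload):

```
spark.executor.heartbeatInterval   10s
spark.network.timeout              600s
```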
Hi Mohit,
It depends on whether dynamic allocation is turned on. If not, the number
of executors is specified by the user with the --num-executors option. If
dynamic allocation is turned on, refer to the doc for details:
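For reference, a sketch of the spark-defaults.conf settings that turn dynamic allocation on (property names from the Spark configuration docs; the executor counts are illustrative). Note that the external shuffle service must also be enabled:

```
spark.dynamicAllocation.enabled        true
spark.shuffle.service.enabled          true
spark.dynamicAllocation.minExecutors   2
spark.dynamicAllocation.maxExecutors   20
```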
Check the examples module's dependencies (right-click examples and click Open
Module Settings): by default scala-library is scoped as provided; you need to change
it to compile to run SparkPi in IntelliJ. As I remember, you also need to
change the guava and jetty related libraries to compile as well.
On Mon, Aug 17, 2015
To be clear, Spark Standalone is, like YARN, a simple cluster
management system:
Spark Master --- YARN ResourceManager
Spark Worker --- YARN NodeManager
On Mon, Aug 17, 2015 at 4:59 AM, Ruslan Dautkhanov dautkha...@gmail.com
wrote:
There is no Spark master in YARN mode.
Can you tell us more about your environment? I understand you are running it
on a single machine, but is a firewall enabled?
On Sun, Aug 16, 2015 at 5:47 AM, t4ng0 manvendra.tom...@gmail.com wrote:
Hi,
I am new to Spark and am trying to run a standalone application using
spark-submit. Whatever I could
There is no Spark master in YARN mode; that is standalone-mode terminology.
In YARN cluster mode, Spark's ApplicationMaster (the Spark driver runs in it)
will be restarted
automatically by the RM up to yarn.resourcemanager.am.max-retries
times (default is 2).
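For reference, that retry cap is a YARN-side setting; a yarn-site.xml sketch using the property name cited above (newer Hadoop releases rename it to yarn.resourcemanager.am.max-attempts):

```xml
<!-- Cap on ApplicationMaster restart attempts by the ResourceManager -->
<property>
  <name>yarn.resourcemanager.am.max-retries</name>
  <value>2</value>
</property>
```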
--
Ruslan Dautkhanov
On Fri, Jul 17, 2015 at
Hi, I have a Spark driver program with one loop that iterates around
2000 times, and each of those iterations executes jobs in YARN. Since the loop
does the work serially, I want to introduce parallelism. If I create 2000
tasks/runnables/callables in my Spark driver program, will they get executed
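This is a common pattern: the Spark scheduler is thread-safe, so independent jobs can be submitted from a bounded thread pool in the driver rather than a serial loop. A minimal plain-Java sketch, where submitJob is a stand-in for whatever Spark action each iteration would run (do not spawn 2000 threads; bound the pool):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of driver-side parallelism: submit independent "jobs" from a bounded
// thread pool instead of running every iteration serially.
public class ParallelJobsDemo {
    // Stand-in for a Spark action such as rdd.count(); returns a dummy result.
    static int submitJob(int i) {
        return i * i;
    }

    public static List<Integer> runAll(int jobs, int parallelism) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        List<Future<Integer>> futures = new ArrayList<>();
        for (int i = 0; i < jobs; i++) {
            final int job = i;
            futures.add(pool.submit(() -> submitJob(job)));
        }
        List<Integer> results = new ArrayList<>();
        for (Future<Integer> f : futures) {
            results.add(f.get());  // blocks; preserves submission order
        }
        pool.shutdown();
        return results;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(4, 2));  // [0, 1, 4, 9]
    }
}
```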
Hi, I have a basic Spark SQL join run in local mode. I checked the UI and
see that two jobs are run. Their DAG graphs are pasted at the end.
I have several questions here:
1. It looks like Job 0 and Job 1 have the same DAG stages, but stage 3 and
stage 4 are skipped. I would ask
In Spark, every action (foreach, collect, etc.) gets converted into a Spark
job, and jobs are executed sequentially.
You may want to refactor your code in calculateUseCase? so that it only runs
transformations (map, flatMap) and calls a single action at the end.
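The same laziness can be seen with plain java.util.stream as an analogy (this is not Spark code): intermediate operations like map do no work until a terminal operation runs, just as Spark transformations do nothing until an action triggers a job.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Analogy for Spark's lazy transformations: map() below runs only when the
// terminal reduce() is invoked, never before.
public class LazyDemo {
    // Returns {map calls before the terminal op, map calls after, the sum}.
    public static int[] demo() {
        AtomicInteger mapCalls = new AtomicInteger();
        Stream<Integer> doubled = Stream.of(1, 2, 3)
            .map(x -> { mapCalls.incrementAndGet(); return x * 2; });
        int before = mapCalls.get();                 // 0: nothing has run yet
        int sum = doubled.reduce(0, Integer::sum);   // terminal op runs the pipeline
        return new int[] { before, mapCalls.get(), sum };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);  // 0 3 12
    }
}
```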
On Sun, Aug 16, 2015 at 3:19 PM, mohanaugust