Does Spark use its own HDFS client?

2017-04-07 Thread Alvaro Brandon
I was going through SparkContext.textFile() and I was wondering at what point Spark communicates with HDFS. Since when you download the Spark binaries you also specify the Hadoop version you will use, I'm guessing it has its own client that calls HDFS wherever you specify it in the
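Spark does not ship its own HDFS implementation; it bundles the Hadoop client libraries matching the Hadoop version chosen at download time, and it finds the cluster through the Hadoop configuration. A minimal sketch of how the HDFS endpoint can be supplied, with a hypothetical namenode hostname:

```properties
# spark-defaults.conf — hypothetical values, for illustration only.
# Spark's bundled Hadoop client can pick up the cluster location from
# HADOOP_CONF_DIR (core-site.xml / hdfs-site.xml), or explicitly via
# a spark.hadoop.* property, which is copied into the Hadoop Configuration:
spark.hadoop.fs.defaultFS  hdfs://namenode.example.com:8020
```

Alternatively, the path passed to textFile() can be a fully qualified `hdfs://` URI, which the Hadoop client resolves directly.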

Re: Launching a Spark application in a subset of machines

2017-02-07 Thread Alvaro Brandon
city or fair scheduler. > > What is the use case for running it always on the same 10 machines? If it > is for licensing reasons then I would ask your vendor if this is a suitable > means to ensure license compliance. Otherwise dedicated cluster. > > On 7 Feb 2017, at 12:09, Alvaro
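On YARN, the reply's suggestion of the capacity or fair scheduler amounts to submitting the application into a queue that the scheduler maps onto the desired share of the cluster. A hedged sketch, with a hypothetical queue name:

```properties
# spark-defaults.conf — illustrative only; the queue must already be
# defined in YARN's capacity or fair scheduler configuration.
spark.master      yarn
spark.yarn.queue  licensed_nodes   # hypothetical queue name
```

Note that a YARN queue caps resources rather than pinning specific hosts; host-level pinning needs node labels or a dedicated cluster, as the reply suggests.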

Re: Launching a Spark application in a subset of machines

2017-02-07 Thread Alvaro Brandon
standalone cluster manager, and > then manage subsets of machines by submitting applications to different > masters. Or you can use Mesos attributes to mark a subset of workers and > specify it in spark.mesos.constraints > > > On Tue, Feb 7, 2017 at 1:21 PM Alvaro Brandon <alvar
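The Mesos route mentioned in the reply works by tagging agents with attributes and having Spark accept only matching resource offers. A sketch with hypothetical ZooKeeper host and attribute values:

```properties
# spark-defaults.conf — illustrative only.
spark.master             mesos://zk://zk1.example.com:2181/mesos
# Only accept offers from agents started with a matching --attributes flag
# (the attribute name "pool" and value "spark-subset" are hypothetical):
spark.mesos.constraints  pool:spark-subset
```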

Launching a Spark application in a subset of machines

2017-02-07 Thread Alvaro Brandon
Hello all: I have the following scenario. - I have a cluster of 50 machines with Hadoop and Spark installed on them. - I want to launch one Spark application through spark-submit. However, I want this application to run on only a subset of these machines, disregarding data locality. (e.g. 10
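One way to pin an application to, say, 10 of the 50 machines (as the replies in this thread suggest) is a separate standalone master whose workers run only on those machines; the application is then submitted against that master. A deployment sketch with hypothetical hostnames:

```shell
# On the chosen master host (subset-master.example.com):
./sbin/start-master.sh

# On each of the 10 chosen worker machines, register with that master only:
./sbin/start-slave.sh spark://subset-master.example.com:7077

# Submit against the dedicated master; the other 40 machines are never used:
./bin/spark-submit --master spark://subset-master.example.com:7077 my-app.jar
```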

Reading Shuffle Data from highly loaded nodes

2016-05-06 Thread Alvaro Brandon
Hello everyone: I'm running an experiment in a Spark cluster where some of the machines are highly loaded with CPU-, memory- and network-consuming processes (let's call them straggler machines). Obviously the tasks on these machines take longer to execute than on other nodes of the cluster.
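Spark's standard mitigation for straggler tasks is speculative execution, which re-launches slow-running tasks on other nodes and keeps whichever copy finishes first. A hedged config sketch with illustrative thresholds:

```properties
# spark-defaults.conf — thresholds here are illustrative, not recommendations.
spark.speculation             true
spark.speculation.multiplier  1.5   # a task is "slow" if it runs > 1.5x the median
spark.speculation.quantile    0.75  # start checking once 75% of tasks have finished
```

Speculation helps with slow computation on loaded nodes, but note it does not relieve the shuffle-read pressure on a loaded node that is serving shuffle data.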

Re: Dynamic allocation Spark

2016-02-26 Thread Alvaro Brandon
That was exactly it. I had the worker and master processes of Spark standalone running together with YARN and somehow the resource manager didn't see the nodes. It's working now. Thanks for the tip :-) 2016-02-26 12:33 GMT+01:00 Jeff Zhang : > Check the RM UI to ensure you
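For reference, dynamic allocation on YARN (the subject of this thread) needs the external shuffle service enabled alongside the allocation flags, since executors can be removed while their shuffle output is still needed. A minimal sketch:

```properties
# spark-defaults.conf — minimal dynamic-allocation setup on YARN.
spark.dynamicAllocation.enabled       true
spark.shuffle.service.enabled         true  # external shuffle service on the NodeManagers
spark.dynamicAllocation.minExecutors  1
spark.dynamicAllocation.maxExecutors  20    # hypothetical cap
```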

Re: Is there anyway to log properties from a Spark application

2015-12-28 Thread Alvaro Brandon
Thanks for the swift response. I'm launching my applications through YARN. Where will these properties be logged? I guess they won't be part of the YARN logs 2015-12-28 13:22 GMT+01:00 Jeff Zhang : > set spark.logConf as true in spark-default.conf will log the property in > driver
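With this setting enabled, the effective SparkConf is printed at INFO level in the driver's log when the SparkContext starts. On YARN, that output does end up in the YARN-managed logs: in cluster mode it lands in the ApplicationMaster container's log, retrievable with `yarn logs -applicationId <appId>`. The sketch:

```properties
# spark-defaults.conf
spark.logConf  true
```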