I think you have been through enough :).
Basically you just have to download the spark-ec2 scripts & run them. They
only need your Amazon access key & secret key; they'll start your cluster,
install everything, create the security groups & give you the URL. Then just
log in & go ahead...
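
Roughly like this (the keypair & cluster names below are just placeholders,
and this assumes the spark-ec2 script from the ec2/ directory of the Spark
download):

    # credentials go in these environment variables
    export AWS_ACCESS_KEY_ID=<your-access-key>
    export AWS_SECRET_ACCESS_KEY=<your-secret-key>
    # launch a cluster with 2 slaves
    ./spark-ec2 -k my-keypair -i ~/my-keypair.pem -s 2 launch my-spark-cluster
    # ssh into the master once the launch finishes
    ./spark-ec2 -k my-keypair -i ~/my-keypair.pem login my-spark-cluster

When you're done, ./spark-ec2 destroy my-spark-cluster tears the whole thing
down.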

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>



On Mon, Mar 3, 2014 at 11:00 AM, Bin Wang <binwang...@gmail.com> wrote:

> Hi there,
>
> I have a CDH cluster set up, and I tried using the Spark parcel that comes
> with Cloudera Manager, but it turned out it doesn't even have the
> run-example shell command in the bin folder. So I removed it from the
> cluster, cloned incubator-spark onto the name node of my cluster, and built
> from source there successfully with everything at the defaults.
>
> I ran a few examples and everything seems to work fine in local mode. Now I
> am thinking about scaling it out to my cluster, which is what the
> "DISTRIBUTE + ACTIVATE" command does in Cloudera Manager. I want to add all
> the datanodes as slaves, and I think I should run Spark in standalone mode.
>
> Say I am trying to set up Spark in standalone mode following these
> instructions:
> https://spark.incubator.apache.org/docs/latest/spark-standalone.html
> However, it says "Once started, the master will print out a
> spark://HOST:PORT URL for itself, which you can use to connect workers to
> it, or pass as the “master” argument to SparkContext. You can also find
> this URL on the master’s web UI, which is http://localhost:8080 by
> default."
>
> After I started the master, no URL was printed on the screen, and the web
> UI isn't running either.
> Here is the output:
> [root@box incubator-spark]# ./sbin/start-master.sh
> starting org.apache.spark.deploy.master.Master, logging to
> /root/bwang_spark_new/incubator-spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-box.out
>
> First question: am I even in the ballpark running Spark in standalone mode
> if I want to fully utilize my cluster? I saw there are four ways to launch
> Spark on a cluster: Amazon EC2, standalone mode, Apache Mesos, and Hadoop
> YARN... I guess standalone mode is the way to go?
>
> Second question: how do I get the Spark URL of the cluster, and why is the
> output not like what the instructions say?
>
> Best regards,
>
> Bin
>
