Hi Chester,

Thank you very much, it is clear now - just two different ways to support
Spark on a cluster.

Thank you,
Konstantin Kudryavtsev


On Mon, Jul 7, 2014 at 3:22 PM, Chester @work <ches...@alpinenow.com> wrote:

> In YARN cluster mode, you can either have Spark installed on all the
> cluster nodes or supply the Spark jar yourself. In the second case, you
> don't need to install Spark on the cluster at all, since you submit the
> Spark assembly as well as your app jar together.
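>
> For example, something along these lines (just a sketch - paths and memory
> sizes are placeholders, and it assumes Spark 1.0 is only unpacked on the
> machine you submit from):
>
>   # spark-submit uploads the Spark assembly and your app jar with the
>   # application, so the worker nodes need no Spark installation of their own
>   ./bin/spark-submit \
>     --class org.apache.spark.examples.SparkPi \
>     --master yarn-cluster \
>     --num-executors 3 \
>     --executor-memory 2g \
>     ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 10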
>
> I hope this makes it clear.
>
> Chester
>
> Sent from my iPhone
>
> On Jul 7, 2014, at 5:05 AM, Konstantin Kudryavtsev <
> kudryavtsev.konstan...@gmail.com> wrote:
>
> thank you Krishna!
>
>  Could you please explain why I need to install Spark on each node, when the
> official Spark site says: "If you have a Hadoop 2 cluster, you can run
> Spark without any installation needed"?
>
> I have HDP 2 (YARN), and that's why I hope I don't need to install Spark on
> each node.
>
> Thank you,
> Konstantin Kudryavtsev
>
>
> On Mon, Jul 7, 2014 at 1:57 PM, Krishna Sankar <ksanka...@gmail.com>
> wrote:
>
>> Konstantin,
>>
>>    1. You need to install the Hadoop RPMs on all nodes. If it is Hadoop
>>    2, the nodes would have HDFS & YARN.
>>    2. Then you need to install Spark on all nodes. I haven't had
>>    experience with HDP, but the tech preview might have installed Spark as
>>    well.
>>    3. In the end, one should have HDFS, YARN & Spark installed on all the
>>    nodes.
>>    4. After installation, check the web console to make sure HDFS, YARN
>>    & Spark are running (a few example checks are sketched below).
>>    5. Then you are ready to start experimenting with/developing Spark
>>    applications.
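>>
>> For instance, a rough sanity check from a shell on one of the nodes
>> (assuming a standard Hadoop 2 layout; the ports are the usual defaults and
>> may differ in your distribution):
>>
>>   jps                     # should show NameNode/DataNode, ResourceManager/NodeManager
>>   hdfs dfsadmin -report   # HDFS capacity and live datanodes
>>   yarn node -list         # NodeManagers registered with the ResourceManager
>>   # web UIs are typically http://<namenode>:50070 and http://<resourcemanager>:8088
>>
>> For Spark itself, running one of the bundled examples is the simplest check.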
>>
>> HTH.
>> Cheers
>> <k/>
>>
>>
>> On Mon, Jul 7, 2014 at 2:34 AM, Konstantin Kudryavtsev <
>> kudryavtsev.konstan...@gmail.com> wrote:
>>
>>> guys, I'm not talking about running Spark on the VM, I don't have a problem
>>> with that.
>>>
>>> What confuses me is this:
>>> 1) Hortonworks describes the installation process as RPMs on each node
>>> 2) the Spark home page says that everything I need is YARN
>>>
>>> And I'm stuck on understanding what I need to do to run Spark on
>>> YARN (do I need the RPM installations, or only to build Spark on an edge node?)
>>>
>>>
>>> Thank you,
>>> Konstantin Kudryavtsev
>>>
>>>
>>> On Mon, Jul 7, 2014 at 4:34 AM, Robert James <srobertja...@gmail.com>
>>> wrote:
>>>
>>>> I can say from my experience that getting Spark to work with Hadoop 2
>>>> is not for the beginner; after solving one problem after another
>>>> (dependencies, scripts, etc.), I went back to Hadoop 1.
>>>>
>>>> Spark's Maven build, EC2 scripts, and others all default to Hadoop 1 - I'm
>>>> not sure why, but given that, Hadoop 2 has too many bumps.
>>>>
>>>> On 7/6/14, Marco Shaw <marco.s...@gmail.com> wrote:
>>>> > That is confusing based on the context you provided.
>>>> >
>>>> > This might take more time than I can spare to try to understand.
>>>> >
>>>> > For sure, you need to add Spark to run it in/on the HDP 2.1 express VM.
>>>> >
>>>> > Cloudera's CDH 5 express VM includes Spark, but the service isn't
>>>> > running by default.
>>>> >
>>>> > I can't remember for MapR...
>>>> >
>>>> > Marco
>>>> >
>>>> >> On Jul 6, 2014, at 6:33 PM, Konstantin Kudryavtsev
>>>> >> <kudryavtsev.konstan...@gmail.com> wrote:
>>>> >>
>>>> >> Marco,
>>>> >>
>>>> >> Hortonworks provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that
>>>> >> you can try from
>>>> >> http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
>>>> >> HDP 2.1 means YARN, yet at the same time they propose to install an RPM.
>>>> >>
>>>> >> On the other hand, http://spark.apache.org/ says "
>>>> >> Integrated with Hadoop
>>>> >> Spark can run on Hadoop 2's YARN cluster manager, and can read any
>>>> >> existing Hadoop data.
>>>> >>
>>>> >> If you have a Hadoop 2 cluster, you can run Spark without any
>>>> >> installation needed. "
>>>> >>
>>>> >> And this is confusing for me... do I need the RPM installation or not?...
>>>> >>
>>>> >>
>>>> >> Thank you,
>>>> >> Konstantin Kudryavtsev
>>>> >>
>>>> >>
>>>> >>> On Sun, Jul 6, 2014 at 10:56 PM, Marco Shaw <marco.s...@gmail.com>
>>>> >>> wrote:
>>>> >>> Can you provide links to the sections that are confusing?
>>>> >>>
>>>> >>> My understanding is that the HDP1 binaries do not need YARN, while the
>>>> >>> HDP2 binaries do.
>>>> >>>
>>>> >>> Now, you can also install the Hortonworks Spark RPM...
>>>> >>>
>>>> >>> For production, in my opinion, RPMs are better for manageability.
>>>> >>>
>>>> >>>> On Jul 6, 2014, at 5:39 PM, Konstantin Kudryavtsev
>>>> >>>> <kudryavtsev.konstan...@gmail.com> wrote:
>>>> >>>>
>>>> >>>> Hello, thanks for your message... I'm confused: Hortonworks suggests
>>>> >>>> installing the Spark RPM on each node, but the Spark main page says
>>>> >>>> that YARN is enough and I don't need to install it... What is the
>>>> >>>> difference?
>>>> >>>>
>>>> >>>> sent from my HTC
>>>> >>>>
>>>> >>>>> On Jul 6, 2014 8:34 PM, "vs" <vinayshu...@gmail.com> wrote:
>>>> >>>>> Konstantin,
>>>> >>>>>
>>>> >>>>> HWRK provides a Tech Preview of Spark 0.9.1 with HDP 2.1 that you
>>>> >>>>> can try from
>>>> >>>>> http://hortonworks.com/wp-content/uploads/2014/05/SparkTechnicalPreview.pdf
>>>> >>>>>
>>>> >>>>> Let me know if you see issues with the tech preview.
>>>> >>>>>
>>>> >>>>> "spark PI example on HDP 2.0
>>>> >>>>>
>>>> >>>>> I downloaded spark 1.0 pre-build from
>>>> >>>>> http://spark.apache.org/downloads.html
>>>> >>>>> (for HDP2)
>>>> >>>>> The run example from spark web-site:
>>>> >>>>> ./bin/spark-submit --class org.apache.spark.examples.SparkPi
>>>> >>>>> --master
>>>> >>>>> yarn-cluster --num-executors 3 --driver-memory 2g
>>>> --executor-memory 2g
>>>> >>>>> --executor-cores 1 ./lib/spark-examples-1.0.0-hadoop2.2.0.jar 2
>>>> >>>>>
>>>> >>>>> I got error:
>>>> >>>>> Application application_1404470405736_0044 failed 3 times due to
>>>> AM
>>>> >>>>> Container for appattempt_1404470405736_0044_000003 exited with
>>>> >>>>> exitCode: 1
>>>> >>>>> due to: Exception from container-launch:
>>>> >>>>> org.apache.hadoop.util.Shell$ExitCodeException:
>>>> >>>>> at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
>>>> >>>>> at org.apache.hadoop.util.Shell.run(Shell.java:379)
>>>> >>>>> at
>>>> >>>>>
>>>> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
>>>> >>>>> at
>>>> >>>>>
>>>> org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>>>> >>>>> at
>>>> >>>>>
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
>>>> >>>>> at
>>>> >>>>>
>>>> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>>>> >>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>> >>>>> at
>>>> >>>>>
>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>> >>>>> at
>>>> >>>>>
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>> >>>>> at java.lang.Thread.run(Thread.java:744)
>>>> >>>>> .Failing this attempt.. Failing the application.
>>>> >>>>>
>>>> >>>>> Unknown/unsupported param List(--executor-memory, 2048,
>>>> >>>>> --executor-cores, 1, --num-executors, 3)
>>>> >>>>> Usage: org.apache.spark.deploy.yarn.ApplicationMaster [options]
>>>> >>>>> Options:
>>>> >>>>>   --jar JAR_PATH       Path to your application's JAR file (required)
>>>> >>>>>   --class CLASS_NAME   Name of your application's main class (required)
>>>> >>>>> ...bla-bla-bla
>>>> >>>>> "
>>>> >>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> --
>>>> >>>>> View this message in context:
>>>> >>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-run-Spark-1-0-SparkPi-on-HDP-2-0-tp8802p8873.html
>>>> >>>>> Sent from the Apache Spark User List mailing list archive at
>>>> >>>>> Nabble.com.
>>>> >>
>>>> >
>>>>
>>>
>>>
>>
>
