Yes, I believe you are correct.

For the build, you may need to specify the HDP-specific version of Hadoop
via -Dhadoop.version=????.  I went with the default 2.6.0, but Hortonworks
may have a vendor-specific version that needs to go here.  I saw a similar
post today where the solution was to use -Dhadoop.version=2.5.0-cdh5.3.2,
but that was for a Cloudera installation. I am not sure what the HDP
version to put here would be.
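
For example, the build invocation would then look something like this (the
HDP version string below is just a placeholder; I don't know the real one):

./make-distribution.sh --name hadoop2.6 --tgz \
  -Pyarn -Phadoop-2.4 -Dhadoop.version=<hdp-hadoop-version> \
  -Phive -Phive-thriftserver -DskipTests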

-Todd

On Wed, Mar 18, 2015 at 12:49 AM, Bharath Ravi Kumar <reachb...@gmail.com>
wrote:

> Hi Todd,
>
> Yes, those entries were present in the conf under the same SPARK_HOME that
> was used to run spark-submit. On a related note, I'm assuming that the
> additional Spark YARN options (like spark.yarn.jar) need to be set in the
> same properties file that is passed to spark-submit. Beyond that, I assume
> that no other host on the cluster should require a deployment of the Spark
> distribution or any other config change to support a Spark job.
> Isn't that correct?
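>
> For reference, this is roughly the properties file I have in mind (the
> assembly path below is just a placeholder from my setup):
>
> spark.yarn.jar hdfs:///apps/spark/spark-assembly-1.2.1-hadoop2.6.0.jar
> spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041
> spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041
>
> passed to spark-submit via --properties-file.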
>
> On Tue, Mar 17, 2015 at 6:19 PM, Todd Nist <tsind...@gmail.com> wrote:
>
>> Hi Bharath,
>>
>> Do you have these entries in your $SPARK_HOME/conf/spark-defaults.conf
>> file?
>>
>> spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041
>> spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041
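>>
>> Equivalently, these can be passed on the command line with spark-submit's
>> --conf flag, e.g. (same HDP build number as above; adjust to your install):
>>
>> spark-submit --conf "spark.driver.extraJavaOptions=-Dhdp.version=2.2.0.0-2041" \
>>   --conf "spark.yarn.am.extraJavaOptions=-Dhdp.version=2.2.0.0-2041" ...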
>>
>>
>> On Tue, Mar 17, 2015 at 1:04 AM, Bharath Ravi Kumar <reachb...@gmail.com>
>> wrote:
>>
>>> Still no luck running a purpose-built 1.3 distribution against HDP 2.2
>>> after following all the instructions. Has anyone else faced this issue?
>>>
>>> On Mon, Mar 16, 2015 at 8:53 PM, Bharath Ravi Kumar <reachb...@gmail.com
>>> > wrote:
>>>
>>>> Hi Todd,
>>>>
>>>> Thanks for the help. I'll try again after building a distribution with
>>>> the 1.3 sources. However, I wanted to confirm what I mentioned earlier:
>>>> is it sufficient to copy the distribution only to the client host from
>>>> which spark-submit is invoked (with spark.yarn.jar set), or does the
>>>> entire distribution need to be pre-deployed on every host in the YARN
>>>> cluster? I'd assume that the latter shouldn't be necessary.
>>>>
>>>> On Mon, Mar 16, 2015 at 8:38 PM, Todd Nist <tsind...@gmail.com> wrote:
>>>>
>>>>> Hi Bharath,
>>>>>
>>>>> I ran into the same issue a few days ago; here is a link to a post on
>>>>> the Hortonworks forum:
>>>>> http://hortonworks.com/community/forums/search/spark+1.2.1/
>>>>>
>>>>> In case anyone else needs to do this, these are the steps I took to
>>>>> get it to work with Spark 1.2.1 as well as Spark 1.3.0-RC3:
>>>>>
>>>>> 1. Pull the 1.2.1 source.
>>>>> 2. Apply the following patches:
>>>>> a. Address the Jackson version, https://github.com/apache/spark/pull/3938
>>>>> b. Address the propagation of the hdp.version set in
>>>>> spark-defaults.conf, https://github.com/apache/spark/pull/3409
>>>>> 3. Build with: $SPARK_HOME/make-distribution.sh --name hadoop2.6 --tgz
>>>>> -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver
>>>>> -DskipTests
>>>>>
>>>>> Then deploy the resulting artifact, spark-1.2.1-bin-hadoop2.6.tgz,
>>>>> following the instructions in the HDP Spark preview:
>>>>> http://hortonworks.com/hadoop-tutorial/using-apache-spark-hdp/
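>>>>>
>>>>> If you only copy the tarball to the client host, you can point the
>>>>> cluster at the assembly via spark.yarn.jar, e.g. something like this
>>>>> (the HDFS path and app class here are just examples):
>>>>>
>>>>> hdfs dfs -put lib/spark-assembly-1.2.1-hadoop2.6.0.jar /apps/spark/
>>>>> spark-submit --master yarn-client \
>>>>>   --conf spark.yarn.jar=hdfs:///apps/spark/spark-assembly-1.2.1-hadoop2.6.0.jar \
>>>>>   --class com.example.MyApp my-app.jar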
>>>>>
>>>>> FWIW, Spark 1.3.0 appears to work fine with HDP as well, and steps 2a
>>>>> and 2b are not required there.
>>>>>
>>>>> HTH
>>>>>
>>>>> -Todd
>>>>>
>>>>> On Mon, Mar 16, 2015 at 10:13 AM, Bharath Ravi Kumar <
>>>>> reachb...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Trying to run Spark (1.2.1 built for HDP 2.2) against a YARN cluster
>>>>>> results in the AM failing to start with the following error on stderr:
>>>>>>
>>>>>> Error: Could not find or load main class
>>>>>> org.apache.spark.deploy.yarn.ExecutorLauncher
>>>>>>
>>>>>> An application id was assigned to the job, but there were no logs. Note
>>>>>> that the Spark distribution has not been "installed" on every host in
>>>>>> the cluster; the aforementioned Spark build was copied to one of the
>>>>>> Hadoop client hosts in the cluster to launch the job. spark-submit was
>>>>>> run with --master yarn-client, and spark.yarn.jar was set to the
>>>>>> assembly jar from the above distribution. Switching the Spark
>>>>>> distribution to the HDP-recommended version and following the
>>>>>> instructions on this page
>>>>>> <http://hortonworks.com/hadoop-tutorial/using-apache-spark-hdp/> did not
>>>>>> fix the problem either. Any idea what may have caused this error?
>>>>>>
>>>>>> Thanks,
>>>>>> Bharath
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
