Re: Running Spark on Gateway - Connecting to Resource Manager Retries

2015-04-20 Thread Fernando O.
I'm experiencing the same issue with Spark 1.3.1.

I verified that Hadoop works (i.e., by running Hadoop's pi example).

The Hadoop conf directory appears to be on the classpath
(/opt/test/service/hadoop/etc/hadoop).

SPARK_PRINT_LAUNCH_COMMAND=1 ./bin/spark-shell --master yarn-client
Spark Command: /usr/lib/jvm/jre/bin/java -cp
/opt/test/service/spark/conf:/opt/test/service/spark/assembly/target/scala-2.11/spark-assembly-1.3.1-hadoop2.6.0.jar:/opt/test/service/spark/lib_managed/jars/datanucleus-core-3.2.10.jar:/opt/test/service/spark/lib_managed/jars/datanucleus-rdbms-3.2.9.jar:/opt/test/service/spark/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar:/opt/test/service/hadoop/etc/hadoop
-XX:MaxPermSize=128m -Dscala.usejavacp=true -Xms512m -Xmx512m
org.apache.spark.deploy.SparkSubmit --class org.apache.spark.repl.Main
--master yarn-client spark-shell

15/04/20 19:39:11 INFO yarn.Client:
 client token: N/A
 diagnostics: N/A
 ApplicationMaster host: N/A
 ApplicationMaster RPC port: -1
 queue: default
 start time: 1429558750744
 final status: UNDEFINED
 tracking URL:
http://namenode-01.test.xxx.com:8088/proxy/application_1429543348669_0014/
 user: nobody


I do have Hadoop running in HA mode.

When I check the Hadoop logs, I also see:


15/04/20 19:39:15 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

...

Could it be the same issue?
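The 8030 target is telling: 0.0.0.0:8030 is the compiled-in default for the ResourceManager scheduler address, so the retries suggest yarn-site.xml is never actually being read, despite the conf directory appearing on the classpath. Below is a simplified sketch of how an HA-aware client resolves the scheduler address. This is not Hadoop's actual resolution code (the real client also handles failover); the property names are the standard YARN keys, and the hostname is just a placeholder.

```python
# Simplified illustration of YARN scheduler-address resolution.
# If yarn-site.xml is never loaded, every lookup misses and the
# hard-coded default 0.0.0.0:8030 is used -- matching the log above.

DEFAULT_SCHEDULER_ADDRESS = "0.0.0.0:8030"

def resolve_scheduler_address(conf: dict) -> str:
    """Return the scheduler address, honouring HA settings when present."""
    if conf.get("yarn.resourcemanager.ha.enabled") == "true":
        # In HA mode each RM has a per-id address, e.g.
        # yarn.resourcemanager.scheduler.address.rm1
        rm_ids = conf.get("yarn.resourcemanager.ha.rm-ids", "").split(",")
        for rm_id in filter(None, rm_ids):
            addr = conf.get(f"yarn.resourcemanager.scheduler.address.{rm_id}")
            if addr:
                return addr
    return conf.get("yarn.resourcemanager.scheduler.address",
                    DEFAULT_SCHEDULER_ADDRESS)

# An empty configuration (yarn-site.xml never loaded) falls back to the default:
print(resolve_scheduler_address({}))  # 0.0.0.0:8030
print(resolve_scheduler_address({
    "yarn.resourcemanager.ha.enabled": "true",
    "yarn.resourcemanager.ha.rm-ids": "rm1,rm2",
    "yarn.resourcemanager.scheduler.address.rm1": "namenode-01.test.xxx.com:8030",
}))
```

So even with HA correctly configured cluster-side, a client that cannot see yarn-site.xml behaves exactly as these logs show.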




Re: Running Spark on Gateway - Connecting to Resource Manager Retries

2015-04-15 Thread Vineet Mishra
Hi Akhil,

It runs fine from the Namenode (RM) host but fails from the Gateway. If I
add the hadoop-core jars to the Hadoop
directory (/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hadoop/) it
works fine.

It's really strange: I am running the job through spark-submit, and it
works fine via the NameNode but fails from the Gateway, even though both
have the same classpath.

Has anyone tried running Spark from a Gateway node?

Looking forward to a quick reply!

Thanks,




Re: Running Spark on Gateway - Connecting to Resource Manager Retries

2015-04-14 Thread Akhil Das
Make sure your YARN ResourceManager is listening on port 8032.

Thanks
Best Regards
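A quick way to check this from the gateway host is a plain TCP connect to the ResourceManager port. A minimal sketch (the host below is a placeholder for your actual RM address):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Replace with your ResourceManager's host before running:
# print(port_open("namenode-01.test.xxx.com", 8032))
```

If this returns False from the gateway but True from the NameNode host, the problem is network reachability or a client pointed at the wrong address, not Spark itself.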



Re: Running Spark on Gateway - Connecting to Resource Manager Retries

2015-04-14 Thread Neal Yin
Your YARN access is not configured: 0.0.0.0:8032 is the default YARN
ResourceManager address. I'd guess you don't have yarn-site.xml on your
classpath.

-Neal
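To make the diagnosis concrete: the client reads yarn.resourcemanager.address from a yarn-site.xml found on the classpath; if the file is absent, it silently uses the compiled-in default 0.0.0.0:8032. A sketch of that lookup, with a minimal config file (the hostname is a placeholder, and this is a simplification of Hadoop's Configuration class, which merges several *-site.xml files):

```python
import xml.etree.ElementTree as ET

# A minimal yarn-site.xml as the gateway host would need it.
YARN_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>namenode-01.test.xxx.com:8032</value>
  </property>
</configuration>
"""

def load_hadoop_conf(xml_text: str) -> dict:
    """Parse a Hadoop-style *-site.xml into a name -> value dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.iter("property")}

conf = load_hadoop_conf(YARN_SITE)
# Without this file on the classpath the lookup misses and the
# compiled-in default 0.0.0.0:8032 is used -- the retry target above.
print(conf.get("yarn.resourcemanager.address", "0.0.0.0:8032"))
```

Putting the real yarn-site.xml (or pointing HADOOP_CONF_DIR/YARN_CONF_DIR at it) on the gateway is the usual fix.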



From: Vineet Mishra <clearmido...@gmail.com>
Date: Tuesday, April 14, 2015 at 12:05 AM
To: "user@spark.apache.org" <user@spark.apache.org>, "cdh-u...@cloudera.org" <cdh-u...@cloudera.org>
Subject: Running Spark on Gateway - Connecting to Resource Manager Retries

Hi Team,

I am running the Spark word count example (https://github.com/sryza/simplesparkapp).
With the master set to local it works fine.

But when I change the master to yarn, it ends with retries connecting to the
ResourceManager (log excerpt below):

15/04/14 11:31:57 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/04/14 11:31:58 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/04/14 11:31:59 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

If I run the same command from the Namenode instance, it ends with an
ArrayIndexOutOfBoundsException (stack trace below):

15/04/14 11:38:44 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
at com.cloudera.sparkwordcount.SparkWordCount$.main(SparkWordCount.scala:28)
at com.cloudera.sparkwordcount.SparkWordCount.main(SparkWordCount.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Looking forward to getting this resolved so it works on the respective nodes.

Thanks,
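A note on the second failure: ArrayIndexOutOfBoundsException: 1 at SparkWordCount.scala:28 means the application indexed the second command-line argument (index 1) that was presumably never passed to spark-submit, so this is an application-argument problem, separate from the RM connectivity one. A Python analog of the guard the app is missing (the actual app is Scala, and the argument names are assumptions based on the sryza example, which takes an input path and a minimum count):

```python
import sys

def parse_args(argv):
    """Guard argument access instead of indexing blindly.

    Indexing argv[1] without checking raises IndexError (Python's
    analog of Scala's ArrayIndexOutOfBoundsException) when the second
    argument is missing; failing with a usage message is clearer.
    """
    if len(argv) < 2:
        raise SystemExit("usage: sparkwordcount <input-path> <min-count>")
    return argv[0], int(argv[1])

# Typical invocation would pass both arguments, e.g.:
# parse_args(sys.argv[1:])
```

The immediate fix is simply to supply both application arguments after the jar name in the spark-submit command line.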