[
https://issues.apache.org/jira/browse/SPARK-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536820#comment-14536820
]
Zoltán Zvara edited comment on SPARK-7504 at 5/9/15 6:36 PM:
-------------------------------------------------------------
No, I did not use {{spark-submit}}; I ran a Spark program directly with {{java}}
(from IntelliJ, basically), as I said a few comments earlier.
The problem from the user's point of view:
{{SparkContext}} provides a user-facing API that can be configured with a
{{SparkConf}} carrying a valid parameter named {{spark.master}} with a valid
value, {{yarn-cluster}}. If I use {{SparkContext}} with this configuration and
define a valid chain of RDDs, then compile and run my Java code, it simply
blows up. For example:
- pick any Spark example,
- set {{spark.master}} to {{yarn-cluster}},
- run your code with Java.
Instead of blowing up with a {{NullPointerException}}, Spark should warn users
that they cannot deploy their application that way, because it is not
supported.
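The steps above can be sketched as follows. This is a minimal illustration (the class name and RDD contents are hypothetical, not from the issue), showing a {{SparkContext}} created directly from a JVM process with {{spark.master}} set to {{yarn-cluster}}, without going through {{spark-submit}} or {{org.apache.spark.deploy.yarn.Client}}:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

import java.util.Arrays;

// Hypothetical repro class, as described in the comment above.
public class YarnClusterRepro {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("yarn-cluster-repro")
                // A valid-looking value, but unsupported when the context
                // is created directly rather than submitted via Client.
                .setMaster("yarn-cluster");

        // Blows up (reportedly with a NullPointerException) instead of
        // failing with a clear "this deployment mode is unsupported" error.
        JavaSparkContext sc = new JavaSparkContext(conf);
        sc.parallelize(Arrays.asList(1, 2, 3)).count();
        sc.stop();
    }
}
```

Running this requires Spark on the classpath and a YARN configuration, so it is a sketch of the reported failure mode rather than a standalone test.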
> NullPointerException when initializing SparkContext in YARN-cluster mode
> ------------------------------------------------------------------------
>
> Key: SPARK-7504
> URL: https://issues.apache.org/jira/browse/SPARK-7504
> Project: Spark
> Issue Type: Bug
> Components: Deploy, YARN
> Reporter: Zoltán Zvara
> Labels: deployment, yarn, yarn-client
>
> It is not clear to most users that, while running Spark on YARN, a
> {{SparkContext}} with a given execution plan can run locally as
> {{yarn-client}}, but cannot deploy itself to the cluster. Deployment is
> currently performed using {{org.apache.spark.deploy.yarn.Client}}.
> {color:gray} I think we should support deployment through {{SparkContext}},
> but this is not the point I wish to make here. {color}
> Configuring a {{SparkContext}} to deploy itself currently yields an
> {{ERROR}} while accessing {{spark.yarn.app.id}} in
> {{YarnClusterSchedulerBackend}}, and after that a {{NullPointerException}}
> while referencing the {{ApplicationMaster}} instance.
> Spark should clearly inform the user that it might be running in
> {{yarn-cluster}} mode without a proper submission through {{Client}}, and
> that deploying directly from {{SparkContext}} is not supported.