Re: Why doesn't the --conf parameter work in yarn-cluster mode (but works in yarn-client and local)?

2015-03-24 Thread Emre Sevinc
Hello Sandy,

Thank you for your explanation. Then I would at least expect the behavior to
be consistent across local, yarn-client, and yarn-cluster modes (and not lead
to a situation where it somehow works in two of them but not in the third).

Kind regards,

Emre Sevinç
http://www.bigindustries.be/


On Tue, Mar 24, 2015 at 4:38 PM, Sandy Ryza sandy.r...@cloudera.com wrote:

 Ah, yes, I believe this is because only properties prefixed with "spark"
 get passed on.  The purpose of the --conf option is to allow passing
 Spark properties to the SparkConf, not to add general key-value pairs to
 the JVM system properties.

 -Sandy
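
 A minimal sketch of the workaround this implies, assuming a hypothetical
 key named spark.myapp.key, might look like the following:

   import org.apache.spark.SparkConf;

   public class ConfExample {
     public static void main(String[] args) {
       // Assumed submit command, with a spark.-prefixed key so that
       // spark-submit forwards it:
       //   spark-submit --conf spark.myapp.key=someValue ...
       SparkConf sparkConf = new SparkConf();

       // spark.myapp.key is a hypothetical name; any key under the
       // spark. prefix should reach the SparkConf in every deploy mode.
       String value = sparkConf.get("spark.myapp.key");
       System.out.println("spark.myapp.key = " + value);
     }
   }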

 On Tue, Mar 24, 2015 at 4:25 AM, Emre Sevinc emre.sev...@gmail.com
 wrote:

 Hello Sandy,

 Your suggestion does not work when I try it locally:

 When I pass

   --conf key=someValue

 and then try to retrieve it like:

 SparkConf sparkConf = new SparkConf();
 logger.info("* * * key ~~~ {}", sparkConf.get("key"));

 I get

   Exception in thread "main" java.util.NoSuchElementException: key

 And I think that's expected, because the key is an arbitrary one, not
 necessarily a Spark configuration element. This is why I was passing it via
 --conf and retrieving it with System.getProperty("key"), which worked locally
 and in yarn-client mode but not in yarn-cluster mode. I'm surprised that I
 can't use it on the cluster when I can use it during local development and
 testing.

 Kind regards,

 Emre Sevinç
 http://www.bigindustries.be/
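
 For reference, the NoSuchElementException above is what SparkConf.get(key)
 throws for an absent key; a small sketch of an exception-safe lookup, using
 the same hypothetical key name from the thread, could be:

   import org.apache.spark.SparkConf;

   public class SafeConfLookup {
     public static void main(String[] args) {
       SparkConf sparkConf = new SparkConf();

       // get(key) throws NoSuchElementException for an absent key;
       // contains(key) and get(key, defaultValue) do not.
       if (sparkConf.contains("key")) {
         System.out.println("key = " + sparkConf.get("key"));
       } else {
         System.out.println("key was not forwarded into the SparkConf");
       }
     }
   }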



 On Mon, Mar 23, 2015 at 6:15 PM, Sandy Ryza sandy.r...@cloudera.com
 wrote:

 Hi Emre,

 The --conf property is meant to work with yarn-cluster mode.
 System.getProperty("key") isn't guaranteed to work, but new
 SparkConf().get("key") should.  Does it not?

 -Sandy
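
 If a genuine JVM system property is needed on the driver in yarn-cluster
 mode, one route, sketched here as an untested assumption rather than a
 confirmed fix, is the spark.driver.extraJavaOptions property:

   public class DriverSystemProperty {
     public static void main(String[] args) {
       // Assumed submit command:
       //   spark-submit --master yarn-cluster \
       //     --conf "spark.driver.extraJavaOptions=-Dkey=someValue" ...
       // In local and yarn-client modes the driver JVM is already running
       // by the time --conf is parsed, so the --driver-java-options flag
       // (or a plain -D on the launching JVM) plays the same role there.
       String value = System.getProperty("key", "<not set>");
       System.out.println("key = " + value);
     }
   }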

 On Mon, Mar 23, 2015 at 8:39 AM, Emre Sevinc emre.sev...@gmail.com
 wrote:

 Hello,

 According to the Spark documentation at
 https://spark.apache.org/docs/1.2.1/submitting-applications.html :

   --conf: Arbitrary Spark configuration property in key=value format.
 For values that contain spaces wrap “key=value” in quotes (as shown).

 And indeed, when I use that parameter, I can retrieve the value of the
 key in my Spark program by using:

 System.getProperty("key");

 This works when I test my program locally and also in yarn-client mode:
 I can log the value of the key and see that it matches what I wrote on
 the command line. But it returns *null* when I submit the very same
 program in *yarn-cluster* mode.

 Why can't I retrieve the value of the key given as --conf key=value when
 I submit my Spark application in *yarn-cluster* mode?

 Any ideas and/or workarounds?


 --
 Emre Sevinç
 http://www.bigindustries.be/





 --
 Emre Sevinc





-- 
Emre Sevinc

