Re: Why doesn't the --conf parameter work in yarn-cluster mode (but works in yarn-client and local)?
Hello Sandy,

Thank you for your explanation. Then I would at least expect that behavior to be consistent across local, yarn-client, and yarn-cluster modes, and not lead to the case where it somehow works in two of them but not in the third.

Kind regards,

Emre Sevinç
http://www.bigindustries.be/

On Tue, Mar 24, 2015 at 4:38 PM, Sandy Ryza wrote:

> Ah, yes, I believe this is because only properties prefixed with "spark"
> get passed on. The purpose of the "--conf" option is to allow passing
> Spark properties to the SparkConf, not to add general key-value pairs to
> the JVM system properties.
>
> -Sandy
Re: Why doesn't the --conf parameter work in yarn-cluster mode (but works in yarn-client and local)?
Ah, yes, I believe this is because only properties prefixed with "spark" get passed on. The purpose of the "--conf" option is to allow passing Spark properties to the SparkConf, not to add general key-value pairs to the JVM system properties.

-Sandy

On Tue, Mar 24, 2015 at 4:25 AM, Emre Sevinc wrote:

> And I think that's expected because the key is an arbitrary one, not
> necessarily a Spark configuration property. This is why I was passing it
> via --conf and retrieving System.getProperty("key") (which worked locally
> and in yarn-client mode but not in yarn-cluster mode).
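A minimal sketch of the workaround this explanation suggests, assuming the key can be renamed: give the arbitrary key a "spark." prefix so spark-submit treats it as a Spark property and forwards it to the driver's SparkConf, even in yarn-cluster mode. The name spark.myapp.key (and the class/jar names) are made up for illustration, not established Spark settings:

    // Hypothetical submission with a "spark."-prefixed key:
    //   spark-submit --master yarn-cluster \
    //     --conf "spark.myapp.key=someValue" \
    //     --class com.example.MyApp myapp.jar

    import org.apache.spark.SparkConf;

    public class ReadCustomConf {
      public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf();
        // The prefixed property is forwarded to the driver's SparkConf,
        // so it can be read back wherever the driver runs:
        String value = sparkConf.get("spark.myapp.key");
        System.out.println("spark.myapp.key ~~~> " + value);
      }
    }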
Re: Why doesn't the --conf parameter work in yarn-cluster mode (but works in yarn-client and local)?
Hello Sandy,

Your suggestion does not work when I try it locally.

When I pass

    --conf "key=someValue"

and then try to retrieve it like this:

    SparkConf sparkConf = new SparkConf();
    logger.info("* * * key ~~~> {}", sparkConf.get("key"));

I get:

    Exception in thread "main" java.util.NoSuchElementException: key

And I think that's expected, because the key is an arbitrary one, not necessarily a Spark configuration property. This is why I was passing it via --conf and retrieving it with System.getProperty("key"), which worked locally and in yarn-client mode, but not in yarn-cluster mode. I'm surprised that I can't use it on the cluster when I can use it during local development and testing.

Kind regards,

Emre Sevinç
http://www.bigindustries.be/

On Mon, Mar 23, 2015 at 6:15 PM, Sandy Ryza wrote:

> The --conf option is meant to work with yarn-cluster mode.
> System.getProperty("key") isn't guaranteed to work, but new
> SparkConf().get("key") should. Does it not?
>
> -Sandy
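As an aside on the exception itself: SparkConf.get(key) throws java.util.NoSuchElementException when the key is not set. A small sketch of the same lookup done defensively — this does not make an arbitrary key reach SparkConf, it only avoids the exception when the key is absent:

    SparkConf sparkConf = new SparkConf();

    // The two-argument get(...) returns a fallback instead of throwing
    // java.util.NoSuchElementException when the key is absent:
    String value = sparkConf.get("key", "<not set>");
    logger.info("* * * key ~~~> {}", value);

    // contains(...) likewise tests for the key without throwing:
    if (sparkConf.contains("key")) {
      logger.info("* * * key is present in the SparkConf");
    }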
Re: Why doesn't the --conf parameter work in yarn-cluster mode (but works in yarn-client and local)?
Hi Emre,

The --conf option is meant to work with yarn-cluster mode. System.getProperty("key") isn't guaranteed to work, but new SparkConf().get("key") should. Does it not?

-Sandy

On Mon, Mar 23, 2015 at 8:39 AM, Emre Sevinc wrote:

> Why can't I retrieve the value of a key given as --conf "key=value" when
> I submit my Spark application in *yarn-cluster* mode?
Why doesn't the --conf parameter work in yarn-cluster mode (but works in yarn-client and local)?
Hello,

According to the Spark documentation at https://spark.apache.org/docs/1.2.1/submitting-applications.html :

    --conf: Arbitrary Spark configuration property in key=value format.
    For values that contain spaces wrap “key=value” in quotes (as shown).

And indeed, when I use that parameter, I can retrieve the value of the key in my Spark program by using:

    System.getProperty("key");

This works when I test my program locally, and also in yarn-client mode: I can log the value of the key and see that it matches what I wrote on the command line. But it returns *null* when I submit the very same program in *yarn-cluster* mode.

Why can't I retrieve the value of a key given as --conf "key=value" when I submit my Spark application in *yarn-cluster* mode?

Any ideas and/or workarounds?

--
Emre Sevinç
http://www.bigindustries.be/
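For concreteness, a sketch of the submission and retrieval being described, with made-up class and jar names. In local and yarn-client mode the driver runs inside the spark-submit JVM, so a system property set there is visible to the program; in yarn-cluster mode the driver runs in the YARN ApplicationMaster on a cluster node, which would explain the *null*:

    // Hypothetical submission; "key" is an arbitrary, non-Spark property name:
    //   spark-submit --master yarn-cluster \
    //     --conf "key=someValue" \
    //     --class com.example.MyApp myapp.jar

    // Inside the application:
    String value = System.getProperty("key");
    // "someValue" in local and yarn-client mode,
    // but null in yarn-cluster mode, as reported above.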