Re: Config properties broken in master

2014-01-18 Thread Mridul Muralidharan
Hi,

  Speculation was an example; there are others in Spark that are
affected by this ...
Some of them have been around for a while, so renaming will break existing code/scripts.

Regards,
Mridul

On Sun, Jan 19, 2014 at 12:51 AM, Nan Zhu zhunanmcg...@gmail.com wrote:
 change spark.speculation to spark.speculation.switch?

 maybe we can require that all properties in Spark be three levels deep


 On Sat, Jan 18, 2014 at 2:10 PM, Mridul Muralidharan mri...@gmail.com wrote:

 Hi,

   Unless I am mistaken, the change to using typesafe ConfigFactory has
 broken some of the system properties we use in spark.

 For example: if we have both
 -Dspark.speculation=true -Dspark.speculation.multiplier=0.95
 set, then the spark.speculation property is dropped.

 The rules of parseProperties actually document this behavior clearly [1]


 I am not sure what the right fix here would be (other than replacing
 the use of Typesafe Config, that is).

 Any thoughts ?
 I would vote -1 for 0.9 to be released before this is fixed.


 Regards,
 Mridul


 [1]
 http://typesafehub.github.io/config/latest/api/com/typesafe/config/ConfigFactory.html#parseProperties%28java.util.Properties,%20com.typesafe.config.ConfigParseOptions%29



Re: Config properties broken in master

2014-01-18 Thread Reynold Xin
I also just went over the config options to see how pervasive this is. In
addition to speculation, there is one more conflict of this kind:

spark.locality.wait
spark.locality.wait.node
spark.locality.wait.process
spark.locality.wait.rack


spark.speculation
spark.speculation.interval
spark.speculation.multiplier
spark.speculation.quantile
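To make the collision concrete, here is a toy sketch (plain Python, not the actual Typesafe Config library; the helper name `to_tree` is made up for illustration) of why a tree-shaped config model cannot keep a key that is both a leaf value and a prefix of other keys:

```python
# Toy model of a hierarchical config: flat dotted keys are folded into a
# nested dict, the way a path-based config library views properties.
def to_tree(props):
    tree = {}
    for key, value in props.items():
        parts = key.split(".")
        node = tree
        for part in parts[:-1]:
            # If an intermediate path segment already holds a leaf value,
            # it is silently replaced by a subtree -- this is where a key
            # like spark.speculation gets dropped.
            if not isinstance(node.get(part), dict):
                node[part] = {}
            node = node[part]
        node[parts[-1]] = value
    return tree

tree = to_tree({
    "spark.speculation": "true",
    "spark.speculation.multiplier": "0.95",
})
print(tree["spark"]["speculation"])  # {'multiplier': '0.95'} -- 'true' is gone
```

The same structural conflict applies to every parent/child pair listed above; the exact resolution rules of the real library are documented at the parseProperties link in [1].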


On Sat, Jan 18, 2014 at 11:36 AM, Matei Zaharia matei.zaha...@gmail.com wrote:

 This is definitely an important issue to fix. Instead of renaming
 properties, one solution would be to replace Typesafe Config with just
 reading Java system properties, and disable config files for this release.
 I kind of like that over renaming.

 Matei





Re: Config properties broken in master

2014-01-18 Thread Mark Hamstra
Really?  Disabling config files seems to me to be a bigger/more onerous
change for users than renaming spark.speculation=true|false =>
spark.speculation.enabled=true|false and spark.locality.wait =>
spark.locality.wait.default.






Re: Config properties broken in master

2014-01-18 Thread Matei Zaharia
We can add config files in a later release. They were never officially 
released, and were only in master for about a month.

One other thing to note is that the config file feature is kind of limited 
anyway. Users will want to have a separate config file with each app, which 
they have to ship with its classpath, and there’s no great way of merging them 
in the current setup. I’m not actually sure it’s a feature we want to support, 
compared to say just a SparkConf.fromFile method that reads a Java Properties 
file.
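As a rough sketch of that alternative (SparkConf.fromFile is only a proposal in this thread, not an existing API; this Python stand-in simplifies the Java Properties format to `key=value` lines, ignoring the `:` separator and escape rules the real format also allows):

```python
# Minimal sketch of reading flat Properties-style key/value pairs.
# Because the keys stay flat, a key and its dotted "children" coexist
# without the leaf-vs-prefix conflict a hierarchical config model has.
def parse_properties(lines):
    conf = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # blank lines and comments, per the Properties format
        key, _, value = line.partition("=")
        conf[key.strip()] = value.strip()
    return conf

text = """\
# speculative execution settings
spark.speculation=true
spark.speculation.multiplier=0.95
"""
conf = parse_properties(text.splitlines())
print(conf["spark.speculation"])             # true
print(conf["spark.speculation.multiplier"])  # 0.95
```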

Matei




Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc1)

2014-01-18 Thread Patrick Wendell
Mridul, thanks a *lot* for pointing this out. This is indeed an issue
and something which warrants cutting a new RC.

- Patrick

On Sat, Jan 18, 2014 at 11:14 AM, Mridul Muralidharan mri...@gmail.com wrote:
 I would vote -1 for this release until we resolve the config property
 issue [1]: if there is a known resolution for this (which I could not
 find, unfortunately; apologies if it exists!), then I will change my
 vote.

 Thanks,
 Mridul


 [1] 
 http://apache-spark-developers-list.1001551.n3.nabble.com/Config-properties-broken-in-master-td208.html

 On Thu, Jan 16, 2014 at 7:18 AM, Patrick Wendell pwend...@gmail.com wrote:
 Please vote on releasing the following candidate as Apache Spark
 (incubating) version 0.9.0.

 A draft of the release notes along with the changes file is attached
 to this e-mail.

 The tag to be voted on is v0.9.0-incubating (commit 7348893):
 https://git-wip-us.apache.org/repos/asf?p=incubator-spark.git;a=commit;h=7348893f0edd96dacce2f00970db1976266f7008

 The release files, including signatures, digests, etc can be found at:
 http://people.apache.org/~pwendell/spark-0.9.0-incubating-rc1/

 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/pwendell.asc

 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1001/

 The documentation corresponding to this release can be found at:
 http://people.apache.org/~pwendell/spark-0.9.0-incubating-rc1-docs/

 Please vote on releasing this package as Apache Spark 0.9.0-incubating!

 The vote is open until Sunday, January 19, at 02:00 UTC
 and passes if a majority of at least 3 +1 PPMC votes are cast.

 [ ] +1 Release this package as Apache Spark 0.9.0-incubating
 [ ] -1 Do not release this package because ...

 To learn more about Apache Spark, please see
 http://spark.incubator.apache.org/


Re: Config properties broken in master

2014-01-18 Thread Mridul Muralidharan
IMO we should shoot for more stable interfaces and not break them just
to work around bugs - unless the benefit of breaking compatibility is
offset by the added functionality.
Since I was not around for a while, I am not sure how much config file
feature was requested ...

Regards,
Mridul



Re: Config properties broken in master

2014-01-18 Thread Matei Zaharia
Yeah, this is exactly my reasoning as well.

Matei




Re: Config properties broken in master

2014-01-18 Thread Mark Hamstra
Hah!  Stupid English language -- by "fixed" I mean "established/stabilized,"
not "repaired."


On Sat, Jan 18, 2014 at 12:42 PM, Mark Hamstra m...@clearstorydata.com wrote:

 Yeah, I can get on board with that -- gives us another chance to
 re-think/re-work config files to address the limitations Matei mentioned
 before the interface is fixed for 1.0.


 On Sat, Jan 18, 2014 at 12:27 PM, Patrick Wendell pwend...@gmail.com wrote:

 Hey Mark - ya if we did add this I think it would be in the next major
 release.

 On Sat, Jan 18, 2014 at 12:17 PM, Mark Hamstra m...@clearstorydata.com
 wrote:
  That later release should be at least 0.10.0, then, since use of config
  files won't be backward compatible with 0.9.0.
 
 





Should Spark on YARN example include --addJars?

2014-01-18 Thread Sandy Ryza
Hey All,

I ran into an issue when trying to run SparkPi as described in the Spark on
YARN doc.

14/01/18 10:52:09 ERROR spark.SparkContext: Error adding jar
(java.io.FileNotFoundException:
spark-examples-assembly-0.9.0-incubating-SNAPSHOT.jar (No such file or
directory)), was the --addJars option used?

Is addJars not needed here?

Here's the doc:

SPARK_JAR=./assembly/target/scala-2.9.3/spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar \
./spark-class org.apache.spark.deploy.yarn.Client \
  --jar examples/target/scala-2.9.3/spark-examples-assembly-0.8.1-incubating.jar \
  --class org.apache.spark.examples.SparkPi \
  --args yarn-standalone \
  --num-workers 3 \
  --master-memory 4g \
  --worker-memory 2g \
  --worker-cores 1


thanks,
Sandy


Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc1)

2014-01-18 Thread Patrick Wendell
This vote is cancelled in favor of rc2 which I'll post shortly.



Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc2)

2014-01-18 Thread Patrick Wendell
I'll kick off the voting with a +1.

On Sat, Jan 18, 2014 at 11:05 PM, Patrick Wendell pwend...@gmail.com wrote:
 Please vote on releasing the following candidate as Apache Spark
 (incubating) version 0.9.0.

 A draft of the release notes along with the changes file is attached
 to this e-mail.

 The tag to be voted on is v0.9.0-incubating (commit 00c847a):
 https://git-wip-us.apache.org/repos/asf?p=incubator-spark.git;a=commit;h=00c847af1d4be2fe5fad887a57857eead1e517dc

 The release files, including signatures, digests, etc can be found at:
 http://people.apache.org/~pwendell/spark-0.9.0-incubating-rc2/

 Release artifacts are signed with the following key:
 https://people.apache.org/keys/committer/pwendell.asc

 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1003/

 The documentation corresponding to this release can be found at:
 http://people.apache.org/~pwendell/spark-0.9.0-incubating-rc2-docs/

 Please vote on releasing this package as Apache Spark 0.9.0-incubating!

 The vote is open until Wednesday, January 22, at 07:05 UTC
 and passes if a majority of at least 3 +1 PPMC votes are cast.

 [ ] +1 Release this package as Apache Spark 0.9.0-incubating
 [ ] -1 Do not release this package because ...

 To learn more about Apache Spark, please see
 http://spark.incubator.apache.org/