Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc2)
+1
On Sat, Jan 18, 2014 at 11:11 PM, Patrick Wendell pwend...@gmail.com wrote:
I'll kick off the voting with a +1.
On Sat, Jan 18, 2014 at 11:05 PM, Patrick Wendell pwend...@gmail.com wrote:
Please vote on releasing the following candidate as Apache Spark (incubating) version 0.9.0. A draft of the release notes along with the changes file is attached to this e-mail.
The tag to be voted on is v0.9.0-incubating (commit 00c847a):
https://git-wip-us.apache.org/repos/asf?p=incubator-spark.git;a=commit;h=00c847af1d4be2fe5fad887a57857eead1e517dc
The release files, including signatures, digests, etc., can be found at:
http://people.apache.org/~pwendell/spark-0.9.0-incubating-rc2/
Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc
The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1003/
The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-0.9.0-incubating-rc2-docs/
Please vote on releasing this package as Apache Spark 0.9.0-incubating!
The vote is open until Wednesday, January 22, at 07:05 UTC and passes if a majority of at least 3 +1 PPMC votes are cast.
[ ] +1 Release this package as Apache Spark 0.9.0-incubating
[ ] -1 Do not release this package because ...
To learn more about Apache Spark, please see
http://spark.incubator.apache.org/
Re: Config properties broken in master
Chanced upon spill-related config options which exhibit the same pattern ...
- Mridul
On Sun, Jan 19, 2014 at 1:10 AM, Reynold Xin r...@databricks.com wrote:
I also just went over the config options to see how pervasive this is. In addition to speculation, there is one more conflict of this kind:
spark.locality.wait
spark.locality.wait.node
spark.locality.wait.process
spark.locality.wait.rack
spark.speculation
spark.speculation.interval
spark.speculation.multiplier
spark.speculation.quantile
On Sat, Jan 18, 2014 at 11:36 AM, Matei Zaharia matei.zaha...@gmail.com wrote:
This is definitely an important issue to fix. Instead of renaming properties, one solution would be to replace Typesafe Config with just reading Java system properties, and disable config files for this release. I kind of like that over renaming.
Matei
On Jan 18, 2014, at 11:30 AM, Mridul Muralidharan mri...@gmail.com wrote:
Hi, Speculation was an example; there are others in Spark which are affected by this ... Some of them have been around for a while, so this will break existing code/scripts.
Regards, Mridul
On Sun, Jan 19, 2014 at 12:51 AM, Nan Zhu zhunanmcg...@gmail.com wrote:
Change spark.speculation to spark.speculation.switch? Maybe we can require that all properties in Spark have three levels.
On Sat, Jan 18, 2014 at 2:10 PM, Mridul Muralidharan mri...@gmail.com wrote:
Hi, Unless I am mistaken, the change to using Typesafe ConfigFactory has broken some of the system properties we use in Spark. For example: if we have both -Dspark.speculation=true and -Dspark.speculation.multiplier=0.95 set, then the spark.speculation property is dropped. The rules of parseProperties actually document this clearly [1]. I am not sure what the right fix here would be (other than replacing the use of Config, that is). Any thoughts? I would vote -1 for 0.9 to be released before this is fixed.
Regards, Mridul
[1] http://typesafehub.github.io/config/latest/api/com/typesafe/config/ConfigFactory.html#parseProperties%28java.util.Properties,%20com.typesafe.config.ConfigParseOptions%29
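The conflict Mridul describes can be illustrated with a small, self-contained Python sketch. It is not Typesafe Config's code; it only assumes the behaviour reported in this thread, namely that when flat dotted keys are folded into a tree and a path is both a plain value and the parent of another key, the nested object wins and the plain value is dropped. The helper names `parse_properties` and `lookup` are hypothetical and exist only for this illustration.

```python
def parse_properties(props):
    """Fold flat dotted keys into a nested dict.

    Assumed rule (as reported in the thread): if a path is both a scalar
    and the parent of other keys, the nested object wins.
    """
    root = {}
    for key, value in props.items():
        parts = key.split(".")
        node = root
        for part in parts[:-1]:
            child = node.get(part)
            if not isinstance(child, dict):
                # Either nothing is stored here yet, or a scalar is:
                # in both cases an object takes over this path.
                child = {}
                node[part] = child
            node = child
        leaf = parts[-1]
        if isinstance(node.get(leaf), dict):
            # An object already lives at this path, so the scalar is dropped.
            continue
        node[leaf] = value
    return root


def lookup(tree, path):
    """Return the scalar stored at a dotted path, or None if there is none."""
    node = tree
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return None if isinstance(node, dict) else node


props = {
    "spark.speculation": "true",
    "spark.speculation.multiplier": "0.95",
}
tree = parse_properties(props)
print(lookup(tree, "spark.speculation"))             # None: the flag was dropped
print(lookup(tree, "spark.speculation.multiplier"))  # 0.95
```

Under this assumed rule the result does not depend on the order in which the two properties are supplied, which matches the report that spark.speculation disappears whenever any spark.speculation.* key is also set.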
Re: Config properties broken in master
Hey Mridul this was patched and we cut a new release candidate. There were several different config options which had a.b and a.b.c... they should all work in the new RC.
On Sun, Jan 19, 2014 at 4:56 AM, Mridul Muralidharan mri...@gmail.com wrote:
Chanced upon spill related config which exhibit same pattern ...
- Mridul
Re: Config properties broken in master
Oh great, just saw the PR from Matei ... for some odd reason, the dev mails are coming to be horribly delayed.
Thanks, Mridul
On Sun, Jan 19, 2014 at 10:35 PM, Patrick Wendell pwend...@gmail.com wrote:
Hey Mridul this was patched and we cut a new release candidate. There were several different config options which had a.b and a.b.c... they should all work in the new RC.
Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc2)
This vote is cancelled in favor of rc3, which fixes the YARN issue Sandy ran into. @taka - thanks for reporting that bug. It's not enough to block this release, however. Once a fix exists we can merge it into the 0.9 branch and it will be in 0.9.1.
On Sun, Jan 19, 2014 at 12:37 PM, Taka Shinagawa taka.epsi...@gmail.com wrote:
I've found a problem with the cartesian method on PySpark and filed it as SPARK-1034 https://spark-project.atlassian.net/browse/SPARK-1034
0.8.1 doesn't have this problem. On Scala, the cartesian method works fine. It would also be nice if SPARK-978 could be fixed. https://spark-project.atlassian.net/browse/SPARK-978
Thanks, Taka
On Sun, Jan 19, 2014 at 1:24 AM, Sandy Ryza sandy.r...@cloudera.com wrote:
Has anybody tested against YARN 2.2? I tried it out against a pseudo-distributed cluster and ran into an issue I just filed as SPARK-1031 https://spark-project.atlassian.net/browse/SPARK-1031 .
thanks, Sandy
On Sun, Jan 19, 2014 at 12:55 AM, Reynold Xin r...@databricks.com wrote:
+1
Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc3)
Attempting to attach the release notes again (I think it may have been blocked previously due to not having an extension).
On Sun, Jan 19, 2014 at 8:05 PM, Patrick Wendell pwend...@gmail.com wrote:
I'll add my +1 as well
On Sun, Jan 19, 2014 at 7:33 PM, Matei Zaharia matei.zaha...@gmail.com wrote:
+1 Re-tested on Mac.
Matei
On Jan 19, 2014, at 7:09 PM, Tathagata Das tathagata.das1...@gmail.com wrote:
Starting off. +1
On Sun, Jan 19, 2014 at 2:15 PM, Patrick Wendell pwend...@gmail.com wrote:
Please vote on releasing the following candidate as Apache Spark (incubating) version 0.9.0. A draft of the release notes along with the changes file is attached to this e-mail.
The tag to be voted on is v0.9.0-incubating (commit a7760eff):
https://git-wip-us.apache.org/repos/asf?p=incubator-spark.git;a=commit;h=a7760eff4ea6a474cab68896a88550f63bae8b0d
The release files, including signatures, digests, etc., can be found at:
http://people.apache.org/~pwendell/spark-0.9.0-incubating-rc3/
Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc
The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1004/
The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-0.9.0-incubating-rc3-docs/
Please vote on releasing this package as Apache Spark 0.9.0-incubating!
The vote is open until Wednesday, January 22, at 22:15 UTC and passes if a majority of at least 3 +1 PPMC votes are cast.
[ ] +1 Release this package as Apache Spark 0.9.0-incubating
[ ] -1 Do not release this package because ...
To learn more about Apache Spark, please see
http://spark.incubator.apache.org/

Spark 0.9.0 is a major release that adds significant new features. It updates Spark to Scala 2.10, simplifies high availability, and updates numerous components of the project. This release includes a first version of GraphX, a powerful new framework for graph processing that comes with a library of standard algorithms. In addition, Spark Streaming is now out of alpha, and includes significant optimizations and simplified high availability deployment.

### Scala 2.10 Support

Spark now runs on Scala 2.10, letting users benefit from the language and library improvements in this version.

### Configuration System

The new [SparkConf] class is now the preferred way to configure advanced settings on your SparkContext, though the previous approach of setting Java system properties still works. SparkConf is especially useful in tests to make sure properties don't stay set across tests.

### Spark Streaming Improvements

Spark Streaming is no longer alpha, and comes with simplified high availability and several optimizations.

* When running on a Spark standalone cluster with the [standalone cluster high availability mode], you can submit a Spark Streaming driver application to the cluster and have it automatically recovered if either the driver or the cluster master crashes.
* Windowed operators have been sped up by 30-50%.
* Spark Streaming's input source plugins (e.g. for Twitter, Kafka and Flume) are now separate projects, making it easier to pull in only the dependencies you need.
* A new StreamingListener interface has been added for monitoring statistics about the streaming computation.
* A few aspects of the API have been improved:
  * `DStream` and `PairDStream` classes have been moved from `org.apache.spark.streaming` to `org.apache.spark.streaming.dstream` for consistency with `org.apache.spark.rdd.RDD`.
  * `DStream.foreach` has been renamed to `DStream.foreachRDD` to make it explicit that it works for every RDD, not every element.
  * `StreamingContext.awaitTermination()` allows you to wait for context shutdown and catch any exception that occurs in the streaming computation.
  * `StreamingContext.stop()` now allows stopping of StreamingContext without stopping the underlying SparkContext.

### GraphX Alpha

GraphX is a new API for graph processing that uses recent advances in graph-parallel computation. It lets you build a graph within a Spark program using the standard Spark operators, then process it with new graph operators that are optimized for distributed computation. It includes basic transformations, a Pregel API for iterative computation, and a standard library of graph loaders and analytics algorithms. By offering these features within the Spark engine, GraphX can significantly speed up processing tasks compared to workflows that use different engines.

GraphX features in this release include:

* Building graphs from arbitrary Spark RDDs
* Basic operations to transform graphs or extract subgraphs
* An optimized Pregel API that takes advantage of graph partitioning and indexing
* Standard algorithms including PageRank, connected components, strongly connected components, SVD++, and triangle counting
* Interactive use from the Spark shell

GraphX
Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc3)
Hi Patrick, quick question, where are you planning to add the release notes? I don't think it is part of the source, is it?
- Henry
On Sun, Jan 19, 2014 at 8:41 PM, Patrick Wendell pwend...@gmail.com wrote:
Attempting to attach the release notes again (I think it may have been blocked previously due to not having an extension).
Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc3)
Eventually the notes get posted on the apache website. I attached them to this e-mail so that people can get a sense of what is in the release before they vote on it.
On Sun, Jan 19, 2014 at 9:57 PM, Henry Saputra henry.sapu...@gmail.com wrote:
Hi Patrick, quick question, where are you planning to add the release notes? I don't think it is part of the source, is it?
- Henry
Re: [VOTE] Release Apache Spark 0.9.0-incubating (rc3)
Ah yes, makes sense, thanks!
- Henry
On Sun, Jan 19, 2014 at 10:01 PM, Patrick Wendell pwend...@gmail.com wrote:
Eventually the notes get posted on the apache website. I attached them to this e-mail so that people can get a sense of what is in the release before they vote on it.