Spark 0.9.1 released

2014-04-09 Thread Tathagata Das
Hi everyone,

We have just posted Spark 0.9.1, a maintenance release with
bug fixes, performance improvements, better stability with YARN, and
improved parity between the Scala and Python APIs. We recommend that all
0.9.0 users upgrade to this stable release.

This is the first release since Spark graduated to a top-level Apache
project. Contributions to this release came from 37 developers.

The full release notes are at:
http://spark.apache.org/releases/spark-release-0-9-1.html

You can download the release at:
http://spark.apache.org/downloads.html

Thanks to all the developers who contributed to this release:
Aaron Davidson, Aaron Kimball, Andrew Ash, Andrew Or, Andrew Tulloch,
Bijay Bisht, Bouke van der Bijl, Bryn Keller, Chen Chao,
Christian Lundgren, Diana Carroll, Emtiaz Ahmed, Frank Dai,
Henry Saputra, jianghan, Josh Rosen, Jyotiska NK, Kay Ousterhout,
Kousuke Saruta, Mark Grover, Matei Zaharia, Nan Zhu, Nick Lanham,
Patrick Wendell, Prabin Banka, Prashant Sharma, Qiuzhuang,
Raymond Liu, Reynold Xin, Sandy Ryza, Sean Owen, Shixiong Zhu,
shiyun.wxm, Stevo Slavić, Tathagata Das, Tom Graves, Xiangrui Meng

TD


Re: Spark 0.9.1 released

2014-04-09 Thread Tathagata Das
A small additional note: Please use the direct download links on the Spark
Downloads page (http://spark.apache.org/downloads.html). The Apache mirrors
take a day or so to sync from the main repo, so they may not work immediately.

TD




Re: Spark 0.9.1 released

2014-04-09 Thread Matei Zaharia
Thanks TD for managing this release, and thanks to everyone who contributed!

Matei



Re: Spark 0.9.1 released

2014-04-09 Thread Nicholas Chammas
A very nice addition for us PySpark users in 0.9.1 is
RDD.repartition(), which is not mentioned in the release notes
(http://spark.apache.org/releases/spark-release-0-9-1.html)!

This is super helpful for when you create an RDD from a gzipped file and
then need to explicitly shuffle the data around to parallelize operations
on it appropriately.
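
To make that concrete, here is a minimal sketch of the pattern. Gzip is not a splittable format, so Spark reads each gzipped file as a single partition; repartition() then shuffles the lines across more partitions. The helper name load_gzipped_text and its parameters are my own, for illustration only; it assumes sc is an existing SparkContext.

```python
def load_gzipped_text(sc, path, num_partitions):
    """Read a gzipped text file and spread its lines over several partitions.

    Hypothetical helper, for illustration only. `sc` is assumed to be an
    existing SparkContext; `path` points at a .gz text file.
    """
    # Gzip cannot be split, so each gzipped file arrives as one partition.
    lines = sc.textFile(path)
    # repartition() performs a full shuffle into `num_partitions` partitions,
    # so later map/filter stages can run in parallel.
    return lines.repartition(num_partitions)
```

With a live context, something like load_gzipped_text(sc, "events.log.gz", 8) (the file name is made up) returns an RDD whose later stages can use up to 8 tasks instead of 1. The shuffle has a cost of its own, so this is only worth it when the work after it is heavy enough to benefit.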

Thanks people!

FYI, docs/latest (http://spark.apache.org/docs/latest/api/pyspark/index.html)
hasn't been updated yet to reflect the new additions to PySpark.

Nick





Re: Spark 0.9.1 released

2014-04-09 Thread Nicholas Chammas
Ah, looks good now. It took me a minute to realize that doing a hard
refresh on the docs page was missing the RDD class doc page...

And thanks for updating the release notes.


On Wed, Apr 9, 2014 at 7:21 PM, Tathagata Das
tathagata.das1...@gmail.com wrote:

 Thanks, Nick, for pointing that out! I have updated the release notes
 (http://spark.apache.org/releases/spark-release-0-9-1.html).
 But I do see the new operations, like repartition, in the latest PySpark RDD
 docs (http://spark.apache.org/docs/latest/api/pyspark/index.html). Maybe
 refresh the page a couple of times?

 TD

