Re: [VOTE] Release Apache Spark 0.8.1-incubating (rc4)
+1, built and tested on linux.

On Thu, Dec 12, 2013 at 5:57 AM, Patrick Wendell pwend...@gmail.com wrote:

I also talked to a few people who got corrupted binaries when downloading from the people.apache HTTP server. In that case the checksum failed, but when they re-downloaded it worked. So maybe just re-download and try again?

On Wed, Dec 11, 2013 at 3:15 PM, Patrick Wendell pwend...@gmail.com wrote:

Hey Tom, I re-verified the signatures and got someone else to do it as well. It seemed fine. Here is what I did:

gpg --recv-key 9E4FE3AF
wget http://people.apache.org/~pwendell/spark-0.8.1-incubating-rc4/spark-0.8.1-incubating.tgz.asc
wget http://people.apache.org/~pwendell/spark-0.8.1-incubating-rc4/spark-0.8.1-incubating.tgz
gpg --verify spark-0.8.1-incubating.tgz.asc spark-0.8.1-incubating.tgz
gpg: Signature made Tue 10 Dec 2013 02:53:15 PM PST using RSA key ID 9E4FE3AF
gpg: Good signature from Patrick Wendell pwend...@gmail.com

On Wed, Dec 11, 2013 at 1:10 PM, Mark Hamstra m...@clearstorydata.com wrote:

I don't know how to make sense of the numbers, but here's what I've got from a very small sample size. For both v0.8.0-incubating and v0.8.1-incubating, building separate assemblies is faster than `./sbt/sbt assembly`, and the times for building separate assemblies for 0.8.0 and 0.8.1 are about the same. For v0.8.0-incubating, `./sbt/sbt assembly` takes about 2.5x as long as the sum of the separate assemblies. For v0.8.1-incubating, `./sbt/sbt assembly` takes almost 8x as long as the sum of the separate assemblies. Weird.

On Wed, Dec 11, 2013 at 11:49 AM, Patrick Wendell pwend...@gmail.com wrote:

I'll +1 myself also. For anyone who has the slow build problem: does this issue happen when building v0.8.0-incubating also? Trying to figure out whether it's related to something we added in 0.8.1 or if it's a long-standing issue.

- Patrick

On Wed, Dec 11, 2013 at 10:39 AM, Matei Zaharia matei.zaha...@gmail.com wrote:

Woah, weird, but definitely good to know.

If you’re doing Spark development, there’s also a more convenient option added by Shivaram in the master branch. You can do sbt assemble-deps to package *just* the dependencies of each project in a special assembly JAR, and then use sbt compile to update the code. This will use the classes directly out of the target/scala-2.9.3/classes directories. You have to redo assemble-deps only if your external dependencies change.

Matei

On Dec 11, 2013, at 1:04 AM, Prashant Sharma scrapco...@gmail.com wrote:

I hope this PR https://github.com/apache/incubator-spark/pull/252 can help. Again, this is not a blocker for the release from my side either.

On Wed, Dec 11, 2013 at 2:14 PM, Mark Hamstra m...@clearstorydata.com wrote:

Interesting, and confirmed: On my machine where `./sbt/sbt assembly` takes a long, long, long time to complete (an MBP, in my case), building three separate assemblies (`./sbt/sbt assembly/assembly`, `./sbt/sbt examples/assembly`, `./sbt/sbt tools/assembly`) takes much, much less time.

On Wed, Dec 11, 2013 at 12:02 AM, Prashant Sharma scrapco...@gmail.com wrote:

Forgot to mention: after running sbt/sbt assembly/assembly, running sbt/sbt examples/assembly takes just 37s. Not to mention my hardware is not really great.

On Wed, Dec 11, 2013 at 1:28 PM, Prashant Sharma scrapco...@gmail.com wrote:

Hi Patrick and Matei, I was trying this out and followed the quick start guide, which says to do sbt/sbt assembly; like a few others I was also stuck for a few minutes on linux. On the other hand, if I use sbt/sbt assembly/assembly it is much faster. Should we change the documentation to reflect this? It will not be great for first-time users to get stuck there.

On Wed, Dec 11, 2013 at 9:54 AM, Matei Zaharia matei.zaha...@gmail.com wrote:

+1 Built and tested it on Mac OS X.

Matei

On Dec 10, 2013, at 4:49 PM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark (incubating) version 0.8.1.

The tag to be voted on is v0.8.1-incubating (commit b87d31d):
https://git-wip-us.apache.org/repos/asf/incubator-spark/repo?p=incubator-spark.git;a=commit;h=b87d31dd8eb4b4e47c0138e9242d0dd6922c8c4e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-0.8.1-incubating-rc4/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-040/

The documentation corresponding to this release can be found at:
Re: PySpark / scikit-learn integration sprint at Cloudera - Strata Conference Friday 14th Feb 2014
Done. -- Olivier
RE: Scala 2.10 Merge
Hi Patrick,

What does dropping YARN 2.2 mean? The code seems to still be there. Do you mean that building against 2.2 will break and won't work, since the custom akka build for Scala 2.10 isn't there? If that's the case, can we just use akka 2.3-M1, which runs on protobuf 2.5, as a replacement?

Best Regards,
Raymond Liu

-----Original Message-----
From: Patrick Wendell [mailto:pwend...@gmail.com]
Sent: Thursday, December 12, 2013 4:21 PM
To: dev@spark.incubator.apache.org
Subject: Scala 2.10 Merge

Hi Developers,

In the next few days we are planning to merge Scala 2.10 support into Spark. For those that haven't been following this, Prashant Sharma has been maintaining the scala-2.10 branch of Spark for several months. This branch is current with master and has been reviewed for merging:

https://github.com/apache/incubator-spark/tree/scala-2.10

Scala 2.10 support is one of the most requested features for Spark - it will be great to get this into Spark 0.9! Please note that *Scala 2.10 is not binary compatible with Scala 2.9*. With that in mind, I wanted to give a few heads-up/requests to developers:

If you are developing applications on top of Spark's master branch, those will need to migrate to Scala 2.10. You may want to download and test the current scala-2.10 branch in order to make sure you will be okay as Spark development moves forward. Of course, you can always stick with the current master commit and be fine (I'll cut a tag when we do the merge in order to delineate where the version changes). Please open new threads on the dev list to report and discuss any issues.

This merge will temporarily drop support for YARN 2.2 on the master branch. This is because the workaround we used was only compiled for Scala 2.9. We are going to come up with a more robust solution to YARN 2.2 support before releasing 0.9. Going forward, we will continue to make maintenance releases on branch-0.8 which will remain compatible with Scala 2.9.

For those interested, the primary code changes in this merge are upgrading the akka version, changing the use of Scala 2.9's ClassManifest construct to Scala 2.10's ClassTag, and updating the spark shell to work with Scala 2.10's repl.

- Patrick
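The ClassManifest-to-ClassTag change mentioned above can be illustrated with a small standalone snippet. The mkArray helper below is illustrative, not from the Spark codebase: in Scala 2.9, creating a generic Array[T] required a ClassManifest context bound; in 2.10 the same code uses scala.reflect.ClassTag.

```scala
import scala.reflect.ClassTag

// Scala 2.9 style, replaced by the merge:
//   def mkArray[T: ClassManifest](xs: T*): Array[T] = xs.toArray
// Scala 2.10 style -- ClassTag carries the runtime class information
// needed to instantiate the Array[T]:
def mkArray[T: ClassTag](xs: T*): Array[T] = xs.toArray

println(mkArray(1, 2, 3).mkString(","))  // prints 1,2,3
```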
Re: Scala 2.10 Merge
Also - the code is still there because of a recent merge that took in some newer changes... we'll be removing it for the final merge.

On Thu, Dec 12, 2013 at 1:12 AM, Patrick Wendell pwend...@gmail.com wrote:

Hey Raymond, This won't work because AFAIK akka 2.3-M1 is not binary compatible with akka 2.2.3 (right?). For all of the non-yarn 2.2 versions we need to still use the older protobuf library, so we'd need to support both. I'd also be concerned about having a reference to a non-released version of akka. Akka is the source of our hardest-to-find bugs, and simultaneously trying to support 2.2.3 and 2.3-M1 is a bit daunting. Of course, if you are building off of master you can maintain a fork that uses this.

- Patrick
Re: Scala 2.10 Merge
Hey Raymond, I just gave what you said a try, but I have no intention of maintaining it. You might find it helpful in case you are willing to maintain it: https://github.com/ScrapCodes/incubator-spark/tree/yarn-2.2

For the record, the above branch is updated to master with Scala 2.10, and it uses akka 2.3-M1 when built with new-yarn and akka 2.2.3 otherwise.
Re: [VOTE] Release Apache Spark 0.8.1-incubating (rc4)
I re-downloaded the source tarball and it works now.

Tom
Re: [VOTE] Release Apache Spark 0.8.1-incubating (rc4)
+1. Built spark on yarn for both hadoop 0.23 and hadoop 2.2.0 on redhat linux using maven. Ran some tests on both a secure Hadoop 0.23 cluster and a secure Hadoop 2.2.0 cluster. Verified signatures and md5.

Tom

On Tuesday, December 10, 2013 6:49 PM, Patrick Wendell pwend...@gmail.com wrote:

Please vote on releasing the following candidate as Apache Spark (incubating) version 0.8.1.

The tag to be voted on is v0.8.1-incubating (commit b87d31d):
https://git-wip-us.apache.org/repos/asf/incubator-spark/repo?p=incubator-spark.git;a=commit;h=b87d31dd8eb4b4e47c0138e9242d0dd6922c8c4e

The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-0.8.1-incubating-rc4/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-040/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-0.8.1-incubating-rc4-docs/

For information about the contents of this release see:
https://git-wip-us.apache.org/repos/asf?p=incubator-spark.git;a=blob;f=CHANGES.txt;h=ce0aeab524505b63c7999e0371157ac2def6fe1c;hb=branch-0.8

Please vote on releasing this package as Apache Spark 0.8.1-incubating! The vote is open until Saturday, December 14th at 01:00 UTC and passes if a majority of at least 3 +1 PPMC votes are cast.

[ ] +1 Release this package as Apache Spark 0.8.1-incubating
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.incubator.apache.org/
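The md5 check Tom mentions can be sketched as a plain-JVM equivalent of running md5sum on the downloaded tarball and comparing against the published digest (the tarball path in the comment is hypothetical):

```scala
import java.nio.file.{Files, Paths}
import java.security.MessageDigest

// Compute the MD5 digest of a file as a lowercase hex string, for
// comparison against the .md5 file published next to the artifact.
def md5Hex(path: String): String = {
  val bytes = Files.readAllBytes(Paths.get(path))
  MessageDigest.getInstance("MD5").digest(bytes).map("%02x".format(_)).mkString
}

// Hypothetical usage:
//   md5Hex("spark-0.8.1-incubating.tgz") == publishedDigest
```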
Re: Spark API - support for asynchronous calls - Reactive style [I]
Mark, Thanks. The FutureAction API looks awesome.

On Mon, Dec 9, 2013 at 9:31 AM, Mark Hamstra m...@clearstorydata.com wrote:

Spark has already supported async jobs for a while now -- https://github.com/apache/incubator-spark/pull/29 -- and they even work correctly after https://github.com/apache/incubator-spark/pull/232. There are now implicit conversions from RDD to AsyncRDDActions (https://github.com/apache/incubator-spark/blob/master/core/src/main/scala/org/apache/spark/rdd/AsyncRDDActions.scala), where async actions like countAsync are defined.

On Mon, Dec 9, 2013 at 5:46 AM, Deenar Toraskar deenar.toras...@db.com wrote:

Hi developers, Are there any plans to have Spark (and Shark) APIs that are asynchronous and non-blocking? APIs that return Futures and Iteratee/Enumerators would be very useful to users building scalable apps using Spark, especially when combined with a fully asynchronous/non-blocking framework like Play!. Something along the lines of ReactiveMongo: http://stephane.godbillon.com/2012/08/30/reactivemongo-for-scala-unleashing-mongodb-streaming-capabilities-for-realtime-web

Deenar

--
Evan Chan
Staff Engineer
e...@ooyala.com | http://www.ooyala.com/
http://www.facebook.com/ooyala
http://www.linkedin.com/company/ooyala
http://www.twitter.com/ooyala
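Since the async actions return a FutureAction that behaves like a standard scala.concurrent.Future, they compose with the usual Future combinators. Below is a minimal sketch of the non-blocking call pattern, with a plain Future standing in for rdd.countAsync() -- no SparkContext is created here:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

val data = 1 to 1000

// Stand-in for rdd.countAsync(): starts the count without blocking the
// caller, returning a Future of the result.
val countF: Future[Long] = Future { data.count(_ % 2 == 0).toLong }

// Register a non-blocking callback...
countF.foreach(n => println(s"even count: $n"))

// ...or block only at the point where the value is actually needed.
println(Await.result(countF, 10.seconds))  // prints 500
```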
Re: [VOTE] Release Apache Spark 0.8.1-incubating (rc4)
I'd be personally fine with a standard workflow of assemble-deps + packaging just the Spark files as separate packages, if it speeds up everyone's development time.

--
Evan Chan
Staff Engineer
e...@ooyala.com | http://www.ooyala.com/
RE: Scala 2.10 Merge
Hi Patrick,

So what's the plan for supporting YARN 2.2 in 0.9? As far as I can see, if you want to support both 2.2 and 2.0, then due to the protobuf version incompatibility you need two versions of akka anyway. Akka 2.3-M1 looks like it has some small API changes; we could probably isolate that code the way we did for the YARN API. I remember it was mentioned that using reflection for the different APIs is preferred.

So is the purpose of reflection to have one released binary jar support both versions of Hadoop/YARN at runtime, instead of building different binary jars at compile time? Then all Hadoop-related code would also be built as separate modules for loading on demand? That sounds to me like a lot of work. You would still need a shim layer and separate code for the different API versions, which depend on different akka versions, etc. It sounds like even stricter demands than our current approach on master, with a dynamic class loader added on top, and the problems we face now would still be there?

Best Regards,
Raymond Liu
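The reflection approach under discussion -- one binary that selects a version-specific implementation by class name at runtime, so only the chosen classes are ever loaded -- can be sketched with plain JVM reflection. Here two standard JDK List implementations stand in for hypothetical version-specific YARN shims; the names and selection flag are illustrative only:

```scala
import java.util.{List => JList}

// Choose an implementation class by name at runtime. The class that is not
// selected is never loaded or initialized, so its dependencies (e.g. a
// particular protobuf version, in the YARN case) need not be on the classpath.
def loadImpl(preferLinked: Boolean): JList[String] = {
  val className =
    if (preferLinked) "java.util.LinkedList" else "java.util.ArrayList"
  Class.forName(className)
    .getDeclaredConstructor()
    .newInstance()
    .asInstanceOf[JList[String]]
}

val impl = loadImpl(false)
impl.add("spark")
println(impl.getClass.getName)  // prints java.util.ArrayList
```

The trade-off Raymond raises is real: the version split still needs a shim interface, and reflection only moves the choice from compile time to run time.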
Re: Scala 2.10 Merge
Hey Raymond,

Let's move this discussion out of this thread and into the associated JIRA. I'll write up our current approach over there.

https://spark-project.atlassian.net/browse/SPARK-995

- Patrick