It may be because of snappy-java; see https://issues.apache.org/jira/browse/SPARK-5081

On May 23, 2015, at 1:23 AM, Josh Rosen <[email protected]> wrote:

> I don't think that 0.9.3 has been released, so I'm assuming that you're
> running on branch-0.9.
>
> There's been over 4000 commits between 0.9.3 and 1.3.1, so I'm afraid that
> this question doesn't have a concise answer:
> https://github.com/apache/spark/compare/branch-0.9...v1.3.1
>
> To narrow down the potential causes, have you tried comparing 0.9.3 to, say,
> 1.0.2 or branch-1.0, or some other version that's closer to 0.9?
>
> On Fri, May 22, 2015 at 9:43 AM, Shay Seng <[email protected]> wrote:
> Hi.
> I have a job that takes
> ~50min with Spark 0.9.3 and
> ~1.8hrs on Spark 1.3.1 on the same cluster.
>
> The only code difference between the two code bases is to fix the Seq -> Iter
> changes that happened in the Spark 1.x series.
>
> Are there any other changes in the defaults from spark 0.9.3 -> 1.3.1 that
> would cause such a large degradation in performance? Changes in partitioning
> algorithms, scheduling etc?
>
> shay
