Re: Spark 0.9.1 release

2014-03-27 Thread Tathagata Das
I have cut another release candidate, RC3, with two important bug
fixes. See the following JIRAs for more details.
1. Bug with intercepts in MLlib's GLM:
https://spark-project.atlassian.net/browse/SPARK-1327
2. Bug in PySpark's RDD.top() ordering (see the short sketch below):
https://spark-project.atlassian.net/browse/SPARK-1322
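For readers unfamiliar with the second issue, the intended semantics of top() look like this in the Scala API (an illustrative sketch only, not taken from the patch; it assumes an existing SparkContext named sc):

    // top(k) should return the k largest elements, in descending order;
    // SPARK-1322 is about making PySpark's rdd.top() match this behavior.
    val nums = sc.parallelize(Seq(10, 4, 2, 12, 3))
    val largest = nums.top(2)   // Array(12, 10)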

Please vote on this candidate on the voting thread.

Thanks!

TD

On Wed, Mar 26, 2014 at 3:09 PM, Tathagata Das wrote:
> Updates:
> 1. Fix for the ASM problem that Kevin mentioned is already in Spark 0.9.1
> RC2
> 2. Fix for pyspark's RDD.top() that Patrick mentioned has been pulled into
> branch 0.9. This will get into the next RC if there is one.
>
> TD
>
>
> On Wed, Mar 26, 2014 at 9:21 AM, Patrick Wendell  wrote:
>>
>> Hey TD,
>>
>> This one we just merged into master this morning:
>> https://spark-project.atlassian.net/browse/SPARK-1322
>>
>> It should definitely go into the 0.9 branch because there was a bug in the
>> semantics of top() which at this point is unreleased in Python.
>>
>> I didn't backport it yet because I figured you might want to do this at a
>> specific time. So please go ahead and backport it. Not sure whether this
>> warrants another RC.
>>
>> - Patrick
>>
>>
>> On Tue, Mar 25, 2014 at 10:47 PM, Mridul Muralidharan
>> wrote:
>>
>> > On Wed, Mar 26, 2014 at 10:53 AM, Tathagata Das
>> >  wrote:
>> > > PR 159 seems like a fairly big patch to me. And quite recent, so its
>> > impact
>> > > on the scheduling is not clear. It may also depend on other changes
>> > > that
>> > > may have gotten into the DAGScheduler but not pulled into branch 0.9.
>> > > I
>> > am
>> > > not sure it is a good idea to pull that in. We can pull those changes
>> > later
>> > > for 0.9.2 if required.
>> >
>> >
>> > There is no impact on scheduling : it only has an impact on error
>> > handling - it ensures that you can actually use spark on yarn in
>> > multi-tennent clusters more reliably.
>> > Currently, any reasonably long running job (30 mins+) working on non
>> > trivial dataset will fail due to accumulated failures in spark.
>> >
>> >
>> > Regards,
>> > Mridul
>> >
>> >
>> > >
>> > > TD
>> > >
>> > >
>> > >
>> > >
>> > > On Tue, Mar 25, 2014 at 8:44 PM, Mridul Muralidharan wrote:
>> > >
>> > >> Forgot to mention this in the earlier request for PR's.
>> > >> If there is another RC being cut, please add
>> > >> https://github.com/apache/spark/pull/159 to it too (if not done
>> > >> already !).
>> > >>
>> > >> Thanks,
>> > >> Mridul
>> > >>
>> > >> On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
>> > >>  wrote:
>> > >> >  Hello everyone,
>> > >> >
>> > >> > Since the release of Spark 0.9, we have received a number of
>> > >> > important
>> > >> bug
>> > >> > fixes and we would like to make a bug-fix release of Spark 0.9.1.
>> > >> > We
>> > are
>> > >> > going to cut a release candidate soon and we would love it if
>> > >> > people
>> > test
>> > >> > it out. We have backported several bug fixes into the 0.9 and
>> > >> > updated
>> > >> JIRA
>> > >> > accordingly<
>> > >>
>> >
>> > https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
>> > >> >.
>> > >> > Please let me know if there are fixes that were not backported but
>> > >> > you
>> > >> > would like to see them in 0.9.1.
>> > >> >
>> > >> > Thanks!
>> > >> >
>> > >> > TD
>> > >>
>> >
>
>


Re: Spark 0.9.1 release

2014-03-26 Thread Tathagata Das
Updates:
1. Fix for the ASM problem that Kevin mentioned is already in Spark 0.9.1 RC2.
2. Fix for PySpark's RDD.top() that Patrick mentioned has been pulled into
branch 0.9. This will get into the next RC if there is one.

TD


On Wed, Mar 26, 2014 at 9:21 AM, Patrick Wendell  wrote:

> Hey TD,
>
> This one we just merged into master this morning:
> https://spark-project.atlassian.net/browse/SPARK-1322
>
> It should definitely go into the 0.9 branch because there was a bug in the
> semantics of top() which at this point is unreleased in Python.
>
> I didn't backport it yet because I figured you might want to do this at a
> specific time. So please go ahead and backport it. Not sure whether this
> warrants another RC.
>
> - Patrick
>
>
> On Tue, Mar 25, 2014 at 10:47 PM, Mridul Muralidharan wrote:
>
> > On Wed, Mar 26, 2014 at 10:53 AM, Tathagata Das
> >  wrote:
> > > PR 159 seems like a fairly big patch to me. And quite recent, so its
> > impact
> > > on the scheduling is not clear. It may also depend on other changes
> that
> > > may have gotten into the DAGScheduler but not pulled into branch 0.9. I
> > am
> > > not sure it is a good idea to pull that in. We can pull those changes
> > later
> > > for 0.9.2 if required.
> >
> >
> > There is no impact on scheduling : it only has an impact on error
> > handling - it ensures that you can actually use spark on yarn in
> > multi-tennent clusters more reliably.
> > Currently, any reasonably long running job (30 mins+) working on non
> > trivial dataset will fail due to accumulated failures in spark.
> >
> >
> > Regards,
> > Mridul
> >
> >
> > >
> > > TD
> > >
> > >
> > >
> > >
> > > On Tue, Mar 25, 2014 at 8:44 PM, Mridul Muralidharan wrote:
> > >
> > >> Forgot to mention this in the earlier request for PR's.
> > >> If there is another RC being cut, please add
> > >> https://github.com/apache/spark/pull/159 to it too (if not done
> > >> already !).
> > >>
> > >> Thanks,
> > >> Mridul
> > >>
> > >> On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
> > >>  wrote:
> > >> >  Hello everyone,
> > >> >
> > >> > Since the release of Spark 0.9, we have received a number of
> important
> > >> bug
> > >> > fixes and we would like to make a bug-fix release of Spark 0.9.1. We
> > are
> > >> > going to cut a release candidate soon and we would love it if people
> > test
> > >> > it out. We have backported several bug fixes into the 0.9 and
> updated
> > >> JIRA
> > >> > accordingly<
> > >>
> >
> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
> > >> >.
> > >> > Please let me know if there are fixes that were not backported but
> you
> > >> > would like to see them in 0.9.1.
> > >> >
> > >> > Thanks!
> > >> >
> > >> > TD
> > >>
> >
>


Re: Spark 0.9.1 release

2014-03-26 Thread Patrick Wendell
Hey TD,

This one we just merged into master this morning:
https://spark-project.atlassian.net/browse/SPARK-1322

It should definitely go into the 0.9 branch because there was a bug in the
semantics of top(), which at this point is unreleased in Python.

I didn't backport it yet because I figured you might want to do this at a
specific time. So please go ahead and backport it. Not sure whether this
warrants another RC.

- Patrick


On Tue, Mar 25, 2014 at 10:47 PM, Mridul Muralidharan wrote:

> On Wed, Mar 26, 2014 at 10:53 AM, Tathagata Das
>  wrote:
> > PR 159 seems like a fairly big patch to me. And quite recent, so its
> impact
> > on the scheduling is not clear. It may also depend on other changes that
> > may have gotten into the DAGScheduler but not pulled into branch 0.9. I
> am
> > not sure it is a good idea to pull that in. We can pull those changes
> later
> > for 0.9.2 if required.
>
>
> There is no impact on scheduling : it only has an impact on error
> handling - it ensures that you can actually use spark on yarn in
> multi-tennent clusters more reliably.
> Currently, any reasonably long running job (30 mins+) working on non
> trivial dataset will fail due to accumulated failures in spark.
>
>
> Regards,
> Mridul
>
>
> >
> > TD
> >
> >
> >
> >
> > On Tue, Mar 25, 2014 at 8:44 PM, Mridul Muralidharan wrote:
> >
> >> Forgot to mention this in the earlier request for PR's.
> >> If there is another RC being cut, please add
> >> https://github.com/apache/spark/pull/159 to it too (if not done
> >> already !).
> >>
> >> Thanks,
> >> Mridul
> >>
> >> On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
> >>  wrote:
> >> >  Hello everyone,
> >> >
> >> > Since the release of Spark 0.9, we have received a number of important
> >> bug
> >> > fixes and we would like to make a bug-fix release of Spark 0.9.1. We
> are
> >> > going to cut a release candidate soon and we would love it if people
> test
> >> > it out. We have backported several bug fixes into the 0.9 and updated
> >> JIRA
> >> > accordingly<
> >>
> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
> >> >.
> >> > Please let me know if there are fixes that were not backported but you
> >> > would like to see them in 0.9.1.
> >> >
> >> > Thanks!
> >> >
> >> > TD
> >>
>


Re: Spark 0.9.1 release

2014-03-25 Thread Mridul Muralidharan
On Wed, Mar 26, 2014 at 10:53 AM, Tathagata Das
 wrote:
> PR 159 seems like a fairly big patch to me. And quite recent, so its impact
> on the scheduling is not clear. It may also depend on other changes that
> may have gotten into the DAGScheduler but not pulled into branch 0.9. I am
> not sure it is a good idea to pull that in. We can pull those changes later
> for 0.9.2 if required.


There is no impact on scheduling: it only has an impact on error
handling - it ensures that you can actually use Spark on YARN in
multi-tenant clusters more reliably.
Currently, any reasonably long-running job (30 mins+) working on a
non-trivial dataset will fail due to accumulated failures in Spark.


Regards,
Mridul


>
> TD
>
>
>
>
> On Tue, Mar 25, 2014 at 8:44 PM, Mridul Muralidharan wrote:
>
>> Forgot to mention this in the earlier request for PR's.
>> If there is another RC being cut, please add
>> https://github.com/apache/spark/pull/159 to it too (if not done
>> already !).
>>
>> Thanks,
>> Mridul
>>
>> On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
>>  wrote:
>> >  Hello everyone,
>> >
>> > Since the release of Spark 0.9, we have received a number of important
>> bug
>> > fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
>> > going to cut a release candidate soon and we would love it if people test
>> > it out. We have backported several bug fixes into the 0.9 and updated
>> JIRA
>> > accordingly<
>> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
>> >.
>> > Please let me know if there are fixes that were not backported but you
>> > would like to see them in 0.9.1.
>> >
>> > Thanks!
>> >
>> > TD
>>


Re: Spark 0.9.1 release

2014-03-25 Thread Mridul Muralidharan
On Wed, Mar 26, 2014 at 11:04 AM, Kay Ousterhout  wrote:
> I don't think the blacklisting is a priority and the CPUS_PER_TASK issue
> was still broken after this patch (so broken that I'm convinced no one
> actually uses this feature!!), so agree with TD's sentiment that this
> shouldn't go into 0.9.1.


I am not sure I follow what exactly was broken.
Note that the PR does not change the behavior of CPUS_PER_TASK:
that behavior has existed since 0.6 (probably earlier).
Is the behavior of CPUS_PER_TASK broken? Yes - but that is not an
artifact of this PR.



Regards,
Mridul

>
>
> On Tue, Mar 25, 2014 at 10:23 PM, Tathagata Das wrote:
>
>> PR 159 seems like a fairly big patch to me. And quite recent, so its impact
>> on the scheduling is not clear. It may also depend on other changes that
>> may have gotten into the DAGScheduler but not pulled into branch 0.9. I am
>> not sure it is a good idea to pull that in. We can pull those changes later
>> for 0.9.2 if required.
>>
>> TD
>>
>>
>>
>>
>> On Tue, Mar 25, 2014 at 8:44 PM, Mridul Muralidharan wrote:
>>
>> > Forgot to mention this in the earlier request for PR's.
>> > If there is another RC being cut, please add
>> > https://github.com/apache/spark/pull/159 to it too (if not done
>> > already !).
>> >
>> > Thanks,
>> > Mridul
>> >
>> > On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
>> >  wrote:
>> > >  Hello everyone,
>> > >
>> > > Since the release of Spark 0.9, we have received a number of important
>> > bug
>> > > fixes and we would like to make a bug-fix release of Spark 0.9.1. We
>> are
>> > > going to cut a release candidate soon and we would love it if people
>> test
>> > > it out. We have backported several bug fixes into the 0.9 and updated
>> > JIRA
>> > > accordingly<
>> >
>> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
>> > >.
>> > > Please let me know if there are fixes that were not backported but you
>> > > would like to see them in 0.9.1.
>> > >
>> > > Thanks!
>> > >
>> > > TD
>> >
>>


Re: Spark 0.9.1 release

2014-03-25 Thread Kay Ousterhout
I don't think the blacklisting is a priority, and the CPUS_PER_TASK issue
was still broken after this patch (so broken that I'm convinced no one
actually uses this feature!!), so I agree with TD's sentiment that this
shouldn't go into 0.9.1.


On Tue, Mar 25, 2014 at 10:23 PM, Tathagata Das  wrote:

> PR 159 seems like a fairly big patch to me. And quite recent, so its impact
> on the scheduling is not clear. It may also depend on other changes that
> may have gotten into the DAGScheduler but not pulled into branch 0.9. I am
> not sure it is a good idea to pull that in. We can pull those changes later
> for 0.9.2 if required.
>
> TD
>
>
>
>
> On Tue, Mar 25, 2014 at 8:44 PM, Mridul Muralidharan wrote:
>
> > Forgot to mention this in the earlier request for PR's.
> > If there is another RC being cut, please add
> > https://github.com/apache/spark/pull/159 to it too (if not done
> > already !).
> >
> > Thanks,
> > Mridul
> >
> > On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
> >  wrote:
> > >  Hello everyone,
> > >
> > > Since the release of Spark 0.9, we have received a number of important
> > bug
> > > fixes and we would like to make a bug-fix release of Spark 0.9.1. We
> are
> > > going to cut a release candidate soon and we would love it if people
> test
> > > it out. We have backported several bug fixes into the 0.9 and updated
> > JIRA
> > > accordingly<
> >
> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
> > >.
> > > Please let me know if there are fixes that were not backported but you
> > > would like to see them in 0.9.1.
> > >
> > > Thanks!
> > >
> > > TD
> >
>


Re: Spark 0.9.1 release

2014-03-25 Thread Tathagata Das
PR 159 seems like a fairly big patch to me. And quite recent, so its impact
on the scheduling is not clear. It may also depend on other changes that
may have gotten into the DAGScheduler but not pulled into branch 0.9. I am
not sure it is a good idea to pull that in. We can pull those changes later
for 0.9.2 if required.

TD




On Tue, Mar 25, 2014 at 8:44 PM, Mridul Muralidharan wrote:

> Forgot to mention this in the earlier request for PR's.
> If there is another RC being cut, please add
> https://github.com/apache/spark/pull/159 to it too (if not done
> already !).
>
> Thanks,
> Mridul
>
> On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
>  wrote:
> >  Hello everyone,
> >
> > Since the release of Spark 0.9, we have received a number of important
> bug
> > fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
> > going to cut a release candidate soon and we would love it if people test
> > it out. We have backported several bug fixes into the 0.9 and updated
> JIRA
> > accordingly<
> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
> >.
> > Please let me know if there are fixes that were not backported but you
> > would like to see them in 0.9.1.
> >
> > Thanks!
> >
> > TD
>


Re: Spark 0.9.1 release

2014-03-25 Thread Mridul Muralidharan
Forgot to mention this in the earlier request for PRs.
If there is another RC being cut, please add
https://github.com/apache/spark/pull/159 to it too (if not done
already!).

Thanks,
Mridul

On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
 wrote:
>  Hello everyone,
>
> Since the release of Spark 0.9, we have received a number of important bug
> fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
> going to cut a release candidate soon and we would love it if people test
> it out. We have backported several bug fixes into the 0.9 and updated JIRA
> accordingly.
> Please let me know if there are fixes that were not backported but you
> would like to see them in 0.9.1.
>
> Thanks!
>
> TD


Re: Spark 0.9.1 release

2014-03-25 Thread Tathagata Das
@Evan
From the discussion in the JIRA, it seems that we still don't have a clear
solution for SPARK-1138, nor do we have a sense of whether the solution is
going to be small enough for a maintenance release. So I don't think we should
block the release of Spark 0.9.1 for this. We can make another Spark 0.9.2
release once the correct solution has been figured out.

@Kevin
I understand the problem. I will try to port the solution for master in this
PR into branch 0.9. Let's see if it works out.


On Tue, Mar 25, 2014 at 10:19 AM, Kevin Markey wrote:

> TD:
>
> A correct shading of ASM should only affect Spark code unless someone is
> relying on ASM 4.0 in unrelated project code, in which case they can add
> org.ow2.asm:asm:4.x as a dependency.
>
> Our short term solution has been to repackage other libraries with a 3.2
> dependency or to exclude ASM when our use of a dependent library really
> doesn't need it.  As you probably know, the real problem arises in
> ClassVisitor, which is an Interface in 3.x and before, but in 4.x it is an
> abstract class that takes a version constant as its constructor.  The ASM
> folks of course had our best interests in mind when they did this,
> attempting to deal with the Java-version dependent  changes from one ASM
> release to the next.  Unfortunately, they didn't change the names or
> locations of their classes and interfaces, which would have helped.
>
> In our particular case, the only library from which we couldn't exclude
> ASM was org.glassfish.jersey.containers:jersey-container-servlet:jar:2.5.1.
> I added a new module to our project, including some dummy source code,
> because we needed the library to be self contained, made the servlet --
> minus some unrelated transitive dependencies -- the only module dependency,
> then used the Maven shade plugin to relocate "org.objectweb.asm" to an
> arbitrary target.  We added the new shaded module as a new project
> dependency, plus the unrelated transitive dependencies excluded above.
> This solved the problem. At least until we added WADL to the project.  Then
> we needed to deal with it on its own terms.
>
> As you can see, we left Spark alone in all its ASM 4.0 glory.  Why? Spark
> is more volatile than the other libraries.  Also, the way in which we
> needed to deploy Spark and other resources on our (Yarn) clusters suggested
> that it would be easier to shade the other libraries.  I wanted to avoid
> having to install a locally patched Spark library into our build, updating
> the cluster and individual developers whenever there's a new patch.
>  Individual developers such as me who are testing the impact of patches can
> handle it, but the main build goes to Maven Central via our corporate
> Artifactory mirror.
>
> If suddenly we had a Spark 0.9.1 with a shaded ASM, it would have no
> negative impact on us.  Only a positive impact.
>
> I just wish that all users of ASM would read FAQ entry 15!!!
>
> Thanks
> Kevin
>
>
>
> On 03/24/2014 06:30 PM, Tathagata Das wrote:
>
>> Hello Kevin,
>>
>> A fix for SPARK-782 would definitely simplify building against Spark.
>> However, its possible that a fix for this issue in 0.9.1 will break
>> the builds (that reference spark) of existing 0.9 users, either due to
>> a change in the ASM version, or for being incompatible with their
>> current workarounds for this issue. That is not a good idea for a
>> maintenance release, especially when 1.0 is not too far away.
>>
>> Can you (and others) elaborate more on the current workarounds that
>> you have for this issue? Its best to understand all the implications
>> of this fix.
>>
>> Note that in branch 0.9, it is not fixed, neither in SBT nor in Maven.
>>
>> TD
>>
>> On Mon, Mar 24, 2014 at 4:38 PM, Kevin Markey 
>> wrote:
>>
>>> Is there any way that [SPARK-782] (Shade ASM) can be included?  I see
>>> that
>>> it is not currently backported to 0.9.  But there is no single issue that
>>> has caused us more grief as we integrate spark-core with other project
>>> dependencies.  There are way too many libraries out there in addition to
>>> Spark 0.9 and before that are not well-behaved (ASM FAQ recommends
>>> shading),
>>> including some Hive and Hadoop libraries and a number of servlet
>>> libraries.
>>> We can't control those, but if Spark were well behaved in this regard, it
>>> would help.  Even for a maintenance release, and even if 1.0 is only 6
>>> weeks
>>> away!
>>>
>>> (For those not following 782, according to Jira comments, the SBT build
>>> shades it, but it is the Maven build that ends up in Maven Central.)
>>>
>>> Thanks
>>> Kevin Markey
>>>
>>>
>>>
>>>
>>> On 03/19/2014 06:07 PM, Tathagata Das wrote:
>>>
Hello everyone,

 Since the release of Spark 0.9, we have received a number of important
 bug
 fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
 going to cut a release candidate soon and we would love it if people
 test
 it ou

Re: Spark 0.9.1 release

2014-03-25 Thread Kevin Markey

TD:

A correct shading of ASM should only affect Spark code unless someone is 
relying on ASM 4.0 in unrelated project code, in which case they can add 
org.ow2.asm:asm:4.x as a dependency.
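As a concrete sketch of that point (illustrative only; the version shown is an example, not a recommendation), an sbt build that genuinely needs ASM 4.x can declare it directly instead of relying on whatever Spark pulls in transitively:

    // build.sbt: declare ASM 4.x explicitly so this project no longer depends
    // on Spark's (possibly shaded) transitive copy of ASM.
    libraryDependencies += "org.ow2.asm" % "asm" % "4.0"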


Our short term solution has been to repackage other libraries with a 3.2 
dependency or to exclude ASM when our use of a dependent library really 
doesn't need it.  As you probably know, the real problem arises in 
ClassVisitor, which is an interface in 3.x and before, but in 4.x it is 
an abstract class that takes a version constant in its constructor.  The 
ASM folks of course had our best interests in mind when they did this, 
attempting to deal with the Java-version-dependent changes from one ASM 
release to the next.  Unfortunately, they didn't change the names or 
locations of their classes and interfaces, which would have helped.
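To make the API break concrete, here is a minimal sketch (the visitor class and what it does are invented for illustration; only the ASM names are real) of code that compiles against 4.x but has no direct equivalent in 3.x, where ClassVisitor was still an interface:

    import org.objectweb.asm.{ClassVisitor, MethodVisitor, Opcodes}

    // ASM 4.x: ClassVisitor is an abstract class whose constructor requires
    // an API version constant; in 3.x it was an interface to implement.
    class LoggingClassVisitor(next: ClassVisitor)
        extends ClassVisitor(Opcodes.ASM4, next) {
      override def visitMethod(access: Int, name: String, desc: String,
                               signature: String, exceptions: Array[String]): MethodVisitor = {
        println("visiting method: " + name + desc)
        super.visitMethod(access, name, desc, signature, exceptions)
      }
    }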


In our particular case, the only library from which we couldn't exclude 
ASM was 
org.glassfish.jersey.containers:jersey-container-servlet:jar:2.5.1. I 
added a new module to our project, including some dummy source code, 
because we needed the library to be self contained, made the servlet -- 
minus some unrelated transitive dependencies -- the only module 
dependency, then used the Maven shade plugin to relocate 
"org.objectweb.asm" to an arbitrary target.  We added the new shaded 
module as a new project dependency, plus the unrelated transitive 
dependencies excluded above.   This solved the problem. At least until 
we added WADL to the project.  Then we needed to deal with it on its own 
terms.


As you can see, we left Spark alone in all its ASM 4.0 glory.  Why? 
Spark is more volatile than the other libraries.  Also, the way in which 
we needed to deploy Spark and other resources on our (Yarn) clusters 
suggested that it would be easier to shade the other libraries.  I 
wanted to avoid having to install a locally patched Spark library into 
our build, updating the cluster and individual developers whenever 
there's a new patch.  Individual developers such as me who are testing 
the impact of patches can handle it, but the main build goes to Maven 
Central via our corporate Artifactory mirror.


If suddenly we had a Spark 0.9.1 with a shaded ASM, it would have no 
negative impact on us.  Only a positive impact.


I just wish that all users of ASM would read FAQ entry 15!!!

Thanks
Kevin


On 03/24/2014 06:30 PM, Tathagata Das wrote:

Hello Kevin,

A fix for SPARK-782 would definitely simplify building against Spark.
However, its possible that a fix for this issue in 0.9.1 will break
the builds (that reference spark) of existing 0.9 users, either due to
a change in the ASM version, or for being incompatible with their
current workarounds for this issue. That is not a good idea for a
maintenance release, especially when 1.0 is not too far away.

Can you (and others) elaborate more on the current workarounds that
you have for this issue? Its best to understand all the implications
of this fix.

Note that in branch 0.9, it is not fixed, neither in SBT nor in Maven.

TD

On Mon, Mar 24, 2014 at 4:38 PM, Kevin Markey  wrote:

Is there any way that [SPARK-782] (Shade ASM) can be included?  I see that
it is not currently backported to 0.9.  But there is no single issue that
has caused us more grief as we integrate spark-core with other project
dependencies.  There are way too many libraries out there in addition to
Spark 0.9 and before that are not well-behaved (ASM FAQ recommends shading),
including some Hive and Hadoop libraries and a number of servlet libraries.
We can't control those, but if Spark were well behaved in this regard, it
would help.  Even for a maintenance release, and even if 1.0 is only 6 weeks
away!

(For those not following 782, according to Jira comments, the SBT build
shades it, but it is the Maven build that ends up in Maven Central.)

Thanks
Kevin Markey




On 03/19/2014 06:07 PM, Tathagata Das wrote:

   Hello everyone,

Since the release of Spark 0.9, we have received a number of important bug
fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
going to cut a release candidate soon and we would love it if people test
it out. We have backported several bug fixes into the 0.9 and updated JIRA

accordingly.

Please let me know if there are fixes that were not backported but you
would like to see them in 0.9.1.

Thanks!

TD





Re: Spark 0.9.1 release

2014-03-25 Thread Evan Chan
Hey guys,

I think SPARK-1138 should be resolved before releasing Spark 0.9.1.
It's affecting multiple users' ability to use Spark 0.9 with various
versions of Hadoop.
I have one fix, but I'm not sure if it works for others.

-Evan


On Mon, Mar 24, 2014 at 5:30 PM, Tathagata Das
 wrote:
> Hello Kevin,
>
> A fix for SPARK-782 would definitely simplify building against Spark.
> However, its possible that a fix for this issue in 0.9.1 will break
> the builds (that reference spark) of existing 0.9 users, either due to
> a change in the ASM version, or for being incompatible with their
> current workarounds for this issue. That is not a good idea for a
> maintenance release, especially when 1.0 is not too far away.
>
> Can you (and others) elaborate more on the current workarounds that
> you have for this issue? Its best to understand all the implications
> of this fix.
>
> Note that in branch 0.9, it is not fixed, neither in SBT nor in Maven.
>
> TD
>
> On Mon, Mar 24, 2014 at 4:38 PM, Kevin Markey  wrote:
>> Is there any way that [SPARK-782] (Shade ASM) can be included?  I see that
>> it is not currently backported to 0.9.  But there is no single issue that
>> has caused us more grief as we integrate spark-core with other project
>> dependencies.  There are way too many libraries out there in addition to
>> Spark 0.9 and before that are not well-behaved (ASM FAQ recommends shading),
>> including some Hive and Hadoop libraries and a number of servlet libraries.
>> We can't control those, but if Spark were well behaved in this regard, it
>> would help.  Even for a maintenance release, and even if 1.0 is only 6 weeks
>> away!
>>
>> (For those not following 782, according to Jira comments, the SBT build
>> shades it, but it is the Maven build that ends up in Maven Central.)
>>
>> Thanks
>> Kevin Markey
>>
>>
>>
>>
>> On 03/19/2014 06:07 PM, Tathagata Das wrote:
>>>
>>>   Hello everyone,
>>>
>>> Since the release of Spark 0.9, we have received a number of important bug
>>> fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
>>> going to cut a release candidate soon and we would love it if people test
>>> it out. We have backported several bug fixes into the 0.9 and updated JIRA
>>>
>>> accordingly.
>>>
>>> Please let me know if there are fixes that were not backported but you
>>> would like to see them in 0.9.1.
>>>
>>> Thanks!
>>>
>>> TD
>>>
>>



-- 
--
Evan Chan
Staff Engineer
e...@ooyala.com  |


Re: Spark 0.9.1 release

2014-03-24 Thread Tathagata Das
Hello Kevin,

A fix for SPARK-782 would definitely simplify building against Spark.
However, it's possible that a fix for this issue in 0.9.1 will break
the builds (that reference Spark) of existing 0.9 users, either due to
a change in the ASM version, or for being incompatible with their
current workarounds for this issue. That is not a good idea for a
maintenance release, especially when 1.0 is not too far away.

Can you (and others) elaborate more on the current workarounds that
you have for this issue? It's best to understand all the implications
of this fix.

Note that in branch 0.9, it is not fixed in either SBT or Maven.

TD

On Mon, Mar 24, 2014 at 4:38 PM, Kevin Markey  wrote:
> Is there any way that [SPARK-782] (Shade ASM) can be included?  I see that
> it is not currently backported to 0.9.  But there is no single issue that
> has caused us more grief as we integrate spark-core with other project
> dependencies.  There are way too many libraries out there in addition to
> Spark 0.9 and before that are not well-behaved (ASM FAQ recommends shading),
> including some Hive and Hadoop libraries and a number of servlet libraries.
> We can't control those, but if Spark were well behaved in this regard, it
> would help.  Even for a maintenance release, and even if 1.0 is only 6 weeks
> away!
>
> (For those not following 782, according to Jira comments, the SBT build
> shades it, but it is the Maven build that ends up in Maven Central.)
>
> Thanks
> Kevin Markey
>
>
>
>
> On 03/19/2014 06:07 PM, Tathagata Das wrote:
>>
>>   Hello everyone,
>>
>> Since the release of Spark 0.9, we have received a number of important bug
>> fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
>> going to cut a release candidate soon and we would love it if people test
>> it out. We have backported several bug fixes into the 0.9 and updated JIRA
>>
>> accordingly.
>>
>> Please let me know if there are fixes that were not backported but you
>> would like to see them in 0.9.1.
>>
>> Thanks!
>>
>> TD
>>
>


Re: Spark 0.9.1 release

2014-03-24 Thread Kevin Markey
Is there any way that [SPARK-782] (Shade ASM) can be included?  I see 
that it is not currently backported to 0.9.  But there is no single 
issue that has caused us more grief as we integrate spark-core with 
other project dependencies.  There are way too many libraries out there 
in addition to Spark 0.9 and before that are not well-behaved (ASM FAQ 
recommends shading), including some Hive and Hadoop libraries and a 
number of servlet libraries.  We can't control those, but if Spark were 
well behaved in this regard, it would help.  Even for a maintenance 
release, and even if 1.0 is only 6 weeks away!


(For those not following 782, according to Jira comments, the SBT build 
shades it, but it is the Maven build that ends up in Maven Central.)


Thanks
Kevin Markey



On 03/19/2014 06:07 PM, Tathagata Das wrote:

  Hello everyone,

Since the release of Spark 0.9, we have received a number of important bug
fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
going to cut a release candidate soon and we would love it if people test
it out. We have backported several bug fixes into the 0.9 and updated JIRA
accordingly.
Please let me know if there are fixes that were not backported but you
would like to see them in 0.9.1.

Thanks!

TD





Re: Spark 0.9.1 release

2014-03-24 Thread Tathagata Das
1051 has been pulled in!

search 1051 in
https://git-wip-us.apache.org/repos/asf?p=spark.git;a=shortlog;h=refs/heads/branch-0.9

TD

On Mon, Mar 24, 2014 at 4:26 PM, Kevin Markey  wrote:
> 1051 is essential!
> I'm not sure about the others, but anything that adds stability to
> Spark/Yarn would  be helpful.
> Kevin Markey
>
>
>
> On 03/20/2014 01:12 PM, Tom Graves wrote:
>>
>> I'll pull [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running
>> on YARN - JIRA and  [SPARK-1051] On Yarn, executors don't doAs as submitting
>> user - JIRA in.  The pyspark one I would consider more of an enhancement so
>> might not be appropriate for a point release.
>>
>> [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running on YARN - JIRA
>> org.apache.spark.SparkException: env SPARK_YARN_APP_JAR is not set at
>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:49)
>> at org.apache.spark.schedule...
>>
>> [SPARK-1051] On Yarn, executors don't doAs as submitting user - JIRA
>> This means that they can't write/read from files that the yarn user
>> doesn't have permissions to but the submitting user does.
>>
>>
>>
>> On Thursday, March 20, 2014 1:35 PM, Bhaskar Dutta 
>> wrote:
>> It will be great if "SPARK-1101: Umbrella for hardening Spark on YARN" can get into 0.9.1.
>>
>> Thanks,
>> Bhaskar
>>
>>
>> On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
>> wrote:
>>
>>>Hello everyone,
>>>
>>> Since the release of Spark 0.9, we have received a number of important
>>> bug
>>> fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
>>> going to cut a release candidate soon and we would love it if people test
>>> it out. We have backported several bug fixes into the 0.9 and updated
>>> JIRA
>>> accordingly<
>>>
>>> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)

 .
>>>
>>> Please let me know if there are fixes that were not backported but you
>>> would like to see them in 0.9.1.
>>>
>>> Thanks!
>>>
>>> TD
>>>
>


Re: Spark 0.9.1 release

2014-03-24 Thread Kevin Markey

1051 is essential!
I'm not sure about the others, but anything that adds stability to
Spark/YARN would be helpful.

Kevin Markey


On 03/20/2014 01:12 PM, Tom Graves wrote:

I'll pull [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running on 
YARN - JIRA and  [SPARK-1051] On Yarn, executors don't doAs as submitting user 
- JIRA in.  The pyspark one I would consider more of an enhancement so might 
not be appropriate for a point release.

  
[SPARK-1053] Should not require SPARK_YARN_APP_JAR when running on YARN - JIRA
org.apache.spark.SparkException: env SPARK_YARN_APP_JAR is not set at
org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:49)
at org.apache.spark.schedule...

[SPARK-1051] On Yarn, executors don't doAs as submitting user - JIRA
This means that they can't write/read from files that the yarn user doesn't
have permissions to but the submitting user does.
  
  




On Thursday, March 20, 2014 1:35 PM, Bhaskar Dutta  wrote:
  
It will be great if "SPARK-1101: Umbrella for hardening Spark on YARN" can get into 0.9.1.

Thanks,
Bhaskar


On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
wrote:


   Hello everyone,

Since the release of Spark 0.9, we have received a number of important bug
fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
going to cut a release candidate soon and we would love it if people test
it out. We have backported several bug fixes into the 0.9 and updated JIRA
accordingly<
https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)

.

Please let me know if there are fixes that were not backported but you
would like to see them in 0.9.1.

Thanks!

TD





Re: Spark 0.9.1 release

2014-03-24 Thread Evan Chan
Patrick, yes, that is indeed a risk.

On Mon, Mar 24, 2014 at 12:30 AM, Tathagata Das
 wrote:
> Patrick, that is a good point.
>
>
> On Mon, Mar 24, 2014 at 12:14 AM, Patrick Wendell wrote:
>
>> > Spark's dependency graph in a maintenance
>> *Modifying* Spark's dependency graph...
>>



-- 
--
Evan Chan
Staff Engineer
e...@ooyala.com  |


Re: Spark 0.9.1 release

2014-03-24 Thread Tathagata Das
Patrick, that is a good point.


On Mon, Mar 24, 2014 at 12:14 AM, Patrick Wendell wrote:

> > Spark's dependency graph in a maintenance
> *Modifying* Spark's dependency graph...
>


Re: Spark 0.9.1 release

2014-03-24 Thread Patrick Wendell
> Spark's dependency graph in a maintenance
*Modifying* Spark's dependency graph...


Re: Spark 0.9.1 release

2014-03-24 Thread Patrick Wendell
Hey Evan and TD,

Modifying Spark's dependency graph in a maintenance release seems potentially
harmful, especially upgrading a minor version (not just a patch
version) like this. This could affect other downstream users. For
instance, their fastutil dependency now gets bumped without their knowing,
and they could hit some new problem in fastutil 6.5.
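(A downstream project that wants to keep control of that version can pin it explicitly. The snippet below is only an illustrative sketch; the coordinates and version shown are examples rather than values taken from Spark's build.)

    // build.sbt sketch: pin the transitive fastutil version this project was
    // tested against, regardless of what Spark's POM pulls in.
    dependencyOverrides += "it.unimi.dsi" % "fastutil" % "6.4.4"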

- Patrick

On Mon, Mar 24, 2014 at 12:02 AM, Tathagata Das
 wrote:
> @Shivaram, That is a useful patch but I am bit afraid merge it in.
> Randomizing the executor has performance implications, especially for Spark
> Streaming. The non-randomized ordering of allocating machines to tasks was
> subtly helping to speed up certain window-based shuffle operations.  For
> example, corresponding shuffle partitions in multiple shuffles using the
> same partitioner were likely to be co-located, that is, shuffle partition 0
> were likely to be on the same machine for multiple shuffles. While this is
> the not a reliable mechanism to rely on, randomization may lead to
> performance degradation. So I am afraid to merge this one without
> understanding the consequences.
>
> @Evan, I have already cut a release! You can submit the PR and we can merge
> it branch-0.9. If we have to cut another release, then we can include it.
>
>
>
> On Sun, Mar 23, 2014 at 11:42 PM, Evan Chan  wrote:
>
>> I also have a really minor fix for SPARK-1057  (upgrading fastutil),
>> could that also make it in?
>>
>> -Evan
>>
>>
>> On Sun, Mar 23, 2014 at 11:01 PM, Shivaram Venkataraman
>>  wrote:
>> > Sorry this request is coming in a bit late, but would it be possible to
>> > backport SPARK-979[1] to branch-0.9 ? This is the patch for randomizing
>> > executor offers and I would like to use this in a release sooner rather
>> > than later.
>> >
>> > Thanks
>> > Shivaram
>> >
>> > [1]
>> >
>> https://github.com/apache/spark/commit/556c56689bbc32c6cec0d07b57bd3ec73ceb243e#diff-8ef3258646b0e6a4793d6ad99848eacd
>> >
>> >
>> > On Thu, Mar 20, 2014 at 10:18 PM, Bhaskar Dutta 
>> wrote:
>> >
>> >> Thank You! We plan to test out 0.9.1 on YARN once it is out.
>> >>
>> >> Regards,
>> >> Bhaskar
>> >>
>> >> On Fri, Mar 21, 2014 at 12:42 AM, Tom Graves 
>> wrote:
>> >>
>> >> > I'll pull [SPARK-1053] Should not require SPARK_YARN_APP_JAR when
>> running
>> >> > on YARN - JIRA and  [SPARK-1051] On Yarn, executors don't doAs as
>> >> > submitting user - JIRA in.  The pyspark one I would consider more of
>> an
>> >> > enhancement so might not be appropriate for a point release.
>> >> >
>> >> >
>> >> >  [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running on
>> YA...
>> >> > org.apache.spark.SparkException: env SPARK_YARN_APP_JAR is not set at
>> >> >
>> >>
>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:49)
>> >> > at org.apache.spark.schedule...
>> >> > View on spark-project.atlassian.net Preview by Yahoo
>> >> >
>> >> >
>> >> >  [SPARK-1051] On Yarn, executors don't doAs as submitting user - JIRA
>> >> > This means that they can't write/read from files that the yarn user
>> >> > doesn't have permissions to but the submitting user does.
>> >> > View on spark-project.atlassian.net Preview by Yahoo
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Thursday, March 20, 2014 1:35 PM, Bhaskar Dutta
>> >> > wrote:
>> >> >
>> >> > It will be great if
>> >> > "SPARK-1101:
>> >> > Umbrella
>> >> > for hardening Spark on YARN" can get into 0.9.1.
>> >> >
>> >> > Thanks,
>> >> > Bhaskar
>> >> >
>> >> >
>> >> > On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
>> >> > wrote:
>> >> >
>> >> > >  Hello everyone,
>> >> > >
>> >> > > Since the release of Spark 0.9, we have received a number of
>> important
>> >> > bug
>> >> > > fixes and we would like to make a bug-fix release of Spark 0.9.1. We
>> >> are
>> >> > > going to cut a release candidate soon and we would love it if people
>> >> test
>> >> > > it out. We have backported several bug fixes into the 0.9 and
>> updated
>> >> > JIRA
>> >> > > accordingly<
>> >> > >
>> >> >
>> >>
>> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
>> >> > > >.
>> >> > > Please let me know if there are fixes that were not backported but
>> you
>> >> > > would like to see them in 0.9.1.
>> >> > >
>> >> > > Thanks!
>> >> > >
>> >> > > TD
>> >> > >
>> >> >
>> >>
>>
>>
>>
>> --
>> --
>> Evan Chan
>> Staff Engineer
>> e...@ooyala.com  |
>>


Re: Spark 0.9.1 release

2014-03-24 Thread Evan Chan
@Tathagata,  the PR is here:
https://github.com/apache/spark/pull/215

On Mon, Mar 24, 2014 at 12:02 AM, Tathagata Das
 wrote:
> @Shivaram, That is a useful patch but I am bit afraid merge it in.
> Randomizing the executor has performance implications, especially for Spark
> Streaming. The non-randomized ordering of allocating machines to tasks was
> subtly helping to speed up certain window-based shuffle operations.  For
> example, corresponding shuffle partitions in multiple shuffles using the
> same partitioner were likely to be co-located, that is, shuffle partition 0
> were likely to be on the same machine for multiple shuffles. While this is
> the not a reliable mechanism to rely on, randomization may lead to
> performance degradation. So I am afraid to merge this one without
> understanding the consequences.
>
> @Evan, I have already cut a release! You can submit the PR and we can merge
> it branch-0.9. If we have to cut another release, then we can include it.
>
>
>
> On Sun, Mar 23, 2014 at 11:42 PM, Evan Chan  wrote:
>
>> I also have a really minor fix for SPARK-1057  (upgrading fastutil),
>> could that also make it in?
>>
>> -Evan
>>
>>
>> On Sun, Mar 23, 2014 at 11:01 PM, Shivaram Venkataraman
>>  wrote:
>> > Sorry this request is coming in a bit late, but would it be possible to
>> > backport SPARK-979[1] to branch-0.9 ? This is the patch for randomizing
>> > executor offers and I would like to use this in a release sooner rather
>> > than later.
>> >
>> > Thanks
>> > Shivaram
>> >
>> > [1]
>> >
>> https://github.com/apache/spark/commit/556c56689bbc32c6cec0d07b57bd3ec73ceb243e#diff-8ef3258646b0e6a4793d6ad99848eacd
>> >
>> >
>> > On Thu, Mar 20, 2014 at 10:18 PM, Bhaskar Dutta 
>> wrote:
>> >
>> >> Thank You! We plan to test out 0.9.1 on YARN once it is out.
>> >>
>> >> Regards,
>> >> Bhaskar
>> >>
>> >> On Fri, Mar 21, 2014 at 12:42 AM, Tom Graves 
>> wrote:
>> >>
>> >> > I'll pull [SPARK-1053] Should not require SPARK_YARN_APP_JAR when
>> running
>> >> > on YARN - JIRA and  [SPARK-1051] On Yarn, executors don't doAs as
>> >> > submitting user - JIRA in.  The pyspark one I would consider more of
>> an
>> >> > enhancement so might not be appropriate for a point release.
>> >> >
>> >> >
>> >> >  [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running on
>> YA...
>> >> > org.apache.spark.SparkException: env SPARK_YARN_APP_JAR is not set at
>> >> >
>> >>
>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:49)
>> >> > at org.apache.spark.schedule...
>> >> > View on spark-project.atlassian.net Preview by Yahoo
>> >> >
>> >> >
>> >> >  [SPARK-1051] On Yarn, executors don't doAs as submitting user - JIRA
>> >> > This means that they can't write/read from files that the yarn user
>> >> > doesn't have permissions to but the submitting user does.
>> >> > View on spark-project.atlassian.net Preview by Yahoo
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Thursday, March 20, 2014 1:35 PM, Bhaskar Dutta
>> >> > wrote:
>> >> >
>> >> > It will be great if
>> >> > "SPARK-1101:
>> >> > Umbrella
>> >> > for hardening Spark on YARN" can get into 0.9.1.
>> >> >
>> >> > Thanks,
>> >> > Bhaskar
>> >> >
>> >> >
>> >> > On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
>> >> > wrote:
>> >> >
>> >> > >  Hello everyone,
>> >> > >
>> >> > > Since the release of Spark 0.9, we have received a number of
>> important
>> >> > bug
>> >> > > fixes and we would like to make a bug-fix release of Spark 0.9.1. We
>> >> are
>> >> > > going to cut a release candidate soon and we would love it if people
>> >> test
>> >> > > it out. We have backported several bug fixes into the 0.9 and
>> updated
>> >> > JIRA
>> >> > > accordingly<
>> >> > >
>> >> >
>> >>
>> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
>> >> > > >.
>> >> > > Please let me know if there are fixes that were not backported but
>> you
>> >> > > would like to see them in 0.9.1.
>> >> > >
>> >> > > Thanks!
>> >> > >
>> >> > > TD
>> >> > >
>> >> >
>> >>
>>
>>
>>
>> --
>> --
>> Evan Chan
>> Staff Engineer
>> e...@ooyala.com  |
>>



-- 
--
Evan Chan
Staff Engineer
e...@ooyala.com  |


Re: Spark 0.9.1 release

2014-03-24 Thread Tathagata Das
@Shivaram, that is a useful patch, but I am a bit afraid to merge it in.
Randomizing the executor offers has performance implications, especially for Spark
Streaming. The non-randomized ordering of allocating machines to tasks was
subtly helping to speed up certain window-based shuffle operations.  For
example, corresponding shuffle partitions in multiple shuffles using the
same partitioner were likely to be co-located, that is, shuffle partition 0
was likely to be on the same machine for multiple shuffles. While this is
not a reliable mechanism to rely on, randomization may lead to
performance degradation. So I am afraid to merge this one without
understanding the consequences.
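(A rough sketch of the pattern being described, purely for illustration; rddA and rddB stand for any two pair-RDD sources, e.g. RDD[String], and are not from the thread. Two shuffles that share a partitioner are co-partitioned, and with a stable, non-randomized order of executor offers their matching partitions often ended up on the same machines as well.)

    import org.apache.spark.HashPartitioner
    import org.apache.spark.SparkContext._   // pair-RDD operations on Spark 0.9/1.x

    val part = new HashPartitioner(8)
    // Two shuffles that use the same partitioner: partition i of countsA and
    // partition i of countsB hold the same key ranges.
    val countsA = rddA.map(w => (w, 1)).reduceByKey(part, _ + _)
    val countsB = rddB.map(w => (w, 1)).reduceByKey(part, _ + _)
    // Because both sides are already partitioned by `part`, the join needs no
    // additional shuffle; co-location of the matching partitions is what the
    // deterministic offer ordering was incidentally providing.
    val joined = countsA.join(countsB)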

@Evan, I have already cut a release! You can submit the PR and we can merge
it into branch-0.9. If we have to cut another release, then we can include it.



On Sun, Mar 23, 2014 at 11:42 PM, Evan Chan  wrote:

> I also have a really minor fix for SPARK-1057  (upgrading fastutil),
> could that also make it in?
>
> -Evan
>
>
> On Sun, Mar 23, 2014 at 11:01 PM, Shivaram Venkataraman
>  wrote:
> > Sorry this request is coming in a bit late, but would it be possible to
> > backport SPARK-979[1] to branch-0.9 ? This is the patch for randomizing
> > executor offers and I would like to use this in a release sooner rather
> > than later.
> >
> > Thanks
> > Shivaram
> >
> > [1]
> >
> https://github.com/apache/spark/commit/556c56689bbc32c6cec0d07b57bd3ec73ceb243e#diff-8ef3258646b0e6a4793d6ad99848eacd
> >
> >
> > On Thu, Mar 20, 2014 at 10:18 PM, Bhaskar Dutta 
> wrote:
> >
> >> Thank You! We plan to test out 0.9.1 on YARN once it is out.
> >>
> >> Regards,
> >> Bhaskar
> >>
> >> On Fri, Mar 21, 2014 at 12:42 AM, Tom Graves 
> wrote:
> >>
> >> > I'll pull [SPARK-1053] Should not require SPARK_YARN_APP_JAR when
> running
> >> > on YARN - JIRA and  [SPARK-1051] On Yarn, executors don't doAs as
> >> > submitting user - JIRA in.  The pyspark one I would consider more of
> an
> >> > enhancement so might not be appropriate for a point release.
> >> >
> >> >
> >> >  [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running on
> YA...
> >> > org.apache.spark.SparkException: env SPARK_YARN_APP_JAR is not set at
> >> >
> >>
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:49)
> >> > at org.apache.spark.schedule...
> >> > View on spark-project.atlassian.net Preview by Yahoo
> >> >
> >> >
> >> >  [SPARK-1051] On Yarn, executors don't doAs as submitting user - JIRA
> >> > This means that they can't write/read from files that the yarn user
> >> > doesn't have permissions to but the submitting user does.
> >> > View on spark-project.atlassian.net Preview by Yahoo
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On Thursday, March 20, 2014 1:35 PM, Bhaskar Dutta
> >> > wrote:
> >> >
> >> > It will be great if
> >> > "SPARK-1101:
> >> > Umbrella
> >> > for hardening Spark on YARN" can get into 0.9.1.
> >> >
> >> > Thanks,
> >> > Bhaskar
> >> >
> >> >
> >> > On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
> >> > wrote:
> >> >
> >> > >  Hello everyone,
> >> > >
> >> > > Since the release of Spark 0.9, we have received a number of
> important
> >> > bug
> >> > > fixes and we would like to make a bug-fix release of Spark 0.9.1. We
> >> are
> >> > > going to cut a release candidate soon and we would love it if people
> >> test
> >> > > it out. We have backported several bug fixes into the 0.9 and
> updated
> >> > JIRA
> >> > > accordingly<
> >> > >
> >> >
> >>
> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
> >> > > >.
> >> > > Please let me know if there are fixes that were not backported but
> you
> >> > > would like to see them in 0.9.1.
> >> > >
> >> > > Thanks!
> >> > >
> >> > > TD
> >> > >
> >> >
> >>
>
>
>
> --
> --
> Evan Chan
> Staff Engineer
> e...@ooyala.com  |
>


Re: Spark 0.9.1 release

2014-03-23 Thread Evan Chan
I also have a really minor fix for SPARK-1057  (upgrading fastutil),
could that also make it in?

-Evan


On Sun, Mar 23, 2014 at 11:01 PM, Shivaram Venkataraman
 wrote:
> Sorry this request is coming in a bit late, but would it be possible to
> backport SPARK-979[1] to branch-0.9 ? This is the patch for randomizing
> executor offers and I would like to use this in a release sooner rather
> than later.
>
> Thanks
> Shivaram
>
> [1]
> https://github.com/apache/spark/commit/556c56689bbc32c6cec0d07b57bd3ec73ceb243e#diff-8ef3258646b0e6a4793d6ad99848eacd
>
>
> On Thu, Mar 20, 2014 at 10:18 PM, Bhaskar Dutta  wrote:
>
>> Thank You! We plan to test out 0.9.1 on YARN once it is out.
>>
>> Regards,
>> Bhaskar
>>
>> On Fri, Mar 21, 2014 at 12:42 AM, Tom Graves  wrote:
>>
>> > I'll pull [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running
>> > on YARN - JIRA and  [SPARK-1051] On Yarn, executors don't doAs as
>> > submitting user - JIRA in.  The pyspark one I would consider more of an
>> > enhancement so might not be appropriate for a point release.
>> >
>> >
>> >  [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running on YA...
>> > org.apache.spark.SparkException: env SPARK_YARN_APP_JAR is not set at
>> >
>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:49)
>> > at org.apache.spark.schedule...
>> > View on spark-project.atlassian.net Preview by Yahoo
>> >
>> >
>> >  [SPARK-1051] On Yarn, executors don't doAs as submitting user - JIRA
>> > This means that they can't write/read from files that the yarn user
>> > doesn't have permissions to but the submitting user does.
>> > View on spark-project.atlassian.net Preview by Yahoo
>> >
>> >
>> >
>> >
>> >
>> > On Thursday, March 20, 2014 1:35 PM, Bhaskar Dutta 
>> > wrote:
>> >
>> > It will be great if
>> > "SPARK-1101:
>> > Umbrella
>> > for hardening Spark on YARN" can get into 0.9.1.
>> >
>> > Thanks,
>> > Bhaskar
>> >
>> >
>> > On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
>> > wrote:
>> >
>> > >  Hello everyone,
>> > >
>> > > Since the release of Spark 0.9, we have received a number of important
>> > bug
>> > > fixes and we would like to make a bug-fix release of Spark 0.9.1. We
>> are
>> > > going to cut a release candidate soon and we would love it if people
>> test
>> > > it out. We have backported several bug fixes into the 0.9 and updated
>> > JIRA
>> > > accordingly<
>> > >
>> >
>> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
>> > > >.
>> > > Please let me know if there are fixes that were not backported but you
>> > > would like to see them in 0.9.1.
>> > >
>> > > Thanks!
>> > >
>> > > TD
>> > >
>> >
>>



-- 
--
Evan Chan
Staff Engineer
e...@ooyala.com  |


Re: Spark 0.9.1 release

2014-03-23 Thread Shivaram Venkataraman
Sorry this request is coming in a bit late, but would it be possible to
backport SPARK-979 [1] to branch-0.9? This is the patch for randomizing
executor offers, and I would like to use this in a release sooner rather
than later.

Thanks
Shivaram

[1]
https://github.com/apache/spark/commit/556c56689bbc32c6cec0d07b57bd3ec73ceb243e#diff-8ef3258646b0e6a4793d6ad99848eacd


On Thu, Mar 20, 2014 at 10:18 PM, Bhaskar Dutta  wrote:

> Thank You! We plan to test out 0.9.1 on YARN once it is out.
>
> Regards,
> Bhaskar
>
> On Fri, Mar 21, 2014 at 12:42 AM, Tom Graves  wrote:
>
> > I'll pull [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running
> > on YARN - JIRA and  [SPARK-1051] On Yarn, executors don't doAs as
> > submitting user - JIRA in.  The pyspark one I would consider more of an
> > enhancement so might not be appropriate for a point release.
> >
> >
> >  [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running on YA...
> > org.apache.spark.SparkException: env SPARK_YARN_APP_JAR is not set at
> >
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:49)
> > at org.apache.spark.schedule...
> > View on spark-project.atlassian.net Preview by Yahoo
> >
> >
> >  [SPARK-1051] On Yarn, executors don't doAs as submitting user - JIRA
> > This means that they can't write/read from files that the yarn user
> > doesn't have permissions to but the submitting user does.
> > View on spark-project.atlassian.net Preview by Yahoo
> >
> >
> >
> >
> >
> > On Thursday, March 20, 2014 1:35 PM, Bhaskar Dutta 
> > wrote:
> >
> > It will be great if
> > "SPARK-1101:
> > Umbrella
> > for hardening Spark on YARN" can get into 0.9.1.
> >
> > Thanks,
> > Bhaskar
> >
> >
> > On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
> > wrote:
> >
> > >  Hello everyone,
> > >
> > > Since the release of Spark 0.9, we have received a number of important
> > bug
> > > fixes and we would like to make a bug-fix release of Spark 0.9.1. We
> are
> > > going to cut a release candidate soon and we would love it if people
> test
> > > it out. We have backported several bug fixes into the 0.9 and updated
> > JIRA
> > > accordingly<
> > >
> >
> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
> > > >.
> > > Please let me know if there are fixes that were not backported but you
> > > would like to see them in 0.9.1.
> > >
> > > Thanks!
> > >
> > > TD
> > >
> >
>


Re: Spark 0.9.1 release

2014-03-20 Thread Bhaskar Dutta
Thank you! We plan to test out 0.9.1 on YARN once it is out.

Regards,
Bhaskar

On Fri, Mar 21, 2014 at 12:42 AM, Tom Graves  wrote:

> I'll pull [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running
> on YARN - JIRA and  [SPARK-1051] On Yarn, executors don't doAs as
> submitting user - JIRA in.  The pyspark one I would consider more of an
> enhancement so might not be appropriate for a point release.
>
>
>  [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running on YA...
> org.apache.spark.SparkException: env SPARK_YARN_APP_JAR is not set at
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:49)
> at org.apache.spark.schedule...
>
>
>  [SPARK-1051] On Yarn, executors don't doAs as submitting user - JIRA
> This means that they can't write/read from files that the yarn user
> doesn't have permissions to but the submitting user does.
>
>
>
>
>
> On Thursday, March 20, 2014 1:35 PM, Bhaskar Dutta 
> wrote:
>
> It will be great if
> "SPARK-1101:
> Umbrella
> for hardening Spark on YARN" can get into 0.9.1.
>
> Thanks,
> Bhaskar
>
>
> On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
> wrote:
>
> >  Hello everyone,
> >
> > Since the release of Spark 0.9, we have received a number of important
> bug
> > fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
> > going to cut a release candidate soon and we would love it if people test
> > it out. We have backported several bug fixes into the 0.9 and updated
> JIRA
> > accordingly<
> >
> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
> > >.
> > Please let me know if there are fixes that were not backported but you
> > would like to see them in 0.9.1.
> >
> > Thanks!
> >
> > TD
> >
>


Re: Spark 0.9.1 release

2014-03-20 Thread Patrick Wendell
Thanks Tom,

After looking at this patch more closely, I don't see how it could have
regressed behavior for any users (it seems to pertain only to warnings and
instructions). So maybe the user mistook this patch for a different issue.

https://github.com/apache/incubator-spark/pull/553/files

- Patrick

On Thu, Mar 20, 2014 at 2:06 PM, Tom Graves  wrote:
> Thanks for the heads up, saw that and will make sure that is resolved before
> pulling into 0.9.  Unless I'm missing something, they should just use
> sc.addJar to distribute the jar rather than relying on SPARK_YARN_APP_JAR.
>
> Tom
>
>
>
> On Thursday, March 20, 2014 3:31 PM, Patrick Wendell  
> wrote:
>
> Hey Tom,
>
>> I'll pull [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running on 
>> YARN - JIRA and  [SPARK-1051] On Yarn, executors don't doAs as submitting 
>> user - JIRA in.  The pyspark one I would consider more of an enhancement so 
>> might not be appropriate for a point release.
>
> Someone recently sent me a personal e-mail reporting some problems
> with this. I'll ask them to forward it to you/the dev list. Might be
> worth looking into before merging.
>
>
>>  [SPARK-1051] On Yarn, executors don't doAs as submitting user - JIRA
>> This means that they can't write/read from files that the yarn user doesn't 
>> have permissions to but the submitting user does.
>
> Good call on this one.
>
> - Patrick


Re: Spark 0.9.1 release

2014-03-20 Thread Tom Graves
Thanks for the heads up, saw that and will make sure that is resolved before
pulling into 0.9.  Unless I'm missing something, they should just use sc.addJar
to distribute the jar rather than relying on SPARK_YARN_APP_JAR.
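
As a rough sketch, the addJar approach looks something like the following
(the jar path and app name are placeholders, and this only illustrates the
idea, not the patch itself):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("yarn-client-example")   // placeholder app name
      .setMaster("yarn-client")
    val sc = new SparkContext(conf)

    // Ship the application jar to the cluster explicitly, instead of
    // relying on the SPARK_YARN_APP_JAR environment variable.
    sc.addJar("hdfs:///user/example/app.jar")   // placeholder path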

Tom



On Thursday, March 20, 2014 3:31 PM, Patrick Wendell  wrote:
 
Hey Tom,

> I'll pull [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running on 
> YARN - JIRA and  [SPARK-1051] On Yarn, executors don't doAs as submitting 
> user - JIRA in.  The pyspark one I would consider more of an enhancement so 
> might not be appropriate for a point release.

Someone recently sent me a personal e-mail reporting some problems
with this. I'll ask them to forward it to you/the dev list. Might be
worth looking into before merging.


>  [SPARK-1051] On Yarn, executors don't doAs as submitting user - JIRA
> This means that they can't write/read from files that the yarn user doesn't 
> have permissions to but the submitting user does.

Good call on this one.

- Patrick

Re: Spark 0.9.1 release

2014-03-20 Thread Patrick Wendell
Hey Tom,

> I'll pull [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running on 
> YARN - JIRA and  [SPARK-1051] On Yarn, executors don't doAs as submitting 
> user - JIRA in.  The pyspark one I would consider more of an enhancement so 
> might not be appropriate for a point release.

Someone recently sent me a personal e-mail reporting some problems
with this. I'll ask them to forward it to you/the dev list. Might be
worth looking into before merging.

>  [SPARK-1051] On Yarn, executors don't doAs as submitting user - JIRA
> This means that they can't write/read from files that the yarn user doesn't 
> have permissions to but the submitting user does.

Good call on this one.

- Patrick


Re: Spark 0.9.1 release

2014-03-20 Thread Tom Graves
I'll pull [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running on 
YARN - JIRA and  [SPARK-1051] On Yarn, executors don't doAs as submitting user 
- JIRA in.  The pyspark one I would consider more of an enhancement so might 
not be appropriate for a point release. 

 
 [SPARK-1053] Should not require SPARK_YARN_APP_JAR when running on YA...
org.apache.spark.SparkException: env SPARK_YARN_APP_JAR is not set at 
org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:49)
 at org.apache.spark.schedule...  
 
 
 [SPARK-1051] On Yarn, executors don't doAs as submitting user - JIRA
This means that they can't write/read from files that the yarn user doesn't 
have permissions to but the submitting user does.   
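
For reference, the usual Hadoop-side idiom for doing file access as the
submitting user looks roughly like the sketch below (the user name and path
are placeholders, and this shows the general doAs pattern, not the actual
fix):

    import java.security.PrivilegedExceptionAction
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.hadoop.security.UserGroupInformation

    // Run as the submitting user ("alice" is a placeholder).
    val ugi = UserGroupInformation.createRemoteUser("alice")

    val readable: Boolean = ugi.doAs(new PrivilegedExceptionAction[Boolean] {
      def run(): Boolean = {
        val fs = FileSystem.get(new Configuration())
        // Permission checks now apply to the submitting user, not the yarn user.
        fs.exists(new Path("/user/alice/data"))   // placeholder path
      }
    })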
 
 



On Thursday, March 20, 2014 1:35 PM, Bhaskar Dutta  wrote:
 
It will be great if
"SPARK-1101:
Umbrella
for hardening Spark on YARN" can get into 0.9.1.

Thanks,
Bhaskar


On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
wrote:

>  Hello everyone,
>
> Since the release of Spark 0.9, we have received a number of important bug
> fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
> going to cut a release candidate soon and we would love it if people test
> it out. We have backported several bug fixes into the 0.9 and updated JIRA
> accordingly<
> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
> >.
> Please let me know if there are fixes that were not backported but you
> would like to see them in 0.9.1.
>
> Thanks!
>
> TD
>

Re: Spark 0.9.1 release

2014-03-20 Thread Bhaskar Dutta
It will be great if
"SPARK-1101:
Umbrella
for hardening Spark on YARN" can get into 0.9.1.

Thanks,
Bhaskar

On Thu, Mar 20, 2014 at 5:37 AM, Tathagata Das
wrote:

>  Hello everyone,
>
> Since the release of Spark 0.9, we have received a number of important bug
> fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
> going to cut a release candidate soon and we would love it if people test
> it out. We have backported several bug fixes into the 0.9 and updated JIRA
> accordingly<
> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
> >.
> Please let me know if there are fixes that were not backported but you
> would like to see them in 0.9.1.
>
> Thanks!
>
> TD
>


Re: Spark 0.9.1 release

2014-03-19 Thread Mridul Muralidharan
If 1.0 is just around the corner, then it is fair enough to push it to that
release. Thanks for clarifying!

Regards,
Mridul

On Wed, Mar 19, 2014 at 6:12 PM, Tathagata Das
 wrote:
> I agree that the garbage collection PR would make things very convenient in
> a lot of use cases. However, there are two broad reasons why it is hard for
> that PR to get into 0.9.1.
> 1. The PR still needs some amount of work and quite a lot of testing. While
> we enable RDD and shuffle cleanup based on Java GC, its behavior in real
> workloads still needs to be understood (especially since it is tied to the
> Spark driver's garbage collection behavior).
> 2. This actually changes some of the semantic behavior of Spark and should
> not be included in a bug-fix release. The PR will definitely be present in
> Spark 1.0, which is expected to be released around the end of April (not too
> far ;) ).
>
> TD
>
>
> On Wed, Mar 19, 2014 at 5:57 PM, Mridul Muralidharan wrote:
>
>> Would be great if the garbage collection PR is also committed - if not
>> the whole thing, at least the part to unpersist broadcast variables
>> explicitly would be great.
>> Currently we are running with a custom impl which does something
>> similar, and I would like to move to standard distribution for that.
>>
>>
>> Thanks,
>> Mridul
>>
>>
>> On Wed, Mar 19, 2014 at 5:07 PM, Tathagata Das
>>  wrote:
>> >  Hello everyone,
>> >
>> > Since the release of Spark 0.9, we have received a number of important
>> bug
>> > fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
>> > going to cut a release candidate soon and we would love it if people test
>> > it out. We have backported several bug fixes into the 0.9 and updated
>> JIRA
>> > accordingly<
>> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
>> >.
>> > Please let me know if there are fixes that were not backported but you
>> > would like to see them in 0.9.1.
>> >
>> > Thanks!
>> >
>> > TD
>>


Re: Spark 0.9.1 release

2014-03-19 Thread Tathagata Das
I agree that the garbage collection PR would make things very convenient in a
lot of use cases. However, there are two broad reasons why it is hard for that
PR to get into 0.9.1.
1. The PR still needs some amount of work and quite a lot of testing. While
we enable RDD and shuffle cleanup based on Java GC, its behavior in real
workloads still needs to be understood (especially since it is tied to the
Spark driver's garbage collection behavior).
2. This actually changes some of the semantic behavior of Spark and should
not be included in a bug-fix release. The PR will definitely be present in
Spark 1.0, which is expected to be released around the end of April (not too
far ;) ).
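
For anyone curious about the mechanism, below is a minimal, conceptual sketch
of GC-driven cleanup using weak references. All names are made up for
illustration; this is not the PR's actual code:

    import java.lang.ref.{Reference, ReferenceQueue, WeakReference}
    import scala.collection.concurrent.TrieMap

    class CleanupTracker {
      private val queue = new ReferenceQueue[AnyRef]()
      // Maps each weak reference to the id of the resource it guards.
      private val pending = TrieMap.empty[WeakReference[AnyRef], Long]

      // Track an object; only a weak reference is held, so normal driver-side
      // GC can still collect the object itself.
      def register(obj: AnyRef, resourceId: Long): Unit = {
        pending.put(new WeakReference[AnyRef](obj, queue), resourceId)
      }

      // A reference surfacing on the queue means the tracked object has been
      // garbage collected, so the matching remote state can be released.
      def drainAndClean(clean: Long => Unit): Unit = {
        var ref: Reference[_ <: AnyRef] = queue.poll()
        while (ref != null) {
          pending.remove(ref.asInstanceOf[WeakReference[AnyRef]]).foreach(clean)
          ref = queue.poll()
        }
      }
    }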

TD


On Wed, Mar 19, 2014 at 5:57 PM, Mridul Muralidharan wrote:

> Would be great if the garbage collection PR is also committed - if not
> the whole thing, at least the part to unpersist broadcast variables
> explicitly would be great.
> Currently we are running with a custom impl which does something
> similar, and I would like to move to standard distribution for that.
>
>
> Thanks,
> Mridul
>
>
> On Wed, Mar 19, 2014 at 5:07 PM, Tathagata Das
>  wrote:
> >  Hello everyone,
> >
> > Since the release of Spark 0.9, we have received a number of important
> bug
> > fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
> > going to cut a release candidate soon and we would love it if people test
> > it out. We have backported several bug fixes into the 0.9 and updated
> JIRA
> > accordingly<
> https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)
> >.
> > Please let me know if there are fixes that were not backported but you
> > would like to see them in 0.9.1.
> >
> > Thanks!
> >
> > TD
>


Re: Spark 0.9.1 release

2014-03-19 Thread Mridul Muralidharan
Would be great if the garbage collection PR is also committed - if not
the whole thing, at least the part to unpersist broadcast variables
explicitly would be great.
Currently we are running with a custom impl which does something
similar, and I would like to move to standard distribution for that.
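
As a concrete sketch, the explicit cleanup would look roughly like the
following once the API lands (the unpersist() call is the new part and is not
available in 0.9.x; the rest is an ordinary job):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(
      new SparkConf().setAppName("broadcast-cleanup").setMaster("local[2]"))

    val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))
    val total = sc.parallelize(Seq("a", "b", "a"))
      .map(k => lookup.value(k))
      .reduce(_ + _)

    // Release the broadcast's blocks explicitly once no further jobs need it,
    // instead of waiting for driver-side GC.
    lookup.unpersist()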


Thanks,
Mridul


On Wed, Mar 19, 2014 at 5:07 PM, Tathagata Das
 wrote:
>  Hello everyone,
>
> Since the release of Spark 0.9, we have received a number of important bug
> fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
> going to cut a release candidate soon and we would love it if people test
> it out. We have backported several bug fixes into the 0.9 and updated JIRA
> accordingly.
> Please let me know if there are fixes that were not backported but you
> would like to see them in 0.9.1.
>
> Thanks!
>
> TD


Spark 0.9.1 release

2014-03-19 Thread Tathagata Das
 Hello everyone,

Since the release of Spark 0.9, we have received a number of important bug
fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
going to cut a release candidate soon and we would love it if people test
it out. We have backported several bug fixes into the 0.9 and updated JIRA
accordingly.
Please let me know if there are fixes that were not backported but you
would like to see them in 0.9.1.

Thanks!

TD