[ANNOUNCE] Spark 1.2.0 Release Preview Posted

2014-11-17 Thread Patrick Wendell
Hi All,

I've just posted a preview of the Spark 1.2.0 release for community
regression testing.

Issues reported now will get close attention, so please help us test!
You can help by running an existing Spark 1.X workload on this build and
reporting any regressions. As we start voting, the bar for reported
issues to hold the release will get higher and higher, so test early!

The tag for this preview is v1.2.0-snapshot1 (commit 38c1fbd96).

The release files, including signatures, digests, etc., can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-snapshot1

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1038/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-1.2.0-snapshot1-docs/

== Notes ==
- Maven artifacts are published for both Scala 2.10 and 2.11. Binary
distributions are not posted for Scala 2.11 yet, but will be posted
soon.

- There are two significant changes to config defaults that users may want
to revert if doing A/B testing against older versions:

"spark.shuffle.manager" default has changed to "sort" (was "hash")
"spark.shuffle.blockTransferService" default has changed to "netty" (was "nio")
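For A/B comparisons, the old defaults can be restored per-job, e.g. via
spark-submit conf flags (a sketch; the application class and jar name below
are placeholders):

```shell
# Run a workload on 1.2.0 with the pre-1.2 shuffle defaults restored,
# to check whether a regression is caused by the new defaults.
spark-submit \
  --conf spark.shuffle.manager=hash \
  --conf spark.shuffle.blockTransferService=nio \
  --class com.example.MyWorkload \
  my-workload.jar
```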

- This release contains a shuffle service for YARN. The jar is
present in all Hadoop 2.X binary packages as
"lib/spark-1.2.0-yarn-shuffle.jar"
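To actually enable it, NodeManagers would need that jar on their classpath
plus an aux-service entry along these lines (property names as in the
Spark-on-YARN docs; treat this as a sketch, not a verified cluster config):

```xml
<!-- yarn-site.xml: register Spark's shuffle service as a NodeManager aux service -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```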

Cheers,
Patrick

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: [ANNOUNCE] Spark 1.2.0 Release Preview Posted

2014-11-20 Thread Madhu
Thanks Patrick.

I've been testing some 1.2 features; they look good so far.
I have some example code that I think will be helpful for certain MR-style
use cases (secondary sort).
Can I still add that to the 1.2 documentation, or is that frozen at this
point?



--
Madhu
https://www.linkedin.com/in/msiddalingaiah
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/ANNOUNCE-Spark-1-2-0-Release-Preview-Posted-tp9400p9449.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.




Re: [ANNOUNCE] Spark 1.2.0 Release Preview Posted

2014-11-20 Thread Corey Nolet
I was actually about to post this myself - I have a complex join that could
benefit from something like a GroupComparator vs. having to do multiple
groupBy operations. This is probably the wrong thread for a full discussion
on this, but I didn't see a JIRA ticket for this or anything similar - any
reason why this would not make sense given Spark's design?
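(For anyone following along: the MR-style secondary-sort pattern in question
can be sketched with a local emulation. This is illustrative only - made-up
Event fields, plain collections, not Spark API code - of what
repartitionAndSortWithinPartitions plus a primary-key Partitioner achieves in
1.2: each primary-key group arrives contiguous and secondary-sorted within its
partition, with no per-group groupBy.)

```scala
// Illustrative local emulation of secondary sort (not Spark API code):
// bucket records by a primary key, then sort each bucket by the composite
// (primary, secondary) key, so groups are contiguous and internally ordered.
case class Event(user: String, ts: Long, payload: String)

val numPartitions = 2
def partitionOf(user: String): Int = math.abs(user.hashCode) % numPartitions

val raw = Seq(
  Event("bob", 3L, "c"), Event("ann", 2L, "b"),
  Event("bob", 1L, "a"), Event("ann", 1L, "x"))

// "Shuffle" step: hash-partition on the primary key...
val partitions: Map[Int, Seq[Event]] =
  raw.groupBy(e => partitionOf(e.user))
    // ...then sort within each partition by the composite key.
    .map { case (p, evs) => p -> evs.sortBy(e => (e.user, e.ts)) }
```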

On Thu, Nov 20, 2014 at 9:39 AM, Madhu  wrote:

> Thanks Patrick.
>
> I've been testing some 1.2 features, looks good so far.
> I have some example code that I think will be helpful for certain MR-style
> use cases (secondary sort).
> Can I still add that to the 1.2 documentation, or is that frozen at this
> point?


Re: [ANNOUNCE] Spark 1.2.0 Release Preview Posted

2014-11-20 Thread Nan Zhu
BTW, this PR https://github.com/apache/spark/pull/2524 is related to a
blocker-level bug, and it is actually close to being merged (it has been
reviewed for several rounds).

I would appreciate it if anyone could continue the review process.

@mateiz

-- 
Nan Zhu
http://codingcat.me


On Thursday, November 20, 2014 at 10:17 AM, Corey Nolet wrote:

> I was actually about to post this myself- I have a complex join that could
> benefit from something like a GroupComparator vs having to do multiple
> groupBy operations. This is probably the wrong thread for a full discussion
> on this but I didn't see a JIRA ticket for this or anything similar- any
> reasons why this would not make sense given Spark's design?



Re: [ANNOUNCE] Spark 1.2.0 Release Preview Posted

2014-11-20 Thread Hector Yee
I'm seeing a lot of lost tasks with this build on a large Mesos cluster.
It happens with both hash and sort shuffles.

14/11/20 18:08:38 WARN TaskSetManager: Lost task 9.1 in stage 1.0 (TID 897, i-d4d6553a.inst.aws.airbnb.com): FetchFailed(null, shuffleId=1, mapId=-1, reduceId=9, message=
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 1
    at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:386)
    at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:383)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
    at org.apache.spark.MapOutputTracker$.org$apache$spark$MapOutputTracker$$convertMapStatuses(MapOutputTracker.scala:382)
    at org.apache.spark.MapOutputTracker.getServerStatuses(MapOutputTracker.scala:178)
    at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$.fetch(BlockStoreShuffleFetcher.scala:42)
    at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:40)
    at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:92)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)


On Thu, Nov 20, 2014 at 7:42 AM, Nan Zhu  wrote:

> BTW, this PR https://github.com/apache/spark/pull/2524 is related to a
> blocker level bug,
>
> and this is actually close to be merged (have been reviewed for several
> rounds)
>
> I would appreciate it if anyone could continue the review process.
>
> @mateiz
>
> --
> Nan Zhu
> http://codingcat.me


-- 
Yee Yang Li Hector
google.com/+HectorYee


Re: [ANNOUNCE] Spark 1.2.0 Release Preview Posted

2014-11-20 Thread Matei Zaharia
You can still send patches for docs until the release goes out -- please do if 
you see stuff.

Matei

> On Nov 20, 2014, at 6:39 AM, Madhu  wrote:
> 
> Thanks Patrick.
> 
> I've been testing some 1.2 features, looks good so far.
> I have some example code that I think will be helpful for certain MR-style
> use cases (secondary sort).
> Can I still add that to the 1.2 documentation, or is that frozen at this
> point?





Re: [ANNOUNCE] Spark 1.2.0 Release Preview Posted

2014-11-20 Thread Nishkam Ravi
Seeing issues with sort-based shuffle (OOM errors and a memory leak):
https://issues.apache.org/jira/browse/SPARK-4515.

Good performance gains for TeraSort compared to hash-based shuffle (as expected).

Thanks,
Nishkam


On Thu, Nov 20, 2014 at 11:20 AM, Matei Zaharia 
wrote:

> You can still send patches for docs until the release goes out -- please
> do if you see stuff.
>
> Matei