Re: Spark 1.6.1

2016-02-22 Thread Reynold Xin
Yes, we don't want to clutter maven central. The staging repo is included in the release candidate voting thread. See the following for an example: http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-6-0-RC1-td15424.html On Mon, Feb 22, 2016 at 11:37 PM, Romi

Re: Spark 1.6.1

2016-02-22 Thread Romi Kuntsman
Sounds fair. Is it to avoid cluttering maven central with too many intermediate versions? What do I need to add in my pom.xml to make it work? *Romi Kuntsman*, *Big Data Engineer* http://www.totango.com On Tue, Feb 23, 2016 at 9:34 AM, Reynold Xin wrote: > We
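
(A minimal sketch of the kind of pom.xml <repositories> entry being asked about, assuming the usual ASF Nexus staging layout; the repository id and the orgapachespark-XXXX path below are placeholders, since each RC vote thread links its own staging repository.)

  <repositories>
    <repository>
      <!-- Placeholder id; point the URL at the staging repository linked from the RC vote thread. -->
      <id>spark-rc-staging</id>
      <url>https://repository.apache.org/content/repositories/orgapachespark-XXXX/</url>
      <releases><enabled>true</enabled></releases>
      <snapshots><enabled>false</enabled></snapshots>
    </repository>
  </repositories>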

Re: Spark 1.6.1

2016-02-22 Thread Reynold Xin
We usually publish to a staging maven repo hosted by the ASF (not maven central). On Mon, Feb 22, 2016 at 11:32 PM, Romi Kuntsman wrote: > Is it possible to make RC versions available via Maven? (many projects do > that) > That will make integration much easier, so many more

Re: Spark 1.6.1

2016-02-22 Thread Romi Kuntsman
Is it possible to make RC versions available via Maven? (many projects do that) That will make integration much easier, so many more people can test the version before the final release. Thanks! *Romi Kuntsman*, *Big Data Engineer* http://www.totango.com On Tue, Feb 23, 2016 at 8:07 AM, Luciano

Re: Spark 1.6.1

2016-02-22 Thread Luciano Resende
On Mon, Feb 22, 2016 at 9:08 PM, Michael Armbrust wrote: > An update: people.apache.org has been shut down so the release scripts > are broken. Will try again after we fix them. > > If you skip uploading to people.a.o, it should still be available in nexus for review.

Re: Spark 1.6.1

2016-02-22 Thread Michael Armbrust
An update: people.apache.org has been shut down so the release scripts are broken. Will try again after we fix them. On Mon, Feb 22, 2016 at 6:28 PM, Michael Armbrust wrote: > I've kicked off the build. Please be extra careful about merging into > branch-1.6 until after

Re: Opening a JIRA for QuantileDiscretizer bug

2016-02-22 Thread Ted Yu
When you click on Create, you're brought to the 'Create Issue' dialog, where you choose Project Spark. The Component should be MLlib. Please see also: http://search-hadoop.com/m/q3RTtmsshe1W6cH22/spark+pull+template=pull+request+template On Mon, Feb 22, 2016 at 6:45 PM, Pierson, Oliver C

Opening a JIRA for QuantileDiscretizer bug

2016-02-22 Thread Pierson, Oliver C
Hello, I've discovered a bug in the QuantileDiscretizer estimator. Specifically, for large DataFrames QuantileDiscretizer will only create one split (i.e. two bins). The error happens in lines 113 and 114 of QuantileDiscretizer.scala: val requiredSamples = math.max(numBins * numBins,
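
(A short Scala sketch of the integer-division pitfall this report appears to describe; it is a simplified stand-in, not the actual QuantileDiscretizer source, and the 10000 sample floor is a placeholder constant.)

  object QuantileSamplingSketch {
    // Mirrors the shape of the sampling computation described above.
    def buggyFraction(numBins: Int, datasetCount: Long): Double = {
      val requiredSamples = math.max(numBins * numBins, 10000)
      // Int/Long division truncates to 0 whenever datasetCount > requiredSamples,
      // so almost nothing is sampled and only a single split (two bins) is produced.
      math.min(requiredSamples / datasetCount, 1.0)
    }

    def fixedFraction(numBins: Int, datasetCount: Long): Double = {
      val requiredSamples = math.max(numBins * numBins, 10000)
      // Converting to Double before dividing preserves the intended sampling fraction.
      math.min(requiredSamples.toDouble / datasetCount, 1.0)
    }

    def main(args: Array[String]): Unit = {
      println(buggyFraction(10, 100000000L)) // 0.0
      println(fixedFraction(10, 100000000L)) // 1.0E-4
    }
  }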

Re: Spark 1.6.1

2016-02-22 Thread Michael Armbrust
I've kicked off the build. Please be extra careful about merging into branch-1.6 until after the release. On Mon, Feb 22, 2016 at 10:24 AM, Michael Armbrust wrote: > I will cut the RC today. Sorry for the delay! > > On Mon, Feb 22, 2016 at 5:19 AM, Patrick Woody

Re: Spark not able to fetch events from Amazon Kinesis

2016-02-22 Thread Yash Sharma
Answering my own question - I have had some success with the Spark Kinesis integration, and the key was unionStreams.foreachRDD. There are 2 versions of foreachRDD available - unionStreams.foreachRDD - unionStreams.foreachRDD ((rdd: RDD[Array[Byte]], time: Time) For some reason the first
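
(A minimal Scala sketch of the two DStream.foreachRDD overloads being compared above; the Kinesis stream setup is omitted, and `unionStreams` here stands for any DStream[Array[Byte]].)

  import org.apache.spark.rdd.RDD
  import org.apache.spark.streaming.Time
  import org.apache.spark.streaming.dstream.DStream

  object ForeachRDDSketch {
    def process(unionStreams: DStream[Array[Byte]]): Unit = {
      // Overload 1: the function receives only the batch RDD.
      unionStreams.foreachRDD { rdd: RDD[Array[Byte]] =>
        println(s"records in batch: ${rdd.count()}")
      }

      // Overload 2: the function also receives the batch Time.
      unionStreams.foreachRDD { (rdd: RDD[Array[Byte]], time: Time) =>
        println(s"batch at $time had ${rdd.count()} records")
      }
    }
  }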

Re: Spark 1.6.1

2016-02-22 Thread Michael Armbrust
I will cut the RC today. Sorry for the delay! On Mon, Feb 22, 2016 at 5:19 AM, Patrick Woody wrote: > Hey Michael, > > Any update on a first cut of the RC? > > Thanks! > -Pat > > On Mon, Feb 15, 2016 at 6:50 PM, Michael Armbrust > wrote: > >>

Re: Re: a new FileFormat 5x~100x faster than parquet

2016-02-22 Thread Ted Yu
The referenced benchmark is in Chinese. Please provide an English version so that more people can understand it. For item 7, it looks like the speed of ingest is much slower compared to using Parquet. Cheers On Mon, Feb 22, 2016 at 6:12 AM, 开心延年 wrote: > 1.ya100 is not only the

Re: Re: a new FileFormat 5x~100x faster than parquet

2016-02-22 Thread 开心延年
1. ya100 is not only the inverted index, but also includes TOP N sort lazy read, and also includes label. 2. Our test of ya100 and parquet is at this link address https://github.com/ycloudnet/ya100/blob/master/v1.0.8/ya100%E6%80%A7%E8%83%BD%E6%B5%8B%E8%AF%95%E6%8A%A5%E5%91%8A.docx?raw=true 3. you are

Re: Spark 1.6.1

2016-02-22 Thread Patrick Woody
Hey Michael, Any update on a first cut of the RC? Thanks! -Pat On Mon, Feb 15, 2016 at 6:50 PM, Michael Armbrust wrote: > I'm not going to be able to do anything until after the Spark Summit, but > I will kick off RC1 after that (end of week). Get your patches in

Builds are failing

2016-02-22 Thread Iulian Dragoș
Just in case you missed this: https://issues.apache.org/jira/browse/SPARK-13431 Builds are failing with 'Method code too large' in the "shading" step with Maven. iulian -- -- Iulian Dragos -- Reactive Apps on the JVM www.typesafe.com

Re: How do we run that PR auto-close script again?

2016-02-22 Thread Sean Owen
That's what I'm talking about, yes, but I'm looking for the actual script. I'm sure there was a discussion about where it was and how to run it somewhere. Really just looking to have it run again. On Mon, Feb 22, 2016 at 10:44 AM, Akhil Das wrote: > This? >

Re: How do we run that PR auto-close script again?

2016-02-22 Thread Akhil Das
This? http://apache-spark-developers-list.1001551.n3.nabble.com/Automated-close-of-PR-s-td15862.html Thanks Best Regards On Mon, Feb 22, 2016 at 2:47 PM, Sean Owen wrote: > I know Patrick told us at some point, but I can't find the email or > wiki that describes how to run

How do we run that PR auto-close script again?

2016-02-22 Thread Sean Owen
I know Patrick told us at some point, but I can't find the email or wiki that describes how to run the script that auto-closes PRs with "do you mind closing this PR". Does anyone know? I think it's been a long time since it was run.