date:20160831

Questions about bucketing in Spark

2016-08-31 Thread Tejas Patil

Hi everyone, I am working towards making Spark's Sort Merge join on par with Hive's Sort-Merge-Bucket join to use sorted. So far I have identified these main items to be addressed: 1. Make the query planner use `sorted`ness information for sort merge join (SPARK-15453, SPARK-17271) 2.

Re: Model abstract class in spark ml

2016-08-31 Thread Cody Koeninger

http://blog.originate.com/blog/2014/02/27/types-inside-types-in-scala/ On Wed, Aug 31, 2016 at 2:19 AM, Sean Owen wrote: > Weird, I recompiled Spark with a similar change to Model and it seemed > to work but maybe I missed a step in there. > > On Wed, Aug 31, 2016 at 6:33 AM,

Re: KMeans calls takeSample() twice?

2016-08-31 Thread Yanbo Liang

I added println at the start of function takeSample, and found it was printed only once for each run of KMeans. Thanks Yanbo On Tue, Aug 30, 2016 at 10:31 AM, Georgios Samaras < georgesamaras...@gmail.com> wrote: > Good catch Shivaram. However, the very next line states: > > // this shouldn't

Re: Model abstract class in spark ml

2016-08-31 Thread Sean Owen

Weird, I recompiled Spark with a similar change to Model and it seemed to work but maybe I missed a step in there. On Wed, Aug 31, 2016 at 6:33 AM, Mohit Jaggi wrote: > I think I figured it out. There is indeed "something deeper in Scala" :-) > > abstract class A { > def