why spark ml package doesn't contain svm algorithm

2016-09-27 Thread hxw
I have found that the Spark ml package implements the NaiveBayes algorithm,
and the source code is simple.
I am confused about why the Spark ml package doesn't contain an SVM algorithm;
it does not seem very hard to implement.


spark1.4.1 extremely slow for take(1) or head() or first() or show

2015-12-03 Thread hxw
Dear All,



I have a Hive table with 100 million records, and I just ran some very simple
operations on this dataset, like:



  val df = sqlContext.sql("select * from user").toDF
  df.cache
  df.registerTempTable("tb")
  // Triple-quoted string so the query can span several lines.
  val b = sqlContext.sql(
    """select 'uid',
      |       max(length(uid)),
      |       count(distinct(uid)),
      |       count(uid),
      |       sum(case when uid is null then 0 else 1 end),
      |       sum(case when uid is null then 1 else 0 end),
      |       sum(case when uid is null then 1 else 0 end)/count(uid)
      |from tb""".stripMargin)
  b.show  // the result is just one row, but this step is extremely slow

Is this expected? Why is show so slow for a DataFrame? Is it a bug in the
optimizer, or did I do something wrong?


Best Regards,
tylor