Kylin 2.1.1 for HBase 0.98

2017-09-04 Thread yuyong.zhai
How to build the Kylin (v2.1.0) binary package for HBase 0.98?

Re: Different watermark for different kafka partitions in Structured Streaming

2017-09-04 Thread 张万新
@Rayn It's frequently observed in our production environment that different partitions' consumption rates vary for all kinds of reasons, including performance differences between the machines holding the partitions, uneven distribution of messages, and so on. So I hope there can be some advice on how to design
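For context, a minimal Scala sketch of how a watermark is declared on a Kafka source today (the broker address and topic name are placeholders): Structured Streaming tracks a single watermark per query, roughly the maximum event time seen across all partitions minus the delay, so rows from a lagging partition can fall behind it and be treated as late.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.window

    val spark = SparkSession.builder.appName("kafka-watermark-sketch").getOrCreate()
    import spark.implicits._

    // Kafka source; "broker:9092" and "events" are placeholder values.
    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()

    // One global watermark for the whole query: approximately
    // max("timestamp" over ALL partitions seen so far) - 10 minutes.
    // Data from a slow partition that falls behind it counts as late.
    val counts = stream
      .withWatermark("timestamp", "10 minutes")
      .groupBy(window($"timestamp", "5 minutes"))
      .count()

    counts.writeStream.outputMode("update").format("console").start().awaitTermination()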

Re: sparkR 3rd library

2017-09-04 Thread Felix Cheung
Can you include the code where you call spark.lapply? From: patcharee Sent: Sunday, September 3, 2017 11:46:40 PM To: user@spark.apache.org Subject: sparkR 3rd library Hi, I am using spark.lapply to execute an existing R script in

unsubscribe

2017-09-04 Thread Pavel Gladkov
unsubscribe -- Best regards, Pavel Gladkov

Re: Is watermark always set using processing time or event time or both?

2017-09-04 Thread Jacek Laskowski
Hi, https://stackoverflow.com/q/46032001/1305344 :) Regards, Jacek Laskowski https://about.me/JacekLaskowski Spark Structured Streaming (Apache Spark 2.2+) https://bit.ly/spark-structured-streaming Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark Follow me at

Re: Is watermark always set using processing time or event time or both?

2017-09-04 Thread Jacek Laskowski
Hi, It's event time-based by default, as there's no way to define a processing-time column using the withWatermark operator. See http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Dataset@withWatermark(eventTime:String,delayThreshold:String):org.apache.spark.sql.Dataset[T] But...
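For reference, a minimal Scala sketch of withWatermark using the built-in rate source (its generated "timestamp" column stands in for event time; the source and thresholds are illustrative only):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.window

    val spark = SparkSession.builder.appName("watermark-demo").getOrCreate()
    import spark.implicits._

    // The rate source emits (timestamp, value) rows, handy for local tests.
    val events = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "10")
      .load()

    // withWatermark takes an event-time column name plus a delay threshold,
    // so the watermark is always defined against event time.
    val counts = events
      .withWatermark("timestamp", "10 seconds")
      .groupBy(window($"timestamp", "5 seconds"))
      .count()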

sparkR 3rd library

2017-09-04 Thread patcharee
Hi, I am using spark.lapply to execute an existing R script in standalone mode. This script calls the function 'rbga' from the third-party library 'genalg'. The rbga function works fine in the SparkR environment when I call it directly, but when I run it through spark.lapply I get the error could not find function

Re: Spark GroupBy Save to different files

2017-09-04 Thread Pralabh Kumar
Hi Arun, rdd1.groupBy(_.city).map(s => (s._1, s._2.toList.toString())).toDF("city", "data").write.partitionBy("city").csv("/data") should work for you. Regards, Pralabh On Sat, Sep 2, 2017 at 7:58 AM, Ryan wrote: > you may try foreachPartition > > On Fri, Sep 1, 2017 at
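Expanded into a self-contained Scala sketch of the same approach (the Record case class, sample rows, and the /data path are made up for illustration):

    import org.apache.spark.sql.SparkSession

    // Hypothetical record type standing in for the actual data.
    case class Record(city: String, data: String)

    object GroupBySaveSketch extends App {
      val spark = SparkSession.builder.master("local[*]").appName("groupby-save").getOrCreate()
      import spark.implicits._

      val rdd1 = spark.sparkContext.parallelize(Seq(
        Record("Oslo", "a"), Record("Oslo", "b"), Record("Pune", "c")))

      // partitionBy("city") moves the column into the directory layout,
      // writing one folder per city, e.g. /data/city=Oslo/ and /data/city=Pune/.
      rdd1.groupBy(_.city)
        .map { case (city, recs) => (city, recs.map(_.data).mkString(",")) }
        .toDF("city", "data")
        .write
        .partitionBy("city")
        .csv("/data")
    }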