In our case, the row has about 80 columns, which exceeds the 22-field case
class limit in Scala 2.10.
Starting with Spark 1.1 you'll also be able to use the applySchema API
(https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala#L126)
to build a SchemaRDD without a case class; then on that SchemaRDD, for
each group just take the first record.
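
For illustration, a minimal sketch of that approach on Spark 1.1 (the input
path, the tab delimiter, and the column types are assumptions, and only 4 of
the ~80 columns are shown): the schema is declared programmatically and
applied with applySchema, so the case class field limit never comes into
play, and the earliest row per (a, b) group is picked with reduceByKey.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.SparkContext._  // pair-RDD implicits (needed pre-1.3)
    import org.apache.spark.sql._

    object FirstRowPerGroup {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("first-row-per-group"))
        val sqlContext = new SQLContext(sc)

        // Declare the schema programmatically instead of via a case class,
        // so the 22-field limit does not apply. Only 4 columns shown here.
        val schema = StructType(Seq(
          StructField("a", StringType, nullable = false),
          StructField("b", StringType, nullable = false),
          StructField("c", StringType, nullable = false),
          StructField("time", LongType, nullable = false)))

        // Hypothetical input: one tab-separated record per line.
        val rows = sc.textFile("hdfs:///path/to/t")  // assumed path
          .map(_.split("\t"))
          .map(p => Row(p(0), p(1), p(2), p(3).toLong))

        val t: SchemaRDD = sqlContext.applySchema(rows, schema)

        // "First" row per (a, b) group = the row with the minimal time.
        val firstPerGroup = t
          .map(r => ((r.getString(0), r.getString(1)), r))
          .reduceByKey((x, y) => if (x.getLong(3) <= y.getLong(3)) x else y)
          .values

        firstPerGroup.collect().foreach(println)
      }
    }

reduceByKey keeps only one Row per key as it aggregates, which scales better
than groupByKey followed by a minBy over each group.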
From: Fengyun RAO <raofeng...@gmail.com>
Date: Thursday, August 21, 2014 at 8:26 AM
To: user@spark.apache.org
Subject: Re: [Spark SQL] How to select first row in each GROUP BY group?
Could anybody help? I googled but didn't find a solution.
I have a table with 4 columns: a, b, c, time
What I need is something like:
SELECT a, b, GroupFirst(c)
FROM t
GROUP BY a, b
GroupFirst means the first item of column c in each group,
and by "first" I mean the row with the minimal time in that group.
In Oracle/SQL Server, we could write:
WITH summary AS (
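    -- (the message is truncated here in the archive; a typical completion
    -- of this ROW_NUMBER pattern, reconstructed as an assumption, would be:)
    SELECT a, b, c, time,
           ROW_NUMBER() OVER (PARTITION BY a, b ORDER BY time) AS rk
    FROM t)
SELECT a, b, c
FROM summary
WHERE rk = 1

Spark SQL 1.1 has no window functions, so on Spark the same result needs
either the RDD approach sketched above or a join against
SELECT a, b, MIN(time) FROM t GROUP BY a, b.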