Re: Spark Sql group by less performant

2018-12-10 Thread Georg Heiler
See https://databricks.com/blog/2016/05/19/approximate-algorithms-in-apache-spark-hyperloglog-and-quantiles.html. You most probably do not require exact counts. On Tue., Dec. 11, 2018 at 02:09, 15313776907 <15313776...@163.com> wrote: > I think you can add executor memory > > 15313776907 >
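For reference, the linked post covers Spark's approximate aggregate functions. A minimal Spark SQL sketch of the idea (table and column names here are illustrative, not from the thread):

    -- approx_count_distinct is backed by HyperLogLog++
    SELECT group_col,
           approx_count_distinct(user_id) AS approx_distinct_users,
           -- percentile_approx computes an approximate quantile (here, the 95th)
           percentile_approx(latency_ms, 0.95) AS p95_latency
    FROM events
    GROUP BY group_col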

Re: Spark Sql group by less performant

2018-12-10 Thread 15313776907
I think you can add executor memory. 15313776907 | Email: 15313776...@163.com On 12/11/2018 08:28, lsn24 wrote: Hello, I have a requirement where I need to get the total count of rows and the total count of failedRows based on a grouping. The code looks like below:

Spark Sql group by less performant

2018-12-10 Thread lsn24
Hello, I have a requirement where I need to get the total count of rows and the total count of failedRows based on a grouping. The code looks like below: myDataset.createOrReplaceTempView("temp_view"); Dataset countDataset = sparkSession.sql("Select
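The quoted code is truncated. A minimal sketch of the usual conditional-aggregation pattern for such a requirement (the view name is from the message; group_col and the failed flag are assumed columns):

    SELECT group_col,
           COUNT(*) AS totalRows,
           -- count only the rows flagged as failed
           SUM(CASE WHEN failed THEN 1 ELSE 0 END) AS failedRows
    FROM temp_view
    GROUP BY group_col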

Re: spark sql - group by constant column

2015-07-15 Thread Lior Chaga
I found the problem. Grouping by a constant column value is indeed impossible. The reason it was working in my project is that I gave the constant column an alias that exists in the schema of the DataFrame. The DataFrame contained a data_timestamp representing an hour, and I added to the

spark sql - group by constant column

2015-07-15 Thread Lior Chaga
Hi, I'm facing a bug with GROUP BY in Spark SQL (version 1.4). I registered a JavaRDD of objects containing integer fields as a table. Then I'm trying to do a GROUP BY with a constant value in the group-by fields: SELECT primary_one, primary_two, 10 as num, SUM(measure) as total_measures FROM tbl
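The query is truncated before its GROUP BY clause. Given the resolution in the reply above, a hedged sketch of the working form is to select the constant but keep it out of the GROUP BY, grouping only on the real columns:

    SELECT primary_one, primary_two, 10 AS num, SUM(measure) AS total_measures
    FROM tbl
    GROUP BY primary_one, primary_two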

Spark SQL group by

2015-02-06 Thread Mohnish Kodnani
Hi, I am trying to issue a SQL query against a Parquet file, am getting errors, and would like some help figuring out what is going on. The SQL: select timestamp, count(rid), qi.clientname from records where timestamp > 0 group by qi.clientname. I am getting the following error:

Re: Spark SQL group by

2015-02-06 Thread Michael Armbrust
You can't use columns (timestamp) that aren't in the GROUP BY clause. Spark 1.2+ gives you a better error message for this case. On Fri, Feb 6, 2015 at 3:12 PM, Mohnish Kodnani mohnish.kodn...@gmail.com wrote: Hi, I am trying to issue a SQL query against a Parquet file and am getting errors
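A sketch of the two standard fixes this implies, using the column names from the thread: either add the offending column to the GROUP BY, or aggregate it away:

    -- Option 1: group by every non-aggregated column
    SELECT timestamp, qi.clientname, COUNT(rid)
    FROM records
    WHERE timestamp > 0
    GROUP BY timestamp, qi.clientname

    -- Option 2: aggregate the extra column instead
    SELECT MAX(timestamp) AS latest_timestamp, qi.clientname, COUNT(rid)
    FROM records
    WHERE timestamp > 0
    GROUP BY qi.clientname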

Re: Spark SQL group by

2015-02-06 Thread Mohnish Kodnani
Doh :) Thanks... seems like a brain freeze. On Fri, Feb 6, 2015 at 3:22 PM, Michael Armbrust mich...@databricks.com wrote: You can't use columns (timestamp) that aren't in the GROUP BY clause. Spark 1.2+ gives you a better error message for this case. On Fri, Feb 6, 2015 at 3:12 PM, Mohnish