Re: Spark SQL reads all leaf directories on a partitioned Hive table

2019-08-14 Thread Hao Ren

Re: Spark SQL reads all leaf directories on a partitioned Hive table

2019-08-12 Thread Subash Prabakar

Re: Spark SQL reads all leaf directories on a partitioned Hive table

2019-08-09 Thread Hao Ren

Re: Spark SQL reads all leaf directories on a partitioned Hive table

2019-08-08 Thread Mich Talebzadeh
Forwarded message from Hao Ren, Thu, Aug 8, 2019 at 4:15 PM. Subject: Re: Spark SQL reads all leaf directories on a partitioned Hive table. To: Gourav Sengupta. "Hi Gourav, I am using enableHiveSupport. The table was not created by Spark. ..."

Fwd: Spark SQL reads all leaf directories on a partitioned Hive table

2019-08-08 Thread Hao Ren
Forwarded message from Hao Ren, Thu, Aug 8, 2019 at 4:15 PM. Subject: Re: Spark SQL reads all leaf directories on a partitioned Hive table. To: Gourav Sengupta. "Hi Gourav, I am using enableHiveSupport. The table was not created by Spark. The table already exists ..."

Re: Spark SQL reads all leaf directories on a partitioned Hive table

2019-08-08 Thread Gourav Sengupta
Hi, just out of curiosity: did you start the Spark session using enableHiveSupport()? Or are you creating the table using Spark? Regards, Gourav. On Wed, Aug 7, 2019 at 3:28 PM Hao Ren wrote: "Hi, I am using Spark SQL 2.3.3 to read a Hive table which is partitioned by day, hour, ..."
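The distinction Gourav is probing matters because enableHiveSupport() switches Spark's catalog from the default in-memory catalog to the Hive metastore, which is where partition metadata lives. A minimal sketch of that difference, modeled as plain config dicts rather than a live session (assumption: no PySpark installed here; the keys and values shown are the real Spark configuration names):

```python
# enableHiveSupport() on SparkSession.builder sets
# spark.sql.catalogImplementation to "hive", so table and partition
# metadata are resolved through the Hive metastore. Without it, Spark
# falls back to the default "in-memory" catalog.
default_catalog = {"spark.sql.catalogImplementation": "in-memory"}
hive_catalog = {"spark.sql.catalogImplementation": "hive"}

# The equivalent builder call (not executed in this sketch) would be:
# spark = SparkSession.builder.appName("app").enableHiveSupport().getOrCreate()
```

If the table was created outside Spark (as Hao Ren confirms below), the Hive catalog is the only way Spark can see its partition layout at all, which is why this is the natural first question to ask.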

Spark SQL reads all leaf directories on a partitioned Hive table

2019-08-07 Thread Hao Ren
Hi, I am using Spark SQL 2.3.3 to read a Hive table which is partitioned by day, hour, platform, request_status and is_sampled. The underlying data is in Parquet format on HDFS. Here is the SQL query to read just *one partition*: ``` spark.sql(""" SELECT rtb_platform_id, SUM(e_cpm) FROM ...
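For a query like the one above to touch only one leaf directory, every partition column must be constrained by a constant predicate so Spark can prune partitions instead of listing all leaf directories. A minimal sketch of such a query, using the partition columns and measure columns named in the post (the table name `my_table` and the literal values are assumptions for illustration):

```python
# Hypothetical partition-pruned query for the table described in the thread.
# Constant equality predicates on all five partition columns (day, hour,
# platform, request_status, is_sampled) are what allow partition pruning.
query = """
SELECT rtb_platform_id, SUM(e_cpm)
FROM my_table
WHERE day = '2019-08-07'
  AND hour = '12'
  AND platform = 'web'
  AND request_status = 'ok'
  AND is_sampled = 'false'
GROUP BY rtb_platform_id
"""

# In a Hive-enabled session this would be submitted as (not executed here):
# spark.sql(query).show()
```

Note that if Spark still lists all leaf directories for a query of this shape, the problem is usually on the metadata side (how the catalog exposes partitions), not the query itself, which is the direction the replies in this thread explore.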