Hi,
I am trying to sample the result of a SQL query in order to make the query run faster.
My query looks like this:
SELECT `Category` as `Category`,sum(`bookings`) as `bookings`,sum(`dealviews`) 
as `dealviews` FROM groupon_dropbox WHERE  `event_date` >= '2015-11-14' AND 
`event_date` <= '2016-02-19' GROUP BY `Category` LIMIT 100

The table is partitioned by event_date, and the code I am using is:

df = self.df_from_sql(sql, srcs)
results = df.sample(False, 0.5).collect()

The results are slightly different, but the execution time is almost the
same. Am I missing something?
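
One possible explanation is that sample() here runs on the already aggregated DataFrame (at most 100 rows), so the scan and the GROUP BY still process every row in the date range. For comparison, here is a minimal sketch of sampling the rows before the aggregation instead. It assumes the SQL context in use accepts the TABLESAMPLE (n PERCENT) clause, which may depend on the Spark version. build_sampled_query is a hypothetical helper, not part of any library, and multiplying the sums by 100/percent is my own assumption for estimating full-table totals.

```python
def build_sampled_query(table, start_date, end_date, percent=50, limit=100):
    """Hypothetical helper: build SQL that samples rows before grouping.

    Assumes the SQL dialect supports TABLESAMPLE (n PERCENT).
    """
    # Rescale the sums so they approximate totals over the full table.
    scale = 100.0 / percent
    return (
        "SELECT `Category` AS `Category`, "
        "sum(`bookings`) * {scale} AS `bookings`, "
        "sum(`dealviews`) * {scale} AS `dealviews` "
        "FROM {table} TABLESAMPLE ({percent} PERCENT) "
        "WHERE `event_date` >= '{start}' AND `event_date` <= '{end}' "
        "GROUP BY `Category` LIMIT {limit}"
    ).format(scale=scale, table=table, percent=percent,
             start=start_date, end=end_date, limit=limit)


sql = build_sampled_query("groupon_dropbox", "2015-11-14", "2016-02-19")
# The string would then go through the same path as before, e.g.
# df = self.df_from_sql(sql, srcs); results = df.collect()
```

Even with this change the scan may still read every partition in the date range, so any speedup would come mainly from the smaller aggregation and is not guaranteed.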

Thanks