subject:"[pyspark 2.3+] Bucketing with sort - incremental data load?"

Re: [pyspark 2.3+] Bucketing with sort - incremental data load?

2019-05-31 Thread Georg Heiler

sing date partitions, instead? >> >> >> >> *From: *Rishi Shah >> *Date: *Thursday, May 30, 2019 at 10:43 PM >> *To: *"user @spark" >> *Subject: *[pyspark 2.3+] Bucketing with sort - incremental data load? >> >> >> >> Hi A

Re: [pyspark 2.3+] Bucketing with sort - incremental data load?

2019-05-31 Thread Rishi Shah

odically running a compaction job. > > > > If you’re simply appending daily snapshots, then you could just consider > using date partitions, instead? > > > > *From: *Rishi Shah > *Date: *Thursday, May 30, 2019 at 10:43 PM > *To: *"user @spark" > *Subje

Re: [pyspark 2.3+] Bucketing with sort - incremental data load?

2019-05-31 Thread Silvio Fiorito

yspark 2.3+] Bucketing with sort - incremental data load? Hi All, Can we use bucketing with sorting functionality to save data incrementally (say daily) ? I understand bucketing is supported in Spark only with saveAsTable, however can this be used with mode "append" instead of "over

Re: [pyspark 2.3+] Bucketing with sort - incremental data load?

2019-05-31 Thread Gourav Sengupta

Hi Rishi, I think that if you are using sorting and then appending data locally there will be no need to bucket data and you are good with external tables that way. Regards, Gourav On Fri, May 31, 2019 at 3:43 AM Rishi Shah wrote: > Hi All, > > Can we use bucketing with sorting functionality to

[pyspark 2.3+] Bucketing with sort - incremental data load?

2019-05-30 Thread Rishi Shah

Hi All, Can we use bucketing with sorting functionality to save data incrementally (say daily)? I understand bucketing is supported in Spark only with saveAsTable, however can this be used with mode "append" instead of "overwrite"? My understanding around bucketing was, you need to rewrite
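
For readers following the question above, here is a minimal sketch of what a bucketed, sorted append might look like through the `DataFrameWriter` API. The function name `append_bucketed`, its parameters, and the default bucket count are hypothetical and chosen only for illustration. The sketch assumes the target table either does not exist yet or was created with the same bucketing specification. It is one possible reading of the question, not a confirmed answer from the thread.

```python
def append_bucketed(df, table_name, bucket_col, num_buckets=8):
    """Append `df` to a bucketed, sorted table via saveAsTable.

    Hypothetical helper: `df` is expected to be a pyspark.sql.DataFrame.
    Assumption: an existing target table uses the same bucket column
    and bucket count as the ones passed here.
    """
    (
        df.write
        .format("parquet")
        .bucketBy(num_buckets, bucket_col)  # hash rows into a fixed number of buckets
        .sortBy(bucket_col)                 # sort rows within each written file
        .mode("append")                     # add new files, keep existing data
        .saveAsTable(table_name)            # bucketing metadata lives in the table catalog
    )


# Example usage (requires a running SparkSession; names are placeholders):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.getOrCreate()
#   daily = spark.createDataFrame([(1, "a"), (2, "b")], ["user_id", "value"])
#   append_bucketed(daily, "events_bucketed", "user_id", num_buckets=4)
```

Each append of this kind writes additional files per bucket, and the sort order applies within each file rather than across a whole bucket, which is likely why the replies mention a periodic compaction job.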

5 matches
