…periodically running a compaction job.
>
>
>
> If you’re simply appending daily snapshots, then you could just consider
> using date partitions, instead?
>
>
>
> *From: *Rishi Shah
> *Date: *Thursday, May 30, 2019 at 10:43 PM
> *To: *"user @spark"
> *Subject: *[pyspark 2.3+] Bucketing with sort - incremental data load?
> Hi All,
>
> Can we use bucketing with sorting functionality to save data incrementally
> (say daily)? I understand bucketing is supported in Spark only with
> saveAsTable; however, can this be used with mode "append" instead of
> "overwrite"?
Hi Rishi,
I think that if you are using sorting and then appending data locally, there
will be no need to bucket the data, and you are good with external tables that
way.
Regards,
Gourav
On Fri, May 31, 2019 at 3:43 AM Rishi Shah wrote:
> Hi All,
>
> Can we use bucketing with sorting functionality to save data incrementally
> (say daily)? I understand bucketing is supported in Spark only with
> saveAsTable, however can this be used with mode "append" instead of
> "overwrite"?
Hi All,
Can we use bucketing with sorting functionality to save data incrementally
(say daily)? I understand bucketing is supported in Spark only with
saveAsTable, however can this be used with mode "append" instead of
"overwrite"?
My understanding around bucketing was, you need to rewrite