of this is also covered in the two links I sent.
>
>
> On Thu, Jul 18, 2019 at 10:37 PM Amarnath Venkataswamy <
> amarnath.venkatasw...@gmail.com> wrote:
>
>> After I set the shuffle parallelism, I was able to complete the job without
>> failure, but there is one
If we can do this in less than one to two hours (incremental update: 240
million records daily) after tuning the memory and other parameters, I would
be very happy.
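[Editor's note: the shuffle-parallelism settings discussed above are Hudi write configs. A minimal sketch follows; the property names are the Hudi write configs from the linked tuning guide, but the values are illustrative assumptions only and must be sized to your own data volume:]

```properties
# Parallelism of the shuffles Hudi runs during upsert / insert / bulk insert.
# Values below are illustrative, not recommendations.
hoodie.upsert.shuffle.parallelism=1500
hoodie.insert.shuffle.parallelism=1500
hoodie.bulkinsert.shuffle.parallelism=1500
```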
On Fri, Jul 19, 2019 at 12:19 AM Amarnath Venkataswamy <
amarnath.venkatasw...@gmail.com> wrote:
> Yes. I am looking for the s
> > On Thu, Jul 18, 2019 at 9:37 AM Vinoth Chandar
> > wrote:
> >
> > > https://cwiki.apache.org/confluence/display/HUDI/Tuning+Guide
> > > https://hudi.apache.org/performance.html
> > > are good resources for what you need.
> > >
> >
Hi
Can anyone of you share the Spark configuration used at Uber? I didn't
save that link to my favorites.
I am currently running a performance test against 240 million records, and the
job is failing for one reason or another related to memory.
Regards
Amarnath
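[Editor's note: for readers hitting similar memory failures on large upserts, a hedged sketch of the kind of spark-submit memory settings the tuning guide linked earlier discusses. All values here are illustrative assumptions; size them to your cluster and record payloads:]

```shell
# Illustrative memory settings for a large Hudi upsert job (Spark 2.x era).
# Values are assumptions, not recommendations.
spark-submit \
  --driver-memory 8g \
  --executor-memory 16g \
  --conf spark.executor.memoryOverhead=3g \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.kryoserializer.buffer.max=512m \
  ...
```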
Hi
Is there any option to write the Hudi dataset without any partition?
I tried, but Hive sync fails when you sync a dataset that has no partition.
DeltaStreamer uses "default" as the partition when there is no partition
column.
Sent from my iPhone
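[Editor's note: Hudi does ship key-generator and partition-extractor classes for non-partitioned datasets. A sketch of the relevant write and Hive-sync configs; the class names below are those used in later Apache Hudi releases, and older com.uber.hoodie versions use different package paths, so treat them as assumptions to verify against your version:]

```properties
# Sketch: write a non-partitioned dataset and sync it to Hive.
# Class paths vary by Hudi version; verify against your release.
hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.NonpartitionedKeyGenerator
hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.NonPartitionedExtractor
```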
>> Do you mean MapR FS? If so, I don't think this has come up before. I think
>> it should be doable, though.
>> Can you provide more details, maybe?
>>
>> Thanks
>> Vinoth
>>
>> On Tue, May 14, 2019 at 8:18 AM Amarnath Venkataswamy <
>> amarnath.venkatasw...@gmail.com> wrote:
Hi
Has anyone implemented Hudi on the MapR platform?
Sent from my iPhone
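[Editor's note: since MapR FS exposes an HDFS-compatible API, the main change for Hudi should be pointing the base path at a maprfs:// URI. An untested sketch under that assumption; the table path is hypothetical:]

```shell
# Untested sketch: run DeltaStreamer against a MapR FS base path.
# Assumes MapR FS's HDFS-compatible client is on the classpath.
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  ... \
  --target-base-path maprfs:///datasets/hudi_table
```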