Re: HUDI 0.6 Read Table Performance

Tanuj Sun, 25 Apr 2021 21:05:16 -0700

Thanks Udit. I am just using the Spark data source to read the full table. 
Sometimes we provide partitions and performance is OK, and sometimes we can't 
due to the nature of the data. Are you looking for the HUDI parameters that we 
set while reading the table?
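
A minimal sketch of the two read patterns described above (full table versus a given partition), assuming the glob-style load paths that the Spark data source in older Hudi releases such as 0.6 expects. The helper names `hudi_load_path` and `read_hudi`, the bucket and table path, and the partition depth are hypothetical placeholders for illustration, not the actual setup.

```python
from typing import Optional


def hudi_load_path(base_path: str, partition_depth: int,
                   partition: Optional[str] = None) -> str:
    """Build the glob path handed to load() for a glob-style Hudi read.

    base_path: table root (hypothetical, e.g. "s3://my-bucket/my_table").
    partition_depth: number of partition directory levels under the root.
    partition: optional leading partition path (e.g. "2021/04") that
        narrows the directories Spark has to list.
    """
    base = base_path.rstrip("/")
    fixed = [p for p in (partition or "").split("/") if p]
    if len(fixed) > partition_depth:
        raise ValueError("partition path is deeper than partition_depth")
    # One "*" per remaining partition level, plus one for the data files.
    stars = ["*"] * (partition_depth - len(fixed) + 1)
    return "/".join([base] + fixed + stars)


def read_hudi(spark, base_path, partition_depth, partition=None):
    # Snapshot read through the Spark data source; assumes the Hudi
    # bundle jar is already on the Spark classpath.
    return spark.read.format("org.apache.hudi").load(
        hudi_load_path(base_path, partition_depth, partition))
```

With a three-level partition layout, `read_hudi(spark, base, 3)` globs every partition (the slow case on S3), while `read_hudi(spark, base, 3, "2021/04")` only lists directories under that prefix.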



On 2021/04/26 04:02:31, Udit Mehrotra <[email protected]> wrote: 
> Hi Tanuj,
> 
> Can you provide the exact commands you use to read the table? We might be 
> able to guide you based on that.
> 
> Thanks,
> Udit
> 
> Sent from my iPhone
> 
> > On Apr 25, 2021, at 8:34 PM, Tanuj <[email protected]> wrote:
> > 
> > Hi,
> > We are using HUDI 0.6 and noticed that some HUDI tables are very slow to 
> > read, especially those with a large number of partitions, probably due to 
> > S3 listing. I know some of these issues have been fixed in later versions 
> > of HUDI, but it will take us some time to migrate. Is there anything in 0.6 
> > I can leverage?
> > 
> > I also don't understand what .aux does, as these folders are empty for us. 
> > We sometimes do an S3-to-S3 copy and read HUDI tables from the newly copied 
> > location, and we are able to read them without the .aux/ folders. 
> > The S3 copy doesn't copy empty folders.
> > 
> > Thanks,
> > Tanu
>
