+1 great reading and values!

On Mon, 24 Feb 2020, 15:31 nishith agarwal, <n3.nas...@gmail.com> wrote:
> +100
> - Reduces index lookup time, hence improves job runtime
> - Paves the way for streaming-style ingestion
> - Eliminates the dependency on HBase (the alternative "global index" support at the
>   moment)
>
> -Nishith
>
> On Mon, Feb 24, 2020 at 10:56 AM Vinoth Chandar <vin...@apache.org> wrote:
>
> > +1 from me as well. This will be a product-defining feature, if we can do
> > it.
> >
> > On Sun, Feb 23, 2020 at 6:27 PM vino yang <yanghua1...@gmail.com> wrote:
> >
> > > Hi Sivabalan,
> > >
> > > Thanks for your proposal.
> > >
> > > Big +1 from my side, indexing at record granularity is really good for
> > > performance. It is also a step towards stream processing.
> > >
> > > Best,
> > > Vino
> > >
> > > On Sun, Feb 23, 2020 at 12:52 AM, Sivabalan <n.siv...@gmail.com> wrote:
> > >
> > > > As Apache Hudi is getting widely adopted, performance has become the need
> > > > of the hour. This RFC focuses on improving the performance of the Hudi index
> > > > by introducing a record-level index. The proposal is to implement a new index
> > > > format that is a mapping of (recordKey <-> partition, fileId) or
> > > > ((recordKey, partitionPath) → fileId). This mapping will be stored and
> > > > maintained by Hudi as another implementation of HoodieIndex. This record-level
> > > > indexing will definitely give a boost to both read and write
> > > > performance.
> > > >
> > > > Here
> > > > <https://cwiki.apache.org/confluence/display/HUDI/RFC+-+08+%3A+Record+level+indexing+mechanisms+for+Hudi+datasets>
> > > > is the link to the RFC.
> > > >
> > > > Appreciate your review and thoughts.
> > > >
> > > > --
> > > > Regards,
> > > > -Sivabalan