Thanks for the detailed design write-up, Vinoth. I concur with the others on 
option 2: keep indexing off by default and enable it once we have enough 
confidence in stability & performance. That said, I do think it would be good 
in practice to have the code in place for users who might revert to an older 
build as part of some build rollback mechanism they have in place (for reasons 
not even related to Hudi). Requiring the latest data block (denoted by the 
latest version) to be a new-format block, as suggested by Balaji, sounds like 
one option - not sure how complicated the code will become though...
I will comment on the RFC with some doubts/concerns about first migrating 
customers from canIndexLogFiles = false to true and then rolling back, to make 
sure my understanding is correct.
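
To make the rollback case concrete, here is a rough sketch of how I read 
Balaji's suggestion: use the consolidated index only when the latest valid 
block is in the new format, otherwise fall back to scanning. Everything here 
(LogBlock, INDEXED_BLOCK_VERSION, the index field, the function names) is 
hypothetical and for illustration only, not actual Hudi code.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Hypothetical marker: blocks at or above this version carry a consolidated index.
INDEXED_BLOCK_VERSION = 2


@dataclass
class LogBlock:
    version: int
    valid: bool                      # False for corrupt or rolled-back blocks
    keys: List[str]                  # record keys written in this block
    # Assumption: a new-format block indexes all keys in the log up to itself.
    index: Optional[FrozenSet[str]] = None


def latest_valid_block(blocks: List[LogBlock]) -> Optional[LogBlock]:
    """Return the last valid block in the log, or None if there is none."""
    for block in reversed(blocks):
        if block.valid:
            return block
    return None


def lookup_strategy(blocks: List[LogBlock]) -> str:
    """Pick the consolidated index only if the latest valid block is new-format."""
    latest = latest_valid_block(blocks)
    if (latest is not None
            and latest.version >= INDEXED_BLOCK_VERSION
            and latest.index is not None):
        return "consolidated_index"
    return "scan_blocks"


def contains_key(blocks: List[LogBlock], key: str) -> bool:
    """Look up a key using whichever strategy the log's tail allows."""
    if lookup_strategy(blocks) == "consolidated_index":
        return key in latest_valid_block(blocks).index
    # Fallback: scan every valid block, old or new format.
    return any(key in block.keys for block in blocks if block.valid)
```

With this shape, a log that ends in an old-format block after a revert (old, 
new, old) simply takes the scan path, so an arbitrary mix of block versions 
stays readable at the cost of losing the index speedup for that file.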

-Nishith

> On Oct 30, 2019, at 4:00 PM, Balaji Varadarajan <v.bal...@ymail.com.invalid> 
> wrote:
> 
> Thanks, Vinoth, for proposing a clean and extensible design. The overall 
> design looks great. Another rollout option is to use the consolidated log 
> index for index lookup only if the latest "valid" log block has been written 
> in the new format. If that is not the case, we can revert to scanning the 
> previous log blocks for index lookup.
> Balaji.V
> 
> On Tuesday, October 29, 2019, 07:52:00 PM PDT, Bhavani Sudha 
> <bhavanisud...@gmail.com> wrote:
> 
> I vote for the second option. It also gives us time to analyze how to
> deal with backwards compatibility. I'll take a look at the RFC later
> tonight and get back.
> 
> 
>> On Sun, Oct 27, 2019 at 10:24 AM Vinoth Chandar <vin...@apache.org> wrote:
>> 
>> One issue I have open questions on myself:
>> 
>> Is it OK to assume the log will have old data block versions, followed by
>> new data block versions? For example, if we roll out new code and then
>> revert, there could be an arbitrary mix of new and old data blocks.
>> Handling this might make the design/code fairly complex. Alternatively, we
>> can keep it simple for now: disable it by default and only advise enabling
>> it for new tables or when the Hudi version is stable.
>> 
>> 
>>> On Sun, Oct 27, 2019 at 12:13 AM Vinoth Chandar <vin...@apache.org> wrote:
>>> 
>>> 
>>> 
>>> https://cwiki.apache.org/confluence/display/HUDI/RFC-6+Add+indexing+support+to+the+log+file
>>> 
>>> 
>>> Feedback welcome on this RFC tackling HUDI-86.
>>> 
>> 
