[DISCUSS] Default partition path in TimestampBasedKeyGenerator

2019-12-11 Thread Pratyaksh Sharma
Hi, If the value for the configured partitionPathField is not present, we fall back to the default partition path in all the key generator classes except TimestampBasedKeyGenerator. In TimestampBasedKeyGenerator, we directly throw an exception if the value is null. I wanted to know if this behaviour is

Re: [DISCUSS] Next Apache Release

2019-12-11 Thread Pratyaksh Sharma
Hi Vinoth, We are targeting HUDI-288 also as part of 0.5.1 release. I will change the fix version of that jira as well. Right now, it is not included in the list you shared above. On Thu, Dec 12, 2019 at 8:22 AM Vinoth Chandar wrote: > +1 for

Re: Re: Re: Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-11 Thread lamberken
Hi, @vinoth 1, Hoodie*Config classes are currently only used to set default values when their build method is called. They will be replaced by HoodieMemoryOptions, HoodieIndexOptions, HoodieHBaseIndexOptions, etc... 2, I don't understand the question "It is not clear to me whether there is any

Re: Re: Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-11 Thread Vinoth Chandar
I actually prefer the builder pattern for making the configs, because I can do `builder.` in the IDE and actually see all the options... That said, most developers program against the Spark datasource and so this may not be useful, unless we expose a builder for that.. I will concede that since

Re: [DISCUSS] Next Apache Release

2019-12-11 Thread Vinoth Chandar
+1 for leesf, driving the release.. From http://www.apache.org/dev/release-publishing.html#release_manager, it does explicitly confirm that any committer can be RM. I am happy to volunteer my services to assist leesf in the process. @all : Please speak up if you have concerns with the

Re: [DISCUSS] Next Apache Release

2019-12-11 Thread leesf
Hi Balaji, Thanks for kicking the discussion off. +1 to release next version as we made many improvements since last released version and Jan is reasonable considering the upcoming holidays. Besides I am wondering if I can be the release manager of 0.5.1 to work with you. It is always

Re: [QUESTION] Handle record partition change

2019-12-11 Thread Sivabalan
Depends on whether you are using regular BLOOM or GLOBAL_BLOOM. May I know which one you are talking about? On Wed, Dec 11, 2019 at 9:12 AM Shiyan Xu wrote: > Hi Hudi devs, > > Upon upsert operations, does Hudi detect record's partition path change? As > for the same record, the partition path

Re: Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-11 Thread Sivabalan
Let me summarize your initial proposal and then will get into details. - Introduce ConfigOptions for ease of handling of default values. - Remove all Hoodie*Config classes and just have HoodieWriteConfig. What this means is that, every other config file will be replaced by ConfigOptions. eg,

[QUESTION] Handle record partition change

2019-12-11 Thread Shiyan Xu
Hi Hudi devs, Upon upsert operations, does Hudi detect a record's partition path change? For the same record, the partition path field may get updated while the record key (the primary id) stays the same; the insert would then result in a duplicate record (based on record key) in the dataset. Is

[DISCUSS] Next Apache Release

2019-12-11 Thread Balaji Varadarajan
Hello all, In the spirit of making Apache Hudi (incubating) releases at a regular cadence, we are starting this thread to kickstart the planning and preparatory work for the next release (0.5.1). As discussed in yesterday's meeting, the current plan is to have a release by end of Jan 2020. As

Re: Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-11 Thread lamberken
Hi, On 1,2. Yes, you are right, moving the getter to the component level Config class itself. On 3, HoodieWriteConfig can also set value through ConfigOption, small code snippets. From the below snippets, we can see that clients need to know each component's builders and also call

Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-11 Thread Vinoth Chandar
Hi Lamber-ken, I looked at the sample PR you put up as well. On 1,2 => Seems your intent is to replace these with moving the getter to the component level Config class itself? I am fine with that (although I think it's not that big of a hurdle really to use atm). But, once we do that we could