Hi Brandon,

I contributed to ComplexKeyGenerator some time back, and I don't think this is intended behavior. If you are getting an exception, is it coming from DataSourceUtils.getNestedFieldValAsString(record, recordKeyField)? I can't think of any other reason it would throw. I think that when val is null, DataSourceUtils.getNestedFieldValAsString throws because of this check:

    if (!(val instanceof GenericRecord)) {
      throw new HoodieException("Cannot find a record at part value :" + part);
    }

Maybe Balaji can confirm.
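To make the null case concrete, here is a minimal standalone sketch. It is not Hudi's code: a plain Map stands in for Avro's GenericRecord, IllegalStateException stands in for HoodieException, and the class name, the "__null__" placeholder and the "field:value" key format are all hypothetical choices made for illustration. It relies on the fact that in Java `null instanceof X` is always false, so an instanceof check like the one above also rejects null values.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.StringJoiner;

public class NullKeySketch {

    // Hypothetical placeholder written into the key when a component is missing.
    static final String NULL_PLACEHOLDER = "__null__";

    // Walks a dot-separated path through nested maps (stand-ins for GenericRecord).
    // Returns empty when an intermediate value is not a nested record or the leaf is null.
    @SuppressWarnings("unchecked")
    static Optional<String> findNested(Map<String, Object> record, String path) {
        String[] parts = path.split("\\.");
        Map<String, Object> node = record;
        for (int i = 0; i < parts.length - 1; i++) {
            Object val = node.get(parts[i]);
            // `null instanceof Map` is false, so a null value is rejected here too.
            if (!(val instanceof Map)) {
                return Optional.empty();
            }
            node = (Map<String, Object>) val;
        }
        Object leaf = node.get(parts[parts.length - 1]);
        return Optional.ofNullable(leaf).map(Object::toString);
    }

    // Strict variant: any missing component is an error.
    static String strictLookup(Map<String, Object> record, String path) {
        return findNested(record, path).orElseThrow(
            () -> new IllegalStateException("Cannot find a value for field: " + path));
    }

    // Tolerant composite key: missing components become the placeholder.
    static String compositeKey(Map<String, Object> record, List<String> fields) {
        StringJoiner key = new StringJoiner(",");
        for (String field : fields) {
            key.add(field + ":" + findNested(record, field).orElse(NULL_PLACEHOLDER));
        }
        return key.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> address = new HashMap<>();
        address.put("city", "Seattle");

        Map<String, Object> record = new HashMap<>();
        record.put("id", 42);
        record.put("address", address);
        record.put("profile", null); // nested record that is null

        // prints "id:42,address.city:Seattle,profile.name:__null__"
        System.out.println(
            compositeKey(record, Arrays.asList("id", "address.city", "profile.name")));

        try {
            strictLookup(record, "profile.name");
        } catch (IllegalStateException e) {
            System.out.println("strict lookup failed: " + e.getMessage());
        }
    }
}
```

Whether a placeholder is the right fix for ComplexKeyGenerator is a separate question. Two records whose only difference is a null component would still get distinct keys as long as their other components differ, but the placeholder string itself would have to be one that can never appear as a real value.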
Thanks,
Jaimin

On Sun, 10 Nov 2019 at 03:40, Scheller, Brandon <bsche...@amazon.com.invalid> wrote:

> Thanks for the quick response Balaji!
>
> I think there is a lot here to continue with:
> 1. I did see that recent pull request for the delete API. I think
> collaborating to support another delete API with just record key would be a
> great next step. I'll begin looking into it. Additionally, the scenario of
> using HBase as the global index is definitely something which we'd be
> interested in understanding further.
> 2. Actually I was speaking to the case of ComplexKeyGenerator. Currently
> if any single component of it is null, it will throw an exception. If this
> is not intended behavior, I'd be happy to fix this bug as it looks to solve
> our use case.
> 3. Thanks for the update on this. The Spark upgrade is definitely a large
> undertaking that I'd be happy to help with.
>
> Thanks again,
> Brandon
>
> On 11/8/19, 3:52 PM, "Balaji Varadarajan" <varadar...@gmail.com> wrote:
>
> > Brandon,
> >
> > Great initiative and thoughts. Thanks for writing a detailed description
> > of what you are looking to achieve.
> >
> > Here are some of my comments/thoughts:
> >
> > 1. HUDI-326: There is some work happening in this direction, but we
> > should be able to collaborate on this. Siva has opened a PR
> > (https://github.com/apache/incubator-hudi/pull/1004) to support delete
> > using only HoodieKey (partitionPath, recordKey). Technically, we can
> > support an interface for delete with only recordKeys if the index is of
> > type global (the current implementation supports HoodieGlobalBloomIndex).
> > Within Uber, we use HBase as the global Hudi index to support
> > partition-agnostic record-key lookups. In other words, we can have 2
> > flavors of delete APIs: one with input being RDD<HoodieKeys> (works for
> > all index types) and another with input RDD<RecordKey> that works with a
> > global index. Our vision is to support an external clustered index
> > (global) as the de facto index that resides in DFS along with the dataset.
> > 2. HUDI-327: IIUC, just like ComplexKeyGenerator, the new key generator
> > would need composite keys (in this case primary and secondary, for
> > breaking the "null" tie). Are you concerned about the record-key footprint
> > for each key when using the key generated by ComplexKeyGenerator? In that
> > case, it makes sense to me. Otherwise, ComplexKeyGenerator should be able
> > to handle cases when some component of it is null, right?
> > 3. As for HUDI-83, at least on the write side, we have tied this to the
> > spark-2.4 upgrade. There is ongoing work happening in this regard. I will
> > request the folks who are working on this to provide status. Last I know,
> > we were running into some test failures when doing this upgrade. But yes,
> > as this is a massive upgrade, we would need your help in reviewing,
> > debugging and testing this change :)
> >
> > Others, thoughts?
> >
> > Thanks,
> > Balaji.V
> >
> > On Fri, Nov 8, 2019 at 2:49 PM Scheller, Brandon
> > <bsche...@amazon.com.invalid> wrote:
> >
> > > Hi Hudi community,
> > >
> > > We at AWS EMR are interested in starting work on a few different
> > > usability improvements for Hudi and we’re interested to hear your
> > > feedback.
> > >
> > > Here are some of our ideas:
> > > https://issues.apache.org/jira/browse/HUDI-326
> > > https://issues.apache.org/jira/browse/HUDI-327
> > >
> > > Additionally, we were hoping to help drive:
> > > https://issues.apache.org/jira/browse/HUDI-83 and its associated Hive
> > > Jira: https://issues.apache.org/jira/browse/HIVE-22224
> > >
> > > I am looking forward to improving Hudi with you all. And feel free to
> > > let us know if there is anything specific you’d like us to look at.
> > >
> > > Thanks,
> > > Brandon