Hi community:


I want to start a discussion about improving the Hudi user experience.




Hudi has more and more users all over the world, but most of them don't 
know Hudi as well as the Uber engineers or we do.

When they start Hudi jobs, they need to set a lot of configuration, much of 
which is not user-friendly.




For example:
```

hoodie.datasource.write.keygenerator.class -> org.apache.hudi.keygen.SimpleKeyGenerator

hoodie.datasource.write.payload.class -> org.apache.hudi.OverwriteWithLatestAvroPayload

--schemaprovider-class -> subclass of org.apache.hudi.utilities.schema

--transformer-class -> full class names of the transformers to apply

--sync-tool-classes -> full class names of the sync tools

--source-class -> subclass of org.apache.hudi.utilities.sources
...
```

I think asking users to provide the fully qualified class name is not very 
friendly, especially for new users.

So maybe we can provide more than one way to set these parameters, as is 
already done for `HoodieIndex`.




In the `HoodieIndex` case, users can configure either an index type or an index 
class name to tell Hudi which index to use.

```

hoodie.index.type -> HBASE

```

or

```

hoodie.index.class -> org.apache.hudi.index.hbase.SparkHoodieHBaseIndex

```

I believe most users prefer the `hoodie.index.type` way.




So I think we can let some of the configurations above be set through a 
type, and keep the class name configuration at the same time, in case 
some users need to plug in their own custom implementations.
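
To make the idea concrete, here is a rough sketch of how a type-or-class lookup could work for the key generator. This is only an illustration, not existing Hudi code: the `hoodie.datasource.write.keygenerator.type` key, the `KeyGeneratorType` enum and the `KeyGeneratorResolver` class are names I made up for this example, and letting an explicit class name win over the type is an assumption about the precedence we would want.

```java
import java.util.Locale;
import java.util.Map;

// Sketch only: resolves a key generator class from either a short type
// name or a fully qualified class name. All names here are hypothetical.
final class KeyGeneratorResolver {

    // Hypothetical short names mapped to built-in key generator classes.
    enum KeyGeneratorType {
        SIMPLE("org.apache.hudi.keygen.SimpleKeyGenerator"),
        COMPLEX("org.apache.hudi.keygen.ComplexKeyGenerator");

        final String className;

        KeyGeneratorType(String className) {
            this.className = className;
        }
    }

    // Existing option: the fully qualified class name.
    static final String CLASS_KEY = "hoodie.datasource.write.keygenerator.class";
    // Hypothetical new option proposed in this thread: a short type name.
    static final String TYPE_KEY = "hoodie.datasource.write.keygenerator.type";

    // Returns the class name to instantiate. An explicit class name wins
    // (assumed precedence); otherwise the type is looked up, defaulting to SIMPLE.
    static String resolveClassName(Map<String, String> props) {
        String className = props.get(CLASS_KEY);
        if (className != null && !className.trim().isEmpty()) {
            return className.trim();
        }
        String type = props.getOrDefault(TYPE_KEY, KeyGeneratorType.SIMPLE.name());
        try {
            // Accept the type case-insensitively, e.g. "complex" or "COMPLEX".
            return KeyGeneratorType.valueOf(type.trim().toUpperCase(Locale.ROOT)).className;
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Unknown key generator type: " + type, e);
        }
    }
}
```

With something like this, a new user would only set the type to `COMPLEX`, while a user with a custom key generator would keep setting the class name as today. The same pattern could apply to the payload, schema provider, transformer, sync tool and source options.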




I'm looking forward to your feedback. Any suggestions are appreciated.
