qjqqyy commented on code in PR #5520:
URL: https://github.com/apache/hudi/pull/5520#discussion_r874349668


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/keygen/ComplexKeyGenerator.java:
##########
@@ -37,14 +36,10 @@ public class ComplexKeyGenerator extends 
BuiltinKeyGenerator {
 
   public ComplexKeyGenerator(TypedProperties props) {
     super(props);
-    this.recordKeyFields = 
Arrays.stream(props.getString(KeyGeneratorOptions.RECORDKEY_FIELD_NAME.key()).split(","))
-        .map(String::trim)
-        .filter(s -> !s.isEmpty())
-        .collect(Collectors.toList());
-    this.partitionPathFields = 
Arrays.stream(props.getString(KeyGeneratorOptions.PARTITIONPATH_FIELD_NAME.key()).split(","))
-        .map(String::trim)
-        .filter(s -> !s.isEmpty())
-        .collect(Collectors.toList());
+    this.recordKeyFields = 
props.getStringList(KeyGeneratorOptions.RECORDKEY_FIELD_NAME.key(), ",",
+        
Collections.singletonList(KeyGeneratorOptions.RECORDKEY_FIELD_NAME.defaultValue()));

Review Comment:
   hi @codope, I think this change is backwards compatible. Given a config that 
works in <0.11.0, the behaviour is the same after this change.
   
   Assume that `uuid` is not part of the Schema, and the user did not set 
`RECORDKEY_FIELD_NAME`.
   
   - Old behaviour: keygen throws an exception due to unset config
   - New behaviour: the write fails due to an invalid record key
   
   Wouldn't the new failure mode make more sense considering that 
`RECORDKEY_FIELD_NAME` actually has a default value?
   
   In any case, I can re-scope this change to only fix hive-sync if needed.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to