qjqqyy commented on code in PR #5520: URL: https://github.com/apache/hudi/pull/5520#discussion_r874349668
########## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/keygen/ComplexKeyGenerator.java: ########## @@ -37,14 +36,10 @@ public class ComplexKeyGenerator extends BuiltinKeyGenerator { public ComplexKeyGenerator(TypedProperties props) { super(props); - this.recordKeyFields = Arrays.stream(props.getString(KeyGeneratorOptions.RECORDKEY_FIELD_NAME.key()).split(",")) - .map(String::trim) - .filter(s -> !s.isEmpty()) - .collect(Collectors.toList()); - this.partitionPathFields = Arrays.stream(props.getString(KeyGeneratorOptions.PARTITIONPATH_FIELD_NAME.key()).split(",")) - .map(String::trim) - .filter(s -> !s.isEmpty()) - .collect(Collectors.toList()); + this.recordKeyFields = props.getStringList(KeyGeneratorOptions.RECORDKEY_FIELD_NAME.key(), ",", + Collections.singletonList(KeyGeneratorOptions.RECORDKEY_FIELD_NAME.defaultValue())); Review Comment: hi @codope, I think this change is backwards compatible. Given a config that works in ≤0.11.0, the behaviour is the same after this change. Assume that `uuid` is not part of the Schema, and the user did not set `RECORDKEY_FIELD_NAME`. - Old behaviour: keygen throws an exception due to unset config - New behaviour: the write fails due to an invalid record key Wouldn't the new failure mode make more sense considering that `RECORDKEY_FIELD_NAME` actually has a default value? In any case, I can re-scope this change to only fix hive-sync if needed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org