hudi-agent commented on code in PR #18893:
URL: https://github.com/apache/hudi/pull/18893#discussion_r3416625898
##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/HiveSchemaProvider.java:
##########
@@ -86,18 +83,16 @@ public HiveSchemaProvider(TypedProperties props,
JavaSparkContext jssc) {
}
@Override
- @Deprecated
- public Schema getSourceSchema() {
- return getSourceHoodieSchema().toAvroSchema();
+ public HoodieSchema getSourceHoodieSchema() {
Review Comment:
🤖 Same concern as on `FilebasedSchemaProvider`: the deprecated
`getSourceSchema()`/`getTargetSchema()` overrides are removed here, so any
caller of the legacy API now reaches the parent's default, which throws
`UnsupportedOperationException`. Could you keep them as thin
`getXxxHoodieSchema().toAvroSchema()` delegators (as
`SchemaProviderWithPostProcessor` still does) until callers are migrated?
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/FilebasedSchemaProvider.java:
##########
@@ -72,26 +70,31 @@ public FilebasedSchemaProvider(TypedProperties props,
JavaSparkContext jssc) {
}
}
- private Schema parseSchema(String schemaFile) {
+ private HoodieSchema parseSchema(String schemaFile) {
return readSchemaFromFile(schemaFile, this.fs, config);
}
@Override
- public Schema getTargetSchema() {
+ public HoodieSchema getSourceHoodieSchema() {
+ return sourceSchema;
+ }
Review Comment:
🤖 Removing the `@Getter` (which generated `getSourceSchema()`) and dropping
the deprecated `getTargetSchema()` override means callers of the legacy API now
hit the parent default in `SchemaProvider`, which throws
`UnsupportedOperationException` — e.g.
`hudi-integ-test/.../WriterContext.java:79` calls
`schemaProvider.getSourceSchema().toString()` and would now break for this
provider. Could you either keep the deprecated overrides delegating to
`getSourceHoodieSchema().toAvroSchema()` (the pattern
`SchemaProviderWithPostProcessor` already uses) or update the parent's
deprecated methods to delegate? @yihua
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/SimpleSchemaProvider.java:
##########
@@ -21,17 +21,18 @@
import org.apache.hudi.common.config.TypedProperties;
import org.apache.hudi.common.schema.HoodieSchema;
-import lombok.Getter;
-import org.apache.avro.Schema;
import org.apache.spark.api.java.JavaSparkContext;
-@Getter
public class SimpleSchemaProvider extends SchemaProvider {
-
- private final Schema sourceSchema;
+ private final HoodieSchema sourceSchema;
public SimpleSchemaProvider(JavaSparkContext jssc, HoodieSchema
sourceSchema, TypedProperties props) {
super(props, jssc);
- this.sourceSchema = sourceSchema.toAvroSchema();
+ this.sourceSchema = sourceSchema;
+ }
+
+ @Override
+ public HoodieSchema getSourceHoodieSchema() {
Review Comment:
🤖 Same backward-compat concern as the other migrated providers — the removed
`@Getter`-generated `getSourceSchema()` is no longer present, so calling the
deprecated API on a `SimpleSchemaProvider` now throws
`UnsupportedOperationException` (the parent's default). Consider keeping a
deprecated `getSourceSchema()` override returning `sourceSchema.toAvroSchema()`
until callers are migrated.
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/FilebasedSchemaProvider.java:
##########
@@ -72,26 +70,31 @@ public FilebasedSchemaProvider(TypedProperties props,
JavaSparkContext jssc) {
}
}
- private Schema parseSchema(String schemaFile) {
+ private HoodieSchema parseSchema(String schemaFile) {
return readSchemaFromFile(schemaFile, this.fs, config);
}
@Override
- public Schema getTargetSchema() {
+ public HoodieSchema getSourceHoodieSchema() {
+ return sourceSchema;
+ }
+
+ @Override
+ public HoodieSchema getTargetHoodieSchema() {
Review Comment:
🤖 nit: could you simplify this to `return targetSchema != null ?
targetSchema : super.getTargetHoodieSchema();`? The same if/else appears in
`HiveSchemaProvider.getTargetHoodieSchema()` too — worth collapsing both.
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]