hudi-agent commented on code in PR #18893:
URL: https://github.com/apache/hudi/pull/18893#discussion_r3416625898


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/HiveSchemaProvider.java:
##########
@@ -86,18 +83,16 @@ public HiveSchemaProvider(TypedProperties props, 
JavaSparkContext jssc) {
   }
 
   @Override
-  @Deprecated
-  public Schema getSourceSchema() {
-    return getSourceHoodieSchema().toAvroSchema();
+  public HoodieSchema getSourceHoodieSchema() {

Review Comment:
   🤖 Same concern as on `FilebasedSchemaProvider`: the deprecated 
`getSourceSchema()`/`getTargetSchema()` overrides are removed here, so any 
caller of the legacy API now reaches the parent's default, which throws 
`UnsupportedOperationException`. Could you keep them as thin 
`getXxxHoodieSchema().toAvroSchema()` delegators (as 
`SchemaProviderWithPostProcessor` still does) until callers are migrated?
   
   <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/FilebasedSchemaProvider.java:
##########
@@ -72,26 +70,31 @@ public FilebasedSchemaProvider(TypedProperties props, 
JavaSparkContext jssc) {
     }
   }
 
-  private Schema parseSchema(String schemaFile) {
+  private HoodieSchema parseSchema(String schemaFile) {
     return readSchemaFromFile(schemaFile, this.fs, config);
   }
 
   @Override
-  public Schema getTargetSchema() {
+  public HoodieSchema getSourceHoodieSchema() {
+    return sourceSchema;
+  }

Review Comment:
   🤖 Removing the `@Getter` (which generated `getSourceSchema()`) and dropping 
the deprecated `getTargetSchema()` override means callers of the legacy API now 
hit the parent default in `SchemaProvider`, which throws 
`UnsupportedOperationException` — e.g. 
`hudi-integ-test/.../WriterContext.java:79` calls 
`schemaProvider.getSourceSchema().toString()` and would now break for this 
provider. Could you either keep the deprecated overrides delegating to 
`getSourceHoodieSchema().toAvroSchema()` (the pattern 
`SchemaProviderWithPostProcessor` already uses) or update the parent's 
deprecated methods to delegate? @yihua
   
   <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/SimpleSchemaProvider.java:
##########
@@ -21,17 +21,18 @@
 import org.apache.hudi.common.config.TypedProperties;
 import org.apache.hudi.common.schema.HoodieSchema;
 
-import lombok.Getter;
-import org.apache.avro.Schema;
 import org.apache.spark.api.java.JavaSparkContext;
 
-@Getter
 public class SimpleSchemaProvider extends SchemaProvider {
-
-  private final Schema sourceSchema;
+  private final HoodieSchema sourceSchema;
 
   public SimpleSchemaProvider(JavaSparkContext jssc, HoodieSchema 
sourceSchema, TypedProperties props) {
     super(props, jssc);
-    this.sourceSchema = sourceSchema.toAvroSchema();
+    this.sourceSchema = sourceSchema;
+  }
+
+  @Override
+  public HoodieSchema getSourceHoodieSchema() {

Review Comment:
   🤖 Same backward-compat concern as the other migrated providers — the removed 
`@Getter`-generated `getSourceSchema()` is no longer present, so calling the 
deprecated API on a `SimpleSchemaProvider` now throws 
`UnsupportedOperationException` (the parent's default). Consider keeping a 
deprecated `getSourceSchema()` override returning `sourceSchema.toAvroSchema()` 
until callers are migrated.
   
   <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/FilebasedSchemaProvider.java:
##########
@@ -72,26 +70,31 @@ public FilebasedSchemaProvider(TypedProperties props, 
JavaSparkContext jssc) {
     }
   }
 
-  private Schema parseSchema(String schemaFile) {
+  private HoodieSchema parseSchema(String schemaFile) {
     return readSchemaFromFile(schemaFile, this.fs, config);
   }
 
   @Override
-  public Schema getTargetSchema() {
+  public HoodieSchema getSourceHoodieSchema() {
+    return sourceSchema;
+  }
+
+  @Override
+  public HoodieSchema getTargetHoodieSchema() {

Review Comment:
   🤖 nit: could you simplify this to `return targetSchema != null ? 
targetSchema : super.getTargetHoodieSchema();`? The same if/else appears in 
`HiveSchemaProvider.getTargetHoodieSchema()` too — worth collapsing both.
   
   <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to