the-other-tim-brown commented on code in PR #8638:
URL: https://github.com/apache/hudi/pull/8638#discussion_r1194391041


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java:
##########
@@ -1119,7 +1129,7 @@ private Schema getSchemaForWriteConfig(Schema targetSchema) {
       }
       return newWriteSchema;
     } catch (Exception e) {
-      throw new HoodieException("Failed to fetch schema from table ", e);
+      throw new HoodieDeltaStreamerSchemaFetchException("Failed to fetch schema from table ", e);

Review Comment:
   nitpick: don't need a trailing space



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/AvroConvertor.java:
##########
@@ -108,9 +109,13 @@ private void initJsonConvertor() {
   }
 
   public GenericRecord fromJson(String json) {
-    initSchema();
-    initJsonConvertor();
-    return jsonConverter.convert(json, schema);
+    try {
+      initSchema();
+      initJsonConvertor();
+      return jsonConverter.convert(json, schema);
+    } catch (Exception e) {
+      throw new HoodieDeltaStreamerSchemaCompatibilityException("Failed to convert schema from json to avro", e);

Review Comment:
   We want to distinguish between parsing/conversion-style errors and
compatibility errors. A compatibility error means something violates the
supported schema evolution rules described
[here](https://hudi.apache.org/docs/schema_evolution#out-of-the-box-schema-evolution).

   All of the errors in this class are parsing/conversion-style errors.
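
   A minimal sketch of that distinction, not the PR's actual code: the class names `ConversionException`, `CompatibilityException`, and `SchemaErrorDemo` are hypothetical, and the "long to int is rejected" rule is a toy assumption chosen only to illustrate a compatibility check.

```java
// Hypothetical exception types for illustration; these are not Hudi's actual classes.
class ConversionException extends RuntimeException {
  ConversionException(String message, Throwable cause) {
    super(message, cause);
  }
}

class CompatibilityException extends RuntimeException {
  CompatibilityException(String message) {
    super(message);
  }
}

class SchemaErrorDemo {
  // Parsing/conversion failure: the input cannot be turned into the target type at all.
  static long parseLongField(String raw) {
    try {
      return Long.parseLong(raw.trim());
    } catch (NumberFormatException e) {
      throw new ConversionException("Failed to convert '" + raw + "' to long", e);
    }
  }

  // Compatibility failure: both types are valid, but the change between them is not allowed.
  // Toy rule assumed for this sketch: narrowing a field from long to int is rejected.
  static void checkTypeChange(String field, String oldType, String newType) {
    if ("long".equals(oldType) && "int".equals(newType)) {
      throw new CompatibilityException(
          "Field '" + field + "' cannot change from " + oldType + " to " + newType);
    }
  }
}
```

   With separate types, a caller can catch the conversion case (bad input) independently of the compatibility case (disallowed schema change).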



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java:
##########
@@ -679,8 +683,12 @@ private Pair<SchemaProvider, Pair<String, JavaRDD<HoodieRecord>>> fetchFromSourc
   }
 
   private JavaRDD<GenericRecord> getTransformedRDD(Dataset<Row> rowDataset, boolean reconcileSchema, Schema readerSchema) {
-    return HoodieSparkUtils.createRdd(rowDataset, HOODIE_RECORD_STRUCT_NAME, HOODIE_RECORD_NAMESPACE, reconcileSchema,
-        Option.ofNullable(readerSchema)).toJavaRDD();
+    try {
+      return HoodieSparkUtils.createRdd(rowDataset, HOODIE_RECORD_STRUCT_NAME, HOODIE_RECORD_NAMESPACE, reconcileSchema,
+          Option.ofNullable(readerSchema)).toJavaRDD();
+    } catch (Exception e) {
+      throw new HoodieDeltaStreamerSchemaCompatibilityException("Failed to get transformed RDD", e);

Review Comment:
   Do we know that this will always be a schema issue?

   What is the `readerSchema` at this point? We could add context to the
exception message: this dataset's schema (possibly converted to Avro) and the
`readerSchema`.
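
   One way the message could carry that context, as a rough sketch only: `TransformFailedException` is a hypothetical name, and both schemas are passed as plain strings (an assumption made to keep the example free of Spark and Avro dependencies).

```java
// Hypothetical exception type; schemas are plain strings to keep the sketch dependency-free.
class TransformFailedException extends RuntimeException {
  TransformFailedException(String datasetSchema, String readerSchema, Throwable cause) {
    super(buildMessage(datasetSchema, readerSchema), cause);
  }

  // readerSchema may be absent, so null is rendered explicitly instead of failing.
  static String buildMessage(String datasetSchema, String readerSchema) {
    return "Failed to get transformed RDD (dataset schema: " + datasetSchema
        + ", reader schema: " + (readerSchema == null ? "<none>" : readerSchema) + ")";
  }
}
```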



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java:
##########
@@ -970,10 +979,11 @@ public void runMetaSync() {
         } catch (HoodieException e) {
           LOG.info("SyncTool class " + impl.trim() + " failed with exception", e);
           metaSyncExceptions.add(e);
+          implsFailed.add(impl.trim());
         }
       }
       if (!metaSyncExceptions.isEmpty()) {
-        throw SyncUtilHelpers.getExceptionFromList(metaSyncExceptions);
+        throw new HoodieDeltaStreamerMetaSyncException("Meta sync failure for " + String.join(",", implsFailed), SyncUtilHelpers.getExceptionFromList(metaSyncExceptions));

Review Comment:
   It may be easier to work with the exception if it takes in the list of
`implsFailed` and constructs the message in its call to `super`.
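
   A sketch of what that could look like, under stated assumptions: `MetaSyncFailureException` and `getFailedImpls` are hypothetical names, not the class in this PR, and the message format simply mirrors the one in the diff above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the exception under discussion; not Hudi's actual class.
class MetaSyncFailureException extends RuntimeException {
  private final List<String> failedImpls;

  MetaSyncFailureException(List<String> failedImpls, Throwable cause) {
    // The message is built here, so call sites only pass structured data.
    super("Meta sync failure for " + String.join(",", failedImpls), cause);
    // Defensive copy so later changes to the caller's list do not leak in.
    this.failedImpls = Collections.unmodifiableList(new ArrayList<>(failedImpls));
  }

  List<String> getFailedImpls() {
    return failedImpls;
  }
}
```

   Callers can then inspect `getFailedImpls()` directly instead of parsing the message string.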


