the-other-tim-brown commented on code in PR #8638: URL: https://github.com/apache/hudi/pull/8638#discussion_r1194391041
##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java:
##########
@@ -1119,7 +1129,7 @@ private Schema getSchemaForWriteConfig(Schema targetSchema) {
       }
       return newWriteSchema;
     } catch (Exception e) {
-      throw new HoodieException("Failed to fetch schema from table ", e);
+      throw new HoodieDeltaStreamerSchemaFetchException("Failed to fetch schema from table ", e);

Review Comment:
   nitpick: don't need a trailing space

##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/AvroConvertor.java:
##########
@@ -108,9 +109,13 @@ private void initJsonConvertor() {
   }
   public GenericRecord fromJson(String json) {
-    initSchema();
-    initJsonConvertor();
-    return jsonConverter.convert(json, schema);
+    try {
+      initSchema();
+      initJsonConvertor();
+      return jsonConverter.convert(json, schema);
+    } catch (Exception e) {
+      throw new HoodieDeltaStreamerSchemaCompatibilityException("Failed to convert schema from json to avro", e);

Review Comment:
   We want to distinguish between parsing/conversion style errors and compatibility errors.
   A compatibility error is more like something violating the supported schema evolution described [here](https://hudi.apache.org/docs/schema_evolution#out-of-the-box-schema-evolution).
   All of the errors in this class are parsing/conversion style errors.

##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java:
##########
@@ -679,8 +683,12 @@ private Pair<SchemaProvider, Pair<String, JavaRDD<HoodieRecord>>> fetchFromSourc
   }
   private JavaRDD<GenericRecord> getTransformedRDD(Dataset<Row> rowDataset, boolean reconcileSchema, Schema readerSchema) {
-    return HoodieSparkUtils.createRdd(rowDataset, HOODIE_RECORD_STRUCT_NAME, HOODIE_RECORD_NAMESPACE, reconcileSchema,
-        Option.ofNullable(readerSchema)).toJavaRDD();
+    try {
+      return HoodieSparkUtils.createRdd(rowDataset, HOODIE_RECORD_STRUCT_NAME, HOODIE_RECORD_NAMESPACE, reconcileSchema,
+          Option.ofNullable(readerSchema)).toJavaRDD();
+    } catch (Exception e) {
+      throw new HoodieDeltaStreamerSchemaCompatibilityException("Failed to get transformed RDD", e);

Review Comment:
   Do we know that this will always be a schema issue? What is the `readerSchema` at this point?
   We could include the context of this dataset's schema (possibly converted to Avro) and the `readerSchema`.

##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java:
##########
@@ -970,10 +979,11 @@ public void runMetaSync() {
         } catch (HoodieException e) {
           LOG.info("SyncTool class " + impl.trim() + " failed with exception", e);
           metaSyncExceptions.add(e);
+          implsFailed.add(impl.trim());
         }
       }
       if (!metaSyncExceptions.isEmpty()) {
-        throw SyncUtilHelpers.getExceptionFromList(metaSyncExceptions);
+        throw new HoodieDeltaStreamerMetaSyncException("Meta sync failure for " + String.join(",", implsFailed), SyncUtilHelpers.getExceptionFromList(metaSyncExceptions));

Review Comment:
   It may be easier to work with the exception if it takes in the list of `implsFailed` and then constructs the message in the call to `super`.
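
   To illustrate the suggestion about `implsFailed`, here is a rough sketch of what I have in mind. It is not the PR's actual class: `MetaSyncExceptionSketch` is a hypothetical name, it extends `RuntimeException` only to stay self-contained (the real exception would presumably extend a Hudi exception type), and the `getImplsFailed` accessor is an assumption on my part.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for HoodieDeltaStreamerMetaSyncException.
// Assumption: extends RuntimeException here only so the sketch compiles on its own.
class MetaSyncExceptionSketch extends RuntimeException {
  private final List<String> implsFailed;

  MetaSyncExceptionSketch(List<String> implsFailed, Throwable cause) {
    // The message is built in the call to super, so callers pass only the list.
    super("Meta sync failure for " + String.join(",", implsFailed), cause);
    // Defensive copy so later changes to the caller's list do not leak in.
    this.implsFailed = Collections.unmodifiableList(new ArrayList<>(implsFailed));
  }

  // Hypothetical accessor: lets handlers inspect the failed impls without parsing the message.
  List<String> getImplsFailed() {
    return implsFailed;
  }
}
```

   With something like this, the call site in `runMetaSync()` would pass `implsFailed` and the aggregated exception, and the message format would live in one place.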