phet commented on code in PR #3542:
URL: https://github.com/apache/gobblin/pull/3542#discussion_r954412100


##########
gobblin-data-management/src/test/java/org/apache/gobblin/data/management/copy/CopySourceTest.java:
##########
@@ -339,4 +344,39 @@ public void testDefaultHiveDatasetShardTempPaths()
       Assert.assertEquals(datasetPaths.contains(tempDirRoot + 
"/targetPath/testDB/table" + i), true);
     }
   }
+
+  @Test (expectedExceptions = RuntimeException.class)
+  public void testGetWorkUnitsExecutionFastFailure() {
+
+    SourceState state = new SourceState();
+
+    state.setProp(ConfigurationKeys.SOURCE_FILEBASED_FS_URI, "file:///");
+    state.setProp(ConfigurationKeys.WRITER_FILE_SYSTEM_URI, "file:///");
+    state.setProp(ConfigurationKeys.DATA_PUBLISHER_FINAL_DIR, "/target/dir");
+    state.setProp(DatasetUtils.DATASET_PROFILE_CLASS_KEY,
+        TestCopyablePartitionableDatasedFinder.class.getCanonicalName());
+    state.setProp(ConfigurationKeys.COPY_SOURCE_FILESET_WU_GENERATOR_CLASS, 
MockedFileSetWorkUnitGenerator.class.getName());
+    state.setProp(ConfigurationKeys.WORK_UNIT_FAST_FAIL_ENABLED, true);
+
+    CopySource source = new CopySource();
+
+    List<WorkUnit> workunits = source.getWorkunits(state);
+    Assert.assertNull(workunits);
+  }
+
+  class MockedFileSetWorkUnitGenerator extends 
CopySource.FileSetWorkUnitGenerator {
+
+    public MockedFileSetWorkUnitGenerator(CopyableDatasetBase copyableDataset, 
FileSet<CopyEntity> fileSet, State state,
+        FileSystem targetFs, SetMultimap<FileSet<CopyEntity>, WorkUnit> 
workUnitList,
+        Optional<CopyableFileWatermarkGenerator> watermarkGenerator, long 
minWorkUnitWeight,
+        Optional<LineageInfo> lineageInfo) {
+      super(copyableDataset, fileSet, state, targetFs, workUnitList, 
watermarkGenerator, minWorkUnitWeight,
+          lineageInfo);
+    }
+
+    @Override
+    public Void call(){
+      return null;

Review Comment:
   I'm surprised here the mock doesn't throw an exception.  I'm guessing this 
results in NPE, so same difference, but for documentation's sake, an explicit 
`RuntimeException("boom!")` would better capture the intent.



##########
gobblin-data-management/src/test/java/org/apache/gobblin/data/management/copy/CopySourceTest.java:
##########
@@ -339,4 +344,39 @@ public void testDefaultHiveDatasetShardTempPaths()
       Assert.assertEquals(datasetPaths.contains(tempDirRoot + 
"/targetPath/testDB/table" + i), true);
     }
   }
+
+  @Test (expectedExceptions = RuntimeException.class)
+  public void testGetWorkUnitsExecutionFastFailure() {
+
+    SourceState state = new SourceState();
+
+    state.setProp(ConfigurationKeys.SOURCE_FILEBASED_FS_URI, "file:///");
+    state.setProp(ConfigurationKeys.WRITER_FILE_SYSTEM_URI, "file:///");
+    state.setProp(ConfigurationKeys.DATA_PUBLISHER_FINAL_DIR, "/target/dir");
+    state.setProp(DatasetUtils.DATASET_PROFILE_CLASS_KEY,
+        TestCopyablePartitionableDatasedFinder.class.getCanonicalName());
+    state.setProp(ConfigurationKeys.COPY_SOURCE_FILESET_WU_GENERATOR_CLASS, 
MockedFileSetWorkUnitGenerator.class.getName());
+    state.setProp(ConfigurationKeys.WORK_UNIT_FAST_FAIL_ENABLED, true);
+
+    CopySource source = new CopySource();
+
+    List<WorkUnit> workunits = source.getWorkunits(state);
+    Assert.assertNull(workunits);
+  }
+
+  class MockedFileSetWorkUnitGenerator extends 
CopySource.FileSetWorkUnitGenerator {

Review Comment:
   `AlwaysThrowsMockedFileSetWorkUnitGenerator`?



##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/CopySource.java:
##########
@@ -214,6 +214,7 @@ public List<WorkUnit> getWorkunits(final SourceState state) 
{
       failJobIfAllRequestsRejected(allocator, prioritizedFileSets);
 
       String filesetWuGeneratorAlias = 
state.getProp(ConfigurationKeys.COPY_SOURCE_FILESET_WU_GENERATOR_CLASS, 
FileSetWorkUnitGenerator.class.getName());
+      boolean isWUFastFailOverEnabled = 
state.getPropAsBoolean(ConfigurationKeys.WORK_UNIT_FAST_FAIL_ENABLED, true);

Review Comment:
   I agree on naming the default value and making that `static`.
   
   I believe the default of 'true' is warranted.  fatal failure is believed to 
be ideal and appreciated across the board.  nonetheless, as it is backwards 
incompatible behavior for our installed legacy base, we provide this means to 
disable the new behavior selectively on each specific flow config.



##########
gobblin-api/src/main/java/org/apache/gobblin/configuration/ConfigurationKeys.java:
##########
@@ -296,6 +296,7 @@ public class ConfigurationKeys {
   public static final String WORK_UNIT_STATE_ACTUAL_HIGH_WATER_MARK_KEY = 
"workunit.state.actual.high.water.mark";
   public static final String WORK_UNIT_DATE_PARTITION_KEY = 
"workunit.source.date.partition";
   public static final String WORK_UNIT_DATE_PARTITION_NAME = 
"workunit.source.date.partitionName";
+  public static final String WORK_UNIT_FAST_FAIL_ENABLED = 
"workunit.fast.fail.enabled";

Review Comment:
   technically the WU generator is already failing fast--we just want that to 
be regarded as fatal overall, not merely logged.  maybe 
`"workunit.generator.failure.is.fatal"`?



##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/CopySource.java:
##########
@@ -214,6 +214,7 @@ public List<WorkUnit> getWorkunits(final SourceState state) 
{
       failJobIfAllRequestsRejected(allocator, prioritizedFileSets);
 
       String filesetWuGeneratorAlias = 
state.getProp(ConfigurationKeys.COPY_SOURCE_FILESET_WU_GENERATOR_CLASS, 
FileSetWorkUnitGenerator.class.getName());
+      boolean isWUFastFailOverEnabled = 
state.getPropAsBoolean(ConfigurationKeys.WORK_UNIT_FAST_FAIL_ENABLED, true);

Review Comment:
   "failover" doesn't seem quite right to me either.  maybe 
`isWuGeneratorFailureFatal` or `shouldWuGeneratorFailureBeFatal`?  (note the 
`Wu` casing, as used in the line directly above, rather than `WU`.)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to