ZihanLi58 commented on a change in pull request #3252:
URL: https://github.com/apache/gobblin/pull/3252#discussion_r604510346



##########
File path: gobblin-iceberg/src/main/java/org/apache/gobblin/iceberg/publisher/GobblinMCEPublisher.java
##########
@@ -132,6 +137,36 @@ public void publishData(Collection<? extends WorkUnitState> states) throws IOExc
     return newFiles;
   }
 
+  /**
+   * Choose the latest file from the work unit state. There will be no modification to the file.
+   * It's used in GMCE writer {@link GobblinMCEWriter} merely for getting the DB and table name.
+   * @throws IOException
+   */
+  private Map<Path, Metrics> computeDummyFile (State state) throws IOException {
+    Map<Path, Metrics> newFiles = new HashMap<>();
+    FileSystem fs = FileSystem.get(conf);
+    for (final String pathString : state.getPropAsList(ConfigurationKeys.DATA_PUBLISHER_DATASET_DIR, "")) {

Review comment:
One more thing I want to point out: if writer.partition.prefix is set, you
may also want to include that value in the initial path. Otherwise you could
end up picking up the daily data, which may cause the following operation to
fail.
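
A rough sketch of the idea, not the actual patch. It uses plain java.util.Properties as a stand-in for the Gobblin State, and the class name, the method name, and the example value "hourly" are all made up for illustration. Only the writer.partition.prefix key comes from the real configuration.

```java
import java.util.Properties;

public class PartitionPrefixSketch {
  // Config key named in the review comment; everything else here is hypothetical.
  static final String PARTITION_PREFIX_KEY = "writer.partition.prefix";

  /**
   * Returns the directory to start the file search from: the dataset dir itself,
   * or the dataset dir extended with the partition prefix when one is configured.
   */
  static String initialPath(Properties props, String datasetDir) {
    String prefix = props.getProperty(PARTITION_PREFIX_KEY, "").trim();
    if (prefix.isEmpty()) {
      // No prefix configured: search from the dataset dir as before.
      return datasetDir;
    }
    // Normalize slashes so the two parts are joined by exactly one separator.
    String base = datasetDir.endsWith("/")
        ? datasetDir.substring(0, datasetDir.length() - 1)
        : datasetDir;
    String child = prefix.startsWith("/") ? prefix.substring(1) : prefix;
    return base + "/" + child;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(PARTITION_PREFIX_KEY, "hourly"); // example value only
    System.out.println(initialPath(props, "/data/tracking/MyEvent"));
  }
}
```

In computeDummyFile the same check would presumably happen once per entry of DATA_PUBLISHER_DATASET_DIR before walking the directory, so the search never descends into a sibling partition directory.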




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

