prashantwason commented on a change in pull request #3836:
URL: https://github.com/apache/hudi/pull/3836#discussion_r744019234



##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java
##########
@@ -95,10 +95,19 @@ public SparkRDDWriteClient(HoodieEngineContext context, HoodieWriteConfig writeC
   public SparkRDDWriteClient(HoodieEngineContext context, HoodieWriteConfig writeConfig,
                              Option<EmbeddedTimelineService> timelineService) {
     super(context, writeConfig, timelineService);
+    bootstrapMetadataTable();
+  }
+
+  private void bootstrapMetadataTable() {
     if (config.isMetadataTableEnabled()) {
-      // If the metadata table does not exist, it should be bootstrapped here
-      // TODO: Check if we can remove this requirement - auto bootstrap on commit
-      SparkHoodieBackedTableMetadataWriter.create(context.getHadoopConf().get(), config, context);
+      // Defer bootstrap if upgrade / downgrade is pending
+      HoodieTableMetaClient metaClient = createMetaClient(true);
+      UpgradeDowngrade upgradeDowngrade = new UpgradeDowngrade(
+          metaClient, config, context, SparkUpgradeDowngradeHelper.getInstance());
+      if (!upgradeDowngrade.needsUpgradeOrDowngrade(HoodieTableVersion.current())) {

Review comment:
   I tried that, but it did not work because:
   1. By the time getTableAndInitCtx is called, an action has already been started on the table.
   2. Metadata bootstrap then does not happen, because it detects the in-progress action.

   Bootstrapping in the constructor is certainly not ideal.

   Possible approaches:
   1. Make the bootstrap aware of the "current" operation so that it can ignore it. Then we can bootstrap right after the upgrade/downgrade step (as you suggested).
   2. Bootstrap automatically before the write-to-metadata step.

   I prefer #1 too, as it is cleaner and the metadata table will be available before any actions on the dataset start.
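
   To make approach #1 concrete, here is a rough, self-contained sketch of the check it implies: bootstrap is allowed when the only in-progress instant on the timeline is the one belonging to the current operation. Everything in it (`BootstrapSketch`, `Instant`, `canBootstrap`, and the instant times) is a hypothetical stand-in for illustration, not Hudi's actual timeline API.

```java
import java.util.List;
import java.util.Optional;

// Illustrative sketch only: these types are hypothetical and do not mirror Hudi classes.
final class BootstrapSketch {

    // Minimal stand-in for a timeline instant: an instant time plus a completion flag.
    static final class Instant {
        final String time;
        final boolean completed;

        Instant(String time, boolean completed) {
            this.time = time;
            this.completed = completed;
        }
    }

    // Returns true when every in-progress instant is the current operation's own instant.
    // With no current instant, any in-progress instant blocks the bootstrap.
    static boolean canBootstrap(List<Instant> timeline, Optional<String> currentInstantTime) {
        return timeline.stream()
            .filter(instant -> !instant.completed)
            .allMatch(instant -> currentInstantTime.isPresent()
                && currentInstantTime.get().equals(instant.time));
    }
}
```

   With a check like this, the bootstrap could run right after the upgrade/downgrade step even though the write's own instant is already in flight, while still backing off for any other pending action.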



