AakashPradeep commented on a change in pull request #1580: URL: https://github.com/apache/incubator-hudi/pull/1580#discussion_r419837606
##########
File path: hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##########
@@ -83,6 +83,13 @@ private[hudi] object HoodieSparkSqlWriter {
     val fs = basePath.getFileSystem(sparkContext.hadoopConfiguration)
     var exists = fs.exists(new Path(basePath, HoodieTableMetaClient.METAFOLDER_NAME))
+    if (exists && mode == SaveMode.Append) {
+      val existingTableName = new HoodieTableMetaClient(sparkContext.hadoopConfiguration, path.get).getTableConfig.getTableName
+      if (!existingTableName.equals(tblName.get)) {
+        throw new HoodieException(s"hoodie table with name $existingTableName already exist at $basePath")

Review comment:
Thanks for the comment, @vinothchandar. I would suggest the following:

1. I can use HoodieTableConfig here instead of HoodieTableMetaClient, which seems a little expensive for reading just the table name.
2. I will explore the hoodie-client code. However, I would suggest either keeping all the save-mode-based checks in this class or moving all of them to hoodie-client. The earlier the exception is thrown, the better, but I will leave that decision to you.
3. If we decide to keep the checks where they are, I suggest moving the checks at line 116 to the beginning of the if section so that we can fail fast and avoid a lot of initialization. The same applies to the table existence check at line 172: it should be moved to the beginning of the else section.

Please let me know if this sounds reasonable to you. I can file a separate Jira for this improvement. Thanks!
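
To illustrate the fail-fast ordering from point 3, here is a minimal, self-contained sketch. It is not the actual HoodieSparkSqlWriter code: `Mode`, `TableNameMismatchException`, `FailFastSketch`, `validate`, and `write` are hypothetical names made up for this example, and the existing table name is passed in as an `Option[String]` rather than read through HoodieTableConfig or HoodieTableMetaClient. The only point is that the cheap check runs before any expensive initialization.

```scala
// Hypothetical stand-ins for illustration only; none of these names exist in Hudi.
sealed trait Mode
case object Append extends Mode
case object Overwrite extends Mode

final class TableNameMismatchException(msg: String) extends RuntimeException(msg)

object FailFastSketch {

  // Cheap validation: existingTableName is None when no table exists at basePath.
  def validate(mode: Mode,
               existingTableName: Option[String],
               requestedName: String,
               basePath: String): Unit =
    (mode, existingTableName) match {
      case (Append, Some(existing)) if existing != requestedName =>
        throw new TableNameMismatchException(
          s"hoodie table with name $existing already exists at $basePath")
      case _ => () // nothing to reject
    }

  // The check runs first, so expensiveInit is never invoked for an invalid request.
  def write(mode: Mode,
            existingTableName: Option[String],
            requestedName: String,
            basePath: String)(expensiveInit: () => Unit): Unit = {
    validate(mode, existingTableName, requestedName, basePath)
    expensiveInit() // assumed to stand in for client and schema setup
  }
}
```

With this ordering, an Append against a table with a different name throws before `expensiveInit` is called, while a matching name or a non-Append mode proceeds to initialization as before.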