vinothchandar commented on a change in pull request #2136:
URL: https://github.com/apache/hudi/pull/2136#discussion_r511527366



##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieClient.java
##########
@@ -30,12 +30,16 @@
 import org.apache.hudi.config.HoodieWriteConfig;
 
 import org.apache.hadoop.fs.FileSystem;
+import org.apache.hudi.exception.HoodieException;
+import org.apache.hudi.index.HoodieIndex;
 import org.apache.log4j.LogManager;
 import org.apache.log4j.Logger;
 
 import java.io.IOException;
 import java.io.Serializable;
 
+import static org.apache.hudi.config.HoodieIndexConfig.INDEX_TYPE_PROP;

Review comment:
       We ideally need some guidelines around this. but I would to avoid static 
imports unles the context is super clear. here, I think its better to just 
import HoodieIndexConfig alone. 

##########
File path: 
hudi-common/src/test/java/org/apache/hudi/common/table/timeline/TestHoodieActiveTimeline.java
##########
@@ -111,7 +111,8 @@ public void testLoadingInstantsFromFiles() throws 
IOException {
     // Backwards compatibility testing for reading compaction plans
     metaClient = 
HoodieTableMetaClient.initTableType(metaClient.getHadoopConf(),
         metaClient.getBasePath(), metaClient.getTableType(), 
metaClient.getTableConfig().getTableName(),
-        metaClient.getArchivePath(), 
metaClient.getTableConfig().getPayloadClass(), VERSION_0);
+        metaClient.getArchivePath(), 
metaClient.getTableConfig().getPayloadClass(), VERSION_0,
+        
metaClient.getTableConfig().getProperties().getProperty("hoodie.index.type"));

Review comment:
       and refer here and everywhere else

##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieClient.java
##########
@@ -124,9 +128,30 @@ public HoodieWriteConfig getConfig() {
     return config;
   }
 
+  private void checkIndexTypeCompatible(String writeIndexType, String 
persistIndexType) {
+    if (writeIndexType.equals(persistIndexType)) {

Review comment:
       could we strcuture this more easily as a switch statement, so it's 
easier to extend. or even just a single if -check. any reason the else block 
below can't be a else if block? 

##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieClient.java
##########
@@ -124,9 +128,30 @@ public HoodieWriteConfig getConfig() {
     return config;
   }
 
+  private void checkIndexTypeCompatible(String writeIndexType, String 
persistIndexType) {
+    if (writeIndexType.equals(persistIndexType)) {
+      return;
+    } else {
+      if ((persistIndexType.equals(HoodieIndex.IndexType.GLOBAL_BLOOM.name())
+          && (writeIndexType.equals(HoodieIndex.IndexType.BLOOM.name()) || 
writeIndexType.equals(HoodieIndex.IndexType.SIMPLE.name())))
+          || writeIndexType.equals(HoodieIndex.IndexType.INMEMORY.name())) {
+        return;
+      } else {
+        throw new HoodieException("The new write indextype " + writeIndexType

Review comment:
       One tricky thing here is  : we have made HoodieIndex pluggable cc @leesf 
. So, the index type there would be unknown? How about adding a CUSTOM index 
type and a new `indexType()` to HoodieIndex class? then we can treat that here 
as something that is not compatible with any other index type. 
   
   Right now, we write null for this case? So this method could error in that 
scenario. Just saying we need to think about this. 

##########
File path: 
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
##########
@@ -389,6 +389,9 @@ private static HoodieTableMetaClient 
initTableType(Configuration hadoopConf, Str
       properties.put(HoodieTableConfig.HOODIE_BOOTSTRAP_BASE_PATH, 
bootstrapBasePath);
     }
 
+    if (null != indexType) {
+      properties.put("hoodie.index.type", indexType);

Review comment:
       lets add a new HoodieTableConfig for this constant?

##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieClient.java
##########
@@ -124,9 +128,30 @@ public HoodieWriteConfig getConfig() {
     return config;
   }
 
+  private void checkIndexTypeCompatible(String writeIndexType, String 
persistIndexType) {
+    if (writeIndexType.equals(persistIndexType)) {
+      return;
+    } else {
+      if ((persistIndexType.equals(HoodieIndex.IndexType.GLOBAL_BLOOM.name())

Review comment:
       It appears that this is currently allowing the following cases 
   
   a) if we persisted global_index, then allow transitions to BLOOM, SIMPLE. 
   
   
   Additionally, I think 
   
   - We should error out if INMEMORY is every used in production code :) 
   - GLOBAL_SIMPLE should be compatible with GLOBAL_BLOOM (and converse)

##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieClient.java
##########
@@ -124,9 +128,30 @@ public HoodieWriteConfig getConfig() {
     return config;
   }
 
+  private void checkIndexTypeCompatible(String writeIndexType, String 
persistIndexType) {

Review comment:
       could we possibly move this method to `HoodieIndex` to keep it modular. 

##########
File path: 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/index/TestHoodieIndex.java
##########
@@ -77,7 +79,14 @@
 
   private void setUp(IndexType indexType) throws Exception {
     this.indexType = indexType;
-    initResources();
+    initPath();

Review comment:
       is it possible to change `initResources()` to take a properties file?

##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/upgrade/AbstractUpgradeDowngrade.java
##########
@@ -132,6 +134,8 @@ protected void run(HoodieTableVersion toVersion, String 
instantTime) throws IOEx
   }
 
   private void createUpdatedFile(Properties props) throws IOException {
+    DefaultHoodieConfig.setDefaultOnCondition(props, 
!props.containsKey(HoodieIndexConfig.INDEX_TYPE_PROP),

Review comment:
       just confirming: ideally there would be an index type set already in 
props, in most cases. so thats what gets written. not the default, in that case

##########
File path: 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/TestTableIndexTypeCompatible.java
##########
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.config.HoodieIndexConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.exception.HoodieException;
+import org.apache.hudi.index.HoodieIndex.IndexType;
+import org.apache.hudi.testutils.HoodieClientTestBase;
+import org.junit.jupiter.params.ParameterizedTest;
+import org.junit.jupiter.params.provider.MethodSource;
+
+import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+
+import java.io.IOException;
+import java.util.Arrays;
+
+import static 
org.apache.hudi.common.table.timeline.versioning.TimelineLayoutVersion.VERSION_1;
+
+public class TestTableIndexTypeCompatible extends HoodieClientTestBase {

Review comment:
       rename: TestIndexCompatibility 

##########
File path: hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##########
@@ -251,9 +253,11 @@ private[hudi] object HoodieSparkSqlWriter {
     if (!tableExists) {
       val archiveLogFolder = parameters.getOrElse(
         HoodieTableConfig.HOODIE_ARCHIVELOG_FOLDER_PROP_NAME, "archived")
+      val indexType = parameters.getOrElse(

Review comment:
       same here

##########
File path: 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/index/TestHoodieIndex.java
##########
@@ -56,12 +56,14 @@
 import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
+import java.util.Properties;
 import java.util.Random;
 import java.util.UUID;
 
 import scala.Tuple2;
 
 import static 
org.apache.hudi.common.testutils.SchemaTestUtil.getSchemaFromResource;
+import static org.apache.hudi.config.HoodieIndexConfig.INDEX_TYPE_PROP;

Review comment:
       please no static imports for this. :)

##########
File path: hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##########
@@ -110,9 +111,11 @@ private[hudi] object HoodieSparkSqlWriter {
       if (!tableExists) {
         val archiveLogFolder = parameters.getOrElse(
           HoodieTableConfig.HOODIE_ARCHIVELOG_FOLDER_PROP_NAME, "archived")
+        val indexType = parameters.getOrElse(

Review comment:
       I think we have a place where all common properties are set.
   
       val parameters = HoodieWriterUtils.parametersWithWriteDefaults(optParams)
    
    can set the index type default there and simply use it here. i.e no need to 
do parameters.getOrElse, just parameters.get() 

##########
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -221,7 +221,8 @@ private void refreshTimeline() throws IOException {
     } else {
       this.commitTimelineOpt = Option.empty();
       HoodieTableMetaClient.initTableType(new 
Configuration(jssc.hadoopConfiguration()), cfg.targetBasePath,
-          HoodieTableType.valueOf(cfg.tableType), cfg.targetTableName, 
"archived", cfg.payloadClassName, cfg.baseFileFormat);
+          HoodieTableType.valueOf(cfg.tableType), cfg.targetTableName, 
"archived", cfg.payloadClassName, cfg.baseFileFormat,
+          getHoodieClientConfig(this.schemaProvider).getIndexType().name());

Review comment:
       getHoodieClientConfig() does a bunch of things. Can we pass some object 
or use an existing member and avoid calling getHoodieClientConfig() again here? 

##########
File path: 
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/index/TestHoodieIndex.java
##########
@@ -77,7 +79,14 @@
 
   private void setUp(IndexType indexType) throws Exception {
     this.indexType = indexType;
-    initResources();
+    initPath();

Review comment:
       Love to not expand these setup blocks again. our dear Raymond has spent 
a lot of time cleaning them up :)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to