Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-09 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1595870231


##
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java:
##
@@ -635,16 +603,7 @@ public static Long getSizeInMB(long sizeInBytes) {
   }
 
   public static Path constructAbsolutePathInHadoopPath(String basePath, String 
relativePartitionPath) {
-if (StringUtils.isNullOrEmpty(relativePartitionPath)) {
-  return new Path(basePath);
-}
-
-// NOTE: We have to chop leading "/" to make sure Hadoop does not treat it 
like
-//   absolute path
-String properPartitionPath = 
relativePartitionPath.startsWith(PATH_SEPARATOR)
-? relativePartitionPath.substring(1)
-: relativePartitionPath;
-return constructAbsolutePath(new CachingPath(basePath), 
properPartitionPath);
+return new Path(constructAbsolutePath(basePath, 
relativePartitionPath).toUri());

Review Comment:
   Got it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-09 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1595862165


##
hudi-common/src/main/java/org/apache/hudi/storage/HoodieStorageUtils.java:
##
@@ -19,37 +19,63 @@
 
 package org.apache.hudi.storage;
 
-import org.apache.hudi.hadoop.fs.HadoopFSUtils;
-import org.apache.hudi.hadoop.fs.HoodieWrapperFileSystem;
-import org.apache.hudi.storage.hadoop.HoodieHadoopStorage;
+import org.apache.hudi.common.fs.ConsistencyGuard;
+import org.apache.hudi.common.util.ReflectionUtils;
 
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;
 
 public class HoodieStorageUtils {
+
+  public static final String HUDI_HADOOP_STORAGE = 
"org.apache.hudi.storage.hadoop.HoodieHadoopStorage";
+  public static final String HADOOP_STORAGE_CONF = 
"org.apache.hudi.storage.hadoop.HadoopStorageConfiguration";
   public static final String DEFAULT_URI = "file:///";
 
   public static HoodieStorage getStorage(StorageConfiguration conf) {
 return getStorage(DEFAULT_URI, conf);
   }
 
   public static HoodieStorage getStorage(FileSystem fs) {
-return new HoodieHadoopStorage(fs);
+return (HoodieStorage) ReflectionUtils.loadClass(HUDI_HADOOP_STORAGE, new 
Class[] {FileSystem.class}, fs);
   }
 
   public static HoodieStorage getStorage(String basePath, 
StorageConfiguration conf) {
-return getStorage(HadoopFSUtils.getFs(basePath, conf));
+return (HoodieStorage) ReflectionUtils.loadClass(HUDI_HADOOP_STORAGE, new 
Class[] {String.class, StorageConfiguration.class}, basePath, conf);
+  }
+
+  public static HoodieStorage getStorage(String basePath, Configuration conf) {
+return (HoodieStorage) ReflectionUtils.loadClass(HUDI_HADOOP_STORAGE, new 
Class[] {String.class, Configuration.class}, basePath, conf);
   }
 
   public static HoodieStorage getStorage(StoragePath path, 
StorageConfiguration conf) {
-return getStorage(HadoopFSUtils.getFs(path, 
conf.unwrapAs(Configuration.class)));
+return (HoodieStorage) ReflectionUtils.loadClass(HUDI_HADOOP_STORAGE, new 
Class[] {StoragePath.class, StorageConfiguration.class}, path, conf);
+  }
+
+  public static HoodieStorage getStorage(StoragePath path,
+ StorageConfiguration conf,
+ boolean enableRetry,
+ long maxRetryIntervalMs,
+ int maxRetryNumbers,
+ long initialRetryIntervalMs,
+ String retryExceptions,
+ ConsistencyGuard consistencyGuard) {
+return (HoodieStorage) ReflectionUtils.loadClass(HUDI_HADOOP_STORAGE,
+new Class[] {StoragePath.class, StorageConfiguration.class, 
boolean.class, long.class, int.class, long.class,
+String.class, ConsistencyGuard.class},
+path, conf, enableRetry, maxRetryIntervalMs, maxRetryNumbers, 
initialRetryIntervalMs, retryExceptions,
+consistencyGuard);
   }
 
   public static HoodieStorage getRawStorage(HoodieStorage storage) {
-FileSystem fs = (FileSystem) storage.getFileSystem();
-if (fs instanceof HoodieWrapperFileSystem) {
-  return getStorage(((HoodieWrapperFileSystem) fs).getFileSystem());
-}
-return storage;
+return (HoodieStorage) ReflectionUtils.loadClass(HUDI_HADOOP_STORAGE, new 
Class[] {HoodieStorage.class}, storage);
+  }
+
+  public static StorageConfiguration getStorageConf(Configuration conf) {
+return (StorageConfiguration) 
ReflectionUtils.loadClass(HADOOP_STORAGE_CONF,
+new Class[] {Configuration.class}, conf);
+  }
+
+  public static StorageConfiguration 
getStorageConfWithCopy(Configuration conf) {
+return (StorageConfiguration) 
getStorageConf(conf).newInstance();

Review Comment:
   Should this use the constructor `HadoopStorageConfiguration(Configuration 
configuration, boolean copy)` instead?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-09 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1595861415


##
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java:
##
@@ -209,21 +193,7 @@ public static List 
getAllPartitionFoldersThreeLevelsDown(HoodieStorage s
* Given a base partition and a partition path, return relative path of 
partition path to the base path.
*/
   public static String getRelativePartitionPath(Path basePath, Path 
fullPartitionPath) {
-basePath = CachingPath.getPathWithoutSchemeAndAuthority(basePath);
-fullPartitionPath = 
CachingPath.getPathWithoutSchemeAndAuthority(fullPartitionPath);
-
-String fullPartitionPathStr = fullPartitionPath.toString();
-
-if (!fullPartitionPathStr.startsWith(basePath.toString())) {
-  throw new IllegalArgumentException("Partition path \"" + 
fullPartitionPathStr
-  + "\" does not belong to base-path \"" + basePath + "\"");
-}
-
-int partitionStartIndex = fullPartitionPathStr.indexOf(basePath.getName(),
-basePath.getParent() == null ? 0 : 
basePath.getParent().toString().length());
-// Partition-Path could be empty for non-partitioned tables
-return partitionStartIndex + basePath.getName().length() == 
fullPartitionPathStr.length() ? ""
-: fullPartitionPathStr.substring(partitionStartIndex + 
basePath.getName().length() + 1);
+return getRelativePartitionPath(new StoragePath(basePath.toUri()), new 
StoragePath(fullPartitionPath.toUri()));

Review Comment:
   thats in hadoopfsutils so we can't



##
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java:
##
@@ -635,16 +603,7 @@ public static Long getSizeInMB(long sizeInBytes) {
   }
 
   public static Path constructAbsolutePathInHadoopPath(String basePath, String 
relativePartitionPath) {
-if (StringUtils.isNullOrEmpty(relativePartitionPath)) {
-  return new Path(basePath);
-}
-
-// NOTE: We have to chop leading "/" to make sure Hadoop does not treat it 
like
-//   absolute path
-String properPartitionPath = 
relativePartitionPath.startsWith(PATH_SEPARATOR)
-? relativePartitionPath.substring(1)
-: relativePartitionPath;
-return constructAbsolutePath(new CachingPath(basePath), 
properPartitionPath);
+return new Path(constructAbsolutePath(basePath, 
relativePartitionPath).toUri());

Review Comment:
   thats in hadoopfsutils so we can't



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-09 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1595849923


##
hudi-common/src/main/java/org/apache/hudi/common/config/PropertiesConfig.java:
##
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.common.config;
+
+/**
+ * Used for loading filesystem specific configs
+ */
+public abstract class PropertiesConfig {
+

Review Comment:
   nit: remove empty line



##
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java:
##
@@ -209,21 +193,7 @@ public static List 
getAllPartitionFoldersThreeLevelsDown(HoodieStorage s
* Given a base partition and a partition path, return relative path of 
partition path to the base path.
*/
   public static String getRelativePartitionPath(Path basePath, Path 
fullPartitionPath) {
-basePath = CachingPath.getPathWithoutSchemeAndAuthority(basePath);
-fullPartitionPath = 
CachingPath.getPathWithoutSchemeAndAuthority(fullPartitionPath);
-
-String fullPartitionPathStr = fullPartitionPath.toString();
-
-if (!fullPartitionPathStr.startsWith(basePath.toString())) {
-  throw new IllegalArgumentException("Partition path \"" + 
fullPartitionPathStr
-  + "\" does not belong to base-path \"" + basePath + "\"");
-}
-
-int partitionStartIndex = fullPartitionPathStr.indexOf(basePath.getName(),
-basePath.getParent() == null ? 0 : 
basePath.getParent().toString().length());
-// Partition-Path could be empty for non-partitioned tables
-return partitionStartIndex + basePath.getName().length() == 
fullPartitionPathStr.length() ? ""
-: fullPartitionPathStr.substring(partitionStartIndex + 
basePath.getName().length() + 1);
+return getRelativePartitionPath(new StoragePath(basePath.toUri()), new 
StoragePath(fullPartitionPath.toUri()));

Review Comment:
   nit: use `convertToStoragePath(Path)`



##
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java:
##
@@ -635,16 +603,7 @@ public static Long getSizeInMB(long sizeInBytes) {
   }
 
   public static Path constructAbsolutePathInHadoopPath(String basePath, String 
relativePartitionPath) {
-if (StringUtils.isNullOrEmpty(relativePartitionPath)) {
-  return new Path(basePath);
-}
-
-// NOTE: We have to chop leading "/" to make sure Hadoop does not treat it 
like
-//   absolute path
-String properPartitionPath = 
relativePartitionPath.startsWith(PATH_SEPARATOR)
-? relativePartitionPath.substring(1)
-: relativePartitionPath;
-return constructAbsolutePath(new CachingPath(basePath), 
properPartitionPath);
+return new Path(constructAbsolutePath(basePath, 
relativePartitionPath).toUri());

Review Comment:
   nit: use `convertToPath(StoragePath)`



##
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroFileWriterFactory.java:
##
@@ -47,6 +48,12 @@
 import static org.apache.hudi.io.storage.HoodieHFileConfig.PREFETCH_ON_OPEN;
 
 public class HoodieAvroFileWriterFactory extends HoodieFileWriterFactory {
+

Review Comment:
   nit: remove empty line



##
hudi-common/src/main/java/org/apache/hudi/common/util/BaseFileUtils.java:
##
@@ -56,7 +56,7 @@ public static BaseFileUtils getInstance(String path) {
 if (path.endsWith(HoodieFileFormat.PARQUET.getFileExtension())) {
   return new ParquetUtils();
 } else if (path.endsWith(HoodieFileFormat.ORC.getFileExtension())) {
-  return new OrcUtils();
+  return ReflectionUtils.loadClass("org.apache.hudi.common.util.OrcUtils");

Review Comment:
   nit: final static variable for `org.apache.hudi.common.util.OrcUtils`.



##
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestUtils.java:
##
@@ -209,13 +217,14 @@ public static HoodieTableMetaClient 
createMetaClient(StorageConfiguration sto
   }
 
   /**
-   * @param conf file system configuration.
-   * @param basePath base path of the Hudi table.
+   * @param conf storage configuration.
+   * @param basePathbase path of the Hudi table.
* @return a new {@link HoodieTableMetaClient} instance.
*/
   public static HoodieTableMetaC

Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-09 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1595827566


##
hudi-common/src/main/java/org/apache/hudi/metrics/Metrics.java:
##
@@ -99,8 +97,7 @@ public static synchronized void shutdownAllMetrics() {
   private List 
addAdditionalMetricsExporters(HoodieMetricsConfig metricConfig) {
 List reporterList = new ArrayList<>();
 List propPathList = 
StringUtils.split(metricConfig.getMetricReporterFileBasedConfigs(), ",");
-try (HoodieStorage storage = HoodieStorageUtils.getStorage(
-propPathList.get(0), HadoopFSUtils.getStorageConf(new 
Configuration( {
+try (HoodieStorage storage = 
HoodieStorageUtils.getStorage(propPathList.get(0))) {

Review Comment:
   The empty `new Configuration()` should not be used here.  We should track 
down the existing code since it may cause issues of accessing the files.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-09 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1595818567


##
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java:
##
@@ -96,20 +90,6 @@ public class FSUtils {
 
   private static final StoragePathFilter ALLOW_ALL_FILTER = file -> true;
 
-  public static Configuration buildInlineConf(Configuration conf) {

Review Comment:
   Discussed offline.  This change is coupled with the PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-09 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1595815715


##
hudi-client/hudi-java-client/pom.xml:
##
@@ -42,6 +42,11 @@
 hudi-client-common
 ${project.parent.version}
 
+
+org.apache.hudi
+hudi-hadoop-common
+${project.version}
+

Review Comment:
   Sg.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-09 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1595814771


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java:
##
@@ -674,14 +674,15 @@ private static HoodieLogBlock getBlock(HoodieWriteConfig 
writeConfig,
  List records,
  boolean shouldWriteRecordPositions,
  Map 
header,
- String keyField) {
+ String keyField,
+ HoodieStorage storage) {
 switch (logDataBlockFormat) {
   case AVRO_DATA_BLOCK:
 return new HoodieAvroDataBlock(records, shouldWriteRecordPositions, 
header, keyField);
   case HFILE_DATA_BLOCK:
 return new HoodieHFileDataBlock(
 records, header, writeConfig.getHFileCompressionAlgorithm(), new 
StoragePath(writeConfig.getBasePath()),
-
writeConfig.getBooleanOrDefault(HoodieReaderConfig.USE_NATIVE_HFILE_READER));
+
writeConfig.getBooleanOrDefault(HoodieReaderConfig.USE_NATIVE_HFILE_READER), 
storage);

Review Comment:
   We can change the logic in an independent PR.  It's usually good to avoid 
such argument passing changes if not necessary, so there's less risk of 
breaking the logic.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-09 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1595813106


##
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/HoodieClientTestUtils.java:
##
@@ -322,7 +322,7 @@ public static HoodieTableMetaClient 
createMetaClient(JavaSparkContext jsc, Strin
* @return a new {@link HoodieTableMetaClient} instance.
*/
   public static HoodieTableMetaClient createMetaClient(SparkSession spark, 
String basePath) {
-return 
HoodieTestUtils.createMetaClient(spark.sessionState().newHadoopConf(), 
basePath);
+return 
HoodieTestUtils.createMetaClient(HoodieStorageUtils.getStorageConf(spark.sessionState().newHadoopConf()),
 basePath);

Review Comment:
   Synced offline on this and we resolved the confusion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-09 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1595811789


##
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/HoodieClientTestUtils.java:
##
@@ -313,7 +313,7 @@ public static Option 
getCommitMetadataForLatestInstant(Hoo
* @return a new {@link HoodieTableMetaClient} instance.
*/
   public static HoodieTableMetaClient createMetaClient(JavaSparkContext jsc, 
String basePath) {
-return HoodieTestUtils.createMetaClient(jsc.hadoopConfiguration(), 
basePath);
+return 
HoodieTestUtils.createMetaClient(HoodieStorageUtils.getStorageConf(jsc.hadoopConfiguration()),
 basePath);

Review Comment:
   I was thinking about reusing utils.  Did not notice this method is within 
`HoodieClientTestUtils`.  I see other places are fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-09 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1595806678


##
hudi-hadoop-common/src/main/java/org/apache/hudi/hadoop/fs/HadoopFSUtils.java:
##
@@ -109,25 +105,6 @@ public static FileSystem getFs(String pathStr, 
Configuration conf, boolean local
 return getFs(pathStr, conf);
   }
 
-  public static HoodieStorage getStorageWithWrapperFS(StoragePath path,

Review Comment:
   Looks like we have to add a new constructor due to the usage of reflection.  
We'll simplify this later.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-09 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2102997699

   
   ## CI report:
   
   * 94063d1bd8234c38e9ffb1befaf8760ae664621f UNKNOWN
   * bf97332bd42d5005f6299c94b44aba122c4919b2 UNKNOWN
   * 1219237f59281728a3c6c0286a131264d68121ed Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23787)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-09 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2102800416

   
   ## CI report:
   
   * 94063d1bd8234c38e9ffb1befaf8760ae664621f UNKNOWN
   * bf97332bd42d5005f6299c94b44aba122c4919b2 UNKNOWN
   * 532c386fec8fdc1ba587da04b3b50092440e4097 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23766)
 
   * 1219237f59281728a3c6c0286a131264d68121ed Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23787)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-09 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2102785956

   
   ## CI report:
   
   * 94063d1bd8234c38e9ffb1befaf8760ae664621f UNKNOWN
   * bf97332bd42d5005f6299c94b44aba122c4919b2 UNKNOWN
   * 532c386fec8fdc1ba587da04b3b50092440e4097 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23766)
 
   * 1219237f59281728a3c6c0286a131264d68121ed UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2101814289

   
   ## CI report:
   
   * 94063d1bd8234c38e9ffb1befaf8760ae664621f UNKNOWN
   * bf97332bd42d5005f6299c94b44aba122c4919b2 UNKNOWN
   * 532c386fec8fdc1ba587da04b3b50092440e4097 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23766)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2101774189

   
   ## CI report:
   
   * 94063d1bd8234c38e9ffb1befaf8760ae664621f UNKNOWN
   * bf97332bd42d5005f6299c94b44aba122c4919b2 UNKNOWN
   * a24b9db69ad0993d5ff649e222525238274481a5 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23769)
 
   * 532c386fec8fdc1ba587da04b3b50092440e4097 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2101721636

   
   ## CI report:
   
   * 94063d1bd8234c38e9ffb1befaf8760ae664621f UNKNOWN
   * bf97332bd42d5005f6299c94b44aba122c4919b2 UNKNOWN
   * ea50694f9547bff4b01b9b39a6cb400f411b0168 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23768)
 
   * a24b9db69ad0993d5ff649e222525238274481a5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23769)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2101715271

   
   ## CI report:
   
   * 94063d1bd8234c38e9ffb1befaf8760ae664621f UNKNOWN
   * bf97332bd42d5005f6299c94b44aba122c4919b2 UNKNOWN
   * ea50694f9547bff4b01b9b39a6cb400f411b0168 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23768)
 
   * a24b9db69ad0993d5ff649e222525238274481a5 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2101708907

   
   ## CI report:
   
   * 94063d1bd8234c38e9ffb1befaf8760ae664621f UNKNOWN
   * bf97332bd42d5005f6299c94b44aba122c4919b2 UNKNOWN
   * ea50694f9547bff4b01b9b39a6cb400f411b0168 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23768)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2101668214

   
   ## CI report:
   
   * 94063d1bd8234c38e9ffb1befaf8760ae664621f UNKNOWN
   * 3d33de31b2a4b0a4f252a0e96cef728af83ddc16 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23763)
 
   * bf97332bd42d5005f6299c94b44aba122c4919b2 UNKNOWN
   * ea50694f9547bff4b01b9b39a6cb400f411b0168 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2101662372

   
   ## CI report:
   
   * 94063d1bd8234c38e9ffb1befaf8760ae664621f UNKNOWN
   * 3d33de31b2a4b0a4f252a0e96cef728af83ddc16 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23763)
 
   * bf97332bd42d5005f6299c94b44aba122c4919b2 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594778367


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedParquetFileFormat.scala:
##
@@ -114,15 +112,15 @@ class 
HoodieFileGroupReaderBasedParquetFileFormat(tableState: HoodieTableState,
 val requiredSchemaSplits = requiredSchemaWithMandatory.fields.partition(f 
=> HoodieRecord.HOODIE_META_COLUMNS_WITH_OPERATION.contains(f.name))
 val requiredMeta = StructType(requiredSchemaSplits._1)
 val requiredWithoutMeta = StructType(requiredSchemaSplits._2)
-val augmentedHadoopConf = FSUtils.buildInlineConf(hadoopConf)
+val confCopy = new Configuration(hadoopConf)

Review Comment:
   No, but I can do that in a different pr



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594772203


##
hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/TestHoodieTableConfig.java:
##
@@ -63,7 +61,7 @@ public class TestHoodieTableConfig extends 
HoodieCommonTestHarness {
   @BeforeEach
   public void setUp() throws Exception {
 initPath();
-storage = HoodieStorageUtils.getStorage(basePath, 
HadoopFSUtils.getStorageConf(new Configuration()));
+storage = HoodieStorageUtils.getStorage(basePath);

Review Comment:
   https://issues.apache.org/jira/browse/HUDI-7731



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594771064


##
hudi-hadoop-common/src/main/java/org/apache/hudi/storage/hadoop/HadoopStorageConfiguration.java:
##
@@ -38,7 +38,11 @@ public class HadoopStorageConfiguration extends 
StorageConfigurationhttps://issues.apache.org/jira/browse/HUDI-7732
   https://issues.apache.org/jira/browse/HUDI-7733



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594767682


##
hudi-common/src/main/java/org/apache/hudi/metrics/Metrics.java:
##
@@ -99,8 +97,7 @@ public static synchronized void shutdownAllMetrics() {
   private List 
addAdditionalMetricsExporters(HoodieMetricsConfig metricConfig) {
 List reporterList = new ArrayList<>();
 List propPathList = 
StringUtils.split(metricConfig.getMetricReporterFileBasedConfigs(), ",");
-try (HoodieStorage storage = HoodieStorageUtils.getStorage(
-propPathList.get(0), HadoopFSUtils.getStorageConf(new 
Configuration( {
+try (HoodieStorage storage = 
HoodieStorageUtils.getStorage(propPathList.get(0))) {

Review Comment:
   https://issues.apache.org/jira/browse/HUDI-7731



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594754496


##
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestMORDataSourceStorage.scala:
##
@@ -25,10 +25,10 @@ import 
org.apache.hudi.common.testutils.{HoodieTestDataGenerator, HoodieTestUtil
 import org.apache.hudi.common.util.StringUtils
 import org.apache.hudi.config.HoodieWriteConfig
 import org.apache.hudi.hadoop.fs.HadoopFSUtils
+import org.apache.hudi.storage.hadoop.HoodieHadoopStorage

Review Comment:
   Is this import needed?



##
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderOnSpark.scala:
##
@@ -77,7 +74,7 @@ class TestHoodieFileGroupReaderOnSpark extends 
TestHoodieFileGroupReaderBase[Int
   }
 
   override def getStorageConf: StorageConfiguration[_] = {
-FSUtils.buildInlineConf(HoodieTestUtils.getDefaultStorageConf)
+HoodieTestUtils.getDefaultStorageConf

Review Comment:
   Similar here, is InlineFS configuration no longer needed?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594753453


##
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieNativeAvroHFileReader.java:
##
@@ -265,7 +263,7 @@ private HFileReader newHFileReader() throws IOException {
 if (path.isPresent()) {
   HoodieStorage storage = HoodieStorageUtils.getStorage(path.get(), conf);
   fileSize = storage.getPathInfo(path.get()).getLength();
-  inputStream = new HadoopSeekableDataInputStream((FSDataInputStream) 
storage.open(path.get()));
+  inputStream = storage.openSeekable(path.get());

Review Comment:
   https://issues.apache.org/jira/browse/HUDI-7730



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594752319


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedParquetFileFormat.scala:
##
@@ -28,12 +28,11 @@ import 
org.apache.hudi.common.table.read.HoodieFileGroupReader
 import org.apache.hudi.common.table.{HoodieTableConfig, HoodieTableMetaClient}
 import org.apache.hudi.common.util.FileIOUtils
 import org.apache.hudi.common.util.collection.ExternalSpillableMap.DiskMapType
-import org.apache.hudi.hadoop.fs.HadoopFSUtils
 import org.apache.hudi.storage.StorageConfiguration
 import org.apache.hudi.{AvroConversionUtils, HoodieFileIndex, 
HoodiePartitionCDCFileGroupMapping, HoodiePartitionFileSliceMapping, 
HoodieSparkUtils, HoodieTableSchema, HoodieTableState, SparkAdapterSupport, 
SparkFileFormatInternalRowReaderContext}
-
 import org.apache.hadoop.conf.Configuration
 import org.apache.hadoop.fs.Path
+import org.apache.hudi.hadoop.fs.HadoopFSUtils

Review Comment:
   This import should belong to the previous group.



##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedParquetFileFormat.scala:
##
@@ -114,15 +112,15 @@ class 
HoodieFileGroupReaderBasedParquetFileFormat(tableState: HoodieTableState,
 val requiredSchemaSplits = requiredSchemaWithMandatory.fields.partition(f 
=> HoodieRecord.HOODIE_META_COLUMNS_WITH_OPERATION.contains(f.name))
 val requiredMeta = StructType(requiredSchemaSplits._1)
 val requiredWithoutMeta = StructType(requiredSchemaSplits._2)
-val augmentedHadoopConf = FSUtils.buildInlineConf(hadoopConf)
+val confCopy = new Configuration(hadoopConf)

Review Comment:
   Should this contain InlineFS configuration?



##
hudi-io/src/main/java/org/apache/hudi/storage/HoodieStorage.java:
##
@@ -278,6 +278,19 @@ public abstract boolean rename(StoragePath oldPath,
   @PublicAPIMethod(maturity = ApiMaturityLevel.EVOLVING)
   public abstract Object unwrapConf();
 
+  @PublicAPIMethod(maturity = ApiMaturityLevel.EVOLVING)
+  public abstract StorageConfiguration 
buildInlineConf(StorageConfiguration storageConf);
+
+  @PublicAPIMethod(maturity = ApiMaturityLevel.EVOLVING)
+  public final StoragePath getInlineFilePath(StoragePath outerPath, long 
inLineStartOffset, long inLineLength) {
+return getInlineFilePath(outerPath, getScheme(), inLineStartOffset, 
inLineLength);
+  }
+
+  @PublicAPIMethod(maturity = ApiMaturityLevel.EVOLVING)
+  public abstract StoragePath getInlineFilePath(StoragePath outerPath, String 
origScheme, long inLineStartOffset, long inLineLength);

Review Comment:
   As discussed, the InLine path and config logic should be the same for 
different `HoodieStorage` implementation.



##
hudi-io/src/main/java/org/apache/hudi/storage/inline/InLineFSUtils.java:
##
@@ -59,42 +53,13 @@ public static StoragePath getInlineFilePath(StoragePath 
outerPath,
   long inLineLength) {
 final String subPath = new 
File(outerPath.toString().substring(outerPath.toString().indexOf(":") + 
1)).getPath();
 return new StoragePath(
-InLineFileSystem.SCHEME + SCHEME_SEPARATOR
+SCHEME + SCHEME_SEPARATOR
 + StoragePath.SEPARATOR + subPath + StoragePath.SEPARATOR + 
origScheme
 + StoragePath.SEPARATOR + "?" + START_OFFSET_STR + EQUALS_STR + 
inLineStartOffset
 + "&" + LENGTH_STR + EQUALS_STR + inLineLength
 );
   }
 
-  /**
-   * InlineFS Path format:
-   * 
"inlinefs://path/to/outer/file/outer_file_scheme/?start_offset=start_offset>&length="
-   * 
-   * Outer File Path format:
-   * "outer_file_scheme://path/to/outer/file"
-   * 
-   * Example
-   * Input: "inlinefs://file1/s3a/?start_offset=20&length=40".
-   * Output: "s3a://file1"
-   *
-   * @param inlineFSPath InLineFS Path to get the outer file Path
-   * @return Outer file Path from the InLineFS Path
-   */
-  public static Path getOuterFilePathFromInlinePath(Path inlineFSPath) {

Review Comment:
   Maybe we can get rid of these utils taking Hadoop `Path` for InlineFS and 
reuse the utils with `StoragePath`?



##
hudi-io/src/main/java/org/apache/hudi/storage/StoragePath.java:
##
@@ -235,6 +236,13 @@ public StoragePath makeQualified(URI defaultUri) {
 return new StoragePath(newUri);
   }
 
+  @PublicAPIMethod(maturity = ApiMaturityLevel.EVOLVING)
+  public String getFileExtension() {
+String fileName = new File(getName()).getName();

Review Comment:
   `String filename = getName()` should be sufficient? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please con

Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594751287


##
hudi-common/src/main/java/org/apache/hudi/common/util/ParquetUtils.java:
##
@@ -100,7 +100,8 @@ public static ParquetMetadata 
readMetadata(StorageConfiguration conf, Storage
 ParquetMetadata footer;
 try {
   // TODO(vc): Should we use the parallel reading version here?
-  footer = 
ParquetFileReader.readFooter(HadoopFSUtils.getFs(parquetFileHadoopPath.toString(),
 conf).getConf(), parquetFileHadoopPath);
+  footer = ParquetFileReader.readFooter(HoodieStorageUtils.getStorage(

Review Comment:
   https://issues.apache.org/jira/browse/HUDI-7729



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594597401


##
hudi-hadoop-common/src/test/java/org/apache/hudi/common/testutils/FileSystemTestUtils.java:
##
@@ -79,34 +68,4 @@ public static void deleteFile(File fileToDelete) throws 
IOException {
   throw new IOException(message);
 }
   }
-
-  public static List listRecursive(FileSystem fs, Path path) 
throws IOException {
-return listFiles(fs, path, true);
-  }
-
-  public static List listFiles(FileSystem fs, Path path, boolean 
recursive) throws IOException {
-RemoteIterator itr = fs.listFiles(path, recursive);
-List statuses = new ArrayList<>();
-while (itr.hasNext()) {
-  statuses.add(itr.next());
-}
-return statuses;
-  }
-
-  public static List listRecursive(HoodieStorage storage, 
StoragePath path)
-  throws IOException {
-return listFiles(storage, path);
-  }
-
-  public static List listFiles(HoodieStorage storage, 
StoragePath path)
-  throws IOException {
-return storage.listFiles(path);
-  }
-
-  public static String readLastLineFromResourceFile(String resourceName) 
throws IOException {
-try (InputStream inputStream = 
TestLogReaderUtils.class.getResourceAsStream(resourceName)) {
-  List lines = FileIOUtils.readAsUTFStringLines(inputStream);
-  return lines.get(lines.size() - 1);
-}
-  }

Review Comment:
   add info to pr description



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594592122


##
hudi-common/src/main/java/org/apache/hudi/storage/HoodieStorageUtils.java:
##
@@ -34,22 +32,35 @@ public static HoodieStorage 
getStorage(StorageConfiguration conf) {
   }
 
   public static HoodieStorage getStorage(FileSystem fs) {
-return new HoodieHadoopStorage(fs);
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {FileSystem.class}, fs);
+  }
+
+  public static HoodieStorage getStorage(String basePath) {
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {String.class}, basePath);
   }
 
   public static HoodieStorage getStorage(String basePath, 
StorageConfiguration conf) {
-return getStorage(HadoopFSUtils.getFs(basePath, conf));
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {String.class, StorageConfiguration.class}, basePath, conf);
+  }
+
+  public static HoodieStorage getStorage(String basePath, Configuration conf) {
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {String.class, Configuration.class}, basePath, conf);
   }
 
   public static HoodieStorage getStorage(StoragePath path, 
StorageConfiguration conf) {
-return getStorage(HadoopFSUtils.getFs(path, 
conf.unwrapAs(Configuration.class)));
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {StoragePath.class, StorageConfiguration.class}, path, conf);
   }
 
   public static HoodieStorage getRawStorage(HoodieStorage storage) {
-FileSystem fs = (FileSystem) storage.getFileSystem();
-if (fs instanceof HoodieWrapperFileSystem) {
-  return getStorage(((HoodieWrapperFileSystem) fs).getFileSystem());
-}
-return storage;
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {HoodieStorage.class}, storage);
+  }
+
+  public static StorageConfiguration getNewStorageConf() {

Review Comment:
   discussed that all the prod code should not be using a newly created config, 
and if they do, it is an implementation issue.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594591453


##
hudi-hadoop-common/src/test/java/org/apache/hudi/common/functional/TestHoodieLogFormat.java:
##
@@ -459,7 +457,7 @@ public void testHugeLogFileWrite() throws IOException, 
URISyntaxException, Inter
 header.put(HoodieLogBlock.HeaderMetadataType.SCHEMA, 
getSimpleSchema().toString());
 byte[] dataBlockContentBytes = getDataBlock(DEFAULT_DATA_BLOCK_TYPE, 
records, header).getContentBytes();
 HoodieLogBlock.HoodieLogBlockContentLocation logBlockContentLoc = new 
HoodieLogBlock.HoodieLogBlockContentLocation(
-HadoopFSUtils.getStorageConf(new Configuration()), null, 0, 
dataBlockContentBytes.length, 0);
+HoodieStorageUtils.getNewStorageConf(), null, 0, 
dataBlockContentBytes.length, 0);

Review Comment:
   put this into a test util method because we don't want prod code to use 
empty config



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594589135


##
hudi-hadoop-common/src/main/java/org/apache/hudi/storage/hadoop/HoodieHadoopStorage.java:
##
@@ -213,6 +267,16 @@ public Configuration unwrapConf() {
 return fs.getConf();
   }
 
+  @Override
+  public StorageConfiguration buildInlineConf(StorageConfiguration 
storageConf) {
+return HadoopInLineFSUtils.buildInlineConf(storageConf);
+  }
+
+  @Override
+  public StoragePath getInlineFilePath(StoragePath outerPath, String 
origScheme, long inLineStartOffset, long inLineLength) {
+return HadoopInLineFSUtils.getInlineFilePath(outerPath, origScheme, 
inLineStartOffset, inLineLength);
+  }

Review Comment:
   move this to hoodie-common



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594587944


##
hudi-hadoop-common/src/main/java/org/apache/hudi/storage/hadoop/HadoopStorageConfiguration.java:
##
@@ -38,7 +38,11 @@ public class HadoopStorageConfiguration extends 
StorageConfiguration

Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594586377


##
hudi-hadoop-common/src/main/java/org/apache/hudi/hadoop/fs/HadoopFSUtils.java:
##
@@ -109,25 +105,6 @@ public static FileSystem getFs(String pathStr, 
Configuration conf, boolean local
 return getFs(pathStr, conf);
   }
 
-  public static HoodieStorage getStorageWithWrapperFS(StoragePath path,

Review Comment:
   @yihua to respond later



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2101365987

   
   ## CI report:
   
   * 94063d1bd8234c38e9ffb1befaf8760ae664621f UNKNOWN
   * 3d33de31b2a4b0a4f252a0e96cef728af83ddc16 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23763)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594581863


##
hudi-common/src/main/java/org/apache/hudi/metrics/Metrics.java:
##
@@ -99,8 +97,7 @@ public static synchronized void shutdownAllMetrics() {
   private List 
addAdditionalMetricsExporters(HoodieMetricsConfig metricConfig) {
 List reporterList = new ArrayList<>();
 List propPathList = 
StringUtils.split(metricConfig.getMetricReporterFileBasedConfigs(), ",");
-try (HoodieStorage storage = HoodieStorageUtils.getStorage(
-propPathList.get(0), HadoopFSUtils.getStorageConf(new 
Configuration( {
+try (HoodieStorage storage = 
HoodieStorageUtils.getStorage(propPathList.get(0))) {

Review Comment:
   don't make it use hoodiestorageutils becuase we don't want to expose an 
option to not pass a config. Create followup ticket



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594577738


##
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieNativeAvroHFileReader.java:
##
@@ -265,7 +263,7 @@ private HFileReader newHFileReader() throws IOException {
 if (path.isPresent()) {
   HoodieStorage storage = HoodieStorageUtils.getStorage(path.get(), conf);
   fileSize = storage.getPathInfo(path.get()).getLength();
-  inputStream = new HadoopSeekableDataInputStream((FSDataInputStream) 
storage.open(path.get()));
+  inputStream = storage.openSeekable(path.get());

Review Comment:
   create issue ticket to make a more permanent solution 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594570213


##
hudi-common/src/main/java/org/apache/hudi/common/util/ParquetUtils.java:
##
@@ -100,7 +100,8 @@ public static ParquetMetadata 
readMetadata(StorageConfiguration conf, Storage
 ParquetMetadata footer;
 try {
   // TODO(vc): Should we use the parallel reading version here?
-  footer = 
ParquetFileReader.readFooter(HadoopFSUtils.getFs(parquetFileHadoopPath.toString(),
 conf).getConf(), parquetFileHadoopPath);
+  footer = ParquetFileReader.readFooter(HoodieStorageUtils.getStorage(

Review Comment:
   in followup pr, make this method abstract in the base and throw not 
implemented for orc utils



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594560411


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java:
##
@@ -674,14 +674,15 @@ private static HoodieLogBlock getBlock(HoodieWriteConfig 
writeConfig,
  List records,
  boolean shouldWriteRecordPositions,
  Map 
header,
- String keyField) {
+ String keyField,
+ HoodieStorage storage) {
 switch (logDataBlockFormat) {
   case AVRO_DATA_BLOCK:
 return new HoodieAvroDataBlock(records, shouldWriteRecordPositions, 
header, keyField);
   case HFILE_DATA_BLOCK:
 return new HoodieHFileDataBlock(
 records, header, writeConfig.getHFileCompressionAlgorithm(), new 
StoragePath(writeConfig.getBasePath()),
-
writeConfig.getBooleanOrDefault(HoodieReaderConfig.USE_NATIVE_HFILE_READER));
+
writeConfig.getBooleanOrDefault(HoodieReaderConfig.USE_NATIVE_HFILE_READER), 
storage);

Review Comment:
   make inline stuff part of the storageconfig instead of storage



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2101275676

   
   ## CI report:
   
   * 2bde7d57048e342132d0b70b524317e1bcbdb96b Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23742)
 
   * 94063d1bd8234c38e9ffb1befaf8760ae664621f UNKNOWN
   * 3d33de31b2a4b0a4f252a0e96cef728af83ddc16 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23763)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2101206487

   
   ## CI report:
   
   * 2bde7d57048e342132d0b70b524317e1bcbdb96b Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23742)
 
   * 94063d1bd8234c38e9ffb1befaf8760ae664621f UNKNOWN
   * 3d33de31b2a4b0a4f252a0e96cef728af83ddc16 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2101189533

   
   ## CI report:
   
   * 2bde7d57048e342132d0b70b524317e1bcbdb96b Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23742)
 
   * 94063d1bd8234c38e9ffb1befaf8760ae664621f UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594410585


##
hudi-hadoop-mr/pom.xml:
##
@@ -108,6 +114,22 @@
   test-jar
   test
 
+
+  org.apache.hudi
+  hudi-hadoop-common
+  ${project.version}
+  tests
+  test-jar

Review Comment:
   IDK, I just copied from what the other poms did



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594408581


##
hudi-hadoop-common/src/test/java/org/apache/hudi/common/testutils/FileSystemTestUtils.java:
##
@@ -79,34 +68,4 @@ public static void deleteFile(File fileToDelete) throws 
IOException {
   throw new IOException(message);
 }
   }
-
-  public static List listRecursive(FileSystem fs, Path path) 
throws IOException {
-return listFiles(fs, path, true);
-  }
-
-  public static List listFiles(FileSystem fs, Path path, boolean 
recursive) throws IOException {
-RemoteIterator itr = fs.listFiles(path, recursive);
-List statuses = new ArrayList<>();
-while (itr.hasNext()) {
-  statuses.add(itr.next());
-}
-return statuses;
-  }
-
-  public static List listRecursive(HoodieStorage storage, 
StoragePath path)
-  throws IOException {
-return listFiles(storage, path);
-  }
-
-  public static List listFiles(HoodieStorage storage, 
StoragePath path)
-  throws IOException {
-return storage.listFiles(path);
-  }
-
-  public static String readLastLineFromResourceFile(String resourceName) 
throws IOException {
-try (InputStream inputStream = 
TestLogReaderUtils.class.getResourceAsStream(resourceName)) {
-  List lines = FileIOUtils.readAsUTFStringLines(inputStream);
-  return lines.get(lines.size() - 1);
-}
-  }

Review Comment:
   ```
 public static String readLastLineFromResourceFile(String resourceName) 
throws IOException {
   try (InputStream inputStream = 
TestLogReaderUtils.class.getResourceAsStream(resourceName)) {
 List lines = FileIOUtils.readAsUTFStringLines(inputStream);
 return lines.get(lines.size() - 1);
   }
 }
   ```
   needed to be moved since it uses a class that's now in hudi-hadoop-common. 
Since I was moving that one, I just moved all the rest of the hadoop ones as 
well



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594399186


##
hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/timeline/TestWaitBasedTimeGenerator.java:
##
@@ -75,7 +75,7 @@ public boolean tryLock(long time, TimeUnit unit) {
   // Clock skew time
   private final long clockSkewTime = 20L;
 
-  private final StorageConfiguration storageConf = 
HadoopFSUtils.getStorageConf(new Configuration());
+  private final StorageConfiguration storageConf = 
HoodieStorageUtils.getNewStorageConf();

Review Comment:
   getDefaultStorageConf does
   ```
   new Configuration(false)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594398612


##
hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/TestHoodieTableConfig.java:
##
@@ -63,7 +61,7 @@ public class TestHoodieTableConfig extends 
HoodieCommonTestHarness {
   @BeforeEach
   public void setUp() throws Exception {
 initPath();
-storage = HoodieStorageUtils.getStorage(basePath, 
HadoopFSUtils.getStorageConf(new Configuration()));
+storage = HoodieStorageUtils.getStorage(basePath);

Review Comment:
   HoodieParquetDataBlock does. So does InProcessTimeGenerator.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594396489


##
hudi-common/src/main/java/org/apache/hudi/storage/HoodieStorageUtils.java:
##
@@ -34,22 +32,35 @@ public static HoodieStorage 
getStorage(StorageConfiguration conf) {
   }
 
   public static HoodieStorage getStorage(FileSystem fs) {
-return new HoodieHadoopStorage(fs);
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {FileSystem.class}, fs);
+  }
+
+  public static HoodieStorage getStorage(String basePath) {
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {String.class}, basePath);
   }
 
   public static HoodieStorage getStorage(String basePath, 
StorageConfiguration conf) {
-return getStorage(HadoopFSUtils.getFs(basePath, conf));
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {String.class, StorageConfiguration.class}, basePath, conf);
+  }
+
+  public static HoodieStorage getStorage(String basePath, Configuration conf) {
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {String.class, Configuration.class}, basePath, conf);
   }
 
   public static HoodieStorage getStorage(StoragePath path, 
StorageConfiguration conf) {
-return getStorage(HadoopFSUtils.getFs(path, 
conf.unwrapAs(Configuration.class)));
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {StoragePath.class, StorageConfiguration.class}, path, conf);
   }
 
   public static HoodieStorage getRawStorage(HoodieStorage storage) {
-FileSystem fs = (FileSystem) storage.getFileSystem();
-if (fs instanceof HoodieWrapperFileSystem) {
-  return getStorage(((HoodieWrapperFileSystem) fs).getFileSystem());
-}
-return storage;
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {HoodieStorage.class}, storage);
+  }
+
+  public static StorageConfiguration getNewStorageConf() {

Review Comment:
   Why?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594395638


##
hudi-hadoop-common/src/test/java/org/apache/hudi/common/functional/TestHoodieLogFormat.java:
##
@@ -459,7 +457,7 @@ public void testHugeLogFileWrite() throws IOException, 
URISyntaxException, Inter
 header.put(HoodieLogBlock.HeaderMetadataType.SCHEMA, 
getSimpleSchema().toString());
 byte[] dataBlockContentBytes = getDataBlock(DEFAULT_DATA_BLOCK_TYPE, 
records, header).getContentBytes();
 HoodieLogBlock.HoodieLogBlockContentLocation logBlockContentLoc = new 
HoodieLogBlock.HoodieLogBlockContentLocation(
-HadoopFSUtils.getStorageConf(new Configuration()), null, 0, 
dataBlockContentBytes.length, 0);
+HoodieStorageUtils.getNewStorageConf(), null, 0, 
dataBlockContentBytes.length, 0);

Review Comment:
   getDefaultStorageConf does
   ```
   new Configuration(false)
   ```



##
hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/TestTableSchemaResolver.java:
##
@@ -100,7 +98,7 @@ public void testReadSchemaFromLogFile() throws IOException, 
URISyntaxException,
 assertEquals(
 new AvroSchemaConverter().convert(expectedSchema),
 
TableSchemaResolver.readSchemaFromLogFile(HoodieStorageUtils.getStorage(
-logFilePath, HadoopFSUtils.getStorageConf(new Configuration())), 
logFilePath));
+logFilePath, HoodieStorageUtils.getNewStorageConf()), 
logFilePath));

Review Comment:
   getDefaultStorageConf does
   ```
   new Configuration(false)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594392368


##
hudi-hadoop-common/src/main/java/org/apache/hudi/storage/hadoop/HadoopStorageConfiguration.java:
##
@@ -38,7 +38,11 @@ public class HadoopStorageConfiguration extends 
StorageConfiguration

Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594382795


##
hudi-hadoop-common/src/main/java/org/apache/hudi/storage/hadoop/HoodieHadoopStorage.java:
##
@@ -43,16 +47,66 @@
 import java.util.List;
 import java.util.stream.Collectors;
 
+import static org.apache.hudi.common.util.ValidationUtils.checkArgument;
 import static org.apache.hudi.hadoop.fs.HadoopFSUtils.convertToHadoopPath;
 import static org.apache.hudi.hadoop.fs.HadoopFSUtils.convertToStoragePath;
 import static org.apache.hudi.hadoop.fs.HadoopFSUtils.convertToStoragePathInfo;
+import static org.apache.hudi.hadoop.fs.HadoopFSUtils.getFs;
 
 /**
  * Implementation of {@link HoodieStorage} using Hadoop's {@link FileSystem}
  */
 public class HoodieHadoopStorage extends HoodieStorage {
   private final FileSystem fs;
 
+  public HoodieHadoopStorage(HoodieStorage storage) {

Review Comment:
   Yeah, the reason for this is there are like 10 different HadoopFSUtils.getFs 
methods. I wanted to minimize the number of changes in this pr. We can follow 
up in another pr and reduce this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594377249


##
hudi-hadoop-common/src/main/java/org/apache/hudi/storage/hadoop/HadoopStorageConfiguration.java:
##
@@ -38,7 +38,11 @@ public class HadoopStorageConfiguration extends 
StorageConfiguration

Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594370366


##
hudi-hadoop-common/src/main/java/org/apache/hudi/hadoop/fs/HoodieHadoopUtils.java:
##
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.hadoop.fs;
+
+import org.apache.hudi.storage.StoragePath;
+
+import org.apache.hadoop.conf.Configuration;
+
+public class HoodieHadoopUtils {
+
+  public static Configuration registerFileSystem(StoragePath file, 
Configuration conf) {

Review Comment:
   good idea!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594367946


##
hudi-hadoop-common/src/main/java/org/apache/hudi/hadoop/fs/HadoopFSUtils.java:
##
@@ -109,25 +105,6 @@ public static FileSystem getFs(String pathStr, 
Configuration conf, boolean local
 return getFs(pathStr, conf);
   }
 
-  public static HoodieStorage getStorageWithWrapperFS(StoragePath path,

Review Comment:
   Yeah, since we need to use reflection, I moved it into a new constructor for 
HoodieHadoopStorage:
   ```
 public HoodieHadoopStorage(StoragePath path,
StorageConfiguration conf,
boolean enableRetry,
long maxRetryIntervalMs,
int maxRetryNumbers,
long initialRetryIntervalMs,
String retryExceptions,
ConsistencyGuard consistencyGuard) {
   FileSystem fileSystem = getFs(path, 
conf.unwrapCopyAs(Configuration.class));
   
   if (enableRetry) {
 fileSystem = new HoodieRetryWrapperFileSystem(fileSystem,
 maxRetryIntervalMs, maxRetryNumbers, initialRetryIntervalMs, 
retryExceptions);
   }
   checkArgument(!(fileSystem instanceof HoodieWrapperFileSystem),
   "File System not expected to be that of HoodieWrapperFileSystem");
   this.fs = new HoodieWrapperFileSystem(fileSystem, consistencyGuard);
 }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594361490


##
hudi-hadoop-common/src/main/java/org/apache/hudi/common/config/DFSPropertiesConfiguration.java:
##
@@ -202,6 +202,11 @@ public void addPropsFromStream(BufferedReader reader, 
StoragePath cfgFilePath) t
 }
   }
 
+  @Override
+  public TypedProperties getGlobalProperties() {

Review Comment:
   it was used a number of times. And we can't override static method



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594360124


##
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestUtils.java:
##
@@ -198,24 +199,25 @@ public static HoodieTableMetaClient init(String basePath, 
HoodieTableType tableT
   }
 
   /**
-   * @param storageConf storage configuration.
+   * @param conf storage configuration.
* @param basePathbase path of the Hudi table.
* @return a new {@link HoodieTableMetaClient} instance.
*/
-  public static HoodieTableMetaClient createMetaClient(StorageConfiguration 
storageConf,
+  public static HoodieTableMetaClient createMetaClient(Configuration conf,
String basePath) {
 return HoodieTableMetaClient.builder()
-.setConf(storageConf).setBasePath(basePath).build();
+
.setConf(HoodieStorageUtils.getStorageConf(conf)).setBasePath(basePath).build();

Review Comment:
   good catch!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594352910


##
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestDataGenerator.java:
##
@@ -578,8 +576,8 @@ private static void createMetadataFile(String f, String 
basePath, StorageConfigu
 basePath + "/" + HoodieTableMetaClient.METAFOLDER_NAME + "/" + f);
 OutputStream os = null;
 try {
-  FileSystem fs = HadoopFSUtils.getFs(basePath, configuration);
-  os = fs.create(commitFile, true);
+  HoodieStorage storage = HoodieStorageUtils.getStorage(basePath, 
configuration);

Review Comment:
   HadoopFSUtils is removed so it was needed here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594350164


##
hudi-common/src/main/java/org/apache/hudi/metrics/Metrics.java:
##
@@ -99,8 +97,7 @@ public static synchronized void shutdownAllMetrics() {
   private List 
addAdditionalMetricsExporters(HoodieMetricsConfig metricConfig) {
 List reporterList = new ArrayList<>();
 List propPathList = 
StringUtils.split(metricConfig.getMetricReporterFileBasedConfigs(), ",");
-try (HoodieStorage storage = HoodieStorageUtils.getStorage(
-propPathList.get(0), HadoopFSUtils.getStorageConf(new 
Configuration( {
+try (HoodieStorage storage = 
HoodieStorageUtils.getStorage(propPathList.get(0))) {

Review Comment:
   I don't see how this is any different



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594348426


##
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieNativeAvroHFileReader.java:
##
@@ -265,7 +263,7 @@ private HFileReader newHFileReader() throws IOException {
 if (path.isPresent()) {
   HoodieStorage storage = HoodieStorageUtils.getStorage(path.get(), conf);
   fileSize = storage.getPathInfo(path.get()).getLength();
-  inputStream = new HadoopSeekableDataInputStream((FSDataInputStream) 
storage.open(path.get()));
+  inputStream = storage.openSeekable(path.get());

Review Comment:
   Will discuss with you



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594338645


##
hudi-common/src/main/java/org/apache/hudi/common/util/ParquetUtils.java:
##
@@ -100,7 +100,8 @@ public static ParquetMetadata 
readMetadata(StorageConfiguration conf, Storage
 ParquetMetadata footer;
 try {
   // TODO(vc): Should we use the parallel reading version here?
-  footer = 
ParquetFileReader.readFooter(HadoopFSUtils.getFs(parquetFileHadoopPath.toString(),
 conf).getConf(), parquetFileHadoopPath);
+  footer = ParquetFileReader.readFooter(HoodieStorageUtils.getStorage(

Review Comment:
   PTAL at the usages of this. Seems to be used a decent amount in hudi-common



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594313548


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java:
##
@@ -674,14 +674,15 @@ private static HoodieLogBlock getBlock(HoodieWriteConfig 
writeConfig,
  List records,
  boolean shouldWriteRecordPositions,
  Map 
header,
- String keyField) {
+ String keyField,
+ HoodieStorage storage) {
 switch (logDataBlockFormat) {
   case AVRO_DATA_BLOCK:
 return new HoodieAvroDataBlock(records, shouldWriteRecordPositions, 
header, keyField);
   case HFILE_DATA_BLOCK:
 return new HoodieHFileDataBlock(
 records, header, writeConfig.getHFileCompressionAlgorithm(), new 
StoragePath(writeConfig.getBasePath()),
-
writeConfig.getBooleanOrDefault(HoodieReaderConfig.USE_NATIVE_HFILE_READER));
+
writeConfig.getBooleanOrDefault(HoodieReaderConfig.USE_NATIVE_HFILE_READER), 
storage);

Review Comment:
   I got rid of the change. Seems less efficient then just passing it though. 
We will see if it still passes ci.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594288001


##
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/HoodieClientTestUtils.java:
##
@@ -322,7 +322,7 @@ public static HoodieTableMetaClient 
createMetaClient(JavaSparkContext jsc, Strin
* @return a new {@link HoodieTableMetaClient} instance.
*/
   public static HoodieTableMetaClient createMetaClient(SparkSession spark, 
String basePath) {
-return 
HoodieTestUtils.createMetaClient(spark.sessionState().newHadoopConf(), 
basePath);
+return 
HoodieTestUtils.createMetaClient(HoodieStorageUtils.getStorageConf(spark.sessionState().newHadoopConf()),
 basePath);

Review Comment:
   Definitely can't have spark in HoodieTestUtils



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594285849


##
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/HoodieClientTestUtils.java:
##
@@ -313,7 +313,7 @@ public static Option 
getCommitMetadataForLatestInstant(Hoo
* @return a new {@link HoodieTableMetaClient} instance.
*/
   public static HoodieTableMetaClient createMetaClient(JavaSparkContext jsc, 
String basePath) {
-return HoodieTestUtils.createMetaClient(jsc.hadoopConfiguration(), 
basePath);
+return 
HoodieTestUtils.createMetaClient(HoodieStorageUtils.getStorageConf(jsc.hadoopConfiguration()),
 basePath);

Review Comment:
   Meta client is is hudi-common so it will take in storage conf. 
HoodieTestUtils is also in hudi-common. I don't see what the problem is with my 
change



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594281556


##
hudi-client/hudi-java-client/pom.xml:
##
@@ -42,6 +42,11 @@
 hudi-client-common
 ${project.parent.version}
 
+
+org.apache.hudi
+hudi-hadoop-common
+${project.version}
+

Review Comment:
   Same thing I said above. We can try removing it when the rest of the review 
is done. But don't want to mess around with this too much while we are making 
other changes



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1594272278


##
hudi-client/hudi-client-common/pom.xml:
##
@@ -43,6 +43,16 @@
   hudi-common
   ${project.version}
 
+
+  org.apache.hudi
+  hudi-io
+  ${project.version}
+

Review Comment:
   That's what I initially assumed, but I was getting some class not found 
issues  and this fixed it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1593573466


##
hudi-hadoop-common/src/test/java/org/apache/hudi/common/testutils/FileSystemTestUtils.java:
##
@@ -79,34 +68,4 @@ public static void deleteFile(File fileToDelete) throws 
IOException {
   throw new IOException(message);
 }
   }
-
-  public static List listRecursive(FileSystem fs, Path path) 
throws IOException {
-return listFiles(fs, path, true);
-  }
-
-  public static List listFiles(FileSystem fs, Path path, boolean 
recursive) throws IOException {
-RemoteIterator itr = fs.listFiles(path, recursive);
-List statuses = new ArrayList<>();
-while (itr.hasNext()) {
-  statuses.add(itr.next());
-}
-return statuses;
-  }
-
-  public static List listRecursive(HoodieStorage storage, 
StoragePath path)
-  throws IOException {
-return listFiles(storage, path);
-  }
-
-  public static List listFiles(HoodieStorage storage, 
StoragePath path)
-  throws IOException {
-return storage.listFiles(path);
-  }
-
-  public static String readLastLineFromResourceFile(String resourceName) 
throws IOException {
-try (InputStream inputStream = 
TestLogReaderUtils.class.getResourceAsStream(resourceName)) {
-  List lines = FileIOUtils.readAsUTFStringLines(inputStream);
-  return lines.get(lines.size() - 1);
-}
-  }

Review Comment:
   Are these only used by tests?  Should the methods stay here if there is no 
compelling reason to move them?



##
hudi-hadoop-mr/pom.xml:
##
@@ -108,6 +114,22 @@
   test-jar
   test
 
+
+  org.apache.hudi
+  hudi-hadoop-common
+  ${project.version}
+  tests
+  test-jar

Review Comment:
   Is `test-jar` type required?



##
hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/timeline/TestWaitBasedTimeGenerator.java:
##
@@ -75,7 +75,7 @@ public boolean tryLock(long time, TimeUnit unit) {
   // Clock skew time
   private final long clockSkewTime = 20L;
 
-  private final StorageConfiguration storageConf = 
HadoopFSUtils.getStorageConf(new Configuration());
+  private final StorageConfiguration storageConf = 
HoodieStorageUtils.getNewStorageConf();

Review Comment:
   use `getDefaultStorageConf()`



##
hudi-hadoop-common/src/test/resources/props/testdfs.properties:
##
@@ -0,0 +1,18 @@
+

Review Comment:
   nit: remove empty line



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1593562711


##
hudi-hadoop-common/src/test/java/org/apache/hudi/common/functional/TestHoodieLogFormat.java:
##
@@ -459,7 +457,7 @@ public void testHugeLogFileWrite() throws IOException, 
URISyntaxException, Inter
 header.put(HoodieLogBlock.HeaderMetadataType.SCHEMA, 
getSimpleSchema().toString());
 byte[] dataBlockContentBytes = getDataBlock(DEFAULT_DATA_BLOCK_TYPE, 
records, header).getContentBytes();
 HoodieLogBlock.HoodieLogBlockContentLocation logBlockContentLoc = new 
HoodieLogBlock.HoodieLogBlockContentLocation(
-HadoopFSUtils.getStorageConf(new Configuration()), null, 0, 
dataBlockContentBytes.length, 0);
+HoodieStorageUtils.getNewStorageConf(), null, 0, 
dataBlockContentBytes.length, 0);

Review Comment:
   use `getDefaultStorageConf`.



##
hudi-hadoop-common/src/main/java/org/apache/hudi/storage/hadoop/HoodieHadoopStorage.java:
##
@@ -43,16 +47,66 @@
 import java.util.List;
 import java.util.stream.Collectors;
 
+import static org.apache.hudi.common.util.ValidationUtils.checkArgument;
 import static org.apache.hudi.hadoop.fs.HadoopFSUtils.convertToHadoopPath;
 import static org.apache.hudi.hadoop.fs.HadoopFSUtils.convertToStoragePath;
 import static org.apache.hudi.hadoop.fs.HadoopFSUtils.convertToStoragePathInfo;
+import static org.apache.hudi.hadoop.fs.HadoopFSUtils.getFs;
 
 /**
  * Implementation of {@link HoodieStorage} using Hadoop's {@link FileSystem}
  */
 public class HoodieHadoopStorage extends HoodieStorage {
   private final FileSystem fs;
 
+  public HoodieHadoopStorage(HoodieStorage storage) {

Review Comment:
   I feel like there are too many constructors here.  Ideally, only a couple 
should suffice.  To load the implementation through config and reflection, we 
cannot support many constructors, ideally only one or two
   .



##
hudi-hadoop-common/src/main/java/org/apache/hudi/storage/hadoop/HoodieHadoopStorage.java:
##
@@ -213,6 +267,16 @@ public Configuration unwrapConf() {
 return fs.getConf();
   }
 
+  @Override
+  public StorageConfiguration buildInlineConf(StorageConfiguration 
storageConf) {
+return HadoopInLineFSUtils.buildInlineConf(storageConf);
+  }
+
+  @Override
+  public StoragePath getInlineFilePath(StoragePath outerPath, String 
origScheme, long inLineStartOffset, long inLineLength) {
+return HadoopInLineFSUtils.getInlineFilePath(outerPath, origScheme, 
inLineStartOffset, inLineLength);
+  }

Review Comment:
   I think this is not the logic that `HoodieStorage` implementation should 
care about.  Inline Path is invariant compared to different `HoodieStorage` 
implementations.



##
hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/TestHoodieTableConfig.java:
##
@@ -63,7 +61,7 @@ public class TestHoodieTableConfig extends 
HoodieCommonTestHarness {
   @BeforeEach
   public void setUp() throws Exception {
 initPath();
-storage = HoodieStorageUtils.getStorage(basePath, 
HadoopFSUtils.getStorageConf(new Configuration()));
+storage = HoodieStorageUtils.getStorage(basePath);

Review Comment:
   Let's create a util method in test utils only.  Production code should not 
use empty `Configuration`.



##
hudi-common/src/main/java/org/apache/hudi/storage/HoodieStorageUtils.java:
##
@@ -34,22 +32,35 @@ public static HoodieStorage 
getStorage(StorageConfiguration conf) {
   }
 
   public static HoodieStorage getStorage(FileSystem fs) {
-return new HoodieHadoopStorage(fs);
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {FileSystem.class}, fs);
+  }
+
+  public static HoodieStorage getStorage(String basePath) {
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {String.class}, basePath);
   }
 
   public static HoodieStorage getStorage(String basePath, 
StorageConfiguration conf) {
-return getStorage(HadoopFSUtils.getFs(basePath, conf));
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {String.class, StorageConfiguration.class}, basePath, conf);
+  }
+
+  public static HoodieStorage getStorage(String basePath, Configuration conf) {
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {String.class, Configuration.class}, basePath, conf);
   }
 
   public static HoodieStorage getStorage(StoragePath path, 
StorageConfiguration conf) {
-return getStorage(HadoopFSUtils.getFs(path, 
conf.unwrapAs(Configuration.class)));
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {StoragePath.class, StorageConfiguration.class}, path, conf);
   }
 
   public static HoodieStorage getRawStora

Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1593538784


##
hudi-hadoop-common/src/main/java/org/apache/hudi/common/config/DFSPropertiesConfiguration.java:
##
@@ -202,6 +202,11 @@ public void addPropsFromStream(BufferedReader reader, 
StoragePath cfgFilePath) t
 }
   }
 
+  @Override
+  public TypedProperties getGlobalProperties() {

Review Comment:
   Move the logic of `getGlobalProps()` directly into this new method?



##
hudi-hadoop-common/src/main/java/org/apache/hudi/hadoop/fs/HadoopFSUtils.java:
##
@@ -211,10 +188,12 @@ public static FSDataInputStream 
getFSDataInputStream(FileSystem fs,
   return new BoundedFsDataInputStream(fs, convertToHadoopPath(filePath), 
fsDataInputStream);
 }
 
+/*

Review Comment:
   uncomment this after fixing `openSeekable`?



##
hudi-hadoop-common/src/main/java/org/apache/hudi/hadoop/fs/HadoopFSUtils.java:
##
@@ -109,25 +105,6 @@ public static FileSystem getFs(String pathStr, 
Configuration conf, boolean local
 return getFs(pathStr, conf);
   }
 
-  public static HoodieStorage getStorageWithWrapperFS(StoragePath path,

Review Comment:
   Is this moved to somewhere else?



##
hudi-common/src/test/java/org/apache/hudi/common/testutils/reader/HoodieFileSliceTestUtils.java:
##
@@ -267,7 +273,8 @@ public static HoodieBaseFile createBaseFile(
 0.1,
 true);
 
-try (HoodieAvroParquetWriter writer = new HoodieAvroParquetWriter(
+try (HoodieAvroFileWriter writer = (HoodieAvroFileWriter) 
ReflectionUtils.loadClass("org.apache.hudi.io.storage.HoodieAvroParquetWriter",

Review Comment:
   Make sure you revisit all such reflection calls based on my previous 
comments.



##
hudi-hadoop-common/src/main/java/org/apache/hudi/hadoop/fs/HoodieHadoopUtils.java:
##
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.hadoop.fs;
+
+import org.apache.hudi.storage.StoragePath;
+
+import org.apache.hadoop.conf.Configuration;
+
+public class HoodieHadoopUtils {
+
+  public static Configuration registerFileSystem(StoragePath file, 
Configuration conf) {

Review Comment:
   Should this be in `HadoopFSUtils`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1593527111


##
hudi-common/src/main/java/org/apache/hudi/storage/HoodieStorageUtils.java:
##
@@ -34,22 +32,35 @@ public static HoodieStorage 
getStorage(StorageConfiguration conf) {
   }
 
   public static HoodieStorage getStorage(FileSystem fs) {
-return new HoodieHadoopStorage(fs);
+return (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage", 
new Class[] {FileSystem.class}, fs);

Review Comment:
   Similar here on the class name and reflection utils to construct 
`HoodieStorage` using reflection. 



##
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestUtils.java:
##
@@ -198,24 +199,25 @@ public static HoodieTableMetaClient init(String basePath, 
HoodieTableType tableT
   }
 
   /**
-   * @param storageConf storage configuration.
+   * @param conf storage configuration.
* @param basePathbase path of the Hudi table.
* @return a new {@link HoodieTableMetaClient} instance.
*/
-  public static HoodieTableMetaClient createMetaClient(StorageConfiguration 
storageConf,
+  public static HoodieTableMetaClient createMetaClient(Configuration conf,

Review Comment:
   Did you flip the order of these two?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1593525459


##
hudi-common/src/main/java/org/apache/hudi/metrics/Metrics.java:
##
@@ -99,8 +97,7 @@ public static synchronized void shutdownAllMetrics() {
   private List 
addAdditionalMetricsExporters(HoodieMetricsConfig metricConfig) {
 List reporterList = new ArrayList<>();
 List propPathList = 
StringUtils.split(metricConfig.getMetricReporterFileBasedConfigs(), ",");
-try (HoodieStorage storage = HoodieStorageUtils.getStorage(
-propPathList.get(0), HadoopFSUtils.getStorageConf(new 
Configuration( {
+try (HoodieStorage storage = 
HoodieStorageUtils.getStorage(propPathList.get(0))) {

Review Comment:
   Separate this change out?  Also, should it get the actual config from 
somewhere else?  An empty config may not work in some cases (reach out to the 
author who made the changes in `Metrics` class). 



##
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestDataGenerator.java:
##
@@ -578,8 +576,8 @@ private static void createMetadataFile(String f, String 
basePath, StorageConfigu
 basePath + "/" + HoodieTableMetaClient.METAFOLDER_NAME + "/" + f);
 OutputStream os = null;
 try {
-  FileSystem fs = HadoopFSUtils.getFs(basePath, configuration);
-  os = fs.create(commitFile, true);
+  HoodieStorage storage = HoodieStorageUtils.getStorage(basePath, 
configuration);

Review Comment:
   Changes here can be in an independent PR.



##
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestUtils.java:
##
@@ -198,24 +199,25 @@ public static HoodieTableMetaClient init(String basePath, 
HoodieTableType tableT
   }
 
   /**
-   * @param storageConf storage configuration.
+   * @param conf storage configuration.
* @param basePathbase path of the Hudi table.
* @return a new {@link HoodieTableMetaClient} instance.
*/
-  public static HoodieTableMetaClient createMetaClient(StorageConfiguration 
storageConf,
+  public static HoodieTableMetaClient createMetaClient(Configuration conf,
String basePath) {
 return HoodieTableMetaClient.builder()
-.setConf(storageConf).setBasePath(basePath).build();
+
.setConf(HoodieStorageUtils.getStorageConf(conf)).setBasePath(basePath).build();

Review Comment:
   This needs to use `HadoopFSUtils.getStorageConfWithCopy(conf)` to be 
consistent with previous behavior.



##
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestUtils.java:
##
@@ -198,24 +199,25 @@ public static HoodieTableMetaClient init(String basePath, 
HoodieTableType tableT
   }
 
   /**
-   * @param storageConf storage configuration.
+   * @param conf storage configuration.
* @param basePathbase path of the Hudi table.
* @return a new {@link HoodieTableMetaClient} instance.
*/
-  public static HoodieTableMetaClient createMetaClient(StorageConfiguration 
storageConf,
+  public static HoodieTableMetaClient createMetaClient(Configuration conf,
String basePath) {
 return HoodieTableMetaClient.builder()
-.setConf(storageConf).setBasePath(basePath).build();
+
.setConf(HoodieStorageUtils.getStorageConf(conf)).setBasePath(basePath).build();

Review Comment:
   Let's strictly follow the same behavior, even if there may not be obvious or 
bug.  Any behavior change should be in separate PRs for a closer look.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1593522855


##
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieNativeAvroHFileReader.java:
##
@@ -265,7 +263,7 @@ private HFileReader newHFileReader() throws IOException {
 if (path.isPresent()) {
   HoodieStorage storage = HoodieStorageUtils.getStorage(path.get(), conf);
   fileSize = storage.getPathInfo(path.get()).getLength();
-  inputStream = new HadoopSeekableDataInputStream((FSDataInputStream) 
storage.open(path.get()));
+  inputStream = storage.openSeekable(path.get());

Review Comment:
   This is a behavior change.  The current implementation of `openSeekable` in 
`HoodieHadoopStorage` leverages `HadoopFSUtils#getFSDataInputStream` which 
wraps the `FSDataInputStream` based on different conditions, which are intended 
for reading log file only (not block content with InlineFS).  Here for reading 
HFile, it should return the `new 
HadoopSeekableDataInputStream((FSDataInputStream) storage.open(path.get()))` 
without any wrapping.  One option is to create an overloaded `openSeekable` 
with different arguments for skipping such wrapping in `HoodieStorage`, to 
maintain the same behavior.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-08 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1593483883


##
hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieHFileDataBlock.java:
##
@@ -189,9 +193,7 @@ protected byte[] serializeRecords(List 
records) throws IOException
   protected  ClosableIterator> deserializeRecords(byte[] 
content, HoodieRecordType type) throws IOException {
 checkState(readerSchema != null, "Reader's schema has to be non-null");
 
-StorageConfiguration storageConf =
-
FSUtils.buildInlineConf(getBlockContentLocation().get().getStorageConf());
-HoodieStorage storage = HoodieStorageUtils.getStorage(pathForReader, 
storageConf);

Review Comment:
Is the newly passed-in storage instance the same as the storage instance 
instantiated here before?



##
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroFileWriterFactory.java:
##
@@ -106,7 +120,13 @@ protected HoodieFileWriter newOrcFileWriter(
 config.getInt(HoodieStorageConfig.ORC_STRIPE_SIZE),
 config.getInt(HoodieStorageConfig.ORC_BLOCK_SIZE),
 config.getLong(HoodieStorageConfig.ORC_FILE_MAX_SIZE), filter);
-return new HoodieAvroOrcWriter(instantTime, path, orcConfig, schema, 
taskContextSupplier);
+try {
+  return (HoodieFileWriter) 
ReflectionUtils.loadClass("org.apache.hudi.io.storage.HoodieAvroOrcWriter",

Review Comment:
   Same



##
hudi-common/src/main/java/org/apache/hudi/common/util/OrcUtils.java:
##
@@ -80,7 +79,7 @@ public class OrcUtils extends BaseFileUtils {
   public ClosableIterator 
getHoodieKeyIterator(StorageConfiguration configuration, StoragePath 
filePath) {
 try {
   Configuration conf = configuration.unwrapCopyAs(Configuration.class);
-  conf.addResource(HadoopFSUtils.getFs(filePath.toString(), 
conf).getConf());
+  conf.addResource(HoodieStorageUtils.getStorage(filePath, 
configuration).getConf().unwrapAs(Configuration.class));

Review Comment:
   This ORC reader is Hadoop-based, so there is no need to change this logic.  
This implementation of `BaseFileUtils` should be moved to `hudi-hadoop-common`.



##
hudi-common/src/main/java/org/apache/hudi/common/util/ParquetUtils.java:
##
@@ -100,7 +100,8 @@ public static ParquetMetadata 
readMetadata(StorageConfiguration conf, Storage
 ParquetMetadata footer;
 try {
   // TODO(vc): Should we use the parallel reading version here?
-  footer = 
ParquetFileReader.readFooter(HadoopFSUtils.getFs(parquetFileHadoopPath.toString(),
 conf).getConf(), parquetFileHadoopPath);
+  footer = ParquetFileReader.readFooter(HoodieStorageUtils.getStorage(

Review Comment:
   Similarly, this `ParquetUtils` is also Hadoop-based.



##
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroFileWriterFactory.java:
##
@@ -94,7 +102,13 @@ protected HoodieFileWriter newHFileFileWriter(
 HoodieAvroHFileReaderImplBase.KEY_FIELD_NAME,
 PREFETCH_ON_OPEN, CACHE_DATA_IN_L1, DROP_BEHIND_CACHE_COMPACTION, 
filter, HFILE_COMPARATOR);
 
-return new HoodieAvroHFileWriter(instantTime, path, hfileConfig, schema, 
taskContextSupplier, config.getBoolean(HoodieTableConfig.POPULATE_META_FIELDS));
+try {
+  return (HoodieFileWriter) 
ReflectionUtils.loadClass("org.apache.hudi.io.storage.HoodieAvroHFileWriter",

Review Comment:
   Here too.



##
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroFileWriterFactory.java:
##
@@ -66,7 +67,14 @@ protected HoodieFileWriter newParquetFileWriter(
 config.getLongOrDefault(HoodieStorageConfig.PARQUET_MAX_FILE_SIZE),
 conf.unwrapAs(Configuration.class), 
config.getDoubleOrDefault(HoodieStorageConfig.PARQUET_COMPRESSION_RATIO_FRACTION),
 
config.getBooleanOrDefault(HoodieStorageConfig.PARQUET_DICTIONARY_ENABLED));
-return new HoodieAvroParquetWriter(path, parquetConfig, instantTime, 
taskContextSupplier, populateMetaFields);
+try {
+  return (HoodieFileWriter) 
ReflectionUtils.loadClass("org.apache.hudi.io.storage.HoodieAvroParquetWriter",

Review Comment:
   Similarly, define such implementation class names, e.g., 
`"org.apache.hudi.io.storage.HoodieAvroParquetWriter"`, in a common place, so 
it's easier to replace later with configs.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-07 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1593442058


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java:
##
@@ -674,14 +674,15 @@ private static HoodieLogBlock getBlock(HoodieWriteConfig 
writeConfig,
  List records,
  boolean shouldWriteRecordPositions,
  Map 
header,
- String keyField) {
+ String keyField,
+ HoodieStorage storage) {
 switch (logDataBlockFormat) {
   case AVRO_DATA_BLOCK:
 return new HoodieAvroDataBlock(records, shouldWriteRecordPositions, 
header, keyField);
   case HFILE_DATA_BLOCK:
 return new HoodieHFileDataBlock(
 records, header, writeConfig.getHFileCompressionAlgorithm(), new 
StoragePath(writeConfig.getBasePath()),
-
writeConfig.getBooleanOrDefault(HoodieReaderConfig.USE_NATIVE_HFILE_READER));
+
writeConfig.getBooleanOrDefault(HoodieReaderConfig.USE_NATIVE_HFILE_READER), 
storage);

Review Comment:
   Is passing the storage instance required here?  We should strictly do 
refactoring with changing as little argument as possible.



##
hudi-common/src/main/java/org/apache/hudi/common/config/PropertiesConfig.java:
##
@@ -0,0 +1,24 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.common.config;
+
+public interface PropertiesConfig {

Review Comment:
   Javadocs?



##
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java:
##
@@ -212,21 +192,7 @@ public static List 
getAllPartitionFoldersThreeLevelsDown(HoodieStorage s
* Given a base partition and a partition path, return relative path of 
partition path to the base path.
*/
   public static String getRelativePartitionPath(Path basePath, Path 
fullPartitionPath) {
-basePath = CachingPath.getPathWithoutSchemeAndAuthority(basePath);
-fullPartitionPath = 
CachingPath.getPathWithoutSchemeAndAuthority(fullPartitionPath);
-
-String fullPartitionPathStr = fullPartitionPath.toString();
-
-if (!fullPartitionPathStr.startsWith(basePath.toString())) {
-  throw new IllegalArgumentException("Partition path \"" + 
fullPartitionPathStr
-  + "\" does not belong to base-path \"" + basePath + "\"");
-}
-
-int partitionStartIndex = fullPartitionPathStr.indexOf(basePath.getName(),
-basePath.getParent() == null ? 0 : 
basePath.getParent().toString().length());
-// Partition-Path could be empty for non-partitioned tables
-return partitionStartIndex + basePath.getName().length() == 
fullPartitionPathStr.length() ? ""
-: fullPartitionPathStr.substring(partitionStartIndex + 
basePath.getName().length() + 1);
+return getRelativePartitionPath(new StoragePath(basePath.toUri()), new 
StoragePath(fullPartitionPath.toUri()));

Review Comment:
   Let's separate this into a different PR since it's not relevant to fixing 
the dependency tree?



##
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##
@@ -390,7 +389,9 @@ public HoodieStorage getStorage() {
   consistencyGuardConfig)
   : new NoOpConsistencyGuard();
 
-  storage = getStorageWithWrapperFS(
+  storage = (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HoodieHadoopStorage",
+  new Class[] {StoragePath.class, StorageConfiguration.class, 
boolean.class, long.class, int.class,
+  long.class, String.class, ConsistencyGuard.class},

Review Comment:
   Have a util method to load the `HoodieStorage` using reflection?



##
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##
@@ -390,7 +389,9 @@ public HoodieStorage getStorage() {
   consistencyGuardConfig)
   : new NoOpConsistencyGuard();
 
-  storage = getStorageWithWrapperFS(
+  storage = (HoodieStorage) 
ReflectionUtils.loadClass("org.apache.hudi.storage.had

Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-07 Thread via GitHub


yihua commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1593272677


##
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/HoodieClientTestUtils.java:
##
@@ -313,7 +313,7 @@ public static Option 
getCommitMetadataForLatestInstant(Hoo
* @return a new {@link HoodieTableMetaClient} instance.
*/
   public static HoodieTableMetaClient createMetaClient(JavaSparkContext jsc, 
String basePath) {
-return HoodieTestUtils.createMetaClient(jsc.hadoopConfiguration(), 
basePath);
+return 
HoodieTestUtils.createMetaClient(HoodieStorageUtils.getStorageConf(jsc.hadoopConfiguration()),
 basePath);

Review Comment:
   Still keep a test util `HoodieTestUtils.createMetaClient` that takes Hadoop 
configuration?  Or use `HoodieClientTestUtils#createMetaClient(JavaSparkContext 
jsc, String basePath)`?



##
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/HoodieClientTestUtils.java:
##
@@ -322,7 +322,7 @@ public static HoodieTableMetaClient 
createMetaClient(JavaSparkContext jsc, Strin
* @return a new {@link HoodieTableMetaClient} instance.
*/
   public static HoodieTableMetaClient createMetaClient(SparkSession spark, 
String basePath) {
-return 
HoodieTestUtils.createMetaClient(spark.sessionState().newHadoopConf(), 
basePath);
+return 
HoodieTestUtils.createMetaClient(HoodieStorageUtils.getStorageConf(spark.sessionState().newHadoopConf()),
 basePath);

Review Comment:
   Use `HoodieTestUtils#createMetaClient(SparkSession spark, String basePath)`?



##
hudi-client/hudi-java-client/pom.xml:
##
@@ -42,6 +42,11 @@
 hudi-client-common
 ${project.parent.version}
 
+
+org.apache.hudi
+hudi-hadoop-common
+${project.version}
+

Review Comment:
   Move this above `hudi-client-common`.  Also, I think `hudi-client-common` 
pulls in `hudi-hadoop-common`?  Similar for other POMs. 



##
hudi-client/hudi-client-common/pom.xml:
##
@@ -43,6 +43,16 @@
   hudi-common
   ${project.version}
 
+
+  org.apache.hudi
+  hudi-io
+  ${project.version}
+

Review Comment:
   Is this needed, given `hudi-common` module pulls in `hudi-io` dependency 
transitively?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-07 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2098943852

   
   ## CI report:
   
   * 2bde7d57048e342132d0b70b524317e1bcbdb96b Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23742)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-07 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2098748252

   
   ## CI report:
   
   * 7c72471a1b9b5ad43ca63ab60da0f3d260f67cea Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23721)
 
   * 2bde7d57048e342132d0b70b524317e1bcbdb96b Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23742)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-07 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2098720660

   
   ## CI report:
   
   * 7c72471a1b9b5ad43ca63ab60da0f3d260f67cea Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23721)
 
   * 2bde7d57048e342132d0b70b524317e1bcbdb96b UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-06 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2097467747

   
   ## CI report:
   
   * 7c72471a1b9b5ad43ca63ab60da0f3d260f67cea Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23721)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-06 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2097379202

   
   ## CI report:
   
   * 1a5e0d9b0b3bd73b7034ef290fbeaf6aa7a66441 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23705)
 
   * 7c72471a1b9b5ad43ca63ab60da0f3d260f67cea Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23721)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-06 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2097373459

   
   ## CI report:
   
   * 1a5e0d9b0b3bd73b7034ef290fbeaf6aa7a66441 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23705)
 
   * 7c72471a1b9b5ad43ca63ab60da0f3d260f67cea UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-06 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2096530418

   
   ## CI report:
   
   * 1a5e0d9b0b3bd73b7034ef290fbeaf6aa7a66441 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23705)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-06 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2096434273

   
   ## CI report:
   
   * 1a5e0d9b0b3bd73b7034ef290fbeaf6aa7a66441 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23705)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-06 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2096420931

   
   ## CI report:
   
   * 1a5e0d9b0b3bd73b7034ef290fbeaf6aa7a66441 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-06 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2096181036

   
   ## CI report:
   
   * 1a5e0d9b0b3bd73b7034ef290fbeaf6aa7a66441 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23705)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-06 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2096163440

   
   ## CI report:
   
   * 834aad2a8b073a221e68fb3c960200f684b84dfd Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23630)
 
   * 1a5e0d9b0b3bd73b7034ef290fbeaf6aa7a66441 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-02 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2091884462

   
   ## CI report:
   
   * 834aad2a8b073a221e68fb3c960200f684b84dfd Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23630)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-02 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2091820661

   
   ## CI report:
   
   * c0a81f2890f9b066738fdf74cad9edf79cae0fda Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23625)
 
   * 834aad2a8b073a221e68fb3c960200f684b84dfd Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23630)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-02 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2091675567

   
   ## CI report:
   
   * c0a81f2890f9b066738fdf74cad9edf79cae0fda Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23625)
 
   * 834aad2a8b073a221e68fb3c960200f684b84dfd UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-02 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2091556390

   
   ## CI report:
   
   * c0a81f2890f9b066738fdf74cad9edf79cae0fda Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23625)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-02 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2091448728

   
   ## CI report:
   
   * b88fa88e1a946edf8da8f0686345fe06fd0f55ce Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23621)
 
   * c0a81f2890f9b066738fdf74cad9edf79cae0fda Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23625)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-02 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2091343002

   
   ## CI report:
   
   * b88fa88e1a946edf8da8f0686345fe06fd0f55ce Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23621)
 
   * c0a81f2890f9b066738fdf74cad9edf79cae0fda UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-02 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1588149891


##
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestUtils.java:
##
@@ -68,7 +68,8 @@ public class HoodieTestUtils {
   public static final String[] DEFAULT_PARTITION_PATHS = {"2016/03/15", 
"2015/03/16", "2015/03/17"};
 
   public static StorageConfiguration getDefaultStorageConf() {
-return HadoopFSUtils.getStorageConf(new Configuration(false));
+return (StorageConfiguration) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HadoopStorageConfiguration",

Review Comment:
   should make this use hoodiestorageutils



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-02 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2091284841

   
   ## CI report:
   
   * b88fa88e1a946edf8da8f0686345fe06fd0f55ce Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23621)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-02 Thread via GitHub


jonvex commented on code in PR #11131:
URL: https://github.com/apache/hudi/pull/11131#discussion_r1588149891


##
hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestUtils.java:
##
@@ -68,7 +68,8 @@ public class HoodieTestUtils {
   public static final String[] DEFAULT_PARTITION_PATHS = {"2016/03/15", 
"2015/03/16", "2015/03/17"};
 
   public static StorageConfiguration getDefaultStorageConf() {
-return HadoopFSUtils.getStorageConf(new Configuration(false));
+return (StorageConfiguration) 
ReflectionUtils.loadClass("org.apache.hudi.storage.hadoop.HadoopStorageConfiguration",

Review Comment:
   should make this use hoodiestorageutils



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-02 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2091205515

   
   ## CI report:
   
   * 4e4a01a12bc7022fee11414a3b6be9e72e4a18dc Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23618)
 
   * b88fa88e1a946edf8da8f0686345fe06fd0f55ce Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23621)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-02 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2091191554

   
   ## CI report:
   
   * 4e4a01a12bc7022fee11414a3b6be9e72e4a18dc Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23618)
 
   * b88fa88e1a946edf8da8f0686345fe06fd0f55ce UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-02 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2091177439

   
   ## CI report:
   
   * 4e4a01a12bc7022fee11414a3b6be9e72e4a18dc UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-02 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2091097360

   
   ## CI report:
   
   * d1de8c5240cf8f3695303a6e118538a87dea82a8 UNKNOWN
   * 4e4a01a12bc7022fee11414a3b6be9e72e4a18dc Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23618)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-02 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2091083909

   
   ## CI report:
   
   * d1de8c5240cf8f3695303a6e118538a87dea82a8 UNKNOWN
   * 50b743df39f533e7824741d997f0981cb2f8b32c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23617)
 
   * 4e4a01a12bc7022fee11414a3b6be9e72e4a18dc UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7587] Make bundle dependencies for storage abstraction in correct order [hudi]

2024-05-02 Thread via GitHub


hudi-bot commented on PR #11131:
URL: https://github.com/apache/hudi/pull/11131#issuecomment-2090939013

   
   ## CI report:
   
   * d1de8c5240cf8f3695303a6e118538a87dea82a8 UNKNOWN
   * 50b743df39f533e7824741d997f0981cb2f8b32c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23617)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



  1   2   >