yihua commented on code in PR #11208:
URL: https://github.com/apache/hudi/pull/11208#discussion_r1599190553
########## hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/HoodieSparkFileReaderFactory.java: ##########
@@ -31,34 +31,37 @@
 public class HoodieSparkFileReaderFactory extends HoodieFileReaderFactory {

+  public HoodieSparkFileReaderFactory(StorageConfiguration<?> storageConf) {
+    super(storageConf);
+  }
+
   @Override
-  public HoodieFileReader newParquetFileReader(StorageConfiguration<?> conf, StoragePath path) {
-    conf.setIfUnset(SQLConf.PARQUET_BINARY_AS_STRING().key(), SQLConf.PARQUET_BINARY_AS_STRING().defaultValueString());
-    conf.setIfUnset(SQLConf.PARQUET_INT96_AS_TIMESTAMP().key(), SQLConf.PARQUET_INT96_AS_TIMESTAMP().defaultValueString());
-    conf.setIfUnset(SQLConf.CASE_SENSITIVE().key(), SQLConf.CASE_SENSITIVE().defaultValueString());
+  public HoodieFileReader newParquetFileReader(StoragePath path) {
+    storageConf.setIfUnset(SQLConf.PARQUET_BINARY_AS_STRING().key(), SQLConf.PARQUET_BINARY_AS_STRING().defaultValueString());
+    storageConf.setIfUnset(SQLConf.PARQUET_INT96_AS_TIMESTAMP().key(), SQLConf.PARQUET_INT96_AS_TIMESTAMP().defaultValueString());
+    storageConf.setIfUnset(SQLConf.CASE_SENSITIVE().key(), SQLConf.CASE_SENSITIVE().defaultValueString());
     // Using string value of this conf to preserve compatibility across spark versions.
-    conf.setIfUnset("spark.sql.legacy.parquet.nanosAsLong", "false");
+    storageConf.setIfUnset("spark.sql.legacy.parquet.nanosAsLong", "false");
     // This is a required config since Spark 3.4.0: SQLConf.PARQUET_INFER_TIMESTAMP_NTZ_ENABLED
     // Using string value of this conf to preserve compatibility across spark versions.
-    conf.setIfUnset("spark.sql.parquet.inferTimestampNTZ.enabled", "true");
-    return new HoodieSparkParquetReader(conf, path);
+    storageConf.setIfUnset("spark.sql.parquet.inferTimestampNTZ.enabled", "true");
+    return new HoodieSparkParquetReader(storageConf, path);
   }

   @Override
   protected HoodieFileReader newHFileFileReader(HoodieConfig hoodieConfig,
-                                                StorageConfiguration<?> conf,
                                                 StoragePath path,
                                                 Option<Schema> schemaOption) throws IOException {
     throw new HoodieIOException("Not support read HFile");
   }

   @Override
-  protected HoodieFileReader newOrcFileReader(StorageConfiguration<?> conf, StoragePath path) {
+  protected HoodieFileReader newOrcFileReader(StoragePath path) {
     throw new HoodieIOException("Not support read orc file");
   }

   @Override
   public HoodieFileReader newBootstrapFileReader(HoodieFileReader skeletonFileReader,
                                                  HoodieFileReader dataFileReader,
                                                  Option<String[]> partitionFields,
                                                  Object[] partitionValues) {
     return new HoodieSparkBootstrapFileReader(skeletonFileReader, dataFileReader, partitionFields, partitionValues);
   }
-}
+}

Review Comment:
   nit: keep the new line.

########## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java: ##########
@@ -470,9 +470,9 @@ public void performMergeDataValidationCheck(WriteStatus writeStatus) {
     }

     long oldNumWrites = 0;
-    try (HoodieFileReader reader = HoodieIOFactory.getIOFactory(storage.getConf())
+    try (HoodieFileReader reader = HoodieIOFactory.getIOFactory(hoodieTable.getStorageConf())

Review Comment:
   Is there a difference between `storage.getConf()` and `hoodieTable.getStorageConf()`? Probably we can remove `HoodieStorage` and `FileSystem` instances in the `HoodieIOHandle` (in a separate PR).

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use
the URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
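The refactor reviewed above moves the storage configuration out of each reader-creation method's parameter list and into the factory's constructor, where it is stored once as a field. A minimal, hypothetical sketch of that pattern follows; the `StorageConf` and `FileReaderFactory` classes and the `setIfUnset` behavior shown here are simplified stand-ins for illustration, not actual Hudi APIs:

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Hudi's StorageConfiguration: setIfUnset only applies a
// default when the key is not already present.
class StorageConf {
  private final Map<String, String> props = new HashMap<>();

  void setIfUnset(String key, String value) {
    props.putIfAbsent(key, value);
  }

  String get(String key) {
    return props.get(key);
  }
}

// Stand-in for the reader factory: the conf is injected once via the
// constructor, so creation methods no longer take it as a parameter.
class FileReaderFactory {
  protected final StorageConf storageConf;

  FileReaderFactory(StorageConf storageConf) {
    this.storageConf = storageConf;
  }

  String newParquetFileReader(String path) {
    // Defaults are applied against the shared field, not a per-call argument.
    storageConf.setIfUnset("spark.sql.parquet.inferTimestampNTZ.enabled", "true");
    return "reader:" + path;
  }
}

public class Main {
  public static void main(String[] args) {
    StorageConf conf = new StorageConf();
    // A value set before construction wins over the factory's default.
    conf.setIfUnset("spark.sql.parquet.inferTimestampNTZ.enabled", "false");

    FileReaderFactory factory = new FileReaderFactory(conf);
    factory.newParquetFileReader("/tmp/file.parquet");

    // setIfUnset did not overwrite the pre-existing value.
    System.out.println(conf.get("spark.sql.parquet.inferTimestampNTZ.enabled"));
  }
}
```

One consequence worth noting in review: with a shared field, every reader created by the same factory sees (and may mutate) the same configuration object, which is why the `storage.getConf()` vs. `hoodieTable.getStorageConf()` question matters.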