yihua commented on code in PR #13264:
URL: https://github.com/apache/hudi/pull/13264#discussion_r2076021145


##########
hudi-spark-datasource/hudi-spark-common/src/test/scala/org/apache/spark/sql/hive/TestHiveClientUtils.scala:
##########
@@ -50,4 +50,12 @@ class TestHiveClientUtils {
     assert(spark.sparkContext.conf.get(CATALOG_IMPLEMENTATION) == "hive")
     assert(HiveClientUtils.getSingletonClientForMetadata(spark) == hiveClient)
   }
+
+  @AfterAll

Review Comment:
   nit: this change seems unrelated to this PR.  Let's move it to a separate 
PR if CI still passes without it.



##########
hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/timeline/TestArchivedTimelineV1.java:
##########
@@ -746,4 +790,38 @@ private static org.apache.hudi.avro.model.HoodieCommitMetadata convertCommitMeta
     avroMetaData.getExtraMetadata().put(HoodieRollingStatMetadata.ROLLING_STAT_METADATA_KEY, "");
     return avroMetaData;
   }
+
+  @Test
+  void shouldReadArchivedFileFrom2025AndValidateContent() {
+    Path path = new Path(TestArchivedTimelineV1.class
+        .getResource("/archivecommits/.commits_.archive.681_1-0-1").getPath());
+
+    assertDoesNotThrow(() -> readAndValidateArchivedFile(path, metaClient));
+  }
+
+  @Test
+  void shouldReadArchivedFileFrom2022AndValidateContent() {
+    Path path = new Path(TestArchivedTimelineV1.class
+        .getResource("/archivecommits/.commits_.archive.1_1-0-1").getPath());

Review Comment:
   Could we generate smaller archive file artifacts for these tests?  I also 
see that they contain a specific schema.  Let's make sure they include no 
sensitive information.



##########
hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/timeline/TestArchivedTimelineV1.java:
##########
@@ -746,4 +790,38 @@ private static org.apache.hudi.avro.model.HoodieCommitMetadata convertCommitMeta
     avroMetaData.getExtraMetadata().put(HoodieRollingStatMetadata.ROLLING_STAT_METADATA_KEY, "");
     return avroMetaData;
   }
+
+  @Test
+  void shouldReadArchivedFileFrom2025AndValidateContent() {
+    Path path = new Path(TestArchivedTimelineV1.class
+        .getResource("/archivecommits/.commits_.archive.681_1-0-1").getPath());
+
+    assertDoesNotThrow(() -> readAndValidateArchivedFile(path, metaClient));
+  }
+
+  @Test
+  void shouldReadArchivedFileFrom2022AndValidateContent() {

Review Comment:
   Could we name this test after the Hudi version that wrote the file instead 
of the year?
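
   One way to read this suggestion, as a minimal sketch only: key each 
archived fixture by the Hudi version that wrote it and derive the test name 
from that label. The class name `ArchivedFixtureNames`, the helper `testName`, 
and the labels `VersionX` / `VersionY` are hypothetical placeholders; the 
releases that actually produced these files are not stated in the PR. Only the 
two resource paths come from the diff.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ArchivedFixtureNames {
    // Placeholder version labels (assumption): replace with the real Hudi
    // releases that wrote each archived file once they are known.
    static final Map<String, String> FIXTURE_BY_VERSION = new LinkedHashMap<>();

    static {
        FIXTURE_BY_VERSION.put("VersionX", "/archivecommits/.commits_.archive.1_1-0-1");
        FIXTURE_BY_VERSION.put("VersionY", "/archivecommits/.commits_.archive.681_1-0-1");
    }

    // Builds a test name keyed by the writer version rather than by year.
    static String testName(String versionLabel) {
        return "shouldReadArchivedFileWrittenBy" + versionLabel + "AndValidateContent";
    }

    public static void main(String[] args) {
        // Print each derived test name next to the fixture it would read.
        FIXTURE_BY_VERSION.forEach((version, resource) ->
            System.out.println(testName(version) + " -> " + resource));
    }
}
```

   With names built this way, a failing test points at the writer version 
whose archive format could not be read, which a calendar year does not convey.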



##########
hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/timeline/TestArchivedTimelineV1.java:
##########
@@ -746,4 +790,38 @@ private static org.apache.hudi.avro.model.HoodieCommitMetadata convertCommitMeta
     avroMetaData.getExtraMetadata().put(HoodieRollingStatMetadata.ROLLING_STAT_METADATA_KEY, "");
     return avroMetaData;
   }
+
+  @Test
+  void shouldReadArchivedFileFrom2025AndValidateContent() {
+    Path path = new Path(TestArchivedTimelineV1.class
+        .getResource("/archivecommits/.commits_.archive.681_1-0-1").getPath());
+
+    assertDoesNotThrow(() -> readAndValidateArchivedFile(path, metaClient));
+  }
+
+  @Test
+  void shouldReadArchivedFileFrom2022AndValidateContent() {
+    Path path = new Path(TestArchivedTimelineV1.class
+        .getResource("/archivecommits/.commits_.archive.1_1-0-1").getPath());
+
+    assertDoesNotThrow(() -> readAndValidateArchivedFile(path, metaClient));
+  }
+
+  void readAndValidateArchivedFile(Path path, HoodieTableMetaClient metaClient) throws IOException {

Review Comment:
   Is `.archive_commit_older_schema.data` used anywhere?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
