yihua commented on code in PR #13264:
URL: https://github.com/apache/hudi/pull/13264#discussion_r2076021145
##########
hudi-spark-datasource/hudi-spark-common/src/test/scala/org/apache/spark/sql/hive/TestHiveClientUtils.scala:
##########
@@ -50,4 +50,12 @@ class TestHiveClientUtils {
assert(spark.sparkContext.conf.get(CATALOG_IMPLEMENTATION) == "hive")
assert(HiveClientUtils.getSingletonClientForMetadata(spark) == hiveClient)
}
+
+ @AfterAll
Review Comment:
nit: this change seems unrelated to this PR. Let's move it to a separate PR,
provided CI still passes without it.
##########
hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/timeline/TestArchivedTimelineV1.java:
##########
@@ -746,4 +790,38 @@ private static org.apache.hudi.avro.model.HoodieCommitMetadata convertCommitMeta
avroMetaData.getExtraMetadata().put(HoodieRollingStatMetadata.ROLLING_STAT_METADATA_KEY, "");
return avroMetaData;
}
+
+ @Test
+ void shouldReadArchivedFileFrom2025AndValidateContent() {
+ Path path = new Path(TestArchivedTimelineV1.class
+ .getResource("/archivecommits/.commits_.archive.681_1-0-1").getPath());
+
+ assertDoesNotThrow(() -> readAndValidateArchivedFile(path, metaClient));
+ }
+
+ @Test
+ void shouldReadArchivedFileFrom2022AndValidateContent() {
+ Path path = new Path(TestArchivedTimelineV1.class
+ .getResource("/archivecommits/.commits_.archive.1_1-0-1").getPath());
Review Comment:
Could we generate smaller archived-file artifacts for these tests? I also see
that they contain a specific schema. Let's make sure they carry no sensitive
information.
##########
hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/timeline/TestArchivedTimelineV1.java:
##########
@@ -746,4 +790,38 @@ private static org.apache.hudi.avro.model.HoodieCommitMetadata convertCommitMeta
avroMetaData.getExtraMetadata().put(HoodieRollingStatMetadata.ROLLING_STAT_METADATA_KEY, "");
return avroMetaData;
}
+
+ @Test
+ void shouldReadArchivedFileFrom2025AndValidateContent() {
+ Path path = new Path(TestArchivedTimelineV1.class
+ .getResource("/archivecommits/.commits_.archive.681_1-0-1").getPath());
+
+ assertDoesNotThrow(() -> readAndValidateArchivedFile(path, metaClient));
+ }
+
+ @Test
+ void shouldReadArchivedFileFrom2022AndValidateContent() {
Review Comment:
Could we name these tests after the Hudi version that wrote the archived file
instead of the year?
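One way to act on this, sketched under assumptions: key each fixture by the Hudi release that wrote it and derive the test name from that key. The class `ArchivedFixtures`, the helper `testNameFor`, and the labels `VERSION_A` / `VERSION_B` are hypothetical placeholders, since this thread does not say which releases produced the two files.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the PR: maps a Hudi version label
// to the archived-file fixture written by that version.
class ArchivedFixtures {
  // Placeholder labels; replace with the actual Hudi releases.
  static final Map<String, String> FIXTURE_BY_VERSION = new LinkedHashMap<>();

  static {
    // File the PR currently describes as "from 2022".
    FIXTURE_BY_VERSION.put("VERSION_A", "/archivecommits/.commits_.archive.1_1-0-1");
    // File the PR currently describes as "from 2025".
    FIXTURE_BY_VERSION.put("VERSION_B", "/archivecommits/.commits_.archive.681_1-0-1");
  }

  // Builds a version-based test name; rejects labels with no registered fixture.
  static String testNameFor(String version) {
    if (!FIXTURE_BY_VERSION.containsKey(version)) {
      throw new IllegalArgumentException("No archived fixture for version: " + version);
    }
    return "shouldReadArchivedFileWrittenBy" + version;
  }
}
```

With a map like this, a single parameterized test could iterate over the entries instead of keeping one method per file.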
##########
hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/timeline/TestArchivedTimelineV1.java:
##########
@@ -746,4 +790,38 @@ private static org.apache.hudi.avro.model.HoodieCommitMetadata convertCommitMeta
avroMetaData.getExtraMetadata().put(HoodieRollingStatMetadata.ROLLING_STAT_METADATA_KEY, "");
return avroMetaData;
}
+
+ @Test
+ void shouldReadArchivedFileFrom2025AndValidateContent() {
+ Path path = new Path(TestArchivedTimelineV1.class
+ .getResource("/archivecommits/.commits_.archive.681_1-0-1").getPath());
+
+ assertDoesNotThrow(() -> readAndValidateArchivedFile(path, metaClient));
+ }
+
+ @Test
+ void shouldReadArchivedFileFrom2022AndValidateContent() {
+ Path path = new Path(TestArchivedTimelineV1.class
+ .getResource("/archivecommits/.commits_.archive.1_1-0-1").getPath());
+
+ assertDoesNotThrow(() -> readAndValidateArchivedFile(path, metaClient));
+ }
+
+ void readAndValidateArchivedFile(Path path, HoodieTableMetaClient metaClient) throws IOException {
Review Comment:
Is `.archive_commit_older_schema.data` still used?