yihua commented on code in PR #13264:
URL: https://github.com/apache/hudi/pull/13264#discussion_r2076030849
##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieDataBlock.java:
##########
@@ -371,7 +372,16 @@ protected Option<String> getRecordKey(HoodieRecord record) {
protected Schema getSchemaFromHeader() {
String schemaStr = getLogBlockHeader().get(HeaderMetadataType.SCHEMA);
- SCHEMA_MAP.computeIfAbsent(schemaStr, (schemaString) -> new Schema.Parser().parse(schemaString));
+ SCHEMA_MAP.computeIfAbsent(schemaStr,
Review Comment:
Could we add a unit test at the data log block layer? It should write a schema
string to a log block that parses under the old Avro version but fails under
the new Avro version, then validate that `getSchemaFromHeader()` returns the
expected schema.
##########
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/versioning/v1/ArchivedTimelineV1.java:
##########
@@ -58,7 +59,8 @@ public class ArchivedTimelineV1 extends BaseTimelineV1 implements HoodieArchived
private static final String ACTION_STATE = "actionState";
private static final String STATE_TRANSITION_TIME = "stateTransitionTime";
private HoodieTableMetaClient metaClient;
- private final Map<String, Map<HoodieInstant.State, byte[]>> readCommits = new HashMap<>();
+ // The first key is the timestamp -> multiple action types -> hoodie instant state and contents
Review Comment:
Does `ArchivedTimelineV2` handle the same instant time with multiple action
types properly, e.g., during table upgrade from version 6 to 8?
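
For illustration, here is a minimal sketch of the nesting that the new comment describes (timestamp -> action type -> state -> contents). It is not the actual Hudi code: the class name `ReadCommitsSketch`, its methods, the local `State` enum (a stand-in for `HoodieInstant.State`), and the action strings are all hypothetical. It only shows why keying by action type keeps two actions that share an instant time from overwriting each other.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names do not come from the Hudi codebase.
public class ReadCommitsSketch {
  // Stand-in for HoodieInstant.State (assumption for this example).
  public enum State { REQUESTED, INFLIGHT, COMPLETED }

  // timestamp -> action type -> instant state -> serialized contents
  private final Map<String, Map<String, Map<State, byte[]>>> readCommits = new HashMap<>();

  // Stores contents under (instantTime, action, state), creating inner maps on demand.
  public void put(String instantTime, String action, State state, byte[] contents) {
    readCommits
        .computeIfAbsent(instantTime, t -> new HashMap<>())
        .computeIfAbsent(action, a -> new HashMap<>())
        .put(state, contents);
  }

  // Returns the stored contents, or null if nothing was recorded for that triple.
  public byte[] get(String instantTime, String action, State state) {
    return readCommits
        .getOrDefault(instantTime, Collections.emptyMap())
        .getOrDefault(action, Collections.emptyMap())
        .get(state);
  }

  // Number of distinct action types recorded for one instant time.
  public int actionCount(String instantTime) {
    return readCommits.getOrDefault(instantTime, Collections.emptyMap()).size();
  }
}
```

With a map keyed only by timestamp and state, a second `put` for the same instant time and state would replace the first entry; with the extra action-type level, both entries survive.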
##########
hudi-spark-datasource/hudi-spark-common/src/test/scala/org/apache/spark/sql/hive/TestHiveClientUtils.scala:
##########
@@ -50,4 +50,12 @@ class TestHiveClientUtils {
assert(spark.sparkContext.conf.get(CATALOG_IMPLEMENTATION) == "hive")
assert(HiveClientUtils.getSingletonClientForMetadata(spark) == hiveClient)
}
+
+ @AfterAll
Review Comment:
nit: this change seems unrelated.