hudi-agent commented on code in PR #18680:
URL: https://github.com/apache/hudi/pull/18680#discussion_r3194907920
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSchemaUtils.scala:
##########
@@ -73,59 +70,48 @@ object HoodieSchemaUtils {
}
/**
- * get latest internalSchema from table
- *
- * @param config instance of {@link HoodieConfig}
- * @param tableMetaClient instance of HoodieTableMetaClient
- * @return Option of InternalSchema. Will always be empty if schema on read
is disabled
+ * Returns the schema-on-read evolution [[HoodieSchema]] for the table —
carrying
+ * field ids and version metadata — or [[None]] when schema-on-read is
disabled,
+ * no schema is in commit metadata, or any read error occurs (read failures
are
+ * swallowed rather than propagated).
*/
- def getLatestTableInternalSchema(config: HoodieConfig,
- tableMetaClient: HoodieTableMetaClient):
Option[InternalSchema] = {
- getLatestTableInternalSchema(config.getProps, tableMetaClient)
+ def getLatestTableEvolutionSchema(config: HoodieConfig,
+ tableMetaClient: HoodieTableMetaClient):
Option[HoodieSchema] = {
+ getLatestTableEvolutionSchema(config.getProps, tableMetaClient)
}
- /**
- * get latest internalSchema from table
- *
- * @param props instance of {@link Properties}
- * @param tableMetaClient instance of HoodieTableMetaClient
- * @return Option of InternalSchema. Will always be empty if schema on read
is disabled
- */
- def getLatestTableInternalSchema(props: Properties,
- tableMetaClient: HoodieTableMetaClient):
Option[InternalSchema] = {
+ def getLatestTableEvolutionSchema(props: Properties,
+ tableMetaClient: HoodieTableMetaClient):
Option[HoodieSchema] = {
Review Comment:
🤖 nit: this `Properties` overload lost the scaladoc that the original
`getLatestTableInternalSchema(props, ...)` had. Could you add a short scaladoc
here too (or factor a shared one) so both overloads stay documented?
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/InternalSchemaManager.java:
##########
@@ -61,11 +60,11 @@ public class InternalSchemaManager implements Serializable {
private static final long serialVersionUID = 1L;
- public static final InternalSchemaManager DISABLED = new
InternalSchemaManager(null, InternalSchema.getEmptyInternalSchema(), null, null,
+ public static final InternalSchemaManager DISABLED = new
InternalSchemaManager(null, HoodieSchema.empty(), null, null,
TimelineLayout.fromVersion(TimelineLayoutVersion.CURR_LAYOUT_VERSION),
null);
@Getter
- private final InternalSchema querySchema;
+ private final HoodieSchema querySchema;
Review Comment:
🤖 nit: the class is still named `InternalSchemaManager` but its
`querySchema` (and the rest of its API) is now `HoodieSchema`. Could you rename
the class to something like `HoodieEvolutionSchemaManager` to match the new
type, either here or as a follow-up?
<sub><i>- AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]