hudi-bot opened a new issue, #15459:
URL: https://github.com/apache/hudi/issues/15459
Currently, the bootstrap operation assumes that the partition column is
always of String type.
TestDataSourceForBootstrap.testMetadataBootstrapCOWHiveStylePartitioned fails
for a date partition column if type inference of the partition column is
turned on.
We need to add a config that allows partition column type inference in
bootstrap so that other partition column types are supported.
The relevant code in HoodieSparkBootstrapSchemaProvider:
{code:java}
private static Schema getBootstrapSourceSchemaParquet(HoodieWriteConfig writeConfig, HoodieEngineContext context, Path filePath) {
  // NOTE: The type inference of partition column in the parquet table is turned off explicitly,
  // to be consistent with the existing bootstrap behavior, where the partition column is String
  // typed in Hudi table.
  ((HoodieSparkEngineContext) context).getSqlContext()
      .setConf(SQLConf.PARTITION_COLUMN_TYPE_INFERENCE(), false);
  StructType parquetSchema = ((HoodieSparkEngineContext) context).getSqlContext().read()
      .option("basePath", writeConfig.getBootstrapSourceBasePath())
      .parquet(filePath.toString())
      .schema();
{code}
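To illustrate the proposed behavior, here is a minimal, dependency-free sketch of how such a config could gate partition type resolution. It is not Hudi or Spark code: the config key `hoodie.bootstrap.partition.type.inference.enabled`, the class `BootstrapPartitionTypeSketch`, and the `resolveType` helper are all hypothetical names chosen for this example, and the inference rules are a simplified stand-in for Spark's actual partition type inference. The default of `false` mirrors the existing behavior, where the partition column stays String typed.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Properties;

public class BootstrapPartitionTypeSketch {

    // Hypothetical config key for illustration only; not an existing Hudi option.
    static final String INFER_KEY = "hoodie.bootstrap.partition.type.inference.enabled";

    enum PartitionType { STRING, INTEGER, LONG, DATE }

    // Resolves the type of a raw partition path value (e.g. "2022-09-28").
    // With inference disabled (the assumed default), everything stays STRING.
    static PartitionType resolveType(String rawValue, Properties props) {
        boolean infer = Boolean.parseBoolean(props.getProperty(INFER_KEY, "false"));
        if (!infer) {
            return PartitionType.STRING;
        }
        try {
            Integer.parseInt(rawValue);
            return PartitionType.INTEGER;
        } catch (NumberFormatException ignored) {
            // Not an int; try the next candidate type.
        }
        try {
            Long.parseLong(rawValue);
            return PartitionType.LONG;
        } catch (NumberFormatException ignored) {
            // Not a long; try the next candidate type.
        }
        try {
            LocalDate.parse(rawValue); // ISO-8601 date, yyyy-MM-dd
            return PartitionType.DATE;
        } catch (DateTimeParseException ignored) {
            // Not a date; fall back to STRING.
        }
        return PartitionType.STRING;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        Properties inferOn = new Properties();
        inferOn.setProperty(INFER_KEY, "true");

        System.out.println(resolveType("2022-09-28", defaults)); // STRING
        System.out.println(resolveType("2022-09-28", inferOn));  // DATE
    }
}
```

In the real provider, the equivalent change would presumably pass the configured value to `SQLConf.PARTITION_COLUMN_TYPE_INFERENCE()` instead of the hard-coded `false`. That wiring is an assumption and is not confirmed by this issue.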
## JIRA info
- Link: https://issues.apache.org/jira/browse/HUDI-4932
- Type: Improvement
- Epic: https://issues.apache.org/jira/browse/HUDI-1265