hudi-bot opened a new issue, #16381: URL: https://github.com/apache/hudi/issues/16381
We have a Spark Structured Streaming job that writes data in Hudi format. After we upgraded from Hudi 0.11.0 to Hudi 0.13.0, the streaming app no longer writes data to the existing Hudi table. The app started successfully and triggered a listing job, but it did not trigger any other jobs (compaction, cleaning, writing data, etc.). There are no errors in the Spark UI or in the stdout/stderr logs.

When the same streaming application writes to a new S3 location (a new Hudi table), everything works fine.

We use append output mode and a trigger processing time of 30 seconds.

Here are the Hudi configurations used (some values are redacted as xxx):

    'hoodie.datasource.write.table.type': 'MERGE_ON_READ',
    'hoodie.datasource.write.keygenerator.class': 'org.apache.hudi.keygen.CustomKeyGenerator',
    'hoodie.datasource.write.precombine.field': 'xxx',
    'hoodie.datasource.write.partitionpath.field': 'xxx:SIMPLE',
    'hoodie.embed.timeline.server': False,
    'hoodie.index.type': 'BLOOM',
    'hoodie.parquet.compression.codec': 'snappy',
    'hoodie.clean.async': True,
    'hoodie.clean.max.commits': 5,
    'hoodie.parquet.max.file.size': 125829120,
    'hoodie.parquet.small.file.limit': 104857600,
    'hoodie.parquet.block.size': 125829120,
    'hoodie.metadata.enable': True,
    'hoodie.metadata.validate': True,
    'hoodie.datasource.write.hive_style_partitioning': True,
    'hoodie.datasource.hive_sync.support_timestamp': True,
    'hoodie.datasource.hive_sync.jdbcurl': "xxx",
    'hoodie.datasource.hive_sync.username': 'xxx',
    'hoodie.datasource.hive_sync.password': 'xxx',
    'hoodie.datasource.hive_sync.partition_fields': 'xxx',
    'hoodie.datasource.hive_sync.enable': True,
    'hoodie.datasource.hive_sync.partition_extractor_class': 'org.apache.hudi.hive.MultiPartKeysValueExtractor',
    'hoodie.avro.schema.external.transformation': True,
    'hoodie.avro.schema.validate': True,
    'hoodie.table.name': 'xxx',
    'hoodie.datasource.write.table.name': 'xxx',
    'hoodie.datasource.write.recordkey.field': 'xxx',
    'hoodie.datasource.hive_sync.database': 'xxx',
    'hoodie.datasource.hive_sync.table': 'xxx',
    'hoodie.datasource.write.operation': 'upsert'

## JIRA info

- Link: https://issues.apache.org/jira/browse/HUDI-7349
- Type: Bug
- Affects version(s):
  - 0.13.0
- Fix version(s):
  - 1.1.0

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
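
For anyone trying to reproduce the report, here is a minimal PySpark sketch of how a streaming writer with these settings might be wired up. It is not the reporter's code: the function names, `table_path`, and `checkpoint_path` are hypothetical, only a subset of the listed options is shown, and the use of an explicit checkpoint location is an assumption. Option values are written as strings, and the `xxx` placeholders are kept from the report.

```python
def build_hudi_options():
    """Return a subset of the Hudi options listed in the report, as strings."""
    return {
        "hoodie.table.name": "xxx",
        "hoodie.datasource.write.table.type": "MERGE_ON_READ",
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.recordkey.field": "xxx",
        "hoodie.datasource.write.precombine.field": "xxx",
        "hoodie.datasource.write.partitionpath.field": "xxx:SIMPLE",
        "hoodie.datasource.write.keygenerator.class":
            "org.apache.hudi.keygen.CustomKeyGenerator",
        "hoodie.index.type": "BLOOM",
        "hoodie.metadata.enable": "true",
        "hoodie.clean.async": "true",
    }


def start_hudi_stream(df, table_path, checkpoint_path):
    """Start a streaming write of `df` (a streaming DataFrame) to a Hudi table.

    `table_path` and `checkpoint_path` are hypothetical arguments; the report
    does not say which paths or checkpoint settings were used.
    """
    return (
        df.writeStream.format("hudi")
        .options(**build_hudi_options())
        .option("checkpointLocation", checkpoint_path)  # assumed, not in the report
        .outputMode("append")                           # append mode, as reported
        .trigger(processingTime="30 seconds")           # 30-second trigger, as reported
        .start(table_path)
    )
```

Pointing `table_path` at the existing table and then at a fresh location, with everything else unchanged, would mirror the comparison described in the report.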
