[jira] [Created] (HUDI-5636) Fix possible changes loss for flink with data_before and op_key_only supplemental logging
Danny Chen created HUDI-5636: Summary: Fix possible changes loss for flink with data_before and op_key_only supplemental logging Key: HUDI-5636 URL: https://issues.apache.org/jira/browse/HUDI-5636 Project: Apache Hudi Issue Type: Improvement Components: flink-sql Reporter: Danny Chen Assignee: Danny Chen -- This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [hudi] danny0405 commented on a diff in pull request #7760: [HUDI-5626] Rename CDC logging mode options
danny0405 commented on code in PR #7760:
URL: https://github.com/apache/hudi/pull/7760#discussion_r1089676727

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java:
##
@@ -168,12 +169,11 @@ private FlinkOptions() {
   public static final ConfigOption SUPPLEMENTAL_LOGGING_MODE = ConfigOptions
       .key("cdc.supplemental.logging.mode")
       .stringType()
-      .defaultValue("cdc_data_before_after") // default record all the change log images
+      .defaultValue(op_key_only.name())
       .withFallbackKeys(HoodieTableConfig.CDC_SUPPLEMENTAL_LOGGING_MODE.key())

Review Comment:
Just create one: https://issues.apache.org/jira/browse/HUDI-5636

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
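The diff above changes the default of `cdc.supplemental.logging.mode` and keeps a fallback key. As a rough, self-contained sketch of what "primary key, then fallback key, then default" resolution looks like, the following plain-Java example may help. It does not use Flink or Hudi classes: `LoggingModeSketch`, its `Mode` enum, and `resolve` are hypothetical names, the `data_before_after` constant is assumed from the PR title rather than shown in the diff, and the fallback key string is an assumed value of `HoodieTableConfig.CDC_SUPPLEMENTAL_LOGGING_MODE.key()`.

```java
import java.util.Map;

public final class LoggingModeSketch {
    // Hypothetical enum. op_key_only and data_before appear in the thread above;
    // data_before_after is an assumed post-rename name.
    enum Mode { op_key_only, data_before, data_before_after }

    // Primary key, as shown in the diff.
    static final String KEY = "cdc.supplemental.logging.mode";
    // ASSUMPTION: the value behind HoodieTableConfig.CDC_SUPPLEMENTAL_LOGGING_MODE.key().
    static final String FALLBACK_KEY = "hoodie.table.cdc.supplemental.logging.mode";

    // Look up the primary key first, then the fallback key, then the new default.
    static Mode resolve(Map<String, String> conf) {
        String raw = conf.get(KEY);
        if (raw == null) {
            raw = conf.get(FALLBACK_KEY);
        }
        if (raw == null) {
            raw = Mode.op_key_only.name();
        }
        return Mode.valueOf(raw);
    }

    public static void main(String[] args) {
        System.out.println(resolve(Map.of()));
        System.out.println(resolve(Map.of(FALLBACK_KEY, "data_before")));
        System.out.println(resolve(Map.of(KEY, "data_before_after", FALLBACK_KEY, "data_before")));
    }
}
```

With the default moved to `op_key_only`, a job that sets neither key records less change-log detail than before, which is presumably why HUDI-5636 was opened to track possible loss of changes on the Flink side.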
[jira] [Updated] (HUDI-5634) Rename CDC related classes
[ https://issues.apache.org/jira/browse/HUDI-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-5634: - Epic Link: HUDI-3478 > Rename CDC related classes > -- > > Key: HUDI-5634 > URL: https://issues.apache.org/jira/browse/HUDI-5634 > Project: Apache Hudi > Issue Type: Improvement > Components: core >Reporter: Yann Byron >Assignee: Yann Byron >Priority: Major > Labels: pull-request-available > > this ticket solves some comments left in > https://github.com/apache/hudi/pull/6727.
[jira] [Updated] (HUDI-5634) Rename CDC related classes
[ https://issues.apache.org/jira/browse/HUDI-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-5634: - Summary: Rename CDC related classes (was: Imporve cdc-related codes) > Rename CDC related classes > -- > > Key: HUDI-5634 > URL: https://issues.apache.org/jira/browse/HUDI-5634 > Project: Apache Hudi > Issue Type: Improvement > Components: core >Reporter: Yann Byron >Assignee: Yann Byron >Priority: Major > Labels: pull-request-available > > this ticket solves some comments left in > https://github.com/apache/hudi/pull/6727.
[GitHub] [hudi] hudi-bot commented on pull request #7769: [HUDI-5633] Fixing performance regression in `HoodieSparkRecord`
hudi-bot commented on PR #7769: URL: https://github.com/apache/hudi/pull/7769#issuecomment-1407314821 ## CI report: * 0ece5561859923b8773d6ff9fa633f014c104300 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14688) * 80d38554649038cb9e668be4edc3a3c0a2c4373f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14693) * 36a0d9aeab63f713dff106ed9a76411aceb900b2 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build
[GitHub] [hudi] hudi-bot commented on pull request #7773: [MINOR] Skip docs generation for table service manager
hudi-bot commented on PR #7773: URL: https://github.com/apache/hudi/pull/7773#issuecomment-1407313623 ## CI report: * 0112cf363fd5c17bed98e60471b4c69cd5fc8c75 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14703)
[GitHub] [hudi] hudi-bot commented on pull request #7772: [MINOR] Cleaning up recently introduced configs
hudi-bot commented on PR #7772: URL: https://github.com/apache/hudi/pull/7772#issuecomment-1407313614 ## CI report: * 5bceb2d752b66d50909f119fbe31f397dfca9a08 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14702)
[GitHub] [hudi] hudi-bot commented on pull request #7770: [HUDI-5631] Improve defaults of early conflict detection configs
hudi-bot commented on PR #7770: URL: https://github.com/apache/hudi/pull/7770#issuecomment-1407313607 ## CI report: * acff503192ea74b693532da26dc91bac782c56cb Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14689) * ab0d70b4650a708efda76438cbc6f96397e3075a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14701)
[GitHub] [hudi] hudi-bot commented on pull request #7764: [HUDI-5628] Fixing log record reader scan V2 config name
hudi-bot commented on PR #7764: URL: https://github.com/apache/hudi/pull/7764#issuecomment-1407313589 ## CI report: * ebcae89f01be1f36d59d23006a2580d0e99b04f1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14680) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14678) * 27109ab732b8be8d653e378a65724885a7e45789 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14700)
[GitHub] [hudi] hudi-bot commented on pull request #7760: [HUDI-5626] Rename CDC logging mode options
hudi-bot commented on PR #7760: URL: https://github.com/apache/hudi/pull/7760#issuecomment-1407313581 ## CI report: * 846574aa53ff3f81ba1828a7ce11bad7a3dd1f75 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14673) * 177e797f339ab1ef72f9079da00109e51f3bfe8d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14699)
[GitHub] [hudi] hudi-bot commented on pull request #7759: [HUDI-5624] Fix HoodieAvroRecordMerger to use new precombine API
hudi-bot commented on PR #7759: URL: https://github.com/apache/hudi/pull/7759#issuecomment-1407313576 ## CI report: * 8bb795346fd54da170b3282a3a647bbe64c818cb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14679) * 9f247d90caf948a2de54be2cf109151453b2a57d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14698)
[GitHub] [hudi] hudi-bot commented on pull request #7410: [HUDI-5634] imporve cdc-related codes
hudi-bot commented on PR #7410: URL: https://github.com/apache/hudi/pull/7410#issuecomment-1407313354 ## CI report: * c3d82c532fc2f48e0d75fae3b7e69dd6305dafbf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14685) * 4e00b43a1d383a78e5781ffecece5618690c97cd Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14697)
[GitHub] [hudi] hudi-bot commented on pull request #7773: [MINOR] Skip docs generation for table service manager
hudi-bot commented on PR #7773: URL: https://github.com/apache/hudi/pull/7773#issuecomment-1407312563 ## CI report: * 0112cf363fd5c17bed98e60471b4c69cd5fc8c75 UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7772: [MINOR] Cleaning up recently introduced configs
hudi-bot commented on PR #7772: URL: https://github.com/apache/hudi/pull/7772#issuecomment-1407312555 ## CI report: * 5bceb2d752b66d50909f119fbe31f397dfca9a08 UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7771: [DO NOT MERGE] Test PR 5631 with FF on: Improve defaults of early conflict detection configs
hudi-bot commented on PR #7771: URL: https://github.com/apache/hudi/pull/7771#issuecomment-1407312546 ## CI report: * 568c02cb4875155af7a99e0afb05ac43906a69c3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14690)
[GitHub] [hudi] hudi-bot commented on pull request #7770: [HUDI-5631] Improve defaults of early conflict detection configs
hudi-bot commented on PR #7770: URL: https://github.com/apache/hudi/pull/7770#issuecomment-1407312542 ## CI report: * acff503192ea74b693532da26dc91bac782c56cb Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14689) * ab0d70b4650a708efda76438cbc6f96397e3075a UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7764: [HUDI-5628] Fixing log record reader scan V2 config name
hudi-bot commented on PR #7764: URL: https://github.com/apache/hudi/pull/7764#issuecomment-1407312532 ## CI report: * ebcae89f01be1f36d59d23006a2580d0e99b04f1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14680) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14678) * 27109ab732b8be8d653e378a65724885a7e45789 UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7760: [HUDI-5626] Rename CDC logging mode options
hudi-bot commented on PR #7760: URL: https://github.com/apache/hudi/pull/7760#issuecomment-1407312517 ## CI report: * 846574aa53ff3f81ba1828a7ce11bad7a3dd1f75 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14673) * 177e797f339ab1ef72f9079da00109e51f3bfe8d UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7759: [HUDI-5624] Fix HoodieAvroRecordMerger to use new precombine API
hudi-bot commented on PR #7759: URL: https://github.com/apache/hudi/pull/7759#issuecomment-1407312504 ## CI report: * 8bb795346fd54da170b3282a3a647bbe64c818cb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14679) * 9f247d90caf948a2de54be2cf109151453b2a57d UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7410: [HUDI-5634] imporve cdc-related codes
hudi-bot commented on PR #7410: URL: https://github.com/apache/hudi/pull/7410#issuecomment-1407312368 ## CI report: * c3d82c532fc2f48e0d75fae3b7e69dd6305dafbf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14685) * 4e00b43a1d383a78e5781ffecece5618690c97cd UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7680: [HUDI-5548] spark sql show|update hudi's table properties
hudi-bot commented on PR #7680: URL: https://github.com/apache/hudi/pull/7680#issuecomment-1407311087 ## CI report: * 0970573f82ef1a49184d1875975463f76f7d791d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14686)
[GitHub] [hudi] hudi-bot commented on pull request #7410: [HUDI-5634] imporve cdc-related codes
hudi-bot commented on PR #7410: URL: https://github.com/apache/hudi/pull/7410#issuecomment-1407310982 ## CI report: * c3d82c532fc2f48e0d75fae3b7e69dd6305dafbf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14685)
[jira] [Updated] (HUDI-5632) CLI commands using Spark fails to execute with hudi-cli-bundle
[ https://issues.apache.org/jira/browse/HUDI-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Guo updated HUDI-5632:
Status: In Progress (was: Open)

> CLI commands using Spark fails to execute with hudi-cli-bundle
> --
>
> Key: HUDI-5632
> URL: https://issues.apache.org/jira/browse/HUDI-5632
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Ethan Guo
> Assignee: Ethan Guo
> Priority: Blocker
> Fix For: 0.13.0
>
>
> The following commands which require launching Spark cannot be executed in Hudi CLI shell with hudi-cli-bundle:
> {code:java}
> savepoint create --commit
> downgrade table --toVersion 3
> upgrade table --toVersion 5 {code}
> Output:
> {code:java}
> hudi:hudi_trips_cow->savepoint create --commit 20230127115839445
> 437425 [main] INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline [] - Loaded instants upto : Option{val=[20230127115839445__commit__COMPLETED]}
> 438860 [Thread-7] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 23/01/27 12:15:18 WARN Utils: Your hostname, Ethans-MacBook-Pro.local resolves to a loopback address: 127.0.0.1; using 192.168.1.21 instead (on interface en0)
> 438862 [Thread-7] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 23/01/27 12:15:18 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
> 439171 [Thread-6] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - Error: Failed to load org.apache.hudi.cli.commands.SparkMain: org/apache/hudi/exception/HoodieSavepointException
> 439175 [Thread-7] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 23/01/27 12:15:18 INFO ShutdownHookManager: Shutdown hook called
> 439176 [Thread-7] INFO org.apache.hudi.cli.utils.InputStreamConsumer [] - 23/01/27 12:15:18 INFO ShutdownHookManager: Deleting directory /private/var/folders/60/wk8qzx310fd32b2dp7mhzvdcgn/T/spark-6e5631ce-26bf-4ceb-82cc-7ce77fcb177e
> 439217 [main] INFO org.apache.hudi.common.table.HoodieTableMetaClient [] - Loading HoodieTableMetaClient from /Users/ethan/Work/tmp/20230127-test-cli-bundle/hudi_trips_cow
> 439218 [main] INFO org.apache.hudi.common.table.HoodieTableConfig [] - Loading table properties from /Users/ethan/Work/tmp/20230127-test-cli-bundle/hudi_trips_cow/.hoodie/hoodie.properties
> 439221 [main] INFO org.apache.hudi.common.table.HoodieTableMetaClient [] - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /Users/ethan/Work/tmp/20230127-test-cli-bundle/hudi_trips_cow
> Failed: Could not create savepoint "20230127115839445". {code}
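The `Failed to load org.apache.hudi.cli.commands.SparkMain: org/apache/hudi/exception/HoodieSavepointException` line in the output above suggests that a class the Spark entry point depends on is not visible on the launched process's classpath. As a hedged diagnostic sketch (not part of Hudi, and `ClasspathCheck` and `isLoadable` are hypothetical names), one could check whether a class resolves from a given classpath like this:

```java
public final class ClasspathCheck {
    // Returns true if the named class can be located by this class's loader.
    // The class is not initialized (second argument is false), only resolved.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names are taken from the log above; pass others as arguments if needed.
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.hudi.cli.commands.SparkMain",
            "org.apache.hudi.exception.HoodieSavepointException"
        };
        for (String name : names) {
            System.out.println(name + " loadable: " + isLoadable(name));
        }
    }
}
```

Running it with the same classpath that the CLI passes to Spark would show which of the two classes is missing; the actual root cause is whatever the fix for HUDI-5632 identifies, which this thread does not state.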
[jira] [Updated] (HUDI-5635) Fix release scripts
[ https://issues.apache.org/jira/browse/HUDI-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5635: Status: In Progress (was: Open) > Fix release scripts > --- > > Key: HUDI-5635 > URL: https://issues.apache.org/jira/browse/HUDI-5635 > Project: Apache Hudi > Issue Type: Bug >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Blocker > Fix For: 0.13.0 > >
[jira] [Updated] (HUDI-5631) Improve defaults of early conflict detection configs
[ https://issues.apache.org/jira/browse/HUDI-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5631: Status: In Progress (was: Open) > Improve defaults of early conflict detection configs > > > Key: HUDI-5631 > URL: https://issues.apache.org/jira/browse/HUDI-5631 > Project: Apache Hudi > Issue Type: Improvement >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Blocker > Labels: pull-request-available > Fix For: 0.13.0 > >
[jira] [Updated] (HUDI-5631) Improve defaults of early conflict detection configs
[ https://issues.apache.org/jira/browse/HUDI-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5631: Status: Patch Available (was: In Progress) > Improve defaults of early conflict detection configs > > > Key: HUDI-5631 > URL: https://issues.apache.org/jira/browse/HUDI-5631 > Project: Apache Hudi > Issue Type: Improvement >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Blocker > Labels: pull-request-available > Fix For: 0.13.0 > >
[jira] [Updated] (HUDI-5635) Fix release scripts
[ https://issues.apache.org/jira/browse/HUDI-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5635: Sprint: 0.13.0 Final Sprint 3 > Fix release scripts > --- > > Key: HUDI-5635 > URL: https://issues.apache.org/jira/browse/HUDI-5635 > Project: Apache Hudi > Issue Type: Bug >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Blocker > Fix For: 0.13.0 > >
[jira] [Updated] (HUDI-5631) Improve defaults of early conflict detection configs
[ https://issues.apache.org/jira/browse/HUDI-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5631: Sprint: 0.13.0 Final Sprint 3 > Improve defaults of early conflict detection configs > > > Key: HUDI-5631 > URL: https://issues.apache.org/jira/browse/HUDI-5631 > Project: Apache Hudi > Issue Type: Improvement >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Blocker > Labels: pull-request-available > Fix For: 0.13.0 > >
[jira] [Assigned] (HUDI-5635) Fix release scripts
[ https://issues.apache.org/jira/browse/HUDI-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo reassigned HUDI-5635: --- Assignee: Ethan Guo > Fix release scripts > --- > > Key: HUDI-5635 > URL: https://issues.apache.org/jira/browse/HUDI-5635 > Project: Apache Hudi > Issue Type: Bug >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Major >
[jira] [Updated] (HUDI-5635) Fix release scripts
[ https://issues.apache.org/jira/browse/HUDI-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5635: Fix Version/s: 0.13.0 > Fix release scripts > --- > > Key: HUDI-5635 > URL: https://issues.apache.org/jira/browse/HUDI-5635 > Project: Apache Hudi > Issue Type: Bug >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Major > Fix For: 0.13.0 > >
[jira] [Updated] (HUDI-5635) Fix release scripts
[ https://issues.apache.org/jira/browse/HUDI-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5635: Priority: Blocker (was: Major) > Fix release scripts > --- > > Key: HUDI-5635 > URL: https://issues.apache.org/jira/browse/HUDI-5635 > Project: Apache Hudi > Issue Type: Bug >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Blocker > Fix For: 0.13.0 > >
[jira] [Updated] (HUDI-5632) CLI commands using Spark fails to execute with hudi-cli-bundle
[ https://issues.apache.org/jira/browse/HUDI-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5632: Sprint: 0.13.0 Final Sprint 3 > CLI commands using Spark fails to execute with hudi-cli-bundle > -- > > Key: HUDI-5632 > URL: https://issues.apache.org/jira/browse/HUDI-5632 > Project: Apache Hudi > Issue Type: Bug >Reporter: Ethan Guo >Assignee: Ethan Guo >Priority: Blocker > Fix For: 0.13.0 > >
[jira] [Created] (HUDI-5635) Fix release scripts
Ethan Guo created HUDI-5635: --- Summary: Fix release scripts Key: HUDI-5635 URL: https://issues.apache.org/jira/browse/HUDI-5635 Project: Apache Hudi Issue Type: Bug Reporter: Ethan Guo
[jira] [Updated] (HUDI-5635) Fix release scripts
[ https://issues.apache.org/jira/browse/HUDI-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-5635: Story Points: 2 > Fix release scripts > --- > > Key: HUDI-5635 > URL: https://issues.apache.org/jira/browse/HUDI-5635 > Project: Apache Hudi > Issue Type: Bug >Reporter: Ethan Guo >Priority: Major >
[hudi] branch master updated (6011de44d47 -> 7352661283e)
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 6011de44d47 [HUDI-5629] Clean CDC log files for enable/disable scenario (#7767) add 7352661283e [MINOR] Add `hudi-platform-service` and `hudi-metaserver-server-bundle` to root pom (#7774) No new revisions were added by this update. Summary of changes: pom.xml | 2 ++ 1 file changed, 2 insertions(+)
[GitHub] [hudi] nsivabalan merged pull request #7774: [MINOR] Add `hudi-platform-service` and `hudi-metaserver-server-bundle` to root pom
nsivabalan merged PR #7774: URL: https://github.com/apache/hudi/pull/7774
[GitHub] [hudi] yihua opened a new pull request, #7774: [MINOR] Add `hudi-platform-service` and `hudi-metaserver-server-bundle` to root pom
yihua opened a new pull request, #7774:
URL: https://github.com/apache/hudi/pull/7774

### Change Logs

As above.

### Impact

Without this change, these artifacts are missing when building the project.

### Risk level

none

### Documentation Update

N/A

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
[GitHub] [hudi] nsivabalan commented on pull request #7772: [MINOR] Cleaning up recently introduced configs
nsivabalan commented on PR #7772: URL: https://github.com/apache/hudi/pull/7772#issuecomment-1407307807 rebased w/ latest master.
[hudi] branch master updated (67d661c2952 -> 6011de44d47)
This is an automated email from the ASF dual-hosted git repository. biyan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from 67d661c2952 [HUDI-5630] Fixing flaky parquet projection tests (#7768) add 6011de44d47 [HUDI-5629] Clean CDC log files for enable/disable scenario (#7767) No new revisions were added by this update. Summary of changes: .../hudi/table/action/clean/CleanPlanner.java | 26 +- 1 file changed, 6 insertions(+), 20 deletions(-)
[GitHub] [hudi] YannByron merged pull request #7767: [HUDI-5629] Clean CDC log files for enable/disable scenario
YannByron merged PR #7767: URL: https://github.com/apache/hudi/pull/7767
[GitHub] [hudi] nsivabalan commented on pull request #7759: [HUDI-5624] Fix HoodieAvroRecordMerger to use new precombine API
nsivabalan commented on PR #7759: URL: https://github.com/apache/hudi/pull/7759#issuecomment-1407307608 Rebased w/ latest master. We landed a fix for a flaky test which was causing failures w/ CI runs.
[GitHub] [hudi] YannByron commented on pull request #7767: [HUDI-5629] Clean CDC log files for enable/disable scenario
YannByron commented on PR #7767: URL: https://github.com/apache/hudi/pull/7767#issuecomment-1407307555 The failed UTs are not related to this PR; they are fixed in https://github.com/apache/hudi/pull/7768. Merging this now.
[GitHub] [hudi] nsivabalan commented on pull request #7760: [HUDI-5626] Rename CDC logging mode options
nsivabalan commented on PR #7760: URL: https://github.com/apache/hudi/pull/7760#issuecomment-1407307296 rebased w/ latest master
[GitHub] [hudi] nsivabalan commented on pull request #7770: [HUDI-5631] Improve defaults of early conflict detection configs
nsivabalan commented on PR #7770: URL: https://github.com/apache/hudi/pull/7770#issuecomment-1407307041 rebased w/ latest master
[GitHub] [hudi] YannByron commented on a diff in pull request #7410: [HUDI-5634] imporve cdc-related codes
YannByron commented on code in PR #7410: URL: https://github.com/apache/hudi/pull/7410#discussion_r1089657325 ## hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCExtractor.java: ## @@ -267,7 +267,7 @@ private HoodieCDCFileSplit parseWriteStat( FileSlice beforeFileSlice = new FileSlice(fileGroupId, writeStat.getPrevCommit(), beforeBaseFile, Collections.emptyList()); cdcFileSplit = new HoodieCDCFileSplit(instantTs, BASE_FILE_DELETE, new ArrayList<>(), Option.empty(), Option.of(beforeFileSlice)); } else if (writeStat.getNumUpdateWrites() == 0L && writeStat.getNumDeletes() == 0 -&& writeStat.getNumWrites() == writeStat.getNumInserts()) { +&& writeStat.getNumWrites() > 0) { Review Comment: this name follows https://github.com/apache/hudi/pull/6727#discussion_r980470347.
[hudi] branch master updated (ff590c6d72c -> 67d661c2952)
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from ff590c6d72c [HUDI-5023] Switching default Write Executor type to `SIMPLE` (#7476) add 67d661c2952 [HUDI-5630] Fixing flaky parquet projection tests (#7768) No new revisions were added by this update. Summary of changes: .../hudi/functional/TestParquetColumnProjection.scala | 17 - 1 file changed, 8 insertions(+), 9 deletions(-)
[GitHub] [hudi] nsivabalan merged pull request #7768: [HUDI-5630] Fixing flaky parquet projection tests
nsivabalan merged PR #7768: URL: https://github.com/apache/hudi/pull/7768
[GitHub] [hudi] nsivabalan commented on pull request #7768: [HUDI-5630] Fixing flaky parquet projection tests
nsivabalan commented on PR #7768: URL: https://github.com/apache/hudi/pull/7768#issuecomment-1407304655 Failed due to a flaky deltastreamer test. going ahead to unblock other blocker patches
[jira] [Updated] (HUDI-5535) Add support for keyless for all keygens(non partitioned, timestamp based key gen)
[ https://issues.apache.org/jira/browse/HUDI-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-5535: Priority: Critical (was: Blocker) > Add support for keyless for all keygens(non partitioned, timestamp based key gen) > Key: HUDI-5535 > URL: https://issues.apache.org/jira/browse/HUDI-5535 > Project: Apache Hudi > Issue Type: Improvement > Components: writer-core > Reporter: sivabalan narayanan > Assignee: sivabalan narayanan > Priority: Critical > Labels: pull-request-available > Fix For: 0.13.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-5535) Add support for keyless for all keygens(non partitioned, timestamp based key gen)
[ https://issues.apache.org/jira/browse/HUDI-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-5535: Fix Version/s: 0.14.0 (was: 0.13.0) > Add support for keyless for all keygens(non partitioned, timestamp based key gen) > Key: HUDI-5535 > URL: https://issues.apache.org/jira/browse/HUDI-5535 > Project: Apache Hudi > Issue Type: Improvement > Components: writer-core > Reporter: sivabalan narayanan > Assignee: sivabalan narayanan > Priority: Critical > Labels: pull-request-available > Fix For: 0.14.0
[jira] [Updated] (HUDI-4700) RFC for primary key-less data model
[ https://issues.apache.org/jira/browse/HUDI-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-4700: Fix Version/s: 0.14.0 (was: 0.13.0) > RFC for primary key-less data model > Key: HUDI-4700 > URL: https://issues.apache.org/jira/browse/HUDI-4700 > Project: Apache Hudi > Issue Type: Task > Reporter: Sagar Sumit > Assignee: Lokesh Jain > Priority: Critical > Fix For: 0.14.0
[jira] [Updated] (HUDI-4700) RFC for primary key-less data model
[ https://issues.apache.org/jira/browse/HUDI-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-4700: Priority: Critical (was: Blocker) > RFC for primary key-less data model > Key: HUDI-4700 > URL: https://issues.apache.org/jira/browse/HUDI-4700 > Project: Apache Hudi > Issue Type: Task > Reporter: Sagar Sumit > Assignee: Lokesh Jain > Priority: Critical > Fix For: 0.13.0
[jira] [Updated] (HUDI-5571) Add support for keyless for all keygens(non partitioned, timestamp based key gen) row writer
[ https://issues.apache.org/jira/browse/HUDI-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-5571: Fix Version/s: 0.14.0 (was: 0.13.0) > Add support for keyless for all keygens(non partitioned, timestamp based key gen) row writer > Key: HUDI-5571 > URL: https://issues.apache.org/jira/browse/HUDI-5571 > Project: Apache Hudi > Issue Type: Improvement > Components: writer-core > Reporter: sivabalan narayanan > Assignee: Lokesh Jain > Priority: Critical > Fix For: 0.14.0 > keyless support for all row writer apis in key gen interface
[jira] [Updated] (HUDI-5536) Support writing to hudi w/o any options
[ https://issues.apache.org/jira/browse/HUDI-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-5536: Priority: Critical (was: Blocker) > Support writing to hudi w/o any options > Key: HUDI-5536 > URL: https://issues.apache.org/jira/browse/HUDI-5536 > Project: Apache Hudi > Issue Type: Improvement > Components: writer-core > Reporter: sivabalan narayanan > Assignee: Lokesh Jain > Priority: Critical > Fix For: 0.13.0 > With the keyless model, we should be able to support `df.write.format("hudi").save(path)`
[jira] [Updated] (HUDI-5586) Add support to auto generation of record keys for SimpleKeyGen
[ https://issues.apache.org/jira/browse/HUDI-5586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-5586: Fix Version/s: 0.14.0 (was: 0.13.0) > Add support to auto generation of record keys for SimpleKeyGen > Key: HUDI-5586 > URL: https://issues.apache.org/jira/browse/HUDI-5586 > Project: Apache Hudi > Issue Type: Improvement > Components: writer-core > Reporter: sivabalan narayanan > Assignee: sivabalan narayanan > Priority: Critical > Fix For: 0.14.0
[jira] [Updated] (HUDI-4701) Support bulk insert without primary key and precombine field
[ https://issues.apache.org/jira/browse/HUDI-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-4701: Priority: Critical (was: Blocker) > Support bulk insert without primary key and precombine field > Key: HUDI-4701 > URL: https://issues.apache.org/jira/browse/HUDI-4701 > Project: Apache Hudi > Issue Type: Task > Reporter: Sagar Sumit > Assignee: Lokesh Jain > Priority: Critical > Fix For: 0.13.0
[jira] [Updated] (HUDI-5536) Support writing to hudi w/o any options
[ https://issues.apache.org/jira/browse/HUDI-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-5536: Fix Version/s: 0.14.0 (was: 0.13.0) > Support writing to hudi w/o any options > Key: HUDI-5536 > URL: https://issues.apache.org/jira/browse/HUDI-5536 > Project: Apache Hudi > Issue Type: Improvement > Components: writer-core > Reporter: sivabalan narayanan > Assignee: Lokesh Jain > Priority: Critical > Fix For: 0.14.0 > With the keyless model, we should be able to support `df.write.format("hudi").save(path)`
[jira] [Updated] (HUDI-5571) Add support for keyless for all keygens(non partitioned, timestamp based key gen) row writer
[ https://issues.apache.org/jira/browse/HUDI-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-5571: Priority: Critical (was: Blocker) > Add support for keyless for all keygens(non partitioned, timestamp based key gen) row writer > Key: HUDI-5571 > URL: https://issues.apache.org/jira/browse/HUDI-5571 > Project: Apache Hudi > Issue Type: Improvement > Components: writer-core > Reporter: sivabalan narayanan > Assignee: Lokesh Jain > Priority: Critical > Fix For: 0.13.0 > keyless support for all row writer apis in key gen interface
[jira] [Updated] (HUDI-4701) Support bulk insert without primary key and precombine field
[ https://issues.apache.org/jira/browse/HUDI-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-4701: Fix Version/s: 0.14.0 (was: 0.13.0) > Support bulk insert without primary key and precombine field > Key: HUDI-4701 > URL: https://issues.apache.org/jira/browse/HUDI-4701 > Project: Apache Hudi > Issue Type: Task > Reporter: Sagar Sumit > Assignee: Lokesh Jain > Priority: Critical > Fix For: 0.14.0
[jira] [Updated] (HUDI-5574) Support auto record key generation with Spark SQL
[ https://issues.apache.org/jira/browse/HUDI-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-5574: Priority: Critical (was: Blocker) > Support auto record key generation with Spark SQL > Key: HUDI-5574 > URL: https://issues.apache.org/jira/browse/HUDI-5574 > Project: Apache Hudi > Issue Type: Bug > Components: writer-core > Reporter: Lokesh Jain > Priority: Critical > Fix For: 0.13.0 > HUDI-2681 adds support for auto record key generation with Spark dataframes. This Jira aims to add support for the same with Spark SQL. One of the changes required here, as pointed out by [~kazdy], is that SQL_INSERT_MODE would need to be handled. In this case, if SQL_INSERT_MODE is set to strict, the insert should fail. cc [~shivnarayan] Essentially, based on this patch (https://github.com/apache/hudi/pull/7681), we want to ensure spark-sql writes also support auto generation of record keys.
[jira] [Updated] (HUDI-5586) Add support to auto generation of record keys for SimpleKeyGen
[ https://issues.apache.org/jira/browse/HUDI-5586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-5586: Priority: Critical (was: Blocker) > Add support to auto generation of record keys for SimpleKeyGen > Key: HUDI-5586 > URL: https://issues.apache.org/jira/browse/HUDI-5586 > Project: Apache Hudi > Issue Type: Improvement > Components: writer-core > Reporter: sivabalan narayanan > Assignee: sivabalan narayanan > Priority: Critical > Fix For: 0.13.0
[GitHub] [hudi] danny0405 commented on pull request #7608: [HUDI-5503] Optimize flink table factory option check
danny0405 commented on PR #7608: URL: https://github.com/apache/hudi/pull/7608#issuecomment-1407301502 Thanks for the contribution. I have reviewed and prepared a patch: [HUDI-5503.patch.zip](https://github.com/apache/hudi/files/10526044/HUDI-5503.patch.zip). Please rebase onto the latest master code and then apply the patch with cmd: `git apply xxx.patch`
[GitHub] [hudi] xushiyan opened a new pull request, #7773: [MINOR] Skip docs generation for table service manager
xushiyan opened a new pull request, #7773: URL: https://github.com/apache/hudi/pull/7773 ### Change Logs Skip docs generation for `HoodieTableServiceManagerConfig` ### Impact NA ### Risk level None ### Documentation Update NA ### Contributor's checklist - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute) - [ ] Change Logs and Impact were stated clearly - [ ] Adequate tests were added if applicable - [ ] CI passed
[GitHub] [hudi] hudi-bot commented on pull request #7769: [HUDI-5633] Fixing performance regression in `HoodieSparkRecord`
hudi-bot commented on PR #7769: URL: https://github.com/apache/hudi/pull/7769#issuecomment-1407297918 ## CI report: * 0ece5561859923b8773d6ff9fa633f014c104300 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14688) * 80d38554649038cb9e668be4edc3a3c0a2c4373f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14693) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build
[GitHub] [hudi] hudi-bot commented on pull request #7767: [HUDI-5629] Clean CDC log files for enable/disable scenario
hudi-bot commented on PR #7767: URL: https://github.com/apache/hudi/pull/7767#issuecomment-1407296545 ## CI report: * 461021069263f049ee764a74294ec596c9c6b8b0 UNKNOWN * 81c7f21d6f8c81490d2e4a5fef3323f6d670449d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14683)
[GitHub] [hudi] hudi-bot commented on pull request #7766: [DO NOT MERGE] RFC46 testing
hudi-bot commented on PR #7766: URL: https://github.com/apache/hudi/pull/7766#issuecomment-1407296538 ## CI report: * 744c663dbd926af8b218288a87e0b3061f2c4250 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14682)
[GitHub] [hudi] xushiyan commented on a diff in pull request #7410: [HUDI-5634] imporve cdc-related codes
xushiyan commented on code in PR #7410: URL: https://github.com/apache/hudi/pull/7410#discussion_r1089638068 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/cdc/HoodieCDCRDD.scala: ## @@ -157,7 +154,7 @@ class HoodieCDCRDD( split.changes.last.getInstant, recordKeyField, preCombineFieldOpt, -usesVirtualKeys = false, +usesVirtualKeys = !populateMetaFields, Review Comment: separate PR for the fix?
[GitHub] [hudi] xushiyan commented on a diff in pull request #7410: [HUDI-5634] imporve cdc-related codes
xushiyan commented on code in PR #7410: URL: https://github.com/apache/hudi/pull/7410#discussion_r1089636397 ## hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCExtractor.java: ## @@ -267,7 +267,7 @@ private HoodieCDCFileSplit parseWriteStat( FileSlice beforeFileSlice = new FileSlice(fileGroupId, writeStat.getPrevCommit(), beforeBaseFile, Collections.emptyList()); cdcFileSplit = new HoodieCDCFileSplit(instantTs, BASE_FILE_DELETE, new ArrayList<>(), Option.empty(), Option.of(beforeFileSlice)); } else if (writeStat.getNumUpdateWrites() == 0L && writeStat.getNumDeletes() == 0 -&& writeStat.getNumWrites() == writeStat.getNumInserts()) { +&& writeStat.getNumWrites() > 0) { Review Comment: It's an inference case because `HoodieCDCExtractor` is all about inferring the CDC result from commit metadata in different scenarios. I prefer a more concise name over a lengthy name that gives the same meaning. Anyway, are you filing a separate PR for the fixes?
[GitHub] [hudi] xushiyan commented on a diff in pull request #7410: [HUDI-5634] imporve cdc-related codes
xushiyan commented on code in PR #7410: URL: https://github.com/apache/hudi/pull/7410#discussion_r1089636397 ## hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCExtractor.java: ## @@ -267,7 +267,7 @@ private HoodieCDCFileSplit parseWriteStat( FileSlice beforeFileSlice = new FileSlice(fileGroupId, writeStat.getPrevCommit(), beforeBaseFile, Collections.emptyList()); cdcFileSplit = new HoodieCDCFileSplit(instantTs, BASE_FILE_DELETE, new ArrayList<>(), Option.empty(), Option.of(beforeFileSlice)); } else if (writeStat.getNumUpdateWrites() == 0L && writeStat.getNumDeletes() == 0 -&& writeStat.getNumWrites() == writeStat.getNumInserts()) { +&& writeStat.getNumWrites() > 0) { Review Comment: It's an inference case because `HoodieCDCExtractor` is all about inferring the CDC result from commit metadata in different scenarios. But `InferCase` is perfectly fine; I prefer a more concise name that gives the same meaning over a lengthy name. So I think we should just land the fixes into RC2.
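Editor's illustration for the `HoodieCDCExtractor` hunk quoted in the PR #7410 review comments above: the two branch conditions (the removed `-` line and the added `+` line) can be compared side by side. This is only a sketch and not Hudi code. `WriteCounts` and `CdcInferenceSketch` are hypothetical names made up here; the only things taken from the hunk are the four counters and the two boolean expressions.

```java
// Hypothetical stand-in for the four counters read in the quoted hunk.
// This is NOT Hudi's write-stat class; it only carries the numbers.
final class WriteCounts {
    final long numWrites;
    final long numInserts;
    final long numUpdateWrites;
    final long numDeletes;

    WriteCounts(long numWrites, long numInserts, long numUpdateWrites, long numDeletes) {
        this.numWrites = numWrites;
        this.numInserts = numInserts;
        this.numUpdateWrites = numUpdateWrites;
        this.numDeletes = numDeletes;
    }
}

final class CdcInferenceSketch {
    // Condition on the removed ("-") line: no updates, no deletes,
    // and every write is counted as an insert.
    static boolean matchesOldCondition(WriteCounts s) {
        return s.numUpdateWrites == 0L && s.numDeletes == 0L && s.numWrites == s.numInserts;
    }

    // Condition on the added ("+") line: no updates, no deletes,
    // and at least one record written.
    static boolean matchesNewCondition(WriteCounts s) {
        return s.numUpdateWrites == 0L && s.numDeletes == 0L && s.numWrites > 0L;
    }

    public static void main(String[] args) {
        WriteCounts empty = new WriteCounts(0, 0, 0, 0);
        WriteCounts mismatch = new WriteCounts(5, 3, 0, 0);
        System.out.println("empty:    old=" + matchesOldCondition(empty)
                + " new=" + matchesNewCondition(empty));
        System.out.println("mismatch: old=" + matchesOldCondition(mismatch)
                + " new=" + matchesNewCondition(mismatch));
    }
}
```

The two predicates disagree in exactly two situations: when nothing was written (the old one matches, the new one does not) and when the write count differs from the insert count with no updates or deletes (the new one matches, the old one does not). Whether either situation occurs in real commit metadata is not stated in the thread.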
[GitHub] [hudi] alexeykudinkin opened a new pull request, #7772: [MINOR] Cleaning up recently introduced configs
alexeykudinkin opened a new pull request, #7772: URL: https://github.com/apache/hudi/pull/7772 ### Change Logs Cleaning up some of the recently introduced configs: - Shortening file-listing mode override for Spark's `FileIndex` ### Contributor's checklist - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute) - [ ] Change Logs and Impact were stated clearly - [ ] Adequate tests were added if applicable - [ ] CI passed
[GitHub] [hudi] hudi-bot commented on pull request #7769: [HUDI-5633] Fixing performance regression in `HoodieSparkRecord`
hudi-bot commented on PR #7769: URL: https://github.com/apache/hudi/pull/7769#issuecomment-1407282031 ## CI report: * 0ece5561859923b8773d6ff9fa633f014c104300 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14688) * 80d38554649038cb9e668be4edc3a3c0a2c4373f UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7679: [HUDI-5563] Check table exist before drop table
hudi-bot commented on PR #7679: URL: https://github.com/apache/hudi/pull/7679#issuecomment-1407281990 ## CI report: * 18e390314ee0744e0f6a23d1293f3b4338750af3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14362) * c29efcd919af87dba3fc8499ab57647a51c25ce6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14692)
[GitHub] [hudi] hudi-bot commented on pull request #7615: [HUDI-5510] Reload active timeline when commit finish
hudi-bot commented on PR #7615: URL: https://github.com/apache/hudi/pull/7615#issuecomment-1407281953 ## CI report: * 1b42230e664a1f4554bc072e3198ee4f2ec8f32e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14215) * 1c2b2822737835978a64d21d80838a4a5a30f951 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14691)
[GitHub] [hudi] hudi-bot commented on pull request #7771: [DO NOT MERGE] Test PR 5631 with FF on: Improve defaults of early conflict detection configs
hudi-bot commented on PR #7771: URL: https://github.com/apache/hudi/pull/7771#issuecomment-1407280756 ## CI report: * 568c02cb4875155af7a99e0afb05ac43906a69c3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14690)
[GitHub] [hudi] hudi-bot commented on pull request #7768: [HUDI-5630] Fixing flaky parquet projection tests
hudi-bot commented on PR #7768: URL: https://github.com/apache/hudi/pull/7768#issuecomment-1407280724 ## CI report: * 28b8b5d1c0a737da8a706b43305dcc852825457d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14687)
[GitHub] [hudi] hudi-bot commented on pull request #7770: [HUDI-5631] Improve defaults of early conflict detection configs
hudi-bot commented on PR #7770: URL: https://github.com/apache/hudi/pull/7770#issuecomment-1407280747 ## CI report: * acff503192ea74b693532da26dc91bac782c56cb Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14689)
[GitHub] [hudi] hudi-bot commented on pull request #7769: [HUDI-5633] Fixing performance regression in `HoodieSparkRecord`
hudi-bot commented on PR #7769: URL: https://github.com/apache/hudi/pull/7769#issuecomment-1407280738 ## CI report: * 0ece5561859923b8773d6ff9fa633f014c104300 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14688) * 80d38554649038cb9e668be4edc3a3c0a2c4373f UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7679: [HUDI-5563] Check table exist before drop table
hudi-bot commented on PR #7679: URL: https://github.com/apache/hudi/pull/7679#issuecomment-1407280667 ## CI report: * 18e390314ee0744e0f6a23d1293f3b4338750af3 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14362) * c29efcd919af87dba3fc8499ab57647a51c25ce6 UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7680: [HUDI-5548] spark sql show|update hudi's table properties
hudi-bot commented on PR #7680: URL: https://github.com/apache/hudi/pull/7680#issuecomment-1407280678 ## CI report: * df3a787ab69d1a3ac0ff854b671699e0a55dc01d Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14452) * 0970573f82ef1a49184d1875975463f76f7d791d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14686)
[GitHub] [hudi] hudi-bot commented on pull request #7615: [HUDI-5510] Reload active timeline when commit finish
hudi-bot commented on PR #7615: URL: https://github.com/apache/hudi/pull/7615#issuecomment-1407280623 ## CI report: * 1b42230e664a1f4554bc072e3198ee4f2ec8f32e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14215) * 1c2b2822737835978a64d21d80838a4a5a30f951 UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7410: [HUDI-5634] imporve cdc-related codes
hudi-bot commented on PR #7410: URL: https://github.com/apache/hudi/pull/7410#issuecomment-1407280560 ## CI report: * 1195034b3a3dc06084733ae8572ffebc2b79d295 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13556) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13633) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13874) * c3d82c532fc2f48e0d75fae3b7e69dd6305dafbf Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14685)
[jira] [Updated] (HUDI-5633) Fixing HoodieSparkRecord performance bottlenecks
[ https://issues.apache.org/jira/browse/HUDI-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-5633: - Labels: pull-request-available (was: ) > Fixing HoodieSparkRecord performance bottlenecks > > > Key: HUDI-5633 > URL: https://issues.apache.org/jira/browse/HUDI-5633 > Project: Apache Hudi > Issue Type: Bug > Reporter: Alexey Kudinkin > Assignee: Alexey Kudinkin > Priority: Blocker > Labels: pull-request-available > Fix For: 0.13.0 > > > There are currently the following issues with the HoodieSparkRecord implementation: # It rewrites records using `rewriteRecord` and `rewriteRecordWithNewSchema`, which traverse the schema for every record. Instead, we should traverse the schema only once and produce a transformer that directly creates the new record from the old one. # Records are currently copied for every Executor, even for the Simple one, which does not buffer any records and therefore doesn't require records to be copied. -- This message was sent by Atlassian Jira (v8.20.10#820010)
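The first point in the HUDI-5633 description (traverse the schema once, then reuse the resulting transformer for every record) can be illustrated with a minimal sketch. This is not the Hudi implementation: schemas are modeled as plain lists of field names, records as `Object[]`, and `RecordTransformerSketch` and `buildTransformer` are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: schemas are plain lists of field names and records are
// Object[]; none of these names are Hudi APIs.
final class RecordTransformerSketch {

  // Walks both schemas ONCE and returns a function that remaps a record from
  // the old field order to the new one. Fields that do not exist in the old
  // schema are filled with null.
  static Function<Object[], Object[]> buildTransformer(List<String> oldSchema, List<String> newSchema) {
    final int[] sourceIndex = new int[newSchema.size()];
    for (int i = 0; i < newSchema.size(); i++) {
      sourceIndex[i] = oldSchema.indexOf(newSchema.get(i)); // -1 when the field is new
    }
    // The per-record work is only an array copy driven by the precomputed indices.
    return oldRecord -> {
      Object[] newRecord = new Object[sourceIndex.length];
      for (int i = 0; i < sourceIndex.length; i++) {
        newRecord[i] = sourceIndex[i] >= 0 ? oldRecord[sourceIndex[i]] : null;
      }
      return newRecord;
    };
  }

  public static void main(String[] args) {
    List<String> oldSchema = Arrays.asList("id", "name", "ts");
    List<String> newSchema = Arrays.asList("ts", "id", "city");
    Function<Object[], Object[]> transformer = buildTransformer(oldSchema, newSchema);
    Object[] out = transformer.apply(new Object[] {1, "a", 100L});
    System.out.println(Arrays.toString(out)); // prints [100, 1, null]
  }
}
```

The design point is that the schema comparison happens once, in `buildTransformer`, so its cost no longer scales with the number of records.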
[GitHub] [hudi] hudi-bot commented on pull request #7771: [DO NOT MERGE] Test HUDI-5631 with FF on: Improve defaults of early conflict detection configs
hudi-bot commented on PR #7771: URL: https://github.com/apache/hudi/pull/7771#issuecomment-1407278874 ## CI report: * 568c02cb4875155af7a99e0afb05ac43906a69c3 UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7769: [HUDI-5633] Fixing performance regression in `HoodieSparkRecord`
hudi-bot commented on PR #7769: URL: https://github.com/apache/hudi/pull/7769#issuecomment-1407278862 ## CI report: * 0ece5561859923b8773d6ff9fa633f014c104300 UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7770: [HUDI-5631] Improve defaults of early conflict detection configs
hudi-bot commented on PR #7770: URL: https://github.com/apache/hudi/pull/7770#issuecomment-1407278868 ## CI report: * acff503192ea74b693532da26dc91bac782c56cb UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7767: [HUDI-5629] Clean CDC log files for enable/disable scenario
hudi-bot commented on PR #7767: URL: https://github.com/apache/hudi/pull/7767#issuecomment-1407278843 ## CI report: * 461021069263f049ee764a74294ec596c9c6b8b0 UNKNOWN * 81c7f21d6f8c81490d2e4a5fef3323f6d670449d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14683)
[GitHub] [hudi] hudi-bot commented on pull request #7768: [HUDI-5630] Fixing flaky parquet projection tests
hudi-bot commented on PR #7768: URL: https://github.com/apache/hudi/pull/7768#issuecomment-1407278856 ## CI report: * 28b8b5d1c0a737da8a706b43305dcc852825457d UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7680: [HUDI-5548] spark sql show|update hudi's table properties
hudi-bot commented on PR #7680: URL: https://github.com/apache/hudi/pull/7680#issuecomment-1407278795 ## CI report: * df3a787ab69d1a3ac0ff854b671699e0a55dc01d Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14452) * 0970573f82ef1a49184d1875975463f76f7d791d UNKNOWN
[jira] [Updated] (HUDI-5634) Imporve cdc-related codes
[ https://issues.apache.org/jira/browse/HUDI-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-5634: - Labels: pull-request-available (was: ) > Imporve cdc-related codes > - > > Key: HUDI-5634 > URL: https://issues.apache.org/jira/browse/HUDI-5634 > Project: Apache Hudi > Issue Type: Improvement > Components: core > Reporter: Yann Byron > Assignee: Yann Byron > Priority: Major > Labels: pull-request-available > > This ticket addresses some comments left in https://github.com/apache/hudi/pull/6727.
[GitHub] [hudi] hudi-bot commented on pull request #7410: [HUDI-5634] imporve cdc-related codes
hudi-bot commented on PR #7410: URL: https://github.com/apache/hudi/pull/7410#issuecomment-1407278713 ## CI report: * 1195034b3a3dc06084733ae8572ffebc2b79d295 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13556) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13633) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13874) * c3d82c532fc2f48e0d75fae3b7e69dd6305dafbf UNKNOWN
[GitHub] [hudi] YannByron commented on a diff in pull request #7410: [HUDI-3478] imporve cdc-related codes
YannByron commented on code in PR #7410: URL: https://github.com/apache/hudi/pull/7410#discussion_r1089616197 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/cdc/HoodieCDCRDD.scala: ## @@ -157,7 +154,7 @@ class HoodieCDCRDD( split.changes.last.getInstant, recordKeyField, preCombineFieldOpt, -usesVirtualKeys = false, +usesVirtualKeys = !populateMetaFields, Review Comment: Restore this first. Let's keep this PR focused on code improvement.
[GitHub] [hudi] hudi-bot commented on pull request #7767: [HUDI-5629] Clean CDC log files for enable/disable scenario
hudi-bot commented on PR #7767: URL: https://github.com/apache/hudi/pull/7767#issuecomment-1407277186 ## CI report: * 461021069263f049ee764a74294ec596c9c6b8b0 UNKNOWN * 81c7f21d6f8c81490d2e4a5fef3323f6d670449d UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7759: [HUDI-5624] Fix HoodieAvroRecordMerger to use new precombine API
hudi-bot commented on PR #7759: URL: https://github.com/apache/hudi/pull/7759#issuecomment-1407277156 ## CI report: * 8bb795346fd54da170b3282a3a647bbe64c818cb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14679)
[GitHub] [hudi] hudi-bot commented on pull request #7764: [HUDI-5628] Fixing log record reader scan V2 config name
hudi-bot commented on PR #7764: URL: https://github.com/apache/hudi/pull/7764#issuecomment-1407277171 ## CI report: * ebcae89f01be1f36d59d23006a2580d0e99b04f1 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14680) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14678)
[GitHub] [hudi] YannByron commented on a diff in pull request #7410: [HUDI-3478] imporve cdc-related codes
YannByron commented on code in PR #7410: URL: https://github.com/apache/hudi/pull/7410#discussion_r1089614628 ## hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCExtractor.java: ## @@ -267,7 +267,7 @@ private HoodieCDCFileSplit parseWriteStat( FileSlice beforeFileSlice = new FileSlice(fileGroupId, writeStat.getPrevCommit(), beforeBaseFile, Collections.emptyList()); cdcFileSplit = new HoodieCDCFileSplit(instantTs, BASE_FILE_DELETE, new ArrayList<>(), Option.empty(), Option.of(beforeFileSlice)); } else if (writeStat.getNumUpdateWrites() == 0L && writeStat.getNumDeletes() == 0 -&& writeStat.getNumWrites() == writeStat.getNumInserts()) { +&& writeStat.getNumWrites() > 0) { Review Comment: In my view, `writeStat.getNumWrites() == writeStat.getNumInserts()` is correct. The change was made only to address this comment: https://github.com/apache/hudi/pull/6727#discussion_r980481223. The case mentioned in that comment should be fixed elsewhere in the code, not here. I will roll back this change.
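To make the difference between the two conditions in this diff concrete, here is a small sketch that evaluates both predicates on a few sample statistics. `Stat` is a hypothetical stand-in for the write statistics, not Hudi's class, and the "mixed" case assumes that the write count can exceed the insert count when a file also carries records that were neither updated nor deleted; that assumption is not confirmed by the thread.

```java
// Hypothetical sketch: Stat is a stand-in for write statistics, not a Hudi class.
final class InsertOnlyCheckSketch {

  static final class Stat {
    final long numWrites;
    final long numInserts;
    final long numUpdateWrites;
    final long numDeletes;

    Stat(long numWrites, long numInserts, long numUpdateWrites, long numDeletes) {
      this.numWrites = numWrites;
      this.numInserts = numInserts;
      this.numUpdateWrites = numUpdateWrites;
      this.numDeletes = numDeletes;
    }
  }

  // The original condition: every written record must be an insert.
  static boolean strictCheck(Stat s) {
    return s.numUpdateWrites == 0L && s.numDeletes == 0L && s.numWrites == s.numInserts;
  }

  // The proposed (later reverted) condition: any non-empty write without updates or deletes.
  static boolean relaxedCheck(Stat s) {
    return s.numUpdateWrites == 0L && s.numDeletes == 0L && s.numWrites > 0L;
  }

  public static void main(String[] args) {
    Stat pureInsert = new Stat(10, 10, 0, 0);
    Stat mixed = new Stat(15, 10, 0, 0); // assumed case: 5 pre-existing records rewritten with 10 inserts
    Stat empty = new Stat(0, 0, 0, 0);

    System.out.println(strictCheck(pureInsert) + " " + relaxedCheck(pureInsert)); // prints true true
    System.out.println(strictCheck(mixed) + " " + relaxedCheck(mixed));           // prints false true
    System.out.println(strictCheck(empty) + " " + relaxedCheck(empty));           // prints true false
  }
}
```

Under these assumptions the two predicates disagree in exactly the "mixed" and "empty" cases, which is why swapping one for the other changes which files are treated as pure inserts.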
[GitHub] [hudi] yihua opened a new pull request, #7771: [DO NOT MERGE] Test HUDI-5631 with FF on: Improve defaults of early conflict detection configs
yihua opened a new pull request, #7771: URL: https://github.com/apache/hudi/pull/7771 ### Change Logs Run tests in CI only ### Impact N/A ### Risk level none ### Documentation Update N/A ### Contributor's checklist - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute) - [ ] Change Logs and Impact were stated clearly - [ ] Adequate tests were added if applicable - [ ] CI passed
[jira] [Assigned] (HUDI-5634) Imporve cdc-related codes
[ https://issues.apache.org/jira/browse/HUDI-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yann Byron reassigned HUDI-5634: Assignee: Yann Byron > Imporve cdc-related codes > - > > Key: HUDI-5634 > URL: https://issues.apache.org/jira/browse/HUDI-5634 > Project: Apache Hudi > Issue Type: Improvement > Components: core > Reporter: Yann Byron > Assignee: Yann Byron > Priority: Major > > This ticket addresses some comments left in https://github.com/apache/hudi/pull/6727.
[jira] [Created] (HUDI-5634) Imporve cdc-related codes
Yann Byron created HUDI-5634: Summary: Imporve cdc-related codes Key: HUDI-5634 URL: https://issues.apache.org/jira/browse/HUDI-5634 Project: Apache Hudi Issue Type: Improvement Components: core Reporter: Yann Byron This ticket addresses some comments left in https://github.com/apache/hudi/pull/6727.
[jira] [Updated] (HUDI-5633) Fixing HoodieSparkRecord performance bottlenecks
[ https://issues.apache.org/jira/browse/HUDI-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-5633: -- Fix Version/s: 0.13.0 > Fixing HoodieSparkRecord performance bottlenecks > > > Key: HUDI-5633 > URL: https://issues.apache.org/jira/browse/HUDI-5633 > Project: Apache Hudi > Issue Type: Bug > Reporter: Alexey Kudinkin > Assignee: Alexey Kudinkin > Priority: Blocker > Fix For: 0.13.0 > > > There are currently the following issues with the HoodieSparkRecord implementation: # It rewrites records using `rewriteRecord` and `rewriteRecordWithNewSchema`, which traverse the schema for every record. Instead, we should traverse the schema only once and produce a transformer that directly creates the new record from the old one. # Records are currently copied for every Executor, even for the Simple one, which does not buffer any records and therefore doesn't require records to be copied.
[GitHub] [hudi] danny0405 commented on a diff in pull request #7410: [HUDI-3478] imporve cdc-related codes
danny0405 commented on code in PR #7410: URL: https://github.com/apache/hudi/pull/7410#discussion_r1089610242 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/cdc/HoodieCDCRDD.scala: ## @@ -157,7 +154,7 @@ class HoodieCDCRDD( split.changes.last.getInstant, recordKeyField, preCombineFieldOpt, -usesVirtualKeys = false, +usesVirtualKeys = !populateMetaFields, Review Comment: Seems critical, should be merged for RC2