[jira] [Created] (HUDI-5636) Fix possible changes loss for flink with data_before and op_key_only supplemental logging

2023-01-27 Thread Danny Chen (Jira)
Danny Chen created HUDI-5636:


 Summary: Fix possible changes loss for flink with data_before and op_key_only supplemental logging
 Key: HUDI-5636
 URL: https://issues.apache.org/jira/browse/HUDI-5636
 Project: Apache Hudi
  Issue Type: Improvement
  Components: flink-sql
Reporter: Danny Chen
Assignee: Danny Chen






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [hudi] danny0405 commented on a diff in pull request #7760: [HUDI-5626] Rename CDC logging mode options

2023-01-27 Thread via GitHub


danny0405 commented on code in PR #7760:
URL: https://github.com/apache/hudi/pull/7760#discussion_r1089676727


##
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java:
##
@@ -168,12 +169,11 @@ private FlinkOptions() {
   public static final ConfigOption<String> SUPPLEMENTAL_LOGGING_MODE = ConfigOptions
       .key("cdc.supplemental.logging.mode")
       .stringType()
-      .defaultValue("cdc_data_before_after") // default record all the change log images
+      .defaultValue(op_key_only.name())
       .withFallbackKeys(HoodieTableConfig.CDC_SUPPLEMENTAL_LOGGING_MODE.key())

Review Comment:
   Just created one: https://issues.apache.org/jira/browse/HUDI-5636
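
For readers following the diff in this comment, here is a minimal sketch of how an option with a default value and a fallback key is typically resolved: the primary key wins, then the fallback key, then the default. This is not Hudi or Flink code. The class `FallbackOptionSketch` and its `resolve` method are hypothetical, the fallback key string is an assumed value for `HoodieTableConfig.CDC_SUPPLEMENTAL_LOGGING_MODE.key()`, and the default assumes that `op_key_only.name()` yields the literal `op_key_only`.

```java
import java.util.HashMap;
import java.util.Map;

public class FallbackOptionSketch {
    // Primary key, as it appears in the diff.
    static final String PRIMARY_KEY = "cdc.supplemental.logging.mode";
    // Assumption: illustrative value for the table-config fallback key.
    static final String FALLBACK_KEY = "hoodie.table.cdc.supplemental.logging.mode";
    // Assumption: op_key_only.name() returns this literal.
    static final String DEFAULT_VALUE = "op_key_only";

    /** Resolves the mode: primary key first, then fallback key, then the default. */
    static String resolve(Map<String, String> conf) {
        if (conf.containsKey(PRIMARY_KEY)) {
            return conf.get(PRIMARY_KEY);
        }
        if (conf.containsKey(FALLBACK_KEY)) {
            return conf.get(FALLBACK_KEY);
        }
        return DEFAULT_VALUE;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Neither key is set, so the default applies.
        System.out.println(resolve(conf));
        // Only the fallback key is set, so its value applies.
        conf.put(FALLBACK_KEY, "data_before");
        System.out.println(resolve(conf));
    }
}
```

Under these assumptions, a job that sets neither key falls through to `op_key_only`, which is the case the changed default affects.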



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (HUDI-5634) Rename CDC related classes

2023-01-27 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-5634:
-
Epic Link: HUDI-3478

> Rename CDC related classes
> --
>
> Key: HUDI-5634
> URL: https://issues.apache.org/jira/browse/HUDI-5634
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: core
>Reporter: Yann Byron
>Assignee: Yann Byron
>Priority: Major
>  Labels: pull-request-available
>
> This ticket addresses some comments left in https://github.com/apache/hudi/pull/6727.





[jira] [Updated] (HUDI-5634) Rename CDC related classes

2023-01-27 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-5634:
-
Summary: Rename CDC related classes  (was: Imporve cdc-related codes)

> Rename CDC related classes
> --
>
> Key: HUDI-5634
> URL: https://issues.apache.org/jira/browse/HUDI-5634
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: core
>Reporter: Yann Byron
>Assignee: Yann Byron
>Priority: Major
>  Labels: pull-request-available
>
> This ticket addresses some comments left in https://github.com/apache/hudi/pull/6727.





[GitHub] [hudi] hudi-bot commented on pull request #7769: [HUDI-5633] Fixing performance regression in `HoodieSparkRecord`

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7769:
URL: https://github.com/apache/hudi/pull/7769#issuecomment-1407314821

   
   ## CI report:
   
   * 0ece5561859923b8773d6ff9fa633f014c104300 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14688)
 
   * 80d38554649038cb9e668be4edc3a3c0a2c4373f Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14693)
 
   * 36a0d9aeab63f713dff106ed9a76411aceb900b2 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7773: [MINOR] Skip docs generation for table service manager

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7773:
URL: https://github.com/apache/hudi/pull/7773#issuecomment-1407313623

   
   ## CI report:
   
   * 0112cf363fd5c17bed98e60471b4c69cd5fc8c75 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14703)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7772: [MINOR] Cleaning up recently introduced configs

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7772:
URL: https://github.com/apache/hudi/pull/7772#issuecomment-1407313614

   
   ## CI report:
   
   * 5bceb2d752b66d50909f119fbe31f397dfca9a08 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14702)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7770: [HUDI-5631] Improve defaults of early conflict detection configs

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7770:
URL: https://github.com/apache/hudi/pull/7770#issuecomment-1407313607

   
   ## CI report:
   
   * acff503192ea74b693532da26dc91bac782c56cb Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14689)
 
   * ab0d70b4650a708efda76438cbc6f96397e3075a Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14701)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7764: [HUDI-5628] Fixing log record reader scan V2 config name

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7764:
URL: https://github.com/apache/hudi/pull/7764#issuecomment-1407313589

   
   ## CI report:
   
   * ebcae89f01be1f36d59d23006a2580d0e99b04f1 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14680)
 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14678)
 
   * 27109ab732b8be8d653e378a65724885a7e45789 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14700)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7760: [HUDI-5626] Rename CDC logging mode options

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7760:
URL: https://github.com/apache/hudi/pull/7760#issuecomment-1407313581

   
   ## CI report:
   
   * 846574aa53ff3f81ba1828a7ce11bad7a3dd1f75 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14673)
 
   * 177e797f339ab1ef72f9079da00109e51f3bfe8d Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14699)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7759: [HUDI-5624] Fix HoodieAvroRecordMerger to use new precombine API

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7759:
URL: https://github.com/apache/hudi/pull/7759#issuecomment-1407313576

   
   ## CI report:
   
   * 8bb795346fd54da170b3282a3a647bbe64c818cb Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14679)
 
   * 9f247d90caf948a2de54be2cf109151453b2a57d Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14698)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7410: [HUDI-5634] imporve cdc-related codes

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7410:
URL: https://github.com/apache/hudi/pull/7410#issuecomment-1407313354

   
   ## CI report:
   
   * c3d82c532fc2f48e0d75fae3b7e69dd6305dafbf Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14685)
 
   * 4e00b43a1d383a78e5781ffecece5618690c97cd Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14697)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7773: [MINOR] Skip docs generation for table service manager

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7773:
URL: https://github.com/apache/hudi/pull/7773#issuecomment-1407312563

   
   ## CI report:
   
   * 0112cf363fd5c17bed98e60471b4c69cd5fc8c75 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7772: [MINOR] Cleaning up recently introduced configs

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7772:
URL: https://github.com/apache/hudi/pull/7772#issuecomment-1407312555

   
   ## CI report:
   
   * 5bceb2d752b66d50909f119fbe31f397dfca9a08 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7771: [DO NOT MERGE] Test PR 5631 with FF on: Improve defaults of early conflict detection configs

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7771:
URL: https://github.com/apache/hudi/pull/7771#issuecomment-1407312546

   
   ## CI report:
   
   * 568c02cb4875155af7a99e0afb05ac43906a69c3 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14690)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7770: [HUDI-5631] Improve defaults of early conflict detection configs

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7770:
URL: https://github.com/apache/hudi/pull/7770#issuecomment-1407312542

   
   ## CI report:
   
   * acff503192ea74b693532da26dc91bac782c56cb Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14689)
 
   * ab0d70b4650a708efda76438cbc6f96397e3075a UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7764: [HUDI-5628] Fixing log record reader scan V2 config name

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7764:
URL: https://github.com/apache/hudi/pull/7764#issuecomment-1407312532

   
   ## CI report:
   
   * ebcae89f01be1f36d59d23006a2580d0e99b04f1 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14680)
 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14678)
 
   * 27109ab732b8be8d653e378a65724885a7e45789 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7760: [HUDI-5626] Rename CDC logging mode options

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7760:
URL: https://github.com/apache/hudi/pull/7760#issuecomment-1407312517

   
   ## CI report:
   
   * 846574aa53ff3f81ba1828a7ce11bad7a3dd1f75 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14673)
 
   * 177e797f339ab1ef72f9079da00109e51f3bfe8d UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7759: [HUDI-5624] Fix HoodieAvroRecordMerger to use new precombine API

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7759:
URL: https://github.com/apache/hudi/pull/7759#issuecomment-1407312504

   
   ## CI report:
   
   * 8bb795346fd54da170b3282a3a647bbe64c818cb Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14679)
 
   * 9f247d90caf948a2de54be2cf109151453b2a57d UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7410: [HUDI-5634] imporve cdc-related codes

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7410:
URL: https://github.com/apache/hudi/pull/7410#issuecomment-1407312368

   
   ## CI report:
   
   * c3d82c532fc2f48e0d75fae3b7e69dd6305dafbf Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14685)
 
   * 4e00b43a1d383a78e5781ffecece5618690c97cd UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7680: [HUDI-5548] spark sql show|update hudi's table properties

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7680:
URL: https://github.com/apache/hudi/pull/7680#issuecomment-1407311087

   
   ## CI report:
   
   * 0970573f82ef1a49184d1875975463f76f7d791d Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14686)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7410: [HUDI-5634] imporve cdc-related codes

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7410:
URL: https://github.com/apache/hudi/pull/7410#issuecomment-1407310982

   
   ## CI report:
   
   * c3d82c532fc2f48e0d75fae3b7e69dd6305dafbf Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14685)
 
   
   



[jira] [Updated] (HUDI-5632) CLI commands using Spark fails to execute with hudi-cli-bundle

2023-01-27 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5632:

Status: In Progress  (was: Open)

> CLI commands using Spark fails to execute with hudi-cli-bundle
> --
>
> Key: HUDI-5632
> URL: https://issues.apache.org/jira/browse/HUDI-5632
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.13.0
>
>
> The following commands which require launching Spark cannot be executed in 
> Hudi CLI shell with hudi-cli-bundle:
> {code:java}
> savepoint create --commit 
> downgrade table --toVersion 3
> upgrade table --toVersion 5 {code}
> Output:
> {code:java}
> hudi:hudi_trips_cow->savepoint create --commit 20230127115839445
> 437425 [main] INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline [] - Loaded instants upto : Option{val=[20230127115839445__commit__COMPLETED]}
> 438860 [Thread-7] INFO  org.apache.hudi.cli.utils.InputStreamConsumer [] - 23/01/27 12:15:18 WARN Utils: Your hostname, Ethans-MacBook-Pro.local resolves to a loopback address: 127.0.0.1; using 192.168.1.21 instead (on interface en0)
> 438862 [Thread-7] INFO  org.apache.hudi.cli.utils.InputStreamConsumer [] - 23/01/27 12:15:18 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
> 439171 [Thread-6] INFO  org.apache.hudi.cli.utils.InputStreamConsumer [] - Error: Failed to load org.apache.hudi.cli.commands.SparkMain: org/apache/hudi/exception/HoodieSavepointException
> 439175 [Thread-7] INFO  org.apache.hudi.cli.utils.InputStreamConsumer [] - 23/01/27 12:15:18 INFO ShutdownHookManager: Shutdown hook called
> 439176 [Thread-7] INFO  org.apache.hudi.cli.utils.InputStreamConsumer [] - 23/01/27 12:15:18 INFO ShutdownHookManager: Deleting directory /private/var/folders/60/wk8qzx310fd32b2dp7mhzvdcgn/T/spark-6e5631ce-26bf-4ceb-82cc-7ce77fcb177e
> 439217 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient [] - Loading HoodieTableMetaClient from /Users/ethan/Work/tmp/20230127-test-cli-bundle/hudi_trips_cow
> 439218 [main] INFO  org.apache.hudi.common.table.HoodieTableConfig [] - Loading table properties from /Users/ethan/Work/tmp/20230127-test-cli-bundle/hudi_trips_cow/.hoodie/hoodie.properties
> 439221 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient [] - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from /Users/ethan/Work/tmp/20230127-test-cli-bundle/hudi_trips_cow
> Failed: Could not create savepoint "20230127115839445". {code}
>  
>  
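
The `Failed to load org.apache.hudi.cli.commands.SparkMain: org/apache/hudi/exception/HoodieSavepointException` line in the output above reads like a class that could not be found on the launched Spark process's classpath; the issue itself does not state the root cause. One way to test that hypothesis is to look for the class entry inside the bundle jar. The sketch below uses only `java.util.jar.JarFile` from the JDK; the class name `BundleClassCheck` and the jar path passed on the command line are hypothetical.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class BundleClassCheck {
    /** Returns true if the jar at jarPath contains a .class entry for className. */
    static boolean containsClass(String jarPath, String className) throws IOException {
        // A fully qualified class name maps to a slash-separated entry ending in ".class".
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: BundleClassCheck <path-to-jar> <fully.qualified.ClassName>");
            return;
        }
        // args[0] is the path to the bundle jar under test (hypothetical, supplied by the user).
        boolean present = containsClass(args[0], args[1]);
        System.out.println(args[1] + (present ? " is present" : " is missing"));
    }
}
```

Running it with the bundle jar and `org.apache.hudi.exception.HoodieSavepointException` as arguments would show whether that class is packaged in the jar.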





[jira] [Updated] (HUDI-5635) Fix release scripts

2023-01-27 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5635:

Status: In Progress  (was: Open)

> Fix release scripts
> ---
>
> Key: HUDI-5635
> URL: https://issues.apache.org/jira/browse/HUDI-5635
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.13.0
>
>






[jira] [Updated] (HUDI-5631) Improve defaults of early conflict detection configs

2023-01-27 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5631:

Status: In Progress  (was: Open)

> Improve defaults of early conflict detection configs
> 
>
> Key: HUDI-5631
> URL: https://issues.apache.org/jira/browse/HUDI-5631
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.13.0
>
>






[jira] [Updated] (HUDI-5631) Improve defaults of early conflict detection configs

2023-01-27 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5631:

Status: Patch Available  (was: In Progress)

> Improve defaults of early conflict detection configs
> 
>
> Key: HUDI-5631
> URL: https://issues.apache.org/jira/browse/HUDI-5631
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.13.0
>
>






[jira] [Updated] (HUDI-5635) Fix release scripts

2023-01-27 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5635:

Sprint: 0.13.0 Final Sprint 3

> Fix release scripts
> ---
>
> Key: HUDI-5635
> URL: https://issues.apache.org/jira/browse/HUDI-5635
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.13.0
>
>






[jira] [Updated] (HUDI-5631) Improve defaults of early conflict detection configs

2023-01-27 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5631:

Sprint: 0.13.0 Final Sprint 3

> Improve defaults of early conflict detection configs
> 
>
> Key: HUDI-5631
> URL: https://issues.apache.org/jira/browse/HUDI-5631
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.13.0
>
>






[jira] [Assigned] (HUDI-5635) Fix release scripts

2023-01-27 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo reassigned HUDI-5635:
---

Assignee: Ethan Guo

> Fix release scripts
> ---
>
> Key: HUDI-5635
> URL: https://issues.apache.org/jira/browse/HUDI-5635
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
>






[jira] [Updated] (HUDI-5635) Fix release scripts

2023-01-27 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5635:

Fix Version/s: 0.13.0

> Fix release scripts
> ---
>
> Key: HUDI-5635
> URL: https://issues.apache.org/jira/browse/HUDI-5635
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
> Fix For: 0.13.0
>
>






[jira] [Updated] (HUDI-5635) Fix release scripts

2023-01-27 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5635:

Priority: Blocker  (was: Major)

> Fix release scripts
> ---
>
> Key: HUDI-5635
> URL: https://issues.apache.org/jira/browse/HUDI-5635
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.13.0
>
>






[jira] [Updated] (HUDI-5632) CLI commands using Spark fails to execute with hudi-cli-bundle

2023-01-27 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5632:

Sprint: 0.13.0 Final Sprint 3

> CLI commands using Spark fails to execute with hudi-cli-bundle
> --
>
> Key: HUDI-5632
> URL: https://issues.apache.org/jira/browse/HUDI-5632
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.13.0
>
>





[jira] [Created] (HUDI-5635) Fix release scripts

2023-01-27 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-5635:
---

 Summary: Fix release scripts
 Key: HUDI-5635
 URL: https://issues.apache.org/jira/browse/HUDI-5635
 Project: Apache Hudi
  Issue Type: Bug
Reporter: Ethan Guo








[jira] [Updated] (HUDI-5635) Fix release scripts

2023-01-27 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5635:

Story Points: 2

> Fix release scripts
> ---
>
> Key: HUDI-5635
> URL: https://issues.apache.org/jira/browse/HUDI-5635
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Priority: Major
>






[hudi] branch master updated (6011de44d47 -> 7352661283e)

2023-01-27 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


from 6011de44d47 [HUDI-5629] Clean CDC log files for enable/disable 
scenario (#7767)
 add 7352661283e [MINOR] Add `hudi-platform-service` and 
`hudi-metaserver-server-bundle` to root pom (#7774)

No new revisions were added by this update.

Summary of changes:
 pom.xml | 2 ++
 1 file changed, 2 insertions(+)



[GitHub] [hudi] nsivabalan merged pull request #7774: [MINOR] Add `hudi-platform-service` and `hudi-metaserver-server-bundle` to root pom

2023-01-27 Thread via GitHub


nsivabalan merged PR #7774:
URL: https://github.com/apache/hudi/pull/7774


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] yihua opened a new pull request, #7774: [MINOR] Add `hudi-platform-service` and `hudi-metaserver-server-bundle` to root pom

2023-01-27 Thread via GitHub


yihua opened a new pull request, #7774:
URL: https://github.com/apache/hudi/pull/7774

   ### Change Logs
   
   As above.
   
   ### Impact
   
   Without this change, these artifacts are missing when building the project.
   
   ### Risk level
   
   none
   
   ### Documentation Update
   
   N/A
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[GitHub] [hudi] nsivabalan commented on pull request #7772: [MINOR] Cleaning up recently introduced configs

2023-01-27 Thread via GitHub


nsivabalan commented on PR #7772:
URL: https://github.com/apache/hudi/pull/7772#issuecomment-1407307807

   rebased w/ latest master.





[hudi] branch master updated (67d661c2952 -> 6011de44d47)

2023-01-27 Thread biyan
This is an automated email from the ASF dual-hosted git repository.

biyan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


from 67d661c2952 [HUDI-5630] Fixing flaky parquet projection tests (#7768)
 add 6011de44d47 [HUDI-5629] Clean CDC log files for enable/disable 
scenario (#7767)

No new revisions were added by this update.

Summary of changes:
 .../hudi/table/action/clean/CleanPlanner.java  | 26 +-
 1 file changed, 6 insertions(+), 20 deletions(-)



[GitHub] [hudi] YannByron merged pull request #7767: [HUDI-5629] Clean CDC log files for enable/disable scenario

2023-01-27 Thread via GitHub


YannByron merged PR #7767:
URL: https://github.com/apache/hudi/pull/7767





[GitHub] [hudi] nsivabalan commented on pull request #7759: [HUDI-5624] Fix HoodieAvroRecordMerger to use new precombine API

2023-01-27 Thread via GitHub


nsivabalan commented on PR #7759:
URL: https://github.com/apache/hudi/pull/7759#issuecomment-1407307608

   Rebased w/ latest master. We landed a fix for a flaky test that was causing 
failures in CI runs.





[GitHub] [hudi] YannByron commented on pull request #7767: [HUDI-5629] Clean CDC log files for enable/disable scenario

2023-01-27 Thread via GitHub


YannByron commented on PR #7767:
URL: https://github.com/apache/hudi/pull/7767#issuecomment-1407307555

   The failed UTs are not related to this PR and are fixed in 
https://github.com/apache/hudi/pull/7768. Merging this now.





[GitHub] [hudi] nsivabalan commented on pull request #7760: [HUDI-5626] Rename CDC logging mode options

2023-01-27 Thread via GitHub


nsivabalan commented on PR #7760:
URL: https://github.com/apache/hudi/pull/7760#issuecomment-1407307296

   rebased w/ latest master





[GitHub] [hudi] nsivabalan commented on pull request #7770: [HUDI-5631] Improve defaults of early conflict detection configs

2023-01-27 Thread via GitHub


nsivabalan commented on PR #7770:
URL: https://github.com/apache/hudi/pull/7770#issuecomment-1407307041

   rebased w/ latest master





[GitHub] [hudi] YannByron commented on a diff in pull request #7410: [HUDI-5634] imporve cdc-related codes

2023-01-27 Thread via GitHub


YannByron commented on code in PR #7410:
URL: https://github.com/apache/hudi/pull/7410#discussion_r1089657325


##
hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCExtractor.java:
##
@@ -267,7 +267,7 @@ private HoodieCDCFileSplit parseWriteStat(
   FileSlice beforeFileSlice = new FileSlice(fileGroupId, 
writeStat.getPrevCommit(), beforeBaseFile, Collections.emptyList());
   cdcFileSplit = new HoodieCDCFileSplit(instantTs, BASE_FILE_DELETE, 
new ArrayList<>(), Option.empty(), Option.of(beforeFileSlice));
 } else if (writeStat.getNumUpdateWrites() == 0L && 
writeStat.getNumDeletes() == 0
-&& writeStat.getNumWrites() == writeStat.getNumInserts()) {
+&& writeStat.getNumWrites() > 0) {

Review Comment:
   This name follows the discussion in 
https://github.com/apache/hudi/pull/6727#discussion_r980470347. 






[hudi] branch master updated (ff590c6d72c -> 67d661c2952)

2023-01-27 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


from ff590c6d72c [HUDI-5023] Switching default Write Executor type to 
`SIMPLE` (#7476)
 add 67d661c2952 [HUDI-5630] Fixing flaky parquet projection tests (#7768)

No new revisions were added by this update.

Summary of changes:
 .../hudi/functional/TestParquetColumnProjection.scala   | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)



[GitHub] [hudi] nsivabalan merged pull request #7768: [HUDI-5630] Fixing flaky parquet projection tests

2023-01-27 Thread via GitHub


nsivabalan merged PR #7768:
URL: https://github.com/apache/hudi/pull/7768





[GitHub] [hudi] nsivabalan commented on pull request #7768: [HUDI-5630] Fixing flaky parquet projection tests

2023-01-27 Thread via GitHub


nsivabalan commented on PR #7768:
URL: https://github.com/apache/hudi/pull/7768#issuecomment-1407304655

   Failed due to a flaky deltastreamer test. Going ahead with the merge to 
unblock other blocker patches.





[jira] [Updated] (HUDI-5535) Add support for keyless for all keygens(non partitioned, timestamp based key gen)

2023-01-27 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-5535:
--
Priority: Critical  (was: Blocker)

> Add support for keyless for all keygens(non partitioned, timestamp based key 
> gen)
> -
>
> Key: HUDI-5535
> URL: https://issues.apache.org/jira/browse/HUDI-5535
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: writer-core
>Reporter: sivabalan narayanan
>Assignee: sivabalan narayanan
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.13.0
>
>






[jira] [Updated] (HUDI-5535) Add support for keyless for all keygens(non partitioned, timestamp based key gen)

2023-01-27 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-5535:
--
Fix Version/s: 0.14.0
   (was: 0.13.0)

> Add support for keyless for all keygens(non partitioned, timestamp based key 
> gen)
> -
>
> Key: HUDI-5535
> URL: https://issues.apache.org/jira/browse/HUDI-5535
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: writer-core
>Reporter: sivabalan narayanan
>Assignee: sivabalan narayanan
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.14.0
>
>






[jira] [Updated] (HUDI-4700) RFC for primary key-less data model

2023-01-27 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-4700:
--
Fix Version/s: 0.14.0
   (was: 0.13.0)

> RFC for primary key-less data model
> ---
>
> Key: HUDI-4700
> URL: https://issues.apache.org/jira/browse/HUDI-4700
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Sagar Sumit
>Assignee: Lokesh Jain
>Priority: Critical
> Fix For: 0.14.0
>
>






[jira] [Updated] (HUDI-4700) RFC for primary key-less data model

2023-01-27 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-4700:
--
Priority: Critical  (was: Blocker)

> RFC for primary key-less data model
> ---
>
> Key: HUDI-4700
> URL: https://issues.apache.org/jira/browse/HUDI-4700
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Sagar Sumit
>Assignee: Lokesh Jain
>Priority: Critical
> Fix For: 0.13.0
>
>






[jira] [Updated] (HUDI-5571) Add support for keyless for all keygens(non partitioned, timestamp based key gen) row writer

2023-01-27 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-5571:
--
Fix Version/s: 0.14.0
   (was: 0.13.0)

> Add support for keyless for all keygens(non partitioned, timestamp based key 
> gen) row writer 
> -
>
> Key: HUDI-5571
> URL: https://issues.apache.org/jira/browse/HUDI-5571
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: writer-core
>Reporter: sivabalan narayanan
>Assignee: Lokesh Jain
>Priority: Critical
> Fix For: 0.14.0
>
>
> keyless support for all row writer apis in key gen interface





[jira] [Updated] (HUDI-5536) Support writing to hudi w/o any options

2023-01-27 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-5536:
--
Priority: Critical  (was: Blocker)

> Support writing to hudi w/o any options 
> 
>
> Key: HUDI-5536
> URL: https://issues.apache.org/jira/browse/HUDI-5536
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: writer-core
>Reporter: sivabalan narayanan
>Assignee: Lokesh Jain
>Priority: Critical
> Fix For: 0.13.0
>
>
> With the keyless model, we should be able to support 
> df.write.format("hudi").save(path) 
>  





[jira] [Updated] (HUDI-5586) Add support to auto generation of record keys for SimpleKeyGen

2023-01-27 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-5586:
--
Fix Version/s: 0.14.0
   (was: 0.13.0)

> Add support to auto generation of record keys for SimpleKeyGen
> --
>
> Key: HUDI-5586
> URL: https://issues.apache.org/jira/browse/HUDI-5586
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: writer-core
>Reporter: sivabalan narayanan
>Assignee: sivabalan narayanan
>Priority: Critical
> Fix For: 0.14.0
>
>






[jira] [Updated] (HUDI-4701) Support bulk insert without primary key and precombine field

2023-01-27 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-4701:
--
Priority: Critical  (was: Blocker)

> Support bulk insert without primary key and precombine field
> 
>
> Key: HUDI-4701
> URL: https://issues.apache.org/jira/browse/HUDI-4701
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Sagar Sumit
>Assignee: Lokesh Jain
>Priority: Critical
> Fix For: 0.13.0
>
>






[jira] [Updated] (HUDI-5536) Support writing to hudi w/o any options

2023-01-27 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-5536:
--
Fix Version/s: 0.14.0
   (was: 0.13.0)

> Support writing to hudi w/o any options 
> 
>
> Key: HUDI-5536
> URL: https://issues.apache.org/jira/browse/HUDI-5536
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: writer-core
>Reporter: sivabalan narayanan
>Assignee: Lokesh Jain
>Priority: Critical
> Fix For: 0.14.0
>
>
> With the keyless model, we should be able to support 
> df.write.format("hudi").save(path) 
>  





[jira] [Updated] (HUDI-5571) Add support for keyless for all keygens(non partitioned, timestamp based key gen) row writer

2023-01-27 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-5571:
--
Priority: Critical  (was: Blocker)

> Add support for keyless for all keygens(non partitioned, timestamp based key 
> gen) row writer 
> -
>
> Key: HUDI-5571
> URL: https://issues.apache.org/jira/browse/HUDI-5571
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: writer-core
>Reporter: sivabalan narayanan
>Assignee: Lokesh Jain
>Priority: Critical
> Fix For: 0.13.0
>
>
> keyless support for all row writer apis in key gen interface





[jira] [Updated] (HUDI-4701) Support bulk insert without primary key and precombine field

2023-01-27 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-4701:
--
Fix Version/s: 0.14.0
   (was: 0.13.0)

> Support bulk insert without primary key and precombine field
> 
>
> Key: HUDI-4701
> URL: https://issues.apache.org/jira/browse/HUDI-4701
> Project: Apache Hudi
>  Issue Type: Task
>Reporter: Sagar Sumit
>Assignee: Lokesh Jain
>Priority: Critical
> Fix For: 0.14.0
>
>






[jira] [Updated] (HUDI-5574) Support auto record key generation with Spark SQL

2023-01-27 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-5574:
--
Priority: Critical  (was: Blocker)

> Support auto record key generation with Spark SQL
> -
>
> Key: HUDI-5574
> URL: https://issues.apache.org/jira/browse/HUDI-5574
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: writer-core
>Reporter: Lokesh Jain
>Priority: Critical
> Fix For: 0.13.0
>
>
> HUDI-2681 adds support for auto record key generation with Spark dataframes. 
> This Jira aims to add the same support for Spark SQL.
> One of the changes required here, as pointed out by [~kazdy], is that 
> SQL_INSERT_MODE would need to be handled. If SQL_INSERT_MODE is set to 
> strict, the insert should fail.
> cc [~shivnarayan] 
> Essentially, based on this patch 
> (https://github.com/apache/hudi/pull/7681),
> we want to ensure that spark-sql writes also support auto generation of record 
> keys. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HUDI-5586) Add support to auto generation of record keys for SimpleKeyGen

2023-01-27 Thread Sagar Sumit (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Sumit updated HUDI-5586:
--
Priority: Critical  (was: Blocker)

> Add support to auto generation of record keys for SimpleKeyGen
> --
>
> Key: HUDI-5586
> URL: https://issues.apache.org/jira/browse/HUDI-5586
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: writer-core
>Reporter: sivabalan narayanan
>Assignee: sivabalan narayanan
>Priority: Critical
> Fix For: 0.13.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [hudi] danny0405 commented on pull request #7608: [HUDI-5503] Optimize flink table factory option check

2023-01-27 Thread via GitHub


danny0405 commented on PR #7608:
URL: https://github.com/apache/hudi/pull/7608#issuecomment-1407301502

   Thanks for the contribution. I have reviewed it and attached a patch: 
   
[HUDI-5503.patch.zip](https://github.com/apache/hudi/files/10526044/HUDI-5503.patch.zip).
 Please rebase onto the latest master code and then apply the patch with:
   `git apply xxx.patch`





[GitHub] [hudi] xushiyan opened a new pull request, #7773: [MINOR] Skip docs generation for table service manager

2023-01-27 Thread via GitHub


xushiyan opened a new pull request, #7773:
URL: https://github.com/apache/hudi/pull/7773

   ### Change Logs
   
   Skip docs generation for `HoodieTableServiceManagerConfig`
   
   ### Impact
   
   NA
   
   ### Risk level
   
   None
   
   ### Documentation Update
   
   NA
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[GitHub] [hudi] hudi-bot commented on pull request #7769: [HUDI-5633] Fixing performance regression in `HoodieSparkRecord`

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7769:
URL: https://github.com/apache/hudi/pull/7769#issuecomment-1407297918

   
   ## CI report:
   
   * 0ece5561859923b8773d6ff9fa633f014c104300 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14688)
 
   * 80d38554649038cb9e668be4edc3a3c0a2c4373f Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14693)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7767: [HUDI-5629] Clean CDC log files for enable/disable scenario

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7767:
URL: https://github.com/apache/hudi/pull/7767#issuecomment-1407296545

   
   ## CI report:
   
   * 461021069263f049ee764a74294ec596c9c6b8b0 UNKNOWN
   * 81c7f21d6f8c81490d2e4a5fef3323f6d670449d Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14683)
 
   
   
   





[GitHub] [hudi] hudi-bot commented on pull request #7766: [DO NOT MERGE] RFC46 testing

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7766:
URL: https://github.com/apache/hudi/pull/7766#issuecomment-1407296538

   
   ## CI report:
   
   * 744c663dbd926af8b218288a87e0b3061f2c4250 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14682)
 
   
   
   





[GitHub] [hudi] xushiyan commented on a diff in pull request #7410: [HUDI-5634] imporve cdc-related codes

2023-01-27 Thread via GitHub


xushiyan commented on code in PR #7410:
URL: https://github.com/apache/hudi/pull/7410#discussion_r1089638068


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/cdc/HoodieCDCRDD.scala:
##
@@ -157,7 +154,7 @@ class HoodieCDCRDD(
 split.changes.last.getInstant,
 recordKeyField,
 preCombineFieldOpt,
-usesVirtualKeys = false,
+usesVirtualKeys = !populateMetaFields,

Review Comment:
   separate PR for the fix?






[GitHub] [hudi] xushiyan commented on a diff in pull request #7410: [HUDI-5634] imporve cdc-related codes

2023-01-27 Thread via GitHub


xushiyan commented on code in PR #7410:
URL: https://github.com/apache/hudi/pull/7410#discussion_r1089636397


##
hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCExtractor.java:
##
@@ -267,7 +267,7 @@ private HoodieCDCFileSplit parseWriteStat(
   FileSlice beforeFileSlice = new FileSlice(fileGroupId, 
writeStat.getPrevCommit(), beforeBaseFile, Collections.emptyList());
   cdcFileSplit = new HoodieCDCFileSplit(instantTs, BASE_FILE_DELETE, 
new ArrayList<>(), Option.empty(), Option.of(beforeFileSlice));
 } else if (writeStat.getNumUpdateWrites() == 0L && 
writeStat.getNumDeletes() == 0
-&& writeStat.getNumWrites() == writeStat.getNumInserts()) {
+&& writeStat.getNumWrites() > 0) {

Review Comment:
   It's an inference case because `HoodieCDCExtractor` is all about inferring 
the CDC result from commit metadata in different scenarios. 
   
   I prefer a more concise name over a lengthy name that gives the same meaning. 
Anyway, are you filing a separate PR for the fixes?
   






[GitHub] [hudi] xushiyan commented on a diff in pull request #7410: [HUDI-5634] imporve cdc-related codes

2023-01-27 Thread via GitHub


xushiyan commented on code in PR #7410:
URL: https://github.com/apache/hudi/pull/7410#discussion_r1089636397


##
hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCExtractor.java:
##
@@ -267,7 +267,7 @@ private HoodieCDCFileSplit parseWriteStat(
   FileSlice beforeFileSlice = new FileSlice(fileGroupId, 
writeStat.getPrevCommit(), beforeBaseFile, Collections.emptyList());
   cdcFileSplit = new HoodieCDCFileSplit(instantTs, BASE_FILE_DELETE, 
new ArrayList<>(), Option.empty(), Option.of(beforeFileSlice));
 } else if (writeStat.getNumUpdateWrites() == 0L && 
writeStat.getNumDeletes() == 0
-&& writeStat.getNumWrites() == writeStat.getNumInserts()) {
+&& writeStat.getNumWrites() > 0) {

Review Comment:
   It's an inference case because `HoodieCDCExtractor` is all about inferring 
the CDC result from commit metadata in different scenarios. 
   
   But `InferCase` is perfectly fine; I prefer a more concise name that gives 
the same meaning over a lengthy name. So I think we should just land the fixes 
into RC2.
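   
   To illustrate the condition change in the diff quoted above, here is a 
minimal standalone sketch of the two predicates. It only compares the boolean 
expressions on the removed and added lines; the `Counts` class and both method 
names are hypothetical stand-ins for illustration and are not part of Hudi.

```java
public class InsertOnlyCheckSketch {

    /** Hypothetical holder for the four counters read in the diff; not Hudi's write-stat class. */
    public static final class Counts {
        final long numWrites;
        final long numInserts;
        final long numUpdateWrites;
        final long numDeletes;

        public Counts(long numWrites, long numInserts, long numUpdateWrites, long numDeletes) {
            this.numWrites = numWrites;
            this.numInserts = numInserts;
            this.numUpdateWrites = numUpdateWrites;
            this.numDeletes = numDeletes;
        }
    }

    /** Expression on the removed (-) line: the write count must equal the insert count. */
    public static boolean oldCondition(Counts c) {
        return c.numUpdateWrites == 0L && c.numDeletes == 0L
            && c.numWrites == c.numInserts;
    }

    /** Expression on the added (+) line: at least one record must have been written. */
    public static boolean newCondition(Counts c) {
        return c.numUpdateWrites == 0L && c.numDeletes == 0L
            && c.numWrites > 0L;
    }

    public static void main(String[] args) {
        // Write count and insert count disagree, with no updates or deletes.
        Counts mismatch = new Counts(5, 3, 0, 0);
        // Nothing was written at all.
        Counts empty = new Counts(0, 0, 0, 0);

        System.out.println(oldCondition(mismatch) + " " + newCondition(mismatch)); // false true
        System.out.println(oldCondition(empty) + " " + newCondition(empty));       // true false
    }
}
```

   The two expressions disagree in exactly these two situations: when the 
write and insert counts differ, and when nothing was written.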






[GitHub] [hudi] alexeykudinkin opened a new pull request, #7772: [MINOR] Cleaning up recently introduced configs

2023-01-27 Thread via GitHub


alexeykudinkin opened a new pull request, #7772:
URL: https://github.com/apache/hudi/pull/7772

   ### Change Logs
   
   Cleaning up some of the recently introduced configs:
   
- Shortening file-listing mode override for Spark's `FileIndex`
- 
   
   ### Impact
   
   _Describe any public API or user-facing feature change or any performance 
impact._
   
   ### Risk level (write none, low medium or high below)
   
   _If medium or high, explain what verification was done to mitigate the 
risks._
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, 
config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the 
default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. 
Please create a Jira ticket, attach the
 ticket number here and follow the 
[instruction](https://hudi.apache.org/contribute/developer-setup#website) to 
make
 changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[GitHub] [hudi] hudi-bot commented on pull request #7769: [HUDI-5633] Fixing performance regression in `HoodieSparkRecord`

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7769:
URL: https://github.com/apache/hudi/pull/7769#issuecomment-1407282031

   
   ## CI report:
   
   * 0ece5561859923b8773d6ff9fa633f014c104300 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14688)
 
   * 80d38554649038cb9e668be4edc3a3c0a2c4373f UNKNOWN
   
   
   





[GitHub] [hudi] hudi-bot commented on pull request #7679: [HUDI-5563] Check table exist before drop table

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7679:
URL: https://github.com/apache/hudi/pull/7679#issuecomment-1407281990

   
   ## CI report:
   
   * 18e390314ee0744e0f6a23d1293f3b4338750af3 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14362)
 
   * c29efcd919af87dba3fc8499ab57647a51c25ce6 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14692)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7615: [HUDI-5510] Reload active timeline when commit finish

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7615:
URL: https://github.com/apache/hudi/pull/7615#issuecomment-1407281953

   
   ## CI report:
   
   * 1b42230e664a1f4554bc072e3198ee4f2ec8f32e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14215)
 
   * 1c2b2822737835978a64d21d80838a4a5a30f951 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14691)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7771: [DO NOT MERGE] Test PR 5631 with FF on: Improve defaults of early conflict detection configs

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7771:
URL: https://github.com/apache/hudi/pull/7771#issuecomment-1407280756

   
   ## CI report:
   
   * 568c02cb4875155af7a99e0afb05ac43906a69c3 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14690)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7768: [HUDI-5630] Fixing flaky parquet projection tests

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7768:
URL: https://github.com/apache/hudi/pull/7768#issuecomment-1407280724

   
   ## CI report:
   
   * 28b8b5d1c0a737da8a706b43305dcc852825457d Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14687)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7770: [HUDI-5631] Improve defaults of early conflict detection configs

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7770:
URL: https://github.com/apache/hudi/pull/7770#issuecomment-1407280747

   
   ## CI report:
   
   * acff503192ea74b693532da26dc91bac782c56cb Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14689)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7769: [HUDI-5633] Fixing performance regression in `HoodieSparkRecord`

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7769:
URL: https://github.com/apache/hudi/pull/7769#issuecomment-1407280738

   
   ## CI report:
   
   * 0ece5561859923b8773d6ff9fa633f014c104300 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14688)
 
   * 80d38554649038cb9e668be4edc3a3c0a2c4373f UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7679: [HUDI-5563] Check table exist before drop table

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7679:
URL: https://github.com/apache/hudi/pull/7679#issuecomment-1407280667

   
   ## CI report:
   
   * 18e390314ee0744e0f6a23d1293f3b4338750af3 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14362)
 
   * c29efcd919af87dba3fc8499ab57647a51c25ce6 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7680: [HUDI-5548] spark sql show|update hudi's table properties

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7680:
URL: https://github.com/apache/hudi/pull/7680#issuecomment-1407280678

   
   ## CI report:
   
   * df3a787ab69d1a3ac0ff854b671699e0a55dc01d Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14452)
 
   * 0970573f82ef1a49184d1875975463f76f7d791d Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14686)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7615: [HUDI-5510] Reload active timeline when commit finish

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7615:
URL: https://github.com/apache/hudi/pull/7615#issuecomment-1407280623

   
   ## CI report:
   
   * 1b42230e664a1f4554bc072e3198ee4f2ec8f32e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14215)
 
   * 1c2b2822737835978a64d21d80838a4a5a30f951 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7410: [HUDI-5634] imporve cdc-related codes

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7410:
URL: https://github.com/apache/hudi/pull/7410#issuecomment-1407280560

   
   ## CI report:
   
   * 1195034b3a3dc06084733ae8572ffebc2b79d295 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13556)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13633)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13874)
 
   * c3d82c532fc2f48e0d75fae3b7e69dd6305dafbf Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14685)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[jira] [Updated] (HUDI-5633) Fixing HoodieSparkRecord performance bottlenecks

2023-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-5633:
-
Labels: pull-request-available  (was: )

> Fixing HoodieSparkRecord performance bottlenecks
> 
>
> Key: HUDI-5633
> URL: https://issues.apache.org/jira/browse/HUDI-5633
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Alexey Kudinkin
>Assignee: Alexey Kudinkin
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.13.0
>
>
> The current HoodieSparkRecord implementation has the following issues:
>  # It rewrites records using `rewriteRecord` and `rewriteRecordWithNewSchema`, 
> which traverse the schema for every record. Instead, we should traverse the 
> schema only once and produce a transformer that directly creates the new 
> record from the old one.
>  # Records are currently copied for every Executor, even for the Simple one, 
> which does not buffer any records and therefore doesn't require records 
> to be copied.
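
The first item in the description above can be illustrated with a minimal sketch. It is not Hudi's implementation: schemas are modeled as plain lists of field names, records as `Object[]`, and the class and method names (`RecordRewriterSketch`, `buildRewriter`) are hypothetical. The point is only that the old-to-new field mapping is resolved once, and the returned function then rewrites each record without touching the schemas again.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration; names and the list-based "schema" are assumptions.
public class RecordRewriterSketch {

    // Resolve the old-to-new field mapping once per schema pair and return a
    // function that rewrites records using only the precomputed index array.
    static Function<Object[], Object[]> buildRewriter(List<String> oldFields,
                                                      List<String> newFields) {
        final int[] sourceIndex = new int[newFields.size()];
        for (int i = 0; i < sourceIndex.length; i++) {
            // indexOf yields -1 when the new schema has a field the old one lacks
            sourceIndex[i] = oldFields.indexOf(newFields.get(i));
        }
        return oldRecord -> {
            Object[] newRecord = new Object[sourceIndex.length];
            for (int i = 0; i < sourceIndex.length; i++) {
                // Missing fields are filled with null in this sketch
                newRecord[i] = sourceIndex[i] >= 0 ? oldRecord[sourceIndex[i]] : null;
            }
            return newRecord;
        };
    }

    public static void main(String[] args) {
        List<String> oldFields = Arrays.asList("id", "name", "ts");
        List<String> newFields = Arrays.asList("ts", "id", "city");

        // The schema comparison happens here, once
        Function<Object[], Object[]> rewriter = buildRewriter(oldFields, newFields);

        // Each record is then rewritten by index lookups only
        Object[] rewritten = rewriter.apply(new Object[] {1, "a", 100L});
        System.out.println(Arrays.toString(rewritten)); // prints [100, 1, null]
    }
}
```

With a per-record rewrite, the name lookups in the first loop would run for every record; here they run once per schema pair.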



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [hudi] hudi-bot commented on pull request #7771: [DO NOT MERGE] Test HUDI-5631 with FF on: Improve defaults of early conflict detection configs

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7771:
URL: https://github.com/apache/hudi/pull/7771#issuecomment-1407278874

   
   ## CI report:
   
   * 568c02cb4875155af7a99e0afb05ac43906a69c3 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7769: [HUDI-5633] Fixing performance regression in `HoodieSparkRecord`

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7769:
URL: https://github.com/apache/hudi/pull/7769#issuecomment-1407278862

   
   ## CI report:
   
   * 0ece5561859923b8773d6ff9fa633f014c104300 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7770: [HUDI-5631] Improve defaults of early conflict detection configs

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7770:
URL: https://github.com/apache/hudi/pull/7770#issuecomment-1407278868

   
   ## CI report:
   
   * acff503192ea74b693532da26dc91bac782c56cb UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7767: [HUDI-5629] Clean CDC log files for enable/disable scenario

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7767:
URL: https://github.com/apache/hudi/pull/7767#issuecomment-1407278843

   
   ## CI report:
   
   * 461021069263f049ee764a74294ec596c9c6b8b0 UNKNOWN
   * 81c7f21d6f8c81490d2e4a5fef3323f6d670449d Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14683)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7768: [HUDI-5630] Fixing flaky parquet projection tests

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7768:
URL: https://github.com/apache/hudi/pull/7768#issuecomment-1407278856

   
   ## CI report:
   
   * 28b8b5d1c0a737da8a706b43305dcc852825457d UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7680: [HUDI-5548] spark sql show|update hudi's table properties

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7680:
URL: https://github.com/apache/hudi/pull/7680#issuecomment-1407278795

   
   ## CI report:
   
   * df3a787ab69d1a3ac0ff854b671699e0a55dc01d Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14452)
 
   * 0970573f82ef1a49184d1875975463f76f7d791d UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[jira] [Updated] (HUDI-5634) Imporve cdc-related codes

2023-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-5634:
-
Labels: pull-request-available  (was: )

> Imporve cdc-related codes
> -
>
> Key: HUDI-5634
> URL: https://issues.apache.org/jira/browse/HUDI-5634
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: core
>Reporter: Yann Byron
>Assignee: Yann Byron
>Priority: Major
>  Labels: pull-request-available
>
> This ticket addresses some comments left in 
> https://github.com/apache/hudi/pull/6727.





[GitHub] [hudi] hudi-bot commented on pull request #7410: [HUDI-5634] imporve cdc-related codes

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7410:
URL: https://github.com/apache/hudi/pull/7410#issuecomment-1407278713

   
   ## CI report:
   
   * 1195034b3a3dc06084733ae8572ffebc2b79d295 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13556)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13633)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=13874)
 
   * c3d82c532fc2f48e0d75fae3b7e69dd6305dafbf UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] YannByron commented on a diff in pull request #7410: [HUDI-3478] imporve cdc-related codes

2023-01-27 Thread via GitHub


YannByron commented on code in PR #7410:
URL: https://github.com/apache/hudi/pull/7410#discussion_r1089616197


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/cdc/HoodieCDCRDD.scala:
##
@@ -157,7 +154,7 @@ class HoodieCDCRDD(
 split.changes.last.getInstant,
 recordKeyField,
 preCombineFieldOpt,
-usesVirtualKeys = false,
+usesVirtualKeys = !populateMetaFields,

Review Comment:
   Restore this first. Let's keep this PR focused on code improvement.






[GitHub] [hudi] hudi-bot commented on pull request #7767: [HUDI-5629] Clean CDC log files for enable/disable scenario

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7767:
URL: https://github.com/apache/hudi/pull/7767#issuecomment-1407277186

   
   ## CI report:
   
   * 461021069263f049ee764a74294ec596c9c6b8b0 UNKNOWN
   * 81c7f21d6f8c81490d2e4a5fef3323f6d670449d UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7759: [HUDI-5624] Fix HoodieAvroRecordMerger to use new precombine API

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7759:
URL: https://github.com/apache/hudi/pull/7759#issuecomment-1407277156

   
   ## CI report:
   
   * 8bb795346fd54da170b3282a3a647bbe64c818cb Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14679)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #7764: [HUDI-5628] Fixing log record reader scan V2 config name

2023-01-27 Thread via GitHub


hudi-bot commented on PR #7764:
URL: https://github.com/apache/hudi/pull/7764#issuecomment-1407277171

   
   ## CI report:
   
   * ebcae89f01be1f36d59d23006a2580d0e99b04f1 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14680)
 Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14678)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] YannByron commented on a diff in pull request #7410: [HUDI-3478] imporve cdc-related codes

2023-01-27 Thread via GitHub


YannByron commented on code in PR #7410:
URL: https://github.com/apache/hudi/pull/7410#discussion_r1089614628


##
hudi-common/src/main/java/org/apache/hudi/common/table/cdc/HoodieCDCExtractor.java:
##
@@ -267,7 +267,7 @@ private HoodieCDCFileSplit parseWriteStat(
   FileSlice beforeFileSlice = new FileSlice(fileGroupId, 
writeStat.getPrevCommit(), beforeBaseFile, Collections.emptyList());
   cdcFileSplit = new HoodieCDCFileSplit(instantTs, BASE_FILE_DELETE, 
new ArrayList<>(), Option.empty(), Option.of(beforeFileSlice));
 } else if (writeStat.getNumUpdateWrites() == 0L && 
writeStat.getNumDeletes() == 0
-&& writeStat.getNumWrites() == writeStat.getNumInserts()) {
+&& writeStat.getNumWrites() > 0) {

Review Comment:
   In my view, `writeStat.getNumWrites() == writeStat.getNumInserts()` is 
correct. The change was made only to address this comment: 
https://github.com/apache/hudi/pull/6727#discussion_r980481223. If the case 
mentioned in that comment is a real problem, it should be fixed elsewhere in 
the code, not here. I will roll back this change.
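
To make the difference between the two conditions in the diff concrete, here is a small sketch. `Stat` is a hypothetical stand-in for the four counters read from the write stat, not Hudi's class, and the "mixed" case (more records written than inserted, with no updates or deletes) is an assumed scenario chosen only to show where the predicates diverge.

```java
// Hypothetical illustration; Stat and the sample counter values are assumptions.
public class InsertOnlyCheckSketch {

    // Stand-in for the counters the diff reads from the write stat
    static final class Stat {
        final long numWrites;
        final long numInserts;
        final long numUpdateWrites;
        final long numDeletes;

        Stat(long numWrites, long numInserts, long numUpdateWrites, long numDeletes) {
            this.numWrites = numWrites;
            this.numInserts = numInserts;
            this.numUpdateWrites = numUpdateWrites;
            this.numDeletes = numDeletes;
        }
    }

    // Condition on the removed line: every written record is an insert
    static boolean strictCheck(Stat s) {
        return s.numUpdateWrites == 0L && s.numDeletes == 0
            && s.numWrites == s.numInserts;
    }

    // Condition on the added line: something was written, with no updates or deletes
    static boolean relaxedCheck(Stat s) {
        return s.numUpdateWrites == 0L && s.numDeletes == 0
            && s.numWrites > 0;
    }

    public static void main(String[] args) {
        Stat pureInsert = new Stat(10, 10, 0, 0);
        // Assumed case: 10 records written, only 4 of them newly inserted
        Stat mixed = new Stat(10, 4, 0, 0);

        System.out.println(strictCheck(pureInsert) + " " + relaxedCheck(pureInsert)); // prints true true
        System.out.println(strictCheck(mixed) + " " + relaxedCheck(mixed));           // prints false true
    }
}
```

The two predicates agree on a file that contains only inserts and differ only when written records outnumber inserted ones, which is the case the review comment says should be handled elsewhere.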






[GitHub] [hudi] yihua opened a new pull request, #7771: [DO NOT MERGE] Test HUDI-5631 with FF on: Improve defaults of early conflict detection configs

2023-01-27 Thread via GitHub


yihua opened a new pull request, #7771:
URL: https://github.com/apache/hudi/pull/7771

   ### Change Logs
   
   Run tests in CI only
   
   ### Impact
   
   N/A
   
   ### Risk level
   
   none
   
   ### Documentation Update
   
   N/A
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[jira] [Assigned] (HUDI-5634) Imporve cdc-related codes

2023-01-27 Thread Yann Byron (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yann Byron reassigned HUDI-5634:


Assignee: Yann Byron

> Imporve cdc-related codes
> -
>
> Key: HUDI-5634
> URL: https://issues.apache.org/jira/browse/HUDI-5634
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: core
>Reporter: Yann Byron
>Assignee: Yann Byron
>Priority: Major
>
> This ticket addresses some comments left in 
> https://github.com/apache/hudi/pull/6727.





[jira] [Created] (HUDI-5634) Imporve cdc-related codes

2023-01-27 Thread Yann Byron (Jira)
Yann Byron created HUDI-5634:


 Summary: Imporve cdc-related codes
 Key: HUDI-5634
 URL: https://issues.apache.org/jira/browse/HUDI-5634
 Project: Apache Hudi
  Issue Type: Improvement
  Components: core
Reporter: Yann Byron


This ticket addresses some comments left in 
https://github.com/apache/hudi/pull/6727.





[jira] [Updated] (HUDI-5633) Fixing HoodieSparkRecord performance bottlenecks

2023-01-27 Thread Alexey Kudinkin (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Kudinkin updated HUDI-5633:
--
Fix Version/s: 0.13.0

> Fixing HoodieSparkRecord performance bottlenecks
> 
>
> Key: HUDI-5633
> URL: https://issues.apache.org/jira/browse/HUDI-5633
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Alexey Kudinkin
>Assignee: Alexey Kudinkin
>Priority: Blocker
> Fix For: 0.13.0
>
>
> The current HoodieSparkRecord implementation has the following issues:
>  # It rewrites records using `rewriteRecord` and `rewriteRecordWithNewSchema`, 
> which traverse the schema for every record. Instead, we should traverse the 
> schema only once and produce a transformer that directly creates the new 
> record from the old one.
>  # Records are currently copied for every Executor, even for the Simple one, 
> which does not buffer any records and therefore doesn't require records 
> to be copied.
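
The second item, copying records only when the consumer buffers them, can be sketched as follows. This is an illustration under assumptions, not Hudi's executor code: it assumes the upstream iterator reuses one mutable object for every record, and all names (`CopyOnBufferSketch`, `reusingSource`, `drainBuffered`, `drainInline`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration; the reuse behavior of the source is an assumption.
public class CopyOnBufferSketch {

    // A source that hands out the same mutable array for every record
    static Iterator<int[]> reusingSource(final int count) {
        final int[] shared = new int[1];
        return new Iterator<int[]>() {
            private int emitted = 0;

            @Override
            public boolean hasNext() {
                return emitted < count;
            }

            @Override
            public int[] next() {
                shared[0] = emitted++;
                return shared;
            }
        };
    }

    // Buffering consumer: it keeps references past the next call to next(),
    // so it has to copy each record to preserve its value
    static List<Integer> drainBuffered(Iterator<int[]> source, boolean copy) {
        List<int[]> buffer = new ArrayList<>();
        while (source.hasNext()) {
            int[] rec = source.next();
            buffer.add(copy ? rec.clone() : rec);
        }
        List<Integer> out = new ArrayList<>();
        for (int[] rec : buffer) {
            out.add(rec[0]);
        }
        return out;
    }

    // Non-buffering consumer: each record is fully handled before the next
    // one is requested, so no copy is needed
    static List<Integer> drainInline(Iterator<int[]> source) {
        List<Integer> out = new ArrayList<>();
        while (source.hasNext()) {
            out.add(source.next()[0]);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drainBuffered(reusingSource(3), true));  // prints [0, 1, 2]
        System.out.println(drainBuffered(reusingSource(3), false)); // prints [2, 2, 2]
        System.out.println(drainInline(reusingSource(3)));          // prints [0, 1, 2]
    }
}
```

The inline consumer gets correct results with no copies at all, which is the saving the description points at for an executor that does not buffer.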





[GitHub] [hudi] danny0405 commented on a diff in pull request #7410: [HUDI-3478] imporve cdc-related codes

2023-01-27 Thread via GitHub


danny0405 commented on code in PR #7410:
URL: https://github.com/apache/hudi/pull/7410#discussion_r1089610242


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/cdc/HoodieCDCRDD.scala:
##
@@ -157,7 +154,7 @@ class HoodieCDCRDD(
 split.changes.last.getInstant,
 recordKeyField,
 preCombineFieldOpt,
-usesVirtualKeys = false,
+usesVirtualKeys = !populateMetaFields,

Review Comment:
   Seems critical, should be merged for RC2





