Re: [PR] [HUDI-7279] make sampling rate configurable for BOUNDED_IN_MEMORY executor type [hudi]

2024-01-09 Thread via GitHub


bvaradar merged PR #10459:
URL: https://github.com/apache/hudi/pull/10459


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7279] make sampling rate configurable for BOUNDED_IN_MEMORY executor type [hudi]

2024-01-09 Thread via GitHub


waitingF commented on PR #10459:
URL: https://github.com/apache/hudi/pull/10459#issuecomment-1884079388

   > Thanks for addressing the comments. There are a couple of [test failures](https://github.com/apache/hudi/actions/runs/7447573282/job/20260155012?pr=10459) in `TestHoodieSparkSqlWriter#testDeletePartitionsV2` and `#testSchemaEvolutionForTableType` due to OOM. Can you please check?
   > 
   > ```
   > Error:  Tests run: 52, Failures: 0, Errors: 5, Skipped: 0, Time elapsed: 1,036.092 s <<< FAILURE! - in org.apache.hudi.TestHoodieSparkSqlWriter
   > Error:  testDeletePartitionsV2{boolean}[1]  Time elapsed: 110.88 s  <<< ERROR!
   > org.apache.spark.SparkException: 
   > Job aborted due to stage failure: Task 2 in stage 76.0 failed 1 times, most recent failure: Lost task 2.0 in stage 76.0 (TID 128, fv-az1501-788.rnq23jqhr0re1ds00u55qn22fh.cx.internal.cloudapp.net, executor driver): org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquire 92 bytes of memory, got 0
   > ```
   
   It seems the test failure was introduced by another PR; merging master fixed it.





Re: [PR] [HUDI-7279] make sampling rate configurable for BOUNDED_IN_MEMORY executor type [hudi]

2024-01-09 Thread via GitHub


hudi-bot commented on PR #10459:
URL: https://github.com/apache/hudi/pull/10459#issuecomment-1882897505

   
   ## CI report:
   
   * e36f217680f5a18af1c37ad8a44f1f1e4626dbb7 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21883)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-7279] make sampling rate configurable for BOUNDED_IN_MEMORY executor type [hudi]

2024-01-09 Thread via GitHub


hudi-bot commented on PR #10459:
URL: https://github.com/apache/hudi/pull/10459#issuecomment-1882625603

   
   ## CI report:
   
   * e0a43ce9e388b4c8daf83c4ced333f8435de9991 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21862)
   * e36f217680f5a18af1c37ad8a44f1f1e4626dbb7 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21883)
   
   



Re: [PR] [HUDI-7279] make sampling rate configurable for BOUNDED_IN_MEMORY executor type [hudi]

2024-01-09 Thread via GitHub


hudi-bot commented on PR #10459:
URL: https://github.com/apache/hudi/pull/10459#issuecomment-1882614165

   
   ## CI report:
   
   * e0a43ce9e388b4c8daf83c4ced333f8435de9991 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21862)
   * e36f217680f5a18af1c37ad8a44f1f1e4626dbb7 UNKNOWN
   
   



Re: [PR] [HUDI-7279] make sampling rate configurable for BOUNDED_IN_MEMORY executor type [hudi]

2024-01-08 Thread via GitHub


hudi-bot commented on PR #10459:
URL: https://github.com/apache/hudi/pull/10459#issuecomment-1881211302

   
   ## CI report:
   
   * e0a43ce9e388b4c8daf83c4ced333f8435de9991 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21862)
   
   



Re: [PR] [HUDI-7279] make sampling rate configurable for BOUNDED_IN_MEMORY executor type [hudi]

2024-01-08 Thread via GitHub


hudi-bot commented on PR #10459:
URL: https://github.com/apache/hudi/pull/10459#issuecomment-1881101296

   
   ## CI report:
   
   * 889a89640b0db39545469a625c0b961829f0aa0a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21860)
   * e0a43ce9e388b4c8daf83c4ced333f8435de9991 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21862)
   
   



Re: [PR] [HUDI-7279] make sampling rate configurable for BOUNDED_IN_MEMORY executor type [hudi]

2024-01-08 Thread via GitHub


hudi-bot commented on PR #10459:
URL: https://github.com/apache/hudi/pull/10459#issuecomment-1881023369

   
   ## CI report:
   
   * 889a89640b0db39545469a625c0b961829f0aa0a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21860)
   * e0a43ce9e388b4c8daf83c4ced333f8435de9991 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21862)
   
   



Re: [PR] [HUDI-7279] make sampling rate configurable for BOUNDED_IN_MEMORY executor type [hudi]

2024-01-08 Thread via GitHub


hudi-bot commented on PR #10459:
URL: https://github.com/apache/hudi/pull/10459#issuecomment-1881009514

   
   ## CI report:
   
   * 889a89640b0db39545469a625c0b961829f0aa0a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21860)
   * e0a43ce9e388b4c8daf83c4ced333f8435de9991 UNKNOWN
   
   



Re: [PR] [HUDI-7279] make sampling rate configurable for BOUNDED_IN_MEMORY executor type [hudi]

2024-01-08 Thread via GitHub


waitingF commented on code in PR #10459:
URL: https://github.com/apache/hudi/pull/10459#discussion_r1444569349


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java:
##
@@ -353,6 +353,18 @@ public class HoodieWriteConfig extends HoodieConfig {
   .markAdvanced()
   .withDocumentation("Size of in-memory buffer used for parallelizing network reads and lake storage writes.");
 
+  public static final ConfigProperty WRITE_BUFFER_RECORD_SAMPLING_RATE = ConfigProperty
+  .key("hoodie.write.buffer.record.sampling.rate")
+  .defaultValue(String.valueOf(64))
+  .markAdvanced()

Review Comment:
   sure, done






Re: [PR] [HUDI-7279] make sampling rate configurable for BOUNDED_IN_MEMORY executor type [hudi]

2024-01-08 Thread via GitHub


hudi-bot commented on PR #10459:
URL: https://github.com/apache/hudi/pull/10459#issuecomment-1880838207

   
   ## CI report:
   
   * 889a89640b0db39545469a625c0b961829f0aa0a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21860)
   
   



Re: [PR] [HUDI-7279] make sampling rate configurable for BOUNDED_IN_MEMORY executor type [hudi]

2024-01-08 Thread via GitHub


hudi-bot commented on PR #10459:
URL: https://github.com/apache/hudi/pull/10459#issuecomment-1880827643

   
   ## CI report:
   
   * 889a89640b0db39545469a625c0b961829f0aa0a UNKNOWN
   
   



Re: [PR] [HUDI-7279] make sampling rate configurable for BOUNDED_IN_MEMORY executor type [hudi]

2024-01-08 Thread via GitHub


codope commented on code in PR #10459:
URL: https://github.com/apache/hudi/pull/10459#discussion_r148100


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java:
##
@@ -353,6 +353,18 @@ public class HoodieWriteConfig extends HoodieConfig {
   .markAdvanced()
   .withDocumentation("Size of in-memory buffer used for parallelizing network reads and lake storage writes.");
 
+  public static final ConfigProperty WRITE_BUFFER_RECORD_SAMPLING_RATE = ConfigProperty
+  .key("hoodie.write.buffer.record.sampling.rate")
+  .defaultValue(String.valueOf(64))
+  .markAdvanced()

Review Comment:
   Please also add `.sinceVersion("1.0.0")` for both the configs
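
   As a rough illustration of the request above, the sketch below shows where a `.sinceVersion("1.0.0")` call would sit in a fluent config definition. `SketchConfigProperty` and `SketchWriteConfig` are hypothetical stand-ins written for this example, not Hudi's `ConfigProperty` or `HoodieWriteConfig`, and the documentation string is a placeholder. Only the key, the default of 64, `.markAdvanced()` and `.sinceVersion("1.0.0")` are taken from this thread.

```java
// Hypothetical, minimal stand-in for a fluent config-property builder.
// It is NOT Hudi's ConfigProperty; it only mirrors the call chain in the diff.
final class SketchConfigProperty {
  private final String key;
  private final String defaultValue;
  private final boolean advanced;
  private final String sinceVersion;
  private final String documentation;

  private SketchConfigProperty(String key, String defaultValue, boolean advanced,
                               String sinceVersion, String documentation) {
    this.key = key;
    this.defaultValue = defaultValue;
    this.advanced = advanced;
    this.sinceVersion = sinceVersion;
    this.documentation = documentation;
  }

  // Entry point: start a definition from its key.
  static SketchConfigProperty key(String key) {
    return new SketchConfigProperty(key, null, false, null, "");
  }

  // Each builder step returns a new immutable copy with one field changed.
  SketchConfigProperty defaultValue(String value) {
    return new SketchConfigProperty(key, value, advanced, sinceVersion, documentation);
  }

  SketchConfigProperty markAdvanced() {
    return new SketchConfigProperty(key, defaultValue, true, sinceVersion, documentation);
  }

  SketchConfigProperty sinceVersion(String version) {
    return new SketchConfigProperty(key, defaultValue, advanced, version, documentation);
  }

  SketchConfigProperty withDocumentation(String doc) {
    return new SketchConfigProperty(key, defaultValue, advanced, sinceVersion, doc);
  }

  String getKey() { return key; }
  String getDefaultValue() { return defaultValue; }
  boolean isAdvanced() { return advanced; }
  String getSinceVersion() { return sinceVersion; }
  String getDocumentation() { return documentation; }
}

// Hypothetical holder class showing the definition with the requested call added.
final class SketchWriteConfig {
  static final SketchConfigProperty WRITE_BUFFER_RECORD_SAMPLING_RATE = SketchConfigProperty
      .key("hoodie.write.buffer.record.sampling.rate")
      .defaultValue(String.valueOf(64))
      .markAdvanced()
      .sinceVersion("1.0.0") // the addition requested in the review comment
      .withDocumentation("Placeholder text; the real documentation string is not shown in this thread.");
}
```

   The second config mentioned in the comment is not visible in the quoted hunk, so it is left out of the sketch.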


