[PR] [MINOR][DNM][TESTING] Flink bundle testing 6 [hudi]

2024-05-26 Thread via GitHub


yihua opened a new pull request, #11321:
URL: https://github.com/apache/hudi/pull/11321

   ### Change Logs
   
   _Describe context and summary for this change. Highlight if any code was copied._
   
   ### Impact
   
   _Describe any public API or user-facing feature change or any performance impact._
   
   ### Risk level (write none, low, medium or high below)
   
   _If medium or high, explain what verification was done to mitigate the risks._
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, config, or user-facing change. If not, put "none"._
   
   - _The config description must be updated if new configs are added or the default values of configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [MINOR][DNM][TESTING] Flink bundle testing 6 [hudi]

2024-05-26 Thread via GitHub


yihua opened a new pull request, #11320:
URL: https://github.com/apache/hudi/pull/11320




[PR] [MINOR][DNM][TESTING] Flink bundle testing 5 [hudi]

2024-05-26 Thread via GitHub


yihua opened a new pull request, #11319:
URL: https://github.com/apache/hudi/pull/11319




[PR] [MINOR][DNM][TESTING] Flink bundle testing 4 [hudi]

2024-05-26 Thread via GitHub


yihua opened a new pull request, #11318:
URL: https://github.com/apache/hudi/pull/11318




[PR] [MINOR][DNM][TESTING] Flink bundle testing 3 [hudi]

2024-05-26 Thread via GitHub


yihua opened a new pull request, #11317:
URL: https://github.com/apache/hudi/pull/11317




[PR] [MINOR][DNM][TESTING] Flink bundle testing 2 [hudi]

2024-05-26 Thread via GitHub


yihua opened a new pull request, #11316:
URL: https://github.com/apache/hudi/pull/11316




[PR] [MINOR][DNM][TESTING] Flink bundle testing 1 [hudi]

2024-05-26 Thread via GitHub


yihua opened a new pull request, #11315:
URL: https://github.com/apache/hudi/pull/11315




[jira] [Updated] (HUDI-7802) Fix bundle validation scripts

2024-05-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-7802:
-
Labels: pull-request-available  (was: )

> Fix bundle validation scripts
> -
>
> Key: HUDI-7802
> URL: https://issues.apache.org/jira/browse/HUDI-7802
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.15.0, 1.0.0
>
>
> Issues:
>  * Bundle validation with packaging/bundle-validation/ci_run.sh fails for 
> release-0.15.0 branch due to script issue
>  * scripts/release/validate_staged_bundles.sh needs to include additional 
> bundles.
>  * Add release candidate validation on scala 2.13 bundles.
>  * Disable release candidate validation by default.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [HUDI-7802] Fix bundle validation scripts [hudi]

2024-05-26 Thread via GitHub


yihua opened a new pull request, #11314:
URL: https://github.com/apache/hudi/pull/11314

   ### Change Logs
   
   This PR makes several fixes to the bundle validation scripts:
   - `.github/workflows/bot.yml`: fixes the Scala version for Spark 2 bundle validation
   - `.github/workflows/release_candidate_validation.yml`
     - Adds Spark 3.5 and Scala 2.13 validation
     - Removes ignored paths to avoid CI actions being cancelled
   - `packaging/bundle-validation/ci_run.sh`
     - Uses `STAGING_REPO_NUM` to determine whether to validate built bundle jars or release candidate jars
     - Fixes Spark 3.5 and Scala 2.13 validation
   - `scripts/release/validate_staged_bundles.sh`
     - Adds new bundle jars to validate
     - Parallelizes curl requests to save time
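
As a rough illustration of two of the changes listed above, the sketch below shows how a script might branch on `STAGING_REPO_NUM` and how independent checks can be run in parallel as background jobs. This is not the code from the PR: the function names `validation_mode`, `run_checks_in_parallel` and `check_url`, and the two mode strings, are assumptions made for the example. Only `STAGING_REPO_NUM` and the use of curl come from the description.

```shell
#!/bin/sh
# Hypothetical sketch; not taken from ci_run.sh or validate_staged_bundles.sh.

# Print which jars to validate: release candidate jars when STAGING_REPO_NUM
# is set to a non-empty value, locally built bundle jars otherwise.
validation_mode() {
  if [ -n "${STAGING_REPO_NUM:-}" ]; then
    echo "release_candidate"
  else
    echo "built_bundles"
  fi
}

# Run "<checker> <item>" for every item as a background job, then wait for
# all of them. Returns 0 only if every job succeeded.
run_checks_in_parallel() {
  rcp_checker=$1
  shift
  rcp_pids=""
  rcp_rc=0
  for rcp_item in "$@"; do
    "$rcp_checker" "$rcp_item" &
    rcp_pids="$rcp_pids $!"
  done
  for rcp_pid in $rcp_pids; do
    wait "$rcp_pid" || rcp_rc=1
  done
  return "$rcp_rc"
}

# Example checker (assumed): a HEAD request that fails on HTTP error responses.
check_url() {
  curl --silent --head --fail --output /dev/null "$1"
}
```

With these helpers, `run_checks_in_parallel check_url "$url_a" "$url_b"` would issue both requests concurrently and report failure if either one fails. Waiting on each PID separately is what lets the caller collect every job's exit status.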
   
   ### Impact
   
   Improves bundle validation scripts.
   
   ### Risk level
   
   none
   
   ### Documentation Update
   
   none
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[PR] [HUDI-7707] Enable bundle validation on Java 8 and 11 [hudi]

2024-05-26 Thread via GitHub


yihua opened a new pull request, #11313:
URL: https://github.com/apache/hudi/pull/11313

   ### Change Logs
   
   PR targeting master: https://github.com/apache/hudi/pull/11142
   This PR targets `branch-0.x` with the same changes.
   
   Bundle validation on Java 8 and 11 is skipped in GH CI.  This PR re-enables it by fixing `bot.yml`.
   
   This PR also changes `packaging/bundle-validation/ci_run.sh` to take the Docker container name, avoiding name collisions within the same GH CI task.
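
The name-collision fix can be pictured as follows: if two validation runs in the same CI task start containers under the same fixed name, the second `docker run --name` is rejected because the name is already in use, so the name is supplied by the caller instead. The sketch below only builds the command line. The function name, the default container name, and the image tag are placeholders invented for the example, not values from `ci_run.sh`.

```shell
#!/bin/sh
# Hypothetical sketch; the names and the image tag are placeholders.

# Print the `docker run` command for a validation run. The container name is
# the first argument so that concurrent runs can each use a distinct name.
docker_run_command() {
  drc_name=${1:-bundle_validation}
  drc_image=${2:-example/bundle-validation:latest}
  printf 'docker run --name %s %s\n' "$drc_name" "$drc_image"
}

docker_run_command validation_java8
docker_run_command validation_java11
```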
   
   ### Impact
   
   Improves bundle validation coverage.
   
   ### Risk level
   
   none
   
   ### Documentation Update
   
   none
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





Re: [PR] [HUDI-7707] Enable bundle validation on Java 8 and 11 [hudi]

2024-05-26 Thread via GitHub


yihua commented on PR #11142:
URL: https://github.com/apache/hudi/pull/11142#issuecomment-2132495956

   All bundle validations pass.  Merging the PR.





(hudi) branch master updated (b1ebcb7b95b -> b51d61ad7ce)

2024-05-26 Thread yihua
This is an automated email from the ASF dual-hosted git repository.

yihua pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


    from b1ebcb7b95b [HUDI-7801] Directly pass down HoodieStorage instance instead of recreation (#11309)
     add b51d61ad7ce [HUDI-7707] Enable bundle validation on Java 8 and 11 (#11142)

No new revisions were added by this update.

Summary of changes:
 .github/workflows/bot.yml                          | 8 +---
 .github/workflows/release_candidate_validation.yml | 6 +++---
 packaging/bundle-validation/ci_run.sh              | 9 +
 packaging/bundle-validation/validate.sh            | 2 +-
 4 files changed, 14 insertions(+), 11 deletions(-)



Re: [PR] [HUDI-7707] Enable bundle validation on Java 8 and 11 [hudi]

2024-05-26 Thread via GitHub


yihua merged PR #11142:
URL: https://github.com/apache/hudi/pull/11142





Re: [PR] [HUDI-7707][DNM] Enable bundle validation on Java 8 and 11 [hudi]

2024-05-26 Thread via GitHub


yihua commented on PR #11142:
URL: https://github.com/apache/hudi/pull/11142#issuecomment-2132492653

   Bundle validation runs on Java 8 and 11 again.
   https://github.com/apache/hudi/assets/2497195/ad803a8c-219a-4631-948c-f77493e86348
   https://github.com/apache/hudi/assets/2497195/1cb33604-d228-4bcd-9c7d-da300818c541
   





[jira] [Updated] (HUDI-7803) Fix bundle validation on Flink 1.18

2024-05-26 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7803:

Description: 
For Flink 1.18 we're seeing this error
{code:java}
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/bundle-validation/flink-1.18.0/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/bundle-validation/hadoop-3.3.5/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
May 27, 2024 12:11:29 AM org.jline.utils.Log logr
WARNING: Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information)
[INFO] Executing SQL from file.

Command history file path: /root/.flink-sql-history
Flink SQL> /*
>  * Licensed to the Apache Software Foundation (ASF) under one
>  * or more contributor license agreements.  See the NOTICE file
>  * distributed with this work for additional information
>  * regarding copyright ownership.  The ASF licenses this file
>  * to you under the Apache License, Version 2.0 (the
>  * "License"); you may not use this file except in compliance
>  * with the License.  You may obtain a copy of the License at
>  *
>  *   http://www.apache.org/licenses/LICENSE-2.0
>  *
>  * Unless required by applicable law or agreed to in writing,
>  * software distributed under the License is distributed on an
>  * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
>  * KIND, either express or implied.  See the License for the
>  * specific language governing permissions and limitations
>  * under the License.
>  */
> 
> CREATE TABLE t1
> (
>     uuid        VARCHAR(20) PRIMARY KEY NOT ENFORCED,
>     name        VARCHAR(10),
>     age         INT,
>     ts          TIMESTAMP(3),
>     `partition` VARCHAR(20)
> ) PARTITIONED BY (`partition`)
> WITH (
>   'connector' = 'hudi',
>   'table.type' = 'MERGE_ON_READ',
>   'metadata.enabled' = 'false', -- avoid classloader issue, class HFile can not be found
>   'path' = '/tmp/hudi-flink-bundle-test'
> )

Error: Exception in thread "main" org.apache.flink.table.client.SqlClientException: Unexpected exception. This is a bug. Please consider filing an issue.
    at org.apache.flink.table.client.SqlClient.startClient(SqlClient.java:242)
    at org.apache.flink.table.client.SqlClient.main(SqlClient.java:179)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.flink.table.client.config.SqlClientOptions
    at org.apache.flink.table.client.cli.CliClient.printExecutionException(CliClient.java:277)
    at org.apache.flink.table.client.cli.CliClient.executeFile(CliClient.java:245)
    at org.apache.flink.table.client.cli.CliClient.executeInNonInteractiveMode(CliClient.java:131)
    at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:171)
    at org.apache.flink.table.client.SqlClient.start(SqlClient.java:118)
    at org.apache.flink.table.client.SqlClient.startClient(SqlClient.java:228)
    ... 1 more
{code}

> Fix bundle validation on Flink 1.18
> ---
>
> Key: HUDI-7803
> URL: https://issues.apache.org/jira/browse/HUDI-7803
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Priority: Major
>

[jira] [Created] (HUDI-7803) Fix bundle validation on Flink 1.18

2024-05-26 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-7803:
---

 Summary: Fix bundle validation on Flink 1.18
 Key: HUDI-7803
 URL: https://issues.apache.org/jira/browse/HUDI-7803
 Project: Apache Hudi
  Issue Type: Bug
Reporter: Ethan Guo








[PR] [MINOR][Testing][DNM] Release 0.15.0 test bundle validation [hudi]

2024-05-26 Thread via GitHub


yihua opened a new pull request, #11312:
URL: https://github.com/apache/hudi/pull/11312

   ### Change Logs
   
   As above
   
   ### Impact
   
   Testing only
   
   ### Risk level
   
   none
   
   ### Documentation Update
   
   none
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[jira] [Updated] (HUDI-7802) Fix bundle validation scripts

2024-05-26 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7802:

Description: 
Issues:
 * Bundle validation with packaging/bundle-validation/ci_run.sh fails for 
release-0.15.0 branch due to script issue
 * scripts/release/validate_staged_bundles.sh needs to include additional 
bundles.
 * Add release candidate validation on scala 2.13 bundles.
 * Disable release candidate validation by default.

  was:
Issues:
 * Bundle validation with packaging/bundle-validation/ci_run.sh fails for 
release-0.15.0 branch due to script issue
 * scripts/release/validate_staged_bundles.sh needs to include additional 
bundles.
 * Disable release candidate validation by default.


> Fix bundle validation scripts
> -
>
> Key: HUDI-7802
> URL: https://issues.apache.org/jira/browse/HUDI-7802
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
> Fix For: 0.15.0, 1.0.0
>
>
> Issues:
>  * Bundle validation with packaging/bundle-validation/ci_run.sh fails for 
> release-0.15.0 branch due to script issue
>  * scripts/release/validate_staged_bundles.sh needs to include additional 
> bundles.
>  * Add release candidate validation on scala 2.13 bundles.
>  * Disable release candidate validation by default.





[jira] [Updated] (HUDI-7802) Fix bundle validation scripts

2024-05-26 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7802:

Description: 
Issues:
 * Bundle validation with packaging/bundle-validation/ci_run.sh fails for 
release-0.15.0 branch due to script issue
 * scripts/release/validate_staged_bundles.sh needs to include additional 
bundles.
 * Disable release candidate validation by default.

  was:
Issues:
 * Bundle validation with packaging/bundle-validation/ci_run.sh fails for 
release-0.15.0 branch due to script issue
 * scripts/release/validate_staged_bundles.sh needs to include additional 
bundles.


> Fix bundle validation scripts
> -
>
> Key: HUDI-7802
> URL: https://issues.apache.org/jira/browse/HUDI-7802
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
> Fix For: 0.15.0, 1.0.0
>
>
> Issues:
>  * Bundle validation with packaging/bundle-validation/ci_run.sh fails for 
> release-0.15.0 branch due to script issue
>  * scripts/release/validate_staged_bundles.sh needs to include additional 
> bundles.
>  * Disable release candidate validation by default.





[jira] [Updated] (HUDI-7802) Fix bundle validation scripts

2024-05-26 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7802:

Description: 
Issues:
 * Bundle validation with packaging/bundle-validation/ci_run.sh fails for 
release-0.15.0 branch due to script issue
 * scripts/release/validate_staged_bundles.sh needs to include additional 
bundles.

> Fix bundle validation scripts
> -
>
> Key: HUDI-7802
> URL: https://issues.apache.org/jira/browse/HUDI-7802
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
> Fix For: 0.15.0, 1.0.0
>
>
> Issues:
>  * Bundle validation with packaging/bundle-validation/ci_run.sh fails for 
> release-0.15.0 branch due to script issue
>  * scripts/release/validate_staged_bundles.sh needs to include additional 
> bundles.





[jira] [Assigned] (HUDI-7802) Fix bundle validation scripts

2024-05-26 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo reassigned HUDI-7802:
---

Assignee: Ethan Guo

> Fix bundle validation scripts
> -
>
> Key: HUDI-7802
> URL: https://issues.apache.org/jira/browse/HUDI-7802
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
>






[jira] [Created] (HUDI-7802) Fix bundle validation scripts

2024-05-26 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-7802:
---

 Summary: Fix bundle validation scripts
 Key: HUDI-7802
 URL: https://issues.apache.org/jira/browse/HUDI-7802
 Project: Apache Hudi
  Issue Type: Bug
Reporter: Ethan Guo








[jira] [Updated] (HUDI-7802) Fix bundle validation scripts

2024-05-26 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7802:

Fix Version/s: 0.15.0
   1.0.0

> Fix bundle validation scripts
> -
>
> Key: HUDI-7802
> URL: https://issues.apache.org/jira/browse/HUDI-7802
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
> Fix For: 0.15.0, 1.0.0
>
>






Re: [I] [SUPPORT] Flink bucket index partitioner may cause data skew [hudi]

2024-05-26 Thread via GitHub


danny0405 commented on issue #11288:
URL: https://github.com/apache/hudi/issues/11288#issuecomment-2132172783

   I guess you are saying `(Hash(partition) % parallelism * bucket_num + bucket_id) % parallelism`?
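
Read with the usual operator precedence (`%` and `*` bind equally and associate left to right), the expression quoted above can be evaluated as in the sketch below. The function name and the sample values are made up for illustration, and the partition hash is assumed to be a non-negative integer computed elsewhere. This is not the partitioner code from Hudi.

```shell
#!/bin/sh
# Illustrative only: evaluates
#   (hash(partition) % parallelism * bucket_num + bucket_id) % parallelism
# for a non-negative, precomputed partition hash.
subtask_index() {
  si_hash=$1
  si_parallelism=$2
  si_bucket_num=$3
  si_bucket_id=$4
  echo $(( ( (si_hash % si_parallelism) * si_bucket_num + si_bucket_id ) % si_parallelism ))
}

# Sample values: hash=7, parallelism=4, bucket_num=3, bucket_id=2
subtask_index 7 4 3 2   # prints 3
```

Multiplying the per-partition offset by `bucket_num` before adding `bucket_id` gives each partition its own block of `bucket_num` consecutive slots before the final modulo is applied.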





[PR] [MINOR][TESTING] Release 0.15.0 branch [hudi]

2024-05-26 Thread via GitHub


yihua opened a new pull request, #11311:
URL: https://github.com/apache/hudi/pull/11311

   ### Change Logs
   
   As above
   
   ### Impact
   
   Testing only
   
   ### Risk level
   
   none
   
   ### Documentation Update
   
   none
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





(hudi) branch release-0.15.0 updated: [HUDI-7797] Use HoodieIOFactory to return pluggable FileFormatUtils implementation (#11310)

2024-05-26 Thread yihua
This is an automated email from the ASF dual-hosted git repository.

yihua pushed a commit to branch release-0.15.0
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/release-0.15.0 by this push:
     new b8796d0cef5 [HUDI-7797] Use HoodieIOFactory to return pluggable FileFormatUtils implementation (#11310)
b8796d0cef5 is described below

commit b8796d0cef55ebb0c3440ed1d8b279b749e43d49
Author: Y Ethan Guo 
AuthorDate: Sun May 26 00:34:12 2024 -0700

    [HUDI-7797] Use HoodieIOFactory to return pluggable FileFormatUtils implementation (#11310)
---
 .../hudi/io/HoodieKeyLocationFetchHandle.java  |  4 +-
 .../client/TestHoodieJavaWriteClientInsert.java|  7 ++-
 .../TestHoodieJavaClientOnCopyOnWriteStorage.java  |  6 +-
 .../commit/TestJavaCopyOnWriteActionExecutor.java  |  7 ++-
 .../testutils/HoodieJavaClientTestHarness.java | 12 ++--
 .../hudi/io/storage/HoodieSparkParquetReader.java  |  3 +-
 .../hudi/client/TestUpdateSchemaEvolution.java |  5 +-
 .../TestHoodieClientOnCopyOnWriteStorage.java  | 24 +---
 .../commit/TestCopyOnWriteActionExecutor.java  | 12 ++--
 .../hudi/common/model/HoodiePartitionMetadata.java |  7 ++-
 .../hudi/common/table/TableSchemaResolver.java |  8 ++-
 .../table/log/block/HoodieHFileDataBlock.java  |  8 +--
 .../table/log/block/HoodieParquetDataBlock.java|  6 +-
 .../apache/hudi/common/util/FileFormatUtils.java   | 31 --
 .../apache/hudi/io/storage/HoodieIOFactory.java| 56 +-
 .../hudi/metadata/HoodieTableMetadataUtil.java |  4 +-
 .../hudi/sink/bootstrap/BootstrapOperator.java |  4 +-
 .../apache/hudi/io/hadoop/HoodieAvroOrcReader.java |  3 +-
 .../hudi/io/hadoop/HoodieAvroParquetReader.java|  4 +-
 .../hudi/io/hadoop/HoodieHadoopIOFactory.java  | 19 +++
 .../hudi/io/hadoop/TestHoodieHadoopIOFactory.java  | 66 ++
 .../org/apache/spark/sql/hudi/SparkHelpers.scala   | 12 ++--
 .../org/apache/hudi/ColumnStatsIndexHelper.java|  4 +-
 .../utilities/HoodieMetadataTableValidator.java| 11 ++--
 24 files changed, 236 insertions(+), 87 deletions(-)

diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieKeyLocationFetchHandle.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieKeyLocationFetchHandle.java
index 4d82d661f64..c94e30c9d5c 100644
--- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieKeyLocationFetchHandle.java
+++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieKeyLocationFetchHandle.java
@@ -26,6 +26,7 @@ import org.apache.hudi.common.util.FileFormatUtils;
 import org.apache.hudi.common.util.Option;
 import org.apache.hudi.common.util.collection.Pair;
 import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.io.storage.HoodieIOFactory;
 import org.apache.hudi.keygen.BaseKeyGenerator;
 import org.apache.hudi.table.HoodieTable;
 
@@ -50,7 +51,8 @@ public class HoodieKeyLocationFetchHandle extends 
HoodieReadHandle fetchHoodieKeys(HoodieBaseFile baseFile) {
-FileFormatUtils fileFormatUtils = FileFormatUtils.getInstance(baseFile.getStoragePath());
+FileFormatUtils fileFormatUtils = HoodieIOFactory.getIOFactory(hoodieTable.getStorage())
+.getFileFormatUtils(baseFile.getStoragePath());
 if (keyGeneratorOpt.isPresent()) {
   return fileFormatUtils.fetchHoodieKeys(hoodieTable.getStorage(), baseFile.getStoragePath(), keyGeneratorOpt);
 } else {
diff --git a/hudi-client/hudi-java-client/src/test/java/org/apache/hudi/client/TestHoodieJavaWriteClientInsert.java b/hudi-client/hudi-java-client/src/test/java/org/apache/hudi/client/TestHoodieJavaWriteClientInsert.java
index 53d069736e7..60907acec5c 100644
--- a/hudi-client/hudi-java-client/src/test/java/org/apache/hudi/client/TestHoodieJavaWriteClientInsert.java
+++ b/hudi-client/hudi-java-client/src/test/java/org/apache/hudi/client/TestHoodieJavaWriteClientInsert.java
@@ -37,6 +37,7 @@ import org.apache.hudi.config.HoodieIndexConfig;
 import org.apache.hudi.config.HoodieWriteConfig;
 import org.apache.hudi.hadoop.HoodieParquetInputFormat;
 import org.apache.hudi.hadoop.utils.HoodieHiveUtils;
+import org.apache.hudi.io.storage.HoodieIOFactory;
 import org.apache.hudi.storage.StoragePath;
 import org.apache.hudi.testutils.HoodieJavaClientTestHarness;
 
@@ -147,7 +148,8 @@ public class TestHoodieJavaWriteClientInsert extends HoodieJavaClientTestHarness
 
 HoodieJavaWriteClient writeClient = getHoodieWriteClient(config);
 metaClient = HoodieTableMetaClient.reload(metaClient);
-FileFormatUtils fileUtils = FileFormatUtils.getInstance(metaClient);
+FileFormatUtils fileUtils = HoodieIOFactory.getIOFactory(metaClient.getStorage())
+.getFileFormatUtils(metaClient.getTableConfig().getBaseFileFormat());
 
 // Get some records belong to the same partition (2021/09/11)
 String insertRecordStr1 =

(hudi) branch branch-0.x updated: [HUDI-7797] Use HoodieIOFactory to return pluggable FileFormatUtils implementation (#11310)

2024-05-26 Thread yihua
This is an automated email from the ASF dual-hosted git repository.

yihua pushed a commit to branch branch-0.x
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/branch-0.x by this push:
 new dac73ab6121 [HUDI-7797] Use HoodieIOFactory to return pluggable FileFormatUtils implementation (#11310)
dac73ab6121 is described below

commit dac73ab61214e12ae50cd111220476283cccf6d1
Author: Y Ethan Guo 
AuthorDate: Sun May 26 00:34:12 2024 -0700

[HUDI-7797] Use HoodieIOFactory to return pluggable FileFormatUtils implementation (#11310)
---
 .../hudi/io/HoodieKeyLocationFetchHandle.java  |  4 +-
 .../client/TestHoodieJavaWriteClientInsert.java|  7 ++-
 .../TestHoodieJavaClientOnCopyOnWriteStorage.java  |  6 +-
 .../commit/TestJavaCopyOnWriteActionExecutor.java  |  7 ++-
 .../testutils/HoodieJavaClientTestHarness.java | 12 ++--
 .../hudi/io/storage/HoodieSparkParquetReader.java  |  3 +-
 .../hudi/client/TestUpdateSchemaEvolution.java |  5 +-
 .../TestHoodieClientOnCopyOnWriteStorage.java  | 24 +---
 .../commit/TestCopyOnWriteActionExecutor.java  | 12 ++--
 .../hudi/common/model/HoodiePartitionMetadata.java |  7 ++-
 .../hudi/common/table/TableSchemaResolver.java |  8 ++-
 .../table/log/block/HoodieHFileDataBlock.java  |  8 +--
 .../table/log/block/HoodieParquetDataBlock.java|  6 +-
 .../apache/hudi/common/util/FileFormatUtils.java   | 31 --
 .../apache/hudi/io/storage/HoodieIOFactory.java| 56 +-
 .../hudi/metadata/HoodieTableMetadataUtil.java |  4 +-
 .../hudi/sink/bootstrap/BootstrapOperator.java |  4 +-
 .../apache/hudi/io/hadoop/HoodieAvroOrcReader.java |  3 +-
 .../hudi/io/hadoop/HoodieAvroParquetReader.java|  4 +-
 .../hudi/io/hadoop/HoodieHadoopIOFactory.java  | 19 +++
 .../hudi/io/hadoop/TestHoodieHadoopIOFactory.java  | 66 ++
 .../org/apache/spark/sql/hudi/SparkHelpers.scala   | 12 ++--
 .../org/apache/hudi/ColumnStatsIndexHelper.java|  4 +-
 .../utilities/HoodieMetadataTableValidator.java| 11 ++--
 24 files changed, 236 insertions(+), 87 deletions(-)

diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieKeyLocationFetchHandle.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieKeyLocationFetchHandle.java
index 4d82d661f64..c94e30c9d5c 100644
--- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieKeyLocationFetchHandle.java
+++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieKeyLocationFetchHandle.java
@@ -26,6 +26,7 @@ import org.apache.hudi.common.util.FileFormatUtils;
 import org.apache.hudi.common.util.Option;
 import org.apache.hudi.common.util.collection.Pair;
 import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.io.storage.HoodieIOFactory;
 import org.apache.hudi.keygen.BaseKeyGenerator;
 import org.apache.hudi.table.HoodieTable;
 
@@ -50,7 +51,8 @@ public class HoodieKeyLocationFetchHandle extends 
HoodieReadHandle fetchHoodieKeys(HoodieBaseFile baseFile) {
-FileFormatUtils fileFormatUtils = FileFormatUtils.getInstance(baseFile.getStoragePath());
+FileFormatUtils fileFormatUtils = HoodieIOFactory.getIOFactory(hoodieTable.getStorage())
+.getFileFormatUtils(baseFile.getStoragePath());
 if (keyGeneratorOpt.isPresent()) {
   return fileFormatUtils.fetchHoodieKeys(hoodieTable.getStorage(), baseFile.getStoragePath(), keyGeneratorOpt);
 } else {
diff --git a/hudi-client/hudi-java-client/src/test/java/org/apache/hudi/client/TestHoodieJavaWriteClientInsert.java b/hudi-client/hudi-java-client/src/test/java/org/apache/hudi/client/TestHoodieJavaWriteClientInsert.java
index 53d069736e7..60907acec5c 100644
--- a/hudi-client/hudi-java-client/src/test/java/org/apache/hudi/client/TestHoodieJavaWriteClientInsert.java
+++ b/hudi-client/hudi-java-client/src/test/java/org/apache/hudi/client/TestHoodieJavaWriteClientInsert.java
@@ -37,6 +37,7 @@ import org.apache.hudi.config.HoodieIndexConfig;
 import org.apache.hudi.config.HoodieWriteConfig;
 import org.apache.hudi.hadoop.HoodieParquetInputFormat;
 import org.apache.hudi.hadoop.utils.HoodieHiveUtils;
+import org.apache.hudi.io.storage.HoodieIOFactory;
 import org.apache.hudi.storage.StoragePath;
 import org.apache.hudi.testutils.HoodieJavaClientTestHarness;
 
@@ -147,7 +148,8 @@ public class TestHoodieJavaWriteClientInsert extends HoodieJavaClientTestHarness
 
 HoodieJavaWriteClient writeClient = getHoodieWriteClient(config);
 metaClient = HoodieTableMetaClient.reload(metaClient);
-FileFormatUtils fileUtils = FileFormatUtils.getInstance(metaClient);
+FileFormatUtils fileUtils = HoodieIOFactory.getIOFactory(metaClient.getStorage())
+.getFileFormatUtils(metaClient.getTableConfig().getBaseFileFormat());
 
 // Get some records belong to the same partition (2021/09/11)
 String insertRecordStr1 = "{\"_ro

Re: [PR] [HUDI-7797] Use HoodieIOFactory to return pluggable FileFormatUtils implementation [hudi]

2024-05-26 Thread via GitHub


yihua commented on PR #11310:
URL: https://github.com/apache/hudi/pull/11310#issuecomment-2132107921

   Azure CI is green.
   https://github.com/apache/hudi/assets/2497195/3c6a8e60-d21b-49cb-8fd5-45fa3c860120
   





Re: [PR] [HUDI-7797] Use HoodieIOFactory to return pluggable FileFormatUtils implementation [hudi]

2024-05-26 Thread via GitHub


yihua merged PR #11310:
URL: https://github.com/apache/hudi/pull/11310

