Re: [I] Hudi dbt incremental materialization not working during incremental dbt run with spark [hudi]

2024-01-07 Thread via GitHub
jetansi commented on issue #10448: URL: https://github.com/apache/hudi/issues/10448#issuecomment-1880520850 Sure thing @ad1happy2go. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

Re: [PR] [HUDI-1881]: draft implementation for trigger based on data availability [hudi]

2024-01-07 Thread via GitHub
Sarfaraz-214 commented on PR #5071: URL: https://github.com/apache/hudi/pull/5071#issuecomment-1880515749 @nsivabalan Siva, it seems @pratyakshsharma has already made an attempt at addressing the issue in the PR. Given the importance of the use-case and the progress made so far, it would be

Re: [PR] [HUDI-7144] Build storage partition stats index and use it for data skipping [hudi]

2024-01-07 Thread via GitHub
hudi-bot commented on PR #10352: URL: https://github.com/apache/hudi/pull/10352#issuecomment-1880504857 ## CI report: * 8cb19dccd8173767e53a5a6e1c58dc0167a3a481 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21

Re: [I] hoodie.bulkinsert.shuffle.parallelism Not activated [hudi]

2024-01-07 Thread via GitHub
KnightChess commented on issue #10418: URL: https://github.com/apache/hudi/issues/10418#issuecomment-1880500598 @zhangjw123321 I created an issue to track it: https://issues.apache.org/jira/browse/HUDI-7277 -- This is an automated message from the Apache Git Service. To respond to the messa

Re: [PR] [HUDI-7144] Build storage partition stats index and use it for data skipping [hudi]

2024-01-07 Thread via GitHub
hudi-bot commented on PR #10352: URL: https://github.com/apache/hudi/pull/10352#issuecomment-1880497515 ## CI report: * 8cb19dccd8173767e53a5a6e1c58dc0167a3a481 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21

[jira] [Created] (HUDI-7277) hoodie.bulkinsert.shuffle.parallelism not activated with no-partitioned table

2024-01-07 Thread KnightChess (Jira)
KnightChess created HUDI-7277: - Summary: hoodie.bulkinsert.shuffle.parallelism not activated with no-partitioned table Key: HUDI-7277 URL: https://issues.apache.org/jira/browse/HUDI-7277 Project: Apache H

(hudi) branch asf-site updated: [MINOR][DOCS] Note that hudi.metadata-enabled in Trino is defunct (#10289)

2024-01-07 Thread codope
This is an automated email from the ASF dual-hosted git repository. codope pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 799c12d5575 [MINOR][DOCS] Note that hudi.metad

Re: [PR] [MINOR][DOCS] Note that hudi.metadata-enabled in Trino is defunct [hudi]

2024-01-07 Thread via GitHub
codope merged PR #10289: URL: https://github.com/apache/hudi/pull/10289 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.

Re: [I] [SUPPORT] Flink job failing with Avro ClassCastException [hudi]

2024-01-07 Thread via GitHub
raghunittala commented on issue #9596: URL: https://github.com/apache/hudi/issues/9596#issuecomment-1880458196 No, I wasn't able to fix this. I updated to Hudi 0.14 and still see this error. -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [I] hoodie.bulkinsert.shuffle.parallelism Not activated [hudi]

2024-01-07 Thread via GitHub
zhangjw123321 commented on issue #10418: URL: https://github.com/apache/hudi/issues/10418#issuecomment-1880457371 I try to set the number of files that can be generated normally. Thank you very much. @KnightChess @ad1happy2go -- This is an automated message from the Apache Git Service.

Re: [I] hoodie.bulkinsert.shuffle.parallelism Not activated [hudi]

2024-01-07 Thread via GitHub
zhangjw123321 commented on issue #10418: URL: https://github.com/apache/hudi/issues/10418#issuecomment-1880455240 > @zhangjw123321 you can try setting it in spark-submit with --conf, or in code with sparkconf.set('xxx','yyy'); it will then match the other branch and not use the parent RDD partition size ![image](https://

Re: [I] hoodie.bulkinsert.shuffle.parallelism Not activated [hudi]

2024-01-07 Thread via GitHub
zhangjw123321 commented on issue #10418: URL: https://github.com/apache/hudi/issues/10418#issuecomment-1880454835 > @zhangjw123321 you can try setting it in spark-submit with --conf, or in code with sparkconf.set('xxx','yyy'); it will then match the other branch and not use the parent RDD partition size ![image](https://

Re: [PR] [HUDI-7276] FILE_GROUP_READER_ENABLED should be disable for query [hudi]

2024-01-07 Thread via GitHub
hudi-bot commented on PR #10455: URL: https://github.com/apache/hudi/pull/10455#issuecomment-1880452974 ## CI report: * 8b9a7301e91e42ce4cbab9ff3fb05fd3bd6b0fab Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21

Re: [PR] [HUDI-7276] FILE_GROUP_READER_ENABLED should be disable for query [hudi]

2024-01-07 Thread via GitHub
hudi-bot commented on PR #10455: URL: https://github.com/apache/hudi/pull/10455#issuecomment-1880447261 ## CI report: * 8b9a7301e91e42ce4cbab9ff3fb05fd3bd6b0fab UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run th

Re: [PR] [HUDI-7265] Support schema evolution by Flink SQL using HoodieHiveCatalog [hudi]

2024-01-07 Thread via GitHub
hudi-bot commented on PR #10426: URL: https://github.com/apache/hudi/pull/10426#issuecomment-1880441626 ## CI report: * b4a68ad41cfe6d582dea52aea53d9f4b96341f26 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21

Re: [PR] [HUDI-5823] Partition ttl management [hudi]

2024-01-07 Thread via GitHub
hudi-bot commented on PR #9723: URL: https://github.com/apache/hudi/pull/9723#issuecomment-1880440934 ## CI report: * ae3572005431da476574d5cbdf6a324ba93d4725 UNKNOWN * eb82362f82cc417fd10fa1907c4a7668917b1d22 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

Re: [I] [SUPPORT] Flink job failing with Avro ClassCastException [hudi]

2024-01-07 Thread via GitHub
ligou525 commented on issue #9596: URL: https://github.com/apache/hudi/issues/9596#issuecomment-1880439633 Hi @raghunittala, did you find a solution to this problem? I faced the same issue when calling the insertOverwrite API: Caused by: org.apache.hudi.exception.HoodieException: Error g

Re: [PR] [HUDI-6194] prevent flink writer getting the wrong instant to write [hudi]

2024-01-07 Thread via GitHub
voonhous commented on PR #8673: URL: https://github.com/apache/hudi/pull/8673#issuecomment-1880439615 @zhangyue19921010 No worries, added the markdown formatting back. :) Yeap, root cause here is that checkpoint interval is still running and a long-running rollback/archive might cause

[jira] [Updated] (HUDI-7276) FILE_GROUP_READER_ENABLED should be disable for query

2024-01-07 Thread xy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xy updated HUDI-7276: - Description: FILE_GROUP_READER_ENABLED should be disable for query   java.io.IOException: com.esotericsoftware.kryo.KryoE

Re: [I] Hudi dbt incremental materialization not working during incremental dbt run with spark [hudi]

2024-01-07 Thread via GitHub
ad1happy2go commented on issue #10448: URL: https://github.com/apache/hudi/issues/10448#issuecomment-1880437237 @jetansi I was able to run dbt-examples with the incremental model successfully, even with multiple retries. Can we sync up on a call to understand the issue you are facing? --

Re: [PR] [HUDI-7276] FILE_GROUP_READER_ENABLED should be disable for query [hudi]

2024-01-07 Thread via GitHub
xuzifu666 commented on PR #10455: URL: https://github.com/apache/hudi/pull/10455#issuecomment-1880435258 cc @stream2000 PTAL -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] [HUDI-7276] FILE_GROUP_READER_ENABLED should be disable for query [hudi]

2024-01-07 Thread via GitHub
xuzifu666 commented on PR #10455: URL: https://github.com/apache/hudi/pull/10455#issuecomment-1880433127 `java.io.IOException: com.esotericsoftware.kryo.KryoException: java.lang.NullPointerException Serialization trace: props (org.apache.avro.Schema$LongSchema) types (org.apache.avr

[jira] [Updated] (HUDI-7276) FILE_GROUP_READER_ENABLED should be disable for query

2024-01-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7276: - Labels: pull-request-available (was: ) > FILE_GROUP_READER_ENABLED should be disable for query >

[PR] [HUDI-7276] FILE_GROUP_READER_ENABLED should be disable for query [hudi]

2024-01-07 Thread via GitHub
xuzifu666 opened a new pull request, #10455: URL: https://github.com/apache/hudi/pull/10455 ### Change Logs FILE_GROUP_READER_ENABLED should be disable for query ### Impact fix bug in query for Hudi 1.0 ### Risk level (write none, low medium or high below) n

[jira] [Created] (HUDI-7276) FILE_GROUP_READER_ENABLED should be disable for query

2024-01-07 Thread xy (Jira)
xy created HUDI-7276: Summary: FILE_GROUP_READER_ENABLED should be disable for query Key: HUDI-7276 URL: https://issues.apache.org/jira/browse/HUDI-7276 Project: Apache Hudi Issue Type: Bug Com

Re: [PR] [HUDI-7265] Support schema evolution by Flink SQL using HoodieHiveCatalog [hudi]

2024-01-07 Thread via GitHub
hudi-bot commented on PR #10426: URL: https://github.com/apache/hudi/pull/10426#issuecomment-1880412709 ## CI report: * b4a68ad41cfe6d582dea52aea53d9f4b96341f26 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21

Re: [PR] [HUDI-7265] Support schema evolution by Flink SQL using HoodieHiveCatalog [hudi]

2024-01-07 Thread via GitHub
beyond1920 commented on code in PR #10426: URL: https://github.com/apache/hudi/pull/10426#discussion_r1444185356 ## hudi-flink-datasource/hudi-flink1.18.x/src/main/java/org/apache/hudi/adapter/HoodieHiveCatalogAdapter.java: ## @@ -0,0 +1,63 @@ +/* + * Licensed to the Apache Soft

Re: [PR] [HUDI-5823] Partition ttl management [hudi]

2024-01-07 Thread via GitHub
hudi-bot commented on PR #9723: URL: https://github.com/apache/hudi/pull/9723#issuecomment-1880340742 ## CI report: * ae3572005431da476574d5cbdf6a324ba93d4725 UNKNOWN * 7b4d31ec9e571ad4e14c1e7e228424675fa20d50 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

Re: [PR] [HUDI-5823] Partition ttl management [hudi]

2024-01-07 Thread via GitHub
hudi-bot commented on PR #9723: URL: https://github.com/apache/hudi/pull/9723#issuecomment-1880336264 ## CI report: * ae3572005431da476574d5cbdf6a324ba93d4725 UNKNOWN * 7b4d31ec9e571ad4e14c1e7e228424675fa20d50 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2

Re: [PR] [HUDI-6194] prevent flink writer getting the wrong instant to write [hudi]

2024-01-07 Thread via GitHub
hbgstc123 commented on code in PR #8673: URL: https://github.com/apache/hudi/pull/8673#discussion_r1444139475 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteOperatorCoordinator.java: ## @@ -423,25 +431,41 @@ private void initInstant(String inst

Re: [PR] [HUDI-6194] prevent flink writer getting the wrong instant to write [hudi]

2024-01-07 Thread via GitHub
zhangyue19921010 commented on PR #8673: URL: https://github.com/apache/hudi/pull/8673#issuecomment-1880325585 Hi @voonhous Thanks! I believe the root cause of the JM and TM getting out of sync is that the `rollback action` is not in a stop-the-world state. And sorry for modifying your comment by accident. Or

Re: [PR] [HUDI-6194] prevent flink writer getting the wrong instant to write [hudi]

2024-01-07 Thread via GitHub
zhangyue19921010 commented on code in PR #8673: URL: https://github.com/apache/hudi/pull/8673#discussion_r1442618373 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteOperatorCoordinator.java: ## @@ -423,25 +431,41 @@ private void initInstant(Stri

Re: [PR] [HUDI-6194] prevent flink writer getting the wrong instant to write [hudi]

2024-01-07 Thread via GitHub
hbgstc123 commented on code in PR #8673: URL: https://github.com/apache/hudi/pull/8673#discussion_r1444136324 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java: ## @@ -453,7 +454,7 @@ private boolean flushBucket(DataBucket bucket) {

Re: [PR] [HUDI-6194] prevent flink writer getting the wrong instant to write [hudi]

2024-01-07 Thread via GitHub
hbgstc123 commented on code in PR #8673: URL: https://github.com/apache/hudi/pull/8673#discussion_r1444132471 ## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteOperatorCoordinator.java: ## @@ -422,26 +424,37 @@ private void initInstant(String inst

Re: [PR] [HUDI-7246] Fix Data Skipping Issue: No Results When Query Conditions Involve Both Columns with and without Column Stats [hudi]

2024-01-07 Thread via GitHub
majian1998 commented on code in PR #10389: URL: https://github.com/apache/hudi/pull/10389#discussion_r1444122277 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/ColumnStatsIndexSupport.scala: ## @@ -272,11 +272,13 @@ class ColumnStatsIndexSupport(spark:

[jira] [Updated] (HUDI-7274) AnalysisException thrown when executing SQL time travel query using TIMESTAMP AS OF

2024-01-07 Thread Jira
[ https://issues.apache.org/jira/browse/HUDI-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesús Camacho Rodríguez updated HUDI-7274: -- Description: An `AnalysisException` is thrown when executing SQL time travel que

Re: [PR] [HUDI-7184] Add IncrementalQueryAnalyzer for completion time based in… [hudi]

2024-01-07 Thread via GitHub
hudi-bot commented on PR #10255: URL: https://github.com/apache/hudi/pull/10255#issuecomment-1880057543 ## CI report: * d2c7c5cd379c4c8ba4297fd5fa1e989f73e58b5d Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2

Re: [PR] [HUDI-7184] Add IncrementalQueryAnalyzer for completion time based in… [hudi]

2024-01-07 Thread via GitHub
hudi-bot commented on PR #10255: URL: https://github.com/apache/hudi/pull/10255#issuecomment-1880018357 ## CI report: * b39e93881c130fd8832e5c260740f8f59bb8033e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21

Re: [PR] [HUDI-7184] Add IncrementalQueryAnalyzer for completion time based in… [hudi]

2024-01-07 Thread via GitHub
hudi-bot commented on PR #10255: URL: https://github.com/apache/hudi/pull/10255#issuecomment-1880016530 ## CI report: * b39e93881c130fd8832e5c260740f8f59bb8033e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=21

Re: [PR] [HUDI-7246] Fix Data Skipping Issue: No Results When Query Conditions Involve Both Columns with and without Column Stats [hudi]

2024-01-07 Thread via GitHub
danny0405 commented on code in PR #10389: URL: https://github.com/apache/hudi/pull/10389#discussion_r1443964801 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/ColumnStatsIndexSupport.scala: ## @@ -272,11 +272,13 @@ class ColumnStatsIndexSupport(spark:

Re: [PR] [HUDI-6194] prevent flink writer getting the wrong instant to write [hudi]

2024-01-07 Thread via GitHub
danny0405 commented on PR #8673: URL: https://github.com/apache/hudi/pull/8673#issuecomment-1879998341 @voonhous, thanks for the detailed analysis, I will take a look once I get time. -- This is an automated message from the Apache Git Service. To respond to the message

[jira] [Closed] (HUDI-7266) add clustering metric for flink

2024-01-07 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-7266. Resolution: Fixed Fixed via master branch: 478833af96895f8765dcb639c0fdd971779b89b9 > add clustering metric

Re: [PR] [HUDI-7266] add clustering metric for flink [hudi]

2024-01-07 Thread via GitHub
danny0405 merged PR #10420: URL: https://github.com/apache/hudi/pull/10420 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apac

[jira] [Updated] (HUDI-7266) add clustering metric for flink

2024-01-07 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-7266: - Fix Version/s: 1.0.0 > add clustering metric for flink > --- > >

(hudi) branch master updated: [HUDI-7266] Add clustering metric for flink (#10420)

2024-01-07 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 478833af968 [HUDI-7266] Add clustering metric f