[GitHub] [hudi] KnightChess commented on issue #7057: [SUPPORT] [OCC] HoodieException: Error getting all file groups in pending clustering

2022-10-30 Thread GitBox
KnightChess commented on issue #7057: URL: https://github.com/apache/hudi/issues/7057#issuecomment-1296640318 `when init table` org.apache.hudi.exception.HoodieException: Error getting all file groups in pending clustering at org.apache.hudi.common.util.ClusteringUtils.getAllFile

[GitHub] [hudi] TengHuo commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

2022-10-30 Thread GitBox
TengHuo commented on PR #6733: URL: https://github.com/apache/hudi/pull/6733#issuecomment-1296632077 > @TengHuo please rebase master; there were some flaky test fixes sure, np, just rebased it to the latest master -- This is an automated message from the Apache Git Service. To respo

[GitHub] [hudi] KnightChess commented on issue #7057: [SUPPORT] [OCC] HoodieException: Error getting all file groups in pending clustering

2022-10-30 Thread GitBox
KnightChess commented on issue #7057: URL: https://github.com/apache/hudi/issues/7057#issuecomment-1296617289 `error in runTableServicesInline` ```shell org.apache.hudi.exception.HoodieException: Error getting all file groups in pending clustering at org.apache.hudi.common.uti

[GitHub] [hudi] KnightChess commented on issue #7057: [SUPPORT] [OCC] HoodieException: Error getting all file groups in pending clustering

2022-10-30 Thread GitBox
KnightChess commented on issue #7057: URL: https://github.com/apache/hudi/issues/7057#issuecomment-1296615528 Sorry, I gave the wrong analysis log; there are three scenarios that will cause this error. The first log, which I analyzed, cannot be found, set replace file to result. I will update if I found

[GitHub] [hudi] ROOBALJINDAL commented on issue #7064: [SUPPORT] Data ingestion from csv file i.e. CsvDFSSource is working for FilebasedSchemaProvider but not working if schema is provided with Schema

2022-10-30 Thread GitBox
ROOBALJINDAL commented on issue #7064: URL: https://github.com/apache/hudi/issues/7064#issuecomment-1296615480 > Will get back to you tomorrow on this. Bit busy with some stuff. Can you please check? @pratyakshsharma

[GitHub] [hudi] xushiyan commented on pull request #6733: [HUDI-4880] Fix corrupted parquet file issue left over by cancelled compaction task

2022-10-30 Thread GitBox
xushiyan commented on PR #6733: URL: https://github.com/apache/hudi/pull/6733#issuecomment-1296612988 @TengHuo please rebase master; there were some flaky test fixes

[hudi] branch master updated: [MINOR] use default maven version since it already fix the warnings recently (#6863)

2022-10-30 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository. xushiyan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new af31759931 [MINOR] use default maven version sin

[GitHub] [hudi] xushiyan merged pull request #6863: [MINOR] use default maven version since it already fix the warnings recently in Azure CI

2022-10-30 Thread GitBox
xushiyan merged PR #6863: URL: https://github.com/apache/hudi/pull/6863

[GitHub] [hudi] skadooshhhh opened a new pull request, #7093: Hudi platoform spark 3.0

2022-10-30 Thread GitBox
skadooshhhh opened a new pull request, #7093: URL: https://github.com/apache/hudi/pull/7093 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performa

[GitHub] [hudi] waywtdcc commented on pull request #7056: [HUDI-5088]Fix bug:Failed to synchronize the hive metadata of the Flink table

2022-10-30 Thread GitBox
waywtdcc commented on PR #7056: URL: https://github.com/apache/hudi/pull/7056#issuecomment-1296599163 > [5088.patch.zip](https://github.com/apache/hudi/files/9876623/5088.patch.zip) Thanks for the fix, i have reviewed and applied a patch ~ boolean withOperationField = Boolean.parseB

[GitHub] [hudi] waywtdcc commented on pull request #7056: [HUDI-5088]Fix bug:Failed to synchronize the hive metadata of the Flink table

2022-10-30 Thread GitBox
waywtdcc commented on PR #7056: URL: https://github.com/apache/hudi/pull/7056#issuecomment-1296597975 > [5088.patch.zip](https://github.com/apache/hudi/files/9876623/5088.patch.zip) Thanks for the fix, i have reviewed and applied a patch ~ Boolean.parseBoolean(table.getOptions().get

[GitHub] [hudi] xushiyan closed pull request #6961: fix(sec): upgrade com.beust:jcommander to 1.75

2022-10-30 Thread GitBox
xushiyan closed pull request #6961: fix(sec): upgrade com.beust:jcommander to 1.75 URL: https://github.com/apache/hudi/pull/6961

[GitHub] [hudi] xushiyan commented on pull request #6961: fix(sec): upgrade com.beust:jcommander to 1.75

2022-10-30 Thread GitBox
xushiyan commented on PR #6961: URL: https://github.com/apache/hudi/pull/6961#issuecomment-1296591638 close in favor of https://github.com/apache/hudi/pull/7068

[GitHub] [hudi] xushiyan commented on issue #6389: [SUPPORT] HELP :: Using TWO FIELDS to precombine :: 'hoodie.datasource.write.precombine.field': "column1,column2"

2022-10-30 Thread GitBox
xushiyan commented on issue #6389: URL: https://github.com/apache/hudi/issues/6389#issuecomment-1296587079 Some previous efforts on this feature; still WIP https://github.com/apache/hudi/pull/2519

[GitHub] [hudi] hudi-bot commented on pull request #7091: [HUDI-5105] Add Call show_commit_extra_metadata for spark sql

2022-10-30 Thread GitBox
hudi-bot commented on PR #7091: URL: https://github.com/apache/hudi/pull/7091#issuecomment-1296584158 ## CI report: * 68c5b981047fabcff8a5bcc1c93ec93cb80c5d54 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1267

[GitHub] [hudi] hudi-bot commented on pull request #7068: [HUDI-5096] boolean param is broken in HiveSyncTool

2022-10-30 Thread GitBox
hudi-bot commented on PR #7068: URL: https://github.com/apache/hudi/pull/7068#issuecomment-1296584040 ## CI report: * 713b606326eb3d7bbd509ed12e4167ac8bf39e38 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=126

[GitHub] [hudi] hudi-bot commented on pull request #7091: [HUDI-5105] Add Call show_commit_extra_metadata for spark sql

2022-10-30 Thread GitBox
hudi-bot commented on PR #7091: URL: https://github.com/apache/hudi/pull/7091#issuecomment-1296580010 ## CI report: * 68c5b981047fabcff8a5bcc1c93ec93cb80c5d54 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #7068: [HUDI-5096] boolean param is broken in HiveSyncTool

2022-10-30 Thread GitBox
hudi-bot commented on PR #7068: URL: https://github.com/apache/hudi/pull/7068#issuecomment-1296579901 ## CI report: * bcd2fb38e4074afe3c4e5a82eb8eb7c0c88fa74e Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=126

[GitHub] [hudi] dongkelun commented on issue #5693: [SUPPORT]cant sync to correct hive schema

2022-10-30 Thread GitBox
dongkelun commented on issue #5693: URL: https://github.com/apache/hudi/issues/5693#issuecomment-1296577390 > @dongkelun would you be able to help with this? seems like hive sync config for database was not passed to sql properly. You may try reproduce this with both the latest master versi

[GitHub] [hudi] hudi-bot commented on pull request #7090: Revert "[HUDI-4741] hotfix to avoid partial failover cause restored s…

2022-10-30 Thread GitBox
hudi-bot commented on PR #7090: URL: https://github.com/apache/hudi/pull/7090#issuecomment-1296576611 ## CI report: * f2f685c10ae9adfff7bf493006939500f4c00535 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1266

[GitHub] [hudi] xushiyan commented on pull request #7091: [HUDI-5105] Add Call show_commit_extra_metadata for spark sql

2022-10-30 Thread GitBox
xushiyan commented on PR #7091: URL: https://github.com/apache/hudi/pull/7091#issuecomment-1296575274 > ### Change Logs > NA > > ### Impact > NA > > ### Risk level (write none, low medium or high below) > low > > ### Documentation Update > NA > > ### Co

[GitHub] [hudi] nsivabalan opened a new pull request, #7092: [WIP] Adding presto query validation to integ tests

2022-10-30 Thread GitBox
nsivabalan opened a new pull request, #7092: URL: https://github.com/apache/hudi/pull/7092 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performan

[GitHub] [hudi] xushiyan commented on pull request #7068: [HUDI-5096] boolean param is broken in HiveSyncTool

2022-10-30 Thread GitBox
xushiyan commented on PR #7068: URL: https://github.com/apache/hudi/pull/7068#issuecomment-1296553928 @xicm i found upgrading to 1.78 resolves the original NPE issue

[jira] [Updated] (HUDI-5105) Add Call show_commit_extra_metadata for spark sql

2022-10-30 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-5105: - Labels: pull-request-available (was: ) > Add Call show_commit_extra_metadata for spark sql >

[GitHub] [hudi] XuQianJin-Stars opened a new pull request, #7091: [HUDI-5105] Add Call show_commit_extra_metadata for spark sql

2022-10-30 Thread GitBox
XuQianJin-Stars opened a new pull request, #7091: URL: https://github.com/apache/hudi/pull/7091 ### Change Logs NA ### Impact NA ### Risk level (write none, low medium or high below) low ### Documentation Update NA ### Contributor's chec

[GitHub] [hudi] hudi-bot commented on pull request #7068: [HUDI-5096] boolean param is broken in HiveSyncTool

2022-10-30 Thread GitBox
hudi-bot commented on PR #7068: URL: https://github.com/apache/hudi/pull/7068#issuecomment-1296542698 ## CI report: * bcd2fb38e4074afe3c4e5a82eb8eb7c0c88fa74e Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=126

[GitHub] [hudi] hudi-bot commented on pull request #7068: [HUDI-5096] boolean param is broken in HiveSyncTool

2022-10-30 Thread GitBox
hudi-bot commented on PR #7068: URL: https://github.com/apache/hudi/pull/7068#issuecomment-1296540175 ## CI report: * bcd2fb38e4074afe3c4e5a82eb8eb7c0c88fa74e Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=126

[jira] [Created] (HUDI-5105) Add Call show_commit_extra_metadata for spark sql

2022-10-30 Thread Forward Xu (Jira)
Forward Xu created HUDI-5105: Summary: Add Call show_commit_extra_metadata for spark sql Key: HUDI-5105 URL: https://issues.apache.org/jira/browse/HUDI-5105 Project: Apache Hudi Issue Type: New F

[jira] [Closed] (HUDI-5039) flink multi writer for bucket index

2022-10-30 Thread Forward Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Forward Xu closed HUDI-5039. Resolution: Duplicate > flink multi writer for bucket index > --- > >

[GitHub] [hudi] hudi-bot commented on pull request #7068: [HUDI-5096] boolean param is broken in HiveSyncTool

2022-10-30 Thread GitBox
hudi-bot commented on PR #7068: URL: https://github.com/apache/hudi/pull/7068#issuecomment-1296529937 ## CI report: * bcd2fb38e4074afe3c4e5a82eb8eb7c0c88fa74e Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=126

[GitHub] [hudi] hudi-bot commented on pull request #7090: Revert "[HUDI-4741] hotfix to avoid partial failover cause restored s…

2022-10-30 Thread GitBox
hudi-bot commented on PR #7090: URL: https://github.com/apache/hudi/pull/7090#issuecomment-1296492175 ## CI report: * f2f685c10ae9adfff7bf493006939500f4c00535 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1266

[GitHub] [hudi] hudi-bot commented on pull request #7068: [HUDI-5096] boolean param is broken in HiveSyncTool

2022-10-30 Thread GitBox
hudi-bot commented on PR #7068: URL: https://github.com/apache/hudi/pull/7068#issuecomment-1296492090 ## CI report: * 5f5752509e76de38a11d6c9af1efbddaacdb020b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1264

[GitHub] [hudi] hudi-bot commented on pull request #7090: Revert "[HUDI-4741] hotfix to avoid partial failover cause restored s…

2022-10-30 Thread GitBox
hudi-bot commented on PR #7090: URL: https://github.com/apache/hudi/pull/7090#issuecomment-1296487999 ## CI report: * f2f685c10ae9adfff7bf493006939500f4c00535 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #7068: [HUDI-5096] boolean param is broken in HiveSyncTool

2022-10-30 Thread GitBox
hudi-bot commented on PR #7068: URL: https://github.com/apache/hudi/pull/7068#issuecomment-1296487900 ## CI report: * 5f5752509e76de38a11d6c9af1efbddaacdb020b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1264

[GitHub] [hudi] xicm commented on a diff in pull request #7068: [HUDI-5096] boolean param is broken in HiveSyncTool

2022-10-30 Thread GitBox
xicm commented on code in PR #7068: URL: https://github.com/apache/hudi/pull/7068#discussion_r1008996231 ## hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncConfig.java: ## @@ -108,45 +108,45 @@ public static class HiveSyncConfigParams { + "instea

[GitHub] [hudi] XuQianJin-Stars commented on pull request #7090: Revert "[HUDI-4741] hotfix to avoid partial failover cause restored s…

2022-10-30 Thread GitBox
XuQianJin-Stars commented on PR #7090: URL: https://github.com/apache/hudi/pull/7090#issuecomment-1296474173 +1

[GitHub] [hudi] codope commented on a diff in pull request #6862: [HUDI-4989] Fixing deltastreamer init failures

2022-10-30 Thread GitBox
codope commented on code in PR #6862: URL: https://github.com/apache/hudi/pull/6862#discussion_r1008990765 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java: ## @@ -255,44 +255,72 @@ public DeltaSync(HoodieDeltaStreamer.Config cfg, SparkSess

[jira] [Resolved] (HUDI-5057) Fix msck repair hudi table

2022-10-30 Thread zouxxyy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zouxxyy resolved HUDI-5057. --- > Fix msck repair hudi table > -- > > Key: HUDI-5057 > URL

[GitHub] [hudi] danny0405 opened a new pull request, #7090: Revert "[HUDI-4741] hotfix to avoid partial failover cause restored s…

2022-10-30 Thread GitBox
danny0405 opened a new pull request, #7090: URL: https://github.com/apache/hudi/pull/7090 …ubtask timeout (#6796)" This reverts commit e222693d87d48416670ca14c6f7fd69307432786. ### Change Logs revert c ### Impact Reverts before we do more tests and prove the

[GitHub] [hudi] dongkelun commented on pull request #5633: [HUDI-4123] Fix the exception due to SqlSource return null checkpoint

2022-10-30 Thread GitBox
dongkelun commented on PR #5633: URL: https://github.com/apache/hudi/pull/5633#issuecomment-1296412015 > not sure what checkpoint refers to in case of sql source. In case of kafka, it refers to offset and while polling for msgs from kafka we honor that. in case of DFS based sources, checkp

[GitHub] [hudi] weimingdiit commented on a diff in pull request #6983: [HUDI-5031]Hudi merge into creates empty partition files when the sou…

2022-10-30 Thread GitBox
weimingdiit commented on code in PR #6983: URL: https://github.com/apache/hudi/pull/6983#discussion_r1008954486 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/execution/CopyOnWriteInsertHandler.java: ## @@ -69,28 +73,29 @@ public CopyOnWriteInsertHandler(HoodieW

[GitHub] [hudi] hudi-bot commented on pull request #6781: [HUDI-4123] enchancing deltastreamer sql source tests

2022-10-30 Thread GitBox
hudi-bot commented on PR #6781: URL: https://github.com/apache/hudi/pull/6781#issuecomment-1296347531 ## CI report: * 6ce39eece62fc8eca9448eee52b9f9fb03780fb6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1266

[GitHub] [hudi] hudi-bot commented on pull request #6695: [MINOR] adding tests for streaming read mor with compaction

2022-10-30 Thread GitBox
hudi-bot commented on PR #6695: URL: https://github.com/apache/hudi/pull/6695#issuecomment-1296347502 ## CI report: * b4ea04502ce51566a7f9bf49c5120ed148815635 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1266

[GitHub] [hudi] the-other-tim-brown commented on a diff in pull request #6661: [HUDI-4853] Speeding up reading S3 files in S3EventsIncrSource

2022-10-30 Thread GitBox
the-other-tim-brown commented on code in PR #6661: URL: https://github.com/apache/hudi/pull/6661#discussion_r1008922752 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/S3EventsHoodieIncrSource.java: ## @@ -213,15 +216,27 @@ public Pair>, String> fetchNextBatch

[GitHub] [hudi] xushiyan closed issue #5938: Why Hudi publish data size much more than the input file size when publish to hive

2022-10-30 Thread GitBox
xushiyan closed issue #5938: Why Hudi publish data size much more than the input file size when publish to hive URL: https://github.com/apache/hudi/issues/5938

[GitHub] [hudi] xushiyan commented on issue #5938: Why Hudi publish data size much more than the input file size when publish to hive

2022-10-30 Thread GitBox
xushiyan commented on issue #5938: URL: https://github.com/apache/hudi/issues/5938#issuecomment-1296325168 > "Getting small files from partitions" stage refers to reading existing data from hudi to fetch list of small file groups. So, this could refer to your hudi table size and not your in

[GitHub] [hudi] xushiyan commented on issue #5916: [SUPPORT] `show fsview latest` throwing IllegalStateException...pending compactions for merge_on_read table

2022-10-30 Thread GitBox
xushiyan commented on issue #5916: URL: https://github.com/apache/hudi/issues/5916#issuecomment-1296324097 @amit-ranjan-de Hudi version : 0.5.0-incubating is pretty ancient. Do you want to give 0.12.1 a try and see if problem resolves?

[GitHub] [hudi] xushiyan commented on issue #5857: [SUPPORT]Problem using Multiple writers(flink spark) to write to hudi

2022-10-30 Thread GitBox
xushiyan commented on issue #5857: URL: https://github.com/apache/hudi/issues/5857#issuecomment-1296323395 > More clues for data duplication issue: I noticed two exactly the same records, one in avro log file, the other in merged parquet file after spark insert. Spark and flink write

[GitHub] [hudi] xushiyan commented on issue #5717: [SUPPORT] Hudi 0.10.1 Reconcile schema not working

2022-10-30 Thread GitBox
xushiyan commented on issue #5717: URL: https://github.com/apache/hudi/issues/5717#issuecomment-1296320707 @nsivabalan let's try to reproduce before closing. If latest master works, then we can close.

[GitHub] [hudi] xushiyan commented on issue #5724: [SUPPORT]Executor executes action [sync hive metadata for instant 20220601105502016] error java.lang.NoClassDefFoundError: org/apache/thrift/TBase

2022-10-30 Thread GitBox
xushiyan commented on issue #5724: URL: https://github.com/apache/hudi/issues/5724#issuecomment-1296319459 @nsivabalan @yuzhaojing for this one we should try to reproduce the setup with the latest master version. if meta sync works properly, then we can close this

[GitHub] [hudi] hudi-bot commented on pull request #6781: [HUDI-4123] enchancing deltastreamer sql source tests

2022-10-30 Thread GitBox
hudi-bot commented on PR #6781: URL: https://github.com/apache/hudi/pull/6781#issuecomment-1296319400 ## CI report: * 6ce39eece62fc8eca9448eee52b9f9fb03780fb6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1266

[GitHub] [hudi] hudi-bot commented on pull request #6695: [MINOR] adding tests for streaming read mor with compaction

2022-10-30 Thread GitBox
hudi-bot commented on PR #6695: URL: https://github.com/apache/hudi/pull/6695#issuecomment-1296319375 ## CI report: * b4ea04502ce51566a7f9bf49c5120ed148815635 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1266

[GitHub] [hudi] xushiyan commented on issue #5163: [SUPPORT] Compaction for flink bounded source

2022-10-30 Thread GitBox
xushiyan commented on issue #5163: URL: https://github.com/apache/hudi/issues/5163#issuecomment-1296318647 > > We also encounter this issue, is there any quick work around currently? Thx in advance > > This PR is effective. You can try it. #6093 Solution was provided. closing t

[GitHub] [hudi] xushiyan closed issue #5163: [SUPPORT] Compaction for flink bounded source

2022-10-30 Thread GitBox
xushiyan closed issue #5163: [SUPPORT] Compaction for flink bounded source URL: https://github.com/apache/hudi/issues/5163

[GitHub] [hudi] hudi-bot commented on pull request #6781: [HUDI-4123] enchancing deltastreamer sql source tests

2022-10-30 Thread GitBox
hudi-bot commented on PR #6781: URL: https://github.com/apache/hudi/pull/6781#issuecomment-1296318283 ## CI report: * 6ce39eece62fc8eca9448eee52b9f9fb03780fb6 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #6695: [MINOR] adding tests for streaming read mor with compaction

2022-10-30 Thread GitBox
hudi-bot commented on PR #6695: URL: https://github.com/apache/hudi/pull/6695#issuecomment-1296318238 ## CI report: * b4ea04502ce51566a7f9bf49c5120ed148815635 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] xushiyan commented on issue #5735: No hudi dataset was saved to s3

2022-10-30 Thread GitBox
xushiyan commented on issue #5735: URL: https://github.com/apache/hudi/issues/5735#issuecomment-1296317288 close due to inactivity

[GitHub] [hudi] xushiyan closed issue #5735: No hudi dataset was saved to s3

2022-10-30 Thread GitBox
xushiyan closed issue #5735: No hudi dataset was saved to s3 URL: https://github.com/apache/hudi/issues/5735

[GitHub] [hudi] xushiyan commented on issue #5767: [SUPPORT] Meatadata table unable to initialize read with log file

2022-10-30 Thread GitBox
xushiyan commented on issue #5767: URL: https://github.com/apache/hudi/issues/5767#issuecomment-1296316463 > Lowering the priority as there is a workaround @codope can you share what was the workaround here? Looks like some discussion happened offline, could you keep this ticket up to

[GitHub] [hudi] xushiyan commented on issue #5693: [SUPPORT]cant sync to correct hive schema

2022-10-30 Thread GitBox
xushiyan commented on issue #5693: URL: https://github.com/apache/hudi/issues/5693#issuecomment-1296314523 @dongkelun would you be able to help with this? seems like hive sync config for database was not passed to sql properly.

[GitHub] [hudi] xushiyan commented on issue #5685: [SUPPORT] Loading older data with old schema version into Hudi

2022-10-30 Thread GitBox
xushiyan commented on issue #5685: URL: https://github.com/apache/hudi/issues/5685#issuecomment-1296312727 @MikeTipico looks like similar scenario to this one https://github.com/apache/hudi/issues/5683#issuecomment-1296312285 as mentioned there, would need more info to help diagnosis

[GitHub] [hudi] nsivabalan commented on a diff in pull request #6661: [HUDI-4853] Speeding up reading S3 files in S3EventsIncrSource

2022-10-30 Thread GitBox
nsivabalan commented on code in PR #6661: URL: https://github.com/apache/hudi/pull/6661#discussion_r1008901845 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/S3EventsHoodieIncrSource.java: ## @@ -217,11 +220,18 @@ public Pair>, String> fetchNextBatch(Option l

[GitHub] [hudi] xushiyan commented on issue #5683: [SUPPORT] org.apache.hudi.exception.HoodieUpsertException: Failed to merge old record into new file

2022-10-30 Thread GitBox
xushiyan commented on issue #5683: URL: https://github.com/apache/hudi/issues/5683#issuecomment-1296312285 > What we found strange in this case is that we prepared for the change and were no longer selecting the now dropped integer field to be inserted into Hudi. is this in step 5?

[GitHub] [hudi] nsivabalan commented on pull request #5633: [HUDI-4123] Fix the exception due to SqlSource return null checkpoint

2022-10-30 Thread GitBox
nsivabalan commented on PR #5633: URL: https://github.com/apache/hudi/pull/5633#issuecomment-1296305809 not sure what checkpoint refers to in case of sql source. In case of kafka, it refers to offset and while polling for msgs from kafka we honor that. in case of DFS based sources, checkpo

[GitHub] [hudi] nsivabalan commented on a diff in pull request #6862: [HUDI-4989] Fixing deltastreamer init failures

2022-10-30 Thread GitBox
nsivabalan commented on code in PR #6862: URL: https://github.com/apache/hudi/pull/6862#discussion_r1008897155 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java: ## @@ -255,44 +255,72 @@ public DeltaSync(HoodieDeltaStreamer.Config cfg, Spark

[GitHub] [hudi] xushiyan closed issue #6107: [SUPPORT] RO table did not get updated while RT table did

2022-10-30 Thread GitBox
xushiyan closed issue #6107: [SUPPORT] RO table did not get updated while RT table did URL: https://github.com/apache/hudi/issues/6107

[GitHub] [hudi] xushiyan commented on issue #6107: [SUPPORT] RO table did not get updated while RT table did

2022-10-30 Thread GitBox
xushiyan commented on issue #6107: URL: https://github.com/apache/hudi/issues/6107#issuecomment-1296304199 analysis and suggestions were provided above. closing due to inactivity

[GitHub] [hudi] xushiyan closed issue #6104: [SUPPORT] Hope to maintain a stable version

2022-10-30 Thread GitBox
xushiyan closed issue #6104: [SUPPORT] Hope to maintain a stable version URL: https://github.com/apache/hudi/issues/6104

[GitHub] [hudi] xushiyan commented on issue #6104: [SUPPORT] Hope to maintain a stable version

2022-10-30 Thread GitBox
xushiyan commented on issue #6104: URL: https://github.com/apache/hudi/issues/6104#issuecomment-1296303102 0.12.1 was out and 0.12.x is meant to be a stable version

[GitHub] [hudi] xushiyan commented on issue #6102: [SUPPORT]Missing data problem,exigency!!!

2022-10-30 Thread GitBox
xushiyan commented on issue #6102: URL: https://github.com/apache/hudi/issues/6102#issuecomment-1296302757 @Aload as 0.12.1 was out, have you given it a try? would like to know how it goes

[GitHub] [hudi] xushiyan commented on issue #6089: [SUPPORT] Upgrading to 0.11.1 resulting use sparksql

2022-10-30 Thread GitBox
xushiyan commented on issue #6089: URL: https://github.com/apache/hudi/issues/6089#issuecomment-1296302200 > I'll put up a fix to change the default value. @yihua have you made the patch already? let's link the JIRA or PR here. > but I found that the MOR partition table cannot b

[GitHub] [hudi] xushiyan commented on issue #6048: [SUPPORT] S3 throttling while loading a table written with "hoodie.metadata.enable" = true

2022-10-30 Thread GitBox
xushiyan commented on issue #6048: URL: https://github.com/apache/hudi/issues/6048#issuecomment-1296300272 @noahtaite assume things are going well with 0.11 on EMR, good to close this?

[GitHub] [hudi] xushiyan commented on issue #6038: [SUPPORT] MOR taking more time than COW using HoodieJavaWriteClient

2022-10-30 Thread GitBox
xushiyan commented on issue #6038: URL: https://github.com/apache/hudi/issues/6038#issuecomment-1296299606 Thanks for the support @fengjian428 ! @tommss do you need further assistance? we may close this in a week's time due to inactivity

[GitHub] [hudi] xushiyan closed issue #6001: [SUPPORT] Cannot create again after deleting the Hudi external table using Spark SQL

2022-10-30 Thread GitBox
xushiyan closed issue #6001: [SUPPORT] Cannot create again after deleting the Hudi external table using Spark SQL URL: https://github.com/apache/hudi/issues/6001

[GitHub] [hudi] xushiyan commented on issue #6001: [SUPPORT] Cannot create again after deleting the Hudi external table using Spark SQL

2022-10-30 Thread GitBox
xushiyan commented on issue #6001: URL: https://github.com/apache/hudi/issues/6001#issuecomment-1296297443 @JoshuaZhuCN let me clarify: when it comes to deleting the whole table, we support 3 syntaxes - TRUNCATE TABLE: delete all records via file system; table retained in metastore

[GitHub] [hudi] xushiyan commented on issue #5980: [SUPPORT] Insert/Upsert in 0.10.1 is slow compared to 0.8.0

2022-10-30 Thread GitBox
xushiyan commented on issue #5980: URL: https://github.com/apache/hudi/issues/5980#issuecomment-1296292989 > @bkosuru : we have made a lot of fixes around perf in 0.12 on both the read and write sides. Can you try 0.12 and let us know what you see? Please disable bloom filter and column stats. Try

[GitHub] [hudi] xushiyan commented on issue #5547: [SUPPORT] Read Hudi data with flink-1.13.6 and report java.lang.NoSuchMethodError

2022-10-30 Thread GitBox
xushiyan commented on issue #5547: URL: https://github.com/apache/hudi/issues/5547#issuecomment-1296285815 @danny0405 seems like some workarounds are available. We would need to decide on a proper fix. Any suggestions here?

[jira] [Updated] (HUDI-2762) Ensure hive can query insert only logs in MOR

2022-10-30 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2762: - Priority: Critical (was: Major) > Ensure hive can query insert only logs in MOR > ---

[GitHub] [hudi] xushiyan commented on issue #5593: [SUPPORT]I use Hive select Hudi mor table hive ERROR Invalid state: base-flile has to be present

2022-10-30 Thread GitBox
xushiyan commented on issue #5593: URL: https://github.com/apache/hudi/issues/5593#issuecomment-1296285051 This is a known limitation and it's tracked in https://issues.apache.org/jira/browse/HUDI-2762

[GitHub] [hudi] xushiyan closed issue #5593: [SUPPORT]I use Hive select Hudi mor table hive ERROR Invalid state: base-flile has to be present

2022-10-30 Thread GitBox
xushiyan closed issue #5593: [SUPPORT]I use Hive select Hudi mor table hive ERROR Invalid state: base-flile has to be present URL: https://github.com/apache/hudi/issues/5593

[jira] [Updated] (HUDI-2762) Ensure hive can query insert only logs in MOR

2022-10-30 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2762: - Issue Type: Improvement (was: Task) > Ensure hive can query insert only logs in MOR > ---

[jira] [Updated] (HUDI-2762) Ensure hive can query insert only logs in MOR

2022-10-30 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2762: - Component/s: reader-core > Ensure hive can query insert only logs in MOR > ---

[GitHub] [hudi] xushiyan commented on issue #5942: [SUPPORT] Partial Update on Global Index with BLOOM_INDEX_UPDATE_PARTITION_PATH_ENABLE

2022-10-30 Thread GitBox
xushiyan commented on issue #5942: URL: https://github.com/apache/hudi/issues/5942#issuecomment-1296275785 @nsivabalan could you share the latest update on the discussion please?

[jira] [Commented] (HUDI-5091) MergeInto syntax merge_condition does not support Non-Equal

2022-10-30 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17626238#comment-17626238 ] Raymond Xu commented on HUDI-5091: -- [~KnightChess] any plan to pick this up? > MergeInto

[GitHub] [hudi] xushiyan commented on issue #6400: [SUPPORT] MergeInto syntax merge_condition does not support Non-Equal condition

2022-10-30 Thread GitBox
xushiyan commented on issue #6400: URL: https://github.com/apache/hudi/issues/6400#issuecomment-1296275288 @KnightChess thanks for filing the ticket. A good improvement for SQL. Will close this and track the implementation work in Jira.

[GitHub] [hudi] xushiyan closed issue #6400: [SUPPORT] MergeInto syntax merge_condition does not support Non-Equal condition

2022-10-30 Thread GitBox
xushiyan closed issue #6400: [SUPPORT] MergeInto syntax merge_condition does not support Non-Equal condition URL: https://github.com/apache/hudi/issues/6400

[jira] [Updated] (HUDI-5091) MergeInto syntax merge_condition does not support Non-Equal

2022-10-30 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-5091: - Priority: Major (was: Minor) > MergeInto syntax merge_condition does not support Non-Equal >

[jira] [Updated] (HUDI-5091) MergeInto syntax merge_condition does not support Non-Equal

2022-10-30 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-5091: - Fix Version/s: 0.12.2 > MergeInto syntax merge_condition does not support Non-Equal >

[GitHub] [hudi] xushiyan closed issue #6397: [SUPPORT] spark history server - sql tab

2022-10-30 Thread GitBox
xushiyan closed issue #6397: [SUPPORT] spark history server - sql tab URL: https://github.com/apache/hudi/issues/6397

[GitHub] [hudi] xushiyan commented on issue #6397: [SUPPORT] spark history server - sql tab

2022-10-30 Thread GitBox
xushiyan commented on issue #6397: URL: https://github.com/apache/hudi/issues/6397#issuecomment-1296274031 > > Yes exactly that’s what we‘re looking for. Did we miss some configurations? > > I didn't set other parameters about it, I don't know why you can't display it, maybe it's a w

[GitHub] [hudi] xushiyan commented on issue #6389: [SUPPORT] HELP :: Using TWO FIELDS to precombine :: 'hoodie.datasource.write.precombine.field': "column1,column2"

2022-10-30 Thread GitBox
xushiyan commented on issue #6389: URL: https://github.com/apache/hudi/issues/6389#issuecomment-1296273104 > > Unfortunately, there is no out of the box solution to use two fields as preCombine for now. > > Thanks a lot for reply. We are a startup, planning to move to hudi, you might

[GitHub] [hudi] xushiyan closed issue #6389: [SUPPORT] HELP :: Using TWO FIELDS to precombine :: 'hoodie.datasource.write.precombine.field': "column1,column2"

2022-10-30 Thread GitBox
xushiyan closed issue #6389: [SUPPORT] HELP :: Using TWO FIELDS to precombine :: 'hoodie.datasource.write.precombine.field': "column1,column2" URL: https://github.com/apache/hudi/issues/6389 -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [hudi] xushiyan commented on issue #6143: Exception org.apache.hudi.exception.HoodieIOException: Could not read commit details

2022-10-30 Thread GitBox
xushiyan commented on issue #6143: URL: https://github.com/apache/hudi/issues/6143#issuecomment-1296266953 Closed as discussed offline; this issue is no longer recurring.

[GitHub] [hudi] xushiyan closed issue #6143: Exception org.apache.hudi.exception.HoodieIOException: Could not read commit details

2022-10-30 Thread GitBox
xushiyan closed issue #6143: Exception org.apache.hudi.exception.HoodieIOException: Could not read commit details URL: https://github.com/apache/hudi/issues/6143

[GitHub] [hudi] hudi-bot commented on pull request #7036: [HUDI-5076] Fixing non serializable path used in engineContext with metadata table intialization

2022-10-30 Thread GitBox
hudi-bot commented on PR #7036: URL: https://github.com/apache/hudi/pull/7036#issuecomment-1296251916 ## CI report: * a94904a798a927b5f6a6b76b16ef2f358ee358aa Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1266

[GitHub] [hudi] xushiyan commented on issue #6297: [SUPPORT] Flink SQL client cow table query error "org/apache/parquet/column/ColumnDescriptor" (but mor table query normal)

2022-10-30 Thread GitBox
xushiyan commented on issue #6297: URL: https://github.com/apache/hudi/issues/6297#issuecomment-1296234649 > The app uses wrong classloader, would suggest to use the per-job mode instead of yarn-session. Thanks @danny0405 for the suggestion. @fujianhua168 have you tried the suggestio

[GitHub] [hudi] xushiyan commented on issue #6259: [SUPPORT] Write audit publish

2022-10-30 Thread GitBox
xushiyan commented on issue #6259: URL: https://github.com/apache/hudi/issues/6259#issuecomment-1296232384 > if I am not wrong, hudi already has data quality validator that you can run before completing a commit. If the validation fails, the commit will abort. Would that work for your case?

[GitHub] [hudi] xushiyan closed issue #6259: [SUPPORT] Write audit publish

2022-10-30 Thread GitBox
xushiyan closed issue #6259: [SUPPORT] Write audit publish URL: https://github.com/apache/hudi/issues/6259

[GitHub] [hudi] xushiyan closed issue #6209: [SUPPORT] hudi 0.11 not support decimal field precision increase

2022-10-30 Thread GitBox
xushiyan closed issue #6209: [SUPPORT] hudi 0.11 not support decimal field precision increase URL: https://github.com/apache/hudi/issues/6209

[GitHub] [hudi] xushiyan commented on issue #6209: [SUPPORT] hudi 0.11 not support decimal field precision increase

2022-10-30 Thread GitBox
xushiyan commented on issue #6209: URL: https://github.com/apache/hudi/issues/6209#issuecomment-1296227206 Thanks @xiarixiaoyao for showing the right configs and the example. Will close this now.
