[jira] [Updated] (HUDI-3545) Make HoodieAvroWriteSupport class configurable

2024-01-31 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler updated HUDI-3545: -- Status: In Progress (was: Open) > Make HoodieAvroWriteSupport class configurable >

[jira] [Closed] (HUDI-3545) Make HoodieAvroWriteSupport class configurable

2024-01-31 Thread Jonathan Vexler (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler closed HUDI-3545. - Resolution: Fixed > Make HoodieAvroWriteSupport class configurable >

Re: [PR] [Hudi-6868] Support extracting passwords from credential store for Hive Sync [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10577: URL: https://github.com/apache/hudi/pull/10577#issuecomment-1919546729 ## CI report: * 40cbc324442334d3e1313f995c8ae9feed7d0db7 Azure:

Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1919546440 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN *

Re: [PR] [Hudi-6868] Support extracting passwords from credential store for Hive Sync [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10577: URL: https://github.com/apache/hudi/pull/10577#issuecomment-1919532025 ## CI report: * 40cbc324442334d3e1313f995c8ae9feed7d0db7 Azure:

Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1919531736 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN *

Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1919517887 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN *

Re: [PR] [HUDI-7045] Create parquet readers inside the reader context and implement schema.on.read in the filegroup reader in spark [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10278: URL: https://github.com/apache/hudi/pull/10278#issuecomment-1919517087 ## CI report: * d98b47625ecada36364aa02aa1496dafd330c6a9 UNKNOWN * ab0b2127349325a3c939fe65da9d8caaac0da018 UNKNOWN * a6973f9c50ba8fcc6485bc87a8107752988447eb Azure:

Re: [PR] [Hudi-6868] Support extracting passwords from credential store for Hive Sync [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on code in PR #10577: URL: https://github.com/apache/hudi/pull/10577#discussion_r1473131674 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala: ## @@ -998,7 +1000,16 @@ object HoodieSparkSqlWriter {

Re: [PR] [HUDI-7045] Create parquet readers inside the reader context and implement schema.on.read in the filegroup reader in spark [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10278: URL: https://github.com/apache/hudi/pull/10278#issuecomment-1919432441 ## CI report: * d98b47625ecada36364aa02aa1496dafd330c6a9 UNKNOWN * ab0b2127349325a3c939fe65da9d8caaac0da018 UNKNOWN * a6973f9c50ba8fcc6485bc87a8107752988447eb Azure:

Re: [PR] [Hudi-6868] Support extracting passwords from credential store for Hive Sync [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on code in PR #10577: URL: https://github.com/apache/hudi/pull/10577#discussion_r1473085790 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala: ## @@ -998,7 +1000,16 @@ object HoodieSparkSqlWriter {

Re: [I] [SUPPORT] parquet bloom filters not supported by hudi [hudi]

2024-01-31 Thread via GitHub
jonvex commented on issue #7117: URL: https://github.com/apache/hudi/issues/7117#issuecomment-1919400827 https://github.com/apache/hudi/pull/10278 I am working on the FileGroup Reader for Hudi 1.0 and that test was failing but if I change it to accu.add(1) then it works. So that's why I'm

[PR] [Docs] Added known regression note for 0.14.1 release related to ComplexKeyGen [hudi]

2024-01-31 Thread via GitHub
ad1happy2go opened a new pull request, #10597: URL: https://github.com/apache/hudi/pull/10597 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any

Re: [I] too many s3 list when hoodie.metadata.enable=true [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #9751: URL: https://github.com/apache/hudi/issues/9751#issuecomment-1919303669 @ad1happy2go I did internal benchmarks with different versions of hudi here. With metadata enabled across the various versions, I didn't see a significant increase in S3 calls.

Re: [I] [SUPPORT] Unable to read column_stats sub-table of a HUDI table for some tables [hudi]

2024-01-31 Thread via GitHub
codope closed issue #9399: [SUPPORT] Unable to read column_stats sub-table of a HUDI table for some tables URL: https://github.com/apache/hudi/issues/9399 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [I] [SUPPORT] Unable to read column_stats sub-table of a HUDI table for some tables [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #9399: URL: https://github.com/apache/hudi/issues/9399#issuecomment-1919299483 @nandubatchu

Re: [I] [SUPPORT] Unable to read column_stats sub-table of a HUDI table for some tables [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #9399: URL: https://github.com/apache/hudi/issues/9399#issuecomment-1919299172 Closing this out. Please reopen if you are still facing this issue.

Re: [I] [SUPPORT] Datasource incremental subsequent read same as first read [hudi]

2024-01-31 Thread via GitHub
parisni commented on issue #7846: URL: https://github.com/apache/hudi/issues/7846#issuecomment-1919296148 We recently faced a more general problem with the Spark datasource, where subsequent read.table("hudi_table") calls are cached and won't reflect hudi commits unless you restart the context (or

Re: [I] [SUPPORT] Dirty data filtering failed [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #9877: URL: https://github.com/apache/hudi/issues/9877#issuecomment-1919294351 @deasea Sorry for the delay here. @danny0405 Do you have any insights here?

Re: [I] [SUPPORT] hudi sql task hang java.lang.System.exit block [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10112: URL: https://github.com/apache/hudi/issues/10112#issuecomment-1919292489 @zyclove Did you get a chance to try this? Did this PR fix your issue? Please share your insights here. Thanks in advance.

Re: [I] [SUPPORT] parquet bloom filters not supported by hudi [hudi]

2024-01-31 Thread via GitHub
parisni commented on issue #7117: URL: https://github.com/apache/hudi/issues/7117#issuecomment-1919291600 That's a good point. I don't know; I found that code in the Spark tests. The point is that it does increment!

Re: [I] [SUPPORT] HoodieMultiTableDeltaStreamer does not work as expected [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10246: URL: https://github.com/apache/hudi/issues/10246#issuecomment-1919284449 @nttq1sub Sorry for the delay in responding here. Yes, you are correct. It will read from one topic and ingest one table for that MicroBatchExecution, and then run the next table.

Re: [I] Querying Hudi Table Created With Version 0.12.3 Not Working on Trino 430 [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10228: URL: https://github.com/apache/hudi/issues/10228#issuecomment-1919280265 @Amar1404 Ideally, HiveSync should also delegate to AwsGlueCatalogSync if Glue is enabled for EMR, so it should not cause any difference.

Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1919262947 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN *

Re: [I] [SUPPORT] How to do Schema Evolution with Apache Flink DataStream API when doing CDC? [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10349: URL: https://github.com/apache/hudi/issues/10349#issuecomment-1919260878 @danny0405

Re: [I] Partitioning data into two keys is taking more time (10x) than partitioning into one key. [hudi]

2024-01-31 Thread via GitHub
maheshguptags commented on issue #10456: URL: https://github.com/apache/hudi/issues/10456#issuecomment-1919229376 Hi @ad1happy2go, There is a small correction on the commit file size. > which ultimately causing OOM due to 400MB commit files. It's a 41 MB commit file size

Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1919174513 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN *

Re: [PR] [HUDI-7362] Fix hudi partition base path scheme to s3 [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10596: URL: https://github.com/apache/hudi/pull/10596#issuecomment-1919145004 ## CI report: * febb22c2b62f65657dbe46f4242ca032dd64185f Azure:

Re: [PR] [HUDI-7146] Support non-unique keys for secondary index [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10211: URL: https://github.com/apache/hudi/pull/10211#issuecomment-1919143807 ## CI report: * b3c87bc228fa2be4558a349c9f44f47a695f8a8d Azure:

Re: [PR] [HUDI-7146] Support non-unique keys for secondary index [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10211: URL: https://github.com/apache/hudi/pull/10211#issuecomment-1919059558 ## CI report: * d97a61842c678093b17ae5c42f95a1f4e2aa925f Azure:

Re: [PR] [HUDI-9424]Support using local timezone when writing flink TIMESTAMP data [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10594: URL: https://github.com/apache/hudi/pull/10594#issuecomment-1919049335 ## CI report: * d5f75fac99c5cd1039f0418e0900fc3aae608a33 Azure:

Re: [PR] [HUDI-7146] Support non-unique keys for secondary index [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10211: URL: https://github.com/apache/hudi/pull/10211#issuecomment-1919048326 ## CI report: * d97a61842c678093b17ae5c42f95a1f4e2aa925f Azure:

Re: [I] [SUPPORT] Hudi 0.13.1 on EMR, MOR table writer hangs intermittently with S3 read timeout error for column stats index [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10415: URL: https://github.com/apache/hudi/issues/10415#issuecomment-1919021038 Thanks for trying, @ergophobiac. @CTTY any insights here?

Re: [I] [SUPPORT] Inconsistent Checkpoint Size in Flink Applications with MoR [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10329: URL: https://github.com/apache/hudi/issues/10329#issuecomment-1919019227 cc @danny0405

Re: [I] [SUPPORT] CoW: Hudi Upsert not working when there is a timestamp field in the composite key [hudi]

2024-01-31 Thread via GitHub
codope closed issue #10303: [SUPPORT] CoW: Hudi Upsert not working when there is a timestamp field in the composite key URL: https://github.com/apache/hudi/issues/10303

Re: [I] [SUPPORT] CoW: Hudi Upsert not working when there is a timestamp field in the composite key [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10303: URL: https://github.com/apache/hudi/issues/10303#issuecomment-1919013789 @srinikandi Closing out this issue. Please reopen if you still face this issue after setting `hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled`

Re: [I] [SUPPORT] Can't delete key (row) for all commits in HUDI Table (history)? [hudi]

2024-01-31 Thread via GitHub
jens4doc commented on issue #10581: URL: https://github.com/apache/hudi/issues/10581#issuecomment-1919012227 Thank you. It is unfortunate that the right to be forgotten cannot be applied by HUDI.

Re: [I] [SUPPORT] Can't delete key (row) for all commits in HUDI Table (history)? [hudi]

2024-01-31 Thread via GitHub
jens4doc closed issue #10581: [SUPPORT] Can't delete key (row) for all commits in HUDI Table (history)? URL: https://github.com/apache/hudi/issues/10581

Re: [I] [SUPPORT] MOR hudi 0.14, Bloom Filters are not being used on query time [hudi]

2024-01-31 Thread via GitHub
bk-mz commented on issue #10511: URL: https://github.com/apache/hudi/issues/10511#issuecomment-1919010119 >when number of output rows with bloom is clearly lot less than number of output rows without bloom. @ad1happy2go The query performance is same for both ro and snapshot

Re: [I] Duplicate Row in Same Partition using Global Bloom Index [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #9536: URL: https://github.com/apache/hudi/issues/9536#issuecomment-1919009780 @Raghvendradubey Closing this. Please reopen if you still face this issue with this PR or 0.14.1. Thanks a lot for raising this.

Re: [I] [SUPPORT] Deltastreamer throws exception when ingesting INT96 timestamps [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #9151: URL: https://github.com/apache/hudi/issues/9151#issuecomment-1919005476 @satyasinha-94 Any update on this? Were you able to get your issue resolved? Please share your insights.

Re: [I] [BUG]Data duplication, multiple data primary keys are duplicated [hudi]

2024-01-31 Thread via GitHub
codope closed issue #10545: [BUG]Data duplication, multiple data primary keys are duplicated URL: https://github.com/apache/hudi/issues/10545

Re: [I] [BUG]Data duplication, multiple data primary keys are duplicated [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10545: URL: https://github.com/apache/hudi/issues/10545#issuecomment-1919000324 @waywtdcc Closing this out. Please reopen in case you need any more help on this.

Re: [I] [Support] An error occurred while calling o1748.load.\n: java.io.FileNotFoundException [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10503: URL: https://github.com/apache/hudi/issues/10503#issuecomment-1918998944 So you mean Spark standalone mode? Does that mode work for 0.14 as well?

Re: [I] [Support] An error occurred while calling o1748.load.\n: java.io.FileNotFoundException [hudi]

2024-01-31 Thread via GitHub
gsudhanshu commented on issue #10503: URL: https://github.com/apache/hudi/issues/10503#issuecomment-1918985618 @ad1happy2go it is working in 0.13.1 and standalone mode

Re: [I] [SUPPORT] Hope Hudi 0.13. 1 can support Flink 1.17+ [hudi]

2024-01-31 Thread via GitHub
codope closed issue #10434: [SUPPORT] Hope Hudi 0.13. 1 can support Flink 1.17+ URL: https://github.com/apache/hudi/issues/10434

Re: [I] [SUPPORT] Hope Hudi 0.13. 1 can support Flink 1.17+ [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10434: URL: https://github.com/apache/hudi/issues/10434#issuecomment-1918970432 @lmhongwei Closing this issue. Please reopen in case you need any further help.

Re: [I] [SUPPORT] Flink streaming read MOR table, thrown Unexpected cdc file split infer case: LOG_FILE Exception [hudi]

2024-01-31 Thread via GitHub
codope closed issue #10539: [SUPPORT] Flink streaming read MOR table, thrown Unexpected cdc file split infer case: LOG_FILE Exception URL: https://github.com/apache/hudi/issues/10539

Re: [I] [SUPPORT] Error Category: UNCLASSIFIED_ERROR; An error occurred while calling o230.save. Parquet/Avro schema mismatch: Avro field 'id' not found [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10555: URL: https://github.com/apache/hudi/issues/10555#issuecomment-1918966565 Any update here, @jayesh2424?

Re: [I] [SUPPORT] Flink streaming read MOR table, thrown Unexpected cdc file split infer case: LOG_FILE Exception [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10539: URL: https://github.com/apache/hudi/issues/10539#issuecomment-1918968410 @nicholasxu Closing out this issue. Please reopen or create a new one in case of any further queries/issues. Thanks.

Re: [I] [SUPPORT] MOR hudi 0.14, Bloom Filters are not being used on query time [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10511: URL: https://github.com/apache/hudi/issues/10511#issuecomment-1918965640 @bk-mz Why do you think "indexing and statistical means in hudi are ineffective" when the number of output rows with bloom is clearly a lot less than the number of output rows without

Re: [PR] [HUDI-7362] Fix hudi partition base path scheme to s3 [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10596: URL: https://github.com/apache/hudi/pull/10596#issuecomment-1918959104 ## CI report: * febb22c2b62f65657dbe46f4242ca032dd64185f Azure:

Re: [I] [SUPPORT] HUDI baseFile is empty String and this causes IllegalArgumentException [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10458: URL: https://github.com/apache/hudi/issues/10458#issuecomment-1918953387 I will work on updating the docs. Thanks @stayrascal

Re: [I] Partitioning data into two keys is taking more time (10x) than partitioning into one key. [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10456: URL: https://github.com/apache/hudi/issues/10456#issuecomment-1918948729 @xicm @danny0405 Had a discussion with @maheshguptags. Let me try to summarise his issue. He has around 5000 partitions in total and is using the bucket index. When he

Re: [PR] [HUDI-7362] Fix hudi partition base path scheme to s3 [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10596: URL: https://github.com/apache/hudi/pull/10596#issuecomment-1918891968 ## CI report: * febb22c2b62f65657dbe46f4242ca032dd64185f UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

[jira] [Updated] (HUDI-7362) Athena does not support s3a partition scheme anymore leading to missing data

2024-01-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7362: - Labels: pull-request-available (was: ) > Athena does not support s3a partition scheme anymore

[jira] [Created] (HUDI-7362) Athena does not support s3a partition scheme anymore leading to missing data

2024-01-31 Thread nicolas paris (Jira)
nicolas paris created HUDI-7362: --- Summary: Athena does not support s3a partition scheme anymore leading to missing data Key: HUDI-7362 URL: https://issues.apache.org/jira/browse/HUDI-7362 Project:

[PR] Fix hudi partition base path scheme to s3 [hudi]

2024-01-31 Thread via GitHub
parisni opened a new pull request, #10596: URL: https://github.com/apache/hudi/pull/10596 ### Change Logs Fixes #10595 ### Impact People having the issue should drop the glue table and recreate it from scratch w/ this patch ### Risk level (write none, low medium

[jira] [Commented] (HUDI-7287) Exception in streaming read while querying tables with 'cdc.enabled' is true

2024-01-31 Thread Aditya Goenka (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812652#comment-17812652 ] Aditya Goenka commented on HUDI-7287: - Need documentation update here, as only MOR supports cdc. as

Re: [I] Hudi behaviour if AWS Glue concurrency is triggered[SUPPORT] [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10559: URL: https://github.com/apache/hudi/issues/10559#issuecomment-1918835176 @rishabhreply It will handle and process all 10 files. It is a simple Spark/distributed computing concept to process files in parallel. Let me know in case I am missing

Re: [I] [SUPPORT] Datasource incremental subsequent read same as first read [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #7846: URL: https://github.com/apache/hudi/issues/7846#issuecomment-1918831478 adding @beyond1920 @yihua @nsivabalan for more insights here.

Re: [I] Partitioning data into two keys is taking more time (10x) than partitioning into one key. [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10456: URL: https://github.com/apache/hudi/issues/10456#issuecomment-1918825052 @maheshguptags Let's get on a call to discuss this further.

Re: [I] [Support] An error occurred while calling o1748.load.\n: java.io.FileNotFoundException [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10503: URL: https://github.com/apache/hudi/issues/10503#issuecomment-1918819721 @gsudhanshu Can you let us know whether just downgrading the Hudi version to 0.13.1 makes your existing setup work? If yes, then we need to dig deep and something in 0.14 release

Re: [I] The Schema Evolution Not working For Hudi 0.12.3 [hudi]

2024-01-31 Thread via GitHub
ad1happy2go commented on issue #10309: URL: https://github.com/apache/hudi/issues/10309#issuecomment-1918797127 @lei-su-awx I tried this code with 0.14.1 and it worked fine. With 0.14.0 I can see the error. @lei-su-awx @Amar1404 Can you guys try with 0.14.1 and let me know in case

(hudi) branch master updated (e466fb221b5 -> c5573ab34b2)

2024-01-31 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from e466fb221b5 [HUDI-7345] Remove usage of org.apache.hadoop.util.VersionUtil (#10571) add c5573ab34b2 [HUDI-7344]

(hudi) branch master updated (a078242b19d -> e466fb221b5)

2024-01-31 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git from a078242b19d [HUDI-7343] Replace Path.SEPARATOR with HoodieLocation.SEPARATOR (#10570) add e466fb221b5 [HUDI-7345]

Re: [PR] [HUDI-9424]Support using local timezone when writing flink TIMESTAMP data [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10594: URL: https://github.com/apache/hudi/pull/10594#issuecomment-1918779515 ## CI report: * d5f75fac99c5cd1039f0418e0900fc3aae608a33 Azure:

Re: [PR] [HUDI-9424]Support using local timezone when writing flink TIMESTAMP data [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10594: URL: https://github.com/apache/hudi/pull/10594#issuecomment-1918766300 ## CI report: * d5f75fac99c5cd1039f0418e0900fc3aae608a33 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run

Re: [PR] [HUDI-9424]Support using local timezone when writing flink TIMESTAMP data [hudi]

2024-01-31 Thread via GitHub
cmmp6 commented on PR #10594: URL: https://github.com/apache/hudi/pull/10594#issuecomment-1918722320 Related items: https://github.com/apache/hudi/pull/7886, https://github.com/apache/hudi/pull/7904

Re: [PR] [HUDI-9424]Support using local timezone when writing flink TIMESTAMP data [hudi]

2024-01-31 Thread via GitHub
cmmp6 commented on PR #10594: URL: https://github.com/apache/hudi/pull/10594#issuecomment-1918703432 @danny0405 please review this PR.

Re: [PR] [HUDI-9424]Support using local timezone when writing flink TIMESTAMP data [hudi]

2024-01-31 Thread via GitHub
cmmp6 commented on PR #10594: URL: https://github.com/apache/hudi/pull/10594#issuecomment-1918701697 This PR is to solve the problem in https://github.com/apache/hudi/issues/9424

[PR] [HUDI-9424]Support using local timezone when writing flink TIMESTAMP data [hudi]

2024-01-31 Thread via GitHub
cmmp6 opened a new pull request, #10594: URL: https://github.com/apache/hudi/pull/10594 ### Change Logs This PR makes the changes to support using local timezone when writing flink TIMESTAMP data. ### Impact User can use utc or local timezone to write flink TIMESTAMP

[jira] [Closed] (HUDI-7361) Fix a concurrency issue caused by clean

2024-01-31 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric closed HUDI-7361. -- Fix Version/s: 0.14.0 Resolution: Fixed > Fix a concurrency issue caused by clean >

Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1918675170 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN *

[jira] [Updated] (HUDI-7361) Fix a concurrency issue caused by clean

2024-01-31 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-7361: --- Summary: Fix a concurrency issue caused by clean (was: Fix a concurrency issue caused by rollbackFailedWrites) >

[jira] [Updated] (HUDI-7361) Fix a concurrency issue caused by rollbackFailedWrites

2024-01-31 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-7361: --- Attachment: (was: taskmanager_log.txt) > Fix a concurrency issue caused by rollbackFailedWrites >

[jira] [Resolved] (HUDI-7361) Fix a concurrency issue caused by rollbackFailedWrites

2024-01-31 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric resolved HUDI-7361. > Fix a concurrency issue caused by rollbackFailedWrites > -- > >

[jira] [Updated] (HUDI-7361) Fix a concurrency issue caused by rollbackFailedWrites

2024-01-31 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-7361: --- Component/s: (was: writer-core) Description: (was: {quote}CREATE TABLE tbl ( .. ) WITH ( 'connector' =

[jira] [Updated] (HUDI-7361) Fix a concurrency issue caused by rollbackFailedWrites

2024-01-31 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] eric updated HUDI-7361: --- Attachment: (was: jobmanager_log.txt) > Fix a concurrency issue caused by rollbackFailedWrites >

[jira] [Commented] (HUDI-7361) Fix a concurrency issue caused by rollbackFailedWrites

2024-01-31 Thread eric (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812583#comment-17812583 ] eric commented on HUDI-7361: [[HUDI-5675] fix lazy clean schedule rollback on completed instant by stream2000

Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1918663165 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN *

Re: [PR] [HUDI-7361] Fix a concurrency issue caused by rollbackFailedWrites [hudi]

2024-01-31 Thread via GitHub
eric9204 commented on PR #10593: URL: https://github.com/apache/hudi/pull/10593#issuecomment-1918662863 This issue has been resolved by this PR: https://github.com/apache/hudi/pull/7826

Re: [I] The Schema Evolution Not working For Hudi 0.12.3 [hudi]

2024-01-31 Thread via GitHub
lei-su-awx commented on issue #10309: URL: https://github.com/apache/hudi/issues/10309#issuecomment-1918659000 I had a similar question: when the table schema is double and the incoming data schema is long, why can't the data be upserted into the table? I think double can handle long. Below is

[jira] [Updated] (HUDI-7361) Fix a concurrency issue caused by rollbackFailedWrites

2024-01-31 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7361: - Labels: pull-request-available (was: ) > Fix a concurrency issue caused by rollbackFailedWrites

Re: [PR] [HUDI-7361] Fix a concurrency issue caused by rollbackFailedWrites [hudi]

2024-01-31 Thread via GitHub
eric9204 closed pull request #10593: [HUDI-7361] Fix a concurrency issue caused by rollbackFailedWrites URL: https://github.com/apache/hudi/pull/10593

Re: [PR] fix(HoodieRecord): add serialVersionUID [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10592: URL: https://github.com/apache/hudi/pull/10592#issuecomment-1918651993 ## CI report: * 8de03a278356eafd1cf9f012d58a5993f5314b56 Azure:

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10591: URL: https://github.com/apache/hudi/pull/10591#issuecomment-1918651897 ## CI report: * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN * 4e39d3ba20d5d2236e599a55c96a9c731ed721c0 Azure:

Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-01-31 Thread via GitHub
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1918651519 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN *
