[GitHub] [hudi] hudi-bot commented on pull request #6239: [HUDI-4501] Throwing exception when restore is attempted with hoodie.arhive.beyond.savepoint is enabled

2022-07-28 Thread GitBox
hudi-bot commented on PR #6239: URL: https://github.com/apache/hudi/pull/6239#issuecomment-1198907872 ## CI report: * b1a3c6eb8bbabf29e95a20d866b9a3c83681ad48 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #6234: [HUDI-4221] Optimzing getAllPartitionPaths

2022-07-28 Thread GitBox
hudi-bot commented on PR #6234: URL: https://github.com/apache/hudi/pull/6234#issuecomment-1198905664 ## CI report: * 1540cb6e143bc4de4d012dc6e8aba6ed7b194fea UNKNOWN * a7d72c57e52f80e3509c2675605d03c34813c702 Azure:

[jira] [Updated] (HUDI-4501) throw exception for "restore" when "hoodie.archive.beyond.savepoint" is enabled

2022-07-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4501: - Labels: pull-request-available (was: ) > throw exception for "restore" when

[GitHub] [hudi] nsivabalan opened a new pull request, #6239: [HUDI-4501] Throwing exception when restore is attempted with hoodie.arhive.beyond.savepoint is enabled

2022-07-28 Thread GitBox
nsivabalan opened a new pull request, #6239: URL: https://github.com/apache/hudi/pull/6239 ## What is the purpose of the pull request When "hoodie.archive.beyond.savepoint" is enabled, we can't support "restore" as there could be holes in the timeline. We have a

[jira] [Updated] (HUDI-4501) throw exception for "restore" when "hoodie.archive.beyond.savepoint" is enabled

2022-07-28 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-4501: -- Epic Link: HUDI-3967 > throw exception for "restore" when

[jira] [Created] (HUDI-4501) throw exception for "restore" when "hoodie.archive.beyond.savepoint" is enabled

2022-07-28 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-4501: - Summary: throw exception for "restore" when "hoodie.archive.beyond.savepoint" is enabled Key: HUDI-4501 URL: https://issues.apache.org/jira/browse/HUDI-4501

[GitHub] [hudi] hudi-bot commented on pull request #6238: [HUDI-4499] Tweak default retry times for flink metadata table lock

2022-07-28 Thread GitBox
hudi-bot commented on PR #6238: URL: https://github.com/apache/hudi/pull/6238#issuecomment-1198877219 ## CI report: * 9659bfe7b2144533b08ed14ba1efdde4ac3b2ec7 Azure:

[jira] [Created] (HUDI-4500) Support restore to savepoint with holes in the timeline

2022-07-28 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-4500: - Summary: Support restore to savepoint with holes in the timeline Key: HUDI-4500 URL: https://issues.apache.org/jira/browse/HUDI-4500 Project: Apache Hudi Issue

[GitHub] [hudi] hudi-bot commented on pull request #6238: [HUDI-4499] Tweak default retry times for flink metadata table lock

2022-07-28 Thread GitBox
hudi-bot commented on PR #6238: URL: https://github.com/apache/hudi/pull/6238#issuecomment-1198875403 ## CI report: * 9659bfe7b2144533b08ed14ba1efdde4ac3b2ec7 UNKNOWN

[GitHub] [hudi] nsivabalan commented on pull request #5837: [HUDI-3884] Support archival beyond savepoint commits

2022-07-28 Thread GitBox
nsivabalan commented on PR #5837: URL: https://github.com/apache/hudi/pull/5837#issuecomment-1198874544 But don't we at least need to fail fast? That is, when someone tries to do a restore while this config is enabled, throw an exception saying that "restore is not supported when this config is
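The fail-fast behaviour suggested in this comment could be sketched roughly as follows. This is a minimal illustration in Python, not Hudi's actual implementation: the functions `is_enabled`, `validate_restore_allowed`, and `restore`, the exception class `RestoreNotSupportedError`, and the dict-based config are all hypothetical. Only the config key `hoodie.archive.beyond.savepoint` comes from the thread.

```python
# Hypothetical sketch of a fail-fast guard. Every name except the
# config key itself is an assumption made for illustration.
ARCHIVE_BEYOND_SAVEPOINT = "hoodie.archive.beyond.savepoint"


class RestoreNotSupportedError(Exception):
    """Raised when a restore is requested while the timeline may have holes."""


def is_enabled(config: dict, key: str) -> bool:
    # Treat a missing key as disabled (an assumption for this sketch);
    # accept both booleans and strings such as "true".
    return str(config.get(key, "false")).strip().lower() == "true"


def validate_restore_allowed(config: dict) -> None:
    # Check before any restore work starts, so nothing is partially undone.
    if is_enabled(config, ARCHIVE_BEYOND_SAVEPOINT):
        raise RestoreNotSupportedError(
            f"restore is not supported when {ARCHIVE_BEYOND_SAVEPOINT} is enabled"
        )


def restore(config: dict, savepoint_time: str) -> str:
    validate_restore_allowed(config)
    # Placeholder for the real restore logic, which is out of scope here.
    return f"restored to {savepoint_time}"
```

Running the check as the first statement of `restore` is the point of the design: the caller gets a clear error immediately instead of a failure partway through the operation.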

[jira] [Updated] (HUDI-4499) Tweak default retry times for flink metadata table lock

2022-07-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4499: - Labels: pull-request-available (was: ) > Tweak default retry times for flink metadata table lock

[GitHub] [hudi] danny0405 opened a new pull request, #6238: [HUDI-4499] Tweak default retry times for flink metadata table lock

2022-07-28 Thread GitBox
danny0405 opened a new pull request, #6238: URL: https://github.com/apache/hudi/pull/6238 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the

[jira] [Created] (HUDI-4499) Tweak default retry times for flink metadata table lock

2022-07-28 Thread Danny Chen (Jira)
Danny Chen created HUDI-4499: Summary: Tweak default retry times for flink metadata table lock Key: HUDI-4499 URL: https://issues.apache.org/jira/browse/HUDI-4499 Project: Apache Hudi Issue

[jira] [Closed] (HUDI-4498) Insert mysql catalog table failed

2022-07-28 Thread Jira
[ https://issues.apache.org/jira/browse/HUDI-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] 董可伦 closed HUDI-4498. - Resolution: Fixed > Insert mysql catalog table failed > - > > Key:

[GitHub] [hudi] vamshigv commented on a diff in pull request #6228: [HUDI-4488] Improve S3EventsHoodieIncrSource efficiency

2022-07-28 Thread GitBox
vamshigv commented on code in PR #6228: URL: https://github.com/apache/hudi/pull/6228#discussion_r932859360 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/S3EventsHoodieIncrSource.java: ## @@ -172,37 +177,47 @@ public Pair>, String> fetchNextBatch(Option

[jira] [Assigned] (HUDI-4498) Insert mysql catalog table failed

2022-07-28 Thread Jira
[ https://issues.apache.org/jira/browse/HUDI-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] 董可伦 reassigned HUDI-4498: - Assignee: 董可伦 > Insert mysql catalog table failed > - > > Key:

[jira] [Created] (HUDI-4498) Insert mysql catalog table failed

2022-07-28 Thread Jira
董可伦 created HUDI-4498: - Summary: Insert mysql catalog table failed Key: HUDI-4498 URL: https://issues.apache.org/jira/browse/HUDI-4498 Project: Apache Hudi Issue Type: Bug Components:

[jira] [Created] (HUDI-4497) Vet all critical code paths for double-checked locking

2022-07-28 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-4497: - Summary: Vet all critical code paths for double-checked locking Key: HUDI-4497 URL: https://issues.apache.org/jira/browse/HUDI-4497 Project: Apache Hudi Issue

[GitHub] [hudi] codope commented on pull request #5837: [HUDI-3884] Support archival beyond savepoint commits

2022-07-28 Thread GitBox
codope commented on PR #5837: URL: https://github.com/apache/hudi/pull/5837#issuecomment-1198849123 > @codope @yihua : what's the consensus here on supporting restore when this config is enabled? We should support it. Punted on it due to time constraints for the release. But I can

[GitHub] [hudi] hudi-bot commented on pull request #6234: [HUDI-4221] Optimzing getAllPartitionPaths

2022-07-28 Thread GitBox
hudi-bot commented on PR #6234: URL: https://github.com/apache/hudi/pull/6234#issuecomment-1198848440 ## CI report: * 1540cb6e143bc4de4d012dc6e8aba6ed7b194fea UNKNOWN * Unknown: [CANCELED](TBD) * a7d72c57e52f80e3509c2675605d03c34813c702 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6234: [HUDI-4221] Optimzing getAllPartitionPaths

2022-07-28 Thread GitBox
hudi-bot commented on PR #6234: URL: https://github.com/apache/hudi/pull/6234#issuecomment-1198846736 ## CI report: * 1540cb6e143bc4de4d012dc6e8aba6ed7b194fea UNKNOWN * Unknown: [CANCELED](TBD) * a7d72c57e52f80e3509c2675605d03c34813c702 UNKNOWN

[GitHub] [hudi] YannByron commented on issue #6223: [SUPPORT] [BUG] SparkSQL Insert Into other catalog Error

2022-07-28 Thread GitBox
YannByron commented on issue #6223: URL: https://github.com/apache/hudi/issues/6223#issuecomment-1198846762 @huzk8 please give the whole stack trace to help me address this. Also, maybe you should use the `catalog.database.table` format to insert into, not `mysql.st` which only has

[GitHub] [hudi] hudi-bot commented on pull request #6225: [HUDI-4487] support to create ro/rt table by spark sql

2022-07-28 Thread GitBox
hudi-bot commented on PR #6225: URL: https://github.com/apache/hudi/pull/6225#issuecomment-1198846719 ## CI report: * de8c1ae0ed8433f13e2f2e3087bc31499a9b3c05 Azure:

[GitHub] [hudi] eric9204 commented on issue #6011: [SUPPORT] HoodieFlinkCompactor failed

2022-07-28 Thread GitBox
eric9204 commented on issue #6011: URL: https://github.com/apache/hudi/issues/6011#issuecomment-1198845658 > @eric9204 Can you show the start command, I can't reproduce this problem locally, is it not a separate process that is started? @yuzhaojing `bin/flink run -t yarn-per-job -d

[GitHub] [hudi] hudi-bot commented on pull request #6225: [HUDI-4487] support to create ro/rt table by spark sql

2022-07-28 Thread GitBox
hudi-bot commented on PR #6225: URL: https://github.com/apache/hudi/pull/6225#issuecomment-1198844906 ## CI report: * de8c1ae0ed8433f13e2f2e3087bc31499a9b3c05 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6234: [HUDI-4221] Optimzing getAllPartitionPaths

2022-07-28 Thread GitBox
hudi-bot commented on PR #6234: URL: https://github.com/apache/hudi/pull/6234#issuecomment-1198844948 ## CI report: * Unknown: [CANCELED](TBD) * 1540cb6e143bc4de4d012dc6e8aba6ed7b194fea UNKNOWN

[GitHub] [hudi] YannByron commented on issue #6232: [SUPPORT] Hudi V0.9 truncating second precision for timestamp columns

2022-07-28 Thread GitBox
YannByron commented on issue #6232: URL: https://github.com/apache/hudi/issues/6232#issuecomment-1198839746 @neerajpadarthi I guess you are using the Spark DataFrame API; if so, maybe you can try setting `spark.sql.parquet.writeLegacyFormat` to `TIMESTAMP_MICROS` when creating the `SparkSession` object.

[hudi] branch master updated: [HUDI-4495] Fix handling of S3 paths incompatible with java URI standards (#6237)

2022-07-28 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new c39e88dcf0 [HUDI-4495] Fix handling of S3 paths

[GitHub] [hudi] yihua merged pull request #6237: [HUDI-4495] Fix handling of S3 paths incompatible with java URI stand…

2022-07-28 Thread GitBox
yihua merged PR #6237: URL: https://github.com/apache/hudi/pull/6237 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [hudi] nsivabalan commented on pull request #5837: [HUDI-3884] Support archival beyond savepoint commits

2022-07-28 Thread GitBox
nsivabalan commented on PR #5837: URL: https://github.com/apache/hudi/pull/5837#issuecomment-1198830203 @codope @yihua: what's the consensus here on supporting restore when this config is enabled?

[GitHub] [hudi] YannByron commented on pull request #6225: [HUDI-4487] support to create ro/rt table by spark sql

2022-07-28 Thread GitBox
YannByron commented on PR #6225: URL: https://github.com/apache/hudi/pull/6225#issuecomment-1198829691 @hudi-bot run azure

[GitHub] [hudi] nsivabalan commented on pull request #6234: [HUDI-4221] Optimzing getAllPartitionPaths

2022-07-28 Thread GitBox
nsivabalan commented on PR #6234: URL: https://github.com/apache/hudi/pull/6234#issuecomment-1198828386 @hudi-bot run azure

[jira] [Comment Edited] (HUDI-4485) Hudi cli got empty result for command show fsview all

2022-07-28 Thread Yao Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17571705#comment-17571705 ] Yao Zhang edited comment on HUDI-4485 at 7/29/22 2:55 AM: -- This problem is caused

[jira] [Comment Edited] (HUDI-4485) Hudi cli got empty result for command show fsview all

2022-07-28 Thread Yao Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17572719#comment-17572719 ] Yao Zhang edited comment on HUDI-4485 at 7/29/22 2:53 AM: -- Hi all, After further

[jira] [Updated] (HUDI-4485) Hudi cli got empty result for command show fsview all

2022-07-28 Thread Yao Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yao Zhang updated HUDI-4485: Status: In Progress (was: Open) > Hudi cli got empty result for command show fsview all >

[jira] [Commented] (HUDI-4485) Hudi cli got empty result for command show fsview all

2022-07-28 Thread Yao Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17572719#comment-17572719 ] Yao Zhang commented on HUDI-4485: - Hi all, After further investigation I found that Spring-shell 1.2.0

[GitHub] [hudi] codope closed pull request #6235: [HUDI-4493] Fixing handling of corrupt rollback plans

2022-07-28 Thread GitBox
codope closed pull request #6235: [HUDI-4493] Fixing handling of corrupt rollback plans URL: https://github.com/apache/hudi/pull/6235

[GitHub] [hudi] hudi-bot commented on pull request #6016: [HUDI-4465] Optimizing file-listing sequence of Metadata Table

2022-07-28 Thread GitBox
hudi-bot commented on PR #6016: URL: https://github.com/apache/hudi/pull/6016#issuecomment-1198816105 ## CI report: * 705660efda3e17a13071c7ab3550daceefa9d3b8 Azure:

[GitHub] [hudi] TengHuo closed issue #6208: [SUPPORT] Hudi append only pipeline failed due to parquet FileNotFoundException

2022-07-28 Thread GitBox
TengHuo closed issue #6208: [SUPPORT] Hudi append only pipeline failed due to parquet FileNotFoundException URL: https://github.com/apache/hudi/issues/6208

[GitHub] [hudi] TengHuo commented on issue #6208: [SUPPORT] Hudi append only pipeline failed due to parquet FileNotFoundException

2022-07-28 Thread GitBox
TengHuo commented on issue #6208: URL: https://github.com/apache/hudi/issues/6208#issuecomment-1198815645 Yeah, sure, the patch fixed my pipeline, so let me close this issue. I will add more detail about the root cause here later. Thanks!

[GitHub] [hudi] yuzhaojing commented on issue #6011: [SUPPORT] HoodieFlinkCompactor failed

2022-07-28 Thread GitBox
yuzhaojing commented on issue #6011: URL: https://github.com/apache/hudi/issues/6011#issuecomment-1198812966 @eric9204 Can you show the start command, I can't reproduce this problem locally, is it not a separate process that is started?

[GitHub] [hudi] xiarixiaoyao commented on issue #6209: [SUPPORT] hudi 0.11 not support decimal field precision increase

2022-07-28 Thread GitBox
xiarixiaoyao commented on issue #6209: URL: https://github.com/apache/hudi/issues/6209#issuecomment-1198804788 @yihua @lz1984sh yes, schema evolution can support that case; ``` spark.sql("set hoodie.schema.on.read.enable=true") // NOTE: This is required since

[GitHub] [hudi] xiarixiaoyao commented on a diff in pull request #6213: [HUDI-4081][HUDI-4472] Addressing Spark SQL vs Spark DS performance gap

2022-07-28 Thread GitBox
xiarixiaoyao commented on code in PR #6213: URL: https://github.com/apache/hudi/pull/6213#discussion_r932802177 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala: ## @@ -241,39 +240,49 @@ object HoodieSparkSqlWriter {

[GitHub] [hudi] xushiyan commented on a diff in pull request #6237: [HUDI-4495] Fix handling of S3 paths incompatible with java URI stand…

2022-07-28 Thread GitBox
xushiyan commented on code in PR #6237: URL: https://github.com/apache/hudi/pull/6237#discussion_r932800022 ## hudi-common/src/main/java/org/apache/hudi/common/fs/HoodieWrapperFileSystem.java: ## @@ -145,8 +145,11 @@ public static Path convertPathWithScheme(Path oldPath,

[GitHub] [hudi] hudi-bot commented on pull request #6237: [HUDI-4495] Fix handling of S3 paths incompatible with java URI stand…

2022-07-28 Thread GitBox
hudi-bot commented on PR #6237: URL: https://github.com/apache/hudi/pull/6237#issuecomment-1198785903 ## CI report: * c4b81c6e6b0476203aca974dfb10cfc4f389480a Azure:

[GitHub] [hudi] yihua commented on issue #6208: [SUPPORT] Hudi append only pipeline failed due to parquet FileNotFoundException

2022-07-28 Thread GitBox
yihua commented on issue #6208: URL: https://github.com/apache/hudi/issues/6208#issuecomment-1198774372 @TengHuo if the problem is fully resolved, feel free to close this issue. The discussion can still happen here regarding the root cause and the merged fix.

[GitHub] [hudi] yihua commented on issue #6209: [SUPPORT] hudi 0.11 not support decimal field precision increase

2022-07-28 Thread GitBox
yihua commented on issue #6209: URL: https://github.com/apache/hudi/issues/6209#issuecomment-1198758964 @xiarixiaoyao do you know if schema evolution supports such a use case?

[GitHub] [hudi] yihua commented on issue #6223: [SUPPORT] [BUG] SparkSQL Insert Into other catalog Error

2022-07-28 Thread GitBox
yihua commented on issue #6223: URL: https://github.com/apache/hudi/issues/6223#issuecomment-1198753764 @xiarixiaoyao @YannByron @XuQianJin-Stars @alexeykudinkin can any of you help here?

[GitHub] [hudi] danny0405 commented on a diff in pull request #6222: [HUDI-4484] Add default lock config options for flink metadata table

2022-07-28 Thread GitBox
danny0405 commented on code in PR #6222: URL: https://github.com/apache/hudi/pull/6222#discussion_r932769944 ## hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/ITTestHoodieDataSource.java: ## @@ -236,11 +236,16 @@ void testStreamWriteBatchReadOptimized() {

[GitHub] [hudi] yihua commented on issue #6224: [SUPPORT] Caused by: java.lang.IllegalArgumentException: Cannot use marker based rollback strategy on completed instant

2022-07-28 Thread GitBox
yihua commented on issue #6224: URL: https://github.com/apache/hudi/issues/6224#issuecomment-1198736769 @jtchen-study can you share the Hudi configs and the setup you use for Java client?

[GitHub] [hudi] yihua commented on issue #6232: [SUPPORT] Hudi V0.9 truncating second precision for timestamp columns

2022-07-28 Thread GitBox
yihua commented on issue #6232: URL: https://github.com/apache/hudi/issues/6232#issuecomment-1198731643 @YannByron do you have any suggestions, or can this issue not be fixed in 0.9.0? @neerajpadarthi is it possible for you to upgrade to Hudi 0.10.1 to pick up the

[GitHub] [hudi] zhedoubushishi commented on issue #6226: [SUPPORT] OCC locks with data on S3 and DynamoDB fails to acquire

2022-07-28 Thread GitBox
zhedoubushishi commented on issue #6226: URL: https://github.com/apache/hudi/issues/6226#issuecomment-1198725504 No, this is a new issue for us. As @fengjian428 mentioned, the locking happens during the commit stage, not in the data writing stage. It would be good if you could share some logs so

[GitHub] [hudi] hudi-bot commented on pull request #6016: [HUDI-4465] Optimizing file-listing sequence of Metadata Table

2022-07-28 Thread GitBox
hudi-bot commented on PR #6016: URL: https://github.com/apache/hudi/pull/6016#issuecomment-1198717956 ## CI report: * b038d97682d74420f906dffabf439eee036553dd Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6016: [HUDI-4465] Optimizing file-listing sequence of Metadata Table

2022-07-28 Thread GitBox
hudi-bot commented on PR #6016: URL: https://github.com/apache/hudi/pull/6016#issuecomment-1198715352 ## CI report: * b038d97682d74420f906dffabf439eee036553dd Azure:

[GitHub] [hudi] umehrot2 commented on a diff in pull request #6237: [HUDI-4495] Fix handling of S3 paths incompatible with java URI stand…

2022-07-28 Thread GitBox
umehrot2 commented on code in PR #6237: URL: https://github.com/apache/hudi/pull/6237#discussion_r932745731 ## hudi-common/src/main/java/org/apache/hudi/common/fs/HoodieWrapperFileSystem.java: ## @@ -145,8 +145,11 @@ public static Path convertPathWithScheme(Path oldPath,

[GitHub] [hudi] hudi-bot commented on pull request #6237: [HUDI-4495] Fix handling of S3 paths incompatible with java URI stand…

2022-07-28 Thread GitBox
hudi-bot commented on PR #6237: URL: https://github.com/apache/hudi/pull/6237#issuecomment-1198713147 ## CI report: * c4b81c6e6b0476203aca974dfb10cfc4f389480a Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6237: [HUDI-4495] Fix handling of S3 paths incompatible with java URI stand…

2022-07-28 Thread GitBox
hudi-bot commented on PR #6237: URL: https://github.com/apache/hudi/pull/6237#issuecomment-1198710976 ## CI report: * c4b81c6e6b0476203aca974dfb10cfc4f389480a UNKNOWN

[jira] [Updated] (HUDI-4495) Specific S3 URI patterns break with Hudi

2022-07-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4495: - Labels: pull-request-available release-blocker (was: release-blocker) > Specific S3 URI patterns

[GitHub] [hudi] xushiyan commented on a diff in pull request #6237: [HUDI-4495] Fix handling of S3 paths incompatible with java URI stand…

2022-07-28 Thread GitBox
xushiyan commented on code in PR #6237: URL: https://github.com/apache/hudi/pull/6237#discussion_r932742580 ## hudi-common/src/main/java/org/apache/hudi/common/fs/HoodieWrapperFileSystem.java: ## @@ -145,8 +145,11 @@ public static Path convertPathWithScheme(Path oldPath,

[GitHub] [hudi] hudi-bot commented on pull request #6016: [HUDI-4465] Optimizing file-listing sequence of Metadata Table

2022-07-28 Thread GitBox
hudi-bot commented on PR #6016: URL: https://github.com/apache/hudi/pull/6016#issuecomment-1198708367 ## CI report: * b038d97682d74420f906dffabf439eee036553dd Azure:

[jira] [Updated] (HUDI-4495) Specific S3 URI patterns break with Hudi

2022-07-28 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-4495: Labels: release-blocker (was: pull-request-available) > Specific S3 URI patterns break with Hudi >

[jira] [Updated] (HUDI-4495) Specific S3 URI patterns break with Hudi

2022-07-28 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-4495: Priority: Blocker (was: Major) > Specific S3 URI patterns break with Hudi >

[jira] [Updated] (HUDI-4495) Specific S3 URI patterns break with Hudi

2022-07-28 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4495: - Labels: pull-request-available (was: ) > Specific S3 URI patterns break with Hudi >

[GitHub] [hudi] umehrot2 opened a new pull request, #6237: [HUDI-4495] Fix handling of S3 paths incompatible with java URI stand…

2022-07-28 Thread GitBox
umehrot2 opened a new pull request, #6237: URL: https://github.com/apache/hudi/pull/6237 …ards ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6213: [HUDI-4081][HUDI-4472] Addressing Spark SQL vs Spark DS performance gap

2022-07-28 Thread GitBox
alexeykudinkin commented on code in PR #6213: URL: https://github.com/apache/hudi/pull/6213#discussion_r932731150 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala: ## @@ -241,39 +240,49 @@ object HoodieSparkSqlWriter {

[GitHub] [hudi] xushiyan merged pull request #6213: [HUDI-4081][HUDI-4472] Addressing Spark SQL vs Spark DS performance gap

2022-07-28 Thread GitBox
xushiyan merged PR #6213: URL: https://github.com/apache/hudi/pull/6213

[hudi] branch master updated: [HUDI-4081][HUDI-4472] Addressing Spark SQL vs Spark DS performance gap (#6213)

2022-07-28 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository. xushiyan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new cfd0c1ee34 [HUDI-4081][HUDI-4472] Addressing

[GitHub] [hudi] xushiyan commented on a diff in pull request #6213: [HUDI-4081][HUDI-4472] Addressing Spark SQL vs Spark DS performance gap

2022-07-28 Thread GitBox
xushiyan commented on code in PR #6213: URL: https://github.com/apache/hudi/pull/6213#discussion_r932729471 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala: ## @@ -241,39 +240,49 @@ object HoodieSparkSqlWriter {

[GitHub] [hudi] xushiyan commented on a diff in pull request #6213: [HUDI-4081][HUDI-4472] Addressing Spark SQL vs Spark DS performance gap

2022-07-28 Thread GitBox
xushiyan commented on code in PR #6213: URL: https://github.com/apache/hudi/pull/6213#discussion_r932726718 ## hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/InsertIntoHoodieTableCommand.scala: ## @@ -66,103 +79,139 @@ object

[jira] [Updated] (HUDI-4492) Spark 3.3 support follow up

2022-07-28 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4492: Fix Version/s: 0.13.0 > Spark 3.3 support follow up > --- > > Key:

[GitHub] [hudi] alexeykudinkin commented on pull request #6230: [HUDI-4478] Rename existing Spark/Flink modules concisely

2022-07-28 Thread GitBox
alexeykudinkin commented on PR #6230: URL: https://github.com/apache/hudi/pull/6230#issuecomment-1198686421 @CTTY can you please summarize the changes in the description?

[jira] [Assigned] (HUDI-4496) ORC fails w/ Spark 3.1

2022-07-28 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin reassigned HUDI-4496: - Assignee: Alexey Kudinkin > ORC fails w/ Spark 3.1 > -- > >

[jira] [Created] (HUDI-4496) ORC fails w/ Spark 3.1

2022-07-28 Thread Alexey Kudinkin (Jira)
Alexey Kudinkin created HUDI-4496: - Summary: ORC fails w/ Spark 3.1 Key: HUDI-4496 URL: https://issues.apache.org/jira/browse/HUDI-4496 Project: Apache Hudi Issue Type: Bug Affects

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6213: [HUDI-4081][HUDI-4472] Addressing Spark SQL vs Spark DS performance gap

2022-07-28 Thread GitBox
alexeykudinkin commented on code in PR #6213: URL: https://github.com/apache/hudi/pull/6213#discussion_r932718492 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala: ## @@ -241,39 +240,49 @@ object HoodieSparkSqlWriter {

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6213: [HUDI-4081][HUDI-4472] Addressing Spark SQL vs Spark DS performance gap

2022-07-28 Thread GitBox
alexeykudinkin commented on code in PR #6213: URL: https://github.com/apache/hudi/pull/6213#discussion_r932717646 ## hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/InsertIntoHoodieTableCommand.scala: ## @@ -66,103 +79,139 @@ object

[GitHub] [hudi] xushiyan commented on a diff in pull request #6228: [HUDI-4488] Improve S3EventsHoodieIncrSource efficiency

2022-07-28 Thread GitBox
xushiyan commented on code in PR #6228: URL: https://github.com/apache/hudi/pull/6228#discussion_r932717590 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/S3EventsHoodieIncrSource.java: ## @@ -172,37 +177,47 @@ public Pair>, String> fetchNextBatch(Option

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6213: [HUDI-4081][HUDI-4472] Addressing Spark SQL vs Spark DS performance gap

2022-07-28 Thread GitBox
alexeykudinkin commented on code in PR #6213: URL: https://github.com/apache/hudi/pull/6213#discussion_r932716228 ## hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/InsertIntoHoodieTableCommand.scala: ## @@ -66,103 +79,139 @@ object

[GitHub] [hudi] xushiyan commented on a diff in pull request #6213: [HUDI-4081][HUDI-4472] Addressing Spark SQL vs Spark DS performance gap

2022-07-28 Thread GitBox
xushiyan commented on code in PR #6213: URL: https://github.com/apache/hudi/pull/6213#discussion_r932713368 ## hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/InsertIntoHoodieTableCommand.scala: ## @@ -66,103 +79,139 @@ object

[GitHub] [hudi] hudi-bot commented on pull request #6213: [HUDI-4081][HUDI-4472] Addressing Spark SQL vs Spark DS performance gap

2022-07-28 Thread GitBox
hudi-bot commented on PR #6213: URL: https://github.com/apache/hudi/pull/6213#issuecomment-1198671818 ## CI report: * 4a639a47b823e71425b6450883dc1427ec4c8898 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #5328: [HUDI-3883] Add new Bulk Insert mode to repartition the dataset based on Partition Path without sorting

2022-07-28 Thread GitBox
hudi-bot commented on PR #5328: URL: https://github.com/apache/hudi/pull/5328#issuecomment-1198671070 ## CI report: * 76fea0d2cbac3928c2f908862207afbad053 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6230: [HUDI-4478] Rename existing Spark/Flink modules concisely

2022-07-28 Thread GitBox
hudi-bot commented on PR #6230: URL: https://github.com/apache/hudi/pull/6230#issuecomment-1198668756 ## CI report: * 18944f9a2bf0ae9a27fb9d93ffe5c9753638db85 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6228: [HUDI-4488] Improve S3EventsHoodieIncrSource efficiency

2022-07-28 Thread GitBox
hudi-bot commented on PR #6228: URL: https://github.com/apache/hudi/pull/6228#issuecomment-1198668722 ## CI report: * 0cc2dbb39e432baf741bb3dd94c6d627cb250297 UNKNOWN * e14bff1ef93f0c1fbbacf384d4fcaa3ef314050c Azure:

[GitHub] [hudi] hudi-bot commented on pull request #6213: [HUDI-4081][HUDI-4472] Addressing Spark SQL vs Spark DS performance gap

2022-07-28 Thread GitBox
hudi-bot commented on PR #6213: URL: https://github.com/apache/hudi/pull/6213#issuecomment-1198668639 ## CI report: * 4a639a47b823e71425b6450883dc1427ec4c8898 UNKNOWN

[GitHub] [hudi] yihua commented on pull request #6230: [HUDI-4478] Rename existing Spark/Flink modules concisely

2022-07-28 Thread GitBox
yihua commented on PR #6230: URL: https://github.com/apache/hudi/pull/6230#issuecomment-1198667005 @CTTY Let's target this for the next release, not merging into master for 0.12.0. wdyt? cc @codope @xushiyan @alexeykudinkin @danny0405 -- This is an automated message from the Apache

[jira] [Updated] (HUDI-4495) Specific S3 URI patterns break with Hudi

2022-07-28 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-4495: Description: Currently, certain S3 path patterns break with Hudi, for example: paths with *.* pattern in

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6222: [HUDI-4484] Add default lock config options for flink metadata table

2022-07-28 Thread GitBox
alexeykudinkin commented on code in PR #6222: URL: https://github.com/apache/hudi/pull/6222#discussion_r932702325 ## hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/ITTestHoodieDataSource.java: ## @@ -236,11 +236,16 @@ void

[jira] [Created] (HUDI-4495) Specific S3 URI patterns break with Hudi

2022-07-28 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-4495: --- Summary: Specific S3 URI patterns break with Hudi Key: HUDI-4495 URL: https://issues.apache.org/jira/browse/HUDI-4495 Project: Apache Hudi Issue Type: Bug

[jira] [Assigned] (HUDI-4495) Specific S3 URI patterns break with Hudi

2022-07-28 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-4495: --- Assignee: Udit Mehrotra > Specific S3 URI patterns break with Hudi >

[GitHub] [hudi] neerajpadarthi commented on issue #6232: [SUPPORT] Hudi V0.9 truncating second precision for timestamp columns

2022-07-28 Thread GitBox
neerajpadarthi commented on issue #6232: URL: https://github.com/apache/hudi/issues/6232#issuecomment-1198656143 Hey, thanks for checking. I tried the first option earlier, but no luck. Also, I don’t find this config here for 0.9V (https://hudi.apache.org/docs/0.9.0/configurations)

[GitHub] [hudi] alexeykudinkin commented on pull request #6213: [HUDI-4081][HUDI-4472] Addressing Spark SQL vs Spark DS performance gap

2022-07-28 Thread GitBox
alexeykudinkin commented on PR #6213: URL: https://github.com/apache/hudi/pull/6213#issuecomment-1198652832 CI is green: https://user-images.githubusercontent.com/428277/181640968-c6bc5665-182f-407f-9705-0a73da2f6321.png

[jira] [Updated] (HUDI-3035) Unify Parquet writers

2022-07-28 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3035: -- Fix Version/s: 0.13.0 (was: 0.12.0) > Unify Parquet writers >

[jira] [Updated] (HUDI-2598) Redesign record payload class to decouple HoodieRecordPayload from Avro

2022-07-28 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-2598: -- Fix Version/s: 0.13.0 (was: 0.12.0) > Redesign record payload class to

[jira] [Updated] (HUDI-3828) We need to revisit MOR block merging sequence

2022-07-28 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3828: -- Fix Version/s: 0.13.0 (was: 0.12.0) > We need to revisit MOR block

[jira] [Updated] (HUDI-3866) Support Data Skipping for MOR

2022-07-28 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3866: -- Fix Version/s: 0.13.0 (was: 0.12.0) > Support Data Skipping for MOR >

[jira] [Updated] (HUDI-3247) Support incremental queries in AbstractHoodieTableFileIndex

2022-07-28 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3247: -- Fix Version/s: 0.13.0 (was: 0.12.0) > Support incremental queries in

[jira] [Assigned] (HUDI-2749) Improve the streaming read for hudi

2022-07-28 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin reassigned HUDI-2749: - Assignee: (was: Alexey Kudinkin) > Improve the streaming read for hudi >

[GitHub] [hudi] hudi-bot commented on pull request #5328: [HUDI-3883] Add new Bulk Insert mode to repartition the dataset based on Partition Path without sorting

2022-07-28 Thread GitBox
hudi-bot commented on PR #5328: URL: https://github.com/apache/hudi/pull/5328#issuecomment-1198618309 ## CI report: * f1c00f46279d3d79c4cf438af1e5a398718c426a Azure:

[jira] [Updated] (HUDI-4472) Revisit schema handling in HoodieSparkSqlWriter

2022-07-28 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4472: -- Status: Patch Available (was: In Progress) > Revisit schema handling in HoodieSparkSqlWriter >

[jira] [Updated] (HUDI-4472) Revisit schema handling in HoodieSparkSqlWriter

2022-07-28 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-4472: -- Status: In Progress (was: Open) > Revisit schema handling in HoodieSparkSqlWriter >

[jira] [Updated] (HUDI-3883) Bulk-insert w/ sort-mode "NONE" leads to file-sizing issues

2022-07-28 Thread Alexey Kudinkin (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Kudinkin updated HUDI-3883: -- Fix Version/s: 0.13.0 (was: 0.12.0) > Bulk-insert w/ sort-mode "NONE"
