[GitHub] [hudi] YannByron commented on a diff in pull request #6256: [RFC-51][HUDI-3478] Update RFC: CDC support

2022-08-16 Thread GitBox
YannByron commented on code in PR #6256: URL: https://github.com/apache/hudi/pull/6256#discussion_r947420470 ## rfc/rfc-51/rfc-51.md: ## @@ -64,69 +65,72 @@ We follow the debezium output format: four columns as shown below Note: the illustration here ignores all the Hudi met

[GitHub] [hudi] YannByron commented on a diff in pull request #6256: [RFC-51][HUDI-3478] Update RFC: CDC support

2022-08-16 Thread GitBox
YannByron commented on code in PR #6256: URL: https://github.com/apache/hudi/pull/6256#discussion_r947417129 ## rfc/rfc-51/rfc-51.md: ## @@ -148,20 +152,27 @@ hudi_cdc_table/ Under a partition directory, the `.log` file with `CDCBlock` above will keep the changing data we ha

[GitHub] [hudi] 1032851561 commented on issue #6167: [SUPPORT] No results are returned from incremental queries within the archived range

2022-08-16 Thread GitBox
1032851561 commented on issue #6167: URL: https://github.com/apache/hudi/issues/6167#issuecomment-1217415847 > > In this case, why not merge archived instants before return? > > @1032851561 i don't think it's expected to return incremental results for archived commits. A design consid

[GitHub] [hudi] vinothchandar commented on pull request #6408: [DOCS] Edits to the Hudi Tech specs

2022-08-16 Thread GitBox
vinothchandar commented on PR #6408: URL: https://github.com/apache/hudi/pull/6408#issuecomment-1217410416 @prasannarajaperumal Not sure if `tunable` is the right word either. Not married to it. Landed for now; let's keep looking and update if we find something better. -- This is an automated m

[hudi] branch asf-site updated: [DOCS] Edits to the Hudi Tech specs (#6408)

2022-08-16 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 9735364cc0 [DOCS] Edits to the Hudi Tech specs

[GitHub] [hudi] vinothchandar merged pull request #6408: [DOCS] Edits to the Hudi Tech specs

2022-08-16 Thread GitBox
vinothchandar merged PR #6408: URL: https://github.com/apache/hudi/pull/6408 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.ap

[GitHub] [hudi] danny0405 opened a new pull request, #6415: [HUDI-4632] Remove the force active property for flink1.14 profile

2022-08-16 Thread GitBox
danny0405 opened a new pull request, #6415: URL: https://github.com/apache/hudi/pull/6415 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performanc

[jira] [Updated] (HUDI-4632) Remove the force active property for flink1.14 profile

2022-08-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-4632: - Labels: pull-request-available (was: ) > Remove the force active property for flink1.14 profile >

[jira] [Created] (HUDI-4632) Remove the force active property for flink1.14 profile

2022-08-16 Thread Danny Chen (Jira)
Danny Chen created HUDI-4632: Summary: Remove the force active property for flink1.14 profile Key: HUDI-4632 URL: https://issues.apache.org/jira/browse/HUDI-4632 Project: Apache Hudi Issue Type:

[GitHub] [hudi] boneanxs opened a new issue, #6414: [SUPPORT] Spark3 with Hadoop3 using metadata could have compatible issue when reading hfile

2022-08-16 Thread GitBox
boneanxs opened a new issue, #6414: URL: https://github.com/apache/hudi/issues/6414 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at dev-subscr.

[GitHub] [hudi] danny0405 commented on a diff in pull request #6256: [RFC-51][HUDI-3478] Update RFC: CDC support

2022-08-16 Thread GitBox
danny0405 commented on code in PR #6256: URL: https://github.com/apache/hudi/pull/6256#discussion_r947407085 ## rfc/rfc-51/rfc-51.md: ## @@ -64,69 +65,72 @@ We follow the debezium output format: four columns as shown below Note: the illustration here ignores all the Hudi met

[GitHub] [hudi] danny0405 commented on a diff in pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-16 Thread GitBox
danny0405 commented on code in PR #6312: URL: https://github.com/apache/hudi/pull/6312#discussion_r947403350 ## hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/ITTestHoodieDataSource.java: ## @@ -492,8 +492,6 @@ void testBatchModeUpsertWithoutPartition(Hoodi

[GitHub] [hudi] codope opened a new pull request, #6413: [MINOR] Update DOAP with 0.12.0 Release

2022-08-16 Thread GitBox
codope opened a new pull request, #6413: URL: https://github.com/apache/hudi/pull/6413 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performance

[GitHub] [hudi] YannByron commented on a diff in pull request #6256: [RFC-51][HUDI-3478] Update RFC: CDC support

2022-08-16 Thread GitBox
YannByron commented on code in PR #6256: URL: https://github.com/apache/hudi/pull/6256#discussion_r947401482 ## rfc/rfc-51/rfc-51.md: ## @@ -64,69 +65,72 @@ We follow the debezium output format: four columns as shown below Note: the illustration here ignores all the Hudi met

[GitHub] [hudi] hudi-bot commented on pull request #6358: [HUDI-4588] Fixing `HoodieParquetReader` to properly specify projected schema when reading Parquet file

2022-08-16 Thread GitBox
hudi-bot commented on PR #6358: URL: https://github.com/apache/hudi/pull/6358#issuecomment-1217392517 ## CI report: * 378a3752f4cdf975b47efeada5c26cd4ce089215 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1078

[GitHub] [hudi] hudi-bot commented on pull request #6358: [HUDI-4588] Fixing `HoodieParquetReader` to properly specify projected schema when reading Parquet file

2022-08-16 Thread GitBox
hudi-bot commented on PR #6358: URL: https://github.com/apache/hudi/pull/6358#issuecomment-1217390172 ## CI report: * 9cb5a7a62af7c2a6bf418b7556caa56348522a00 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1077

[GitHub] [hudi] danny0405 commented on issue #6411: Hudi Record Key Data Type Must be String

2022-08-16 Thread GitBox
danny0405 commented on issue #6411: URL: https://github.com/apache/hudi/issues/6411#issuecomment-1217374162 The byte primary key type is expected to be supported. What exception does it throw in your use case?

[jira] [Commented] (HUDI-4601) read error from MOR table after compaction

2022-08-16 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17580531#comment-17580531 ] Danny Chen commented on HUDI-4601: -- Fixed via master branch: 642f87cc6b6b2971911d2f27619e

[jira] [Updated] (HUDI-4601) Read error from MOR table after compaction with timestamp partitioning

2022-08-16 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-4601: - Summary: Read error from MOR table after compaction with timestamp partitioning (was: read error from MOR

[hudi] branch master updated: [HUDI-4601] Read error from MOR table after compaction with timestamp partitioning (#6365)

2022-08-16 Thread danny0405
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 642f87cc6b [HUDI-4601] Read error from MOR tabl

[jira] [Resolved] (HUDI-4601) read error from MOR table after compaction

2022-08-16 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen resolved HUDI-4601. -- > read error from MOR table after compaction > -- > >

[GitHub] [hudi] danny0405 merged pull request #6365: [HUDI-4601] Read error from MOR table after compaction with timestamp partitioning

2022-08-16 Thread GitBox
danny0405 merged PR #6365: URL: https://github.com/apache/hudi/pull/6365

[jira] [Updated] (HUDI-4601) read error from MOR table after compaction

2022-08-16 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-4601: - Fix Version/s: 0.12.1 > read error from MOR table after compaction > -

[GitHub] [hudi] hudi-bot commented on pull request #6409: [HUDI-4629] Create hive table from existing hoodie Table failed when the table schema is not defined

2022-08-16 Thread GitBox
hudi-bot commented on PR #6409: URL: https://github.com/apache/hudi/pull/6409#issuecomment-1217359926 ## CI report: * ea678bd169316fadda9480bee07d7d326e7bebc9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1078

[GitHub] [hudi] hudi-bot commented on pull request #6409: [HUDI-4629] Create hive table from existing hoodie Table failed when the table schema is not defined

2022-08-16 Thread GitBox
hudi-bot commented on PR #6409: URL: https://github.com/apache/hudi/pull/6409#issuecomment-1217357371 ## CI report: * ea678bd169316fadda9480bee07d7d326e7bebc9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1078

[GitHub] [hudi] hudi-bot commented on pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-16 Thread GitBox
hudi-bot commented on PR #6312: URL: https://github.com/apache/hudi/pull/6312#issuecomment-1217357234 ## CI report: * 9853ba3c9e723da382c45eced48a88e66ee234bf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1078

[GitHub] [hudi] hudi-bot commented on pull request #6409: [HUDI-4629] Create hive table from existing hoodie Table failed when the table schema is not defined

2022-08-16 Thread GitBox
hudi-bot commented on PR #6409: URL: https://github.com/apache/hudi/pull/6409#issuecomment-1217351947 ## CI report: * ea678bd169316fadda9480bee07d7d326e7bebc9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1078

[GitHub] [hudi] hudi-bot commented on pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-16 Thread GitBox
hudi-bot commented on PR #6312: URL: https://github.com/apache/hudi/pull/6312#issuecomment-1217351792 ## CI report: * 9853ba3c9e723da382c45eced48a88e66ee234bf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1078

[GitHub] [hudi] SteNicholas commented on pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-16 Thread GitBox
SteNicholas commented on PR #6312: URL: https://github.com/apache/hudi/pull/6312#issuecomment-1217344656 @hudi-bot run azure

[GitHub] [hudi] todd5167 closed issue #6094: [SUPPORT] hudi rollback throw java.lang.IllegalArgumentException

2022-08-16 Thread GitBox
todd5167 closed issue #6094: [SUPPORT] hudi rollback throw java.lang.IllegalArgumentException URL: https://github.com/apache/hudi/issues/6094

[GitHub] [hudi] dongkelun commented on pull request #6409: [HUDI-4629] Create hive table from existing hoodie Table failed when the table schema is not defined

2022-08-16 Thread GitBox
dongkelun commented on PR #6409: URL: https://github.com/apache/hudi/pull/6409#issuecomment-1217313882 @hudi-bot run azure

[GitHub] [hudi] paul8263 commented on issue #6177: [SUPPORT] Hudi cli got empty result for command show fsview all

2022-08-16 Thread GitBox
paul8263 commented on issue #6177: URL: https://github.com/apache/hudi/issues/6177#issuecomment-1217309084 Hi @nsivabalan, sorry for my late response. I have created the issue in Jira: [HUDI-4485](https://issues.apache.org/jira/projects/HUDI/issues/HUDI-4485?filter=allissues).

[GitHub] [hudi] bithw1 opened a new issue, #6412: [SUPPORT]query between 0 and max commit time yields empty result set.

2022-08-16 Thread GitBox
bithw1 opened a new issue, #6412: URL: https://github.com/apache/hudi/issues/6412 Hi, I have a Hudi table that has 10 commit times. Say the smallest commit time is T1 and the largest commit time is T2. When I run the following query: ` spark.sql(s"select * from tbl_order_i

[GitHub] [hudi] xushiyan commented on a diff in pull request #6170: [HUDI-4441] Log4j2 configuration fixes and removal of log4j1 dependencies

2022-08-16 Thread GitBox
xushiyan commented on code in PR #6170: URL: https://github.com/apache/hudi/pull/6170#discussion_r947300784 ## hudi-client/hudi-client-common/src/test/java/org/apache/hudi/callback/http/TestCallbackHttpClient.java: ## @@ -70,7 +71,7 @@ public class TestCallbackHttpClient {

[GitHub] [hudi] HEPBO3AH commented on issue #6212: [SUPPORT] Hudi creates duplicate, redundant file during clustering

2022-08-16 Thread GitBox
HEPBO3AH commented on issue #6212: URL: https://github.com/apache/hudi/issues/6212#issuecomment-1217241460 > increase the INLINE_CLUSTERING_MAX_COMMITS.key() to say 3. and lets see if we encounter this issue. It is still happening. > can we add a delay of say 30 secs between su

[jira] [Created] (HUDI-4631) Enhance retries for failed writes w/ write conflicts in a multi writer scenarios

2022-08-16 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-4631: - Summary: Enhance retries for failed writes w/ write conflicts in a multi writer scenarios Key: HUDI-4631 URL: https://issues.apache.org/jira/browse/HUDI-4631

[GitHub] [hudi] vburenin commented on a diff in pull request #6270: stop sleeping where it is not necessary after the success

2022-08-16 Thread GitBox
vburenin commented on code in PR #6270: URL: https://github.com/apache/hudi/pull/6270#discussion_r947184646 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java: ## @@ -311,7 +311,9 @@ private List fetchPartitionInfos(KafkaConsumer consu

[GitHub] [hudi] alexeykudinkin commented on a diff in pull request #6270: stop sleeping where it is not necessary after the success

2022-08-16 Thread GitBox
alexeykudinkin commented on code in PR #6270: URL: https://github.com/apache/hudi/pull/6270#discussion_r947181371 ## hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java: ## @@ -311,7 +311,9 @@ private List fetchPartitionInfos(KafkaConsumer

[GitHub] [hudi] nsivabalan commented on issue #5808: [SUPPORT] Data skipping using Column Stats Bloom does not seem to work at all

2022-08-16 Thread GitBox
nsivabalan commented on issue #5808: URL: https://github.com/apache/hudi/issues/5808#issuecomment-1217015237 thanks!

[GitHub] [hudi] nsivabalan closed issue #5808: [SUPPORT] Data skipping using Column Stats Bloom does not seem to work at all

2022-08-16 Thread GitBox
nsivabalan closed issue #5808: [SUPPORT] Data skipping using Column Stats Bloom does not seem to work at all URL: https://github.com/apache/hudi/issues/5808

[GitHub] [hudi] sufei2009 opened a new issue, #6411: Hudi Record Key Data Type Must be String

2022-08-16 Thread GitBox
sufei2009 opened a new issue, #6411: URL: https://github.com/apache/hudi/issues/6411 I have a table where I set the record key to the byte data type. It did not work until I changed it to the string data type. Why does Hudi not allow the byte type for its record key? I use Hudi on AWS with Lake Fo

[GitHub] [hudi] nsivabalan merged pull request #6399: [HUDI-4583][DOCS] Optimal write configs for bulk insert

2022-08-16 Thread GitBox
nsivabalan merged PR #6399: URL: https://github.com/apache/hudi/pull/6399

[hudi] branch asf-site updated: [HUDI-4583][DOCS] Optimal write configs for bulk insert (#6399)

2022-08-16 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 6dacc12699 [HUDI-4583][DOCS] Optimal write

[GitHub] [hudi] nsivabalan commented on issue #4622: [SUPPORT] Can't query Redshift rows even after downgrade from 0.10

2022-08-16 Thread GitBox
nsivabalan commented on issue #4622: URL: https://github.com/apache/hudi/issues/4622#issuecomment-1216968477 @nochimow @rubenssoto and others: Looks like hudi 0.10.0 is supported from the docs https://docs.aws.amazon.com/redshift/latest/dg/c-spectrum-external-tables.html#c-spectrum-column-m

[GitHub] [hudi] zhedoubushishi commented on issue #5628: [SUPPORT] - Deltastreamer not shutting down properly

2022-08-16 Thread GitBox
zhedoubushishi commented on issue #5628: URL: https://github.com/apache/hudi/issues/5628#issuecomment-1216956859 This is a known issue. We are working on a fix and will push a PR once it's ready.

[GitHub] [hudi] alexeykudinkin commented on issue #6278: [SUPPORT] Deltastreamer fails with data and timestamp related exception after upgrading to EMR 6.5 and spark3

2022-08-16 Thread GitBox
alexeykudinkin commented on issue #6278: URL: https://github.com/apache/hudi/issues/6278#issuecomment-1216952667 @brskiran1 Can you please try this PR: https://github.com/apache/hudi/pull/6352 and see if it resolves the issue?

[GitHub] [hudi] brskiran1 commented on issue #6305: Hudi Delta Streamer unable to read Older Dates

2022-08-16 Thread GitBox
brskiran1 commented on issue #6305: URL: https://github.com/apache/hudi/issues/6305#issuecomment-1216927166 @alexeykudinkin @nsivabalan can you please let me know what should be tried to see if the issue is resolved? or can you please. I have responded in the ticket #6278

[GitHub] [hudi] alexeykudinkin commented on issue #6305: Hudi Delta Streamer unable to read Older Dates

2022-08-16 Thread GitBox
alexeykudinkin commented on issue #6305: URL: https://github.com/apache/hudi/issues/6305#issuecomment-1216920765 @nsivabalan let's merge this one w/ https://github.com/apache/hudi/issues/6278 I've put up https://github.com/apache/hudi/pull/6352 to address this, but didn't hear back f

[jira] [Closed] (HUDI-3827) Promote the inetAddress picking strategy for NetworkUtils#getHostname

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3827. - Resolution: Fixed > Promote the inetAddress picking strategy for NetworkUtils#getHostname > --

[jira] [Closed] (HUDI-3917) Flink write task hangs if last checkpoint has no data input

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3917. - Resolution: Fixed > Flink write task hangs if last checkpoint has no data input >

[jira] [Closed] (HUDI-3868) Disable the sort input for flink streaming append mode

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3868. - Resolution: Fixed > Disable the sort input for flink streaming append mode > -

[jira] [Closed] (HUDI-3085) Refactor fileId & writeHandler logic into partitioner for bulk_insert

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3085. - Resolution: Fixed > Refactor fileId & writeHandler logic into partitioner for bulk_insert > --

[jira] [Closed] (HUDI-4239) Revisit TestCOWDataSourceStorage#testCopyOnWriteStorage

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-4239. - Resolution: Duplicate > Revisit TestCOWDataSourceStorage#testCopyOnWriteStorage >

[jira] [Closed] (HUDI-3985) Refactor DLASyncTool to support read hoodie table as datasource table

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-3985. - Resolution: Done > Refactor DLASyncTool to support read hoodie table as datasource table > ---

[jira] [Closed] (HUDI-4089) Support HMS for flink HoodieCatalog

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-4089. - Resolution: Fixed > Support HMS for flink HoodieCatalog > --- > >

[jira] [Closed] (HUDI-4343) Update docs for building Hudi for the docker demo

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-4343. - Resolution: Fixed > Update docs for building Hudi for the docker demo > --

[GitHub] [hudi] yihua merged pull request #6027: [HUDI-4354] add --force-empty-sync flag to deltastreamer

2022-08-16 Thread GitBox
yihua merged PR #6027: URL: https://github.com/apache/hudi/pull/6027

[hudi] branch master updated: [HUDI-4354] Add --force-empty-sync flag to deltastreamer (#6027)

2022-08-16 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 8c02e90a9b [HUDI-4354] Add --force-empty-sync flag

[jira] [Closed] (HUDI-4151) flink split_reader supports rocksdb

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-4151. - Resolution: Fixed > flink split_reader supports rocksdb > --- > >

[jira] [Closed] (HUDI-4168) Support marker command based on Call Produce Command

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-4168. - Resolution: Done > Support marker command based on Call Produce Command >

[jira] [Closed] (HUDI-4425) Remove test resources for hudi-flink module

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-4425. - Resolution: Fixed > Remove test resources for hudi-flink module > ---

[jira] [Closed] (HUDI-4558) lost 'hoodie.table.keygenerator.class' in hoodie.properties

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-4558. - Resolution: Fixed > lost 'hoodie.table.keygenerator.class' in hoodie.properties >

[jira] [Closed] (HUDI-4535) Add document for flink data skipping

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-4535. - Resolution: Done > Add document for flink data skipping > > >

[GitHub] [hudi] yihua commented on pull request #6027: [HUDI-4354] add --force-empty-sync flag to deltastreamer

2022-08-16 Thread GitBox
yihua commented on PR #6027: URL: https://github.com/apache/hudi/pull/6027#issuecomment-1216900448 @qjqqyy Sorry for the delay. Azure CI actually passes now. Merging this PR.

[jira] [Closed] (HUDI-2197) Replace ConfigOptions with ConfigProperty for FlinkOptions

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-2197. - Resolution: Done > Replace ConfigOptions with ConfigProperty for FlinkOptions > --

[jira] [Closed] (HUDI-4583) [DOCS] Optimal write configs for different workload patterns

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-4583. - Resolution: Done > [DOCS] Optimal write configs for different workload patterns >

[jira] [Closed] (HUDI-4560) [DOCS] Update default value for partition extractor and note about infer function

2022-08-16 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit closed HUDI-4560. - Resolution: Done > [DOCS] Update default value for partition extractor and note about infer > function >

[GitHub] [hudi] vburenin commented on pull request #5632: [HUDI-4122] Fix NPE caused by adding kafka nodes

2022-08-16 Thread GitBox
vburenin commented on PR #5632: URL: https://github.com/apache/hudi/pull/5632#issuecomment-1216893883 @nsivabalan I submitted this pull request some time ago https://github.com/apache/hudi/pull/6270

[GitHub] [hudi] yihua merged pull request #6362: [MINOR][DOCS] add tip to schema evolution

2022-08-16 Thread GitBox
yihua merged PR #6362: URL: https://github.com/apache/hudi/pull/6362

[hudi] branch asf-site updated: [MINOR][DOCS] add tip to schema evolution (#6362)

2022-08-16 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 54b25661b1 [MINOR][DOCS] add tip to schema evol

svn commit: r56325 - in /release/hudi/0.12.0: ./ hudi-0.12.0.src.tgz hudi-0.12.0.src.tgz.asc hudi-0.12.0.src.tgz.sha512

2022-08-16 Thread sivabalan
Author: sivabalan Date: Tue Aug 16 16:39:12 2022 New Revision: 56325 Log: Adding source for 0.12 Added: release/hudi/0.12.0/ release/hudi/0.12.0/hudi-0.12.0.src.tgz (with props) release/hudi/0.12.0/hudi-0.12.0.src.tgz.asc release/hudi/0.12.0/hudi-0.12.0.src.tgz.sha512 Added: re

[jira] [Assigned] (HUDI-3287) Remove unnecessary deps in hudi-kafka-connect

2022-08-16 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu reassigned HUDI-3287: Assignee: Sagar Sumit (was: Ethan Guo) > Remove unnecessary deps in hudi-kafka-connect > -

[jira] [Updated] (HUDI-3654) Support basic actions based on hudi metastore server

2022-08-16 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3654: - Reviewers: Prasanna Rajaperumal, Raymond Xu > Support basic actions based on hudi metastore server >

[GitHub] [hudi] prasannarajaperumal commented on a diff in pull request #6408: [DOCS] Edits to the Hudi Tech specs

2022-08-16 Thread GitBox
prasannarajaperumal commented on code in PR #6408: URL: https://github.com/apache/hudi/pull/6408#discussion_r94694 ## website/src/pages/tech-specs.md: ## @@ -263,68 +274,68 @@ Readers will use snapshot isolation to query a Hudi dataset at a consistent poin ## Writer Expe

[GitHub] [hudi] cocoapan opened a new issue, #6410: [SUPPORT] MergeInto syntax WHEN MATCHED is not optional but must be set.

2022-08-16 Thread GitBox
cocoapan opened a new issue, #6410: URL: https://github.com/apache/hudi/issues/6410 Hi, When I update the hudi table using mergeInto syntax, I get an error message: assertion failed: hoodie.payload.update.condition.assignments have not set. https://user-images.githubusercontent

[GitHub] [hudi] hudi-bot commented on pull request #6409: [HUDI-4629] Create hive table from existing hoodie Table failed when the table schema is not defined

2022-08-16 Thread GitBox
hudi-bot commented on PR #6409: URL: https://github.com/apache/hudi/pull/6409#issuecomment-1216749487 ## CI report: * ea678bd169316fadda9480bee07d7d326e7bebc9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1078

[GitHub] [hudi] SteNicholas closed pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-16 Thread GitBox
SteNicholas closed pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment URL: https://github.com/apache/hudi/pull/6312

[GitHub] [hudi] navbalaraman commented on issue #6101: [SUPPORT] Hudi Delete Not working with EMR, AWS Glue & S3

2022-08-16 Thread GitBox
navbalaraman commented on issue #6101: URL: https://github.com/apache/hudi/issues/6101#issuecomment-1216738397 @nsivabalan Thanks for your attention to this issue. Here is the current status: - Managed to get the deletes working. - Was trying to delete with the partition column name as

[GitHub] [hudi] SteNicholas commented on pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-16 Thread GitBox
SteNicholas commented on PR #6312: URL: https://github.com/apache/hudi/pull/6312#issuecomment-1216730672 @hudi-bot run azure

[jira] [Updated] (HUDI-4626) Partitioning table by `_hoodie_partition_path` fails

2022-08-16 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4626: Description:   Currently, creating a table partitioned by "_hoodie_partition_path" using Glue catalog fail

[jira] [Updated] (HUDI-4626) Partitioning table by `_hoodie_partition_path` fails

2022-08-16 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4626: Description:   Currently, creating a table partitioned by "_hoodie_partition_path" fails w/ the following

[GitHub] [hudi] pramodbiligiri closed pull request #6357: [WIP] Detect new data in GCS buckets via Cloud Pubsub

2022-08-16 Thread GitBox
pramodbiligiri closed pull request #6357: [WIP] Detect new data in GCS buckets via Cloud Pubsub URL: https://github.com/apache/hudi/pull/6357

[GitHub] [hudi] bkosuru commented on issue #5824: HoodieDeltaStreamer does not work with AvroKafkaSource and FilebasedSchemaProvider

2022-08-16 Thread GitBox
bkosuru commented on issue #5824: URL: https://github.com/apache/hudi/issues/5824#issuecomment-1216672014 @nsivabalan We are not using this feature at this time. It will take me some time to test. You can close this if you prefer. I will reopen if it does not work.

[GitHub] [hudi] hudi-bot commented on pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-16 Thread GitBox
hudi-bot commented on PR #6312: URL: https://github.com/apache/hudi/pull/6312#issuecomment-1216669979 ## CI report: * 9853ba3c9e723da382c45eced48a88e66ee234bf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1078

[GitHub] [hudi] nsivabalan commented on issue #4978: [SUPPORT] Wrong table path when using Hive to query xxx_rt table before the first compaction

2022-08-16 Thread GitBox
nsivabalan commented on issue #4978: URL: https://github.com/apache/hudi/issues/4978#issuecomment-1216664212 @CrazyBeeline : gentle ping.

[GitHub] [hudi] nsivabalan commented on issue #5348: [SUPPORT]org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20220418194506064

2022-08-16 Thread GitBox
nsivabalan commented on issue #5348: URL: https://github.com/apache/hudi/issues/5348#issuecomment-1216663723 @lanyu1hao : gentle ping. If the issue is resolved, feel free to close out the issue.

[GitHub] [hudi] nsivabalan commented on issue #6101: [SUPPORT] Hudi Delete Not working with EMR, AWS Glue & S3

2022-08-16 Thread GitBox
nsivabalan commented on issue #6101: URL: https://github.com/apache/hudi/issues/6101#issuecomment-1216661151 @navbalaraman : gentle ping

[GitHub] [hudi] nsivabalan commented on issue #6177: [SUPPORT] Hudi cli got empty result for command show fsview all

2022-08-16 Thread GitBox
nsivabalan commented on issue #6177: URL: https://github.com/apache/hudi/issues/6177#issuecomment-1216660679 @paul8263 : another ping.

[GitHub] [hudi] nsivabalan commented on issue #6212: [SUPPORT] Hudi creates duplicate, redundant file during clustering

2022-08-16 Thread GitBox
nsivabalan commented on issue #6212: URL: https://github.com/apache/hudi/issues/6212#issuecomment-1216658622 @HEPBO3AH : any updates please.

[GitHub] [hudi] hudi-bot commented on pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-16 Thread GitBox
hudi-bot commented on PR #6312: URL: https://github.com/apache/hudi/pull/6312#issuecomment-1216656657 ## CI report: * c320e2012916972700be104f6992fc6d4b2c8f79 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1078

[GitHub] [hudi] Armelabdelkbir commented on issue #6403: [SUPPORT] java.lang.IllegalStateException: Duplicate key Option{val=org.apache.hudi.common.HoodiePendingRollbackInfo

2022-08-16 Thread GitBox
Armelabdelkbir commented on issue #6403: URL: https://github.com/apache/hudi/issues/6403#issuecomment-1216621726 i bypassed the issue above by changing my build.sbt file from ``` libraryDependencies += "org.apache.hudi" %% "hudi-spark3-bundle" % "0.11.0" ``` to ``` libraryD

[GitHub] [hudi] vinothchandar commented on a diff in pull request #6408: [DOCS] Edits to the Hudi Tech specs

2022-08-16 Thread GitBox
vinothchandar commented on code in PR #6408: URL: https://github.com/apache/hudi/pull/6408#discussion_r946760002 ## website/src/pages/tech-specs.md: ## @@ -263,68 +274,68 @@ Readers will use snapshot isolation to query a Hudi dataset at a consistent poin ## Writer Expectatio

[GitHub] [hudi] bkosuru closed issue #5741: [SUPPORT] Hudi table copy failed for some partitions in 0.11.0

2022-08-16 Thread GitBox
bkosuru closed issue #5741: [SUPPORT] Hudi table copy failed for some partitions in 0.11.0 URL: https://github.com/apache/hudi/issues/5741

[GitHub] [hudi] bkosuru commented on issue #5741: [SUPPORT] Hudi table copy failed for some partitions in 0.11.0

2022-08-16 Thread GitBox
bkosuru commented on issue #5741: URL: https://github.com/apache/hudi/issues/5741#issuecomment-1216612876 We changed the implementation so that we have a number of small tables instead of one large single table. Not an issue anymore for us.

[GitHub] [hudi] hudi-bot commented on pull request #6409: [HUDI-4629] Create hive table from existing hoodie Table failed when the table schema is not defined

2022-08-16 Thread GitBox
hudi-bot commented on PR #6409: URL: https://github.com/apache/hudi/pull/6409#issuecomment-1216589130 ## CI report: * ea678bd169316fadda9480bee07d7d326e7bebc9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1078

[GitHub] [hudi] Armelabdelkbir commented on issue #6403: [SUPPORT] java.lang.IllegalStateException: Duplicate key Option{val=org.apache.hudi.common.HoodiePendingRollbackInfo

2022-08-16 Thread GitBox
Armelabdelkbir commented on issue #6403: URL: https://github.com/apache/hudi/issues/6403#issuecomment-1216587133 thanks for replying i tried to upgrade to 0.11, and i got this error ``` at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execu

[GitHub] [hudi] hudi-bot commented on pull request #6409: [HUDI-4629] Create hive table from existing hoodie Table failed when the table schema is not defined

2022-08-16 Thread GitBox
hudi-bot commented on PR #6409: URL: https://github.com/apache/hudi/pull/6409#issuecomment-1216583150 ## CI report: * ea678bd169316fadda9480bee07d7d326e7bebc9 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #6312: [HUDI-4551] The default value of READ_TASKS, WRITE_TASKS, CLUSTERING_TASKS is the parallelism of the execution environment

2022-08-16 Thread GitBox
hudi-bot commented on PR #6312: URL: https://github.com/apache/hudi/pull/6312#issuecomment-1216582810 ## CI report: * bb5426538ca6b80089d8a5ada00260ec12e61d65 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1077

[jira] [Updated] (HUDI-4630) Allow different transformers for different tables getting ingested with HoodieMultiTableDeltaStreamer

2022-08-16 Thread Pratyaksh Sharma (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pratyaksh Sharma updated HUDI-4630: --- Component/s: deltastreamer > Allow different transformers for different tables getting ingeste

[jira] [Updated] (HUDI-4630) Allow different transformers for different tables getting ingested with HoodieMultiTableDeltaStreamer

2022-08-16 Thread Pratyaksh Sharma (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pratyaksh Sharma updated HUDI-4630: --- Labels: delta (was: ) > Allow different transformers for different tables getting ingested wi
