[GitHub] [hudi] nsivabalan commented on issue #4501: [SUPPORT] Duplicate records for same recordKey and differents partitionpath

2022-01-03 Thread GitBox
nsivabalan commented on issue #4501: URL: https://github.com/apache/hudi/issues/4501#issuecomment-1004405905 Can you try with a new table? Usually, changing index types, partition path fields, or record key fields is not recommended for a given table. -- This is an automated message from
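
For context, the index type, partition path field, and record key field mentioned in the comment above are table-level write options. The following is a minimal sketch of such an option set, not the reporter's actual configuration: the table name and column names are hypothetical placeholders, and `GLOBAL_BLOOM` is the value quoted elsewhere in this thread.

```python
from typing import Dict, List

# Hudi write options as plain key/value pairs.
# The option keys are Hudi configuration names; the values marked
# "hypothetical" are placeholders invented for this illustration.
hudi_options: Dict[str, str] = {
    "hoodie.table.name": "payments",  # hypothetical table name
    "hoodie.datasource.write.recordkey.field": "id",  # hypothetical column
    "hoodie.datasource.write.partitionpath.field": "created_date",  # hypothetical column
    "hoodie.datasource.write.operation": "upsert",
    "hoodie.index.type": "GLOBAL_BLOOM",  # value quoted in this thread
}


def as_conf_lines(options: Dict[str, str]) -> List[str]:
    """Render the options as sorted key=value lines, e.g. for logging or review."""
    return [f"{key}={value}" for key, value in sorted(options.items())]


for line in as_conf_lines(hudi_options):
    print(line)
```

With PySpark, a dict like this would typically be passed to the writer through `.options(**hudi_options)`. Per the comment above, it is safer to choose these values once, when the table is first created, than to change them on an existing table.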

[GitHub] [hudi] nsivabalan commented on issue #4501: [SUPPORT] Duplicate records for same recordKey and differents partitionpath

2022-01-03 Thread GitBox
nsivabalan commented on issue #4501: URL: https://github.com/apache/hudi/issues/4501#issuecomment-1004405448 Yes, it's a backwards-incompatible change. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[jira] [Updated] (HUDI-2590) Validate Diff key gen w/ and w/o glob path with and w/o metadata enabled

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2590: -- Status: Patch Available (was: In Progress) > Validate Diff key gen w/ and w/o glob path

[jira] [Updated] (HUDI-2590) Validate Diff key gen w/ and w/o glob path with and w/o metadata enabled

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2590: -- Status: In Progress (was: Open) > Validate Diff key gen w/ and w/o glob path with and w

[jira] [Updated] (HUDI-2977) Push the prometheus grafana dashboard to hudi OSS

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2977: -- Priority: Critical (was: Blocker) > Push the prometheus grafana dashboard to hudi OSS >

[jira] [Updated] (HUDI-2409) Using HBase shaded jars in Hudi presto bundle

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2409: - Epic Link: HUDI-2574 > Using HBase shaded jars in Hudi presto bundle > --

[jira] [Updated] (HUDI-2409) Using HBase shaded jars in Hudi presto bundle

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2409: - Parent: (was: HUDI-2574) Issue Type: Bug (was: Sub-task) > Using HBase shaded jars in

[jira] [Updated] (HUDI-2574) [UMBRELLA] Support for Hudi tables in presto-hive connector

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2574: - Epic Name: Presto-Hudi-Hive-Connector > [UMBRELLA] Support for Hudi tables in presto-hive connecto

[jira] [Updated] (HUDI-2574) [UMBRELLA] Support for Hudi tables in presto-hive connector

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2574: - Priority: Major (was: Blocker) > [UMBRELLA] Support for Hudi tables in presto-hive connector > --

[jira] [Updated] (HUDI-2574) [UMBRELLA] Support for Hudi tables in presto-hive connector

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2574: - Fix Version/s: (was: 0.11.0) > [UMBRELLA] Support for Hudi tables in presto-hive connector > -

[jira] [Updated] (HUDI-2574) [UMBRELLA] Support for Hudi tables in presto-hive connector

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2574: - Priority: Blocker (was: Major) > [UMBRELLA] Support for Hudi tables in presto-hive connector > --

[jira] [Deleted] (HUDI-3149) Hudi Support in Presto Hive Connector

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar deleted HUDI-3149: - > Hudi Support in Presto Hive Connector > - > > K

[jira] [Updated] (HUDI-2574) [UMBRELLA] Support for Hudi tables in presto-hive connector

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2574: - Fix Version/s: 0.11.0 > [UMBRELLA] Support for Hudi tables in presto-hive connector >

[jira] [Updated] (HUDI-3149) Hudi Support in Presto Hive Connector

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-3149: - > Hudi Support in Presto Hive Connector > - > > Ke

[jira] [Created] (HUDI-3149) Hudi Support in Presto Hive Connector

2022-01-03 Thread Vinoth Chandar (Jira)
Vinoth Chandar created HUDI-3149: Summary: Hudi Support in Presto Hive Connector Key: HUDI-3149 URL: https://issues.apache.org/jira/browse/HUDI-3149 Project: Apache Hudi Issue Type: Epic

[jira] [Resolved] (HUDI-3128) [UMBRELLA] Test Hudi Epic

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-3128. -- > [UMBRELLA] Test Hudi Epic > - > > Key: HUDI-3128 >

[jira] [Updated] (HUDI-3128) [UMBRELLA] Test Hudi Epic

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-3128: - Issue Type: Task (was: Epic) > [UMBRELLA] Test Hudi Epic > - > >

[jira] [Updated] (HUDI-3145) test

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-3145: - Epic Link: (was: HUDI-3128) > test > > > Key: HUDI-3145 >

[jira] [Resolved] (HUDI-3146) test sub task

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-3146. -- > test sub task > - > > Key: HUDI-3146 > URL: https://is

[GitHub] [hudi] jasondavindev edited a comment on issue #4501: [SUPPORT] Duplicate records for same recordKey and differents partitionpath

2022-01-03 Thread GitBox
jasondavindev edited a comment on issue #4501: URL: https://github.com/apache/hudi/issues/4501#issuecomment-1004362898 I noticed that when I use `withColumn`, the final schema is changed. The `is_syncing` column has no default value. ```python payment_type_cash \ .withColumn('is_syncing',

[GitHub] [hudi] jasondavindev commented on issue #4501: [SUPPORT] Duplicate records for same recordKey and differents partitionpath

2022-01-03 Thread GitBox
jasondavindev commented on issue #4501: URL: https://github.com/apache/hudi/issues/4501#issuecomment-1004362898 I noticed that when I use `withColumn`, the final schema is changed. The `is_syncing` column has no default value. ```python payment_type_cash \ .withColumn('is_syncing', lit('tr

[GitHub] [hudi] jasondavindev edited a comment on issue #4501: [SUPPORT] Duplicate records for same recordKey and differents partitionpath

2022-01-03 Thread GitBox
jasondavindev edited a comment on issue #4501: URL: https://github.com/apache/hudi/issues/4501#issuecomment-1004342445 Reopening. Thanks @nsivabalan @h7kanna, but when I add `hoodie.index.type=GLOBAL_BLOOM`, the following error is thrown: ```bash 22/01/03 20:49:13 INFO DAGScheduler: Re

[GitHub] [hudi] jasondavindev commented on issue #4501: [SUPPORT] Duplicate records for same recordKey and differents partitionpath

2022-01-03 Thread GitBox
jasondavindev commented on issue #4501: URL: https://github.com/apache/hudi/issues/4501#issuecomment-1004342445 Reopening. Thanks @nsivabalan @h7kanna, but when I add `hoodie.index.type=GLOBAL_BLOOM`, the following error is thrown: ```bash 22/01/03 20:29:05 INFO DAGScheduler: ResultSta

[GitHub] [hudi] jasondavindev removed a comment on issue #4501: [SUPPORT] Duplicate records for same recordKey and differents partitionpath

2022-01-03 Thread GitBox
jasondavindev removed a comment on issue #4501: URL: https://github.com/apache/hudi/issues/4501#issuecomment-1004341811 Reopening. When I add `hoodie.index.type=GLOBAL_BLOOM`, the following error is thrown: ```bash 22/01/03 20:29:05 INFO DAGScheduler: ResultStage 37 (isEmpty at HoodieSpa

[GitHub] [hudi] jasondavindev removed a comment on issue #4501: [SUPPORT] Duplicate records for same recordKey and differents partitionpath

2022-01-03 Thread GitBox
jasondavindev removed a comment on issue #4501: URL: https://github.com/apache/hudi/issues/4501#issuecomment-1004338818 Thanks for the quick answer

[GitHub] [hudi] jasondavindev commented on issue #4501: [SUPPORT] Duplicate records for same recordKey and differents partitionpath

2022-01-03 Thread GitBox
jasondavindev commented on issue #4501: URL: https://github.com/apache/hudi/issues/4501#issuecomment-1004341811 Reopening. When I add `hoodie.index.type=GLOBAL_BLOOM`, the following error is thrown: ```bash 22/01/03 20:29:05 INFO DAGScheduler: ResultStage 37 (isEmpty at HoodieSparkSqlWri

[jira] [Updated] (HUDI-3145) test

2022-01-03 Thread nino b. manuel (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nino b. manuel updated HUDI-3145: - Attachment: _slash_storage_slash_emulated_slash_0_slash_SHAREit_slash_caches_slash_tmp_slash_16406

[GitHub] [hudi] jasondavindev closed issue #4501: [SUPPORT] Duplicate records for same recordKey and differents partitionpath

2022-01-03 Thread GitBox
jasondavindev closed issue #4501: URL: https://github.com/apache/hudi/issues/4501

[GitHub] [hudi] jasondavindev commented on issue #4501: [SUPPORT] Duplicate records for same recordKey and differents partitionpath

2022-01-03 Thread GitBox
jasondavindev commented on issue #4501: URL: https://github.com/apache/hudi/issues/4501#issuecomment-1004338818 Thanks for the quick answer

[jira] [Updated] (HUDI-3145) test

2022-01-03 Thread nino b. manuel (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nino b. manuel updated HUDI-3145: - Attachment: .trashed-1638313746-split_config.arm64_v8a.apk (1).trash > test > > >

[jira] [Commented] (HUDI-3129) Test Hudi Subtask

2022-01-03 Thread nino b. manuel (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17468188#comment-17468188 ] nino b. manuel commented on HUDI-3129: -- * HUDI-3147 > Test Hudi Subtask > --

[jira] [Updated] (HUDI-3112) KafkaConnect can not sync to Hive

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3112: -- Labels: pull-request-available sev:critical (was: pull-request-available) > KafkaConnec

[jira] [Updated] (HUDI-3125) Spark SQL writing timestamp type don't need to disable `spark.sql.datetime.java8API.enabled` manually

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3125: -- Labels: pull-request-available sev:critical (was: pull-request-available) > Spark SQL w

[jira] [Updated] (HUDI-3125) Spark SQL writing timestamp type don't need to disable `spark.sql.datetime.java8API.enabled` manually

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3125: -- Labels: pull-request-available sev:critical user-support-issues (was: pull-request-avai

[jira] [Updated] (HUDI-2947) HoodieDeltaStreamer/DeltaSync can improperly pick up the checkpoint config from CLI in continuous mode

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2947: -- Fix Version/s: 0.10.1 > HoodieDeltaStreamer/DeltaSync can improperly pick up the checkpo

[jira] [Updated] (HUDI-2947) HoodieDeltaStreamer/DeltaSync can improperly pick up the checkpoint config from CLI in continuous mode

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2947: -- Labels: pull-request-available sev:critical (was: pull-request-available sev:high) > H

[jira] [Updated] (HUDI-3148) Unable to push metrics via pushgateway reporter with https

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3148: -- Fix Version/s: 0.11.0 0.10.1 > Unable to push metrics via pushgateway

[GitHub] [hudi] nsivabalan edited a comment on issue #4496: [SUPPORT] Hudi is unable to push metrics to pushgateway via https

2022-01-03 Thread GitBox
nsivabalan edited a comment on issue #4496: URL: https://github.com/apache/hudi/issues/4496#issuecomment-1004325823 Thanks for reporting. I have filed https://issues.apache.org/jira/browse/HUDI-3148

[GitHub] [hudi] nsivabalan closed issue #4496: [SUPPORT] Hudi is unable to push metrics to pushgateway via https

2022-01-03 Thread GitBox
nsivabalan closed issue #4496: URL: https://github.com/apache/hudi/issues/4496

[jira] [Created] (HUDI-3148) Unable to push metrics via pushgateway reporter with https

2022-01-03 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-3148: - Summary: Unable to push metrics via pushgateway reporter with https Key: HUDI-3148 URL: https://issues.apache.org/jira/browse/HUDI-3148 Project: Apache Hudi

[GitHub] [hudi] nsivabalan commented on issue #4496: [SUPPORT] Hudi is unable to push metrics to pushgateway via https

2022-01-03 Thread GitBox
nsivabalan commented on issue #4496: URL: https://github.com/apache/hudi/issues/4496#issuecomment-1004325823 Thanks for reporting. I have filed https://issues.apache.org/jira/browse/HUDI-3147

[GitHub] [hudi] nsivabalan commented on issue #4501: [SUPPORT] Duplicate records for same recordKey and differents partitionpath

2022-01-03 Thread GitBox
nsivabalan commented on issue #4501: URL: https://github.com/apache/hudi/issues/4501#issuecomment-1004322987 Yes, for partitioned datasets with a non-global index, the pair of partition path and record key is unique across the entire Hudi table. I see that the two rows in the attached screenshot have diff
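
The uniqueness rule described in the comment above can be illustrated with a small pure-Python model. This is a sketch of the keying semantics only, not Hudi's implementation: the dictionaries, function names, and sample records are all hypothetical, and the global case is simplified to "latest write for a record key wins".

```python
from typing import Dict, Tuple

Record = Dict[str, str]


def upsert_non_global(table: Dict[Tuple[str, str], Record], record: Record) -> None:
    """Non-global index model: uniqueness is per (partition path, record key)."""
    table[(record["partition"], record["key"])] = record


def upsert_global(table: Dict[str, Record], record: Record) -> None:
    """Global index model (simplified): uniqueness is per record key alone."""
    table[record["key"]] = record


# The same record key written twice with different partition paths.
first = {"key": "id-1", "partition": "2022-01-01", "value": "a"}
moved = {"key": "id-1", "partition": "2022-01-02", "value": "b"}

non_global_table: Dict[Tuple[str, str], Record] = {}
global_table: Dict[str, Record] = {}
for rec in (first, moved):
    upsert_non_global(non_global_table, rec)
    upsert_global(global_table, rec)

print(len(non_global_table))  # two rows: one per (partition, key) pair
print(len(global_table))      # one row: the key is unique table-wide
```

In this model, the "duplicate" reported in the issue corresponds to the non-global case: the two rows share a record key but differ in partition path, so each counts as a distinct record.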

[GitHub] [hudi] nsivabalan commented on issue #4499: [SUPPORT] DynamoDBBasedLockProvider support for local dynamodb

2022-01-03 Thread GitBox
nsivabalan commented on issue #4499: URL: https://github.com/apache/hudi/issues/4499#issuecomment-1004322248 Thanks for reporting. https://issues.apache.org/jira/browse/HUDI-3147

[GitHub] [hudi] nsivabalan closed issue #4499: [SUPPORT] DynamoDBBasedLockProvider support for local dynamodb

2022-01-03 Thread GitBox
nsivabalan closed issue #4499: URL: https://github.com/apache/hudi/issues/4499

[jira] [Created] (HUDI-3147) add support for local dynamo db lock provider

2022-01-03 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-3147: - Summary: add support for local dynamo db lock provider Key: HUDI-3147 URL: https://issues.apache.org/jira/browse/HUDI-3147 Project: Apache Hudi Iss

[jira] [Updated] (HUDI-3147) add support for local dynamo db lock provider

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3147: -- Fix Version/s: 0.11.0 0.10.1 > add support for local dynamo db lock p

[GitHub] [hudi] h7kanna commented on issue #4501: [SUPPORT] Duplicate records for same recordKey and differents partitionpath

2022-01-03 Thread GitBox
h7kanna commented on issue #4501: URL: https://github.com/apache/hudi/issues/4501#issuecomment-1004320713 Hi, the differences between the indexing schemes in Hudi are described at https://hudi.apache.org/blog/2020/11/11/hudi-indexing-mechanisms/ (check Global vs. Non-Global).

[GitHub] [hudi] nsivabalan closed issue #3499: [SUPPORT] Inline Clustering fails with Hudi

2022-01-03 Thread GitBox
nsivabalan closed issue #3499: URL: https://github.com/apache/hudi/issues/3499

[GitHub] [hudi] nsivabalan commented on issue #3499: [SUPPORT] Inline Clustering fails with Hudi

2022-01-03 Thread GitBox
nsivabalan commented on issue #3499: URL: https://github.com/apache/hudi/issues/3499#issuecomment-1004320635 Thanks, but do reach out to us if you hit any issues. We would be happy to assist you.

[GitHub] [hudi] jasondavindev opened a new issue #4501: [SUPPORT] Duplicate records for same recordKey and differents partitionpath

2022-01-03 Thread GitBox
jasondavindev opened a new issue #4501: URL: https://github.com/apache/hudi/issues/4501 **Describe the problem you faced** I've created a simple script to test insert and upsert operations. When I run an upsert operation for a given record but with a different partition field column valu

[jira] [Commented] (HUDI-52) Implement Savepoints for Merge On Read table #88

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17468164#comment-17468164 ] sivabalan narayanan commented on HUDI-52: - hey [~309637554] : I would like to take t

[jira] [Assigned] (HUDI-52) Implement Savepoints for Merge On Read table #88

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-52: --- Assignee: sivabalan narayanan (was: liwei) > Implement Savepoints for Merge On Read ta

[jira] [Updated] (HUDI-3129) Test Hudi Subtask

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3129: - Story Points: 1 > Test Hudi Subtask > - > > Key: HUDI-3129 >

[jira] [Updated] (HUDI-3145) test

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3145: - Story Points: 1 > test > > > Key: HUDI-3145 > URL: https://issues.apa

[jira] [Updated] (HUDI-3146) test sub task

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3146: - Story Points: 1 > test sub task > - > > Key: HUDI-3146 > URL:

[jira] [Updated] (HUDI-2644) Integrate existing curves with stats from the metadata table

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2644: - Epic Link: HUDI-2100 > Integrate existing curves with stats from the metadata table >

[jira] [Updated] (HUDI-2644) Integrate existing curves with stats from the metadata table

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2644: - Parent: (was: HUDI-2100) Issue Type: Improvement (was: Sub-task) > Integrate existing

[jira] [Updated] (HUDI-3146) test sub task

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3146: - Fix Version/s: 0.10.1 > test sub task > - > > Key: HUDI-3146 >

[jira] [Created] (HUDI-3146) test sub task

2022-01-03 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-3146: Summary: test sub task Key: HUDI-3146 URL: https://issues.apache.org/jira/browse/HUDI-3146 Project: Apache Hudi Issue Type: Sub-task Reporter: Raymond Xu

[jira] [Updated] (HUDI-3129) Test Hudi Subtask

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3129: - Parent Issue: HUDI-3145 (was: HUDI-3128) > Test Hudi Subtask > - > > Key:

[jira] [Updated] (HUDI-3145) test

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3145: - Fix Version/s: 0.10.1 > test > > > Key: HUDI-3145 > URL: https://issu

[jira] [Updated] (HUDI-3145) test

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3145: - Sprint: Hudi 0.10.1 test sprint > test > > > Key: HUDI-3145 > URL: ht

[jira] [Created] (HUDI-3145) test

2022-01-03 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-3145: Summary: test Key: HUDI-3145 URL: https://issues.apache.org/jira/browse/HUDI-3145 Project: Apache Hudi Issue Type: Task Reporter: Raymond Xu -- Thi

[jira] [Updated] (HUDI-3128) [UMBRELLA] Test Hudi Epic

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3128: - Epic Name: test > [UMBRELLA] Test Hudi Epic > - > > Key: HUDI-3128

[jira] [Created] (HUDI-3144) Parallelize metadata table getRecordsByKeys() operations

2022-01-03 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3144: Summary: Parallelize metadata table getRecordsByKeys() operations Key: HUDI-3144 URL: https://issues.apache.org/jira/browse/HUDI-3144 Project: Apache Hudi

[jira] [Updated] (HUDI-3129) Test Hudi Subtask

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3129: - Fix Version/s: 0.10.1 > Test Hudi Subtask > - > > Key: HUDI-3129 >

[jira] [Updated] (HUDI-3128) [UMBRELLA] Test Hudi Epic

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3128: - Fix Version/s: 0.10.1 > [UMBRELLA] Test Hudi Epic > - > > Key: HUD

[jira] [Created] (HUDI-3143) Support multiple file groups for metadata table index partitions

2022-01-03 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3143: Summary: Support multiple file groups for metadata table index partitions Key: HUDI-3143 URL: https://issues.apache.org/jira/browse/HUDI-3143 Project: Apache

[jira] [Created] (HUDI-3142) Metadata new Indices initialization during table creation

2022-01-03 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3142: Summary: Metadata new Indices initialization during table creation Key: HUDI-3142 URL: https://issues.apache.org/jira/browse/HUDI-3142 Project: Apache Hudi

[jira] [Updated] (HUDI-3141) Metadata table getAllFilesInPartition() crashes with NullPointerException

2022-01-03 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3141: - Parent: HUDI-1822 Issue Type: Sub-task (was: Task) > Metadata table getAllFilesIn

[jira] [Updated] (HUDI-3141) Metadata table getAllFilesInPartition() crashes with NullPointerException

2022-01-03 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3141: - Status: In Progress (was: Open) > Metadata table getAllFilesInPartition() crashes with Nu

[jira] [Created] (HUDI-3141) Metadata table getAllFilesInPartition() crashes with NullPointerException

2022-01-03 Thread Manoj Govindassamy (Jira)
Manoj Govindassamy created HUDI-3141: Summary: Metadata table getAllFilesInPartition() crashes with NullPointerException Key: HUDI-3141 URL: https://issues.apache.org/jira/browse/HUDI-3141 Project

[GitHub] [hudi] hudi-bot removed a comment on pull request #4020: [WIP][HUDI-2783] Upgrade HBase to 2.x

2022-01-03 Thread GitBox
hudi-bot removed a comment on pull request #4020: URL: https://github.com/apache/hudi/pull/4020#issuecomment-1004222459 ## CI report: * 790d2ba4b72e6174d90fb0f7fb15abf72f173181 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4020: [WIP][HUDI-2783] Upgrade HBase to 2.x

2022-01-03 Thread GitBox
hudi-bot commented on pull request #4020: URL: https://github.com/apache/hudi/pull/4020#issuecomment-1004259878 ## CI report: * b8a7bfaa1a64c979809f656503eb44c9e173a450 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot commented on pull request #4500: [MINOR] Add endpoint_url to dynamodb lock provider

2022-01-03 Thread GitBox
hudi-bot commented on pull request #4500: URL: https://github.com/apache/hudi/pull/4500#issuecomment-1004232266 ## CI report: * be749df0329f13128537622ca5865c1419be4280 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4500: [MINOR] Add endpoint_url to dynamodb lock provider

2022-01-03 Thread GitBox
hudi-bot removed a comment on pull request #4500: URL: https://github.com/apache/hudi/pull/4500#issuecomment-1004196916 ## CI report: * be749df0329f13128537622ca5865c1419be4280 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4020: [WIP][HUDI-2783] Upgrade HBase to 2.x

2022-01-03 Thread GitBox
hudi-bot commented on pull request #4020: URL: https://github.com/apache/hudi/pull/4020#issuecomment-1004222459 ## CI report: * 790d2ba4b72e6174d90fb0f7fb15abf72f173181 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4020: [WIP][HUDI-2783] Upgrade HBase to 2.x

2022-01-03 Thread GitBox
hudi-bot removed a comment on pull request #4020: URL: https://github.com/apache/hudi/pull/4020#issuecomment-1004220887 ## CI report: * 790d2ba4b72e6174d90fb0f7fb15abf72f173181 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4020: [WIP][HUDI-2783] Upgrade HBase to 2.x

2022-01-03 Thread GitBox
hudi-bot commented on pull request #4020: URL: https://github.com/apache/hudi/pull/4020#issuecomment-1004220887 ## CI report: * 790d2ba4b72e6174d90fb0f7fb15abf72f173181 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4020: [WIP][HUDI-2783] Upgrade HBase to 2.x

2022-01-03 Thread GitBox
hudi-bot removed a comment on pull request #4020: URL: https://github.com/apache/hudi/pull/4020#issuecomment-990284127 ## CI report: * 790d2ba4b72e6174d90fb0f7fb15abf72f173181 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/re

[GitHub] [hudi] hudi-bot commented on pull request #4500: [MINOR] Add endpoint_url to dynamodb lock provider

2022-01-03 Thread GitBox
hudi-bot commented on pull request #4500: URL: https://github.com/apache/hudi/pull/4500#issuecomment-1004196916 ## CI report: * be749df0329f13128537622ca5865c1419be4280 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4500: [MINOR] Add endpoint_url to dynamodb lock provider

2022-01-03 Thread GitBox
hudi-bot removed a comment on pull request #4500: URL: https://github.com/apache/hudi/pull/4500#issuecomment-1004194659 ## CI report: * be749df0329f13128537622ca5865c1419be4280 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run

[GitHub] [hudi] hudi-bot commented on pull request #4500: [MINOR] Add endpoint_url to dynamodb lock provider

2022-01-03 Thread GitBox
hudi-bot commented on pull request #4500: URL: https://github.com/apache/hudi/pull/4500#issuecomment-1004194659 ## CI report: * be749df0329f13128537622ca5865c1419be4280 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure`

[GitHub] [hudi] parisni opened a new pull request #4500: [MINOR] Add endpoint_url to dynamodb lock provider

2022-01-03 Thread GitBox
parisni opened a new pull request #4500: URL: https://github.com/apache/hudi/pull/4500 ## What is the purpose of the pull request This adds a config param to specify the endpoint URL for the DynamoDB lock provider. Fixes #4499. ## Verify this pull request I get tro

[GitHub] [hudi] hudi-bot commented on pull request #4498: [HUDI-3140] Fix bulk_insert failure on Spark 3.2.0

2022-01-03 Thread GitBox
hudi-bot commented on pull request #4498: URL: https://github.com/apache/hudi/pull/4498#issuecomment-1004167526 ## CI report: * bea80fb0c7cf51acdfc16308002e8fa9e53a0434 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4498: [HUDI-3140] Fix bulk_insert failure on Spark 3.2.0

2022-01-03 Thread GitBox
hudi-bot removed a comment on pull request #4498: URL: https://github.com/apache/hudi/pull/4498#issuecomment-1004131868 ## CI report: * bea80fb0c7cf51acdfc16308002e8fa9e53a0434 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] parisni opened a new issue #4499: [SUPPORT] DynamoDBBasedLockProvider support for local dynamodb

2022-01-03 Thread GitBox
parisni opened a new issue #4499: URL: https://github.com/apache/hudi/issues/4499 So far, we can only deal with AWS endpoints. However, for local development it is useful to have a local DynamoDB instance, so allowing the DynamoDB endpoint to be specified would help in that case.

[GitHub] [hudi] BenjMaq commented on issue #4154: [SUPPORT] INSERT OVERWRITE operation does not work when using Spark SQL

2022-01-03 Thread GitBox
BenjMaq commented on issue #4154: URL: https://github.com/apache/hudi/issues/4154#issuecomment-1004148341 Hi everyone, sorry for the late reply as I was on holidays. I tried again after @YannByron 's comment but this time I tried reading the files using Scala (`spark.read.format("hud

[GitHub] [hudi] hudi-bot commented on pull request #4497: [MINOR] Create pushgateway client based on port

2022-01-03 Thread GitBox
hudi-bot commented on pull request #4497: URL: https://github.com/apache/hudi/pull/4497#issuecomment-1004141119 ## CI report: * f3d6924b2c8f912bc95db932b3088487b76b0785 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4497: [MINOR] Create pushgateway client based on port

2022-01-03 Thread GitBox
hudi-bot removed a comment on pull request #4497: URL: https://github.com/apache/hudi/pull/4497#issuecomment-1004100037 ## CI report: * f3d6924b2c8f912bc95db932b3088487b76b0785 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4498: [HUDI-3140] Fix bulk_insert failure on Spark 3.2.0

2022-01-03 Thread GitBox
hudi-bot commented on pull request #4498: URL: https://github.com/apache/hudi/pull/4498#issuecomment-1004131868 ## CI report: * bea80fb0c7cf51acdfc16308002e8fa9e53a0434 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4498: [HUDI-3140] Fix bulk_insert failure on Spark 3.2.0

2022-01-03 Thread GitBox
hudi-bot removed a comment on pull request #4498: URL: https://github.com/apache/hudi/pull/4498#issuecomment-1004129857 ## CI report: * bea80fb0c7cf51acdfc16308002e8fa9e53a0434 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run

[GitHub] [hudi] hudi-bot commented on pull request #4498: [HUDI-3140] Fix bulk_insert failure on Spark 3.2.0

2022-01-03 Thread GitBox
hudi-bot commented on pull request #4498: URL: https://github.com/apache/hudi/pull/4498#issuecomment-1004129857 ## CI report: * bea80fb0c7cf51acdfc16308002e8fa9e53a0434 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure`

[jira] [Updated] (HUDI-3140) Fix bulk_insert failure on Spark 3.2.0

2022-01-03 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3140: - Labels: pull-request-available (was: ) > Fix bulk_insert failure on Spark 3.2.0 > ---

[GitHub] [hudi] leesf opened a new pull request #4498: [HUDI-3140] Fix bulk_insert failure on Spark 3.2.0

2022-01-03 Thread GitBox
leesf opened a new pull request #4498: URL: https://github.com/apache/hudi/pull/4498 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the purpose o

[jira] [Created] (HUDI-3140) Fix bulk_insert failure on Spark 3.2.0

2022-01-03 Thread leesf (Jira)
leesf created HUDI-3140: --- Summary: Fix bulk_insert failure on Spark 3.2.0 Key: HUDI-3140 URL: https://issues.apache.org/jira/browse/HUDI-3140 Project: Apache Hudi Issue Type: Sub-task Repor

[GitHub] [hudi] dmenin commented on issue #3975: [SUPPORT] Question on hudi's delete statment taking too long

2022-01-03 Thread GitBox
dmenin commented on issue #3975: URL: https://github.com/apache/hudi/issues/3975#issuecomment-1004110981 > Hi @nsivabalan "How is your updates/deletes spread in general? " I have the data partitioned by year/month/day, and deal with time related data, the vast majorit

[GitHub] [hudi] nsivabalan commented on issue #4311: Duplicate Records in Merge on Read [SUPPORT]

2022-01-03 Thread GitBox
nsivabalan commented on issue #4311: URL: https://github.com/apache/hudi/issues/4311#issuecomment-1004101626 Thanks for the timeline. Here are our findings so far. Unfortunately, we don't have a root cause yet, but we are sharing some findings for now. 1. We see same data file being cl
