[jira] [Updated] (HUDI-1388) [UMBRELLA] Improve CLI features and usabilities

2022-01-03 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-1388: Epic Name: Hudi CLI > [UMBRELLA] Improve CLI features and usabilities >

[jira] [Updated] (HUDI-3007) Address minor feedbacks on the repair utility

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-3007: - Sprint: Hudi-Sprint-Jan-3 > Address minor feedbacks on the repair utility > --

[jira] [Updated] (HUDI-3159) Hudi release 0.10.1 work

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3159: -- Story Points: 4 (was: 3) > Hudi release 0.10.1 work > > >

[jira] [Updated] (HUDI-3159) Hudi release 0.10.1 work

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3159: -- Story Points: 6 (was: 4) > Hudi release 0.10.1 work > > >

[jira] [Created] (HUDI-3159) Hudi release 0.10.1 work

2022-01-03 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-3159: - Summary: Hudi release 0.10.1 work Key: HUDI-3159 URL: https://issues.apache.org/jira/browse/HUDI-3159 Project: Apache Hudi Issue Type: Task

[jira] [Updated] (HUDI-1978) [UMBRELLA] Support for Hudi tables in trino-hive connector

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1978: - Epic Name: Trino-Hudi-Hive-Connector > [UMBRELLA] Support for Hudi tables in trino-hive connector

[jira] [Updated] (HUDI-3158) Reduce warn logs in Spark SQL INSERT OVERWRITE

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3158: - Epic Link: HUDI-1658 > Reduce warn logs in Spark SQL INSERT OVERWRITE > --

[jira] [Updated] (HUDI-3158) Reduce warn logs in Spark SQL INSERT OVERWRITE

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3158: - Labels: sev:normal (was: ) > Reduce warn logs in Spark SQL INSERT OVERWRITE > ---

[jira] [Updated] (HUDI-3158) Reduce warn logs in Spark SQL INSERT OVERWRITE

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3158: - Description: 22/01/03 19:35:12 WARN ClusteringUtils: No content found in requested file for instant [==>2

[jira] [Updated] (HUDI-3158) Reduce warn logs in Spark SQL INSERT OVERWRITE

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3158: - Component/s: Spark Integration > Reduce warn logs in Spark SQL INSERT OVERWRITE >

[jira] [Created] (HUDI-3158) Reduce warn logs in Spark SQL INSERT OVERWRITE

2022-01-03 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-3158: Summary: Reduce warn logs in Spark SQL INSERT OVERWRITE Key: HUDI-3158 URL: https://issues.apache.org/jira/browse/HUDI-3158 Project: Apache Hudi Issue Type: Improvem

[jira] [Updated] (HUDI-2973) Rewrite/re-publish RFC for Data skipping index

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2973: -- Story Points: 0 > Rewrite/re-publish RFC for Data skipping index > -

[jira] [Updated] (HUDI-3007) Address minor feedbacks on the repair utility

2022-01-03 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3007: Epic Link: HUDI-1388 > Address minor feedbacks on the repair utility > -

[jira] [Updated] (HUDI-3157) Shade aws-dependencies to avoid class conflicts

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3157: - Priority: Critical (was: Major) > Shade aws-dependencies to avoid class conflicts > -

[jira] [Created] (HUDI-3157) Shade aws-dependencies to avoid class conflicts

2022-01-03 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-3157: Summary: Shade aws-dependencies to avoid class conflicts Key: HUDI-3157 URL: https://issues.apache.org/jira/browse/HUDI-3157 Project: Apache Hudi Issue Type: Bug

[jira] [Updated] (HUDI-3157) Shade aws-dependencies to avoid class conflicts

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3157: - Labels: user-support-issues (was: ) > Shade aws-dependencies to avoid class conflicts > -

[jira] [Updated] (HUDI-3157) Shade aws-dependencies to avoid class conflicts

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3157: - Fix Version/s: 0.11.0 0.10.1 > Shade aws-dependencies to avoid class conflicts > --

[jira] [Updated] (HUDI-3157) Shade aws-dependencies to avoid class conflicts

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3157: - Labels: sev:critical user-support-issues (was: user-support-issues) > Shade aws-dependencies to avoid cla

[jira] [Updated] (HUDI-3156) Add detlastreamer source for file listing (used in bootstrapping cases)

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3156: -- Fix Version/s: 0.11.0 > Add detlastreamer source for file listing (used in bootstrapping

[jira] [Updated] (HUDI-3156) Add detlastreamer source for file listing (used in bootstrapping cases)

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3156: -- Story Points: 2 > Add detlastreamer source for file listing (used in bootstrapping cases

[jira] [Assigned] (HUDI-3156) Add detlastreamer source for file listing (used in bootstrapping cases)

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-3156: - Assignee: sivabalan narayanan > Add detlastreamer source for file listing (used i

[jira] [Updated] (HUDI-1295) Implement: Metadata based bloom index - write path

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1295: - Sprint: Hudi-Sprint-Jan-3 > Implement: Metadata based bloom index - write path > -

[jira] [Updated] (HUDI-2584) Unit tests for bloom filter index based out of metadata table.

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2584: - Sprint: Hudi-Sprint-Jan-3 > Unit tests for bloom filter index based out of metadata table. >

[jira] [Updated] (HUDI-2518) Implement stats/range tracking as a part of Metadata table

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2518: - Sprint: Hudi-Sprint-Jan-3 > Implement stats/range tracking as a part of Metadata table > -

[jira] [Created] (HUDI-3156) Add detlastreamer source for file listing (used in bootstrapping cases)

2022-01-03 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-3156: - Summary: Add detlastreamer source for file listing (used in bootstrapping cases) Key: HUDI-3156 URL: https://issues.apache.org/jira/browse/HUDI-3156 Project

[jira] [Updated] (HUDI-52) Implement Savepoints for Merge On Read table #88

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-52: Story Points: 1 > Implement Savepoints for Merge On Read table #88 > -

[jira] [Updated] (HUDI-2714) Benchmark MetaIndex performance w/ bloom and column stat metadata

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2714: - Sprint: Hudi-Sprint-Jan-3 > Benchmark MetaIndex performance w/ bloom and column stat metadata > -

[jira] [Updated] (HUDI-3141) Metadata table getAllFilesInPartition() crashes with NullPointerException

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-3141: - Story Points: 1 (was: 5) > Metadata table getAllFilesInPartition() crashes with NullPointerExcept

[jira] [Updated] (HUDI-3141) Metadata table getAllFilesInPartition() crashes with NullPointerException

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-3141: - Sprint: Hudi-Sprint-Jan-3 > Metadata table getAllFilesInPartition() crashes with NullPointerExcept

[jira] [Updated] (HUDI-2714) Benchmark MetaIndex performance w/ bloom and column stat metadata

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2714: - Story Points: 3 (was: 15) > Benchmark MetaIndex performance w/ bloom and column stat metadata >

[jira] [Updated] (HUDI-2590) Validate Diff key gen w/ and w/o glob path with and w/o metadata enabled

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2590: -- Story Points: 0 > Validate Diff key gen w/ and w/o glob path with and w/o metadata enabl

[jira] [Updated] (HUDI-2590) Validate Diff key gen w/ and w/o glob path with and w/o metadata enabled

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2590: -- Remaining Estimate: 1m Original Estimate: 1m > Validate Diff key gen w/ and w/o glo

[jira] [Updated] (HUDI-2590) Validate Diff key gen w/ and w/o glob path with and w/o metadata enabled

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2590: -- Remaining Estimate: 0h (was: 1m) Original Estimate: 0h (was: 1m) > Validate Diff

[jira] [Updated] (HUDI-2973) Rewrite/re-publish RFC for Data skipping index

2022-01-03 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2973: - Reviewers: Prashant Wason, Vinoth Chandar > Rewrite/re-publish RFC for Data skipping index > -

[jira] [Updated] (HUDI-3012) Investigate: Metadata table write performance impact

2022-01-03 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-3012: - Epic Link: HUDI-1292 > Investigate: Metadata table write performance impact >

[jira] [Updated] (HUDI-1822) [Umbrella] Multi Modal Indexing

2022-01-03 Thread Manoj Govindassamy (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-1822: - Epic Name: RFC-27 Multi Modal Indexing (was: RFC-27 Data Skipping) > [Umbrella] Multi Mod

[jira] [Updated] (HUDI-3065) spark auto partition discovery does not work from 0.9.0

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3065: -- Sprint: Hudi 0.10.1 - 2021/01/03 > spark auto partition discovery does not work from 0.

[jira] [Updated] (HUDI-3135) Fix Show Partitions Command's Result after drop partition

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3135: -- Sprint: Hudi 0.10.1 - 2021/01/03 > Fix Show Partitions Command's Result after drop part

[jira] [Updated] (HUDI-2947) HoodieDeltaStreamer/DeltaSync can improperly pick up the checkpoint config from CLI in continuous mode

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2947: -- Sprint: Hudi 0.10.1 - 2021/01/03 > HoodieDeltaStreamer/DeltaSync can improperly pick up

[jira] [Updated] (HUDI-3147) add support for local dynamo db lock provider

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3147: -- Sprint: Hudi 0.10.1 - 2021/01/03 > add support for local dynamo db lock provider >

[jira] [Updated] (HUDI-3132) Minor fixes for HoodieCatalog

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3132: -- Sprint: Hudi 0.10.1 - 2021/01/03 > Minor fixes for HoodieCatalog >

[jira] [Updated] (HUDI-3151) Docs for Spark SQL type support

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3151: -- Sprint: Hudi 0.10.1 - 2021/01/03 > Docs for Spark SQL type support > --

[jira] [Updated] (HUDI-2780) Mor reads the log file and skips the complete block as a bad block, resulting in data loss

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2780: -- Sprint: Hudi 0.10.1 - 2021/01/03 > Mor reads the log file and skips the complete block

[jira] [Updated] (HUDI-3148) Unable to push metrics via pushgateway reporter with https

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3148: -- Sprint: Hudi 0.10.1 - 2021/01/03 > Unable to push metrics via pushgateway reporter with

[jira] [Updated] (HUDI-3139) Shade htrace and parquet-avro

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3139: -- Sprint: Hudi 0.10.1 - 2021/01/03 > Shade htrace and parquet-avro >

[jira] [Updated] (HUDI-2711) Fallback to full table scan for IncrementalRelation and HoodieIncrSource when data file is missing.

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2711: -- Sprint: Hudi 0.10.1 - 2021/01/03 > Fallback to full table scan for IncrementalRelation

[jira] [Created] (HUDI-3155) java.lang.NoSuchFieldError for logical timestamp types when run hive sync tool

2022-01-03 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-3155: - Summary: java.lang.NoSuchFieldError for logical timestamp types when run hive sync tool Key: HUDI-3155 URL: https://issues.apache.org/jira/browse/HUDI-3155 Project: Apache

[jira] [Updated] (HUDI-2780) Mor reads the log file and skips the complete block as a bad block, resulting in data loss

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-2780: - Priority: Critical (was: Blocker) > Mor reads the log file and skips the complete block as a bad block, r

[jira] [Updated] (HUDI-3155) java.lang.NoSuchFieldError for logical timestamp types when run hive sync tool

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3155: -- Fix Version/s: 0.11.0 > java.lang.NoSuchFieldError for logical timestamp types when run hive sync tool >

[jira] [Updated] (HUDI-3139) Shade htrace and parquet-avro

2022-01-03 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-3139: - Priority: Critical (was: Blocker) > Shade htrace and parquet-avro > - > >

[GitHub] [hudi] boneanxs edited a comment on issue #4474: [SUPPORT] Should we shade all aws dependencies to avoid class conflicts?

2022-01-03 Thread GitBox
boneanxs edited a comment on issue #4474: URL: https://github.com/apache/hudi/issues/4474#issuecomment-1004507464 @xushiyan @zhedoubushishi, it looks like this is a common case for many users. Does shading this have other side effects or not? If not, I'm willing to raise a ticket to solve this.

[GitHub] [hudi] boneanxs commented on issue #4474: [SUPPORT] Should we shade all aws dependencies to avoid class conflicts?

2022-01-03 Thread GitBox
boneanxs commented on issue #4474: URL: https://github.com/apache/hudi/issues/4474#issuecomment-1004507464 @xushiyan @zhedoubushishi, it looks like this is a common case for many users. Does shading this have other side effects or not? If not, I'm willing to raise a ticket to solve this.

[jira] [Updated] (HUDI-2762) Ensure hive can query insert only logs in MOR

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-2762: -- Story Points: 2 > Ensure hive can query insert only logs in MOR > --

[jira] [Updated] (HUDI-1862) Handle inconsistent views of source table and Hudi dataset

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-1862: -- Story Points: 4 > Handle inconsistent views of source table and Hudi dataset > -

[jira] [Updated] (HUDI-3139) Shade htrace and parquet-avro

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3139: -- Story Points: 1 > Shade htrace and parquet-avro > - > > Key:

[jira] [Updated] (HUDI-1042) [Umbrella] Support clustering on filegroups

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-1042: -- Epic Name: Table Services - Clustering > [Umbrella] Support clustering on filegroups > -

[jira] [Updated] (HUDI-2774) Async Clustering via deltstreamer fails with IllegalStateException: Duplicate key [==>20211116123724586__replacecommit__INFLIGHT]

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-2774: -- Epic Link: HUDI-1042 > Async Clustering via deltstreamer fails with IllegalStateException: Duplicate >

[jira] [Updated] (HUDI-2774) Async Clustering via deltstreamer fails with IllegalStateException: Duplicate key [==>20211116123724586__replacecommit__INFLIGHT]

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-2774: -- Story Points: 2 > Async Clustering via deltstreamer fails with IllegalStateException: Duplicate > key [

[jira] [Created] (HUDI-3154) Benchmark metadata-based listing with reuse disabled

2022-01-03 Thread Sagar Sumit (Jira)
Sagar Sumit created HUDI-3154: - Summary: Benchmark metadata-based listing with reuse disabled Key: HUDI-3154 URL: https://issues.apache.org/jira/browse/HUDI-3154 Project: Apache Hudi Issue Type:

[jira] [Updated] (HUDI-3006) Benchmark metadata-based listing with reuse enabled

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3006: -- Summary: Benchmark metadata-based listing with reuse enabled (was: Benchmark metadata-based listing wit

[jira] [Resolved] (HUDI-3006) Benchmark metadata-based listing with reuse enabled

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit resolved HUDI-3006. --- > Benchmark metadata-based listing with reuse enabled > --

[jira] [Updated] (HUDI-3006) Benchmark metadata-based listing with reuse enabled

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3006: -- Story Points: 1 (was: 2) > Benchmark metadata-based listing with reuse enabled > --

[jira] [Updated] (HUDI-3097) Address dependency issue with hudi-trino-bundle in connector

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3097: -- Status: Patch Available (was: In Progress) > Address dependency issue with hudi-trino-bundle in connect

[jira] [Updated] (HUDI-2971) Timestamp values being corrupted when using BULK INSERT with row writing enabled

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-2971: -- Status: Patch Available (was: In Progress) > Timestamp values being corrupted when using BULK INSERT wi

[jira] [Updated] (HUDI-2971) Timestamp values being corrupted when using BULK INSERT with row writing enabled

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-2971: -- Story Points: 2 > Timestamp values being corrupted when using BULK INSERT with row writing > enabled >

[jira] [Updated] (HUDI-2909) KeyGenerator is broken in 0.10.0

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-2909: -- Story Points: 2 > KeyGenerator is broken in 0.10.0 > > >

[jira] [Updated] (HUDI-2774) Async Clustering via deltstreamer fails with IllegalStateException: Duplicate key [==>20211116123724586__replacecommit__INFLIGHT]

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-2774: -- Status: Open (was: Patch Available) > Async Clustering via deltstreamer fails with IllegalStateExceptio

[jira] [Updated] (HUDI-2909) KeyGenerator is broken in 0.10.0

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-2909: -- Status: Patch Available (was: In Progress) > KeyGenerator is broken in 0.10.0 > ---

[jira] [Updated] (HUDI-3139) Shade htrace and parquet-avro

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3139: -- Fix Version/s: 0.11.0 > Shade htrace and parquet-avro > - > >

[jira] [Updated] (HUDI-3139) Shade htrace and parquet-avro

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3139: -- Priority: Blocker (was: Major) > Shade htrace and parquet-avro > - > >

[jira] [Updated] (HUDI-3139) Shade htrace and parquet-avro

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3139: -- Status: Patch Available (was: In Progress) > Shade htrace and parquet-avro > --

[jira] [Updated] (HUDI-3139) Shade htrace and parquet-avro

2022-01-03 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sagar Sumit updated HUDI-3139: -- Fix Version/s: 0.10.1 > Shade htrace and parquet-avro > - > >

[jira] [Updated] (HUDI-3153) Make Trino connector implementation extensible for different table/query types, data formats, etc.

2022-01-03 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3153: Status: In Progress (was: Open) > Make Trino connector implementation extensible for different table/query

[GitHub] [hudi] hudi-bot commented on pull request #4478: [HUDI-2966] Closing LogRecordScanner in compactor

2022-01-03 Thread GitBox
hudi-bot commented on pull request #4478: URL: https://github.com/apache/hudi/pull/4478#issuecomment-1004489100 ## CI report: * 0e6ae14a0643c581966b8b7eb1a35d8abf831812 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4478: [HUDI-2966] Closing LogRecordScanner in compactor

2022-01-03 Thread GitBox
hudi-bot removed a comment on pull request #4478: URL: https://github.com/apache/hudi/pull/4478#issuecomment-1004471077 ## CI report: * ef365093435ae38aa5f0f918f87f05be0ed414cd Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] a0x edited a comment on issue #4442: [SUPPORT] PySpark(3.1.2) with Hudi(0.10.0) failed when querying spark sql

2022-01-03 Thread GitBox
a0x edited a comment on issue #4442: URL: https://github.com/apache/hudi/issues/4442#issuecomment-1004486507 > I have the same issue when running hudi on emr. This issue seems to have the same root cause as in this one: #4474 . The solution is to shade and relocate aws dependencies introdu

[GitHub] [hudi] a0x commented on issue #4442: [SUPPORT] PySpark(3.1.2) with Hudi(0.10.0) failed when querying spark sql

2022-01-03 Thread GitBox
a0x commented on issue #4442: URL: https://github.com/apache/hudi/issues/4442#issuecomment-1004486507 > I have the same issue when running hudi on emr. This issue seems to have the same root cause as in this one: #4474 . The solution is to shade and relocate aws dependencies introduced in

[GitHub] [hudi] a0x commented on issue #4442: [SUPPORT] PySpark(3.1.2) with Hudi(0.10.0) failed when querying spark sql

2022-01-03 Thread GitBox
a0x commented on issue #4442: URL: https://github.com/apache/hudi/issues/4442#issuecomment-1004485623 @xushiyan Thanks for your reply. Do you mean not to replace the Hudi 0.8.0 bundled in EMR, and to start the Spark session with Hudi 0.10.0, which is in another, separate dir? To be honest I t

[jira] [Created] (HUDI-3153) Make Trino connector implementation extensible for different table/query types, data formats, etc.

2022-01-03 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-3153: --- Summary: Make Trino connector implementation extensible for different table/query types, data formats, etc. Key: HUDI-3153 URL: https://issues.apache.org/jira/browse/HUDI-3153

[jira] [Updated] (HUDI-2724) Benchmark connector

2022-01-03 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-2724: Story Points: 1 (was: 2) > Benchmark connector > --- > > Key: HUDI-2724 >

[jira] [Updated] (HUDI-3126) Address whackamoles during testing of Hudi Trino connector

2022-01-03 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3126: Status: Patch Available (was: In Progress) > Address whackamoles during testing of Hudi Trino connector > -

[jira] [Updated] (HUDI-3126) Address whackamoles during testing of Hudi Trino connector

2022-01-03 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3126: Story Points: 0 (was: 3) > Address whackamoles during testing of Hudi Trino connector > ---

[jira] [Commented] (HUDI-3126) Address whackamoles during testing of Hudi Trino connector

2022-01-03 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17468324#comment-17468324 ] Ethan Guo commented on HUDI-3126: - https://github.com/codope/trino/pull/13 > Address whac

[GitHub] [hudi] YannByron edited a comment on issue #4477: [SUPPORT]using spark on TimestampBasedKeyGenerator has no result when query by partition column

2022-01-03 Thread GitBox
YannByron edited a comment on issue #4477: URL: https://github.com/apache/hudi/issues/4477#issuecomment-1004474344 @taisenki @nsivabalan I found there are different behaviors between the COW and MOR table types using the same config above. I'll take this up and reply asap.

[jira] [Commented] (HUDI-3122) presto query failed for bootstrap tables

2022-01-03 Thread Yue Zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17468314#comment-17468314 ] Yue Zhang commented on HUDI-3122: - Hi [~wenningd] are there any updates? :) > presto quer

[GitHub] [hudi] YannByron commented on issue #4477: [SUPPORT]using spark on TimestampBasedKeyGenerator has no result when query by partition column

2022-01-03 Thread GitBox
YannByron commented on issue #4477: URL: https://github.com/apache/hudi/issues/4477#issuecomment-1004474344 @taisenki @nsivabalan I found there are different behaviors between the COW and MOR table types. I'll take this up and reply asap. -- This is an automated message from the Apache Git Service.

[hudi] branch master updated: [HUDI-3140] Fix bulk_insert failure on Spark 3.2.0 (#4498)

2022-01-03 Thread leesf
This is an automated email from the ASF dual-hosted git repository. leesf pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 29ab6fb [HUDI-3140] Fix bulk_insert failure on Spa

[GitHub] [hudi] leesf merged pull request #4498: [HUDI-3140] Fix bulk_insert failure on Spark 3.2.0

2022-01-03 Thread GitBox
leesf merged pull request #4498: URL: https://github.com/apache/hudi/pull/4498 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...

[GitHub] [hudi] leesf commented on pull request #4498: [HUDI-3140] Fix bulk_insert failure on Spark 3.2.0

2022-01-03 Thread GitBox
leesf commented on pull request #4498: URL: https://github.com/apache/hudi/pull/4498#issuecomment-1004474102 Merging as the CI passed and verify manually on 3.2.0 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

[GitHub] [hudi] YannByron commented on issue #4154: [SUPPORT] INSERT OVERWRITE operation does not work when using Spark SQL

2022-01-03 Thread GitBox
YannByron commented on issue #4154: URL: https://github.com/apache/hudi/issues/4154#issuecomment-1004473756 @BenjMaq Does Spark-SQL work correctly also? Can @xushiyan give some advices for Hive/Presto? -- This is an automated message from the Apache Git Service. To respond to the messa

[GitHub] [hudi] nsivabalan commented on a change in pull request #4473: [HUDI-2590] Adding tests to validate different key generators

2022-01-03 Thread GitBox
nsivabalan commented on a change in pull request #4473: URL: https://github.com/apache/hudi/pull/4473#discussion_r84743 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSourceStorage.scala ## @@ -100,8 +120,26 @@ class Tes

[jira] [Assigned] (HUDI-3152) Add support to generate updates in HoodieTestDataGenerator for TimestampBasedKeyGen

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-3152: - Assignee: sivabalan narayanan > Add support to generate updates in HoodieTestData

[jira] [Updated] (HUDI-3152) Add support to generate updates in HoodieTestDataGenerator for TimestampBasedKeyGen

2022-01-03 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-3152: -- Fix Version/s: 0.11.0 > Add support to generate updates in HoodieTestDataGenerator for

[jira] [Created] (HUDI-3152) Add support to generate updates in HoodieTestDataGenerator for TimestampBasedKeyGen

2022-01-03 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-3152: - Summary: Add support to generate updates in HoodieTestDataGenerator for TimestampBasedKeyGen Key: HUDI-3152 URL: https://issues.apache.org/jira/browse/HUDI-3152

[GitHub] [hudi] hudi-bot commented on pull request #4478: [HUDI-2966] Closing LogRecordScanner in compactor

2022-01-03 Thread GitBox
hudi-bot commented on pull request #4478: URL: https://github.com/apache/hudi/pull/4478#issuecomment-1004471077 ## CI report: * ef365093435ae38aa5f0f918f87f05be0ed414cd Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4478: [HUDI-2966] Closing LogRecordScanner in compactor

2022-01-03 Thread GitBox
hudi-bot removed a comment on pull request #4478: URL: https://github.com/apache/hudi/pull/4478#issuecomment-1004470177 ## CI report: * ef365093435ae38aa5f0f918f87f05be0ed414cd Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] hudi-bot commented on pull request #4478: [HUDI-2966] Closing LogRecordScanner in compactor

2022-01-03 Thread GitBox
hudi-bot commented on pull request #4478: URL: https://github.com/apache/hudi/pull/4478#issuecomment-1004470177 ## CI report: * ef365093435ae38aa5f0f918f87f05be0ed414cd Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?b

[GitHub] [hudi] hudi-bot removed a comment on pull request #4478: [HUDI-2966] Closing LogRecordScanner in compactor

2022-01-03 Thread GitBox
hudi-bot removed a comment on pull request #4478: URL: https://github.com/apache/hudi/pull/4478#issuecomment-1003211881 ## CI report: * ef365093435ae38aa5f0f918f87f05be0ed414cd Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/r

[GitHub] [hudi] nsivabalan commented on pull request #4405: HUDI-3068 Fixing sync all partitions

2022-01-03 Thread GitBox
nsivabalan commented on pull request #4405: URL: https://github.com/apache/hudi/pull/4405#issuecomment-1004469797 @xiarixiaoyao : Do you have context around updating partitions in hive. can you check my comment above and respond. -- This is an automated message from the Apache Git Servi

[GitHub] [hudi] nsivabalan commented on pull request #4478: [HUDI-2966] Closing LogRecordScanner in compactor

2022-01-03 Thread GitBox
nsivabalan commented on pull request #4478: URL: https://github.com/apache/hudi/pull/4478#issuecomment-1004469280 @xiarixiaoyao : addressed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the
