[GitHub] [hudi] kywe665 opened a new pull request #3763: [MINOR] - Fixed typo in docker demo docs for kafkacat -> kcat

2021-10-07 Thread GitBox
kywe665 opened a new pull request #3763: URL: https://github.com/apache/hudi/pull/3763 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the

[GitHub] [hudi] hudi-bot edited a comment on pull request #3623: [WIP][HUDI-2409] Using HBase shaded jars in Hudi presto bundle

2021-10-07 Thread GitBox
hudi-bot edited a comment on pull request #3623: URL: https://github.com/apache/hudi/pull/3623#issuecomment-915056982 ## CI report: * 44b255665f688477279fce5d07bf29c5537b7f05 UNKNOWN * 20c9cfdb70b3652c80fc4339789285473f6a7cbc Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3623: [WIP][HUDI-2409] Using HBase shaded jars in Hudi presto bundle

2021-10-07 Thread GitBox
hudi-bot edited a comment on pull request #3623: URL: https://github.com/apache/hudi/pull/3623#issuecomment-915056982 ## CI report: * 78577241f38f2021052fb62c8c19ed67d0db012e Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3623: [WIP][HUDI-2409] Using HBase shaded jars in Hudi presto bundle

2021-10-07 Thread GitBox
hudi-bot edited a comment on pull request #3623: URL: https://github.com/apache/hudi/pull/3623#issuecomment-915056982 ## CI report: * 78577241f38f2021052fb62c8c19ed67d0db012e Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3623: [WIP][HUDI-2409] Using HBase shaded jars in Hudi presto bundle

2021-10-07 Thread GitBox
hudi-bot edited a comment on pull request #3623: URL: https://github.com/apache/hudi/pull/3623#issuecomment-915056982 ## CI report: * 78577241f38f2021052fb62c8c19ed67d0db012e Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3719: [HUDI-2489]Tuning HoodieROTablePathFilter by caching hoodieTableFileSystemView, aiming to reduce unnecessary list/get requests

2021-10-07 Thread GitBox
hudi-bot edited a comment on pull request #3719: URL: https://github.com/apache/hudi/pull/3719#issuecomment-927270024 ## CI report: * 82b6fa38d0f2e8fca6a7804a550b89a4328644f2 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3623: [WIP][HUDI-2409] Using HBase shaded jars in Hudi presto bundle

2021-10-07 Thread GitBox
hudi-bot edited a comment on pull request #3623: URL: https://github.com/apache/hudi/pull/3623#issuecomment-915056982 ## CI report: * 72fc50ea33d6267ebdc9a0ecd81cb4df3c833814 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3623: [WIP][HUDI-2409] Using HBase shaded jars in Hudi presto bundle

2021-10-07 Thread GitBox
hudi-bot edited a comment on pull request #3623: URL: https://github.com/apache/hudi/pull/3623#issuecomment-915056982 ## CI report: * 72fc50ea33d6267ebdc9a0ecd81cb4df3c833814 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3719: [HUDI-2489]Tuning HoodieROTablePathFilter by caching hoodieTableFileSystemView, aiming to reduce unnecessary list/get requests

2021-10-07 Thread GitBox
hudi-bot edited a comment on pull request #3719: URL: https://github.com/apache/hudi/pull/3719#issuecomment-927270024 ## CI report: * f4d2fc3d664279975d143494274b787c6a6d5db1 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3719: [HUDI-2489]Tuning HoodieROTablePathFilter by caching hoodieTableFileSystemView, aiming to reduce unnecessary list/get requests

2021-10-07 Thread GitBox
hudi-bot edited a comment on pull request #3719: URL: https://github.com/apache/hudi/pull/3719#issuecomment-927270024 ## CI report: * f4d2fc3d664279975d143494274b787c6a6d5db1 Azure:

[GitHub] [hudi] danny0405 commented on a change in pull request #3203: [HUDI-2086] Redo the logical of mor_incremental_view for hive

2021-10-07 Thread GitBox
danny0405 commented on a change in pull request #3203: URL: https://github.com/apache/hudi/pull/3203#discussion_r724668250 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieMergedLogReader.java ## @@ -0,0 +1,144 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] nsivabalan commented on pull request #3203: [HUDI-2086] Redo the logical of mor_incremental_view for hive

2021-10-07 Thread GitBox
nsivabalan commented on pull request #3203: URL: https://github.com/apache/hudi/pull/3203#issuecomment-938298379 Sorry for the long delay. I will review this by this weekend. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [hudi] xushiyan commented on a change in pull request #3416: [HUDI-2362] Add external config file support

2021-10-07 Thread GitBox
xushiyan commented on a change in pull request #3416: URL: https://github.com/apache/hudi/pull/3416#discussion_r724653630 ## File path: hudi-common/src/main/java/org/apache/hudi/common/config/DFSPropertiesConfiguration.java ## @@ -43,70 +45,88 @@ private static final

[GitHub] [hudi] danny0405 commented on a change in pull request #3203: [HUDI-2086] Redo the logical of mor_incremental_view for hive

2021-10-07 Thread GitBox
danny0405 commented on a change in pull request #3203: URL: https://github.com/apache/hudi/pull/3203#discussion_r724660978 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java ## @@ -336,6 +336,11 @@ public static String getFileExtensionFromLog(Path

[GitHub] [hudi] rubenssoto edited a comment on issue #3751: [SUPPORT] Slow Write Speeds to Hudi

2021-10-07 Thread GitBox
rubenssoto edited a comment on issue #3751: URL: https://github.com/apache/hudi/issues/3751#issuecomment-938290542 @MikeBuh since July 16, Athena has full support for MoR tables https://docs.aws.amazon.com/athena/latest/ug/release-note-2021-07-16.html

[GitHub] [hudi] rubenssoto commented on issue #3751: [SUPPORT] Slow Write Speeds to Hudi

2021-10-07 Thread GitBox
rubenssoto commented on issue #3751: URL: https://github.com/apache/hudi/issues/3751#issuecomment-938290542 @MikeBuh since July 16, Athena has full support for MoR tables https://docs.aws.amazon.com/athena/latest/ug/release-note-2021-07-16.html

[GitHub] [hudi] hudi-bot edited a comment on pull request #3762: [HUDI-1294] Adding inline read and seekable read for hfile log blocks in metadata table

2021-10-07 Thread GitBox
hudi-bot edited a comment on pull request #3762: URL: https://github.com/apache/hudi/pull/3762#issuecomment-938271221 ## CI report: * 5fb7a2afa196fd75ada005d26a0fb9fce5472545 UNKNOWN * cb7e9cea8fa966437a892be1e0917443c034e21e Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3761: [HUDI-2513] OverwriteNonDefaultsWithLatestAvroPayload doesn`t work when upsert data with some null value column

2021-10-07 Thread GitBox
hudi-bot edited a comment on pull request #3761: URL: https://github.com/apache/hudi/pull/3761#issuecomment-938264265 ## CI report: * cf20d97ab77a55797f1bcb4ee7dcb614681e8ae3 Azure:

[GitHub] [hudi] guanziyue commented on issue #3755: [Delta Streamer] file name mismatch with meta when compaction running

2021-10-07 Thread GitBox
guanziyue commented on issue #3755: URL: https://github.com/apache/hudi/issues/3755#issuecomment-938281439 It seems that the file left in the reconcile stage is different from the commit meta. Could you kindly share the relevant logs and the file status of the marker file?

[jira] [Commented] (HUDI-2275) HoodieDeltaStreamerException when using OCC and a second concurrent writer

2021-10-07 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425912#comment-17425912 ] Nishith Agarwal commented on HUDI-2275: --- [~dave_hagman] To ensure that the checkpoints from

[jira] [Updated] (HUDI-2531) [UMBRELLA] Support Dataset APIs in writer paths

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2531: - Fix Version/s: (was: 0.10.0) > [UMBRELLA] Support Dataset APIs in writer paths >

[jira] [Updated] (HUDI-2531) [UMBRELLA] Support Dataset APIs in writer paths

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2531: - Fix Version/s: 0.10.0 > [UMBRELLA] Support Dataset APIs in writer paths >

[GitHub] [hudi] hudi-bot edited a comment on pull request #3762: [HUDI-1294] Adding inline read and seekable read for hfile log blocks in metadata table

2021-10-07 Thread GitBox
hudi-bot edited a comment on pull request #3762: URL: https://github.com/apache/hudi/pull/3762#issuecomment-938271221 ## CI report: * 5fb7a2afa196fd75ada005d26a0fb9fce5472545 UNKNOWN * cb7e9cea8fa966437a892be1e0917443c034e21e Azure:

[jira] [Resolved] (HUDI-1854) Corrupt blocks in GCS log files

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-1854. -- Resolution: Cannot Reproduce > Corrupt blocks in GCS log files >

[jira] [Updated] (HUDI-1834) Please delete old releases from mirroring system

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1834: - Priority: Blocker (was: Major) > Please delete old releases from mirroring system >

[jira] [Updated] (HUDI-1834) Please delete old releases from mirroring system

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1834: - Fix Version/s: 0.10.0 > Please delete old releases from mirroring system >

[jira] [Updated] (HUDI-2003) Auto Compute Compression ratio for input data to output parquet/orc file size

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2003: - Labels: user-support-issues (was: ) > Auto Compute Compression ratio for input data to output

[GitHub] [hudi] nsivabalan commented on a change in pull request #3762: [HUDI-1294] Adding inline read and seekable read for hfile log blocks in metadata table

2021-10-07 Thread GitBox
nsivabalan commented on a change in pull request #3762: URL: https://github.com/apache/hudi/pull/3762#discussion_r724642330 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordReader.java ## @@ -132,18 +149,31 @@ protected

[GitHub] [hudi] hudi-bot edited a comment on pull request #3762: [HUDI-1294] Adding inline read and seekable read for hfile log blocks in metadata table

2021-10-07 Thread GitBox
hudi-bot edited a comment on pull request #3762: URL: https://github.com/apache/hudi/pull/3762#issuecomment-938271221 ## CI report: * 5fb7a2afa196fd75ada005d26a0fb9fce5472545 UNKNOWN * cb7e9cea8fa966437a892be1e0917443c034e21e UNKNOWN Bot commands @hudi-bot

[jira] [Commented] (HUDI-2056) Spark speculation may produce dirty parquet files

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425906#comment-17425906 ] Vinoth Chandar commented on HUDI-2056: -- This should be solved using the marker file mechanism. no? >

[GitHub] [hudi] nsivabalan commented on a change in pull request #3762: [HUDI-1294] Adding inline read and seekable read for hfile log blocks in metadata table

2021-10-07 Thread GitBox
nsivabalan commented on a change in pull request #3762: URL: https://github.com/apache/hudi/pull/3762#discussion_r724642330 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordReader.java ## @@ -132,18 +149,31 @@ protected

[jira] [Updated] (HUDI-860) Ability to do small file handling without need for caching

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-860: Parent: HUDI-1628 Issue Type: Sub-task (was: Bug) > Ability to do small file handling

[jira] [Updated] (HUDI-1628) Improve data locality during ingestion

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1628: - Labels: hudi-umbrellas (was: ) > Improve data locality during ingestion >

[jira] [Updated] (HUDI-860) Ability to do small file handling without need for caching

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-860: Parent: (was: HUDI-538) Issue Type: Bug (was: Sub-task) > Ability to do small file

[jira] [Updated] (HUDI-2172) upserts failing due to _hoodie_record_key being null in the hudi table

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2172: - Labels: user-support-issues (was: ) > upserts failing due to _hoodie_record_key being null in

[jira] [Resolved] (HUDI-2091) Add Uber's grafana dashboard to OSS

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-2091. -- Resolution: Fixed > Add Uber's grafana dashboard to OSS > --- >

[GitHub] [hudi] hudi-bot commented on pull request #3762: [HUDI-1294] Adding inline read and seekable read for hfile blocks in metadata table

2021-10-07 Thread GitBox
hudi-bot commented on pull request #3762: URL: https://github.com/apache/hudi/pull/3762#issuecomment-938271221 ## CI report: * 5fb7a2afa196fd75ada005d26a0fb9fce5472545 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis`

[jira] [Commented] (HUDI-2287) Partition pruning not working on Hudi dataset

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425905#comment-17425905 ] Vinoth Chandar commented on HUDI-2287: -- [~rxu] could you triage and close? > Partition pruning not

[jira] [Updated] (HUDI-2287) Partition pruning not working on Hudi dataset

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2287: - Priority: Blocker (was: Major) > Partition pruning not working on Hudi dataset >

[jira] [Updated] (HUDI-2287) Partition pruning not working on Hudi dataset

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2287: - Fix Version/s: 0.10.0 > Partition pruning not working on Hudi dataset >

[GitHub] [hudi] fuyun2024 commented on pull request #3722: HUDI-2491 hoodie.datasource.hive_sync.mode=hms mode is supported in s…

2021-10-07 Thread GitBox
fuyun2024 commented on pull request #3722: URL: https://github.com/apache/hudi/pull/3722#issuecomment-938270825 @nsivabalan Thank you for your comments, but I don't know where the mistake is. I'm a novice

[jira] [Assigned] (HUDI-2287) Partition pruning not working on Hudi dataset

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-2287: Assignee: Raymond Xu > Partition pruning not working on Hudi dataset >

[jira] [Updated] (HUDI-2287) Partition pruning not working on Hudi dataset

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2287: - Parent: HUDI-1297 Issue Type: Sub-task (was: Bug) > Partition pruning not working on

[jira] [Updated] (HUDI-2363) COW : Listing leaf files and directories twice

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2363: - Labels: user-support-issues (was: ) > COW : Listing leaf files and directories twice >

[jira] [Commented] (HUDI-2363) COW : Listing leaf files and directories twice

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425904#comment-17425904 ] Vinoth Chandar commented on HUDI-2363: -- I think these are long fixed in recent releases. 0.7.0 IIRC.

[GitHub] [hudi] nsivabalan opened a new pull request #3762: [HUDI-1294] Adding inline read and seekable read for hfile blocks in metadata table

2021-10-07 Thread GitBox
nsivabalan opened a new pull request #3762: URL: https://github.com/apache/hudi/pull/3762 ## What is the purpose of the pull request - Added support to read HFile log blocks via inline FileSystem in metadata table. - Also added support to read for a list of keys(batch get) rather

[jira] [Updated] (HUDI-2509) OverwriteNonDefaultsWithLatestAvroPayload doesn`t work when upsert data with some null value column

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2509: - Labels: sev:critical user-support-issues (was: ) > OverwriteNonDefaultsWithLatestAvroPayload

[jira] [Resolved] (HUDI-2416) Move FAQs to website

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-2416. -- Resolution: Fixed > Move FAQs to website > > > Key:

[jira] [Updated] (HUDI-1194) Reorganize HoodieHiveClient and make it fully support Hive Metastore API

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1194: - Parent: HUDI-2519 Issue Type: Sub-task (was: Improvement) > Reorganize HoodieHiveClient

[jira] [Updated] (HUDI-2416) Move FAQs to website

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2416: - Status: Closed (was: Patch Available) > Move FAQs to website > > >

[jira] [Reopened] (HUDI-2416) Move FAQs to website

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reopened HUDI-2416: -- > Move FAQs to website > > > Key: HUDI-2416 >

[jira] [Resolved] (HUDI-6) Support for Hive 3.x

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-6. --- Resolution: Duplicate > Support for Hive 3.x > > > Key: HUDI-6 >

[jira] [Commented] (HUDI-6) Support for Hive 3.x

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425903#comment-17425903 ] Vinoth Chandar commented on HUDI-6: --- HUDI-2519 is tracking everything under this. Planned out for 0.10.0.

[jira] [Updated] (HUDI-1194) Reorganize HoodieHiveClient and make it fully support Hive Metastore API

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1194: - Status: Open (was: New) > Reorganize HoodieHiveClient and make it fully support Hive Metastore

[jira] [Updated] (HUDI-6) Support for Hive 3.x

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6: -- Issue Type: New Feature (was: Improvement) > Support for Hive 3.x > > >

[jira] [Updated] (HUDI-6) Support for Hive 3.x

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6: -- Priority: Blocker (was: Major) > Support for Hive 3.x > > > Key:

[jira] [Updated] (HUDI-6) Support for Hive 3.x

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-6?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-6: -- Fix Version/s: 0.10.0 > Support for Hive 3.x > > > Key: HUDI-6 >

[GitHub] [hudi] peanut-chenzhong commented on issue #3735: [SUPPORT] OverwriteNonDefaultsWithLatestAvroPayload doesn`t work when upsert data with some null value column

2021-10-07 Thread GitBox
peanut-chenzhong commented on issue #3735: URL: https://github.com/apache/hudi/issues/3735#issuecomment-938266751 BTW, could you help add me to the HUDI JIRA group so that I can assign the task to myself? @nsivabalan @codope

[jira] [Updated] (HUDI-2339) Create Table If Not Exists Failed After Alter Table

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2339: - Parent: HUDI-1658 Issue Type: Sub-task (was: Bug) > Create Table If Not Exists Failed

[GitHub] [hudi] hudi-bot edited a comment on pull request #3761: [HUDI-2513] OverwriteNonDefaultsWithLatestAvroPayload doesn`t work when upsert data with some null value column

2021-10-07 Thread GitBox
hudi-bot edited a comment on pull request #3761: URL: https://github.com/apache/hudi/pull/3761#issuecomment-938264265 ## CI report: * cf20d97ab77a55797f1bcb4ee7dcb614681e8ae3 Azure:

[jira] [Created] (HUDI-2532) Set right default value for max delta commits for compaction in metadata table

2021-10-07 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-2532: - Summary: Set right default value for max delta commits for compaction in metadata table Key: HUDI-2532 URL: https://issues.apache.org/jira/browse/HUDI-2532

[jira] [Resolved] (HUDI-2389) Translate ByteDance Hudi Blog from Chinese to English

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-2389. -- Fix Version/s: 0.9.0 Resolution: Fixed > Translate ByteDance Hudi Blog from Chinese to

[jira] [Resolved] (HUDI-2347) Write a blog for marker mechanisms

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-2347. -- Fix Version/s: 0.9.0 Resolution: Fixed > Write a blog for marker mechanisms >

[jira] [Updated] (HUDI-1870) Move spark avro serialization class into hudi repo

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1870: - Fix Version/s: 0.10.0 > Move spark avro serialization class into hudi repo >

[jira] [Updated] (HUDI-1870) Move spark avro serialization class into hudi repo

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1870: - Priority: Blocker (was: Major) > Move spark avro serialization class into hudi repo >

[GitHub] [hudi] hudi-bot commented on pull request #3761: [HUDI-2513] OverwriteNonDefaultsWithLatestAvroPayload doesn`t work when upsert data with some null value column

2021-10-07 Thread GitBox
hudi-bot commented on pull request #3761: URL: https://github.com/apache/hudi/pull/3761#issuecomment-938264265 ## CI report: * cf20d97ab77a55797f1bcb4ee7dcb614681e8ae3 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis`

[jira] [Commented] (HUDI-2509) OverwriteNonDefaultsWithLatestAvroPayload doesn`t work when upsert data with some null value column

2021-10-07 Thread Adam Z CHEN (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425900#comment-17425900 ] Adam Z CHEN commented on HUDI-2509: --- [https://github.com/apache/hudi/pull/3761] PR raised >

[GitHub] [hudi] peanut-chenzhong commented on issue #3735: [SUPPORT] OverwriteNonDefaultsWithLatestAvroPayload doesn`t work when upsert data with some null value column

2021-10-07 Thread GitBox
peanut-chenzhong commented on issue #3735: URL: https://github.com/apache/hudi/issues/3735#issuecomment-938263412 https://github.com/apache/hudi/pull/3761 PR raised

[GitHub] [hudi] peanut-chenzhong opened a new pull request #3761: [HUDI-2513] OverwriteNonDefaultsWithLatestAvroPayload doesn`t work when upsert data with some null value column

2021-10-07 Thread GitBox
peanut-chenzhong opened a new pull request #3761: URL: https://github.com/apache/hudi/pull/3761 [HUDI-2513] OverwriteNonDefaultsWithLatestAvroPayload doesn`t work when upsert data with some null value column ## Committer checklist - [Y ] Has a corresponding JIRA in PR title

[jira] [Updated] (HUDI-2424) Error checking bloom filter index (NPE)

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2424: - Labels: user-support-issues (was: ) > Error checking bloom filter index (NPE) >

[jira] [Updated] (HUDI-2427) SQL stmt broken with spark 3.1.x

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2427: - Labels: sev:high user-support-issues (was: ) > SQL stmt broken with spark 3.1.x >

[jira] [Updated] (HUDI-2426) spark sql extensions breaks read.table from metastore

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2426: - Labels: user-support-issues (was: ) > spark sql extensions breaks read.table from metastore >

[jira] [Updated] (HUDI-2470) use commit_time in the WHERE STATEMENT to optimize the incremental query

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2470: - Parent: HUDI-1658 Issue Type: Sub-task (was: Improvement) > use commit_time in the WHERE

[jira] [Commented] (HUDI-2470) use commit_time in the WHERE STATEMENT to optimize the incremental query

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425888#comment-17425888 ] Vinoth Chandar commented on HUDI-2470: -- Assigned it to you! Look forward to the PR > use commit_time

[jira] [Assigned] (HUDI-2470) use commit_time in the WHERE STATEMENT to optimize the incremental query

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-2470: Assignee: David_Liang > use commit_time in the WHERE STATEMENT to optimize the

[jira] [Updated] (HUDI-1951) Hash Index for HUDI

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1951: - Parent: HUDI-1822 Issue Type: Sub-task (was: New Feature) > Hash Index for HUDI >

[jira] [Updated] (HUDI-2489) Tuning HoodieROTablePathFilter by caching, aiming to reduce unnecessary list/get requests

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2489: - Priority: Blocker (was: Major) > Tuning HoodieROTablePathFilter by caching, aiming to reduce

[jira] [Updated] (HUDI-2409) Using HBase shaded jars in Hudi presto bundle

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2409: - Priority: Blocker (was: Major) > Using HBase shaded jars in Hudi presto bundle >

[jira] [Updated] (HUDI-2409) Using HBase shaded jars in Hudi presto bundle

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2409: - Fix Version/s: 0.10.0 > Using HBase shaded jars in Hudi presto bundle >

[jira] [Updated] (HUDI-2489) Tuning HoodieROTablePathFilter by caching, aiming to reduce unnecessary list/get requests

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2489: - Fix Version/s: 0.10.0 > Tuning HoodieROTablePathFilter by caching, aiming to reduce unnecessary

[GitHub] [hudi] peanut-chenzhong commented on issue #3735: [SUPPORT] OverwriteNonDefaultsWithLatestAvroPayload doesn`t work when upsert data with some null value column

2021-10-07 Thread GitBox
peanut-chenzhong commented on issue #3735: URL: https://github.com/apache/hudi/issues/3735#issuecomment-938254613 @codope sure, will raise PR soon.

[jira] [Updated] (HUDI-2199) DynamoDB based external index implementation

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2199: - Parent: HUDI-1822 Issue Type: Sub-task (was: New Feature) > DynamoDB based external

[jira] [Updated] (HUDI-686) Implement BloomIndexV2 that does not depend on memory caching

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-686: Parent: HUDI-1822 Issue Type: Sub-task (was: Improvement) > Implement BloomIndexV2 that

[jira] [Updated] (HUDI-2510) QuickStart html page is showing 404

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2510: - Status: Closed (was: Patch Available) > QuickStart html page is showing 404 >

[jira] [Resolved] (HUDI-2510) QuickStart html page is showing 404

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-2510. -- Resolution: Fixed > QuickStart html page is showing 404 > --- >

[jira] [Reopened] (HUDI-2510) QuickStart html page is showing 404

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reopened HUDI-2510: > QuickStart html page is showing 404

[jira] [Updated] (HUDI-1297) [Umbrella] Spark Datasource Support

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1297: - Summary: [Umbrella] Spark Datasource Support (was: [Umbrella] Revamp Spark Datasource support

[jira] [Updated] (HUDI-2526) Make spark.sql.parquet.writeLegacyFormat configurable

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2526: - Parent: HUDI-1297 Issue Type: Sub-task (was: Improvement) > Make

[jira] [Updated] (HUDI-2071) Support Reading Bootstrap MOR RT Table In Spark DataSource Table

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2071: - Parent: HUDI-1297 Issue Type: Sub-task (was: Improvement) > Support Reading Bootstrap

[jira] [Updated] (HUDI-2505) [UMBRELLA] Spark DataSource APIs and Spark SQL discrepancies

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2505: - Labels: hudi-umbrellas sev:critical (was: sev:critical) > [UMBRELLA] Spark DataSource APIs and

[jira] [Updated] (HUDI-2362) Hudi external configuration file support

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2362: - Priority: Blocker (was: Major) > Hudi external configuration file support >

[jira] [Updated] (HUDI-2362) Hudi external configuration file support

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2362: - Fix Version/s: 0.10.0 > Hudi external configuration file support >

[jira] [Updated] (HUDI-1608) MOR fetches all records for read optimized query w/ spark sql

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1608: - Labels: pull-request-available sev:high user-support-issues (was: pull-request-available

[jira] [Updated] (HUDI-2390) KeyGenerator discrepancy between DataFrame writer and SQL

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2390: - Labels: sev:critical user-support-issues (was: sev:critical) > KeyGenerator discrepancy between

[jira] [Updated] (HUDI-2275) HoodieDeltaStreamerException when using OCC and a second concurrent writer

2021-10-07 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-2275: - Parent: HUDI-1456 Issue Type: Sub-task (was: Bug) > HoodieDeltaStreamerException when

[GitHub] [hudi] nsivabalan edited a comment on issue #3605: [SUPPORT]Hudi Inserts and Upserts for MoR and CoW tables are taking very long time.

2021-10-07 Thread GitBox
nsivabalan edited a comment on issue #3605: URL: https://github.com/apache/hudi/issues/3605#issuecomment-937435741 Sorry, what's the shuffle parallelism you are setting for these writes? In your original description, I see you are setting it to 2. That would definitely give you bad perf.

[GitHub] [hudi] nsivabalan merged pull request #3753: [HUDI-2510] Added a quickstart redirect page to fix broken external links in GCP docs

2021-10-07 Thread GitBox
nsivabalan merged pull request #3753: URL: https://github.com/apache/hudi/pull/3753

[GitHub] [hudi] nsivabalan commented on pull request #3411: [HUDI-2276] Enable metadata table by default for readers and writers

2021-10-07 Thread GitBox
nsivabalan commented on pull request #3411: URL: https://github.com/apache/hudi/pull/3411#issuecomment-937353116 fixed it along with https://github.com/apache/hudi/pull/3590

[GitHub] [hudi] hudi-bot edited a comment on pull request #3754: [HUDI-2482] support 'drop partition' sql

2021-10-07 Thread GitBox
hudi-bot edited a comment on pull request #3754: URL: https://github.com/apache/hudi/pull/3754#issuecomment-935480787

[GitHub] [hudi] nsivabalan merged pull request #3743: [HUDI-2513] Refactor table upgrade and downgrade actions in hudi-client module

2021-10-07 Thread GitBox
nsivabalan merged pull request #3743: URL: https://github.com/apache/hudi/pull/3743
