[GitHub] [hudi] prashantwason commented on a change in pull request #2326: [HUDI-1450] [RFC-15] Use metadata table for listing in HoodieROTablePathFilter

2020-12-15 Thread GitBox
prashantwason commented on a change in pull request #2326: URL: https://github.com/apache/hudi/pull/2326#discussion_r544085540 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieROTablePathFilter.java ## @@ -56,6 +58,15 @@ public class HoodieROTablePathFi

[GitHub] [hudi] yanghua commented on pull request #2337: [HUDI-982] Flink support mor table

2020-12-15 Thread GitBox
yanghua commented on pull request #2337: URL: https://github.com/apache/hudi/pull/2337#issuecomment-745865254 @wangxianghu Would you please review this PR first? This is an automated message from the Apache Git Service. To

[GitHub] [hudi] Trevor-zhang closed issue #2130: [SUPPORT] Use hive jdbc to excute hudi query failed

2020-12-15 Thread GitBox
Trevor-zhang closed issue #2130: URL: https://github.com/apache/hudi/issues/2130

[GitHub] [hudi] lichang-bd opened a new pull request #2339: Fix error information

2020-12-15 Thread GitBox
lichang-bd opened a new pull request #2339: URL: https://github.com/apache/hudi/pull/2339 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[GitHub] [hudi] prashantwason commented on a change in pull request #2326: [HUDI-1450] [RFC-15] Use metadata table for listing in HoodieROTablePathFilter

2020-12-15 Thread GitBox
prashantwason commented on a change in pull request #2326: URL: https://github.com/apache/hudi/pull/2326#discussion_r544083364 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieROTablePathFilter.java ## @@ -56,6 +58,15 @@ public class HoodieROTablePathFi

[GitHub] [hudi] prashantwason commented on a change in pull request #2332: [HUDI-1319] Make async operations work with metadata table

2020-12-15 Thread GitBox
prashantwason commented on a change in pull request #2332: URL: https://github.com/apache/hudi/pull/2332#discussion_r544072468 ## File path: hudi-client/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java ## @@ -725,6 +698,13 @@ private synchronized voi

[GitHub] [hudi] nsivabalan commented on pull request #2328: [HUDI-1451] Support bulk insert v2 with Spark 3.0.0

2020-12-15 Thread GitBox
nsivabalan commented on pull request #2328: URL: https://github.com/apache/hudi/pull/2328#issuecomment-745814067 reviewed source code. yet to review tests.

[GitHub] [hudi] nsivabalan commented on a change in pull request #2328: [HUDI-1451] Support bulk insert v2 with Spark 3.0.0

2020-12-15 Thread GitBox
nsivabalan commented on a change in pull request #2328: URL: https://github.com/apache/hudi/pull/2328#discussion_r544040384 ## File path: hudi-spark-datasource/hudi-spark3/src/main/java/org/apache/hudi/spark3/internal/DefaultSource.java ## @@ -0,0 +1,71 @@ +/* + * Licensed to

[GitHub] [hudi] prashantwason commented on a change in pull request #2332: [HUDI-1319] Make async operations work with metadata table

2020-12-15 Thread GitBox
prashantwason commented on a change in pull request #2332: URL: https://github.com/apache/hudi/pull/2332#discussion_r544054515 ## File path: hudi-client/src/main/java/org/apache/hudi/client/HoodieWriteClient.java ## @@ -701,8 +704,6 @@ protected void completeCompaction(HoodieC

[GitHub] [hudi] nsivabalan commented on pull request #2311: [HUDI-115] Adding DefaultHoodieRecordPayload to honor ordering with combineAndGetUpdateValue

2020-12-15 Thread GitBox
nsivabalan commented on pull request #2311: URL: https://github.com/apache/hudi/pull/2311#issuecomment-745797881 yes, you are right. we are **NOT** making this new payload default for now. Have created https://issues.apache.org/jira/browse/HUDI-1464 to track this. btw, have addressed al

[jira] [Assigned] (HUDI-1464) Make DefaultHoodieRecordPayload default payload class

2020-12-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-1464: Assignee: sivabalan narayanan > Make DefaultHoodieRecordPayload default payload c

[jira] [Created] (HUDI-1464) Make DefaultHoodieRecordPayload default payload class

2020-12-15 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-1464: Summary: Make DefaultHoodieRecordPayload default payload class Key: HUDI-1464 URL: https://issues.apache.org/jira/browse/HUDI-1464 Project: Apache Hudi

[jira] [Updated] (HUDI-1464) Make DefaultHoodieRecordPayload default payload class

2020-12-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1464: Fix Version/s: 0.8.0 > Make DefaultHoodieRecordPayload default payload class

[jira] [Assigned] (HUDI-651) Incremental Query on Hive via Spark SQL does not return expected results

2020-12-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-651: Assignee: sivabalan narayanan (was: Vinoth Chandar) > Incremental Query on Hive via Spark SQL

[jira] [Updated] (HUDI-1394) Ensure all instances of FSUtils.getAllPartitionsPaths() are replaced with calls to metadata table

2020-12-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1394: Status: Open (was: New) > Ensure all instances of FSUtils.getAllPartitionsPaths() are replaced wi

[jira] [Updated] (HUDI-1430) Support Dataset write w/o conversion to RDD

2020-12-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1430: Fix Version/s: 0.8.0 (was: 0.7.0) > Support Dataset write w/o conversion to

[jira] [Updated] (HUDI-1430) Support Dataset write w/o conversion to RDD

2020-12-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1430: Status: Open (was: New) > Support Dataset write w/o conversion to RDD

[jira] [Updated] (HUDI-1275) Incremental TImeline Syncing causes compaction to fail with FileNotFound exception

2020-12-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1275: Fix Version/s: 0.8.0 (was: 0.7.0) > Incremental TImeline Syncing causes com

[GitHub] [hudi] vinothchandar commented on pull request #2300: [HUDI-1434] fix incorrect log file path in HoodieWriteStat

2020-12-15 Thread GitBox
vinothchandar commented on pull request #2300: URL: https://github.com/apache/hudi/pull/2300#issuecomment-745779978 Thanks @garyli1019

[jira] [Updated] (HUDI-860) Ability to do small file handling without need for caching

2020-12-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-860: Fix Version/s: 0.8.0 (was: 0.7.0) > Ability to do small file handling without

[jira] [Updated] (HUDI-1040) Support Spark3 compatibility

2020-12-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1040: Status: Closed (was: Patch Available) > Support Spark3 compatibility

[jira] [Resolved] (HUDI-1040) Support Spark3 compatibility

2020-12-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar resolved HUDI-1040. Resolution: Fixed > Support Spark3 compatibility

[jira] [Assigned] (HUDI-1040) Support Spark3 compatibility

2020-12-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-1040: Assignee: Wenning Ding (was: Vinoth Chandar) > Support Spark3 compatibility

[jira] [Reopened] (HUDI-1040) Support Spark3 compatibility

2020-12-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reopened HUDI-1040: > Support Spark3 compatibility > Key: HUDI-1040

[jira] [Assigned] (HUDI-1040) Support Spark3 compatibility

2020-12-15 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-1040: Assignee: Vinoth Chandar (was: Wenning Ding) > Support Spark3 compatibility

[GitHub] [hudi] so-lazy closed issue #2338: [SUPPORT] MOR table found duplicate and process so slowly

2020-12-15 Thread GitBox
so-lazy closed issue #2338: URL: https://github.com/apache/hudi/issues/2338

[GitHub] [hudi] so-lazy commented on issue #2338: [SUPPORT] MOR table found duplicate and process so slowly

2020-12-15 Thread GitBox
so-lazy commented on issue #2338: URL: https://github.com/apache/hudi/issues/2338#issuecomment-745774960 > @nsivabalan : Can you take a look at this. > > Thanks, > Balaji.V Thanks for your quick reply

[GitHub] [hudi] vinothchandar commented on a change in pull request #2311: [HUDI-115] Adding DefaultHoodieRecordPayload to honor ordering with combineAndGetUpdateValue

2020-12-15 Thread GitBox
vinothchandar commented on a change in pull request #2311: URL: https://github.com/apache/hudi/pull/2311#discussion_r543976521 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/DefaultHoodieRecordPayload.java ## @@ -0,0 +1,123 @@ +/* + * Licensed to the Apac

[GitHub] [hudi] nsivabalan commented on a change in pull request #2311: [HUDI-115] Adding DefaultHoodieRecordPayload to honor ordering with combineAndGetUpdateValue

2020-12-15 Thread GitBox
nsivabalan commented on a change in pull request #2311: URL: https://github.com/apache/hudi/pull/2311#discussion_r543974062 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/DefaultHoodieRecordPayload.java ## @@ -0,0 +1,123 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] codecov-io edited a comment on pull request #2337: [HUDI-982] Flink support mor table

2020-12-15 Thread GitBox
codecov-io edited a comment on pull request #2337: URL: https://github.com/apache/hudi/pull/2337#issuecomment-745769872 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2337?src=pr&el=h1) Report > Merging [#2337](https://codecov.io/gh/apache/hudi/pull/2337?src=pr&el=desc) (f7bef05) in

[GitHub] [hudi] nsivabalan commented on a change in pull request #2311: [HUDI-115] Adding DefaultHoodieRecordPayload to honor ordering with combineAndGetUpdateValue

2020-12-15 Thread GitBox
nsivabalan commented on a change in pull request #2311: URL: https://github.com/apache/hudi/pull/2311#discussion_r543973549 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/DefaultHoodieRecordPayload.java ## @@ -0,0 +1,123 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] nsivabalan commented on a change in pull request #2311: [HUDI-115] Adding DefaultHoodieRecordPayload to honor ordering with combineAndGetUpdateValue

2020-12-15 Thread GitBox
nsivabalan commented on a change in pull request #2311: URL: https://github.com/apache/hudi/pull/2311#discussion_r543972882 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/DefaultHoodieRecordPayload.java ## @@ -0,0 +1,123 @@ +/* + * Licensed to the Apache

[jira] [Updated] (HUDI-982) Make flink engine support MOR table

2020-12-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-982: Labels: pull-request-available (was: ) > Make flink engine support MOR table

[GitHub] [hudi] codecov-io commented on pull request #2337: [HUDI-982] Flink support mor table

2020-12-15 Thread GitBox
codecov-io commented on pull request #2337: URL: https://github.com/apache/hudi/pull/2337#issuecomment-745769872 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2337?src=pr&el=h1) Report > Merging [#2337](https://codecov.io/gh/apache/hudi/pull/2337?src=pr&el=desc) (f7bef05) into [ma

[jira] [Updated] (HUDI-1463) Accomplishments (2019-2020) and Roadmap (2021-2022)

2020-12-15 Thread Mani Jindal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mani Jindal updated HUDI-1463: Description: This Jira is for writing a blog on accomplishments/journey and the future roadmap for upcoming yea

[jira] [Created] (HUDI-1463) Accomplishments (2019-2020) and Roadmap (2021-2022)

2020-12-15 Thread Mani Jindal (Jira)
Mani Jindal created HUDI-1463: Summary: Accomplishments (2019-2020) and Roadmap (2021-2022) Key: HUDI-1463 URL: https://issues.apache.org/jira/browse/HUDI-1463 Project: Apache Hudi Issue Type: T

[GitHub] [hudi] garyli1019 commented on pull request #2300: [HUDI-1434] fix incorrect log file path in HoodieWriteStat

2020-12-15 Thread GitBox
garyli1019 commented on pull request #2300: URL: https://github.com/apache/hudi/pull/2300#issuecomment-745758586 > @garyli1019 this is a release blocker for 0.7.0. Do you have cycles to knock this off in the next few days? Please let me know if you need any help @vinothchandar I will

[GitHub] [hudi] vinothchandar commented on pull request #2136: [HUDI-37] Persist the HoodieIndex type in the hoodie.properties file

2020-12-15 Thread GitBox
vinothchandar commented on pull request #2136: URL: https://github.com/apache/hudi/pull/2136#issuecomment-745755294 Yes. I am also wondering if we should log these to the metadata table in RFC-15. It's a much better model in some sense, since it's fully self-managed. Do you mind we hang on

[GitHub] [hudi] vinothchandar commented on pull request #2300: [HUDI-1434] fix incorrect log file path in HoodieWriteStat

2020-12-15 Thread GitBox
vinothchandar commented on pull request #2300: URL: https://github.com/apache/hudi/pull/2300#issuecomment-745752910 @garyli1019 this is a release blocker for 0.7.0. Do you have cycles to knock this off in the next few days? Please let me know if you need any help

[GitHub] [hudi] vinothchandar commented on issue #2323: [SUPPORT] GLOBAL_BLOOM index significantly slowing down processing time

2020-12-15 Thread GitBox
vinothchandar commented on issue #2323: URL: https://github.com/apache/hudi/issues/2323#issuecomment-745748741 @kirkuz 1. GLOBAL indexes with the config set to update partition path will solve the problem for you. Either GLOBAL_BLOOM/GLOBAL_SIMPLE. Indexing is not that well explained t

[GitHub] [hudi] vinothchandar commented on a change in pull request #2320: [HUDI-57] Added Orc Writer to Support Orc in Hudi

2020-12-15 Thread GitBox
vinothchandar commented on a change in pull request #2320: URL: https://github.com/apache/hudi/pull/2320#discussion_r543896015 ## File path: hudi-client/hudi-client-common/pom.xml ## @@ -102,7 +102,18 @@ io.prometheus simpleclient_pushgateway - + +

[hudi] branch rfc-15 updated: [HUDI-1319] Make async operations work with metadata table (#2332)

2020-12-15 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch rfc-15 in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/rfc-15 by this push: new 2e09aec [HUDI-1319] Make async operations work wi

[GitHub] [hudi] vinothchandar merged pull request #2332: [HUDI-1319] Make async operations work with metadata table

2020-12-15 Thread GitBox
vinothchandar merged pull request #2332: URL: https://github.com/apache/hudi/pull/2332

[GitHub] [hudi] vinothchandar commented on pull request #2332: [HUDI-1319] Make async operations work with metadata table

2020-12-15 Thread GitBox
vinothchandar commented on pull request #2332: URL: https://github.com/apache/hudi/pull/2332#issuecomment-745742598 I am merging this into the branch to make forward progress. @prashantwason any code review comments can go in a separate follow-on PR

[GitHub] [hudi] liujinhui1994 commented on pull request #2227: [HUDI-1367] Make delastreamer transition from dfsSouce to kafkasouce

2020-12-15 Thread GitBox
liujinhui1994 commented on pull request #2227: URL: https://github.com/apache/hudi/pull/2227#issuecomment-745731566

[GitHub] [hudi] garyli1019 commented on a change in pull request #2296: [HUDI-1425] Performance loss with the additional hoodieRecords.isEmpt…

2020-12-15 Thread GitBox
garyli1019 commented on a change in pull request #2296: URL: https://github.com/apache/hudi/pull/2296#discussion_r543851674 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala ## @@ -320,4 +320,21 @@ class TestCOWDat

[jira] [Commented] (HUDI-1401) Presto use of Metadata Table for file listings

2020-12-15 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250060#comment-17250060 ] Udit Mehrotra commented on HUDI-1401: PR for our internal review: https://github.com/

[GitHub] [hudi] bvaradar commented on issue #2338: [SUPPORT] MOR table found duplicate and process so slowly

2020-12-15 Thread GitBox
bvaradar commented on issue #2338: URL: https://github.com/apache/hudi/issues/2338#issuecomment-745713009 @nsivabalan : Can you take a look at this. Thanks, Balaji.V

[GitHub] [hudi] bvaradar commented on issue #2335: [SUPPORT] - Range based partitioning to mitigate performance issues with large number of partitions

2020-12-15 Thread GitBox
bvaradar commented on issue #2335: URL: https://github.com/apache/hudi/issues/2335#issuecomment-745712219 Also, the slowdown caused by file listings should be eliminated in the next release, when https://github.com/apache/hudi/pull/2189 and related diffs become available. (https://cwiki.a

[GitHub] [hudi] bvaradar commented on issue #2335: [SUPPORT] - Range based partitioning to mitigate performance issues with large number of partitions

2020-12-15 Thread GitBox
bvaradar commented on issue #2335: URL: https://github.com/apache/hudi/issues/2335#issuecomment-745711266 @asharma4-lucid : The reason any processing mechanism (Hudi or others) takes longer is the second-order effects that the partitioning style causes. Having too many partitions

[GitHub] [hudi] so-lazy opened a new issue #2338: [SUPPORT] MOR table found duplicate and process so slowly

2020-12-15 Thread GitBox
so-lazy opened a new issue #2338: URL: https://github.com/apache/hudi/issues/2338 I want to have a pipeline that consumes incremental data from Kafka. First, I did a full data import into a Hudi MOR table of around 23G. When this is done, everything is all right. But when I start consuming incre

[hudi] branch master updated (93d9c25 -> 26cdc45)

2020-12-15 Thread vinoth
vinoth pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 93d9c25 [MINOR] Improve code readability by passing in the fileComparisonsRDD in bloom index (#2319) add 26cdc45

[GitHub] [hudi] vinothchandar merged pull request #2233: [HUDI-1376] Drop Hudi metadata cols at the beginning of Spark datasource writing

2020-12-15 Thread GitBox
vinothchandar merged pull request #2233: URL: https://github.com/apache/hudi/pull/2233

[GitHub] [hudi] vinothchandar commented on a change in pull request #2245: Adding Hudi indexing mechanisms blog

2020-12-15 Thread GitBox
vinothchandar commented on a change in pull request #2245: URL: https://github.com/apache/hudi/pull/2245#discussion_r543773277 ## File path: docs/_posts/2020-11-11-hudi-indexing-mechanisms.md ## @@ -0,0 +1,80 @@ +--- +title: "Apache Hudi Indexing mechanisms" +excerpt: "Detailin

[GitHub] [hudi] codecov-io edited a comment on pull request #2233: [HUDI-1376] Drop Hudi metadata cols at the beginning of Spark datasource writing

2020-12-15 Thread GitBox
codecov-io edited a comment on pull request #2233: URL: https://github.com/apache/hudi/pull/2233#issuecomment-722775674 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2233?src=pr&el=h1) Report > Merging [#2233](https://codecov.io/gh/apache/hudi/pull/2233?src=pr&el=desc) (508a04a) in

[GitHub] [hudi] nsivabalan commented on pull request #2245: [WIP] Adding Hudi indexing mechanisms blog

2020-12-15 Thread GitBox
nsivabalan commented on pull request #2245: URL: https://github.com/apache/hudi/pull/2245#issuecomment-745613194 @vinothchandar : feel free to review the patch.

[GitHub] [hudi] nsivabalan commented on a change in pull request #2245: [WIP] Adding Hudi indexing mechanisms blog

2020-12-15 Thread GitBox
nsivabalan commented on a change in pull request #2245: URL: https://github.com/apache/hudi/pull/2245#discussion_r543739496 ## File path: docs/_posts/2020-11-11-hudi-indexing-mechanisms.md ## @@ -0,0 +1,80 @@ +--- +title: "Apache Hudi Indexing mechanisms" +excerpt: "Detailing d

[GitHub] [hudi] codecov-io edited a comment on pull request #2333: [HUDI-1160] Support update partial fields for CoW table

2020-12-15 Thread GitBox
codecov-io edited a comment on pull request #2333: URL: https://github.com/apache/hudi/pull/2333#issuecomment-744582334 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2333?src=pr&el=h1) Report > Merging [#2333](https://codecov.io/gh/apache/hudi/pull/2333?src=pr&el=desc) (366f985) in

[GitHub] [hudi] codecov-io edited a comment on pull request #2333: [HUDI-1160] Support update partial fields for CoW table

2020-12-15 Thread GitBox
codecov-io edited a comment on pull request #2333: URL: https://github.com/apache/hudi/pull/2333#issuecomment-744582334 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2333?src=pr&el=h1) Report > Merging [#2333](https://codecov.io/gh/apache/hudi/pull/2333?src=pr&el=desc) (366f985) in

[GitHub] [hudi] codecov-io edited a comment on pull request #2333: [HUDI-1160] Support update partial fields for CoW table

2020-12-15 Thread GitBox
codecov-io edited a comment on pull request #2333: URL: https://github.com/apache/hudi/pull/2333#issuecomment-744582334 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2333?src=pr&el=h1) Report > Merging [#2333](https://codecov.io/gh/apache/hudi/pull/2333?src=pr&el=desc) (366f985) in

[GitHub] [hudi] umehrot2 commented on issue #2255: [SUPPORT] Global Bloom and partition update not working correctly in MOR table

2020-12-15 Thread GitBox
umehrot2 commented on issue #2255: URL: https://github.com/apache/hudi/issues/2255#issuecomment-745557336 @WTa-hash @nsivabalan As you already seem to know, Athena right now only supports read optimized queries https://docs.aws.amazon.com/athena/latest/ug/querying-hudi.html and

[GitHub] [hudi] vinothchandar commented on pull request #2188: [HUDI-1347]fix Hbase index partition changes cause data duplication p…

2020-12-15 Thread GitBox
vinothchandar commented on pull request #2188: URL: https://github.com/apache/hudi/pull/2188#issuecomment-745482578 cc @satishkotha @nbalajee as well, in case one of you have cycles.

[GitHub] [hudi] vinothchandar commented on pull request #2188: [HUDI-1347]fix Hbase index partition changes cause data duplication p…

2020-12-15 Thread GitBox
vinothchandar commented on pull request #2188: URL: https://github.com/apache/hudi/pull/2188#issuecomment-745482386 @n3nash could you please rebase and take care of landing this? @v3nkatesh your review would be appreciated, to ensure nothing regresses for you folks at Uber

[GitHub] [hudi] zhedoubushishi commented on a change in pull request #2049: [HUDI-1104] Adding support for UserDefinedPartitioners and SortModes to BulkInsert with Rows

2020-12-15 Thread GitBox
zhedoubushishi commented on a change in pull request #2049: URL: https://github.com/apache/hudi/pull/2049#discussion_r543587976 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/execution/bulkinsert/GlobalSortPartitionerWithRows.java ## @@ -0,0 +1,45 @@

[GitHub] [hudi] zhedoubushishi commented on a change in pull request #2049: [HUDI-1104] Adding support for UserDefinedPartitioners and SortModes to BulkInsert with Rows

2020-12-15 Thread GitBox
zhedoubushishi commented on a change in pull request #2049: URL: https://github.com/apache/hudi/pull/2049#discussion_r543587976 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/execution/bulkinsert/GlobalSortPartitionerWithRows.java ## @@ -0,0 +1,45 @@

[GitHub] [hudi] vinothchandar commented on pull request #2233: [HUDI-1376] Drop Hudi metadata cols at the beginning of Spark datasource writing

2020-12-15 Thread GitBox
vinothchandar commented on pull request #2233: URL: https://github.com/apache/hudi/pull/2233#issuecomment-745477535 Rebased, will merge once CI passes

[GitHub] [hudi] evgeny-bondarenko commented on issue #2103: [SUPPORT] NullPointerException when using ComplexKeyGenerator

2020-12-15 Thread GitBox
evgeny-bondarenko commented on issue #2103: URL: https://github.com/apache/hudi/issues/2103#issuecomment-745467214 @bvaradar Thank you a lot. It worked.

[GitHub] [hudi] bvaradar commented on issue #2331: Why does Hudi not support field deletions?

2020-12-15 Thread GitBox
bvaradar commented on issue #2331: URL: https://github.com/apache/hudi/issues/2331#issuecomment-745433097 @prashantwason @nbalajee @satishkotha : Can you please look into this?

[hudi] branch asf-site updated: Travis CI build asf-site

2020-12-15 Thread vinoth
vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 2749e7d Travis CI build asf-site 2749e7d is d

[GitHub] [hudi] vinothchandar commented on pull request #2328: [HUDI-1451] Support bulk insert v2 with Spark 3.0.0

2020-12-15 Thread GitBox
vinothchandar commented on pull request #2328: URL: https://github.com/apache/hudi/pull/2328#issuecomment-745398104 @nsivabalan can you please jump on this asap? this is a release blocker.

[GitHub] [hudi] vinothchandar commented on a change in pull request #2311: [HUDI-115] Adding DefaultHoodieRecordPayload to honor ordering with combineAndGetUpdateValue

2020-12-15 Thread GitBox
vinothchandar commented on a change in pull request #2311: URL: https://github.com/apache/hudi/pull/2311#discussion_r543467964 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodiePayloadProps.java ## @@ -0,0 +1,12 @@ +package org.apache.hudi.common.model

[GitHub] [hudi] codecov-io edited a comment on pull request #2334: [HUDI-1453] Throw Exception when input data schema is not equal to th…

2020-12-15 Thread GitBox
codecov-io edited a comment on pull request #2334: URL: https://github.com/apache/hudi/pull/2334#issuecomment-745334158

[GitHub] [hudi] codecov-io commented on pull request #2334: [HUDI-1453] Throw Exception when input data schema is not equal to th…

2020-12-15 Thread GitBox
codecov-io commented on pull request #2334: URL: https://github.com/apache/hudi/pull/2334#issuecomment-745334158 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2334?src=pr&el=h1) Report > Merging [#2334](https://codecov.io/gh/apache/hudi/pull/2334?src=pr&el=desc) (2e086b9) into [ma

[GitHub] [hudi] manijndl7 commented on pull request #2320: [HUDI-57] Added Orc Writer to Support Orc in Hudi

2020-12-15 Thread GitBox
manijndl7 commented on pull request #2320: URL: https://github.com/apache/hudi/pull/2320#issuecomment-745277619 Thanks @garyli1019 for checking this! Yes, it's in my pipeline; I am currently focusing on writing the ORC Reader, then I will commit the test cases.

[hudi] branch asf-site updated: [IMAGE] Add more components to hudi-lake (#2336)

2020-12-15 Thread leesf
leesf pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 1517368 [IMAGE] Add more components to hudi-la

[GitHub] [hudi] leesf merged pull request #2336: [IMAGE] Add AliyunDLA Alluxio to hudi-lake

2020-12-15 Thread GitBox
leesf merged pull request #2336: URL: https://github.com/apache/hudi/pull/2336

[GitHub] [hudi] shenh062326 commented on pull request #2286: [HUDI-1419] Add base implement for hudi java client

2020-12-15 Thread GitBox
shenh062326 commented on pull request #2286: URL: https://github.com/apache/hudi/pull/2286#issuecomment-745239498 "At a high level, we probably should do a second pass and move more code into hudi-client-common over time. but don't want to hold this PR for that, esp given its adding entire

[GitHub] [hudi] codecov-io edited a comment on pull request #2316: Add commons-codec to spark and utilities bundle jars

2020-12-15 Thread GitBox
codecov-io edited a comment on pull request #2316: URL: https://github.com/apache/hudi/pull/2316#issuecomment-742049822 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2316?src=pr&el=h1) Report > Merging [#2316](https://codecov.io/gh/apache/hudi/pull/2316?src=pr&el=desc) (8081bdf) in

[GitHub] [hudi] codecov-io edited a comment on pull request #2316: Add commons-codec to spark and utilities bundle jars

2020-12-15 Thread GitBox
codecov-io edited a comment on pull request #2316: URL: https://github.com/apache/hudi/pull/2316#issuecomment-742049822 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2316?src=pr&el=h1) Report > Merging [#2316](https://codecov.io/gh/apache/hudi/pull/2316?src=pr&el=desc) (8081bdf) in

[GitHub] [hudi] liujinhui1994 opened a new pull request #2337: [Hudi-982] Flink support mor table

2020-12-15 Thread GitBox
liujinhui1994 opened a new pull request #2337: URL: https://github.com/apache/hudi/pull/2337 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of t

[GitHub] [hudi] giaosudau commented on a change in pull request #2012: [HUDI-1129] Deltastreamer Add support for schema evolution

2020-12-15 Thread GitBox
giaosudau commented on a change in pull request #2012: URL: https://github.com/apache/hudi/pull/2012#discussion_r543093343 ## File path: hudi-spark/src/main/scala/org/apache/hudi/AvroConversionHelper.scala ## @@ -364,4 +366,40 @@ object AvroConversionHelper { } }

[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #2296: [HUDI-1425] Performance loss with the additional hoodieRecords.isEmpt…

2020-12-15 Thread GitBox
pengzhiwei2018 commented on a change in pull request #2296: URL: https://github.com/apache/hudi/pull/2296#discussion_r543222800 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala ## @@ -320,4 +320,21 @@ class TestCO

[jira] [Updated] (HUDI-1453) Throw Exception when input data schema is not equal to the hoodie table schema

2020-12-15 Thread pengzhiwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] pengzhiwei updated HUDI-1453: Description: The hoodie table *h0's* schema is: {code:java} (id long, price double){code} when I write th

[GitHub] [hudi] giaosudau commented on a change in pull request #2012: [HUDI-1129] Deltastreamer Add support for schema evolution

2020-12-15 Thread GitBox
giaosudau commented on a change in pull request #2012: URL: https://github.com/apache/hudi/pull/2012#discussion_r543093343 ## File path: hudi-spark/src/main/scala/org/apache/hudi/AvroConversionHelper.scala ## @@ -364,4 +366,40 @@ object AvroConversionHelper { } }

[GitHub] [hudi] kirkuz commented on issue #2323: [SUPPORT] GLOBAL_BLOOM index significantly slowing down processing time

2020-12-15 Thread GitBox
kirkuz commented on issue #2323: URL: https://github.com/apache/hudi/issues/2323#issuecomment-745166974 @n3nash 1. I can't really understand what the difference is between GLOBAL_BLOOM and GLOBAL_SIMPLE. Will the latter solve the problem with updating the partition for me (I mean r

[jira] [Updated] (HUDI-1462) The rt view query returns a wrong result with predicate push down

2020-12-15 Thread qian heng (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qian heng updated HUDI-1462: Environment: hiveserver2 > The rt view query returns a wrong result with predicate push down

[GitHub] [hudi] garyli1019 commented on a change in pull request #2296: [HUDI-1425] Performance loss with the additional hoodieRecords.isEmpt…

2020-12-15 Thread GitBox
garyli1019 commented on a change in pull request #2296: URL: https://github.com/apache/hudi/pull/2296#discussion_r543162168 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala ## @@ -320,4 +320,21 @@ class TestCOWDat

[GitHub] [hudi] mauropelucchi commented on a change in pull request #2296: [HUDI-1425] Performance loss with the additional hoodieRecords.isEmpt…

2020-12-15 Thread GitBox
mauropelucchi commented on a change in pull request #2296: URL: https://github.com/apache/hudi/pull/2296#discussion_r543127878 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala ## @@ -320,4 +320,21 @@ class TestCOW