[GitHub] [hudi] codecov-commenter edited a comment on pull request #1722: [HUDI-69] Support Spark Datasource for MOR table

2020-06-11 Thread GitBox
codecov-commenter edited a comment on pull request #1722: URL: https://github.com/apache/hudi/pull/1722#issuecomment-643095877 # [Codecov](https://codecov.io/gh/apache/hudi/pull/1722?src=pr&el=h1) Report > Merging [#1722](https://codecov.io/gh/apache/hudi/pull/1722?src=pr&el=desc) into

[GitHub] [hudi] codecov-commenter commented on pull request #1722: [HUDI-69] Support Spark Datasource for MOR table

2020-06-11 Thread GitBox
codecov-commenter commented on pull request #1722: URL: https://github.com/apache/hudi/pull/1722#issuecomment-643095877 # [Codecov](https://codecov.io/gh/apache/hudi/pull/1722?src=pr&el=h1) Report > Merging [#1722](https://codecov.io/gh/apache/hudi/pull/1722?src=pr&el=desc) into [master

[GitHub] [hudi] garyli1019 commented on pull request #1719: [HUDI-1006]deltastreamer use kafkaSource with offset reset strategy:latest can't consume data

2020-06-11 Thread GitBox
garyli1019 commented on pull request #1719: URL: https://github.com/apache/hudi/pull/1719#issuecomment-643077420 for example, https://github.com/apache/hudi/blob/df2e0c760e7df0bd1b200867b3f0d2ca3a3f1fce/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.ja

[GitHub] [hudi] garyli1019 commented on pull request #1719: [HUDI-1006]deltastreamer use kafkaSource with offset reset strategy:latest can't consume data

2020-06-11 Thread GitBox
garyli1019 commented on pull request #1719: URL: https://github.com/apache/hudi/pull/1719#issuecomment-643069988 > In `KafkaSource` like `JsonKafkaSource` & `AvroKafkaSource`, `Option.empty()`and `Option.of("")` treated in reset strategy branch Can we remove the empty string hand

[GitHub] [hudi] nandini57 commented on issue #1705: Tracking Hudi Data along transaction time and business time

2020-06-11 Thread GitBox
nandini57 commented on issue #1705: URL: https://github.com/apache/hudi/issues/1705#issuecomment-643062483 Hi @bvaradar @vinothchandar , do you see any problem with this approach or any points to consider? This is an automa

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #306

2020-06-11 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.40 KB...] settings.xml toolchains.xml /home/jenkins/tools/maven/apache-maven-3.5.4/conf/logging: simplelogger.properties /home/jenkins/tool

[jira] [Commented] (HUDI-760) Remove Rolling Stat management from Hudi Writer

2020-06-11 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133881#comment-17133881 ] sivabalan narayanan commented on HUDI-760: [~vbalaji]: I have some doubts on this

[jira] [Assigned] (HUDI-760) Remove Rolling Stat management from Hudi Writer

2020-06-11 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-760: Assignee: sivabalan narayanan (was: renyi.bao) > Remove Rolling Stat management fro

[jira] [Commented] (HUDI-760) Remove Rolling Stat management from Hudi Writer

2020-06-11 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133873#comment-17133873 ] sivabalan narayanan commented on HUDI-760: [~baobaoyeye]: I am taking this up as

[jira] [Commented] (HUDI-635) MergeHandle's DiskBasedMap entries can be thinner

2020-06-11 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133871#comment-17133871 ] sivabalan narayanan commented on HUDI-635: [~vinoth]: Have a follow up question o

[jira] [Assigned] (HUDI-791) Replace null by Option in Delta Streamer

2020-06-11 Thread Alan Chu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Chu reassigned HUDI-791: Assignee: (was: Alan Chu) > Replace null by Option in Delta Streamer

[jira] [Assigned] (HUDI-791) Replace null by Option in Delta Streamer

2020-06-11 Thread Alan Chu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Chu reassigned HUDI-791: Assignee: Alan Chu > Replace null by Option in Delta Streamer

[GitHub] [hudi] wangxianghu commented on pull request #1727: [WIP] [Review] refactor hudi-client

2020-06-11 Thread GitBox
wangxianghu commented on pull request #1727: URL: https://github.com/apache/hudi/pull/1727#issuecomment-643021903 > We may have to coordinate with the bootstrap pr bit more on conflicts/rebasing, so that either of your life is not hell :) will keep an eye on the bootstrap pr, thanks f

[GitHub] [hudi] wangxianghu commented on pull request #1727: [WIP] [Review] refactor hudi-client

2020-06-11 Thread GitBox
wangxianghu commented on pull request #1727: URL: https://github.com/apache/hudi/pull/1727#issuecomment-643021667 > @leesf @wangxianghu Direction is definitely promising.. and very clean.. Let me know if you want a detailed line-by-line review > > also ccing @yanghua @bvaradar @n3nas

[GitHub] [hudi] wangxianghu commented on a change in pull request #1727: [WIP] [Review] refactor hudi-client

2020-06-11 Thread GitBox
wangxianghu commented on a change in pull request #1727: URL: https://github.com/apache/hudi/pull/1727#discussion_r439159813 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieClient.java ## @@ -19,52 +19,53 @@ package org.apache.h

[GitHub] [hudi] yanghua commented on pull request #1727: [WIP] [Review] refactor hudi-client

2020-06-11 Thread GitBox
yanghua commented on pull request #1727: URL: https://github.com/apache/hudi/pull/1727#issuecomment-643018511 > @leesf @wangxianghu Direction is definitely promising.. and very clean.. Let me know if you want a detailed line-by-line review > > also ccing @yanghua @bvaradar @n3nash to

[GitHub] [hudi] leesf commented on pull request #1727: [WIP] [Review] refactor hudi-client

2020-06-11 Thread GitBox
leesf commented on pull request #1727: URL: https://github.com/apache/hudi/pull/1727#issuecomment-643016988 > @leesf @wangxianghu Direction is definitely promising.. and very clean.. Let me know if you want a detailed line-by-line review > > also ccing @yanghua @bvaradar @n3nash to b

[GitHub] [hudi] wangxianghu commented on a change in pull request #1727: [WIP] [Review] refactor hudi-client

2020-06-11 Thread GitBox
wangxianghu commented on a change in pull request #1727: URL: https://github.com/apache/hudi/pull/1727#discussion_r439160108 ## File path: hudi-client/hudi-client-spark/src/main/java/org/apache/hudi/client/SparkCompactionAdminClient.java ## @@ -0,0 +1,131 @@ +package org.apach

[GitHub] [hudi] wangxianghu commented on a change in pull request #1727: [WIP] [Review] refactor hudi-client

2020-06-11 Thread GitBox
wangxianghu commented on a change in pull request #1727: URL: https://github.com/apache/hudi/pull/1727#discussion_r439159907 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java ## @@ -0,0 +1,537 @@ +package org.apach

[jira] [Commented] (HUDI-635) MergeHandle's DiskBasedMap entries can be thinner

2020-06-11 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133815#comment-17133815 ] sivabalan narayanan commented on HUDI-635: got it. So, instead of storing entire

[GitHub] [hudi] vinothchandar commented on a change in pull request #1678: [HUDI-242] Metadata Bootstrap changes

2020-06-11 Thread GitBox
vinothchandar commented on a change in pull request #1678: URL: https://github.com/apache/hudi/pull/1678#discussion_r439127629 ## File path: .travis.yml ## @@ -13,14 +13,19 @@ # See the License for the specific language governing permissions and # limitations under the Licens

[GitHub] [hudi] umehrot2 commented on issue #1552: Time taken for upserting hudi table is increasing with increase in number of partitions

2020-06-11 Thread GitBox
umehrot2 commented on issue #1552: URL: https://github.com/apache/hudi/issues/1552#issuecomment-642976731 @vinothchandar we have two series 5.x and 6.x which both support Hudi. Latest 5.x series i.e. `emr-5.30.0` support `0.5.2-incubating`, while the latest 6.x series is i.e. emr-6.0.0 is

[GitHub] [hudi] bhasudha commented on a change in pull request #1683: Updating release docs for release-0.5.3

2020-06-11 Thread GitBox
bhasudha commented on a change in pull request #1683: URL: https://github.com/apache/hudi/pull/1683#discussion_r439107887 ## File path: docs/_pages/releases.md ## @@ -3,8 +3,40 @@ title: "Releases" permalink: /releases layout: releases toc: true -last_modified_at: 2019-12-30

[GitHub] [hudi] bhasudha commented on issue #1696: COW Error on existing Hive Table

2020-06-11 Thread GitBox
bhasudha commented on issue #1696: URL: https://github.com/apache/hudi/issues/1696#issuecomment-642959493 > @balaji, Is this feature available in 0.5.0 version? Due to some restriction, I can't use spark 2.4 .x . So I am still using 0.5.0 This feature was introduced in 0.5.1 - https

[GitHub] [hudi] bhasudha commented on a change in pull request #1704: [HUDI-115] Enhance OverwriteWithLatestAvroPayload to also respect ordering value of record in storage

2020-06-11 Thread GitBox
bhasudha commented on a change in pull request #1704: URL: https://github.com/apache/hudi/pull/1704#discussion_r439095206 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieRecordPayload.java ## @@ -50,8 +50,25 @@ * @param schema Schema used for rec

[GitHub] [hudi] codecov-commenter edited a comment on pull request #1717: [HUDI-1012] Add unit test for snapshot reads

2020-06-11 Thread GitBox
codecov-commenter edited a comment on pull request #1717: URL: https://github.com/apache/hudi/pull/1717#issuecomment-640854839 # [Codecov](https://codecov.io/gh/apache/hudi/pull/1717?src=pr&el=h1) Report > Merging [#1717](https://codecov.io/gh/apache/hudi/pull/1717?src=pr&el=desc) into

[GitHub] [hudi] EdwinGuo commented on pull request #1721: Cache the explodeRecordRDDWithFileComparisons instead of computing it…

2020-06-11 Thread GitBox
EdwinGuo commented on pull request #1721: URL: https://github.com/apache/hudi/pull/1721#issuecomment-642922843 > Good one. If you don't mind, can you run a sample job(with 1 M records or something) and show the spark UI stages screen shot to see the difference with and w/o this optimizatio

[GitHub] [hudi] satishkotha commented on a change in pull request #1717: [HUDI-1012] Add unit test for snapshot reads

2020-06-11 Thread GitBox
satishkotha commented on a change in pull request #1717: URL: https://github.com/apache/hudi/pull/1717#discussion_r439037336 ## File path: hudi-client/src/test/java/org/apache/hudi/table/TestHoodieMergeOnReadTable.java ## @@ -90,6 +90,8 @@ import static org.junit.jupiter.api.

[jira] [Created] (HUDI-1022) Document examples for Spark structured streaming writing into Hudi

2020-06-11 Thread Bhavani Sudha (Jira)
Bhavani Sudha created HUDI-1022: Summary: Document examples for Spark structured streaming writing into Hudi Key: HUDI-1022 URL: https://issues.apache.org/jira/browse/HUDI-1022 Project: Apache Hudi

[GitHub] [hudi] tooptoop4 commented on issue #857: http://hudi.apache.org/comparison.html# should mention Iceberg and DeltaLake

2020-06-11 Thread GitBox
tooptoop4 commented on issue #857: URL: https://github.com/apache/hudi/issues/857#issuecomment-642854161 @desavera deltaLake os does not support s3? This is an automated message from the Apache Git Service. To respond to the

[GitHub] [hudi] tooptoop4 commented on issue #857: http://hudi.apache.org/comparison.html# should mention Iceberg and DeltaLake

2020-06-11 Thread GitBox
tooptoop4 commented on issue #857: URL: https://github.com/apache/hudi/issues/857#issuecomment-642854161

[GitHub] [hudi] vinothchandar commented on a change in pull request #1716: [HUDI-875] Introduce a new pom module named hudi-common-sync

2020-06-11 Thread GitBox
vinothchandar commented on a change in pull request #1716: URL: https://github.com/apache/hudi/pull/1716#discussion_r438972414 ## File path: hudi-sync-common/pom.xml ## @@ -0,0 +1,121 @@ + + +http://maven.apache.org/POM/4.0.0"; xmlns:xsi="http://www.w3.org/2001/XMLSchema-insta

[jira] [Commented] (HUDI-781) Re-design test utilities

2020-06-11 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133539#comment-17133539 ] Vinoth Chandar commented on HUDI-781: One thing to be careful about when deciding mock

[GitHub] [hudi] vinothchandar commented on pull request #1716: [HUDI-875] Introduce a new pom module named hudi-common-sync

2020-06-11 Thread GitBox
vinothchandar commented on pull request #1716: URL: https://github.com/apache/hudi/pull/1716#issuecomment-642840466 Good with the module.. Few thoughts/suggestions - Can we nest this under hudi-sync as `hudi-sync-common`, `hudi-sync-hive` and you can then introduce the aliyun metast

[jira] [Commented] (HUDI-783) Add official python support to create hudi datasets using pyspark

2020-06-11 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133528#comment-17133528 ] Vinoth Chandar commented on HUDI-783: This is awesome! :) > Add official python s

[GitHub] [hudi] vinothchandar commented on pull request #1514: [HUDI-774] Addressing incorrect Spark to Avro schema generation

2020-06-11 Thread GitBox
vinothchandar commented on pull request #1514: URL: https://github.com/apache/hudi/pull/1514#issuecomment-642831847 @afilipchik I am pausing coz diverging too much from spark-avro is a maintenance headache.. Do we try to upstream these to spark directly? Probably a better path?

[GitHub] [hudi] harishchanderramesh commented on issue #1552: Time taken for upserting hudi table is increasing with increase in number of partitions

2020-06-11 Thread GitBox
harishchanderramesh commented on issue #1552: URL: https://github.com/apache/hudi/issues/1552#issuecomment-642831361 Thats a good question. I would like to know how would it impact. I see hudi 0.5.0-incubating jar preloaded while launching hudi cli. but what if i use any other versions on

[GitHub] [hudi] vinothchandar commented on a change in pull request #1710: [MINOR] Add validation error messages in delta sync

2020-06-11 Thread GitBox
vinothchandar commented on a change in pull request #1710: URL: https://github.com/apache/hudi/pull/1710#discussion_r438955871 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java ## @@ -497,25 +499,30 @@ private void setupWriteClie

[GitHub] [hudi] vinothchandar commented on pull request #1727: [WIP] [Review] refactor hudi-client

2020-06-11 Thread GitBox
vinothchandar commented on pull request #1727: URL: https://github.com/apache/hudi/pull/1727#issuecomment-642829232 We may have to coordinate with the bootstrap pr bit more on conflicts/rebasing, so that either of your life is not hell :)

[GitHub] [hudi] vinothchandar commented on pull request #1727: [WIP] [Review] refactor hudi-client

2020-06-11 Thread GitBox
vinothchandar commented on pull request #1727: URL: https://github.com/apache/hudi/pull/1727#issuecomment-642828881 @leesf @wangxianghu Direction is definitely promising.. and very clean.. Let me know if you want a detailed line-by-line review also ccing @yanghua @bvaradar @n3nash t

[GitHub] [hudi] vinothchandar commented on a change in pull request #1727: [WIP] [Review] refactor hudi-client

2020-06-11 Thread GitBox
vinothchandar commented on a change in pull request #1727: URL: https://github.com/apache/hudi/pull/1727#discussion_r438953612 ## File path: hudi-client/hudi-client-spark/src/main/java/org/apache/hudi/client/SparkCompactionAdminClient.java ## @@ -0,0 +1,131 @@ +package org.apa

[GitHub] [hudi] vinothchandar commented on a change in pull request #1727: [WIP] [Review] refactor hudi-client

2020-06-11 Thread GitBox
vinothchandar commented on a change in pull request #1727: URL: https://github.com/apache/hudi/pull/1727#discussion_r438950191 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieIndexConfig.java ## @@ -36,7 +35,7 @@ public class HoodieInde

[GitHub] [hudi] harishchanderramesh opened a new issue #1728: Processing time gradually increases while using spark structured streaming & [SUPPORT]

2020-06-11 Thread GitBox
harishchanderramesh opened a new issue #1728: URL: https://github.com/apache/hudi/issues/1728 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://cwiki.apache.org/confluence/display/HUDI/FAQ)? - Join the mailing list to engage in conversations and get

[GitHub] [hudi] vinothchandar commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

2020-06-11 Thread GitBox
vinothchandar commented on pull request #1665: URL: https://github.com/apache/hudi/pull/1665#issuecomment-642819377 Sg.. Will jump on #1727. Closing this one

[GitHub] [hudi] vinothchandar closed pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

2020-06-11 Thread GitBox
vinothchandar closed pull request #1665: URL: https://github.com/apache/hudi/pull/1665

[GitHub] [hudi] vinothchandar commented on pull request #1727: [WIP] [Review] refactor hudi-client

2020-06-11 Thread GitBox
vinothchandar commented on pull request #1727: URL: https://github.com/apache/hudi/pull/1727#issuecomment-642819197 @wangxianghu @leesf let's discuss on this PR.. it's easy to comment and iterate

[GitHub] [hudi] vinothchandar opened a new pull request #1727: [WIP] [Review] refactor hudi-client

2020-06-11 Thread GitBox
vinothchandar opened a new pull request #1727: URL: https://github.com/apache/hudi/pull/1727 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of t

[GitHub] [hudi] vinothchandar commented on issue #1696: COW Error on existing Hive Table

2020-06-11 Thread GitBox
vinothchandar commented on issue #1696: URL: https://github.com/apache/hudi/issues/1696#issuecomment-642817911 cc @bhasudha can confirm

[jira] [Commented] (HUDI-944) Support more complete concurrency control when writing data

2020-06-11 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133432#comment-17133432 ] Vinoth Chandar commented on HUDI-944: Thats great.. Please feel free to take my PR and

[jira] [Commented] (HUDI-839) Implement rollbacks using marker files instead of relying on commit metadata

2020-06-11 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133433#comment-17133433 ] Vinoth Chandar commented on HUDI-839: I assigned the issue to you for now.. Also push

[jira] [Assigned] (HUDI-839) Implement rollbacks using marker files instead of relying on commit metadata

2020-06-11 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-839: Assignee: liwei (was: Vinoth Chandar) > Implement rollbacks using marker files instead of rel

[GitHub] [hudi] vinothchandar edited a comment on issue #1552: Time taken for upserting hudi table is increasing with increase in number of partitions

2020-06-11 Thread GitBox
vinothchandar edited a comment on issue #1552: URL: https://github.com/apache/hudi/issues/1552#issuecomment-642814882 aws emr is on 0.5.0, correct? @umehrot2 seems like it https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-release-6x.html#emr-600-app-versions how wou

[GitHub] [hudi] vinothchandar commented on issue #1552: Time taken for upserting hudi table is increasing with increase in number of partitions

2020-06-11 Thread GitBox
vinothchandar commented on issue #1552: URL: https://github.com/apache/hudi/issues/1552#issuecomment-642814882 aws emr is on 0.5.0, correct? @umehrot2 ?

[jira] [Commented] (HUDI-1007) When earliestOffsets is greater than checkpoint, Hudi will not be able to successfully consume data

2020-06-11 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133412#comment-17133412 ] Vinoth Chandar commented on HUDI-1007: a scenario where the expiry is happening cont

[jira] [Commented] (HUDI-914) support different target data clusters

2020-06-11 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133406#comment-17133406 ] Vinoth Chandar commented on HUDI-914: so they want to split a hudi table across two sp

[jira] [Commented] (HUDI-896) Parallelize CI testing to reduce CI wait time

2020-06-11 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133405#comment-17133405 ] Vinoth Chandar commented on HUDI-896: Thanks.. I am not fully in the weeds of the cod

[GitHub] [hudi] vinothchandar commented on a change in pull request #1683: Updating release docs for release-0.5.3

2020-06-11 Thread GitBox
vinothchandar commented on a change in pull request #1683: URL: https://github.com/apache/hudi/pull/1683#discussion_r438927550 ## File path: docs/_pages/releases.md ## @@ -3,8 +3,40 @@ title: "Releases" permalink: /releases layout: releases toc: true -last_modified_at: 2019-

[GitHub] [hudi] UZi5136225 closed pull request #1726: [HUDI-210]Hudi support prometheus

2020-06-11 Thread GitBox
UZi5136225 closed pull request #1726: URL: https://github.com/apache/hudi/pull/1726

[jira] [Commented] (HUDI-944) Support more complete concurrency control when writing data

2020-06-11 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133391#comment-17133391 ] liwei commented on HUDI-944: Thanks [~vinoth] , i will start with HUDI-839 tests, and try to f

[GitHub] [hudi] UZi5136225 removed a comment on pull request #1726: [HUDI-210]Hudi support prometheus

2020-06-11 Thread GitBox
UZi5136225 removed a comment on pull request #1726: URL: https://github.com/apache/hudi/pull/1726#issuecomment-642792495 @leesf @XuQianJin-Stars

[GitHub] [hudi] UZi5136225 commented on pull request #1726: [HUDI-210]Hudi support prometheus

2020-06-11 Thread GitBox
UZi5136225 commented on pull request #1726: URL: https://github.com/apache/hudi/pull/1726#issuecomment-642792495 @leesf @XuQianJin-Stars

[GitHub] [hudi] UZi5136225 opened a new pull request #1726: [HUDI-210]Hudi support prometheus

2020-06-11 Thread GitBox
UZi5136225 opened a new pull request #1726: URL: https://github.com/apache/hudi/pull/1726 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[GitHub] [hudi] nsivabalan commented on pull request #1703: [HUDI-993] Let delete API use "hoodie.delete.shuffle.parallelism"

2020-06-11 Thread GitBox
nsivabalan commented on pull request #1703: URL: https://github.com/apache/hudi/pull/1703#issuecomment-642790151 Once you rebase and fix the build issue, I can merge.

[GitHub] [hudi] n3nash commented on a change in pull request #1704: [HUDI-115] Enhance OverwriteWithLatestAvroPayload to also respect ordering value of record in storage

2020-06-11 Thread GitBox
n3nash commented on a change in pull request #1704: URL: https://github.com/apache/hudi/pull/1704#discussion_r438902475 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieRecordPayload.java ## @@ -50,8 +50,25 @@ * @param schema Schema used for recor

[jira] [Updated] (HUDI-115) Enhance OverwriteWithLatestAvroPayload to also respect ordering value of record in storage

2020-06-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-115: Labels: bug-bash-0.6.0 pull-request-available (was: bug-bash-0.6.0) > Enhance OverwriteWithLatestAvr

[GitHub] [hudi] n3nash commented on a change in pull request #1717: [HUDI-1012] Add unit test for snapshot reads

2020-06-11 Thread GitBox
n3nash commented on a change in pull request #1717: URL: https://github.com/apache/hudi/pull/1717#discussion_r438900358 ## File path: hudi-client/src/test/java/org/apache/hudi/table/TestHoodieMergeOnReadTable.java ## @@ -90,6 +90,8 @@ import static org.junit.jupiter.api.Asser

[GitHub] [hudi] n3nash commented on a change in pull request #1706: [HUDI-998] Introduce a robot to build testing website automatically

2020-06-11 Thread GitBox
n3nash commented on a change in pull request #1706: URL: https://github.com/apache/hudi/pull/1706#discussion_r438898630 ## File path: .github/workflows/web.yml ## @@ -0,0 +1,51 @@ +name: web sync + +on: + issue_comment: +types: [created] + +jobs: + bot: +runs-on: ubun

[GitHub] [hudi] n3nash commented on a change in pull request #1706: [HUDI-998] Introduce a robot to build testing website automatically

2020-06-11 Thread GitBox
n3nash commented on a change in pull request #1706: URL: https://github.com/apache/hudi/pull/1706#discussion_r438898007 ## File path: .github/workflows/web.yml ## @@ -0,0 +1,51 @@ +name: web sync + +on: + issue_comment: +types: [created] + +jobs: + bot: +runs-on: ubun

[GitHub] [hudi] n3nash commented on a change in pull request #1706: [HUDI-998] Introduce a robot to build testing website automatically

2020-06-11 Thread GitBox
n3nash commented on a change in pull request #1706: URL: https://github.com/apache/hudi/pull/1706#discussion_r438897508 ## File path: .github/workflows/web.yml ## @@ -0,0 +1,51 @@ +name: web sync + +on: + issue_comment: +types: [created] + +jobs: + bot: +runs-on: ubun

[GitHub] [hudi] n3nash commented on a change in pull request #1706: [HUDI-998] Introduce a robot to build testing website automatically

2020-06-11 Thread GitBox
n3nash commented on a change in pull request #1706: URL: https://github.com/apache/hudi/pull/1706#discussion_r438897242 ## File path: .github/workflows/web.yml ## @@ -0,0 +1,51 @@ +name: web sync Review comment: @lamber-ken how does this web.yml get triggered ?

[jira] [Commented] (HUDI-635) MergeHandle's DiskBasedMap entries can be thinner

2020-06-11 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133348#comment-17133348 ] Vinoth Chandar commented on HUDI-635: [~shivnarayan] basic idea here is to avoid overh

[GitHub] [hudi] harishchanderramesh commented on issue #1552: Time taken for upserting hudi table is increasing with increase in number of partitions

2020-06-11 Thread GitBox
harishchanderramesh commented on issue #1552: URL: https://github.com/apache/hudi/issues/1552#issuecomment-642655822 > @harshi2506 I am suspecting this may be due to a recent bug we fixed on master (still not 100%). Are you open to building hudi off master branch and giving that a shot? I

[GitHub] [hudi] desavera commented on issue #857: http://hudi.apache.org/comparison.html# should mention Iceberg and DeltaLake

2020-06-11 Thread GitBox
desavera commented on issue #857: URL: https://github.com/apache/hudi/issues/857#issuecomment-642565604 +1 over questioning the Opensource support as Delta Lake is not restricted to Databricks release but also has a OS version (that does not support S3 basically).

[GitHub] [hudi] codecov-commenter edited a comment on pull request #1719: [HUDI-1006]deltastreamer use kafkaSource with offset reset strategy:latest can't consume data

2020-06-11 Thread GitBox
codecov-commenter edited a comment on pull request #1719: URL: https://github.com/apache/hudi/pull/1719#issuecomment-641096930 # [Codecov](https://codecov.io/gh/apache/hudi/pull/1719?src=pr&el=h1) Report > Merging [#1719](https://codecov.io/gh/apache/hudi/pull/1719?src=pr&el=desc) into

[GitHub] [hudi] Litianye commented on pull request #1719: [HUDI-1006]deltastreamer use kafkaSource with offset reset strategy:latest can't consume data

2020-06-11 Thread GitBox
Litianye commented on pull request #1719: URL: https://github.com/apache/hudi/pull/1719#issuecomment-642497057 > @Litianye no worry we can work through this together. I believe all the sources treat empty string and `Option.empty()` the same. If not then it's a bug. If we don't fix it now,

[GitHub] [hudi] selvarajperiyasamy commented on issue #1696: COW Error on existing Hive Table

2020-06-11 Thread GitBox
selvarajperiyasamy commented on issue #1696: URL: https://github.com/apache/hudi/issues/1696#issuecomment-642478387 @Balaji, Is this feature available in 0.5.0 version? Due to some restriction, I can't use spark 2.4 .x . So I am still using 0.5.0