[jira] [Updated] (HUDI-1763) DefaultHoodieRecordPayload does not honor ordering value when records within multiple log files are merged

2021-05-19 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal updated HUDI-1763: -- Labels: sev:critical (was: sev:high) > DefaultHoodieRecordPayload does not honor ordering

[jira] [Commented] (HUDI-1763) DefaultHoodieRecordPayload does not honor ordering value when records within multiple log files are merged

2021-05-19 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17348071#comment-17348071 ] Nishith Agarwal commented on HUDI-1763: --- [~shivnarayan] This issue was brought up by a couple of

[GitHub] [hudi] KarthickAN opened a new issue #2970: [SUPPORT] Failed to upsert for commit time

2021-05-19 Thread GitBox
KarthickAN opened a new issue #2970: URL: https://github.com/apache/hudi/issues/2970 Hi, I keep getting the following error intermittently and I'm not sure what causes this issue. There may be two different hudi jobs running parallelly and writing to the same bucket. Will that be an

[GitHub] [hudi] nsivabalan commented on pull request #2969: Blog on Bulk insert sort variants

2021-05-19 Thread GitBox
nsivabalan commented on pull request #2969: URL: https://github.com/apache/hudi/pull/2969#issuecomment-844661858 https://user-images.githubusercontent.com/513218/118915682-725e0b00-b8fb-11eb-99ec-14c989e18ced.png

[GitHub] [hudi] nsivabalan opened a new pull request #2969: Blog on Bulk insert sort variants

2021-05-19 Thread GitBox
nsivabalan opened a new pull request #2969: URL: https://github.com/apache/hudi/pull/2969 ## What is the purpose of the pull request - Bulk insert sort modes variants ## Verify this pull request Verified by building the site locally ## Committer checklist

[GitHub] [hudi] swuferhong opened a new pull request #2968: [HUDI-1871] Fix hive conf for Flink writer hive meta sync

2021-05-19 Thread GitBox
swuferhong opened a new pull request #2968: URL: https://github.com/apache/hudi/pull/2968 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[GitHub] [hudi] codecov-commenter edited a comment on pull request #2833: [HUDI-89] Add configOption & refactor Hudi configuration framework

2021-05-19 Thread GitBox
codecov-commenter edited a comment on pull request #2833: URL: https://github.com/apache/hudi/pull/2833#issuecomment-828792354 #

[GitHub] [hudi] danny0405 commented on a change in pull request #2961: [HUDI-1911] Reuse the partition path and file group id for flink writ…

2021-05-19 Thread GitBox
danny0405 commented on a change in pull request #2961: URL: https://github.com/apache/hudi/pull/2961#discussion_r635718807 ## File path: hudi-flink/src/test/java/org/apache/hudi/sink/TestWriteCopyOnWrite.java ## @@ -439,22 +439,22 @@ public void testInsertWithMiniBatches()

[GitHub] [hudi] codecov-commenter edited a comment on pull request #2833: [HUDI-89] Add configOption & refactor Hudi configuration framework

2021-05-19 Thread GitBox
codecov-commenter edited a comment on pull request #2833: URL: https://github.com/apache/hudi/pull/2833#issuecomment-828792354 #

[GitHub] [hudi] garyli1019 commented on a change in pull request #2961: [HUDI-1911] Reuse the partition path and file group id for flink writ…

2021-05-19 Thread GitBox
garyli1019 commented on a change in pull request #2961: URL: https://github.com/apache/hudi/pull/2961#discussion_r635711875 ## File path: hudi-flink/src/test/java/org/apache/hudi/sink/TestWriteCopyOnWrite.java ## @@ -439,22 +439,22 @@ public void testInsertWithMiniBatches()

[GitHub] [hudi] wangxianghu commented on pull request #2963: [HUDI-1904] Make SchemaProvider spark free and move it to hudi-client-common

2021-05-19 Thread GitBox
wangxianghu commented on pull request #2963: URL: https://github.com/apache/hudi/pull/2963#issuecomment-844642488 The full name of `SchemaProvider`s is not the same as before, can not say backwards compatible now :( eg, `org.apache.hudi.utilities.schema.FilebasedSchemaProvider` changed

[jira] [Updated] (HUDI-1915) Fix the file id for write data buffer before flushing

2021-05-19 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-1915: --- Fix Version/s: 0.9.0 > Fix the file id for write data buffer before flushing >

[jira] [Closed] (HUDI-1915) Fix the file id for write data buffer before flushing

2021-05-19 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-1915. -- Resolution: Fixed 9b01d2f864e5cc4a559cfd4199136bca0979b095 > Fix the file id for write data buffer before

[hudi] branch master updated: [HUDI-1915] Fix the file id for write data buffer before flushing (#2966)

2021-05-19 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 9b01d2f [HUDI-1915] Fix the file id for write

[GitHub] [hudi] yanghua merged pull request #2966: [HUDI-1915] Fix the file id for write data buffer before flushing

2021-05-19 Thread GitBox
yanghua merged pull request #2966: URL: https://github.com/apache/hudi/pull/2966 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[hudi] branch master updated (fe3f5c2 -> ced068e)

2021-05-19 Thread vinoyang
vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from fe3f5c2 [HUDI-1913] Using streams instead of loops for input/output (#2962) add ced068e [MINOR] Remove

[GitHub] [hudi] yanghua merged pull request #2965: [MINOR] Remove unused method in BaseSparkCommitActionExecutor

2021-05-19 Thread GitBox
yanghua merged pull request #2965: URL: https://github.com/apache/hudi/pull/2965

[GitHub] [hudi] codecov-commenter edited a comment on pull request #2963: [HUDI-1904] Make SchemaProvider spark free and move it to hudi-client-common

2021-05-19 Thread GitBox
codecov-commenter edited a comment on pull request #2963: URL: https://github.com/apache/hudi/pull/2963#issuecomment-843155329 #

[GitHub] [hudi] wangxianghu commented on pull request #2963: [HUDI-1904] Make SchemaProvider spark free and move it to hudi-client-common

2021-05-19 Thread GitBox
wangxianghu commented on pull request #2963: URL: https://github.com/apache/hudi/pull/2963#issuecomment-844602053 > Does this have breaking changes for the users? Non-backwards compatible class/package changes? > > Also can we create a hudi-utilities-common. Not sure if we should

[GitHub] [hudi] codecov-commenter edited a comment on pull request #2927: [HUDI-1129] Improving schema evolution support in hudi

2021-05-19 Thread GitBox
codecov-commenter edited a comment on pull request #2927: URL: https://github.com/apache/hudi/pull/2927#issuecomment-835430045 #

[GitHub] [hudi] codecov-commenter edited a comment on pull request #2833: [HUDI-89] Add configOption & refactor Hudi configuration framework

2021-05-19 Thread GitBox
codecov-commenter edited a comment on pull request #2833: URL: https://github.com/apache/hudi/pull/2833#issuecomment-828792354 #

[jira] [Assigned] (HUDI-1916) Create a matrix of datatypes across spark, hive, presto, Avro, parquet.

2021-05-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-1916: - Assignee: Nishith Agarwal > Create a matrix of datatypes across spark, hive,

[jira] [Created] (HUDI-1916) Create a matrix of datatypes across spark, hive, presto, Avro, parquet.

2021-05-19 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-1916: - Summary: Create a matrix of datatypes across spark, hive, presto, Avro, parquet. Key: HUDI-1916 URL: https://issues.apache.org/jira/browse/HUDI-1916

[GitHub] [hudi] WilliamWhispell commented on issue #1581: [SUPPORT] Hive Metastore not in sync with Hudi Dataset using DataSource API

2021-05-19 Thread GitBox
WilliamWhispell commented on issue #1581: URL: https://github.com/apache/hudi/issues/1581#issuecomment-844534611 The jira is marked as resolved, yet the last comment there indicates the issue still persists. We are running into this today with the same scenario we are creating a new

[GitHub] [hudi] xiarixiaoyao commented on pull request #2720: [HUDI-1719]hive on spark/mr,Incremental query of the mor table, the partition field is incorrect

2021-05-19 Thread GitBox
xiarixiaoyao commented on pull request #2720: URL: https://github.com/apache/hudi/pull/2720#issuecomment-844404214 @nsivabalan UT added, pls review again, thanks

[jira] [Commented] (HUDI-1910) Supporting Kafka based checkpointing for HoodieDeltaStreamer

2021-05-19 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347818#comment-17347818 ] Nishith Agarwal commented on HUDI-1910: --- [~vinaypatil18] which approach do you prefer to implement ?

[jira] [Commented] (HUDI-349) Make cleaner retention based on time period to account for higher deviations in ingestion runs

2021-05-19 Thread Pratyaksh Sharma (Jira)
[ https://issues.apache.org/jira/browse/HUDI-349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347813#comment-17347813 ] Pratyaksh Sharma commented on HUDI-349: --- Need to create a new Cleaning policy. > Make cleaner

[GitHub] [hudi] pratyakshsharma opened a new pull request #2967: Added blog for Hudi cleaner service

2021-05-19 Thread GitBox
pratyakshsharma opened a new pull request #2967: URL: https://github.com/apache/hudi/pull/2967 ## What is the purpose

[GitHub] [hudi] BenjMaq edited a comment on issue #2544: [SUPPORT]failed to read timestamp column in version 0.7.0 even when HIVE_SUPPORT_TIMESTAMP is enabled

2021-05-19 Thread GitBox
BenjMaq edited a comment on issue #2544: URL: https://github.com/apache/hudi/issues/2544#issuecomment-838618753 Hey everyone, I'm also facing this issue. I see some of you guys already worked on some type of fix/workaround. How would you advise dealing with this? I tried to add

[GitHub] [hudi] BenjMaq commented on issue #2544: [SUPPORT]failed to read timestamp column in version 0.7.0 even when HIVE_SUPPORT_TIMESTAMP is enabled

2021-05-19 Thread GitBox
BenjMaq commented on issue #2544: URL: https://github.com/apache/hudi/issues/2544#issuecomment-844267747 I've seen multiple issues raised about this here, and in most of them there's a link to @li36909's [comment above](https://github.com/apache/hudi/issues/2544#issuecomment-815849870) as

[GitHub] [hudi] BenjMaq removed a comment on issue #2544: [SUPPORT]failed to read timestamp column in version 0.7.0 even when HIVE_SUPPORT_TIMESTAMP is enabled

2021-05-19 Thread GitBox
BenjMaq removed a comment on issue #2544: URL: https://github.com/apache/hudi/issues/2544#issuecomment-844267747 I've seen multiple issues raised about this here, and in most of them there's a link to @li36909's [comment

[GitHub] [hudi] vinothchandar commented on pull request #2963: [HUDI-1904] Make SchemaProvider spark free and move it to hudi-client-common

2021-05-19 Thread GitBox
vinothchandar commented on pull request #2963: URL: https://github.com/apache/hudi/pull/2963#issuecomment-844242317 Does this have breaking changes for the users? Non-backwards compatible class/package changes? Also can we create a hudi-utilities-common. Not sure if we should move

[jira] [Commented] (HUDI-1277) [DOC] Need documentation explaining how to write custom record payload class

2021-05-19 Thread Pratyaksh Sharma (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347715#comment-17347715 ] Pratyaksh Sharma commented on HUDI-1277: can we close this Jira then? > [DOC] Need documentation

[jira] [Updated] (HUDI-1904) Make SchemaProvider spark free and move it to hudi-client-common

2021-05-19 Thread Xianghu Wang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xianghu Wang updated HUDI-1904: --- Summary: Make SchemaProvider spark free and move it to hudi-client-common (was: Move SchemaProvider

[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

2021-05-19 Thread GitBox
nsivabalan commented on pull request #2438: URL: https://github.com/apache/hudi/pull/2438#issuecomment-844060742 @liujinhui1994 : were you able to make progress on this. would be nice to have this in before next release.

[GitHub] [hudi] garyli1019 commented on a change in pull request #2926: [HUDI-1879] Support Partition Prune For MergeOnRead Snapshot Table

2021-05-19 Thread GitBox
garyli1019 commented on a change in pull request #2926: URL: https://github.com/apache/hudi/pull/2926#discussion_r635183335 ## File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestMORDataSource.scala ## @@ -614,4 +614,66 @@ class

[jira] [Commented] (HUDI-57) [UMBRELLA] Support ORC Storage

2021-05-19 Thread manasa (Jira)
[ https://issues.apache.org/jira/browse/HUDI-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17347489#comment-17347489 ] manasa commented on HUDI-57: Hi ...am willing to contribute to the task ( Support to ORC storage), could you

[GitHub] [hudi] codecov-commenter commented on pull request #2966: [HUDI-1915] Fix the file id for write data buffer before flushing

2021-05-19 Thread GitBox
codecov-commenter commented on pull request #2966: URL: https://github.com/apache/hudi/pull/2966#issuecomment-843955616 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2966?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

[jira] [Updated] (HUDI-1915) Fix the file id for write data buffer before flushing

2021-05-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1915: - Labels: pull-request-available (was: ) > Fix the file id for write data buffer before flushing >

[GitHub] [hudi] danny0405 opened a new pull request #2966: [HUDI-1915] Fix the file id for write data buffer before flushing

2021-05-19 Thread GitBox
danny0405 opened a new pull request #2966: URL: https://github.com/apache/hudi/pull/2966 ## What is the purpose of the

[jira] [Created] (HUDI-1915) Fix the file id for write data buffer before flushing

2021-05-19 Thread Danny Chen (Jira)
Danny Chen created HUDI-1915: Summary: Fix the file id for write data buffer before flushing Key: HUDI-1915 URL: https://issues.apache.org/jira/browse/HUDI-1915 Project: Apache Hudi Issue Type:

[GitHub] [hudi] pratyakshsharma commented on pull request #2912: [HUDI-1561] Adding read me for hudi-cli tool

2021-05-19 Thread GitBox
pratyakshsharma commented on pull request #2912: URL: https://github.com/apache/hudi/pull/2912#issuecomment-843916220 Let me take care of this @vinothchandar @nsivabalan .

[GitHub] [hudi] codecov-commenter commented on pull request #2965: [MINOR] Remove unused method in BaseSparkCommitActionExecutor

2021-05-19 Thread GitBox
codecov-commenter commented on pull request #2965: URL: https://github.com/apache/hudi/pull/2965#issuecomment-843911890 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2965?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)

[GitHub] [hudi] wangxianghu opened a new pull request #2965: [MINOR] Remove unused method in BaseSparkCommitActionExecutor

2021-05-19 Thread GitBox
wangxianghu opened a new pull request #2965: URL: https://github.com/apache/hudi/pull/2965 ## What is the purpose of

[GitHub] [hudi] peanut-chenzhong removed a comment on issue #2955: [SUPPORT]Log system conflict in Hudi-Cli after run temp_* command

2021-05-19 Thread GitBox
peanut-chenzhong removed a comment on issue #2955: URL: https://github.com/apache/hudi/issues/2955#issuecomment-843079769 BTW, how can I change this label to awaiting-community-help?