[GitHub] [hudi] yanghua edited a comment on issue #2834: [SUPPORT] Help~~~org.apache.hudi.exception.TableNotFoundException

2021-04-19 Thread GitBox
yanghua edited a comment on issue #2834: URL: https://github.com/apache/hudi/issues/2834#issuecomment-823016979 > Error while compiling statement: No privilege 'Create' found for outputs { } It seems you do not have hive permission. -- This is an automated message from the Apache

[GitHub] [hudi] yanghua commented on issue #2834: [SUPPORT] Help~~~org.apache.hudi.exception.TableNotFoundException

2021-04-19 Thread GitBox
yanghua commented on issue #2834: URL: https://github.com/apache/hudi/issues/2834#issuecomment-823016979 > Error while compiling statement: No privilege 'Create' found for outputs { } It seems you are not authorized in Hive. -- This is an automated message from the Apache Git Ser

[GitHub] [hudi] ssdong edited a comment on issue #2818: [SUPPORT] Exception thrown in incremental query(MOR) and potential change data loss after archiving

2021-04-19 Thread GitBox
ssdong edited a comment on issue #2818: URL: https://github.com/apache/hudi/issues/2818#issuecomment-822660600 Hey @garyli1019 thank you for the meticulous explanation. Yep, I was trying to confirm the “expected” behavior of incremental query. It makes sense to pull from _existing_ active

[GitHub] [hudi] codejoyan opened a new issue #2852: [SUPPORT] Read Hudi Table from Hive - Hive Sync clarification

2021-04-19 Thread GitBox
codejoyan opened a new issue #2852: URL: https://github.com/apache/hudi/issues/2852 I have a requirement to read Hudi table from Hive. Documentation (https://hudi.apache.org/docs/querying_data.html#hive) says that we have to copy hudi-hadoop-mr-bundle-x.y.z-SNAPSHOT.jar in the aux jar

[GitHub] [hudi] vingov commented on pull request #2768: [HUDI-485]: corrected the check for incremental sql

2021-04-19 Thread GitBox
vingov commented on pull request #2768: URL: https://github.com/apache/hudi/pull/2768#issuecomment-822978140 lgtm, we should use '%s' to replace the from commit time in the line no. 197 with the string format function: `incrementalPullSQLtemplate.add("incrementalSQL", String.format(

[jira] [Commented] (HUDI-1747) Deltastreamer incremental read is not working on the MOR table

2021-04-19 Thread Vinoth Govindarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325466#comment-17325466 ] Vinoth Govindarajan commented on HUDI-1747: --- [~shivnarayan] - Here are the answe

[GitHub] [hudi] codecov-commenter commented on pull request #2759: [HUDI-1759] Save one connection retry to hive metastore when hiveSyncTool run with useJdbc=false

2021-04-19 Thread GitBox
codecov-commenter commented on pull request #2759: URL: https://github.com/apache/hudi/pull/2759#issuecomment-822959772 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2759?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache

[GitHub] [hudi] codecov-commenter commented on pull request #2851: [HUDI-1551] Add support for BigDecimal and Integer when partitioning …

2021-04-19 Thread GitBox
codecov-commenter commented on pull request #2851: URL: https://github.com/apache/hudi/pull/2851#issuecomment-822957769 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2851?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache

[jira] [Updated] (HUDI-1415) Read Hoodie Table As Spark DataSource Table

2021-04-19 Thread pengzhiwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] pengzhiwei updated HUDI-1415: - Description: Currently Hudi can sync the metadata to the Hive metastore using HiveSyncTool. The table desc

[jira] [Updated] (HUDI-1415) Read Hoodie Table As Spark DataSource Table

2021-04-19 Thread pengzhiwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] pengzhiwei updated HUDI-1415: - Issue Type: Improvement (was: Bug) > Read Hoodie Table As Spark DataSource Table > -

[jira] [Updated] (HUDI-1551) Support Partition with BigDecimal/Integer field

2021-04-19 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1551: - Labels: pull-request-available (was: ) > Support Partition with BigDecimal/Integer field > --

[GitHub] [hudi] giaosudau opened a new pull request #2851: [HUDI-1551] Add support for BigDecimal and Integer when partitioning …

2021-04-19 Thread GitBox
giaosudau opened a new pull request #2851: URL: https://github.com/apache/hudi/pull/2851 …based on time. ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What

[GitHub] [hudi] jsbali commented on pull request #2809: [HUDI-1789] Support reading older snapshots

2021-04-19 Thread GitBox
jsbali commented on pull request #2809: URL: https://github.com/apache/hudi/pull/2809#issuecomment-822952827 @vinothchandar yes that is correct -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to t

[GitHub] [hudi] li36909 commented on a change in pull request #2759: [HUDI-1759] Save one connection retry to hive metastore when hiveSyncTool run with useJdbc=false

2021-04-19 Thread GitBox
li36909 commented on a change in pull request #2759: URL: https://github.com/apache/hudi/pull/2759#discussion_r616320165 ## File path: hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java ## @@ -386,39 +408,18 @@ public CommandProcessorResponse up

[GitHub] [hudi] li36909 commented on a change in pull request #2759: [HUDI-1759] Save one connection retry to hive metastore when hiveSyncTool run with useJdbc=false

2021-04-19 Thread GitBox
li36909 commented on a change in pull request #2759: URL: https://github.com/apache/hudi/pull/2759#discussion_r616318879 ## File path: hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java ## @@ -69,6 +70,8 @@ private static final Logger LOG = L

[GitHub] [hudi] li36909 commented on a change in pull request #2759: [HUDI-1759] Save one connection retry to hive metastore when hiveSyncTool run with useJdbc=false

2021-04-19 Thread GitBox
li36909 commented on a change in pull request #2759: URL: https://github.com/apache/hudi/pull/2759#discussion_r616318243 ## File path: hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java ## @@ -88,8 +91,27 @@ public HoodieHiveClient(HiveSyncConfig

[jira] [Commented] (HUDI-774) Spark to Avro converter incorrectly generates optional fields

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325449#comment-17325449 ] sivabalan narayanan commented on HUDI-774: -- [~afilipchik]: Can you check if HUDI-1

[jira] [Commented] (HUDI-1747) Deltastreamer incremental read is not working on the MOR table

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325447#comment-17325447 ] sivabalan narayanan commented on HUDI-1747: --- [~vino]: Can you help me understand

[jira] [Assigned] (HUDI-1717) Metadata Table reader does not show correct view of the metadata

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-1717: - Assignee: Prashant Wason > Metadata Table reader does not show correct view of th

[jira] [Updated] (HUDI-1718) when query incr view of mor table which has Multi level partitions, the query failed

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1718: -- Status: In Progress (was: Open) > when query incr view of mor table which has Multi le

[jira] [Assigned] (HUDI-1718) when query incr view of mor table which has Multi level partitions, the query failed

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-1718: - Assignee: tao meng > when query incr view of mor table which has Multi level par

[jira] [Updated] (HUDI-1718) when query incr view of mor table which has Multi level partitions, the query failed

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1718: -- Status: Patch Available (was: In Progress) > when query incr view of mor table which h

[jira] [Assigned] (HUDI-1719) hive on spark/mr,Incremental query of the mor table, the partition field is incorrect

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-1719: - Assignee: tao meng > hive on spark/mr,Incremental query of the mor table, the par

[jira] [Updated] (HUDI-1719) hive on spark/mr,Incremental query of the mor table, the partition field is incorrect

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1719: -- Status: Patch Available (was: In Progress) > hive on spark/mr,Incremental query of the

[jira] [Updated] (HUDI-1719) hive on spark/mr,Incremental query of the mor table, the partition field is incorrect

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1719: -- Status: In Progress (was: Open) > hive on spark/mr,Incremental query of the mor table,

[jira] [Assigned] (HUDI-1722) hive beeline/spark-sql query specified field on mor table occur NPE

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-1722: - Assignee: tao meng > hive beeline/spark-sql query specified field on mor table o

[jira] [Updated] (HUDI-1722) hive beeline/spark-sql query specified field on mor table occur NPE

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1722: -- Status: Patch Available (was: In Progress) > hive beeline/spark-sql query specified fi

[jira] [Updated] (HUDI-1722) hive beeline/spark-sql query specified field on mor table occur NPE

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1722: -- Status: In Progress (was: Open) > hive beeline/spark-sql query specified field on mor

[jira] [Updated] (HUDI-1723) DFSPathSelector skips files with the same modify date when read up to source limit

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1723: -- Status: In Progress (was: Open) > DFSPathSelector skips files with the same modify date

[jira] [Updated] (HUDI-1723) DFSPathSelector skips files with the same modify date when read up to source limit

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1723: -- Status: Patch Available (was: In Progress) > DFSPathSelector skips files with the same

[jira] [Assigned] (HUDI-1723) DFSPathSelector skips files with the same modify date when read up to source limit

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-1723: - Assignee: Raymond Xu > DFSPathSelector skips files with the same modify date when

[jira] [Updated] (HUDI-1723) DFSPathSelector skips files with the same modify date when read up to source limit

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1723: -- Status: Open (was: New) > DFSPathSelector skips files with the same modify date when re

[GitHub] [hudi] nsivabalan commented on pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE

2021-04-19 Thread GitBox
nsivabalan commented on pull request #2722: URL: https://github.com/apache/hudi/pull/2722#issuecomment-822941764 @vinothchandar : I see that the author is actively responding/working on the PR. Will leave it to the author to address feedback. If we don't see any activity for some time, I can ch

[jira] [Resolved] (HUDI-1720) when query incr view of mor table which has many delete records use sparksql/hive-beeline, StackOverflowError

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan resolved HUDI-1720. --- Resolution: Fixed > when query incr view of mor table which has many delete records u

[jira] [Assigned] (HUDI-1720) when query incr view of mor table which has many delete records use sparksql/hive-beeline, StackOverflowError

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-1720: - Assignee: tao meng > when query incr view of mor table which has many delete rec

[jira] [Updated] (HUDI-1720) when query incr view of mor table which has many delete records use sparksql/hive-beeline, StackOverflowError

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1720: -- Status: In Progress (was: Open) > when query incr view of mor table which has many del

[jira] [Comment Edited] (HUDI-89) Clean up placement, naming, defaults of HoodieWriteConfig

2021-04-19 Thread Roc Marshal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-89?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325435#comment-17325435 ] Roc Marshal edited comment on HUDI-89 at 4/20/21, 3:05 AM: --- [~wenn

[jira] [Commented] (HUDI-89) Clean up placement, naming, defaults of HoodieWriteConfig

2021-04-19 Thread Roc Marshal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-89?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325435#comment-17325435 ] Roc Marshal commented on HUDI-89: - [~wenningd] It will be an evolutionary process. Backward

[jira] [Updated] (HUDI-1721) run_sync_tool support hive3.1.2 on hadoop3.1.4

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1721: -- Status: Patch Available (was: In Progress) > run_sync_tool support hive3.1.2 on hadoop3

[jira] [Assigned] (HUDI-1721) run_sync_tool support hive3.1.2 on hadoop3.1.4

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-1721: - Assignee: Vinoth Chandar > run_sync_tool support hive3.1.2 on hadoop3.1.4 >

[GitHub] [hudi] pengzhiwei2018 commented on pull request #2283: [HUDI-1415] Read Hoodie Table As Spark DataSource Table

2021-04-19 Thread GitBox
pengzhiwei2018 commented on pull request #2283: URL: https://github.com/apache/hudi/pull/2283#issuecomment-822936399 Hi @umehrot2 @vinothchandar The PR has been updated to address the last comment. Please take another look~ -- This is an automated message from the Apache Git Service. To respo

[jira] [Updated] (HUDI-1716) rt view w/ MOR tables fails after schema evolution

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1716: -- Status: Patch Available (was: In Progress) > rt view w/ MOR tables fails after schema e

[jira] [Updated] (HUDI-1716) rt view w/ MOR tables fails after schema evolution

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1716: -- Status: Closed (was: Patch Available) > rt view w/ MOR tables fails after schema evolut

[jira] [Reopened] (HUDI-1716) rt view w/ MOR tables fails after schema evolution

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reopened HUDI-1716: --- > rt view w/ MOR tables fails after schema evolution > ---

[jira] [Resolved] (HUDI-1716) rt view w/ MOR tables fails after schema evolution

2021-04-19 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan resolved HUDI-1716. --- Resolution: Fixed > rt view w/ MOR tables fails after schema evolution > -

[GitHub] [hudi] liujinhui1994 commented on pull request #2666: [HUDI-1160] Support update partial fields for CoW table

2021-04-19 Thread GitBox
liujinhui1994 commented on pull request #2666: URL: https://github.com/apache/hudi/pull/2666#issuecomment-822930145 > > we should still use the old schema with full fields there, for new records with partial values, we can patch them up with a builtin placeholder values > > agree. T

[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #2283: [HUDI-1415] Read Hoodie Table As Spark DataSource Table

2021-04-19 Thread GitBox
pengzhiwei2018 commented on a change in pull request #2283: URL: https://github.com/apache/hudi/pull/2283#discussion_r616297739 ## File path: hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncTool.java ## @@ -164,7 +165,13 @@ private void syncHoodieTable(Stri

[GitHub] [hudi] ngonik edited a comment on issue #1679: [HUDI-1609] How to disable Hive JDBC and enable metastore

2021-04-19 Thread GitBox
ngonik edited a comment on issue #1679: URL: https://github.com/apache/hudi/issues/1679#issuecomment-82299 I was able to fix the JSONException error on EMR. Just needed to manually add the org.json (https://mvnrepository.com/artifact/org.json/json) package to both executor and driver e

[GitHub] [hudi] MyLanPangzi commented on pull request #2719: [HUDI-1721] run_sync_tool support hive3

2021-04-19 Thread GitBox
MyLanPangzi commented on pull request #2719: URL: https://github.com/apache/hudi/pull/2719#issuecomment-822922894 > @MyLanPangzi just want to confirm that this works on hive2 as well. We can then land if other have no concerns I'll spend some time testing Hive2 with these vars at thi

[GitHub] [hudi] ngonik commented on issue #1679: [HUDI-1609] How to disable Hive JDBC and enable metastore

2021-04-19 Thread GitBox
ngonik commented on issue #1679: URL: https://github.com/apache/hudi/issues/1679#issuecomment-82299 I was able to fix the JSONException error on EMR. Just needed to add the org.json package when deploying the cluster. I created a blog post for future reference: https://ngonik.medium.c

[GitHub] [hudi] umehrot2 commented on pull request #2283: [HUDI-1415] Read Hoodie Table As Spark DataSource Table

2021-04-19 Thread GitBox
umehrot2 commented on pull request #2283: URL: https://github.com/apache/hudi/pull/2283#issuecomment-822916265 @pengzhiwei2018 @vinothchandar dropped one comment. Otherwise LGTM. Also seems like it has conflicts now. I can merge it once it's cleared out. -- This is an automated message fr

[GitHub] [hudi] umehrot2 commented on a change in pull request #2283: [HUDI-1415] Read Hoodie Table As Spark DataSource Table

2021-04-19 Thread GitBox
umehrot2 commented on a change in pull request #2283: URL: https://github.com/apache/hudi/pull/2283#discussion_r616288457 ## File path: hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncTool.java ## @@ -164,7 +165,13 @@ private void syncHoodieTable(String tab

[GitHub] [hudi] hj2016 commented on issue #2623: org.apache.hudi.exception.HoodieDependentSystemUnavailableException:System HBASE unavailable.

2021-04-19 Thread GitBox
hj2016 commented on issue #2623: URL: https://github.com/apache/hudi/issues/2623#issuecomment-822908861 @nsivabalan I upgraded the CDH cluster hadoop 2.6.0 version to hadoop 3.0.0 here, and the previous dependency was still 2.6.0. I tried to delete all the dependency packages of 2.6.0 and

[GitHub] [hudi] lw309637554 commented on pull request #2773: [HUDI-1764] Add Hudi-CLI support for clustering

2021-04-19 Thread GitBox
lw309637554 commented on pull request #2773: URL: https://github.com/apache/hudi/pull/2773#issuecomment-822906790 > @satishkotha @lw309637554 is this now good to go? it is good. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [hudi] lw309637554 commented on pull request #2776: [HUDI-1768] spark datasource support schema validate add column

2021-04-19 Thread GitBox
lw309637554 commented on pull request #2776: URL: https://github.com/apache/hudi/pull/2776#issuecomment-822906030 > > @nsivabalan is this still valid? See that we also landed #2765 ? > @nsivabalan @vinothchandar I think the unit test is valid, I will modify the unit test, and t

[GitHub] [hudi] lw309637554 commented on pull request #2776: [HUDI-1768] spark datasource support schema validate add column

2021-04-19 Thread GitBox
lw309637554 commented on pull request #2776: URL: https://github.com/apache/hudi/pull/2776#issuecomment-822905918 > @nsivabalan is this still valid? See that we also landed #2765 ? I think the unit test is valid, I will modify the unit test, and then can help to review -- This

[GitHub] [hudi] lw309637554 commented on a change in pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE

2021-04-19 Thread GitBox
lw309637554 commented on a change in pull request #2722: URL: https://github.com/apache/hudi/pull/2722#discussion_r616282438 ## File path: hudi-hadoop-mr/src/test/java/org/apache/hudi/hadoop/functional/TestHoodieCombineHiveInputFormat.java ## @@ -84,6 +86,73 @@ public void set

[GitHub] [hudi] lw309637554 commented on pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE

2021-04-19 Thread GitBox
lw309637554 commented on pull request #2722: URL: https://github.com/apache/hudi/pull/2722#issuecomment-822899908 > testHoodieRealtimeCombineHoodieInputFormat Try it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub an

[GitHub] [hudi] lw309637554 commented on a change in pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE

2021-04-19 Thread GitBox
lw309637554 commented on a change in pull request #2722: URL: https://github.com/apache/hudi/pull/2722#discussion_r616277375 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieParquetRealtimeInputFormat.java ## @@ -85,12 +85,14 @@ void addProjecti

[GitHub] [hudi] li36909 commented on pull request #2749: [HUDI-1744][Rollback] rollback fail on mor table when the partition path hasn't any files

2021-04-19 Thread GitBox
li36909 commented on pull request #2749: URL: https://github.com/apache/hudi/pull/2749#issuecomment-822899384 @n3nash @vinothchandar thank you -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] cdmikechen commented on pull request #2835: [HUDI-1802] Timeline Server Bundle need to include com.esotericsoftware package

2021-04-19 Thread GitBox
cdmikechen commented on pull request #2835: URL: https://github.com/apache/hudi/pull/2835#issuecomment-822874143 @vinothchandar Yes~ I'm now running multiple `hudi deltastreamer` instances by spark operator (spark 3.0.2) in OpenShift 4.6 and use a standalone `timeline server` storage by `roc

[GitHub] [hudi] vinothchandar commented on pull request #2809: [HUDI-1789] Support reading older snapshots

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2809: URL: https://github.com/apache/hudi/pull/2809#issuecomment-822856309 @jsbali IIUC this PR adds ability to bound the max commit time for reading. @satishkotha do you mind reviewing this/first pass. -- This is an automated message from

[hudi] branch master updated: [MINOR] Added metric reporter Prometheus to HoodieBackedTableMetadataWriter (#2842)

2021-04-19 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 9a288cc [MINOR] Added metric reporter Prometheus

[GitHub] [hudi] vinothchandar merged pull request #2842: [MINOR] Added metric reporter Prometheus to HoodieBackedTableMetadataWriter

2021-04-19 Thread GitBox
vinothchandar merged pull request #2842: URL: https://github.com/apache/hudi/pull/2842 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, ple

[GitHub] [hudi] vinothchandar commented on pull request #2842: [MINOR] Added metric reporter Prometheus to HoodieBackedTableMetadataWriter

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2842: URL: https://github.com/apache/hudi/pull/2842#issuecomment-822845200 Thanks @sbernauer -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

[GitHub] [hudi] vinothchandar commented on a change in pull request #2822: [Hotfix][hudi-sync] Refactor method up to parent-class

2021-04-19 Thread GitBox
vinothchandar commented on a change in pull request #2822: URL: https://github.com/apache/hudi/pull/2822#discussion_r616232560 ## File path: hudi-sync/hudi-sync-common/src/main/java/org/apache/hudi/sync/common/AbstractSyncHoodieClient.java ## @@ -136,6 +141,42 @@ public Messag

[GitHub] [hudi] vinothchandar commented on pull request #2310: [HUDI-1444] fix rollback for emtpy partition table

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2310: URL: https://github.com/apache/hudi/pull/2310#issuecomment-822843378 is this same as #2749 ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the s

[GitHub] [hudi] vinothchandar commented on pull request #2283: [HUDI-1415] Read Hoodie Table As Spark DataSource Table

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2283: URL: https://github.com/apache/hudi/pull/2283#issuecomment-822842895 #2651 is now merged. @umehrot2 @pengzhiwei2018 what are the next steps for this PR? -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [hudi] vinothchandar closed pull request #2141: [HUDI-898] Add new backwards compatible API to expose schema in preCombine

2021-04-19 Thread GitBox
vinothchandar closed pull request #2141: URL: https://github.com/apache/hudi/pull/2141 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, ple

[GitHub] [hudi] vinothchandar commented on pull request #2790: [HUDI-1779] Fail to bootstrap/upsert a table which contains timestamp column

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2790: URL: https://github.com/apache/hudi/pull/2790#issuecomment-822840253 @umehrot2 in case you have some time, please take a pass. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub an

[GitHub] [hudi] vinothchandar commented on pull request #2790: [HUDI-1779] Fail to bootstrap/upsert a table which contains timestamp column

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2790: URL: https://github.com/apache/hudi/pull/2790#issuecomment-822840179 @li36909 Thanks for your contribution! Queued up for review. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [hudi] vinothchandar commented on pull request #2767: [HUDI-1761] Adding support for Test your own Schema with QuickStart

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2767: URL: https://github.com/apache/hudi/pull/2767#issuecomment-822839021 @nsivabalan do we really need this kind of flexibility in a quickstart? I understand for testing (perf or regression) we might need something like this. -- This is a

[GitHub] [hudi] vinothchandar commented on pull request #2761: [HUDI-1676] Support SQL with spark3

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2761: URL: https://github.com/apache/hudi/pull/2761#issuecomment-822838229 I will be spending some serious time on #2645 this week. So let's please keep this open as we merge that cc @pengzhiwei2018 -- This is an automated message from t

[GitHub] [hudi] vinothchandar commented on pull request #2773: [HUDI-1764] Add Hudi-CLI support for clustering

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2773: URL: https://github.com/apache/hudi/pull/2773#issuecomment-822837688 @satishkotha @lw309637554 is this now good to go? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [hudi] vinothchandar merged pull request #2749: [HUDI-1744][Rollback] rollback fail on mor table when the partition path hasn't any files

2021-04-19 Thread GitBox
vinothchandar merged pull request #2749: URL: https://github.com/apache/hudi/pull/2749 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, ple

[hudi] branch master updated (d21753d -> 6b4b878)

2021-04-19 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from d21753d [HUDI-1802] Timeline Server Bundle need to include com.esotericsoftware package (#2835) add 6b4b878 [HU

[GitHub] [hudi] vinothchandar commented on a change in pull request #2759: [HUDI-1759] Save one connection retry to hive metastore when hiveSyncTool run with useJdbc=false

2021-04-19 Thread GitBox
vinothchandar commented on a change in pull request #2759: URL: https://github.com/apache/hudi/pull/2759#discussion_r616223548 ## File path: hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HoodieHiveClient.java ## @@ -69,6 +70,8 @@ private static final Logger LO

[GitHub] [hudi] vinothchandar edited a comment on pull request #2768: [HUDI-485]: corrected the check for incremental sql

2021-04-19 Thread GitBox
vinothchandar edited a comment on pull request #2768: URL: https://github.com/apache/hudi/pull/2768#issuecomment-822830833 @vingov for an extra pair of eyes. @pratyakshsharma any testing done for the PR? any way to add a test for the tool?

[GitHub] [hudi] vinothchandar commented on pull request #2768: [HUDI-485]: corrected the check for incremental sql

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2768: URL: https://github.com/apache/hudi/pull/2768#issuecomment-822830833 @vingov for an extra pair of eyes. @pratyakshsharma can you please provide more context into what the fix is? any tests?

[GitHub] [hudi] vinothchandar commented on pull request #2776: [HUDI-1768] spark datasource support schema validate add column

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2776: URL: https://github.com/apache/hudi/pull/2776#issuecomment-822829555 @nsivabalan is this still valid? See that we also landed #2765 ?

[GitHub] [hudi] vinothchandar commented on pull request #2784: [HUDI-1740] Fix insert-overwrite API archival

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2784: URL: https://github.com/apache/hudi/pull/2784#issuecomment-822827791 I will take a pass on this and land!

[GitHub] [hudi] vinothchandar commented on a change in pull request #2819: [HUDI-1794] Moved static COMMIT_FORMATTER to thread local variable as SimpleDateFormat is not thread safe.

2021-04-19 Thread GitBox
vinothchandar commented on a change in pull request #2819: URL: https://github.com/apache/hudi/pull/2819#discussion_r616215766 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java ## @@ -73,6 +71,16 @@ private static final

[GitHub] [hudi] vinothchandar commented on pull request #2722: [HUDI-1722]hive beeline/spark-sql query specified field on mor table occur NPE

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2722: URL: https://github.com/apache/hudi/pull/2722#issuecomment-822802881 @nsivabalan same over to you to get this ready for review.

[GitHub] [hudi] vinothchandar commented on a change in pull request #2739: [MINOR] Fixing key generators blog content

2021-04-19 Thread GitBox
vinothchandar commented on a change in pull request #2739: URL: https://github.com/apache/hudi/pull/2739#discussion_r616194063 ## File path: docs/_posts/2021-02-13-hudi-key-generators.md ## @@ -5,18 +5,16 @@ author: shivnarayan category: blog --- -Every record in Hudi is un

[GitHub] [hudi] vinothchandar commented on pull request #2716: [HUDI-1718] when query incr view of mor table which has Multi level partitions, the query failed

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2716: URL: https://github.com/apache/hudi/pull/2716#issuecomment-822799662 @nsivabalan could you please shepherd this to be ready for review?

[GitHub] [hudi] vinothchandar commented on pull request #2714: [HUDI-1707] Reduces log level for too verbose messages from info to debug level.

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2714: URL: https://github.com/apache/hudi/pull/2714#issuecomment-822795571 I have re-kicked the tests.

[GitHub] [hudi] vinothchandar commented on pull request #2697: [HUDI-1211] clean up spark session for each test of FunctionalTestHar…

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2697: URL: https://github.com/apache/hudi/pull/2697#issuecomment-822794015 @nsivabalan are you saying this PR is already subsumed by other fixes on master? Can you please clarify what your suggestion for this PR is?

[GitHub] [hudi] vinothchandar commented on pull request #2677: [HUDI-1714] Added tests to TestHoodieTimelineArchiveLog for the archival of compl…

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2677: URL: https://github.com/apache/hudi/pull/2677#issuecomment-822792474 @satishkotha seems like this is ready?

[GitHub] [hudi] abhijeetkushe opened a new issue #2850: [SUPPORT] S3 files skipped by HoodieDeltaStreamer on s3 bucket in continuous mode

2021-04-19 Thread GitBox
abhijeetkushe opened a new issue #2850: URL: https://github.com/apache/hudi/issues/2850 **Describe the problem you faced** We have a hoodiedeltastreamer application deployed in EMR which reads objects from source bucket : s3:// which is populated by a kinesis firehose located in a

[GitHub] [hudi] jintaoguan edited a comment on pull request #2773: [HUDI-1764] Add Hudi-CLI support for clustering

2021-04-19 Thread GitBox
jintaoguan edited a comment on pull request #2773: URL: https://github.com/apache/hudi/pull/2773#issuecomment-822671882

[GitHub] [hudi] jintaoguan edited a comment on pull request #2773: [HUDI-1764] Add Hudi-CLI support for clustering

2021-04-19 Thread GitBox
jintaoguan edited a comment on pull request #2773: URL: https://github.com/apache/hudi/pull/2773#issuecomment-822671882 @lw309637554 Sure. I will open a new issue (https://issues.apache.org/jira/browse/HUDI-1813) for updating the documentation of CLI. Thank you.

[jira] [Created] (HUDI-1813) Update the documentation about using clustering from CLI

2021-04-19 Thread Jintao (Jira)
Jintao created HUDI-1813: Summary: Update the documentation about using clustering from CLI Key: HUDI-1813 URL: https://issues.apache.org/jira/browse/HUDI-1813 Project: Apache Hudi Issue Type: Improv

[GitHub] [hudi] jintaoguan commented on pull request #2773: [HUDI-1764] Add Hudi-CLI support for clustering

2021-04-19 Thread GitBox
jintaoguan commented on pull request #2773: URL: https://github.com/apache/hudi/pull/2773#issuecomment-822671882 @lw309637554 Sure. I will open a new issue for updating the documentation of CLI. Thank you.

[GitHub] [hudi] ssdong commented on issue #2818: [SUPPORT] Exception thrown in incremental query(MOR) and potential change data loss after archiving

2021-04-19 Thread GitBox
ssdong commented on issue #2818: URL: https://github.com/apache/hudi/issues/2818#issuecomment-822660600 Hey @garyli1019 thank you for the meticulous explanation. Yep, I was trying to confirm the “expected” behavior of incremental query. It makes sense to pull from _existing_ active timelin

[GitHub] [hudi] zhedoubushishi edited a comment on pull request #2833: [WIP][HUDI-89] Add configOption & refactor HoodieBootstrapConfig for a demo

2021-04-19 Thread GitBox
zhedoubushishi edited a comment on pull request #2833: URL: https://github.com/apache/hudi/pull/2833#issuecomment-822659476 > this PR looks good. We can do the other config classes as well and land this. > > a) Can we also tackle `HoodieTableConfig` in this PR? > > b) For this

[GitHub] [hudi] zhedoubushishi commented on pull request #2833: [WIP][HUDI-89] Add configOption & refactor HoodieBootstrapConfig for a demo

2021-04-19 Thread GitBox
zhedoubushishi commented on pull request #2833: URL: https://github.com/apache/hudi/pull/2833#issuecomment-822659476 > this PR looks good. We can do the other config classes as well and land this. > > a) Can we also tackle `HoodieTableConfig` in this PR? > > b) For this commen

[GitHub] [hudi] vinothchandar commented on pull request #2666: [HUDI-1160] Support update partial fields for CoW table

2021-04-19 Thread GitBox
vinothchandar commented on pull request #2666: URL: https://github.com/apache/hudi/pull/2666#issuecomment-822627744 >we should still use the old schema with full fields there, for new records with partial values, we can patch them up with a builtin placeholder values agree. This is a

[GitHub] [hudi] vinothchandar commented on a change in pull request #2833: [WIP][HUDI-89] Add configOption & refactor HoodieBootstrapConfig for a demo

2021-04-19 Thread GitBox
vinothchandar commented on a change in pull request #2833: URL: https://github.com/apache/hudi/pull/2833#discussion_r616012703 ## File path: hudi-common/src/main/java/org/apache/hudi/common/config/ConfigOption.java ## @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Fo

[jira] [Updated] (HUDI-89) Clean up placement, naming, defaults of HoodieWriteConfig

2021-04-19 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-89?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-89: --- Fix Version/s: 0.9.0 > Clean up placement, naming, defaults of HoodieWriteConfig > -

[jira] [Updated] (HUDI-89) Clean up placement, naming, defaults of HoodieWriteConfig

2021-04-19 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-89?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-89: --- Priority: Blocker (was: Major) > Clean up placement, naming, defaults of HoodieWriteConfig > --
