[GitHub] [hudi] n3nash commented on issue #1737: [SUPPORT]spark streaming create small parquet files

2021-04-21 Thread GitBox
n3nash commented on issue #1737: URL: https://github.com/apache/hudi/issues/1737#issuecomment-824574236 Closing this ticket since the issue seems resolved by 0.7.0 and there has been no activity in the last 60+ days. @kimberlyamandalu feel free to re-open or open a new ticket. -- This is an automated

[GitHub] [hudi] n3nash closed issue #1737: [SUPPORT]spark streaming create small parquet files

2021-04-21 Thread GitBox
n3nash closed issue #1737: URL: https://github.com/apache/hudi/issues/1737 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact

[GitHub] [hudi] n3nash commented on issue #1957: [SUPPORT] Small Table Upsert sometimes take a lot of time

2021-04-21 Thread GitBox
n3nash commented on issue #1957: URL: https://github.com/apache/hudi/issues/1957#issuecomment-824570762 Closing this ticket due to inactivity. @jiangok2006 please re-open if the slowness persists.

[GitHub] [hudi] n3nash closed issue #1957: [SUPPORT] Small Table Upsert sometimes take a lot of time

2021-04-21 Thread GitBox
n3nash closed issue #1957: URL: https://github.com/apache/hudi/issues/1957

[GitHub] [hudi] n3nash commented on issue #1981: [SUPPORT] Huge performance Difference Between Hudi and Regular Parquet in Athena

2021-04-21 Thread GitBox
n3nash commented on issue #1981: URL: https://github.com/apache/hudi/issues/1981#issuecomment-824570179 @umehrot2 Do you know when 0.7 will support the metadata table in Athena?

[GitHub] [hudi] n3nash edited a comment on issue #2066: [SUPPORT] Hudi is increasing the storage size big time

2021-04-21 Thread GitBox
n3nash edited a comment on issue #2066: URL: https://github.com/apache/hudi/issues/2066#issuecomment-824569091 JIRA for making the key virtual -> https://issues.apache.org/jira/browse/HUDI-1449. This will be done in the next 2 months (by the next release). Closing this ticket.

[GitHub] [hudi] n3nash commented on issue #2066: [SUPPORT] Hudi is increasing the storage size big time

2021-04-21 Thread GitBox
n3nash commented on issue #2066: URL: https://github.com/apache/hudi/issues/2066#issuecomment-824569091 JIRA for making the key virtual -> https://issues.apache.org/jira/browse/HUDI-1449. This will be done in the next 2 months. Closing this ticket.

[GitHub] [hudi] n3nash closed issue #2066: [SUPPORT] Hudi is increasing the storage size big time

2021-04-21 Thread GitBox
n3nash closed issue #2066: URL: https://github.com/apache/hudi/issues/2066

[GitHub] [hudi] n3nash commented on issue #2072: [SUPPORT] Hudi Pyspark Application Example

2021-04-21 Thread GitBox
n3nash commented on issue #2072: URL: https://github.com/apache/hudi/issues/2072#issuecomment-824568541 Closing this due to inactivity; a follow-up JIRA has been filed.

[GitHub] [hudi] n3nash closed issue #2072: [SUPPORT] Hudi Pyspark Application Example

2021-04-21 Thread GitBox
n3nash closed issue #2072: URL: https://github.com/apache/hudi/issues/2072

[jira] [Updated] (HUDI-1593) Add support for "show restore" in hudi-cli

2021-04-21 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal updated HUDI-1593: -- Labels: sev:normal user-support-issues (was: user-support-issues) > Add support for "show resto

[GitHub] [hudi] n3nash commented on issue #2135: [SUPPORT] GDPR safe deletes is complex

2021-04-21 Thread GitBox
n3nash commented on issue #2135: URL: https://github.com/apache/hudi/issues/2135#issuecomment-824540132 Closing this ticket due to inactivity; there is an existing JIRA to follow up on.

[GitHub] [hudi] n3nash closed issue #2135: [SUPPORT] GDPR safe deletes is complex

2021-04-21 Thread GitBox
n3nash closed issue #2135: URL: https://github.com/apache/hudi/issues/2135

[jira] [Updated] (HUDI-1549) Programmatic way to fetch earliest commit retained

2021-04-21 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal updated HUDI-1549: -- Labels: sev:normal user-support-issues (was: user-support-issues) > Programmatic way to fetch e

[GitHub] [hudi] codecov-commenter edited a comment on pull request #2862: [HUDI-1829] Use while loop instead of recursive call in MergeOnReadIn…

2021-04-21 Thread GitBox
codecov-commenter edited a comment on pull request #2862: URL: https://github.com/apache/hudi/pull/2862#issuecomment-824527479

[GitHub] [hudi] n3nash closed issue #2513: [SUPPORT]Hive-Cli set hive.input.format=org.apache.hudi.hadoop.HoodieParquetInputFormat and query error

2021-04-21 Thread GitBox
n3nash closed issue #2513: URL: https://github.com/apache/hudi/issues/2513

[GitHub] [hudi] n3nash commented on issue #2513: [SUPPORT]Hive-Cli set hive.input.format=org.apache.hudi.hadoop.HoodieParquetInputFormat and query error

2021-04-21 Thread GitBox
n3nash commented on issue #2513: URL: https://github.com/apache/hudi/issues/2513#issuecomment-824538766 @GintokiYs Closing this ticket due to inactivity. If you continue to see this issue, please re-open.

[GitHub] [hudi] n3nash commented on issue #2529: [SUPPORT] - Hudi Update in EMR

2021-04-21 Thread GitBox
n3nash commented on issue #2529: URL: https://github.com/apache/hudi/issues/2529#issuecomment-824538194 @Magicbeanbuyer I've seen multiple people want to update the Hudi jar in their EMR cluster. Do you mind writing this up in an FAQ?

[GitHub] [hudi] xushiyan commented on pull request #2845: [HUDI-1723] Fix path selector listing files with the same mod date

2021-04-21 Thread GitBox
xushiyan commented on pull request #2845: URL: https://github.com/apache/hudi/pull/2845#issuecomment-824535848 @nsivabalan thanks for reviewing this. I've listed my points in #2850. I will start work on tests once we have an agreed plan.

[GitHub] [hudi] n3nash commented on issue #1552: Time taken for upserting hudi table is increasing with increase in number of partitions

2021-04-21 Thread GitBox
n3nash commented on issue #1552: URL: https://github.com/apache/hudi/issues/1552#issuecomment-824535202 @FelixKJose Closing this ticket due to inactivity. Please feel free to re-open or open a new one if you continue to see the issue. I understand you are also trying to get 0.8.0 version work

[GitHub] [hudi] n3nash closed issue #1552: Time taken for upserting hudi table is increasing with increase in number of partitions

2021-04-21 Thread GitBox
n3nash closed issue #1552: URL: https://github.com/apache/hudi/issues/1552

[GitHub] [hudi] codecov-commenter edited a comment on pull request #2862: [HUDI-1829] Use while loop instead of recursive call in MergeOnReadIn…

2021-04-21 Thread GitBox
codecov-commenter edited a comment on pull request #2862: URL: https://github.com/apache/hudi/pull/2862#issuecomment-824527479 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2862?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The

[GitHub] [hudi] xushiyan commented on a change in pull request #2845: [HUDI-1723] Fix path selector listing files with the same mod date

2021-04-21 Thread GitBox
xushiyan commented on a change in pull request #2845: URL: https://github.com/apache/hudi/pull/2845#discussion_r618075710 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/DFSPathSelector.java ## @@ -121,28 +121,30 @@ public static DFSPathSel

[jira] [Commented] (HUDI-1138) Re-implement marker files via timeline server

2021-04-21 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17327096#comment-17327096 ] Vinoth Chandar commented on HUDI-1138: -- Yes, the plan for the metadata table is to unify

[GitHub] [hudi] n3nash commented on issue #1491: [SUPPORT] OutOfMemoryError during upsert 53M records

2021-04-21 Thread GitBox
n3nash commented on issue #1491: URL: https://github.com/apache/hudi/issues/1491#issuecomment-824533410 Closing this ticket due to inactivity; details and a PR will follow on the filed JIRA.

[GitHub] [hudi] n3nash closed issue #1491: [SUPPORT] OutOfMemoryError during upsert 53M records

2021-04-21 Thread GitBox
n3nash closed issue #1491: URL: https://github.com/apache/hudi/issues/1491

[jira] [Assigned] (HUDI-818) Optimize the default value of hoodie.memory.merge.max.size option

2021-04-21 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal reassigned HUDI-818: Assignee: sivabalan narayanan > Optimize the default value of hoodie.memory.merge.max.size o

[jira] [Commented] (HUDI-251) JDBC incremental load to HUDI with DeltaStreamer

2021-04-21 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17327095#comment-17327095 ] Vinoth Chandar commented on HUDI-251: - done! Left a comment on the RFC. > JDBC increme

[GitHub] [hudi] n3nash commented on issue #2557: [SUPPORT]Container exited with a non-zero exit code 137

2021-04-21 Thread GitBox
n3nash commented on issue #2557: URL: https://github.com/apache/hudi/issues/2557#issuecomment-824532918 @kingkongpoon Please let us know if this issue persists; otherwise we can close this ticket.

[GitHub] [hudi] n3nash commented on issue #2605: [SUPPORT] How to reload a writeConfig from a existed hudi path ?

2021-04-21 Thread GitBox
n3nash commented on issue #2605: URL: https://github.com/apache/hudi/issues/2605#issuecomment-824531751 Closing this ticket; please follow up on the opened JIRA.

[GitHub] [hudi] n3nash closed issue #2605: [SUPPORT] How to reload a writeConfig from a existed hudi path ?

2021-04-21 Thread GitBox
n3nash closed issue #2605: URL: https://github.com/apache/hudi/issues/2605

[jira] [Assigned] (HUDI-1640) Implement Spark Datasource option to read hudi configs from properties file

2021-04-21 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal reassigned HUDI-1640: - Assignee: sivabalan narayanan > Implement Spark Datasource option to read hudi configs fr

[GitHub] [hudi] n3nash commented on issue #2614: Caused by: com.fasterxml.jackson.core.JsonParseException: Unrecognized token

2021-04-21 Thread GitBox
n3nash commented on issue #2614: URL: https://github.com/apache/hudi/issues/2614#issuecomment-824530680 @root18039532923 Are you able to try out hudi-0.7.0 to see if your issue is resolved? Please let us know so we can close this ticket.

[GitHub] [hudi] n3nash commented on issue #2688: [SUPPORT] Sync to Hive using Metastore

2021-04-21 Thread GitBox
n3nash commented on issue #2688: URL: https://github.com/apache/hudi/issues/2688#issuecomment-824528659 @rubenssoto Were you able to resolve this issue, or have you considered @ismailsimsek's proposed solution? If this issue is resolved, please let me know so I can close this ticket.

[GitHub] [hudi] codecov-commenter commented on pull request #2862: [HUDI-1829] Use while loop instead of recursive call in MergeOnReadIn…

2021-04-21 Thread GitBox
codecov-commenter commented on pull request #2862: URL: https://github.com/apache/hudi/pull/2862#issuecomment-824527479 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2862?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache

[GitHub] [hudi] n3nash commented on issue #2696: Metadata and runtime exceptions in Hudi 0.7.0 on AWS Glue

2021-04-21 Thread GitBox
n3nash commented on issue #2696: URL: https://github.com/apache/hudi/issues/2696#issuecomment-824527320 @umehrot2 Are you able to jump in and help @kimberlyamandalu here?

[GitHub] [hudi] n3nash commented on issue #2482: [SUPPORT]

2021-04-21 Thread GitBox
n3nash commented on issue #2482: URL: https://github.com/apache/hudi/issues/2482#issuecomment-824526117 @duanyongvictory Closing this ticket due to inactivity. Please re-open if your issue persists.

[GitHub] [hudi] n3nash closed issue #2482: [SUPPORT]

2021-04-21 Thread GitBox
n3nash closed issue #2482: URL: https://github.com/apache/hudi/issues/2482

[GitHub] [hudi] n3nash closed issue #2489: [SUPPORT]

2021-04-21 Thread GitBox
n3nash closed issue #2489: URL: https://github.com/apache/hudi/issues/2489

[GitHub] [hudi] n3nash commented on issue #2489: [SUPPORT]

2021-04-21 Thread GitBox
n3nash commented on issue #2489: URL: https://github.com/apache/hudi/issues/2489#issuecomment-824525695 @ishg Since we haven't heard from you in over a month, closing this ticket. Please re-open if your issue is unresolved.

[hudi] branch asf-site updated: [HUDI-1769] Add download page to the site (#2847)

2021-04-21 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 9f93958 [HUDI-1769] Add download page to t

[GitHub] [hudi] nsivabalan merged pull request #2847: [HUDI-1769]Add download page to the site

2021-04-21 Thread GitBox
nsivabalan merged pull request #2847: URL: https://github.com/apache/hudi/pull/2847

[jira] [Updated] (HUDI-1829) Use while loop instead of recursive call in MergeOnReadInputFormat to avoid StackOverflow

2021-04-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1829: - Labels: pull-request-available (was: ) > Use while loop instead of recursive call in MergeOnReadI

[GitHub] [hudi] danny0405 opened a new pull request #2862: [HUDI-1829] Use while loop instead of recursive call in MergeOnReadIn…

2021-04-21 Thread GitBox
danny0405 opened a new pull request #2862: URL: https://github.com/apache/hudi/pull/2862 …putFormat to avoid StackOverflow Recursive calls risk a StackOverflow when there are too many.
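
The idea behind #2862 can be illustrated with a minimal sketch. This is not the actual MergeOnReadInputFormat code: `SkippingReader`, `reachedEndRecursive`, and `reachedEnd` are hypothetical names, and the assumption is a reader that skips null entries. The recursive form adds one stack frame per skipped record, while the loop form keeps the stack depth constant.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical reader that skips null records; illustrative only,
// not the real MergeOnReadInputFormat implementation.
final class SkippingReader {
    private final Iterator<String> source;
    private String current;

    public SkippingReader(Iterator<String> source) {
        this.source = source;
    }

    // Recursive form: every skipped record adds a stack frame, so a long
    // run of skipped records can end in a StackOverflowError.
    public boolean reachedEndRecursive() {
        if (!source.hasNext()) {
            return true;
        }
        String next = source.next();
        if (next == null) {
            return reachedEndRecursive();
        }
        current = next;
        return false;
    }

    // Loop form: same result, constant stack depth.
    public boolean reachedEnd() {
        while (source.hasNext()) {
            String next = source.next();
            if (next != null) {
                current = next;
                return false;
            }
        }
        return true;
    }

    // Last record found by reachedEnd() or reachedEndRecursive().
    public String current() {
        return current;
    }

    public static void main(String[] args) {
        SkippingReader reader =
            new SkippingReader(Arrays.asList(null, "a", null, "b").iterator());
        while (!reader.reachedEnd()) {
            System.out.println(reader.current());
        }
    }
}
```

Both methods return the same answer for short inputs; only the loop form is safe when the number of consecutive skipped records is large.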

[jira] [Updated] (HUDI-1829) Use while loop instead of recursive call in MergeOnReadInputFormat to avoid StackOverflow

2021-04-21 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen updated HUDI-1829: - Description: When there are too many > Use while loop instead of recursive call in MergeOnReadInputFormat

[jira] [Created] (HUDI-1829) Use while loop instead of recursive call in MergeOnReadInputFormat to avoid StackOverflow

2021-04-21 Thread Danny Chen (Jira)
Danny Chen created HUDI-1829: Summary: Use while loop instead of recursive call in MergeOnReadInputFormat to avoid StackOverflow Key: HUDI-1829 URL: https://issues.apache.org/jira/browse/HUDI-1829 Project

[GitHub] [hudi] JSK520 commented on issue #143: Tracking ticket for folks to be added to slack group

2021-04-21 Thread GitBox
JSK520 commented on issue #143: URL: https://github.com/apache/hudi/issues/143#issuecomment-824504951 Please add me to the community: 18502070...@163.com

[GitHub] [hudi] yanghua commented on issue #2834: [SUPPORT] Help~~~org.apache.hudi.exception.TableNotFoundException

2021-04-21 Thread GitBox
yanghua commented on issue #2834: URL: https://github.com/apache/hudi/issues/2834#issuecomment-824489701 @wk888 Sorry, I did not find the reason. It seems it was not caused by the Flink write client. Can you test the whole workflow via Spark?

[jira] [Commented] (HUDI-1138) Re-implement marker files via timeline server

2021-04-21 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17327052#comment-17327052 ] liwei commented on HUDI-1138: - [~vinoth] thanks. 1. I have an idea: can we update the file to

[GitHub] [hudi] n3nash edited a comment on issue #2860: [SUPPORT] How to get record key and partitionfields using hudi API ?

2021-04-21 Thread GitBox
n3nash edited a comment on issue #2860: URL: https://github.com/apache/hudi/issues/2860#issuecomment-824440054 @shenbinglife you can just look for the columns `_hoodie_record_key` and `_hoodie_partition_path`. There is no "Hudi API" per se, you can use your choice of query engine spark (s

[GitHub] [hudi] n3nash commented on issue #2860: [SUPPORT] How to get record key and partitionfields using hudi API ?

2021-04-21 Thread GitBox
n3nash commented on issue #2860: URL: https://github.com/apache/hudi/issues/2860#issuecomment-824440054 @shenbinglife you can just look for the columns `_hoodie_record_key` and `_hoodie_partition_path`

[GitHub] [hudi] vinothchandar commented on pull request #2847: [HUDI-1769]Add download page to the site

2021-04-21 Thread GitBox
vinothchandar commented on pull request #2847: URL: https://github.com/apache/hudi/pull/2847#issuecomment-824432001 lgtm. @nsivabalan if you are happy too, we can land this

[jira] [Updated] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2021-04-21 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-874: --- Status: Open (was: New) > Schema evolution does not work with AWS Glue catalog > -

[jira] [Resolved] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2021-04-21 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra resolved HUDI-874. Resolution: Fixed > Schema evolution does not work with AWS Glue catalog > --

[jira] [Assigned] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2021-04-21 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-874: -- Assignee: Udit Mehrotra > Schema evolution does not work with AWS Glue catalog > ---

[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2021-04-21 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17327008#comment-17327008 ] Udit Mehrotra commented on HUDI-874: This has been fixed since EMR 6.1.0 and EMR 5.32.0

[GitHub] [hudi] prashantwason commented on a change in pull request #2819: [HUDI-1794] Moved static COMMIT_FORMATTER to thread local variable as SimpleDateFormat is not thread safe.

2021-04-21 Thread GitBox
prashantwason commented on a change in pull request #2819: URL: https://github.com/apache/hudi/pull/2819#discussion_r617958581 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java ## @@ -73,6 +71,16 @@ private static final
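
The change named in the title of #2819 follows a common Java pattern: `SimpleDateFormat` keeps mutable state, so one shared static instance is not safe across threads, and a `ThreadLocal` gives each thread its own copy. A minimal sketch under stated assumptions: `CommitTimeFormatter` is a hypothetical class, not the actual HoodieActiveTimeline code, and the `yyyyMMddHHmmss` pattern and UTC time zone are chosen here for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper, not the actual HoodieActiveTimeline code.
final class CommitTimeFormatter {
    // One SimpleDateFormat per thread; the pattern and the UTC time zone
    // are assumptions made for this sketch.
    private static final ThreadLocal<SimpleDateFormat> FORMATTER =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmmss");
            format.setTimeZone(TimeZone.getTimeZone("UTC"));
            return format;
        });

    private CommitTimeFormatter() {
    }

    // Formats with the calling thread's own formatter instance.
    public static String format(Date date) {
        return FORMATTER.get().format(date);
    }

    public static void main(String[] args) {
        System.out.println(format(new Date()));
    }
}
```

Callers never touch the formatter directly, so no instance can leak between threads; `java.time.format.DateTimeFormatter` is an immutable alternative when the surrounding code can move off `java.util.Date`.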

[GitHub] [hudi] xushiyan commented on issue #2850: [SUPPORT] S3 files skipped by HoodieDeltaStreamer on s3 bucket in continuous mode

2021-04-21 Thread GitBox
xushiyan commented on issue #2850: URL: https://github.com/apache/hudi/issues/2850#issuecomment-824402773 @abhijeetkushe I think the bug is also related to [HUDI-1723](https://issues.apache.org/jira/browse/HUDI-1723), in a rarer race-condition scenario like the one you mentioned: some files

[GitHub] [hudi] rubenssoto commented on issue #2787: [SUPPORT] Error upserting bucketType UPDATE for partition

2021-04-21 Thread GitBox
rubenssoto commented on issue #2787: URL: https://github.com/apache/hudi/issues/2787#issuecomment-824357982 Yeah, I can. Is it a safe configuration? I use a generic pipeline for many tables, so it will apply to all of them. Thank you.

[GitHub] [hudi] codejoyan edited a comment on issue #2852: [SUPPORT] Read Hudi Table from Hive - Hive Sync clarification

2021-04-21 Thread GitBox
codejoyan edited a comment on issue #2852: URL: https://github.com/apache/hudi/issues/2852#issuecomment-824269118 Can we add the jars while reading from beeline/hive cli and create the external table manually? In that case the input format classes will be visible to the HMS while creating t

[GitHub] [hudi] codejoyan edited a comment on issue #2852: [SUPPORT] Read Hudi Table from Hive - Hive Sync clarification

2021-04-21 Thread GitBox
codejoyan edited a comment on issue #2852: URL: https://github.com/apache/hudi/issues/2852#issuecomment-824269118 Can we add the jars while reading using Hive from beeline/hive cli and create the external table manually? In that case the input format classes will be visible to the HMS whil

[GitHub] [hudi] codejoyan commented on issue #2852: [SUPPORT] Read Hudi Table from Hive - Hive Sync clarification

2021-04-21 Thread GitBox
codejoyan commented on issue #2852: URL: https://github.com/apache/hudi/issues/2852#issuecomment-824269118 Can we pass the jar when we are creating the Hudi table using Spark datasourcewriter and add the jars while reading using Hive from beeline/hive cli? In that case the inputformatclass

[jira] [Created] (HUDI-1828) Ensure All Tests Pass with ORC format

2021-04-21 Thread Teresa Kang (Jira)
Teresa Kang created HUDI-1828: - Summary: Ensure All Tests Pass with ORC format Key: HUDI-1828 URL: https://issues.apache.org/jira/browse/HUDI-1828 Project: Apache Hudi Issue Type: Sub-task

[jira] [Created] (HUDI-1827) Add ORC support in Bootstrap Op

2021-04-21 Thread Teresa Kang (Jira)
Teresa Kang created HUDI-1827: - Summary: Add ORC support in Bootstrap Op Key: HUDI-1827 URL: https://issues.apache.org/jira/browse/HUDI-1827 Project: Apache Hudi Issue Type: Sub-task Co

[jira] [Created] (HUDI-1826) Add ORC support in HoodieSnapshotExporter

2021-04-21 Thread Teresa Kang (Jira)
Teresa Kang created HUDI-1826: - Summary: Add ORC support in HoodieSnapshotExporter Key: HUDI-1826 URL: https://issues.apache.org/jira/browse/HUDI-1826 Project: Apache Hudi Issue Type: Sub-task

[GitHub] [hudi] vinothchandar commented on a change in pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support

2021-04-21 Thread GitBox
vinothchandar commented on a change in pull request #2645: URL: https://github.com/apache/hudi/pull/2645#discussion_r617739216 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/execution/HoodieLazyInsertIterable.java ## @@ -96,9 +97,14 @@ public Hoodie

[jira] [Created] (HUDI-1825) Implement HDFSOrcImporter

2021-04-21 Thread Teresa Kang (Jira)
Teresa Kang created HUDI-1825: - Summary: Implement HDFSOrcImporter Key: HUDI-1825 URL: https://issues.apache.org/jira/browse/HUDI-1825 Project: Apache Hudi Issue Type: Sub-task Componen

[jira] [Updated] (HUDI-1823) Hive/Presto Integration with ORC

2021-04-21 Thread Teresa Kang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teresa Kang updated HUDI-1823: -- Summary: Hive/Presto Integration with ORC (was: Hive/Presto Integration) > Hive/Presto Integration with

[jira] [Created] (HUDI-1824) Spark Integration with ORC

2021-04-21 Thread Teresa Kang (Jira)
Teresa Kang created HUDI-1824: - Summary: Spark Integration with ORC Key: HUDI-1824 URL: https://issues.apache.org/jira/browse/HUDI-1824 Project: Apache Hudi Issue Type: Sub-task Compone

[jira] [Updated] (HUDI-1823) Hive/Presto Integration

2021-04-21 Thread Teresa Kang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teresa Kang updated HUDI-1823: -- Summary: Hive/Presto Integration (was: Spark/Presto Integration) > Hive/Presto Integration > --

[jira] [Created] (HUDI-1823) Spark/Presto Integration

2021-04-21 Thread Teresa Kang (Jira)
Teresa Kang created HUDI-1823: - Summary: Spark/Presto Integration Key: HUDI-1823 URL: https://issues.apache.org/jira/browse/HUDI-1823 Project: Apache Hudi Issue Type: Sub-task Component

[jira] [Created] (HUDI-1822) [Umbrella] Range index support

2021-04-21 Thread satish (Jira)
satish created HUDI-1822: Summary: [Umbrella] Range index support Key: HUDI-1822 URL: https://issues.apache.org/jira/browse/HUDI-1822 Project: Apache Hudi Issue Type: Bug Reporter: satish

[jira] [Updated] (HUDI-1822) [Umbrella] Range index support

2021-04-21 Thread satish (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] satish updated HUDI-1822: - Component/s: Index > [Umbrella] Range index support > -- > > Key: HUDI

[jira] [Updated] (HUDI-1822) [Umbrella] Range index support

2021-04-21 Thread satish (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] satish updated HUDI-1822: - Issue Type: New Feature (was: Bug) > [Umbrella] Range index support > -- > >

[GitHub] [hudi] abhijeetkushe edited a comment on issue #2850: [SUPPORT] S3 files skipped by HoodieDeltaStreamer on s3 bucket in continuous mode

2021-04-21 Thread GitBox
abhijeetkushe edited a comment on issue #2850: URL: https://github.com/apache/hudi/issues/2850#issuecomment-824070964 @nsivabalan I am using the default source limit, i.e. 9223372036854775807, so this is not directly connected with my issue. But I wanted to talk about another related issue. I

[GitHub] [hudi] satishkotha merged pull request #2678: [HUDI-1746] Added support for replace commits in commit showpartitions, commit show_write_stats, commit showfiles

2021-04-21 Thread GitBox
satishkotha merged pull request #2678: URL: https://github.com/apache/hudi/pull/2678

[hudi] branch master updated: [HUDI-1746] Added support for replace commits in commit showpartitions, commit show_write_stats, commit showfiles (#2678)

2021-04-21 Thread satish
satish pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 4a34318 [HUDI-1746] Added support for replace com

[hudi] branch master updated (c24d90d -> b31c520)

2021-04-21 Thread satish
satish pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from c24d90d [MINOR] Expose the detailed exception object (#2861) add b31c520 [HUDI-1714] Added tests to TestHoodieTi

[GitHub] [hudi] satishkotha merged pull request #2677: [HUDI-1714] Added tests to TestHoodieTimelineArchiveLog for the archival of compl…

2021-04-21 Thread GitBox
satishkotha merged pull request #2677: URL: https://github.com/apache/hudi/pull/2677 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, pleas

[GitHub] [hudi] vinothchandar edited a comment on pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support

2021-04-21 Thread GitBox
vinothchandar edited a comment on pull request #2645: URL: https://github.com/apache/hudi/pull/2645#issuecomment-824222572 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. F

[GitHub] [hudi] vinothchandar commented on pull request #2645: [HUDI-1659] Basic Implementation Of Spark Sql Support

2021-04-21 Thread GitBox
vinothchandar commented on pull request #2645: URL: https://github.com/apache/hudi/pull/2645#issuecomment-824222572 @pengzhiwei2018 can we file follow-ups from this review as sub-tasks under the same umbrella JIRA? I spent some time looking at Snowflake and BigQuery and what kind of

[jira] [Issue Comment Deleted] (HUDI-1764) Add support for Hudi CLI tools to schedule and run clustering

2021-04-21 Thread Jintao (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jintao updated HUDI-1764: - Comment: was deleted (was: Cancel) > Add support for Hudi CLI tools to schedule and run clustering >

[jira] [Closed] (HUDI-1764) Add support for Hudi CLI tools to schedule and run clustering

2021-04-21 Thread Jintao (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jintao closed HUDI-1764. Resolution: Done Cancel > Add support for Hudi CLI tools to schedule and run clustering > -

[jira] [Updated] (HUDI-1764) Add support for Hudi CLI tools to schedule and run clustering

2021-04-21 Thread Jintao (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jintao updated HUDI-1764: - Status: In Progress (was: Open) > Add support for Hudi CLI tools to schedule and run clustering > ---

[GitHub] [hudi] nsivabalan commented on a change in pull request #2716: [HUDI-1718] when query incr view of mor table which has Multi level partitions, the query failed

2021-04-21 Thread GitBox
nsivabalan commented on a change in pull request #2716: URL: https://github.com/apache/hudi/pull/2716#discussion_r617695415 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/hive/HoodieCombineHiveInputFormat.java ## @@ -170,7 +170,7 @@ protected HoodieCombineFi

[jira] [Commented] (HUDI-251) JDBC incremental load to HUDI with DeltaStreamer

2021-04-21 Thread Sagar Sumit (Jira)
[ https://issues.apache.org/jira/browse/HUDI-251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326678#comment-17326678 ] Sagar Sumit commented on HUDI-251: -- [~vinoth] Thanks for the clarifications. Now it makes

[jira] [Commented] (HUDI-733) presto query data error

2021-04-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326676#comment-17326676 ] sivabalan narayanan commented on HUDI-733: -- [~hj324545]: if the issue is resolved,

[jira] [Updated] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2021-04-21 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-874: - Labels: aws-emr sev:critical user-support-issues (was: sev:critical user-support-issues)

[GitHub] [hudi] njalan commented on issue #2791: [SUPPORT]Failed to enable hoodie.metadata.enable

2021-04-21 Thread GitBox
njalan commented on issue #2791: URL: https://github.com/apache/hudi/issues/2791#issuecomment-824127927 @prashantwason By the way, the table is not a partitioned table. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

[hudi] branch master updated (cc81ddd -> c24d90d)

2021-04-21 Thread leesf
This is an automated email from the ASF dual-hosted git repository. leesf pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from cc81ddd [HUDI-1812] Add explicit index state TTL option for Flink writer (#2853) add c24d90d [MINOR] Expose the

[GitHub] [hudi] leesf merged pull request #2861: [MINOR] Expose the detailed exception object for table not found

2021-04-21 Thread GitBox
leesf merged pull request #2861: URL: https://github.com/apache/hudi/pull/2861 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please cont

[GitHub] [hudi] abhijeetkushe edited a comment on issue #2850: [SUPPORT] S3 files skipped by HoodieDeltaStreamer on s3 bucket in continuous mode

2021-04-21 Thread GitBox
abhijeetkushe edited a comment on issue #2850: URL: https://github.com/apache/hudi/issues/2850#issuecomment-824070964 @nsivabalan I am using the default source limit, i.e. 9223372036854775807, so this is not directly my issue. But I wanted to talk about another related issue. I realized while

[GitHub] [hudi] abhijeetkushe edited a comment on issue #2850: [SUPPORT] S3 files skipped by HoodieDeltaStreamer on s3 bucket in continuous mode

2021-04-21 Thread GitBox
abhijeetkushe edited a comment on issue #2850: URL: https://github.com/apache/hudi/issues/2850#issuecomment-824070964 @nsivabalan That is another issue which I wanted to ask about. I also realized that Hudi's checkpoint method has a bug, so doing something like this would be needed -> [aws-glue-job-bo

[GitHub] [hudi] abhijeetkushe commented on issue #2850: [SUPPORT] S3 files skipped by HoodieDeltaStreamer on s3 bucket in continuous mode

2021-04-21 Thread GitBox
abhijeetkushe commented on issue #2850: URL: https://github.com/apache/hudi/issues/2850#issuecomment-824070964 @nsivabalan That is another issue which I wanted to ask about. I also realized that Hudi's checkpoint method has a bug, so doing something like this would be needed -> [aws-glue-job-bookma

[GitHub] [hudi] vburenin commented on pull request #2598: [HUDI-1648] Added custom kafka meta fields and custom kafka avro decoder.

2021-04-21 Thread GitBox
vburenin commented on pull request #2598: URL: https://github.com/apache/hudi/pull/2598#issuecomment-824052316 There is no related PR yet. I've been planning to contribute this feature to upstream in Q2, so far it seems possible. -- This is an automated message from the Apache Git Servic

[jira] [Comment Edited] (HUDI-1668) GlobalSortPartitioner is getting called twice during bulk_insert.

2021-04-21 Thread Sugamber (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326525#comment-17326525 ] Sugamber edited comment on HUDI-1668 at 4/21/21, 1:14 PM: -- I've a

[jira] [Commented] (HUDI-1668) GlobalSortPartitioner is getting called twice during bulk_insert.

2021-04-21 Thread Sugamber (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326526#comment-17326526 ] Sugamber commented on HUDI-1668: [~shivnarayan]  I see Global sort executed twice in this

[jira] [Commented] (HUDI-1668) GlobalSortPartitioner is getting called twice during bulk_insert.

2021-04-21 Thread Sugamber (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17326525#comment-17326525 ] Sugamber commented on HUDI-1668: I've attached both screenshots. !Screenshot 2021-04-2
