[GitHub] [hudi] vinothchandar merged pull request #1693: [HUDI-985] Introduce rerun ci bot

2020-07-22 Thread GitBox
vinothchandar merged pull request #1693: URL: https://github.com/apache/hudi/pull/1693 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [hudi] vinothchandar commented on pull request #1810: [HUDI-875] Abstract hudi-sync-common, and support hudi-hive-sync

2020-07-22 Thread GitBox
vinothchandar commented on pull request #1810: URL: https://github.com/apache/hudi/pull/1810#issuecomment-662830316 @lw309637554 can you please give me a couple days. I am trying to prioritize all the 0.6.0 blockers for now.

[GitHub] [hudi] vinothchandar commented on pull request #1817: [HUDI-651] Fix incremental queries in MOR tables

2020-07-22 Thread GitBox
vinothchandar commented on pull request #1817: URL: https://github.com/apache/hudi/pull/1817#issuecomment-662830116 @garyli1019 are you talking about corner cases not handled in this PR? can you review the PR once for intended functionality? I am trying to see if this can help

[GitHub] [hudi] vinothchandar commented on pull request #1838: [HUDI-1082] Fix minor bug in deciding the insert buckets

2020-07-22 Thread GitBox
vinothchandar commented on pull request #1838: URL: https://github.com/apache/hudi/pull/1838#issuecomment-662829522 @shenh062326 are you able to add this as a test case and update the PR?

[GitHub] [hudi] leesf merged pull request #1851: [HUDI-1113] Add user define metrics reporter

2020-07-22 Thread GitBox
leesf merged pull request #1851: URL: https://github.com/apache/hudi/pull/1851

[hudi] branch master updated: [HUDI-1113] Add user define metrics reporter (#1851)

2020-07-22 Thread leesf
This is an automated email from the ASF dual-hosted git repository. leesf pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new c39778c [HUDI-1113] Add user define metrics

[GitHub] [hudi] leesf commented on pull request #1851: [HUDI-1113] Add user define metrics reporter

2020-07-22 Thread GitBox
leesf commented on pull request #1851: URL: https://github.com/apache/hudi/pull/1851#issuecomment-662828528 > @leesf > sorry for letting you check so many times, I changed the test class name, please review it no worries..

[GitHub] [hudi] vinothchandar commented on a change in pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-22 Thread GitBox
vinothchandar commented on a change in pull request #1848: URL: https://github.com/apache/hudi/pull/1848#discussion_r459224482 ## File path: hudi-spark/src/main/scala/org/apache/hudi/HudiMergeOnReadRDD.scala ## @@ -0,0 +1,195 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] vinothchandar commented on a change in pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-22 Thread GitBox
vinothchandar commented on a change in pull request #1848: URL: https://github.com/apache/hudi/pull/1848#discussion_r459224106 ## File path: hudi-spark/src/main/scala/org/apache/hudi/SnapshotRelation.scala ## @@ -0,0 +1,113 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [hudi] vinothchandar commented on a change in pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-22 Thread GitBox
vinothchandar commented on a change in pull request #1848: URL: https://github.com/apache/hudi/pull/1848#discussion_r459224042 ## File path: hudi-spark/src/main/scala/org/apache/hudi/SnapshotRelation.scala ## @@ -0,0 +1,113 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [hudi] zherenyu831 commented on pull request #1851: [HUDI-1113] Add user define metrics reporter

2020-07-22 Thread GitBox
zherenyu831 commented on pull request #1851: URL: https://github.com/apache/hudi/pull/1851#issuecomment-662826386 @leesf sorry for letting you check so many times, I changed the test class name, please review it

[GitHub] [hudi] vinothchandar commented on a change in pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-22 Thread GitBox
vinothchandar commented on a change in pull request #1848: URL: https://github.com/apache/hudi/pull/1848#discussion_r459222574 ## File path: hudi-spark/src/main/scala/org/apache/hudi/DataSourceOptions.scala ## @@ -110,6 +112,10 @@ object DataSourceReadOptions { */ val

[hudi] branch asf-site updated: Travis CI build asf-site

2020-07-22 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 4328d28 Travis CI build asf-site 4328d28 is

[hudi] branch asf-site updated: [DOC] Add document for the use of metrics system in Hudi. (#1769)

2020-07-22 Thread leesf
This is an automated email from the ASF dual-hosted git repository. leesf pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 8cb5617 [DOC] Add document for the use of

[GitHub] [hudi] leesf merged pull request #1769: [DOC] Add document for the use of metrics system in Hudi.

2020-07-22 Thread GitBox
leesf merged pull request #1769: URL: https://github.com/apache/hudi/pull/1769

[GitHub] [hudi] zherenyu831 commented on a change in pull request #1851: [HUDI-1113] Add user define metrics reporter

2020-07-22 Thread GitBox
zherenyu831 commented on a change in pull request #1851: URL: https://github.com/apache/hudi/pull/1851#discussion_r459214674 ## File path: hudi-client/src/test/java/org/apache/hudi/metrics/TestMetricsReporterFactory.java ## @@ -45,4 +52,53 @@ public void

[GitHub] [hudi] garyli1019 commented on issue #1864: Spark 2.2.0 is compatible?

2020-07-22 Thread GitBox
garyli1019 commented on issue #1864: URL: https://github.com/apache/hudi/issues/1864#issuecomment-662818399 Hi @hrmguilherme2, there are some users using Spark 2.2 with Hudi. Please feel free to give it a try :)

[hudi] branch master updated (a8bd76c -> 3dd189e)

2020-07-22 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from a8bd76c [HUDI-1029] In inline compaction mode, previously failed compactions needs to be retried before new

[GitHub] [hudi] vinothchandar merged pull request #1865: [MINOR] Fix checkstyle issue on TestHoodieClientOnCopyOnWriteStorage

2020-07-22 Thread GitBox
vinothchandar merged pull request #1865: URL: https://github.com/apache/hudi/pull/1865

[GitHub] [hudi] bvaradar merged pull request #1857: [HUDI-1029] In inline compaction mode, previously failed compactions …

2020-07-22 Thread GitBox
bvaradar merged pull request #1857: URL: https://github.com/apache/hudi/pull/1857

[hudi] branch master updated: [HUDI-1029] In inline compaction mode, previously failed compactions needs to be retried before new compactions (#1857)

2020-07-22 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new a8bd76c [HUDI-1029] In inline compaction mode,

[GitHub] [hudi] xushiyan closed pull request #1861: [HUDI-781] [WIP] Refactor test utils classes

2020-07-22 Thread GitBox
xushiyan closed pull request #1861: URL: https://github.com/apache/hudi/pull/1861

[GitHub] [hudi] nsivabalan commented on issue #1840: HUDI DELETE

2020-07-22 Thread GitBox
nsivabalan commented on issue #1840: URL: https://github.com/apache/hudi/issues/1840#issuecomment-662806446 it is already closed.

[GitHub] [hudi] leesf commented on a change in pull request #1851: [HUDI-1113] Add user define metrics reporter

2020-07-22 Thread GitBox
leesf commented on a change in pull request #1851: URL: https://github.com/apache/hudi/pull/1851#discussion_r459200671 ## File path: hudi-client/src/test/java/org/apache/hudi/metrics/TestMetricsReporterFactory.java ## @@ -45,4 +52,53 @@ public void

[GitHub] [hudi] leesf commented on a change in pull request #1851: [HUDI-1113] Add user define metrics reporter

2020-07-22 Thread GitBox
leesf commented on a change in pull request #1851: URL: https://github.com/apache/hudi/pull/1851#discussion_r459200519 ## File path: hudi-client/src/test/java/org/apache/hudi/metrics/TestMetricsReporterFactory.java ## @@ -45,4 +52,53 @@ public void

[GitHub] [hudi] vinothchandar opened a new pull request #1865: [MINOR] Fix checkstyle issue on TestHoodieClientOnCopyOnWriteStorage

2020-07-22 Thread GitBox
vinothchandar opened a new pull request #1865: URL: https://github.com/apache/hudi/pull/1865 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of

[GitHub] [hudi] vinothchandar commented on pull request #1865: [MINOR] Fix checkstyle issue on TestHoodieClientOnCopyOnWriteStorage

2020-07-22 Thread GitBox
vinothchandar commented on pull request #1865: URL: https://github.com/apache/hudi/pull/1865#issuecomment-662804672 Please don't merge until CI passes.

[GitHub] [hudi] hrmguilherme2 opened a new issue #1864: Spark 2.2.0 is compatible?

2020-07-22 Thread GitBox
hrmguilherme2 opened a new issue #1864: URL: https://github.com/apache/hudi/issues/1864 My cluster has Spark version 2.2.0, but I can see that Hudi uses Spark 2.4.4. Can I still use it?

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #347

2020-07-22 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.34 KB...] /home/jenkins/tools/maven/apache-maven-3.5.4/conf: logging settings.xml toolchains.xml

[hudi] branch master updated (5b6026b -> 9bd37ef)

2020-07-22 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 5b6026b [HUDI-802] Fixing deletes for inserts in same batch in write path (#1792) add 9bd37ef [MINOR] Fix

[GitHub] [hudi] nsivabalan merged pull request #1863: [MINOR] Fix flaky testUpsertsUpdatePartitionPath* tests

2020-07-22 Thread GitBox
nsivabalan merged pull request #1863: URL: https://github.com/apache/hudi/pull/1863

[GitHub] [hudi] vinothchandar merged pull request #1792: [HUDI-802] Fixing deletes for inserts in same batch in write path

2020-07-22 Thread GitBox
vinothchandar merged pull request #1792: URL: https://github.com/apache/hudi/pull/1792

[hudi] branch master updated (12ef8c9 -> 5b6026b)

2020-07-22 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 12ef8c9 [HUDI-708] Add temps show and unit test for TempViewCommand (#1770) add 5b6026b [HUDI-802] Fixing

[GitHub] [hudi] vinothchandar opened a new pull request #1863: [MINOR] Fix flaky testUpsertsUpdatePartitionPath* tests

2020-07-22 Thread GitBox
vinothchandar opened a new pull request #1863: URL: https://github.com/apache/hudi/pull/1863 - Passed 20+ times locally now ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull

[jira] [Created] (HUDI-1121) Provide a document describing how to use callback

2020-07-22 Thread wangxianghu (Jira)
wangxianghu created HUDI-1121: - Summary: Provide a document describing how to use callback Key: HUDI-1121 URL: https://issues.apache.org/jira/browse/HUDI-1121 Project: Apache Hudi Issue Type:

[jira] [Assigned] (HUDI-1103) Improve the code format of Delete data demo in Quick-Start Guide

2020-07-22 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangxianghu reassigned HUDI-1103: - Assignee: Trevorzhang (was: wangxianghu) > Improve the code format of Delete data demo in

[GitHub] [hudi] Mathieu1124 commented on pull request #1842: [HUDI-1037]Introduce a write committed callback hook

2020-07-22 Thread GitBox
Mathieu1124 commented on pull request #1842: URL: https://github.com/apache/hudi/pull/1842#issuecomment-662774862 > @Mathieu1124 also please document how to use the callback in the website. ok, thanks for your review @leesf @yanghua

[GitHub] [hudi] vinothchandar commented on pull request #1857: [HUDI-1029] In inline compaction mode, previously failed compactions …

2020-07-22 Thread GitBox
vinothchandar commented on pull request #1857: URL: https://github.com/apache/hudi/pull/1857#issuecomment-662773187 @bvaradar ready for review.

[GitHub] [hudi] garyli1019 commented on a change in pull request #1862: [HUDI-1120] Support spotless for scala

2020-07-22 Thread GitBox
garyli1019 commented on a change in pull request #1862: URL: https://github.com/apache/hudi/pull/1862#discussion_r459162879 ## File path: style/scalafmt.conf ## @@ -0,0 +1,52 @@ + +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license

[GitHub] [hudi] leesf commented on pull request #1842: [HUDI-1037]Introduce a write committed callback hook

2020-07-22 Thread GitBox
leesf commented on pull request #1842: URL: https://github.com/apache/hudi/pull/1842#issuecomment-662769053 @Mathieu1124 also please document how to use the callback in the website.

[jira] [Updated] (HUDI-1120) Support spotless for scala

2020-07-22 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1120: - Component/s: Code Cleanup > Support spotless for scala > -- > >

[jira] [Updated] (HUDI-1120) Support spotless for scala

2020-07-22 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1120: - Fix Version/s: 0.6.0 > Support spotless for scala > -- > >

[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2020-07-22 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163151#comment-17163151 ] Vinoth Chandar commented on HUDI-1015: -- let's create sub tasks here? > Audit all

[GitHub] [hudi] garyli1019 commented on pull request #1862: [HUDI-1120] Support spotless for scala

2020-07-22 Thread GitBox
garyli1019 commented on pull request #1862: URL: https://github.com/apache/hudi/pull/1862#issuecomment-662768085 @leesf Can we bring back the scala support for spotless first? Right now it looks like we don't have any check for scala code.

[jira] [Updated] (HUDI-1120) Support spotless for scala

2020-07-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1120: - Labels: pull-request-available (was: ) > Support spotless for scala > --

[GitHub] [hudi] garyli1019 opened a new pull request #1862: [HUDI-1120] Support spotless for scala

2020-07-22 Thread GitBox
garyli1019 opened a new pull request #1862: URL: https://github.com/apache/hudi/pull/1862 ## What is the purpose of the pull request Support spotless for Scala. ## Brief change log - *Add initial support for spotless with scalafmt* ## Verify this pull request

[jira] [Closed] (HUDI-708) Add unit test for TempViewCommand

2020-07-22 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-708. - Fix Version/s: 0.6.0 Resolution: Implemented Implemented via master branch:

[jira] [Updated] (HUDI-708) Add unit test for TempViewCommand

2020-07-22 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-708: -- Status: Open (was: New) > Add unit test for TempViewCommand > - > >

[hudi] branch master updated (743ef32 -> 12ef8c9)

2020-07-22 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 743ef32 [HUDI-871] Add support for Tencent Cloud Object Storage(COS) (#1855) add 12ef8c9 [HUDI-708] Add

[GitHub] [hudi] yanghua merged pull request #1770: [HUDI-708]Add temps show and unit test for TempViewCommand

2020-07-22 Thread GitBox
yanghua merged pull request #1770: URL: https://github.com/apache/hudi/pull/1770

[GitHub] [hudi] yanghua commented on a change in pull request #1770: [HUDI-708]Add temps show and unit test for TempViewCommand

2020-07-22 Thread GitBox
yanghua commented on a change in pull request #1770: URL: https://github.com/apache/hudi/pull/1770#discussion_r459159775 ## File path: hudi-cli/src/main/java/org/apache/hudi/cli/commands/TempViewCommand.java ## @@ -20,36 +20,55 @@ import org.apache.hudi.cli.HoodieCLI;

[jira] [Updated] (HUDI-1120) Support spotless for scala

2020-07-22 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1120: - Status: In Progress (was: Open) > Support spotless for scala > -- > >

[jira] [Updated] (HUDI-1120) Support spotless for scala

2020-07-22 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li updated HUDI-1120: - Status: Open (was: New) > Support spotless for scala > -- > >

[jira] [Created] (HUDI-1120) Support spotless for scala

2020-07-22 Thread Yanjia Gary Li (Jira)
Yanjia Gary Li created HUDI-1120: Summary: Support spotless for scala Key: HUDI-1120 URL: https://issues.apache.org/jira/browse/HUDI-1120 Project: Apache Hudi Issue Type: Sub-task

[GitHub] [hudi] vinothchandar commented on pull request #1792: [HUDI-802] Fixing deletes for inserts in same batch in write path

2020-07-22 Thread GitBox
vinothchandar commented on pull request #1792: URL: https://github.com/apache/hudi/pull/1792#issuecomment-662763386 ``` [ERROR] Failures: 2552[ERROR] org.apache.hudi.client.TestHoodieClientOnCopyOnWriteStorage.testUpsertsUpdatePartitionPathGlobalBloom(HoodieIndex$IndexType)

[GitHub] [hudi] qingyuan18 commented on issue #1854: query MOR table using spark sql error

2020-07-22 Thread GitBox
qingyuan18 commented on issue #1854: URL: https://github.com/apache/hudi/issues/1854#issuecomment-662761051 > Is the table (._acidtest2) registered as Hive table. If so, can you provide the complete table description of the table (desc formatted > > ) in Hive metastore.

[GitHub] [hudi] asheeshgarg commented on issue #1825: [SUPPORT] Compaction of parquet and meta file

2020-07-22 Thread GitBox
asheeshgarg commented on issue #1825: URL: https://github.com/apache/hudi/issues/1825#issuecomment-662709245 @bvaradar mostly I see : org.apache.hudi.exception.HoodieRollbackException: Found in-flight commits  after time :20200722052838, please rollback greater commits first Does

[GitHub] [hudi] umehrot2 commented on pull request #1848: [HUDI-69] Support Spark Datasource for MOR table - RDD approach

2020-07-22 Thread GitBox
umehrot2 commented on pull request #1848: URL: https://github.com/apache/hudi/pull/1848#issuecomment-662699185 > @umehrot2 can you also please make a quick second pass. @vinothchandar @garyli1019 Sorry for the late response. Plan to review it later today.

[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2020-07-22 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163073#comment-17163073 ] Udit Mehrotra commented on HUDI-874: This has been fixed by EMR folks, but the fix will make it in

[GitHub] [hudi] umehrot2 commented on issue #1856: [SUPPORT] HiveSyncTool fails on alter table cascade

2020-07-22 Thread GitBox
umehrot2 commented on issue #1856: URL: https://github.com/apache/hudi/issues/1856#issuecomment-662694529 @GurRonenExplorium @bvaradar EMR team is aware of this issue when working with Glue metastore. We have fixed it, however it will only be provided in the future EMR releases. Right now

[GitHub] [hudi] stackfun edited a comment on issue #1860: [SUPPORT] Issue when querying from Spark Datasource if COW table is being written to at the same time

2020-07-22 Thread GitBox
stackfun edited a comment on issue #1860: URL: https://github.com/apache/hudi/issues/1860#issuecomment-662658795 @bvaradar Thanks for your quick response. I ran the same test but running the hive query first, then the spark query and I'm still seeing similar results. ``` Hive

[GitHub] [hudi] stackfun commented on issue #1860: [SUPPORT] Issue when querying from Spark Datasource if COW table is being written to at the same time

2020-07-22 Thread GitBox
stackfun commented on issue #1860: URL: https://github.com/apache/hudi/issues/1860#issuecomment-662658795 I ran the same test but running the hive query first, then the spark query and I'm still seeing similar results. ``` Hive Query: ++ |count(1)| ++

[GitHub] [hudi] bvaradar commented on issue #1852: [SUPPORT]

2020-07-22 Thread GitBox
bvaradar commented on issue #1852: URL: https://github.com/apache/hudi/issues/1852#issuecomment-662638790 Ended up creating a new jira: https://issues.apache.org/jira/browse/HUDI-1119 as this has a different cause.

[jira] [Created] (HUDI-1119) MOR appends slow due to file listing in executor side for finding the log file

2020-07-22 Thread Balaji Varadarajan (Jira)
Balaji Varadarajan created HUDI-1119: Summary: MOR appends slow due to file listing in executor side for finding the log file Key: HUDI-1119 URL: https://issues.apache.org/jira/browse/HUDI-1119

[jira] [Updated] (HUDI-1119) MOR appends slow due to file listing in executor side for finding the log file

2020-07-22 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1119: - Status: Open (was: New) > MOR appends slow due to file listing in executor side for

[jira] [Updated] (HUDI-1119) MOR appends slow due to file listing in executor side for finding the log file

2020-07-22 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1119: - Labels: perf (was: ) > MOR appends slow due to file listing in executor side for finding

[jira] [Comment Edited] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2020-07-22 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163018#comment-17163018 ] Balaji Varadarajan edited comment on HUDI-1015 at 7/22/20, 7:08 PM:

[GitHub] [hudi] asheeshgarg removed a comment on issue #1787: Exception During Insert

2020-07-22 Thread GitBox
asheeshgarg removed a comment on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-662610773 @bvaradar I have used the --jars option in submit; other jars are picked up as well. I also see the class is there but I am still getting the same error.

[GitHub] [hudi] bvaradar commented on issue #1852: [SUPPORT]

2020-07-22 Thread GitBox
bvaradar commented on issue #1852: URL: https://github.com/apache/hudi/issues/1852#issuecomment-662632930 We have a jira : https://issues.apache.org/jira/browse/HUDI-1015 to improve/avoid listing. I have added this case to the jira.

[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2020-07-22 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163018#comment-17163018 ] Balaji Varadarajan commented on HUDI-1015: -- Another place where we do listing in executor : 

[GitHub] [hudi] bvaradar commented on issue #1852: [SUPPORT]

2020-07-22 Thread GitBox
bvaradar commented on issue #1852: URL: https://github.com/apache/hudi/issues/1852#issuecomment-662630342 Sorry, I did not realize that. Let me check and get back

[GitHub] [hudi] bvaradar commented on issue #1825: [SUPPORT] Compaction of parquet and meta file

2020-07-22 Thread GitBox
bvaradar commented on issue #1825: URL: https://github.com/apache/hudi/issues/1825#issuecomment-662628377 @asheeshgarg : Yes, you should see that the spark job failed and its logs should tell you what is wrong.

[GitHub] [hudi] ssomuah commented on issue #1852: [SUPPORT]

2020-07-22 Thread GitBox
ssomuah commented on issue #1852: URL: https://github.com/apache/hudi/issues/1852#issuecomment-662626520 1. I'm trying this now. 2. The stack trace is the one I provided above.

[GitHub] [hudi] asheeshgarg commented on issue #1825: [SUPPORT] Compaction of parquet and meta file

2020-07-22 Thread GitBox
asheeshgarg commented on issue #1825: URL: https://github.com/apache/hudi/issues/1825#issuecomment-662612691 @bvaradar are you suggesting I look at the spark logs during ingestion, or some other logs?

[GitHub] [hudi] asheeshgarg commented on issue #1787: Exception During Insert

2020-07-22 Thread GitBox
asheeshgarg commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-662610773 @bvaradar I have used the --jars option in submit; other jars are picked up as well. I also see the class is there but I am still getting the same error.

[GitHub] [hudi] vinothchandar commented on pull request #1149: [HUDI-472] Introduce configurations and new modes of sorting for bulk_insert

2020-07-22 Thread GitBox
vinothchandar commented on pull request #1149: URL: https://github.com/apache/hudi/pull/1149#issuecomment-662584698 >Do you think we need this? If not, I can remove this class (also getting rid of the scala imports). @yihua yes, let's remove this for now. globally_sorted,

[GitHub] [hudi] vinothchandar commented on a change in pull request #1149: [HUDI-472] Introduce configurations and new modes of sorting for bulk_insert

2020-07-22 Thread GitBox
vinothchandar commented on a change in pull request #1149: URL: https://github.com/apache/hudi/pull/1149#discussion_r458958468 ## File path: hudi-client/src/main/java/org/apache/hudi/execution/CopyOnWriteInsertHandler.java ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] Mathieu1124 commented on a change in pull request #1842: [HUDI-1037]Introduce a write committed callback hook

2020-07-22 Thread GitBox
Mathieu1124 commented on a change in pull request #1842: URL: https://github.com/apache/hudi/pull/1842#discussion_r458926719 ## File path: hudi-client/src/test/java/org/apache/hudi/testutils/HoodieWriteCommitTestHarness.java ## @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] Mathieu1124 commented on a change in pull request #1842: [HUDI-1037]Introduce a write committed callback hook

2020-07-22 Thread GitBox
Mathieu1124 commented on a change in pull request #1842: URL: https://github.com/apache/hudi/pull/1842#discussion_r458922358 ## File path: hudi-client/src/main/java/org/apache/hudi/callback/util/HoodieCommitCallbackFactory.java ## @@ -0,0 +1,43 @@ +/* + * Licensed to the

[GitHub] [hudi] Mathieu1124 commented on a change in pull request #1842: [HUDI-1037]Introduce a write committed callback hook

2020-07-22 Thread GitBox
Mathieu1124 commented on a change in pull request #1842: URL: https://github.com/apache/hudi/pull/1842#discussion_r458917677 ## File path: hudi-client/src/test/java/org/apache/hudi/callback/http/TestHoodieWriteCommitHttpCallback.java ## @@ -0,0 +1,53 @@ +/* + * Licensed to

[GitHub] [hudi] Mathieu1124 commented on a change in pull request #1842: [HUDI-1037]Introduce a write committed callback hook

2020-07-22 Thread GitBox
Mathieu1124 commented on a change in pull request #1842: URL: https://github.com/apache/hudi/pull/1842#discussion_r458916801 ## File path: hudi-client/src/main/java/org/apache/hudi/callback/util/HoodieCommitCallbackFactory.java ## @@ -0,0 +1,43 @@ +/* + * Licensed to the

[jira] [Commented] (HUDI-874) Schema evolution does not work with AWS Glue catalog

2020-07-22 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162918#comment-17162918 ] Balaji Varadarajan commented on HUDI-874: - This issue keeps coming up. New ticket: 

[GitHub] [hudi] bvaradar commented on issue #1856: [SUPPORT] HiveSyncTool fails on alter table cascade

2020-07-22 Thread GitBox
bvaradar commented on issue #1856: URL: https://github.com/apache/hudi/issues/1856#issuecomment-662544225 cc @umehrot2

[GitHub] [hudi] bvaradar commented on issue #1825: [SUPPORT] Compaction of parquet and meta file

2020-07-22 Thread GitBox
bvaradar commented on issue #1825: URL: https://github.com/apache/hudi/issues/1825#issuecomment-662543343 @asheeshgarg : I can only see rollback files here. These should be cleaned up when the HUDI-1118 is added. BTW, this actually points out that you are seeing (or had seen) lots of

[GitHub] [hudi] bvaradar commented on issue #1787: Exception During Insert

2020-07-22 Thread GitBox
bvaradar commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-662540757 @asheeshgarg : I can see the class in the jar. Please check if you are using the right options in adding the jar to spark driver. MacBook-Pro:oss balaji.varadarajan$ jar tf

[GitHub] [hudi] bvaradar commented on issue #1840: HUDI DELETE

2020-07-22 Thread GitBox
bvaradar commented on issue #1840: URL: https://github.com/apache/hudi/issues/1840#issuecomment-662536376 @reenarosid @nsivabalan : Can we close this issue?

[GitHub] [hudi] bvaradar commented on issue #1860: [SUPPORT] Issue when querying from Spark Datasource if COW table is being written to at the same time

2020-07-22 Thread GitBox
bvaradar commented on issue #1860: URL: https://github.com/apache/hudi/issues/1860#issuecomment-662535162 @stackfun : You are running spark query first followed by hive query. Between the 2 runs, hudi would have committed the data and that could be the reason you are seeing inconsistent

[GitHub] [hudi] bvaradar commented on issue #1845: [SUPPORT] Support for Schema evolution. Facing an error

2020-07-22 Thread GitBox
bvaradar commented on issue #1845: URL: https://github.com/apache/hudi/issues/1845#issuecomment-662533183 Thanks @n3nash @sbernauer : I think the exception you are seeing in the production could be because of different reasons than the tests. I would like to decouple them and focus

[GitHub] [hudi] asheeshgarg commented on issue #1825: [SUPPORT] Compaction of parquet and meta file

2020-07-22 Thread GitBox
asheeshgarg commented on issue #1825: URL: https://github.com/apache/hudi/issues/1825#issuecomment-662530755 @bvaradar the content of .hoodie is listed at https://gist.github.com/asheeshgarg/8897de60ab6ba78b5847f5432a4a69dd

[GitHub] [hudi] asheeshgarg commented on issue #1787: Exception During Insert

2020-07-22 Thread GitBox
asheeshgarg commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-662525238 @bvaradar I added https://mvnrepository.com/artifact/com.tdunning/json/1.8/json-1.8.jar to spark jars but still facing the same issue An error occurred while calling o179.save.

[GitHub] [hudi] yanghua commented on a change in pull request #1842: [HUDI-1037]Introduce a write committed callback hook

2020-07-22 Thread GitBox
yanghua commented on a change in pull request #1842: URL: https://github.com/apache/hudi/pull/1842#discussion_r458873037 ## File path: hudi-client/src/main/java/org/apache/hudi/config/HoodieWriteCommitCallbackConfig.java ## @@ -0,0 +1,106 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] hddong commented on pull request #1774: [HUDI-703]Add unit test for HoodieSyncCommand

2020-07-22 Thread GitBox
hddong commented on pull request #1774: URL: https://github.com/apache/hudi/pull/1774#issuecomment-662496685 @yanghua : Thanks for your review, I have addressed them. This is an automated message from the Apache Git Service. To

[GitHub] [hudi] Mathieu1124 edited a comment on pull request #1827: [HUDI-1089] Refactor hudi-client to support multi-engine

2020-07-22 Thread GitBox
Mathieu1124 edited a comment on pull request #1827: URL: https://github.com/apache/hudi/pull/1827#issuecomment-662474645 > @Mathieu1124 , @leesf : @n3nash said he is half way through reviewing. I took another pass and this seems low risk enough for us to merge for 0.6.0. > > We have

[GitHub] [hudi] hddong commented on a change in pull request #1770: [HUDI-708]Add temps show and unit test for TempViewCommand

2020-07-22 Thread GitBox
hddong commented on a change in pull request #1770: URL: https://github.com/apache/hudi/pull/1770#discussion_r458821046 ## File path: hudi-cli/src/main/java/org/apache/hudi/cli/utils/SparkTempViewProvider.java ## @@ -101,6 +101,17 @@ public void runQuery(String sqlText) {

[GitHub] [hudi] hddong commented on a change in pull request #1770: [HUDI-708]Add temps show and unit test for TempViewCommand

2020-07-22 Thread GitBox
hddong commented on a change in pull request #1770: URL: https://github.com/apache/hudi/pull/1770#discussion_r458817753 ## File path: hudi-cli/src/main/java/org/apache/hudi/cli/commands/TempViewCommand.java ## @@ -20,36 +20,55 @@ import org.apache.hudi.cli.HoodieCLI;

[GitHub] [hudi] Mathieu1124 commented on pull request #1827: [HUDI-1089] Refactor hudi-client to support multi-engine

2020-07-22 Thread GitBox
Mathieu1124 commented on pull request #1827: URL: https://github.com/apache/hudi/pull/1827#issuecomment-662474645 > @Mathieu1124 , @leesf : @n3nash said he is half way through reviewing. I took another pass and this seems low risk enough for us to merge for 0.6.0. > > We have some

[GitHub] [hudi] hddong commented on a change in pull request #1770: [HUDI-708]Add temps show and unit test for TempViewCommand

2020-07-22 Thread GitBox
hddong commented on a change in pull request #1770: URL: https://github.com/apache/hudi/pull/1770#discussion_r458817019 ## File path: hudi-cli/src/main/java/org/apache/hudi/cli/HoodieCLI.java ## @@ -115,4 +115,16 @@ public static synchronized TempViewProvider

[GitHub] [hudi] GurRonenExplorium commented on issue #1856: [SUPPORT] HiveSyncTool fails on alter table cascade

2020-07-22 Thread GitBox
GurRonenExplorium commented on issue #1856: URL: https://github.com/apache/hudi/issues/1856#issuecomment-662442723 Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on to
