[GitHub] [hudi] yanghua commented on pull request #1871: [HUDI-781] Introduce HoodieTestTable for test preparation

2020-08-10 Thread GitBox
yanghua commented on pull request #1871: URL: https://github.com/apache/hudi/pull/1871#issuecomment-671742888 > @xushiyan @yanghua : This PR is causing lot of merge conflicts to a blocker PR which we needed to merge by tonight and I am unable to resolve conflicts in time. I am reverting

[GitHub] [hudi] RajasekarSribalan commented on issue #1939: [SUPPORT] Hudi creating parquet with huge size and not in sink with limitFileSize

2020-08-10 Thread GitBox
RajasekarSribalan commented on issue #1939: URL: https://github.com/apache/hudi/issues/1939#issuecomment-671742308 Yes @bvaradar we do an initial bulk insert and then upsert for subsequent operations.! I configured hoodie.copyonwrite.record.size.estimate to 128 while taking initial load
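To make the sizing concern in this thread concrete, here is a rough arithmetic sketch of why a low record size estimate can produce oversized files. It assumes, as a simplification, that a writer plans the number of records per file by dividing a target file size by the estimated record size. The class `RecordSizeEstimate`, the 120 MB target, and the 4096-byte actual record size are illustrative assumptions, not values from the issue. Only the 128-byte estimate comes from the comment above. Hudi's real sizing logic has more inputs and is not reproduced here.

```java
public class RecordSizeEstimate {

    // Records planned per file if the writer simply divides the target
    // file size by the estimated size of one record (a simplification).
    static long recordsPerFile(long targetFileBytes, long estimatedRecordBytes) {
        return targetFileBytes / estimatedRecordBytes;
    }

    // Size the file actually reaches when every record turns out to be
    // actualRecordBytes large (compression is ignored in this sketch).
    static long resultingFileBytes(long records, long actualRecordBytes) {
        return records * actualRecordBytes;
    }

    public static void main(String[] args) {
        long targetFileBytes = 120L * 1024 * 1024; // assumed target file size
        long estimatedRecordBytes = 128;           // estimate quoted in the issue
        long actualRecordBytes = 4096;             // hypothetical LOB-heavy record

        long records = recordsPerFile(targetFileBytes, estimatedRecordBytes);
        long actualBytes = resultingFileBytes(records, actualRecordBytes);

        System.out.println("planned records per file: " + records);
        System.out.println("resulting file size (MiB): " + actualBytes / (1024 * 1024));
    }
}
```

Under these assumptions the planned record count is 32 times too high, so the file overshoots its target by the same factor. That is why the estimate matters for records carrying large objects.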

[GitHub] [hudi] vinothchandar commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-10 Thread GitBox
vinothchandar commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468332027 ## File path: hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala ## @@ -108,262 +106,280 @@ private[hudi] object

[GitHub] [hudi] bvaradar commented on pull request #1871: [HUDI-781] Introduce HoodieTestTable for test preparation

2020-08-10 Thread GitBox
bvaradar commented on pull request #1871: URL: https://github.com/apache/hudi/pull/1871#issuecomment-671730869 Thanks @xushiyan This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [hudi] zhedoubushishi commented on pull request #1870: [HUDI-808] Support cleaning bootstrap source data

2020-08-10 Thread GitBox
zhedoubushishi commented on pull request #1870: URL: https://github.com/apache/hudi/pull/1870#issuecomment-671730790 LGTM to me. Thanks for the implementation of versioning part @bvaradar ! Only left some minor comments. I noticed that there's a conflict with another commit but it seems

[GitHub] [hudi] xushiyan commented on pull request #1871: [HUDI-781] Introduce HoodieTestTable for test preparation

2020-08-10 Thread GitBox
xushiyan commented on pull request #1871: URL: https://github.com/apache/hudi/pull/1871#issuecomment-671730074 @bvaradar no worries.. i can do another one.

[hudi] branch master updated: Revert "[HUDI-781] Introduce HoodieTestTable for test preparation (#1871)"

2020-08-10 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 626f78f Revert "[HUDI-781] Introduce

[GitHub] [hudi] bvaradar commented on pull request #1871: [HUDI-781] Introduce HoodieTestTable for test preparation

2020-08-10 Thread GitBox
bvaradar commented on pull request #1871: URL: https://github.com/apache/hudi/pull/1871#issuecomment-671729188 @xushiyan @yanghua : This PR is causing lot of merge conflicts to a blocker PR which we needed to merge by tonight and I am unable to resolve conflicts in time. I am reverting

[GitHub] [hudi] xushiyan commented on pull request #1871: [HUDI-781] Introduce HoodieTestTable for test preparation

2020-08-10 Thread GitBox
xushiyan commented on pull request #1871: URL: https://github.com/apache/hudi/pull/1871#issuecomment-671721410 @vinothchandar sorry i'll make the PRs in draft state until the cut.

[GitHub] [hudi] harishchanderramesh edited a comment on issue #1936: Hudi Query Error

2020-08-10 Thread GitBox
harishchanderramesh edited a comment on issue #1936: URL: https://github.com/apache/hudi/issues/1936#issuecomment-671719305 Hi @umehrot2, please find my responses below. Are you able to do a simple aws s3 ls and list or get anything from your cluster on S3? Yes, I am

[GitHub] [hudi] harishchanderramesh edited a comment on issue #1936: Hudi Query Error

2020-08-10 Thread GitBox
harishchanderramesh edited a comment on issue #1936: URL: https://github.com/apache/hudi/issues/1936#issuecomment-671719305 Hi @umehrot2, please find my responses below. Are you able to do a simple aws s3 ls and list or get anything from your cluster on S3? Yes, I am

[GitHub] [hudi] harishchanderramesh commented on issue #1936: Hudi Query Error

2020-08-10 Thread GitBox
harishchanderramesh commented on issue #1936: URL: https://github.com/apache/hudi/issues/1936#issuecomment-671719305 Hi @umehrot2 , Are you able to do a simple aws s3 ls and list or get anything from your cluster on S3 ? - Yes,I am able to. Are you configuring to use S3A

[GitHub] [hudi] harishchanderramesh edited a comment on issue #1936: Hudi Query Error

2020-08-10 Thread GitBox
harishchanderramesh edited a comment on issue #1936: URL: https://github.com/apache/hudi/issues/1936#issuecomment-671719305 Hi @umehrot2, please find my responses below. Are you able to do a simple aws s3 ls and list or get anything from your cluster on S3? - Yes, I am

[hudi] branch master updated: [HUDI-1175] Commenting out testsuite tests from Integration tests until we investigate the CI flakiness (#1945)

2020-08-10 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 9c24151 [HUDI-1175] Commenting out testsuite

[GitHub] [hudi] vinothchandar merged pull request #1945: [HUDI-1175] Minor fixes for CI flakiness

2020-08-10 Thread GitBox
vinothchandar merged pull request #1945: URL: https://github.com/apache/hudi/pull/1945

[GitHub] [hudi] vinothchandar commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-10 Thread GitBox
vinothchandar commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468311412 ## File path: hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala ## @@ -108,262 +106,280 @@ private[hudi] object

[GitHub] [hudi] vinothchandar commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-10 Thread GitBox
vinothchandar commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r468311076 ## File path: hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala ## @@ -108,262 +106,280 @@ private[hudi] object

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #366

2020-08-10 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.58 KB...] cdi-api-1.0.jar cdi-api.license commons-cli-1.4.jar commons-cli.license commons-io-2.5.jar commons-io.license

[GitHub] [hudi] vinothchandar commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-10 Thread GitBox
vinothchandar commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r461939866 ## File path: hudi-client/src/main/java/org/apache/hudi/client/HoodieInternalWriteStatus.java ## @@ -0,0 +1,151 @@ +/* + * Licensed to the Apache

[jira] [Created] (HUDI-1176) Support log4j2 config

2020-08-10 Thread hong dongdong (Jira)
hong dongdong created HUDI-1176: --- Summary: Support log4j2 config Key: HUDI-1176 URL: https://issues.apache.org/jira/browse/HUDI-1176 Project: Apache Hudi Issue Type: Bug Components:

[GitHub] [hudi] bvaradar commented on issue #1939: [SUPPORT] Hudi creating parquet with huge size and not in sink with limitFileSize

2020-08-10 Thread GitBox
bvaradar commented on issue #1939: URL: https://github.com/apache/hudi/issues/1939#issuecomment-671690639 To understand, Are you using bulk insert for initial loading and upsert for subsequent operations ? For records with LOBs, it is important to tune

[GitHub] [hudi] bvaradar commented on issue #1813: ERROR HoodieDeltaStreamer: Got error running delta sync once.

2020-08-10 Thread GitBox
bvaradar commented on issue #1813: URL: https://github.com/apache/hudi/issues/1813#issuecomment-671687932 @tooptoop4 : The checkpoints are stored as part of .commit files in .hoodie folder and will persist across cluster, application restarts.

[GitHub] [hudi] bvaradar commented on issue #1925: [SUPPORT] Support for Confluent Cloud SchemaRegistryProvider

2020-08-10 Thread GitBox
bvaradar commented on issue #1925: URL: https://github.com/apache/hudi/issues/1925#issuecomment-671686601 @jpugliesi : With Spark DataSource write the schema is implicitly derived from the input data-frame we want to write. Is there a specific use-case you have in mind ? Since

[GitHub] [hudi] vinothchandar commented on pull request #1871: [HUDI-781] Introduce HoodieTestTable for test preparation

2020-08-10 Thread GitBox
vinothchandar commented on pull request #1871: URL: https://github.com/apache/hudi/pull/1871#issuecomment-671685350 @yanghua @xushiyan can we please hold off on these refactoring PRs until we cut RCs please. we are trying to land the last bits. the rebase efforts from these, keep

[hudi] branch asf-site updated: Travis CI build asf-site

2020-08-10 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 0b938d5 Travis CI build asf-site 0b938d5 is

[jira] [Closed] (HUDI-1121) Provide a document describing how to use callback

2020-08-10 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-1121. -- Resolution: Done Done via asf-site branch: a6f991c4c2d72166fe8c898b6f63bb1d16ccd7a0 > Provide a document

[jira] [Updated] (HUDI-1121) Provide a document describing how to use callback

2020-08-10 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-1121: --- Status: Open (was: New) > Provide a document describing how to use callback >

[GitHub] [hudi] yanghua merged pull request #1935: [HUDI-1121][DOC]Provide a document describing how to use callback

2020-08-10 Thread GitBox
yanghua merged pull request #1935: URL: https://github.com/apache/hudi/pull/1935

[hudi] branch asf-site updated: [HUDI-1121] Provide a document describing how to use callback (#1935)

2020-08-10 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new a6f991c [HUDI-1121] Provide a document

[GitHub] [hudi] xushiyan commented on a change in pull request #1871: [HUDI-781] Introduce HoodieTestTable for test preparation

2020-08-10 Thread GitBox
xushiyan commented on a change in pull request #1871: URL: https://github.com/apache/hudi/pull/1871#discussion_r468279092 ## File path: hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestUtils.java ## @@ -237,11 +229,12 @@ public static void

[hudi] branch master updated: [HUDI-781] Introduce HoodieTestTable for test preparation (#1871)

2020-08-10 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new b2e703d [HUDI-781] Introduce HoodieTestTable

[GitHub] [hudi] yanghua merged pull request #1871: [HUDI-781] Introduce HoodieTestTable for test preparation

2020-08-10 Thread GitBox
yanghua merged pull request #1871: URL: https://github.com/apache/hudi/pull/1871

[GitHub] [hudi] yanghua commented on a change in pull request #1871: [HUDI-781] Introduce HoodieTestTable for test preparation

2020-08-10 Thread GitBox
yanghua commented on a change in pull request #1871: URL: https://github.com/apache/hudi/pull/1871#discussion_r468276194 ## File path: hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestUtils.java ## @@ -237,11 +229,12 @@ public static void

[GitHub] [hudi] nsivabalan commented on a change in pull request #1945: [HUDI-1175] Minor fixes for CI flakiness

2020-08-10 Thread GitBox
nsivabalan commented on a change in pull request #1945: URL: https://github.com/apache/hudi/pull/1945#discussion_r468269077 ## File path: hudi-integ-test/src/test/java/org/apache/hudi/integ/ITTestBase.java ## @@ -103,22 +103,26 @@ private static String

[hudi] branch master updated: [HUDI-1173] fix hudi-prometheus pom dependency (#1942)

2020-08-10 Thread leesf
This is an automated email from the ASF dual-hosted git repository. leesf pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 934f00b [HUDI-1173] fix hudi-prometheus pom

[GitHub] [hudi] leesf merged pull request #1942: [HUDI-1173] fix hudi-prometheus pom dependency

2020-08-10 Thread GitBox
leesf merged pull request #1942: URL: https://github.com/apache/hudi/pull/1942

[GitHub] [hudi] vinothchandar commented on a change in pull request #1945: [HUDI-1175] Minor fixes for CI flakiness

2020-08-10 Thread GitBox
vinothchandar commented on a change in pull request #1945: URL: https://github.com/apache/hudi/pull/1945#discussion_r468266606 ## File path: hudi-integ-test/src/test/java/org/apache/hudi/integ/ITTestBase.java ## @@ -103,22 +103,26 @@ private static String

[GitHub] [hudi] nsivabalan commented on a change in pull request #1945: [HUDI-1175] Minor fixes for CI flakiness

2020-08-10 Thread GitBox
nsivabalan commented on a change in pull request #1945: URL: https://github.com/apache/hudi/pull/1945#discussion_r468265130 ## File path: hudi-integ-test/src/test/java/org/apache/hudi/integ/ITTestBase.java ## @@ -103,22 +103,26 @@ private static String

[jira] [Updated] (HUDI-1119) MOR appends slow due to file listing in executor side for finding the log file

2020-08-10 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1119: - Fix Version/s: (was: 0.6.0) > MOR appends slow due to file listing in executor side for

[jira] [Updated] (HUDI-1119) MOR appends slow due to file listing in executor side for finding the log file

2020-08-10 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1119: - Priority: Major (was: Blocker) > MOR appends slow due to file listing in executor side for

[jira] [Updated] (HUDI-289) Implement a test suite to support long running test for Hudi writing and querying end-end

2020-08-10 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-289: Fix Version/s: (was: 0.6.0) Priority: Major (was: Blocker) > Implement a test suite to

[jira] [Updated] (HUDI-920) Incremental view on MOR table using Spark Datasource

2020-08-10 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-920: Status: Patch Available (was: In Progress) > Incremental view on MOR table using Spark Datasource >

[jira] [Updated] (HUDI-920) Incremental view on MOR table using Spark Datasource

2020-08-10 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-920: Status: In Progress (was: Open) > Incremental view on MOR table using Spark Datasource >

[jira] [Updated] (HUDI-808) Support for cleaning source data

2020-08-10 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-808: Status: In Progress (was: Open) > Support for cleaning source data >

[jira] [Updated] (HUDI-808) Support for cleaning source data

2020-08-10 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-808: Status: Patch Available (was: In Progress) > Support for cleaning source data >

[jira] [Updated] (HUDI-1014) Design and Implement upgrade-downgrade infrastructure

2020-08-10 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1014: - Status: Closed (was: Patch Available) > Design and Implement upgrade-downgrade infrastructure >

[jira] [Updated] (HUDI-1098) Marker file finalizing may block on a data file that was never written

2020-08-10 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1098: -- Status: Closed (was: Patch Available) > Marker file finalizing may block on a data

[jira] [Updated] (HUDI-971) Fix HFileBootstrapIndexReader.getIndexedPartitions() returns unclean partition name

2020-08-10 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-971: Status: Closed (was: Patch Available) > Fix HFileBootstrapIndexReader.getIndexedPartitions()

[GitHub] [hudi] vinothchandar commented on a change in pull request #1945: [HUDI-1175] Minor fixes for CI flakiness

2020-08-10 Thread GitBox
vinothchandar commented on a change in pull request #1945: URL: https://github.com/apache/hudi/pull/1945#discussion_r468251623 ## File path: hudi-integ-test/src/test/java/org/apache/hudi/integ/ITTestBase.java ## @@ -103,22 +103,26 @@ private static String

[GitHub] [hudi] bvaradar commented on issue #1941: [SUPPORT] partition's value changed with hbase index

2020-08-10 Thread GitBox
bvaradar commented on issue #1941: URL: https://github.com/apache/hudi/issues/1941#issuecomment-671645896 @satishkotha @n3nash : Can you please chime in on this?

[GitHub] [hudi] bvaradar commented on issue #1940: [SUPPORT] In CDC scenario, Does Hudi support schema enforcement like Delta Lake?

2020-08-10 Thread GitBox
bvaradar commented on issue #1940: URL: https://github.com/apache/hudi/issues/1940#issuecomment-671645321 Hudi supports Avro schema evolution and compatibility rules. We are also planning to rethink schema evolution in general for our next major release.

[GitHub] [hudi] nsivabalan commented on a change in pull request #1945: [HUDI-1175] Minor fixes for CI flakiness

2020-08-10 Thread GitBox
nsivabalan commented on a change in pull request #1945: URL: https://github.com/apache/hudi/pull/1945#discussion_r468245640 ## File path: hudi-integ-test/src/test/java/org/apache/hudi/integ/ITTestHoodieDemo.java ## @@ -115,6 +116,7 @@ public void testParquetDemo() throws

[GitHub] [hudi] bvaradar commented on a change in pull request #1945: [HUDI-1175] Minor fixes for CI flakiness

2020-08-10 Thread GitBox
bvaradar commented on a change in pull request #1945: URL: https://github.com/apache/hudi/pull/1945#discussion_r468245015 ## File path: hudi-integ-test/src/test/java/org/apache/hudi/integ/ITTestHoodieDemo.java ## @@ -115,6 +116,7 @@ public void testParquetDemo() throws

[GitHub] [hudi] umehrot2 commented on pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-10 Thread GitBox
umehrot2 commented on pull request #1944: URL: https://github.com/apache/hudi/pull/1944#issuecomment-671642813 cc @vinothchandar @bvaradar @bhasudha

[jira] [Updated] (HUDI-1175) Investigate CI test flakiness (hangs)

2020-08-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1175: - Labels: pull-request-available (was: ) > Investigate CI test flakiness (hangs) >

[GitHub] [hudi] nsivabalan opened a new pull request #1945: [HUDI-1175] Minor fixes for CI flakiness

2020-08-10 Thread GitBox
nsivabalan opened a new pull request #1945: URL: https://github.com/apache/hudi/pull/1945 - Adding some log statements - Commenting out testsuite tests from integration tests until we investigate CI flakiness ## What is the purpose of the pull request *This patch comments

[jira] [Created] (HUDI-1175) Investigate CI test flakiness (hangs)

2020-08-10 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-1175: - Summary: Investigate CI test flakiness (hangs) Key: HUDI-1175 URL: https://issues.apache.org/jira/browse/HUDI-1175 Project: Apache Hudi Issue

[jira] [Resolved] (HUDI-999) Parallelize listing of Source dataset partitions

2020-08-10 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra resolved HUDI-999. Resolution: Fixed > Parallelize listing of Source dataset partitions >

[jira] [Updated] (HUDI-620) Hive Sync Integration of bootstrapped table

2020-08-10 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra updated HUDI-620: --- Status: Closed (was: Patch Available) > Hive Sync Integration of bootstrapped table >

[jira] [Commented] (HUDI-620) Hive Sync Integration of bootstrapped table

2020-08-10 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175092#comment-17175092 ] Udit Mehrotra commented on HUDI-620: Resolved by https://github.com/apache/hudi/pull/1702/ > Hive Sync

[jira] [Resolved] (HUDI-427) Implement CLI support for performing bootstrap

2020-08-10 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra resolved HUDI-427. Resolution: Fixed > Implement CLI support for performing bootstrap >

[jira] [Resolved] (HUDI-426) Implement Spark DataSource Support for querying bootstrapped tables

2020-08-10 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra resolved HUDI-426. Resolution: Fixed > Implement Spark DataSource Support for querying bootstrapped tables >

[jira] [Updated] (HUDI-1174) Hudi changes for bootstrapped tables integration with Presto

2020-08-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1174: - Labels: pull-request-available (was: ) > Hudi changes for bootstrapped tables integration with

[GitHub] [hudi] umehrot2 opened a new pull request #1944: [HUDI-1174] Changes for bootstrapped tables to work with presto

2020-08-10 Thread GitBox
umehrot2 opened a new pull request #1944: URL: https://github.com/apache/hudi/pull/1944 ## What is the purpose of the pull request The purpose of this pull request is to implement changes required on Hudi side to get Bootstrapped tables integrated with Presto. The testing was done

[jira] [Created] (HUDI-1174) Hudi changes for bootstrapped tables integration with Presto

2020-08-10 Thread Udit Mehrotra (Jira)
Udit Mehrotra created HUDI-1174: --- Summary: Hudi changes for bootstrapped tables integration with Presto Key: HUDI-1174 URL: https://issues.apache.org/jira/browse/HUDI-1174 Project: Apache Hudi

[GitHub] [hudi] steveloughran commented on issue #1837: [SUPPORT]S3 file listing causing compaction to get eventually slow

2020-08-10 Thread GitBox
steveloughran commented on issue #1837: URL: https://github.com/apache/hudi/issues/1837#issuecomment-671523549 The issue here is that treewalking is pathologically bad for S3. Asking for a deep listing is often more efficient; filesystem.listFiles(path, recursive=true) will do this
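The point about treewalking can be illustrated with a small, self-contained sketch that simulates an object store as a flat, sorted set of keys and counts list requests. It does not call Hadoop or S3. The class `ListingCost`, its methods, and the sample key layout are hypothetical. The page size of 1000 keys per request is an assumption modelled on the usual S3 listing page size. In Hadoop itself, the deep listing mentioned in the comment corresponds to `FileSystem.listFiles(path, true)`.

```java
import java.util.TreeSet;

public class ListingCost {

    // Assumed number of keys returned per list request.
    static final int PAGE_SIZE = 1000;

    // Treewalk: one list request per "directory", recursing into each
    // child directory found under the given prefix. Pagination inside a
    // single directory is ignored to keep the sketch short.
    static int treeWalkRequests(TreeSet<String> keys, String prefix) {
        int requests = 1; // list the immediate children of this prefix
        TreeSet<String> subdirs = new TreeSet<>();
        for (String key : keys.subSet(prefix, prefix + Character.MAX_VALUE)) {
            String rest = key.substring(prefix.length());
            int slash = rest.indexOf('/');
            if (slash >= 0) {
                subdirs.add(prefix + rest.substring(0, slash + 1));
            }
        }
        for (String dir : subdirs) {
            requests += treeWalkRequests(keys, dir);
        }
        return requests;
    }

    // Deep listing: page through every key under the prefix in key order,
    // regardless of how many "directories" the keys span.
    static int deepListRequests(TreeSet<String> keys, String prefix) {
        int matching = keys.subSet(prefix, prefix + Character.MAX_VALUE).size();
        return Math.max(1, (matching + PAGE_SIZE - 1) / PAGE_SIZE);
    }

    public static void main(String[] args) {
        // Hypothetical table layout: 50 partitions with 10 files each.
        TreeSet<String> keys = new TreeSet<>();
        for (int p = 0; p < 50; p++) {
            for (int f = 0; f < 10; f++) {
                keys.add("table/part=" + p + "/file-" + f + ".parquet");
            }
        }
        System.out.println("treewalk requests: " + treeWalkRequests(keys, ""));
        System.out.println("deep listing requests: " + deepListRequests(keys, ""));
    }
}
```

In this model the treewalk cost grows with the number of directories, while the deep listing cost grows only with the number of keys divided by the page size.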

[GitHub] [hudi] jpugliesi edited a comment on issue #1925: [SUPPORT] Support for Confluent Cloud SchemaRegistryProvider

2020-08-10 Thread GitBox
jpugliesi edited a comment on issue #1925: URL: https://github.com/apache/hudi/issues/1925#issuecomment-671522522 @bvaradar looks like this works - thanks for your help. One follow up question about this `SchemaRegistryProvider` - is it possible to configure Hudi to use this

[GitHub] [hudi] jpugliesi commented on issue #1925: [SUPPORT] Support for Confluent Cloud SchemaRegistryProvider

2020-08-10 Thread GitBox
jpugliesi commented on issue #1925: URL: https://github.com/apache/hudi/issues/1925#issuecomment-671522522 @bvaradar looks like this works - thanks for your help. One follow up question about this `SchemaRegistryProvider` - is it possible to configure Hudi to use this

[GitHub] [hudi] tooptoop4 edited a comment on issue #1813: ERROR HoodieDeltaStreamer: Got error running delta sync once.

2020-08-10 Thread GitBox
tooptoop4 edited a comment on issue #1813: URL: https://github.com/apache/hudi/issues/1813#issuecomment-671512191 @bhasudha how does checkpointing work here? ie after some time of running DeltaStreamer job i need to stop the DeltaStreamer job, destroy old EC2, launch new EC2, restart

[GitHub] [hudi] tooptoop4 commented on issue #1813: ERROR HoodieDeltaStreamer: Got error running delta sync once.

2020-08-10 Thread GitBox
tooptoop4 commented on issue #1813: URL: https://github.com/apache/hudi/issues/1813#issuecomment-671512191 @bhasudha how does checkpointing work here? ie after some time of running DeltaStreamer job i need to stop the DeltaStreamer job, destroy old EC2, launch new EC2, restart

[GitHub] [hudi] zhedoubushishi commented on a change in pull request #1870: [HUDI-808] Support cleaning bootstrap source data

2020-08-10 Thread GitBox
zhedoubushishi commented on a change in pull request #1870: URL: https://github.com/apache/hudi/pull/1870#discussion_r468069699 ## File path: hudi-common/src/main/java/org/apache/hudi/common/HoodieCleanStat.java ## @@ -39,17 +40,34 @@ private final List successDeleteFiles;

[GitHub] [hudi] wfhartford opened a new issue #1943: [SUPPORT] Gradle fails with dependency on org.apache.hudi:hudi-spark_2.12:0.5.3

2020-08-10 Thread GitBox
wfhartford opened a new issue #1943: URL: https://github.com/apache/hudi/issues/1943 Using the `hudi-spark_2.12` artifact as a dependency in gradle fails with the following error: ``` inconsistent module metadata found. Descriptor: org.apache.hudi:hudi-spark_2.11:0.5.3 Errors: bad

[jira] [Updated] (HUDI-1153) Spark DataSource and Streaming Write must fail when operation type is misconfigured

2020-08-10 Thread Sreeram Ramji (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreeram Ramji updated HUDI-1153: Status: In Progress (was: Open) > Spark DataSource and Streaming Write must fail when operation

[GitHub] [hudi] xushiyan commented on a change in pull request #1871: [HUDI-781] Introduce HoodieTestTable for test preparation

2020-08-10 Thread GitBox
xushiyan commented on a change in pull request #1871: URL: https://github.com/apache/hudi/pull/1871#discussion_r468032133 ## File path: hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestUtils.java ## @@ -237,11 +229,12 @@ public static void

[GitHub] [hudi] xushiyan commented on a change in pull request #1871: [HUDI-781] Introduce HoodieTestTable for test preparation

2020-08-10 Thread GitBox
xushiyan commented on a change in pull request #1871: URL: https://github.com/apache/hudi/pull/1871#discussion_r468021630 ## File path: hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestUtils.java ## @@ -237,11 +229,12 @@ public static void

[GitHub] [hudi] vinothchandar commented on pull request #1870: [HUDI-808] Support cleaning bootstrap source data

2020-08-10 Thread GitBox
vinothchandar commented on pull request #1870: URL: https://github.com/apache/hudi/pull/1870#issuecomment-671401411 @zhedoubushishi @umehrot2 please review this PR carefully ! We plan to land today

[GitHub] [hudi] vinothchandar commented on a change in pull request #1870: [HUDI-808] Support cleaning bootstrap source data

2020-08-10 Thread GitBox
vinothchandar commented on a change in pull request #1870: URL: https://github.com/apache/hudi/pull/1870#discussion_r467944196 ## File path: hudi-client/src/main/java/org/apache/hudi/table/action/clean/CleanActionExecutor.java ## @@ -82,40 +83,45 @@ HoodieCleanerPlan

[jira] [Updated] (HUDI-1173) fix hudi-prometheus pom dependency

2020-08-10 Thread leesf (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] leesf updated HUDI-1173: Priority: Blocker (was: Minor) > fix hudi-prometheus pom dependency > -- > >

[jira] [Updated] (HUDI-1173) fix hudi-prometheus pom dependency

2020-08-10 Thread leesf (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] leesf updated HUDI-1173: Fix Version/s: 0.6.0 > fix hudi-prometheus pom dependency > -- > >

[GitHub] [hudi] UZi5136225 commented on pull request #1931: [HUDI-210] hudi-support-prometheus-pushgateway

2020-08-10 Thread GitBox
UZi5136225 commented on pull request #1931: URL: https://github.com/apache/hudi/pull/1931#issuecomment-671354955 Yes, this configuration needs to be added

[GitHub] [hudi] sbernauer edited a comment on pull request #1931: [HUDI-210] hudi-support-prometheus-pushgateway

2020-08-10 Thread GitBox
sbernauer edited a comment on pull request #1931: URL: https://github.com/apache/hudi/pull/1931#issuecomment-671353265 @UZi5136225 i just noticed that i get a `java.lang.ClassNotFoundException: io.prometheus.client.exporter.common.TextFormat` when trying to access the interface. I think


[GitHub] [hudi] sbernauer commented on pull request #1931: [HUDI-210] hudi-support-prometheus-pushgateway

2020-08-10 Thread GitBox
sbernauer commented on pull request #1931: URL: https://github.com/apache/hudi/pull/1931#issuecomment-671353265 @UZi5136225 I just noticed that I get a `java.lang.ClassNotFoundException: io.prometheus.client.exporter.common.TextFormat` when trying to access the interface. I think we

[GitHub] [hudi] sbernauer commented on pull request #1931: [HUDI-210] hudi-support-prometheus-pushgateway

2020-08-10 Thread GitBox
sbernauer commented on pull request #1931: URL: https://github.com/apache/hudi/pull/1931#issuecomment-671350777 Thanks @UZi5136225, this fixed the problem! I don't understand why, because I added all the libs below in the correct version to the classpath. But it works :) ``` mvn

[GitHub] [hudi] UZi5136225 commented on pull request #1931: [HUDI-210] hudi-support-prometheus-pushgateway

2020-08-10 Thread GitBox
UZi5136225 commented on pull request #1931: URL: https://github.com/apache/hudi/pull/1931#issuecomment-671336381 https://github.com/apache/hudi/pull/1942 @sbernauer You can try this PR This is an automated message from the

[GitHub] [hudi] sbernauer commented on pull request #1931: [HUDI-210] hudi-support-prometheus-pushgateway

2020-08-10 Thread GitBox
sbernauer commented on pull request #1931: URL: https://github.com/apache/hudi/pull/1931#issuecomment-671334282 Thanks a lot @UZi5136225 for your fast response! This is an automated message from the Apache Git Service. To

[jira] [Updated] (HUDI-1173) fix hudi-prometheus pom dependency

2020-08-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1173: - Labels: pull-request-available (was: ) > fix hudi-prometheus pom dependency >

[GitHub] [hudi] UZi5136225 opened a new pull request #1942: [HUDI-1173] fix hudi-prometheus pom dependency

2020-08-10 Thread GitBox
UZi5136225 opened a new pull request #1942: URL: https://github.com/apache/hudi/pull/1942 fix hudi-prometheus pom dependency ## What is the purpose of the pull request fix hudi-prometheus pom dependency ## Committer checklist - [ ] Has a corresponding JIRA in

[GitHub] [hudi] UZi5136225 commented on pull request #1931: [HUDI-210] hudi-support-prometheus-pushgateway

2020-08-10 Thread GitBox
UZi5136225 commented on pull request #1931: URL: https://github.com/apache/hudi/pull/1931#issuecomment-671331840 It’s because I left some dependencies out of the configuration; it will be fixed soon @sbernauer This is an automated

[jira] [Created] (HUDI-1173) fix hudi-prometheus pom dependency

2020-08-10 Thread liujinhui (Jira)
liujinhui created HUDI-1173: --- Summary: fix hudi-prometheus pom dependency Key: HUDI-1173 URL: https://issues.apache.org/jira/browse/HUDI-1173 Project: Apache Hudi Issue Type: Improvement

[GitHub] [hudi] sbernauer edited a comment on pull request #1931: [HUDI-210] hudi-support-prometheus-pushgateway

2020-08-10 Thread GitBox
sbernauer edited a comment on pull request #1931: URL: https://github.com/apache/hudi/pull/1931#issuecomment-671304318 I don't use the include option but instead guarantee that all drivers and executors have the libs and correctly use them via SPARK_DIST_CLASSPATH.

[GitHub] [hudi] sbernauer commented on pull request #1931: [HUDI-210] hudi-support-prometheus-pushgateway

2020-08-10 Thread GitBox
sbernauer commented on pull request #1931: URL: https://github.com/apache/hudi/pull/1931#issuecomment-671304318 I don't use the include option, but instead guarantee that all drivers and executors have the libs and correctly use them via SPARK_DIST_CLASSPATH.

[GitHub] [hudi] UZi5136225 commented on pull request #1931: [HUDI-210] hudi-support-prometheus-pushgateway

2020-08-10 Thread GitBox
UZi5136225 commented on pull request #1931: URL: https://github.com/apache/hudi/pull/1931#issuecomment-671302048 If you use the include tag to include the required dependencies, it should be correct. Let me see why This is

[GitHub] [hudi] sbernauer commented on pull request #1931: [HUDI-210] hudi-support-prometheus-pushgateway

2020-08-10 Thread GitBox
sbernauer commented on pull request #1931: URL: https://github.com/apache/hudi/pull/1931#issuecomment-671301067 If I uncompress my `simpleclient_dropwizard-0.8.0.jar` and look inside the class, all seems correct: ``` $ javap DropwizardExports.class Compiled from

[GitHub] [hudi] sbernauer commented on pull request #1931: [HUDI-210] hudi-support-prometheus-pushgateway

2020-08-10 Thread GitBox
sbernauer commented on pull request #1931: URL: https://github.com/apache/hudi/pull/1931#issuecomment-671299385 The constructor of DropwizardExports seems to be stable and I'm wondering what is going wrong here (It does not seem like a version mismatch to me).

[GitHub] [hudi] UZi5136225 commented on pull request #1931: [HUDI-210] hudi-support-prometheus-pushgateway

2020-08-10 Thread GitBox
UZi5136225 commented on pull request #1931: URL: https://github.com/apache/hudi/pull/1931#issuecomment-671298995 0.8.0 correct This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [hudi] sbernauer commented on pull request #1931: [HUDI-210] hudi-support-prometheus-pushgateway

2020-08-10 Thread GitBox
sbernauer commented on pull request #1931: URL: https://github.com/apache/hudi/pull/1931#issuecomment-671298448 My setup has no access to the internet, so I downloaded the libs from mvn central and included them in my docker images. Is the version 0.8.0 correct? ``` $ ls

[GitHub] [hudi] UZi5136225 removed a comment on pull request #1931: [HUDI-210] hudi-support-prometheus-pushgateway

2020-08-10 Thread GitBox
UZi5136225 removed a comment on pull request #1931: URL: https://github.com/apache/hudi/pull/1931#issuecomment-671297401 Try the following packages you need io.prometheus:simpleclient_pushgateway This is an automated
