[GitHub] [hudi] ashishmgofficial edited a comment on issue #2149: Help with Reading Kafka topic written using Debezium Connector - Deltastreamer

2020-10-08 Thread GitBox
ashishmgofficial edited a comment on issue #2149: URL: https://github.com/apache/hudi/issues/2149#issuecomment-705990567 @bvaradar Yes, I'm using the above-mentioned URL for the schema ``` { "connect.name": "airflow.public.motor_crash_violation_incidents.Envelope", "fields"

[GitHub] [hudi] ashishmgofficial edited a comment on issue #2149: Help with Reading Kafka topic written using Debezium Connector - Deltastreamer

2020-10-08 Thread GitBox
ashishmgofficial edited a comment on issue #2149: URL: https://github.com/apache/hudi/issues/2149#issuecomment-705990567 Yes, I'm using the above-mentioned URL for the schema ``` { "connect.name": "airflow.public.motor_crash_violation_incidents.Envelope", "fields": [

[GitHub] [hudi] tandonraghav edited a comment on issue #2151: [SUPPORT] How to run Periodic Compaction? Multiple Tables - When no Upserts

2020-10-08 Thread GitBox
tandonraghav edited a comment on issue #2151: URL: https://github.com/apache/hudi/issues/2151#issuecomment-706003688 @bvaradar Thanks for the answer. Below is what our setup looks like: - We have client-level Mongo collections. Write various client Mongo oplogs into one topic.

[GitHub] [hudi] tandonraghav edited a comment on issue #2151: [SUPPORT] How to run Periodic Compaction? Multiple Tables - When no Upserts

2020-10-08 Thread GitBox
tandonraghav edited a comment on issue #2151: URL: https://github.com/apache/hudi/issues/2151#issuecomment-706003688 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [hudi] tandonraghav commented on issue #2151: [SUPPORT] How to run Periodic Compaction? Multiple Tables - When no Upserts

2020-10-08 Thread GitBox
tandonraghav commented on issue #2151: URL: https://github.com/apache/hudi/issues/2151#issuecomment-706003688 @bvaradar Thanks for the answer. Below is what our setup looks like: - We have client-level Mongo collections. Write various client Mongo oplogs into one topic. - Writ

[GitHub] [hudi] ashishmgofficial commented on issue #2149: Help with Reading Kafka topic written using Debezium Connector - Deltastreamer

2020-10-08 Thread GitBox
ashishmgofficial commented on issue #2149: URL: https://github.com/apache/hudi/issues/2149#issuecomment-705990567 ``` { "connect.name": "airflow.public.motor_crash_violation_incidents.Envelope", "fields": [ { "default": null, "name": "before",

[GitHub] [hudi] bvaradar commented on issue #2149: Help with Reading Kafka topic written using Debezium Connector - Deltastreamer

2020-10-08 Thread GitBox
bvaradar commented on issue #2149: URL: https://github.com/apache/hudi/issues/2149#issuecomment-705980977 @ashishmgofficial : This looks like a schema mismatch issue. There might be a bug in the schema provider implementation that I pasted. Can you also attach the schema from schema registry

[GitHub] [hudi] ashishmgofficial commented on issue #2149: Help with Reading Kafka topic written using Debezium Connector - Deltastreamer

2020-10-08 Thread GitBox
ashishmgofficial commented on issue #2149: URL: https://github.com/apache/hudi/issues/2149#issuecomment-705969068 @bvaradar My bad... I'm attaching the logs [yarn-logs.txt](https://github.com/apache/hudi/files/5352619/logs.txt)

[GitHub] [hudi] n3nash commented on pull request #2152: [HUDI-1326] Added an API to force publish metrics and flush them.

2020-10-08 Thread GitBox
n3nash commented on pull request #2152: URL: https://github.com/apache/hudi/pull/2152#issuecomment-705957506 @bvaradar Have you tested whether the metrics reporting works for Spark Structured Streaming?

[GitHub] [hudi] n3nash commented on a change in pull request #2152: [HUDI-1326] Added an API to force publish metrics and flush them.

2020-10-08 Thread GitBox
n3nash commented on a change in pull request #2152: URL: https://github.com/apache/hudi/pull/2152#discussion_r502179722 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metrics/Metrics.java ## @@ -97,6 +110,14 @@ public static synchronized void shutdo

[GitHub] [hudi] n3nash commented on a change in pull request #2152: [HUDI-1326] Added an API to force publish metrics and flush them.

2020-10-08 Thread GitBox
n3nash commented on a change in pull request #2152: URL: https://github.com/apache/hudi/pull/2152#discussion_r502179584 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metrics/Metrics.java ## @@ -68,8 +69,20 @@ private void reportAndCloseReporter() {

[jira] [Commented] (HUDI-314) Unable to query a multi-partitions MOR realtime table

2020-10-08 Thread Bharat Dighe (Jira)
[ https://issues.apache.org/jira/browse/HUDI-314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210578#comment-17210578 ] Bharat Dighe commented on HUDI-314: --- I am seeing similar error. The partition key is addr

[hudi] branch master updated: [HUDI-995] Migrate HoodieTestUtils APIs to HoodieTestTable (#2143)

2020-10-08 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 1d1d91d [HUDI-995] Migrate HoodieTestUtils APIs

[GitHub] [hudi] yanghua merged pull request #2143: [HUDI-995] Migrate HoodieTestUtils APIs to HoodieTestTable

2020-10-08 Thread GitBox
yanghua merged pull request #2143: URL: https://github.com/apache/hudi/pull/2143

[GitHub] [hudi] codecov-io commented on pull request #2143: [HUDI-995] Migrate HoodieTestUtils APIs to HoodieTestTable

2020-10-08 Thread GitBox
codecov-io commented on pull request #2143: URL: https://github.com/apache/hudi/pull/2143#issuecomment-705926236 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2143?src=pr&el=h1) Report > Merging [#2143](https://codecov.io/gh/apache/hudi/pull/2143?src=pr&el=desc) into [master](https

[GitHub] [hudi] codecov-io edited a comment on pull request #2143: [HUDI-995] Migrate HoodieTestUtils APIs to HoodieTestTable

2020-10-08 Thread GitBox
codecov-io edited a comment on pull request #2143: URL: https://github.com/apache/hudi/pull/2143#issuecomment-705926236 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2143?src=pr&el=h1) Report > Merging [#2143](https://codecov.io/gh/apache/hudi/pull/2143?src=pr&el=desc) into [master

[GitHub] [hudi] xushiyan commented on a change in pull request #2143: [HUDI-995] Migrate HoodieTestUtils APIs to HoodieTestTable

2020-10-08 Thread GitBox
xushiyan commented on a change in pull request #2143: URL: https://github.com/apache/hudi/pull/2143#discussion_r502124874 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/HoodieWriteableTestTable.java ## @@ -128,4 +148,37 @@ public HoodieWrit

[GitHub] [hudi] xushiyan commented on a change in pull request #2143: [HUDI-995] Migrate HoodieTestUtils APIs to HoodieTestTable

2020-10-08 Thread GitBox
xushiyan commented on a change in pull request #2143: URL: https://github.com/apache/hudi/pull/2143#discussion_r502123493 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/HoodieWriteableTestTable.java ## @@ -94,6 +106,10 @@ public String with

[GitHub] [hudi] yanghua commented on a change in pull request #2143: [HUDI-995] Migrate HoodieTestUtils APIs to HoodieTestTable

2020-10-08 Thread GitBox
yanghua commented on a change in pull request #2143: URL: https://github.com/apache/hudi/pull/2143#discussion_r502103169 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/HoodieWriteableTestTable.java ## @@ -94,6 +106,10 @@ public String withI

[hudi] branch asf-site updated: Travis CI build asf-site

2020-10-08 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new d0fb25a Travis CI build asf-site d0fb25a is d

[hudi] branch asf-site updated: [DOCS] PrestoCon Panel Discussion added to site (#2155)

2020-10-08 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 81fb485 [DOCS] PrestoCon Panel Discussion add

[GitHub] [hudi] vinothchandar merged pull request #2155: [DOCS] PrestoCon Panel Discussion added to site

2020-10-08 Thread GitBox
vinothchandar merged pull request #2155: URL: https://github.com/apache/hudi/pull/2155

[GitHub] [hudi] bvaradar commented on issue #2153: [SUPPORT] Failed to delete key: /.hoodie/.temp/20201006182950

2020-10-08 Thread GitBox
bvaradar commented on issue #2153: URL: https://github.com/apache/hudi/issues/2153#issuecomment-705792876 With 0.6.0, you can set hoodie.fail.on.timeline.archiving=false to make it non-fatal.
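
A minimal sketch of how that setting might be applied, assuming the writer is configured through a plain dictionary of Hudi options (as is common when writing from PySpark). Only the key `hoodie.fail.on.timeline.archiving` comes from the comment above; the helper name, the table name, and the base options are hypothetical and used for illustration only.

```python
# Option key quoted in the comment above (stated there to apply with Hudi 0.6.0).
FAIL_ON_ARCHIVING_KEY = "hoodie.fail.on.timeline.archiving"


def with_non_fatal_archiving(options):
    """Return a copy of `options` with timeline-archiving failures made non-fatal.

    `options` is a hypothetical dict of Hudi writer options. Option values are
    passed as strings, so the value stored here is the string "false".
    The input dict is left unchanged.
    """
    updated = dict(options)
    updated[FAIL_ON_ARCHIVING_KEY] = "false"
    return updated


# Hypothetical base options, for illustration only.
base_options = {"hoodie.table.name": "example_table"}
hudi_options = with_non_fatal_archiving(base_options)
print(hudi_options)
```

With PySpark, a dictionary like this would typically be handed to the DataFrame writer via `.options(**hudi_options)`; that wiring is an assumption here and is not shown in the thread.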

[GitHub] [hudi] bvaradar commented on issue #2153: [SUPPORT] Failed to delete key: /.hoodie/.temp/20201006182950

2020-10-08 Thread GitBox
bvaradar commented on issue #2153: URL: https://github.com/apache/hudi/issues/2153#issuecomment-705792507 @prashanthvg89 : With the 0.6.0 release, this is no longer a fatal error. Can you try that version?

[GitHub] [hudi] bvaradar commented on issue #2149: Help with Reading Kafka topic written using Debezium Connector - Deltastreamer

2020-10-08 Thread GitBox
bvaradar commented on issue #2149: URL: https://github.com/apache/hudi/issues/2149#issuecomment-705787833 @ashishmgofficial : What you pasted is only the driver-side logs. Do you have executor logs? If you have the Spark history server set up, you can look at the tasks sections in the failed st

[GitHub] [hudi] bvaradar commented on issue #2151: [SUPPORT] How to run Periodic Compaction? Multiple Tables - When no Upserts

2020-10-08 Thread GitBox
bvaradar commented on issue #2151: URL: https://github.com/apache/hudi/issues/2151#issuecomment-705780717 @tandonraghav : Regarding your question on compaction, since you are using WriteClient level APIs, you can use HoodieTable.getHoodieView().getPendingCompactionOperations().map(Pair:

[GitHub] [hudi] prashanthvg89 commented on issue #2153: [SUPPORT] Failed to delete key: /.hoodie/.temp/20201006182950

2020-10-08 Thread GitBox
prashanthvg89 commented on issue #2153: URL: https://github.com/apache/hudi/issues/2153#issuecomment-705764448 This could happen due to a race condition: https://stackoverflow.com/questions/38750638/spark-1-6-1-s3-multiobjectdeleteexception I have about 100 retries on S3 failures in my appli
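
The retry approach mentioned in this comment can be sketched generically. This is an illustrative sketch only, not the commenter's code: the helper `retry`, its parameters, and the simulated `flaky_delete` operation are all hypothetical, and the default of 100 attempts simply mirrors the figure quoted above.

```python
import time


def retry(operation, max_attempts=100, delay_seconds=0.0, retry_on=(Exception,)):
    """Call `operation` until it succeeds or `max_attempts` calls have failed.

    Re-raises the last error if every attempt fails. Only exceptions listed in
    `retry_on` trigger a retry; anything else propagates immediately.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on as error:
            last_error = error
            if attempt < max_attempts:
                time.sleep(delay_seconds)  # fixed pause between attempts
    raise last_error


# Simulated transient failure: the first two calls fail, the third succeeds.
calls = {"count": 0}


def flaky_delete():
    calls["count"] += 1
    if calls["count"] < 3:
        raise IOError("simulated transient storage failure")
    return "ok"


result = retry(flaky_delete, max_attempts=100, retry_on=(IOError,))
print(result, calls["count"])
```

Retrying only helps when the failure is transient; whether the S3 error in this issue falls into that category is the open question in the thread.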

[GitHub] [hudi] vinothchandar commented on pull request #2064: WIP - [HUDI-842] Implementation of HUDI RFC-15.

2020-10-08 Thread GitBox
vinothchandar commented on pull request #2064: URL: https://github.com/apache/hudi/pull/2064#issuecomment-705760186 @prashantwason if you can take a pass over the last batch of my comments, that would be great. There wasn't much there that we could resolve based on prior discussions.

[GitHub] [hudi] vinothchandar commented on a change in pull request #2064: WIP - [HUDI-842] Implementation of HUDI RFC-15.

2020-10-08 Thread GitBox
vinothchandar commented on a change in pull request #2064: URL: https://github.com/apache/hudi/pull/2064#discussion_r501943042 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieMergedLogRecordScanner.java ## @@ -57,41 +57,58 @@ private static fi

[GitHub] [hudi] vinothchandar opened a new pull request #2155: [DOCS] PrestoCon Panel Discussion added to site

2020-10-08 Thread GitBox
vinothchandar opened a new pull request #2155: URL: https://github.com/apache/hudi/pull/2155 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of t

[GitHub] [hudi] codecov-io commented on pull request #2125: [HUDI-1301] use spark INCREMENTAL mode query hudi dataset support sch…

2020-10-08 Thread GitBox
codecov-io commented on pull request #2125: URL: https://github.com/apache/hudi/pull/2125#issuecomment-705706818 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2125?src=pr&el=h1) Report > Merging [#2125](https://codecov.io/gh/apache/hudi/pull/2125?src=pr&el=desc) into [master](https

[GitHub] [hudi] vinothchandar commented on issue #2154: [SUPPORT] Throwing org.apache.spark.shuffle.FetchFailedException consistently

2020-10-08 Thread GitBox
vinothchandar commented on issue #2154: URL: https://github.com/apache/hudi/issues/2154#issuecomment-705696840 This is typically just a Spark shuffle failure. Sometimes the remote executor OOMs or something and you get this symptom on another executor. You can check the Spark UI to see if th

[GitHub] [hudi] tandonraghav edited a comment on issue #2151: [SUPPORT] How to run Periodic Compaction? Multiple Tables - When no Upserts

2020-10-08 Thread GitBox
tandonraghav edited a comment on issue #2151: URL: https://github.com/apache/hudi/issues/2151#issuecomment-705665216 @bvaradar Mongo Collection-> Hudi Table (This is our setup also) - We cannot put collections to different topics because there are 100s of such collections, so not in favo

[GitHub] [hudi] tandonraghav commented on issue #2151: [SUPPORT] How to run Periodic Compaction? Multiple Tables - When no Upserts

2020-10-08 Thread GitBox
tandonraghav commented on issue #2151: URL: https://github.com/apache/hudi/issues/2151#issuecomment-705665216 @bvaradar Mongo Collection-> Hudi Table (This is our setup also) - We cannot put collections to different topics because there are 100s of such collections, so not in favour of c

[GitHub] [hudi] Rajpratik71 closed pull request #1400: [WIP] optimization debian package manager tweaks

2020-10-08 Thread GitBox
Rajpratik71 closed pull request #1400: URL: https://github.com/apache/hudi/pull/1400

[jira] [Closed] (HUDI-1203) Allow port configuration for EmbeddedTimelineService

2020-10-08 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liwei closed HUDI-1203. --- > Allow port configuration for EmbeddedTimelineService > > >

[jira] [Resolved] (HUDI-1203) Allow port configuration for EmbeddedTimelineService

2020-10-08 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liwei resolved HUDI-1203. - Resolution: Fixed > Allow port configuration for EmbeddedTimelineService > ---

[jira] [Reopened] (HUDI-1203) Allow port configuration for EmbeddedTimelineService

2020-10-08 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liwei reopened HUDI-1203: - > Allow port configuration for EmbeddedTimelineService > > >

[jira] [Updated] (HUDI-1203) Allow port configuration for EmbeddedTimelineService

2020-10-08 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liwei updated HUDI-1203: Status: Closed (was: Patch Available) > Allow port configuration for EmbeddedTimelineService >

[jira] [Updated] (HUDI-1203) Allow port configuration for EmbeddedTimelineService

2020-10-08 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liwei updated HUDI-1203: Status: Patch Available (was: In Progress) > Allow port configuration for EmbeddedTimelineService > ---

[jira] [Reopened] (HUDI-1203) Allow port configuration for EmbeddedTimelineService

2020-10-08 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liwei reopened HUDI-1203: - > Allow port configuration for EmbeddedTimelineService > > >

[jira] [Updated] (HUDI-1203) Allow port configuration for EmbeddedTimelineService

2020-10-08 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liwei updated HUDI-1203: Status: Closed (was: Patch Available) > Allow port configuration for EmbeddedTimelineService >

[jira] [Updated] (HUDI-1203) Allow port configuration for EmbeddedTimelineService

2020-10-08 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liwei updated HUDI-1203: Status: Patch Available (was: In Progress) > Allow port configuration for EmbeddedTimelineService > ---

[jira] [Updated] (HUDI-1203) Allow port configuration for EmbeddedTimelineService

2020-10-08 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liwei updated HUDI-1203: Status: In Progress (was: Open) > Allow port configuration for EmbeddedTimelineService > --

[GitHub] [hudi] SteNicholas commented on a change in pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-10-08 Thread GitBox
SteNicholas commented on a change in pull request #2111: URL: https://github.com/apache/hudi/pull/2111#discussion_r501777975 ## File path: hudi-integ-test/src/test/java/org/apache/hudi/integ/command/ITTestHoodieSyncCommand.java ## @@ -52,7 +52,7 @@ public void testValidateSync

[GitHub] [hudi] xushiyan commented on pull request #2143: [HUDI-995] Migrate HoodieTestUtils APIs to HoodieTestTable

2020-10-08 Thread GitBox
xushiyan commented on pull request #2143: URL: https://github.com/apache/hudi/pull/2143#issuecomment-705608041 > @xushiyan Sorry for the late reply, the past week was during the National Day holiday, you know. Will review tomorrow. @yanghua Understood. No worries. -

[GitHub] [hudi] yanghua commented on pull request #2143: [HUDI-995] Migrate HoodieTestUtils APIs to HoodieTestTable

2020-10-08 Thread GitBox
yanghua commented on pull request #2143: URL: https://github.com/apache/hudi/pull/2143#issuecomment-705603856 @xushiyan Sorry for the late reply, the past week was during the National Day holiday, you know. Will review tomorrow.

[GitHub] [hudi] lw309637554 commented on pull request #2125: [HUDI-1301] use spark INCREMENTAL mode query hudi dataset support sch…

2020-10-08 Thread GitBox
lw309637554 commented on pull request #2125: URL: https://github.com/apache/hudi/pull/2125#issuecomment-705588099 > introducing a ReadOption Thanks, I will introduce a ReadOption.

[GitHub] [hudi] SteNicholas commented on a change in pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-10-08 Thread GitBox
SteNicholas commented on a change in pull request #2111: URL: https://github.com/apache/hudi/pull/2111#discussion_r501739942 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/action/commit/TestUpsertPartitioner.java ## @@ -286,8 +303,48 @@ public

[GitHub] [hudi] SteNicholas commented on a change in pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-10-08 Thread GitBox
SteNicholas commented on a change in pull request #2111: URL: https://github.com/apache/hudi/pull/2111#discussion_r501739210 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/action/commit/TestUpsertPartitioner.java ## @@ -90,14 +88,33 @@ private

[GitHub] [hudi] SteNicholas commented on a change in pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-10-08 Thread GitBox
SteNicholas commented on a change in pull request #2111: URL: https://github.com/apache/hudi/pull/2111#discussion_r501735976 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java ## @@ -112,16 +115,17 @@ private vo

[GitHub] [hudi] SteNicholas commented on a change in pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-10-08 Thread GitBox
SteNicholas commented on a change in pull request #2111: URL: https://github.com/apache/hudi/pull/2111#discussion_r501728945 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/TestHoodieClientOnCopyOnWriteStorage.java ## @@ -743,47 +743,37 @@ publ

[GitHub] [hudi] SteNicholas commented on a change in pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-10-08 Thread GitBox
SteNicholas commented on a change in pull request #2111: URL: https://github.com/apache/hudi/pull/2111#discussion_r501726744 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/action/commit/TestUpsertPartitioner.java ## @@ -286,8 +303,48 @@ public

[GitHub] [hudi] SteNicholas removed a comment on pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-10-08 Thread GitBox
SteNicholas removed a comment on pull request #2111: URL: https://github.com/apache/hudi/pull/2111#issuecomment-704795298 @bvaradar @vinothchandar Could you please help to review this pull request?

[GitHub] [hudi] KarthickAN opened a new issue #2154: [SUPPORT] Throwing org.apache.spark.shuffle.FetchFailedException consistently

2020-10-08 Thread GitBox
KarthickAN opened a new issue #2154: URL: https://github.com/apache/hudi/issues/2154 Any insight into this issue? I keep getting this consistently. I need help resolving this. **Stacktrace:** py4j.protocol.Py4JJavaError: An error occurred while calling o171.save. : org.apa

[GitHub] [hudi] leesf commented on pull request #2125: [HUDI-1301] use spark INCREMENTAL mode query hudi dataset support sch…

2020-10-08 Thread GitBox
leesf commented on pull request #2125: URL: https://github.com/apache/hudi/pull/2125#issuecomment-705534225 > This may not always be desired. the user may wish to get the incremental results using the latest schema, as well, right? If adding fields and using the latest schema for inc

[GitHub] [hudi] leesf commented on a change in pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-10-08 Thread GitBox
leesf commented on a change in pull request #2111: URL: https://github.com/apache/hudi/pull/2111#discussion_r501677026 ## File path: hudi-integ-test/src/test/java/org/apache/hudi/integ/command/ITTestHoodieSyncCommand.java ## @@ -52,7 +52,7 @@ public void testValidateSync() thr

[GitHub] [hudi] leesf commented on a change in pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-10-08 Thread GitBox
leesf commented on a change in pull request #2111: URL: https://github.com/apache/hudi/pull/2111#discussion_r501661275 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/TestHoodieClientOnCopyOnWriteStorage.java ## @@ -743,47 +743,37 @@ public voi

[GitHub] [hudi] leesf commented on a change in pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-10-08 Thread GitBox
leesf commented on a change in pull request #2111: URL: https://github.com/apache/hudi/pull/2111#discussion_r501676578 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/action/commit/TestUpsertPartitioner.java ## @@ -286,8 +303,48 @@ public void t

[GitHub] [hudi] leesf commented on a change in pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-10-08 Thread GitBox
leesf commented on a change in pull request #2111: URL: https://github.com/apache/hudi/pull/2111#discussion_r501676468 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/action/commit/TestUpsertPartitioner.java ## @@ -286,8 +303,48 @@ public void t

[GitHub] [hudi] leesf commented on a change in pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-10-08 Thread GitBox
leesf commented on a change in pull request #2111: URL: https://github.com/apache/hudi/pull/2111#discussion_r501675500 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/action/commit/TestUpsertPartitioner.java ## @@ -286,8 +303,48 @@ public void t

[GitHub] [hudi] leesf commented on a change in pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-10-08 Thread GitBox
leesf commented on a change in pull request #2111: URL: https://github.com/apache/hudi/pull/2111#discussion_r501675165 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/action/commit/TestUpsertPartitioner.java ## @@ -286,8 +303,48 @@ public void t

[GitHub] [hudi] leesf commented on a change in pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-10-08 Thread GitBox
leesf commented on a change in pull request #2111: URL: https://github.com/apache/hudi/pull/2111#discussion_r501659751 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java ## @@ -112,16 +115,17 @@ private void ass

[GitHub] [hudi] leesf commented on a change in pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-10-08 Thread GitBox
leesf commented on a change in pull request #2111: URL: https://github.com/apache/hudi/pull/2111#discussion_r501674516 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/action/commit/TestUpsertPartitioner.java ## @@ -90,14 +88,33 @@ private Upsert

[GitHub] [hudi] ashishmgofficial commented on issue #2149: Help with Reading Kafka topic written using Debezium Connector - Deltastreamer

2020-10-08 Thread GitBox
ashishmgofficial commented on issue #2149: URL: https://github.com/apache/hudi/issues/2149#issuecomment-705431204 @bvaradar Please find below the logs [log.txt](https://github.com/apache/hudi/files/5346518/log.txt)

[GitHub] [hudi] bvaradar commented on issue #2153: [SUPPORT] Failed to delete key: /.hoodie/.temp/20201006182950

2020-10-08 Thread GitBox
bvaradar commented on issue #2153: URL: https://github.com/apache/hudi/issues/2153#issuecomment-705418948 @umehrot2 : Can you shed some light here? When will EMR/S3 throw this error? Is this a server-side issue that will go away with a retry?

[GitHub] [hudi] bvaradar commented on issue #2151: [SUPPORT] How to run Periodic Compaction? Multiple Tables - When no Upserts

2020-10-08 Thread GitBox
bvaradar commented on issue #2151: URL: https://github.com/apache/hudi/issues/2151#issuecomment-705413459 @tandonraghav : The correct setup would be to keep a separate Hudi table for each Mongo collection (table), as the schema could differ across collections. You should have
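
As a rough illustration of the one-table-per-collection layout suggested above, a mixed batch of records can first be split by source collection so that each group is written to its own table. This is a sketch under stated assumptions: the field name `collection`, the record shape, and the helper `group_by_collection` are hypothetical and do not come from the thread or from any Hudi API.

```python
from collections import defaultdict


def group_by_collection(records, collection_field="collection"):
    """Split a mixed batch of records into one list per source collection.

    `records` is an iterable of dicts; `collection_field` names the
    (hypothetical) key that identifies which Mongo collection a record
    came from. Record order within each group is preserved.
    """
    grouped = defaultdict(list)
    for record in records:
        grouped[record[collection_field]].append(record)
    return dict(grouped)


# Hypothetical mixed batch, as might be read from a single shared topic.
batch = [
    {"collection": "client_a", "_id": 1, "name": "x"},
    {"collection": "client_b", "_id": 7, "total": 12.5},
    {"collection": "client_a", "_id": 2, "name": "y"},
]

per_table = group_by_collection(batch)
for table_name, rows in per_table.items():
    # Each group would then be written to its own table with its own schema.
    print(table_name, len(rows))
```

Keeping each group separate means every table sees a single schema, which is the motivation given in the comment.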

[GitHub] [hudi] bvaradar commented on issue #2149: Help with Reading Kafka topic written using Debezium Connector - Deltastreamer

2020-10-08 Thread GitBox
bvaradar commented on issue #2149: URL: https://github.com/apache/hudi/issues/2149#issuecomment-705407415 @ashishmgofficial : The exception you pasted is not the real root-cause. You should see the root-cause exceptions in the executor logs as well. It would be easy to debug if we know the

[jira] [Assigned] (HUDI-1278) Need a generic payload class which can skip late arriving data based on specific fields

2020-10-08 Thread shenh062326 (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenh062326 reassigned HUDI-1278: - Assignee: shenh062326 > Need a generic payload class which can skip late arriving data based on

[jira] [Commented] (HUDI-1278) Need a generic payload class which can skip late arriving data based on specific fields

2020-10-08 Thread shenh062326 (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210063#comment-17210063 ] shenh062326 commented on HUDI-1278: --- [~vbalaji]  I can take this if you have not start i

[GitHub] [hudi] shenh062326 commented on pull request #2085: [HUDI-1209] Properties File must be optional when running deltastreamer

2020-10-08 Thread GitBox
shenh062326 commented on pull request #2085: URL: https://github.com/apache/hudi/pull/2085#issuecomment-705389402 @vinothchandar Can you take a look at this pull request?