[jira] [Updated] (HUDI-1392) lose partition info when using spark parameter "basePath"

2020-11-24 Thread steven zhang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] steven zhang updated HUDI-1392: --- Description: Reproduce the issue with below steps:         set 

[GitHub] [hudi] quitozang closed issue #2274: [SUPPORT]

2020-11-24 Thread GitBox
quitozang closed issue #2274: URL: https://github.com/apache/hudi/issues/2274 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] shenh062326 commented on pull request #2222: [HUDI-1364] Add HoodieJavaEngineContext to hudi-java-client

2020-11-24 Thread GitBox
shenh062326 commented on pull request #: URL: https://github.com/apache/hudi/pull/#issuecomment-733449190 > @shenh062326 are you planning to follow on with a full impl of a java based client? Changes LGTM. Yes, I will add a full impl of a java based client.

[jira] [Created] (HUDI-1416) [Documentation] Documentation is confusing

2020-11-24 Thread Hemanga Borah (Jira)
Hemanga Borah created HUDI-1416: --- Summary: [Documentation] Documentation is confusing Key: HUDI-1416 URL: https://issues.apache.org/jira/browse/HUDI-1416 Project: Apache Hudi Issue Type:

[GitHub] [hudi] garyli1019 merged pull request #2243: HUDI-1392 lose partition info when using spark parameter basePath

2020-11-24 Thread GitBox
garyli1019 merged pull request #2243: URL: https://github.com/apache/hudi/pull/2243

[hudi] branch master updated: [HUDI-1392] lose partition info when using spark parameter basePath (#2243)

2020-11-24 Thread garyli
This is an automated email from the ASF dual-hosted git repository. garyli pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new 56866a1 [HUDI-1392] lose partition info when

[GitHub] [hudi] garyli1019 commented on pull request #2243: HUDI-1392 lose partition info when using spark parameter basePath

2020-11-24 Thread GitBox
garyli1019 commented on pull request #2243: URL: https://github.com/apache/hudi/pull/2243#issuecomment-733445977 @yui2010 merging. Please assign the Jira ticket to yourself and close it. If you don't have contributor access yet, please send an email with your Jira ID to the dev mailing

[GitHub] [hudi] bithw1 edited a comment on issue #2276: [SUPPORT] java.lang.IllegalStateException: No Compaction request available

2020-11-24 Thread GitBox
bithw1 edited a comment on issue #2276: URL: https://github.com/apache/hudi/issues/2276#issuecomment-733441100 The code that creates/upserts the table is as follows. I have explicitly specified the following two lines to disable compaction.

[GitHub] [hudi] bithw1 commented on issue #2276: [SUPPORT] java.lang.IllegalStateException: No Compaction request available

2020-11-24 Thread GitBox
bithw1 commented on issue #2276: URL: https://github.com/apache/hudi/issues/2276#issuecomment-733441100 The code that creates/upserts the table is as follows. I have explicitly specified the following two lines to disable compaction.

[GitHub] [hudi] bithw1 commented on issue #2276: [SUPPORT] java.lang.IllegalStateException: No Compaction request available

2020-11-24 Thread GitBox
bithw1 commented on issue #2276: URL: https://github.com/apache/hudi/issues/2276#issuecomment-733439132 Thanks @bvaradar , The files on hdfs are: ``` 0 2020-11-22 10:00 /data/hudi_demo/hudi_hive_read_write_mor_5/.hoodie/.aux 0 2020-11-22 10:01

[jira] [Assigned] (HUDI-981) Use rocksDB as flink state backend

2020-11-24 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangxianghu reassigned HUDI-981: Assignee: chijunqing (was: wangxianghu) > Use rocksDB as flink state backend >

[GitHub] [hudi] SteNicholas commented on pull request #2111: [HUDI-1234] Insert new records regardless of small file when using insert operation

2020-11-24 Thread GitBox
SteNicholas commented on pull request #2111: URL: https://github.com/apache/hudi/pull/2111#issuecomment-733426822 > @SteNicholas still interested in driving this forward? @vinothchandar, yes, I have discussed this with @leesf offline. It should be completed this week.

[GitHub] [hudi] asharma4-lucid commented on issue #2269: [SUPPORT] - HUDI Table Bulk Insert for 5 gb parquet file progressively taking longer time to insert.

2020-11-24 Thread GitBox
asharma4-lucid commented on issue #2269: URL: https://github.com/apache/hudi/issues/2269#issuecomment-733323629 Yes this is a COW table.

[GitHub] [hudi] bvaradar commented on issue #2277: [SUPPORT]

2020-11-24 Thread GitBox
bvaradar commented on issue #2277: URL: https://github.com/apache/hudi/issues/2277#issuecomment-733305873 @umehrot2 : Can you please take a look at this ?

[GitHub] [hudi] bvaradar commented on issue #2276: [SUPPORT] java.lang.IllegalStateException: No Compaction request available

2020-11-24 Thread GitBox
bvaradar commented on issue #2276: URL: https://github.com/apache/hudi/issues/2276#issuecomment-733304908 You can use hudi-cli and run "compactions show all" to list compactions and find the timestamp of one that is pending. Another option is to list the .hoodie folder and find all the

[GitHub] [hudi] bvaradar commented on issue #2269: [SUPPORT] - HUDI Table Bulk Insert for 5 gb parquet file progressively taking longer time to insert.

2020-11-24 Thread GitBox
bvaradar commented on issue #2269: URL: https://github.com/apache/hudi/issues/2269#issuecomment-733299492 @asharma4-lucid: ~5 hrs is way too much. Can you disable cleaning using the config hoodie.clean.automatic=false and try again? Is this a COW table?

[GitHub] [hudi] vinothchandar commented on pull request #2208: [HUDI-1040] Make Hudi support Spark 3

2020-11-24 Thread GitBox
vinothchandar commented on pull request #2208: URL: https://github.com/apache/hudi/pull/2208#issuecomment-733204015 @giaosudau that seems like a JVM crash. Not sure what in this PR could cause that. Do you have more diagnostic info?

[GitHub] [hudi] asharma4-lucid commented on issue #2269: [SUPPORT] - HUDI Table Bulk Insert for 5 gb parquet file progressively taking longer time to insert.

2020-11-24 Thread GitBox
asharma4-lucid commented on issue #2269: URL: https://github.com/apache/hudi/issues/2269#issuecomment-733174238 Thanks @bvaradar. I tried to insert just 5 records into the existing table with ~300K partitions and it took close to 5 hrs. If I insert ~5 records into a new table it takes less

[GitHub] [hudi] codecov-io edited a comment on pull request #2278: [HUDI-1412] Make HoodieWriteConfig support setting different default …

2020-11-24 Thread GitBox
codecov-io edited a comment on pull request #2278: URL: https://github.com/apache/hudi/pull/2278#issuecomment-733020702 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2278?src=pr=h1) Report > Merging [#2278](https://codecov.io/gh/apache/hudi/pull/2278?src=pr=desc) (12b85dc) into

[GitHub] [hudi] wangxianghu commented on pull request #2271: [WIP][HUDI-1335] Introduce FlinkHoodieSimpleIndex to hudi-flink-client

2020-11-24 Thread GitBox
wangxianghu commented on pull request #2271: URL: https://github.com/apache/hudi/pull/2271#issuecomment-733026404 blocked by https://github.com/apache/hudi/pull/2278

[GitHub] [hudi] wangxianghu removed a comment on pull request #2271: [WIP][HUDI-1335] Introduce FlinkHoodieSimpleIndex to hudi-flink-client

2020-11-24 Thread GitBox
wangxianghu removed a comment on pull request #2271: URL: https://github.com/apache/hudi/pull/2271#issuecomment-733023377 blocked by https://github.com/apache/hudi/pull/2278

[GitHub] [hudi] codecov-io edited a comment on pull request #2278: [HUDI-1412] Make HoodieWriteConfig support setting different default …

2020-11-24 Thread GitBox
codecov-io edited a comment on pull request #2278: URL: https://github.com/apache/hudi/pull/2278#issuecomment-733020702 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2278?src=pr=h1) Report > Merging [#2278](https://codecov.io/gh/apache/hudi/pull/2278?src=pr=desc) (40c6d23) into

[GitHub] [hudi] wangxianghu commented on pull request #2271: [WIP][HUDI-1335] Introduce FlinkHoodieSimpleIndex to hudi-flink-client

2020-11-24 Thread GitBox
wangxianghu commented on pull request #2271: URL: https://github.com/apache/hudi/pull/2271#issuecomment-733023377 blocked by https://github.com/apache/hudi/pull/2278

[GitHub] [hudi] wangxianghu commented on pull request #2278: [HUDI-1412] Make HoodieWriteConfig support setting different default …

2020-11-24 Thread GitBox
wangxianghu commented on pull request #2278: URL: https://github.com/apache/hudi/pull/2278#issuecomment-733022202 @yanghua please take a look when free

[GitHub] [hudi] codecov-io commented on pull request #2278: [HUDI-1412] Make HoodieWriteConfig support setting different default …

2020-11-24 Thread GitBox
codecov-io commented on pull request #2278: URL: https://github.com/apache/hudi/pull/2278#issuecomment-733020702 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2278?src=pr=h1) Report > Merging [#2278](https://codecov.io/gh/apache/hudi/pull/2278?src=pr=desc) (12b85dc) into

[jira] [Updated] (HUDI-1412) Make HoodieWriteConfig support setting different default value according to engine type

2020-11-24 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1412: - Labels: pull-request-available (was: ) > Make HoodieWriteConfig support setting different

[GitHub] [hudi] wangxianghu opened a new pull request #2278: [HUDI-1412] Make HoodieWriteConfig support setting different default …

2020-11-24 Thread GitBox
wangxianghu opened a new pull request #2278: URL: https://github.com/apache/hudi/pull/2278 …value according to engine type ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull

[jira] [Updated] (HUDI-1412) Make HoodieWriteConfig support setting different default value according to engine type

2020-11-24 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangxianghu updated HUDI-1412: -- Description: Currently, `HoodieIndexConfig` sets its default index type to bloom, which is suitable for

[jira] [Updated] (HUDI-1412) Make HoodieWriteConfig support setting different default value according to engine type

2020-11-24 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangxianghu updated HUDI-1412: -- Summary: Make HoodieWriteConfig support setting different default value according to engine type (was:

[GitHub] [hudi] codecov-io edited a comment on pull request #2216: [HUDI-1357] Added a check to ensure no records are lost during updates.

2020-11-24 Thread GitBox
codecov-io edited a comment on pull request #2216: URL: https://github.com/apache/hudi/pull/2216#issuecomment-729776111 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2216?src=pr=h1) Report > Merging [#2216](https://codecov.io/gh/apache/hudi/pull/2216?src=pr=desc) (c8f05c9) into

[jira] [Commented] (HUDI-1414) HoodieInputFormat support for bucketed partitions

2020-11-24 Thread linshan-ma (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17238081#comment-17238081 ] linshan-ma commented on HUDI-1414: -- I'm interested in this ticket. I want to try it. > HoodieInputFormat

[jira] [Assigned] (HUDI-1414) HoodieInputFormat support for bucketed partitions

2020-11-24 Thread linshan-ma (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] linshan-ma reassigned HUDI-1414: Assignee: linshan-ma > HoodieInputFormat support for bucketed partitions >

[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2242: [HUDI-1366] Make deltasteamer support exporting data from hdfs to hudi

2020-11-24 Thread GitBox
liujinhui1994 commented on a change in pull request #2242: URL: https://github.com/apache/hudi/pull/2242#discussion_r52946 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java ## @@ -522,14 +523,18 @@ public static

[GitHub] [hudi] santas-little-helper-13 opened a new issue #2277: [SUPPORT]

2020-11-24 Thread GitBox
santas-little-helper-13 opened a new issue #2277: URL: https://github.com/apache/hudi/issues/2277 Hi, I am working with hudi in AWS Glue. I have a problem with hudi updates. So I have one Glue job that inserts data into hudi parquet files, it reads data from glue table, does