[GitHub] [hudi] codecov-io commented on pull request #2634: [HUDI-1662] Failed to query real-time view use hive/spark-sql when hudi mor table contains dateType

2021-03-04 Thread GitBox
codecov-io commented on pull request #2634: URL: https://github.com/apache/hudi/pull/2634#issuecomment-791203944 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2634?src=pr&el=h1) Report > Merging [#2634](https://codecov.io/gh/apache/hudi/pull/2634?src=pr&el=desc) (ccf9a8f) into [ma

[GitHub] [hudi] garyli1019 commented on pull request #2634: [HUDI-1662] Failed to query real-time view use hive/spark-sql when hudi mor table contains dateType

2021-03-04 Thread GitBox
garyli1019 commented on pull request #2634: URL: https://github.com/apache/hudi/pull/2634#issuecomment-791202125 @xiarixiaoyao your force push already triggered the CI. Do you mean JIRA contributor access? If so, would you send an email to the dev mailing list with your JIRA ID? That's how

[GitHub] [hudi] xiarixiaoyao commented on pull request #2634: [HUDI-1662] Failed to query real-time view use hive/spark-sql when hudi mor table contains dateType

2021-03-04 Thread GitBox
xiarixiaoyao commented on pull request #2634: URL: https://github.com/apache/hudi/pull/2634#issuecomment-791188663 Thank you @garyli1019, but I don't have permission to trigger CI. Could you help? BTW, could you give me contributor permission? Thank you.

[GitHub] [hudi] satishkotha commented on a change in pull request #2635: [WIP] [DO NOT MERGE] sample code for measuring hfile performance with column ranges

2021-03-04 Thread GitBox
satishkotha commented on a change in pull request #2635: URL: https://github.com/apache/hudi/pull/2635#discussion_r588059204 ## File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/io/storage/TestHoodieFileWriterFactory.java ## @@ -64,4 +80,246 @@ public void

[GitHub] [hudi] satishkotha opened a new pull request #2635: [WIP] [DO NOT MERGE] sample code for measuring hfile performance with column ranges

2021-03-04 Thread GitBox
satishkotha opened a new pull request #2635: URL: https://github.com/apache/hudi/pull/2635 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose of the

[jira] [Updated] (HUDI-1636) Support Builder Pattern To Build Table Properties For HoodieTableConfig

2021-03-04 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-1636: --- Fix Version/s: 0.8.0 > Support Builder Pattern To Build Table Properties For HoodieTableConfig > ---

[hudi] branch master updated: [HUDI-1636] Support Builder Pattern To Build Table Properties For HoodieTableConfig (#2596)

2021-03-04 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new bc883db [HUDI-1636] Support Builder Pattern To

[jira] [Closed] (HUDI-1636) Support Builder Pattern To Build Table Properties For HoodieTableConfig

2021-03-04 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-1636. -- Resolution: Done > Support Builder Pattern To Build Table Properties For HoodieTableConfig > -

[GitHub] [hudi] yanghua merged pull request #2596: [HUDI-1636] Support Builder Pattern To Build Table Properties For HoodieTableConfig

2021-03-04 Thread GitBox
yanghua merged pull request #2596: URL: https://github.com/apache/hudi/pull/2596 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] yanghua commented on a change in pull request #2596: [HUDI-1636] Support Builder Pattern To Build Table Properties For HoodieTableConfig

2021-03-04 Thread GitBox
yanghua commented on a change in pull request #2596: URL: https://github.com/apache/hudi/pull/2596#discussion_r588054717 ## File path: hudi-examples/src/main/java/org/apache/hudi/examples/java/HoodieJavaWriteClientExample.java ## @@ -72,8 +72,11 @@ public static void main(Stri

[hudi] branch master updated: [HUDI-1655] Support custom date format and fix unsupported exception in DatePartitionPathSelector (#2621)

2021-03-04 Thread xushiyan
xushiyan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new f53bca4 [HUDI-1655] Support custom date format

[GitHub] [hudi] xushiyan merged pull request #2621: [HUDI-1655] Support custom date format and fix unsupported exception in DatePartitionPathSelector

2021-03-04 Thread GitBox
xushiyan merged pull request #2621: URL: https://github.com/apache/hudi/pull/2621

[GitHub] [hudi] codecov-io edited a comment on pull request #2625: [1568] Fixing spark3 bundles

2021-03-04 Thread GitBox
codecov-io edited a comment on pull request #2625: URL: https://github.com/apache/hudi/pull/2625#issuecomment-789780238 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2625?src=pr&el=h1) Report > Merging [#2625](https://codecov.io/gh/apache/hudi/pull/2625?src=pr&el=desc) (369d44f) in

[GitHub] [hudi] nsivabalan edited a comment on pull request #2625: [1568] Fixing spark3 bundles

2021-03-04 Thread GitBox
nsivabalan edited a comment on pull request #2625: URL: https://github.com/apache/hudi/pull/2625#issuecomment-789746642 CC @vinothchandar @garyli1019 @bvaradar

[GitHub] [hudi] nsivabalan removed a comment on pull request #2625: [1568] Adding hudi-spark3-bundle

2021-03-04 Thread GitBox
nsivabalan removed a comment on pull request #2625: URL: https://github.com/apache/hudi/pull/2625#issuecomment-789746287 @umehrot2 @zhedoubushishi : Can you folks help me out here. I see we have added support for spark3 [here](https://github.com/apache/hudi/pull/2208). Did we test the bund

[jira] [Created] (HUDI-1666) Refactor BaseCleanActionExecutor to return List

2021-03-04 Thread Nishith Agarwal (Jira)
Nishith Agarwal created HUDI-1666: - Summary: Refactor BaseCleanActionExecutor to return List Key: HUDI-1666 URL: https://issues.apache.org/jira/browse/HUDI-1666 Project: Apache Hudi Issue Ty

[GitHub] [hudi] xiarixiaoyao edited a comment on pull request #2634: [HUDI-1662] Failed to query real-time view use hive/spark-sql when hudi mor table contains dateType

2021-03-04 Thread GitBox
xiarixiaoyao edited a comment on pull request #2634: URL: https://github.com/apache/hudi/pull/2634#issuecomment-791119070 cc @garyli1019 , could you take a look

[GitHub] [hudi] xiarixiaoyao edited a comment on pull request #2634: [HUDI-1662] Failed to query real-time view use hive/spark-sql when hudi mor table contains dateType

2021-03-04 Thread GitBox
xiarixiaoyao edited a comment on pull request #2634: URL: https://github.com/apache/hudi/pull/2634#issuecomment-791119070

[GitHub] [hudi] xiarixiaoyao commented on pull request #2634: [HUDI-1662] Failed to query real-time view use hive/spark-sql when hudi mor table contains dateType

2021-03-04 Thread GitBox
xiarixiaoyao commented on pull request #2634: URL: https://github.com/apache/hudi/pull/2634#issuecomment-791119070 cc @garyli1019

[jira] [Created] (HUDI-1665) Remove autoCommit option from BaseCommitActionExecutor

2021-03-04 Thread Nishith Agarwal (Jira)
Nishith Agarwal created HUDI-1665: - Summary: Remove autoCommit option from BaseCommitActionExecutor Key: HUDI-1665 URL: https://issues.apache.org/jira/browse/HUDI-1665 Project: Apache Hudi Is

[jira] [Assigned] (HUDI-1486) Remove pending rollback and move to cleaner

2021-03-04 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal reassigned HUDI-1486: - Assignee: Nishith Agarwal > Remove pending rollback and move to cleaner > ---

[jira] [Updated] (HUDI-1662) Failed to query real-time view use hive/spark-sql when hudi mor table contains dateType

2021-03-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1662: - Labels: pull-request-available (was: ) > Failed to query real-time view use hive/spark-sql when

[jira] [Updated] (HUDI-1486) Remove pending rollback and move to cleaner

2021-03-04 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal updated HUDI-1486: -- Status: Patch Available (was: In Progress) > Remove pending rollback and move to cleaner >

[jira] [Updated] (HUDI-1486) Remove pending rollback and move to cleaner

2021-03-04 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal updated HUDI-1486: -- Status: Open (was: New) > Remove pending rollback and move to cleaner > ---

[jira] [Updated] (HUDI-1486) Remove pending rollback and move to cleaner

2021-03-04 Thread Nishith Agarwal (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishith Agarwal updated HUDI-1486: -- Status: In Progress (was: Open) > Remove pending rollback and move to cleaner > ---

[GitHub] [hudi] xiarixiaoyao opened a new pull request #2634: [HUDI-1662] Failed to query real-time view use hive/spark-sql when hudi mor table contains dateType

2021-03-04 Thread GitBox
xiarixiaoyao opened a new pull request #2634: URL: https://github.com/apache/hudi/pull/2634 ## What is the purpose of th

[jira] [Updated] (HUDI-1655) Support custom date format and fix unsupported exception in DatePartitionPathSelector

2021-03-04 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1655: - Description: Add a config to allow parsing custom date format in  {{DatePartitionPathSelector}}. Currently

[jira] [Updated] (HUDI-1655) Support custom date format and fix unsupported exception in DatePartitionPathSelector

2021-03-04 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1655: - Summary: Support custom date format and fix unsupported exception in DatePartitionPathSelector (was: Allo

[GitHub] [hudi] xushiyan commented on pull request #2621: [HUDI-1655] Support custom date format and fix unsupported exception in DatePartitionPathSelector

2021-03-04 Thread GitBox
xushiyan commented on pull request #2621: URL: https://github.com/apache/hudi/pull/2621#issuecomment-791115825 > @xushiyan It would be also good to change the title of the PR to add more information about fixing the bug? @yanghua ok done. Pls check. Thanks.

[jira] [Updated] (HUDI-1655) Allow custom date format in DatePartitionPathSelector

2021-03-04 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-1655: - Description: Add a config to allow parsing custom date format in  {{DatePartitionPathSelector}}. Currently

[GitHub] [hudi] yanghua commented on pull request #2621: [HUDI-1655] Allow custom date format in DatePartitionPathSelector

2021-03-04 Thread GitBox
yanghua commented on pull request #2621: URL: https://github.com/apache/hudi/pull/2621#issuecomment-791113338 @xushiyan It would be also good to change the title of the PR to add more information about fixing the bug?

[GitHub] [hudi] yanghua commented on a change in pull request #2621: [HUDI-1655] Allow custom date format in DatePartitionPathSelector

2021-03-04 Thread GitBox
yanghua commented on a change in pull request #2621: URL: https://github.com/apache/hudi/pull/2621#discussion_r587994448 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/DatePartitionPathSelector.java ## @@ -130,20 +140,19 @@ public DatePart

[GitHub] [hudi] root18039532923 commented on issue #2614: Caused by: com.fasterxml.jackson.core.JsonParseException: Unrecognized token

2021-03-04 Thread GitBox
root18039532923 commented on issue #2614: URL: https://github.com/apache/hudi/issues/2614#issuecomment-791105284 I need the jar built from the patch you opened in order to test it, but I am on an internal network. @satishkotha

[jira] [Updated] (HUDI-1662) Failed to query real-time view use hive/spark-sql when hudi mor table contains dateType

2021-03-04 Thread tao meng (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tao meng updated HUDI-1662: --- Description: step1: prepare raw DataFrame with DateType, and insert it to HudiMorTable df_raw.withColumn("dat

[jira] [Closed] (HUDI-1664) Streaming read for Flink MOR table

2021-03-04 Thread Danny Chen (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen closed HUDI-1664. Resolution: Duplicate > Streaming read for Flink MOR table > -- > >

[jira] [Created] (HUDI-1663) Streaming read for Flink MOR table

2021-03-04 Thread Danny Chen (Jira)
Danny Chen created HUDI-1663: Summary: Streaming read for Flink MOR table Key: HUDI-1663 URL: https://issues.apache.org/jira/browse/HUDI-1663 Project: Apache Hudi Issue Type: Sub-task C

[jira] [Created] (HUDI-1664) Streaming read for Flink MOR table

2021-03-04 Thread Danny Chen (Jira)
Danny Chen created HUDI-1664: Summary: Streaming read for Flink MOR table Key: HUDI-1664 URL: https://issues.apache.org/jira/browse/HUDI-1664 Project: Apache Hudi Issue Type: Sub-task C

[jira] [Created] (HUDI-1662) Failed to query real-time view use hive/spark-sql when hudi mor table contains dateType

2021-03-04 Thread tao meng (Jira)
tao meng created HUDI-1662: -- Summary: Failed to query real-time view use hive/spark-sql when hudi mor table contains dateType Key: HUDI-1662 URL: https://issues.apache.org/jira/browse/HUDI-1662 Project: Apa

[hudi] branch master updated (89003bc -> 7cc75e0)

2021-03-04 Thread nagarwal
nagarwal pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 89003bc [HUDI-1647] Supports snapshot read for Flink (#2613) add 7cc75e0 [HUDI-1646] Provide mechanism to read

[GitHub] [hudi] n3nash merged pull request #2611: [HUDI-1646] Provide mechanism to read uncommitted data through InputFormat

2021-03-04 Thread GitBox
n3nash merged pull request #2611: URL: https://github.com/apache/hudi/pull/2611

[GitHub] [hudi] xushiyan commented on a change in pull request #2621: [HUDI-1655] Allow custom date format in DatePartitionPathSelector

2021-03-04 Thread GitBox
xushiyan commented on a change in pull request #2621: URL: https://github.com/apache/hudi/pull/2621#discussion_r587949226 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/DatePartitionPathSelector.java ## @@ -130,20 +140,19 @@ public DatePar

[jira] [Closed] (HUDI-1647) Supports snapshot read for Flink

2021-03-04 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang closed HUDI-1647. -- Resolution: Implemented Implemented via master branch: 89003bc7801b035b5be31c76bfbf691bfcf9081a > Supports snap

[jira] [Updated] (HUDI-1647) Supports snapshot read for Flink

2021-03-04 Thread vinoyang (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated HUDI-1647: --- Fix Version/s: 0.8.0 > Supports snapshot read for Flink > > >

[GitHub] [hudi] yanghua merged pull request #2613: [HUDI-1647] Supports snapshot read for Flink

2021-03-04 Thread GitBox
yanghua merged pull request #2613: URL: https://github.com/apache/hudi/pull/2613

[hudi] branch master updated (899ae70 -> 89003bc)

2021-03-04 Thread vinoyang
vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 899ae70 [HUDI-1587] Add latency and freshness support (#2541) add 89003bc [HUDI-1647] Supports snapshot read f

[hudi] branch asf-site updated: Travis CI build asf-site

2021-03-04 Thread vinoth
vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new a2efa69 Travis CI build asf-site a2efa69 is d

[hudi] branch asf-site updated: [DOCS] Update 0_4_docker_demo.cn.md (#2629)

2021-03-04 Thread vinoth
vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 2619964 [DOCS] Update 0_4_docker_demo.cn.md (

[GitHub] [hudi] vinothchandar merged pull request #2629: [DOCS] Fix docs hive_sync path

2021-03-04 Thread GitBox
vinothchandar merged pull request #2629: URL: https://github.com/apache/hudi/pull/2629

[GitHub] [hudi] bvaradar commented on issue #2620: [SUPPORT] Performance Tuning: Slow stages (Building Workload Profile & Getting Small files from partitions) during Hudi Writes

2021-03-04 Thread GitBox
bvaradar commented on issue #2620: URL: https://github.com/apache/hudi/issues/2620#issuecomment-790975178 @nsivabalan is looking into this.

[GitHub] [hudi] umehrot2 edited a comment on issue #2592: [SUPPORT] Does latest versions of Hudi (0.7.0, 0.6.0) work with Spark 2.3.0 when reading orc files?

2021-03-04 Thread GitBox
umehrot2 edited a comment on issue #2592: URL: https://github.com/apache/hudi/issues/2592#issuecomment-790955939 @codejoyan I think the problem arises because you are using `org.apache.spark:spark-avro_2.11:2.4.4` in your packages with spark-submit. This is incompatible with `spark-sql 2.3.0

[GitHub] [hudi] umehrot2 commented on issue #2592: [SUPPORT] Does latest versions of Hudi (0.7.0, 0.6.0) work with Spark 2.3.0 when reading orc files?

2021-03-04 Thread GitBox
umehrot2 commented on issue #2592: URL: https://github.com/apache/hudi/issues/2592#issuecomment-790955939 @codejoyan I think the problem arises because you are using `org.apache.spark:spark-avro_2.11:2.4.4` in your packages with spark-submit. This is incompatible with `spark-sql 2.3.0` that

[GitHub] [hudi] satishkotha commented on issue #2589: [SUPPORT] Issue with adding column while running deltastreamer with kafka source.

2021-03-04 Thread GitBox
satishkotha commented on issue #2589: URL: https://github.com/apache/hudi/issues/2589#issuecomment-790942436 yeah, column deletions are not supported today. You can consider making all columns optional and continue writing null for fields that you want to delete. @bvaradar Could you
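
The workaround suggested above (keep every column, make it optional, and write null for the ones you no longer want) can be sketched on an Avro-style record schema. This is only a minimal illustration and not Hudi code: the helper `make_all_fields_optional` and the sample `user` schema are hypothetical names introduced for this example.

```python
import copy
import json


def make_all_fields_optional(schema: dict) -> dict:
    """Return a copy of an Avro record schema in which every field is nullable.

    A field becomes optional by wrapping its type in a union whose first
    branch is "null" and by giving it a null default.
    """
    result = copy.deepcopy(schema)
    for field in result["fields"]:
        field_type = field["type"]
        if isinstance(field_type, list):
            # Already a union: move "null" first so a null default is valid.
            branches = [t for t in field_type if t != "null"]
            field["type"] = ["null"] + branches
        else:
            field["type"] = ["null", field_type]
        field["default"] = None
    return result


# Hypothetical schema, used only for illustration.
user_schema = {
    "type": "record",
    "name": "user",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "email", "type": "string"},
        {"name": "age", "type": ["int", "null"]},
    ],
}

optional_schema = make_all_fields_optional(user_schema)

# A record that "deletes" the email column by writing null for it.
record = {"id": "u1", "email": None, "age": 30}

print(json.dumps(optional_schema, indent=2))
```

In practice the record key and partition fields would probably stay required; the sketch makes everything optional only because that is what the comment literally suggests.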

[GitHub] [hudi] codecov-io edited a comment on pull request #2632: [HUDI-1661] Exclude clustering commits from TimelineUtils API

2021-03-04 Thread GitBox
codecov-io edited a comment on pull request #2632: URL: https://github.com/apache/hudi/pull/2632#issuecomment-790858581 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2632?src=pr&el=h1) Report > Merging [#2632](https://codecov.io/gh/apache/hudi/pull/2632?src=pr&el=desc) (52706ea) in

[GitHub] [hudi] satishkotha commented on issue #2614: Caused by: com.fasterxml.jackson.core.JsonParseException: Unrecognized token

2021-03-04 Thread GitBox
satishkotha commented on issue #2614: URL: https://github.com/apache/hudi/issues/2614#issuecomment-790940752 @root18039532923 Did you get a chance to try this patch? We can merge patch after your testing looks good.

[GitHub] [hudi] satishkotha commented on a change in pull request #2627: [HUDI-1653] Add support for composite keys in NonpartitionedKeyGenerator

2021-03-04 Thread GitBox
satishkotha commented on a change in pull request #2627: URL: https://github.com/apache/hudi/pull/2627#discussion_r587802720 ## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/keygen/NonpartitionedKeyGenerator.java ## @@ -20,20 +20,32 @@ import org.ap

[GitHub] [hudi] vinothchandar edited a comment on issue #2633: Empty File Slice causing application to fail in small files optimization code

2021-03-04 Thread GitBox
vinothchandar edited a comment on issue #2633: URL: https://github.com/apache/hudi/issues/2633#issuecomment-790926531 @n3nash so this seems to corroborate udit's finding then. cc @bvaradar as well, who can comment on the suggested fix.

[GitHub] [hudi] vinothchandar commented on issue #2633: Empty File Slice causing application to fail in small files optimization code

2021-03-04 Thread GitBox
vinothchandar commented on issue #2633: URL: https://github.com/apache/hudi/issues/2633#issuecomment-790926531 cc @bvaradar as well, who can confirm.

[GitHub] [hudi] n3nash commented on issue #2633: Empty File Slice causing application to fail in small files optimization code

2021-03-04 Thread GitBox
n3nash commented on issue #2633: URL: https://github.com/apache/hudi/issues/2633#issuecomment-790917128 @umehrot2 From what you are describing, it looks like a bug. When we configure HbaseIndex, we automatically assume ``` public boolean canIndexLogFiles() { return true;

[GitHub] [hudi] modi95 commented on pull request #2628: [HUDI-1635] Improvements to Hudi Test Suite

2021-03-04 Thread GitBox
modi95 commented on pull request #2628: URL: https://github.com/apache/hudi/pull/2628#issuecomment-790916909 @nbalajee these changes look good to me. Can try doing the following: - Run schema evolution test suite in the docker setup (step-by-step guide available in test-suite readme)

[GitHub] [hudi] umehrot2 commented on issue #2633: Empty File Slice causing application to fail in small files optimization code

2021-03-04 Thread GitBox
umehrot2 commented on issue #2633: URL: https://github.com/apache/hudi/issues/2633#issuecomment-790891245 @bvaradar can you confirm that the finding is correct, since it seems you worked on that file system view implementation? Also cc @n3nash @vinothchandar

[GitHub] [hudi] umehrot2 opened a new issue #2633: Empty File Slice causing application to fail in small files optimization code

2021-03-04 Thread GitBox
umehrot2 opened a new issue #2633: URL: https://github.com/apache/hudi/issues/2633 **Describe the problem you faced** IHAC who is using Hudi's `Spark structured streaming sink` with `asynchronous compaction` and `Hbase Index` on EMR. The Hudi version being used is 0.6.0. After a whi

[GitHub] [hudi] vinothchandar commented on a change in pull request #2374: [HUDI-845] Added locking capability to allow multiple writers

2021-03-04 Thread GitBox
vinothchandar commented on a change in pull request #2374: URL: https://github.com/apache/hudi/pull/2374#discussion_r587049910 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java ## @@ -30,16 +31,19 @@ import org.ap

[GitHub] [hudi] codecov-io commented on pull request #2632: [HUDI-1661] Exclude clustering commits from TimelineUtils API

2021-03-04 Thread GitBox
codecov-io commented on pull request #2632: URL: https://github.com/apache/hudi/pull/2632#issuecomment-790858581 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2632?src=pr&el=h1) Report > Merging [#2632](https://codecov.io/gh/apache/hudi/pull/2632?src=pr&el=desc) (fd53720) into [ma

[jira] [Updated] (HUDI-1661) Change utility methods that help get extra metadata to ignore internal rewrite commits

2021-03-04 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1661: - Labels: pull-request-available (was: ) > Change utility methods that help get extra metadata to i

[GitHub] [hudi] satishkotha opened a new pull request #2632: [HUDI-1661] Exclude clustering commits from TimelineUtils API

2021-03-04 Thread GitBox
satishkotha opened a new pull request #2632: URL: https://github.com/apache/hudi/pull/2632 ## What is the purpose of the pull request Exclude internal rewrite commits, such as clustering commits, from the getExtraMetadataFromLatest API ## Brief change log getExtraMetadataFromLatest AP
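
The idea of skipping internal rewrite commits when looking up extra metadata can be sketched as follows. This is a simplified model under stated assumptions, not the `TimelineUtils` code: the `Commit` type, the `operation` labels, and the `REWRITE_OPERATIONS` set are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical labels for commits that only rewrite existing data.
REWRITE_OPERATIONS = {"cluster"}


@dataclass
class Commit:
    instant_time: str  # sortable timestamp string (assumed format)
    operation: str     # e.g. "upsert" or "cluster" (assumed labels)
    extra_metadata: Dict[str, str] = field(default_factory=dict)


def extra_metadata_from_latest(commits: List[Commit], key: str) -> Optional[str]:
    """Return the value stored under `key` in the latest non-rewrite commit."""
    candidates = [c for c in commits if c.operation not in REWRITE_OPERATIONS]
    if not candidates:
        return None
    latest = max(candidates, key=lambda c: c.instant_time)
    return latest.extra_metadata.get(key)


timeline = [
    Commit("001", "upsert", {"checkpoint": "a"}),
    Commit("002", "cluster"),  # internal rewrite, carries no user metadata
]
latest_checkpoint = extra_metadata_from_latest(timeline, "checkpoint")
```

Without the filter, the lookup would land on the clustering commit and return nothing, even though the last user write did record a checkpoint.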

[jira] [Created] (HUDI-1661) Change utility methods that help get extra metadata to ignore internal rewrite commits

2021-03-04 Thread satish (Jira)
satish created HUDI-1661: Summary: Change utility methods that help get extra metadata to ignore internal rewrite commits Key: HUDI-1661 URL: https://issues.apache.org/jira/browse/HUDI-1661 Project: Apache Hu

[GitHub] [hudi] n3nash commented on pull request #2628: [HUDI-1635] Improvements to Hudi Test Suite

2021-03-04 Thread GitBox
n3nash commented on pull request #2628: URL: https://github.com/apache/hudi/pull/2628#issuecomment-790821135 @modi95 can you review this ?

[GitHub] [hudi] codecov-io edited a comment on pull request #2611: [HUDI-1646] Provide mechanism to read uncommitted data through InputFormat

2021-03-04 Thread GitBox
codecov-io edited a comment on pull request #2611: URL: https://github.com/apache/hudi/pull/2611#issuecomment-787641189 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2611?src=pr&el=h1) Report > Merging [#2611](https://codecov.io/gh/apache/hudi/pull/2611?src=pr&el=desc) (427523d) in

[GitHub] [hudi] n3nash opened a new pull request #2631: [HUDI 1660] Excluding compaction and clustering instants from inflight rollback

2021-03-04 Thread GitBox
n3nash opened a new pull request #2631: URL: https://github.com/apache/hudi/pull/2631 ## What is the purpose of the pull request *This PR fixes a bug to ensure that pending compaction & clustering operations are always excluded when performing inflight rollbacks* ## Committer

[jira] [Created] (HUDI-1660) Exclude pending compaction & clustering from rollback

2021-03-04 Thread Nishith Agarwal (Jira)
Nishith Agarwal created HUDI-1660: - Summary: Exclude pending compaction & clustering from rollback Key: HUDI-1660 URL: https://issues.apache.org/jira/browse/HUDI-1660 Project: Apache Hudi Iss

[jira] [Updated] (HUDI-1656) Loading history data to new hudi table taking longer time

2021-03-04 Thread Fredrick jose antony cruz (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fredrick jose antony cruz updated HUDI-1656: Description: spark-submit --jars /u/users/svcordrdats/order_hudi_poc/hudi-s

[jira] [Updated] (HUDI-1659) Support DDL And Insert For Hudi In Spark Sql

2021-03-04 Thread pengzhiwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] pengzhiwei updated HUDI-1659: - Summary: Support DDL And Insert For Hudi In Spark Sql (was: DDL Support For Hudi In Spark Sql) > Suppor

[GitHub] [hudi] rswagatika commented on issue #2564: Hoodie clean is not deleting old files

2021-03-04 Thread GitBox
rswagatika commented on issue #2564: URL: https://github.com/apache/hudi/issues/2564#issuecomment-790707124 @bvaradar Hi, I was able to get the files. I am still trying to get the driver log and will provide it once I have it. Let me know if this is what you meant by a recursive S3 dataset. 2021-

[jira] [Created] (HUDI-1659) DDL Support For Hudi In Spark Sql

2021-03-04 Thread pengzhiwei (Jira)
pengzhiwei created HUDI-1659: Summary: DDL Support For Hudi In Spark Sql Key: HUDI-1659 URL: https://issues.apache.org/jira/browse/HUDI-1659 Project: Apache Hudi Issue Type: Sub-task Co

[jira] [Created] (HUDI-1658) Spark Sql Support For Hudi

2021-03-04 Thread pengzhiwei (Jira)
pengzhiwei created HUDI-1658: Summary: Spark Sql Support For Hudi Key: HUDI-1658 URL: https://issues.apache.org/jira/browse/HUDI-1658 Project: Apache Hudi Issue Type: New Feature Compon

[GitHub] [hudi] guanziyue opened a new issue #2630: [SUPPORT]Confuse about the strategy to evaluate average record size

2021-03-04 Thread GitBox
guanziyue opened a new issue #2630: URL: https://github.com/apache/hudi/issues/2630 **Describe the problem you faced** In the UpsertPartitioner class, the method called averageBytesPerRecord is used to evaluate the average record size according to previous commits. There is a fileSi
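
As a rough illustration of estimating an average record size from previous commits, the sketch below walks commit statistics from newest to oldest and uses the first commit that wrote enough bytes to be representative. It is not the `UpsertPartitioner` implementation: the `CommitStats` type, the `min_bytes` threshold, and the `default_size` fallback are assumptions made for this example.

```python
import math
from dataclasses import dataclass
from typing import Iterable


@dataclass
class CommitStats:
    total_bytes_written: int
    total_records_written: int


def average_bytes_per_record(
    commits_newest_first: Iterable[CommitStats],
    min_bytes: int,
    default_size: int = 1024,
) -> int:
    """Estimate bytes per record from the newest sufficiently large commit.

    Commits that wrote no records, or fewer than `min_bytes` bytes, are
    skipped because a tiny commit gives a noisy estimate. If no commit
    qualifies, `default_size` is returned.
    """
    for stats in commits_newest_first:
        if stats.total_records_written > 0 and stats.total_bytes_written > min_bytes:
            return math.ceil(stats.total_bytes_written / stats.total_records_written)
    return default_size


history = [
    CommitStats(total_bytes_written=100, total_records_written=10),     # too small, skipped
    CommitStats(total_bytes_written=10_000, total_records_written=90),  # used for the estimate
]
estimate = average_bytes_per_record(history, min_bytes=1_000)
```

The threshold is the interesting design choice here: set too high, the estimate falls back to the default for a long time; set too low, one small commit can skew the sizing of new files.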

[GitHub] [hudi] danny0405 commented on a change in pull request #2613: [HUDI-1647] Supports snapshot read for Flink

2021-03-04 Thread GitBox
danny0405 commented on a change in pull request #2613: URL: https://github.com/apache/hudi/pull/2613#discussion_r587409966 ## File path: hudi-flink/src/main/resources/META-INF/services/org.apache.flink.table.factories.TableFactory ## @@ -0,0 +1,17 @@ +# Licensed to the Apache

[GitHub] [hudi] yanghua commented on a change in pull request #2621: [HUDI-1655] Allow custom date format in DatePartitionPathSelector

2021-03-04 Thread GitBox
yanghua commented on a change in pull request #2621: URL: https://github.com/apache/hudi/pull/2621#discussion_r587389540 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/DatePartitionPathSelector.java ## @@ -130,20 +140,19 @@ public DatePart

[GitHub] [hudi] garyli1019 commented on a change in pull request #2613: [HUDI-1647] Supports snapshot read for Flink

2021-03-04 Thread GitBox
garyli1019 commented on a change in pull request #2613: URL: https://github.com/apache/hudi/pull/2613#discussion_r587262267 ## File path: hudi-flink/src/main/resources/META-INF/services/org.apache.flink.table.factories.TableFactory ## @@ -0,0 +1,17 @@ +# Licensed to the Apache

[GitHub] [hudi] chaplinthink opened a new pull request #2629: [DOCS] Fix docs hive_sync path

2021-03-04 Thread GitBox
chaplinthink opened a new pull request #2629: URL: https://github.com/apache/hudi/pull/2629 Fix docs hive_sync path