[GitHub] [hudi] boneanxs commented on a diff in pull request #5723: [HUDI-4173] Fix wrong results if the user read no base files hudi table by glob paths

2022-06-07 Thread GitBox
boneanxs commented on code in PR #5723: URL: https://github.com/apache/hudi/pull/5723#discussion_r891987652 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala: ## @@ -340,6 +340,21 @@ abstract class HoodieBaseRelation(val sqlContex

[GitHub] [hudi] boneanxs commented on a diff in pull request #5723: [HUDI-4173] Fix wrong results if the user read no base files hudi table by glob paths

2022-06-07 Thread GitBox
boneanxs commented on code in PR #5723: URL: https://github.com/apache/hudi/pull/5723#discussion_r891987232 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/execution/datasources/HoodieInMemoryFileIndex.scala: ## @@ -34,7 +37,77 @@ class HoodieInMemoryF

svn commit: r54982 - /dev/hudi/KEYS

2022-06-07 Thread yihua
Author: yihua Date: Wed Jun 8 06:42:08 2022 New Revision: 54982 Log: Add GPG key for Y Ethan Guo Modified: dev/hudi/KEYS Modified: dev/hudi/KEYS == --- dev/hudi/KEYS (original) +++ dev/hudi/KEYS Wed Jun 8 06:42:08

[GitHub] [hudi] wzx140 commented on a diff in pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-06-07 Thread GitBox
wzx140 commented on code in PR #5522: URL: https://github.com/apache/hudi/pull/5522#discussion_r891970546 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/commit/HoodieMergeHelper.java: ## @@ -142,7 +144,8 @@ public void runMerge(HoodieTable>, HoodieD

[GitHub] [hudi] xicm commented on pull request #5790: [HUDI-3682] testReaderFilterRowKeys fails in TestHoodieOrcReaderWriter

2022-06-07 Thread GitBox
xicm commented on PR #5790: URL: https://github.com/apache/hudi/pull/5790#issuecomment-1149519890 @hudi-bot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To uns

[GitHub] [hudi] wzx140 commented on a diff in pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-06-07 Thread GitBox
wzx140 commented on code in PR #5522: URL: https://github.com/apache/hudi/pull/5522#discussion_r891969399 ## hudi-common/src/main/java/org/apache/hudi/common/model/HoodieAvroIndexedRecord.java: ## @@ -0,0 +1,185 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[GitHub] [hudi] hudi-bot commented on pull request #5761: [HUDI-4165] Support Create/Drop/Show/Refresh Index Syntax for Spark SQL

2022-06-07 Thread GitBox
hudi-bot commented on PR #5761: URL: https://github.com/apache/hudi/pull/5761#issuecomment-1149516882 ## CI report: * 68a79f570e9aedfbd5934eeaa49319a634dcbe7c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9125

[GitHub] [hudi] hudi-bot commented on pull request #5791: [MINOR] follow up HUDI-4178 automatically enable schema evolution when read hoodie table.

2022-06-07 Thread GitBox
hudi-bot commented on PR #5791: URL: https://github.com/apache/hudi/pull/5791#issuecomment-1149513998 ## CI report: * ebb64b9442d89387d03976b7d7c7a5c31d653af5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9147

[GitHub] [hudi] hudi-bot commented on pull request #5761: [HUDI-4165] Support Create/Drop/Show/Refresh Index Syntax for Spark SQL

2022-06-07 Thread GitBox
hudi-bot commented on PR #5761: URL: https://github.com/apache/hudi/pull/5761#issuecomment-1149513941 ## CI report: * 68a79f570e9aedfbd5934eeaa49319a634dcbe7c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9125

[GitHub] [hudi] sunke38 commented on issue #5765: [SUPPORT] throw "java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.client.HdfsDataInputStream.getReadStatistics()"

2022-06-07 Thread GitBox
sunke38 commented on issue #5765: URL: https://github.com/apache/hudi/issues/5765#issuecomment-1149511966 @nsivabalan: I didn't have any HBase dependencies in my pom; here are all the Hadoop dependencies. It confuses me that the same error shows up when I run an insert in spark-shell. I did n

[GitHub] [hudi] wzx140 commented on a diff in pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-06-07 Thread GitBox
wzx140 commented on code in PR #5522: URL: https://github.com/apache/hudi/pull/5522#discussion_r891962118 ## hudi-common/src/main/java/org/apache/hudi/common/util/MappingIterator.java: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or

[GitHub] [hudi] wzx140 commented on a diff in pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-06-07 Thread GitBox
wzx140 commented on code in PR #5522: URL: https://github.com/apache/hudi/pull/5522#discussion_r891961288 ## hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieHFileDataBlock.java: ## @@ -120,19 +122,21 @@ protected byte[] serializeRecords(List records) thro

[GitHub] [hudi] hudi-bot commented on pull request #5790: [HUDI-3682] testReaderFilterRowKeys fails in TestHoodieOrcReaderWriter

2022-06-07 Thread GitBox
hudi-bot commented on PR #5790: URL: https://github.com/apache/hudi/pull/5790#issuecomment-1149508326 ## CI report: * b02b7d8481e212c1a33348d63227e54524cab4a7 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9146

[GitHub] [hudi] a0x commented on issue #5792: [SUPPORT] Update hudi table(using SparkSQL) failed when the column contains `null` value in other records

2022-06-07 Thread GitBox
a0x commented on issue #5792: URL: https://github.com/apache/hudi/issues/5792#issuecomment-1149507752 Here is my analysis. The key exception is **`java.lang.RuntimeException: Null-value for required field: note`**, which means the field `note` is not nullable. But I added `null` valu

[GitHub] [hudi] xushiyan commented on a diff in pull request #5791: [MINOR] follow up HUDI-4178 automatically enable schema evolution when read hoodie table.

2022-06-07 Thread GitBox
xushiyan commented on code in PR #5791: URL: https://github.com/apache/hudi/pull/5791#discussion_r891957547 ## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieBaseRelation.scala: ## @@ -509,12 +510,22 @@ abstract class HoodieBaseRelation(val sqlConte

[jira] [Updated] (HUDI-3309) Integrate quickstart examples into integration tests

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3309: Fix Version/s: 0.12.0 (was: 0.11.1) > Integrate quickstart examples into integration

[jira] [Closed] (HUDI-4124) Add valid check in Spark Datasource configs

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo closed HUDI-4124. --- Resolution: Fixed > Add valid check in Spark Datasource configs > ---

[jira] [Closed] (HUDI-3862) Fix default configurations of HoodieHBaseIndexConfig

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo closed HUDI-3862. --- Resolution: Fixed > Fix default configurations of HoodieHBaseIndexConfig > ---

[jira] [Updated] (HUDI-4088) Flink ORC base file format does not work

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4088: Fix Version/s: 0.12.0 (was: 0.11.1) > Flink ORC base file format does not work >

[GitHub] [hudi] liujinhui1994 commented on a diff in pull request #5639: [HUDI-2516] Upgrade JUnit to 5.8.2

2022-06-07 Thread GitBox
liujinhui1994 commented on code in PR #5639: URL: https://github.com/apache/hudi/pull/5639#discussion_r891949504 ## pom.xml: ## @@ -77,8 +77,8 @@ 3.2.0 -3.0.0-M4 -3.0.0-M4 +3.0.0-M6 Review Comment: of course

[jira] [Updated] (HUDI-3698) Follow up on DeltaStreamer CI issues

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3698: Fix Version/s: (was: 0.11.1) > Follow up on DeltaStreamer CI issues > --

[jira] [Closed] (HUDI-4100) CTAS failed to clean up when given an illegal MANAGED table definition

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo closed HUDI-4100. --- Resolution: Fixed > CTAS failed to clean up when given an illegal MANAGED table definition > -

[jira] [Closed] (HUDI-4086) thread factory optimization in async service

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo closed HUDI-4086. --- Resolution: Fixed > thread factory optimization in async service > ---

[jira] [Updated] (HUDI-1761) Add support to test custom schema w/ QuickStart

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-1761: Fix Version/s: 0.12.0 (was: 0.11.1) > Add support to test custom schema w/ QuickStart

[GitHub] [hudi] jjtjiang commented on issue #5777: [SUPPORT] Hudi table has duplicate data.

2022-06-07 Thread GitBox
jjtjiang commented on issue #5777: URL: https://github.com/apache/hudi/issues/5777#issuecomment-1149496653 While testing, I saw something strange. At first the data was duplicated, but when I repeated the query a few minutes to several hours later, there was no duplicate data.

[jira] [Updated] (HUDI-3856) Upgrade maven surefire to M5

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3856: Fix Version/s: 0.12.0 (was: 0.11.1) > Upgrade maven surefire to M5 >

[jira] [Updated] (HUDI-2188) Improve test for the insert_overwrite and insert_overwrite_table in hoodieDeltaStreamer

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-2188: Fix Version/s: 0.12.0 (was: 0.11.1) > Improve test for the insert_overwrite and inser

[jira] [Updated] (HUDI-1053) Make ComplexKeyGenerator also support non partitioned Hudi dataset

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-1053: Fix Version/s: 0.12.0 (was: 0.11.1) > Make ComplexKeyGenerator also support non parti

[jira] [Updated] (HUDI-736) Simplify ReflectionUtils#getTopLevelClasses

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-736: --- Fix Version/s: 0.12.0 (was: 0.11.1) > Simplify ReflectionUtils#getTopLevelClasses > ---

[jira] [Updated] (HUDI-3303) CI Improvements

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3303: Fix Version/s: (was: 0.11.1) > CI Improvements > --- > > Key: HUDI-3303 >

[jira] [Updated] (HUDI-4002) HoodieWrapperFileSystem class cast issue

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4002: Fix Version/s: 0.12.0 (was: 0.11.1) > HoodieWrapperFileSystem class cast issue >

[jira] [Updated] (HUDI-4118) flink write hudi sync hive .If the Hive table already exists, write to create automatically will throw an exception

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4118: Fix Version/s: 0.12.0 (was: 0.11.1) > flink write hudi sync hive .If the Hive table a

[jira] [Updated] (HUDI-3998) getCommitsSinceLastCleaning failed when async cleaning

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3998: Fix Version/s: 0.12.0 (was: 0.11.1) > getCommitsSinceLastCleaning failed when async c

[jira] [Updated] (HUDI-3983) ClassNotFoundException when using hudi-spark-bundle to write table with hbase index

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3983: Fix Version/s: 0.12.0 (was: 0.11.1) > ClassNotFoundException when using hudi-spark-bu

[jira] [Updated] (HUDI-3976) Newly introduced HiveSyncConfig config, syncAsSparkDataSourceTable is defaulted as true

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3976: Fix Version/s: 0.12.0 (was: 0.11.1) > Newly introduced HiveSyncConfig config, syncAsS

[jira] [Updated] (HUDI-3965) Spark sql dml w/ spark2 and scala12 fails w/ ClassNotFoundException for SparkSQLCLIDriver

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3965: Fix Version/s: 0.12.0 (was: 0.11.1) > Spark sql dml w/ spark2 and scala12 fails w/ Cl

[jira] [Updated] (HUDI-3961) Encounter NoClassDefFoundError when using Spark 3.1 bundle and utilities slim bundle

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3961: Fix Version/s: 0.12.0 (was: 0.11.1) > Encounter NoClassDefFoundError when using Spark

[jira] [Updated] (HUDI-3818) hudi doesn't support bytes column as primary key

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3818: Fix Version/s: 0.12.0 (was: 0.11.1) > hudi doesn't support bytes column as primary ke

[jira] [Updated] (HUDI-3914) Enhance TestColumnStatsIndex to test indexing with regular writes and table services

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3914: Fix Version/s: (was: 0.11.1) > Enhance TestColumnStatsIndex to test indexing with regular writes and tab

[jira] [Updated] (HUDI-4126) Fix the HoodieRealtimePath for bootstrap file

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4126: Fix Version/s: 0.12.0 (was: 0.11.1) > Fix the HoodieRealtimePath for bootstrap file >

[jira] [Updated] (HUDI-4112) Relax constraint in metadata table that rollback of a commit that got archived in MDT throws exception

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4112: Fix Version/s: 0.12.0 (was: 0.11.1) > Relax constraint in metadata table that rollbac

[jira] [Updated] (HUDI-4159) non existant partition field values has issues w/ partition pruning

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4159: Fix Version/s: 0.12.0 (was: 0.11.1) > non existant partition field values has issues

[jira] [Updated] (HUDI-4090) Fix flaky IT tests ITTestHoodieDataSource.testStreamReadAppendData

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4090: Fix Version/s: 0.12.0 (was: 0.11.1) > Fix flaky IT tests ITTestHoodieDataSource.testS

[jira] [Updated] (HUDI-3783) Fix HoodieTestTable harness to also properly validate Column Stats

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3783: Fix Version/s: (was: 0.11.1) > Fix HoodieTestTable harness to also properly validate Column Stats >

[jira] [Updated] (HUDI-3687) Make sure CI run tests against all Spark versions

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3687: Fix Version/s: 0.12.0 (was: 0.11.1) > Make sure CI run tests against all Spark versio

[jira] [Updated] (HUDI-3660) config hoodie.logfile.max.size not work

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3660: Fix Version/s: 0.12.0 (was: 0.11.1) > config hoodie.logfile.max.size not work > -

[jira] [Updated] (HUDI-3603) Support read DateType for hive2/hive3

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3603: Fix Version/s: 0.12.0 (was: 0.11.1) > Support read DateType for hive2/hive3 > -

[jira] [Closed] (HUDI-4111) Bump ANTLR runtime version in Spark 3.x

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo closed HUDI-4111. --- Resolution: Fixed > Bump ANTLR runtime version in Spark 3.x > --- > >

[jira] [Updated] (HUDI-4000) Docs around DBT

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-4000: Fix Version/s: 0.12.0 (was: 0.11.1) > Docs around DBT > --- > >

[jira] [Updated] (HUDI-3994) HoodieDeltaStreamer - Spark master shouldn't have a default

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3994: Fix Version/s: 0.12.0 (was: 0.11.1) > HoodieDeltaStreamer - Spark master shouldn't ha

[jira] [Updated] (HUDI-3959) Rename class name for spark rdd reader

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3959: Fix Version/s: 0.12.0 (was: 0.11.1) > Rename class name for spark rdd reader > --

[jira] [Updated] (HUDI-3764) Allow loading external configs while querying Hudi tables with Spark

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3764: Fix Version/s: 0.12.0 (was: 0.11.1) > Allow loading external configs while querying H

[jira] [Updated] (HUDI-3984) Hudi cli cannot get infos of unpartitioned table

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3984: Fix Version/s: 0.12.0 (was: 0.11.1) > Hudi cli cannot get infos of unpartitioned tabl

[jira] [Updated] (HUDI-3979) Make sure Hudi Relations are only fetching strictly required columns

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3979: Fix Version/s: 0.12.0 (was: 0.11.1) > Make sure Hudi Relations are only fetching stri

[jira] [Updated] (HUDI-3993) Avoid calling into Spark UDF in Bulk Insert

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3993: Fix Version/s: (was: 0.11.1) > Avoid calling into Spark UDF in Bulk Insert > ---

[GitHub] [hudi] a0x opened a new issue, #5792: [SUPPORT] Update hudi table(using SparkSQL) failed when the column contains `null` value in other records

2022-06-07 Thread GitBox
a0x opened a new issue, #5792: URL: https://github.com/apache/hudi/issues/5792 **Describe the problem you faced** Update hudi table(using SparkSQL) failed when the column contains `null` value in other records, as the following image: https://user-images.githubusercontent.com/3

[jira] [Updated] (HUDI-3819) upgrade spring cve-2022-22965

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3819: Fix Version/s: 0.12.0 (was: 0.11.1) > upgrade spring cve-2022-22965 > ---

[jira] [Updated] (HUDI-3896) Support Spark optimizations for `HadoopFsRelation`

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-3896: Fix Version/s: 0.12.0 (was: 0.11.1) > Support Spark optimizations for `HadoopFsRelati

[jira] [Updated] (HUDI-1238) [UMBRELLA] Perf test env

2022-06-07 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-1238: Fix Version/s: (was: 0.11.1) > [UMBRELLA] Perf test env > > > K

[GitHub] [hudi] hudi-bot commented on pull request #5788: [HUDI-4207] HoodieFlinkWriteClient.getOrCreateWriteHandle throws an e…

2022-06-07 Thread GitBox
hudi-bot commented on PR #5788: URL: https://github.com/apache/hudi/pull/5788#issuecomment-1149472869 ## CI report: * 02b3550eb789013073e4d200ab5ec556199f6bb2 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9145

[GitHub] [hudi] yihua commented on a diff in pull request #5791: [MINOR] follow up HUDI-4178 automatically enable schema evolution when read hoodie table.

2022-06-07 Thread GitBox
yihua commented on code in PR #5791: URL: https://github.com/apache/hudi/pull/5791#discussion_r891913237 ## hudi-common/src/main/java/org/apache/hudi/internal/schema/io/FileBasedInternalSchemaStorageManager.java: ## @@ -131,6 +131,27 @@ private List getValidInstants() {

[GitHub] [hudi] yihua commented on pull request #5791: [MINOR] follow up HUDI-4178 automatically enable schema evolution when read hoodie table.

2022-06-07 Thread GitBox
yihua commented on PR #5791: URL: https://github.com/apache/hudi/pull/5791#issuecomment-1149455490 @alexeykudinkin could you review this as well?

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-06-07 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1149446127 ## CI report: * 8c6f6e19940ce7ac04dfcfce52da3ccdaf3a8b0f UNKNOWN * c4799803cff8adffef56e889a5cd4d52599fcf73 UNKNOWN * c5616888bb267cb505a12b88cad3e99f9dd18d9b UNKNOWN * 34

[GitHub] [hudi] hudi-bot commented on pull request #5791: [MINOR] follow up HUDI-4178 automatically enable schema evolution when read hoodie table.

2022-06-07 Thread GitBox
hudi-bot commented on PR #5791: URL: https://github.com/apache/hudi/pull/5791#issuecomment-1149444154 ## CI report: * ebb64b9442d89387d03976b7d7c7a5c31d653af5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9147

[GitHub] [hudi] hudi-bot commented on pull request #5761: [HUDI-4165] Support Create/Drop/Show/Refresh Index Syntax for Spark SQL

2022-06-07 Thread GitBox
hudi-bot commented on PR #5761: URL: https://github.com/apache/hudi/pull/5761#issuecomment-1149441814 ## CI report: * 68a79f570e9aedfbd5934eeaa49319a634dcbe7c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9125

[GitHub] [hudi] vinothchandar commented on pull request #4309: [HUDI-3016][RFC-43] Proposal to implement Table Management Service

2022-06-07 Thread GitBox
vinothchandar commented on PR #4309: URL: https://github.com/apache/hudi/pull/4309#issuecomment-1149440761 Taking this over, given @prashantwason is now on a break.

[GitHub] [hudi] vinothchandar commented on pull request #5436: [RFC-51] [HUDI-3478] Change Data Capture RFC

2022-06-07 Thread GitBox
vinothchandar commented on PR #5436: URL: https://github.com/apache/hudi/pull/5436#issuecomment-1149440486 >only when the HoodieMergeHandle is called, not always. And other scenarios can re-use the existing files. For HoodieCreateHandle, we deduce `op` on the fly since the beforeImage

[GitHub] [hudi] hudi-bot commented on pull request #5761: [HUDI-4165] Support Create/Drop/Show/Refresh Index Syntax for Spark SQL

2022-06-07 Thread GitBox
hudi-bot commented on PR #5761: URL: https://github.com/apache/hudi/pull/5761#issuecomment-1149439718 ## CI report: * 68a79f570e9aedfbd5934eeaa49319a634dcbe7c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9125

[GitHub] [hudi] hudi-bot commented on pull request #5787: fix setup_demo.sh script to package jar inside docker folder

2022-06-07 Thread GitBox
hudi-bot commented on PR #5787: URL: https://github.com/apache/hudi/pull/5787#issuecomment-1149439744 ## CI report: * c5baed99203a61c8df1cf55970ee15805d678e03 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9143

[GitHub] [hudi] huberylee commented on pull request #5761: [HUDI-4165] Support Create/Drop/Show/Refresh Index Syntax for Spark SQL

2022-06-07 Thread GitBox
huberylee commented on PR #5761: URL: https://github.com/apache/hudi/pull/5761#issuecomment-1149438664 @hudi-bot run azure

[GitHub] [hudi] YuweiXiao commented on issue #5770: [SUPPORT] hoodie.parquet.max.file.size Property is Being Ignored

2022-06-07 Thread GitBox
YuweiXiao commented on issue #5770: URL: https://github.com/apache/hudi/issues/5770#issuecomment-1149433070 Did you enable `clustering`? You mentioned `files merged into larger file`.

[GitHub] [hudi] XuQianJin-Stars commented on a diff in pull request #5791: [MINOR] follow up HUDI-4178 automatically enable schema evolution when read hoodie table.

2022-06-07 Thread GitBox
XuQianJin-Stars commented on code in PR #5791: URL: https://github.com/apache/hudi/pull/5791#discussion_r89189 ## hudi-common/src/main/java/org/apache/hudi/internal/schema/io/FileBasedInternalSchemaStorageManager.java: ## @@ -131,6 +131,27 @@ private List getValidInstants()

[GitHub] [hudi] YuweiXiao commented on pull request #5771: [HUDI-4071] Relax record key requirement and write with minimal options

2022-06-07 Thread GitBox
YuweiXiao commented on PR #5771: URL: https://github.com/apache/hudi/pull/5771#issuecomment-1149430062 Hey! Could we also auto-disable the bloom filter appended to the parquet file in this case?

[GitHub] [hudi] wzx140 commented on a diff in pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-06-07 Thread GitBox
wzx140 commented on code in PR #5522: URL: https://github.com/apache/hudi/pull/5522#discussion_r891895012 ## hudi-common/src/main/java/org/apache/hudi/common/model/HoodieRecord.java: ## @@ -18,14 +18,21 @@ package org.apache.hudi.common.model; +import org.apache.avro.Schema

[GitHub] [hudi] xushiyan commented on a diff in pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-06-07 Thread GitBox
xushiyan commented on code in PR #5522: URL: https://github.com/apache/hudi/pull/5522#discussion_r891870033 ## hudi-client/hudi-java-client/src/main/java/org/apache/hudi/table/action/commit/JavaWriteHelper.java: ## @@ -67,11 +65,11 @@ public List> deduplicateRecords( retu

[GitHub] [hudi] hudi-bot commented on pull request #5791: [MINOR] follow up HUDI-4178 automatically enable schema evolution when read hoodie table.

2022-06-07 Thread GitBox
hudi-bot commented on PR #5791: URL: https://github.com/apache/hudi/pull/5791#issuecomment-1149416541 ## CI report: * ebb64b9442d89387d03976b7d7c7a5c31d653af5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9147

[GitHub] [hudi] wzx140 commented on a diff in pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-06-07 Thread GitBox
wzx140 commented on code in PR #5522: URL: https://github.com/apache/hudi/pull/5522#discussion_r891883955 ## hudi-common/src/main/java/org/apache/hudi/common/model/HoodieAvroIndexedRecord.java: ## @@ -0,0 +1,185 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-06-07 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1149414187 ## CI report: * 8c6f6e19940ce7ac04dfcfce52da3ccdaf3a8b0f UNKNOWN * c4799803cff8adffef56e889a5cd4d52599fcf73 UNKNOWN * c5616888bb267cb505a12b88cad3e99f9dd18d9b UNKNOWN * a2

[GitHub] [hudi] hudi-bot commented on pull request #5791: [MINOR] follow up HUDI-4178 automatically enable schema evolution when read hoodie table.

2022-06-07 Thread GitBox
hudi-bot commented on PR #5791: URL: https://github.com/apache/hudi/pull/5791#issuecomment-1149412275 ## CI report: * ebb64b9442d89387d03976b7d7c7a5c31d653af5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9147

[GitHub] [hudi] hudi-bot commented on pull request #5790: HUDI-3682 testReaderFilterRowKeys fails in TestHoodieOrcReaderWriter

2022-06-07 Thread GitBox
hudi-bot commented on PR #5790: URL: https://github.com/apache/hudi/pull/5790#issuecomment-1149412259 ## CI report: * b02b7d8481e212c1a33348d63227e54524cab4a7 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9146

[GitHub] [hudi] hudi-bot commented on pull request #5788: [HUDI-4207] HoodieFlinkWriteClient.getOrCreateWriteHandle throws an e…

2022-06-07 Thread GitBox
hudi-bot commented on PR #5788: URL: https://github.com/apache/hudi/pull/5788#issuecomment-1149412233 ## CI report: * 02b3550eb789013073e4d200ab5ec556199f6bb2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9145

[GitHub] [hudi] hudi-bot commented on pull request #5402: [WIP] Support Hadoop 3.x Hive 3.x and Spark 3.2.x default

2022-06-07 Thread GitBox
hudi-bot commented on PR #5402: URL: https://github.com/apache/hudi/pull/5402#issuecomment-1149411921 ## CI report: * 8c6f6e19940ce7ac04dfcfce52da3ccdaf3a8b0f UNKNOWN * c4799803cff8adffef56e889a5cd4d52599fcf73 UNKNOWN * c5616888bb267cb505a12b88cad3e99f9dd18d9b UNKNOWN * a2

[GitHub] [hudi] hudi-bot commented on pull request #5791: [MINOR] follow up HUDI-4178 automatically enable schema evolution when read hoodie table.

2022-06-07 Thread GitBox
hudi-bot commented on PR #5791: URL: https://github.com/apache/hudi/pull/5791#issuecomment-1149410216 ## CI report: * ebb64b9442d89387d03976b7d7c7a5c31d653af5 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #5790: HUDI-3682 testReaderFilterRowKeys fails in TestHoodieOrcReaderWriter

2022-06-07 Thread GitBox
hudi-bot commented on PR #5790: URL: https://github.com/apache/hudi/pull/5790#issuecomment-1149410192 ## CI report: * b02b7d8481e212c1a33348d63227e54524cab4a7 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] hudi-bot commented on pull request #5788: [HUDI-4207] HoodieFlinkWriteClient.getOrCreateWriteHandle throws an e…

2022-06-07 Thread GitBox
hudi-bot commented on PR #5788: URL: https://github.com/apache/hudi/pull/5788#issuecomment-1149410159 ## CI report: * 02b3550eb789013073e4d200ab5ec556199f6bb2 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the

[GitHub] [hudi] eric9204 commented on issue #5634: [SUPPORT] -- The UPSERT write operation seems did not WORK !

2022-06-07 Thread GitBox
eric9204 commented on issue #5634: URL: https://github.com/apache/hudi/issues/5634#issuecomment-1149409252 > Did you query it using Flink instead ? @danny0405 Yes, the query results are all consistent with presto, spark sql, flink sql and hive. -- This is an automated message from

[jira] [Commented] (HUDI-1657) build failed on AArch64, Fedora 33

2022-06-07 Thread jian.li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17551368#comment-17551368 ] jian.li commented on HUDI-1657: --- Hi, I also encountered the same problem, but I am blocked by compile

[jira] [Updated] (HUDI-1657) build failed on AArch64, Fedora 33

2022-06-07 Thread jian.li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jian.li updated HUDI-1657: -- Attachment: image-2022-06-08-11-15-22-577.png > build failed on AArch64, Fedora 33 > --

[GitHub] [hudi] vinothchandar commented on a diff in pull request #5627: [HUDI-3350][HUDI-3351] Rebase Record combining semantic into `HoodieRecordCombiningEngine`

2022-06-07 Thread GitBox
vinothchandar commented on code in PR #5627: URL: https://github.com/apache/hudi/pull/5627#discussion_r891876288 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java: ## @@ -123,6 +124,12 @@ public class HoodieWriteConfig extends HoodieCo

[hudi] branch release-0.11.1-rc1 created (now 39ebc28b6f)

2022-06-07 Thread yihua
This is an automated email from the ASF dual-hosted git repository. yihua pushed a change to branch release-0.11.1-rc1 in repository https://gitbox.apache.org/repos/asf/hudi.git at 39ebc28b6f Create release branch for version 0.11.1. This branch includes the following new commits: n

[jira] [Updated] (HUDI-1657) build failed on AArch64, Fedora 33

2022-06-07 Thread jian.li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jian.li updated HUDI-1657: -- Attachment: image-2022-06-08-11-13-43-324.png > build failed on AArch64, Fedora 33 > --

[GitHub] [hudi] xiarixiaoyao opened a new pull request, #5791: [MINOR] follow up HUDI-4178 automatically enable schema evolution when read hoodie table.

2022-06-07 Thread GitBox
xiarixiaoyao opened a new pull request, #5791: URL: https://github.com/apache/hudi/pull/5791 …n read hoodie table. ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull reque

[GitHub] [hudi] eric9204 commented on issue #5671: [SUPPORT] Archive can't be triggered,when parameter of the metadata table was use in the program

2022-06-07 Thread GitBox
eric9204 commented on issue #5671: URL: https://github.com/apache/hudi/issues/5671#issuecomment-1149405745 @nsivabalan @danny0405 It's true that there are some anomalies. The program started at 10:27, and the first instant file was modified at 11:01. It should be determined that the archive is

[GitHub] [hudi] vinothchandar commented on a diff in pull request #5522: [HUDI-3378][HUDI-3379][HUDI-3381] Rebasing usages of HoodieRecordPayload and raw Avro payload to rely on HoodieRecord instead

2022-06-07 Thread GitBox
vinothchandar commented on code in PR #5522: URL: https://github.com/apache/hudi/pull/5522#discussion_r891853078 ## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/commit/BaseMergeHelper.java: ## @@ -87,29 +95,67 @@ protected GenericRecord transformRec

[GitHub] [hudi] eric9204 closed issue #5671: [SUPPORT] Archive can't be triggered,when parameter of the metadata table was use in the program

2022-06-07 Thread GitBox
eric9204 closed issue #5671: [SUPPORT] Archive can't be triggered,when parameter of the metadata table was use in the program URL: https://github.com/apache/hudi/issues/5671 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

[GitHub] [hudi] eric9204 commented on issue #5671: [SUPPORT] Archive can't be triggered,when parameter of the metadata table was use in the program

2022-06-07 Thread GitBox
eric9204 commented on issue #5671: URL: https://github.com/apache/hudi/issues/5671#issuecomment-1149398637 > if there are any pending or failed commits in data table timeline, metadata table archival will be stalled. that is what Danny is suggesting to check the data table timeline

[jira] [Updated] (HUDI-3682) testReaderFilterRowKeys fails in TestHoodieOrcReaderWriter

2022-06-07 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-3682: - Labels: pull-request-available (was: ) > testReaderFilterRowKeys fails in TestHoodieOrcReaderWrit

[GitHub] [hudi] xicm opened a new pull request, #5790: HUDI-3682 testReaderFilterRowKeys fails in TestHoodieOrcReaderWriter

2022-06-07 Thread GitBox
xicm opened a new pull request, #5790: URL: https://github.com/apache/hudi/pull/5790 ## What is the purpose of the pull request Fix flakey test testReaderFilterRowKeys in TestHoodieOrcReaderWriter ## Brief change log TestReaderFilterRowKeys needs to get the key from RECOR

[GitHub] [hudi] justdoit-code closed issue #5789: HoodieWrapper

2022-06-07 Thread GitBox
justdoit-code closed issue #5789: HoodieWrapper URL: https://github.com/apache/hudi/issues/5789 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-

[GitHub] [hudi] justdoit-code opened a new issue, #5789: HoodieWrapper

2022-06-07 Thread GitBox
justdoit-code opened a new issue, #5789: URL: https://github.com/apache/hudi/issues/5789 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at dev-su
