Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
yihua merged PR #10615: URL: https://github.com/apache/hudi/pull/10615 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615:
URL: https://github.com/apache/hudi/pull/10615#issuecomment-2052767220

## CI report:

* 805ba35b65afbb1daccbcf00291fd520a69c5584 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23232)

Bot commands
@hudi-bot supports the following commands:

- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615:
URL: https://github.com/apache/hudi/pull/10615#issuecomment-2052703726

## CI report:

* dfab8e1285bf0241eea2e71f9d85607c647446d7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23212)
* 805ba35b65afbb1daccbcf00291fd520a69c5584 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23232)
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615:
URL: https://github.com/apache/hudi/pull/10615#issuecomment-2052699814

## CI report:

* dfab8e1285bf0241eea2e71f9d85607c647446d7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23212)
* 805ba35b65afbb1daccbcf00291fd520a69c5584 UNKNOWN
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
yihua commented on code in PR #10615:
URL: https://github.com/apache/hudi/pull/10615#discussion_r1563323590

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala: ##
@@ -530,6 +539,40 @@ object ProvidesHoodieConfig {
     filterNullValues(overridingOpts)
   }

+  /**
+   * @param tableConfigKeyGeneratorClassName     key generator class name in the table config.
+   * @param partitionFieldNamesWithoutKeyGenType partition field names without key generator types
+   *                                             from the table config.
+   * @param catalogTable                         HoodieCatalogTable instance to fetch table properties.
+   * @return the write config value to set for "hoodie.datasource.write.partitionpath.field".
+   */
+  def getPartitionPathFieldWriteConfig(tableConfigKeyGeneratorClassName: String,
+                                       partitionFieldNamesWithoutKeyGenType: String,
+                                       catalogTable: HoodieCatalogTable): String = {
+    if (StringUtils.isNullOrEmpty(tableConfigKeyGeneratorClassName)) {
+      partitionFieldNamesWithoutKeyGenType
+    } else {
+      val writeConfigPartitionField = catalogTable.catalogProperties.get(PARTITIONPATH_FIELD.key())
+      val keyGenClass = ReflectionUtils.getClass(tableConfigKeyGeneratorClassName)
+      if (classOf[CustomKeyGenerator].equals(keyGenClass)

Review Comment:
The assumption is that these key generators should not be extended. We should keep it this way for now.
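The difference between the exact-class check kept in the PR and the subclass-inclusive check raised in review can be illustrated with a small, self-contained sketch. The classes `CustomKeyGen` and `ExtendedCustomKeyGen` below are hypothetical stand-ins, not Hudi classes, and the helper names are assumptions made for illustration only.

```scala
// Hypothetical stand-ins for a key generator and a user-defined subclass; not Hudi code.
class CustomKeyGen
class ExtendedCustomKeyGen extends CustomKeyGen

object KeyGenClassCheck {
  // Exact match, mirroring the `classOf[...].equals(keyGenClass)` style in the quoted diff:
  // a subclass is NOT treated as the custom key generator.
  def isExactlyCustom(cls: Class[_]): Boolean =
    classOf[CustomKeyGen].equals(cls)

  // Alternative raised in review: also accept any class that extends the custom key generator.
  def isCustomOrSubclass(cls: Class[_]): Boolean =
    classOf[CustomKeyGen].isAssignableFrom(cls)
}
```

With the exact match, a table configured with a subclass would take the non-custom branch, which is consistent with the stated assumption that these key generators are not extended.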
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
yihua commented on code in PR #10615:
URL: https://github.com/apache/hudi/pull/10615#discussion_r1563245298

## hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestSparkSqlWithCustomKeyGenerator.scala: ##
@@ -0,0 +1,571 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.functional
+
+import org.apache.hudi.HoodieSparkUtils
+import org.apache.hudi.common.config.TypedProperties
+import org.apache.hudi.common.table.HoodieTableMetaClient
+import org.apache.hudi.common.util.StringUtils
+import org.apache.hudi.exception.HoodieException
+import org.apache.hudi.functional.TestSparkSqlWithCustomKeyGenerator._
+import org.apache.hudi.util.SparkKeyGenUtils
+import org.apache.spark.sql.SaveMode
+import org.apache.spark.sql.hudi.common.HoodieSparkSqlTestBase
+import org.joda.time.DateTime
+import org.joda.time.format.DateTimeFormat
+import org.junit.jupiter.api.Assertions.{assertEquals, assertFalse, assertTrue}
+import org.slf4j.LoggerFactory
+
+import java.io.IOException
+
+/**
+ * Tests Spark SQL DML with custom key generator and write configs.
+ */
+class TestSparkSqlWithCustomKeyGenerator extends HoodieSparkSqlTestBase {
+  private val LOG = LoggerFactory.getLogger(getClass)
+
+  test("Test Spark SQL DML with custom key generator") {
+    withTempDir { tmp =>
+      Seq(
+        Seq("COPY_ON_WRITE", "ts:timestamp,segment:simple",
+          "(ts=202401, segment='cat2')", "202401/cat2",
+          Seq("202312/cat2", "202312/cat4", "202401/cat1", "202401/cat3", "202402/cat1", "202402/cat3", "202402/cat5"),
+          TS_FORMATTER_FUNC,
+          (ts: Integer, segment: String) => TS_FORMATTER_FUNC.apply(ts) + "/" + segment),
+        Seq("MERGE_ON_READ", "segment:simple",
+          "(segment='cat3')", "cat3",
+          Seq("cat1", "cat2", "cat4", "cat5"),
+          TS_TO_STRING_FUNC,
+          (_: Integer, segment: String) => segment),
+        Seq("MERGE_ON_READ", "ts:timestamp",
+          "(ts=202312)", "202312",
+          Seq("202401", "202402"),
+          TS_FORMATTER_FUNC,
+          (ts: Integer, _: String) => TS_FORMATTER_FUNC.apply(ts)),
+        Seq("MERGE_ON_READ", "ts:timestamp,segment:simple",
+          "(ts=202401, segment='cat2')", "202401/cat2",
+          Seq("202312/cat2", "202312/cat4", "202401/cat1", "202401/cat3", "202402/cat1", "202402/cat3", "202402/cat5"),
+          TS_FORMATTER_FUNC,
+          (ts: Integer, segment: String) => TS_FORMATTER_FUNC.apply(ts) + "/" + segment)
+      ).foreach { testParams =>
+        withTable(generateTableName) { tableName =>
+          LOG.warn("Testing with parameters: " + testParams)
+          val tableType = testParams(0).asInstanceOf[String]
+          val writePartitionFields = testParams(1).asInstanceOf[String]
+          val dropPartitionStatement = testParams(2).asInstanceOf[String]
+          val droppedPartition = testParams(3).asInstanceOf[String]
+          val expectedPartitions = testParams(4).asInstanceOf[Seq[String]]
+          val tsGenFunc = testParams(5).asInstanceOf[Integer => String]
+          val partitionGenFunc = testParams(6).asInstanceOf[(Integer, String) => String]
+          val tablePath = tmp.getCanonicalPath + "/" + tableName
+          val timestampKeyGeneratorConfig = if (writePartitionFields.contains("timestamp")) {
+            TS_KEY_GEN_CONFIGS
+          } else {
+            Map[String, String]()
+          }
+          val timestampKeyGenProps = if (timestampKeyGeneratorConfig.nonEmpty) {
+            ", " + timestampKeyGeneratorConfig.map(e => e._1 + " = '" + e._2 + "'").mkString(", ")
+          } else {
+            ""
+          }
+
+          prepareTableWithKeyGenerator(
+            tableName, tablePath, tableType,
+            CUSTOM_KEY_GEN_CLASS_NAME, writePartitionFields, timestampKeyGeneratorConfig)
+
+          // SQL CTAS with table properties containing key generator write configs
+          createTableWithSql(tableName, tablePath,
+            s"hoodie.datasource.write.partitionpath.field = '$writePartitionFields'" + timestampKeyGenProps)
+
+          // Prepare source and test SQL INSERT INTO
+          val sourceTableName = tableName + "_source"
+
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
yihua commented on code in PR #10615:
URL: https://github.com/apache/hudi/pull/10615#discussion_r1563198254

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala: ##
@@ -530,6 +539,40 @@ object ProvidesHoodieConfig {
     filterNullValues(overridingOpts)
   }

+  /**
+   * @param tableConfigKeyGeneratorClassName     key generator class name in the table config.
+   * @param partitionFieldNamesWithoutKeyGenType partition field names without key generator types
+   *                                             from the table config.
+   * @param catalogTable                         HoodieCatalogTable instance to fetch table properties.
+   * @return the write config value to set for "hoodie.datasource.write.partitionpath.field".
+   */
+  def getPartitionPathFieldWriteConfig(tableConfigKeyGeneratorClassName: String,
+                                       partitionFieldNamesWithoutKeyGenType: String,
+                                       catalogTable: HoodieCatalogTable): String = {
+    if (StringUtils.isNullOrEmpty(tableConfigKeyGeneratorClassName)) {
+      partitionFieldNamesWithoutKeyGenType
+    } else {
+      val writeConfigPartitionField = catalogTable.catalogProperties.get(PARTITIONPATH_FIELD.key())

Review Comment:
As an example, the table looks like this in Spark catalog:
```
spark-sql (default)> DESCRIBE TABLE formatted h0;
24/04/12 13:59:53 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
_hoodie_commit_time           string
_hoodie_commit_seqno          string
_hoodie_record_key            string
_hoodie_partition_path        string
_hoodie_file_name             string
id                            int
name                          string
price                         decimal(5,1)
ts                            int
segment                       string
# Partition Information
# col_name                    data_type       comment
ts                            int
segment                       string

# Detailed Table Information
Catalog                       spark_catalog
Database                      default
Table                         h0
Owner                         ethan
Created Time                  Fri Apr 12 13:58:05 PDT 2024
Last Access                   UNKNOWN
Created By                    Spark 3.5.1
Type                          EXTERNAL
Provider                      hudi
Table Properties              [hoodie.datasource.write.partitionpath.field=ts:timestamp,segment:simple, preCombineField=name, primaryKey=id, provider=hudi, type=cow]
Location                      file:/private/var/folders/60/wk8qzx310fd32b2dp7mhzvdcgn/T/spark-4ac6fb47-e20b-4679-a668-e28238ec3e05/h0
Serde Library                 org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
InputFormat                   org.apache.hudi.hadoop.HoodieParquetInputFormat
OutputFormat                  org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
Time taken: 1.694 seconds, Fetched 30 row(s)
```
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
yihua commented on code in PR #10615:
URL: https://github.com/apache/hudi/pull/10615#discussion_r1563196323

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala: ##
@@ -530,6 +539,40 @@ object ProvidesHoodieConfig {
     filterNullValues(overridingOpts)
   }

+  /**
+   * @param tableConfigKeyGeneratorClassName     key generator class name in the table config.
+   * @param partitionFieldNamesWithoutKeyGenType partition field names without key generator types
+   *                                             from the table config.
+   * @param catalogTable                         HoodieCatalogTable instance to fetch table properties.
+   * @return the write config value to set for "hoodie.datasource.write.partitionpath.field".
+   */
+  def getPartitionPathFieldWriteConfig(tableConfigKeyGeneratorClassName: String,
+                                       partitionFieldNamesWithoutKeyGenType: String,
+                                       catalogTable: HoodieCatalogTable): String = {
+    if (StringUtils.isNullOrEmpty(tableConfigKeyGeneratorClassName)) {
+      partitionFieldNamesWithoutKeyGenType
+    } else {
+      val writeConfigPartitionField = catalogTable.catalogProperties.get(PARTITIONPATH_FIELD.key())

Review Comment:
Yes, the table properties associated with `HoodieCatalogTable` are persisted across Spark sessions. The persisted partition field write config `hoodie.datasource.write.partitionpath.field` is a custom config outside Spark, which is used by Hudi logic only.
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
yihua commented on code in PR #10615:
URL: https://github.com/apache/hudi/pull/10615#discussion_r1563151868

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala: ##
@@ -528,6 +536,40 @@ object ProvidesHoodieConfig {
     filterNullValues(overridingOpts)
   }

+  /**
+   * @param tableConfigKeyGeneratorClassName     key generator class name in the table config.
+   * @param partitionFieldNamesWithoutKeyGenType partition field names without key generator types
+   *                                             from the table config.
+   * @param catalogTable                         HoodieCatalogTable instance to fetch table properties.
+   * @return the write config value to set for "hoodie.datasource.write.partitionpath.field".
+   */
+  def getPartitionPathFieldWriteConfig(tableConfigKeyGeneratorClassName: String,
+                                       partitionFieldNamesWithoutKeyGenType: String,
+                                       catalogTable: HoodieCatalogTable): String = {
+    if (StringUtils.isNullOrEmpty(tableConfigKeyGeneratorClassName)) {
+      partitionFieldNamesWithoutKeyGenType
+    } else {

Review Comment:
Flink writer should provide the correct partition field write config. The query side may have some gaps. Created [HUDI-7613](https://issues.apache.org/jira/browse/HUDI-7613) as a follow-up.

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala: ##
@@ -201,8 +201,26 @@ object HoodieWriterUtils {
           diffConfigs.append(s"KeyGenerator:\t$datasourceKeyGen\t$tableConfigKeyGen\n")
         }

+        // Please note that the validation of partition path fields needs the key generator class
+        // for the table, since the custom key generator expects a different format of
+        // the value of the write config "hoodie.datasource.write.partitionpath.field"
+        // e.g., "col:simple,ts:timestamp", whereas the table config "hoodie.table.partition.fields"
+        // in hoodie.properties stores "col,ts".
+        // The "params" here may only contain the write config of partition path field,
+        // so we need to pass in the validated key generator class name.
+        val validatedKeyGenClassName = if (tableConfigKeyGen != null) {

Review Comment:
Only the `hoodie.datasource.write.partitionpath.field` takes effect in the writer path. Before the fix, the write config is automatically set by the SQL writer based on the value of table config `hoodie.table.partition.fields`.

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala: ##
@@ -528,6 +536,40 @@ object ProvidesHoodieConfig {
     filterNullValues(overridingOpts)
   }

+  /**
+   * @param tableConfigKeyGeneratorClassName     key generator class name in the table config.
+   * @param partitionFieldNamesWithoutKeyGenType partition field names without key generator types
+   *                                             from the table config.
+   * @param catalogTable                         HoodieCatalogTable instance to fetch table properties.
+   * @return the write config value to set for "hoodie.datasource.write.partitionpath.field".
+   */
+  def getPartitionPathFieldWriteConfig(tableConfigKeyGeneratorClassName: String,
+                                       partitionFieldNamesWithoutKeyGenType: String,
+                                       catalogTable: HoodieCatalogTable): String = {
+    if (StringUtils.isNullOrEmpty(tableConfigKeyGeneratorClassName)) {
+      partitionFieldNamesWithoutKeyGenType
+    } else {
+      val writeConfigPartitionField = catalogTable.catalogProperties.get(PARTITIONPATH_FIELD.key())
+      val keyGenClass = ReflectionUtils.getClass(tableConfigKeyGeneratorClassName)
+      if (classOf[CustomKeyGenerator].equals(keyGenClass)
+        || classOf[CustomAvroKeyGenerator].equals(keyGenClass)) {
+        // For custom key generator, we have to take the write config value from
+        // "hoodie.datasource.write.partitionpath.field" which contains the key generator
+        // type, whereas the table config only contains the prtition field names without
+        // key generator types.
+        if (writeConfigPartitionField.isDefined) {
+          writeConfigPartitionField.get
+        } else {
+          log.warn("Write config \"hoodie.datasource.write.partitionpath.field\" is not set for "
+            + "custom key generator. This may fail the write operation.")
+          partitionFieldNamesWithoutKeyGenType

Review Comment:
It fails with the error message `Unable to find field names for partition path in proper format` in the `CustomKeyGenerator` indicating that the config is not set properly.
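To make the format mismatch concrete, here is a minimal sketch of a parser for the `field:type` write config format. It is a hypothetical illustration rather than Hudi's actual `CustomKeyGenerator` logic; the object and method names are assumptions, and only the error text is borrowed from the message quoted in the comment above.

```scala
object PartitionFieldFormatSketch {
  // Hypothetical parser, not Hudi code: splits a value such as "ts:timestamp,segment:simple"
  // into (field, keyGenType) pairs. Returns Left when any entry lacks a "field:type" shape,
  // which is what happens when the plain table config value "ts,segment" is passed instead.
  def parse(value: String): Either[String, Seq[(String, String)]] = {
    val entries = value.split(",").map(_.trim).filter(_.nonEmpty).toSeq
    val parsed = entries.map { entry =>
      entry.split(":") match {
        case Array(field, keyGenType) if field.nonEmpty && keyGenType.nonEmpty =>
          Some(field -> keyGenType)
        case _ => None
      }
    }
    if (entries.nonEmpty && parsed.forall(_.isDefined)) Right(parsed.flatten)
    else Left(s"Unable to find field names for partition path in proper format: $value")
  }
}
```

Under this sketch, `parse("ts:timestamp,segment:simple")` succeeds, while `parse("ts,segment")` yields a `Left`, mirroring why the untyped table config value cannot be fed to a custom key generator.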
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
yihua commented on PR #10615:
URL: https://github.com/apache/hudi/pull/10615#issuecomment-2052277245

> I like that this has the benefit of not breaking tables with their existing hoodie.table.recordkey.fields, but I am curious about any other approaches you thought about. From your test code, it looks like we can't use `partitioned by (dt:int,idk:string)` when creating the table. I don't think that should block this PR from landing, but in the documentation for SQL: https://hudi.apache.org/docs/sql_ddl#create-partitioned-table I think we should add an example

Good point. I tried the `partitioned by` statement but it did not work either, due to the same issue with the write config of the partition fields. But you're right that adding a new table config indicating the partition field types should solve the problem fundamentally. We should update the SQL docs on any gaps here.

> Also, I think this change will help us to fix partition pruning which currently does not work with timestamp keygen: https://issues.apache.org/jira/browse/HUDI-6614

Right.
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
jonvex commented on code in PR #10615:
URL: https://github.com/apache/hudi/pull/10615#discussion_r1562569055

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala: ##
@@ -201,8 +201,26 @@ object HoodieWriterUtils {
           diffConfigs.append(s"KeyGenerator:\t$datasourceKeyGen\t$tableConfigKeyGen\n")
         }

+        // Please note that the validation of partition path fields needs the key generator class
+        // for the table, since the custom key generator expects a different format of
+        // the value of the write config "hoodie.datasource.write.partitionpath.field"
+        // e.g., "col:simple,ts:timestamp", whereas the table config "hoodie.table.partition.fields"
+        // in hoodie.properties stores "col,ts".
+        // The "params" here may only contain the write config of partition path field,
+        // so we need to pass in the validated key generator class name.
+        val validatedKeyGenClassName = if (tableConfigKeyGen != null) {

Review Comment:
So when `hoodie.datasource.write.partitionpath.field` is set, we don't set `hoodie.table.partition.fields`?

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala: ##
@@ -530,6 +539,40 @@ object ProvidesHoodieConfig {
     filterNullValues(overridingOpts)
   }

+  /**
+   * @param tableConfigKeyGeneratorClassName     key generator class name in the table config.
+   * @param partitionFieldNamesWithoutKeyGenType partition field names without key generator types
+   *                                             from the table config.
+   * @param catalogTable                         HoodieCatalogTable instance to fetch table properties.
+   * @return the write config value to set for "hoodie.datasource.write.partitionpath.field".
+   */
+  def getPartitionPathFieldWriteConfig(tableConfigKeyGeneratorClassName: String,
+                                       partitionFieldNamesWithoutKeyGenType: String,
+                                       catalogTable: HoodieCatalogTable): String = {
+    if (StringUtils.isNullOrEmpty(tableConfigKeyGeneratorClassName)) {
+      partitionFieldNamesWithoutKeyGenType
+    } else {
+      val writeConfigPartitionField = catalogTable.catalogProperties.get(PARTITIONPATH_FIELD.key())
+      val keyGenClass = ReflectionUtils.getClass(tableConfigKeyGeneratorClassName)
+      if (classOf[CustomKeyGenerator].equals(keyGenClass)

Review Comment:
Do we want to make this cover any classes that extend customkeygen as well?

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala: ##
@@ -528,6 +536,40 @@ object ProvidesHoodieConfig {
     filterNullValues(overridingOpts)
   }

+  /**
+   * @param tableConfigKeyGeneratorClassName     key generator class name in the table config.
+   * @param partitionFieldNamesWithoutKeyGenType partition field names without key generator types
+   *                                             from the table config.
+   * @param catalogTable                         HoodieCatalogTable instance to fetch table properties.
+   * @return the write config value to set for "hoodie.datasource.write.partitionpath.field".
+   */
+  def getPartitionPathFieldWriteConfig(tableConfigKeyGeneratorClassName: String,
+                                       partitionFieldNamesWithoutKeyGenType: String,
+                                       catalogTable: HoodieCatalogTable): String = {
+    if (StringUtils.isNullOrEmpty(tableConfigKeyGeneratorClassName)) {
+      partitionFieldNamesWithoutKeyGenType
+    } else {

Review Comment:
So does this mean that it's still an issue for flink and hive etc?

## hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestSparkSqlWithCustomKeyGenerator.scala: ##
@@ -0,0 +1,571 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.functional
+
+import org.apache.hudi.HoodieSparkUtils
+import org.apache.hudi.common.config.TypedProperties
+import
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615:
URL: https://github.com/apache/hudi/pull/10615#issuecomment-2051092856

## CI report:

* dfab8e1285bf0241eea2e71f9d85607c647446d7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23212)
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615:
URL: https://github.com/apache/hudi/pull/10615#issuecomment-2050971324

## CI report:

* 989ffd5220e4f5ae666a05afdd0e7de3c6543972 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23202)
* dfab8e1285bf0241eea2e71f9d85607c647446d7 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23212)
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615:
URL: https://github.com/apache/hudi/pull/10615#issuecomment-2050965097

## CI report:

* 989ffd5220e4f5ae666a05afdd0e7de3c6543972 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23202)
* dfab8e1285bf0241eea2e71f9d85607c647446d7 UNKNOWN
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615:
URL: https://github.com/apache/hudi/pull/10615#issuecomment-2050776752

## CI report:

* 989ffd5220e4f5ae666a05afdd0e7de3c6543972 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23202)
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615:
URL: https://github.com/apache/hudi/pull/10615#issuecomment-2050723314

## CI report:

* 50b27846bf118909f3fd69f20cf5d7654d8a87c7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23189)
* 989ffd5220e4f5ae666a05afdd0e7de3c6543972 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23202)
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615:
URL: https://github.com/apache/hudi/pull/10615#issuecomment-2050717174

## CI report:

* 50b27846bf118909f3fd69f20cf5d7654d8a87c7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23189)
* 989ffd5220e4f5ae666a05afdd0e7de3c6543972 UNKNOWN
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
codope commented on code in PR #10615:
URL: https://github.com/apache/hudi/pull/10615#discussion_r1561326209

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala: ##
@@ -528,6 +536,40 @@ object ProvidesHoodieConfig {
     filterNullValues(overridingOpts)
   }

+  /**
+   * @param tableConfigKeyGeneratorClassName     key generator class name in the table config.
+   * @param partitionFieldNamesWithoutKeyGenType partition field names without key generator types
+   *                                             from the table config.
+   * @param catalogTable                         HoodieCatalogTable instance to fetch table properties.
+   * @return the write config value to set for "hoodie.datasource.write.partitionpath.field".
+   */
+  def getPartitionPathFieldWriteConfig(tableConfigKeyGeneratorClassName: String,
+                                       partitionFieldNamesWithoutKeyGenType: String,
+                                       catalogTable: HoodieCatalogTable): String = {
+    if (StringUtils.isNullOrEmpty(tableConfigKeyGeneratorClassName)) {
+      partitionFieldNamesWithoutKeyGenType
+    } else {

Review Comment:
Got it.
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
yihua commented on code in PR #10615:
URL: https://github.com/apache/hudi/pull/10615#discussion_r1561319630

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala: ##
@@ -528,6 +536,40 @@ object ProvidesHoodieConfig {
     filterNullValues(overridingOpts)
   }

+  /**
+   * @param tableConfigKeyGeneratorClassName     key generator class name in the table config.
+   * @param partitionFieldNamesWithoutKeyGenType partition field names without key generator types
+   *                                             from the table config.
+   * @param catalogTable                         HoodieCatalogTable instance to fetch table properties.
+   * @return the write config value to set for "hoodie.datasource.write.partitionpath.field".
+   */
+  def getPartitionPathFieldWriteConfig(tableConfigKeyGeneratorClassName: String,
+                                       partitionFieldNamesWithoutKeyGenType: String,
+                                       catalogTable: HoodieCatalogTable): String = {
+    if (StringUtils.isNullOrEmpty(tableConfigKeyGeneratorClassName)) {
+      partitionFieldNamesWithoutKeyGenType
+    } else {

Review Comment:
For the custom key generator, we have to take the partition path field from the properties stored in the Spark catalog table. `partitionFieldNamesWithoutKeyGenType` is derived from the existing table configs, which can be wrong. Also, in some code paths, `tableConfigKeyGeneratorClassName` is not passed in.

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala: ##
@@ -528,6 +536,40 @@ object ProvidesHoodieConfig {
     filterNullValues(overridingOpts)
   }

+  /**
+   * @param tableConfigKeyGeneratorClassName     key generator class name in the table config.
+   * @param partitionFieldNamesWithoutKeyGenType partition field names without key generator types
+   *                                             from the table config.
+   * @param catalogTable                         HoodieCatalogTable instance to fetch table properties.
+   * @return the write config value to set for "hoodie.datasource.write.partitionpath.field".
+   */
+  def getPartitionPathFieldWriteConfig(tableConfigKeyGeneratorClassName: String,
+                                       partitionFieldNamesWithoutKeyGenType: String,
+                                       catalogTable: HoodieCatalogTable): String = {
+    if (StringUtils.isNullOrEmpty(tableConfigKeyGeneratorClassName)) {
+      partitionFieldNamesWithoutKeyGenType
+    } else {
+      val writeConfigPartitionField = catalogTable.catalogProperties.get(PARTITIONPATH_FIELD.key())
+      val keyGenClass = ReflectionUtils.getClass(tableConfigKeyGeneratorClassName)
+      if (classOf[CustomKeyGenerator].equals(keyGenClass)
+        || classOf[CustomAvroKeyGenerator].equals(keyGenClass)) {
+        // For custom key generator, we have to take the write config value from
+        // "hoodie.datasource.write.partitionpath.field" which contains the key generator
+        // type, whereas the table config only contains the partition field names without
+        // key generator types.
+        if (writeConfigPartitionField.isDefined) {
+          writeConfigPartitionField.get
+        } else {
+          log.warn("Write config \"hoodie.datasource.write.partitionpath.field\" is not set for "
+            + "custom key generator. This may fail the write operation.")
+          partitionFieldNamesWithoutKeyGenType

Review Comment:
The write fails in the overall validation method. There is no need to fail in this util method again.
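The resolution order discussed in this review can be sketched without any Hudi dependencies. This is a minimal illustration, not the PR's code: the object and method names are hypothetical, catalog properties are modeled as a plain `Map`, the key generator is matched by simple class name rather than loaded reflectively, and the fallback for non-custom key generators is an assumption because the quoted hunk is cut off before that branch.

```scala
object PartitionPathFieldSketch {
  // Write config key quoted in the diff.
  val PartitionPathFieldKey = "hoodie.datasource.write.partitionpath.field"

  // Simplification: compare simple class names instead of using reflection.
  private val customKeyGenSimpleNames = Set("CustomKeyGenerator", "CustomAvroKeyGenerator")

  def isCustomKeyGen(className: String): Boolean =
    customKeyGenSimpleNames.contains(className.substring(className.lastIndexOf('.') + 1))

  def resolvePartitionPathField(keyGenClassName: String,
                                tableConfigPartitionFields: String,
                                catalogProperties: Map[String, String]): String = {
    if (keyGenClassName == null || keyGenClassName.isEmpty) {
      // No key generator recorded: use the table config value as-is.
      tableConfigPartitionFields
    } else if (isCustomKeyGen(keyGenClassName)) {
      // Custom key generator: prefer the typed value stored in the catalog properties.
      catalogProperties.getOrElse(PartitionPathFieldKey, {
        Console.err.println(s"$PartitionPathFieldKey is not set for custom key generator; " +
          "falling back to the table config value, which may fail later validation.")
        tableConfigPartitionFields
      })
    } else {
      // Assumption: other key generators do not need typed partition fields.
      tableConfigPartitionFields
    }
  }
}
```

The fallback only warns instead of throwing, which mirrors the point made in the review that the write is rejected later by the overall validation.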
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
yihua commented on PR #10615:
URL: https://github.com/apache/hudi/pull/10615#issuecomment-2050092146

> 1. is there any change to partitions in `hoodie.properties`? Do we now write it as `field1:type,field2:type2` when using CustomKeyGenerator?

There is no change to the table configs in `hoodie.properties`, i.e., `hoodie.table.partition.fields` still contains the comma-separated list of partition field names, like `"segment,ts"` (without key generator types, even for the custom key generator). This PR makes it possible to set `hoodie.datasource.write.partitionpath.field` with `SET TBLPROPERTIES` at the table level in the Spark catalog, so that SQL DML can derive the correct write config for the partition fields (e.g., `"segment:simple,ts:timestamp"` instead of `"segment,ts"`).

> 2. Thanks for adding extensive tests. Can you please look into the failures? They seem related to the patch.

Failures for Spark 3.2 and above are fixed. I'm looking into failures for older Spark versions.
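To make the two formats concrete, here is a small sketch of how the typed write config value relates to the untyped table config value, plus a helper that builds the kind of `ALTER TABLE ... SET TBLPROPERTIES` statement mentioned above. The object name, method names, and the table name in the usage note are hypothetical, and the helper only builds a string; it is not taken from the PR.

```scala
object PartitionFieldFormats {
  val PartitionPathFieldKey = "hoodie.datasource.write.partitionpath.field"

  // "segment:simple,ts:timestamp" -> "segment,ts": drop everything after the first ':' per field.
  // The reverse direction is not possible, since the table config does not keep the types.
  def stripKeyGenTypes(writeConfigValue: String): String =
    writeConfigValue.split(",").map(_.trim).filter(_.nonEmpty)
      .map(_.takeWhile(_ != ':')).mkString(",")

  // Builds a Spark SQL statement that stores the typed value as a table property.
  def setTblPropertiesSql(tableName: String, partitionPathField: String): String =
    s"ALTER TABLE $tableName SET TBLPROPERTIES ('$PartitionPathFieldKey' = '$partitionPathField')"
}
```

For example, `PartitionFieldFormats.setTblPropertiesSql("hudi_tbl", "segment:simple,ts:timestamp")` produces a statement that could be passed to `spark.sql(...)`, assuming a table named `hudi_tbl` exists.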
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615: URL: https://github.com/apache/hudi/pull/10615#issuecomment-2049197186 ## CI report: * 50b27846bf118909f3fd69f20cf5d7654d8a87c7 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23189)
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615: URL: https://github.com/apache/hudi/pull/10615#issuecomment-2049032647 ## CI report: * c376900f104a979535fe7b4b9bb7e9a2d236a2b9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23182) * 50b27846bf118909f3fd69f20cf5d7654d8a87c7 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23189)
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615: URL: https://github.com/apache/hudi/pull/10615#issuecomment-2049023818 ## CI report: * c376900f104a979535fe7b4b9bb7e9a2d236a2b9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23182) * 50b27846bf118909f3fd69f20cf5d7654d8a87c7 UNKNOWN
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615: URL: https://github.com/apache/hudi/pull/10615#issuecomment-2048869318 ## CI report: * c376900f104a979535fe7b4b9bb7e9a2d236a2b9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23182)
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615: URL: https://github.com/apache/hudi/pull/10615#issuecomment-2048834962 ## CI report: * 185d0fc1b26344563514603f9f5e600972feaaac Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23050) * c376900f104a979535fe7b4b9bb7e9a2d236a2b9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23182)
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615: URL: https://github.com/apache/hudi/pull/10615#issuecomment-2048829426 ## CI report: * 185d0fc1b26344563514603f9f5e600972feaaac Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23050) * c376900f104a979535fe7b4b9bb7e9a2d236a2b9 UNKNOWN
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
codope commented on code in PR #10615:
URL: https://github.com/apache/hudi/pull/10615#discussion_r1551075779

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala: ##
@@ -528,6 +536,40 @@ object ProvidesHoodieConfig {
     filterNullValues(overridingOpts)
   }

+  /**
+   * @param tableConfigKeyGeneratorClassName     key generator class name in the table config.
+   * @param partitionFieldNamesWithoutKeyGenType partition field names without key generator types
+   *                                             from the table config.
+   * @param catalogTable                         HoodieCatalogTable instance to fetch table properties.
+   * @return the write config value to set for "hoodie.datasource.write.partitionpath.field".
+   */
+  def getPartitionPathFieldWriteConfig(tableConfigKeyGeneratorClassName: String,
+                                       partitionFieldNamesWithoutKeyGenType: String,
+                                       catalogTable: HoodieCatalogTable): String = {
+    if (StringUtils.isNullOrEmpty(tableConfigKeyGeneratorClassName)) {
+      partitionFieldNamesWithoutKeyGenType
+    } else {

Review Comment:
Should we instead directly infer from the passed string `tableConfigKeyGeneratorClassName`? I mean, if the string has no `:`, then return `partitionFieldNamesWithoutKeyGenType`. I am not following why `tableConfigKeyGeneratorClassName` being null or empty means the partition field names are without keygen type. Suppose we drop the keygen config from table properties in a future release; will this still hold true?

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala: ##
@@ -528,6 +536,40 @@ object ProvidesHoodieConfig {
     filterNullValues(overridingOpts)
   }

+  /**
+   * @param tableConfigKeyGeneratorClassName     key generator class name in the table config.
+   * @param partitionFieldNamesWithoutKeyGenType partition field names without key generator types
+   *                                             from the table config.
+   * @param catalogTable                         HoodieCatalogTable instance to fetch table properties.
+   * @return the write config value to set for "hoodie.datasource.write.partitionpath.field".
+   */
+  def getPartitionPathFieldWriteConfig(tableConfigKeyGeneratorClassName: String,
+                                       partitionFieldNamesWithoutKeyGenType: String,
+                                       catalogTable: HoodieCatalogTable): String = {
+    if (StringUtils.isNullOrEmpty(tableConfigKeyGeneratorClassName)) {
+      partitionFieldNamesWithoutKeyGenType
+    } else {
+      val writeConfigPartitionField = catalogTable.catalogProperties.get(PARTITIONPATH_FIELD.key())
+      val keyGenClass = ReflectionUtils.getClass(tableConfigKeyGeneratorClassName)
+      if (classOf[CustomKeyGenerator].equals(keyGenClass)
+        || classOf[CustomAvroKeyGenerator].equals(keyGenClass)) {
+        // For custom key generator, we have to take the write config value from
+        // "hoodie.datasource.write.partitionpath.field" which contains the key generator
+        // type, whereas the table config only contains the partition field names without
+        // key generator types.
+        if (writeConfigPartitionField.isDefined) {
+          writeConfigPartitionField.get
+        } else {
+          log.warn("Write config \"hoodie.datasource.write.partitionpath.field\" is not set for "
+            + "custom key generator. This may fail the write operation.")
+          partitionFieldNamesWithoutKeyGenType

Review Comment:
Should we then fail early if the write is going to fail? Maybe make it a validation?
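The two suggestions in this review (checking for `:` and failing early) could look roughly like the sketch below. It reads the first suggestion as a check on the partition field string, which is an interpretation of the comment, and every name in it is hypothetical; none of this is code from the PR.

```scala
object KeyGenTypeCheck {
  // Split a comma-separated partition field value into trimmed, non-empty entries.
  private def fields(value: String): Seq[String] =
    if (value == null) Seq.empty
    else value.split(",").toSeq.map(_.trim).filter(_.nonEmpty)

  // True only if every partition field carries a key generator type, e.g. "ts:timestamp".
  def hasKeyGenTypes(partitionFields: String): Boolean = {
    val fs = fields(partitionFields)
    fs.nonEmpty && fs.forall(_.contains(":"))
  }

  // Validation-style variant: return an error message instead of logging a warning.
  def validateForCustomKeyGen(partitionFields: String): Either[String, String] =
    if (hasKeyGenTypes(partitionFields)) Right(partitionFields)
    else Left(s"Partition path field '$partitionFields' has no key generator types, " +
      "which a custom key generator requires.")
}
```

Returning an `Either` keeps the decision to fail with the caller, so a util method like this would not need to throw on its own.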
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615: URL: https://github.com/apache/hudi/pull/10615#issuecomment-2024589288 ## CI report: * 185d0fc1b26344563514603f9f5e600972feaaac Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23050)
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615: URL: https://github.com/apache/hudi/pull/10615#issuecomment-2024516329 ## CI report: * afc107a681bb6df8e1b856239a811ccac6b3b3db Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22314) * 185d0fc1b26344563514603f9f5e600972feaaac Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=23050)
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615: URL: https://github.com/apache/hudi/pull/10615#issuecomment-2024506879 ## CI report: * afc107a681bb6df8e1b856239a811ccac6b3b3db Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22314) * 185d0fc1b26344563514603f9f5e600972feaaac UNKNOWN
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615: URL: https://github.com/apache/hudi/pull/10615#issuecomment-1925553485 ## CI report: * afc107a681bb6df8e1b856239a811ccac6b3b3db Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22314)
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615: URL: https://github.com/apache/hudi/pull/10615#issuecomment-1925509035 ## CI report: * afc107a681bb6df8e1b856239a811ccac6b3b3db Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22314)
Re: [PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
hudi-bot commented on PR #10615: URL: https://github.com/apache/hudi/pull/10615#issuecomment-1925507573 ## CI report: * afc107a681bb6df8e1b856239a811ccac6b3b3db UNKNOWN
[PR] [HUDI-7378] Fix Spark SQL DML with custom key generator [hudi]
yihua opened a new pull request, #10615:
URL: https://github.com/apache/hudi/pull/10615

### Change Logs

_Describe context and summary for this change. Highlight if any code was copied._

### Impact

_Describe any public API or user-facing feature change or any performance impact._

### Risk level (write none, low medium or high below)

_If medium or high, explain what verification was done to mitigate the risks._

### Documentation Update

_Describe any necessary documentation update if there is any new feature, config, or user-facing change_

- _The config description must be updated if new configs are added or the default value of the configs are changed_
- _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make changes to the website._

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed