[GitHub] [hudi] hudi-bot commented on pull request #6732: [HUDI-4148] Add client for hudi table service manager
hudi-bot commented on PR #6732: URL: https://github.com/apache/hudi/pull/6732#issuecomment-1374727149 ## CI report: * af728e3eeb1fd694eba037bc9a48869831ddb053 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14172) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14171) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] hudi-bot commented on pull request #6732: [HUDI-4148] Add client for hudi table service manager
hudi-bot commented on PR #6732: URL: https://github.com/apache/hudi/pull/6732#issuecomment-1374725718 ## CI report: * af728e3eeb1fd694eba037bc9a48869831ddb053 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14171) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14172) * Unknown: [CANCELED](TBD)
[GitHub] [hudi] hudi-bot commented on pull request #6732: [HUDI-4148] Add client for hudi table service manager
hudi-bot commented on PR #6732: URL: https://github.com/apache/hudi/pull/6732#issuecomment-1374724389 ## CI report: * af728e3eeb1fd694eba037bc9a48869831ddb053 UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7621: [HUDI-5512] fix spark call procedure run_bootstrap missing conf and c…
hudi-bot commented on PR #7621: URL: https://github.com/apache/hudi/pull/7621#issuecomment-1374723047 ## CI report: * 2837b8dc79a5e968f9a15e3f79547dcc7f4b142f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14165) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14170)
[GitHub] [hudi] hudi-bot commented on pull request #7608: [HUDI-5503]Optimize flink table factory option check
hudi-bot commented on PR #7608: URL: https://github.com/apache/hudi/pull/7608#issuecomment-1374723003 ## CI report: * 158f8a9c55aecdfe8465e092651edbbd24f911f4 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14169)
[GitHub] [hudi] XuQianJin-Stars commented on pull request #7621: [HUDI-5512] fix spark call procedure run_bootstrap missing conf and c…
XuQianJin-Stars commented on PR #7621: URL: https://github.com/apache/hudi/pull/7621#issuecomment-1374715339 @hudi-bot run azure
[GitHub] [hudi] XuQianJin-Stars commented on pull request #7621: [HUDI-5512] fix spark call procedure run_bootstrap missing conf and c…
XuQianJin-Stars commented on PR #7621: URL: https://github.com/apache/hudi/pull/7621#issuecomment-1374714684 @hudi-bot run azure
[GitHub] [hudi] SteNicholas commented on a diff in pull request #7620: [HUDI-5511] Do not clean the CkpMetadata dir when restart the job
SteNicholas commented on code in PR #7620:
URL: https://github.com/apache/hudi/pull/7620#discussion_r1064075908

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/meta/CkpMetadata.java: ##

@@ -92,13 +92,14 @@ public void close() {
   // -
   /**
-   * Initialize the message bus, would clean all the messages
+   * Initialize the message bus, would keep all the messages.
    *
    * This expects to be called by the driver.
    */
   public void bootstrap() throws IOException {
-    fs.delete(path, true);
-    fs.mkdirs(path);
+    if (!fs.exists(path)) {

Review Comment:
If a checkpoint succeeds and the job then crashes suddenly, and the JM restarts on another machine instance, the ckp metadata isn't kept. This change only solves the scenario where the JM is on the same machine. WDYT?
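For readers following the diff above, here is a minimal sketch of the two bootstrap behaviours it contrasts. It is an illustration only: it uses `java.nio.file` on a local directory rather than the Hadoop `FileSystem` used in `CkpMetadata`, the class and method names are hypothetical, and the body of the new `if` branch is an assumption because the quoted hunk stops at the condition.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Hypothetical stand-in for the metadata directory handling (local files, not Hadoop). */
final class CkpDirBootstrap {

  /** Old behaviour in the diff: delete the directory recursively, then recreate it. */
  static void bootstrapClean(Path dir) throws IOException {
    if (Files.exists(dir)) {
      try (Stream<Path> entries = Files.walk(dir)) {
        // Reverse order visits children before their parent directories.
        entries.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
      }
    }
    Files.createDirectories(dir);
  }

  /** New behaviour in the diff: create the directory only when it is missing (assumed body). */
  static void bootstrapKeep(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      Files.createDirectories(dir);
    }
  }
}
```

With `bootstrapKeep`, files already present in the directory survive a second call, while `bootstrapClean` always starts from an empty directory. Whether the directory is visible from another machine depends on the underlying storage, which this local sketch does not model.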
[GitHub] [hudi] hudi-bot commented on pull request #7609: [HUDI-5504]Fix concurrency conflict when asyncCompaction is enabled
hudi-bot commented on PR #7609: URL: https://github.com/apache/hudi/pull/7609#issuecomment-1374708940 ## CI report: * 94a8e3bb534c386cc55c3150120c8e56b7596f29 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14158) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14163) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14168)
[GitHub] [hudi] davidshtian commented on issue #7591: [SUPPORT] Kinesis Data Analytics Flink1.13 to HUDI
davidshtian commented on issue #7591: URL: https://github.com/apache/hudi/issues/7591#issuecomment-1374699427 > @davidshtian @soumilshah1995 Have you tried the 1.13.2 version of the package _flink-s3-fs-hadoop-1.13.2.jar_? KDA [supports Apache Flink version 1.13.2](https://docs.aws.amazon.com/kinesisanalytics/latest/java/doc-history.html), thanks~
[GitHub] [hudi] hudi-bot commented on pull request #7608: [HUDI-5503]Optimize flink table factory option check
hudi-bot commented on PR #7608: URL: https://github.com/apache/hudi/pull/7608#issuecomment-1374699255 ## CI report: * f7391999a7868e7c97797823cab078a3e42f0bca Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14134) * 158f8a9c55aecdfe8465e092651edbbd24f911f4 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14169)
[GitHub] [hudi] hbgstc123 commented on a diff in pull request #7608: [HUDI-5503]Optimize flink table factory option check
hbgstc123 commented on code in PR #7608:
URL: https://github.com/apache/hudi/pull/7608#discussion_r1064076705

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java: ##

@@ -88,11 +95,44 @@ public DynamicTableSink createDynamicTableSink(Context context) {
     checkArgument(!StringUtils.isNullOrEmpty(conf.getString(FlinkOptions.PATH)),
         "Option [path] should not be empty.");
     ResolvedSchema schema = context.getCatalogTable().getResolvedSchema();
+    mergeTableConfig(conf, schema);
     sanityCheck(conf, schema);
     setupConfOptions(conf, context.getObjectIdentifier(), context.getCatalogTable(), schema);
     return new HoodieTableSink(conf, schema);
   }

+  /**
+   * fallback pk and pre-combine to table config if not provided
+   */
+  private void mergeTableConfig(Configuration conf, ResolvedSchema schema) {
+    String basePath = conf.getOptional(FlinkOptions.PATH).orElseThrow(() ->
+        new ValidationException("Option [path] should not be empty."));
+    Path metaPath = new CachingPath(basePath, METAFOLDER_NAME);
+    FileSystem fileSystem = FSUtils.getFs(metaPath, HadoopConfigurations.getHadoopConf(conf));
+    HoodieTableConfig tableConfig;
+    try {
+      tableConfig = new HoodieTableConfig(fileSystem, metaPath.toString(), null, null);
+    } catch (HoodieIOException e) {
+      LOG.info("Fail to get table config.", e);
+      return;
+    }
+
+    Map propsMap = tableConfig.propsMap();
+    List writeColumnNames = schema.getColumnNames();
+
+    if (!conf.contains(FlinkOptions.RECORD_KEY_FIELD) && !schema.getPrimaryKey().isPresent()
+        && propsMap.containsKey(HoodieTableConfig.RECORDKEY_FIELDS.key())
+        && writeColumnNames.contains(propsMap.get(HoodieTableConfig.RECORDKEY_FIELDS.key()))) {
+      conf.set(FlinkOptions.RECORD_KEY_FIELD, propsMap.get(HoodieTableConfig.RECORDKEY_FIELDS.key()));

Review Comment:
Right, will fix this. Thanks.
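The fallback condition in the hunk above can be sketched with plain maps. This is a simplified illustration, not the Hudi or Flink API: the class name, the two option keys, and the `schemaHasPrimaryKey` flag are hypothetical stand-ins for `FlinkOptions.RECORD_KEY_FIELD`, `HoodieTableConfig.RECORDKEY_FIELDS`, and `schema.getPrimaryKey().isPresent()`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, map-based sketch of the record key fallback shown in the diff. */
final class RecordKeyFallback {
  // Illustrative key names only; they are not real Hudi or Flink option keys.
  static final String WRITER_KEY = "writer.record.key";
  static final String TABLE_KEY = "table.record.key";

  /** Returns a copy of the writer options, filled from the table config when the fallback applies. */
  static Map<String, String> mergeTableConfig(
      Map<String, String> writerConf,
      boolean schemaHasPrimaryKey,
      Map<String, String> tableProps,
      List<String> writeColumns) {
    Map<String, String> merged = new HashMap<>(writerConf);
    String tableKey = tableProps.get(TABLE_KEY);
    // Fall back only if the writer set no key, the schema declares no primary key,
    // the table config has a key, and that key is one of the written columns.
    if (!merged.containsKey(WRITER_KEY)
        && !schemaHasPrimaryKey
        && tableKey != null
        && writeColumns.contains(tableKey)) {
      merged.put(WRITER_KEY, tableKey);
    }
    return merged;
  }
}
```

Returning a copy keeps the sketch free of side effects; the method in the diff instead sets the value on the passed-in `Configuration`.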
[GitHub] [hudi] hudi-bot commented on pull request #7608: [HUDI-5503]Optimize flink table factory option check
hudi-bot commented on PR #7608: URL: https://github.com/apache/hudi/pull/7608#issuecomment-1374698509 ## CI report: * f7391999a7868e7c97797823cab078a3e42f0bca Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14134) * 158f8a9c55aecdfe8465e092651edbbd24f911f4 UNKNOWN
[GitHub] [hudi] hbgstc123 commented on a diff in pull request #7608: [HUDI-5503]Optimize flink table factory option check
hbgstc123 commented on code in PR #7608:
URL: https://github.com/apache/hudi/pull/7608#discussion_r1064076453

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java: ##

@@ -69,7 +77,6 @@ public class HoodieTableFactory implements DynamicTableSourceFactory, DynamicTab
   public DynamicTableSource createDynamicTableSource(Context context) {
     Configuration conf = FlinkOptions.fromMap(context.getCatalogTable().getOptions());
     ResolvedSchema schema = context.getCatalogTable().getResolvedSchema();
-    sanityCheck(conf, schema);
     setupConfOptions(conf, context.getObjectIdentifier(), context.getCatalogTable(), schema);

Review Comment:
Oh, I missed that the pk field is used to emit delete data. I added a sanity check for stream reads of MOR tables.
[GitHub] [hudi] hbgstc123 commented on a diff in pull request #7608: [HUDI-5503]Optimize flink table factory option check
hbgstc123 commented on code in PR #7608:
URL: https://github.com/apache/hudi/pull/7608#discussion_r1064076345

## hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bulk/RowDataKeyGen.java: ##

@@ -134,7 +155,9 @@ public HoodieKey getHoodieKey(RowData rowData) {
   }

   public String getRecordKey(RowData rowData) {
-    if (this.simpleRecordKey) {
+    if (!hasRecordKey) {
+      return DEFAULT_RECORD_KEY;
+    } else if (this.simpleRecordKey) {

Review Comment:
I'm not sure whether removing the pk field would cause errors somewhere. Writing an identical value should take very little storage in a columnar file format like Parquet, whereas a UUID would take much more space: since it is unique, it cannot compress well. I also don't know where we could use the UUID, so I think storing an identical value for the pk is better. I changed the default key value to RowDataKeyGen.EMPTY_RECORDKEY_PLACEHOLDER, since an empty row key reports an error.
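The storage argument in the comment above (an identical key value compresses far better than a unique UUID per row) can be checked roughly with the sketch below. It is only an approximation under stated assumptions: gzip over newline-separated text stands in for Parquet's column encoding and compression, the class name is hypothetical, and the placeholder string used in the usage note is invented for the example rather than taken from Hudi.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.UUID;
import java.util.zip.GZIPOutputStream;

/** Rough comparison of how a constant key column and a UUID key column compress (gzip stand-in). */
final class KeyColumnCompression {

  /** Returns the gzip-compressed size in bytes of the given text. */
  static int gzipSize(String text) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
      gz.write(text.getBytes(StandardCharsets.UTF_8));
    }
    return out.size();
  }

  /** Builds a column where every row holds the same placeholder value. */
  static String constantColumn(int rows, String placeholder) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < rows; i++) {
      sb.append(placeholder).append('\n');
    }
    return sb.toString();
  }

  /** Builds a column where every row holds a fresh random UUID. */
  static String uuidColumn(int rows) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < rows; i++) {
      sb.append(UUID.randomUUID()).append('\n');
    }
    return sb.toString();
  }
}
```

Comparing `gzipSize(constantColumn(10000, "placeholder"))` with `gzipSize(uuidColumn(10000))` shows the direction of the effect; actual Parquet file sizes depend on the encodings and codec in use.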
[GitHub] [hudi] hudi-bot commented on pull request #7609: [HUDI-5504]Fix concurrency conflict when asyncCompaction is enabled
hudi-bot commented on PR #7609: URL: https://github.com/apache/hudi/pull/7609#issuecomment-1374685640 ## CI report: * 94a8e3bb534c386cc55c3150120c8e56b7596f29 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14158) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14163) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14168)
[GitHub] [hudi] hudi-bot commented on pull request #6361: [HUDI-4690][HUDI-4503] Cleaning up Hudi custom Spark `Rule`s
hudi-bot commented on PR #6361: URL: https://github.com/apache/hudi/pull/6361#issuecomment-1374684926 ## CI report: * a3f8cab6db30b8186e19c3c3ac1c85c0fe3fa63f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14167)
[GitHub] [hudi] ThinkerLei commented on pull request #7609: [HUDI-5504]Fix concurrency conflict when asyncCompaction is enabled
ThinkerLei commented on PR #7609: URL: https://github.com/apache/hudi/pull/7609#issuecomment-1374684521 @hudi-bot run azure
[GitHub] [hudi] hudi-bot commented on pull request #7609: [HUDI-5504]Fix concurrency conflict when asyncCompaction is enabled
hudi-bot commented on PR #7609: URL: https://github.com/apache/hudi/pull/7609#issuecomment-1374684204 ## CI report: * 94a8e3bb534c386cc55c3150120c8e56b7596f29 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14158) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14163)
[GitHub] [hudi] KnightChess commented on a diff in pull request #7607: [HUDI-5499] Fixing Spark SQL configs not being properly propagated for CTAS and other commands
KnightChess commented on code in PR #7607:
URL: https://github.com/apache/hudi/pull/7607#discussion_r1064068273

## hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/ProvidesHoodieConfig.scala: ##

@@ -81,10 +80,8 @@ trait ProvidesHoodieConfig extends Logging {
         HoodieSyncConfig.META_SYNC_PARTITION_FIELDS.key -> tableConfig.getPartitionFieldProp,
         HoodieSyncConfig.META_SYNC_PARTITION_EXTRACTOR_CLASS.key -> hiveSyncConfig.getStringOrDefault(HoodieSyncConfig.META_SYNC_PARTITION_EXTRACTOR_CLASS),
         HiveSyncConfigHolder.HIVE_SUPPORT_TIMESTAMP_TYPE.key -> hiveSyncConfig.getBoolean(HiveSyncConfigHolder.HIVE_SUPPORT_TIMESTAMP_TYPE).toString,
-        HoodieWriteConfig.UPSERT_PARALLELISM_VALUE.key -> hoodieProps.getString(HoodieWriteConfig.UPSERT_PARALLELISM_VALUE.key, "200"),

Review Comment:
> Does this mean that the upsert parallelism cannot be tuned anymore from the SQL statement? Generally, are the key-value pairs in `Map.apply()` just overrides?

The `combineOptions` method adds it from SQLConf, and the property priority logic is different from the old one: `Map.apply()` has the highest priority.
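The priority rule described in the reply above (values from SQLConf are merged in, and the explicit `Map.apply()` entries win) follows the usual layered-override pattern. The sketch below illustrates that pattern in general terms; it is not the `combineOptions` implementation, and the class name, method name, and option key are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of layered option merging where later layers take priority. */
final class OptionPrecedence {

  /** Merges the given layers, ordered from lowest to highest priority. */
  @SafeVarargs
  static Map<String, String> combine(Map<String, String>... layersLowToHigh) {
    Map<String, String> merged = new LinkedHashMap<>();
    for (Map<String, String> layer : layersLowToHigh) {
      // putAll overwrites earlier values, so the last layer that sets a key wins.
      merged.putAll(layer);
    }
    return merged;
  }
}
```

Under this model, a key that is set only in the lower-priority layer (for example a session-level parallelism setting) still reaches the merged result, which matches the reply's point that removing the hard-coded entry does not stop the value from being tuned.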
[GitHub] [hudi] ThinkerLei commented on pull request #7609: [HUDI-5504]Fix concurrency conflict when asyncCompaction is enabled
ThinkerLei commented on PR #7609: URL: https://github.com/apache/hudi/pull/7609#issuecomment-1374672841 Test failure has nothing to do with this PR, @hudi-bot run azure re-run the last Azure build
[GitHub] [hudi] hudi-bot commented on pull request #6361: [HUDI-4690][HUDI-4503] Cleaning up Hudi custom Spark `Rule`s
hudi-bot commented on PR #6361: URL: https://github.com/apache/hudi/pull/6361#issuecomment-1374649595 ## CI report: * edfcc047ac71663a47813ac4187a523cbd0e5c9e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14162) * a3f8cab6db30b8186e19c3c3ac1c85c0fe3fa63f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14167)
[GitHub] [hudi] hudi-bot commented on pull request #6361: [HUDI-4690][HUDI-4503] Cleaning up Hudi custom Spark `Rule`s
hudi-bot commented on PR #6361: URL: https://github.com/apache/hudi/pull/6361#issuecomment-1374648144 ## CI report: * edfcc047ac71663a47813ac4187a523cbd0e5c9e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14162) * a3f8cab6db30b8186e19c3c3ac1c85c0fe3fa63f UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7573: [HUDI-5484] Avoid using `GenericRecord` in `HoodieColumnStatMetadata`
hudi-bot commented on PR #7573: URL: https://github.com/apache/hudi/pull/7573#issuecomment-1374619773 ## CI report: * c59596637cd44124388717082704db7e7bb8bdaf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14164) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14166)
[GitHub] [hudi] hudi-bot commented on pull request #7573: [HUDI-5484] Avoid using `GenericRecord` in `HoodieColumnStatMetadata`
hudi-bot commented on PR #7573: URL: https://github.com/apache/hudi/pull/7573#issuecomment-1374569864 ## CI report: * c59596637cd44124388717082704db7e7bb8bdaf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14164) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14166)
[GitHub] [hudi] soumilshah1995 commented on issue #7591: [SUPPORT] Kinesis Data Analytics Flink1.13 to HUDI
soumilshah1995 commented on issue #7591: URL: https://github.com/apache/hudi/issues/7591#issuecomment-1374565548 @davidshtian
[GitHub] [hudi] cxzl25 commented on pull request #7573: [HUDI-5484] Avoid using `GenericRecord` in `HoodieColumnStatMetadata`
cxzl25 commented on PR #7573: URL: https://github.com/apache/hudi/pull/7573#issuecomment-1374546366 @hudi-bot run azure
[GitHub] [hudi] xushiyan commented on issue #7487: [SUPPORT] S3 Buckets reached quota limit when reading from hudi tables
xushiyan commented on issue #7487: URL: https://github.com/apache/hudi/issues/7487#issuecomment-1374545829 Is this still happening? Please share more info, such as what the job is doing when this occurs: is it reading or writing? The logs would tell. It's likely due to a lot of small files. Have you run clustering for this table? What do the writer configs look like?
[GitHub] [hudi] xushiyan commented on issue #7533: [SUPPORT] Recreate deleted metadata table
xushiyan commented on issue #7533: URL: https://github.com/apache/hudi/issues/7533#issuecomment-1374545025 @szingerpeter @yihua What is the latest state of this issue?
[GitHub] [hudi] xushiyan commented on issue #7494: FileNotFoundException while writing dataframe to local file system
xushiyan commented on issue #7494: URL: https://github.com/apache/hudi/issues/7494#issuecomment-1374544413
> java.io.FileNotFoundException: File file:/tmp/hudi_trips_cow_4 does not exist

Likely the file path scheme is not working. Please refer to @jonvex's complete example above. Closing this, as a working example was provided.
[GitHub] [hudi] xushiyan closed issue #7494: FileNotFoundException while writing dataframe to local file system
xushiyan closed issue #7494: FileNotFoundException while writing dataframe to local file system URL: https://github.com/apache/hudi/issues/7494
[GitHub] [hudi] xushiyan closed issue #7507: [SUPPORT] how to use flink offline with occ
xushiyan closed issue #7507: [SUPPORT] how to use flink offline with occ URL: https://github.com/apache/hudi/issues/7507
[GitHub] [hudi] xushiyan commented on issue #7507: [SUPPORT] how to use flink offline with occ
xushiyan commented on issue #7507: URL: https://github.com/apache/hudi/issues/7507#issuecomment-1374543732 Closing this, as a suggestion was provided.
[GitHub] [hudi] xushiyan closed issue #7530: Hudi Log files are increasing in our application day by day
xushiyan closed issue #7530: Hudi Log files are increasing in our application day by day URL: https://github.com/apache/hudi/issues/7530
[GitHub] [hudi] xushiyan commented on issue #7530: Hudi Log files are increasing in our application day by day
xushiyan commented on issue #7530: URL: https://github.com/apache/hudi/issues/7530#issuecomment-1374542671 Is this the same issue as https://github.com/apache/hudi/issues/7600? Let's consolidate the discussion in one place. I'm moving the discussion there, and will link this issue there and close this one.
[GitHub] [hudi] hudi-bot commented on pull request #7621: [HUDI-5512] fix spark call procedure run_bootstrap missing conf and c…
hudi-bot commented on PR #7621: URL: https://github.com/apache/hudi/pull/7621#issuecomment-1374542301 ## CI report: * 2837b8dc79a5e968f9a15e3f79547dcc7f4b142f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14165)
[GitHub] [hudi] hudi-bot commented on pull request #7573: [HUDI-5484] Avoid using `GenericRecord` in `HoodieColumnStatMetadata`
hudi-bot commented on PR #7573: URL: https://github.com/apache/hudi/pull/7573#issuecomment-1374542270 ## CI report: * c59596637cd44124388717082704db7e7bb8bdaf Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14164)
[GitHub] [hudi] xushiyan commented on issue #7531: [SUPPORT] table comments not fully supported
xushiyan commented on issue #7531: URL: https://github.com/apache/hudi/issues/7531#issuecomment-1374538276 @jonvex can you look into this please? looks like some config fixes should resolve it
[jira] [Created] (HUDI-5513) Improve documentation for spark-sql write configs
Jonathan Vexler created HUDI-5513:
-------------------------------------

    Summary: Improve documentation for spark-sql write configs
    Key: HUDI-5513
    URL: https://issues.apache.org/jira/browse/HUDI-5513
    Project: Apache Hudi
    Issue Type: Improvement
    Components: configs, spark-sql
    Reporter: Jonathan Vexler

Add documentation for how to set write configs in spark-sql, especially in the situation when working with multiple tables.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[GitHub] [hudi] xushiyan commented on issue #7539: [SUPPORT]IllegalStateException: Trying to access closed classloader
xushiyan commented on issue #7539: URL: https://github.com/apache/hudi/issues/7539#issuecomment-1374525865 @hbgstc123 does this happen every few hours or it only happened once so far? can you try upgrading to 0.12.2 and see how it goes?
[GitHub] [hudi] jomach commented on issue #7565: [SUPPORT] Memory Exception when building BuildProfile
jomach commented on issue #7565: URL: https://github.com/apache/hudi/issues/7565#issuecomment-1374525652 The executors are being killed due to memory exceptions. (OOM)
[jira] [Updated] (HUDI-5485) Improve performance of savepoint with MDT
[ https://issues.apache.org/jira/browse/HUDI-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-5485:
-----------------------------
    Component/s: metadata

> Improve performance of savepoint with MDT
> -----------------------------------------
>
>     Key: HUDI-5485
>     URL: https://issues.apache.org/jira/browse/HUDI-5485
>     Project: Apache Hudi
>     Issue Type: Improvement
>     Components: metadata
>     Reporter: Ethan Guo
>     Assignee: Ethan Guo
>     Priority: Critical
>     Fix For: 0.13.0
>
> [https://github.com/apache/hudi/issues/7541]
> When metadata table is enabled, the savepoint operation is slow for a large number of partitions (e.g., 75k). The root cause is that for each partition, the metadata table is scanned, which is unnecessary.
[jira] [Updated] (HUDI-5485) Improve performance of savepoint with MDT
[ https://issues.apache.org/jira/browse/HUDI-5485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-5485:
-----------------------------
    Priority: Blocker  (was: Critical)

> Improve performance of savepoint with MDT
> -----------------------------------------
>
>     Key: HUDI-5485
>     URL: https://issues.apache.org/jira/browse/HUDI-5485
>     Project: Apache Hudi
>     Issue Type: Improvement
>     Components: metadata
>     Reporter: Ethan Guo
>     Assignee: Ethan Guo
>     Priority: Blocker
>     Fix For: 0.13.0
>
> [https://github.com/apache/hudi/issues/7541]
> When metadata table is enabled, the savepoint operation is slow for a large number of partitions (e.g., 75k). The root cause is that for each partition, the metadata table is scanned, which is unnecessary.
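The root cause named in HUDI-5485 is the general pattern of repeating a full scan once per key. A rough plain-Java illustration of why that hurts at 75k partitions, assuming the metadata can be modeled as a flat list of (partition, file) entries. All names here (`PartitionScanSketch`, `Entry`, `scanPerPartition`, `scanOnce`) are hypothetical, and this is not the actual change made in Hudi:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PartitionScanSketch {
    // Hypothetical flattened metadata entry: one file that belongs to one partition.
    static final class Entry {
        final String partition;
        final String file;

        Entry(String partition, String file) {
            this.partition = partition;
            this.file = file;
        }
    }

    // Naive approach: one full pass over the metadata for every partition, O(P * N).
    static Map<String, List<String>> scanPerPartition(List<String> partitions, List<Entry> metadata) {
        Map<String, List<String>> result = new HashMap<>();
        for (String partition : partitions) {
            List<String> files = new ArrayList<>();
            for (Entry e : metadata) {
                if (e.partition.equals(partition)) {
                    files.add(e.file);
                }
            }
            result.put(partition, files);
        }
        return result;
    }

    // Single pass: visit each metadata entry once and bucket it by partition, O(P + N).
    static Map<String, List<String>> scanOnce(List<String> partitions, List<Entry> metadata) {
        Map<String, List<String>> result = new HashMap<>();
        for (String partition : partitions) {
            result.put(partition, new ArrayList<>());
        }
        for (Entry e : metadata) {
            List<String> files = result.get(e.partition);
            if (files != null) {
                files.add(e.file);
            }
        }
        return result;
    }
}
```

Both methods return the same mapping; only the number of passes over the metadata differs.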
[GitHub] [hudi] xushiyan commented on issue #7557: [SUPPORT]: org.apache.hudi.exception.HoodieException: Could not sync using the meta sync class org.apache.hudi.hive.HiveSyncTool
xushiyan commented on issue #7557: URL: https://github.com/apache/hudi/issues/7557#issuecomment-1374523969

> *Hive version : 1.2.1000

Hive 1.x is not supported. pls try upgrading to Hive 2.x or 3.x. Also if you're on hudi 0.11.0, pls consider upgrading to later patch releases like 0.11.1 or 0.12.2
[GitHub] [hudi] hudi-bot commented on pull request #7609: [HUDI-5504]Fix concurrency conflict when asyncCompaction is enabled
hudi-bot commented on PR #7609: URL: https://github.com/apache/hudi/pull/7609#issuecomment-1374523316 ## CI report: * 94a8e3bb534c386cc55c3150120c8e56b7596f29 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14158) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14163)
[GitHub] [hudi] xushiyan commented on issue #7565: [SUPPORT] Memory Exception when building BuildProfile
xushiyan commented on issue #7565: URL: https://github.com/apache/hudi/issues/7565#issuecomment-1374522126

```java
inputRecords
    .mapToPair(record -> Pair.of(
        new Tuple2<>(record.getPartitionPath(), Option.ofNullable(record.getCurrentLocation())),
        record))
    .countByKey();
```

you should refer to `org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor#buildProfile` which is used by spark. I think this is more of a spark job tuning issue, where parallelism and executor memory should be tuned.

> Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.

Any further info on this?
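For readers unfamiliar with the quoted snippet: Spark's `countByKey` collects a map from each distinct key to its record count back to the driver, so the result grows with the number of distinct (partition path, current location) pairs. A minimal plain-Java sketch of the same grouping, with no Spark or Hudi dependency. The `Rec` class and its fields are hypothetical stand-ins for the real record type, not Hudi's API:

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class WorkloadCountSketch {
    // Hypothetical stand-in for a record: a partition path plus an optional
    // current file location (absent for inserts, present for updates).
    static final class Rec {
        final String partitionPath;
        final Optional<String> currentLocation;

        Rec(String partitionPath, String currentLocationOrNull) {
            this.partitionPath = partitionPath;
            this.currentLocation = Optional.ofNullable(currentLocationOrNull);
        }
    }

    // Groups records by (partitionPath, currentLocation) and counts each group,
    // which is what countByKey does for a pair RDD keyed the same way.
    static Map<Map.Entry<String, Optional<String>>, Long> countByKey(List<Rec> records) {
        return records.stream().collect(Collectors.groupingBy(
            r -> new SimpleImmutableEntry<String, Optional<String>>(r.partitionPath, r.currentLocation),
            Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Rec> records = Arrays.asList(
            new Rec("2023/01/01", null),       // insert
            new Rec("2023/01/01", null),       // insert
            new Rec("2023/01/01", "file-1"),   // update to an existing file
            new Rec("2023/01/02", "file-2"));  // update to an existing file
        countByKey(records).forEach((key, count) ->
            System.out.println(key.getKey() + " / " + key.getValue() + " -> " + count));
    }
}
```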
[GitHub] [hudi] hudi-bot commented on pull request #7609: [HUDI-5504]Fix concurrency conflict when asyncCompaction is enabled
hudi-bot commented on PR #7609: URL: https://github.com/apache/hudi/pull/7609#issuecomment-1374521685 ## CI report: * 94a8e3bb534c386cc55c3150120c8e56b7596f29 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14158) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14163)
[GitHub] [hudi] xushiyan closed issue #7570: [SUPPORT]Sync hive lost some partitions when submit multiple commits at the same time
xushiyan closed issue #7570: [SUPPORT]Sync hive lost some partitions when submit multiple commits at the same time URL: https://github.com/apache/hudi/issues/7570
[GitHub] [hudi] xushiyan commented on issue #7589: Keep only clustered file(all) after cleaning
xushiyan commented on issue #7589: URL: https://github.com/apache/hudi/issues/7589#issuecomment-1374517461 @maheshguptags what you need is savepointing. see https://hudi.apache.org/docs/disaster_recovery For each clustering (replace commit), you just need to trigger a savepoint; the cleaner then won't delete the savepointed commit and its files, so they are retained until you delete the savepoint.
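The interaction described in that comment can be pictured with a toy model: a cleaner that keeps only the newest N commits but always skips any commit that has a savepoint. This is an illustrative sketch only, not Hudi's cleaner. The class and method names (`ToyTimeline`, `savepoint`, `deleteSavepoint`, `clean`) are hypothetical:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ToyTimeline {
    private final List<String> commits = new ArrayList<>();  // oldest first
    private final Set<String> savepoints = new HashSet<>();

    void commit(String instant) {
        commits.add(instant);
    }

    // Marks an existing commit as savepointed so that clean() will never remove it.
    void savepoint(String instant) {
        if (!commits.contains(instant)) {
            throw new IllegalArgumentException("unknown commit: " + instant);
        }
        savepoints.add(instant);
    }

    void deleteSavepoint(String instant) {
        savepoints.remove(instant);
    }

    // Removes every commit older than the newest `retain` commits,
    // except savepointed ones. Returns the commits that were removed.
    List<String> clean(int retain) {
        int cutoff = Math.max(0, commits.size() - retain);
        List<String> candidates = new ArrayList<>(commits.subList(0, cutoff));
        List<String> removed = new ArrayList<>();
        for (String instant : candidates) {
            if (!savepoints.contains(instant)) {
                commits.remove(instant);
                removed.add(instant);
            }
        }
        return removed;
    }

    List<String> commits() {
        return new ArrayList<>(commits);
    }
}
```

In this model, savepointing the clustering commit right after it completes keeps it on the timeline across any number of `clean` calls, and deleting the savepoint makes it eligible for removal again.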
[GitHub] [hudi] xushiyan closed issue #7596: [SUPPORT] java.lang.NoSuchMethodException: org.apache.hudi.utilities.sources.AvroKafkaSource when running HoodieDeltaStreamer
xushiyan closed issue #7596: [SUPPORT] java.lang.NoSuchMethodException: org.apache.hudi.utilities.sources.AvroKafkaSource when running HoodieDeltaStreamer URL: https://github.com/apache/hudi/issues/7596
[GitHub] [hudi] ThinkerLei commented on pull request #7609: [HUDI-5504]Fix concurrency conflict when asyncCompaction is enabled
ThinkerLei commented on PR #7609: URL: https://github.com/apache/hudi/pull/7609#issuecomment-1374513867 @hudi-bot run azure re-run the last Azure build
[GitHub] [hudi] xushiyan commented on issue #7600: Hoodie clean is not deleting old files for MOR table
xushiyan commented on issue #7600: URL: https://github.com/apache/hudi/issues/7600#issuecomment-1374513596 @SabyasachiDasTR have you observed any errors or warnings in the logs? it's likely that something is blocking the clean or failing it. Can you search the logs for any statement wrt "clean"? looks like cleaning just stopped at some point. yes you can use the cli to trigger clean manually. it won't impact the data. if you want to be cautious, you can try it out against a table clone first. If something is failing the clean, you'll get the same result though. Still need to check the logs.
[GitHub] [hudi] xushiyan commented on issue #7602: [SUPPORT] When does the Spark engine's bulk insert mode support bucket index
xushiyan commented on issue #7602: URL: https://github.com/apache/hudi/issues/7602#issuecomment-1374503920 @minihippo can you please advise? it's gonna be a very useful improvement
[GitHub] [hudi] xushiyan commented on issue #7617: [SUPPORT] Hudi "write" command doesn't fail when on incompatible partition type, but "read" command fails.
xushiyan commented on issue #7617: URL: https://github.com/apache/hudi/issues/7617#issuecomment-1374501572 @jonvex can you help verify this with 0.12.2 and master version pls? just to confirm if the behavior was fixed.
[GitHub] [hudi] hudi-bot commented on pull request #7621: [HUDI-5512] fix spark call procedure run_bootstrap missing conf and c…
hudi-bot commented on PR #7621: URL: https://github.com/apache/hudi/pull/7621#issuecomment-1374499632 ## CI report: * 2837b8dc79a5e968f9a15e3f79547dcc7f4b142f Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14165)
[GitHub] [hudi] hudi-bot commented on pull request #6732: [HUDI-4148] Add client for hudi table service manager
hudi-bot commented on PR #6732: URL: https://github.com/apache/hudi/pull/6732#issuecomment-1374499289 ## CI report:
[GitHub] [hudi] hudi-bot commented on pull request #7621: [HUDI-5512] fix spark call procedure run_bootstrap missing conf and c…
hudi-bot commented on PR #7621: URL: https://github.com/apache/hudi/pull/7621#issuecomment-1374498072 ## CI report: * 2837b8dc79a5e968f9a15e3f79547dcc7f4b142f UNKNOWN
[GitHub] [hudi] hudi-bot commented on pull request #7573: [HUDI-5484] Avoid using `GenericRecord` in `HoodieColumnStatMetadata`
hudi-bot commented on PR #7573: URL: https://github.com/apache/hudi/pull/7573#issuecomment-1374498015 ## CI report: * 1ac267ba9af690ecd47f74f60c34851387aee9eb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14080) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14083) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14089) * c59596637cd44124388717082704db7e7bb8bdaf Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14164)
[GitHub] [hudi] hudi-bot commented on pull request #6732: [HUDI-4148] Add client for hudi table service manager
hudi-bot commented on PR #6732: URL: https://github.com/apache/hudi/pull/6732#issuecomment-1374497670 ## CI report:
[GitHub] [hudi] hudi-bot commented on pull request #7573: [HUDI-5484] Avoid using `GenericRecord` in `HoodieColumnStatMetadata`
hudi-bot commented on PR #7573: URL: https://github.com/apache/hudi/pull/7573#issuecomment-1374496229 ## CI report: * 1ac267ba9af690ecd47f74f60c34851387aee9eb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14080) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14083) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14089) * c59596637cd44124388717082704db7e7bb8bdaf UNKNOWN
[GitHub] [hudi] yuzhaojing closed pull request #6732: [HUDI-4148] Add client for hudi table service manager
yuzhaojing closed pull request #6732: [HUDI-4148] Add client for hudi table service manager URL: https://github.com/apache/hudi/pull/6732
[GitHub] [hudi] yuzhaojing opened a new pull request, #6732: [HUDI-4148] Add client for hudi table service manager
yuzhaojing opened a new pull request, #6732: URL: https://github.com/apache/hudi/pull/6732

### Change Logs

Refactor the table service part of BaseHoodieWriteClient and wrap it into BaseHoodieTableServiceClient.

_About the Public API for the table service part of BaseHoodieWriteClient._

Add BaseTableServiceClient.

### Impact

Affects core writer paths

### Risk level

Medium

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
[jira] [Updated] (HUDI-5512) spark call procedure run_bootstrap missing params cause job fail
[ https://issues.apache.org/jira/browse/HUDI-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-5512:
---------------------------------
    Labels: pull-request-available  (was: )

> spark call procedure run_bootstrap missing params cause job fail
> ----------------------------------------------------------------
>
>     Key: HUDI-5512
>     URL: https://issues.apache.org/jira/browse/HUDI-5512
>     Project: Apache Hudi
>     Issue Type: Bug
>     Components: spark-sql
>     Reporter: KnightChess
>     Assignee: KnightChess
>     Priority: Major
>     Labels: pull-request-available
>
> # spark sql call procedure run_bootstrap loses many confs when saving to `hoodie.properties`
> # some confs sometimes do not take effect, like key_gen_class
[GitHub] [hudi] KnightChess opened a new pull request, #7621: [HUDI-5512] fix spark call procedure run_bootstrap missing conf and c…
KnightChess opened a new pull request, #7621: URL: https://github.com/apache/hudi/pull/7621

### Change Logs

- According to the init bootstrap table code in `BootstrapExecutor` and `HoodieSparkSqlWriter`, add some conf to hoodie.properties
- fix `key_generator_class` not taking effect

### Impact

None, current conf will contain all old conf

### Risk level (write none, low medium or high below)

low

### Documentation Update

_Describe any necessary documentation update if there is any new feature, config, or user-facing change_

none

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
[jira] [Created] (HUDI-5512) spark call procedure run_bootstrap missing params cause job fail
KnightChess created HUDI-5512:
---------------------------------

    Summary: spark call procedure run_bootstrap missing params cause job fail
    Key: HUDI-5512
    URL: https://issues.apache.org/jira/browse/HUDI-5512
    Project: Apache Hudi
    Issue Type: Bug
    Components: spark-sql
    Reporter: KnightChess
    Assignee: KnightChess

# spark sql call procedure run_bootstrap loses many confs when saving to `hoodie.properties`
# some confs sometimes do not take effect, like key_gen_class
[GitHub] [hudi] yuzhaojing commented on pull request #6732: [HUDI-4148] Add client for hudi table service manager
yuzhaojing commented on PR #6732: URL: https://github.com/apache/hudi/pull/6732#issuecomment-1374481061 @hudi-bot run azure
[GitHub] [hudi] hudi-bot commented on pull request #7609: [HUDI-5504]Fix concurrency conflict when asyncCompaction is enabled
hudi-bot commented on PR #7609: URL: https://github.com/apache/hudi/pull/7609#issuecomment-1374470670 ## CI report: * 94a8e3bb534c386cc55c3150120c8e56b7596f29 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14158) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14163)
[GitHub] [hudi] hudi-bot commented on pull request #7620: [HUDI-5511] Do not clean the CkpMetadata dir when restart the job
hudi-bot commented on PR #7620: URL: https://github.com/apache/hudi/pull/7620#issuecomment-1374429123 ## CI report: * cd670233392323f8602950a5d2595661b668f3e9 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14161)
[GitHub] [hudi] hudi-bot commented on pull request #6361: [HUDI-4690][HUDI-4503] Cleaning up Hudi custom Spark `Rule`s
hudi-bot commented on PR #6361: URL: https://github.com/apache/hudi/pull/6361#issuecomment-1374428866 ## CI report: * edfcc047ac71663a47813ac4187a523cbd0e5c9e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14162)
[GitHub] [hudi] hudi-bot commented on pull request #7609: [HUDI-5504]Fix concurrency conflict when asyncCompaction is enabled
hudi-bot commented on PR #7609: URL: https://github.com/apache/hudi/pull/7609#issuecomment-1374415222 ## CI report: * 94a8e3bb534c386cc55c3150120c8e56b7596f29 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14158) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=14163)