[GitHub] [hudi] nsivabalan commented on pull request #3762: [HUDI-1294] Adding inline read and seek based read(batch get) for hfile log blocks in metadata table
nsivabalan commented on pull request #3762: URL: https://github.com/apache/hudi/pull/3762#issuecomment-943066396 @hudi-bot azure run -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] nsivabalan commented on pull request #3762: [HUDI-1294] Adding inline read and seek based read(batch get) for hfile log blocks in metadata table
nsivabalan commented on pull request #3762: URL: https://github.com/apache/hudi/pull/3762#issuecomment-943061856 @hudi-bot azure run
[GitHub] [hudi] nsivabalan commented on pull request #3794: [HUDI-2553] Metadata table compaction trigger max delta commits default config (re-enable)
nsivabalan commented on pull request #3794: URL: https://github.com/apache/hudi/pull/3794#issuecomment-943060753 @hudi-bot azure run
[GitHub] [hudi] t822876884 opened a new issue #3796: [SUPPORT] Flink write to hudi,after running for a period of time,throw a NoClassDefFoundError
t822876884 opened a new issue #3796: URL: https://github.com/apache/hudi/issues/3796 hudi 0.9.0 flink 1.12.2 ```java public static void main(String[] args) { //ENV StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); env.setStateBackend(new FsStateBackend(YARN_CKP_PATH)); env.enableCheckpointing(6); env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE); env.setParallelism(1); EnvironmentSettings settings = EnvironmentSettings.newInstance().useBlinkPlanner() .inStreamingMode().build(); StreamTableEnvironment tableEnvironment = StreamTableEnvironment.create(env, settings); FlinkKafkaConsumer consumer = new FlinkKafkaConsumer(KAFKA_TOPIC, new SimpleStringSchema(), kafkaProperties()); consumer.setStartFromTimestamp(163379520L); //SOURCE DataStreamSource yarnDS = env .addSource(consumer) .setParallelism(8); DataStream dataDs = yarnDS.filter(new FilterFunction() { @Override public boolean filter(String value) throws Exception { String type = JSONObject.parseObject(value).getString("type"); if (("yarn").equals(type)) { return true; } return false; } }).setParallelism(4) .map(new MapFunction() { @Override public YarnDataEntity map(String value) throws Exception { String data = JSONObject.parseObject(value).getString("data"); YarnDataEntity yarnDataEntities = JSONObject.parseObject(data, YarnDataEntity.class); yarnDataEntities.setDt(DateUtil.convertTimeByLong(yarnDataEntities.getStartedTime())); return yarnDataEntities; } }).setParallelism(8); Table dataDsYarn = tableEnvironment.fromDataStream(dataDs); //Table result = tableEnvironment.sqlQuery("SELECT * FROM " + dataDsYarn); //tableEnvironment.toAppendStream(result, YarnDataEntity.class).print(); tableEnvironment.executeSql("CREATE TABLE big_data_analyse_yarn(" + " allocatedMB INT," + " allocatedVCores INT," + " amContainerLogs VARCHAR(200)," + " amHostHttpAddress VARCHAR(200)," + " amNodeLabelExpression VARCHAR(200)," + " amRPCAddress VARCHAR(20)," + " 
appNodeLabelExpression VARCHAR(200)," + " applicationTags VARCHAR(200)," + " applicationType VARCHAR(20)," + " clusterId BIGINT," + " clusterUsagePercentage FLOAT," + " diagnostics VARCHAR(200)," + " dt VARCHAR(20)," + " elapsedTime BIGINT, " + " finalStatus VARCHAR(200)," + " finishedTime BIGINT," + " id VARCHAR(200)," + " logAggregationStatus VARCHAR(200)," + " memorySeconds BIGINT, " + " name VARCHAR(200)," + " numAMContainerPreempted INT, " + " numNonAMContainerPreempted INT, " + " preemptedResourceMB int," + " preemptedResourceVCores BIGINT, " + " priority VARCHAR(200)," + " progress FLOAT, " + " queue VARCHAR(200)," + " queueUsagePercentage FLOAT, " + " runningContainers INT, " + " startedTime BIGINT," + " `state` VARCHAR(200)," + " trackingUI VARCHAR(200)," + " trackingUrl VARCHAR(200)," + " unmanagedApplication boolean," + " `user` VARCHAR(20)," + " vcoreSeconds BIGINT" + ")" + " PARTITIONED BY (dt)" + "WITH (" + " 'connector' = 'hudi'," + " 'path' = '"+ YARN_DATA_PATH +"'," + " 'write.tasks' = '8'," + " 'read.streaming.enabled'= 'true', " + " 'table.type' = 'MERGE_ON_READ', " + " 'read.streaming.check-interval' = '30'," + " 'write.precombine.field' = 'dt'," + " 'hoodie.datasource.write.operation' = 'insert'," + " 'hoodie.datasource.write.recordkey.field' = 'id' " + " )"); tableEnvironment.executeSql("insert into big_data_analyse_yarn select * from " + dataDsYarn); } ``` ``` org.ap
[GitHub] [hudi] hudi-bot edited a comment on pull request #3519: [DO NOT MERGE] 0.9.0 release patch for flink
hudi-bot edited a comment on pull request #3519: URL: https://github.com/apache/hudi/pull/3519#issuecomment-903204631 ## CI report: * fd423c27cc15e112b99d8102ab7f5cb9a5d623c5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2637) * 1d3142cd55878ba81a358bf0b4d194779585bada Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2638) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build
[GitHub] [hudi] hudi-bot edited a comment on pull request #3519: [DO NOT MERGE] 0.9.0 release patch for flink
hudi-bot edited a comment on pull request #3519: URL: https://github.com/apache/hudi/pull/3519#issuecomment-903204631 ## CI report: * fd423c27cc15e112b99d8102ab7f5cb9a5d623c5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2637) * 1d3142cd55878ba81a358bf0b4d194779585bada UNKNOWN
[GitHub] [hudi] garyli1019 commented on a change in pull request #3792: [HUDI-2551] Support DefaultHoodieRecordPayload for flink
garyli1019 commented on a change in pull request #3792: URL: https://github.com/apache/hudi/pull/3792#discussion_r728655818 ## File path: hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.java ## @@ -189,6 +190,10 @@ public static HoodieWriteConfig getHoodieClientConfig(Configuration conf) { .enable(conf.getBoolean(FlinkOptions.METADATA_ENABLED)) .withMaxNumDeltaCommitsBeforeCompaction(conf.getInteger(FlinkOptions.METADATA_COMPACTION_DELTA_COMMITS)) .build()) +.withPayloadConfig(HoodiePayloadConfig.newBuilder() + .withPayloadOrderingField(conf.getString(FlinkOptions.PRECOMBINE_FIELD)) + .withPayloadEventTimeField(conf.getString(FlinkOptions.RECORD_KEY_FIELD)) Review comment: Hmm, I am not sure I understand what "when it is needed" means. Even if users set the same field for both, the two options have completely different identities.
[jira] [Resolved] (HUDI-2551) Support DefaultHoodieRecordPayload for flink
[ https://issues.apache.org/jira/browse/HUDI-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen resolved HUDI-2551. -- Resolution: Fixed Fixed via master branch: f897e6d73ebc26d32017774d452389023f53f742 > Support DefaultHoodieRecordPayload for flink > > > Key: HUDI-2551 > URL: https://issues.apache.org/jira/browse/HUDI-2551 > Project: Apache Hudi > Issue Type: Bug > Components: Flink Integration >Reporter: Danny Chen >Assignee: Danny Chen >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated: [HUDI-2551] Support DefaultHoodieRecordPayload for flink (#3792)
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new f897e6d [HUDI-2551] Support DefaultHoodieRecordPayload for flink (#3792) f897e6d is described below commit f897e6d73ebc26d32017774d452389023f53f742 Author: Danny Chan AuthorDate: Thu Oct 14 13:46:53 2021 +0800 [HUDI-2551] Support DefaultHoodieRecordPayload for flink (#3792) --- .../hudi/execution/FlinkLazyInsertIterable.java| 2 +- .../apache/hudi/configuration/FlinkOptions.java| 2 +- .../hudi/sink/bootstrap/BootstrapOperator.java | 7 ++ .../bootstrap/batch/BatchBootstrapOperator.java| 5 .../java/org/apache/hudi/util/StreamerUtil.java| 5 .../apache/hudi/table/HoodieDataSourceITCase.java | 29 ++ 6 files changed, 48 insertions(+), 2 deletions(-) diff --git a/hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/execution/FlinkLazyInsertIterable.java b/hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/execution/FlinkLazyInsertIterable.java index 8769f63..b0674b2 100644 --- a/hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/execution/FlinkLazyInsertIterable.java +++ b/hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/execution/FlinkLazyInsertIterable.java @@ -65,7 +65,7 @@ public class FlinkLazyInsertIterable extends Hood try { final Schema schema = new Schema.Parser().parse(hoodieConfig.getSchema()); bufferedIteratorExecutor = - new BoundedInMemoryExecutor<>(hoodieConfig.getWriteBufferLimitBytes(), new IteratorBasedQueueProducer<>(inputItr), Option.of(getInsertHandler()), getTransformFunction(schema)); + new BoundedInMemoryExecutor<>(hoodieConfig.getWriteBufferLimitBytes(), new IteratorBasedQueueProducer<>(inputItr), Option.of(getInsertHandler()), getTransformFunction(schema, hoodieConfig)); final List result = bufferedIteratorExecutor.execute(); assert result != null && !result.isEmpty() && 
!bufferedIteratorExecutor.isRemaining(); return result; diff --git a/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java b/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java index 81bd517..b2359f4 100644 --- a/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java +++ b/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java @@ -100,7 +100,7 @@ public class FlinkOptions extends HoodieConfig { public static final ConfigOption METADATA_COMPACTION_DELTA_COMMITS = ConfigOptions .key("metadata.compaction.delta_commits") .intType() - .defaultValue(24) + .defaultValue(10) .withDescription("Max delta commits for metadata table to trigger compaction, default 24"); // diff --git a/hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/BootstrapOperator.java b/hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/BootstrapOperator.java index 3ac7aa1..0e7bb54 100644 --- a/hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/BootstrapOperator.java +++ b/hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/BootstrapOperator.java @@ -129,6 +129,13 @@ public class BootstrapOperator WriteOperationType.fromValue(conf.getString(FlinkOptions.OPERATION)), HoodieTableType.valueOf(conf.getString(FlinkOptions.TABLE_TYPE))); +preLoadIndexRecords(); + } + + /** + * Load the index records before {@link #processElement}. 
+ */ + protected void preLoadIndexRecords() throws Exception { String basePath = hoodieTable.getMetaClient().getBasePath(); int taskID = getRuntimeContext().getIndexOfThisSubtask(); LOG.info("Start loading records in table {} into the index state, taskId = {}", basePath, taskID); diff --git a/hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/batch/BatchBootstrapOperator.java b/hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/batch/BatchBootstrapOperator.java index ac4c2b1..258f884 100644 --- a/hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/batch/BatchBootstrapOperator.java +++ b/hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/batch/BatchBootstrapOperator.java @@ -57,6 +57,11 @@ public class BatchBootstrapOperator } @Override + protected void preLoadIndexRecords() { +// no operation + } + + @Override @SuppressWarnings("unchecked") public void processElement(StreamRecord element) throws Exception { final HoodieRecord record = (HoodieRecord) element.getValue(); diff --git a/hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.java b/hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.java index cfa2980..7fb550d 100644 --- a/hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.ja
[GitHub] [hudi] danny0405 merged pull request #3792: [HUDI-2551] Support DefaultHoodieRecordPayload for flink
danny0405 merged pull request #3792: URL: https://github.com/apache/hudi/pull/3792
[GitHub] [hudi] danny0405 commented on a change in pull request #3203: [HUDI-2086] Refactor hive mor_incremental_view
danny0405 commented on a change in pull request #3203: URL: https://github.com/apache/hudi/pull/3203#discussion_r728651340 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieParquetRealtimeInputFormat.java ## @@ -66,6 +90,170 @@ return HoodieRealtimeInputFormatUtils.getRealtimeSplits(job, fileSplits); } + /** + * Keep the logical of mor_incr_view as same as spark datasource. + * Step1: Get list of commits to be fetched based on start commit and max commits(for snapshot max commits is -1). Review comment: `logical` is an adjective; please use the noun `logic` instead.
[GitHub] [hudi] hudi-bot edited a comment on pull request #3519: [DO NOT MERGE] 0.9.0 release patch for flink
hudi-bot edited a comment on pull request #3519: URL: https://github.com/apache/hudi/pull/3519#issuecomment-903204631 ## CI report: * fd423c27cc15e112b99d8102ab7f5cb9a5d623c5 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2637)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3792: [HUDI-2551] Support DefaultHoodieRecordPayload for flink
hudi-bot edited a comment on pull request #3792: URL: https://github.com/apache/hudi/pull/3792#issuecomment-942232592 ## CI report: * 677cbef4d404808777dad21fc19e68b332b0ef0b Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2636)
[GitHub] [hudi] gaoshihang commented on issue #3790: [SUPPORT]Flink-cdc write to COW hudi table record duplicate
gaoshihang commented on issue #3790: URL: https://github.com/apache/hudi/issues/3790#issuecomment-942939360 > No, Spark also needs this option, but with a different option key. Thanks!
[GitHub] [hudi] danny0405 commented on issue #3790: [SUPPORT]Flink-cdc write to COW hudi table record duplicate
danny0405 commented on issue #3790: URL: https://github.com/apache/hudi/issues/3790#issuecomment-942938716 No, Spark also needs this option, but with a different option key.
[GitHub] [hudi] gaoshihang commented on issue #3790: [SUPPORT]Flink-cdc write to COW hudi table record duplicate
gaoshihang commented on issue #3790: URL: https://github.com/apache/hudi/issues/3790#issuecomment-942932423 May I ask another question: is this unique to Flink?
[GitHub] [hudi] gaoshihang closed issue #3790: [SUPPORT]Flink-cdc write to COW hudi table record duplicate
gaoshihang closed issue #3790: URL: https://github.com/apache/hudi/issues/3790
[GitHub] [hudi] gaoshihang commented on issue #3790: [SUPPORT]Flink-cdc write to COW hudi table record duplicate
gaoshihang commented on issue #3790: URL: https://github.com/apache/hudi/issues/3790#issuecomment-942929519 Thank you very much! That resolved my problem.
[GitHub] [hudi] danny0405 commented on a change in pull request #3792: [HUDI-2551] Support DefaultHoodieRecordPayload for flink
danny0405 commented on a change in pull request #3792: URL: https://github.com/apache/hudi/pull/3792#discussion_r728620696 ## File path: hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.java ## @@ -189,6 +190,10 @@ public static HoodieWriteConfig getHoodieClientConfig(Configuration conf) { .enable(conf.getBoolean(FlinkOptions.METADATA_ENABLED)) .withMaxNumDeltaCommitsBeforeCompaction(conf.getInteger(FlinkOptions.METADATA_COMPACTION_DELTA_COMMITS)) .build()) +.withPayloadConfig(HoodiePayloadConfig.newBuilder() + .withPayloadOrderingField(conf.getString(FlinkOptions.PRECOMBINE_FIELD)) + .withPayloadEventTimeField(conf.getString(FlinkOptions.RECORD_KEY_FIELD)) Review comment: Yes, we can add that when it is needed.
[GitHub] [hudi] hudi-bot edited a comment on pull request #3519: [DO NOT MERGE] 0.9.0 release patch for flink
hudi-bot edited a comment on pull request #3519: URL: https://github.com/apache/hudi/pull/3519#issuecomment-903204631 ## CI report: * 1b66ae85aed5f2f0d1542323be74216d062a5ca6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2612) * fd423c27cc15e112b99d8102ab7f5cb9a5d623c5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2637)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3519: [DO NOT MERGE] 0.9.0 release patch for flink
hudi-bot edited a comment on pull request #3519: URL: https://github.com/apache/hudi/pull/3519#issuecomment-903204631 ## CI report: * 1b66ae85aed5f2f0d1542323be74216d062a5ca6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2612) * fd423c27cc15e112b99d8102ab7f5cb9a5d623c5 UNKNOWN
[GitHub] [hudi] hudi-bot edited a comment on pull request #3792: [HUDI-2551] Support DefaultHoodieRecordPayload for flink
hudi-bot edited a comment on pull request #3792: URL: https://github.com/apache/hudi/pull/3792#issuecomment-942232592 ## CI report: * ba0dc0a6169de8b9a2c6ee9659ee9b7750d4d5b4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2621) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2634) * 677cbef4d404808777dad21fc19e68b332b0ef0b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2636)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3792: [HUDI-2551] Support DefaultHoodieRecordPayload for flink
hudi-bot edited a comment on pull request #3792: URL: https://github.com/apache/hudi/pull/3792#issuecomment-942232592 ## CI report: * ba0dc0a6169de8b9a2c6ee9659ee9b7750d4d5b4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2621) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2634) * 677cbef4d404808777dad21fc19e68b332b0ef0b UNKNOWN
[GitHub] [hudi] hudi-bot edited a comment on pull request #3765: [HUDI-2533] New option for hoodieClusteringJob to check, rollback and re-execute the last failed clustering job
hudi-bot edited a comment on pull request #3765: URL: https://github.com/apache/hudi/pull/3765#issuecomment-938488397 ## CI report: * faf3186897f0a7ab71d63cf5736ab45ae49347cb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2613) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2615) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2633)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3792: [HUDI-2551] Support DefaultHoodieRecordPayload for flink
hudi-bot edited a comment on pull request #3792: URL: https://github.com/apache/hudi/pull/3792#issuecomment-942232592 ## CI report: * ba0dc0a6169de8b9a2c6ee9659ee9b7750d4d5b4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2621) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2634)
[GitHub] [hudi] garyli1019 commented on a change in pull request #3792: [HUDI-2551] Support DefaultHoodieRecordPayload for flink
garyli1019 commented on a change in pull request #3792: URL: https://github.com/apache/hudi/pull/3792#discussion_r728607020 ## File path: hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.java ## @@ -189,6 +190,10 @@ public static HoodieWriteConfig getHoodieClientConfig(Configuration conf) { .enable(conf.getBoolean(FlinkOptions.METADATA_ENABLED)) .withMaxNumDeltaCommitsBeforeCompaction(conf.getInteger(FlinkOptions.METADATA_COMPACTION_DELTA_COMMITS)) .build()) +.withPayloadConfig(HoodiePayloadConfig.newBuilder() + .withPayloadOrderingField(conf.getString(FlinkOptions.PRECOMBINE_FIELD)) + .withPayloadEventTimeField(conf.getString(FlinkOptions.RECORD_KEY_FIELD)) Review comment: Then we need an EVENTTIME_FIELD, right? We can default it to PRECOMBINE_FIELD, but I think in some cases users may set two separate fields for this.
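A minimal sketch of the fallback proposed in this review comment, assuming a plain `Map<String, String>` in place of Flink's `Configuration`. The class `PayloadFieldResolver` and the key `payload.event.time.field` are hypothetical names used only for illustration; `write.precombine.field` is the option key quoted in issue #3796 above, and the `ts` default is an assumption, not something this thread confirms.

```java
import java.util.HashMap;
import java.util.Map;

class PayloadFieldResolver {
    // Option key as it appears in the table DDL quoted in issue #3796.
    static final String PRECOMBINE_KEY = "write.precombine.field";
    // Hypothetical key for a dedicated event time option (not a confirmed Hudi option).
    static final String EVENT_TIME_KEY = "payload.event.time.field";
    // Assumed default for the ordering field.
    static final String DEFAULT_PRECOMBINE = "ts";

    /** Field used to order records with the same key. */
    static String orderingField(Map<String, String> conf) {
        return conf.getOrDefault(PRECOMBINE_KEY, DEFAULT_PRECOMBINE);
    }

    /** Event time field: the explicit setting if present, else the ordering field. */
    static String eventTimeField(Map<String, String> conf) {
        String explicit = conf.get(EVENT_TIME_KEY);
        return (explicit != null && !explicit.isEmpty()) ? explicit : orderingField(conf);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PRECOMBINE_KEY, "dt");
        System.out.println(eventTimeField(conf)); // falls back to the ordering field

        conf.put(EVENT_TIME_KEY, "startedTime");
        System.out.println(eventTimeField(conf)); // explicit setting wins
    }
}
```

Keeping the two lookups separate reflects the point made in the thread: the ordering field and the event time field can share a value by default while remaining distinct options.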
[GitHub] [hudi] danny0405 commented on issue #3790: [SUPPORT]Flink-cdc write to COW hudi table record duplicate
danny0405 commented on issue #3790: URL: https://github.com/apache/hudi/issues/3790#issuecomment-942910695 You need to set the option `write.insert.drop.duplicates` explicitly to deduplicate before merging; see the documentation: https://www.yuque.com/docs/share/01c98494-a980-414c-9c45-152023bf3c17?#pqEWP
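As an illustration of where that option would go, here is a small sketch that assembles the `WITH (...)` clause of a Flink SQL table definition. The option name `write.insert.drop.duplicates` comes from the comment above and the other keys from the DDL in issue #3796; the class name, the helper methods, the path, and the `COPY_ON_WRITE` value are assumptions for this example. The sketch only builds a string and does not call Flink or Hudi.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class DropDuplicatesDdl {

    /** Connector options for a Hudi table; the path is a placeholder supplied by the caller. */
    static Map<String, String> hudiOptions(String path) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("connector", "hudi");
        options.put("path", path);
        options.put("table.type", "COPY_ON_WRITE");
        // Explicitly ask the writer to deduplicate before merging.
        options.put("write.insert.drop.duplicates", "true");
        return options;
    }

    /** Renders the options as a SQL WITH clause, one quoted key/value pair per line. */
    static String withClause(Map<String, String> options) {
        return options.entrySet().stream()
            .map(e -> "  '" + e.getKey() + "' = '" + e.getValue() + "'")
            .collect(Collectors.joining(",\n", "WITH (\n", "\n)"));
    }

    public static void main(String[] args) {
        System.out.println(withClause(hudiOptions("/tmp/hudi/demo_table")));
    }
}
```

The rendered clause can be appended to a `CREATE TABLE` statement passed to `executeSql`, as the job in issue #3796 does with its own option list.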
[GitHub] [hudi] danny0405 commented on a change in pull request #3792: [HUDI-2551] Support DefaultHoodieRecordPayload for flink
danny0405 commented on a change in pull request #3792: URL: https://github.com/apache/hudi/pull/3792#discussion_r728601033 ## File path: hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/BootstrapOperator.java ## @@ -129,6 +129,13 @@ public void initializeState(StateInitializationContext context) throws Exception WriteOperationType.fromValue(conf.getString(FlinkOptions.OPERATION)), HoodieTableType.valueOf(conf.getString(FlinkOptions.TABLE_TYPE))); +preLoadIndexRecords(); + } + + /** + * Load the index records before {@link #processElement}. + */ + protected void preLoadIndexRecords() throws Exception { Review comment: Yes, this is just to fix the duplicate index loading in the test cases.
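The change under review extracts the index loading into an overridable hook, which the batch operator in the same commit overrides with a no-op. A stripped-down sketch of that pattern follows; `BootstrapSketch`, `BatchBootstrapSketch`, and the `loads` counter are hypothetical stand-ins, not the real operator classes.

```java
class BootstrapSketch {
    // Counts how many times the index was loaded, for demonstration only.
    int loads = 0;

    /** Mirrors the shape of initializeState(): set up, then call the hook. */
    void initializeState() {
        preLoadIndexRecords();
    }

    /** Hook: load the index records before any element is processed. */
    protected void preLoadIndexRecords() {
        loads++;
    }
}

class BatchBootstrapSketch extends BootstrapSketch {
    /** The batch variant skips the eager load entirely. */
    @Override
    protected void preLoadIndexRecords() {
        // no operation
    }

    public static void main(String[] args) {
        BootstrapSketch streaming = new BootstrapSketch();
        streaming.initializeState();
        BootstrapSketch batch = new BatchBootstrapSketch();
        batch.initializeState();
        System.out.println(streaming.loads + " " + batch.loads);
    }
}
```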
[GitHub] [hudi] danny0405 commented on a change in pull request #3792: [HUDI-2551] Support DefaultHoodieRecordPayload for flink
danny0405 commented on a change in pull request #3792: URL: https://github.com/apache/hudi/pull/3792#discussion_r728600818 ## File path: hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.java ## @@ -189,6 +190,10 @@ public static HoodieWriteConfig getHoodieClientConfig(Configuration conf) { .enable(conf.getBoolean(FlinkOptions.METADATA_ENABLED)) .withMaxNumDeltaCommitsBeforeCompaction(conf.getInteger(FlinkOptions.METADATA_COMPACTION_DELTA_COMMITS)) .build()) +.withPayloadConfig(HoodiePayloadConfig.newBuilder() + .withPayloadOrderingField(conf.getString(FlinkOptions.PRECOMBINE_FIELD)) + .withPayloadEventTimeField(conf.getString(FlinkOptions.RECORD_KEY_FIELD)) Review comment: Yes, you are right. As a default, we may use `FlinkOptions.PRECOMBINE_FIELD` as the event time field.
[GitHub] [hudi] danny0405 commented on a change in pull request #3787: [HUDI-2548] Flink streaming reader misses the rolling over file handles
danny0405 commented on a change in pull request #3787: URL: https://github.com/apache/hudi/pull/3787#discussion_r728599250 ## File path: hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java ## @@ -139,7 +139,7 @@ public void processElement(I value, ProcessFunction.Context ctx, Coll public void close() { if (this.writeClient != null) { this.writeClient.cleanHandlesGracefully(); - this.writeClient.close(); + // this.writeClient.close(); Review comment: Because the embedded timeline server is a JVM process singleton, if one thread starts to close the server, the other threads that still need the server would hit exceptions. We would fix this by making the server a driver service in a following PR.
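The hazard described in the comment above (one thread closing a JVM-wide singleton while others still use it) can be illustrated with a generic reference-counting sketch. This is not the approach planned for Hudi, which per the comment is to run the server as a driver service; `SharedServerSketch` and its methods are hypothetical names invented for this illustration and are not Hudi classes.

```java
/** Hypothetical stand-in for a JVM-wide singleton service; not a Hudi class. */
final class SharedServerSketch {
  private static final SharedServerSketch INSTANCE = new SharedServerSketch();

  private int users;        // number of active users, guarded by the class lock
  private boolean running;  // whether the service is up, guarded by the class lock

  private SharedServerSketch() {}

  /** Registers one more user and starts the service if it was stopped. */
  static synchronized SharedServerSketch acquire() {
    if (INSTANCE.users++ == 0) {
      INSTANCE.running = true;
    }
    return INSTANCE;
  }

  /** Drops one user; only the last release actually stops the service. */
  static synchronized void release() {
    if (INSTANCE.users == 0) {
      return; // nothing to release, ignore the extra call
    }
    if (--INSTANCE.users == 0) {
      INSTANCE.running = false;
    }
  }

  static synchronized boolean isRunning() {
    return INSTANCE.running;
  }
}
```

With this shape, a writer calling `release()` in its own `close()` cannot stop the service under another writer that has called `acquire()` and not yet released.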
[GitHub] [hudi] SteNicholas commented on a change in pull request #3787: [HUDI-2548] Flink streaming reader misses the rolling over file handles
SteNicholas commented on a change in pull request #3787: URL: https://github.com/apache/hudi/pull/3787#discussion_r728579081 ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieCommitMetadata.java ## @@ -142,6 +143,60 @@ public WriteOperationType getOperationType() { return fileGroupIdToFullPaths; } + /** + * Extract the file status of all affected files from the commit metadata. If a file has + * been touched multiple times in the given commits, the return value will keep the one + * from the latest commit. + * + * @param basePath The base path + * @return the file full path to file status mapping + */ + public Map getFullPathToFileStatus(String basePath) { +Map fullPathToFileStatus = new HashMap<>(); +for (List stats : getPartitionToWriteStats().values()) { + // Iterate through all the written files. + for (HoodieWriteStat stat : stats) { +String relativeFilePath = stat.getPath(); +Path fullPath = relativeFilePath != null ? FSUtils.getPartitionPath(basePath, relativeFilePath) : null; Review comment: IMO, this could directly check whether relativeFilePath is null to put fileStatus into fullPathToFileStatus. ## File path: hudi-common/src/main/java/org/apache/hudi/common/model/HoodieCommitMetadata.java ## @@ -142,6 +143,60 @@ public WriteOperationType getOperationType() { return fileGroupIdToFullPaths; } + /** + * Extract the file status of all affected files from the commit metadata. If a file has + * been touched multiple times in the given commits, the return value will keep the one + * from the latest commit. + * + * @param basePath The base path + * @return the file full path to file status mapping + */ + public Map getFullPathToFileStatus(String basePath) { +Map fullPathToFileStatus = new HashMap<>(); +for (List stats : getPartitionToWriteStats().values()) { + // Iterate through all the written files. + for (HoodieWriteStat stat : stats) { +String relativeFilePath = stat.getPath(); +Path fullPath = relativeFilePath != null ? 
FSUtils.getPartitionPath(basePath, relativeFilePath) : null; +if (fullPath != null) { + FileStatus fileStatus = new FileStatus(stat.getFileSizeInBytes(), false, 0, 0, + 0, fullPath); + fullPathToFileStatus.put(fullPath.getName(), fileStatus); +} + } +} +return fullPathToFileStatus; + } + + /** + * Extract the file status of all affected files from the commit metadata. If a file has + * been touched multiple times in the given commits, the return value will keep the one + * from the latest commit by file group ID. + * + * Note: different with {@link #getFullPathToFileStatus(String)}, + * only the latest commit file for a file group is returned, + * this is an optimization for COPY_ON_WRITE table to eliminate legacy files for filesystem view. + * + * @param basePath The base path + * @return the file ID to file status mapping + */ + public Map getFileIdToFileStatus(String basePath) { +Map fileIdToFileStatus = new HashMap<>(); +for (List stats : getPartitionToWriteStats().values()) { + // Iterate through all the written files. + for (HoodieWriteStat stat : stats) { +String relativeFilePath = stat.getPath(); +Path fullPath = relativeFilePath != null ? FSUtils.getPartitionPath(basePath, relativeFilePath) : null; Review comment: Ditto. ## File path: hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java ## @@ -139,7 +139,7 @@ public void processElement(I value, ProcessFunction.Context ctx, Coll public void close() { if (this.writeClient != null) { this.writeClient.cleanHandlesGracefully(); - this.writeClient.close(); + // this.writeClient.close(); Review comment: Why doesn't this invoke the `close` method of the write client? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
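The first review comment above suggests checking `relativeFilePath` for null directly instead of first building a nullable `fullPath` and then testing it. A minimal sketch of that shape follows, assuming simplified stand-ins: `Stat` is a hypothetical replacement for `HoodieWriteStat`, paths are plain strings rather than Hadoop `Path` objects, and the map holds file sizes instead of `FileStatus` values.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FullPathSketch {
  /** Hypothetical stand-in for HoodieWriteStat: a relative path (may be null) and a size. */
  static final class Stat {
    final String path;
    final long sizeBytes;

    Stat(String path, long sizeBytes) {
      this.path = path;
      this.sizeBytes = sizeBytes;
    }
  }

  /** Maps file name to size; a later stat for the same file name overwrites an earlier one. */
  static Map<String, Long> fileNameToSize(String basePath, List<Stat> stats) {
    Map<String, Long> result = new HashMap<>();
    for (Stat stat : stats) {
      String relativeFilePath = stat.path;
      // Single null check on the input; no intermediate nullable fullPath.
      if (relativeFilePath != null) {
        String fullPath = basePath + "/" + relativeFilePath;
        String fileName = fullPath.substring(fullPath.lastIndexOf('/') + 1);
        result.put(fileName, stat.sizeBytes);
      }
    }
    return result;
  }
}
```

The behaviour is the same as the ternary-plus-null-check version in the diff; only the control flow is flatter.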
[GitHub] [hudi] danny0405 commented on a change in pull request #3741: [HUDI-2501] Add HoodieData abstraction and refactor compaction actions in hudi-client module
danny0405 commented on a change in pull request #3741: URL: https://github.com/apache/hudi/pull/3741#discussion_r728597616 ## File path: hudi-common/src/main/java/org/apache/hudi/common/data/HoodieData.java ## @@ -0,0 +1,72 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.hudi.common.data; + +import org.apache.hudi.common.function.SerializableFunction; + +import java.io.Serializable; +import java.util.Iterator; +import java.util.List; +import java.util.Properties; + +/** + * An abstraction for a data collection of objects in type T to store the reference + * and do transformation. + * + * @param type of object. + */ +public abstract class HoodieData implements Serializable { Review comment: `HoodieCollection` seems a better name because it is mainly used to avoid the Java annotation diff.
[GitHub] [hudi] danny0405 commented on a change in pull request #3741: [HUDI-2501] Add HoodieData abstraction and refactor compaction actions in hudi-client module
danny0405 commented on a change in pull request #3741: URL: https://github.com/apache/hudi/pull/3741#discussion_r728594783 ## File path: hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/HoodieFlinkWriteClient.java ## @@ -383,7 +388,13 @@ public void completeCompaction( protected List compact(String compactionInstantTime, boolean shouldComplete) { // only used for metadata table, the compaction happens in single thread try { - List writeStatuses = FlinkCompactHelpers.compact(compactionInstantTime, this); + RunCompactionActionExecutor compactionExecutor = new RunCompactionActionExecutor( + context, config, getHoodieTable(), compactionInstantTime, this, Review comment: A better way is to move the `new RunCompactionActionExecutor` construction into the `HoodieFlinkTable` implementation of `#compact`.
[GitHub] [hudi] hudi-bot edited a comment on pull request #3668: [RFC-33] [HUDI-2429][WIP] Full schema evolution
hudi-bot edited a comment on pull request #3668: URL: https://github.com/apache/hudi/pull/3668#issuecomment-919855741 ## CI report: * aaf33e3a28680ab5febca7df70937ce543619a94 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2632) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build
[GitHub] [hudi] danny0405 commented on a change in pull request #3741: [HUDI-2501] Add HoodieData abstraction and refactor compaction actions in hudi-client module
danny0405 commented on a change in pull request #3741: URL: https://github.com/apache/hudi/pull/3741#discussion_r728587644 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/HoodieWriteMetadata.java ## @@ -46,6 +46,36 @@ public HoodieWriteMetadata() { } + /** + * Clones the write metadata with transformed write statuses. + * + * @param transformedWriteStatuses transformed write statuses + * @param type of transformed write statuses + * @return Cloned {@link HoodieWriteMetadata} instance + */ + public HoodieWriteMetadata clone(T transformedWriteStatuses) { +HoodieWriteMetadata newMetadataInstance = new HoodieWriteMetadata<>(); +newMetadataInstance.setWriteStatuses(transformedWriteStatuses); +if (indexLookupDuration.isPresent()) { Review comment: We should find a way to eliminate the metadata clone, which is hard to maintain and buggy.
[GitHub] [hudi] garyli1019 commented on a change in pull request #3792: [HUDI-2551] Support DefaultHoodieRecordPayload for flink
garyli1019 commented on a change in pull request #3792: URL: https://github.com/apache/hudi/pull/3792#discussion_r728585655 ## File path: hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/BootstrapOperator.java ## @@ -129,6 +129,13 @@ public void initializeState(StateInitializationContext context) throws Exception WriteOperationType.fromValue(conf.getString(FlinkOptions.OPERATION)), HoodieTableType.valueOf(conf.getString(FlinkOptions.TABLE_TYPE))); +preLoadIndexRecords(); + } + + /** + * Load the index records before {@link #processElement}. + */ + protected void preLoadIndexRecords() throws Exception { Review comment: Is this not related to this PR? ## File path: hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.java ## @@ -189,6 +190,10 @@ public static HoodieWriteConfig getHoodieClientConfig(Configuration conf) { .enable(conf.getBoolean(FlinkOptions.METADATA_ENABLED)) .withMaxNumDeltaCommitsBeforeCompaction(conf.getInteger(FlinkOptions.METADATA_COMPACTION_DELTA_COMMITS)) .build()) +.withPayloadConfig(HoodiePayloadConfig.newBuilder() + .withPayloadOrderingField(conf.getString(FlinkOptions.PRECOMBINE_FIELD)) + .withPayloadEventTimeField(conf.getString(FlinkOptions.RECORD_KEY_FIELD)) Review comment: The event time key is actually different from the record key. It should be in a timestamp format. Should we add another option?
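The question raised above (a separate event time option versus reusing an existing field) and the reply in this thread proposing the precombine field as the default can be combined in a small fallback rule. The sketch below is an illustration only: the class, the two configuration keys, and the plain `Map` are hypothetical stand-ins and do not correspond to real `FlinkOptions` entries or to the code in the PR.

```java
import java.util.Map;

final class EventTimeFieldSketch {
  // Both keys are hypothetical names chosen for this illustration.
  static final String PRECOMBINE_KEY = "sketch.precombine.field";
  static final String EVENT_TIME_KEY = "sketch.event.time.field";

  /**
   * Returns the explicitly configured event time field if present,
   * otherwise falls back to the precombine (ordering) field.
   * The record key field is never used as the event time field.
   */
  static String resolveEventTimeField(Map<String, String> conf) {
    String explicit = conf.get(EVENT_TIME_KEY);
    if (explicit != null && !explicit.isEmpty()) {
      return explicit;
    }
    return conf.get(PRECOMBINE_KEY);
  }
}
```

A rule like this keeps existing jobs working without a new mandatory option while still letting users point the event time at a dedicated timestamp column.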
[GitHub] [hudi] hudi-bot edited a comment on pull request #3792: [HUDI-2551] Support DefaultHoodieRecordPayload for flink
hudi-bot edited a comment on pull request #3792: URL: https://github.com/apache/hudi/pull/3792#issuecomment-942232592 ## CI report: * ba0dc0a6169de8b9a2c6ee9659ee9b7750d4d5b4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2621) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2634)
[GitHub] [hudi] Neo966 opened a new issue #3795: [SUPPORT] hive query hudi error
Neo966 opened a new issue #3795: URL: https://github.com/apache/hudi/issues/3795
Hive version: 2.1.1, Flink: 1.12.2, Scala: 2.11
1. hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat; select count(*) from xxx; // returns 103828120, which is not right; the actual number is 18874368.
2. hive.input.format=org.apache.hudi.hadoop.hive.HoodieCombineHiveInputFormat; select count(*) from xxx; // Error: Error while processing statement: FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. org/apache/hadoop/hive/common/StringInternUtils (state=08S01,code=-101)
3. hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat; select count(*) from xxx; // 18874368 select * from xxx limit 18874360, 10; // works, displays the last 8 records normally. select count(*) from xxx where name = 'lisi'; // 2097152 select * from xxx where name = 'lisi' limit 2097150, 10; // wrong result: no records are returned, but the last 2 records should be.
[GitHub] [hudi] hudi-bot edited a comment on pull request #3765: [HUDI-2533] New option for hoodieClusteringJob to check, rollback and re-execute the last failed clustering job
hudi-bot edited a comment on pull request #3765: URL: https://github.com/apache/hudi/pull/3765#issuecomment-938488397 ## CI report: * faf3186897f0a7ab71d63cf5736ab45ae49347cb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2613) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2615) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2633)
[GitHub] [hudi] danny0405 commented on pull request #3792: [HUDI-2551] Support DefaultHoodieRecordPayload for flink
danny0405 commented on pull request #3792: URL: https://github.com/apache/hudi/pull/3792#issuecomment-942892659 @hudi-bot run azure
[GitHub] [hudi] zhangyue19921010 commented on pull request #3765: [HUDI-2533] New option for hoodieClusteringJob to check, rollback and re-execute the last failed clustering job
zhangyue19921010 commented on pull request #3765: URL: https://github.com/apache/hudi/pull/3765#issuecomment-942892121 @hudi-bot run azure
[jira] [Resolved] (HUDI-2548) Flink streaming reader misses the rolling over file handles
[ https://issues.apache.org/jira/browse/HUDI-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Chen resolved HUDI-2548. -- Resolution: Fixed Fixed via master branch: abf3e3fe71cd92a4129cf110a5206fbcfb3b1ae2 > Flink streaming reader misses the rolling over file handles > --- > > Key: HUDI-2548 > URL: https://issues.apache.org/jira/browse/HUDI-2548 > Project: Apache Hudi > Issue Type: Bug > Components: Flink Integration >Reporter: Danny Chen >Assignee: Danny Chen >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] zhangyue19921010 removed a comment on pull request #3765: [HUDI-2533] New option for hoodieClusteringJob to check, rollback and re-execute the last failed clustering job
zhangyue19921010 removed a comment on pull request #3765: URL: https://github.com/apache/hudi/pull/3765#issuecomment-942114128 @hudi-bot run azure
[hudi] branch master updated (cff384d -> abf3e3f)
This is an automated email from the ASF dual-hosted git repository. danny0405 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from cff384d [HUDI-2552] Fixing some test failures to unblock broken CI master (#3793) add abf3e3f [HUDI-2548] Flink streaming reader misses the rolling over file handles (#3787) No new revisions were added by this update. Summary of changes: .../java/org/apache/hudi/io/HoodieMergeHandle.java | 2 +- .../hudi/common/model/HoodieCommitMetadata.java| 59 - .../table/timeline/HoodieArchivedTimeline.java | 13 ++-- .../apache/hudi/configuration/FlinkOptions.java| 4 +- .../org/apache/hudi/sink/StreamWriteFunction.java | 5 +- .../hudi/sink/StreamWriteOperatorCoordinator.java | 8 ++- .../sink/partitioner/profile/WriteProfile.java | 2 +- .../sink/partitioner/profile/WriteProfiles.java| 77 -- .../apache/hudi/source/IncrementalInputSplits.java | 23 ++- .../hudi/source/StreamReadMonitoringFunction.java | 2 +- .../apache/hudi/streamer/FlinkStreamerConfig.java | 4 +- .../java/org/apache/hudi/util/StreamerUtil.java| 12 +++- .../org/apache/hudi/sink/TestWriteCopyOnWrite.java | 13 ++-- .../apache/hudi/table/HoodieDataSourceITCase.java | 38 +-- .../hudi/hadoop/utils/HoodieInputFormatUtils.java | 72 ++-- .../hudi/MergeOnReadIncrementalRelation.scala | 17 ++--- 16 files changed, 225 insertions(+), 126 deletions(-)
[GitHub] [hudi] danny0405 merged pull request #3787: [HUDI-2548] Flink streaming reader misses the rolling over file handles
danny0405 merged pull request #3787: URL: https://github.com/apache/hudi/pull/3787
[GitHub] [hudi] hudi-bot edited a comment on pull request #3668: [RFC-33] [HUDI-2429][WIP] Full schema evolution
hudi-bot edited a comment on pull request #3668: URL: https://github.com/apache/hudi/pull/3668#issuecomment-919855741 ## CI report: * 89dab78876b2512aa4967ced70da27f6fdb46b14 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2619) * aaf33e3a28680ab5febca7df70937ce543619a94 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2632)
[GitHub] [hudi] SteNicholas removed a comment on pull request #3779: [HUDI-2503] HoodieFlinkWriteClient supports to allow parallel writing to tables using Locking service
SteNicholas removed a comment on pull request #3779: URL: https://github.com/apache/hudi/pull/3779#issuecomment-942881620 > stored @danny0405 , IMO, I could do some work to share the flink index which is stored in the state for this pull request. What do you think about?
[GitHub] [hudi] SteNicholas commented on pull request #3779: [HUDI-2503] HoodieFlinkWriteClient supports to allow parallel writing to tables using Locking service
SteNicholas commented on pull request #3779: URL: https://github.com/apache/hudi/pull/3779#issuecomment-942881963 > Please do not merge before we can share the flink index which is stored in the state. @danny0405, IMO, I could do some work in this pull request to share the Flink index that is stored in the state. What do you think?
[GitHub] [hudi] SteNicholas edited a comment on pull request #3779: [HUDI-2503] HoodieFlinkWriteClient supports to allow parallel writing to tables using Locking service
SteNicholas edited a comment on pull request #3779: URL: https://github.com/apache/hudi/pull/3779#issuecomment-942881620 > stored @danny0405 , IMO, I could do some work to share the flink index which is stored in the state for this pull request. What do you think about?
[GitHub] [hudi] SteNicholas removed a comment on pull request #3779: [HUDI-2503] HoodieFlinkWriteClient supports to allow parallel writing to tables using Locking service
SteNicholas removed a comment on pull request #3779: URL: https://github.com/apache/hudi/pull/3779#issuecomment-942881691 > stored @danny0405 , IMO, I could do some work to share the flink index which is stored in the state for this pull request. What do you think about?
[GitHub] [hudi] SteNicholas commented on pull request #3779: [HUDI-2503] HoodieFlinkWriteClient supports to allow parallel writing to tables using Locking service
SteNicholas commented on pull request #3779: URL: https://github.com/apache/hudi/pull/3779#issuecomment-942881620
[GitHub] [hudi] nsivabalan commented on pull request #3762: [HUDI-1294] Adding inline read and seek based read(batch get) for hfile log blocks in metadata table
nsivabalan commented on pull request #3762: URL: https://github.com/apache/hudi/pull/3762#issuecomment-942878688 @hudi-bot azure run
[jira] [Commented] (HUDI-2549) Exceptions when using second writer into Hudi table managed by DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428561#comment-17428561 ] sivabalan narayanan commented on HUDI-2549: --- Hey Dave. In my local setup, I did not have deltastreamer doing 1 commit per 30 seconds. Also, my spark writer was fast and its start time did not interfere with deltastreamer. But I ensured that deltastreamer did not have any issues with checkpoints when a spark writer was interleaved. > Exceptions when using second writer into Hudi table managed by DeltaStreamer > > > Key: HUDI-2549 > URL: https://issues.apache.org/jira/browse/HUDI-2549 > Project: Apache Hudi > Issue Type: Bug > Components: DeltaStreamer, Spark Integration, Writer Core >Reporter: Dave Hagman >Assignee: Dave Hagman >Priority: Critical > Labels: multi-writer, sev:critical > Fix For: 0.10.0 > > > When running the DeltaStreamer along with a second spark datasource writer > (with [ZK-based OCC > enabled|https://hudi.apache.org/docs/concurrency_control#enabling-multi-writing]) > we receive the following exception (which halts the spark datasource > writer). 
This occurs following warnings of timeline inconsistencies: > > {code:java} > 21/10/07 17:10:05 INFO TransactionManager: Transaction ending with > transaction owner Option{val=[==>20211007170717__commit__INFLIGHT]} > 21/10/07 17:10:05 INFO ZookeeperBasedLockProvider: RELEASING lock > atZkBasePath = /events/test/mwc/v1, lock key = events_mwc_test_v1 > 21/10/07 17:10:05 INFO ZookeeperBasedLockProvider: RELEASED lock atZkBasePath > = /events/test/mwc/v1, lock key = events_mwc_test_v1 > 21/10/07 17:10:05 INFO TransactionManager: Transaction ended > Exception in thread "main" java.lang.IllegalArgumentException > at > org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:31) > at > org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:414) > at > org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:395) > at > org.apache.hudi.common.table.timeline.HoodieActiveTimeline.saveAsComplete(HoodieActiveTimeline.java:153) > at > org.apache.hudi.client.AbstractHoodieWriteClient.commit(AbstractHoodieWriteClient.java:218) > at > org.apache.hudi.client.AbstractHoodieWriteClient.commitStats(AbstractHoodieWriteClient.java:190) > at > org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:124) > at > org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:617) > at > org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:274) > at > org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164) > at > org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) > at > 
org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90) > at > org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:185) > at > org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:223) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:220) > at > org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:181) > at > org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:134) > at > org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:133) > at > org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989) > at > org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107) > at > org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232) > at > org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:110) > at > org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:135) > at > org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107) > at > org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232) > at > org.apache.spark
[GitHub] [hudi] nsivabalan commented on pull request #3794: [HUDI-2553] Metadata table compaction trigger max delta commits default config (re-enable)
nsivabalan commented on pull request #3794: URL: https://github.com/apache/hudi/pull/3794#issuecomment-942876773 @hudi-bot azure run
[GitHub] [hudi] hudi-bot edited a comment on pull request #3668: [RFC-33] [HUDI-2429][WIP] Full schema evolution
hudi-bot edited a comment on pull request #3668: URL: https://github.com/apache/hudi/pull/3668#issuecomment-919855741 ## CI report: * 89dab78876b2512aa4967ced70da27f6fdb46b14 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2619) * aaf33e3a28680ab5febca7df70937ce543619a94 UNKNOWN
[GitHub] [hudi] hudi-bot edited a comment on pull request #3794: [HUDI-2553] Metadata table compaction trigger max delta commits default config (re-enable)
hudi-bot edited a comment on pull request #3794: URL: https://github.com/apache/hudi/pull/3794#issuecomment-942793927 ## CI report: * 31852dac3234f80b094392197a34ac5704f2e784 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2631) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] hudi-bot edited a comment on pull request #3794: [HUDI-2553] Metadata table compaction trigger max delta commits default config (re-enable)
hudi-bot edited a comment on pull request #3794: URL: https://github.com/apache/hudi/pull/3794#issuecomment-942793927 ## CI report: * f4b16e728f180c9fc4655ae052bb89b2f6a1ff8b Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2630) * 31852dac3234f80b094392197a34ac5704f2e784 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2631) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] hudi-bot edited a comment on pull request #3794: [HUDI-2553] Metadata table compaction trigger max delta commits default config (re-enable)
hudi-bot edited a comment on pull request #3794: URL: https://github.com/apache/hudi/pull/3794#issuecomment-942793927 ## CI report: * f4b16e728f180c9fc4655ae052bb89b2f6a1ff8b Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2630) * 31852dac3234f80b094392197a34ac5704f2e784 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (HUDI-2553) Re-enable max delta commits for metadata table to 10
[ https://issues.apache.org/jira/browse/HUDI-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-2553: - Labels: pull-request-available (was: ) > Re-enable max delta commits for metadata table to 10 > > > Key: HUDI-2553 > URL: https://issues.apache.org/jira/browse/HUDI-2553 > Project: Apache Hudi > Issue Type: Improvement >Reporter: sivabalan narayanan >Assignee: Manoj Govindassamy >Priority: Major > Labels: pull-request-available > > our CI was broken recently. hence reverted couple of tests and the default > value for max delta commits for metadata table. > [https://github.com/apache/hudi/pull/3793] > > Please set it back to 10. Lets re-run CI for the patch few times to ensure > there are no flakiness. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3794: [HUDI-2553] Metadata table compaction trigger max delta commits default config (re-enable)
hudi-bot edited a comment on pull request #3794: URL: https://github.com/apache/hudi/pull/3794#issuecomment-942793927 ## CI report: * f4b16e728f180c9fc4655ae052bb89b2f6a1ff8b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2630) * 31852dac3234f80b094392197a34ac5704f2e784 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] hudi-bot edited a comment on pull request #3741: [HUDI-2501] Add HoodieData abstraction and refactor compaction actions in hudi-client module
hudi-bot edited a comment on pull request #3741: URL: https://github.com/apache/hudi/pull/3741#issuecomment-931660346 ## CI report: * 333c80ea94b4ed248108d68357e5729bd6613104 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2629) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (HUDI-2553) Re-enable max delta commits for metadata table to 10
[ https://issues.apache.org/jira/browse/HUDI-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-2553: - Status: In Progress (was: Open) > Re-enable max delta commits for metadata table to 10 > > > Key: HUDI-2553 > URL: https://issues.apache.org/jira/browse/HUDI-2553 > Project: Apache Hudi > Issue Type: Improvement >Reporter: sivabalan narayanan >Assignee: Manoj Govindassamy >Priority: Major > > our CI was broken recently. hence reverted couple of tests and the default > value for max delta commits for metadata table. > [https://github.com/apache/hudi/pull/3793] > > Please set it back to 10. Lets re-run CI for the patch few times to ensure > there are no flakiness. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-2553) Re-enable max delta commits for metadata table to 10
[ https://issues.apache.org/jira/browse/HUDI-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-2553: - Status: Patch Available (was: In Progress) > Re-enable max delta commits for metadata table to 10 > > > Key: HUDI-2553 > URL: https://issues.apache.org/jira/browse/HUDI-2553 > Project: Apache Hudi > Issue Type: Improvement >Reporter: sivabalan narayanan >Assignee: Manoj Govindassamy >Priority: Major > > our CI was broken recently. hence reverted couple of tests and the default > value for max delta commits for metadata table. > [https://github.com/apache/hudi/pull/3793] > > Please set it back to 10. Lets re-run CI for the patch few times to ensure > there are no flakiness. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-2532) Set right default value for max delta commits for compaction in metadata table
[ https://issues.apache.org/jira/browse/HUDI-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-2532: - Status: Closed (was: Patch Available) > Set right default value for max delta commits for compaction in metadata > table > --- > > Key: HUDI-2532 > URL: https://issues.apache.org/jira/browse/HUDI-2532 > Project: Apache Hudi > Issue Type: Improvement >Reporter: sivabalan narayanan >Assignee: Manoj Govindassamy >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > > Set right default value of 10 for max delta commits for compaction in > metadata table. As of now, its set as 24 which is huge. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-2532) Set right default value for max delta commits for compaction in metadata table
[ https://issues.apache.org/jira/browse/HUDI-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HUDI-2532: - Status: Patch Available (was: In Progress) > Set right default value for max delta commits for compaction in metadata > table > --- > > Key: HUDI-2532 > URL: https://issues.apache.org/jira/browse/HUDI-2532 > Project: Apache Hudi > Issue Type: Improvement >Reporter: sivabalan narayanan >Assignee: Manoj Govindassamy >Priority: Major > Labels: pull-request-available > Fix For: 0.10.0 > > > Set right default value of 10 for max delta commits for compaction in > metadata table. As of now, its set as 24 which is huge. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] nsivabalan commented on a change in pull request #3762: [HUDI-1294] Adding inline read and seek based read(batch get) for hfile log blocks in metadata table
nsivabalan commented on a change in pull request #3762:
URL: https://github.com/apache/hudi/pull/3762#discussion_r728518751

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java
@@ -126,23 +130,21 @@ protected BaseTableMetadata(HoodieEngineContext engineContext, HoodieMetadataCon
   }

   @Override
-  public Map getAllFilesInPartitions(List partitionPaths)
+  public Map getAllFilesInPartitions(List partitions)
       throws IOException {
     if (enabled) {
-      Map partitionsFilesMap = new HashMap<>();
-      try {
-        for (String partitionPath : partitionPaths) {
-          partitionsFilesMap.put(partitionPath, fetchAllFilesInPartition(new Path(partitionPath)));
-        }
+      // need to understand why we did not make bulk get before

## Review comment:
From what I infer, HoodieMergedLogRecordScanner first reads all records from all log blocks and builds a hash map of records (record key to HoodieRecord). Prior to this patch we did not do seek-based reads, so we were already reading all log records from all log blocks. That is why I was a bit curious about the missing bulk get.
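To make the contrast in the review comment concrete, here is a minimal sketch of the two lookup strategies it describes. It is a toy model only: `LogLookupSketch`, `fullScanLookup` and `seekLookup` are hypothetical names, a `TreeMap` stands in for a sorted HFile log block, and none of this is the actual Hudi implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of full-scan versus seek-based lookups; not a Hudi API. */
final class LogLookupSketch {

  /** Full scan: merge every record of every block into one hash map, then pick the keys. */
  static Map<String, String> fullScanLookup(List<TreeMap<String, String>> blocks, List<String> keys) {
    Map<String, String> merged = new HashMap<>();
    for (TreeMap<String, String> block : blocks) {
      merged.putAll(block); // records in later blocks overwrite earlier ones
    }
    Map<String, String> result = new HashMap<>();
    for (String key : keys) {
      if (merged.containsKey(key)) {
        result.put(key, merged.get(key));
      }
    }
    return result;
  }

  /** Seek-based batch get: touch only the requested keys in each block. */
  static Map<String, String> seekLookup(List<TreeMap<String, String>> blocks, List<String> keys) {
    Map<String, String> result = new HashMap<>();
    for (TreeMap<String, String> block : blocks) {
      for (String key : keys) {
        String value = block.get(key); // point lookup stands in for a seek in a sorted block
        if (value != null) {
          result.put(key, value); // later blocks still win
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    TreeMap<String, String> older = new TreeMap<>();
    older.put("2021/10/13", "files-v1");
    older.put("2021/10/14", "files-v1");
    TreeMap<String, String> newer = new TreeMap<>();
    newer.put("2021/10/14", "files-v2");

    List<TreeMap<String, String>> blocks = Arrays.asList(older, newer);
    List<String> keys = Arrays.asList("2021/10/14", "2021/10/15");

    // Both strategies should agree on the answer; they differ in how many records they read.
    System.out.println(fullScanLookup(blocks, keys).equals(seekLookup(blocks, keys)));
  }
}
```

In this model both strategies return the same result; the point of a batch get is that the seek-based path only visits the requested keys, whereas the full scan materializes every record first.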
[GitHub] [hudi] hudi-bot edited a comment on pull request #3794: [HUDI-2532] Metadata table compaction trigger max delta commits default config (re-enable)
hudi-bot edited a comment on pull request #3794: URL: https://github.com/apache/hudi/pull/3794#issuecomment-942793927 ## CI report: * f4b16e728f180c9fc4655ae052bb89b2f6a1ff8b Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2630) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] hudi-bot commented on pull request #3794: [HUDI-2532] Metadata table compaction trigger max delta commits default config (re-enable)
hudi-bot commented on pull request #3794: URL: https://github.com/apache/hudi/pull/3794#issuecomment-942793927 ## CI report: * f4b16e728f180c9fc4655ae052bb89b2f6a1ff8b UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] nsivabalan commented on pull request #3762: [HUDI-1294] Adding inline read and seek based read(batch get) for hfile log blocks in metadata table
nsivabalan commented on pull request #3762: URL: https://github.com/apache/hudi/pull/3762#issuecomment-942793206 @hudi-bot azure run -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] manojpec opened a new pull request #3794: [HUDI-2532] Metadata table compaction trigger max delta commits default config (re-enable)
manojpec opened a new pull request #3794:
URL: https://github.com/apache/hudi/pull/3794

## What is the purpose of the pull request

Set the default of the max delta commits config to 10 (previously 24) so that compaction in the metadata table is triggered sooner than before. The previous change for this, https://github.com/apache/hudi/pull/3784, is suspected of breaking CI, so this PR redoes the change to let CI catch any flakiness.

## Brief change log

* Updated the default config value in HoodieMetadataConfig.java

## Verify this pull request

## Committer checklist

- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
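As a rough illustration of what lowering this default means, the sketch below models the trigger with plain `java.util.Properties`. It rests on assumptions that should be checked against HoodieMetadataConfig.java: that the property key is `hoodie.metadata.compact.max.delta.commits`, and that compaction fires once the number of delta commits since the last compaction reaches the threshold. `MetadataCompactionSketch` and its methods are hypothetical names, not Hudi classes.

```java
import java.util.Properties;

/** Toy model of the compaction trigger threshold; not a Hudi API. */
final class MetadataCompactionSketch {

  // Assumed property key; verify against HoodieMetadataConfig.java before relying on it.
  static final String MAX_DELTA_COMMITS_KEY = "hoodie.metadata.compact.max.delta.commits";

  // Default proposed by the pull request (the previous default was 24).
  static final int DEFAULT_MAX_DELTA_COMMITS = 10;

  /** Reads the threshold, falling back to the default when the key is not set. */
  static int maxDeltaCommits(Properties props) {
    return Integer.parseInt(
        props.getProperty(MAX_DELTA_COMMITS_KEY, String.valueOf(DEFAULT_MAX_DELTA_COMMITS)));
  }

  /** Assumed rule: compact once enough delta commits have accumulated. */
  static boolean shouldCompact(Properties props, int deltaCommitsSinceLastCompaction) {
    return deltaCommitsSinceLastCompaction >= maxDeltaCommits(props);
  }

  public static void main(String[] args) {
    Properties newDefault = new Properties();
    Properties oldDefault = new Properties();
    oldDefault.setProperty(MAX_DELTA_COMMITS_KEY, "24");

    // After 10 delta commits, only the lower threshold triggers compaction.
    System.out.println(shouldCompact(newDefault, 10));
    System.out.println(shouldCompact(oldDefault, 10));
  }
}
```

A writer that prefers the earlier behavior could presumably set the same key back to 24 in its own configuration rather than relying on the default.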
[GitHub] [hudi] hudi-bot edited a comment on pull request #3741: [HUDI-2501] Add HoodieData abstraction and refactor compaction actions in hudi-client module
hudi-bot edited a comment on pull request #3741: URL: https://github.com/apache/hudi/pull/3741#issuecomment-931660346 ## CI report: * 3b08956e6b53ba25be60a659ac6d28d147d9a77b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2625) * 333c80ea94b4ed248108d68357e5729bd6613104 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2629) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] hudi-bot edited a comment on pull request #3741: [HUDI-2501] Add HoodieData abstraction and refactor compaction actions in hudi-client module
hudi-bot edited a comment on pull request #3741: URL: https://github.com/apache/hudi/pull/3741#issuecomment-931660346 ## CI report: * 3b08956e6b53ba25be60a659ac6d28d147d9a77b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2625) * 333c80ea94b4ed248108d68357e5729bd6613104 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (HUDI-2555) Fix flaky FlinkCompaction integration test
[ https://issues.apache.org/jira/browse/HUDI-2555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2555: -- Description: Recently CI was broken and had to revert some suspicious tests. [https://github.com/apache/hudi/pull/3793] We need to fix it and re-enable them back. [ITTestHoodieFlinkCompactor.java|https://github.com/apache/hudi/pull/3793/files#diff-f15b4ec18c40c9494e62ae73aa4b79beeafd1a5fa185b6ec6a7044fa6ed9e1fd] testHoodieFlinkCompactor was: Recently CI was broken and had to revert some suspicious tests. We need to fix it and re-enable them back. [ITTestHoodieFlinkCompactor.java|https://github.com/apache/hudi/pull/3793/files#diff-f15b4ec18c40c9494e62ae73aa4b79beeafd1a5fa185b6ec6a7044fa6ed9e1fd] testHoodieFlinkCompactor > Fix flaky FlinkCompaction integration test > -- > > Key: HUDI-2555 > URL: https://issues.apache.org/jira/browse/HUDI-2555 > Project: Apache Hudi > Issue Type: Improvement >Reporter: sivabalan narayanan >Assignee: Danny Chen >Priority: Major > > Recently CI was broken and had to revert some suspicious tests. > [https://github.com/apache/hudi/pull/3793] > We need to fix it and re-enable them back. > [ITTestHoodieFlinkCompactor.java|https://github.com/apache/hudi/pull/3793/files#diff-f15b4ec18c40c9494e62ae73aa4b79beeafd1a5fa185b6ec6a7044fa6ed9e1fd] > > testHoodieFlinkCompactor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HUDI-2553) Re-enable max delta commits for metadata table to 10
[ https://issues.apache.org/jira/browse/HUDI-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-2553: -- Description: our CI was broken recently. hence reverted couple of tests and the default value for max delta commits for metadata table. [https://github.com/apache/hudi/pull/3793] Please set it back to 10. Lets re-run CI for the patch few times to ensure there are no flakiness. was: our CI was broken recently. hence reverted couple of tests and the default value for max delta commits for metadata table. Please set it back to 10. Lets re-run CI for the patch few times to ensure there are no flakiness. > Re-enable max delta commits for metadata table to 10 > > > Key: HUDI-2553 > URL: https://issues.apache.org/jira/browse/HUDI-2553 > Project: Apache Hudi > Issue Type: Improvement >Reporter: sivabalan narayanan >Assignee: Manoj Govindassamy >Priority: Major > > our CI was broken recently. hence reverted couple of tests and the default > value for max delta commits for metadata table. > [https://github.com/apache/hudi/pull/3793] > > Please set it back to 10. Lets re-run CI for the patch few times to ensure > there are no flakiness. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HUDI-2555) Fix flaky FlinkCompaction integration test
sivabalan narayanan created HUDI-2555: - Summary: Fix flaky FlinkCompaction integration test Key: HUDI-2555 URL: https://issues.apache.org/jira/browse/HUDI-2555 Project: Apache Hudi Issue Type: Improvement Reporter: sivabalan narayanan Recently CI was broken and we had to revert some suspect tests. We need to fix them and re-enable them. [ITTestHoodieFlinkCompactor.java|https://github.com/apache/hudi/pull/3793/files#diff-f15b4ec18c40c9494e62ae73aa4b79beeafd1a5fa185b6ec6a7044fa6ed9e1fd] testHoodieFlinkCompactor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HUDI-2555) Fix flaky FlinkCompaction integration test
[ https://issues.apache.org/jira/browse/HUDI-2555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-2555: - Assignee: Danny Chen > Fix flaky FlinkCompaction integration test > -- > > Key: HUDI-2555 > URL: https://issues.apache.org/jira/browse/HUDI-2555 > Project: Apache Hudi > Issue Type: Improvement >Reporter: sivabalan narayanan >Assignee: Danny Chen >Priority: Major > > Recently CI was broken and had to revert some suspicious tests. > We need to fix it and re-enable them back. > [ITTestHoodieFlinkCompactor.java|https://github.com/apache/hudi/pull/3793/files#diff-f15b4ec18c40c9494e62ae73aa4b79beeafd1a5fa185b6ec6a7044fa6ed9e1fd] > > testHoodieFlinkCompactor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HUDI-2553) Re-enable max delta commits for metadata table to 10
[ https://issues.apache.org/jira/browse/HUDI-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-2553: - Assignee: Manoj Govindassamy > Re-enable max delta commits for metadata table to 10 > > > Key: HUDI-2553 > URL: https://issues.apache.org/jira/browse/HUDI-2553 > Project: Apache Hudi > Issue Type: Improvement >Reporter: sivabalan narayanan >Assignee: Manoj Govindassamy >Priority: Major > > our CI was broken recently. hence reverted couple of tests and the default > value for max delta commits for metadata table. > Please set it back to 10. Lets re-run CI for the patch few times to ensure > there are no flakiness. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HUDI-2554) Fix some flaky metadata tests
sivabalan narayanan created HUDI-2554: - Summary: Fix some flaky metadata tests Key: HUDI-2554 URL: https://issues.apache.org/jira/browse/HUDI-2554 Project: Apache Hudi Issue Type: Improvement Reporter: sivabalan narayanan Recently CI was broken and we had to disable a few tests. [https://github.com/apache/hudi/pull/3793/files] TestHoodieBackedMetadata testRollbackOperations testErrorCases -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HUDI-2554) Fix some flaky metadata tests
[ https://issues.apache.org/jira/browse/HUDI-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-2554: - Assignee: sivabalan narayanan > Fix some flaky metadata tests > - > > Key: HUDI-2554 > URL: https://issues.apache.org/jira/browse/HUDI-2554 > Project: Apache Hudi > Issue Type: Improvement >Reporter: sivabalan narayanan >Assignee: sivabalan narayanan >Priority: Major > > recently CI was broken and had to disable few tests. > [https://github.com/apache/hudi/pull/3793/files] > TestHoodieBackedMetadata > testRollbackOperations > testErrorCases -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HUDI-2553) Re-enable max delta commits for metadata table to 10
sivabalan narayanan created HUDI-2553: - Summary: Re-enable max delta commits for metadata table to 10 Key: HUDI-2553 URL: https://issues.apache.org/jira/browse/HUDI-2553 Project: Apache Hudi Issue Type: Improvement Reporter: sivabalan narayanan Our CI was broken recently, so we reverted a couple of tests and the default value of max delta commits for the metadata table. Please set it back to 10. Let's re-run CI for the patch a few times to ensure there is no flakiness. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[hudi] branch master updated (e6711b1 -> cff384d)
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from e6711b1 [HUDI-2435][BUG]Fix clustering handle errors (#3666) add cff384d [HUDI-2552] Fixing some test failures to unblock broken CI master (#3793) No new revisions were added by this update. Summary of changes: .../src/main/java/org/apache/hudi/table/HoodieTable.java | 3 +-- .../apache/hudi/client/functional/TestHoodieBackedMetadata.java | 9 ++--- .../java/org/apache/hudi/common/config/HoodieMetadataConfig.java | 2 +- .../org/apache/hudi/sink/compact/ITTestHoodieFlinkCompactor.java | 8 4 files changed, 12 insertions(+), 10 deletions(-)
[GitHub] [hudi] nsivabalan merged pull request #3793: [HUDI-2552] Fixing some test failures to unblock broken CI master
nsivabalan merged pull request #3793: URL: https://github.com/apache/hudi/pull/3793 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] hudi-bot edited a comment on pull request #3793: [HUDI-2552] Fixing metadata validation causing test failures
hudi-bot edited a comment on pull request #3793: URL: https://github.com/apache/hudi/pull/3793#issuecomment-942340315 ## CI report: * 5b9557062c7872f1a49f2261037425dc9b2c0185 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2627) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HUDI-2549) Exceptions when using second writer into Hudi table managed by DeltaStreamer
[ https://issues.apache.org/jira/browse/HUDI-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428499#comment-17428499 ]

Dave Hagman commented on HUDI-2549:
---
While continuing to test, I found that the _*FileAlreadyExistsException*_ can occur on both the deltastreamer and the secondary writers (spark datasource writers in my tests). On my latest run the spark datasource writer created a commit "ahead" of the deltastreamer. This resulted in the deltastreamer failing with the same error as before:

Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: File already exists:s3://

This also caused a more insidious issue: the deltastreamer checkpoint state is now missing from recent commits, and therefore it is unable to start.

[~shivnarayan] [~vinoth] Can you confirm that you are able to reproduce this issue? I remember seeing that you have run this exact configuration without issue before. If that is the case, then I am quite confused as to why it would not work for me on a brand new table.

> Exceptions when using second writer into Hudi table managed by DeltaStreamer
>
> Key: HUDI-2549
> URL: https://issues.apache.org/jira/browse/HUDI-2549
> Project: Apache Hudi
> Issue Type: Bug
> Components: DeltaStreamer, Spark Integration, Writer Core
> Reporter: Dave Hagman
> Assignee: Dave Hagman
> Priority: Critical
> Labels: multi-writer, sev:critical
> Fix For: 0.10.0
>
> When running the DeltaStreamer along with a second spark datasource writer (with [ZK-based OCC enabled|https://hudi.apache.org/docs/concurrency_control#enabling-multi-writing]), we receive the following exception (which halts the spark datasource writer).
This occurs following warnings of timeline inconsistencies: > > {code:java} > 21/10/07 17:10:05 INFO TransactionManager: Transaction ending with > transaction owner Option{val=[==>20211007170717__commit__INFLIGHT]} > 21/10/07 17:10:05 INFO ZookeeperBasedLockProvider: RELEASING lock > atZkBasePath = /events/test/mwc/v1, lock key = events_mwc_test_v1 > 21/10/07 17:10:05 INFO ZookeeperBasedLockProvider: RELEASED lock atZkBasePath > = /events/test/mwc/v1, lock key = events_mwc_test_v1 > 21/10/07 17:10:05 INFO TransactionManager: Transaction ended > Exception in thread "main" java.lang.IllegalArgumentException > at > org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:31) > at > org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:414) > at > org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:395) > at > org.apache.hudi.common.table.timeline.HoodieActiveTimeline.saveAsComplete(HoodieActiveTimeline.java:153) > at > org.apache.hudi.client.AbstractHoodieWriteClient.commit(AbstractHoodieWriteClient.java:218) > at > org.apache.hudi.client.AbstractHoodieWriteClient.commitStats(AbstractHoodieWriteClient.java:190) > at > org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:124) > at > org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:617) > at > org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:274) > at > org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164) > at > org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) > at > 
org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90) > at > org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:185) > at > org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:223) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:220) > at > org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:181) > at > org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:134) > at > org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:133) > at > org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989) > at > org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107) >
[jira] [Reopened] (HUDI-270) [UMBRELLA] Improve Hudi website UI and documentation
[ https://issues.apache.org/jira/browse/HUDI-270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reopened HUDI-270: - Assignee: Kyle Weller (was: Bhavani Sudha Saktheeswaran) > [UMBRELLA] Improve Hudi website UI and documentation > > > Key: HUDI-270 > URL: https://issues.apache.org/jira/browse/HUDI-270 > Project: Apache Hudi > Issue Type: Task > Components: Docs >Reporter: Bhavani Sudha Saktheeswaran >Assignee: Kyle Weller >Priority: Minor > Labels: hudi-umbrellas > > This is an umbrella task of multiple tasks that aim to improve the website -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HUDI-1958) [Umbrella] Follow up items from 1 pass over GH issues
[ https://issues.apache.org/jira/browse/HUDI-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-1958: Assignee: Kyle Weller (was: Vinoth Chandar) > [Umbrella] Follow up items from 1 pass over GH issues > - > > Key: HUDI-1958 > URL: https://issues.apache.org/jira/browse/HUDI-1958 > Project: Apache Hudi > Issue Type: Improvement > Components: Docs >Reporter: Nishith Agarwal >Assignee: Kyle Weller >Priority: Blocker > Labels: Docs, hudi-umbrellas, release-blocker > Fix For: 0.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3793: [HUDI-2552] Fixing metadata validation causing test failures
hudi-bot edited a comment on pull request #3793: URL: https://github.com/apache/hudi/pull/3793#issuecomment-942340315 ## CI report: * 314d2f3212816795351a9961382b84630ed1069a Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2626) * 5b9557062c7872f1a49f2261037425dc9b2c0185 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2627) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis` re-run the last Travis build - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] fengjian428 commented on issue #3755: [Delta Streamer] file name mismatch with meta when compaction running
fengjian428 commented on issue #3755: URL: https://github.com/apache/hudi/issues/3755#issuecomment-942709709

@guanziyue Do you know what this error means?

`21/10/14 04:36:22 ERROR RequestHandler: Got runtime exception servicing request partition=TH%2F2021-01&maxinstant=20211014043535&basepath=hdfs%3A%2F%2Ftl5%2Fprojects%2Fdata_vite%2Fmysql_ingestion%2Frti_vite%2Fshopee_item_v4_db__item_v4_tab_new6&lastinstantts=20211014043614&timelinehash=5d50a0189abbb1e122f7a838ac389bb21ae27ef6db6428821c908be8f566e032
java.lang.IllegalArgumentException: Last known instant from client was 20211014043614 but server has the following timeline [[20211014042315__deltacommit__COMPLETED], [20211014042356__deltacommit__COMPLETED], [20211014042430__deltacommit__COMPLETED], [20211014042509__deltacommit__COMPLETED], [20211014042534__deltacommit__COMPLETED], [20211014042558__commit__COMPLETED], [20211014042607__deltacommit__COMPLETED], [20211014042648__deltacommit__COMPLETED], [20211014042713__deltacommit__COMPLETED], [20211014042736__deltacommit__COMPLETED], [20211014042758__deltacommit__COMPLETED], [20211014042820__commit__COMPLETED], [20211014042824__deltacommit__COMPLETED], [20211014042905__clean__COMPLETED], [20211014042918__deltacommit__COMPLETED], [20211014042937__clean__COMPLETED], [20211014042948__deltacommit__COMPLETED], [20211014043012__clean__COMPLETED], [20211014043022__deltacommit__COMPLETED], [20211014043047__clean__COMPLETED], [20211014043056__deltacommit__COMPLETED], [20211014043115__clean__COMPLETED], [20211014043124__commit__COMPLETED], [20211014043127__deltacommit__COMPLETED], [20211014043145__clean__COMPLETED], [20211014043313__deltacommit__COMPLETED], [20211014043351__clean__COMPLETED], [20211014043419__deltacommit__COMPLETED], [20211014043443__clean__COMPLETED], [20211014043454__deltacommit__COMPLETED], [20211014043525__clean__COMPLETED], [20211014043535__deltacommit__COMPLETED], [20211014043605__clean__COMPLETED], [20211014043614__commit__COMPLETED]]
    at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:40)
    at org.apache.hudi.timeline.service.RequestHandler$ViewHandler.handle(RequestHandler.java:510)
    at io.javalin.security.SecurityUtil.noopAccessManager(SecurityUtil.kt:22)
    at io.javalin.Javalin.lambda$addHandler$0(Javalin.java:606)
    at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:46)
    at io.javalin.core.JavalinServlet$service$2$1.invoke(JavalinServlet.kt:17)
    at io.javalin.core.JavalinServlet$service$1.invoke(JavalinServlet.kt:143)
    at io.javalin.core.JavalinServlet$service$2.invoke(JavalinServlet.kt:41)
    at io.javalin.core.JavalinServlet.service(JavalinServlet.kt:107)
    at io.javalin.core.util.JettyServerUtil$initialize$httpHandler$1.doHandle(JettyServerUtil.kt:72)
    at org.apache.hudi.org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
    at org.apache.hudi.org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
    at org.apache.hudi.org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1668)
    at org.apache.hudi.org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
    at org.apache.hudi.org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
    at org.apache.hudi.org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
    at org.apache.hudi.org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:61)
    at org.apache.hudi.org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:174)
    at org.apache.hudi.org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
    at org.apache.hudi.org.eclipse.jetty.server.Server.handle(Server.java:502)
    at org.apache.hudi.org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
    at org.apache.hudi.org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
    at org.apache.hudi.org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
    at org.apache.hudi.org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
    at org.apache.hudi.org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
    at org.apache.hudi.org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
    at org.apache.hudi.org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
    at org.apache.hudi.org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
    at org.apache.hudi.org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)`
[GitHub] [hudi] hudi-bot edited a comment on pull request #3793: [HUDI-2552] Fixing metadata validation causing test failures
hudi-bot edited a comment on pull request #3793: URL: https://github.com/apache/hudi/pull/3793#issuecomment-942340315 ## CI report: * 314d2f3212816795351a9961382b84630ed1069a Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2626) * 5b9557062c7872f1a49f2261037425dc9b2c0185 UNKNOWN
[GitHub] [hudi] hudi-bot edited a comment on pull request #3793: [HUDI-2552] Fixing metadata validation causing test failures
hudi-bot edited a comment on pull request #3793: URL: https://github.com/apache/hudi/pull/3793#issuecomment-942340315 ## CI report: * c9019d52d97deeec182234c18f87625537bf602c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2624) * 314d2f3212816795351a9961382b84630ed1069a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2626) * 5b9557062c7872f1a49f2261037425dc9b2c0185 UNKNOWN
[GitHub] [hudi] nsivabalan commented on pull request #3793: [HUDI-2552] Fixing metadata validation causing test failures
nsivabalan commented on pull request #3793: URL: https://github.com/apache/hudi/pull/3793#issuecomment-942679228 @hudi-bot azure run
[GitHub] [hudi] hudi-bot edited a comment on pull request #3793: [HUDI-2552] Fixing metadata validation causing test failures
hudi-bot edited a comment on pull request #3793: URL: https://github.com/apache/hudi/pull/3793#issuecomment-942340315 ## CI report: * c9019d52d97deeec182234c18f87625537bf602c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2624) * 314d2f3212816795351a9961382b84630ed1069a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2626)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3793: [HUDI-2552] Fixing metadata validation causing test failures
hudi-bot edited a comment on pull request #3793: URL: https://github.com/apache/hudi/pull/3793#issuecomment-942340315 ## CI report: * c9019d52d97deeec182234c18f87625537bf602c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2624) * 314d2f3212816795351a9961382b84630ed1069a UNKNOWN
[GitHub] [hudi] hudi-bot edited a comment on pull request #3793: [HUDI-2552] Fixing metadata validation causing test failures
hudi-bot edited a comment on pull request #3793: URL: https://github.com/apache/hudi/pull/3793#issuecomment-942340315 ## CI report: * c9019d52d97deeec182234c18f87625537bf602c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2624)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3741: [HUDI-2501] Add HoodieData abstraction and refactor compaction actions in hudi-client module
hudi-bot edited a comment on pull request #3741: URL: https://github.com/apache/hudi/pull/3741#issuecomment-931660346 ## CI report: * 3b08956e6b53ba25be60a659ac6d28d147d9a77b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=2625)
[GitHub] [hudi] yihua commented on a change in pull request #3741: [HUDI-2501] Add HoodieData abstraction and refactor compaction actions in hudi-client module
yihua commented on a change in pull request #3741: URL: https://github.com/apache/hudi/pull/3741#discussion_r728307580 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/HoodieCompactor.java ## @@ -18,39 +18,277 @@ package org.apache.hudi.table.action.compact; +import org.apache.hudi.avro.model.HoodieCompactionOperation; import org.apache.hudi.avro.model.HoodieCompactionPlan; +import org.apache.hudi.client.AbstractHoodieWriteClient; +import org.apache.hudi.client.WriteStatus; +import org.apache.hudi.common.data.HoodieAccumulator; +import org.apache.hudi.common.data.HoodieData; import org.apache.hudi.common.engine.HoodieEngineContext; +import org.apache.hudi.common.engine.TaskContextSupplier; +import org.apache.hudi.common.fs.FSUtils; +import org.apache.hudi.common.model.CompactionOperation; +import org.apache.hudi.common.model.HoodieBaseFile; import org.apache.hudi.common.model.HoodieFileGroupId; +import org.apache.hudi.common.model.HoodieLogFile; import org.apache.hudi.common.model.HoodieRecordPayload; +import org.apache.hudi.common.model.HoodieTableType; +import org.apache.hudi.common.model.HoodieWriteStat.RuntimeStats; +import org.apache.hudi.common.table.HoodieTableMetaClient; +import org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner; +import org.apache.hudi.common.table.timeline.HoodieActiveTimeline; +import org.apache.hudi.common.table.timeline.HoodieInstant; +import org.apache.hudi.common.table.timeline.HoodieTimeline; +import org.apache.hudi.common.table.view.TableFileSystemView.SliceView; +import org.apache.hudi.common.util.CollectionUtils; +import org.apache.hudi.common.util.CompactionUtils; +import org.apache.hudi.common.util.Option; +import org.apache.hudi.common.util.ValidationUtils; +import org.apache.hudi.common.util.collection.Pair; import org.apache.hudi.config.HoodieWriteConfig; +import org.apache.hudi.io.IOUtils; +import org.apache.hudi.table.HoodieCopyOnWriteTableOperation; import 
org.apache.hudi.table.HoodieTable; +import org.apache.hudi.table.action.compact.strategy.CompactionStrategy; + +import org.apache.avro.Schema; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.log4j.LogManager; +import org.apache.log4j.Logger; import java.io.IOException; import java.io.Serializable; +import java.util.ArrayList; +import java.util.Collection; +import java.util.Iterator; +import java.util.List; import java.util.Set; +import java.util.stream.StreamSupport; + +import static java.util.stream.Collectors.toList; /** * A HoodieCompactor runs compaction on a hoodie table. */ -public interface HoodieCompactor extends Serializable { +public abstract class HoodieCompactor implements Serializable { + + private static final Logger LOG = LogManager.getLogger(HoodieCompactor.class); /** - * Generate a new compaction plan for scheduling. + * @param config Write config. + * @return the reader schema for {@link HoodieMergedLogRecordScanner}. + */ + public abstract Schema getReaderSchema(HoodieWriteConfig config); + + /** + * Updates the reader schema for actual compaction operations. * - * @param context HoodieEngineContext - * @param hoodieTable Hoodie Table - * @param config Hoodie Write Configuration - * @param compactionCommitTime scheduled compaction commit time - * @param fgIdsInPendingCompactions partition-fileId pairs for which compaction is pending - * @return Compaction Plan - * @throws IOException when encountering errors + * @param config Write config. + * @param metaClient {@link HoodieTableMetaClient} instance to use. */ - HoodieCompactionPlan generateCompactionPlan(HoodieEngineContext context, HoodieTable hoodieTable, HoodieWriteConfig config, - String compactionCommitTime, Set fgIdsInPendingCompactions) throws IOException; + public abstract void updateReaderSchema(HoodieWriteConfig config, HoodieTableMetaClient metaClient); + + /** + * Handles the compaction timeline based on the compaction instant. 
+ * + * @param table {@link HoodieTable} instance to use. + * @param pendingCompactionTimeline pending compaction timeline. + * @param compactionInstantTime compaction instant + * @param writeClient Write client. + */ + public abstract void handleCompactionTimeline( + HoodieTable table, HoodieTimeline pendingCompactionTimeline, Review comment: Fixed.
[GitHub] [hudi] nsivabalan commented on a change in pull request #3762: [HUDI-1294] Adding inline read and seek based read(batch get) for hfile log blocks in metadata table
nsivabalan commented on a change in pull request #3762: URL: https://github.com/apache/hudi/pull/3762#discussion_r728244437

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java
@@ -126,23 +130,21 @@ protected BaseTableMetadata(HoodieEngineContext engineContext, HoodieMetadataCon
   }

   @Override
-  public Map getAllFilesInPartitions(List partitionPaths)
+  public Map getAllFilesInPartitions(List partitions)
       throws IOException {
     if (enabled) {
-      Map partitionsFilesMap = new HashMap<>();
-      try {
-        for (String partitionPath : partitionPaths) {
-          partitionsFilesMap.put(partitionPath, fetchAllFilesInPartition(new Path(partitionPath)));
-        }
+        // need to understand why we did not make bulk get before

Review comment: @prashantwason @satishkotha: Do you know why we fetched one key at a time here instead of doing a batch get? Is there a particular reason for it? I have changed it to use a batch get in this patch.
[GitHub] [hudi] prashantwason commented on a change in pull request #3762: [HUDI-1294] Adding inline read and seek based read(batch get) for hfile log blocks in metadata table
prashantwason commented on a change in pull request #3762: URL: https://github.com/apache/hudi/pull/3762#discussion_r728264572

## File path: hudi-common/src/main/java/org/apache/hudi/metadata/BaseTableMetadata.java
@@ -126,23 +130,21 @@ protected BaseTableMetadata(HoodieEngineContext engineContext, HoodieMetadataCon
   }

   @Override
-  public Map getAllFilesInPartitions(List partitionPaths)
+  public Map getAllFilesInPartitions(List partitions)
       throws IOException {
     if (enabled) {
-      Map partitionsFilesMap = new HashMap<>();
-      try {
-        for (String partitionPath : partitionPaths) {
-          partitionsFilesMap.put(partitionPath, fetchAllFilesInPartition(new Path(partitionPath)));
-        }
+        // need to understand why we did not make bulk get before

Review comment: For simplicity of implementation, I suppose; performance was not taken into consideration. Also, given the number of keys being fetched, a batch get would be slower, as it may need to read the entire HFile. @umehrot2 Thoughts?
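The trade-off raised in this review thread (one lookup per key versus one batch get) can be illustrated with a small sketch. This is not Hudi's or HBase's code: `SortedStore`, `pointGet`, `batchGet`, and `demoStore` are hypothetical names, a `TreeMap` stands in for a sorted file such as an HFile, and the batch path is assumed to be a single forward scan from the smallest requested key. Real readers may seek between keys instead of scanning, so the counters only show the shape of the argument.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BatchGetSketch {

    /** Hypothetical stand-in for a sorted key-value file; not a Hudi or HBase class. */
    static final class SortedStore {
        private final TreeMap<String, String> entries = new TreeMap<>();
        int seeks = 0;    // independent lookups issued against the store
        int scanned = 0;  // entries touched by sequential scans

        void put(String key, String value) {
            entries.put(key, value);
        }

        /** One seek per requested key. */
        String pointGet(String key) {
            seeks++;
            return entries.get(key);
        }

        /** Assumed batch strategy: seek once, then scan forward past every requested key. */
        Map<String, String> batchGet(List<String> keys) {
            Map<String, String> result = new LinkedHashMap<>();
            if (keys.isEmpty()) {
                return result;
            }
            List<String> sorted = new ArrayList<>(keys);
            Collections.sort(sorted);
            seeks++;
            int next = 0; // index of the next requested key still to be resolved
            for (Map.Entry<String, String> e : entries.tailMap(sorted.get(0), true).entrySet()) {
                if (next >= sorted.size()) {
                    break; // every requested key has been resolved or skipped
                }
                scanned++;
                // Skip requested keys that sort before the current entry (absent or duplicate).
                while (next < sorted.size() && sorted.get(next).compareTo(e.getKey()) < 0) {
                    next++;
                }
                if (next < sorted.size() && sorted.get(next).equals(e.getKey())) {
                    result.put(e.getKey(), e.getValue());
                    next++;
                }
            }
            return result;
        }
    }

    /** Builds a store with 100 illustrative partition keys, p00 through p99. */
    static SortedStore demoStore() {
        SortedStore store = new SortedStore();
        for (int i = 0; i < 100; i++) {
            store.put(String.format("p%02d", i), "files-of-p" + i);
        }
        return store;
    }

    public static void main(String[] args) {
        List<String> wanted = Arrays.asList("p10", "p50", "p90");

        SortedStore perKey = demoStore();
        for (String key : wanted) {
            perKey.pointGet(key);
        }
        System.out.println("per-key: seeks=" + perKey.seeks + " scanned=" + perKey.scanned);

        SortedStore batch = demoStore();
        batch.batchGet(wanted);
        System.out.println("batch:   seeks=" + batch.seeks + " scanned=" + batch.scanned);
    }
}
```

With a few widely spaced keys, the batch path in this sketch issues fewer seeks but touches most of the entries between the first and last key, which is the cost prashantwason describes. With many keys that sit close together, the same scan amortizes well, which is the case a batch get targets.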
[GitHub] [hudi] hudi-bot edited a comment on pull request #3741: [HUDI-2501] Add HoodieData abstraction and refactor compaction actions in hudi-client module
hudi-bot edited a comment on pull request #3741: URL: https://github.com/apache/hudi/pull/3741#issuecomment-931660346
[GitHub] [hudi] nsivabalan commented on pull request #3793: [HUDI-2552] Fixing metadata validation causing test failures
nsivabalan commented on pull request #3793: URL: https://github.com/apache/hudi/pull/3793#issuecomment-942595144 @hudi-bot azure run