Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1925152973

## CI report:

* 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
* 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
* a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
* a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
* b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
* d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
* e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
* f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
* 094f6fe1e22df51003ffd842b2b71eb51ba03650 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22304)

Bot commands:

@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (HUDI-7377) Fix flaky test: ITTestHoodieDataSource.testWriteAndReadWithDataSkipping
[ https://issues.apache.org/jira/browse/HUDI-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lin Liu updated HUDI-7377:
--------------------------
    Description: 
Error: ITTestHoodieDataSource.testWriteAndReadWithDataSkipping:1603
[6454|https://github.com/apache/hudi/actions/runs/7762799770/job/21173863141?pr=10512#step:6:6455]Expected: is "[+I[id1, Danny, 23, 1970-01-01T00:00:01, par1], +I[id2, Stephen, 33, 1970-01-01T00:00:02, par1], +I[id3, Julian, 53, 1970-01-01T00:00:03, par2], +I[id4, Fabian, 31, 1970-01-01T00:00:04, par2], +I[id5, Sophia, 18, 1970-01-01T00:00:05, par3], +I[id6, Emma, 20, 1970-01-01T00:00:06, par3], +I[id7, Bob, 44, 1970-01-01T00:00:07, par4], +I[id8, Han, 56, 1970-01-01T00:00:08, par4]]"
[6455|https://github.com/apache/hudi/actions/runs/7762799770/job/21173863141?pr=10512#step:6:6456] but: was "[]"
[6456|https://github.com/apache/hudi/actions/runs/7762799770/job/21173863141?pr=10512#step:6:6457]Error: ITTestHoodieDataSource.testWriteAndReadWithDataSkipping:1603
[6457|https://github.com/apache/hudi/actions/runs/7762799770/job/21173863141?pr=10512#step:6:6458]Expected: is "[+I[id1, Danny, 23, 1970-01-01T00:00:01, par1], +I[id2, Stephen, 33, 1970-01-01T00:00:02, par1], +I[id3, Julian, 53, 1970-01-01T00:00:03, par2], +I[id4, Fabian, 31, 1970-01-01T00:00:04, par2], +I[id5, Sophia, 18, 1970-01-01T00:00:05, par3], +I[id6, Emma, 20, 1970-01-01T00:00:06, par3], +I[id7, Bob, 44, 1970-01-01T00:00:07, par4], +I[id8, Han, 56, 1970-01-01T00:00:08, par4]]"
[6458|https://github.com/apache/hudi/actions/runs/7762799770/job/21173863141?pr=10512#step:6:6459] but: was "[]"

https://github.com/apache/hudi/actions/runs/7764079844/job/21177035632?pr=10512

    was: (the same error log, without the link to the second workflow run)

> Fix flaky test: ITTestHoodieDataSource.testWriteAndReadWithDataSkipping
> ------------------------------------------------------------------------
>
>                 Key: HUDI-7377
>                 URL: https://issues.apache.org/jira/browse/HUDI-7377
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Lin Liu
>            Assignee: Lin Liu
>            Priority: Major
(hudi) branch master updated: [Hudi-6902] Fix the timestamp format in hive test (#10610)
This is an automated email from the ASF dual-hosted git repository.

sivabalan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git

The following commit(s) were added to refs/heads/master by this push:
     new 0ac2051ed09 [Hudi-6902] Fix the timestamp format in hive test (#10610)
0ac2051ed09 is described below

commit 0ac2051ed093ea9ab6d7244e0bd44482a287a9f7
Author: Lin Liu <141371752+linliu-c...@users.noreply.github.com>
AuthorDate: Fri Feb 2 20:37:41 2024 -0800

    [Hudi-6902] Fix the timestamp format in hive test (#10610)
---
 .../java/org/apache/hudi/hadoop/TestHoodieParquetInputFormat.java | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/hudi-hadoop-mr/src/test/java/org/apache/hudi/hadoop/TestHoodieParquetInputFormat.java b/hudi-hadoop-mr/src/test/java/org/apache/hudi/hadoop/TestHoodieParquetInputFormat.java
index 37d625a599f..5bf0a255eb4 100644
--- a/hudi-hadoop-mr/src/test/java/org/apache/hudi/hadoop/TestHoodieParquetInputFormat.java
+++ b/hudi-hadoop-mr/src/test/java/org/apache/hudi/hadoop/TestHoodieParquetInputFormat.java
@@ -47,6 +47,7 @@ import org.apache.avro.generic.GenericData;
 import org.apache.hadoop.fs.FileStatus;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.hive.ql.io.IOConstants;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
 import org.apache.hadoop.io.ArrayWritable;
 import org.apache.hadoop.io.LongWritable;
 import org.apache.hadoop.io.NullWritable;
@@ -819,9 +820,9 @@ public class TestHoodieParquetInputFormat {
     } else {
       Date date = new Date();
       date.setTime(testTimestampLong);
-      assertEquals(
-          new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS").format(date),
-          String.valueOf(writable.get()[0]));
+      Timestamp actualTime = ((TimestampWritable) writable.get()[0]).getTimestamp();
+      SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
+      assertEquals(dateFormat.format(date), dateFormat.format(actualTime));
     }
     // test long
     assertEquals(testTimestampLong * 1000,
         ((LongWritable) writable.get()[1]).get());
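The commit above stops comparing a formatted `Date` against the raw string form of the writable and instead formats both sides with the same `SimpleDateFormat`. A minimal, self-contained sketch of that idea using only JDK types (the epoch value and class name here are made up for illustration; the real test uses Hive's `TimestampWritable` rather than `java.sql.Timestamp`):

```java
import java.sql.Timestamp;
import java.text.SimpleDateFormat;
import java.util.Date;

public class TimestampFormatCheck {
    public static void main(String[] args) {
        long testTimestampLong = 1706918400123L; // hypothetical epoch millis

        Date expected = new Date();
        expected.setTime(testTimestampLong);
        // Stand-in for ((TimestampWritable) writable.get()[0]).getTimestamp()
        Timestamp actual = new Timestamp(testTimestampLong);

        SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
        // Comparing formatted strings avoids relying on Timestamp.toString(),
        // which prints nanosecond precision and need not match Date's formatting.
        if (!dateFormat.format(expected).equals(dateFormat.format(actual))) {
            throw new AssertionError("timestamp formats differ");
        }
        System.out.println("OK");
    }
}
```

Normalizing both values through one formatter is what makes the assertion robust to representation differences between the two timestamp classes.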
Re: [PR] [HUDI-6902] Fix string format of the timestamp in a test [hudi]
nsivabalan merged PR #10610: URL: https://github.com/apache/hudi/pull/10610
[jira] [Updated] (HUDI-7377) Fix flaky test: ITTestHoodieDataSource.testWriteAndReadWithDataSkipping
[ https://issues.apache.org/jira/browse/HUDI-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lin Liu updated HUDI-7377:
--------------------------
    Parent: HUDI-6902
    Issue Type: Sub-task  (was: Bug)

> Fix flaky test: ITTestHoodieDataSource.testWriteAndReadWithDataSkipping
> ------------------------------------------------------------------------
>
>                 Key: HUDI-7377
>                 URL: https://issues.apache.org/jira/browse/HUDI-7377
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Lin Liu
>            Assignee: Lin Liu
>            Priority: Major

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HUDI-7377) Fix flaky test: ITTestHoodieDataSource.testWriteAndReadWithDataSkipping
Lin Liu created HUDI-7377:
--------------------------
             Summary: Fix flaky test: ITTestHoodieDataSource.testWriteAndReadWithDataSkipping
                 Key: HUDI-7377
                 URL: https://issues.apache.org/jira/browse/HUDI-7377
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Lin Liu
            Assignee: Lin Liu

Error: ITTestHoodieDataSource.testWriteAndReadWithDataSkipping:1603
[6454|https://github.com/apache/hudi/actions/runs/7762799770/job/21173863141?pr=10512#step:6:6455]Expected: is "[+I[id1, Danny, 23, 1970-01-01T00:00:01, par1], +I[id2, Stephen, 33, 1970-01-01T00:00:02, par1], +I[id3, Julian, 53, 1970-01-01T00:00:03, par2], +I[id4, Fabian, 31, 1970-01-01T00:00:04, par2], +I[id5, Sophia, 18, 1970-01-01T00:00:05, par3], +I[id6, Emma, 20, 1970-01-01T00:00:06, par3], +I[id7, Bob, 44, 1970-01-01T00:00:07, par4], +I[id8, Han, 56, 1970-01-01T00:00:08, par4]]"
[6455|https://github.com/apache/hudi/actions/runs/7762799770/job/21173863141?pr=10512#step:6:6456] but: was "[]"
[6456|https://github.com/apache/hudi/actions/runs/7762799770/job/21173863141?pr=10512#step:6:6457]Error: ITTestHoodieDataSource.testWriteAndReadWithDataSkipping:1603
[6457|https://github.com/apache/hudi/actions/runs/7762799770/job/21173863141?pr=10512#step:6:6458]Expected: is "[+I[id1, Danny, 23, 1970-01-01T00:00:01, par1], +I[id2, Stephen, 33, 1970-01-01T00:00:02, par1], +I[id3, Julian, 53, 1970-01-01T00:00:03, par2], +I[id4, Fabian, 31, 1970-01-01T00:00:04, par2], +I[id5, Sophia, 18, 1970-01-01T00:00:05, par3], +I[id6, Emma, 20, 1970-01-01T00:00:06, par3], +I[id7, Bob, 44, 1970-01-01T00:00:07, par4], +I[id8, Han, 56, 1970-01-01T00:00:08, par4]]"
[6458|https://github.com/apache/hudi/actions/runs/7762799770/job/21173863141?pr=10512#step:6:6459] but: was "[]"

--
This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1925036962

## CI report:

* 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
* 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
* a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
* a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
* b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
* d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
* e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
* f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
* 3bc27d30c4a3434984717afe74cf051a8c41d41d Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22303)
* 094f6fe1e22df51003ffd842b2b71eb51ba03650 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22304)

Bot commands:

@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1925035316

## CI report:

* 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
* 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
* a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
* a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
* b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
* d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
* e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
* f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
* 3bc27d30c4a3434984717afe74cf051a8c41d41d Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22303)
* 094f6fe1e22df51003ffd842b2b71eb51ba03650 UNKNOWN

Bot commands:

@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-9424] Support using local timezone when writing flink TIMESTAMP data [hudi]
danny0405 commented on code in PR #10594: URL: https://github.com/apache/hudi/pull/10594#discussion_r1476923080

## hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/utils/TestRowDataToAvroConverters.java:

@@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utils;
+
+import org.apache.avro.generic.GenericRecord;
+import org.apache.flink.formats.common.TimestampFormat;
+import org.apache.flink.formats.json.JsonToRowDataConverters;
+import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.JsonProcessingException;
+import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper;
+import org.apache.flink.table.api.DataTypes;
+import org.apache.flink.table.types.DataType;
+import org.apache.flink.table.types.logical.RowType;
+import org.apache.hudi.util.AvroSchemaConverter;
+import org.apache.hudi.util.RowDataToAvroConverters;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+import java.time.Instant;
+import java.time.LocalDateTime;
+import java.time.ZoneId;
+import java.time.format.DateTimeFormatter;
+import java.util.TimeZone;
+
+import static org.apache.flink.table.api.DataTypes.ROW;
+import static org.apache.flink.table.api.DataTypes.FIELD;
+import static org.apache.flink.table.api.DataTypes.TIMESTAMP;
+
+class TestRowDataToAvroConverters {
+
+  DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
+
+  @Test
+  void testRowDataToAvroStringToRowDataWithLocalTimezone1() throws JsonProcessingException {
+    TimeZone.setDefault(TimeZone.getTimeZone(ZoneId.of("Asia/Shanghai")));
+    String timestampFromUtc8 = "2021-03-30 15:44:29";
+
+    DataType rowDataType = ROW(FIELD("timestamp_from_utc_8", TIMESTAMP()));
+    JsonToRowDataConverters.JsonToRowDataConverter jsonToRowDataConverter =
+        new JsonToRowDataConverters(true, true, TimestampFormat.SQL)
+            .createConverter(rowDataType.getLogicalType());
+    Object rowData = jsonToRowDataConverter.convert(new ObjectMapper().readTree("{\"timestamp_from_utc_8\":\"" + timestampFromUtc8 + "\"}"));
+

Review Comment:
   And why we just use the `TIMESTAMP_LTZ` instead of `TIMESTAMP` when local time zone is needed?
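For context on the reviewer's `TIMESTAMP_LTZ` vs `TIMESTAMP` question: Flink's `TIMESTAMP` models a zone-less wall-clock reading, while `TIMESTAMP_LTZ` models an absolute instant rendered in the session time zone. A plain-JDK sketch (no Flink dependency; the zone and wall-clock value are taken from the test above) of why the two interpretations diverge:

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class TimestampSemantics {
    public static void main(String[] args) {
        // TIMESTAMP-like value: a wall-clock reading with no zone attached.
        LocalDateTime wallClock = LocalDateTime.of(2021, 3, 30, 15, 44, 29);

        // The same wall-clock reading denotes different absolute instants
        // (TIMESTAMP_LTZ-like values) depending on the interpreting zone.
        Instant inShanghai = wallClock.atZone(ZoneId.of("Asia/Shanghai")).toInstant();
        Instant inUtc = wallClock.atZone(ZoneId.of("UTC")).toInstant();

        // Asia/Shanghai is UTC+8 on this date, so the two instants are 8 hours apart.
        System.out.println(Duration.between(inShanghai, inUtc).toHours()); // prints 8
    }
}
```

This is why a converter that silently applies the default JVM zone to a zone-less `TIMESTAMP` can shift values by the zone offset, whereas `TIMESTAMP_LTZ` makes the instant semantics explicit.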
Re: [PR] [WIP] [HUDI-6787] Implement the HoodieFileGroupReader API for Hive [hudi]
danny0405 commented on PR #10422: URL: https://github.com/apache/hudi/pull/10422#issuecomment-1925026965

@bvaradar Can you help the review of the hive related code?
Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1925020479

## CI report:

* 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
* 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
* a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
* a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
* b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
* d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
* e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
* f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
* 3bc27d30c4a3434984717afe74cf051a8c41d41d Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22303)

Bot commands:

@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1925018239

## CI report:

* 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
* 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
* a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
* a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
* b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
* d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
* e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
* f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
* a4e0d876e148fe469df64a3f0a200f6394f6a072 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22301)
* 3bc27d30c4a3434984717afe74cf051a8c41d41d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22303)

Bot commands:

@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-7284] fix bad method name getLastPendingClusterCommit to getLastPendingClusterInstant [hudi]
hudi-bot commented on PR #10613: URL: https://github.com/apache/hudi/pull/10613#issuecomment-1924991827

## CI report:

* 9fc4c7e5f6161190a97ccf353a9ff78e94ad692a Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22298)

Bot commands:

@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
[jira] [Updated] (HUDI-7376) Kill running test instances in Azure CI when the PR is updated
[ https://issues.apache.org/jira/browse/HUDI-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lin Liu updated HUDI-7376:
--------------------------
    Summary: Kill running test instances in Azure CI when the PR is updated  (was: Kill existing tests in Azure CI when the PR is updated)

> Kill running test instances in Azure CI when the PR is updated
> --------------------------------------------------------------
>
>                 Key: HUDI-7376
>                 URL: https://issues.apache.org/jira/browse/HUDI-7376
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Lin Liu
>            Assignee: Lin Liu
>            Priority: Major

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HUDI-7376) Kill existing tests in Azure CI when the PR is updated
Lin Liu created HUDI-7376:
--------------------------
             Summary: Kill existing tests in Azure CI when the PR is updated
                 Key: HUDI-7376
                 URL: https://issues.apache.org/jira/browse/HUDI-7376
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Lin Liu
            Assignee: Lin Liu

--
This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1924931305

## CI report:

* 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
* 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
* a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
* a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
* b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
* d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
* e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
* f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
* ac70b6f20ac017840ec1acfce4e5bcbe5f8b9beb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22296)
* a4e0d876e148fe469df64a3f0a200f6394f6a072 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22301)
* 3bc27d30c4a3434984717afe74cf051a8c41d41d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22303)

Bot commands:

@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1924926608

## CI report:

* 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
* 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
* a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
* a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
* b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
* d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
* e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
* f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
* ac70b6f20ac017840ec1acfce4e5bcbe5f8b9beb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22296)
* a4e0d876e148fe469df64a3f0a200f6394f6a072 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22301)
* 3bc27d30c4a3434984717afe74cf051a8c41d41d UNKNOWN

Bot commands:

@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] POC hudi-aws integration testing w/ moto [hudi]
hudi-bot commented on PR #10614: URL: https://github.com/apache/hudi/pull/10614#issuecomment-1924921568

## CI report:

* 006772264911a29d7a925731d305b57768075428 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22302)

Bot commands:

@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] POC hudi-aws integration testing w/ moto [hudi]
hudi-bot commented on PR #10614: URL: https://github.com/apache/hudi/pull/10614#issuecomment-1924892823

## CI report:

* 006772264911a29d7a925731d305b57768075428 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22302)

Bot commands:

@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1924892531

## CI report:

* 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
* 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
* a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
* a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
* b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
* d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
* e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
* f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
* ac70b6f20ac017840ec1acfce4e5bcbe5f8b9beb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22296)
* a4e0d876e148fe469df64a3f0a200f6394f6a072 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22301)

Bot commands:

@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] POC hudi-aws integration testing w/ moto [hudi]
hudi-bot commented on PR #10614: URL: https://github.com/apache/hudi/pull/10614#issuecomment-1924885707

## CI report:

* 006772264911a29d7a925731d305b57768075428 UNKNOWN

Bot commands:

@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-6902] Fix string format of the timestamp in a test [hudi]
hudi-bot commented on PR #10610: URL: https://github.com/apache/hudi/pull/10610#issuecomment-1924885672

## CI report:

* 6af0095c396fdf2b589fe1f6af842f42d4e41bcd Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22294)

Bot commands:

@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1924885454

## CI report:

* 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
* 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
* a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
* a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
* b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
* d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
* e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
* f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
* ac70b6f20ac017840ec1acfce4e5bcbe5f8b9beb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22296)
* a4e0d876e148fe469df64a3f0a200f6394f6a072 UNKNOWN

Bot commands:

@hudi-bot supports the following commands:
- `@hudi-bot run azure` re-run the last Azure build
[jira] [Updated] (HUDI-6902) Detect and fix flaky tests
[ https://issues.apache.org/jira/browse/HUDI-6902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lin Liu updated HUDI-6902:
--------------------------
    Summary: Detect and fix flaky tests  (was: Detect flaky tests)

> Detect and fix flaky tests
> --------------------------
>
>                 Key: HUDI-6902
>                 URL: https://issues.apache.org/jira/browse/HUDI-6902
>             Project: Apache Hudi
>          Issue Type: New Feature
>            Reporter: Lin Liu
>            Assignee: Lin Liu
>            Priority: Major
>              Labels: pull-request-available
>
> Step 1: Create a dummy PR and try to trigger the errors if possible.
> 1. The integration test constantly fails.
> 2. Some random failures:
> [https://github.com/apache/hudi/actions/runs/6396038672]

--
This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [HUDI-6902] Fix string format of the timestamp in a test [hudi]
hudi-bot commented on PR #10610: URL: https://github.com/apache/hudi/pull/10610#issuecomment-1924880352 ## CI report: * 6af0095c396fdf2b589fe1f6af842f42d4e41bcd Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22294)
Re: [PR] [HUDI-7373] revert config hoodie.write.handle.missing.cols.with.lossless.type.promotion [hudi]
hudi-bot commented on PR #10611: URL: https://github.com/apache/hudi/pull/10611#issuecomment-1924880376 ## CI report: * 183d46218d75594a53a56803d49c46d354d8a5ef Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22297)
Re: [PR] [HUDI-7351] Implement partition pushdown for glue [hudi]
parisni commented on PR #10604: URL: https://github.com/apache/hudi/pull/10604#issuecomment-1924879580 BTW @danny0405 I am working on a POC for IT test with moto to improve the reliability of this module https://github.com/apache/hudi/pull/10614
[PR] POC hudi-aws integration testing w/ moto [hudi]
parisni opened a new pull request, #10614: URL: https://github.com/apache/hudi/pull/10614 ### Change Logs _Describe context and summary for this change. Highlight if any code was copied._ ### Impact _Describe any public API or user-facing feature change or any performance impact._ ### Risk level (write none, low medium or high below) _If medium or high, explain what verification was done to mitigate the risks._ ### Documentation Update _Describe any necessary documentation update if there is any new feature, config, or user-facing change_ - _The config description must be updated if new configs are added or the default value of the configs are changed_ - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make changes to the website._ ### Contributor's checklist - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute) - [ ] Change Logs and Impact were stated clearly - [ ] Adequate tests were added if applicable - [ ] CI passed
[jira] [Created] (HUDI-7375) Fix flaky test: testLogReaderWithDifferentVersionsOfDeleteBlocks
Lin Liu created HUDI-7375: - Summary: Fix flaky test: testLogReaderWithDifferentVersionsOfDeleteBlocks Key: HUDI-7375 URL: https://issues.apache.org/jira/browse/HUDI-7375 Project: Apache Hudi Issue Type: Bug Reporter: Lin Liu Assignee: Lin Liu

{code:java}
Error: testLogReaderWithDifferentVersionsOfDeleteBlocks{DiskMapType, boolean, boolean, boolean}[13] Time elapsed: 0.043 s <<< ERROR!
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/root/[13] BITCASK, false, true, false1706913234251/partition_path/.test-fileid1_100.log.1_1-0-1 could only be written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2338)
	at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2989)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:911)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:595)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621)
	at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589)
	at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1213)
	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1089)
	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1012)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3026)

	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612)
	at org.apache.hadoop.ipc.Client.call(Client.java:1558)
	at org.apache.hadoop.ipc.Client.call(Client.java:1455)
	at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242)
	at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129)
	at jdk.proxy2/jdk.proxy2.$Proxy43.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:530)
	at jdk.internal.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
	at jdk.proxy2/jdk.proxy2.$Proxy44.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1088)
	at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1915)
	at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1717)
	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:713)
{code}
[jira] [Updated] (HUDI-7375) Fix flaky test: testLogReaderWithDifferentVersionsOfDeleteBlocks
[ https://issues.apache.org/jira/browse/HUDI-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu updated HUDI-7375: -- Parent: HUDI-6902 Issue Type: Sub-task (was: Bug) > Fix flaky test: testLogReaderWithDifferentVersionsOfDeleteBlocks > > > Key: HUDI-7375 > URL: https://issues.apache.org/jira/browse/HUDI-7375 > Project: Apache Hudi > Issue Type: Sub-task >Reporter: Lin Liu >Assignee: Lin Liu >Priority: Major > > {code:java} > Error: testLogReaderWithDifferentVersionsOfDeleteBlocks{DiskMapType, > boolean, boolean, boolean}[13] Time elapsed: 0.043 s <<< ERROR! > 3421org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /user/root/[13] BITCASK, false, true, > false1706913234251/partition_path/.test-fileid1_100.log.1_1-0-1 could only be > written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running > and 3 node(s) are excluded in this operation. > 3422 at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2338) > 3423 at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294) > 3424 at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2989) > 3425 at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:911) > 3426 at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:595) > 3427 at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > 3428 at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621) > 3429 at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589) > 3430 at > 
org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573) > 3431 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1213) > 3432 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1089) > 3433 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1012) > 3434 at java.security.AccessController.doPrivileged(Native Method) > 3435 at javax.security.auth.Subject.doAs(Subject.java:422) > 3436 at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) > 3437 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3026) > 3438 > 3439 at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1612) > 3440 at org.apache.hadoop.ipc.Client.call(Client.java:1558) > 3441 at org.apache.hadoop.ipc.Client.call(Client.java:1455) > 3442 at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242) > 3443 at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129) > 3444 at jdk.proxy2/jdk.proxy2.$Proxy43.addBlock(Unknown Source) > 3445 at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:530) > 3446 at jdk.internal.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) > 3447 at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 3448 at java.base/java.lang.reflect.Method.invoke(Method.java:568) > 3449 at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) > 3450 at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) > 3451 at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) > 3452 at > org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) > 3453 at > 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) > 3454 at jdk.proxy2/jdk.proxy2.$Proxy44.addBlock(Unknown Source) > 3455 at > org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1088) > 3456 at > org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1915) > 3457 at > org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1717) > 3458 at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:713) > 3459 {code}
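[Editorial note] The `minReplication` failure quoted above is a transient HDFS write-pipeline error (datanodes temporarily excluded) rather than a logic bug, and flaky-test fixes of this kind usually either reconfigure the mini-cluster or retry the transient operation. As an illustration only — this is not Hudi code, and all names below are invented — a generic retry-with-backoff helper for such transient failures could look like this:

```java
import java.util.concurrent.Callable;

// Minimal retry helper for transient I/O failures. Illustrative sketch only;
// the class and method names are not part of the Hudi codebase.
public class TransientRetry {

    // Runs the action up to maxAttempts times, sleeping with linear backoff
    // between attempts, and rethrows the last failure if all attempts fail.
    public static <T> T withRetries(Callable<T> action, int maxAttempts, long backoffMillis)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // assume the failure is transient and retry
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis * attempt); // linear backoff
                }
            }
        }
        throw last; // retries exhausted: surface the last failure
    }

    public static void main(String[] args) throws Exception {
        // Simulate a write that fails twice before succeeding.
        final int[] calls = {0};
        String result = withRetries(() -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException("simulated transient failure");
            }
            return "ok";
        }, 5, 1L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Whether retrying is the right fix here (versus, say, tuning the test's mini-cluster datanode exclusion behavior) depends on the root cause the ticket ultimately identifies.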
Re: [PR] [HUDI-6902] Fix string format of the timestamp in a test [hudi]
linliu-code commented on PR #10610: URL: https://github.com/apache/hudi/pull/10610#issuecomment-1924840271 @hudi-bot run azure
[jira] [Updated] (HUDI-7374) Fix flaky test: AsyncCompaciton.testConcurrentCompaction
[ https://issues.apache.org/jira/browse/HUDI-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu updated HUDI-7374: -- Parent: HUDI-6902 Issue Type: Sub-task (was: Bug) > Fix flaky test: AsyncCompaciton.testConcurrentCompaction > > > Key: HUDI-7374 > URL: https://issues.apache.org/jira/browse/HUDI-7374 > Project: Apache Hudi > Issue Type: Sub-task >Reporter: Lin Liu >Assignee: Danny Chen >Priority: Major > > > {code:java} > [ERROR] Tests run: 10, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 50.603 s <<< FAILURE! - in > org.apache.hudi.table.action.compact.TestAsyncCompaction > [ERROR] testConcurrentCompaction Time elapsed: 7.767 s <<< FAILURE! > org.opentest4j.AssertionFailedError: compaction plan should not include > pending log files ==> expected: <true> but was: <false> > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) > at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) > at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:193) > at > org.apache.hudi.table.action.compact.TestAsyncCompaction.testConcurrentCompaction(TestAsyncCompaction.java:235) > {code}
[jira] [Assigned] (HUDI-7374) Fix flaky test: AsyncCompaciton.testConcurrentCompaction
[ https://issues.apache.org/jira/browse/HUDI-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu reassigned HUDI-7374: - Assignee: Danny Chen (was: Lin Liu) > Fix flaky test: AsyncCompaciton.testConcurrentCompaction > > > Key: HUDI-7374 > URL: https://issues.apache.org/jira/browse/HUDI-7374 > Project: Apache Hudi > Issue Type: Bug >Reporter: Lin Liu >Assignee: Danny Chen >Priority: Major > > > {code:java} > [ERROR] Tests run: 10, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 50.603 s <<< FAILURE! - in > org.apache.hudi.table.action.compact.TestAsyncCompaction > [ERROR] testConcurrentCompaction Time elapsed: 7.767 s <<< FAILURE! > org.opentest4j.AssertionFailedError: compaction plan should not include > pending log files ==> expected: <true> but was: <false> > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) > at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) > at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:193) > at > org.apache.hudi.table.action.compact.TestAsyncCompaction.testConcurrentCompaction(TestAsyncCompaction.java:235) > {code}
[jira] [Updated] (HUDI-7374) Fix flaky test: AsyncCompaciton.testConcurrentCompaction
[ https://issues.apache.org/jira/browse/HUDI-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu updated HUDI-7374: -- Description: {code:java} [ERROR] Tests run: 10, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 50.603 s <<< FAILURE! - in org.apache.hudi.table.action.compact.TestAsyncCompaction [ERROR] testConcurrentCompaction Time elapsed: 7.767 s <<< FAILURE! org.opentest4j.AssertionFailedError: compaction plan should not include pending log files ==> expected: <true> but was: <false> at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:193) at org.apache.hudi.table.action.compact.TestAsyncCompaction.testConcurrentCompaction(TestAsyncCompaction.java:235) {code} was: {code:java} at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:84) at java.util.ArrayList.forEach(ArrayList.java:1259) at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:38) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:143) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:129) at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:127) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:126) at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:84) at 
org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:32) at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57) at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:51) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52) at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96) at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75) at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invokeAllTests(JUnitPlatformProvider.java:150) at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invoke(JUnitPlatformProvider.java:124) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) [I {code} > Fix flaky test: AsyncCompaciton.testConcurrentCompaction > > > Key: HUDI-7374 > URL: https://issues.apache.org/jira/browse/HUDI-7374 > Project: Apache Hudi > Issue Type: Bug >Reporter: Lin Liu >Assignee: Lin Liu >Priority: Major > > > {code:java} > [ERROR] Tests run: 10, Failures: 1, Errors: 0, 
Skipped: 0, Time elapsed: > 50.603 s <<< FAILURE! - in > org.apache.hudi.table.action.compact.TestAsyncCompaction > [ERROR] testConcurrentCompaction Time elapsed: 7.767 s <<< FAILURE! > org.opentest4j.AssertionFailedError: compaction plan should not include > pending log files ==> expected: <true> but was: <false> > at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) > at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:40) > at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:193) > at > org.apache.hudi.table.action.compact.TestAsyncCompaction.testConcurrentCompaction(TestAsyncCompaction.java:235) > {code}
[jira] [Created] (HUDI-7374) Fix flaky test: AsyncCompaciton.testConcurrentCompaction
Lin Liu created HUDI-7374: - Summary: Fix flaky test: AsyncCompaciton.testConcurrentCompaction Key: HUDI-7374 URL: https://issues.apache.org/jira/browse/HUDI-7374 Project: Apache Hudi Issue Type: Bug Reporter: Lin Liu Assignee: Lin Liu {code:java} at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:84) at java.util.ArrayList.forEach(ArrayList.java:1259) at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:38) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:143) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:129) at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:127) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:126) at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:84) at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:32) at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57) at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:51) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88) at 
org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52) at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96) at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75) at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invokeAllTests(JUnitPlatformProvider.java:150) at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invoke(JUnitPlatformProvider.java:124) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) [I {code}
(hudi) branch master updated: [HUDI-6868] Support extracting passwords from credential store for Hive Sync (#10577)
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git

The following commit(s) were added to refs/heads/master by this push:
new 33407805bd8 [HUDI-6868] Support extracting passwords from credential store for Hive Sync (#10577)

33407805bd8 is described below

commit 33407805bd860e295cb9cdfa592f44175c4fa4fb
Author: Aditya Goenka <63430370+ad1happy...@users.noreply.github.com>
AuthorDate: Sat Feb 3 03:59:58 2024 +0530

    [HUDI-6868] Support extracting passwords from credential store for Hive Sync (#10577)

    Co-authored-by: Danny Chan
---
 .../scala/org/apache/hudi/HoodieSparkSqlWriter.scala | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala b/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
index 7e099166f28..00ec59c5b8f 100644
--- a/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
+++ b/hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
@@ -21,6 +21,8 @@ import org.apache.avro.Schema
 import org.apache.avro.generic.GenericData
 import org.apache.hadoop.conf.Configuration
 import org.apache.hadoop.fs.{FileSystem, Path}
+import org.apache.hadoop.hive.conf.HiveConf
+import org.apache.hadoop.hive.shims.ShimLoader
 import org.apache.hudi.AutoRecordKeyGenerationUtils.mayBeValidateParamsForAutoGenerationOfRecordKeys
 import org.apache.hudi.AvroConversionUtils.{convertAvroSchemaToStructType, convertStructTypeToAvroSchema, getAvroRecordNameAndNamespace}
 import org.apache.hudi.DataSourceOptionsHelper.fetchMissingWriteConfigsFromTableConfig
@@ -884,7 +886,19 @@ class HoodieSparkSqlWriterInternal {
       properties.put(HiveSyncConfigHolder.HIVE_SYNC_SCHEMA_STRING_LENGTH_THRESHOLD.key, spark.sessionState.conf.getConf(StaticSQLConf.SCHEMA_STRING_LENGTH_THRESHOLD).toString)
       properties.put(HoodieSyncConfig.META_SYNC_SPARK_VERSION.key, SPARK_VERSION)
       properties.put(HoodieSyncConfig.META_SYNC_USE_FILE_LISTING_FROM_METADATA.key, hoodieConfig.getBoolean(HoodieMetadataConfig.ENABLE))
-
+      if ((fs.getConf.get(HiveConf.ConfVars.METASTOREPWD.varname) == null || fs.getConf.get(HiveConf.ConfVars.METASTOREPWD.varname).isEmpty) &&
+        (properties.get(HiveSyncConfigHolder.HIVE_PASS.key()) == null || properties.get(HiveSyncConfigHolder.HIVE_PASS.key()).toString.isEmpty)) {
+        try {
+          val passwd = ShimLoader.getHadoopShims.getPassword(spark.sparkContext.hadoopConfiguration, HiveConf.ConfVars.METASTOREPWD.varname)
+          if (passwd != null && !passwd.isEmpty) {
+            fs.getConf.set(HiveConf.ConfVars.METASTOREPWD.varname, passwd)
+            properties.put(HiveSyncConfigHolder.HIVE_PASS.key(), passwd)
+          }
+        } catch {
+          case e: Exception =>
+            log.info("Exception while trying to get Meta Sync password from hadoop credential store", e)
+        }
+      }
       // Collect exceptions in list because we want all sync to run. Then we can throw
       val failedMetaSyncs = new mutable.HashMap[String,HoodieException]()
       syncClientToolClassSet.foreach(impl => {
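[Editorial note] The essence of the change above is a fallback: consult the Hadoop credential store only when neither the Hadoop configuration nor the sync properties already carry a non-empty metastore password, and swallow lookup failures so the sync can still run. The sketch below restates that control flow in plain Java; the `Map`s and the `Function` stand in for Hadoop's `Configuration` and the `ShimLoader`-based credential lookup, and the key strings are illustrative approximations of the real config keys, not authoritative:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of the password-fallback logic from the commit above. Map/Function
// are stand-ins for Hadoop Configuration and the credential-provider lookup;
// the key strings are illustrative, not a definitive copy of Hudi's constants.
public class MetastorePasswordFallback {
    static final String METASTORE_PWD = "javax.jdo.option.ConnectionPassword";
    static final String HIVE_PASS = "hoodie.datasource.hive_sync.password";

    static void resolvePassword(Map<String, String> conf,
                                Map<String, String> props,
                                Function<String, String> credentialStore) {
        boolean confEmpty = conf.get(METASTORE_PWD) == null || conf.get(METASTORE_PWD).isEmpty();
        boolean propsEmpty = props.get(HIVE_PASS) == null || props.get(HIVE_PASS).isEmpty();
        // Only fall back to the credential store when no password is set anywhere.
        if (confEmpty && propsEmpty) {
            try {
                String passwd = credentialStore.apply(METASTORE_PWD);
                if (passwd != null && !passwd.isEmpty()) {
                    conf.put(METASTORE_PWD, passwd);
                    props.put(HIVE_PASS, passwd);
                }
            } catch (RuntimeException e) {
                // Mirror the commit: log and continue so meta sync still runs.
                System.err.println("Could not read password from credential store: " + e);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        Map<String, String> props = new HashMap<>();
        resolvePassword(conf, props, key -> "secret-from-store");
        System.out.println(props.get(HIVE_PASS)); // prints secret-from-store
    }
}
```

An explicitly configured password always wins: if either map already holds a non-empty value, the credential store is never consulted, which matches the guard in the Scala diff.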
Re: [PR] [HUDI-6868] Support extracting passwords from credential store for Hive Sync [hudi]
bvaradar merged PR #10577: URL: https://github.com/apache/hudi/pull/10577
Re: [PR] [HUDI-7284] fix bad method name getLastPendingClusterCommit to getLastPendingClusterInstant [hudi]
hudi-bot commented on PR #10613: URL: https://github.com/apache/hudi/pull/10613#issuecomment-1924783083 ## CI report: * 9fc4c7e5f6161190a97ccf353a9ff78e94ad692a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22298)
Re: [PR] [HUDI-7284] fix bad method name getLastPendingClusterCommit to getLastPendingClusterInstant [hudi]
hudi-bot commented on PR #10613: URL: https://github.com/apache/hudi/pull/10613#issuecomment-1924776802 ## CI report: * 9fc4c7e5f6161190a97ccf353a9ff78e94ad692a UNKNOWN
(hudi) branch asf-site updated: [DOCS][WEBSITE] Update team page for pmc and committer list (#10603)
This is an automated email from the ASF dual-hosted git repository. xushiyan pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git

The following commit(s) were added to refs/heads/asf-site by this push:
new fec3720d4a8 [DOCS][WEBSITE] Update team page for pmc and committer list (#10603)

fec3720d4a8 is described below

commit fec3720d4a8d4201a4119510498c38788534a2e4
Author: nadine farah
AuthorDate: Fri Feb 2 13:53:25 2024 -0800

    [DOCS][WEBSITE] Update team page for pmc and committer list (#10603)
---
 website/community/team.md | 12
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/website/community/team.md b/website/community/team.md
index 943623e8432..adc33cadd6d 100644
--- a/website/community/team.md
+++ b/website/community/team.md
@@ -14,27 +14,30 @@ last_modified_at: 2020-09-01T15:59:57-04:00
 | <img src={"https://avatars.githubusercontent.com/bhasudha"} className="profile-pic" alt="bhasudha" align="middle" /> | [Bhavani Sudha](https://github.com/bhasudha) | PMC, Committer | bhavanisudha |
 | <img src={"https://avatars.githubusercontent.com/bvaradar"} className="profile-pic" alt="bvaradar" align="middle" /> | [Balaji Varadarajan](https://github.com/bvaradar)| PMC, Committer | vbalaji |
 | <img src={"https://avatars.githubusercontent.com/danny0405"} className="profile-pic" alt="danny0405" align="middle" /> | [Danny Chan](https://github.com/danny0405) | PMC, Committer | danny0405|
-| <img src={"https://avatars.githubusercontent.com/yihua"} className="profile-pic" alt="yihua" align="middle" /> | [Ethan Guo](https://github.com/yihua) | Committer | yihua |
+| <img src={"https://avatars.githubusercontent.com/yihua"} className="profile-pic" alt="yihua" align="middle" /> | [Ethan Guo](https://github.com/yihua) | PMC, Committer| yihua |
 | <img src={"https://avatars.githubusercontent.com/XuQianJin-Stars"} className="profile-pic" alt="XuQianJin-Stars" align="middle" /> | [Forward Xu](https://github.com/XuQianJin-Stars) | Committer | forwardxu|
 | <img src={"https://avatars.githubusercontent.com/garyli1019"} className="profile-pic" alt="garyli1019" align="middle" /> | [Gary Li](https://github.com/garyli1019) | PMC, Committer | garyli|
+| <img src={"https://avatars.githubusercontent.com/boneanxs"} className="profile-pic" alt="boneanxs" align="middle" /> | [Hui An](https://github.com/boneanxs) | Committer | rexan |
 | <img src={"https://avatars.githubusercontent.com/lresende"} className="profile-pic" alt="lresende" align="middle" /> | [Luciano Resende](https://github.com/lresende) | PMC, Committer | lresende |
 | <img src={"https://avatars.githubusercontent.com/lamberken"} className="profile-pic" alt="lamberken" className="profile-pic" align="middle" /> | [lamberken](https://github.com/lamberken) | Committer | lamberken |
 | <img src={"https://avatars.githubusercontent.com/n3nash"} className="profile-pic" alt="n3nash" align="middle" /> | [Nishith Agarwal](https://github.com/n3nash) | PMC, Committer | nagarwal |
 | <img src={"https://avatars.githubusercontent.com/prasannarajaperumal"} className="profile-pic" alt="prasannarajaperumal" align="middle" /> | [Prasanna Rajaperumal](https://github.com/prasannarajaperumal) | PMC, Committer | prasanna |
-| <img src={"https://avatars.githubusercontent.com/prashantwason"} className="profile-pic" alt="prashantwason" /> | [Prashant Wason](https://github.com/prashantwason) | Committer | pwason |
+| <img src={"https://avatars.githubusercontent.com/prashantwason"} className="profile-pic" alt="prashantwason" /> | [Prashant Wason](https://github.com/prashantwason) | PMC, Committer | pwason |
 | <img src={"https://avatars.githubusercontent.com/pratyakshsharma"} className="profile-pic" alt="pratyakshsharma" align="middle" /> | [Pratyaksh Sharma](https://github.com/pratyakshsharma) | Committer | pratyakshsharma|
+| <img src={"https://avatars.githubusercontent.com/stream2000"} className="profile-pic" alt="stream2000" align="middle" /> | [Qijun Fu](https://github.com/stream2000) | Committer | stream2000|
 | <img src={"https://avatars.githubusercontent.com/xushiyan"} className="profile-pic" alt="xushiyan" align="middle" /> | [Raymond Xu](https://github.com/xushiyan) | PMC, Committer | xushiyan|
 | <img src={"https://avatars.githubusercontent.com/leesf"} className="profile-pic" alt="leesf" align="middle" /> | [Shaofeng Li](https://github.com/leesf) | PMC, Committer | leesf|
 | <img src={"https://avatars.githubusercontent.com/nsivabalan"} className="profile-pic" alt="nsivabalan" align="middle" /> | [Sivabalan Narayanan](https://github.com/nsivabalan) | PMC, Committer |
Re: [PR] [DOCS][WEBSITE] Update team page for pmc and committer list [hudi]
xushiyan merged PR #10603: URL: https://github.com/apache/hudi/pull/10603 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] [MINOR] fix bad method name getLastPendingClusterCommit to getLastPendingClusterInstant [hudi]
jonvex opened a new pull request, #10613: URL: https://github.com/apache/hudi/pull/10613 ### Change Logs Rename getLastPendingClusterCommit to getLastPendingClusterInstant, because a pending instant is not yet a commit ### Impact reduce dev confusion ### Risk level (write none, low medium or high below) none ### Documentation Update N/A ### Contributor's checklist - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute) - [ ] Change Logs and Impact were stated clearly - [ ] Adequate tests were added if applicable - [ ] CI passed
Re: [PR] [HUDI-6902] Fix string format of the timestamp in a test [hudi]
hudi-bot commented on PR #10610: URL: https://github.com/apache/hudi/pull/10610#issuecomment-1924642669 ## CI report: * 6af0095c396fdf2b589fe1f6af842f42d4e41bcd Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22294) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1924623263 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN * ac70b6f20ac017840ec1acfce4e5bcbe5f8b9beb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22296)
Re: [PR] [HUDI-7373] revert config hoodie.write.handle.missing.cols.with.lossless.type.promotion [hudi]
hudi-bot commented on PR #10611: URL: https://github.com/apache/hudi/pull/10611#issuecomment-1924605636 ## CI report: * 183d46218d75594a53a56803d49c46d354d8a5ef Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22297)
Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1924605026 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN * dec08647f16ec8ecdb12534d3c94aa76de3de5c2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22292) * ac70b6f20ac017840ec1acfce4e5bcbe5f8b9beb UNKNOWN
[jira] [Updated] (HUDI-7372) Remove docker usage from building process
[ https://issues.apache.org/jira/browse/HUDI-7372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu updated HUDI-7372: -- Summary: Remove docker usage from building process (was: Remove thrift usage from building process) > Remove docker usage from building process > - > > Key: HUDI-7372 > URL: https://issues.apache.org/jira/browse/HUDI-7372 > Project: Apache Hudi > Issue Type: Bug >Reporter: Lin Liu >Assignee: Lin Liu >Priority: Major > > In the current build process, a Thrift file is compiled on the fly by > running the thrift command via docker run. Since the GH CI itself runs in containers, > this means we use docker A to generate files in docker B, which may add risk and > complexity. I propose that we run the thrift command offline to generate the classes and check > them into the hudi repo. Then we can 1. reduce the complexity of the build > process; 2. reduce the compile time in the CI tests; 3. remove the > requirement to install thrift in our build environment. Meanwhile, I see no benefit to building these classes on the fly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [HUDI-7373] revert config hoodie.write.handle.missing.cols.with.lossless.type.promotion [hudi]
hudi-bot commented on PR #10611: URL: https://github.com/apache/hudi/pull/10611#issuecomment-1924591219 ## CI report: * 183d46218d75594a53a56803d49c46d354d8a5ef UNKNOWN
Re: [PR] [WIP] [HUDI-6787] Implement the HoodieFileGroupReader API for Hive [hudi]
jonvex commented on code in PR #10422: URL: https://github.com/apache/hudi/pull/10422#discussion_r1476626353 ## hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieFileGroupReaderRecordReader.java: ## @@ -0,0 +1,279 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.hudi.hadoop; + +import org.apache.hudi.avro.HoodieAvroUtils; +import org.apache.hudi.common.config.HoodieCommonConfig; +import org.apache.hudi.common.config.HoodieReaderConfig; +import org.apache.hudi.common.fs.FSUtils; +import org.apache.hudi.common.model.BaseFile; +import org.apache.hudi.common.model.FileSlice; +import org.apache.hudi.common.model.HoodieBaseFile; +import org.apache.hudi.common.model.HoodieFileGroupId; +import org.apache.hudi.common.table.HoodieTableMetaClient; +import org.apache.hudi.common.table.TableSchemaResolver; +import org.apache.hudi.common.table.read.HoodieFileGroupReader; +import org.apache.hudi.common.table.timeline.HoodieInstant; +import org.apache.hudi.common.util.Option; +import org.apache.hudi.common.util.StringUtils; +import org.apache.hudi.common.util.TablePathUtils; +import org.apache.hudi.hadoop.realtime.HoodieRealtimeRecordReader; +import org.apache.hudi.hadoop.realtime.RealtimeSplit; +import org.apache.hudi.hadoop.utils.HoodieRealtimeInputFormatUtils; +import org.apache.hudi.hadoop.utils.HoodieRealtimeRecordReaderUtils; + +import org.apache.avro.Schema; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hive.metastore.api.hive_metastoreConstants; +import org.apache.hadoop.hive.serde2.ColumnProjectionUtils; +import org.apache.hadoop.io.ArrayWritable; +import org.apache.hadoop.io.NullWritable; +import org.apache.hadoop.io.Writable; +import org.apache.hadoop.mapred.FileSplit; +import org.apache.hadoop.mapred.InputSplit; +import org.apache.hadoop.mapred.JobConf; +import org.apache.hadoop.mapred.RecordReader; +import org.apache.hadoop.mapred.Reporter; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Locale; +import java.util.Map; +import java.util.Set; +import java.util.function.UnaryOperator; 
+import java.util.stream.Collectors; +import java.util.stream.Stream; + +public class HoodieFileGroupReaderRecordReader implements RecordReader { + + public interface HiveReaderCreator { +org.apache.hadoop.mapred.RecordReader getRecordReader( +final org.apache.hadoop.mapred.InputSplit split, +final org.apache.hadoop.mapred.JobConf job, +final org.apache.hadoop.mapred.Reporter reporter +) throws IOException; + } + + private final HiveHoodieReaderContext readerContext; + private final HoodieFileGroupReader fileGroupReader; + private final ArrayWritable arrayWritable; + private final NullWritable nullWritable = NullWritable.get(); + private final InputSplit inputSplit; + private final JobConf jobConfCopy; + private final UnaryOperator reverseProjection; + + public HoodieFileGroupReaderRecordReader(HiveReaderCreator readerCreator, + final InputSplit split, + final JobConf jobConf, + final Reporter reporter) throws IOException { +this.jobConfCopy = new JobConf(jobConf); +HoodieRealtimeInputFormatUtils.cleanProjectionColumnIds(jobConfCopy); +Set partitionColumns = new HashSet<>(getPartitionFieldNames(jobConfCopy)); +this.inputSplit = split; + +FileSplit fileSplit = (FileSplit) split; +String tableBasePath = getTableBasePath(split, jobConfCopy); +HoodieTableMetaClient metaClient = HoodieTableMetaClient.builder() +.setConf(jobConfCopy) +.setBasePath(tableBasePath) +.build(); +String latestCommitTime = getLatestCommitTime(split, metaClient); +Schema tableSchema = getLatestTableSchema(metaClient, jobConfCopy, latestCommitTime); +Schema requestedSchema = createRequestedSchema(tableSchema, jobConfCopy); +Map hosts = new HashMap<>(); +this.readerContext = new
Re: [PR] [WIP] [HUDI-6787] Implement the HoodieFileGroupReader API for Hive [hudi]
jonvex commented on code in PR #10422: URL: https://github.com/apache/hudi/pull/10422#discussion_r1476613616 ## hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieParquetInputFormat.java: ## @@ -91,9 +94,42 @@ private void initAvroInputFormat() { } } + private static boolean checkIfHudiTable(final InputSplit split, final JobConf job) { +try { + Option tablePathOpt = TablePathUtils.getTablePath(((FileSplit) split).getPath(), job); + if (!tablePathOpt.isPresent()) { +return false; + } + return tablePathOpt.get().getFileSystem(job).exists(new Path(tablePathOpt.get(), HoodieTableMetaClient.METAFOLDER_NAME)); +} catch (IOException e) { + return false; +} + } + @Override public RecordReader getRecordReader(final InputSplit split, final JobConf job, final Reporter reporter) throws IOException { + +if (HoodieFileGroupReaderRecordReader.useFilegroupReader(job)) { + try { +if (!(split instanceof FileSplit) || !checkIfHudiTable(split, job)) { + return super.getRecordReader(split, job, reporter); +} +if (supportAvroRead && HoodieColumnProjectionUtils.supportTimestamp(job)) { + return new HoodieFileGroupReaderRecordReader((s, j, r) -> { +try { + return new ParquetRecordReaderWrapper(new HoodieTimestampAwareParquetInputFormat(), s, j, r); +} catch (InterruptedException e) { + throw new RuntimeException(e); +} + }, split, job, reporter); +} else { + return new HoodieFileGroupReaderRecordReader(super::getRecordReader, split, job, reporter); Review Comment: I added your suggestion. Could you please let me know if that fixes the issue? Thanks!
[PR] [HUDI-7373] Make schema evolution doc and config correct [hudi]
jonvex opened a new pull request, #10612: URL: https://github.com/apache/hudi/pull/10612 ### Change Logs hoodie.write.set.null.for.missing.columns was renamed to hoodie.write.handle.missing.cols.with.lossless.type.promotion. The config only adds the missing columns. The "reverse type promotion" is not controlled by a config. Additionally, this commit never made it to the release, so hoodie.write.handle.missing.cols.with.lossless.type.promotion will not work; only hoodie.write.set.null.for.missing.columns does. ### Impact Users can get the correct config ### Risk level (write none, low medium or high below) none ### Documentation Update N/A ### Contributor's checklist - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute) - [ ] Change Logs and Impact were stated clearly - [ ] Adequate tests were added if applicable - [ ] CI passed
Re: [PR] [HUDI-6902] Fix string format of the timestamp in a test [hudi]
hudi-bot commented on PR #10610: URL: https://github.com/apache/hudi/pull/10610#issuecomment-1924522324 ## CI report: * 6af0095c396fdf2b589fe1f6af842f42d4e41bcd Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22294)
[jira] [Updated] (HUDI-7373) Revert config name back to hoodie.write.set.null.for.missing.columns in master
[ https://issues.apache.org/jira/browse/HUDI-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-7373: - Labels: pull-request-available (was: ) > Revert config name back to hoodie.write.set.null.for.missing.columns in master > -- > > Key: HUDI-7373 > URL: https://issues.apache.org/jira/browse/HUDI-7373 > Project: Apache Hudi > Issue Type: Task >Reporter: Jonathan Vexler >Assignee: Jonathan Vexler >Priority: Critical > Labels: pull-request-available > > hoodie.write.set.null.for.missing.columns was renamed to > hoodie.write.handle.missing.cols.with.lossless.type.promotion. The config > only adds the missing columns. The "reverse type promotion" is not controlled > by a config.
[PR] [HUDI-7373] revert config hoodie.write.handle.missing.cols.with.lossless.type.promotion [hudi]
jonvex opened a new pull request, #10611: URL: https://github.com/apache/hudi/pull/10611 ### Change Logs hoodie.write.set.null.for.missing.columns was renamed to hoodie.write.handle.missing.cols.with.lossless.type.promotion. The config only adds the missing columns. The "reverse type promotion" is not controlled by a config. ### Impact Prevent config key change in next release ### Risk level (write none, low medium or high below) none ### Documentation Update Need to fix website. Will link here ### Contributor's checklist - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute) - [ ] Change Logs and Impact were stated clearly - [ ] Adequate tests were added if applicable - [ ] CI passed
[jira] [Updated] (HUDI-7373) Revert config name back to hoodie.write.set.null.for.missing.columns in master
[ https://issues.apache.org/jira/browse/HUDI-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler updated HUDI-7373: -- Description: hoodie.write.set.null.for.missing.columns was renamed to hoodie.write.handle.missing.cols.with.lossless.type.promotion. The config only adds the missing columns. The "reverse type promotion" is not controlled by a config. was: hoodie.write.set.null.for.missing.columns was renamed to hoodie.write.handle.missing.cols.with.lossless.type.promotion. The feature only adds the missing columns. The "reverse type promotion" is not controlled by a config. > Revert config name back to hoodie.write.set.null.for.missing.columns in master > -- > > Key: HUDI-7373 > URL: https://issues.apache.org/jira/browse/HUDI-7373 > Project: Apache Hudi > Issue Type: Task >Reporter: Jonathan Vexler >Assignee: Jonathan Vexler >Priority: Critical > > hoodie.write.set.null.for.missing.columns was renamed to > hoodie.write.handle.missing.cols.with.lossless.type.promotion. The config > only adds the missing columns. The "reverse type promotion" is not controlled > by a config.
[jira] [Updated] (HUDI-7373) Revert config name back to hoodie.write.set.null.for.missing.columns in master
[ https://issues.apache.org/jira/browse/HUDI-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler updated HUDI-7373: -- Status: In Progress (was: Open) > Revert config name back to hoodie.write.set.null.for.missing.columns in master > -- > > Key: HUDI-7373 > URL: https://issues.apache.org/jira/browse/HUDI-7373 > Project: Apache Hudi > Issue Type: Task >Reporter: Jonathan Vexler >Assignee: Jonathan Vexler >Priority: Critical > > hoodie.write.set.null.for.missing.columns was renamed to > hoodie.write.handle.missing.cols.with.lossless.type.promotion. The feature > only adds the missing columns. The "reverse type promotion" is not controlled > by a config.
[jira] [Updated] (HUDI-7373) Revert config name back to hoodie.write.set.null.for.missing.columns in master
[ https://issues.apache.org/jira/browse/HUDI-7373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Vexler updated HUDI-7373: -- Status: Patch Available (was: In Progress) > Revert config name back to hoodie.write.set.null.for.missing.columns in master > -- > > Key: HUDI-7373 > URL: https://issues.apache.org/jira/browse/HUDI-7373 > Project: Apache Hudi > Issue Type: Task >Reporter: Jonathan Vexler >Assignee: Jonathan Vexler >Priority: Critical > > hoodie.write.set.null.for.missing.columns was renamed to > hoodie.write.handle.missing.cols.with.lossless.type.promotion. The feature > only adds the missing columns. The "reverse type promotion" is not controlled > by a config.
[jira] [Created] (HUDI-7373) Revert config name back to hoodie.write.set.null.for.missing.columns in master
Jonathan Vexler created HUDI-7373: - Summary: Revert config name back to hoodie.write.set.null.for.missing.columns in master Key: HUDI-7373 URL: https://issues.apache.org/jira/browse/HUDI-7373 Project: Apache Hudi Issue Type: Task Reporter: Jonathan Vexler Assignee: Jonathan Vexler hoodie.write.set.null.for.missing.columns was renamed to hoodie.write.handle.missing.cols.with.lossless.type.promotion. The feature only adds the missing columns. The "reverse type promotion" is not controlled by a config.
Re: [PR] [HUDI-6902] Fix string format of the timestamp in a test [hudi]
hudi-bot commented on PR #10610: URL: https://github.com/apache/hudi/pull/10610#issuecomment-1924513079 ## CI report: * 6af0095c396fdf2b589fe1f6af842f42d4e41bcd UNKNOWN
[PR] [HUDI-6902] Fix string format of the timestamp in a test [hudi]
linliu-code opened a new pull request, #10610: URL: https://github.com/apache/hudi/pull/10610 ### Change Logs The underlying reason is that in Hive 2.x, the string format of TimestampWritable depends on the length of the underlying timestamp. To make the comparison stable, we harden the string format. ### Impact Less flaky. ### Risk level (write none, low medium or high below) None. ### Contributor's checklist - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute) - [ ] Change Logs and Impact were stated clearly - [ ] Adequate tests were added if applicable - [ ] CI passed
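A minimal sketch of the hardening described above (illustrative only, not the PR's actual test code — the class and method names here are assumptions): because Hive 2.x's TimestampWritable.toString() drops trailing zeros in the fractional seconds, the same instant can print as "1970-01-01 00:00:01" or "1970-01-01 00:00:01.0". Normalizing both sides of the comparison to a fixed-width pattern removes the instability.

```java
import java.sql.Timestamp;
import java.time.format.DateTimeFormatter;

public class StableTimestampFormat {
  // Fixed-width pattern: always exactly three fractional digits, so the
  // rendered string no longer depends on how the timestamp was constructed.
  private static final DateTimeFormatter FIXED =
      DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

  public static String normalize(Timestamp ts) {
    return ts.toLocalDateTime().format(FIXED);
  }

  public static void main(String[] args) {
    // Both denote the same instant and normalize to the same string.
    System.out.println(normalize(Timestamp.valueOf("1970-01-01 00:00:01")));
    System.out.println(normalize(Timestamp.valueOf("1970-01-01 00:00:01.000")));
  }
}
```

Comparing the normalized strings instead of the raw toString() output is what makes the assertion stable across Hive versions.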
Re: [I] [SUPPORT] Inconsistent Checkpoint Size in Flink Applications with MoR [hudi]
FranMorilloAWS commented on issue #10329: URL: https://github.com/apache/hudi/issues/10329#issuecomment-1924469731 How is the bucket number automatically expanded? I saw in the pfr that it is meant to be a subtask in the clustering service, but clustering only works with COW and insert. Will hudi + flink have any rewrite api in flink to remove /compact small files?
Re: [I] RLI Spark Hudi Error occurs when executing map [hudi]
maheshguptags commented on issue #10609: URL: https://github.com/apache/hudi/issues/10609#issuecomment-1923978525 cc: @codope @ad1happy2go @bhasudha
[I] RLI Spark Hudi Error occurs when executing map [hudi]
maheshguptags opened a new issue, #10609: URL: https://github.com/apache/hudi/issues/10609 I am trying to ingest the data using spark+kafka streaming to hudi table with the RLI index. but unfortunately ingesting 5-10 records is throwing the below issue. Steps to reproduce the behavior: 1. first build dependency for hudi 14 and spark 3.4 2. add hudi RLI index **Expected behavior** it should work end to end with RLI index enable **Environment Description** * Hudi version : 14 * Spark version : 3.4.0 * Hive version : NA * Hadoop version : 3.3.4 * Storage (HDFS/S3/GCS..) : S3 * Running on Docker? (yes/no) : Yes **Additional context** Hudi Configuration: val hudiOptions = Map( "hoodie.table.name" -> "customer_profile", "hoodie.datasource.write.recordkey.field" -> "x,y", "hoodie.datasource.write.partitionpath.field" -> "x", "hoodie.datasource.write.precombine.field" -> "ts", "hoodie.table.type" -> "COPY_ON_WRITE", "hoodie.clean.max.commits" -> "6", "hoodie.clean.trigger.strategy" -> "NUM_COMMITS", "hoodie.cleaner.commits.retained" -> "4", "hoodie.cleaner.parallelism" -> "50", "hoodie.clean.automatic" -> "true", "hoodie.clean.async" -> "true", "hoodie.parquet.compression.codec" -> "snappy", "hoodie.index.type" -> "RECORD_INDEX", "hoodie.metadata.record.index.enable" -> "true", "hoodie.metadata.record.index.min.filegroup.count " -> "20", # in trial "hoodie.metadata.record.index.max.filegroup.count" -> "5000" ) **Stacktrace** ``` 24/02/02 13:51:46 INFO BlockManagerInfo: Added broadcast_86_piece0 in memory on 10.224.52.183:42743 (size: 161.7 KiB, free: 413.7 MiB) 24/02/02 13:51:46 INFO BlockManagerInfo: Added broadcast_86_piece0 in memory on 10.224.50.139:39367 (size: 161.7 KiB, free: 413.7 MiB) 24/02/02 13:51:46 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 40 to 10.224.50.139:55724 24/02/02 13:51:46 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 40 to 10.224.52.183:34940 24/02/02 13:51:46 INFO 
MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 40 to 10.224.50.159:33310 24/02/02 13:51:47 INFO TaskSetManager: Starting task 9.0 in stage 148.0 (TID 553) (10.224.53.172, executor 3, partition 9, NODE_LOCAL, 7189 bytes) 24/02/02 13:51:47 WARN TaskSetManager: Lost task 1.0 in stage 148.0 (TID 545) (10.224.53.172 executor 3): org.apache.hudi.exception.HoodieException: org.apache.hudi.exception.HoodieException: Error occurs when executing map at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source) at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) at java.base/java.lang.reflect.Constructor.newInstance(Unknown Source) at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(Unknown Source) at java.base/java.util.concurrent.ForkJoinTask.reportException(Unknown Source) at java.base/java.util.concurrent.ForkJoinTask.invoke(Unknown Source) at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateParallel(Unknown Source) at java.base/java.util.stream.AbstractPipeline.evaluate(Unknown Source) at java.base/java.util.stream.ReferencePipeline.collect(Unknown Source) at org.apache.hudi.common.engine.HoodieLocalEngineContext.map(HoodieLocalEngineContext.java:84) at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:270) at org.apache.hudi.metadata.BaseTableMetadata.readRecordIndex(BaseTableMetadata.java:296) at org.apache.hudi.index.SparkMetadataTableRecordIndex$RecordIndexFileGroupLookupFunction.call(SparkMetadataTableRecordIndex.java:170) at org.apache.hudi.index.SparkMetadataTableRecordIndex$RecordIndexFileGroupLookupFunction.call(SparkMetadataTableRecordIndex.java:157) at org.apache.spark.api.java.JavaRDDLike.$anonfun$mapPartitionsToPair$1(JavaRDDLike.scala:186) at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:853) at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:853) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364) at org.apache.spark.rdd.RDD.iterator(RDD.scala:328) at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:101) at
Re: [PR] [HUDI-7351] Implement partition pushdown for glue [hudi]
hudi-bot commented on PR #10604: URL: https://github.com/apache/hudi/pull/10604#issuecomment-1923753603 ## CI report: * 8299f34e4d6caba0abbd8a74bb0963c9450b6c35 UNKNOWN * 3e3f138597900a7285b45f3df639e6c44b281571 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22293)
Re: [PR] [HUDI-1517][HUDI-6758][HUDI-6761] Adding support for per-logfile marker to track all log files added by a commit and to assist with rollbacks [hudi]
codope commented on code in PR #9553: URL: https://github.com/apache/hudi/pull/9553#discussion_r1314261872 ## hudi-common/src/main/java/org/apache/hudi/common/data/HoodieListPairData.java: ## @@ -191,6 +191,30 @@ public HoodiePairData>> leftOuterJoin(HoodiePairData(leftOuterJoined, lazy); } + @Override + public HoodiePairData> join(HoodiePairData other) { +ValidationUtils.checkArgument(other instanceof HoodieListPairData); + +// Transform right-side container to a multi-map of [[K]] to [[List]] values +HashMap> rightStreamMap = ((HoodieListPairData) other).asStream().collect( +Collectors.groupingBy( +Pair::getKey, +HashMap::new, +Collectors.mapping(Pair::getValue, Collectors.toList(; Review Comment: Here, we're converting the right-side of the join (`other`) into a Stream, and then using the collect method to aggregate this stream into a HashMap (`rightStreamMap`). This map holds all keys and associated values of the right side in memory. If the `other` dataset is large, this could lead to significant memory usage. Maybe just the keys can be held in-memory for presence check. Something like below: ``` public HoodiePairData> join(HoodiePairData other) { ValidationUtils.checkArgument(other instanceof HoodieListPairData); // Transform right-side container to a multi-map of [[K]] to [[List]] values Map> rightStreamMap = ((HoodieListPairData) other).asStream().collect( Collectors.groupingBy( Pair::getKey, Collectors.mapping(Pair::getValue, Collectors.toList(; List>> joinResult = new ArrayList<>(); asStream().forEach(pair -> { K key = pair.getKey(); V leftValue = pair.getValue(); List rightValues = rightStreamMap.getOrDefault(key, Collections.emptyList()); for (W rightValue : rightValues) { joinResult.add(Pair.of(key, Pair.of(leftValue, rightValue))); } }); return new HoodieListPairData<>(joinResult.stream(), lazy); } ``` -- This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
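The memory trade-off described in the review can be reproduced outside Hudi. Below is a minimal standalone sketch of the grouping-based hash join, using `java.util.AbstractMap.SimpleEntry` as a hypothetical stand-in for Hudi's `Pair` (the real method lives in `HoodieListPairData` and returns a lazy `HoodiePairData`):

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Standalone sketch of the join strategy discussed in the review.
// Map.Entry stands in for Hudi's Pair; no Hudi classes are used here.
public class ListPairJoinSketch {

  // Inner join: group the right side by key (this multimap is the part the
  // review flags as memory-heavy for a large right side), then walk the
  // left side and emit one (key, (leftValue, rightValue)) pair per match.
  static <K, V, W> List<Map.Entry<K, Map.Entry<V, W>>> join(
      List<Map.Entry<K, V>> left, List<Map.Entry<K, W>> right) {
    Map<K, List<W>> rightByKey = right.stream().collect(
        Collectors.groupingBy(
            Map.Entry::getKey,
            Collectors.mapping(Map.Entry::getValue, Collectors.toList())));

    List<Map.Entry<K, Map.Entry<V, W>>> result = new ArrayList<>();
    for (Map.Entry<K, V> l : left) {
      for (W w : rightByKey.getOrDefault(l.getKey(), Collections.emptyList())) {
        result.add(new AbstractMap.SimpleEntry<>(
            l.getKey(), new AbstractMap.SimpleEntry<>(l.getValue(), w)));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, Integer>> left = List.of(
        new AbstractMap.SimpleEntry<>("a", 1),
        new AbstractMap.SimpleEntry<>("b", 2));
    List<Map.Entry<String, String>> right = List.of(
        new AbstractMap.SimpleEntry<>("a", "x"),
        new AbstractMap.SimpleEntry<>("a", "y"));
    System.out.println(join(left, right)); // "b" has no right-side match and is dropped
  }
}
```

The review's suggestion of keeping only right-side keys in memory for a presence check would help when values are large relative to keys; this sketch keeps the full multimap, mirroring the PR code.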
Re: [PR] [HUDI-7351] Implement partition pushdown for glue [hudi]
hudi-bot commented on PR #10604: URL: https://github.com/apache/hudi/pull/10604#issuecomment-1923494239 ## CI report: * 8299f34e4d6caba0abbd8a74bb0963c9450b6c35 UNKNOWN * 9b3f2ebbfbdcd615d407ef89e0c4575e6c3f669b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22278) * 3e3f138597900a7285b45f3df639e6c44b281571 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22293) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-7351] Implement partition pushdown for glue [hudi]
hudi-bot commented on PR #10604: URL: https://github.com/apache/hudi/pull/10604#issuecomment-1923482651 ## CI report: * 8299f34e4d6caba0abbd8a74bb0963c9450b6c35 UNKNOWN * 9b3f2ebbfbdcd615d407ef89e0c4575e6c3f669b Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22278) * 3e3f138597900a7285b45f3df639e6c44b281571 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1923467631 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN * dec08647f16ec8ecdb12534d3c94aa76de3de5c2 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22292) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build
Re: [I] [SUPPORT] Executor executes action [commits the instant 20240202161708414] error [hudi]
Toroidals commented on issue #10608: URL: https://github.com/apache/hudi/issues/10608#issuecomment-1923451819 > **_Tips before filing an issue_** > > * Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? > * Join the mailing list to engage in conversations and get faster support at [dev-subscr...@hudi.apache.org](mailto:dev-subscr...@hudi.apache.org). > * If you have triaged this as a bug, then file an [issue](https://issues.apache.org/jira/projects/HUDI/issues) directly. > > **Describe the problem you faced** > > When using two Flink programs to write to different partitions of the same Hudi table, with the parameter set as options.put(FlinkOptions.WRITE_CLIENT_ID.key(), String.valueOf(System.currentTimeMillis())); the following error occurred: ![image](https://github.com/apache/hudi/assets/54655412/85b54504-073b-4580-890b-523b24948cac) > > 2024-02-02 17:21:12 org.apache.flink.runtime.JobException: at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:138) at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getGlobalFailureHandlingResult(ExecutionFailureHandler.java:101) at org.apache.flink.runtime.scheduler.DefaultScheduler.handleGlobalFailure(DefaultScheduler.java:322) at org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder$LazyInitializedCoordinatorContext.lambda$failJob$0(OperatorCoordinatorHolder.java:574) at
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRunAsync$4(AkkaRpcActor.java:443) at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:443) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:213) at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:78) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:163) at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24) at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20) at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) at akka.actor.Actor.aroundReceive(Actor.scala:537) at akka.actor.Actor.aroundReceive$(Actor.scala:535) at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580) at akka.actor.ActorCell.invoke(ActorCell.scala:548) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270) at akka.dispatch.Mailbox.run(Mailbox.scala:231) at akka.dispatch.Mailbox.exec(Mailbox.scala:243) at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157) Caused by: org.apache.flink.util.FlinkException: Global failure triggered by OperatorCoordinator
for 'consistent_bucket_write: default_database.hudi_rbs_rbscmfprd_cmf_wf_operation_log_cdc_qy_test' (operator ab5eb0c735d351ddaa2e080f1564920d). at org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder$LazyInitializedCoordinatorContext.failJob(OperatorCoordinatorHolder.java:556) at org.apache.hudi.sink.StreamWriteOperatorCoordinator.lambda$start$0(StreamWriteOperatorCoordinator.java:196) at org.apache.hudi.sink.utils.NonThrownExecutor.handleException(NonThrownExecutor.java:142) at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$wrapAction$0(NonThrownExecutor.java:133) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hudi.exception.HoodieException: Executor executes action [commits the instant 20240202171450091] error ... 6 more Caused by: java.lang.IllegalArgumentException at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:33) at
Re: [PR] [HUDI-7351] Implement partition pushdown for glue [hudi]
parisni commented on PR #10604: URL: https://github.com/apache/hudi/pull/10604#issuecomment-1923447852 I e2e-tested locally. We will land this patch in production next week, so let me confirm it's all right then. BTW, I added a refactor; let me know if it's better.
[I] [SUPPORT] Executor executes action [commits the instant 20240202161708414] error [hudi]
Toroidals opened a new issue, #10608: URL: https://github.com/apache/hudi/issues/10608 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)? - Join the mailing list to engage in conversations and get faster support at dev-subscr...@hudi.apache.org. - If you have triaged this as a bug, then file an [issue](https://issues.apache.org/jira/projects/HUDI/issues) directly. **Describe the problem you faced** When using two Flink programs to write to different partitions of the same table in Hudi, with the parameter set as: options.put(FlinkOptions.WRITE_CLIENT_ID.key(), String.valueOf(System.currentTimeMillis())); the following error occurred: ![image](https://github.com/apache/hudi/assets/54655412/85b54504-073b-4580-890b-523b24948cac) 2024-02-02 17:21:12 org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:138) at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getGlobalFailureHandlingResult(ExecutionFailureHandler.java:101) at org.apache.flink.runtime.scheduler.DefaultScheduler.handleGlobalFailure(DefaultScheduler.java:322) at org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder$LazyInitializedCoordinatorContext.lambda$failJob$0(OperatorCoordinatorHolder.java:574) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRunAsync$4(AkkaRpcActor.java:443) at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:443) at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:213) at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:78) at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:163) at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24) at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20) at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) at akka.actor.Actor.aroundReceive(Actor.scala:537) at akka.actor.Actor.aroundReceive$(Actor.scala:535) at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580) at akka.actor.ActorCell.invoke(ActorCell.scala:548) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270) at akka.dispatch.Mailbox.run(Mailbox.scala:231) at akka.dispatch.Mailbox.exec(Mailbox.scala:243) at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157) Caused by: org.apache.flink.util.FlinkException: Global failure triggered by OperatorCoordinator for 'consistent_bucket_write: default_database.hudi_rbs_rbscmfprd_cmf_wf_operation_log_cdc_qy_test' (operator ab5eb0c735d351ddaa2e080f1564920d). 
at org.apache.flink.runtime.operators.coordination.OperatorCoordinatorHolder$LazyInitializedCoordinatorContext.failJob(OperatorCoordinatorHolder.java:556) at org.apache.hudi.sink.StreamWriteOperatorCoordinator.lambda$start$0(StreamWriteOperatorCoordinator.java:196) at org.apache.hudi.sink.utils.NonThrownExecutor.handleException(NonThrownExecutor.java:142) at org.apache.hudi.sink.utils.NonThrownExecutor.lambda$wrapAction$0(NonThrownExecutor.java:133) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hudi.exception.HoodieException: Executor executes action [commits the instant 20240202171450091] error ... 6 more Caused by: java.lang.IllegalArgumentException at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:33) at
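The report's mitigation (a distinct write client id per concurrent Flink job) can be sketched without Hudi. This is a hypothetical, Hudi-free illustration: the literal `"write.client.id"` key is an assumption standing in for whatever `FlinkOptions.WRITE_CLIENT_ID.key()` resolves to, and the UUID variant is a suggestion rather than code from the report:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hudi-free sketch: give each concurrent Flink writer its own client id so
// the two jobs can be told apart by the coordinator. The key string below
// is an assumption for FlinkOptions.WRITE_CLIENT_ID.key().
public class WriterIdSketch {
  static Map<String, String> writerOptions(String clientId) {
    Map<String, String> options = new HashMap<>();
    options.put("write.client.id", clientId);
    return options;
  }

  public static void main(String[] args) {
    // The report uses String.valueOf(System.currentTimeMillis()); a UUID
    // avoids the corner case of two jobs starting in the same millisecond.
    Map<String, String> jobA = writerOptions(UUID.randomUUID().toString());
    Map<String, String> jobB = writerOptions(UUID.randomUUID().toString());
    System.out.println(jobA.get("write.client.id").equals(jobB.get("write.client.id")));
  }
}
```

Note that a millisecond timestamp can collide when two jobs launch at the same instant, which is one way the reported setup could still produce identical client ids.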
[jira] [Created] (HUDI-7372) Remove thrift usage from building process
Lin Liu created HUDI-7372: - Summary: Remove thrift usage from building process Key: HUDI-7372 URL: https://issues.apache.org/jira/browse/HUDI-7372 Project: Apache Hudi Issue Type: Bug Reporter: Lin Liu Assignee: Lin Liu -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HUDI-7372) Remove thrift usage from building process
[ https://issues.apache.org/jira/browse/HUDI-7372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lin Liu updated HUDI-7372: -- Description: In the current build process, a thrift file is compiled on the fly by running the thrift command via docker run. Since the GH CI itself runs in containers, this means we are using docker A to generate files inside docker B; it is unclear what risks or complexity that adds. I propose that we run the thrift command offline to generate the classes and check them into the hudi repo. Then we can: 1. reduce the complexity of the compile process; 2. reduce the compile time in the CI tests; 3. remove the requirement to install thrift in our build environment. Meanwhile, I did not see any benefit to building these classes on the fly. > Remove thrift usage from building process > - > > Key: HUDI-7372 > URL: https://issues.apache.org/jira/browse/HUDI-7372 > Project: Apache Hudi > Issue Type: Bug > Reporter: Lin Liu > Assignee: Lin Liu > Priority: Major > > In the current build process, a thrift file is compiled on the fly by running > the thrift command via docker run. Since the GH CI itself runs in containers, > this means we are using docker A to generate files inside docker B; it is > unclear what risks or complexity that adds. I propose that we run the thrift > command offline to generate the classes and check them into the hudi repo. > Then we can: 1. reduce the complexity of the compile process; 2. reduce the > compile time in the CI tests; 3. remove the requirement to install thrift in > our build environment. Meanwhile, I did not see any benefit to building these > classes on the fly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (HUDI-7368) Integrate Flink with file group reader
[ https://issues.apache.org/jira/browse/HUDI-7368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xy reassigned HUDI-7368: Assignee: xy > Integrate Flink with file group reader > -- > > Key: HUDI-7368 > URL: https://issues.apache.org/jira/browse/HUDI-7368 > Project: Apache Hudi > Issue Type: Task > Components: flink-sql >Reporter: Danny Chen >Assignee: xy >Priority: Major > Fix For: 1.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1923298696 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN * 26ba2d427dd4551dc69e47a30ffadbc7563202c5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22291) * dec08647f16ec8ecdb12534d3c94aa76de3de5c2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22292) Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build
Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]
hudi-bot commented on PR #10512: URL: https://github.com/apache/hudi/pull/10512#issuecomment-1923288240 ## CI report: * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN * 26ba2d427dd4551dc69e47a30ffadbc7563202c5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22291) * dec08647f16ec8ecdb12534d3c94aa76de3de5c2 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run azure` re-run the last Azure build