Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1923171230

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * 26ba2d427dd4551dc69e47a30ffadbc7563202c5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22291)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1923143547

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * cd679c9540cac6994a7b24c20440df8f33376e55 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22290)
 
   * 26ba2d427dd4551dc69e47a30ffadbc7563202c5 UNKNOWN
   
   



[jira] [Assigned] (HUDI-7371) Integrate new file group reader into MergeOnReadInputFormat

2024-02-01 Thread xy (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xy reassigned HUDI-7371:


Assignee: xy

> Integrate new file group reader into MergeOnReadInputFormat
> ---
>
> Key: HUDI-7371
> URL: https://issues.apache.org/jira/browse/HUDI-7371
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: flink
>Reporter: Danny Chen
>Assignee: xy
>Priority: Major
> Fix For: 1.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HUDI-7369) Add a native hoodie record implementation for Flink

2024-02-01 Thread xy (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xy reassigned HUDI-7369:


Assignee: xy

> Add a native hoodie record implementation for Flink
> ---
>
> Key: HUDI-7369
> URL: https://issues.apache.org/jira/browse/HUDI-7369
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: flink
>Reporter: Danny Chen
>Assignee: xy
>Priority: Major
> Fix For: 1.0.0
>
>






[jira] [Assigned] (HUDI-7370) Add reader context for Flink

2024-02-01 Thread xy (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xy reassigned HUDI-7370:


Assignee: xy

> Add reader context for Flink
> 
>
> Key: HUDI-7370
> URL: https://issues.apache.org/jira/browse/HUDI-7370
> Project: Apache Hudi
>  Issue Type: Sub-task
>  Components: flink
>Reporter: Danny Chen
>Assignee: xy
>Priority: Major
> Fix For: 1.0.0
>
>






Re: [PR] [HUDI-9424]Support using local timezone when writing flink TIMESTAMP data [hudi]

2024-02-01 Thread via GitHub


cmmp6 commented on code in PR #10594:
URL: https://github.com/apache/hudi/pull/10594#discussion_r1475571405


##
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/utils/TestRowDataToAvroConverters.java:
##
@@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utils;
+
+import org.apache.avro.generic.GenericRecord;
+import org.apache.flink.formats.common.TimestampFormat;
+import org.apache.flink.formats.json.JsonToRowDataConverters;
+import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.JsonProcessingException;
+import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper;
+import org.apache.flink.table.api.DataTypes;
+import org.apache.flink.table.types.DataType;
+import org.apache.flink.table.types.logical.RowType;
+import org.apache.hudi.util.AvroSchemaConverter;
+import org.apache.hudi.util.RowDataToAvroConverters;
+
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+import java.time.Instant;
+import java.time.LocalDateTime;
+import java.time.ZoneId;
+import java.time.format.DateTimeFormatter;
+import java.util.TimeZone;
+
+import static org.apache.flink.table.api.DataTypes.ROW;
+import static org.apache.flink.table.api.DataTypes.FIELD;
+import static org.apache.flink.table.api.DataTypes.TIMESTAMP;
+
+class TestRowDataToAvroConverters {
+
+  DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
+  @Test
+  void testRowDataToAvroStringToRowDataWithLocalTimezone1() throws JsonProcessingException {
+TimeZone.setDefault(TimeZone.getTimeZone(ZoneId.of("Asia/Shanghai")));
+String timestampFromUtc8 = "2021-03-30 15:44:29";
+
+DataType rowDataType = ROW(FIELD("timestamp_from_utc_8", TIMESTAMP()));
+JsonToRowDataConverters.JsonToRowDataConverter jsonToRowDataConverter =
+new JsonToRowDataConverters(true, true, TimestampFormat.SQL)
+.createConverter(rowDataType.getLogicalType());
+Object rowData = jsonToRowDataConverter.convert(new ObjectMapper().readTree("{\"timestamp_from_utc_8\":\"" + timestampFromUtc8 + "\"}"));
+

Review Comment:
   OK, I will add ITs.






Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1922839860

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * cd679c9540cac6994a7b24c20440df8f33376e55 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22290)
 
   
   



Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1922831647

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * 08d2def477e1ab733163ed5d2206971fdbaee583 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22284)
 
   * cd679c9540cac6994a7b24c20440df8f33376e55 UNKNOWN
   
   



Re: [PR] [MINOR] Add UT org.apache.hudi.io.TestHoodieTimelineArchiver#testRetryArchivalAfterPreviousFailedDeletion [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10605:
URL: https://github.com/apache/hudi/pull/10605#issuecomment-1922821516

   
   ## CI report:
   
   * cead8c73b454e60033d3fc75c8f4482be61f6e5c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22288)
 
   
   



Re: [PR] [HUDI-7367] Add makeQualified APIs [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10607:
URL: https://github.com/apache/hudi/pull/10607#issuecomment-1922767925

   
   ## CI report:
   
   * 5aaa8d7fb5e359585130a1279013d5b7bbd7fc78 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22287)
 
   
   



Re: [I] The Schema Evolution Not working For Hudi 0.12.3 [hudi]

2024-02-01 Thread via GitHub


lei-su-awx commented on issue #10309:
URL: https://github.com/apache/hudi/issues/10309#issuecomment-1922740654

   Hi @ad1happy2go 0.14.1 worked fine. Thanks





(hudi) branch master updated: [HUDI-6902] Fix a test about timestamp format (#10606)

2024-02-01 Thread yihua
This is an automated email from the ASF dual-hosted git repository.

yihua pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new ed6b0727f0f [HUDI-6902] Fix a test about timestamp format (#10606)
ed6b0727f0f is described below

commit ed6b0727f0f004a20167bb4574d42d2bbc3ead48
Author: Lin Liu <141371752+linliu-c...@users.noreply.github.com>
AuthorDate: Thu Feb 1 18:18:41 2024 -0800

[HUDI-6902] Fix a test about timestamp format (#10606)
---
 .../java/org/apache/hudi/hadoop/TestHoodieParquetInputFormat.java | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hudi-hadoop-mr/src/test/java/org/apache/hudi/hadoop/TestHoodieParquetInputFormat.java b/hudi-hadoop-mr/src/test/java/org/apache/hudi/hadoop/TestHoodieParquetInputFormat.java
index d71055079c2..37d625a599f 100644
--- a/hudi-hadoop-mr/src/test/java/org/apache/hudi/hadoop/TestHoodieParquetInputFormat.java
+++ b/hudi-hadoop-mr/src/test/java/org/apache/hudi/hadoop/TestHoodieParquetInputFormat.java
@@ -67,12 +67,14 @@ import java.io.FileOutputStream;
 import java.io.IOException;
 import java.nio.file.Paths;
 import java.sql.Timestamp;
+import java.text.SimpleDateFormat;
 import java.time.Instant;
 import java.time.LocalDate;
 import java.time.LocalDateTime;
 import java.time.ZoneOffset;
 import java.util.ArrayList;
 import java.util.Collections;
+import java.util.Date;
 import java.util.List;
 
 import static org.apache.hudi.common.table.timeline.TimelineMetadataUtils.serializeCommitMetadata;
@@ -815,7 +817,11 @@ public class TestHoodieParquetInputFormat {
   Instant.ofEpochMilli(testTimestampLong), ZoneOffset.UTC);
   assertEquals(Timestamp.valueOf(localDateTime).toString(), String.valueOf(writable.get()[0]));
 } else {
-  assertEquals(new Timestamp(testTimestampLong).toString(), String.valueOf(writable.get()[0]));
+  Date date = new Date();
+  date.setTime(testTimestampLong);
+  assertEquals(
+  new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS").format(date),
+  String.valueOf(writable.get()[0]));
 }
 // test long
 assertEquals(testTimestampLong * 1000, ((LongWritable) writable.get()[1]).get());
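
For readers wondering why the two formatting paths in this diff can disagree, here is a minimal, self-contained sketch (not part of the commit). It relies only on documented JDK behavior: `java.sql.Timestamp#toString` drops trailing zeros from the fractional seconds, while a `SimpleDateFormat` pattern ending in `.SSS` always prints three millisecond digits. The class name, method names, and sample epoch values are made up for illustration, and whether this is the exact cause of the flakiness in the test above is an assumption.

```java
import java.sql.Timestamp;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class TimestampFormatSketch {

    // Formats epoch millis the way the removed assertion did.
    static String viaTimestamp(long epochMillis) {
        return new Timestamp(epochMillis).toString();
    }

    // Formats epoch millis with a fixed three-digit millisecond field.
    // Locale.ROOT is used here only to keep the digits ASCII on any machine.
    static String viaSimpleDateFormat(long epochMillis) {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS", Locale.ROOT)
            .format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        long endsInZero = 1617090269120L;   // 120 ms past the second (hypothetical sample)
        long noTrailingZero = 1617090269123L; // 123 ms past the second (hypothetical sample)

        // The two strings differ only when the millisecond value ends in zero.
        System.out.println(viaTimestamp(endsInZero) + " vs " + viaSimpleDateFormat(endsInZero));
        System.out.println(viaTimestamp(noTrailingZero) + " vs " + viaSimpleDateFormat(noTrailingZero));
    }
}
```

Both helpers use the JVM default time zone, so the date and time fields agree; only the fractional part can differ.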



Re: [PR] [HUDI-7365] Fix a flaky test for timestamp output format [hudi]

2024-02-01 Thread via GitHub


yihua merged PR #10606:
URL: https://github.com/apache/hudi/pull/10606





Re: [PR] [HUDI-7365] Fix a flaky test for timestamp output format [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10606:
URL: https://github.com/apache/hudi/pull/10606#issuecomment-1922666297

   
   ## CI report:
   
   * dbbd2fdb81e942f7c03bb2bc9ad9a47cf6b29d3a Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22285)
 
   
   



Re: [PR] [MINOR] Add UT org.apache.hudi.io.TestHoodieTimelineArchiver#testRetryArchivalAfterPreviousFailedDeletion [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10605:
URL: https://github.com/apache/hudi/pull/10605#issuecomment-1922660097

   
   ## CI report:
   
   * 9bc6d4a0b141dea1c67c5b6c9beeaad28548a222 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22280)
 
   * cead8c73b454e60033d3fc75c8f4482be61f6e5c Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22288)
 
   
   



Re: [PR] [MINOR] Add UT org.apache.hudi.io.TestHoodieTimelineArchiver#testRetryArchivalAfterPreviousFailedDeletion [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10605:
URL: https://github.com/apache/hudi/pull/10605#issuecomment-1922650451

   
   ## CI report:
   
   * 9bc6d4a0b141dea1c67c5b6c9beeaad28548a222 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22280)
 
   * cead8c73b454e60033d3fc75c8f4482be61f6e5c UNKNOWN
   
   



Re: [PR] [HUDI-7365] Fix a flaky test for timestamp output format [hudi]

2024-02-01 Thread via GitHub


linliu-code commented on PR #10606:
URL: https://github.com/apache/hudi/pull/10606#issuecomment-1922647314

   @yihua @rmahindra123 





[jira] [Created] (HUDI-7371) Integrate new file group reader into MergeOnReadInputFormat

2024-02-01 Thread Danny Chen (Jira)
Danny Chen created HUDI-7371:


 Summary: Integrate new file group reader into MergeOnReadInputFormat
 Key: HUDI-7371
 URL: https://issues.apache.org/jira/browse/HUDI-7371
 Project: Apache Hudi
  Issue Type: Sub-task
  Components: flink
Reporter: Danny Chen
 Fix For: 1.0.0








[jira] [Created] (HUDI-7370) Add reader context for Flink

2024-02-01 Thread Danny Chen (Jira)
Danny Chen created HUDI-7370:


 Summary: Add reader context for Flink
 Key: HUDI-7370
 URL: https://issues.apache.org/jira/browse/HUDI-7370
 Project: Apache Hudi
  Issue Type: Sub-task
  Components: flink
Reporter: Danny Chen
 Fix For: 1.0.0








[jira] [Created] (HUDI-7369) Add a native hoodie record implementation for Flink

2024-02-01 Thread Danny Chen (Jira)
Danny Chen created HUDI-7369:


 Summary: Add a native hoodie record implementation for Flink
 Key: HUDI-7369
 URL: https://issues.apache.org/jira/browse/HUDI-7369
 Project: Apache Hudi
  Issue Type: Sub-task
  Components: flink
Reporter: Danny Chen
 Fix For: 1.0.0








[jira] [Created] (HUDI-7368) Integrate Flink with file group reader

2024-02-01 Thread Danny Chen (Jira)
Danny Chen created HUDI-7368:


 Summary: Integrate Flink with file group reader
 Key: HUDI-7368
 URL: https://issues.apache.org/jira/browse/HUDI-7368
 Project: Apache Hudi
  Issue Type: Task
  Components: flink-sql
Reporter: Danny Chen
 Fix For: 1.0.0








Re: [PR] [MINOR] Add UT org.apache.hudi.io.TestHoodieTimelineArchiver#testRetryArchivalAfterPreviousFailedDeletion [hudi]

2024-02-01 Thread via GitHub


kbuci commented on code in PR #10605:
URL: https://github.com/apache/hudi/pull/10605#discussion_r1475403094


##
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/io/TestHoodieTimelineArchiver.java:
##
@@ -1582,6 +1582,36 @@ public void testPendingClusteringAfterArchiveCommit(boolean enableMetadata) thro
 "Since we have a pending clustering instant at 0002, we should never archive any commit after ");
   }
 
+  @Test

Review Comment:
   For context, I had a patch on my organization's internal older build of HUDI 
that had the same fix as 
[HUDI-7207](https://issues.apache.org/jira/browse/HUDI-7207) as well as this 
unit test. Although 
[HUDI-7207](https://issues.apache.org/jira/browse/HUDI-7207) has already been 
identified and resolved, I wanted to open a PR for adding this unit test in 
case the `apache/hudi` reviewers think it is useful (and if not I can close 
this PR).






Re: [PR] [MINOR] Add UT org.apache.hudi.io.TestHoodieTimelineArchiver#testRetryArchivalAfterPreviousFailedDeletion [hudi]

2024-02-01 Thread via GitHub


kbuci commented on code in PR #10605:
URL: https://github.com/apache/hudi/pull/10605#discussion_r1475398790


##
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/io/TestHoodieTimelineArchiver.java:
##
@@ -1582,6 +1582,36 @@ public void testPendingClusteringAfterArchiveCommit(boolean enableMetadata) thro
 "Since we have a pending clustering instant at 0002, we should never archive any commit after ");
   }
 
+  @Test
+  public void testRetryArchivalAfterPreviousFailedDeletion() throws Exception {
+HoodieWriteConfig writeConfig = initTestTableAndGetWriteConfig(true, 2, 4, 2);
+for (int i = 0; i <= 5; i++) {
+  testTable.doWriteOperation("10" + i, WriteOperationType.UPSERT, Arrays.asList("p1", "p2"), 1);
+}
+HoodieTable table = HoodieSparkTable.create(writeConfig, context, metaClient);
+HoodieTimelineArchiver archiver = new HoodieTimelineArchiver(writeConfig, table);
+
+HoodieTimeline timeline = metaClient.getActiveTimeline().getWriteTimeline();
+assertEquals(6, timeline.countInstants(), "Loaded 6 commits and the count should match");
+assertTrue(archiver.archiveIfRequired(context) > 0);
+// Simulate archival failing to delete by re-adding the .commit instant files
+// (101.commit, 102.commit, and 103.commit instant files)
+HoodieTestDataGenerator.createOnlyCompletedCommitFile(basePath, "101_1001", wrapperFs.getConf());

Review Comment:
   Unfortunately, this is a bit of a hack. Ideally we would somehow induce a 
failure during the DFS delete call in 
`org.apache.hudi.client.timeline.HoodieTimelineArchiver#deleteArchivedInstants`, 
but instead we have to re-add some completed commit instant files here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-7367] Add makeQualified APIs [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10607:
URL: https://github.com/apache/hudi/pull/10607#issuecomment-1922611817

   
   ## CI report:
   
   * 5aaa8d7fb5e359585130a1279013d5b7bbd7fc78 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22287)
 
   
   



Re: [PR] [HUDI-7367] Add makeQualified APIs [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10607:
URL: https://github.com/apache/hudi/pull/10607#issuecomment-1922602774

   
   ## CI report:
   
   * 5aaa8d7fb5e359585130a1279013d5b7bbd7fc78 UNKNOWN
   
   



Re: [PR] [HUDI-7351] Implement partition pushdown for glue [hudi]

2024-02-01 Thread via GitHub


danny0405 commented on PR #10604:
URL: https://github.com/apache/hudi/pull/10604#issuecomment-1922602186

   There are no UTs; can you verify the functionality with offline e2e tests?





Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10591:
URL: https://github.com/apache/hudi/pull/10591#issuecomment-1922596177

   
   ## CI report:
   
   * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN
   * 12afa66ed0b62dff350a10e46e1bc8dee88acb4e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22286)
 
   
   



Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1922596007

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * 08d2def477e1ab733163ed5d2206971fdbaee583 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22284)
 
   
   



[jira] [Commented] (HUDI-7207) Concurrent archiving and data reading leads to missing data in query results.

2024-02-01 Thread Danny Chen (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-7207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813442#comment-17813442
 ] 

Danny Chen commented on HUDI-7207:
--

I guess it should be fixed in the 1.0 release, because we added snapshot 
isolation to the new archiving.

> Concurrent archiving and data reading leads to missing data in query results.
> -
>
> Key: HUDI-7207
> URL: https://issues.apache.org/jira/browse/HUDI-7207
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ma Jian
>Priority: Blocker
>  Labels: pull-request-available
>
> Assuming there are 4 instants in a Hudi table that need to be archived, with 
> timestamps in ascending order (as they have been sorted after obtaining 
> {{instantToArchive}}): these are 1.deltacommit, 2.deltacommit, 
> 3.deltacommit, and 4.deltacommit, corresponding to the files a.parquet, 
> b.parquet, c.parquet, and d.parquet, respectively.
> In the archiving code, the deletion of instants is handled by the following 
> code snippet:
> {code:java}
> if (!completedInstants.isEmpty()) {
>     context.foreach(
>         completedInstants,
>         instant -> activeTimeline.deleteInstantFileIfExists(instant),
>         Math.min(completedInstants.size(), 
> config.getArchiveDeleteParallelism())
>     );
> }
>  {code}
> Different instants are distributed across different threads for execution. 
> For instance, in Spark with a parallelism of 2, they would be distributed as 
> 1 and 2, and 3 and 4. Consequently, there may be scenarios where instant 3 is 
> deleted before instant 2. If instants 1 and 3 are deleted while 2 and 4 are 
> not yet deleted, a query request obtaining 
> visibleCommitsAndCompactionTimeline at this point would find a timeline with 
> instants 2, 4, and so on.
> During a query, this would result in the data under c.parquet, corresponding 
> to instant 3, becoming completely invisible. I believe this is a very 
> problematic situation, as users could unknowingly retrieve incorrect data.
> Here are a few potential solutions I've considered:
> 1. Prohibit concurrent deletion of completed files. While this would ensure 
> the order of deletions, it could significantly impact performance, which is 
> not an optimal solution.
> 2. Implement a solution similar to a marker file, recording which instants are 
> in the process of being deleted, and then remove these instants directly from 
> the timeline during reads.
> 3. Based on the second solution, incorporate archiving by adding archive 
> instants to the timeline, allowing for direct retrieval of pending archives 
> during data reads. Here I have a question: why don't previous archives have 
> a corresponding instant action?
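
The interleaving described in the issue can be modeled in a few lines. The sketch below is a simplified model, not Hudi code: instants are plain integers, the timeline is a sorted set, and the class and method names (`ArchiveDeletionOrderSketch`, `hasHole`, `exposesHole`) are hypothetical. It assumes that a reader is only safe when the remaining instants form a contiguous tail of the original sorted list, and it checks that property after every single deletion.

```java
import java.util.List;
import java.util.TreeSet;

public class ArchiveDeletionOrderSketch {

    // True if some instant at or after the earliest remaining one has already
    // been deleted, i.e. a reader would see a timeline with a gap in it.
    static boolean hasHole(List<Integer> original, TreeSet<Integer> remaining) {
        if (remaining.isEmpty()) {
            return false;
        }
        int first = remaining.first();
        for (int instant : original) {
            if (instant >= first && !remaining.contains(instant)) {
                return true;
            }
        }
        return false;
    }

    // Replays deletions one by one and reports whether any intermediate
    // state of the timeline contains a gap.
    static boolean exposesHole(List<Integer> original, List<Integer> deletionOrder) {
        TreeSet<Integer> remaining = new TreeSet<>(original);
        for (int instant : deletionOrder) {
            remaining.remove(instant);
            if (hasHole(original, remaining)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Integer> instants = List.of(1, 2, 3, 4);
        // Two workers handling (1, 2) and (3, 4) may interleave as 1, 3, 2, 4.
        System.out.println("interleaved: " + exposesHole(instants, List.of(1, 3, 2, 4)));
        // Strictly ordered deletion never leaves a gap.
        System.out.println("ordered:     " + exposesHole(instants, List.of(1, 2, 3, 4)));
    }
}
```

In the interleaved order, the state after deleting instants 1 and 3 is {2, 4}, which is the situation the issue describes: instant 3 is gone from the active timeline while the older instant 2 is still present.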





[jira] [Updated] (HUDI-7367) Add makeQualified to HoodieLocation

2024-02-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-7367:
-
Labels: pull-request-available  (was: )

> Add makeQualified to HoodieLocation
> ---
>
> Key: HUDI-7367
> URL: https://issues.apache.org/jira/browse/HUDI-7367
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>






[PR] [HUDI-7367] Add makeQualified APIs [hudi]

2024-02-01 Thread via GitHub


yihua opened a new pull request, #10607:
URL: https://github.com/apache/hudi/pull/10607

   ### Change Logs
   
   This PR makes the following changes to properly support qualified location:
   - Adds `getUri` to `HoodieStorage`.
   - Adds `makeQualified` to `HoodieLocation`.
   - Adds `makeQualified` to `FSUtils`.
   - New tests for the functionality.
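
As a rough illustration of what "qualifying" a location means, here is a minimal sketch built only on `java.net.URI`. It is not the `makeQualified` implementation from this PR: the class name, the helper signature, and the rule of leaving already-schemed locations untouched are assumptions made for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative only: fills in a missing scheme and authority from a default URI.
final class QualifiedLocationSketch {

    // Hypothetical helper. A location that already has a scheme is returned as-is;
    // otherwise the scheme and authority are taken from defaultUri.
    static URI makeQualified(URI defaultUri, URI location) {
        if (location.getScheme() != null) {
            return location;
        }
        try {
            return new URI(
                defaultUri.getScheme(),
                defaultUri.getAuthority(),
                location.getPath(),
                null,
                null);
        } catch (URISyntaxException e) {
            // Happens, for example, when the path is relative.
            throw new IllegalArgumentException("Cannot qualify " + location, e);
        }
    }

    public static void main(String[] args) {
        URI storageUri = URI.create("hdfs://namenode:8020");
        System.out.println(makeQualified(storageUri, URI.create("/warehouse/tbl")));
        System.out.println(makeQualified(storageUri, URI.create("s3a://bucket/a/b")));
    }
}
```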
   
   ### Impact
   
   Properly support qualified location with `HoodieStorage` and 
`HoodieLocation`.  This fixes the original test failures in #10591.
   
   ### Risk level
   
   none
   
   ### Documentation Update
   
   N/A
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[jira] [Updated] (HUDI-7367) Add makeQualified to HoodieLocation

2024-02-01 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7367:

Priority: Blocker  (was: Major)

> Add makeQualified to HoodieLocation
> ---
>
> Key: HUDI-7367
> URL: https://issues.apache.org/jira/browse/HUDI-7367
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 1.0.0
>
>






[jira] [Assigned] (HUDI-7367) Add makeQualified to HoodieLocation

2024-02-01 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo reassigned HUDI-7367:
---

Assignee: Ethan Guo

> Add makeQualified to HoodieLocation
> ---
>
> Key: HUDI-7367
> URL: https://issues.apache.org/jira/browse/HUDI-7367
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
> Fix For: 1.0.0
>
>






[jira] [Updated] (HUDI-7367) Add makeQualified to HoodieLocation

2024-02-01 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7367:

Fix Version/s: 1.0.0

> Add makeQualified to HoodieLocation
> ---
>
> Key: HUDI-7367
> URL: https://issues.apache.org/jira/browse/HUDI-7367
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Priority: Major
> Fix For: 1.0.0
>
>






[jira] [Created] (HUDI-7367) Add makeQualified to HoodieLocation

2024-02-01 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-7367:
---

 Summary: Add makeQualified to HoodieLocation
 Key: HUDI-7367
 URL: https://issues.apache.org/jira/browse/HUDI-7367
 Project: Apache Hudi
  Issue Type: Improvement
Reporter: Ethan Guo








[jira] [Updated] (HUDI-7365) Fix a flaky test TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType

2024-02-01 Thread Lin Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Liu updated HUDI-7365:
--
Status: Patch Available  (was: In Progress)

> Fix a flaky test 
> TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType
> --
>
> Key: HUDI-7365
> URL: https://issues.apache.org/jira/browse/HUDI-7365
> Project: Apache Hudi
>  Issue Type: Sub-task
>Reporter: Lin Liu
>Assignee: Lin Liu
>Priority: Major
>  Labels: pull-request-available
>
> This error sometimes appears without any changes.
>  
> TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType:818 
> expected: <2024-02-01 07:36:39.0> but was: <2024-02-01 07:36:39>





[jira] [Updated] (HUDI-7365) Fix a flaky test TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType

2024-02-01 Thread Lin Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Liu updated HUDI-7365:
--
Status: In Progress  (was: Open)

> Fix a flaky test 
> TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType
> --
>
> Key: HUDI-7365
> URL: https://issues.apache.org/jira/browse/HUDI-7365
> Project: Apache Hudi
>  Issue Type: Sub-task
>Reporter: Lin Liu
>Assignee: Lin Liu
>Priority: Major
>  Labels: pull-request-available
>
> This error sometimes appears without any changes.
>  
> TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType:818 
> expected: <2024-02-01 07:36:39.0> but was: <2024-02-01 07:36:39>





[jira] [Updated] (HUDI-7365) Fix a flaky test TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType

2024-02-01 Thread Lin Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Liu updated HUDI-7365:
--
Parent: HUDI-6902
Issue Type: Sub-task  (was: Bug)

> Fix a flaky test 
> TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType
> --
>
> Key: HUDI-7365
> URL: https://issues.apache.org/jira/browse/HUDI-7365
> Project: Apache Hudi
>  Issue Type: Sub-task
>Reporter: Lin Liu
>Assignee: Lin Liu
>Priority: Major
>  Labels: pull-request-available
>
> This error sometimes appears without any changes.
>  
> TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType:818 
> expected: <2024-02-01 07:36:39.0> but was: <2024-02-01 07:36:39>





Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10591:
URL: https://github.com/apache/hudi/pull/10591#issuecomment-1922538293

   
   ## CI report:
   
   * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN
   * e8ba2506557c89ed46704b182c8a2fa7a68e7a14 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22282)
 
   * 12afa66ed0b62dff350a10e46e1bc8dee88acb4e UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-7365] Fix a test about timestamp format [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10606:
URL: https://github.com/apache/hudi/pull/10606#issuecomment-1922528550

   
   ## CI report:
   
   * dbbd2fdb81e942f7c03bb2bc9ad9a47cf6b29d3a Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22285)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1922528258

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * b29f20cd9907f1607f588a6fc4e8c037e079f58e Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22281)
 
   * 08d2def477e1ab733163ed5d2206971fdbaee583 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22284)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[jira] [Updated] (HUDI-7365) Fix a flaky test TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType

2024-02-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-7365:
-
Labels: pull-request-available  (was: )

> Fix a flaky test 
> TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType
> --
>
> Key: HUDI-7365
> URL: https://issues.apache.org/jira/browse/HUDI-7365
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Lin Liu
>Assignee: Lin Liu
>Priority: Major
>  Labels: pull-request-available
>
> This error sometimes appears without any changes.
>  
> TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType:818 
> expected: <2024-02-01 07:36:39.0> but was: <2024-02-01 07:36:39>





Re: [PR] [HUDI-7365] Fix a test about timestamp format [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10606:
URL: https://github.com/apache/hudi/pull/10606#issuecomment-1922517599

   
   ## CI report:
   
   * dbbd2fdb81e942f7c03bb2bc9ad9a47cf6b29d3a UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10591:
URL: https://github.com/apache/hudi/pull/10591#issuecomment-1922517465

   
   ## CI report:
   
   * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN
   * e8ba2506557c89ed46704b182c8a2fa7a68e7a14 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22282)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1922517169

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * b29f20cd9907f1607f588a6fc4e8c037e079f58e Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22281)
 
   * 08d2def477e1ab733163ed5d2206971fdbaee583 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [MINOR] Add UT org.apache.hudi.io.TestHoodieTimelineArchiver#testRetryArchivalAfterPreviousFailedDeletion [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10605:
URL: https://github.com/apache/hudi/pull/10605#issuecomment-1922509415

   
   ## CI report:
   
   * 9bc6d4a0b141dea1c67c5b6c9beeaad28548a222 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22280)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-7351] Implement partition pushdown for glue [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10604:
URL: https://github.com/apache/hudi/pull/10604#issuecomment-1922509381

   
   ## CI report:
   
   * 8299f34e4d6caba0abbd8a74bb0963c9450b6c35 UNKNOWN
   * 9b3f2ebbfbdcd615d407ef89e0c4575e6c3f669b Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22278)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[PR] [HUDI-6902] Fix a test about timestamp format [hudi]

2024-02-01 Thread via GitHub


linliu-code opened a new pull request, #10606:
URL: https://github.com/apache/hudi/pull/10606

   ### Change Logs
   
   This fixes the timestamp format used in the test. Previously, the test relied on 
`Timestamp.toString()`, which prints a nanoseconds component that is not included in the 
actual format.
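
The mismatch can be seen in isolation with the JDK alone. The sketch below is illustrative and is not the code changed in this PR; the class name and the `withoutFraction` helper are hypothetical, and the `yyyy-MM-dd HH:mm:ss` pattern is an assumption based on the failing assertion message.

```java
import java.sql.Timestamp;
import java.text.SimpleDateFormat;

// Illustrative only: compares Timestamp.toString() with an explicit format.
final class TimestampFormatSketch {

    // Hypothetical helper: formats a timestamp without the fractional-seconds part.
    static String withoutFraction(Timestamp ts) {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(ts);
    }

    public static void main(String[] args) {
        Timestamp ts = Timestamp.valueOf("2024-02-01 07:36:39");
        // toString() always appends at least one fractional digit.
        System.out.println(ts.toString());
        // An explicit pattern leaves the fraction out.
        System.out.println(withoutFraction(ts));
    }
}
```

Comparing values formatted with the same explicit pattern on both sides avoids depending on how `toString()` renders the fractional part.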
   
   ### Impact
   
   Less flaky test.
   
   ### Risk level (write none, low medium or high below)
   
   No.
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





Re: [PR] initial commit for pmc and committer update [hudi]

2024-02-01 Thread via GitHub


nfarah86 commented on PR #10603:
URL: https://github.com/apache/hudi/pull/10603#issuecomment-1922490372

   @bhasudha updated





[jira] [Updated] (HUDI-7365) Fix a flaky test TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType

2024-02-01 Thread Lin Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Liu updated HUDI-7365:
--
Summary: Fix a flaky test 
TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType  (was: 
Fxi a flaky test 
TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType)

> Fix a flaky test 
> TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType
> --
>
> Key: HUDI-7365
> URL: https://issues.apache.org/jira/browse/HUDI-7365
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Lin Liu
>Assignee: Lin Liu
>Priority: Major
>
> This error sometimes appears without any changes.
>  
> TestHoodieParquetInputFormat.testHoodieParquetInputFormatReadTimeType:818 
> expected: <2024-02-01 07:36:39.0> but was: <2024-02-01 07:36:39>





Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10591:
URL: https://github.com/apache/hudi/pull/10591#issuecomment-1922452949

   
   ## CI report:
   
   * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN
   * 3b5649620c700d50e8a152964ab34fd737e8a1de Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22276)
 
   * e8ba2506557c89ed46704b182c8a2fa7a68e7a14 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22282)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1922452756

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * b29f20cd9907f1607f588a6fc4e8c037e079f58e Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22281)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10591:
URL: https://github.com/apache/hudi/pull/10591#issuecomment-1922444917

   
   ## CI report:
   
   * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN
   * 3b5649620c700d50e8a152964ab34fd737e8a1de Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22276)
 
   * e8ba2506557c89ed46704b182c8a2fa7a68e7a14 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1922444688

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * fbed23334169885b5032f42444f5ee324ce03e53 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22279)
 
   * b29f20cd9907f1607f588a6fc4e8c037e079f58e UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6868] Support extracting passwords from credential store for Hive Sync [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10577:
URL: https://github.com/apache/hudi/pull/10577#issuecomment-1922437129

   
   ## CI report:
   
   * 03953e8ea609fd6c6b9dd8d227b547ac2d551ff8 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22277)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-7366] Fix HoodieLocation with encoded paths [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10602:
URL: https://github.com/apache/hudi/pull/10602#issuecomment-1922437278

   
   ## CI report:
   
   * 785e2c880312d4280ca94cb20d134cc70a4fdeff Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22273)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [MINOR] Add UT org.apache.hudi.io.TestHoodieTimelineArchiver#testRetryArchivalAfterPreviousFailedDeletion [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10605:
URL: https://github.com/apache/hudi/pull/10605#issuecomment-1922356661

   
   ## CI report:
   
   * 9bc6d4a0b141dea1c67c5b6c9beeaad28548a222 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22280)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1922356350

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * fbed23334169885b5032f42444f5ee324ce03e53 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22279)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [MINOR] Add UT org.apache.hudi.io.TestHoodieTimelineArchiver#testRetryArchivalAfterPreviousFailedDeletion [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10605:
URL: https://github.com/apache/hudi/pull/10605#issuecomment-1922341424

   
   ## CI report:
   
   * 9bc6d4a0b141dea1c67c5b6c9beeaad28548a222 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-7351] Implement partition pushdown for glue [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10604:
URL: https://github.com/apache/hudi/pull/10604#issuecomment-1922341290

   
   ## CI report:
   
   * 8299f34e4d6caba0abbd8a74bb0963c9450b6c35 UNKNOWN
   * 9b3f2ebbfbdcd615d407ef89e0c4575e6c3f669b Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22278)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1922340402

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * cb04f6d02339e217583bd046a7331db52de7a0f2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22275)
 
   * fbed23334169885b5032f42444f5ee324ce03e53 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-7351] Implement partition pushdown for glue [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10604:
URL: https://github.com/apache/hudi/pull/10604#issuecomment-1922328981

   
   ## CI report:
   
   * 8299f34e4d6caba0abbd8a74bb0963c9450b6c35 UNKNOWN
   * 9b3f2ebbfbdcd615d407ef89e0c4575e6c3f669b UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[PR] [MINOR] Add UT org.apache.hudi.io.TestHoodieTimelineArchiver#testRetryArchivalAfterPreviousFailedDeletion [hudi]

2024-02-01 Thread via GitHub


kbuci opened a new pull request, #10605:
URL: https://github.com/apache/hudi/pull/10605

   
   ### Change Logs
   
   - Add unit test 
`org.apache.hudi.io.TestHoodieTimelineArchiver#testRetryArchivalAfterPreviousFailedDeletion` 
to simulate a scenario where an archival call does not delete all the completed 
`.commit` files on the timeline and archival is then retried. The expected 
behavior is that the subsequent archival retry should "clean up" the leftover 
`.commit` files. This scenario can happen if 
`org.apache.hudi.client.timeline.HoodieTimelineArchiver#deleteArchivedInstants` 
encounters transient failures when deleting completed instant files, such as 
those caused by the edge case that was resolved by HUDI-7207.
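
The retry behavior described above can be pictured with a small in-memory model. This is a sketch under stated assumptions, not the unit test added by this PR: the class, the `deleteArchived` helper, the file names, and the way a transient failure is injected are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: models completed instant files as strings in a set.
final class ArchivalRetrySketch {

    // Hypothetical delete pass: removes every archived file from the timeline,
    // except those listed in failOn, which simulate transient deletion failures.
    // Returns true only if nothing was left behind.
    static boolean deleteArchived(Set<String> timeline, List<String> archived, Set<String> failOn) {
        boolean allDeleted = true;
        for (String file : archived) {
            if (failOn.contains(file)) {
                allDeleted = false;
                continue;
            }
            // Removing a file that is already gone is a no-op, so retries are safe.
            timeline.remove(file);
        }
        return allDeleted;
    }

    public static void main(String[] args) {
        Set<String> timeline = new TreeSet<>(
            Arrays.asList("001.commit", "002.commit", "003.commit", "004.commit"));
        List<String> archived = Arrays.asList("001.commit", "002.commit", "003.commit");

        // First attempt: deleting 002.commit fails, so it is left on the timeline.
        boolean first = deleteArchived(timeline, archived, Collections.singleton("002.commit"));
        System.out.println("first attempt complete=" + first + ", timeline=" + timeline);

        // Retry: the same archived list is processed again and the leftover is removed.
        boolean second = deleteArchived(timeline, archived, Collections.emptySet());
        System.out.println("retry complete=" + second + ", timeline=" + timeline);
    }
}
```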
   
   ### Impact
   
   None
   
   ### Risk level (write none, low medium or high below)
   
   _If medium or high, explain what verification was done to mitigate the 
risks._
   none
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, 
config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the 
default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. 
Please create a Jira ticket, attach the
 ticket number here and follow the 
[instruction](https://hudi.apache.org/contribute/developer-setup#website) to 
make
 changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





Re: [PR] initial commit for pmc and committer update [hudi]

2024-02-01 Thread via GitHub


bhasudha commented on PR #10603:
URL: https://github.com/apache/hudi/pull/10603#issuecomment-1922258447

   Change Udit to PMC, Committer.
   Add entry for Voon Hou as a committer - 
https://home.apache.org/phonebook.html?uid=vhs 





Re: [PR] [HUDI-7351] Implement partition pushdown for glue [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10604:
URL: https://github.com/apache/hudi/pull/10604#issuecomment-1922254233

   
   ## CI report:
   
   * 8299f34e4d6caba0abbd8a74bb0963c9450b6c35 UNKNOWN
   
   



Re: [PR] [HUDI-7351] Implement partition pushdown for glue [hudi]

2024-02-01 Thread via GitHub


parisni commented on PR #10604:
URL: https://github.com/apache/hudi/pull/10604#issuecomment-1922252363

   @danny0405 you were involved in the previous PR





[PR] [HUDI-7351] Implement partition pushdown for glue [hudi]

2024-02-01 Thread via GitHub


parisni opened a new pull request, #10604:
URL: https://github.com/apache/hudi/pull/10604

   ### Change Logs
   
   This is a follow up of #10572
   
   While the mentioned PR fixed the runtime error, it did not implement the 
logic for pushdown.
   
   The current PR does so by:
   - refactoring to enable one pushdown implementation per sync client
   - providing a Glue-specific filter expression (which diverges from HMS)
   - reordering the partitions in case the metastore returns them out of order
   - optimizing partition retrieval by omitting the column details
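
As a rough illustration of what a pushdown filter can look like, the sketch below builds a SQL-like predicate from a list of partition values. The class `PartitionFilterSketch`, the method `buildExpression`, and the exact expression syntax are assumptions made for illustration; this is not the code in the PR, and values are assumed to contain no quote characters.

```java
import java.util.List;
import java.util.StringJoiner;

class PartitionFilterSketch {

    /**
     * Hypothetical helper: turns partition keys and the values of the partitions
     * of interest into one filter string, so that only matching partitions need
     * to be fetched from the metastore.
     */
    static String buildExpression(List<String> partitionKeys,
                                  List<List<String>> partitionValues) {
        StringJoiner anyPartition = new StringJoiner(" OR ");
        for (List<String> values : partitionValues) {
            // One parenthesized conjunction per partition, keys in table order.
            StringJoiner onePartition = new StringJoiner(" AND ", "(", ")");
            for (int i = 0; i < partitionKeys.size(); i++) {
                onePartition.add(partitionKeys.get(i) + " = '" + values.get(i) + "'");
            }
            anyPartition.add(onePartition.toString());
        }
        return anyPartition.toString();
    }

    public static void main(String[] args) {
        String expr = buildExpression(
            List.of("year", "month"),
            List.of(List.of("2024", "01"), List.of("2024", "02")));
        System.out.println(expr);
        // prints (year = '2024' AND month = '01') OR (year = '2024' AND month = '02')
    }
}
```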
   
   ### Impact
   
   This feature has made metastore sync faster for tables with a large number 
of partitions, from a couple of minutes to seconds.
   
   ### Risk level (write none, low medium or high below)
   
   None
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, 
config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the 
default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. 
Please create a Jira ticket, attach the
 ticket number here and follow the 
[instruction](https://hudi.apache.org/contribute/developer-setup#website) to 
make
 changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10591:
URL: https://github.com/apache/hudi/pull/10591#issuecomment-1922230674

   
   ## CI report:
   
   * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN
   * 3b5649620c700d50e8a152964ab34fd737e8a1de Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22276)
 
   
   



Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1922230387

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * cb04f6d02339e217583bd046a7331db52de7a0f2 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22275)
 
   
   



Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10591:
URL: https://github.com/apache/hudi/pull/10591#issuecomment-1922154475

   
   ## CI report:
   
   * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN
   * a4da45265bd6b1dc2eb7a1bdb2a99916f651264e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22272)
 
   * 3b5649620c700d50e8a152964ab34fd737e8a1de Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22276)
 
   
   



Re: [PR] [HUDI-6868] Support extracting passwords from credential store for Hive Sync [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10577:
URL: https://github.com/apache/hudi/pull/10577#issuecomment-1922154331

   
   ## CI report:
   
   * 27e72600df8807de069ab066fcf4a1d40c0d9b56 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22247)
 
   * 03953e8ea609fd6c6b9dd8d227b547ac2d551ff8 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22277)
 
   
   



Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1922154052

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * ecd7ac35249e279c940338f74eda28409cdd8d9a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22274)
 
   * cb04f6d02339e217583bd046a7331db52de7a0f2 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22275)
 
   
   



[jira] [Updated] (HUDI-6868) Hudi HiveSync doesn't support extracting passwords from credential store

2024-02-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-6868:
-
Labels: pull-request-available  (was: )

> Hudi HiveSync doesn't support extracting passwords from credential store
> 
>
> Key: HUDI-6868
> URL: https://issues.apache.org/jira/browse/HUDI-6868
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: hive, hudi-utilities, spark
>Reporter: Kuldeep Kulkarni
>Priority: Major
>  Labels: pull-request-available
> Attachments: pyspark_hudi_test.py
>
>
> We have a customer use case of running PySpark on [Dataproc 
> Serverless|https://cloud.google.com/dataproc-serverless/docs/overview] with 
> [hudi-spark3-bundle|https://mvnrepository.com/artifact/org.apache.hudi/hudi-spark3-bundle].
>  The PySpark job fails to sync the Hudi table with the HMS DB (a remote CloudSQL DB 
> instance) because it cannot extract the password from the credential store. 
> The same job works fine if we specify the Hive Metastore DB user password instead of 
> the credential store. 
> Checking 
> [code|https://github.com/apache/hudi/blob/release-0.12.3/hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncConfig.java]
>  for HiveSync configs or 
> [HiveSyncConfigHolder|https://github.com/apache/hudi/blob/73c2167566730a76a0650d488511253ebc66156f/hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncConfigHolder.java#L44],
>  I don't see any option where it detects credential store for extracting 
> passwords. Something like [this 
> code|https://github.com/apache/hive/blob/rel/release-2.3.9/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L482]
>  from HMS ObjectStore.
> [Hive Sync Config Document|https://hudi.apache.org/docs/syncing_metastore/] 
> also doesn't have any reference of using credential store. 
> In order to find the password through the Hadoop Credential Provider API, it 
> would need to make a call to 
> [`Configuration#getPassword(String)`|https://hadoop.apache.org/docs/r3.3.6/api/org/apache/hadoop/conf/Configuration.html#getPassword-java.lang.String-].
>  We don't see anywhere in the Hudi codebase calling "getPassword"
>  
> *Repro steps:*
>  
> Sample PySpark script - Attached. 
>  
> Command with successful job execution with Metastore DB password:
> {code:java}
> gcloud dataproc batches submit --version 1.1 --container-image 
> gcr.io//new-custom-debian:v4 --region  pyspark 
> gs:///pyspark_hudi_test.py 
> --jars="gs:///hudi-spark3-bundle_2.12-0.12.3.jar" --properties 
> "spark.hadoop.javax.jdo.option.ConnectionURL=jdbc:mysql://:3306/hive_metastore,spark.hadoop.javax.jdo.option.ConnectionDriverName=com.mysql.jdbc.Driver,spark.hadoop.javax.jdo.option.ConnectionUserName=hive,spark.hadoop.javax.jdo.option.ConnectionPassword="
>  --deps-bucket gs:// -- SPARK_EXTRA_CLASSPATH=/opt/spark/jars/* 
> {code}
>  
> Failing command ( with credential store):
> {code:java}
> gcloud dataproc batches submit --version 1.1 --container-image 
> gcr.io//new-custom-debian:v4 --region  pyspark 
> gs:///pyspark_hudi_test.py 
> --jars="gs:///hudi-spark3-bundle_2.12-0.12.3.jar" --properties 
> "spark.hadoop.javax.jdo.option.ConnectionURL=jdbc:mysql://:3306/hive_metastore,spark.hadoop.javax.jdo.option.ConnectionDriverName=com.mysql.jdbc.Driver,spark.hadoop.javax.jdo.option.ConnectionUserName=hive,spark.hadoop.hadoop.security.credential.provider.path=jceks://gs@/metastore-pass-v2.jceks"
>  --deps-bucket gs:// -- SPARK_EXTRA_CLASSPATH=/opt/spark/jars/*  
> {code}
>  
> Error:
> {code:java}
> 23/09/11 04:30:42 INFO HoodieSparkSqlWriter$: Commit 20230911042953444 
> successful!
> 23/09/11 04:30:42 INFO HoodieSparkSqlWriter$: Config.inlineCompactionEnabled 
> ? false
> 23/09/11 04:30:42 INFO HoodieSparkSqlWriter$: Compaction Scheduled is 
> Optional.empty
> 23/09/11 04:30:42 INFO HoodieSparkSqlWriter$: Config.asyncClusteringEnabled ? 
> false
> 23/09/11 04:30:42 INFO HoodieSparkSqlWriter$: Clustering Scheduled is 
> Optional.empty
> 23/09/11 04:30:42 INFO HiveConf: Found configuration file null
> [..]
> 23/09/11 04:30:42 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient 
> from gs:///
> 23/09/11 04:30:42 INFO HoodieTableConfig: Loading table properties from 
> gs:///.hoodie/hoodie.properties
> 23/09/11 04:30:42 INFO HoodieTableMetaClient: Finished Loading Table of type 
> COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from gs:///
> 23/09/11 04:30:42 INFO HoodieTableMetaClient: Loading Active commit timeline 
> for gs:///
> 23/09/11 04:30:42 INFO HoodieActiveTimeline: Loaded instants upto : 
> Option\{val=[20230911042953444__commit__COMPLETED]}
> 23/09/11 04:30:43 INFO HiveMetaStore: 0: Opening raw store with 
> implementation class:org.apache.hadoop.hive.metastore.ObjectStore
> 23/09/11 04:30:43 INFO Obj

Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10591:
URL: https://github.com/apache/hudi/pull/10591#issuecomment-1922144013

   
   ## CI report:
   
   * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN
   * a4da45265bd6b1dc2eb7a1bdb2a99916f651264e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22272)
 
   * 3b5649620c700d50e8a152964ab34fd737e8a1de UNKNOWN
   
   



Re: [PR] [HUDI-6868] Support extracting passwords from credential store for Hive Sync [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10577:
URL: https://github.com/apache/hudi/pull/10577#issuecomment-1922143841

   
   ## CI report:
   
   * 27e72600df8807de069ab066fcf4a1d40c0d9b56 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22247)
 
   * 03953e8ea609fd6c6b9dd8d227b547ac2d551ff8 UNKNOWN
   
   



Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1922143613

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * ecd7ac35249e279c940338f74eda28409cdd8d9a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22274)
 
   * cb04f6d02339e217583bd046a7331db52de7a0f2 UNKNOWN
   
   



Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1922129151

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * ecd7ac35249e279c940338f74eda28409cdd8d9a Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22274)
 
   
   



Re: [PR] [HUDI-7366] Fix HoodieLocation with encoded paths [hudi]

2024-02-01 Thread via GitHub


yihua commented on PR #10602:
URL: https://github.com/apache/hudi/pull/10602#issuecomment-1922070322

   @linliu-code to review.





[PR] initial commit for pmc and committer update [hudi]

2024-02-01 Thread via GitHub


nfarah86 opened a new pull request, #10603:
URL: https://github.com/apache/hudi/pull/10603

   updated pmc & committer team
   
   https://github.com/apache/hudi/assets/5392555/4c2a4abd-2c7d-4b54-8822-1414659cbf54
   
   @xushiyan  @bhasudha ready for review





Re: [PR] [HUDI-7366] Fix HoodieLocation with encoded paths [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10602:
URL: https://github.com/apache/hudi/pull/10602#issuecomment-1922045893

   
   ## CI report:
   
   * 785e2c880312d4280ca94cb20d134cc70a4fdeff Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22273)
 
   
   



Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1922045503

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * fde40846df0d71137525f3e3dcf71042e5c65f4f Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22271)
 
   * ecd7ac35249e279c940338f74eda28409cdd8d9a UNKNOWN
   
   



Re: [PR] [HUDI-7366] Fix HoodieLocation with encoded paths [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10602:
URL: https://github.com/apache/hudi/pull/10602#issuecomment-1922033847

   
   ## CI report:
   
   * 785e2c880312d4280ca94cb20d134cc70a4fdeff UNKNOWN
   
   



Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10591:
URL: https://github.com/apache/hudi/pull/10591#issuecomment-1922021744

   
   ## CI report:
   
   * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN
   * a4da45265bd6b1dc2eb7a1bdb2a99916f651264e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22272)
 
   
   



[jira] [Updated] (HUDI-7366) Fix HoodieLocation with encoded paths

2024-02-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-7366:
-
Labels: pull-request-available  (was: )

> Fix HoodieLocation with encoded paths
> -
>
> Key: HUDI-7366
> URL: https://issues.apache.org/jira/browse/HUDI-7366
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.0.0
>
>
> Encoded path like "s3://foo/bar/1%2F2%2F3" should be kept as is.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] [HUDI-7366] Fix HoodieLocation with encoded paths [hudi]

2024-02-01 Thread via GitHub


yihua opened a new pull request, #10602:
URL: https://github.com/apache/hudi/pull/10602

   ### Change Logs
   
   This PR fixes HoodieLocation with encoded paths, e.g., 
`s3://foo/bar/1%2F2%2F3`, which should be kept as is.
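
To illustrate why encoded paths need care, the sketch below uses only `java.net.URI` from the JDK: `getPath()` decodes percent-escapes, while `getRawPath()` returns the path as written. The class `EncodedPathSketch` and its helper methods are hypothetical, and nothing here is taken from `HoodieLocation` itself.

```java
import java.net.URI;

class EncodedPathSketch {

    /** Path with percent-escapes decoded, so "%2F" turns into "/". */
    static String decodedPath(String location) {
        return URI.create(location).getPath();
    }

    /** Path exactly as written in the location string. */
    static String rawPath(String location) {
        return URI.create(location).getRawPath();
    }

    public static void main(String[] args) {
        String location = "s3://foo/bar/1%2F2%2F3";
        System.out.println(decodedPath(location)); // prints /bar/1/2/3
        System.out.println(rawPath(location));     // prints /bar/1%2F2%2F3
    }
}
```

In the decoded form the single path component `1%2F2%2F3` becomes three nested components, which points at a different object than the original location.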
   
   New tests are added. They pass with the fix and fail without it.
   
   ### Impact
   
   Bug fix.
   
   ### Risk level
   
   low
   
   ### Documentation Update
   
   N/A
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[jira] [Updated] (HUDI-7366) Fix HoodieLocation with encoded paths

2024-02-01 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7366:

Fix Version/s: 1.0.0

> Fix HoodieLocation with encoded paths
> -
>
> Key: HUDI-7366
> URL: https://issues.apache.org/jira/browse/HUDI-7366
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Priority: Major
> Fix For: 1.0.0
>
>
> Encoded path like "s3://foo/bar/1%2F2%2F3" should be kept as is.





[jira] [Updated] (HUDI-7366) Fix HoodieLocation with encoded paths

2024-02-01 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7366:

Description: Encoded path like "s3://foo/bar/1%2F2%2F3" should be kept as 
is.

> Fix HoodieLocation with encoded paths
> -
>
> Key: HUDI-7366
> URL: https://issues.apache.org/jira/browse/HUDI-7366
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Priority: Major
>
> Encoded path like "s3://foo/bar/1%2F2%2F3" should be kept as is.





[jira] [Updated] (HUDI-7366) Fix HoodieLocation with encoded paths

2024-02-01 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-7366:

Priority: Blocker  (was: Major)

> Fix HoodieLocation with encoded paths
> -
>
> Key: HUDI-7366
> URL: https://issues.apache.org/jira/browse/HUDI-7366
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 1.0.0
>
>
> Encoded path like "s3://foo/bar/1%2F2%2F3" should be kept as is.





[jira] [Assigned] (HUDI-7366) Fix HoodieLocation with encoded paths

2024-02-01 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo reassigned HUDI-7366:
---

Assignee: Ethan Guo

> Fix HoodieLocation with encoded paths
> -
>
> Key: HUDI-7366
> URL: https://issues.apache.org/jira/browse/HUDI-7366
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
> Fix For: 1.0.0
>
>
> Encoded path like "s3://foo/bar/1%2F2%2F3" should be kept as is.





Re: [I] Upsert operation not working and job is running longer while using "Record level index" in Apache Hudi 0.14 in EMR 6.15 [hudi]

2024-02-01 Thread via GitHub


SudhirSaxena commented on issue #10587:
URL: https://github.com/apache/hudi/issues/10587#issuecomment-1921973115

   @ad1happy2go, it keeps printing the same output I posted above, even after 2-3 
hours: tasks are created and finished, but no actual execution is 
happening, and no executor_id is active in the Spark UI. 
   Sure, we can connect over a call. Could you please schedule a time, or let me 
know how to connect, so that I can have a working session with you or your 
team? Thanks.





[jira] [Created] (HUDI-7366) Fix HoodieLocation with encoded paths

2024-02-01 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-7366:
---

 Summary: Fix HoodieLocation with encoded paths
 Key: HUDI-7366
 URL: https://issues.apache.org/jira/browse/HUDI-7366
 Project: Apache Hudi
  Issue Type: Bug
Reporter: Ethan Guo








Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10591:
URL: https://github.com/apache/hudi/pull/10591#issuecomment-1921934113

   
   ## CI report:
   
   * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN
   * 65e053366dda920f40c5663d37e83b43f340ec3e Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22270)
 
   * a4da45265bd6b1dc2eb7a1bdb2a99916f651264e Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22272)
 
   
   



Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1921933780

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * fde40846df0d71137525f3e3dcf71042e5c65f4f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22271)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6497] Replace FileSystem, Path, and FileStatus usage in `hudi-common` module [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10591:
URL: https://github.com/apache/hudi/pull/10591#issuecomment-1921916377

   
   ## CI report:
   
   * 8207558e8c8714386cf2f71929d6fb08db10617b UNKNOWN
   * 65e053366dda920f40c5663d37e83b43f340ec3e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22270)
   * a4da45265bd6b1dc2eb7a1bdb2a99916f651264e UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1921915876

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * a16039ae6cbe9d99de3f231e1e2f4e5fe488c07e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22269)
   * fde40846df0d71137525f3e3dcf71042e5c65f4f UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


hudi-bot commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-1921903140

   
   ## CI report:
   
   * 0e5a63db2337ae435f17eb956460e22caeea65b3 UNKNOWN
   * 4d759f3b4d6629e738b9b1afe4157c514d6df182 UNKNOWN
   * a70247f32679a6441cea131e946acce6fd09523e UNKNOWN
   * a5529adc60d4af0c3ece9bbcdcc98ecd5482d21a UNKNOWN
   * b13310f2241a287a1966fe7fd63a616b86c3974c UNKNOWN
   * d47977a291de7374cc34436f4c4e22e1812a883e UNKNOWN
   * e0931770db4a4846a16b09eace9154166bd0842d UNKNOWN
   * f8c748241017499433296ff26e6984064d8085b8 UNKNOWN
   * a16039ae6cbe9d99de3f231e1e2f4e5fe488c07e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=22269)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] [HUDI-6902] Containerize the Azure tests [hudi]

2024-02-01 Thread via GitHub


linliu-code commented on PR #10512:
URL: https://github.com/apache/hudi/pull/10512#issuecomment-192113

   @hudi-bot run azure





Re: [I] Hudi behaviour if AWS Glue concurrency is triggered[SUPPORT] [hudi]

2024-02-01 Thread via GitHub


rishabhreply commented on issue #10559:
URL: https://github.com/apache/hudi/issues/10559#issuecomment-1921866552

   @ad1happy2go I see, but with your suggested approach a single Glue run will 
process all 10 files, so it will take longer because one job has 10 files to 
process. To avoid such situations, I wanted to use Step Functions to launch 
concurrent runs of the same Glue job and distribute the 10 files across them 
in batches. :)





Re: [I] Upsert operation not working and job is running longer while using "Record level index" in Apache Hudi 0.14 in EMR 6.15 [hudi]

2024-02-01 Thread via GitHub


ad1happy2go commented on issue #10587:
URL: https://github.com/apache/hudi/issues/10587#issuecomment-1921854320

   @SudhirSaxena The above are driver logs only. Can you check what logs it 
prints during that one-hour period? We can also sync up on Slack and get on a 
call to understand it better.





Re: [I] [SUPPORT] Dataloss in FlinkCDC into Hudi without any exception or other infomation [hudi]

2024-02-01 Thread via GitHub


ad1happy2go commented on issue #10542:
URL: https://github.com/apache/hudi/issues/10542#issuecomment-1921851971

   @xuzifu666 Can you post your table configurations or the code snippet you 
are using to load the data? Did you try to reproduce this with a small dataset, 
or do you see this behaviour only with billions of records? I will also try to 
reproduce it.




