[GitHub] [hudi] hudi-bot commented on pull request #9472: [HUDI-6719] Fix data inconsistency issues caused by concurrent clustering and delete partition.

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9472:
URL: https://github.com/apache/hudi/pull/9472#issuecomment-1685199608

   
   ## CI report:
   
   * 4f0de8a6d00fe72108a12d8316cb1d38389d6b31 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19355)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19362)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19365)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19376)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #9472: [HUDI-6719] Fix data inconsistency issues caused by concurrent clustering and delete partition.

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9472:
URL: https://github.com/apache/hudi/pull/9472#issuecomment-1685165782

   
   ## CI report:
   
   * 4f0de8a6d00fe72108a12d8316cb1d38389d6b31 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19355)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19362)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19365)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19376)
 
   
   
   





[GitHub] [hudi] majian1998 commented on pull request #9472: [HUDI-6719] Fix data inconsistency issues caused by concurrent clustering and delete partition.

2023-08-19 Thread via GitHub


majian1998 commented on PR #9472:
URL: https://github.com/apache/hudi/pull/9472#issuecomment-1685163877

   @hudi-bot run azure





[hudi] branch master updated: [MINOR] Close record readers after use during tests (#9457)

2023-08-19 Thread danny0405
This is an automated email from the ASF dual-hosted git repository.

danny0405 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 5a7b5f28d99 [MINOR] Close record readers after use during tests (#9457)
5a7b5f28d99 is described below

commit 5a7b5f28d99d16a7ca363a490a70702a87d85a89
Author: voonhous 
AuthorDate: Sun Aug 20 09:45:51 2023 +0800

[MINOR] Close record readers after use during tests (#9457)
---
 .../test/java/org/apache/hudi/testutils/HoodieMergeOnReadTestUtils.java  | 1 +
 1 file changed, 1 insertion(+)

diff --git 
a/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/testutils/HoodieMergeOnReadTestUtils.java
 
b/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/testutils/HoodieMergeOnReadTestUtils.java
index 6f787db6069..7185115a4d5 100644
--- 
a/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/testutils/HoodieMergeOnReadTestUtils.java
+++ 
b/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/testutils/HoodieMergeOnReadTestUtils.java
@@ -166,6 +166,7 @@ public class HoodieMergeOnReadTestUtils {
   .forEach(fieldsPair -> newRecord.set(fieldsPair.getKey(), 
values[fieldsPair.getValue().pos()]));
   records.add(newRecord.build());
 }
+recordReader.close();
   }
 } catch (IOException ie) {
   LOG.error("Read records error", ie);
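The one-line fix above closes the reader after the read loop. A sketch of the more defensive pattern (a hypothetical stand-in, not the actual Hudi test utility) is try-with-resources, which guarantees the close even when iteration throws:

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader type standing in for the record reader used in the test utility.
class StubRecordReader implements Closeable {
    private final List<String> records;
    private int pos = 0;
    private boolean closed = false;

    StubRecordReader(List<String> records) { this.records = records; }

    boolean next() { return pos < records.size(); }

    String current() { return records.get(pos++); }

    @Override public void close() { closed = true; }

    boolean isClosed() { return closed; }
}

public class ReaderCloseSketch {
    // try-with-resources closes the reader even if reading throws,
    // unlike a bare close() call placed after the loop.
    static List<String> readAll(StubRecordReader reader) {
        List<String> out = new ArrayList<>();
        try (StubRecordReader r = reader) {
            while (r.next()) {
                out.add(r.current());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        StubRecordReader reader = new StubRecordReader(List.of("a", "b"));
        List<String> records = readAll(reader);
        System.out.println(records + " closed=" + reader.isClosed()); // prints [a, b] closed=true
    }
}
```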



[GitHub] [hudi] danny0405 merged pull request #9457: [MINOR] Close record readers after use during tests

2023-08-19 Thread via GitHub


danny0405 merged PR #9457:
URL: https://github.com/apache/hudi/pull/9457





[GitHub] [hudi] danny0405 commented on pull request #9416: [HUDI-6678] Fix the acquisition of clean&rollback instants to archive

2023-08-19 Thread via GitHub


danny0405 commented on PR #9416:
URL: https://github.com/apache/hudi/pull/9416#issuecomment-1685150559

   Looks good from my side, cc @yihua for the final review.





[GitHub] [hudi] guanziyue commented on a diff in pull request #4913: [HUDI-1517] create marker file for every log file

2023-08-19 Thread via GitHub


guanziyue commented on code in PR #4913:
URL: https://github.com/apache/hudi/pull/4913#discussion_r1299270668


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteHandle.java:
##
@@ -273,4 +280,31 @@ protected static Option 
toAvroRecord(HoodieRecord record, Schema
   return Option.empty();
 }
   }
+
+  protected class AppendLogWriteCallback implements HoodieLogFileWriteCallback 
{
+// here we distinguish log files created from log files being appended. 
Considering following scenario:
+// An appending task write to log file.
+// (1) append to existing file file_instant_writetoken1.log.1
+// (2) rollover and create file file_instant_writetoken2.log.2
+// Then this task failed and retry by a new task.
+// (3) append to existing file file_instant_writetoken1.log.1
+// (4) rollover and create file file_instant_writetoken3.log.2
+// finally file_instant_writetoken2.log.2 should not be committed to hudi, 
we use marker file to delete it.
+// keep in mind that log file is not always fail-safe unless it never roll 
over
+

Review Comment:
   > Oh, I get it. In HDFS-like systems, if we are using direct-style markers and two different writers try to append to the same log file (either sequentially or concurrently), we will be calling the append-type marker for the same log file more than once, and direct-style markers will fail since the marker file already exists. Is my understanding correct? Did we make any fix on this end, or not yet? I mean, I understand we have reverted this patch in the latest master, but for MDT purposes I am looking to see if we can re-add this patch (per-log-file markers). So, I am trying to understand any gaps or failures we need to handle before we can add per-log-file marker support.
   
   Yes, you are correct. This was finally fixed by 
https://github.com/apache/hudi/pull/9003/files.
   Unfortunately, that PR was reverted due to another failure. Without MDT, the filesystem-based FileSystemView can actually 'see' some uncommitted files, such as log files that are still being written. According to the current FileGroup definition, an uncommitted log file is considered valid as long as it has a committed base instant time. Such an uncommitted file is still handled correctly on the read path, because Hudi can detect that the instant time in a log block read from that file is invalid.
   However, with this PR, we may delete such an uncommitted log file as the commit is finishing, while a concurrent reading job may still require the file to exist.
   In theory, this error should not occur with MDT, because MDT does not expose a file until it is committed. For the filesystem-based FileSystemView, I could not come up with a fix in a short time.
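The rollover scenario in the code comment under review boils down to identifying the orphan log file left by the failed attempt. A toy illustration (the file-name parsing and the committed-token set are assumptions for the example, not Hudi's actual marker-based logic):

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative sketch: after a task retry, only log files whose write token
// reached commit should survive; the file from the failed attempt (writetoken2)
// is the orphan that the marker mechanism is meant to delete.
public class RolloverCleanupSketch {
    // A log file name is assumed to look like file_instant_<writeToken>.log.<version>.
    static String writeTokenOf(String logFile) {
        int underscore = logFile.lastIndexOf('_');
        int dot = logFile.indexOf(".log");
        return logFile.substring(underscore + 1, dot);
    }

    // Keep only the files whose write token never committed: these are orphans.
    static List<String> orphans(List<String> logFiles, Set<String> committedTokens) {
        return logFiles.stream()
            .filter(f -> !committedTokens.contains(writeTokenOf(f)))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> files = List.of(
            "file_instant_writetoken1.log.1",
            "file_instant_writetoken2.log.2",   // created by the failed attempt
            "file_instant_writetoken3.log.2");
        // prints [file_instant_writetoken2.log.2]
        System.out.println(orphans(files, Set.of("writetoken1", "writetoken3")));
    }
}
```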



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[GitHub] [hudi] hudi-bot commented on pull request #9482: [HUDI-6728] Update BigQuery manifest sync to support schema evolution

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9482:
URL: https://github.com/apache/hudi/pull/9482#issuecomment-1685108892

   
   ## CI report:
   
   * 87b20e5e4cc44ed70c52ee9ae0f746542f144e52 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19374)
 
   
   
   





[GitHub] [hudi] hudi-bot commented on pull request #9485: [HUDI-6730] Enable hoodie configuration using the --conf option with the "spark." prefix

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9485:
URL: https://github.com/apache/hudi/pull/9485#issuecomment-1685088898

   
   ## CI report:
   
   * 79060b391199c430b6d0ae8d7e63a10dfb2a853f Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19372)
 
   
   
   





[GitHub] [hudi] hudi-bot commented on pull request #9482: [HUDI-6728] Update BigQuery manifest sync to support schema evolution

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9482:
URL: https://github.com/apache/hudi/pull/9482#issuecomment-1685079217

   
   ## CI report:
   
   * 641d974b5e43f37f8ed429e75e817ba8a5a8376e Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19373)
 
   * 87b20e5e4cc44ed70c52ee9ae0f746542f144e52 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19374)
 
   
   
   





[GitHub] [hudi] nsivabalan commented on a diff in pull request #4913: [HUDI-1517] create marker file for every log file

2023-08-19 Thread via GitHub


nsivabalan commented on code in PR #4913:
URL: https://github.com/apache/hudi/pull/4913#discussion_r1299237297


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteHandle.java:
##
@@ -273,4 +280,31 @@ protected static Option 
toAvroRecord(HoodieRecord record, Schema
   return Option.empty();
 }
   }
+
+  protected class AppendLogWriteCallback implements HoodieLogFileWriteCallback 
{
+// here we distinguish log files created from log files being appended. 
Considering following scenario:
+// An appending task write to log file.
+// (1) append to existing file file_instant_writetoken1.log.1
+// (2) rollover and create file file_instant_writetoken2.log.2
+// Then this task failed and retry by a new task.
+// (3) append to existing file file_instant_writetoken1.log.1
+// (4) rollover and create file file_instant_writetoken3.log.2
+// finally file_instant_writetoken2.log.2 should not be committed to hudi, 
we use marker file to delete it.
+// keep in mind that log file is not always fail-safe unless it never roll 
over
+

Review Comment:
   Oh, I get it. In HDFS-like systems, if we are using direct-style markers and two different writers try to append to the same log file (either sequentially or concurrently), we will be calling the append-type marker for the same log file more than once, and direct-style markers will fail since the marker file already exists.
   Is my understanding correct?
   Did we make any fix on this end, or not yet?
   I mean, I understand we have reverted this patch in the latest master. But for MDT purposes, I am looking to see if we can re-add this patch (per-log-file markers). So, I am trying to understand any gaps or failures we need to handle before we can add per-log-file marker support.
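The failure mode described here can be sketched with a toy model (this is not Hudi's real marker API; the APPEND-marker semantics shown are an assumed, hypothetical resolution):

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of direct-style markers on HDFS-like storage.
public class DirectMarkerSketch {
    private final Set<String> markers = new HashSet<>();

    // Strict create mimics creating a marker file: it fails if it already exists.
    // Two writers appending to the same log file both request an APPEND marker,
    // so the second request fails -- the failure mode discussed in the review.
    public boolean createStrict(String markerName) {
        return markers.add(markerName);
    }

    // Idempotent create treats an already-existing APPEND marker as success:
    // one hypothetical way to tolerate repeated append attempts.
    public boolean createIdempotent(String markerName) {
        markers.add(markerName);
        return true;
    }

    public static void main(String[] args) {
        DirectMarkerSketch fs = new DirectMarkerSketch();
        String marker = "file_instant_writetoken1.log.1.APPEND";
        System.out.println(fs.createStrict(marker));     // true: first writer succeeds
        System.out.println(fs.createStrict(marker));     // false: retried task fails
        System.out.println(fs.createIdempotent(marker)); // true: idempotent variant succeeds
    }
}
```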
   






[GitHub] [hudi] nsivabalan commented on a diff in pull request #4913: [HUDI-1517] create marker file for every log file

2023-08-19 Thread via GitHub


nsivabalan commented on code in PR #4913:
URL: https://github.com/apache/hudi/pull/4913#discussion_r1299237090


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteHandle.java:
##
@@ -273,4 +280,31 @@ protected static Option 
toAvroRecord(HoodieRecord record, Schema
   return Option.empty();
 }
   }
+
+  protected class AppendLogWriteCallback implements HoodieLogFileWriteCallback 
{
+// here we distinguish log files created from log files being appended. 
Considering following scenario:
+// An appending task write to log file.
+// (1) append to existing file file_instant_writetoken1.log.1
+// (2) rollover and create file file_instant_writetoken2.log.2
+// Then this task failed and retry by a new task.
+// (3) append to existing file file_instant_writetoken1.log.1
+// (4) rollover and create file file_instant_writetoken3.log.2
+// finally file_instant_writetoken2.log.2 should not be committed to hudi, 
we use marker file to delete it.
+// keep in mind that log file is not always fail-safe unless it never roll 
over
+

Review Comment:
   Sorry, can you guys help me understand why preLogOpen would throw here?
   






[GitHub] [hudi] hudi-bot commented on pull request #9482: [HUDI-6728] Update BigQuery manifest sync to support schema evolution

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9482:
URL: https://github.com/apache/hudi/pull/9482#issuecomment-1685061547

   
   ## CI report:
   
   * 75434ba7c835be022517f59805a12fc80da0d249 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19363)
 
   * 641d974b5e43f37f8ed429e75e817ba8a5a8376e Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19373)
 
   * 87b20e5e4cc44ed70c52ee9ae0f746542f144e52 UNKNOWN
   
   
   





[GitHub] [hudi] hudi-bot commented on pull request #9484: [HUDI-6729] Fix get partition values from path for non-string type partition column

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9484:
URL: https://github.com/apache/hudi/pull/9484#issuecomment-1685057712

   
   ## CI report:
   
   * 905cc6b4eff305d54e52f4c1ac2d44d449e9afc5 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19371)
 
   
   
   





[GitHub] [hudi] hudi-bot commented on pull request #9416: [HUDI-6678] Fix the acquisition of clean&rollback instants to archive

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9416:
URL: https://github.com/apache/hudi/pull/9416#issuecomment-1685057662

   
   ## CI report:
   
   * 5bd469384744d76c63e658e043e4dac6a6fd5ac3 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19370)
 
   
   
   





[GitHub] [hudi] hudi-bot commented on pull request #9483: [HUDI-6156] Prevent leaving tmp file in timeline, delete tmp file whe…

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9483:
URL: https://github.com/apache/hudi/pull/9483#issuecomment-1685057703

   
   ## CI report:
   
   * f424ca9897807f1bdcb7886dd6bb402e0968f04f Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19369)
 
   
   
   





[GitHub] [hudi] hudi-bot commented on pull request #9482: [HUDI-6728] Update BigQuery manifest sync to support schema evolution

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9482:
URL: https://github.com/apache/hudi/pull/9482#issuecomment-1685042098

   
   ## CI report:
   
   * 75434ba7c835be022517f59805a12fc80da0d249 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19363)
 
   * 641d974b5e43f37f8ed429e75e817ba8a5a8376e Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19373)
 
   
   
   





[GitHub] [hudi] hudi-bot commented on pull request #9467: [HUDI-6621] Fix downgrade handler for 0.14.0

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9467:
URL: https://github.com/apache/hudi/pull/9467#issuecomment-1685042082

   
   ## CI report:
   
   * d20d5b2e45e0eccf8f3ec40077696eecf9dfc4bb Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19368)
 
   
   
   





[GitHub] [hudi] hudi-bot commented on pull request #9482: [HUDI-6728] Update BigQuery manifest sync to support schema evolution

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9482:
URL: https://github.com/apache/hudi/pull/9482#issuecomment-1685040652

   
   ## CI report:
   
   * 75434ba7c835be022517f59805a12fc80da0d249 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19363)
 
   * 641d974b5e43f37f8ed429e75e817ba8a5a8376e UNKNOWN
   
   
   





[GitHub] [hudi] hudi-bot commented on pull request #9485: [HUDI-6730] Enable hoodie configuration using the --conf option with the "spark." prefix

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9485:
URL: https://github.com/apache/hudi/pull/9485#issuecomment-1685030774

   
   ## CI report:
   
   * 79060b391199c430b6d0ae8d7e63a10dfb2a853f Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19372)
 
   
   
   





[GitHub] [hudi] hudi-bot commented on pull request #9485: [HUDI-6730] Enable hoodie configuration using the --conf option with the "spark." prefix

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9485:
URL: https://github.com/apache/hudi/pull/9485#issuecomment-1685029379

   
   ## CI report:
   
   * 79060b391199c430b6d0ae8d7e63a10dfb2a853f UNKNOWN
   
   
   





[jira] [Updated] (HUDI-6730) Enable hoodie configuration using the --conf option with the "spark." prefix.

2023-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-6730:
-
Labels: pull-request-available  (was: )

> Enable hoodie configuration using the --conf option with the "spark." prefix.
> -
>
> Key: HUDI-6730
> URL: https://issues.apache.org/jira/browse/HUDI-6730
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Wechar
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [hudi] wecharyu opened a new pull request, #9485: [HUDI-6730] Enable hoodie configuration using the --conf option with the "spark." prefix

2023-08-19 Thread via GitHub


wecharyu opened a new pull request, #9485:
URL: https://github.com/apache/hudi/pull/9485

   ### Change Logs
   
   When submitting a Spark job, the `--conf` option only accepts keys that start with the "spark." prefix, so we can extract Hudi configs from the SQL conf entries that start with "spark.hoodie.".
   
   ### Impact
   
   Users can set a Hudi config via `--conf spark.hoodie.xxx=xxx` when submitting a Spark job.
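The extraction this PR describes can be sketched as follows (the class and method names are illustrative assumptions, not the PR's actual code): keys passed as `--conf spark.hoodie.xxx=yyy` are surfaced to Hudi as `hoodie.xxx=yyy`, while other `spark.*` keys are left alone.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of extracting Hudi configs from Spark conf entries.
public class SparkPrefixSketch {
    private static final String HOODIE_PREFIX = "spark.hoodie.";

    public static Map<String, String> extractHoodieConf(Map<String, String> sparkConf) {
        Map<String, String> hoodieConf = new HashMap<>();
        for (Map.Entry<String, String> e : sparkConf.entrySet()) {
            if (e.getKey().startsWith(HOODIE_PREFIX)) {
                // Strip only the leading "spark." so the remaining key is hoodie.xxx
                hoodieConf.put(e.getKey().substring("spark.".length()), e.getValue());
            }
        }
        return hoodieConf;
    }

    public static void main(String[] args) {
        Map<String, String> sparkConf = Map.of(
            "spark.hoodie.datasource.write.operation", "upsert",
            "spark.executor.memory", "4g");
        // Only the spark.hoodie.* entry survives, with the spark. prefix stripped.
        System.out.println(extractHoodieConf(sparkConf));
    }
}
```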
   
   ### Risk level (write none, low medium or high below)
   
   Low.
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, 
config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the 
default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. 
Please create a Jira ticket, attach the
 ticket number here and follow the 
[instruction](https://hudi.apache.org/contribute/developer-setup#website) to 
make
 changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[jira] [Updated] (HUDI-6730) Enable hoodie configuration using the --conf option with the "spark." prefix.

2023-08-19 Thread Wechar (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wechar updated HUDI-6730:
-
Summary: Enable hoodie configuration using the --conf option with the 
"spark." prefix.  (was: Enable hoodie configuration using the --conf option 
with the spark. prefix.)

> Enable hoodie configuration using the --conf option with the "spark." prefix.
> -
>
> Key: HUDI-6730
> URL: https://issues.apache.org/jira/browse/HUDI-6730
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Wechar
>Priority: Major
>






[jira] [Created] (HUDI-6730) Enable hoodie configuration using the --conf option with the spark. prefix.

2023-08-19 Thread Wechar (Jira)
Wechar created HUDI-6730:


 Summary: Enable hoodie configuration using the --conf option with 
the spark. prefix.
 Key: HUDI-6730
 URL: https://issues.apache.org/jira/browse/HUDI-6730
 Project: Apache Hudi
  Issue Type: Improvement
Reporter: Wechar








[GitHub] [hudi] hudi-bot commented on pull request #9484: [HUDI-6729] Fix get partition values from path for non-string type partition column

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9484:
URL: https://github.com/apache/hudi/pull/9484#issuecomment-1685009140

   
   ## CI report:
   
   * 4d503b60e26faf4f879e09f266255d6c9af98afc Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19367)
 
   * 905cc6b4eff305d54e52f4c1ac2d44d449e9afc5 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19371)
 
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #9477: [HUDI-6726] Fix connection leaks related to file reader and iterator close

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9477:
URL: https://github.com/apache/hudi/pull/9477#issuecomment-1685009045

   
   ## CI report:
   
   * 2fe4b6b8c722c26e4d970e8613be2f73e4b4eb4f Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19364)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #9484: [HUDI-6729] Fix get partition values from path for non-string type partition column

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9484:
URL: https://github.com/apache/hudi/pull/9484#issuecomment-1684998275

   
   ## CI report:
   
   * 4d503b60e26faf4f879e09f266255d6c9af98afc Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19367)
 
   * 905cc6b4eff305d54e52f4c1ac2d44d449e9afc5 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #9477: [HUDI-6726] Fix connection leaks related to file reader and iterator close

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9477:
URL: https://github.com/apache/hudi/pull/9477#issuecomment-1684998127

   
   ## CI report:
   
   * 2fe4b6b8c722c26e4d970e8613be2f73e4b4eb4f UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] nsivabalan closed pull request #9457: [MINOR] Close record readers after use during tests

2023-08-19 Thread via GitHub


nsivabalan closed pull request #9457: [MINOR] Close record readers after use 
during tests
URL: https://github.com/apache/hudi/pull/9457





[GitHub] [hudi] nsivabalan commented on pull request #9457: [MINOR] Close record readers after use during tests

2023-08-19 Thread via GitHub


nsivabalan commented on PR #9457:
URL: https://github.com/apache/hudi/pull/9457#issuecomment-1684948315

   test failure is unrelated. Landing the patch. 





[GitHub] [hudi] hudi-bot commented on pull request #9416: [HUDI-6678] Fix the acquisition of clean&rollback instants to archive

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9416:
URL: https://github.com/apache/hudi/pull/9416#issuecomment-1684945535

   
   ## CI report:
   
   * 642c6dd967978781d41b74138f89fae26192056b Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19263)
 
   * 5bd469384744d76c63e658e043e4dac6a6fd5ac3 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19370)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] codope commented on a diff in pull request #9444: [HUDI-6692] Do not allow switching from Primary keyed table to primary key less table

2023-08-19 Thread via GitHub


codope commented on code in PR #9444:
URL: https://github.com/apache/hudi/pull/9444#discussion_r1299186957


##
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala:
##
@@ -956,9 +956,7 @@ object DataSourceOptionsHelper {
*/
   def fetchMissingWriteConfigsFromTableConfig(tableConfig: HoodieTableConfig, 
params: Map[String, String]) : Map[String, String] = {
 val missingWriteConfigs = scala.collection.mutable.Map[String, String]()
-if (!params.contains(KeyGeneratorOptions.RECORDKEY_FIELD_NAME.key()) && 
tableConfig.getRawRecordKeyFieldProp != null) {
-  missingWriteConfigs ++= 
Map(KeyGeneratorOptions.RECORDKEY_FIELD_NAME.key() -> 
tableConfig.getRawRecordKeyFieldProp)
-}

Review Comment:
   I am not following the fix here. I think this is a valid block. If some 
batch did not have a record key in the write config, then why not infer it from 
the table config if it is present? I believe we resolve the configs before 
setting the write operation.
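
The fallback the reviewer is defending, filling write configs from table config when a batch omits them, boils down to this pattern (a simplified Java sketch with illustrative names; the real method is the Scala `fetchMissingWriteConfigsFromTableConfig` quoted above):

```java
import java.util.HashMap;
import java.util.Map;

public class MissingConfigResolver {

    // Illustrative stand-in: for each key the writer did not supply,
    // fall back to the value persisted in the table's own config, if any.
    static Map<String, String> fillMissing(Map<String, String> writeParams,
                                           Map<String, String> tableConfig,
                                           String... keys) {
        Map<String, String> missing = new HashMap<>();
        for (String key : keys) {
            if (!writeParams.containsKey(key) && tableConfig.get(key) != null) {
                missing.put(key, tableConfig.get(key));
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // batch omitted the record key; the table config still knows it
        Map<String, String> params = new HashMap<>();
        Map<String, String> tableCfg =
                Map.of("hoodie.datasource.write.recordkey.field", "uuid");
        System.out.println(
                fillMissing(params, tableCfg, "hoodie.datasource.write.recordkey.field"));
        // prints {hoodie.datasource.write.recordkey.field=uuid}
    }
}
```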






[GitHub] [hudi] hudi-bot commented on pull request #9483: [HUDI-6156] Prevent leaving tmp file in timeline, delete tmp file whe…

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9483:
URL: https://github.com/apache/hudi/pull/9483#issuecomment-1684944274

   
   ## CI report:
   
   * 373fb78cc587229fd9210edc0b9102101b3a3deb Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19366)
 
   * f424ca9897807f1bdcb7886dd6bb402e0968f04f Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19369)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #9472: [HUDI-6719]Fix data inconsistency issues caused by concurrent clustering and delete partition.

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9472:
URL: https://github.com/apache/hudi/pull/9472#issuecomment-1684944263

   
   ## CI report:
   
   * 4f0de8a6d00fe72108a12d8316cb1d38389d6b31 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19355)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19362)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19365)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #9416: [HUDI-6678] Fix the acquisition of clean&rollback instants to archive

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9416:
URL: https://github.com/apache/hudi/pull/9416#issuecomment-1684944240

   
   ## CI report:
   
   * 642c6dd967978781d41b74138f89fae26192056b Azure: 
[CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19263)
 
   * 5bd469384744d76c63e658e043e4dac6a6fd5ac3 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #9483: [HUDI-6156] Prevent leaving tmp file in timeline, delete tmp file whe…

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9483:
URL: https://github.com/apache/hudi/pull/9483#issuecomment-1684942784

   
   ## CI report:
   
   * 373fb78cc587229fd9210edc0b9102101b3a3deb Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19366)
 
   * f424ca9897807f1bdcb7886dd6bb402e0968f04f UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #9467: [HUDI-6621] Fix downgrade handler for 0.14.0

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9467:
URL: https://github.com/apache/hudi/pull/9467#issuecomment-1684942768

   
   ## CI report:
   
   * 50d4aea5e545b5094368e3a192ffb5fd2008c481 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19349)
 
   * d20d5b2e45e0eccf8f3ec40077696eecf9dfc4bb Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19368)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] codope commented on a diff in pull request #9422: [HUDI-6681] Ensure MOR Column Stats Index skips reading filegroups correctly

2023-08-19 Thread via GitHub


codope commented on code in PR #9422:
URL: https://github.com/apache/hudi/pull/9422#discussion_r1299184657


##
hudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/functional/TestMORColstats.java:
##
@@ -0,0 +1,481 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.functional;
+
+import org.apache.hudi.DataSourceReadOptions;
+import org.apache.hudi.DataSourceWriteOptions;
+import org.apache.hudi.client.SparkRDDWriteClient;
+import org.apache.hudi.common.config.HoodieMetadataConfig;
+import org.apache.hudi.common.fs.FSUtils;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.table.timeline.HoodieActiveTimeline;
+import org.apache.hudi.common.testutils.HoodieTestDataGenerator;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.testutils.HoodieSparkClientTestBase;
+
+import org.apache.spark.SparkException;
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Row;
+import org.junit.jupiter.api.AfterEach;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.io.TempDir;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Properties;
+import java.util.Set;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+
+import static 
org.apache.hudi.common.testutils.RawTripTestPayload.recordToString;
+import static 
org.apache.hudi.config.HoodieCompactionConfig.INLINE_COMPACT_NUM_DELTA_COMMITS;
+import static org.apache.spark.sql.SaveMode.Append;
+import static org.apache.spark.sql.SaveMode.Overwrite;
+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+
+/**
+ * Test mor with colstats enabled in scenarios to ensure that files
+ * are being appropriately read or not read.
+ * The strategy employed is to corrupt targeted base files. If we want
+ * to prove the file is read, we assert that an exception will be thrown.
+ * If we want to prove the file is not read, we expect the read to
+ * successfully execute.
+ */
+public class TestMORColstats extends HoodieSparkClientTestBase {
+
+  private static String matchCond = "trip_type = 'UBERX'";
+  private static String nonMatchCond = "trip_type = 'BLACK'";
+  private static String[] dropColumns = {"_hoodie_commit_time", 
"_hoodie_commit_seqno",
+  "_hoodie_record_key", "_hoodie_partition_path", "_hoodie_file_name"};
+
+  private Boolean shouldOverwrite;
+  Map<String, String> options;
+  @TempDir
+  public java.nio.file.Path basePath;
+
+  @BeforeEach
+  public void setUp() throws Exception {
+initSparkContexts();
+dataGen = new HoodieTestDataGenerator();
+shouldOverwrite = true;
+options = getOptions();
+Properties props = new Properties();
+props.putAll(options);
+try {
+  metaClient = HoodieTableMetaClient.initTableAndGetMetaClient(hadoopConf, 
basePath.toString(), props);
+} catch (IOException e) {
+  throw new RuntimeException(e);
+}
+  }
+
+  @AfterEach
+  public void tearDown() throws IOException {
+cleanupSparkContexts();
+cleanupTestDataGenerator();
+metaClient = null;
+  }
+
+  /**
+   * Create two files, one should be excluded by colstats
+   */
+  @Test
+  public void testBaseFileOnly() {
+Dataset<Row> inserts = makeInsertDf("000", 100);
+Dataset<Row> batch1 = inserts.where(matchCond);
+Dataset<Row> batch2 = inserts.where(nonMatchCond);
+doWrite(batch1);
+doWrite(batch2);
+List<Path> filesToCorrupt = getFilesToCorrupt();
+assertEquals(1, filesToCorrupt.size());
+filesToCorrupt.forEach(TestMORColstats::corruptFile);
+assertEquals(0, readMatchingRecords().except(batch1).count());
+//Read without data skipping to show that it will fail
+//Reading with data skipping succeeded so that means that data skipping is 
working and the corrupted
+//file was no
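
The corruption strategy described in the test's Javadoc, proving a base file was skipped by making it unreadable, can be sketched independently of Hudi; this `corruptFile` is an assumed helper, not necessarily the PR's actual code:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CorruptFileSketch {

    // Overwrite a file with garbage bytes so any reader that actually
    // opens it fails, while readers that data-skip it succeed.
    static void corruptFile(Path file) {
        try {
            Files.write(file, new byte[] {0x00, 0x01, 0x02, 0x03});
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("base-file", ".parquet");
        Files.writeString(f, "valid parquet content would be here");
        corruptFile(f);
        // the file now holds only 4 garbage bytes; a parquet reader would throw
        System.out.println(Files.size(f)); // prints 4
        Files.delete(f);
    }
}
```

Asserting that a query over the corrupted file throws proves the file was read; a successful query proves column stats excluded it.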

[GitHub] [hudi] Zouxxyy commented on a diff in pull request #9416: [HUDI-6678] Fix the acquisition of clean&rollback instants to archive

2023-08-19 Thread via GitHub


Zouxxyy commented on code in PR #9416:
URL: https://github.com/apache/hudi/pull/9416#discussion_r1299184229


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/HoodieTimelineArchiver.java:
##
@@ -452,107 +431,137 @@ private Stream<HoodieInstant> getCommitInstantsToArchive() throws IOException {
   ? CompactionUtils.getOldestInstantToRetainForCompaction(
   table.getActiveTimeline(), 
config.getInlineCompactDeltaCommitMax())
   : Option.empty();
+  oldestInstantToRetainCandidates.add(oldestInstantToRetainForCompaction);
 
-  // The clustering commit instant can not be archived unless we ensure 
that the replaced files have been cleaned,
+  // 3. The clustering commit instant can not be archived unless we ensure 
that the replaced files have been cleaned,
   // without the replaced files metadata on the timeline, the fs view 
would expose duplicates for readers.
   // Meanwhile, when inline or async clustering is enabled, we need to 
ensure that there is a commit in the active timeline
   // to check whether the file slice generated in pending clustering after 
archive isn't committed.
   Option<HoodieInstant> oldestInstantToRetainForClustering =
   
ClusteringUtils.getOldestInstantToRetainForClustering(table.getActiveTimeline(),
 table.getMetaClient());
+  oldestInstantToRetainCandidates.add(oldestInstantToRetainForClustering);
+
+  // 4. If metadata table is enabled, do not archive instants which are 
more recent than the last compaction on the
+  // metadata table.
+  if (table.getMetaClient().getTableConfig().isMetadataTableAvailable()) {
+try (HoodieTableMetadata tableMetadata = 
HoodieTableMetadata.create(table.getContext(), config.getMetadataConfig(), 
config.getBasePath())) {
+  Option<String> latestCompactionTime = 
tableMetadata.getLatestCompactionTime();
+  if (!latestCompactionTime.isPresent()) {
+LOG.info("Not archiving as there is no compaction yet on the 
metadata table");
+return Collections.emptyList();
+  } else {
+LOG.info("Limiting archiving of instants to latest compaction on 
metadata table at " + latestCompactionTime.get());
+oldestInstantToRetainCandidates.add(Option.of(new HoodieInstant(
+HoodieInstant.State.COMPLETED, COMPACTION_ACTION, 
latestCompactionTime.get())));
+  }
+} catch (Exception e) {
+  throw new HoodieException("Error limiting instant archival based on 
metadata table", e);
+}
+  }
+
+  // 5. If this is a metadata table, do not archive the commits that live 
in data set
+  // active timeline. This is required by metadata table,
+  // see HoodieTableMetadataUtil#processRollbackMetadata for details.
+  if (table.isMetadataTable()) {
+HoodieTableMetaClient dataMetaClient = HoodieTableMetaClient.builder()
+
.setBasePath(HoodieTableMetadata.getDatasetBasePath(config.getBasePath()))
+.setConf(metaClient.getHadoopConf())
+.build();
+Option<HoodieInstant> qualifiedEarliestInstant =
+TimelineUtils.getEarliestInstantForMetadataArchival(
+dataMetaClient.getActiveTimeline(), 
config.shouldArchiveBeyondSavepoint());
+
+// Do not archive the instants after the earliest commit (COMMIT, 
DELTA_COMMIT, and
+// REPLACE_COMMIT only, considering non-savepoint commit only if 
enabling archive
+// beyond savepoint) and the earliest inflight instant (all actions).
+// This is required by metadata table, see 
HoodieTableMetadataUtil#processRollbackMetadata
+// for details.
+// Todo: Remove #7580
+// Note that we cannot blindly use the earliest instant of all 
actions, because CLEAN and
+// ROLLBACK instants are archived separately apart from commits (check
+// HoodieTimelineArchiver#getCleanInstantsToArchive).  If we do so, a 
very old completed
+// CLEAN or ROLLBACK instant can block the archive of metadata table 
timeline and causes
+// the active timeline of metadata table to be extremely long, leading 
to performance issues
+// for loading the timeline.
+oldestInstantToRetainCandidates.add(qualifiedEarliestInstant);
+  }
+
+  // Choose the instant in oldestInstantToRetainCandidates with the 
smallest
+  // timestamp as oldestInstantToRetain.
+  java.util.Optional<HoodieInstant> oldestInstantToRetain = 
oldestInstantToRetainCandidates
+  .stream()
+  .filter(Option::isPresent)
+  .map(Option::get)
+  .min(HoodieInstant.COMPARATOR);
 
-  // Actually do the commits
-  Stream<HoodieInstant> instantToArchiveStream = 
commitTimeline.getInstantsAsStream()
+  // Step2: We cannot archive any commits which are made after the first 
savepoint present,
+  // unless HoodieArchivalConfig#ARCHIVE_BEYOND_SAVEPOINT is enabled.
+  Option<HoodieInstant> firstSavepoint = 
table.getComp
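
The candidate-collection pattern in the diff above, gathering several optional oldest-instant-to-retain constraints and keeping the smallest, reduces to this sketch (plain String timestamps stand in for `HoodieInstant`, and the constraint sources named in the comments are illustrative):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class OldestInstantSelector {

    // Each constraint (compaction, clustering, metadata table, ...) may or
    // may not impose an oldest instant to retain; the archiver must honor
    // the most restrictive one, i.e. the smallest timestamp present.
    static Optional<String> oldestToRetain(List<Optional<String>> candidates) {
        return candidates.stream()
                .filter(Optional::isPresent)
                .map(Optional::get)
                .min(Comparator.naturalOrder());
    }

    public static void main(String[] args) {
        List<Optional<String>> candidates = new ArrayList<>();
        candidates.add(Optional.of("20230819120000")); // e.g. from compaction
        candidates.add(Optional.empty());              // e.g. clustering imposes nothing
        candidates.add(Optional.of("20230818090000")); // e.g. from metadata table
        System.out.println(oldestToRetain(candidates));
        // prints Optional[20230818090000]
    }
}
```

If every candidate is empty, the result is empty and nothing limits archival from that side.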

[jira] [Updated] (HUDI-6621) Add a downgrade step from 6 to 5 to detect new delete blocks

2023-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-6621:
-
Labels: pull-request-available  (was: )

> Add a downgrade step from 6 to 5 to detect new delete blocks
> 
>
> Key: HUDI-6621
> URL: https://issues.apache.org/jira/browse/HUDI-6621
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.14.0
>
>
> In table version 6, we introduce a new delete block format (v3) with Avro 
> serde (HUDI-5760).  For downgrading a table from v6 to v5, we need to perform 
> compaction to handle v3 delete blocks created using the new format.
> Also with the addition of record index field in Metadata table schema, the 
> downgrade needs to delete the metadata table to avoid column drop errors 
> after downgrade.





[GitHub] [hudi] hudi-bot commented on pull request #9467: [HUDI-6621] Fix downgrade handler for 0.14.0

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9467:
URL: https://github.com/apache/hudi/pull/9467#issuecomment-1684934537

   
   ## CI report:
   
   * 50d4aea5e545b5094368e3a192ffb5fd2008c481 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19349)
 
   * d20d5b2e45e0eccf8f3ec40077696eecf9dfc4bb UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[jira] [Updated] (HUDI-6621) Add a downgrade step from 6 to 5 to detect new delete blocks

2023-08-19 Thread Lokesh Jain (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated HUDI-6621:
--
Description: 
In table version 6, we introduce a new delete block format (v3) with Avro serde 
(HUDI-5760).  For downgrading a table from v6 to v5, we need to perform 
compaction to handle v3 delete blocks created using the new format.
Also with the addition of record index field in Metadata table schema, the 
downgrade needs to delete the metadata table to avoid column drop errors after 
downgrade.

  was:In table version 6, we introduce a new delete block format (v3) with Avro 
serde (HUDI-5760).  For downgrading a table from v6 to v5, we need to perform 
compaction to handle v3 delete blocks created using the new format.


> Add a downgrade step from 6 to 5 to detect new delete blocks
> 
>
> Key: HUDI-6621
> URL: https://issues.apache.org/jira/browse/HUDI-6621
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
> Fix For: 0.14.0
>
>
> In table version 6, we introduce a new delete block format (v3) with Avro 
> serde (HUDI-5760).  For downgrading a table from v6 to v5, we need to perform 
> compaction to handle v3 delete blocks created using the new format.
> Also with the addition of record index field in Metadata table schema, the 
> downgrade needs to delete the metadata table to avoid column drop errors 
> after downgrade.





[jira] [Updated] (HUDI-6621) Add a downgrade step from 6 to 5 to detect new delete blocks

2023-08-19 Thread Lokesh Jain (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated HUDI-6621:
--
Description: In table version 6, we introduce a new delete block format 
(v3) with Avro serde (HUDI-5760).  For downgrading a table from v6 to v5, we 
need to perform compaction to handle v3 delete blocks created using the new 
format.  (was: In table version 6, we introduce a new delete block format (v3) 
with Avro serde (HUDI-5760).  For downgrading a table from v6 to v5, we need to 
check any v3 delete blocks using the new format and ask user to manually 
restore to a commit before any file slice with a v3 delete block.)

> Add a downgrade step from 6 to 5 to detect new delete blocks
> 
>
> Key: HUDI-6621
> URL: https://issues.apache.org/jira/browse/HUDI-6621
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Major
> Fix For: 0.14.0
>
>
> In table version 6, we introduce a new delete block format (v3) with Avro 
> serde (HUDI-5760).  For downgrading a table from v6 to v5, we need to perform 
> compaction to handle v3 delete blocks created using the new format.





[jira] [Closed] (HUDI-6717) Fix downgrade handler for 0.14.0

2023-08-19 Thread Lokesh Jain (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain closed HUDI-6717.
-
Resolution: Duplicate

> Fix downgrade handler for 0.14.0
> 
>
> Key: HUDI-6717
> URL: https://issues.apache.org/jira/browse/HUDI-6717
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: pull-request-available
>
> Since the log block version (due to the delete block change) has been upgraded in 
> 0.14.0, the delete blocks cannot be read in 0.13.0 or earlier.
> Similarly, the addition of the record level index field in the metadata table leads 
> to a column drop error on downgrade. This Jira aims to fix the downgrade handler to 
> trigger compaction and delete the metadata table if the user wishes to downgrade 
> from version 6 (0.14.0) to version 5 (0.13.0).





[GitHub] [hudi] lokeshj1703 commented on a diff in pull request #9467: [HUDI-6717] Fix downgrade handler for 0.14.0

2023-08-19 Thread via GitHub


lokeshj1703 commented on code in PR #9467:
URL: https://github.com/apache/hudi/pull/9467#discussion_r1299179473


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/upgrade/SixToFiveDowngradeHandler.java:
##
@@ -39,20 +47,26 @@
 
 import static org.apache.hudi.common.table.HoodieTableConfig.TABLE_METADATA_PARTITIONS;
 import static org.apache.hudi.common.table.HoodieTableConfig.TABLE_METADATA_PARTITIONS_INFLIGHT;
-import static org.apache.hudi.metadata.HoodieTableMetadataUtil.deleteMetadataTablePartition;
 
 /**
  * Downgrade handle to assist in downgrading hoodie table from version 6 to 5.
  * To ensure compatibility, we need recreate the compaction requested file to
  * .aux folder.
+ * Since version 6 includes a new schema field for metadata table(MDT),
+ * the MDT needs to be deleted during downgrade to avoid column drop error.
+ * Also log block version was upgraded in version 6, therefore full compaction needs
+ * to be completed during downgrade to avoid write failures.
  */
 public class SixToFiveDowngradeHandler implements DowngradeHandler {
 
   @Override
   public Map downgrade(HoodieWriteConfig config, HoodieEngineContext context, String instantTime, SupportsUpgradeDowngrade upgradeDowngradeHelper) {
     final HoodieTable table = upgradeDowngradeHelper.getTable(config, context);
 
-    removeRecordIndexIfNeeded(table, context);
+    // Since version 6 includes a new schema field for metadata table(MDT), the MDT needs to be deleted during downgrade to avoid column drop error.
+    HoodieTableMetadataUtil.deleteMetadataTable(config.getBasePath(), context);
+    runCompaction(table, context, config, upgradeDowngradeHelper);

Review Comment:
   Addressed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] lokeshj1703 commented on a diff in pull request #9467: [HUDI-6717] Fix downgrade handler for 0.14.0

2023-08-19 Thread via GitHub


lokeshj1703 commented on code in PR #9467:
URL: https://github.com/apache/hudi/pull/9467#discussion_r1299179444


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/upgrade/SixToFiveDowngradeHandler.java:
##
 @@ -65,13 +79,31 @@ public Map downgrade(HoodieWriteConfig config, HoodieEng
   }
 
   /**
-   * Record-level index, a new partition in metadata table, was first added in
-   * 0.14.0 ({@link HoodieTableVersion#SIX}. Any downgrade from this version
-   * should remove this partition.
+   * Utility method to run compaction for MOR table as part of downgrade step.
    */
-  private static void removeRecordIndexIfNeeded(HoodieTable table, HoodieEngineContext context) {
-    HoodieTableMetaClient metaClient = table.getMetaClient();
-    deleteMetadataTablePartition(metaClient, context, MetadataPartitionType.RECORD_INDEX, false);
+  private void runCompaction(HoodieTable table, HoodieEngineContext context, HoodieWriteConfig config,
+                             SupportsUpgradeDowngrade upgradeDowngradeHelper) {
+    try {
+      if (table.getMetaClient().getTableType() == HoodieTableType.MERGE_ON_READ) {
+        // The log block version has been upgraded in version six so compaction is required for downgrade.
+        // set required configs for scheduling compaction.
+        HoodieInstantTimeGenerator.setCommitTimeZone(table.getMetaClient().getTableConfig().getTimelineTimezone());
+        HoodieWriteConfig compactionConfig = HoodieWriteConfig.newBuilder().withProps(config.getProps()).build();
+        compactionConfig.setValue(HoodieCompactionConfig.INLINE_COMPACT.key(), "true");
+        compactionConfig.setValue(HoodieCompactionConfig.INLINE_COMPACT_NUM_DELTA_COMMITS.key(), "1");
+        compactionConfig.setValue(HoodieCompactionConfig.INLINE_COMPACT_TRIGGER_STRATEGY.key(), CompactionTriggerStrategy.NUM_COMMITS.name());
+        compactionConfig.setValue(HoodieCompactionConfig.COMPACTION_STRATEGY.key(), UnBoundedCompactionStrategy.class.getName());
+        compactionConfig.setValue(HoodieMetadataConfig.ENABLE.key(), "false");
+        EmbeddedTimelineServerHelper.createEmbeddedTimelineService(context, config);

Review Comment:
   Addressed. This was required earlier but not needed any more.



##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/upgrade/SixToFiveDowngradeHandler.java:
##
@@ -39,20 +47,26 @@
 
 import static org.apache.hudi.common.table.HoodieTableConfig.TABLE_METADATA_PARTITIONS;
 import static org.apache.hudi.common.table.HoodieTableConfig.TABLE_METADATA_PARTITIONS_INFLIGHT;
-import static org.apache.hudi.metadata.HoodieTableMetadataUtil.deleteMetadataTablePartition;
 
 /**
  * Downgrade handle to assist in downgrading hoodie table from version 6 to 5.
  * To ensure compatibility, we need recreate the compaction requested file to
  * .aux folder.
+ * Since version 6 includes a new schema field for metadata table(MDT),
+ * the MDT needs to be deleted during downgrade to avoid column drop error.
+ * Also log block version was upgraded in version 6, therefore full compaction needs
+ * to be completed during downgrade to avoid write failures.

Review Comment:
   Addressed
   






[GitHub] [hudi] hudi-bot commented on pull request #9484: [HUDI-6729] Fix get partition values from path for non-string type partition column

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9484:
URL: https://github.com/apache/hudi/pull/9484#issuecomment-1684924952

   
   ## CI report:
   
   * 4d503b60e26faf4f879e09f266255d6c9af98afc Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19367)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #9484: [HUDI-6729] Fix get partition values from path for non-string type partition column

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9484:
URL: https://github.com/apache/hudi/pull/9484#issuecomment-1684923777

   
   ## CI report:
   
   * 4d503b60e26faf4f879e09f266255d6c9af98afc UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[jira] [Updated] (HUDI-6729) Fix get partition values from path for non-string type partition column

2023-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-6729:
-
Labels: pull-request-available  (was: )

> Fix get partition values from path for non-string type partition column
> ---
>
> Key: HUDI-6729
> URL: https://issues.apache.org/jira/browse/HUDI-6729
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: hudi-utilities
>Reporter: Wechar
>Priority: Major
>  Labels: pull-request-available
>
> When we enable {{hoodie.datasource.read.extract.partition.values.from.path}} 
> to get partition values from the path instead of the data file, an exception 
> is thrown if the partition column is not a string type:
> {code:bash}
> Caused by: java.lang.ClassCastException: 
> org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.Integer
>     at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:103)
>     at 
> org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getInt(rows.scala:41)
>     at 
> org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getInt$(rows.scala:41)
>     at 
> org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getInt(rows.scala:195)
>     at 
> org.apache.spark.sql.execution.vectorized.ColumnVectorUtils.populate(ColumnVectorUtils.java:97)
>     at 
> org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initBatch(VectorizedParquetRecordReader.java:245)
>     at 
> org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initBatch(VectorizedParquetRecordReader.java:264)
>     at 
> org.apache.spark.sql.execution.datasources.parquet.Spark32LegacyHoodieParquetFileFormat.$anonfun$buildReaderWithPartitionValues$2(Spark32LegacyHoodieParquetFileFormat.scala:314)
>     at 
> org.apache.hudi.HoodieDataSourceHelper$.$anonfun$buildHoodieParquetReader$1(HoodieDataSourceHelper.scala:67)
>     at 
> org.apache.hudi.HoodieBaseRelation.$anonfun$createBaseFileReader$2(HoodieBaseRelation.scala:602)
>     at 
> org.apache.hudi.HoodieBaseRelation$BaseFileReader.apply(HoodieBaseRelation.scala:680)
>     at 
> org.apache.hudi.HoodieBaseRelation$.$anonfun$projectReader$1(HoodieBaseRelation.scala:706)
>     at 
> org.apache.hudi.HoodieBaseRelation$.$anonfun$projectReader$2(HoodieBaseRelation.scala:711)
>     at 
> org.apache.hudi.HoodieBaseRelation$BaseFileReader.apply(HoodieBaseRelation.scala:680)
>     at 
> org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:96)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
>     at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
>     at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
>     at org.apache.spark.scheduler.Task.run(Task.scala:131)
>     at 
> org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
>     at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1491)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
>  {code}





[GitHub] [hudi] wecharyu opened a new pull request, #9484: [HUDI-6729] Fix get partition values from path for non-string type partition column

2023-08-19 Thread via GitHub


wecharyu opened a new pull request, #9484:
URL: https://github.com/apache/hudi/pull/9484

   ### Change Logs
   
   When we enable `hoodie.datasource.read.extract.partition.values.from.path` 
to get partition values from the path instead of the data file, an exception is 
thrown if the partition column is not a string type.
   
   This patch fixes the issue by casting the partition value string to the target 
data type, following Spark's approach.
   ```bash
   Caused by: java.lang.ClassCastException: 
org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.Integer
   at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:103)
   at 
org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getInt(rows.scala:41)
   at 
org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getInt$(rows.scala:41)
   at 
org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getInt(rows.scala:195)
   at 
org.apache.spark.sql.execution.vectorized.ColumnVectorUtils.populate(ColumnVectorUtils.java:97)
   at 
org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initBatch(VectorizedParquetRecordReader.java:245)
   at 
org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initBatch(VectorizedParquetRecordReader.java:264)
   at 
org.apache.spark.sql.execution.datasources.parquet.Spark32LegacyHoodieParquetFileFormat.$anonfun$buildReaderWithPartitionValues$2(Spark32LegacyHoodieParquetFileFormat.scala:314)
   at 
org.apache.hudi.HoodieDataSourceHelper$.$anonfun$buildHoodieParquetReader$1(HoodieDataSourceHelper.scala:67)
   at 
org.apache.hudi.HoodieBaseRelation.$anonfun$createBaseFileReader$2(HoodieBaseRelation.scala:602)
   at 
org.apache.hudi.HoodieBaseRelation$BaseFileReader.apply(HoodieBaseRelation.scala:680)
   at 
org.apache.hudi.HoodieBaseRelation$.$anonfun$projectReader$1(HoodieBaseRelation.scala:706)
   at 
org.apache.hudi.HoodieBaseRelation$.$anonfun$projectReader$2(HoodieBaseRelation.scala:711)
   at 
org.apache.hudi.HoodieBaseRelation$BaseFileReader.apply(HoodieBaseRelation.scala:680)
   at 
org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:96)
   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
   at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
   at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
   at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
   at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
   at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
   at org.apache.spark.scheduler.Task.run(Task.scala:131)
   at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1491)
   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
   ```
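   As an illustration of the casting idea (a minimal plain-Java sketch, not the actual patch; the `PartitionValueCast` class and its `castPartitionValue` helper are hypothetical, while the real fix follows Spark's own cast machinery inside Hudi's relation code), the string extracted from the partition path is converted to the column's declared type before it reaches the vectorized reader, instead of being passed through as a UTF8 string:

   ```java
   public class PartitionValueCast {
     // Hypothetical helper: a partition path segment is always a string,
     // so convert it to the partition column's declared type up front.
     static Object castPartitionValue(String raw, String dataType) {
       switch (dataType) {
         case "int":    return Integer.parseInt(raw);
         case "long":   return Long.parseLong(raw);
         case "double": return Double.parseDouble(raw);
         case "string": return raw;
         default:
           throw new IllegalArgumentException("Unsupported partition type: " + dataType);
       }
     }

     public static void main(String[] args) {
       // Path segment "year=2023" yields the raw string "2023"; an int
       // partition column needs an Integer, not a UTF8String.
       Object v = castPartitionValue("2023", "int");
       System.out.println(v.getClass().getSimpleName() + ":" + v);  // Integer:2023
     }
   }
   ```

   Without such a cast, the reader calls `getInt` on a row that still holds a string, which is exactly the `ClassCastException` in the trace above.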
   
   ### Impact
   
   No
   
   ### Risk level (write none, low medium or high below)
   
   None
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, 
config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the 
default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. 
Please create a Jira ticket, attach the
 ticket number here and follow the 
[instruction](https://hudi.apache.org/contribute/developer-setup#website) to 
make
 changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[jira] [Created] (HUDI-6729) Fix get partition values from path for non-string type partition column

2023-08-19 Thread Wechar (Jira)
Wechar created HUDI-6729:


 Summary: Fix get partition values from path for non-string type 
partition column
 Key: HUDI-6729
 URL: https://issues.apache.org/jira/browse/HUDI-6729
 Project: Apache Hudi
  Issue Type: Bug
  Components: hudi-utilities
Reporter: Wechar


When we enable {{hoodie.datasource.read.extract.partition.values.from.path}} to 
get partition values from the path instead of the data file, an exception is 
thrown if the partition column is not a string type:
{code:bash}
Caused by: java.lang.ClassCastException: 
org.apache.spark.unsafe.types.UTF8String cannot be cast to java.lang.Integer
    at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:103)
    at 
org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getInt(rows.scala:41)
    at 
org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getInt$(rows.scala:41)
    at 
org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getInt(rows.scala:195)
    at 
org.apache.spark.sql.execution.vectorized.ColumnVectorUtils.populate(ColumnVectorUtils.java:97)
    at 
org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initBatch(VectorizedParquetRecordReader.java:245)
    at 
org.apache.spark.sql.execution.datasources.parquet.VectorizedParquetRecordReader.initBatch(VectorizedParquetRecordReader.java:264)
    at 
org.apache.spark.sql.execution.datasources.parquet.Spark32LegacyHoodieParquetFileFormat.$anonfun$buildReaderWithPartitionValues$2(Spark32LegacyHoodieParquetFileFormat.scala:314)
    at 
org.apache.hudi.HoodieDataSourceHelper$.$anonfun$buildHoodieParquetReader$1(HoodieDataSourceHelper.scala:67)
    at 
org.apache.hudi.HoodieBaseRelation.$anonfun$createBaseFileReader$2(HoodieBaseRelation.scala:602)
    at 
org.apache.hudi.HoodieBaseRelation$BaseFileReader.apply(HoodieBaseRelation.scala:680)
    at 
org.apache.hudi.HoodieBaseRelation$.$anonfun$projectReader$1(HoodieBaseRelation.scala:706)
    at 
org.apache.hudi.HoodieBaseRelation$.$anonfun$projectReader$2(HoodieBaseRelation.scala:711)
    at 
org.apache.hudi.HoodieBaseRelation$BaseFileReader.apply(HoodieBaseRelation.scala:680)
    at 
org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:96)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:131)
    at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1491)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
 {code}





[GitHub] [hudi] hudi-bot commented on pull request #9483: [HUDI-6156] Prevent leaving tmp file in timeline, delete tmp file whe…

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9483:
URL: https://github.com/apache/hudi/pull/9483#issuecomment-1684913205

   
   ## CI report:
   
   * 373fb78cc587229fd9210edc0b9102101b3a3deb Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19366)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #9477: [HUDI-6726] Fix connection leaks related to file reader and iterator close

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9477:
URL: https://github.com/apache/hudi/pull/9477#issuecomment-1684913193

   
   ## CI report:
   
   * 2fe4b6b8c722c26e4d970e8613be2f73e4b4eb4f Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19364)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #9483: [HUDI-6156] Prevent leaving tmp file in timeline, delete tmp file whe…

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9483:
URL: https://github.com/apache/hudi/pull/9483#issuecomment-1684905560

   
   ## CI report:
   
   * 373fb78cc587229fd9210edc0b9102101b3a3deb Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19366)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #9483: [HUDI-6156] Prevent leaving tmp file in timeline, delete tmp file whe…

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9483:
URL: https://github.com/apache/hudi/pull/9483#issuecomment-1684904260

   
   ## CI report:
   
   * 373fb78cc587229fd9210edc0b9102101b3a3deb UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hbgstc123 opened a new pull request, #9483: [HUDI-6156] Prevent leaving tmp file in timeline, delete tmp file whe…

2023-08-19 Thread via GitHub


hbgstc123 opened a new pull request, #9483:
URL: https://github.com/apache/hudi/pull/9483

   …n rename throw exception.
   
   ### Change Logs
   
   Following the former PR, try to delete the tmp file when rename throws an exception.
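   
   The intended behavior can be sketched with plain `java.nio.file` (a hedged illustration with a hypothetical `AtomicRename` class and file names, not Hudi's actual timeline code): write to a tmp file first, rename it into place, and on a rename failure delete the tmp file so no orphan is left in the timeline directory.

   ```java
   import java.io.IOException;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.nio.file.StandardCopyOption;

   public class AtomicRename {
     // Write content to a tmp file, then rename it to the target; if the
     // rename fails, delete the tmp file instead of leaving it behind.
     static void writeInstant(Path tmp, Path target, byte[] content) throws IOException {
       Files.write(tmp, content);
       try {
         Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
       } catch (IOException e) {
         Files.deleteIfExists(tmp);  // clean up the tmp file on failure
         throw e;
       }
     }

     public static void main(String[] args) throws IOException {
       Path dir = Files.createTempDirectory("timeline");
       Path tmp = dir.resolve("001.commit.tmp");
       Path target = dir.resolve("001.commit");
       writeInstant(tmp, target, "metadata".getBytes());
       // After a successful rename the target exists and the tmp file is gone.
       System.out.println(Files.exists(target) && !Files.exists(tmp));  // true
     }
   }
   ```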
   
   ### Impact
   
   none
   
   ### Risk level (write none, low medium or high below)
   
   none
   
   ### Documentation Update
   
   none
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[GitHub] [hudi] hudi-bot commented on pull request #9472: [HUDI-6719]Fix data inconsistency issues caused by concurrent clustering and delete partition.

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9472:
URL: https://github.com/apache/hudi/pull/9472#issuecomment-1684902865

   
   ## CI report:
   
   * 4f0de8a6d00fe72108a12d8316cb1d38389d6b31 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19355)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19362)
 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19365)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] majian1998 commented on pull request #9472: [HUDI-6719]Fix data inconsistency issues caused by concurrent clustering and delete partition.

2023-08-19 Thread via GitHub


majian1998 commented on PR #9472:
URL: https://github.com/apache/hudi/pull/9472#issuecomment-1684902220

   @hudi-bot run azure





[GitHub] [hudi] hudi-bot commented on pull request #9472: [HUDI-6719]Fix data inconsistency issues caused by concurrent clustering and delete partition.

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9472:
URL: https://github.com/apache/hudi/pull/9472#issuecomment-1684878994

   
   ## CI report:
   
   * 4f0de8a6d00fe72108a12d8316cb1d38389d6b31 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19355)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19362)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #9477: [HUDI-6726] Fix connection leaks related to file reader and iterator close

2023-08-19 Thread via GitHub


hudi-bot commented on PR #9477:
URL: https://github.com/apache/hudi/pull/9477#issuecomment-1684876834

   
   ## CI report:
   
   * c90d959664437e13d53ce3c9810f824eaf396262 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19356)
 
   * 2fe4b6b8c722c26e4d970e8613be2f73e4b4eb4f Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=19364)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   

