[GitHub] [hudi] bvaradar commented on a diff in pull request #8143: [HUDI-5911] SimpleTransactionDirectMarkerBasedDetectionStrategy can't work with none-partitioned table

2023-04-29 Thread via GitHub


bvaradar commented on code in PR #8143:
URL: https://github.com/apache/hudi/pull/8143#discussion_r1181169082


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/DirectMarkerTransactionManager.java:
##
@@ -83,7 +83,7 @@ private static TypedProperties createUpdatedLockProps(
   throw new HoodieNotSupportedException("Only Support ZK-based lock for DirectMarkerTransactionManager now.");
 }
 TypedProperties props = new TypedProperties(writeConfig.getProps());
-props.setProperty(LockConfiguration.ZK_LOCK_KEY_PROP_KEY, partitionPath + "/" + fileId);
+props.setProperty(LockConfiguration.ZK_LOCK_KEY_PROP_KEY, (null != partitionPath && !partitionPath.isEmpty()) ? partitionPath + "/" + fileId : fileId);

Review Comment:
   This change will pose a challenge when upgrading. During the upgrade, all 
writers would need to be stopped and upgraded to the version containing this 
change for concurrency control to remain safe.
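
   One way to read the upgrade concern, offered here as an assumption rather than as the reviewer's stated reasoning: for a table with an empty partition path, the old and new expressions produce different lock keys for the same file, so writers on different versions would not contend for the same lock. The sketch below isolates only the two string expressions from the diff; the class `LockKeySketch` and its method names are hypothetical.

```java
// Hypothetical helper that isolates the two key expressions from the diff above.
final class LockKeySketch {

  // Expression used before the change.
  static String oldLockKey(String partitionPath, String fileId) {
    return partitionPath + "/" + fileId;
  }

  // Expression used after the change.
  static String newLockKey(String partitionPath, String fileId) {
    return (null != partitionPath && !partitionPath.isEmpty()) ? partitionPath + "/" + fileId : fileId;
  }

  public static void main(String[] args) {
    // Non-partitioned table: the two versions disagree on the key.
    System.out.println(oldLockKey("", "file-1")); // prints "/file-1"
    System.out.println(newLockKey("", "file-1")); // prints "file-1"
    // Partitioned table: both versions agree.
    System.out.println(oldLockKey("2023/04/29", "file-1").equals(newLockKey("2023/04/29", "file-1"))); // prints "true"
  }
}
```

   For partitioned tables the key is unchanged, so the mismatch would only affect tables whose partition path is null or empty.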



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #8596: [BUG-FIX] use try with resource to close stream

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8596:
URL: https://github.com/apache/hudi/pull/8596#issuecomment-1528945298

   
   ## CI report:
   
   * 8d29d9571d94e3d654e87151b16ef99ff02762b4 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16763)
 
   * 0c8c7d99fc250191a7eba156052f01371e431a30 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16768)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #8596: [BUG-FIX] use try with resource to close stream

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8596:
URL: https://github.com/apache/hudi/pull/8596#issuecomment-1528944501

   
   ## CI report:
   
   * 8d29d9571d94e3d654e87151b16ef99ff02762b4 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16763)
 
   * 0c8c7d99fc250191a7eba156052f01371e431a30 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8596: [BUG-FIX] use try with resource to close stream

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8596:
URL: https://github.com/apache/hudi/pull/8596#issuecomment-1528943697

   
   ## CI report:
   
   * 8d29d9571d94e3d654e87151b16ef99ff02762b4 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16763)
 
   
   



[GitHub] [hudi] xicm commented on pull request #7355: [HUDI-5308] Hive3 query returns null when the where clause has a partition field

2023-04-29 Thread via GitHub


xicm commented on PR #7355:
URL: https://github.com/apache/hudi/pull/7355#issuecomment-1528942760

   > So it is because the incorrect hive server version is used ?
   
   Yes, queries that filter on a partition field return null with Hive 3.





[GitHub] [hudi] c-f-cooper commented on pull request #8596: [BUG-FIX] use try with resource to close stream

2023-04-29 Thread via GitHub


c-f-cooper commented on PR #8596:
URL: https://github.com/apache/hudi/pull/8596#issuecomment-1528941416

   > Hi, can you elaborate a little more what would happen if the inputstream 
is not closed properly? Can you write a test case to demonstrate the resolution 
of the issue.
   
   To demonstrate the issue of unclosed IO streams, I can write the following 
test program:
   `import java.io.*;
   
   public class UnclosedIOTest {
     public static void main(String[] args) throws IOException {
       BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
       BufferedWriter writer = new BufferedWriter(new FileWriter("output.txt"));
   
       System.out.print("Please enter a line of text:");
       String line = reader.readLine();
       writer.write(line);
       System.out.println("Written to output.txt");
     }
   }
   `
   This program uses a BufferedReader and a BufferedWriter and closes neither. 
When it runs, it prompts the user for a line of text and writes that line to a 
file called output.txt. Because the streams are never closed, two problems can 
follow:
   
   Resource leakage: in a long-running process, every unclosed stream can keep 
its underlying file handle open until the stream is garbage collected. Repeated 
leaks may eventually exhaust the available handles and crash the program.
   Data loss: if the BufferedWriter is never closed or flushed, for example 
because an exception occurs while writing, the buffered data may be lost because 
it has not been flushed to disk yet. In this program output.txt can end up empty 
even when no exception occurs.
   To solve these issues, add the following code at the end of the program to 
close the IO streams:
   
   `reader.close();
   writer.close();
   `
   This ensures that the program releases its IO resources on the normal path.
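
   Since the PR title refers to try-with-resources, the same idea can be sketched in that form, which also closes the writer when an exception is thrown inside the block. This is a minimal illustration, not the PR's code; the class `ClosedIoSketch`, the method `writeAndReadBack`, and the use of a temporary file are assumptions made to keep it self-contained.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class; not part of Hudi.
final class ClosedIoSketch {

  // Writes one line to a temporary file and reads it back.
  // The writer is flushed and closed when the inner try block exits,
  // whether it exits normally or because of an exception.
  static String writeAndReadBack(String line) {
    try {
      Path target = Files.createTempFile("output", ".txt");
      try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
        writer.write(line);
      }
      String content = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
      Files.delete(target);
      return content;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(writeAndReadBack("hello")); // prints "hello"
  }
}
```

   Reading the file back immediately after the block is what shows that the data reached disk: the read happens only after the implicit close.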





[GitHub] [hudi] bvaradar commented on a diff in pull request #8378: [HUDI-6031] fix bug: checkpoint lost after changing cow to mor

2023-04-29 Thread via GitHub


bvaradar commented on code in PR #8378:
URL: https://github.com/apache/hudi/pull/8378#discussion_r1181163294


##
hudi-utilities/src/test/java/org/apache/hudi/utilities/deltastreamer/TestHoodieDeltaStreamer.java:
##
@@ -2604,6 +2605,59 @@ public void testForceEmptyMetaSync() throws Exception {
 assertTrue(hiveClient.tableExists(tableName), "Table " + tableName + " should exist");
   }
 
+  @Test
+  public void testResumeCheckpointAfterChangingCOW2MOR() throws Exception {
+String tableBasePath = basePath + "/test_resume_checkpoint_after_changing_cow_to_mor";
+// default table type is COW
+HoodieDeltaStreamer.Config cfg = TestHelpers.makeConfig(tableBasePath, WriteOperationType.BULK_INSERT);
+new HoodieDeltaStreamer(cfg, jsc).sync();
+TestHelpers.assertRecordCount(1000, tableBasePath, sqlContext);
+TestHelpers.assertCommitMetadata("0", tableBasePath, fs, 1);
+TestHelpers.assertAtLeastNCommits(1, tableBasePath, fs);
+
+// change cow to mor
+HoodieTableMetaClient metaClient = HoodieTableMetaClient.builder()
+.setConf(new Configuration(fs.getConf()))
+.setBasePath(cfg.targetBasePath)
+.setLoadActiveTimelineOnLoad(false)
+.build();
+Properties hoodieProps = new Properties();
+hoodieProps.load(fs.open(new Path(cfg.targetBasePath + "/.hoodie/hoodie.properties")));
+LOG.info("old props: {}", hoodieProps);
+hoodieProps.put("hoodie.table.type", HoodieTableType.MERGE_ON_READ.name());
+LOG.info("new props: {}", hoodieProps);
+Path metaPathDir = new Path(metaClient.getBasePathV2(), METAFOLDER_NAME);
+HoodieTableConfig.create(metaClient.getFs(), metaPathDir, hoodieProps);
+
+// continue deltastreamer
+cfg = TestHelpers.makeConfig(tableBasePath, WriteOperationType.UPSERT);
+cfg.tableType = HoodieTableType.MERGE_ON_READ.name();
+new HoodieDeltaStreamer(cfg, jsc).sync();
+// out of 1000 new records, 500 are inserts, 450 are updates and 50 are deletes.

Review Comment:
   Sounds good.






[GitHub] [hudi] hudi-bot commented on pull request #8611: [HUDI-6157] Fix potential data loss for flink streaming source from table with multi writer

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8611:
URL: https://github.com/apache/hudi/pull/8611#issuecomment-1528932193

   
   ## CI report:
   
   * e3b3799e1e360710b99bc089f193b771fc8c4db3 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16759)
 
   * b184b111c6928408d082ce73486f5bd3ae7c6683 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16767)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8611: [HUDI-6157] Fix potential data loss for flink streaming source from table with multi writer

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8611:
URL: https://github.com/apache/hudi/pull/8611#issuecomment-1528931363

   
   ## CI report:
   
   * e3b3799e1e360710b99bc089f193b771fc8c4db3 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16759)
 
   * b184b111c6928408d082ce73486f5bd3ae7c6683 UNKNOWN
   
   



[GitHub] [hudi] danny0405 commented on pull request #8493: [HUDI-6098] Use bulk insert prepped for the initial write into MDT.

2023-04-29 Thread via GitHub


danny0405 commented on PR #8493:
URL: https://github.com/apache/hudi/pull/8493#issuecomment-1528930736

   @prashantwason You need to rebase onto the latest master to get the tests 
to pass.





[GitHub] [hudi] danny0405 commented on a diff in pull request #8599: [MINOR] Ensure metrics prefix does not contain any dot.

2023-04-29 Thread via GitHub


danny0405 commented on code in PR #8599:
URL: https://github.com/apache/hudi/pull/8599#discussion_r1181158282


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java:
##
@@ -2175,7 +2175,8 @@ public boolean getPushGatewayRandomJobNameSuffix() {
   }
 
   public String getMetricReporterMetricsNamePrefix() {
-return getStringOrDefault(HoodieMetricsConfig.METRICS_REPORTER_PREFIX);
+// Metrics prefixes should not have a dot as this is usually a separator
+return getStringOrDefault(HoodieMetricsConfig.METRICS_REPORTER_PREFIX).replaceAll("\\.", "_");

Review Comment:
   Can we just report the invalid format and throw an exception?
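
   The two options under discussion can be compared side by side. This is only a sketch: `MetricsPrefixSketch`, `sanitize`, and `validate` are hypothetical names, and the choice of `IllegalArgumentException` is an assumption, not something the PR or the review specifies.

```java
// Hypothetical comparison of the two approaches; not Hudi code.
final class MetricsPrefixSketch {

  // Approach in the diff: silently replace every dot with an underscore.
  static String sanitize(String prefix) {
    return prefix.replaceAll("\\.", "_");
  }

  // Approach suggested in the review: reject a prefix that contains a dot.
  static String validate(String prefix) {
    if (prefix.contains(".")) {
      throw new IllegalArgumentException("Metrics prefix must not contain a dot: " + prefix);
    }
    return prefix;
  }

  public static void main(String[] args) {
    System.out.println(sanitize("hudi.table")); // prints "hudi_table"
    System.out.println(validate("hudi_table")); // prints "hudi_table"
    try {
      validate("hudi.table");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage()); // prints "Metrics prefix must not contain a dot: hudi.table"
    }
  }
}
```

   Sanitizing keeps existing configurations working but changes the reported prefix without telling the user; validating surfaces the problem at configuration time.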






[hudi] branch master updated: [MINOR] Match the directory-filter-regex to the relative directory name (#8601)

2023-04-29 Thread danny0405
This is an automated email from the ASF dual-hosted git repository.

danny0405 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 01992458d86 [MINOR] Match the directory-filter-regex to the relative 
directory name (#8601)
01992458d86 is described below

commit 01992458d86034cbfca79a865d6ee47313fc585e
Author: Prashant Wason 
AuthorDate: Sat Apr 29 20:33:22 2023 -0700

[MINOR] Match the directory-filter-regex to the relative directory name 
(#8601)
---
 .../apache/hudi/metadata/HoodieBackedTableMetadataWriter.java| 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
index df4d2530815..1f5f505364c 100644
--- a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
+++ b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
@@ -673,12 +673,9 @@ public abstract class HoodieBackedTableMetadataWriter implements HoodieTableMeta
   for (DirectoryInfo dirInfo : processedDirectories) {
 if (!dirFilterRegex.isEmpty()) {
   final String relativePath = dirInfo.getRelativePath();
-  if (!relativePath.isEmpty()) {
-Path partitionPath = new Path(datasetBasePath, relativePath);
-if (partitionPath.getName().matches(dirFilterRegex)) {
-  LOG.info("Ignoring directory " + partitionPath + " which matches the filter regex " + dirFilterRegex);
-  continue;
-}
+  if (!relativePath.isEmpty() && relativePath.matches(dirFilterRegex)) {
+LOG.info("Ignoring directory " + relativePath + " which matches the filter regex " + dirFilterRegex);
+continue;
   }
 }
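
The behavioral difference in this commit can be illustrated with plain strings: before, the regex was matched against the last path component only; after, it is matched against the whole relative path. In the sketch below the class `DirFilterSketch` and its methods are hypothetical, and taking the substring after the last `/` is a stand-in for `Path.getName()`.

```java
// Hypothetical illustration of name-based vs relative-path-based matching; not Hudi code.
final class DirFilterSketch {

  // Before the commit: only the last component of the path is matched.
  static boolean matchesName(String relativePath, String regex) {
    String name = relativePath.substring(relativePath.lastIndexOf('/') + 1);
    return !relativePath.isEmpty() && name.matches(regex);
  }

  // After the commit: the whole relative path is matched.
  static boolean matchesRelativePath(String relativePath, String regex) {
    return !relativePath.isEmpty() && relativePath.matches(regex);
  }

  public static void main(String[] args) {
    System.out.println(matchesName("2023/04/tmp", "tmp"));             // prints "true"
    System.out.println(matchesRelativePath("2023/04/tmp", "tmp"));     // prints "false"
    System.out.println(matchesName("2023/04/tmp", "2023/.*"));         // prints "false"
    System.out.println(matchesRelativePath("2023/04/tmp", "2023/.*")); // prints "true"
  }
}
```

Because `String.matches` requires the entire input to match, a filter written for directory names may need a leading `.*/` to keep matching nested directories under the new behavior.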
 



[GitHub] [hudi] danny0405 merged pull request #8601: [MINOR] Match the directory-filter-regex to the relative directory name.

2023-04-29 Thread via GitHub


danny0405 merged PR #8601:
URL: https://github.com/apache/hudi/pull/8601





[GitHub] [hudi] danny0405 commented on a diff in pull request #8602: [MINOR] When a clean operation fails do not continue and throw the exception.

2023-04-29 Thread via GitHub


danny0405 commented on code in PR #8602:
URL: https://github.com/apache/hudi/pull/8602#discussion_r1181157917


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanActionExecutor.java:
##
@@ -256,13 +257,14 @@ public HoodieCleanMetadata execute() {
 cleanMetadataList.add(runPendingClean(table, hoodieInstant));
   } catch (Exception e) {
 LOG.warn("Failed to perform previous clean operation, instant: " + hoodieInstant, e);
+throw e;
   }

Review Comment:
   Can we write a test?
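
   A test for this change would probably assert two things: the failure surfaces to the caller, and later pending operations are not attempted. The sketch below shows that shape with stand-in types only; `RethrowSketch`, `runAll`, and the use of `Supplier` are hypothetical and do not reflect Hudi's test utilities or the real `CleanActionExecutor` signature.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a loop over pending operations; not Hudi code.
final class RethrowSketch {

  // Runs each pending operation in order. A failure is logged and then
  // propagated, mirroring the "throw e;" line added in the diff.
  static List<String> runAll(List<Supplier<String>> pending) {
    List<String> results = new ArrayList<>();
    for (Supplier<String> op : pending) {
      try {
        results.add(op.get());
      } catch (RuntimeException e) {
        System.err.println("Failed to perform previous operation: " + e.getMessage());
        throw e;
      }
    }
    return results;
  }

  public static void main(String[] args) {
    List<Supplier<String>> pending = new ArrayList<>();
    pending.add(() -> "first");
    pending.add(() -> { throw new IllegalStateException("boom"); });
    pending.add(() -> "third");
    try {
      runAll(pending);
    } catch (IllegalStateException e) {
      System.out.println("propagated: " + e.getMessage()); // prints "propagated: boom"
    }
  }
}
```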






[GitHub] [hudi] xccui commented on pull request #8594: [HUDI-6148] Recreate StreamWriteOperatorCoordinator for global failovers

2023-04-29 Thread via GitHub


xccui commented on PR #8594:
URL: https://github.com/apache/hudi/pull/8594#issuecomment-1528929872

   Hi @danny0405, it's the HTTP connection pool (`CPool`) in 
`PoolingHttpClientConnectionManager` used by the s3a FileSystem. It was shut 
down because of an OOM in the JobManager (see 
https://github.com/apache/httpcomponents-client/commit/ca98ad69adad79de57d8b944ba524f7267a795cb).
 I'm not quite sure why the JobManager was not restarted and only triggered a 
job failover. But when a failover is triggered, I believe the whole job, 
including the coordinator, should be reset.





[GitHub] [hudi] danny0405 commented on a diff in pull request #8604: [HUDI-6151] Rollback previously applied commits to MDT when operations are retried.

2023-04-29 Thread via GitHub


danny0405 commented on code in PR #8604:
URL: https://github.com/apache/hudi/pull/8604#discussion_r1181157786


##
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/metadata/SparkHoodieBackedTableMetadataWriter.java:
##
@@ -161,27 +161,28 @@ protected void commit(String instantTime, 
Map to MDT.



##
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/metadata/SparkHoodieBackedTableMetadataWriter.java:
##
@@ -161,27 +161,28 @@ protected void commit(String instantTime, 
Map for e.g



##
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/metadata/SparkHoodieBackedTableMetadataWriter.java:
##
@@ -161,27 +161,28 @@ protected void commit(String instantTime, 
Map alreadyCompletedInstant = metadataMetaClient.getActiveTimeline().filterCompletedInstants().filter(entry -> entry.getTimestamp().equals(instantTime)).lastInstant();
-if (alreadyCompletedInstant.isPresent()) {
-  // this code path refers to a re-attempted commit that got committed to metadata table, but failed in datatable.
-  // for eg, lets say compaction c1 on 1st attempt succeeded in metadata table and failed before committing to datatable.
-  // when retried again, data table will first rollback pending compaction. these will be applied to metadata table, but all changes
-  // are upserts to metadata table and so only a new delta commit will be created.
-  // once rollback is complete, compaction will be retried again, which will eventually hit this code block where the respective commit is
-  // already part of completed commit. So, we have to manually remove the completed instant and proceed.
-  // and it is for the same reason we enabled withAllowMultiWriteOnSameInstant for metadata table.
-  HoodieActiveTimeline.deleteInstantFile(metadataMetaClient.getFs(), metadataMetaClient.getMetaPath(), alreadyCompletedInstant.get());
-  metadataMetaClient.reloadActiveTimeline();
+LOG.info(String.format("%s completed commit at %s being applied to metadata table",
+alreadyCompletedInstant.isPresent() ? "Already" : "Partially", instantTime));

Review Comment:
   applied to metadata table -> applied to metadata table.



##
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/metadata/SparkHoodieBackedTableMetadataWriter.java:
##
@@ -161,27 +161,28 @@ protected void commit(String instantTime, 
Map let's say






[hudi] branch master updated (f3ddcd97625 -> b56ab71c57c)

2023-04-29 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository.

xushiyan pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


from f3ddcd97625 [MINOR] Fix the hudi-cli export command (#8608)
 add b56ab71c57c [MINOR] Update colstats parallelism default to 200 (#8517)

No new revisions were added by this update.

Summary of changes:
 .../main/java/org/apache/hudi/common/config/HoodieMetadataConfig.java   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)



[GitHub] [hudi] xushiyan merged pull request #8517: [MINOR] Update colstats parallelism default to 200

2023-04-29 Thread via GitHub


xushiyan merged PR #8517:
URL: https://github.com/apache/hudi/pull/8517





[GitHub] [hudi] danny0405 commented on a diff in pull request #8606: [MINOR] Check the return value from delete during rollback and finalize to ensure the files actually got deleted.

2023-04-29 Thread via GitHub


danny0405 commented on code in PR #8606:
URL: https://github.com/apache/hudi/pull/8606#discussion_r1181157467


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/BaseRollbackHelper.java:
##
@@ -197,14 +197,21 @@ protected List deleteFiles(HoodieTableMetaClient metaClient,
 // if first rollback attempt failed and retried again, chances that some files are already deleted.
 isDeleted = true;
   }
+
+  if (!isDeleted) {

Review Comment:
   In which case can `metaClient.getFs().delete()` return false if the file 
actually exists there?
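
   The pattern in the diff, "treat a failed delete as success only if the file is already gone", can be sketched with `java.io.File` as a stand-in for the Hadoop `FileSystem` API. With `java.io.File`, `delete()` returns false for a non-empty directory, which gives one concrete case where the target still exists; whether the Hudi file system behaves the same way is exactly the reviewer's open question. The class `DeleteCheckSketch`, its methods, and the choice to throw in the final branch are assumptions, since the diff is cut off at `if (!isDeleted) {`.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical stand-in using java.io.File; not Hudi or Hadoop code.
final class DeleteCheckSketch {

  // Returns normally when the file is gone afterwards.
  // Throws when delete() reported false and the file still exists.
  static void deleteOrFail(File file) {
    boolean isDeleted = file.delete();
    if (!isDeleted && !file.exists()) {
      // The file was already removed, for example by an earlier attempt.
      isDeleted = true;
    }
    if (!isDeleted) {
      throw new IllegalStateException("Failed to delete " + file);
    }
  }

  // Creates a temporary directory containing one file, so that delete() on it fails.
  static File newTempDirWithChild() {
    try {
      File dir = Files.createTempDirectory("delete-sketch").toFile();
      new File(dir, "child").createNewFile();
      return dir;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    File missing = new File(System.getProperty("java.io.tmpdir"), "delete-sketch-missing-" + System.nanoTime());
    deleteOrFail(missing); // no exception: nothing to delete
    File dir = newTempDirWithChild();
    try {
      deleteOrFail(dir);
    } catch (IllegalStateException e) {
      System.out.println("delete reported false and the directory still exists");
    }
    // Clean up the temporary directory.
    new File(dir, "child").delete();
    dir.delete();
  }
}
```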






[GitHub] [hudi] danny0405 commented on a diff in pull request #8605: [HUDI-6152] Fixed the check for older timestamps with second granularity during index tagLocation.

2023-04-29 Thread via GitHub


danny0405 commented on code in PR #8605:
URL: https://github.com/apache/hudi/pull/8605#discussion_r1181157260


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##
@@ -170,9 +171,35 @@ public static List filterKeysFromFile(Path filePath, List candid
 return foundRecordKeys;
   }
 
+  /**
+   * Check if the given commit timestamp is valid for the timeline.
+   *
+   * The commit timestamp is considered to be valid if:
+   *   1. the commit timestamp is present in the timeline, or
+   *   2. the commit timestamp is less than the first commit timestamp in the timeline
+   *
+   * @param commitTimeline  The timeline
+   * @param commitTs        The commit timestamp to check
+   * @return                true if the commit timestamp is valid for the timeline
+   */
   public static boolean checkIfValidCommit(HoodieTimeline commitTimeline, String commitTs) {
-// Check if the last commit ts for this row is 1) present in the timeline or
-// 2) is less than the first commit ts in the timeline
-return !commitTimeline.empty() && commitTimeline.containsOrBeforeTimelineStarts(commitTs);
+if (commitTimeline.empty()) {
+  return false;
+}
+
+// Check for 0.8+ timestamps which have msec granularity
+if (commitTimeline.containsOrBeforeTimelineStarts(commitTs)) {
+  return true;
+}
+
+// Check for older timestamp which have sec granularity and an extension of DEFAULT_MILLIS_EXT may have been added via Timeline operations
+if (commitTs.length() == HoodieInstantTimeGenerator.MILLIS_INSTANT_TIMESTAMP_FORMAT_LENGTH && commitTs.endsWith(HoodieInstantTimeGenerator.DEFAULT_MILLIS_EXT)) {
+  final String actualOlderFormatTs = commitTs.substring(0, commitTs.length() - HoodieInstantTimeGenerator.DEFAULT_MILLIS_EXT.length());
+  if (commitTimeline.containsOrBeforeTimelineStarts(actualOlderFormatTs)) {
+return true;
+  }
+}

Review Comment:
   Shouldn't we fix this method instead: 
`commitTimeline.containsOrBeforeTimelineStarts`? And should we have a version 
number for the timeline?
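
   The fallback in the diff can be traced on plain strings. In this sketch the timeline is a sorted set of instant strings and `containsOrBeforeStart` is a simplified stand-in for `containsOrBeforeTimelineStarts`; the constant values (length 17, extension `"999"`) and the final `return false` are assumptions, because the diff references the constants only by name and is cut off before the method ends.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical, string-only model of the check in the diff; not Hudi code.
final class CommitTsSketch {
  // Assumed values; the diff only references these constants by name.
  static final int MILLIS_TS_LENGTH = 17;          // e.g. yyyyMMddHHmmssSSS
  static final String DEFAULT_MILLIS_EXT = "999";

  // Simplified stand-in: instant present, or lexicographically before the first instant.
  static boolean containsOrBeforeStart(TreeSet<String> timeline, String ts) {
    return timeline.contains(ts) || ts.compareTo(timeline.first()) < 0;
  }

  static boolean checkIfValidCommit(TreeSet<String> timeline, String commitTs) {
    if (timeline.isEmpty()) {
      return false;
    }
    // Timestamps already in the timeline's own format.
    if (containsOrBeforeStart(timeline, commitTs)) {
      return true;
    }
    // Second-granularity timestamps that had the default millis extension appended.
    if (commitTs.length() == MILLIS_TS_LENGTH && commitTs.endsWith(DEFAULT_MILLIS_EXT)) {
      String older = commitTs.substring(0, commitTs.length() - DEFAULT_MILLIS_EXT.length());
      return containsOrBeforeStart(timeline, older);
    }
    return false;
  }

  public static void main(String[] args) {
    TreeSet<String> timeline = new TreeSet<>(Arrays.asList("20210101120000", "20210101130000"));
    System.out.println(checkIfValidCommit(timeline, "20210101120000999")); // prints "true"
    System.out.println(checkIfValidCommit(timeline, "20210101123000999")); // prints "false"
  }
}
```

   The first call succeeds only through the fallback: the 17-character form is neither in the timeline nor before its start, but the stripped 14-character form is present.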






[GitHub] [hudi] danny0405 commented on a diff in pull request #8607: [MINOR] Fixed the reading of instants from very old archive files where ACTION_STATE is not present in instants.

2023-04-29 Thread via GitHub


danny0405 commented on code in PR #8607:
URL: https://github.com/apache/hudi/pull/8607#discussion_r1181156586


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieArchivedTimeline.java:
##
@@ -152,9 +156,13 @@ public void loadCompactionDetailsInMemory(String compactionInstantTime) {
 
   public void loadCompactionDetailsInMemory(String startTs, String endTs) {
 // load compactionPlan
-loadInstants(new TimeRangeFilter(startTs, endTs), true, record ->
-record.get(ACTION_TYPE_KEY).toString().equals(HoodieTimeline.COMPACTION_ACTION)
-&& HoodieInstant.State.INFLIGHT.toString().equals(record.get(ACTION_STATE).toString())
+loadInstants(new TimeRangeFilter(startTs, endTs), true,
+record -> {
+  // Older files don't have action state set.
+  Object action = record.get(ACTION_STATE);
+  return record.get(ACTION_TYPE_KEY).toString().equals(HoodieTimeline.COMPACTION_ACTION)
+&& (action == null || HoodieInstant.State.INFLIGHT.toString().equals(action.toString()));

Review Comment:
   When the action state is null, is the instant state definitely `INFLIGHT` 
for old versions? Can we write a test case?



##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieArchivedTimeline.java:
##
@@ -143,7 +143,11 @@ public void loadInstantDetailsInMemory(String startTs, String endTs) {
 
   public void loadCompletedInstantDetailsInMemory() {
 loadInstants(null, true,
-record -> HoodieInstant.State.COMPLETED.toString().equals(record.get(ACTION_STATE).toString()));
+record -> {
+  // Very old archived instants don't have action state set.
+  Object action = record.get(ACTION_STATE);
+  return action == null || HoodieInstant.State.COMPLETED.toString().equals(action.toString());

Review Comment:
   When the action state is null, is the instant state definitely `COMPLETED` 
for old versions? Can we write a test case?
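
   A test case for the null handling could pin down the three relevant inputs: state missing, state matching, and state not matching. The sketch below uses a `Map` as a stand-in for the archived Avro record; `ArchivedStateSketch`, `isCompleted`, and the field name `"actionState"` are assumptions. It shows what the diff's predicate accepts, not whether accepting a missing state is correct for old archives, which is the question raised above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the null-tolerant filter in the diff; not Hudi code.
final class ArchivedStateSketch {
  static final String ACTION_STATE = "actionState"; // assumed field name

  // Mirrors the predicate in the diff: a missing state is treated as a match.
  static boolean isCompleted(Map<String, Object> record) {
    Object action = record.get(ACTION_STATE);
    return action == null || "COMPLETED".equals(action.toString());
  }

  public static void main(String[] args) {
    Map<String, Object> veryOld = new HashMap<>();   // no action state recorded
    Map<String, Object> completed = new HashMap<>();
    completed.put(ACTION_STATE, "COMPLETED");
    Map<String, Object> inflight = new HashMap<>();
    inflight.put(ACTION_STATE, "INFLIGHT");

    System.out.println(isCompleted(veryOld));   // prints "true"
    System.out.println(isCompleted(completed)); // prints "true"
    System.out.println(isCompleted(inflight));  // prints "false"
  }
}
```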






[GitHub] [hudi] danny0405 commented on pull request #7355: [HUDI-5308] Hive3 query returns null when the where clause has a partition field

2023-04-29 Thread via GitHub


danny0405 commented on PR #7355:
URL: https://github.com/apache/hudi/pull/7355#issuecomment-1528927662

   So it is because an incorrect Hive server version is used?





[hudi] branch master updated: [MINOR] Fix the hudi-cli export command (#8608)

2023-04-29 Thread danny0405
This is an automated email from the ASF dual-hosted git repository.

danny0405 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new f3ddcd97625 [MINOR] Fix the hudi-cli export command (#8608)
f3ddcd97625 is described below

commit f3ddcd97625631f91488da745164bbe7809ecc76
Author: Prashant Wason 
AuthorDate: Sat Apr 29 20:07:29 2023 -0700

[MINOR] Fix the hudi-cli export command (#8608)

1. Removed the hardcoded location of archives
2. Handle the case where the metadata from an archive entry may be null 
(seen in very old archives)
---
 .../java/org/apache/hudi/cli/commands/ExportCommand.java  | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/hudi-cli/src/main/java/org/apache/hudi/cli/commands/ExportCommand.java b/hudi-cli/src/main/java/org/apache/hudi/cli/commands/ExportCommand.java
index 54227a613e4..e81a532f2a8 100644
--- a/hudi-cli/src/main/java/org/apache/hudi/cli/commands/ExportCommand.java
+++ b/hudi-cli/src/main/java/org/apache/hudi/cli/commands/ExportCommand.java
@@ -44,6 +44,8 @@ import org.apache.hudi.common.table.timeline.HoodieTimeline;
 import org.apache.hudi.common.table.timeline.TimelineMetadataUtils;
 import org.apache.hudi.common.util.collection.ClosableIterator;
 import org.apache.hudi.exception.HoodieException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 import org.springframework.shell.standard.ShellComponent;
 import org.springframework.shell.standard.ShellMethod;
 import org.springframework.shell.standard.ShellOption;
@@ -67,6 +69,8 @@ import java.util.stream.Collectors;
 @ShellComponent
 public class ExportCommand {
 
+  private static final Logger LOG = LoggerFactory.getLogger(ExportCommand.class);
+
   @ShellMethod(key = "export instants", value = "Export Instants and their metadata from the Timeline")
   public String exportInstants(
   @ShellOption(value = {"--limit"}, help = "Limit Instants", defaultValue = "-1") final Integer limit,
@@ -77,7 +81,7 @@ public class ExportCommand {
   throws Exception {
 
 final String basePath = HoodieCLI.getTableMetaClient().getBasePath();
-final Path archivePath = new Path(basePath + "/.hoodie/.commits_.archive*");
+final Path archivePath = new Path(HoodieCLI.getTableMetaClient().getArchivePath());
 final Set actionSet = new HashSet(Arrays.asList(filter.split(",")));
 int numExports = limit == -1 ? Integer.MAX_VALUE : limit;
 int numCopied = 0;
@@ -121,7 +125,7 @@ public class ExportCommand {
   Reader reader = HoodieLogFormat.newReader(fileSystem, new HoodieLogFile(fs.getPath()), HoodieArchivedMetaEntry.getClassSchema());
 
   // read the avro blocks
-  while (reader.hasNext() && copyCount < limit) {
+  while (reader.hasNext() && copyCount++ < limit) {
 HoodieAvroDataBlock blk = (HoodieAvroDataBlock) reader.next();
 try (ClosableIterator> recordItr = blk.getRecordIterator(HoodieRecordType.AVRO)) {
   while (recordItr.hasNext()) {
@@ -158,11 +162,12 @@ public class ExportCommand {
 }
 
 final String instantTime = 
archiveEntryRecord.get("commitTime").toString();
+if (metadata == null) {
+  LOG.error("Could not load metadata for action " + action + " at 
instant time " + instantTime);
+  continue;
+}
 final String outPath = localFolder + Path.SEPARATOR + instantTime 
+ "." + action;
 writeToFile(outPath, HoodieAvroUtils.avroToJson(metadata, true));
-if (++copyCount == limit) {
-  break;
-}
   }
 }
   }
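
The hunks above show two behaviours: a limit of -1 is treated as "no limit" (via `numExports`), and entries whose metadata is null are logged and skipped. A minimal sketch of how the two can be combined, using plain strings in place of the archived Avro records. The class and method names are hypothetical and not part of hudi-cli. Note that this sketch counts exported entries, whereas the patched loop condition is evaluated once per log block.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the ExportCommand implementation.
final class ExportLimitSketch {

  // Returns at most `limit` non-null entries; a limit of -1 means "export everything".
  static List<String> export(List<String> metadataEntries, int limit) {
    int numExports = limit == -1 ? Integer.MAX_VALUE : limit;
    List<String> exported = new ArrayList<>();
    for (String metadata : metadataEntries) {
      if (exported.size() >= numExports) {
        break; // limit reached
      }
      if (metadata == null) {
        continue; // metadata could not be loaded: skip instead of failing
      }
      exported.add(metadata);
    }
    return exported;
  }
}
```

Skipped entries do not count towards the limit here; whether that is the desired behaviour for the real command is an assumption.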



[GitHub] [hudi] danny0405 commented on pull request #8608: [MINOR] Fixed the hudi-cli export command.

2023-04-29 Thread via GitHub


danny0405 commented on PR #8608:
URL: https://github.com/apache/hudi/pull/8608#issuecomment-1528927412

   The failed test case: 
https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=16754&view=logs&j=dcedfe73-9485-5cc5-817a-73b61fc5dcb0&t=746585d8-b50a-55c3-26c5-517d93af9934&l=37674
   
   The failure should not be caused by this patch; I will merge it soon.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] danny0405 merged pull request #8608: [MINOR] Fixed the hudi-cli export command.

2023-04-29 Thread via GitHub


danny0405 merged PR #8608:
URL: https://github.com/apache/hudi/pull/8608





[GitHub] [hudi] danny0405 commented on pull request #8594: [HUDI-6148] Recreate StreamWriteOperatorCoordinator for global failovers

2023-04-29 Thread via GitHub


danny0405 commented on PR #8594:
URL: https://github.com/apache/hudi/pull/8594#issuecomment-1528927069

   Thanks for the contribution, @xccui. Can you describe which kind of 
connection pool is not released when a global failure is triggered?





[GitHub] [hudi] danny0405 commented on pull request #8596: [BUG-FIX] use try with resource to close stream

2023-04-29 Thread via GitHub


danny0405 commented on PR #8596:
URL: https://github.com/apache/hudi/pull/8596#issuecomment-1528926785

   Hi, can you elaborate a little more on what would happen if the input stream 
is not closed properly? Can you also write a test case to demonstrate that the 
issue is resolved?





[GitHub] [hudi] hudi-bot commented on pull request #8596: [BUG-FIX] use try with resource to close stream

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8596:
URL: https://github.com/apache/hudi/pull/8596#issuecomment-1528925367

   
   ## CI report:
   
   * 66912e50cc13e9fdfeaddd68bfe53aead0f493cc Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16731)
 
   * 8d29d9571d94e3d654e87151b16ef99ff02762b4 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16763)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





[GitHub] [hudi] hudi-bot commented on pull request #8596: [BUG-FIX] use try with resource to close stream

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8596:
URL: https://github.com/apache/hudi/pull/8596#issuecomment-1528924611

   
   ## CI report:
   
   * 66912e50cc13e9fdfeaddd68bfe53aead0f493cc Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16731)
 
   * 8d29d9571d94e3d654e87151b16ef99ff02762b4 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8190: [HUDI-5936] Fix serialization problem when FileStatus is not serializable

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8190:
URL: https://github.com/apache/hudi/pull/8190#issuecomment-1528887656

   
   ## CI report:
   
   * 7c71b63797be01ee91268c2520f82b18b3f13b7c Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16762)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8190: [HUDI-5936] Fix serialization problem when FileStatus is not serializable

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8190:
URL: https://github.com/apache/hudi/pull/8190#issuecomment-1528869463

   
   ## CI report:
   
   * 1557ca7eeb8ef85bb76fe75ac38f0201dcf6de96 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15726)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15734)
 
   * 7c71b63797be01ee91268c2520f82b18b3f13b7c Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16762)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8190: [HUDI-5936] Fix serialization problem when FileStatus is not serializable

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8190:
URL: https://github.com/apache/hudi/pull/8190#issuecomment-1528868341

   
   ## CI report:
   
   * 1557ca7eeb8ef85bb76fe75ac38f0201dcf6de96 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15726)
 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15734)
 
   * 7c71b63797be01ee91268c2520f82b18b3f13b7c UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8517: [MINOR] Update colstats parallelism default to 200

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8517:
URL: https://github.com/apache/hudi/pull/8517#issuecomment-1528828147

   
   ## CI report:
   
   * 100ac5f5f9d8e9935625dda5419d5d66a92126a6 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16761)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8611: [HUDI-6157] Fix potential data loss for flink streaming source from table with multi writer

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8611:
URL: https://github.com/apache/hudi/pull/8611#issuecomment-1528814552

   
   ## CI report:
   
   * e3b3799e1e360710b99bc089f193b771fc8c4db3 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16759)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8517: [MINOR] Update colstats parallelism default to 200

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8517:
URL: https://github.com/apache/hudi/pull/8517#issuecomment-1528805834

   
   ## CI report:
   
   * 250a4dfe87b170e5df2ec282b9214e90f77fec45 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16503)
 
   * 100ac5f5f9d8e9935625dda5419d5d66a92126a6 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16761)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8517: [MINOR] Update colstats parallelism default to 200

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8517:
URL: https://github.com/apache/hudi/pull/8517#issuecomment-1528804334

   
   ## CI report:
   
   * 250a4dfe87b170e5df2ec282b9214e90f77fec45 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16503)
 
   * 100ac5f5f9d8e9935625dda5419d5d66a92126a6 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8610: [HUDI-6156] prevent leaving tmp file in timeline when multi process t…

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1528802806

   
   ## CI report:
   
   * f34ffd6ccf4fd366ade5dad8487ff9a0a248bec8 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16758)
 
   
   



[hudi] branch master updated: [HUDI-6035] Make simple index parallelism auto inferred (#8468)

2023-04-29 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository.

xushiyan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/master by this push:
 new 78ad883a067 [HUDI-6035] Make simple index parallelism auto inferred 
(#8468)
78ad883a067 is described below

commit 78ad883a067537bfef866dd5388faa4922efbd58
Author: clownxc <598457...@qq.com>
AuthorDate: Sat Apr 29 22:25:07 2023 +0800

[HUDI-6035] Make simple index parallelism auto inferred (#8468)


-

Co-authored-by: ClownXC 
Co-authored-by: Raymond Xu 
---
 .../main/java/org/apache/hudi/config/HoodieIndexConfig.java| 10 +-
 .../java/org/apache/hudi/index/simple/HoodieSimpleIndex.java   |  7 ++-
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git 
a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieIndexConfig.java
 
b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieIndexConfig.java
index fd50fdb0f6d..dc0b1cd5f4a 100644
--- 
a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieIndexConfig.java
+++ 
b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieIndexConfig.java
@@ -189,14 +189,14 @@ public class HoodieIndexConfig extends HoodieConfig {
 
   public static final ConfigProperty SIMPLE_INDEX_PARALLELISM = 
ConfigProperty
   .key("hoodie.simple.index.parallelism")
-  .defaultValue("100")
+  .defaultValue("0")
   .markAdvanced()
   .withDocumentation("Only applies if index type is SIMPLE. "
   + "This limits the parallelism of fetching records from the base 
files of affected "
-  + "partitions. The index picks the configured parallelism if the 
number of base "
-  + "files is larger than this configured value; otherwise, the number 
of base files "
-  + "is used as the parallelism. If the indexing stage is slow due to 
the limited "
-  + "parallelism, you can increase this to tune the performance.");
+  + "partitions. By default, this is auto computed based on input 
workload characteristics. "
+  + "If the parallelism is explicitly configured by the user, the 
user-configured "
+  + "value is used in defining the actual parallelism. If the indexing 
stage is slow "
+  + "due to the limited parallelism, you can increase this to tune the 
performance.");
 
   public static final ConfigProperty GLOBAL_SIMPLE_INDEX_PARALLELISM = 
ConfigProperty
   .key("hoodie.global.simple.index.parallelism")
diff --git 
a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/simple/HoodieSimpleIndex.java
 
b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/simple/HoodieSimpleIndex.java
index 95823ff51e3..dbc49d0655f 100644
--- 
a/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/simple/HoodieSimpleIndex.java
+++ 
b/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/simple/HoodieSimpleIndex.java
@@ -107,11 +107,16 @@ public class HoodieSimpleIndex
   
.getString(HoodieIndexConfig.SIMPLE_INDEX_INPUT_STORAGE_LEVEL_VALUE));
 }
 
+int inputParallelism = inputRecords.getNumPartitions();
+int configuredSimpleIndexParallelism = config.getSimpleIndexParallelism();
+// NOTE: Target parallelism could be overridden by the config
+int targetParallelism =
+configuredSimpleIndexParallelism > 0 ? 
configuredSimpleIndexParallelism : inputParallelism;
 HoodiePairData> keyedInputRecords =
 inputRecords.mapToPair(record -> new ImmutablePair<>(record.getKey(), 
record));
 HoodiePairData existingLocationsOnTable =
 fetchRecordLocationsForAffectedPartitions(keyedInputRecords.keys(), 
context, hoodieTable,
-config.getSimpleIndexParallelism());
+targetParallelism);
 
 HoodieData> taggedRecords =
 keyedInputRecords.leftOuterJoin(existingLocationsOnTable).map(entry -> 
{
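
The parallelism resolution introduced in the HoodieSimpleIndex hunk above can be sketched in isolation as follows. This is a minimal illustration under stated assumptions: the class and method names are hypothetical, and only the ternary mirrors the diff (a configured value greater than 0 wins, otherwise the number of input partitions is used, which is what the new default of "0" selects).

```java
// Hypothetical illustration only; not part of the Hudi code base.
final class SimpleIndexParallelismSketch {

  // configuredParallelism: value of hoodie.simple.index.parallelism (0 = auto-infer).
  // inputParallelism: number of partitions of the incoming records.
  static int resolveTargetParallelism(int configuredParallelism, int inputParallelism) {
    // A positive user-configured value overrides the inferred one.
    return configuredParallelism > 0 ? configuredParallelism : inputParallelism;
  }

  public static void main(String[] args) {
    System.out.println(resolveTargetParallelism(0, 8));   // default: follow the input
    System.out.println(resolveTargetParallelism(200, 8)); // explicit override
  }
}
```

With this rule, a user who previously relied on the old default of 100 would need to set the config explicitly to keep that value.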



[GitHub] [hudi] xushiyan merged pull request #8468: [HUDI-6035] Make simple index parallelism auto inferred

2023-04-29 Thread via GitHub


xushiyan merged PR #8468:
URL: https://github.com/apache/hudi/pull/8468





[GitHub] [hudi] hudi-bot commented on pull request #8468: [HUDI-6035] Make simple index parallelism auto inferred

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8468:
URL: https://github.com/apache/hudi/pull/8468#issuecomment-1528791113

   
   ## CI report:
   
   * 9bce0a1d69458192721d929a554ef16281a13bed UNKNOWN
   * 1849bb1337d66a6433cad4cd38f0f1b978390b31 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16757)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8611: [HUDI-6157] Fix potential data loss for flink streaming source from table with multi writer

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8611:
URL: https://github.com/apache/hudi/pull/8611#issuecomment-1528781383

   
   ## CI report:
   
   * e3b3799e1e360710b99bc089f193b771fc8c4db3 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16759)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8611: [HUDI-6157] Fix potential data loss for flink streaming source from table with multi writer

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8611:
URL: https://github.com/apache/hudi/pull/8611#issuecomment-1528780076

   
   ## CI report:
   
   * e3b3799e1e360710b99bc089f193b771fc8c4db3 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8594: [HUDI-6148] Recreate StreamWriteOperatorCoordinator for global failovers

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8594:
URL: https://github.com/apache/hudi/pull/8594#issuecomment-1528778716

   
   ## CI report:
   
   * ff459b2c4de2e4adcdd30977193b026d34636c7b Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16755)
 
   
   



[jira] [Updated] (HUDI-6157) Fix potential data loss for flink streaming source from table with multi writer

2023-04-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-6157:
-
Labels: pull-request-available  (was: )

> Fix potential data loss for flink streaming source from table with multi 
> writer
> ---
>
> Key: HUDI-6157
> URL: https://issues.apache.org/jira/browse/HUDI-6157
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: flink-sql
>Reporter: Danny Chen
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [hudi] danny0405 opened a new pull request, #8611: [HUDI-6157] Fix potential data loss for flink streaming source from table with multi writer

2023-04-29 Thread via GitHub


danny0405 opened a new pull request, #8611:
URL: https://github.com/apache/hudi/pull/8611

   …able with multi writer
   
   ### Change Logs
   
   _Describe context and summary for this change. Highlight if any code was 
copied._
   
   ### Impact
   
   _Describe any public API or user-facing feature change or any performance 
impact._
   
   ### Risk level (write none, low medium or high below)
   
   _If medium or high, explain what verification was done to mitigate the 
risks._
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, 
config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the 
default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. 
Please create a Jira ticket, attach the
 ticket number here and follow the 
[instruction](https://hudi.apache.org/contribute/developer-setup#website) to 
make
 changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[jira] [Created] (HUDI-6157) Fix potential data loss for flink streaming source from table with multi writer

2023-04-29 Thread Danny Chen (Jira)
Danny Chen created HUDI-6157:


 Summary: Fix potential data loss for flink streaming source from 
table with multi writer
 Key: HUDI-6157
 URL: https://issues.apache.org/jira/browse/HUDI-6157
 Project: Apache Hudi
  Issue Type: Bug
  Components: flink-sql
Reporter: Danny Chen








[GitHub] [hudi] hudi-bot commented on pull request #8609: [HUDI-6154] Introduced rety while reading hoodie.properties to deal with parallel updates.

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8609:
URL: https://github.com/apache/hudi/pull/8609#issuecomment-1528768431

   
   ## CI report:
   
   * 33114fa16eff146842ea56a8e178441ed448866f Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16756)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8608: [MINOR] Fixed the hudi-cli export command.

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8608:
URL: https://github.com/apache/hudi/pull/8608#issuecomment-1528757217

   
   ## CI report:
   
   * bef668a7c58f3af8ccaf2b70bdda69c5db2e9952 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16754)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #7355: [HUDI-5308] Hive3 query returns null when the where clause has a partition field

2023-04-29 Thread via GitHub


hudi-bot commented on PR #7355:
URL: https://github.com/apache/hudi/pull/7355#issuecomment-1528754157

   
   ## CI report:
   
   * e371363eb434b8c1878b0b1cf5d26121303c05e1 Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16740)
 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16753)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8610: [HUDI-6156] prevent leaving tmp file in timeline when multi process t…

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1528739547

   
   ## CI report:
   
   * f34ffd6ccf4fd366ade5dad8487ff9a0a248bec8 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16758)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8610: [HUDI-6156] prevent leaving tmp file in timeline when multi process t…

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8610:
URL: https://github.com/apache/hudi/pull/8610#issuecomment-1528737263

   
   ## CI report:
   
   * f34ffd6ccf4fd366ade5dad8487ff9a0a248bec8 UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8468: [HUDI-6035] Make simple index parallelism auto inferred

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8468:
URL: https://github.com/apache/hudi/pull/8468#issuecomment-1528737160

   
   ## CI report:
   
   * 73d1149b6adf91c85e2cd45ef419b8351c07f2cf Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16360)
 
   * 9bce0a1d69458192721d929a554ef16281a13bed UNKNOWN
   * 1849bb1337d66a6433cad4cd38f0f1b978390b31 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16757)
 
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8468: [HUDI-6035] Make simple index parallelism auto inferred

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8468:
URL: https://github.com/apache/hudi/pull/8468#issuecomment-1528735680

   
   ## CI report:
   
   * 73d1149b6adf91c85e2cd45ef419b8351c07f2cf Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16360)
 
   * 9bce0a1d69458192721d929a554ef16281a13bed UNKNOWN
   * 1849bb1337d66a6433cad4cd38f0f1b978390b31 UNKNOWN
   
   



[jira] [Updated] (HUDI-6156) prevent leaving tmp file in timeline when multi task try to complete the same instant

2023-04-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-6156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HUDI-6156:
-
Labels: pull-request-available  (was: )

> prevent leaving tmp file in timeline when multi task try to complete the same 
> instant
> -
>
> Key: HUDI-6156
> URL: https://issues.apache.org/jira/browse/HUDI-6156
> Project: Apache Hudi
>  Issue Type: Bug
>Reporter: HBG
>Priority: Major
>  Labels: pull-request-available
>






[GitHub] [hudi] hbgstc123 opened a new pull request, #8610: [HUDI-6156] prevent leaving tmp file in timeline when multi process t…

2023-04-29 Thread via GitHub


hbgstc123 opened a new pull request, #8610:
URL: https://github.com/apache/hudi/pull/8610

   …ry to complete the same instant
   
   ### Change Logs
   
   Currently, if two tasks try to complete the same instant, an "xxx.tmp" file is left behind in the .hoodie dir.
   
   For example, in a Flink ingestion job with offline compaction, both the ingestion job and the offline compaction can trigger a clean task, so two clean tasks may end up running the same clean instant. The slower one then fails to rename its tmp file (e.g. 20230429171948763.clean.tmp) to the final file name (e.g. 20230429171948763.clean), leaving the tmp file in the timeline.
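   
   The race described above can be illustrated with a minimal sketch. This is not the change made in this PR and not Hudi's API: `InstantCompletionSketch` and `completeInstant` are hypothetical names, and it assumes a local file system accessed through `java.nio.file` rather than Hudi's own storage layer. It only shows the general idea that the task which loses the rename should remove its own tmp file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

class InstantCompletionSketch {

    // Hypothetical helper (not part of Hudi): writes the content to a tmp file,
    // then tries to publish it under the final instant file name.
    // Returns true if this caller published the file, false if the final file
    // already existed because another task completed the instant first.
    static boolean completeInstant(Path timelineDir, String instantFileName, byte[] content)
            throws IOException {
        Path finalFile = timelineDir.resolve(instantFileName);
        // A uniquely named tmp file per caller, so two tasks never share one tmp file.
        Path tmp = Files.createTempFile(timelineDir, instantFileName + ".", ".tmp");
        try {
            Files.write(tmp, content);
            // Without REPLACE_EXISTING, Files.move fails if the target already exists.
            Files.move(tmp, finalFile);
            return true;
        } catch (FileAlreadyExistsException e) {
            // The other task won the race; nothing more to publish.
            return false;
        } finally {
            // After a successful move this is a no-op; after a lost race it
            // removes the leftover tmp file instead of leaving it in the timeline.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("hoodie-sketch");
        byte[] payload = "clean metadata".getBytes(StandardCharsets.UTF_8);
        boolean first = completeInstant(dir, "20230429171948763.clean", payload);
        boolean second = completeInstant(dir, "20230429171948763.clean", payload);
        System.out.println("first=" + first + ", second=" + second);
    }
}
```

   In this sketch the second caller simply reports that the instant was already completed; whether a real writer should fail, retry, or ignore that case is a design decision outside the scope of the example.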
   
   ### Impact
   
   _Describe any public API or user-facing feature change or any performance 
impact._
   
   ### Risk level (write none, low medium or high below)
   
   _If medium or high, explain what verification was done to mitigate the 
risks._
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, 
config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   





[GitHub] [hudi] hudi-bot commented on pull request #7173: [HUDI-5189] Make HiveAvroSerializer compatible with hive3

2023-04-29 Thread via GitHub


hudi-bot commented on PR #7173:
URL: https://github.com/apache/hudi/pull/7173#issuecomment-1528735209

   
   ## CI report:
   
   * 33e116e83e6ca348dc6039db0f76ed5df50a731f Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16721) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16730) Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16752) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16741)
   
   



[jira] [Created] (HUDI-6156) prevent leaving tmp file in timeline when multi task try to complete the same instant

2023-04-29 Thread HBG (Jira)
HBG created HUDI-6156:
-

 Summary: prevent leaving tmp file in timeline when multi task try 
to complete the same instant
 Key: HUDI-6156
 URL: https://issues.apache.org/jira/browse/HUDI-6156
 Project: Apache Hudi
  Issue Type: Bug
Reporter: HBG








[GitHub] [hudi] hudi-bot commented on pull request #8468: [HUDI-6035] Make simple index parallelism auto inferred

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8468:
URL: https://github.com/apache/hudi/pull/8468#issuecomment-1528723365

   
   ## CI report:
   
   * 73d1149b6adf91c85e2cd45ef419b8351c07f2cf Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16360)
   * 9bce0a1d69458192721d929a554ef16281a13bed UNKNOWN
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8607: [MINOR] Fixed the reading of instants from very old archive files where ACTION_STATE is not present in instants.

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8607:
URL: https://github.com/apache/hudi/pull/8607#issuecomment-1528718856

   
   ## CI report:
   
   * 18ec6f29e045dbb17ba587b54279b807492f71f0 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16751)
   
   



[GitHub] [hudi] hudi-bot commented on pull request #8606: [MINOR] Check the return value from delete during rollback and finalize to ensure the files actually got deleted.

2023-04-29 Thread via GitHub


hudi-bot commented on PR #8606:
URL: https://github.com/apache/hudi/pull/8606#issuecomment-1528705558

   
   ## CI report:
   
   * e306a06b8c62c4218a0833e271b52364e05c4b50 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=16750)
   
   