Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2024-01-09 Thread via GitHub


waywtdcc commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1884005603

   The meta file of version 1.0.0 is in avro format, and the old version is in 
json format. This should also be compatible.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2024-01-09 Thread via GitHub


boneanxs commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1882755083

   > > No, we would fix the ugrade for 1.0.0 GA release.
   > 
   > Does this mean that the official version 1.0.0 will support automatic 
upgrade and fix this bug?
   
   @waywtdcc Yeap, will fix it recently.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2024-01-08 Thread via GitHub


waywtdcc commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1882506165

   > No, we would fix the ugrade for 1.0.0 GA release.
   
   Does this mean that the official version 1.0.0 will support automatic 
upgrade and fix this bug?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2024-01-04 Thread via GitHub


danny0405 commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1877989899

   No, we would fix the ugrade for 1.0.0 GA release.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2024-01-04 Thread via GitHub


waywtdcc commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1876697501

   
![image](https://github.com/apache/hudi/assets/59957056/6cf60022-5d42-4e12-afd2-dac7a1f85e4b)
   This is an error when upgrading from 0.12.1 to 1.0.0-beta1. Is this 
compatible with the file format of the old version?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-16 Thread via GitHub


boneanxs commented on code in PR #9617:
URL: https://github.com/apache/hudi/pull/9617#discussion_r1361422542


##
hudi-common/src/main/java/org/apache/hudi/common/config/LockConfiguration.java:
##
@@ -35,19 +35,23 @@ public class LockConfiguration implements Serializable {
   public static final String 
LOCK_ACQUIRE_RETRY_MAX_WAIT_TIME_IN_MILLIS_PROP_KEY = LOCK_PREFIX + 
"max_wait_time_ms_between_retry";
 
   public static final String 
LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS_PROP_KEY = LOCK_PREFIX + 
"client.wait_time_ms_between_retry";
+  public static final String 
DEFAULT_LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS = String.valueOf(5000L);

Review Comment:
   `TimeGeneratorBase` need this default value, it cannot access 
`HoodieLockConfig` since it's not in `hudi-common`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-16 Thread via GitHub


yihua commented on code in PR #9617:
URL: https://github.com/apache/hudi/pull/9617#discussion_r1361379762


##
hudi-common/src/main/java/org/apache/hudi/common/config/LockConfiguration.java:
##
@@ -35,19 +35,23 @@ public class LockConfiguration implements Serializable {
   public static final String 
LOCK_ACQUIRE_RETRY_MAX_WAIT_TIME_IN_MILLIS_PROP_KEY = LOCK_PREFIX + 
"max_wait_time_ms_between_retry";
 
   public static final String 
LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS_PROP_KEY = LOCK_PREFIX + 
"client.wait_time_ms_between_retry";
+  public static final String 
DEFAULT_LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS = String.valueOf(5000L);

Review Comment:
   Wrt this PR, is this refactoring existing code?  We have common lock configs 
in `HoodieLockConfig` class.



##
hudi-common/src/main/java/org/apache/hudi/common/config/LockConfiguration.java:
##
@@ -35,19 +35,23 @@ public class LockConfiguration implements Serializable {
   public static final String 
LOCK_ACQUIRE_RETRY_MAX_WAIT_TIME_IN_MILLIS_PROP_KEY = LOCK_PREFIX + 
"max_wait_time_ms_between_retry";
 
   public static final String 
LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS_PROP_KEY = LOCK_PREFIX + 
"client.wait_time_ms_between_retry";
+  public static final String 
DEFAULT_LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS = String.valueOf(5000L);

Review Comment:
   @danny0405 @boneanxs ^



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-13 Thread via GitHub


danny0405 merged PR #9617:
URL: https://github.com/apache/hudi/pull/9617


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1758932211

   
   ## CI report:
   
   * 8821b701b0ae25d6bcf17dd95f36be9cc8de084b UNKNOWN
   * 9262fe65ccfb3d7f74eb0ee35d4f822eeb1a67ea Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20295)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1758826067

   
   ## CI report:
   
   * 8821b701b0ae25d6bcf17dd95f36be9cc8de084b UNKNOWN
   * bfea4593820ae5257f93f822986ef58168f69dde Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20289)
 
   * 9262fe65ccfb3d7f74eb0ee35d4f822eeb1a67ea Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20295)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1758820588

   
   ## CI report:
   
   * 8821b701b0ae25d6bcf17dd95f36be9cc8de084b UNKNOWN
   * bfea4593820ae5257f93f822986ef58168f69dde Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20289)
 
   * 9262fe65ccfb3d7f74eb0ee35d4f822eeb1a67ea UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1757581642

   
   ## CI report:
   
   * 8821b701b0ae25d6bcf17dd95f36be9cc8de084b UNKNOWN
   * bfea4593820ae5257f93f822986ef58168f69dde Azure: 
[FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20289)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1757366765

   
   ## CI report:
   
   * 8821b701b0ae25d6bcf17dd95f36be9cc8de084b UNKNOWN
   * 2543e0a84a337b02e4af14a3f8b2c4dafbd6d558 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20266)
 
   * bfea4593820ae5257f93f822986ef58168f69dde Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20289)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-11 Thread via GitHub


hudi-bot commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1757346235

   
   ## CI report:
   
   * 8821b701b0ae25d6bcf17dd95f36be9cc8de084b UNKNOWN
   * 2543e0a84a337b02e4af14a3f8b2c4dafbd6d558 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20266)
 
   * bfea4593820ae5257f93f822986ef58168f69dde UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-11 Thread via GitHub


boneanxs commented on code in PR #9617:
URL: https://github.com/apache/hudi/pull/9617#discussion_r1354592241


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/TimeGeneratorBase.java:
##
@@ -0,0 +1,147 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.table.timeline;
+
+import org.apache.hudi.common.config.HoodieTimeGeneratorConfig;
+import org.apache.hudi.common.config.LockConfiguration;
+import org.apache.hudi.common.config.SerializableConfiguration;
+import org.apache.hudi.common.lock.LockProvider;
+import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.exception.HoodieLockException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.Serializable;
+import java.util.concurrent.TimeUnit;
+
+import static 
org.apache.hudi.common.config.LockConfiguration.DEFAULT_LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS;
+import static 
org.apache.hudi.common.config.LockConfiguration.DEFAULT_LOCK_ACQUIRE_NUM_RETRIES;
+import static 
org.apache.hudi.common.config.LockConfiguration.DEFAULT_LOCK_ACQUIRE_WAIT_TIMEOUT_MS;
+import static 
org.apache.hudi.common.config.LockConfiguration.LOCK_ACQUIRE_CLIENT_NUM_RETRIES_PROP_KEY;
+import static 
org.apache.hudi.common.config.LockConfiguration.LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS_PROP_KEY;
+import static 
org.apache.hudi.common.config.LockConfiguration.LOCK_ACQUIRE_WAIT_TIMEOUT_MS_PROP_KEY;
+
+/**
+ * Base time generator facility that maintains lock-related utilities.
+ */
+public abstract class TimeGeneratorBase implements TimeGenerator, Serializable 
{
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(TimeGeneratorBase.class);
+
+  /**
+   * The lock provider.
+   */
+  private volatile LockProvider lockProvider;
+  /**
+   * The maximum times to retry in case there are failures.
+   */
+  private final int maxRetries;
+  /**
+   * The maximum time to wait for each time generation to resolve the clock 
skew issue on distributed hosts.
+   */
+  private final long maxWaitTimeInMs;
+  /**
+   * The maximum time to block for acquiring a lock.
+   */
+  private final int lockAcquireWaitTimeInMs;
+
+  protected final HoodieTimeGeneratorConfig config;
+  private final LockConfiguration lockConfiguration;
+
+  /**
+   * The hadoop configuration.
+   */
+  private final SerializableConfiguration hadoopConf;
+
+  public TimeGeneratorBase(HoodieTimeGeneratorConfig config, 
SerializableConfiguration hadoopConf) {
+this.config = config;
+this.lockConfiguration = config.getLockConfiguration();
+this.hadoopConf = hadoopConf;
+
+maxRetries = 
lockConfiguration.getConfig().getInteger(LOCK_ACQUIRE_CLIENT_NUM_RETRIES_PROP_KEY,
+Integer.parseInt(DEFAULT_LOCK_ACQUIRE_NUM_RETRIES));
+lockAcquireWaitTimeInMs = 
lockConfiguration.getConfig().getInteger(LOCK_ACQUIRE_WAIT_TIMEOUT_MS_PROP_KEY,
+Integer.parseInt(DEFAULT_LOCK_ACQUIRE_WAIT_TIMEOUT_MS));
+maxWaitTimeInMs = 
lockConfiguration.getConfig().getLong(LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS_PROP_KEY,
+Long.parseLong(DEFAULT_LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS));
+  }
+
+  protected LockProvider getLockProvider() {
+// Perform lazy initialization of lock provider only if needed
+if (lockProvider == null) {
+  synchronized (this) {
+if (lockProvider == null) {
+  String lockProviderClass = 
lockConfiguration.getConfig().getString("hoodie.write.lock.provider");
+  LOG.info("LockProvider for TimeGenerator: " + lockProviderClass);
+  lockProvider = (LockProvider) 
ReflectionUtils.loadClass(lockProviderClass,
+  lockConfiguration, hadoopConf.get());
+}
+  }
+}
+return lockProvider;
+  }
+
+  public void lock() {

Review Comment:
   Use `RetryHelper` to simplify code here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-11 Thread via GitHub


boneanxs commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1757292134

   > Looks good overall, is there anyway we can abstract that failure retries 
as a common utility? Add add a UT for TimeGenerator.
   done


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-10 Thread via GitHub


hudi-bot commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1754921125

   
   ## CI report:
   
   * 8821b701b0ae25d6bcf17dd95f36be9cc8de084b UNKNOWN
   * 2543e0a84a337b02e4af14a3f8b2c4dafbd6d558 Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20266)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-10 Thread via GitHub


hudi-bot commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1754602002

   
   ## CI report:
   
   * ceede4af8af859baa21e5a2178e421bfd84c7ead Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20142)
 
   * 8821b701b0ae25d6bcf17dd95f36be9cc8de084b UNKNOWN
   * 2543e0a84a337b02e4af14a3f8b2c4dafbd6d558 Azure: 
[PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20266)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-10 Thread via GitHub


hudi-bot commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1754587571

   
   ## CI report:
   
   * ceede4af8af859baa21e5a2178e421bfd84c7ead Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20142)
 
   * 8821b701b0ae25d6bcf17dd95f36be9cc8de084b UNKNOWN
   * 2543e0a84a337b02e4af14a3f8b2c4dafbd6d558 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-10 Thread via GitHub


boneanxs commented on code in PR #9617:
URL: https://github.com/apache/hudi/pull/9617#discussion_r1351725337


##
hudi-common/src/main/java/org/apache/hudi/client/transaction/lock/InProcessLockProvider.java:
##
@@ -19,16 +19,15 @@
 
 package org.apache.hudi.client.transaction.lock;
 
+import org.apache.hudi.common.config.HoodieCommonConfig;
 import org.apache.hudi.common.config.LockConfiguration;
 import org.apache.hudi.common.config.TypedProperties;
 import org.apache.hudi.common.lock.LockProvider;
 import org.apache.hudi.common.lock.LockState;
 import org.apache.hudi.common.util.ValidationUtils;
-import org.apache.hudi.config.HoodieWriteConfig;
 import org.apache.hudi.exception.HoodieLockException;
 
 import org.apache.hadoop.conf.Configuration;
-import org.jetbrains.annotations.NotNull;

Review Comment:
   Remove import `NotNull` here since hudi-common doesn't include dep 
`org.jetbrains.annotations`, which could meet ClassNotFound issue when 
compiling. Also, I see other lockProviders also don't use this annotation, like 
`ZookeeperBasedLockProvider` and `HiveMetastoreBasedLockProvider`, so simply 
remove it here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-10 Thread via GitHub


hudi-bot commented on PR #9617:
URL: https://github.com/apache/hudi/pull/9617#issuecomment-1754573890

   
   ## CI report:
   
   * ceede4af8af859baa21e5a2178e421bfd84c7ead Azure: 
[SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=20142)
 
   * 8821b701b0ae25d6bcf17dd95f36be9cc8de084b UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-10 Thread via GitHub


boneanxs commented on code in PR #9617:
URL: https://github.com/apache/hudi/pull/9617#discussion_r1351289121


##
packaging/hudi-kafka-connect-bundle/pom.xml:
##
@@ -107,6 +107,7 @@
 com.lmax:disruptor
 
com.github.davidmoten:guava-mini
 
com.github.davidmoten:hilbert-curve
+
com.github.ben-manes.caffeine:caffeine

Review Comment:
   I think this is actually a missing dep in hudi-kafka-connect-bundle, all 
other bundles include this. If hudi-kafka-connect-bundle doesn't include this, 
it could meet ClassNotFound issue before this pr when access 
`InternalSchemaCache`(This cache already uses caffeine cache, and is in 
hudi-common).



##
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/HoodieFlinkTableServiceClient.java:
##
@@ -137,6 +137,7 @@ protected void completeClustering(
 
   LOG.info("Committing Clustering {} finished with result {}.", 
clusteringCommitTime, metadata);
   table.getActiveTimeline().transitionReplaceInflightToComplete(
+  false,

Review Comment:
   Here already guarantee the lock, this operation(inflight->complete) is 
wrapped by `this.txnManager.beginTransaction` and 
`this.txnManager.endTransaction`



##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java:
##
@@ -198,8 +204,23 @@ private void readObject(java.io.ObjectInputStream in) 
throws IOException, ClassN
 in.defaultReadObject();
   }
 
+  /**
+   * Create a complete instant and save to storage with a completion time.
+   * @param shouldLock whether the lock should be enabled.
+   * @param instant the complete instant.
+   */
+  public void createCompleteInstant(boolean shouldLock, HoodieInstant instant) 
{

Review Comment:
   `InProcessLockProvider` is not accessible in `hudi-common`, so we have to 
set shouldLock to false for tests in hudi-common. After moving 
`InProcessLockProvider` to `hudi-common`, we don't need to pass this flag, 
shouldLock is always true.



##
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java:
##
@@ -187,7 +188,11 @@ public static String maskWithoutFileId(String instantTime, 
int taskPartitionId)
   }
 
   public static String getCommitFromCommitFile(String commitFileName) {
-return commitFileName.split("\\.")[0];
+try {
+  return HoodieInstant.extractTimestamp(commitFileName);
+} catch (IllegalArgumentException e) {
+  return "";

Review Comment:
   Since callers doesn't pass fileName correctly, they could pass non-commit 
file, will change them



##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java:
##
@@ -573,54 +624,85 @@ public HoodieInstant 
transitionReplaceRequestedToInflight(HoodieInstant requeste
 ValidationUtils.checkArgument(requestedInstant.isRequested());
 HoodieInstant inflightInstant = new HoodieInstant(State.INFLIGHT, 
REPLACE_COMMIT_ACTION, requestedInstant.getTimestamp());
 // Then write to timeline
-transitionState(requestedInstant, inflightInstant, data);
+transitionPendingState(requestedInstant, inflightInstant, data);
 return inflightInstant;
   }
 
   /**
* Transition replace inflight to Committed.
*
+   * @param shouldLock Whether to hold the lock when performing transition
* @param inflightInstant Inflight instant
* @param data Extra Metadata
* @return commit instant
*/
-  public HoodieInstant transitionReplaceInflightToComplete(HoodieInstant 
inflightInstant, Option data) {
+  public HoodieInstant transitionReplaceInflightToComplete(boolean shouldLock,
+   HoodieInstant 
inflightInstant, Option data) {
 
ValidationUtils.checkArgument(inflightInstant.getAction().equals(HoodieTimeline.REPLACE_COMMIT_ACTION));
 ValidationUtils.checkArgument(inflightInstant.isInflight());
 HoodieInstant commitInstant = new HoodieInstant(State.COMPLETED, 
REPLACE_COMMIT_ACTION, inflightInstant.getTimestamp());
 // Then write to timeline
-transitionState(inflightInstant, commitInstant, data);
+transitionStateToComplete(shouldLock, inflightInstant, commitInstant, 
data);
 return commitInstant;
   }
 
-  private void transitionState(HoodieInstant fromInstant, HoodieInstant 
toInstant, Option data) {
-transitionState(fromInstant, toInstant, data, false);
+  private void transitionPendingState(HoodieInstant fromInstant, HoodieInstant 
toInstant, Option data) {
+transitionPendingState(fromInstant, toInstant, data, false);
+  }
+
+  protected void transitionStateToComplete(boolean shouldLock, HoodieInstant 
fromInstant,
+   HoodieInstant toInstant, 
Option data) {
+

Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-02 Thread via GitHub


danny0405 commented on code in PR #9617:
URL: https://github.com/apache/hudi/pull/9617#discussion_r1343399866


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/TimeGeneratorBase.java:
##
@@ -0,0 +1,147 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.table.timeline;
+
+import org.apache.hudi.common.config.HoodieTimeGeneratorConfig;
+import org.apache.hudi.common.config.LockConfiguration;
+import org.apache.hudi.common.config.SerializableConfiguration;
+import org.apache.hudi.common.lock.LockProvider;
+import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.exception.HoodieLockException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.Serializable;
+import java.util.concurrent.TimeUnit;
+
+import static 
org.apache.hudi.common.config.LockConfiguration.DEFAULT_LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS;
+import static 
org.apache.hudi.common.config.LockConfiguration.DEFAULT_LOCK_ACQUIRE_NUM_RETRIES;
+import static 
org.apache.hudi.common.config.LockConfiguration.DEFAULT_LOCK_ACQUIRE_WAIT_TIMEOUT_MS;
+import static 
org.apache.hudi.common.config.LockConfiguration.LOCK_ACQUIRE_CLIENT_NUM_RETRIES_PROP_KEY;
+import static 
org.apache.hudi.common.config.LockConfiguration.LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS_PROP_KEY;
+import static 
org.apache.hudi.common.config.LockConfiguration.LOCK_ACQUIRE_WAIT_TIMEOUT_MS_PROP_KEY;
+
+/**
+ * Base time generator facility that maintains lock-related utilities.
+ */
+public abstract class TimeGeneratorBase implements TimeGenerator, Serializable 
{
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(TimeGeneratorBase.class);
+
+  /**
+   * The lock provider.
+   */
+  private volatile LockProvider lockProvider;
+  /**
+   * The maximum times to retry in case there are failures.
+   */
+  private final int maxRetries;
+  /**
+   * The maximum time to wait for each time generation to resolve the clock 
skew issue on distributed hosts.
+   */
+  private final long maxWaitTimeInMs;
+  /**
+   * The maximum time to block for acquiring a lock.
+   */
+  private final int lockAcquireWaitTimeInMs;
+
+  protected final HoodieTimeGeneratorConfig config;
+  private final LockConfiguration lockConfiguration;
+
+  /**
+   * The hadoop configuration.
+   */
+  private final SerializableConfiguration hadoopConf;
+
+  public TimeGeneratorBase(HoodieTimeGeneratorConfig config, 
SerializableConfiguration hadoopConf) {
+this.config = config;
+this.lockConfiguration = config.getLockConfiguration();
+this.hadoopConf = hadoopConf;
+
+maxRetries = 
lockConfiguration.getConfig().getInteger(LOCK_ACQUIRE_CLIENT_NUM_RETRIES_PROP_KEY,
+Integer.parseInt(DEFAULT_LOCK_ACQUIRE_NUM_RETRIES));
+lockAcquireWaitTimeInMs = 
lockConfiguration.getConfig().getInteger(LOCK_ACQUIRE_WAIT_TIMEOUT_MS_PROP_KEY,
+Integer.parseInt(DEFAULT_LOCK_ACQUIRE_WAIT_TIMEOUT_MS));
+maxWaitTimeInMs = 
lockConfiguration.getConfig().getLong(LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS_PROP_KEY,
+Long.parseLong(DEFAULT_LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS));
+  }
+
+  protected LockProvider getLockProvider() {
+// Perform lazy initialization of lock provider only if needed
+if (lockProvider == null) {
+  synchronized (this) {
+if (lockProvider == null) {
+  String lockProviderClass = 
lockConfiguration.getConfig().getString("hoodie.write.lock.provider");
+  LOG.info("LockProvider for TimeGenerator: " + lockProviderClass);
+  lockProvider = (LockProvider) 
ReflectionUtils.loadClass(lockProviderClass,
+  lockConfiguration, hadoopConf.get());
+}
+  }
+}
+return lockProvider;
+  }
+
+  public void lock() {

Review Comment:
   Yeah, I think we can have a refactoring to the failure retries code snippet, 
maybe we can abstrac it out into a utilitiy method.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this 

Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-02 Thread via GitHub


danny0405 commented on code in PR #9617:
URL: https://github.com/apache/hudi/pull/9617#discussion_r1343399866


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/TimeGeneratorBase.java:
##
@@ -0,0 +1,147 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.table.timeline;
+
+import org.apache.hudi.common.config.HoodieTimeGeneratorConfig;
+import org.apache.hudi.common.config.LockConfiguration;
+import org.apache.hudi.common.config.SerializableConfiguration;
+import org.apache.hudi.common.lock.LockProvider;
+import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.exception.HoodieLockException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.Serializable;
+import java.util.concurrent.TimeUnit;
+
+import static 
org.apache.hudi.common.config.LockConfiguration.DEFAULT_LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS;
+import static 
org.apache.hudi.common.config.LockConfiguration.DEFAULT_LOCK_ACQUIRE_NUM_RETRIES;
+import static 
org.apache.hudi.common.config.LockConfiguration.DEFAULT_LOCK_ACQUIRE_WAIT_TIMEOUT_MS;
+import static 
org.apache.hudi.common.config.LockConfiguration.LOCK_ACQUIRE_CLIENT_NUM_RETRIES_PROP_KEY;
+import static 
org.apache.hudi.common.config.LockConfiguration.LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS_PROP_KEY;
+import static 
org.apache.hudi.common.config.LockConfiguration.LOCK_ACQUIRE_WAIT_TIMEOUT_MS_PROP_KEY;
+
+/**
+ * Base time generator facility that maintains lock-related utilities.
+ */
+public abstract class TimeGeneratorBase implements TimeGenerator, Serializable 
{
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(TimeGeneratorBase.class);
+
+  /**
+   * The lock provider.
+   */
+  private volatile LockProvider lockProvider;
+  /**
+   * The maximum times to retry in case there are failures.
+   */
+  private final int maxRetries;
+  /**
+   * The maximum time to wait for each time generation to resolve the clock 
skew issue on distributed hosts.
+   */
+  private final long maxWaitTimeInMs;
+  /**
+   * The maximum time to block for acquiring a lock.
+   */
+  private final int lockAcquireWaitTimeInMs;
+
+  protected final HoodieTimeGeneratorConfig config;
+  private final LockConfiguration lockConfiguration;
+
+  /**
+   * The hadoop configuration.
+   */
+  private final SerializableConfiguration hadoopConf;
+
+  public TimeGeneratorBase(HoodieTimeGeneratorConfig config, 
SerializableConfiguration hadoopConf) {
+this.config = config;
+this.lockConfiguration = config.getLockConfiguration();
+this.hadoopConf = hadoopConf;
+
+maxRetries = 
lockConfiguration.getConfig().getInteger(LOCK_ACQUIRE_CLIENT_NUM_RETRIES_PROP_KEY,
+Integer.parseInt(DEFAULT_LOCK_ACQUIRE_NUM_RETRIES));
+lockAcquireWaitTimeInMs = 
lockConfiguration.getConfig().getInteger(LOCK_ACQUIRE_WAIT_TIMEOUT_MS_PROP_KEY,
+Integer.parseInt(DEFAULT_LOCK_ACQUIRE_WAIT_TIMEOUT_MS));
+maxWaitTimeInMs = 
lockConfiguration.getConfig().getLong(LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS_PROP_KEY,
+Long.parseLong(DEFAULT_LOCK_ACQUIRE_CLIENT_RETRY_WAIT_TIME_IN_MILLIS));
+  }
+
+  protected LockProvider getLockProvider() {
+// Perform lazy initialization of lock provider only if needed
+if (lockProvider == null) {
+  synchronized (this) {
+if (lockProvider == null) {
+  String lockProviderClass = 
lockConfiguration.getConfig().getString("hoodie.write.lock.provider");
+  LOG.info("LockProvider for TimeGenerator: " + lockProviderClass);
+  lockProvider = (LockProvider) 
ReflectionUtils.loadClass(lockProviderClass,
+  lockConfiguration, hadoopConf.get());
+}
+  }
+}
+return lockProvider;
+  }
+
+  public void lock() {

Review Comment:
   Yeah, I think we can have a refactoring to the failure retries code snippet, 
maybe we can abstract it out into a utilitiy method.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this 

Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-02 Thread via GitHub


danny0405 commented on code in PR #9617:
URL: https://github.com/apache/hudi/pull/9617#discussion_r1343397772


##
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##
@@ -370,6 +377,28 @@ public synchronized HoodieActiveTimeline 
reloadActiveTimeline() {
 return activeTimeline;
   }
 
+  /**
+   * Returns next instant time in the correct format. Lock is enabled by 
default.
+   */
+  public String createNewInstantTime() {
+return createNewInstantTime(true);
+  }
+
+  /**
+   * Returns next instant time in the correct format.
+   *
+   * @param shouldLock whether the lock should be enabled to get the instant 
time.
+   */
+  public String createNewInstantTime(boolean shouldLock) {

Review Comment:
   We've been thinking through this for a while and we want to avoid the 
dead-lock, the lock providers other than in-process is kind of not reentrant 
now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-02 Thread via GitHub


danny0405 commented on code in PR #9617:
URL: https://github.com/apache/hudi/pull/9617#discussion_r1343395847


##
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java:
##
@@ -187,7 +188,11 @@ public static String maskWithoutFileId(String instantTime, 
int taskPartitionId)
   }
 
   public static String getCommitFromCommitFile(String commitFileName) {
-return commitFileName.split("\\.")[0];
+try {
+  return HoodieInstant.extractTimestamp(commitFileName);
+} catch (IllegalArgumentException e) {
+  return "";

Review Comment:
   +1 to throw exception.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-02 Thread via GitHub


danny0405 commented on code in PR #9617:
URL: https://github.com/apache/hudi/pull/9617#discussion_r1343376582


##
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieTimeGeneratorConfig.java:
##
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.config;
+
+import org.apache.hudi.common.table.timeline.TimeGeneratorType;
+import org.apache.hudi.common.util.Option;
+
+import java.util.Properties;
+
+import static org.apache.hudi.common.config.LockConfiguration.LOCK_PREFIX;
+
+/**
+ * Configuration for Hoodie time generation.
+ */
+public class HoodieTimeGeneratorConfig extends HoodieConfig {
+
+  public static final String BASE_PATH = "hoodie.base.path";

Review Comment:
   +1



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-02 Thread via GitHub


danny0405 commented on code in PR #9617:
URL: https://github.com/apache/hudi/pull/9617#discussion_r1343376222


##
packaging/hudi-kafka-connect-bundle/pom.xml:
##
@@ -107,6 +107,7 @@
 com.lmax:disruptor
 
com.github.davidmoten:guava-mini
 
com.github.davidmoten:hilbert-curve
+
com.github.ben-manes.caffeine:caffeine

Review Comment:
   +1, we can use a weak reference hash map now and we should add a close hook 
for the instances.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-02 Thread via GitHub


danny0405 commented on code in PR #9617:
URL: https://github.com/apache/hudi/pull/9617#discussion_r1343361124


##
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/HoodieFlinkTableServiceClient.java:
##
@@ -137,6 +137,7 @@ protected void completeClustering(
 
   LOG.info("Committing Clustering {} finished with result {}.", 
clusteringCommitTime, metadata);
   table.getActiveTimeline().transitionReplaceInflightToComplete(
+  false,

Review Comment:
   I guess it's because the file slicing is not affected by the replace 
operation, there is no need to keep atomicity.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-02 Thread via GitHub


danny0405 commented on code in PR #9617:
URL: https://github.com/apache/hudi/pull/9617#discussion_r1343359738


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/BaseRollbackActionExecutor.java:
##
@@ -261,7 +261,10 @@ protected void finishRollback(HoodieInstant 
inflightInstant, HoodieRollbackMetad
   // If publish the rollback to the timeline, we finally transition the 
inflight rollback
   // to complete in the data table timeline
   if (!skipTimelinePublish) {
-
table.getActiveTimeline().transitionRollbackInflightToComplete(inflightInstant,
+// NOTE: no need to lock here, since !skipTimelinePublish is always 
true,
+// when skipLocking is false, txnManager above-mentioned should lock 
it.
+// when skipLocking is true, the caller should have already held the 
lock.

Review Comment:
   re-entrancy is good but you still need to keep track of the holdness of the 
lock explicitly, we have a `LockManager` in the `TxnManager`, but the 
`LockManager` is only available in the write client, and it is not designed as 
a singleton now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-02 Thread via GitHub


danny0405 commented on code in PR #9617:
URL: https://github.com/apache/hudi/pull/9617#discussion_r1343355515


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##
@@ -964,7 +976,7 @@ public void update(HoodieRestoreMetadata restoreMetadata, 
String instantTime) {
   // We cannot create a deltaCommit at instantTime now because a future 
(rollback) block has already been written to the logFiles.
   // We need to choose a timestamp which would be a validInstantTime for 
MDT. This is either a commit timestamp completed on the dataset
   // or a timestamp with suffix which we use for MDT clean, compaction etc.
-  String syncCommitTime = 
HoodieTableMetadataUtil.createRestoreTimestamp(HoodieActiveTimeline.createNewInstantTime());
+  String syncCommitTime = 
HoodieTableMetadataUtil.createRestoreTimestamp(writeClient.createNewInstantTime(false));

Review Comment:
   Currently, only 2 use cases need a explicit lock:
   
   1. create new instant time
   2. transform pending metadata file to complete state
   
   And if the caller already holds a lock, we must not use a lock to avoid 
dead-lock.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [HUDI-1623] Solid completion time on timeline [hudi]

2023-10-02 Thread via GitHub


vinothchandar commented on code in PR #9617:
URL: https://github.com/apache/hudi/pull/9617#discussion_r1342547848


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/BaseRollbackActionExecutor.java:
##
@@ -261,7 +261,10 @@ protected void finishRollback(HoodieInstant 
inflightInstant, HoodieRollbackMetad
   // If publish the rollback to the timeline, we finally transition the 
inflight rollback
   // to complete in the data table timeline
   if (!skipTimelinePublish) {
-
table.getActiveTimeline().transitionRollbackInflightToComplete(inflightInstant,
+// NOTE: no need to lock here, since !skipTimelinePublish is always 
true,
+// when skipLocking is false, txnManager above-mentioned should lock 
it.
+// when skipLocking is true, the caller should have already held the 
lock.

Review Comment:
   esp stuff like this, it's better to have re-entrancy here as opposed to 
avoiding locks, so we don't introduce code deps on locking, which is hard to 
debug.



##
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/HoodieFlinkTableServiceClient.java:
##
@@ -137,6 +137,7 @@ protected void completeClustering(
 
   LOG.info("Committing Clustering {} finished with result {}.", 
clusteringCommitTime, metadata);
   table.getActiveTimeline().transitionReplaceInflightToComplete(
+  false,

Review Comment:
   why do we skip locking here completely.



##
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieTimeGeneratorConfig.java:
##
@@ -0,0 +1,128 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.config;
+
+import org.apache.hudi.common.table.timeline.TimeGeneratorType;
+import org.apache.hudi.common.util.Option;
+
+import java.util.Properties;
+
+import static org.apache.hudi.common.config.LockConfiguration.LOCK_PREFIX;
+
+/**
+ * Configuration for Hoodie time generation.
+ */
+public class HoodieTimeGeneratorConfig extends HoodieConfig {
+
+  public static final String BASE_PATH = "hoodie.base.path";
+  private static final String LOCK_PROVIDER_KEY = LOCK_PREFIX + "provider";
+  private static final String DEFAULT_LOCK_PROVIDER = 
"org.apache.hudi.client.transaction.lock.InProcessLockProvider";
+
+  public static final ConfigProperty TIME_GENERATOR_TYPE = 
ConfigProperty
+  .key("hoodie.time.generator.type")
+  .defaultValue(TimeGeneratorType.LOCK_PROVIDER.name())
+  .withValidValues(TimeGeneratorType.LOCK_PROVIDER.name())
+  .sinceVersion("1.0.0")
+  .markAdvanced()
+  .withDocumentation("Time generator type, which is used to generate 
globally monotonically increasing timestamp");
+
+  public static final ConfigProperty MAX_EXPECTED_CLOCK_SKEW_MS = 
ConfigProperty
+  .key("hoodie.time.generator.max_expected_clock_skew_ms")
+  .defaultValue(20L)

Review Comment:
   I'd say something like 200ms is a safer default.



##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/TimeGeneratorType.java:
##
@@ -0,0 +1,34 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.table.timeline;
+
+import org.apache.hudi.common.config.EnumDescription;
+import org.apache.hudi.common.config.EnumFieldDescription;
+
+/**
+ * Types of {@link TimeGenerator}.
+ */
+@EnumDescription("Time generator type, indicating the time generator