[ 
https://issues.apache.org/jira/browse/HUDI-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443034#comment-17443034
 ] 

Prashant Wason commented on HUDI-2559:
--------------------------------------

Approach 2 means multiple configs need to be maintained for each writer.

Other options:
 # We can also add a small random delay to timestamp creation to make collisions rare.
 # Running multiple processes already assumes multi-writer mode with locking, right? So we 
can use the locking mechanism to generate a unique instant time for us. 
Uniqueness can be checked via the file system by creating a file <timestamp>.instant 
(i.e. without the action name). In the rare case of a collision, only one process 
will be able to create this file, and that process wins. This adds only a single 
create operation, so it should not slow things down much.
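
To make the two options above concrete, here is a minimal sketch rather than Hudi code. It assumes a local directory stands in for the timeline folder and uses java.nio's Files.createFile, which is atomic on a local file system; whether the table's actual storage gives the same create-if-absent guarantee is an assumption that would need checking. The class name InstantClaimSketch, the methods tryClaim and claimUniqueInstant, the maxJitterMs parameter, and the yyyyMMddHHmmss format are all hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative only: not part of Hudi. All names here are hypothetical.
final class InstantClaimSketch {

  // Marker name from the comment above: <timestamp>.instant (no action name).
  private static final String MARKER_SUFFIX = ".instant";

  /**
   * Tries to claim a timestamp by creating its marker file.
   * Returns true only for the single caller whose create succeeds.
   */
  static boolean tryClaim(Path timelineDir, String timestamp) throws IOException {
    try {
      // createFile fails if the file already exists; check-and-create is one atomic step.
      Files.createFile(timelineDir.resolve(timestamp + MARKER_SUFFIX));
      return true;
    } catch (FileAlreadyExistsException e) {
      return false; // another writer claimed this timestamp first
    }
  }

  /**
   * Generates candidate timestamps until one is claimed. After a collision it
   * sleeps for a small random delay (option 1) before trying again.
   * maxJitterMs must be at least 1.
   */
  static String claimUniqueInstant(Path timelineDir, int maxJitterMs)
      throws IOException, InterruptedException {
    // Assumed second-granularity format; the real formatter may differ.
    SimpleDateFormat formatter = new SimpleDateFormat("yyyyMMddHHmmss");
    while (true) {
      String candidate = formatter.format(new Date());
      if (tryClaim(timelineDir, candidate)) {
        return candidate;
      }
      Thread.sleep(ThreadLocalRandom.current().nextInt(1, maxJitterMs + 1));
    }
  }
}
```

Each writer would call claimUniqueInstant(timelineDir, 50) before starting a commit. Two writers that compute the same candidate race on the create; the loser simply retries with a later timestamp instead of failing the commit.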

> Ensure unique timestamps are generated for commit times with concurrent 
> writers
> -------------------------------------------------------------------------------
>
>                 Key: HUDI-2559
>                 URL: https://issues.apache.org/jira/browse/HUDI-2559
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Blocker
>              Labels: pull-request-available, release-blocker
>             Fix For: 0.10.0
>
>
> Ensure unique timestamps are generated for commit times with concurrent 
> writers.
> This is the code in HoodieActiveTimeline that creates a new commit 
> time:
> {code:java}
> public static String createNewInstantTime(long milliseconds) {
>   return lastInstantTime.updateAndGet((oldVal) -> {
>     String newCommitTime;
>     do {
>       newCommitTime = HoodieActiveTimeline.COMMIT_FORMATTER.format(
>           new Date(System.currentTimeMillis() + milliseconds));
>     } while (HoodieTimeline.compareTimestamps(
>         newCommitTime, LESSER_THAN_OR_EQUALS, oldVal));
>     return newCommitTime;
>   });
> }
> {code}
> There is a chance that a deltastreamer and a concurrent Spark datasource 
> writer get the same timestamp, in which case one of them fails. 
> Related GitHub issue and Jira ticket: 
> [https://github.com/apache/hudi/issues/3782]
> https://issues.apache.org/jira/browse/HUDI-2549



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
