Jason-liujc opened a new issue, #9512:
URL: https://github.com/apache/hudi/issues/9512

   
   **Describe the problem you faced**
   
   We have a use case where multiple EMR clusters write to the same Hudi table, 
so we need a table-level lock. We've tried the two options provided on the Hudi 
OCC guide page.
   
   With the `single_writer` option, we don't see any lock entries created in 
DynamoDB, so no write locks are taken and the EMR Hudi upsert jobs fail because 
multiple writers write to the same table.
   
   With `optimistic_concurrency_control`, Hudi creates and deletes many lock 
entries in DynamoDB, but the lock is not at the table level. When two jobs 
write to the same file, one of the jobs fails. This forces us to add a lot of 
retries to our clusters, which is not ideal.
   
   Is there any way to configure a table-level lock instead?
   
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Run Hudi upsert jobs in multiple AWS EMR clusters writing to the same 
table at the same time.
   2. Use DynamoDB as the lock provider; the jobs write to the same partitions.
   3. Use either `single_writer` + `hoodie.cleaner.policy.failed.writes=EAGER` 
or `optimistic_concurrency_control` + `hoodie.cleaner.policy.failed.writes=LAZY` 
in the Hudi write options.
   4. Observe that the jobs fail, with different errors each time.
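   For reference, this is roughly the shape of the multi-writer options we pass 
from PySpark (Hudi 0.13 DynamoDB lock provider); the lock table, partition key, 
and region values below are placeholders, not our real ones:

   ```python
   # Hudi 0.13 multi-writer (OCC) options with the DynamoDB lock provider.
   # The dynamodb.* values are placeholders for illustration.
   hudi_options = {
       "hoodie.write.concurrency.mode": "optimistic_concurrency_control",
       "hoodie.cleaner.policy.failed.writes": "LAZY",
       "hoodie.write.lock.provider":
           "org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider",
       "hoodie.write.lock.dynamodb.table": "hudi-lock-table",        # placeholder
       "hoodie.write.lock.dynamodb.partition_key": "my_hudi_table",  # placeholder
       "hoodie.write.lock.dynamodb.region": "us-east-1",             # placeholder
   }

   # df.write.format("hudi").options(**hudi_options).mode("append").save(path)
   ```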
   
   **Expected behavior**
   
   Hudi would provide a table-level lock that allows us to block other writers 
from writing until the current writer finishes, when they write to the same 
partition.
   
   **Environment Description**
   
   * Hudi version : 0.13.0 (EMR 6.11)
   
   * Spark version : 3.3.1
   
   * Hive version : 3.1.3
   
   * Hadoop version : 3.3.3
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : No
   
   
   **Additional context**
   
   We are considering building something of our own to enforce sequential 
writes at orchestration time, but that could be avoided if Hudi provided a 
table-level lock.
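   As a rough illustration of the kind of workaround we mean (not anything Hudi 
provides): a coarse table-level mutex built on a DynamoDB conditional put, 
where the lock table, key names, and TTL are all our own assumptions:

   ```python
   # Hypothetical external table-level lock via DynamoDB conditional writes.
   # "lock_table", "lock_key", and the TTL are assumed names, not Hudi configs.
   import time


   class TableLock:
       """Hold one lock item per Hudi table; acquire blocks until it is free."""

       def __init__(self, client, lock_table, table_name, ttl_seconds=3600):
           self.client = client          # a boto3 DynamoDB client (or a stub)
           self.lock_table = lock_table
           self.table_name = table_name
           self.ttl_seconds = ttl_seconds

       def acquire(self, timeout=600, poll=5):
           deadline = time.time() + timeout
           while time.time() < deadline:
               try:
                   self.client.put_item(
                       TableName=self.lock_table,
                       Item={
                           "lock_key": {"S": self.table_name},
                           "expires_at": {"N": str(int(time.time()) + self.ttl_seconds)},
                       },
                       # Succeeds only if no other writer holds the lock item.
                       ConditionExpression="attribute_not_exists(lock_key)",
                   )
                   return True
               except self.client.exceptions.ConditionalCheckFailedException:
                   time.sleep(poll)  # another writer holds the lock; retry
           return False

       def release(self):
           self.client.delete_item(
               TableName=self.lock_table,
               Key={"lock_key": {"S": self.table_name}},
           )
   ```

   The orchestrator would wrap each cluster's upsert in `acquire`/`release`, so 
only one writer touches the table at a time; a real version would also need to 
expire stale locks if a holder dies.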
   
   **Stacktrace**
   
   Single writer:
   
   ```
   
   23/08/21 23:19:47 ERROR Client: Application diagnostics message: User class 
threw exception: org.apache.hudi.exception.HoodieRollbackException: Failed to 
rollback s3://xxxxbucket/xxxkey commits 20230821231625673
        at 
org.apache.hudi.client.BaseHoodieTableServiceClient.rollback(BaseHoodieTableServiceClient.java:823)
        at 
org.apache.hudi.client.BaseHoodieTableServiceClient.rollbackFailedWrites(BaseHoodieTableServiceClient.java:727)
        at 
org.apache.hudi.client.BaseHoodieTableServiceClient.rollbackFailedWrites(BaseHoodieTableServiceClient.java:711)
        at 
org.apache.hudi.client.BaseHoodieTableServiceClient.rollbackFailedWrites(BaseHoodieTableServiceClient.java:706)
        at 
org.apache.hudi.client.BaseHoodieWriteClient.lambda$startCommitWithTime$97cdbdca$1(BaseHoodieWriteClient.java:843)
   ```
   
   
   Multi writer with `optimistic_concurrency_control`:
   
   ```
   
   23/08/21 21:40:31 ERROR Client: Application diagnostics message: User class 
threw exception: org.apache.hudi.exception.HoodieWriteConflictException: 
java.util.ConcurrentModificationException: Cannot resolve conflicts for 
overlapping writes
        at 
org.apache.hudi.client.transaction.SimpleConcurrentFileWritesConflictResolutionStrategy.resolveConflict(SimpleConcurrentFileWritesConflictResolutionStrategy.java:108)
        at 
org.apache.hudi.client.utils.TransactionUtils.lambda$resolveWriteConflictIfAny$0(TransactionUtils.java:85)
        at 
java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
        at 
java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
        at 
java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
   ```
   

