[ https://issues.apache.org/jira/browse/HUDI-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liwei updated HUDI-944:
-----------------------
    Description: 
Hudi currently supports concurrency control only between a writer and
compaction. Some scenarios also need concurrency control between multiple
writers, for example two Spark jobs with different data sources that need to
write to the same Hudi table.

I have a two-step proposal:

1. First step: support concurrent writes to different partitions.
Currently, when two clients write data to different partitions, they hit
these errors:

a. Rolling back commits failed

b. The instant version already exists
[2020-05-25 21:20:34,732] INFO Checking for file exists ?/tmp/HudiDLATestPartition/.hoodie/20200525212031.clean.inflight (org.apache.hudi.common.table.timeline.HoodieActiveTimeline)
Exception in thread "main" org.apache.hudi.exception.HoodieIOException: Failed to create file /tmp/HudiDLATestPartition/.hoodie/20200525212031.clean
 at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.createImmutableFileInPath(HoodieActiveTimeline.java:437)
 at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:327)
 at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionCleanInflightToComplete(HoodieActiveTimeline.java:290)
 at org.apache.hudi.client.HoodieCleanClient.runClean(HoodieCleanClient.java:183)
 at org.apache.hudi.client.HoodieCleanClient.runClean(HoodieCleanClient.java:142)
 at org.apache.hudi.client.HoodieCleanClient.lambda$clean$0(HoodieCleanClient.java:88)
 at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)

c. Archiving by the two clients conflicts

d. The read client fails with "Unable to infer schema for Parquet. It must be
specified manually.;"
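
Error b can be pictured in miniature without Hudi. The sketch below is an
assumption about the failure mode, not Hudi's code: two clients try to
complete the same clean instant by creating the same immutable file, and the
second create fails because the file already exists. The class
InstantCollisionSketch and its methods are hypothetical names, and a local
temporary directory stands in for the .hoodie folder.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; not part of Hudi.
final class InstantCollisionSketch {

    // Tries to publish "<instant>.clean" as an immutable file.
    // Returns true if this caller created it, false if it already existed.
    static boolean completeInstant(Path timelineDir, String instant) {
        try {
            // Files.createFile checks for existence and creates atomically.
            Files.createFile(timelineDir.resolve(instant + ".clean"));
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Simulates two clients completing the same instant one after the other.
    // Element 0 is the first client's result, element 1 the second's.
    static boolean[] simulateTwoClients(String instant) {
        try {
            Path timelineDir = Files.createTempDirectory("hoodie-sketch");
            boolean first = completeInstant(timelineDir, instant);
            boolean second = completeInstant(timelineDir, instant);
            return new boolean[] {first, second};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the first client succeeds and the second is rejected, which is
the same shape as the "Failed to create file" exception in the trace above.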

2. Second step: support concurrency control for insert, upsert, and
compaction at different isolation levels, such as Serializable and
WriteSerializable.

Hudi could add a mechanism to check for conflicts in
AbstractHoodieWriteClient.commit().
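
One way to picture a conflict check at commit time is an optimistic scheme:
before committing, a writer compares the partitions it wrote with the
partitions written by commits that completed after it started. Everything in
this sketch (the class CommitConflictSketch, its methods, and the
partition-level granularity) is an assumption made for illustration; it is
not Hudi's API and not necessarily the design this issue will adopt.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of optimistic conflict detection; not Hudi code.
final class CommitConflictSketch {

    // A completed commit: its sequence number and the partitions it wrote.
    private static final class Completed {
        final long seq;
        final Set<String> partitions;

        Completed(long seq, Set<String> partitions) {
            this.seq = seq;
            this.partitions = partitions;
        }
    }

    private final List<Completed> timeline = new ArrayList<>();
    private long lastSeq = 0;

    // A writer records the latest completed sequence number when it starts.
    synchronized long begin() {
        return lastSeq;
    }

    // Commits only if no commit completed after startSeq wrote to any of the
    // same partitions. Returns true on success, false on a detected conflict.
    synchronized boolean tryCommit(long startSeq, Set<String> writtenPartitions) {
        for (Completed c : timeline) {
            if (c.seq > startSeq) {
                for (String p : writtenPartitions) {
                    if (c.partitions.contains(p)) {
                        return false; // overlap: the caller can retry or abort
                    }
                }
            }
        }
        lastSeq++;
        timeline.add(new Completed(lastSeq, new HashSet<>(writtenPartitions)));
        return true;
    }
}
```

With this check, two writers that touch disjoint partitions both commit,
while a writer that overlaps a commit completed after it began is rejected.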

 

> Hudi support more complete  concurrency control when write data
> ---------------------------------------------------------------
>
>                 Key: HUDI-944
>                 URL: https://issues.apache.org/jira/browse/HUDI-944
>             Project: Apache Hudi
>          Issue Type: New Feature
>            Reporter: liwei
>            Assignee: liwei
>            Priority: Major
>             Fix For: 0.6.1
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)