hudi-bot opened a new issue, #15163:
URL: https://github.com/apache/hudi/issues/15163
When I enable concurrency control in the Hudi Java writer, commits that
happen at the same time appear to collide.
The exception:
```
org.apache.hudi.exception.HoodieIOException: Failed to create file file:/tmp/integration/hudi/.hoodie/20220517094051766.commit
    at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.createImmutableFileInPath(HoodieActiveTimeline.java:745) ~[hudi-common-0.11.0.jar:0.11.0]
    at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:560) ~[hudi-common-0.11.0.jar:0.11.0]
    at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:536) ~[hudi-common-0.11.0.jar:0.11.0]
    at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.saveAsComplete(HoodieActiveTimeline.java:183) ~[hudi-common-0.11.0.jar:0.11.0]
    at org.apache.hudi.client.BaseHoodieWriteClient.commit(BaseHoodieWriteClient.java:270) ~[hudi-client-common-0.11.0.jar:0.11.0]
    at org.apache.hudi.client.BaseHoodieWriteClient.commitStats(BaseHoodieWriteClient.java:234) ~[hudi-client-common-0.11.0.jar:0.11.0]
    at org.apache.hudi.client.HoodieJavaWriteClient.commit(HoodieJavaWriteClient.java:88) ~[hudi-java-client-0.11.0.jar:0.11.0]
    at org.apache.hudi.client.HoodieJavaWriteClient.commit(HoodieJavaWriteClient.java:51) ~[hudi-java-client-0.11.0.jar:0.11.0]
    at org.apache.hudi.client.BaseHoodieWriteClient.commit(BaseHoodieWriteClient.java:206) ~[hudi-client-common-0.11.0.jar:0.11.0]
    at org.apache.pulsar.ecosystem.io.sink.hudi.BufferedConnectWriter.flushRecords(BufferedConnectWriter.java:82) ~[PqY5lYEJSWPWMDq7E5HC2Q/:?]
    at org.apache.pulsar.ecosystem.io.sink.hudi.HoodieWriter.flush(HoodieWriter.java:85) ~[PqY5lYEJSWPWMDq7E5HC2Q/:?]
    at org.apache.pulsar.ecosystem.io.sink.SinkWriter.commitIfNeed(SinkWriter.java:128) ~[PqY5lYEJSWPWMDq7E5HC2Q/:?]
    at org.apache.pulsar.ecosystem.io.sink.SinkWriter.run(SinkWriter.java:113) [PqY5lYEJSWPWMDq7E5HC2Q/:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.77.Final.jar:4.1.77.Final]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: File already exists: file:/tmp/integration/hudi/.hoodie/20220517094051766.commit
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:315) ~[hadoop-common-3.2.2.jar:?]
    at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:353) ~[hadoop-common-3.2.2.jar:?]
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:403) ~[hadoop-common-3.2.2.jar:?]
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:466) ~[hadoop-common-3.2.2.jar:?]
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:445) ~[hadoop-common-3.2.2.jar:?]
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1125) ~[hadoop-common-3.2.2.jar:?]
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1105) ~[hadoop-common-3.2.2.jar:?]
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:994) ~[hadoop-common-3.2.2.jar:?]
    at org.apache.hudi.common.fs.HoodieWrapperFileSystem.lambda$create$2(HoodieWrapperFileSystem.java:222) ~[hudi-common-0.11.0.jar:0.11.0]
    at org.apache.hudi.common.fs.HoodieWrapperFileSystem.executeFuncWithTimeMetrics(HoodieWrapperFileSystem.java:101) ~[hudi-common-0.11.0.jar:0.11.0]
    at org.apache.hudi.common.fs.HoodieWrapperFileSystem.create(HoodieWrapperFileSystem.java:221) ~[hudi-common-0.11.0.jar:0.11.0]
    at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.createImmutableFileInPath(HoodieActiveTimeline.java:740) ~[hudi-common-0.11.0.jar:0.11.0]
    ... 16 more
```
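The `Caused by` in the trace is a create-without-overwrite call hitting a path that already exists. That failure mode can be reproduced outside Hudi with a minimal sketch, assuming only the JDK's `java.nio.file.Files.createFile`, which likewise refuses to overwrite. The class `ImmutableCreateDemo`, its methods, and the temp directory are illustrative names, not Hudi or Hadoop API, and the sketch does not explain why two commits ended up with the same instant time.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: mimics "create the completed-commit file exactly once"
// with plain JDK calls. Nothing here is Hudi or Hadoop API.
final class ImmutableCreateDemo {

    // Returns true if this caller created the file, false if it already existed.
    static boolean tryCreateOnce(Path file) {
        try {
            Files.createFile(file); // atomic create; fails if the path exists
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Two writers try to publish the same commit file, one after the other.
    static boolean[] twoWriters() {
        try {
            Path dir = Files.createTempDirectory("hoodie-demo");
            Path commit = dir.resolve("20220517094051766.commit");
            boolean first = tryCreateOnce(commit);
            boolean second = tryCreateOnce(commit);
            return new boolean[] {first, second};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = twoWriters();
        System.out.println("first writer created the file:  " + result[0]);
        System.out.println("second writer created the file: " + result[1]);
    }
}
```

The second call fails in the same way as the second committer in the trace: whichever writer reaches the create call last finds the file already present.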
And my hudi writer configuration:
```
"hoodie.table.name": "hudi-connector-test",
"hoodie.table.type": "COPY_ON_WRITE",
"hoodie.base.path": "file:///tmp/integration/hudi",
"hoodie.clean.async": "true",
"hoodie.write.concurrency.mode": "optimistic_concurrency_control",
"hoodie.cleaner.policy.failed.writes": "LAZY",
"hoodie.write.lock.provider": "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider",
"hoodie.write.lock.zookeeper.url": "localhost",
"hoodie.write.lock.zookeeper.port": "2181",
"hoodie.write.lock.zookeeper.lock_key": "pulsar_hudi",
"hoodie.write.lock.zookeeper.base_path": "/hudi",
"hoodie.datasource.write.recordkey.field": "id",
"hoodie.datasource.write.partitionpath.field": "id",
```
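For anyone trying to reproduce this, the same settings can be held in a plain `java.util.Properties` object. This is only a sketch: the issue does not show how the connector hands these values to the write client, so the example stops at building the properties, and `WriterConfigSketch` is a made-up class name.

```java
import java.util.Properties;

// Illustrative only: collects the keys and values quoted in the issue.
// How they reach the Hudi write client is not shown here.
final class WriterConfigSketch {

    static Properties writerProps() {
        Properties p = new Properties();
        p.setProperty("hoodie.table.name", "hudi-connector-test");
        p.setProperty("hoodie.table.type", "COPY_ON_WRITE");
        p.setProperty("hoodie.base.path", "file:///tmp/integration/hudi");
        p.setProperty("hoodie.clean.async", "true");
        p.setProperty("hoodie.write.concurrency.mode", "optimistic_concurrency_control");
        p.setProperty("hoodie.cleaner.policy.failed.writes", "LAZY");
        p.setProperty("hoodie.write.lock.provider",
                "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider");
        p.setProperty("hoodie.write.lock.zookeeper.url", "localhost");
        p.setProperty("hoodie.write.lock.zookeeper.port", "2181");
        p.setProperty("hoodie.write.lock.zookeeper.lock_key", "pulsar_hudi");
        p.setProperty("hoodie.write.lock.zookeeper.base_path", "/hudi");
        p.setProperty("hoodie.datasource.write.recordkey.field", "id");
        p.setProperty("hoodie.datasource.write.partitionpath.field", "id");
        return p;
    }

    public static void main(String[] args) {
        // Print each key=value pair so the effective settings can be checked.
        writerProps().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```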
## JIRA info
- Link: https://issues.apache.org/jira/browse/HUDI-4121
- Type: Bug
- Epic: https://issues.apache.org/jira/browse/HUDI-5197
---
## Comments
19/May/22 01:36;edward.yong;I found that SparkRDDWriteClient implements
preCommit
[https://github.com/apache/hudi/blob/551aa959c57721a5cc4d3f63f79e0201978980a2/hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java#L471],
but the Java client does not do anything similar.
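Assuming the pre-commit hook is where optimistic concurrency control checks for conflicting writers, the missing step can be pictured as: take the lock, look for commits that completed after this writer started, and publish only if none of them touched the same files. Below is a minimal in-memory sketch of that idea in plain Java. `OccCommitSketch`, `begin`, `tryCommit`, and the file-ID sets are hypothetical names for illustration; this is not Hudi's `preCommit` implementation, and the `ReentrantLock` merely stands in for an external lock provider.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only: an in-memory model of "check for conflicts under a lock,
// then publish". None of these names exist in Hudi.
final class OccCommitSketch {

    // Each entry is the set of file IDs touched by one completed commit.
    private final List<Set<String>> completed = new ArrayList<>();
    // Stands in for an external lock provider such as a ZooKeeper-based one.
    private final ReentrantLock lock = new ReentrantLock();

    // A writer records how many commits were complete when it started.
    int begin() {
        lock.lock();
        try {
            return completed.size();
        } finally {
            lock.unlock();
        }
    }

    // Returns true if the commit was published, false if a commit that
    // completed after this writer started touched an overlapping file.
    boolean tryCommit(int seenAtStart, Set<String> touchedFiles) {
        lock.lock();
        try {
            for (int i = seenAtStart; i < completed.size(); i++) {
                if (!Collections.disjoint(completed.get(i), touchedFiles)) {
                    return false; // conflict: caller should abort or retry
                }
            }
            completed.add(new HashSet<>(touchedFiles));
            return true;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        OccCommitSketch table = new OccCommitSketch();
        int writerA = table.begin();
        int writerB = table.begin();
        System.out.println("A commits: " + table.tryCommit(writerA, Collections.singleton("file-1")));
        System.out.println("B commits: " + table.tryCommit(writerB, Collections.singleton("file-1")));
    }
}
```

In this model the second writer is rejected before it publishes anything, rather than failing while creating the commit file.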