[ https://issues.apache.org/jira/browse/HDDS-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anu Engineer updated HDDS-2026:
-------------------------------
    Attachment: changes.diff

> Overlapping chunk region cannot be read concurrently
> ----------------------------------------------------
>
>                 Key: HDDS-2026
>                 URL: https://issues.apache.org/jira/browse/HDDS-2026
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>            Reporter: Doroszlai, Attila
>            Priority: Critical
>         Attachments: HDDS-2026-repro.patch, changes.diff, first-cut-proposed.diff
>
>
> Concurrent requests to datanode for the same chunk may result in the following exception in datanode:
> {code}
> java.nio.channels.OverlappingFileLockException
>    at java.base/sun.nio.ch.FileLockTable.checkList(FileLockTable.java:229)
>    at java.base/sun.nio.ch.FileLockTable.add(FileLockTable.java:123)
>    at java.base/sun.nio.ch.AsynchronousFileChannelImpl.addToFileLockTable(AsynchronousFileChannelImpl.java:178)
>    at java.base/sun.nio.ch.SimpleAsynchronousFileChannelImpl.implLock(SimpleAsynchronousFileChannelImpl.java:185)
>    at java.base/sun.nio.ch.AsynchronousFileChannelImpl.lock(AsynchronousFileChannelImpl.java:118)
>    at org.apache.hadoop.ozone.container.keyvalue.helpers.ChunkUtils.readData(ChunkUtils.java:175)
>    at org.apache.hadoop.ozone.container.keyvalue.impl.ChunkManagerImpl.readChunk(ChunkManagerImpl.java:213)
>    at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handleReadChunk(KeyValueHandler.java:574)
>    at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:195)
>    at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:271)
>    at org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:148)
>    at org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:73)
>    at org.apache.hadoop.ozone.container.common.transport.server.GrpcXceiverService$1.onNext(GrpcXceiverService.java:61)
> {code}
> It seems this is covered by retry logic, as key read is eventually successful at client side.
> The problem is that:
> bq. File locks are held on behalf of the entire Java virtual machine. They are not suitable for controlling access to a file by multiple threads within the same virtual machine.
> ([source|https://docs.oracle.com/javase/8/docs/api/java/nio/channels/FileLock.html])
> code ref: [{{ChunkUtils.readData}}|https://github.com/apache/hadoop/blob/c92de8209d1c7da9e7ce607abeecb777c4a52c6a/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/ChunkUtils.java#L175]
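The FileLock behaviour quoted in the description can be reproduced outside Ozone. The following is a minimal sketch, not the datanode code and not the content of any attached patch: the class name `OverlappingLockDemo` and the method `secondLockIsRejected` are hypothetical, and it uses the synchronous `FileChannel` rather than the `AsynchronousFileChannel` seen in the stack trace, on the assumption that both consult the same JVM-wide lock table. It takes a shared lock on a file and then requests a second, overlapping shared lock through a different channel in the same JVM.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class OverlappingLockDemo {

  /**
   * Hypothetical helper: returns true if a second overlapping lock request
   * made from the same JVM is rejected while the first lock is still held.
   */
  static boolean secondLockIsRejected(Path file) throws IOException {
    try (FileChannel first = FileChannel.open(file, StandardOpenOption.READ);
         FileChannel second = FileChannel.open(file, StandardOpenOption.READ)) {
      // Shared (read) lock over the whole file, held by this JVM.
      FileLock held = first.lock(0, Long.MAX_VALUE, true);
      try {
        // Same region, same JVM, different channel: the overlap check is
        // per virtual machine, so being a shared lock does not help.
        second.lock(0, Long.MAX_VALUE, true);
        return false;
      } catch (OverlappingFileLockException e) {
        return true;
      } finally {
        held.release();
      }
    }
  }

  public static void main(String[] args) throws IOException {
    Path chunk = Files.createTempFile("chunk", ".data");
    try {
      Files.write(chunk, new byte[] {1, 2, 3, 4});
      System.out.println("second lock rejected: " + secondLockIsRejected(chunk));
    } finally {
      Files.deleteIfExists(chunk);
    }
  }
}
```

In the datanode the two requests come from different handler threads rather than from one method, but the lock table is shared across the JVM either way. Coordinating threads inside one process is normally done with an in-process primitive such as `java.util.concurrent.locks.ReentrantReadWriteLock`; whether that is the right fix here is left to the patches attached to the issue.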