[ https://issues.apache.org/jira/browse/HDDS-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16961938#comment-16961938 ]
Marton Elek commented on HDDS-2372:
-----------------------------------

Let's say I am writing chunks. Imagine the following timing.

Flow A
 # The Leader receives the write chunk request
 # The write chunk is written to the disk (WRITE_DATA) and saved to the cache
 # The WriteChunk is sent to Follower1 with the next HB
 # As the WriteChunk has been added to Follower1 and the Leader, it can be committed
 # Write chunk write is called (COMMIT_DATA) and the tmp file is renamed to the final name

Flow B
 # An HB should be sent to Follower2
 # For some reason the cache is empty (too many other requests?), so the write chunk has to be read from the disk
 # A new ReadChunk request is executed by the HddsDispatcher and the chunk data is read (from another thread, it's *async*)
 # The read HB is sent to the leader

As B.3 is an async operation, it's possible that during B.3 the write chunk is committed (A.5) and the chunk can no longer be read from the tmp file.

> Datanode pipeline is failing with NoSuchFileException
> -----------------------------------------------------
>
>                 Key: HDDS-2372
>                 URL: https://issues.apache.org/jira/browse/HDDS-2372
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Marton Elek
>            Priority: Critical
>
> Found it on a k8s based test cluster using a simple 3 node cluster and
> HDDS-2327 freon test. After a while the StateMachine becomes unhealthy after
> this error:
> {code:java}
> datanode-0 datanode java.util.concurrent.ExecutionException:
> java.util.concurrent.ExecutionException:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
> java.nio.file.NoSuchFileException:
> /data/storage/hdds/2a77fab9-9dc5-4f73-9501-b5347ac6145c/current/containerDir0/1/chunks/gGYYgiTTeg_testdata_chunk_13931.tmp.2.20830
> {code}
> Can be reproduced.
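
The interleaving described in the comment (B.3 racing with A.5) can be replayed in isolation with plain java.nio. This is only a minimal sketch and not Ozone code: the class name, the method names and the chunk file names are all hypothetical, and the fallback read is just one possible mitigation, not necessarily the fix for this issue. The replay is made deterministic by letting the commit land before the read.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustrative only: not Ozone code. All names here are hypothetical.
public class ChunkRenameRace {

    // A.2 (WRITE_DATA): the chunk is first written under a temporary name.
    static Path writeTmp(Path dir, byte[] data) throws IOException {
        return Files.write(dir.resolve("chunk_1.tmp"), data);
    }

    // A.5 (COMMIT_DATA): the temporary file is renamed to its final name.
    static Path commit(Path tmp) throws IOException {
        Path finalPath = tmp.resolveSibling("chunk_1");
        return Files.move(tmp, finalPath, StandardCopyOption.ATOMIC_MOVE);
    }

    // B.3 as described: the reader only knows the temporary name.
    static byte[] readTmpOnly(Path tmp) throws IOException {
        return Files.readAllBytes(tmp);
    }

    // Assumed mitigation: if the tmp file is gone, retry with the final name.
    static byte[] readWithFallback(Path tmp, Path finalPath) throws IOException {
        try {
            return Files.readAllBytes(tmp);
        } catch (NoSuchFileException e) {
            return Files.readAllBytes(finalPath);
        }
    }

    // Replays the bad ordering: the commit (A.5) lands before the read (B.3).
    // Returns "<tmp read failed>:<data recovered via fallback>".
    static String replay() {
        try {
            Path dir = Files.createTempDirectory("chunks");
            byte[] data = "testdata".getBytes(StandardCharsets.UTF_8);
            Path tmp = writeTmp(dir, data);
            Path finalPath = commit(tmp);
            boolean tmpMissing = false;
            try {
                readTmpOnly(tmp);
            } catch (NoSuchFileException e) {
                tmpMissing = true; // same exception type as in the datanode log
            }
            String recovered =
                new String(readWithFallback(tmp, finalPath), StandardCharsets.UTF_8);
            return tmpMissing + ":" + recovered;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(replay());
    }
}
```

In the real datanode the two flows run on different threads, so the failure only shows up when the rename happens between the decision to read the tmp file and the read itself; the sketch fixes that ordering by hand.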