[ https://issues.apache.org/jira/browse/HDDS-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Supratim Deka reassigned HDDS-1206:
-----------------------------------

    Assignee: Supratim Deka  (was: Shashikant Banerjee)

> need to handle in the client when one of the datanode disk goes out of space
> ----------------------------------------------------------------------------
>
>                 Key: HDDS-1206
>                 URL: https://issues.apache.org/jira/browse/HDDS-1206
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Client
>            Reporter: Nilotpal Nandi
>            Assignee: Supratim Deka
>            Priority: Major
>
> steps taken :
> --------------------
>  # create 40 datanode cluster.
>  # one of the datanodes has less than 5 GB space.
>  # Started writing key of size 600MB.
> operation failed:
> Error on the client:
> ----------------------------
> {noformat}
> Fri Mar 1 09:05:28 UTC 2019 Ruuning /root/hadoop_trunk/ozone-0.4.0-SNAPSHOT/bin/ozone sh key put testvol172275910-1551431122-1/testbuck172275910-1551431122-1/test_file24 /root/test_files/test_file24
> original md5sum a6de00c9284708585f5a99b0490b0b23
> 2019-03-01 09:05:39,142 ERROR storage.BlockOutputStream: Unexpected Storage Container Exception:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: ContainerID 79 creation failed
>     at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:613)
>     at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>     at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>     at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2019-03-01 09:05:39,578 ERROR storage.BlockOutputStream: Unexpected Storage Container Exception:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: ContainerID 79 creation failed
>     at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:613)
>     at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>     at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>     at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2019-03-01 09:05:40,368 ERROR storage.BlockOutputStream: Unexpected Storage Container Exception:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: ContainerID 79 creation failed
>     at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:613)
>     at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>     at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>     at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2019-03-01 09:05:40,450 ERROR storage.BlockOutputStream: Unexpected Storage Container Exception:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: ContainerID 79 creation failed
>     at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:613)
>     at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>     at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>     at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2019-03-01 09:05:40,457 ERROR storage.BlockOutputStream: Unexpected Storage Container Exception:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: ContainerID 79 does not exist
>     at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$handlePartialFlush$2(BlockOutputStream.java:393)
>     at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>     at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>     at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2019-03-01 09:05:40,535 ERROR storage.BlockOutputStream: Unexpected Storage Container Exception:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: ContainerID 79 creation failed
>     at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:613)
>     at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>     at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>     at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2019-03-01 09:05:40,617 ERROR storage.BlockOutputStream: Unexpected Storage Container Exception:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: ContainerID 79 creation failed
>     at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:613)
>     at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>     at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>     at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2019-03-01 09:05:40,741 ERROR storage.BlockOutputStream: Unexpected Storage Container Exception:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: ContainerID 79 creation failed
>     at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:613)
>     at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>     at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>     at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2019-03-01 09:05:40,814 ERROR storage.BlockOutputStream: Unexpected Storage Container Exception:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: ContainerID 79 creation failed
>     at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$writeChunkToContainer$5(BlockOutputStream.java:613)
>     at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>     at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>     at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2019-03-01 09:05:40,815 ERROR storage.BlockOutputStream: Unexpected Storage Container Exception:
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: ContainerID 79 does not exist
>     at org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.validateContainerResponse(ContainerProtocolCalls.java:568)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.validateResponse(BlockOutputStream.java:535)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.lambda$handlePartialFlush$2(BlockOutputStream.java:393)
>     at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>     at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>     at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> java.nio.BufferOverflowException
>     at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:189)
>     at org.apache.hadoop.hdds.scm.storage.BlockOutputStream.write(BlockOutputStream.java:213)
>     at org.apache.hadoop.ozone.client.io.BlockOutputStreamEntry.write(BlockOutputStreamEntry.java:128)
>     at org.apache.hadoop.ozone.client.io.KeyOutputStream.handleWrite(KeyOutputStream.java:307)
>     at org.apache.hadoop.ozone.client.io.KeyOutputStream.write(KeyOutputStream.java:268)
>     at org.apache.hadoop.ozone.client.io.OzoneOutputStream.write(OzoneOutputStream.java:49)
>     at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:96)
>     at org.apache.hadoop.ozone.web.ozShell.keys.PutKeyHandler.call(PutKeyHandler.java:111)
>     at org.apache.hadoop.ozone.web.ozShell.keys.PutKeyHandler.call(PutKeyHandler.java:53)
>     at picocli.CommandLine.execute(CommandLine.java:919)
>     at picocli.CommandLine.access$700(CommandLine.java:104)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:1083)
>     at picocli.CommandLine$RunLast.handle(CommandLine.java:1051)
>     at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:959)
>     at picocli.CommandLine.parseWithHandlers(CommandLine.java:1242)
>     at picocli.CommandLine.parseWithHandler(CommandLine.java:1181)
>     at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:61)
>     at org.apache.hadoop.ozone.web.ozShell.Shell.execute(Shell.java:84)
>     at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:52)
>     at org.apache.hadoop.ozone.web.ozShell.Shell.main(Shell.java:95){noformat}
>
> ozone.log
> -----------------
>
> {noformat}
> 2019-03-01 09:05:33,248 [IPC Server handler 17 on 9889] DEBUG (OzoneManagerRequestHandler.java:137) - Received OMRequest: cmdType: CreateKey
> traceID: "5f169cde0a4c8a4e:79f0b64c3329c0ba:5f169cde0a4c8a4e:0"
> clientId: "client-86810A76C95E"
> createKeyRequest {
>   keyArgs {
>     volumeName: "testvol172275910-1551431122-1"
>     bucketName: "testbuck172275910-1551431122-1"
>     keyName: "test_file24"
>     dataSize: 629145600
>     type: RATIS
>     factor: THREE
>     isMultipartKey: false
>   }
> }
> ,
> 2019-03-01 09:05:33,255 [IPC Server handler 17 on 9889] DEBUG (KeyManagerImpl.java:465) - Key test_file24 allocated in volume testvol172275910-1551431122-1 bucket testbuck172275910-1551431122-1
> 2019-03-01 09:05:38,229 [IPC Server handler 8 on 9889] DEBUG (OzoneManagerRequestHandler.java:137) - Received OMRequest: cmdType: AllocateBlock
> traceID: "5f169cde0a4c8a4e:fe6c4bdb75978062:5f169cde0a4c8a4e:0"
> clientId: "client-86810A76C95E"
> allocateBlockRequest {
>   keyArgs {
>     volumeName: "testvol172275910-1551431122-1"
>     bucketName: "testbuck172275910-1551431122-1"
>     keyName: "test_file24"
>     dataSize: 629145600
>   }
>   clientID: 20622763490697872
> }
> ,
> 2019-03-01 09:05:38,739 [grpc-default-executor-17] INFO (ContainerUtils.java:149) - Operation: CreateContainer : Trace ID: 5f169cde0a4c8a4e:f340a80dafdf68eb:5f169cde0a4c8a4e:0 : Message: Container creation failed, due to disk out of space : Result: DISK_OUT_OF_SPACE
> 2019-03-01 09:05:38,790 [grpc-default-executor-17] INFO (ContainerUtils.java:149) - Operation: WriteChunk : Trace ID: 5f169cde0a4c8a4e:f340a80dafdf68eb:5f169cde0a4c8a4e:0 : Message: ContainerID 79 creation failed : Result: DISK_OUT_OF_SPACE
> 2019-03-01 09:05:38,800 [grpc-default-executor-17] DEBUG (ContainerStateMachine.java:358) - writeChunk writeStateMachineData : blockId containerID: 79
> localID: 101674591075108132
> blockCommitSequenceId: 0
> logIndex 3 chunkName f6508b585fbd0b834b2139939467ac03_stream_8101b9db-a724-4690-abe1-c7daa2630326_chunk_1
> 2019-03-01 09:05:38,801 [grpc-default-executor-17] DEBUG (ContainerStateMachine.java:365) - writeChunk writeStateMachineData completed: blockId containerID: 79
> localID: 101674591075108132
> blockCommitSequenceId: 0
> logIndex 3 chunkName f6508b585fbd0b834b2139939467ac03_stream_8101b9db-a724-4690-abe1-c7daa2630326_chunk_1
> 2019-03-01 09:05:38,978 [StateMachineUpdater-89770995-a80c-451d-a875-a0d384c2067f] INFO (ContainerStateMachine.java:573) - Gap in indexes at:0 detected, adding dummy entries
> 2019-03-01 09:05:38,979 [StateMachineUpdater-89770995-a80c-451d-a875-a0d384c2067f] INFO (ContainerStateMachine.java:573) - Gap in indexes at:1 detected, adding dummy entries
> 2019-03-01 09:05:38,980 [StateMachineUpdater-89770995-a80c-451d-a875-a0d384c2067f] INFO (ContainerStateMachine.java:573) - Gap in indexes at:2 detected, adding dummy entries
> 2019-03-01 09:05:38,981 [StateMachineUpdater-89770995-a80c-451d-a875-a0d384c2067f] INFO (ContainerUtils.java:149) - Operation: CreateContainer : Trace ID: 5f169cde0a4c8a4e:f340a80dafdf68eb:5f169cde0a4c8a4e:0 : Message: Container creation failed, due to disk out of space : Result: DISK_OUT_OF_SPACE
> 2019-03-01 09:05:38,981 [StateMachineUpdater-89770995-a80c-451d-a875-a0d384c2067f] INFO (ContainerUtils.java:149) - Operation: WriteChunk : Trace ID: 5f169cde0a4c8a4e:f340a80dafdf68eb:5f169cde0a4c8a4e:0 : Message: ContainerID 79 creation failed : Result: DISK_OUT_OF_SPACE
> 2019-03-01 09:05:39,357 [grpc-default-executor-18] INFO (ContainerUtils.java:149) - Operation: CreateContainer : Trace ID: 5f169cde0a4c8a4e:896673a239485fcd:5f169cde0a4c8a4e:0 : Message: Container creation failed, due to disk out of space : Result: DISK_OUT_OF_SPACE
> 2019-03-01 09:05:39,358 [grpc-default-executor-18] INFO (ContainerUtils.java:149) - Operation: WriteChunk : Trace ID: 5f169cde0a4c8a4e:896673a239485fcd:5f169cde0a4c8a4e:0 : Message: ContainerID 79 creation failed : Result: DISK_OUT_OF_SPACE
>
> {noformat}
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org