Hi Bipin,
There are no disk space issues on the nimbus host. We also see the exception below. I think it is an effect rather than the cause, though I am not completely sure.
2024-01-18 11:47:16.018 o.a.s.t.ProcessFunction pool-29-thread-60 [ERROR] Internal error processing beginBlobDownload
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.storm.daemon.nimbus.Nimbus.beginBlobDownload(Nimbus.java:3860) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.generated.Nimbus$Processor$beginBlobDownload.getResult(Nimbus.java:4340) ~[storm-client-2.3.0.jar:2.3.0]
    at org.apache.storm.generated.Nimbus$Processor$beginBlobDownload.getResult(Nimbus.java:4319) ~[storm-client-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:172) [storm-client-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524) [storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.3.0.jar:2.3.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.storm.blobstore.BlobStoreUtils.updateKeyForBlobStore(BlobStoreUtils.java:283) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.LocalFsBlobStore.checkForBlobUpdate(LocalFsBlobStore.java:469) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.LocalFsBlobStore.getBlob(LocalFsBlobStore.java:389) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.daemon.nimbus.Nimbus.beginBlobDownload(Nimbus.java:3845) ~[storm-server-2.3.0.jar:2.3.0]
    ... 10 more
Caused by: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.storm.blobstore.NimbusBlobStore.createStateInZookeeper(NimbusBlobStore.java:139) ~[storm-client-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.BlobStoreUtils.createStateInZookeeper(BlobStoreUtils.java:240) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.BlobStoreUtils.updateKeyForBlobStore(BlobStoreUtils.java:277) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.LocalFsBlobStore.checkForBlobUpdate(LocalFsBlobStore.java:469) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.LocalFsBlobStore.getBlob(LocalFsBlobStore.java:389) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.daemon.nimbus.Nimbus.beginBlobDownload(Nimbus.java:3845) ~[storm-server-2.3.0.jar:2.3.0]
    ... 10 more
Caused by: org.apache.storm.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.storm.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.transport.TTransport.readAll(TTransport.java:86) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:141) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.transport.TFramedTransport.read(TFramedTransport.java:109) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.transport.TTransport.readAll(TTransport.java:86) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:455) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:354) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:243) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.TServiceClient.receiveBase(TServiceClient.java:77) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.generated.Nimbus$Client.recv_createStateInZookeeper(Nimbus.java:1036) ~[storm-client-2.3.0.jar:2.3.0]
    at org.apache.storm.generated.Nimbus$Client.createStateInZookeeper(Nimbus.java:1023) ~[storm-client-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.NimbusBlobStore.createStateInZookeeper(NimbusBlobStore.java:136) ~[storm-client-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.BlobStoreUtils.createStateInZookeeper(BlobStoreUtils.java:240) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.BlobStoreUtils.updateKeyForBlobStore(BlobStoreUtils.java:277) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.LocalFsBlobStore.checkForBlobUpdate(LocalFsBlobStore.java:469) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.LocalFsBlobStore.getBlob(LocalFsBlobStore.java:389) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.daemon.nimbus.Nimbus.beginBlobDownload(Nimbus.java:3845) ~[storm-server-2.3.0.jar:2.3.0]
    ... 10 more
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method) ~[?:?]
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:115) ~[?:?]
    at java.net.SocketInputStream.read(SocketInputStream.java:168) ~[?:?]
    at java.net.SocketInputStream.read(SocketInputStream.java:140) ~[?:?]
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:252) ~[?:?]
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:292) ~[?:?]
    at java.io.BufferedInputStream.read(BufferedInputStream.java:351) ~[?:?]
    at org.apache.storm.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:125) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.transport.TTransport.readAll(TTransport.java:86) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:141) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.transport.TFramedTransport.read(TFramedTransport.java:109) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.transport.TTransport.readAll(TTransport.java:86) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:455) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:354) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:243) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.thrift.TServiceClient.receiveBase(TServiceClient.java:77) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.generated.Nimbus$Client.recv_createStateInZookeeper(Nimbus.java:1036) ~[storm-client-2.3.0.jar:2.3.0]
    at org.apache.storm.generated.Nimbus$Client.createStateInZookeeper(Nimbus.java:1023) ~[storm-client-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.NimbusBlobStore.createStateInZookeeper(NimbusBlobStore.java:136) ~[storm-client-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.BlobStoreUtils.createStateInZookeeper(BlobStoreUtils.java:240) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.BlobStoreUtils.updateKeyForBlobStore(BlobStoreUtils.java:277) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.LocalFsBlobStore.checkForBlobUpdate(LocalFsBlobStore.java:469) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.blobstore.LocalFsBlobStore.getBlob(LocalFsBlobStore.java:389) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.daemon.nimbus.Nimbus.beginBlobDownload(Nimbus.java:3845) ~[storm-server-2.3.0.jar:2.3.0]
    ... 10 more

On Thu, Jan 18, 2024 at 12:25 PM Bipin Prasad <bipin_pra...@yahoo.com.invalid> wrote:

> Hello Devendar,
> The zookeeper entry is expected to be removed when the topology is killed.
> It appears that you are using the local file store blob, so the blobs are
> (expected to be) on the nimbus. Can you check the mailbox directory and
> confirm whether or not the blob for this topology made it that far?
> Is it possible that there is some issue with disk space on the nimbus host?
> -- Bipin
>
> Sent from Yahoo Mail for iPhone
>
> On Thursday, January 18, 2024, 11:39 AM, Devendar Rao <devendar.gu...@gmail.com> wrote:
>
> Thanks Bipin for the response. To add more details:
> Whenever a new topology is deployed, nimbus doesn't respond and the
> supervisor(s) go down. We have to restart the services to bring the cluster
> back to normal.
> Another error we see: o.a.s.b.BlobStoreUtils BLOB-STORE-TIMER [ERROR]
> Could not download the blob with key: topology_B-7-1704896101-stormjar.jar
> There is a stale entry in the zk path:
> /storm/blobstore/topology_B-7-1704896101-stormjar.jar.
> Not sure why it was not getting cleared off. This is pretty consistent.
> This error goes away after manually deleting the stale entry from the zk path:
> rmr /storm/blobstore/topology_B-7-1704896101-stormjar.jar
> Thanks,
> Devendar
>
> On Thu, Jan 18, 2024 at 11:28 AM Devendar Rao <devendar.gu...@gmail.com> wrote:
>
> Full stack trace:
> 2024-01-18 18:27:12.889 o.a.s.d.n.Nimbus pool-29-thread-22 [WARN] get blob meta exception.
> org.apache.storm.utils.WrappedKeyNotFoundException: topology-A-7-1704896101-stormjar.jar
>     at org.apache.storm.blobstore.LocalFsBlobStore.getStoredBlobMeta(LocalFsBlobStore.java:258) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.LocalFsBlobStore.getBlobMeta(LocalFsBlobStore.java:288) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.daemon.nimbus.Nimbus.getBlobMeta(Nimbus.java:3815) [storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.generated.Nimbus$Processor$getBlobMeta.getResult(Nimbus.java:4278) [storm-client-2.3.0.jar:2.3.0]
>     at org.apache.storm.generated.Nimbus$Processor$getBlobMeta.getResult(Nimbus.java:4257) [storm-client-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:172) [storm-client-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524) [storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.3.0.jar:2.3.0]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
>     at java.lang.Thread.run(Thread.java:834) [?:?]
>
> On Thu, Jan 18, 2024 at 11:26 AM Bipin Prasad <bipin_pra...@yahoo.com.invalid> wrote:
>
> Can you post the full stack trace? I want to confirm that this is logged
> while trying to obtain the heartbeat. This message is a warning and
> is not expected to shut down nimbus.
>
> Sent from Yahoo Mail for iPhone
>
> On Thursday, January 18, 2024, 11:19 AM, Devendar Rao <devendar.gu...@gmail.com> wrote:
>
> Hi,
> We're constantly seeing issues with blobs in Storm 2.3.0 on each
> topology deployment. The supervisor/nimbus dies after logging the
> exceptions below.
> Is this a known issue? Are we hitting any blob cache size limits?
> Exceptions:
> "Could not download the blob with key:"
> o.a.s.d.n.Nimbus pool-29-thread-22 [WARN]
> .org.apache.storm.utils.WrappedKeyNotFoundException
>
> Can someone please shed some light on this?
> Thanks