What is the size of the blob? (The larger one is probably the jar.) The download time would depend on the network speed. I also noticed that the name of the mounted file system is ephemeral. Is this not persistent storage? How about on the nimbus?
Sent from Yahoo Mail for iPhone

On Friday, January 19, 2024, 12:24 PM, Devendar Rao <devendar.gu...@gmail.com> wrote:

We also see these warnings during the issue. Are there any settings that control the availability of these files? Is there a setting that makes the worker wait until the required files are available? Do we need to increase the value of topology.max.replication.wait.time.sec? It is set to 60 seconds now.

2024-01-18 11:37:08.328 o.a.s.l.AsyncLocalizer SLOT_6700 [WARN] Local base blobs are not available.
java.io.FileNotFoundException: File '/mnt/ephermal0/data/storm/supervisor/stormdist/topology_A-15-1704896815/stormconf.ser' does not exist
    at org.apache.storm.shade.org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:297) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.shade.org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1851) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
    at org.apache.storm.utils.ConfigUtils.readSupervisorStormConfGivenPath(ConfigUtils.java:311) ~[storm-client-2.3.0.jar:2.3.0]
    at org.apache.storm.utils.ConfigUtils.readSupervisorStormConfImpl(ConfigUtils.java:478) ~[storm-client-2.3.0.jar:2.3.0]
    at org.apache.storm.utils.ConfigUtils.readSupervisorStormConf(ConfigUtils.java:306) ~[storm-client-2.3.0.jar:2.3.0]
    at org.apache.storm.localizer.AsyncLocalizer.getLocalResources(AsyncLocalizer.java:368) ~[storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.localizer.AsyncLocalizer.releaseSlotFor(AsyncLocalizer.java:475) [storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.daemon.supervisor.Slot.handleWaitingForBlobLocalization(Slot.java:410) [storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.daemon.supervisor.Slot.stateMachineStep(Slot.java:192) [storm-server-2.3.0.jar:2.3.0]
    at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:941) [storm-server-2.3.0.jar:2.3.0]

On Thu, Jan 18, 2024 at 1:28 PM Devendar Rao <devendar.gu...@gmail.com> wrote:

> Hi Bipin,
>
> There are no disk space issues on the nimbus host. We also see this exception. I think this is the effect, not the cause. Not completely sure though.
>
> 2024-01-18 11:47:16.018 o.a.s.t.ProcessFunction pool-29-thread-60 [ERROR] Internal error processing beginBlobDownload
> java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>     at org.apache.storm.daemon.nimbus.Nimbus.beginBlobDownload(Nimbus.java:3860) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.generated.Nimbus$Processor$beginBlobDownload.getResult(Nimbus.java:4340) ~[storm-client-2.3.0.jar:2.3.0]
>     at org.apache.storm.generated.Nimbus$Processor$beginBlobDownload.getResult(Nimbus.java:4319) ~[storm-client-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:172) [storm-client-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524) [storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.3.0.jar:2.3.0]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
>     at java.lang.Thread.run(Thread.java:834) [?:?]
> Caused by: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>     at org.apache.storm.blobstore.BlobStoreUtils.updateKeyForBlobStore(BlobStoreUtils.java:283) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.LocalFsBlobStore.checkForBlobUpdate(LocalFsBlobStore.java:469) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.LocalFsBlobStore.getBlob(LocalFsBlobStore.java:389) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.daemon.nimbus.Nimbus.beginBlobDownload(Nimbus.java:3845) ~[storm-server-2.3.0.jar:2.3.0]
>     ... 10 more
> Caused by: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>     at org.apache.storm.blobstore.NimbusBlobStore.createStateInZookeeper(NimbusBlobStore.java:139) ~[storm-client-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.BlobStoreUtils.createStateInZookeeper(BlobStoreUtils.java:240) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.BlobStoreUtils.updateKeyForBlobStore(BlobStoreUtils.java:277) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.LocalFsBlobStore.checkForBlobUpdate(LocalFsBlobStore.java:469) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.LocalFsBlobStore.getBlob(LocalFsBlobStore.java:389) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.daemon.nimbus.Nimbus.beginBlobDownload(Nimbus.java:3845) ~[storm-server-2.3.0.jar:2.3.0]
>     ... 10 more
> Caused by: org.apache.storm.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>     at org.apache.storm.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.transport.TTransport.readAll(TTransport.java:86) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:141) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.transport.TFramedTransport.read(TFramedTransport.java:109) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.transport.TTransport.readAll(TTransport.java:86) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:455) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:354) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:243) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.TServiceClient.receiveBase(TServiceClient.java:77) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.generated.Nimbus$Client.recv_createStateInZookeeper(Nimbus.java:1036) ~[storm-client-2.3.0.jar:2.3.0]
>     at org.apache.storm.generated.Nimbus$Client.createStateInZookeeper(Nimbus.java:1023) ~[storm-client-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.NimbusBlobStore.createStateInZookeeper(NimbusBlobStore.java:136) ~[storm-client-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.BlobStoreUtils.createStateInZookeeper(BlobStoreUtils.java:240) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.BlobStoreUtils.updateKeyForBlobStore(BlobStoreUtils.java:277) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.LocalFsBlobStore.checkForBlobUpdate(LocalFsBlobStore.java:469) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.LocalFsBlobStore.getBlob(LocalFsBlobStore.java:389) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.daemon.nimbus.Nimbus.beginBlobDownload(Nimbus.java:3845) ~[storm-server-2.3.0.jar:2.3.0]
>     ... 10 more
> Caused by: java.net.SocketTimeoutException: Read timed out
>     at java.net.SocketInputStream.socketRead0(Native Method) ~[?:?]
>     at java.net.SocketInputStream.socketRead(SocketInputStream.java:115) ~[?:?]
>     at java.net.SocketInputStream.read(SocketInputStream.java:168) ~[?:?]
>     at java.net.SocketInputStream.read(SocketInputStream.java:140) ~[?:?]
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:252) ~[?:?]
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:292) ~[?:?]
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:351) ~[?:?]
>     at org.apache.storm.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:125) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.transport.TTransport.readAll(TTransport.java:86) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:141) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.transport.TFramedTransport.read(TFramedTransport.java:109) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.transport.TTransport.readAll(TTransport.java:86) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:455) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:354) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:243) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.thrift.TServiceClient.receiveBase(TServiceClient.java:77) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
>     at org.apache.storm.generated.Nimbus$Client.recv_createStateInZookeeper(Nimbus.java:1036) ~[storm-client-2.3.0.jar:2.3.0]
>     at org.apache.storm.generated.Nimbus$Client.createStateInZookeeper(Nimbus.java:1023) ~[storm-client-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.NimbusBlobStore.createStateInZookeeper(NimbusBlobStore.java:136) ~[storm-client-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.BlobStoreUtils.createStateInZookeeper(BlobStoreUtils.java:240) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.BlobStoreUtils.updateKeyForBlobStore(BlobStoreUtils.java:277) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.LocalFsBlobStore.checkForBlobUpdate(LocalFsBlobStore.java:469) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.blobstore.LocalFsBlobStore.getBlob(LocalFsBlobStore.java:389) ~[storm-server-2.3.0.jar:2.3.0]
>     at org.apache.storm.daemon.nimbus.Nimbus.beginBlobDownload(Nimbus.java:3845) ~[storm-server-2.3.0.jar:2.3.0]
>     ... 10 more
>
> On Thu, Jan 18, 2024 at 12:25 PM Bipin Prasad <bipin_pra...@yahoo.com.invalid> wrote:
>
>> Hello Devendar,
>> The zookeeper entry is expected to be removed when the topology is killed. It appears that you are using the local file system blob store, so the blobs are (expected to be) on the nimbus. Can you check the mailbox directory and confirm whether or not the blob for this topology made it that far? Is it possible that there is some issue with disk space on the nimbus host?
>> - Bipin
>>
>> On Thursday, January 18, 2024, 11:39 AM, Devendar Rao <devendar.gu...@gmail.com> wrote:
>>
>> Thanks Bipin for the response. To add more details:
>> Whenever a new topology is deployed, nimbus doesn't respond and supervisor(s) go down.
>> We have to restart the services to bring the cluster back to normal.
>> Another error we see:
>> o.a.s.b.BlobStoreUtils BLOB-STORE-TIMER [ERROR] Could not download the blob with key: topology_B-7-1704896101-stormjar.jar
>> There is a stale entry in the zk path: /storm/blobstore/topology_B-7-1704896101-stormjar.jar. Not sure why it is not getting cleared. This is pretty consistent.
>> This error goes away after manually deleting the stale entry from the zk path:
>> rmr /storm/blobstore/topology_B-7-1704896101-stormjar.jar
>> Thanks,
>> Devendar
>>
>> On Thu, Jan 18, 2024 at 11:28 AM Devendar Rao <devendar.gu...@gmail.com> wrote:
>>
>> Full stack trace:
>> 2024-01-18 18:27:12.889 o.a.s.d.n.Nimbus pool-29-thread-22 [WARN] get blob meta exception.
>> org.apache.storm.utils.WrappedKeyNotFoundException: topology-A-7-1704896101-stormjar.jar
>>     at org.apache.storm.blobstore.LocalFsBlobStore.getStoredBlobMeta(LocalFsBlobStore.java:258) ~[storm-server-2.3.0.jar:2.3.0]
>>     at org.apache.storm.blobstore.LocalFsBlobStore.getBlobMeta(LocalFsBlobStore.java:288) ~[storm-server-2.3.0.jar:2.3.0]
>>     at org.apache.storm.daemon.nimbus.Nimbus.getBlobMeta(Nimbus.java:3815) [storm-server-2.3.0.jar:2.3.0]
>>     at org.apache.storm.generated.Nimbus$Processor$getBlobMeta.getResult(Nimbus.java:4278) [storm-client-2.3.0.jar:2.3.0]
>>     at org.apache.storm.generated.Nimbus$Processor$getBlobMeta.getResult(Nimbus.java:4257) [storm-client-2.3.0.jar:2.3.0]
>>     at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.3.0.jar:2.3.0]
>>     at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [storm-shaded-deps-2.3.0.jar:2.3.0]
>>     at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:172) [storm-client-2.3.0.jar:2.3.0]
>>     at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524) [storm-shaded-deps-2.3.0.jar:2.3.0]
>>     at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.3.0.jar:2.3.0]
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
>>     at java.lang.Thread.run(Thread.java:834) [?:?]
>>
>> On Thu, Jan 18, 2024 at 11:26 AM Bipin Prasad <bipin_pra...@yahoo.com.invalid> wrote:
>>
>> Can you post the full stack trace? I want to confirm that this is logged while trying to obtain the heartbeat. This message is a warning and is not expected to shut down nimbus.
>>
>> On Thursday, January 18, 2024, 11:19 AM, Devendar Rao <devendar.gu...@gmail.com> wrote:
>>
>> Hi,
>> We're constantly seeing issues with blobs in Storm 2.3.0 on each topology deployment. Supervisor/nimbus dies after seeing the exceptions below.
>> Is this a known issue? Are we hitting any blob cache size limits?
>> Exceptions:
>> "Could not download the blob with key:"
>> o.a.s.d.n.Nimbus pool-29-thread-22 [WARN] .org.apache.storm.utils.WrappedKeyNotFoundException
>>
>> Can someone please shed some light on this?
>> Thanks
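
For anyone repeating the manual cleanup described in this thread, the step of mapping the "Could not download the blob with key" errors to zk paths could be sketched roughly as below. This is only an illustration and not part of Storm: the class StaleBlobKeys and the method candidateZkPaths are hypothetical names, and the /storm/blobstore root is assumed from the rmr command quoted in the thread (your storm.zookeeper.root may differ). It only prints candidate paths and never touches ZooKeeper; whether an entry is actually stale still has to be checked by hand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Storm class: lists zk paths worth inspecting.
public class StaleBlobKeys {
    // Matches the BlobStoreUtils error text quoted in the thread and captures the blob key.
    private static final Pattern MISSING_BLOB =
            Pattern.compile("Could not download the blob with key: (\\S+)");

    // Assumption: blob state lives under this root, as in the rmr command from the thread.
    static final String BLOBSTORE_ROOT = "/storm/blobstore";

    // Returns one candidate zk path per distinct blob key, in first-seen order.
    public static List<String> candidateZkPaths(List<String> logLines) {
        List<String> paths = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = MISSING_BLOB.matcher(line);
            if (m.find()) {
                String path = BLOBSTORE_ROOT + "/" + m.group(1);
                if (!paths.contains(path)) {
                    paths.add(path);
                }
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        // Sample lines taken from the log excerpts in this thread.
        List<String> log = Arrays.asList(
                "o.a.s.b.BlobStoreUtils BLOB-STORE-TIMER [ERROR] Could not download the blob with key: topology_B-7-1704896101-stormjar.jar",
                "o.a.s.d.n.Nimbus pool-29-thread-22 [WARN] get blob meta exception.");
        for (String path : candidateZkPaths(log)) {
            System.out.println(path);
        }
    }
}
```

With the two sample lines above, this prints /storm/blobstore/topology_B-7-1704896101-stormjar.jar, which is the same entry that was removed manually with rmr.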