Each jar is 270 MB, and we have 19 topologies, each with a jar of that size.
Yes, the supervisors use AWS EBS volumes and Nimbus uses the local
NVMe (instance store).

Do you think increasing topology.max.replication.wait.time.sec would
help?
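
For a rough sense of scale, the sketch below compares ideal jar transfer times against the 60-second value of topology.max.replication.wait.time.sec mentioned in this thread. The bandwidth figures and the function name are assumptions for illustration only, not measurements from this cluster. As far as the Storm configuration docs describe it, that setting bounds how long Nimbus waits for topology blobs to replicate across Nimbus hosts, so the comparison is indicative at best.

```python
JAR_SIZE_MB = 270      # from the thread: size of one topology jar
TOPOLOGY_COUNT = 19    # from the thread: number of topologies
WAIT_TIME_SEC = 60     # from the thread: current wait-time setting


def transfer_seconds(size_mb: float, bandwidth_mbit_per_s: float) -> float:
    """Ideal time to move size_mb megabytes over a link of the given speed.

    Ignores protocol overhead, disk throughput, and contention.
    """
    return size_mb * 8 / bandwidth_mbit_per_s


if __name__ == "__main__":
    # Assumed effective bandwidths in Mbit/s (hypothetical values).
    for mbit in (100, 1000, 10000):
        one_jar = transfer_seconds(JAR_SIZE_MB, mbit)
        all_jars = transfer_seconds(JAR_SIZE_MB * TOPOLOGY_COUNT, mbit)
        fits = "within" if all_jars <= WAIT_TIME_SEC else "over"
        print(
            f"{mbit:>6} Mbit/s: one jar {one_jar:7.2f} s, "
            f"all {TOPOLOGY_COUNT} jars {all_jars:8.2f} s ({fits} {WAIT_TIME_SEC} s)"
        )
```

If the real, measured transfer time for all jars is well under the configured wait time, raising the setting alone is unlikely to change much.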

On Fri, Jan 19, 2024 at 12:29 PM Bipin Prasad
<bipin_pra...@yahoo.com.invalid> wrote:

> What is the size of the blob? (The largest is probably the jar.) The
> download time would depend on the network speed. I also noticed that the
> mounted file system is named ephemeral. Is this not persistent storage?
> How about on the Nimbus?
>
>
> Sent from Yahoo Mail for iPhone
>
>
> On Friday, January 19, 2024, 12:24 PM, Devendar Rao <
> devendar.gu...@gmail.com> wrote:
>
> We also see these warnings during the issue. Are there any settings that
> control the availability of these files?
>
> Is there a setting that makes the worker wait until the
> required files are available?
>
> Do we need to increase the value of this setting,
> topology.max.replication.wait.time.sec?
> It's set to 60 seconds now.
> 2024-01-18 11:37:08.328 o.a.s.l.AsyncLocalizer SLOT_6700 [WARN] Local base blobs are not available.
> java.io.FileNotFoundException: File '/mnt/ephermal0/data/storm/supervisor/stormdist/topology_A-15-1704896815/stormconf.ser' does not exist
> at org.apache.storm.shade.org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:297) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> at org.apache.storm.shade.org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1851) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> at org.apache.storm.utils.ConfigUtils.readSupervisorStormConfGivenPath(ConfigUtils.java:311) ~[storm-client-2.3.0.jar:2.3.0]
> at org.apache.storm.utils.ConfigUtils.readSupervisorStormConfImpl(ConfigUtils.java:478) ~[storm-client-2.3.0.jar:2.3.0]
> at org.apache.storm.utils.ConfigUtils.readSupervisorStormConf(ConfigUtils.java:306) ~[storm-client-2.3.0.jar:2.3.0]
> at org.apache.storm.localizer.AsyncLocalizer.getLocalResources(AsyncLocalizer.java:368) ~[storm-server-2.3.0.jar:2.3.0]
> at org.apache.storm.localizer.AsyncLocalizer.releaseSlotFor(AsyncLocalizer.java:475) [storm-server-2.3.0.jar:2.3.0]
> at org.apache.storm.daemon.supervisor.Slot.handleWaitingForBlobLocalization(Slot.java:410) [storm-server-2.3.0.jar:2.3.0]
> at org.apache.storm.daemon.supervisor.Slot.stateMachineStep(Slot.java:192) [storm-server-2.3.0.jar:2.3.0]
> at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:941) [storm-server-2.3.0.jar:2.3.0]
>
>
> On Thu, Jan 18, 2024 at 1:28 PM Devendar Rao <devendar.gu...@gmail.com>
> wrote:
>
> > Hi Bipin,
> >
> > There are no disk space issues on the Nimbus host. We also see this
> > exception. I think it is an effect, not the cause, though I'm not
> > completely sure.
> >
> > 2024-01-18 11:47:16.018 o.a.s.t.ProcessFunction pool-29-thread-60 [ERROR] Internal error processing beginBlobDownload
> > java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
> > at org.apache.storm.daemon.nimbus.Nimbus.beginBlobDownload(Nimbus.java:3860) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.generated.Nimbus$Processor$beginBlobDownload.getResult(Nimbus.java:4340) ~[storm-client-2.3.0.jar:2.3.0]
> > at org.apache.storm.generated.Nimbus$Processor$beginBlobDownload.getResult(Nimbus.java:4319) ~[storm-client-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:172) [storm-client-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524) [storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.3.0.jar:2.3.0]
> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
> > at java.lang.Thread.run(Thread.java:834) [?:?]
> > Caused by: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
> > at org.apache.storm.blobstore.BlobStoreUtils.updateKeyForBlobStore(BlobStoreUtils.java:283) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.LocalFsBlobStore.checkForBlobUpdate(LocalFsBlobStore.java:469) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.LocalFsBlobStore.getBlob(LocalFsBlobStore.java:389) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.daemon.nimbus.Nimbus.beginBlobDownload(Nimbus.java:3845) ~[storm-server-2.3.0.jar:2.3.0]
> > ... 10 more
> > Caused by: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
> > at org.apache.storm.blobstore.NimbusBlobStore.createStateInZookeeper(NimbusBlobStore.java:139) ~[storm-client-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.BlobStoreUtils.createStateInZookeeper(BlobStoreUtils.java:240) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.BlobStoreUtils.updateKeyForBlobStore(BlobStoreUtils.java:277) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.LocalFsBlobStore.checkForBlobUpdate(LocalFsBlobStore.java:469) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.LocalFsBlobStore.getBlob(LocalFsBlobStore.java:389) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.daemon.nimbus.Nimbus.beginBlobDownload(Nimbus.java:3845) ~[storm-server-2.3.0.jar:2.3.0]
> > ... 10 more
> > Caused by: org.apache.storm.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
> > at org.apache.storm.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.transport.TTransport.readAll(TTransport.java:86) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:141) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.transport.TFramedTransport.read(TFramedTransport.java:109) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.transport.TTransport.readAll(TTransport.java:86) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:455) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:354) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:243) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.TServiceClient.receiveBase(TServiceClient.java:77) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.generated.Nimbus$Client.recv_createStateInZookeeper(Nimbus.java:1036) ~[storm-client-2.3.0.jar:2.3.0]
> > at org.apache.storm.generated.Nimbus$Client.createStateInZookeeper(Nimbus.java:1023) ~[storm-client-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.NimbusBlobStore.createStateInZookeeper(NimbusBlobStore.java:136) ~[storm-client-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.BlobStoreUtils.createStateInZookeeper(BlobStoreUtils.java:240) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.BlobStoreUtils.updateKeyForBlobStore(BlobStoreUtils.java:277) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.LocalFsBlobStore.checkForBlobUpdate(LocalFsBlobStore.java:469) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.LocalFsBlobStore.getBlob(LocalFsBlobStore.java:389) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.daemon.nimbus.Nimbus.beginBlobDownload(Nimbus.java:3845) ~[storm-server-2.3.0.jar:2.3.0]
> > ... 10 more
> > Caused by: java.net.SocketTimeoutException: Read timed out
> > at java.net.SocketInputStream.socketRead0(Native Method) ~[?:?]
> > at java.net.SocketInputStream.socketRead(SocketInputStream.java:115) ~[?:?]
> > at java.net.SocketInputStream.read(SocketInputStream.java:168) ~[?:?]
> > at java.net.SocketInputStream.read(SocketInputStream.java:140) ~[?:?]
> > at java.io.BufferedInputStream.fill(BufferedInputStream.java:252) ~[?:?]
> > at java.io.BufferedInputStream.read1(BufferedInputStream.java:292) ~[?:?]
> > at java.io.BufferedInputStream.read(BufferedInputStream.java:351) ~[?:?]
> > at org.apache.storm.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:125) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.transport.TTransport.readAll(TTransport.java:86) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:141) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.transport.TFramedTransport.read(TFramedTransport.java:109) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.transport.TTransport.readAll(TTransport.java:86) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:455) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:354) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:243) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.thrift.TServiceClient.receiveBase(TServiceClient.java:77) ~[storm-shaded-deps-2.3.0.jar:2.3.0]
> > at org.apache.storm.generated.Nimbus$Client.recv_createStateInZookeeper(Nimbus.java:1036) ~[storm-client-2.3.0.jar:2.3.0]
> > at org.apache.storm.generated.Nimbus$Client.createStateInZookeeper(Nimbus.java:1023) ~[storm-client-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.NimbusBlobStore.createStateInZookeeper(NimbusBlobStore.java:136) ~[storm-client-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.BlobStoreUtils.createStateInZookeeper(BlobStoreUtils.java:240) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.BlobStoreUtils.updateKeyForBlobStore(BlobStoreUtils.java:277) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.LocalFsBlobStore.checkForBlobUpdate(LocalFsBlobStore.java:469) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.blobstore.LocalFsBlobStore.getBlob(LocalFsBlobStore.java:389) ~[storm-server-2.3.0.jar:2.3.0]
> > at org.apache.storm.daemon.nimbus.Nimbus.beginBlobDownload(Nimbus.java:3845) ~[storm-server-2.3.0.jar:2.3.0]
> > ... 10 more
> >
> > On Thu, Jan 18, 2024 at 12:25 PM Bipin Prasad
> > <bipin_pra...@yahoo.com.invalid> wrote:
> >
> >> Hello Devendar, the ZooKeeper entry is expected to be removed when the
> >> topology is killed. It appears that you are using the local file system
> >> blob store, so the blobs are (expected to be) on the Nimbus. Can you check
> >> the mailbox directory and confirm whether or not the blob for this topology
> >> made it that far? Is it possible that there is a disk space issue on the
> >> Nimbus host?
> >> —Bipin
> >>
> >>
> >> Sent from Yahoo Mail for iPhone
> >>
> >>
> >> On Thursday, January 18, 2024, 11:39 AM, Devendar Rao <
> >> devendar.gu...@gmail.com> wrote:
> >>
> >> Thanks, Bipin, for the response. To add more details:
> >> Whenever a new topology is deployed, Nimbus stops responding and the
> >> supervisor(s) go down. We have to restart the services to bring the
> >> cluster back to normal.
> >> Another error we see: o.a.s.b.BlobStoreUtils BLOB-STORE-TIMER [ERROR]
> >> Could not download the blob with key: topology_B-7-1704896101-stormjar.jar
> >> There is a stale entry in the ZK path
> >> /storm/blobstore/topology_B-7-1704896101-stormjar.jar. Not sure why it
> >> is not getting cleared. This is pretty consistent.
> >> The error goes away after manually deleting the stale entry from the ZK path:
> >> rmr /storm/blobstore/topology_B-7-1704896101-stormjar.jar
> >> Thanks, Devendar
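
The manual cleanup described above could be preceded by a check for which blob keys look stale. The following is a minimal sketch, assuming the child names under /storm/blobstore and the IDs of the currently active topologies have already been collected by some other means (for example zkCli and the Storm UI). The function names are hypothetical and nothing here talks to ZooKeeper or deletes anything.

```python
# Suffixes Storm appends to a topology ID to form its blob keys.
TOPOLOGY_BLOB_SUFFIXES = ("-stormjar.jar", "-stormcode.ser", "-stormconf.ser")


def topology_id_from_blob_key(key):
    """Return the topology ID embedded in a blob key, or None if the key
    does not end in one of the topology blob suffixes."""
    for suffix in TOPOLOGY_BLOB_SUFFIXES:
        if key.endswith(suffix):
            return key[: -len(suffix)]
    return None


def find_stale_blob_keys(znode_children, active_topology_ids):
    """Return, sorted, the blob keys whose topology ID is not active.

    Keys that are not topology blobs (no known suffix) are left alone.
    """
    active = set(active_topology_ids)
    stale = []
    for key in znode_children:
        topology_id = topology_id_from_blob_key(key)
        if topology_id is not None and topology_id not in active:
            stale.append(key)
    return sorted(stale)


if __name__ == "__main__":
    # Example inputs built from names that appear in this thread.
    children = [
        "topology_B-7-1704896101-stormjar.jar",
        "topology_A-15-1704896815-stormconf.ser",
    ]
    active_ids = ["topology_A-15-1704896815"]
    for key in find_stale_blob_keys(children, active_ids):
        print("candidate stale entry: /storm/blobstore/" + key)
```

Any key it reports is only a candidate; it is worth confirming the topology is really gone before removing the znode.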
> >>
> >>
> >> On Thu, Jan 18, 2024 at 11:28 AM Devendar Rao <devendar.gu...@gmail.com>
> >> wrote:
> >>
> >> Full stack trace:
> >> 2024-01-18 18:27:12.889 o.a.s.d.n.Nimbus pool-29-thread-22 [WARN] get blob meta exception.
> >> org.apache.storm.utils.WrappedKeyNotFoundException: topology-A-7-1704896101-stormjar.jar
> >> at org.apache.storm.blobstore.LocalFsBlobStore.getStoredBlobMeta(LocalFsBlobStore.java:258) ~[storm-server-2.3.0.jar:2.3.0]
> >> at org.apache.storm.blobstore.LocalFsBlobStore.getBlobMeta(LocalFsBlobStore.java:288) ~[storm-server-2.3.0.jar:2.3.0]
> >> at org.apache.storm.daemon.nimbus.Nimbus.getBlobMeta(Nimbus.java:3815) [storm-server-2.3.0.jar:2.3.0]
> >> at org.apache.storm.generated.Nimbus$Processor$getBlobMeta.getResult(Nimbus.java:4278) [storm-client-2.3.0.jar:2.3.0]
> >> at org.apache.storm.generated.Nimbus$Processor$getBlobMeta.getResult(Nimbus.java:4257) [storm-client-2.3.0.jar:2.3.0]
> >> at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.3.0.jar:2.3.0]
> >> at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:38) [storm-shaded-deps-2.3.0.jar:2.3.0]
> >> at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:172) [storm-client-2.3.0.jar:2.3.0]
> >> at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524) [storm-shaded-deps-2.3.0.jar:2.3.0]
> >> at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.3.0.jar:2.3.0]
> >> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
> >> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
> >> at java.lang.Thread.run(Thread.java:834) [?:?]
> >>
> >> On Thu, Jan 18, 2024 at 11:26 AM Bipin Prasad <bipin_pra...@yahoo.com.invalid>
> >> wrote:
> >>
> >> Can you post the full stack trace? I want to confirm that this is logged
> >> while trying to obtain the heartbeat. This message is a warning and
> >> is not expected to shut down Nimbus.
> >>
> >>
> >> Sent from Yahoo Mail for iPhone
> >>
> >>
> >> On Thursday, January 18, 2024, 11:19 AM, Devendar Rao <
> >> devendar.gu...@gmail.com> wrote:
> >>
> >> Hi,
> >> We're constantly seeing blob issues in Storm 2.3.0 with each
> >> topology deployment. The supervisor/Nimbus dies after logging the
> >> exceptions below.
> >> Is this a known issue? Are we hitting any blob cache size limits?
> >> Exceptions:
> >> "Could not download the blob with key:"
> >> o.a.s.d.n.Nimbus pool-29-thread-22 [WARN]
> >> .org.apache.storm.utils.WrappedKeyNotFoundException
> >>
> >>
> >> Can someone please shed some light on this?
> >> Thanks
