Thanks everyone for the help. I found the issue, and it turned out to be something completely unrelated. Someone had set up a JVM option that ran a monitoring application of their own on top of Storm, so every time anyone submitted a topology, this application was also launched across the whole cluster, eventually leading to OOM. Remote profiling helped me detect the problem and disable the option.
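For anyone who hits something similar: an option of this shape in the
supervisors' worker.childopts would reproduce the behavior. This is just an
illustration with made-up names and paths, not the actual entry:

    # storm.yaml on each supervisor node: the extra -javaagent below gets
    # started inside every worker JVM, i.e. on every topology submission
    worker.childopts: "-Xmx6g -javaagent:/opt/custom-monitor/agent.jar"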
On Wed, Jan 20, 2016 at 6:57 PM, Andrew Xor <andreas.gramme...@gmail.com> wrote:

> Hello again,
>
> I'd like to chip in, @Nikolas, but you will probably not like what you
> will have to do... I really don't think this is Storm's fault; that would
> be really weird. Also, is the JVM you use for local execution the same
> version as the ones on your cluster? The best way to find out what is
> wrong is to get profiling data from the workers that die on the cluster.
> To do that you will need to debug/profile the topology on the cluster --
> not a trivial task, and it can be time consuming. I'll describe a
> relatively simple way of getting the data, which is what I did when I
> needed to debug a cluster topology.
>
> Since you don't know which nodes of the cluster will be assigned to your
> topology, you have to hook up to all of them, but thankfully you will
> only get a response from the ones that actually do the work: the JVM
> listening on your specified port is spawned only on the nodes executing
> your topology. Use the childopts argument to pass the port that the debug
> JVM will listen on (details on how to attach to JVMs remotely are here
> <http://docs.oracle.com/javase/8/docs/technotes/guides/jpda/conninv.html>
> and on attaching the YourKit agent here
> <https://www.yourkit.com/docs/java/help/attach_agent.jsp>), and make sure
> the port is not already in use -- you can use nmap to scan the nodes for
> available ports. Next, depending on your version of Java, invoke the JVM
> and attach it to your profiler -- preferably YourKit, even in trial mode,
> as it makes the process quite a bit easier than other profilers. Then
> review the dump/profiled data to find your issue. If you find that there
> is an issue with Storm itself, please include as many details as you can;
> that will make it easier to find and patch the issue, if any.
>
> Let us know if this helped or if you need anything else.
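> A rough sketch of the childopts part -- the port and the YourKit path
> below are made up, adjust them to your setup:
>
>     import backtype.storm.Config;
>
>     Config conf = new Config();
>     // Give every worker JVM a JDWP agent listening on port 8888;
>     // suspend=n so workers start without waiting for a debugger.
>     conf.put(Config.TOPOLOGY_WORKER_CHILDOPTS,
>         "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8888");
>     // Or attach the YourKit agent instead (library path is a guess):
>     // conf.put(Config.TOPOLOGY_WORKER_CHILDOPTS,
>     //     "-agentpath:/opt/yourkit/bin/linux-x86-64/libyjpagent.so=port=10001");
>
> Keep in mind that a fixed port will clash if two workers of your topology
> land on the same node, so stick to one worker per node while profiling.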
> On Mon, Jan 18, 2016 at 11:54 PM, Yury Ruchin <yuri.ruc...@gmail.com> wrote:
>
>> Yes, I suggest you try to spot the problem by looking at a heap dump of
>> a worker that throws the exception. That way you could at least be
>> certain about what consumes the workers' memory.
>>
>> Oracle HotSpot has a number of options controlling GC logging; setting
>> them for the worker JVMs may help in troubleshooting. Plumbr's handbook
>> seems to be a decent read on that matter: https://plumbr.eu/handbook.
>>
>> Since you are using a custom spout, could you provide its code, at least
>> the part that emits tuples?
>>
>> 2016-01-15 23:57 GMT+03:00 Nikolaos Pavlakis <nikolaspavla...@gmail.com>:
>>
>>> Hi Yury,
>>>
>>> 1. I am using Storm 0.9.5.
>>> 2. It is a BaseRichSpout. Yes, it has acking enabled, and I ack each
>>> tuple at the end of the "execute" method of the bolt. I see tuples
>>> being acked in the Storm UI.
>>> 3. Yes, I observe memory usage increasing (which eventually leads to
>>> the topology hanging) even in my dummy setup, which does not save
>>> anything in memory -- it merely reproduces the message passing of my
>>> algorithm. I do not get OOM errors when I execute the topology on the
>>> cluster, but I do get the most common exception in Storm:
>>> java.lang.RuntimeException: java.lang.NullPointerException at
>>> backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128),
>>> some tasks die, and the Storm UI statistics get lost/restarted. I have
>>> never profiled a topology while it is executing on the cluster, so I
>>> am not certain this is what you mean. If I understand correctly, you
>>> are suggesting that I take a heap dump with VisualVM on some node
>>> while the topology is running and analyze that dump.
>>> 4. I haven't seen any GC logs (I am not sure how to collect GC logs
>>> from the cluster).
>>>
>>> Thanks again for your help.
>>>
>>> On Fri, Jan 15, 2016 at 9:57 PM, Yury Ruchin <yuri.ruc...@gmail.com> wrote:
>>>
>>>> Hi Nick,
>>>>
>>>> Some questions:
>>>>
>>>> 1. Well, what version of Storm are you using? :)
>>>>
>>>> 2. What spout are you using? Is it reliable, i.e. does it use message
>>>> IDs so that messages get acked/failed by downstream bolts? Do you
>>>> have ackers enabled for your topology? If the spout is unreliable or
>>>> there are no ackers, then topology.max.spout.pending has no effect,
>>>> and if your bolts don't keep up with your spout you will likely end
>>>> up with the overflow buffer growing larger and larger.
>>>>
>>>> 3. Not sure if I get it right: after you stopped saving anything in
>>>> memory, do you still see memory usage increasing? Have you observed
>>>> OutOfMemoryErrors? If yes, you might want to launch your worker
>>>> processes with -XX:+HeapDumpOnOutOfMemoryError. If not, you can take
>>>> an on-demand heap dump using e.g. VisualVM and feed it to a memory
>>>> analyzer such as MAT, then look at what eats up the heap.
>>>>
>>>> 4. Why do you think it's a memory issue? Have you looked at the GC
>>>> graphs shown by e.g. VisualVM? Did you collect any GC logs to see how
>>>> long collections took?
>>>>
>>>> Regards,
>>>> Yury
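>>>> For example, something along these lines in the supervisors'
>>>> worker.childopts (or per topology via topology.worker.childopts) --
>>>> the paths are placeholders, the flags are standard HotSpot options:
>>>>
>>>>     -XX:+HeapDumpOnOutOfMemoryError
>>>>     -XX:HeapDumpPath=/tmp/worker-dumps
>>>>     -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps
>>>>     -Xloggc:/tmp/worker-gc.log
>>>>
>>>> The first two give you a heap dump to feed into MAT whenever a worker
>>>> dies with OOM; the rest produce a GC log you can inspect for long
>>>> pauses.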
>>>> 2016-01-15 20:15 GMT+03:00 Nikolaos Pavlakis <nikolaspavla...@gmail.com>:
>>>>
>>>>> Thanks for all the replies so far. I am profiling the topology in
>>>>> local mode with VisualVM and I do not see this problem there. I
>>>>> still run into the problem when the topology is deployed on the
>>>>> cluster, even with max.spout.pending = 1.
>>>>>
>>>>> On Wed, Jan 13, 2016 at 10:38 PM, John Yost <hokiege...@gmail.com> wrote:
>>>>>
>>>>>> +1 for Andrew -- profiling with jvisualvm or a similar tool is
>>>>>> definitely something to do if you haven't already.
>>>>>>
>>>>>> On Wed, Jan 13, 2016 at 3:30 PM, Andrew Xor
>>>>>> <andreas.gramme...@gmail.com> wrote:
>>>>>>
>>>>>>> Hey,
>>>>>>>
>>>>>>> Care to give the Storm/JVM versions? Does this happen only on
>>>>>>> cluster execution, or also when running the topology in local
>>>>>>> mode? Unfortunately, probably the best way to find out what's
>>>>>>> really going on is to profile your topology. If you can run the
>>>>>>> topology locally, that will make things quite a bit easier, as
>>>>>>> profiling Storm topologies on a live cluster can be quite time
>>>>>>> consuming.
>>>>>>>
>>>>>>> Regards.
>>>>>>>
>>>>>>> On Wed, Jan 13, 2016 at 10:06 PM, Nikolaos Pavlakis
>>>>>>> <nikolaspavla...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I am implementing a distributed algorithm for PageRank estimation
>>>>>>>> using Storm. I have been having memory problems, so I decided to
>>>>>>>> create a dummy implementation that does not explicitly save
>>>>>>>> anything in memory, to determine whether the problem lies in my
>>>>>>>> algorithm or in my Storm structure.
>>>>>>>>
>>>>>>>> Indeed, while the only thing the dummy implementation does is
>>>>>>>> message passing (a lot of it), the memory of each worker process
>>>>>>>> keeps rising until the pipeline is clogged. I do not understand
>>>>>>>> why this might be happening.
>>>>>>>>
>>>>>>>> My cluster has 18 machines (some with 8 GB, some with 16 GB, and
>>>>>>>> some with 32 GB of memory). I have set the worker heap size to
>>>>>>>> 6 GB (-Xmx6g).
>>>>>>>>
>>>>>>>> My topology is very simple: one spout, and one bolt (with
>>>>>>>> parallelism). The bolt receives data from the spout
>>>>>>>> (fieldsGrouping) and also from other tasks of itself.
>>>>>>>>
>>>>>>>> My message-passing pattern is based on random walks with a
>>>>>>>> certain stopping probability. More specifically: the spout
>>>>>>>> generates a tuple; one specific task of the bolt receives this
>>>>>>>> tuple; then, with a certain probability, that task generates
>>>>>>>> another tuple and emits it to another task of the same bolt.
>>>>>>>>
>>>>>>>> I have been stuck on this problem for quite a while, so it would
>>>>>>>> be very helpful if someone could help.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Nick
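>>>>>>>> P.S. In case it helps, here is roughly how the topology is wired
>>>>>>>> -- a sketch with made-up class and field names, not my actual
>>>>>>>> code:
>>>>>>>>
>>>>>>>>     import backtype.storm.topology.TopologyBuilder;
>>>>>>>>     import backtype.storm.tuple.Fields;
>>>>>>>>
>>>>>>>>     TopologyBuilder builder = new TopologyBuilder();
>>>>>>>>     builder.setSpout("spout", new RandomWalkSpout(), 1);
>>>>>>>>     builder.setBolt("walker", new RandomWalkBolt(), 64)
>>>>>>>>            // tuples from the spout, partitioned by node id
>>>>>>>>            .fieldsGrouping("spout", new Fields("node"))
>>>>>>>>            // tuples the bolt re-emits to other tasks of itself
>>>>>>>>            .fieldsGrouping("walker", new Fields("node"));
>>>>>>>>
>>>>>>>> and the bolt's execute method is essentially:
>>>>>>>>
>>>>>>>>     public void execute(Tuple tuple) {
>>>>>>>>         // continue the walk with probability 1 - STOP_PROBABILITY,
>>>>>>>>         // forwarding it to a randomly chosen node
>>>>>>>>         if (rand.nextDouble() > STOP_PROBABILITY) {
>>>>>>>>             collector.emit(tuple, new Values(rand.nextInt(numNodes)));
>>>>>>>>         }
>>>>>>>>         collector.ack(tuple); // every tuple is acked at the end
>>>>>>>>     }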