I wrote a basic Trident spout that emits a “hello world” string a million times. I wrote a basic Trident function that prints the tuple value to stdout. I set up a basic Trident topology, spout shuffle with parallelism of 1, and the function with parallelism of 3. The problem node dies as soon as it receives the first tuple, the other 2 nodes print the received tuples before dying due to the other node. The problem is only on 1 specific node. If I shut down the supervisor and bring another one up on a different box, all 3 nodes print “hello world”. This rules out any of my custom classes not being loaded properly, because they aren’t included in this dummy topology. And all of the nodes share a central location for external jars, which I symlink to the storm/lib directory. I checked every jar on each node and there are no extra jars on any of them. Does anyone else have any ideas? Where else should I look? It clearly isn’t the code causing the problem. Java version is the same everywhere. Please help.
From: Nathan Leung [mailto:[email protected]] Sent: Thursday, October 09, 2014 9:47 AM To: user Subject: Re: Serialization issue? Just as a convoluted example with core storm. If I have two spouts tasks and two bolt tasks and use localOrShuffleGrouping tuples would not be serialized and I would not encounter any potential kryo issues because the tuples are avoiding the network. However, if number of tasks stays at 2 each, and number of workers is 3 or 4 then I would have serialization issues because they would be distributed more widely and data would cross the network. That said, since you're using a fields grouping you would cross the network even with just 2 workers so that's probably not it. On Thu, Oct 9, 2014 at 8:52 AM, Brunner, Bill <[email protected]<mailto:[email protected]>> wrote: Very. I am using standard fields grouping – groupBy() in Trident – in a few places , and I am sending basic Scala data structures around in my tuples (Maps, Lists, etc). If it were a true serialization issue, why would my topology run at all? I feel like it should blow up on any cluster with more than 1 node… As far as how my topology is configured, I’m not sure what details would be useful to you. From: Nathan Leung [mailto:[email protected]<mailto:[email protected]>] Sent: Thursday, October 09, 2014 8:18 AM To: user Subject: RE: Serialization issue? That is odd then. Kryo serializes classes in a more fully qualified manner the first time it sees the class for an object, and records and index of closes out has seen before. It serializes classes by index when it sees them subsequently. On deserialization it rebuilds the index as it encounters new class information. Your error means that it sees a class index value that it hasn't rebuilt yet. The only time I've run into this kind of error was when I was mucking with custom serialization. What type of grouping are you using, and what type of data are you sending your collection? Also, how is your topology configured? Have you try writing a standalone program to serialize your data with Kryo outside of storm? On Oct 9, 2014 7:57 AM, "Brunner, Bill" <[email protected]<mailto:[email protected]>> wrote: 0.9.2-incubating, and all my external jar dependencies are on a network share and symlinked to the /storm/lib folder (not that it would matter, but this was also happening on a completely different environment when the jars were not symlinked). From: Nathan Leung [mailto:[email protected]<mailto:[email protected]>] Sent: Thursday, October 09, 2014 7:42 AM To: user Subject: Re: Serialization issue? Can you double check that you're running the same version on all three nodes? On Oct 9, 2014 7:35 AM, "Brunner, Bill" <[email protected]<mailto:[email protected]>> wrote: I am seeing serialization issues on a single worker node, while the other nodes are unaffected. I usually get the error shortly after the topology is loaded. It happens on a single node any time the total worker nodes in the cluster is > 2, ie, the topology runs w/o error on 2 nodes (1 worker each), but when I introduce a 3rd node, that node fails with the serialization error. Similarly with 4 worker nodes, 3 run and the 4th fails. I have included the stack trace below. FYI I am using the default serializer. I have tried cycling nimbus/zk and each node many times, and it does not help (I have also done a complete reinstall on the failing node(s)) Hopefully someone can help. Thanks Bill 2014-10-06 11:17:30 b.s.util [ERROR] Async loop died! java.lang.RuntimeException: com.esotericsoftware.kryo.KryoException: Encountered unregistered class ID: 38 at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating] at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating] at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating] at backtype.storm.daemon.executor$fn__5641$fn__5653$fn__5700.invoke(executor.clj:746) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating] at backtype.storm.util$async_loop$fn__457.invoke(util.clj:431) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating] at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45] Caused by: com.esotericsoftware.kryo.KryoException: Encountered unregistered class ID: 38 at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:119) ~[kryo-2.21.jar:na] at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610) ~[kryo-2.21.jar:na] at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721) ~[kryo-2.21.jar:na] at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:109) ~[kryo-2.21.jar:na] at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18) ~[kryo-2.21.jar:na] at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:629) ~[kryo-2.21.jar:na] at backtype.storm.serialization.KryoValuesDeserializer.deserializeFrom(KryoValuesDeserializer.java:38) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating] at backtype.storm.serialization.KryoTupleDeserializer.deserialize(KryoTupleDeserializer.java:53) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating] at backtype.storm.daemon.executor$mk_task_receiver$fn__5564.invoke(executor.clj:396) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating] at backtype.storm.disruptor$clojure_handler$reify__745.onEvent(disruptor.clj:58) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating] at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating] ... 6 common frames omitted ________________________________ This message, and any attachments, is for the intended recipient(s) only, may contain information that is privileged, confidential and/or proprietary and subject to important terms and conditions available at http://www.bankofamerica.com/emaildisclaimer. If you are not the intended recipient, please delete this message. ________________________________ This message, and any attachments, is for the intended recipient(s) only, may contain information that is privileged, confidential and/or proprietary and subject to important terms and conditions available at http://www.bankofamerica.com/emaildisclaimer. If you are not the intended recipient, please delete this message. ________________________________ This message, and any attachments, is for the intended recipient(s) only, may contain information that is privileged, confidential and/or proprietary and subject to important terms and conditions available at http://www.bankofamerica.com/emaildisclaimer. If you are not the intended recipient, please delete this message. ---------------------------------------------------------------------- This message, and any attachments, is for the intended recipient(s) only, may contain information that is privileged, confidential and/or proprietary and subject to important terms and conditions available at http://www.bankofamerica.com/emaildisclaimer. If you are not the intended recipient, please delete this message.
