RE: Dynamic Properties Revisited
David, Sorry... missed the Zookeeper part of your response. I will look into that. I went to the archaius wiki, and the first configuration source that it mentioned was JDBC... so I got hung up on that. I will check out the Zookeeper module. Craig From: user-return-2171-CRAIG.A.KING=leidos@storm.incubator.apache.org [user-return-2171-CRAIG.A.KING=leidos@storm.incubator.apache.org] on behalf of David Miller [david.mil...@m-square.com.au] Sent: Wednesday, May 14, 2014 1:50 AM To: user@storm.incubator.apache.org Cc: user@storm.incubator.apache.org Subject: Re: Dynamic Properties Revisited Not sure what you mean by jdbc when I suggested using the zookeeper archaius module. Setting it up is simple, there's a few different ways all covered in the docs On 14/05/2014, at 12:31 PM, King, Craig A. craig.a.k...@leidos.commailto:craig.a.k...@leidos.com wrote: David, So the question that I have with archaius is how you go about bootstrapping the credentials to connect to a jdbc based configuration? I could make a service out of it, but then the bolt would have to authenticate to the service… Even if we used certificates, we would need to disseminate them throughout the topology. Aaron, How would a bolt connect back to Zookeeper to get configuration information? Thanks in advance, Craig On May 13, 2014, at 4:24 PM, David Miller david.mil...@m-square.com.aumailto:david.mil...@m-square.com.au wrote: We use archaius with the zookeeper module for this https://github.com/Netflix/archaius/wiki On Wed, May 14, 2014 at 1:18 AM, Aaron Zimmerman azimmer...@sproutsocial.commailto:azimmer...@sproutsocial.com wrote: I would put it in zookeeper, especially since that's already a dependency. On Tue, May 13, 2014 at 10:14 AM, King, Craig A. craig.a.k...@leidos.commailto:craig.a.k...@leidos.com wrote: I submitted this question back in March, but did not get any responses. Since a little time has passed, and there are a few more folks on the mail list, I thought I would pop it back up again. 
Thanks in advance, Craig

On Mar 14, 2014, at 10:26 AM, King, Craig A. craig.a.k...@leidos.com wrote:

This topic was covered before, but it does not entirely fit my use case. I am looking for some best practices, or ideas on how to manage user names/passwords and other properties that can change at any time. The previous discussion revolved around external configuration at submission time, and can be found here: http://grokbase.com/t/gg/storm-user/134r0rbepz/submitting-a-jar-with-external-config

For background, I am doing an analysis of Storm for a DoD/Navy project. Within the Navy there are IA (Information Assurance) rules that govern password changes (such as passwords must change every 30 or 45 days, etc.). We also need to design the administration of the system for 19-year-old sailors with a few months' training. In order to manage the properties, there would be some web-based UI that would allow the admin to update passwords and hit a save button. No file editing or logging into Nimbus to change configuration files. The updated passwords (and other changed properties) should become immediately available to all currently running topologies. There could be dozens or even hundreds of topologies running, so killing and resubmitting with new properties is not really an option.

I have a couple of ideas, but I am a storm newbie so I don't know the feasibility:

1) Have the spouts monitor a property server for changes and push configuration (would require that all bolts get these streams).
2) Have each spout and bolt monitor the said property server.
3) Use messaging and have spouts/bolts subscribe to a configuration topic.

All ideas are welcome. Thanks in advance. Craig
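Craig's idea (2) above - each spout and bolt polling a central property store - can be sketched without any Storm or Archaius dependency. The class names and the in-memory store below are invented stand-ins for whatever backs the real configuration (e.g. Zookeeper via Archaius, per David's suggestion); this is a sketch of the pattern, not anyone's actual implementation:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the central property source (e.g. Zookeeper).
class PropertyStore {
    private final Map<String, String> props = new ConcurrentHashMap<>();
    void put(String key, String value) { props.put(key, value); }
    String get(String key) { return props.get(key); }
}

// Held by each spout/bolt: re-reads the credential on a fixed interval so a
// password changed in the admin UI reaches running topologies without a restart.
class RefreshingCredentials {
    private volatile String password;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    RefreshingCredentials(PropertyStore store, long periodMs) {
        password = store.get("db.password");
        scheduler.scheduleAtFixedRate(
                () -> password = store.get("db.password"),
                periodMs, periodMs, TimeUnit.MILLISECONDS);
    }

    String password() { return password; }     // always the latest seen value
    void shutdown() { scheduler.shutdown(); }  // call from the bolt's cleanup()
}
```

With Archaius's Zookeeper module the polling loop disappears entirely (a watch fires on change), but the volatile-read pattern inside the bolt stays the same.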
Re: Tuples lost in Storm 0.9.1
I am running into the same issue. Where have the lost tuples gone? If they were queueing in the transport layer, the memory usage should keep increasing, but I didn't see any noticeable memory leaks. Does storm guarantee that all tuples sent from task A to task B will be received by task B? Moreover, are they in order? Can anybody give any idea on this issue?

2014-04-02 20:56 GMT+08:00 Daria Mayorova d.mayor...@gmail.com:

Hi everyone, We are having some issues with the Storm topology. The problem is that some tuples are being lost somewhere in the topology. Just after the topology is deployed, it goes pretty well, but after several hours it starts to lose a significant amount of tuples. From what we've found out from the logs, the tuples exit one bolt/spout and never enter the next bolt.

Here is some info about the topology:
- The version is 0.9.1, and netty is used as transport
- The spout is extending BaseRichSpout, and the bolts extend BaseBasicBolt
- The spout is using Kestrel message queue
- The cluster consists of 2 nodes: zookeeper, nimbus and ui are running on one node, and the workers run on another node

I am attaching the content of the config files below. We have also tried running the workers on another node (the same where nimbus and zookeeper are), and also on both nodes, but the behavior is the same. According to the Storm UI there are no Failed tuples. Can anybody give any idea of what might be the reason for the tuples getting lost? Thanks.
*Storm config (storm.yaml)* (In case both nodes have workers running, the configuration is the same on both nodes, just the storm.local.hostname parameter changes)

storm.zookeeper.servers:
    - zkserver1
nimbus.host: nimbusserver
storm.local.dir: /mnt/storm
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
storm.local.hostname: storm1server
nimbus.childopts: -Xmx1024m -Djava.net.preferIPv4Stack=true
ui.childopts: -Xmx768m -Djava.net.preferIPv4Stack=true
supervisor.childopts: -Xmx1024m -Djava.net.preferIPv4Stack=true
worker.childopts: -Xmx3548m -Djava.net.preferIPv4Stack=true
storm.cluster.mode: distributed
storm.local.mode.zmq: false
storm.thrift.transport: backtype.storm.security.auth.SimpleTransportPlugin
storm.messaging.transport: backtype.storm.messaging.netty.Context
storm.messaging.netty.server_worker_threads: 1
storm.messaging.netty.client_worker_threads: 1
storm.messaging.netty.buffer_size: 5242880 # 5MB buffer
storm.messaging.netty.max_retries: 30
storm.messaging.netty.max_wait_ms: 1000
storm.messaging.netty.min_wait_ms: 100

*Zookeeper config (zoo.cfg):*

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/zookeeper
clientPort=2181
autopurge.purgeInterval=24
autopurge.snapRetainCount=5
server.1=localhost:2888:3888

*Topology configuration* passed to the StormSubmitter:

Config conf = new Config();
conf.setNumAckers(6);
conf.setNumWorkers(4);
conf.setMaxSpoutPending(100);

Best regards, Daria Mayorova

-- == Gvain Email: jh.li...@gmail.com
Re: Weirdness running topology on multiple nodes
Hi Justin, Can you share your storm.yaml config file? Do you have any firewall software running on any of the machines in your cluster? - Taylor

On May 7, 2014, at 11:11 AM, Justin Workman justinjwork...@gmail.com wrote:

We have spent the better part of 2 weeks now trying to get a pretty basic topology running across multiple nodes. I am sure I am missing something simple, but for the life of me I cannot figure it out. Here is the situation: I have 1 nimbus server and 5 supervisor servers, with Zookeeper running on the nimbus server and 2 supervisor nodes. These hosts are all virtual machines (4 CPUs, 8GB RAM) running in an OpenStack deployment.

If all of the guests are running on the same physical hypervisor then the topology starts up just fine and runs without any issues. However, if we take the guests and spread them out over multiple hypervisors (in the same OpenStack cluster), the topology never really completely starts up. Things start to run, some messages are pulled off the spout, but nothing ever makes it all the way through the topology and nothing is ever ack'd.

In the worker logs we get messages about reconnecting and eventually a "Remote host unreachable" error and "Async Loop Died". This used to result in a NumberFormat exception; reducing the netty retries from 30 to 10 resolved the NumberFormat error, and now we get the following:

2014-05-07 09:00:51 b.s.m.n.Client [INFO] Reconnect ... [9]
2014-05-07 09:00:52 b.s.m.n.Client [INFO] Reconnect ... [10]
2014-05-07 09:00:52 b.s.m.n.Client [INFO] Reconnect ... [9]
2014-05-07 09:00:52 b.s.m.n.Client [WARN] Remote address is not reachable. We will close this client.
2014-05-07 09:00:52 b.s.m.n.Client [INFO] Reconnect ... [9]
2014-05-07 09:00:52 b.s.m.n.Client [INFO] Reconnect ... [10]
2014-05-07 09:00:52 b.s.m.n.Client [WARN] Remote address is not reachable. We will close this client.
2014-05-07 09:00:52 b.s.m.n.Client [INFO] Reconnect ...
[10] 2014-05-07 09:00:52 b.s.m.n.Client [WARN] Remote address is not reachable. We will close this client.
2014-05-07 09:00:52 b.s.m.n.Client [INFO] Reconnect ... [10]
2014-05-07 09:00:52 b.s.m.n.Client [WARN] Remote address is not reachable. We will close this client.
2014-05-07 09:00:53 b.s.util [ERROR] Async loop died!
java.lang.RuntimeException: java.lang.RuntimeException: Client is being closed, and does not take requests any more
    at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:107) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
    at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:78) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
    at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:77) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
    at backtype.storm.disruptor$consume_loop_STAR_$fn__1577.invoke(disruptor.clj:89) ~[na:na]
    at backtype.storm.util$async_loop$fn__384.invoke(util.clj:433) ~[na:na]
    at clojure.lang.AFn.run(AFn.java:24) [clojure-1.4.0.jar:na]
    at java.lang.Thread.run(Thread.java:662) [na:1.6.0_26]
Caused by: java.lang.RuntimeException: Client is being closed, and does not take requests any more
    at backtype.storm.messaging.netty.Client.send(Client.java:125) ~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
    at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4398$fn__4399.invoke(worker.clj:319) ~[na:na]
    at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__4398.invoke(worker.clj:308) ~[na:na]
    at backtype.storm.disruptor$clojure_handler$reify__1560.onEvent(disruptor.clj:58) ~[na:na]

And in the supervisor logs we see errors about the workers timing out and not starting up all the way; we also see executor timeouts in the nimbus logs. But we do not see any errors in the Zookeeper logs, and the Zookeeper stats look fine.
There do not appear to be any real network issues; I can run a continuous flood ping between the hosts, with varying packet sizes, with minimal latency and no dropped packets. I have also attempted to add all hosts to the local hosts files on each machine without any difference. I have also played with adjusting the different heartbeat timeouts and intervals without any luck, and I have also deployed this same setup to a 5 node cluster on physical hardware (24 cores, 64GB RAM and a lot of local disks), and we had the same issue. The topology would start, but no data ever made it through the topology. The only way I have ever been able to get the topology to work is under OpenStack when all guests are on the same physical hypervisor. I think I am just missing something very obvious, but I am going in circles at this point and could use some additional suggestions. Thanks, Justin
Re: How/where to specify Workers JMX port?
The range of worker ports is defined in the storm.yaml as follows:

supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703

Cheers, Rob.

On 8 May 2014 15:03, Otis Gospodnetic otis.gospodne...@gmail.com wrote:

Hi, Where/how does one specify the JMX port range Storm should use for Workers? I'm trying to document that here: https://sematext.atlassian.net/wiki/display/PUBSPM/SPM+Monitor+-+Standalone#SPMMonitor-Standalone-Storm All examples of Worker JMX port ranges just show -Dcom.sun.management.jmxremote.port=1%ID%, and ports 16700, 16701... but I can't find where the range is defined. E.g. what if I want to use ports 26700, 26701..., or what if I want to use ports 1, 10001? Where/how would I specify that? Thanks, Otis

-- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/

-- Cheers Rob.
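Putting Rob's answer together with the 1%ID% examples Otis cites: the supervisor substitutes %ID% in worker.childopts with the slot port of the worker being launched, so the JMX range is derived from supervisor.slots.ports plus whatever literal prefix you put in front. A sketch, assuming the slot ports from Rob's reply:

```yaml
# %ID% is replaced by the worker's slot port when the worker JVM is launched,
# so with the slots below this yields JMX ports 16700, 16701, 16702, 16703.
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
worker.childopts: "-Dcom.sun.management.jmxremote.port=1%ID%"
```

To get 26700, 26701... change the prefix to 2%ID%; a range unrelated to the slot ports would require changing supervisor.slots.ports themselves, since the JMX port is derived from the slot port.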
Nimbus dies
Let's imagine the following scenario:
- the machine with one supervisor goes down
- the machine with nimbus goes down

Right now, because some workers go down as well, a few queues are not drained properly, which causes these queues to grow continuously in size. To avoid this situation we should rebalance the topology in order to distribute the load across all of the remaining supervisors, but to do this I need nimbus to be up and running. Moreover, basic monitoring information is not available because StormUI is also not working.

My question is: What is the devops procedure when the machine with nimbus dies, and what can be done to minimize its unavailability period? Should we install nimbus on a second machine and run it after the first machine dies - something similar to failover services? Can we run more than one nimbus? Or maybe there is a better option? Thanks for help
Re: Storm Scaling Issues
how many ackers? make sure this is set to the number of workers. what do you have for max spout pending? we configured to have 5-10 seconds of messages in pending state. (divide 5-10 seconds by avg tuple processing time) Bert

On Wed, May 14, 2014 at 4:35 PM, Lasantha Fernando lasantha@gmail.com wrote:

Hi Nathan, Tried increasing the number of workers with no luck. Also increased the parallelism of the throughput measuring bolt and checked the aggregate throughput, but that does not seem to be the bottleneck. Also, the capacity value of the throughput measuring bolt in Storm UI is at around ~0.12. Will try out more configurations and see. Thank you very much for your tips. Any other tweaks I might try out? Thanks, Lasantha

On Tue, May 13, 2014 at 6:38 PM, Nathan Leung ncle...@gmail.com wrote:

For 20 spouts and an even number of processing bolts, 3 seems like an odd number of workers. Also, are you sure you're not bottlenecked by your throughput measuring bolt?

On May 13, 2014 2:43 AM, Lasantha Fernando lasantha@gmail.com wrote:

Hi all, Is there any guide or hints on how to configure storm to scale better? I was running some tests with a custom scheduler and found that the throughput did not scale as expected. Any pointers on what I am doing wrong?

Parallelism         2        4         8         16
Single Node (Avg)   166099   161539.5  193986    N/A
Two Node (Avg)      160988   165563    174675.5  177624.5

The topology is as follows: Spout (Generates events continuously) - Processing Bolt - Throughput Measurement Bolt. Parallelism is varied for the processing bolt. Parallelism for the spout and throughput measuring bolt is kept constant at 20 and 1 respectively. Topology.NUM_WORKERS = 3. Custom scheduler code is available at [1]. Topology code is available at [2]. Any pointers would be much appreciated.
Thanks, Lasantha

[1] https://github.com/sajithshn/storm-schedulers/blob/master/src/main/java/org/wso2/siddhi/storm/scheduler/RoundRobinStormScheduler.java
[2] https://github.com/lasanthafdo/siddhi-storm/blob/master/src/main/java/org/wso2/siddhi/storm/StockDataTopology.java
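Bert's rule of thumb above (keep roughly 5-10 seconds' worth of tuples in flight) reduces to a one-line calculation. The sketch below is illustrative only; the 2 ms figure is an assumed value, and in practice you would use the average complete latency reported in Storm UI:

```java
// Hedged sketch of Bert's rule of thumb: maxSpoutPending ~= window of
// in-flight time divided by average per-tuple complete latency.
class SpoutPendingEstimate {
    static long maxSpoutPending(long windowMs, double avgCompleteLatencyMs) {
        return Math.round(windowMs / avgCompleteLatencyMs);
    }

    public static void main(String[] args) {
        // Assumed: ~2 ms average complete latency, 5 s window.
        System.out.println(maxSpoutPending(5000, 2.0)); // prints 2500
    }
}
```

The resulting value would then be set via conf.setMaxSpoutPending(...), as in Daria's topology configuration earlier in this digest.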
Re: Weirdness running topology on multiple nodes
That is odd. I have seen things like this happen when there are DNS configuration issues, but you have even updated /etc/hosts.

* What does /etc/nsswitch.conf have for the hosts entry? This is what mine has: hosts: files dns
  I think that the java resolver code honors this setting, and this will cause it to look at /etc/hosts first for resolution.
* Firewall settings could also cause this. (Pings would work while worker-worker communications might not.)
* Failing that, maybe watch network packets to discover with what the workers are really trying to communicate?

-- Derek
[VOTE] Storm Logo Contest - Round 1
This is a call to vote on selecting the top 3 Storm logos from the 11 entries received. This is the first of two rounds of voting. In the first round the top 3 entries will be selected to move on to the second round, where the winner will be selected. The entries can be viewed on the storm website here: http://storm.incubator.apache.org/blog.html

VOTING

Each person can cast a single vote. A vote consists of 5 points that can be divided among multiple entries. To vote, list the entry number, followed by the number of points assigned. For example:

#1 - 2 pts.
#2 - 1 pt.
#3 - 2 pts.

Votes cast by PPMC members are considered binding, but voting is open to anyone. This vote will be open until Thursday, May 22 11:59 PM UTC.

- Taylor
Re: [VOTE] Storm Logo Contest - Round 1
#10 - 5 pts (Logo Entry No. 10 - Jennifer Lee)

-brian

---
Brian O'Neill
Chief Technology Officer
Health Market Science
The Science of Better Results
2700 Horizon Drive, King of Prussia, PA 19406
M: 215.588.6024
@boneill42 http://www.twitter.com/boneill42
healthmarketscience.com

This information transmitted in this email message is for the intended recipient only and may contain confidential and/or privileged material. If you received this email in error and are not the intended recipient, or the person responsible to deliver it to the intended recipient, please contact the sender at the email above and delete this email and any attachments and destroy any copies thereof. Any review, retransmission, dissemination, copying or other use of, or taking any action in reliance upon, this information by persons or entities other than the intended recipient is strictly prohibited.
RE: [VOTE] Storm Logo Contest - Round 1
#9 - 3 pts.
#3 - 2 pts.

Great work everybody! Tough to pick...
Storm: How to emit top 5 word count every 5 minutes ?
Hi, is it possible to emit the top 5 word count every 5 minutes in the storm word count example? -- Thanks, Regards, Yogesh Panchal
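Yes - the usual approach is a counting bolt driven by Storm's tick tuples: set Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS to 300 on the bolt, accumulate counts per word as word tuples arrive, and on each tick emit the top 5 and reset. Below is a minimal sketch of just the windowed counting logic (plain Java, Storm wiring omitted; the class and method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Windowed top-N counter a bolt could hold: count() on each word tuple,
// flushTopN() when the 5-minute tick tuple arrives.
class TopNCounter {
    private final Map<String, Long> counts = new HashMap<>();

    void count(String word) {
        counts.merge(word, 1L, Long::sum);
    }

    // Return the top n words of the window just ended, then reset the window.
    List<Map.Entry<String, Long>> flushTopN(int n) {
        List<Map.Entry<String, Long>> sorted = new ArrayList<>();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            sorted.add(Map.entry(e.getKey(), e.getValue())); // detach from map
        }
        sorted.sort(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()));
        counts.clear();
        return sorted.subList(0, Math.min(n, sorted.size()));
    }
}
```

In the bolt's execute(), check isTickTuple(tuple) (compare the source component to Constants.SYSTEM_COMPONENT_ID) to decide between count() and flushTopN(5).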
Re: [VOTE] Storm Logo Contest - Round 1
#11 - 5 pts

On Fri, May 16, 2014 at 7:43 AM, Brian O'Neill b...@alumni.brown.edu wrote: #10 - 5 pts (Logo Entry No. 10 - Jennifer Lee) -brian
Re: [VOTE] Storm Logo Contest - Round 1
#6 - 5 pts

Rob Turner.

-- Cheers Rob.
Re: Storm Scaling Issues
Hi, Is your logging set to debug? If you get a lot of log messages that can slow you down. How fast is your network? Do you need fields grouping on your bolt? (It looks like yes, but it can be worthwhile to re-evaluate whether this is the case.) You can try the following:

1) Run with just spouts, check throughput. Judging by your spout this should be very high.
2) Run with your spout and the processing bolt. Comment out the business logic of your bolt to reduce the variables. Check the throughput with fieldsGrouping, shuffleGrouping, and localOrShuffleGrouping. If localOrShuffleGrouping is significantly faster than the other two, then you might be running into some kind of networking bottleneck. Judging by your numbers in the original email, I suspect this to be the case.
3) Add the throughput measuring bolt and check performance through your entire flow.

-Nathan
[Tuple loss] in Storm-0.9.0.1
I am running storm-0.9.0.1 with zmq as the transport layer, and I set zmq.hwm = 2000 to avoid significant memory leaks. I also disabled ackers. I found from the UI that some components executed far fewer tuples than were transferred to them. Let's say a topology looks like this: componentA - componentB. ComponentA transferred 100M tuples while componentB only executed 1M tuples. I am wondering where the remaining tuples have gone. If they were queueing in the transport layer, then the queue should keep increasing, and thus the memory usage, but I didn't see any noticeable memory leaks. Does storm guarantee that all tuples transferred from taskA to taskB will be received by taskB, and moreover, that the tuples are in order? Regards -- == Gvain Email: jh.li...@gmail.com
Re: [VOTE] Storm Logo Contest - Round 1
#10 - 3 Points.
#1 - 1 Point
#2 - 1 Point

Thanks, Brian
Re: Recovering From Zookeeper Failure
Hi Josh, We have been having the same issue for a long time, and the only solution is to restart the whole storm cluster. (Actually I asked the same question on 12 May but got no response.) In the meantime, we are currently evaluating switching to Apache Spark for streaming; you might also have a look.

On Wed, May 14, 2014 at 11:25 PM, Josh Walton jwalton...@gmail.com wrote:

Recently, we have had a couple of power failures for the servers running our zookeeper cluster. When zookeeper dies, the nimbus and supervisor processes eventually die as well. After the zookeeper failure, the only way I have gotten the supervisor processes to start back up is to delete the supervisor and worker directories as specified in the storm.yaml file. Is there a better/cleaner way to restart them? I have also noticed that when I start nimbus and the UI process back up, and navigate to the storm status page, the topologies we had started are still shown as active (even though they are not). This is the exception in the supervisor logs when I try to start them up after the zookeeper failure:

2014-05-14 09:16:03 b.s.event [ERROR] Error when processing event
java.lang.RuntimeException: java.io.EOFException
    at backtype.storm.utils.Utils.deserialize(Utils.java:69) ~[storm-core-0.9.0-rc3.jar:na]
    at backtype.storm.utils.LocalState.snapshot(LocalState.java:28) ~[storm-core-0.9.0-rc3.jar:na]
    at backtype.storm.utils.LocalState.get(LocalState.java:39) ~[storm-core-0.9.0-rc3.jar:na]
    at backtype.storm.daemon.supervisor$sync_processes.invoke(supervisor.clj:187) ~[storm-core-0.9.0-rc3.jar:na]
    at clojure.lang.AFn.applyToHelper(AFn.java:161) [clojure-1.4.0.jar:na]
    at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.4.0.jar:na]
    at clojure.core$apply.invoke(core.clj:603) ~[clojure-1.4.0.jar:na]
    at clojure.core$partial$fn__4070.doInvoke(core.clj:2343) ~[clojure-1.4.0.jar:na]
    at clojure.lang.RestFn.invoke(RestFn.java:397) ~[clojure-1.4.0.jar:na]
    at backtype.storm.event$event_manager$fn__3070.invoke(event.clj:24)
~[storm-core-0.9.0-rc3.jar:na] at clojure.lang.AFn.run(AFn.java:24) [clojure-1.4.0.jar:na] at java.lang.Thread.run(Thread.java:722) [na:1.7.0_21] Caused by: java.io.EOFException: null at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2323) ~[na:1.7.0_21] at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2792) ~[na:1.7.0_21] at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:799) ~[na:1.7.0_21] at java.io.ObjectInputStream.init(ObjectInputStream.java:299) ~[na:1.7.0_21] at backtype.storm.utils.Utils.deserialize(Utils.java:64) ~[storm-core-0.9.0-rc3.jar:na] ... 11 common frames omitted 2014-05-14 09:16:03 b.s.util [INFO] Halting process: (Error when processing an event)
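Josh's workaround (deleting the supervisor and worker directories so the corrupted serialized LocalState is rebuilt on restart) can be sketched as a small shell helper. This is only a sketch of what the thread describes, not an official procedure; the /var/storm path is an assumption and must match the storm.local.dir value in your storm.yaml, and the supervisor should be stopped before running it:

```shell
# Sketch of the workaround Josh describes: after a zookeeper crash leaves
# the supervisor's serialized LocalState unreadable (the EOFException in
# the log above), remove the supervisor/worker state so the supervisor
# rebuilds it on the next start.
clear_supervisor_state() {
    storm_local_dir="$1"   # the storm.local.dir value from your storm.yaml
    rm -rf "$storm_local_dir/supervisor" "$storm_local_dir/workers"
}

# Example (path is an assumption -- use your own storm.local.dir):
# clear_supervisor_state /var/storm
```

The supervisor recreates both directories when it starts, so nothing else needs to be restored by hand; running topologies are reassigned by nimbus.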
RE: Multiple Storm components getting assigned to same worker slot despite of free slots being available
EvenScheduler doesn't seem to do the trick. I've set the scheduler to EvenScheduler, but the two topologies we've got are still being assigned to the same supervisor, when there are 3 possible supervisors to assign to. It's hard to tell what exactly EvenScheduler is doing; is there a specification somewhere of what exactly EvenScheduler and DefaultScheduler do? From: bijoy deb [mailto:bijoy.comput...@gmail.com] Sent: 22 March 2014 05:39 To: user@storm.incubator.apache.org Subject: Re: Multiple Storm components getting assigned to same worker slot despite of free slots being available Thanks Nathan. So, I believe setting storm.scheduler to EvenScheduler, as suggested by Drew, should do the trick? However, I still have one doubt. With reference to my use case, what I was looking for is that each component (spout/bolt instance) should be assigned to a different slot, since I have free slots available; but instead multiple instances are getting assigned to the same slot. Does setting the scheduler to EvenScheduler ensure even distribution of tasks across slots within a single worker machine as well, or does it just ensure even distribution across multiple worker machines? Also, shouldn't this scheduler setting be the default, since we would always like to assign tasks to empty slots first? Thanks Bijoy On Sat, Mar 22, 2014 at 10:57 AM, Nathan Marz nat...@nathanmarz.com wrote: topology.optimize doesn't do anything at the moment. It was something planned for in the early days but turned out to be unnecessary. On Fri, Mar 21, 2014 at 9:00 PM, bijoy deb bijoy.comput...@gmail.com wrote: Thanks Drew. I am going to try those options and see if that helps.
Thanks Bijoy On Fri, Mar 21, 2014 at 10:37 PM, Drew Goya d...@gradientx.com wrote: Take a look at topology.optimize and storm.scheduler. I had the same issue, and I found that setting topology.optimize to false and storm.scheduler to backtype.storm.scheduler.EvenScheduler gave me the even distribution of components I was looking for. On Fri, Mar 21, 2014 at 2:50 AM, bijoy deb bijoy.comput...@gmail.com wrote: Hi, I am running a topology using Storm (version 0.9.1) on a cluster of 3 nodes (3x4=12 slots). My topology has 1 spout (parallelism=2), bolt A (parallelism=2), bolt B (parallelism=1) and bolt C (parallelism=1). The number of tasks (numTasks) for each component is the default (1). The number of workers is set to 5. Given the above scenario, when I submit the topology, I can see 5 slots are used up and 7 are free (out of 12). But still, one instance of the spout and bolt C are going to the same worker slot (e.g. port 6703 of node 1). Shouldn't Storm be ensuring that components are assigned to distinct unused slots as long as there are empty slots available? Or is there some configuration that I have missed or misconfigured here? I have pasted the screenshot of the Storm UI below for reference. Thanks Bijoy -- Twitter: @nathanmarz http://nathanmarz.com/
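The two settings Drew mentions are plain storm.yaml entries. A minimal sketch, assuming only what is stated in this thread (the scheduler class name comes from Drew's message; per Nathan's reply topology.optimize is currently a no-op, but it is included since Drew set it as well):

```yaml
# storm.yaml on the nimbus host -- settings discussed in this thread.
# EvenScheduler distributes executors round-robin across available slots
# instead of packing them with the default scheduler's strategy.
storm.scheduler: "backtype.storm.scheduler.EvenScheduler"

# Reported as a no-op in current versions, but set to false by Drew:
topology.optimize: false
```

Note that storm.scheduler is read by nimbus, so nimbus must be restarted after changing it for the new scheduler to take effect.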
Received commit for different transaction attempt
Could someone help me understand the context under which this error might occur? I'm seeing this pop up in one of our topologies running with Trident on Storm v0.9.0.1 and it is unable to recover once this happens.

java.lang.RuntimeException: backtype.storm.topology.FailedException: Received commit for different transaction attempt
	at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:90)
	at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:61)
	at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:62)
	at backtype.storm.daemon.executor$fn__3498$fn__3510$fn__3557.invoke(executor.clj:730)
	at backtype.storm.util$async_loop$fn__444.invoke(util.clj:403)
	at clojure.lang.AFn.run(AFn.java:24)
	at java.lang.Thread.run(Thread.java:662)
Caused by: backtype.storm.topology.FailedException: Received commit for different transaction attempt
	at storm.trident.spout.TridentSpoutExecutor.execute(TridentSpoutExecutor.java:56)
	at storm.trident.topology.TridentBoltExecutor.execute(TridentBoltExecutor.java:297)
	at backtype.storm.daemon.executor$fn__3498$tuple_action_fn__3500.invoke(executor.clj:615)
	at backtype.storm.daemon.executor$mk_task_receiver$fn__3421.invoke(executor.clj:383)
	at backtype.storm.disruptor$clojure_handler$reify__2962.onEvent(disruptor.clj:43)
	at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:87)
	... 6 more

Thanks, Andrew
Re: [VOTE] Storm Logo Contest - Round 1
#10 - 5 points. On Fri, May 16, 2014 at 1:34 PM, Brian Enochson brian.enoch...@gmail.comwrote: #10 - 3 Points. #1 - 1 Point #2 - 1 Point Thanks, Brian On Thu, May 15, 2014 at 12:28 PM, P. Taylor Goetz ptgo...@gmail.comwrote: This is a call to vote on selecting the top 3 Storm logos from the 11 entries received. This is the first of two rounds of voting. In the first round the top 3 entries will be selected to move onto the second round where the winner will be selected. The entries can be viewed on the storm website here: http://storm.incubator.apache.org/blog.html VOTING Each person can cast a single vote. A vote consists of 5 points that can be divided among multiple entries. To vote, list the entry number, followed by the number of points assigned. For example: #1 - 2 pts. #2 - 1 pt. #3 - 2 pts. Votes cast by PPMC members are considered binding, but voting is open to anyone. This vote will be open until Thursday, May 22 11:59 PM UTC. - Taylor
Re: Kryo
And to add to what Osman said, the upcoming Storm 0.9.2 release will be using Kryo 2.21. There are two main reasons: first, Kryo 2.21 fixes potential data corruption issues in prior Kryo versions; second, updating to 2.21 syncs Storm's Kryo dependency with other nice-to-have libraries for data processing such as Twitter Chill/Bijection.

FYI: Kryo is not tracked as a direct dependency in Storm's pom.xml [1]. Instead it is pulled in as a transitive dependency of Carbonite [2], a Clojure library for working with Kryo, and Carbonite 1.4.0 requires Kryo 2.21 [3]. Here are the relevant snippets from Storm's pom.xml:

    <carbonite.version>1.4.0</carbonite.version>

    <dependency>
        <groupId>com.twitter</groupId>
        <artifactId>carbonite</artifactId>
        <version>${carbonite.version}</version>
    </dependency>

You can also run $ mvn dependency:tree in the top-level directory of the git repository to generate the dependency tree of Storm. (You may need to run `mvn install` first, otherwise e.g. storm-starter will complain about not finding 0.9.2-SNAPSHOT jars.)

Best, Michael

[1] https://github.com/apache/incubator-storm/blob/master/pom.xml
[2] https://github.com/sritchie/carbonite
[3] https://github.com/sritchie/carbonite/blob/1.4.0/project.clj#L8

On 01/29/2014 12:36 PM, Osman wrote: 0.9.0.1 is using kryo/2.17 http://mvnrepository.com/artifact/com.esotericsoftware.kryo/kryo/2.17 On 29 January 2014 11:24, Klausen Schaefersinho klaus.schaef...@gmail.com wrote: Hi, which version of kryo is used in Storm? I have a dependency which also uses kryo and thus I have some runtime issues! I was looking into the pom.xml but could not find it. Cheers, Klaus
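For a version conflict like the one Klaus describes, one common Maven approach (a sketch, not something from this thread) is to declare Kryo directly in your own pom.xml: Maven's "nearest definition wins" dependency mediation then prefers your declaration over the copy pulled in transitively via Carbonite. The version shown matches the Kryo 2.17 Osman mentions for Storm 0.9.0.1; adjust it to whatever your code actually needs:

```xml
<!-- In your application's pom.xml: a direct declaration is "nearer" than
     the transitive kryo from carbonite, so Maven mediates to this version. -->
<dependency>
    <groupId>com.esotericsoftware.kryo</groupId>
    <artifactId>kryo</artifactId>
    <version>2.17</version>
</dependency>

<!-- To confirm where kryo enters your tree, filter dependency:tree:
     mvn dependency:tree -Dincludes=com.esotericsoftware.kryo:kryo -->
```

Since all workers share one Kryo on the classpath, the safest choice is usually to align your own dependency with Storm's version rather than the other way around.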
Re: [VOTE] Storm Logo Contest - Round 1
#10 - 5 points On Sat, May 17, 2014 at 6:18 AM, Jason Jackson jasonj...@gmail.com wrote: #10 - 5 points. On Fri, May 16, 2014 at 1:34 PM, Brian Enochson brian.enoch...@gmail.com wrote: #10 - 3 Points. #1 - 1 Point #2 - 1 Point Thanks, Brian
Storm IPv6 supported when using hostnames?
Hi, The storm website states that Storm does not support IPv6 ( http://storm.incubator.apache.org/documentation/Troubleshooting.html). However, I got the impression from other sources ( https://issues.apache.org/jira/browse/STORM-192?jql=project%20%3D%20STORM%20AND%20text%20~%20ipv6 and https://groups.google.com/d/msg/storm-user/Yp7RFlXbG0Q/rONDnQzCq2cJ) that it might work by using hostnames (depending on DNS resolution). Is this correct? - Gerrit
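Not a direct answer on whether hostnames make IPv6 work, but the Troubleshooting page Gerrit links recommends the opposite workaround on dual-stack hosts: forcing the JVMs onto the IPv4 stack via the standard java.net.preferIPv4Stack property. A storm.yaml sketch (which daemons need the flag in a given cluster is environment-specific):

```yaml
# storm.yaml -- pin the Storm JVMs to the IPv4 stack, the workaround the
# troubleshooting docs suggest for hosts with IPv6 interfaces.
# java.net.preferIPv4Stack is a standard JVM property, not Storm-specific.
supervisor.childopts: "-Djava.net.preferIPv4Stack=true"
worker.childopts: "-Djava.net.preferIPv4Stack=true"
```

If you do experiment with IPv6 via hostnames, make sure every node resolves the others consistently (same address family from every host), since mixed resolution is a likely source of the failures reported in the linked threads.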