Re: Storm installation

2014-10-21 Thread Yuheng Du
Hi Harsha, Thank you. The tarball I downloaded was corrupted and the log4j-over-slf4j-1.6.6.jar file was missing. I downloaded it again and now the Storm UI runs fine. Thanks again for your help. Do I have to use Kafka to read data into Storm? I have a bunch of RabbitMQ queues set up and running and i

Re: Storm installation

2014-10-21 Thread Harsha
Yuheng, regarding Hortonworks Sandbox 2.1, I am able to start the cluster without any issues. When you log in as root, do a su - storm. From /usr/lib/storm you can run ./bin/storm nimbus (for nimbus), ./bin/storm supervisor (for supervisor), and ./bin/storm ui (for the UI); the UI starts on port 8744. Regarding apa

Re: Storm Cluster on Linux and Windows

2014-10-21 Thread Harsha
From the logs you posted, it seems that the supervisor running on the Windows machine is not able to connect to your nimbus. Make sure you configured storm.yaml with the right nimbus.host and nimbus.thrift.port, which should be whatever you used on the Linux box, and also the ZooKeeper host. In case
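A quick way to check the advice above from the supervisor machine is to test whether a plain TCP connection to nimbus opens at all. The following is a minimal sketch, not part of Storm: the class name NimbusReachability is made up for illustration, and the fallback port 6627 assumes the usual default for nimbus.thrift.port. Pass the values from your own storm.yaml.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not a Storm class: checks raw TCP reachability only.
class NimbusReachability {
    /** Returns true if a TCP connection to host:port opens within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed fallbacks; use nimbus.host and nimbus.thrift.port from storm.yaml.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 6627;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 2000));
    }
}
```

If this reports the port as unreachable while ping succeeds, a firewall rule or a mismatched port in storm.yaml is a more likely cause than a Storm problem. A successful connection only shows the port is open, not that the Thrift service behind it is healthy.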

Re: Storm Cluster on Linux and Windows

2014-10-21 Thread jova
Yes, I can ping my Linux box from Windows. I can also bring up the Storm UI from my Windows box. I am running two different OSes because I have Windows 8 on one machine and it is not easy to configure a dual boot (Linux/Windows) anymore (from what I read). On my Linux box I already configured everythi

Re: Storm installation

2014-10-21 Thread Yuheng Du
The system I am using is Ubuntu 14. Will that be the problem? On Tue, Oct 21, 2014 at 11:44 PM, Yuheng Du wrote: > When I run ./storm nimbus, It shows: > > root@yuheng-laptop:~/software/apache-storm-0.9.2-incubating/bin# ./storm > nimbus > output > > Reported exception: > java.lang.NoClassDefFou

Re: Storm installation

2014-10-21 Thread Yuheng Du
When I run ./storm nimbus, It shows: root@yuheng-laptop:~/software/apache-storm-0.9.2-incubating/bin# ./storm nimbus > output Reported exception: java.lang.NoClassDefFoundError: ch/qos/logback/core/joran/spi/JoranException at org.slf4j.LoggerFactory.bind(LoggerFactory.java:128) at org.slf4j.Logge

Re: Storm Cluster on Linux and Windows

2014-10-21 Thread Harsha
Are you able to ping your Linux host from Windows? Any specific reason to run a distributed cluster on different OSes? Thanks, Harsha On Tue, Oct 21, 2014, at 08:21 PM, jova wrote: > Has anyone successfully run a storm cluster with Linux and Windows > boxes? I am attempting this, by running nimb

Storm Cluster on Linux and Windows

2014-10-21 Thread jova
Has anyone successfully run a storm cluster with Linux and Windows boxes? I am attempting this, by running nimbus on a linux box and have a supervisor on a windows box. Everything seems fine until I submit a topology jar and the supervisor node attempts to download the topology jar. Here is the er

Re: DRPC problem

2014-10-21 Thread Junfeng Chen
Hi, thanks. I found the following sentences: “The first bolt you declare will take in as input 2-tuples, where the first field is the request id and the second field is the arguments for that request. LinearDRPCTopologyBuilder expects the last bolt to emit an output stream containing 2-tuples o

Re: Storm installation

2014-10-21 Thread Yuheng Du
Harsha, The sandbox version I installed is 2.1. The storm-ui won't launch in sandbox. I have followed the instructions provided by saiprasad, but when I run storm ui, it gives me the following errors: Failed to instantiate SLF4J LoggerFactory Reported exception: java.lang.NoClassDefFoundError: c

Re: Storm installation

2014-10-21 Thread Telles Nobrega
I'm having an issue when trying to install it with a script. The whole thing is setup, but the workers never start. I don't have it here so I can't show it to you, but tomorrow I will try without zeromq and see what happens. On Tue, Oct 21, 2014 at 10:20 PM, Telles Nobrega wrote: > Thats good. I

Re: Storm installation

2014-10-21 Thread Telles Nobrega
That's good. I installed Storm 0.8.2 and it needed ZeroMQ, if I'm not mistaken. Good to know this is not necessary anymore. On Tue, Oct 21, 2014 at 10:16 PM, Harsha wrote: > Yuheng, > what was the issue with sandbox. Do you know which version was it. > -Harsha > > > On Tue, Oct 21, 2014,

Re: Storm installation

2014-10-21 Thread Harsha
Yuheng, what was the issue with the sandbox? Do you know which version it was? -Harsha On Tue, Oct 21, 2014, at 06:10 PM, Harsha wrote: Storm doesn't have any dependency on zeromq. It uses netty by default, but zeromq can be configured if the user prefers. -Harsha On Tue, Oct 21, 201

Re: Storm installation

2014-10-21 Thread Harsha
Storm doesn't have any dependency on zeromq. It uses netty by default, but zeromq can be configured if the user prefers. -Harsha On Tue, Oct 21, 2014, at 06:04 PM, Telles Nobrega wrote: It is not necessary to install ZeroMQ and JZMQ anymore? On Tue, Oct 21, 2014 at 6:32 PM, Yuheng Du <yuhe

Re: Storm installation

2014-10-21 Thread Telles Nobrega
Is it no longer necessary to install ZeroMQ and JZMQ? On Tue, Oct 21, 2014 at 6:32 PM, Yuheng Du wrote: > Hi Saiprasad, > > Thank you, that's helpful!! I am installing storm according to your guide > now. > > best, > Yuheng > > On Tue, Oct 21, 2014 at 5:24 PM, saiprasad mishra < > saiprasadmis.

Re: Testing full storm topologies with non-serializable mocks

2014-10-21 Thread John Reilly
Exactly On Tue, Oct 21, 2014 at 4:55 PM, Stephen Armstrong < stephen.armstr...@linqia.com> wrote: > Ok, so I'm guessing for tests you just override the prepare() method to > create a test injector instead. > > Thanks > Steve > > On Tue, Oct 21, 2014 at 3:57 PM, John Reilly wrote: > >> Maybe I sh

Re: Bolt using the new Hive/HCatalog Streaming API

2014-10-21 Thread Harsha
Geovani, I have a bolt/Trident implementation and I am in the process of putting up a GitHub PR. I will post it here once it's up. Thanks, Harsha On Tue, Oct 21, 2014, at 03:46 PM, Luiz Geovani Vier wrote: Hi Storm users, Does Storm provide any helpers for using the new Hive Streaming A

Re: Testing full storm topologies with non-serializable mocks

2014-10-21 Thread Stephen Armstrong
Ok, so I'm guessing for tests you just override the prepare() method to create a test injector instead. Thanks Steve On Tue, Oct 21, 2014 at 3:57 PM, John Reilly wrote: > Maybe I should give an example: > > class MyBolt extends BaseRichBolt { > > @transient var injector: XYZModule = null > >

Re: storm rebalancing losing data

2014-10-21 Thread Manoj Jaiswal
Thanks Yair. It's not stateful processing. But one set of messages, which flows once an hour, defines the queries per account number. The high-volume messages flow all the time and are processed in the bolt. So it is stateful for the first set of messages and stateless for the second set. I will

Re: Testing full storm topologies with non-serializable mocks

2014-10-21 Thread John Reilly
Maybe I should give an example: class MyBolt extends BaseRichBolt { @transient var injector: XYZModule = null override def prepare(stormConf: util.Map[_, _], context: TopologyContext, collector: OutputCollector): Unit = { val theConfig: Config = getConfigFromStormConfig(stormConf)
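The pattern in John's Scala example can be sketched in plain Java without any Storm dependency. This is only an illustration under stated assumptions: SketchBolt and Injector are made-up stand-ins (not BaseRichBolt or a real DI module), and Java serialization is used here to mimic a bolt instance being serialized at submit time and prepared later on the worker.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientInjectorSketch {
    /** Hypothetical stand-in for a DI module or mock that is NOT Serializable. */
    static class Injector {
        String greet() { return "hello"; }
    }

    /** Hypothetical stand-in for a bolt: serialized first, prepared afterwards. */
    static class SketchBolt implements Serializable {
        private static final long serialVersionUID = 1L;
        transient Injector injector; // skipped by Java serialization

        /** Mirrors the idea of building dependencies inside prepare(). */
        void prepare() { injector = new Injector(); }
    }

    /** Serializes and deserializes the bolt, as a submit-and-deploy round trip would. */
    static SketchBolt roundTrip(SketchBolt bolt) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(bolt);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (SketchBolt) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        SketchBolt copy = roundTrip(new SketchBolt());
        System.out.println("injector set before prepare: " + (copy.injector != null));
        copy.prepare();
        System.out.println("greeting after prepare: " + copy.injector.greet());
    }
}
```

Because the field is transient and only assigned in prepare(), the non-serializable dependency never has to cross the serialization boundary. A test can subclass the bolt and override prepare() to assign a test double instead.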

Re: Testing full storm topologies with non-serializable mocks

2014-10-21 Thread John Reilly
The injector is created inside the prepare method. On Tue, Oct 21, 2014 at 1:44 PM, Stephen Armstrong < stephen.armstr...@linqia.com> wrote: > I'm not understanding something here: > > If the bolt is pulling its dependencies from Guice inside its prepare() > method, where does it get the injecto

Bolt using the new Hive/HCatalog Streaming API

2014-10-21 Thread Luiz Geovani Vier
Hi Storm users, Does Storm provide any helpers for using the new Hive Streaming API ( HiveEndPoint/TransactionBatch) or should I just create a Bolt from scratch? Sorry if that's a stupid question, I couldn't find anything in the docs and I don't want to reinvent the wheel. :-) I'm using Storm 0.9

What happens when a tuple times out?

2014-10-21 Thread Sam Mati
Hi all. First time playing around with Storm. I've set up a dummy topology to see how timeouts, ack, and fail work. I have a spout that emits random words (it chooses a word that has previously failed, and if there are none, a random word). I have a bolt that, depending on Random.nextBoolean(
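The word-selection rule Sam describes (replay a previously failed word if there is one, otherwise pick a random word) could look roughly like the sketch below. It has no Storm dependency; ReplayFirstWordSource and its method names are hypothetical. In a real spout, markFailed would be driven from the spout's fail callback and nextWord from nextTuple.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Random;

// Hypothetical helper illustrating the selection rule; not a Storm class.
class ReplayFirstWordSource {
    private final Deque<String> failed = new ArrayDeque<>();
    private final String[] words;
    private final Random random;

    ReplayFirstWordSource(String[] words, Random random) {
        this.words = words.clone();
        this.random = random;
    }

    /** Records a word that was reported as failed (or timed out). */
    void markFailed(String word) {
        failed.addLast(word);
    }

    /** Failed words are replayed first, oldest first; otherwise a random word is chosen. */
    String nextWord() {
        String replay = failed.pollFirst();
        return replay != null ? replay : words[random.nextInt(words.length)];
    }

    public static void main(String[] args) {
        ReplayFirstWordSource source =
            new ReplayFirstWordSource(new String[] {"storm", "kafka", "bolt"}, new Random());
        source.markFailed("spout");
        System.out.println(source.nextWord()); // the failed word is replayed first
        System.out.println(source.nextWord()); // then a random word
    }
}
```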

Re: storm rebalancing losing data

2014-10-21 Thread Yair Weinberger
You can save the state per partition (I am not sure what is your partition key, but you should be able to use it as a key to a key-value storage). Then, when you receive a message from topic C, you should check if you have the appropriate state for the partition of this message, and if not grab it
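Yair's suggestion (keep state per partition key in a key-value store, and fetch it on demand when a message arrives for a partition whose state is not yet loaded) can be sketched as a small cache. Everything here is an assumption for illustration: PartitionStateCache is a made-up name, the partition key is modeled as a String, and the loader function stands in for whatever external store is actually used.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: lazily loads per-partition state from an external store.
class PartitionStateCache<S> {
    private final Map<String, S> local = new HashMap<>();
    private final Function<String, S> loader; // stand-in for a key-value store lookup
    private int loads = 0;

    PartitionStateCache(Function<String, S> loader) {
        this.loader = loader;
    }

    /** Returns the state for a partition, fetching it from the store on first use. */
    S stateFor(String partitionKey) {
        S state = local.get(partitionKey);
        if (state == null) {
            state = loader.apply(partitionKey);
            loads++;
            if (state != null) {
                local.put(partitionKey, state);
            }
        }
        return state;
    }

    /** Number of times the external store was consulted. */
    int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put("account-1", "queries for account-1");
        PartitionStateCache<String> cache = new PartitionStateCache<>(store::get);
        System.out.println(cache.stateFor("account-1"));
        System.out.println(cache.stateFor("account-1"));
        System.out.println("store lookups: " + cache.loadCount());
    }
}
```

With this shape, a restarted bolt starts with an empty local map and simply reloads state the first time each partition's messages arrive, so it does not matter which worker ends up handling which partition.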

Re: storm rebalancing losing data

2014-10-21 Thread Manoj Jaiswal
Thanks Yair, The messages picked up from topic A create the queries. These messages are partitioned and so are the real time messages from topic C. If I persist the state, then how do I get the same partitioned data? The partitioning of data is dynamic and based on the worker nodes alive, isn't it? S

Re: spout getting stuck..

2014-10-21 Thread saiprasad mishra
Wondering if there is a message size issue which is blocking it to transfer the data either from kafka to spout or any of the below params from spout to bolts config.put(Config.TOPOLOGY_RECEIVER_BUFFER_SIZE,32); config.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE,64); config.

Re: spout getting stuck..

2014-10-21 Thread Manoj Jaiswal
Vladi, check the limit on open files on the Kafka and supervisor servers. Also check if there is any IO overhead. The processing may completely stop if there are too many files waiting to be written. Run the top command. -Manoj On Tue, Oct 21, 2014 at 2:08 PM, Vladi Feigin wrote: > One of two wor

Re: storm rebalancing losing data

2014-10-21 Thread Yair Weinberger
Hi, It sounds like your bolt actually has a state (initialized by the messages picked up from topic A). When restarting the bolt in case of failover, Storm does not provide any inherent mechanism to keep the bolt's previous state. In my opinion, your best option would be to move to Trident, which p

Re: Storm installation

2014-10-21 Thread Yuheng Du
Hi Saiprasad, Thank you, that's helpful!! I am installing storm according to your guide now. best, Yuheng On Tue, Oct 21, 2014 at 5:24 PM, saiprasad mishra wrote: > I just created one quickly if you need to setup as quick cluster manually > without hortonworks sandbox > > https://gist.github.co

Re: Storm installation

2014-10-21 Thread saiprasad mishra
I just created one quickly, in case you need to set up a quick cluster manually without the Hortonworks sandbox: https://gist.github.com/saiprasadmishra/a50c730b67334c05f1e1 On Tue, Oct 21, 2014 at 1:58 PM, Yuheng Du wrote: > Hi everyone, > > I am a newbie to Storm. I tried to set up the stand-alone env

Re: spout getting stuck..

2014-10-21 Thread Vladi Feigin
One of two workers doesn't read the data from Kafka at all. In this worker, all succeeding bolts show 0 in emitted / transferred in the UI. This happens after a few hours of successful running. We don't use acks in this topology. Vladi On Tue, Oct 21, 2014 at 11:55 PM, saiprasad mishra < saiprasadmis..

Re: spout getting stuck..

2014-10-21 Thread saiprasad mishra
Just some more info: this was evident from the complete latency metric in the Storm UI, which was showing 0 for me. I am on Storm 0.9.2-incubating. Regards Sai On Tue, Oct 21, 2014 at 1:55 PM, saiprasad mishra wrote: > Is the topology not reading from kafka at all and not marking the offset > at all. > Re

Storm installation

2014-10-21 Thread Yuheng Du
Hi everyone, I am a newbie to Storm. I tried to set up the stand-alone environment on my desktop. I have installed the Hortonworks sandbox, but it gives me some errors when running storm-ui. Does anyone use the Hortonworks Sandbox? Where can I find a convenient installation guide? Thanks. best,

Re: spout getting stuck..

2014-10-21 Thread saiprasad mishra
Is the topology not reading from Kafka at all and not marking the offset at all? Recently I ran into a similar issue, which was caused by buggy code in my topology, where the ack was happening twice instead of once: once in the try block and once in the finally block inside the execute method. This wa
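The double-ack bug described above is easy to reproduce in isolation. The sketch below has no Storm dependency: CountingCollector is a hypothetical stand-in for the real output collector, tuples are modeled as String ids, and process is a placeholder. It only shows how acking in both the try block and the finally block yields two acks on the happy path.

```java
import java.util.HashMap;
import java.util.Map;

class DoubleAckSketch {
    /** Hypothetical stand-in for a collector; counts acks per tuple id. */
    static class CountingCollector {
        final Map<String, Integer> acks = new HashMap<>();

        void ack(String tupleId) {
            acks.merge(tupleId, 1, Integer::sum);
        }
    }

    /** Buggy shape: the happy path acks in try, then finally acks again. */
    static void executeBuggy(String tupleId, CountingCollector collector) {
        try {
            process(tupleId);
            collector.ack(tupleId);
        } finally {
            collector.ack(tupleId);
        }
    }

    /** One possible fix: ack in exactly one place. */
    static void executeFixed(String tupleId, CountingCollector collector) {
        try {
            process(tupleId);
        } finally {
            collector.ack(tupleId);
        }
    }

    /** Placeholder for the bolt's real work. */
    static void process(String tupleId) {
    }

    public static void main(String[] args) {
        CountingCollector collector = new CountingCollector();
        executeBuggy("t1", collector);
        executeFixed("t2", collector);
        System.out.println("acks per tuple: " + collector.acks);
    }
}
```

Whether a tuple should be acked or failed when processing throws is a separate design decision; the fixed variant here keeps the original intent of always acking and only removes the duplicate call.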

spout getting stuck..

2014-10-21 Thread Vladi Feigin
Hi all, We're experiencing very strange topology behavior: after a few days, sometimes hours (it looks like during peak load), the spouts get stuck. The data stops streaming and we lose a lot of data. We read the data from Kafka (using the Kafka spout). Storm version 0.8.2. Does someone have something s

Re: Testing full storm topologies with non-serializable mocks

2014-10-21 Thread Stephen Armstrong
I'm not understanding something here: If the bolt is pulling its dependencies from Guice inside its prepare() method, where does it get the injector? If it gets it from the constructor, then the serialization issue still happens. If it gets it from a static variable, then in the production enviro

Re: Testing full storm topologies with non-serializable mocks

2014-10-21 Thread John Reilly
To avoid problems like this, I use a dependency injection system which is initialized in the prepare method of the bolts. In my case, I use macwire (in scala), but you should be able to use spring, guice or any other di system to achieve the same. Cheers, John On Tue, Oct 21, 2014 at 10:10 AM, S

Storm UI best practices

2014-10-21 Thread William Oberman
Hello, Right now I'm viewing the Storm UI via a SSH tunnel. I tried to add in the log viewer UI, but this uses names/IPs of the internal cluster in the links (which obviously doesn't work). What is the best practice here? For Hadoop, I just use lynx local to the machine. But, Storm's UI doesn'

Testing full storm topologies with non-serializable mocks

2014-10-21 Thread Stephen Armstrong
Hello all, I've got a few topologies running, and have unit tests for each bolt/spout in isolation that mock out the edges of the tests (Tuples and OutputCollectors), but I want to have a full integration test. I setup local mode using the following function: public void runTopology(StormTopo

Re: Help // Installing storm on ubuntu

2014-10-21 Thread Dimitris Chachlakis
I finally managed to execute the topology I made in local mode. However, I found the links you sent me very useful and I am going to spend some time exploring them. Once again, thanks a lot for helping me out, and I hope the next email I send to this list will be about something more advanced.

Re: DRPC problem

2014-10-21 Thread Kang Xiao
hi Junfeng, http://storm.apache.org/documentation/Distributed-RPC.html is useful for you to understand how DRPC works. LocalDRPC is generated from clj code https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/LocalDRPC.clj . On Mon, Oct 20, 2014 at 2:14 PM, Junfeng Chen

Re: Hadoop and HBase configuration files location (inside the topology jar?)

2014-10-21 Thread Paul Poulosky
It’s part of the config that you pass in to StormSubmitter when you launch the topology. https://github.com/apache/storm/blob/master/storm-core/src/jvm/backtype/storm/Config.java It’s a hashmap mapping strings to Objects (most of which are strings, some of which are lists) that allow you to co
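Since the topology config is described above as a map from strings to objects, one way to hand a setting to a bolt is to put it under a custom key at submit time and read it back from the map the bolt receives in prepare. The sketch below uses a plain java.util.Map rather than Storm's Config class so that it runs on its own; the key name example.hbase.conf.dir and both helper methods are made up for illustration and are not Storm-defined settings.

```java
import java.util.HashMap;
import java.util.Map;

class TopologyConfigSketch {
    // Hypothetical custom key; not a Storm-defined setting.
    static final String HBASE_CONF_DIR_KEY = "example.hbase.conf.dir";

    /** Builds the map that would be passed along when submitting the topology. */
    static Map<String, Object> buildConf(String hbaseConfDir) {
        Map<String, Object> conf = new HashMap<>();
        conf.put(HBASE_CONF_DIR_KEY, hbaseConfDir);
        return conf;
    }

    /** Reads the value back, as a bolt could do with the map it gets in prepare(). */
    static String hbaseConfDir(Map<?, ?> stormConf) {
        Object value = stormConf.get(HBASE_CONF_DIR_KEY);
        if (value == null) {
            throw new IllegalStateException("missing " + HBASE_CONF_DIR_KEY);
        }
        return value.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> conf = buildConf("/etc/hbase/conf"); // example path only
        System.out.println("bolt would read: " + hbaseConfDir(conf));
    }
}
```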

RE: Help // Installing storm on ubuntu

2014-10-21 Thread Babu, Prashanth
Hi Dimitri, I have not read Storm Cookbook. So, I am not sure about the code or instructions in that book. Were you able to execute your topology after you fixed the errors? If not, alternatively, you can try checking this sample project [1] after running thru the instructions in the README. And