Re: Storm is unable to connect to the network or localhost

2015-04-16 Thread Shivendra Singh
Having ip mappings in /etc/hosts is often a remedy for such issues. Maybe it 
can help.

From: Nathan Leung mailto:ncle...@gmail.com>>
Reply-To: "user@storm.apache.org" 
mailto:user@storm.apache.org>>
Date: Thursday, April 16, 2015 at 3:30 PM
To: user mailto:user@storm.apache.org>>
Subject: Re: Storm is unable to connect to the network or localhost

The storm process is run by the same user that the supervisor is running as.

On Thu, Apr 16, 2015 at 6:24 PM, Zotti, Ryan J. 
mailto:ryan.zo...@capitalone.com>> wrote:
I've written a topology in Kafka and Storm. The topology works great until I 
try to make a Bolt that tries to either connect to the internet (via a restful 
API) or localhost (MongoDB) -- Storm is not able to connect to either. Does 
anyone know why this might happen? Is Kafka interfering somehow? For what it's 
worth, I'm running Kafka and Storm inside a Docker container that is itself run 
inside a server protected by a proxy server with a strong firewall. What makes 
this tough to debug is that I'm able to connect to localhost and I'm also able 
to run the restful API inside the Docker container and firewall when I'm not 
using Storm. So I'm guessing either Kafka or Storm is causing the issue. When 
Storm runs, is the process handed off to root? I have proxy settings that are 
only configured for myself (not root), so it’s possible that if the Storm 
process is handed off to some ID other than my own then my proxy settings would 
no longer apply, although this shouldn’t affect a connection to localhost.

Below is my python Bolt code if anyone is interested.

import storm
import urllib2

class TestBolt(storm.BasicBolt):

def process(self, tup):
throw_away_input = str(tup.values[0]).strip()
response = urllib2.urlopen('http://python.org/')
html = response.read()
storm.emit([html])

TestBolt().run()

Thanks,
Ryan

Ryan Zotti
Software Engineer, Enterprise Data Services
804.393.1656
ryan.zo...@capitalone.com

[cid:image001.jpg@01CFA4FA.31511390]




The information contained in this e-mail is confidential and/or proprietary to 
Capital One and/or its affiliates. The information transmitted herewith is 
intended only for use by the individual or entity to which it is addressed.  If 
the reader of this message is not the intended recipient, you are hereby 
notified that any review, retransmission, dissemination, distribution, copying 
or other use of, or taking of any action in reliance upon this information is 
strictly prohibited. If you have received this communication in error, please 
contact the sender and delete the material from your computer.



Re: Storm is unable to connect to the network or localhost

2015-04-16 Thread Nathan Leung
The storm process is run by the same user that the supervisor is running as.

On Thu, Apr 16, 2015 at 6:24 PM, Zotti, Ryan J. 
wrote:

> I've written a topology in Kafka and Storm. The topology works great until
> I try to make a Bolt that tries to either connect to the internet (via a
> restful API) or localhost (MongoDB) -- Storm is not able to connect to
> either. Does anyone know why this might happen? Is Kafka interfering
> somehow? For what it's worth, I'm running Kafka and Storm inside a Docker
> container that is itself run inside a server protected by a proxy server
> with a strong firewall. What makes this tough to debug is that I'm able to
> connect to localhost and I'm also able to run the restful API inside the
> Docker container and firewall when I'm not using Storm. So I'm guessing
> either Kafka or Storm is causing the issue. When Storm runs, is the process
> handed off to root? I have proxy settings that are only configured for
> myself (not root), so it’s possible that if the Storm process is handed off
> to some ID other than my own then my proxy settings would no longer apply,
> although this shouldn’t affect a connection to localhost.
>
>
>
> Below is my python Bolt code if anyone is interested.
>
>
>
> import storm
>
> import urllib2
>
>
>
> class TestBolt(storm.BasicBolt):
>
>
>
> def process(self, tup):
>
> throw_away_input = str(tup.values[0]).strip()
>
> response = urllib2.urlopen('http://python.org/')
>
> html = response.read()
>
> storm.emit([html])
>
>
>
> TestBolt().run()
>
>
>
> Thanks,
>
> Ryan
>
>
>
> *Ryan Zotti*
> *Software Engineer, Enterprise Data Services*
>
> 804.393.1656
> ryan.zo...@capitalone.com
>
>
>
> [image: cid:image001.jpg@01CFA4FA.31511390]
>
>
>
> --
>
> The information contained in this e-mail is confidential and/or
> proprietary to Capital One and/or its affiliates. The information
> transmitted herewith is intended only for use by the individual or entity
> to which it is addressed.  If the reader of this message is not the
> intended recipient, you are hereby notified that any review,
> retransmission, dissemination, distribution, copying or other use of, or
> taking of any action in reliance upon this information is strictly
> prohibited. If you have received this communication in error, please
> contact the sender and delete the material from your computer.
>


RE: Storm is unable to connect to the network or localhost

2015-04-16 Thread Zotti, Ryan J.
I've written a topology in Kafka and Storm. The topology works great until I 
try to make a Bolt that tries to either connect to the internet (via a restful 
API) or localhost (MongoDB) -- Storm is not able to connect to either. Does 
anyone know why this might happen? Is Kafka interfering somehow? For what it's 
worth, I'm running Kafka and Storm inside a Docker container that is itself run 
inside a server protected by a proxy server with a strong firewall. What makes 
this tough to debug is that I'm able to connect to localhost and I'm also able 
to run the restful API inside the Docker container and firewall when I'm not 
using Storm. So I'm guessing either Kafka or Storm is causing the issue. When 
Storm runs, is the process handed off to root? I have proxy settings that are 
only configured for myself (not root), so it's possible that if the Storm 
process is handed off to some ID other than my own then my proxy settings would 
no longer apply, although this shouldn't affect a connection to localhost.

Below is my python Bolt code if anyone is interested.

import storm
import urllib2

class TestBolt(storm.BasicBolt):

def process(self, tup):
throw_away_input = str(tup.values[0]).strip()
response = urllib2.urlopen('http://python.org/')
html = response.read()
storm.emit([html])

TestBolt().run()

Thanks,
Ryan

Ryan Zotti
Software Engineer, Enterprise Data Services
804.393.1656
ryan.zo...@capitalone.com

[cid:image001.jpg@01D07872.88A34270]



The information contained in this e-mail is confidential and/or proprietary to 
Capital One and/or its affiliates. The information transmitted herewith is 
intended only for use by the individual or entity to which it is addressed.  If 
the reader of this message is not the intended recipient, you are hereby 
notified that any review, retransmission, dissemination, distribution, copying 
or other use of, or taking of any action in reliance upon this information is 
strictly prohibited. If you have received this communication in error, please 
contact the sender and delete the material from your computer.


Re: Workaround until https://issues.apache.org/jira/browse/STORM-573 is fixed in a release

2015-04-16 Thread 임정택
You can refer https://github.com/apache/storm/pull/308 and below code.

(def TEST-TIMEOUT-MS
  (let [timeout (System/getenv "STORM_TEST_TIMEOUT_MS")]
(parse-int (if timeout timeout "5000"

Cause of some reasons, it relies system environment variable, not JVM
property.


2015-04-17 2:33 GMT+09:00 Mark Tomko :

> Setting the property using System.setProperty does not work, but setting
> the property in the shell does seem to.
>
> One thing that I notice is that if a worker process dies, it kills the
> entire JVM. I'm running my tests from SBT at the moment, and when the
> exception is thrown, I'm getting dropped back to the shell, even with a
> try/catch block around the call to
> Testing.withSimulatedTimeLocalCluster(...).
>
> We could/should add some additional exception handling to the actual
> components, but I'm not sure it makes sense to kill the entire JVM just for
> an exception.
>
> Thanks!
> Mark
>
> On Thu, Apr 16, 2015 at 9:00 AM, 임정택  wrote:
>
>> Hello.
>>
>> You can try set system environment "STORM_TEST_TIMEOUT_MS" and run your
>> test.
>> It'll bind to TEST_TIMEOUT_MS and it's default timeout value of
>> complete-topology.
>>
>> Hope this helps.
>>
>> Regards.
>> Jungtaek Lim (HeartSaVioR)
>>
>> 2015-04-16 1:08 GMT+09:00 Mark Tomko :
>>
>>> Hi,
>>>
>>> I'm working on some tests for one of my Storm topologies, and I'm
>>> consistently seeing topology timeouts from the testing framework, and it
>>> looks like it's only waiting 5 seconds, which is pretty aggressive in my
>>> opinion:
>>>
>>> Error in cluster
>>> java.lang.AssertionError: Test timed out (5000ms)
>>> at backtype.storm.testing$complete_topology.doInvoke(testing.clj:489)
>>> ~[storm-core-0.9.3.jar:0.9.3]
>>> at clojure.lang.RestFn.invoke(RestFn.java:826) ~[clojure-1.5.1.jar:na]
>>> at backtype.storm.testing4j$_completeTopology.invoke(testing4j.clj:61)
>>> ~[storm-core-0.9.3.jar:0.9.3]
>>> at backtype.storm.Testing.completeTopology(Unknown Source)
>>> [storm-core-0.9.3.jar:0.9.3]
>>>
>>>
>>> I know from experience that this topology runs pretty well on our remote
>>> cluster, so I don't think these timeouts are necessarily and indication of
>>> something bad.
>>>
>>> Looking around, I found this JIRA issue:
>>>
>>> https://issues.apache.org/jira/browse/STORM-573
>>>
>>> Which was fixed ~5 months ago, but didn't end up in Storm 0.9.4. Does
>>> anyone have any suggestions for a testing workaround in the meantime?
>>> Before I'd discovered the testing framework, I'd been just been spinning up
>>> a local cluster myself as part of the test, and I might just go back to
>>> doing that, but it would be great to use the officially supported testing
>>> mechanisms.
>>>
>>> Thanks,
>>>
>>> Mark
>>>
>>>
>>
>>
>> --
>> Name : 임 정택
>> Blog : http://www.heartsavior.net / http://dev.heartsavior.net
>> Twitter : http://twitter.com/heartsavior
>> LinkedIn : http://www.linkedin.com/in/heartsavior
>>
>
>


-- 
Name : 임 정택
Blog : http://www.heartsavior.net / http://dev.heartsavior.net
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior


Concurrent requests to DRPC

2015-04-16 Thread Douglas Alan
Hi. We want to make a Storm topology than handles DRPC requests, and we
want the requests to be handled concurrently. I.e., if a DRPC request comes
into the topology at time T0 that takes a long time to complete (let's say
100 seconds), we don't want this request to block another request that
comes in at T0 + 1 sec and will only take 1 second to complete. I.e, we'd
like to get the result for second request at time T0 + 2 sec  or T0 + 3 sec
or so, and not at time T0 + 101 sec.

Does this behavior happen automatically? Or would we have to do something
special to make Storm behave this way? Or is it just not the way that Storm
works? I haven't found anything about this in the online Storm
documentation or in any books.

I looked through the user@storm.apache.org archives looking for an answer
to this question before posting. I found this question posted a few months
ago:

>From "JianJian, Bu" 
Subject Issues about concurrent request to Trident DRPC
Date Mon, 29 Dec 2014 07:08:10 GMT
Dear All,

We’re encountering some issue when use trident DRPC --  It seems that
trident DRPC request is executed sequentially with later request blocked by
previous one no matter how the parallelism parameter would be set.

Our usage is :  we setup a client faced web services  which is hosted by a
group of hosts. The web service receive requests from our client and call
storm DRPC to query the data and send back the result to the Client.

Now the issue is some of the request will take much time to be finished by
DRPC.  So these requests will block other clients from getting the result.
  I can see many introductions in storm wiki about how drpc works but there
is little threads about how drpc handles multiple requests. There is only
one sentence about the concurrent reuqests in this wiki
https://storm.apache.org/documentation/Distributed-RPC.html “KeyedFairBolt
for weaving the processing of multiple requests at the same time” .  But it
has little help. There may be some ideas like start multiple similar
topology in the
same cluster but I think that would have other issue like balance the
load.  Does anyone ever have the similar issue and how did that get solved?

Unfortunately, there are no responses in the archive to this previous
posting.

Is Storm DRPC not widely used?

|>oug


RE: Regarding Storm Trident

2015-04-16 Thread Brunner, Bill
Its all in here: 
https://storm.apache.org/documentation/Trident-API-Overview.html

-Original Message-
From: Gaurav Agarwal [mailto:gaurav130...@gmail.com] 
Sent: Thursday, April 16, 2015 2:05 PM
To: user@storm.apache.org
Subject: Re: Regarding Storm Trident

i am aware of thse docs but if i have to check something about trident 
multireduce, merge and other methods like this where we need to go..

On 4/16/15, Jake K. Dodd  wrote:
> https://storm.apache.org/documentation/Trident-tutorial.html
>
> https://storm.apache.org/documentation/Trident-API-Overview.html
>
> https://storm.apache.org/documentation/Trident-state
>
> Best
>
> Jake K Dodd
>
>> On Apr 16, 2015, at 12:36, Gaurav Agarwal  wrote:
>>
>> -- Forwarded message --
>> From: Gaurav Agarwal 
>> Date: Thu, 16 Apr 2015 22:40:27 +0530
>> Subject: Regarding Storm Trident
>> To: us...@storm.apache.org
>>
>> Hello
>>
>> I am trying to  implement trident Stream,merge,multireduce and other 
>> functions in my code . But there is no proper documentation there I 
>> also find the api /Java docs but without any documentation , Are 
>> there any list of examples or documentation which i can look after 
>> for.
>>
>> Thanks
>

--
This message, and any attachments, is for the intended recipient(s) only, may 
contain information that is privileged, confidential and/or proprietary and 
subject to important terms and conditions available at 
http://www.bankofamerica.com/emaildisclaimer.   If you are not the intended 
recipient, please delete this message.


Re: Supervisor repeatedly killing worker

2015-04-16 Thread Grant Overby (groverby)
I’m not, and If I had to guess I’d say it’s likely something is going wrong 
with the heartbeats, but how can I go about finding out?




From: Paul Poulosky mailto:ppoul...@yahoo-inc.com>>
Reply-To: "user@storm.apache.org" 
mailto:user@storm.apache.org>>, Paul Poulosky 
mailto:ppoul...@yahoo-inc.com>>
Date: Thursday, April 16, 2015 at 2:38 PM
To: "user@storm.apache.org" 
mailto:user@storm.apache.org>>
Subject: Re: Supervisor repeatedly killing worker

10.0.1.5


Re: Supervisor repeatedly killing worker

2015-04-16 Thread Paul Poulosky
Are you sure the worker is writing heartbeats and the supervisor is reading 
them?
 


 On Thursday, April 16, 2015 1:36 PM, Grant Overby (groverby) 
 wrote:
   

  The supervisor is reporting the worker “still hasn’t started” even though the 
worker is up and appears to be working. After timeout, the supervisor kills and 
restarts the worker.
I set the timeouts really high (10 mins) to see what would happen. I though 
maybe a GC pause or something. The worker is still killed with these large 
timeouts, it just takes longer.
The supervisor is able to establish a connection to the worker 
port:[root@twig07 storm]# netstat -pan | grep 6700tcp        0      0 :::6700   
                  :::*                        LISTEN      6860/java           
tcp        0      0 :::10.0.1.7:6700        :::10.0.1.8:44324       
ESTABLISHED 6860/java           tcp        0      0 :::10.0.1.7:36940       
:::10.0.1.8:6700        ESTABLISHED 6860/java           tcp        0    256 
:::10.0.1.7:36543       :::10.0.1.6:6700        ESTABLISHED 6860/java   
        tcp        0      0 :::10.0.1.7:6700        :::10.0.1.5:38746   
    ESTABLISHED 6860/java           tcp        0      0 :::10.0.1.7:59151   
    :::10.0.1.5:6700        ESTABLISHED 6860/java           tcp        0    
  0 :::10.0.1.7:6700        :::10.0.1.6:59916       ESTABLISHED 
6860/java    


What can I look at to diagnose this behavior?




Supervisor log:2015-04-15 18:32:52 b.s.d.supervisor [INFO] 
8e836654-931d-46d5-bf3f-cb75e394b113 still hasn't started2015-04-15 18:32:52 
b.s.d.supervisor [INFO] 8e836654-931d-46d5-bf3f-cb75e394b113 still hasn't 
started2015-04-15 18:54:01 b.s.d.supervisor [INFO] Shutting down and clearing 
state for id 8e836654-931d-46d5-bf3f-cb75e394b113. Current supervisor time: 
1429138441. State: :disallowed, Heartbeat: 
#backtype.storm.daemon.common.WorkerHeartbeat{:time-secs 1429138440, :storm-id 
"burner-trident-1-1429131611", :executors #{[4 4] [36 36] [68 68] [100 100] [8 
8] [40 40] [72 72] [104 104] [12 12] [44 44] [76 76] [108 108] [16 16] [48 48] 
[80 80] [112 112] [20 20] [52 52] [84 84] [116 116] [24 24] [56 56] [88 88] 
[120 120] [28 28] [60 60] [92 92] [124 124] [-1 -1] [32 32] [64 64] [96 96] 
[128 128]}, :port 6700}2015-04-15 18:54:01 b.s.d.supervisor [INFO] Shutting 
down 
c60221c8-1c96-41b5-9669-4fa8b74fc022:8e836654-931d-46d5-bf3f-cb75e394b1132015-04-15
 18:54:01 b.s.config [INFO] GET worker-user 
8e836654-931d-46d5-bf3f-cb75e394b1132015-04-15 18:54:02 b.s.config [INFO] 
REMOVE worker-user 8e836654-931d-46d5-bf3f-cb75e394b1132015-04-15 18:54:02 
b.s.d.supervisor [INFO] Shut down 
c60221c8-1c96-41b5-9669-4fa8b74fc022:8e836654-931d-46d5-bf3f-cb75e394b113


Worker log for the same time period:2015-04-15 18:43:16 k.p.SyncProducer [INFO] 
Connected to twig06.twigs:6667 for producing2015-04-15 18:43:16 
k.p.SyncProducer [INFO] Connected to twig08.twigs:6667 for producing2015-04-15 
18:43:16 k.p.SyncProducer [INFO] Connected to twig05.twigs:6667 for 
producing2015-04-15 18:43:16 k.p.SyncProducer [INFO] Disconnecting from 
twig07.twigs:66672015-04-15 18:43:16 k.p.SyncProducer [INFO] Disconnecting from 
twig08.twigs:66672015-04-15 18:43:16 k.p.SyncProducer [INFO] Disconnecting from 
twig05.twigs:66672015-04-15 18:43:16 k.p.SyncProducer [INFO] Disconnecting from 
twig06.twigs:66672015-04-15 18:43:16 k.p.SyncProducer [INFO] Disconnecting from 
twig07.twigs:66672015-04-15 18:43:16 k.p.SyncProducer [INFO] Connected to 
twig07.twigs:6667 for producing2015-04-15 18:43:16 k.p.SyncProducer [INFO] 
Connected to twig06.twigs:6667 for producing2015-04-15 18:43:16 
k.p.SyncProducer [INFO] Connected to twig08.twigs:6667 for producing2015-04-15 
18:43:16 k.p.SyncProducer [INFO] Connected to twig05.twigs:6667 for 
producing2015-04-15 18:44:06 s.k.t.ZkBrokerReader [INFO] brokers need 
refreshing because 6ms have expired2015-04-15 18:44:06 
s.k.DynamicBrokersReader [INFO] Read partition info from zookeeper: 
GlobalPartitionInformation{partitionMap={0=twig07.twigs:6667, 
1=twig08.twigs:6667, 2=twig05.twigs:6667, 3=twig06.twigs:6667}}2015-04-15 
18:45:14 s.k.t.ZkBrokerReader [INFO] brokers need refreshing because 6ms 
have expired2015-04-15 18:45:14 s.k.DynamicBrokersReader [INFO] Read partition 
info from zookeeper: 
GlobalPartitionInformation{partitionMap={0=twig07.twigs:6667, 
1=twig08.twigs:6667, 2=twig05.twigs:6667, 3=twig06.twigs:6667}}2015-04-15 
18:54:01 o.a.s.c.r.ExponentialBackoffRetry [WARN] maxRetries too large (30). 
Pinning to 292015-04-15 18:54:01 b.s.u.StormBoundedExponentialBackoffRetry 
[INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries 
[30]2015-04-15 18:54:01 b.s.m.n.Client [INFO] New Netty Client, connect to 
twig07.twigs, 6700, config: , buffer_size: 52428802015-04-15 18:54:01 
b.s.m.n.Client [INFO] Reconnect started for 
Netty-Client-twig07.twigs/10.0.1.7:6700... [0]2015-04-15 18:54:01 
b.s.m.n.Client [INFO] Closing Netty Client 
Netty-Client

Supervisor repeatedly killing worker

2015-04-16 Thread Grant Overby (groverby)
The supervisor is reporting the worker “still hasn’t started” even though the 
worker is up and appears to be working. After timeout, the supervisor kills and 
restarts the worker.

I set the timeouts really high (10 mins) to see what would happen. I though 
maybe a GC pause or something. The worker is still killed with these large 
timeouts, it just takes longer.

The supervisor is able to establish a connection to the worker port:
[root@twig07 storm]# netstat -pan | grep 6700
tcp0  0 :::6700 :::*
LISTEN  6860/java
tcp0  0 :::10.0.1.7:6700:::10.0.1.8:44324   
ESTABLISHED 6860/java
tcp0  0 :::10.0.1.7:36940   :::10.0.1.8:6700
ESTABLISHED 6860/java
tcp0256 :::10.0.1.7:36543   :::10.0.1.6:6700
ESTABLISHED 6860/java
tcp0  0 :::10.0.1.7:6700:::10.0.1.5:38746   
ESTABLISHED 6860/java
tcp0  0 :::10.0.1.7:59151   :::10.0.1.5:6700
ESTABLISHED 6860/java
tcp0  0 :::10.0.1.7:6700:::10.0.1.6:59916   
ESTABLISHED 6860/java



What can I look at to diagnose this behavior?





Supervisor log:
2015-04-15 18:32:52 b.s.d.supervisor [INFO] 
8e836654-931d-46d5-bf3f-cb75e394b113 still hasn't started
2015-04-15 18:32:52 b.s.d.supervisor [INFO] 
8e836654-931d-46d5-bf3f-cb75e394b113 still hasn't started
2015-04-15 18:54:01 b.s.d.supervisor [INFO] Shutting down and clearing state 
for id 8e836654-931d-46d5-bf3f-cb75e394b113. Current supervisor time: 
1429138441. State: :disallowed, Heartbeat: 
#backtype.storm.daemon.common.WorkerHeartbeat{:time-secs 1429138440, :storm-id 
"burner-trident-1-1429131611", :executors #{[4 4] [36 36] [68 68] [100 100] [8 
8] [40 40] [72 72] [104 104] [12 12] [44 44] [76 76] [108 108] [16 16] [48 48] 
[80 80] [112 112] [20 20] [52 52] [84 84] [116 116] [24 24] [56 56] [88 88] 
[120 120] [28 28] [60 60] [92 92] [124 124] [-1 -1] [32 32] [64 64] [96 96] 
[128 128]}, :port 6700}
2015-04-15 18:54:01 b.s.d.supervisor [INFO] Shutting down 
c60221c8-1c96-41b5-9669-4fa8b74fc022:8e836654-931d-46d5-bf3f-cb75e394b113
2015-04-15 18:54:01 b.s.config [INFO] GET worker-user 
8e836654-931d-46d5-bf3f-cb75e394b113
2015-04-15 18:54:02 b.s.config [INFO] REMOVE worker-user 
8e836654-931d-46d5-bf3f-cb75e394b113
2015-04-15 18:54:02 b.s.d.supervisor [INFO] Shut down 
c60221c8-1c96-41b5-9669-4fa8b74fc022:8e836654-931d-46d5-bf3f-cb75e394b113



Worker log for the same time period:
2015-04-15 18:43:16 k.p.SyncProducer [INFO] Connected to twig06.twigs:6667 for 
producing
2015-04-15 18:43:16 k.p.SyncProducer [INFO] Connected to twig08.twigs:6667 for 
producing
2015-04-15 18:43:16 k.p.SyncProducer [INFO] Connected to twig05.twigs:6667 for 
producing
2015-04-15 18:43:16 k.p.SyncProducer [INFO] Disconnecting from twig07.twigs:6667
2015-04-15 18:43:16 k.p.SyncProducer [INFO] Disconnecting from twig08.twigs:6667
2015-04-15 18:43:16 k.p.SyncProducer [INFO] Disconnecting from twig05.twigs:6667
2015-04-15 18:43:16 k.p.SyncProducer [INFO] Disconnecting from twig06.twigs:6667
2015-04-15 18:43:16 k.p.SyncProducer [INFO] Disconnecting from twig07.twigs:6667
2015-04-15 18:43:16 k.p.SyncProducer [INFO] Connected to twig07.twigs:6667 for 
producing
2015-04-15 18:43:16 k.p.SyncProducer [INFO] Connected to twig06.twigs:6667 for 
producing
2015-04-15 18:43:16 k.p.SyncProducer [INFO] Connected to twig08.twigs:6667 for 
producing
2015-04-15 18:43:16 k.p.SyncProducer [INFO] Connected to twig05.twigs:6667 for 
producing
2015-04-15 18:44:06 s.k.t.ZkBrokerReader [INFO] brokers need refreshing because 
6ms have expired
2015-04-15 18:44:06 s.k.DynamicBrokersReader [INFO] Read partition info from 
zookeeper: GlobalPartitionInformation{partitionMap={0=twig07.twigs:6667, 
1=twig08.twigs:6667, 2=twig05.twigs:6667, 3=twig06.twigs:6667}}
2015-04-15 18:45:14 s.k.t.ZkBrokerReader [INFO] brokers need refreshing because 
6ms have expired
2015-04-15 18:45:14 s.k.DynamicBrokersReader [INFO] Read partition info from 
zookeeper: GlobalPartitionInformation{partitionMap={0=twig07.twigs:6667, 
1=twig08.twigs:6667, 2=twig05.twigs:6667, 3=twig06.twigs:6667}}
2015-04-15 18:54:01 o.a.s.c.r.ExponentialBackoffRetry [WARN] maxRetries too 
large (30). Pinning to 29
2015-04-15 18:54:01 b.s.u.StormBoundedExponentialBackoffRetry [INFO] The 
baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries [30]
2015-04-15 18:54:01 b.s.m.n.Client [INFO] New Netty Client, connect to 
twig07.twigs, 6700, config: , buffer_size: 5242880
2015-04-15 18:54:01 b.s.m.n.Client [INFO] Reconnect started for 
Netty-Client-twig07.twigs/10.0.1.7:6700... [0]
2015-04-15 18:54:01 b.s.m.n.Client [INFO] Closing Netty Client 
Netty-Client-twig08.twigs/10.0.1.8:6700
2015-04-15 18:54:01 b.s.m.n.Client [INFO] Waiting for pending batchs to be sent 
with Netty-Client-twig08.twigs/10.0.1.8:6700..., timeout: 60ms, pendings: 0
2015-04-15 18:54:01 b.s.m.n.Cli

Re: Regarding Storm Trident

2015-04-16 Thread Gaurav Agarwal
i am aware of thse docs but if i have to check something about trident
multireduce, merge and other methods like this where we need to go..

On 4/16/15, Jake K. Dodd  wrote:
> https://storm.apache.org/documentation/Trident-tutorial.html
>
> https://storm.apache.org/documentation/Trident-API-Overview.html
>
> https://storm.apache.org/documentation/Trident-state
>
> Best
>
> Jake K Dodd
>
>> On Apr 16, 2015, at 12:36, Gaurav Agarwal  wrote:
>>
>> -- Forwarded message --
>> From: Gaurav Agarwal 
>> Date: Thu, 16 Apr 2015 22:40:27 +0530
>> Subject: Regarding Storm Trident
>> To: us...@storm.apache.org
>>
>> Hello
>>
>> I am trying to  implement trident Stream,merge,multireduce and other
>> functions in my code . But there is no proper documentation there I also
>> find the api /Java docs but without any documentation , Are there any
>> list
>> of examples or documentation which i can look after for.
>>
>> Thanks
>


Re: Regarding Storm Trident

2015-04-16 Thread Jake K. Dodd
https://storm.apache.org/documentation/Trident-tutorial.html

https://storm.apache.org/documentation/Trident-API-Overview.html

https://storm.apache.org/documentation/Trident-state

Best

Jake K Dodd

> On Apr 16, 2015, at 12:36, Gaurav Agarwal  wrote:
> 
> -- Forwarded message --
> From: Gaurav Agarwal 
> Date: Thu, 16 Apr 2015 22:40:27 +0530
> Subject: Regarding Storm Trident
> To: us...@storm.apache.org
> 
> Hello
> 
> I am trying to  implement trident Stream,merge,multireduce and other
> functions in my code . But there is no proper documentation there I also
> find the api /Java docs but without any documentation , Are there any list
> of examples or documentation which i can look after for.
> 
> Thanks


Regarding Storm Trident

2015-04-16 Thread Gaurav Agarwal
-- Forwarded message --
From: Gaurav Agarwal 
Date: Thu, 16 Apr 2015 22:40:27 +0530
Subject: Regarding Storm Trident
To: us...@storm.apache.org

Hello

I am trying to  implement trident Stream,merge,multireduce and other
functions in my code . But there is no proper documentation there I also
find the api /Java docs but without any documentation , Are there any list
of examples or documentation which i can look after for.

Thanks


Re: Workaround until https://issues.apache.org/jira/browse/STORM-573 is fixed in a release

2015-04-16 Thread Mark Tomko
Setting the property using System.setProperty does not work, but setting
the property in the shell does seem to.

One thing that I notice is that if a worker process dies, it kills the
entire JVM. I'm running my tests from SBT at the moment, and when the
exception is thrown, I'm getting dropped back to the shell, even with a
try/catch block around the call to
Testing.withSimulatedTimeLocalCluster(...).

We could/should add some additional exception handling to the actual
components, but I'm not sure it makes sense to kill the entire JVM just for
an exception.

Thanks!
Mark

On Thu, Apr 16, 2015 at 9:00 AM, 임정택  wrote:

> Hello.
>
> You can try set system environment "STORM_TEST_TIMEOUT_MS" and run your
> test.
> It'll bind to TEST_TIMEOUT_MS and it's default timeout value of
> complete-topology.
>
> Hope this helps.
>
> Regards.
> Jungtaek Lim (HeartSaVioR)
>
> 2015-04-16 1:08 GMT+09:00 Mark Tomko :
>
>> Hi,
>>
>> I'm working on some tests for one of my Storm topologies, and I'm
>> consistently seeing topology timeouts from the testing framework, and it
>> looks like it's only waiting 5 seconds, which is pretty aggressive in my
>> opinion:
>>
>> Error in cluster
>> java.lang.AssertionError: Test timed out (5000ms)
>> at backtype.storm.testing$complete_topology.doInvoke(testing.clj:489)
>> ~[storm-core-0.9.3.jar:0.9.3]
>> at clojure.lang.RestFn.invoke(RestFn.java:826) ~[clojure-1.5.1.jar:na]
>> at backtype.storm.testing4j$_completeTopology.invoke(testing4j.clj:61)
>> ~[storm-core-0.9.3.jar:0.9.3]
>> at backtype.storm.Testing.completeTopology(Unknown Source)
>> [storm-core-0.9.3.jar:0.9.3]
>>
>>
>> I know from experience that this topology runs pretty well on our remote
>> cluster, so I don't think these timeouts are necessarily and indication of
>> something bad.
>>
>> Looking around, I found this JIRA issue:
>>
>> https://issues.apache.org/jira/browse/STORM-573
>>
>> Which was fixed ~5 months ago, but didn't end up in Storm 0.9.4. Does
>> anyone have any suggestions for a testing workaround in the meantime?
>> Before I'd discovered the testing framework, I'd been just been spinning up
>> a local cluster myself as part of the test, and I might just go back to
>> doing that, but it would be great to use the officially supported testing
>> mechanisms.
>>
>> Thanks,
>>
>> Mark
>>
>>
>
>
> --
> Name : 임 정택
> Blog : http://www.heartsavior.net / http://dev.heartsavior.net
> Twitter : http://twitter.com/heartsavior
> LinkedIn : http://www.linkedin.com/in/heartsavior
>


Re: Stuck while running storm in local mode:

2015-04-16 Thread Nishu
Since this is parsing error in storm.yaml file, check whether you have any
tabspace in file. Try to remove all tab or space and rewrite.
Then it should work fine. You can also validate your yaml file on yamlLint
.


On Thu, Apr 16, 2015 at 6:35 PM, Gaurav Pandey 
wrote:

> I am configuring Cassendra 1.0 on CentOS 5.7 but getting stacked when
> running the command "cassandra -f". Here is the error message-
>
>
> Caused by: expected '', but found BlockMappingStart
>  in 'reader', line 24, column 1:
> storm.zookeeper.port: 2181
> ^
>
> at 
> org.yaml.snakeyaml.parser.ParserImpl$ParseDocumentStart.produce(ParserImpl.java:225)
> at org.yaml.snakeyaml.parser.ParserImpl.peekEvent(ParserImpl.java:158)
> at org.yaml.snakeyaml.parser.ParserImpl.checkEvent(ParserImpl.java:143)
> at org.yaml.snakeyaml.composer.Composer.getSingleNode(Composer.java:108)
> at 
> org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:120)
> at org.yaml.snakeyaml.Yaml.loadFromReader(Yaml.java:481)
> at org.yaml.snakeyaml.Yaml.load(Yaml.java:424)
> at backtype.storm.utils.Utils.findAndReadConfigFile(Utils.java:141)
> at backtype.storm.utils.Utils.readStormConfig(Utils.java:188)
> at backtype.storm.utils.Utils.(Utils.java:71)
> ... 100 more
>
> Following is my storm.yaml file in /conf folder:
>
>
> ---
> java.library.path: /usr/local/lib
> nimbus.childopts: "-Xmx512m"
> nimbus.host: localhost
> storm.local.dir: /var/stormtmp
> storm.zookeeper.port: 2181
> storm.zookeeper.servers:
>   - localhost
> supervisor.childopts: "-Xmx256m"
> supervisor.slots.ports:
>   - 6700
>   - 6701
>   - 6702
>   - 6703
> worker.childopts: "-Xmx768m"
>
>
> Haven't got any answers from stack overflow as of now, can somebody please 
> suggest where am I going wrogn?
> Also if I comment the whole storm.yaml file the error remains the same, but 
> if I make some syntax error in storm.yaml file it does reflect in the error 
> stack, along with other erross
>
>


-- 
with regards,
Nishu Tayal


extract storm example from 0.9.4

2015-04-16 Thread alexey yakubovich
Hi, my first time, with Storm mailing list. Good to meet you guys.
How to extract the example projects from latest storm distribution (0.9.4) and 
have each example as a separate maven project (in eclipse)?Or, may be somebody 
can point me some example of storm/kafka that already have all dependencies 
resolved and I can just reuse the pom.xml.
I am trying to build streaming data channel with kafka and storm. The special 
think about it: i try to make it metadata-aware with collection of data 
transformers and loaders (into any of supported data stores). I see it as a 
storm spout that fetch msgs from kafka and do necessary ETL, and bolt that 
loads then into … whatever. Because of transformation and loading I need to 
extend provided example. And it would be more convenient to have it in a 
separate project (besides I don’t  know how to run example right from main 
storm project.). 
Thank you for any help.


Using Storm with Kundera to write into Cassandra/HBase

2015-04-16 Thread Spico Florin
Hello!
   Does anyone used Kundera (
https://github.com/impetus-opensource/Kundera/wiki/Getting-Started-in-5-minutes)
to write/read data from/to Cassadra/HBase?
  Any suggestions or github example will be appreciated.
Thanks.
 Florin


Stuck while running storm in local mode:

2015-04-16 Thread Gaurav Pandey
I am configuring Cassendra 1.0 on CentOS 5.7 but getting stacked when
running the command "cassandra -f". Here is the error message-


Caused by: expected '', but found BlockMappingStart
 in 'reader', line 24, column 1:
storm.zookeeper.port: 2181
^

at 
org.yaml.snakeyaml.parser.ParserImpl$ParseDocumentStart.produce(ParserImpl.java:225)
at org.yaml.snakeyaml.parser.ParserImpl.peekEvent(ParserImpl.java:158)
at org.yaml.snakeyaml.parser.ParserImpl.checkEvent(ParserImpl.java:143)
at org.yaml.snakeyaml.composer.Composer.getSingleNode(Composer.java:108)
at 
org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:120)
at org.yaml.snakeyaml.Yaml.loadFromReader(Yaml.java:481)
at org.yaml.snakeyaml.Yaml.load(Yaml.java:424)
at backtype.storm.utils.Utils.findAndReadConfigFile(Utils.java:141)
at backtype.storm.utils.Utils.readStormConfig(Utils.java:188)
at backtype.storm.utils.Utils.(Utils.java:71)
... 100 more

Following is my storm.yaml file in /conf folder:


---
java.library.path: /usr/local/lib
nimbus.childopts: "-Xmx512m"
nimbus.host: localhost
storm.local.dir: /var/stormtmp
storm.zookeeper.port: 2181
storm.zookeeper.servers:
  - localhost
supervisor.childopts: "-Xmx256m"
supervisor.slots.ports:
  - 6700
  - 6701
  - 6702
  - 6703
worker.childopts: "-Xmx768m"


Haven't got any answers from stack overflow as of now, can somebody
please suggest where am I going wrogn?
Also if I comment the whole storm.yaml file the error remains the
same, but if I make some syntax error in storm.yaml file it does
reflect in the error stack, along with other erross


Re: Workaround until https://issues.apache.org/jira/browse/STORM-573 is fixed in a release

2015-04-16 Thread 임정택
Hello.

You can try set system environment "STORM_TEST_TIMEOUT_MS" and run your
test.
It'll bind to TEST_TIMEOUT_MS and it's default timeout value of
complete-topology.

Hope this helps.

Regards.
Jungtaek Lim (HeartSaVioR)

2015-04-16 1:08 GMT+09:00 Mark Tomko :

> Hi,
>
> I'm working on some tests for one of my Storm topologies, and I'm
> consistently seeing topology timeouts from the testing framework, and it
> looks like it's only waiting 5 seconds, which is pretty aggressive in my
> opinion:
>
> Error in cluster
> java.lang.AssertionError: Test timed out (5000ms)
> at backtype.storm.testing$complete_topology.doInvoke(testing.clj:489)
> ~[storm-core-0.9.3.jar:0.9.3]
> at clojure.lang.RestFn.invoke(RestFn.java:826) ~[clojure-1.5.1.jar:na]
> at backtype.storm.testing4j$_completeTopology.invoke(testing4j.clj:61)
> ~[storm-core-0.9.3.jar:0.9.3]
> at backtype.storm.Testing.completeTopology(Unknown Source)
> [storm-core-0.9.3.jar:0.9.3]
>
>
> I know from experience that this topology runs pretty well on our remote
> cluster, so I don't think these timeouts are necessarily and indication of
> something bad.
>
> Looking around, I found this JIRA issue:
>
> https://issues.apache.org/jira/browse/STORM-573
>
> Which was fixed ~5 months ago, but didn't end up in Storm 0.9.4. Does
> anyone have any suggestions for a testing workaround in the meantime?
> Before I'd discovered the testing framework, I'd been just been spinning up
> a local cluster myself as part of the test, and I might just go back to
> doing that, but it would be great to use the officially supported testing
> mechanisms.
>
> Thanks,
>
> Mark
>
>


-- 
Name : 임 정택
Blog : http://www.heartsavior.net / http://dev.heartsavior.net
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior


RE: how to get spout to re-emit in trident

2015-04-16 Thread Brunner, Bill
You would throw a FailedException

From: 王天驹 [mailto:wangtianju@gmail.com]
Sent: Thursday, April 16, 2015 4:39 AM
To: user@storm.apache.org
Subject: how to get spout to re-emit in trident

hi,
I am new to trident, when we use low level storm api, we call fail(tuple) to 
tell spout to re-emit tuple
what shall we do in trident to get the spout to re-emit? throw a runtime 
exception?

thx

--
This message, and any attachments, is for the intended recipient(s) only, may 
contain information that is privileged, confidential and/or proprietary and 
subject to important terms and conditions available at 
http://www.bankofamerica.com/emaildisclaimer.   If you are not the intended 
recipient, please delete this message.


Storm 0.9.4 on yarn

2015-04-16 Thread Sandeep Samudrala
I was running storm on yarn using "https://github.com/yahoo/storm-yarn";
but yahoo storm on yarn was integrated with storm-0.9.0-wip21, which is bit
older version. We tried patching storm 0.9.4 with storm-on-yarn and then
deploy storm on yarn cluster.The Storm UI shows up with.

"Page not found"

Is storm on yarn extensively used anywhere?.

Thanks,
-Sandeep


An exception occured while executing the Java class. java.lang.InterruptedException

2015-04-16 Thread Chun Yuen Lim
I get my Apache Storm from
http://mirror.symnds.com/software/Apache/storm/apache-storm-0.9.3/apache-storm-0.9.3.tar.gz

I never modified any thing in the /home/user/storm/examples/storm-starter/
directory.

When I run mvn exec:java -D storm.topology=storm.starter.WordCountTopology
in the /home/user/storm/examples/storm-starter/, it gives the error as
below:

[ERROR] Failed to execute goal
org.codehaus.mojo:exec-maven-plugin:1.2.1:java (default-cli) on project
storm-starter: An exception occured while executing the Java class.
java.lang.InterruptedException -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with
the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions,
please read the following articles:
[ERROR] [Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
MojoExecutionException - Apache Maven - Apache Software Foundation
cwiki.apache.org

Anyone know what is the problem & how to solve it?
I wish can provide me a detail solution as I am a newbie. Thank you.

The attached file are the WordCountTopology


WordCountTopology.java
Description: Binary data


how to get spout to re-emit in trident

2015-04-16 Thread 王天驹
hi,
I am new to trident, when we use low level storm api, we call fail(tuple)
to tell spout to re-emit tuple
what shall we do in trident to get the spout to re-emit? throw a runtime
exception?

thx


Re: ShellBolt / Bash invocation: Source environment variables

2015-04-16 Thread Johannes Hugo Kitschke
To answer my own question (maybe it's useful for someone else): My 
findings are:

- bash is launched as non-login, non-interactive shell for the shell bolt
- in this mode bash looks for a path/file in the environment variable 
'BASH_ENV', if it exists that file is sourced
- now I only had to create a file that loads the module and point the 
env. var. to the location of the file

That's it. Now it works.

On 04/15/2015 10:40 AM, Johannes Hugo Kitschke wrote:

Hi,

I am currently experiencing a problem when submitting a topology built 
with pyleus to my storm cluster:


java.lang.RuntimeException: Error when launching multilang subprocess
pyleus_venv/bin/python: error while loading shared libraries: 
libpython2.7.so.1.0: cannot open shared object file: No such file or 
directory


Maybe I know what causes this error: On our cluster I have to execute 
'module load python' to setup the environment for python. I have this 
in my .bashrc but my assumption is, that .bashrc does not get sourced 
when setting up the ShellBot (is this true?). This is based on the 
following observation: Local mode performs as expected, unless python 
module is not loaded. Then it fails with the error above.


Long story short: Which shell is used? Which files are sourced when 
the shell is launched for the ShellBolts?


Johannes