Re: Can be storm-deploy script used for deploying storm 0.9.x version on Amazon?

2014-03-27 Thread Spico Florin
Hello, Sasi!
  Thank you very much for your response. Since I would also like to deploy a
Storm application on Amazon, this really helps save time. If possible, just as
a point of information: do you have any measurement of how long it took the
cluster to be up and running?

Best regards,
 Florin


On Wed, Mar 26, 2014 at 8:36 PM, Sasi Panja sasi.pa...@gmail.com wrote:


 I was able to deploy and run 0.9.0-rc2  on EC2  using
 https://github.com/nathanmarz/storm-deploy

 After following the instructions from the wiki and updating the
 configuration files, the following command worked for me:

  *lein deploy-storm --start --name yourclustername  --branch master
 --commit 0.9.0-rc2*

 I am using Leiningen 2.3.4.

 All the workers, master and zookeeper servers were up and running in the
 cluster.


 

 # CLUSTERS CONFIG FILE *under storm-deploy/conf*


 
 nimbus.image: us-west-2/ami-ca2ca4fa #64-bit ubuntu
 nimbus.hardware: m1.xlarge

 supervisor.count: 5
 supervisor.image: us-west-2/ami-ca2ca4fa #64-bit ubuntu on
 us-west-2
 supervisor.hardware: m1.xlarge

 zookeeper.count: 1
 zookeeper.image: us-west-2/ami-ca2ca4fa #64-bit ubuntu
 zookeeper.hardware: m1.large


 
 *config.clj under   ~/.pallet *


 

 (defpallet
   :services
   {:default
    {:blobstore-provider "aws-s3"
     :provider "aws-ec2"
     :environment {:user {:username "storm"  ; this must be "storm"
                          :private-key-path "/home/ubuntu/.ssh/id_rsa"
                          :public-key-path "/home/ubuntu/.ssh/id_rsa.pub"}
                   :aws-user-id }
     :identity 
     :credential XX
     :jclouds.regions "us-west-2"}})
 


 -Sasi


 On Wed, Mar 26, 2014 at 3:01 AM, Spico Florin spicoflo...@gmail.com wrote:

 Hello!
   I would like to know what changes should be applied to the storm-deploy
 script (https://github.com/nathanmarz/storm-deploy) in order to install
 it on Amazon?
   Thank you in advance.
   Regards,
   Florin





RE: failed to start supervisor with missing stormconf.ser

2014-03-27 Thread Simon Cooper
I've got the same error. In a running cluster, I kill the supervisor running on 
one of the machines, wait until storm reassigns the topology that was on that 
machine (called Sync), and then bring the supervisor up again. It immediately 
dies, with the following in the log:

2014-03-27 10:50:12 b.s.d.supervisor [DEBUG] Synchronizing supervisor
2014-03-27 10:50:12 b.s.d.supervisor [DEBUG] Worker 
21f86017-fed3-4e94-93f4-7ea65ca983e3 is :timed-out: 
#backtype.storm.daemon.common.WorkerHeartbeat{:time-secs 1395917231, :storm-id 
Sync-1-1395916991, :executors #{[43 43] [21 21] [22 22] [-1 -1] [11 20] [23 
32] [1 10] [33 42]}, :port 6703} at supervisor time-secs 1395917412
2014-03-27 10:50:12 b.s.d.supervisor [DEBUG] Storm code map: 
{Sync-1-1395916991 /home/storm/storm/nimbus/stormdist/Sync-1-1395916991, 
Async-2-1395916991 /home/storm/storm/nimbus/stormdist/Async-2-1395916991}
2014-03-27 10:50:12 b.s.d.supervisor [DEBUG] Downloaded storm ids: 
#{Sync-1-1395916991}
2014-03-27 10:50:12 b.s.d.supervisor [DEBUG] All assignment: {}
2014-03-27 10:50:12 b.s.d.supervisor [DEBUG] New assignment: {}
2014-03-27 10:50:12 b.s.d.supervisor [DEBUG] Writing new assignment {}
2014-03-27 10:50:12 b.s.d.supervisor [DEBUG] Syncing processes
2014-03-27 10:50:12 b.s.d.supervisor [DEBUG] Assigned executors: {6703 
#backtype.storm.daemon.supervisor.LocalAssignment{:storm-id 
Sync-1-1395916991, :executors ([33 42] [22 22] [21 21] [1 10] [43 43] [11 20] 
[23 32])}}
2014-03-27 10:50:12 b.s.d.supervisor [DEBUG] Allocated: 
{21f86017-fed3-4e94-93f4-7ea65ca983e3 [:timed-out 
#backtype.storm.daemon.common.WorkerHeartbeat{:time-secs 1395917231, :storm-id 
Sync-1-1395916991, :executors #{[43 43] [21 21] [22 22] [-1 -1] [11 20] [23 
32] [1 10] [33 42]}, :port 6703}]}
2014-03-27 10:50:12 b.s.d.supervisor [INFO] Shutting down and clearing state 
for id 21f86017-fed3-4e94-93f4-7ea65ca983e3. Current supervisor time: 
1395917412. State: :timed-out, Heartbeat: 
#backtype.storm.daemon.common.WorkerHeartbeat{:time-secs 1395917231, :storm-id 
Sync-1-1395916991, :executors #{[43 43] [21 21] [22 22] [-1 -1] [11 20] [23 
32] [1 10] [33 42]}, :port 6703}
2014-03-27 10:50:12 b.s.d.supervisor [INFO] Shutting down 
01e5ad81-57a4-4933-87e5-c487e969b4b3:21f86017-fed3-4e94-93f4-7ea65ca983e3
2014-03-27 10:50:12 b.s.d.supervisor [INFO] Removing code for storm id 
Sync-1-1395916991
2014-03-27 10:50:12 b.s.util [DEBUG] Rmr path 
/home/storm/storm/supervisor/stormdist/Sync-1-1395916991
2014-03-27 10:50:12 b.s.util [INFO] Error when trying to kill 2160. Process is 
probably already dead.
2014-03-27 10:50:12 b.s.util [DEBUG] Removing path 
/home/storm/storm/workers/21f86017-fed3-4e94-93f4-7ea65ca983e3/pids/2160
2014-03-27 10:50:12 b.s.util [DEBUG] Rmr path 
/home/storm/storm/workers/21f86017-fed3-4e94-93f4-7ea65ca983e3/heartbeats
2014-03-27 10:50:12 b.s.util [DEBUG] Removing path 
/home/storm/storm/workers/21f86017-fed3-4e94-93f4-7ea65ca983e3/pids
2014-03-27 10:50:12 b.s.util [DEBUG] Removing path 
/home/storm/storm/workers/21f86017-fed3-4e94-93f4-7ea65ca983e3
2014-03-27 10:50:12 b.s.d.supervisor [INFO] Shut down 
01e5ad81-57a4-4933-87e5-c487e969b4b3:21f86017-fed3-4e94-93f4-7ea65ca983e3
2014-03-27 10:50:12 b.s.util [DEBUG] Making dirs at 
/home/storm/storm/workers/9e27425e-6d2b-48cf-a592-8dfc0204f332/pids
2014-03-27 10:50:12 b.s.d.supervisor [INFO] Launching worker with assignment 
#backtype.storm.daemon.supervisor.LocalAssignment{:storm-id 
Sync-1-1395916991, :executors ([33 42] [22 22] [21 21] [1 10] [43 43] [11 20] 
[23 32])} for this supervisor 01e5ad81-57a4-4933-87e5-c487e969b4b3 on port 6703 
with id 9e27425e-6d2b-48cf-a592-8dfc0204f332
2014-03-27 10:50:12 b.s.event [ERROR] Error when processing event
java.io.FileNotFoundException: File 
'/home/storm/storm/supervisor/stormdist/Sync-1-1395916991/stormconf.ser' does 
not exist
at 
org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:137) 
~[commons-io-1.4.jar:1.4]
at 
org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1135) 
~[commons-io-1.4.jar:1.4]
at 
backtype.storm.config$read_supervisor_storm_conf.invoke(config.clj:177) 
~[storm-core-0.9.0.1.jar:na]
at 
backtype.storm.daemon.supervisor$fn__6328.invoke(supervisor.clj:410) 
~[storm-core-0.9.0.1.jar:na]
at clojure.lang.MultiFn.invoke(MultiFn.java:177) 
~[clojure-1.4.0.jar:na]
at 
backtype.storm.daemon.supervisor$sync_processes$iter__6219__6223$fn__6224.invoke(supervisor.clj:244)
 ~[storm-core-0.9.0.1.jar:na]
at clojure.lang.LazySeq.sval(LazySeq.java:42) 
~[clojure-1.4.0.jar:na]
at clojure.lang.LazySeq.seq(LazySeq.java:60) 
~[clojure-1.4.0.jar:na]
at clojure.lang.RT.seq(RT.java:473) ~[clojure-1.4.0.jar:na]
at clojure.core$seq.invoke(core.clj:133) ~[clojure-1.4.0.jar:na]
at clojure.core$dorun.invoke(core.clj:2725) 
~[clojure-1.4.0.jar:na]
at 

Re: DI with Storm

2014-03-27 Thread Adam Lewis
Yes, that's exactly right: the submission to Nimbus is in the form of a big
Thrift message describing the topology, and this message includes
Java-serialized blobs of your topology components (spouts/bolts). They get
instantiated within the JVM calling StormSubmitter. Typically you would
pass configuration info to the constructor, but dependencies (e.g. a DB
connection pool) are transient fields. Then in the prepare method
(called after deserialization on the worker) you use the serialized
configuration fields to initialize the transient ones. Of course Guice
fits naturally into that step.
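
The constructor-serialize-prepare lifecycle described above can be demonstrated
with plain java.io serialization, which is what Storm uses for topology
components. This is a self-contained sketch, not Storm API: MyBolt, its field
names, and the fake "pool" are illustrative; a real bolt would extend
BaseRichBolt and build its pool in prepare(conf, context, collector).

```java
import java.io.*;

// A "bolt-like" class: config in a final field (serialized), the heavy
// dependency in a transient field (rebuilt on the worker in prepare()).
class MyBolt implements Serializable {
    private final String dbUrl;               // set in the constructor, serialized
    private transient Object connectionPool;  // NOT serialized

    MyBolt(String dbUrl) { this.dbUrl = dbUrl; }

    // Stand-in for IRichBolt.prepare(...), which Storm calls after
    // deserialization on the worker.
    void prepare() { connectionPool = "pool-for-" + dbUrl; }

    Object pool() { return connectionPool; }
    String url()  { return dbUrl; }
}

public class LifecycleDemo {
    // Serialize then deserialize, as submission + worker startup effectively do.
    static MyBolt roundTrip(MyBolt b) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(b);
        oos.flush();
        ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
        return (MyBolt) in.readObject();
    }

    public static void main(String[] args) throws Exception {
        MyBolt onWorker = roundTrip(new MyBolt("jdbc:postgresql://db/prod"));
        System.out.println(onWorker.url());          // config survives the trip
        System.out.println(onWorker.pool() == null); // transient dependency does not
        onWorker.prepare();                          // worker re-creates it
        System.out.println(onWorker.pool());
    }
}
```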


On Thu, Mar 27, 2014 at 12:37 AM, Software Dev static.void@gmail.com wrote:

 Ok so you would configure the map in the main method before submitting
 the topology. Then this conf can be used to create guice injectors. Is
 that correct?

 In the book Getting Started with Storm it states:

 To customize a bolt, you should set parameters in its constructor and
 save them as instance variables so they will be serialized when
 submitting the bolt to the cluster.

 Does this mean bolts are instantiated on the client side before being
 submitted to nimbus/cluster?

 On Wed, Mar 26, 2014 at 2:05 PM, Svend Vanderveken
 svend.vanderve...@gmail.com wrote:
 
  The storm configuration map is part of the arguments received by each
  prepare() method, in most Storm primitives, on each worker. It's
 serialised
  to each worker when a topology instance is started there. The initial
 storm
  configuration map is provided at deploy time to Nimbus, in the class
  containing the main() method, specified in the storm jar blabla.jar
  some.class.here command.
 
 
 
 
  On Wed, Mar 26, 2014 at 4:42 PM, Software Dev static.void@gmail.com
 
  wrote:
 
  How does one get the configuration map to each worker?
 
  On Wed, Mar 26, 2014 at 6:41 AM, Adam Lewis m...@adamlewis.com wrote:
   Or, since this is only being called from prepare at startup anyway,
   simpler:
  
   public class InjectorProvider {
  
   private static Injector injector;
   public static synchronized Injector get(Map conf) {
   if (injector == null) {
   injector = Guice.createInjector(
   new DAOModule(conf),
   new S3Module(conf));
   }
  
   return injector;
   }
   }
  
  
  
  
  
   On Wed, Mar 26, 2014 at 9:26 AM, Svend Vanderveken
   svend.vanderve...@gmail.com wrote:
  
private static Injector injector;
  
   or better:
  
   private static volatile Injector injector;
  
  
   see
  
 http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
  
  
   On Tue, Mar 25, 2014 at 9:55 PM, Patricio Echagüe 
 patric...@gmail.com
   wrote:
  
   It's fine. You can synchronize with a static monitor the creation on
   the
   injector per worker. That's how I do it.
  
   public class InjectorProvider {
  
   private static Injector injector;
  
   public static Injector get() {
   if (injector == null) {
   synchronized (InjectorProvider.class) {
   if (injector == null) {
   injector = Guice.createInjector(
   new DAOModule(),
   new S3Module());
   }
   }
   }
  
   return injector;
   }
   }
  
  
   On Tue, Mar 25, 2014 at 6:24 PM, Adam Lewis m...@adamlewis.com
   wrote:
  
  
   Doesn't Storm 0.9 have a prepare for the worker?
  
  
   No, I don't think it does, but please point this out if I'm
 mistaken.
   I
   found the right JIRA issue though:
   https://issues.apache.org/jira/browse/STORM-126
  
   Seems like the patch was well along but hasn't seen any recent
   activity.
  
  
  
  
 
 



Workers / Tasks management in case of failure

2014-03-27 Thread Quentin de G.
Hello everyone,

I'm building a topology in which I'm counting connected devices. Each
device can send a start and a stop message, both with a same unique ID (to
match the device).

Due to the high throughput of messages, I'm storing a state (in my case, a
list of connected devices) in RAM, shared across all workers, to check and
count devices.
I'm also storing each start message in HBase, using bulk Puts (and removing
them when I get stop messages). I don't want to count using HBase (it's too
slow).

I'd like to be able, in case of a worker / task failure, to restore the
state in RAM by reading the values that are into HBase.

Problem is, I have no idea on how to implement this with Trident.

I found this PDF:
http://didata.us/assets.d/DiData-StormRedis-Portfolio.pdf, that seems
to be a good way of doing it, but only with Storm. I need to
achieve exactly-once processing thanks to Trident.

My question is: can I achieve all this using Trident? And is the idea of
combining RAM + HBase wrong?

Many thanks,
Quentin


date time in tuple

2014-03-27 Thread michael campbell

How do you put a datetime, let's say a jodatime datetime value, in a tuple? 

How do you get a datetime out of a tuple, what sort of method corresponds to 
tuple.getLongByField for a datetime?

Michael Campbell

-- 




Re: date time in tuple

2014-03-27 Thread Dan Guja
Try this:
(DateTime) tuple.getValueByField("myDateTimeFieldName");


On Thu, Mar 27, 2014 at 8:50 AM, michael campbell 
michael.campb...@dsl.pipex.com wrote:


 How do you put a datetime, let's say a jodatime datetime value, in a tuple?

 How do you get a datetime out of a tuple, what sort of method corresponds
 to tuple.getLongByField for a datetime?

 Michael Campbell

 --





Re: date time in tuple

2014-03-27 Thread Dan Guja
Also it might be worth reading:
https://github.com/nathanmarz/storm/wiki/Serialization


On Thu, Mar 27, 2014 at 9:01 AM, Dan Guja dang...@gmail.com wrote:

 Try this:
 (DateTime) tuple.getValueByField("myDateTimeFieldName");


 On Thu, Mar 27, 2014 at 8:50 AM, michael campbell 
 michael.campb...@dsl.pipex.com wrote:


 How do you put a datetime, let's say a jodatime datetime value, in a
 tuple?

 How do you get a datetime out of a tuple, what sort of method corresponds
 to tuple.getLongByField for a datetime?

 Michael Campbell

 --






Re: DI with Storm

2014-03-27 Thread Software Dev
Thanks. I think that cleared up most of my misunderstanding.

On Thu, Mar 27, 2014 at 6:16 AM, Adam Lewis m...@adamlewis.com wrote:
 Yes that is exactly right, the submission to Nimbus is in the form of a big
 thrift message describing the topology...this message includes java
 serialized blobs of your topology components (spouts/bolts). They get
 instantiated within the VM calling StormSubmitter.  Typically you would pass
 configuration info to the constructor, but dependencies (e.g. DB
 connection pool, etc) are transient fields.  Then in the prepare method
 (called after deserialization on the worker) you use the serialized
 configuration fields to initialize the transient ones.  Of course Guice fits
 naturally into that step.






Help me understand storm in the wild

2014-03-27 Thread Software Dev
How are many of you typically using storm? Use cases usually work best ;)

Say you have one source for a stream of user activity (page views,
product searches, purchases, etc.). Now let's say you wanted to
calculate some running metrics for each of these activities. Is it
best to have one major topology where you send each metric type
to its respective bolt, or would you create many topologies, one for
each activity?

Thanks


Re: Help me understand storm in the wild

2014-03-27 Thread Noel Milton Vega

It always depends. :) But here's one factor to consider:

From a R.A.S. perspective (described here: http://bit.ly/1fS01MU), a
one-topology approach *may* negatively impact operational agility when those
metrics can be calculated independently of one another. In such cases, merging
them into one topology creates an unnatural dependency: the moment the code
for any one metric needs to be updated (for whatever reason), calculation of
the remaining metrics must stop, since the common topology must be stopped,
modified, and tested.

So the *S*erviceability characteristics of that topology will suffer, and
*A*vailability numbers for metrics that didn't require modification will
unnecessarily suffer, too.

You may also want to consider the ability/inability to tweak per-topology
run-time properties (e.g. memory allocation, schedulers, etc.).

Perhaps use a few 'bundled' topologies where it makes sense, versus a
many-to-one or many-to-many approach.

Besides R.A.S. there are other reasons pro and con.


On 03/27/2014 01:47 PM, Software Dev wrote:

How are many of you typically using storm? Use cases usually work best ;)

Say you have one source for a stream of user activity (page views,
product searches, purchases, etc). Now lets say you wanted to
calculate some running metrics for each of these activities. Is it
best to have one major topology where you would send each metric type
to their respective bolts or would you create many topologies, one for
each activity?

Thanks




Re: Can be storm-deploy script used for deploying storm 0.9.x version on Amazon?

2014-03-27 Thread Yi Wang
The storm-deploy script currently points to
https://github.com/nathanmarz/storm

0.9.0.1 is the latest tag that works.

-
Yi


On Wed, Mar 26, 2014 at 6:01 AM, Spico Florin spicoflo...@gmail.com wrote:

 Hello!
   I would like to know what changes should be applied to the storm-deploy
 script (https://github.com/nathanmarz/storm-deploy) in order to install
 it on Amazon?
   Thank you in advance.
   Regards,
   Florin



Re: Auto acking

2014-03-27 Thread Software Dev
Ok, I haven't come across that class.

What creates that class?

On Wed, Mar 26, 2014 at 10:21 PM, Angelo Genovese ang...@genovese.ca wrote:
 BasicBolts are wrapped automatically with a basic bolt executor. It does the
 ack/fail.

 On Mar 26, 2014 9:40 PM, Software Dev static.void@gmail.com wrote:

 Sorry for the basic question but...

 Where does BaseBasicBolt auto ack? It looks like it uses
 BasicOutputCollector which doesn't even have an ack or fail method
 whereas OutputCollector does?
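
The wrapping Angelo describes can be sketched with pared-down stand-ins for
Storm's interfaces. This is not Storm's actual source: in real Storm the
wrapper is backtype.storm.topology.BasicBoltExecutor, created when
TopologyBuilder.setBolt is given an IBasicBolt, and FailedException lives in
backtype.storm.topology; the tiny interfaces below are illustrative only.

```java
// Pared-down stand-ins for OutputCollector and IBasicBolt.
interface Collector { void ack(Object tuple); void fail(Object tuple); }
interface BasicBolt { void execute(Object tuple) throws Exception; }

// Stand-in for Storm's FailedException.
class FailedException extends Exception {}

// Sketch of what the basic bolt executor does around each tuple: run the
// wrapped bolt's execute(), then ack; on FailedException, fail instead.
// This is why BasicOutputCollector needs no ack/fail methods of its own.
class BasicBoltExecutorSketch {
    private final BasicBolt delegate;

    BasicBoltExecutorSketch(BasicBolt delegate) { this.delegate = delegate; }

    void execute(Object tuple, Collector collector) {
        try {
            delegate.execute(tuple);
            collector.ack(tuple);      // the "auto ack"
        } catch (FailedException e) {
            collector.fail(tuple);     // explicit failure requested by the bolt
        } catch (Exception e) {
            throw new RuntimeException(e); // other errors propagate
        }
    }
}
```

With a BaseRichBolt, by contrast, you hold the OutputCollector yourself and
must call ack/fail explicitly.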


Re: date time in tuple

2014-03-27 Thread Adam Lewis

 Also it might be worth reading:
 https://github.com/nathanmarz/storm/wiki/Serialization


After which you'll seek out this library:
https://github.com/magro/kryo-serializers
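
The two pointers combine roughly as follows; this is a hedged sketch
(topology name illustrative) assuming storm-core 0.9.x and the magro
kryo-serializers library, which ships a JodaDateTimeSerializer:

```java
import backtype.storm.Config;
import de.javakaffee.kryoserializers.jodatime.JodaDateTimeSerializer;
import org.joda.time.DateTime;

// Register a Kryo serializer for Joda's DateTime so tuples carrying it can
// cross worker boundaries; pass this conf when submitting the topology.
Config conf = new Config();
conf.registerSerialization(DateTime.class, JodaDateTimeSerializer.class);
// StormSubmitter.submitTopology("my-topology", conf, builder.createTopology());
```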


On Thu, Mar 27, 2014 at 11:09 AM, Dan Guja dang...@gmail.com wrote:

 Also it might be worth reading:
 https://github.com/nathanmarz/storm/wiki/Serialization


 On Thu, Mar 27, 2014 at 9:01 AM, Dan Guja dang...@gmail.com wrote:

 Try this:
 (DateTime) tuple.getValueByField("myDateTimeFieldName");


 On Thu, Mar 27, 2014 at 8:50 AM, michael campbell 
 michael.campb...@dsl.pipex.com wrote:


 How do you put a datetime, let's say a jodatime datetime value, in a
 tuple?

 How do you get a datetime out of a tuple, what sort of method
 corresponds to tuple.getLongByField for a datetime?

 Michael Campbell

 --







Re: Can be storm-deploy script used for deploying storm 0.9.x version on Amazon?

2014-03-27 Thread Sasi Panja
Hi Florin,
The cluster consisted of 1 master, 1 zookeeper, and 5 workers (test env),
and it took about 10 minutes for everything to be up and for the UI console
to be viewable. The storm-deploy script was run from another EC2 machine
running in the same zone.
I was then able to run my topology in less than 2 minutes.

As Marc mentioned, it might be worth trying Michael's Wirbelsturm script,
which seems quite elegant and powerful from the github/blog documentation.
-Sasi


On Thu, Mar 27, 2014 at 4:15 AM, Marc Vaillant vaill...@animetrics.com wrote:

 Hi Florin,

 I just wanted to suggest that you also look at Wirbelsturm by Michael
 Noll as an alternative to storm-deploy
 https://github.com/miguno/wirbelsturm.  I think that you will find it
 more complete, better documented, and more mainstream because it uses
 vagrant and puppet instead of pallet and jclouds.  I highly recommend
 reading Michael's blog post about it:

 http://www.michael-noll.com/blog/2014/03/17/wirbelsturm-one-click-deploy-storm-kafka-clusters-with-vagrant-puppet

 Best,
 Marc




Adding a file as a resource to Storm

2014-03-27 Thread Software Dev
Similar to hadoop, is it possible to add a local file as a resource
that would be available to storm workers?


Re: Using storm for heavy processing

2014-03-27 Thread Naresh Bhatti
Padma, the exceptions were in our application code. We observed the storm
logs on the supervisors to identify the problems.
~Naresh



On Thu, Mar 27, 2014 at 4:50 AM, Nathan Leung ncle...@gmail.com wrote:

 Personally I would not use storm for such long running computations. I
 would lean towards a batch processing system such as hadoop.
 On Mar 27, 2014 12:34 AM, Swara Desai swarade...@gmail.com wrote:

 Hi,
 Can someone please shed some light on this? Has anyone used storm to
 handle long processing, like video transcoding?

 Thanks


 On Wed, Mar 26, 2014 at 4:13 PM, Swara Desai swarade...@gmail.com wrote:

 Hi,
 I am evaluating storm for some processing that might take up to one hour.
 This implies that the bolt would be processing one tuple for an hour.
 Can storm handle cases with such long processing times? Other than
 configuring things like timeouts, have any of you noticed any issues with
 heavy processing?

 Thanks in advance





Re: Adding a file as a resource to Storm

2014-03-27 Thread Mikhail Davidov
What I've done is just embed them in the topology jar and extract them on
prepare().
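
In code, Mikhail's approach works because anything placed under
src/main/resources ends up inside the topology jar, hence on each worker's
classpath, so prepare() can read it with getResourceAsStream(). A minimal
helper sketch (class and resource names illustrative):

```java
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ResourceLoader {
    // Read a file bundled in the jar from the classpath as a UTF-8 string.
    static String readClasspathResource(String name) throws IOException {
        try (InputStream in =
                 ResourceLoader.class.getClassLoader().getResourceAsStream(name)) {
            if (in == null) {
                throw new FileNotFoundException(name + " not found on classpath");
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // In a bolt you would call this from prepare(), e.g. (name illustrative):
    //   lookupTable = readClasspathResource("lookup-table.csv");
}
```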
On Mar 27, 2014 5:16 PM, Software Dev static.void@gmail.com wrote:

 Similar to hadoop, is it possible to add a local file as a resource
 that would be available to storm workers?



Re: Storm + Spring MVC

2014-03-27 Thread Daniel Fagnan
While I haven't used Storm in many months, that's not exactly how Storm is
integrated.

You would typically have a separate Storm cluster that operates independently
of any web app. That is, the Storm cluster starts off with a set of spouts;
these are the input bits. What you most likely want is a message queue like
Kafka or Kestrel (the easier option); this is where the integration with web
apps happens. There are plenty of drivers for these queues for JVM languages.
When you want to send something off to Storm, you put the message/data on the
queue, and Storm ultimately picks it up for processing.

That’s a basic explanation, but, think of Storm as a separate entity that’s 
mostly agnostic to whatever system the data is coming from.

Hope that helps,
Daniel.

On Mar 27, 2014, at 7:16 PM, Jude K j2k...@gmail.com wrote:

 I am curious: has anyone integrated Storm with the Spring Web MVC framework?

 I am new to Storm and working on designing a new app that integrates Storm
 into an existing Spring MVC app.




java.lang.IllegalArgumentException: timeout value is negative - seen in Worker logs

2014-03-27 Thread Binita Bharati
Hi all,

I am using storm-0.9.0.1.

The following error is seen in the worker logs:

2014-03-25 16:18:24 STDIO [ERROR] Mar 25, 2014 4:18:24 PM
org.jboss.netty.channel.DefaultChannelPipeline
WARNING: An exception was thrown by a user handler while handling an
exception event ([id: 0x8068e4b0] EXCEPTION:
java.net.ConnectException: Connection refused)
java.lang.IllegalArgumentException: timeout value is negative
at java.lang.Thread.sleep(Native Method)
at backtype.storm.messaging.netty.Client.reconnect(Client.java:78)
at 
backtype.storm.messaging.netty.StormClientHandler.exceptionCaught(StormClientHandler.java:108)
at 
org.jboss.netty.handler.codec.frame.FrameDecoder.exceptionCaught(FrameDecoder.java:377)
at 
org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:525)
at 
org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:109)
at 
org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:78)
at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at 
org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:41)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)


Relevant Netty config from storm.yaml:
 storm.messaging.transport: backtype.storm.messaging.netty.Context
 storm.messaging.netty.server_worker_threads: 1
 storm.messaging.netty.client_worker_threads: 1
 storm.messaging.netty.buffer_size: 5242880
 storm.messaging.netty.max_retries: 100
 storm.messaging.netty.max_wait_ms: 1000
 storm.messaging.netty.min_wait_ms: 100

Does anyone know why?

Thanks
Binita


need some examples in java (for write on zookeper)

2014-03-27 Thread M Tarkeshwar Rao
Hi all,

Can you please suggest a useful link for ZooKeeper? I need some examples in
Java. I want to write metadata to ZooKeeper using the Curator framework.

Can I implement some kind of signal on a ZooKeeper write? I mean: if one
process writes to ZooKeeper, it should signal another process to start
working on the written data, using watches. Is this possible?
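
A minimal sketch of the watch pattern the question describes, using Curator's
NodeCache recipe. Connect string, path, and payloads are illustrative; it
assumes curator-framework and curator-recipes on the classpath and a running
ZooKeeper:

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.cache.NodeCache;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class ZkWatchSketch {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        String path = "/app/metadata";

        // Watcher side: NodeCache keeps a watched copy of the znode and
        // fires a listener every time its data changes.
        final NodeCache cache = new NodeCache(client, path);
        cache.getListenable().addListener(() -> {
            if (cache.getCurrentData() != null) {
                String data = new String(cache.getCurrentData().getData());
                System.out.println("metadata changed: " + data);
                // ...start working on the written data here...
            }
        });
        cache.start();

        // Writer side (typically a different process): a plain write to the
        // znode is enough to trigger the other process's listener.
        if (client.checkExists().forPath(path) == null) {
            client.create().creatingParentsIfNeeded().forPath(path, "v1".getBytes());
        } else {
            client.setData().forPath(path, "v2".getBytes());
        }
    }
}
```

Note the listener fires asynchronously, so a real watcher process would keep
running (or block on a latch) rather than exiting immediately.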

Regards
Tarkeshwar