Puppet module for deploying Storm released

2014-02-26 Thread Michael G. Noll
Hi everyone,

I have released a Puppet module to deploy Storm 0.9 in case anyone is
interested.

The module uses Puppet parameterized classes and as such decouples code
(Puppet manifests) from configuration data -- hence you can use Puppet
Hiera to configure the way Storm is deployed without having to write or
fork/modify Puppet manifests.  The module is available under the Apache
v2 license.  Any code contributions, bug reports, etc. are of course
very welcome.

The module including docs and examples is available at:
https://github.com/miguno/puppet-storm

Enjoy!
Michael




RE: Storm Applications

2014-02-26 Thread Simon Cooper
Using normal storm, any bolt can output to anything at any time, as each bolt 
runs arbitrary code. So a bolt in the middle of a topology can write to a 
database, or file, or anything else you need. It will likely be the last bolt 
in the topology, but it doesn't have to be.

If you use trident, then you use specific abstractions to read and write data - 
to read, you use a StateFactory and a QueryFunction, and to write, you use a 
StateFactory with a StateUpdater.

If you want to read data from flume, you'll have to write a spout to pull data 
from flume and emit it into a topology. Start with the IRichSpout interface for 
normal storm, or ITridentSpout for trident.

SimonC
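
A minimal sketch of such a spout, using BaseRichSpout (a convenience base
class that implements IRichSpout). The FlumeEventSource client and its
poll() method are hypothetical stand-ins for whatever you use to pull data
out of flume:

import java.util.Map;
import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichSpout;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;

public class FlumeSpout extends BaseRichSpout {
    private SpoutOutputCollector collector;
    private transient FlumeEventSource source; // hypothetical client, not a real Flume API

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
        this.source = new FlumeEventSource(); // connect here, not in the constructor
    }

    @Override
    public void nextTuple() {
        String event = source.poll(); // non-blocking; returns null if nothing is ready
        if (event != null) {
            // the second argument is a message id, so failed tuples can be replayed
            collector.emit(new Values(event), event.hashCode());
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("event"));
    }
}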

From: P lva [mailto:ruvi...@gmail.com]
Sent: 26 February 2014 02:44
To: user@storm.incubator.apache.org
Subject: Storm Applications

Hello Everyone,

I came across storm recently and I'm trying to understand it better.

Storm, unlike Flume, doesn't really have any code for a sink. I read
somewhere that Storm is a real-time stream processing engine where you
don't expect data to land anywhere. What kind of situation would this be?

One example I envision is a situation where you only want to maintain
counters without the actual data itself. Is this right? If yes, I'm
assuming that these counters have to be updated in a database. How does
this affect the performance?

Can I route Flume streams through a Storm cluster to compute the counters
and store the counters in HBase (instead of going Flume -> Hive -> top-10
query), effectively decreasing the number of MapReduce jobs on the Hadoop
cluster?
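
On the counters idea above: a common pattern is to count in memory and
flush periodically, so the database sees one batched write per interval
instead of one write per event. A rough sketch, with the flush target left
abstract:

import java.util.HashMap;
import java.util.Map;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Tuple;

public class CountingBolt extends BaseBasicBolt {
    private static final long FLUSH_EVERY = 1000; // illustrative batch size
    private final Map<String, Long> counts = new HashMap<String, Long>();
    private long seen = 0;

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String key = input.getStringByField("key");
        Long current = counts.get(key);
        counts.put(key, current == null ? 1L : current + 1);
        if (++seen % FLUSH_EVERY == 0) {
            flush(counts);  // e.g. batched HBase increments
            counts.clear();
        }
    }

    private void flush(Map<String, Long> batch) {
        // write the counter deltas to the store of your choice (HBase, MySQL, ...)
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // terminal bolt: nothing is emitted downstream
    }
}

Note that counts buffered in memory are lost if a worker dies; if that
matters, Trident's state abstractions give stronger guarantees.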







Storm Message Size

2014-02-26 Thread Klausen Schaefersinho
Hi,

I have a topology which processes events, aggregates them in some form, and
performs some prediction based on a machine learning (ML) model. Every x
events, one of the bolts involved in the normal processing emits a
"trainModel" event, which is routed to a bolt that is dedicated to the
training. Once the training is done, the new model should be sent back to
the prediction bolt. The topology looks like:


InputSpout -> AggregationBolt -> PredictionBolt -> OutputBolt
                    |                  ^
                    v                  |
                    +-- TrainingBolt --+


The model can get quite large (> 100 MB), so I am not sure how this would
impact the performance of my cluster.  Does anybody have experience with
transmitting large messages?

Also, the training might take a while, so the aggregation bolt should not
trigger the training bolt while it is busy. Is there an established pattern
for achieving this kind of synchronization? I could have some streams to
send states, but then I would mix the data stream with a control stream,
which I would really like to avoid. An alternative would be to use ZooKeeper
and perform the synchronization there. Last but not least, I could also have
the aggregation bolt write into a database and have the training bolt
periodically wake up and read the database. Does anybody have experience
with such a setup?

Kind Regards,

Klaus
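
For what it's worth, one way to keep the control signal out of the data
stream is a dedicated named stream. A sketch, with illustrative names and
thresholds:

import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

public class AggregationBolt extends BaseBasicBolt {
    private static final long TRAIN_EVERY = 10000; // emit a training signal every N events
    private long counter = 0;

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        // normal data goes to the default stream
        collector.emit(new Values(input.getValue(0)));
        if (++counter % TRAIN_EVERY == 0) {
            // control messages travel on their own stream
            collector.emit("control", new Values("trainModel"));
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("event"));                    // data stream
        declarer.declareStream("control", new Fields("command")); // control stream
    }
}

The training bolt would then subscribe only to the control stream, e.g.
builder.setBolt("training", new TrainingBolt()).shuffleGrouping("agg",
"control"), so data and control tuples are kept apart on the wire.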


Re: Storm Message Size

2014-02-26 Thread Enno Shioji
I can't comment on how large tuples fare, but about the synchronization,
would this not make more sense?

InputSpout -> AggregationBolt -> PredictionBolt -> OutputBolt
                    |                  ^
                    v                  |
                Agg. State        Model State
                    |                  ^
                    v                  |
                    +-> TrainingBolt --+

I.e. AggregationBolt writes to AggregationState, which is polled by
TrainingBolt, which writes to ModelState. ModelState is then polled by
PredictionBolt.

This way, you can get rid of the large tuples as well, and instead use
something like S3 for these large states.
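
A sketch of the polling side of this: the prediction bolt checks an
external store for a newer model every so often. ModelStore and Model are
hypothetical placeholders for an S3-backed client and your model type:

import java.util.Map;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

public class PredictionBolt extends BaseBasicBolt {
    private static final long POLL_MS = 60000; // how often to look for a new model
    private transient ModelStore store;        // hypothetical S3-backed store
    private transient Model model;             // hypothetical model type
    private long lastPoll = 0;

    @Override
    public void prepare(Map conf, TopologyContext context) {
        store = new ModelStore();
        model = store.latest();
    }

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        long now = System.currentTimeMillis();
        if (now - lastPoll > POLL_MS) { // cheap time check, no extra threads
            model = store.latest();     // pick up a newly trained model, if any
            lastPoll = now;
        }
        collector.emit(new Values(model.predict(input)));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("prediction"));
    }
}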









Re: Storm Message Size

2014-02-26 Thread Klausen Schaefersinho
THX,

the idea is good, I will keep that in mind. The only drawback is that it
relies on polling, which I do not like too much in the PredictionBolt. Of
course I could also pass S3 or file references around in the messages to
trigger an update. But for the sake of simplicity I was thinking of keeping
everything in Storm and not relying on other systems if possible.

Cheers,

Klaus







Re: Storm Load Balancing

2014-02-26 Thread Sean Allen
Well, 6700 isn't running at all. There's no uptime, so it isn't ever
starting.

6701 appears to have died 20 minutes before you took the screenshot; that
is going to result in load being shuffled around.

So you had 3 functional workers, 6701, 6702, and 6703, and 6701 went down,
leaving 6702 and 6703.

Those are both issues to look into.

Beyond that, you can try doing a rebalance.

What sort of data is being processed? Given you are seeing a wide range on
a single worker, it seems like you have data issues.
Some set of data takes longer, or you are doing a fields grouping on a
field that isn't evenly distributed, etc.
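
To illustrate the fields-grouping point, a sketch of the wiring where a
skewed field funnels load onto a few executors (EventSpout and CountBolt
are hypothetical):

import backtype.storm.topology.TopologyBuilder;
import backtype.storm.tuple.Fields;

public class GroupingExample {
    public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("events", new EventSpout(), 1);  // hypothetical spout
        builder.setBolt("counter", new CountBolt(), 50)   // hypothetical bolt, 50 executors
               // all tuples with the same "userId" go to the same executor, so a
               // hot key overloads one executor while the others sit idle
               .fieldsGrouping("events", new Fields("userId"));
        // .shuffleGrouping("events") would spread tuples evenly instead,
        // at the cost of losing per-key locality
    }
}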









On Tue, Feb 25, 2014 at 9:25 AM, An Tran tra...@gmail.com wrote:

 I am having an issue with Storm load balancing.  I have a bunch of executors
 (50) spread across 4 workers, and it looks like some executors are way over
 capacity while others are idle.  See the attached image for more detail.

 Can you guys explain to me what's going on and how I can fix it?




-- 

Ce n'est pas une signature


STORM with MYSQL optimizations

2014-02-26 Thread masoom alam
Dear All,

Has anybody worked on the configurations/optimizations generally needed
for using Storm with MySQL? Our scenario stores data in MySQL tables, but
as the data rate increases MySQL starts responding very slowly (in some
cases with a "connection refused" error), causing our DBWriterBolt to slow
down. The whole topology is bottlenecked by this issue. We cannot increase
the traffic at the source beyond a certain level; the reason, we noted, is
that the sink (MySQL) or the bolt adjacent to the sink is performing slowly.

Any suggestion on how we should proceed will be highly appreciated.

Thanks.


Re: STORM with MYSQL optimizations

2014-02-26 Thread Sean Allen
How much traffic exactly are you pushing at mysql before the load gets too
high and it starts to buckle under the weight?







-- 

Ce n'est pas une signature


Re: Setting up Storm Cluster

2014-02-26 Thread Sean Allen
There are good basic default configurations for each; there's nothing you
should have to do. Older versions of Storm 0.9.x defaulted to ZeroMQ; the
latest defaults to Netty. I would advise not tuning any parameters of
either until you need to and understand what you are doing.


On Sat, Feb 22, 2014 at 7:01 AM, An Tran tra...@gmail.com wrote:

 Hi,

 I am trying to install the latest version of Storm.  The documentation I
 found (
 http://storm.incubator.apache.org/documentation/Setting-up-a-Storm-cluster.html)
 does not mention ZeroMQ or Netty configuration.  Is this information
 correct and the most up to date?




-- 

Ce n'est pas une signature


Re: STORM with MYSQL optimizations

2014-02-26 Thread masoom alam
1000 Events per second.







Re: STORM with MYSQL optimizations

2014-02-26 Thread Sean Allen
Is your mysql set up to handle 1000 writes a second?

I'm going to guess no. If that is the case then Klaus' suggestions are good
ones. Batch or Shard.
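
A rough sketch of the batching option with plain JDBC (connection string,
table, and columns are made up): many inserts share one round trip and one
commit instead of one per event.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchWriter {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/test", "user", "password");
        conn.setAutoCommit(false); // one commit per batch, not per row
        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO events (id, payload) VALUES (?, ?)");
        for (int i = 0; i < 1000; i++) {
            ps.setInt(1, i);
            ps.setString(2, "event-" + i);
            ps.addBatch();
        }
        ps.executeBatch(); // single round trip for the whole batch
        conn.commit();
        ps.close();
        conn.close();
    }
}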







-- 

Ce n'est pas une signature


Re: STORM with MYSQL optimizations

2014-02-26 Thread masoom alam
@Sean: You are right, MySQL is not configured to handle 1000 events per
second. I will post the results of batching, which is also slow in our case.
I think we should investigate thoroughly why a batch of, for example, 1000
is also slow in our case.

BTW, how easy is it to configure/implement shards in MySQL? Any useful
pointers?
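
For a sense of what application-side sharding involves: each row is routed
to one of several MySQL instances by hashing a key. A minimal sketch with
made-up shard URLs; this bookkeeping (plus resharding, cross-shard queries,
etc.) is the pain referred to in the next reply.

import java.sql.Connection;
import java.sql.DriverManager;

public class ShardRouter {
    private static final String[] SHARDS = {
        "jdbc:mysql://db1:3306/events", // illustrative shard URLs
        "jdbc:mysql://db2:3306/events",
    };

    // route a row to a shard by its key; stable as long as SHARDS is fixed
    public static Connection connectionFor(String key, String user, String pass)
            throws Exception {
        int shard = (key.hashCode() & 0x7fffffff) % SHARDS.length; // non-negative index
        return DriverManager.getConnection(SHARDS[shard], user, pass);
    }
}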





Re: STORM with MYSQL optimizations

2014-02-26 Thread Sean Allen
Sharding is a pain in the ass and should be avoided when possible.

If it's possible, I'd look for another data store that can handle a higher
load as a cluster so you don't have to worry about the details of sharding.









-- 

Ce n'est pas une signature


Re: [RELEASE] Apache Storm 0.9.1-incubating released (defaults.yaml)

2014-02-26 Thread Derek Dagit

The defaults.yaml file is part of the source distribution and is packaged into 
storm's jar when deployed.

In a storm cluster deployment, it is not meant to be on the file system in 
${storm.home}/conf.

Perhaps you are pointing to your source working tree as storm home?
--
Derek

On 2/26/14, 5:59, Lajos wrote:

Quick question on this: defaults.yaml is in both conf and storm-core.jar, so 
the first time you start nimbus 0.9.1 you get this message:

java.lang.RuntimeException: Found multiple defaults.yaml resources. You're 
probably bundling the Storm jars with your topology jar. 
[file:/scratch/projects/apache-storm-0.9.1-incubating/conf/defaults.yaml, 
jar:file:/scratch/projects/apache-storm-0.9.1-incubating/lib/storm-core-0.9.1-incubating.jar!/defaults.yaml]
 at backtype.storm.utils.Utils.findAndReadConfigFile(Utils.java:133) 
~[storm-core-0.9.1-incubating.jar:0.9.1-incubating]
...

Shouldn't conf/defaults.yaml be called like conf/defaults.yaml.copy or 
something? I like that it is in the conf directory, because now I can easily 
see all the config options instead of having to go to the source directory. But 
it shouldn't prevent startup ...

Thanks,

Lajos



On 22/02/2014 21:09, P. Taylor Goetz wrote:

The Storm team is pleased to announce the release of Apache Storm version 
0.9.1-incubating. This is our first Apache release.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.incubator.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.incubator.apache.org/downloads.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 0.9.1-incubating

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Enjoy!

[1]: http://s.apache.org/Ki0 (CHANGELOG)
[2]: https://issues.apache.org/jira/browse/STORM



Re: Storm Message Size

2014-02-26 Thread Adam Lewis
Hi Klaus,

I've been dealing with similar use cases.  I do a couple of things (which
may not be a final solution, but it is interesting to discuss alternate
approaches): I have passed trained models in the 200MB range through storm,
but I try to avoid it. The model gets dropped into persistence and then
only the ID of the model is passed through the topology.  So my training
bolt passes the whole model blob to the persistence bolt and that's it...in
the future I may even remove that step so that the model blob never gets
transferred by storm.  Also, I use separate topologies for training, and
those tend to have much higher timeouts because the training aggregator can
take quite a while.  Traditionally this would probably happen in Hadoop or
some other batch system, but I'm too busy to do the setup and storm is
handling it fine anyway.

I don't have to do any polling because I have model selection running as a
logically different step, i.e. a tuple shows up for prediction, a selection
step runs which finds the model ID for scoring that tuple, and then it
flows on to an actual scoring bolt which retrieves the model based on the
ID and applies it to the tuple.  If the creation of a new model leads you
to re-score old tuples, you could use the model write to trigger those
tuples to be replayed from some source of state such that they will pick up
the new model ID and proceed as normal.

Best,

Adam
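
A sketch of the persist-and-pass-ID pattern described above, with a
hypothetical ModelStore client and a placeholder training step:

import java.util.Map;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

public class TrainingBolt extends BaseBasicBolt {
    private transient ModelStore store; // hypothetical persistence client (S3, HBase, ...)

    @Override
    public void prepare(Map conf, TopologyContext context) {
        store = new ModelStore();
    }

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        byte[] modelBlob = train(input);       // the expensive training step
        String modelId = store.put(modelBlob); // persist the large blob outside storm
        collector.emit(new Values(modelId));   // only the small ID travels as a tuple
    }

    private byte[] train(Tuple input) {
        return new byte[0]; // placeholder for the actual ML training
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("modelId"));
    }
}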










Re: [RELEASE] Apache Storm 0.9.1-incubating released

2014-02-26 Thread Spico Florin
Hello, Padma!
  You can create a Storm cluster on Windows with one node as described
here:
http://ptgoetz.github.io/blog/2013/12/18/running-apache-storm-on-windows/
I was able to set it up following the instructions in this article.
I hope it will help you too.
Regards,
 Florin


On Wed, Feb 26, 2014 at 2:23 PM, padma priya chitturi 
padmapriy...@gmail.com wrote:

 Does the 0.9.1 version have inbuilt support for running on Windows?





Re: [RELEASE] Apache Storm 0.9.1-incubating released (defaults.yaml)

2014-02-26 Thread Lajos

Hi Derek,

Ah! I accidentally unpacked the source on top of the binary, when I meant
to put it in a separate directory. That's the problem, thanks.


Cheers,

L


On 26/02/2014 15:32, Derek Dagit wrote:

The defaults.yaml file is part of the source distribution and is
packaged into storm's jar when deployed.

In a storm cluster deployment, it is not meant to be on the file system
in ${storm.home}/conf.

Perhaps you are pointing to your source working tree as storm home?


Re: Unexpected behavior on message resend

2014-02-26 Thread Harald Kirsch

Hi Adam,

ok, good to know. I resolved to create the tuple from scratch in case it
needs to be resent. I don't see where else in-place modification could hurt
in a linear process. Am I missing something?


Thanks,
Harald.

On 26.02.2014 15:48, Adam Lewis wrote:

I've already gotten slapped around on the list for doing in place
modifications, so let me pass it on :)

Don't modify tuple objects in place.

You shouldn't rely on serialization happening or not happening for
correctness.


On Mon, Feb 24, 2014 at 11:18 AM, Harald Kirsch
harald.kir...@raytion.com wrote:

Hi all,

my TOPOLOGY_MESSAGE_TIMEOUT_SECS was slightly too low. I got a fail
for a tuple and the spout just resent it.

One bolt normalizes a date in place in a field of the tuple. After
the spout resent the tuple, I got errors from the date parser
because the date was already normalized.

Since I currently have only one node, I know of course what happens.
The tuple was just the very same object that was already partially
processed when the timeout hit.

In a distributed setup I envisage the bolt to be on another machine
with a serialized copy of the spout's tuple such that changes to the
tuple are not reflected in the original. Would that be true?

I reckon from this that all processing in bolts needs to be
idempotent if I want to be able to replay failed tuples.

Is that true or am I doing something wrong?

Harald.


--
Harald Kirsch
Raytion GmbH
Kaiser-Friedrich-Ring 74
40547 Duesseldorf
Fon +49-211-550266-0
Fax +49-211-550266-19
http://www.raytion.com




--
Harald Kirsch
Raytion GmbH
Kaiser-Friedrich-Ring 74
40547 Duesseldorf
Fon +49-211-550266-0
Fax +49-211-550266-19
http://www.raytion.com
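
A sketch of the no-mutation version of the date-normalizing bolt from this
thread: the input tuple is treated as read-only and a fresh tuple is
emitted, so a replayed tuple is processed the same way every time. The
normalize() body is a placeholder:

import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

public class DateNormalizerBolt extends BaseBasicBolt {
    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String raw = input.getStringByField("date");
        String normalized = normalize(raw);     // a pure function of the input
        collector.emit(new Values(normalized)); // emit a new tuple downstream
    }

    // placeholder normalizer; idempotent because a normalized date
    // normalizes to itself
    private String normalize(String date) {
        return date.trim();
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("date"));
    }
}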


Re: Unexpected behavior on message resend

2014-02-26 Thread Adam Lewis
In my case it was the state objects created as part of Trident aggregation.
Here is the final message in the thread (i.e. read bottom up):

http://mail-archives.apache.org/mod_mbox/storm-user/201312.mbox/%3CCAAYLz+p4YhF+i3LAkFoyU3nvngZXOusZWXj=0+bynrx0+tg...@mail.gmail.com%3E







Storm cannot run in combination with a recent Hadoop/HBase version.

2014-02-26 Thread Niels Basjes
Hi,

I'm trying to write some Storm bolts and I want them to output the
information they produce into HBase.
Now the HBase we have running here is based on CDH 4.5.0, which is fully
based on the ZooKeeper versions in the 3.4.x range.

The problem I have is that Storm currently still uses ZooKeeper 3.3.3.

The important difference in my case between these two is that
3.3.x has:  org.apache.zookeeper.server.NIOServerCnxn$Factory
3.4.x has:  org.apache.zookeeper.server.NIOServerCnxnFactory

As a consequence I'm getting a ClassNotFoundException.

I found that this problem was fixed for a short period, but the fix was
reverted because of a performance problem in Curator.
https://github.com/nathanmarz/storm/pull/225

What does it take to get this fixed (i.e. zookeeper goes to a 3.4.x
version)?

-- 
Best regards / Met vriendelijke groeten,

Niels Basjes


Re: [DISCUSS] Pulling Contrib Modules into Apache

2014-02-26 Thread P. Taylor Goetz
Thanks for the feedback Bobby.

To clarify, I’m mainly talking about spout/bolt/trident state implementations 
that integrate storm with *Technology X*, where *Technology X* is not a 
fundamental part of storm. 

Examples would be technologies that are part of or related to the Hadoop/Big 
Data ecosystem and enable the Lambda Architecture, e.g.: Kafka, HDFS, HBase, 
Cassandra, etc.

The idea behind having one or more Storm committers act as a “sponsor” is to 
make sure new additions are done carefully and with good reason. To add a new 
module, it would require committer/PPMC consensus, and assignment of one or 
more sponsors. Part of a sponsor’s job would be to ensure that a module is 
maintained, which would require enough familiarity with the code to support it 
long term. If a new module was proposed, but no committers were willing to act 
as a sponsor, it would not be added.

It would be the Committers’/PPMC’s responsibility to make sure things didn’t get 
out of hand, and to do something about it if it does.

Here’s an old Hadoop JIRA thread [1] discussing the addition of Hive as a 
contrib module, similar to what happened with HBase as Bobby pointed out. Some 
interesting points are brought up. The difference here is that both HBase and 
Hive were pretty big codebases relative to Hadoop. With spout/bolt/state 
implementations I doubt we’d see anything along that scale.

- Taylor

[1] https://issues.apache.org/jira/browse/HADOOP-3601


On Feb 26, 2014, at 12:35 PM, Bobby Evans ev...@yahoo-inc.com wrote:

 I can see a lot of value in having a distribution of storm that comes with 
 batteries included, everything is tested together and you know it works.  But 
 I don’t see much long term developer benefit in building them all together.  
 If there is strong coupling between storm and these external projects so that 
 they break when storm changes then we need to understand the coupling and 
 decide if we want to reduce that coupling by stabilizing APIs, improving 
 version numbering and release process, etc.; or if the functionality is 
 something that should be offered as a base service in storm.
 
 I can see politically the value of giving these other projects a home in 
 Apache, and making them sub-projects is the simplest route to that.  I’d love 
 to have storm on yarn inside Apache.  I just don’t want to go overboard with 
 it.  There was a time when HBase was a “contrib” module under Hadoop along 
 with a lot of other things, and the Apache board came and told Hadoop to 
 break it up.
 
 Bringing storm-kafka into storm does not sound like it will solve much from a 
 developer’s perspective, because there is at least as much coupling with 
 kafka as there is with storm.  I can see how it is a huge amount of overhead 
 and pain to set up a new project just for a few hundred lines of code, as 
 such I am in favor of pulling in closely related projects, especially those 
 that are spouts and state implementations. I just want to be sure that we do 
 it carefully, with a good reason, and with enough people who are familiar 
 with the code to support it long term.
 
 If it starts to look like we are pulling in too many projects perhaps we 
 should look at something more like the bigtop project  
 https://bigtop.apache.org/ which produces a tested distribution of Hadoop 
 with many different sub-projects included in it.
 
 I am also a bit concerned about these sub-projects becoming second class 
 citizens, where we break something, but because the build is off by default 
 we don’t know it.  I would prefer that they are built and tested by default.  
 If the build and test time starts to take too long, to me that means we need 
 to start wondering if we have too many contrib modules.
 
 —Bobby
 
 From: Brian Enochson brian.enoch...@gmail.com
 Reply-To: user@storm.incubator.apache.org
 Date: Tuesday, February 25, 2014 at 9:50 PM
 To: user@storm.incubator.apache.org
 Cc: d...@storm.incubator.apache.org
 Subject: Re: [DISCUSS] Pulling Contrib Modules into Apache
 
 hi,
   I am in agreement with Taylor and believe I understand his intent. An 
 incredible tool/framework/application like Storm is only enhanced and gains 
 value from the number of well-maintained and vetted modules that can be used 
 for integration and adding further functionality.
  I am relatively new to the Storm community but have spent quite some time 
 reviewing contributed modules out there, reviewing various duplicates and 
 running into some version incompatibilities. I understand the need to keep 
 Storm itself pure, but do think there needs to be some structure and 
 

Re: [DISCUSS] Pulling Contrib Modules into Apache

2014-02-26 Thread Brian O'Neill

Bobby,

FWIW, I'd love to see storm-yarn inside.  I think we could definitely make
things easier on the end-user if they were more cohesive.

e.g. Imagine if we had "storm launch yarn" inside of $storm/bin that would
kick off a storm-yarn launch, with whatever version was built.  It would
likely simplify the "create-tarball" and storm-yarn getStormConfig process
as well.

-brian

---
Brian O'Neill
Chief Technology Officer

Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 •
healthmarketscience.com

This information transmitted in this email message is for the intended
recipient only and may contain confidential and/or privileged material. If
you received this email in error and are not the intended recipient, or
the person responsible to deliver it to the intended recipient, please
contact the sender at the email above and delete this email and any
attachments and destroy any copies thereof. Any review, retransmission,
dissemination, copying or other use of, or taking any action in reliance
upon, this information by persons or entities other than the intended
recipient is strictly prohibited.
 






On 2/26/14, 4:25 PM, Bobby Evans ev...@yahoo-inc.com wrote:

I totally agree and I am +1 on bringing these spout/trident pieces in,
assuming there are committers to support them.

I am also curious about how people feel about pulling in other projects
like storm-starter, storm-deploy, storm-mesos, and storm-yarn?

Storm-starter in my opinion seems more like documentation and it would be
nice to pull it in so that it stays up to date with storm itself, just like
the documentation.

The others are more ways to run storm in different environments.  They
seem like there could be a lot of coupling between them and storm as
storm evolves, and they kind of fit with "integrate storm with
*Technology X*" except X in this case is a compute environment instead of
a data source or store. But then again we also just shot down a request
to create juju charms for storm.

—Bobby


RE: [DISCUSS] Pulling Contrib Modules into Apache

2014-02-26 Thread Huang, Roger
Bobby,
I vote to include both storm-yarn and storm-deploy.
Roger



Re: Spout missing Acks when a Bolt uses JRuby

2014-02-26 Thread Jonathan Nilsson
Thanks Taylor. I was afraid creating the JRuby runtime this way might be
expensive. Initially I did create it inside of the prepare() method, but I
ran into some trouble because the Ruby class is not serializable. I played
around with it a little more today and had some success creating a static
class to hold my Ruby objects. That way I could initialize it in prepare()
and just call my processing code in execute().

When I initialize my Ruby objects this way I'm generally receiving my acks
the way I expect to, but there are still some that I don't receive. This is
different from when I had the Ruby initialization in execute(). In that
scenario I failed to receive any acks at all.

I'm still new to Storm, so it's possible that I'm just missing something
obvious. The thing is, though, when I take the JRuby code out everything
works fine. Is it possible that I'm just not waiting long enough? In my
tests it takes about 10 seconds for my test data to flow through the
topology (single node) and I shut it down after 30 seconds. The acks I do
receive happen pretty much instantly after I call ack() in the last
bolt though.
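
For reference, a sketch of the prepare()-time initialization Taylor
recommends below; the bolt class and the Ruby scriptlet are illustrative:

import java.util.Map;
import org.jruby.Ruby;
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;

public class JRubyBolt extends BaseRichBolt {
    private transient Ruby runtime; // transient: rebuilt on each worker, never serialized
    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
        this.runtime = Ruby.newInstance(); // expensive; do it once per executor
    }

    @Override
    public void execute(Tuple input) {
        runtime.evalScriptlet("# ... ruby processing here ..."); // placeholder script
        collector.ack(input);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // nothing emitted in this sketch
    }
}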



On Wed, Feb 26, 2014 at 6:13 PM, P. Taylor Goetz ptgo...@gmail.com wrote:

 Hi Jonathan,

 I've used jruby fairly extensively with storm (though with the trident
 API), but it's been a while so I'm rusty.

 Initializing the jruby runtime is very expensive, so you should do that in
 the prepare() method of your bolt. That means you'll have to store it as an
 instance variable in your bolt, which in turn opens the door for potential
 concurrency issues in your jruby code. Be warned.

 It can get kind of crazy. I forget what the magic jruby runtime
 configuration was off hand. But it works.

 I'll try to unarchive those memories and reply.

 -Taylor





  On Feb 25, 2014, at 10:32 PM, Jonathan Nilsson 
 jonathan.nils...@gmail.com wrote:
 
  I'm trying to write a Storm Bolt that does some processing with JRuby.
 When my data goes through this Bolt I see that the Spout does not appear to
 be receiving any acks. I'm pretty sure I'm anchoring my tuples correctly.
 If I take the JRuby Bolt out of the topology everything works fine again.
 In trying to isolate the problem I wrote a Bolt that does no processing at
 all but does call Ruby.getGlobalRuntime(). That call alone seems to be
 enough to stop the acks from flowing. I've boiled the execute method down to
 
 public void execute(Tuple input) {
     Ruby.getGlobalRuntime();
     LOG.info("Sending Ack for " + input);
     collector.ack(input);
 }
 
  and I get the log message but no messages in the Spout's ack or fail
 methods. If I remove the Ruby.getGlobalRuntime(); line everything works.
 I've tried using Ruby.getThreadLocalRuntime() but it doesn't seem to make a
 difference. Has anyone seen a similar problem? Are there any tricks to
 calling JRuby code from within Storm?



Facing Error in storm-deploy

2014-02-26 Thread Gaurav Taank
Hi everyone,

I am trying to deploy the storm on AWS cluster, and getting following
error.

I am using a mac machine, so these are the steps I followed:

1. downloaded lein, made it executable, moved it to /usr/local/bin and ran
it.

2. did a git clone of the storm-deploy code.

3. cd'd into storm-deploy and ran lein deps.

4. made a config.clj file as below. Also did ssh-keygen in ~/.ssh to
create a public/private key pair, which is used in config.clj.

(defpallet
  :services
  {
   :default {
     :blobstore-provider "aws-s3"
     :provider "aws-ec2"
     :environment {:user {:username "storm"  ; this must be "storm"
                          :private-key-path "~/.ssh/id_rsa"
                          :public-key-path "~/.ssh/id_rsa.pub"}
                   :aws-user-id "2517"}
     :identity ""
     :credential ""
     :jclouds.regions "us-east-1"
     }
  })


Then did a:

lein deploy-storm --start --name mycluster --branch 0.8.3

and got the error below. I have followed the steps as described and
cross-checked them in multiple places, but the error remains. Any help
or pointers would be really useful. Thanks in advance! --Gaurav

INFO  execute - Output:

/Users/admin/.ssh/id_rsa.pub


DEBUG execute - out

= /Users/admin/.ssh/id_rsa.pub\n

INFO  execute - Output:

/Users/admin/.ssh/id_rsa


DEBUG execute - out

= /Users/admin/.ssh/id_rsa\n

INFO  execute - Output:

storm


DEBUG execute - out

= storm\n

INFO  execute - Output:

/Users/admin/.ssh/id_rsa.pub


DEBUG execute - out

= /Users/admin/.ssh/id_rsa.pub\n

INFO  execute - Output:

/Users/admin/.ssh/id_rsa


DEBUG execute - out

= /Users/admin/.ssh/id_rsa\n

INFO  execute - Output:

/Users/admin/.ssh/id_rsa.pub


DEBUG execute - out

= /Users/admin/.ssh/id_rsa.pub\n

INFO  execute - Output:

/Users/admin/.ssh/id_rsa


DEBUG execute - out

= /Users/admin/.ssh/id_rsa\n

DEBUG jclouds - Found jclouds sshj driver

DEBUG jclouds - extensions (:log4j :slf4j :sshj)

DEBUG jclouds - options [:jclouds.regions us-east-1 :blobstore-provider
aws-s3]

ERROR logging - Exception in thread main

ERROR logging - com.google.inject.CreationException: Guice creation errors:


1) org.jclouds.rest.RestContext<org.jclouds.aws.ec2.AWSEC2Client, A> cannot
be used as a key; It is not fully specified.


1 error (form-init8975416400432954481.clj:1)

ERROR logging - at clojure.lang.Compiler.eval(Compiler.java:5440)

ERROR logging - at clojure.lang.Compiler.eval(Compiler.java:5415)

ERROR logging - at clojure.lang.Compiler.load(Compiler.java:5857)

ERROR logging - at clojure.lang.Compiler.loadFile(Compiler.java:5820)

ERROR logging - at clojure.main$load_script.invoke(main.clj:221)

ERROR logging - at clojure.main$init_opt.invoke(main.clj:226)

ERROR logging - at clojure.main$initialize.invoke(main.clj:254)

ERROR logging - at clojure.main$null_opt.invoke(main.clj:279)

ERROR logging - at clojure.main$main.doInvoke(main.clj:354)

ERROR logging - at clojure.lang.RestFn.invoke(RestFn.java:422)

ERROR logging - at clojure.lang.Var.invoke(Var.java:369)

ERROR logging - at clojure.lang.AFn.applyToHelper(AFn.java:165)

ERROR logging - at clojure.lang.Var.applyTo(Var.java:482)

ERROR logging - at clojure.main.main(main.java:37)

ERROR logging - Caused by: com.google.inject.CreationException: Guice
creation errors:


1) org.jclouds.rest.RestContext<org.jclouds.aws.ec2.AWSEC2Client, A> cannot
be used as a key; It is not fully specified.


1 error

ERROR logging - at
com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:435)

ERROR logging - at
com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:154)

ERROR logging - at
com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:106)

ERROR logging - at com.google.inject.Guice.createInjector(Guice.java:95)

ERROR logging - at
org.jclouds.ContextBuilder.buildInjector(ContextBuilder.java:324)

ERROR logging - at
org.jclouds.ContextBuilder.buildInjector(ContextBuilder.java:262)

ERROR logging - at
org.jclouds.ContextBuilder.buildView(ContextBuilder.java:524)

ERROR logging - at
org.jclouds.ContextBuilder.buildView(ContextBuilder.java:504)

ERROR logging - at
org.jclouds.compute2$compute_service.doInvoke(compute2.clj:92)

ERROR logging - at clojure.lang.RestFn.applyTo(RestFn.java:147)

ERROR logging - at clojure.core$apply.doInvoke(core.clj:548)

ERROR logging - at clojure.lang.RestFn.invoke(RestFn.java:562)

ERROR logging - at
pallet.compute.jclouds$eval5952$fn__5954.invoke(jclouds.clj:720)

ERROR logging - at clojure.lang.MultiFn.invoke(MultiFn.java:167)

ERROR logging - at pallet.compute$compute_service.doInvoke(compute.clj:36)

ERROR logging - at clojure.lang.RestFn.applyTo(RestFn.java:140)

ERROR logging - at clojure.core$apply.invoke(core.clj:542)

ERROR logging - at