Re: Storm Shared Memory

2016-07-13 Thread Xin Wang
Storm currently has no concept of 'shared memory'; you should use external
resources instead. As a reference, you can take a look at `storm-redis`:
https://github.com/apache/storm/tree/master/external/storm-redis.
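
For illustration, a minimal sketch of what cross-JVM shared state via Redis
could look like inside a bolt. It assumes a Redis instance reachable on
localhost:6379 and the Jedis client on the classpath; the bolt name, key and
field are made up for the example (Storm 1.x package names; older releases
use backtype.storm.*):

    import java.util.Map;
    import org.apache.storm.task.OutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichBolt;
    import org.apache.storm.tuple.Tuple;
    import redis.clients.jedis.Jedis;

    public class SharedCountBolt extends BaseRichBolt {
        private transient Jedis jedis;     // created per bolt instance, in prepare()
        private OutputCollector collector;

        @Override
        public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
            this.jedis = new Jedis("localhost", 6379);   // hypothetical Redis host/port
        }

        @Override
        public void execute(Tuple tuple) {
            String word = tuple.getStringByField("word");
            // Atomic increment, visible to every bolt in every worker JVM.
            jedis.hincrBy("tuple-counts", word, 1);
            collector.ack(tuple);
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // this sketch emits nothing downstream
        }
    }

storm-redis packages the same idea as ready-made components (for example
RedisStoreBolt and RedisLookupBolt), so in practice you would usually
configure those rather than hand-roll a Jedis client.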

2016-07-13 21:21 GMT+08:00 Stephen Powis :

> You'd need to use some external data store (such as redis) to maintain
> state that exists across multiple JVMs
>
> On Wed, Jul 13, 2016 at 4:22 AM, jimmy tekli 
> wrote:
>
>> Hello ,
>> I developed a java project based on the storm topology that process
>> incoming tuples in a certain way.I tested it locally and it worked
>> perfectly in the context of streaming tuples of course. My topology is
>>  formed of one spout and three bolts.In my code I used Data Structures
>> (such as HashMaps,linkedHashmap and TreeMap) to store some information
>> concerning the tuple's processing and I declared them static in the
>> topology builder class so they could be shared by all the bolts because
>> they need to accessed and varied by them frequently. The problem is that
>> when i deployed it over a remote cluster these static attributes weren't
>> visible (shared ) by the different bolts (I read about it online and I
>> concluded because of multiple JVMs). My question is if  there is a certain
>> concept concerning "shared memory" in Apache Storm where we can store these
>> DataStructures for them to be shared and accessed by all the bolts at any
>> given time.
>>
>> thanks in advance for your help.
>>
>
>


Re: Storm Shared Memory

2016-07-13 Thread Stephen Powis
You'd need to use some external data store (such as redis) to maintain
state that exists across multiple JVMs

On Wed, Jul 13, 2016 at 4:22 AM, jimmy tekli  wrote:

> Hello ,
> I developed a java project based on the storm topology that process
> incoming tuples in a certain way.I tested it locally and it worked
> perfectly in the context of streaming tuples of course. My topology is
>  formed of one spout and three bolts.In my code I used Data Structures
> (such as HashMaps,linkedHashmap and TreeMap) to store some information
> concerning the tuple's processing and I declared them static in the
> topology builder class so they could be shared by all the bolts because
> they need to accessed and varied by them frequently. The problem is that
> when i deployed it over a remote cluster these static attributes weren't
> visible (shared ) by the different bolts (I read about it online and I
> concluded because of multiple JVMs). My question is if  there is a certain
> concept concerning "shared memory" in Apache Storm where we can store these
> DataStructures for them to be shared and accessed by all the bolts at any
> given time.
>
> thanks in advance for your help.
>


RE: How to use KafkaSpout to consume kafka cluster secured with kerberos

2016-07-13 Thread Ziemer, Tom
Hi everybody,

I am facing the same problem (with Storm 1.0.1 and Kafka 0.9.0.1). Any ideas?

Thanks,
Tom

From: Hao Chen [mailto:h...@apache.org]
Sent: Dienstag, 12. Juli 2016 02:44
To: user@storm.apache.org
Subject: How to use KafkaSpout to consume kafka cluster secured with kerberos

Hi Team,

How can I use KafkaSpout to consume from a Kafka cluster secured with
Kerberos? I can't find much accurate information about the API.

The versions I am using are:

  *   storm: 0.9.3
  *   storm-kafka: 0.10.0
Thanks,
- Hao


Re: Allocating separate memory and workers to topologies of a single jar?

2016-07-13 Thread Navin Ipe
*Solved:*

All of your solutions worked. Thanks.
Describing the solution in more detail for posterity:

According to NathanMarz:
*The slots on each machine are determined by the storm.yaml that the
supervisor has, not the storm.yaml on your machine or the topology config.
You have to restart the supervisor for it to pick up any changes.*

Another person with the same problem and solution:
https://groups.google.com/forum/#!topic/storm-user/CtErpjCPDWA

*What to do if you want to run more than the default 4 topologies:*
The number of topologies you can run on each supervisor is determined by
the number of worker slots available to it. By default it is 4. If you want
to increase it, you have to add the extra ports to the storm.yaml file on
the supervisor nodes.

I searched for the storm.yaml file and added this:






supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
    - 6704
    - 6705

So I got 6 slots and was able to run 5 topologies. On the storm UI, the
supervisor summary now shows:

Slots    Used slots
6        5





On Wed, Jul 13, 2016 at 3:39 PM, Navin Ipe 
wrote:

> Thanks Satish, but that didn't work either. Although 4 of the topologies
> got 1088MB assigned, the 5th topology got 0MB.
>
> Num workers   Num executors   Assigned Mem (MB)
> 1             12              1088
> 1             4               1088
> 1             27              1088
> 1             8               1088
> 0             0               0
>
>
>
> On Wed, Jul 13, 2016 at 3:00 PM, Satish Duggana 
> wrote:
>
>> Config.TOPOLOGY_WORKER_CHILDOPTS:  Options which can override
>> WORKER_CHILDOPTS for a topology. You can configure any java options like
>> memory, gc etc
>>
>> In your case it can be
>> config.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx1g");
>>
>> Thanks,
>> Satish.
>>
>>
>> On Wed, Jul 13, 2016 at 1:45 PM, Navin Ipe <
>> navin@searchlighthealth.com> wrote:
>>
>>> Using *stormConfig.put("topology.**worker.childopts",1024)* gave me
>>> this error:
>>> *java.lang.IllegalArgumentException: Field TOPOLOGY_WORKER_CHILDOPTS
>>> must be an Iterable but was a class java.lang.Integer*
>>>
>>> A bit of looking around showed me that this might be the right syntax: 
>>> *config.put(Config.TOPOLOGY_WORKER_CHILDOPTS,
>>> SOME_OPTS);*
>>>
>>> But couldn't find any example on what Storm expects as SOME_OPTS.
>>>
>>> On Wed, Jul 13, 2016 at 12:38 PM, Spico Florin 
>>> wrote:
>>>
 Hello!
   For the the topology that you have 0MB allocated, for me it seems
 that you don't have enough slots available. Check out the storm.yaml file
 (on your worker machines) how many slots you have allocated.
 (by default the are 4 slots available supervisor.slots.ports:
 - 6700
 - 6701
 - 6702
 - 6703) You have 5 topologies, therefore one is not ran.

 Regarding the memory allocation, you allocate memory per each worker
 (slot available), not per topology. If  you set up for your topology a
 number of workers equal to 1, then you topology will run on a single worker
 (available slot) and will receive the configuration that you gave for your
 worker. If you configure to spread your spout and bolts to multiple
 workers,(that are available in as configured slots)  then all the workers
 will receive the same amount of memory configured globally via
  worker.childopts property in the storm yaml . So you don't configure the
 meory per topology but per worker.

 If you want use different memory allocation for workers depending on
 your topology components memory consumption, then you should use the
 property

 stormConfig.put("topology.worker.childopts",1024)

 I hope it helps.

 Regards,
  Florin

 On Wed, Jul 13, 2016 at 9:23 AM, Navin Ipe <
 navin@searchlighthealth.com> wrote:

> I tried setting stormConfig.put(Config.WORKER_HEAP_MEMORY_MB, 1024);
> and now *all topologies* Assigned memory is 0MB.
> Does anyone know why it works this way? How do we assign memory to
> topologies in this situation?
>
> On Wed, Jul 13, 2016 at 10:39 AM, Navin Ipe <
> navin@searchlighthealth.com> wrote:
>
>> Hi,
>>
>> I have a program *MyProg.java* inside which I'm creating 5
>> topologies and using *stormSubmitter* to submit it to Storm. The
>> storm UI shows the assigned memory for each topology as:
>> *Assigned Mem (MB)*
>> 832
>> 0
>> 832
>> 832
>> 832
>>
>> One of the topologies was assigned 0MB. Assuming that it was because
>> of setting a single Config for all of them, I gave them separate configs
>> (as below), but nothing changed.
>>
>> *StormConfigFactory stormConfigFactory = new StormConfigFactory();*
>>
>>
>>

Re: storm cgroup question

2016-07-13 Thread 李岩
thanks

2016-07-13 4:26 GMT+08:00 Bobby Evans :

> cgroups is still a work in progress; even if the code is in, it is not
> fully tested and ready to go.
>
> - Bobby
>
>
> On Tuesday, July 12, 2016 9:12 AM, 李岩  wrote:
>
>
> Hi All
>
> I have a question:
>
> Does the current Storm 1.0.x release support the cgroups feature? I found
> this doc http://storm.apache.org/releases/1.0.0/cgroups_in_storm.html which
> indicates that it does.
>
> I also found this PR https://github.com/apache/storm/pull/1053, but it is
> not merged into the 1.0.x branch. Does this mean Storm will support cgroups
> in the future, but not in the current 1.0.x release?
>
>
>


Re: Allocating separate memory and workers to topologies of a single jar?

2016-07-13 Thread Navin Ipe
Thanks Satish, but that didn't work either. Although 4 of the topologies
got 1088MB assigned, the 5th topology got 0MB.

Num workers   Num executors   Assigned Mem (MB)
1             12              1088
1             4               1088
1             27              1088
1             8               1088
0             0               0



On Wed, Jul 13, 2016 at 3:00 PM, Satish Duggana 
wrote:

> Config.TOPOLOGY_WORKER_CHILDOPTS:  Options which can override
> WORKER_CHILDOPTS for a topology. You can configure any java options like
> memory, gc etc
>
> In your case it can be
> config.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx1g");
>
> Thanks,
> Satish.
>
>
> On Wed, Jul 13, 2016 at 1:45 PM, Navin Ipe <
> navin@searchlighthealth.com> wrote:
>
>> Using *stormConfig.put("topology.**worker.childopts",1024)* gave me this
>> error:
>> *java.lang.IllegalArgumentException: Field TOPOLOGY_WORKER_CHILDOPTS must
>> be an Iterable but was a class java.lang.Integer*
>>
>> A bit of looking around showed me that this might be the right syntax: 
>> *config.put(Config.TOPOLOGY_WORKER_CHILDOPTS,
>> SOME_OPTS);*
>>
>> But couldn't find any example on what Storm expects as SOME_OPTS.
>>
>> On Wed, Jul 13, 2016 at 12:38 PM, Spico Florin 
>> wrote:
>>
>>> Hello!
>>>   For the the topology that you have 0MB allocated, for me it seems that
>>> you don't have enough slots available. Check out the storm.yaml file (on
>>> your worker machines) how many slots you have allocated.
>>> (by default the are 4 slots available supervisor.slots.ports:
>>> - 6700
>>> - 6701
>>> - 6702
>>> - 6703) You have 5 topologies, therefore one is not ran.
>>>
>>> Regarding the memory allocation, you allocate memory per each worker
>>> (slot available), not per topology. If  you set up for your topology a
>>> number of workers equal to 1, then you topology will run on a single worker
>>> (available slot) and will receive the configuration that you gave for your
>>> worker. If you configure to spread your spout and bolts to multiple
>>> workers,(that are available in as configured slots)  then all the workers
>>> will receive the same amount of memory configured globally via
>>>  worker.childopts property in the storm yaml . So you don't configure the
>>> meory per topology but per worker.
>>>
>>> If you want use different memory allocation for workers depending on
>>> your topology components memory consumption, then you should use the
>>> property
>>>
>>> stormConfig.put("topology.worker.childopts",1024)
>>>
>>> I hope it helps.
>>>
>>> Regards,
>>>  Florin
>>>
>>> On Wed, Jul 13, 2016 at 9:23 AM, Navin Ipe <
>>> navin@searchlighthealth.com> wrote:
>>>
 I tried setting stormConfig.put(Config.WORKER_HEAP_MEMORY_MB, 1024);
 and now *all topologies* Assigned memory is 0MB.
 Does anyone know why it works this way? How do we assign memory to
 topologies in this situation?

 On Wed, Jul 13, 2016 at 10:39 AM, Navin Ipe <
 navin@searchlighthealth.com> wrote:

> Hi,
>
> I have a program *MyProg.java* inside which I'm creating 5 topologies
> and using *stormSubmitter* to submit it to Storm. The storm UI shows
> the assigned memory for each topology as:
> *Assigned Mem (MB)*
> 832
> 0
> 832
> 832
> 832
>
> One of the topologies was assigned 0MB. Assuming that it was because
> of setting a single Config for all of them, I gave them separate configs
> (as below), but nothing changed.
>
> StormConfigFactory stormConfigFactory = new StormConfigFactory();
>
> StormSubmitter.submitTopology(upTopologyName,
>     stormConfigFactory.GetNewConfig(), upBuilder.createTopology());
> StormSubmitter.submitTopology(viTopologyName,
>     stormConfigFactory.GetNewConfig(), viBuilder.createTopology());
> StormSubmitter.submitTopology(prTopologyName,
>     stormConfigFactory.GetNewConfig(), prBuilder.createTopology());
> StormSubmitter.submitTopology(opTopologyName,
>     stormConfigFactory.GetNewConfig(), opBuilder.createTopology());
> StormSubmitter.submitTopology(poTopologyName,
>     stormConfigFactory.GetNewConfig(), poBuilder.createTopology());
>
> And the StormConfigFactory class:
>
> public class StormConfigFactory {
>     public Config GetNewConfig() {
>         Config stormConfig = new Config();
>         stormConfig.setNumWorkers(1);
>         stormConfig.setNumAckers(1);
>         stormConfig.put(Config.TOPOLOGY_DEBUG, false);
>         stormConfig.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 64);
>         stormConfig.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 65536);
>         stormConfig.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 65536);
>         stormConfig.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 50);
>         stormConfig.put(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS, 60);
>         stormConfig.put(Config.STORM_ZOOKEEPER_SERVERS,
>             Arrays.asList(new String[]{"localhost"}));
>         return stormConfig;
>     }
> }

Re: Allocating separate memory and workers to topologies of a single jar?

2016-07-13 Thread Satish Duggana
Config.TOPOLOGY_WORKER_CHILDOPTS: options that can override
WORKER_CHILDOPTS for a topology. You can configure any Java options here,
such as memory and GC settings.

In your case it can be
config.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx1g");

Thanks,
Satish.
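
To make the suggestion concrete, here is a sketch of giving each topology
its own worker heap before submitting it. It reuses the factory and builder
names from the code quoted below; the heap sizes are arbitrary examples:

    // Storm 1.x package names; on 0.x these are backtype.storm.Config
    // and backtype.storm.StormSubmitter.
    Config upConf = stormConfigFactory.GetNewConfig();
    // Must be a JVM-options String (or a list of Strings if your version's
    // validator insists on an Iterable) -- not an Integer like 1024.
    upConf.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx1g");
    StormSubmitter.submitTopology(upTopologyName, upConf,
        upBuilder.createTopology());

    Config viConf = stormConfigFactory.GetNewConfig();
    viConf.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx512m");
    StormSubmitter.submitTopology(viTopologyName, viConf,
        viBuilder.createTopology());

Each worker launched for a given topology then gets that topology's heap
setting, while workers of the other topologies keep their own.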


On Wed, Jul 13, 2016 at 1:45 PM, Navin Ipe 
wrote:

> Using *stormConfig.put("topology.**worker.childopts",1024)* gave me this
> error:
> *java.lang.IllegalArgumentException: Field TOPOLOGY_WORKER_CHILDOPTS must
> be an Iterable but was a class java.lang.Integer*
>
> A bit of looking around showed me that this might be the right syntax: 
> *config.put(Config.TOPOLOGY_WORKER_CHILDOPTS,
> SOME_OPTS);*
>
> But couldn't find any example on what Storm expects as SOME_OPTS.
>
> On Wed, Jul 13, 2016 at 12:38 PM, Spico Florin 
> wrote:
>
>> Hello!
>>   For the the topology that you have 0MB allocated, for me it seems that
>> you don't have enough slots available. Check out the storm.yaml file (on
>> your worker machines) how many slots you have allocated.
>> (by default the are 4 slots available supervisor.slots.ports:
>> - 6700
>> - 6701
>> - 6702
>> - 6703) You have 5 topologies, therefore one is not ran.
>>
>> Regarding the memory allocation, you allocate memory per each worker
>> (slot available), not per topology. If  you set up for your topology a
>> number of workers equal to 1, then you topology will run on a single worker
>> (available slot) and will receive the configuration that you gave for your
>> worker. If you configure to spread your spout and bolts to multiple
>> workers,(that are available in as configured slots)  then all the workers
>> will receive the same amount of memory configured globally via
>>  worker.childopts property in the storm yaml . So you don't configure the
>> meory per topology but per worker.
>>
>> If you want use different memory allocation for workers depending on your
>> topology components memory consumption, then you should use the property
>>
>> stormConfig.put("topology.worker.childopts",1024)
>>
>> I hope it helps.
>>
>> Regards,
>>  Florin
>>
>> On Wed, Jul 13, 2016 at 9:23 AM, Navin Ipe <
>> navin@searchlighthealth.com> wrote:
>>
>>> I tried setting stormConfig.put(Config.WORKER_HEAP_MEMORY_MB, 1024); and
>>> now *all topologies* Assigned memory is 0MB.
>>> Does anyone know why it works this way? How do we assign memory to
>>> topologies in this situation?
>>>
>>> On Wed, Jul 13, 2016 at 10:39 AM, Navin Ipe <
>>> navin@searchlighthealth.com> wrote:
>>>
 Hi,

 I have a program *MyProg.java* inside which I'm creating 5 topologies
 and using *stormSubmitter* to submit it to Storm. The storm UI shows
 the assigned memory for each topology as:
 *Assigned Mem (MB)*
 832
 0
 832
 832
 832

 One of the topologies was assigned 0MB. Assuming that it was because of
 setting a single Config for all of them, I gave them separate configs (as
 below), but nothing changed.

 StormConfigFactory stormConfigFactory = new StormConfigFactory();

 StormSubmitter.submitTopology(upTopologyName,
     stormConfigFactory.GetNewConfig(), upBuilder.createTopology());
 StormSubmitter.submitTopology(viTopologyName,
     stormConfigFactory.GetNewConfig(), viBuilder.createTopology());
 StormSubmitter.submitTopology(prTopologyName,
     stormConfigFactory.GetNewConfig(), prBuilder.createTopology());
 StormSubmitter.submitTopology(opTopologyName,
     stormConfigFactory.GetNewConfig(), opBuilder.createTopology());
 StormSubmitter.submitTopology(poTopologyName,
     stormConfigFactory.GetNewConfig(), poBuilder.createTopology());

 And the StormConfigFactory class:

 public class StormConfigFactory {
     public Config GetNewConfig() {
         Config stormConfig = new Config();
         stormConfig.setNumWorkers(1);
         stormConfig.setNumAckers(1);
         stormConfig.put(Config.TOPOLOGY_DEBUG, false);
         stormConfig.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 64);
         stormConfig.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 65536);
         stormConfig.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 65536);
         stormConfig.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 50);
         stormConfig.put(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS, 60);
         stormConfig.put(Config.STORM_ZOOKEEPER_SERVERS,
             Arrays.asList(new String[]{"localhost"}));
         return stormConfig;
     }
 }


 *How do I allocate separate memory and workers for each topology?*

 --
 Regards,
 Navin

>>>
>>>
>>>
>>> --
>>> Regards,
>>> Navin
>>>
>>
>>
>
>
> --
> Regards,
> Navin
>


Re: MasterBatchCoordinator has become the bottleneck in my trident topology, can the number of MasterBatchCoordinator task larger than 1?

2016-07-13 Thread Satish Duggana
AFAIK, there can be only one MasterBatchCoordinator for each batch group.
You should look at increasing the parallelism of the Trident operations to
get better latencies.

Thanks,
Satish.
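
Since the coordinator itself cannot be scaled out, the usual levers are the
parallelism of the spout and of the operations feeding partitionPersist,
plus the number of batches allowed in flight. A rough sketch follows; the
spout, function, fields and state factory names are placeholders, not taken
from the original topology:

    // Storm 1.x packages; on 0.x Trident lives under storm.trident.*
    import org.apache.storm.Config;
    import org.apache.storm.trident.TridentTopology;
    import org.apache.storm.tuple.Fields;

    TridentTopology topology = new TridentTopology();
    topology.newStream("kafka-spout", kafkaSpout)       // transactional/opaque Kafka spout
            .parallelismHint(8)                         // more executors pulling batches
            .each(new Fields("bytes"),
                  new ParseEvent(),                     // hypothetical parsing function
                  new Fields("rowKey", "value"))
            .parallelismHint(16)                        // scale the per-tuple work
            .partitionPersist(hbaseStateFactory,        // hypothetical HBase StateFactory
                              new Fields("rowKey", "value"),
                              new HBaseUpdater());      // hypothetical StateUpdater

    Config conf = new Config();
    // For Trident this caps the number of batches in flight; raising it lets
    // processing overlap even though commits still pass through the single
    // MasterBatchCoordinator.
    conf.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 10);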

On Wed, Jul 13, 2016 at 12:26 PM, hong mao  wrote:

> I got the reason: there is only one MasterBatchCoordinator for each
> Trident spout, so setting the third parameter (parallelism) has no
> effect.
>
>
> 2016-07-12 0:08 GMT+08:00 hong mao :
>
>> Hi all,
>> We are using a Trident topology to pull messages from Kafka and store
>> them into HBase using partitionPersist. Recently we ran into the following
>> situation: the latency of the topology is increasing and, as shown in the
>> Storm UI, a large portion of the latency is taken by $mastercoord-bg0,
>> which corresponds to MasterBatchCoordinator, and its task parallelism is 1.
>> By checking the source code of TridentTopologyBuilder.java, I find that
>> there is no way to configure the task parallelism of MasterBatchCoordinator
>> (MBC).
>>
>> for(String batch: batchesToCommitIds.keySet()) {
>>> List commitIds = batchesToCommitIds.get(batch);
>>> builder.setSpout(masterCoordinator(batch), new
>>> MasterBatchCoordinator(commitIds, batchesToSpouts.get(batch)));
>>> }
>>>
>>
>> Should I just write a new builder which overrides TridentTopologyBuilder
>> so that I can increase the task parallelism of MasterBatchCoordinator and
>> improve the throughput of the MBC?
>>
>> If that is OK, may I open an issue for this problem?
>>
>>
>> Thanks a lot!
>>
>
>


Re: Storm monitoring

2016-07-13 Thread Julien Nioche
Hi,

I had written a MetricsConsumer for CloudWatch some time ago. The code
would need a bit of reworking; I might release it if I find the time.

For StormCrawler topologies, our Elasticsearch module has a MetricsConsumer
which sends the metrics to an ES index; we use Kibana to display the graphs
and monitor the crawls.

Julien
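
For anyone wanting to wire such a consumer in, registration happens on the
topology Config. A minimal sketch, using the built-in LoggingMetricsConsumer
as a stand-in for whichever CloudWatch/Elasticsearch/InfluxDB consumer you
actually deploy:

    // Storm 1.x packages; on 0.x use backtype.storm.Config and
    // backtype.storm.metric.LoggingMetricsConsumer.
    import org.apache.storm.Config;
    import org.apache.storm.metric.LoggingMetricsConsumer;

    Config conf = new Config();
    // Built-in worker/executor metrics are routed to one consumer task,
    // which can forward them to ES, InfluxDB, Graphite, CloudWatch, ...
    conf.registerMetricsConsumer(LoggingMetricsConsumer.class, 1);

The same registration can also be done cluster-wide through the
topology.metrics.consumer.register section of storm.yaml.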

On 31 May 2016 at 14:19, Matthew Lowe  wrote:

> Thought you all might be interested.
>
> I have now got capacity monitoring working with AWS CloudWatch. These are
> 2 bolts within the same topology:
>
>
>
> #! /bin/bash
>
> API_KEY='XX'
>
> # get all the topology id's
> TOPOLOGY_IDS=`curl -s
> http://:8080/api/v1/topology/summary | json_pp |
> grep '"id"' | sed 's/.* : "\(.*\)".*/\1/'`
> for id in ${TOPOLOGY_IDS} ; do
> TOPOLOGY=`curl -s
> http://XXX:8080/api/v1/topology/${id}?sys=false`
> 
>
> # check the capacity for each task
> CAPACITIES=`echo ${TOPOLOGY} | json_pp | grep '"capacity"' | sed 's/.* :
> "\(.*\)".*/\1/'`
> CAPACITIES_ARRAY=()
> for capacity in ${CAPACITIES} ; do
> CAPACITIES_ARRAY+=("${capacity}")
> done
>
> # get all of the bolt ids
> BOLT_IDS=`echo ${TOPOLOGY} | json_pp | grep '"boltId"' | sed 's/.* :
> "\(.*\)".*/\1/'`
> BOLT_IDS_ARRAY=()
> for boltId in ${BOLT_IDS} ; do
> BOLT_IDS_ARRAY+=("${boltId}")
> done
>
> # send a trigger if needed
> # capacity and bolt id arrays will always be the same size
> for(( x=0; x<${#CAPACITIES_ARRAY[@]}; x++ ));  do
> TIMESTAMP=`date +"%Y-%m-%dT%H:%M:%S.%N"`
> /usr/local/bin/aws cloudwatch put-metric-data --metric-name
> "${BOLT_IDS_ARRAY[$x]}-capacity" --namespace "${id}" --value
> "${CAPACITIES_ARRAY[$x]}" --timestamp "${TIMESTAMP}"
> done
>
>
> done
>
> On Mon, May 30, 2016 at 5:37 PM, Radhwane Chebaane <
> r.cheba...@mindlytix.com> wrote:
>
>>
>> Using the Graphite API was so helpful for us, but since it doesn't support
>> *Tags*, introduced in *InfluxDB 0.9*, we created a *new metrics Consumer*
>> that supports the new InfluxDB API instead of the Graphite API. *Tags* are
>> important when filtering dashboard metrics based on component, bolt or
>> worker name.
>>
>> Regards,
>> Radhwane
>>
>>
>> On 30/05/2016 17:19, Stephen Powis wrote:
>>
>> +1 for graphite and grafana via Verisign's plugin.
>>
>> Using graphite a few years ago was a real game changer for us, and more
>> recently grafana to help build out dashboards instead of copy/pasting
>> graphite urls around.  Here are two different dashboards we have relating to
>> our storm topologies.  We're able to correlate information from all parts
>> of our app, hardware monitoring metrics (via zabbix) and of course storm.
>> Additionally we use seyren on top of graphite for our alerting as well.
>>
>> bolt specific dashboard 
>>
>> Dashboard correlating lots of related information from various sources
>> 
>>
>>
>>
>> On Mon, May 30, 2016 at 9:13 AM, Radhwane Chebaane <
>> r.cheba...@mindlytix.com> wrote:
>>
>>> Hi Matthew,
>>>
>>> We actually use the InfluxData  Stack
>>> (InfluxDb + Grafana).
>>> We send our data directly to a time-series database, *InfluxDB*
>>> . Then, we
>>> visualize metrics with a customizable dashboard, *Grafana*
>>> .
>>> This way, you can have real-time metrics on your Storm topology. You may
>>> also add custom metrics for enhanced monitoring.
>>>
>>> To export Storm metrics to InfluxDB you can use this *MetricsConsumer *which
>>> is compatible with the latest version of InfluxDB and Storm 1.0.0:
>>> https://github.com/mathieuboniface/storm-metrics-influxdb
>>>
>>> Or you can use the old Verisign plug-in with Graphite protocol:
>>> https://github.com/verisign/storm-graphite
>>>
>>> Best regards,
>>> Radhwane CHEBAANE
>>>
>>>
>>> On 30/05/2016 14:47, Matthew Lowe wrote:
>>>
>>> Hello all.
>>>
>>> What kind of monitoring solutions do you use with storm?
>>>
>>> For example I have a bash script that reads the JSON data from the REST UI
>>> and alerts if there are any bolts with high capacities.
>>>
>>> It's only small and hacky, but I am genuinely interested in how you all
>>> monitor your topologies.
>>>
>>> Best Regards
>>> Matthew Lowe
>>>
>>>
>>>
>>
>>
>


-- 

*Open Source Solutions for Text Engineering*

http://www.digitalpebble.com
http://digitalpebble.blogspot.com/
#digitalpebble 


Storm Shared Memory

2016-07-13 Thread jimmy tekli
Hello,
I developed a Java project based on a Storm topology that processes incoming
tuples in a certain way. I tested it locally and it worked perfectly in the
context of streaming tuples, of course. My topology is formed of one spout
and three bolts. In my code I used data structures (such as HashMap,
LinkedHashMap and TreeMap) to store some information concerning the tuples'
processing, and I declared them static in the topology builder class so that
they could be shared by all the bolts, because they need to be accessed and
modified by all of them frequently. The problem is that when I deployed it
over a remote cluster these static attributes weren't visible (shared) by the
different bolts (I read about it online and concluded it is because of the
multiple JVMs). My question is whether there is a concept of "shared memory"
in Apache Storm where we can store these data structures so that they can be
shared and accessed by all the bolts at any given time.

Thanks in advance for your help.

Re: Allocating separate memory and workers to topologies of a single jar?

2016-07-13 Thread Navin Ipe
Using *stormConfig.put("topology.worker.childopts", 1024)* gave me this
error:
*java.lang.IllegalArgumentException: Field TOPOLOGY_WORKER_CHILDOPTS must
be an Iterable but was a class java.lang.Integer*

A bit of looking around showed me that this might be the right syntax:
*config.put(Config.TOPOLOGY_WORKER_CHILDOPTS, SOME_OPTS);*

But I couldn't find any example of what Storm expects as SOME_OPTS.

On Wed, Jul 13, 2016 at 12:38 PM, Spico Florin 
wrote:

> Hello!
>   For the the topology that you have 0MB allocated, for me it seems that
> you don't have enough slots available. Check out the storm.yaml file (on
> your worker machines) how many slots you have allocated.
> (by default the are 4 slots available supervisor.slots.ports:
> - 6700
> - 6701
> - 6702
> - 6703) You have 5 topologies, therefore one is not ran.
>
> Regarding the memory allocation, you allocate memory per each worker (slot
> available), not per topology. If  you set up for your topology a number of
> workers equal to 1, then you topology will run on a single worker
> (available slot) and will receive the configuration that you gave for your
> worker. If you configure to spread your spout and bolts to multiple
> workers,(that are available in as configured slots)  then all the workers
> will receive the same amount of memory configured globally via
>  worker.childopts property in the storm yaml . So you don't configure the
> meory per topology but per worker.
>
> If you want use different memory allocation for workers depending on your
> topology components memory consumption, then you should use the property
>
> stormConfig.put("topology.worker.childopts",1024)
>
> I hope it helps.
>
> Regards,
>  Florin
>
> On Wed, Jul 13, 2016 at 9:23 AM, Navin Ipe <
> navin@searchlighthealth.com> wrote:
>
>> I tried setting stormConfig.put(Config.WORKER_HEAP_MEMORY_MB, 1024); and
>> now *all topologies* Assigned memory is 0MB.
>> Does anyone know why it works this way? How do we assign memory to
>> topologies in this situation?
>>
>> On Wed, Jul 13, 2016 at 10:39 AM, Navin Ipe <
>> navin@searchlighthealth.com> wrote:
>>
>>> Hi,
>>>
>>> I have a program *MyProg.java* inside which I'm creating 5 topologies
>>> and using *stormSubmitter* to submit it to Storm. The storm UI shows
>>> the assigned memory for each topology as:
>>> *Assigned Mem (MB)*
>>> 832
>>> 0
>>> 832
>>> 832
>>> 832
>>>
>>> One of the topologies was assigned 0MB. Assuming that it was because of
>>> setting a single Config for all of them, I gave them separate configs (as
>>> below), but nothing changed.
>>>
>>> StormConfigFactory stormConfigFactory = new StormConfigFactory();
>>>
>>> StormSubmitter.submitTopology(upTopologyName,
>>>     stormConfigFactory.GetNewConfig(), upBuilder.createTopology());
>>> StormSubmitter.submitTopology(viTopologyName,
>>>     stormConfigFactory.GetNewConfig(), viBuilder.createTopology());
>>> StormSubmitter.submitTopology(prTopologyName,
>>>     stormConfigFactory.GetNewConfig(), prBuilder.createTopology());
>>> StormSubmitter.submitTopology(opTopologyName,
>>>     stormConfigFactory.GetNewConfig(), opBuilder.createTopology());
>>> StormSubmitter.submitTopology(poTopologyName,
>>>     stormConfigFactory.GetNewConfig(), poBuilder.createTopology());
>>>
>>> And the StormConfigFactory class:
>>>
>>> public class StormConfigFactory {
>>>     public Config GetNewConfig() {
>>>         Config stormConfig = new Config();
>>>         stormConfig.setNumWorkers(1);
>>>         stormConfig.setNumAckers(1);
>>>         stormConfig.put(Config.TOPOLOGY_DEBUG, false);
>>>         stormConfig.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 64);
>>>         stormConfig.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 65536);
>>>         stormConfig.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 65536);
>>>         stormConfig.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 50);
>>>         stormConfig.put(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS, 60);
>>>         stormConfig.put(Config.STORM_ZOOKEEPER_SERVERS,
>>>             Arrays.asList(new String[]{"localhost"}));
>>>         return stormConfig;
>>>     }
>>> }
>>>
>>>
>>> *How do I allocate separate memory and workers for each topology?*
>>>
>>> --
>>> Regards,
>>> Navin
>>>
>>
>>
>>
>> --
>> Regards,
>> Navin
>>
>
>


-- 
Regards,
Navin


Re: Required field 'nimbus_uptime_secs' is unset!

2016-07-13 Thread Spico Florin
Hi!
  It seems to me that you have to pass the classpath or build a fat
jar. Please have a look at this post:
http://stackoverflow.com/questions/32976198/deploy-storm-topology-remotely-using-storm-jar-command-on-windows
Florin

On Wed, Jul 13, 2016 at 8:09 AM, ram kumar  wrote:

> I also tried setting the option for nimbus_uptime_secs
>
> COMMAND:
> sparse submit -o "nimbus_uptime_secs=5"
>
> Still facing the same issue.
>
> On Tue, Jul 12, 2016 at 3:57 PM, ram kumar 
> wrote:
>
>> Hi all,
>>
>> I am trying to run storm topology in production mode,
>>
>> project.clj
>> (defproject meta "0.0.1-SNAPSHOT"
>>   :source-paths ["topologies"]
>>   :resource-paths ["_resources"]
>>   :target-path "_build"
>>   :min-lein-version "2.0.0"
>>   :jvm-opts ["-client"]
>>   :dependencies  [[org.apache.storm/storm-core "0.10.0"]
>>   [com.parsely/streamparse "0.0.4-SNAPSHOT"]
>>   ]
>>   :jar-exclusions [#"log4j\.properties" #"backtype" #"trident"
>> #"META-INF" #"meta-inf" #"\.yaml"]
>>   :uberjar-exclusions [#"log4j\.properties" #"backtype" #"trident"
>> #"META-INF" #"meta-inf" #"\.yaml"]
>>   )
>>
>>
>> config.json
>> {
>> "library": "",
>> "topology_specs": "topologies/",
>> "virtualenv_specs": "virtualenvs/",
>> "envs": {
>> "prod": {
>> "user": "ram",
>> "nimbus": "10.218.173.100",
>> "workers": ["10.154.9.166"],
>> "log": {
>> "path": "/home/ram/log/storm/streamparse",
>> "max_bytes": 100,
>> "backup_count": 10,
>> "level": "info"
>> },
>> "use_ssh_for_nimbus": true,
>> "use_virtualenv": false,
>> "virtualenv_root": "/home/ram/storm-topologies/meta"
>> }
>> }
>> }
>>
>>
>> When submitting the topology with *sparse submit*, I am
>> getting:
>>
>> java.lang.RuntimeException:
>> org.apache.thrift7.protocol.TProtocolException: *Required field
>> 'nimbus_uptime_secs' is unset!*
>> Struct:ClusterSummary(supervisors:[SupervisorSummary(host:10.154.9.166,
>> uptime_secs:1654461, num_workers:2, num_used_workers:0,
>> supervisor_id:0a07f441-05f8-4103-b439-22cf42a1fcff,
>> version:0.10.0.2.4.2.0-258)], nimbus_uptime_secs:0, topologies:[])
>>
>>
>>
>> Can anyone help me with this,
>> Thanks,
>> Ram
>>
>>
>


Re: Allocating separate memory and workers to topologies of a single jar?

2016-07-13 Thread Spico Florin
Hello!
  For the topology that has 0MB allocated, it seems to me that you don't
have enough slots available. Check in the storm.yaml file (on your worker
machines) how many slots you have allocated. By default there are 4 slots
available:
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
You have 5 topologies, therefore one of them is not run.

Regarding the memory allocation, you allocate memory per worker (slot
available), not per topology. If you set the number of workers for your
topology to 1, then your topology will run on a single worker (available
slot) and will receive the configuration that you gave for your worker. If
you configure your spouts and bolts to spread across multiple workers
(available as configured slots), then all the workers will receive the same
amount of memory, configured globally via the worker.childopts property in
storm.yaml. So you don't configure the memory per topology but per worker.

If you want a different memory allocation for workers depending on your
topology components' memory consumption, then you should use the property

stormConfig.put("topology.worker.childopts",1024)

I hope it helps.

Regards,
 Florin

On Wed, Jul 13, 2016 at 9:23 AM, Navin Ipe 
wrote:

> I tried setting stormConfig.put(Config.WORKER_HEAP_MEMORY_MB, 1024); and
> now *all topologies* Assigned memory is 0MB.
> Does anyone know why it works this way? How do we assign memory to
> topologies in this situation?
>
> On Wed, Jul 13, 2016 at 10:39 AM, Navin Ipe <
> navin@searchlighthealth.com> wrote:
>
>> Hi,
>>
>> I have a program *MyProg.java* inside which I'm creating 5 topologies
>> and using *stormSubmitter* to submit it to Storm. The storm UI shows the
>> assigned memory for each topology as:
>> *Assigned Mem (MB)*
>> 832
>> 0
>> 832
>> 832
>> 832
>>
>> One of the topologies was assigned 0MB. Assuming that it was because of
>> setting a single Config for all of them, I gave them separate configs (as
>> below), but nothing changed.
>>
>> StormConfigFactory stormConfigFactory = new StormConfigFactory();
>>
>> StormSubmitter.submitTopology(upTopologyName,
>>     stormConfigFactory.GetNewConfig(), upBuilder.createTopology());
>> StormSubmitter.submitTopology(viTopologyName,
>>     stormConfigFactory.GetNewConfig(), viBuilder.createTopology());
>> StormSubmitter.submitTopology(prTopologyName,
>>     stormConfigFactory.GetNewConfig(), prBuilder.createTopology());
>> StormSubmitter.submitTopology(opTopologyName,
>>     stormConfigFactory.GetNewConfig(), opBuilder.createTopology());
>> StormSubmitter.submitTopology(poTopologyName,
>>     stormConfigFactory.GetNewConfig(), poBuilder.createTopology());
>>
>> And the StormConfigFactory class:
>>
>> public class StormConfigFactory {
>>     public Config GetNewConfig() {
>>         Config stormConfig = new Config();
>>         stormConfig.setNumWorkers(1);
>>         stormConfig.setNumAckers(1);
>>         stormConfig.put(Config.TOPOLOGY_DEBUG, false);
>>         stormConfig.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 64);
>>         stormConfig.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 65536);
>>         stormConfig.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 65536);
>>         stormConfig.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 50);
>>         stormConfig.put(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS, 60);
>>         stormConfig.put(Config.STORM_ZOOKEEPER_SERVERS,
>>             Arrays.asList(new String[]{"localhost"}));
>>         return stormConfig;
>>     }
>> }
>>
>>
>> *How do I allocate separate memory and workers for each topology?*
>>
>> --
>> Regards,
>> Navin
>>
>
>
>
> --
> Regards,
> Navin
>