Re: ack in downstream when using all grouping method

2016-12-19 Thread Xunyun Liu
Yes, my processing logic is task id dependent. Thus the behavior of different bolt instances is similar but not exactly the same. This is also the reason why I want some instances to be non-critical, so that they do not affect the ack procedure. I would like to explore the possibility of modifying the a

Re: deploy bolts to a specific supervisor

2016-12-19 Thread Ambud Sharma
Storm workers are supposed to be identical for the most part. You can tune things a little by setting an odd number of executors compared to the worker count. To ideally accomplish what you are trying to do you can: 1. make these "cpu intensive" bolts register their task ids to ZooKeeper or other KV s

Re: Support of topic wildcards in Kafka spout

2016-12-19 Thread Ambud Sharma
No, this is currently not supported. Please open a feature request: https://issues.apache.org/jira/browse/STORM/ so we can vote on it from a community perspective and see if others would be interested in developing this feature. On Tue, Nov 22, 2016 at 5:53 AM, Wijekoon, Manusha < manusha.wijek...

Re: Apache Storm 1.0.2 integration with ElasticSearch 5.0.0

2016-12-19 Thread Ambud Sharma
That is either a wire-level protocol incompatibility with ES, or zen is disabled, or the nodes are not reachable in Elasticsearch. On Mon, Nov 28, 2016 at 7:21 PM, Zhechao Ma wrote: > As far as I know, storm-elasticsearch still doesn't support elasticsearch > 5.0. You can use *Elasticsearch-Hadoop *5.x instead,

Re: Clarity on external/storm-kafka-client

2016-12-19 Thread Ambud Sharma
The Storm-External project has the Kafka spouts and bolts; Storm doesn't directly control the compatibility with Kafka. That being said, the default version of the Kafka integrations will work according to your list, so the answer is yes regarding "compatibility". However, it's still possible to us

Re: Does the bolt in between have the ability to re-emit a failed tuple?

2016-12-19 Thread Ambud Sharma
Replaying of tuples is done from the Spout and not done on a point to point basis like Apache Flume. Either a tuple is completely processed i.e. acked by every single bolt in the pipeline or it's not; if it's not then it will be replayed by the Spout (if the Spout implements a replay logic when th

Re: Storm blobstore expiry

2016-12-19 Thread Ambud Sharma
What type of Blobstore is it? On Thu, Dec 1, 2016 at 1:57 AM, Mostafa Gomaa wrote: > Hello All, > > I am using the storm blobstore to store some dynamic dictionaries, however > I found that the blobstore had been cleared when I restarted the machine. > Is there a way to make the blobstore non-

Re: help on Consuming the data from SSL enabled 0.9 kafka topic

2016-12-19 Thread Ambud Sharma
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_storm-user-guide/content/stormkafka-secure-config.html On Mon, Dec 19, 2016 at 4:49 PM, Ambud Sharma wrote: > Not sure if this helps: > > On Thu, Dec 1, 2016 at 4:11 AM, Srinivas.Veerabomma < > srinivas.veerabo...@target.com> wrote: > >

Re: help on Consuming the data from SSL enabled 0.9 kafka topic

2016-12-19 Thread Ambud Sharma
Not sure if this helps: On Thu, Dec 1, 2016 at 4:11 AM, Srinivas.Veerabomma < srinivas.veerabo...@target.com> wrote: > Hi, > > > > I need some help. Basically looking for some sample Storm code or some > suggestions. > > > > My requirement is to develop a code in Apache Storm latest version to >

Re: topology.debug always true

2016-12-19 Thread Ambud Sharma
Yes, your reasoning is correct. The topology is overriding the debug configuration; Storm allows a topology to override all of the topology-specific settings. On Fri, Dec 2, 2016 at 3:41 AM, Mostafa Gomaa wrote: > I think nimbus configuration is what you have on your nimbus machine, > "topology.d

Re: Worker's Behavior With Heap Limit

2016-12-19 Thread Ambud Sharma
LRU caches are an effective memory management technique for Storm bolts if lookup is what you are trying to do. However, if you are doing in-memory aggregations, I highly recommend sticking with standard Java maps and then checkpointing state to an external data store (HBase, Redis, etc.). Note: * Storm
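For the lookup case mentioned above, a minimal sketch of a size-bounded LRU map built on the JDK's `LinkedHashMap`. The class name `LruCache` and the capacity are assumptions for illustration only; nothing here is a Storm API, and the original message does not say which cache implementation was meant.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Storm): a bounded map that evicts the
// least recently accessed entry once it grows past maxEntries.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true makes iteration order follow recency of access,
        // so the "eldest" entry is the least recently used one.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");     // "a" becomes most recently used
        cache.put("c", 3);  // evicts "b", the least recently used key
        System.out.println(cache.keySet());
    }
}
```

A bolt would typically hold one such map per instance as a field created in `prepare`; since each executor processes tuples on a single thread, no extra synchronization is assumed here.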

Re: How to send multi tick tuple?

2016-12-19 Thread Ambud Sharma
Counting ticks with the modulo operator is the ideal way to do it. Here's an example for you: https://github.com/Symantec/hendrix/blob/current/hendrix-alerts/src/main/java/io/symcpe/hendrix/alerts/SuppressionBolt.java#L136 Some slides explaining what's going on: http://www.slideshare.net/HadoopSummi
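The tick-counting idea can be sketched in plain Java as follows. This is not the code from the linked SuppressionBolt; `TickCounter` is a hypothetical helper, and the 1-second base tick with 5- and 15-tick intervals is an assumed setup. In a real bolt, `onTick()` would be called from `execute` whenever the incoming tuple is a tick tuple, with the base frequency set through `Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS`.

```java
// Hypothetical helper (not a Storm class): derives several slower intervals
// from one base tick stream by counting ticks and applying modulo.
class TickCounter {
    private final int everyNTicks;
    private long ticksSeen = 0;

    TickCounter(int everyNTicks) {
        if (everyNTicks <= 0) {
            throw new IllegalArgumentException("everyNTicks must be positive");
        }
        this.everyNTicks = everyNTicks;
    }

    // Call once per base tick; returns true on every Nth tick.
    boolean onTick() {
        ticksSeen++;
        return ticksSeen % everyNTicks == 0;
    }

    public static void main(String[] args) {
        // Assumed setup: one base tick per second, two derived intervals.
        TickCounter everyFive = new TickCounter(5);
        TickCounter everyFifteen = new TickCounter(15);
        for (int tick = 1; tick <= 30; tick++) {
            boolean five = everyFive.onTick();
            boolean fifteen = everyFifteen.onTick();
            if (five || fifteen) {
                System.out.println("tick " + tick + ": five=" + five + " fifteen=" + fifteen);
            }
        }
    }
}
```

The design choice is to keep a single tick stream per bolt and derive every other period from it, so the derived periods must be multiples of the base tick frequency.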

Re: ack in downstream when using all grouping method

2016-12-19 Thread Ambud Sharma
Storm is a framework built on replays; fundamentally, replays are the way guaranteed event processing is accomplished. Typically all bolt instances in a given registered bolt should be running the same code, unless you are doing some logic based on task ids. This implies that the behavior of bolt instan

Re: ack in downstream when using all grouping method

2016-12-19 Thread Xunyun Liu
Thank you for your answer, Ambud. My use case is that only some of the bolt instances are critical, and I need those to respond to the signal through proper acknowledgment. The rest of them are non-critical and should preferably not interfere with the normal ack process, much like receiving an

Re: ack in downstream when using all grouping method

2016-12-19 Thread Ambud Sharma
Forgot to answer your specific question. Storm message id is internal and will be different so you will see a duplicate tuple with a different id. On Dec 19, 2016 3:59 PM, "Ambud Sharma" wrote: > Yes that is correct. All downstream tuples must be processed for the root > tuple to be acknowledged

Re: ack in downstream when using all grouping method

2016-12-19 Thread Ambud Sharma
Yes, that is correct. All downstream tuples must be processed for the root tuple to be acknowledged. The type of grouping does not change the acking behavior. On Dec 19, 2016 3:53 PM, "Xunyun Liu" wrote: > Hi there, > > As some grouping methods allow sending multiple copies of emitted data to > down
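To illustrate why every copy must be acked, here is a simplified model of the XOR bookkeeping Storm's acker uses for a tuple tree. `AckTracker` is a hypothetical class, not Storm's acker implementation, and the fixed ids stand in for the random 64-bit ids Storm generates; it only shows that the root is complete when every emitted copy has been acked.

```java
// Simplified, hypothetical model of tuple-tree tracking (not Storm's code).
// Each emitted tuple id is XORed in once on emit and once on ack; the tree
// is complete when the running value returns to zero.
class AckTracker {
    private long ackVal = 0L;

    void emitted(long tupleId) {
        ackVal ^= tupleId;
    }

    void acked(long tupleId) {
        ackVal ^= tupleId;
    }

    boolean isComplete() {
        return ackVal == 0L;
    }

    public static void main(String[] args) {
        AckTracker tracker = new AckTracker();
        // All grouping to three tasks: three copies, each with its own id.
        long[] copies = {11L, 22L, 44L};
        for (long id : copies) {
            tracker.emitted(id);
        }
        tracker.acked(11L);
        tracker.acked(22L);
        System.out.println("after two acks complete? " + tracker.isComplete());
        tracker.acked(44L);
        System.out.println("after three acks complete? " + tracker.isComplete());
    }
}
```

In this model, a copy that is never acked leaves the value non-zero, which corresponds to the root tuple timing out and being failed back to the spout.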

ack in downstream when using all grouping method

2016-12-19 Thread Xunyun Liu
Hi there, As some grouping methods allow sending multiple copies of emitted data to downstream bolt instances, I was wondering what will happen if any one of them is not able to ack the tuple due to failures. The underlying question is, when the all grouping method is used, whether the recipie

Re: How to send multi tick tuple?

2016-12-19 Thread Hugo Da Cruz Louro
It is hard to tell without the whole context, but another viable option may be to have a same-thread timer, similar to what we have in the storm-kafka-client KafkaSpout. Please take a look here