One thing to note is that you should try to avoid JDBC operations in a
bolt, as they may block the bolt and affect the topology's performance. Try
to do the database access asynchronously, or create a separate thread for
JDBC operations.
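For what it's worth, here is a minimal sketch of the "separate thread" option, using the pre-1.0 backtype.storm API that was current at the time. The class name and the insertIntoDb helper are illustrative, not a tested implementation:

import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;

public class AsyncJdbcBolt extends BaseRichBolt {
    private transient ExecutorService jdbcPool;
    private transient OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
        this.jdbcPool = Executors.newFixedThreadPool(4);  // bounded pool for DB work
    }

    @Override
    public void execute(final Tuple tuple) {
        // hand the blocking JDBC call to the pool so execute() returns immediately
        jdbcPool.submit(new Runnable() {
            public void run() {
                try {
                    insertIntoDb(tuple.getString(0));
                    synchronized (collector) {   // older OutputCollectors are not thread-safe
                        collector.ack(tuple);
                    }
                } catch (Exception e) {
                    synchronized (collector) {
                        collector.fail(tuple);
                    }
                }
            }
        });
    }

    @Override
    public void cleanup() {
        jdbcPool.shutdown();
    }

    private void insertIntoDb(String value) {
        // plain JDBC insert elided
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // no downstream stream declared in this sketch
    }
}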
2015-05-14 10:30 GMT-04:00 Mason Yu
The best documentation is here Guaranteeing Message Processing
(View on storm.apache.org)
When you send a tuple from the spout, it sets the message id in the acker.
When a message fans out, the acker XORs the new tuple ids into the
existing value, and when tuples get acked their ids are XOR'd
into the same value again. So fanning out tuples should be quite
lightweight.
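To make the XOR bookkeeping concrete, here is a tiny illustration (not the acker's actual source, just the arithmetic it relies on): every emitted id and every ack is XOR'd into one 64-bit value per spout tuple, and the value returns to zero exactly when each id has been XOR'd in twice.

long ackVal = 0L;
long t1 = new java.util.Random().nextLong();  // tuple ids are random 64-bit values
long t2 = new java.util.Random().nextLong();

ackVal ^= t1;        // spout emits t1
ackVal ^= t2;        // t1 fans out: a bolt emits t2 anchored to the same spout tuple
ackVal ^= t1;        // the bolt acks t1
ackVal ^= t2;        // a later bolt acks t2

assert ackVal == 0L; // tree fully processed; the acker can tell the spout to ack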
Hi,
In Storm we have a normal topology, meaning spouts and bolts etc. In that
document they mention it is at-least-once processing, meaning there is a chance of
processing a tuple more than once?
How can I test that Storm has at-least-once processing? Please help me.
Meanwhile, Trident is exactly-once.
Hello everyone!
I'm currently toying around with a prototype built on top of Storm and have
been running into some rough going while trying to work with Hibernate and
Storm. I was hoping to get input on whether this is just a case of me doing it
wrong, or maybe get some useful tips.
In my
Nathan,
Can you explain in a little more detail what you mean by “When you have more
tasks than executors, the spout thread does the same logic, it just does it for
more tasks during its main loop.” I thought the spout thread emits tuples
based on the max
Maybe I will make an analogy. Think of spout executors as people wrapping
presents. Think of spout tasks as tables where people can wrap presents.
If you have 10 tasks and 1 executor, then you have 10 tasks and 1 person.
The person will wrap a present at one table, then go to the next, wrap a
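(Roughly, the 10-tasks/1-person case in the analogy is declared like this; WordSpout and the numbers are just illustrative:)

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("words", new WordSpout(), 1)  // parallelism hint: 1 executor (the "person")
       .setNumTasks(10);                       // 10 tasks (the "tables") served by that executor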
Interesting. Hibernate hooks inside a J2EE container or Spring,
which requires a specific O/R mapping to a 20th-century RDBMS.
Storm works in a Linux distributed environment which does not
need an RDBMS. RDBMSs do not work in a distributed environment.
Mason Yu Jr.
CEO
Big Data Architects,
The reason objects are serialized is so that they can be shipped to another
process. As long as that's what you want, it follows that you'd have to
share the sessions across processes. I don't think this is possible or wise!
On Thu, May 14, 2015 at 2:58 PM, Stephen Powis spo...@salesforce.com
I am using shuffleGrouping. With one worker it is running well.
On Thu, May 14, 2015 at 4:18 PM, rajesh_kall...@dellteam.com wrote:
Regarding the NotSerializableException:
I would like to see what grouping you have defined between the two bolts
that
Yes, right.
A worker is a JVM instance which can contain one or more executors; each
executor runs a spout, bolt, acker, etc.
On Thursday, May 14, 2015, Asif Ihsan asifihsan.ih...@gmail.com wrote:
You are right about it. What do multiple workers mean? Does it mean
that when I run a topology with a single
It makes sense to me since the bolts can be distributed over different
supervisors and may have different DB connections. In this case, you have to
detach the Hibernate entities from one connection and then re-attach them to
the other, if you want to pass them between bolts.
Sincerely,
Fan
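(A sketch of what Fan describes, using plain Hibernate Session calls; the Order entity and the sessionA/sessionB names are illustrative, and the entity still has to be serializable to travel between workers:)

// Bolt A: load the entity, detach it from bolt A's session, then emit it
Order order = (Order) sessionA.get(Order.class, orderId);
sessionA.evict(order);                  // detach from the first connection
collector.emit(new Values(order));

// Bolt B: re-attach the detached entity to its own session before using it
Order attached = (Order) sessionB.merge(order);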
Does anyone know why the storm command-line program in the bin directory of
the downloads is the storm python script rather than the shell script
that's found in the storm github repo?
We have a minor problem with this script because it explicitly invokes
/usr/bin/python, which is not standard
Hello Folks,
We have a topology currently running in AWS. The Nimbus node was scheduled
for maintenance, which was completed a few hours earlier. After this, the
Topology is running (since the supervisors weren't affected by the
maintenance), but the Nimbus UI isn't reachable and I am assuming the
If I create 6 workers for 5 bolts and 1 spout, how will the spout
and bolts be distributed among the workers?
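(For reference, the setup being asked about is declared roughly like this; component names and parallelism hints are illustrative. Storm spreads each component's executors across the 6 worker JVMs rather than pinning one component per worker.)

Config conf = new Config();
conf.setNumWorkers(6);                         // 6 worker JVMs

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new MySpout(), 2);
builder.setBolt("bolt-1", new BoltOne(), 4).shuffleGrouping("spout");
// ... bolt-2 through bolt-5 declared the same way ...

StormSubmitter.submitTopology("my-topology", conf, builder.createTopology());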
On Thu, May 14, 2015 at 5:34 PM, 임정택 kabh...@gmail.com wrote:
Yes, right.
A worker is a JVM instance which can contain one or more executors; each
executor runs a spout, bolt, acker,
In this specific case, what do you mean by kafka log roll?
How did you correlate the storm problems with the kafka log roll?
Do you have any logs to show?
Thank you for your time!
+
Jeff Maass maas...@gmail.com
linkedin.com/in/jeffmaass
I think that question is probably better asked of the developer's group :
http://mail-archives.apache.org/mod_mbox/storm-dev/
Incidentally, how is the inclusion of the logback.xml file causing you a
problem?
Thank you for your time!
+
Jeff Maass maas...@gmail.com
In prototyping an application, I use a simple pojo (see below) to send data
between bolts. In an earlier version of my bolt, I would simply get the
pojo using tuple.getValue() inside the execute() function, like:
SimplePojo pojo = (SimplePojo) tuple.getValue(0);
But when I scale up and have many
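(A guess at where this is heading, since the POJO definition isn't shown here: once tuples cross worker boundaries the value has to be serializable, e.g. by implementing java.io.Serializable or by registering a Kryo serializer. The fields below are illustrative.)

public class SimplePojo implements java.io.Serializable {
    private static final long serialVersionUID = 1L;
    private String id;
    private long count;
    // getters/setters elided
}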
Hi Jeff,
What makes us believe that is that, when using a spouts-only topology (we
needed to coordinate transactions between three systems and ensure exactly
once semantics) we had better performance, even when feeding from a slower
input. When we added bolts, performance degraded as stuff has to
Isn't the cleanup method only guaranteed to be called when running as a
local topology?
On May 13, 2015 9:20 AM, Jeffery Maass maas...@gmail.com wrote:
Bolts which implement IBolt have a method called cleanup() which is called
by the Storm framework.
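(A minimal sketch of such a bolt, with illustrative names. As the question above suggests, cleanup() is not guaranteed to run on a real cluster, only in local mode or on a graceful shutdown, so it shouldn't be the only place critical resources are released.)

import java.util.Map;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;

public class DbBolt extends BaseRichBolt {
    private transient java.sql.Connection conn;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        // open conn here
    }

    @Override
    public void execute(Tuple tuple) {
        // use conn ...
    }

    @Override
    public void cleanup() {
        // best-effort release; may not be invoked on a cluster
        try {
            if (conn != null) conn.close();
        } catch (java.sql.SQLException ignored) {
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
    }
}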
How do I know how many records are processed per second by the Storm topology?
Please see the screenshot below from the UI; everything is running as a single-node
cluster on the Hortonworks Sandbox.
Please help me understand the metrics below:
Complete Latency, Execute Latency and Process Latency for the below
Thanks Jeffery, great to see an enthusiastic community around Storm.
Do you have any suggestions on log retention and indexing mechanisms
(Logstash etc.)?
From: Jeffery Maass [mailto:maas...@gmail.com]
Sent: Thursday, May 14, 2015 4:42 PM
To:
Regarding the NotSerializableException:
I would like to see what grouping you have defined between the two bolts that
you have experienced the NotSerializableException with.
I am guessing it is localOrShuffleGrouping, can you change it to
shuffleGrouping and
Hi Eran,
Have you checked the Storm UI metrics? Is there a capacity overload?
Also, please check the log files for errors.
Best regards,
Dmytro Dragan
On May 14, 2015 08:45, Eran Chinthaka Withana eran.chinth...@gmail.com
wrote:
Hi Nathan
No, I haven't tried jstack yet.
But I'm just wondering whether
Hi,
I want to know about Trident topologies: the documentation talks about what happens if a tuple fails in
one batch, etc.
What are the possible scenarios that cause tuples to fail?
And if we want to maintain exactly-once processing, should we use Trident state?
When I run a Storm application in normal
Hello!
I would like to know if there is any spout implementation for streaming
data from HDFS to Storm (something similar to Spark Streaming from HDFS). I
know that there is a bolt implementation to write data into HDFS (
https://github.com/ptgoetz/storm-hdfs and
I am using Esper in one of the bolts of the Storm topology. The bolt emits a
MapEventBean array (EventBean[]). With a single worker the topology runs
smoothly. With multiple workers it gives the following error in the emit function
call.
May 14, 2015 2:37:07 PM clojure.tools.logging$eval1$fn__7 invoke
SEVERE:
Thank you all for sharing your thoughts/comments.
Richards Peter.
Hi.
Storm serializes tuples when a tuple has to be sent to another (remote) worker.
In other words, Storm doesn't serialize tuples when the destination task is
local. That's why you didn't hit the error when testing with 1 worker.
MapEventBean seems to not be serializable, so you need to convert it to another
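(Two common ways out, sketched; not tested against Esper, and MyMapEventBeanSerializer is a hypothetical class name: either convert the EventBean into a plain serializable map before emitting, or register a custom Kryo serializer for the type in the topology config.)

// Option 1: emit a plain HashMap of the event's properties instead of the EventBean
Map<String, Object> props = new HashMap<String, Object>();
for (String name : eventBean.getEventType().getPropertyNames()) {
    props.put(name, eventBean.get(name));
}
collector.emit(new Values(props));

// Option 2: teach Kryo how to serialize the type
Config conf = new Config();
conf.registerSerialization(MapEventBean.class, MyMapEventBeanSerializer.class);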
Hi
Any thoughts/comments about this topic?
Thanks,
Richards Peter.
Does anyone have any suggestions on how to handle the Kafka log roll? When this
happens, my topology just stops processing.