On Saturday, February 28, 2015 9:33 PM, Guozhang Wang wangg...@gmail.com
wrote:
Is this what you are looking for?
http://kafka.apache.org/07/documentation.html
On Fri, Feb 27, 2015 at 7:02 PM, Philip O'Toole
philip.oto...@yahoo.com.invalid wrote:
There used to be a very lucid page describing Kafka 0.7, its design,
and the rationale behind certain decisions. I last saw it about 18 months ago.
I can't find it now. Is it still available? I can find the 0.8 version; it's
up on the site.
Any help? Any links?
Philip
To answer your question, I was thinking ephemerals with replication, yes.
With a reservation, it's pretty easy to get e.g. two i2.xlarge for an
amortized cost below a single m2.2xlarge with the same amount of EBS
storage and provisioned IOPS.
On Mon, Sep 29, 2014 at 9:40 PM, Philip O'Toole
philip.oto
If only Kafka had rack awareness, you could run one cluster and set up the
replicas in different AZs.
https://issues.apache.org/jira/browse/KAFKA-1215
As for your question about ephemeral versus EBS, I presume you are proposing to
use ephemeral *with* replicas, right?
Philip
Yes, IMHO, that is going to be way too many topics. Use a smaller number of
topics, and embed attributes like tag and user in the messages written
to Kafka.
Philip
-
http://www.philipotoole.com
On Friday, September 5, 2014 4:21 AM, Sharninder
Agreed. I can't see this being a good use for Kafka.
Philip
-
http://www.philipotoole.com
On Thursday, September 4, 2014 9:57 PM, Sharninder sharnin...@gmail.com wrote:
Since you want all chats and mail history persisted all the time, I
personally
thread (single ConsumerStream) and use
commitOffsets API to commit all partitions managed by each
ConsumerConnector after the thread has finished processing the messages.
Does that solve the problem, Bhavesh?
Gwen
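For illustration, a minimal 0.8-era sketch of the pattern Gwen describes: one
thread per ConsumerConnector, auto-commit disabled, and commitOffsets() called
only after processing. Topic, group, and ZooKeeper address are hypothetical.

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;

    public class ManualCommitConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "localhost:2181"); // hypothetical
            props.put("group.id", "my-group");                // hypothetical
            props.put("auto.commit.enable", "false");         // commit manually, below

            ConsumerConnector connector =
                    Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

            Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                    connector.createMessageStreams(Collections.singletonMap("my-topic", 1));

            // Single stream, single thread: process the message, then commit.
            ConsumerIterator<byte[], byte[]> it = streams.get("my-topic").get(0).iterator();
            while (it.hasNext()) {
                byte[] message = it.next().message();
                process(message);
                // Commits offsets for *all* partitions owned by this connector.
                connector.commitOffsets();
            }
        }

        private static void process(byte[] message) {
            // application-specific work
        }
    }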
On Tue, Sep 2, 2014 at 5:47 PM, Philip O'Toole
philip.oto...@yahoo.com.invalid wrote:
Yeah
The only problem is that the number of connections to Kafka is increased.
*Why* is it a problem?
Philip
Either use the SimpleConsumer, which gives you much finer-grained control, or
(this worked with 0.7) spin up a ConsumerConnector (this is a high-level
consumer concept) per partition, and turn off auto-commit.
Philip
-
http://www.philipotoole.com
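For the SimpleConsumer route, a rough 0.7-style fetch loop looks like the
following (host, topic, and partition are hypothetical; the 0.8 SimpleConsumer
API differs):

    import java.nio.ByteBuffer;

    import kafka.api.FetchRequest;
    import kafka.javaapi.consumer.SimpleConsumer;
    import kafka.javaapi.message.ByteBufferMessageSet;
    import kafka.message.MessageAndOffset;

    public class SimpleFetch {
        public static void main(String[] args) {
            // Hypothetical broker address; the caller owns all offset tracking.
            SimpleConsumer consumer = new SimpleConsumer("broker1", 9092, 30000, 64 * 1024);
            long offset = 0L;
            while (true) {
                // topic "my-topic", partition 0, 300KB max fetch size
                FetchRequest req = new FetchRequest("my-topic", 0, offset, 300 * 1024);
                ByteBufferMessageSet messages = consumer.fetch(req);
                for (MessageAndOffset mo : messages) {
                    ByteBuffer payload = mo.message().payload();
                    // process payload
                    offset = mo.offset(); // in 0.7 this is the offset of the *next* message
                }
            }
        }
    }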
On
only when batch is done.
Thanks,
Bhavesh
On Tue, Sep 2, 2014 at 4:43 PM, Philip O'Toole
philip.oto...@yahoo.com.invalid wrote:
Either use the SimpleConsumer which gives you much finer-grained control,
or (this worked with 0.7) spin up a ConsumerConnector (this is a HighLevel
consumer
management.
Thanks,
Bhavesh
On Tue, Sep 2, 2014 at 5:20 PM, Philip O'Toole
philip.oto...@yahoo.com.invalid wrote:
No, you'll need to write your own failover.
I'm not sure I follow your second question, but the high-level Consumer
should be able to do what you want if you disable auto-commit.
Retention is per topic, per Kafka broker, it is nothing to do with the
Producer. You do not need to restart the Producer for retention changes to take
effect. You do, however, need to restart the broker. Once restarted, all
messages will then be subject to the new policy.
Philip
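The relevant broker settings (0.7-era names, per that version's configuration
page; later releases rename some of these) look roughly like:

    # server.properties -- retention is enforced per topic, per broker
    log.retention.hours=168   # delete log segments older than 7 days
    log.retention.size=-1     # -1 disables size-based retention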
Kafka can ingest any kind of data, and connect to many types of systems. Much
work exists in this area already, for hooking a wide variety of systems to
Kafka. If your system isn't supported, then you write a Kafka Producer to pull
(or receive) messages from your system, and write them to Kafka.
If you haven't studied the docs yet, you should, as this is a broad question which
needs background to understand the answer.
But in summary, the high-level Consumer does more for you, and importantly,
provides balancing between Consumers. The SimpleConsumer does less for you, but
gives you more control.
It's not a bug, right? It's the way the system works (if I have been following
the thread correctly) -- when the retention time passes, the message is gone.
Either consume your messages sooner, or increase your retention time. Kafka is
not magic, it can only do what it's told.
In practice I
On Wed, Aug 20, 2014 at 10:04 AM, Philip O'Toole
philip.oto...@yahoo.com.invalid wrote:
Nice work. That tool I put together was getting a bit old. :-)
I updated the Kafka ecosystem page with details of both tools.
https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem
Philip
- we have low data traffic compared to your figures: around 30 GB a
day. Will it be an issue?
I have personal experience that Kafka deals extremely well with very
low volumes, as well as very high. I have used Kafka for small integration-test
setups, as well as large production systems.
Todd -- can you share details of the ZK cluster you are running, to support
this scale? Is it one single Kafka cluster? Are you using 1 single ZK cluster?
Thanks,
Philip
-
http://www.philipotoole.com
On Monday, August 11, 2014 9:32 PM, Todd Palino
I'd love to know more about what you're trying to do here. It sounds like
you're trying to create topics on a schedule, trying to make it easy to locate
data for a given time range? I'm not sure it makes sense to use Kafka in this
manner.
Can you provide more detail?
Philip
On Mon, Aug 11, 2014 at 5:01 PM, Philip O'Toole
philip.oto...@yahoo.com.invalid wrote:
I'd love to know more about what you're trying to do here. It sounds like
you're trying to create topics on a schedule, trying to make it easy to
locate data for a given time range? I'm not sure it makes sense
another queue system.
Chen
On Mon, Aug 11, 2014 at 6:07 PM, Philip O'Toole
philip.oto...@yahoo.com.invalid wrote:
It's still not clear to me why you need to create so many topics.
Write the data to a single topic and consume it when it arrives. It
doesn't matter if it arrives
cluster, with the hope that the topic deletion api will be
available soon.Meantime just have a cron job cleaning up the outdated
topics from zookeeper.
Let me know what you think,
Thanks,
Chen
On Mon, Aug 11, 2014 at 6:53 PM, Philip O'Toole
philip.oto...@yahoo.com.invalid wrote:
Why
I think the question is what in your consuming application could cause it not
to check in with ZK for longer than the timeout.
-
http://www.philipotoole.com
On Thursday, August 7, 2014 8:16 AM, Jason Rosenberg j...@squareup.com wrote:
Well, it's
A big GC pause in your application, for example, could do it.
Philip
-
http://www.philipotoole.com
On Thursday, August 7, 2014 11:56 AM, Philip O'Toole philip.oto...@yahoo.com
wrote:
I think the question is what in your consuming application
Fluentd might work, or simply configure rsyslog or syslog-ng on the box to
watch the Apache log files and send them to a suitable Producer. (For example,
I wrote something that will accept messages from a syslog client and stream
them to Kafka: https://github.com/otoolep/syslog-gollector)
listeners are in separate async threads (and that's what it
looks like looking at the kafka consumer code).
Maybe I should increase the zk session timeout and see if that helps.
On Thu, Aug 7, 2014 at 2:56 PM, Philip O'Toole
philip.oto...@yahoo.com.invalid wrote:
A big GC pause in your
On Thu, Jul 17, 2014 at 9:28 PM, Philip O'Toole
philip_o_to...@yahoo.com.invalid wrote:
First things first. I friggin' think Kafka rocks. It's a system that has
given me a lot of joy, and I've spent a lot of fun hours (and sometimes not
so fun) looking at consumer lag metrics. I'd like
First things first. I friggin' think Kafka rocks. It's a system that has given
me a lot of joy, and I've spent a lot of fun hours (and sometimes not so fun)
looking at consumer lag metrics. I'd like to give back, beyond spreading the
gospel about it architecturally and operationally.
My only
FWIW, I happen to know Subodh -- we worked together many years back. We
discussed this a little off-the-list, but perhaps my thoughts might be of wider
interest.
Kafka, in my experience, works best when Producers have a persistent TCP
connection to the Broker(s) (and possibly Zookeeper). I
You should find code here that will help you get a HTTP app server together,
that writes to Kafka on the back-end.
https://cwiki.apache.org/confluence/display/KAFKA/Clients
https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem
On Wednesday, July 16, 2014 9:36 PM, Philip O'Toole
I went looking for a Syslog Collector, written in Go, which would stream to
Kafka. I couldn't find any, so put one together myself -- others might be
interested. It optionally performs basic parsing of an RFC5424 header too,
before streaming the messages to Kafka. As always, YMMV.
On Feb 11, 2014, at 7:45 AM, Jun Rao jun...@gmail.com wrote:
We do catch the exception. However, we don't know what to do with it.
Retrying may not fix the problem. So, we just log it and let the thread die.
Thanks,
Jun
On Mon, Feb 10, 2014 at 8:42 PM, Philip O'Toole phi...@loggly.com
Saw this thrown today, which brought down a Consumer thread -- we're using
Consumers built on the High-level consumer framework. What may have
happened here? We are using a custom C++ Producer which does not do
compression, and which hasn't changed in months, but this error is
relatively new to
I should say we *think* this exception brought down the Consumer thread. The
problematic partition on our system was 2-29, so this is definitely the
related thread.
Philip
On Mon, Feb 10, 2014 at 5:00 PM, Philip O'Toole phi...@loggly.com wrote:
Saw this thrown today, which brought down a Consumer
validation failed. Is there any issue
with the network?
Thanks,
Jun
On Mon, Feb 10, 2014 at 5:00 PM, Philip O'Toole phi...@loggly.com wrote:
Saw this thrown today, which brought down a Consumer thread -- we're using
Consumers built on the High-level consumer framework. What may have
Is this a Kafka C++ lib you wrote yourself, or some open-source library?
What version of Kafka?
Philip
On Fri, Jan 31, 2014 at 1:30 PM, Otis Gospodnetic
otis.gospodne...@gmail.com wrote:
Hi,
If Kafka Producer is using a C++ Kafka lib to produce messages, how can
Kafka Consumers written in
The C++ program writes bytes to Kafka, and Java reads bytes from Kafka.
Is there something special about the way the messages are being serialized
in C++?
--Tom
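A trivial sketch of Tom's point: payloads are opaque bytes, so cross-language
interop only requires both sides to agree on an encoding (UTF-8 here is an
assumption, not something stated in the thread):

    import java.nio.charset.StandardCharsets;

    public final class PayloadDecoder {
        // Kafka treats message payloads as opaque bytes; if the C++ producer
        // wrote UTF-8 text, the Java consumer just decodes it the same way.
        public static String decode(byte[] payload) {
            return new String(payload, StandardCharsets.UTF_8);
        }
    }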
On Fri, Jan 31, 2014 at 2:36 PM, Philip O'Toole phi...@loggly.com wrote:
Is this a Kafka C++ lib you wrote yourself, or some open-source
and
data will be lost.
Do you by any chance have a pointer to an existing implementation of such a
producer?
Thanks
On Jan 30, 2014, at 15:13, Philip O'Toole phi...@loggly.com wrote:
What exactly are you struggling with? Your question is too broad. What
you want to do is eminently possible
http://kafka.apache.org/07/configuration.html
Hello -- I can look at the code too, but how does this setting interact
with compression? After all, a Producer doing compression doesn't know the
size of a message on the wire it will send to a Kafka broker until after
it has been compressed. And
What version are you running?
Philip
On Mon, Dec 9, 2013 at 4:30 AM, Sanket Maru san...@ecinity.com wrote:
I am working on a small project and discovered that our consumer hasn't
been executed for over a month now.
How can I check the unprocessed events? From which date the events are
OK, I am only familiar with 0.72.
Philip
On Mon, Dec 9, 2013 at 4:54 AM, Sanket Maru san...@ecinity.com wrote:
I am using kafka 0.8.0
On Mon, Dec 9, 2013 at 6:09 PM, Philip O'Toole phi...@loggly.com wrote:
What version are you running?
Philip
On Mon, Dec 9, 2013 at 4:30 AM
Take apart the hard disk, and flip the magnets in the motors so it spins in
reverse. The Kafka software won't be any the wiser. That should give you
exactly what you need, combined with high-performance sequential reads.
:-D
On Dec 6, 2013, at 7:43 AM, Joe Stein joe.st...@stealth.ly wrote:
Hello,
Say we are using Zookeeper-based Producers, and we specify a topic to be
written to. Since we don't specify the actual brokers, is there a way to
prevent a topic from appearing on a specific broker? What if we set the
topic's partition count to 0 on the broker we don't want it to appear?
We're running 0.72.
Thanks,
Philip
On Thu, Dec 5, 2013 at 4:29 PM, Philip O'Toole phi...@loggly.com wrote:
Hello,
Say we are using Zookeeper-based Producers, and we specify a topic to be
written to. Since we don't specify the actual brokers, is there a way to
prevent a topic from
, if a topic already exists on at least one broker in a cluster, it
won't be created on newly added brokers.
Thanks,
Jun
On Thu, Dec 5, 2013 at 4:29 PM, Philip O'Toole phi...@loggly.com wrote:
Hello,
Say we are using Zookeeper-based Producers, and we specify a topic to be
written
Sweet -- thanks Jun.
On Thu, Dec 5, 2013 at 9:25 PM, Jun Rao jun...@gmail.com wrote:
That's right. Remove the local log dir from brokers that you don't want to
have the topic.
Thanks,
Jun
On Thu, Dec 5, 2013 at 9:22 PM, Philip O'Toole phi...@loggly.com wrote:
Interesting. So if we
Simple tool I wrote to monitor 0.7 consumers.
https://github.com/otoolep/stormkafkamon
On Wed, Dec 4, 2013 at 12:49 PM, David DeMaagd ddema...@linkedin.com wrote:
You can use either the MaxLag MBean (0.8):
http://kafka.apache.org/documentation.html#monitoring
Or the ConsumerOffsetChecker
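For reference, a typical 0.8-era invocation of the offset checker might look
like this (group and topic names are hypothetical; verify the flags against
your release):

    bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
        --zkconnect localhost:2181 --group my-group --topic my-topic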
On Sun, Dec 1, 2013 at 9:59 PM, Joe Stein joe.st...@stealth.ly wrote:
Awesome Philip, thanks for sharing!
On Sun, Dec 1, 2013 at 9:17 PM, Philip O'Toole phi...@loggly.com
wrote:
A couple of us here at Loggly recently spoke at AWS re:Invent, on how we
use Kafka 0.72 in our ingestion
A couple of us here at Loggly recently spoke at AWS re:Invent, on how we use
Kafka 0.72 in our ingestion pipeline. The slides are at the link below, and may
be of interest to people on this list.
By FS I guess you mean file system.
In that case, if one is that concerned, why not run a single Kafka broker on
the same machine, and connect to it over localhost? And disable ZK mode too,
perhaps.
I may be missing something, but I never fully understood why people try really
hard to build
There are many options. Another simple consumer could read from it to a second
broker.
Philip
On Nov 28, 2013, at 4:18 PM, Steve Morin steve.mo...@gmail.com wrote:
Philip,
How would do you mirror this to a main Kafka instance?
-Steve
On Nov 28, 2013, at 16:14, Philip O'Toole phi
...@gmail.com wrote:
Philip, what about if the broker goes down?
I may be missing something.
Diego.
On 28/11/2013 21:09, Philip O'Toole phi...@loggly.com wrote:
By FS I guess you mean file system.
In that case, if one is that concerned, why not run a single Kafka broker
on the same
code structured? Have you open sourced it?
On Nov 28, 2013, at 16:08, Philip O'Toole phi...@loggly.com wrote:
By FS I guess you mean file system.
In that case, if one is that concerned, why not run a single Kafka broker on
the same machine, and connect to it over localhost? And disable
a new ZK session and new connections to the brokers.
Thanks,
Jun
On Tue, Nov 26, 2013 at 9:33 PM, Philip O'Toole phi...@loggly.com wrote:
I want to use a ZK cluster for my Kafka cluster, which is only available
over a cross-country VPN tunnel. The VPN tunnel is prone to resets, every
have the logic for handling ZK session
expirations. So, they should recover automatically. The issue is that if
there is a real failure in the broker/consumer while the VPN is down, the
failure may not be detected.
Thanks,
Jun
On Wed, Nov 27, 2013 at 8:02 AM, Philip O'Toole phi
I want to use a ZK cluster for my Kafka cluster, which is only available over a
cross-country VPN tunnel. The VPN tunnel is prone to resets, every other day or
so, perhaps down for a couple of minutes at a time.
Is this a concern? Any setting changes I should make to mitigate any potential
On Tue, Nov 19, 2013 at 11:51 AM, Philip O'Toole phi...@loggly.com wrote:
Don't get scared, this is perfectly normal and easily fixed. :-) The second
topology attempted to fetch messages from an offset in Kafka that does not
exist. This could happen due to Kafka retention policies (messages
Don't get scared, this is perfectly normal and easily fixed. :-) The second
topology attempted to fetch messages from an offset in Kafka that does not
exist. This could happen due to Kafka retention policies (messages
deleted) or a bug in your code. Your code needs to catch this exception,
and
We use 0.72 -- I am not sure if this matters with 0.8.
Why would one choose a specific partition, as opposed to a random partition?
What design pattern(s) would call for choosing a partition? When is it a good
idea?
Any feedback out there?
Thanks,
Philip
D'oh. Bad config on our part. Something we thought we had fixed long ago, but
it crept back in.
Make sure fetch sizes are big enough!
Philip
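For anyone hitting the same thing: the consumer's fetch size must be at least
as large as the biggest message the broker will accept, or consumption can
stall. A 0.7-era example (property names from that version's configuration
page; values here are illustrative):

    # consumer: must be >= the largest message on the broker (default 307200, i.e. 300KB)
    fetch.size=1048576
    # broker: largest message the broker will accept (default 1000000)
    max.message.size=1000000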
On Oct 31, 2013, at 7:18 PM, Philip O'Toole phi...@loggly.com wrote:
We suddenly started seeing these messages from one our consumers tonight.
What
On Wed, Oct 30, 2013 at 8:13 PM, Lee, Yeon Ok (이연옥) yeono...@ebay.com wrote:
Hi, all.
I just got curious about why Apache Kafka is better than any other message
system in terms of throughput and durability.
Because it's brilliant, that's why. :-)
What is it that lets Kafka have better
You have two choices.
-- Do what you say, and write your own consumer, based on the
SimpleConsumer. Handle all commits, ZK accesses, and balancing yourself.
-- Use a ConsumerConnector for every partition, and call commitOffsets()
explicitly when you have processed a message. This does a commit
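If you take the second option, the key consumer setting is disabling
auto-commit so that commitOffsets() is the only thing that advances offsets.
A one-line sketch (0.7 spelled the property slightly differently from 0.8):

    # 0.7 name; 0.8 renamed this to auto.commit.enable
    autocommit.enable=false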
I would like to second that. It would be real useful.
Philip
On Oct 8, 2013, at 9:31 AM, Jason Rosenberg j...@squareup.com wrote:
What I would like to see is a way for inactive topics to automatically get
removed after they are inactive for a period of time. That might help in
this case.
Is this with 0.7 or 0.8?
On Wed, Oct 2, 2013 at 12:59 PM, Joe Stein crypt...@gmail.com wrote:
Are you sure the consumers are behind? could the pause be because the
stream is empty and producing messages is what is behind the consumption?
What if you shut off your consumers for 5 minutes and
) that both the brokers create a new
topic log for the same topic.
The brokers are in different availability zones. Does that matter?
Suchi
On Fri, Sep 20, 2013 at 4:20 PM, Philip O'Toole phi...@loggly.com wrote:
Seems to me you are confusing partitions and brokers. Partition count has
producer/consumer use zookeeper to
discover brokers.
I can clearly see in the logs(brokers) that both the brokers create a new
topic log for the same topic.
The brokers are in different availability zones. Does that matter?
Suchi
On Fri, Sep 20, 2013 at 4:20 PM, Philip O'Toole phi...@loggly.com wrote:
Seems to me you are confusing partitions and brokers. Partition count has
nothing to do with the number of brokers to which a message is sent -- just
the number of partitions into which that message will be split when it gets
to a broker.
You need to explicitly set the destination brokers in the
:49 PM, Philip O'Toole phi...@loggly.com wrote:
Hello Kafka users and developers,
We at Loggly launched our new system last week, and Kafka is a critical
part. I just wanted to say a sincere thank-you to the Kafka team at
LinkedIn who put this software together. It's really, really great
guys using 0.7 or 0.8?
Jun
On Mon, Sep 9, 2013 at 12:49 PM, Philip O'Toole phi...@loggly.com wrote:
Hello Kafka users and developers,
We at Loggly launched our new system last week, and Kafka is a critical
part. I just wanted to say a sincere thank-you to the Kafka team at
LinkedIn who
It means the first.
Philip
On Thu, Aug 29, 2013 at 8:55 AM, Mark static.void@gmail.com wrote:
If I have 3 brokers with 3 partitions does that mean:
1) I have 3 partitions per broker so I can have up to 9 consumers
or
2) There is only 1 partition per brokers which means I can have
On Thu, Aug 29, 2013 at 11:11 AM, Mark static.void@gmail.com wrote:
Also, are the consumer offsets store in Kafka or Zookeeper?
Zookeeper.
On Aug 29, 2013, at 11:09 AM, Mark static.void@gmail.com wrote:
1) Should a producer be aware of which broker to write to or is this
Yes, the Kafka team has told me that this is how it works (at least for 0.72).
Philip
On Fri, Aug 23, 2013 at 7:53 AM, Yu, Libo libo...@citi.com wrote:
Hi team,
Right now, from a stream, an iterator can be obtained which has a blocking
hasNext().
So what is the implementation behind the
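As an aside, the blocking behaviour can be bounded with the consumer.timeout.ms
setting, which makes the iterator throw ConsumerTimeoutException instead of
blocking forever. A minimal 0.8-era sketch (assumes the ConsumerConfig was
built with, e.g., props.put("consumer.timeout.ms", "1000")):

    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.ConsumerTimeoutException;
    import kafka.consumer.KafkaStream;

    public class BoundedPoll {
        static void poll(KafkaStream<byte[], byte[]> stream) {
            ConsumerIterator<byte[], byte[]> it = stream.iterator();
            try {
                while (it.hasNext()) {
                    byte[] message = it.next().message();
                    // process message
                }
            } catch (ConsumerTimeoutException e) {
                // nothing arrived within consumer.timeout.ms; return to the caller
            }
        }
    }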
I am curious. What is it about your design that requires you to track order so
tightly? Maybe there is another way to meet your needs instead of relying on
Kafka to do it.
Philip
On Aug 22, 2013, at 9:32 PM, Ross Black ross.w.bl...@gmail.com wrote:
Hi,
I am using Kafka 0.7.1, and using the
brokers in production
environments, giving us a total of 24 partitions. Throughput has been superb.
For integration testing however, we usually use just 1 or 2 partitions.
Philip
Thanks in advance!
--Tom
--
Philip O'Toole
Senior Developer
Loggly, Inc.
San Francisco, CA.
www.loggly.com
1 topic.
I don't understand the second question.
Philip
On Aug 21, 2013, at 9:52 AM, Tom Brown tombrow...@gmail.com wrote:
Philip,
How many topics per broker (just one?) And what is the read/write profile
of your setup?
--Tom
On Wed, Aug 21, 2013 at 12:24 PM, Philip O'Toole phi
No, there isn't, not at the very start when there is no state in
Zookeeper. Once there is state, the Kafka team have told me that
rebalancing will not result in any dupes.
However, if there is no state in Zookeeper and your partitions are
empty, simply wait until all consumers have balanced before
If I understand what you are asking, I have dealt successfully with the
same type of issue. It can take more than one Boost async_write() over a
broken connection before the client software notices that the connection is
gone.
The best way to detect if a connection is broken is not by detecting
You set the partition-count to 100 per broker. 3 brokers. 300 partitions total.
Philip
On Thu, Jul 25, 2013 at 11:29 AM, Ian Friedman i...@flurry.com wrote:
Hi guys, apologies in advance for the newb question:
I am running a 3 broker setup, and I have a topic configured with 100
partitions
Have you actually examined the Kafka files on disk, to make sure those
dupes are really there? Or is this a case of reading the same message
more than once?
Philip
On Thu, Jul 18, 2013 at 8:55 AM, Sybrandy, Casey
casey.sybra...@six3systems.com wrote:
Hello,
We recently started seeing
Hello -- we're doing some heavy lifting now with our consumer built on the
high-level consumer framework. We open a ConsumerConnector per partition
within the one JVM,
and are using Kafka 0.72. We saw a burst of the exceptions shown below. Is
this something we should be concerned about? Or is this the normal output
from
down the consumer. Is that the case?
Thanks,
Jun
On Wed, Jul 10, 2013 at 6:43 PM, Philip O'Toole phi...@loggly.com wrote:
Hello -- we're doing some heavy lifting now with our consumer built on the
high-level consumer framework. We open a ConsumerConnector per partition
within the one JVM,
and are using Kafka
It seems like you're not explicitly controlling the offsets. Is that
correct?
If so, the moment you pull a message from the stream, the client framework
considers it processed. So if your app subsequently crashes before the
message is fully processed, and auto-commit updates the offsets in
call to ConsumerConnector is made.
Thanks,
Chris
On Tue, Jul 9, 2013 at 11:21 AM, Philip O'Toole phi...@loggly.com wrote:
It seems like you're not explicitly controlling the offsets. Is that
correct?
If so, the moment you pull a message from the stream, the client
framework
Of course -- make an Offset Request. This can be done in many ways,
Java, Python, C++, Ruby. -1 means get latest offset, if I remember
correctly.
http://people.apache.org/~joestein/kafka-0.7.1-incubating-docs/
It's just bytes on the wire, and bytes come back.
Philip
On Sat, Jun 29, 2013 at
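A rough 0.7-era Java version of that request (the 0.7.1 docs linked above
cover the wire format; host, topic, and partition here are hypothetical):

    import kafka.javaapi.consumer.SimpleConsumer;

    public class LatestOffset {
        public static void main(String[] args) {
            SimpleConsumer consumer = new SimpleConsumer("broker1", 9092, 30000, 64 * 1024);
            // -1L asks for the latest offset; -2L would ask for the earliest.
            long[] offsets = consumer.getOffsetsBefore("my-topic", 0, -1L, 1);
            System.out.println("latest offset: " + offsets[0]);
            consumer.close();
        }
    }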
Are you just killing Kafka, or Zookeeper too?
On Jun 20, 2013, at 8:59 AM, Yu, Libo libo...@citi.com wrote:
Hi,
I have kafka running on a three host cluster. I have a
script that can automatically start zookeepers on all
three hosts and then start kafka servers on them. It
can also kill
You should consider using it regardless. I find 0.72 to be a great system,
which is well designed and reliable.
As for Storm, it depends. If you just want a simple pub-sub queue system,
probably not.
Philip
On Jun 18, 2013, at 6:48 AM, Piyush Rai piyushra...@gmail.com wrote:
I am trying
Another idea: if a set of messages arrives over a single TCP connection, route
them to a partition based on that TCP connection.
To be honest, these approaches, while they work, may not scale when the message
rate is high. If at all possible, try to think of a way to remove this
requirement from your
Depends how important being able to access every single bit of the messages
is, right down to looking at what is on the disk. It's very important to us;
we need that control. The ability to scale throughput as needed is also important -
too important to do anything but run it ourselves. All these
Philip
Thanks,
Jun
On Thu, Jun 13, 2013 at 7:34 PM, Philip O'Toole phi...@loggly.com wrote:
Hello -- is it possible for our code to stall a ConsumerConnector from
doing any consuming for, say, 30 seconds, until we can be sure that
all other ConsumerConnectors are rebalanced?
It seems
at some point. This will block the fetcher
from putting the data into the other queue.
Thanks,
Jun
On Wed, Jun 12, 2013 at 9:10 PM, Philip O'Toole phi...@loggly.com wrote:
Jun -- thanks.
But if the topic is the same, doesn't each thread get a partition?
Isn't that how it works?
Hello -- is it possible for our code to stall a ConsumerConnector from
doing any consuming for, say, 30 seconds, until we can be sure that
all other ConsumerConnectors are rebalanced?
It seems that the first ConsumerConnector to come up is prefetching
some data, and we end up with duplicate
at 7:34 PM, Philip O'Toole phi...@loggly.com wrote:
Hello -- is it possible for our code to stall a ConsumerConnector from
doing any consuming for, say, 30 seconds, until we can be sure that
all other ConsumerConnectors are rebalanced?
It seems that the first ConsumerConnector to come up
dups are expected
during rebalance. In 0.8, such dups are eliminated. Other than that,
rebalance shouldn't cause dups since we commit consumed offsets to ZK
before doing a rebalance.
Thanks,
Jun
On Thu, Jun 13, 2013 at 7:34 PM, Philip O'Toole phi...@loggly.com wrote:
Hello
Hello -- we're using 0.72. We're looking at the source, but want to be sure. :-)
We create a single ConsumerConnector, call createMessageStreams, and
hand the streams off to individual threads. If one of those threads
calls next() on a stream, gets some messages, and then *blocks* in
some
data getting into the consumer for
topic 2.
Thanks,
Jun
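To make the shape of this concrete, a hedged 0.8-era sketch of one connector
feeding several threads (the thread above concerns 0.72, whose stream types
differ slightly); a thread that stalls stops draining only its own queue, and
a full queue eventually back-pressures the shared fetcher:

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;

    public class StreamsAndThreads {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "localhost:2181"); // hypothetical
            props.put("group.id", "my-group");                // hypothetical

            ConsumerConnector connector =
                    Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

            // One connector, four streams, one thread per stream.
            Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                    connector.createMessageStreams(Collections.singletonMap("my-topic", 4));
            for (final KafkaStream<byte[], byte[]> stream : streams.get("my-topic")) {
                new Thread(() -> {
                    ConsumerIterator<byte[], byte[]> it = stream.iterator();
                    while (it.hasNext()) {
                        byte[] message = it.next().message();
                        // process message; blocking here stalls only this stream's
                        // queue until it fills, then the shared fetcher backs off
                    }
                }).start();
            }
        }
    }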
On Wed, Jun 12, 2013 at 7:43 PM, Philip O'Toole phi...@loggly.com wrote:
Hello -- we're using 0.72. We're looking at the source, but want to be
sure. :-)
We create a single ConsumerConnector, call createMessageStreams, and
hand
12, 2013 at 9:10 PM, Philip O'Toole phi...@loggly.com wrote:
Jun -- thanks.
But if the topic is the same, doesn't each thread get a partition?
Isn't that how it works?
Philip
On Wed, Jun 12, 2013 at 9:08 PM, Jun Rao jun...@gmail.com wrote:
Yes, when the consumer is consuming multiple topics
We often replay data days old, and have never seen any issues like
this. We are running 0.72.
Philip
On Mon, Jun 10, 2013 at 11:17 AM, Todd Bilsborrow
tbilsbor...@rhythmnewmedia.com wrote:
We've been running Kafka 0.7.0 in production for several months and have been
quite happy. Our use case
Hello -- I'll try to look at the code, but I'm seeing something here
and I want to be *sure* I'm correct.
Say a batch sitting in a 0.72 partition is, say, 5MB in size. An
instance of a high-level consumer has a configured fetch size of
300KB. This actually becomes the maxSize value, right, in
Thanks,
Neha
On Fri, May 31, 2013 at 11:25 PM, Philip O'Toole phi...@loggly.com wrote:
Hello -- I'll try to look at the code, but I'm seeing something here
and I want to be *sure* I'm correct.
Say a batch sitting in a 0.72 partition is, say, 5MB in size. An
instance of a high-level consumer has
As a test, why not just use a disk with provisioned IOPS of 4000? Just as a
test - see if it improves.
Also, you have not supplied any metrics regarding the VM's performance. Is the
CPU busy? Is IO maxed out? Network? Disk? Use a tool like atop, and tell us
what you find.
Philip
On May 20,