Hmmm.. none of the ones there seem like the canonical version, how do I
know which of the ones published there is the one to use?
(I searched for 'kafka' on there...).
Jason
On Tue, Nov 20, 2012 at 10:29 PM, Pierre-Yves Ritschard wrote:
> For what it's worth, I also publish releases on clojars.org
For what it's worth, I also publish releases on clojars.org
On Wed, Nov 21, 2012 at 7:23 AM, Jason Rosenberg wrote:
> +100
> I've been manually creating poms and uploading jars to our nexus repo too,
> not ideal at all
>
> On Tue, Nov 20, 2012 at 6:48 PM, Otis Gospodnetic <
> otis_gospodne...@yahoo.com> wrote:
+100
I've been manually creating poms and uploading jars to our nexus repo too,
not ideal at all
On Tue, Nov 20, 2012 at 6:48 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:
> Eh, correction: I see KAFKA-133 is actually *not* marked for 0.8 release -
> it's just marked as affecting the 0.8 release. :(
David,
One KafkaStream is meant to be iterated by a single thread. A better
approach is to request higher number of streams
from the Kafka consumer and let each process have its own KafkaStream.
Thanks,
Neha
On Tue, Nov 20, 2012 at 9:40 PM, David Ross wrote:
> Hello,
>
> We want to process messages from a single KafkaStream in a number of
> processes.
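Neha's advice above (one KafkaStream per thread, requesting more streams from the consumer) can be sketched as follows. This is a minimal, self-contained simulation: the streams and the consumer are stand-ins (plain lists and an executor), since the real 0.7 consumer API is not reproduced here — only the threading shape is the point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PerThreadStreams {
    public static void main(String[] args) throws Exception {
        int numStreams = 3; // in real code: the stream count requested from the consumer

        // Stand-ins for KafkaStream instances: each "stream" is just a list here.
        List<List<String>> streams = new ArrayList<>();
        for (int i = 0; i < numStreams; i++) {
            List<String> s = new ArrayList<>();
            for (int m = 0; m < 5; m++) s.add("stream-" + i + "-msg-" + m);
            streams.add(s);
        }

        // One thread per stream -- a single stream is never shared across threads.
        ExecutorService pool = Executors.newFixedThreadPool(numStreams);
        AtomicInteger processed = new AtomicInteger();
        for (List<String> stream : streams) {
            pool.submit(() -> {
                for (String message : stream) {   // for (message <- stream) in the Scala snippet
                    processed.incrementAndGet();  // stands in for someBlockingOperation(message)
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(processed.get()); // 15 messages total across 3 streams
    }
}
```

Each thread iterates only the stream it owns, which is the invariant the real API expects.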
Hello,
We want to process messages from a single KafkaStream in a number of
processes. Is it possible to have this code executing in multiple threads
against the same stream?
for (message <- stream) {
someBlockingOperation(message)
}
The scaladocs mention thread safety, but some of the code se
The attribute getCurrentOffset gives the log end offset. It's not
necessarily the log size though since older segments could be deleted.
Thanks,
Jun
On Tue, Nov 20, 2012 at 1:12 PM, Mike Heffner wrote:
> Jun,
>
> Do you have any idea on what the JMX attribute values on the beans "
> kafka:type=kafka.logs.{topic name}-{partition idx}" represent then?
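Jun's point above — that the log end offset keeps growing while the on-disk size does not, once retention deletes old segments — can be illustrated with a small back-of-the-envelope sketch (all numbers are made up):

```java
public class OffsetVsSize {
    public static void main(String[] args) {
        // Illustrative numbers: a log that has appended 1,000,000 units
        // but whose oldest segments (the first 400,000) were deleted by retention.
        long logEndOffset = 1_000_000L;       // what getCurrentOffset reports
        long oldestRetainedOffset = 400_000L; // start of the oldest surviving segment
        long retainedOnDisk = logEndOffset - oldestRetainedOffset;
        // The end offset grows monotonically; the on-disk size does not.
        System.out.println(retainedOnDisk); // 600000
    }
}
```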
Nice, ok, I need to start using 0.8 (is there a semi-stable revision to
start playing with?).
On Tue, Nov 20, 2012 at 2:45 PM, Jay Kreps wrote:
> In 0.7 there is no other way to access stats remotely. Technically the JMX
> is accessible so you can certainly start the broker yourself
>new KafkaServer(...)
Eh, correction: I see KAFKA-133 is actually *not* marked for 0.8 release - it's
just marked as affecting the 0.8 release. :(
Otis
Performance Monitoring for Solr / ElasticSearch / HBase -
http://sematext.com/spm
Pretty pretty pretty please please please from us at Sematext, too. I provided
the instructions in KAFKA-133:
https://issues.apache.org/jira/browse/KAFKA-133?focusedCommentId=13500822&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13500822
I've also pinged zkclie
I believe the (bare minimum) runtime deps are: kafka, scala-library, zookeeper,
and zkclient. Also snappy if you want snappy support.
HTH,
David
On Nov 20, 2012, at 5:44 PM, Jamie Wang wrote:
> Hi,
>
> I am new to using Kafka. I read all the documentations and followed the
> quickstart steps
+1
We have kafka + deps defined in a custom Ivy repo
On Nov 20, 2012, at 7:18 PM, Matthew Rathbone wrote:
> ++ to both Maven packages and multiple Scala versions.
>
> As above, we host our own 2.9.2 build in Nexus. Seems crazy everyone is
> doing the same thing and constantly repeating work.
>
+1, pretty please, please, please.
(we also use Storm, and would love to see published artifacts)
We use sonatype to publish our open source artifacts:
https://docs.sonatype.org/display/Repository/Sonatype+OSS+Maven+Repository+Usage+Guide
It's fairly straightforward. I can help out if you need it.
++ to both Maven packages and multiple Scala versions.
As above, we host our own 2.9.2 build in Nexus. Seems crazy everyone is
doing the same thing and constantly repeating work.
On Tue, Nov 20, 2012 at 5:42 PM, Roman Garcia wrote:
> +1
> We also host our packages (kafka-scala28 and kafka-scala292) on our Nexus
+1
We also host our packages (kafka-scala28 and kafka-scala292) on our Nexus
server
Multiple Scala versions support would be nice as well.
On Tue, Nov 20, 2012 at 9:36 PM, Evan Chan wrote:
> +1.
> We hosted our own built version of Kafka on our Nexus server as well.
>
> -Evan
>
>
> On Tue, Nov 20, 2012 at 3:27 PM, Chris Riccomini wrote:
+1.
We hosted our own built version of Kafka on our Nexus server as well.
-Evan
On Tue, Nov 20, 2012 at 3:27 PM, Chris Riccomini wrote:
> Hey Guys,
>
> I was talking with Jay, and he recommended I forward some feedback along.
>
> I have been playing with Kafka 0.8 this week, and am feeling the
Hey Guys,
I was talking with Jay, and he recommended I forward some feedback along.
I have been playing with Kafka 0.8 this week, and am feeling the pain in the
lack of Maven support for it. Specifically, it'd be nice if this stuff were:
1. In Apache's SNAPSHOT repository
2. In some relea
In 0.7 there is no other way to access stats remotely. Technically the JMX
is accessible so you can certainly start the broker yourself
new KafkaServer(...)
and just add a wrapper that calls the methods you are interested in, but if
you are doing this from java it may be a bit awkward to reach i
Hi,
I am new to using Kafka. I read all the documentations and followed the
quickstart steps. I was able to run the sample kafka system. Looking through
the Kafka directories extracted from the tar file, there are a lot of sub
directories. I am wondering if they are all really needed to run
kafka.
Hi,
I would like to expose some of the kafka stats that appear in the current
kafka jmx mbeans.
In our system we are using the yammer metrics library (instead of polling
jmx), so I'd like to wrap the stats and expose them as yammer metrics
elements, etc.
Looking at the code, it doesn't seem easy
Jun,
Do you have any idea on what the JMX attribute values on the beans "
kafka:type=kafka.logs.{topic name}-{partition idx}" represent then? It
seems like these should correctly represent the current offsets of the
producer logs? They appeared to track correctly for a while, but once the
log size
Evan,
That's correct. The Storm ZK consumer path for us is:
/{prefix}/{spout name}/10.x.x.x:9092:{partition}
and is a JSON blob. ConsumerOffsetChecker would then not work for this.
Mike
On Tue, Nov 20, 2012 at 12:11 PM, Evan Chan wrote:
> Mike,
>
> I'm not sure the Storm-bundled kafka stores offsets in the same ZK
> locations as the regular Kafka consumer.
I think this may be a terminology issue. By "re-partitioning" I think Neha
means taking data currently on disk and splitting it into a different
number of partitions on different servers. We can't really do this because
the partition function is something computed on the client.
A different issue
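The point that the partition function is computed on the client can be illustrated with a sketch of a typical hash partitioner. This is a hypothetical shape, not the actual 0.7/0.8 default partitioner; it shows why the brokers cannot re-split existing data — only the producer knows how keys map to partitions, and changing the partition count remaps them.

```java
public class ClientSidePartitioning {
    // Typical hash-partitioner shape (hypothetical; real implementations differ).
    static int partitionFor(String key, int numPartitions) {
        // key.hashCode() % numPartitions has magnitude < numPartitions,
        // so Math.abs is safe here.
        return Math.abs(key.hashCode() % numPartitions);
    }

    public static void main(String[] args) {
        String key = "user-42"; // illustrative key
        int before = partitionFor(key, 4); // cluster with 4 partitions
        int after  = partitionFor(key, 8); // after growing to 8 partitions
        // The producer decides the mapping; both results are valid partition
        // indices, but the same key may land somewhere else after the change.
        System.out.println(before >= 0 && before < 4);
        System.out.println(after >= 0 && after < 8);
    }
}
```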
Docs are not updated since 0.8 is not yet released.
Thanks,
Neha
On Tue, Nov 20, 2012 at 11:09 AM, Jason Rosenberg wrote:
> Is there a configuration doc page for 0.8 (since apparently there are some
> new settings)?
>
> Jason
>
> On Tue, Nov 20, 2012 at 10:39 AM, Jun Rao wrote:
>
>> That's right. VIP is only used for getting metadata.
Hi Neha,
Thanks for the response; we're currently working to integrate with the
exposed mbeans via our collectors and monitor them.
It would be great to know: if repartitioning is not supported, can we
move the files from one partition to another to pick up? Will
that work?
Noads-8:
total 9886868
Is there a configuration doc page for 0.8 (since apparently there are some
new settings)?
Jason
On Tue, Nov 20, 2012 at 10:39 AM, Jun Rao wrote:
> That's right. VIP is only used for getting metadata. All producer send
> requests are through direct RPC to each broker.
>
> Thanks,
>
> Jun
>
> On
Zookeeper server version 3.3.3 is pretty buggy and has known
session expiration and unexpected ephemeral node deletion bugs.
Please upgrade to 3.3.4 and retry.
Thanks,
Neha
On Tue, Nov 20, 2012 at 10:42 AM, Xiaoyu Wang wrote:
> Hello everybody,
>
> We have run into this problem a few times in the past week.
Hello everybody,
We have run into this problem a few times in the past week. The symptom is
that some broker disappears from zookeeper while the broker itself appears
to be healthy. After that, producers start logging lots of stale ZK
producer cache messages and stop making any progress.
"logger.info("Try #" + numRet
You can try to put all brokers in a vip and expose the vip to the producer.
If there is no vip, it takes the same amount of effort as moving a zk cluster
to a new set of hosts.
Thanks,
Jun
On Tue, Nov 20, 2012 at 10:20 AM, David Arthur wrote:
> If I understand correctly, the brokers stay informed about one another
> through ZooKeeper and therefore any broker can give info about any other
> broker?
That's right. VIP is only used for getting metadata. All producer send
requests are through direct RPC to each broker.
Thanks,
Jun
On Tue, Nov 20, 2012 at 10:28 AM, Jason Rosenberg wrote:
> Ok,
>
> I think I understand (so I'll need to change some things in our set up to
> work with 0.8).
>
>
>> So the VIP is only for getting meta-data? After that, under the covers,
the producers will make direct connections to individual kafka hosts that
they learned about from connecting through the VIP?
That's right.
Thanks for your questions !
On Tue, Nov 20, 2012 at 10:28 AM, Jason Rosenberg wrote:
Ok,
I think I understand (so I'll need to change some things in our set up to
work with 0.8).
So the VIP is only for getting meta-data? After that, under the covers,
the producers will make direct connections to individual kafka hosts that
they learned about from connecting through the VIP?
Jason
If I understand correctly, the brokers stay informed about one another through
ZooKeeper and therefore any broker can give info about any other broker?
This is an interesting approach. What would happen if your broker list changed
dramatically over time?
On Nov 20, 2012, at 1:02 PM, Neha Narkhede wrote:
I think the confusion is that we are answering a slightly different
question than the one you are asking. If I understand correctly, you are
asking, "do I need to put ALL the kafka broker urls into the config for the
client, and will this need to be updated if I add machines to the cluster?".
The answer to both
On Tue, Nov 20, 2012 at 10:00 AM, Neha Narkhede wrote:
> > By requiring use of a configured broker.list for each client, means that
> > 1000's of deployed apps need to be updated any time the kafka cluster
> > changes, no? (Or am I not understanding?).
>
> The advantage is that you can configure broker.list to point to a VIP.
This is being discussed in another thread -
http://markmail.org/message/mypnt7sgkqt55jb2?q=Jason+async+producer
Basically, you want zookeeper on the producer to do just one thing -
notify the change in the liveness of brokers in Kafka
cluster. In 0.8, brokers are not the entity to worry about, wha
> By requiring use of a configured broker.list for each client, means that
> 1000's of deployed apps need to be updated any time the kafka cluster
> changes, no? (Or am I not understanding?).
The advantage is that you can configure broker.list to point to a VIP, so you
can transparently change th
In the case that producer does not require zk.connect, how can the
producer recognize the new brokers or brokers which went down?
On Tue, Nov 20, 2012 at 8:31 AM, Jun Rao wrote:
> David,
>
> The change in 0.8 is that instead of requiring zk.connect, we require
> broker.list. In both cases, you typically provide a list of hosts and
> ports.
Ok,
So, I'm still wrapping my mind around this. I liked being able to use zk
for all clients, since it made it very easy to think about how to update
the kafka cluster. E.g. how to add new brokers, how to move them all to
new hosts entirely, etc., without having to redeploy all the clients. The
This is likely caused by https://issues.apache.org/jira/browse/KAFKA-550.
The fix has been checked into trunk.
Thanks,
Jun
On Tue, Nov 20, 2012 at 4:44 AM, Michal Haris wrote:
> Hi, I am seeing behaviour which I am not expecting when using topic
> filters.
>
> TopicFilter sourceTopicFilter = new Whitelist("pageviews");
We use m1.large's with ephemeral storage and get 20MB/sec using Kafka's
built in benchmarking tool. No compression.
On Tue, Nov 20, 2012 at 7:52 AM, David Arthur wrote:
> In my experience, anything smaller than m1.xlarge isn't really suitable
> for I/O intensive high performance stuff. I would guess that, for Kafka, a
> single m1.xlarge would outperform two m1.large.
The tool gets the end offset of the log using getOffsetBefore and the
consumer offset from ZK. It then calculates the lag.
We do have a JMX for lag in ZookeeperConsumerConnector. The api is the
following, but you need to provide topic/brokerid/partitionid.
/**
* JMX interface for monitoring con
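The lag computation described above boils down to a subtraction. A hedged sketch with illustrative names and numbers — this is not the tool's actual code, just the arithmetic it performs:

```java
public class LagCheck {
    // Lag as ConsumerOffsetChecker computes it: the broker's log-end offset
    // (fetched via the "get offsets before" request with time = latest)
    // minus the consumer's committed offset read from ZooKeeper.
    static long lag(long logEndOffset, long consumerOffset) {
        return logEndOffset - consumerOffset;
    }

    public static void main(String[] args) {
        long logEndOffset = 1_000_000L; // illustrative value from the broker
        long consumerOffset = 999_400L; // illustrative value from ZK
        System.out.println(lag(logEndOffset, consumerOffset)); // 600
    }
}
```

A lag that grows over time means the consumer is falling behind the producer.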
Mike,
I'm not sure the Storm-bundled kafka stores offsets in the same ZK
locations as the regular Kafka consumer. Actually if you can verify the
location that would be great, cuz I'm curious. Anyways the
ConsumerOffsetChecker would not be able to help if the ZK locations were
different.
-Evan
Jason,
Auto discovery of new brokers and rolling restart of brokers are still
supported in 0.8. It's just that most of the ZK related logic is moved to
the broker.
There are 2 reasons why we want to remove zkclient from the client.
1. If the client goes into a GC pause, it can cause zk session expiration
Trunk does not have the latest 0.8 code yet. We plan to merge 0.8 back
into trunk soon, but it hasn't happened yet
Typically, the number of producers to a production Kafka cluster is
very large, which means a large number of connections
to zookeeper. If there is a slight blip on the zookeeper cluster d
David,
The change in 0.8 is that instead of requiring zk.connect, we require
broker.list. In both cases, you typically provide a list of hosts and
ports. Functionality wise, they achieve the same thing, ie, the producer is
able to send the data to the right broker. Are you saying that zk.connect
i
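The two configuration styles Jun describes can be sketched as plain producer properties. The host names below are illustrative, and the property names follow this thread's usage; exact names may vary across 0.8 builds, so treat this as a sketch rather than authoritative configuration:

```java
import java.util.Properties;

public class ProducerConfigSketch {
    public static void main(String[] args) {
        // 0.7-style: the producer discovers brokers via ZooKeeper.
        Properties v07 = new Properties();
        v07.setProperty("zk.connect", "zk1:2181,zk2:2181,zk3:2181");

        // 0.8-style (as described in this thread): a static broker list,
        // typically pointing at a VIP; it is used only to fetch metadata,
        // after which the producer talks to each broker directly.
        Properties v08 = new Properties();
        v08.setProperty("broker.list", "kafka-vip.example.com:9092");

        System.out.println(v07.getProperty("zk.connect") != null);
        System.out.println(v08.getProperty("broker.list"));
    }
}
```

Because the list can be a VIP, adding brokers behind it does not require redeploying producers.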
Muthu,
a) Not as of now. Please feel free to create the JIRA and specify the
details there
b) I doubt increasing partitions will help. 500 GB/day/topic suggests
the data per partition is only 10 GB/day. Before thinking about
increasing the # of partitions, I would try a few things-
1. Inspect th
In 0.8, both the broker and the consumer still need zkclient. So, a zk
cluster is still needed.
Thanks,
Jun
On Tue, Nov 20, 2012 at 8:04 AM, Jason Rosenberg wrote:
> Agreed, I'm not sure I understand the move away from zk. Is it still
> required for consumers, and for the brokers themselves?
Agreed, I'm not sure I understand the move away from zk. Is it still
required for consumers, and for the brokers themselves? If so, we still
need to deploy a zk cluster anyway.
Will kafka now support coordinating 1000's of producer clients?
Jason
On Tue, Nov 20, 2012 at 7:54 AM, David Arthur
I have not tried that yet, I was hoping to use an existing Ruby monitoring
process that we use to monitor several other existing resources. I also
don't want to make changes to the Kafka consumer code, as it's part of a
bundled package (Storm).
Where does ConsumerOffsetChecker pull its information from?
I checked out trunk. I guess I assumed that included the latest 0.8. Is
that not right? Am I just looking at 0.7.x+?
Honestly, I don't think it would be a positive thing not to be able to rely
on zookeeper in producer code. How does that affect the discovery of a
kafka cluster under dynamic co
On Nov 20, 2012, at 12:23 AM, Jun Rao wrote:
> Jason,
>
> In 0.8, producer doesn't use zkclient at all. You just need to set
> broker.list.
This seems like a regression in functionality. For me, one of the benefits of
Kafka is only needing to know about ZooKeeper
> A number of things have cha
In my experience, anything smaller than m1.xlarge isn't really suitable for I/O
intensive high performance stuff. I would guess that, for Kafka, a single
m1.xlarge would outperform two m1.large. I have no hard evidence to support
this, however.
What I'd like to see are some benchmarks comparing
Hi, I am seeing behaviour which I am not expecting when using topic filters.
TopicFilter sourceTopicFilter = new Whitelist("pageviews");
List<KafkaStream<Message>> streams =
consumer.createMessageStreamsByFilter(sourceTopicFilter, 3);
The topic has exactly 3 partitions and 3 streams are created, however only
the last
Hi Jun,
Thanks for the response.
a) Is there any plan in the roadmap to address this re-partition or
partition balance with new partitions? Please let me know to have the
JIRA for this.
b) Do we need to go for more partitions for the topic6 (46 to ??) to
reduce the new requests + backlog.
-Muthu