Re: Aggregated windowed counts

2017-01-05 Thread Benjamin Black
>> > >> Check out the documentation and WordCount example: > >> > >> http://docs.confluent.io/current/streams/index.html > >> > >> > https://github.com/confluentinc/examples/blob/3.1.x/kafka-streams/src/main/java/io/confluent/examples/streams

Re: Aggregated windowed counts

2017-01-04 Thread Benjamin Black
http://docs.confluent.io/current/streams/index.html > > https://github.com/confluentinc/examples/blob/3.1.x/kafka-streams/src/main/java/io/confluent/examples/streams/WordCountLambdaExample.java > > > Let us know if you have further questions. > > > -Matthias > > On 1/4/17 12:48 P

Aggregated windowed counts

2017-01-04 Thread Benjamin Black
Hello, I'm looking for guidance on how to approach a counting problem. We want to consume a stream of data that consists of IDs and generate an output of the aggregated count with a window size of X seconds using processing time and a hopping time window. For example, using a window size of 1 seco
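
The hopping-window count asked about in this message can be sketched in a few lines. This is only an illustration of the windowing arithmetic, not Kafka Streams code: the function name, the millisecond units, and the choice of half-open windows aligned to multiples of the advance interval are all assumptions. For processing time, the timestamp would simply be the wall-clock time at which each ID arrives.

```python
from collections import Counter, defaultdict


def hopping_window_counts(events, size_ms, advance_ms):
    """Count IDs per hopping window.

    events: iterable of (timestamp_ms, id) pairs.
    Windows are [start, start + size_ms), with start a multiple of advance_ms
    (an assumed alignment). Returns {window_start_ms: Counter of ids}.
    """
    counts = defaultdict(Counter)
    for ts, key in events:
        # Latest window that contains ts: the last multiple of advance_ms <= ts.
        start = ts - (ts % advance_ms)
        # Walk back through every earlier window that still covers ts.
        while start >= 0 and start > ts - size_ms:
            counts[start][key] += 1
            start -= advance_ms
    return dict(counts)


# Hypothetical input: 1-second windows that hop every 500 ms.
events = [(0, "a"), (600, "a"), (1200, "b")]
result = hopping_window_counts(events, size_ms=1000, advance_ms=500)
```

Because the windows overlap, one event is counted in size_ms / advance_ms windows (two in this example), which is the main difference from a tumbling window.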

enhancements to kafka-websocket

2014-04-16 Thread Benjamin Black
i've been a busy little bee this week so kafka-websocket now supports keyed messages and user-defined transforms for messages. the latter is pretty raw, improvement suggestions in the form of pull requests welcome! https://github.com/b/kafka-websocket/commit/6430fd522c0e0305995f648511726da0b616d2e

Re: Broker plugins

2014-03-13 Thread Benjamin Black
Or introduce an app layer between the producers and kafka that does the processing without changes/load to the producers. On Thu, Mar 13, 2014 at 1:18 PM, Neha Narkhede wrote: > In general, the preference has been to avoid having user code run on the > brokers since that just opens a can of worm

Re: Remote Zookeeper

2014-03-11 Thread Benjamin Black
If they are on different physical machines then binding to localhost/using localhost as the host name is unlikely to be what you want. On Tuesday, March 11, 2014, A A wrote: > Thanks. I already checked out the wiki and step 6 in particular. > Just to clarify, Zk, Broker1 and Broker2 are on 3 dif

Re: Remote Zookeeper

2014-03-11 Thread Benjamin Black
"I'd suggest deleting the zookeeper and Kafka logs and starting over using the getting started tutorial from the wiki." | v "Could this be an issue? INFO Registered broker 2 at path /brokers/ids/2 with address localhost:9092. INFO Registered broker 1 at path /brokers/ids/1 with address loca

Re: kafka-websocket on github

2014-03-11 Thread Benjamin Black
sometime ago and found that the browsers (at least chrome and firefox) had > issues keeping up with the throughput of my queues and would eventually > freeze when too much data was coming in too fast. > Have you had this issue too? > > P > > > On Tue, Mar 11, 2014 at 2:48 PM,

Re: kafka-websocket on github

2014-03-11 Thread Benjamin Black
Joe Stein > Founder, Principal Consultant > Big Data Open Source Security LLC > http://www.stealth.ly > Twitter: @allthingshadoop > / > > > > On Mar 11, 2014, at 1:48 PM, Benjamin Black wrote: > > > > exa

Re: kafka-websocket on github

2014-03-11 Thread Benjamin Black
for? > > -Jay > > > On Mon, Mar 10, 2014 at 11:02 AM, Benjamin Black wrote: > > > I put this up over the weekend, thought it might be useful to folks: > > > > https://github.com/b/kafka-websocket > > >

Re: Remote Zookeeper

2014-03-11 Thread Benjamin Black
Seems you may have put your cluster in a very confused state with random addition and removal of brokers and topics. I'd suggest deleting the zookeeper and Kafka logs and starting over using the getting started tutorial from the wiki. On Tuesday, March 11, 2014, A A wrote: > I noticed a problem w

Re: Remote Zookeeper

2014-03-11 Thread Benjamin Black
That's not how Kafka works. You need to pass the full list of brokers. On Tuesday, March 11, 2014, A A wrote: > Hi again. > > Got the setup working. I now have 2 brokers (broker 1 and broker 2) with > one remote zk. I was also able to create some topics > > $KAFKA_HOME/bin/kafka-list-topic.sh --

Re: Remote Zookeeper

2014-03-10 Thread Benjamin Black
zookeeper.connect https://kafka.apache.org/08/configuration.html On Mon, Mar 10, 2014 at 7:17 PM, A A wrote: > Hi > > Pretty new to Kafka. Have been successful in installing Kafka 0.8.0. > I am just wondering how should I make my kafka cluster (2 brokers) connect > to a single remote zookeeper

kafka-websocket on github

2014-03-10 Thread Benjamin Black
I put this up over the weekend, thought it might be useful to folks: https://github.com/b/kafka-websocket

Re: There is insufficient memory for the Java Runtime Environment to continue.

2014-02-02 Thread Benjamin Black
for boxes that just need to start up rather than doing anything > substantial. > > Going back to the original issue. Is this the only lever I have to get > kafka to work on a 512megs ram box? KAFKA_HEAP_OPTS="-Xmx256M -Xms128M"? > Any other trick? > > Thanks > >

Re: There is insufficient memory for the Java Runtime Environment to continue.

2014-01-31 Thread Benjamin Black
Sorry, was looking at pre-release 0.8 code. No idea now why they are not being set as expected. On Fri, Jan 31, 2014 at 1:20 PM, Benjamin Black wrote: > kafka-run-class.sh in 0.8 does not define KAFKA_HEAP_OPTS. i think you > want KAFKA_OPTS. > > > On Fri, Jan 31, 2014 at

Re: There is insufficient memory for the Java Runtime Environment to continue.

2014-01-31 Thread Benjamin Black
> /var/lib/kafka-0.8.0-src/config/server.properties > > What else is there to do? I did not have an issue with kafka 7. Only 8. > > Thanks > > > > > > > > > On Fri, Jan 31, 2014 at 4:47 AM, Benjamin Black wrote: > > > are you sure the ja

Re: There is insufficient memory for the Java Runtime Environment to continue.

2014-01-30 Thread Benjamin Black
are you sure the java opts are being set as you expect? On Jan 30, 2014 12:41 PM, "David Montgomery" wrote: > Hi, > > This is a dedicated machine on DO. But i can say I did not have a > problem with kafka 7. > I just upgraded the machine to 1gig on digi ocean. Same error. > > export KAFKA_HEAP

Re: redis versus zookeeper to track consumer offsets

2013-12-17 Thread Benjamin Black
ZK was designed from the start as a clustered, consistent, highly available store for this sort of data and it works extremely well. Redis wasn't and I don't know anyone using Redis in production, including me, who doesn't have stories of Redis losing data. I'm sticking with ZK. On Tue, Dec 17, 2

Re: storing last processed offset, recovery of failed message processing etc.

2013-12-09 Thread Benjamin Black
You might look at Curator http://curator.apache.org/ On Mon, Dec 9, 2013 at 12:36 PM, S Ahmed wrote: > Say I am doing this, a scenario that I just came up with that demonstrates > #2. > > Someone signs up on a website, and you have to: > > 1. create the user profile > 2. send email confirmation

Re: Trade-off between topics and partitions?

2013-12-05 Thread Benjamin Black
Deja vu! IMO, what you are describing is a database problem, even though you are talking/thinking about it as a queue problem. I'm sure you could construct something using Kafka (and Samza), but I think you'd have an easier time with a database. The number of pending messages per user and the aver

Re: are kafka consumer apps guaranteed to see msgs at least once?

2013-11-21 Thread Benjamin Black
why not just disable autocommit and only call commitOffsets() after you've processed a batch? it isn't obvious to me how doing so would allow a message to be processed zero times. On Nov 21, 2013 5:52 PM, "Imran Rashid" wrote: > Hi Edward, > > I think you misunderstand ... I definitely do *not*
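
The reasoning in this reply (commit only after a batch is processed, so nothing is processed zero times) can be shown with a small simulation. This is a sketch that uses no Kafka client at all: `OffsetStore`, `consume`, and the in-memory `log` list are hypothetical stand-ins for the committed offset, the consumer loop, and a partition.

```python
class OffsetStore:
    """Hypothetical stand-in for the durably committed consumer offset."""

    def __init__(self):
        self.committed = 0


def consume(log, store, batch_size, process):
    """Process log entries from the committed offset onward, in batches.

    The offset advances only after every message in a batch has been
    processed, so a crash mid-batch causes a replay, never a skip.
    """
    while store.committed < len(log):
        batch = log[store.committed:store.committed + batch_size]
        for message in batch:
            process(message)  # may raise; the offset is not advanced
        store.committed += len(batch)  # stands in for the commit call


# Demo: processing fails once on "d", then the consumer restarts.
log = ["a", "b", "c", "d"]
seen = []
fail_once = {"armed": True}


def flaky(message):
    if message == "d" and fail_once["armed"]:
        fail_once["armed"] = False
        raise RuntimeError("simulated crash")
    seen.append(message)


store = OffsetStore()
try:
    consume(log, store, 2, flaky)
except RuntimeError:
    pass  # the "process" died mid-batch; only the first batch was committed
offset_after_crash = store.committed
consume(log, store, 2, flaky)  # restart from the last committed offset
```

After the restart, "c" has been processed twice and no message has been skipped: the trade-off is duplicates rather than loss, so processing should be idempotent where that matters.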

Re: How to get monitoring stats

2013-11-19 Thread Benjamin Black
https://www.google.com/search?q=jmx+to+graphite On Tue, Nov 19, 2013 at 5:24 PM, David Montgomery wrote: > Hi, > On the Kafka cluster I would like to get monitoring stats. I do not > know java per this page. > > Monitoring > > Our monitoring is done though a centralized monitoring system

Re: who is using kafka to stare large messages?

2013-10-07 Thread Benjamin Black
I don't think the batch referred to initially is a Kafka API batch, hence the confusion. I'm sure someone from LinkedIn can clarify. On Oct 7, 2013 9:27 AM, "S Ahmed" wrote: > When you batch things on the producer, say you batch 1000 messages or by > time whatever, the total message size of the b

Re: Managing Millions of Paritions in Kafka

2013-10-06 Thread Benjamin Black
; use TTL on hbase. > On 7 Oct 2013 08:38, "Benjamin Black" wrote: > > > What you are discovering is that Kafka is a message broker, not a > database. > > > > > > On Sun, Oct 6, 2013 at 5:34 PM, Ravindranath Akila < > > ravindranathak...@gmail.com>

Re: Managing Millions of Paritions in Kafka

2013-10-06 Thread Benjamin Black
What you are discovering is that Kafka is a message broker, not a database. On Sun, Oct 6, 2013 at 5:34 PM, Ravindranath Akila < ravindranathak...@gmail.com> wrote: > Thanks a lot Neha! > > Actually, using keyed messages(with Simple Consumers) was the approach we > took. But it seems we can't ma

Re: default producer to retro-fit existing log files collection process?

2013-09-04 Thread Benjamin Black
commons-logging has a log4j logger, so perhaps you just need to use it and the log4j-kafka appender to achieve your goal? On Tue, Sep 3, 2013 at 2:08 PM, Maxime Petazzoni wrote: > Tomcat uses commons-logging for logging. You might be able to write an > adapter towards Kafka, in a similar way as

Re: Out of memory exception

2013-09-03 Thread Benjamin Black
This is a common JVM tuning scenario. You should adjust the values based on empirical data. See the heap size section of http://docs.oracle.com/cd/E21764_01/web./e13814/jvm_tuning.htm On Aug 30, 2013 10:40 PM, "Vadim Keylis" wrote: > I followed linkedin setup example in the docs and located 3

Re: Kafka 0.7 performance compared to bare metal

2013-08-30 Thread Benjamin Black
http://man7.org/linux/man-pages/man2/sendfile.2.html somehow works > inefficiently with SSD. And I don't understand why and how can this be > fixed. > > I do understand that you are advising me to use more partitions and more > consumer threads. But I would like to know the limits

Re: Kafka 0.7 performance compared to bare metal

2013-08-30 Thread Benjamin Black
You are maxing out the single consumer thread. On Aug 30, 2013 1:35 AM, "Rafael Bagmanov" wrote: > Hi, > > I am trying to understand how fast is kafka 0.7 compared to what I can get > from hard drive. In essence I have 3 questions. > > In all tests below, I'm using single broker with single one-p

Re: Securing kafka

2013-08-29 Thread Benjamin Black
IP filters on the hosts. On Aug 29, 2013 10:03 AM, "Calvin Lei" wrote: > Is there a way to stop a malicious user from connecting directly to a kafka > broker and sending any messages? Could we have the brokers accept messages > only from a list of known IPs? >

Re: Loadbalancing producers

2013-08-29 Thread Benjamin Black
Producers discover the broker that owns each partition and send messages directly. There isn't anything to load balance at layer 4. On Aug 29, 2013 8:54 AM, "Mark" wrote: > Can you explain why not? > > On Aug 29, 2013, at 8:43 AM, Benjamin Black wrote: > > >
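
To make the point about direct sends concrete, here is a rough sketch of client-side routing. It is not Kafka's partitioner: the CRC32 hash, the function names, and the `leaders` map with its broker addresses are assumptions chosen only for illustration. The idea is that the producer maps a key to a partition and then looks up that partition's owning broker in metadata it has already fetched.

```python
import zlib


def pick_partition(key: str, num_partitions: int) -> int:
    """Map a message key to a partition (assumed hash: CRC32 modulo count)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(key: str, leaders: dict) -> tuple:
    """Return (partition, broker address) for a key.

    leaders maps partition number -> address of the broker that owns it,
    as the producer would learn from cluster metadata.
    """
    partition = pick_partition(key, len(leaders))
    return partition, leaders[partition]


# Hypothetical metadata for a three-partition topic.
leaders = {0: "broker1:9092", 1: "broker2:9092", 2: "broker3:9092"}
partition, broker = route("user-42", leaders)
```

Since the destination broker is determined by the partition, a load balancer that picks a broker at random would send most requests to a broker that does not own the target partition.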

Re: Loadbalancing producers

2013-08-29 Thread Benjamin Black
The LB in front of the brokers doesn't make sense. On Aug 29, 2013 8:42 AM, "Mark" wrote: > We have a few dozen front-end web apps running rails. Each one of these > rails instances has an embedded producer which in turn connects to 3 > brokers who are behind a load-balancer. Does that sound ab

Re: Topic stuck with leader: none

2013-08-01 Thread Benjamin Black
If you shut down all brokers, then bring them up one at a time, does it change? Might have to do it several times. On Aug 1, 2013 8:28 PM, "Vinicius Carvalho" wrote: > Hi there! So, I set up a demo and forgot to create the topic. But since my > producer is set to create topics automatically, it created

Re: class kafka.common.LeaderNotAvailableException (kafka.producer.BrokerPartitionInfo)

2013-08-01 Thread Benjamin Black
I'm seeing 0.8-beta1 get into this state at times, though I don't have a way to repro it. Restarting all brokers and zk repeatedly has fixed it. Again, unknown why. On Aug 1, 2013 9:06 PM, "Jun Rao" wrote: > Is this issue reproducible? > > Thanks, > > Jun > > > On Thu, Aug 1, 2013 at 8:24 AM, Nan