Re: Multiple processors belonging to the same GroupId need to read the same message from the same topic

2016-08-16 Thread Tanay Soni
Hi,

I am wondering why not have two different GroupIDs for the processors? This
would ensure that both P1 & P2 read each message from the topic.

- Tanay
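
A minimal sketch of this suggestion, assuming the 0.9/0.10 Java consumer (the broker address, topic name, and group ids below are illustrative; in a Kafka Streams application the application.id plays the role of the group.id, so P1 and P2 would need distinct application.ids):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class TwoGroupIds {
    // Each group id gets its own copy of every message on topic T.
    static KafkaConsumer<String, String> consumerFor(String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative address
        props.put("group.id", groupId);
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("T"));
        return consumer;
    }

    public static void main(String[] args) {
        KafkaConsumer<String, String> p1 = consumerFor("p1-group"); // illustrative ids
        KafkaConsumer<String, String> p2 = consumerFor("p2-group");
        // Both consumers independently receive every record published to T.
        for (ConsumerRecord<String, String> r : p1.poll(1000)) {
            System.out.println("P1 saw: " + r.value());
        }
        for (ConsumerRecord<String, String> r : p2.poll(1000)) {
            System.out.println("P2 saw: " + r.value());
        }
    }
}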

On Wed 17 Aug, 2016 11:06 am David Garcia,  wrote:

> You could create another partition in topic T, and publish the same
> message to both partitions.  You would have to configure P2 to read from
> the other partition.  Or you could have P1 write the message to another
> topic and configure P2 to listen to that topic.
>
> -David
>
> On 8/16/16, 11:54 PM, "Deepak Pengoria"  wrote:
>
> For your information, I am using Confluent 3.0.0 (which ships the 0.10
> Streams API)
>
> On Wed, Aug 17, 2016 at 10:23 AM, Deepak Pengoria <
> deepak.pengo...@gmail.com
> > wrote:
>
> > Hi, I have a problem for which I am not able to find the solution. Below
> > is the problem statement:
> >
> > I have two Kafka Streams API processors, say P1 and P2, that both want to
> > read the same message(s) from the same topic, say T. Topic T has only one
> > partition and contains configuration information; it is updated
> > infrequently (hardly once a month). Currently, if P1 reads a message from
> > topic T, then P2 is not able to read that message.
> >
> > How can I make both processors read the same message? This is something I
> > am stuck on and need help with.
> >
> > Regards,
> > Deepak
> >
>
>
>


Re: Multiple processors belonging to the same GroupId need to read the same message from the same topic

2016-08-16 Thread David Garcia
You could create another partition in topic T, and publish the same message to 
both partitions.  You would have to configure P2 to read from the other 
partition.  Or you could have P1 write the message to another topic and 
configure P2 to listen to that topic.

-David
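
A minimal sketch of the first option, publishing the same payload to two partitions of T explicitly, assuming the 0.9/0.10 Java producer (the broker address and values are illustrative):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PublishToBothPartitions {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative address
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        String config = "configuration-payload"; // illustrative value
        // Same message to partition 0 (read by P1) and partition 1 (read by P2).
        producer.send(new ProducerRecord<>("T", 0, null, config));
        producer.send(new ProducerRecord<>("T", 1, null, config));
        producer.close(); // flushes pending sends
    }
}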

On 8/16/16, 11:54 PM, "Deepak Pengoria"  wrote:

For your information, I am using Confluent 3.0.0 (which ships the 0.10 Streams API)

On Wed, Aug 17, 2016 at 10:23 AM, Deepak Pengoria  wrote:

> Hi, I have a problem for which I am not able to find the solution. Below
> is the problem statement:
>
> I have two Kafka Streams API processors, say P1 and P2, that both want to
> read the same message(s) from the same topic, say T. Topic T has only one
> partition and contains configuration information; it is updated
> infrequently (hardly once a month). Currently, if P1 reads a message from
> topic T, then P2 is not able to read that message.
>
> How can I make both processors read the same message? This is something I
> am stuck on and need help with.
>
> Regards,
> Deepak
>




Multiple processors belonging to the same GroupId need to read the same message from the same topic

2016-08-16 Thread Deepak Pengoria
Hi, I have a problem for which I am not able to find the solution. Below is
the problem statement:

I have two Kafka Streams API processors, say P1 and P2, that both want to
read the same message(s) from the same topic, say T. Topic T has only one
partition and contains configuration information; it is updated infrequently
(hardly once a month). Currently, if P1 reads a message from topic T, then
P2 is not able to read that message.

How can I make both processors read the same message? This is something I am
stuck on and need help with.

Regards,
Deepak


Re: Multiple processors belonging to the same GroupId need to read the same message from the same topic

2016-08-16 Thread Deepak Pengoria
For your information, I am using Confluent 3.0.0 (which ships the 0.10 Streams API)

On Wed, Aug 17, 2016 at 10:23 AM, Deepak Pengoria  wrote:

> Hi, I have a problem for which I am not able to find the solution. Below
> is the problem statement:
>
> I have two Kafka Streams API processors, say P1 and P2, that both want to
> read the same message(s) from the same topic, say T. Topic T has only one
> partition and contains configuration information; it is updated
> infrequently (hardly once a month). Currently, if P1 reads a message from
> topic T, then P2 is not able to read that message.
>
> How can I make both processors read the same message? This is something I
> am stuck on and need help with.
>
> Regards,
> Deepak
>


Re: [DISCUSS] Java 8 as a minimum requirement

2016-08-16 Thread Alexis Midon
Java 7 is end of life: http://www.oracle.com/technetwork/java/eol-135779.html
+1



On Tue, Aug 16, 2016 at 6:43 AM Ismael Juma  wrote:

> Hey Harsha,
>
> I noticed that you proposed that Storm should drop support for Java 7 in
> master:
>
> http://markmail.org/message/25do6wd3a6g7cwpe
>
> It's useful to know what other Apache projects are doing in this regard, so
> I'm interested in the timeline being proposed for Storm's transition. I
> could not find it in the thread above, so I'd appreciate it if you could
> clarify it for us (and sorry if I missed it).
>
> Thanks,
> Ismael
>
> On Mon, Jun 20, 2016 at 5:05 AM, Harsha  wrote:
>
> > Hi Ismael,
> >   Agreed that timing is more important. If we give enough heads-up
> >   to the users who are on Java 7, that's great, but shipping this
> >   in the 0.10.x line won't be good, as it is still perceived as a
> >   maintenance release even if it contains a lot of features. If we
> >   can make this part of 0.11, move the planned 0.10.1 features to
> >   0.11, and give a rough timeline for when that would be released,
> >   that would be ideal.
> >
> > Thanks,
> > Harsha
> >
> > On Fri, Jun 17, 2016, at 11:13 AM, Ismael Juma wrote:
> > > Hi Harsha,
> > >
> > > Comments below.
> > >
> > > On Fri, Jun 17, 2016 at 7:48 PM, Harsha  wrote:
> > >
> > > > Hi Ismael,
> > > > "Are you saying that you are aware of many Kafka users still
> > > > using Java 7
> > > > > who would be ready to upgrade to the next Kafka feature release
> > (whatever
> > > > > that version number is) before they can upgrade to Java 8?"
> > > > I know there are quite a few users who are still on Java 7
> > >
> > >
> > > This is good to know.
> > >
> > >
> > > > and regarding the
> > > > upgrade, we can't say yes or no. It's up to the user's discretion when
> > > > they choose to upgrade, and of course on whether there are any critical
> > > > fixes that might go into the release. We shouldn't be restricting their
> > > > upgrade path just because we removed Java 7 support.
> > > >
> > >
> > > My point is that both paths have their pros and cons and we need to
> weigh
> > > them up. If some users are slow to upgrade the Java version (Java 7 has
> > > been EOL'd for over a year), there's a good chance that they are slow
> to
> > > upgrade Kafka too. And if that is the case (and it may not be), then
> > > holding up improvements for the ones who actually do upgrade may be the
> > > wrong call. To be clear, I am still in listening mode and I haven't
> made
> > > up
> > > my mind on the subject.
> > >
> > > Once we released 0.9.0, there weren't any more 0.8.x releases; i.e., we
> > > don't have an LTS-type release where we continually ship critical fixes
> > > over 0.8.x minor releases. So if a user needs a critical fix, the only
> > > option today is to upgrade to the next version where that fix is shipped.
> > > >
> > >
> > > We haven't done a great job at this in the past, but there is no
> decision
> > > that once a new major release is out, we don't do patch releases for
> the
> > > previous major release. In fact, we have been collecting critical fixes
> > > in
> > > the 0.9.0 branch for a potential 0.9.0.2.
> > >
> > > I understand there is no decision made yet, but the premise was to ship
> > > this in 0.10.x, possibly 0.10.1, which I don't agree with. I am in
> > > general against shipping this in a 0.10.x version. Removing Java 7
> > > support in a minor release is generally not a good idea for users.
> > > >
> > >
> > > Sorry if I didn't communicate this properly. I simply meant the next
> > > feature release. I used 0.10.1.0 as an example, but it could also be
> > > 0.11.0.0 if that turns out to be the next release. A discussion on that
> > > will probably take place once the scope is clear. Personally, I think
> > > the timing is more important than the version number, but it seems like
> > > some people disagree.
> > >
> > > Ismael
> >
>


Re: Automated Testing w/ Kafka Streams

2016-08-16 Thread Guozhang Wang
About moving some streams test utils into a separate package: I think this
has been requested before in a filed JIRA:

https://issues.apache.org/jira/browse/KAFKA-3625


Guozhang

On Tue, Aug 16, 2016 at 10:18 AM, Michael Noll  wrote:

> Addendum:
>
> > Unfortunately, Apache Kafka does not publish these testing facilities as
> maven artifacts -- that's why everyone is rolling their own.
>
> Some testing facilities (like kafka.utils.TestUtils) are published via
> maven, but other helpful testing facilities are not.
>
> Since Radek provided a snippet showing how to pull in the artifact that
> includes k.u.TestUtils, here's the same snippet for Maven/pom.xml, with the
> dependency scope set to `test`:
>
>   <dependency>
>     <groupId>org.apache.kafka</groupId>
>     <artifactId>kafka_2.11</artifactId>
>     <version>0.10.0.0</version>
>     <classifier>test</classifier>
>     <scope>test</scope>
>   </dependency>
>
>
>
> On Tue, Aug 16, 2016 at 7:14 PM, Michael Noll 
> wrote:
>
> > Mathieu,
> >
> > FWIW here are some pointers to run embedded Kafka/ZK instances for
> > integration testing.  The second block of references below uses Curator's
> > TestingServer for running embedded ZK instances.  See also the relevant
> > pom.xml for how the integration tests are being run (e.g. disabled JVM
> > reusage to ensure test isolation).
> >
> > Unfortunately, Apache Kafka does not publish these testing facilities as
> > maven artifacts -- that's why everyone is rolling their own.
> >
> > In Apache Kafka:
> >
> > Helper classes (e.g. embedded Kafka)
> > https://github.com/apache/kafka/tree/trunk/streams/src/test/java/org/apache/kafka/streams/integration/utils
> >
> > Integration test example:
> > https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/FanoutIntegrationTest.java
> >
> > Also, for kafka.utils.TestUtils usage:
> > https://github.com/apache/kafka/blob/trunk/core/src/test/scala/integration/kafka/api/IntegrationTestHarness.scala
> >
> > In confluentinc/examples:
> >
> > Helper classes (e.g. embedded Kafka, embedded Confluent Schema
> > Registry for Avro testing)
> > https://github.com/confluentinc/examples/tree/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/kafka
> >
> > Some more sophisticated integration tests:
> > https://github.com/confluentinc/examples/blob/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/WordCountLambdaIntegrationTest.java
> > https://github.com/confluentinc/examples/blob/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/SpecificAvroIntegrationTest.java
> >
> > Best,
> > Michael
> >
> >
> >
> >
> > On Tue, Aug 16, 2016 at 3:36 PM, Mathieu Fenniak <
> > mathieu.fenn...@replicon.com> wrote:
> >
> >> Hi Radek,
> >>
> >> No, I'm not familiar with these tools.  I see that Curator's
> TestingServer
> >> looks pretty straight-forward, but, I'm not really sure what
> >> kafka.util.TestUtils
> >> is.  I can't find any documentation referring to this, and it doesn't
> seem
> >> to be a part of any published maven artifacts in the Kafka project; can
> >> you
> >> point me at what you're using a little more specifically?
> >>
> >> Mathieu
> >>
> >>
> >> On Mon, Aug 15, 2016 at 2:39 PM, Radoslaw Gruchalski <
> >> ra...@gruchalski.com>
> >> wrote:
> >>
> >> > Out of curiosity, are you aware of kafka.util.TestUtils and Apache
> >> Curator
> >> > TestingServer?
> >> > I’m using this successfully to test publish / consume scenarios with
> >> things
> >> > like Flink, Spark and custom apps.
> >> > What would stop you from taking the same approach?
> >> >
> >> > –
> >> > Best regards,
> >> > Radek Gruchalski
> >> > ra...@gruchalski.com
> >> >
> >> >
> >> > On August 15, 2016 at 9:41:37 PM, Mathieu Fenniak (
> >> > mathieu.fenn...@replicon.com) wrote:
> >> >
> >> > Hi Michael,
> >> >
> >> > It would definitely be an option. I am not currently doing any testing
> >> > like that; it could replace the ProcessorTopologyTestDriver-style
> >> testing
> >> > that I'd like to do, but there are some trade-offs to consider:
> >> >
> >> > - I can't do an isolated test of just the TopologyBuilder; I'd be
> >> > bringing in configuration management code (eg. configuring where to
> >> access
> >> > ZK + Kafka).
> >> > - Tests using a running Kafka server wouldn't have a clear end-point;
> if
> >> > something in the topology doesn't publish a message where I expected it
> >> to,
> >> > my test can only fail via a timeout.
> >> > - Tests are likely to be slower; this might not be significant, but a
> >> > small difference in test speed has a big impact in productivity after
> a
> >> > few
> >> > months of development
> >> > - Tests will be more complex & fragile; some additional component
> needs
> >> > to manage starting up that Kafka server, making sure it's ready-to-go,
> >> > running tests, and then tearing it down
> >> > - Tests will have to be cautious of state 

Re: Automated Testing w/ Kafka Streams

2016-08-16 Thread Michael Noll
Addendum:

> Unfortunately, Apache Kafka does not publish these testing facilities as
maven artifacts -- that's why everyone is rolling their own.

Some testing facilities (like kafka.utils.TestUtils) are published via
maven, but other helpful testing facilities are not.

Since Radek provided a snippet showing how to pull in the artifact that
includes k.u.TestUtils, here's the same snippet for Maven/pom.xml, with the
dependency scope set to `test`:

  <dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka_2.11</artifactId>
    <version>0.10.0.0</version>
    <classifier>test</classifier>
    <scope>test</scope>
  </dependency>



On Tue, Aug 16, 2016 at 7:14 PM, Michael Noll  wrote:

> Mathieu,
>
> FWIW here are some pointers to run embedded Kafka/ZK instances for
> integration testing.  The second block of references below uses Curator's
> TestingServer for running embedded ZK instances.  See also the relevant
> pom.xml for how the integration tests are being run (e.g. disabled JVM
> reusage to ensure test isolation).
>
> Unfortunately, Apache Kafka does not publish these testing facilities as
> maven artifacts -- that's why everyone is rolling their own.
>
> In Apache Kafka:
>
> Helper classes (e.g. embedded Kafka)
> https://github.com/apache/kafka/tree/trunk/streams/src/test/java/org/apache/kafka/streams/integration/utils
>
> Integration test example:
> https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/FanoutIntegrationTest.java
>
> Also, for kafka.utils.TestUtils usage:
> https://github.com/apache/kafka/blob/trunk/core/src/test/scala/integration/kafka/api/IntegrationTestHarness.scala
>
> In confluentinc/examples:
>
> Helper classes (e.g. embedded Kafka, embedded Confluent Schema
> Registry for Avro testing)
> https://github.com/confluentinc/examples/tree/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/kafka
>
> Some more sophisticated integration tests:
> https://github.com/confluentinc/examples/blob/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/WordCountLambdaIntegrationTest.java
> https://github.com/confluentinc/examples/blob/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/SpecificAvroIntegrationTest.java
>
> Best,
> Michael
>
>
>
>
> On Tue, Aug 16, 2016 at 3:36 PM, Mathieu Fenniak <
> mathieu.fenn...@replicon.com> wrote:
>
>> Hi Radek,
>>
>> No, I'm not familiar with these tools.  I see that Curator's TestingServer
>> looks pretty straight-forward, but, I'm not really sure what
>> kafka.util.TestUtils
>> is.  I can't find any documentation referring to this, and it doesn't seem
>> to be a part of any published maven artifacts in the Kafka project; can
>> you
>> point me at what you're using a little more specifically?
>>
>> Mathieu
>>
>>
>> On Mon, Aug 15, 2016 at 2:39 PM, Radoslaw Gruchalski <
>> ra...@gruchalski.com>
>> wrote:
>>
>> > Out of curiosity, are you aware of kafka.util.TestUtils and Apache
>> Curator
>> > TestingServer?
>> > I’m using this successfully to test publish / consume scenarios with
>> things
>> > like Flink, Spark and custom apps.
>> > What would stop you from taking the same approach?
>> >
>> > –
>> > Best regards,
>> > Radek Gruchalski
>> > ra...@gruchalski.com
>> >
>> >
>> > On August 15, 2016 at 9:41:37 PM, Mathieu Fenniak (
>> > mathieu.fenn...@replicon.com) wrote:
>> >
>> > Hi Michael,
>> >
>> > It would definitely be an option. I am not currently doing any testing
>> > like that; it could replace the ProcessorTopologyTestDriver-style
>> testing
>> > that I'd like to do, but there are some trade-offs to consider:
>> >
>> > - I can't do an isolated test of just the TopologyBuilder; I'd be
>> > bringing in configuration management code (eg. configuring where to
>> access
>> > ZK + Kafka).
>> > - Tests using a running Kafka server wouldn't have a clear end-point; if
>> > something in the topology doesn't publish a message where I expected it
>> to,
>> > my test can only fail via a timeout.
>> > - Tests are likely to be slower; this might not be significant, but a
>> > small difference in test speed has a big impact in productivity after a
>> > few
>> > months of development
>> > - Tests will be more complex & fragile; some additional component needs
>> > to manage starting up that Kafka server, making sure it's ready-to-go,
>> > running tests, and then tearing it down
>> > - Tests will have to be cautious of state existing in Kafka. eg. two
>> > test suites that touch the same topics could be influenced by state of a
>> > previous test. Either you take a "destroy the world" approach between
>> test
>> > cases (or test suites), which probably makes test speed much worse, or,
>> > you
>> > find another way to isolate test's state.
>> >
>> > I'd have to face all these problems at the higher level that I'm calling
>> > "systems-level tests", but, I think it would be better to do the
>> majority
>> > of the automated testing at a lower level 

Re: Automated Testing w/ Kafka Streams

2016-08-16 Thread Michael Noll
Mathieu,

FWIW here are some pointers to run embedded Kafka/ZK instances for
integration testing.  The second block of references below uses Curator's
TestingServer for running embedded ZK instances.  See also the relevant
pom.xml for how the integration tests are being run (e.g. disabled JVM
reusage to ensure test isolation).

Unfortunately, Apache Kafka does not publish these testing facilities as
maven artifacts -- that's why everyone is rolling their own.

In Apache Kafka:

Helper classes (e.g. embedded Kafka)

https://github.com/apache/kafka/tree/trunk/streams/src/test/java/org/apache/kafka/streams/integration/utils

Integration test example:

https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/FanoutIntegrationTest.java

Also, for kafka.utils.TestUtils usage:

https://github.com/apache/kafka/blob/trunk/core/src/test/scala/integration/kafka/api/IntegrationTestHarness.scala

In confluentinc/examples:

Helper classes (e.g. embedded Kafka, embedded Confluent Schema Registry
for Avro testing)

https://github.com/confluentinc/examples/tree/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/kafka

Some more sophisticated integration tests:

https://github.com/confluentinc/examples/blob/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/WordCountLambdaIntegrationTest.java

https://github.com/confluentinc/examples/blob/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/SpecificAvroIntegrationTest.java

Best,
Michael




On Tue, Aug 16, 2016 at 3:36 PM, Mathieu Fenniak <
mathieu.fenn...@replicon.com> wrote:

> Hi Radek,
>
> No, I'm not familiar with these tools.  I see that Curator's TestingServer
> looks pretty straight-forward, but, I'm not really sure what
> kafka.util.TestUtils
> is.  I can't find any documentation referring to this, and it doesn't seem
> to be a part of any published maven artifacts in the Kafka project; can you
> point me at what you're using a little more specifically?
>
> Mathieu
>
>
> On Mon, Aug 15, 2016 at 2:39 PM, Radoslaw Gruchalski  >
> wrote:
>
> > Out of curiosity, are you aware of kafka.util.TestUtils and Apache
> Curator
> > TestingServer?
> > I’m using this successfully to test publish / consume scenarios with
> things
> > like Flink, Spark and custom apps.
> > What would stop you from taking the same approach?
> >
> > –
> > Best regards,
> > Radek Gruchalski
> > ra...@gruchalski.com
> >
> >
> > On August 15, 2016 at 9:41:37 PM, Mathieu Fenniak (
> > mathieu.fenn...@replicon.com) wrote:
> >
> > Hi Michael,
> >
> > It would definitely be an option. I am not currently doing any testing
> > like that; it could replace the ProcessorTopologyTestDriver-style
> testing
> > that I'd like to do, but there are some trade-offs to consider:
> >
> > - I can't do an isolated test of just the TopologyBuilder; I'd be
> > bringing in configuration management code (eg. configuring where to
> access
> > ZK + Kafka).
> > - Tests using a running Kafka server wouldn't have a clear end-point; if
> > something in the topology doesn't publish a message where I expected it
> to,
> > my test can only fail via a timeout.
> > - Tests are likely to be slower; this might not be significant, but a
> > small difference in test speed has a big impact in productivity after a
> > few
> > months of development
> > - Tests will be more complex & fragile; some additional component needs
> > to manage starting up that Kafka server, making sure it's ready-to-go,
> > running tests, and then tearing it down
> > - Tests will have to be cautious of state existing in Kafka. eg. two
> > test suites that touch the same topics could be influenced by state of a
> > previous test. Either you take a "destroy the world" approach between
> test
> > cases (or test suites), which probably makes test speed much worse, or,
> > you
> > find another way to isolate test's state.
> >
> > I'd have to face all these problems at the higher level that I'm calling
> > "systems-level tests", but, I think it would be better to do the majority
> > of the automated testing at a lower level that doesn't bring these
> > considerations into play.
> >
> > Mathieu
> >
> >
> > On Mon, Aug 15, 2016 at 12:13 PM, Michael Noll 
> > wrote:
> >
> > > Mathieu,
> > >
> > > follow-up question: Are you also doing or considering integration
> > testing
> > > by spawning a local Kafka cluster and then reading/writing to that
> > cluster
> > > (often called embedded or in-memory cluster)? This approach would be in
> > > the middle between ProcessorTopologyTestDriver (that does not spawn a
> > Kafka
> > > cluster) and your system-level testing (which I suppose is running
> > against
> > > a "real" test Kafka cluster).
> > >
> > > -Michael
> > >
> > >
> > >
> > >
> > >
> > > On Mon, Aug 15, 2016 at 3:44 PM, Mathieu Fenniak <
> > > mathieu.fenn...@replicon.com> 

what API matches the discussion in I Heart Logs?

2016-08-16 Thread Daniel Lyons
Hi,

I’ve read and become somewhat indoctrinated by the possibility discussed in I 
Heart Logs of having an event stream written to Kafka feeding a process that 
loads a database. That process should, I think, store the last offset it has 
processed in its downstream database rather than in ZooKeeper. The sequence of 
API calls that I’ve found to accomplish this doesn’t match my expectations from 
reading the book:

  1. Instantiate a KafkaConsumer with some randomish group.id and
enable.auto.commit=false.
  2. Subscribe to the event stream topic.
  3. poll(0) and discard whatever comes back (to force the partition
assignment needed for the next step).
  4. seek(offset).
  5. poll(N) inside a loop, handling messages.

Is this the right approach? I can’t help but feel like I am missing something 
obvious, because the book makes it sound like this is a common pattern but I 
feel like I’m using this API against its intended purpose.
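
A minimal sketch of an alternative to steps 1-5 above, assuming the 0.9+ consumer: since the partition is known and the offsets live in the downstream database, assign() can replace subscribe() plus the priming poll(0), making seek() work immediately (the database helper methods are hypothetical stand-ins):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ReplayFromDbOffset {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // illustrative
        props.put("group.id", "db-loader");                // illustrative
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

        // One topic, one partition: assign it directly. No group rebalance
        // happens, so seek() works without a priming poll(0).
        TopicPartition tp = new TopicPartition("events", 0); // illustrative topic
        consumer.assign(Collections.singletonList(tp));
        consumer.seek(tp, loadLastOffsetFromDatabase() + 1); // resume after last applied

        while (true) {
            for (ConsumerRecord<String, String> record : consumer.poll(1000)) {
                // Apply the event and store record.offset() in the same
                // database transaction, so the offset and the data stay in sync.
                applyToDatabase(record);
            }
        }
    }

    static long loadLastOffsetFromDatabase() { return -1L; } // hypothetical helper
    static void applyToDatabase(ConsumerRecord<String, String> r) { } // hypothetical helper
}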

For the time being, I have just one server, one ZK, one partition and a handful 
of topics while I experiment.

Thanks for your help,

-- 
Daniel Lyons






Re: Scala: Kafka Consumer (kafka-clients 0.9.0.1)

2016-08-16 Thread Ajay Sharma
BTW we used 0.8.x

On 8/16/16, 9:51 AM, "Ajay Sharma"  wrote:

>Amir,
>We had a similar requirement to consume every message reliably; the approach
>I picked was to push any message whose consumption failed to a
>secondary topic for later handling; in our case the messages/events
>were independent, so we would make a second attempt at consumption and on
>any failure push the message to a poison queue (Elasticsearch in our case).
>
>Regards, ajay
>
>On 8/16/16, 1:50 AM, "Amir Zuker"  wrote:
>
>>Hi everyone,
>>
>>I have a question regarding the 'KafkaConsumer' and its API in regards to
>>committing offsets. (kafka-clients 0.9.0.1)
>>
>>*The scenario -*
>>I am working with auto commit set to disabled because I want to implement
>>a
>>retry mechanism and eventually transfer the message to another topic that
>>contains the poison messages.
>>Since I want it to be reliable, I am not using auto commit and I wish
>>to take control over when that should happen.
>>
>>*The implementation detail -*
>>My class that extends 'Runnable' and is created by the KafkaConsumer
>>needs
>>to commit the offset once it is done with handling the topic message.
>>However, the API for committing messages is located on the KafkaConsumer
>>with no relation to partition or thread.
>>
>>*The problem -*
>>If I understand correctly, I can use the same KafkaConsumer instance with
>>multiple threads against multiple partitions.
>>If that is the case, how can I commit the offset specific to my
>>'Runnable'
>>instance that just processed a single message without affecting other
>>threads and partitions?
>>
>>Thanks in advance,
>>Amir Zuker
>



Re: Scala: Kafka Consumer (kafka-clients 0.9.0.1)

2016-08-16 Thread Ajay Sharma
Amir,
We had a similar requirement to consume every message reliably; the approach
I picked was to push any message whose consumption failed to a
secondary topic for later handling; in our case the messages/events
were independent, so we would make a second attempt at consumption and on
any failure push the message to a poison queue (Elasticsearch in our case).

Regards, ajay

On 8/16/16, 1:50 AM, "Amir Zuker"  wrote:

>Hi everyone,
>
>I have a question regarding the 'KafkaConsumer' and its API in regards to
>committing offsets. (kafka-clients 0.9.0.1)
>
>*The scenario -*
>I am working with auto commit set to disabled because I want to implement
>a
>retry mechanism and eventually transfer the message to another topic that
>contains the poison messages.
>Since I want it to be reliable, I am not using auto commit and I wish
>to take control over when that should happen.
>
>*The implementation detail -*
>My class that extends 'Runnable' and is created by the KafkaConsumer needs
>to commit the offset once it is done with handling the topic message.
>However, the API for committing messages is located on the KafkaConsumer
>with no relation to partition or thread.
>
>*The problem -*
>If I understand correctly, I can use the same KafkaConsumer instance with
>multiple threads against multiple partitions.
>If that is the case, how can I commit the offset specific to my 'Runnable'
>instance that just processed a single message without affecting other
>threads and partitions?
>
>Thanks in advance,
>Amir Zuker
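
A minimal sketch of the retry-then-poison-topic pattern described above, assuming the 0.9 Java clients (the topic name, handler, and producer wiring are hypothetical stand-ins):

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class RetryThenPoison {
    private final KafkaProducer<String, String> producer; // wired up elsewhere
    private static final String POISON_TOPIC = "events-poison"; // illustrative name

    RetryThenPoison(KafkaProducer<String, String> producer) {
        this.producer = producer;
    }

    void handle(ConsumerRecord<String, String> record) {
        try {
            process(record);          // first attempt
        } catch (Exception first) {
            try {
                process(record);      // one retry, as described above
            } catch (Exception second) {
                // Park the message on the poison topic for later handling.
                producer.send(new ProducerRecord<>(POISON_TOPIC,
                                                   record.key(), record.value()));
            }
        }
    }

    private void process(ConsumerRecord<String, String> record) throws Exception {
        // application-specific consumption logic goes here
    }
}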



Re: Automated Testing w/ Kafka Streams

2016-08-16 Thread Mathieu Fenniak
Hi Guozhang,

Thanks for the feedback.  What would you think about
including ProcessorTopologyTestDriver in a released artifact from Kafka
Streams in a future release?  Or alternatively, what other approach would
you recommend to incorporating it into another project's tests?  I can copy
it wholesale into my project and it works fine, but I'll have to keep it
up-to-date by hand, which isn't ideal. :-)

Mathieu


On Mon, Aug 15, 2016 at 3:24 PM, Guozhang Wang  wrote:

> Mathieu,
>
> Your composition of Per-module Unit Tests + ProcessorTopologyTestDriver +
> System Tests looks good to me, and I agree with you that since this is part
> of your pre-commit process, which could be triggered concurrently from
> different developers / teams, EmbeddedSingleNodeKafkaCluster +
> EmbeddedZookeeper may not work best for you.
>
>
> Guozhang
>
>
> On Mon, Aug 15, 2016 at 1:39 PM, Radoslaw Gruchalski  >
> wrote:
>
> > Out of curiosity, are you aware of kafka.util.TestUtils and Apache
> Curator
> > TestingServer?
> > I’m using this successfully to test publish / consume scenarios with
> things
> > like Flink, Spark and custom apps.
> > What would stop you from taking the same approach?
> >
> > –
> > Best regards,
> > Radek Gruchalski
> > ra...@gruchalski.com
> >
> >
> > On August 15, 2016 at 9:41:37 PM, Mathieu Fenniak (
> > mathieu.fenn...@replicon.com) wrote:
> >
> > Hi Michael,
> >
> > It would definitely be an option. I am not currently doing any testing
> > like that; it could replace the ProcessorTopologyTestDriver-style
> testing
> > that I'd like to do, but there are some trade-offs to consider:
> >
> > - I can't do an isolated test of just the TopologyBuilder; I'd be
> > bringing in configuration management code (eg. configuring where to
> access
> > ZK + Kafka).
> > - Tests using a running Kafka server wouldn't have a clear end-point; if
> > something in the topology doesn't publish a message where I expected it
> to,
> > my test can only fail via a timeout.
> > - Tests are likely to be slower; this might not be significant, but a
> > small difference in test speed has a big impact in productivity after a
> few
> > months of development
> > - Tests will be more complex & fragile; some additional component needs
> > to manage starting up that Kafka server, making sure it's ready-to-go,
> > running tests, and then tearing it down
> > - Tests will have to be cautious of state existing in Kafka. eg. two
> > test suites that touch the same topics could be influenced by state of a
> > previous test. Either you take a "destroy the world" approach between
> test
> > cases (or test suites), which probably makes test speed much worse, or,
> you
> > find another way to isolate test's state.
> >
> > I'd have to face all these problems at the higher level that I'm calling
> > "systems-level tests", but, I think it would be better to do the majority
> > of the automated testing at a lower level that doesn't bring these
> > considerations into play.
> >
> > Mathieu
> >
> >
> > On Mon, Aug 15, 2016 at 12:13 PM, Michael Noll 
> > wrote:
> >
> > > Mathieu,
> > >
> > > follow-up question: Are you also doing or considering integration
> testing
> > > by spawning a local Kafka cluster and then reading/writing to that
> > cluster
> > > (often called embedded or in-memory cluster)? This approach would be in
> > > the middle between ProcessorTopologyTestDriver (that does not spawn a
> > Kafka
> > > cluster) and your system-level testing (which I suppose is running
> > against
> > > a "real" test Kafka cluster).
> > >
> > > -Michael
> > >
> > >
> > >
> > >
> > >
> > > On Mon, Aug 15, 2016 at 3:44 PM, Mathieu Fenniak <
> > > mathieu.fenn...@replicon.com> wrote:
> > >
> > > > Hey all,
> > > >
> > > > At my workplace, we have a real focus on software automated testing.
> > I'd
> > > > love to be able to test the composition of a TopologyBuilder with
> > > > org.apache.kafka.test.ProcessorTopologyTestDriver
> > > > <https://github.com/apache/kafka/blob/…317b95efa4/streams/src/test/java/org/apache/kafka/test/ProcessorTopologyTestDriver.java>;
> > > > has there ever been any thought given to making this part of the
> public
> > > API
> > > > of Kafka Streams?
> > > >
> > > > For some background, here are some details on the automated testing
> > plan
> > > > that I have in mind for a Kafka Streams application. Our goal is to
> > > enable
> > > > continuous deployment of any new development we do, so, it has to be
> > > > rigorously tested with complete automation.
> > > >
> > > > As part of our pre-commit testing, we'd first have these gateways; no
> > > code
> > > > would reach our master branch without passing these tests:
> > > >
> > > > - At the finest level, unit tests covering individual pieces like a
> > > > Serde, ValueMapper, ValueJoiner, aggregate adder/subtractor, etc.
> > > These
> > > > pieces are very isolated, very easy 

Re: [DISCUSS] Java 8 as a minimum requirement

2016-08-16 Thread Ismael Juma
Hey Harsha,

I noticed that you proposed that Storm should drop support for Java 7 in
master:

http://markmail.org/message/25do6wd3a6g7cwpe

It's useful to know what other Apache projects are doing in this regard, so
I'm interested in the timeline being proposed for Storm's transition. I
could not find it in the thread above, so I'd appreciate it if you could
clarify it for us (and sorry if I missed it).

Thanks,
Ismael

On Mon, Jun 20, 2016 at 5:05 AM, Harsha  wrote:

> Hi Ismael,
>   Agreed that timing is more important. If we give enough heads-up
>   to the users who are on Java 7, that's great, but shipping this in
>   the 0.10.x line won't be good, as it is still perceived as a
>   maintenance release even if it contains a lot of features. If we
>   can make this part of 0.11, move the planned 0.10.1 features to
>   0.11, and give a rough timeline for when that would be released,
>   that would be ideal.
>
> Thanks,
> Harsha
>
> On Fri, Jun 17, 2016, at 11:13 AM, Ismael Juma wrote:
> > Hi Harsha,
> >
> > Comments below.
> >
> > On Fri, Jun 17, 2016 at 7:48 PM, Harsha  wrote:
> >
> > > Hi Ismael,
> > > "Are you saying that you are aware of many Kafka users still
> > > using Java 7
> > > > who would be ready to upgrade to the next Kafka feature release
> (whatever
> > > > that version number is) before they can upgrade to Java 8?"
> > > I know there are quite a few users who are still on Java 7
> >
> >
> > This is good to know.
> >
> >
> > > and regarding the
> > > upgrade, we can't say yes or no. It's up to the user's discretion when
> > > they choose to upgrade, and of course on whether there are any critical
> > > fixes that might go into the release. We shouldn't be restricting their
> > > upgrade path just because we removed Java 7 support.
> > >
> >
> > My point is that both paths have their pros and cons and we need to weigh
> > them up. If some users are slow to upgrade the Java version (Java 7 has
> > been EOL'd for over a year), there's a good chance that they are slow to
> > upgrade Kafka too. And if that is the case (and it may not be), then
> > holding up improvements for the ones who actually do upgrade may be the
> > wrong call. To be clear, I am still in listening mode and I haven't made
> > up
> > my mind on the subject.
> >
> > Once we released 0.9.0, there weren't any more 0.8.x releases; i.e., we
> > don't have an LTS-type release where we continually ship critical fixes
> > over 0.8.x minor releases. So if a user needs a critical fix, the only
> > option today is to upgrade to the next version where that fix is shipped.
> > >
> >
> > We haven't done a great job at this in the past, but there is no decision
> > that once a new major release is out, we don't do patch releases for the
> > previous major release. In fact, we have been collecting critical fixes
> > in
> > the 0.9.0 branch for a potential 0.9.0.2.
> >
> > > I understand there is no decision made yet, but the premise was to ship
> > > this in 0.10.x, possibly 0.10.1, which I don't agree with. I am in
> > > general against shipping this in a 0.10.x version. Removing Java 7
> > > support in a minor release is generally not a good idea for users.
> > >
> >
> > Sorry if I didn't communicate this properly. I simply meant the next
> > feature release. I used 0.10.1.0 as an example, but it could also be
> > 0.11.0.0 if that turns out to be the next release. A discussion on that
> > will probably take place once the scope is clear. Personally, I think the
> > timing is more important than the version number, but it seems like some
> > people disagree.
> >
> > Ismael
>


RE: DLL Hell

2016-08-16 Thread Martin Gainty



> From: mathieu.fenn...@replicon.com
> Date: Tue, 16 Aug 2016 06:57:16 -0600
> Subject: Re: DLL Hell
> To: users@kafka.apache.org
> 
> Hey Martin,
> 
> I had to modify the -G argument to that command to include the visual
> studio year.  If you run "cmake /?", it will output all the available
> generators.  My cmake looked like:
> 
> cmake -G "Visual Studio 12 2013 Win64" -DJNI=1 ..
> 
> I think this is probably a change in cmake since the rocksdb doc was
> written (
> https://cmake.org/cmake/help/v3.0/generator/Visual%20Studio%2012%202013.html
> ).
MG>same "informative error"
>C:\cygwin64\bin\cmake -G "Visual Studio 12 2013 Win64" -DJNI=1
CMake Error: Could not create named generator Visual Studio 12 2013 Win64
Generators
  Unix Makefiles                   = Generates standard UNIX makefiles.
  Ninja                            = Generates build.ninja files.
  CodeBlocks - Ninja               = Generates CodeBlocks project files.
  CodeBlocks - Unix Makefiles      = Generates CodeBlocks project files.
  CodeLite - Ninja                 = Generates CodeLite project files.
  CodeLite - Unix Makefiles        = Generates CodeLite project files.
  Eclipse CDT4 - Ninja             = Generates Eclipse CDT 4.0 project files.
  Eclipse CDT4 - Unix Makefiles    = Generates Eclipse CDT 4.0 project files.
  KDevelop3                        = Generates KDevelop 3 project files.
  KDevelop3 - Unix Makefiles       = Generates KDevelop 3 project files.
  Kate - Ninja                     = Generates Kate project files.
  Kate - Unix Makefiles            = Generates Kate project files.
  Sublime Text 2 - Ninja           = Generates Sublime Text 2 project files.
  Sublime Text 2 - Unix Makefiles  = Generates Sublime Text 2 project files.
MG>I am thinking, if I want to automate this native build, could I more easily
create the binary through maven-nar-plugin?
MG>As I do not have any MS Visual Studio or .NET installed, maybe I need to
install many gigs of MS-specific Visual Studio?
MG>Please advise
> Mathieu
> 
> 
> On Tue, Aug 16, 2016 at 5:03 AM, Martin Gainty  wrote:
> 
> > haven't used cmake in over 10 years, so I'm a bit lost..
> > cmake -G "Visual Studio 12 Win64" -DGFLAGS=1 -DSNAPPY=1 -DJEMALLOC=1
> > -DJNI=1
> > CMake Error: Could not create named generator Visual Studio 12 Win64
> > ?Please advise
> > Martin
> > __
> >
> >
> >
> > > From: mathieu.fenn...@replicon.com
> > > Date: Mon, 15 Aug 2016 13:43:47 -0600
> > > Subject: Re: DLL Hell
> > > To: users@kafka.apache.org
> > >
> > > Hi Martin,
> > >
> > > rocksdb does not currently distribute a Windows-compatible build of their
> > > rocksdbjni library.  I recently wrote up some instructions on how to
> > > produce a local build, which you can find here:
> > > http://mail-archives.apache.org/mod_mbox/kafka-users/201608.mbox/%3CCAHoiPjweo-xSj3TiodcDVf4wNnnJ8u6PcwWDPF7LT5ps%2BxQ3eA%40mail.gmail.com%3E
> > >
> > > I'd also suggest tracking this issue in GitHub, which is likely to be
> > > updated if this ever changes: https://github.com/facebook/rocksdb/issues/703
> > >
> > > Mathieu
> > >
> > >
> > > On Mon, Aug 15, 2016 at 1:34 PM, Martin Gainty 
> > wrote:
> > >
> > > > kafka-trunk\streams>gradle build
> > > > Caused by: java.lang.RuntimeException: librocksdbjni-win64.dll was not found inside JAR.
> > > >     at org.rocksdb.NativeLibraryLoader.loadLibraryFromJarToTemp(NativeLibraryLoader.java:106)
> > > >     at org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(NativeLibraryLoader.java:78)
> > > >     at org.rocksdb.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:56)
> > > >     at org.rocksdb.RocksDB.loadLibrary(RocksDB.java:47)
> > > >     at org.rocksdb.RocksDB.<clinit>(RocksDB.java:23)
> > > > any idea where I can locate librocksdbjni-win64.dll ?
> > > > /thanks/
> > > > Martin
> > > > __
> > > >
> > > >
> >
> >
  

Re: Automated Testing w/ Kafka Streams

2016-08-16 Thread Mathieu Fenniak
Hi Radek,

No, I'm not familiar with these tools.  I see that Curator's TestingServer
looks pretty straight-forward, but, I'm not really sure what
kafka.util.TestUtils
is.  I can't find any documentation referring to this, and it doesn't seem
to be a part of any published maven artifacts in the Kafka project; can you
point me at what you're using a little more specifically?

Mathieu


On Mon, Aug 15, 2016 at 2:39 PM, Radoslaw Gruchalski 
wrote:

> Out of curiosity, are you aware of kafka.util.TestUtils and Apache Curator
> TestingServer?
> I’m using this successfully to test publish / consume scenarios with
> like Flink, Spark and custom apps.
> What would stop you from taking the same approach?
>
> –
> Best regards,
> Radek Gruchalski
> ra...@gruchalski.com
>
>
> On August 15, 2016 at 9:41:37 PM, Mathieu Fenniak (
> mathieu.fenn...@replicon.com) wrote:
>
> Hi Michael,
>
> It would definitely be an option. I am not currently doing any testing
> like that; it could replace the ProcessorTopologyTestDriver-style testing
> that I'd like to do, but there are some trade-offs to consider:
>
> - I can't do an isolated test of just the TopologyBuilder; I'd be
> bringing in configuration management code (eg. configuring where to access
> ZK + Kafka).
> - Tests using a running Kafka server wouldn't have a clear end-point; if
> something in the topology doesn't publish a message where I expected it to,
> my test can only fail via a timeout.
> - Tests are likely to be slower; this might not be significant, but a
> small difference in test speed has a big impact in productivity after a
> few
> months of development
> - Tests will be more complex & fragile; some additional component needs
> to manage starting up that Kafka server, making sure it's ready-to-go,
> running tests, and then tearing it down
> - Tests will have to be cautious of state existing in Kafka. eg. two
> test suites that touch the same topics could be influenced by state of a
> previous test. Either you take a "destroy the world" approach between test
> cases (or test suites), which probably makes test speed much worse, or,
> you
> find another way to isolate test's state.
>
> I'd have to face all these problems at the higher level that I'm calling
> "systems-level tests", but, I think it would be better to do the majority
> of the automated testing at a lower level that doesn't bring these
> considerations into play.
>
> Mathieu
>
>
> On Mon, Aug 15, 2016 at 12:13 PM, Michael Noll 
> wrote:
>
> > Mathieu,
> >
> > follow-up question: Are you also doing or considering integration
> testing
> > by spawning a local Kafka cluster and then reading/writing to that
> cluster
> > (often called embedded or in-memory cluster)? This approach would be in
> > the middle between ProcessorTopologyTestDriver (that does not spawn a
> Kafka
> > cluster) and your system-level testing (which I suppose is running
> against
> > a "real" test Kafka cluster).
> >
> > -Michael
> >
> >
> >
> >
> >
> > On Mon, Aug 15, 2016 at 3:44 PM, Mathieu Fenniak <
> > mathieu.fenn...@replicon.com> wrote:
> >
> > > Hey all,
> > >
> > > At my workplace, we have a real focus on software automated testing.
> I'd
> > > love to be able to test the composition of a TopologyBuilder with
> > > org.apache.kafka.test.ProcessorTopologyTestDriver
> > > <https://github.com/apache/kafka/blob/…317b95efa4/streams/src/test/java/org/apache/kafka/test/ProcessorTopologyTestDriver.java>;
> > > has there ever been any thought given to making this part of the
> public
> > API
> > > of Kafka Streams?
> > >
> > > For some background, here are some details on the automated testing
> plan
> > > that I have in mind for a Kafka Streams application. Our goal is to
> > enable
> > > continuous deployment of any new development we do, so, it has to be
> > > rigorously tested with complete automation.
> > >
> > > As part of our pre-commit testing, we'd first have these gateways; no
> > code
> > > would reach our master branch without passing these tests:
> > >
> > > - At the finest level, unit tests covering individual pieces like a
> > > Serde, ValueMapper, ValueJoiner, aggregate adder/subtractor, etc.
> > These
> > > pieces are very isolated, very easy to unit test.
> > > - At a higher level, I'd like to have component tests of the
> > composition
> > > of the TopologyBuilder; this is where ProcessorTopologyTestDriver
> > would
> > > be
> > > valuable. There'd be far fewer of these tests than the lower-level
> > > tests.
> > > There are no external dependencies to these tests, so they'd be very
> > > fast.
> > >
> > > Having passed that level of testing, we'd deploy the Kafka Streams
> > > application to an integration testing area where the rest of our
> > > application is kept up-to-date, and proceed with these integration
> tests:
> > >
> > > - Systems-level tests where we synthesize inputs to the Kafka topics,
> > > wait for 

Mismatch in the number of messages processed

2016-08-16 Thread Jayesh
Hello,

I have a very basic doubt.

I created a Kafka topic and produced 10 messages using the
kafka-console-producer utility. When I consume messages from this topic, it
consumes 10 messages, which is fine. However, it reports that I have processed
a total of 11 messages. This number is +1 over the total number of messages
consumed, each time.

Why is that so? Does it read any extra information that it counts as a
message?
I am attaching a couple of screenshots. Thank you already for your help.

[Attachments: Screen Shot 2016-08-16 at 6.38.28 pm.png, Screen Shot 2016-08-16 at 6.38.41 pm.png]


Re: DLL Hell

2016-08-16 Thread Mathieu Fenniak
Hey Martin,

I had to modify the -G argument to that command to include the visual
studio year.  If you run "cmake /?", it will output all the available
generators.  My cmake looked like:

cmake -G "Visual Studio 12 2013 Win64" -DJNI=1 ..

I think this is probably a change in cmake since the rocksdb doc was
written (
https://cmake.org/cmake/help/v3.0/generator/Visual%20Studio%2012%202013.html
).

Mathieu


On Tue, Aug 16, 2016 at 5:03 AM, Martin Gainty  wrote:

> haven't used cmake in over 10 years, so I'm a bit lost..
> cmake -G "Visual Studio 12 Win64" -DGFLAGS=1 -DSNAPPY=1 -DJEMALLOC=1
> -DJNI=1
> CMake Error: Could not create named generator Visual Studio 12 Win64
> ?Please advise
> Martin
> __
>
>
>
> > From: mathieu.fenn...@replicon.com
> > Date: Mon, 15 Aug 2016 13:43:47 -0600
> > Subject: Re: DLL Hell
> > To: users@kafka.apache.org
> >
> > Hi Martin,
> >
> > rocksdb does not currently distribute a Windows-compatible build of their
> > rocksdbjni library.  I recently wrote up some instructions on how to
> > produce a local build, which you can find here:
> > http://mail-archives.apache.org/mod_mbox/kafka-users/201608.mbox/%3CCAHoiPjweo-xSj3TiodcDVf4wNnnJ8u6PcwWDPF7LT5ps%2BxQ3eA%40mail.gmail.com%3E
> >
> > I'd also suggest tracking this issue in GitHub, which is likely to be
> > updated if this ever changes: https://github.com/facebook/rocksdb/issues/703
> >
> > Mathieu
> >
> >
> > On Mon, Aug 15, 2016 at 1:34 PM, Martin Gainty 
> wrote:
> >
> > > kafka-trunk\streams>gradle build
> > > Caused by: java.lang.RuntimeException: librocksdbjni-win64.dll was not found inside JAR.
> > >     at org.rocksdb.NativeLibraryLoader.loadLibraryFromJarToTemp(NativeLibraryLoader.java:106)
> > >     at org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(NativeLibraryLoader.java:78)
> > >     at org.rocksdb.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:56)
> > >     at org.rocksdb.RocksDB.loadLibrary(RocksDB.java:47)
> > >     at org.rocksdb.RocksDB.<clinit>(RocksDB.java:23)
> > > any idea where I can locate librocksdbjni-win64.dll ?
> > > /thanks/
> > > Martin
> > > __
> > >
> > >
>
>


Re: consumer.poll() hangs indefinitely in docker container

2016-08-16 Thread Oleg Zhurakousky
It is accurate, since it's an API/implementation problem and therefore container
independent. Sure, if everything is configured correctly and the broker is
accessible then things do work, but try to shut down the consumer when the broker
is not accessible. And when I say shut down, I am not implying shutting down the
JVM via System.exit(); I am simply saying that poll() will block indefinitely,
and so will close().

In fact it’s a very common problem in Kafka components (see below)

https://issues.apache.org/jira/browse/KAFKA-3540
https://issues.apache.org/jira/browse/KAFKA-1894
https://issues.apache.org/jira/browse/KAFKA-3539

Cheers
Oleg
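
For consumers that do reach this state, one partial remedy worth noting: KafkaConsumer.wakeup() is documented as the one consumer method that is safe to call from another thread, and it makes a blocked poll() throw WakeupException, giving a shutdown path that does not rely on poll() returning on its own. A minimal sketch, assuming the 0.9/0.10 Java clients (addresses, topic, and group names are illustrative):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.WakeupException;

public class WakeupShutdown {
    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", "172.17.0.1:9092"); // illustrative
        props.put("group.id", "example-group");            // illustrative
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        final KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("some-topic")); // illustrative

        Thread poller = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        consumer.poll(1000); // may block if the broker is unreachable
                    }
                } catch (WakeupException expected) {
                    // wakeup() lands here; fall through to close.
                } finally {
                    consumer.close();
                }
            }
        });
        poller.start();

        Thread.sleep(5000);   // ... decide to shut down ...
        consumer.wakeup();    // safe to call from this thread; aborts a blocked poll()
        poller.join();
    }
}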

On Aug 16, 2016, at 5:06 AM, Jaikiran Pai 
> wrote:

On Friday 12 August 2016 08:45 PM, Oleg Zhurakousky wrote:
It hangs indefinitely in any container.

I don't think that's accurate. We have been running Kafka brokers and 
consumers/producers in docker containers for a while now and they are 
functional. Of course, you need to make sure you use the IP addresses instead 
of localhost/127.0.0.1 to make sure that the brokers are accessible to the 
consumers/producers and you don't run into the situation that you explain about 
the broker connection not happening successfully.

By the way, I am not saying that the consumer.poll() doesn't have that issue 
you state.

-Jaikiran


It’s a known issue and has been brought up many times on this list, yet there
is no fix for it.
The problem is that while poll() creates an illusion of being async, and even
allows you to set a timeout, it is essentially very misleading if you look
inside its implementation. The first call it makes is to fetch topic metadata,
and that call is not covered by the timeout, so if a connection to the broker
is not available you’re dead, since Kafka attempts to reconnect and there is no
property to cap reconnect attempts; it attempts to reconnect indefinitely.

Cheers
Oleg

On Aug 12, 2016, at 9:22 AM, Brem, Robert 
> wrote:

Hi, I'm new to Kafka and to messaging in general.

I have a simple Java application that contains a consumer and a producer. It is
working on the host system, but if I try to run it in a Docker container (Kafka
is not in the container; it is still on the host), consumer.poll() hangs and
does not return.
telnet tells me that inside the container the host:port 172.17.0.1:9092 is open.

In the docker container on startup Kafka tells me: Marking the coordinator ... 
dead for group ...

Can you give me a hint, in which direction I should look?
Thanks!

That's the output on the host, with the working application:

15:03:42,657 INFO  [org.apache.kafka.clients.consumer.ConsumerConfig] 
(ServerService Thread Pool -- 63) ConsumerConfig values:
   metric.reporters = []
   metadata.max.age.ms = 30
   partition.assignment.strategy = 
[org.apache.kafka.clients.consumer.RangeAssignor]
   reconnect.backoff.ms = 50
   sasl.kerberos.ticket.renew.window.factor = 0.8
   max.partition.fetch.bytes = 1048576
   bootstrap.servers = [172.17.0.1:9092]
   ssl.keystore.type = JKS
   enable.auto.commit = true
   sasl.mechanism = GSSAPI
   interceptor.classes = null
   exclude.internal.topics = true
   ssl.truststore.password = null
   client.id = consumer-1
   ssl.endpoint.identification.algorithm = null
   max.poll.records = 2147483647
   check.crcs = true
   request.timeout.ms = 4
   heartbeat.interval.ms = 3000
   auto.commit.interval.ms = 5000
   receive.buffer.bytes = 65536
   ssl.truststore.type = JKS
   ssl.truststore.location = null
   ssl.keystore.password = null
   fetch.min.bytes = 1
   send.buffer.bytes = 131072
   value.deserializer = class 
org.apache.kafka.common.serialization.StringDeserializer
   group.id = pokertracker
   retry.backoff.ms = 100
   sasl.kerberos.kinit.cmd = /usr/bin/kinit
   sasl.kerberos.service.name = null
   sasl.kerberos.ticket.renew.jitter = 0.05
   ssl.trustmanager.algorithm = PKIX
   ssl.key.password = null
   fetch.max.wait.ms = 500
   sasl.kerberos.min.time.before.relogin = 6
   connections.max.idle.ms = 54
   session.timeout.ms = 1
   metrics.num.samples = 2
   key.deserializer = class 
org.apache.kafka.common.serialization.StringDeserializer
   ssl.protocol = TLS
   ssl.provider = null
   ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
   ssl.keystore.location = null
   ssl.cipher.suites = 

Re: Lost offsets after migration to Kafka brokers v0.10.0

2016-08-16 Thread Sam Pegler
clj-kafka uses the old consumer APIs and offset storage in ZK.  If I were
you I'd migrate to https://github.com/weftio/gregor which wraps the new
consumer API and stores offsets in Kafka.

I'm going to assume you didn't migrate ZK state based off this?
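
For context, the old (pre-0.9) high-level consumer chooses its offset store via configuration: offsets.storage selects ZooKeeper or Kafka, and dual.commit.enabled commits to both during a migration window. A minimal sketch of those settings (connection values and group name are illustrative):

import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.javaapi.consumer.ConsumerConnector;

public class OldConsumerOffsetStorage {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // illustrative
        props.put("group.id", "my-group");                // illustrative
        // Store offsets in Kafka rather than ZooKeeper...
        props.put("offsets.storage", "kafka");
        // ...and, during a migration window, commit to both stores.
        props.put("dual.commit.enabled", "true");
        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        // ... create message streams as usual, then connector.shutdown()
    }
}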

__


On 16 August 2016 at 12:15, Javier Holguera <
javier.holgu...@fundingcircle.com> wrote:

> Hi,
>
> Yesterday my company completed a “successful” migration from Kafka brokers
> v0.9.0.1 to Kafka 0.10.0.
>
> However, the migration can’t be considered completely successful, because
> we accidentally lost our offsets. Fortunately our apps are designed to be
> able to replay from the beginning of the topic without much problem, but
> it’s something we weren’t expecting and I would like to understand what we
> did wrong to let this happen.
>
> Our apps use kafka client v0.8.2.1 wrapped around latest version of
> clj-kafka. We are using its functionality to commit offsets (like here:
> https://github.com/pingles/clj-kafka/blob/master/src/clj_kafka/offset.clj#L70)
> using OffsetCommitRequest.
>
> Any help would be welcomed.
>
> Thanks!
>
> --
> Javier Holguera
> Sent with Airmail
>


Lost offsets after migration to Kafka brokers v0.10.0

2016-08-16 Thread Javier Holguera
Hi,

Yesterday my company completed a “successful” migration from Kafka brokers
v0.9.0.1 to Kafka 0.10.0.

However, the migration can’t be considered completely successful, because
we accidentally lost our offsets. Fortunately our apps are designed to be
able to replay from the beginning of the topic without much problem, but
it’s something we weren’t expecting and I would like to understand what we
did wrong to let this happen.

Our apps use kafka client v0.8.2.1 wrapped around latest version of
clj-kafka. We are using its functionality to commit offsets (like here:
https://github.com/pingles/clj-kafka/blob/master/src/clj_kafka/offset.clj#L70)
using OffsetCommitRequest.

Any help would be welcomed.

Thanks!

-- 
Javier Holguera
Sent with Airmail


RE: DLL Hell

2016-08-16 Thread Martin Gainty
haven't used cmake in over 10 years, so I'm a bit lost..
cmake -G "Visual Studio 12 Win64" -DGFLAGS=1 -DSNAPPY=1 -DJEMALLOC=1 -DJNI=1
CMake Error: Could not create named generator Visual Studio 12 Win64
?Please advise
Martin 
__ 



> From: mathieu.fenn...@replicon.com
> Date: Mon, 15 Aug 2016 13:43:47 -0600
> Subject: Re: DLL Hell
> To: users@kafka.apache.org
> 
> Hi Martin,
> 
> rocksdb does not currently distribute a Windows-compatible build of their
> rocksdbjni library.  I recently wrote up some instructions on how to
> produce a local build, which you can find here:
> http://mail-archives.apache.org/mod_mbox/kafka-users/201608.mbox/%3CCAHoiPjweo-xSj3TiodcDVf4wNnnJ8u6PcwWDPF7LT5ps%2BxQ3eA%40mail.gmail.com%3E
> 
> I'd also suggest tracking this issue in GitHub, which is likely to be
> updated if this ever changes: https://github.com/facebook/rocksdb/issues/703
> 
> Mathieu
> 
> 
> On Mon, Aug 15, 2016 at 1:34 PM, Martin Gainty  wrote:
> 
> > kafka-trunk\streams>gradle build
> > Caused by: java.lang.RuntimeException: librocksdbjni-win64.dll was not found inside JAR.
> >     at org.rocksdb.NativeLibraryLoader.loadLibraryFromJarToTemp(NativeLibraryLoader.java:106)
> >     at org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(NativeLibraryLoader.java:78)
> >     at org.rocksdb.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:56)
> >     at org.rocksdb.RocksDB.loadLibrary(RocksDB.java:47)
> >     at org.rocksdb.RocksDB.<clinit>(RocksDB.java:23)
> > any idea where I can locate librocksdbjni-win64.dll ?
> > /thanks/
> > Martin
> > __
> >
> >
  

Re: Scala: Kafka Consumer (kafka-clients 0.9.0.1)

2016-08-16 Thread Amir Zuker
This is not the case in my situation.
I am using an older version which doesn't have the subscribe model, and
processing messages can occur concurrently in multiple threads.

If I use the KafkaConsumer with multiple threads and partitions -
kafkaConsumer.run(5) //5 thread count -
How can I commit an offset for a specific thread and partition in
kafka-clients 0.9.0.1?

Thanks



Regards,
Amir Zuker

On Tue, Aug 16, 2016 at 1:36 PM, Sudev A C  wrote:

> Hi,
>
> The message object consists of partition, topic, offset, and message.
> https://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html
> You may use this to get the current offsets for a topic-partition combination.
>
> Thanks
> Sudev
>
> On Tue, Aug 16, 2016 at 2:20 PM, Amir Zuker  wrote:
>
> > Hi everyone,
> >
> > I have a question regarding the 'KafkaConsumer' and its API in regards to
> > committing offsets. (kafka-clients 0.9.0.1)
> >
> > *The scenario -*
> > I am working with auto commit set to disabled because I want to
> implement a
> > retry mechanism and eventually transfer the message to another topic that
> > contains the poison messages.
> > Since I want it to be reliable, I am not using auto commit and I wish
> > to take control over when that should happen.
> >
> > *The implementation detail -*
> > My class that extends 'Runnable' and is created by the KafkaConsumer
> needs
> > to commit the offset once it is done with handling the topic message.
> > However, the API for committing messages is located on the KafkaConsumer
> > with no relation to partition or thread.
> >
> > *The problem -*
> > If I understand correctly, I can use the same KafkaConsumer instance with
> > multiple threads against multiple partitions.
> > If that is the case, how can I commit the offset specific to my
> 'Runnable'
> > instance that just processed a single message without affecting other
> > threads and partitions?
> >
> > Thanks in advance,
> > Amir Zuker
> >
>
>
>
> --
> Thanks
> Sudev A C
> Data Team
>


Re: Scala: Kafka Consumer (kafka-clients 0.9.0.1)

2016-08-16 Thread Sudev A C
Hi,

The message object consists of partition, topic, offset, and message.
https://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html
You may use this to get the current offsets for a topic-partition combination.

Thanks
Sudev

On Tue, Aug 16, 2016 at 2:20 PM, Amir Zuker  wrote:

> Hi everyone,
>
> I have a question regarding the 'KafkaConsumer' and its API in regards to
> committing offsets. (kafka-clients 0.9.0.1)
>
> *The scenario -*
> I am working with auto commit set to disabled because I want to implement a
> retry mechanism and eventually transfer the message to another topic that
> contains the poison messages.
> Since I want it to be reliable, I am not using auto commit and I wish
> to take control over when that should happen.
>
> *The implementation detail -*
> My class that extends 'Runnable' and is created by the KafkaConsumer needs
> to commit the offset once it is done with handling the topic message.
> However, the API for committing messages is located on the KafkaConsumer
> with no relation to partition or thread.
>
> *The problem -*
> If I understand correctly, I can use the same KafkaConsumer instance with
> multiple threads against multiple partitions.
> If that is the case, how can I commit the offset specific to my 'Runnable'
> instance that just processed a single message without affecting other
> threads and partitions?
>
> Thanks in advance,
> Amir Zuker
>



-- 
Thanks
Sudev A C
Data Team
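
Building on Sudev's point: with kafka-clients 0.9 you can commit a single partition's position explicitly by passing a map to commitSync(). A minimal sketch (class and method names are illustrative):

import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class PerPartitionCommit {
    // Commits only the offset of the record just processed, leaving the
    // positions of all other partitions untouched.
    static void commitRecord(KafkaConsumer<?, ?> consumer,
                             ConsumerRecord<?, ?> record) {
        TopicPartition tp = new TopicPartition(record.topic(), record.partition());
        // Commit the offset of the *next* record to consume, hence +1.
        consumer.commitSync(Collections.singletonMap(
                tp, new OffsetAndMetadata(record.offset() + 1)));
    }
}

Note that the 0.9 KafkaConsumer itself is not thread-safe, so worker threads should hand finished offsets back to the polling thread (for example via a queue) rather than calling commitSync() themselves.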


Rebuilding corrupt index takes long --- or not?

2016-08-16 Thread Harald Kirsch

Hi all,

we just had a case with Kafka 0.9 where an index rebuild for ~200 MB
segments took 45 seconds on average. All indexes of the partition were
corrupt. There are 13 segments, and the rebuild took 10 minutes.


After the rebuild, these are representative sizes:

% ll -h /data/xyz-0
-rw-r--r-- 1 solr solr  45K Aug 16 10:44 00096346.index
-rw-r--r-- 1 solr solr 191M Aug 16 10:44 00096346.log

I now wonder whether this is expected. Here is a log excerpt showing
the long run time for one segment; this one took more than one minute.



[2016-08-16 10:44:20,831] WARN Found a corrupted index file, 
/data/xyz-0/00096346.index, deleting and rebuilding index... 
(kafka.log.Log)
[2016-08-16 10:45:46,305] WARN Found a corrupted index file, 
/data/xyz-0/00011722.index, deleting and rebuilding index... 
(kafka.log.Log)


These runtimes seem excessive on normal desktop hardware. Or am I
underestimating the effort necessary to rebuild an index file?


Harald.