Re: [ANNOUNCE] Apache Kafka 1.1.1 Released

2018-07-19 Thread Ismael Juma
Thank you for managing the release, Dong!

Ismael

On Thu, 19 Jul 2018, 16:54 Dong Lin,  wrote:

> The Apache Kafka community is pleased to announce the release for Apache
> Kafka 1.1.1.
>
> This is a bug fix release and it includes fixes and improvements from 43
> JIRAs, including a few critical bugs.
>
> All of the changes in this release can be found in the release notes:
>
> https://dist.apache.org/repos/dist/release/kafka/1.1.1/RELEASE_NOTES.html
>
> You can download the source release from:
>
> https://www.apache.org/dyn/closer.cgi?path=/kafka/1.1.1/kafka-1.1.1-src.tgz
>
> and binary releases from:
>
> https://www.apache.org/dyn/closer.cgi?path=/kafka/1.1.1/kafka_2.11-1.1.1.tgz
> (Scala 2.11)
>
> https://www.apache.org/dyn/closer.cgi?path=/kafka/1.1.1/kafka_2.12-1.1.1.tgz
> (Scala 2.12)
>
>
> 
> ---
>
>
> Apache Kafka is a distributed streaming platform with four core APIs:
>
> ** The Producer API allows an application to publish a stream of records to
> one
> or more Kafka topics.
>
> ** The Consumer API allows an application to subscribe to one or more
> topics
> and process the stream of records produced to them.
>
> ** The Streams API allows an application to act as a stream processor,
> consuming
> an input stream from one or more topics and producing an output stream to
> one or more output topics, effectively transforming the input streams to
> output streams.
>
> ** The Connector API allows building and running reusable producers or
> consumers
> that connect Kafka topics to existing applications or data systems. For
> example, a connector to a relational database might capture every change to
> a table.
>
>
> With these APIs, Kafka can be used for two broad classes of application:
>
> ** Building real-time streaming data pipelines that reliably get data
> between
> systems or applications.
>
> ** Building real-time streaming applications that transform or react
> to the streams
> of data.
>
>
> Apache Kafka is in use at large and small companies worldwide, including
> Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank,
> Target, The New York Times, Uber, Yelp, and Zalando, among others.
>
> A big thank you to the following 29 contributors to this release!
>
> Ismael Juma, Rajini Sivaram, Matthias J. Sax, Guozhang Wang, Anna Povzner,
> tedyu, Jagadesh Adireddi, John Roesler, Manikumar Reddy O, Randall Hauch,
> Attila Sasvari, Chia-Ping Tsai, Colin Patrick McCabe, Dhruvil Shah, Fedor
> Bobin, Gitomain, Gunnar Morling, Jarek Rudzinski, Jason Gustafson, Jun Rao,
> Mickael Maison, Robert Yokota, Vahid Hashemian, Valentino Proietti, fredfp,
> huxi, maytals, ro7m, yaphet
>
> We welcome your help and feedback. For more information on how to report
> problems, and to get involved, visit the project website at
> http://kafka.apache.org/
>
> Thank you!
>
>
> Regards,
> Dong
>


Re: [ANNOUNCE] Apache Kafka 1.1.1 Released

2018-07-19 Thread Dong Lin
On Thu, Jul 19, 2018 at 4:53 PM, Dong Lin  wrote:

> The Apache Kafka community is pleased to announce the release for Apache
> Kafka 1.1.1.
>
> This is a bug fix release and it includes fixes and improvements from 43
> JIRAs, including a few critical bugs.
>
> All of the changes in this release can be found in the release notes:
>
> https://dist.apache.org/repos/dist/release/kafka/1.1.1/RELEASE_NOTES.html
>
> You can download the source release from:
>
> https://www.apache.org/dyn/closer.cgi?path=/kafka/1.1.1/kafka-1.1.1-src.tgz
>
> and binary releases from:
>
> https://www.apache.org/dyn/closer.cgi?path=/kafka/1.1.1/kafka_2.11-1.1.1.tgz
> (Scala 2.11)
>
> https://www.apache.org/dyn/closer.cgi?path=/kafka/1.1.1/kafka_2.12-1.1.1.tgz
> (Scala 2.12)
>
>
> 
> ---
>
>
> Apache Kafka is a distributed streaming platform with four core APIs:
>
> ** The Producer API allows an application to publish a stream of records to one
> or more Kafka topics.
>
> ** The Consumer API allows an application to subscribe to one or more topics
> and process the stream of records produced to them.
>
> ** The Streams API allows an application to act as a stream processor, 
> consuming
> an input stream from one or more topics and producing an output stream to
> one or more output topics, effectively transforming the input streams to
> output streams.
>
> ** The Connector API allows building and running reusable producers or 
> consumers
> that connect Kafka topics to existing applications or data systems. For
> example, a connector to a relational database might capture every change
> to a table.
>
>
> With these APIs, Kafka can be used for two broad classes of application:
>
> ** Building real-time streaming data pipelines that reliably get data between
> systems or applications.
>
> ** Building real-time streaming applications that transform or react to
> the streams of data.
>
>
> Apache Kafka is in use at large and small companies worldwide, including
> Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank,
> Target, The New York Times, Uber, Yelp, and Zalando, among others.
>
> A big thank you to the following 29 contributors to this release!
>
> Ismael Juma, Rajini Sivaram, Matthias J. Sax, Guozhang Wang, Anna Povzner,
> tedyu, Jagadesh Adireddi, John Roesler, Manikumar Reddy O, Randall Hauch,
> Attila Sasvari, Chia-Ping Tsai, Colin Patrick McCabe, Dhruvil Shah, Fedor
> Bobin, Gitomain, Gunnar Morling, Jarek Rudzinski, Jason Gustafson, Jun Rao,
> Mickael Maison, Robert Yokota, Vahid Hashemian, Valentino Proietti, fredfp,
> huxi, maytals, ro7m, yaphet
>
> We welcome your help and feedback. For more information on how to report
> problems, and to get involved, visit the project website at
> http://kafka.apache.org/
>
> Thank you!
>
>
> Regards,
> Dong
>


[ANNOUNCE] Apache Kafka 1.1.1 Released

2018-07-19 Thread Dong Lin
The Apache Kafka community is pleased to announce the release for Apache
Kafka 1.1.1.

This is a bug fix release and it includes fixes and improvements from 43
JIRAs, including a few critical bugs.

All of the changes in this release can be found in the release notes:

https://dist.apache.org/repos/dist/release/kafka/1.1.1/RELEASE_NOTES.html

You can download the source release from:

https://www.apache.org/dyn/closer.cgi?path=/kafka/1.1.1/kafka-1.1.1-src.tgz

and binary releases from:

https://www.apache.org/dyn/closer.cgi?path=/kafka/1.1.1/kafka_2.11-1.1.1.tgz
(Scala 2.11)

https://www.apache.org/dyn/closer.cgi?path=/kafka/1.1.1/kafka_2.12-1.1.1.tgz
(Scala 2.12)



---


Apache Kafka is a distributed streaming platform with four core APIs:

** The Producer API allows an application to publish a stream of records to one
or more Kafka topics.

** The Consumer API allows an application to subscribe to one or more topics
and process the stream of records produced to them.

** The Streams API allows an application to act as a stream processor,
consuming
an input stream from one or more topics and producing an output stream to
one or more output topics, effectively transforming the input streams to
output streams.

** The Connector API allows building and running reusable producers or
consumers
that connect Kafka topics to existing applications or data systems. For
example, a connector to a relational database might capture every change to
a table.


With these APIs, Kafka can be used for two broad classes of application:

** Building real-time streaming data pipelines that reliably get data between
systems or applications.

** Building real-time streaming applications that transform or react
to the streams
of data.


Apache Kafka is in use at large and small companies worldwide, including
Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank,
Target, The New York Times, Uber, Yelp, and Zalando, among others.

A big thank you to the following 29 contributors to this release!

Ismael Juma, Rajini Sivaram, Matthias J. Sax, Guozhang Wang, Anna Povzner,
tedyu, Jagadesh Adireddi, John Roesler, Manikumar Reddy O, Randall Hauch,
Attila Sasvari, Chia-Ping Tsai, Colin Patrick McCabe, Dhruvil Shah, Fedor
Bobin, Gitomain, Gunnar Morling, Jarek Rudzinski, Jason Gustafson, Jun Rao,
Mickael Maison, Robert Yokota, Vahid Hashemian, Valentino Proietti, fredfp,
huxi, maytals, ro7m, yaphet

We welcome your help and feedback. For more information on how to report
problems, and to get involved, visit the project website at
http://kafka.apache.org/

Thank you!


Regards,
Dong


Measure latency from Source to Sink

2018-07-19 Thread antonio saldivar
Hello

I am developing an application using Kafka and Flink, and I need to be able to
measure the latency from the producer to the point where a record comes out of
the sink.

I can append a timestamp in milliseconds when I send the transaction from the
producer, but how do I capture the timestamp when it comes out of the sink?
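Roughly, what I have on the producer side, and where I would like to read the
latency at the very end, looks like this (a rough sketch with illustrative topic
names and broker address, not my actual code):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class LatencyStamper {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            long sendTime = System.currentTimeMillis();
            // Carry the send time in the record timestamp (CreateTime) instead of the payload.
            producer.send(new ProducerRecord<>("trxn-topic", null, sendTime, "txn-1", "payload"));
        }
        // At the very end of the pipeline (e.g. a consumer reading the sink's output topic):
        //   long latencyMs = System.currentTimeMillis() - record.timestamp();
        // This measures wall-clock latency and assumes reasonably synchronized clocks.
    }
}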

Can someone help me with an example?

Thank you
Best Regards


Measure Latency from Source to Sink

2018-07-19 Thread antonio saldivar
Hello

I am developing an application using Kafka and Flink, and I need to be able to
measure the latency from the producer to the point where a record comes out of
the sink.

I can append a timestamp in milliseconds when I send the transaction from the
producer, but how do I capture the timestamp when it comes out of the sink?

Can someone help me with an example?

Thank you
Best Regards


Re: Use Kafka Streams for windowing data and processing each window at once

2018-07-19 Thread Bill Bejeck
Hi Bruno,

What you are asking is a common request.  There is a KIP in the works,
https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables,
that should suit the requirements you've outlined.
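Roughly, the API proposed there would let you suppress the intermediate
per-window updates and emit each window only once, when it closes. Sketched
against the code you posted (the KIP is still under discussion, so names and
details may change, and this is not available in a released version yet):

windowedMessages
    .aggregate(
        () -> new LinkedList<>(), new MyAggregator<>(),
        Materialized.with(new MessageKeySerde(), new MessageListSerde()))
    // Proposed in KIP-328: hold back updates until the window closes.
    .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
    .toStream()
    .foreach((windowedKey, messagesInWindow) ->
        // messagesInWindow is the final, complete content of that window
        buildReport(windowedKey.window(), messagesInWindow));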

In the meantime, I'll see if I can come up with an alternative approach
over the next few days.

-Bill

On Thu, Jul 19, 2018 at 12:07 PM Bruno Bottazzini <
bruno.bottazz...@targatelematics.com> wrote:

> Hello,
>
> We have a question about how Kafka Streams works, or at least we are
> having some trouble making it work.
>
> The purpose we want to achieve is to group the messages that we receive
> from a Kafka topic by user and window them in order to aggregate the
> messages we receive in the window (5 minutes). Then, I'd like to
> collect all aggregates in each window in order to process them at once
> adding them to a report of all the messages I received in the 5 minutes
> interval.
>
> The last point seems to be the tough part as Kafka Streams doesn't seem
> to provide (at least we can't find it :() anything that can collect all
> the window related stuff in a "finite" stream to be processed in one
> place.
>
> The file (implemented_code.txt) contains the code we have implemented;
> it includes at least one of our attempts to make this work.
>
> You can find its result inside the file (result.txt)
>
> For each window there are many log lines and they are mixed with the
> other windows.
>
> What I'd like to have is something like:
>
> // Hypothetical implementation
> windowedMessages.streamWindows((interval, window) -> process(interval,
> window));
>
> where method process would be something like:
>
> // Hypothetical implementation
> void process(Interval interval, WindowStream<UserId, List<Message>> windowStream) {
> // Create report for the whole window
> Report report = new Report(nameFromInterval());
> // Loop on the finite iterable that represents the window content
> for (WindowStreamEntry<UserId, List<Message>> entry : windowStream) {
> report.addLine(entry.getKey(), entry.getValue());
> }
> report.close();
> }
>
>


Re: Monitor the state of an event-based choreography

2018-07-19 Thread Christopher Smith
This sounds like a pretty good use case for opentracing.org. There is some
nice integration with Kafka now too.
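For example, the client-side piece can be wired in through Kafka's interceptor
mechanism, something like the sketch below. The interceptor class name is from
the opentracing-contrib Kafka instrumentation as I remember it, so treat it as
an assumption and double-check the artifact you pull in:

Properties props = new Properties();
// ... the usual producer configuration ...
// Assumed class name from the opentracing-contrib Kafka client instrumentation:
props.put(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG,
          "io.opentracing.contrib.kafka.TracingProducerInterceptor");
// A tracer registered via GlobalTracer is then picked up by the interceptor,
// so each produced record gets a span without touching the application code.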

--Chris

On Thu, Jul 19, 2018, 8:26 AM Dan Rosanova 
wrote:

> Are you looking more for an Actor or orchestration layer and visibility? I
> don’t know of one per se, but I would also be interested.
>
> Dan
>
>
> 
> From: Jonathan Roy 
> Sent: Thursday, July 19, 2018 3:22 AM
> To: users@kafka.apache.org
> Subject: Monitor the state of an event-based choreography
>
> Hello,
>
> We are currently studying event-driven architectures with Kafka, and one
> thing that scares me a bit is that we have no way to know at any moment
> what is the state of a business transaction that spans multiple services.
> Let’s consider the following simple flow:
> 1. the ‘orders’ service creates a ‘new order’ event.
> 2. the ‘validation’ service validates the order and publishes an ‘order
> validated’ or ‘order rejected’.
> 3. the ‘email’ service sends a confirmation email to the customer.
>
> This does not directly concern Kafka, but are there tools that help
> visually track the status of any particular order, and show which part
> of the flow remains to be done? I know Confluent Control Center allows
> monitoring of the Kafka topics, but I’m not aware of any tool for
> visualizing the business flow of events.
>
> Thanks!
>
> Jonathan Roy
>


Use Kafka Streams for windowing data and processing each window at once

2018-07-19 Thread Bruno Bottazzini
Hello,

We have a question about how Kafka Streams works, or at least we are
having some trouble making it work.

The purpose we want to achieve is to group the messages that we receive
from a Kafka topic by user and window them in order to aggregate the
messages we receive in the window (5 minutes). Then, I'd like to
collect all aggregates in each window in order to process them at once
adding them to a report of all the messages I received in the 5 minutes
interval.

The last point seems to be the tough part as Kafka Streams doesn't seem
to provide (at least we can't find it :() anything that can collect all
the window related stuff in a "finite" stream to be processed in one
place.

The file (implemented_code.txt) contains the code we have implemented;
it includes at least one of our attempts to make this work.

You can find its result in the file (result.txt).

For each window there are many log lines and they are mixed with the
other windows.

What I'd like to have is something like:

// Hypothetical implementation
windowedMessages.streamWindows((interval, window) -> process(interval, window));

where method process would be something like:

// Hypothetical implementation
void process(Interval interval, WindowStream<UserId, List<Message>> windowStream) {
    // Create a report for the whole window
    Report report = new Report(nameFromInterval());
    // Loop over the finite iterable that represents the window content
    for (WindowStreamEntry<UserId, List<Message>> entry : windowStream) {
        report.addLine(entry.getKey(), entry.getValue());
    }
    report.close();
}

StreamsBuilder builder = new StreamsBuilder();
KStream<UserId, Message> messages = builder.stream("KAFKA_TOPIC");

TimeWindowedKStream<UserId, Message> windowedMessages =
    messages.groupByKey().windowedBy(TimeWindows.of(SIZE_MS));

KTable<Windowed<UserId>, List<Message>> messagesAggregatedByWindow =
    windowedMessages.aggregate(
        () -> new LinkedList<>(), new MyAggregator<>(),
        Materialized.with(new MessageKeySerde(), new MessageListSerde())
    );

messagesAggregatedByWindow.toStream().foreach((key, value) ->
    log.info("({}), KEY {} MESSAGE {}", value.size(), key, value.toString()));

KafkaStreams streams = new KafkaStreams(builder.build(), config);
streams.start();
KEY [UserId(82770583)@153150276/153150277] Message 
[Message(userId=UserId(82770583),message="a"),Message(userId=UserId(82770583),message="b"),Message(userId=UserId(82770583),message="d")]
KEY [UserId(77082590)@153150276/153150277] Message 
[Message(userId=UserId(77082590),message="g")]
KEY [UserId(85077691)@153150275/153150276] Message 
[Message(userId=UserId(85077691),message="h")]
KEY [UserId(79117307)@153150238/153150239] Message 
[Message(userId=UserId(79117307),message="e")]
KEY [UserId(73176289)@153150276/153150277] Message 
[Message(userId=UserId(73176289),message="r"),Message(userId=UserId(73176289),message="q")]
KEY [UserId(92077080)@153150276/153150277] Message 
[Message(userId=UserId(92077080),message="w")]
KEY [UserId(78530050)@153150276/153150277] Message 
[Message(userId=UserId(78530050),message="t")]
KEY [UserId(64640536)@153150276/153150277] Message 
[Message(userId=UserId(64640536),message="y")]


Re: Monitor the state of an event-based choreography

2018-07-19 Thread Dan Rosanova
Are you looking more for an Actor or orchestration layer and visibility? I
don’t know of one per se, but I would also be interested.

Dan



From: Jonathan Roy 
Sent: Thursday, July 19, 2018 3:22 AM
To: users@kafka.apache.org
Subject: Monitor the state of an event-based choreography

Hello,

We are currently studying event-driven architectures with Kafka, and one thing 
that scares me a bit is that we have no way to know at any moment what is the 
state of a business transaction that spans multiple services. Let’s consider 
the following simple flow:
1. the ‘orders’ service creates a ‘new order’ event.
2. the ‘validation’ service validates the order and publishes an ‘order 
validated’ or ‘order rejected’.
3. the ‘email’ service sends a confirmation email to the customer.

This does not directly concern Kafka, but are there tools that help
visually track the status of any particular order, and show which part of
the flow remains to be done? I know Confluent Control Center allows monitoring
of the Kafka topics, but I’m not aware of any tool for visualizing the
business flow of events.

Thanks!

Jonathan Roy


Re: UNKNOWN_PRODUCER_ID error when running Streams WordCount demo with processing.guarantee set to EXACTLY_ONCE

2018-07-19 Thread Bill Bejeck
Hi

Thanks for reporting this.  Just off the top of my head, I'm thinking it
may have to do with using a console producer, but I'll have to take a
deeper look.

Thanks,
Bill

On Thu, Jul 19, 2018 at 9:59 AM lambdaliu(刘少波) 
wrote:

> Hi,
>
> I tested the Kafka Streams WordCount demo following the steps described in
> http://kafka.apache.org/11/documentation/streams/quickstart, with the
> processing.guarantee property changed to EXACTLY_ONCE.
>
> I am seeing the following WARN messages in the streams demo app logs:
> [2018-07-18 21:08:03,510] WARN The configuration 'admin.retries' was
> supplied but isn't a known config.
> (org.apache.kafka.clients.consumer.ConsumerConfig)
> [2018-07-18 21:11:29,218] WARN [Producer
> clientId=apache-wordcount-2a671de0-d2b7-404f-bfe8-9e8cad5008d4-StreamThread-1-0_0-producer,
> transactionalId=apache-wordcount-0_0] Got error produce response with
> correlation id 15 on topic-partition
> apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0,
> retrying (2147483646 attempts left). Error: UNKNOWN_PRODUCER_ID
> (org.apache.kafka.clients.producer.internals.Sender)
> [2018-07-18 21:15:04,092] WARN [Producer
> clientId=apache-wordcount-2a671de0-d2b7-404f-bfe8-9e8cad5008d4-StreamThread-1-0_0-producer,
> transactionalId=apache-wordcount-0_0] Got error produce response with
> correlation id 21 on topic-partition
> apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0,
> retrying (2147483646 attempts left). Error: UNKNOWN_PRODUCER_ID
> (org.apache.kafka.clients.producer.internals.Sender)
>
> There are also some ERROR message in the broker logs:
> [2018-07-18 21:10:16,463] INFO Updated PartitionLeaderEpoch. New:
> {epoch:0, offset:0}, Current: {epoch:-1, offset:-1} for Partition:
> apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0.
> Cache now contains 0 entries. (kafka.server.epoch.LeaderEpochFileCache)
> [2018-07-18 21:10:16,965] INFO [Log
> partition=apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0,
> dir=/tmp/kafka-logs0] Incrementing log start offset to 5 (kafka.log.Log)
> [2018-07-18 21:10:16,966] INFO Cleared earliest 0 entries from epoch cache
> based on passed offset 5 leaving 1 in EpochFile for partition
> apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0
> (kafka.server.epoch.LeaderEpochFileCache)
> [2018-07-18 21:11:29,217] ERROR [ReplicaManager broker=0] Error processing
> append operation on partition
> apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0
> (kafka.server.ReplicaManager)
> org.apache.kafka.common.errors.UnknownProducerIdException: Found no record
> of producerId=5000 on the broker. It is possible that the last message with
> the producerId=5000 has been removed due to hitting the retention limit.
> [2018-07-18 21:11:29,331] INFO [Log
> partition=apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0,
> dir=/tmp/kafka-logs0] Incrementing log start offset to 9 (kafka.log.Log)
> [2018-07-18 21:11:29,332] INFO Cleared earliest 0 entries from epoch cache
> based on passed offset 9 leaving 1 in EpochFile for partition
> apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0
> (kafka.server.epoch.LeaderEpochFileCache)
> [2018-07-18 21:15:04,091] ERROR [ReplicaManager broker=0] Error processing
> append operation on partition
> apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0
> (kafka.server.ReplicaManager)
> org.apache.kafka.common.errors.UnknownProducerIdException: Found no record
> of producerId=5000 on the broker. It is possible that the last message with
> the producerId=5000 has been removed due to hitting the retention limit.
> [2018-07-18 21:15:04,204] INFO [Log
> partition=apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0,
> dir=/tmp/kafka-logs0] Incrementing log start offset to 13 (kafka.log.Log)
> [2018-07-18 21:15:04,205] INFO Cleared earliest 0 entries from epoch cache
> based on passed offset 13 leaving 1 in EpochFile for partition
> apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0
> (kafka.server.epoch.LeaderEpochFileCache)
>
> I found the outputs of the WordCount app are correct. But each time I send
> a line to streams-wordcount-input, the Streams app throws a new
> UNKNOWN_PRODUCER_ID error, and the broker also throws a new
> UnknownProducerIdException.
> The broker version I use is 1.1.0. Has anyone encountered this problem
> before, or can anyone give me any hints about what might be causing this behaviour?
>
> Thanks,
> lambdaliu
>


UNKNOWN_PRODUCER_ID error when running Streams WordCount demo with processing.guarantee set to EXACTLY_ONCE

2018-07-19 Thread 刘少波
Hi,

I tested the Kafka Streams WordCount demo following the steps described in
http://kafka.apache.org/11/documentation/streams/quickstart, with the
processing.guarantee property changed to EXACTLY_ONCE.
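Concretely, the only change relative to the quickstart was enabling exactly-once
processing, roughly like this (a sketch of how the property is set via
StreamsConfig; the broker address is a placeholder):

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "apache-wordcount");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
// Enable exactly-once processing for the Streams application.
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);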

I am seeing the following WARN messages in the streams demo app logs:
[2018-07-18 21:08:03,510] WARN The configuration 'admin.retries' was supplied 
but isn't a known config. (org.apache.kafka.clients.consumer.ConsumerConfig)
[2018-07-18 21:11:29,218] WARN [Producer 
clientId=apache-wordcount-2a671de0-d2b7-404f-bfe8-9e8cad5008d4-StreamThread-1-0_0-producer,
 transactionalId=apache-wordcount-0_0] Got error produce response with 
correlation id 15 on topic-partition 
apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0, 
retrying (2147483646 attempts left). Error: UNKNOWN_PRODUCER_ID 
(org.apache.kafka.clients.producer.internals.Sender)
[2018-07-18 21:15:04,092] WARN [Producer 
clientId=apache-wordcount-2a671de0-d2b7-404f-bfe8-9e8cad5008d4-StreamThread-1-0_0-producer,
 transactionalId=apache-wordcount-0_0] Got error produce response with 
correlation id 21 on topic-partition 
apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0, 
retrying (2147483646 attempts left). Error: UNKNOWN_PRODUCER_ID 
(org.apache.kafka.clients.producer.internals.Sender)

There are also some ERROR message in the broker logs:
[2018-07-18 21:10:16,463] INFO Updated PartitionLeaderEpoch. New: {epoch:0, 
offset:0}, Current: {epoch:-1, offset:-1} for Partition: 
apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0. Cache 
now contains 0 entries. (kafka.server.epoch.LeaderEpochFileCache)
[2018-07-18 21:10:16,965] INFO [Log 
partition=apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0,
 dir=/tmp/kafka-logs0] Incrementing log start offset to 5 (kafka.log.Log)
[2018-07-18 21:10:16,966] INFO Cleared earliest 0 entries from epoch cache 
based on passed offset 5 leaving 1 in EpochFile for partition 
apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0 
(kafka.server.epoch.LeaderEpochFileCache)
[2018-07-18 21:11:29,217] ERROR [ReplicaManager broker=0] Error processing 
append operation on partition 
apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0 
(kafka.server.ReplicaManager)
org.apache.kafka.common.errors.UnknownProducerIdException: Found no record of 
producerId=5000 on the broker. It is possible that the last message with the 
producerId=5000 has been removed due to hitting the retention limit.
[2018-07-18 21:11:29,331] INFO [Log 
partition=apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0,
 dir=/tmp/kafka-logs0] Incrementing log start offset to 9 (kafka.log.Log)
[2018-07-18 21:11:29,332] INFO Cleared earliest 0 entries from epoch cache 
based on passed offset 9 leaving 1 in EpochFile for partition 
apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0 
(kafka.server.epoch.LeaderEpochFileCache)
[2018-07-18 21:15:04,091] ERROR [ReplicaManager broker=0] Error processing 
append operation on partition 
apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0 
(kafka.server.ReplicaManager)
org.apache.kafka.common.errors.UnknownProducerIdException: Found no record of 
producerId=5000 on the broker. It is possible that the last message with the 
producerId=5000 has been removed due to hitting the retention limit.
[2018-07-18 21:15:04,204] INFO [Log 
partition=apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0,
 dir=/tmp/kafka-logs0] Incrementing log start offset to 13 (kafka.log.Log)
[2018-07-18 21:15:04,205] INFO Cleared earliest 0 entries from epoch cache 
based on passed offset 13 leaving 1 in EpochFile for partition 
apache-wordcount-KSTREAM-AGGREGATE-STATE-STORE-03-repartition-0 
(kafka.server.epoch.LeaderEpochFileCache)

I found the outputs of the WordCount app are correct. But each time I send a
line to streams-wordcount-input, the Streams app throws a new
UNKNOWN_PRODUCER_ID error, and the broker also throws a new
UnknownProducerIdException.
The broker version I use is 1.1.0. Has anyone encountered this problem before,
or can anyone give me any hints about what might be causing this behaviour?

Thanks,
lambdaliu


Re: Problem consuming from broker 1.1.0

2018-07-19 Thread SenthilKumar K
Hi Craig Ching,

Regarding: "We did end up turning on debug logs for the console consumer and
found that one broker seemed to be having problems, it would lead to
timeouts communicating with it. After restarting that broker, things
sorted themselves out."

We had a similar problem on our prod cluster, and I'm trying to figure out the
root cause of why the broker stopped responding. Please see my email with the
subject "Kafka Broker Not Responding", where I described the problem in detail.

Curious to know: were you able to figure out the reason for the broker failure?
Of course, turning it off and on again is not the ideal solution.

--Senthil


On Thu, Jun 14, 2018 at 9:48 AM, Craig Ching  wrote:

> Hi Manikumar!
>
> Thanks for responding!  Sorry it took me so long to get back!
>
> We did end up turning on debug logs for the console consumer and found
> that one broker seemed to be having problems, it would lead to timeouts
> communicating with it.  After restarting that broker, things sorted
> themselves out. However, I always hate the “turn off/ turn on” solution ;)
> It’s interesting to me that the 1.1.0 consumer, though it reported timeouts
> in the logs, never had a problem, it seemed able to recover.  Whereas the
> 1.0.1 consumer (talking to a 1.1.0 cluster remember) couldn’t recover.
> Does any of this make sense?  I’m happy to provide more details and logs if
> necessary as I’d like to understand the root problem here.
>
> Thanks again!
>
> Cheers,
> Craig
>
> > On Jun 13, 2018, at 12:43 AM, Manikumar 
> wrote:
> >
> > Can you post consumer debug logs?
> > You can enable console consumer debug logs here:
> > kafka/config/tools-log4j.properties
> >
> > On Wed, Jun 13, 2018 at 9:55 AM Craig Ching 
> wrote:
> >
> >> Hi!
> >>
> >> We’re having a problem with a new Kafka cluster at 1.1.0. The problem is,
> >> in general, that consumers can’t consume from the new broker (the old
> >> broker was 0.11, I think). The easiest recipe I have for reproducing the
> >> problem is that a console consumer from the Kafka 1.0.1 download can’t
> >> consume from the 1.1.0 cluster, while a 1.1.0 console consumer can. We’re
> >> invoking the console consumer like this:
> >>
> >> bin/kafka-console-consumer.sh \
> >>   --bootstrap-server [kafka server] \
> >>   --topic [our topic] \
> >>   --max-messages 3
> >>
> >> That works for a 1.1.0 console consumer, but not for 1.0.1.  However, if
> >> we change that to:
> >>
> >> bin/kafka-console-consumer.sh \
> >>   --zookeeper [zookeeper server] \
> >>   --topic [our topic] \
> >>   --max-messages 3
> >>
> >> Then it works for 1.0.1.
> >>
> >> I was wondering, is the ZooKeeper schema published for 1.1.0? I have a
> >> feeling that maybe something is wrong in ZooKeeper, and I know earlier
> >> versions of Kafka used to publish the ZK schema. Could there be a problem
> >> in ZK? If so, what might I look for?
> >>
> >> Any help is greatly appreciated!
> >>
> >> Cheers,
> >> Craig
>
>


Re: [VOTE] 2.0.0 RC2

2018-07-19 Thread Rajini Sivaram
Hi all,

We found a blocker in 2.0.0 which is a bug in the newly added OAuth
protocol implementation (https://issues.apache.org/jira/browse/KAFKA-7182).
Since the current implementation doesn't conform to the SASL/OAUTHBEARER
spec in RFC-7628, we need to fix this before the release to conform to the
spec and avoid compatibility issues later. A fix is currently being
reviewed and I will try and create RC3 later today.

Many thanks to everyone who tested and voted for RC2. Please help test RC3
if you have time.

Regards,

Rajini

On Wed, Jul 18, 2018 at 4:03 PM, Guozhang Wang  wrote:

> +1. Verified the following:
>
> - javadocs
> - web docs
> - maven staging repository
>
> Besides what Ismael mentioned on the upgrade guide, some of the latest doc
> fixes in 2.0 seem not to be reflected in
> http://kafka.apache.org/20/documentation.html yet (this does not need a new
> RC, we can just re-copy-and-paste to kafka-site again).
>
>
> Thanks Rajini!
>
>
> Guozhang
>
>
>
> On Wed, Jul 18, 2018 at 7:48 AM, Ismael Juma  wrote:
>
> > Thanks Rajini! A documentation issue that we must fix before the release
> > (but that does not require another RC): 1.2 (which became 2.0) is mentioned
> > in the upgrade notes:
> >
> > http://kafka.apache.org/20/documentation.html#upgrade
> >
> > Ismael
> >
> > On Sun, Jul 15, 2018 at 9:25 AM Rajini Sivaram 
> > wrote:
> >
> > > Hi Ismael,
> > >
> > > Thank you for pointing that out. I have re-uploaded the RC2 artifacts
> to
> > > maven including streams-scala_2.12. Also submitted a PR to update
> build &
> > > release scripts to include this.
> > >
> > > Thank you,
> > >
> > > Rajini
> > >
> > >
> > >
> > > On Fri, Jul 13, 2018 at 7:19 AM, Ismael Juma 
> wrote:
> > >
> > > > Hi Rajini,
> > > >
> > > > Thanks for generating the RC. It seems like the kafka-streams-scala
> > 2.12
> > > > artifact is missing from the Maven repository:
> > > >
> > > > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > > >
> > > > Since this is the first time we are publishing this artifact, it is
> > > > possible that this never worked properly.
> > > >
> > > > Ismael
> > > >
> > > > On Tue, Jul 10, 2018 at 10:17 AM Rajini Sivaram <
> > rajinisiva...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Hello Kafka users, developers and client-developers,
> > > > >
> > > > >
> > > > > This is the third candidate for release of Apache Kafka 2.0.0.
> > > > >
> > > > >
> > > > > This is a major version release of Apache Kafka. It includes 40 new
> > > KIPs
> > > > > and
> > > > >
> > > > > several critical bug fixes. Please see the 2.0.0 release plan for
> > more
> > > > > details:
> > > > >
> > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80448820
> > > > >
> > > > >
> > > > > A few notable highlights:
> > > > >
> > > > >- Prefixed wildcard ACLs (KIP-290), Fine grained ACLs for
> > > CreateTopics
> > > > >(KIP-277)
> > > > >- SASL/OAUTHBEARER implementation (KIP-255)
> > > > >- Improved quota communication and customization of quotas
> > (KIP-219,
> > > > >KIP-257)
> > > > >- Efficient memory usage for down conversion (KIP-283)
> > > > >- Fix log divergence between leader and follower during fast
> > leader
> > > > >failover (KIP-279)
> > > > >- Drop support for Java 7 and remove deprecated code including
> old
> > > > scala
> > > > >clients
> > > > >- Connect REST extension plugin, support for externalizing
> secrets
> > > and
> > > > >improved error handling (KIP-285, KIP-297, KIP-298 etc.)
> > > > >- Scala API for Kafka Streams and other Streams API improvements
> > > > >(KIP-270, KIP-150, KIP-245, KIP-251 etc.)
> > > > >
> > > > >
> > > > > Release notes for the 2.0.0 release:
> > > > >
> > > > > http://home.apache.org/~rsivaram/kafka-2.0.0-rc2/RELEASE_NOTES.html
> > > > >
> > > > >
> > > > > *** Please download, test and vote by Friday, July 13, 4pm PT
> > > > >
> > > > >
> > > > > Kafka's KEYS file containing PGP keys we use to sign the release:
> > > > >
> > > > > http://kafka.apache.org/KEYS
> > > > >
> > > > >
> > > > > * Release artifacts to be voted upon (source and binary):
> > > > >
> > > > > http://home.apache.org/~rsivaram/kafka-2.0.0-rc2/
> > > > >
> > > > >
> > > > > * Maven artifacts to be voted upon:
> > > > >
> > > > > https://repository.apache.org/content/groups/staging/
> > > > >
> > > > >
> > > > > * Javadoc:
> > > > >
> > > > > http://home.apache.org/~rsivaram/kafka-2.0.0-rc2/javadoc/
> > > > >
> > > > >
> > > > > * Tag to be voted upon (off 2.0 branch) is the 2.0.0 tag:
> > > > >
> > > > > https://github.com/apache/kafka/tree/2.0.0-rc2
> > > > >
> > > > >
> > > > >
> > > > > * Documentation:
> > > > >
> > > > > http://kafka.apache.org/20/documentation.html
> > > > >
> > > > >
> > > > > * Protocol:
> > > > >
> > > > > http://kafka.apache.org/20/protocol.html
> > > > >
> > > > >
> > > > > * Successful Jenkins builds for the 2.0 branch:
> > > > >
> > > > > Unit/integration tests:

org.apache.kafka.common.errors.TimeoutException : xxx passed since batch creation plus linger time

2018-07-19 Thread igor . b
Hi
We are using Kafka 0.10.2 and recently got the following error:
org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for 
taz_data_info-0: 270120 ms has passed since batch creation plus linger time

The producer configuration is as follows:
 acks: all
  auto.complete: false
  bootstrap.servers: ...
  key.serializer: org.apache.kafka.common.serialization.StringSerializer
  retries: 2147483647
  type: Producer
  value.serializer: ...
Other configurations are defaults
We've tried to understand where these 270 seconds are coming from and failed to
find anything related.
According to
https://cwiki.apache.org/confluence/display/KAFKA/KIP-91+Provide+Intuitive+User+Timeouts+in+The+Producer
it might happen in the "Await send" step, but I haven't found anything in the
broker logs.
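For reference, these are the producer settings that, as far as we understand,
feed into this batch-expiry path in 0.10.x; a rough sketch with illustrative
values (our actual values are the defaults, apart from what is listed above):

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);
// linger.ms is added to the batch age before the expiry check
props.put(ProducerConfig.LINGER_MS_CONFIG, 0);
// batches waiting in the accumulator longer than this (plus linger) can be expired
props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 30000);
// bounds how long send() itself can block on metadata/buffer, separate from batch expiry
props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 60000);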

This is a very low-volume topic with only 1 partition and RF 3.

Any ideas will be highly appreciated
Igor.


Kafka Broker Not Responding

2018-07-19 Thread SenthilKumar K
Hello Kafka Experts,

We are currently facing an issue on our 3-node Kafka cluster: one of the
brokers is not responding to any queries. I've checked the logs but found
nothing related to this problem.


Kafka Version: 1.1.0

Server.conf:
## Timeout properties to check prod outage
replica.fetch.wait.max.ms = 1000
replica.socket.timeout.ms = 6
request.timeout.ms = 6
zookeeper.connection.timeout.ms = 1
zookeeper.session.timeout.ms = 1
controller.socket.timeout.ms = 6
group.max.session.timeout.ms = 40
group.min.session.timeout.ms = 7000
##
socket.send.buffer.bytes = 102400
reserved.broker.max.id = 2147483647
num.partitions = 30
ssl.secure.random.implementation = SHA1PRNG
ssl.key.password = ***
log.cleaner.delete.retention.ms = 18
log.retention.ms = 60
listeners = SSL://IP1:9093
broker.id = 1
socket.receive.buffer.bytes = 102400
message.max.bytes = 67108864
ssl.truststore.password = ***
ssl.enabled.protocols = TLSv1.2
auto.create.topics.enable = true
log.roll.ms = 360
auto.leader.rebalance.enable = true
ssl.keystore.location = /tmp/keyStore.jks
zookeeper.connect = IP1:2181/kafka.cluster1
log.retention.check.interval.ms = 30
replica.fetch.max.bytes = 67108864
socket.request.max.bytes = 104857600
default.replication.factor = 2
offsets.topic.replication.factor = 2
log.dirs = /kafka_logs/,/kafka_logs/
ssl.keystore.password = 
min.insync.replicas = 2
security.inter.broker.protocol = SSL
compression.codec = 3
ssl.truststore.location = /tmp/trustStore.jks


*ERROR [Consumer clientId=consumer-1, groupId=console-consumer-21688]
Offset commit failed on partition logs-28 at offset 0: The request timed
out. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)*


Partition logs-28 actually resides on the broker which is not
responding. If I restart the broker, it will start responding, but I would
like to figure out the exact cause of why one broker out of three is failing.

WARN [ReplicaFetcher replicaId=72804, leaderId=72802, fetcherId=0] Error in
response for fetch request (type=FetchRequest, replicaId=72804,
maxWait=1000, minBytes=1, maxBytes=10485760, fetchData={logs-22=(offset=0,
tartOffset=0, maxBytes=67108864), logs-23=(offset=48199929,
tartOffset=48199929, maxBytes=67108864), logs-5=(offset=0, tartOffset=0,
maxBytes=67108864), logs-17=(offset=48225630, tartOffset=48225630,
maxBytes=67108864), logs-24=(offset=48913184, tartOffset=48913184,
maxBytes=67108864), logs-5=(offset=48256727, tartOffset=48256727,
maxBytes=67108864), logs-4=(offset=0, tartOffset=0, maxBytes=67108864),
logs-17=(offset=0, tartOffset=0, maxBytes=67108864),
logs-6=(offset=48916295, tartOffset=48916295, maxBytes=67108864),
logs-17=(offset=48210909, tartOffset=48210909, maxBytes=67108864),
logs-29=(offset=48193290, tartOffset=48193290, maxBytes=67108864),
logs-16=(offset=0, tartOffset=0, maxBytes=67108864),
__consumer_offsets-11=(offset=499512, tartOffset=0, maxBytes=67108864),
__consumer_offsets-41=(offset=0, tartOffset=0, maxBytes=67108864),
logs-11=(offset=50644515, tartOffset=50644515, maxBytes=67108864),
logs-18=(offset=48881988, tartOffset=48881988, maxBytes=67108864),
logs-28=(offset=0, tartOffset=0, maxBytes=67108864),
logs-29=(offset=50603359, tartOffset=50603359, maxBytes=67108864),
logs-0=(offset=51682300, tartOffset=51682300, maxBytes=67108864),
logs-11=(offset=50638782, tartOffset=50638782, maxBytes=67108864),
logs-12=(offset=51674247, tartOffset=51674247, maxBytes=67108864),
logs-23=(offset=50625272, tartOffset=50625272, maxBytes=67108864),
__consumer_offsets-5=(offset=0, tartOffset=0, maxBytes=67108864),
__consumer_offsets-35=(offset=0, tartOffset=0, maxBytes=67108864),
logs-5=(offset=50676068, tartOffset=50676068, maxBytes=67108864),
logs-23=(offset=0, tartOffset=0, maxBytes=67108864)},
isolationLevel=READ_UNCOMMITTED, toForget=, metadata=(sessionId=26393661,
epoch=INITIAL)) (kafka.server.ReplicaFetcherThread)
*java.net.SocketTimeoutException: Failed to connect within 6 ms*


Please advise.


--Senthil


Monitor the state of an event-based choreography

2018-07-19 Thread Jonathan Roy
Hello,

We are currently studying event-driven architectures with Kafka, and one thing 
that scares me a bit is that we have no way to know at any moment what is the 
state of a business transaction that spans multiple services. Let’s consider 
the following simple flow:
1. the ‘orders’ service creates a ‘new order’ event.
2. the ‘validation’ service validates the order and publishes an ‘order 
validated’ or ‘order rejected’.
3. the ‘email’ service sends a confirmation email to the customer.

This does not directly concern Kafka, but are there tools that help
visually track the status of any particular order, and show which part of
the flow remains to be done? I know Confluent Control Center allows monitoring
of the Kafka topics, but I’m not aware of any tool for visualizing the
business flow of events.

Thanks!

Jonathan Roy