how to enhance Kafka streaming Consumer rate ?

2018-02-08 Thread ? ?
Hi:
I used kafka streaming for real time analysis.
and I put stream_thread_num same with partitions of topic
and set ConsumerConfig.max_poll_records =500
I use foreach  method only in this
but find with large records in kafka. the cosumer LAG is big some times and 
trigger kafka topic rebalance.
How to how to enhance Kafka streaming Consumer rate ?

funk...@live.com


are offsets per consumer or per consumer group?

2018-02-08 Thread Xavier Noria
Let's suppose a topic has three partitions and two consumer groups
listening.

The offset maintained by Kafka in each partition is associated with the
consumer group? Or with the individual consumer polling from that partition
in each consumer group respectively?

I am trying to understand the system behavior when listeners crash, but in
order to formulate more questions I need to double-check that before.


Re: are offsets per consumer or per consumer group?

2018-02-08 Thread Luke Steensen
Offsets are maintained per consumer group. When an individual consumer
crashes, the consumer group coordinator will detect that failure and
trigger a rebalance. This redistributes the partitions being consumed
across the available consumer processes, using the most recently committed
offset for each as the starting point.


On Thu, Feb 8, 2018 at 6:58 AM, Xavier Noria  wrote:

> Let's suppose a topic has three partitions and two consumer groups
> listening.
>
> The offset maintained by Kafka in each partition is associated with the
> consumer group? Or with the individual consumer polling from that partition
> in each consumer group respectively?
>
> I am trying to understand the system behavior when listeners crash, but in
> order to formulate more questions I need to double-check that before.
>


Re: are offsets per consumer or per consumer group?

2018-02-08 Thread Xavier Noria
On Thu, Feb 8, 2018 at 4:27 PM, Luke Steensen <
luke.steen...@braintreepayments.com> wrote:

Offsets are maintained per consumer group. When an individual consumer
> crashes, the consumer group coordinator will detect that failure and
> trigger a rebalance. This redistributes the partitions being consumed
> across the available consumer processes, using the most recently committed
> offset for each as the starting point.
>

Excellent, the getting started guide uses "consumer" sometimes meaning an
individual consumer, and sometimes meaning a consumer group. That
difficults a bit understanding how it works with exactitude. Thanks for
clarifying.

Let me followup with these questions then:

1) The group coordinator runs in Kafka? Or is the client library
responsible for that?

2) Say that a consumer group has consumers A, B and C, assigned to the 3
partitions resectively. Consumer A polls and gets messages 75-80, but when
it is processing message 77 crashes. The coordinator rebalances and assigns
that partition to some of the other two, but at which offset is that
partition left?

3) If the answer is 81, a critical consumer group that cannot miss messages
is expected to write custom coordination code to avoid missing 77-80? If
yes, are there best practices out there for doing this?


Re: are offsets per consumer or per consumer group?

2018-02-08 Thread Luke Steensen
1) In more recent versions of Kafka, the consumer group coordinator runs on
the broker. Previously, there was a "high level consumer" that spoke
directly to zookeeper and did group management within the client libraries,
but this is no longer used.

2) That depends on when your consumer commits offsets. The normal case is
to commit the offset for a message after that message has been processed.
In that case, the next consumer to be assigned that partition would
reprocess message 77. The other option is to commit offsets as messages are
received but before they are processed. This cause messages to be processed
at most once, instead of at least once.

3) The best way to get at least once processing is to make sure your client
is not configured to automatically commit offsets, and to do so explicitly.
This way you can be sure commits only happen once the result of processing
a message has been durably stored (or whatever needs to happen for your use
case). That doesn't necessarily mean you need to commit immediately after
each individual message, only that when you commit it is only for messages
that have been completely processed.


On Thu, Feb 8, 2018 at 9:37 AM, Xavier Noria  wrote:

> On Thu, Feb 8, 2018 at 4:27 PM, Luke Steensen <
> luke.steen...@braintreepayments.com> wrote:
>
> Offsets are maintained per consumer group. When an individual consumer
> > crashes, the consumer group coordinator will detect that failure and
> > trigger a rebalance. This redistributes the partitions being consumed
> > across the available consumer processes, using the most recently
> committed
> > offset for each as the starting point.
> >
>
> Excellent, the getting started guide uses "consumer" sometimes meaning an
> individual consumer, and sometimes meaning a consumer group. That
> difficults a bit understanding how it works with exactitude. Thanks for
> clarifying.
>
> Let me followup with these questions then:
>
> 1) The group coordinator runs in Kafka? Or is the client library
> responsible for that?
>
> 2) Say that a consumer group has consumers A, B and C, assigned to the 3
> partitions resectively. Consumer A polls and gets messages 75-80, but when
> it is processing message 77 crashes. The coordinator rebalances and assigns
> that partition to some of the other two, but at which offset is that
> partition left?
>
> 3) If the answer is 81, a critical consumer group that cannot miss messages
> is expected to write custom coordination code to avoid missing 77-80? If
> yes, are there best practices out there for doing this?
>


Re: are offsets per consumer or per consumer group?

2018-02-08 Thread Xavier Noria
Thanks very much Luke!


RE: Issues with Kafka Group Coordinator Failover

2018-02-08 Thread Krishnakumar Gurumurthy
Hello Kafka User Community,

In our project(as part of kafka  failover evaluation), we have a single cluster 
with five kafka nodes (five partition), three consumers (attached to single 
group) and single Zookeeper node. As soon as cluster startups, we see leader 
election per partition and each consumers discovers the group co-ordinator. 
Now, when manually shutdown kafka service in the co-ordinator node, entire 
cluster goes down (means no publish/subscribe of messages happening).

As per Kafka wiki  co-ordinator 
design
 and 
client-assignment,
 we see coordinator failover handling as part of kafka cluster. Kindly let us 
know anyone  in the community has encountered this earlier or any known 
solution available.

Best Regards,
Krishnakumar G




ProducerFencedException: Producer attempted an operation with an old epoch.

2018-02-08 Thread dan bress
Hi,

I recently switched my Kafka Streams 1.0.0 app to use exactly_once
semantics and since them my cluster has been stuck in rebalancing.  Is
there an explanation as to what is going on, or how I can resolve it?

I saw a similar issue discussed on the mailing list, but I don't know if a
ticket was created or there was a resolution.

http://mail-archives.apache.org/mod_mbox/kafka-users/201711.mbox/%3CCAKkfnUY0C311Yq%3Drt8kyna4cyucV8HbgWpiYj%3DfnYMt9%2BAb8Mw%40mail.gmail.com%3E

This is the exception I'm seeing:
2018-02-08 17:09:20,763 ERR [kafka-producer-network-thread |
dp-app-devel-dbress-a92a93de-c1b1-4655-b487-0e9f3f3f3409-StreamThread-4-0_414-producer]
Sender [Producer
clientId=dp-app-devel-dbress-a92a93de-c1b1-4655-b487-0e9f3f3f3409-StreamThread-4-0_414-producer,
transactionalId=dp-app-devel-dbress-0_414] Aborting producer batches due to
fatal error
org.apache.kafka.common.errors.ProducerFencedException: Producer attempted
an operation with an old epoch. Either there is a newer producer with the
same transactionalId, or the producer's transaction has been expired by the
broker.
2018-02-08 17:09:20,764 ERR
[dp-app-devel-dbress-a92a93de-c1b1-4655-b487-0e9f3f3f3409-StreamThread-4]
ProcessorStateManager task [0_414] Failed to flush state store
summarykey-to-summary:
org.apache.kafka.common.KafkaException: Cannot perform send because at
least one previous transactional or idempotent request has failed with
errors.
at
org.apache.kafka.clients.producer.internals.TransactionManager.failIfNotReadyForSend(TransactionManager.java:278)
at
org.apache.kafka.clients.producer.internals.TransactionManager.maybeAddPartitionToTransaction(TransactionManager.java:263)
at
org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:804)
at
org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:760)
at
org.apache.kafka.streams.processor.internals.RecordCollectorImpl.send(RecordCollectorImpl.java:100)
at
org.apache.kafka.streams.state.internals.StoreChangeFlushingLogger.flush(StoreChangeFlushingLogger.java:92)
at
org.apache.kafka.streams.state.internals.InMemoryKeyValueFlushingLoggedStore.flush(InMemoryKeyValueFlushingLoggedStore.java:139)
at
org.apache.kafka.streams.state.internals.InnerMeteredKeyValueStore.flush(InnerMeteredKeyValueStore.java:268)
at
org.apache.kafka.streams.state.internals.MeteredKeyValueStore.flush(MeteredKeyValueStore.java:126)
at
org.apache.kafka.streams.processor.internals.ProcessorStateManager.flush(ProcessorStateManager.java:245)
at
org.apache.kafka.streams.processor.internals.AbstractTask.flushState(AbstractTask.java:196)
at
org.apache.kafka.streams.processor.internals.StreamTask.flushState(StreamTask.java:324)
at
org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:304)
at
org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:208)
at
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:299)
at
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:289)
at
org.apache.kafka.streams.processor.internals.AssignedTasks$2.apply(AssignedTasks.java:87)
at
org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:451)
at
org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:380)
at
org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:309)
at
org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1018)
at
org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:835)
at
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
at
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744)


mistake in Kafka the definitive guide

2018-02-08 Thread adrien ruffie
Hello all,


I'm reading Kafka the definitive guide and I suspect that found an error in the 
page 166 figure "Figure 8-5 a fail over causes committed offsets without 
matching records".


In the figure we can't see Topic B ... specified in the box "Group C1, Topic B, 
Parition 0, Offset 6" ... the figure is not correct?


Thus, I don't really understand the figure ...


best regards,


Adrien


Re: mistake in Kafka the definitive guide

2018-02-08 Thread Matthias J. Sax
Books can contain errors...

Check the reported Errata and add a new one if not reported yet:
http://www.oreilly.com/catalog/errata.csp?isbn=0636920044123


-Matthias

On 2/8/18 1:14 PM, adrien ruffie wrote:
> Hello all,
> 
> 
> I'm reading Kafka the definitive guide and I suspect that found an error in 
> the page 166 figure "Figure 8-5 a fail over causes committed offsets without 
> matching records".
> 
> 
> In the figure we can't see Topic B ... specified in the box "Group C1, Topic 
> B, Parition 0, Offset 6" ... the figure is not correct?
> 
> 
> Thus, I don't really understand the figure ...
> 
> 
> best regards,
> 
> 
> Adrien
> 



signature.asc
Description: OpenPGP digital signature


RE: mistake in Kafka the definitive guide

2018-02-08 Thread adrien ruffie
Thank Matthias,

I know books can contain errors, and I'm not really sure about this errors,  
that's why I asked the question 😊


De : Matthias J. Sax 
Envoyé : jeudi 8 février 2018 22:19:59
À : users@kafka.apache.org
Objet : Re: mistake in Kafka the definitive guide

Books can contain errors...

Check the reported Errata and add a new one if not reported yet:
http://www.oreilly.com/catalog/errata.csp?isbn=0636920044123


-Matthias

On 2/8/18 1:14 PM, adrien ruffie wrote:
> Hello all,
>
>
> I'm reading Kafka the definitive guide and I suspect that found an error in 
> the page 166 figure "Figure 8-5 a fail over causes committed offsets without 
> matching records".
>
>
> In the figure we can't see Topic B ... specified in the box "Group C1, Topic 
> B, Parition 0, Offset 6" ... the figure is not correct?
>
>
> Thus, I don't really understand the figure ...
>
>
> best regards,
>
>
> Adrien
>



RE: mistake in Kafka the definitive guide

2018-02-08 Thread adrien ruffie
I take a look the errata page, and I saw that Paolo had already entered the 
errata. But is unconfirmed for the moment


PDF Page 166
Figure 8-5

The labels on the topics/partitions are all the same: "Topic A, Partition 0". I 
think they should be (from top to bottom): "Topic A, Partition 0" "Topic B, 
Partition 0" "Topic __conusmer_offsets"
Paolo Baronti   Dec 01, 2017



De : adrien ruffie 
Envoyé : jeudi 8 février 2018 22:14:53
À : users@kafka.apache.org
Objet : mistake in Kafka the definitive guide

Hello all,


I'm reading Kafka the definitive guide and I suspect that found an error in the 
page 166 figure "Figure 8-5 a fail over causes committed offsets without 
matching records".


In the figure we can't see Topic B ... specified in the box "Group C1, Topic B, 
Parition 0, Offset 6" ... the figure is not correct?


Thus, I don't really understand the figure ...


best regards,


Adrien


Re: mistake in Kafka the definitive guide

2018-02-08 Thread Matthias J. Sax
I did not write the book, but I agree with the reported errata.

-Matthias


On 2/8/18 2:10 PM, adrien ruffie wrote:
> I take a look the errata page, and I saw that Paolo had already entered the 
> errata. But is unconfirmed for the moment
> 
> 
> PDF Page 166
> Figure 8-5
> 
> The labels on the topics/partitions are all the same: "Topic A, Partition 0". 
> I think they should be (from top to bottom): "Topic A, Partition 0" "Topic B, 
> Partition 0" "Topic __conusmer_offsets"
> Paolo Baronti   Dec 01, 2017
> 
> 
> 
> De : adrien ruffie 
> Envoyé : jeudi 8 février 2018 22:14:53
> À : users@kafka.apache.org
> Objet : mistake in Kafka the definitive guide
> 
> Hello all,
> 
> 
> I'm reading Kafka the definitive guide and I suspect that found an error in 
> the page 166 figure "Figure 8-5 a fail over causes committed offsets without 
> matching records".
> 
> 
> In the figure we can't see Topic B ... specified in the box "Group C1, Topic 
> B, Parition 0, Offset 6" ... the figure is not correct?
> 
> 
> Thus, I don't really understand the figure ...
> 
> 
> best regards,
> 
> 
> Adrien
> 



signature.asc
Description: OpenPGP digital signature


How start with Kafka?

2018-02-08 Thread Andy
I am Kafka beginner.
I have compiled under Windows and Visual Studio 2015 librdkafka and
cppkafka .
I have two examples: kafka_consumer.cpp and kafka_producer.cpp.
I try:
>producer -b 127.0.0.1:9092 -t topic
>consumer -b 127.0.0.1:9092 -t topic -g group
but is bad connection


Cancel partition reassignment?

2018-02-08 Thread Dylan Martin
Hi all.


I'm trying to cancel a failed partition reassignment.  I've heard that this can 
be done by deleting /admin/reassign_partitions in zookeeper.  I've tried and 
/admin/reassign_partitions won't go away.


Does anyone know a way to cancel a partition reassignment?


-Dylan


(206) 855-9740 - Home

(206) 235-8809 - Cell

The information contained in this email message, and any attachment thereto, is 
confidential and may not be disclosed without the sender's express permission. 
If you are not the intended recipient or an employee or agent responsible for 
delivering this message to the intended recipient, you are hereby notified that 
you have received this message in error and that any review, dissemination, 
distribution or copying of this message, or any attachment thereto, in whole or 
in part, is strictly prohibited. If you have received this message in error, 
please immediately notify the sender by telephone, fax or email and delete the 
message and all of its attachments. Thank you.


Re: Cancel partition reassignment?

2018-02-08 Thread Ted Yu
Have you seen this thread ?

http://search-hadoop.com/m/Kafka/uyzND1pHiNuYt8hc1?subj=Re+Question+Kafka+Reassign+partitions+tool

On Thu, Feb 8, 2018 at 4:12 PM, Dylan Martin 
wrote:

> Hi all.
>
>
> I'm trying to cancel a failed partition reassignment.  I've heard that
> this can be done by deleting /admin/reassign_partitions in zookeeper.  I've
> tried and /admin/reassign_partitions won't go away.
>
>
> Does anyone know a way to cancel a partition reassignment?
>
>
> -Dylan
>
>
> (206) 855-9740 - Home
>
> (206) 235-8809 - Cell
>
> The information contained in this email message, and any attachment
> thereto, is confidential and may not be disclosed without the sender's
> express permission. If you are not the intended recipient or an employee or
> agent responsible for delivering this message to the intended recipient,
> you are hereby notified that you have received this message in error and
> that any review, dissemination, distribution or copying of this message, or
> any attachment thereto, in whole or in part, is strictly prohibited. If you
> have received this message in error, please immediately notify the sender
> by telephone, fax or email and delete the message and all of its
> attachments. Thank you.
>


Re: Partition rebalancing when both leader and follower is down?

2018-02-08 Thread Devendar Rao
Asking again to find out if there is a setting in KAFKA which can handle
this case automatically.

On Sat, Feb 3, 2018 at 12:45 PM, Devendar Rao 
wrote:

> Hi,
>
> I don't see partition rebalancing happening when both leader and replica
> nodes are down for a given partition. The partition's leader is set to -1;
> but never gets fixed. I manually assign the partition using
> "kafka-preferred-replica-election.sh".  Does kafka not balance the
> partitions automatically for this case?
>
> I've this setting; I'm using kafka-0.10.2 version
> and auto.leader.rebalance.enable is set to true
>
> Error:
> Leader for partition *my-topic* unavailable for fetching offset, wait for
> metadata refresh
>
> Thanks
>


Re: mistake in Kafka the definitive guide

2018-02-08 Thread Ted Yu
I think Paolo's comment is correct.

BTW the paragraph above the figure says

Kafka currently lacks transactions

The above seems to have been written some time ago.

Cheers

On Thu, Feb 8, 2018 at 2:10 PM, adrien ruffie 
wrote:

> I take a look the errata page, and I saw that Paolo had already entered
> the errata. But is unconfirmed for the moment
>
>
> PDF Page 166
> Figure 8-5
>
> The labels on the topics/partitions are all the same: "Topic A, Partition
> 0". I think they should be (from top to bottom): "Topic A, Partition 0"
> "Topic B, Partition 0" "Topic __conusmer_offsets"
> Paolo Baronti   Dec 01, 2017
>
>
> 
> De : adrien ruffie 
> Envoyé : jeudi 8 février 2018 22:14:53
> À : users@kafka.apache.org
> Objet : mistake in Kafka the definitive guide
>
> Hello all,
>
>
> I'm reading Kafka the definitive guide and I suspect that found an error
> in the page 166 figure "Figure 8-5 a fail over causes committed offsets
> without matching records".
>
>
> In the figure we can't see Topic B ... specified in the box "Group C1,
> Topic B, Parition 0, Offset 6" ... the figure is not correct?
>
>
> Thus, I don't really understand the figure ...
>
>
> best regards,
>
>
> Adrien
>


Re: How start with Kafka?

2018-02-08 Thread Matthias J. Sax
I assume you did start a local Kafka cluster?

Besides that, the client docs might help:
https://docs.confluent.io/current/clients/

They contain examples for librdkafka, too.

Otherwise, your question is hard to answer as there is not enough
information.


-Matthias

On 2/8/18 3:44 PM, Andy wrote:
> I am Kafka beginner.
> I have compiled under Windows and Visual Studio 2015 librdkafka and
> cppkafka .
> I have two examples: kafka_consumer.cpp and kafka_producer.cpp.
> I try:
>> producer -b 127.0.0.1:9092 -t topic
>> consumer -b 127.0.0.1:9092 -t topic -g group
> but is bad connection
> 



signature.asc
Description: OpenPGP digital signature


why kafka index file use memory mapped files ,however log file doesn't

2018-02-08 Thread YuFeng Shen
Hi Experts,

We know that kafka use memory mapped files for it's index files ,however it's 
log files don't use the memory mapped files technology.

May I know why index files use memory mapped files, however log files don't  
use the same technology?


Jacky


Re: How start with Kafka?

2018-02-08 Thread ??????
first  build a broker, then try your example


 
---Original---
From: "Andy"
Date: 2018/2/9 07:45:11
To: "users";
Subject: How start with Kafka?


I am Kafka beginner.
I have compiled under Windows and Visual Studio 2015 librdkafka and
cppkafka .
I have two examples: kafka_consumer.cpp and kafka_producer.cpp.
I try:
>producer -b 127.0.0.1:9092 -t topic
>consumer -b 127.0.0.1:9092 -t topic -g group
but is bad connection