Re: Some questions on Kafka on order of messages with multiple partitions

2023-05-18 Thread Peter Bukowinski
It looks like you successfully removed replicas from partitions 0, 1, and 2, 
but partitions 3 - 8 still show 9 replicas. You probably intended to remove them 
from all 9 partitions? You’ll need to create another json file with partitions 
3 - 8 to complete the task.
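
Something along these lines should work; the replica placements below simply keep
the first three brokers from each partition's current list, so adjust them if you
want a different balance across the cluster:

{
"version":1,
"partitions":[
{"topic":"md","partition":3,"replicas":[1,12,9]},
{"topic":"md","partition":4,"replicas":[7,9,11]},
{"topic":"md","partition":5,"replicas":[3,11,1]},
{"topic":"md","partition":6,"replicas":[10,1,7]},
{"topic":"md","partition":7,"replicas":[8,7,3]},
{"topic":"md","partition":8,"replicas":[2,3,10]}
]
}

Then run kafka-reassign-partitions.sh --execute against that file, the same way you
did for partitions 0 - 2.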

—
Peter


> On May 17, 2023, at 12:41 AM, Mich Talebzadeh  
> wrote:
> 
> Thanks Miguel, I did that
> 
> Based on the following Json file
> 
> {
> "version":1,
> "partitions":[
> {"topic":"md","partition":0,"replicas":[1,3,7]},
> {"topic":"md","partition":1,"replicas":[2,8,9]},
> {"topic":"md","partition":2,"replicas":[7,10,12]}
> ]
> }
> 
> I ran this command
> 
> kafka-reassign-partitions.sh --bootstrap-server rhes75:9092
> --reassignment-json-file ./reduce_replication_factor2.json --execute
> 
> Current partition replica assignment
> {"version":1,"partitions":[{"topic":"md","partition":0,"replicas":[1,3,7],"log_dirs":["any","any","any"]},{"topic":"md","partition":1,"replicas":[2,8,9],"log_dirs":["any","any","any"]},{"topic":"md","partition":2,"replicas":[7,10,12],"log_dirs":["any","any","any"]}]}
> Save this to use as the --reassignment-json-file option during rollback
> Successfully started partition reassignments for md-0,md-1,md-2
> 
> kafka-topics.sh --describe --bootstrap-server rhes75:9092 --topic md
> 
> Topic: md   TopicId: UfQly87bQPCbVKoH-PQheg   PartitionCount: 9   ReplicationFactor: 3   Configs: segment.bytes=1073741824,retention.ms=1000,retention.bytes=1073741824
>    Topic: md   Partition: 0   Leader: 1    Replicas: 1,3,7                 Isr: 1,3,7
>    Topic: md   Partition: 1   Leader: 2    Replicas: 2,8,9                 Isr: 2,8,9
>    Topic: md   Partition: 2   Leader: 7    Replicas: 7,10,12               Isr: 10,7,12
>    Topic: md   Partition: 3   Leader: 1    Replicas: 1,12,9,11,7,3,10,8,2  Isr: 10,1,9,2,12,7,3,11,8
>    Topic: md   Partition: 4   Leader: 7    Replicas: 7,9,11,1,3,10,8,2,12  Isr: 10,1,9,2,12,7,3,11,8
>    Topic: md   Partition: 5   Leader: 3    Replicas: 3,11,1,7,10,8,2,12,9  Isr: 10,1,9,2,12,7,3,11,8
>    Topic: md   Partition: 6   Leader: 10   Replicas: 10,1,7,3,8,2,12,9,11  Isr: 10,1,9,2,12,7,3,11,8
>    Topic: md   Partition: 7   Leader: 8    Replicas: 8,7,3,10,2,12,9,11,1  Isr: 10,1,9,2,12,7,3,11,8
>    Topic: md   Partition: 8   Leader: 2    Replicas: 2,3,10,8,12,9,11,1,7  Isr: 10,1,9,2,12,7,3,11,8
> 
> 
> Mich
> 
> 
>   view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
> 
> 
> https://en.everybodywiki.com/Mich_Talebzadeh
> 
> 
> 
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
> 
> 
> 
> 
> On Wed, 17 May 2023 at 00:21, Miguel A. Sotomayor 
> wrote:
> 
>> Hi Mich,
>> 
>> You can use the script `kafka-reassign-partitions.sh` to re-locate or
>> change the number of replicas
>> 
>> Regards
>> Miguel
>> 
>> El mar, 16 may 2023 a las 18:44, Mich Talebzadeh (<
>> mich.talebza...@gmail.com>)
>> escribió:
>> 
>>> Thanks Peter. I meant reduce replication from 9 to 3 and Not partitions.
>>> Apologies for any confusion
>>> 
>>> 
>>> Cheers
>>> 
>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
>>> loss, damage or destruction of data or any other property which may arise
>>> from relying on this email's technical content is explicitly disclaimed.
>>> The author will in no case be liable for any monetary damages arising
>> from
>>> such loss, damage or destruction.
>>> 
>>> 
>>> 
>>> 
>>> On Tue, 16 May 2023 at 17:38, Peter Bukowinski  wrote:
>>> 
>>>> Mich,
>>>> 
>>>> It is not possible to reduce the number of partitions for a kafka topic
>>>> without deleting and recreating the topic. What previous responders to
>>>> your
>>>> inquiry noted is that your topic replication of 9 is high.

Re: Some questions on Kafka on order of messages with multiple partitions

2023-05-16 Thread Peter Bukowinski
Mich,

It is not possible to reduce the number of partitions for a kafka topic without 
deleting and recreating the topic. What previous responders to your inquiry 
noted is that your topic replication of 9 is high. What you want to do is 
reduce your replication, not the partitions. You can do this using the same 
json file you had the first time, with all 9 partitions. Just remove 6 of the 9 
broker ids from the replicas array, e.g.

cat reduce_replication_factor.json
{
"version":1,
"partitions":[
{"topic":"md","partition":0,"replicas":[12,10,8]},
{"topic":"md","partition":1,"replicas":[9,8,2]},
{"topic":"md","partition":2,"replicas":[11,2,12]},
{"topic":"md","partition":3,"replicas":[1,12,9]},
{"topic":"md","partition":4,"replicas":[7,9,11]},
{"topic":"md","partition":5,"replicas":[3,11,1]}
]
}

You may want to adjust where the replicas sit to achieve a better balance 
across the cluster, but this arrangement only truncates the last 6 replicas 
from each list, so it should complete quickly, as no replica data would move, only 
be deleted.
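
You can confirm when it has finished by re-running the same command with --verify
instead of --execute, e.g. (the broker address is just an example; any live broker
works):

kafka-reassign-partitions.sh --bootstrap-server rhes75:9092 --reassignment-json-file ./reduce_replication_factor.json --verify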

—
Peter Bukowinski



> On May 12, 2023, at 1:24 PM, Mich Talebzadeh  
> wrote:
> 
> My bad. Only need 3 partitions
> 
> {
> "version":1,
> "partitions":[
> {"topic":"md","partition":0,"replicas":[12,10,8,2,9,11,1,7,3]},
> {"topic":"md","partition":1,"replicas":[9,8,2,12,11,1,7,3,10]},
> {"topic":"md","partition":2,"replicas":[11,2,12,9,1,7,3,10,8]}
> ]
> }
> 
> Mich Talebzadeh,
> Lead Solutions Architect/Engineering Lead
> Palantir Technologies Limited
> London
> United Kingdom
> 
> 
>   view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
> 
> 
> https://en.everybodywiki.com/Mich_Talebzadeh
> 
> 
> 
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
> 
> 
> 
> 
> On Fri, 12 May 2023 at 21:18, Mich Talebzadeh 
> wrote:
> 
>> This json file seemed to work
>> 
>> cat reduce_replication_factor.json
>> {
>> "version":1,
>> "partitions":[
>> {"topic":"md","partition":0,"replicas":[12,10,8,2,9,11,1,7,3]},
>> {"topic":"md","partition":1,"replicas":[9,8,2,12,11,1,7,3,10]},
>> {"topic":"md","partition":2,"replicas":[11,2,12,9,1,7,3,10,8]},
>> {"topic":"md","partition":3,"replicas":[1,12,9,11,7,3,10,8,2]},
>> {"topic":"md","partition":4,"replicas":[7,9,11,1,3,10,8,2,12]},
>> {"topic":"md","partition":5,"replicas":[3,11,1,7,10,8,2,12,9]}
>> ]
>> }
>> 
>> kafka-reassign-partitions.sh --bootstrap-server
>> rhes75:9092,rhes75:9093,rhes75:9094,rhes76:9092,rhes76:9093,rhes76:9094,rhes76:9095,rhes76:9096,
>> rhes76:9097 --reassignment-json-file ./reduce_replication_factor.json
>> --execute
>> 
>> The output
>> 
>> Successfully started partition reassignments for
>> md-0,md-1,md-2,md-3,md-4,md-5
>> 
>> 
>> I guess it is going to take some time before it is completed.
>> 
>> Thanks
>> 
>> 
>> 
>> 
>> On Fri, 12 May 2023 at 20:16, Mich Talebzadeh 
>> wrote:
>> 
>>> Thanks Matthias.
>>> 
>>> with regard to your point below:
>>> 
>>> A replication factor of 9 sounds very high. For production, a replication
>>> factor of 3 is recommended.
>>> 
>>> Is it possible to dynamically reduce this number to 3 while the topic is
>>> actively being consumed?
>>> 
>>> 
>>> 
>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>>> any loss, damage or destruction of data or any other property which may
>>> arise from relying on this email's technical content is explicitly
>>> disclaimed. The author will in no case be liable for any monetary damages
>>> arising from such loss, damage or destruction.
>>> 
>>> 
>>> 

Re: Kafka cluster rolling restart

2023-03-06 Thread Peter Bukowinski
When doing rolling restarts, I always wait until the under-replicated partition 
count returns to zero before restarting the next broker. This state is achieved 
AFTER the last restarted broker returns to a running state. If you just wait 
for the running state, you risk restarting the next broker before all 
partitions have returned to healthy, and then you’ll have offline partitions 
because your minISR is 2.
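
A quick way to check for stragglers between restarts (with a reasonably recent
kafka-topics.sh) is something like:

kafka-topics.sh --bootstrap-server <broker>:9092 --describe --under-replicated-partitions

If that comes back empty, it should be safe to restart the next broker.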

--
Peter Bukowinski

> On Mar 6, 2023, at 7:04 AM, Luis Alves  wrote:
> 
> Hello,
> 
> I'm doing some tests with rolling restarts in a Kafka cluster and I have a
> couple of questions related to the impact of rolling restarts on Kafka
> consumers/producers and on the overall process.
> 
> First, some context on my setup:
> 
>   - Kafka cluster with 3 nodes.
>   - Topic replication factor of 3 with minISR of 2.
>   - All topics have a single partition (I intend to increase the
>   partitioning factor in the future, but for now it's just 1 for testing
>   purposes).
>   - Kafka version is 3.2.3.
>   - I have two systems that communicate via these Kafka topics. The
>   high-level flow is:
>  1. System A sends a message to a Kafka topic (at a rate of ~10
>  events/sec).
>  2. System B consumes the message.
>  3. System B sends a reply to a Kafka topic.
>  4. System A consumes the reply.
>   - When the system is stable, I see end-to-end latencies (measured on
>   System A) around 10ms in the 99th percentile.
>   - System A is using Kafka client 3.3.1, and System B is using Kafka
>   client 3.4.0.
>   - Kafka consumers and producers on both systems are with the default
>   configurations, except that the Kafka consumers have auto-commits disabled.
>   - All Kafka brokers are configured with controlled.shutdown.enable set
>   to true.
>   - The Kafka cluster is running in Kubernetes and deployed using Strimzi
>   (this is just for awareness).
>   - The rolling restart process is the following (when using Strimzi to
>   manage it, and when we try to do it manually):
>  1. Restart each broker, one at a time, by sending a SIGTERM to the
>  broker process. The controller broker is the last one to be restarted.
>  2. Only restart the next broker when the current broker reports the
>  broker state as RUNNING. Note: when we do this manually (without
> Strimzi),
>  we wait to see the end-to-end latencies stabilize before moving
> to the next
>  broker.
> 
> Now, my questions:
> 
>   1. When we do this process with Strimzi (waits for the broker state to
>   be RUNNING before moving to the next one), we've seen end-to-end latencies
>   growing up to 1-2 minutes (System A is not even able to send events to the
>   Kafka topic). This is unexpected because AFAIK the configurations that we
>   are using are the ones recommended for high availability during rolling
>   restarts. My question is: is it enough to wait for the broker state to be
>   RUNNING to move on to the next broker?
>   2. When we do this process manually (we wait for end-to-end latencies to
>   stabilize and only then move to the next broker), we've seen end-to-end
>   latencies growing up to 1 second. While this is much better than what we
>   see in 1., my question is whether this latency increase is expected or not.
> 
> Thanks in advance,
> Luís Alves


Re: Help with log.dirs please

2022-09-23 Thread Peter Bukowinski
Great to hear, Chris!

I would recommend not using zookeeper addresses in any of your kafka commands 
as that has been deprecated for some time. Switch to using `--bootstrap-server`.
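
For example, instead of passing --zookeeper you would run something like this (host
and port are placeholders):

kafka-topics.sh --bootstrap-server <broker-host>:9092 --list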

—
Peter

> On Sep 22, 2022, at 1:08 AM, Chris Peart  wrote:
> 
> Hi Peter,
> 
> 
> 
> Thanks for getting back to me on this, i have done the following checks as 
> suggested:
> 
> Kafka version: 2.8.1
> 
> I can connect to all 4 brokers on port 9092 from my client system
> 
> I can see this on 3 of the 4 brokers have the following message:
> 
> DEBUG [Controller id=X] Broker Y has been elected as the controller, so 
> stopping the election process. (kafka.controller.KafkaController)
> 
> I was just going to send you the command for my topic creation and realised i 
> was not using the new zookeeper path in my ZK connection string, so i was 
> using /kafka instead of /kafkaprod   MyBad :-(
> 
> 
> 
> So all was working after all, once i removed the data/files from /data/kafka 
> and changing the zookeeper.connect string to use /kafkaprod.
> 
> 
> 
> We now have all six disk working on all 4 brokers.
> 
> 
> 
> Thanks again for your help on this and sorry for not spotting my mistake 
> sooner :-)
> 
> 
> 
> Many Thanks
> 
> Chris
> 
> 
> 
> On 2022-09-21 22:24, Peter Bukowinski wrote:
> 
>> Hmmm. Let’s start with some low level troubleshooting.
>> 
>> What kafka version are you using?
>> 
>> Can reach the ip:port of all broker listener addresses? I like to use `nc 
>> -vz ip port` to validate connectivity from my kafka client.
>> 
>> Check the controller.log on all brokers. All but one of them should say 
>> something like `DEBUG [Controller id=X] Broker Y has been elected as the 
>> controller, so stopping the election process. 
>> (kafka.controller.KafkaController)`. This will tell you that all brokers are 
>> in agreement that only one of them is the controller. This will confirm 
>> zookeeper is configured and working correctly.
>> 
>> Can you share the command you’re using to create your topics?
>> 
>> When you run `kafka-broker-api-versions.sh --bootstrap-server host:port | 
>> grep ‘>’ | sort -n -k3`, do you get a list of all your brokers?
>> 
>> —
>> Peter
>> 
>>> On Sep 21, 2022, at 10:26 AM, Chris Peart wrote:
>>> 
>>> Hi Peter,
>>> Kafka brokers are staying up and show when I query zookeeper brokers/ids
>>> Are there any other files that require deletion, as it doesn’t make sense that I 
>>> have 0 brokers when trying to create a topic. 
>>> 
>>> I get no errors when listing topics, but obviously don’t see any topics as 
>>> I cannot create any. 
>>> 
>>> I have rebooted the Kafka brokers and restarted the all the zookeeper 
>>> services on our production zookeeper cluster. 
>>> 
>>> Not sure what other steps I can take to resolve this, any thoughts would be 
>>> appreciated. 
>>> 
>>> Many Thanks 
>>> Chris
>>> 
>>> 
>>>> On 21 Sep 2022, at 6:03 pm, Peter Bukowinski wrote:
>>>> 
>>>> Hi Chris,
>>>> 
>>>> Are you sure kafka is starting up (and remains up) successfully? From the 
>>>> error, it seems like none of the brokers are online.
>>>> 
>>>> —
>>>> Peter
>>>> 
>>>> 
>>>>> On Sep 21, 2022, at 4:02 AM, Chris Peart wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> Hi Peter,
>>>>> 
>>>>> I have made the changes i suggested:
>>>>> 
>>>>> Stopping Kafka
>>>>> 
>>>>> Deleting all files and folders in /Kafka/data
>>>>> 
>>>>> Changing the zookeeper setting to point to a different path in the 
>>>>> zookeeper cluster
>>>>> 
>>>>> Start Kafka
>>>>> 
>>>>> I see the usual files in /kafka/data/ meta.properties and the offset 
>>>>> files in all 6 disks.
>>>>> 
>>>>> I see all 4 brokers in zookeeper using the new path i specified in 
>>>>> server.properties.
>>>>> 
>>>>> When i try to create a topic now i receive: Replication factor: 3 larger 
>>>>> than available brokers: 0
>>>>> 
>>> I see no errors in server.log & controller.log.

Re: Help with log.dirs please

2022-09-21 Thread Peter Bukowinski
Hmmm. Let’s start with some low level troubleshooting.

What kafka version are you using?

Can reach the ip:port of all broker listener addresses? I like to use `nc -vz 
ip port` to validate connectivity from my kafka client.

Check the controller.log on all brokers. All but one of them should say 
something like `DEBUG [Controller id=X] Broker Y has been elected as the 
controller, so stopping the election process. 
(kafka.controller.KafkaController)`. This will tell you that all brokers are in 
agreement that only one of them is the controller. This will confirm zookeeper 
is configured and working correctly.

Can you share the command you’re using to create your topics?

When you run `kafka-broker-api-versions.sh --bootstrap-server host:port | grep 
‘>’ | sort -n -k3`, do you get a list of all your brokers?

—
Peter

> On Sep 21, 2022, at 10:26 AM, Chris Peart  wrote:
> 
> Hi Peter,
> Kafka brokers are staying up and show when I query zookeeper brokers/ids
> Are there any other files that require deletion, as it doesn’t make sense that I have 
> 0 brokers when trying to create a topic. 
> 
> I get no errors when listing topics, but obviously don’t see any topics as I 
> cannot create any. 
> 
> I have rebooted the Kafka brokers and restarted the all the zookeeper 
> services on our production zookeeper cluster. 
> 
> Not sure what other steps I can take to resolve this, any thoughts would be 
> appreciated. 
> 
> Many Thanks 
> Chris
> 
> 
>> On 21 Sep 2022, at 6:03 pm, Peter Bukowinski  wrote:
>> 
>> Hi Chris,
>> 
>> Are you sure kafka is starting up (and remains up) successfully? From the 
>> error, it seems like none of the brokers are online.
>> 
>> —
>> Peter
>> 
>> 
>>> On Sep 21, 2022, at 4:02 AM, Chris Peart  wrote:
>>> 
>>> 
>>> 
>>> Hi Peter,
>>> 
>>> I have made the changes i suggested:
>>> 
>>> Stopping Kafka
>>> 
>>> Deleting all files and folders in /Kafka/data
>>> 
>>> Changing the zookeeper setting to point to a different path in the 
>>> zookeeper cluster
>>> 
>>> Start Kafka
>>> 
>>> I see the usual files in /kafka/data/ meta.properties and the offset files 
>>> in all 6 disks.
>>> 
>>> I see all 4 brokers in zookeeper using the new path i specified in 
>>> server.properties.
>>> 
>>> When i try to create a topic now i receive: Replication factor: 3 larger 
>>> than available brokers: 0
>>> 
>>> I see no errors in server.log & controller.log.
>>> 
>>> Any advice would be great please as i've exhausted all my options.
>>> 
>>> Many Thanks,
>>> 
>>> Chris
>>> 
>>>> On 2022-09-20 21:43, Chris Peart wrote:
>>>> 
>>>> Thanks Peter,
>>>> I'll give this a go tomorrow and let you know how I get on.
>>>> Many Thanks,
>>>> Chris
>>>>> On 20 Sep 2022, at 9:32 pm, Peter Bukowinski  wrote:
>>>> Hi Chris,
>>>> If the configs are correct and the permissions on all the 
>>>> /data/X/kafka/data directories are correct, then kafka should use all of 
>>>> the log dirs when creating topics. Remember that kafka will not 
>>>> automatically move any existing topic data when the cluster configs 
>>>> change. I'd test by creating a topic with more partitions than storage 
>>>> locations.
>>>> If you'd rather start fresh, you have the steps correct. An alternative to 
>>>> changing the zk path is to use zkCli to remove the paths. If you use a 
>>>> zookeeper chroot, just delete everything from that chroot down from zkCli, 
>>>> e.g. `rmr /[kafka-chroot]`
>>>> --
>>>> Peter
>>>>> On Sep 20, 2022, at 11:56 AM, Chris Peart  wrote:
>>>> Thinking about this, as this is not in production it might be easier just 
>>>> reset everything.
>>>> Would it be something like:
>>>> Stopping Kafka
>>>> Deleting all files and folders in /Kafka/data
>>>> Changing the zookeeper setting to point to a different path in the 
>>>> zookeeper cluster
>>>> Start Kafka
>>>> Some help on resetting Kafka would be great if ok please.
>>>> Many Thanks
>>>> Chris
>>>>> On 20 Sep 2022, at 3:37 pm, Chris Peart  wrote:
>>>> Hi Peter,
>>>> I have checked the logs on all 4 brokers and could only see 
>>>> /data/1/data/kafka being used, log.dirs config in the logs showed all the 
>>>> disks but no errors.

Re: Help with log.dirs please

2022-09-21 Thread Peter Bukowinski
Hi Chris,

Are you sure kafka is starting up (and remains up) successfully? From the 
error, it seems like none of the brokers are online.

—
Peter


> On Sep 21, 2022, at 4:02 AM, Chris Peart  wrote:
> 
> 
> 
> Hi Peter,
> 
> I have made the changes i suggested:
> 
> Stopping Kafka
> 
> Deleting all files and folders in /Kafka/data
> 
> Changing the zookeeper setting to point to a different path in the zookeeper 
> cluster
> 
> Start Kafka
> 
> I see the usual files in /kafka/data/ meta.properties and the offset files in 
> all 6 disks.
> 
> I see all 4 brokers in zookeeper using the new path i specified in 
> server.properties.
> 
> When i try to create a topic now i receive: Replication factor: 3 larger than 
> available brokers: 0
> 
> I see no errors in server.log & controller.log.
> 
> Any advice would be great please as i've exhausted all my options.
> 
> Many Thanks,
> 
> Chris
> 
> On 2022-09-20 21:43, Chris Peart wrote:
> 
>> Thanks Peter,
>> I'll give this a go tomorrow and let you know how I get on.
>> Many Thanks,
>> Chris
>> On 20 Sep 2022, at 9:32 pm, Peter Bukowinski  wrote:
>> Hi Chris,
>> If the configs are correct and the permissions on all the /data/X/kafka/data 
>> directories are correct, then kafka should use all of the log dirs when 
>> creating topics. Remember that kafka will not automatically move any 
>> existing topic data when the cluster configs change. I'd test by creating a 
>> topic with more partitions than storage locations.
>> If you'd rather start fresh, you have the steps correct. An alternative to 
>> changing the zk path is to use zkCli to remove the paths. If you use a 
>> zookeeper chroot, just delete everything from that chroot down from zkCli, 
>> e.g. `rmr /[kafka-chroot]`
>> --
>> Peter
>> On Sep 20, 2022, at 11:56 AM, Chris Peart  wrote:
>> Thinking about this, as this is not in production it might be easier just 
>> reset everything.
>> Would it be something like:
>> Stopping Kafka
>> Deleting all files and folders in /Kafka/data
>> Changing the zookeeper setting to point to a different path in the zookeeper 
>> cluster
>> Start Kafka
>> Some help on resetting Kafka would be great if ok please.
>> Many Thanks
>> Chris
>> On 20 Sep 2022, at 3:37 pm, Chris Peart  wrote:
>> Hi Peter,
>> I have checked the logs on all 4 brokers and could only see 
>> /data/1/data/kafka being used, log.dirs config in the logs showed all the 
>> disks but no errors.
>> I managed to get the other 5 disks working by adding the path 
>> /data/[2-6]/data/kafka and setting the owner as kafka and restarting kafka.
>> So now when i create topics i see disks 2-6 being used but not disk 1.
>> I have stopped all the brokers deleted all files /data/1/kafka/data on all 
>> brokers and started them, but i still don't see disk 1 being used when 
>> creating topics, it's as if there is some dirty configuration somewhere, 
>> maybe in zookeeper?
>> Any help here would be much appreciated :)
>> Many Thanks,
>> Chris
>> On 2022-09-16 08:22, Chris Peart wrote:
>> Thanks Peter,
>> I'll check the logs next week and let you know my findings.
>> Many Thanks
>> Chris
>> On 16 Sep 2022, at 7:45 am, Peter Bukowinski  wrote:
>> The next thing I'd check is the broker logs. The parameters read from the 
>> config should appear in the logs when kafka starts up. Search the logs for 
>> 'log.dirs' and ensure the correct configs are loaded.
>> --
>> Peter
>> On Sep 15, 2022, at 11:10 PM, Chris Peart  wrote:
>> Hi Peter,
>> Thanks for your response, we have the following configuration:
>> Partition count=4
>> Replication factor=3
>> All four brokers have topics-partitions in /data/1/kafka/data and are 
>> receiving data.
>> Each server has 6 x 2TB disks for kaka data.
>> Many Thanks,
>> Chris
>> On 16 Sep 2022, at 1:56 am, Peter Bukowinski  wrote:
>> Hi Chris,
>> Can you share the partition count and replication factor of your partitions? 
>> Also, do all four brokers contain topic-partition directories in 
>> /data/1/kafka/data or just a single broker? Depending on your topic config, 
>> it may be entirely normal that his has happened.
>> --
>> Peter Bukowinski
>> On Sep 15, 2022, at 3:35 AM, Chris Peart  wrote:
>> Hi All,
>> I have a 4 node kafka cluster running version 2.8.1, we have started pushing 
>> data to the cluster but can only see one disk being used.
>> We had 6 disk configured as non-raid and 1 partition per disk, we have the 
>

Re: Help with log.dirs please

2022-09-20 Thread Peter Bukowinski
Hi Chris,

If the configs are correct and the permissions on all the /data/X/kafka/data 
directories are correct, then kafka should use all of the log dirs when 
creating topics. Remember that kafka will not automatically move any existing 
topic data when the cluster configs change. I’d test by creating a topic with 
more partitions than storage locations.

If you’d rather start fresh, you have the steps correct. An alternative to 
changing the zk path is to use zkCli to remove the paths. If you use a 
zookeeper chroot, just delete everything from that chroot down from zkCli, e.g. 
`rmr /[kafka-chroot]`

—
Peter

> On Sep 20, 2022, at 11:56 AM, Chris Peart  wrote:
> 
> Thinking about this, as this is not in production it might be easier just 
> reset everything. 
> 
> Would it be something like:
> 
> Stopping Kafka 
> Deleting all files and folders in /Kafka/data
> Changing the zookeeper setting to point to a different path in the zookeeper 
> cluster
> Start Kafka 
> 
> Some help on resetting Kafka would be great if ok please. 
> 
> Many Thanks 
> Chris 
> 
>> On 20 Sep 2022, at 3:37 pm, Chris Peart  wrote:
>> 
>> 
>> Hi Peter,
>> 
>> I have checked the logs on all 4 brokers and could only see 
>> /data/1/data/kafka being used, log.dirs config in the logs showed all the 
>> disks but no errors.
>> 
>> I managed to get the other 5 disks working by adding the path 
>> /data/[2-6]/data/kafka and setting the owner as kafka and restarting kafka.
>> 
>> So now when i create topics i see disks 2-6 being used but not disk 1.
>> 
>> I have stopped all the brokers deleted all files /data/1/kafka/data on all 
>> brokers and started them, but i still don't see disk 1 being used when 
>> creating topics, it's as if there is some dirty configuration somewhere, 
>> maybe in zookeeper?
>> 
>> Any help here would be much appreciated :)
>> 
>> 
>> 
>> Many Thanks,
>> 
>> Chris
>> 
>> 
>> 
>> 
>> 
>>> On 2022-09-16 08:22, Chris Peart wrote:
>>> 
>>> Thanks Peter,
>>> I’ll check the logs next week and let you know my findings. 
>>> Many Thanks 
>>> Chris 
>>> 
>>>> On 16 Sep 2022, at 7:45 am, Peter Bukowinski  wrote:
>>>> 
>>>> The next thing I’d check is the broker logs. The parameters read from the 
>>>> config should appear in the logs when kafka starts up. Search the logs for 
>>>> ‘log.dirs’ and ensure the correct configs are loaded.
>>>> 
>>>> --
>>>> Peter
>>>> 
>>>>> On Sep 15, 2022, at 11:10 PM, Chris Peart  wrote:
>>>>> 
>>>>> Hi Peter,
>>>>> 
>>>>> Thanks for your response, we have the following configuration:
>>>>> 
>>>>> Partition count=4
>>>>> Replication factor=3
>>>>> 
>>>>> All four brokers have topics-partitions in /data/1/kafka/data and are 
>>>>> receiving data. 
>>>>> 
>>>>> Each server has 6 x 2TB disks for kafka data. 
>>>>> 
>>>>> Many Thanks,
>>>>> Chris
>>>>> 
>>>>>>> On 16 Sep 2022, at 1:56 am, Peter Bukowinski  wrote:
>>>>>> 
>>>>>> Hi Chris,
>>>>>> 
>>>>>> Can you share the partition count and replication factor of your 
>>>>>> partitions? Also, do all four brokers contain topic-partition 
>>>>>> directories in /data/1/kafka/data or just a single broker? Depending on 
>>>>>> your topic config, it may be entirely normal that this has happened.
>>>>>> 
>>>>>> 
>>>>>> —
>>>>>> Peter Bukowinski 
>>>>>> 
>>>>>>>> On Sep 15, 2022, at 3:35 AM, Chris Peart  wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Hi All,
>>>>>>> 
>>>>>>> I have a 4 node kafka cluster running version 2.8.1, we have started 
>>>>>>> pushing data to the cluster but can only see one disk being used.
>>>>>>> 
>>>>>>> We had 6 disk configured as non-raid and 1 partition per disk, we have 
>>>>>>> the following in fstab:
>>>>>>> 
>>>>>>> /dev/mapper/VolGroup01-data  /data/1  xfs  nodev,noatime,nofail  1  2
>>>>>>> 
>>>>>>> /dev/mapper/VolGroup02-data  /data/2  xfs  nodev,noatime,nofail  1  2
>>>>>>> 
>>>>>>> /dev/mapper/VolGroup03-data  /data/3  xfs  nodev,noatime,nofail  1  2
>>>>>>> 
>>>>>>> /dev/mapper/VolGroup04-data  /data/4  xfs  nodev,noatime,nofail  1  2
>>>>>>> 
>>>>>>> /dev/mapper/VolGroup05-data  /data/5  xfs  nodev,noatime,nofail  1  2
>>>>>>> 
>>>>>>> /dev/mapper/VolGroup06-data  /data/6  xfs  nodev,noatime,nofail  1  2
>>>>>>> 
>>>>>>> We configured server.properties to be: 
>>>>>>> log.dirs=/data/1/kafka/data,/data/2/kafka/data,/data/3/kafka/data,/data/4/kafka/data,/data/5/kafka/data,/data/6/kafka/data
>>>>>>> 
>>>>>>> i can see all our topics in /data/1/kafka/data but don't see anything 
>>>>>>> in /data/2-5
>>>>>>> 
>>>>>>> Any help would be appreciated as this is going to production next week?
>>>>>>> 
>>>>>>> Many Thanks,
>>>>>>> 
>>>>>>> Chris



Re: Help with log.dirs please

2022-09-16 Thread Peter Bukowinski
The next thing I’d check is the broker logs. The parameters read from the 
config should appear in the logs when kafka starts up. Search the logs for 
‘log.dirs’ and ensure the correct configs are loaded.
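
Something like this should show the value each broker actually loaded (the log path
is just an example; point it at wherever your server.log lives):

grep 'log.dirs' /path/to/kafka/logs/server.log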

--
Peter

> On Sep 15, 2022, at 11:10 PM, Chris Peart  wrote:
> 
> Hi Peter,
> 
> Thanks for your response, we have the following configuration:
> 
> Partition count=4
> Replication factor=3
> 
> All four brokers have topics-partitions in /data/1/kafka/data and are 
> receiving data. 
> 
> Each server has 6 x 2TB disks for kafka data. 
> 
> Many Thanks,
> Chris
> 
>> On 16 Sep 2022, at 1:56 am, Peter Bukowinski  wrote:
>> 
>> Hi Chris,
>> 
>> Can you share the partition count and replication factor of your partitions? 
>> Also, do all four brokers contain topic-partition directories in 
>> /data/1/kafka/data or just a single broker? Depending on your topic config, 
>> it may be entirely normal that this has happened.
>> 
>> 
>> —
>> Peter Bukowinski 
>> 
>>>> On Sep 15, 2022, at 3:35 AM, Chris Peart  wrote:
>>> 
>>> 
>>> 
>>> Hi All,
>>> 
>>> I have a 4 node kafka cluster running version 2.8.1, we have started 
>>> pushing data to the cluster but can only see one disk being used.
>>> 
>>> We had 6 disk configured as non-raid and 1 partition per disk, we have the 
>>> following in fstab:
>>> 
>>> /dev/mapper/VolGroup01-data  /data/1  xfs  nodev,noatime,nofail  1  2
>>> 
>>> /dev/mapper/VolGroup02-data  /data/2  xfs  nodev,noatime,nofail  1  2
>>> 
>>> /dev/mapper/VolGroup03-data  /data/3  xfs  nodev,noatime,nofail  1  2
>>> 
>>> /dev/mapper/VolGroup04-data  /data/4  xfs  nodev,noatime,nofail  1  2
>>> 
>>> /dev/mapper/VolGroup05-data  /data/5  xfs  nodev,noatime,nofail  1  2
>>> 
>>> /dev/mapper/VolGroup06-data  /data/6  xfs  nodev,noatime,nofail  1  2
>>> 
>>> We configured server.properties to be: 
>>> log.dirs=/data/1/kafka/data,/data/2/kafka/data,/data/3/kafka/data,/data/4/kafka/data,/data/5/kafka/data,/data/6/kafka/data
>>> 
>>> i can see all our topics in /data/1/kafka/data but don't see anything in 
>>> /data/2-5
>>> 
>>> Any help would be appreciated as this is going to production next week?
>>> 
>>> Many Thanks,
>>> 
>>> Chris
>> 
> 


Re: Help with log.dirs please

2022-09-15 Thread Peter Bukowinski
Hi Chris,

Can you share the partition count and replication factor of your partitions? 
Also, do all four brokers contain topic-partition directories in 
/data/1/kafka/data or just a single broker? Depending on your topic config, it 
may be entirely normal that this has happened.


—
Peter Bukowinski 

> On Sep 15, 2022, at 3:35 AM, Chris Peart  wrote:
> 
> 
> 
> Hi All,
> 
> I have a 4 node kafka cluster running version 2.8.1, we have started pushing 
> data to the cluster but can only see one disk being used.
> 
> We had 6 disk configured as non-raid and 1 partition per disk, we have the 
> following in fstab:
> 
> /dev/mapper/VolGroup01-data  /data/1  xfs  nodev,noatime,nofail  1  2
> 
> /dev/mapper/VolGroup02-data  /data/2  xfs  nodev,noatime,nofail  1  2
> 
> /dev/mapper/VolGroup03-data  /data/3  xfs  nodev,noatime,nofail  1  2
> 
> /dev/mapper/VolGroup04-data  /data/4  xfs  nodev,noatime,nofail  1  2
> 
> /dev/mapper/VolGroup05-data  /data/5  xfs  nodev,noatime,nofail  1  2
> 
> /dev/mapper/VolGroup06-data  /data/6  xfs  nodev,noatime,nofail  1  2
> 
> We configured server.properties to be: 
> log.dirs=/data/1/kafka/data,/data/2/kafka/data,/data/3/kafka/data,/data/4/kafka/data,/data/5/kafka/data,/data/6/kafka/data
> 
> i can see all our topics in /data/1/kafka/data but don't see anything in 
> /data/2-5
> 
> Any help would be appreciated as this is going to production next week?
> 
> Many Thanks,
> 
> Chris



Re: Consumer Lag-Apache_kafka_JMX metrics

2022-08-16 Thread Peter Bukowinski
Richard recently answered your query. A kafka cluster does not keep track of 
lag on behalf of external consumers, so it is not available via the brokers' JMX. 
This is why tools like Burrow were written. The Java kafka consumer publishes 
consumer lag metrics, and perhaps some other third-party clients do as well.
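
If you are on the Java consumer, its lag metrics are exposed client-side over JMX,
as far as I recall under an MBean along these lines (the client-id is whatever your
consumer is configured with):

kafka.consumer:type=consumer-fetch-manager-metrics,client-id=<client-id>

with attributes such as records-lag-max.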

> On Aug 16, 2022, at 12:05 PM, Kafka Life  wrote:
> 
> Hello Experts, Any info or pointers on my query please.
> 
> 
> 
> On Mon, Aug 15, 2022 at 11:36 PM Kafka Life  wrote:
> 
>> Dear Kafka Experts
>> we need to monitor the consumer lag in kafka clusters 2.5.1 and 2.8.0
>> versions of kafka in Grafana.
>> 
>> 1/ What is the correct path for JMX metrics to evaluate Consumer Lag in
>> kafka cluster.
>> 
>> 2/ I had thought it is FetcherLag  but it looks like it is not as per the
>> link below.
>> 
>> https://www.instaclustr.com/support/documentation/kafka/monitoring-information/fetcher-lag-metrics/#:~:text=Aggregated%20Fetcher%20Consumer%20Lag%20This%20metric%20aggregates%20lag,in%20sync%20with%20partitions%20that%20it%20is%20replicating
>> .
>> 
>> Could one of you experts please guide on which JMX i should use for
>> consumer lag apart from kafka burrow or such intermediate tools
>> 
>> Thanking you in advance
>> 
>> 



Re: Ensuring that the message is persisted after acknowledgement

2021-08-24 Thread Peter Bukowinski
Kunal,

I recommend looking at the broker and topic parameters that include the term 
“flush” , such as 
https://kafka.apache.org/documentation/#topicconfigs_flush.messages 


Kafka lets you configure how often log messages are flushed to disk, either per 
topic or globally. The default settings leave the flushing completely to the 
OS. Kafka was designed to take full advantage of the OS page cache because it 
significantly improves performance for both producers and consumers, allowing 
them to write to and read from memory.

If your application requires absolute disk persistence and you are willing to 
take a significant performance hit, you can set the topic property 
flush.messages to 1 for any topic that requires this guarantee.
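
For an existing topic that would look something like this (topic name and broker
address are placeholders):

kafka-configs.sh --bootstrap-server <broker>:9092 --alter --entity-type topics --entity-name <topic> --add-config flush.messages=1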

—
Peter

> On Aug 24, 2021, at 10:31 PM, Kunal Goyal  wrote:
> 
> Hi Sunil
> 
> The article that you shared talks about acks. But even if the message is
> received by all in-sync replicas and kafka sends response back to the
> producer, it is possible that none of the replicas flushed the
> messages to disk. So, if all the replicas crash for some reason, the
> messages would be lost. For our application, we require some way to
> guarantee that the messages are persisted to disk.
> 
> Regards,
> Kunal
> 
> On Tue, Aug 24, 2021 at 8:40 PM Vairavanathan Emalayan <
> vairavanathan.emala...@cohesity.com> wrote:
> 
>> 
>> 
>> -- Forwarded message -
>> From: sunil chaudhari 
>> Date: Fri, Aug 20, 2021 at 8:00 AM
>> Subject: Re: Ensuring that the message is persisted after acknowledgement
>> To: 
>> Cc: Vairavanathan Emalayan 
>> 
>> 
>> Hi Kunal,
>> This article may help you.
>> 
>> https://betterprogramming.pub/kafka-acks-explained-c0515b3b707e
>> 
>> 
>> Cheers,
>> Sunil.
>> 
>> On Fri, 20 Aug 2021 at 8:11 PM, Kunal Goyal 
>> wrote:
>> 
>>> Hello,
>>> 
>>> We are exploring using Kafka for our application. Our requirement is that
>>> once we write some messages to Kafka, it should be guaranteed that the
>>> messages are persisted to disk.
>>> We found this
>>> <
>>> https://www.quora.com/Does-Kafka-sync-data-to-disk-asynchronously-like-Redis-does
 
>>> article which says that a Kafka broker acknowledges a record after it has
>>> written the record to the buffer of the I/O device; it does not issue an
>>> explicit fsync operation nor does it wait for the OS to confirm that the
>>> data has been written. Is this statement true for the current
>>> implementation? If so, is there any way in which we can ensure fsync is
>>> called before acknowledgement of messages?
>>> Any help would be appreciated.
>>> 
>>> --
>>> 
>>> Thanks & Regards
>>> 
>>> Kunal Goyal
>>> 
>> 



Re: Under-replicated-partitions

2021-07-27 Thread Peter Bukowinski
Hi Sridhar,

If your min.insync.replicas value is set to 3, then kafka won’t be able to move 
replicas until there are three replicas listed in the ISR. I would look into 
the health of broker 21 — it’s either down or unhealthy. It’s the only one not 
showing in the ISR list. 

—
Peter Bukowinski

> On Jul 27, 2021, at 1:12 AM, Sridhar Rao  wrote:
> 
> Hi Fabio Pardi,
> 
> Thanks for your prompt response.
> Split brain was our suspicion and we are investigating other possibilities.
> Perhaps our understanding of the problem might be incorrect at the moment.
> The issue started when one of the broker instances went down abruptly (3
> brokers, 3 zookeepers) and the cluster was unstable.
> 
> Later, we were able to restart the affected broker instance followed by
> rolling restart of other 2 brokers. The cluster was stabilized at this
> point.
> However, we noticed un-repl partitions and Preferred Replica imbalance
> irregularities.
> 
> [xxx(user):/xxx/install/1.0.0/bin] ./kafka-topics.sh --describe --zookeeper
> zookeeper1:2181 --under-replicated-partitions
>Topic: ABC  Partition: 3Leader: 31  Replicas: 31,21,11
> Isr: 31,11
>Topic: __consumer_offsets   Partition: 1Leader: 31
> Replicas: 31,11,21  Isr: 31,11
>Topic: __consumer_offsets   Partition: 3Leader: 31
> Replicas: 21,11,31  Isr: 31,11
>Topic: __consumer_offsets   Partition: 7Leader: 31
> Replicas: 31,11,21  Isr: 31,11
>Topic: __consumer_offsets   Partition: 9Leader: 31
> Replicas: 21,11,31  Isr: 31,11
>Topic: __consumer_offsets   Partition: 13   Leader: 31
> Replicas: 31,11,21  Isr: 31,11
>Topic: __consumer_offsets   Partition: 15   Leader: 31
> Replicas: 21,11,31  Isr: 31,11
>Topic: __consumer_offsets   Partition: 19   Leader: 31
> Replicas: 31,11,21  Isr: 31,11
>Topic: __consumer_offsets   Partition: 21   Leader: 31
> Replicas: 21,11,31  Isr: 31,11
>Topic: __consumer_offsets   Partition: 25   Leader: 31
> Replicas: 31,11,21  Isr: 31,11
>Topic: __consumer_offsets   Partition: 27   Leader: 31
> Replicas: 21,11,31  Isr: 31,11
>Topic: __consumer_offsets   Partition: 31   Leader: 31
> Replicas: 31,11,21  Isr: 31,11
>Topic: __consumer_offsets   Partition: 33   Leader: 31
> Replicas: 21,11,31  Isr: 31,11
>Topic: __consumer_offsets   Partition: 37   Leader: 31
> Replicas: 31,11,21  Isr: 31,11
>Topic: __consumer_offsets   Partition: 43   Leader: 31
> Replicas: 31,11,21  Isr: 31,11
>Topic: __consumer_offsets   Partition: 45   Leader: 31
> Replicas: 21,11,31  Isr: 31,11
>Topic: __consumer_offsets   Partition: 49   Leader: 31
> Replicas: 31,11,21  Isr: 31,11
>Topic: _kafka_lenses_alerts Partition: 0Leader: 31
> Replicas: 31,21,11  Isr: 31,11
>Topic: _kafka_lenses_alerts_settingsPartition: 0Leader: 31
> Replicas: 31,21,11  Isr: 31,11
>Topic: _kafka_lenses_processors Partition: 0Leader: 31
> Replicas: 31,21,11  Isr: 31,11
>Topic: connect-kfkxxxprd-offset Partition: 0Leader: 31
> Replicas: 31,11,21  Isr: 31,11
>Topic: connect-kfkxxxprd-offset Partition: 4Leader: 31
> Replicas: 21,11,31  Isr: 31,11
>Topic: connect-kfkxxxprd-offset Partition: 6Leader: 31
> Replicas: 31,11,21  Isr: 31,11
>Topic: connect-kfkxxxprd-offset Partition: 10   Leader: 31
> Replicas: 21,11,31  Isr: 31,11
>Topic: connect-kfkxxxprd-offset Partition: 12   Leader: 31
> Replicas: 31,11,21  Isr: 31,11
>Topic: connect-kfkxxxprd-offset Partition: 16   Leader: 31
> Replicas: 21,11,31  Isr: 31,11
>Topic: connect-kfkxxxprd-offset Partition: 18   Leader: 31
> Replicas: 31,11,21  Isr: 31,11
>Topic: connect-kfkxxxprd-offset Partition: 22   Leader: 31
> Replicas: 21,11,31  Isr: 31,11
>Topic: connect-kfkxxxprd-offset Partition: 24   Leader: 31
> Replicas: 31,11,21  Isr: 31,11
>Topic: connect-kfkxxxprd-status Partition: 3Leader: 31
> Replicas: 21,31,11  Isr: 31,11
> 
> On Tue, Jul 27, 2021 at 9:46 AM Fabio Pardi  wrote:
> 
>> 
>> 
>> On 27/07/2021 09:19, Sridhar Rao wrote:
>>> Hi Everyone,
>>> 
>>> Recently we noticed a high number of under-replicated-partitions after
>>> zookeeper split brain issue.
>>> We tried fixing the issue by executing ./kafka-reassign-partitions.sh
>>> procedure. However Kafka refuses to re-assign the partitions in ISR and
>>> un-repl partitions remain the same.
>>>

Re: Is it safe to delete old log segments manually?

2021-03-25 Thread Peter Bukowinski
In this case, yes, for any given topic-partition on the broker, you should be 
able to delete the oldest log segment, its associated index and timeindex 
files, and the snapshot file (which will be recreated on startup) in order to 
gain some free space.
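
As a rough sketch only; the path and segment names below are hypothetical, so
double-check against your own log directory before deleting anything:

cd /path/to/log.dir/some-topic-0
# the segment with the numerically lowest base offset is the oldest
rm 00000000000000000000.log 00000000000000000000.index 00000000000000000000.timeindex
rm *.snapshot   # recreated on startup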

—
Peter Bukowinski

> On Mar 25, 2021, at 11:08 AM, Sankalp Bhatia  
> wrote:
> 
> Thank you for the response Peter. However, for us all the brokers are 
> currently offline. So if I delete the entire topic-partition directory in one 
> of the brokers, the first broker would start with no means to replicate the 
> data which we just deleted. What are your thoughts on this? Do you think this 
> approach will work safe in our case? 
> 
> Thanks,
> Sankalp
> 
> On Thu, 25 Mar 2021 at 21:09, Peter Bukowinski wrote:
> Hi Sankalp,
> 
> As long as you have replication, I’ve found it is safest to delete entire 
> topic-partition directories than it is to delete individual log segments from 
> them. For one, you get back more space. Second, you don’t have to worry about 
> metadata corruption.
> 
> When I’ve run out of disk space in the past, the first thing I did was reduce 
> topic retention where I could, waited for the log cleanup routines to run, 
> then I looked for and deleted associated topic partition directories on the 
> brokers with filled disks before starting kafka on them. When the brokers 
> rejoined the cluster, they started catching up on the deleted topic-partition 
> directories.
> 
> --
> Peter Bukowinski
> 
> > On Mar 25, 2021, at 8:00 AM, Sankalp Bhatia wrote:
> > 
> > Hi All,
> > 
> > Brokers in one of our Apache Kafka clusters are continuously crashing as
> > they have run out of disk space. As per my understanding, reducing the
> > value of retention.ms and retention.bytes properties will not work because
> > will not work because
> > the broker is crashing before the log-retention thread can be scheduled (
> > link
> > <https://github.com/apache/kafka/blob/3eaf44ba8ea26a7a820894390e8877d404ddd5a2/core/src/main/scala/kafka/log/LogManager.scala#L394-L398>
> > ).
> > One option we are exploring is if we can manually delete some of the old
> > segment files to make some space in our data disk for the broker to startup
> > while reducing the retention.ms config at the same 
> > time. There is an old
> > email thread (link
> > <https://mail-archives.apache.org/mod_mbox/kafka-users/201403.mbox/%3CCAOG_4Qbwx44T-=vrpkvqgrum8lpmdzl2bxxrgz5c9h1_noh...@mail.gmail.com%3E>)
> > which suggests it is safe to do so, but we want to understand if there have
> > been recent changes to topic-partition metadata which we might end up
> > corrupting if we try this? If so, are there any tips to get around this
> > issue?
> > 
> > Thanks,
> > Sankalp



Re: Is it safe to delete old log segments manually?

2021-03-25 Thread Peter Bukowinski
Hi Sankalp,

As long as you have replication, I’ve found it is safest to delete entire 
topic-partition directories than it is to delete individual log segments from 
them. For one, you get back more space. Second, you don’t have to worry about 
metadata corruption.

When I’ve run out of disk space in the past, the first thing I did was reduce 
topic retention where I could, waited for the log cleanup routines to run, then 
I looked for and deleted associated topic partition directories on the brokers 
with filled disks before starting kafka on them. When the brokers rejoined the 
cluster, they started catching up on the deleted topic-partition directories.
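
For reference, reducing retention on a topic looks something like this (names and
values are just examples):

kafka-configs.sh --bootstrap-server <broker>:9092 --alter --entity-type topics --entity-name <topic> --add-config retention.ms=3600000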

--
Peter Bukowinski

> On Mar 25, 2021, at 8:00 AM, Sankalp Bhatia  wrote:
> 
> Hi All,
> 
> Brokers in one of our Apache Kafka clusters are continuously crashing as
> they have run out of disk space. As per my understanding, reducing the
> value of retention.ms and retention.bytes properties will not work because
> the broker is crashing before the log-retention thread can be scheduled (
> link
> <https://github.com/apache/kafka/blob/3eaf44ba8ea26a7a820894390e8877d404ddd5a2/core/src/main/scala/kafka/log/LogManager.scala#L394-L398>
> ).
> One option we are exploring is if we can manually delete some of the old
> segment files to make some space in our data disk for the broker to startup
> while reducing the retention.ms config at the same time. There is an old
> email thread (link
> <https://mail-archives.apache.org/mod_mbox/kafka-users/201403.mbox/%3CCAOG_4Qbwx44T-=vrpkvqgrum8lpmdzl2bxxrgz5c9h1_noh...@mail.gmail.com%3E>)
> which suggests it is safe to do so, but we want to understand if there have
> been recent changes to topic-partition metadata which we might end up
> corrupting if we try this? If so, are there any tips to get around this
> issue?
> 
> Thanks,
> Sankalp


Re: kafka log.retention.bytes

2021-02-24 Thread Peter Bukowinski
log.retention.bytes is a broker-level config that sets the maximum size of a 
topic partition on a broker, so it will apply to all topics…

unless a topic has the retention.bytes property configured — this is a 
topic-level config and only applies to a single topic — in which case that 
takes precedence.

Kafka does not have a built-in mechanism for preventing full disks. You must do 
some topic growth prediction and set your topic retentions to be proportional 
to the total storage available per broker.
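
As a rough sketch (the numbers are purely illustrative), you might set a broker-wide
default in server.properties and then override individual topics that need more or
less:

# server.properties: default cap per partition replica on each broker
log.retention.bytes=53687091200

# per-topic override, which takes precedence over the broker default
kafka-configs.sh --bootstrap-server <broker>:9092 --alter --entity-type topics --entity-name <topic> --add-config retention.bytes=107374182400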

As with most times disks fill up, it’s not enjoyable dealing with the fallout, 
so spending the time to get it right is well worth it.

-- Peter (from phone)

> On Feb 24, 2021, at 7:46 AM, Calvin Chen  wrote:
> 
> Hi all
> 
> I have question about Kafka topic log retention on bytes, does the 
> log.retention.bytes apply to each topic or it apply to all topics in broker? 
> If it apply to each topic, then, when topic numbers keep growing in broker, 
> how can we make sure total disk size for all topic logs will not exceed total 
> disk capacity of that broker?
> 
> Thanks
> Calvin
> 


Intended behavior when a broker loses its log volume

2020-10-11 Thread Peter Bukowinski
Greeting, all.

What is the expected behavior of a broker when it loses its only configured 
data log directory?

I’m running kafka 2.2.1 in aws and we had an outage caused by the loss of an 
attached volume on one of the brokers. The broker did not relinquish leadership 
of its topic partitions when this occurred, so it caused an outage that was 
only mitigated after we restarted the broker, forcing leadership changes. I run 
kafka on bare metal with JBOD data dirs, and losing a disk in those clusters 
does not cause an outage.

I’m curious what I should expect with only one storage location per broker.

—
Peter Bukowinski

Re: Can we use VIP ip rather than Kafka Broker host name in bootstrap string

2020-08-26 Thread Peter Bukowinski
I do something like this in my environment to simplify things. We use a consul 
service address, e.g ‘kafka.service.subdomain.consul', to provide the VIP, 
which returns the address of a live broker in the cluster. Kafka clients use 
that address in their configs. It works very well.
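
In client terms that just means something like this in the producer/consumer config
(the port is whatever your listeners use):

bootstrap.servers=kafka.service.subdomain.consul:9092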

—
Peter

> On Aug 26, 2020, at 11:54 AM, manoj.agraw...@cognizant.com wrote:
> 
> Hi All ,
> Can we use VIP ip rather than Kafka Broker host name in bootstrap string  at 
> producer side ?
> Any concern or recommendation way


Re: Kafka BrokerState Metric Value 3

2020-08-19 Thread Peter Bukowinski
The broker state metric just reports on the state of the broker itself, not 
whether it is in sync. A replacement broker will quickly reach a broker state 
of 3 on startup even though it has to catch up on many replicas. Don’t rely on 
it for checking if a cluster/broker is healthy with no under-replicated 
partitions.

For that, you can look at the underreplicated partition count metric.
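
If you want the JMX name, the under-replicated partition count should be exposed as
something like:

kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions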

-- Peter (from phone)

> On Aug 19, 2020, at 12:52 AM, Dhirendra Singh  wrote:
> 
> So is this metric just gives information that broker process up and running
> ? or does it indicate something more of broker state or partitions it hold ?
> 
> 
> 
>> On Mon, Aug 17, 2020 at 6:17 PM Karolis Pocius
>>  wrote:
>> 
>> I tried using this metric for determining when the broker is back in the
>> cluster and became the leader for partitions it owned before restart, but
>> that's not the case.
>> 
>> In the end I've settled for checking
>> kafka.server:name=LeaderCount,type=ReplicaManager which tells me when the
>> broker is actually operational and serving data.
>> 
>> On Mon, Aug 17, 2020 at 3:29 PM Dhirendra Singh 
>> wrote:
>> 
>>> I have a question regarding Kafka BrokerState Metric value 3. According
>> to
>>> the documentation value 3 means running state.
>>> What does this running state mean for the broker? Does it mean data of
>> all
>>> partitions on this broker is in sync ?
>>> Is it safe to assume that when broker transition to state 3 after restart
>>> it recovered all partitions data from the leader and is in sync with the
>>> leaders ?
>>> 
>>> Thanks,
>>> dsingh
>>> 
>> 


Re: Kafka compatibility with ZK

2020-08-02 Thread Peter Bukowinski
That procedure looks safe and sane to me, Marina.

> On Aug 2, 2020, at 10:04 AM, Marina Popova  
> wrote:
> 
> 
> Actually, I'm very interested in your experience as well. I'm about to 
> start the same (similar) upgrade - from Kafka 0.11/ZK3.4.13 to Kafka 2.4/ZK 
> 3.5.6
> 
> I have Kafka and ZK as separate clusters.
> 
> My plan is :
> 1. rolling upgrade the Kafka cluster to 2.4 - using the 
> inter.broker.protocol.version set to 0.11 at first
> 2. rolling upgrade ZK cluster to 3.5.6
> 3. set inter.broker.protocol.version=2.4.0 and rolling restart the Kafka 
> cluster again
> 
> Anybody sees a problem with this approach?
> 
> 
> thanks,
> Marina
> 
> 
> Sent with ProtonMail Secure Email.
> 
> ‐‐‐ Original Message ‐‐‐
>> On Thursday, July 23, 2020 4:01 PM, Andrey Klochkov  
>> wrote:
>> 
>> Hello,
>> We're upgrading our Kafka from 1.1.0 to 2.4.1 and I'm wondering if ZK needs
>> to be upgraded too (we're currently on 3.4.6). The upgrade guide says that
>> "kafka has switched to the XXX version of ZK" but never says if switching
>> to a newer ZK is mandatory or not. What are the guidelines on keeping Kafka
>> and ZK compatible?
>> 
>> ---
>> 
>> Andrey Klochkov
> 
> 


Re: Kafka compatibility with ZK

2020-07-23 Thread Peter Bukowinski
Agreed. We use a cloudera distribution of zookeeper that is versioned at 3.4.5 
(plus a bunch of backported patches) with kafka 2.4 and haven’t had any issues.


> On Jul 23, 2020, at 1:19 PM, Andrey Klochkov  wrote:
> 
> We are running a separate ZK cluster and its version is not really tied to
> the version of Kafka we're using.
> 
> I have seen the Confluent compatibility matrix and based on that our
> *current* version of Kafka is not compatible with our version of ZK, and we
> haven't seen any problems with that. My suspicion is that Confluent might
> be using some of the newer ZK features such as e.g. dynamic
> configuration and that's where their requirements come from, but that
> doesn't mean Kafka requires the versions of ZK that Confluent lists as
> required for the Confluent platform.
> 
> On Thu, Jul 23, 2020 at 1:06 PM M. Manna  wrote:
> 
>> Hi,
>> 
>> AFAIK, ZK is packed with Kafka. So if you upgrade to 2.4.1 you’ll get what
>> is in 2.4.1.
>> 
>> It’s a little different however, if you’re hosting ZK in a different host
>> running independently of Kafka.
>> 
>> What’s your situation ?
>> 
>> 
>> 
>> On Thu, 23 Jul 2020 at 21:02, Andrey Klochkov 
>> wrote:
>> 
>>> Hello,
>>> We're upgrading our Kafka from 1.1.0 to 2.4.1 and I'm wondering if ZK
>> needs
>>> to be upgraded too (we're currently on 3.4.6). The upgrade guide says
>> that
>>> "kafka has switched to the XXX version of ZK" but never says if switching
>>> to a newer ZK is mandatory or not. What are the guidelines on keeping
>> Kafka
>>> and ZK compatible?
>>> 
>>> --
>>> Andrey Klochkov
>>> 
>> 
> 
> 
> -- 
> Andrey Klochkov



Re: Kafka compatibility with ZK

2020-07-23 Thread Peter Bukowinski
Zookeeper is not part of the kafka project and must be installed separately. 
Confluent maintain a version compatibility table you can use as a reference: 
https://docs.confluent.io/current/installation/versions-interoperability.html#zk
 


> On Jul 23, 2020, at 1:05 PM, M. Manna  wrote:
> 
> Hi,
> 
> AFAIK, ZK is packed with Kafka. So if you upgrade to 2.4.1 you’ll get what
> is in 2.4.1.
> 
> It’s a little different however, if you’re hosting ZK in a different host
> running independently of Kafka.
> 
> What’s your situation ?
> 
> 
> 
> On Thu, 23 Jul 2020 at 21:02, Andrey Klochkov  wrote:
> 
>> Hello,
>> We're upgrading our Kafka from 1.1.0 to 2.4.1 and I'm wondering if ZK needs
>> to be upgraded too (we're currently on 3.4.6). The upgrade guide says that
>> "kafka has switched to the XXX version of ZK" but never says if switching
>> to a newer ZK is mandatory or not. What are the guidelines on keeping Kafka
>> and ZK compatible?
>> 
>> --
>> Andrey Klochkov
>> 



Re: How to Change number of partitions without Rolling restart?

2020-06-21 Thread Peter Bukowinski
You can’t use a wildcard and must address each topic individually. You can 
automate it with a for loop that takes an array/list of topics as the item to 
iterate over.
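
A minimal sketch of that loop, reusing the command from earlier in the thread (the
topic names and partition count are just examples):

for t in topic-a topic-b topic-c; do
  ./bin/kafka-topics.sh --alter --zookeeper localhost:2181 --topic "$t" --partitions 6
done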

-- Peter Bukowinski

> On Jun 21, 2020, at 9:16 PM, sunil chaudhari  
> wrote:
> 
> Manoj,
> You mean I have execute this command manually for all 350 Topics which I
> already have?
> Is there any possibility I can use any wild cards?
> 
> 
>> On Mon, 22 Jun 2020 at 9:28 AM,  wrote:
>> 
>> You can use below command to alter to partition
>> 
>> ./bin/kafka-topics.sh --alter --zookeeper localhost:2181 --topic my-topic
>> --partitions 6
>> 
>> Thanks
>> Manoj
>> 
>> 
>> 
>> On 6/21/20, 7:38 PM, "sunil chaudhari" 
>> wrote:
>> 
>>[External]
>> 
>> 
>>Hi,
>>I already have 350 topics created. Please guide me how can I do that
>> for
>>these many topics?
>>Also I want each new topic to be created with more number partitions
>>automatically than previous number 3, which I had set in properties.
>> 
>>Regards,
>>Sunil.
>> 
>>On Mon, 22 Jun 2020 at 6:31 AM, Liam Clarke-Hutchinson <
>>liam.cla...@adscale.co.nz> wrote:
>> 
>>> Hi Sunil,
>>> 
>>> The broker setting num.partitions only applies to automatically
>> created
>>> topics (if that is enabled) at the time of creation. To change
>> partitions
>>> for a topic you need to use kafka-topics.sh to do so for each topic.
>>> 
>>> Kind regards,
>>> 
>>> Liam Clarke-Hutchinson
>>> 
>>> On Mon, Jun 22, 2020 at 3:16 AM sunil chaudhari <
>>> sunilmchaudhar...@gmail.com>
>>> wrote:
>>> 
>>>> Hi,
>>>> I want to change number of partitions for all topics.
>>>> How can I change that? Is it server.properties which I need to
>> change?
>>>> Then, in that case I have to restart broker right?
>>>> 
>>>> I checked from confluent control center, there is no option to
>> change
>>>> partitions.
>>>> 
>>>> Please advise.
>>>> 
>>>> Regards,
>>>> Sunil
>>>> 
>>> 
>> 
>> 
>> This e-mail and any files transmitted with it are for the sole use of the
>> intended recipient(s) and may contain confidential and privileged
>> information. If you are not the intended recipient(s), please reply to the
>> sender and destroy all copies of the original message. Any unauthorized
>> review, use, disclosure, dissemination, forwarding, printing or copying of
>> this email, and/or any action taken in reliance on the contents of this
>> e-mail is strictly prohibited and may be unlawful. Where permitted by
>> applicable law, this e-mail and other e-mail communications sent to and
>> from Cognizant e-mail addresses may be monitored.
>> 


Re: Kafka partitions replication issue

2020-06-17 Thread Peter Bukowinski


> On Jun 17, 2020, at 5:16 AM, Karnam, Sudheer  wrote:
> 
> Team,
> We are using kafka version 2.3.0 and we are facing issue with brokers 
> replication
> 
> 1.Kafka has 6 brokers.
> 2.Mainly 7 topics exist in kafka cluster and each topic has 128 partitions.
> 3.Each partition has 3 in-sync-replicas and these are distributed among 6 
> kafka brokers.
> 4.All partitions has preferred leader and "Auto Leader Rebalance Enable" 
> configuration enabled.
> Issue:
> We had a kafka broker-3 failure because of hardware issues and partitions 
> having broker-3 as leader are disrupted.
> As per kafka official page, partitions should elect new leader once preferred 
> leader fails.
> 
> [2020-06-01 14:02:25,029] ERROR [ReplicaManager broker=3] Error processing 
> append operation on partition object-xxx-xxx-xx-na4-93 
> (kafka.server.ReplicaManager)
> org.apache.kafka.common.errors.NotEnoughReplicasException: Number of insync 
> replicas for partition object-xxx-xxx-xx-na4-93 is [1], below required 
> minimum [2]
> 
> Above error message found in kafka logs,
> " object-xxx-xxx-xx-na4-93 " topic has 128 partition and 93rd partition has 3 
> replicas. It is distributed among (broker-3,broker-2,broker-4).
> Broker -3 is the preferred leader.
> When broker-3 failed, Leader position should move to any one of 
> (broker-2,broker-4) but it didn't happened.
> As per error message, whenever leader is failing it is throwing error by 
> stating only one insync replica available.
> 
> Please help us in finding root cause for not selecting new leader.
> 
> 
> Thanks,
> Sudheer

Hi Sudheer,

What do you have `replica.lag.time.max.ms` set to for your cluster? Also, are 
your producers using `acks=-1` or `acks=all`? If the replica lag time is too 
short or you are using `acks=1`, then it’s likely that when broker 3 failed, 
both followers for the partition you mention had not yet caught up with the 
leader, so the cluster is unable to meet the min.insync.replicas count of 2.

You have a few choices you can make. If you value topic availability over 
complete data integrity, then you can set `min.insync.replicas=1`, or set 
`unclean.leader.election.enable=true`. The former will keep a partition online 
with only one in-sync replica. The latter will allow a replica that hadn’t 
fully caught up to the leader to become a leader.

I have both of these set in my environment since I have the luxury of not 
dealing with transactional data and “best effort” delivery is sufficient for my 
needs. In practice, the amount of loss we see is an extremely small fraction of 
the total data pushed through kafka and only occurs around broker failures.
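
If you want to go that route, a per-topic override along these lines should work 
(the topic name and zookeeper host are placeholders, and the exact flags can vary 
a bit between kafka versions):

  kafka-configs.sh --zookeeper zk1:2181 --entity-type topics --entity-name my-topic \
    --alter --add-config min.insync.replicas=1
  kafka-configs.sh --zookeeper zk1:2181 --entity-type topics --entity-name my-topic \
    --alter --add-config unclean.leader.election.enable=true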

—
Peter



Re: Disk space - sharp increase in usage

2020-06-02 Thread Peter Bukowinski


> On Jun 2, 2020, at 12:56 AM, Victoria Zuberman 
>  wrote:
> 
> Hi,
> 
> Background:
> Kafka cluster
> 7 brokers, with 4T disk each
> version 2.3 (recently upgraded from 0.1.0 via 1.0.1)
> 
> Problem:
> Used disk space went from 40% to 80%.
> Looking for root cause.
> 
> Suspects:
> 
>  1.  Incoming traffic
> 
> Ruled out, according to metrics no significant change in “bytes in” for 
> topics in cluster
> 
>  1.  Upgrade
> 
> The raise started on the day of upgrade to 2.3
> 
> But we upgraded another cluster in the same way and we don’t see similar 
> issue there
> 
> Is there a known change or issue at 2.3 related to disk space usage?
> 
>  1.  Replication factor
> 
> Is there a way to see whether replication factor of any topic was changed 
> recently? Didn’t find in metrics...

You can use the kafka-topics.sh script to check the replica count for all your 
topics. Upgrading would not have affected the replica count, though.
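
As a rough one-liner, something like this will surface any topic whose replica 
count differs from what you expect (zookeeper host and the expected factor of 3 
are placeholders; the spacing of the describe output varies between versions):

  kafka-topics.sh --zookeeper zk1:2181 --describe | grep ReplicationFactor | grep -v 'ReplicationFactor:3'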

>  1.  Retention
> 
> Is there a way to see whether retention was changed recently? Didn’t find in 
> metrics...

You can use kafka-topics.sh --zookeeper host:2181 --describe 
--topics-with-overrides
to list any topics with non-default retention, but I’m guessing that’s not it.

If your disk usage went from 40 to 80% on all brokers — effectively doubled — 
it could be that your kafka data log directory path(s) changed during the 
upgrade. As you upgraded each broker and (re)started the kafka service, it would 
have left the existing data under the old path and created new topic partition 
directories and logs under the new path as it rejoined the cluster. Have you 
verified that your data log directory locations are the same as they used to be?
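
A quick way to check, as a rough sketch (the properties path and data paths are 
placeholders for your own):

  grep -E '^log\.dirs?=' /etc/kafka/server.properties
  du -sh /old/path/kafka-logs /new/path/kafka-logs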

> Would appreciate any other ideas or investigation leads
> 
> Thanks,
> Victoria
> 
> ---
> NOTICE:
> This email and all attachments are confidential, may be proprietary, and may 
> be privileged or otherwise protected from disclosure. They are intended 
> solely for the individual or entity to whom the email is addressed. However, 
> mistakes sometimes happen in addressing emails. If you believe that you are 
> not an intended recipient, please stop reading immediately. Do not copy, 
> forward, or rely on the contents in any way. Notify the sender and/or 
> Imperva, Inc. by telephone at +1 (650) 832-6006 and then delete or destroy 
> any copy of this email and its attachments. The sender reserves and asserts 
> all rights to confidentiality, as well as any privileges that may apply. Any 
> disclosure, copying, distribution or action taken or omitted to be taken by 
> an unintended recipient in reliance on this message is prohibited and may be 
> unlawful.
> Please consider the environment before printing this email.


Re: Partitioning issue when a broker is going down

2020-05-17 Thread Peter Bukowinski


> On May 17, 2020, at 11:45 AM, Victoria Zuberman 
>  wrote:
> 
>  Regards acks=all:
> -
> Interesting point. Will check acks and min.insync.replicas values.
> If I understand the root cause that you are suggesting correctly, given my 
> RF=2 and 3 brokers in cluster:
> min.insync.replicas > 1 and acks=all, removing one broker ---> partition 
> that had a replica on the removed broker can't get written until the replica 
> is up on another broker?

That is correct. From a producer standpoint, the unaffected partitions will 
still be able to accept data, so depending on data rate and message size, 
producers may not be negatively affected by the missing broker.

> Regards number of partitions
> -
> The producer to this topic is using librdkafka, using partioner_cb callback, 
> which receives number of partition as partitions_cnt.

This makes sense: when called, you will get the partitions that are able to accept 
data. When a broker goes down and some topics become under-replicated, and your 
producer settings exclude the remaining replicas of those partitions as valid 
targets, then partitions_cnt will only enumerate the remaining partitions.

> Still trying to understand how the library obtains partitions_cnt value.
> I wonder if the behavior is similar to Java library, where it the default 
> partitioner uses the number of available partitions as the number of current 
> partitions...

The logic is similar as that is how kafka is designed. The client will fetch 
the topic’s metadata (including partitions available for writing) on connect, 
on error, and by the interval determined by topic.metadata.refresh.interval.ms, 
unless it is set to -1.

> On 17/05/2020, 20:59, "Peter Bukowinski"  wrote:
> 
> 
>If your producer is set to use acks=all, then it won’t be able to produce 
> to the topic partitions that had replicas on the missing broker until 
> the replacement broker has finished catching up to be included in the ISR.
> 
>What method are you using that reports on the number of topic partitions? 
> If some partitions go offline, the cluster still knows how many there are 
> supposed to be, so I’m curious what is reporting 10 when there should be 15.
> 
>-- Peter
> 
>> On May 17, 2020, at 10:36 AM, Victoria Zuberman 
>>  wrote:
>> 
>> Hi,
>> 
>> Kafka cluster with 3 brokers, version 1.0.1.
>> Topic with 15 partitions, replication factor 2. All replicas in sync.
>> Bringing down one of the brokers (ungracefully), then adding a broker in 
>> version 1.0.1
>> 
>> During this process, are we expected either of the following to happen:
>> 
>> 1.  Some of the partitions become unavailable for producer to write to
>> 2.  Cluster reports the number of partitions at the topic as 10 and not 15
>> It seems like both issues take place in our case, for about a minute.
>> 
>> We are trying to understand whether it is an expected behavior and if not, 
>> what can be causing it.
>> 
>> Thanks,
>> Victoria
> 
> 


Re: Partitioning issue when a broker is going down

2020-05-17 Thread Peter Bukowinski
If your producer is set to use acks=all, then it won’t be able to produce to 
the topic partitions that had replicas on the missing broker until the 
replacement broker has finished catching up to be included in the ISR.

What method are you using that reports on the number of topic partitions? If 
some partitions go offline, the cluster still knows how many there are supposed 
to be, so I’m curious what is reporting 10 when there should be 15.

-- Peter

> On May 17, 2020, at 10:36 AM, Victoria Zuberman 
>  wrote:
> 
> Hi,
> 
> Kafka cluster with 3 brokers, version 1.0.1.
> Topic with 15 partitions, replication factor 2. All replicas in sync.
> Bringing down one of the brokers (ungracefully), then adding a broker in 
> version 1.0.1
> 
> During this process, are we expected either of the following to happen:
> 
>  1.  Some of the partitions become unavailable for producer to write to
>  2.  Cluster reports the number of partitions at the topic as 10 and not 15
> It seems like both issues take place in our case, for about a minute.
> 
> We are trying to understand whether it is an expected behavior and if not, 
> what can be causing it.
> 
> Thanks,
> Victoria


Re: Kafka topic partition directory

2020-03-28 Thread Peter Bukowinski
Kafka doesn’t monitor the contents of the log data directories unless it 
created the file or directory. If it didn’t create the directory/file it will 
ignore it.

-- Peter

> On Mar 28, 2020, at 4:17 PM, anila devi  
> wrote:
> 
> Hi Users, 
> If I create a directory or a file in the same directory where kafka creates 
> partition topic, the kafka broker node does not restart. Is it expected ? 
> Thanks,Dhiman
> 


Re: log.dirs and SSDs

2020-03-11 Thread Peter Bukowinski
Hah :)

I think this deserves an experiment. I’d try setting up some tests with one, 
two, four, and eight log directories per disk and running some performance 
tests. I’d be interested to see your results.

> On Mar 11, 2020, at 5:45 PM, Eugen Dueck  wrote:
> 
> I'm asking the questions here! 
> So is that the way to tune the broker if it does not achieve disk throughput?
> 
> ____
> From: Peter Bukowinski 
> Sent: March 12, 2020 9:38
> 
> Couldn’t the same be accomplished by increasing the num.io.threads broker 
> setting?
> 
>> On Mar 11, 2020, at 5:15 PM, Eugen Dueck  wrote:
>> 
>> So there is not e.g. a single thread responsible per directory in log.dirs 
>> that could become a bottleneck relative to SSD throughput of GB/s?
>> 
>> This is in fact the case for Apache Pulsar, and the openmessaging benchmark 
>> uses 4 directories on the same SSD to increase throughput.
>> 
>> 
>> From: Peter Bukowinski 
>> Sent: March 12, 2020 8:51
>> 
>>> On Mar 11, 2020, at 4:28 PM, Eugen Dueck  wrote:
>>> 
>>> So log.dirs should contain only one entry per HDD disk, to avoid random 
>>> seeks.
>>> What about SSDs? Can throughput be increased by specifying multiple 
>>> directories on the same SSD?
>> 
>> 
>> Given a constant number of partitions, I don’t see any advantage to 
>> splitting partitions among multiple log directories vs. keeping them all in 
>> one (per disk). You’d still have the same total number of topic-partition 
>> directories and the same number of topic-partition leaders.
>> 
>> If you want to increase throughput, focus on using the appropriate number of 
>> partitions.
>> 
>> —
>> Peter Bukowinski
> 



Re: log.dirs and SSDs

2020-03-11 Thread Peter Bukowinski
Couldn’t the same be accomplished by increasing the num.io.threads broker 
setting?

> On Mar 11, 2020, at 5:15 PM, Eugen Dueck  wrote:
> 
> So there is not e.g. a single thread responsible per directory in log.dirs 
> that could become a bottleneck relative to SSD throughput of GB/s?
> 
> This is in fact the case for Apache Pulsar, and the openmessaging benchmark 
> uses 4 directories on the same SSD to increase throughput.
> 
> ________
> From: Peter Bukowinski 
> Sent: March 12, 2020 8:51
> To: users@kafka.apache.org 
> Subject: Re: log.dirs and SSDs
> 
>> On Mar 11, 2020, at 4:28 PM, Eugen Dueck  wrote:
>> 
>> So log.dirs should contain only one entry per HDD disk, to avoid random 
>> seeks.
>> What about SSDs? Can throughput be increased by specifying multiple 
>> directories on the same SSD?
> 
> 
> Given a constant number of partitions, I don’t see any advantage to splitting 
> partitions among multiple log directories vs. keeping them all in one (per 
> disk). You’d still have the same total number of topic-partition directories 
> and the same number of topic-partition leaders.
> 
> If you want to increase throughput, focus on using the appropriate number of 
> partitions.
> 
> —
> Peter Bukowinski



Re: log.dirs and SSDs

2020-03-11 Thread Peter Bukowinski
> On Mar 11, 2020, at 4:28 PM, Eugen Dueck  wrote:
> 
> So log.dirs should contain only one entry per HDD disk, to avoid random seeks.
> What about SSDs? Can throughput be increased by specifying multiple 
> directories on the same SSD?


Given a constant number of partitions, I don’t see any advantage to splitting 
partitions among multiple log directories vs. keeping them all in one (per 
disk). You’d still have the same total number of topic-partition directories 
and the same number of topic-partition leaders.

If you want to increase throughput, focus on using the appropriate number of 
partitions.

—
Peter Bukowinski

Re: what happened in case of single disk failure

2020-03-11 Thread Peter Bukowinski
Yes, that’s correct. While a broker is down:

* all topic partitions assigned to that broker will be under-replicated
* topic partitions with an unmet minimum ISR count will be offline
* leadership of partitions meeting the minimum ISR count will move to the next 
  in-sync replica in the replica list
* if no in-sync replica exists for a topic-partition, it will be offline
Setting unclean.leader.election.enable=true will allow an out-of-sync replica 
to become a leader.
If topic partition availability is more important to you than data integrity, 
you should allow unclean leader election.
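
To see the impact while the broker is down, kafka-topics.sh can list the affected 
partitions directly (the zookeeper host is a placeholder):

  kafka-topics.sh --zookeeper zk1:2181 --describe --under-replicated-partitions
  kafka-topics.sh --zookeeper zk1:2181 --describe --unavailable-partitions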


> On Mar 11, 2020, at 6:11 AM, 张祥  wrote:
> 
> Hi, Peter, following what we talked about before, I want to understand what
> will happen when one broker goes down, I would say it will be very similar
> to what happens under disk failure, except that the rules apply to all the
> partitions on that broker instead of only one malfunctioned disk. Am I
> right? Thanks.
> 
> 张祥  wrote on Thu, Mar 5, 2020, at 9:25 AM:
> 
>> Thanks Peter, really appreciate it.
>> 
>> Peter Bukowinski  wrote on Wed, Mar 4, 2020, at 11:50 PM:
>> 
>>> Yes, you should restart the broker. I don’t believe there’s any code to
>>> check if a Log directory previously marked as failed has returned to
>>> healthy.
>>> 
>>> I always restart the broker after a hardware repair. I treat broker
>>> restarts as a normal, non-disruptive operation in my clusters. I use a
>>> minimum of 3x replication.
>>> 
>>> -- Peter (from phone)
>>> 
>>>> On Mar 4, 2020, at 12:46 AM, 张祥  wrote:
>>>> 
>>>> Another question, according to my memory, the broker needs to be
>>> restarted
>>>> after replacing disk to recover this. Is that correct? If so, I take
>>> that
>>>> Kafka cannot know by itself that the disk has been replaced, manually
>>>> restart is necessary.
>>>> 
>>>> 张祥  wrote on Wed, Mar 4, 2020, at 2:48 PM:
>>>> 
>>>>> Thanks Peter, it makes a lot of sense.
>>>>> 
>>>>> Peter Bukowinski  wrote on Tue, Mar 3, 2020, at 11:56 AM:
>>>>> 
>>>>>> Whether your brokers have a single data directory or multiple data
>>>>>> directories on separate disks, when a disk fails, the topic partitions
>>>>>> located on that disk become unavailable. What happens next depends on
>>> how
>>>>>> your cluster and topics are configured.
>>>>>> 
>>>>>> If the topics on the affected broker have replicas and the minimum ISR
>>>>>> (in-sync replicas) count is met, then all topic partitions will remain
>>>>>> online and leaders will move to another broker. Producers and
>>> consumers
>>>>>> will continue to operate as usual.
>>>>>> 
>>>>>> If the topics don’t have replicas or the minimum ISR count is not met,
>>>>>> then the topic partitions on the failed disk will be offline.
>>> Producers can
>>>>>> still send data to the affected topics — it will just go to the online
>>>>>> partitions. Consumers can still consume data from the online
>>> partitions.
>>>>>> 
>>>>>> -- Peter
>>>>>> 
>>>>>>>> On Mar 2, 2020, at 7:00 PM, 张祥  wrote:
>>>>>>>> 
>>>>>>>> Hi community,
>>>>>>>> 
>>>>>>>> I ran into disk failure when using Kafka, and fortunately it did not
>>>>>> crash
>>>>>>> the entire cluster. So I am wondering how Kafka handles multiple
>>> disks
>>>>>> and
>>>>>>> it manages to work in case of single disk failure. The more detailed,
>>>>>> the
>>>>>>> better. Thanks !
>>>>>> 
>>>>> 
>>> 
>> 



Re: Adding additional nodes to Existing ZK cluster

2020-03-07 Thread Peter Bukowinski
With a single zk in your zookeeper connect string, broker restarts are 
vulnerable to a single point of failure. If that zookeeper is offline, the 
broker will not start. You want at least two zookeepers in the connect string — 
it’s the same reason you should put more than one kafka broker in client 
bootstrap configs.

You can probably get away with just updating kafka broker settings with the 
additional zookeepers and not restarting the broker service, since the 
additional zookeepers wouldn’t be useful until the next restart, anyway.
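
For example, the broker-side property would end up looking roughly like this in 
server.properties (hostnames are placeholders):

  zookeeper.connect=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181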

-- Peter

> On Mar 7, 2020, at 8:40 PM, sunil chaudhari  
> wrote:
> 
> Hi Peter,
> That was great explanation.
> However I have question about the last stage where you mentioned to update
> the zookeeper server in the services where single zookeeper is used.
> Why do I need to do that?
> Is it because only single zookeeper is used and you want to make sure high
> availability of zookeeper?
> 
> What if tomorrow I add 2 more instances of zookeeper, total 5. Is it
> required to update 2 new zK instances to my kafka brokers?
> 
> 
> Regards,
> Sunil.
> 
>> On Sat, 7 Mar 2020 at 11:08 PM, Peter Bukowinski  wrote:
>> 
>> This change will require a brief interruption for services depending on the
>> current zookeeper — but only for the amount of time it takes the service on
>> the original zookeeper to restart. Here’s the basic process:
>> 
>> 1. Provision two new zookeepers hosts, but don’t start the service on the
>> new hosts.
>> 2. Edit the zoo.cfg file on all hosts to contain the following lines
>> (assuming default ports):
>> 
>>server.1=ORIGINAL_ZK_IP:2888:3888
>>server.2=SECOND_ZK_IP:2888:3888
>>server.3=THIRD_ZK_IP:2888:3888
>> 
>> 3. Ensure the myid file on the second node contains ‘2’ and on the third
>> node contains ‘3'
>> 4. Start the second and third zookeeper services and ensure they have
>> become followers:
>> 
>>echo stat | nc ZK2_IP 2181 | grep state
>>echo stat | nc ZK3_IP 2181 | grep state
>> 
>> 5. Restart the original zookeeper service and then check the state of all
>> three zookeepers
>> 
>>echo stat | nc ZK1_IP 2181 | grep state
>>echo stat | nc ZK2_IP 2181 | grep state
>>echo stat | nc ZK3_IP 2181 | grep state
>> 
>> You should see that one of the new zookeepers has become the leader.
>> 
>> Now all that’s left to do is update your zookeeper server strings in the
>> services that were previously using the single zookeeper.
>> 
>> Hope this helped!
>> 
>> —
>> Peter
>> 
>>>> On Mar 6, 2020, at 12:50 PM, JOHN, BIBIN  wrote:
>>> 
>>> Team,
>>> I currently have a 1 node ZK cluster and which is working fine. Now I
>> want to add additional 2 more nodes to ZK cluster. Could you please provide
>> best practice so I don't loose existing data?
>>> 
>>> 
>>> Thanks
>>> Bibin John
>> 
>> 


Re: what happened in case of single disk failure

2020-03-04 Thread Peter Bukowinski
Yes, you should restart the broker. I don’t believe there’s any code to check 
if a Log directory previously marked as failed has returned to healthy.

I always restart the broker after a hardware repair. I treat broker restarts as 
a normal, non-disruptive operation in my clusters. I use a minimum of 3x 
replication.

-- Peter (from phone)

> On Mar 4, 2020, at 12:46 AM, 张祥  wrote:
> 
> Another question, according to my memory, the broker needs to be restarted
> after replacing disk to recover this. Is that correct? If so, I take that
> Kafka cannot know by itself that the disk has been replaced, manually
> restart is necessary.
> 
> 张祥  wrote on Wed, Mar 4, 2020, at 2:48 PM:
> 
>> Thanks Peter, it makes a lot of sense.
>> 
>> Peter Bukowinski  wrote on Tue, Mar 3, 2020, at 11:56 AM:
>> 
>>> Whether your brokers have a single data directory or multiple data
>>> directories on separate disks, when a disk fails, the topic partitions
>>> located on that disk become unavailable. What happens next depends on how
>>> your cluster and topics are configured.
>>> 
>>> If the topics on the affected broker have replicas and the minimum ISR
>>> (in-sync replicas) count is met, then all topic partitions will remain
>>> online and leaders will move to another broker. Producers and consumers
>>> will continue to operate as usual.
>>> 
>>> If the topics don’t have replicas or the minimum ISR count is not met,
>>> then the topic partitions on the failed disk will be offline. Producers can
>>> still send data to the affected topics — it will just go to the online
>>> partitions. Consumers can still consume data from the online partitions.
>>> 
>>> -- Peter
>>> 
>>>>> On Mar 2, 2020, at 7:00 PM, 张祥  wrote:
>>>>> 
>>>>> Hi community,
>>>>> 
>>>>> I ran into disk failure when using Kafka, and fortunately it did not
>>> crash
>>>> the entire cluster. So I am wondering how Kafka handles multiple disks
>>> and
>>>> it manages to work in case of single disk failure. The more detailed,
>>> the
>>>> better. Thanks !
>>> 
>> 


Re: what happened in case of single disk failure

2020-03-02 Thread Peter Bukowinski
Whether your brokers have a single data directory or multiple data directories 
on separate disks, when a disk fails, the topic partitions located on that disk 
become unavailable. What happens next depends on how your cluster and topics 
are configured.

If the topics on the affected broker have replicas and the minimum ISR (in-sync 
replicas) count is met, then all topic partitions will remain online and 
leaders will move to another broker. Producers and consumers will continue to 
operate as usual.

If the topics don’t have replicas or the minimum ISR count is not met, then the 
topic partitions on the failed disk will be offline. Producers can still send 
data to the affected topics — it will just go to the online partitions. 
Consumers can still consume data from the online partitions.

-- Peter

> On Mar 2, 2020, at 7:00 PM, 张祥  wrote:
> 
> Hi community,
> 
> I ran into disk failure when using Kafka, and fortunately it did not crash
> the entire cluster. So I am wondering how Kafka handles multiple disks and
> it manages to work in case of single disk failure. The more detailed, the
> better. Thanks !


Re: when to expand cluster

2020-02-27 Thread Peter Bukowinski
No, it’s not bad. Kafka is designed to serve data to many consumers at the same 
time, whether they are independent of each other or in the same consumer group.

I would encourage you to play with different partition counts and use kafka’s 
performance testing tools (kafka-producer-perf-test.sh and 
kafka-consumer-perf-test.sh) to test throughput in different scenarios and see 
the results for yourself.
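
For example (hosts, topic, and record counts are placeholders, and acks is just 
one of the producer properties you might want to vary between runs):

  kafka-producer-perf-test.sh --topic perf-test --num-records 1000000 --record-size 1024 \
    --throughput -1 --producer-props bootstrap.servers=broker1:9092 acks=1
  kafka-consumer-perf-test.sh --broker-list broker1:9092 --topic perf-test --messages 1000000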

—
Peter

> On Feb 27, 2020, at 1:28 AM, 张祥  wrote:
> 
> I believe no matter the partition count exceeds the broker count, we can
> always have the same number of consumer instances as the partition count.
> 
> So what I want to know is when two partition exists on the same broker, two
> consumer instances will be talking to same broker, is that bad ?
> 
> 张祥  wrote on Thu, Feb 27, 2020, at 2:20 PM:
> 
>> Thanks. What influence does it have for consumers and producers when
>> partition number is more than broker number, which means at least one
>> broker serves two partitions for one topic ? performance wise.
>> 
>> Peter Bukowinski  wrote on Wed, Feb 26, 2020, at 11:02 PM:
>> 
>>> Disk usage is one reason to expand. Another reason is if you need more
>>> ingest or output throughout for your topic data. If your producers aren’t
>>> able to send data to kafka fast enough or your consumers are lagging, you
>>> might benefit from more brokers and more partitions.
>>> 
>>> -- Peter
>>> 
>>>> On Feb 26, 2020, at 12:56 AM, 张祥  wrote:
>>>> 
>>>> In documentation, it is described how to expand cluster:
>>>> 
>>> https://kafka.apache.org/20/documentation.html#basic_ops_cluster_expansion
>>> .
>>>> But I am wondering what the criteria for expand is. I can only think of
>>>> disk usage threshold. For example, suppose several disk usage exceed
>>> 80%.
>>>> Is this correct and is there more ?
>>> 
>> 



Re: when to expand cluster

2020-02-26 Thread Peter Bukowinski
The effect for producers isn’t very significant once your topic partition count 
exceeds your broker count. For consumers — especially if you are using consumer 
groups — the more partitions you have, the more consumer instances you can have 
in a single consumer group. (The maximum number of active consumers in a 
consumer group = the total number of topic partitions assigned to the group.) 

As long as you are not exceeding the broker’s network and disk IO, your total 
consumer throughput goes up with more partitions. Additional network and disk 
IO are a benefit of additional brokers.

--
Peter

> On Feb 26, 2020, at 10:23 PM, 张祥  wrote:
> 
> Thanks. What influence does it have for consumers and producers when
> partition number is more than broker number, which means at least one
> broker serves two partitions for one topic ? performance wise.
> 
> Peter Bukowinski  wrote on Wed, Feb 26, 2020, at 11:02 PM:
> 
>> Disk usage is one reason to expand. Another reason is if you need more
>> ingest or output throughout for your topic data. If your producers aren’t
>> able to send data to kafka fast enough or your consumers are lagging, you
>> might benefit from more brokers and more partitions.
>> 
>> -- Peter
>> 
>>>> On Feb 26, 2020, at 12:56 AM, 张祥  wrote:
>>> 
>>> In documentation, it is described how to expand cluster:
>>> 
>> https://kafka.apache.org/20/documentation.html#basic_ops_cluster_expansion
>> .
>>> But I am wondering what the criteria for expand is. I can only think of
>>> disk usage threshold. For example, suppose several disk usage exceed 80%.
>>> Is this correct and is there more ?
>> 


Re: when to expand cluster

2020-02-26 Thread Peter Bukowinski
Disk usage is one reason to expand. Another reason is if you need more ingest 
or output throughout for your topic data. If your producers aren’t able to send 
data to kafka fast enough or your consumers are lagging, you might benefit from 
more brokers and more partitions.

-- Peter

> On Feb 26, 2020, at 12:56 AM, 张祥  wrote:
> 
> In documentation, it is described how to expand cluster:
> https://kafka.apache.org/20/documentation.html#basic_ops_cluster_expansion.
> But I am wondering what the criteria for expand is. I can only think of
> disk usage threshold. For example, suppose several disk usage exceed 80%.
> Is this correct and is there more ?


Re: Confluent Replicator

2020-02-19 Thread Peter Bukowinski
That is possible as long as you include a topic.rename.format argument in the 
replication.properties file. The origin and destination cluster configs can 
point to the same cluster.

See the example here 
https://docs.confluent.io/current/multi-dc-deployments/replicator/replicator-quickstart.html#configure-and-run-replicator
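
A rough sketch of the relevant part of the replicator properties, based on that 
quickstart (hosts and names are placeholders, and a real config needs the rest of 
the settings shown in the linked example):

  src.kafka.bootstrap.servers=localhost:9092
  dest.kafka.bootstrap.servers=localhost:9092
  topic.whitelist=topicA
  topic.rename.format=${topic}.replica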

-- Peter

> On Feb 19, 2020, at 7:41 PM, George  wrote:
> 
> Hi all.
> 
> is it possible, for testing purposes to replicate topic A from Cluster 1 to
> topic B on cluster 1/same cluster?
> 
> G
> 
> -- 
> You have the obligation to inform one honestly of the risk, and as a person
> you are committed to educate yourself to the total risk in any activity!
> 
> Once informed & totally aware of the risk,
> every fool has the right to kill or injure themselves as they see fit!


Re: Replicas more than replication-factor

2020-02-12 Thread Peter Bukowinski
In cases like this where the situation isn’t self-repairing, I stop the at-fault 
broker and delete the topic-partition directory/directories from the filesystem 
before starting the broker again. Are you using local storage on your brokers?

--
Peter

> On Feb 12, 2020, at 8:33 PM, Madhuri Khattar (mkhattar) 
>  wrote:
> 
> In my case I figured it was broker 3 and rebooted it after deleting 
> /admin/reassign-partitions since it was stuck there. However I am not having 
> any luck.
> I still see 400+ underreplicated partitions. I replaced broker 3 as well but 
> still there is no change after almost 7-8 hours now.
> 
> 
> 
> Madhuri Khattar
> SR ENGINEER.IT ENGINEERING
> mkhat...@cisco.com
> Tel: +1 408 525 5989
> 
> Cisco Systems, Inc.
> 400 East Tasman Drive
> SAN JOSE
> 95134
> United States
> cisco.com
> 
> -Original Message-
> From: Peter Bukowinski  
> Sent: Wednesday, February 12, 2020 8:27 PM
> To: users@kafka.apache.org
> Cc: senthilec...@apache.org
> Subject: Re: Replicas more than replication-factor
> 
> I’ve had this happen a few times when a partition reassignment was underway 
> and one of the brokers that is a destination for the reassignment became 
> unhealthy. This essentially stalls the reassignment indefinitely. The 
> partition with 10 instead of 5 replicas was undergoing a reassignment where 
> all the replicas were being moved to new brokers. When that occurs, there 
> will necessarily be 2x the number of desired replicas for a short time before 
> the old replicas are removed.
> 
> The solution is to identify the faulty broker and make it healthy again. In 
> this case, it looks like broker 59 is at fault since it is the only one not 
> in the ISR set on the under-replicated partitions.
> 
> -- Peter
> 
>> On Feb 12, 2020, at 8:10 PM, SenthilKumar K  wrote:
>> 
>> We are also facing the similar issue in our kafka cluster.
>> 
>> Kafka Version: 2.2.0
>> RF: 5
>> 
>> Partition | Latest Offset | Leader | Replicas | In Sync Replicas | Preferred Leader? | Under Replicated?
>> 0 121 <http://198.18.134.22:9000/clusters/IAD/brokers/121> 
>> (121,50,51,52,53)
>> (52,121,53,50,51) true false
>> 1 122 <http://198.18.134.22:9000/clusters/IAD/brokers/122> 
>> (122,51,52,53,54)
>> (52,53,54,51,122) true false
>> 2 123 <http://198.18.134.22:9000/clusters/IAD/brokers/123> 
>> (123,52,53,54,55)
>> (52,53,54,123,55) true false
>> 3 125 <http://198.18.134.22:9000/clusters/IAD/brokers/125> 
>> (125,53,54,55,56)
>> (56,125,53,54,55) true false
>> 4 127 <http://198.18.134.22:9000/clusters/IAD/brokers/127> 
>> (127,54,55,56,57)
>> (56,57,54,127,55) true false
>> 5 56 <http://198.18.134.22:9000/clusters/IAD/brokers/56>
>> (56,93,57,102,92,96,128,59,95,55) (56,93,57,102,92,96,128,95,55) true 
>> true
>> 6 56 <http://198.18.134.22:9000/clusters/IAD/brokers/56>
>> (56,93,57,60,97,96,129,59,103,95) (56,93,57,60,97,96,129,103,95) true 
>> true
>> 7 57 <http://198.18.134.22:9000/clusters/IAD/brokers/57>
>> (57,60,97,96,59,130,95,104,62,100) (57,60,97,96,130,95,104,62,100) 
>> true true
>> 8 101 <http://198.18.134.22:9000/clusters/IAD/brokers/101>
>> (101,60,65,97,96,105,59,131,62,100) (101,60,65,97,96,105,131,62) true 
>> true
>> 
>> Have a look at the partition ^ 5, total replica for partition 5 is 10 
>> but the replica set to 5. Under-replicated % is 26 for the same topic. 
>> And the partition reassignment is getting stuck for more than 24 hours.
>> 
>> Under-replicated % 26
>> 
>> --Senthil
>>> On Thu, Feb 13, 2020 at 6:32 AM Madhuri Khattar (mkhattar) 
>>>  wrote:
>>> Hi, today there was some network glitch which caused some of my topics 
>>> to have replicas more than the replication-factor:
>>> $ ./kafka-topics --zookeeper sjc-ddzk-01 --topic foodee --describe
>>> Topic:foodee PartitionCount:6   ReplicationFactor:2
>>> Configs:retention.ms=34560,segment.ms
>>> =34560,max.message.bytes=200
>>> Topic: foodee: 0   Leader:

Re: Replicas more than replication-factor

2020-02-12 Thread Peter Bukowinski
I’ve had this happen a few times when a partition reassignment was underway and 
one of the brokers that is a destination for the reassignment became unhealthy. 
This essentially stalls the reassignment indefinitely. The partition with 10 
instead of 5 replicas was undergoing a reassignment where all the replicas were 
being moved to new brokers. When that occurs, there will necessarily be 2x the 
number of desired replicas for a short time before the old replicas are removed.

The solution is to identify the faulty broker and make it healthy again. In 
this case, it looks like broker 59 is at fault since it is the only one not in 
the ISR set on the under-replicated partitions.
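
If you want to confirm that a reassignment is still pending, you can peek at the 
znode directly, e.g. (zookeeper host is a placeholder; note the znode name uses an 
underscore):

  zookeeper-shell.sh zk1:2181 get /admin/reassign_partitions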

-- Peter

> On Feb 12, 2020, at 8:10 PM, SenthilKumar K  wrote:
> 
> We are also facing the similar issue in our kafka cluster.
> 
> Kafka Version: 2.2.0
> RF: 5
> 
> Partition | Latest Offset | Leader | Replicas | In Sync Replicas | Preferred Leader? | Under Replicated?
> 0 121  (121,50,51,52,53)
> (52,121,53,50,51) true false
> 1 122  (122,51,52,53,54)
> (52,53,54,51,122) true false
> 2 123  (123,52,53,54,55)
> (52,53,54,123,55) true false
> 3 125  (125,53,54,55,56)
> (56,125,53,54,55) true false
> 4 127  (127,54,55,56,57)
> (56,57,54,127,55) true false
> 5 56 
> (56,93,57,102,92,96,128,59,95,55) (56,93,57,102,92,96,128,95,55) true true
> 6 56 
> (56,93,57,60,97,96,129,59,103,95) (56,93,57,60,97,96,129,103,95) true true
> 7 57 
> (57,60,97,96,59,130,95,104,62,100) (57,60,97,96,130,95,104,62,100) true true
> 8 101 
> (101,60,65,97,96,105,59,131,62,100) (101,60,65,97,96,105,131,62) true true
> 
> Have a look at the partition ^ 5, total replica for partition 5 is 10 but
> the replica set to 5. Under-replicated % is 26 for the same topic. And the
> partition reassignment is getting stuck for more than 24 hours.
> 
> Under-replicated % 26
> 
> --Senthil
>> On Thu, Feb 13, 2020 at 6:32 AM Madhuri Khattar (mkhattar)
>>  wrote:
>> Hi, today there was some network glitch which caused some of my topics to
>> have replicas more than the replication-factor:
>> $ ./kafka-topics --zookeeper sjc-ddzk-01 --topic foodee --describe
>> Topic:foodee PartitionCount:6   ReplicationFactor:2
>> Configs:retention.ms=34560,segment.ms
>> =34560,max.message.bytes=200
>>  Topic: foodee: 0   Leader: 0
>> Replicas: 0,1   Isr: 0,1
>>  Topic: foodee: 1   Leader: 1
>> Replicas: 1,2   Isr: 1,2
>>  Topic: foodee: 2   Leader: 1
>> Replicas: 2,3,1   Isr: 1,2
>>  Topic: foodee: 3   Leader: 1
>> Replicas: 3,4,1   Isr: 1,4
>>  Topic: foodee: 4   Leader: 0
>> Replicas: 0,4   Isr: 0,4
>>  Topic: foodee: 5   Leader: 0
>> Replicas: 0,2   Isr: 0,2
>> This has happened for many topics.
>> As a result I am having lot of under replicated partitions and partitions
>> and leaders are not evenly distributed:
>> Reassignment is also getting stuck.


Re: JMX Metrics to display disk Usage

2020-01-09 Thread Peter Bukowinski
Kafka does not collect or report topic data filesystem usage. I used 
this collectd project to help me collect the topic usage data and export it to 
graphite:

https://github.com/HubSpot/collectd-kafka-disk/blob/master/README.md

The plugin collects the size of each topic-partition directory on disk. From 
graphite, I sum that data per topic so I can see total disk usage per topic 
across my cluster.
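
For a rough ad-hoc check directly on one broker, something like this works, 
assuming a single log directory at /data/kafka-logs (the path is a placeholder):

  du -sm /data/kafka-logs/* | sed 's/-[0-9]*$//' | awk '{s[$2]+=$1} END {for (t in s) print s[t]" MB", t}' | sort -rn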

-- Peter

> On Jan 9, 2020, at 8:05 PM, JOHN, BIBIN  wrote:
> 
> Is there any metrics available to view disk usage in each broker/topic via 
> JMX exporter? If yes, could you please send me details? Thanks


Re: Topics marked for deletion stuck as ineligible for deletion

2019-12-16 Thread Peter Bukowinski
If it was replaced, and a new broker was brought online with the same id, 
whatever topic partitions had been previously assigned to it should have been 
recreated.

At this point, however, I would shut down the cluster, delete the znodes, 
delete the topic directories from the brokers, then bring the cluster back up.

-- Peter

> On Dec 16, 2019, at 3:00 AM, Vincent Rischmann  wrote:
> 
> It doesn't exist anymore, we replaced it after a hardware failure.
> 
> Thinking about it I don't think I reassigned the partitions for broker 5 to 
> the new broker before deleting these topics, I didn't realize that it was 
> necessary for all brokers to be online.
> 
> Since broker 5 is never coming back again I'm guessing my only choice is to 
> manually modify the znodes ? 
> 
>> On Fri, Dec 13, 2019, at 19:07, Peter Bukowinski wrote:
>> If any brokers are offline, kafka can’t successfully delete a topic. 
>> What’s the state of broker 5?
>> 
>> -- Peter (from phone)
>> 
>>>> On Dec 13, 2019, at 8:55 AM, Vincent Rischmann  
>>>> wrote:
>>> 
>>> Hi,
>>> 
>>> I've deleted a bunch of topics yesterday on our cluster but some are now 
>>> stuck in "marked for deletion".
>>> 
>>> * i've looked in the data directory of every broker and there's no data 
>>> left for the topics, the directory doesn't exist anymore.
>>> * in zookeeper the znode `brokers/topics/mytopic` still exists
>>> * the znode `admin/delete_topics/mytopic` still exists
>>> 
>>> I've tried the following to no avail:
>>> 
>>> * restarting all brokers
>>> * removing the `admin/delete_topics/mytopic` node and re-running 
>>> `kafka-topics.sh --delete --topic mytopic`
>>> 
>>> In the kafka-controller.log of some brokers I see this which seems relevant:
>>> 
>>>   [2019-12-13 10:15:07,244] WARN [Channel manager on controller 6]: Not 
>>> sending request (type=StopReplicaRequest, controllerId=6, 
>>> controllerEpoch=78, deletePartitions=false, partitions=mytopic-17) to 
>>> broker 5, since it is offline. (kafka.controller.ControllerChannelManager)
>>>   [2019-12-13 10:15:07,244] WARN [Channel manager on controller 6]: Not 
>>> sending request (type=StopReplicaRequest, controllerId=6, 
>>> controllerEpoch=78, deletePartitions=false, partitions=mytopic-24) to 
>>> broker 5, since it is offline. (kafka.controller.ControllerChannelManager)
>>> 
>>> and
>>> 
>>>   12061:[2019-12-12 10:35:55,290] INFO [Topic Deletion Manager 1], Handling 
>>> deletion for topics mytopic (kafka.controller.TopicDeletionManager)
>>>   12062:[2019-12-12 10:35:55,292] INFO [Topic Deletion Manager 1], Not 
>>> retrying deletion of topic mytopic at this time since it is marked 
>>> ineligible for deletion (kafka.controller.TopicDeletionManager)
>>> 
>>> Since the data directory is already deleted I'm thinking of simply removing 
>>> the znode `brokers/topics/mytopic` from zookeeper manually.
>>> 
>>> Does anyone has another suggestion ? Is it safe to remove the znode 
>>> manually ?
>>> 
>>> Thanks.
>> 


Re: Topics marked for deletion stuck as ineligible for deletion

2019-12-13 Thread Peter Bukowinski
If any brokers are offline, kafka can’t successfully delete a topic. What’s the 
state of broker 5?

-- Peter (from phone)

> On Dec 13, 2019, at 8:55 AM, Vincent Rischmann  wrote:
> 
> Hi,
> 
> I've deleted a bunch of topics yesterday on our cluster but some are now 
> stuck in "marked for deletion".
> 
> * i've looked in the data directory of every broker and there's no data left 
> for the topics, the directory doesn't exist anymore.
> * in zookeeper the znode `brokers/topics/mytopic` still exists
> * the znode `admin/delete_topics/mytopic` still exists
> 
> I've tried the following to no avail:
> 
> * restarting all brokers
> * removing the `admin/delete_topics/mytopic` node and re-running 
> `kafka-topics.sh --delete --topic mytopic`
> 
> In the kafka-controller.log of some brokers I see this which seems relevant:
> 
>[2019-12-13 10:15:07,244] WARN [Channel manager on controller 6]: Not 
> sending request (type=StopReplicaRequest, controllerId=6, controllerEpoch=78, 
> deletePartitions=false, partitions=mytopic-17) to broker 5, since it is 
> offline. (kafka.controller.ControllerChannelManager)
>[2019-12-13 10:15:07,244] WARN [Channel manager on controller 6]: Not 
> sending request (type=StopReplicaRequest, controllerId=6, controllerEpoch=78, 
> deletePartitions=false, partitions=mytopic-24) to broker 5, since it is 
> offline. (kafka.controller.ControllerChannelManager)
> 
> and
> 
>12061:[2019-12-12 10:35:55,290] INFO [Topic Deletion Manager 1], Handling 
> deletion for topics mytopic (kafka.controller.TopicDeletionManager)
>12062:[2019-12-12 10:35:55,292] INFO [Topic Deletion Manager 1], Not 
> retrying deletion of topic mytopic at this time since it is marked ineligible 
> for deletion (kafka.controller.TopicDeletionManager)
> 
> Since the data directory is already deleted I'm thinking of simply removing 
> the znode `brokers/topics/mytopic` from zookeeper manually.
> 
> Does anyone has another suggestion ? Is it safe to remove the znode manually ?
> 
> Thanks.


Re: More partitions => less throughput?

2019-11-30 Thread Peter Bukowinski
Testing multiple broker VMs on a single host won’t give you accurate 
performance numbers unless that is how you will be deploying kafka in 
production. (Don’t do this.) All your kafka networking is being handled by a 
single host, so instead of being spread out between machines to increase total 
possible throughput, they are competing with each other.

Given that this is the test environment you settled on, you should tune the 
number of partitions taking the number of producers and consumers, and also the 
average message size, into account. If you have only one producer, then a single 
consumer should be sufficient to read the data in real-time. If you have 
multiple producers, you may need to scale up the consumer count and use 
consumer groups.

-- Peter

> On Nov 30, 2019, at 8:57 AM, Tom Brown  wrote:
> 
> I think the number of partitions needs to be tuned to the size of the
> cluster; 64 partitions on what is essentially a single box seems high. Do
> you know what hardware you will be deploying on in production? Can you run
> your benchmark on that instead of a vm?
> 
> —Tom
> 
>> On Thursday, November 28, 2019, Craig Pastro  wrote:
>> 
>> Hello there,
>> 
>> I was wondering if anyone here could help me with some insight into a
>> conundrum that I am facing.
>> 
>> Basically, the story is that I am running three Kafka brokers via docker on
>> a single vm with log.flush.interval.messages = 1 and min.insync.replicas =
>> 2. Then I create two topics: both with replication factor = 3, but one with
>> one partition and the other with 64.
>> 
>> Then I try to run a benchmark using these topics and what I find is as
>> follows:
>> 
>> 1 partition, 1381.02 records/sec,  685.87 ms average latency
>> 64 partitions, 601.00 records/sec, 1298.18 ms average latency
>> 
>> This is the opposite of what I expected. In neither case am I even close to
>> the IOPS of what the disk can handle. So what I would like to know is if
>> there is any obvious reason that I am missing for the slow down with more
>> partitions?
>> 
>> If it is helpful the docker-compose file and the code to do the
>> benchmarking can be found at https://github.com/siyopao/kafka-benchmark.
>> (Any comments or advice on how to make the code better are greatly
>> appreciated!) The benchmarking code is inspired by and very similar to what
>> the bin/kafka-producer-perf-test.sh script does.
>> 
>> Thank you!
>> 
>> Best wishes,
>> Craig
>> 


Re: Broker shutdown slowdown between 1.1.0 and 2.3.1

2019-11-21 Thread Peter Bukowinski
How many partitions are on each of your brokers? That’s a key factor affecting 
shutdown and startup time. Even if it is large, though, I’ve seen a notable 
reduction in shutdown and startup times as I’ve moved from kafka 0.11 to 1.x to 
2.x.

I’m currently doing a rolling restart of a 150-broker cluster running kafka 
2.3.1. The cluster is very busy (~500k msg/sec, ~1GB/sec). Each broker has 
about 65 partitions. Each broker restart cycle (stop/start, rejoin ISR) takes 
about 90 seconds.


> On Nov 21, 2019, at 3:52 PM, Nicholas Feinberg  wrote:
> 
> I've been looking at upgrading my cluster from 1.1.0 to 2.3.1. While
> testing, I've noticed that shutting brokers down seems to take consistently
> longer on 2.3.1. Specifically, the process of 'creating snapshots' seems to
> take several times longer than it did on 1.1.0. On a small testing setup,
> the time needed to create snapshots and shut down goes from ~20s to ~120s;
> with production-scale data, it goes from ~2min to ~30min.
> 
> To allow myself to roll back, I'm still using the 1.1 versions of the
> inter-broker protocol and the message format - is it possible that those
> could slow things down in 2.3.1? If not, any ideas what else could be at
> fault, or what I could do to narrow down the issue further?
> 
> Thanks!
> -Nicholas



Re: Moving partition(s) to different broker

2019-11-11 Thread Peter Bukowinski
If the only replicas for that topic partition exist on brokers 15 and 24 and 
they are both down, then you cannot recover the partition until either of them 
is replaced or repaired and rejoins the cluster. You may need to enable unclean 
leader election, as well.

As you’ve discovered, adding replicas to an offline partition doesn’t 
work. A topic/partition needs to be in a healthy state for that to work.

Is your goal just to have no offline partitions or to recover the data 
contained in the affected partition? Producers and consumers should still be 
able to access the topic in its current state.

--
Peter

> On Nov 11, 2019, at 11:34 PM, SenthilKumar K  wrote:
> 
> Hi Experts, We have seen a problem with partition leader i.e it's set to -1.
> 
> describe o/p:
> Topic: 1453 Partition: 47 Leader: -1 Replicas: 24,15 Isr: 24
> 
> Kafka Version: 2.2.0
> Replication:  2
> Partitions: 48
> 
> Brokers 24 ,15 both are down due to disk errors and we lost the partition
> 47. I tried increasing the replica ( 2 to 3 ) of the partition alone using
> the Kafka partition reassignment tool but that didn't help.
> 
> {
> 
> "version": 1,
> 
> "partitions": [{
> 
> "topic": "1453",
> 
> "partition": 47,
> 
> "replicas": [22, 11, 5],
> 
> "log_dirs": ["any", "any", "any"]
> 
> }]
> 
> }
> 
> 
> Reassignment of partition 1453-47 is still in progress - Its stuck more
> than 3 hours.
> 
> 
> How to recover the partition 47? Pls advise. Thanks!
> 
> --Senthil


Re: Broker that stays outside of the ISR, how to recover

2019-10-18 Thread Peter Bukowinski
Hi Bart,

Before changing anything, I would verify whether or not the affected broker is 
trying to catch up. Have you looked at the broker’s log? Do you see any errors? 
Check your metrics or the partition directories themselves to see if data is 
flowing into the broker.

If you do want to reset the broker to have it start a fresh resync, stop the 
kafka broker service/process, 'rm -rf /path/to/kafka-logs' — check the value of 
your log.dir or log.dirs property in your server.properties file for the path — 
and then start the service again. It should check in with zookeeper and then 
start following the topic partition leaders for all the topic partition 
replicas assigned to it.
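
As a rough outline of that reset, assuming systemd and a single log directory 
(adjust the paths and service name to your own setup):

  systemctl stop kafka
  rm -rf /path/to/kafka-logs/*
  systemctl start kafka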

-- Peter

>> On Oct 18, 2019, at 12:16 AM, Bart van Deenen  
>> wrote:
> Hi all
> 
> We had a Kafka broker failure (too many open files, stupid), and now the 
> partitions on that broker will no longer become part of the ISR set. It's 
> been a few days (organizational issues), and we have significant amounts of 
> data on the ISR partitions.
> 
> In order to make the partitions on the broker become part of the ISR set 
> again, should I:
> 
> * increase `replica.lag.time.max.ms` on the broker to the number of ms that 
> the partitions are behind. I can guesstimate the value to about 7 days, or 
> should I measure it somehow?
> * stop the broker and wipe files (which ones?) and then restart it. Should I 
> also do stuff on zookeeper ?
> 
> Is there any _official_ information on how to deal with this situation?
> 
> Thanks for helping!


Brokers occasionally dropping out of cluster

2019-10-08 Thread Peter Bukowinski
. 
(kafka.coordinator.group.GroupMetadataManager)
[2019-10-07 14:42:27,629] INFO [GroupMetadataManager brokerId=14] Removed 0 
expired offsets in 0 milliseconds. 
(kafka.coordinator.group.GroupMetadataManager)
[2019-10-07 14:46:07,510] INFO [Partition internal_test-33 broker=14] Shrinking 
ISR from 16,17,14 to 14. Leader: (highWatermark: 2007553, endOffset: 2007555). 
Out of sync replicas: (brokerId: 16, endOffset: 2007553) (brokerId: 17, 
endOffset: 2007553). (kafka.cluster.Partition)
[2019-10-07 14:46:07,511] INFO [Partition internal_test-33 broker=14] Cached 
zkVersion [17] not equal to that in zookeeper, skip updating ISR 
(kafka.cluster.Partition)

This "Cached zkVersion” message repeats continuously from here on, until I 
restart the broker service and it rejoins the cluster and resumes replicating. 
There is no detectable hardware issue with the broker.

Here are the corresponding controller log entries:

[2019-10-07 14:45:55,427] INFO [Controller id=24] Newly added brokers: , 
deleted brokers: 14, bounced brokers: , all live brokers: 
1,2,3,4,5,6,7,8,9,10,11,12,13,15,16,17,18,19,20,21,22,23,24,25 
(kafka.controller.KafkaController)
[2019-10-07 14:45:55,477] INFO [RequestSendThread controllerId=24] Shutting 
down (kafka.controller.RequestSendThread)
[2019-10-07 14:45:55,477] INFO [RequestSendThread controllerId=24] Shutdown 
completed (kafka.controller.RequestSendThread)
[2019-10-07 14:45:55,477] INFO [RequestSendThread controllerId=24] Stopped 
(kafka.controller.RequestSendThread)
[2019-10-07 14:45:55,481] INFO [Controller id=24] Broker failure callback for 
14 (kafka.controller.KafkaController)

Has anyone seen this issue on recent versions of kafka?

—
Peter Bukowinski



Re: How Kafka leader replica decides to advance Highwater Mark (HW) based on Kafka producer configurations.

2019-10-08 Thread Peter Bukowinski
The ack method only affects the producer. With acks=0 or 1, HW advancement is 
done asynchronously with the producer, so the leader will continue to accept 
writes regardless of follower status. Only if you have acks=all and min ISR set 
to the replication factor will your acks be synchronous with the HW.

If you are producing to a topic partition that is under-replicated, your 
settings determine whether new data can be written to it or not.

-- Peter (from phone)

> On Oct 8, 2019, at 8:31 AM, Isuru Boyagane  
> wrote:
> 
> Hi Peter,
> 
> Okay. Then, whatever the producer acks configuration is, Leader replica
> waits for *all* other follower replicas to persist the message before
> advancing HW(without considering the *current ISR* set)?
> 
> Thanks
> 
>> On Tue, 8 Oct 2019 at 14:54, Peter Bukowinski  wrote:
>> 
>> The behavior of the high water mark is independent of producer ack
>> settings. It is a property of topic partitions only. (Remember that
>> multiple producers with different configurations can write to the same
>> topic.) The high water mark advances to the latest offset that exists in
>> all topic partition replicas. If your topic has a replication factor of 3,
>> then a partition’s HW will only advance toward the LEO after both replicas
>> are in sync with the leader.
>> 
>> min.insync.replicas controls how many replicas must be in sync when your
>> producer in using acks=all. This does not affect consumers.
>> 
>> HW does affect consumers, since data between the HW and LEO cannot be
>> consumed yet.
>> 
>> --
>> Peter
>> 
>>> On Oct 8, 2019, at 12:35 AM, Isuru Boyagane <
>> isuruboyagane...@cse.mrt.ac.lk> wrote:
>>> 
>>> Hi,
>>> 
>>> 4
>>> 
>>> I read about the Kafka replication protocol. I found that Kafka maintains
>>> LEO and HW. As I understood,
>>> 
>>> LEO: Offset of latest message a replica has seen.
>>> 
>>> HW: Offset of the latest message which is guaranteed that each replica
>> has
>>> persisted into their local log.
>>> 
>>> We can set the following acknowledgement methods in Kafka producers
>>> configurations.
>>> 
>>> 
>>>  1. acks = 0 (Leader replica sends an acknowledgement to the producer
>>> once it has seen the message)
>>> 2. acks = 1 (Leader replica sends an acknowledgement to the producer
>>> once it has persisted the message to its local log)
>>> 3. acks = all (Leader replica sends an acknowledgement to the
>>> producer once every in-sync replica has persisted the message to
>>> its local
>>> log)
>>> 
>>> So my question is how the leader advances the HW depending on the
>>> acknowledgement method message producer uses.
>>> 
>>> What I think is,
>>> 
>>>  1.
>>> 
>>>  for acks = 0, Leader advances the HW when it sees a new message.
>>>  2.
>>> 
>>>  for acks = 1, Leader advances the HW when it has written the new
>> message
>>>  to its local log.
>>>  3.
>>> 
>>>  for acks = all, Leader advances HW when each an every follower sent ack
>>>  that they persisted the message.
>>> 
>>> Is this correct? Can anyone clarify this?
>>> 
>>> Thanks
>> 
> 
> 
> --
> isuru


Re: How Kafka leader replica decides to advance Highwater Mark (HW) based on Kafka producer configurations.

2019-10-08 Thread Peter Bukowinski
The behavior of the high water mark is independent of producer ack settings. It 
is a property of topic partitions only. (Remember that multiple producers with 
different configurations can write to the same topic.) The high water mark 
advances to the latest offset that exists in all topic partition replicas. If 
your topic has a replication factor of 3, then a partition’s HW will only 
advance toward the LEO after both replicas are in sync with the leader.

min.insync.replicas controls how many replicas must be in sync when your 
producer is using acks=all. This does not affect consumers.

HW does affect consumers, since data between the HW and LEO cannot be consumed 
yet.

--
Peter

> On Oct 8, 2019, at 12:35 AM, Isuru Boyagane  
> wrote:
> 
> Hi,
> 
> I read about the Kafka replication protocol. I found that Kafka maintains
> LEO and HW. As I understood,
> 
> LEO: Offset of latest message a replica has seen.
> 
> HW: Offset of the latest message which is guaranteed that each replica has
> persisted into their local log.
> 
> We can set the following acknowledgement methods in Kafka producers
> configurations.
> 
> 
>   1. acks = 0 (Leader replica sends an acknowledgement to the producer
>      once it has seen the message)
>   2. acks = 1 (Leader replica sends an acknowledgement to the producer
>      once it has persisted the message to its local log)
>   3. acks = all (Leader replica sends an acknowledgement to the producer
>      once every in-sync replica has persisted the message to its local log)
> 
> So my question is how the leader advances the HW depending on the
> acknowledgement method the producer uses.
> 
> What I think is,
> 
>   1. for acks = 0, Leader advances the HW when it sees a new message.
>   2. for acks = 1, Leader advances the HW when it has written the new message
>      to its local log.
>   3. for acks = all, Leader advances the HW when each and every follower has
>      acknowledged that it persisted the message.
> 
> Is this correct? Can anyone clarify this?
> 
> Thanks


Re: How to enable RACK awareness on Already Running Kafka Cluster

2019-09-19 Thread Peter Bukowinski
Hi Ashu,

It’s possible to enable rack-awareness in a rolling manner. Kafka will never 
automatically move existing partitions, unless you tell it to or have a 
separate tool (e.g. Cruise Control) that does it for you. Rack-awareness comes 
into play when topics are initially created and partitions are distributed 
around the cluster.

After you’ve set the broker.rack property for each broker and have restarted 
them, you will need to do a manual rebalance to add a third replica and 
properly distribute your replicas between the availability zones.
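
As a rough sketch of those two steps (the rack name, file name, and zookeeper
address below are placeholders, not from your cluster):

    # server.properties on each broker, set before its rolling restart
    broker.rack=us-east-1a

    # once all brokers are restarted, apply a hand-built (or tool-generated)
    # reassignment that lists three replicas per partition across both AZs
    kafka-reassign-partitions.sh --zookeeper localhost:2181 \
      --reassignment-json-file add-third-replica.json --execute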

—
Peter

> On Sep 19, 2019, at 12:17 AM, Ashutosh singh  wrote:
> 
> Greetings,
> 
> We have an 8-node broker setup on AWS in 2 availability zones with replication
> factor 2. Our plan is to add one more node, distribute 3 nodes in each AZ, and
> change the replication factor to 3.
> 
> In order to replicate data in each AZ we need to enable rack awareness. Can
> someone guide me on how I can achieve this? I can bring down only one broker at
> a time, and if I change the setting on one broker (i.e. put broker.rack=rackid)
> and restart Kafka, what would be the impact? How will it handle requests when
> the other brokers don't have this setting yet?
> 
> Also, we have 1000+ topics in our cluster , do we need to manually reassign
> partitions for all topics ?
> 
> I have searched everywhere but couldn't find any place that says whether one
> can do it while the kafka cluster is in operation.
> 
> I really appreciate if someone can help on this.
> 
> 
> -- 
> Regard
> Ashu



Re: Kafka logs are getting deleted too soon

2019-07-17 Thread Peter Bukowinski
Indeed, something seems wrong. I have a kafka (2.0.1) cluster that aggregates 
data from multiple locations. It has so much data moving through it I can’t 
afford to keep more than 24 hours on disk. The retention is working correctly. 
I don’t restrict topics by size, only by time.

What version of kafka are you using?

Looking back at the example log directory listing, I see that you mentioned 
seeing .log.deleted files. Yes, that means kafka tagged that log segment 
for deletion, and then the cleanup process removed it soon after. Something is 
causing your data to be cleaned, despite your retention overrides.

Can you try removing 'retention.bytes’ and setting ‘retention.ms=-1' for the 
topic? That should persist the data indefinitely.
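
A sketch of the override commands (the zookeeper address is a placeholder):

    kafka-configs.sh --zookeeper localhost:2181 --alter --entity-type topics \
      --entity-name MyTopic --delete-config retention.bytes
    kafka-configs.sh --zookeeper localhost:2181 --alter --entity-type topics \
      --entity-name MyTopic --add-config retention.ms=-1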



> On Jul 17, 2019, at 6:07 PM, Sachin Nikumbh  
> wrote:
> 
> I am not setting the group id for the console consumer. When I say, the .log 
> files are all 0 bytes long it is after the producer has gone through 96 GB 
> worth of data. Apart from this topic where I am dumping 96GB of data, I have 
> some test topics where I am publishing very small amount of data. I don't 
> have any problem reading messages from those topics. The .log files for those 
> topics are properly sized and I can read those messages using multiple 
> console consumers at the same time. I have a feeling that this specific 
> topic is having trouble due to the amount of data that I am publishing. I am 
> failing to understand which Kafka settings are playing a role here.
> I am sure 96GB of data is really not a big deal for Kafka and I am not the 
> first one to do this.
>On Wednesday, July 17, 2019, 04:58:48 PM EDT, Peter Bukowinski 
>  wrote:  
> 
> Are you setting a group.id for your console consumer, perhaps, and keeping it 
> static? That would explain the inability to reconsume the data. As to why 
> your logs look empty, kafka likes to hold the data in memory and leaves it to 
> the OS to flush the data to disk. On a non-busy broker, the interval between 
> when data arrives and when it is flushed to disk can be quite a while.
> 
> 
>> On Jul 17, 2019, at 1:39 PM, Sachin Nikumbh  
>> wrote:
>> 
>> Hi Jamie,
>> I have 3 brokers and the replication factor for my topic is set to 3. I know 
>> for sure that the producer is producing data successfully because I am 
>> running a console consumer at the same time and it shows me the messages. 
>> After the producer produces all the data, I have /var/log/kafka/myTopic-* 
>> directories (15 of them) and all of them have only one .log file with size 
>> of 0 bytes. So, I am not sure if that addresses your question around the 
>> active segment.
>> Thanks
>> Sachin
>> On Wednesday, July 17, 2019, 04:00:56 PM EDT, Jamie 
>>  wrote:  
>> 
>> Hi Sachin, 
>> My understanding is that the active segment is never deleted which means you 
>> should have at least 1GB of data in your partition, if the data is indeed 
>> being produced to Kafka. Are there any errors in your broker logs? How many 
>> brokers do you have and what is the replication factor of the topic? If 
>> you have less than 3 brokers, have you set offsets.topic.replication.factor 
>> to the number of brokers? 
>> 
>> Thanks, 
>> Jamie
>> 
>> -Original Message-
>> From: Sachin Nikumbh 
>> To: users 
>> Sent: Wed, 17 Jul 2019 20:21
>> Subject: Re: Kafka logs are getting deleted too soon
>> 
>> Broker configs:
>> ===
>> broker.id=36
>> num.network.threads=3
>> num.io.threads=8
>> socket.send.buffer.bytes=102400
>> socket.receive.buffer.bytes=102400
>> socket.request.max.bytes=104857600
>> log.dirs=/var/log/kafka
>> num.partitions=1
>> num.recovery.threads.per.data.dir=1
>> offsets.topic.replication.factor=1
>> transaction.state.log.replication.factor=1
>> transaction.state.log.min.isr=1
>> log.retention.hours=168
>> log.segment.bytes=1073741824
>> log.retention.check.interval.ms=30
>> zookeeper.connect=myserver1:2181,myserver2:2181,myserver3:2181
>> zookeeper.connection.timeout.ms=6000
>> confluent.support.metrics.enable=true
>> confluent.support.customer.id=anonymous
>> group.initial.rebalance.delay.ms=0
>> auto.create.topics.enable=false
>> 
>> Topic configs:
>> ==
>> --partitions 15
>> --replication-factor 3
>> retention.ms=3144960
>> retention.bytes=10737418240
>> 
>> As you can see, I have tried to override the retention.bytes for each 
>> partition to 10GB to be explicit. 96GB over 15 partitions is 6.4GB each. So, I 
>> gave myself more than enough buffer. Even then, I am left with no logs. 
>> Here's an example:
>> 
>> % ls -ltr /var/log/kafka/MyTopic-0
>> total 4
>> -rw-r--r-- 1 root root       14 Jul 17 15:05 leader-epoch-checkpoint
>> -rw-r--r-- 1 root root 10485756 Jul 17 15:05 05484128.timeindex
>> -rw-r--r-- 1 r

Re: Kafka logs are getting deleted too soon

2019-07-17 Thread Peter Bukowinski
Are you setting a group.id for your console consumer, perhaps, and keeping it 
static? That would explain the inability to reconsume the data. As to why your 
logs look empty, kafka likes to hold the data in memory and leaves it to the OS 
to flush the data to disk. On a non-busy broker, the interval between when data 
arrives and when it is flushed to disk can be quite a while.


> On Jul 17, 2019, at 1:39 PM, Sachin Nikumbh  
> wrote:
> 
> Hi Jamie,
> I have 3 brokers and the replication factor for my topic is set to 3. I know 
> for sure that the producer is producing data successfully because I am 
> running a console consumer at the same time and it shows me the messages. 
> After the producer produces all the data, I have /var/log/kafka/myTopic-* 
> directories (15 of them) and all of them have only one .log file with size of 
> 0 bytes. So, I am not sure if that addresses your question around the active 
> segment.
> Thanks
> Sachin
>On Wednesday, July 17, 2019, 04:00:56 PM EDT, Jamie 
>  wrote:  
> 
> Hi Sachin, 
> My understanding is that the active segment is never deleted which means you 
> should have at least 1GB of data in your partition, if the data is indeed 
> being produced to Kafka. Are there any errors in your broker logs? How many 
> brokers do you have and what is the replication factor of the topic? If 
> you have less than 3 brokers, have you set offsets.topic.replication.factor 
> to the number of brokers? 
> 
> Thanks, 
> Jamie
> 
> -Original Message-
> From: Sachin Nikumbh 
> To: users 
> Sent: Wed, 17 Jul 2019 20:21
> Subject: Re: Kafka logs are getting deleted too soon
> 
> Broker configs:
> ===
> broker.id=36
> num.network.threads=3
> num.io.threads=8
> socket.send.buffer.bytes=102400
> socket.receive.buffer.bytes=102400
> socket.request.max.bytes=104857600
> log.dirs=/var/log/kafka
> num.partitions=1
> num.recovery.threads.per.data.dir=1
> offsets.topic.replication.factor=1
> transaction.state.log.replication.factor=1
> transaction.state.log.min.isr=1
> log.retention.hours=168
> log.segment.bytes=1073741824
> log.retention.check.interval.ms=30
> zookeeper.connect=myserver1:2181,myserver2:2181,myserver3:2181
> zookeeper.connection.timeout.ms=6000
> confluent.support.metrics.enable=true
> confluent.support.customer.id=anonymous
> group.initial.rebalance.delay.ms=0
> auto.create.topics.enable=false
> 
> Topic configs:
> ==
> --partitions 15
> --replication-factor 3
> retention.ms=3144960
> retention.bytes=10737418240
> 
> As you can see, I have tried to override the retention.bytes for each 
> partition to 10GB to be explicit. 96GB over 15 partitions is 6.4GB each. So, I 
> gave myself more than enough buffer. Even then, I am left with no logs. 
> Here's an example:
> 
> % ls -ltr /var/log/kafka/MyTopic-0
> total 4
> -rw-r--r-- 1 root root       14 Jul 17 15:05 leader-epoch-checkpoint
> -rw-r--r-- 1 root root 10485756 Jul 17 15:05 05484128.timeindex
> -rw-r--r-- 1 root root        0 Jul 17 15:05 05484128.log
> -rw-r--r-- 1 root root 10485760 Jul 17 15:05 05484128.index
> 
>  I kept my eyes on the directory for each partition as the producer was 
> publishing data and I saw periodic .deleted files. Does it mean that Kafka 
> was deleting logs?
> Any help would be highly appreciated.
> On Wednesday, July 17, 2019, 01:47:44 PM EDT, Peter Bukowinski 
>  wrote:  
> 
> Can you share your broker and topic config here?
> 
>> On Jul 17, 2019, at 10:09 AM, Sachin Nikumbh  
>> wrote:
>> 
>> Thanks for the quick response, Tom.
>> I should have mentioned in my original post that I am always using 
>> --from-beginning with my console consumer. Even then  I don't get any data. 
>> And as mentioned, the .log files are of size 0 bytes.
>> On Wednesday, July 17, 2019, 11:09:22 AM EDT, Thomas Aley 
>>  wrote:  
>> 
>> Hi Sachin,
>> 
>> Try adding --from-beginning to your console consumer to view the 
>> historically produced data. By default the console consumer starts from 
>> the last offset.
>> 
>> Tom Aley
>> thomas.a...@ibm.com
>> 
>> 
>> 
>> From:  Sachin Nikumbh 
>> To:Kafka Users 
>> Date:  17/07/2019 16:01
>> Subject:[EXTERNAL] Kafka logs are getting deleted too soon
>> 
>> 
>> 
>> Hi all,
>> I have ~ 96GB of data in files that I am trying to get into a Kafka 
>> cluster. I have ~ 11000 keys for the data and I have created 15 partitions 
>> for my topic. While my producer is dumping data in Kafka, I have a console 
>> consumer that shows me that kafka is getting the data. The producer runs 
>> for a few hours before it is done. However, at this point, when I run the 
>> console consumer, it does not 

Re: Kafka logs are getting deleted too soon

2019-07-17 Thread Peter Bukowinski
Can you share your broker and topic config here?

> On Jul 17, 2019, at 10:09 AM, Sachin Nikumbh  
> wrote:
> 
> Thanks for the quick response, Tom.
> I should have mentioned in my original post that I am always using 
> --from-beginning with my console consumer. Even then  I don't get any data. 
> And as mentioned, the .log files are of size 0 bytes.
>On Wednesday, July 17, 2019, 11:09:22 AM EDT, Thomas Aley 
>  wrote:  
> 
> Hi Sachin,
> 
> Try adding --from-beginning to your console consumer to view the 
> historically produced data. By default the console consumer starts from 
> the last offset.
> 
> Tom Aley
> thomas.a...@ibm.com
> 
> 
> 
> From:  Sachin Nikumbh 
> To:Kafka Users 
> Date:  17/07/2019 16:01
> Subject:[EXTERNAL] Kafka logs are getting deleted too soon
> 
> 
> 
> Hi all,
> I have ~ 96GB of data in files that I am trying to get into a Kafka 
> cluster. I have ~ 11000 keys for the data and I have created 15 partitions 
> for my topic. While my producer is dumping data in Kafka, I have a console 
> consumer that shows me that kafka is getting the data. The producer runs 
> for a few hours before it is done. However, at this point, when I run the 
> console consumer, it does not fetch any data. If I look at the logs 
> directory, .log files for all the partitions are of 0 byte size. 
> If I am not wrong, the default value for log.retention.bytes is -1 which 
> means there is no size limit for the logs/partition. I do want to make 
> sure that the value for this setting is per partition. Given that the 
> default time based retention is 7 days, I am failing to understand why the 
> logs are getting deleted. The other thing that confuses me is that when I 
> use kafka.tools.GetOffsetShell, it shows me large enough values for all 
> the 15 partitions for offsets.
> Can someone please help me understand why I don't see logs and why 
> is kafka.tools.GetOffsetShell making me believe there is data.
> Thanks
> Sachin
> 
> 
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number 
> 741598. 
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> 



Re: Justin Trudeau: support Justin Trudeau to postpone the decision of banning HUAWEI

2019-07-17 Thread Peter Bukowinski
I’m not even Canadian. No.

-- Peter (from phone)

> On Jul 17, 2019, at 7:33 AM, jiang0...@gmail.com wrote:
> 
> Hey,
> 
> I just signed the petition "Justin Trudeau: support Justin Trudeau to
> postpone the decision of banning HUAWEI" and wanted to see if you could
> help by adding your name.
> 
> Our goal is to reach 100 signatures and we need more support. You can read
> more and sign the petition here:
> 
> http://chng.it/D9NLSsKwVp
> 
> Thanks!
> Jacky


Re: Kafka logs are getting deleted too soon

2019-07-17 Thread Peter Bukowinski
Are you running the console consumer with the ‘--from-beginning’ option? It 
defaults to reading from tail of the log, so if there is nothing being produced 
it will be idle.
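
For example (broker address and topic name are placeholders):

    kafka-console-consumer.sh --bootstrap-server localhost:9092 \
      --topic my-topic --from-beginning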

-- Peter (from phone)

> On Jul 17, 2019, at 8:00 AM, Sachin Nikumbh  
> wrote:
> 
> Hi all,
> I have ~ 96GB of data in files that I am trying to get into a Kafka cluster. 
> I have ~ 11000 keys for the data and I have created 15 partitions for my 
> topic. While my producer is dumping data in Kafka, I have a console consumer 
> that shows me that kafka is getting the data. The producer runs for a few 
> hours before it is done. However, at this point, when I run the console 
> consumer, it does not fetch any data. If I look at the logs directory, .log 
> files for all the partitions are of 0 byte size. 
> If I am not wrong, the default value for log.retention.bytes is -1 which 
> means there is no size limit for the logs/partition. I do want to make sure 
> that the value for this setting is per partition. Given that the default time 
> based retention is 7 days, I am failing to understand why the logs are 
> getting deleted. The other thing that confuses me is that when I use 
> kafka.tools.GetOffsetShell, it shows me large enough values for all the 15 
> partitions for offsets.
> Can someone please help me understand why I don't see logs and why is 
> kafka.tools.GetOffsetShell making me believe there is data.
> Thanks
> Sachin


Re: Kafka Topic Partition Consumer Lags

2019-06-26 Thread Peter Bukowinski
Is there a correlation between the lagging partitions and the consumer assigned 
to them?

> On Jun 26, 2019, at 4:25 PM, Garvit Sharma  wrote:
> 
> Can anyone please help me with this.
> 
> On Wed, Jun 26, 2019 at 8:56 PM Garvit Sharma  wrote:
> 
>> Hey Steve,
>> 
>> I have checked, count of messages on all the partitions are same.
>> 
>> I am still exploring an approach using which the root cause could be
>> determined.
>> 
>> Thanks,
>> 
>> On Wed, Jun 26, 2019 at 8:07 PM Garvit Sharma 
>> wrote:
>> 
>>> I am not sure about that. Is there a way to analyse that ?
>>> 
>>> On Wed, Jun 26, 2019 at 7:35 PM Steve Howard 
>>> wrote:
>>> 
 Hi Garvit,
 
 Are the slow partitions "hot", i.e., receiving a lot more messages than
 others?
 
 Thanks,
 
 Steve
 
 On Wed, Jun 26, 2019, 9:56 AM Garvit Sharma >>> 
> Just to add more details, these consumers are processing the Kafka
 events
> and writing to DB(fast write guaranteed).
> 
> On Wed, Jun 26, 2019 at 7:23 PM Garvit Sharma 
> wrote:
> 
>> Hi All,
>> 
>> I can see huge consumer lag in a few partitions of Kafka topic. I
 need to
>> know the root cause of this issue.
>> 
>> Please let me know, how to proceed.
>> 
>> Below is sample consumer lag data :
>> 
>> [image: image.png]
>> 
>> Thanks,
>> 
>> 
> 
 
>>> 



Re: Newb Trying to Publish via Kafka CLI

2019-06-05 Thread Peter Bukowinski
Hi,

Leave the ’ssl://'  part off your --broker-list argument and it should 
work. You only need ‘host:port’.
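
Using the masked values from your mail, the command would look roughly like this
(just a sketch; adjust paths for your install):

    kafka-console-producer --broker-list 123.45.67.891:1234,123.45.67.892:1234 \
      --producer.config C:\Users\example_user\Kafka\client-ssl.properties \
      --topic FakeTopic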

—
Peter Bukowinski

> On Jun 5, 2019, at 12:41 PM, jbail...@gmail.com wrote:
> 
> Hello,
> 
> I am trying to connect to kafka via CLI to publish messages to a topic from a 
> windows box.
> I’ve read the docs, googled and asked others (everyone I know just uses the 
> GUI).  I’m clearly missing something.
> 
> I’ve included the (masked) info I use to connect via Kafka Tool followed by a 
> short list of specific questions.  
> 
> Properties : General
>   Cluster Name – ABCD, Kafka Cluster Version – 0.11, Zookeeper Host – 
> localhost, Zookeeper Port – , Chroot path - /
> Properties : Security
>   Type – SSL, Truststore Location – Path to kafka-trustore, Trustore 
> Password – abcdefghij123, Keystore Location – , Keystore Password – 
> , Keystore Private Key Password – 
> Properties : Advanced
>   Bootstrap servers – ssl://123.45.67.891:1234, ssl://123.45.67.892:1234, 
> ssl://123.45.67.893:1234, SASL mechanism – 
> 
> 1. Does it matter which version of kafka I use to connect?  I'm trying with 
> 2.12-2.2.0
> 2. I’ve attempted to use the below command, but receive the following error.  
> What am I missing here?
> 
>Kafka-console-producer --broker-list ssl://123.45.67.891:1234, 
> ssl://123.45.67.892:1234 --producer.config 
> C:\Users\example_user\Kafka\client-ssl.properties --topic FakeTopic
> 
>Contents of client-ssl.properties:
>security.protocol = SSL
>ssl.truststore.location = C:/Users/example_user/Kafka/kafka-truststore
>ssl.truststore.password = abcdefghij123
> 
>Error:
>> [2019-06-05 13:01:59,592] ERROR [Producer clientId=console-producer] 
>> Connection to node -1 (/123.45.67.891:1234) failed authentication due to: 
>> SSL handshake failed (org.apache.kafka.clients.NetworkClient)
> 



Re: Need some guidance to handle Kafka issues in Cloudera

2019-05-25 Thread Peter Bukowinski
Lacking any details makes it difficult to assist. Do you have Cloudera support? 

-- Peter

> On May 25, 2019, at 8:27 AM, PRASADA RAO Baratam  wrote:
> 
> Need some guidance to handle Kafka issues in Cloudera
> 
> Regards
> Prasad


Re: Kafka delaying message

2019-05-22 Thread Peter Bukowinski
I’d suggest using separate topics for messages that require delay and ones that 
do not. If you are limited to a single topic, then I’d use some other metadata 
to differentiate messages that require delayed processing from ones that do 
not. If you do not want to block the polling thread, you’ll need to route 
messages into a buffer of some sort to process them asynchronously.

—
Peter

> On May 22, 2019, at 1:10 PM, Pavel Molchanov  
> wrote:
> 
> This solution will block receiving polling thread for 15 minutes. Not good.
> 
> What should we do if a topic has messages that should be processed
> immediately and delayed messages at the same time?
> 
> *Pavel Molchanov*
> 
> 
> 
> On Wed, May 22, 2019 at 2:41 PM Peter Bukowinski  wrote:
> 
>> There is no out-of-the-box way to tell a consumer to not consume an offset
>> until it is x minutes old. Your best bet is to encode the creation time into
>> the messages themselves and add some processing logic into your consumer.
>> Let’s assume your topic has a single partition or your partitions are keyed
>> to guarantee message order. Your consumer could work like this in
>> pseudo-code:
>> 
>> consumer loop:
>>consume message
>>if (current time - message.timestamp) >= 15 minutes
>>process message
>>else
>>sleep 15 minutes - (current time - message.timestamp)
>>process message
>> 
>> Since the messages enter the topic in the order they were published,
>> pausing on the current offset should never cause a bottleneck on the later
>> messages. If you fall behind, the greater than or equal to logic will
>> prevent your consumer from pausing until it has caught up to your desired
>> delay.
>> 
>> This is a simplified scenario that may or may not map to your production
>> use case, though.
>> 
>> —
>> Peter
>> 
>> 
>>> On May 22, 2019, at 11:12 AM, Pavel Molchanov <
>> pavel.molcha...@infodesk.com> wrote:
>>> 
>>> Andrien,
>>> 
>>> Thank you for asking this question! I have the same problem and wanted to
>>> ask the same question. I hope that someone will answer soon.
>>> 
>>> *Pavel Molchanov*
>>> 
>>> 
>>> 
>>> On Wed, May 22, 2019 at 9:54 AM Adrien Ruffie 
>> wrote:
>>> 
>>>> Hello all,
>>>> 
>>>> I have a specific need and I don't know if a generic solution exist ...
>>>> maybe you can enlighten me
>>>> 
>>>> I need to delay each sent message by about 15 mins.
>>>> Example
>>>> Message with offset 1 created at 2:41PM by the producer and received by
>> the
>>>> consumer at 2:56PM
>>>> Message with offset 2 created at 2:46PM by the producer and received by
>> the
>>>> consumer at 3:01PM
>>>> Message with offset 3 created at 2:46PM by the producer and received by
>> the
>>>> consumer at 3:01PM
>>>> Message with offset 4 created at 3:01PM by the producer and received by
>> the
>>>> consumer at 3:16PM
>>>> and so forth ...
>>>> 
>>>> any option, mechanism, producer/consumer implementations already exist ?
>>>> 
>>>> Thank a lot and best regards
>>>> 
>>>> Adrian
>>>> 
>> 
>> 



Re: Kafka delaying message

2019-05-22 Thread Peter Bukowinski
There is no out-of-the-box way to tell a consumer to not consume an offset 
until it is x minutes old. Your best bet is to encode the creation time into the 
messages themselves and add some processing logic into your consumer. Let’s 
assume your topic has a single partition or your partitions are keyed to 
guarantee message order. Your consumer could work like this in pseudo-code:

consumer loop:
    consume message
    if (current time - message.timestamp) >= 15 minutes
        process message
    else
        sleep 15 minutes - (current time - message.timestamp)
        process message

Since the messages enter the topic in the order they were published, pausing on 
the current offset should never cause a bottleneck on the later messages. If 
you fall behind, the greater than or equal to logic will prevent your consumer 
from pausing until it has caught up to your desired delay.

This is a simplified scenario that may or may not map to your production use 
case, though.

—
Peter


> On May 22, 2019, at 11:12 AM, Pavel Molchanov  
> wrote:
> 
> Andrien,
> 
> Thank you for asking this question! I have the same problem and wanted to
> ask the same question. I hope that someone will answer soon.
> 
> *Pavel Molchanov*
> 
> 
> 
> On Wed, May 22, 2019 at 9:54 AM Adrien Ruffie  wrote:
> 
>> Hello all,
>> 
>> I have a specific need and I don't know if a generic solution exist ...
>> maybe you can enlighten me
>> 
>> I need to delay each sent message by about 15 mins.
>> Example
>> Message with offset 1 created at 2:41PM by the producer and received by the
>> consumer at 2:56PM
>> Message with offset 2 created at 2:46PM by the producer and received by the
>> consumer at 3:01PM
>> Message with offset 3 created at 2:46PM by the producer and received by the
>> consumer at 3:01PM
>> Message with offset 4 created at 3:01PM by the producer and received by the
>> consumer at 3:16PM
>> and so forth ...
>> 
>> any option, mechanism, producer/consumer implementations already exist ?
>> 
>> Thank a lot and best regards
>> 
>> Adrian
>> 



Re: Help - Updating Keystore Dynamically - KAFKA-6810

2019-05-16 Thread Peter Bukowinski
Yes, it is still relevant — unless you’ve enabled SSL for inter-broker 
communication and you are trying to update the truststore associated with that 
listener.

You should use the kafka-configs command to set the dynamic config value: 
https://kafka.apache.org/21/documentation.html#dynamicbrokerconfigs 
<https://kafka.apache.org/21/documentation.html#dynamicbrokerconfigs>

> bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type brokers 
> --entity-default --alter --add-config 
> {listener.name.[listener_name].}ssl.truststore.location=/path/to/new/truststore

The part in brackets may be optional if you don’t have more than one listener 
configured with a truststore.


> On May 16, 2019, at 3:26 PM, Darshan  wrote:
> 
> I sent another email that I am looking to dynamically update SSL
> truststore, and not keystore. Would that be still relevant? Thanks.
> 
> On Thu, May 16, 2019 at 2:54 PM Peter Bukowinski  wrote:
> 
>> It’s my understanding that dynamic configuration requires you to write
>> znodes, e.g. /config/brokers/ssl.keystore.location. I believe you can use
>> the same path. Brokers should be watching that path and if a node is added
>> or updated the config values will be read in and loaded over existing
>> values.
>> 
>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-226+-+Dynamic+Broker+Configuration#KIP-226-DynamicBrokerConfiguration-SSLkeystore
>> <
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-226+-+Dynamic+Broker+Configuration#KIP-226-DynamicBrokerConfiguration-SSLkeystore
>>> 
>> 
>> 
>>> On May 16, 2019, at 2:08 PM, Darshan 
>> wrote:
>>> 
>>> Hi
>>> 
>>> I am testing out Kafka 2.2.0 and was hoping to test out "Enable dynamic
>>> reconfiguration of SSL truststores"
>>> https://issues.apache.org/jira/browse/KAFKA-6810. But unfortunately I
>> could
>>> not get it work. Please find the server.properties. Just wondering if we
>>> need an change of config. Please advise..
>>> 
>>> 1. I added a new entry in the truststore, and validated it that it is
>>> present.
>>> 2. The client (kafka writer) could not write to Kafka due to
>> SSLException.
>>> 3. I restarted Kafka broker.
>>> 4. The client could write messages.
>>> 
>>> 
>>> server.properties
>>> 
>> 
>>> 
>>> # Server Basics #
>>> 
>>> # The id of the broker. This must be set to a unique integer for each
>>> broker.
>>> broker.id=1
>>> auto.create.topics.enable=true
>>> delete.topic.enable=true
>>> 
>>>  Upgrading from 1.1.0 to 2.2.0 
>>> inter.broker.protocol.version=1.1
>>> log.message.format.version=1.1
>>> 
>>> # Socket Server Settings
>>> #
>>> 
>>> listeners=INTERNAL://1.1.1.65:9092,EXTERNAL://10.28.118.172:443
>>> ,INTERNAL_PLAINTEXT://1.1.1.65:9094
>>> advertised.listeners=INTERNAL://1.1.1.65:9092,EXTERNAL://
>> 10.28.118.172:443
>>> ,INTERNAL_PLAINTEXT://1.1.1.65:9094
>>> 
>> listener.security.protocol.map=INTERNAL:SSL,EXTERNAL:SSL,INTERNAL_PLAINTEXT:PLAINTEXT
>>> inter.broker.listener.name=INTERNAL_PLAINTEXT
>>> 
>>> default.replication.factor=1
>>> offsets.topic.replication.factor=1
>>> 
>>> # Hostname the broker will bind to. If not set, the server will bind to
>> all
>>> interfaces
>>> host.name=10.28.118.172
>>> 
>>> # The number of threads handling network requests
>>> num.network.threads=12
>>> 
>>> # The number of threads doing disk I/O
>>> num.io.threads=12
>>> 
>>> # The send buffer (SO_SNDBUF) used by the socket server
>>> socket.send.buffer.bytes=102400
>>> 
>>> # The receive buffer (SO_RCVBUF) used by the socket server
>>> socket.receive.buffer.bytes=102400
>>> 
>>> # The maximum size of a request that the socket server will accept
>>> (protection against OOM)
>>> socket.request.max.bytes=104857600
>>> 
>>> # Max message size is 10 MB
>>> message.max.bytes=1120
>>> 
>>> # Consumer side largest message size is 10 MB
>>> fetch.message.max.bytes=1120
>>> 
>>> # Replica max fetch size is 10MB
>>> replica.fetch.max.byt

Re: Help - Updating Keystore Dynamically - KAFKA-6810

2019-05-16 Thread Peter Bukowinski
It’s my understanding that dynamic configuration requires you to write znodes, 
e.g. /config/brokers/ssl.keystore.location. I believe you can use the same 
path. Brokers should be watching that path and if a node is added or updated 
the config values will be read in and loaded over existing values.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-226+-+Dynamic+Broker+Configuration#KIP-226-DynamicBrokerConfiguration-SSLkeystore
 



> On May 16, 2019, at 2:08 PM, Darshan  wrote:
> 
> Hi
> 
> I am testing out Kafka 2.2.0 and was hoping to test out "Enable dynamic
> reconfiguration of SSL truststores"
> https://issues.apache.org/jira/browse/KAFKA-6810. But unfortunately I could
> not get it work. Please find the server.properties. Just wondering if we
> need an change of config. Please advise..
> 
> 1. I added a new entry in the truststore, and validated it that it is
> present.
> 2. The client (kafka writer) could not write to Kafka due to SSLException.
> 3. I restarted Kafka broker.
> 4. The client could write messages.
> 
> 
> server.properties
> 
> 
> # Server Basics #
> 
> # The id of the broker. This must be set to a unique integer for each
> broker.
> broker.id=1
> auto.create.topics.enable=true
> delete.topic.enable=true
> 
>  Upgrading from 1.1.0 to 2.2.0 
> inter.broker.protocol.version=1.1
> log.message.format.version=1.1
> 
> # Socket Server Settings
> #
> 
> listeners=INTERNAL://1.1.1.65:9092,EXTERNAL://10.28.118.172:443
> ,INTERNAL_PLAINTEXT://1.1.1.65:9094
> advertised.listeners=INTERNAL://1.1.1.65:9092,EXTERNAL://10.28.118.172:443
> ,INTERNAL_PLAINTEXT://1.1.1.65:9094
> listener.security.protocol.map=INTERNAL:SSL,EXTERNAL:SSL,INTERNAL_PLAINTEXT:PLAINTEXT
> inter.broker.listener.name=INTERNAL_PLAINTEXT
> 
> default.replication.factor=1
> offsets.topic.replication.factor=1
> 
> # Hostname the broker will bind to. If not set, the server will bind to all
> interfaces
> host.name=10.28.118.172
> 
> # The number of threads handling network requests
> num.network.threads=12
> 
> # The number of threads doing disk I/O
> num.io.threads=12
> 
> # The send buffer (SO_SNDBUF) used by the socket server
> socket.send.buffer.bytes=102400
> 
> # The receive buffer (SO_RCVBUF) used by the socket server
> socket.receive.buffer.bytes=102400
> 
> # The maximum size of a request that the socket server will accept
> (protection against OOM)
> socket.request.max.bytes=104857600
> 
> # Max message size is 10 MB
> message.max.bytes=1120
> 
> # Consumer side largest message size is 10 MB
> fetch.message.max.bytes=1120
> 
> # Replica max fetch size is 10MB
> replica.fetch.max.bytes=1120
> 
> # Max request size 10MB
> max.request.size=1120
> 
>  SHUTDOWN and REBALANCING ###
> # Both the following properties are also enabled by default as well, also
> explicitly settings here
> controlled.shutdown.enable=true
> auto.leader.rebalance.enable=true
> unclean.leader.election.enable=true
> 
> 
> # Security Settings ##
> ssl.endpoint.identification.algorithm=""
> ssl.keystore.location=/dir/keystore.jks
> ssl.keystore.password=pwd
> ssl.key.password=pwd
> ssl.truststore.location=/dir/truststore.jks
> ssl.truststore.password=pwd
> ssl.keystore.type=JKS
> ssl.truststore.type=JKS
> security.protocol=SSL
> ssl.client.auth=required
> allow.everyone.if.no.acl.found=false
> authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer
> # User.ANONYMOUS is included for AMS to be able to program ACL via 9094 port
> super.users=User:CN=KafkaBroker1;User:ANONYMOUS



Re: kafka ssl config

2019-05-02 Thread Peter Bukowinski
If you can access the remote file via a mounted filesystem, you can specify 
'/mountpoint/truststore.jks’ as the value for ssl.truststore.location. You 
cannot use a url to specify a remote resource.
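
For example, assuming the remote share is mounted at /mnt/certs (the paths and
password here are placeholders):

    security.protocol=SSL
    ssl.truststore.location=/mnt/certs/truststore.jks
    ssl.truststore.password=changeit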


> On May 2, 2019, at 11:38 AM, anurag  wrote:
> 
> Hi All,
> 
> Is it possible to set the value of ssl.truststore.location to a location on
> remote host. Basically I have ssl certificates available on remote host and
> i would like my docker kafka container to read and use certificates from
> remote location. If this is possible can you please provide an example.
> 
> Many thanks,
> 
> Anurag



Re: How to count number of available messages per topic?

2019-04-28 Thread Peter Bukowinski
You’ll need to do this programmatically with some simple math. There’s a binary 
included with kafka called kafka-run-class that you can use to expose earliest 
and latest offset information.

This will return the earliest unexpired offsets for each partition in a topic:

kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 
--topic TOPIC --time -2

This will return the latest offset:

kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 
--topic TOPIC --time -1

With that info, subtract the earliest from the latest per partition, sum the 
results, and you’ll have the number of messages available in your topic.
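
As a rough sketch, the math can be scripted (broker address and topic name are
placeholders; GetOffsetShell prints one topic:partition:offset line per partition):

    kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 \
      --topic TOPIC --time -2 > earliest.txt
    kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 \
      --topic TOPIC --time -1 > latest.txt
    # sum of (latest - earliest) across all partitions
    awk -F: 'NR==FNR {e[$2]=$3; next} {sum += $3 - e[$2]} END {print sum}' \
      earliest.txt latest.txt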

-- Peter

> On Apr 28, 2019, at 1:41 AM, jaaz jozz  wrote:
> 
> Hello,
> I want to count how many messages available in each topic in my kafka
> cluster.
> I understand that just looking at the latest offset available is not
> correct, because older messages may have been already purged due to
> retention policy.
> So what is the correct way of counting that?
> 
> Thanks,
> Jazz.


Re: Can the consumer know the user who sent the message ?

2019-04-21 Thread Peter Bukowinski
Kafka’s authorization layer is entirely separate from topic data, other than 
granting or denying access. If you don’t want to alter the messages themselves 
to hold information about the producers, then you should consider using 
separate topics.

-- Peter (from phone)

> On Apr 21, 2019, at 10:13 AM, Kumaran Ponnambalam  
> wrote:
> 
> Hi,
> 
> I am trying to add security to my Kafka setup. I can setup authorization for 
> a list of users. However, i would like my consumer to know about the specific 
> user who produced the message. Based on the user, i would like to perform 
> additional segmentation and processing in a multi-tenanted scenario. I don't 
> want the producer to set the user ID as a custom attribute or header, for 
> security reasons. Rather discover the user based on their authentication used.
> 
> Any thoughts on how to achieve this?
> 
> thanks
> 
> Kumaran


Re: Kafka memory estimation

2019-04-12 Thread Peter Bukowinski
The memory that a kafka broker uses is the java heap + the page cache. If 
you’re able to split your memory metrics by memory-used and memory-cached, you 
should see that the majority of a broker’s memory usage is cached memory.

As a broker receives data from producers, the data first enters the page cache. 
If you have consumers reading this data as it comes in, they will be able to 
read it directly from the page cache and avoid the latency of having to fetch it 
from disk. By default, kafka leaves it to the OS to flush this data to disk.

The page cache can only use free memory (not committed to other processes), and 
it will also relinquish memory to other processes that need it.
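
You can see that split on a broker with standard OS tools, for example:

    free -h      # the 'buff/cache' column is mostly page cache on a dedicated broker
    vmstat 1 5   # 'cache' shows cache size; 'bi'/'bo' show disk reads/writes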

On the metrics dashboards that I’ve built for my clusters, I always include 
cached and used memory charts, as well as disk read and write charts. If I see 
too many disk reads, I know I have lagging consumers.

-- Peter (from phone)

> On Apr 12, 2019, at 8:20 PM, Rammohan Vanteru  wrote:
> 
> Hi Steve,
> 
> We are using Prometheus jmx exporter and Prometheus to scrape metrics based
> on memory metric we are measuring.
> 
> Jmx exporter:
> https://github.com/prometheus/jmx_exporter/blob/master/README.md
> 
> Thanks,
> Ramm.
> 
> On Fri, Apr 12, 2019 at 12:43 PM Steve Howard 
> wrote:
> 
>> Hi Rammohan,
>> 
>> How are you measuring "Kafka seems to be reserving most of the memory"?
>> 
>> Thanks,
>> 
>> Steve
>> 
>> On Thu, Apr 11, 2019 at 11:53 PM Rammohan Vanteru 
>> wrote:
>> 
>>> Hi Users,
>>> 
>>> As per the article here:
>>> https://docs.confluent.io/current/kafka/deployment.html#memory memory
>>> requirement is roughly calculated based on formula: write throughput*30
>>> (buffer time in seconds), which fits in our experiment i.e. 30MB/s*30~
>>> 900MB. Followup questions here:
>>> 
>>>   - How do we estimate the figure ’30 seconds’ for our buffer time?
>>>   - Kafka seems to be reserving most of the memory even when the none of
>>>   the messages are being streamed.. how does that play out.
>>> 
>>> Any information provided would be helpful.
>>> 
>>> Thanks,
>>> Rammohan.
>> 


Re: Question for min.insync.replicas

2019-04-09 Thread Peter Bukowinski

> On Apr 9, 2019, at 8:55 AM, Han Zhang  wrote:
> 
> Hi
> 
> Is there a property that I can set the min insync replicas to an individual 
> topic?
> 
> I only know min.insync.replicas but it's for the broker level.
> 
> Thanks

Hi Han,

Yes, the topic-level config is also min.insync.replicas. You set it using the 
`kafka-topics` tool, e.g.

/path/to/kafka-topics.sh \
--zookeeper 127.0.0.1:2181 \
--alter \
--topic my_topic \
--config min.insync.replicas=1

You can also set this property when you create the topic:

/path/to/kafka-topics.sh \
--zookeeper 127.0.0.1:2181 \
--create \
--topic new_topic \
--replication-factor 3 \
--partitions 20 \
--config min.insync.replicas=1

—
Peter Bukowinski

Re: Need help to find references to antipatterns/pitfalls/incorrect ways to use Kafka

2019-03-31 Thread Peter Bukowinski
I don’t want to be a downer, but because kafka is relatively new, the reference 
material you seek probably doesn’t exist. Kafka is flexible and can be made to 
work in many different scenarios — not all of them ideal.

It sounds like you’ve already reached a conclusion that kafka is the wrong 
solution for your requirements. Please share with us the evidence that you used 
to reach this conclusion. It would be helpful if you described the technical 
problems you encountered in your experiments so that others can give their 
opinion on whether they can be resolved or whether they are deal-breakers.

--
Peter

> On Mar 31, 2019, at 4:24 PM,   wrote:
> 
> Hello!
> 
> 
> 
> I ask for your help in connection with the my recent task:
> 
> - Price lists are delivered to 20,000 points of sale with a frequency of <10
> price lists per day.
> 
> - The order in which the price lists follow is important. It is also
> important that the price lists are delivered to the point of sale online.
> 
> - At each point of sale, an agent application is deployed, which processes
> the received price lists.
> 
> 
> 
> This task is not particularly difficult. Help in solving the task is not
> required.
> 
> 
> 
> The difficulty is that Kafka in our company is a new "silver bullet", and
> the project manager requires me to implement the following technical
> decision: 
> 
> deploy 20,000 Kafka consumer instances (one instance for each point of sale)
> for one topic partitioned into 20,000 partitions - one partition per
> consumer.
> 
> Technical problems obtained in experiments with this technical decision do
> not convince him.
> 
> 
> 
> Please give me references to the books/documents/blogposts. which clearly
> shows that Kafka not intended for this way to use (references to other
> anti-patterns/pitfalls will be useful).
> 
> My own attempts to find such references were unsuccessful.
> 
> 
> 
> Thank you!
> 
> 
> 


Re: Proxying the Kafka protocol

2019-03-19 Thread Peter Bukowinski
https://docs.confluent.io/3.0.0/kafka-rest/docs/intro.html

The Kafka REST proxy may be what you need. You can put multiple instances 
behind a load balancer to scale to your needs.
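
As a quick sketch, producing through the REST proxy looks like this (hostname,
port 8082, topic name, and payload are placeholders):

    curl -X POST \
      -H "Content-Type: application/vnd.kafka.json.v2+json" \
      --data '{"records":[{"value":{"hello":"world"}}]}' \
      http://rest-proxy.example.com:8082/topics/my-topic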


-- Peter (from phone)

> On Mar 19, 2019, at 8:30 AM, Ryanne Dolan  wrote:
> 
> Hello James, I'm not aware of anything like that for Kafka, but you can use
> MirrorMaker for network segmentation. With this approach you have one Kafka
> cluster in each segment and a MM cluster in the more privileged segment.
> You don't need to expose the privileged segment at all -- you just need to
> let MM reach the external segment(s).
> 
> Ryanne
> 
>> On Tue, Mar 19, 2019, 10:20 AM James Grant  wrote:
>> 
>> Hello,
>> 
>> We would like to expose a Kafka cluster running on one network to clients
>> that are running on other networks without having to have full routing
>> between the two networks. In this case these networks are in different AWS
>> accounts but the concept applies more widely. We would like to access Kafka
>> over a single (or very few) host names.
>> 
>> In addition we would like to filter incoming messages to enforce some level
>> of data quality and also impose some access control.
>> 
>> A solution we are looking into is to provide a Kafka protocol level proxy
>> that presents to clients as a single node Kafka cluster holding all the
>> topics and partitions of the cluster behind it. This proxy would be able to
>> operate in a load balanced cluster behind a single DNS entry and would also
>> be able to intercept and filter/alter messages as they passed through.
>> 
>> The advantages we see in this approach over the HTTP proxy is that it
>> presents the Kafka protocol whilst also meaning that we can use a typical
>> TCP level load balancer that it is easy to route connections to. This means
>> that we continue to use native Kafka clients.
>> 
>> Does anything like this already exist? Does anybody think it would useful?
>> Does anybody know of any reason it would be impossible (or a bad idea) to
>> do?
>> 
>> James Grant
>> 
>> Developer - Expedia Group
>> 


Re: Whether kafka broker will have impact with 2 MB message size

2019-03-13 Thread Peter Bukowinski
We have many production clusters with three topics in the 1-3MB range and the 
rest in the multi-kb to sub-kb range. We do use gzip compression, implemented 
at the broker rather than the producer level. The clusters don’t usually break 
a sweat. We use MirrorMaker to aggregate these topics to a large central 
cluster and the message sizes aren’t an issue there, either.

The main problem we have with these large topics — especially one of them that 
is high throughput — is that after a cluster has been up a while and partitions 
have moved around (we use cruise-control) due to hardware failures, the topic 
tends to become less balanced across the broker storage and some brokers/disks 
fill up faster than others.

—
Peter

> On Mar 13, 2019, at 6:28 PM, 1095193...@qq.com wrote:
> 
> We have a use case where we occasionally want to produce messages to kafka with
> a max size of 2 MB (that is, the message size varies based on user operations).
> 
> Will producing 2 MB messages have any impact, or do we need to split each
> message into small chunks, such as 100 KB, before producing?
> 
> If we produce in small chunks, it will increase response time for the user.
> Also, we have tested producing 2 MB messages to kafka and we don't see much
> latency there.
> 
> Anyway, if we split the data before producing, it doesn't have any impact on
> disk size. But will broker performance degrade because of this?
> 
> Our broker configuration is:
> 
> RAM 125.6 GB Disk Size 2.9 TB Processors 40
> 
> Thanks,
> Hemnath K B.



Re: Kafka Mirror Maker place of execution

2019-03-12 Thread Peter Bukowinski
Hi Franz,

The MirrorMaker instances are colocated with the brokers, yes. These are beefy, 
dedicated hosts that are handling the loads admirably.

The core cluster receives about 400k msg/sec, 1GB/sec across 20 topics at peak 
times. CPU usage occasionally crosses 50% during peak times. If I find that the 
hardware is getting overloaded, or that our consumer load on the core cluster 
increases significantly, I will move MM to a separate set of hosts. Right now, 
it’s quite cost-effective as is. :)

—
Peter


> On Mar 12, 2019, at 10:02 AM, Franz van Betteraey  wrote:
> 
> Hi Peter,
> 
> these are remarkable numbers but to be honest I do not get where you run the 
> Mirror Maker processes. 
> Do you run them near the remote clusters or near the target (core?) 
> datacenter cluster?
> 
> As I understand you run 30 MirrorMaker Instances (one for each remote 
> cluster) on each of the 100 Kafka Nodes of your core datacenter cluster.
> So you run the Mirror Maker on the same machine as the Kafka Nodes and do not 
> use a dedicated machines for the Mirror Maker process?
> 
> 
> Best regards,
>  Franz
>  
> 
> Sent: Tuesday, 12 March 2019 at 16:24
> From: "Peter Bukowinski" <pmb...@gmail.com>
> To: users@kafka.apache.org
> Subject: Re: Kafka Mirror Maker place of execution
> I have a setup with about 30 remote kafka clusters and one cluster in a core 
> datacenter where I aggregate data from all the remote clusters. The remote 
> clusters have 30 nodes each with moderate specs. The core cluster has 100 
> nodes with lots of cpu, ram, and ssd storage per node.
> 
> I run MirrorMaker directly on the core brokers. Each broker runs one 
> MirrorMaker instance per edge cluster, sharing the same group.id. Since I’m 
> running 100 instances per edge cluster, the number of threads I use = (total 
> partition count of topics I am mirroring) / 100. In practice, each MM 
> instance runs with about 25 threads, so each broker runs 25*30=750 threads of 
> MirrorMaker.
> 
> I’ve been running this setup for many months and it’s proved to be stable 
> with very low consumer lag.
> 
> --
> Peter Bukowinski
> 
>> On Mar 12, 2019, at 6:42 AM, Ryanne Dolan  wrote:
>> 
>> Franz, you can run MM on or near either source or target cluster, but it's
>> more efficient near the target because this minimizes producer latency. If
>> latency is high, producers will block waiting on ACKs for in-flight records,
>> which reduces throughput.
>> 
>> I recommend running MM near the target cluster but not necessarily on the
>> same machines, because often Kafka nodes are relatively expensive, with SSD
>> arrays and huge IO bandwidth etc, which isn't necessary for MM.
>> 
>> Ryanne
>> 
>> On Tue, Mar 12, 2019, 8:13 AM Franz van Betteraey 
>> wrote:
>> 
>>> Hi all,
>>> 
>>> there are best practices out there which recommend to run the Mirror Maker
>>> on the target cluster.
>>> 
>>> https://community.hortonworks.com/articles/79891/kafka-mirror-maker-best-practices.html
>>> 
>>> I wonder why this recommendation exists because ultimately all data must
>>> cross the border between the clusters, regardless of whether they are
>>> consumed at the target or produced at the source. A reason I can imagine is
>>> that the Mirror Maker supports multiple consumers but only one producer -
>>> so consuming data on the way with the greater latency might be sped up by
>>> the use of multiple consumers.
>>> 
>>> If performance because of multi threading is a point, would it be useful
>>> to use several producer (one per consumer) to replicate the data (with a
>>> custom replication process)? Does anyone knows why the Mirror Maker shares
>>> a single producer among all consumers?
>>> 
>>> My usecase is the replication of data from several source cluster (~10) to
>>> a single target cluster. I would prefer to run the replication process on
>>> the source cluster to avoid to many replication processes (each for one
>>> source) on the target cluster.
>>> 
>>> Hints and suggestions on this topic are very welcome.
>>> 
>>> Best regards
>>> Franz
>>> 
>>> If you would like to earn some SO recommendation points feel free to
>>> answer this question on SO ;-)
>>> https://stackoverflow.com/q/55122268/367285



Re: Kafka Mirror Maker place of execution

2019-03-12 Thread Peter Bukowinski
I have a setup with about 30 remote kafka clusters and one cluster in a core 
datacenter where I aggregate data from all the remote clusters. The remote 
clusters have 30 nodes each with moderate specs. The core cluster has 100 nodes 
with lots of cpu, ram, and ssd storage per node.

I run MirrorMaker directly on the core brokers. Each broker runs one 
MirrorMaker instance per edge cluster, sharing the same group.id. Since I’m 
running 100 instances per edge cluster, the number of threads I use = (total 
partition count of topics I am mirroring) / 100. In practice, each MM instance 
runs with about 25 threads, so each broker runs 25*30=750 threads of 
MirrorMaker.

I’ve been running this setup for many months and it’s proved to be stable with 
very low consumer lag.

--
Peter Bukowinski

> On Mar 12, 2019, at 6:42 AM, Ryanne Dolan  wrote:
> 
> Franz, you can run MM on or near either source or target cluster, but it's
> more efficient near the target because this minimizes producer latency. If
> latency is high, producers will block waiting on ACKs for in-flight records,
> which reduces throughput.
> 
> I recommend running MM near the target cluster but not necessarily on the
> same machines, because often Kafka nodes are relatively expensive, with SSD
> arrays and huge IO bandwidth etc, which isn't necessary for MM.
> 
> Ryanne
> 
> On Tue, Mar 12, 2019, 8:13 AM Franz van Betteraey 
> wrote:
> 
>> Hi all,
>> 
>> there are best practices out there which recommend to run the Mirror Maker
>> on the target cluster.
>> 
>> https://community.hortonworks.com/articles/79891/kafka-mirror-maker-best-practices.html
>> 
>> I wonder why this recommendation exists because ultimately all data must
>> cross the border between the clusters, regardless of whether they are
>> consumed at the target or produced at the source. A reason I can imagine is
>> that the Mirror Maker supports multiple consumers but only one producer -
>> so consuming data on the way with the greater latency might be sped up by
>> the use of multiple consumers.
>> 
>> If performance because of multi threading is a point, would it be useful
>> to use several producer (one per consumer) to replicate the data (with a
>> custom replication process)? Does anyone knows why the Mirror Maker shares
>> a single producer among all consumers?
>> 
>> My usecase is the replication of data from several source cluster (~10) to
>> a single target cluster. I would prefer to run the replication process on
>> the source cluster to avoid to many replication processes (each for one
>> source) on the target cluster.
>> 
>> Hints and suggestions on this topic are very welcome.
>> 
>> Best regards
>>  Franz
>> 
>> If you would like to earn some SO recommendation points feel free to
>> answer this question on SO ;-)
>> https://stackoverflow.com/q/55122268/367285
>> 


Produce Message Conversions Per Sec

2019-03-04 Thread Peter Bukowinski
Greetings,

I have a concern about the produce message conversions per sec metrics of my 
kafka brokers. I have a go application that produces topics into a kafka 2.0.1 
cluster using confluent-kafka-go with librdkafka 0.11.6 (and 
log.message.format.version=2.0). The produce conversions/sec metrics are 
identical to the messages in/sec metric, which indicates all messages produced 
into this cluster are undergoing some sort of format conversion. I recently 
upgraded my brokers from kafka 1.0.1 and my clients from the 0.11.4 version of 
the library, but that had no effect on the produce conversion rate.

One thing to note in my configuration is that the producers are not configured 
to compress the messages, but compression.type is set to gzip on the brokers. 
In my scenario, I’d rather my brokers spend the CPU cycles on compression than 
my client. Is this responsible for the conversion metrics I’m seeing? If so, 
I’ll stop worrying since it is working as intended.

—
Peter Bukowinski

Re: Kafka partitioning and auto-scaling in k8s

2019-02-21 Thread Peter Bukowinski
I’ll assume when you say load, you mean data rate flowing into your kafka 
topic(s).

One instance can consume from multiple partitions, so on a variable load 
workflow, it’s a good idea to have more partitions than your average workload 
will require. When the data rate is low, fewer consumers will be able to handle 
multiple partitions each, with the group coordinator handling the distribution 
among them. When load spikes, more consumers will join the group and partitions 
will be reassigned across the larger pool.

-- Peter (from phone)

> On Feb 21, 2019, at 10:12 PM, Ali Nazemian  wrote:
> 
> Hi All,
> 
> I was wondering how an application can be auto-scalable if only a single
> instance can read from the single Kafka partition and two instances cannot
> read from the single partition at the same time with the same consumer
> group.
> 
> Suppose there is an application that has 10 instances running on Kubernetes
> in production at this moment (using the same consumer group) and we have
> got a Kafka topic with 10 partitions. Due to the increase in load,
> Kubernetes provision more instance to take the extra load. However, since
> the maximum number of consumers with the same consumer group can be 10 in
> this example, no matter how many new instances are created they are not
> able to address the extra load until partition number increases. Is there
> any out of the box solution to address this situation?
> 
> Thanks,
> Ali


Re: Any way to set a quota for a consumer group?

2019-02-21 Thread Peter Bukowinski
You can set the consumer client.id to be the same as the consumer group.id for 
all the consumers in your consumer group to accomplish this.
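
A sketch of the quota setup (zookeeper address, group name, and byte rate are
placeholders):

    # every consumer in the group sets client.id=my-group, then:
    kafka-configs.sh --zookeeper localhost:2181 --alter \
      --add-config 'consumer_byte_rate=1048576' \
      --entity-type clients --entity-name my-group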

—
Peter

> On Feb 21, 2019, at 7:56 AM, 洪朝阳 <15316036...@163.com> wrote:
> 
> It’s very great that Apache Kafka get a feature of setting quota since 0.9.
> https://kafka.apache.org/documentation/#design_quotas
> 
> But it’s not very perfect that this feature can only limit a specific
> client identified by a property “client.id” rather than a consumer group.
> 
> Is there any way to set a quota for a consumer group?
> 



Re: Lag checking from producer

2019-02-19 Thread Peter Bukowinski
From your description, it sounds like kafka may be ill-suited for your project. 
A backpressure mechanism essentially requires producers to be aware of 
consumers and that is counter to Kafka’s design. Also, it sounds like your 
producers are logical (if not actual) consumers of data generated by the 
consumers. I see a couple options:

1. Kafka has a quota system which can rate-limit producers. If you can predict 
the rate at which your consumers can ingest data from the Kafka cluster, 
keeping your producers to that rate would be more kafka-esque (haha) than 
bolting on a separate on/off flow mechanism.

2. If you need more control than a rate limiter, then you should probably 
introduce a new topic that your “consumers” produce into and that your 
“producers” consume from. If your producers then depend on new messages being 
available in this topic before they can produce new data, you can have a direct 
rate link between both sides.

Just spitballing, here. :)

—
Peter

> On Feb 19, 2019, at 9:25 AM, Filipp Zhinkin  wrote:
> 
> Hi,
> 
> thank you for the reply!
> 
> I'm developing a system where producers are spending money every time a
> request arrives.
> Consumers account for the money spent using the data from producers as well as
> a few other sources.
> Consumers are also responsible for calculating statistics that affect
> policies used by producers to spend money.
> 
> As a result, if consumers are temporarily lagging, then producers neither
> know if they can spend money, nor how best to do it.
> 
> There is some lag values that producers can tolerate, but if lag is
> growing further then producers have to stop.
> 
> Thanks,
> Filipp.
> 
> On Tue, Feb 19, 2019 at 7:44 PM Javier Arias Losada
>  wrote:
>> 
>> Hi,
>> 
>> could you please be more specific on your use case?
>> One of the theoretical advantages of a system like kafka is that you can
>> decouple producers and consumers, so you don't need to do backpressure.
>> A different topic is how to handle lagging consumers, in that scenario you
>> could scale up your service, etc.
>> 
>> Best.
>> 
>> El mar., 19 feb. 2019 a las 15:43, Filipp Zhinkin 
>> ()
>> escribió:
>> 
>>> Hi!
>>> 
>>> I'm trying to implement a backpressure mechanism that asks producers to
>>> stop doing any work when consumers are not able to process all
>>> messages in time (producers require statistics calculated by consumers
>>> in order to answer client requests, when consumers are lagging behind
>>> we have to stop producers from making any responses).
>>> 
>>> I see several ways to implement it:
>>> - compute lag on consumer side and store it somewhere (zk, some db, etc);
>>> - use separate service like Burrow;
>>> - compute lag on every producer by getting committed and end offsets
>>> for every partition via consumer API.
>>> 
>>> Are there any downsides to the latter approach? Would it negatively
>>> impact broker performance?
>>> 
>>> Thanks,
>>> Filipp.
>>> 



Re: Total Volume metrics of Kafka

2019-01-16 Thread Peter Bukowinski
On each broker, we have a process (scheduled with cron) that polls the kafka 
jmx api every 60 seconds. It sends the metrics data to graphite 
(https://graphiteapp.org). We have graphite configured as a data source for 
grafana (https://grafana.com) and use it to build various dashboards to present 
the metrics we’re interested in.

There are various jmx-to-graphite tools available. We use one written in house, 
but this one looks like it’ll do the job: https://github.com/logzio/jmx2graphite
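
For the every-60-seconds part, a plain crontab entry is enough, since cron's 
finest granularity is one minute (the script path here is hypothetical):

* * * * * /usr/local/bin/kafka-jmx-poll.sh >> /var/log/kafka-jmx-poll.log 2>&1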


> On Jan 16, 2019, at 2:15 PM, Amitav Mohanty  wrote:
> 
> Peter,
> 
> Thanks for the inputs. I am interested in aggregate bytes published into a
> topic. The approach of metrics collector along with graphing tool seems
> appealing. I can get the volume ingested over arbitrary periods of time, which is
> exactly what I am looking for. Can you please point to some metrics
> collector that I can use? Is it sort of a cron-job that notes the rate
> every minute or every 15 mins?
> 
> Regards,
> Amitav
> 
> On Thu, Jan 17, 2019 at 3:23 AM Peter Bukowinski  wrote:
> 
>> Amitav,
>> 
>> When you say total volume, do you want a topic’s size on disk, taking into
>> account replication and retention, or do you want the aggregate bytes
>> published into a topic? If you have a metrics collector and a graphing tool
>> such as grafana, you can transform the rate metrics to a byte sum by
>> applying an integral function, but those will always grow and not take into
>> account deletion after the retention period.
>> 
>> If you want metrics on how much space a topic occupies on disk, I’d
>> suggest using collectd and this plugin:
>> https://github.com/HubSpot/collectd-kafka-disk
>> 
>> —
>> Peter
>> 
>>> On Jan 16, 2019, at 1:12 PM, Amitav Mohanty 
>> wrote:
>>> 
>>> Hi
>>> 
>>> I am interested in getting total volume of data that a topic ingested in
>> a
>>> period of time. Does Kafka collect any such metrics? I checked the JMX console
>>> but I only found rate metrics.
>>> 
>>> Regards,
>>> Amitav
>> 
>> 



Re: Total Volume metrics of Kafka

2019-01-16 Thread Peter Bukowinski
Amitav,

When you say total volume, do you want a topic’s size on disk, taking into 
account replication and retention, or do you want the aggregate bytes published 
into a topic? If you have a metrics collector and a graphing tool such as 
grafana, you can transform the rate metrics to a byte sum by applying an 
integral function, but those will always grow and not take into account 
deletion after the retention period.
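
For example, in graphite (the metric path below is made up and depends on how 
your collector names things), a target like

integral(scale(kafka.broker-*.topics.my_topic.BytesInPerSec, 60))

turns a per-second rate sampled once a minute into a running byte total; 
scale() multiplies each sample by the 60-second interval before integral() 
does the cumulative sum.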

If you want metrics on how much space a topic occupies on disk, I’d suggest 
using collectd and this plugin: https://github.com/HubSpot/collectd-kafka-disk

—
Peter

> On Jan 16, 2019, at 1:12 PM, Amitav Mohanty  wrote:
> 
> Hi
> 
> I am interested in getting total volume of data that a topic ingested in a
> period of time. Does Kafka collect any such metrics? I checked the JMX console
> but I only found rate metrics.
> 
> Regards,
> Amitav



Re: Open Source Schema Registry

2018-10-23 Thread Peter Bukowinski
Have a look at https://github.com/confluentinc/schema-registry 
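
That project also ships Avro serializers, so you shouldn't need to write your 
own. A minimal sketch of the producer properties, assuming you use Confluent's 
KafkaAvroSerializer (the registry URL is a placeholder):

key.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer
value.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer
schema.registry.url=http://your-registry-host:8081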



> On Oct 23, 2018, at 9:28 AM, chinchu chinchu  wrote:
> 
> Hi folks,
> We are looking to use an open source schema registry with apache kafka 1.0.1
> and avro. *Do we need to write a serializer/deserializer similar to
> confluent's KafkaAvroSerializer to achieve this?* Our schemas are large, so
> one of the reasons we are looking to use the registry is to decrease
> the payload with every record. The consumers at any point in time will
> know what the schema is for each topic, because they are all internal to my
> team, so that is something we can work around for now.
> 
> Thanks,
> Chinchu



Re: When Kafka stores group information in zookeeper?

2018-10-22 Thread Peter Bukowinski
It all depends on which type of consumer you are using. If you use an old 
(original) consumer, you must specify one or more zookeepers since group 
management info is stored in zookeeper. If you use a new consumer, group 
management is handled by the kafka cluster itself so you must specify one or 
more brokers in the bootstrap-server list. Kafka has supported both original 
and new consumer styles since 0.9.

In summary, kafka stores consumer group info in zookeeper only if you are using 
the old consumer style. It is a consumer-specific setting entirely independent 
of topic configuration.

-- Peter

> On Oct 22, 2018, at 7:49 PM, 赖剑清  wrote:
> 
> Hi, Kafka users:
> 
> I tried to get information about topic-consumer groups using 
> kafka-consumer-groups.sh, and I found that the commands below return different info:
> ./kafka-consumer-groups.sh --list --zookeeper localhost:2181
> ./kafka-consumer-groups.sh --list --new-consumer --bootstrap-server 
> localhost:9092
> 
> I suppose the first command gets its data from zookeeper while the second one 
> gets it from the coordinator, and my question is:
> When does Kafka store group information in zookeeper? When in the coordinator?
> Is there any parameter I can specify while creating a topic or starting a new 
> consumer group to make sure this information is stored in the destination I expect?
> 
> Version of the broker is 0.9.0.1 and the client is 0.9.0.1 in Java.


Re: Visual tool for kafka?

2018-10-19 Thread Peter Bukowinski
https://github.com/yahoo/kafka-manager 

This does all that, and has admin features (which you can disable) that allow 
you to change or create topics, do partition reassignment and preferred leader 
election.

—
Peter

> On Oct 18, 2018, at 11:52 PM, 1095193...@qq.com wrote:
> 
> Hi 
>   I need a Visual tool for kafka. For example, A Web UI can display the 
> detail of each topics、the offset of each consumer. Has any recommended visual 
> tools?
> 
> 
> 
> 1095193...@qq.com



Re: multi-disk brokers data replication

2018-05-10 Thread Peter Bukowinski
Remember that topic partitions will not automatically move between brokers
or storage locations, so any topics that became under-replicated when the
node went down won't heal themselves. When the disk is replaced, you'll be
able to start the broker, and then it should begin doing log recovery and
catching up on its replication.
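
To watch it heal, you can check for under-replicated partitions (the zookeeper
address is a placeholder):

kafka-topics.sh --describe --under-replicated-partitions --zookeeper zk1:2181

Once that list is empty, the replaced broker has caught up.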

On Thu, May 10, 2018 at 12:06 PM, <andrianjar...@gmail.com> wrote:

>
> Thanks a lot for the explanation Peter, sounds like what I thought.
>
> I am just not sure I got the last part. So if a disk on such a broker
> fails, and we have kafka version <1, the whole broker dies ?
>
> What happens when the disk is replaced then ?
>
> > On May 10, 2018, at 20:42, Peter Bukowinski <pmb...@gmail.com> wrote:
> >
> > Oops, sorry about the name misspelling, Andrian. (spell-check just tried
> to
> > correct it again).
> >
> >> On Thu, May 10, 2018 at 11:41 AM, Peter Bukowinski <pmb...@gmail.com>
> wrote:
> >>
> >> Adrian,
> >>
> >> Replicas are *always* assigned to different brokers. You cannot, for
> >> example, deploy a single broker with a replication factor of 2 or 3
> (with
> >> min.insync.replicas of 2 or 3, respectively), even with multiple data
> >> directories.
> >>
> >> At the cluster level, kafka is not aware of an individual broker's
> storage
> >> topology (single or multiple storage locations). Topic partitions on a
> >> single, multi-data directory broker are distributed among storage
> locations
> >> in a round-robin manner.
> >>
> >> In a disk failure scenario, you will only lose one replica of all the
> >> topic partitions that existed on that disk, assuming you're running
> 1.0+.
> >> If you're not running 1.0+, then a single disk failure on a broker
> >> configured with JBOD will bring down the broker.
> >>
> >> Hope this helps,
> >>
> >> Peter Bukowinski
> >>
> >> On Thu, May 10, 2018 at 1:49 AM, Andrian Jardan <
> andrianjar...@gmail.com>
> >> wrote:
> >>
> >>> Hello everyone,
> >>>
> >>> I was wondering how data is spread across disks when more than 1 data
> >>> folder is specified on a broker ?
> >>>
> >>> I am specifically interested to understand if failure of 3 disks may
> lead
> >>> to data loss (with replication factor at 3)?
> >>>
> >>> Or is the data replicated so it resides on 3 brokers, and not 3
> different
> >>> data folders ?
> >>>
> >>> Thanks !
> >>>
> >>> —
> >>> Andrian Jardan
> >>> Infrastructure and DevOps expert
> >>> cell: +49 174 2815994
> >>> Skype: macrosdnb
> >>>
> >>>
> >>
>


Re: multi-disk brokers data replication

2018-05-10 Thread Peter Bukowinski
Oops, sorry about the name misspelling, Andrian. (spell-check just tried to
correct it again).

On Thu, May 10, 2018 at 11:41 AM, Peter Bukowinski <pmb...@gmail.com> wrote:

> Adrian,
>
> Replicas are *always* assigned to different brokers. You cannot, for
> example, deploy a single broker with a replication factor of 2 or 3 (with
> min.insync.replicas of 2 or 3, respectively), even with multiple data
> directories.
>
> At the cluster level, kafka is not aware of an individual broker's storage
> topology (single or multiple storage locations). Topic partitions on a
> single, multi-data directory broker are distributed among storage locations
> in a round-robin manner.
>
> In a disk failure scenario, you will only lose one replica of all the
> topic partitions that existed on that disk, assuming you're running 1.0+.
> If you're not running 1.0+, then a single disk failure on a broker
> configured with JBOD will bring down the broker.
>
> Hope this helps,
>
> Peter Bukowinski
>
> On Thu, May 10, 2018 at 1:49 AM, Andrian Jardan <andrianjar...@gmail.com>
> wrote:
>
>> Hello everyone,
>>
>> I was wondering how data is spread across disks when more than 1 data
>> folder is specified on a broker ?
>>
>> I am specifically interested to understand if failure of 3 disks may lead
>> to data loss (with replication factor at 3)?
>>
>> Or is the data replicated so it resides on 3 brokers, and not 3 different
>> data folders ?
>>
>> Thanks !
>>
>> —
>> Andrian Jardan
>> Infrastructure and DevOps expert
>> cell: +49 174 2815994
>> Skype: macrosdnb
>>
>>
>


Re: multi-disk brokers data replication

2018-05-10 Thread Peter Bukowinski
Adrian,

Replicas are *always* assigned to different brokers. You cannot, for
example, deploy a single broker with a replication factor of 2 or 3 (with
min.insync.replicas of 2 or 3, respectively), even with multiple data
directories.

At the cluster level, kafka is not aware of an individual broker's storage
topology (single or multiple storage locations). Topic partitions on a
single, multi-data directory broker are distributed among storage locations
in a round-robin manner.
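
A JBOD broker simply lists one data directory per disk in server.properties,
for example (paths are illustrative):

log.dirs=/data1/kafka-logs,/data2/kafka-logs,/data3/kafka-logs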

In a disk failure scenario, you will only lose one replica of all the topic
partitions that existed on that disk, assuming you're running 1.0+. If
you're not running 1.0+, then a single disk failure on a broker configured
with JBOD will bring down the broker.

Hope this helps,

Peter Bukowinski

On Thu, May 10, 2018 at 1:49 AM, Andrian Jardan <andrianjar...@gmail.com>
wrote:

> Hello everyone,
>
> I was wondering how data is spread across disks when more than 1 data
> folder is specified on a broker ?
>
> I am specifically interested to understand if failure of 3 disks may lead
> to data loss (with replication factor at 3)?
>
> Or is the data replicated so it resides on 3 brokers, and not 3 different
> data folders ?
>
> Thanks !
>
> —
> Andrian Jardan
> Infrastructure and DevOps expert
> cell: +49 174 2815994
> Skype: macrosdnb
>
>


Re: Kafka mirror maker help

2018-04-27 Thread Peter Bukowinski
I run instances of Mirror Maker as supervisord tasks (http://supervisord.org 
<http://supervisord.org/>). I’d recommend looking into it. In addition to 
letting you sidestep the service issue, supervisord watches the processes and 
can auto-restart them if they stop for any reason.
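
A minimal sketch of such a program section (paths, config files, and the 
whitelist are placeholders):

[program:mirror-maker]
command=/opt/kafka/bin/kafka-mirror-maker.sh --consumer.config /etc/kafka/mm-consumer.properties --producer.config /etc/kafka/mm-producer.properties --whitelist '.*'
autostart=true
autorestart=true
stopasgroup=true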

—
Peter Bukowinski

> On Apr 27, 2018, at 11:58 AM, Hans Jespersen <h...@confluent.io> wrote:
> 
> Sorry I hit send a bit too soon. I was so focused on the systemd part of
> the email and not the Mirror Maker part.
> Confluent packages include Mirror Maker but the systemd scripts are setup
> to use Confluent Replicator rather than Mirror Maker.
> My apologies.
> 
> -hans
> 
> /**
> * Hans Jespersen, Director Systems Engineering, Confluent Inc.
> * h...@confluent.io (650)924-2670
> */
> 
> On Fri, Apr 27, 2018 at 11:56 AM, Hans Jespersen <h...@confluent.io> wrote:
> 
>> The latest Confluent packages now ship with systemd scripts. That is since
>> Confluent Version 4.1 - which included Apache Kafka 1.1
>> 
>> -hans
>> 
>> /**
>> * Hans Jespersen, Director Systems Engineering, Confluent Inc.
>> * h...@confluent.io (650)924-2670
>> */
>> 
>> On Fri, Apr 27, 2018 at 11:15 AM, Andrew Otto <o...@wikimedia.org> wrote:
>> 
>>> Hiya,
>>> 
>>> Saravanan, I saw you emailed my colleague Alex about WMF’s old debian
>>> packaging.  I’ll reply here.
>>> 
>>> We now use Confluent’s Kafka debian packaging which does not (or did not?)
>>> ship with init scripts.  We don’t use Sys V init.d scripts anymore either,
>>> but use systemd instead.  Our systemd service unit (ERB template format)
>>> is
>>> here:
>>> 
>>> https://github.com/wikimedia/puppet/blob/production/modules/
>>> confluent/templates/initscripts/kafka-mirror-instance.systemd.erb
>>> 
>>> 
>>> 
>>> On Fri, Apr 27, 2018 at 1:35 AM, Amrit Jangid <jangid.ii...@gmail.com>
>>> wrote:
>>> 
>>>> You should share related info, such as source-destination Kafka versions,
>>>> sample Config or error if any.
>>>> 
>>>> FYI,  Go through
>>>> https://kafka.apache.org/documentation/#basic_ops_mirror_maker
>>>> 
>>> 
>> 
>> 



Re: Using Kafka CLI without specifying the URLs every single time?

2018-04-20 Thread Peter Bukowinski
One solution is to build wrapper scripts around the standard kafka scripts. 
You’d put your relevant cluster parameters (brokers, zookeepers) in a single 
config file (I like yaml), then your script would import that config file and 
pass the appropriate parameters to the kafka command. You could call the 
wrapper scripts by passing the name of the cluster as an argument and then 
passing the standard kafka options, e.g.

ktopics --cluster my_cluster --list
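
A bare-bones sketch of such a wrapper (the file layout is made up, and a real 
version would want argument checking):

#!/usr/bin/env bash
# ktopics: wraps kafka-topics.sh; expects "--cluster NAME" as the first two args.
# Assumes /etc/kafka/clusters/NAME.env defines ZK (the zookeeper connect string).
cluster="$2"
shift 2
. "/etc/kafka/clusters/${cluster}.env"
exec kafka-topics.sh --zookeeper "$ZK" "$@"

The same pattern works for the other kafka scripts, passing --bootstrap-server 
or --broker-list from the same per-cluster file.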


-- Peter Bukowinski

> On Apr 20, 2018, at 3:23 AM, Horváth Péter Gergely 
> <horvath.peter.gerg...@gmail.com> wrote:
> 
> Hello All,
> 
> I'm wondering if there is any way to avoid having to enter the host URLs for
> each Kafka CLI command you execute.
> 
> This is kind of tedious as different CLI commands require specifying
> different servers (--broker-list, --bootstrap-server and --zookeeper);
> which is especially painful if the host names are long, and only slightly
> different (e.g. naming scheme for AWS:
> ec2-12-34-56-2.region-x.compute.amazonaws.com).
> 
> I know I could simply export shell variables for each type of endpoint and
> refer that in the command, but that still only eases the pain:
> export KAFKA_ZK=ec2-12-34-56-2.region-x.compute.amazonaws.com
> bin/kafka-topics.sh --list --zookeeper ${KAFKA_ZK}
> 
> Is there by any chance a better way of doing this I am not aware of?
> Technically I am looking for some solution where I don't have to remember
> that a Kafka CLI command expects --broker-list, --bootstrap-server or
> --zookeeper, but can specify these settings once.
> 
> Thanks,
> Peter


Re: Default kafka log.dir /tmp | tmp-file-cleaner process

2018-04-18 Thread Peter Bukowinski
I believe that default parameters should help people new to a product get a 
test instance running relatively easily. Setting the log.dir to /tmp aligns 
with this philosophy.

Once you’re out of a testing phase, you’ll hopefully be familiar enough with 
the product to set appropriate values for config parameters.

That being said, perhaps this particular config parameter should be highlighted 
for update when moving to production.
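
Something as simple as this in server.properties (the path is just an example) 
avoids the tmp-cleaner problem entirely:

log.dirs=/var/lib/kafka/data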

-- Peter Bukowinski

> On Apr 18, 2018, at 11:27 AM, adrien ruffie <adriennolar...@hotmail.fr> wrote:
> 
> Hi Marc,
> 
> 
> I think it depends rather on the "log.dirs" parameter, because that parameter 
> is the one preferred in most cases; the "log.dir" parameter is secondary.
> 
> 
> log.dirs: The directories in which the log data is kept. If not set, the 
> value in log.dir is used
> 
> 
> log.dir: The directory in which the log data is kept (supplemental for 
> log.dirs property)
> 
> 
> It's obvious that the "log.dirs" parameter should be used in preference to 
> "log.dir".
> 
> 
> and "log.dir" is just there to make sure that writing is possible in a 
> place where permissions usually allow it.
> 
> 
> best regards,
> 
> 
> Adrien
> 
> 
> De : Marc van den Bogaard <mailingl...@neozo.de>
> Envoyé : mercredi 18 avril 2018 17:43:07
> À : users@kafka.apache.org
> Objet : Default kafka log.dir /tmp | tmp-file-cleaner process
> 
> Hey guys,
> 
> when I look in the kafka documentation 
> (https://kafka.apache.org/documentation/ 
> <https://kafka.apache.org/documentation/>) the default log.dir for the kafka 
> logs is /tmp.
> 
> Could someone please tell me why? Because if you don’t change this you 
> probably get some issues regarding the tmp-file-cleaner process
> which is running on most of the nix-systems and deletes files under /tmp 
> (e.g. files older than 10 days which were not touched). We already had some 
> problems where segment files were removed, which caused kafka to crash. So we 
> changed this configuration to something like /var/lib/kafka/data/… I didn’t 
> find anyone else with this problem, nor information regarding this.
> 
> 
> Best regards
> 
> Marc


Re: log retention bytes and log segment bytes

2018-04-12 Thread Peter Bukowinski
Hi Amit,

This is from the broker config section of the very good documentation on the 
kafka web site: https://kafka.apache.org/0100/documentation.html#brokerconfigs

log.segment.bytes: The maximum size of a single log file (default 1GB)

log.retention.bytes: The maximum size of the log before deleting it (default 
unlimited)

My explanation:

The log.segment.bytes parameter refers to individual segments of the whole 
topic partition log on the broker. Kafka will create a new log segment for the 
partition once the segment limit is reached.

The log.retention.bytes parameter refers to the cumulative size of all the 
segments that make up a single partition's log. If this value is set, then 
kafka will purge that partition's oldest segments until its total size falls 
beneath the maximum.

In your example, you will get a new log segment after only 100 bytes, and each 
partition can have up to 50 segments until its total size reaches 5000 bytes. 
At that point the oldest segments will start to be purged.
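
More typical production values look something like this (numbers are 
illustrative only):

# roll a new segment every 1 GiB
log.segment.bytes=1073741824
# keep roughly 10 GiB of segments (applies per partition, not per broker)
log.retention.bytes=10737418240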

-- Peter (from phone)

> On Apr 12, 2018, at 10:10 AM, amit mishra  wrote:
> 
> Hi all ,
> 
> I am using kafka 0.10.
> 
>log.retention.bytes = 5000
>log.retention.check.interval.ms = 6000
>log.retention.hours = 24
>log.retention.minutes = null
>log.retention.ms = null
>log.roll.hours = 168
>log.roll.jitter.hours = 0
>log.roll.jitter.ms = null
>log.roll.ms = null
>log.segment.bytes = 100
> 
> Please let me know what does log.retention.bytes and log.segment.bytes
> denotes ?
> 
> Regards,
> amit


Re: How frequent does the follower replica issue a fetch request to leader replica?

2018-04-11 Thread Peter Bukowinski
Hi Yu,

The broker property ‘replica.fetch.wait.max.ms’ determines the longest interval 
a follower should wait before issuing a fetch request of the leader. In reality 
these usually happen much more frequently, but it depends on the rate at which 
producers write to kafka topics (among other factors).

The request rate of all your replica fetchers is one of the metrics that kafka 
reports. You can find it under kafka.server,type=replica-fetcher-metrics. Many 
metrics are published for each leader and for each fetcher assigned to that 
leader (if you have more than one configured in the num.replica.fetchers broker 
parameter).

For one of my small 5-broker clusters, my follower request rates range from 130 
per second on the high end to 3 per second on the low end. I’ve configured 16 
fetchers.
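
For reference, the relevant broker settings (the values shown are just the 
ones discussed above, not recommendations):

# max wait time for each fetch request issued by follower replicas (default 500 ms)
replica.fetch.wait.max.ms=500
# number of fetcher threads used to replicate from each source broker (default 1)
num.replica.fetchers=16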

--
Peter Bukowinski

> On Apr 10, 2018, at 10:54 PM, Yu Watanabe <yu.w.ten...@gmail.com> wrote:
> 
> Hello.
> 
> I would like to ask a question regarding the fetch request from follower to
> leader replica.
> 
> The online documentation describes the flow of how fetch requests
> are used for the sync operation.
> 
> https://kafka.apache.org/documentation/#replication
> 
> "Followers consume messages from the leader just as a normal Kafka consumer
> would and apply them to their own log.
> Having the followers pull from the leader has the nice property of allowing
> the follower to naturally batch together log entries they are applying to
> their log."
> 
> However,  I could not find the interval or frequency of the fetch request.
> How often does follower replica issues fetch request to leader ?
> 
> Thanks,
> Yu
> 
> -- 
> Yu Watanabe
> 渡辺 裕
> 
> LinkedIn : jp.linkedin.com/in/yuwatanabe1


Re: Kafka Brokers compatibility with Jumbo Frames option.

2018-03-31 Thread Peter Bukowinski
Hi Alex,

Jumbo frames are a function the network layer. Kafka operates at a higher level 
in the stack and is thus not aware of ethernet frame sizes. It will work fine 
as long as all network interfaces and devices in your kafka data path are set 
to support jumbo frames. If the end-to-end is not set to jumbo frames, you will 
most likely see dropped or fragmented packets.
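
A quick end-to-end check from one broker to another on Linux (the hostname is 
a placeholder) is a do-not-fragment ping sized for a 9000-byte MTU:

ping -M do -s 8972 other-broker.example.com

8972 bytes of payload plus 28 bytes of IP/ICMP headers makes exactly 9000 
bytes; if any hop in the path has a smaller MTU, the ping fails instead of 
fragmenting.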

—
Peter Bukowinski


> On Mar 31, 2018, at 6:51 AM, Pena Quijada Alexander <a.penaquij...@reply.it> 
> wrote:
> 
> Hi all,
> 
> My name is Alexander, a Linux System Administrator. I'm using Kafka for 
> Big Data purposes, and I'm wondering if the Kafka broker servers are 
> compatible with the Jumbo Frames option enabled on the network cards that we 
> use to communicate with the cluster? We wish to enable this feature in 
> order to increase our network performance and set the MTU parameter to 9000.
> 
> At the moment we use Kafka version: 0.10.0.
> 
> Many thanks in advance for the kind cooperation.
> 
> Kind regards and happy Easter!
> 
> Alexander Pena Quijada
> 
> 
> 
> 
> --
> The information transmitted is intended for the person or entity to which it 
> is addressed and may contain confidential and/or privileged material. Any 
> review, retransmission, dissemination or other use of, or taking of any 
> action in reliance upon, this information by persons or entities other than 
> the intended recipient is prohibited. If you received this in error, please 
> contact the sender and delete the material from any computer.



Re: Number of partitions for offsets topic cannot be changed

2018-03-25 Thread Peter Bukowinski
Once the offsets topic has been created, changing any server.properties 
settings affecting it will have no effect. If you are pre-production, you can 
start over using the new server properties.

By start over, I mean stopping brokers, purging the kafka logs/data directories 
of all brokers, purging cluster data from zookeepers, then starting brokers 
again with your desired server properties settings.
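
If you do start over, set the partition count in server.properties before the 
brokers create the offsets topic for the first time (54 below just mirrors the 
value from this thread):

offsets.topic.num.partitions=54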

-- Peter Bukowinski

> On Mar 25, 2018, at 1:01 PM, Anu P <iamkafkau...@gmail.com> wrote:
> 
> Thanks Swapnil.
> 
> I changed *offsets.topic.num.partitions* in the server.properties file and
> restarted the broker, but the config change still does not take effect.
> 
> Any ideas?
> 
> Thanks in advance
> Tanvi
> 
> On Sun, Mar 25, 2018 at 12:30 AM, Swapnil Gupta <neomatrix1...@gmail.com>
> wrote:
> 
>> In brief, this is system level configuration by Kafka.
>> 
>> Consumer offsets partitions can't be changed through command line.
>> 
>> You have to change the configuration file and set this
>> *offsets.topic.num.partitions* property to change this.
>> 
>>> On Sun, Mar 25, 2018, 12:49 Anu P <iamkafkau...@gmail.com> wrote:
>>> 
>>> Hi,
>>> 
>>> I am trying to change the number of partitions for __consumer_offsets
>> topic
>>> by using the following command. However, I get an error stating "Number
>> of
>>> partitions for offsets topic cannot be changed"
>>> 
>>> 
>>> */opt/kafka/bin/kafka-topics.sh --zookeeper  --alter --topic
>>> __consumer_offsets --partitions 54   *
>>> 
>>> 
>>> Error while executing topic command : The number of partitions for the
>>> offsets topic cannot be changed.
>>> ERROR java.lang.IllegalArgumentException: The number of partitions for
>> the
>>> offsets topic cannot be changed.
>>> at
>>> 
>>> kafka.admin.TopicCommand$$anonfun$alterTopic$1.apply(
>> TopicCommand.scala:142)
>>> .
>>> 
>>> 
>>> 
>>> Environment details:
>>> 
>>>   1. 3 zookeepers and 3 kafka brokers (version 1.0.1)
>>>   2. After deploying kafka, I tried the kafka-topics command (above) to
>>>   increase it partitions. But it fails. Can someone please help me
>>> understand
>>>   why? Am I missing something?
>>>   3. I want to distribute partitions leaders equally among brokers. So
>>>   want to increase the number of partitions for offsets topic.
>>> 
>>> 
>>> Thanks in advance.
>>> 
>>> Tanvi
>>> 
>>