I have a similar issue, let me know how it goes. :)

-----Original Message-----
From: Andrew Otto [mailto:ao...@wikimedia.org] 
Sent: Wednesday, May 27, 2015 3:12 PM
To: users@kafka.apache.org
Subject: Kafka partitions unbalanced

Hi all,

I’ve recently noticed that our broker log.dirs are using up different amounts 
of storage.  We use JBOD for our brokers, with 12 log.dirs, 1 on each disk.  
One of our topics is larger than the others, and has 12 partitions.  
Replication factor is 3, and we have 4 brokers.  Each broker then has to store 
9 partitions for this topic (12*3/4 == 9).

I guess I had originally assumed that Kafka would be smart enough to spread 
partitions for a given topic across each of the log.dirs as evenly as it could. 
 However, on some brokers this one topic has 2 partitions in a single log.dir, 
meaning that the storage taken up on a single disk by this topic on those 
brokers is twice what it should be.

e.g.

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       1.8T  1.2T  622G  66% /var/spool/kafka/a
/dev/sdb3       1.8T  1.7T  134G  93% /var/spool/kafka/b
…
$ du -sh /var/spool/kafka/{a,b}/data/webrequest_upload-*
501G    a/data/webrequest_upload-4
500G    b/data/webrequest_upload-11
501G    b/data/webrequest_upload-8


This also means that those overpopulated disks have more writes to do.  My I/O 
is imbalanced!

This is sort of documented at http://kafka.apache.org/documentation.html:

"If you configure multiple data directories partitions will be assigned 
round-robin to data directories. Each partition will be entirely in one of the 
data directories. If data is not well balanced among partitions this can lead 
to load imbalance between disks."

But my data is well balanced among partitions!  It’s just that multiple 
partitions are assigned to a single disk.
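
A minimal sketch of how this can happen, assuming the broker places each new 
partition in whichever log.dir currently holds the fewest partition 
directories, counted across all topics and ignoring size on disk. That 
placement rule, the three-directory layout, and the topic names "big" and 
"small" are illustrative assumptions, not taken from a real broker:

```python
from collections import Counter

def assign_log_dirs(creation_order, log_dirs):
    """Place each partition in the dir that currently holds the fewest partitions.

    Assumption: the choice is by partition COUNT over all topics, not by bytes
    and not per topic. Ties go to the earliest directory in log_dirs.
    """
    counts = {d: 0 for d in log_dirs}
    placement = {}
    for partition in creation_order:
        target = min(log_dirs, key=lambda d: counts[d])
        counts[target] += 1
        placement[partition] = target
    return placement

def per_dir_for_topic(placement, topic):
    """Count how many partitions of one topic ended up in each directory."""
    return Counter(d for (t, _), d in placement.items() if t == topic)

# Hypothetical creation order: partitions of a small topic are created in
# between partitions of the big one.
order = [("big", 0), ("small", 0), ("small", 1), ("big", 1), ("big", 2)]
placement = assign_log_dirs(order, ["a", "b", "c"])
big_counts = per_dir_for_topic(placement, "big")
print(placement)
print(big_counts)
```

In this toy run the per-directory partition counts are as even as they can be 
(2, 2, 1), yet two "big" partitions share directory a and directory c gets 
none, which is the same shape as the imbalance in the df output above.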

Anyyyyyyway, on to a question:  Is it possible to move partitions between 
log.dirs?  Is there tooling to do so?  Poking around in there, it looks like it 
might be as simple as shutting down the broker, moving the partition directory, 
and then editing both replication-offset-checkpoint and 
recovery-point-offset-checkpoint files so that they say the appropriate things 
in the appropriate directories, and then restarting the broker.
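
The checkpoint-editing step could be sketched roughly as below. This assumes 
each checkpoint file is plain text with a version line, an entry-count line, 
and then one "topic partition offset" line per partition; that layout, the 
helper names, and the offsets in the example are assumptions for illustration 
and have not been checked against a live broker. The sketch only transforms 
file contents held in strings. It does not move any partition directory, and 
it would only make sense with the broker shut down.

```python
def parse_checkpoint(text):
    """Parse assumed layout: version, entry count, then 'topic partition offset' lines."""
    lines = text.strip().splitlines()
    version = int(lines[0])
    expected = int(lines[1])
    entries = {}
    for line in lines[2:]:
        topic, partition, offset = line.split()
        entries[(topic, int(partition))] = int(offset)
    if len(entries) != expected:
        raise ValueError("entry count does not match header")
    return version, entries

def format_checkpoint(version, entries):
    """Serialize back to the same assumed layout, entries sorted for stable output."""
    lines = [str(version), str(len(entries))]
    lines += [f"{t} {p} {o}" for (t, p), o in sorted(entries.items())]
    return "\n".join(lines) + "\n"

def move_entry(src_text, dst_text, topic, partition):
    """Move one partition's entry from the source checkpoint to the destination."""
    src_version, src = parse_checkpoint(src_text)
    dst_version, dst = parse_checkpoint(dst_text)
    if src_version != dst_version:
        raise ValueError("checkpoint versions differ")
    # Raises KeyError if the partition is not listed in the source file.
    dst[(topic, partition)] = src.pop((topic, partition))
    return format_checkpoint(src_version, src), format_checkpoint(dst_version, dst)

# Made-up offsets: move webrequest_upload-8 from disk b's checkpoint to disk a's.
b_before = "0\n2\nwebrequest_upload 8 100\nwebrequest_upload 11 200\n"
a_before = "0\n1\nwebrequest_upload 4 50\n"
b_after, a_after = move_entry(b_before, a_before, "webrequest_upload", 8)
print(b_after)
print(a_after)
```

The same transformation would have to be applied to both 
replication-offset-checkpoint and recovery-point-offset-checkpoint in each of 
the two directories, alongside moving the partition directory itself.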

Someone tell me that this is a horrible idea. :)

-Ao

