Hi Eno,

Many thanks for trying that. It was very helpful.

That basic check didn't work for me, but I have since discovered what my issue was. Despite using a version of kafka that supports rack-awareness, we have been deliberately setting 'inter.broker.protocol.version' to an older version (due to various issues with some of our consumers). When I update this parameter to a later version, I can see 'rack' being written to zookeeper.
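
In case it helps anyone searching the archive later, here is a minimal sketch of a check for this combination of settings. It assumes the properties file is plain key=value text, and the 0.10 threshold is my assumption (rack awareness arrived with Kafka 0.10.0); the function names are made up for illustration and are not part of any Kafka tooling.

```python
# Assumption: brokers only write 'rack' to zookeeper when the inter-broker
# protocol version is at least 0.10. Adjust if your version differs.
RACK_AWARE_SINCE = (0, 10)


def parse_properties(text):
    """Parse key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def version_tuple(version):
    """Turn '0.10.0-IV1' into (0, 10, 0) and '1.1' into (1, 1)."""
    numeric = version.split("-")[0]
    return tuple(int(part) for part in numeric.split(".") if part.isdigit())


def rack_may_be_dropped(props):
    """True when broker.rack is set but the pinned protocol version is too old."""
    rack = props.get("broker.rack")
    proto = props.get("inter.broker.protocol.version")
    if not rack or not proto:
        return False
    return version_tuple(proto) < RACK_AWARE_SINCE


if __name__ == "__main__":
    sample = "broker.id=1234567\nbroker.rack=rack1\ninter.broker.protocol.version=0.9.0\n"
    print(rack_may_be_dropped(parse_properties(sample)))
```

If the protocol version is not pinned at all, the sketch reports nothing, since the broker then uses the default for its own release.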

For now I need to turn my attention to resolving the issues with my consumers.

Thanks again for helping out.

Bryan



On 21/09/2018 14:52, Eno Thereska wrote:
Hi Bryan,

I did a simple check with starting a broker with no rack id and then
restarting with a rack id and I can confirm I could get the rack id from
zookeeper after the restart. This was on trunk. Does that basic check work
for you (i.e., without reassigning partitions)?

Thanks
Eno

On Fri, Sep 21, 2018 at 2:07 PM, Bryan Duggan <bryan.dug...@boxever.com>
wrote:

I didn't get a response to this, but I've been investigating more and can
now frame the problem slightly differently (hopefully, more accurately).

According to this document

https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper

Which defines broker data structures in zookeeper, the following is the
broker schema (from version 0.10 onwards - I am using version 0.11)

{ "fields":
     [ {"name": "version", "type": "int", "doc": "version id"},
       {"name": "host", "type": "string", "doc": "ip address or host name of the broker"},
       {"name": "port", "type": "int", "doc": "port of the broker"},
       {"name": "jmx_port", "type": "int", "doc": "port for jmx"},
       {"name": "endpoints", "type": "array", "items": "string", "doc": "endpoints supported by the broker"},
       {"name": "rack", "type": "string", "doc": "Rack of the broker. Optional. This will be used in rack aware replication assignment for fault tolerance."}
     ]
}

when I check my broker data in zookeeper (which has a non-null broker.rack
setting in the properties file), I have the following;

{"endpoints":["PLAINTEXT://x.x.x.x.abcd:9092"],"jmx_port":-1,"host":"x.x.x.x.abc","timestamp":"1537527988341","port":9092,"version":2}

there is no 'rack'.
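
The same check can be scripted. This is a minimal sketch that assumes the registration JSON has already been read from zookeeper (the broker entries live under /brokers/ids/<broker id>) and is available as a string; the function name is mine, not something from Kafka.

```python
import json


def broker_rack(registration_json):
    """Return the 'rack' value from a broker registration, or None if absent."""
    data = json.loads(registration_json)
    return data.get("rack")


if __name__ == "__main__":
    # The registration shown above, as one line.
    registration = (
        '{"endpoints":["PLAINTEXT://x.x.x.x.abcd:9092"],"jmx_port":-1,'
        '"host":"x.x.x.x.abc","timestamp":"1537527988341","port":9092,"version":2}'
    )
    rack = broker_rack(registration)
    print("rack:", rack if rack is not None else "not registered")
```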

In the server.log file on my kafka broker I see;
----
[2018-09-21 13:00:40,227] INFO KafkaConfig values:
     advertised.host.name = null
     .
     .
     broker.id = 1234567
     broker.rack = rack1
     compression.type = producer
     .
-----

so it looks fine from the broker side. However, when I restart kafka on
the host, it doesn't load any rack information into zookeeper.

Can someone please confirm: if I have rack awareness configured, should I
expect to see a value for 'rack' in zookeeper? If so, do I need to do
something else on the broker side to get it to include the rack as part of
the metadata it writes? (As far as I can see, it writes the metadata each
time kafka is restarted.)

thanks
Bryan








On 20/09/2018 11:31, Bryan Duggan wrote:

Hi,

I have a kafka cluster consisting of 3 brokers across 3 different AWS
availability zones.  It hosts several topics, each of which has a
replication factor of 3. The cluster is currently not 'rack-aware'.

I am trying to do the following;

     - add 3 additional brokers (one in each of the 3 AZs)

     - make the cluster 'rack-aware' (i.e. create 3 racks on a per-AZ
basis, each containing 2 brokers)

     - reassign the topics with the intention of having 1 replica in each
of the 3 racks.

To achieve this I've added 'broker.rack' to the properties file for each
broker. The rack name is the same as the AZ name each broker is in. I've
restarted kafka on all brokers (in case that's required for rack-awareness
to take effect).

Following restart I've attempted to reassign topics across all 6 brokers
by running the following;

     - ./kafka-reassign-partitions.sh --zookeeper $ZK --topics-to-move-json-file topics-to-move.json --broker-list '1,2,3,4,5,6'

(where topics-to-move.json is a simple json file containing the topics to
reassign)
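
For reference, a minimal sketch of how such a file can be produced. The layout (a "topics" list of {"topic": name} objects plus "version": 1) is the one the reassignment tool documents; the topic names below are placeholders, not my real topics.

```python
import json


def topics_to_move(topic_names):
    """Build the document expected by --topics-to-move-json-file."""
    return {
        "topics": [{"topic": name} for name in topic_names],
        "version": 1,
    }


if __name__ == "__main__":
    # Placeholder topic names, for illustration only.
    doc = topics_to_move(["topic-a", "topic-b"])
    with open("topics-to-move.json", "w") as fh:
        json.dump(doc, fh, indent=2)
    print(json.dumps(doc))
```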

The problem I am having is that, after running 'kafka-reassign-partitions.sh'
with 6 brokers listed in the broker-list, it doesn't honour
rack-awareness, and instead assigns 2 replicas of a partition to brokers in
a single rack, with the 3rd being assigned elsewhere.
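
To spot this in a proposed assignment without reading it by eye, something like the following can be used. It is only a sketch: it assumes the plan is in the reassignment JSON layout ({"version": 1, "partitions": [{"topic", "partition", "replicas"}]}), and the broker-to-rack mapping and topic name in the example are hypothetical.

```python
import json
from collections import Counter


def rack_violations(plan_json, broker_racks):
    """Return (topic, partition) pairs that have 2+ replicas in the same rack."""
    plan = json.loads(plan_json)
    violations = []
    for entry in plan["partitions"]:
        racks = Counter(broker_racks[broker] for broker in entry["replicas"])
        if any(count > 1 for count in racks.values()):
            violations.append((entry["topic"], entry["partition"]))
    return violations


if __name__ == "__main__":
    # Hypothetical layout: two brokers in each of three AZs.
    broker_racks = {1: "az-a", 4: "az-a", 2: "az-b", 5: "az-b", 3: "az-c", 6: "az-c"}
    plan = json.dumps({
        "version": 1,
        "partitions": [
            {"topic": "events", "partition": 0, "replicas": [1, 4, 2]},
            {"topic": "events", "partition": 1, "replicas": [1, 2, 3]},
        ],
    })
    print(rack_violations(plan, broker_racks))
```

In the example, partition 0 is flagged because brokers 1 and 4 share a rack, while partition 1 has one replica per rack.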

The version of kafka I am using is 2.11-1.1.1.

Any documentation I've read suggests the above should have achieved what
I want. However, it is not working as expected.

Has anyone else made their kafka cluster 'rack-aware'? If so, did you
experience any issues doing so?

Or can anyone tell me if there's some step I'm missing to make this work?

TIA

Bryan




