Re: Problems trying to make kafka 'rack-aware'

2018-09-21 Thread Bryan Duggan

Hi Eno,

Many thanks for trying that; that was very helpful.

That basic check didn't work for me, but I have since discovered what my 
issue was. Despite using a version of Kafka that supports rack-awareness, 
we had been deliberately setting 'inter.broker.protocol.version' to an 
older version (due to various issues with some of our consumers). When I 
updated this parameter to a later version, I could see 'rack' being 
written to ZooKeeper.
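
For reference, a minimal sketch of the relevant broker settings (the 
broker id, rack name and version value below are illustrative 
placeholders, not our actual config):

    # server.properties (illustrative)
    broker.id=1234567
    broker.rack=rack1
    # With this pinned to a pre-0.10.x value the broker registers itself in
    # ZooKeeper using an older JSON format that has no 'rack' field; setting
    # it to 0.10.x or later lets the rack be written.
    inter.broker.protocol.version=1.1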


For now I need to turn my attention to resolving the issues with my 
consumers.


Thanks again for helping out.

Bryan



On 21/09/2018 14:52, Eno Thereska wrote:

Hi Bryan,

I did a simple check: I started a broker with no rack id, then restarted
it with a rack id, and I can confirm I could get the rack id from
ZooKeeper after the restart. This was on trunk. Does that basic check work
for you (i.e., without reassigning partitions)?
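
(For anyone wanting to repeat that basic check, something along these
lines with the zookeeper-shell tool that ships with Kafka should print
the broker's registration JSON; broker id 0 is just an example:

    ./zookeeper-shell.sh $ZK get /brokers/ids/0

The output should include a "rack" key once the broker has re-registered
with one.)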

Thanks
Eno

On Fri, Sep 21, 2018 at 2:07 PM, Bryan Duggan wrote:


I didn't get a response to this, but I've been investigating more and can
now frame the problem slightly differently (hopefully, more accurately).

According to this document, which defines the broker data structures in ZooKeeper:

https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper

the following is the broker schema (from version 0.10 onwards; I am using version 0.11):

{ "fields":
 [ {"name": "version", "type": "int", "doc": "version id"},
   {"name": "host", "type": "string", "doc": "ip address or host name
of the broker"},
   {"name": "port", "type": "int", "doc": "port of the broker"},
   {"name": "jmx_port", "type": "int", "doc": "port for jmx"}
   {"name": "endpoints", "type": "array", "items": "string", "doc":
"endpoints supported by the broker"}
   {"name": "rack", "type": "string", "doc": "Rack of the broker.
Optional. This will be used in rack aware replication assignment for fault
tolerance."}
 ]
}

When I check my broker data in ZooKeeper (the broker has a non-null
broker.rack setting in its properties file), I see the following:

{"endpoints":["PLAINTEXT://x.x.x.x.abcd:9092"],"jmx_port":-1,"host":"x.x.x.x.abc","timestamp":"1537527988341","port":9092,"version":2}

There is no 'rack' field.
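
For comparison, my expectation from the schema above is that a rack-aware
registration would carry a 'rack' key as well, along these lines
(illustrative; the exact 'version' value and any additional fields may differ):

{"endpoints":["PLAINTEXT://x.x.x.x.abcd:9092"],"jmx_port":-1,"host":"x.x.x.x.abc","timestamp":"1537527988341","port":9092,"version":4,"rack":"rack1"}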

In the server.log file on my Kafka broker I see:

[2018-09-21 13:00:40,227] INFO KafkaConfig values:
 advertised.host.name = null
 .
 .
 broker.id = 1234567
 broker.rack = rack1
 compression.type = producer
 .
-

So it looks fine from the broker side. However, when I restart Kafka on
the host, it doesn't write any rack information into ZooKeeper.

Can someone please confirm: if I have rack awareness configured, should I
expect to see a value for 'rack' in ZooKeeper? If so, do I need to do
something else on the broker side to get it included in the metadata it
writes (as far as I can see, it writes the metadata each time Kafka is
restarted)?

thanks
Bryan








On 20/09/2018 11:31, Bryan Duggan wrote:


Hi,

I have a kafka cluster consisting of 3 brokers across 3 different AWS
availability zones.  It hosts several topics, each of which has a
replication factor of 3. The cluster is currently not 'rack-aware'.

I am trying to do the following:

 - add 3 additional brokers (one in each of the 3 AZs)

 - make the cluster 'rack-aware' (i.e. create 3 racks on a per-AZ
basis, each containing 2 brokers)

 - reassign the topics with the intention of having 1 replica in each
of the 3 racks.

To achieve this I've added 'broker.rack' to the properties file for each
broker. The rack name is the same as the AZ name each broker is in. I've
restarted kafka on all brokers (in case that's required for rack-awareness
to take effect).
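
(For illustration, with made-up AZ names: a broker in the first AZ gets

    broker.rack=eu-west-1a

in its properties file, a broker in the second AZ gets
broker.rack=eu-west-1b, and so on, each broker naming only its own AZ.)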

Following the restart I attempted to reassign topics across all 6 brokers
by running the following:

 - ./kafka-reassign-partitions.sh --zookeeper $ZK
--topics-to-move-json-file topics-to-move.json --broker-list '1,2,3,4,5,6'

(where topics-to-move.json is a simple json file containing the topics to
reassign)
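
For completeness, a minimal topics-to-move.json of that shape, with
made-up topic names, looks something like:

    {"version": 1,
     "topics": [
       {"topic": "topic-a"},
       {"topic": "topic-b"}
     ]
    }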

The problem I am having is that, after running 'kafka-reassign-partitions.sh'
with all 6 brokers in the broker-list, it doesn't honour rack-awareness:
it assigns 2 of a partition's 3 replicas to brokers in a single rack,
with the 3rd being assigned elsewhere.
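
One way to see the resulting placement is to describe a topic and compare
each partition's 'Replicas' list against the broker-to-rack mapping, e.g.
(the topic name is a placeholder):

    ./kafka-topics.sh --zookeeper $ZK --describe --topic topic-a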

The version of Kafka I am using is 2.11-1.1.1 (i.e. Kafka 1.1.1 built for Scala 2.11).

All the documentation I've read suggests the above should have achieved what
I want; however, it is not working as expected.

Has anyone else made their Kafka cluster 'rack-aware'? If so, did you
experience any issues doing so?

Or can anyone tell me if there's a step I'm missing to make this work?

TIA

Bryan









