Jay Kreps created KAFKA-2063:
--------------------------------

             Summary: Bound fetch response size
                 Key: KAFKA-2063
                 URL: https://issues.apache.org/jira/browse/KAFKA-2063
             Project: Kafka
          Issue Type: Improvement
            Reporter: Jay Kreps


Currently the only bound on the fetch response size is 
max.partition.fetch.bytes * num_partitions. There are two problems:
1. This bound is often large. You may choose max.partition.fetch.bytes=1MB 
to enable messages of up to 1MB. However, if you also need to consume 1k 
partitions, you may receive a 1GB response in the worst case!
2. The actual memory usage is unpredictable. Partition assignment changes, and 
you only actually get the full fetch amount when you are behind and there is a 
full chunk of data ready. This means an application that seems to work fine 
will suddenly OOM when partitions shift or when the application falls behind.
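
To make the arithmetic in (1) concrete, here is a small sketch of the current 
worst-case bound. The function name is hypothetical and 1MB is taken as 
10^6 bytes for simplicity; it only illustrates the multiplication described 
above, not any actual Kafka code.

```python
def worst_case_fetch_bytes(max_partition_fetch_bytes: int, num_partitions: int) -> int:
    """Upper bound on a single fetch response under the current scheme.

    Hypothetical helper: every partition in the request may return up to
    max_partition_fetch_bytes, so the bound grows linearly with partitions.
    """
    return max_partition_fetch_bytes * num_partitions


MB = 1_000_000  # assumption: decimal megabytes, to match the "1MB * 1k = 1GB" estimate

per_partition = 1 * MB
for partitions in (10, 100, 1000):
    total = worst_case_fetch_bytes(per_partition, partitions)
    print(f"{partitions:>5} partitions -> up to {total / MB:,.0f} MB per fetch response")
```

The bound is only reached when every partition has a full chunk ready, which 
is why the memory footprint looks fine until the consumer falls behind.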

We need to decouple the fetch response size from the number of partitions.

The proposal is to add a new field to the fetch request, max_bytes, which 
would control the maximum number of data bytes included in the response.

The implementation on the server side would grab data from each partition in 
the fetch request until it hits this limit, then send back just the data for 
the partitions that fit in the response. The implementation would need to 
start from a random position in the list of topics included in the fetch 
request to ensure that, when there is a backlog, we balance fairly between 
partitions (to avoid serving only the first partition until it is exhausted, 
then the next partition, etc.).
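
The server-side behavior described above could be sketched roughly as 
follows. This is a simplified illustration, not the broker implementation: 
the function name, its parameters, and the byte-count model are all 
assumptions, and it treats partition data as freely divisible bytes, whereas 
a real broker has to respect message boundaries.

```python
import random
from typing import Dict, List, Optional


def build_fetch_response(
    requested: List[str],
    available: Dict[str, int],
    max_bytes: int,
    max_partition_bytes: int,
    rng: Optional[random.Random] = None,
) -> Dict[str, int]:
    """Return the bytes granted per partition; the total never exceeds max_bytes.

    Hypothetical sketch: `requested` is the partition list from the fetch
    request and `available` maps each partition to the bytes ready to send.
    """
    if not requested:
        return {}
    rng = rng or random.Random()

    # Start from a random position so that, under backlog, no partition is
    # systematically served first.
    start = rng.randrange(len(requested))
    order = requested[start:] + requested[:start]

    granted: Dict[str, int] = {}
    remaining = max_bytes
    for partition in order:
        if remaining <= 0:
            break  # response is full; later partitions wait for the next fetch
        chunk = min(available.get(partition, 0), max_partition_bytes, remaining)
        if chunk > 0:
            granted[partition] = chunk
            remaining -= chunk
    return granted


# Usage: ten backlogged partitions with 1 MB each, response capped at 2.5 MB.
backlog = {f"topic-{i}": 1_000_000 for i in range(10)}
response = build_fetch_response(list(backlog), backlog, 2_500_000, 1_000_000)
print(sum(response.values()), sorted(response))
```

Partitions that do not fit are simply left out of the response and picked up 
by a later fetch; the random starting point is what spreads that delay 
across partitions.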

This setting will make the max.partition.fetch.bytes field in the fetch request 
much less useful, and we should discuss just getting rid of it.

I believe this also solves the same thing we were trying to address in 
KAFKA-598. The max_bytes setting now becomes the new limit that would need to 
be compared to the max message size. This limit can be much larger: for 
example, a 50MB max_bytes setting would be okay, whereas now if you set 50MB 
per partition you may need to allocate 50MB*num_partitions.

This will require evolving the fetch request protocol version to add the new 
field and we should do a KIP for it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
