[ 
https://issues.apache.org/jira/browse/KAFKA-642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Kreps reassigned KAFKA-642:
-------------------------------

    Assignee: Jay Kreps
    
> Protocol tweaks for 0.8
> -----------------------
>
>                 Key: KAFKA-642
>                 URL: https://issues.apache.org/jira/browse/KAFKA-642
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8
>            Reporter: Jay Kreps
>            Assignee: Jay Kreps
>            Priority: Blocker
>         Attachments: KAFKA-642-v1.patch, KAFKA-642-v2.patch, 
> KAFKA-642-v3.patch, KAFKA-642-v4.patch, KAFKA-642-v6.patch
>
>
> There are a couple of things in the protocol that are not ideal. It would be 
> good to tweak these for 0.8 so we start clean.
> Here is a set of problems and proposals:
> Problems:
> 1. Correlation id is not used across all the requests. I don't think it can 
> work as intended because of this.
> 2. On reflection I am not sure that we need a correlation id field. I think 
> that since we need to guarantee that processing is sequential on any 
> particular socket we can correlate with a simple queue. (e.g. as the client 
> sends messages it adds them to a queue and as it receives responses it just 
> correlates to whatever is at the head of the queue).
> 3. The metadata response seems to have a number of problems. Among them is 
> that it weirdly repeats all the broker information many times. The response 
> includes the ISR, leader (maybe), and the replicas. Each of these repeats all 
> the broker information. This is super weird. I think what we should be doing 
> here is including all broker information for all brokers and then just having 
> the appropriate ids for the isr, leader, and replicas.
> 4. For topic discovery I think we need to support the case where no topics 
> are specified in the metadata request and for this return information about 
> all topics. I don't think we do this now.
> 5. I don't understand what the creator id is.
> 6. The offset request and response are not fully thought through and should 
> be generalized.
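
Problem 2 above describes correlating responses with a simple queue instead of a correlation id. A minimal Python sketch of that idea follows, assuming the server answers requests on a given socket strictly in the order they were sent. The names PipelinedConnection, send, and on_response are hypothetical and do not come from the Kafka code base, and the socket I/O itself is left out.

```python
from collections import deque


class PipelinedConnection:
    """Matches responses to requests by arrival order (hypothetical helper).

    Assumption: the server processes requests on one socket sequentially,
    so the n-th response always belongs to the n-th request sent.
    """

    def __init__(self):
        # Requests that have been sent but not yet answered, oldest first.
        self._in_flight = deque()

    def send(self, request):
        # A real client would also write the request to the socket here.
        self._in_flight.append(request)

    def on_response(self, response):
        # The response belongs to whatever request is at the head of the queue.
        if not self._in_flight:
            raise RuntimeError("response received with no request in flight")
        request = self._in_flight.popleft()
        return request, response


conn = PipelinedConnection()
conn.send("metadata-request")
conn.send("fetch-request")
first = conn.on_response("metadata-response")
second = conn.on_response("fetch-response")
```

Here first pairs the metadata request with the metadata response and second pairs the fetch request with the fetch response. Ordering alone is enough to correlate them, but a correlation id can still help when tracing a request through client and server logs, as the proposals note.
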
> Proposals:
> 1, 2. Correlation id. This is not strictly speaking needed, but it is maybe 
> useful for debugging to be able to trace a particular request from client to 
> server. So we will extend this across all the requests.
> 3. For the metadata response I will try to fix this up by normalizing out 
> the broker list and having the isr, replicas, and leader fields hold just 
> the node id.
> 4. This should be uncontroversial and easy to add.
> 5. Let's remove creator id, it isn't used.
> 6. Let's generalize offset request. My proposal is below:
> Rename TopicMetadata API to ClusterMetadata, as this will contain all the 
> data that is known cluster-wide. Then let's generalize the offset request to 
> be PartitionMetadata--namely stuff about a particular partition on a 
> particular server.
> The format of PartitionMetadata would be the following:
> PartitionMetadataRequest => [TopicName [PartitionId MinSegmentTime 
> MaxSegmentInfos]]
>   TopicName => string
>   PartitionId => uint32
>   MinSegmentTime => uint64
>   MaxSegmentInfos => int32
> PartitionMetadataResponse => [TopicName [PartitionMetadata]]
>   TopicName => string
>   PartitionMetadata => PartitionId LogSize NumberOfSegments LogEndOffset 
> HighwaterMark [SegmentData]
>   SegmentData => StartOffset LastModifiedTime
>   LogSize => uint64
>   NumberOfSegments => int32
>   LogEndOffset => int64
>   HighwaterMark => int64
> This would be general enough that we could continue to add to it for any new 
> pieces of data we need.
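
The request grammar above could be turned into bytes roughly as follows. This is a sketch under stated assumptions, not the encoding used in any of the attached patches. It assumes big-endian integers, strings written as an int16 length followed by UTF-8 bytes, and arrays written as an int32 element count followed by the elements; the description does not state any of these. All function names are hypothetical.

```python
import struct


def encode_string(value):
    # Assumed string encoding: int16 byte length, then UTF-8 bytes.
    data = value.encode("utf-8")
    return struct.pack(">h", len(data)) + data


def encode_array(items, encode_item):
    # Assumed array encoding: int32 element count, then each element.
    return struct.pack(">i", len(items)) + b"".join(encode_item(i) for i in items)


def encode_partition(partition):
    # PartitionId => uint32, MinSegmentTime => uint64, MaxSegmentInfos => int32
    partition_id, min_segment_time, max_segment_infos = partition
    return struct.pack(">IQi", partition_id, min_segment_time, max_segment_infos)


def encode_partition_metadata_request(topics):
    """Encode [TopicName [PartitionId MinSegmentTime MaxSegmentInfos]].

    topics is a list of (topic_name, partitions) pairs, where partitions is
    a list of (partition_id, min_segment_time, max_segment_infos) tuples.
    """
    def encode_topic(topic):
        name, partitions = topic
        return encode_string(name) + encode_array(partitions, encode_partition)

    return encode_array(topics, encode_topic)


# One topic, one partition: partition 0, MinSegmentTime 1, at most 2 segment infos.
payload = encode_partition_metadata_request([("t", [(0, 1, 2)])])
```

Because every field is either fixed-width or length-prefixed, new fields could be appended to the per-partition record without changing how the enclosing arrays are framed.
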

