Grant,

2. Hmm, I am not sure about the listener approach. It ignores configs like
delete.topic.enable and also opens the door to potential ordering issues,
since there would now be two separate paths for propagating the metadata to
the brokers. Could we just leave out markedForDeletion for now? In the
common case, if a topic is deleted, it will only be in the markedForDeletion
state for a few milliseconds anyway.

3. Yes, for those usages, it just seems a bit weird for the client to
issue a MetadataRequest to get the controller info, since it doesn't need
any topic metadata.
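
Just to make that concrete, here is a rough sketch of what such a client has
to do today (the class and method names below are made up for illustration,
not the actual client API):

import java.util.Collection;
import java.util.Collections;

// Made-up stand-ins for illustration only; not the real client classes.
public class ControllerLookupSketch {

    interface MetadataFetcher {
        // Hypothetical call that issues a MetadataRequest for the given topics.
        ClusterInfo fetchMetadata(Collection<String> topics);
    }

    static class ClusterInfo {
        int controllerId;
        // ... plus all of the topic/partition metadata the caller does not need
    }

    static int findControllerId(MetadataFetcher fetcher) {
        // A full metadata round trip just to read a single field; everything
        // else in the response is discarded.
        return fetcher.fetchMetadata(Collections.<String>emptyList()).controllerId;
    }
}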

5. The issue is that for a client, when handling a metadata response, the
natural logic is: if there is any error in the response, go to the error
handling path (e.g., back off and refresh the metadata); otherwise, get the
leader info and initiate a request to the leader if the leader is available.
If you look at the current logic in MetadataCache.getPartitionMetadata(), if
an assigned replica is not alive, we will send a REPLICA_NOT_AVAILABLE error
code in the response. If the client follows the above logic, it will keep
doing the error handling even though there is nothing wrong with the leader.
A better behavior is to simply return the list of replica ids with no error
code in this case.
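
To illustrate the client logic I am describing, here is a simplified sketch
(the types below are stand-ins, not the actual client classes):

import java.util.List;

// Simplified stand-ins for the real client classes, just to illustrate the flow.
public class MetadataHandlingSketch {

    enum Error { NONE, LEADER_NOT_AVAILABLE, REPLICA_NOT_AVAILABLE }

    static class PartitionMetadata {
        final Error error;
        final Integer leaderId;          // null if no leader is available
        final List<Integer> replicaIds;

        PartitionMetadata(Error error, Integer leaderId, List<Integer> replicaIds) {
            this.error = error;
            this.leaderId = leaderId;
            this.replicaIds = replicaIds;
        }
    }

    // The "natural" per-partition handling: any error => error handling path.
    void handle(PartitionMetadata pm) {
        if (pm.error != Error.NONE) {
            // With the current broker behavior, a dead follower lands us here
            // (REPLICA_NOT_AVAILABLE) even though pm.leaderId is perfectly usable.
            backOffAndRefreshMetadata();
        } else if (pm.leaderId != null) {
            sendRequestToLeader(pm.leaderId);
        }
    }

    void backOffAndRefreshMetadata() { /* back off, then re-send the MetadataRequest */ }

    void sendRequestToLeader(int leaderId) { /* produce/fetch against the leader */ }
}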

Thanks,

Jun



On Sun, Apr 3, 2016 at 6:29 PM, Grant Henke <ghe...@cloudera.com> wrote:

> Responding to a few of the other comments:
>
> > it seems that you propagated the topic deletion marker by having the
> > replicaManager read from ZK directly. It seems that it would be
> > simpler/consistent if the controller propagates that information
> > directly through UpdateMetadataRequest.
>
>
> I was told that I should not try to modify controller logic with KIP-4
> changes. It was indicated that a larger controller rewrite and testing
> effort was planned and those changes should be considered then. Since
> marking a topic for deletion doesn't flow through the controller, and
> therefore not through the UpdateMetadataRequest, it would take quite a bit
> of change. We would need to trigger a new UpdateMetadataRequest every time
> a topic is marked for deletion.
>
> Instead I added a listener to maintain a cache of the topic deletion znodes
> in the ReplicaManager, where the existing UpdateMetadataRequests are
> handled. This would make it easy to swap out later once the data is a part
> of that request, and it has minimal impact in the meantime.
>
>
> > Could you add a description on how controller id will be used in the
> > client?
>
>
> I will add it to the wiki. Today metrics are the only way to access this
> piece of data. It is useful information about the cluster for many reasons.
> Having programmatic access to identify the controller is helpful for
> automation. For example, it can be used during rolling restart logic to
> shut down the controller last to prevent multiple failovers. Beyond
> automation, it can be leveraged in KIP-4 to route admin requests to the
> controller broker.
>
> > We had a weird semantic in version 0 of MetadataRequest. If a replica
> > is not live, but the leader is live, we return an error
> > ReplicaNotAvailableException in the partition metadata. This makes it a
> > bit confusing for the client to parse since it has to first check
> > whether leader is available or not before error code checking. We were
> > thinking of changing that behavior the next time we bump up the version
> > of MetadataRequest. Now that time has come, could you include that in
> > the proposal?
>
>
> I am not sure I completely follow the issue and requested change. Could you
> point me to the discussion?
>
> Thank you,
> Grant
>
> On Sun, Apr 3, 2016 at 8:09 PM, Grant Henke <ghe...@cloudera.com> wrote:
>
> > Hi Jun and Ismael,
> >
> > Initially I had 2 booleans used to indicate if a topic was internal and
> > if a topic was marked for deletion. To save space on large deployments,
> > Ismael suggested I break out the internal topics and deleted topics into
> > their own lists. The idea was that instead of 2 bytes added per topic,
> > in the general case the lists would be empty. Even in those lists I
> > still only return topics that were requested. In fact on the client side
> > they are just utilized to translate back to booleans. I do prefer the
> > booleans from an expressiveness standpoint but was not strongly
> > opinionated on the structure.
> >
> > Thank you,
> > Grant
> >
> > On Sun, Apr 3, 2016 at 8:00 PM, Ismael Juma <ism...@juma.me.uk> wrote:
> >
> >> Hi Jun,
> >>
> >> A couple of comments inline.
> >>
> >> On Sun, Apr 3, 2016 at 5:55 PM, Jun Rao <j...@confluent.io> wrote:
> >>
> >> > 1. It seems a bit weird to return just a list of internal topics w/o
> >> > the corresponding metadata. It also seems a bit weird to return the
> >> > internal topics even if the client doesn't ask for it.
> >>
> >>
> >> Good point.
> >>
> >>
> >> > Would it be better to just add a flag in topic_metadata to indicate
> >> > whether it's an internal topic or not, and only include the internal
> >> > topics when they are asked for (or all topics are requested)?
> >> >
> >>
> >> The disadvantage of this is that we are adding one byte per topic even
> >> though we have a very small number of internal topics (currently a
> >> single internal topic). It seems a bit wasteful and particularly so
> >> when using regex subscriptions (since we have to retrieve all topics in
> >> that case).
> >>
> >> > 2. A similar comment on topics_marked_for_deletion. Would it be
> >> > better to only return them when asked for and just return a new
> >> > TopicDeleted error code in topic_metadata?
> >>
> >>
> >> I agree that this seems better.
> >>
> >> Ismael
> >>
> >
> >
> >
> > --
> > Grant Henke
> > Software Engineer | Cloudera
> > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
> >
>
>
>
> --
> Grant Henke
> Software Engineer | Cloudera
> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>
