> While we are at it, can we also think of "persistent://"? PIP-11  covers
> this only for limited use cases.

PIP-11 only covers the "external" representation of a topic name: it allows
apps to use the short form while the complete name is still used internally.

> In general the topic names are a sap on space. With a 200 byte name, it's
> about 10M of strings in memory just for topic names, on a broker serving
> 50K topics. Using a hash of the name seems much better.

But at some point we need to have the actual name. I think we should start
by making sure we're storing it as few times as possible.
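
As a rough sketch of what "storing it once" could look like (Python only for
brevity; `TopicNameRegistry` and `estimate_name_bytes` are made-up names for
illustration, not existing Pulsar classes), every component that needs the
name asks a registry for the single shared copy instead of keeping its own:

```python
def estimate_name_bytes(avg_name_bytes, topic_count):
    """Raw bytes for one copy of every topic name (ignores object overhead)."""
    return avg_name_bytes * topic_count


class TopicNameRegistry:
    """Hypothetical registry that keeps a single shared copy of each name."""

    def __init__(self):
        self._names = {}

    def canonical(self, name):
        # setdefault returns the copy already stored for an equal name,
        # or stores and returns this one if the name has not been seen yet.
        return self._names.setdefault(name, name)


registry = TopicNameRegistry()
parts = ["persistent:/", "my-tenant", "my-namespace", "my-topic"]
# Build the same name twice at runtime so the two inputs are distinct objects.
first = registry.canonical("/".join(parts))
second = registry.canonical("/".join(parts))
print(first is second)
print(estimate_name_bytes(200, 50_000))
```

The estimate only counts one copy per topic; each extra place that keeps its
own copy of the string adds the same amount again.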

> Maybe shorten it to p://? That will be 9MB for a million topics, in ZK
> strings  (more with utf8).

This could easily be done by supporting both formats ("persistent" == "p" and
"non-persistent" == "np"), with the short form being what is stored and
referenced internally.
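
A minimal sketch of that mapping, assuming the aliases are exactly "p" and
"np" as above (Python only for brevity; the function names are hypothetical
and nothing here is an existing Pulsar API):

```python
# Assumed alias table: long (external) scheme -> short (stored) scheme.
LONG_TO_SHORT = {"persistent": "p", "non-persistent": "np"}
SHORT_TO_LONG = {short: long for long, short in LONG_TO_SHORT.items()}


def _split(topic):
    """Split 'scheme://rest' and reject schemes we do not know about."""
    scheme, sep, rest = topic.partition("://")
    if not sep or (scheme not in LONG_TO_SHORT and scheme not in SHORT_TO_LONG):
        raise ValueError(f"unrecognized topic name: {topic!r}")
    return scheme, rest


def to_stored_form(topic):
    """Accept either spelling and return the short form for internal use."""
    scheme, rest = _split(topic)
    return f"{LONG_TO_SHORT.get(scheme, scheme)}://{rest}"


def to_display_form(topic):
    """Accept either spelling and return the long form shown to users."""
    scheme, rest = _split(topic)
    return f"{SHORT_TO_LONG.get(scheme, scheme)}://{rest}"


def bytes_saved(topic_count):
    """Single-byte characters saved by storing 'p://' instead of 'persistent://'."""
    return (len("persistent://") - len("p://")) * topic_count


print(to_stored_form("persistent://my-tenant/my-namespace/my-topic"))
print(to_display_form("np://my-tenant/my-namespace/my-topic"))
print(bytes_saved(1_000_000))
```

The saving is 9 characters per persistent topic name, which is where the
9MB-per-million-topics figure comes from.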

In general, from a user's perspective, "p" is possibly even more cryptic than
"persistent".

On Wed, Jan 10, 2018 at 6:31 PM Joe F <j...@apache.org> wrote:

> While we are at it, can we also think of "persistent://"? PIP-11  covers
> this only for limited use cases.
>
> Maybe shorten it to p://? That will be 9MB for a million topics, in ZK
> strings  (more with utf8).
>
> In general the topic names are a sap on space. With a 200 byte name, it's
> about 10M of strings in memory just for topic names, on a broker serving
> 50K topics. Using a hash of the name seems much better.
>
> Joe
>
> On Wed, Jan 10, 2018 at 10:10 AM, Matteo Merli <mme...@apache.org> wrote:
>
> > That's correct:
> >  * Old topics with an arbitrary number of `/` will continue to work
> >  * New topics without a cluster name will not be able to use `/` in them
> >    (without some kind of escaping)
> >
> > I don't see an easy way around it, only that it doesn't affect
> > backward compatibility. We just need to properly document the allowed
> > characters that can be used in topic names.
> >
> > I have added a preliminary version of the changes at:
> > https://github.com/apache/incubator-pulsar/pull/1051
> >
> > Matteo
> >
> > On Tue, Jan 9, 2018 at 1:50 AM Sijie Guo <guosi...@gmail.com> wrote:
> >
> > > Glad to see this proposal coming out to hide the cluster information!
> > >
> > > I have a few questions regarding how to keep BC here (correct me if I
> > > am wrong):
> > >
> > >
> > > If I understand Pulsar correctly, you can use "/" in the topic name. So
> > > what is the plan to distinguish the following names:
> > >
> > > persistent://<tenant>/<cluster>/namespace/test/topic => in the old
> > > scheme, "test/topic" is the topic name.
> > >
> > > Now, if the cluster is dropped, when Pulsar receives the following name:
> > >
> > > persistent://<tenant>/namespace/test/topic
> > >
> > > will Pulsar interpret namespace as cluster, test as namespace and topic
> > > as the topic name?
> > >
> > >
> > > - Sijie
> > >
> > >
> > >
> > > On Sat, Jan 6, 2018 at 4:35 AM, Matteo Merli <mme...@apache.org> wrote:
> > >
> > > > https://github.com/apache/incubator-pulsar/wiki/PIP-10:-Remove-cluster-for-namespace-and-topic-names
> > > >
> > > > [Copying the wiki text here for easier quoting]
> > > >
> > > > ------------------------
> > > >
> > > >
> > > >
> > > > * **Status**: Proposal
> > > > * **Author**: Matteo Merli
> > > > * **Pull Request**: [ ]
> > > > * **Mailing List discussion**:
> > > >
> > > >
> > > > ## Motivation
> > > >
> > > > Currently in Pulsar there is a distinction between *local* and *global*
> > > > topics, where *global* topics are replicated and *local* topics are not.
> > > >
> > > > A topic is *global* if it's created on a *global* namespace and *local*
> > > > if it's created on a namespace that is tied to a particular Pulsar
> > > > cluster.
> > > >
> > > > For example:
> > > >  * Global namespace --> `my-tenant/global/my-namespace`
> > > >  * Local namespace --> `my-tenant/us-west/my-namespace`
> > > >
> > > > Similarly, the topic names will follow as:
> > > >
> > > > * Global topic --> `persistent://my-tenant/global/my-namespace/my-topic`
> > > > * Local topic --> `persistent://my-tenant/us-west/my-namespace/my-topic`
> > > >
> > > > This distinction leads to a few confusing side effects:
> > > >
> > > >  * "Global" is an overloaded term and everyone has a different view
> > > >    of it
> > > >  * If a user starts with a *local* topic in a single cluster, it cannot
> > > >    later be converted into a *global* topic directly, because the topic
> > > >    name already includes the particular cluster
> > > >  * Looking at the topic or namespace name, there is the wrong impression
> > > >    of a hierarchy between a tenant and a cluster, while in reality there
> > > >    is a many-to-many relationship between the two.
> > > >
> > > > In reality, the difference between the two types comes only from legacy
> > > > reasons, and there is no practical difference between a *global*
> > > > namespace with just one cluster in the replication list and a *local*
> > > > namespace.
> > > >
> > > > Given that a *local* namespace is just a special case of the more general
> > > > *global* namespace, this proposal is to make all namespaces *global*.
> > > >
> > > > Once all the namespaces are global, there will be no need to specify
> > > > `global` in the namespace or topic names. Thus the names could be
> > > > simplified to:
> > > >
> > > >  * Namespace --> `my-tenant/my-namespace`
> > > >  * Topic --> `persistent://my-tenant/my-namespace/my-topic`
> > > >
> > > > Existing namespaces and topics will continue to work as before. All REST
> > > > APIs and tools will accept both naming schemes, though the documentation
> > > > will refer only to the new naming, to avoid confusion.
> > > >
> > > >
> > > > ## Changes
> > > >
> > > >  * `NamespaceName` and `DestinationName` are the only classes that are
> > > >    used to do the naming validation and will be updated to support both
> > > >    the old and new schemes.
> > > >  * When creating a namespace we will add an option to immediately specify
> > > >    the replication clusters, to avoid multiple CLI commands or REST calls.
> > > >  * Admin API REST URL handlers will need to be adapted because they're
> > > >    based on expecting a certain number of `/` in the URL. New handlers
> > > >    will be added and the old ones will be marked as "hidden" for the
> > > >    auto-generated documentation in Swagger.
> > > >  * Examples and tests will be converted to use the new convention. Most
> > > >    tests will not be converted at this point, to ensure both old and new
> > > >    schemes can coexist.
> > > >
> > > >
> > > >
> > > > --
> > > > Matteo Merli
> > > > <mme...@apache.org>
> > > >
> > >
> >
> >
> > --
> > Matteo Merli
> > <mme...@apache.org>
> >
>


-- 
Matteo Merli
<mme...@apache.org>
