Re: The design of ClusterStateProvider & ClusterState

Ilan Ginzburg Fri, 05 Apr 2024 05:53:13 -0700

I would suggest doing any such change in two independent steps:
- Moving classes around without any functional change ("pure" refactoring)
- A change to what a class exposes, its behavior etc.

Otherwise it is very hard to track what has simply moved and what has
changed.
Is the principal motivation here class content refactoring or change of
behavior?

Independent of which classes do the work, I would like the cluster state to
be considered for what it is: a *stale cache* of the ZooKeeper data (at
varying levels of detail and staleness).
Rather than trying to keep it up to date with constant watches (it is still
stale, given that these are async), accept that it can be very stale and
deal with the staleness when it is encountered.
For example, DO NOT watch the whole collection list (for collections not
represented on the node). If a request for an unknown collection is
received, check (in ZK) whether it exists. Of course, cache the fetched
data so that repeated requests do not each hit ZK.

Similarly, for "watched" collections, no need to get all updates. A
periodic re-fetch (or re-check) from ZooKeeper might be justified, or in
general fetch from ZooKeeper the collection state when it is absent locally
or is identified as stale.
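
To make the idea concrete, here is a minimal sketch of such a lazily
populated cache. It is not existing Solr code: the class name, the
`fetcher` function (a stand-in for a read of collection state from
ZooKeeper), `maxAgeMs` and the injected clock are all hypothetical names
chosen for illustration. State is fetched only when it is absent locally,
older than a maximum age, or explicitly marked stale.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical stale-tolerant cache of per-collection state (not a Solr class). */
class LazyCollectionStateCache<S> {

  /** A cached lookup result (possibly "collection does not exist") and its fetch time. */
  private static final class Entry<S> {
    final Optional<S> state;
    final long fetchedAtMs;

    Entry(Optional<S> state, long fetchedAtMs) {
      this.state = state;
      this.fetchedAtMs = fetchedAtMs;
    }
  }

  private final ConcurrentHashMap<String, Entry<S>> cache = new ConcurrentHashMap<>();
  // Assumption: stands in for "read this collection's state from ZooKeeper";
  // returns Optional.empty() when the collection does not exist.
  private final Function<String, Optional<S>> fetcher;
  private final long maxAgeMs;
  private final LongSupplier clock;

  LazyCollectionStateCache(
      Function<String, Optional<S>> fetcher, long maxAgeMs, LongSupplier clock) {
    this.fetcher = fetcher;
    this.maxAgeMs = maxAgeMs;
    this.clock = clock;
  }

  /** Returns the cached state, fetching only when it is absent or older than maxAgeMs. */
  Optional<S> get(String collection) {
    long now = clock.getAsLong();
    Entry<S> entry =
        cache.compute(
            collection,
            (name, old) ->
                (old == null || now - old.fetchedAtMs >= maxAgeMs)
                    ? new Entry<>(fetcher.apply(name), now)
                    : old);
    return entry.state;
  }

  /** Call when a request reveals the entry is stale; the next get() re-fetches. */
  void invalidate(String collection) {
    cache.remove(collection);
  }
}
```

No watches are involved: a node pays for a fetch only for the collections
it is actually asked about. For brevity the sketch runs the fetch inside
`compute`, which blocks other callers asking for the same collection; a
real implementation would likely want the I/O off that path.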

I believe approaching the distribution of cluster state to all nodes in
such a way will greatly limit the load and chattiness with ZooKeeper,
especially on "dynamic" clusters with many state changes.

Ilan

On Fri, Apr 5, 2024 at 4:15 AM David Smiley <dsmi...@apache.org> wrote:

> I've been looking at HttpClusterStateProvider lately, and at ClusterState.
> It has a method getClusterState which goes and loads the complete
> ClusterState (all collections with all state info).  ClusterState is
> immutable.  At a massive collection scale, such a method is very
> disconcerting!  Thankfully, there's a method getState(collection)
> returning a CollectionRef (holder of DocCollection) implemented by
> fetching only the state of the pertinent collection.  Likewise the
> live nodes can be retrieved directly from ClusterStateProvider without
> going through ClusterState.
>
> I'd like to make a bold proposal: Merge ClusterState with
> ClusterStateProvider, keeping the same ClusterState name & package and
> all/most API methods.  This means it would lose its immutability
> designation.  If an immutable variation is needed, one could exist.
>
> Don't include methods like getCollectionsMap which is evil at
> many-collection scale.  Listing/looping collections should be done
> sparingly; don't make it too easy to do by accident.
>
> Possibly also move CloudSolrClient's StateCache (a cache of
> DocCollection keyed by collection name) into the new & improved
> ClusterState.
>
> The end-game is ClusterState being where we can list live nodes,
> aliases, collections, and most importantly a cache of DocCollection.
> All of this with an eventually consistent mind-set: anything can be out
> of date and may need to be re-fetched.
>
> Has anyone thought similarly or have concerns in such a pursuit?
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
