AFAIK, Cassandra will not process schema changes in parallel. However, by sending requests in parallel, you can minimise the time Cassandra stays idle while the client waits for schema agreement after each CREATE KEYSPACE statement.
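
For illustration, here is a minimal Python sketch of submitting the statements in parallel. It is only a sketch under some assumptions: `create_keyspaces` and its parameters are hypothetical names, and `execute` stands for any thread-safe callable that sends one CQL statement to the single node you are targeting (for example, the `execute` method of a driver session restricted to that node). It is not tied to any particular driver.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def create_keyspaces(execute, keyspaces, replication, max_workers=8):
    """Submit one CREATE KEYSPACE statement per name concurrently.

    execute     -- hypothetical callable taking a CQL string; assumed to be
                   thread-safe and to talk to one single Cassandra node
    keyspaces   -- iterable of keyspace names
    replication -- replication map as a CQL literal string
    Returns a dict mapping keyspace name -> exception for failed statements.
    """
    def build(name):
        return (
            f"CREATE KEYSPACE IF NOT EXISTS {name} "
            f"WITH replication = {replication}"
        )

    failures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Keep a future -> name mapping so failures can be reported by name.
        futures = {pool.submit(execute, build(name)): name for name in keyspaces}
        for future in as_completed(futures):
            error = future.exception()
            if error is not None:
                failures[futures[future]] = error
    return failures
```

The server still applies the schema changes one at a time; the worker pool only keeps the next request queued while earlier ones are waiting. `IF NOT EXISTS` makes it safe to re-run the call for whatever names come back in the returned dict.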

On 09/03/2022 20:46, Leon Zaruvinsky wrote:
Hi Bowen,

Haha, agree with you on wanting fewer keyspaces but unfortunately we're kind of locked in to our architecture for the time being.

We do part of what you're saying, in that we shut down all but one node and then run CREATE against that single node. But we do that serially, O(keyspaces).  If we were to submit the CREATE statements in parallel, is your claim that Cassandra would process these in parallel as well?

Thanks,
Leon

On Wed, Mar 9, 2022 at 12:46 PM Bowen Song <bo...@bso.ng> wrote:

    First of all, you really shouldn't have that many keyspaces. Putting that
    aside, the quickest way to create a large number of keyspaces without
    causing schema disagreement is to create them in parallel over a
    connection pool with a number of connections, all against the same single
    Cassandra node. Because all CREATE KEYSPACE statements are sent to the
    same node, you don't need to worry about schema disagreement, as the
    server side will internally ensure the consistency of the schema.

    On 09/03/2022 18:35, Leon Zaruvinsky wrote:
    > Hey folks,
    >
    > A step in our Cassandra restore process is to re-create every keyspace
    > that existed in the backup in a brand new cluster. Because these
    > creations are sequential, and because we have _a lot_ of keyspaces,
    > this ends up being the slowest part of our restore. We already have
    > some optimizations in place to speed up schema agreement after each
    > create, but even so we'd like to get the time down significantly more.
    >
    > I was curious if anyone has any guidance or has experimented with ways
    > of creating keyspaces that are faster than a bunch of CREATE calls.
    > It's fine for the cluster to be offline during the process.
    >
    > Thanks,
    > Leon
