Re: unable to repair

2021-05-31 Thread Jeff Jirsa


> On May 30, 2021, at 2:12 AM, Sébastien Rebecchi wrote:
> 
> 
> Hello,
> 
> I have a more general question about that; I cannot find a clear answer.
> 
> In my use case I have many tables (around 10k new tables created per month) 
> and they are created from many clients and only dynamically, with several 
> clients creating the same tables simultaneously.

This sounds like a bad idea in practice. There are lots of things that aren’t 
going to scale in this manner. Cassandra 


> 
> What is the recommended way of creating tables dynamically? If I am doing "if 
> not exists" queries + waiting for schema agreement before and after each create 
> statement, will it work correctly for Cassandra?
> 

Waiting for schema agreement is the key. The IF NOT EXISTS on DDL is not 
actually backed by Paxos, so it does not serialize concurrent creates.
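
A minimal Python sketch of "wait for schema agreement before and after each
create". It is deliberately driver-agnostic: fetch_versions and execute_ddl
are hypothetical callables you would supply yourself. In a real cluster,
fetch_versions could be built on the schema_version columns of system.local
and system.peers. This only narrows the race window; it does not stop two
clients from issuing DDL at the same moment.

```python
import time
from typing import Callable, Iterable


def wait_for_schema_agreement(
    fetch_versions: Callable[[], Iterable[str]],
    timeout_s: float = 30.0,
    poll_interval_s: float = 0.5,
) -> bool:
    """Poll until every node reports the same schema version, or time out.

    fetch_versions is a hypothetical callable supplied by the caller; it
    should return one schema version string per live node.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        versions = set(fetch_versions())
        if len(versions) == 1:
            return True  # all nodes agree
        if time.monotonic() >= deadline:
            return False  # still diverged when the timeout expired
        time.sleep(poll_interval_s)


def create_table_safely(execute_ddl, fetch_versions, ddl, **wait_kwargs):
    """Wait for agreement, run one DDL statement, then wait again."""
    if not wait_for_schema_agreement(fetch_versions, **wait_kwargs):
        raise TimeoutError("schema not in agreement before DDL")
    execute_ddl(ddl)  # hypothetical callable that sends the statement
    if not wait_for_schema_agreement(fetch_versions, **wait_kwargs):
        raise TimeoutError("schema not in agreement after DDL")
```

The caller decides what to do on a timeout; retrying the DDL blindly would
reintroduce the concurrent-change problem.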

> Sébastien.
> 
>> On Fri, May 28, 2021 at 20:45, Sébastien Rebecchi wrote:
>> Thank you for your answer.
>> 
>> If I send all my create operations still from many clients but to 1 
>> coordinator node, always the same, would it prevent schema mismatch?
>> 
>> Sébastien.
>> 
>> 
>> On Fri, May 28, 2021 at 01:14, Kane Wilson wrote:
>>>> Which client operations could trigger a schema change at the node level?
>>>> Do you mean that, for example, creating a new table triggers a schema
>>>> change globally, not only at the keyspace/table level?
>>> Yes, any DDL statement (creating tables, altering, dropping, etc) triggers 
>>> a schema change across the cluster (globally). All nodes need to be told of 
>>> this schema change.
>>>  
>>>> I don't have schema changes, except keyspace and table creations. But
>>>> they are done from multiple sources indeed, with a "create if not exists"
>>>> statement, on demand. Thank you for your answer, I will try to see if I
>>>> could precreate them then.
>>> Yep, definitely do that. You don't want to be issuing simultaneous create 
>>> statements from different clients. IF NOT EXISTS won't necessarily catch 
>>> all cases.
>>>  
>>>> As for the schema mismatch, what is the best way of fixing that issue?
>>>> Could Cassandra recover from that on its own, or is there a nodetool
>>>> command to force schema agreement? I have heard that we have to restart
>>>> the nodes one by one, but that seems a very heavy procedure.
>>> A rolling restart is usually enough to fix the issue. You might want to 
>>> repair afterwards, and check that data didn't make it to different versions 
>>> of the table on different nodes (in which case some more intervention may 
>>> be necessary to save that data).
>>>  
>>> -- 
>>> raft.so - Cassandra consulting, support, and managed services


Re: multiple clients making schema changes at once

2021-05-31 Thread Sébastien Rebecchi
Hello,

Yes, this is quite annoying. How did you implement that "external lock"? I
also thought of building an external service dedicated to that. Cassandra
client apps would send their create instructions to that service, which
would receive them and run the creates one by one, and each client app would
wait for its response before starting to insert.
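
To make the idea concrete, here is a rough Python sketch of such a service,
reduced to a single process: one worker thread drains a queue, so creates run
strictly one at a time, and each caller gets a Future to wait on. The class
name SchemaChangeService and the execute_ddl callable are hypothetical; a real
version would sit behind some network API and would probably also wait for
schema agreement between statements.

```python
import queue
import threading
from concurrent.futures import Future


class SchemaChangeService:
    """Runs submitted DDL statements strictly one at a time."""

    def __init__(self, execute_ddl):
        # execute_ddl is a hypothetical callable that sends one statement
        # to Cassandra and returns when it has been applied.
        self._execute_ddl = execute_ddl
        self._requests = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, ddl: str) -> Future:
        """Queue a statement; the returned Future completes when it has run."""
        done = Future()
        self._requests.put((ddl, done))
        return done

    def close(self):
        """Stop the worker after the statements already queued have run."""
        self._requests.put(None)
        self._worker.join()

    def _run(self):
        while True:
            item = self._requests.get()
            if item is None:  # shutdown marker from close()
                return
            ddl, done = item
            try:
                done.set_result(self._execute_ddl(ddl))
            except Exception as exc:
                # Report the failure to the waiting client, keep serving.
                done.set_exception(exc)
```

A client would call service.submit(ddl).result() and only start inserting
once that returns.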

Best,

Sébastien.

On Tue, Jun 1, 2021 at 05:21, Max C. wrote:

> In our case we have a shared dev cluster with (for example) a key space
> for each developer, a key space for each CI runner, etc.   As part of
> initializing our test suite we set up the schema to match the code that is
> about to be tested.  This can mean multiple CI runners each adding/dropping
> tables at the same time but for different key spaces.
>
> Our experience is that even though the schema changes do not conflict, we still
> run into schema mismatch problems.   Our solution to this was to have a
> lock (external to Cassandra) that ensures only a single schema change
> operation is being issued at a time.
>
> People assume schema changes in Cassandra work the same way as MySQL or
> multiple users editing files on disk — i.e. as long as you’re not editing
> the same file (or same MySQL table), then there’s no problem.  *This is
> NOT the case.*  Cassandra schema changes are more like “git push”ing a
> commit to the same branch — i.e. at most one change can be outstanding at a
> time (across all tables, all key spaces)…otherwise you will run into
> trouble.
>
> Hope that helps.  Best of luck.
>
> - Max
>
>> Hello,
>>
>> I have a more general question about that; I cannot find a clear answer.
>>
>> In my use case I have many tables (around 10k new tables created per
>> month) and they are created from many clients and only dynamically, with
>> several clients creating the same tables simultaneously.
>>
>> What is the recommended way of creating tables dynamically? If I am doing
>> "if not exists" queries + waiting for schema agreement before and after each
>> create statement, will it work correctly for Cassandra?
>>
>> Sébastien.
>>
>
>


Re: multiple clients making schema changes at once

2021-05-31 Thread Max C.
In our case we have a shared dev cluster with (for example) a key space for 
each developer, a key space for each CI runner, etc.   As part of initializing 
our test suite we set up the schema to match the code that is about to be 
tested.  This can mean multiple CI runners each adding/dropping tables at the 
same time but for different key spaces.

Our experience is that even though the schema changes do not conflict, we still run 
into schema mismatch problems.   Our solution to this was to have a lock 
(external to Cassandra) that ensures only a single schema change operation is 
being issued at a time.
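
For illustration only, a minimal Python sketch of what such a lock could look
like. It is not the implementation described above: it uses an exclusive lock
file as a stand-in, which only coordinates processes that share a filesystem.
The names schema_lock and apply_schema_change, the execute_ddl callable, and
the lock path are all hypothetical. Runners on different machines would need a
shared lock service instead, and a crashed holder leaves a stale file behind.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def schema_lock(path, timeout_s=60.0, poll_interval_s=0.2):
    """Hold an exclusive lock file for the duration of the with-block."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # O_EXCL makes creation fail if the file already exists,
            # so only one process can hold the lock at a time.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {path}")
            time.sleep(poll_interval_s)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(path)  # release the lock for the next waiter


def apply_schema_change(execute_ddl, ddl, lock_path="/tmp/cassandra-schema.lock"):
    """Run one DDL statement while holding the lock (path is an assumption)."""
    with schema_lock(lock_path):
        execute_ddl(ddl)  # hypothetical callable that sends the statement
```

Every tool that touches the schema has to go through the same lock, otherwise
the guarantee of one outstanding change at a time is lost.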

People assume schema changes in Cassandra work the same way as MySQL or 
multiple users editing files on disk — i.e. as long as you’re not editing the 
same file (or same MySQL table), then there’s no problem.  This is NOT the 
case.  Cassandra schema changes are more like “git push”ing a commit to the 
same branch — i.e. at most one change can be outstanding at a time (across all 
tables, all key spaces)…otherwise you will run into trouble.

Hope that helps.  Best of luck.

- Max


> Hello,
> 
> I have a more general question about that; I cannot find a clear answer.
> 
> In my use case I have many tables (around 10k new tables created per month) 
> and they are created from many clients and only dynamically, with several 
> clients creating the same tables simultaneously.
> 
> What is the recommended way of creating tables dynamically? If I am doing "if 
> not exists" queries + waiting for schema agreement before and after each create 
> statement, will it work correctly for Cassandra?
> 
> Sébastien.