[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable
[ https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843582#comment-15843582 ] Aleksey Yeschenko commented on CASSANDRA-9425: -- Committed as [af3fe39dcabd9ef77a00309ce6741268423206df|] to trunk. Will open up a separate follow up ticket for DDL statement rework, as the comment section for this JIRA has grown quite a bit. Thanks again for the review. > Make node-local schema fully immutable > -- > > Key: CASSANDRA-9425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9425 > Project: Cassandra > Issue Type: Sub-task >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko > Fix For: 4.0 > > > The way we handle schema changes currently is inherently racy. > All of our {{SchemaAlteringStatement}} s perform validation on a schema state > that's won't necessarily be there when the statement gets executed and > mutates schema. > We should make all the *Metadata classes ({{KeyspaceMetadata, > TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently > snapshottable, with a single top-level {{AtomicReference}} to the current > snapshot. Have DDL statements perform validation and transformation on the > same state. > In pseudo-code, think > {code} > public interface DDLStatement > { > /** > * Validates that the DDL statement can be applied to the provided schema > snapshot. > * > * @param schema snapshot of schema before executing CREATE KEYSPACE > */ > void validate(SchemaSnapshot schema); > > /** > * Applies the DDL statement to the provided schema snapshot. > * Implies that validate() has already been called on the provided > snapshot. > * > * @param schema snapshot of schema before executing the statement > * @return snapshot of schema as it would be after executing the statement > */ > SchemaSnapshot transform(SchemaSnapshot schema); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable
[ https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15842944#comment-15842944 ] Sylvain Lebresne commented on CASSANDRA-9425: - bq. With both of these present in current trunk code, I'd prefer to commit what is here as is, and move those into a separate ticket Wfm. Thanks for addressing the rest, +1, ship it (assuming clean CI, which we seem to have but I checked quickly). Great work! > Make node-local schema fully immutable > -- > > Key: CASSANDRA-9425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9425 > Project: Cassandra > Issue Type: Sub-task >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko > Fix For: 4.0 > > > The way we handle schema changes currently is inherently racy. > All of our {{SchemaAlteringStatement}} s perform validation on a schema state > that's won't necessarily be there when the statement gets executed and > mutates schema. > We should make all the *Metadata classes ({{KeyspaceMetadata, > TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently > snapshottable, with a single top-level {{AtomicReference}} to the current > snapshot. Have DDL statements perform validation and transformation on the > same state. > In pseudo-code, think > {code} > public interface DDLStatement > { > /** > * Validates that the DDL statement can be applied to the provided schema > snapshot. > * > * @param schema snapshot of schema before executing CREATE KEYSPACE > */ > void validate(SchemaSnapshot schema); > > /** > * Applies the DDL statement to the provided schema snapshot. > * Implies that validate() has already been called on the provided > snapshot. > * > * @param schema snapshot of schema before executing the statement > * @return snapshot of schema as it would be after executing the statement > */ > SchemaSnapshot transform(SchemaSnapshot schema); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable
[ https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840109#comment-15840109 ] Aleksey Yeschenko commented on CASSANDRA-9425: -- bq. TableMetadataRef.to() feels like it could be misused easily so I'd suggest making things a bit more explicit. Done. bq. for indexes, where we kind of hack around them not having their own ID. Addressed by moving 2i {{TableMetadata}} s handling to {{Tables}} and {{Schema}}. bq. Also, the Schema ctor should probably be private. Addressed. bq. Some of Schema.createKeyspace() is actually redundant: Keyspace ctor already initialize all tables and views, so it ends up being duplicated. It's largely harmless though we do end up reloading each ColumnFamilyStore for no reason. Addressed. bq. Schema.load and Schema.unload should probably be synchronized for their modification of keyspaces. Addressed. bq. validateTable seems to replace Validation.validateColumnFamily but the latter is still there and used in a few instances. Would be nice to clean up. Validation.validateKeyspaceNotSystem could probably be migrated too for consistency. Addressed. bq. In TableMetadata, validateCompatibility() should probably check more stuffs (we can't alter non-PK column types as we want, nor change the partition key size for instance). Addressed. The only unaddressed issues at this point are loading of schema and system keyspaces in {{Schema}} static block and a proper refactoring of {{load()}}. With both of these present in current trunk code, I'd prefer to commit what is here as is, and move those into a separate ticket (I would also like to separate out {{Keyspace}} instance handling and metadata handling into two separate classes instead of overloading {{Schema}}, to address the issue properly and not just put some lipstick on this pig). > Make node-local schema fully immutable > -- > > Key: CASSANDRA-9425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9425 > Project: Cassandra > Issue Type: Sub-task >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko > Fix For: 4.0 > > > The way we handle schema changes currently is inherently racy. > All of our {{SchemaAlteringStatement}} s perform validation on a schema state > that's won't necessarily be there when the statement gets executed and > mutates schema. > We should make all the *Metadata classes ({{KeyspaceMetadata, > TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently > snapshottable, with a single top-level {{AtomicReference}} to the current > snapshot. Have DDL statements perform validation and transformation on the > same state. > In pseudo-code, think > {code} > public interface DDLStatement > { > /** > * Validates that the DDL statement can be applied to the provided schema > snapshot. > * > * @param schema snapshot of schema before executing CREATE KEYSPACE > */ > void validate(SchemaSnapshot schema); > > /** > * Applies the DDL statement to the provided schema snapshot. > * Implies that validate() has already been called on the provided > snapshot. > * > * @param schema snapshot of schema before executing the statement > * @return snapshot of schema as it would be after executing the statement > */ > SchemaSnapshot transform(SchemaSnapshot schema); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable
[ https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828320#comment-15828320 ] Aleksey Yeschenko commented on CASSANDRA-9425: -- Anyway, thanks. Addressed the points in the last comment. The remained of the original ones later tonight. > Make node-local schema fully immutable > -- > > Key: CASSANDRA-9425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9425 > Project: Cassandra > Issue Type: Sub-task >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko > Fix For: 4.0 > > > The way we handle schema changes currently is inherently racy. > All of our {{SchemaAlteringStatement}} s perform validation on a schema state > that's won't necessarily be there when the statement gets executed and > mutates schema. > We should make all the *Metadata classes ({{KeyspaceMetadata, > TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently > snapshottable, with a single top-level {{AtomicReference}} to the current > snapshot. Have DDL statements perform validation and transformation on the > same state. > In pseudo-code, think > {code} > public interface DDLStatement > { > /** > * Validates that the DDL statement can be applied to the provided schema > snapshot. > * > * @param schema snapshot of schema before executing CREATE KEYSPACE > */ > void validate(SchemaSnapshot schema); > > /** > * Applies the DDL statement to the provided schema snapshot. > * Implies that validate() has already been called on the provided > snapshot. > * > * @param schema snapshot of schema before executing the statement > * @return snapshot of schema as it would be after executing the statement > */ > SchemaSnapshot transform(SchemaSnapshot schema); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable
[ https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828214#comment-15828214 ] Aleksey Yeschenko commented on CASSANDRA-9425: -- bq. Well, actually, your motivations on this are lost on me. I'm certainly not pretending that optimization was the most important thing ever, but as mentioned in the original ticket it did show up in some profiling and the "optimization" was beyond tiny and entirely encapsulated within the class. As an aside, also not a fan of manually writting hashcode() methods when the simpler Objects.hashcode() can be and was used. I do want to get rid of the auth object in my metadata code because conceptually I don't want metadata to be aware of auth. But once you get rid of the field, memoising the hash code in the {{DataResource}} object itself becomes meaningless, so you need to 'fix' {{DataResource.hashCode()}} properly, too (in that JIRA, you claimed that the allocation of the array for varargs was an issue; in reality it shouldn't be - once that code gets hot enough, it should be stack allocated, but I have no flight recorder runs to prove it). So should some of the allocated {{DataResource}} objects. TL;DR: I don't like this part of the CASSANDRA-10410 ticket, but I don't care enough about this to fight it, so I can revert my change. > Make node-local schema fully immutable > -- > > Key: CASSANDRA-9425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9425 > Project: Cassandra > Issue Type: Sub-task >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko > Fix For: 4.0 > > > The way we handle schema changes currently is inherently racy. > All of our {{SchemaAlteringStatement}} s perform validation on a schema state > that's won't necessarily be there when the statement gets executed and > mutates schema. > We should make all the *Metadata classes ({{KeyspaceMetadata, > TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently > snapshottable, with a single top-level {{AtomicReference}} to the current > snapshot. Have DDL statements perform validation and transformation on the > same state. > In pseudo-code, think > {code} > public interface DDLStatement > { > /** > * Validates that the DDL statement can be applied to the provided schema > snapshot. > * > * @param schema snapshot of schema before executing CREATE KEYSPACE > */ > void validate(SchemaSnapshot schema); > > /** > * Applies the DDL statement to the provided schema snapshot. > * Implies that validate() has already been called on the provided > snapshot. > * > * @param schema snapshot of schema before executing the statement > * @return snapshot of schema as it would be after executing the statement > */ > SchemaSnapshot transform(SchemaSnapshot schema); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable
[ https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827747#comment-15827747 ] Sylvain Lebresne commented on CASSANDRA-9425: - bq. and undid a tiny bit of of the related CASSANDRA-10410 optimisation I hope you don’t mind. Well, actually, your motivations on this are lost on me. I'm certainly not pretending that optimization was the most important thing ever, but as mentioned in the original ticket it did show up in some profiling and the "optimization" was beyond tiny and entirely encapsulated within the class. As an aside, also not a fan of manually writting hashcode() methods when the simpler Objects.hashcode() can be and was used. bq. Problem is, in announce for CREATE INDEX we do not call KSM.validate(), yet. So for the time being, until the next patch, where we always validate the entirety of the resulting schema before applying anything, it has to stay here, even if imperfect. Wfm, though it sounds like something we could add as comment in the code, so we remember what's up in the unlikely even that "next patch" forgets about this (and so I don't have to ask about this removal when reviewing the next patch since I'll most likely will have forgot about it). bq. we should probably eventually move their processing elsewhere, away from the sensitive path I don't necessarily disagree but I don't think it's a big problem right now so I'd suggest keeping it simple for now. bq. It’s a very minor case of duplication that doesn’t worry me, I’d rather not factor it wouldn’t strictly speaking make things simpler. I think that, all other things being equal, terser and easier to read code *does* make things simpler, even if it's a tiny bit so, so I still stand by my suggestion. But it's certainly small and probably up to personal preferences, so ok. bq. It’s an ugly pre-existing piece of code to avoid re-compiling the UDFs; It annoys me immensely that a method in SchemaKeyspace is referencing Schema.instance, hence the TODO. I want to get rid of it eventually, but it’s not important enough to spend my time on it atm. Fair enough, but could you update the TODO with this expanded and useful context. bq. The alternative would be using {{get(name).orElse(null)}} Well, that's one alternative. The other, which I'd personally would go with, is to remove the method that return {{Optional}} altogether. {{Optional}} is useless outside of adding clarity to the fact there may be no result, but naming the method {{getNullable}} is good enough on that front imo. Thanks for the explanations on the rest. > Make node-local schema fully immutable > -- > > Key: CASSANDRA-9425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9425 > Project: Cassandra > Issue Type: Sub-task >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko > Fix For: 4.0 > > > The way we handle schema changes currently is inherently racy. > All of our {{SchemaAlteringStatement}} s perform validation on a schema state > that's won't necessarily be there when the statement gets executed and > mutates schema. > We should make all the *Metadata classes ({{KeyspaceMetadata, > TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently > snapshottable, with a single top-level {{AtomicReference}} to the current > snapshot. Have DDL statements perform validation and transformation on the > same state. > In pseudo-code, think > {code} > public interface DDLStatement > { > /** > * Validates that the DDL statement can be applied to the provided schema > snapshot. > * > * @param schema snapshot of schema before executing CREATE KEYSPACE > */ > void validate(SchemaSnapshot schema); > > /** > * Applies the DDL statement to the provided schema snapshot. > * Implies that validate() has already been called on the provided > snapshot. > * > * @param schema snapshot of schema before executing the statement > * @return snapshot of schema as it would be after executing the statement > */ > SchemaSnapshot transform(SchemaSnapshot schema); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable
[ https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826564#comment-15826564 ] Aleksey Yeschenko commented on CASSANDRA-9425: -- Alright. Fixed the failing tests ({{TableId}} patch broke Paxos, and index metadata patch altered custom index handling a bit - for the better - but broke {{CustomIndexTests}}), and addressed some of the comments. In addition got rid of {{DataResource}} in {{TableMetadata}}, and undid a tiny bit of of the related CASSANDRA-10410 optimisation I hope you don’t mind. bq. I'd rename {{TableMetadata.table}} to {{TableMetadata.name}} for consistency with other classes ({{KeyspaceMetadata}} and {{IndexMetadata}} at least). While at it, I'd also rename {{ksName}} to {{keyspace}} and {{viewName}} to {{name}} in {{ViewMetadata}}. Both done. bq. I'd also probably rename {{PartitionColumns}} to {{RegularAndStaticColumns}} now. Done. bq. Nit: In {{SSTableTxnWriter}}, it feels slightly inconsistent that {{createRangeAware}} takes a {{TableMetadata}}, but create takes a {{TableMetadataRef}}. I try to use {{TableMetadata}} directly whenever possible, only pass the Ref when unavoidable. In this case it’s no big deal, though, so done. bq. In {{Indexes.validate()}}, not sure it's worth doing the duplicate name check since we do it at the keyspace level already (and only doing it within a single table is a false sense of safety). Problem is, in announce for {{CREATE INDEX}} we do not call {{KSM.validate()}}, yet. So for the time being, until the next patch, where we always validate the entirety of the resulting schema before applying anything, it has to stay here, even if imperfect. Removing it will fail some tests. And some checks - for now - are still better than no checks. Verdict: left be until the next patch. bq. I'd rename {{addRegularColumn}} and {{alterColumnType}} in {{ViewMetadata}} as they are not modifying the {{ViewMetadata}} but creating copies (say to {{withAddedRegularColumn()}} and {{withAlteredColumnType}}). Done. bq. Is there a rational for notifying everything at the end, versus calling each notification in the "appropriate" sub-method (notifying table alter in {{alterTable()}}, )? The later would feels a tad more readable to me as it's slightly easier to check we haven't forgotten something. I prefer it stylistically. But I also want to get schema and related DB objects ({{Keyspace}}, {{CFS}}, and everything downstream) to get to the consistent state as fast as possible. For that reason I’m delaying potentially blocking change event handlers until the very end. On that note, we should probably eventually move their processing elsewhere, away from the sensitive path - just enqueue them into some queue and have something poll and process the queue from time to time. bq. I'd maybe suggest adding a private {{handleDiff(mapDiff, Function onDropped, Function onAdded, Function onUpdated)}} method to simplify a bit. It’s a very minor case of duplication that doesn’t worry me, I’d rather not factor it wouldn’t strictly speaking make things simpler. Also see the previous point. bq. There is a TODO FIXME in {{SchemaKeyspace}} (and not 100% sure what it's about). It’s an ugly pre-existing piece of code to avoid re-compiling the UDFs; It annoys me immensely that a method in {{SchemaKeyspace}} is referencing {{Schema.instance}}, hence the TODO. I want to get rid of it eventually, but it’s not important enough to spend my time on it atm. bq. There is TODO: FIXME in {{ColumnFamilyStore.setCompressionParameters}}. See CASSANDRA-12949. The method in its current implementation (1.1+) is broken and unsafe, so I’ve disabled it for safety reasons. Hopefully someone will get around to dealing with it some time, but I don’t have the time. Either way, committed a clarification for that TODO. bq. {{Keyspaces has both a {{get(String)}} that return {{Optional}} and a {{getNullable(String)}}. We should probably remove one. Not just {{Keyspaces}}. I prefer to have both options available, as there are multiple things that can tolerate null return. The alternative would be using {{get(name).orElse(null)}}, which I personally find unappealing. bq. The patch changes some system table options (compared to current trunk) I removed the explicit setting or {{readRepairChance}} to 0.0 as it was redundant. We made {{readRepairChance}} default to be 0.0 a while ago. Only {{dcLocalReadRepairChance}} needs resetting. {{system_distributed}} was switched to default gc gs because gc gs of 0 is a bug. {{view_build_status}} table accepts deletes, which can be compacted away before propagation with 0 gc gs. See CASSANDRA-12954. Other tables in that keyspace should be unaffected, as they don’t use TTL nor explicit deletes. As for {{system_auth}}, it was never intended to have a special memtable flush period setting, it was just
[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable
[ https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818764#comment-15818764 ] Aleksey Yeschenko commented on CASSANDRA-9425: -- Thanks. I have incorporated both yours and Robert's changes, and rebased against most recent trunk. Rerunning the tests now. Will address your other suggestions once I get the current branch to green test state again. > Make node-local schema fully immutable > -- > > Key: CASSANDRA-9425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9425 > Project: Cassandra > Issue Type: Sub-task >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko > Fix For: 4.0 > > > The way we handle schema changes currently is inherently racy. > All of our {{SchemaAlteringStatement}} s perform validation on a schema state > that's won't necessarily be there when the statement gets executed and > mutates schema. > We should make all the *Metadata classes ({{KeyspaceMetadata, > TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently > snapshottable, with a single top-level {{AtomicReference}} to the current > snapshot. Have DDL statements perform validation and transformation on the > same state. > In pseudo-code, think > {code} > public interface DDLStatement > { > /** > * Validates that the DDL statement can be applied to the provided schema > snapshot. > * > * @param schema snapshot of schema before executing CREATE KEYSPACE > */ > void validate(SchemaSnapshot schema); > > /** > * Applies the DDL statement to the provided schema snapshot. > * Implies that validate() has already been called on the provided > snapshot. > * > * @param schema snapshot of schema before executing the statement > * @return snapshot of schema as it would be after executing the statement > */ > SchemaSnapshot transform(SchemaSnapshot schema); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable
[ https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15814547#comment-15814547 ] Sylvain Lebresne commented on CASSANDRA-9425: - I very much like the changes of this patch in general. I also agree with [~snazy] small follow-up, getting rid of {{keyspaceAndTablePair}} sounds good. The patch is big though so I have a bunch of remarks/suggestions, some of which may or may not be very useful. Some of the comments may also not be entirely new to this patch, I haven't double-checked everything, but as the main goal of this patch is to clean stuff up, let say I don't feel too bad if there is a few such remarks. I have also a few things, mostly nits, that I just wrote while reviewing and pushed in branch below. One noticeable commit that is not entirely nitty is the introduction of a {{TableId}} class (that just "wrap" a UUID). It felts like this make the code read better with no real downside. It's a trivial patch, just with a big-ish diff. | [9425-review|https://github.com/pcmanus/cassandra/commits/9425-review] | [utests|http://cassci.datastax.com/job/pcmanus-9425-review-testall] | [dtests|http://cassci.datastax.com/job/pcmanus-9425-review-dtest] | Other points: * {{TableMetadataRef.to()}} feels like it could be misused easily so I'd suggest making things a bit more explicit. In practice, it has 3 main usages: *# in {{Schema}}, which is kind of the one true use, where I'd directly use the ctor (which would be package-protected and have the same comment than on {{set()}}). *# for offline tools: I'd rename {{to()}} to something like {{forOfflineTool()}} that make it clear it's ok only in that context. A comment would be nice too. *# for indexes, where we kind of hack around them not having their own ID. I think it's worth pointing out we're doing something specific by having a {{forIndex()}} method with proper comments (and probably an assertion that validate it's indeed called on an index). Or maybe go with my next point. * Talking of index handling, {{TableMetadataRef.inPlaceUpdateIndexedTableMetadata}} feels dangerous. The {{set()}} method is very explicit about not being for public consumption, but that {{inPlaceUpdateIndexedTableMetadata}} method calls {{set()}} directly and is public. It's also particularly hard to convince oneself that it's used in a safe maner (you check first that {{SecondaryIndexManager.reload()}} is indeed only called protected by migration synchronization, but the call to {{inPlaceUpdateIndexedTableMetadata}} is on a task executed by an {{ExecutorService}} so you have to double-check it's not really asynchronous (despite what the javadoc in {{Index.java}} claims!) and thus is indeed protected). Anyway, at a minimum some comments are imo missing, but I wonder if we could make this more obviously correct by pushing it to {{Schema}}, having a map of 'index name' -> TableMetadataRef specific to indexes. Updating that in {{Schema.load()}} shouldn't be terribly complex. Or am I missing a reason why that wouldn't work? * I'm unclear on the changes to {{AlterTypeStatement}}. The patch removes the code that propagate type changes to other table/view/types, but I neither see where that has moved, nor why that wouldn't be needed anymore (as an aside, if it does is not needed anymore, than we could remove the {{updateTypes()}} and {{updateWith()}} methods that are unused in the patch). * I'd rename {{addRegularColumn}} and {{alterColumnType}} in {{ViewMetadata}} as they are not modifying the {{ViewMetadata}} but creating copies (say to {{withAddedRegularColumn()}} and {{withAlteredColumnType}}). * The patch changes some system table options (compared to current trunk), namely: ** {{AuthKeyspace}} tables don't get his {{memtableFlushPeriod}} to 1h. ** {{SystemDistributedKeyspace}} gets a default gcGrace, but it was of 10 days. ** None of the system table gets {{readRepairChance == 0}} (only {{dcLocaReadRepairChance}} is set to 0). * In {{Indexes.validate()}}, not sure it's worth doing the duplicate name check since we do it at the keyspace level already (and only doing it withing a single table is a false sense of safety). * {{Keyspaces}} has both a {{get(String)}} that return {{Optional}} and a {{getNullable(String)}}. We should probably remove one. * There is {{TODO: FIXME}} in ColumnFamilyStore.setCompressionParameters. * There is a {{TODO FIXME}} in SchemaKeyspace (and not 100% sure what it's about). * Nit: In {{SSTableTxnWriter}}, it feels slightly inconsistent that {{createRangeAware}} takes a {{TableMetadata}}, but {{create}} takes a {{TableMetadataRef}}. * In {{Schema}}: ** Realized afterwards that this wasn't new to this patch, but mentioning it nonetheless: loading the system keyspaces in Schema ctor feels dodgy, especially since it's conditional on some initialization methods (and it's a singleton). What if
[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable
[ https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749415#comment-15749415 ] Robert Stupp commented on CASSANDRA-9425: - Discussed offline with Aleksey about this and found a solution to remove the fields {{keyspaceAndTablePair}} and {{keyspaceAndTableBytes}} in {{TableMetadata}}. Patch on top of 9425-1 (CI triggered): ||trunk|[branch|https://github.com/apache/cassandra/compare/trunk...snazy:9425-opts]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-9425-opts-testall/lastSuccessfulBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-9425-opts-dtest/lastSuccessfulBuild/] > Make node-local schema fully immutable > -- > > Key: CASSANDRA-9425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9425 > Project: Cassandra > Issue Type: Sub-task >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko > Fix For: 4.0 > > > The way we handle schema changes currently is inherently racy. > All of our {{SchemaAlteringStatement}} s perform validation on a schema state > that's won't necessarily be there when the statement gets executed and > mutates schema. > We should make all the *Metadata classes ({{KeyspaceMetadata, > TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently > snapshottable, with a single top-level {{AtomicReference}} to the current > snapshot. Have DDL statements perform validation and transformation on the > same state. > In pseudo-code, think > {code} > public interface DDLStatement > { > /** > * Validates that the DDL statement can be applied to the provided schema > snapshot. > * > * @param schema snapshot of schema before executing CREATE KEYSPACE > */ > void validate(SchemaSnapshot schema); > > /** > * Applies the DDL statement to the provided schema snapshot. > * Implies that validate() has already been called on the provided > snapshot. > * > * @param schema snapshot of schema before executing the statement > * @return snapshot of schema as it would be after executing the statement > */ > SchemaSnapshot transform(SchemaSnapshot schema); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable
[ https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748646#comment-15748646 ] Aleksey Yeschenko commented on CASSANDRA-9425: -- Rebased against most recent trunk and force pushed. Test runs are clean. Many of the renames in the commit are not necessary. That said, I feel like 4.0 and this schema revamp is probably the last opportunity that will present itself in a while to get some order w/ metadata-related code. If there is a strong objection, I will roll any non-requred changes back. Also, I apologise for cramming it all (the renames and logic changes) in one commit. Some of it could be separated and should've. If pressed, I'll untangle these, too. > Make node-local schema fully immutable > -- > > Key: CASSANDRA-9425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9425 > Project: Cassandra > Issue Type: Sub-task >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko > Fix For: 4.0 > > > The way we handle schema changes currently is inherently racy. > All of our {{SchemaAlteringStatement}} s perform validation on a schema state > that's won't necessarily be there when the statement gets executed and > mutates schema. > We should make all the *Metadata classes ({{KeyspaceMetadata, > TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently > snapshottable, with a single top-level {{AtomicReference}} to the current > snapshot. Have DDL statements perform validation and transformation on the > same state. > In pseudo-code, think > {code} > public interface DDLStatement > { > /** > * Validates that the DDL statement can be applied to the provided schema > snapshot. > * > * @param schema snapshot of schema before executing CREATE KEYSPACE > */ > void validate(SchemaSnapshot schema); > > /** > * Applies the DDL statement to the provided schema snapshot. > * Implies that validate() has already been called on the provided > snapshot. > * > * @param schema snapshot of schema before executing the statement > * @return snapshot of schema as it would be after executing the statement > */ > SchemaSnapshot transform(SchemaSnapshot schema); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable
[ https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15693674#comment-15693674 ] Aleksey Yeschenko commented on CASSANDRA-9425: -- The first commit is finishing the work I haven't completed for schema in 3.0, making the remaining metadata objects effectively immutable. So that's mostly {{CFMetaData}}, {{ViewDefinition}}, and {{UDFunction}}. {{CFMetaData}} is now split into {{TableMetadata}} - an immutable, severely cleaned up, table descriptor class, and {{TableMetadataRef}} - a wrapper used in long-lived instances like {{ColumnFamilyStore}}, {{AtomicBTreePartition}}, and {{SSTable}}, where looking up {{Schema.instance}} is too much, yet updating them every time on a table change is a hassle, and potentially error-prone (though {{ColumnFamilyStore}} could arguably just have a {{TableMetadata}} field in place of {{TableMetadataRef}}. It's a huge diff, but a lot of it in tests, and most of it is renames in method signatures and fields. Multiple schema-related classes have been moved to {{org.apache.cassandra.schema}} package, where they ultimately belong, to allow for tighter visibility on some of the more dangerous schema altering methods. Some of them have also been renamed to more accurately reflect their purpose or be more consistent (which IMO is ok, if they are being moved to new files anyway). The review should focus on {{org.apache.cassandra.schema}} package, most everything else can be skimmed. The commit fixes multiple issues (for 4.0 only), including but not limited to CASSANDRA-12950, CASSANDRA-12951, CASSANDRA-12953, CASSANDRA-12954. It doesn't fix CASSANDRA-12949 but disables the method as unsafe. Among other things, Schema merge logic has been vastly simplified (and moved from {{SchemaKeyspace}} to {{Schema}}). CI runs: ||branch||testall||dtest|| |[9425-1|https://github.com/iamaleksey/cassandra/tree/9425-1]|[testall|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-9425-1-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-9425-1-dtest]| > Make node-local schema fully immutable > -- > > Key: CASSANDRA-9425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9425 > Project: Cassandra > Issue Type: Sub-task >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko > Fix For: 4.0 > > > The way we handle schema changes currently is inherently racy. > All of our {{SchemaAlteringStatement}} s perform validation on a schema state > that's won't necessarily be there when the statement gets executed and > mutates schema. > We should make all the *Metadata classes ({{KeyspaceMetadata, > TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently > snapshottable, with a single top-level {{AtomicReference}} to the current > snapshot. Have DDL statements perform validation and transformation on the > same state. > In pseudo-code, think > {code} > public interface DDLStatement > { > /** > * Validates that the DDL statement can be applied to the provided schema > snapshot. > * > * @param schema snapshot of schema before executing CREATE KEYSPACE > */ > void validate(SchemaSnapshot schema); > > /** > * Applies the DDL statement to the provided schema snapshot. > * Implies that validate() has already been called on the provided > snapshot. > * > * @param schema snapshot of schema before executing the statement > * @return snapshot of schema as it would be after executing the statement > */ > SchemaSnapshot transform(SchemaSnapshot schema); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable
[ https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15693620#comment-15693620 ] Aleksey Yeschenko commented on CASSANDRA-9425: -- Splitting the work into two separate commits: 1. Rework the remaining metadata objects to be immutable, or treated as such. Make it so that {{CFMetaData}}, {{ViewDefinition}}, {{UDFunction}} are no longer altered in-place 2. Rewrite all of DDL CQL statements to operate on (now fully) immutable {{Keyspaces}} instances, returning back a transformed one, while also collapsing validation and transformation steps in one method. This removed the single-node races between validation being run in parallel to each other and to migrations in request threads. Commit #1 available [here|https://github.com/iamaleksey/cassandra/commits/9425-1] Commit #2 to follow once #1 is committed. Then the work will move to CASSANDRA-10699 - first a spec then the code. More details on Commit #1 in the next comment. > Make node-local schema fully immutable > -- > > Key: CASSANDRA-9425 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9425 > Project: Cassandra > Issue Type: Sub-task >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko > Fix For: 4.0 > > > The way we handle schema changes currently is inherently racy. > All of our {{SchemaAlteringStatement}} s perform validation on a schema state > that's won't necessarily be there when the statement gets executed and > mutates schema. > We should make all the *Metadata classes ({{KeyspaceMetadata, > TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently > snapshottable, with a single top-level {{AtomicReference}} to the current > snapshot. Have DDL statements perform validation and transformation on the > same state. > In pseudo-code, think > {code} > public interface DDLStatement > { > /** > * Validates that the DDL statement can be applied to the provided schema > snapshot. > * > * @param schema snapshot of schema before executing CREATE KEYSPACE > */ > void validate(SchemaSnapshot schema); > > /** > * Applies the DDL statement to the provided schema snapshot. > * Implies that validate() has already been called on the provided > snapshot. > * > * @param schema snapshot of schema before executing the statement > * @return snapshot of schema as it would be after executing the statement > */ > SchemaSnapshot transform(SchemaSnapshot schema); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable
[ https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550746#comment-14550746 ] Aleksey Yeschenko commented on CASSANDRA-9425: -- FWIW I have most of the work done, originally as part of CASSANDRA-6717. That said, it's not correct to bundle the two, given that they aren't, strictly speaking, related, and one of them is blocking 3.0beta1. Make node-local schema fully immutable -- Key: CASSANDRA-9425 URL: https://issues.apache.org/jira/browse/CASSANDRA-9425 Project: Cassandra Issue Type: Sub-task Reporter: Aleksey Yeschenko Assignee: Aleksey Yeschenko Fix For: 3.x The way we handle schema changes currently is inherently racy. All of our {{SchemaAlteringStatement}} s perform validation on a schema state that's won't necessarily be there when the statement gets executed and mutates schema. We should make all the *Metadata classes ({{KeyspaceMetadata, TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently snapshottable, with a single top-level {{AtomicReference}} to the current snapshot. Have DDL statements perform validation and transformation on the same state. In pseudo-code, think {code} public interface DDLStatement { /** * Validates that the DDL statement can be applied to the provided schema snapshot. * * @param schema snapshot of schema before executing CREATE KEYSPACE */ void validate(SchemaSnapshot schema); /** * Applies the DDL statement to the provided schema snapshot. * Implies that validate() has already been called on the provided snapshot. * * @param schema snapshot of schema before executing the statement * @return snapshot of schema as it would be after executing the statement */ SchemaSnapshot transform(SchemaSnapshot schema); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)