[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable

2017-01-27 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843582#comment-15843582
 ] 

Aleksey Yeschenko commented on CASSANDRA-9425:
--

Committed as [af3fe39dcabd9ef77a00309ce6741268423206df|] to trunk. Will open up 
a separate follow up ticket for DDL statement rework, as the comment section 
for this JIRA has grown quite a bit.

Thanks again for the review.

> Make node-local schema fully immutable
> --
>
> Key: CASSANDRA-9425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9425
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
> Fix For: 4.0
>
>
> The way we handle schema changes currently is inherently racy.
> All of our {{SchemaAlteringStatement}} s perform validation on a schema state 
> that's won't necessarily be there when the statement gets executed and 
> mutates schema.
> We should make all the *Metadata classes ({{KeyspaceMetadata, 
> TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently 
> snapshottable, with a single top-level {{AtomicReference}} to the current 
> snapshot. Have DDL statements perform validation and transformation on the 
> same state.
> In pseudo-code, think
> {code}
> public interface DDLStatement
> {
> /**
>  * Validates that the DDL statement can be applied to the provided schema 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing CREATE KEYSPACE
>  */
> void validate(SchemaSnapshot schema);
>  
> /**
>  * Applies the DDL statement to the provided schema snapshot.
>  * Implies that validate() has already been called on the provided 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing the statement
>  * @return snapshot of schema as it would be after executing the statement
>  */
> SchemaSnapshot transform(SchemaSnapshot schema);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable

2017-01-27 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15842944#comment-15842944
 ] 

Sylvain Lebresne commented on CASSANDRA-9425:
-

bq. With both of these present in current trunk code, I'd prefer to commit what 
is here as is, and move those into a separate ticket

Wfm.

Thanks for addressing the rest, +1, ship it (assuming clean CI, which we seem 
to have but I checked quickly). Great work!

> Make node-local schema fully immutable
> --
>
> Key: CASSANDRA-9425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9425
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
> Fix For: 4.0
>
>
> The way we handle schema changes currently is inherently racy.
> All of our {{SchemaAlteringStatement}} s perform validation on a schema state 
> that's won't necessarily be there when the statement gets executed and 
> mutates schema.
> We should make all the *Metadata classes ({{KeyspaceMetadata, 
> TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently 
> snapshottable, with a single top-level {{AtomicReference}} to the current 
> snapshot. Have DDL statements perform validation and transformation on the 
> same state.
> In pseudo-code, think
> {code}
> public interface DDLStatement
> {
> /**
>  * Validates that the DDL statement can be applied to the provided schema 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing CREATE KEYSPACE
>  */
> void validate(SchemaSnapshot schema);
>  
> /**
>  * Applies the DDL statement to the provided schema snapshot.
>  * Implies that validate() has already been called on the provided 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing the statement
>  * @return snapshot of schema as it would be after executing the statement
>  */
> SchemaSnapshot transform(SchemaSnapshot schema);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable

2017-01-26 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840109#comment-15840109
 ] 

Aleksey Yeschenko commented on CASSANDRA-9425:
--

bq. TableMetadataRef.to() feels like it could be misused easily so I'd suggest 
making things a bit more explicit.

Done.

bq. for indexes, where we kind of hack around them not having their own ID.

Addressed by moving 2i {{TableMetadata}} s handling to {{Tables}} and 
{{Schema}}.

bq. Also, the Schema ctor should probably be private.

Addressed.

bq. Some of Schema.createKeyspace() is actually redundant: Keyspace ctor 
already initialize all tables and views, so it ends up being duplicated. It's 
largely harmless though we do end up reloading each ColumnFamilyStore for no 
reason.

Addressed.

bq. Schema.load and Schema.unload should probably be synchronized for their 
modification of keyspaces.

Addressed.

bq. validateTable seems to replace Validation.validateColumnFamily but the 
latter is still there and used in a few instances. Would be nice to clean up. 
Validation.validateKeyspaceNotSystem could probably be migrated too for 
consistency.

Addressed.

bq. In TableMetadata, validateCompatibility() should probably check more stuffs 
(we can't alter non-PK column types as we want, nor change the partition key 
size for instance).

Addressed.

The only unaddressed issues at this point are loading of schema and system 
keyspaces in {{Schema}} static block and a proper refactoring of {{load()}}. 
With both of these present in current trunk code, I'd prefer to commit what is 
here as is, and move those into a separate ticket (I would also like to 
separate out {{Keyspace}} instance handling and metadata handling into two 
separate classes instead of overloading {{Schema}}, to address the issue 
properly and not just put some lipstick on this pig).

> Make node-local schema fully immutable
> --
>
> Key: CASSANDRA-9425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9425
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
> Fix For: 4.0
>
>
> The way we handle schema changes currently is inherently racy.
> All of our {{SchemaAlteringStatement}} s perform validation on a schema state 
> that's won't necessarily be there when the statement gets executed and 
> mutates schema.
> We should make all the *Metadata classes ({{KeyspaceMetadata, 
> TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently 
> snapshottable, with a single top-level {{AtomicReference}} to the current 
> snapshot. Have DDL statements perform validation and transformation on the 
> same state.
> In pseudo-code, think
> {code}
> public interface DDLStatement
> {
> /**
>  * Validates that the DDL statement can be applied to the provided schema 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing CREATE KEYSPACE
>  */
> void validate(SchemaSnapshot schema);
>  
> /**
>  * Applies the DDL statement to the provided schema snapshot.
>  * Implies that validate() has already been called on the provided 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing the statement
>  * @return snapshot of schema as it would be after executing the statement
>  */
> SchemaSnapshot transform(SchemaSnapshot schema);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable

2017-01-18 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828320#comment-15828320
 ] 

Aleksey Yeschenko commented on CASSANDRA-9425:
--

Anyway, thanks. Addressed the points in the last comment. The remained of the 
original ones later tonight.

> Make node-local schema fully immutable
> --
>
> Key: CASSANDRA-9425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9425
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
> Fix For: 4.0
>
>
> The way we handle schema changes currently is inherently racy.
> All of our {{SchemaAlteringStatement}} s perform validation on a schema state 
> that's won't necessarily be there when the statement gets executed and 
> mutates schema.
> We should make all the *Metadata classes ({{KeyspaceMetadata, 
> TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently 
> snapshottable, with a single top-level {{AtomicReference}} to the current 
> snapshot. Have DDL statements perform validation and transformation on the 
> same state.
> In pseudo-code, think
> {code}
> public interface DDLStatement
> {
> /**
>  * Validates that the DDL statement can be applied to the provided schema 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing CREATE KEYSPACE
>  */
> void validate(SchemaSnapshot schema);
>  
> /**
>  * Applies the DDL statement to the provided schema snapshot.
>  * Implies that validate() has already been called on the provided 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing the statement
>  * @return snapshot of schema as it would be after executing the statement
>  */
> SchemaSnapshot transform(SchemaSnapshot schema);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable

2017-01-18 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15828214#comment-15828214
 ] 

Aleksey Yeschenko commented on CASSANDRA-9425:
--

bq. Well, actually, your motivations on this are lost on me. I'm certainly not 
pretending that optimization was the most important thing ever, but as 
mentioned in the original ticket it did show up in some profiling and the 
"optimization" was beyond tiny and entirely encapsulated within the class. As 
an aside, also not a fan of manually writting hashcode() methods when the 
simpler Objects.hashcode() can be and was used.

I do want to get rid of the auth object in my metadata code because 
conceptually I don't want metadata to be aware of auth. But once you get rid of 
the field, memoising the hash code in the {{DataResource}} object itself 
becomes meaningless, so you need to 'fix' {{DataResource.hashCode()}} properly, 
too (in that JIRA, you claimed that the allocation of the array for varargs was 
an issue; in reality it shouldn't be - once that code gets hot enough, it 
should be stack allocated, but I have no flight recorder runs to prove it). So 
should some of the allocated {{DataResource}} objects. TL;DR: I don't like this 
part of the CASSANDRA-10410 ticket, but I don't care enough about this to fight 
it, so I can revert my change.

> Make node-local schema fully immutable
> --
>
> Key: CASSANDRA-9425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9425
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
> Fix For: 4.0
>
>
> The way we handle schema changes currently is inherently racy.
> All of our {{SchemaAlteringStatement}} s perform validation on a schema state 
> that's won't necessarily be there when the statement gets executed and 
> mutates schema.
> We should make all the *Metadata classes ({{KeyspaceMetadata, 
> TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently 
> snapshottable, with a single top-level {{AtomicReference}} to the current 
> snapshot. Have DDL statements perform validation and transformation on the 
> same state.
> In pseudo-code, think
> {code}
> public interface DDLStatement
> {
> /**
>  * Validates that the DDL statement can be applied to the provided schema 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing CREATE KEYSPACE
>  */
> void validate(SchemaSnapshot schema);
>  
> /**
>  * Applies the DDL statement to the provided schema snapshot.
>  * Implies that validate() has already been called on the provided 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing the statement
>  * @return snapshot of schema as it would be after executing the statement
>  */
> SchemaSnapshot transform(SchemaSnapshot schema);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable

2017-01-18 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827747#comment-15827747
 ] 

Sylvain Lebresne commented on CASSANDRA-9425:
-

bq. and undid a tiny bit of of the related CASSANDRA-10410 optimisation I hope 
you don’t mind.

Well, actually, your motivations on this are lost on me. I'm certainly not 
pretending that optimization was the most important thing ever, but as 
mentioned in the original ticket it did show up in some profiling and the 
"optimization" was beyond tiny and entirely encapsulated within the class. As 
an aside, also not a fan of manually writting hashcode() methods when the 
simpler Objects.hashcode() can be and was used.

bq. Problem is, in announce for CREATE INDEX we do not call KSM.validate(), 
yet. So for the time being, until the next patch, where we always validate the 
entirety of the resulting schema before applying anything, it has to stay here, 
even if imperfect.

Wfm, though it sounds like something we could add as comment in the code, so we 
remember what's up in the unlikely even that "next patch" forgets about this 
(and so I don't have to ask about this removal when reviewing the next patch 
since I'll most likely will have forgot about it).

bq. we should probably eventually move their processing elsewhere, away from 
the sensitive path

I don't necessarily disagree but I don't think it's a big problem right now so 
I'd suggest keeping it simple for now.

bq. It’s a very minor case of duplication that doesn’t worry me, I’d rather not 
factor it wouldn’t strictly speaking make things simpler.

I think that, all other things being equal, terser and easier to read code 
*does* make things simpler, even if it's a tiny bit so, so I still stand by my 
suggestion. But it's certainly small and probably up to personal preferences, 
so ok.

bq. It’s an ugly pre-existing piece of code to avoid re-compiling the UDFs; It 
annoys me immensely that a method in SchemaKeyspace is referencing 
Schema.instance, hence the TODO. I want to get rid of it eventually, but it’s 
not important enough to spend my time on it atm.

Fair enough, but could you update the TODO with this expanded and useful 
context.

bq. The alternative would be using {{get(name).orElse(null)}}

Well, that's one alternative. The other, which I'd personally would go with, is 
to remove the method that return {{Optional}} altogether. {{Optional}} is 
useless outside of adding clarity to the fact there may be no result, but 
naming the method {{getNullable}} is good enough on that front imo.

Thanks for the explanations on the rest.

> Make node-local schema fully immutable
> --
>
> Key: CASSANDRA-9425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9425
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
> Fix For: 4.0
>
>
> The way we handle schema changes currently is inherently racy.
> All of our {{SchemaAlteringStatement}} s perform validation on a schema state 
> that's won't necessarily be there when the statement gets executed and 
> mutates schema.
> We should make all the *Metadata classes ({{KeyspaceMetadata, 
> TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently 
> snapshottable, with a single top-level {{AtomicReference}} to the current 
> snapshot. Have DDL statements perform validation and transformation on the 
> same state.
> In pseudo-code, think
> {code}
> public interface DDLStatement
> {
> /**
>  * Validates that the DDL statement can be applied to the provided schema 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing CREATE KEYSPACE
>  */
> void validate(SchemaSnapshot schema);
>  
> /**
>  * Applies the DDL statement to the provided schema snapshot.
>  * Implies that validate() has already been called on the provided 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing the statement
>  * @return snapshot of schema as it would be after executing the statement
>  */
> SchemaSnapshot transform(SchemaSnapshot schema);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable

2017-01-17 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826564#comment-15826564
 ] 

Aleksey Yeschenko commented on CASSANDRA-9425:
--

Alright. Fixed the failing tests ({{TableId}} patch broke Paxos, and index 
metadata patch altered custom index handling a bit - for the better - but broke 
{{CustomIndexTests}}), and addressed some of the comments.

In addition got rid of {{DataResource}} in {{TableMetadata}}, and undid a tiny 
bit of of the related CASSANDRA-10410 optimisation I hope you don’t mind.

bq. I'd rename {{TableMetadata.table}} to {{TableMetadata.name}} for 
consistency with other classes ({{KeyspaceMetadata}} and {{IndexMetadata}} at 
least). While at it, I'd also rename {{ksName}} to {{keyspace}} and 
{{viewName}} to {{name}} in {{ViewMetadata}}.

Both done.

bq. I'd also probably rename {{PartitionColumns}} to 
{{RegularAndStaticColumns}} now.

Done.

bq. Nit: In {{SSTableTxnWriter}}, it feels slightly inconsistent that 
{{createRangeAware}} takes a {{TableMetadata}}, but create takes a 
{{TableMetadataRef}}.

I try to use {{TableMetadata}} directly whenever possible, only pass the Ref 
when unavoidable. In this case it’s no big deal, though, so done.

bq. In {{Indexes.validate()}}, not sure it's worth doing the duplicate name 
check since we do it at the keyspace level already (and only doing it within a 
single table is a false sense of safety).

Problem is, in announce for {{CREATE INDEX}} we do not call {{KSM.validate()}}, 
yet. So for the time being, until the next patch, where we always validate the 
entirety of the resulting schema
before applying anything, it has to stay here, even if imperfect. Removing it 
will fail some tests. And some checks - for now - are still better than no 
checks. Verdict: left be until the next patch.

bq. I'd rename {{addRegularColumn}} and {{alterColumnType}} in {{ViewMetadata}} 
as they are not modifying the {{ViewMetadata}} but creating copies (say to 
{{withAddedRegularColumn()}} and {{withAlteredColumnType}}).

Done.

bq. Is there a rational for notifying everything at the end, versus calling 
each notification in the "appropriate" sub-method (notifying table alter in 
{{alterTable()}}, )? The later would feels a tad more readable to me as 
it's slightly easier to check we haven't forgotten something.

I prefer it stylistically. But I also want to get schema and related DB objects 
({{Keyspace}}, {{CFS}}, and everything downstream) to get to the consistent 
state as fast as possible. For that reason
I’m delaying potentially blocking change event handlers until the very end. On 
that note, we should probably eventually move their processing elsewhere, away 
from the sensitive path - just enqueue them into some queue and have something 
poll and process the queue from time to time.

bq. I'd maybe suggest adding a private {{handleDiff(mapDiff, Function 
onDropped, Function onAdded, Function onUpdated)}} method to simplify a bit.

It’s a very minor case of duplication that doesn’t worry me, I’d rather not 
factor it wouldn’t strictly speaking make things simpler. Also see the previous 
point.

bq. There is a TODO FIXME in {{SchemaKeyspace}} (and not 100% sure what it's 
about).

It’s an ugly pre-existing piece of code to avoid re-compiling the UDFs; It 
annoys me immensely that a method in {{SchemaKeyspace}} is referencing 
{{Schema.instance}}, hence the TODO. I want to get rid of it eventually, but 
it’s not important enough to spend my time on it atm.

bq. There is TODO: FIXME in {{ColumnFamilyStore.setCompressionParameters}}.

See CASSANDRA-12949. The method in its current implementation (1.1+) is broken 
and unsafe, so I’ve disabled it for safety reasons. Hopefully someone will get 
around to dealing with it some time, but I don’t have the time. Either way, 
committed a clarification for that TODO.

bq. {{Keyspaces has both a {{get(String)}} that return {{Optional}} and a 
{{getNullable(String)}}. We should probably remove one.

Not just {{Keyspaces}}. I prefer to have both options available, as there are 
multiple things that can tolerate null return. The alternative would be using 
{{get(name).orElse(null)}}, which I personally find unappealing.

bq. The patch changes some system table options (compared to current trunk)

I removed the explicit setting or {{readRepairChance}} to 0.0 as it was 
redundant. We made {{readRepairChance}} default to be 0.0 a while ago. Only 
{{dcLocalReadRepairChance}} needs resetting.

{{system_distributed}} was switched to default gc gs because gc gs of 0 is a 
bug. {{view_build_status}} table accepts deletes, which can be compacted away 
before propagation with 0 gc gs. See CASSANDRA-12954. Other tables in that 
keyspace should be unaffected, as they don’t use TTL nor explicit deletes.

As for {{system_auth}}, it was never intended to have a special memtable flush 
period setting, it was just 

[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable

2017-01-11 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818764#comment-15818764
 ] 

Aleksey Yeschenko commented on CASSANDRA-9425:
--

Thanks. I have incorporated both yours and Robert's changes, and rebased 
against most recent trunk. Rerunning the tests now.

Will address your other suggestions once I get the current branch to green test 
state again.

> Make node-local schema fully immutable
> --
>
> Key: CASSANDRA-9425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9425
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
> Fix For: 4.0
>
>
> The way we handle schema changes currently is inherently racy.
> All of our {{SchemaAlteringStatement}} s perform validation on a schema state 
> that's won't necessarily be there when the statement gets executed and 
> mutates schema.
> We should make all the *Metadata classes ({{KeyspaceMetadata, 
> TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently 
> snapshottable, with a single top-level {{AtomicReference}} to the current 
> snapshot. Have DDL statements perform validation and transformation on the 
> same state.
> In pseudo-code, think
> {code}
> public interface DDLStatement
> {
> /**
>  * Validates that the DDL statement can be applied to the provided schema 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing CREATE KEYSPACE
>  */
> void validate(SchemaSnapshot schema);
>  
> /**
>  * Applies the DDL statement to the provided schema snapshot.
>  * Implies that validate() has already been called on the provided 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing the statement
>  * @return snapshot of schema as it would be after executing the statement
>  */
> SchemaSnapshot transform(SchemaSnapshot schema);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable

2017-01-10 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15814547#comment-15814547
 ] 

Sylvain Lebresne commented on CASSANDRA-9425:
-

I very much like the changes of this patch in general. I also agree with 
[~snazy] small follow-up, getting rid of {{keyspaceAndTablePair}} sounds good. 
The patch is big though so I have a bunch of remarks/suggestions, some of which 
may or may not be very useful. Some of the comments may also not be entirely 
new to this patch, I haven't double-checked everything, but as the main goal of 
this patch is to clean stuff up, let say I don't feel too bad if there is a few 
such remarks.

I have also a few things, mostly nits, that I just wrote while reviewing and 
pushed in branch below. One noticeable commit that is not entirely nitty is the 
introduction of a {{TableId}} class (that just "wrap" a UUID). It felts like 
this make the code read better with no real downside. It's a trivial patch, 
just with a big-ish diff.
| [9425-review|https://github.com/pcmanus/cassandra/commits/9425-review] | 
[utests|http://cassci.datastax.com/job/pcmanus-9425-review-testall] | 
[dtests|http://cassci.datastax.com/job/pcmanus-9425-review-dtest] |


Other points:
* {{TableMetadataRef.to()}} feels like it could be misused easily so I'd 
suggest making things a bit more explicit. In practice, it has 3 main usages:
*# in {{Schema}}, which is kind of the one true use, where I'd directly use the 
ctor (which would be package-protected and have the same comment than on 
{{set()}}).
*# for offline tools: I'd rename {{to()}} to something like 
{{forOfflineTool()}} that make it clear it's ok only in that context. A comment 
would be nice too.
*# for indexes, where we kind of hack around them not having their own ID. I 
think it's worth pointing out we're doing something specific by having a 
{{forIndex()}} method with proper comments (and probably an assertion that 
validate it's indeed called on an index). Or maybe go with my next point.
* Talking of index handling, 
{{TableMetadataRef.inPlaceUpdateIndexedTableMetadata}} feels dangerous. The 
{{set()}} method is very explicit about not being for public consumption, but 
that {{inPlaceUpdateIndexedTableMetadata}} method calls {{set()}} directly and 
is public. It's also particularly hard to convince oneself that it's used in a 
safe maner (you check first that {{SecondaryIndexManager.reload()}} is indeed 
only called protected by migration synchronization, but the call to 
{{inPlaceUpdateIndexedTableMetadata}} is on a task executed by an 
{{ExecutorService}} so you have to double-check it's not really asynchronous 
(despite what the javadoc in {{Index.java}} claims!) and thus is indeed 
protected).
  Anyway, at a minimum some comments are imo missing, but I wonder if we could 
make this more obviously correct by pushing it to {{Schema}}, having a map of 
'index name' -> TableMetadataRef specific to indexes. Updating that in 
{{Schema.load()}} shouldn't be terribly complex. Or am I missing a reason why 
that wouldn't work?
* I'm unclear on the changes to {{AlterTypeStatement}}. The patch removes the 
code that propagate type changes to other table/view/types, but I neither see 
where that has moved, nor why that wouldn't be needed anymore (as an aside, if 
it does is not needed anymore, than we could remove the {{updateTypes()}} and 
{{updateWith()}} methods that are unused in the patch).
* I'd rename {{addRegularColumn}} and {{alterColumnType}} in {{ViewMetadata}} 
as they are not modifying the {{ViewMetadata}} but creating copies (say to 
{{withAddedRegularColumn()}} and {{withAlteredColumnType}}).
* The patch changes some system table options (compared to current trunk), 
namely:
** {{AuthKeyspace}} tables don't get his {{memtableFlushPeriod}} to 1h.
** {{SystemDistributedKeyspace}} gets a default gcGrace, but it was of 10 days.
** None of the system table gets {{readRepairChance == 0}} (only 
{{dcLocaReadRepairChance}} is set to 0).
* In {{Indexes.validate()}}, not sure it's worth doing the duplicate name check 
since we do it at the keyspace level already (and only doing it withing a 
single table is a false sense of safety).
* {{Keyspaces}} has both a {{get(String)}} that return {{Optional}} and a 
{{getNullable(String)}}. We should probably remove one.
* There is {{TODO: FIXME}} in ColumnFamilyStore.setCompressionParameters.
* There is a {{TODO FIXME}} in SchemaKeyspace (and not 100% sure what it's 
about).
* Nit: In {{SSTableTxnWriter}}, it feels slightly inconsistent that 
{{createRangeAware}} takes a {{TableMetadata}}, but {{create}} takes a 
{{TableMetadataRef}}.
* In {{Schema}}:
** Realized afterwards that this wasn't new to this patch, but mentioning it 
nonetheless: loading the system keyspaces in Schema ctor feels dodgy, 
especially since it's conditional on some initialization methods (and it's a 
singleton). What if 

[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable

2016-12-14 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749415#comment-15749415
 ] 

Robert Stupp commented on CASSANDRA-9425:
-

Discussed offline with Aleksey about this and found a solution to remove the 
fields {{keyspaceAndTablePair}} and {{keyspaceAndTableBytes}} in 
{{TableMetadata}}.
Patch on top of 9425-1 (CI triggered):
||trunk|[branch|https://github.com/apache/cassandra/compare/trunk...snazy:9425-opts]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-9425-opts-testall/lastSuccessfulBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-9425-opts-dtest/lastSuccessfulBuild/]


> Make node-local schema fully immutable
> --
>
> Key: CASSANDRA-9425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9425
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
> Fix For: 4.0
>
>
> The way we handle schema changes currently is inherently racy.
> All of our {{SchemaAlteringStatement}} s perform validation on a schema state 
> that's won't necessarily be there when the statement gets executed and 
> mutates schema.
> We should make all the *Metadata classes ({{KeyspaceMetadata, 
> TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently 
> snapshottable, with a single top-level {{AtomicReference}} to the current 
> snapshot. Have DDL statements perform validation and transformation on the 
> same state.
> In pseudo-code, think
> {code}
> public interface DDLStatement
> {
> /**
>  * Validates that the DDL statement can be applied to the provided schema 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing CREATE KEYSPACE
>  */
> void validate(SchemaSnapshot schema);
>  
> /**
>  * Applies the DDL statement to the provided schema snapshot.
>  * Implies that validate() has already been called on the provided 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing the statement
>  * @return snapshot of schema as it would be after executing the statement
>  */
> SchemaSnapshot transform(SchemaSnapshot schema);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable

2016-12-14 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748646#comment-15748646
 ] 

Aleksey Yeschenko commented on CASSANDRA-9425:
--

Rebased against most recent trunk and force pushed. Test runs are clean.

Many of the renames in the commit are not necessary. That said, I feel like 4.0 
and this schema revamp is probably the last opportunity that will present 
itself in a while to get some order w/ metadata-related code.

If there is a strong objection, I will roll any non-requred changes back. Also, 
I apologise for cramming it all (the renames and logic changes) in one commit. 
Some of it could be separated and should've. If pressed, I'll untangle these, 
too.

> Make node-local schema fully immutable
> --
>
> Key: CASSANDRA-9425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9425
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
> Fix For: 4.0
>
>
> The way we handle schema changes currently is inherently racy.
> All of our {{SchemaAlteringStatement}} s perform validation on a schema state 
> that's won't necessarily be there when the statement gets executed and 
> mutates schema.
> We should make all the *Metadata classes ({{KeyspaceMetadata, 
> TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently 
> snapshottable, with a single top-level {{AtomicReference}} to the current 
> snapshot. Have DDL statements perform validation and transformation on the 
> same state.
> In pseudo-code, think
> {code}
> public interface DDLStatement
> {
> /**
>  * Validates that the DDL statement can be applied to the provided schema 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing CREATE KEYSPACE
>  */
> void validate(SchemaSnapshot schema);
>  
> /**
>  * Applies the DDL statement to the provided schema snapshot.
>  * Implies that validate() has already been called on the provided 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing the statement
>  * @return snapshot of schema as it would be after executing the statement
>  */
> SchemaSnapshot transform(SchemaSnapshot schema);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable

2016-11-24 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15693674#comment-15693674
 ] 

Aleksey Yeschenko commented on CASSANDRA-9425:
--

The first commit is finishing the work I haven't completed for schema in 3.0, 
making the remaining metadata objects effectively immutable. So that's mostly 
{{CFMetaData}}, {{ViewDefinition}}, and {{UDFunction}}.

{{CFMetaData}} is now split into {{TableMetadata}} - an immutable, severely 
cleaned up, table descriptor class, and {{TableMetadataRef}} - a wrapper used 
in long-lived instances like {{ColumnFamilyStore}}, {{AtomicBTreePartition}}, 
and {{SSTable}}, where looking up {{Schema.instance}} is too much, yet updating 
them every time on a table change is a hassle, and potentially error-prone 
(though {{ColumnFamilyStore}} could arguably just have a {{TableMetadata}} 
field in place of {{TableMetadataRef}}.

It's a huge diff, but a lot of it in tests, and most of it is renames in method 
signatures and fields.

Multiple schema-related classes have been moved to 
{{org.apache.cassandra.schema}} package, where they ultimately belong, to allow 
for tighter visibility on some of the more dangerous schema altering methods. 
Some of them have also been renamed to more accurately reflect their purpose or 
be more consistent (which IMO is ok, if they are being moved to new files 
anyway).

The review should focus on {{org.apache.cassandra.schema}} package, most 
everything else can be skimmed.

The commit fixes multiple issues (for 4.0 only), including but not limited to 
CASSANDRA-12950, CASSANDRA-12951, CASSANDRA-12953, CASSANDRA-12954. It doesn't 
fix CASSANDRA-12949 but disables the method as unsafe.

Among other things, Schema merge logic has been vastly simplified (and moved 
from {{SchemaKeyspace}} to {{Schema}}).

CI runs:
||branch||testall||dtest||
|[9425-1|https://github.com/iamaleksey/cassandra/tree/9425-1]|[testall|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-9425-1-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-9425-1-dtest]|

> Make node-local schema fully immutable
> --
>
> Key: CASSANDRA-9425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9425
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
> Fix For: 4.0
>
>
> The way we handle schema changes currently is inherently racy.
> All of our {{SchemaAlteringStatement}} s perform validation on a schema state 
> that's won't necessarily be there when the statement gets executed and 
> mutates schema.
> We should make all the *Metadata classes ({{KeyspaceMetadata, 
> TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently 
> snapshottable, with a single top-level {{AtomicReference}} to the current 
> snapshot. Have DDL statements perform validation and transformation on the 
> same state.
> In pseudo-code, think
> {code}
> public interface DDLStatement
> {
> /**
>  * Validates that the DDL statement can be applied to the provided schema 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing CREATE KEYSPACE
>  */
> void validate(SchemaSnapshot schema);
>  
> /**
>  * Applies the DDL statement to the provided schema snapshot.
>  * Implies that validate() has already been called on the provided 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing the statement
>  * @return snapshot of schema as it would be after executing the statement
>  */
> SchemaSnapshot transform(SchemaSnapshot schema);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable

2016-11-24 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15693620#comment-15693620
 ] 

Aleksey Yeschenko commented on CASSANDRA-9425:
--

Splitting the work into two separate commits:

1. Rework the remaining metadata objects to be immutable, or treated as such. 
Make it so that {{CFMetaData}}, {{ViewDefinition}}, {{UDFunction}} are no 
longer altered in-place

2. Rewrite all of DDL CQL statements to operate on (now fully) immutable 
{{Keyspaces}} instances, returning back a transformed one, while also 
collapsing validation and transformation steps in one method. This removed the 
single-node races between validation being run in parallel to each other and to 
migrations in request threads.

Commit #1 available 
[here|https://github.com/iamaleksey/cassandra/commits/9425-1]
Commit #2 to follow once #1 is committed.

Then the work will move to CASSANDRA-10699 - first a spec then the code.

More details on Commit #1 in the next comment.

> Make node-local schema fully immutable
> --
>
> Key: CASSANDRA-9425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9425
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
> Fix For: 4.0
>
>
> The way we handle schema changes currently is inherently racy.
> All of our {{SchemaAlteringStatement}} s perform validation on a schema state 
> that's won't necessarily be there when the statement gets executed and 
> mutates schema.
> We should make all the *Metadata classes ({{KeyspaceMetadata, 
> TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently 
> snapshottable, with a single top-level {{AtomicReference}} to the current 
> snapshot. Have DDL statements perform validation and transformation on the 
> same state.
> In pseudo-code, think
> {code}
> public interface DDLStatement
> {
> /**
>  * Validates that the DDL statement can be applied to the provided schema 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing CREATE KEYSPACE
>  */
> void validate(SchemaSnapshot schema);
>  
> /**
>  * Applies the DDL statement to the provided schema snapshot.
>  * Implies that validate() has already been called on the provided 
> snapshot.
>  *
>  * @param schema snapshot of schema before executing the statement
>  * @return snapshot of schema as it would be after executing the statement
>  */
> SchemaSnapshot transform(SchemaSnapshot schema);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9425) Make node-local schema fully immutable

2015-05-19 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550746#comment-14550746
 ] 

Aleksey Yeschenko commented on CASSANDRA-9425:
--

FWIW I have most of the work done, originally as part of CASSANDRA-6717. That 
said, it's not correct to bundle the two, given that they aren't, strictly 
speaking, related, and one of them is blocking 3.0beta1.

 Make node-local schema fully immutable
 --

 Key: CASSANDRA-9425
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9425
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko
 Fix For: 3.x


 The way we handle schema changes currently is inherently racy.
 All of our {{SchemaAlteringStatement}} s perform validation on a schema state 
 that's won't necessarily be there when the statement gets executed and 
 mutates schema.
 We should make all the *Metadata classes ({{KeyspaceMetadata, 
 TableMetadata}}, {{ColumnMetadata}}, immutable, and local schema persistently 
 snapshottable, with a single top-level {{AtomicReference}} to the current 
 snapshot. Have DDL statements perform validation and transformation on the 
 same state.
 In pseudo-code, think
 {code}
 public interface DDLStatement
 {
  /**
  * Validates that the DDL statement can be applied to the provided schema 
 snapshot.
  *
  * @param schema snapshot of schema before executing CREATE KEYSPACE
  */
 void validate(SchemaSnapshot schema);
  
 /**
  * Applies the DDL statement to the provided schema snapshot.
  * Implies that validate() has already been called on the provided 
 snapshot.
  *
  * @param schema snapshot of schema before executing the statement
  * @return snapshot of schema as it would be after executing the statement
  */
 SchemaSnapshot transform(SchemaSnapshot schema);
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)