[jira] [Commented] (CASSANDRA-6717) Modernize schema tables

Tyler Hobbs (JIRA) Thu, 09 Jul 2015 11:00:20 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620949#comment-14620949
 ]


Tyler Hobbs commented on CASSANDRA-6717:
----------------------------------------

bq. If we drop it here, we should drop it from 2.2 as well - depends on whether 
or not we care about 2.2.0 being compatible with rc1-created aggregates. That 
said, it doesn't hurt to have it there, and it's already gone from 
SchemaKeyspace.

I don't think we care about supporting rc1-created aggregates, but if you think 
it's worth it to keep the code there, maybe just make the code comment clearer 
to indicate that.

bq. I think after 2.1 startup upgrade it should never be the case, but it 
doesn't hurt to leave it there either. I did remove it from 
SchemaKeyspace.createTableFromTableRowAndColumnRows() though

Okay, maybe just add that as a code comment where it's left?

bq. That should be covered by SchemaLoader.schemaDefinition(). Some of the 
tables there are currently commented out though, for one reason or the other.

Ah, I totally missed that.  However, relying on the right kinds of tables being 
in SchemaLoader for testing seems haphazard.  Future changes to that file could 
result in us losing test coverage without realizing it.  I have a strong 
preference for systematically listing out all of the table structures we to 
cover need inside LegacySchemaMigratorTest.

bq. The rest of the tests - yes. Will add later, but before closing the ticket, 
if you don't mind - I need this, and more code depending on this, in trunk, 
soon.

Sounds good.

bq. We used to add deletion to drop table/drop keyspace mutations \[...\] It's 
no longer feasible now that schema tables are not in the same keyspace as 
IndexInfo table

Makes sense, thanks.

It looks like cassci isn't running the tests on this branch for some reason.  
Can you look into that so that we don't accidentally hose everything when this 
is committed?

> Modernize schema tables
> -----------------------
>
>                 Key: CASSANDRA-6717
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6717
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Sylvain Lebresne
>            Assignee: Aleksey Yeschenko
>              Labels: client-impacting
>             Fix For: 3.0 beta 1
>
>
> There is a few problems/improvements that can be done with the way we store 
> schema:
> # CASSANDRA-4988: as explained on the ticket, storing the comparator is now 
> redundant (or almost, we'd need to store whether the table is COMPACT or not 
> too, which we don't currently is easy and probably a good idea anyway), it 
> can be entirely reconstructed from the infos in schema_columns (the same is 
> true of key_validator and subcomparator, and replacing default_validator by a 
> COMPACT_VALUE column in all case is relatively simple). And storing the 
> comparator as an opaque string broke concurrent updates of sub-part of said 
> comparator (concurrent collection addition or altering 2 separate clustering 
> columns typically) so it's really worth removing it.
> # CASSANDRA-4603: it's time to get rid of those ugly json maps. I'll note 
> that schema_keyspaces is a problem due to its use of COMPACT STORAGE, but I 
> think we should fix it once and for-all nonetheless (see below).
> # For CASSANDRA-6382 and to allow indexing both map keys and values at the 
> same time, we'd need to be able to have more than one index definition for a 
> given column.
> # There is a few mismatches in table options between the one stored in the 
> schema and the one used when declaring/altering a table which would be nice 
> to fix. The compaction, compression and replication maps are one already 
> mentioned from CASSANDRA-4603, but also for some reason 
> 'dclocal_read_repair_chance' in CQL is called just 'local_read_repair_chance' 
> in the schema table, and 'min/max_compaction_threshold' are column families 
> option in the schema but just compaction options for CQL (which makes more 
> sense).
> None of those issues are major, and we could probably deal with them 
> independently but it might be simpler to just fix them all in one shot so I 
> wanted to sum them all up here. In particular, the fact that 
> 'schema_keyspaces' uses COMPACT STORAGE is annoying (for the replication map, 
> but it may limit future stuff too) which suggest we should migrate it to a 
> new, non COMPACT table. And while that's arguably a detail, it wouldn't hurt 
> to rename schema_columnfamilies to schema_tables for the years to come since 
> that's the prefered vernacular for CQL.
> Overall, what I would suggest is to move all schema tables to a new keyspace, 
> named 'schema' for instance (or 'system_schema' but I prefer the shorter 
> version), and fix all the issues above at once. Since we currently don't 
> exchange schema between nodes of different versions, all we'd need to do that 
> is a one shot startup migration, and overall, I think it could be simpler for 
> clients to deal with one clear migration than to have to handle minor 
> individual changes all over the place. I also think it's somewhat cleaner 
> conceptually to have schema tables in their own keyspace since they are 
> replicated through a different mechanism than other system tables.
> If we do that, we could, for instance, migrate to the following schema tables 
> (details up for discussion of course):
> {noformat}
> CREATE TYPE user_type (
>   name text,
>   column_names list<text>,
>   column_types list<text>
> )
> CREATE TABLE keyspaces (
>   name text PRIMARY KEY,
>   durable_writes boolean,
>   replication map<string, string>,
>   user_types map<string, user_type>
> )
> CREATE TYPE trigger_definition (
>   name text,
>   options map<tex, text>
> )
> CREATE TABLE tables (
>   keyspace text,
>   name text,
>   id uuid,
>   table_type text, // COMPACT, CQL or SUPER
>   dropped_columns map<text, bigint>,
>   triggers map<text, trigger_definition>,
>   // options
>   comment text,
>   compaction map<text, text>,
>   compression map<text, text>,
>   read_repair_chance double,
>   dclocal_read_repair_chance double,
>   gc_grace_seconds int,
>   caching text,
>   rows_per_partition_to_cache text,
>   default_time_to_live int,
>   min_index_interval int,
>   max_index_interval int,
>   speculative_retry text,
>   populate_io_cache_on_flush boolean,
>   bloom_filter_fp_chance double
>   memtable_flush_period_in_ms int,
>   PRIMARY KEY (keyspace, name)
> )
> CREATE TYPE index_definition (
>   name text,
>   index_type text,
>   options map<text, text>
> )
> CREATE TABLE columns (
>   keyspace text,
>   table text,
>   name text,
>   kind text, // PARTITION_KEY, CLUSTERING_COLUMN, REGULAR or COMPACT_VALUE
>   component_index int;
>   type text,
>   indexes map<text, index_definition>,
>   PRIMARY KEY (keyspace, table, name)
> )
> {noformat}
> Nit: wouldn't hurt to create a simple enum that is reuse by both CFMetaData 
> and CFPropDefs for table options names while we're at it once they are the 
> same instead of repeating string constants which is fragile.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-6717) Modernize schema tables

Reply via email to