[jira] [Updated] (IGNITE-18535) Define new classes for versioned tables/indexes schemas

2023-03-21 Thread Andrey Mashenkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-18535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Mashenkov updated IGNITE-18535:
--
Component/s: sql

> Define new classes for versioned tables/indexes schemas
> ---
>
> Key: IGNITE-18535
> URL: https://issues.apache.org/jira/browse/IGNITE-18535
> Project: Ignite
> Issue Type: Improvement
> Components: sql
> Reporter: Ivan Bessonov
> Assignee: Andrey Mashenkov
> Priority: Major
> Labels: ignite-3
>
> The current approach to schema management is faulty and can't support indexes. 
> On top of that, it doesn't allow us to truly have multi-versioned historical 
> data. Once a table is removed, it's removed for good, meaning that 
> "current" RO transactions will not be able to finish. This is not acceptable.
> h3. Schema definitions
> What we need to have is the following:
> {code:java}
> SchemaDefinitions = map {version -> SchemaDefinition}
> SchemaDefinition = {timestamp, set {TableDefinition}, set{IndexDefinition}}
> TableDefinition = {name, id, array[ColumnDefinition], ...}
> IndexDefinition = {name, id, tableId, state, array[IdxColumnDefinition], 
> ...}{code}
> The schema must be versioned, that's the first point. It's already 
> versioned in "main"; here I mean global versioning that ties everything to 
> transactions and to the management of SQL indexes.
> Each definition corresponds to a time period during which it represents the 
> "actual" state of things. It must be used for RO queries, for example. RW 
> transactions always use the LATEST schema.
> Now, the meaning of the defined values:
>  * version - a simple auto-incrementing integer value;
>  * "timestamp" - the schema is considered valid from this timestamp 
> until the timestamp of the "next" version (or "infinity" if the next version 
> doesn't exist yet);
>  * most table and index properties are self-explanatory;
>  * index state - RO or RW. We should differentiate indexes that are not 
> yet built from indexes that are fully available.
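
One way to read the validity rule above: if versions are keyed by the timestamp from which they are valid, the schema for an RO read at time t is the entry with the greatest start timestamp not exceeding t, while RW transactions take the last entry. The following is a minimal sketch under that assumption. `SchemaHistorySketch`, `register`, `versionAt` and `latestVersion` are hypothetical names, not Ignite API, and timestamps are plain `long` values for illustration only.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch, not Ignite code: maps "valid from" timestamps to schema versions. */
class SchemaHistorySketch {
    /** Key: timestamp from which a version is valid. Value: auto-incrementing version number. */
    private final TreeMap<Long, Integer> versionByStartTs = new TreeMap<>();

    private int lastVersion = 0;

    /** Registers the next schema version, valid from the given timestamp, and returns its number. */
    int register(long validFromTs) {
        if (!versionByStartTs.isEmpty() && validFromTs <= versionByStartTs.lastKey()) {
            throw new IllegalArgumentException("Timestamps must strictly increase");
        }
        versionByStartTs.put(validFromTs, ++lastVersion);
        return lastVersion;
    }

    /** Version for an RO read at ts: the latest version whose start timestamp is <= ts. */
    int versionAt(long ts) {
        Map.Entry<Long, Integer> entry = versionByStartTs.floorEntry(ts);
        if (entry == null) {
            throw new IllegalArgumentException("No schema version is valid at " + ts);
        }
        return entry.getValue();
    }

    /** Version for an RW transaction: always the latest registered one. */
    int latestVersion() {
        if (versionByStartTs.isEmpty()) {
            throw new IllegalStateException("No schema versions registered");
        }
        return versionByStartTs.lastEntry().getValue();
    }
}
```

The last registered version has no upper bound in this sketch, which plays the role of "infinity" in the description: any timestamp at or after its start resolves to it.
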
> Currently, it's not clear where to store this structure. The problem lies 
> in the realm of metadata synchronization, which is not yet designed. But 
> all nodes must eventually have an up-to-date state, and every 
> data/index update must be consistent with the schema version that 
> corresponds to the current operation's timestamp.
> There are two likely candidates - Meta-Storage or Configuration. We'll figure 
> it out later.
> h3. Serialization / storage
> It would be convenient to store only the oldest version plus the collection 
> of diffs. Every node would unpack that locally, but we would save a lot of 
> storage space in meta-storage when a user has a lot of tables/indexes.
> This approach would also be beneficial for another reason: we need to know 
> what has changed between versions. That may be hard to calculate if all we 
> have are the definitions themselves.
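
The "oldest version + diffs" idea can be sketched as follows. This is only an illustration under simplifying assumptions: table names stand in for full definitions, a diff records only added and dropped tables, and `SchemaDiffSketch`, `Diff`, `append`, `tablesAt` and `changesIn` are hypothetical names rather than anything from the Ignite code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch, not Ignite code: the oldest schema version plus a list of diffs. */
class SchemaDiffSketch {
    /** One diff between consecutive versions; table names stand in for full definitions. */
    static final class Diff {
        final Set<String> addedTables;
        final Set<String> droppedTables;

        Diff(Set<String> addedTables, Set<String> droppedTables) {
            this.addedTables = addedTables;
            this.droppedTables = droppedTables;
        }
    }

    private final int oldestVersion;
    private final Set<String> oldestTables;
    private final List<Diff> diffs = new ArrayList<>();

    SchemaDiffSketch(int oldestVersion, Set<String> oldestTables) {
        this.oldestVersion = oldestVersion;
        this.oldestTables = new TreeSet<>(oldestTables);
    }

    /** Appends the diff that produces the next version and returns that version number. */
    int append(Diff diff) {
        diffs.add(diff);
        return oldestVersion + diffs.size();
    }

    /** "Unpacks" a version locally by replaying diffs on top of the oldest stored version. */
    Set<String> tablesAt(int version) {
        int steps = version - oldestVersion;
        if (steps < 0 || steps > diffs.size()) {
            throw new IllegalArgumentException("Unknown version: " + version);
        }
        Set<String> tables = new TreeSet<>(oldestTables);
        for (Diff diff : diffs.subList(0, steps)) {
            tables.removeAll(diff.droppedTables);
            tables.addAll(diff.addedTables);
        }
        return tables;
    }

    /** What changed between version - 1 and version: read straight from the stored diff. */
    Diff changesIn(int version) {
        int index = version - oldestVersion - 1;
        if (index < 0 || index >= diffs.size()) {
            throw new IllegalArgumentException("No diff stored for version: " + version);
        }
        return diffs.get(index);
    }
}
```

With this layout, "what changed between versions" is a lookup rather than a comparison of two full definitions, at the cost of replaying diffs whenever a full version is needed.
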
> h3. General thoughts
> This may be a good place to start using integer tableId and indexId more 
> often. UUIDs are too much. What's good is that the "serializability" of 
> schemas gives us an easy way of generating integer ids, just like it's done 
> right now with configuration.
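
A minimal sketch of why serialized schema changes make integer ids easy, assuming every schema update goes through a single ordered path: a plain counter kept alongside the schema is enough, and any node replaying the same sequence of updates arrives at the same ids. `SchemaIdSketch` and `newId` are hypothetical names; how configuration actually generates ids is not shown here.

```java
/** Hypothetical sketch, not Ignite code: integer ids drawn from a counter kept with the schema. */
class SchemaIdSketch {
    /** Next id to hand out; would be persisted together with the schema version. */
    private int nextId;

    SchemaIdSketch(int lastUsedId) {
        this.nextId = lastUsedId + 1;
    }

    /** Must only be called while applying a schema change, i.e. from the single serialized update path. */
    int newId() {
        return nextId++;
    }
}
```
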



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (IGNITE-18535) Define new classes for versioned tables/indexes schemas

2023-05-17 Thread Sergey Chugunov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-18535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Chugunov updated IGNITE-18535:
-
Epic Link: IGNITE-19502  (was: IGNITE-17766)

> Define new classes for versioned tables/indexes schemas
> ---
>
> Key: IGNITE-18535
> URL: https://issues.apache.org/jira/browse/IGNITE-18535
> Project: Ignite
> Issue Type: Improvement
> Components: sql
> Reporter: Ivan Bessonov
> Assignee: Andrey Mashenkov
> Priority: Major
> Labels: ignite-3
> Fix For: 3.0.0-beta2
>
> Time Spent: 50m
> Remaining Estimate: 0h
>


