[
https://issues.apache.org/jira/browse/IGNITE-21003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Roman Puchkovskiy updated IGNITE-21003:
---
Description:
This issue is to think about the problem and design it first, not about
implementing it right away.
Currently, for an RW transaction, we take schema at the beginning of the
transaction and run the whole transaction on that schema (indices are an
exception, but this is not visible to the end user). The transaction still
notices most schema changes (if a change happens before a read/write in the
transaction, the transaction gets aborted), but it does not notice a table
created after the transaction had started.
IGNITE-20107 addressed this issue, but it was decided that we need to design
this in more depth.
An alternative is to always use the latest schema on each operation (still
having schema validation). This might have some downsides/bring difficulties:
# Same query might return data with different schemas
# It's not clear against which schema to validate the current schema at
execution of each operation (probably, still against the initial one, defined
as Max(txStartTs, tableCreationTs))
# A query would have to use the same query timestamp on each node executing
its fragments, so it would have to propagate this timestamp
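For the second item, one possible reading of the baseline is sketched below. Plain long values stand in for hybrid timestamps, and the class and method names are hypothetical, not taken from the Ignite code base.

```java
/** Hypothetical helper; plain longs stand in for hybrid timestamps. */
final class BaselineSchemaTimestamp {
    private BaselineSchemaTimestamp() {
    }

    /**
     * Returns the timestamp whose schema version serves as the 'initial' one.
     * A table created after the transaction started has no schema at txStartTs,
     * so its creation timestamp is used instead.
     */
    static long baselineTs(long txStartTs, long tableCreationTs) {
        return Math.max(txStartTs, tableCreationTs);
    }
}
```

With this definition, a table created in the middle of a transaction is validated against its very first schema version.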
The schema synchronization design was created starting with an assumption that
the schema is taken for the start of a transaction, so the design should be
revised carefully when switching to the proposed one.
h2. Proposed changes
It seems that this can be achieved with the following:
# Always execute a KV/SQL operation/query using the current schema (obtained
using the schema sync procedure for now())
# An operation/query executed distributively (like SQL queries that produce a
few fragments executed on different nodes) must pass that query timestamp to
each of the nodes participating in its execution; they must use this timestamp
to get the 'current' schema
# In each read/write operation processed in a PartitionReplicaListener,
instead of failing the operation if the current schema is different from the
initial transaction schema, do the (already implemented) forward compatibility
check (so a few white-listed change types, like adding a nullable column, will
be allowed). This is optional, but the 'fail any read/write if the table
schema is changed in any way' rule was introduced to prevent a user from seeing
any effects of a DDL in the middle of the transaction (except for its abortion);
if we allow each query to see the new schema, this strict rule no longer makes
sense. (On commit, we still do the forward compatibility check.)
# To do the commit/read/write forward compatibility check, take the initial
table schema not at transaction start, but at the moment when the transaction
first enlisted the table. For this, we might need a mechanism to pass the
'tableId->enlistTs' map with each transactional operation/query (back and
forth), analogously to how it's done to maintain maxObservableTimestamp.
# Probably we should make column rename forward incompatible as the user will
now have to switch to the new name immediately.
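The last three items can be sketched roughly as follows. This is only an illustration under stated assumptions: TxSchemaTracker, SchemaChange and the white list are hypothetical names and values, not Ignite's actual classes or its actual set of allowed changes, and plain long values stand in for hybrid timestamps.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical change kinds; not Ignite's actual classification. */
enum SchemaChange { ADD_NULLABLE_COLUMN, ADD_NOT_NULL_COLUMN, DROP_COLUMN, RENAME_COLUMN }

/** Hypothetical per-transaction holder of the 'tableId->enlistTs' map. */
final class TxSchemaTracker {
    /** Assumed white list; RENAME_COLUMN is left out, following the last item above. */
    private static final Set<SchemaChange> FORWARD_COMPATIBLE =
            EnumSet.of(SchemaChange.ADD_NULLABLE_COLUMN);

    /** tableId -> timestamp at which the transaction first enlisted the table. */
    private final Map<Integer, Long> enlistTs = new HashMap<>();

    /** Records an enlistment and returns the earliest timestamp known for the table. */
    long enlist(int tableId, long operationTs) {
        return enlistTs.merge(tableId, operationTs, Long::min);
    }

    /** Merges a map that arrived with a request or response, keeping the earliest entries. */
    void mergeFrom(Map<Integer, Long> remote) {
        remote.forEach((tableId, ts) -> enlistTs.merge(tableId, ts, Long::min));
    }

    /** Copy of the map to attach to the next outgoing operation. */
    Map<Integer, Long> snapshot() {
        return new HashMap<>(enlistTs);
    }

    /** True if every change applied since the enlist-time schema is white-listed. */
    static boolean isForwardCompatible(List<SchemaChange> changesSinceEnlist) {
        return FORWARD_COMPATIBLE.containsAll(changesSinceEnlist);
    }
}
```

Keeping the minimum timestamp per table makes the merge independent of the order in which messages arrive, which is one way to mirror how maxObservableTimestamp is propagated (there with a maximum instead of a minimum).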
was:
This issue is to think about the problem and design it first, not about
implementing it right away.
Currently, for an RW transaction, we take schema at the beginning of the
transaction and run the whole transaction on that schema (indices are an
exception, but this is not visible to the end user). The transaction still
notices most schema changes (if a change happens before a read/write in the
transaction, the transaction gets aborted), but it does not notice a table
created after the transaction had started.
IGNITE-20107 addressed this issue, but it was decided that we need to design
this in more depth.
An alternative is to always use the latest schema on each operation (still
having schema validation). This might have some downsides/bring difficulties:
# Same query might return data with different schemas
# It's not clear against which schema to validate the current schema at
execution of each operation (probably, still against the initial one, defined
as Max(txStartTs, tableCreationTs))
# A query would have to use the same query timestamp on each node executing
its fragments, so it would have to propagate this timestamp
The schema synchronization design was created starting with an assumption that
the schema is taken for the start of a transaction, so the design should be
revised carefully when switching to the proposed one.
> Switch to the 'always use current schema' approach
> --
>
> Key: IGNITE-21003
> URL: https://issues.apache.org/jira/browse/IGNITE-21003
> Project: Ignite
> Issue Type: Improvement
> Reporter: Roman Puchkovskiy
> Priority: Major
>