[ https://issues.apache.org/jira/browse/DERBY-481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rick Hillegas updated DERBY-481:
--------------------------------
Attachment: derby-481-04-aa-insert.diff
Attaching derby-481-04-aa-insert.diff. I am running regression tests now.
This patch wires in INSERT support for generated columns. I threaded my way
through the INSERT machinery largely by following the way that CHECK
constraints are handled.
Before this patch, the compiler built 2 significant methods for evaluating
expressions:
1) A method which populates the base row from whatever data source is driving
the INSERT. That data source could be, for instance, a list of literal values
or a SELECT statement.
2) A method which runs the CHECK constraints.
My first attempt to support INSERT involved building the generation clauses
into method (1). Unfortunately, that method is generated by the data sources,
not by the driving INSERT node. I got this approach to work for the degenerate
case of inserting a single literal value. But this approach failed when I tried
to insert multiple literal values (where the data source is a UNION) and it
failed when the data source was a SELECT. It became apparent that this approach
would involve wiring code-generation logic into all implementations of
ResultSet--there are quite a few. This began to look too complicated so I
abandoned this approach.
The current patch represents a second attempt. Here the approach is to give the
generation clauses their own method. Now the compiler builds 3 significant
methods for evaluating expressions:
1') The original method which populates the base row from a data source (see
above).
2') A new method which runs the generation clauses, looking for referenced
columns in the row built by (1') and poking the generated values into that row.
3') The original method which runs the CHECK constraints (see above).
That was the tricky bit for compilation.
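To make the ordering of the three methods concrete, here is a minimal sketch in plain Java. It is not the code in the patch: the class InsertPipelineSketch, the method insertOneRow(), and the use of java.util.function types to stand in for the compiled methods are all assumptions made for illustration. Rows are modelled as Object[] rather than Derby's row classes.

```java
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical illustration only; none of these names come from Derby.
public class InsertPipelineSketch {

    /**
     * Builds one row by running the three stages in order:
     * (1') populate the base row, (2') fill in generated values,
     * (3') run the CHECK constraints.
     */
    public static Object[] insertOneRow(
            Supplier<Object[]> populateBaseRow,
            Consumer<Object[]> runGenerationClauses,
            Predicate<Object[]> runCheckConstraints) {
        Object[] row = populateBaseRow.get();      // (1')
        runGenerationClauses.accept(row);          // (2') pokes values into the row
        if (!runCheckConstraints.test(row)) {      // (3') sees the generated values
            throw new IllegalStateException("CHECK constraint violated");
        }
        return row;
    }

    public static void main(String[] args) {
        // Column 0 is refCol, column 1 is generatedCol = refCol + 10.
        Object[] row = insertOneRow(
                () -> new Object[] { 1, null },
                r -> r[1] = ((Integer) r[0]) + 10,
                r -> ((Integer) r[1]) > 0);
        System.out.println(row[1]);
    }
}
```

Because (2') runs on the row produced by (1'), the data source does not need to know anything about generation clauses, which is the point of the second approach.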
The tricky bit for execution was this: the base row has to be poked into the
Activation so that it is visible to the generation clauses when (2') runs. A
similar poking is done for CHECK constraints. If you examine this poking for
CHECK constraints, you will notice that sometimes the poking is undone after
the constraints run and sometimes we don't bother to undo the poking. I don't
understand the difference between these code paths. As a result, I have
defensively coded the new poking which we need for generated columns. I poke
the base row into the Activation just before the generation clauses run. After
the generation clauses run, I return the Activation to its previous state.
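The defensive save/poke/restore pattern described above can be sketched as follows. This is a toy model, not Derby's Activation interface: ToyActivation, getCurrentRow(), setCurrentRow(), and runGenerationClauses() are hypothetical names, and the use of try/finally to guarantee the restore is my assumption about one reasonable way to code it.

```java
import java.util.function.Consumer;

// Hypothetical stand-ins; these are not Derby's real classes or signatures.
public class PokeRestoreSketch {

    /** Toy activation holding a single "current row" slot. */
    public static final class ToyActivation {
        private Object[] currentRow;

        public Object[] getCurrentRow() { return currentRow; }

        public void setCurrentRow(Object[] row) { currentRow = row; }
    }

    /**
     * Makes baseRow visible to the generation method, runs it, and then
     * puts back whatever row the activation held before.
     */
    public static void runGenerationClauses(
            ToyActivation activation,
            Object[] baseRow,
            Consumer<ToyActivation> generationMethod) {
        Object[] saved = activation.getCurrentRow();  // remember previous state
        activation.setCurrentRow(baseRow);            // poke the base row in
        try {
            generationMethod.accept(activation);      // corresponds to (2')
        } finally {
            activation.setCurrentRow(saved);          // restore, even on error
        }
    }

    public static void main(String[] args) {
        ToyActivation activation = new ToyActivation();
        Object[] previous = { "previous" };
        activation.setCurrentRow(previous);

        Object[] baseRow = { 3, null };
        // Generation clause for column 1: refCol * 2.
        runGenerationClauses(activation, baseRow, a -> {
            Object[] row = a.getCurrentRow();
            row[1] = ((Integer) row[0]) * 2;
        });

        System.out.println(baseRow[1]);
        System.out.println(activation.getCurrentRow() == previous);
    }
}
```

Restoring unconditionally avoids having to know which of the CHECK-constraint code paths expect the old row to still be in place.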
Here is a little more detail on the implementation:
A) At bind() time we do the following:
i) Prune out explicit mentions of generated columns. These can arise if the
user sets a generated column to the literal DEFAULT--as allowed by the ANSI/ISO
syntax. So for instance, the following is legal:
insert into T( refCol, generatedCol ) values ( 1, default )
We prune out the explicitly added generated columns because, later on in the
bind() phase, the insert list is expanded to include all columns with defaults
(not just generated columns).
ii) When the insert list is expanded to include all defaulted columns, we add
in the generated columns but we don't bind their expressions. This is because
the generation clause may refer to other columns in the base row. This, in
turn, creates an ordering problem. In addition, we don't yet have a result
set number for the base row--we need that number in order to bind references to
other columns which may appear in the generation clauses.
iii) Later on, just before we parse and bind the CHECK constraints, we parse
and bind the generation clauses. At this point, we have enough context to bind
the referenced columns.
B) At generate() time, we generate method (2') in between generating (1') and
(3'). The generated (2') method is now one of the arguments to the factory
method which creates the execution-driver, the InsertResultSet. This is just
like what we do for CHECK constraints: the generated (3') method is also an
argument to the instantiation of the InsertResultSet.
C) At execution time, we evaluate (2') just before we evaluate (3').
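As an illustration of step (A)(i), here is a rough sketch of the pruning rule for explicitly mentioned generated columns. It is a simplified model and not the code in DMLModStatementNode: the class GeneratedColumnBindSketch, the Target holder, the DEFAULT marker object, and the exception type are all invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Derby.
public class GeneratedColumnBindSketch {

    /** Marker standing in for the DEFAULT literal in a VALUES list. */
    public static final Object DEFAULT = new Object();

    /** One explicitly named target column and the value supplied for it. */
    public static final class Target {
        public final String column;
        public final Object value;

        public Target(String column, Object value) {
            this.column = column;
            this.value = value;
        }
    }

    /**
     * Drops generated columns that were set to DEFAULT and rejects any
     * other explicit value for a generated column.
     */
    public static List<Target> pruneGeneratedColumns(
            List<Target> targets, Set<String> generatedColumns) {
        List<Target> kept = new ArrayList<>();
        for (Target t : targets) {
            if (!generatedColumns.contains(t.column)) {
                kept.add(t);                 // ordinary column: keep as is
            } else if (t.value != DEFAULT) {
                throw new IllegalArgumentException(
                        "Cannot override generated column: " + t.column);
            }
            // else: generated column set to DEFAULT. Drop it here; it is
            // added back when the insert list is expanded with defaults.
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> generated = Collections.singleton("generatedCol");
        // insert into T( refCol, generatedCol ) values ( 1, default )
        List<Target> kept = pruneGeneratedColumns(
                Arrays.asList(new Target("refCol", 1),
                              new Target("generatedCol", DEFAULT)),
                generated);
        System.out.println(kept.size() + " column(s) kept: " + kept.get(0).column);
    }
}
```

Passing a value such as 70 for generatedCol instead of DEFAULT makes pruneGeneratedColumns() throw, which mirrors the illegal INSERT case described in this message.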
Touches the following files:
--------------------
-- BINDING
--------------------
M java/engine/org/apache/derby/impl/sql/compile/ResultColumn.java
Adds a method so that a ResultColumn can report whether it represents a
generated column. I also forced all overrides of the expression field to go
through the setExpression() method. This, technically speaking, is not
necessary--but it made debugging easier for me and I think it will be useful
for other developers who need to debug this node.
M java/engine/org/apache/derby/impl/sql/compile/DMLModStatementNode.java
Changes are made to support both binding and code-generation. These are the
bind() changes:
i) Adds a method which raises an error if the user tries to override the value in a
generated column with any value other than the DEFAULT literal. For instance,
the following is illegal:
insert into T( refCol, generatedCol ) values ( 1, 70 )
In addition, we remove explicit mentions of generated columns because we will
add them back when we enhance the INSERT statement with defaulted columns.
ii) Adds logic to parse and bind generated columns. This is modelled on the
logic which parses and binds CHECK constraints.
iii) Renames bindCheckConstraint() to bindRowScopedExpression() because this
method is now shared by the logic which binds CHECK constraints and the logic
which binds generation clauses.
M java/engine/org/apache/derby/impl/sql/compile/ResultSetNode.java
Short-circuits the logic which enhances the base row with defaulted columns.
Adds in the generated columns but does not add their generation clauses. This
is because the clauses cannot be bound at the same time as the rest of the
columns in the base row. We wait to bind them until the time that we bind CHECK
constraints.
M java/engine/org/apache/derby/impl/sql/compile/InsertNode.java
Wires binding and code-generation calls into bindStatement() and generate().
--------------------
-- CODE GENERATION
--------------------
M java/engine/org/apache/derby/impl/sql/compile/ResultColumnList.java
Skips code-generation for generated columns when walking the base row. The
generateCore() method generates (1'). We need to build the generation clauses
into (2') instead and this is done later on.
M java/engine/org/apache/derby/impl/sql/compile/DMLModStatementNode.java
In addition to the bind() changes described above, adds logic to generate the
(2') method.
--------------------
-- EXECUTION
--------------------
M java/engine/org/apache/derby/iapi/sql/execute/ResultSetFactory.java
M java/engine/org/apache/derby/impl/sql/execute/GenericResultSetFactory.java
Adds (2') as an argument to the factory method which instantiates
InsertResultSets.
M java/engine/org/apache/derby/iapi/sql/Activation.java
M java/engine/org/apache/derby/impl/sql/execute/BaseActivation.java
M java/engine/org/apache/derby/impl/sql/GenericActivationHolder.java
Adds a method for retrieving the current row from the Activation. This allows
us to return the Activation to its original state after we have run (2').
M java/engine/org/apache/derby/impl/sql/execute/InsertResultSet.java
M java/engine/org/apache/derby/impl/sql/execute/NoRowsResultSetImpl.java
Evaluates generation clauses close to where CHECK constraints are evaluated.
M java/testing/org/apache/derbyTesting/functionTests/tests/lang/GeneratedColumnsTest.java
Uncomments basic INSERT tests.
> implement SQL generated columns
> -------------------------------
>
> Key: DERBY-481
> URL: https://issues.apache.org/jira/browse/DERBY-481
> Project: Derby
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 10.0.2.1
> Reporter: Rick Hillegas
> Assignee: Rick Hillegas
> Attachments: derby-481-00-aa-prototype.diff,
> derby-481-01-aa-catalog.diff, derby-481-02-aa-utilities.diff,
> derby-481-03-aa-grammar.diff, derby-481-04-aa-insert.diff,
> GeneratedColumns.html
>
>
> Satheesh has pointed out that generated columns, a SQL 2003 feature, would
> satisfy the performance requirements of Expression Indexes (bug 455).
> Generated columns may not be as elegant as Expression Indexes, but they are
> easier to implement. We would allow the following new kind of column
> definition in CREATE TABLE and ALTER TABLE statements:
> columnName GENERATED ALWAYS AS ( expression )
> If expression were an indexableExpression (as defined in bug 455), then we
> could create indexes on it. There is no work for the optimizer to do here.
> The Language merely has to compute the generated column at INSERT/UPDATE time.