[ 
https://issues.apache.org/jira/browse/DERBY-481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rick Hillegas updated DERBY-481:
--------------------------------

    Attachment: derby-481-04-aa-insert.diff

Attaching derby-481-04-aa-insert.diff. I am running regression tests now.

This patch wires in INSERT support for generated columns. I threaded my way 
through the INSERT machinery largely by following the way that CHECK 
constraints are handled.

Before this patch, the compiler built 2 significant methods for evaluating 
expressions:

1) A method which populates the base row from whatever data source is driving 
the INSERT. That data source could be, for instance, a list of literal values 
or a SELECT statement.

2) A method which runs the CHECK constraints.

My first attempt to support INSERT involved building the generation clauses 
into method (1). Unfortunately, that method is generated by the data sources, 
not by the driving INSERT node. I got this approach to work for the degenerate 
case of inserting a single literal value. But this approach failed when I tried 
to insert multiple literal values (where the data source is a UNION) and it 
failed when the data source was a SELECT. It became apparent that this approach 
would involve wiring code-generation logic into all implementations of 
ResultSet--there are quite a few. This began to look too complicated so I 
abandoned this approach.

The current patch represents a second attempt. Here the approach is to give the 
generation clauses their own method. Now the compiler builds 3 significant 
methods for evaluating expressions:

1') The original method which populates the base row from a data source (see 
above).

2') A new method which runs the generation clauses, looking for referenced 
columns in the row built by (1') and poking the generated values into that row.

3') The original method which runs the CHECK constraints (see above).

That was the tricky bit for compilation.
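
To make the ordering of the three methods concrete, here is a minimal Java sketch. It is not Derby code: the class name, the Object[] row model, the functional interfaces, and the example table with its CHECK constraint are all assumptions made for illustration.

```java
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical model, not Derby code: a row is an Object[] and each
// compiled method is represented by a functional interface.
class InsertPipelineSketch {

    // Runs the three steps in the order described above: (1'), (2'), (3').
    static Object[] insertOne(Supplier<Object[]> populateBaseRow,
                              Consumer<Object[]> runGenerationClauses,
                              Predicate<Object[]> runCheckConstraints) {
        Object[] row = populateBaseRow.get();   // (1') build the base row
        runGenerationClauses.accept(row);       // (2') poke in generated values
        if (!runCheckConstraints.test(row)) {   // (3') validate the finished row
            throw new IllegalStateException("CHECK constraint violated");
        }
        return row;
    }

    // Assumed table: T( refCol int, generatedCol int generated always as ( refCol * 2 ) )
    // with an assumed constraint: check ( generatedCol < 100 ).
    static Object[] insertIntoT(int refCol) {
        return insertOne(
            () -> new Object[] { refCol, null },    // generated slot left empty
            row -> row[1] = (Integer) row[0] * 2,   // reads refCol, fills generatedCol
            row -> (Integer) row[1] < 100);
    }

    public static void main(String[] args) {
        Object[] row = insertIntoT(21);
        System.out.println(row[0] + ", " + row[1]);
    }
}
```

The point of the sketch is only that (2') works on the row produced by (1') and finishes before (3') sees it, so the data source never needs to know about generation clauses.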

The tricky bit for execution was this: the base row has to be poked into the 
Activation so that it is visible to the generation clauses when (2') runs. A 
similar poking is done for CHECK constraints. If you examine this poking for 
CHECK constraints, you will notice that sometimes the poking is undone after 
the constraints run and sometimes we don't bother to undo the poking. I don't 
understand the difference between these code paths. As a result, I have 
defensively coded the new poking which we need for generated columns. I poke 
the base row into the Activation just before the generation clauses run. After 
the generation clauses run, I return the Activation to its previous state.
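
The defensive poke-and-restore pattern can be sketched roughly as follows. SketchActivation and GenerationClauseRunner are hypothetical stand-ins, not the Derby Activation interface or the code in this patch, and the use of try/finally is a choice made for the sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-in for an activation; not the Derby Activation interface.
class SketchActivation {
    private final Map<Integer, Object[]> currentRows = new HashMap<>();

    Object[] getCurrentRow(int resultSetNumber) {
        return currentRows.get(resultSetNumber);
    }

    void setCurrentRow(Object[] row, int resultSetNumber) {
        currentRows.put(resultSetNumber, row);
    }
}

class GenerationClauseRunner {
    // Pokes the base row into the activation, runs the generation clauses,
    // then restores whatever row was there before, even if a clause throws.
    static void evaluate(SketchActivation activation, int resultSetNumber,
                         Object[] baseRow,
                         Consumer<SketchActivation> generationClauses) {
        Object[] saved = activation.getCurrentRow(resultSetNumber);
        activation.setCurrentRow(baseRow, resultSetNumber);
        try {
            generationClauses.accept(activation);
        } finally {
            activation.setCurrentRow(saved, resultSetNumber);
        }
    }
}
```

Restoring the saved row unconditionally means the caller does not have to know which of the CHECK constraint code paths it is on.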

Here is a little more detail on the implementation:

A) At bind() time we do the following:

i) Prune out explicit mentions of generated columns. These can arise if the 
user sets a generated column to the literal DEFAULT--as allowed by the ANSI/ISO 
syntax. So for instance, the following is legal:

  insert into T( refCol, generatedCol ) values ( 1, default )

We prune out the explicitly added generated columns because, later on in the 
bind() phase, the insert list is expanded to include all columns with defaults 
(not just generated columns).

ii) When the insert list is expanded to include all defaulted columns, we add 
in the generated columns but we don't bind their expressions. This is because 
the generation clause may refer to other columns in the base row. This, in 
turn, creates an ordering problem. In addition, we don't yet have a result 
set number for the base row--we need that number in order to bind references to 
other columns which may appear in the generation clauses.

iii) Later on, just before we parse and bind the CHECK constraints, we parse 
and bind the generation clauses. At this point, we have enough context to bind 
the referenced columns.

B) At generate() time, we generate method (2') in between generating (1') and 
(3'). The generated (2') method is now one of the arguments to the factory 
method which creates the execution-driver, the InsertResultSet. This is just 
like what we do for CHECK constraints: the generated (3') method is also an 
argument to the instantiation of the InsertResultSet.

C) At execution time, we evaluate (2') just before we evaluate (3').



Touches the following files:

--------------------
-- BINDING
--------------------

M      java/engine/org/apache/derby/impl/sql/compile/ResultColumn.java

Adds a method so that a ResultColumn can report whether it represents a 
generated column. I also forced all overrides of the expression field to go 
through the setExpression() method. This, technically speaking, is not 
necessary--but it made debugging easier for me and I think it will be useful 
for other developers who need to debug this node.


M      java/engine/org/apache/derby/impl/sql/compile/DMLModStatementNode.java

Changes are made to support both binding and code-generation. These are the 
bind() changes:

i) Adds a method which raises an error if the user tries to override the value in a 
generated column with any value other than the DEFAULT literal. For instance, 
the following is illegal:

  insert into T( refCol, generatedCol ) values ( 1, 70 )

In addition, we remove explicit mentions of generated columns because we will 
add them back when we enhance the INSERT statement with defaulted columns.

ii) Adds logic to parse and bind generated columns. This is modelled on the 
logic which parses and binds CHECK constraints.

iii) Renames bindCheckConstraint() to bindRowScopedExpression() because this 
method is now shared by the logic which binds CHECK constraints and the logic 
which binds generation clauses.
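
As an illustration of change (i), here is a minimal Java sketch of the check-and-prune step. The class, the DEFAULT marker object, and the list-based model of the insert list are hypothetical. The real logic works on the compiler's ResultColumn nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical model of the insert list; not the Derby compiler classes.
class GeneratedColumnPruner {
    // Marker object standing in for the DEFAULT literal.
    static final Object DEFAULT = new Object();

    // Returns the insert-list column names that survive pruning. A real
    // implementation would drop the matching value expressions as well.
    static List<String> prune(List<String> columns, List<Object> values,
                              Set<String> generatedColumns) {
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < columns.size(); i++) {
            String name = columns.get(i);
            if (!generatedColumns.contains(name)) {
                kept.add(name);                    // ordinary column: keep it
            } else if (values.get(i) != DEFAULT) {
                throw new IllegalArgumentException(
                    "cannot override generated column " + name);
            }
            // generated column set to DEFAULT: dropped here, added back later
        }
        return kept;
    }
}
```

With this model, the legal statement above (generatedCol set to default) leaves only refCol in the list, while the illegal one (generatedCol set to 70) raises an error.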


M      java/engine/org/apache/derby/impl/sql/compile/ResultSetNode.java

Short-circuits the logic which enhances the base row with defaulted columns. 
Adds in the generated columns but does not add their generation clauses. This 
is because the clauses cannot be bound at the same time as the rest of the 
columns in the base row. We wait to bind them until the time that we bind CHECK 
constraints.


M      java/engine/org/apache/derby/impl/sql/compile/InsertNode.java

Wires binding and code-generation calls into bindStatement() and generate().


--------------------
-- CODE GENERATION
--------------------

M      java/engine/org/apache/derby/impl/sql/compile/ResultColumnList.java

Skips code-generation for generated columns when walking the base row. The 
generateCore() method generates (1'). We need to build the generation clauses 
into (2') instead and this is done later on.


M      java/engine/org/apache/derby/impl/sql/compile/DMLModStatementNode.java

In addition to the bind() changes described above, adds logic to generate the 
(2') method.


--------------------
-- EXECUTION
--------------------


M      java/engine/org/apache/derby/iapi/sql/execute/ResultSetFactory.java
M      java/engine/org/apache/derby/impl/sql/execute/GenericResultSetFactory.java

Adds (2') as an argument to the factory method which instantiates 
InsertResultSets.


M      java/engine/org/apache/derby/iapi/sql/Activation.java
M      java/engine/org/apache/derby/impl/sql/execute/BaseActivation.java
M      java/engine/org/apache/derby/impl/sql/GenericActivationHolder.java

Adds a method for retrieving the current row from the Activation. This allows 
us to return the Activation to its original state after we have run (2').

M      java/engine/org/apache/derby/impl/sql/execute/InsertResultSet.java
M      java/engine/org/apache/derby/impl/sql/execute/NoRowsResultSetImpl.java

Evaluates generation clauses close to where CHECK constraints are evaluated.


M      java/testing/org/apache/derbyTesting/functionTests/tests/lang/GeneratedColumnsTest.java

Uncomments basic INSERT tests.


> implement SQL generated columns
> -------------------------------
>
>                 Key: DERBY-481
>                 URL: https://issues.apache.org/jira/browse/DERBY-481
>             Project: Derby
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 10.0.2.1
>            Reporter: Rick Hillegas
>            Assignee: Rick Hillegas
>         Attachments: derby-481-00-aa-prototype.diff, 
> derby-481-01-aa-catalog.diff, derby-481-02-aa-utilities.diff, 
> derby-481-03-aa-grammar.diff, derby-481-04-aa-insert.diff, 
> GeneratedColumns.html
>
>
> Satheesh has pointed out that generated columns, a SQL 2003 feature, would 
> satisfy the performance requirements of Expression Indexes (bug 455). 
> Generated columns may not be as elegant as Expression Indexes, but they are 
> easier to implement. We would allow the following new kind of column 
> definition in CREATE TABLE and ALTER TABLE statements:
>     columnName GENERATED ALWAYS AS ( expression )
> If expression were an indexableExpression (as defined in bug 455), then we 
> could create indexes on it. There is no work for the optimizer to do here. 
> The Language merely has to compute the generated column at INSERT/UPDATE time.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
