I looked at the SQL 2000 spec to see what it says about generated keys. Below are some lines copied from the Identity columns section and from the Sequence generators section. Value generation for an identity column follows the rules of the sequence generator, and in the Sequence generators section the spec says that value generation is done in a transaction of its own and is not associated with the outside user transaction.
4.14.4 Identity columns
An identity column is associated with an internal sequence generator SG. Let IC be the identity column of BT. When a row R is presented for insertion into BT, if R does not contain a column corresponding to IC, then the value V for IC in the row inserted into BT is obtained by applying the General Rules of Subclause 9.21, "Generation of the next value of a sequence generator", to SG. The definition of an identity column may specify GENERATED ALWAYS or GENERATED BY DEFAULT.
4.21 Sequence generators
Changes to the current base value of a sequence generator are not controlled by SQL-transactions; therefore, commits and rollbacks of SQL-transactions have no effect on the current base value of a sequence generator.
Based on this, I believe that Satheesh's initial approach to solving this bug is correct: if the generated value causes a duplicate key failure for the insert, then that generated value should be considered consumed by the engine, and the next attempt of the same insert should use the next generated value. An example to illustrate the SQL spec behavior:
ij(CONNECTION0)> create table tauto(i int generated by default as identity, k int);
0 rows inserted/updated/deleted
ij(CONNECTION0)> create unique index tautoInd on tauto(i);
0 rows inserted/updated/deleted
ij(CONNECTION0)> insert into tauto(k) values 1,2;
2 rows inserted/updated/deleted
ij(CONNECTION0)> select * from tauto;
I |K
-----------------------
1 |1
2 |2
2 rows selected
ij(CONNECTION0)> insert into tauto values (4,4);
1 row inserted/updated/deleted
ij(CONNECTION0)> insert into tauto(k) values 3;
1 row inserted/updated/deleted
ij(CONNECTION0)> insert into tauto(k) values 4; <=== Expected error. AND DERBY SHOULD CONSUME 4
ERROR 23505: The statement was aborted because it would have caused a duplicate
key value in a unique or primary key constraint or unique index identified by 'T
AUTOIND' defined on 'TAUTO'.
ij(CONNECTION0)> insert into tauto(k) values 5; <=== DERBY SHOULD GENERATE 5 AT THIS POINT
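To spell out the transactional part of the spec text above, here is a sketch of how the session could continue under the proposed behavior (this is not actual Derby output, and it assumes the last insert above succeeded with i = 5):

autocommit off;
insert into tauto(k) values 6;  -- generator hands out i = 6
rollback;                       -- the row is rolled back, but 6 stays consumed
insert into tauto(k) values 7;  -- generator hands out i = 7, not 6 again
commit;

Per the sequence generator rule quoted above, the rollback of the user transaction does not give the value 6 back to the generator; consuming the value on a duplicate key failure is the same principle applied to a failed statement instead of a rolled-back transaction.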
As for Dan's concern, "If I pre-loaded a table with 100,000 rows with consecutive values, then it's going to require 100,000 subsequent failing inserts before one succeeds", I think Derby should add the ability to set the identity start value through ALTER TABLE, which could be used after a bulk load, for example.
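After a bulk load the user could then move the generator past the loaded range with a single statement. The RESTART WITH syntax below is only a sketch of what such an ALTER TABLE could look like; it is not something Derby supports today:

-- hypothetical syntax for resetting the identity start value after a load
alter table tauto alter column i restart with 100001;

With something along these lines, Dan's 100,000 pre-loaded rows would cost one ALTER TABLE instead of 100,000 failing inserts.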
If anyone has objections or opinions on this, please send in your comments. In the meantime, I will start looking at what it will take to have Derby consume the value that it generates irrespective of whether the outer user transaction fails or succeeds.
thanks,
Mamta
On 12/15/05, Bryan Pendleton <[EMAIL PROTECTED]> wrote:
>>I would be happy if Derby can consume identity values even if the current
>>insert statement fails. In that case, some insert statements may fail
>>when they generate a value that is already present. But subsequent
>>inserts should pass.
>
> I wonder if the counter should be bumped to the max value on a failed
> insert; there is only a problem if a unique index exists, so getting the
> max will be fast.
I'm not sure why the user would want *any* inserts to fail. If the
database can figure out the current max, and generate a new value which
is beyond that, thus causing the insert to succeed, then that seems like
the best outcome.
thanks,
bryan
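To make the "figure out the current max" idea a bit more concrete: on a duplicate key failure the engine would effectively need the answer to a query like the one below, and with the unique index tautoInd in place that answer can come from the index rather than a table scan (how the engine would obtain and apply the value internally is left open here):

-- the value the generator would be bumped past after a duplicate key failure
select max(i) from tauto;
-- the next generated value would then be max(i) + 1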
