[ 
https://issues.apache.org/jira/browse/DERBY-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Matrigali updated DERBY-2212:
----------------------------------


I continue to think that it is a bad idea to introduce duplicate values to the 
current btree unique index (and sorting paradigm).  I guess you are building a 
"mostly" uniquei index.  I understand that SQL wants to treat the nulls 
differently but the basic assumptions of a lot of the structure of the unique 
btree (splitting, searching, logical recovery, branch node keys, 
serializable locking, ....) all are based on actual lower level physical nature 
of the keys.  

For instance while the SQL layer may never do a search where key=null, the 
logical recovery code may execute a search on the unique index using the key 
expecting null to
match null.   So if index
is a single column unique index, with say 1,000,000 null values we may issue a 
search
in the unique index for "null" assuming we will find the single row because it 
is a unique
index.  I believe the current search algorithm is optimized for unique indexes 
to not include
the row location column in the search comparisons.

 Introducing duplicate nulls to unique indexes I believe will unnecessarily 
complicate the code and likely introduce bugs which will be hard to identify in 
the future.

Safest would be to use existing non-unique indexes with extra work by language 
layer to 
verify uniqueness at insert time (indexes don't support updates so they default 
to a delete and
insert -- thus just one path to code/check).  It may be worth fiddling with the 
optimizer to recognize that this index is a "mostly unique non-unique index" so 
that it can get estimates better than a regular non-unique index.  

Adding support to the btree itself to perform the check at insert time is more 
efficient at the cost of complicating the btree code, but I think could be 
isolated to a single compare call at point just before value is about to be 
inserted into tree.  

I know it is enticing to use the unique tree, but just sometimes treat nulls as 
not equal and sometimes as equal - but i don't think that is the right approach.

> Add "Unique where not null" to create index
> -------------------------------------------
>
>                 Key: DERBY-2212
>                 URL: https://issues.apache.org/jira/browse/DERBY-2212
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.2.1.6
>            Reporter: Oleksandr Alesinskyy
>            Assignee: Anurag Shekhar
>         Attachments: derby-2212preview.diff, derby-2212preview2.diff
>
>
> Derby prohibits creation of unique constraints on nullable colums (as well if 
> only some columns in the constraint list are nullable) and treat nulls in 
> unique indexes as normal values (i.e. only one row with null values in 
> indexed columns may be inserted into the table). This bahavior is very 
> restrictive, does not completely comply with SQL standards (both letter and 
> intent) as well as with business needs and intending meaning of NULL values 
> (2 null values are not considered as equal, this comparision shall return 
> NULL, and for selection criteria boolean null is treated as FALSE).
> This behavior, as far as I can see, is modelled after DB2 (and differs from 
> behavior of most other major databases, like SyBase, Oracle, etc.).
> But even DB2 provide some means to alleviate these restrictions, namely 
> "UNIQUE WHERE NOT NULL" clause for CREATE INDEX statement.
> It will be very good if such "UNIQUE WHERE NOT NULL" clause will be 
> introduced in Derby.
> Regards,
> Oleksandr Alesinskyy

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to