On 2015-08-21 11:23 PM, Will Parsons wrote:
> On 21 Aug 2015, R.Smith wrote:
>>
>> On 2015-08-21 04:47 AM, Will Parsons wrote:
>>> I'm working on a program that involves a catalogue of books.  Part of
>>> the database schema looks like this:
>>>
>>> create table Books(id integer primary key,
>>>                      title text collate nocase not null,
>>>                      author references Authors(id),
>>>                      ...
>>>                      unique(title, author));
>>>
>>> create table Authors(id integer primary key,
>>>                        name text unique not null check(name <> ''));
>>>
>>> The idea here is that the title+author of a book must be unique, but a
>>> book may not necessarily have an author associated with it.  But, the
>>> schema fragment as I have presented it does not disallow entering the
>>> same title with a null author multiple times, which is clearly
>>> undesirable.
>>>
>>> In thinking about how to avoid this, one possibility that has occurred
>>> to me is to add an entry to the Authors table with id=0, name=NULL, and
>>> to modify the Books table to be:
>>>
>>> create table Books(id integer primary key,
>>>                      title text collate nocase not null,
>>>                      author references Authors(id) not null,
>>>                      ...
>>>                      unique(title, author));
>>>
>>> With this, entries in the Books table that presently have the author
>>> field set to NUll would instead have author=0.
>>>
>>> What I would like to have is a trigger that when an attempt is made to
>>> enter a new record into the Books table with a NULL author field, is
>>> to force the author field to 0 instead.  I can't see how to do this
>>> with a "before insert" trigger, though.  Perhaps I'm approaching this
>>> the wrong way; in any case I'd appreciate advice.
>>>
>> Nothing wrong with your approach, simply declare the Author to be NOT
>> NULL and if it is possible to have non-Authors you could do what you
>> suggest, add a 0-Author to the parent table, or, you could use a partial
>> Index (available after SQLite 3.7 only I think) which would allow you to
>> have NULLS but still maintain a strict relation, something like this:
>>
>> create table Books(id integer primary key,
>>                      title text collate nocase not null,
>>                      author int not null references Authors(id),
>>                      ...
>>                      );
>>
>> create unique index uBookAuth on Books(title,author) where author is not 
>> null;
>>
>> create table Authors(id integer primary key,
>>                        name text unique not null check(name <> ''));
> I guess I don't understand how having a partial index where author is
> not null prevents adding two entries with the same title and a null
> author.  How is this different from what I have now with the
> "unique(title, author)" specification?
>

I think I may have misinterpreted slightly...

To clarify: are the book titles unique or are they not?

If they are Unique, i.e if no book title can ever appear twice, 
regardless who authored it, then simply declare a unique index on title.
If they are NOT Unique, i.e. if a title can appear more than once, then 
why is it not working for you?

Essentially, if I interpret correctly, you are asking the DB to NOT 
limit the number of same-titled books, except when you have supplied an 
author and the same author has already such a titled book, but then when 
you don't supply an author, it should know to now also limit the copies 
of  no-author books?

I assumed before that you only added NULL for author if you don't know 
the author yet (which would make sense and can later be updated) but 
then you can't force the unique constraint, there may be many books with 
coinciding titles and not-yet-known authors.

Reading again, I am now thinking that's not the case, you might add NULL 
authors to books which simply don't have authors (well, all books have 
authors, but the author might be unknown and accepted to be of unknown 
status for time to come), in which case, there might be many same-titled 
unknown-author books.

If this is the case and you still want to limit unknown author books to 
just 1 instance, I would suggest to use an explicit author name, maybe 
something like "(by Unknown)" which would be happily subdued by the 
Unique constraint if violated. Another reader suggested the empty 
string, which will also count as a valid author and do the limiting you 
require. Your own suggestion of using a zero ID in stead will also do 
the job just fine. In fact, this is the only reason we'd ever put NULL 
into a field such as this, precisely so that the Unique constraint is 
not forced, because for any valid value, it will be.

A note on NULL and why: NULL + NULL != 2(NULL).

NULL is not a valid value, in fact it isn't a value at all, it is not 
meant to appear in anything data related barring to indicate a fault or 
voidness. You can't have operations on non-values. If you want a 
constraint or perhaps another sort of function to work on some data, 
make sure it is actual data and not NULL.

To put this philosophically: "Infinity" is a much more precise value 
than "NULL".

This also means that NULL is not equal to NULL, it equals nothing 
because it cannot be compared, even to other NULLs - In DB theory 
anyway[1]. Sadly many DB engines have kind of circled around this 
problem a bit by allowing NULL to do all kinds of things it shouldn't. 
I'm happy though that for UNIQUE constraints in some DBs (and SQLite at 
least) a NULL is not equal to any other value, including other NULLs.



Apologies for the ramble, hope some of it helps.
Ryan


[1] - I don't actually see the reason for 3-value logic, NULL is a 
construct with origins in DB engine design, not relational theory. It's 
useful to have a word to indicate "emptiness" or more mathematically 
correct: "The empty set", but beyond that, any operational use of it is 
questionable.


Reply via email to