It seems obvious at first glance that WITHOUT ROWID should require a
table to have a PRIMARY KEY, but on second glance I don't get it.  SQL
itself doesn't require tables to have primary keys, so why couldn't an
RDBMS have primary key-less tables?

Obviously a B-Tree-based table will need *some* key, but it won't need
that key to be UNIQUE.  That's the problem with PRIMARY KEY: it
implies a UNIQUE constraint.

Consider a column-oriented schema with a single table:

    CREATE TABLE attrs (id, attr_name, attr_val);

representing "objects" with various attributes some of which are
allowed to take on multiple values, even multiple duplicate values.
One might add a pair of covering indexes:

    CREATE INDEX forward ON attrs (id, attr_name, attr_val);
    CREATE INDEX backward ON attrs (attr_val, attr_name, id);

which obviate the need to ever use the actual table -- this results in
roughly 50% more work at INSERT/UPDATE/DELETE time and 50% more space
than the same schema without a ROWID.  Clearly there's an advantage in
declaring attrs as WITHOUT ROWID, but the price is accepting a UNIQUE
constraint on one of the two covering indexes.

If the unique constraint is acceptable then the above becomes:

    CREATE TABLE attrs (id, attr_name, attr_val, PRIMARY KEY (id,
attr_name, attr_val)) WITHOUT ROWID;
    CREATE INDEX backward ON attrs (attr_val, attr_name, id);

Thinking about how SQLite3 would implement WITHOUT ROWID, it seems
logical that a set of columns to be used as the primary indexed is
required, but that they be unique shouldn't be.

To answer myself, PRIMARY KEY almost certainly happens to be a
convenient syntactic source of columns that SQLite3 can use as a key
for a rowid-less table's b-tree primary index.  The implied UNIQUE
constraint is just an unfortunate side-effect.

I can live with this for now, but I can imagine that this will become
a real problem for someone eventually.  I can envision SQLite3
requiring an INDEX be declared before any INSERTs on a WITHOUT ROWID
table, taking the first index as the table's primary index -- that way
no new syntax would be required, but behavior would then be dependent
on index declaration order, something that's not obviously a good
thing, and quite possibly bad because it would seem non-deterministic.

Anyways, this is a curiosity for me for now.  Adjusting to be able to
use WITHOUT ROWID actually helped me find unnecessary duplicate rows
in the tables I wanted to not have rowids for -- arguably a bug in my
code that I've now fixed.

Cheers,

Nico
--
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to