Re: Default record order...

DL Neil Wed, 06 Mar 2002 00:29:02 -0800

Richard,

> Thanks.  Interestingly enough, pulling records without a sort was
rather
> random (rather, influence by the last sort) - but once I placed a
primary
> key on the table, the order of unsorted records is always constant,
even
> after a sort (based on numerous tests yesterday and today with the
tables
> and a bunch of selects).


I'm a little surprised - and yet I'm not (which is a particularly
decisive statement!). Did you CREATE the table, add data, manipulate it,
and later impose the PK? When you queried/listed, was that the whole
table or less than (say) 25%?

If only a proportion of the table is SELECTed (WHERE clause) then an
index will be employed, but if it is an entire table list it is more
efficient to do a table scan. These would have the potential to yield
different sequences of rows from the table. I would do a LOT of testing
before relying on the PK-data sequence behavior mentioned above.

Let's take another trip down 'memory lane' and review our 'old'
terminology. A table consists of at least two files, the data and the
index (more in the manual). The data part is a SAM file (sequential
access method - remember that in this context "sequential" is talking
about a series of records not of key values) structure, ie the sequence
of rows is initially (only) chronological, new records are added to the
end, and deletions result in 'holes' being left mid-way. Periodically
one would do a 'reorg' (some call it 'compression') to recover the 'lost
space' and close up the holes. That process may also result in some
re-arrangement of the records, but regardless the idea even of
'chronology' is lost.

The second part(s) is the index(es). In the 'good old days' we talked
about ISAM (Indexed-sequential access method), where the data was kept
in a SAM format, but an 'index' structure was added to kept a note of
which key-values/records were in which 'block' of the file. INSERT still
took place at the end, but SELECT could pre-access the index and then
quickly locate a particular record from amongst the random-ness. In this
case a re-org could be much more sophisticated. You can still see the
'history' of ISAM in MySQL today - look at the
filenames/types/terminology in use in the docs. However I understand
that with various improvements they now use different algorithms, eg
b-tree structures for index organisation, etc - but that sounds like
'new tricks' so us 'old dogs' won't go there... NB: I have no idea if a
reorg by MySQL can be expected to inject some 'order' into the data as
well as 'balancing' the b-trees - the terminology of which puts me in
mind of bonsai, so goodness knows what it's doing whilst pruning a
little off the top, and squirrelling the nuts away for winter...

If the 'optimiser' decides that it will be quicker to deliver query
results using an index/PK, then you will see this influenced in the
sequence of rows in the resultset (AFAIK). If however the RDBMS figures
out that it will be faster to ignore the index and deliver from the data
'file' only, then who knows in what sequence the rows will be delivered!

SQL/relational algebra has no concept of 'sequence'. An RDBMS has no
requirement/responsibility to store data in a particular sequence. The
'set theory' says that a SELECT is required only to return a set of
rows. Thus the only way to impose your own or indeed today's view of
order on this (potential) chaos, is to use the ORDER BY clause!

Regards,
=dn



> > > > I guess your "problem" comes from the fact that SQL has no
concept
> > of
> > > > internal order. If you do not specify an ORDER BY clause, the
order
> > or
> > > > records returned is undefined, i.e. random.
> > >
> > > That's what I was figuring.  I asked because I wanted it
confirmed.
> > Thank
> > > you.
> > >
> > > > Of course, MySQL has some kind of internal order depending on
many
> > > > factors, but you may not rely on it. And neither you may rely on
the
> > > > fact that a PRIMARY KEY influences the internal order. You have
to
> > use
> > > > an ORDER BY clause if you want to get a sorted result (of
course,
> > you
> > > > want a key to speed up the ORDER BY clause).
> > >
> > > Hmmm.  Perhaps I'm misled by the default behaviour of other
database
> > engines,
> > > but I was taught that the primary key was stored in the database
to
> > optimize
> > > search/insert/delete - which meant *sorted*.  That is why you
don't
> > want a
> > > large (complex) primary key on tables that must run "fast" - the
> > overhead of
> > > sorting each insert/delete negatively affects performance. Or so I
was
> > taught,
> > > anyway (back in the dark ages - primative data structures and all
that
> > <G>).
> >
> > As others have said, this is not part of the relational model.
> > However you are correct - back in the 'good old days' we could rely
upon
> > hierarchical databases to do this, and I'm fairly sure that the
early
> > 'SQL' DBMSes also used to do this because they physically separated
the
> > Primary Key and the 'dependent part' of the row, so that any
'straight'
> > listing would come out in PK sequence.
> >
> > =dn
>
>


---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Default record order...

Reply via email to