[firebird-support] Today's performance question - index direction

2013-09-04 Thread Tim Ward
I've noted that the documentation says that whether your index is ASC or 
DESC matters, but it's not clear to me either why it should or exactly 
what the implications are.

Boiled down[#], I've got a table MYTABLE with an integer column MYCOLUMN 
with an index on it, and I'm looking at queries like

(a) select COUNT(*) from MYTABLE where MYCOLUMN  2
(b) select COUNT(*) from MYTABLE where MYCOLUMN  2

The vast majority of rows in the table have MYCOLUMN = 2, so I'm looking 
at these queries to return very small numbers very quickly. (MYCOLUMN = 
2 means this object is OK, and I'm looking to find the objects that 
are not OK.)

If the index is ASC then

(a) is fast
(b) is very slow (although it claims to be using the index, according to 
the plan, it's doing as much work as a table scan, according to the stats)

If the index is DESC then

(a) is a bit slower but still fast
(b) is fast.

Questions:

(1) So it seems that if I want to do a  comparison in the WHERE 
clause I'm better off with a DESC index, but if I'm wanting to do a  
comparison there isn't the same need to use an ASC index?

(2) How does this indexing work anyway, that makes it sensitive to 
direction? - from other databases I'm sort-of vaguely used to the idea 
(without having got into the source code) that an index is a tree 
structure, and having walked down it to a particular node there's no 
different cost to going left (DESC) or right (ASC) from there?

[#] The real cases are rather more complicated. I haven't actually 
re-run the experiments on this cut-down scenario so don't know for sure 
that it will behave in the same way. In particular the real index is on 
multiple columns, of which the last is the interesting one. Yes I know 
that multiple column indices aren't exactly encouraged in Firebird; at 
this stage I'm doing experiments, not writing production code.

-- 
Tim Ward



RE: [firebird-support] Today's performance question - index direction

2013-09-04 Thread Leyne, Sean
Time

 Boiled down[#], I've got a table MYTABLE with an integer column
 MYCOLUMN with an index on it, and I'm looking at queries like
 
 (a) select COUNT(*) from MYTABLE where MYCOLUMN  2
 (b) select COUNT(*) from MYTABLE where MYCOLUMN  2
 
 The vast majority of rows in the table have MYCOLUMN = 2, so I'm looking at
 these queries to return very small numbers very quickly. (MYCOLUMN =
 2 means this object is OK, and I'm looking to find the objects that are not
 OK.)
 
 If the index is ASC then
 
 (a) is fast
 (b) is very slow (although it claims to be using the index, according to the
 plan, it's doing as much work as a table scan, according to the stats)
 
 If the index is DESC then
 
 (a) is a bit slower but still fast
 (b) is fast.
 
 Questions:
 
 (1) So it seems that if I want to do a  comparison in the WHERE clause I'm
 better off with a DESC index, but if I'm wanting to do a 
 comparison there isn't the same need to use an ASC index?

Correct!


 (2) How does this indexing work anyway, that makes it sensitive to direction?
 - from other databases I'm sort-of vaguely used to the idea (without having
 got into the source code) that an index is a tree structure, and having walked
 down it to a particular node there's no different cost to going left (DESC) or
 right (ASC) from there?

Firebird is only able to walk indexes in one direction, due to prefix 
compression within the index structures, which is why ASC/DESC matters.


 [#] The real cases are rather more complicated. I haven't actually re-run the
 experiments on this cut-down scenario so don't know for sure that it will
 behave in the same way. In particular the real index is on multiple columns,
 of which the last is the interesting one. Yes I know that multiple column
 indices aren't exactly encouraged in Firebird...

That is not completely true.

There are plenty of cases where multi-column indexes are an excellent solution, 
much better than depending on the use of separate single column indexes.  It 
really depends on your data usage/access modalities.


Sean


Re: [firebird-support] Today's performance question - index direction

2013-09-04 Thread Tim Ward

On 04/09/2013 16:54, Leyne, Sean wrote:


due to prefix compression within the index structures

Ah, yes, I did write one of those once - a 1980s spelling dictionary for 
a word processor, where being able to encode an entire word in the 
smallest possible number of five-bit codes was *much* more important 
than being able to read the list backwards. (I think we ended up with 
less than 3 bytes per word for an English dictionary?)


I'm slightly surprised that it's thought to matter with this century's 
disk prices however.


--
Tim Ward



Re: [firebird-support] Today's performance question - index direction

2013-09-04 Thread Alexandre Benson Smith

Em 4/9/2013 13:02, Tim Ward escreveu:



Ah, yes, I did write one of those once - a 1980s spelling dictionary 
for a word processor, where being able to encode an entire word in the 
smallest possible number of five-bit codes was *much* more important 
than being able to read the list backwards. (I think we ended up with 
less than 3 bytes per word for an English dictionary?)


I'm slightly surprised that it's thought to matter with this century's 
disk prices however.

--
Tim Ward




AFAIK it's not about disk price, but to make it more dense, so more 
index entries could fit in a single page, leading to fewer page reads 
and so improving speed.


see you !


Re: [firebird-support] Today's performance question - index direction

2013-09-04 Thread Svein Erling Tysvær
I've noted that the documentation says that whether your index is ASC or
DESC matters, but it's not clear to me either why it should or exactly
what the implications are.

Boiled down[#], I've got a table MYTABLE with an integer column MYCOLUMN
with an index on it, and I'm looking at queries like

(a) select COUNT(*) from MYTABLE where MYCOLUMN  2
(b) select COUNT(*) from MYTABLE where MYCOLUMN  2

The vast majority of rows in the table have MYCOLUMN = 2, so I'm looking
at these queries to return very small numbers very quickly. (MYCOLUMN =
2 means this object is OK, and I'm looking to find the objects that
are not OK.)

If the index is ASC then

(a) is fast
(b) is very slow (although it claims to be using the index, according to
the plan, it's doing as much work as a table scan, according to the stats)

Decided that a value in the middle should mean OK, is - well - not the greatest 
idea. Probably you also started out with a limited number of options and then 
adding other values to have other meanings made sense once upon a time. I would 
recommend to change OK from being 2 to being something like  (a higher 
value than you expect to need for other purposes). If that's not easy, I would 
consider adding the column MyColumnNegative, populated through a trigger and 
set to the negative equivalent of MyColumn. Then a) would still be fast, and if 
you changed b) to

(b) select COUNT(*) from MYTABLE where MYCOLUMNNEGATIVE  -2

then that should also be fast.

Other options that you could consider at least include using computed indexes 
(though I've never tried them) and adding a triggerpopulated new table(s) - 
either containing references to records that are not OK or a table containing 
the sums (when records are deleted, you insert a row containing -1, when 
records are inserted, you insert a row containing +1, when a row is modified, 
you may end up inserting one row with +1 and another with -1, occationally you 
sum data and delete old sums).

All in all, you have quite a few options you can consider to circumvent your 
problem while awaiting Firebird 3 with histograms (haven't checked alpha 1 yet, 
so I'm not certain they are included).

HTH,
Set

RE: [firebird-support] Today's performance question - index direction

2013-09-04 Thread Leyne, Sean
Tim,

  due to prefix compression within the index structures
 Ah, yes, I did write one of those once - a 1980s spelling dictionary for a 
 word
 processor, where being able to encode an entire word in the smallest
 possible number of five-bit codes was *much* more important than being
 able to read the list backwards. (I think we ended up with less than 3 bytes
 per word for an English dictionary?)
 
 I'm slightly surprised that it's thought to matter with this century's disk 
 prices
 however.

There are more costs that just disk space usage which need to be considered:
  - Data Transfer speed from disk to RAM
  - Data transfer from RAM to CPU
  - RAM consumption of cache
  - Loss of cache efficiency due to higher number of pages used by index 
structures

I do agree though, that if everyone was using SSDs for storage, the 
weight/strength of the above would be much weaker.


Sean