[firebird-support] Today's performance question - index direction
I've noted that the documentation says that whether your index is ASC or DESC matters, but it's not clear to me either why it should or exactly what the implications are. Boiled down[#], I've got a table MYTABLE with an integer column MYCOLUMN with an index on it, and I'm looking at queries like (a) select COUNT(*) from MYTABLE where MYCOLUMN 2 (b) select COUNT(*) from MYTABLE where MYCOLUMN 2 The vast majority of rows in the table have MYCOLUMN = 2, so I'm looking at these queries to return very small numbers very quickly. (MYCOLUMN = 2 means this object is OK, and I'm looking to find the objects that are not OK.) If the index is ASC then (a) is fast (b) is very slow (although it claims to be using the index, according to the plan, it's doing as much work as a table scan, according to the stats) If the index is DESC then (a) is a bit slower but still fast (b) is fast. Questions: (1) So it seems that if I want to do a comparison in the WHERE clause I'm better off with a DESC index, but if I'm wanting to do a comparison there isn't the same need to use an ASC index? (2) How does this indexing work anyway, that makes it sensitive to direction? - from other databases I'm sort-of vaguely used to the idea (without having got into the source code) that an index is a tree structure, and having walked down it to a particular node there's no different cost to going left (DESC) or right (ASC) from there? [#] The real cases are rather more complicated. I haven't actually re-run the experiments on this cut-down scenario so don't know for sure that it will behave in the same way. In particular the real index is on multiple columns, of which the last is the interesting one. Yes I know that multiple column indices aren't exactly encouraged in Firebird; at this stage I'm doing experiments, not writing production code. -- Tim Ward
RE: [firebird-support] Today's performance question - index direction
Time Boiled down[#], I've got a table MYTABLE with an integer column MYCOLUMN with an index on it, and I'm looking at queries like (a) select COUNT(*) from MYTABLE where MYCOLUMN 2 (b) select COUNT(*) from MYTABLE where MYCOLUMN 2 The vast majority of rows in the table have MYCOLUMN = 2, so I'm looking at these queries to return very small numbers very quickly. (MYCOLUMN = 2 means this object is OK, and I'm looking to find the objects that are not OK.) If the index is ASC then (a) is fast (b) is very slow (although it claims to be using the index, according to the plan, it's doing as much work as a table scan, according to the stats) If the index is DESC then (a) is a bit slower but still fast (b) is fast. Questions: (1) So it seems that if I want to do a comparison in the WHERE clause I'm better off with a DESC index, but if I'm wanting to do a comparison there isn't the same need to use an ASC index? Correct! (2) How does this indexing work anyway, that makes it sensitive to direction? - from other databases I'm sort-of vaguely used to the idea (without having got into the source code) that an index is a tree structure, and having walked down it to a particular node there's no different cost to going left (DESC) or right (ASC) from there? Firebird is only able to walk indexes in one direction, due to prefix compression within the index structures, which is why ASC/DESC matters. [#] The real cases are rather more complicated. I haven't actually re-run the experiments on this cut-down scenario so don't know for sure that it will behave in the same way. In particular the real index is on multiple columns, of which the last is the interesting one. Yes I know that multiple column indices aren't exactly encouraged in Firebird... That is not completely true. There are plenty of cases where multi-column indexes are an excellent solution, much better than depending on the use of separate single column indexes. It really depends on your data usage/access modalities. Sean
Re: [firebird-support] Today's performance question - index direction
On 04/09/2013 16:54, Leyne, Sean wrote: due to prefix compression within the index structures Ah, yes, I did write one of those once - a 1980s spelling dictionary for a word processor, where being able to encode an entire word in the smallest possible number of five-bit codes was *much* more important than being able to read the list backwards. (I think we ended up with less than 3 bytes per word for an English dictionary?) I'm slightly surprised that it's thought to matter with this century's disk prices however. -- Tim Ward
Re: [firebird-support] Today's performance question - index direction
Em 4/9/2013 13:02, Tim Ward escreveu: Ah, yes, I did write one of those once - a 1980s spelling dictionary for a word processor, where being able to encode an entire word in the smallest possible number of five-bit codes was *much* more important than being able to read the list backwards. (I think we ended up with less than 3 bytes per word for an English dictionary?) I'm slightly surprised that it's thought to matter with this century's disk prices however. -- Tim Ward AFAIK it's not about disk price, but to make it more dense, so more index entries could fit in a single page, leading to fewer page reads and so improving speed. see you !
Re: [firebird-support] Today's performance question - index direction
I've noted that the documentation says that whether your index is ASC or DESC matters, but it's not clear to me either why it should or exactly what the implications are. Boiled down[#], I've got a table MYTABLE with an integer column MYCOLUMN with an index on it, and I'm looking at queries like (a) select COUNT(*) from MYTABLE where MYCOLUMN 2 (b) select COUNT(*) from MYTABLE where MYCOLUMN 2 The vast majority of rows in the table have MYCOLUMN = 2, so I'm looking at these queries to return very small numbers very quickly. (MYCOLUMN = 2 means this object is OK, and I'm looking to find the objects that are not OK.) If the index is ASC then (a) is fast (b) is very slow (although it claims to be using the index, according to the plan, it's doing as much work as a table scan, according to the stats) Decided that a value in the middle should mean OK, is - well - not the greatest idea. Probably you also started out with a limited number of options and then adding other values to have other meanings made sense once upon a time. I would recommend to change OK from being 2 to being something like (a higher value than you expect to need for other purposes). If that's not easy, I would consider adding the column MyColumnNegative, populated through a trigger and set to the negative equivalent of MyColumn. Then a) would still be fast, and if you changed b) to (b) select COUNT(*) from MYTABLE where MYCOLUMNNEGATIVE -2 then that should also be fast. Other options that you could consider at least include using computed indexes (though I've never tried them) and adding a triggerpopulated new table(s) - either containing references to records that are not OK or a table containing the sums (when records are deleted, you insert a row containing -1, when records are inserted, you insert a row containing +1, when a row is modified, you may end up inserting one row with +1 and another with -1, occationally you sum data and delete old sums). All in all, you have quite a few options you can consider to circumvent your problem while awaiting Firebird 3 with histograms (haven't checked alpha 1 yet, so I'm not certain they are included). HTH, Set
RE: [firebird-support] Today's performance question - index direction
Tim, due to prefix compression within the index structures Ah, yes, I did write one of those once - a 1980s spelling dictionary for a word processor, where being able to encode an entire word in the smallest possible number of five-bit codes was *much* more important than being able to read the list backwards. (I think we ended up with less than 3 bytes per word for an English dictionary?) I'm slightly surprised that it's thought to matter with this century's disk prices however. There are more costs that just disk space usage which need to be considered: - Data Transfer speed from disk to RAM - Data transfer from RAM to CPU - RAM consumption of cache - Loss of cache efficiency due to higher number of pages used by index structures I do agree though, that if everyone was using SSDs for storage, the weight/strength of the above would be much weaker. Sean