> On 5/2/2014 3:25 PM, Vlad Khorsun wrote:
>>> OK, the alternative to record lookups is to store the transaction id in
>>> the index. This would require an index insertion for all indexes defined on
>>> a table even if the key value didn't change. It would also require a
>>> corresponding index deletion for each index defined on the table when a
>>> record version was garbage collected. The update performance would go
>>> straight down the toilet. And this doesn't include the performance drop
>>> due to fluffed-up indexes.
>>>
>>> If you have a third alternative, by all means share it.
>>
>> Mark every index key with two tx numbers:
>> - the first, mandatory, is the number of the tx that inserted this key
>>   (insert, or update when the key changed),
>> - the second, optional, is the number of the tx that deleted this index key
>>   (delete, or update when the key changed).
>>
>> - inserts will cost the same as now;
>> - updates will cost
>>   - the same as now (zero) if the key did not change,
>>   - twice the current cost if the key changed (mark the old key with the
>>     current tx number, insert the new key);
>> - deletes will have the additional cost of marking the old key with the
>>   current tx number;
>> - undo of an update or delete must additionally clear the mark on the old key;
>> - index keys will be wider than now, though there are tricks to reduce the
>>   new key length;
>> - garbage collection will be the main winner: there is no need to process
>>   indices at the same time as table records. Every index can be processed
>>   independently of the table, which almost completely eliminates random IO
>>   when records are removed. Currently, when a table has more than 5-6
>>   indices, garbage collection is terribly slow because of random IO;
>> - selects will have no need to read the record version to evaluate record
>>   visibility;
>> - it also allows index coverage (which additionally requires an index key
>>   encoding that can recover the original value from the key).
>
> OK, I've given this some thought.
> With one reservation and a set of concerns, I'm reasonably convinced
> that the information necessary to implement this is available at each
> step of an entry's life cycle -- insertion, update, and eventual garbage
> collection -- either available when needed or providable at negligible
> cost. The performance considerations are significant and very difficult
> to balance. I will stop short at saying it could be implemented and
> leave whether it should be implemented to folks more familiar with the
> current code than me.
>
> My sole reservation is that information can be lost when a large 64-bit
> integer is converted into a double to generate the index key. The
> current key generation scheme doesn't quite hack it for Vlad's scheme.
> That said, it could be readily fixed by appending the low-order bits of
> the original 64-bit integer to the end of the mulched key. This could
> probably be done as an upwards-compatible upgrade, but since Vlad's
> scheme requires other significant changes, it probably doesn't make
> sense.
I prefer to concentrate not on number encoding (it is obviously a task that
could be solved) but on the transaction info in index entries, which allows
us to significantly boost the performance of garbage collection.

> My first concern is over index density. Adding one and sometimes two
> transaction ids to each index entry is going to significantly fluff up
> the index, increasing the I/O necessary for index scans and
> maintenance. Transaction ids are not small and don't compress well.
> Given the density of the current index structure, adding transaction ids
> might double the index size.

Yes, this is true. As I said, there are tricks to lessen this issue:

a) we already use variable-length encoding for page/record numbers, which
   allows small numbers to be stored using fewer bytes;

b) considering (a), we can store in keys not a transaction number but the
   difference between some base tx number (stored in the b-tree page header)
   and the real tx number. Such a base tx number could be the number of the
   transaction that put the first key on the page, re-evaluated later if
   necessary. This lets us use smaller stored tx numbers for most keys;

c) we can avoid storing the first tx number (the creating tx) once it
   becomes less than the OIT, i.e. we can "compress" the b-tree page when
   feasible.

> My second concern is CPU performance. Index bucket traversal has always
> been the single greatest hot spot in the architecture. Arno's jump
> table, for example, was a net success by reducing CPU at the expense of
> index density. The CPU cost to skip over a (variable length?)
> transaction id, determine whether there is a second transaction id, and
> if so, skip over the second could easily double or triple the cost of
> index lookups. But perhaps a beastly clever compression scheme that
> compressed the pair as a single unit could mitigate this. Either
> way, the additional cycles will be significant -- and probably painful.

It requires investigation, but I doubt it will even double the current cost.
Dmitry has already confirmed this estimate.

> My third concern is that garbage collection will be very tricky, but as
> long as the going/staying lists are maintained, doable. The simple
> cases are probably straightforward, but untangling A-B-A updates is
> going to be hard, particularly if the two A entries are split between
> two index leaf pages. Still, this is only code, and I wouldn't expect a
> noticeable performance hit.

Garbage collection in indices will be a lot faster, as it will not be
coupled to GC in the table. Every index could be processed at any time,
fully separately from the table, and requires no going/staying lists
anymore. For blobs we still need those lists, yes, but that is not related
to GC in indexes.

As for A-B-A updates, I see no problem, as there will be 3 separate index
entries, each with its own tx numbers. Currently we also have 3 index
entries for such a scenario, btw. The old IB optimization of not storing an
existing key a second time does not work in a concurrent environment and
easily allows missing entries. It was disabled in FB 2.0.

I have big concerns about the offered schema too. I have already mentioned
all its obvious drawbacks (and benefits). But the currently used way of
doing GC is both inefficient and not 100% safe in a highly concurrent
environment.

Regards,
Vlad

Firebird-Devel mailing list, web interface at
https://lists.sourceforge.net/lists/listinfo/firebird-devel