Hi,
  I spent some time looking for a way to speed up qcow2 performance on allocation and snapshots, so I branched a qcow2x branch from the kevin/coroutine-block branch. Currently it just writes using a different algorithm (one that is fully compatible with qcow2, there is no ICountAsZero(TM) method :) ). It is mainly a complete rewrite of qcow2_co_writev.

Some problems I encountered in the way the current qcow2 code works:
- reference decrements are not optimized (well, this is easy to fix in the current code)
- any L2 update blocks all other L2 updates. This is a problem if the guest is writing a large file sequentially, because most writes need to be serialized
- L2 allocation could be done together with the related data (this is not easy to do with the current code)
- data/L2 are allocated sequentially (if there are no freed holes) but written in a different order. This causes excessive file fragmentation with the default cache mode. For instance, on xfs the file is allocated quite sequentially on every write, so any non-sequential write creates a different fragment.

Currently I'm getting these times with iotests (my_cleanup is another, more conservative branch with a patch to collapse reference decrements; note that 011 and 026 are missing, still not working):

       X     C     B
001    6     3     7
002    3     3     4
003    3     3     3
004    0     1     0
005    0     0     0
007   35    32    36
008    3     4     3
009    1     0     0
010    0     0     0
012    0     0     2
013  125   err   158
014  189   err   203
015   48    70   610
017    4     4     4
018    5     5     5
019    4     4     4
020    4     4     4
021    0     0     0
022   74   103   103
023   75   err    95
024    3     3     3
025    3     3     6
027    1     1     0
028    1     1     1

X  qcow2x
C  my_cleanup
B  kevin/coroutine-block

Currently the code is quite "spaghetti" code (it needs a lot of cleanup, checks, better error handling and so on). Taking into account that the code still needs additional optimizations and is full of internal debugging, the times are quite good.

Main questions are:
- is anybody interested?
- how can I send such a patch for review, considering that it is quite big (I know, I have to clean it up a bit too)?

Regards,
  Frediano
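
For readers unfamiliar with the "collapse reference decrement" idea mentioned above, here is a rough sketch of what it could look like. This is only an illustration under my own assumptions, not the code from the qcow2x or my_cleanup branches: the ClusterRun type and collapse_decrements() function are hypothetical names, and the real refcount code in qcow2 works on different structures. The point is just that consecutive cluster indexes can be merged into runs, so each run needs one refcount update instead of one per cluster.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: a run of contiguous clusters released together. */
typedef struct ClusterRun {
    uint64_t first;   /* index of the first cluster in the run */
    uint64_t count;   /* number of contiguous clusters in the run */
} ClusterRun;

/*
 * Collapse a list of cluster indexes into runs of consecutive indexes.
 * 'runs' must have room for n entries (worst case: no two clusters are
 * adjacent). Returns the number of runs written.
 */
static size_t collapse_decrements(const uint64_t *clusters, size_t n,
                                  ClusterRun *runs)
{
    size_t nruns = 0;

    assert(n == 0 || (clusters != NULL && runs != NULL));

    for (size_t i = 0; i < n; i++) {
        /* Extend the last run if this cluster directly follows it. */
        if (nruns > 0 &&
            runs[nruns - 1].first + runs[nruns - 1].count == clusters[i]) {
            runs[nruns - 1].count++;
            continue;
        }
        /* Otherwise start a new run of length one. */
        runs[nruns].first = clusters[i];
        runs[nruns].count = 1;
        nruns++;
    }
    return nruns;
}
```

A caller would then issue one decrement per returned run rather than one per cluster. Clusters that are adjacent but not listed in ascending order are not merged by this sketch.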