On 2014-11-30 20:58, Qu Wenruo wrote:
So, I think this does a good job of highlighting one of the bigger issues with btrfsck when compared to ext* and/or xfs. Even though this is a real problem, I really don't think using an RDBMS is the way to fix it, both for the reasons outlined in other responses and because fsck should be as fast as possible when nothing is wrong with the fs.

[BACKGROUND]
I'm trying to implement a function to repair a missing inode item.
In that case, the inode type must be salvaged (although it can fall back to FILE).

One rule should be: if any dir_item/index or inode_ref refers to the inode as its parent, the type of that inode must be DIR.

However, with the current btrfsck implementation (inode_record only records backrefs), we are unable to search for the inode_backrefs whose parent is a given inode number.

[FIRST IMPLEMENT DESIGN]
My first thought was to implement a generic inode-relation structure recording parent ino, child ino, name and namelen, and to store that structure in an rbtree rather than in the child's/parent's list.

But I soon realized that this is a perfect use case for a relational database, with 'ino' as the primary key of the INODE table and ('parent_ino', 'child_ino', 'name') as the primary key of the INODE_REF table.

[CRAZY IDEA]
So why not use SQL to implement the btrfsck inode-record handling?
With such a crazy idea, it will be much easier to iterate from a given ino, and with an already mature RDB implementation like sqlite3, we can save the hundreds of lines of code that implement the rb-tree or list.

[PROS]
1. Easy to maintain
We no longer need to maintain the rbtree searching or list iteration, only simple SQL statements and their wrappers.

2. Easy to extend
If we need to record something more, like extents and their relation to inodes, we only need to create 2 tables plus several SQL statements and wrappers.

3. Reduced memory usage for HUGE fs
When metadata grows to several TB or even more, the current rb-tree based implementation may run short of memory, since everything is stored in memory.
But with SQL, an RDBMS like sqlite3 can store things either in memory or on disk, which may hugely reduce memory usage for a huge btrfs.
Without an existing RDBMS, we would need to implement a complicated memory control system to manage memory in userland.

[CONS]
1. Heavy implementation
SQL hides the rb-tree or B+ tree implementation but costs more memory (if not compressed) and CPU cycles, so it will be slower than the simple rb-tree implementation, even with a lightweight RDBMS like sqlite3.

2. Heavy dependency
If we use it, btrfs-progs will gain an RDBMS as a build and runtime dependency.
It may seem very strange for such low-level progs to depend on a high-level program like sqlite3.

3. A lot of rework on existing code
Even though SQL is easier to maintain and extend, using it still means reimplementing several hundred or even thousands of lines of code, not to mention the regression tests.

4. Copyright
Will it cause any copyright problem to use a non-GPL RDBMS like sqlite3 in GPLv2 btrfs-progs?

[NEED FEEDBACK]
Any feedback or discussion on this crazy idea is welcome. Since this may need a lot of work, the idea definitely needs a lot of review before it turns into code.
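
For illustration only, the INODE / INODE_REF layout and the "referenced as parent means DIR" rule from the proposal could be sketched with Python's stdlib sqlite3 module. This is a rough sketch, not btrfs-progs code: the table and column names follow the proposal, while the `type` column, the helper names (`open_db`, `guess_type`, `children_of`) and the sample inode numbers are assumptions made up for the example. Passing a file path instead of ":memory:" would keep the tables on disk, which is the memory-saving point listed under PROS.

```python
import sqlite3

# Hypothetical schema: names follow the proposal, details are assumptions.
SCHEMA = """
CREATE TABLE inode (
    ino  INTEGER PRIMARY KEY,
    type TEXT                 -- NULL while the inode item is missing
);
CREATE TABLE inode_ref (
    parent_ino INTEGER NOT NULL,
    child_ino  INTEGER NOT NULL,
    name       BLOB    NOT NULL,
    PRIMARY KEY (parent_ino, child_ino, name)
);
"""

def open_db(path=":memory:"):
    """Open an in-memory database by default; a file path stores it on disk."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db

def guess_type(db, ino):
    """Return 'DIR' if any ref names `ino` as its parent, else fall back to 'FILE'."""
    row = db.execute(
        "SELECT 1 FROM inode_ref WHERE parent_ino = ? LIMIT 1", (ino,)
    ).fetchone()
    return "DIR" if row else "FILE"

def children_of(db, ino):
    """List (child_ino, name) pairs whose parent is `ino`."""
    return db.execute(
        "SELECT child_ino, name FROM inode_ref "
        "WHERE parent_ino = ? ORDER BY child_ino",
        (ino,),
    ).fetchall()

# Made-up sample refs: 256 contains 257 and 258, and 258 contains 259.
db = open_db()
db.executemany(
    "INSERT INTO inode_ref VALUES (?, ?, ?)",
    [(256, 257, b"a.txt"), (256, 258, b"sub"), (258, 259, b"b.txt")],
)
```

With these sample rows, `guess_type(db, 258)` finds a ref whose parent is 258 and reports DIR, while an inode that never appears as a parent falls back to FILE, matching the fallback described under BACKGROUND.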