-------- Original Message --------
Subject: Re: Crazy idea of cleanup the inode_record btrfsck things with SQL?
From: Robert White <rwh...@pobox.com>
To: Qu Wenruo <quwen...@cn.fujitsu.com>, linux-btrfs
<linux-btrfs@vger.kernel.org>
Date: 2014-12-01 12:03
On 11/30/2014 05:58 PM, Qu Wenruo wrote:
("why not use SQL to..." suggestion)
SQL, as in Structured Query Language, is _terrible_ for recursion. It
expresses all of its elements in terms of set theory and really can
only implement union and intersection of flat sets.
Several companies offer extensions to SQL in their implementations to
help with this lack of recursion, such as the "prior" operator in
Oracle's CONNECT BY queries, but they are all stateful beyond reason.
Several companies, including Microsoft, have proposed and partially
implemented "a relational database as a file system" paradigm and then
crashed into the fact that dealing with the parent of the parent of
something is different than dealing with the parent of the parent of
the parent of something.
There is a humorous-but-true saying: "If you have a problem, and you
decide to solve it with (regex or xml or uml or sql etc) you now have
two problems."
Wait, I have heard that said about regex, UML and XML, but never about SQL...
Writing the SQL to walk the tree is harder than allocating the memory
as a vector, filling it with the data, and then walking the pointers.
In fact, such INODE and INODE_REF tables would not be used (completely
or even mainly) to walk the tree.
They would mainly be used to search for:
1. Is there any inode_ref that refers to a given ino as its parent?
This is not even a problem when the fs is *OK*, since a simple
btrfs_search_slot()
with key (objectid = ino, type = BTRFS_DIR_INDEX_KEY/BTRFS_DIR_ITEM_KEY,
offset = 0)
will do it.
However, when a leaf is corrupted, the whole INODE_ITEM and its
DIR_INDEX/DIR_ITEM entries are gone
with the leaf, so the old search method is not usable and btrfs-progs
has to rely on some other mechanism
to determine that.
Unfortunately, there is no such mechanism.
2. Is there any dir_index/dir_item that refers to a given ino as its child?
The current inode_record works fine for this purpose.
So when the crazy idea disappears and sane ideas come back, it will
probably be an rb-tree of
(parent, ino, name, namelen) entries to record the parent-child relation
(currently it is only a list_head that records backrefs inside the
inode_record),
plus another rb-tree of (ino) entries (the same as the current
inode_record structure).
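A minimal sketch of the (parent, ino, name, namelen) index described
above, and of query 1 against it. This is not btrfs-progs code: the
names link_entry, link_cmp, link_index_sort and has_child_ref are
hypothetical, and a sorted array stands in for the rb-tree to keep the
example short. The ordering (parent first) is the part that would carry
over to an rb-tree comparator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one parent-child link collected during fsck. */
struct link_entry {
	uint64_t parent;	/* inode number of the directory */
	uint64_t ino;		/* inode number of the child */
	const char *name;	/* name bytes, not necessarily NUL-terminated */
	uint16_t namelen;
};

/* Total order: parent, then ino, then namelen, then the name bytes. */
static int link_cmp(const void *a, const void *b)
{
	const struct link_entry *x = a, *y = b;

	if (x->parent != y->parent)
		return x->parent < y->parent ? -1 : 1;
	if (x->ino != y->ino)
		return x->ino < y->ino ? -1 : 1;
	if (x->namelen != y->namelen)
		return x->namelen < y->namelen ? -1 : 1;
	if (x->namelen == 0)
		return 0;
	return memcmp(x->name, y->name, x->namelen);
}

/* Sort once after all entries have been collected. */
static void link_index_sort(struct link_entry *v, size_t n)
{
	assert(v != NULL || n == 0);
	qsort(v, n, sizeof(*v), link_cmp);
}

/* Index of the first entry whose parent is >= the given parent. */
static size_t link_lower_bound(const struct link_entry *v, size_t n,
			       uint64_t parent)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (v[mid].parent < parent)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Query 1: does any entry refer to `parent` as its parent? */
static int has_child_ref(const struct link_entry *v, size_t n,
			 uint64_t parent)
{
	size_t i = link_lower_bound(v, n, parent);

	return i < n && v[i].parent == parent;
}
```

Because entries with the same parent are adjacent in this order, the
same lower-bound step also gives the starting point for listing every
child of a directory, which is the lookup a corrupted leaf takes away.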
Your suggestion is the first step on the road to The Inner Platform
Effect™. You have a specialized database (parent, inode, name) and now
you want to put a generic database engine over the specialized
database so that you can re-implement the specialized database with
generic primitives.
http://en.wikipedia.org/wiki/Inner-platform_effect
Things need to be only as generic as they need to be, and no more
generic than that.
Replacing a pointer to a record with a pointer to a cursor's result
table that will give you the name of the next result to query is not a
win. Even as you spell it out you can see that it is _not_ a reduction
in memory or processing.
And the "easy SQL lines" stop being that easy when "name" stops being
unique.
Name is still unique when the parent ino is given, so the INODE_REF
table's primary key is not
the name alone but the (parent, ino, name) combination.
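To illustrate the uniqueness rule, here is a small sketch under the
assumption that a (parent, name) pair should map to exactly one ino, so
two entries that agree on parent and name but disagree on ino are
suspect. The names dir_link, links_conflict and count_conflicts are
hypothetical, and the quadratic scan is only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record with the (parent, ino, name) composite key. */
struct dir_link {
	uint64_t parent;
	uint64_t ino;
	const char *name;
	uint16_t namelen;
};

/* Same directory and same name bytes? */
static int same_parent_and_name(const struct dir_link *a,
				const struct dir_link *b)
{
	return a->parent == b->parent && a->namelen == b->namelen &&
	       (a->namelen == 0 ||
		memcmp(a->name, b->name, a->namelen) == 0);
}

/* Two entries conflict if one name in one directory points to two inodes. */
static int links_conflict(const struct dir_link *a, const struct dir_link *b)
{
	return same_parent_and_name(a, b) && a->ino != b->ino;
}

/* Count conflicting pairs with a simple O(n^2) scan. */
static size_t count_conflicts(const struct dir_link *v, size_t n)
{
	size_t i, j, count = 0;

	assert(v != NULL || n == 0);
	for (i = 0; i < n; i++)
		for (j = i + 1; j < n; j++)
			if (links_conflict(&v[i], &v[j]))
				count++;
	return count;
}
```

Hard links are the reason name alone cannot be the key: the same ino may
appear under another parent, or under a second name in the same parent,
and neither case counts as a conflict here.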
But the inner platform effect still seems valid for my crazy idea.
Anyway, the crazy idea came to me when I saw the RDB-like features in
the inode_record structure,
-and I just wanted to save some time coding the new (parent, ino, name,
namelen) rb-tree-.
(I've been down this road before. Not with file systems but with
"managed objects" in a network management system. Nodes, Parent nodes,
etc. Just referring to distributed things like networks switches
instead of file system inodes. ... It doesn't end well. 8-) )
The RDB idea must have come to you just like it did to me, from wanting
to write less code, right?
So it seems the end may be the same. :-(
Thanks,
Qu