On Tuesday 09 September 2008 03:46, Kent Overstreet wrote:
> Actually, I've got a better idea. In your atom table, have next to
> your link count a use counter for the number of times the link count
> has changed - when link count is incremented, increment the use
> counter, when the link count is decremented increment the use count.
>
> Use the use count or possibly both to heuristically decide which xattr
> names to reference count, and then just use garbage collection for the
> ones you mark as permanent, if the table ever gets huge; in practice,
> this won't happen at all. Integrate it into the online fsck or
> whatever.

Hi Kent,

Thanks for poking at this issue. I think the right thing to do is to
refcount the atoms properly, and just do it efficiently. So when we
commit a phase (fsync or flush), one of the things that goes into the
log is a list of [atom, +-count] pairs. Then once in a long while, we
roll those pairs up into an atom count table, similar to the atime
table. We can splurge and have 64-bit counts, because the size of the
table is not significant. Or we can be stingy and have two tables, one
for the low word and one for the high word of the count, expecting to
update the high word rarely if ever.

I am not wild about the idea of letting atom garbage accumulate and
scanning for it later. I suspect you feel the same way.

This does make me introspect and wonder if the atom idea is really
worth the trouble. I think it is. I hate the thought of people avoiding
long xattr names just because they know the xattr name gets stored in
every inode that uses the xattr. I also think that the atom idea will
give a measurable improvement in performance by reducing cache
pressure. In Tux3, the big point is reducing the size of inode table
attributes, which Tux3 has to scan frequently when versioning is in
action.

I'm still not totally sure about the whole atom idea. Quite sure, but
not totally sure. Anyway, on Tuesday (later today actually) it gets
implemented, thus becoming much more concrete.

There is no immediate time pressure to solve the atom refcount issue.
We just need to know that the denial of service you pointed out can be
prevented without a huge efficiency penalty. I think the code
complexity is going to be ok even with proper ref counting, because we
are mainly recycling mechanisms like the proposed count table, which
are already slated to be used elsewhere. And the throughput cost is
probably negligible, because xattrs are mainly set once and then read a
lot, like file mode.

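As a rough illustration of the delta log and rollup described above,
here is a small Python sketch. It is not Tux3 code: the class
AtomCounts, its methods, the 32-bit word size, and the in-memory
dictionaries standing in for the log and the count tables are all
assumptions made for this example.

```python
from collections import Counter

WORD_BITS = 32                      # assumed word size for the split tables
WORD_MASK = (1 << WORD_BITS) - 1


class AtomCounts:
    """Hypothetical in-memory model of logged atom refcount deltas."""

    def __init__(self):
        self.pending = Counter()    # deltas gathered during the current phase
        self.log = []               # one list of (atom, delta) pairs per commit
        self.low = {}               # atom -> low word of the count
        self.high = {}              # atom -> high word, expected to change rarely

    def ref(self, atom, delta=1):
        """Record a refcount change; nothing is written until commit()."""
        self.pending[atom] += delta

    def commit(self):
        """Phase commit: log the net nonzero deltas as a single record."""
        record = [(a, d) for a, d in self.pending.items() if d != 0]
        if record:
            self.log.append(record)
        self.pending.clear()

    def count(self, atom):
        """Reassemble the full count from the two word tables."""
        return (self.high.get(atom, 0) << WORD_BITS) | self.low.get(atom, 0)

    def rollup(self):
        """Fold all logged deltas into the tables; return atoms that hit zero."""
        totals = Counter()
        for record in self.log:
            for atom, delta in record:
                totals[atom] += delta
        self.log.clear()

        freed = []
        for atom, delta in totals.items():
            count = self.count(atom) + delta
            if count < 0:
                raise ValueError(f"atom {atom} refcount went negative")
            if count == 0:
                # No references left: the atom can be recycled.
                self.low.pop(atom, None)
                self.high.pop(atom, None)
                freed.append(atom)
                continue
            self.low[atom] = count & WORD_MASK
            new_high = count >> WORD_BITS
            # Touch the high-word table only when the high word changes.
            if new_high != self.high.get(atom, 0):
                self.high[atom] = new_high
        return freed


if __name__ == "__main__":
    counts = AtomCounts()
    counts.ref(1)                   # atom 1 used by two inodes
    counts.ref(1)
    counts.ref(2)                   # atom 2 used once...
    counts.commit()
    counts.ref(2, -1)               # ...then dropped in a later phase
    counts.commit()
    print(counts.rollup(), counts.count(1))
```

In this sketch the per-commit cost is one short log record, and the
count tables are only touched at rollup time. An atom whose count
reaches zero is reported by rollup() so it can be reclaimed right
away, rather than found by a later garbage scan.
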
Regards,

Daniel

_______________________________________________
Tux3 mailing list
[email protected]
http://tux3.org/cgi-bin/mailman/listinfo/tux3
