Hi,
I have a compelling use case for larger dnodes and have spent a small
amount of time looking into a potential way of implementing this. But,
before I spend any more time on it, I want to get some buy in from the
community and upstream on the implementation details.
My current plan for adding support for larger dnodes is actually pretty
simple. I want to "steal" 8 bits from the unused padding currently in
the dnode_phys_t structure, use these bits to store a size for the dnode
(in number of 512 byte sectors), and then consult this size whenever
traversing a dnode block (in order to properly skip over the necessary
amount of bonus space and reach the "next" dnode in the block). This
would not only allow us to have a larger fixed size for dnodes, but it
would also allow us to have variable sized dnodes.
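To make the traversal described above concrete, here is a minimal C sketch
of walking a dnode block when each dnode records its own size in 512-byte
sectors. It is only an illustration under stated assumptions: the struct
and names (demo_dnode_t, dd_sectors, demo_put, demo_walk) are hypothetical,
the 16K block size is assumed, and treating a zero type as a free one-sector
slot is an assumption as well. None of this is the real dnode_phys_t layout
or the code in the experimental patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u            /* unit of the proposed size field */
#define BLOCK_SIZE  (16u * 1024u)   /* assumed dnode block size */

/* Hypothetical stand-in for the start of a dnode (NOT dnode_phys_t). */
typedef struct demo_dnode {
    uint8_t dd_type;     /* assumption: 0 means the slot is free */
    uint8_t dd_sectors;  /* the 8 "stolen" bits: size in 512-byte sectors */
} demo_dnode_t;

/* Write a dnode header of `sectors` sectors at byte offset `off`. */
static void
demo_put(uint8_t *block, size_t off, uint8_t type, uint8_t sectors)
{
    demo_dnode_t dn = { type, sectors };

    assert(sectors > 0 && off + (size_t)sectors * SECTOR_SIZE <= BLOCK_SIZE);
    memcpy(block + off, &dn, sizeof (dn));
}

/*
 * Walk the block, recording the byte offset of each allocated dnode.
 * Returns the number of allocated dnodes found, or -1 if a size field
 * is zero or would run past the end of the block.
 */
static int
demo_walk(const uint8_t *block, size_t *offsets, int max)
{
    size_t off = 0;
    int n = 0;

    while (off < BLOCK_SIZE) {
        demo_dnode_t dn;
        size_t len;

        memcpy(&dn, block + off, sizeof (dn));
        if (dn.dd_type == 0) {
            /* Free slot: advance by the minimum dnode size. */
            off += SECTOR_SIZE;
            continue;
        }

        len = (size_t)dn.dd_sectors * SECTOR_SIZE;
        if (len == 0 || off + len > BLOCK_SIZE)
            return (-1);

        if (n < max)
            offsets[n] = off;
        n++;
        off += len;   /* skip the bonus space to reach the "next" dnode */
    }
    return (n);
}
```

For example, placing dnodes of 1, 4, and 2 sectors back to back in a zeroed
block and calling demo_walk() visits them at byte offsets 0, 512, and 2560.
An 8-bit sector count is more than enough here, since a 16K dnode is only
32 sectors.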
I have a *VERY* experimental patch to update the object allocation code
to test this idea out, and it seems to work pretty well. I can create a
dataset, create dnodes of varying sizes (0.5K to 16K), and then use zdb
to ensure the object number allocation worked as planned. If you are
interested in checking out the source, it's up on github:
https://github.com/prakashsurya/zfs/commit/15c110722a50bd7196a71b9a360e5217445c7d13
Does anybody see any major road blocks to implementing larger dnodes in
this way? I'm still pretty unfamiliar with this section of the code, so
I'd love some input from the people more familiar with it.
There is still a lot of work to go from this experimental patch to
something ready to land. The send/receive bits, diff bits, and code
to actually use the extra bonus area all remain to be implemented; I was
just interested in testing out the object allocation code with that
patch.
To properly land upstream, this type of change also depends on adding
dataset-level feature flags. So I'll soon need to bring up that
discussion and decide on the best way to implement them.
I welcome any and all feedback on this idea. So far, through testing and
off list discussions, I don't foresee any major issues. It's just a
matter of careful implementation and extensive testing.
--
Cheers, Prakash