----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.csiden.org/r/131/#review418 -----------------------------------------------------------
Ship it! Ship It! - George Wilson On Dec. 16, 2014, 7:32 p.m., Justin Gibbs wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.csiden.org/r/131/ > ----------------------------------------------------------- > > (Updated Dec. 16, 2014, 7:32 p.m.) > > > Review request for OpenZFS Developer Mailing List. > > > Bugs: 5056 > https://www.illumos.org/projects/illumos-gate//issues/5056 > > > Repository: illumos-gate > > > Description > ------- > > Various improvements the dmu buf user API. > > Submitted by: Justin Gibbs <just...@spectralogic.com> > Submitted by: Will Andrews <wi...@spectralogic.com> > Sponsored by: Spectra Logic Corporation > > Collect dmu buf user API support data into a new structure, > dmu_buf_user_t. Consumers of this interface must include a > dmu_buf_user_t as a member of the user data structure that will be > attached to a dmu buffer. This reduces the size of dmu_buf_impl_t > by two pointers. > > Queue dmu buf user eviction processing to a taskq. This prevents > FreeBSD witness(4) lock-order reversal warnings, potential deadlocks, > and reduces stack depth. > > Convert objset eviction from a synchronous to an asynchronous process > to accommodate the asynchronous invocation, via dmu buf user eviction, > of dnode_buf_pageout(). > > Modify existing users of the dmu buf user API to never access the dbuf to > which their data was attached after user eviction has occurred. Accessing > the dbuf from the callback is no longer safe now that callbacks occur > without the locks that used to protect them. Enforce this in ZFS_DEBUG > kernel builds by clearing the user's pointer to the dbuf, if any, at > the time of eviction. Callbacks have also been modified to clear their > dbuf pointer so most errors are caught even on non ZFS_DEBUG kernels. > However, this will not catch accesses from other contexts that occur > between the time of eviction and the processing of the callback. > > Clarify programmer intent and improve readability by providing > specialized functions for the common user data update actions "remove" > and "replace". > > Provide code-comment documentation for each API call. > > Perform runtime validation of proper API usage on ZFS_DEBUG kernels. > > uts/common/fs/zfs/sys/dbuf.h: > uts/common/fs/zfs/dbuf.c: > Add dbuf_verify_user() and call it from the dbuf user API and > during dbuf eviction processing to verify dbuf user API state. > > Replace calls to dbuf_set_data(db, NULL) with more explicit > db_clear_data(). dbuf_set_data() now asserts that its buffer > argument is never NULL. > > Implement new dmu buf API functions. > > Add the dmu_evict_taskq for processing dmu buf user evictions. > Add dmu_buf_user_evict_wait() which allows spa, dsl pool, and > dmu close/fini functions to drain pending user evictions. > > In dbuf_rele_and_unlock(), immediately evict dbufs with a zero > refcount for an objset that is being evicted. This allows > the indirect dbufs in a dnode to be evicted asynchronously > after the zero refcount dbufs that reference them are cleared > via dmu_objset_evict()->dnode_evict_dbufs(). > > uts/common/fs/zfs/sys/dmu_objset.h: > uts/common/fs/zfs/dmu_objset.c: > End the practice of including special dnodes in os->os_dnodes. > This allows os->os_dnodes to be managed completely by one > eviction path: dnode_buf_pageout()->dnode_destroy(). > > Split objset eviction processing into two pieces. The first > marks the objset as evicting, evicts any dbufs that have a > refcount of zero, and then queues up the objset for the second > phase of eviction. Once os->os_dnodes has been cleared by > dnode_buf_pageout()->dnode_destroy(), the second phase is executed. > The second phase closes the special dnodes, dequeues the objset > from the list of those undergoing eviction, and finally frees > the objset. > > NOTE: Due to asynchronous eviction processing (invocation of > dnode_buf_pageout()), it is possible for the meta dnode for the > objset to have no holds even though os->os_dnodes is not empty. > > uts/common/fs/zfs/sys/dnode.h: > uts/common/fs/zfs/dnode.c: > Collapse the initialization of a dnode from dnode_hold_impl() > into dnode_create(). Since we already grab os_lock, use it to > provide mutual exclusion for dnh->dnh_dnode and to arbitrate > the winner of the initial open race. The only way to unset the > handle is to page out an entire set of dnodes, which uses the > user eviction mechanism to arbitrate. > > In dnode_destroy(), invoke final stage of objset eviction if > the destroyed dnode is the last regular dnode in the objset. > > Modify dnode_buf_pageout() so that it doesn't reference the > evicted dbuf. > > uts/common/fs/zfs/dnode_sync.c: > In dnode_evict_dbufs(), remove multiple passes over dn->dn_dbufs. > This is possible now that objset eviction is asynchronously > completed in a different context once dbuf eviction completes. > > In the case of objset eviction, any dbufs held by children will > be evicted via dbuf_rele_and_unlock() once their refcounts go > to zero. Even when objset eviction is not active, the ordering > of the avl tree guarantees that children will be released before > parents, allowing the parent's refcounts to naturally drop to > zero before they are inspected in this single loop. > > uts/common/fs/zfs/sys/spa.h: > uts/common/fs/zfs/sys/spa_impl.h: > uts/common/fs/zfs/spa.c: > uts/common/fs/zfs/spa_misc.c: > Track evicting objsets in the spa. > > When recording or validating the min spa reference count, wait > for objset eviction processing to complete so that the reference > count is stable. > > uts/common/fs/zfs/sys/spa.h: > uts/common/fs/zfs/sys/dsl_dir.h: > uts/common/fs/zfs/spa_misc.c: > uts/common/fs/zfs/dsl_dir.c: > Add spa_async_close() and dsl_dir_async_rele(). These APIs are > used during the async eviction process of dsl datasets and dirs to > indicate that the normal rules for releasing a reference count on > the spa do not apply. Async releases occur from a taskq without > the namespace lock held and may be for objects contributing to > spa_minref (e.g. during pool export). Thus these APIs do not > enforce the namespace lock being held or the spa refcount being > greater than spa_minref on entry. > > uts/common/fs/zfs/dsl_deadlist.c: uts/common/fs/zfs/dsl_dataset.c: > Modify dsl_dataset_evict() so that it doesn't reference the > evicted dbuf to determine if the deadlist needs to be closed. > This is achieved by checking ds->ds_deadlist.dl_os instead > which is now properly cleared when a deadlist is closed prior > to eviction. > > uts/common/fs/zfs/dsl_pool.c: > In dsl_pool_close(), flush the user evictions for any just > released dsl dirs with a call to dmu_buf_user_evict_wait(). > The dsl dirs have back references to the dsl_pool_t which are > accessed during eviction so these must complete before the > dsl_pool_t is destroyed. > > uts/common/fs/zfs/dsl_dir.c: uts/common/fs/zfs/sys/dsl_dir.h: > uts/common/fs/zfs/dsl_dataset.c: uts/common/fs/zfs/sys/dsl_dataset.h: > uts/common/fs/zfs/dsl_dir.c: uts/common/fs/zfs/dsl_prop.c: > uts/common/fs/zfs/sa.c: uts/common/fs/zfs/sys/sa_impl.h: > uts/common/fs/zfs/zap.c: uts/common/fs/zfs/sys/zap_impl.h: > uts/common/fs/zfs/sys/zap_leaf.h: uts/common/fs/zfs/zap_micro.c: > Conform to new dbuf user API. > > uts/common/fs/zfs/sys/dsl_dataset.h: > uts/common/fs/zfs/dmu_objset.c: > uts/common/fs/zfs/dmu_send.c: > uts/common/fs/zfs/dmu_traverse.c > uts/common/fs/zfs/dsl_bookmark.c: > uts/common/fs/zfs/dsl_dataset.c: > uts/common/fs/zfs/dsl_deleg.c: > uts/common/fs/zfs/dsl_destroy.c: > uts/common/fs/zfs/dsl_prop.c: > uts/common/fs/zfs/dsl_scan.c: > uts/common/fs/zfs/dsl_userhold.c: > uts/common/fs/zfs/zil.c: > Record whether or not a dsl dataset is a snapshot in the upon its > creation in the ds_is_snapshot field rather than rely on access to > the ds_num_children in the dsl_dataset_phys_t. This ensures that > the snapshot trait can be determined during eviction processing > when the dbuf holding the dsl_dataset_phys_t is no longer > available. The conditional snapshot logic in dmu_objset_evict() > is currently the only place where a ds_is_snapshot is tested > during eviction. > > uts/common/fs/zfs/sys/dmu.h: > Add prototypes, data structures, and code comment documentation > for the dmu buf user api. > > Pull in ASSERT() via zfs_context.h. > > uts/common/fs/zfs/zfs_sa.c: > Use zfs_context.h instead of manual includes of sys/types.h > and sys/params.h so this file is compatible with the use of > zfs_context.h in sys/dmu.h. > > cmd/truss/Makefile.com: cmd/zhack/Makefile.com: > cmd/zinject/Makefile.com: lib/brand/solaris10/s10_brand/Makefile.com: > lib/libzfs/Makefile.com: > Disable E_STATIC_UNUSED since lint complains about unused inline > functions in header files, even though they are "inline", not > "static inline". > > > Diffs > ----- > > usr/src/uts/sparc/stmf_sbd/Makefile > a57ccaf73d2a002d96232e0a84e79e1b391e9e6b > usr/src/uts/intel/dev/Makefile b5c7c1a9c8defb5eaa1e09bf03a8e5a877369477 > usr/src/uts/common/fs/zfs/sys/spa_impl.h > 48b28eb5a13964b784174aa4a7c08f964185691d > usr/src/uts/common/fs/zfs/sys/sa.h bc89fa07d222c0007c45e390a8ae81a3e1708a8f > usr/src/uts/common/fs/zfs/sys/dsl_dataset.h > 3160a05a8c59a9d504dd2e34b78296a377975a0a > usr/src/uts/common/fs/zfs/sys/dmu.h > 24473daa6d2b5f9b5af2b4857d4f6e4b6d1abc7f > usr/src/uts/common/fs/zfs/zil.c 6beacadf710d650c9f1776f04ed3b1c6e27f4d6e > usr/src/uts/common/fs/zfs/zap_micro.c > 81ea4f730f97251ec54aefa91ae9de70e356ca8a > usr/src/uts/common/fs/zfs/spa_misc.c > 1729ba0503be512ca2b2a22165abf34b25fc0f0b > usr/src/uts/common/fs/zfs/dsl_userhold.c > 0985ec91fd50c8a1234973408d63e3f21631e597 > usr/src/uts/common/fs/zfs/dsl_prop.c > 4e93e43f96983f6e3fd6fed1f3bc49a4ed55b96d > usr/src/uts/common/fs/zfs/dsl_dir.c > f49584168344629463041e177d3202fa6d4e396b > usr/src/uts/common/fs/zfs/dsl_deleg.c > dc405ebe58b522eb3612aaae4585efb8fa269ba2 > usr/src/uts/common/fs/zfs/dsl_bookmark.c > 5fb7f96600dc6f4e823b703a20115e1340c80bf7 > usr/src/uts/common/fs/zfs/dmu_traverse.c > bffdff65d6272249c26e020fad4a630a6bd3fc26 > usr/src/uts/common/fs/zfs/dmu_objset.c > e39264cc2f9324fb17746876b80d1454f1920b7d > usr/src/uts/common/fs/zfs/dbuf.c bcf66bc53d7b5b3320f78679a9a631e16adb8c10 > usr/src/lib/libzfs/Makefile.com e7d33f3a028c56a6ba6e7e53a3aca51974a33505 > usr/src/cmd/zhack/Makefile.com 67927083c4311afec5bc1ab4b18958618b41b5f5 > usr/src/uts/common/fs/zfs/dnode.c 250dd9f9415ed1d670de2aef983d022002fa169a > usr/src/uts/intel/stmf_sbd/Makefile > a57ccaf73d2a002d96232e0a84e79e1b391e9e6b > usr/src/uts/common/fs/zfs/zfs_sa.c ed5f27647510d51ec35b3e1bd213dec892d2d6dd > usr/src/uts/common/fs/zfs/zap.c de2d4deb00ebe665939bbbb64088c9eedacaac68 > usr/src/uts/common/fs/zfs/sys/zap_leaf.h > cd8b74a77a8c62098c701aac01c6bedd57f20a6b > usr/src/uts/common/fs/zfs/sys/zap_impl.h > 0d4fc69815fcadf7323206aad69383a6ef9e8662 > usr/src/uts/common/fs/zfs/sys/spa.h > cf307a2ec977642d2cb79963ed3abc0039e4f07b > usr/src/uts/common/fs/zfs/sys/sa_impl.h > 6b9af2ef4f89760047d1f8c67bf5e641b6f08de7 > usr/src/uts/common/fs/zfs/sys/dsl_dir.h > 772bfbe6db1267472bcb1d4ad20bc959f05b665d > usr/src/uts/common/fs/zfs/sys/dnode.h > 5668af1fef08f5785173af7d3ee0bb41d27d1e12 > usr/src/uts/common/fs/zfs/sys/dmu_objset.h > 804f0c182b6f193590a822513f532e9cc5d400c0 > usr/src/uts/common/fs/zfs/sys/dbuf.h > 8be8ed6bc85bf1a30673d3755c40f6b02a5c11b2 > usr/src/uts/common/fs/zfs/spa.c d0408a58023cc33ae362ab0027619b4f70a405f8 > usr/src/uts/common/fs/zfs/sa.c 27d8513541a45e158321fd2ef3c54de9a02ac5ee > usr/src/uts/common/fs/zfs/dsl_scan.c > 5c9e68fb2ee68210c2450b1b5380f672a715f677 > usr/src/uts/common/fs/zfs/dsl_pool.c > 39a797c27958b924d41309d7c671ba48fc5d848d > usr/src/uts/common/fs/zfs/dsl_destroy.c > a776e144f1d2b02224d03c7e3fcfd14fd59d4ab9 > usr/src/uts/common/fs/zfs/dsl_deadlist.c > 4ac562bfdba32382d2d1d807293640c9c1430657 > usr/src/uts/common/fs/zfs/dsl_dataset.c > 5baf5c3c0a11db1deb546fa3b365f6faa1bfd4ca > usr/src/uts/common/fs/zfs/dnode_sync.c > 63bfc94f9aee862354726fbbc2ede0bec3b6b156 > usr/src/uts/common/fs/zfs/dmu_send.c > dc9f81cd2f28ef316b2425396393e14410be4f60 > usr/src/lib/brand/solaris10/s10_brand/Makefile.com > 022b31b4e7d2671eabdb7326d6c4f1a7c226a5f5 > usr/src/cmd/zinject/Makefile.com 76d297937f13b863a7d1b75add9e7f18f8761c2d > usr/src/cmd/truss/Makefile.com b50028339965450aee9a5fb5f77a2fc972f2e755 > > Diff: https://reviews.csiden.org/r/131/diff/ > > > Testing > ------- > > ztest and zfs test suite on FreeBSD and illumos. > > > Thanks, > > Justin Gibbs > >
_______________________________________________ developer mailing list developer@open-zfs.org http://lists.open-zfs.org/mailman/listinfo/developer