G'Day Pawel,

Pawel Jakub Dawidek wrote:
> Hi.
>
> I discovered the following deadlock which can occur on adding cache
> device. When we call this command:
>
>	# zpool add <pool> cache <disk>
>
> It hangs here:
>
>	mutex_enter(&l2arc_dev_mtx)
>	l2arc_add_vdev()
>	spa_load_l2cache()
>	spa_vdev_add()
>	zfs_ioc_vdev_add()
>	zfsdev_ioctl()
>	ioctl()
>	syscall()
>
> It cannot acquire the l2arc_dev_mtx mutex, because it is already held by
> the l2arc_feed_thread thread. The l2arc_feed_thread cannot release it
> because it hangs here:
>
>	cv_wait(&scl->scl_cv, &scl->scl_lock)
>	spa_config_enter()
>	zio_create()
>	zio_write_phys()
>	l2arc_feed_thread()
>
> It will wait here forever, because the previous process hangs, and
> spa_config_exit() is called at the end of spa_vdev_add() (via
> spa_vdev_exit()) and spa_config_exit() calls cv_broadcast() for this
> condvar.
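
For anyone following along, the two stack traces above amount to a lock-order inversion, which can be modelled as a cycle in a small wait-for graph. The following is only a rough sketch under stated assumptions, not code from ZFS: the enum names, `struct lock_state` and the helper functions are all hypothetical, and the SPA config lock is simplified to a single-owner lock (the real one is taken through spa_config_enter() and released through spa_config_exit()).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two threads in the traces above. */
enum { ADD_THREAD, FEED_THREAD, NTHREADS };

/* The two resources they contend on (config lock simplified). */
enum { L2ARC_DEV_MTX, SPA_CONFIG_LOCK, NLOCKS };

#define NONE (-1)

struct lock_state {
	int owner[NLOCKS];        /* thread holding each lock, or NONE */
	int waiting_on[NTHREADS]; /* lock each thread blocks on, or NONE */
};

/* Start with no lock held and no thread blocked. */
void lock_state_init(struct lock_state *s)
{
	for (int i = 0; i < NLOCKS; i++)
		s->owner[i] = NONE;
	for (int i = 0; i < NTHREADS; i++)
		s->waiting_on[i] = NONE;
}

/*
 * Follow the chain "thread -> lock it waits on -> owner of that lock".
 * If the chain leads back to the starting thread, that thread can
 * never make progress: it is part of a deadlock cycle.
 */
bool is_deadlocked(const struct lock_state *s, int start)
{
	assert(start >= 0 && start < NTHREADS);

	int t = start;
	for (int hops = 0; hops < NTHREADS; hops++) {
		int lock = s->waiting_on[t];
		if (lock == NONE)
			return false;   /* t is runnable */
		t = s->owner[lock];
		if (t == NONE)
			return false;   /* lock is free */
		if (t == start)
			return true;    /* cycle found */
	}
	return false;
}

/* Build the state described in the report. */
void model_reported_hang(struct lock_state *s)
{
	lock_state_init(s);
	/* zpool add: config lock held until spa_vdev_exit() ... */
	s->owner[SPA_CONFIG_LOCK] = ADD_THREAD;
	/* ... and blocked in mutex_enter(&l2arc_dev_mtx). */
	s->waiting_on[ADD_THREAD] = L2ARC_DEV_MTX;
	/* feed thread: holds l2arc_dev_mtx ... */
	s->owner[L2ARC_DEV_MTX] = FEED_THREAD;
	/* ... and blocked in spa_config_enter(). */
	s->waiting_on[FEED_THREAD] = SPA_CONFIG_LOCK;
}
```

Calling model_reported_hang() and then is_deadlocked() for either thread reports a cycle, which matches the hang described. Breaking either edge (for example, the feed thread not holding l2arc_dev_mtx while it issues the write) removes the cycle in this model; whether that is the right shape for the real fix is a separate question.
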

Sorry about this deadlock, and thanks - your analysis is spot on, I've
added it to the bug description:

	http://bugs.opensolaris.org/view_bug.do?bug_id=6701480

(it might take a few moments before the new description field appears on
the opensolaris.org website).

I've been working on the fix.

cheers,

Brendan

-- 
Brendan
[CA, USA]