> I just retried to reproduce it to generate a reliable
> test case. Unfortunately, I cannot reproduce the
> error message. So I really have no idea what might
> have caused it....

I also had this problem 2-3 times in the past,
but I cannot reproduce it.

====================================================================

Using DTrace against the kernel, I found that the EBUSY error
(errno 16) is returned by the kernel function zil_suspend():

[b]
...
  0                        <- dnode_cons                      0
  0                        -> dnode_setdblksz
  0                        <- dnode_setdblksz                14
  0                        -> dmu_zfetch_init
  0                          -> list_create
  0                          <- list_create          3734548404
  0                          -> rw_init
  0                          <- rw_init              3734548400
  0                        <- dmu_zfetch_init        3734548400
  0                        -> list_insert_head
  0                        <- list_insert_head        3734548052
  0                      <- dnode_create             3734548048
  0                    <- dnode_special_open         3734548048
  0                    -> dsl_dataset_set_user_ptr
  0                    <- dsl_dataset_set_user_ptr                 0
  0                  <- dmu_objset_open_impl                  0
  0                <- dmu_objset_open                         0
  0                -> dmu_objset_zil
  0                <- dmu_objset_zil                 3700903200
  0                -> zil_suspend
  0                 | zil_suspend:entry       zh_claim_txg: 83432
  0                <- zil_suspend                            16
  0                -> dmu_objset_close
  0                  -> dsl_dataset_close
  0                    -> dbuf_rele
  0                      -> dbuf_evict_user
  0                        -> dsl_dataset_evict
  0                          -> unique_remove
...

    /*
     * Suspend an intent log.  While in suspended mode, we still honor
     * synchronous semantics, but we rely on txg_wait_synced() to do it.
     * We suspend the log briefly when taking a snapshot so that the snapshot
     * contains all the data it's supposed to, and has an empty intent log.
     */
    int
    zil_suspend(zilog_t *zilog)
    {
            const zil_header_t *zh = zilog->zl_header;
            lwb_t *lwb;

            mutex_enter(&zilog->zl_lock);
            if (zh->zh_claim_txg != 0) {            /* unplayed log */
                    mutex_exit(&zilog->zl_lock);
                    return (EBUSY);
            }
            ...
[/b]

====================================================================

It seems that you can identify the zfs filesystems on which
zfs snapshot fails with error 16 (EBUSY) using

    zdb -iv {your_zpool_here} | grep claim_txg

If there are any ZIL headers listed with a claim_txg != 0, the
dataset that uses this ZIL should fail zfs snapshot with
error 16, EBUSY.
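
If you have many datasets, the grep output alone does not tell you
which dataset a given claim_txg belongs to. Below is a rough sketch of
a filter that pairs each nonzero claim_txg with the dataset name
printed before it. Note the assumptions: I am guessing that zdb prints
a line starting with "Dataset <name>" before each "ZIL header:
claim_txg <n>" line; the exact output format may differ between
builds, so check it against your own zdb output first. The function
name and the sample text are made up for illustration.

```python
import re

# Assumed line shapes (not taken from the post; verify against your zdb):
#   Dataset tank/home [ZPL], ID 21, ...
#       ZIL header: claim_txg 83432, seq 0
DATASET_RE = re.compile(r"^\s*Dataset\s+(\S+)")
CLAIM_RE = re.compile(r"claim_txg\s+(\d+)")


def datasets_with_unreplayed_zil(zdb_text):
    """Return (dataset, claim_txg) pairs whose ZIL header has claim_txg != 0."""
    hits = []
    current = None  # most recent dataset name seen, None if none yet
    for line in zdb_text.splitlines():
        m = DATASET_RE.match(line)
        if m:
            current = m.group(1)
            continue
        m = CLAIM_RE.search(line)
        if m and int(m.group(1)) != 0:
            hits.append((current, int(m.group(1))))
    return hits


# Hypothetical sample input, invented for this example.
SAMPLE = """\
Dataset tank/home [ZPL], ID 21, cr_txg 5, 1.2G, 300 objects

    ZIL header: claim_txg 83432, seq 0

Dataset tank/scratch [ZPL], ID 30, cr_txg 9, 20K, 5 objects

    ZIL header: claim_txg 0, seq 0
"""

if __name__ == "__main__":
    for name, txg in datasets_with_unreplayed_zil(SAMPLE):
        print(f"{name}: claim_txg={txg}")
```

To use it on a real pool, save the output of the zdb command above to
a file and pass the file contents to datasets_with_unreplayed_zil().
Datasets it reports are the ones I would expect zil_suspend() to
reject with EBUSY.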
 
 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
