I have two questions about the ZFS code.
In dbuf_hold_impl():
    /*
     * If this buffer is currently syncing out, and we are
     * still referencing it from db_data, we need to make a copy
     * of it in case we decide we want to dirty it again in this txg.
     */
    if (db->db_level == 0 && db->db_blkid != DB_BONUS_BLKID &&
        dn->dn_object != DMU_META_DNODE_OBJECT &&
        db->db_state == DB_CACHED && db->db_data_pending) {
        dbuf_dirty_record_t *dr = db->db_data_pending;
        if (dr->dt.dl.dr_data == db->db_buf) {
            arc_buf_contents_t type = DBUF_GET_BUFC_TYPE(db);
            dbuf_set_data(db,
                arc_buf_alloc(db->db_dnode->dn_objset->os_spa,
                db->db.db_size, db, type));
            bcopy(dr->dt.dl.dr_data->b_data, db->db.db_data,
                db->db.db_size);
        }
    }
This code copies the data into a new buffer if the current one is being
synced to disk. My question is: why not defer this copy to dbuf_dirty(),
which already checks db_data_pending and makes the copy?
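To make the scenario concrete, here is a toy model of how I read that hold-time copy. It is only a sketch, not the real ZFS code: toy_buf_t, toy_dbuf_t, toy_hold() and toy_hold_demo() are hypothetical names, and the real dbuf and ARC structures are far more involved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for arc_buf_t and dmu_buf_impl_t. */
typedef struct toy_buf {
    char b_data[8];
} toy_buf_t;

typedef struct toy_dbuf {
    toy_buf_t *db_buf;      /* what holders read and write */
    toy_buf_t *db_pending;  /* what the syncing txg is writing, or NULL */
} toy_dbuf_t;

/*
 * Hold-time check: if the syncing txg still shares db_buf, switch
 * holders over to a private copy of the data being synced.
 * Returns 0 on success, -1 if the copy could not be allocated.
 */
static int
toy_hold(toy_dbuf_t *db)
{
    if (db->db_pending != NULL && db->db_pending == db->db_buf) {
        toy_buf_t *copy = malloc(sizeof (*copy));
        if (copy == NULL)
            return (-1);
        memcpy(copy->b_data, db->db_pending->b_data,
            sizeof (copy->b_data));
        db->db_buf = copy;
    }
    return (0);
}

/* Returns 1 if a write after the hold left the syncing data untouched. */
static int
toy_hold_demo(void)
{
    toy_buf_t syncing = { "old" };
    toy_dbuf_t db = { &syncing, &syncing };
    int ok;

    if (toy_hold(&db) != 0)
        return (0);
    memcpy(db.db_buf->b_data, "new", 4);  /* holder modifies its view */
    ok = (db.db_buf != &syncing &&
        strcmp(syncing.b_data, "old") == 0 &&
        strcmp(db.db_buf->b_data, "new") == 0);
    free(db.db_buf);
    return (ok);
}
```

In this model the copy happens as soon as the hold is taken, whether or not the holder ever writes to the buffer, which is exactly the cost I am asking about.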
A similar case is in dbuf_sync_leaf():
    if (dn->dn_object != DMU_META_DNODE_OBJECT) {
        /*
         * If this buffer is currently "in use" (i.e., there are
         * active holds and db_data still references it), then make
         * a copy before we start the write so that any modifications
         * from the open txg will not leak into this write.
         *
         * NOTE: this copy does not need to be made for objects only
         * modified in the syncing context (e.g. DNONE_DNODE blocks).
         */
        if (refcount_count(&db->db_holds) > 1 && *datap == db->db_buf) {
            arc_buf_contents_t type = DBUF_GET_BUFC_TYPE(db);
            *datap = arc_buf_alloc(os->os_spa, blksz, db, type);
            bcopy(db->db.db_data, (*datap)->b_data, blksz);
        }
    } else {
Here, too, it copies the data if there are extra holds on the syncing
dbuf. What I don't understand is why this copy is needed, since
dbuf_dirty() can already handle this scenario.
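The sync-side copy can be modelled the same way. Again, this is just a sketch of how I understand it, with hypothetical names (toy_dbuf_t, toy_sync_prepare(), toy_sync_demo()) and a plain integer in place of the real hold refcount:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; not the real ZFS structures. */
typedef struct toy_buf {
    char b_data[8];
} toy_buf_t;

typedef struct toy_dbuf {
    toy_buf_t *db_buf;  /* what open-txg holders read and write */
    int db_holds;       /* stand-in for refcount_count(&db->db_holds) */
} toy_dbuf_t;

/*
 * Before the write starts: if someone besides the syncing thread
 * holds the dbuf and the data to be written (*datap) is still
 * db_buf, give the write its own private copy.
 * Returns 0 on success, -1 if the copy could not be allocated.
 */
static int
toy_sync_prepare(toy_dbuf_t *db, toy_buf_t **datap)
{
    if (db->db_holds > 1 && *datap == db->db_buf) {
        toy_buf_t *copy = malloc(sizeof (*copy));
        if (copy == NULL)
            return (-1);
        memcpy(copy->b_data, db->db_buf->b_data,
            sizeof (copy->b_data));
        *datap = copy;
    }
    return (0);
}

/* Returns 1 if an open-txg change did not leak into the write data. */
static int
toy_sync_demo(void)
{
    toy_buf_t shared = { "txg1" };
    toy_dbuf_t db = { &shared, 2 };
    toy_buf_t *data = &shared;
    int ok;

    if (toy_sync_prepare(&db, &data) != 0)
        return (0);
    memcpy(db.db_buf->b_data, "txg2", 5);  /* open-txg modification */
    ok = (data != &shared && strcmp(data->b_data, "txg1") == 0);
    free(data);
    return (ok);
}
```

Note the difference from the hold-time case: there the holders are moved to the new copy, while here the write gets the copy and the holders keep the original buffer.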
A further question, perhaps related to the one above: why does ZFS need
to release the ARC buffer in dbuf_dirty()? The comment says this is needed
so that the modification does not affect other users of the cached data
block. But I don't know of any other calling path that can reach the ARC
buffer except through the DMU interface. Perhaps this is needed to keep
snapshots from reading the new data?
    } else if (db->db.db_object != DMU_META_DNODE_OBJECT) {
        /*
         * Release the data buffer from the cache so that we
         * can modify it without impacting possible other users
         * of this cached data block. Note that indirect
         * blocks and private objects are not released until the
         * syncing state (since they are only modified then).
         */
        arc_release(db->db_buf, db);
        dbuf_fix_old_data(db, tx->tx_txg);
        data_old = db->db_buf;
    }
Thanks,
Jay
--
This message posted from opensolaris.org