On Fri, Jun 19, 2015 at 03:15:01PM +0530, Chandan Rajendra wrote: > > private->io_lock is not acquired here but not in below. > > > > IIUC, this can be protected by EXTENT_LOCKED. > > > > private->io_lock plays the same role as BH_Uptodate_Lock (see > end_buffer_async_read()) i.e. without the io_lock we may end up in the > following situation, > > NOTE: Assume 64k page size and 4k block size. Also assume that the first 12 > blocks of the page are contiguous while the next 4 blocks are contiguous. When > reading the page we end up submitting two "logical address space" bios. So > end_bio_extent_readpage function is invoked twice (once for each bio). > > |-------------------------+-------------------------+-------------| > | Task A | Task B | Task C | > |-------------------------+-------------------------+-------------| > | end_bio_extent_readpage | | | > | process block 0 | | | > | - clear BLK_STATE_IO | | | > | - page_read_complete | | | > | process block 1 | | | > | ... | | | > | ... | | | > | ... | end_bio_extent_readpage | | > | ... | process block 0 | | > | ... | - clear BLK_STATE_IO | | > | ... | - page_read_complete | | > | ... | process block 1 | | > | ... | ... | | > | process block 11 | process block 3 | | > | - clear BLK_STATE_IO | - clear BLK_STATE_IO | | > | - page_read_complete | - page_read_complete | | > | - returns true | - returns true | | > | - unlock_page() | | | > | | | lock_page() | > | | - unlock_page() | | > |-------------------------+-------------------------+-------------| > > So we end up incorrectly unlocking the page twice and "Task C" ends up working > on an unlocked page. So private->io_lock makes sure that only one of the tasks > gets "true" as the return value when page_read_complete() is invoked. As an > optimization the patch gets the io_lock only when nr_sectors counter reaches > the value 0 (i.e. when the last block of the bio_vec is being processed). > Please let me know if my analysis was incorrect. > > Also, I noticed that page_read_complete() and page_write_complete() can be > replaced by just one function i.e. page_io_complete().
The explanations and the table would be good in the changelog and as comments. I think we'll need to consider the smaller blocks more often so some examples and locking rules would be useful, eg. documented in this file. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html