> kjournald submited buffers for IO and waiting for them to finish. It is means that the patch incorrectly moves internal kernel synchronization problem into user space as EIO instead of fixing a root cause or perform iterative synchronization. After patching users will be surprised a lot of EIO (remember 47% EIO in aio-stress runs after patching) will buy new disk... This patch is harmful.
Leonid >-----Original Message----- >From: Zach Brown [mailto:[EMAIL PROTECTED] >Sent: Wednesday, February 21, 2007 9:25 PM >To: Ken Chen >Cc: Ananiev, Leonid I; Chris Mason; linux-aio; Linux Kernel Mailing List; Benjamin LaHaise; Suparna >bhattacharya; Andrew Morton; Badari Pulavarty >Subject: Re: [PATCH 2/2] aio: propogate post-EIOCBQUEUED errors to completion event > > >On Feb 21, 2007, at 12:35 AM, Ken Chen wrote: > >> On 2/20/07, Ananiev, Leonid <[EMAIL PROTECTED]> wrote: >>> 1) mem=1G in kernel boot param if you have more >>> 2) unmount; mk2fs; mount >>> 3) dd if=/dev/zero of=<test_file> bs=1M count=1200 >>> 4) aiostress -s 1200m -O -o 2 -i 1 -r 16k <test_file> >>> 5) if i++<50 goto 2). >> >> Would you please instrument the call chain of >> invalidate_complete_page2() and tell us exactly where it returns zero >> value in your failure case? >> >> invalidate_complete_page2 >> try_to_release_page >> ext3_releasepage >> journal_try_to_free_buffers >> ??? > >For what it's worth, Badari has explained this race in the past in a >credible way. I'll take the liberty of pasting a mail from him: > >" >kjournald submited buffers for IO and waiting for them to finish. >Note that it has a ref. against the buffer. > >journal_commit_transaction() > ... > submited buffers for IO > /* Waiting for IO to complete */ > while (commit_transaction->t_locked_list) { > ... > get_bh(bh); > if (buffer_locked(bh)) { > spin_unlock(&journal->j_list_lock); > wait_on_buffer(bh); <<<<<< > spin_lock(&journal->j_list_lock); > } > > .. > put_bh(bh); > } > >Now, DIO process comes to frees the jh through >journal_try_to_free_buffers() >but fails to drop_buffers() since kjournald() has a reference against >it. >invalidate_inode_pages2_range() > .. > ext3_releasepage() > journal_try_to_free_buffers() > journal_put_journal_head() > __journal_try_to_free_buffer() > <--- freed jh > > try_to_free_buffers() > drop_buffers() > if (buffer_busy(bh)) > goto failed; > <<--- returns EIO due to >b_count > >" > >I don't mean to say that we shouldn't get traces to confirm the >theory, just sharing. And now we can point to this in the archives >next time :). > >- z - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

