Re: [Cluster-devel] [PATCH v2.1 11/30] iomap: add the new iomap_iter model
On Wed, Aug 11, 2021 at 12:17:20PM -0700, Darrick J. Wong wrote:
> From: Christoph Hellwig
>
> The iomap_iter struct provides a convenient way to package up and
> maintain all the arguments to the various mapping and operation
> functions.  It is operated on using the iomap_iter() function that
> is called in a loop until the whole range has been processed.  Compared
> to the existing iomap_apply() function this avoids an indirect call
> for each iteration.
>
> For now iomap_iter() calls back into the existing ->iomap_begin and
> ->iomap_end methods, but in the future this could be further optimized
> to avoid indirect calls entirely.
>
> Based on an earlier patch from Matthew Wilcox.
>
> Signed-off-by: Christoph Hellwig
> [djwong: add to apply.c to preserve git history of iomap loop control]
> Reviewed-by: Darrick J. Wong
> Signed-off-by: Darrick J. Wong

Looks like a straight translation of Christoph's original.  Seems fine to
me as a simple step towards preserving the git history.

Reviewed-by: Dave Chinner

--
Dave Chinner
da...@fromorbit.com
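For context, the calling pattern the commit message describes looks
roughly like this minimal sketch, assuming the iomap_iter() API added by
this series; process_one_mapping() is a hypothetical stand-in for the
caller's per-mapping work, not a function from the patches:

/*
 * Sketch of the open-coded iteration model.  process_one_mapping() is a
 * hypothetical actor; real callers do their buffered/direct I/O work here.
 */
static loff_t process_one_mapping(struct iomap_iter *iter)
{
	/* Operate on iter->iomap, which covers iomap_length(iter) bytes. */
	return iomap_length(iter);	/* bytes processed, or a negative errno */
}

static int example_iomap_op(struct inode *inode, loff_t pos, u64 len,
			    const struct iomap_ops *ops)
{
	struct iomap_iter iter = {
		.inode	= inode,
		.pos	= pos,
		.len	= len,
	};
	int ret;

	/*
	 * iomap_iter() calls ->iomap_begin/->iomap_end around each mapping
	 * and returns > 0 while part of the range remains unprocessed, so
	 * there is no indirect actor call per iteration.
	 */
	while ((ret = iomap_iter(&iter, ops)) > 0)
		iter.processed = process_one_mapping(&iter);
	return ret;
}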
Re: [Cluster-devel] [PATCH 31/30] iomap: move iomap iteration code to iter.c
On Wed, Aug 11, 2021 at 12:19:26PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong
>
> Now that we've moved iomap to the iterator model, rename this file to be
> in sync with the functions contained inside of it.
>
> Signed-off-by: Darrick J. Wong
> ---
>  fs/iomap/Makefile |    2 +-
>  fs/iomap/iter.c   |    0
>  2 files changed, 1 insertion(+), 1 deletion(-)
>  rename fs/iomap/{apply.c => iter.c} (100%)
>
> diff --git a/fs/iomap/Makefile b/fs/iomap/Makefile
> index e46f936dde81..bb64215ae256 100644
> --- a/fs/iomap/Makefile
> +++ b/fs/iomap/Makefile
> @@ -26,9 +26,9 @@ ccflags-y += -I $(srctree)/$(src)	# needed for trace events
>  obj-$(CONFIG_FS_IOMAP)	+= iomap.o
>
>  iomap-y		+= trace.o \
> -		   apply.o \
>  		   buffered-io.o \
>  		   direct-io.o \
>  		   fiemap.o \
> +		   iter.o \
>  		   seek.o
>  iomap-$(CONFIG_SWAP)	+= swapfile.o
> diff --git a/fs/iomap/apply.c b/fs/iomap/iter.c
> similarity index 100%
> rename from fs/iomap/apply.c
> rename to fs/iomap/iter.c

LGTM,

Reviewed-by: Dave Chinner

--
Dave Chinner
da...@fromorbit.com
Re: [Cluster-devel] [PATCH v2.1 24/30] iomap: remove iomap_apply
On Wed, Aug 11, 2021 at 12:18:26PM -0700, Darrick J. Wong wrote:
> From: Christoph Hellwig
>
> iomap_apply is unused now, so remove it.
>
> Signed-off-by: Christoph Hellwig
> [djwong: rebase this patch to preserve git history of iomap loop control]
> Reviewed-by: Darrick J. Wong
> Signed-off-by: Darrick J. Wong
> ---
>  fs/iomap/apply.c      |   91 -------------------------------------------
>  fs/iomap/trace.h      |   40 -------------------
>  include/linux/iomap.h |   10 -----
>  3 files changed, 141 deletions(-)

Looks good.

Reviewed-by: Dave Chinner

--
Dave Chinner
da...@fromorbit.com
Re: [Cluster-devel] [PATCH v2.1 19/30] iomap: switch iomap_bmap to use iomap_iter
On Wed, Aug 11, 2021 at 12:18:00PM -0700, Darrick J. Wong wrote:
> From: Christoph Hellwig
>
> Rewrite the ->bmap implementation based on iomap_iter.
>
> Signed-off-by: Christoph Hellwig
> [djwong: restructure the loop to make its behavior a little clearer]
> Reviewed-by: Darrick J. Wong
> Signed-off-by: Darrick J. Wong
> ---
>  fs/iomap/fiemap.c |   31 +++++++++++++------------------
>  1 file changed, 13 insertions(+), 18 deletions(-)

Looks good.

Reviewed-by: Dave Chinner

--
Dave Chinner
da...@fromorbit.com
Re: [Cluster-devel] [PATCH v5 00/12] gfs2: Fix mmap + page fault deadlocks
On Tue, Aug 3, 2021 at 9:45 PM Linus Torvalds wrote:
> On Tue, Aug 3, 2021 at 12:18 PM Andreas Gruenbacher wrote:
> > With this patch queue, fstest generic/208 (aio-dio-invalidate-failure.c)
> > endlessly spins in gfs2_file_direct_write.  It looks as if there's a bug
> > in get_user_pages_fast when called with FOLL_FAST_ONLY:
> >
> >  (1) The test case performs an aio write into a 32 MB buffer.
> >
> >  (2) The buffer is initially not in memory, so when iomap_dio_rw() ->
> >      ... -> bio_iov_iter_get_pages() is called with the iter->noio flag
> >      set, we get to get_user_pages_fast() with FOLL_FAST_ONLY set.
> >      get_user_pages_fast() returns 0, which causes
> >      bio_iov_iter_get_pages() to return -EFAULT.
> >
> >  (3) Then gfs2_file_direct_write faults in the entire buffer with
> >      fault_in_iov_iter_readable(), which succeeds.
> >
> >  (4) With the buffer in memory, we retry the iomap_dio_rw() ->
> >      ... -> bio_iov_iter_get_pages() -> ... -> get_user_pages_fast().
> >      This should succeed now, but get_user_pages_fast() still returns 0.
>
> Hmm. Have you tried to figure out why that "still returns 0" happens?

The call stack is:

gup_pte_range
gup_pmd_range
gup_pud_range
gup_p4d_range
gup_pgd_range
lockless_pages_from_mm
internal_get_user_pages_fast
get_user_pages_fast
iov_iter_get_pages
__bio_iov_iter_get_pages
bio_iov_iter_get_pages
iomap_dio_bio_actor
iomap_dio_actor
iomap_apply
iomap_dio_rw
gfs2_file_direct_write

In gup_pte_range, pte_special(pte) is true and so we return 0.

> One option - for debugging only - would be to introduce a new flag to
> get_user_pages_fast() that says "print out reason if failed" and make
> the retry (but not the original one) have that flag set.
>
> There are a couple of things of note when it comes to
> "get_user_pages_fast()":
>
>  (a) some architectures don't even enable it
>
>  (b) it can be very picky about the page table contents, and wants the
>      accessed bit to already be set (or the dirty bit, in the case of a
>      write).
>
> but (a) shouldn't be an issue on any common platform and (b) shouldn't
> be an issue with fault_in_iov_iter_readable() that actually does a
> __get_user() so it will access through the page tables.
>
> (It might be more of an issue with fault_in_iov_iter_writable() due to
> walking the page tables by hand - if we don't do the proper
> access/dirty setting, I could see get_user_pages_fast() failing.)
>
> Anyway, for reason (a) I do think that eventually we should probably
> introduce FOLL_NOFAULT, and allow the full "slow" page table walk -
> just not calling down to handle_mm_fault() if it fails.
>
> But (a) should be a non-issue in your test environment, and so it
> would be interesting to hear what it is that fails.  Because scanning
> through the patches, they all _look_ fine to me (apart from the one
> comment about return values, which is more about being consistent with
> copy_to/from_user() and making the code simpler - not about
> correctness).
>
>            Linus

Thanks,
Andreas
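For reference, the retry pattern being debugged looks roughly like the
sketch below.  This is a minimal sketch, not the actual
gfs2_file_direct_write() code; it assumes the fault_in_iov_iter_readable()
helper from this patch queue returns 0 once the whole range has been
faulted in, and it elides the flag that makes the dio path fail with
-EFAULT instead of taking page faults:

/*
 * Sketch of the fault-in-and-retry pattern from the report above; not
 * the actual gfs2 code.  Step numbers refer to the report.
 */
static ssize_t dio_write_retry_sketch(struct kiocb *iocb,
				      struct iov_iter *from,
				      const struct iomap_ops *ops)
{
	ssize_t ret;

retry:
	ret = iomap_dio_rw(iocb, from, ops, NULL, 0);	/* step (2) */
	if (ret == -EFAULT) {
		/*
		 * Step (3): fault the buffer in, then retry.  If
		 * get_user_pages_fast() keeps returning 0 even after a
		 * successful fault-in - the pte_special() case above -
		 * this loops forever, which is the reported spin.
		 */
		if (fault_in_iov_iter_readable(from, iov_iter_count(from)) == 0)
			goto retry;			/* step (4) */
	}
	return ret;
}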
Re: [Cluster-devel] Why does the dlm_lock function fail when downconverting a dlm lock?
On Mon, Aug 16, 2021 at 09:41:18AM -0500, David Teigland wrote:
> On Fri, Aug 13, 2021 at 02:49:04PM +0800, Gang He wrote:
> > Hi David,
> >
> > On 2021/8/13 1:45, David Teigland wrote:
> > > On Thu, Aug 12, 2021 at 01:44:53PM +0800, Gang He wrote:
> > > > In fact, I can reproduce this problem stably.
> > > > I want to know whether this error is expected, since there is not
> > > > any extreme pressure test.
> > > > Second, how should we handle these error cases?  Call dlm_lock()
> > > > again?  The function may fail again, which would lead to a kernel
> > > > soft lockup after multiple retries.
> > >
> > > What's probably happening is that ocfs2 calls dlm_unlock(CANCEL) to
> > > cancel an in-progress dlm_lock() request.  Before the cancel completes
> > > (or the original request completes), ocfs2 calls dlm_lock() again on
> > > the same resource.  This dlm_lock() returns -EBUSY because the
> > > previous request has not completed, either normally or by
> > > cancellation.  This is expected.
> >
> > Are these dlm_lock and dlm_unlock calls invoked on the same node, or on
> > different nodes?
>
> different

Sorry, same node
Re: [Cluster-devel] Why does the dlm_lock function fail when downconverting a dlm lock?
On Fri, Aug 13, 2021 at 02:49:04PM +0800, Gang He wrote:
> Hi David,
>
> On 2021/8/13 1:45, David Teigland wrote:
> > On Thu, Aug 12, 2021 at 01:44:53PM +0800, Gang He wrote:
> > > In fact, I can reproduce this problem stably.
> > > I want to know whether this error is expected, since there is not any
> > > extreme pressure test.
> > > Second, how should we handle these error cases?  Call dlm_lock()
> > > again?  The function may fail again, which would lead to a kernel soft
> > > lockup after multiple retries.
> >
> > What's probably happening is that ocfs2 calls dlm_unlock(CANCEL) to
> > cancel an in-progress dlm_lock() request.  Before the cancel completes
> > (or the original request completes), ocfs2 calls dlm_lock() again on the
> > same resource.  This dlm_lock() returns -EBUSY because the previous
> > request has not completed, either normally or by cancellation.  This is
> > expected.
>
> Are these dlm_lock and dlm_unlock calls invoked on the same node, or on
> different nodes?

different

> > A couple of options to try: wait for the original request to complete
> > (normally or by cancellation) before calling dlm_lock() again, or retry
> > dlm_lock() on -EBUSY.
>
> If I retry dlm_lock() repeatedly, I just wonder whether this will lead to
> a kernel soft lockup or waste lots of CPU.

I'm not aware of other code doing this, so I can't tell you with
certainty.  It would depend largely on the implementation in the caller.

> If dlm_lock() returns -EAGAIN, how should we handle this case?  Retry it
> repeatedly?

Again, this is a question more about the implementation of the calling
code and what it wants to do.  -EAGAIN is specifically related to the
DLM_LKF_NOQUEUE flag.
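For illustration only, a bounded retry on -EBUSY could look like the
sketch below, assuming a kernel caller of the dlm_lock() API from
linux/dlm.h; the retry limit and msleep() backoff are invented for the
example and are not a recommendation from this thread:

#include <linux/delay.h>
#include <linux/dlm.h>

/* Illustrative only: the retry count and backoff are made-up values. */
#define DOWNCONVERT_RETRIES	100

static int downconvert_with_retry(dlm_lockspace_t *ls, struct dlm_lksb *lksb,
				  void (*ast)(void *astarg), void *astarg)
{
	int tries, ret;

	for (tries = 0; tries < DOWNCONVERT_RETRIES; tries++) {
		/* Convert the existing lock (lksb->sb_lkid) down to NL. */
		ret = dlm_lock(ls, DLM_LOCK_NL, lksb, DLM_LKF_CONVERT,
			       NULL, 0, 0, ast, astarg, NULL);
		if (ret != -EBUSY)
			return ret;
		/* Previous request still in flight; back off and retry. */
		msleep(10);
	}
	return -EBUSY;
}

Waiting for the outstanding request's AST (or the cancel's completion)
before re-issuing the convert, as David suggests first, avoids the
busy-wait entirely and is likely the cleaner design.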