Re: [f2fs-dev] [PATCH 03/13] fscrypt: rename fscrypt_do_page_crypto() to fscrypt_crypt_block()
On Thursday, May 2, 2019 4:15:05 AM IST Eric Biggers wrote: > From: Eric Biggers > > fscrypt_do_page_crypto() only does a single encryption or decryption > operation, with a single logical block number (single IV). So it > actually operates on a filesystem block, not a "page" per se. To > reflect this, rename it to fscrypt_crypt_block(). > Looks good to me, Reviewed-by: Chandan Rajendra -- chandan ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [BUG] watchdog: BUG: soft lockup.. after heavy disk access
2019-05-01 0:35 GMT+02:00, Hagbard Celine : > 2019-04-30 8:25 GMT+02:00, Chao Yu : >> On 2019/4/29 21:39, Hagbard Celine wrote: >>> 2019-04-29 9:36 GMT+02:00, Chao Yu : Hi Hagbard, At a glance, may I ask what's your device size? Since I notice that you used -i option during mkfs, if size is large enough, checkpoint may crash nat/sit_bitmap online. mkfs-f2fs -a 1 -f -i -l NVME_Gentoo-alt -o7 /dev/nvme0n1p7 >>> >>> Both partitions are 128GB. >> >> 128GB is safe. :) >> >>> On 2019/4/29 15:25, Hagbard Celine wrote: > First I must admit that I'm not 100% sure if this is > f2fs/NMI/something-else bug, but it only triggers after significant > disk access. > > I've hit this bug several times since kernel 5.0.7 (have not tried > earlier kernels) while compiling many packages in batch. But in those > occasions it would hang all cores and loose the latest changes to the > filesystem so I could not get any logs. > This time it triggered while I was in the process of moving two of my > installations to new partitions and locked only one core, so I was > able to get the log after rebooting. > The I did to trigger was make a new partition and run commands: > #mkfs-f2fs -a 1 -f -i -l NVME_Gentoo-alt -o7 /dev/nvme0n1p7 > #mount -o > "rw,realtime,lazytime,background_gc=on,disable_ext_identity,discard,heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,flush_merge,extent_cache,data_flush,mode=adaptive,active_logs=6,whint_mode=fs-based,alloc_mode=default,fsync_mode=strict" >> >> Could you revert below commit to check whether the issue is caused by it? > > This did not seem to help. I was not able to reproduce the bug by > copying partition contents to a newly made partition, but the again I > was not able to reproduce it that way on unpatched kernel either. > > Goth with and without the mentioned patch reverted the only way I am > able to somewhat reliably reproduce is by compiling a big batch of > packages in Exherbo (using paludis/sydbox), if the batch contains one > of a handful big packages (util-linux is the most common to trigger) > and that big package is not first in the batch. In those cases it will > trigger one of the following ways in most runs: > > -NMI watchdog: BUG: soft lockup.. on from 4- to 8-cores. > > -NMI watchdog: Watchdog detected hard LOCKUP on cpu.. on 2- to 4-cores > > -no error message at all, everything suddenly goes to a grinding halt. After some more testing, it seems that this is not quite accurate. I might be facing at least two different issues. After about a day of compiling on a pure ext4 install I have not got any of the hangs with "NMI watchdog" errors, but one of the package that sometimes hung with no errors at all (dev-lang/llvm, during test phase) still exhibits this behavior. I'll do another full system compile on ext4 just to make sure it's not pure luck I did not get any NMI watchdog errors there. Then I'll copy the system to f2fs partition and try dropping mount options (starting with data_flush), running a full system compile for each, to see if that makes any difference. (unless anyone got a better idea on how to proceed, or comes up with another patch to test) > Most of the time I also loose the ability to switch vt by alt+F-key > and it's not even responding to "magic sysreq key". > > The few times I've been able to do vt-switch to the vt with output > from syslog there is usually nothing in the time-frame of crash but > one or two "kernel: perf: interrupt took too long..." > ..or some random subsystem messaging because something timed out > (usually some message from i915). Just to be clear the i915 is not in > use during the tests, console is on amdgpudrmfb and X is not running, > I even tested with a kernel without i915 to completely rule it out. > >> commit 50fa53eccf9f911a5b435248a2b0bd484fd82e5e >> Author: Chao Yu >> Date: Thu Aug 2 23:03:19 2018 +0800 >> >> f2fs: fix to avoid broken of dnode block list >> >> The only risk of reverting this patch is we may potentially lose fsynced >> data >> after sudden power-cut, although it's rare, anyway please backup your >> data >> before try it. >> >> Thanks, >> > /dev/nvme0n1p7 /mnt/gentoo-alt-new > #cd /mnt/gentoo-alt > #cp -a . /mnt/gentoo-alt-new > #umount /nmt/gentoo-alt > The bug triggers just after last command and last command was run > within 20 seconds ofter the second-last command returned. > "/mnt/gentoo-alt" was already mounted with same options as > "/mnt/gentoo-alt-new", and contained a Gentoo-rootfs of 64GB data > (according df -h). > > This was on kernel 5.0.10 with "[PATCH] f2fs: fix potential recursive > call when enabling data_flush" by Chao Yu > > Syslog for bug follows: > > Apr 29 07:02:52 40o2 kernel: watchdog: BUG: soft lockup - CPU#4 stuck > for 21s! [irq/61-nvme0q5:383] > Apr 29 07:02:52 40o2 kernel: Modul
Re: [f2fs-dev] [PATCH V2 10/13] fscrypt_encrypt_page: Loop across all blocks mapped by a page range
Hi Chandan, On Thu, May 02, 2019 at 11:22:05AM +0530, Chandan Rajendra wrote: > On Thursday, May 2, 2019 3:59:01 AM IST Eric Biggers wrote: > > Hi Chandan, > > > > On Wed, May 01, 2019 at 08:19:35PM +0530, Chandan Rajendra wrote: > > > On Wednesday, May 1, 2019 4:38:41 AM IST Eric Biggers wrote: > > > > On Tue, Apr 30, 2019 at 10:11:35AM -0700, Eric Biggers wrote: > > > > > On Sun, Apr 28, 2019 at 10:01:18AM +0530, Chandan Rajendra wrote: > > > > > > For subpage-sized blocks, this commit now encrypts all blocks > > > > > > mapped by > > > > > > a page range. > > > > > > > > > > > > Signed-off-by: Chandan Rajendra > > > > > > --- > > > > > > fs/crypto/crypto.c | 37 + > > > > > > 1 file changed, 25 insertions(+), 12 deletions(-) > > > > > > > > > > > > diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c > > > > > > index 4f0d832cae71..2d65b431563f 100644 > > > > > > --- a/fs/crypto/crypto.c > > > > > > +++ b/fs/crypto/crypto.c > > > > > > @@ -242,18 +242,26 @@ struct page *fscrypt_encrypt_page(const > > > > > > struct inode *inode, > > > > > > > > > > Need to update the function comment to clearly explain what this > > > > > function > > > > > actually does now. > > > > > > > > > > > { > > > > > > struct fscrypt_ctx *ctx; > > > > > > struct page *ciphertext_page = page; > > > > > > + int i, page_nr_blks; > > > > > > int err; > > > > > > > > > > > > BUG_ON(len % FS_CRYPTO_BLOCK_SIZE != 0); > > > > > > > > > > > > > > > > Make a 'blocksize' variable so you don't have to keep calling > > > > > i_blocksize(). > > > > > > > > > > Also, you need to check whether 'len' and 'offs' are > > > > > filesystem-block-aligned, > > > > > since the code now assumes it. > > > > > > > > > > const unsigned int blocksize = i_blocksize(inode); > > > > > > > > > > if (!IS_ALIGNED(len | offs, blocksize)) > > > > > return -EINVAL; > > > > > > > > > > However, did you check whether that's always true for ubifs? It > > > > > looks like it > > > > > may expect to encrypt a prefix of a block, that is only padded to the > > > > > next > > > > > 16-byte boundary. > > > > > > > > > > > + page_nr_blks = len >> inode->i_blkbits; > > > > > > + > > > > > > if (inode->i_sb->s_cop->flags & FS_CFLG_OWN_PAGES) { > > > > > > /* with inplace-encryption we just encrypt the page */ > > > > > > - err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, > > > > > > lblk_num, page, > > > > > > -ciphertext_page, len, offs, > > > > > > -gfp_flags); > > > > > > - if (err) > > > > > > - return ERR_PTR(err); > > > > > > - > > > > > > + for (i = 0; i < page_nr_blks; i++) { > > > > > > + err = fscrypt_do_page_crypto(inode, FS_ENCRYPT, > > > > > > + lblk_num, page, > > > > > > + ciphertext_page, > > > > > > + i_blocksize(inode), > > > > > > offs, > > > > > > + gfp_flags); > > > > > > + if (err) > > > > > > + return ERR_PTR(err); > > > > > > > > Apparently ubifs does encrypt data shorter than the filesystem block > > > > size, so > > > > this part is wrong. > > > > > > > > I suggest we split this into two functions, > > > > fscrypt_encrypt_block_inplace() and > > > > fscrypt_encrypt_blocks(), so that it's conceptually simpler what each > > > > function > > > > does. Currently this works completely differently depending on whether > > > > the > > > > filesystem set FS_CFLG_OWN_PAGES in its fscrypt_operations, which is > > > > weird. > > > > > > > > I also noticed that using fscrypt_ctx for writes seems to be > > > > unnecessary. > > > > AFAICS, page_private(bounce_page) could point directly to the pagecache > > > > page. > > > > That would simplify things a lot, especially since then fscrypt_ctx > > > > could be > > > > removed entirely after you convert reads to use read_callbacks_ctx. > > > > > > > > IMO, these would be worthwhile cleanups for fscrypt by themselves, > > > > without > > > > waiting for the read_callbacks stuff to be finalized. Finalizing the > > > > read_callbacks stuff will probably require reaching a consensus about > > > > how they > > > > should work with future filesystem features like fsverity and > > > > compression. > > > > > > > > So to move things forward, I'm considering sending out a series with > > > > the above > > > > cleanups for fscrypt, plus the equivalent of your patches: > > > > > > > > "fscrypt_encrypt_page: Loop across all blocks mapped by a page > > > > range" > > > > "fscrypt_zeroout_range: Encrypt all zeroed out blocks of a page" > > > > "Add decryption support for sub-pagesized blocks" (fs/
Re: [f2fs-dev] [PATCH 02/13] fscrypt: remove the "write" part of struct fscrypt_ctx
On Thursday, May 2, 2019 4:15:04 AM IST Eric Biggers wrote: > From: Eric Biggers > > Now that fscrypt_ctx is not used for writes, remove the 'w' fields. > Looks good to me, Reviewed-by: Chandan Rajendra -- chandan ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
Re: [f2fs-dev] [PATCH 01/13] fscrypt: simplify bounce page handling
On Thursday, May 2, 2019 4:15:03 AM IST Eric Biggers wrote: > From: Eric Biggers > > Currently, bounce page handling for writes to encrypted files is > unnecessarily complicated. A fscrypt_ctx is allocated along with each > bounce page, page_private(bounce_page) points to this fscrypt_ctx, and > fscrypt_ctx::w::control_page points to the original pagecache page. > > However, because writes don't use the fscrypt_ctx for anything else, > there's no reason why page_private(bounce_page) can't just point to the > original pagecache page directly. > > Therefore, this patch makes this change. In the process, it also cleans > up the API exposed to filesystems that allows testing whether a page is > a bounce page, getting the pagecache page from a bounce page, and > freeing a bounce page. Looks good to me, Reviewed-by: Chandan Rajendra -- chandan ___ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel