On 01/31/2017 08:05 AM, Greg Kroah-Hartman wrote:
> On Tue, Jan 31, 2017 at 06:29:36AM +0100, Marek Vasut wrote:
>> +CC Greg, LKML as I don't quite know where this should go.
>
> You do know about linux-fsdevel, right?

No, wasn't aware of it, sorry.

>> On 01/18/2017 12:16 AM, Marek Vasut wrote:
>>> I believe there is a possible race condition when configfs attributes
>>> trigger filp_open() from the kernel. I initially observed the problem
>>> on Linux 4.4 when loading DT overlay, which in turn loads a driver
>>> which loads firmware. After some further investigation, I came up with
>>> the following minimal-ish example patch, which can trigger the same
>>> behavior on Linux 4.10-rc4 (next 20170117).
>
> What in-kernel code causes this problem? I didn't think DT overlays
> were a feature in 4.4, are you running with code that isn't in the
> normal releases?

No, it happens in -next as well. I believe that if a write into a configfs
binary attribute triggers filp_open(), the kernel will crash.

>>> The core of the demo is in cfs_over_item_dtbo_write(), which just checks
>>> for valid current->fs. This function is triggered by writing data into
>>> configfs binary attribute, ie.:
>
> Why are you caring about current->fs?

Because that is what's NULL and is referenced (in set_root_rcu()) when
the configfs binary attribute is written and triggers filp_open().

>>> $ mkdir /sys/kernel/config/test/overlays/1/dtbo
>>> $ cat file_17201_bytes_long > /sys/kernel/config/test/overlays/1/dtbo
>>>
>>> I believe the 'cat' program exits quickly and thus calls exit_fs()
>>> before the cfs_over_item_dtbo_write() is called.
>
> How can exit be called before write?

I believe the exit happens after the write, but the function
cfs_over_item_dtbo_write() is entered only after exit_fs().

>>> Any attempts to
>>> access FS (like ie. loading firmware from FS) from that function will
>>> therefore fail (by crashing the kernel, NULL pointer dereference in
>>> set_root_rcu() in fs/namei.c).
>>>
>>> On the other hand, replacing 'cat' with 'dd' yields different result:
>>>
>>> $ dd if=file_17201_bytes_long of=/sys/kernel/config/test/overlays/1/dtbo
>>>
>>> The kernel does not crash. I believe this is because dd takes slightly
>>> longer to complete, so the cfs_over_item_dtbo_write() can complete
>>> before the dd process gets to calling exit_fs() and so the filesystem
>>> access is still available, thus current->fs is valid.
>
> cat and dd act differently, if you strace them, it should show the
> differences, perhaps you can narrow it down there?

I can try.

>>> Note that when using DT overlays (whose configfs interface is not yet
>>> mainline),
>
> Ah, we can't do anything about code that is not merged, perhaps it is
> just buggy? :)

The configfs stuff is in -next, how is it not merged? The code below is
an example that triggers the problem.

>>> there can easily be a device which requires a firmware in
>>> the DT overlay. Such device will invoke firmware load, which uses the
>>> filp_open() and will thus trigger the behavior above. Depending on
>>> whether one uses dd or cat, the kernel will either crash or not.
>>>
>>> Any ideas?
>
> I think you need to fix your device tree overlay code...

This is not related to DTO, I only use that to trigger the problem.

-- 
Best regards,
Marek Vasut
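
A minimal userspace sketch for narrowing down the cat versus dd difference. It is not the kernel-side demo patch referred to above. It assumes, without confirmation in this thread, that the relevant difference is whether the writer calls close() on the attribute before exiting or leaves the descriptor for the kernel to release during process teardown. The helper name write_blob() and its explicit_close flag are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the original thread: write len bytes from
 * buf to path, using the same open flags a shell redirection would use.
 * If explicit_close is non-zero, close() the descriptor before returning;
 * otherwise leave it open so that it is only released when the process
 * exits. Returns 0 on success, -1 on error.
 */
static int write_blob(const char *path, const void *buf, size_t len,
                      int explicit_close)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* Loop because write() may accept fewer bytes than requested. */
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            perror("write");
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    if (!explicit_close)
        return 0; /* descriptor intentionally left open until exit */

    /* Check close(): a deferred write error would be reported here. */
    if (close(fd) < 0) {
        perror("close");
        return -1;
    }
    return 0;
}
```

Calling write_blob() from a small main() that exits immediately afterwards, once with explicit_close set and once without, and pointing path at the dtbo attribute, should show whether the crash follows the missing close() rather than the run time of the tool. Running both variants under strace would confirm which system calls each one issues before exit.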