Re: [PATCH] staging: android: vsoc: Remove VSOC_WAIT_FOR_INCOMING_INTERRUPT
On Mon, Feb 11, 2019 at 9:22 AM Todd Kjos wrote: > > +Alistair Strachan > > On Mon, Feb 11, 2019 at 9:11 AM Greg KH wrote: > > > > On Mon, Feb 11, 2019 at 10:15:18PM +0530, Souptick Joarder wrote: > > > On Mon, Feb 11, 2019 at 9:27 PM Greg KH > > > wrote: > > > > > > > > On Mon, Feb 11, 2019 at 09:21:19PM +0530, Souptick Joarder wrote: > > > > > On Mon, Feb 11, 2019 at 9:10 PM Greg KH > > > > > wrote: > > > > > > > > > > > > On Mon, Feb 11, 2019 at 08:56:02PM +0530, Souptick Joarder wrote: > > > > > > > As mentioned in TODO list, Removed > > > > > > > VSOC_WAIT_FOR_INCOMING_INTERRUPT > > > > > > > ioctl. This functionality has been superseded by the futex and is > > > > > > > there for legacy reasons. > > > > > > > > > > > > > > Signed-off-by: Souptick Joarder > > > > > > > --- > > > > > > > drivers/staging/android/uapi/vsoc_shm.h | 7 --- > > > > > > > drivers/staging/android/vsoc.c | 5 - > > > > > > > 2 files changed, 12 deletions(-) > > > > > > > > > > > > So userspace is all fixed up now and this ioctl can be dropped? Any > > > > > > pointers to the userspace commit that did this? The ioctl is still being used and cannot be removed. > > > > > > > > > > I am not sure about user space part. > > > > > > > > Then we can not just delete the ioctl :) > > > > > > Agree, but where to verify the user space commit ? > > > Any pointer to the source code path ? The userspace code using the Linux 'vsoc' staging driver can be cloned from here: https://android.googlesource.com/device/google/cuttlefish_common > > > > Please work with the android developers to solve this. It should be in > > AOSP "somewhere" :( I'm working on documenting this better on source.android.com. Stay tuned. > > > > good luck, > > > > greg k-h
Re: [PATCH v2] media: uvcvideo: Fix 'type' check leading to overflow
On Wed, Dec 19, 2018 at 12:16 AM Laurent Pinchart wrote: > > Hi Alistair, > > Thank you for the patch. > > On Wednesday, 19 December 2018 03:32:48 EET Alistair Strachan wrote: > > From: Laurent Pinchart > > Are you sure you don't want to keep authorship ? I've merely reviewed v1 and > proposed an alternative implementation :-) Let me know what you would prefer > and I'll apply this to my tree. Whatever attribution you think is best works for me! Thank you for picking up this patch. > > When initially testing the Camera Terminal Descriptor wTerminalType > > field (buffer[4]), no mask is used. Later in the function, the MSB is > > overloaded to store the descriptor subtype, and so a mask of 0x7fff > > is used to check the type. > > > > If a descriptor is specially crafted to set this overloaded bit in the > > original wTerminalType field, the initial type check will fail (falling > > through, without adjusting the buffer size), but the later type checks > > will pass, assuming the buffer has been made suitably large, causing an > > overflow. > > > > Avoid this problem by checking for the MSB in the wTerminalType field. > > If the bit is set, assume the descriptor is bad, and abort parsing it. > > > > Originally reported here: > > https://groups.google.com/forum/#!topic/syzkaller/Ot1fOE6v1d8 > > A similar (non-compiling) patch was provided at that time. > > > > Reported-by: syzbot > > Signed-off-by: Alistair Strachan > > Cc: Laurent Pinchart > > Cc: Mauro Carvalho Chehab > > Cc: linux-me...@vger.kernel.org > > Cc: kernel-t...@android.com > > --- > > v2: Use an alternative fix suggested by Laurent > > drivers/media/usb/uvc/uvc_driver.c | 14 +++--- > > 1 file changed, 11 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/media/usb/uvc/uvc_driver.c > > b/drivers/media/usb/uvc/uvc_driver.c index bc369a0934a3..7fde3ce642c4 > > 100644 > > --- a/drivers/media/usb/uvc/uvc_driver.c > > +++ b/drivers/media/usb/uvc/uvc_driver.c > > @@ -1065,11 +1065,19 @@ static int uvc_parse_standard_control(struct > > uvc_device *dev, return -EINVAL; > > } > > > > - /* Make sure the terminal type MSB is not null, otherwise it > > - * could be confused with a unit. > > + /* > > + * Reject invalid terminal types that would cause issues: > > + * > > + * - The high byte must be non-zero, otherwise it would be > > + * confused with a unit. > > + * > > + * - Bit 15 must be 0, as we use it internally as a terminal > > + * direction flag. > > + * > > + * Other unknown types are accepted. > >*/ > > type = get_unaligned_le16([4]); > > - if ((type & 0xff00) == 0) { > > + if ((type & 0x7f00) == 0 || (type & 0x8000) != 0) { > > uvc_trace(UVC_TRACE_DESCR, "device %d videocontrol " > > "interface %d INPUT_TERMINAL %d has invalid " > > "type 0x%04x, skipping\n", udev->devnum, > > -- > Regards, > > Laurent Pinchart > > >
[PATCH v2] media: uvcvideo: Fix 'type' check leading to overflow
From: Laurent Pinchart When initially testing the Camera Terminal Descriptor wTerminalType field (buffer[4]), no mask is used. Later in the function, the MSB is overloaded to store the descriptor subtype, and so a mask of 0x7fff is used to check the type. If a descriptor is specially crafted to set this overloaded bit in the original wTerminalType field, the initial type check will fail (falling through, without adjusting the buffer size), but the later type checks will pass, assuming the buffer has been made suitably large, causing an overflow. Avoid this problem by checking for the MSB in the wTerminalType field. If the bit is set, assume the descriptor is bad, and abort parsing it. Originally reported here: https://groups.google.com/forum/#!topic/syzkaller/Ot1fOE6v1d8 A similar (non-compiling) patch was provided at that time. Reported-by: syzbot Signed-off-by: Alistair Strachan Cc: Laurent Pinchart Cc: Mauro Carvalho Chehab Cc: linux-me...@vger.kernel.org Cc: kernel-t...@android.com --- v2: Use an alternative fix suggested by Laurent drivers/media/usb/uvc/uvc_driver.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c index bc369a0934a3..7fde3ce642c4 100644 --- a/drivers/media/usb/uvc/uvc_driver.c +++ b/drivers/media/usb/uvc/uvc_driver.c @@ -1065,11 +1065,19 @@ static int uvc_parse_standard_control(struct uvc_device *dev, return -EINVAL; } - /* Make sure the terminal type MSB is not null, otherwise it -* could be confused with a unit. + /* +* Reject invalid terminal types that would cause issues: +* +* - The high byte must be non-zero, otherwise it would be +* confused with a unit. +* +* - Bit 15 must be 0, as we use it internally as a terminal +* direction flag. +* +* Other unknown types are accepted. */ type = get_unaligned_le16([4]); - if ((type & 0xff00) == 0) { + if ((type & 0x7f00) == 0 || (type & 0x8000) != 0) { uvc_trace(UVC_TRACE_DESCR, "device %d videocontrol " "interface %d INPUT_TERMINAL %d has invalid " "type 0x%04x, skipping\n", udev->devnum, -- 2.20.1.415.g653613c723-goog
Re: [PATCH] media: uvcvideo: Fix 'type' check leading to overflow
Hi Laurent, On Tue, Dec 18, 2018 at 1:42 AM Laurent Pinchart wrote: > > Hi Alistair, > > Thank you for the patch. > > On Monday, 17 December 2018 23:02:22 EET Alistair Strachan wrote: > > When initially testing the Camera Terminal Descriptor wTerminalType > > field (buffer[4]), no mask is used. Later in the function, the MSB is > > overloaded to store the descriptor subtype, and so a mask of 0x7fff > > is used to check the type. > > > > If a descriptor is specially crafted to set this overloaded bit in the > > original wTerminalType field, the initial type check will fail (falling > > through, without adjusting the buffer size), but the later type checks > > will pass, assuming the buffer has been made suitably large, causing an > > overflow. > > > > This problem could be resolved in a few different ways, but this fix > > applies the same initial type check as used by UVC_ENTITY_TYPE (once we > > have a 'term' structure.) Such crafted wTerminalType fields will then be > > treated as *generic* Input Terminals, not as CAMERA or > > MEDIA_TRANSPORT_INPUT, avoiding an overflow. > > > > Originally reported here: > > https://groups.google.com/forum/#!topic/syzkaller/Ot1fOE6v1d8 > > A similar (non-compiling) patch was provided at that time. > > > > Reported-by: syzbot > > Signed-off-by: Alistair Strachan > > Cc: Laurent Pinchart > > Cc: Mauro Carvalho Chehab > > Cc: linux-me...@vger.kernel.org > > Cc: kernel-t...@android.com > > --- > > drivers/media/usb/uvc/uvc_driver.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/media/usb/uvc/uvc_driver.c > > b/drivers/media/usb/uvc/uvc_driver.c index bc369a0934a3..279a967b8264 > > 100644 > > --- a/drivers/media/usb/uvc/uvc_driver.c > > +++ b/drivers/media/usb/uvc/uvc_driver.c > > @@ -1082,11 +1082,11 @@ static int uvc_parse_standard_control(struct > > uvc_device *dev, p = 0; > > len = 8; > > > > - if (type == UVC_ITT_CAMERA) { > > + if ((type & 0x7fff) == UVC_ITT_CAMERA) { > > n = buflen >= 15 ? buffer[14] : 0; > > len = 15; > > > > - } else if (type == UVC_ITT_MEDIA_TRANSPORT_INPUT) { > > + } else if ((type & 0x7fff) == UVC_ITT_MEDIA_TRANSPORT_INPUT) { > > n = buflen >= 9 ? buffer[8] : 0; > > p = buflen >= 10 + n ? buffer[9+n] : 0; > > len = 10; > > How about rejecting invalid types instead ? Something along the lines of This looks simpler/better to me. I just re-sent your version of the change (build + runtime tested.) Thanks for reviewing. > diff --git a/drivers/media/usb/uvc/uvc_driver.c > b/drivers/media/usb/uvc/uvc_driver.c > index b62cbd800111..33a22c016456 100644 > --- a/drivers/media/usb/uvc/uvc_driver.c > +++ b/drivers/media/usb/uvc/uvc_driver.c > @@ -1106,11 +1106,19 @@ static int uvc_parse_standard_control(struct > uvc_device *dev, > return -EINVAL; > } > > - /* Make sure the terminal type MSB is not null, otherwise it > -* could be confused with a unit. > + /* > +* Reject invalid terminal types that would cause issues: > +* > +* - The high byte must be non-zero, otherwise it would be > +* confused with a unit. > +* > +* - Bit 15 must be 0, as we use it internally as a terminal > +* direction flag. > +* > +* Other unknown types are accepted. > */ > type = get_unaligned_le16([4]); > - if ((type & 0xff00) == 0) { > + if ((type & 0x7f00) == 0 || (type & 0x8000) != 0) { > uvc_trace(UVC_TRACE_DESCR, "device %d videocontrol " > "interface %d INPUT_TERMINAL %d has invalid " > "type 0x%04x, skipping\n", udev->devnum, > > -- > Regards, > > Laurent Pinchart > > >
[PATCH] media: uvcvideo: Fix 'type' check leading to overflow
When initially testing the Camera Terminal Descriptor wTerminalType field (buffer[4]), no mask is used. Later in the function, the MSB is overloaded to store the descriptor subtype, and so a mask of 0x7fff is used to check the type. If a descriptor is specially crafted to set this overloaded bit in the original wTerminalType field, the initial type check will fail (falling through, without adjusting the buffer size), but the later type checks will pass, assuming the buffer has been made suitably large, causing an overflow. This problem could be resolved in a few different ways, but this fix applies the same initial type check as used by UVC_ENTITY_TYPE (once we have a 'term' structure.) Such crafted wTerminalType fields will then be treated as *generic* Input Terminals, not as CAMERA or MEDIA_TRANSPORT_INPUT, avoiding an overflow. Originally reported here: https://groups.google.com/forum/#!topic/syzkaller/Ot1fOE6v1d8 A similar (non-compiling) patch was provided at that time. Reported-by: syzbot Signed-off-by: Alistair Strachan Cc: Laurent Pinchart Cc: Mauro Carvalho Chehab Cc: linux-me...@vger.kernel.org Cc: kernel-t...@android.com --- drivers/media/usb/uvc/uvc_driver.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c index bc369a0934a3..279a967b8264 100644 --- a/drivers/media/usb/uvc/uvc_driver.c +++ b/drivers/media/usb/uvc/uvc_driver.c @@ -1082,11 +1082,11 @@ static int uvc_parse_standard_control(struct uvc_device *dev, p = 0; len = 8; - if (type == UVC_ITT_CAMERA) { + if ((type & 0x7fff) == UVC_ITT_CAMERA) { n = buflen >= 15 ? buffer[14] : 0; len = 15; - } else if (type == UVC_ITT_MEDIA_TRANSPORT_INPUT) { + } else if ((type & 0x7fff) == UVC_ITT_MEDIA_TRANSPORT_INPUT) { n = buflen >= 9 ? buffer[8] : 0; p = buflen >= 10 + n ? buffer[9+n] : 0; len = 10; -- 2.20.0.405.gbc1bbc6f85-goog
[tip:x86/urgent] x86/vdso: Pass --eh-frame-hdr to the linker
Commit-ID: cd01544a268ad8ee5b1dfe42c4393f1095f86879 Gitweb: https://git.kernel.org/tip/cd01544a268ad8ee5b1dfe42c4393f1095f86879 Author: Alistair Strachan AuthorDate: Fri, 14 Dec 2018 14:36:37 -0800 Committer: Borislav Petkov CommitDate: Sat, 15 Dec 2018 11:37:51 +0100 x86/vdso: Pass --eh-frame-hdr to the linker Commit 379d98ddf413 ("x86: vdso: Use $LD instead of $CC to link") accidentally broke unwinding from userspace, because ld would strip the .eh_frame sections when linking. Originally, the compiler would implicitly add --eh-frame-hdr when invoking the linker, but when this Makefile was converted from invoking ld via the compiler, to invoking it directly (like vmlinux does), the flag was missed. (The EH_FRAME section is important for the VDSO shared libraries, but not for vmlinux.) Fix the problem by explicitly specifying --eh-frame-hdr, which restores parity with the old method. See relevant bug reports for additional info: https://bugzilla.kernel.org/show_bug.cgi?id=201741 https://bugzilla.redhat.com/show_bug.cgi?id=1659295 Fixes: 379d98ddf413 ("x86: vdso: Use $LD instead of $CC to link") Reported-by: Florian Weimer Reported-by: Carlos O'Donell Reported-by: "H. J. Lu" Signed-off-by: Alistair Strachan Signed-off-by: Borislav Petkov Tested-by: Laura Abbott Cc: Andy Lutomirski Cc: Carlos O'Donell Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Joel Fernandes Cc: kernel-t...@android.com Cc: Laura Abbott Cc: stable Cc: Thomas Gleixner Cc: X86 ML Link: https://lkml.kernel.org/r/20181214223637.35954-1-astrac...@google.com --- arch/x86/entry/vdso/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 0624bf2266fd..5bfe2243a08f 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -171,7 +171,8 @@ quiet_cmd_vdso = VDSO$@ sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@' VDSO_LDFLAGS = -shared $(call ld-option, --hash-style=both) \ - $(call ld-option, --build-id) -Bsymbolic + $(call ld-option, --build-id) $(call ld-option, --eh-frame-hdr) \ + -Bsymbolic GCOV_PROFILE := n #
[PATCH v2] x86: vdso: Pass --eh-frame-hdr to ld
Commit 379d98ddf413 ("x86: vdso: Use $LD instead of $CC to link") accidentally broke unwinding from userspace, because ld would strip the .eh_frame sections when linking. Originally, the compiler would implicitly add --eh-frame-hdr when invoking the linker, but when this Makefile was converted from invoking ld via the compiler, to invoking it directly (like vmlinux does), the flag was missed. (The EH_FRAME section is important for the VDSO shared libraries, but not for vmlinux.) Fix the problem by explicitly specifying --eh-frame-hdr, which restores parity with the old method. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201741 Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1659295 Reported-by: Florian Weimer Reported-by: Carlos O'Donell Reported-by: "H. J. Lu" Tested-by: Laura Abbott Fixes: 379d98ddf413 ("x86: vdso: Use $LD instead of $CC to link") Cc: sta...@vger.kernel.org Cc: Andy Lutomirski Cc: Thomas Gleixner Cc: "H. Peter Anvin" Cc: X86 ML Cc: Joel Fernandes Cc: kernel-t...@android.com Signed-off-by: Alistair Strachan --- v2: Updated commit message, no changes to the code arch/x86/entry/vdso/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 141d415a8c80..c3d7ccd25381 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -171,7 +171,8 @@ quiet_cmd_vdso = VDSO$@ sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@' VDSO_LDFLAGS = -shared $(call ld-option, --hash-style=both) \ - $(call ld-option, --build-id) -Bsymbolic + $(call ld-option, --build-id) $(call ld-option, --eh-frame-hdr) \ + -Bsymbolic GCOV_PROFILE := n # -- 2.20.0.405.gbc1bbc6f85-goog
[PATCH] x86: vdso: Pass --eh-frame-hdr to ld
Commit 379d98ddf413 ("x86: vdso: Use $LD instead of $CC to link") accidentally broke unwinding from userspace, because ld would strip the .eh_frame sections when linking. Originally, the compiler would implicitly add --eh-frame-hdr when invoking the linker, but when this Makefile was converted from invoking ld via the compiler, to invoking it directly (like vmlinux does), the flag was missed. (The EH_FRAME section is important for the VDSO shared libraries, but not for vmlinux.) Fix the problem by explicitly specifying --eh-frame-hdr, which restores parity with the old method. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201741 Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1659295 Reported-by: Laura Abbott Fixes: 379d98ddf413 ("x86: vdso: Use $LD instead of $CC to link") Cc: sta...@vger.kernel.org Cc: Andy Lutomirski Cc: Thomas Gleixner Cc: "H. Peter Anvin" Cc: X86 ML Cc: Florian Weimer , Cc: Carlos O'Donell , Cc: "H. J. Lu" Cc: Joel Fernandes Cc: kernel-t...@android.com Signed-off-by: Alistair Strachan --- arch/x86/entry/vdso/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 141d415a8c80..c3d7ccd25381 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -171,7 +171,8 @@ quiet_cmd_vdso = VDSO$@ sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@' VDSO_LDFLAGS = -shared $(call ld-option, --hash-style=both) \ - $(call ld-option, --build-id) -Bsymbolic + $(call ld-option, --build-id) $(call ld-option, --eh-frame-hdr) \ + -Bsymbolic GCOV_PROFILE := n # -- 2.20.0.405.gbc1bbc6f85-goog
Re: Unwinding regression with 379d98ddf413 ("x86: vdso: Use $LD instead of $CC to link")
Hi Laura, On Fri, Dec 14, 2018 at 1:48 PM Laura Abbott wrote: > Hi, > > There are two reports of a regression with unwinding with > 379d98ddf413 ("x86: vdso: Use $LD instead of $CC to link") > > https://bugzilla.kernel.org/show_bug.cgi?id=201741 > https://bugzilla.redhat.com/show_bug.cgi?id=1659295 > > It looks like the unwinding information has been > stripped out. Any ideas? Sorry about this. I missed a critical flag (--eh-frame-hdr) that gcc specifies implicitly when converting the Makefile to invoke ld directly. This went unnoticed because the linking of vmlinux (which also uses ld directly) does not need this information, so it does not specify the flag. Sending a verified fix now. > Thanks, > Laura
"x86/mm: Introduce the 'no5lvl' kernel parameter" broke SETUP_DTB?
Hi Kirill, I noticed that booting 4.19 in qemu while injecting a FDT using the "-dtb /path/to/blob" feature might have been broken by your change 372fddf70904 ("x86/mm: Introduce the 'no5lvl' kernel parameter"). This manifests either as FDT corruption, which causes the setup code to fail to unpack it (i.e. corruption of the device-tree structure), or simply bad node data. If I make the below change, the problem goes away: diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c index 8c5107545251..bfe5aca71254 100644 --- a/arch/x86/boot/compressed/pgtable_64.c +++ b/arch/x86/boot/compressed/pgtable_64.c @@ -40,7 +40,7 @@ struct paging_config paging_prepare(void *rmode) unsigned long bios_start, ebda_start; /* Initialize boot_params. Required for cmdline_find_option_bool(). */ - boot_params = rmode; + //boot_params = rmode; /* * Check if LA57 is desired and supported. This can be reproduced with the system-root.dtb file in https://android.googlesource.com/device/google/cuttlefish/+archive/master.tar.gz using "qemu -dtb system-root.dtb -kernel /path/to/bzImage -drive file=root.ext4" on x86_64_defconfig with CONFIG_OF_UNITTEST enabled. If the FDT is unpacked successfully, the /proc/device-tree/firmware/android/compatible file will exist, and contain the string "android,firmware" instead of junk. I'm still looking into the root cause for this, but I just wanted to let you know. Alistair.
"x86/mm: Introduce the 'no5lvl' kernel parameter" broke SETUP_DTB?
Hi Kirill, I noticed that booting 4.19 in qemu while injecting a FDT using the "-dtb /path/to/blob" feature might have been broken by your change 372fddf70904 ("x86/mm: Introduce the 'no5lvl' kernel parameter"). This manifests either as FDT corruption, which causes the setup code to fail to unpack it (i.e. corruption of the device-tree structure), or simply bad node data. If I make the below change, the problem goes away: diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c index 8c5107545251..bfe5aca71254 100644 --- a/arch/x86/boot/compressed/pgtable_64.c +++ b/arch/x86/boot/compressed/pgtable_64.c @@ -40,7 +40,7 @@ struct paging_config paging_prepare(void *rmode) unsigned long bios_start, ebda_start; /* Initialize boot_params. Required for cmdline_find_option_bool(). */ - boot_params = rmode; + //boot_params = rmode; /* * Check if LA57 is desired and supported. This can be reproduced with the system-root.dtb file in https://android.googlesource.com/device/google/cuttlefish/+archive/master.tar.gz using "qemu -dtb system-root.dtb -kernel /path/to/bzImage -drive file=root.ext4" on x86_64_defconfig with CONFIG_OF_UNITTEST enabled. If the FDT is unpacked successfully, the /proc/device-tree/firmware/android/compatible file will exist, and contain the string "android,firmware" instead of junk. I'm still looking into the root cause for this, but I just wanted to let you know. Alistair.
Re: [PATCH] Trivial numbering change in comments.
On Tue, Aug 28, 2018 at 1:08 PM Ray Clinton wrote: > I'm trying to get my feet wet in kernel dev and while working on a patch > I noticed in a lengthy comment block that the number "2" was skipped > in a description of the steps of a protocol. This patch is simply correcting > this. This is for 4.18.0. > > It is a very trivial patch, but this is my first actual try at one and > thought I > might start out small (and I don't think you can get smaller than a single > character change). Any and all advice (about this patch, email, or > anything else) is very welcome. > > Thanks so much! > Ray > > Signed-off-by: Ray Clinton Acked-by: Alistair Strachan > --- > drivers/staging/android/uapi/vsoc_shm.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/staging/android/uapi/vsoc_shm.h > b/drivers/staging/android/uapi/vsoc_shm.h > index 6291fb2..0301172 100644 > --- a/drivers/staging/android/uapi/vsoc_shm.h > +++ b/drivers/staging/android/uapi/vsoc_shm.h > @@ -92,7 +92,7 @@ struct fd_scoped_permission_arg { > *The transmitter can choose any appropriate hashing algorithm, including > *hash = futex_offset & ((1 << num_nodes_lg2) - 1) > * > - * 3. The transmitter should atomically compare and swap futex_offset with 0 > + * 2. The transmitter should atomically compare and swap futex_offset with 0 > *at hash. There are 3 possible outcomes > * a. The swap fails because the futex_offset is already in the table. > * The transmitter should stop. > -- > 2.7.4
Re: [PATCH] Trivial numbering change in comments.
On Tue, Aug 28, 2018 at 1:08 PM Ray Clinton wrote: > I'm trying to get my feet wet in kernel dev and while working on a patch > I noticed in a lengthy comment block that the number "2" was skipped > in a description of the steps of a protocol. This patch is simply correcting > this. This is for 4.18.0. > > It is a very trivial patch, but this is my first actual try at one and > thought I > might start out small (and I don't think you can get smaller than a single > character change). Any and all advice (about this patch, email, or > anything else) is very welcome. > > Thanks so much! > Ray > > Signed-off-by: Ray Clinton Acked-by: Alistair Strachan > --- > drivers/staging/android/uapi/vsoc_shm.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/staging/android/uapi/vsoc_shm.h > b/drivers/staging/android/uapi/vsoc_shm.h > index 6291fb2..0301172 100644 > --- a/drivers/staging/android/uapi/vsoc_shm.h > +++ b/drivers/staging/android/uapi/vsoc_shm.h > @@ -92,7 +92,7 @@ struct fd_scoped_permission_arg { > *The transmitter can choose any appropriate hashing algorithm, including > *hash = futex_offset & ((1 << num_nodes_lg2) - 1) > * > - * 3. The transmitter should atomically compare and swap futex_offset with 0 > + * 2. The transmitter should atomically compare and swap futex_offset with 0 > *at hash. There are 3 possible outcomes > * a. The swap fails because the futex_offset is already in the table. > * The transmitter should stop. > -- > 2.7.4
[tip:x86/vdso] x86: vdso: Use $LD instead of $CC to link
Commit-ID: 379d98ddf41344273d9718556f761420f4dc80b3 Gitweb: https://git.kernel.org/tip/379d98ddf41344273d9718556f761420f4dc80b3 Author: Alistair Strachan AuthorDate: Fri, 3 Aug 2018 10:39:31 -0700 Committer: Thomas Gleixner CommitDate: Sun, 5 Aug 2018 22:33:50 +0200 x86: vdso: Use $LD instead of $CC to link The vdso{32,64}.so can fail to link with CC=clang when clang tries to find a suitable GCC toolchain to link these libraries with. /usr/bin/ld: arch/x86/entry/vdso/vclock_gettime.o: access beyond end of merged section (782) This happens because the host environment leaked into the cross compiler environment due to the way clang searches for suitable GCC toolchains. Clang is a retargetable compiler, and each invocation of it must provide --target= --gcc-toolchain= to allow it to find the correct binutils for cross compilation. These flags had been added to KBUILD_CFLAGS, but the vdso code uses CC and not KBUILD_CFLAGS (for various reasons) which breaks clang's ability to find the correct linker when cross compiling. Most of the time this goes unnoticed because the host linker is new enough to work anyway, or is incompatible and skipped, but this cannot be reliably assumed. This change alters the vdso makefile to just use LD directly, which bypasses clang and thus the searching problem. The makefile will just use ${CROSS_COMPILE}ld instead, which is always what we want. This matches the method used to link vmlinux. This drops references to DISABLE_LTO; this option doesn't seem to be set anywhere, and not knowing what its possible values are, it's not clear how to convert it from CC to LD flag. Signed-off-by: Alistair Strachan Signed-off-by: Thomas Gleixner Acked-by: Andy Lutomirski Cc: "H. Peter Anvin" Cc: Greg Kroah-Hartman Cc: kernel-t...@android.com Cc: j...@joelfernandes.org Cc: Andi Kleen Link: https://lkml.kernel.org/r/20180803173931.117515-1-astrac...@google.com --- arch/x86/entry/vdso/Makefile | 22 +- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 261802b1cc50..42c6c1bea4f4 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -46,10 +46,8 @@ targets += $(vdso_img_sodbg) $(vdso_img-y:%=vdso%.so) CPPFLAGS_vdso.lds += -P -C -VDSO_LDFLAGS_vdso.lds = -m64 -Wl,-soname=linux-vdso.so.1 \ - -Wl,--no-undefined \ - -Wl,-z,max-page-size=4096 -Wl,-z,common-page-size=4096 \ - $(DISABLE_LTO) +VDSO_LDFLAGS_vdso.lds = -m elf_x86_64 -soname linux-vdso.so.1 --no-undefined \ + -z max-page-size=4096 -z common-page-size=4096 $(obj)/vdso64.so.dbg: $(obj)/vdso.lds $(vobjs) FORCE $(call if_changed,vdso) @@ -95,10 +93,8 @@ CFLAGS_REMOVE_vvar.o = -pg # CPPFLAGS_vdsox32.lds = $(CPPFLAGS_vdso.lds) -VDSO_LDFLAGS_vdsox32.lds = -Wl,-m,elf32_x86_64 \ - -Wl,-soname=linux-vdso.so.1 \ - -Wl,-z,max-page-size=4096 \ - -Wl,-z,common-page-size=4096 +VDSO_LDFLAGS_vdsox32.lds = -m elf32_x86_64 -soname linux-vdso.so.1 \ + -z max-page-size=4096 -z common-page-size=4096 # x32-rebranded versions vobjx32s-y := $(vobjs-y:.o=-x32.o) @@ -123,7 +119,7 @@ $(obj)/vdsox32.so.dbg: $(obj)/vdsox32.lds $(vobjx32s) FORCE $(call if_changed,vdso) CPPFLAGS_vdso32.lds = $(CPPFLAGS_vdso.lds) -VDSO_LDFLAGS_vdso32.lds = -m32 -Wl,-m,elf_i386 -Wl,-soname=linux-gate.so.1 +VDSO_LDFLAGS_vdso32.lds = -m elf_i386 -soname linux-gate.so.1 targets += vdso32/vdso32.lds targets += vdso32/note.o vdso32/system_call.o vdso32/sigreturn.o @@ -157,13 +153,13 @@ $(obj)/vdso32.so.dbg: FORCE \ # The DSO images are built using a special linker script. # quiet_cmd_vdso = VDSO$@ - cmd_vdso = $(CC) -nostdlib -o $@ \ + cmd_vdso = $(LD) -nostdlib -o $@ \ $(VDSO_LDFLAGS) $(VDSO_LDFLAGS_$(filter %.lds,$(^F))) \ - -Wl,-T,$(filter %.lds,$^) $(filter %.o,$^) && \ + -T $(filter %.lds,$^) $(filter %.o,$^) && \ sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@' -VDSO_LDFLAGS = -fPIC -shared $(call cc-ldoption, -Wl$(comma)--hash-style=both) \ - $(call cc-ldoption, -Wl$(comma)--build-id) -Wl,-Bsymbolic $(LTO_CFLAGS) +VDSO_LDFLAGS = -shared $(call ld-option, --hash-style=both) \ + $(call ld-option, --build-id) -Bsymbolic GCOV_PROFILE := n #
[tip:x86/vdso] x86: vdso: Use $LD instead of $CC to link
Commit-ID: 379d98ddf41344273d9718556f761420f4dc80b3 Gitweb: https://git.kernel.org/tip/379d98ddf41344273d9718556f761420f4dc80b3 Author: Alistair Strachan AuthorDate: Fri, 3 Aug 2018 10:39:31 -0700 Committer: Thomas Gleixner CommitDate: Sun, 5 Aug 2018 22:33:50 +0200 x86: vdso: Use $LD instead of $CC to link The vdso{32,64}.so can fail to link with CC=clang when clang tries to find a suitable GCC toolchain to link these libraries with. /usr/bin/ld: arch/x86/entry/vdso/vclock_gettime.o: access beyond end of merged section (782) This happens because the host environment leaked into the cross compiler environment due to the way clang searches for suitable GCC toolchains. Clang is a retargetable compiler, and each invocation of it must provide --target= --gcc-toolchain= to allow it to find the correct binutils for cross compilation. These flags had been added to KBUILD_CFLAGS, but the vdso code uses CC and not KBUILD_CFLAGS (for various reasons) which breaks clang's ability to find the correct linker when cross compiling. Most of the time this goes unnoticed because the host linker is new enough to work anyway, or is incompatible and skipped, but this cannot be reliably assumed. This change alters the vdso makefile to just use LD directly, which bypasses clang and thus the searching problem. The makefile will just use ${CROSS_COMPILE}ld instead, which is always what we want. This matches the method used to link vmlinux. This drops references to DISABLE_LTO; this option doesn't seem to be set anywhere, and not knowing what its possible values are, it's not clear how to convert it from CC to LD flag. Signed-off-by: Alistair Strachan Signed-off-by: Thomas Gleixner Acked-by: Andy Lutomirski Cc: "H. Peter Anvin" Cc: Greg Kroah-Hartman Cc: kernel-t...@android.com Cc: j...@joelfernandes.org Cc: Andi Kleen Link: https://lkml.kernel.org/r/20180803173931.117515-1-astrac...@google.com --- arch/x86/entry/vdso/Makefile | 22 +- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 261802b1cc50..42c6c1bea4f4 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -46,10 +46,8 @@ targets += $(vdso_img_sodbg) $(vdso_img-y:%=vdso%.so) CPPFLAGS_vdso.lds += -P -C -VDSO_LDFLAGS_vdso.lds = -m64 -Wl,-soname=linux-vdso.so.1 \ - -Wl,--no-undefined \ - -Wl,-z,max-page-size=4096 -Wl,-z,common-page-size=4096 \ - $(DISABLE_LTO) +VDSO_LDFLAGS_vdso.lds = -m elf_x86_64 -soname linux-vdso.so.1 --no-undefined \ + -z max-page-size=4096 -z common-page-size=4096 $(obj)/vdso64.so.dbg: $(obj)/vdso.lds $(vobjs) FORCE $(call if_changed,vdso) @@ -95,10 +93,8 @@ CFLAGS_REMOVE_vvar.o = -pg # CPPFLAGS_vdsox32.lds = $(CPPFLAGS_vdso.lds) -VDSO_LDFLAGS_vdsox32.lds = -Wl,-m,elf32_x86_64 \ - -Wl,-soname=linux-vdso.so.1 \ - -Wl,-z,max-page-size=4096 \ - -Wl,-z,common-page-size=4096 +VDSO_LDFLAGS_vdsox32.lds = -m elf32_x86_64 -soname linux-vdso.so.1 \ + -z max-page-size=4096 -z common-page-size=4096 # x32-rebranded versions vobjx32s-y := $(vobjs-y:.o=-x32.o) @@ -123,7 +119,7 @@ $(obj)/vdsox32.so.dbg: $(obj)/vdsox32.lds $(vobjx32s) FORCE $(call if_changed,vdso) CPPFLAGS_vdso32.lds = $(CPPFLAGS_vdso.lds) -VDSO_LDFLAGS_vdso32.lds = -m32 -Wl,-m,elf_i386 -Wl,-soname=linux-gate.so.1 +VDSO_LDFLAGS_vdso32.lds = -m elf_i386 -soname linux-gate.so.1 targets += vdso32/vdso32.lds targets += vdso32/note.o vdso32/system_call.o vdso32/sigreturn.o @@ -157,13 +153,13 @@ $(obj)/vdso32.so.dbg: FORCE \ # The DSO images are built using a special linker script. # quiet_cmd_vdso = VDSO$@ - cmd_vdso = $(CC) -nostdlib -o $@ \ + cmd_vdso = $(LD) -nostdlib -o $@ \ $(VDSO_LDFLAGS) $(VDSO_LDFLAGS_$(filter %.lds,$(^F))) \ - -Wl,-T,$(filter %.lds,$^) $(filter %.o,$^) && \ + -T $(filter %.lds,$^) $(filter %.o,$^) && \ sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@' -VDSO_LDFLAGS = -fPIC -shared $(call cc-ldoption, -Wl$(comma)--hash-style=both) \ - $(call cc-ldoption, -Wl$(comma)--build-id) -Wl,-Bsymbolic $(LTO_CFLAGS) +VDSO_LDFLAGS = -shared $(call ld-option, --hash-style=both) \ + $(call ld-option, --build-id) -Bsymbolic GCOV_PROFILE := n #
Re: native_save_fl() causes a warning
On Fri, Aug 3, 2018 at 9:38 AM Nick Desaulniers wrote: > > On Fri, Aug 3, 2018 at 6:10 AM Jean Delvare wrote: > > > > Hi Nick, > > > > It seems that this linux kernel commit of yours: > > > > commit d0a8d9378d16eb3c69bd8e6d23779fbdbee3a8c7 > > Author: Nick Desaulniers > > Date: Thu Jun 21 09:23:24 2018 -0700 > > > > x86/paravirt: Make native_save_fl() extern inline > > > > introduced a new warning (with W=1): > > > > ./arch/x86/include/asm/irqflags.h:16:29: warning: no previous prototype for > > ‘native_save_fl’ [-Wmissing-prototypes] > > extern inline unsigned long native_save_fl(void) > > ^ > > > > Please fix it. > > Hi Jean, thanks for the report. David Laight also reported this > warning; he tested a patch I sent him overnight. > > Let me guess, you're using a version of GCC < 4.9? It seems that GCC > < 4.9 will produce that warning for extern inline functions without > previous declarations. > > I'll add your Reported-By tag to the patch that I will send out in a > few minutes. > > > Secondly, I am quite curious why you changed only native_save_fl() from > > static inline to extern inline, when native_restore_fl(), > > native_irq_disable() and native_irq_enable() are equally referenced by > > address in arch/x86/kernel/paravirt.c and thus should suffer from the > > same problem. Can you explain? > > This is a good point. With native_save_fl, we were not able to boot > the kernel at all. Maybe this was called from the boot sequence > (maybe Juergen knows more)? It seems that the other functions aren't > preventing us from booting, but maybe exercising other paths in > paravirt would expose such an issue? Or maybe paravirt doesn't have > the same calling convention requirements for those functions? The core issue these patches worked around was the automatic/heuristic generation of stack guard code by clang, which ended up clobbering %ecx/%rcx in a way not expected by the contract of the paravirt code. The only function affected by this problem was native_save_fl(), because only it has a C stack (and thus, has a stack that requires guarding). The other functions could have been converted at the same time, and they will have to be converted if (down the line) somebody adds C stack variables to them. But, for now, the patch series seems to correctly work around this issue. > Is there a standard testing procedure for paravirt? I'd be happy to > try it to see if we can expose more things that should have the same > cleanup. This bug was so obvious you just enabled CONFIG_PARAVIRT, built with CC=clang, and booted the x86_64 bzImage on any qemu. The minute the paravirt alternatives were patched in, it exploded. > -- > Thanks, > ~Nick Desaulniers
Re: native_save_fl() causes a warning
On Fri, Aug 3, 2018 at 9:38 AM Nick Desaulniers wrote: > > On Fri, Aug 3, 2018 at 6:10 AM Jean Delvare wrote: > > > > Hi Nick, > > > > It seems that this linux kernel commit of yours: > > > > commit d0a8d9378d16eb3c69bd8e6d23779fbdbee3a8c7 > > Author: Nick Desaulniers > > Date: Thu Jun 21 09:23:24 2018 -0700 > > > > x86/paravirt: Make native_save_fl() extern inline > > > > introduced a new warning (with W=1): > > > > ./arch/x86/include/asm/irqflags.h:16:29: warning: no previous prototype for > > ‘native_save_fl’ [-Wmissing-prototypes] > > extern inline unsigned long native_save_fl(void) > > ^ > > > > Please fix it. > > Hi Jean, thanks for the report. David Laight also reported this > warning; he tested a patch I sent him overnight. > > Let me guess, you're using a version of GCC < 4.9? It seems that GCC > < 4.9 will produce that warning for extern inline functions without > previous declarations. > > I'll add your Reported-By tag to the patch that I will send out in a > few minutes. > > > Secondly, I am quite curious why you changed only native_save_fl() from > > static inline to extern inline, when native_restore_fl(), > > native_irq_disable() and native_irq_enable() are equally referenced by > > address in arch/x86/kernel/paravirt.c and thus should suffer from the > > same problem. Can you explain? > > This is a good point. With native_save_fl, we were not able to boot > the kernel at all. Maybe this was called from the boot sequence > (maybe Juergen knows more)? It seems that the other functions aren't > preventing us from booting, but maybe exercising other paths in > paravirt would expose such an issue? Or maybe paravirt doesn't have > the same calling convention requirements for those functions? The core issue these patches worked around was the automatic/heuristic generation of stack guard code by clang, which ended up clobbering %ecx/%rcx in a way not expected by the contract of the paravirt code. The only function affected by this problem was native_save_fl(), because only it has a C stack (and thus, has a stack that requires guarding). The other functions could have been converted at the same time, and they will have to be converted if (down the line) somebody adds C stack variables to them. But, for now, the patch series seems to correctly work around this issue. > Is there a standard testing procedure for paravirt? I'd be happy to > try it to see if we can expose more things that should have the same > cleanup. This bug was so obvious you just enabled CONFIG_PARAVIRT, built with CC=clang, and booted the x86_64 bzImage on any qemu. The minute the paravirt alternatives were patched in, it exploded. > -- > Thanks, > ~Nick Desaulniers
[PATCH v2] x86: vdso: Use $LD instead of $CC to link
The vdso{32,64}.so can fail to link with CC=clang when clang tries to find a suitable GCC toolchain to link these libraries with. /usr/bin/ld: arch/x86/entry/vdso/vclock_gettime.o: access beyond end of merged section (782) This happens because the host environment leaked into the cross compiler environment due to the way clang searches for suitable GCC toolchains. Clang is a retargetable compiler, and each invocation of it must provide --target= --gcc-toolchain= to allow it to find the correct binutils for cross compilation. These flags had been added to KBUILD_CFLAGS, but the vdso code uses CC and not KBUILD_CFLAGS (for various reasons) which breaks clang's ability to find the correct linker when cross compiling. Most of the time this goes unnoticed because the host linker is new enough to work anyway, or is incompatible and skipped, but this cannot be reliably assumed. This change alters the vdso makefile to just use LD directly, which bypasses clang and thus the searching problem. The makefile will just use ${CROSS_COMPILE}ld instead, which is always what we want. This matches the method used to link vmlinux. This change drops references to DISABLE_LTO; this option doesn't seem to be set anywhere, and not knowing what its possible values are, it's not clear how to convert it from CC to LD flag. Cc: Andy Lutomirski Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: Greg Kroah-Hartman Cc: x...@kernel.org Cc: kernel-t...@android.com Cc: j...@joelfernandes.org Signed-off-by: Alistair Strachan Acked-by: Andy Lutomirski --- v2: Updated changelog and rediffed Supersedes "x86: vdso: Fix leaky vdso link with CC=clang" arch/x86/entry/vdso/Makefile | 22 +- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 261802b1cc50..42c6c1bea4f4 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -46,10 +46,8 @@ targets += $(vdso_img_sodbg) $(vdso_img-y:%=vdso%.so) CPPFLAGS_vdso.lds += -P -C -VDSO_LDFLAGS_vdso.lds = -m64 -Wl,-soname=linux-vdso.so.1 \ - -Wl,--no-undefined \ - -Wl,-z,max-page-size=4096 -Wl,-z,common-page-size=4096 \ - $(DISABLE_LTO) +VDSO_LDFLAGS_vdso.lds = -m elf_x86_64 -soname linux-vdso.so.1 --no-undefined \ + -z max-page-size=4096 -z common-page-size=4096 $(obj)/vdso64.so.dbg: $(obj)/vdso.lds $(vobjs) FORCE $(call if_changed,vdso) @@ -95,10 +93,8 @@ CFLAGS_REMOVE_vvar.o = -pg # CPPFLAGS_vdsox32.lds = $(CPPFLAGS_vdso.lds) -VDSO_LDFLAGS_vdsox32.lds = -Wl,-m,elf32_x86_64 \ - -Wl,-soname=linux-vdso.so.1 \ - -Wl,-z,max-page-size=4096 \ - -Wl,-z,common-page-size=4096 +VDSO_LDFLAGS_vdsox32.lds = -m elf32_x86_64 -soname linux-vdso.so.1 \ + -z max-page-size=4096 -z common-page-size=4096 # x32-rebranded versions vobjx32s-y := $(vobjs-y:.o=-x32.o) @@ -123,7 +119,7 @@ $(obj)/vdsox32.so.dbg: $(obj)/vdsox32.lds $(vobjx32s) FORCE $(call if_changed,vdso) CPPFLAGS_vdso32.lds = $(CPPFLAGS_vdso.lds) -VDSO_LDFLAGS_vdso32.lds = -m32 -Wl,-m,elf_i386 -Wl,-soname=linux-gate.so.1 +VDSO_LDFLAGS_vdso32.lds = -m elf_i386 -soname linux-gate.so.1 targets += vdso32/vdso32.lds targets += vdso32/note.o vdso32/system_call.o vdso32/sigreturn.o @@ -157,13 +153,13 @@ $(obj)/vdso32.so.dbg: FORCE \ # The DSO images are built using a special linker script. # quiet_cmd_vdso = VDSO$@ - cmd_vdso = $(CC) -nostdlib -o $@ \ + cmd_vdso = $(LD) -nostdlib -o $@ \ $(VDSO_LDFLAGS) $(VDSO_LDFLAGS_$(filter %.lds,$(^F))) \ - -Wl,-T,$(filter %.lds,$^) $(filter %.o,$^) && \ + -T $(filter %.lds,$^) $(filter %.o,$^) && \ sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@' -VDSO_LDFLAGS = -fPIC -shared $(call cc-ldoption, -Wl$(comma)--hash-style=both) \ - $(call cc-ldoption, -Wl$(comma)--build-id) -Wl,-Bsymbolic $(LTO_CFLAGS) +VDSO_LDFLAGS = -shared $(call ld-option, --hash-style=both) \ + $(call ld-option, --build-id) -Bsymbolic GCOV_PROFILE := n #
[PATCH v2] x86: vdso: Use $LD instead of $CC to link
The vdso{32,64}.so can fail to link with CC=clang when clang tries to find a suitable GCC toolchain to link these libraries with. /usr/bin/ld: arch/x86/entry/vdso/vclock_gettime.o: access beyond end of merged section (782) This happens because the host environment leaked into the cross compiler environment due to the way clang searches for suitable GCC toolchains. Clang is a retargetable compiler, and each invocation of it must provide --target= --gcc-toolchain= to allow it to find the correct binutils for cross compilation. These flags had been added to KBUILD_CFLAGS, but the vdso code uses CC and not KBUILD_CFLAGS (for various reasons) which breaks clang's ability to find the correct linker when cross compiling. Most of the time this goes unnoticed because the host linker is new enough to work anyway, or is incompatible and skipped, but this cannot be reliably assumed. This change alters the vdso makefile to just use LD directly, which bypasses clang and thus the searching problem. The makefile will just use ${CROSS_COMPILE}ld instead, which is always what we want. This matches the method used to link vmlinux. This change drops references to DISABLE_LTO; this option doesn't seem to be set anywhere, and not knowing what its possible values are, it's not clear how to convert it from CC to LD flag. Cc: Andy Lutomirski Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: Greg Kroah-Hartman Cc: x...@kernel.org Cc: kernel-t...@android.com Cc: j...@joelfernandes.org Signed-off-by: Alistair Strachan Acked-by: Andy Lutomirski --- v2: Updated changelog and rediffed Supersedes "x86: vdso: Fix leaky vdso link with CC=clang" arch/x86/entry/vdso/Makefile | 22 +- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 261802b1cc50..42c6c1bea4f4 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -46,10 +46,8 @@ targets += $(vdso_img_sodbg) $(vdso_img-y:%=vdso%.so) CPPFLAGS_vdso.lds += -P -C -VDSO_LDFLAGS_vdso.lds = -m64 -Wl,-soname=linux-vdso.so.1 \ - -Wl,--no-undefined \ - -Wl,-z,max-page-size=4096 -Wl,-z,common-page-size=4096 \ - $(DISABLE_LTO) +VDSO_LDFLAGS_vdso.lds = -m elf_x86_64 -soname linux-vdso.so.1 --no-undefined \ + -z max-page-size=4096 -z common-page-size=4096 $(obj)/vdso64.so.dbg: $(obj)/vdso.lds $(vobjs) FORCE $(call if_changed,vdso) @@ -95,10 +93,8 @@ CFLAGS_REMOVE_vvar.o = -pg # CPPFLAGS_vdsox32.lds = $(CPPFLAGS_vdso.lds) -VDSO_LDFLAGS_vdsox32.lds = -Wl,-m,elf32_x86_64 \ - -Wl,-soname=linux-vdso.so.1 \ - -Wl,-z,max-page-size=4096 \ - -Wl,-z,common-page-size=4096 +VDSO_LDFLAGS_vdsox32.lds = -m elf32_x86_64 -soname linux-vdso.so.1 \ + -z max-page-size=4096 -z common-page-size=4096 # x32-rebranded versions vobjx32s-y := $(vobjs-y:.o=-x32.o) @@ -123,7 +119,7 @@ $(obj)/vdsox32.so.dbg: $(obj)/vdsox32.lds $(vobjx32s) FORCE $(call if_changed,vdso) CPPFLAGS_vdso32.lds = $(CPPFLAGS_vdso.lds) -VDSO_LDFLAGS_vdso32.lds = -m32 -Wl,-m,elf_i386 -Wl,-soname=linux-gate.so.1 +VDSO_LDFLAGS_vdso32.lds = -m elf_i386 -soname linux-gate.so.1 targets += vdso32/vdso32.lds targets += vdso32/note.o vdso32/system_call.o vdso32/sigreturn.o @@ -157,13 +153,13 @@ $(obj)/vdso32.so.dbg: FORCE \ # The DSO images are built using a special linker script. # quiet_cmd_vdso = VDSO$@ - cmd_vdso = $(CC) -nostdlib -o $@ \ + cmd_vdso = $(LD) -nostdlib -o $@ \ $(VDSO_LDFLAGS) $(VDSO_LDFLAGS_$(filter %.lds,$(^F))) \ - -Wl,-T,$(filter %.lds,$^) $(filter %.o,$^) && \ + -T $(filter %.lds,$^) $(filter %.o,$^) && \ sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@' -VDSO_LDFLAGS = -fPIC -shared $(call cc-ldoption, -Wl$(comma)--hash-style=both) \ - $(call cc-ldoption, -Wl$(comma)--build-id) -Wl,-Bsymbolic $(LTO_CFLAGS) +VDSO_LDFLAGS = -shared $(call ld-option, --hash-style=both) \ + $(call ld-option, --build-id) -Bsymbolic GCOV_PROFILE := n #
Re: [PATCH] x86: vdso: Use $LD instead of $CC to link
On Fri, Aug 3, 2018 at 3:26 AM Thomas Gleixner wrote: > > On Wed, 18 Jul 2018, Alistair Strachan wrote: > > export CPPFLAGS_vdso.lds += -P -C > > > > -VDSO_LDFLAGS_vdso.lds = -m64 -Wl,-soname=linux-vdso.so.1 \ > > - -Wl,--no-undefined \ > > - -Wl,-z,max-page-size=4096 > > -Wl,-z,common-page-size=4096 \ > > - $(DISABLE_LTO) > > +VDSO_LDFLAGS_vdso.lds = -m elf_x86_64 -soname linux-vdso.so.1 > > --no-undefined \ > > + -z max-page-size=4096 -z common-page-size=4096 > > Aside of the fact that it does not apply to upstream, It seems to apply fine to Linus's tree (0585df4), but I'll send a v2 which is rediffed. > why is this dropping > the $(DISABLE_LTO) part? > > The changelog is utterly silent about that. $ git grep DISABLE_LTO arch/sparc/vdso/Makefile:KBUILD_CFLAGS += $(DISABLE_LTO) arch/sparc/vdso/Makefile: $(DISABLE_LTO) arch/x86/entry/vdso/Makefile:KBUILD_CFLAGS += $(DISABLE_LTO) arch/x86/entry/vdso/Makefile: $(DISABLE_LTO) kernel/Makefile:CFLAGS_sys_ni.o = $(DISABLE_LTO) scripts/Makefile.build:cmd_cc_s_c = $(CC) $(c_flags) $(DISABLE_LTO) -fverbose-asm -S -o $@ $< Looks like a dead option to me, but maybe somebody else knows better. v2 will explain this removal. > Thanks, > > tglx
Re: [PATCH] x86: vdso: Use $LD instead of $CC to link
On Fri, Aug 3, 2018 at 3:26 AM Thomas Gleixner wrote: > > On Wed, 18 Jul 2018, Alistair Strachan wrote: > > export CPPFLAGS_vdso.lds += -P -C > > > > -VDSO_LDFLAGS_vdso.lds = -m64 -Wl,-soname=linux-vdso.so.1 \ > > - -Wl,--no-undefined \ > > - -Wl,-z,max-page-size=4096 > > -Wl,-z,common-page-size=4096 \ > > - $(DISABLE_LTO) > > +VDSO_LDFLAGS_vdso.lds = -m elf_x86_64 -soname linux-vdso.so.1 > > --no-undefined \ > > + -z max-page-size=4096 -z common-page-size=4096 > > Aside of the fact that it does not apply to upstream, It seems to apply fine to Linus's tree (0585df4), but I'll send a v2 which is rediffed. > why is this dropping > the $(DISABLE_LTO) part? > > The changelog is utterly silent about that. $ git grep DISABLE_LTO arch/sparc/vdso/Makefile:KBUILD_CFLAGS += $(DISABLE_LTO) arch/sparc/vdso/Makefile: $(DISABLE_LTO) arch/x86/entry/vdso/Makefile:KBUILD_CFLAGS += $(DISABLE_LTO) arch/x86/entry/vdso/Makefile: $(DISABLE_LTO) kernel/Makefile:CFLAGS_sys_ni.o = $(DISABLE_LTO) scripts/Makefile.build:cmd_cc_s_c = $(CC) $(c_flags) $(DISABLE_LTO) -fverbose-asm -S -o $@ $< Looks like a dead option to me, but maybe somebody else knows better. v2 will explain this removal. > Thanks, > > tglx
[PATCH] x86: vdso: Use $LD instead of $CC to link
The vdso{32,64}.so can fail to link with CC=clang when clang tries to find a suitable GCC toolchain to link these libraries with. /usr/bin/ld: arch/x86/entry/vdso/vclock_gettime.o: access beyond end of merged section (782) This happens because the host environment leaked into the cross compiler environment due to the way clang searches for suitable GCC toolchains. Clang is a retargetable compiler, and each invocation of it must provide --target= --gcc-toolchain= to allow it to find the correct binutils for cross compilation. These flags had been added to KBUILD_CFLAGS, but the vdso code uses CC and not KBUILD_CFLAGS (for various reasons) which breaks clang's ability to find the correct linker when cross compiling. Most of the time this goes unnoticed because the host linker is new enough to work anyway, or is incompatible and skipped, but this cannot be reliably assumed. This change alters the vdso makefile to just use LD directly, which bypasses clang and thus the searching problem. The makefile will just use ${CROSS_COMPILE}ld instead, which is always what we want. This matches the method used to link vmlinux. Cc: Andy Lutomirski Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: Greg Kroah-Hartman Cc: x...@kernel.org Cc: kernel-t...@android.com Cc: j...@joelfernandes.com Signed-off-by: Alistair Strachan --- Supersedes "x86: vdso: Fix leaky vdso link with CC=clang" arch/x86/entry/vdso/Makefile | 22 +- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 1943aebadede..896a55204386 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -48,10 +48,8 @@ targets += $(vdso_img_sodbg) export CPPFLAGS_vdso.lds += -P -C -VDSO_LDFLAGS_vdso.lds = -m64 -Wl,-soname=linux-vdso.so.1 \ - -Wl,--no-undefined \ - -Wl,-z,max-page-size=4096 -Wl,-z,common-page-size=4096 \ - $(DISABLE_LTO) +VDSO_LDFLAGS_vdso.lds = -m elf_x86_64 -soname linux-vdso.so.1 --no-undefined \ + -z max-page-size=4096 -z common-page-size=4096 $(obj)/vdso64.so.dbg: $(src)/vdso.lds $(vobjs) FORCE $(call if_changed,vdso) @@ -97,10 +95,8 @@ CFLAGS_REMOVE_vvar.o = -pg # CPPFLAGS_vdsox32.lds = $(CPPFLAGS_vdso.lds) -VDSO_LDFLAGS_vdsox32.lds = -Wl,-m,elf32_x86_64 \ - -Wl,-soname=linux-vdso.so.1 \ - -Wl,-z,max-page-size=4096 \ - -Wl,-z,common-page-size=4096 +VDSO_LDFLAGS_vdsox32.lds = -m elf32_x86_64 -soname linux-vdso.so.1 \ + -z max-page-size=4096 -z common-page-size=4096 # 64-bit objects to re-brand as x32 vobjs64-for-x32 := $(filter-out $(vobjs-nox32),$(vobjs-y)) @@ -128,7 +124,7 @@ $(obj)/vdsox32.so.dbg: $(src)/vdsox32.lds $(vobjx32s) FORCE $(call if_changed,vdso) CPPFLAGS_vdso32.lds = $(CPPFLAGS_vdso.lds) -VDSO_LDFLAGS_vdso32.lds = -m32 -Wl,-m,elf_i386 -Wl,-soname=linux-gate.so.1 +VDSO_LDFLAGS_vdso32.lds = -m elf_i386 -soname linux-gate.so.1 targets += vdso32/vdso32.lds targets += vdso32/note.o vdso32/system_call.o vdso32/sigreturn.o @@ -162,13 +158,13 @@ $(obj)/vdso32.so.dbg: FORCE \ # The DSO images are built using a special linker script. # quiet_cmd_vdso = VDSO$@ - cmd_vdso = $(CC) -nostdlib -o $@ \ + cmd_vdso = $(LD) -nostdlib -o $@ \ $(VDSO_LDFLAGS) $(VDSO_LDFLAGS_$(filter %.lds,$(^F))) \ - -Wl,-T,$(filter %.lds,$^) $(filter %.o,$^) && \ + -T $(filter %.lds,$^) $(filter %.o,$^) && \ sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@' -VDSO_LDFLAGS = -fPIC -shared $(call cc-ldoption, -Wl$(comma)--hash-style=both) \ - $(call cc-ldoption, -Wl$(comma)--build-id) -Wl,-Bsymbolic $(LTO_CFLAGS) +VDSO_LDFLAGS = -shared $(call ld-option, --hash-style=both) \ + $(call ld-option, --build-id) -Bsymbolic GCOV_PROFILE := n #
[PATCH] x86: vdso: Use $LD instead of $CC to link
The vdso{32,64}.so can fail to link with CC=clang when clang tries to find a suitable GCC toolchain to link these libraries with. /usr/bin/ld: arch/x86/entry/vdso/vclock_gettime.o: access beyond end of merged section (782) This happens because the host environment leaked into the cross compiler environment due to the way clang searches for suitable GCC toolchains. Clang is a retargetable compiler, and each invocation of it must provide --target= --gcc-toolchain= to allow it to find the correct binutils for cross compilation. These flags had been added to KBUILD_CFLAGS, but the vdso code uses CC and not KBUILD_CFLAGS (for various reasons) which breaks clang's ability to find the correct linker when cross compiling. Most of the time this goes unnoticed because the host linker is new enough to work anyway, or is incompatible and skipped, but this cannot be reliably assumed. This change alters the vdso makefile to just use LD directly, which bypasses clang and thus the searching problem. The makefile will just use ${CROSS_COMPILE}ld instead, which is always what we want. This matches the method used to link vmlinux. Cc: Andy Lutomirski Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: Greg Kroah-Hartman Cc: x...@kernel.org Cc: kernel-t...@android.com Cc: j...@joelfernandes.com Signed-off-by: Alistair Strachan --- Supersedes "x86: vdso: Fix leaky vdso link with CC=clang" arch/x86/entry/vdso/Makefile | 22 +- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 1943aebadede..896a55204386 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -48,10 +48,8 @@ targets += $(vdso_img_sodbg) export CPPFLAGS_vdso.lds += -P -C -VDSO_LDFLAGS_vdso.lds = -m64 -Wl,-soname=linux-vdso.so.1 \ - -Wl,--no-undefined \ - -Wl,-z,max-page-size=4096 -Wl,-z,common-page-size=4096 \ - $(DISABLE_LTO) +VDSO_LDFLAGS_vdso.lds = -m elf_x86_64 -soname linux-vdso.so.1 --no-undefined \ + -z max-page-size=4096 -z common-page-size=4096 $(obj)/vdso64.so.dbg: $(src)/vdso.lds $(vobjs) FORCE $(call if_changed,vdso) @@ -97,10 +95,8 @@ CFLAGS_REMOVE_vvar.o = -pg # CPPFLAGS_vdsox32.lds = $(CPPFLAGS_vdso.lds) -VDSO_LDFLAGS_vdsox32.lds = -Wl,-m,elf32_x86_64 \ - -Wl,-soname=linux-vdso.so.1 \ - -Wl,-z,max-page-size=4096 \ - -Wl,-z,common-page-size=4096 +VDSO_LDFLAGS_vdsox32.lds = -m elf32_x86_64 -soname linux-vdso.so.1 \ + -z max-page-size=4096 -z common-page-size=4096 # 64-bit objects to re-brand as x32 vobjs64-for-x32 := $(filter-out $(vobjs-nox32),$(vobjs-y)) @@ -128,7 +124,7 @@ $(obj)/vdsox32.so.dbg: $(src)/vdsox32.lds $(vobjx32s) FORCE $(call if_changed,vdso) CPPFLAGS_vdso32.lds = $(CPPFLAGS_vdso.lds) -VDSO_LDFLAGS_vdso32.lds = -m32 -Wl,-m,elf_i386 -Wl,-soname=linux-gate.so.1 +VDSO_LDFLAGS_vdso32.lds = -m elf_i386 -soname linux-gate.so.1 targets += vdso32/vdso32.lds targets += vdso32/note.o vdso32/system_call.o vdso32/sigreturn.o @@ -162,13 +158,13 @@ $(obj)/vdso32.so.dbg: FORCE \ # The DSO images are built using a special linker script. # quiet_cmd_vdso = VDSO$@ - cmd_vdso = $(CC) -nostdlib -o $@ \ + cmd_vdso = $(LD) -nostdlib -o $@ \ $(VDSO_LDFLAGS) $(VDSO_LDFLAGS_$(filter %.lds,$(^F))) \ - -Wl,-T,$(filter %.lds,$^) $(filter %.o,$^) && \ + -T $(filter %.lds,$^) $(filter %.o,$^) && \ sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@' -VDSO_LDFLAGS = -fPIC -shared $(call cc-ldoption, -Wl$(comma)--hash-style=both) \ - $(call cc-ldoption, -Wl$(comma)--build-id) -Wl,-Bsymbolic $(LTO_CFLAGS) +VDSO_LDFLAGS = -shared $(call ld-option, --hash-style=both) \ + $(call ld-option, --build-id) -Bsymbolic GCOV_PROFILE := n #
Re: WARNING: lock held when returning to user space in fuse_lock_inode
On Tue, Jul 17, 2018 at 5:46 AM Miklos Szeredi wrote: > On Tue, Jul 17, 2018 at 1:36 PM, Dmitry Vyukov wrote: > > On Tue, Jul 17, 2018 at 1:14 PM, Miklos Szeredi wrote: > >> On Thu, Jul 12, 2018 at 5:49 PM, syzbot > >> wrote: > >>> Hello, > >>> > >>> syzbot found the following crash on: > >>> > >>> HEAD commit:c25c74b7476e Merge tag 'trace-v4.18-rc3-2' of > >>> git://git.ke.. > >>> git tree: upstream > >>> console output: https://syzkaller.appspot.com/x/log.txt?x=177bcec240 > >>> kernel config: https://syzkaller.appspot.com/x/.config?x=25856fac4e580aa7 > >>> dashboard link: > >>> https://syzkaller.appspot.com/bug?extid=3f7b29af1baa9d0a55be > >>> compiler: gcc (GCC) 8.0.1 20180413 (experimental) > >>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=13aa767840 > >>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1749267840 > >>> > >>> IMPORTANT: if you fix the bug, please add the following tag to the commit: > >>> Reported-by: syzbot+3f7b29af1baa9d0a5...@syzkaller.appspotmail.com > >>> > >>> random: sshd: uninitialized urandom read (32 bytes read) > >>> random: sshd: uninitialized urandom read (32 bytes read) > >>> random: sshd: uninitialized urandom read (32 bytes read) > >>> > >>> > >>> WARNING: lock held when returning to user space! > >>> 4.18.0-rc4+ #143 Not tainted > >>> > >>> syz-executor012/4539 is leaving the kernel with locks still held! > >>> 1 lock held by syz-executor012/4539: > >>> #0: (ptrval) (>mutex){+.+.}, at: fuse_lock_inode+0xaf/0xe0 > >>> fs/fuse/inode.c:363 > >> > >> False positive. > >> > >> fi->mutex is definitely not held by the acquiring task when returning > >> to userspace. Maybe syzkaller is confused by the fact that there are > >> several interdependent tasks involved with fuse: the one calling into > >> fuse by doing something (looking up ./file0/file0) and the one that > >> reads the fuse device (returning with the LOOKUP request for "file0"). > >> The second one will return with that lock held, but it's not the one > >> that acquired it, so there's no bug at all here. > > > > Hi Miklos, > > > > syzkaller is unrelated here. That's what kernel self-detects and > > prints. So either way there is something to fix in kernel here: either > > fuse or lockdep. > > > > +Alistair did some analysis offline, hope you don't mind if I repost > > your description: > > === > > Just from reading the code, I think I can see how this happens. Fuse > > is wrapping its inode mutex with a check for "parallel_dirops", which > > is set up in process_init_reply(). The FUSE_PARALLEL_DIROPS appears to > > always be set, in fuse_send_init(), but its initial state is to be > > disabled. So if the mutex gets taken, and it'll never be unlocked if > > the initial command is flushed by fuse_readdir()'s use of > > fuse_lock_inode(). > > === > > Ah, indeed. Fix attached. Looks good to me. Tested-by: Alistair Strachan > Thanks, > Miklos
Re: WARNING: lock held when returning to user space in fuse_lock_inode
On Tue, Jul 17, 2018 at 5:46 AM Miklos Szeredi wrote: > On Tue, Jul 17, 2018 at 1:36 PM, Dmitry Vyukov wrote: > > On Tue, Jul 17, 2018 at 1:14 PM, Miklos Szeredi wrote: > >> On Thu, Jul 12, 2018 at 5:49 PM, syzbot > >> wrote: > >>> Hello, > >>> > >>> syzbot found the following crash on: > >>> > >>> HEAD commit:c25c74b7476e Merge tag 'trace-v4.18-rc3-2' of > >>> git://git.ke.. > >>> git tree: upstream > >>> console output: https://syzkaller.appspot.com/x/log.txt?x=177bcec240 > >>> kernel config: https://syzkaller.appspot.com/x/.config?x=25856fac4e580aa7 > >>> dashboard link: > >>> https://syzkaller.appspot.com/bug?extid=3f7b29af1baa9d0a55be > >>> compiler: gcc (GCC) 8.0.1 20180413 (experimental) > >>> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=13aa767840 > >>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1749267840 > >>> > >>> IMPORTANT: if you fix the bug, please add the following tag to the commit: > >>> Reported-by: syzbot+3f7b29af1baa9d0a5...@syzkaller.appspotmail.com > >>> > >>> random: sshd: uninitialized urandom read (32 bytes read) > >>> random: sshd: uninitialized urandom read (32 bytes read) > >>> random: sshd: uninitialized urandom read (32 bytes read) > >>> > >>> > >>> WARNING: lock held when returning to user space! > >>> 4.18.0-rc4+ #143 Not tainted > >>> > >>> syz-executor012/4539 is leaving the kernel with locks still held! > >>> 1 lock held by syz-executor012/4539: > >>> #0: (ptrval) (>mutex){+.+.}, at: fuse_lock_inode+0xaf/0xe0 > >>> fs/fuse/inode.c:363 > >> > >> False positive. > >> > >> fi->mutex is definitely not held by the acquiring task when returning > >> to userspace. Maybe syzkaller is confused by the fact that there are > >> several interdependent tasks involved with fuse: the one calling into > >> fuse by doing something (looking up ./file0/file0) and the one that > >> reads the fuse device (returning with the LOOKUP request for "file0"). > >> The second one will return with that lock held, but it's not the one > >> that acquired it, so there's no bug at all here. > > > > Hi Miklos, > > > > syzkaller is unrelated here. That's what kernel self-detects and > > prints. So either way there is something to fix in kernel here: either > > fuse or lockdep. > > > > +Alistair did some analysis offline, hope you don't mind if I repost > > your description: > > === > > Just from reading the code, I think I can see how this happens. Fuse > > is wrapping its inode mutex with a check for "parallel_dirops", which > > is set up in process_init_reply(). The FUSE_PARALLEL_DIROPS appears to > > always be set, in fuse_send_init(), but its initial state is to be > > disabled. So if the mutex gets taken, and it'll never be unlocked if > > the initial command is flushed by fuse_readdir()'s use of > > fuse_lock_inode(). > > === > > Ah, indeed. Fix attached. Looks good to me. Tested-by: Alistair Strachan > Thanks, > Miklos
Re: [PATCH] x86: vdso: Fix leaky vdso link with CC=clang
On Thu, Jul 12, 2018 at 4:20 PM Andy Lutomirski wrote: > > > On Jul 12, 2018, at 3:06 PM, H. Peter Anvin wrote: > > > >> On 07/12/18 13:37, Alistair Strachan wrote: > >>> On Thu, Jul 12, 2018 at 1:25 PM H. Peter Anvin wrote: > >>>> On 07/12/18 13:10, Alistair Strachan wrote: > >>>> The vdso{32,64}.so can fail to link with CC=clang when clang tries to > >>>> find a suitable GCC toolchain to link these libraries with. > >>>> > >>>> /usr/bin/ld: arch/x86/entry/vdso/vclock_gettime.o: > >>>> access beyond end of merged section (782) > >>>> > >>>> This happens because the host environment leaked into the cross > >>>> compiler environment due to the way clang searches for suitable GCC > >>>> toolchains. > >>>> > >>> > >>> Is this another clang bug that you want a workaround for in the kernel? > >> > >> Clang is a retargetable compiler (specified with --target=) > >> and so it has a mechanism for searching for suitable binutils (from > >> another "GCC toolchain") to perform assembly and linkage. This > >> mechanism relies on both --target and --gcc-toolchain when > >> cross-compiling, otherwise it will fall back to searching /usr. > >> > >> The --target and --gcc-toolchain flags are already specified correctly > >> in the top level Makefile, but the vdso Makefile rolls its own linker > >> flags and doesn't use KBUILD_CFLAGS. Therefore, these flags get > >> incorrectly dropped from the vdso $CC link command line, and an > >> inconsistency is created between the "GCC toolchain" used to generate > >> the objects for the vdso, and the linker used to link them. > >> > > > > It sounds like there needs to be a more fundamental symbol than > > KBUILD_CFLAGS to contain these kinds of things. > > How about $(CC)? I'm guessing, but I think this wasn't done originally because CC is something the user might reasonably specify on the command line (the other bit comes from CROSS_COMPILE), so doing this via CC would require us to override the CC passed in on the command line. Not sure how to do that, since the vdso makefile is executed with a submake, so the usual "override CC := $(CC) something else" followed by "export CC" doesn't work.
Re: [PATCH] x86: vdso: Fix leaky vdso link with CC=clang
On Thu, Jul 12, 2018 at 4:20 PM Andy Lutomirski wrote: > > > On Jul 12, 2018, at 3:06 PM, H. Peter Anvin wrote: > > > >> On 07/12/18 13:37, Alistair Strachan wrote: > >>> On Thu, Jul 12, 2018 at 1:25 PM H. Peter Anvin wrote: > >>>> On 07/12/18 13:10, Alistair Strachan wrote: > >>>> The vdso{32,64}.so can fail to link with CC=clang when clang tries to > >>>> find a suitable GCC toolchain to link these libraries with. > >>>> > >>>> /usr/bin/ld: arch/x86/entry/vdso/vclock_gettime.o: > >>>> access beyond end of merged section (782) > >>>> > >>>> This happens because the host environment leaked into the cross > >>>> compiler environment due to the way clang searches for suitable GCC > >>>> toolchains. > >>>> > >>> > >>> Is this another clang bug that you want a workaround for in the kernel? > >> > >> Clang is a retargetable compiler (specified with --target=) > >> and so it has a mechanism for searching for suitable binutils (from > >> another "GCC toolchain") to perform assembly and linkage. This > >> mechanism relies on both --target and --gcc-toolchain when > >> cross-compiling, otherwise it will fall back to searching /usr. > >> > >> The --target and --gcc-toolchain flags are already specified correctly > >> in the top level Makefile, but the vdso Makefile rolls its own linker > >> flags and doesn't use KBUILD_CFLAGS. Therefore, these flags get > >> incorrectly dropped from the vdso $CC link command line, and an > >> inconsistency is created between the "GCC toolchain" used to generate > >> the objects for the vdso, and the linker used to link them. > >> > > > > It sounds like there needs to be a more fundamental symbol than > > KBUILD_CFLAGS to contain these kinds of things. > > How about $(CC)? I'm guessing, but I think this wasn't done originally because CC is something the user might reasonably specify on the command line (the other bit comes from CROSS_COMPILE), so doing this via CC would require us to override the CC passed in on the command line. Not sure how to do that, since the vdso makefile is executed with a submake, so the usual "override CC := $(CC) something else" followed by "export CC" doesn't work.
Re: [PATCH] x86: vdso: Fix leaky vdso link with CC=clang
On Thu, Jul 12, 2018 at 1:25 PM H. Peter Anvin wrote: > On 07/12/18 13:10, Alistair Strachan wrote: > > The vdso{32,64}.so can fail to link with CC=clang when clang tries to > > find a suitable GCC toolchain to link these libraries with. > > > > /usr/bin/ld: arch/x86/entry/vdso/vclock_gettime.o: > > access beyond end of merged section (782) > > > > This happens because the host environment leaked into the cross > > compiler environment due to the way clang searches for suitable GCC > > toolchains. > > > > Is this another clang bug that you want a workaround for in the kernel? Clang is a retargetable compiler (specified with --target=) and so it has a mechanism for searching for suitable binutils (from another "GCC toolchain") to perform assembly and linkage. This mechanism relies on both --target and --gcc-toolchain when cross-compiling, otherwise it will fall back to searching /usr. The --target and --gcc-toolchain flags are already specified correctly in the top level Makefile, but the vdso Makefile rolls its own linker flags and doesn't use KBUILD_CFLAGS. Therefore, these flags get incorrectly dropped from the vdso $CC link command line, and an inconsistency is created between the "GCC toolchain" used to generate the objects for the vdso, and the linker used to link them.
Re: [PATCH] x86: vdso: Fix leaky vdso link with CC=clang
On Thu, Jul 12, 2018 at 1:25 PM H. Peter Anvin wrote: > On 07/12/18 13:10, Alistair Strachan wrote: > > The vdso{32,64}.so can fail to link with CC=clang when clang tries to > > find a suitable GCC toolchain to link these libraries with. > > > > /usr/bin/ld: arch/x86/entry/vdso/vclock_gettime.o: > > access beyond end of merged section (782) > > > > This happens because the host environment leaked into the cross > > compiler environment due to the way clang searches for suitable GCC > > toolchains. > > > > Is this another clang bug that you want a workaround for in the kernel? Clang is a retargetable compiler (specified with --target=) and so it has a mechanism for searching for suitable binutils (from another "GCC toolchain") to perform assembly and linkage. This mechanism relies on both --target and --gcc-toolchain when cross-compiling, otherwise it will fall back to searching /usr. The --target and --gcc-toolchain flags are already specified correctly in the top level Makefile, but the vdso Makefile rolls its own linker flags and doesn't use KBUILD_CFLAGS. Therefore, these flags get incorrectly dropped from the vdso $CC link command line, and an inconsistency is created between the "GCC toolchain" used to generate the objects for the vdso, and the linker used to link them.
[PATCH] x86: vdso: Fix leaky vdso link with CC=clang
The vdso{32,64}.so can fail to link with CC=clang when clang tries to find a suitable GCC toolchain to link these libraries with. /usr/bin/ld: arch/x86/entry/vdso/vclock_gettime.o: access beyond end of merged section (782) This happens because the host environment leaked into the cross compiler environment due to the way clang searches for suitable GCC toolchains. Most of the time this goes unnoticed because the host linker is new enough to work anyway, but this cannot be reliably assumed. Extract the needed --target and --gcc-toolchain flags added in the top level Makefile from KBUILD_CFLAGS. These flags direct clang to the appropriate toolchain to link the vdsos. Cc: Andy Lutomirski Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: Greg Kroah-Hartman Cc: x...@kernel.org Cc: kernel-t...@android.com Cc: j...@joelfernandes.com Signed-off-by: Alistair Strachan --- arch/x86/entry/vdso/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 261802b1cc50..062ee94d2d26 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -163,7 +163,8 @@ quiet_cmd_vdso = VDSO$@ sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@' VDSO_LDFLAGS = -fPIC -shared $(call cc-ldoption, -Wl$(comma)--hash-style=both) \ - $(call cc-ldoption, -Wl$(comma)--build-id) -Wl,-Bsymbolic $(LTO_CFLAGS) + $(call cc-ldoption, -Wl$(comma)--build-id) -Wl,-Bsymbolic $(LTO_CFLAGS) \ + $(filter --target=% --gcc-toolchain=%,$(KBUILD_CFLAGS)) GCOV_PROFILE := n #
[PATCH] x86: vdso: Fix leaky vdso link with CC=clang
The vdso{32,64}.so can fail to link with CC=clang when clang tries to find a suitable GCC toolchain to link these libraries with. /usr/bin/ld: arch/x86/entry/vdso/vclock_gettime.o: access beyond end of merged section (782) This happens because the host environment leaked into the cross compiler environment due to the way clang searches for suitable GCC toolchains. Most of the time this goes unnoticed because the host linker is new enough to work anyway, but this cannot be reliably assumed. Extract the needed --target and --gcc-toolchain flags added in the top level Makefile from KBUILD_CFLAGS. These flags direct clang to the appropriate toolchain to link the vdsos. Cc: Andy Lutomirski Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: Greg Kroah-Hartman Cc: x...@kernel.org Cc: kernel-t...@android.com Cc: j...@joelfernandes.com Signed-off-by: Alistair Strachan --- arch/x86/entry/vdso/Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile index 261802b1cc50..062ee94d2d26 100644 --- a/arch/x86/entry/vdso/Makefile +++ b/arch/x86/entry/vdso/Makefile @@ -163,7 +163,8 @@ quiet_cmd_vdso = VDSO$@ sh $(srctree)/$(src)/checkundef.sh '$(NM)' '$@' VDSO_LDFLAGS = -fPIC -shared $(call cc-ldoption, -Wl$(comma)--hash-style=both) \ - $(call cc-ldoption, -Wl$(comma)--build-id) -Wl,-Bsymbolic $(LTO_CFLAGS) + $(call cc-ldoption, -Wl$(comma)--build-id) -Wl,-Bsymbolic $(LTO_CFLAGS) \ + $(filter --target=% --gcc-toolchain=%,$(KBUILD_CFLAGS)) GCOV_PROFILE := n #
Re: [PATCH] staging: android/vsoc: stop using 'timespec'
On Mon, Jun 18, 2018 at 8:09 AM Arnd Bergmann wrote: > > The timespec structure suffers from the y2038 overflow and should not > be used. This changes handle_vsoc_cond_wait() to use ktime_t directly. > > Signed-off-by: Arnd Bergmann Tested-by: Alistair Strachan > --- > drivers/staging/android/vsoc.c | 11 +-- > 1 file changed, 5 insertions(+), 6 deletions(-) > > diff --git a/drivers/staging/android/vsoc.c b/drivers/staging/android/vsoc.c > index 806beda1040b..22571abcaa4e 100644 > --- a/drivers/staging/android/vsoc.c > +++ b/drivers/staging/android/vsoc.c > @@ -405,7 +405,7 @@ static int handle_vsoc_cond_wait(struct file *filp, > struct vsoc_cond_wait *arg) > int ret = 0; > struct vsoc_device_region *region_p = vsoc_region_from_filep(filp); > atomic_t *address = NULL; > - struct timespec ts; > + ktime_t wake_time; > > /* Ensure that the offset is aligned */ > if (arg->offset & (sizeof(uint32_t) - 1)) > @@ -433,14 +433,13 @@ static int handle_vsoc_cond_wait(struct file *filp, > struct vsoc_cond_wait *arg) > * We do things this way to flatten differences between 32 bit > * and 64 bit timespecs. > */ > - ts.tv_sec = arg->wake_time_sec; > - ts.tv_nsec = arg->wake_time_nsec; > - > - if (!timespec_valid()) > + if (arg->wake_time_nsec >= NSEC_PER_SEC) > return -EINVAL; > + wake_time = ktime_set(arg->wake_time_sec, > arg->wake_time_nsec); > + > hrtimer_init_on_stack(>timer, CLOCK_MONOTONIC, > HRTIMER_MODE_ABS); > - hrtimer_set_expires_range_ns(>timer, > timespec_to_ktime(ts), > + hrtimer_set_expires_range_ns(>timer, wake_time, > current->timer_slack_ns); > > hrtimer_init_sleeper(to, current); > -- > 2.9.0 >
Re: [PATCH] staging: android/vsoc: stop using 'timespec'
On Mon, Jun 18, 2018 at 8:09 AM Arnd Bergmann wrote: > > The timespec structure suffers from the y2038 overflow and should not > be used. This changes handle_vsoc_cond_wait() to use ktime_t directly. > > Signed-off-by: Arnd Bergmann Tested-by: Alistair Strachan > --- > drivers/staging/android/vsoc.c | 11 +-- > 1 file changed, 5 insertions(+), 6 deletions(-) > > diff --git a/drivers/staging/android/vsoc.c b/drivers/staging/android/vsoc.c > index 806beda1040b..22571abcaa4e 100644 > --- a/drivers/staging/android/vsoc.c > +++ b/drivers/staging/android/vsoc.c > @@ -405,7 +405,7 @@ static int handle_vsoc_cond_wait(struct file *filp, > struct vsoc_cond_wait *arg) > int ret = 0; > struct vsoc_device_region *region_p = vsoc_region_from_filep(filp); > atomic_t *address = NULL; > - struct timespec ts; > + ktime_t wake_time; > > /* Ensure that the offset is aligned */ > if (arg->offset & (sizeof(uint32_t) - 1)) > @@ -433,14 +433,13 @@ static int handle_vsoc_cond_wait(struct file *filp, > struct vsoc_cond_wait *arg) > * We do things this way to flatten differences between 32 bit > * and 64 bit timespecs. > */ > - ts.tv_sec = arg->wake_time_sec; > - ts.tv_nsec = arg->wake_time_nsec; > - > - if (!timespec_valid()) > + if (arg->wake_time_nsec >= NSEC_PER_SEC) > return -EINVAL; > + wake_time = ktime_set(arg->wake_time_sec, > arg->wake_time_nsec); > + > hrtimer_init_on_stack(>timer, CLOCK_MONOTONIC, > HRTIMER_MODE_ABS); > - hrtimer_set_expires_range_ns(>timer, > timespec_to_ktime(ts), > + hrtimer_set_expires_range_ns(>timer, wake_time, > current->timer_slack_ns); > > hrtimer_init_sleeper(to, current); > -- > 2.9.0 >
[PATCH 2/2 v2] staging: android: ashmem: Fix mmap size validation
The ashmem driver did not check that the size/offset of the vma passed to its .mmap() function was not larger than the ashmem object being mapped. This could cause mmap() to succeed, even though accessing parts of the mapping would later fail with a segmentation fault. Ensure an error is returned by the ashmem_mmap() function if the vma size is larger than the ashmem object size. This enables safer handling of the problem in userspace. Cc: Greg Kroah-Hartman Cc: Arve Hjønnevåg Cc: Todd Kjos Cc: Martijn Coenen Cc: de...@driverdev.osuosl.org Cc: linux-kernel@vger.kernel.org Cc: kernel-t...@android.com Cc: Joel Fernandes Signed-off-by: Alistair Strachan --- v2: Removed unnecessary use of unlikely() macro drivers/staging/android/ashmem.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c index c6386e4f5c9b..e392358ec244 100644 --- a/drivers/staging/android/ashmem.c +++ b/drivers/staging/android/ashmem.c @@ -366,6 +366,12 @@ static int ashmem_mmap(struct file *file, struct vm_area_struct *vma) goto out; } + /* requested mapping size larger than object size */ + if (vma->vm_end - vma->vm_start > PAGE_ALIGN(asma->size)) { + ret = -EINVAL; + goto out; + } + /* requested protection bits must match our allowed protection mask */ if ((vma->vm_flags & ~calc_vm_prot_bits(asma->prot_mask, 0)) & calc_vm_prot_bits(PROT_MASK, 0)) {
[PATCH 2/2 v2] staging: android: ashmem: Fix mmap size validation
The ashmem driver did not check that the size/offset of the vma passed to its .mmap() function was not larger than the ashmem object being mapped. This could cause mmap() to succeed, even though accessing parts of the mapping would later fail with a segmentation fault. Ensure an error is returned by the ashmem_mmap() function if the vma size is larger than the ashmem object size. This enables safer handling of the problem in userspace. Cc: Greg Kroah-Hartman Cc: Arve Hjønnevåg Cc: Todd Kjos Cc: Martijn Coenen Cc: de...@driverdev.osuosl.org Cc: linux-kernel@vger.kernel.org Cc: kernel-t...@android.com Cc: Joel Fernandes Signed-off-by: Alistair Strachan --- v2: Removed unnecessary use of unlikely() macro drivers/staging/android/ashmem.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c index c6386e4f5c9b..e392358ec244 100644 --- a/drivers/staging/android/ashmem.c +++ b/drivers/staging/android/ashmem.c @@ -366,6 +366,12 @@ static int ashmem_mmap(struct file *file, struct vm_area_struct *vma) goto out; } + /* requested mapping size larger than object size */ + if (vma->vm_end - vma->vm_start > PAGE_ALIGN(asma->size)) { + ret = -EINVAL; + goto out; + } + /* requested protection bits must match our allowed protection mask */ if ((vma->vm_flags & ~calc_vm_prot_bits(asma->prot_mask, 0)) & calc_vm_prot_bits(PROT_MASK, 0)) {
[PATCH 1/2] staging: android: ashmem: Remove use of unlikely()
There is no speed difference, and it makes the code harder to read. Cc: Greg Kroah-Hartman Cc: Arve Hjønnevåg Cc: Todd Kjos Cc: Martijn Coenen Cc: de...@driverdev.osuosl.org Cc: linux-kernel@vger.kernel.org Cc: kernel-t...@android.com Cc: Joel Fernandes Suggested-by: Greg Kroah-Hartman Signed-off-by: Alistair Strachan --- drivers/staging/android/ashmem.c | 34 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c index a1a0025b59e0..c6386e4f5c9b 100644 --- a/drivers/staging/android/ashmem.c +++ b/drivers/staging/android/ashmem.c @@ -178,7 +178,7 @@ static int range_alloc(struct ashmem_area *asma, struct ashmem_range *range; range = kmem_cache_zalloc(ashmem_range_cachep, GFP_KERNEL); - if (unlikely(!range)) + if (!range) return -ENOMEM; range->asma = asma; @@ -246,11 +246,11 @@ static int ashmem_open(struct inode *inode, struct file *file) int ret; ret = generic_file_open(inode, file); - if (unlikely(ret)) + if (ret) return ret; asma = kmem_cache_zalloc(ashmem_area_cachep, GFP_KERNEL); - if (unlikely(!asma)) + if (!asma) return -ENOMEM; INIT_LIST_HEAD(>unpinned_list); @@ -361,14 +361,14 @@ static int ashmem_mmap(struct file *file, struct vm_area_struct *vma) mutex_lock(_mutex); /* user needs to SET_SIZE before mapping */ - if (unlikely(!asma->size)) { + if (!asma->size) { ret = -EINVAL; goto out; } /* requested protection bits must match our allowed protection mask */ - if (unlikely((vma->vm_flags & ~calc_vm_prot_bits(asma->prot_mask, 0)) & -calc_vm_prot_bits(PROT_MASK, 0))) { + if ((vma->vm_flags & ~calc_vm_prot_bits(asma->prot_mask, 0)) & + calc_vm_prot_bits(PROT_MASK, 0)) { ret = -EPERM; goto out; } @@ -486,7 +486,7 @@ static int set_prot_mask(struct ashmem_area *asma, unsigned long prot) mutex_lock(_mutex); /* the user can only remove, not add, protection bits */ - if (unlikely((asma->prot_mask & prot) != prot)) { + if ((asma->prot_mask & prot) != prot) { ret = -EINVAL; goto out; } @@ -524,7 +524,7 @@ static int set_name(struct ashmem_area *asma, void __user *name) local_name[ASHMEM_NAME_LEN - 1] = '\0'; mutex_lock(_mutex); /* cannot change an existing mapping's name */ - if (unlikely(asma->file)) + if (asma->file) ret = -EINVAL; else strcpy(asma->name + ASHMEM_NAME_PREFIX_LEN, local_name); @@ -563,7 +563,7 @@ static int get_name(struct ashmem_area *asma, void __user *name) * Now we are just copying from the stack variable to userland * No lock held */ - if (unlikely(copy_to_user(name, local_name, len))) + if (copy_to_user(name, local_name, len)) ret = -EFAULT; return ret; } @@ -701,25 +701,25 @@ static int ashmem_pin_unpin(struct ashmem_area *asma, unsigned long cmd, size_t pgstart, pgend; int ret = -EINVAL; - if (unlikely(copy_from_user(, p, sizeof(pin + if (copy_from_user(, p, sizeof(pin))) return -EFAULT; mutex_lock(_mutex); - if (unlikely(!asma->file)) + if (!asma->file) goto out_unlock; /* per custom, you can pass zero for len to mean "everything onward" */ if (!pin.len) pin.len = PAGE_ALIGN(asma->size) - pin.offset; - if (unlikely((pin.offset | pin.len) & ~PAGE_MASK)) + if ((pin.offset | pin.len) & ~PAGE_MASK) goto out_unlock; - if (unlikely(((__u32)-1) - pin.offset < pin.len)) + if (((__u32)-1) - pin.offset < pin.len) goto out_unlock; - if (unlikely(PAGE_ALIGN(asma->size) < pin.offset + pin.len)) + if (PAGE_ALIGN(asma->size) < pin.offset + pin.len) goto out_unlock; pgstart = pin.offset / PAGE_SIZE; @@ -856,7 +856,7 @@ static int __init ashmem_init(void) ashmem_area_cachep = kmem_cache_create("ashmem_area_cache", sizeof(struct ashmem_area), 0, 0, NULL); - if (unlikely(!ashmem_area_cachep)) { + if (!ashmem_area_cachep) { pr_err("failed to create slab cache\n"); goto out; } @@ -864,13 +864,13 @@ static int __init ashmem_init(void) ashmem_range_cachep = kmem_cache_create("ashmem_range_cache",
[PATCH 1/2] staging: android: ashmem: Remove use of unlikely()
There is no speed difference, and it makes the code harder to read. Cc: Greg Kroah-Hartman Cc: Arve Hjønnevåg Cc: Todd Kjos Cc: Martijn Coenen Cc: de...@driverdev.osuosl.org Cc: linux-kernel@vger.kernel.org Cc: kernel-t...@android.com Cc: Joel Fernandes Suggested-by: Greg Kroah-Hartman Signed-off-by: Alistair Strachan --- drivers/staging/android/ashmem.c | 34 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c index a1a0025b59e0..c6386e4f5c9b 100644 --- a/drivers/staging/android/ashmem.c +++ b/drivers/staging/android/ashmem.c @@ -178,7 +178,7 @@ static int range_alloc(struct ashmem_area *asma, struct ashmem_range *range; range = kmem_cache_zalloc(ashmem_range_cachep, GFP_KERNEL); - if (unlikely(!range)) + if (!range) return -ENOMEM; range->asma = asma; @@ -246,11 +246,11 @@ static int ashmem_open(struct inode *inode, struct file *file) int ret; ret = generic_file_open(inode, file); - if (unlikely(ret)) + if (ret) return ret; asma = kmem_cache_zalloc(ashmem_area_cachep, GFP_KERNEL); - if (unlikely(!asma)) + if (!asma) return -ENOMEM; INIT_LIST_HEAD(>unpinned_list); @@ -361,14 +361,14 @@ static int ashmem_mmap(struct file *file, struct vm_area_struct *vma) mutex_lock(_mutex); /* user needs to SET_SIZE before mapping */ - if (unlikely(!asma->size)) { + if (!asma->size) { ret = -EINVAL; goto out; } /* requested protection bits must match our allowed protection mask */ - if (unlikely((vma->vm_flags & ~calc_vm_prot_bits(asma->prot_mask, 0)) & -calc_vm_prot_bits(PROT_MASK, 0))) { + if ((vma->vm_flags & ~calc_vm_prot_bits(asma->prot_mask, 0)) & + calc_vm_prot_bits(PROT_MASK, 0)) { ret = -EPERM; goto out; } @@ -486,7 +486,7 @@ static int set_prot_mask(struct ashmem_area *asma, unsigned long prot) mutex_lock(_mutex); /* the user can only remove, not add, protection bits */ - if (unlikely((asma->prot_mask & prot) != prot)) { + if ((asma->prot_mask & prot) != prot) { ret = -EINVAL; goto out; } @@ -524,7 +524,7 @@ static int set_name(struct ashmem_area *asma, void __user *name) local_name[ASHMEM_NAME_LEN - 1] = '\0'; mutex_lock(_mutex); /* cannot change an existing mapping's name */ - if (unlikely(asma->file)) + if (asma->file) ret = -EINVAL; else strcpy(asma->name + ASHMEM_NAME_PREFIX_LEN, local_name); @@ -563,7 +563,7 @@ static int get_name(struct ashmem_area *asma, void __user *name) * Now we are just copying from the stack variable to userland * No lock held */ - if (unlikely(copy_to_user(name, local_name, len))) + if (copy_to_user(name, local_name, len)) ret = -EFAULT; return ret; } @@ -701,25 +701,25 @@ static int ashmem_pin_unpin(struct ashmem_area *asma, unsigned long cmd, size_t pgstart, pgend; int ret = -EINVAL; - if (unlikely(copy_from_user(, p, sizeof(pin + if (copy_from_user(, p, sizeof(pin))) return -EFAULT; mutex_lock(_mutex); - if (unlikely(!asma->file)) + if (!asma->file) goto out_unlock; /* per custom, you can pass zero for len to mean "everything onward" */ if (!pin.len) pin.len = PAGE_ALIGN(asma->size) - pin.offset; - if (unlikely((pin.offset | pin.len) & ~PAGE_MASK)) + if ((pin.offset | pin.len) & ~PAGE_MASK) goto out_unlock; - if (unlikely(((__u32)-1) - pin.offset < pin.len)) + if (((__u32)-1) - pin.offset < pin.len) goto out_unlock; - if (unlikely(PAGE_ALIGN(asma->size) < pin.offset + pin.len)) + if (PAGE_ALIGN(asma->size) < pin.offset + pin.len) goto out_unlock; pgstart = pin.offset / PAGE_SIZE; @@ -856,7 +856,7 @@ static int __init ashmem_init(void) ashmem_area_cachep = kmem_cache_create("ashmem_area_cache", sizeof(struct ashmem_area), 0, 0, NULL); - if (unlikely(!ashmem_area_cachep)) { + if (!ashmem_area_cachep) { pr_err("failed to create slab cache\n"); goto out; } @@ -864,13 +864,13 @@ static int __init ashmem_init(void) ashmem_range_cachep = kmem_cache_create("ashmem_range_cache",
Re: [PATCH] staging: android: ashmem: Fix mmap size validation
HI Greg, On Tue, Jun 19, 2018 at 4:01 PM Greg Kroah-Hartman wrote: > > On Tue, Jun 19, 2018 at 03:24:44PM -0700, Alistair Strachan wrote: > > The ashmem driver did not check that the size/offset of the vma passed > > to its .mmap() function was not larger than the ashmem object being > > mapped. This could cause mmap() to succeed, even though accessing parts > > of the mapping would later fail with a segmentation fault. > > > > Ensure an error is returned by the ashmem_mmap() function if the vma > > size is larger than the ashmem object size. This enables safer handling > > of the problem in userspace. > > Is this going to break current users of this api as their original call > was succeeding? If the user created an ashmem region, set its size with ASHMEM_SET_SIZE, then mmap()ed a larger size, but did not touch the extra memory it just tried to mmap(), it could pass with the existing driver. With this patch, it would fail at mmap() time. We're not aware of users that rely on such behavior. > > > > Cc: Greg Kroah-Hartman > > Cc: Arve Hjønnevåg > > Cc: Todd Kjos > > Cc: Martijn Coenen > > Cc: de...@driverdev.osuosl.org > > Cc: kernel-t...@android.com > > Signed-off-by: Alistair Strachan > > --- > > drivers/staging/android/ashmem.c | 6 ++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/drivers/staging/android/ashmem.c > > b/drivers/staging/android/ashmem.c > > index a1a0025b59e0..1eeedb529a10 100644 > > --- a/drivers/staging/android/ashmem.c > > +++ b/drivers/staging/android/ashmem.c > > @@ -366,6 +366,12 @@ static int ashmem_mmap(struct file *file, struct > > vm_area_struct *vma) > > goto out; > > } > > > > + /* requested mapping size larger than object size */ > > + if (unlikely(vma->vm_end - vma->vm_start > PAGE_ALIGN(asma->size))) { > > + ret = -EINVAL; > > + goto out; > > + } > > Unless you can measure the speed difference, never use likely/unlikely. > The CPU and compiler almost always knows how to do this better than we > do. I know there are other checks like this in this function, those > should also be fixed as well. Sure. I'll split that out into another change and resend. > > thanks, > > greg k-h
Re: [PATCH] staging: android: ashmem: Fix mmap size validation
HI Greg, On Tue, Jun 19, 2018 at 4:01 PM Greg Kroah-Hartman wrote: > > On Tue, Jun 19, 2018 at 03:24:44PM -0700, Alistair Strachan wrote: > > The ashmem driver did not check that the size/offset of the vma passed > > to its .mmap() function was not larger than the ashmem object being > > mapped. This could cause mmap() to succeed, even though accessing parts > > of the mapping would later fail with a segmentation fault. > > > > Ensure an error is returned by the ashmem_mmap() function if the vma > > size is larger than the ashmem object size. This enables safer handling > > of the problem in userspace. > > Is this going to break current users of this api as their original call > was succeeding? If the user created an ashmem region, set its size with ASHMEM_SET_SIZE, then mmap()ed a larger size, but did not touch the extra memory it just tried to mmap(), it could pass with the existing driver. With this patch, it would fail at mmap() time. We're not aware of users that rely on such behavior. > > > > Cc: Greg Kroah-Hartman > > Cc: Arve Hjønnevåg > > Cc: Todd Kjos > > Cc: Martijn Coenen > > Cc: de...@driverdev.osuosl.org > > Cc: kernel-t...@android.com > > Signed-off-by: Alistair Strachan > > --- > > drivers/staging/android/ashmem.c | 6 ++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/drivers/staging/android/ashmem.c > > b/drivers/staging/android/ashmem.c > > index a1a0025b59e0..1eeedb529a10 100644 > > --- a/drivers/staging/android/ashmem.c > > +++ b/drivers/staging/android/ashmem.c > > @@ -366,6 +366,12 @@ static int ashmem_mmap(struct file *file, struct > > vm_area_struct *vma) > > goto out; > > } > > > > + /* requested mapping size larger than object size */ > > + if (unlikely(vma->vm_end - vma->vm_start > PAGE_ALIGN(asma->size))) { > > + ret = -EINVAL; > > + goto out; > > + } > > Unless you can measure the speed difference, never use likely/unlikely. > The CPU and compiler almost always knows how to do this better than we > do. I know there are other checks like this in this function, those > should also be fixed as well. Sure. I'll split that out into another change and resend. > > thanks, > > greg k-h
[PATCH] staging: android: ashmem: Fix mmap size validation
The ashmem driver did not check that the size/offset of the vma passed to its .mmap() function was not larger than the ashmem object being mapped. This could cause mmap() to succeed, even though accessing parts of the mapping would later fail with a segmentation fault. Ensure an error is returned by the ashmem_mmap() function if the vma size is larger than the ashmem object size. This enables safer handling of the problem in userspace. Cc: Greg Kroah-Hartman Cc: Arve Hjønnevåg Cc: Todd Kjos Cc: Martijn Coenen Cc: de...@driverdev.osuosl.org Cc: kernel-t...@android.com Signed-off-by: Alistair Strachan --- drivers/staging/android/ashmem.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c index a1a0025b59e0..1eeedb529a10 100644 --- a/drivers/staging/android/ashmem.c +++ b/drivers/staging/android/ashmem.c @@ -366,6 +366,12 @@ static int ashmem_mmap(struct file *file, struct vm_area_struct *vma) goto out; } + /* requested mapping size larger than object size */ + if (unlikely(vma->vm_end - vma->vm_start > PAGE_ALIGN(asma->size))) { + ret = -EINVAL; + goto out; + } + /* requested protection bits must match our allowed protection mask */ if (unlikely((vma->vm_flags & ~calc_vm_prot_bits(asma->prot_mask, 0)) & calc_vm_prot_bits(PROT_MASK, 0))) {
[PATCH] staging: android: ashmem: Fix mmap size validation
The ashmem driver did not check that the size/offset of the vma passed to its .mmap() function was not larger than the ashmem object being mapped. This could cause mmap() to succeed, even though accessing parts of the mapping would later fail with a segmentation fault. Ensure an error is returned by the ashmem_mmap() function if the vma size is larger than the ashmem object size. This enables safer handling of the problem in userspace. Cc: Greg Kroah-Hartman Cc: Arve Hjønnevåg Cc: Todd Kjos Cc: Martijn Coenen Cc: de...@driverdev.osuosl.org Cc: kernel-t...@android.com Signed-off-by: Alistair Strachan --- drivers/staging/android/ashmem.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c index a1a0025b59e0..1eeedb529a10 100644 --- a/drivers/staging/android/ashmem.c +++ b/drivers/staging/android/ashmem.c @@ -366,6 +366,12 @@ static int ashmem_mmap(struct file *file, struct vm_area_struct *vma) goto out; } + /* requested mapping size larger than object size */ + if (unlikely(vma->vm_end - vma->vm_start > PAGE_ALIGN(asma->size))) { + ret = -EINVAL; + goto out; + } + /* requested protection bits must match our allowed protection mask */ if (unlikely((vma->vm_flags & ~calc_vm_prot_bits(asma->prot_mask, 0)) & calc_vm_prot_bits(PROT_MASK, 0))) {
Re: [PATCH v2] proc: Simplify and fix proc by removing the kernel mount
Hi Eric, Thanks a lot for looking into this problem. On Sat, Jun 16, 2018 at 7:55 PM Eric W. Biederman wrote: > > > Today there are three users of proc_mnt. > - The legacy sysctl system call implementation. > - The uml mconsole driver. > - The process cleanup function proc_flush_task. > > The first two are slow path and essentially unused. I expect soon we > will be able to remove the legacy sysctl system call entirely. To > keep them working for now a new wrapper file_open_proc_is added to > mount and unmount proc around file_open_root. Which nicely removes > the need for a always mounted proc instance for these cases. > > Handling proc_flush_task which is regularly used requires a little more > work. First I optimize proc_flush_task to do nothing where there is > evidence that there are no entries in proc, by looking at pid->count. > Then I carefully update proc_fill_super and proc_kill_sb to maintain a > ns->proc_super pointer to the super block for proc. This allows > proc_flush_task to find the appropriate instance of proc via rcu. > > Once the appropriate instance of proc is found in proc_flush_task > atomic_inc_not_zero is used to increase the s_active count ensuring > proc_kill_sb will not be called, until the superblock is deactivated. > This makes it safe to inspect the instance of proc and invalidate any > dentries that mention the exiting task. > > The two extra atomics operations in exit are not my favorite but given > that exit is already almost completely serialized with the task lock I > do not expect this change will be measurable. > > The benefit for all of this change is that one of the most error prone > and tricky parts of the pid namespace implementation, maintaining > kernel mounts of proc is removed. > > In addition removing the unnecessary complexity of the kernel mount > fixes a regression that caused the proc mount options to be ignored. > Now that the initial mount of proc comes from userspace, those mount > options are again honored. This fixes Android's usage of the proc > hidepid option. > > Reported-by: Alistair Strachan > Cc: sta...@vger.kernel.org > Fixes: e94591d0d90c ("proc: Convert proc_mount to use mount_ns.") > Signed-off-by: "Eric W. Biederman" > --- > > Alistair if you can test and confirm this fixes your issue I will add > your tested by and send the fix to Linus. I tested v2 with both UML and qemu-system-x86_64 / ARCH=x86_64 against 4.18-rc1, 4.14 and 4.9 and I couldn't break it. The hidepid problem is resolved, and the mount flags can now only be specified on the first userspace mount for that pid namespace. Tested-by: Alistair Strachan > Since my earlier posting I have spot tested this. Fixed a few bugs that > showed up and verified my changes work. So I think this is ready to go > unless someone looks at this and in testing or code review spots a bug. Agreed! > Eric > > arch/um/drivers/mconsole_kern.c | 4 ++-- > fs/proc/base.c | 36 ++--- > fs/proc/inode.c | 5 - > fs/proc/root.c | 28 ++--- > include/linux/pid_namespace.h | 3 +-- > include/linux/proc_ns.h | 7 ++- > kernel/pid.c| 8 > kernel/pid_namespace.c | 7 --- > kernel/sysctl_binary.c | 5 ++--- > 9 files changed, 51 insertions(+), 52 deletions(-) > > diff --git a/arch/um/drivers/mconsole_kern.c b/arch/um/drivers/mconsole_kern.c > index d5f9a2d1da1b..36af0e02d56b 100644 > --- a/arch/um/drivers/mconsole_kern.c > +++ b/arch/um/drivers/mconsole_kern.c > @@ -27,6 +27,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -124,7 +125,6 @@ void mconsole_log(struct mc_request *req) > > void mconsole_proc(struct mc_request *req) > { > - struct vfsmount *mnt = task_active_pid_ns(current)->proc_mnt; > char *buf; > int len; > struct file *file; > @@ -135,7 +135,7 @@ void mconsole_proc(struct mc_request *req) > ptr += strlen("proc"); > ptr = skip_spaces(ptr); > > - file = file_open_root(mnt->mnt_root, mnt, ptr, O_RDONLY, 0); > + file = file_open_proc(ptr, O_RDONLY, 0); > if (IS_ERR(file)) { > mconsole_reply(req, "Failed to open file", 1, 0); > printk(KERN_ERR "open /proc/%s: %ld\n", ptr, PTR_ERR(file)); > diff --git a/fs/proc/base.c b/fs/proc/base.c > index 1b2ede6abcdf..cd7b68a64ed1 100644 > --- a/fs/proc/base.c > +++ b/fs/proc/base.c > @@ -3052,7 +3052,7 @@ static const struct inode_operations > proc_tgid_base_inode_operations = { &g
Re: [PATCH v2] proc: Simplify and fix proc by removing the kernel mount
Hi Eric, Thanks a lot for looking into this problem. On Sat, Jun 16, 2018 at 7:55 PM Eric W. Biederman wrote: > > > Today there are three users of proc_mnt. > - The legacy sysctl system call implementation. > - The uml mconsole driver. > - The process cleanup function proc_flush_task. > > The first two are slow path and essentially unused. I expect soon we > will be able to remove the legacy sysctl system call entirely. To > keep them working for now a new wrapper file_open_proc_is added to > mount and unmount proc around file_open_root. Which nicely removes > the need for a always mounted proc instance for these cases. > > Handling proc_flush_task which is regularly used requires a little more > work. First I optimize proc_flush_task to do nothing where there is > evidence that there are no entries in proc, by looking at pid->count. > Then I carefully update proc_fill_super and proc_kill_sb to maintain a > ns->proc_super pointer to the super block for proc. This allows > proc_flush_task to find the appropriate instance of proc via rcu. > > Once the appropriate instance of proc is found in proc_flush_task > atomic_inc_not_zero is used to increase the s_active count ensuring > proc_kill_sb will not be called, until the superblock is deactivated. > This makes it safe to inspect the instance of proc and invalidate any > dentries that mention the exiting task. > > The two extra atomics operations in exit are not my favorite but given > that exit is already almost completely serialized with the task lock I > do not expect this change will be measurable. > > The benefit for all of this change is that one of the most error prone > and tricky parts of the pid namespace implementation, maintaining > kernel mounts of proc is removed. > > In addition removing the unnecessary complexity of the kernel mount > fixes a regression that caused the proc mount options to be ignored. > Now that the initial mount of proc comes from userspace, those mount > options are again honored. This fixes Android's usage of the proc > hidepid option. > > Reported-by: Alistair Strachan > Cc: sta...@vger.kernel.org > Fixes: e94591d0d90c ("proc: Convert proc_mount to use mount_ns.") > Signed-off-by: "Eric W. Biederman" > --- > > Alistair if you can test and confirm this fixes your issue I will add > your tested by and send the fix to Linus. I tested v2 with both UML and qemu-system-x86_64 / ARCH=x86_64 against 4.18-rc1, 4.14 and 4.9 and I couldn't break it. The hidepid problem is resolved, and the mount flags can now only be specified on the first userspace mount for that pid namespace. Tested-by: Alistair Strachan > Since my earlier posting I have spot tested this. Fixed a few bugs that > showed up and verified my changes work. So I think this is ready to go > unless someone looks at this and in testing or code review spots a bug. Agreed! > Eric > > arch/um/drivers/mconsole_kern.c | 4 ++-- > fs/proc/base.c | 36 ++--- > fs/proc/inode.c | 5 - > fs/proc/root.c | 28 ++--- > include/linux/pid_namespace.h | 3 +-- > include/linux/proc_ns.h | 7 ++- > kernel/pid.c| 8 > kernel/pid_namespace.c | 7 --- > kernel/sysctl_binary.c | 5 ++--- > 9 files changed, 51 insertions(+), 52 deletions(-) > > diff --git a/arch/um/drivers/mconsole_kern.c b/arch/um/drivers/mconsole_kern.c > index d5f9a2d1da1b..36af0e02d56b 100644 > --- a/arch/um/drivers/mconsole_kern.c > +++ b/arch/um/drivers/mconsole_kern.c > @@ -27,6 +27,7 @@ > #include > #include > #include > +#include > > #include > #include > @@ -124,7 +125,6 @@ void mconsole_log(struct mc_request *req) > > void mconsole_proc(struct mc_request *req) > { > - struct vfsmount *mnt = task_active_pid_ns(current)->proc_mnt; > char *buf; > int len; > struct file *file; > @@ -135,7 +135,7 @@ void mconsole_proc(struct mc_request *req) > ptr += strlen("proc"); > ptr = skip_spaces(ptr); > > - file = file_open_root(mnt->mnt_root, mnt, ptr, O_RDONLY, 0); > + file = file_open_proc(ptr, O_RDONLY, 0); > if (IS_ERR(file)) { > mconsole_reply(req, "Failed to open file", 1, 0); > printk(KERN_ERR "open /proc/%s: %ld\n", ptr, PTR_ERR(file)); > diff --git a/fs/proc/base.c b/fs/proc/base.c > index 1b2ede6abcdf..cd7b68a64ed1 100644 > --- a/fs/proc/base.c > +++ b/fs/proc/base.c > @@ -3052,7 +3052,7 @@ static const struct inode_operations > proc_tgid_base_inode_operations = { &g
Re: [PATCH] proc: Fix parsing of mount parameters.
On Mon, Jun 11, 2018 at 6:22 PM Eric W. Biederman wrote: > > Alistair Strachan writes: > > > In commit e94591d0d90c "proc: Convert proc_mount to use mount_ns" > > the parsing of mount parameters for the proc filesystem was broken. > > > > The SB_KERNMOUNT for procfs happens via: > > > > start_kernel() > > rest_init() > > kernel_thread() > > _do_fork() > >copy_process() > > alloc_pid() > >pid_ns_prepare_proc() > > kern_mount_data() > >proc_mount() > > mount_ns() > > > > In mount_ns(), the kernel calls proc_fill_super() only if the superblock > > has not previously been set up (i.e. the first mount reference), > > regardless of SB_KERNMOUNT. Because the call to proc_parse_options() had > > been moved inside here, and the SB_KERNMOUNT uses no mount options, the > > option parser became a no-op. > > > > When userspace later mounted procfs with e.g. hidepid=2, the options > > would be ignored. > > > > This change backs out a part of the original cleanup and parses the > > procfs mount options at every mount call. Because the options currently > > only update the pid_ns for the mount, they are applied for all mounts of > > proc by that pid or childen of that pid, instantaneously. This is the > > same behavior as the original code. > > Two years for a regression to be reported is a litte long. I think that > gets out of the kneejerk immediate fix or revert phase and into thinking > a little bout about what makes sense in this code. Android has been using hidepid=2 for a while, but most shipping products were on 3.18 or 4.4 kernels. To us, it's a new problem. > As we say with devpts there is a very real danger of someone mounting > a second instance of proc in a chroot and causing problems by either > strengthening or weakening the hid pid protections for the entire pid > namespace. If we go with your proposed change in behavior. I guess my change does change the behavior, but it's just back to the behavior which the kernel had for a good while (~v3.3 thru v4.7). > Ordinary block device filesystems (like ext4) avoid this problem by > allowing a second mount and by not parsing the mount options except > on remount. What proc currently does. IMHO, they're not really comparable. You'll only get kernmounts of an ext4 filesystem when finding rootfs, and in that case the user knows about the mount and can see it in /proc/mounts, so they know to use -o remount,. Since the first mount (where the options might have been respected) is *always* the kernmount done before init, with your change these mount options for procfs will never be respected. As userspace didn't yet mount /proc, it can't know /proc was already mounted, in order to know to use a remount to re-parse the options. The behavior was changed in a non-obvious way. > So I think it can be reasonably argued that the change in behavior is > was an unintentional fix. > > I can see an argument for failing the mount of proc if mount options > are specified or if those mount options differ from the existing mount > options. > > proc_remount's call of proc_parse_options is definitely buggy as it can > partially succeed and change the pid namespace and return an error code. > That is bad error handling. > > There may be an argument for making these options available in something > other than a mount of proc. As they are pid namespace wide. > > There may be an argument for multiple instances of proc so that it makes > sense to process these options during an ordinary mount. > > > Ultimately what I see is that this is a difficult area of semantics that > there is at least a little room for improvement on, but it is not > as simple as this proposed change. An alternative fix might be to ignore the super setup if done from a kernmount of procfs. IMO, this initial mount shouldn't be considered the first reference, because it will not pass the mount options and cannot be observed by userspace. Such a change looks complicated, though, and it would only be relevant to procfs. It might be better to roll back the cleanup and implement these semantics directly in the procfs code. > > Fixes: e94591d0d90c ("proc: Convert proc_mount to use mount_ns") > > Signed-off-by: Alistair Strachan > > Cc: Seth Forshee > > Cc: Djalal Harouni > > Cc: "Eric W. Biederman" > > Cc: kernel-t...@android.com > > Cc: linux-kernel@vger.kernel.org > > --- > > fs/proc/inode.c| 4 > > fs/proc/internal.h | 1 - > > fs/proc/root.c | 5 - > > 3 files changed, 4 insertions(+), 6
Re: [PATCH] proc: Fix parsing of mount parameters.
On Mon, Jun 11, 2018 at 6:22 PM Eric W. Biederman wrote: > > Alistair Strachan writes: > > > In commit e94591d0d90c "proc: Convert proc_mount to use mount_ns" > > the parsing of mount parameters for the proc filesystem was broken. > > > > The SB_KERNMOUNT for procfs happens via: > > > > start_kernel() > > rest_init() > > kernel_thread() > > _do_fork() > >copy_process() > > alloc_pid() > >pid_ns_prepare_proc() > > kern_mount_data() > >proc_mount() > > mount_ns() > > > > In mount_ns(), the kernel calls proc_fill_super() only if the superblock > > has not previously been set up (i.e. the first mount reference), > > regardless of SB_KERNMOUNT. Because the call to proc_parse_options() had > > been moved inside here, and the SB_KERNMOUNT uses no mount options, the > > option parser became a no-op. > > > > When userspace later mounted procfs with e.g. hidepid=2, the options > > would be ignored. > > > > This change backs out a part of the original cleanup and parses the > > procfs mount options at every mount call. Because the options currently > > only update the pid_ns for the mount, they are applied for all mounts of > > proc by that pid or childen of that pid, instantaneously. This is the > > same behavior as the original code. > > Two years for a regression to be reported is a litte long. I think that > gets out of the kneejerk immediate fix or revert phase and into thinking > a little bout about what makes sense in this code. Android has been using hidepid=2 for a while, but most shipping products were on 3.18 or 4.4 kernels. To us, it's a new problem. > As we say with devpts there is a very real danger of someone mounting > a second instance of proc in a chroot and causing problems by either > strengthening or weakening the hid pid protections for the entire pid > namespace. If we go with your proposed change in behavior. I guess my change does change the behavior, but it's just back to the behavior which the kernel had for a good while (~v3.3 thru v4.7). > Ordinary block device filesystems (like ext4) avoid this problem by > allowing a second mount and by not parsing the mount options except > on remount. What proc currently does. IMHO, they're not really comparable. You'll only get kernmounts of an ext4 filesystem when finding rootfs, and in that case the user knows about the mount and can see it in /proc/mounts, so they know to use -o remount,. Since the first mount (where the options might have been respected) is *always* the kernmount done before init, with your change these mount options for procfs will never be respected. As userspace didn't yet mount /proc, it can't know /proc was already mounted, in order to know to use a remount to re-parse the options. The behavior was changed in a non-obvious way. > So I think it can be reasonably argued that the change in behavior is > was an unintentional fix. > > I can see an argument for failing the mount of proc if mount options > are specified or if those mount options differ from the existing mount > options. > > proc_remount's call of proc_parse_options is definitely buggy as it can > partially succeed and change the pid namespace and return an error code. > That is bad error handling. > > There may be an argument for making these options available in something > other than a mount of proc. As they are pid namespace wide. > > There may be an argument for multiple instances of proc so that it makes > sense to process these options during an ordinary mount. > > > Ultimately what I see is that this is a difficult area of semantics that > there is at least a little room for improvement on, but it is not > as simple as this proposed change. An alternative fix might be to ignore the super setup if done from a kernmount of procfs. IMO, this initial mount shouldn't be considered the first reference, because it will not pass the mount options and cannot be observed by userspace. Such a change looks complicated, though, and it would only be relevant to procfs. It might be better to roll back the cleanup and implement these semantics directly in the procfs code. > > Fixes: e94591d0d90c ("proc: Convert proc_mount to use mount_ns") > > Signed-off-by: Alistair Strachan > > Cc: Seth Forshee > > Cc: Djalal Harouni > > Cc: "Eric W. Biederman" > > Cc: kernel-t...@android.com > > Cc: linux-kernel@vger.kernel.org > > --- > > fs/proc/inode.c| 4 > > fs/proc/internal.h | 1 - > > fs/proc/root.c | 5 - > > 3 files changed, 4 insertions(+), 6
[PATCH] proc: Fix parsing of mount parameters.
In commit e94591d0d90c "proc: Convert proc_mount to use mount_ns" the parsing of mount parameters for the proc filesystem was broken. The SB_KERNMOUNT for procfs happens via: start_kernel() rest_init() kernel_thread() _do_fork() copy_process() alloc_pid() pid_ns_prepare_proc() kern_mount_data() proc_mount() mount_ns() In mount_ns(), the kernel calls proc_fill_super() only if the superblock has not previously been set up (i.e. the first mount reference), regardless of SB_KERNMOUNT. Because the call to proc_parse_options() had been moved inside here, and the SB_KERNMOUNT uses no mount options, the option parser became a no-op. When userspace later mounted procfs with e.g. hidepid=2, the options would be ignored. This change backs out a part of the original cleanup and parses the procfs mount options at every mount call. Because the options currently only update the pid_ns for the mount, they are applied for all mounts of proc by that pid or childen of that pid, instantaneously. This is the same behavior as the original code. Fixes: e94591d0d90c ("proc: Convert proc_mount to use mount_ns") Signed-off-by: Alistair Strachan Cc: Seth Forshee Cc: Djalal Harouni Cc: "Eric W. Biederman" Cc: kernel-t...@android.com Cc: linux-kernel@vger.kernel.org --- fs/proc/inode.c| 4 fs/proc/internal.h | 1 - fs/proc/root.c | 5 - 3 files changed, 4 insertions(+), 6 deletions(-) diff --git a/fs/proc/inode.c b/fs/proc/inode.c index 2cf3b74391ca..f348be0a 100644 --- a/fs/proc/inode.c +++ b/fs/proc/inode.c @@ -492,13 +492,9 @@ struct inode *proc_get_inode(struct super_block *sb, struct proc_dir_entry *de) int proc_fill_super(struct super_block *s, void *data, int silent) { - struct pid_namespace *ns = get_pid_ns(s->s_fs_info); struct inode *root_inode; int ret; - if (!proc_parse_options(data, ns)) - return -EINVAL; - /* User space would break if executables or devices appear on proc */ s->s_iflags |= SB_I_USERNS_VISIBLE | SB_I_NOEXEC | SB_I_NODEV; s->s_flags |= SB_NODIRATIME | SB_NOSUID | SB_NOEXEC; diff --git a/fs/proc/internal.h b/fs/proc/internal.h index 50cb22a08c2f..89b7e845b000 100644 --- a/fs/proc/internal.h +++ b/fs/proc/internal.h @@ -264,7 +264,6 @@ static inline void proc_tty_init(void) {} * root.c */ extern struct proc_dir_entry proc_root; -extern int proc_parse_options(char *options, struct pid_namespace *pid); extern void proc_self_init(void); extern int proc_remount(struct super_block *, int *, char *); diff --git a/fs/proc/root.c b/fs/proc/root.c index 61b7340b357a..d40676a5dd6c 100644 --- a/fs/proc/root.c +++ b/fs/proc/root.c @@ -36,7 +36,7 @@ static const match_table_t tokens = { {Opt_err, NULL}, }; -int proc_parse_options(char *options, struct pid_namespace *pid) +static int proc_parse_options(char *options, struct pid_namespace *pid) { char *p; substring_t args[MAX_OPT_ARGS]; @@ -98,6 +98,9 @@ static struct dentry *proc_mount(struct file_system_type *fs_type, ns = task_active_pid_ns(current); } + if (!proc_parse_options(data, ns)) + return ERR_PTR(-EINVAL); + return mount_ns(fs_type, flags, data, ns, ns->user_ns, proc_fill_super); } -- 2.18.0.rc1.242.g61856ae69a-goog
[PATCH] proc: Fix parsing of mount parameters.
In commit e94591d0d90c "proc: Convert proc_mount to use mount_ns" the parsing of mount parameters for the proc filesystem was broken. The SB_KERNMOUNT for procfs happens via: start_kernel() rest_init() kernel_thread() _do_fork() copy_process() alloc_pid() pid_ns_prepare_proc() kern_mount_data() proc_mount() mount_ns() In mount_ns(), the kernel calls proc_fill_super() only if the superblock has not previously been set up (i.e. the first mount reference), regardless of SB_KERNMOUNT. Because the call to proc_parse_options() had been moved inside here, and the SB_KERNMOUNT uses no mount options, the option parser became a no-op. When userspace later mounted procfs with e.g. hidepid=2, the options would be ignored. This change backs out a part of the original cleanup and parses the procfs mount options at every mount call. Because the options currently only update the pid_ns for the mount, they are applied for all mounts of proc by that pid or childen of that pid, instantaneously. This is the same behavior as the original code. Fixes: e94591d0d90c ("proc: Convert proc_mount to use mount_ns") Signed-off-by: Alistair Strachan Cc: Seth Forshee Cc: Djalal Harouni Cc: "Eric W. Biederman" Cc: kernel-t...@android.com Cc: linux-kernel@vger.kernel.org --- fs/proc/inode.c| 4 fs/proc/internal.h | 1 - fs/proc/root.c | 5 - 3 files changed, 4 insertions(+), 6 deletions(-) diff --git a/fs/proc/inode.c b/fs/proc/inode.c index 2cf3b74391ca..f348be0a 100644 --- a/fs/proc/inode.c +++ b/fs/proc/inode.c @@ -492,13 +492,9 @@ struct inode *proc_get_inode(struct super_block *sb, struct proc_dir_entry *de) int proc_fill_super(struct super_block *s, void *data, int silent) { - struct pid_namespace *ns = get_pid_ns(s->s_fs_info); struct inode *root_inode; int ret; - if (!proc_parse_options(data, ns)) - return -EINVAL; - /* User space would break if executables or devices appear on proc */ s->s_iflags |= SB_I_USERNS_VISIBLE | SB_I_NOEXEC | SB_I_NODEV; s->s_flags |= SB_NODIRATIME | SB_NOSUID | SB_NOEXEC; diff --git a/fs/proc/internal.h b/fs/proc/internal.h index 50cb22a08c2f..89b7e845b000 100644 --- a/fs/proc/internal.h +++ b/fs/proc/internal.h @@ -264,7 +264,6 @@ static inline void proc_tty_init(void) {} * root.c */ extern struct proc_dir_entry proc_root; -extern int proc_parse_options(char *options, struct pid_namespace *pid); extern void proc_self_init(void); extern int proc_remount(struct super_block *, int *, char *); diff --git a/fs/proc/root.c b/fs/proc/root.c index 61b7340b357a..d40676a5dd6c 100644 --- a/fs/proc/root.c +++ b/fs/proc/root.c @@ -36,7 +36,7 @@ static const match_table_t tokens = { {Opt_err, NULL}, }; -int proc_parse_options(char *options, struct pid_namespace *pid) +static int proc_parse_options(char *options, struct pid_namespace *pid) { char *p; substring_t args[MAX_OPT_ARGS]; @@ -98,6 +98,9 @@ static struct dentry *proc_mount(struct file_system_type *fs_type, ns = task_active_pid_ns(current); } + if (!proc_parse_options(data, ns)) + return ERR_PTR(-EINVAL); + return mount_ns(fs_type, flags, data, ns, ns->user_ns, proc_fill_super); } -- 2.18.0.rc1.242.g61856ae69a-goog
Re: [clang] stack protector and f1f029c7bf
(Resent plain text) On Thu, May 24, 2018 at 11:24 AM Nick Desaulnierswrote: > On Thu, May 24, 2018 at 11:20 AM wrote: > > A stack canary on an *inlined* function? That's bound to break things > elsewhere too sooner or later. > But it's *not* inlined by GCC or Clang. FWIW, GCC can also insert a stack guard in an out-of-lined inline function, it just doesn't for this one. The -fstack-protector and -fstack-protector-strong flags are heuristic and the heuristic does not match between gcc and clang for this function. It is working on GCC purely by chance. > While the function is marked `static inline`, it's not in > arch/x86/kernel/paravirt.o due to: > arch/x86/kernel/paravirt.c:326 > 325 __visible struct pv_irq_ops pv_irq_ops = { > 326 .save_fl = __PV_IS_CALLEE_SAVE(native_save_fl), > see comparison of disassembly attached in: > https://bugs.llvm.org/attachment.cgi?id=20338 > -- > Thanks, > ~Nick Desaulniers
Re: [clang] stack protector and f1f029c7bf
(Resent plain text) On Thu, May 24, 2018 at 11:24 AM Nick Desaulniers wrote: > On Thu, May 24, 2018 at 11:20 AM wrote: > > A stack canary on an *inlined* function? That's bound to break things > elsewhere too sooner or later. > But it's *not* inlined by GCC or Clang. FWIW, GCC can also insert a stack guard in an out-of-lined inline function, it just doesn't for this one. The -fstack-protector and -fstack-protector-strong flags are heuristic and the heuristic does not match between gcc and clang for this function. It is working on GCC purely by chance. > While the function is marked `static inline`, it's not in > arch/x86/kernel/paravirt.o due to: > arch/x86/kernel/paravirt.c:326 > 325 __visible struct pv_irq_ops pv_irq_ops = { > 326 .save_fl = __PV_IS_CALLEE_SAVE(native_save_fl), > see comparison of disassembly attached in: > https://bugs.llvm.org/attachment.cgi?id=20338 > -- > Thanks, > ~Nick Desaulniers
[PATCH 1/3] staging: Android: vsoc: Create wc kernel mapping for region shm.
Map the region shm as write-combining instead of uncachable. Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org> Cc: Arve Hjønnevåg <a...@android.com> Cc: Todd Kjos <tk...@android.com> Cc: Martijn Coenen <m...@android.com> Cc: Greg Hartman <ghart...@google.com> Cc: de...@driverdev.osuosl.org Cc: kernel-t...@android.com Signed-off-by: Alistair Strachan <astrac...@google.com> --- drivers/staging/android/TODO | 1 - drivers/staging/android/vsoc.c | 4 +--- 2 files changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/staging/android/TODO b/drivers/staging/android/TODO index 2ea6f97b8f0f..ebd6ba3ae02e 100644 --- a/drivers/staging/android/TODO +++ b/drivers/staging/android/TODO @@ -18,7 +18,6 @@ vsoc.c, uapi/vsoc_shm.h waiting threads. We should eventually use multiple queues and select the queue based on the region. - Add debugfs support for examining the permissions of regions. - - Use ioremap_wc instead of ioremap_nocache. - Remove VSOC_WAIT_FOR_INCOMING_INTERRUPT ioctl. This functionality has been superseded by the futex and is there for legacy reasons. diff --git a/drivers/staging/android/vsoc.c b/drivers/staging/android/vsoc.c index 587c66d709b9..794137b7751f 100644 --- a/drivers/staging/android/vsoc.c +++ b/drivers/staging/android/vsoc.c @@ -802,9 +802,7 @@ static int vsoc_probe_device(struct pci_dev *pdev, dev_info(>dev, "shared memory @ DMA %p size=0x%zx\n", (void *)vsoc_dev.shm_phys_start, vsoc_dev.shm_size); - /* TODO(ghartman): ioremap_wc should work here */ - vsoc_dev.kernel_mapped_shm = ioremap_nocache( - vsoc_dev.shm_phys_start, vsoc_dev.shm_size); + vsoc_dev.kernel_mapped_shm = pci_iomap_wc(pdev, SHARED_MEMORY_BAR, 0); if (!vsoc_dev.kernel_mapped_shm) { dev_err(_dev.dev->dev, "cannot iomap region\n"); vsoc_remove_device(pdev);
[PATCH 1/3] staging: Android: vsoc: Create wc kernel mapping for region shm.
Map the region shm as write-combining instead of uncachable. Cc: Greg Kroah-Hartman Cc: Arve Hjønnevåg Cc: Todd Kjos Cc: Martijn Coenen Cc: Greg Hartman Cc: de...@driverdev.osuosl.org Cc: kernel-t...@android.com Signed-off-by: Alistair Strachan --- drivers/staging/android/TODO | 1 - drivers/staging/android/vsoc.c | 4 +--- 2 files changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/staging/android/TODO b/drivers/staging/android/TODO index 2ea6f97b8f0f..ebd6ba3ae02e 100644 --- a/drivers/staging/android/TODO +++ b/drivers/staging/android/TODO @@ -18,7 +18,6 @@ vsoc.c, uapi/vsoc_shm.h waiting threads. We should eventually use multiple queues and select the queue based on the region. - Add debugfs support for examining the permissions of regions. - - Use ioremap_wc instead of ioremap_nocache. - Remove VSOC_WAIT_FOR_INCOMING_INTERRUPT ioctl. This functionality has been superseded by the futex and is there for legacy reasons. diff --git a/drivers/staging/android/vsoc.c b/drivers/staging/android/vsoc.c index 587c66d709b9..794137b7751f 100644 --- a/drivers/staging/android/vsoc.c +++ b/drivers/staging/android/vsoc.c @@ -802,9 +802,7 @@ static int vsoc_probe_device(struct pci_dev *pdev, dev_info(>dev, "shared memory @ DMA %p size=0x%zx\n", (void *)vsoc_dev.shm_phys_start, vsoc_dev.shm_size); - /* TODO(ghartman): ioremap_wc should work here */ - vsoc_dev.kernel_mapped_shm = ioremap_nocache( - vsoc_dev.shm_phys_start, vsoc_dev.shm_size); + vsoc_dev.kernel_mapped_shm = pci_iomap_wc(pdev, SHARED_MEMORY_BAR, 0); if (!vsoc_dev.kernel_mapped_shm) { dev_err(_dev.dev->dev, "cannot iomap region\n"); vsoc_remove_device(pdev);
[PATCH 2/3] staging: Android: vsoc: Fix a i386-randconfig warning.
Fix "warning: cast to pointer from integer of different size" when printing the region shm physical address. Use the %pa conversion specifier and pass the resource by reference. Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org> Cc: Arve Hjønnevåg <a...@android.com> Cc: Todd Kjos <tk...@android.com> Cc: Martijn Coenen <m...@android.com> Cc: Greg Hartman <ghart...@google.com> Cc: de...@driverdev.osuosl.org Cc: kernel-t...@android.com Signed-off-by: Alistair Strachan <astrac...@google.com> --- drivers/staging/android/vsoc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/staging/android/vsoc.c b/drivers/staging/android/vsoc.c index 794137b7751f..3e6e4af7d6a1 100644 --- a/drivers/staging/android/vsoc.c +++ b/drivers/staging/android/vsoc.c @@ -800,8 +800,8 @@ static int vsoc_probe_device(struct pci_dev *pdev, vsoc_dev.shm_phys_start = pci_resource_start(pdev, SHARED_MEMORY_BAR); vsoc_dev.shm_size = pci_resource_len(pdev, SHARED_MEMORY_BAR); - dev_info(>dev, "shared memory @ DMA %p size=0x%zx\n", -(void *)vsoc_dev.shm_phys_start, vsoc_dev.shm_size); + dev_info(>dev, "shared memory @ DMA %pa size=0x%zx\n", +_dev.shm_phys_start, vsoc_dev.shm_size); vsoc_dev.kernel_mapped_shm = pci_iomap_wc(pdev, SHARED_MEMORY_BAR, 0); if (!vsoc_dev.kernel_mapped_shm) { dev_err(_dev.dev->dev, "cannot iomap region\n");
[PATCH 2/3] staging: Android: vsoc: Fix a i386-randconfig warning.
Fix "warning: cast to pointer from integer of different size" when printing the region shm physical address. Use the %pa conversion specifier and pass the resource by reference. Cc: Greg Kroah-Hartman Cc: Arve Hjønnevåg Cc: Todd Kjos Cc: Martijn Coenen Cc: Greg Hartman Cc: de...@driverdev.osuosl.org Cc: kernel-t...@android.com Signed-off-by: Alistair Strachan --- drivers/staging/android/vsoc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/staging/android/vsoc.c b/drivers/staging/android/vsoc.c index 794137b7751f..3e6e4af7d6a1 100644 --- a/drivers/staging/android/vsoc.c +++ b/drivers/staging/android/vsoc.c @@ -800,8 +800,8 @@ static int vsoc_probe_device(struct pci_dev *pdev, vsoc_dev.shm_phys_start = pci_resource_start(pdev, SHARED_MEMORY_BAR); vsoc_dev.shm_size = pci_resource_len(pdev, SHARED_MEMORY_BAR); - dev_info(>dev, "shared memory @ DMA %p size=0x%zx\n", -(void *)vsoc_dev.shm_phys_start, vsoc_dev.shm_size); + dev_info(>dev, "shared memory @ DMA %pa size=0x%zx\n", +_dev.shm_phys_start, vsoc_dev.shm_size); vsoc_dev.kernel_mapped_shm = pci_iomap_wc(pdev, SHARED_MEMORY_BAR, 0); if (!vsoc_dev.kernel_mapped_shm) { dev_err(_dev.dev->dev, "cannot iomap region\n");
[PATCH 3/3] staging: Android: Fix sparse warnings in vsoc driver.
Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org> Cc: Arve Hjønnevåg <a...@android.com> Cc: Todd Kjos <tk...@android.com> Cc: Martijn Coenen <m...@android.com> Cc: Greg Hartman <ghart...@google.com> Cc: de...@driverdev.osuosl.org Cc: kernel-t...@android.com Signed-off-by: Alistair Strachan <astrac...@google.com> --- drivers/staging/android/vsoc.c | 100 - 1 file changed, 49 insertions(+), 51 deletions(-) diff --git a/drivers/staging/android/vsoc.c b/drivers/staging/android/vsoc.c index 3e6e4af7d6a1..954ed2c5d807 100644 --- a/drivers/staging/android/vsoc.c +++ b/drivers/staging/android/vsoc.c @@ -81,8 +81,8 @@ struct vsoc_region_data { atomic_t *incoming_signalled; /* Flag indicating the guest has signalled the host. */ atomic_t *outgoing_signalled; - int irq_requested; - int device_created; + bool irq_requested; + bool device_created; }; struct vsoc_device { @@ -91,7 +91,7 @@ struct vsoc_device { /* Physical address of SHARED_MEMORY_BAR. */ phys_addr_t shm_phys_start; /* Kernel virtual address of SHARED_MEMORY_BAR. */ - void *kernel_mapped_shm; + void __iomem *kernel_mapped_shm; /* Size of the entire shared memory window in bytes. */ size_t shm_size; /* @@ -116,22 +116,23 @@ struct vsoc_device { * vsoc_region_data because the kernel deals with them as an array. */ struct msix_entry *msix_entries; - /* -* Flags that indicate what we've initialzied. These are used to do an -* orderly cleanup of the device. -*/ - char enabled_device; - char requested_regions; - char cdev_added; - char class_added; - char msix_enabled; /* Mutex that protectes the permission list */ struct mutex mtx; /* Major number assigned by the kernel */ int major; - + /* Character device assigned by the kernel */ struct cdev cdev; + /* Device class assigned by the kernel */ struct class *class; + /* +* Flags that indicate what we've initialized. These are used to do an +* orderly cleanup of the device. +*/ + bool enabled_device; + bool requested_regions; + bool cdev_added; + bool class_added; + bool msix_enabled; }; static struct vsoc_device vsoc_dev; @@ -153,13 +154,13 @@ static long vsoc_ioctl(struct file *, unsigned int, unsigned long); static int vsoc_mmap(struct file *, struct vm_area_struct *); static int vsoc_open(struct inode *, struct file *); static int vsoc_release(struct inode *, struct file *); -static ssize_t vsoc_read(struct file *, char *, size_t, loff_t *); -static ssize_t vsoc_write(struct file *, const char *, size_t, loff_t *); +static ssize_t vsoc_read(struct file *, char __user *, size_t, loff_t *); +static ssize_t vsoc_write(struct file *, const char __user *, size_t, loff_t *); static loff_t vsoc_lseek(struct file *filp, loff_t offset, int origin); static int do_create_fd_scoped_permission( struct vsoc_device_region *region_p, struct fd_scoped_permission_node *np, - struct fd_scoped_permission_arg *__user arg); + struct fd_scoped_permission_arg __user *arg); static void do_destroy_fd_scoped_permission( struct vsoc_device_region *owner_region_p, struct fd_scoped_permission *perm); @@ -198,7 +199,7 @@ inline int vsoc_validate_filep(struct file *filp) /* Converts from shared memory offset to virtual address */ static inline void *shm_off_to_virtual_addr(__u32 offset) { - return vsoc_dev.kernel_mapped_shm + offset; + return (void __force *)vsoc_dev.kernel_mapped_shm + offset; } /* Converts from shared memory offset to physical address */ @@ -261,7 +262,7 @@ static struct pci_driver vsoc_pci_driver = { static int do_create_fd_scoped_permission( struct vsoc_device_region *region_p, struct fd_scoped_permission_node *np, - struct fd_scoped_permission_arg *__user arg) + struct fd_scoped_permission_arg __user *arg) { struct file *managed_filp; s32 managed_fd; @@ -632,11 +633,11 @@ static long vsoc_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) return 0; } -static ssize_t vsoc_read(struct file *filp, char *buffer, size_t len, +static ssize_t vsoc_read(struct file *filp, char __user *buffer, size_t len, loff_t *poffset) { __u32 area_off; - void *area_p; + const void *area_p; ssize_t area_len; int retval = vsoc_validate_filep(filp); @@ -706,7 +707,7 @@ static loff_t vsoc_lseek(struct file *filp, loff_t offset, int origin) return offset; } -static ssize_t vsoc_write(struct file *filp, const char *buffer, +static ssize_t vsoc_write(struct file *filp, const char __user *buffer, size_t len, loff_t
[PATCH 3/3] staging: Android: Fix sparse warnings in vsoc driver.
Cc: Greg Kroah-Hartman Cc: Arve Hjønnevåg Cc: Todd Kjos Cc: Martijn Coenen Cc: Greg Hartman Cc: de...@driverdev.osuosl.org Cc: kernel-t...@android.com Signed-off-by: Alistair Strachan --- drivers/staging/android/vsoc.c | 100 - 1 file changed, 49 insertions(+), 51 deletions(-) diff --git a/drivers/staging/android/vsoc.c b/drivers/staging/android/vsoc.c index 3e6e4af7d6a1..954ed2c5d807 100644 --- a/drivers/staging/android/vsoc.c +++ b/drivers/staging/android/vsoc.c @@ -81,8 +81,8 @@ struct vsoc_region_data { atomic_t *incoming_signalled; /* Flag indicating the guest has signalled the host. */ atomic_t *outgoing_signalled; - int irq_requested; - int device_created; + bool irq_requested; + bool device_created; }; struct vsoc_device { @@ -91,7 +91,7 @@ struct vsoc_device { /* Physical address of SHARED_MEMORY_BAR. */ phys_addr_t shm_phys_start; /* Kernel virtual address of SHARED_MEMORY_BAR. */ - void *kernel_mapped_shm; + void __iomem *kernel_mapped_shm; /* Size of the entire shared memory window in bytes. */ size_t shm_size; /* @@ -116,22 +116,23 @@ struct vsoc_device { * vsoc_region_data because the kernel deals with them as an array. */ struct msix_entry *msix_entries; - /* -* Flags that indicate what we've initialzied. These are used to do an -* orderly cleanup of the device. -*/ - char enabled_device; - char requested_regions; - char cdev_added; - char class_added; - char msix_enabled; /* Mutex that protectes the permission list */ struct mutex mtx; /* Major number assigned by the kernel */ int major; - + /* Character device assigned by the kernel */ struct cdev cdev; + /* Device class assigned by the kernel */ struct class *class; + /* +* Flags that indicate what we've initialized. These are used to do an +* orderly cleanup of the device. +*/ + bool enabled_device; + bool requested_regions; + bool cdev_added; + bool class_added; + bool msix_enabled; }; static struct vsoc_device vsoc_dev; @@ -153,13 +154,13 @@ static long vsoc_ioctl(struct file *, unsigned int, unsigned long); static int vsoc_mmap(struct file *, struct vm_area_struct *); static int vsoc_open(struct inode *, struct file *); static int vsoc_release(struct inode *, struct file *); -static ssize_t vsoc_read(struct file *, char *, size_t, loff_t *); -static ssize_t vsoc_write(struct file *, const char *, size_t, loff_t *); +static ssize_t vsoc_read(struct file *, char __user *, size_t, loff_t *); +static ssize_t vsoc_write(struct file *, const char __user *, size_t, loff_t *); static loff_t vsoc_lseek(struct file *filp, loff_t offset, int origin); static int do_create_fd_scoped_permission( struct vsoc_device_region *region_p, struct fd_scoped_permission_node *np, - struct fd_scoped_permission_arg *__user arg); + struct fd_scoped_permission_arg __user *arg); static void do_destroy_fd_scoped_permission( struct vsoc_device_region *owner_region_p, struct fd_scoped_permission *perm); @@ -198,7 +199,7 @@ inline int vsoc_validate_filep(struct file *filp) /* Converts from shared memory offset to virtual address */ static inline void *shm_off_to_virtual_addr(__u32 offset) { - return vsoc_dev.kernel_mapped_shm + offset; + return (void __force *)vsoc_dev.kernel_mapped_shm + offset; } /* Converts from shared memory offset to physical address */ @@ -261,7 +262,7 @@ static struct pci_driver vsoc_pci_driver = { static int do_create_fd_scoped_permission( struct vsoc_device_region *region_p, struct fd_scoped_permission_node *np, - struct fd_scoped_permission_arg *__user arg) + struct fd_scoped_permission_arg __user *arg) { struct file *managed_filp; s32 managed_fd; @@ -632,11 +633,11 @@ static long vsoc_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) return 0; } -static ssize_t vsoc_read(struct file *filp, char *buffer, size_t len, +static ssize_t vsoc_read(struct file *filp, char __user *buffer, size_t len, loff_t *poffset) { __u32 area_off; - void *area_p; + const void *area_p; ssize_t area_len; int retval = vsoc_validate_filep(filp); @@ -706,7 +707,7 @@ static loff_t vsoc_lseek(struct file *filp, loff_t offset, int origin) return offset; } -static ssize_t vsoc_write(struct file *filp, const char *buffer, +static ssize_t vsoc_write(struct file *filp, const char __user *buffer, size_t len, loff_t *poffset) { __u32 area_off; @@ -772,14 +773,14 @@ static int vsoc_probe_device(struct pci_dev *pdev, pci_name(pdev), result
[PATCH v2] staging: Android: Add 'vsoc' driver for cuttlefish.
From: Greg Hartman <ghart...@google.com> The cuttlefish system is a virtual SoC architecture based on QEMU. It uses the QEMU ivshmem feature to share memory regions between guest and host with a custom protocol. Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org> Cc: Arve Hjønnevåg <a...@android.com> Cc: Todd Kjos <tk...@android.com> Cc: Martijn Coenen <m...@android.com> Cc: Dan Carpenter <dan.carpen...@oracle.com> Cc: de...@driverdev.osuosl.org Cc: kernel-t...@android.com Signed-off-by: Greg Hartman <ghart...@google.com> [astrachan: rebased against 4.16, added TODO, fixed checkpatch issues] Signed-off-by: Alistair Strachan <astrac...@google.com> --- v2: addressed issues with class_create() failure handling and redundant null pointer checks noticed by Dan Carpenter drivers/staging/android/Kconfig |9 + drivers/staging/android/Makefile|1 + drivers/staging/android/TODO| 10 + drivers/staging/android/uapi/vsoc_shm.h | 303 ++ drivers/staging/android/vsoc.c | 1169 +++ 5 files changed, 1492 insertions(+) create mode 100644 drivers/staging/android/uapi/vsoc_shm.h create mode 100644 drivers/staging/android/vsoc.c diff --git a/drivers/staging/android/Kconfig b/drivers/staging/android/Kconfig index 71a50b99caff..29d891355f7a 100644 --- a/drivers/staging/android/Kconfig +++ b/drivers/staging/android/Kconfig @@ -14,6 +14,15 @@ config ASHMEM It is, in theory, a good memory allocator for low-memory devices, because it can discard shared memory units when under memory pressure. +config ANDROID_VSOC + tristate "Android Virtual SoC support" + default n + depends on PCI_MSI + ---help--- + This option adds support for the Virtual SoC driver needed to boot + a 'cuttlefish' Android image inside QEmu. The driver interacts with + a QEmu ivshmem device. If built as a module, it will be called vsoc. + source "drivers/staging/android/ion/Kconfig" endif # if ANDROID diff --git a/drivers/staging/android/Makefile b/drivers/staging/android/Makefile index 7cf1564a49a5..90e6154f11a4 100644 --- a/drivers/staging/android/Makefile +++ b/drivers/staging/android/Makefile @@ -3,3 +3,4 @@ ccflags-y += -I$(src) # needed for trace events obj-y += ion/ obj-$(CONFIG_ASHMEM) += ashmem.o +obj-$(CONFIG_ANDROID_VSOC) += vsoc.o diff --git a/drivers/staging/android/TODO b/drivers/staging/android/TODO index 687e0eac85bf..6aab759efa48 100644 --- a/drivers/staging/android/TODO +++ b/drivers/staging/android/TODO @@ -11,5 +11,15 @@ ion/ - Split /dev/ion up into multiple nodes (e.g. /dev/ion/heap0) - Better test framework (integration with VGEM was suggested) +vsoc.c, uapi/vsoc_shm.h + - The current driver uses the same wait queue for all of the futexes in a + region. This will cause false wakeups in regions with a large number of + waiting threads. We should eventually use multiple queues and select the + queue based on the region. + - Add debugfs support for examining the permissions of regions. + - Use ioremap_wc instead of ioremap_nocache. + - Remove VSOC_WAIT_FOR_INCOMING_INTERRUPT ioctl. This functionality has been + superseded by the futex and is there for legacy reasons. + Please send patches to Greg Kroah-Hartman <g...@kroah.com> and Cc: Arve Hjønnevåg <a...@android.com> and Riley Andrews <riandr...@android.com> diff --git a/drivers/staging/android/uapi/vsoc_shm.h b/drivers/staging/android/uapi/vsoc_shm.h new file mode 100644 index ..741b1387c25b --- /dev/null +++ b/drivers/staging/android/uapi/vsoc_shm.h @@ -0,0 +1,303 @@ +/* + * Copyright (C) 2017 Google, Inc. + * + * This software is licensed under the terms of the GNU General Public + * License version 2, as published by the Free Software Foundation, and + * may be copied, distributed, and modified under those terms. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#ifndef _UAPI_LINUX_VSOC_SHM_H +#define _UAPI_LINUX_VSOC_SHM_H + +#include + +/** + * A permission is a token that permits a receiver to read and/or write an area + * of memory within a Vsoc region. + * + * An fd_scoped permission grants both read and write access, and can be + * attached to a file description (see open(2)). + * Ownership of the area can then be shared by passing a file descriptor + * among processes. + * + * begin_offset and end_offset define the area of memory that is controlled by + * the permission. owner_offset points to a word, also in shared memory, that + * controls ownership of the area. + * + * ownership of the region expires when the assoc
[PATCH v2] staging: Android: Add 'vsoc' driver for cuttlefish.
From: Greg Hartman The cuttlefish system is a virtual SoC architecture based on QEMU. It uses the QEMU ivshmem feature to share memory regions between guest and host with a custom protocol. Cc: Greg Kroah-Hartman Cc: Arve Hjønnevåg Cc: Todd Kjos Cc: Martijn Coenen Cc: Dan Carpenter Cc: de...@driverdev.osuosl.org Cc: kernel-t...@android.com Signed-off-by: Greg Hartman [astrachan: rebased against 4.16, added TODO, fixed checkpatch issues] Signed-off-by: Alistair Strachan --- v2: addressed issues with class_create() failure handling and redundant null pointer checks noticed by Dan Carpenter drivers/staging/android/Kconfig |9 + drivers/staging/android/Makefile|1 + drivers/staging/android/TODO| 10 + drivers/staging/android/uapi/vsoc_shm.h | 303 ++ drivers/staging/android/vsoc.c | 1169 +++ 5 files changed, 1492 insertions(+) create mode 100644 drivers/staging/android/uapi/vsoc_shm.h create mode 100644 drivers/staging/android/vsoc.c diff --git a/drivers/staging/android/Kconfig b/drivers/staging/android/Kconfig index 71a50b99caff..29d891355f7a 100644 --- a/drivers/staging/android/Kconfig +++ b/drivers/staging/android/Kconfig @@ -14,6 +14,15 @@ config ASHMEM It is, in theory, a good memory allocator for low-memory devices, because it can discard shared memory units when under memory pressure. +config ANDROID_VSOC + tristate "Android Virtual SoC support" + default n + depends on PCI_MSI + ---help--- + This option adds support for the Virtual SoC driver needed to boot + a 'cuttlefish' Android image inside QEmu. The driver interacts with + a QEmu ivshmem device. If built as a module, it will be called vsoc. + source "drivers/staging/android/ion/Kconfig" endif # if ANDROID diff --git a/drivers/staging/android/Makefile b/drivers/staging/android/Makefile index 7cf1564a49a5..90e6154f11a4 100644 --- a/drivers/staging/android/Makefile +++ b/drivers/staging/android/Makefile @@ -3,3 +3,4 @@ ccflags-y += -I$(src) # needed for trace events obj-y += ion/ obj-$(CONFIG_ASHMEM) += ashmem.o +obj-$(CONFIG_ANDROID_VSOC) += vsoc.o diff --git a/drivers/staging/android/TODO b/drivers/staging/android/TODO index 687e0eac85bf..6aab759efa48 100644 --- a/drivers/staging/android/TODO +++ b/drivers/staging/android/TODO @@ -11,5 +11,15 @@ ion/ - Split /dev/ion up into multiple nodes (e.g. /dev/ion/heap0) - Better test framework (integration with VGEM was suggested) +vsoc.c, uapi/vsoc_shm.h + - The current driver uses the same wait queue for all of the futexes in a + region. This will cause false wakeups in regions with a large number of + waiting threads. We should eventually use multiple queues and select the + queue based on the region. + - Add debugfs support for examining the permissions of regions. + - Use ioremap_wc instead of ioremap_nocache. + - Remove VSOC_WAIT_FOR_INCOMING_INTERRUPT ioctl. This functionality has been + superseded by the futex and is there for legacy reasons. + Please send patches to Greg Kroah-Hartman and Cc: Arve Hjønnevåg and Riley Andrews diff --git a/drivers/staging/android/uapi/vsoc_shm.h b/drivers/staging/android/uapi/vsoc_shm.h new file mode 100644 index ..741b1387c25b --- /dev/null +++ b/drivers/staging/android/uapi/vsoc_shm.h @@ -0,0 +1,303 @@ +/* + * Copyright (C) 2017 Google, Inc. + * + * This software is licensed under the terms of the GNU General Public + * License version 2, as published by the Free Software Foundation, and + * may be copied, distributed, and modified under those terms. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#ifndef _UAPI_LINUX_VSOC_SHM_H +#define _UAPI_LINUX_VSOC_SHM_H + +#include + +/** + * A permission is a token that permits a receiver to read and/or write an area + * of memory within a Vsoc region. + * + * An fd_scoped permission grants both read and write access, and can be + * attached to a file description (see open(2)). + * Ownership of the area can then be shared by passing a file descriptor + * among processes. + * + * begin_offset and end_offset define the area of memory that is controlled by + * the permission. owner_offset points to a word, also in shared memory, that + * controls ownership of the area. + * + * ownership of the region expires when the associated file description is + * released. + * + * At most one permission can be attached to each file description. + * + * This is useful when implementing HALs like gralloc that scope and pass + * ownership of shared resources via file descriptors. + * + * The caller is responsibe f
[PATCH] staging: Android: Add 'vsoc' driver for cuttlefish.
From: Greg Hartman <ghart...@google.com> The cuttlefish system is a virtual SoC architecture based on QEMU. It uses the QEMU ivshmem feature to share memory regions between guest and host with a custom protocol. Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org> Cc: Arve Hjønnevåg <a...@android.com> Cc: Todd Kjos <tk...@android.com> Cc: Martijn Coenen <m...@android.com> Cc: de...@driverdev.osuosl.org Cc: kernel-t...@android.com Signed-off-by: Greg Hartman <ghart...@google.com> [astrachan: rebased against 4.16, added TODO, fixed checkpatch issues] Signed-off-by: Alistair Strachan <astrac...@google.com> --- drivers/staging/android/Kconfig |9 + drivers/staging/android/Makefile|1 + drivers/staging/android/TODO| 10 + drivers/staging/android/uapi/vsoc_shm.h | 303 ++ drivers/staging/android/vsoc.c | 1167 +++ 5 files changed, 1490 insertions(+) create mode 100644 drivers/staging/android/uapi/vsoc_shm.h create mode 100644 drivers/staging/android/vsoc.c diff --git a/drivers/staging/android/Kconfig b/drivers/staging/android/Kconfig index 71a50b99caff..29d891355f7a 100644 --- a/drivers/staging/android/Kconfig +++ b/drivers/staging/android/Kconfig @@ -14,6 +14,15 @@ config ASHMEM It is, in theory, a good memory allocator for low-memory devices, because it can discard shared memory units when under memory pressure. +config ANDROID_VSOC + tristate "Android Virtual SoC support" + default n + depends on PCI_MSI + ---help--- + This option adds support for the Virtual SoC driver needed to boot + a 'cuttlefish' Android image inside QEmu. The driver interacts with + a QEmu ivshmem device. If built as a module, it will be called vsoc. + source "drivers/staging/android/ion/Kconfig" endif # if ANDROID diff --git a/drivers/staging/android/Makefile b/drivers/staging/android/Makefile index 7cf1564a49a5..90e6154f11a4 100644 --- a/drivers/staging/android/Makefile +++ b/drivers/staging/android/Makefile @@ -3,3 +3,4 @@ ccflags-y += -I$(src) # needed for trace events obj-y += ion/ obj-$(CONFIG_ASHMEM) += ashmem.o +obj-$(CONFIG_ANDROID_VSOC) += vsoc.o diff --git a/drivers/staging/android/TODO b/drivers/staging/android/TODO index 687e0eac85bf..6aab759efa48 100644 --- a/drivers/staging/android/TODO +++ b/drivers/staging/android/TODO @@ -11,5 +11,15 @@ ion/ - Split /dev/ion up into multiple nodes (e.g. /dev/ion/heap0) - Better test framework (integration with VGEM was suggested) +vsoc.c, uapi/vsoc_shm.h + - The current driver uses the same wait queue for all of the futexes in a + region. This will cause false wakeups in regions with a large number of + waiting threads. We should eventually use multiple queues and select the + queue based on the region. + - Add debugfs support for examining the permissions of regions. + - Use ioremap_wc instead of ioremap_nocache. + - Remove VSOC_WAIT_FOR_INCOMING_INTERRUPT ioctl. This functionality has been + superseded by the futex and is there for legacy reasons. + Please send patches to Greg Kroah-Hartman <g...@kroah.com> and Cc: Arve Hjønnevåg <a...@android.com> and Riley Andrews <riandr...@android.com> diff --git a/drivers/staging/android/uapi/vsoc_shm.h b/drivers/staging/android/uapi/vsoc_shm.h new file mode 100644 index ..741b1387c25b --- /dev/null +++ b/drivers/staging/android/uapi/vsoc_shm.h @@ -0,0 +1,303 @@ +/* + * Copyright (C) 2017 Google, Inc. + * + * This software is licensed under the terms of the GNU General Public + * License version 2, as published by the Free Software Foundation, and + * may be copied, distributed, and modified under those terms. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#ifndef _UAPI_LINUX_VSOC_SHM_H +#define _UAPI_LINUX_VSOC_SHM_H + +#include + +/** + * A permission is a token that permits a receiver to read and/or write an area + * of memory within a Vsoc region. + * + * An fd_scoped permission grants both read and write access, and can be + * attached to a file description (see open(2)). + * Ownership of the area can then be shared by passing a file descriptor + * among processes. + * + * begin_offset and end_offset define the area of memory that is controlled by + * the permission. owner_offset points to a word, also in shared memory, that + * controls ownership of the area. + * + * ownership of the region expires when the associated file description is + * released. + * + * At most one permission can be attached to each file description. + * + * This is useful when implementing HALs like gr
[PATCH] staging: Android: Add 'vsoc' driver for cuttlefish.
From: Greg Hartman The cuttlefish system is a virtual SoC architecture based on QEMU. It uses the QEMU ivshmem feature to share memory regions between guest and host with a custom protocol. Cc: Greg Kroah-Hartman Cc: Arve Hjønnevåg Cc: Todd Kjos Cc: Martijn Coenen Cc: de...@driverdev.osuosl.org Cc: kernel-t...@android.com Signed-off-by: Greg Hartman [astrachan: rebased against 4.16, added TODO, fixed checkpatch issues] Signed-off-by: Alistair Strachan --- drivers/staging/android/Kconfig |9 + drivers/staging/android/Makefile|1 + drivers/staging/android/TODO| 10 + drivers/staging/android/uapi/vsoc_shm.h | 303 ++ drivers/staging/android/vsoc.c | 1167 +++ 5 files changed, 1490 insertions(+) create mode 100644 drivers/staging/android/uapi/vsoc_shm.h create mode 100644 drivers/staging/android/vsoc.c diff --git a/drivers/staging/android/Kconfig b/drivers/staging/android/Kconfig index 71a50b99caff..29d891355f7a 100644 --- a/drivers/staging/android/Kconfig +++ b/drivers/staging/android/Kconfig @@ -14,6 +14,15 @@ config ASHMEM It is, in theory, a good memory allocator for low-memory devices, because it can discard shared memory units when under memory pressure. +config ANDROID_VSOC + tristate "Android Virtual SoC support" + default n + depends on PCI_MSI + ---help--- + This option adds support for the Virtual SoC driver needed to boot + a 'cuttlefish' Android image inside QEmu. The driver interacts with + a QEmu ivshmem device. If built as a module, it will be called vsoc. + source "drivers/staging/android/ion/Kconfig" endif # if ANDROID diff --git a/drivers/staging/android/Makefile b/drivers/staging/android/Makefile index 7cf1564a49a5..90e6154f11a4 100644 --- a/drivers/staging/android/Makefile +++ b/drivers/staging/android/Makefile @@ -3,3 +3,4 @@ ccflags-y += -I$(src) # needed for trace events obj-y += ion/ obj-$(CONFIG_ASHMEM) += ashmem.o +obj-$(CONFIG_ANDROID_VSOC) += vsoc.o diff --git a/drivers/staging/android/TODO b/drivers/staging/android/TODO index 687e0eac85bf..6aab759efa48 100644 --- a/drivers/staging/android/TODO +++ b/drivers/staging/android/TODO @@ -11,5 +11,15 @@ ion/ - Split /dev/ion up into multiple nodes (e.g. /dev/ion/heap0) - Better test framework (integration with VGEM was suggested) +vsoc.c, uapi/vsoc_shm.h + - The current driver uses the same wait queue for all of the futexes in a + region. This will cause false wakeups in regions with a large number of + waiting threads. We should eventually use multiple queues and select the + queue based on the region. + - Add debugfs support for examining the permissions of regions. + - Use ioremap_wc instead of ioremap_nocache. + - Remove VSOC_WAIT_FOR_INCOMING_INTERRUPT ioctl. This functionality has been + superseded by the futex and is there for legacy reasons. + Please send patches to Greg Kroah-Hartman and Cc: Arve Hjønnevåg and Riley Andrews diff --git a/drivers/staging/android/uapi/vsoc_shm.h b/drivers/staging/android/uapi/vsoc_shm.h new file mode 100644 index ..741b1387c25b --- /dev/null +++ b/drivers/staging/android/uapi/vsoc_shm.h @@ -0,0 +1,303 @@ +/* + * Copyright (C) 2017 Google, Inc. + * + * This software is licensed under the terms of the GNU General Public + * License version 2, as published by the Free Software Foundation, and + * may be copied, distributed, and modified under those terms. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#ifndef _UAPI_LINUX_VSOC_SHM_H +#define _UAPI_LINUX_VSOC_SHM_H + +#include + +/** + * A permission is a token that permits a receiver to read and/or write an area + * of memory within a Vsoc region. + * + * An fd_scoped permission grants both read and write access, and can be + * attached to a file description (see open(2)). + * Ownership of the area can then be shared by passing a file descriptor + * among processes. + * + * begin_offset and end_offset define the area of memory that is controlled by + * the permission. owner_offset points to a word, also in shared memory, that + * controls ownership of the area. + * + * ownership of the region expires when the associated file description is + * released. + * + * At most one permission can be attached to each file description. + * + * This is useful when implementing HALs like gralloc that scope and pass + * ownership of shared resources via file descriptors. + * + * The caller is responsibe for doing any fencing. + * + * The calling process will normally identify a currently free area of + * memory. It will construct a proposed fd_scoped_perm
[PATCH] sync: Fix memory corruption in sync_timeline_signal().
The android_fence_release() function checks for active sync points by calling list_empty() on the list head embedded on the sync point. However, it is only valid to use list_empty() on nodes that have been initialized with INIT_LIST_HEAD() or list_del_init(). Because the list entry has likely been removed from the active list by sync_timeline_signal(), there is a good chance that this WARN_ON_ONCE() will be hit due to dangling pointers pointing at freed memory (even though the sync drivers did nothing wrong) and memory corruption will ensue as the list entry is removed for a second time, corrupting the active list. This problem can be reproduced quite easily with CONFIG_DEBUG_LIST=y and fences with more than one sync point. Signed-off-by: Alistair Strachan Cc: Maarten Lankhorst Cc: Greg Kroah-Hartman Cc: Colin Cross --- drivers/staging/android/sync.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c index 7bdb62b..f83e00c 100644 --- a/drivers/staging/android/sync.c +++ b/drivers/staging/android/sync.c @@ -114,7 +114,7 @@ void sync_timeline_signal(struct sync_timeline *obj) list_for_each_entry_safe(pt, next, >active_list_head, active_list) { if (fence_is_signaled_locked(>base)) - list_del(>active_list); + list_del_init(>active_list); } spin_unlock_irqrestore(>child_list_lock, flags); -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] sync: Fix memory corruption in sync_timeline_signal().
The android_fence_release() function checks for active sync points by calling list_empty() on the list head embedded on the sync point. However, it is only valid to use list_empty() on nodes that have been initialized with INIT_LIST_HEAD() or list_del_init(). Because the list entry has likely been removed from the active list by sync_timeline_signal(), there is a good chance that this WARN_ON_ONCE() will be hit due to dangling pointers pointing at freed memory (even though the sync drivers did nothing wrong) and memory corruption will ensue as the list entry is removed for a second time, corrupting the active list. This problem can be reproduced quite easily with CONFIG_DEBUG_LIST=y and fences with more than one sync point. Change-Id: Ie2a6bc1480bbcfdc14f9b385fca5a2b833effc05 Signed-off-by: Alistair Strachan Cc: Maarten Lankhorst Cc: Greg Kroah-Hartman Cc: Colin Cross --- drivers/staging/android/sync.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c index 7bdb62b..f83e00c 100644 --- a/drivers/staging/android/sync.c +++ b/drivers/staging/android/sync.c @@ -114,7 +114,7 @@ void sync_timeline_signal(struct sync_timeline *obj) list_for_each_entry_safe(pt, next, >active_list_head, active_list) { if (fence_is_signaled_locked(>base)) - list_del(>active_list); + list_del_init(>active_list); } spin_unlock_irqrestore(>child_list_lock, flags); -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] sync: Fix memory corruption in sync_timeline_signal().
The android_fence_release() function checks for active sync points by calling list_empty() on the list head embedded on the sync point. However, it is only valid to use list_empty() on nodes that have been initialized with INIT_LIST_HEAD() or list_del_init(). Because the list entry has likely been removed from the active list by sync_timeline_signal(), there is a good chance that this WARN_ON_ONCE() will be hit due to dangling pointers pointing at freed memory (even though the sync drivers did nothing wrong) and memory corruption will ensue as the list entry is removed for a second time, corrupting the active list. This problem can be reproduced quite easily with CONFIG_DEBUG_LIST=y and fences with more than one sync point. Signed-off-by: Alistair Strachan alistair.strac...@imgtec.com Cc: Maarten Lankhorst maarten.lankho...@canonical.com Cc: Greg Kroah-Hartman gre...@linuxfoundation.org Cc: Colin Cross ccr...@google.com --- drivers/staging/android/sync.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c index 7bdb62b..f83e00c 100644 --- a/drivers/staging/android/sync.c +++ b/drivers/staging/android/sync.c @@ -114,7 +114,7 @@ void sync_timeline_signal(struct sync_timeline *obj) list_for_each_entry_safe(pt, next, obj-active_list_head, active_list) { if (fence_is_signaled_locked(pt-base)) - list_del(pt-active_list); + list_del_init(pt-active_list); } spin_unlock_irqrestore(obj-child_list_lock, flags); -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] sync: Fix memory corruption in sync_timeline_signal().
The android_fence_release() function checks for active sync points by calling list_empty() on the list head embedded on the sync point. However, it is only valid to use list_empty() on nodes that have been initialized with INIT_LIST_HEAD() or list_del_init(). Because the list entry has likely been removed from the active list by sync_timeline_signal(), there is a good chance that this WARN_ON_ONCE() will be hit due to dangling pointers pointing at freed memory (even though the sync drivers did nothing wrong) and memory corruption will ensue as the list entry is removed for a second time, corrupting the active list. This problem can be reproduced quite easily with CONFIG_DEBUG_LIST=y and fences with more than one sync point. Change-Id: Ie2a6bc1480bbcfdc14f9b385fca5a2b833effc05 Signed-off-by: Alistair Strachan alistair.strac...@imgtec.com Cc: Maarten Lankhorst maarten.lankho...@canonical.com Cc: Greg Kroah-Hartman gre...@linuxfoundation.org Cc: Colin Cross ccr...@google.com --- drivers/staging/android/sync.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c index 7bdb62b..f83e00c 100644 --- a/drivers/staging/android/sync.c +++ b/drivers/staging/android/sync.c @@ -114,7 +114,7 @@ void sync_timeline_signal(struct sync_timeline *obj) list_for_each_entry_safe(pt, next, obj-active_list_head, active_list) { if (fence_is_signaled_locked(pt-base)) - list_del(pt-active_list); + list_del_init(pt-active_list); } spin_unlock_irqrestore(obj-child_list_lock, flags); -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/