Re: [PATCH 2/9] dma: Convert from tasklet to BH workqueue

2024-03-29 Thread Vinod Koul
On 28-03-24, 12:39, Allen wrote:

> > I think that is a great idea. Having this wrapped in dma_chan would
> > be a very good way as well
> >
> > Am not sure if Allen is up for it :-)
> 
>  Thanks Arnd, I know we did speak about this at LPC. I did start
> working on using completion. I dropped it as I thought it would
> be easier to move to workqueues.
> 
> Vinod, I would like to give this a shot and put out a RFC, I would
> really appreciate review and feedback.

Sounds like a good plan to me

-- 
~Vinod


Re: [PATCH 2/9] dma: Convert from tasklet to BH workqueue

2024-03-29 Thread Vinod Koul
On 28-03-24, 13:01, Allen wrote:
> > >> > Since almost every driver associates the tasklet with the
> > >> > dma_chan, we could go one step further and add the
> > >> > work_queue structure directly into struct dma_chan,
> > >> > with the wrapper operating on the dma_chan rather than
> > >> > the work_queue.
> > >>
> > >> I think that is a great idea. Having this wrapped in dma_chan would
> > >> be a very good way as well
> > >>
> > >> Am not sure if Allen is up for it :-)
> > >
> > >  Thanks Arnd, I know we did speak about this at LPC. I did start
> > > working on using completion. I dropped it as I thought it would
> > > be easier to move to workqueues.
> >
> > It's definitely easier to do the workqueue conversion as a first
> > step, and I agree adding support for the completion right away is
> > probably too much. Moving the work_struct into the dma_chan
> > is probably not too hard though, if you leave your current
> > approach for the cases where the tasklet is part of the
> > dma_dev rather than the dma_chan.
> >
> 
>  Alright, I will work on moving work_struct into the dma_chan and
> leave the dma_dev as is (using bh workqueues) and post an RFC.
> Once reviewed, I could move to the next step.

That might be better from a performance pov but the current design is a
global tasklet and not a per chan one... We would need to carefully
review and test this for sure

-- 
~Vinod
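
For illustration only, not code from the series: a minimal userspace
sketch of the idea discussed above, where the bottom-half work item is
embedded in each channel and the wrappers take the channel rather than
the work item. struct work_item, struct chan and the chan_bh_*() names
are hypothetical stand-ins, not the real struct work_struct, struct
dma_chan or any dmaengine API.

```c
#include <stddef.h>

/* Userspace stand-in for a work item; NOT the kernel's struct work_struct. */
struct work_item {
	void (*func)(struct work_item *w);
	int pending;
};

/* Same idea as the kernel's container_of(), reduced to plain C. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily simplified channel with its own embedded work item. */
struct chan {
	int id;
	int completed;        /* completions handled so far */
	struct work_item bh;  /* per-channel bottom half */
};

/* Work callback: recover the owning channel from the embedded member. */
static void chan_bh_handler(struct work_item *w)
{
	struct chan *c = container_of(w, struct chan, bh);

	c->completed++;
}

static void chan_bh_init(struct chan *c, int id)
{
	c->id = id;
	c->completed = 0;
	c->bh.func = chan_bh_handler;
	c->bh.pending = 0;
}

/* Wrapper operating on the channel. Returns 1 if newly queued, 0 if already pending. */
static int chan_bh_schedule(struct chan *c)
{
	if (c->bh.pending)
		return 0;
	c->bh.pending = 1;
	return 1;
}

/* Simulated bottom-half execution for one channel. */
static void chan_bh_run(struct chan *c)
{
	if (c->bh.pending) {
		c->bh.pending = 0;
		c->bh.func(&c->bh);
	}
}
```

With one work item per channel, scheduling channel A leaves channel B
untouched. That is the behavioural difference from a single global
tasklet that would need the careful review and testing mentioned above.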


[PATCH] serial: pmac_zilog: Drop usage of platform_driver_probe()

2024-03-29 Thread Uwe Kleine-König
There are considerations to drop platform_driver_probe() as a concept
that isn't relevant any more today. It comes with an added complexity
that makes many users hold it wrong. (E.g. this driver should have
marked the driver struct with __refdata to prevent the below mentioned
false positive section mismatch warning.)

This fixes a W=1 build warning:

WARNING: modpost: drivers/tty/serial/pmac_zilog: section mismatch in reference: pmz_driver+0x8 (section: .data) -> pmz_detach (section: .exit.text)

Signed-off-by: Uwe Kleine-König 
---
 drivers/tty/serial/pmac_zilog.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/tty/serial/pmac_zilog.c b/drivers/tty/serial/pmac_zilog.c
index 05d97e89511e..e44621218248 100644
--- a/drivers/tty/serial/pmac_zilog.c
+++ b/drivers/tty/serial/pmac_zilog.c
@@ -1695,7 +1695,7 @@ static void pmz_dispose_port(struct uart_pmac_port *uap)
memset(uap, 0, sizeof(struct uart_pmac_port));
 }
 
-static int __init pmz_attach(struct platform_device *pdev)
+static int pmz_attach(struct platform_device *pdev)
 {
struct uart_pmac_port *uap;
int i;
@@ -1714,7 +1714,7 @@ static int __init pmz_attach(struct platform_device *pdev)
return uart_add_one_port(&pmz_uart_reg, &uap->port);
 }
 
-static void __exit pmz_detach(struct platform_device *pdev)
+static void pmz_detach(struct platform_device *pdev)
 {
struct uart_pmac_port *uap = platform_get_drvdata(pdev);
 
@@ -1789,7 +1789,8 @@ static struct macio_driver pmz_driver = {
 #else
 
 static struct platform_driver pmz_driver = {
-   .remove_new = __exit_p(pmz_detach),
+   .probe  = pmz_attach,
+   .remove_new = pmz_detach,
.driver = {
.name   = "scc",
},
@@ -1837,7 +1838,7 @@ static int __init init_pmz(void)
 #ifdef CONFIG_PPC_PMAC
return macio_register_driver(&pmz_driver);
 #else
-   return platform_driver_probe(&pmz_driver, pmz_attach);
+   return platform_driver_register(&pmz_driver);
 #endif
 }
 
base-commit: a6bd6c997f5a0e2667d4d82fef8c970108f2
-- 
2.43.0
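
For illustration only, not part of the patch: a small userspace model
of why the old ".remove_new = __exit_p(pmz_detach)" form leaves no
remove callback in a built-in build. exit_p() below mirrors the way
__exit_p() is defined in the kernel (the function pointer for modules,
NULL otherwise); BUILT_AS_MODULE, struct fake_driver, struct
fake_device and the fake_*() helpers are hypothetical names made up
for this sketch.

```c
#include <stddef.h>

/* Model of __exit_p(): pointer when built as a module, NULL when built in. */
#ifdef BUILT_AS_MODULE
#define exit_p(fn) (fn)
#else
#define exit_p(fn) NULL
#endif

struct fake_device {
	int bound;
};

/* Hypothetical, simplified driver struct; NOT the real struct platform_driver. */
struct fake_driver {
	int  (*probe)(struct fake_device *dev);
	void (*remove)(struct fake_device *dev);
};

static int fake_attach(struct fake_device *dev)
{
	dev->bound = 1;
	return 0;
}

static void fake_detach(struct fake_device *dev)
{
	dev->bound = 0;
}

/* Old style: probe is passed at registration time, remove is wrapped. */
static struct fake_driver old_style = {
	.remove = exit_p(fake_detach),
};

/* New style, as in the patch: both callbacks are plain pointers. */
static struct fake_driver new_style = {
	.probe  = fake_attach,
	.remove = fake_detach,
};

/* Returns 1 if a remove callback ran, 0 if the driver has none. */
static int fake_unbind(const struct fake_driver *drv, struct fake_device *dev)
{
	if (!drv->remove)
		return 0;
	drv->remove(dev);
	return 1;
}
```

Built without BUILT_AS_MODULE, old_style.remove is NULL and
fake_unbind() does nothing, whereas new_style always carries both
callbacks, which is what the move to platform_driver_register()
relies on.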



[PATCH v3 1/3] arch: Select fbdev helpers with CONFIG_VIDEO

2024-03-29 Thread Thomas Zimmermann
Various Kconfig options selected the per-architecture helpers for
fbdev. But none of the contained code depends on fbdev. Standardize
on CONFIG_VIDEO, which will allow adding more general helpers for
video functionality.

CONFIG_VIDEO protects each architecture's video/ directory. This
allows for the use of more fine-grained control for each directory's
files, such as the use of CONFIG_STI_CORE on parisc.

v2:
- sparc: rebased onto Makefile changes

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Sam Ravnborg 
Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
Cc: "David S. Miller" 
Cc: Andreas Larsson 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: x...@kernel.org
Cc: "H. Peter Anvin" 
---
 arch/parisc/Makefile  | 2 +-
 arch/sparc/Makefile   | 4 ++--
 arch/sparc/video/Makefile | 2 +-
 arch/x86/Makefile | 2 +-
 arch/x86/video/Makefile   | 3 ++-
 5 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/parisc/Makefile b/arch/parisc/Makefile
index 316f84f1d15c8..21b8166a68839 100644
--- a/arch/parisc/Makefile
+++ b/arch/parisc/Makefile
@@ -119,7 +119,7 @@ export LIBGCC
 
 libs-y += arch/parisc/lib/ $(LIBGCC)
 
-drivers-y += arch/parisc/video/
+drivers-$(CONFIG_VIDEO) += arch/parisc/video/
 
 boot   := arch/parisc/boot
 
diff --git a/arch/sparc/Makefile b/arch/sparc/Makefile
index 2a03daa68f285..757451c3ea1df 100644
--- a/arch/sparc/Makefile
+++ b/arch/sparc/Makefile
@@ -59,8 +59,8 @@ endif
 libs-y += arch/sparc/prom/
 libs-y += arch/sparc/lib/
 
-drivers-$(CONFIG_PM) += arch/sparc/power/
-drivers-$(CONFIG_FB_CORE) += arch/sparc/video/
+drivers-$(CONFIG_PM)+= arch/sparc/power/
+drivers-$(CONFIG_VIDEO) += arch/sparc/video/
 
 boot := arch/sparc/boot
 
diff --git a/arch/sparc/video/Makefile b/arch/sparc/video/Makefile
index d4d83f1702c61..9dd82880a027a 100644
--- a/arch/sparc/video/Makefile
+++ b/arch/sparc/video/Makefile
@@ -1,3 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
-obj-$(CONFIG_FB_CORE) += fbdev.o
+obj-y  += fbdev.o
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 662d9d4033e6b..b80d15c29ecc6 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -260,7 +260,7 @@ drivers-$(CONFIG_PCI)+= arch/x86/pci/
 # suspend and hibernation support
 drivers-$(CONFIG_PM) += arch/x86/power/
 
-drivers-$(CONFIG_FB_CORE) += arch/x86/video/
+drivers-$(CONFIG_VIDEO) += arch/x86/video/
 
 
 # boot loader support. Several targets are kept for legacy purposes
diff --git a/arch/x86/video/Makefile b/arch/x86/video/Makefile
index 5ebe48752ffc4..9dd82880a027a 100644
--- a/arch/x86/video/Makefile
+++ b/arch/x86/video/Makefile
@@ -1,2 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0-only
-obj-$(CONFIG_FB_CORE)  += fbdev.o
+
+obj-y  += fbdev.o
-- 
2.44.0



[PATCH v3 3/3] arch: Rename fbdev header and source files

2024-03-29 Thread Thomas Zimmermann
The per-architecture fbdev code has no dependencies on fbdev and can
be used for any video-related subsystem. Rename the files to 'video'.
Use video-sti.c on parisc as the source file depends on CONFIG_STI_CORE.

On arc, arm, arm64, sh, and um the asm header file is an empty wrapper
around the file in asm-generic. Let Kbuild generate the file. The build
system does this automatically. Only um needs to generate video.h
explicitly, so that it overrides the host architecture's header. The
latter would otherwise interfere with the build.

Further, update all include statements, include guards, and Makefiles.
Also update a few strings and comments to refer to video instead of
fbdev.

v3:
- arc, arm, arm64, sh: generate asm header via build system (Sam,
Helge, Arnd)
- um: rename fb.h to video.h
- fix typo in commit message (Sam)

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Sam Ravnborg 
Cc: Vineet Gupta 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Huacai Chen 
Cc: WANG Xuerui 
Cc: Geert Uytterhoeven 
Cc: Thomas Bogendoerfer 
Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Yoshinori Sato 
Cc: Rich Felker 
Cc: John Paul Adrian Glaubitz 
Cc: "David S. Miller" 
Cc: Andreas Larsson 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: x...@kernel.org
Cc: "H. Peter Anvin" 
---
 arch/arc/include/asm/fb.h|  8 
 arch/arm/include/asm/fb.h|  6 --
 arch/arm64/include/asm/fb.h  | 10 --
 arch/loongarch/include/asm/{fb.h => video.h} |  8 
 arch/m68k/include/asm/{fb.h => video.h}  |  8 
 arch/mips/include/asm/{fb.h => video.h}  | 12 ++--
 arch/parisc/include/asm/{fb.h => video.h}|  8 
 arch/parisc/video/Makefile   |  2 +-
 arch/parisc/video/{fbdev.c => video-sti.c}   |  2 +-
 arch/powerpc/include/asm/{fb.h => video.h}   |  8 
 arch/powerpc/kernel/pci-common.c |  2 +-
 arch/sh/include/asm/fb.h |  7 ---
 arch/sparc/include/asm/{fb.h => video.h} |  8 
 arch/sparc/video/Makefile|  2 +-
 arch/sparc/video/{fbdev.c => video.c}|  4 ++--
 arch/um/include/asm/Kbuild   |  2 +-
 arch/x86/include/asm/{fb.h => video.h}   |  8 
 arch/x86/video/Makefile  |  2 +-
 arch/x86/video/{fbdev.c => video.c}  |  3 ++-
 include/asm-generic/Kbuild   |  2 +-
 include/asm-generic/{fb.h => video.h}|  6 +++---
 include/linux/fb.h   |  2 +-
 22 files changed, 45 insertions(+), 75 deletions(-)
 delete mode 100644 arch/arc/include/asm/fb.h
 delete mode 100644 arch/arm/include/asm/fb.h
 delete mode 100644 arch/arm64/include/asm/fb.h
 rename arch/loongarch/include/asm/{fb.h => video.h} (86%)
 rename arch/m68k/include/asm/{fb.h => video.h} (86%)
 rename arch/mips/include/asm/{fb.h => video.h} (76%)
 rename arch/parisc/include/asm/{fb.h => video.h} (68%)
 rename arch/parisc/video/{fbdev.c => video-sti.c} (96%)
 rename arch/powerpc/include/asm/{fb.h => video.h} (76%)
 delete mode 100644 arch/sh/include/asm/fb.h
 rename arch/sparc/include/asm/{fb.h => video.h} (89%)
 rename arch/sparc/video/{fbdev.c => video.c} (86%)
 rename arch/x86/include/asm/{fb.h => video.h} (77%)
 rename arch/x86/video/{fbdev.c => video.c} (97%)
 rename include/asm-generic/{fb.h => video.h} (96%)

diff --git a/arch/arc/include/asm/fb.h b/arch/arc/include/asm/fb.h
deleted file mode 100644
index 9c2383d29cbb9..0
--- a/arch/arc/include/asm/fb.h
+++ /dev/null
@@ -1,8 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-
-#ifndef _ASM_FB_H_
-#define _ASM_FB_H_
-
-#include <asm-generic/fb.h>
-
-#endif /* _ASM_FB_H_ */
diff --git a/arch/arm/include/asm/fb.h b/arch/arm/include/asm/fb.h
deleted file mode 100644
index ce20a43c30339..0
--- a/arch/arm/include/asm/fb.h
+++ /dev/null
@@ -1,6 +0,0 @@
-#ifndef _ASM_FB_H_
-#define _ASM_FB_H_
-
-#include <asm-generic/fb.h>
-
-#endif /* _ASM_FB_H_ */
diff --git a/arch/arm64/include/asm/fb.h b/arch/arm64/include/asm/fb.h
deleted file mode 100644
index 1a495d8fb2ce0..0
--- a/arch/arm64/include/asm/fb.h
+++ /dev/null
@@ -1,10 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * Copyright (C) 2012 ARM Ltd.
- */
-#ifndef __ASM_FB_H_
-#define __ASM_FB_H_
-
-#include <asm-generic/fb.h>
-
-#endif /* __ASM_FB_H_ */
diff --git a/arch/loongarch/include/asm/fb.h b/arch/loongarch/include/asm/video.h
similarity index 86%
rename from arch/loongarch/include/asm/fb.h
rename to arch/loongarch/include/asm/video.h
index 0b218b10a9ec3..9f76845f2d4fd 100644
--- a/arch/loongarch/include/asm/fb.h
+++ b/arch/loongarch/include/asm/video.h
@@ -2,8 +2,8 @@
 /*
  * Copyright (C) 2020-2022 Loongson Technology Corporation Limited
  */
-#ifndef _ASM_FB_H_
-#define _ASM_FB_H_
+#ifndef _ASM_VIDEO_H_
+#define _ASM_VIDEO_H_
 
 #include 
 #include 
@@ -26,6 +26,6 @@ static inline void fb_memset_io(volatile void __iomem 

[PATCH v3 2/3] arch: Remove struct fb_info from video helpers

2024-03-29 Thread Thomas Zimmermann
The per-architecture video helpers do not depend on struct fb_info
or anything else from fbdev. Remove it from the interface and replace
fb_is_primary_device() with video_is_primary_device(). The new helper
is similar in functionality, but can operate on non-fbdev devices.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Sam Ravnborg 
Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
Cc: "David S. Miller" 
Cc: Andreas Larsson 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Dave Hansen 
Cc: x...@kernel.org
Cc: "H. Peter Anvin" 
---
 arch/parisc/include/asm/fb.h |  8 +---
 arch/parisc/video/fbdev.c|  9 +
 arch/sparc/include/asm/fb.h  |  7 ---
 arch/sparc/video/fbdev.c | 17 -
 arch/x86/include/asm/fb.h|  8 +---
 arch/x86/video/fbdev.c   | 18 +++---
 drivers/video/fbdev/core/fbcon.c |  2 +-
 include/asm-generic/fb.h | 11 ++-
 8 files changed, 41 insertions(+), 39 deletions(-)

diff --git a/arch/parisc/include/asm/fb.h b/arch/parisc/include/asm/fb.h
index 658a8a7dc5312..ed2a195a3e762 100644
--- a/arch/parisc/include/asm/fb.h
+++ b/arch/parisc/include/asm/fb.h
@@ -2,11 +2,13 @@
 #ifndef _ASM_FB_H_
 #define _ASM_FB_H_
 
-struct fb_info;
+#include 
+
+struct device;
 
 #if defined(CONFIG_STI_CORE)
-int fb_is_primary_device(struct fb_info *info);
-#define fb_is_primary_device fb_is_primary_device
+bool video_is_primary_device(struct device *dev);
+#define video_is_primary_device video_is_primary_device
 #endif
 
 #include 
diff --git a/arch/parisc/video/fbdev.c b/arch/parisc/video/fbdev.c
index e4f8ac99fc9e0..540fa0c919d59 100644
--- a/arch/parisc/video/fbdev.c
+++ b/arch/parisc/video/fbdev.c
@@ -5,12 +5,13 @@
  * Copyright (C) 2001-2002 Thomas Bogendoerfer 
  */
 
-#include 
 #include 
 
 #include 
 
-int fb_is_primary_device(struct fb_info *info)
+#include 
+
+bool video_is_primary_device(struct device *dev)
 {
struct sti_struct *sti;
 
@@ -21,6 +22,6 @@ int fb_is_primary_device(struct fb_info *info)
return true;
 
/* return true if it's the default built-in framebuffer driver */
-   return (sti->dev == info->device);
+   return (sti->dev == dev);
 }
-EXPORT_SYMBOL(fb_is_primary_device);
+EXPORT_SYMBOL(video_is_primary_device);
diff --git a/arch/sparc/include/asm/fb.h b/arch/sparc/include/asm/fb.h
index 24440c0fda490..07f0325d6921c 100644
--- a/arch/sparc/include/asm/fb.h
+++ b/arch/sparc/include/asm/fb.h
@@ -3,10 +3,11 @@
 #define _SPARC_FB_H_
 
 #include 
+#include 
 
 #include 
 
-struct fb_info;
+struct device;
 
 #ifdef CONFIG_SPARC32
 static inline pgprot_t pgprot_framebuffer(pgprot_t prot,
@@ -18,8 +19,8 @@ static inline pgprot_t pgprot_framebuffer(pgprot_t prot,
 #define pgprot_framebuffer pgprot_framebuffer
 #endif
 
-int fb_is_primary_device(struct fb_info *info);
-#define fb_is_primary_device fb_is_primary_device
+bool video_is_primary_device(struct device *dev);
+#define video_is_primary_device video_is_primary_device
 
 static inline void fb_memcpy_fromio(void *to, const volatile void __iomem *from, size_t n)
 {
diff --git a/arch/sparc/video/fbdev.c b/arch/sparc/video/fbdev.c
index bff66dd1909a4..e46f0499c2774 100644
--- a/arch/sparc/video/fbdev.c
+++ b/arch/sparc/video/fbdev.c
@@ -1,26 +1,25 @@
 // SPDX-License-Identifier: GPL-2.0
 
 #include 
-#include 
+#include 
 #include 
 
+#include 
 #include 
 
-int fb_is_primary_device(struct fb_info *info)
+bool video_is_primary_device(struct device *dev)
 {
-   struct device *dev = info->device;
-   struct device_node *node;
+   struct device_node *node = dev->of_node;
 
if (console_set_on_cmdline)
-   return 0;
+   return false;
 
-   node = dev->of_node;
if (node && node == of_console_device)
-   return 1;
+   return true;
 
-   return 0;
+   return false;
 }
-EXPORT_SYMBOL(fb_is_primary_device);
+EXPORT_SYMBOL(video_is_primary_device);
 
 MODULE_DESCRIPTION("Sparc fbdev helpers");
 MODULE_LICENSE("GPL");
diff --git a/arch/x86/include/asm/fb.h b/arch/x86/include/asm/fb.h
index c3b9582de7efd..999db33792869 100644
--- a/arch/x86/include/asm/fb.h
+++ b/arch/x86/include/asm/fb.h
@@ -2,17 +2,19 @@
 #ifndef _ASM_X86_FB_H
 #define _ASM_X86_FB_H
 
+#include 
+
 #include 
 
-struct fb_info;
+struct device;
 
 pgprot_t pgprot_framebuffer(pgprot_t prot,
unsigned long vm_start, unsigned long vm_end,
unsigned long offset);
 #define pgprot_framebuffer pgprot_framebuffer
 
-int fb_is_primary_device(struct fb_info *info);
-#define fb_is_primary_device fb_is_primary_device
+bool video_is_primary_device(struct device *dev);
+#define video_is_primary_device video_is_primary_device
 
 #include 
 
diff --git a/arch/x86/video/fbdev.c b/arch/x86/video/fbdev.c
index 1dd6528cc947c..4d87ce8e257fe 100644
--- a/arch/x86/video/fbdev.c
+++ b/arch/x86/video/fbdev.c
@@ -7,7 +7,6 @@
  *
  */

[PATCH v3 0/3] arch: Remove fbdev dependency from video helpers

2024-03-29 Thread Thomas Zimmermann
Make architecture helpers for display functionality depend on general
video functionality instead of fbdev. This avoids the dependency on
fbdev and makes the functionality available for non-fbdev code.

Patch 1 replaces the variety of Kconfig options that control the
Makefiles with CONFIG_VIDEO. More fine-grained control of the build
can then be done within each video/ directory; see parisc for an
example.

Patch 2 replaces fb_is_primary_device() with video_is_primary_device(),
which has no dependencies on fbdev. The implementation remains identical
on all affected platforms. There's one minor change in fbcon, which is
the only caller of fb_is_primary_device().

Patch 3 renames the source and header files from fbdev to video.

v3:
- arc, arm, arm64, sh, um: generate asm/video.h (Sam, Helge, Arnd)
- fix typos (Sam)
v2:
- improve cover letter
- rebase onto v6.9-rc1

Thomas Zimmermann (3):
  arch: Select fbdev helpers with CONFIG_VIDEO
  arch: Remove struct fb_info from video helpers
  arch: Rename fbdev header and source files

 arch/arc/include/asm/fb.h|  8 --
 arch/arm/include/asm/fb.h|  6 -
 arch/arm64/include/asm/fb.h  | 10 
 arch/loongarch/include/asm/{fb.h => video.h} |  8 +++---
 arch/m68k/include/asm/{fb.h => video.h}  |  8 +++---
 arch/mips/include/asm/{fb.h => video.h}  | 12 -
 arch/parisc/Makefile |  2 +-
 arch/parisc/include/asm/fb.h | 14 ---
 arch/parisc/include/asm/video.h  | 16 
 arch/parisc/video/Makefile   |  2 +-
 arch/parisc/video/{fbdev.c => video-sti.c}   |  9 ---
 arch/powerpc/include/asm/{fb.h => video.h}   |  8 +++---
 arch/powerpc/kernel/pci-common.c |  2 +-
 arch/sh/include/asm/fb.h |  7 --
 arch/sparc/Makefile  |  4 +--
 arch/sparc/include/asm/{fb.h => video.h} | 15 +--
 arch/sparc/video/Makefile|  2 +-
 arch/sparc/video/fbdev.c | 26 
 arch/sparc/video/video.c | 25 +++
 arch/um/include/asm/Kbuild   |  2 +-
 arch/x86/Makefile|  2 +-
 arch/x86/include/asm/fb.h| 19 --
 arch/x86/include/asm/video.h | 21 
 arch/x86/video/Makefile  |  3 ++-
 arch/x86/video/{fbdev.c => video.c}  | 21 +++-
 drivers/video/fbdev/core/fbcon.c |  2 +-
 include/asm-generic/Kbuild   |  2 +-
 include/asm-generic/{fb.h => video.h}| 17 +++--
 include/linux/fb.h   |  2 +-
 29 files changed, 124 insertions(+), 151 deletions(-)
 delete mode 100644 arch/arc/include/asm/fb.h
 delete mode 100644 arch/arm/include/asm/fb.h
 delete mode 100644 arch/arm64/include/asm/fb.h
 rename arch/loongarch/include/asm/{fb.h => video.h} (86%)
 rename arch/m68k/include/asm/{fb.h => video.h} (86%)
 rename arch/mips/include/asm/{fb.h => video.h} (76%)
 delete mode 100644 arch/parisc/include/asm/fb.h
 create mode 100644 arch/parisc/include/asm/video.h
 rename arch/parisc/video/{fbdev.c => video-sti.c} (78%)
 rename arch/powerpc/include/asm/{fb.h => video.h} (76%)
 delete mode 100644 arch/sh/include/asm/fb.h
 rename arch/sparc/include/asm/{fb.h => video.h} (75%)
 delete mode 100644 arch/sparc/video/fbdev.c
 create mode 100644 arch/sparc/video/video.c
 delete mode 100644 arch/x86/include/asm/fb.h
 create mode 100644 arch/x86/include/asm/video.h
 rename arch/x86/video/{fbdev.c => video.c} (66%)
 rename include/asm-generic/{fb.h => video.h} (89%)

-- 
2.44.0
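
For illustration only, not code from the series: a condensed,
single-file sketch of the override pattern that the
"#define video_is_primary_device video_is_primary_device" lines in
patch 2 rely on. An architecture header that provides its own helper
defines a macro of the same name, and the generic header supplies a
fallback only if that macro is absent. ARCH_HAS_PRIMARY_CHECK,
pick_console() and the one-field struct device are hypothetical, and
the "return false" fallback is an assumption about the generic header.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in; NOT the kernel's struct device. */
struct device {
	int id;
};

/* --- what an architecture's asm/video.h would contribute --- */
#ifdef ARCH_HAS_PRIMARY_CHECK  /* hypothetical switch for this sketch */
bool video_is_primary_device(struct device *dev);
#define video_is_primary_device video_is_primary_device
#endif

/* --- what the generic header would contribute --- */
#ifndef video_is_primary_device
/* Fallback (assumed): no device is treated as primary. */
static inline bool video_is_primary_device(struct device *dev)
{
	(void)dev;
	return false;
}
#define video_is_primary_device video_is_primary_device
#endif

/*
 * Hypothetical caller: prefer the primary device, otherwise take the
 * first one. Returns -1 if there are no devices at all.
 */
static int pick_console(struct device *devs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (video_is_primary_device(&devs[i]))
			return devs[i].id;
	return n ? devs[0].id : -1;
}
```

Because the helper takes a struct device rather than a struct fb_info,
a caller like this needs nothing from fbdev, which is the point of the
interface change.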



Re: [PATCH 0/9] enabled -Wformat-truncation for clang

2024-03-29 Thread patchwork-bot+netdevbpf
Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski :

On Tue, 26 Mar 2024 23:37:59 +0100 you wrote:
> From: Arnd Bergmann 
> 
> With randconfig build testing, I found only eight files that produce
> warnings with clang when -Wformat-truncation is enabled. This means
> we can just turn it on by default rather than only enabling it for
> "make W=1".
> 
> [...]

Here is the summary with links:
  - [2/9] enetc: avoid truncating error message
https://git.kernel.org/netdev/net-next/c/9046d581ed58
  - [3/9] qed: avoid truncating work queue length
https://git.kernel.org/netdev/net-next/c/954fd908f177
  - [4/9] mlx5: avoid truncating error message
https://git.kernel.org/netdev/net-next/c/b324a960354b

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




Re: [PATCH v12 8/8] PCI: endpoint: Remove "core_init_notifier" flag

2024-03-29 Thread Frank Li
On Wed, Mar 27, 2024 at 02:43:37PM +0530, Manivannan Sadhasivam wrote:
> "core_init_notifier" flag is set by the glue drivers requiring refclk from
> the host to complete the DWC core initialization. Also, those drivers will
> send a notification to the EPF drivers once the initialization is fully
> completed using the pci_epc_init_notify() API. Only then, the EPF drivers
> will start functioning.
> 
> For the rest of the drivers generating refclk locally, EPF drivers will
> start functioning post binding with them. EPF drivers rely on the
> 'core_init_notifier' flag to differentiate between the drivers.
> Unfortunately, this creates two different flows for the EPF drivers.
> 
> So to avoid that, let's get rid of the "core_init_notifier" flag and follow
> a single initialization flow for the EPF drivers. This is done by calling
> the dw_pcie_ep_init_notify() from all glue drivers after the completion of
> dw_pcie_ep_init_registers() API. This will allow all the glue drivers to
> send the notification to the EPF drivers once the initialization is fully
> completed.
> 
> Only difference here is that, the drivers requiring refclk from host will
> send the notification once refclk is received, while others will send it
> during probe time itself.
> 
> But this also requires the EPC core driver to deliver the notification
> after EPF driver bind. Because, the glue driver can send the notification
> before the EPF drivers bind() and in those cases the EPF drivers will miss
> the event. To accommodate this, EPC core is now caching the state of the
> EPC initialization in 'init_complete' flag and pci-ep-cfs driver sends the
> notification to EPF drivers based on that after each EPF driver bind.
> 
> Tested-by: Niklas Cassel 

Reviewed-by: Frank Li 

> Signed-off-by: Manivannan Sadhasivam 
> ---
>  drivers/pci/controller/cadence/pcie-cadence-ep.c  |  2 ++
>  drivers/pci/controller/dwc/pci-dra7xx.c   |  2 ++
>  drivers/pci/controller/dwc/pci-imx6.c |  2 ++
>  drivers/pci/controller/dwc/pci-keystone.c |  2 ++
>  drivers/pci/controller/dwc/pci-layerscape-ep.c|  2 ++
>  drivers/pci/controller/dwc/pcie-artpec6.c |  2 ++
>  drivers/pci/controller/dwc/pcie-designware-ep.c   |  1 +
>  drivers/pci/controller/dwc/pcie-designware-plat.c |  2 ++
>  drivers/pci/controller/dwc/pcie-keembay.c |  2 ++
>  drivers/pci/controller/dwc/pcie-qcom-ep.c |  1 -
>  drivers/pci/controller/dwc/pcie-rcar-gen4.c   |  2 ++
>  drivers/pci/controller/dwc/pcie-tegra194.c|  1 -
>  drivers/pci/controller/dwc/pcie-uniphier-ep.c |  2 ++
>  drivers/pci/controller/pcie-rcar-ep.c |  2 ++
>  drivers/pci/controller/pcie-rockchip-ep.c |  2 ++
>  drivers/pci/endpoint/functions/pci-epf-test.c | 18 +-
>  drivers/pci/endpoint/pci-ep-cfs.c |  9 +
>  drivers/pci/endpoint/pci-epc-core.c   | 22 ++
>  include/linux/pci-epc.h   |  7 ---
>  19 files changed, 65 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/pci/controller/cadence/pcie-cadence-ep.c b/drivers/pci/controller/cadence/pcie-cadence-ep.c
> index 81c50dc64da9..55c42ca2b777 100644
> --- a/drivers/pci/controller/cadence/pcie-cadence-ep.c
> +++ b/drivers/pci/controller/cadence/pcie-cadence-ep.c
> @@ -746,6 +746,8 @@ int cdns_pcie_ep_setup(struct cdns_pcie_ep *ep)
>  
>   spin_lock_init(&ep->lock);
>  
> + pci_epc_init_notify(epc);
> +
>   return 0;
>  
>   free_epc_mem:
> diff --git a/drivers/pci/controller/dwc/pci-dra7xx.c b/drivers/pci/controller/dwc/pci-dra7xx.c
> index 395042b29ffc..d2d17d37d3e0 100644
> --- a/drivers/pci/controller/dwc/pci-dra7xx.c
> +++ b/drivers/pci/controller/dwc/pci-dra7xx.c
> @@ -474,6 +474,8 @@ static int dra7xx_add_pcie_ep(struct dra7xx_pcie *dra7xx,
>   return ret;
>   }
>  
> + dw_pcie_ep_init_notify(ep);
> +
>   return 0;
>  }
>  
> diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
> index 8d28ecc381bc..917c69edee1d 100644
> --- a/drivers/pci/controller/dwc/pci-imx6.c
> +++ b/drivers/pci/controller/dwc/pci-imx6.c
> @@ -1131,6 +1131,8 @@ static int imx6_add_pcie_ep(struct imx6_pcie *imx6_pcie,
>   return ret;
>   }
>  
> + dw_pcie_ep_init_notify(ep);
> +
>   /* Start LTSSM. */
>   imx6_pcie_ltssm_enable(dev);
>  
> diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
> index 81ebac520650..d3a7d14ee685 100644
> --- a/drivers/pci/controller/dwc/pci-keystone.c
> +++ b/drivers/pci/controller/dwc/pci-keystone.c
> @@ -1293,6 +1293,8 @@ static int ks_pcie_probe(struct platform_device *pdev)
>   goto err_ep_init;
>   }
>  
> + dw_pcie_ep_init_notify(&pci->ep);
> +
>   break;
>   default:
>   dev_err(dev, "INVALID device type %d\n", mode);
> diff --git 
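
For illustration only, not code from the series: a minimal userspace
sketch of the 'init_complete' caching described in the commit message
above. The controller remembers that initialization has completed, and
the bind path replays the notification for any function driver that
binds afterwards, so a late binder does not miss the event. struct
epc, struct epf and the function names are hypothetical, simplified
stand-ins and not the real EPC/EPF API.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_EPF 4

/* Hypothetical function-driver state: counts init notifications received. */
struct epf {
	int init_events;
};

/* Hypothetical controller state. */
struct epc {
	bool init_complete;          /* cached "initialization done" flag */
	struct epf *bound[MAX_EPF];  /* function drivers bound so far */
	size_t nr_bound;
};

static void epf_notify_init(struct epf *f)
{
	f->init_events++;
}

/* Glue-driver side: called at probe time, or once refclk has arrived. */
static void epc_init_notify(struct epc *c)
{
	for (size_t i = 0; i < c->nr_bound; i++)
		epf_notify_init(c->bound[i]);
	c->init_complete = true;
}

/* Bind side: replay the event if initialization already happened. */
static int epc_bind(struct epc *c, struct epf *f)
{
	if (c->nr_bound == MAX_EPF)
		return -1;
	c->bound[c->nr_bound++] = f;
	if (c->init_complete)
		epf_notify_init(f);
	return 0;
}
```

Either ordering ends with each function driver notified exactly once:
one bound before initialization is notified by epc_init_notify(), one
bound after it is notified from epc_bind().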

Re: [PATCH v12 7/8] PCI: dwc: ep: Call dw_pcie_ep_init_registers() API directly from all glue drivers

2024-03-29 Thread Frank Li
On Wed, Mar 27, 2024 at 02:43:36PM +0530, Manivannan Sadhasivam wrote:
> Currently, dw_pcie_ep_init_registers() API is directly called by the glue
> drivers requiring active refclk from host. But for the other drivers, it is
> getting called implicitly by dw_pcie_ep_init(). This is due to the fact
> that this API initializes DWC EP specific registers and that requires an
> active refclk (either from host or generated locally by endpoint itself).
> 
> But, this causes a discrepancy among the glue drivers. So to avoid this
> confusion, let's call this API directly from all glue drivers irrespective
> of refclk dependency. Only difference here is that the drivers requiring
> refclk from host will call this API only after the refclk is received and
> other drivers without refclk dependency will call this API right after
> dw_pcie_ep_init().
> 
> With this change, the check for 'core_init_notifier' flag can now be
> dropped from dw_pcie_ep_init() API. This will also allow us to remove the
> 'core_init_notifier' flag completely in the later commits.
> 
> Reviewed-by: Yoshihiro Shimoda 
> Reviewed-by: Niklas Cassel 

Reviewed-by: Frank Li 

> Signed-off-by: Manivannan Sadhasivam 
> ---
>  drivers/pci/controller/dwc/pci-dra7xx.c   |  7 +++
>  drivers/pci/controller/dwc/pci-imx6.c |  8 
>  drivers/pci/controller/dwc/pci-keystone.c |  9 +
>  drivers/pci/controller/dwc/pci-layerscape-ep.c|  7 +++
>  drivers/pci/controller/dwc/pcie-artpec6.c | 13 -
>  drivers/pci/controller/dwc/pcie-designware-ep.c   | 22 --
>  drivers/pci/controller/dwc/pcie-designware-plat.c |  9 +
>  drivers/pci/controller/dwc/pcie-keembay.c | 16 +++-
>  drivers/pci/controller/dwc/pcie-rcar-gen4.c   | 12 +++-
>  drivers/pci/controller/dwc/pcie-uniphier-ep.c | 13 -
>  10 files changed, 90 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pci-dra7xx.c b/drivers/pci/controller/dwc/pci-dra7xx.c
> index 0e406677060d..395042b29ffc 100644
> --- a/drivers/pci/controller/dwc/pci-dra7xx.c
> +++ b/drivers/pci/controller/dwc/pci-dra7xx.c
> @@ -467,6 +467,13 @@ static int dra7xx_add_pcie_ep(struct dra7xx_pcie *dra7xx,
>   return ret;
>   }
>  
> + ret = dw_pcie_ep_init_registers(ep);
> + if (ret) {
> + dev_err(dev, "Failed to initialize DWC endpoint registers\n");
> + dw_pcie_ep_deinit(ep);
> + return ret;
> + }
> +
>   return 0;
>  }
>  
> diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
> index 99a60270b26c..8d28ecc381bc 100644
> --- a/drivers/pci/controller/dwc/pci-imx6.c
> +++ b/drivers/pci/controller/dwc/pci-imx6.c
> @@ -1123,6 +1123,14 @@ static int imx6_add_pcie_ep(struct imx6_pcie *imx6_pcie,
>   dev_err(dev, "failed to initialize endpoint\n");
>   return ret;
>   }
> +
> + ret = dw_pcie_ep_init_registers(ep);
> + if (ret) {
> + dev_err(dev, "Failed to initialize DWC endpoint registers\n");
> + dw_pcie_ep_deinit(ep);
> + return ret;
> + }
> +
>   /* Start LTSSM. */
>   imx6_pcie_ltssm_enable(dev);
>  
> diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
> index 844de4418724..81ebac520650 100644
> --- a/drivers/pci/controller/dwc/pci-keystone.c
> +++ b/drivers/pci/controller/dwc/pci-keystone.c
> @@ -1286,6 +1286,13 @@ static int ks_pcie_probe(struct platform_device *pdev)
>   ret = dw_pcie_ep_init(&pci->ep);
>   if (ret < 0)
>   goto err_get_sync;
> +
> + ret = dw_pcie_ep_init_registers(&pci->ep);
> + if (ret) {
> + dev_err(dev, "Failed to initialize DWC endpoint registers\n");
> + goto err_ep_init;
> + }
> +
>   break;
>   default:
>   dev_err(dev, "INVALID device type %d\n", mode);
> @@ -1295,6 +1302,8 @@ static int ks_pcie_probe(struct platform_device *pdev)
>  
>   return 0;
>  
> +err_ep_init:
> + dw_pcie_ep_deinit(&pci->ep);
>  err_get_sync:
>   pm_runtime_put(dev);
>   pm_runtime_disable(dev);
> diff --git a/drivers/pci/controller/dwc/pci-layerscape-ep.c b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> index 1f6ee1460ec2..9eb2233e3d7f 100644
> --- a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> +++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> @@ -279,6 +279,13 @@ static int __init ls_pcie_ep_probe(struct platform_device *pdev)
>   if (ret)
>   return ret;
>  
> + ret = dw_pcie_ep_init_registers(&pci->ep);
> + if (ret) {
> + dev_err(dev, "Failed to initialize DWC endpoint registers\n");
> + dw_pcie_ep_deinit(&pci->ep);
> + return ret;
> + }
> +
>   return ls_pcie_ep_interrupt_init(pcie, pdev);
>  }
>  

Re: [PATCH v4 10/15] x86: Implement ARCH_HAS_KERNEL_FPU_SUPPORT

2024-03-29 Thread Samuel Holland
On 2024-03-29 12:28 PM, Dave Hansen wrote:
> On 3/29/24 00:18, Samuel Holland wrote:
>> +#
>> +# CFLAGS for compiling floating point code inside the kernel.
>> +#
>> +CC_FLAGS_FPU := -msse -msse2
>> +ifdef CONFIG_CC_IS_GCC
>> +# Stack alignment mismatch, proceed with caution.
>> +# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
>> +# (8B stack alignment).
>> +# See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53383
>> +#
>> +# The "-msse" in the first argument is there so that the
>> +# -mpreferred-stack-boundary=3 build error:
>> +#
>> +#  -mpreferred-stack-boundary=3 is not between 4 and 12
>> +#
>> +# can be triggered. Otherwise gcc doesn't complain.
>> +CC_FLAGS_FPU += -mhard-float
>> +CC_FLAGS_FPU += $(call cc-option,-msse -mpreferred-stack-boundary=3,-mpreferred-stack-boundary=4)
>> +endif
> 
> I was expecting to see this (now duplicate) hunk come _out_ of
> lib/Makefile somewhere in the series.
> 
> Did I miss that, or is there something keeping the duplicate there?

This hunk is removed in patch 15/15, after the conversion of lib/test_fpu.c:

https://lore.kernel.org/linux-kernel/20240329072441.591471-16-samuel.holl...@sifive.com/

Regards,
Samuel



Re: [PATCH v4 09/15] x86/fpu: Fix asm/fpu/types.h include guard

2024-03-29 Thread Dave Hansen
On 3/29/24 00:18, Samuel Holland wrote:
> The include guard should match the filename, or it will conflict with
> the newly-added asm/fpu.h.

Acked-by: Dave Hansen 


Re: [PATCH v4 10/15] x86: Implement ARCH_HAS_KERNEL_FPU_SUPPORT

2024-03-29 Thread Dave Hansen
On 3/29/24 00:18, Samuel Holland wrote:
> +#
> +# CFLAGS for compiling floating point code inside the kernel.
> +#
> +CC_FLAGS_FPU := -msse -msse2
> +ifdef CONFIG_CC_IS_GCC
> +# Stack alignment mismatch, proceed with caution.
> +# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
> +# (8B stack alignment).
> +# See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53383
> +#
> +# The "-msse" in the first argument is there so that the
> +# -mpreferred-stack-boundary=3 build error:
> +#
> +#  -mpreferred-stack-boundary=3 is not between 4 and 12
> +#
> +# can be triggered. Otherwise gcc doesn't complain.
> +CC_FLAGS_FPU += -mhard-float
> +CC_FLAGS_FPU += $(call cc-option,-msse 
> -mpreferred-stack-boundary=3,-mpreferred-stack-boundary=4)
> +endif

I was expecting to see this (now duplicate) hunk come _out_ of
lib/Makefile somewhere in the series.

Did I miss that, or is there something keeping the duplicate there?


Re: [PATCH v2 12/14] sh: Add support for suppressing warning backtraces

2024-03-29 Thread Guenter Roeck
On Wed, Mar 27, 2024 at 07:39:20PM +, Simon Horman wrote:
[ ... ]
> > > 
> > > Hi Guenter,
> > > 
> > > a minor nit from my side: this change results in a Kernel doc warning.
> > > 
> > >   .../bug.h:29: warning: expecting prototype for _EMIT_BUG_ENTRY(). 
> > > Prototype was for HAVE_BUG_FUNCTION() instead
> > > 
> > > Perhaps either the new code should be placed above the Kernel doc,
> > > or scripts/kernel-doc should be enhanced?
> > > 
> > 
> > Thanks a lot for the feedback.
> > 
> > The definition block needs to be inside CONFIG_DEBUG_BUGVERBOSE,
> > so it would be a bit odd to move it above the documentation
> > just to make kerneldoc happy. I am not really sure what to do
> > about it.
> 
> FWIW, I agree that would be odd.
> But perhaps the #ifdef could also move above the Kernel doc?
> Maybe not a great idea, but the best one I've had so far.
> 

I did that for the next version of the patch series. It is a bit more
clumsy, so I left it as a separate patch on top of this patch. I'd
still like to get input from others before making the change final.

Thanks,
Guenter


Re: [powerpc] WARN at drivers/scsi/sg.c:2236 (sg_remove_sfp_usercontext)

2024-03-29 Thread Sachin Sant



> Can you check the debug patch below and provide output?
> If I'm right, the warning should be gone and you should just get the
> "Modification triggered" message instead. If I'm wrong, we should at least
> see how many references d_ref has left.
> 

With the debug patch applied, code says d_ref value is 2

# ./ioctl_sg01 
tst_test.c:1741: TINFO: LTP version: 20210524-2511-g00b497c47
tst_test.c:1625: TINFO: Timeout per run is 1h 00m 30s
ioctl_sg01.c:83: TINFO: Found SCSI device /dev/sg0
[   36.016630] [ cut here ]
[   36.016674] WARNING: CPU: 19 PID: 460 at drivers/scsi/sg.c:2238 
sg_remove_sfp_usercontext+0x270/0x298 [sg]
[   36.016707] Modules linked in: rpadlpar_io rpaphp xsk_diag nft_fib_inet 
nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 
nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 
nf_defrag_ipv4 bonding tls rfkill ip_set nf_tables nfnetlink sunrpc binfmt_misc 
pseries_rng vmx_crypto xfs libcrc32c sd_mod sr_mod t10_pi 
crc64_rocksoft_generic cdrom crc64_rocksoft crc64 sg ibmvscsi 
scsi_transport_srp ibmveth fuse
[   36.016834] CPU: 19 PID: 460 Comm: kworker/19:1 Kdump: loaded Not tainted 
6.9.0-rc1-next-20240328-dirty #3
[   36.016849] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf06 
of:IBM,FW1060.00 (NH1060_018) hv:phyp pSeries
[   36.016868] Workqueue: events sg_remove_sfp_usercontext [sg]
[   36.016889] NIP:  c00815cf4110 LR: c00815cf4000 CTR: c05393b0
[   36.016903] REGS: c0009414fae0 TRAP: 0700   Not tainted  
(6.9.0-rc1-next-20240328-dirty)
[   36.016921] MSR:  8282b033   CR: 
44000448  XER: 
[   36.016962] CFAR: c00815cf400c IRQMASK: 0
[   36.016962] GPR00: c00815cf4000 c0009414fd80 c00815d18900
[   36.016962] GPR04: c000 0023 c8dee000 0022
[   36.016962] GPR08: 00038a6d 0002  c00815cf8c10
[   36.016962] GPR12: c05393b0 c0038ffe8b00 c01a2bac c8e4e980
[   36.016962] GPR16:
[   36.016962] GPR20: c0038c993b00 c8dea030 c8dea000 c000a2712000
[   36.016962] GPR24:  cc3bd380 c45ab205 c8deb330
[   36.016962] GPR28: c0038c993b00 c8dea080 c8deb328 cc3bd418
[   36.017107] NIP [c00815cf4110] sg_remove_sfp_usercontext+0x270/0x298 [sg]
[   36.017129] LR [c00815cf4000] sg_remove_sfp_usercontext+0x160/0x298 [sg]
[   36.017144] Call Trace:
[   36.017148] [c0009414fd80] [c00815cf4000] 
sg_remove_sfp_usercontext+0x160/0x298 [sg] (unreliable)
[   36.017169] [c0009414fe40] [c019337c] 
process_one_work+0x20c/0x4f4
[   36.017189] [c0009414fef0] [c01942fc] worker_thread+0x378/0x544
[   36.017208] [c0009414ff90] [c01a2cdc] kthread+0x138/0x140
[   36.017225] [c0009414ffe0] [c000df98] 
start_kernel_thread+0x14/0x18
[   36.017241] Code: 3bf90098 e8c98310 3d22 e8698010 48004509 e8410018 
7ec3b378 48004b15 e8410018 81390098 2c090001 4182ff04 <0fe0> 80990098 
3d22 78840020
[   36.017289] ---[ end trace  ]---
[   36.017302] d_ref=2
ioctl_sg01.c:124: TPASS: Output buffer is empty, no data leaked
[   44.707319] d_ref=2

Summary:
passed   1
failed   0
broken   0
skipped  0
warnings 0


-- Sachin

Re: FAILED: Patch "powerpc: xor_vmx: Add '-mhard-float' to CFLAGS" failed to apply to 5.10-stable tree

2024-03-29 Thread Greg KH
On Wed, Mar 27, 2024 at 08:16:13AM -0700, Nathan Chancellor wrote:
> On Wed, Mar 27, 2024 at 08:20:07AM -0400, Sasha Levin wrote:
> > The patch below does not apply to the 5.10-stable tree.
> > If someone wants it applied there, or to any other stable or longterm
> > tree, then please email the backport, including the original git commit
> > id to .
> ...
> > -- original commit in Linus's tree --
> > 
> > From 35f20786c481d5ced9283ff42de5c69b65e5ed13 Mon Sep 17 00:00:00 2001
> > From: Nathan Chancellor 
> > Date: Sat, 27 Jan 2024 11:07:43 -0700
> > Subject: [PATCH] powerpc: xor_vmx: Add '-mhard-float' to CFLAGS
> 
> I have attached a backport that will work for 5.15 and earlier. I think
> you worked around this conflict in 5.15 by taking 04e85bbf71c9 but I am
> not sure that is a smart idea. I think it might just be better to drop
> that dependency and apply this version in 5.15.

I'll go drop it and take this version, thanks!

greg k-h


Re: [powerpc] WARN at drivers/scsi/sg.c:2236 (sg_remove_sfp_usercontext)

2024-03-29 Thread Alexander Wetzel
> Following WARN_ON_ONCE is triggered while running LTP tests
> (specifically ioctl_sg01) on IBM Power booted with 6.9.0-rc1-next-20240328
>
> [   64.230233] [ cut here ]
> [   64.230269] WARNING: CPU: 10 PID: 452 at drivers/scsi/sg.c:2236 
> sg_remove_sfp_usercontext+0x270/0x280 [sg]
> [   64.230302] Modules linked in: rpadlpar_io rpaphp xsk_diag nft_fib_inet 
> nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 
> nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack 
> nf_defrag_ipv6 nf_defrag_ipv4 bonding tls rfkill ip_set nf_tables nfnetlink 
> sunrpc binfmt_misc pseries_rng vmx_crypto xfs libcrc32c sd_mod sr_mod t10_pi 
> crc64_rocksoft_generic cdrom crc64_rocksoft crc64 sg ibmvscsi ibmveth 
> scsi_transport_srp fuse
> [   64.230420] CPU: 10 PID: 452 Comm: kworker/10:1 Kdump: loaded Not tainted 
> 6.9.0-rc1-next-20240328 #2
> [   64.230438] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf06 
> of:IBM,FW1060.00 (NH1060_018) hv:phyp pSeries
> [   64.230449] Workqueue: events sg_remove_sfp_usercontext [sg]
> [   64.230468] NIP:  c00815c34110 LR: c00815c33ffc CTR: 
> c05393b0
> [   64.230485] REGS: cc1efae0 TRAP: 0700   Not tainted  
> (6.9.0-rc1-next-20240328)
> [   64.230498] MSR:  8282b033   CR: 
> 44000408  XER: 
> [   64.230535] CFAR: c00815c3400c IRQMASK: 0
> [   64.230535] GPR00: c00815c33ffc cc1efd80 c00815c58900 
> cca8ae98
> [   64.230535] GPR04: c000 0023 c7c2e000 
> 0022
> [   64.230535] GPR08: 00038a13 0002  
> c00815c38bc0
> [   64.230535] GPR12: c05393b0 c0038fff3f00 c01a2bac 
> c7c7a9c0
> [   64.230535] GPR16:    
> 
> [   64.230535] GPR20: c0038c3f3b00 c7c10030 c7c1 
> c000901c
> [   64.230535] GPR24:  cca8ae00 c45a5805 
> c7c11330
> [   64.230535] GPR28: c0038c3f3b00 c7c10080 c7c11328 
> c2fdee54
> [   64.230671] NIP [c00815c34110] sg_remove_sfp_usercontext+0x270/0x280 
> [sg]
> [   64.230690] LR [c00815c33ffc] sg_remove_sfp_usercontext+0x15c/0x280 
> [sg]
> [   64.230709] Call Trace:
> [   64.230716] [cc1efd80] [c00815c33ffc] 
> sg_remove_sfp_usercontext+0x15c/0x280 [sg] (unreliable)
> [   64.230740] [cc1efe40] [c019337c] 
> process_one_work+0x20c/0x4f4
> [   64.230767] [cc1efef0] [c01942fc] worker_thread+0x378/0x544
> [   64.230787] [cc1eff90] [c01a2cdc] kthread+0x138/0x140
> [   64.230801] [cc1effe0] [c000df98] 
> start_kernel_thread+0x14/0x18
> [   64.230819] Code: e8c98310 3d22 e8698010 480044bd e8410018 7ec3b378 
> 48004ac9 e8410018 38790098 81390098 2c090001 4182ff04 <0fe0> 4bfffefc 
> 000247e0 
> [   64.230857] ---[ end trace  ]---
>
> This WARN_ON was introduced with
> commit 27f58c04a8f438078583041468ec60597841284d
> scsi: sg: Avoid sg device teardown race
>
> Reverting the patch avoids the warning. The test case passes irrespective of
> whether the patch is present or not.
>

The new WARN_ON_ONCE is only an additional logic check. When it
triggers it also should trigger when you undo the rest of the change.

But when it triggers something with the driver logic must be off.
(Or my understanding of the intent of the code is worse than assumed:-)

Looking into the d_ref logic I see two additional problems not addressed
by the original patch when sg_add_sfp() fails:
 1) sg_open() is then also calling first scsi_device_put() and then
sg_device_destroy() via kref_put(). That's the wrong order.

 2) When sg_add_sfp() fails we never call kref_get(&sdp->d_ref).
Thus we should not call kref_put() here at all.

Thus your warning above could be triggered by an error within
sg_add_sfp(): In that case d_ref would already be zero when the code
gets to the warning.
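
The effect of an unbalanced put can be illustrated outside the kernel. What follows is a minimal user-space sketch, not sg.c code: struct toy_ref and every toy_* function are hypothetical stand-ins for a kref-style counter, and the failing-open scenario assumes that no reference was taken before the failure.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-in for a kref-style counter (not the kernel API). */
struct toy_ref {
	int count;
};

static void toy_ref_init(struct toy_ref *r) { r->count = 1; }
static void toy_ref_get(struct toy_ref *r)  { r->count++; }

/* Returns true when this put dropped the last reference. */
static bool toy_ref_put(struct toy_ref *r)
{
	return --r->count == 0;
}

/* Balanced path: the open takes a reference and the release drops it again. */
static int toy_refs_after_open_and_release(void)
{
	struct toy_ref dev;

	toy_ref_init(&dev);	/* reference owned by the device itself */
	toy_ref_get(&dev);	/* taken on behalf of the opened file */
	toy_ref_put(&dev);	/* dropped when the file is released */
	return dev.count;
}

/*
 * Failed open, assuming no reference was taken before the failure.
 * If the error path still does a put, the device's own reference is
 * consumed and the counter reaches zero too early.
 */
static int toy_refs_after_failed_open(bool error_path_puts)
{
	struct toy_ref dev;

	toy_ref_init(&dev);
	if (error_path_puts)
		toy_ref_put(&dev);
	return dev.count;
}
```

In this model the balanced path and the failed open without a put both leave one reference, while the failed open with an extra put leaves none.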

Can you check the debug patch below and provide output?
If I'm right, the warning should be gone and you should just get the
"Modification triggered" message instead. If I'm wrong, we should at least
see how many references d_ref has left.

Alexander
---
 drivers/scsi/sg.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index ff6894ce5404..1c27d5f8f384 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -373,7 +373,8 @@ sg_open(struct inode *inode, struct file *filp)
scsi_autopm_put_device(sdp->device);
 sdp_put:
scsi_device_put(sdp->device);
-   goto sg_put;
+   pr_warn("%s: Modification triggered\n", __func__);
+   return retval;
 }
 
 /* Release resources associated with a successful sg_open()
@@ -2233,7 +2234,8 @@ sg_remove_sfp_usercontext(struct work_struct *work)
"sg_remove_sfp: sfp=0x%p\n", sfp));

[powerpc] WARN at drivers/scsi/sg.c:2236 (sg_remove_sfp_usercontext)

2024-03-29 Thread Sachin Sant
Following WARN_ON_ONCE is triggered while running LTP tests
(specifically ioctl_sg01) on IBM Power booted with 6.9.0-rc1-next-20240328

[   64.230233] [ cut here ]
[   64.230269] WARNING: CPU: 10 PID: 452 at drivers/scsi/sg.c:2236 
sg_remove_sfp_usercontext+0x270/0x280 [sg]
[   64.230302] Modules linked in: rpadlpar_io rpaphp xsk_diag nft_fib_inet 
nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 
nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 
nf_defrag_ipv4 bonding tls rfkill ip_set nf_tables nfnetlink sunrpc binfmt_misc 
pseries_rng vmx_crypto xfs libcrc32c sd_mod sr_mod t10_pi 
crc64_rocksoft_generic cdrom crc64_rocksoft crc64 sg ibmvscsi ibmveth 
scsi_transport_srp fuse
[   64.230420] CPU: 10 PID: 452 Comm: kworker/10:1 Kdump: loaded Not tainted 
6.9.0-rc1-next-20240328 #2
[   64.230438] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf06 
of:IBM,FW1060.00 (NH1060_018) hv:phyp pSeries
[   64.230449] Workqueue: events sg_remove_sfp_usercontext [sg]
[   64.230468] NIP:  c00815c34110 LR: c00815c33ffc CTR: c05393b0
[   64.230485] REGS: cc1efae0 TRAP: 0700   Not tainted  
(6.9.0-rc1-next-20240328)
[   64.230498] MSR:  8282b033   CR: 
44000408  XER: 
[   64.230535] CFAR: c00815c3400c IRQMASK: 0
[   64.230535] GPR00: c00815c33ffc cc1efd80 c00815c58900 
cca8ae98
[   64.230535] GPR04: c000 0023 c7c2e000 
0022
[   64.230535] GPR08: 00038a13 0002  
c00815c38bc0
[   64.230535] GPR12: c05393b0 c0038fff3f00 c01a2bac 
c7c7a9c0
[   64.230535] GPR16:    

[   64.230535] GPR20: c0038c3f3b00 c7c10030 c7c1 
c000901c
[   64.230535] GPR24:  cca8ae00 c45a5805 
c7c11330
[   64.230535] GPR28: c0038c3f3b00 c7c10080 c7c11328 
c2fdee54
[   64.230671] NIP [c00815c34110] sg_remove_sfp_usercontext+0x270/0x280 [sg]
[   64.230690] LR [c00815c33ffc] sg_remove_sfp_usercontext+0x15c/0x280 [sg]
[   64.230709] Call Trace:
[   64.230716] [cc1efd80] [c00815c33ffc] 
sg_remove_sfp_usercontext+0x15c/0x280 [sg] (unreliable)
[   64.230740] [cc1efe40] [c019337c] 
process_one_work+0x20c/0x4f4
[   64.230767] [cc1efef0] [c01942fc] worker_thread+0x378/0x544
[   64.230787] [cc1eff90] [c01a2cdc] kthread+0x138/0x140
[   64.230801] [cc1effe0] [c000df98] 
start_kernel_thread+0x14/0x18
[   64.230819] Code: e8c98310 3d22 e8698010 480044bd e8410018 7ec3b378 
48004ac9 e8410018 38790098 81390098 2c090001 4182ff04 <0fe0> 4bfffefc 
000247e0 
[   64.230857] ---[ end trace  ]---

This WARN_ON was introduced with
commit 27f58c04a8f438078583041468ec60597841284d
scsi: sg: Avoid sg device teardown race

Reverting the patch avoids the warning. The test case passes irrespective of
whether the patch is present or not.

-- Sachin

Re: [PATCH v11 00/11] Support page table check PowerPC

2024-03-29 Thread Christophe Leroy


Le 28/03/2024 à 08:57, Christophe Leroy a écrit :
> 
> 
> Le 28/03/2024 à 07:52, Christophe Leroy a écrit :
>>
>>
>> Le 28/03/2024 à 05:55, Rohan McLure a écrit :
>>> Support page table check on all PowerPC platforms. This works by
>>> serialising assignments, reassignments and clears of page table
>>> entries at each level in order to ensure that anonymous mappings
>>> have at most one writable consumer, and likewise that file-backed
>>> mappings are not simultaneously also anonymous mappings.
>>>
>>> In order to support this infrastructure, a number of stubs must be
>>> defined for all powerpc platforms. Additionally, separate set_pte_at()
>>> and set_pte_at_unchecked(), to allow for internal, uninstrumented 
>>> mappings.
>>
>> I gave it a try on QEMU e500 (64 bits), and get the following Oops. 
>> What do I have to look for ?
>>
>> Freeing unused kernel image (initmem) memory: 2588K
>> This architecture does not have kernel memory protection.
>> Run /init as init process
>> [ cut here ]
>> kernel BUG at mm/page_table_check.c:119!
>> Oops: Exception in kernel mode, sig: 5 [#1]
>> BE PAGE_SIZE=4K SMP NR_CPUS=32 QEMU e500
> 
> Same problem on my 8xx board:
> 
> [    7.358146] Freeing unused kernel image (initmem) memory: 448K
> [    7.363957] Run /init as init process
> [    7.370955] [ cut here ]
> [    7.375411] kernel BUG at mm/page_table_check.c:119!
> [    7.380393] Oops: Exception in kernel mode, sig: 5 [#1]
> [    7.385621] BE PAGE_SIZE=16K PREEMPT CMPC885

Both problems are fixed by the following change:

diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 413d01a51e6f..5b932632a5d7 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -29,6 +29,8 @@ static inline pte_basic_t pte_update(struct mm_struct 
*mm, unsigned long addr, p

  #ifndef __ASSEMBLY__

+#include 
+
  extern int icache_44x_need_flush;

  /*
@@ -92,7 +94,11 @@ static inline void ptep_set_wrprotect(struct 
mm_struct *mm, unsigned long addr,
  static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned 
long addr,
   pte_t *ptep)
  {
-   return __pte(pte_update(mm, addr, ptep, ~0UL, 0, 0));
+   pte_t old_pte = __pte(pte_update(mm, addr, ptep, ~0UL, 0, 0));
+
+   page_table_check_pte_clear(mm, addr, old_pte);
+
+   return old_pte;
  }
  #define __HAVE_ARCH_PTEP_GET_AND_CLEAR




[PATCH v4 15/15] selftests/fpu: Allow building on other architectures

2024-03-29 Thread Samuel Holland
Now that ARCH_HAS_KERNEL_FPU_SUPPORT provides a common way to compile
and run floating-point code, this test is no longer x86-specific.

Reviewed-by: Christoph Hellwig 
Signed-off-by: Samuel Holland 
---

(no changes since v1)

 lib/Kconfig.debug   |  2 +-
 lib/Makefile| 25 ++---
 lib/test_fpu_glue.c |  5 -
 3 files changed, 7 insertions(+), 25 deletions(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index c63a5fbf1f1c..f93e778e0405 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2890,7 +2890,7 @@ config TEST_FREE_PAGES
 
 config TEST_FPU
tristate "Test floating point operations in kernel space"
-   depends on X86 && !KCOV_INSTRUMENT_ALL
+   depends on ARCH_HAS_KERNEL_FPU_SUPPORT && !KCOV_INSTRUMENT_ALL
help
  Enable this option to add /sys/kernel/debug/selftest_helpers/test_fpu
  which will trigger a sequence of floating point operations. This is 
used
diff --git a/lib/Makefile b/lib/Makefile
index fcb35bf50979..e44ad11f77b5 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -110,31 +110,10 @@ CFLAGS_test_fprobe.o += $(CC_FLAGS_FTRACE)
 obj-$(CONFIG_FPROBE_SANITY_TEST) += test_fprobe.o
 obj-$(CONFIG_TEST_OBJPOOL) += test_objpool.o
 
-#
-# CFLAGS for compiling floating point code inside the kernel. x86/Makefile 
turns
-# off the generation of FPU/SSE* instructions for kernel proper but FPU_FLAGS
-# get appended last to CFLAGS and thus override those previous compiler 
options.
-#
-FPU_CFLAGS := -msse -msse2
-ifdef CONFIG_CC_IS_GCC
-# Stack alignment mismatch, proceed with caution.
-# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
-# (8B stack alignment).
-# See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53383
-#
-# The "-msse" in the first argument is there so that the
-# -mpreferred-stack-boundary=3 build error:
-#
-#  -mpreferred-stack-boundary=3 is not between 4 and 12
-#
-# can be triggered. Otherwise gcc doesn't complain.
-FPU_CFLAGS += -mhard-float
-FPU_CFLAGS += $(call cc-option,-msse 
-mpreferred-stack-boundary=3,-mpreferred-stack-boundary=4)
-endif
-
 obj-$(CONFIG_TEST_FPU) += test_fpu.o
 test_fpu-y := test_fpu_glue.o test_fpu_impl.o
-CFLAGS_test_fpu_impl.o += $(FPU_CFLAGS)
+CFLAGS_test_fpu_impl.o += $(CC_FLAGS_FPU)
+CFLAGS_REMOVE_test_fpu_impl.o += $(CC_FLAGS_NO_FPU)
 
 # Some KUnit files (hooks.o) need to be built-in even when KUnit is a module,
 # so we can't just use obj-$(CONFIG_KUNIT).
diff --git a/lib/test_fpu_glue.c b/lib/test_fpu_glue.c
index 85963d7be826..eef282a2715f 100644
--- a/lib/test_fpu_glue.c
+++ b/lib/test_fpu_glue.c
@@ -17,7 +17,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #include "test_fpu.h"
 
@@ -38,6 +38,9 @@ static struct dentry *selftest_dir;
 
 static int __init test_fpu_init(void)
 {
+   if (!kernel_fpu_available())
+   return -EINVAL;
+
selftest_dir = debugfs_create_dir("selftest_helpers", NULL);
if (!selftest_dir)
return -ENOMEM;
-- 
2.44.0



[PATCH v4 14/15] selftests/fpu: Move FP code to a separate translation unit

2024-03-29 Thread Samuel Holland
This ensures no compiler-generated floating-point code can appear
outside kernel_fpu_{begin,end}() sections, and some architectures
enforce this separation.

Reviewed-by: Christoph Hellwig 
Signed-off-by: Samuel Holland 
---

(no changes since v2)

Changes in v2:
 - Declare test_fpu() in a header

 lib/Makefile|  3 ++-
 lib/test_fpu.h  |  8 +++
 lib/{test_fpu.c => test_fpu_glue.c} | 32 +
 lib/test_fpu_impl.c | 37 +
 4 files changed, 48 insertions(+), 32 deletions(-)
 create mode 100644 lib/test_fpu.h
 rename lib/{test_fpu.c => test_fpu_glue.c} (71%)
 create mode 100644 lib/test_fpu_impl.c

diff --git a/lib/Makefile b/lib/Makefile
index ffc6b2341b45..fcb35bf50979 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -133,7 +133,8 @@ FPU_CFLAGS += $(call cc-option,-msse 
-mpreferred-stack-boundary=3,-mpreferred-st
 endif
 
 obj-$(CONFIG_TEST_FPU) += test_fpu.o
-CFLAGS_test_fpu.o += $(FPU_CFLAGS)
+test_fpu-y := test_fpu_glue.o test_fpu_impl.o
+CFLAGS_test_fpu_impl.o += $(FPU_CFLAGS)
 
 # Some KUnit files (hooks.o) need to be built-in even when KUnit is a module,
 # so we can't just use obj-$(CONFIG_KUNIT).
diff --git a/lib/test_fpu.h b/lib/test_fpu.h
new file mode 100644
index ..4459807084bc
--- /dev/null
+++ b/lib/test_fpu.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+
+#ifndef _LIB_TEST_FPU_H
+#define _LIB_TEST_FPU_H
+
+int test_fpu(void);
+
+#endif
diff --git a/lib/test_fpu.c b/lib/test_fpu_glue.c
similarity index 71%
rename from lib/test_fpu.c
rename to lib/test_fpu_glue.c
index e82db19fed84..85963d7be826 100644
--- a/lib/test_fpu.c
+++ b/lib/test_fpu_glue.c
@@ -19,37 +19,7 @@
 #include 
 #include 
 
-static int test_fpu(void)
-{
-   /*
-* This sequence of operations tests that rounding mode is
-* to nearest and that denormal numbers are supported.
-* Volatile variables are used to avoid compiler optimizing
-* the calculations away.
-*/
-   volatile double a, b, c, d, e, f, g;
-
-   a = 4.0;
-   b = 1e-15;
-   c = 1e-310;
-
-   /* Sets precision flag */
-   d = a + b;
-
-   /* Result depends on rounding mode */
-   e = a + b / 2;
-
-   /* Denormal and very large values */
-   f = b / c;
-
-   /* Depends on denormal support */
-   g = a + c * f;
-
-   if (d > a && e > a && g > a)
-   return 0;
-   else
-   return -EINVAL;
-}
+#include "test_fpu.h"
 
 static int test_fpu_get(void *data, u64 *val)
 {
diff --git a/lib/test_fpu_impl.c b/lib/test_fpu_impl.c
new file mode 100644
index ..777894dbbe86
--- /dev/null
+++ b/lib/test_fpu_impl.c
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: GPL-2.0+
+
+#include 
+
+#include "test_fpu.h"
+
+int test_fpu(void)
+{
+   /*
+* This sequence of operations tests that rounding mode is
+* to nearest and that denormal numbers are supported.
+* Volatile variables are used to avoid compiler optimizing
+* the calculations away.
+*/
+   volatile double a, b, c, d, e, f, g;
+
+   a = 4.0;
+   b = 1e-15;
+   c = 1e-310;
+
+   /* Sets precision flag */
+   d = a + b;
+
+   /* Result depends on rounding mode */
+   e = a + b / 2;
+
+   /* Denormal and very large values */
+   f = b / c;
+
+   /* Depends on denormal support */
+   g = a + c * f;
+
+   if (d > a && e > a && g > a)
+   return 0;
+   else
+   return -EINVAL;
+}
-- 
2.44.0
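
The arithmetic in test_fpu() does not depend on kernel facilities, so it can be tried in user space. The sketch below is a copy for experimentation only: toy_test_fpu() is a hypothetical name, and a zero return assumes IEEE 754 double precision with round-to-nearest and denormals enabled, which is the usual default for a hosted C program.

```c
#include <errno.h>

/* Hypothetical user-space copy of the checks done by test_fpu(). */
int toy_test_fpu(void)
{
	/* volatile keeps the compiler from folding the arithmetic away */
	volatile double a, b, c, d, e, f, g;

	a = 4.0;
	b = 1e-15;	/* slightly more than one ulp of 4.0 (about 8.9e-16) */
	c = 1e-310;	/* a denormal (subnormal) double */

	d = a + b;	/* inexact result, still above a */
	e = a + b / 2;	/* 5e-16 exceeds half an ulp, so nearest rounds up */
	f = b / c;	/* about 1e295, large but finite */
	g = a + c * f;	/* c * f is about 1e-15 only if denormals are kept */

	if (d > a && e > a && g > a)
		return 0;
	return -EINVAL;
}
```

If denormals are flushed to zero, c behaves as 0, f becomes infinite, and c * f is NaN, so the last comparison fails. That is the condition the kernel test is meant to catch.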



[PATCH v4 13/15] drm/amd/display: Use ARCH_HAS_KERNEL_FPU_SUPPORT

2024-03-29 Thread Samuel Holland
Now that all previously-supported architectures select
ARCH_HAS_KERNEL_FPU_SUPPORT, this code can depend on that symbol instead
of the existing list of architectures. It can also take advantage of the
common kernel-mode FPU API and method of adjusting CFLAGS.

Acked-by: Alex Deucher 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Samuel Holland 
---

(no changes since v2)

Changes in v2:
 - Split altivec removal to a separate patch
 - Use linux/fpu.h instead of asm/fpu.h in consumers

 drivers/gpu/drm/amd/display/Kconfig   |  2 +-
 .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c| 27 ++
 drivers/gpu/drm/amd/display/dc/dml/Makefile   | 36 ++-
 drivers/gpu/drm/amd/display/dc/dml2/Makefile  | 36 ++-
 4 files changed, 7 insertions(+), 94 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/Kconfig 
b/drivers/gpu/drm/amd/display/Kconfig
index 901d1961b739..5fcd4f778dc3 100644
--- a/drivers/gpu/drm/amd/display/Kconfig
+++ b/drivers/gpu/drm/amd/display/Kconfig
@@ -8,7 +8,7 @@ config DRM_AMD_DC
depends on BROKEN || !CC_IS_CLANG || ARM64 || RISCV || SPARC64 || X86_64
select SND_HDA_COMPONENT if SND_HDA_CORE
# !CC_IS_CLANG: https://github.com/ClangBuiltLinux/linux/issues/1752
-   select DRM_AMD_DC_FP if (X86 || LOONGARCH || (PPC64 && ALTIVEC) || 
(ARM64 && KERNEL_MODE_NEON && !CC_IS_CLANG))
+   select DRM_AMD_DC_FP if ARCH_HAS_KERNEL_FPU_SUPPORT && (!ARM64 || 
!CC_IS_CLANG)
help
  Choose this option if you want to use the new display engine
  support for AMDGPU. This adds required support for Vega and
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
index 0de16796466b..e46f8ce41d87 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
@@ -26,16 +26,7 @@
 
 #include "dc_trace.h"
 
-#if defined(CONFIG_X86)
-#include 
-#elif defined(CONFIG_PPC64)
-#include 
-#include 
-#elif defined(CONFIG_ARM64)
-#include 
-#elif defined(CONFIG_LOONGARCH)
-#include 
-#endif
+#include 
 
 /**
  * DOC: DC FPU manipulation overview
@@ -87,16 +78,9 @@ void dc_fpu_begin(const char *function_name, const int line)
WARN_ON_ONCE(!in_task());
preempt_disable();
depth = __this_cpu_inc_return(fpu_recursion_depth);
-
if (depth == 1) {
-#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH)
+   BUG_ON(!kernel_fpu_available());
kernel_fpu_begin();
-#elif defined(CONFIG_PPC64)
-   if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE))
-   enable_kernel_fp();
-#elif defined(CONFIG_ARM64)
-   kernel_neon_begin();
-#endif
}
 
TRACE_DCN_FPU(true, function_name, line, depth);
@@ -118,14 +102,7 @@ void dc_fpu_end(const char *function_name, const int line)
 
depth = __this_cpu_dec_return(fpu_recursion_depth);
if (depth == 0) {
-#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH)
kernel_fpu_end();
-#elif defined(CONFIG_PPC64)
-   if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE))
-   disable_kernel_fp();
-#elif defined(CONFIG_ARM64)
-   kernel_neon_end();
-#endif
} else {
WARN_ON_ONCE(depth < 0);
}
diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile 
b/drivers/gpu/drm/amd/display/dc/dml/Makefile
index 59d3972341d2..a94b6d546cd1 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
@@ -25,40 +25,8 @@
 # It provides the general basic services required by other DAL
 # subcomponents.
 
-ifdef CONFIG_X86
-dml_ccflags-$(CONFIG_CC_IS_GCC) := -mhard-float
-dml_ccflags := $(dml_ccflags-y) -msse
-endif
-
-ifdef CONFIG_PPC64
-dml_ccflags := -mhard-float
-endif
-
-ifdef CONFIG_ARM64
-dml_rcflags := -mgeneral-regs-only
-endif
-
-ifdef CONFIG_LOONGARCH
-dml_ccflags := -mfpu=64
-dml_rcflags := -msoft-float
-endif
-
-ifdef CONFIG_CC_IS_GCC
-ifneq ($(call gcc-min-version, 70100),y)
-IS_OLD_GCC = 1
-endif
-endif
-
-ifdef CONFIG_X86
-ifdef IS_OLD_GCC
-# Stack alignment mismatch, proceed with caution.
-# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
-# (8B stack alignment).
-dml_ccflags += -mpreferred-stack-boundary=4
-else
-dml_ccflags += -msse2
-endif
-endif
+dml_ccflags := $(CC_FLAGS_FPU)
+dml_rcflags := $(CC_FLAGS_NO_FPU)
 
 ifneq ($(CONFIG_FRAME_WARN),0)
 ifeq ($(filter y,$(CONFIG_KASAN)$(CONFIG_KCSAN)),y)
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/Makefile 
b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
index 7b51364084b5..4f6c804a26ad 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
@@ -24,40 +24,8 @@
 #
 # Makefile for dml2.
 
-ifdef CONFIG_X86
-dml2_ccflags-$(CONFIG_CC_IS_GCC) := -mhard-float
-dml2_ccflags := $(dml2_ccflags-y) -msse
-endif
-
-ifdef CONFIG_PPC64
-dml2_ccflags := 

[PATCH v4 12/15] drm/amd/display: Only use hard-float, not altivec on powerpc

2024-03-29 Thread Samuel Holland
From: Michael Ellerman 

The compiler flags enable altivec, but that is not required; hard-float
is sufficient for the code to build and function.

Drop altivec from the compiler flags and adjust the enable/disable code
to only enable FPU use.

Signed-off-by: Michael Ellerman 
Acked-by: Alex Deucher 
Signed-off-by: Samuel Holland 
---

(no changes since v2)

Changes in v2:
 - New patch for v2

 drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c | 12 ++--
 drivers/gpu/drm/amd/display/dc/dml/Makefile|  2 +-
 drivers/gpu/drm/amd/display/dc/dml2/Makefile   |  2 +-
 3 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
index 4ae4720535a5..0de16796466b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/dc_fpu.c
@@ -92,11 +92,7 @@ void dc_fpu_begin(const char *function_name, const int line)
 #if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH)
kernel_fpu_begin();
 #elif defined(CONFIG_PPC64)
-   if (cpu_has_feature(CPU_FTR_VSX_COMP))
-   enable_kernel_vsx();
-   else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP))
-   enable_kernel_altivec();
-   else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE))
+   if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE))
enable_kernel_fp();
 #elif defined(CONFIG_ARM64)
kernel_neon_begin();
@@ -125,11 +121,7 @@ void dc_fpu_end(const char *function_name, const int line)
 #if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH)
kernel_fpu_end();
 #elif defined(CONFIG_PPC64)
-   if (cpu_has_feature(CPU_FTR_VSX_COMP))
-   disable_kernel_vsx();
-   else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP))
-   disable_kernel_altivec();
-   else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE))
+   if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE))
disable_kernel_fp();
 #elif defined(CONFIG_ARM64)
kernel_neon_end();
diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile 
b/drivers/gpu/drm/amd/display/dc/dml/Makefile
index c4a5efd2dda5..59d3972341d2 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile
@@ -31,7 +31,7 @@ dml_ccflags := $(dml_ccflags-y) -msse
 endif
 
 ifdef CONFIG_PPC64
-dml_ccflags := -mhard-float -maltivec
+dml_ccflags := -mhard-float
 endif
 
 ifdef CONFIG_ARM64
diff --git a/drivers/gpu/drm/amd/display/dc/dml2/Makefile 
b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
index acff3449b8d7..7b51364084b5 100644
--- a/drivers/gpu/drm/amd/display/dc/dml2/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dml2/Makefile
@@ -30,7 +30,7 @@ dml2_ccflags := $(dml2_ccflags-y) -msse
 endif
 
 ifdef CONFIG_PPC64
-dml2_ccflags := -mhard-float -maltivec
+dml2_ccflags := -mhard-float
 endif
 
 ifdef CONFIG_ARM64
-- 
2.44.0
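
dc_fpu_begin() and dc_fpu_end(), touched by this patch, can be nested: the context lines in patch 13/15 show that the FPU is enabled only when the depth counter reaches 1 and released only when it returns to 0. A minimal user-space sketch of that counting scheme follows. The toy_* names and the plain global counter are assumptions standing in for the per-CPU fpu_recursion_depth and the real kernel_fpu_begin()/kernel_fpu_end() calls; preemption handling is left out.

```c
/* Hypothetical stand-ins; the driver uses a per-CPU counter and real FPU calls. */
static int toy_depth;		/* current nesting depth */
static int toy_fpu_on;		/* 1 while the modelled FPU section is open */
static int toy_enable_count;	/* how often the FPU was actually enabled */

static void toy_fpu_begin(void)
{
	if (++toy_depth == 1) {	/* only the outermost caller enables */
		toy_fpu_on = 1;
		toy_enable_count++;
	}
}

static void toy_fpu_end(void)
{
	if (--toy_depth == 0)	/* only the outermost caller releases */
		toy_fpu_on = 0;
}

/*
 * Open two nested sections and close them again. Returns the FPU state
 * observed after the inner section ended but before the outer one did.
 */
static int toy_nested_sections(void)
{
	int on_after_inner_end;

	toy_fpu_begin();
	toy_fpu_begin();
	toy_fpu_end();
	on_after_inner_end = toy_fpu_on;
	toy_fpu_end();
	return on_after_inner_end;
}
```

The inner end must not switch the FPU off, because the outer section is still running floating-point code.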



[PATCH v4 11/15] riscv: Add support for kernel-mode FPU

2024-03-29 Thread Samuel Holland
This is motivated by the amdgpu DRM driver, which needs floating-point
code to support recent hardware. That code is not performance-critical,
so only provide a minimal non-preemptible implementation for now.

Support is limited to riscv64 because riscv32 requires runtime (libgcc)
assistance to convert between doubles and 64-bit integers.

Acked-by: Palmer Dabbelt 
Reviewed-by: Palmer Dabbelt 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Samuel Holland 
---

(no changes since v3)

Changes in v3:
 - Limit riscv ARCH_HAS_KERNEL_FPU_SUPPORT to 64BIT

Changes in v2:
 - Remove RISC-V architecture-specific preprocessor check

 arch/riscv/Kconfig  |  1 +
 arch/riscv/Makefile |  3 +++
 arch/riscv/include/asm/fpu.h| 16 
 arch/riscv/kernel/Makefile  |  1 +
 arch/riscv/kernel/kernel_mode_fpu.c | 28 
 5 files changed, 49 insertions(+)
 create mode 100644 arch/riscv/include/asm/fpu.h
 create mode 100644 arch/riscv/kernel/kernel_mode_fpu.c

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index be09c8836d56..3bcd0d250810 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -27,6 +27,7 @@ config RISCV
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_GIGANTIC_PAGE
select ARCH_HAS_KCOV
+   select ARCH_HAS_KERNEL_FPU_SUPPORT if 64BIT && FPU
select ARCH_HAS_MEMBARRIER_CALLBACKS
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_MMIOWB
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 252d63942f34..76ff4033c854 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -84,6 +84,9 @@ KBUILD_CFLAGS += -march=$(shell echo $(riscv-march-y) | sed 
-E 's/(rv32ima|rv64i
 
 KBUILD_AFLAGS += -march=$(riscv-march-y)
 
+# For C code built with floating-point support, exclude V but keep F and D.
+CC_FLAGS_FPU  := -march=$(shell echo $(riscv-march-y) | sed -E 
's/(rv32ima|rv64ima)([^v_]*)v?/\1\2/')
+
 KBUILD_CFLAGS += -mno-save-restore
 KBUILD_CFLAGS += -DCONFIG_PAGE_OFFSET=$(CONFIG_PAGE_OFFSET)
 
diff --git a/arch/riscv/include/asm/fpu.h b/arch/riscv/include/asm/fpu.h
new file mode 100644
index ..91c04c244e12
--- /dev/null
+++ b/arch/riscv/include/asm/fpu.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023 SiFive
+ */
+
+#ifndef _ASM_RISCV_FPU_H
+#define _ASM_RISCV_FPU_H
+
+#include 
+
+#define kernel_fpu_available() has_fpu()
+
+void kernel_fpu_begin(void);
+void kernel_fpu_end(void);
+
+#endif /* ! _ASM_RISCV_FPU_H */
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 81d94a8ee10f..5b243d46f4b1 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -67,6 +67,7 @@ obj-$(CONFIG_RISCV_MISALIGNED) += unaligned_access_speed.o
 obj-$(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS) += copy-unaligned.o
 
 obj-$(CONFIG_FPU)  += fpu.o
+obj-$(CONFIG_FPU)  += kernel_mode_fpu.o
 obj-$(CONFIG_RISCV_ISA_V)  += vector.o
 obj-$(CONFIG_RISCV_ISA_V)  += kernel_mode_vector.o
 obj-$(CONFIG_SMP)  += smpboot.o
diff --git a/arch/riscv/kernel/kernel_mode_fpu.c b/arch/riscv/kernel/kernel_mode_fpu.c
new file mode 100644
index ..0ac8348876c4
--- /dev/null
+++ b/arch/riscv/kernel/kernel_mode_fpu.c
@@ -0,0 +1,28 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023 SiFive
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+void kernel_fpu_begin(void)
+{
+   preempt_disable();
+   fstate_save(current, task_pt_regs(current));
+   csr_set(CSR_SSTATUS, SR_FS);
+}
+EXPORT_SYMBOL_GPL(kernel_fpu_begin);
+
+void kernel_fpu_end(void)
+{
+   csr_clear(CSR_SSTATUS, SR_FS);
+   fstate_restore(current, task_pt_regs(current));
+   preempt_enable();
+}
+EXPORT_SYMBOL_GPL(kernel_fpu_end);
-- 
2.44.0



[PATCH v4 10/15] x86: Implement ARCH_HAS_KERNEL_FPU_SUPPORT

2024-03-29 Thread Samuel Holland
x86 already provides kernel_fpu_begin() and kernel_fpu_end(), but in a
different header. Add a wrapper header, and export the CFLAGS
adjustments as found in lib/Makefile.

Reviewed-by: Christoph Hellwig 
Signed-off-by: Samuel Holland 
---

(no changes since v1)

 arch/x86/Kconfig   |  1 +
 arch/x86/Makefile  | 20 
 arch/x86/include/asm/fpu.h | 13 +
 3 files changed, 34 insertions(+)
 create mode 100644 arch/x86/include/asm/fpu.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 39886bab943a..7c9d032ee675 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -83,6 +83,7 @@ config X86
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_KCOVif X86_64
+   select ARCH_HAS_KERNEL_FPU_SUPPORT
select ARCH_HAS_MEM_ENCRYPT
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 662d9d4033e6..5a5f5999c505 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -74,6 +74,26 @@ KBUILD_CFLAGS += -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx
 KBUILD_RUSTFLAGS += --target=$(objtree)/scripts/target.json
 KBUILD_RUSTFLAGS += -Ctarget-feature=-sse,-sse2,-sse3,-ssse3,-sse4.1,-sse4.2,-avx,-avx2
 
+#
+# CFLAGS for compiling floating point code inside the kernel.
+#
+CC_FLAGS_FPU := -msse -msse2
+ifdef CONFIG_CC_IS_GCC
+# Stack alignment mismatch, proceed with caution.
+# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
+# (8B stack alignment).
+# See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53383
+#
+# The "-msse" in the first argument is there so that the
+# -mpreferred-stack-boundary=3 build error:
+#
+#  -mpreferred-stack-boundary=3 is not between 4 and 12
+#
+# can be triggered. Otherwise gcc doesn't complain.
+CC_FLAGS_FPU += -mhard-float
+CC_FLAGS_FPU += $(call cc-option,-msse -mpreferred-stack-boundary=3,-mpreferred-stack-boundary=4)
+endif
+
 ifeq ($(CONFIG_X86_KERNEL_IBT),y)
 #
 # Kernel IBT has S_CET.NOTRACK_EN=0, as such the compilers must not generate
diff --git a/arch/x86/include/asm/fpu.h b/arch/x86/include/asm/fpu.h
new file mode 100644
index ..b2743fe19339
--- /dev/null
+++ b/arch/x86/include/asm/fpu.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023 SiFive
+ */
+
+#ifndef _ASM_X86_FPU_H
+#define _ASM_X86_FPU_H
+
+#include 
+
+#define kernel_fpu_available() true
+
+#endif /* ! _ASM_X86_FPU_H */
-- 
2.44.0



[PATCH v4 09/15] x86/fpu: Fix asm/fpu/types.h include guard

2024-03-29 Thread Samuel Holland
The include guard should match the filename, or it will conflict with
the newly-added asm/fpu.h.
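
A minimal userspace sketch of the failure mode follows. The macro and function
names are hypothetical, and the two guarded regions stand in for the two
headers. Because both use the same guard macro, the preprocessor silently skips
the body of whichever one is seen second.

```c
/* First "header": stands in for the existing asm/fpu/types.h. */
#ifndef _DEMO_FPU_H
#define _DEMO_FPU_H
#define DEMO_TYPES_SEEN 1
#endif /* _DEMO_FPU_H */

/* Second "header": stands in for the new asm/fpu.h, reusing the same guard. */
#ifndef _DEMO_FPU_H
#define _DEMO_FPU_H
#define DEMO_API_SEEN 1
#endif /* _DEMO_FPU_H */

/* Returns 1 if the first header's contents were processed. */
int demo_types_seen(void)
{
#ifdef DEMO_TYPES_SEEN
	return 1;
#else
	return 0;
#endif
}

/* Returns 1 if the second header's contents were processed. */
int demo_api_seen(void)
{
#ifdef DEMO_API_SEEN
	return 1;
#else
	return 0;
#endif
}
```

With distinct guard names, as this patch arranges, both bodies are processed.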

Signed-off-by: Samuel Holland 
---

Changes in v4:
 - New patch for v4

 arch/x86/include/asm/fpu/types.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index ace9aa3b78a3..eb17f31b06d2 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -2,8 +2,8 @@
 /*
  * FPU data structures:
  */
-#ifndef _ASM_X86_FPU_H
-#define _ASM_X86_FPU_H
+#ifndef _ASM_X86_FPU_TYPES_H
+#define _ASM_X86_FPU_TYPES_H
 
 #include 
 
@@ -596,4 +596,4 @@ struct fpu_state_config {
 /* FPU state configuration information */
 extern struct fpu_state_config fpu_kernel_cfg, fpu_user_cfg;
 
-#endif /* _ASM_X86_FPU_H */
+#endif /* _ASM_X86_FPU_TYPES_H */
-- 
2.44.0



[PATCH v4 08/15] powerpc: Implement ARCH_HAS_KERNEL_FPU_SUPPORT

2024-03-29 Thread Samuel Holland
PowerPC provides an equivalent to the common kernel-mode FPU API, but in
a different header and using different function names. The PowerPC API
also requires a non-preemptible context. Add a wrapper header, and
export the CFLAGS adjustments.

Acked-by: Michael Ellerman  (powerpc)
Reviewed-by: Christoph Hellwig 
Signed-off-by: Samuel Holland 
---

(no changes since v1)

 arch/powerpc/Kconfig   |  1 +
 arch/powerpc/Makefile  |  5 -
 arch/powerpc/include/asm/fpu.h | 28 
 3 files changed, 33 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/include/asm/fpu.h

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1c4be3373686..c42a57b6839d 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -137,6 +137,7 @@ config PPC
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_HUGEPD  if HUGETLB_PAGE
select ARCH_HAS_KCOV
+   select ARCH_HAS_KERNEL_FPU_SUPPORT  if PPC_FPU
select ARCH_HAS_MEMBARRIER_CALLBACKS
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_MEMREMAP_COMPAT_ALIGN   if PPC_64S_HASH_MMU
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 65261cbe5bfd..93d89f055b70 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -153,6 +153,9 @@ CFLAGS-$(CONFIG_PPC32)  += $(call cc-option, $(MULTIPLEWORD))
 
 CFLAGS-$(CONFIG_PPC32) += $(call cc-option,-mno-readonly-in-sdata)
 
+CC_FLAGS_FPU   := $(call cc-option,-mhard-float)
+CC_FLAGS_NO_FPU:= $(call cc-option,-msoft-float)
+
 ifdef CONFIG_FUNCTION_TRACER
 ifdef CONFIG_ARCH_USING_PATCHABLE_FUNCTION_ENTRY
 KBUILD_CPPFLAGS+= -DCC_USING_PATCHABLE_FUNCTION_ENTRY
@@ -174,7 +177,7 @@ asinstr := $(call as-instr,lis 9$(comma)foo@high,-DHAVE_AS_ATHIGH=1)
 
 KBUILD_CPPFLAGS+= -I $(srctree)/arch/powerpc $(asinstr)
 KBUILD_AFLAGS  += $(AFLAGS-y)
-KBUILD_CFLAGS  += $(call cc-option,-msoft-float)
+KBUILD_CFLAGS  += $(CC_FLAGS_NO_FPU)
 KBUILD_CFLAGS  += $(CFLAGS-y)
 CPP= $(CC) -E $(KBUILD_CFLAGS)
 
diff --git a/arch/powerpc/include/asm/fpu.h b/arch/powerpc/include/asm/fpu.h
new file mode 100644
index ..ca584e4bc40f
--- /dev/null
+++ b/arch/powerpc/include/asm/fpu.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023 SiFive
+ */
+
+#ifndef _ASM_POWERPC_FPU_H
+#define _ASM_POWERPC_FPU_H
+
+#include 
+
+#include 
+#include 
+
+#define kernel_fpu_available() (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE))
+
+static inline void kernel_fpu_begin(void)
+{
+   preempt_disable();
+   enable_kernel_fp();
+}
+
+static inline void kernel_fpu_end(void)
+{
+   disable_kernel_fp();
+   preempt_enable();
+}
+
+#endif /* ! _ASM_POWERPC_FPU_H */
-- 
2.44.0



[PATCH v4 07/15] LoongArch: Implement ARCH_HAS_KERNEL_FPU_SUPPORT

2024-03-29 Thread Samuel Holland
LoongArch already provides kernel_fpu_begin() and kernel_fpu_end() in
asm/fpu.h, so it only needs to add kernel_fpu_available() and export
the CFLAGS adjustments.

Acked-by: WANG Xuerui 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Samuel Holland 
---

(no changes since v3)

Changes in v3:
 - Rebase on v6.9-rc1

 arch/loongarch/Kconfig   | 1 +
 arch/loongarch/Makefile  | 5 -
 arch/loongarch/include/asm/fpu.h | 1 +
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index a5f300ec6f28..2266c6c41c38 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -18,6 +18,7 @@ config LOONGARCH
select ARCH_HAS_CURRENT_STACK_POINTER
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_KCOV
+   select ARCH_HAS_KERNEL_FPU_SUPPORT if CPU_HAS_FPU
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PTE_SPECIAL
diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
index df6caf79537a..efb5440a43ec 100644
--- a/arch/loongarch/Makefile
+++ b/arch/loongarch/Makefile
@@ -26,6 +26,9 @@ endif
 32bit-emul = elf32loongarch
 64bit-emul = elf64loongarch
 
+CC_FLAGS_FPU   := -mfpu=64
+CC_FLAGS_NO_FPU:= -msoft-float
+
 ifdef CONFIG_UNWINDER_ORC
 orc_hash_h := arch/$(SRCARCH)/include/generated/asm/orc_hash.h
 orc_hash_sh := $(srctree)/scripts/orc_hash.sh
@@ -59,7 +62,7 @@ ld-emul   = $(64bit-emul)
 cflags-y   += -mabi=lp64s
 endif
 
-cflags-y   += -pipe -msoft-float
+cflags-y   += -pipe $(CC_FLAGS_NO_FPU)
 LDFLAGS_vmlinux+= -static -n -nostdlib
 
 # When the assembler supports explicit relocation hint, we must use it.
diff --git a/arch/loongarch/include/asm/fpu.h b/arch/loongarch/include/asm/fpu.h
index c2d8962fda00..3177674228f8 100644
--- a/arch/loongarch/include/asm/fpu.h
+++ b/arch/loongarch/include/asm/fpu.h
@@ -21,6 +21,7 @@
 
 struct sigcontext;
 
+#define kernel_fpu_available() cpu_has_fpu
 extern void kernel_fpu_begin(void);
 extern void kernel_fpu_end(void);
 
-- 
2.44.0



[PATCH v4 06/15] lib/raid6: Use CC_FLAGS_FPU for NEON CFLAGS

2024-03-29 Thread Samuel Holland
Now that CC_FLAGS_FPU is exported and can be used anywhere in the source
tree, use it instead of duplicating the flags here.

Reviewed-by: Christoph Hellwig 
Signed-off-by: Samuel Holland 
---

Changes in v4:
 - Add missed CFLAGS changes for recov_neon_inner.c
   (fixes arm build failures)

 lib/raid6/Makefile | 33 ++---
 1 file changed, 10 insertions(+), 23 deletions(-)

diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
index 385a94aa0b99..0e88bfe6445b 100644
--- a/lib/raid6/Makefile
+++ b/lib/raid6/Makefile
@@ -33,25 +33,6 @@ CFLAGS_REMOVE_vpermxor8.o += -msoft-float
 endif
 endif
 
-# The GCC option -ffreestanding is required in order to compile code containing
-# ARM/NEON intrinsics in a non C99-compliant environment (such as the kernel)
-ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
-NEON_FLAGS := -ffreestanding
-# Enable 
-NEON_FLAGS += -isystem $(shell $(CC) -print-file-name=include)
-ifeq ($(ARCH),arm)
-NEON_FLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
-endif
-CFLAGS_recov_neon_inner.o += $(NEON_FLAGS)
-ifeq ($(ARCH),arm64)
-CFLAGS_REMOVE_recov_neon_inner.o += -mgeneral-regs-only
-CFLAGS_REMOVE_neon1.o += -mgeneral-regs-only
-CFLAGS_REMOVE_neon2.o += -mgeneral-regs-only
-CFLAGS_REMOVE_neon4.o += -mgeneral-regs-only
-CFLAGS_REMOVE_neon8.o += -mgeneral-regs-only
-endif
-endif
-
 quiet_cmd_unroll = UNROLL  $@
   cmd_unroll = $(AWK) -v N=$* -f $(srctree)/$(src)/unroll.awk < $< > $@
 
@@ -75,10 +56,16 @@ targets += vpermxor1.c vpermxor2.c vpermxor4.c vpermxor8.c
 $(obj)/vpermxor%.c: $(src)/vpermxor.uc $(src)/unroll.awk FORCE
$(call if_changed,unroll)
 
-CFLAGS_neon1.o += $(NEON_FLAGS)
-CFLAGS_neon2.o += $(NEON_FLAGS)
-CFLAGS_neon4.o += $(NEON_FLAGS)
-CFLAGS_neon8.o += $(NEON_FLAGS)
+CFLAGS_neon1.o += $(CC_FLAGS_FPU)
+CFLAGS_neon2.o += $(CC_FLAGS_FPU)
+CFLAGS_neon4.o += $(CC_FLAGS_FPU)
+CFLAGS_neon8.o += $(CC_FLAGS_FPU)
+CFLAGS_recov_neon_inner.o += $(CC_FLAGS_FPU)
+CFLAGS_REMOVE_neon1.o += $(CC_FLAGS_NO_FPU)
+CFLAGS_REMOVE_neon2.o += $(CC_FLAGS_NO_FPU)
+CFLAGS_REMOVE_neon4.o += $(CC_FLAGS_NO_FPU)
+CFLAGS_REMOVE_neon8.o += $(CC_FLAGS_NO_FPU)
+CFLAGS_REMOVE_recov_neon_inner.o += $(CC_FLAGS_NO_FPU)
 targets += neon1.c neon2.c neon4.c neon8.c
 $(obj)/neon%.c: $(src)/neon.uc $(src)/unroll.awk FORCE
$(call if_changed,unroll)
-- 
2.44.0



[PATCH v4 05/15] arm64: crypto: Use CC_FLAGS_FPU for NEON CFLAGS

2024-03-29 Thread Samuel Holland
Now that CC_FLAGS_FPU is exported and can be used anywhere in the source
tree, use it instead of duplicating the flags here.

Reviewed-by: Christoph Hellwig 
Signed-off-by: Samuel Holland 
---

(no changes since v2)

Changes in v2:
 - New patch for v2

 arch/arm64/lib/Makefile | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
index 29490be2546b..13e6a2829116 100644
--- a/arch/arm64/lib/Makefile
+++ b/arch/arm64/lib/Makefile
@@ -7,10 +7,8 @@ lib-y  := clear_user.o delay.o copy_from_user.o \
 
 ifeq ($(CONFIG_KERNEL_MODE_NEON), y)
 obj-$(CONFIG_XOR_BLOCKS)   += xor-neon.o
-CFLAGS_REMOVE_xor-neon.o   += -mgeneral-regs-only
-CFLAGS_xor-neon.o  += -ffreestanding
-# Enable 
-CFLAGS_xor-neon.o  += -isystem $(shell $(CC) -print-file-name=include)
+CFLAGS_xor-neon.o  += $(CC_FLAGS_FPU)
+CFLAGS_REMOVE_xor-neon.o   += $(CC_FLAGS_NO_FPU)
 endif
 
 lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
-- 
2.44.0



[PATCH v4 04/15] arm64: Implement ARCH_HAS_KERNEL_FPU_SUPPORT

2024-03-29 Thread Samuel Holland
arm64 provides an equivalent to the common kernel-mode FPU API, but in a
different header and using different function names. Add a wrapper
header, and export CFLAGS adjustments as found in lib/raid6/Makefile.

Reviewed-by: Christoph Hellwig 
Signed-off-by: Samuel Holland 
---

(no changes since v2)

Changes in v2:
 - Remove file name from header comment

 arch/arm64/Kconfig   |  1 +
 arch/arm64/Makefile  |  9 -
 arch/arm64/include/asm/fpu.h | 15 +++
 3 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/fpu.h

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7b11c98b3e84..67f0d3b5b7df 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -30,6 +30,7 @@ config ARM64
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_GIGANTIC_PAGE
select ARCH_HAS_KCOV
+   select ARCH_HAS_KERNEL_FPU_SUPPORT if KERNEL_MODE_NEON
select ARCH_HAS_KEEPINITRD
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 0e075d3c546b..3e863e5b0169 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -36,7 +36,14 @@ ifeq ($(CONFIG_BROKEN_GAS_INST),y)
 $(warning Detected assembler with broken .inst; disassembly will be unreliable)
 endif
 
-KBUILD_CFLAGS  += -mgeneral-regs-only  \
+# The GCC option -ffreestanding is required in order to compile code containing
+# ARM/NEON intrinsics in a non C99-compliant environment (such as the kernel)
+CC_FLAGS_FPU   := -ffreestanding
+# Enable 
+CC_FLAGS_FPU   += -isystem $(shell $(CC) -print-file-name=include)
+CC_FLAGS_NO_FPU:= -mgeneral-regs-only
+
+KBUILD_CFLAGS  += $(CC_FLAGS_NO_FPU) \
   $(compat_vdso) $(cc_has_k_constraint)
 KBUILD_CFLAGS  += $(call cc-disable-warning, psabi)
 KBUILD_AFLAGS  += $(compat_vdso)
diff --git a/arch/arm64/include/asm/fpu.h b/arch/arm64/include/asm/fpu.h
new file mode 100644
index ..2ae50bdce59b
--- /dev/null
+++ b/arch/arm64/include/asm/fpu.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023 SiFive
+ */
+
+#ifndef __ASM_FPU_H
+#define __ASM_FPU_H
+
+#include 
+
+#define kernel_fpu_available() cpu_has_neon()
+#define kernel_fpu_begin() kernel_neon_begin()
+#define kernel_fpu_end()   kernel_neon_end()
+
+#endif /* ! __ASM_FPU_H */
-- 
2.44.0



[PATCH v4 03/15] ARM: crypto: Use CC_FLAGS_FPU for NEON CFLAGS

2024-03-29 Thread Samuel Holland
Now that CC_FLAGS_FPU is exported and can be used anywhere in the source
tree, use it instead of duplicating the flags here.

Reviewed-by: Christoph Hellwig 
Signed-off-by: Samuel Holland 
---

(no changes since v1)

 arch/arm/lib/Makefile | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 650404be6768..0ca5aae1bcc3 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -40,8 +40,7 @@ $(obj)/csumpartialcopy.o: $(obj)/csumpartialcopygeneric.S
 $(obj)/csumpartialcopyuser.o:  $(obj)/csumpartialcopygeneric.S
 
 ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
-  NEON_FLAGS   := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
-  CFLAGS_xor-neon.o+= $(NEON_FLAGS)
+  CFLAGS_xor-neon.o+= $(CC_FLAGS_FPU)
   obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
 endif
 
-- 
2.44.0



[PATCH v4 00/15] Unified cross-architecture kernel-mode FPU API

2024-03-29 Thread Samuel Holland
This series unifies the kernel-mode FPU API across several architectures
by wrapping the existing functions (where needed) in consistently-named
functions placed in a consistent header location, with mostly the same
semantics: they can be called from preemptible or non-preemptible task
context, and are not assumed to be reentrant. Architectures are also
expected to provide CFLAGS adjustments for compiling FPU-dependent code.
For the moment, SIMD/vector units are out of scope for this common API.
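
Since the sections are not assumed to be reentrant, a caller that wants to nest
them has to track depth itself. The following is a minimal userspace sketch
under stated assumptions: kernel_fpu_begin() and kernel_fpu_end() are replaced
by counting stubs so the example compiles on its own, and every demo_* name is
hypothetical. Real kernel code would likely need a per-CPU or per-task depth
rather than one global counter.

```c
/* Userspace stand-ins for the kernel API; they only record what was called. */
static int stub_begin_calls;	/* number of kernel_fpu_begin() calls */
static int stub_sections_open;	/* currently open critical sections */

static void kernel_fpu_begin(void)
{
	stub_begin_calls++;
	stub_sections_open++;
}

static void kernel_fpu_end(void)
{
	stub_sections_open--;
}

/* Hypothetical caller-side nesting depth. */
static int demo_fpu_depth;

/* Enter an FP section; only the outermost call touches the real API. */
void demo_fpu_get(void)
{
	if (demo_fpu_depth++ == 0)
		kernel_fpu_begin();
}

/* Leave an FP section; only the outermost call ends the critical section. */
void demo_fpu_put(void)
{
	if (--demo_fpu_depth == 0)
		kernel_fpu_end();
}

/* Accessors so a caller can observe the stub state. */
int demo_begin_calls(void) { return stub_begin_calls; }
int demo_sections_open(void) { return stub_sections_open; }
```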

This allows us to remove the ifdeffery and duplicated Makefile logic at
each FPU user. It then implements the common API on RISC-V, and converts
a couple of users to the new API: the AMDGPU DRM driver, and the FPU
self test.

The underlying goal of this series is to allow using newer AMD GPUs
(e.g. Navi) on RISC-V boards such as SiFive's HiFive Unmatched. Those
GPUs need CONFIG_DRM_AMD_DC_FP to initialize, which requires kernel-mode
FPU support.

Previous versions:
v3: https://lore.kernel.org/linux-kernel/20240327200157.1097089-1-samuel.holl...@sifive.com/
v2: https://lore.kernel.org/linux-kernel/20231228014220.3562640-1-samuel.holl...@sifive.com/
v1: https://lore.kernel.org/linux-kernel/20231208055501.2916202-1-samuel.holl...@sifive.com/
v0: https://lore.kernel.org/linux-kernel/20231122030621.3759313-1-samuel.holl...@sifive.com/

Changes in v4:
 - Add missed CFLAGS changes for recov_neon_inner.c
   (fixes arm build failures)
 - Fix x86 include guard issue (fixes x86 build failures)

Changes in v3:
 - Rebase on v6.9-rc1
 - Limit riscv ARCH_HAS_KERNEL_FPU_SUPPORT to 64BIT

Changes in v2:
 - Add documentation explaining the build-time and runtime APIs
 - Add a linux/fpu.h header for generic isolation enforcement
 - Remove file name from header comment
 - Clean up arch/arm64/lib/Makefile, like for arch/arm
 - Remove RISC-V architecture-specific preprocessor check
 - Split altivec removal to a separate patch
 - Use linux/fpu.h instead of asm/fpu.h in consumers
 - Declare test_fpu() in a header

Michael Ellerman (1):
  drm/amd/display: Only use hard-float, not altivec on powerpc

Samuel Holland (14):
  arch: Add ARCH_HAS_KERNEL_FPU_SUPPORT
  ARM: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
  ARM: crypto: Use CC_FLAGS_FPU for NEON CFLAGS
  arm64: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
  arm64: crypto: Use CC_FLAGS_FPU for NEON CFLAGS
  lib/raid6: Use CC_FLAGS_FPU for NEON CFLAGS
  LoongArch: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
  powerpc: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
  x86/fpu: Fix asm/fpu/types.h include guard
  x86: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
  riscv: Add support for kernel-mode FPU
  drm/amd/display: Use ARCH_HAS_KERNEL_FPU_SUPPORT
  selftests/fpu: Move FP code to a separate translation unit
  selftests/fpu: Allow building on other architectures

 Documentation/core-api/floating-point.rst | 78 +++
 Documentation/core-api/index.rst  |  1 +
 Makefile  |  5 ++
 arch/Kconfig  |  6 ++
 arch/arm/Kconfig  |  1 +
 arch/arm/Makefile |  7 ++
 arch/arm/include/asm/fpu.h| 15 
 arch/arm/lib/Makefile |  3 +-
 arch/arm64/Kconfig|  1 +
 arch/arm64/Makefile   |  9 ++-
 arch/arm64/include/asm/fpu.h  | 15 
 arch/arm64/lib/Makefile   |  6 +-
 arch/loongarch/Kconfig|  1 +
 arch/loongarch/Makefile   |  5 +-
 arch/loongarch/include/asm/fpu.h  |  1 +
 arch/powerpc/Kconfig  |  1 +
 arch/powerpc/Makefile |  5 +-
 arch/powerpc/include/asm/fpu.h| 28 +++
 arch/riscv/Kconfig|  1 +
 arch/riscv/Makefile   |  3 +
 arch/riscv/include/asm/fpu.h  | 16 
 arch/riscv/kernel/Makefile|  1 +
 arch/riscv/kernel/kernel_mode_fpu.c   | 28 +++
 arch/x86/Kconfig  |  1 +
 arch/x86/Makefile | 20 +
 arch/x86/include/asm/fpu.h| 13 
 arch/x86/include/asm/fpu/types.h  |  6 +-
 drivers/gpu/drm/amd/display/Kconfig   |  2 +-
 .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c| 35 +
 drivers/gpu/drm/amd/display/dc/dml/Makefile   | 36 +
 drivers/gpu/drm/amd/display/dc/dml2/Makefile  | 36 +
 include/linux/fpu.h   | 12 +++
 lib/Kconfig.debug |  2 +-
 lib/Makefile  | 26 +--
 lib/raid6/Makefile| 33 +++-
 lib/test_fpu.h|  8 ++
 lib/{test_fpu.c => test_fpu_glue.c}   | 37 ++---
 lib/test_fpu_impl.c   | 37 +
 38 files changed, 348 

[PATCH v4 02/15] ARM: Implement ARCH_HAS_KERNEL_FPU_SUPPORT

2024-03-29 Thread Samuel Holland
ARM provides an equivalent to the common kernel-mode FPU API, but in a
different header and using different function names. Add a wrapper
header, and export CFLAGS adjustments as found in lib/raid6/Makefile.

Reviewed-by: Christoph Hellwig 
Signed-off-by: Samuel Holland 
---

(no changes since v2)

Changes in v2:
 - Remove file name from header comment

 arch/arm/Kconfig   |  1 +
 arch/arm/Makefile  |  7 +++
 arch/arm/include/asm/fpu.h | 15 +++
 3 files changed, 23 insertions(+)
 create mode 100644 arch/arm/include/asm/fpu.h

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index b14aed3a17ab..b1751c2cab87 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -15,6 +15,7 @@ config ARM
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_KEEPINITRD
select ARCH_HAS_KCOV
+   select ARCH_HAS_KERNEL_FPU_SUPPORT if KERNEL_MODE_NEON
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PTE_SPECIAL if ARM_LPAE
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index d82908b1b1bb..71afdd98ddf2 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -130,6 +130,13 @@ endif
 # Accept old syntax despite ".syntax unified"
 AFLAGS_NOWARN  :=$(call as-option,-Wa$(comma)-mno-warn-deprecated,-Wa$(comma)-W)
 
+# The GCC option -ffreestanding is required in order to compile code containing
+# ARM/NEON intrinsics in a non C99-compliant environment (such as the kernel)
+CC_FLAGS_FPU   := -ffreestanding
+# Enable 
+CC_FLAGS_FPU   += -isystem $(shell $(CC) -print-file-name=include)
+CC_FLAGS_FPU   += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
+
 ifeq ($(CONFIG_THUMB2_KERNEL),y)
 CFLAGS_ISA :=-Wa,-mimplicit-it=always $(AFLAGS_NOWARN)
 AFLAGS_ISA :=$(CFLAGS_ISA) -Wa$(comma)-mthumb
diff --git a/arch/arm/include/asm/fpu.h b/arch/arm/include/asm/fpu.h
new file mode 100644
index ..2ae50bdce59b
--- /dev/null
+++ b/arch/arm/include/asm/fpu.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023 SiFive
+ */
+
+#ifndef __ASM_FPU_H
+#define __ASM_FPU_H
+
+#include 
+
+#define kernel_fpu_available() cpu_has_neon()
+#define kernel_fpu_begin() kernel_neon_begin()
+#define kernel_fpu_end()   kernel_neon_end()
+
+#endif /* ! __ASM_FPU_H */
-- 
2.44.0



[PATCH v4 01/15] arch: Add ARCH_HAS_KERNEL_FPU_SUPPORT

2024-03-29 Thread Samuel Holland
Several architectures provide an API to enable the FPU and run
floating-point SIMD code in kernel space. However, the function names,
header locations, and semantics are inconsistent across architectures,
and FPU support may be gated behind other Kconfig options.

Provide a standard way for architectures to declare that kernel space
FPU support is available. Architectures selecting this option must
implement what is currently the most common API (kernel_fpu_begin() and
kernel_fpu_end(), plus a new function kernel_fpu_available()) and
provide the appropriate CFLAGS for compiling floating-point C code.
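
As a rough illustration of how a consumer would use the three functions, here
is a userspace sketch. The kernel API is replaced by stubs so that it compiles
on its own, and all demo_* and stub_* names are hypothetical. In the kernel,
the FP helper would sit in its own source file built with $(CC_FLAGS_FPU), and
the caller would include linux/fpu.h.

```c
#include <errno.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel API. */
static bool stub_fpu_present = true;
static bool stub_fpu_enabled;

static bool kernel_fpu_available(void) { return stub_fpu_present; }
static void kernel_fpu_begin(void) { stub_fpu_enabled = true; }
static void kernel_fpu_end(void) { stub_fpu_enabled = false; }

/* FP code: in the kernel this belongs in a separate translation unit. */
static int demo_scale_fp(int value)
{
	double scaled = value * 1.5;

	return (int)scaled;
}

/* Non-FP caller: checks availability, then brackets the FP call. */
int demo_scale(int value, int *result)
{
	if (!kernel_fpu_available())
		return -ENODEV;

	kernel_fpu_begin();
	*result = demo_scale_fp(value);
	kernel_fpu_end();

	return 0;
}

/* Accessors for the stub state. */
void demo_set_fpu_present(bool present) { stub_fpu_present = present; }
bool demo_fpu_enabled(void) { return stub_fpu_enabled; }
```

The caller keeps the critical section as short as possible and never touches
float or double itself; only demo_scale_fp() does.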

Suggested-by: Christoph Hellwig 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Samuel Holland 
---

(no changes since v2)

Changes in v2:
 - Add documentation explaining the build-time and runtime APIs
 - Add a linux/fpu.h header for generic isolation enforcement

 Documentation/core-api/floating-point.rst | 78 +++
 Documentation/core-api/index.rst  |  1 +
 Makefile  |  5 ++
 arch/Kconfig  |  6 ++
 include/linux/fpu.h   | 12 
 5 files changed, 102 insertions(+)
 create mode 100644 Documentation/core-api/floating-point.rst
 create mode 100644 include/linux/fpu.h

diff --git a/Documentation/core-api/floating-point.rst b/Documentation/core-api/floating-point.rst
new file mode 100644
index ..a8d0d4b05052
--- /dev/null
+++ b/Documentation/core-api/floating-point.rst
@@ -0,0 +1,78 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+Floating-point API
+==================
+
+Kernel code is normally prohibited from using floating-point (FP) registers or
+instructions, including the C float and double data types. This rule reduces
+system call overhead, because the kernel does not need to save and restore the
+userspace floating-point register state.
+
+However, occasionally drivers or library functions may need to include FP code.
+This is supported by isolating the functions containing FP code to a separate
+translation unit (a separate source file), and saving/restoring the FP register
+state around calls to those functions. This creates "critical sections" of
+floating-point usage.
+
+The reason for this isolation is to prevent the compiler from generating code
+touching the FP registers outside these critical sections. Compilers sometimes
+use FP registers to optimize inlined ``memcpy`` or variable assignment, as
+floating-point registers may be wider than general-purpose registers.
+
+Usability of floating-point code within the kernel is architecture-specific.
+Additionally, because a single kernel may be configured to support platforms
+both with and without a floating-point unit, FPU availability must be checked
+both at build time and at run time.
+
+Several architectures implement the generic kernel floating-point API from
+``linux/fpu.h``, as described below. Some other architectures implement their
+own unique APIs, which are documented separately.
+
+Build-time API
+--------------
+
+Floating-point code may be built if the option ``ARCH_HAS_KERNEL_FPU_SUPPORT``
+is enabled. For C code, such code must be placed in a separate file, and that
+file must have its compilation flags adjusted using the following pattern::
+
+CFLAGS_foo.o += $(CC_FLAGS_FPU)
+CFLAGS_REMOVE_foo.o += $(CC_FLAGS_NO_FPU)
+
+Architectures are expected to define one or both of these variables in their
+top-level Makefile as needed. For example::
+
+CC_FLAGS_FPU := -mhard-float
+
+or::
+
+CC_FLAGS_NO_FPU := -msoft-float
+
+Normal kernel code is assumed to use the equivalent of ``CC_FLAGS_NO_FPU``.
+
+Runtime API
+---
+
+The runtime API is provided in ``linux/fpu.h``. This header cannot be included
+from files implementing FP code (those with their compilation flags adjusted as
+above). Instead, it must be included when defining the FP critical sections.
+
+.. c:function:: bool kernel_fpu_available( void )
+
+This function reports if floating-point code can be used on this CPU or
+platform. The value returned by this function is not expected to change
+at runtime, so it only needs to be called once, not before every
+critical section.
+
+.. c:function:: void kernel_fpu_begin( void )
+void kernel_fpu_end( void )
+
+These functions create a floating-point critical section. It is only
+valid to call ``kernel_fpu_begin()`` after a previous call to
+``kernel_fpu_available()`` returned ``true``. These functions are only
+guaranteed to be callable from (preemptible or non-preemptible) process
+context.
+
+Preemption may be disabled inside critical sections, so their size
+should be minimized. They are *not* required to be reentrant. If the
+caller expects to nest critical sections, it must implement its own
+reference counting.
diff --git a/Documentation/core-api/index.rst