Re: [PATCH v9] pgo: add clang's Profile Guided Optimization infrastructure

2021-04-07 Thread Bill Wendling
On Wed, Apr 7, 2021 at 2:47 PM Nathan Chancellor  wrote:
>
> Hi Bill,
>
> On Wed, Apr 07, 2021 at 02:17:04PM -0700, Bill Wendling wrote:
> > From: Sami Tolvanen 
> >
> > Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> > profile, the kernel is instrumented with PGO counters, a representative
> > workload is run, and the raw profile data is collected from
> > /sys/kernel/debug/pgo/profraw.
> >
> > The raw profile data must be processed by clang's "llvm-profdata" tool
> > before it can be used during recompilation:
> >
> >   $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> >   $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> >
> > Multiple raw profiles may be merged during this step.
> >
> > The data can now be used by the compiler:
> >
> >   $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
> >
> > This initial submission is restricted to x86, as that's the platform we
> > know works. This restriction can be lifted once other platforms have
> > been verified to work with PGO.
> >
> > Note that this method of profiling the kernel is clang-native, unlike
> > the clang support in kernel/gcov.
> >
> > [1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
> >
> > Signed-off-by: Sami Tolvanen 
> > Co-developed-by: Bill Wendling 
> > Signed-off-by: Bill Wendling 
> > Tested-by: Nick Desaulniers 
> > Reviewed-by: Nick Desaulniers 
> > Reviewed-by: Fangrui Song 
>
> Few small nits below, not sure they warrant a v10 versus just some
> follow up patches, up to you. Regardless:
>
> Reviewed-by: Nathan Chancellor 
>
> > ---
> > > v9: - [maskray] Remove explicit 'ALIGN' and 'KEEP' from PGO variables in
> > >       vmlinux.lds.h.
> > > v8: - Rebased on top-of-tree.
> > > v7: - [sedat.dilek] Fix minor build failure.
> > > v6: - Add better documentation about the locking scheme and other things.
> > >     - Rename macros to better match the same macros in LLVM's source code.
> > > v5: - [natechancellor] Correct padding calculation.
> > > v4: - [ndesaulniers] Remove non-x86 Makefile changes and use "hweight64"
> > >       instead of using our own popcount implementation.
> > > v3: - [sedat.dilek] Added change log section.
> > > v2: - [natechancellor] Added "__llvm_profile_instrument_memop".
> > >     - [maskray] Corrected documentation, re PGO flags when using LTO.
> > ---
> >  Documentation/dev-tools/index.rst |   1 +
> >  Documentation/dev-tools/pgo.rst   | 127 +
> >  MAINTAINERS   |   9 +
> >  Makefile  |   3 +
> >  arch/Kconfig  |   1 +
> >  arch/x86/Kconfig  |   1 +
> >  arch/x86/boot/Makefile|   1 +
> >  arch/x86/boot/compressed/Makefile |   1 +
> >  arch/x86/crypto/Makefile  |   4 +
> >  arch/x86/entry/vdso/Makefile  |   1 +
> >  arch/x86/kernel/vmlinux.lds.S |   2 +
> >  arch/x86/platform/efi/Makefile|   1 +
> >  arch/x86/purgatory/Makefile   |   1 +
> >  arch/x86/realmode/rm/Makefile |   1 +
> >  arch/x86/um/vdso/Makefile |   1 +
> >  drivers/firmware/efi/libstub/Makefile |   1 +
> >  include/asm-generic/vmlinux.lds.h |  34 +++
> >  kernel/Makefile   |   1 +
> >  kernel/pgo/Kconfig|  35 +++
> >  kernel/pgo/Makefile   |   5 +
> >  kernel/pgo/fs.c   | 389 ++
> >  kernel/pgo/instrument.c   | 189 +
> >  kernel/pgo/pgo.h  | 203 ++
> >  scripts/Makefile.lib  |  10 +
> >  24 files changed, 1022 insertions(+)
> >  create mode 100644 Documentation/dev-tools/pgo.rst
> >  create mode 100644 kernel/pgo/Kconfig
> >  create mode 100644 kernel/pgo/Makefile
> >  create mode 100644 kernel/pgo/fs.c
> >  create mode 100644 kernel/pgo/instrument.c
> >  create mode 100644 kernel/pgo/pgo.h
> >
> > > diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
> > > index 1b1cf4f5c9d9..6a30cd98e6f9 100644
> > > --- a/Documentation/dev-tools/index.rst
> > > +++ b/Documentation/dev-tools/index.rst
> > > @@ -27,6 +27,7 @@ whole; patches welcome!
> > >     kgdb
> > >     kselftest
> > >     kunit/index
> > > +   pgo
> >
> >
> >  .. only::  subproject and html

[PATCH v9] pgo: add clang's Profile Guided Optimization infrastructure

2021-04-07 Thread Bill Wendling
From: Sami Tolvanen 

Enable the use of clang's Profile-Guided Optimization[1]. To generate a
profile, the kernel is instrumented with PGO counters, a representative
workload is run, and the raw profile data is collected from
/sys/kernel/debug/pgo/profraw.

The raw profile data must be processed by clang's "llvm-profdata" tool
before it can be used during recompilation:

  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw

Multiple raw profiles may be merged during this step.

The data can now be used by the compiler:

  $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
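
As a rough sketch only, the collect, merge and rebuild steps above could be
wrapped in a small POSIX shell helper. The function names "run" and
"pgo_cycle" and the DRY_RUN and PROFRAW_SRC variables are assumptions made
for illustration and are not part of this patch; only the three commands
themselves come from the text above.

```shell
#!/bin/sh
# Illustrative helper, not part of the patch: names below are assumptions.

# Where the instrumented kernel exposes the raw profile (from the text above).
PROFRAW_SRC="${PROFRAW_SRC:-/sys/kernel/debug/pgo/profraw}"
# Default to printing commands only; set DRY_RUN=0 to really execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Echo the command, then execute it unless we are in dry-run mode.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

pgo_cycle() {
    # 1. Copy the raw profile out of debugfs.
    # 2. Convert it into the indexed format the compiler reads.
    # 3. Rebuild with the profile; extra arguments are passed on to make.
    run cp "$PROFRAW_SRC" vmlinux.profraw &&
    run llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw &&
    run make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata "$@"
}
```

With the default DRY_RUN=1, calling "pgo_cycle" only prints the three
commands. Running it with DRY_RUN=0 is meant for a machine that has booted
the instrumented kernel and already run the workload.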

This initial submission is restricted to x86, as that's the platform we
know works. This restriction can be lifted once other platforms have
been verified to work with PGO.

Note that this method of profiling the kernel is clang-native, unlike
the clang support in kernel/gcov.

[1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization

Signed-off-by: Sami Tolvanen 
Co-developed-by: Bill Wendling 
Signed-off-by: Bill Wendling 
Tested-by: Nick Desaulniers 
Reviewed-by: Nick Desaulniers 
Reviewed-by: Fangrui Song 
---
v9: - [maskray] Remove explicit 'ALIGN' and 'KEEP' from PGO variables in
      vmlinux.lds.h.
v8: - Rebased on top-of-tree.
v7: - [sedat.dilek] Fix minor build failure.
v6: - Add better documentation about the locking scheme and other things.
    - Rename macros to better match the same macros in LLVM's source code.
v5: - [natechancellor] Correct padding calculation.
v4: - [ndesaulniers] Remove non-x86 Makefile changes and use "hweight64" instead
      of using our own popcount implementation.
v3: - [sedat.dilek] Added change log section.
v2: - [natechancellor] Added "__llvm_profile_instrument_memop".
    - [maskray] Corrected documentation, re PGO flags when using LTO.
---
 Documentation/dev-tools/index.rst |   1 +
 Documentation/dev-tools/pgo.rst   | 127 +
 MAINTAINERS   |   9 +
 Makefile  |   3 +
 arch/Kconfig  |   1 +
 arch/x86/Kconfig  |   1 +
 arch/x86/boot/Makefile|   1 +
 arch/x86/boot/compressed/Makefile |   1 +
 arch/x86/crypto/Makefile  |   4 +
 arch/x86/entry/vdso/Makefile  |   1 +
 arch/x86/kernel/vmlinux.lds.S |   2 +
 arch/x86/platform/efi/Makefile|   1 +
 arch/x86/purgatory/Makefile   |   1 +
 arch/x86/realmode/rm/Makefile |   1 +
 arch/x86/um/vdso/Makefile |   1 +
 drivers/firmware/efi/libstub/Makefile |   1 +
 include/asm-generic/vmlinux.lds.h |  34 +++
 kernel/Makefile   |   1 +
 kernel/pgo/Kconfig|  35 +++
 kernel/pgo/Makefile   |   5 +
 kernel/pgo/fs.c   | 389 ++
 kernel/pgo/instrument.c   | 189 +
 kernel/pgo/pgo.h  | 203 ++
 scripts/Makefile.lib  |  10 +
 24 files changed, 1022 insertions(+)
 create mode 100644 Documentation/dev-tools/pgo.rst
 create mode 100644 kernel/pgo/Kconfig
 create mode 100644 kernel/pgo/Makefile
 create mode 100644 kernel/pgo/fs.c
 create mode 100644 kernel/pgo/instrument.c
 create mode 100644 kernel/pgo/pgo.h

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 1b1cf4f5c9d9..6a30cd98e6f9 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -27,6 +27,7 @@ whole; patches welcome!
    kgdb
    kselftest
    kunit/index
+   pgo
 
 
 .. only::  subproject and html
diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst
new file mode 100644
index 000000000000..b7f11d8405b7
--- /dev/null
+++ b/Documentation/dev-tools/pgo.rst
@@ -0,0 +1,127 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================
+Using PGO with the Linux kernel
+===============================
+
+Clang's profiling kernel support (PGO_) enables profiling of the Linux kernel
+when building with Clang. The profiling data is exported via the ``pgo``
+debugfs directory.
+
+.. _PGO: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
+
+
+Preparation
+===========
+
+Configure the kernel with:
+
+.. code-block:: make
+
+   CONFIG_DEBUG_FS=y
+   CONFIG_PGO_CLANG=y
+
+Note that kernels compiled with profiling flags will be significantly larger
+and run slower.
+
+Profiling data will only become accessible once debugfs has been mounted:
+
+.. code-block:: sh
+
+   mount -t debugfs none /sys/kernel/debug
+
+
+Customization
+=============
+
+You can enable or disable profiling for individual files and directories by
+adding a line similar to the following to the respective kernel Makefile:
+
+- For a single file (e.g. main.o)
+
+  .. code-block:: make
+
+ PGO_PROFIL

Re: [PATCH v8] pgo: add clang's Profile Guided Optimization infrastructure

2021-02-26 Thread Bill Wendling
On Fri, Feb 26, 2021 at 2:20 PM Bill Wendling  wrote:
>
> From: Sami Tolvanen 
>
> Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> profile, the kernel is instrumented with PGO counters, a representative
> workload is run, and the raw profile data is collected from
> /sys/kernel/debug/pgo/profraw.
>
> The raw profile data must be processed by clang's "llvm-profdata" tool
> before it can be used during recompilation:
>
>   $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
>   $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
>
> Multiple raw profiles may be merged during this step.
>
> The data can now be used by the compiler:
>
>   $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
>
> This initial submission is restricted to x86, as that's the platform we
> know works. This restriction can be lifted once other platforms have
> been verified to work with PGO.
>
> Note that this method of profiling the kernel is clang-native, unlike
> the clang support in kernel/gcov.
>
> [1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
>
> Signed-off-by: Sami Tolvanen 
> Co-developed-by: Bill Wendling 
> Signed-off-by: Bill Wendling 

I forgot to add these tags:

Tested-by: Nick Desaulniers 
Reviewed-by: Nick Desaulniers 

> ---
> v8: - Rebased on top-of-tree.
> v7: - Fix minor build failure reported by Sedat.
> v6: - Add better documentation about the locking scheme and other things.
>     - Rename macros to better match the same macros in LLVM's source code.
> v5: - Correct padding calculation, discovered by Nathan Chancellor.
> v4: - Remove non-x86 Makefile changes and use "hweight64" instead of using our
>       own popcount implementation, based on Nick Desaulniers's comment.
> v3: - Added change log section based on Sedat Dilek's comments.
> v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's
>       testing.
>     - Corrected documentation, re PGO flags when using LTO, based on Fangrui
>       Song's comments.
> ---
>  Documentation/dev-tools/index.rst |   1 +
>  Documentation/dev-tools/pgo.rst   | 127 +
>  MAINTAINERS   |   9 +
>  Makefile  |   3 +
>  arch/Kconfig  |   1 +
>  arch/x86/Kconfig  |   1 +
>  arch/x86/boot/Makefile|   1 +
>  arch/x86/boot/compressed/Makefile |   1 +
>  arch/x86/crypto/Makefile  |   4 +
>  arch/x86/entry/vdso/Makefile  |   1 +
>  arch/x86/kernel/vmlinux.lds.S |   2 +
>  arch/x86/platform/efi/Makefile|   1 +
>  arch/x86/purgatory/Makefile   |   1 +
>  arch/x86/realmode/rm/Makefile |   1 +
>  arch/x86/um/vdso/Makefile |   1 +
>  drivers/firmware/efi/libstub/Makefile |   1 +
>  include/asm-generic/vmlinux.lds.h |  44 +++
>  kernel/Makefile   |   1 +
>  kernel/pgo/Kconfig|  35 +++
>  kernel/pgo/Makefile   |   5 +
>  kernel/pgo/fs.c   | 389 ++
>  kernel/pgo/instrument.c   | 189 +
>  kernel/pgo/pgo.h  | 203 ++
>  scripts/Makefile.lib  |  10 +
>  24 files changed, 1032 insertions(+)
>  create mode 100644 Documentation/dev-tools/pgo.rst
>  create mode 100644 kernel/pgo/Kconfig
>  create mode 100644 kernel/pgo/Makefile
>  create mode 100644 kernel/pgo/fs.c
>  create mode 100644 kernel/pgo/instrument.c
>  create mode 100644 kernel/pgo/pgo.h
>
> diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
> index f7809c7b1ba9..8d6418e85806 100644
> --- a/Documentation/dev-tools/index.rst
> +++ b/Documentation/dev-tools/index.rst
> @@ -26,6 +26,7 @@ whole; patches welcome!
>     kgdb
>     kselftest
>     kunit/index
> +   pgo
>
>
>  .. only::  subproject and html
> diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst
> new file mode 100644
> index 000000000000..b7f11d8405b7
> --- /dev/null
> +++ b/Documentation/dev-tools/pgo.rst
> @@ -0,0 +1,127 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===============================
> +Using PGO with the Linux kernel
> +===============================
> +
> +Clang's profiling kernel support (PGO_) enables profiling of the Linux kernel
> +when building with Clang. The profiling data is exported via the ``pgo``
> +debugfs directory.
> +
> +.. _PGO: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
> +
>

[PATCH v8] pgo: add clang's Profile Guided Optimization infrastructure

2021-02-26 Thread Bill Wendling
From: Sami Tolvanen 

Enable the use of clang's Profile-Guided Optimization[1]. To generate a
profile, the kernel is instrumented with PGO counters, a representative
workload is run, and the raw profile data is collected from
/sys/kernel/debug/pgo/profraw.

The raw profile data must be processed by clang's "llvm-profdata" tool
before it can be used during recompilation:

  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw

Multiple raw profiles may be merged during this step.
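
To illustrate the multi-profile case, here is a minimal sketch that builds
a single "llvm-profdata merge" command line from any number of raw
profiles. The helper name "merge_cmd" and the example file names are
hypothetical; the sketch relies only on "llvm-profdata merge" accepting
more than one input file.

```shell
#!/bin/sh
# Illustrative helper, not part of the patch: merge_cmd is an assumed name.

merge_cmd() {
    # First argument: output .profdata file; remaining: raw profiles.
    out="$1"
    shift
    if [ "$#" -eq 0 ]; then
        echo "merge_cmd: no raw profiles given" >&2
        return 1
    fi
    # Print the command instead of running it, so it can be reviewed first.
    echo llvm-profdata merge --output="$out" "$@"
}
```

For example, "merge_cmd vmlinux.profdata boot.profraw net.profraw" prints
one merge command covering both workloads; piping the result to "sh" would
execute it.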

The data can now be used by the compiler:

  $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...

This initial submission is restricted to x86, as that's the platform we
know works. This restriction can be lifted once other platforms have
been verified to work with PGO.

Note that this method of profiling the kernel is clang-native, unlike
the clang support in kernel/gcov.

[1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization

Signed-off-by: Sami Tolvanen 
Co-developed-by: Bill Wendling 
Signed-off-by: Bill Wendling 
---
v8: - Rebased on top-of-tree.
v7: - Fix minor build failure reported by Sedat.
v6: - Add better documentation about the locking scheme and other things.
    - Rename macros to better match the same macros in LLVM's source code.
v5: - Correct padding calculation, discovered by Nathan Chancellor.
v4: - Remove non-x86 Makefile changes and use "hweight64" instead of using our
      own popcount implementation, based on Nick Desaulniers's comment.
v3: - Added change log section based on Sedat Dilek's comments.
v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's
      testing.
    - Corrected documentation, re PGO flags when using LTO, based on Fangrui
      Song's comments.
---
 Documentation/dev-tools/index.rst |   1 +
 Documentation/dev-tools/pgo.rst   | 127 +
 MAINTAINERS   |   9 +
 Makefile  |   3 +
 arch/Kconfig  |   1 +
 arch/x86/Kconfig  |   1 +
 arch/x86/boot/Makefile|   1 +
 arch/x86/boot/compressed/Makefile |   1 +
 arch/x86/crypto/Makefile  |   4 +
 arch/x86/entry/vdso/Makefile  |   1 +
 arch/x86/kernel/vmlinux.lds.S |   2 +
 arch/x86/platform/efi/Makefile|   1 +
 arch/x86/purgatory/Makefile   |   1 +
 arch/x86/realmode/rm/Makefile |   1 +
 arch/x86/um/vdso/Makefile |   1 +
 drivers/firmware/efi/libstub/Makefile |   1 +
 include/asm-generic/vmlinux.lds.h |  44 +++
 kernel/Makefile   |   1 +
 kernel/pgo/Kconfig|  35 +++
 kernel/pgo/Makefile   |   5 +
 kernel/pgo/fs.c   | 389 ++
 kernel/pgo/instrument.c   | 189 +
 kernel/pgo/pgo.h  | 203 ++
 scripts/Makefile.lib  |  10 +
 24 files changed, 1032 insertions(+)
 create mode 100644 Documentation/dev-tools/pgo.rst
 create mode 100644 kernel/pgo/Kconfig
 create mode 100644 kernel/pgo/Makefile
 create mode 100644 kernel/pgo/fs.c
 create mode 100644 kernel/pgo/instrument.c
 create mode 100644 kernel/pgo/pgo.h

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index f7809c7b1ba9..8d6418e85806 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -26,6 +26,7 @@ whole; patches welcome!
    kgdb
    kselftest
    kunit/index
+   pgo
 
 
 .. only::  subproject and html
diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst
new file mode 100644
index 000000000000..b7f11d8405b7
--- /dev/null
+++ b/Documentation/dev-tools/pgo.rst
@@ -0,0 +1,127 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================
+Using PGO with the Linux kernel
+===============================
+
+Clang's profiling kernel support (PGO_) enables profiling of the Linux kernel
+when building with Clang. The profiling data is exported via the ``pgo``
+debugfs directory.
+
+.. _PGO: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
+
+
+Preparation
+===========
+
+Configure the kernel with:
+
+.. code-block:: make
+
+   CONFIG_DEBUG_FS=y
+   CONFIG_PGO_CLANG=y
+
+Note that kernels compiled with profiling flags will be significantly larger
+and run slower.
+
+Profiling data will only become accessible once debugfs has been mounted:
+
+.. code-block:: sh
+
+   mount -t debugfs none /sys/kernel/debug
+
+
+Customization
+=============
+
+You can enable or disable profiling for individual files and directories by
+adding a line similar to the following to the respective kernel Makefile:
+
+- For a single file (e.g. main.o)
+
+  .. code-block:: make
+
+ PGO_PROFILE_main.o := y
+
+- For all files in one directory
+
+  .. code-bl

Re: [PATCH v7] pgo: add clang's Profile Guided Optimization infrastructure

2021-02-22 Thread Bill Wendling
Another bump for review. :-)


On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling  wrote:
>
> Bumping for review from Masahiro Yamada and Andrew Morton.
>
> -bw
>
> On Fri, Jan 22, 2021 at 2:12 AM Bill Wendling  wrote:
> >
> > From: Sami Tolvanen 
> >
> > Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> > profile, the kernel is instrumented with PGO counters, a representative
> > workload is run, and the raw profile data is collected from
> > /sys/kernel/debug/pgo/profraw.
> >
> > The raw profile data must be processed by clang's "llvm-profdata" tool
> > before it can be used during recompilation:
> >
> >   $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> >   $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> >
> > Multiple raw profiles may be merged during this step.
> >
> > The data can now be used by the compiler:
> >
> >   $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
> >
> > This initial submission is restricted to x86, as that's the platform we
> > know works. This restriction can be lifted once other platforms have
> > been verified to work with PGO.
> >
> > Note that this method of profiling the kernel is clang-native, unlike
> > the clang support in kernel/gcov.
> >
> > [1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
> >
> > Signed-off-by: Sami Tolvanen 
> > Co-developed-by: Bill Wendling 
> > Signed-off-by: Bill Wendling 
> > Tested-by: Nick Desaulniers 
> > ---
> > > v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's
> > >       testing.
> > >     - Corrected documentation, re PGO flags when using LTO, based on Fangrui
> > >       Song's comments.
> > > v3: - Added change log section based on Sedat Dilek's comments.
> > > v4: - Remove non-x86 Makefile changes and use "hweight64" instead of using our
> > >       own popcount implementation, based on Nick Desaulniers's comment.
> > > v5: - Correct padding calculation, discovered by Nathan Chancellor.
> > > v6: - Add better documentation about the locking scheme and other things.
> > >     - Rename macros to better match the same macros in LLVM's source code.
> > v7: - Fix minor build failure reported by Sedat.
> > ---
> >  Documentation/dev-tools/index.rst |   1 +
> >  Documentation/dev-tools/pgo.rst   | 127 +
> >  MAINTAINERS   |   9 +
> >  Makefile  |   3 +
> >  arch/Kconfig  |   1 +
> >  arch/x86/Kconfig  |   1 +
> >  arch/x86/boot/Makefile|   1 +
> >  arch/x86/boot/compressed/Makefile |   1 +
> >  arch/x86/crypto/Makefile  |   4 +
> >  arch/x86/entry/vdso/Makefile  |   1 +
> >  arch/x86/kernel/vmlinux.lds.S |   2 +
> >  arch/x86/platform/efi/Makefile|   1 +
> >  arch/x86/purgatory/Makefile   |   1 +
> >  arch/x86/realmode/rm/Makefile |   1 +
> >  arch/x86/um/vdso/Makefile |   1 +
> >  drivers/firmware/efi/libstub/Makefile |   1 +
> >  include/asm-generic/vmlinux.lds.h |  44 +++
> >  kernel/Makefile   |   1 +
> >  kernel/pgo/Kconfig|  35 +++
> >  kernel/pgo/Makefile   |   5 +
> >  kernel/pgo/fs.c   | 389 ++
> >  kernel/pgo/instrument.c   | 189 +
> >  kernel/pgo/pgo.h  | 203 ++
> >  scripts/Makefile.lib  |  10 +
> >  24 files changed, 1032 insertions(+)
> >  create mode 100644 Documentation/dev-tools/pgo.rst
> >  create mode 100644 kernel/pgo/Kconfig
> >  create mode 100644 kernel/pgo/Makefile
> >  create mode 100644 kernel/pgo/fs.c
> >  create mode 100644 kernel/pgo/instrument.c
> >  create mode 100644 kernel/pgo/pgo.h
> >
> > > diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
> > > index f7809c7b1ba9..8d6418e85806 100644
> > > --- a/Documentation/dev-tools/index.rst
> > > +++ b/Documentation/dev-tools/index.rst
> > > @@ -26,6 +26,7 @@ whole; patches welcome!
> > >     kgdb
> > >     kselftest
> > >     kunit/index
> > > +   pgo
> > >
> > >
> > >  .. only::  subproject and html
> > > diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst
> > new file mode 100644
> > index ..b7f1

Re: [PATCH v7] pgo: add clang's Profile Guided Optimization infrastructure

2021-02-10 Thread Bill Wendling
Bumping for review from Masahiro Yamada and Andrew Morton.

-bw

On Fri, Jan 22, 2021 at 2:12 AM Bill Wendling  wrote:
>
> From: Sami Tolvanen 
>
> Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> profile, the kernel is instrumented with PGO counters, a representative
> workload is run, and the raw profile data is collected from
> /sys/kernel/debug/pgo/profraw.
>
> The raw profile data must be processed by clang's "llvm-profdata" tool
> before it can be used during recompilation:
>
>   $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
>   $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
>
> Multiple raw profiles may be merged during this step.
>
> The data can now be used by the compiler:
>
>   $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
>
> This initial submission is restricted to x86, as that's the platform we
> know works. This restriction can be lifted once other platforms have
> been verified to work with PGO.
>
> Note that this method of profiling the kernel is clang-native, unlike
> the clang support in kernel/gcov.
>
> [1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
>
> Signed-off-by: Sami Tolvanen 
> Co-developed-by: Bill Wendling 
> Signed-off-by: Bill Wendling 
> Tested-by: Nick Desaulniers 
> ---
> v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's
>       testing.
>     - Corrected documentation, re PGO flags when using LTO, based on Fangrui
>       Song's comments.
> v3: - Added change log section based on Sedat Dilek's comments.
> v4: - Remove non-x86 Makefile changes and use "hweight64" instead of using our
>       own popcount implementation, based on Nick Desaulniers's comment.
> v5: - Correct padding calculation, discovered by Nathan Chancellor.
> v6: - Add better documentation about the locking scheme and other things.
>     - Rename macros to better match the same macros in LLVM's source code.
> v7: - Fix minor build failure reported by Sedat.
> ---
>  Documentation/dev-tools/index.rst |   1 +
>  Documentation/dev-tools/pgo.rst   | 127 +
>  MAINTAINERS   |   9 +
>  Makefile  |   3 +
>  arch/Kconfig  |   1 +
>  arch/x86/Kconfig  |   1 +
>  arch/x86/boot/Makefile|   1 +
>  arch/x86/boot/compressed/Makefile |   1 +
>  arch/x86/crypto/Makefile  |   4 +
>  arch/x86/entry/vdso/Makefile  |   1 +
>  arch/x86/kernel/vmlinux.lds.S |   2 +
>  arch/x86/platform/efi/Makefile|   1 +
>  arch/x86/purgatory/Makefile   |   1 +
>  arch/x86/realmode/rm/Makefile |   1 +
>  arch/x86/um/vdso/Makefile |   1 +
>  drivers/firmware/efi/libstub/Makefile |   1 +
>  include/asm-generic/vmlinux.lds.h |  44 +++
>  kernel/Makefile   |   1 +
>  kernel/pgo/Kconfig|  35 +++
>  kernel/pgo/Makefile   |   5 +
>  kernel/pgo/fs.c   | 389 ++
>  kernel/pgo/instrument.c   | 189 +
>  kernel/pgo/pgo.h  | 203 ++
>  scripts/Makefile.lib  |  10 +
>  24 files changed, 1032 insertions(+)
>  create mode 100644 Documentation/dev-tools/pgo.rst
>  create mode 100644 kernel/pgo/Kconfig
>  create mode 100644 kernel/pgo/Makefile
>  create mode 100644 kernel/pgo/fs.c
>  create mode 100644 kernel/pgo/instrument.c
>  create mode 100644 kernel/pgo/pgo.h
>
> diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
> index f7809c7b1ba9..8d6418e85806 100644
> --- a/Documentation/dev-tools/index.rst
> +++ b/Documentation/dev-tools/index.rst
> @@ -26,6 +26,7 @@ whole; patches welcome!
>     kgdb
>     kselftest
>     kunit/index
> +   pgo
>
>
>  .. only::  subproject and html
> diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst
> new file mode 100644
> index 000000000000..b7f11d8405b7
> --- /dev/null
> +++ b/Documentation/dev-tools/pgo.rst
> @@ -0,0 +1,127 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===============================
> +Using PGO with the Linux kernel
> +===============================
> +
> +Clang's profiling kernel support (PGO_) enables profiling of the Linux kernel
> +when building with Clang. The profiling data is exported via the ``pgo``
> +debugfs directory.
> +
> +.. _PGO: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
> +
> +
> +Preparation
>

[PATCH v7] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-22 Thread Bill Wendling
From: Sami Tolvanen 

Enable the use of clang's Profile-Guided Optimization[1]. To generate a
profile, the kernel is instrumented with PGO counters, a representative
workload is run, and the raw profile data is collected from
/sys/kernel/debug/pgo/profraw.
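
As a hedged sketch of the collection step, the helper below copies the raw
profile out of debugfs and fails early if the file is missing or the copy
comes out empty. The function name "collect_profraw" and its optional
arguments are assumptions for illustration; only the default source path
comes from the description above.

```shell
#!/bin/sh
# Illustrative helper, not part of the patch: collect_profraw is an assumed name.

collect_profraw() {
    # Optional arguments: source path and destination file.
    src="${1:-/sys/kernel/debug/pgo/profraw}"
    dst="${2:-vmlinux.profraw}"

    # The file only exists on an instrumented kernel with debugfs mounted.
    if [ ! -r "$src" ]; then
        echo "collect_profraw: $src not readable (is debugfs mounted?)" >&2
        return 1
    fi

    cp "$src" "$dst" || return 1

    # Check the copy rather than the source: virtual files may report size 0.
    if [ ! -s "$dst" ]; then
        echo "collect_profraw: $dst is empty" >&2
        return 1
    fi
}
```

Calling "collect_profraw" with no arguments uses the debugfs path and
writes vmlinux.profraw into the current directory; a non-zero exit status
means there is nothing usable to feed to the next step.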

The raw profile data must be processed by clang's "llvm-profdata" tool
before it can be used during recompilation:

  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw

Multiple raw profiles may be merged during this step.

The data can now be used by the compiler:

  $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...

This initial submission is restricted to x86, as that's the platform we
know works. This restriction can be lifted once other platforms have
been verified to work with PGO.

Note that this method of profiling the kernel is clang-native, unlike
the clang support in kernel/gcov.

[1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization

Signed-off-by: Sami Tolvanen 
Co-developed-by: Bill Wendling 
Signed-off-by: Bill Wendling 
Tested-by: Nick Desaulniers 
---
v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's
      testing.
    - Corrected documentation, re PGO flags when using LTO, based on Fangrui
      Song's comments.
v3: - Added change log section based on Sedat Dilek's comments.
v4: - Remove non-x86 Makefile changes and use "hweight64" instead of using our
      own popcount implementation, based on Nick Desaulniers's comment.
v5: - Correct padding calculation, discovered by Nathan Chancellor.
v6: - Add better documentation about the locking scheme and other things.
    - Rename macros to better match the same macros in LLVM's source code.
v7: - Fix minor build failure reported by Sedat.
---
 Documentation/dev-tools/index.rst |   1 +
 Documentation/dev-tools/pgo.rst   | 127 +
 MAINTAINERS   |   9 +
 Makefile  |   3 +
 arch/Kconfig  |   1 +
 arch/x86/Kconfig  |   1 +
 arch/x86/boot/Makefile|   1 +
 arch/x86/boot/compressed/Makefile |   1 +
 arch/x86/crypto/Makefile  |   4 +
 arch/x86/entry/vdso/Makefile  |   1 +
 arch/x86/kernel/vmlinux.lds.S |   2 +
 arch/x86/platform/efi/Makefile|   1 +
 arch/x86/purgatory/Makefile   |   1 +
 arch/x86/realmode/rm/Makefile |   1 +
 arch/x86/um/vdso/Makefile |   1 +
 drivers/firmware/efi/libstub/Makefile |   1 +
 include/asm-generic/vmlinux.lds.h |  44 +++
 kernel/Makefile   |   1 +
 kernel/pgo/Kconfig|  35 +++
 kernel/pgo/Makefile   |   5 +
 kernel/pgo/fs.c   | 389 ++
 kernel/pgo/instrument.c   | 189 +
 kernel/pgo/pgo.h  | 203 ++
 scripts/Makefile.lib  |  10 +
 24 files changed, 1032 insertions(+)
 create mode 100644 Documentation/dev-tools/pgo.rst
 create mode 100644 kernel/pgo/Kconfig
 create mode 100644 kernel/pgo/Makefile
 create mode 100644 kernel/pgo/fs.c
 create mode 100644 kernel/pgo/instrument.c
 create mode 100644 kernel/pgo/pgo.h

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index f7809c7b1ba9..8d6418e85806 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -26,6 +26,7 @@ whole; patches welcome!
    kgdb
    kselftest
    kunit/index
+   pgo
 
 
 .. only::  subproject and html
diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst
new file mode 100644
index 000000000000..b7f11d8405b7
--- /dev/null
+++ b/Documentation/dev-tools/pgo.rst
@@ -0,0 +1,127 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================
+Using PGO with the Linux kernel
+===============================
+
+Clang's profiling kernel support (PGO_) enables profiling of the Linux kernel
+when building with Clang. The profiling data is exported via the ``pgo``
+debugfs directory.
+
+.. _PGO: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
+
+
+Preparation
+===========
+
+Configure the kernel with:
+
+.. code-block:: make
+
+   CONFIG_DEBUG_FS=y
+   CONFIG_PGO_CLANG=y
+
+Note that kernels compiled with profiling flags will be significantly larger
+and run slower.
+
+Profiling data will only become accessible once debugfs has been mounted:
+
+.. code-block:: sh
+
+   mount -t debugfs none /sys/kernel/debug
+
+
+Customization
+=============
+
+You can enable or disable profiling for individual files and directories by
+adding a line similar to the following to the respective kernel Makefile:
+
+- For a single file (e.g. main.o)
+
+  .. code-block:: make
+
+ PGO_PROFILE_main.o := y
+
+- For all files in one directory
+
+  .. code-bl

Re: [PATCH v5] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-21 Thread Bill Wendling
On Wed, Jan 20, 2021 at 4:51 PM Nick Desaulniers
 wrote:
>
> Thanks Bill, mostly questions below.  Patch looks good to me modulo
> disabling profiling for one crypto TU, mixing style of pre/post
> increment, and some comments around locking.  With those addressed,
> I'm hoping akpm@ would consider picking this up.
>
> On Sat, Jan 16, 2021 at 1:44 AM Bill Wendling  wrote:
> >
> > From: Sami Tolvanen 
> >
> > Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> > profile, the kernel is instrumented with PGO counters, a representative
> > workload is run, and the raw profile data is collected from
> > /sys/kernel/debug/pgo/profraw.
> >
> > The raw profile data must be processed by clang's "llvm-profdata" tool
> > before it can be used during recompilation:
> >
> >   $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> >   $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> >
> > Multiple raw profiles may be merged during this step.
> >
> > The data can now be used by the compiler:
> >
> >   $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
> >
> > This initial submission is restricted to x86, as that's the platform we
> > know works. This restriction can be lifted once other platforms have
> > been verified to work with PGO.
> >
> > Note that this method of profiling the kernel is clang-native, unlike
> > the clang support in kernel/gcov.
> >
> > [1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
> >
> > Signed-off-by: Sami Tolvanen 
> > Co-developed-by: Bill Wendling 
> > Signed-off-by: Bill Wendling 
> > ---
> > v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's
> >   testing.
> > - Corrected documentation, re PGO flags when using LTO, based on Fangrui
> >   Song's comments.
> > v3: - Added change log section based on Sedat Dilek's comments.
> > > v4: - Remove non-x86 Makefile changes and use "hweight64" instead of using our
> >   own popcount implementation, based on Nick Desaulniers's comment.
> > v5: - Correct padding calculation, discovered by Nathan Chancellor.
> > ---
> >  Documentation/dev-tools/index.rst |   1 +
> >  Documentation/dev-tools/pgo.rst   | 127 +
> >  MAINTAINERS   |   9 +
> >  Makefile  |   3 +
> >  arch/Kconfig  |   1 +
> >  arch/x86/Kconfig  |   1 +
> >  arch/x86/boot/Makefile|   1 +
> >  arch/x86/boot/compressed/Makefile |   1 +
> >  arch/x86/crypto/Makefile  |   2 +
> >  arch/x86/entry/vdso/Makefile  |   1 +
> >  arch/x86/kernel/vmlinux.lds.S |   2 +
> >  arch/x86/platform/efi/Makefile|   1 +
> >  arch/x86/purgatory/Makefile   |   1 +
> >  arch/x86/realmode/rm/Makefile |   1 +
> >  arch/x86/um/vdso/Makefile |   1 +
> >  drivers/firmware/efi/libstub/Makefile |   1 +
> >  include/asm-generic/vmlinux.lds.h |  44 +++
> >  kernel/Makefile   |   1 +
> >  kernel/pgo/Kconfig|  35 +++
> >  kernel/pgo/Makefile   |   5 +
> >  kernel/pgo/fs.c   | 382 ++
> >  kernel/pgo/instrument.c   | 185 +
> >  kernel/pgo/pgo.h  | 206 ++
> >  scripts/Makefile.lib  |  10 +
> >  24 files changed, 1022 insertions(+)
> >  create mode 100644 Documentation/dev-tools/pgo.rst
> >  create mode 100644 kernel/pgo/Kconfig
> >  create mode 100644 kernel/pgo/Makefile
> >  create mode 100644 kernel/pgo/fs.c
> >  create mode 100644 kernel/pgo/instrument.c
> >  create mode 100644 kernel/pgo/pgo.h
> >
> > > diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
> > index f7809c7b1ba9e..8d6418e858062 100644
> > --- a/Documentation/dev-tools/index.rst
> > +++ b/Documentation/dev-tools/index.rst
> > @@ -26,6 +26,7 @@ whole; patches welcome!
> > >     kgdb
> > >     kselftest
> > >     kunit/index
> > +   pgo
> >
> >
> >  .. only::  subproject and html
> > > diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst
> > new file mode 100644
> > index 0..b7f11d8405b73
> > --- /dev/null
> > +++ b/Documentation/dev-tools/pgo.rst
> > @@ -0,0 +1,127 @@
> > +.. 

[PATCH v6] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-21 Thread Bill Wendling
From: Sami Tolvanen 

Enable the use of clang's Profile-Guided Optimization[1]. To generate a
profile, the kernel is instrumented with PGO counters, a representative
workload is run, and the raw profile data is collected from
/sys/kernel/debug/pgo/profraw.

The raw profile data must be processed by clang's "llvm-profdata" tool
before it can be used during recompilation:

  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw

Multiple raw profiles may be merged during this step.

The data can now be used by the compiler:

  $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...

This initial submission is restricted to x86, as that's the platform we
know works. This restriction can be lifted once other platforms have
been verified to work with PGO.

Note that this method of profiling the kernel is clang-native, unlike
the clang support in kernel/gcov.

[1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization

Signed-off-by: Sami Tolvanen 
Co-developed-by: Bill Wendling 
Signed-off-by: Bill Wendling 
Tested-by: Nick Desaulniers 
---
v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's
  testing.
- Corrected documentation, re PGO flags when using LTO, based on Fangrui
  Song's comments.
v3: - Added change log section based on Sedat Dilek's comments.
v4: - Remove non-x86 Makefile changes and use "hweight64" instead of using our
  own popcount implementation, based on Nick Desaulniers's comment.
v5: - Correct padding calculation, discovered by Nathan Chancellor.
v6: - Add better documentation about the locking scheme and other things.
- Rename macros to better match the same macros in LLVM's source code.
---
 Documentation/dev-tools/index.rst |   1 +
 Documentation/dev-tools/pgo.rst   | 127 +
 MAINTAINERS   |   9 +
 Makefile  |   3 +
 arch/Kconfig  |   1 +
 arch/x86/Kconfig  |   1 +
 arch/x86/boot/Makefile|   1 +
 arch/x86/boot/compressed/Makefile |   1 +
 arch/x86/crypto/Makefile  |   4 +
 arch/x86/entry/vdso/Makefile  |   1 +
 arch/x86/kernel/vmlinux.lds.S |   2 +
 arch/x86/platform/efi/Makefile|   1 +
 arch/x86/purgatory/Makefile   |   1 +
 arch/x86/realmode/rm/Makefile |   1 +
 arch/x86/um/vdso/Makefile |   1 +
 drivers/firmware/efi/libstub/Makefile |   1 +
 include/asm-generic/vmlinux.lds.h |  44 +++
 kernel/Makefile   |   1 +
 kernel/pgo/Kconfig|  35 +++
 kernel/pgo/Makefile   |   5 +
 kernel/pgo/fs.c   | 389 ++
 kernel/pgo/instrument.c   | 189 +
 kernel/pgo/pgo.h  | 203 ++
 scripts/Makefile.lib  |  10 +
 24 files changed, 1032 insertions(+)
 create mode 100644 Documentation/dev-tools/pgo.rst
 create mode 100644 kernel/pgo/Kconfig
 create mode 100644 kernel/pgo/Makefile
 create mode 100644 kernel/pgo/fs.c
 create mode 100644 kernel/pgo/instrument.c
 create mode 100644 kernel/pgo/pgo.h

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index f7809c7b1ba9..8d6418e85806 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -26,6 +26,7 @@ whole; patches welcome!
    kgdb
    kselftest
    kunit/index
+   pgo
 
 
 .. only::  subproject and html
diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst
new file mode 100644
index ..b7f11d8405b7
--- /dev/null
+++ b/Documentation/dev-tools/pgo.rst
@@ -0,0 +1,127 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================
+Using PGO with the Linux kernel
+===============================
+
+Clang's profiling kernel support (PGO_) enables profiling of the Linux kernel
+when building with Clang. The profiling data is exported via the ``pgo``
+debugfs directory.
+
+.. _PGO: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
+
+
+Preparation
+===========
+
+Configure the kernel with:
+
+.. code-block:: make
+
+   CONFIG_DEBUG_FS=y
+   CONFIG_PGO_CLANG=y
+
+Note that kernels compiled with profiling flags will be significantly larger
+and run slower.
+
+Profiling data will only become accessible once debugfs has been mounted:
+
+.. code-block:: sh
+
+   mount -t debugfs none /sys/kernel/debug
+
+
+Customization
+=============
+
+You can enable or disable profiling for individual files and directories by
+adding a line similar to the following to the respective kernel Makefile:
+
+- For a single file (e.g. main.o)
+
+  .. code-block:: make
+
+     PGO_PROFILE_main.o := y
+
+- For all files in one directory
+
+  .. code-block:: make
+
+     PGO_PROFILE

Re: [PATCH v5] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-18 Thread Bill Wendling
On Mon, Jan 18, 2021 at 9:26 AM Sedat Dilek  wrote:
>
> On Mon, Jan 18, 2021 at 1:39 PM Sedat Dilek  wrote:
> >
> > On Mon, Jan 18, 2021 at 3:32 AM Bill Wendling  wrote:
> > >
> > > On Sun, Jan 17, 2021 at 4:27 PM Sedat Dilek  wrote:
> > > >
> > > > [ big snip ]
> > >
> > > [More snippage.]
> > >
> > > > [ CC Fangrui ]
> > > >
> > > > With the attached...
> > > >
> > > >[PATCH v3] module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for
> > > > undefined symbols
> > > >
> > > > ...I was finally able to boot into a rebuilt PGO-optimized Linux-kernel.
> > > > For details see ClangBuiltLinux issue #1250 "Unknown symbol
> > > > _GLOBAL_OFFSET_TABLE_ loading kernel modules".
> > > >
> > > Thanks for confirming that this works with the above patch.
> > >
> > > > @ Bill Nick Sami Nathan
> > > >
> > > > 1. Can you say something about the impact of passing "LLVM_IAS=1" to make?
> > >
> > > The integrated assembler and this option are more-or-less orthogonal
> > > to each other. One can still use the GNU assembler with PGO. If you're
> > > having an issue, it may be related to ClangBuiltLinux issue #1250.
> > >
> > > > 2. Can you please try Nick's DWARF v5 support patchset v5 and
> > > > CONFIG_DEBUG_INFO_DWARF5=y (see attachments)?
> > > >
> > > I know Nick did several tests with PGO. He may have looked into it
> > > already, but we can check.
> > >
> >
> > Reproducible.
> >
> > LLVM_IAS=1 + DWARF5 = Not bootable
> >
> > I will try:
> >
> > LLVM_IAS=1 + DWARF4
> >
>
> I was not able to boot a Linux kernel built that way.
>
PGO will have no effect on debugging data. If this is an issue with
DWARF, then it's likely orthogonal to the PGO patch.

> What worked for me: DWARF2 with LLVM_IAS=1 *not* set.
>
> Of course, this could be an issue with my system's LLVM/Clang.
>
> Debian clang version
> 12.0.0-++2021011513+45ef053bd709-1~exp1~20210115101809.3724
>
Please use the official clang 11.0.1 release
(https://releases.llvm.org/download.html), modifying
kernel/pgo/Kconfig as I suggested above. We specify clang 12 as the
minimum version because of an issue that was fixed only recently.
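
To confirm which clang major version a given toolchain reports before
touching the version check, something like the following may help. This is
only a sketch: the function name clang_major is made up for illustration,
and it assumes the usual "clang version X.Y.Z" banner (distribution builds
such as Debian's prefix it with the vendor name).

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): print the major version found
# in a `clang --version` banner that is passed in as the first argument.
clang_major() {
    printf '%s\n' "$1" \
        | sed -n 's/.*clang version \([0-9][0-9]*\)\..*/\1/p' \
        | head -n 1
}

# Usage: clang_major "$(clang --version)"
```

Taking the banner as an argument, rather than calling clang inside the
function, keeps the helper usable with any compiler binary on the system.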

> Can you give me a LLVM commit-id where you had success with LLVM_IAS=1
> and especially CONFIG_DEBUG_INFO_DWARF5=y?
> Success means I was able to boot in QEMU and/or bare metal.
>
The DWARF5 patch isn't in yet, so I don't want to rely upon it too much.

-bw


Re: [PATCH v5] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-17 Thread Bill Wendling
On Sun, Jan 17, 2021 at 4:27 PM Sedat Dilek  wrote:
>
> [ big snip ]

[More snippage.]

> [ CC Fangrui ]
>
> With the attached...
>
>[PATCH v3] module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for
> undefined symbols
>
> ...I was finally able to boot into a rebuilt PGO-optimized Linux-kernel.
> For details see ClangBuiltLinux issue #1250 "Unknown symbol
> _GLOBAL_OFFSET_TABLE_ loading kernel modules".
>
Thanks for confirming that this works with the above patch.

> @ Bill Nick Sami Nathan
>
> 1. Can you say something about the impact of passing "LLVM_IAS=1" to make?

The integrated assembler and this option are more-or-less orthogonal
to each other. One can still use the GNU assembler with PGO. If you're
having an issue, it may be related to ClangBuiltLinux issue #1250.

> 2. Can you please try Nick's DWARF v5 support patchset v5 and
> CONFIG_DEBUG_INFO_DWARF5=y (see attachments)?
>
I know Nick did several tests with PGO. He may have looked into it
already, but we can check.

> I would like to know what the impact of Clang's integrated
> assembler and DWARF v5 is.
>
> I dropped both, that is...
>
> 1. Do not pass "LLVM_IAS=1" to make.
> 2. Use default DWARF v2 (with Nick's patchset: CONFIG_DEBUG_INFO_DWARF2=y).
>
> ...for a successful build and boot on bare metal.
>

[Next message]

> On each rebuild I need to pass to make ...?
>
>   LLVM=1 -fprofile-use=vmlinux.profdata
>
Yes.
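
Spelled out as a sketch, using the commands from the patch description:
the profile reaches the compiler through KCFLAGS rather than as a bare make
argument. The helper names below (pgo_merge_cmd, pgo_use_cmd) are
hypothetical and only print the commands, so they can be inspected before
being run.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the patch); the commands they print are
# the ones given in the patch description.

# Print the llvm-profdata command that merges one or more raw profiles.
# $1 is the merged output file, the remaining arguments are raw profiles.
pgo_merge_cmd() {
    out=$1
    shift
    printf 'llvm-profdata merge --output=%s' "$out"
    for raw in "$@"; do
        printf ' %s' "$raw"
    done
    printf '\n'
}

# Print the make invocation that rebuilds the kernel with the merged profile.
pgo_use_cmd() {
    printf 'make LLVM=1 KCFLAGS=-fprofile-use=%s\n' "$1"
}

# Usage, from the top of the kernel tree:
#   cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
#   eval "$(pgo_merge_cmd vmlinux.profdata vmlinux.profraw)"
#   eval "$(pgo_use_cmd vmlinux.profdata)"
```

Several raw profiles can be passed to pgo_merge_cmd, matching the note in
the patch description that multiple raw profiles may be merged in one step.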

> Did you try together with passing LLVM_IAS=1 to make?

One of my tests was with the integrated assembler enabled. Are you
finding issues with it?

The problem with using top-of-tree clang is that it's not necessarily
stable. You could try using the clang 11.x release (changing
"CLANG_VERSION >= 12" in kernel/pgo/Kconfig to "CLANG_VERSION >= 11").

-bw


Re: [PATCH v5] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-17 Thread Bill Wendling
On Sun, Jan 17, 2021 at 9:42 AM Sedat Dilek  wrote:
>
> On Sun, Jan 17, 2021 at 1:05 PM Sedat Dilek  wrote:
> >
> > On Sun, Jan 17, 2021 at 12:58 PM Sedat Dilek  wrote:
> > >
> > > > On Sun, Jan 17, 2021 at 12:42 PM Sedat Dilek  wrote:
> > > > >
> > > > > On Sun, Jan 17, 2021 at 12:23 PM Sedat Dilek  wrote:
> > > > > >
> > > > > > On Sun, Jan 17, 2021 at 11:53 AM Sedat Dilek  wrote:
> > > > > > >
> > > > > > > On Sun, Jan 17, 2021 at 11:44 AM Sedat Dilek  wrote:
> > > > > > > >
> > > > > > > > On Sat, Jan 16, 2021 at 9:23 PM Bill Wendling  wrote:
> > > > > > > > >
> > > > > > > > > On Sat, Jan 16, 2021 at 9:39 AM Sedat Dilek  wrote:
> > > > > > > > > > On Sat, Jan 16, 2021 at 10:44 AM 'Bill Wendling' via Clang Built Linux
> > > > > > > > > >  wrote:
> > > > > > > > > >
> > > > > > > > > > From: Sami Tolvanen 
> > > > > > > > > >
> > > > > > > > > > > Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> > > > > > > > > > > profile, the kernel is instrumented with PGO counters, a representative
> > > > > > > > > > > workload is run, and the raw profile data is collected from
> > > > > > > > > > > /sys/kernel/debug/pgo/profraw.
> > > > > > > > > > >
> > > > > > > > > > > The raw profile data must be processed by clang's "llvm-profdata" tool
> > > > > > > > > > > before it can be used during recompilation:
> > > > > > > > > > >
> > > > > > > > > > >   $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> > > > > > > > > > >   $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> > > > > > > > > > >
> > > > > > > > > > > Multiple raw profiles may be merged during this step.
> > > > > > > > > > >
> > > > > > > > > > > The data can now be used by the compiler:
> > > > > > > > > > >
> > > > > > > > > > >   $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
> > > > > > > > > > >
> > > > > > > > > > > This initial submission is restricted to x86, as that's the platform we
> > > > > > > > > > > know works. This restriction can be lifted once other platforms have
> > > > > > > > > > > been verified to work with PGO.
> > > > > > > > > > >
> > > > > > > > > > > Note that this method of profiling the kernel is clang-native, unlike
> > > > > > > > > > > the clang support in kernel/gcov.
> > > > > > > > > > >
> > > > > > > > > > > [1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Sami Tolvanen 
> > > > > > > > > > > Co-developed-by: Bill Wendling 
> > > > > > > > > > > Signed-off-by: Bill Wendling 
> > > > > > > > > > > ---
> > > > > > > > > > > v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's
> > > > > > > > > > >   testing.
> > > > > > > > > > > - Corrected documentation, re PGO flags when using LTO, based on Fangrui
> > > > > > > > > > >   Song's comments.
> > > > >

Re: [PATCH v5] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-16 Thread Bill Wendling
On Sat, Jan 16, 2021 at 9:39 AM Sedat Dilek  wrote:
> On Sat, Jan 16, 2021 at 10:44 AM 'Bill Wendling' via Clang Built Linux
>  wrote:
> >
> > From: Sami Tolvanen 
> >
> > Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> > profile, the kernel is instrumented with PGO counters, a representative
> > workload is run, and the raw profile data is collected from
> > /sys/kernel/debug/pgo/profraw.
> >
> > The raw profile data must be processed by clang's "llvm-profdata" tool
> > before it can be used during recompilation:
> >
> >   $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> >   $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> >
> > Multiple raw profiles may be merged during this step.
> >
> > The data can now be used by the compiler:
> >
> >   $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
> >
> > This initial submission is restricted to x86, as that's the platform we
> > know works. This restriction can be lifted once other platforms have
> > been verified to work with PGO.
> >
> > Note that this method of profiling the kernel is clang-native, unlike
> > the clang support in kernel/gcov.
> >
> > [1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
> >
> > Signed-off-by: Sami Tolvanen 
> > Co-developed-by: Bill Wendling 
> > Signed-off-by: Bill Wendling 
> > ---
> > v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's
> >   testing.
> > - Corrected documentation, re PGO flags when using LTO, based on Fangrui
> >   Song's comments.
> > v3: - Added change log section based on Sedat Dilek's comments.
> > > v4: - Remove non-x86 Makefile changes and use "hweight64" instead of using our
> >   own popcount implementation, based on Nick Desaulniers's comment.
> > v5: - Correct padding calculation, discovered by Nathan Chancellor.
> > ---
> >  Documentation/dev-tools/index.rst |   1 +
> >  Documentation/dev-tools/pgo.rst   | 127 +
> >  MAINTAINERS   |   9 +
> >  Makefile  |   3 +
> >  arch/Kconfig  |   1 +
> >  arch/x86/Kconfig  |   1 +
> >  arch/x86/boot/Makefile|   1 +
> >  arch/x86/boot/compressed/Makefile |   1 +
> >  arch/x86/crypto/Makefile  |   2 +
> >  arch/x86/entry/vdso/Makefile  |   1 +
> >  arch/x86/kernel/vmlinux.lds.S |   2 +
> >  arch/x86/platform/efi/Makefile|   1 +
> >  arch/x86/purgatory/Makefile   |   1 +
> >  arch/x86/realmode/rm/Makefile |   1 +
> >  arch/x86/um/vdso/Makefile |   1 +
> >  drivers/firmware/efi/libstub/Makefile |   1 +
> >  include/asm-generic/vmlinux.lds.h |  44 +++
> >  kernel/Makefile   |   1 +
> >  kernel/pgo/Kconfig|  35 +++
> >  kernel/pgo/Makefile   |   5 +
> >  kernel/pgo/fs.c   | 382 ++
> >  kernel/pgo/instrument.c   | 185 +
> >  kernel/pgo/pgo.h  | 206 ++
> >  scripts/Makefile.lib  |  10 +
> >  24 files changed, 1022 insertions(+)
> >  create mode 100644 Documentation/dev-tools/pgo.rst
> >  create mode 100644 kernel/pgo/Kconfig
> >  create mode 100644 kernel/pgo/Makefile
> >  create mode 100644 kernel/pgo/fs.c
> >  create mode 100644 kernel/pgo/instrument.c
> >  create mode 100644 kernel/pgo/pgo.h
> >
> > > diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
> > index f7809c7b1ba9e..8d6418e858062 100644
> > --- a/Documentation/dev-tools/index.rst
> > +++ b/Documentation/dev-tools/index.rst
> > @@ -26,6 +26,7 @@ whole; patches welcome!
> > >     kgdb
> > >     kselftest
> > >     kunit/index
> > +   pgo
> >
> >
> >  .. only::  subproject and html
> > > diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst
> > new file mode 100644
> > index 0..b7f11d8405b73
> > --- /dev/null
> > +++ b/Documentation/dev-tools/pgo.rst
> > @@ -0,0 +1,127 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +===
> > +Using PGO with the Linux kernel
> > +===
> > +
> > +Clang's profiling kernel support (PGO_) en

[PATCH v5] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-16 Thread Bill Wendling
From: Sami Tolvanen 

Enable the use of clang's Profile-Guided Optimization[1]. To generate a
profile, the kernel is instrumented with PGO counters, a representative
workload is run, and the raw profile data is collected from
/sys/kernel/debug/pgo/profraw.

The raw profile data must be processed by clang's "llvm-profdata" tool
before it can be used during recompilation:

  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw

Multiple raw profiles may be merged during this step.

The data can now be used by the compiler:

  $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...

This initial submission is restricted to x86, as that's the platform we
know works. This restriction can be lifted once other platforms have
been verified to work with PGO.

Note that this method of profiling the kernel is clang-native, unlike
the clang support in kernel/gcov.

[1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization

Signed-off-by: Sami Tolvanen 
Co-developed-by: Bill Wendling 
Signed-off-by: Bill Wendling 
---
v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's
  testing.
- Corrected documentation, re PGO flags when using LTO, based on Fangrui
  Song's comments.
v3: - Added change log section based on Sedat Dilek's comments.
v4: - Remove non-x86 Makefile changes and use "hweight64" instead of using our
  own popcount implementation, based on Nick Desaulniers's comment.
v5: - Correct padding calculation, discovered by Nathan Chancellor.
---
 Documentation/dev-tools/index.rst |   1 +
 Documentation/dev-tools/pgo.rst   | 127 +
 MAINTAINERS   |   9 +
 Makefile  |   3 +
 arch/Kconfig  |   1 +
 arch/x86/Kconfig  |   1 +
 arch/x86/boot/Makefile|   1 +
 arch/x86/boot/compressed/Makefile |   1 +
 arch/x86/crypto/Makefile  |   2 +
 arch/x86/entry/vdso/Makefile  |   1 +
 arch/x86/kernel/vmlinux.lds.S |   2 +
 arch/x86/platform/efi/Makefile|   1 +
 arch/x86/purgatory/Makefile   |   1 +
 arch/x86/realmode/rm/Makefile |   1 +
 arch/x86/um/vdso/Makefile |   1 +
 drivers/firmware/efi/libstub/Makefile |   1 +
 include/asm-generic/vmlinux.lds.h |  44 +++
 kernel/Makefile   |   1 +
 kernel/pgo/Kconfig|  35 +++
 kernel/pgo/Makefile   |   5 +
 kernel/pgo/fs.c   | 382 ++
 kernel/pgo/instrument.c   | 185 +
 kernel/pgo/pgo.h  | 206 ++
 scripts/Makefile.lib  |  10 +
 24 files changed, 1022 insertions(+)
 create mode 100644 Documentation/dev-tools/pgo.rst
 create mode 100644 kernel/pgo/Kconfig
 create mode 100644 kernel/pgo/Makefile
 create mode 100644 kernel/pgo/fs.c
 create mode 100644 kernel/pgo/instrument.c
 create mode 100644 kernel/pgo/pgo.h

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index f7809c7b1ba9e..8d6418e858062 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -26,6 +26,7 @@ whole; patches welcome!
    kgdb
    kselftest
    kunit/index
+   pgo
 
 
 .. only::  subproject and html
diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst
new file mode 100644
index 0..b7f11d8405b73
--- /dev/null
+++ b/Documentation/dev-tools/pgo.rst
@@ -0,0 +1,127 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================
+Using PGO with the Linux kernel
+===============================
+
+Clang's profiling kernel support (PGO_) enables profiling of the Linux kernel
+when building with Clang. The profiling data is exported via the ``pgo``
+debugfs directory.
+
+.. _PGO: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
+
+
+Preparation
+===========
+
+Configure the kernel with:
+
+.. code-block:: make
+
+   CONFIG_DEBUG_FS=y
+   CONFIG_PGO_CLANG=y
+
+Note that kernels compiled with profiling flags will be significantly larger
+and run slower.
+
+Profiling data will only become accessible once debugfs has been mounted:
+
+.. code-block:: sh
+
+   mount -t debugfs none /sys/kernel/debug
+
+
+Customization
+=============
+
+You can enable or disable profiling for individual files and directories by
+adding a line similar to the following to the respective kernel Makefile:
+
+- For a single file (e.g. main.o)
+
+  .. code-block:: make
+
+     PGO_PROFILE_main.o := y
+
+- For all files in one directory
+
+  .. code-block:: make
+
+     PGO_PROFILE := y
+
+To exclude files from being profiled use
+
+  .. code-block:: make
+
+     PGO_PROFILE_main.o := n
+
+and
+
+  .. code-block:: make
+
+     PGO_PROFILE := n
+
+Only files which are linked to 

Re: [PATCH v4] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-13 Thread Bill Wendling
On Wed, Jan 13, 2021 at 12:55 PM Nathan Chancellor
 wrote:
>
> Hi Bill,
>
> On Tue, Jan 12, 2021 at 10:19:58PM -0800, Bill Wendling wrote:
> > From: Sami Tolvanen 
> >
> > Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> > profile, the kernel is instrumented with PGO counters, a representative
> > workload is run, and the raw profile data is collected from
> > /sys/kernel/debug/pgo/profraw.
> >
> > The raw profile data must be processed by clang's "llvm-profdata" tool
> > before it can be used during recompilation:
> >
> >   $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> >   $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> >
> > Multiple raw profiles may be merged during this step.
> >
> > The data can now be used by the compiler:
> >
> >   $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
> >
> > This initial submission is restricted to x86, as that's the platform we
> > know works. This restriction can be lifted once other platforms have
> > been verified to work with PGO.
> >
> > Note that this method of profiling the kernel is clang-native, unlike
> > the clang support in kernel/gcov.
> >
> > [1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
> >
> > Signed-off-by: Sami Tolvanen 
> > Co-developed-by: Bill Wendling 
> > Signed-off-by: Bill Wendling 
> > Change-Id: Ic78e69c682286d3a44c4549a0138578c98138b77
>
> Small nit: This should be removed.
>
Grrr... The git hook keeps adding it in there. :-(

> I applied this patch on top of v5.11-rc3, built it with LLVM 12
> (f1d5cbbdee5526bc86eac0a5652b115d9bc158e5 + D94470) with Microsoft's
> WSL 5.4 config [1] + CONFIG_PGO_CLANG=y, and ran it on WSL2.
>
> $ zgrep PGO /proc/config.gz
> # Profile Guided Optimization (PGO) (EXPERIMENTAL)
> CONFIG_ARCH_SUPPORTS_PGO_CLANG=y
> CONFIG_PGO_CLANG=y
> # end of Profile Guided Optimization (PGO) (EXPERIMENTAL)
>
> However, I see an issue with actually using the data:
>
> $ sudo -s
> # mount -t debugfs none /sys/kernel/debug
> # cp -a /sys/kernel/debug/pgo/profraw vmlinux.profraw
> # chown nathan:nathan vmlinux.profraw
> # exit
> $ tc-build/build/llvm/stage1/bin/llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> warning: vmlinux.profraw: Invalid instrumentation profile data (bad magic)
> error: No profiles could be merged.
>
> Am I holding it wrong? :) Note, this is virtualized, I do not have any
> "real" x86 hardware that I can afford to test on right now.
>
> [1]: https://github.com/microsoft/WSL2-Linux-Kernel/raw/linux-msft-wsl-5.4.y/Microsoft/config-wsl
>
Could you send me the vmlinux.profraw file? (Don't CC this list.)
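
In the meantime, a rough local check of the file header might narrow
things down, since "bad magic" suggests llvm-profdata did not recognise the
first eight bytes. The sketch below is not part of the patch. It assumes
the 64-bit raw profile magic from LLVM's InstrProfData.inc (0xff 'l' 'p'
'r' 'o' 'f' 'r' 0x81, stored in the byte order of the machine that wrote
it), and the helper name has_profraw_magic is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file starts with the 64-bit LLVM raw
# profile magic in either byte order. The magic value is an assumption taken
# from LLVM's InstrProfData.inc, not from this patch.
has_profraw_magic() {
    # Dump the first 8 bytes as hex and strip od's spacing.
    magic=$(od -An -tx1 -N8 "$1" | tr -d ' \n')
    case "$magic" in
        8172666f72706cff|ff6c70726f667281) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: has_profraw_magic vmlinux.profraw || echo "unexpected header"
```

A matching header does not prove the rest of the file is valid. It only
separates a wrong or empty file from a layout problem further in.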

-bw


[PATCH v4] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-12 Thread Bill Wendling
From: Sami Tolvanen 

Enable the use of clang's Profile-Guided Optimization[1]. To generate a
profile, the kernel is instrumented with PGO counters, a representative
workload is run, and the raw profile data is collected from
/sys/kernel/debug/pgo/profraw.

The raw profile data must be processed by clang's "llvm-profdata" tool
before it can be used during recompilation:

  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw

Multiple raw profiles may be merged during this step.

The data can now be used by the compiler:

  $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...

This initial submission is restricted to x86, as that's the platform we
know works. This restriction can be lifted once other platforms have
been verified to work with PGO.

Note that this method of profiling the kernel is clang-native, unlike
the clang support in kernel/gcov.

[1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization

Signed-off-by: Sami Tolvanen 
Co-developed-by: Bill Wendling 
Signed-off-by: Bill Wendling 
Change-Id: Ic78e69c682286d3a44c4549a0138578c98138b77
---
v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's
  testing.
- Corrected documentation, re PGO flags when using LTO, based on Fangrui
  Song's comments.
v3: - Added change log section based on Sedat Dilek's comments.
v4: - Remove non-x86 Makefile changes and use "hweight64" instead of using our
  own popcount implementation, based on Nick Desaulniers's comment.
---
 Documentation/dev-tools/index.rst |   1 +
 Documentation/dev-tools/pgo.rst   | 127 +
 MAINTAINERS   |   9 +
 Makefile  |   3 +
 arch/Kconfig  |   1 +
 arch/x86/Kconfig  |   1 +
 arch/x86/boot/Makefile|   1 +
 arch/x86/boot/compressed/Makefile |   1 +
 arch/x86/entry/vdso/Makefile  |   1 +
 arch/x86/kernel/vmlinux.lds.S |   2 +
 arch/x86/platform/efi/Makefile|   1 +
 arch/x86/purgatory/Makefile   |   1 +
 arch/x86/realmode/rm/Makefile |   1 +
 arch/x86/um/vdso/Makefile |   1 +
 drivers/firmware/efi/libstub/Makefile |   1 +
 include/asm-generic/vmlinux.lds.h |  44 +++
 kernel/Makefile   |   1 +
 kernel/pgo/Kconfig|  34 +++
 kernel/pgo/Makefile   |   5 +
 kernel/pgo/fs.c   | 382 ++
 kernel/pgo/instrument.c   | 185 +
 kernel/pgo/pgo.h  | 206 ++
 scripts/Makefile.lib  |  10 +
 23 files changed, 1019 insertions(+)
 create mode 100644 Documentation/dev-tools/pgo.rst
 create mode 100644 kernel/pgo/Kconfig
 create mode 100644 kernel/pgo/Makefile
 create mode 100644 kernel/pgo/fs.c
 create mode 100644 kernel/pgo/instrument.c
 create mode 100644 kernel/pgo/pgo.h

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index f7809c7b1ba9e..8d6418e858062 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -26,6 +26,7 @@ whole; patches welcome!
    kgdb
    kselftest
    kunit/index
+   pgo
 
 
 .. only::  subproject and html
diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst
new file mode 100644
index 0..b7f11d8405b73
--- /dev/null
+++ b/Documentation/dev-tools/pgo.rst
@@ -0,0 +1,127 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================
+Using PGO with the Linux kernel
+===============================
+
+Clang's profiling kernel support (PGO_) enables profiling of the Linux kernel
+when building with Clang. The profiling data is exported via the ``pgo``
+debugfs directory.
+
+.. _PGO: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
+
+
+Preparation
+===========
+
+Configure the kernel with:
+
+.. code-block:: make
+
+   CONFIG_DEBUG_FS=y
+   CONFIG_PGO_CLANG=y
+
+Note that kernels compiled with profiling flags will be significantly larger
+and run slower.
+
+Profiling data will only become accessible once debugfs has been mounted:
+
+.. code-block:: sh
+
+   mount -t debugfs none /sys/kernel/debug
+
+
+Customization
+=============
+
+You can enable or disable profiling for individual files and directories by
+adding a line similar to the following to the respective kernel Makefile:
+
+- For a single file (e.g. main.o)
+
+  .. code-block:: make
+
+     PGO_PROFILE_main.o := y
+
+- For all files in one directory
+
+  .. code-block:: make
+
+     PGO_PROFILE := y
+
+To exclude files from being profiled use
+
+  .. code-block:: make
+
+     PGO_PROFILE_main.o := n
+
+and
+
+  .. code-block:: make
+
+     PGO_PROFILE := n
+
+Only files which are linked to the main kernel image or are compiled as kernel
+module

[PATCH v3] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-11 Thread Bill Wendling
From: Sami Tolvanen 

Enable the use of clang's Profile-Guided Optimization[1]. To generate a
profile, the kernel is instrumented with PGO counters, a representative
workload is run, and the raw profile data is collected from
/sys/kernel/debug/pgo/profraw.

The raw profile data must be processed by clang's "llvm-profdata" tool
before it can be used during recompilation:

  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw

Multiple raw profiles may be merged during this step.

The data can now be used by the compiler:

  $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...

This initial submission is restricted to x86, as that's the platform we
know works. This restriction can be lifted once other platforms have
been verified to work with PGO.

Note that this method of profiling the kernel is clang-native and isn't
compatible with clang's gcov support in kernel/gcov.

[1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization

Signed-off-by: Sami Tolvanen 
Co-developed-by: Bill Wendling 
Signed-off-by: Bill Wendling 
---
v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's
  testing.
- Corrected documentation, re PGO flags when using LTO, based on Fāng-ruì
  Sòng's comments.
v3: - Added change log section based on Sedat Dilek's comments.
---
 Documentation/dev-tools/index.rst |   1 +
 Documentation/dev-tools/pgo.rst   | 127 +
 MAINTAINERS   |   9 +
 Makefile  |   3 +
 arch/Kconfig  |   1 +
 arch/arm/boot/bootp/Makefile  |   1 +
 arch/arm/boot/compressed/Makefile |   1 +
 arch/arm/vdso/Makefile|   3 +-
 arch/arm64/kernel/vdso/Makefile   |   3 +-
 arch/arm64/kvm/hyp/nvhe/Makefile  |   1 +
 arch/mips/boot/compressed/Makefile|   1 +
 arch/mips/vdso/Makefile   |   1 +
 arch/nds32/kernel/vdso/Makefile   |   4 +-
 arch/parisc/boot/compressed/Makefile  |   1 +
 arch/powerpc/kernel/Makefile  |   6 +-
 arch/powerpc/kernel/trace/Makefile|   3 +-
 arch/powerpc/kernel/vdso32/Makefile   |   1 +
 arch/powerpc/kernel/vdso64/Makefile   |   1 +
 arch/powerpc/kexec/Makefile   |   3 +-
 arch/powerpc/xmon/Makefile|   1 +
 arch/riscv/kernel/vdso/Makefile   |   3 +-
 arch/s390/boot/Makefile   |   1 +
 arch/s390/boot/compressed/Makefile|   1 +
 arch/s390/kernel/Makefile |   1 +
 arch/s390/kernel/vdso64/Makefile  |   3 +-
 arch/s390/purgatory/Makefile  |   1 +
 arch/sh/boot/compressed/Makefile  |   1 +
 arch/sh/mm/Makefile   |   1 +
 arch/sparc/vdso/Makefile  |   1 +
 arch/x86/Kconfig  |   1 +
 arch/x86/boot/Makefile|   1 +
 arch/x86/boot/compressed/Makefile |   1 +
 arch/x86/entry/vdso/Makefile  |   1 +
 arch/x86/kernel/vmlinux.lds.S |   2 +
 arch/x86/platform/efi/Makefile|   1 +
 arch/x86/purgatory/Makefile   |   1 +
 arch/x86/realmode/rm/Makefile |   1 +
 arch/x86/um/vdso/Makefile |   1 +
 drivers/firmware/efi/libstub/Makefile |   1 +
 drivers/s390/char/Makefile|   1 +
 include/asm-generic/vmlinux.lds.h |  44 +++
 kernel/Makefile   |   1 +
 kernel/pgo/Kconfig|  34 +++
 kernel/pgo/Makefile   |   5 +
 kernel/pgo/fs.c   | 382 ++
 kernel/pgo/instrument.c   | 188 +
 kernel/pgo/pgo.h  | 206 ++
 scripts/Makefile.lib  |  10 +
 48 files changed, 1058 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/dev-tools/pgo.rst
 create mode 100644 kernel/pgo/Kconfig
 create mode 100644 kernel/pgo/Makefile
 create mode 100644 kernel/pgo/fs.c
 create mode 100644 kernel/pgo/instrument.c
 create mode 100644 kernel/pgo/pgo.h

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index f7809c7b1ba9e..8d6418e858062 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -26,6 +26,7 @@ whole; patches welcome!
kgdb
kselftest
kunit/index
+   pgo
 
 
 .. only::  subproject and html
diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst
new file mode 100644
index 0..da0e654ae7078
--- /dev/null
+++ b/Documentation/dev-tools/pgo.rst
@@ -0,0 +1,127 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================
+Using PGO with the Linux kernel
+===============================
+
+Clang's profiling kernel support (PGO_) enables profiling of the Linux kernel
+when building with Clang. The profiling data is exported via the ``pgo``
+debugfs directory.
+
+.. _PGO: 
https://clang.llvm.org/docs/UsersManual.html#profile-g

[PATCH v2] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-11 Thread Bill Wendling
From: Sami Tolvanen 

Enable the use of clang's Profile-Guided Optimization[1]. To generate a
profile, the kernel is instrumented with PGO counters, a representative
workload is run, and the raw profile data is collected from
/sys/kernel/debug/pgo/profraw.

The raw profile data must be processed by clang's "llvm-profdata" tool
before it can be used during recompilation:

  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw

Multiple raw profiles may be merged during this step.

The data can now be used by the compiler:

  $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...

This initial submission is restricted to x86, as that's the platform we
know works. This restriction can be lifted once other platforms have
been verified to work with PGO.

Note that this method of profiling the kernel is clang-native and isn't
compatible with clang's gcov support in kernel/gcov.

[1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization

Signed-off-by: Sami Tolvanen 
Co-developed-by: Bill Wendling 
Signed-off-by: Bill Wendling 
---
 Documentation/dev-tools/index.rst |   1 +
 Documentation/dev-tools/pgo.rst   | 127 +
 MAINTAINERS   |   9 +
 Makefile  |   3 +
 arch/Kconfig  |   1 +
 arch/arm/boot/bootp/Makefile  |   1 +
 arch/arm/boot/compressed/Makefile |   1 +
 arch/arm/vdso/Makefile|   3 +-
 arch/arm64/kernel/vdso/Makefile   |   3 +-
 arch/arm64/kvm/hyp/nvhe/Makefile  |   1 +
 arch/mips/boot/compressed/Makefile|   1 +
 arch/mips/vdso/Makefile   |   1 +
 arch/nds32/kernel/vdso/Makefile   |   4 +-
 arch/parisc/boot/compressed/Makefile  |   1 +
 arch/powerpc/kernel/Makefile  |   6 +-
 arch/powerpc/kernel/trace/Makefile|   3 +-
 arch/powerpc/kernel/vdso32/Makefile   |   1 +
 arch/powerpc/kernel/vdso64/Makefile   |   1 +
 arch/powerpc/kexec/Makefile   |   3 +-
 arch/powerpc/xmon/Makefile|   1 +
 arch/riscv/kernel/vdso/Makefile   |   3 +-
 arch/s390/boot/Makefile   |   1 +
 arch/s390/boot/compressed/Makefile|   1 +
 arch/s390/kernel/Makefile |   1 +
 arch/s390/kernel/vdso64/Makefile  |   3 +-
 arch/s390/purgatory/Makefile  |   1 +
 arch/sh/boot/compressed/Makefile  |   1 +
 arch/sh/mm/Makefile   |   1 +
 arch/sparc/vdso/Makefile  |   1 +
 arch/x86/Kconfig  |   1 +
 arch/x86/boot/Makefile|   1 +
 arch/x86/boot/compressed/Makefile |   1 +
 arch/x86/entry/vdso/Makefile  |   1 +
 arch/x86/kernel/vmlinux.lds.S |   2 +
 arch/x86/platform/efi/Makefile|   1 +
 arch/x86/purgatory/Makefile   |   1 +
 arch/x86/realmode/rm/Makefile |   1 +
 arch/x86/um/vdso/Makefile |   1 +
 drivers/firmware/efi/libstub/Makefile |   1 +
 drivers/s390/char/Makefile|   1 +
 include/asm-generic/vmlinux.lds.h |  44 +++
 kernel/Makefile   |   1 +
 kernel/pgo/Kconfig|  34 +++
 kernel/pgo/Makefile   |   5 +
 kernel/pgo/fs.c   | 382 ++
 kernel/pgo/instrument.c   | 188 +
 kernel/pgo/pgo.h  | 206 ++
 scripts/Makefile.lib  |  10 +
 48 files changed, 1058 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/dev-tools/pgo.rst
 create mode 100644 kernel/pgo/Kconfig
 create mode 100644 kernel/pgo/Makefile
 create mode 100644 kernel/pgo/fs.c
 create mode 100644 kernel/pgo/instrument.c
 create mode 100644 kernel/pgo/pgo.h

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index f7809c7b1ba9e..8d6418e858062 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -26,6 +26,7 @@ whole; patches welcome!
kgdb
kselftest
kunit/index
+   pgo
 
 
 .. only::  subproject and html
diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst
new file mode 100644
index 0..da0e654ae7078
--- /dev/null
+++ b/Documentation/dev-tools/pgo.rst
@@ -0,0 +1,127 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================
+Using PGO with the Linux kernel
+===============================
+
+Clang's profiling kernel support (PGO_) enables profiling of the Linux kernel
+when building with Clang. The profiling data is exported via the ``pgo``
+debugfs directory.
+
+.. _PGO: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
+
+
+Preparation
+===========
+
+Configure the kernel with:
+
+.. code-block:: make
+
+   CONFIG_DEBUG_FS=y
+   CONFIG_PGO_CLANG=y
+
+Note that kernels compiled with profiling flags will be significantly larger
+and run slower.
+
+Profiling data will only become 

Re: [PATCH] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-11 Thread Bill Wendling
On Mon, Jan 11, 2021 at 12:31 PM Fangrui Song  wrote:
> On 2021-01-11, Bill Wendling wrote:
> >On Mon, Jan 11, 2021 at 12:12 PM Fangrui Song  wrote:
> >>
> >> On 2021-01-11, 'Bill Wendling' via Clang Built Linux wrote:
> >> >From: Sami Tolvanen 
> >> >
> >> >Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> >> >profile, the kernel is instrumented with PGO counters, a representative
> >> >workload is run, and the raw profile data is collected from
> >> >/sys/kernel/debug/pgo/profraw.
> >> >
> >> >The raw profile data must be processed by clang's "llvm-profdata" tool before
> >> >it can be used during recompilation:
> >> >
> >> >  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> >> >  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> >> >
> >> >Multiple raw profiles may be merged during this step.
> >> >
> >> >The data can be used either by the compiler if LTO isn't enabled:
> >> >
> >> >... -fprofile-use=vmlinux.profdata ...
> >> >
> >> >or by LLD if LTO is enabled:
> >> >
> >> >... -lto-cs-profile-file=vmlinux.profdata ...
> >>
> >> This LLD option does not exist.
> >> LLD does have some `--lto-*` options but the `-lto-*` form is not supported
> >> (it clashes with -l) https://reviews.llvm.org/D79371
> >>
> >That's strange. I've been using that option for years now. :-) Is this
> >a recent change?
>
> The more frequently used options (specified by the clang driver) are
> -plugin-opt=... (options implemented by LLVMgold.so).
> `-lto-*` is rare.
>
> >> (There is an earlier -fprofile-instr-generate which does
> >> instrumentation in Clang, but the option does not have broad usage.
> >> It is used more for code coverage, not for optimization.
> >> Notably, it does not even implement the Kirchhoff's current law
> >> optimization)
> >>
> >Right. I've been told outside of this email that -fprofile-generate is
> >the preferred flag to use.
> >
> >> -fprofile-use= is used by both regular PGO and context-sensitive PGO (CSPGO).
> >>
> >> clang -flto=thin -fprofile-use= passes -plugin-opt=cs-profile-path= to the linker.
> >> For regular PGO, this option is effectively a no-op (confirmed with CSPGO main developer).
> >>
> >> So I think the "or by LLD if LTO is enabled:" part should be removed.
> >
> >But what if you specify the linking step explicitly? Linux doesn't
> >call "clang" when linking, but "ld.lld".
>
> Regular PGO+LTO does not need -plugin-opt=cs-profile-path=
> CSPGO+LTO needs it.
> Because -fprofile-use= may be used by both, the Clang driver adds it.
> CSPGO is not relevant in this patch, so the linker option does not need to
> be mentioned.

I'm still a bit confused. Are you saying that when clang uses
`-flto=thin -fprofile-use=foo` that the profile file "foo" is embedded
into the bitcode file so that when the linker's run it'll be used?

This is the workflow:

clang ... -fprofile-use=vmlinux.profdata ... -c -o foo.o foo.c
clang ... -fprofile-use=vmlinux.profdata ... -c -o bar.o bar.c
ld.lld ...  foo.o bar.o

Are you saying that we don't need to have
"-plugin-opt=cs-profile-path=vmlinux.profdata" on the "ld.lld ..."
line?

-bw


Re: [PATCH] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-11 Thread Bill Wendling
On Mon, Jan 11, 2021 at 1:18 PM Nick Desaulniers
 wrote:
>
> On Mon, Jan 11, 2021 at 1:04 PM Nathan Chancellor
>  wrote:
> >
> > On Mon, Jan 11, 2021 at 12:18:21AM -0800, Bill Wendling wrote:
> > > From: Sami Tolvanen 
> > >
> > > Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> > > profile, the kernel is instrumented with PGO counters, a representative
> > > workload is run, and the raw profile data is collected from
> > > /sys/kernel/debug/pgo/profraw.
> > >
> > > The raw profile data must be processed by clang's "llvm-profdata" tool before
> > > it can be used during recompilation:
> > >
> > >   $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> > >   $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> > >
> > > Multiple raw profiles may be merged during this step.
> > >
> > > The data can be used either by the compiler if LTO isn't enabled:
> > >
> > > ... -fprofile-use=vmlinux.profdata ...
> > >
> > > or by LLD if LTO is enabled:
> > >
> > > ... -lto-cs-profile-file=vmlinux.profdata ...
> > >
> > > This initial submission is restricted to x86, as that's the platform we know
> > > works. This restriction can be lifted once other platforms have been verified
> > > to work with PGO.
> > >
> > > Note that this method of profiling the kernel is clang-native and isn't
> > > compatible with clang's gcov support in kernel/gcov.
> > >
> > > [1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
> > >
> > > Signed-off-by: Sami Tolvanen 
> > > Co-developed-by: Bill Wendling 
> > > Signed-off-by: Bill Wendling 
> >
> > I took this for a spin against x86_64_defconfig and ran into two issues:
> >
> > 1. https://github.com/ClangBuiltLinux/linux/issues/1252
>
> "Cannot split an edge from a CallBrInst"
> Looks like that should be fixed first, then we should gate this
> feature on clang-12.
>
Weird. I'll investigate.

> >
> >There is also one in drivers/gpu/drm/i915/i915_query.c. For the time
> >being, I added PGO_PROFILE_... := n for those two files.
> >
> > 2. After doing that, I run into an undefined function error with ld.lld.
> >
> > How I tested:
> >
> > $ make -skj"$(nproc)" LLVM=1 defconfig
> >
> > $ scripts/config -e PGO_CLANG
> >
> > $ make -skj"$(nproc)" LLVM=1 olddefconfig vmlinux all
> > ld.lld: error: undefined symbol: __llvm_profile_instrument_memop
>
> Err...that seems like it should be implemented in
> kernel/pgo/instrument.c in this patch in a v2?
>
Yes. I'll submit a new V2 with this and other feedback integrated.

> > >>> referenced by head64.c
> > >>>   arch/x86/kernel/head64.o:(__early_make_pgtable)
> > >>> referenced by head64.c
> > >>>   arch/x86/kernel/head64.o:(x86_64_start_kernel)
> > >>> referenced by head64.c
> > >>>   arch/x86/kernel/head64.o:(copy_bootdata)
> > >>> referenced 2259 more times
> >
> > Local diff:
> >
> > diff --git a/drivers/char/Makefile b/drivers/char/Makefile
> > index ffce287ef415..4b2f238770b5 100644
> > --- a/drivers/char/Makefile
> > +++ b/drivers/char/Makefile
> > @@ -4,6 +4,7 @@
> >  #
> >
> >  obj-y  += mem.o random.o
> > +PGO_PROFILE_random.o   := n
> >  obj-$(CONFIG_TTY_PRINTK)   += ttyprintk.o
> >  obj-y  += misc.o
> >  obj-$(CONFIG_ATARI_DSP56K) += dsp56k.o
> > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> > index e5574e506a5c..d83cacc79b1a 100644
> > --- a/drivers/gpu/drm/i915/Makefile
> > +++ b/drivers/gpu/drm/i915/Makefile
> > @@ -168,6 +168,7 @@ i915-y += \
> >   i915_vma.o \
> >   intel_region_lmem.o \
> >   intel_wopcm.o
> > +PGO_PROFILE_i915_query.o := n
> >
> >  # general-purpose microcontroller (GuC) support
> >  i915-y += gt/uc/intel_uc.o \
>
> I'd rather have these both sorted out before landing with PGO disabled
> on these files.
>
Agreed.

-bw


Re: [PATCH] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-11 Thread Bill Wendling
On Mon, Jan 11, 2021 at 12:12 PM Fangrui Song  wrote:
>
> On 2021-01-11, 'Bill Wendling' via Clang Built Linux wrote:
> >From: Sami Tolvanen 
> >
> >Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> >profile, the kernel is instrumented with PGO counters, a representative
> >workload is run, and the raw profile data is collected from
> >/sys/kernel/debug/pgo/profraw.
> >
> >The raw profile data must be processed by clang's "llvm-profdata" tool before
> >it can be used during recompilation:
> >
> >  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> >  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> >
> >Multiple raw profiles may be merged during this step.
> >
> >The data can be used either by the compiler if LTO isn't enabled:
> >
> >... -fprofile-use=vmlinux.profdata ...
> >
> >or by LLD if LTO is enabled:
> >
> >... -lto-cs-profile-file=vmlinux.profdata ...
>
> This LLD option does not exist.
> LLD does have some `--lto-*` options but the `-lto-*` form is not supported
> (it clashes with -l) https://reviews.llvm.org/D79371
>
That's strange. I've been using that option for years now. :-) Is this
a recent change?

> (There is an earlier -fprofile-instr-generate which does
> instrumentation in Clang, but the option does not have broad usage.
> It is used more for code coverage, not for optimization.
> Notably, it does not even implement the Kirchhoff's current law
> optimization)
>
Right. I've been told outside of this email that -fprofile-generate is
the preferred flag to use.

> -fprofile-use= is used by both regular PGO and context-sensitive PGO (CSPGO).
>
> clang -flto=thin -fprofile-use= passes -plugin-opt=cs-profile-path= to the linker.
> For regular PGO, this option is effectively a no-op (confirmed with CSPGO main developer).
>
> So I think the "or by LLD if LTO is enabled:" part should be removed.

But what if you specify the linking step explicitly? Linux doesn't
call "clang" when linking, but "ld.lld".

-bw


Re: [PATCH] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-11 Thread Bill Wendling
On Mon, Jan 11, 2021 at 12:39 AM Sedat Dilek  wrote:
>
> On Mon, Jan 11, 2021 at 9:18 AM 'Bill Wendling' via Clang Built Linux
>  wrote:
> >
> > From: Sami Tolvanen 
> >
> > Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> > profile, the kernel is instrumented with PGO counters, a representative
> > workload is run, and the raw profile data is collected from
> > /sys/kernel/debug/pgo/profraw.
> >
> > The raw profile data must be processed by clang's "llvm-profdata" tool before
> > it can be used during recompilation:
> >
> >   $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> >   $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> >
> > Multiple raw profiles may be merged during this step.
> >
> > The data can be used either by the compiler if LTO isn't enabled:
> >
> > ... -fprofile-use=vmlinux.profdata ...
> >
> > or by LLD if LTO is enabled:
> >
> > ... -lto-cs-profile-file=vmlinux.profdata ...
> >
> > This initial submission is restricted to x86, as that's the platform we know
> > works. This restriction can be lifted once other platforms have been verified
> > to work with PGO.
> >
> > Note that this method of profiling the kernel is clang-native and isn't
> > compatible with clang's gcov support in kernel/gcov.
> >
> > [1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
> >
>
> Hi Bill and Sami,
>
> I have seen the pull-request in the CBL issue tracker and had some
> questions in mind.
>
> Good that you sent this.
>
> First of all, I would like to be able to fetch development work easily
> from a Git repository.

The version in the pull-request in the CBL issue tracker is roughly
the same as this patch. (There are some changes, but they aren't
functionality changes.)

> Can you offer this, please?
> What is the base for your work?
> I hope this is (fresh released) Linux v5.11-rc3.
>
This patch (and the PR on the CBL issue tracker) are from top-of-tree Linux.

> I myself had some experiences with a PGO + ThinLTO optimized LLVM
> toolchain built with the help of tc-build.
> Here it takes very long to build it.
>
> This means I have some profile-data archived.
> Can I use it?
>
LLVM is more tolerant of "stale" profile data than gcov, so it's
possible that your archived profile data would still work, but I can't
guarantee that it will be better than using new profile data.

> Is a self-built PGO + ThinLTO optimized LLVM toolchain a prerequisite for
> this or not?
> That is one of my important questions.
>
Do you mean that the LLVM tools (clang, llc, etc.) are compiled with
PGO + ThinLTO?

-bw


[PATCH] pgo: add clang's Profile Guided Optimization infrastructure

2021-01-11 Thread Bill Wendling
From: Sami Tolvanen 

Enable the use of clang's Profile-Guided Optimization[1]. To generate a
profile, the kernel is instrumented with PGO counters, a representative
workload is run, and the raw profile data is collected from
/sys/kernel/debug/pgo/profraw.

The raw profile data must be processed by clang's "llvm-profdata" tool before
it can be used during recompilation:

  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw

Multiple raw profiles may be merged during this step.

The data can be used either by the compiler if LTO isn't enabled:

... -fprofile-use=vmlinux.profdata ...

or by LLD if LTO is enabled:

... -lto-cs-profile-file=vmlinux.profdata ...

This initial submission is restricted to x86, as that's the platform we know
works. This restriction can be lifted once other platforms have been verified
to work with PGO.

Note that this method of profiling the kernel is clang-native and isn't
compatible with clang's gcov support in kernel/gcov.

[1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization

Signed-off-by: Sami Tolvanen 
Co-developed-by: Bill Wendling 
Signed-off-by: Bill Wendling 
---
 Documentation/dev-tools/index.rst |   1 +
 Documentation/dev-tools/pgo.rst   | 127 +
 MAINTAINERS   |   9 +
 Makefile  |   3 +
 arch/Kconfig  |   1 +
 arch/arm/boot/bootp/Makefile  |   1 +
 arch/arm/boot/compressed/Makefile |   1 +
 arch/arm/vdso/Makefile|   3 +-
 arch/arm64/kernel/vdso/Makefile   |   3 +-
 arch/arm64/kvm/hyp/nvhe/Makefile  |   1 +
 arch/mips/boot/compressed/Makefile|   1 +
 arch/mips/vdso/Makefile   |   1 +
 arch/nds32/kernel/vdso/Makefile   |   4 +-
 arch/parisc/boot/compressed/Makefile  |   1 +
 arch/powerpc/kernel/Makefile  |   6 +-
 arch/powerpc/kernel/trace/Makefile|   3 +-
 arch/powerpc/kernel/vdso32/Makefile   |   1 +
 arch/powerpc/kernel/vdso64/Makefile   |   1 +
 arch/powerpc/kexec/Makefile   |   3 +-
 arch/powerpc/xmon/Makefile|   1 +
 arch/riscv/kernel/vdso/Makefile   |   3 +-
 arch/s390/boot/Makefile   |   1 +
 arch/s390/boot/compressed/Makefile|   1 +
 arch/s390/kernel/Makefile |   1 +
 arch/s390/kernel/vdso64/Makefile  |   3 +-
 arch/s390/purgatory/Makefile  |   1 +
 arch/sh/boot/compressed/Makefile  |   1 +
 arch/sh/mm/Makefile   |   1 +
 arch/sparc/vdso/Makefile  |   1 +
 arch/x86/Kconfig  |   1 +
 arch/x86/boot/Makefile|   1 +
 arch/x86/boot/compressed/Makefile |   1 +
 arch/x86/entry/vdso/Makefile  |   1 +
 arch/x86/kernel/vmlinux.lds.S |   2 +
 arch/x86/platform/efi/Makefile|   1 +
 arch/x86/purgatory/Makefile   |   1 +
 arch/x86/realmode/rm/Makefile |   1 +
 arch/x86/um/vdso/Makefile |   1 +
 drivers/firmware/efi/libstub/Makefile |   1 +
 drivers/s390/char/Makefile|   1 +
 include/asm-generic/vmlinux.lds.h |  44 +++
 kernel/Makefile   |   1 +
 kernel/pgo/Kconfig|  34 +++
 kernel/pgo/Makefile   |   5 +
 kernel/pgo/fs.c   | 382 ++
 kernel/pgo/instrument.c   | 147 ++
 kernel/pgo/pgo.h  | 206 ++
 scripts/Makefile.lib  |  10 +
 48 files changed, 1017 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/dev-tools/pgo.rst
 create mode 100644 kernel/pgo/Kconfig
 create mode 100644 kernel/pgo/Makefile
 create mode 100644 kernel/pgo/fs.c
 create mode 100644 kernel/pgo/instrument.c
 create mode 100644 kernel/pgo/pgo.h

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index f7809c7b1ba9e..8d6418e858062 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -26,6 +26,7 @@ whole; patches welcome!
kgdb
kselftest
kunit/index
+   pgo
 
 
 .. only::  subproject and html
diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst
new file mode 100644
index 0..2ed7f549b20ef
--- /dev/null
+++ b/Documentation/dev-tools/pgo.rst
@@ -0,0 +1,127 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================
+Using PGO with the Linux kernel
+===============================
+
+Clang's profiling kernel support (PGO_) enables profiling of the Linux kernel
+when building with Clang. The profiling data is exported via the ``pgo``
+debugfs directory.
+
+.. _PGO: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
+
+
+Preparation
+===========
+
+Configure the kernel with:
+
+.. code-block:: make
+
+   CONFIG_DEBUG_FS=y
+   CONFIG_PGO_CLANG=y
+
+Note that kernels compiled with profilin

Re: [PATCH] riscv: Explicitly specify the build id style in vDSO Makefile again

2020-11-12 Thread Bill Wendling
On Thu, Nov 12, 2020 at 4:53 PM Nick Desaulniers
 wrote:
>
> On Sun, Nov 8, 2020 at 12:37 PM Nathan Chancellor
>  wrote:
> >
> > Commit a96843372331 ("kbuild: explicitly specify the build id style")
> > explicitly set the build ID style to SHA1. Commit c2c81bb2f691 ("RISC-V:
> > Fix the VDSO symbol generaton for binutils-2.35+") undid this change,
> > likely unintentionally.
> >
> > Restore it so that the build ID style stays consistent across the tree
> > regardless of linker.
> >
> > Fixes: c2c81bb2f691 ("RISC-V: Fix the VDSO symbol generaton for binutils-2.35+")
> > Signed-off-by: Nathan Chancellor 
>
> Thanks for the fixup!
>
> Reviewed-by: Nick Desaulniers 
>
> (I'm curious what --build-id linker flag does, and what kind of spooky
> bugs that led to a96843372331?)
>
--build-id generates a unique "build id" for the build. It can use
several different algorithms to do this. The BFD linker uses sha1 by
default while LLD uses a "fast" algorithm. The difference is that the
fast algorithm generates a shorter build id. This shouldn't matter in
general, but there are some tools out there that expect the build id
to be of a certain length, i.e. the BFD style's length, because BFD is
more prevalent. The obvious response, "well, why don't they just change
the expected length?", is hard to act on in every situation. (Once an
assumption is made, it's hard to backtrack.)
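
To make the length difference concrete: a sha1 build id is 20 bytes (40 hex digits), while LLD's "fast" style produces 8 bytes (16 hex digits). The sketch below guesses the style from the hex string that a tool such as `readelf -n` prints. The function name build_id_style is hypothetical, and the 32-digit case is ambiguous because the md5 and uuid styles are both 16 bytes:

```shell
# Hypothetical helper: guess the --build-id style from the number of
# hex digits in the build id string.
build_id_style() {
    case ${#1} in
        40) echo "sha1" ;;          # 20 bytes
        32) echo "md5 or uuid" ;;   # 16 bytes, two styles share this size
        16) echo "fast" ;;          # 8 bytes, LLD's default style
        *)  echo "unknown" ;;
    esac
}

# Example with a made-up 40-digit id.
build_id_style 0123456789abcdef0123456789abcdef01234567
```

A tool that hard-codes the 40-digit form would reject the 16-digit one, which is the kind of breakage that passing --build-id=sha1 explicitly avoids.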

You can add this if you like:

Reviewed-by: Bill Wendling 

> > ---
> >  arch/riscv/kernel/vdso/Makefile | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/riscv/kernel/vdso/Makefile b/arch/riscv/kernel/vdso/Makefile
> > index cb8f9e4cfcbf..0cfd6da784f8 100644
> > --- a/arch/riscv/kernel/vdso/Makefile
> > +++ b/arch/riscv/kernel/vdso/Makefile
> > @@ -44,7 +44,7 @@ SYSCFLAGS_vdso.so.dbg = $(c_flags)
> >  $(obj)/vdso.so.dbg: $(src)/vdso.lds $(obj-vdso) FORCE
> > $(call if_changed,vdsold)
> >  SYSCFLAGS_vdso.so.dbg = -shared -s -Wl,-soname=linux-vdso.so.1 \
> > -   -Wl,--build-id -Wl,--hash-style=both
> > +   -Wl,--build-id=sha1 -Wl,--hash-style=both
> >
> >  # We also create a special relocatable object that should mirror the symbol
> >  # table and layout of the linked DSO. With ld --just-symbols we can then
> >
> > base-commit: c2c81bb2f69138f902e1a58d3bef6ad97fb8a92c
> > --
> > 2.29.2
> >
>
>
>
> --
> Thanks,
> ~Nick Desaulniers


[PATCH] kbuild: explicitly specify the build id style

2020-09-22 Thread Bill Wendling
ld's --build-id defaults to "sha1" style, while lld defaults to "fast".
The build IDs are very different between the two, which may confuse
programs that reference them.

Signed-off-by: Bill Wendling 
---
 Makefile | 4 ++--
 arch/arm/vdso/Makefile   | 2 +-
 arch/arm64/kernel/vdso/Makefile  | 2 +-
 arch/arm64/kernel/vdso32/Makefile| 2 +-
 arch/mips/vdso/Makefile  | 2 +-
 arch/riscv/kernel/vdso/Makefile  | 2 +-
 arch/s390/kernel/vdso64/Makefile | 2 +-
 arch/sparc/vdso/Makefile | 2 +-
 arch/x86/entry/vdso/Makefile | 2 +-
 tools/testing/selftests/bpf/Makefile | 2 +-
 10 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/Makefile b/Makefile
index 2b66d3398878..7e6f41c9803a 100644
--- a/Makefile
+++ b/Makefile
@@ -973,8 +973,8 @@ KBUILD_CPPFLAGS += $(KCPPFLAGS)
 KBUILD_AFLAGS   += $(KAFLAGS)
 KBUILD_CFLAGS   += $(KCFLAGS)
 
-KBUILD_LDFLAGS_MODULE += --build-id
-LDFLAGS_vmlinux += --build-id
+KBUILD_LDFLAGS_MODULE += --build-id=sha1
+LDFLAGS_vmlinux += --build-id=sha1
 
 ifeq ($(CONFIG_STRIP_ASM_SYMS),y)
 LDFLAGS_vmlinux+= $(call ld-option, -X,)
diff --git a/arch/arm/vdso/Makefile b/arch/arm/vdso/Makefile
index a54f70731d9f..150ce6e6a5d3 100644
--- a/arch/arm/vdso/Makefile
+++ b/arch/arm/vdso/Makefile
@@ -19,7 +19,7 @@ ccflags-y += -DDISABLE_BRANCH_PROFILING -DBUILD_VDSO32
 ldflags-$(CONFIG_CPU_ENDIAN_BE8) := --be8
 ldflags-y := -Bsymbolic --no-undefined -soname=linux-vdso.so.1 \
-z max-page-size=4096 -nostdlib -shared $(ldflags-y) \
-   --hash-style=sysv --build-id \
+   --hash-style=sysv --build-id=sha1 \
-T
 
 obj-$(CONFIG_VDSO) += vdso.o
diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
index 45d5cfe46429..871915097f9d 100644
--- a/arch/arm64/kernel/vdso/Makefile
+++ b/arch/arm64/kernel/vdso/Makefile
@@ -24,7 +24,7 @@ btildflags-$(CONFIG_ARM64_BTI_KERNEL) += -z force-bti
 # routines, as x86 does (see 6f121e548f83 ("x86, vdso: Reimplement vdso.so
 # preparation in build-time C")).
 ldflags-y := -shared -nostdlib -soname=linux-vdso.so.1 --hash-style=sysv \
-   -Bsymbolic $(call ld-option, --no-eh-frame-hdr) --build-id -n \
+   -Bsymbolic $(call ld-option, --no-eh-frame-hdr) --build-id=sha1 -n \
 $(btildflags-y) -T
 
 ccflags-y := -fno-common -fno-builtin -fno-stack-protector -ffixed-x18
diff --git a/arch/arm64/kernel/vdso32/Makefile b/arch/arm64/kernel/vdso32/Makefile
index d6adb4677c25..4fa4b3fe8efb 100644
--- a/arch/arm64/kernel/vdso32/Makefile
+++ b/arch/arm64/kernel/vdso32/Makefile
@@ -128,7 +128,7 @@ VDSO_LDFLAGS += -Wl,-Bsymbolic -Wl,--no-undefined -Wl,-soname=linux-vdso.so.1
 VDSO_LDFLAGS += -Wl,-z,max-page-size=4096 -Wl,-z,common-page-size=4096
 VDSO_LDFLAGS += -nostdlib -shared -mfloat-abi=soft
 VDSO_LDFLAGS += -Wl,--hash-style=sysv
-VDSO_LDFLAGS += -Wl,--build-id
+VDSO_LDFLAGS += -Wl,--build-id=sha1
 VDSO_LDFLAGS += $(call cc32-ldoption,-fuse-ld=bfd)
 
 
diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile
index 57fe83235281..5810cc12bc1d 100644
--- a/arch/mips/vdso/Makefile
+++ b/arch/mips/vdso/Makefile
@@ -61,7 +61,7 @@ endif
 # VDSO linker flags.
 ldflags-y := -Bsymbolic --no-undefined -soname=linux-vdso.so.1 \
$(filter -E%,$(KBUILD_CFLAGS)) -nostdlib -shared \
-   -G 0 --eh-frame-hdr --hash-style=sysv --build-id -T
+   -G 0 --eh-frame-hdr --hash-style=sysv --build-id=sha1 -T
 
 CFLAGS_REMOVE_vdso.o = -pg
 
diff --git a/arch/riscv/kernel/vdso/Makefile b/arch/riscv/kernel/vdso/Makefile
index 478e7338ddc1..7d6a94d45ec9 100644
--- a/arch/riscv/kernel/vdso/Makefile
+++ b/arch/riscv/kernel/vdso/Makefile
@@ -49,7 +49,7 @@ $(obj)/vdso.so.dbg: $(src)/vdso.lds $(obj-vdso) FORCE
 # refer to these symbols in the kernel code rather than hand-coded addresses.
 
 SYSCFLAGS_vdso.so.dbg = -shared -s -Wl,-soname=linux-vdso.so.1 \
-   -Wl,--build-id -Wl,--hash-style=both
+   -Wl,--build-id=sha1 -Wl,--hash-style=both
 $(obj)/vdso-dummy.o: $(src)/vdso.lds $(obj)/rt_sigreturn.o FORCE
$(call if_changed,vdsold)
 
diff --git a/arch/s390/kernel/vdso64/Makefile b/arch/s390/kernel/vdso64/Makefile
index 4a66a1cb919b..edc473b32e42 100644
--- a/arch/s390/kernel/vdso64/Makefile
+++ b/arch/s390/kernel/vdso64/Makefile
@@ -19,7 +19,7 @@ KBUILD_AFLAGS_64 += -m64 -s
 KBUILD_CFLAGS_64 := $(filter-out -m64,$(KBUILD_CFLAGS))
 KBUILD_CFLAGS_64 += -m64 -fPIC -shared -fno-common -fno-builtin
 ldflags-y := -fPIC -shared -nostdlib -soname=linux-vdso64.so.1 \
---hash-style=both --build-id -T
+--hash-style=both --build-id=sha1 -T
 
 $(targets:%=$(obj)/%.dbg): KBUILD_CFLAGS = $(KBUILD_CFLAGS_64)
 $(targets:%=$(obj)/%.dbg): KBUILD_AFLAGS = $(KBUILD_AFLAGS_64)
diff --git a/arch/sparc/vdso/Makefile b/arch/sparc/vdso/Makefile
index f44355e46f31..469dd23887ab 100644
--- a/arch/sparc/vdso/Makefile
+++ b/arch/sparc/vdso/Makefile

Re: [PATCH] x86/smap: Fix the smap_save() asm

2020-09-15 Thread Bill Wendling
On Tue, Sep 15, 2020 at 4:40 PM Andrew Cooper  wrote:
>
> On 16/09/2020 00:11, Andy Lutomirski wrote:
> >> On Sep 15, 2020, at 2:24 PM, Nick Desaulniers  wrote:
> >>
> >> On Tue, Sep 15, 2020 at 1:56 PM Andy Lutomirski  wrote:
> >>> The old smap_save() code was:
> >>>
> >>>  pushf
> >>>  pop %0
> >>>
> >>> with %0 defined by an "=rm" constraint.  This is fine if the
> >>> compiler picked the register option, but it was incorrect with an
> >>> %rsp-relative memory operand.
> >> It is incorrect because ... (I think mentioning the point about the
> >> red zone would be good, unless there were additional concerns?)
> > This isn’t a red zone issue; it’s a just-plain-wrong issue.  The pop is
> > storing the result in the wrong place in memory: it’s RSP-relative, but
> > RSP is whatever the compiler thinks it should be minus 8, because the
> > compiler doesn’t know that pushfq changed RSP.
>
> It's worse than that.  Even when stating that %rsp is modified in the
> asm, the generated code sequence is still buggy, for recent Clang and GCC.
>
> https://godbolt.org/z/ccz9v7
>
> It's clearly not safe to ever use memory operands with pushf/popf asm
> fragments.
>
Would this apply to native_save_fl() and native_restore_fl() in
arch/x86/include/asm/irqflags.h? It was like that two revisions ago,
but it was changed (back) to "=rm" with a comment about it being safe.
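
For reference, a register-only variant of the pushf/pop sequence could look
roughly like the sketch below. It is a userspace illustration, not the
kernel's code: read_flags_demo() is a hypothetical name, and the two lea
instructions are only there because userspace code (unlike the kernel) is
normally built with a red zone. With "=r" as the only output constraint, the
compiler has no way to pick an %rsp-relative memory operand.

```c
/*
 * Hypothetical userspace helper, not smap_save() or native_save_fl().
 * The output is register-only ("=r"), so the compiler cannot choose an
 * %rsp-relative memory operand that pushf would shift by 8 bytes.
 */
unsigned long read_flags_demo(void)
{
#if defined(__x86_64__)
    unsigned long flags;

    __asm__ volatile(
        "lea -128(%%rsp), %%rsp\n\t"  /* step over the userspace red zone */
        "pushf\n\t"                   /* push the flags register */
        "pop %0\n\t"                  /* pop it into a general register */
        "lea 128(%%rsp), %%rsp"       /* restore %rsp; lea does not touch flags */
        : "=r" (flags)
        :
        : "memory");
    return flags;
#else
    return 0;   /* assumption: other architectures simply skip the demo */
#endif
}
```

On x86-64, bit 1 of the returned value is a reserved flag bit that always
reads as 1, which gives a cheap sanity check.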

> >> This is something we should fix.  Bill, James, and I are discussing
> >> this internally.  Thank you for filing a bug; I owe you a beer just
> >> for that.
> > I’m looking forward to the day that beers can be exchanged in person again 
> > :)
>
> +1 to that.
>
+100

-bw


Re: [PATCH] x86/smap: Fix the smap_save() asm

2020-09-15 Thread Bill Wendling
On Tue, Sep 15, 2020 at 2:24 PM Nick Desaulniers
 wrote:
> On Tue, Sep 15, 2020 at 1:56 PM Andy Lutomirski  wrote:
> > Fixes: e74deb11931f ("x86/uaccess: Introduce user_access_{save,restore}()")
> > Cc: sta...@vger.kernel.org
> > Reported-by: Bill Wendling  # I think
>
> LOL, yes, the comment can be dropped...though I guess someone else may
> have reported the problem to Bill?
>
I found this instance, but not the general issue. :-)

-bw


Re: [PATCH bpf-next] selftests/bpf: Switch test_vmlinux to use hrtimer_range_start_ns.

2020-06-30 Thread Bill Wendling
On Tue, Jun 30, 2020 at 3:48 PM Hao Luo  wrote:
>
> On Tue, Jun 30, 2020 at 1:37 PM Yonghong Song  wrote:
> >
> > On 6/30/20 11:49 AM, Hao Luo wrote:
> > > The test_vmlinux test uses hrtimer_nanosleep as hook to test tracing
> > > programs. But it seems Clang may have done an aggressive optimization,
> > > causing fentry and kprobe to not hook on this function properly on a
> > > Clang build kernel.
> >
> > Could you explain why it does not on clang built kernel? How did you
> > build the kernel? Did you use [thin]lto?
> >
> > hrtimer_nanosleep is a global function who is called in several
> > different files. I am curious how clang optimization can make
> > function disappear, or make its function signature change, or
> > rename the function?
> >
>
> Yonghong,
>
> We didn't enable LTO. It also puzzled me. But I can confirm those
> fentry/kprobe test failures via many different experiments I've done.
> After talking to my colleague on kernel compiling tools (Bill, cc'ed),
> we suspected this could be because of clang's aggressive inlining. We
> also noticed that all the callsites of hrtimer_nanosleep() are tail
> calls.
>
> For a better explanation, I can reach out to the people who are more
> familiar to clang in the compiler team to see if they have any
> insights. This may not be of high priority for them though.
>
Hi Yonghong,

Clang is generally more aggressive at inlining than gcc. So even
though hrtimer_nanosleep is a global function, clang goes ahead and
inlines it into the "nanosleep" syscall, which is in the same file.
(We're not currently using {Thin}LTO, so this won't happen in
functions outside of kernel/time/hrtimer.c.) Note that if gcc were to
change its inlining heuristics so that it inlined more aggressively,
you would be faced with a similar issue.

If you would like to test that it calls hrtimer_nanosleep() and not
another function, it might be best to call a syscall not defined in
hrtimer.c, e.g. clock_nanosleep().
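
A minimal sketch of what the test's trigger might look like from userspace,
assuming (as suggested above) that clock_nanosleep() reaches
hrtimer_nanosleep() from outside hrtimer.c. The helper name is hypothetical
and this is not the selftest's actual code.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <time.h>

/*
 * Hypothetical trigger for the tracing test: sleep for `ns` nanoseconds
 * (ns must be below one second) via clock_nanosleep() instead of
 * nanosleep(). Returns 0 on success or an error number on failure.
 */
int sleep_via_clock_nanosleep(long ns)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = ns };

    /* flags == 0 requests a relative sleep on CLOCK_MONOTONIC */
    return clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, NULL);
}
```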

-bw


Re: [PATCH] be2net: fix adapter->big_page_size miscaculation

2019-07-18 Thread Bill Wendling
Possibly. I'd need to ask him. :-)

On Thu, Jul 18, 2019 at 2:22 PM Nick Desaulniers
 wrote:
>
> On Thu, Jul 18, 2019 at 2:18 PM Bill Wendling  wrote:
> >
> > Top-of-tree clang says that it's const:
> >
> > $ gcc a.c -O2 && ./a.out
> > a is a const.
> >
> > $ clang a.c -O2 && ./a.out
> > a is a const.
>
> Right, so I know you (Bill) did a lot of work to refactor
> __builtin_constant_p handling in Clang and LLVM in the
> pre-llvm-9-release timeframe.  I suspect Qian might not be using
> clang-9 built from source (as clang-8 is the current release) and thus
> observing differences.
>
> >
> > On Thu, Jul 18, 2019 at 2:10 PM Nick Desaulniers  
> > wrote:
> >>
> >> On Thu, Jul 18, 2019 at 2:01 PM Qian Cai  wrote:
> >> >
> >> >
> >> >
> >> > > On Jul 12, 2019, at 8:50 PM, David Miller  wrote:
> >> > >
> >> > > From: Qian Cai 
> >> > > Date: Fri, 12 Jul 2019 20:27:09 -0400
> >> > >
> >> > >> Actually, GCC would consider it a const with -O2 optimized level 
> >> > >> because it found that it was never modified and it does not 
> >> > >> understand it is a module parameter. Considering the following code.
> >> > >>
> >> > >> # cat const.c
> > >> > >> #include <stdio.h>
> >> > >>
> >> > >> static int a = 1;
> >> > >>
> >> > >> int main(void)
> >> > >> {
> >> > >>  if (__builtin_constant_p(a))
> >> > >>  printf("a is a const.\n");
> >> > >>
> >> > >>  return 0;
> >> > >> }
> >> > >>
> >> > >> # gcc -O2 const.c -o const
> >> > >
> >> > > That's not a complete test case, and with a proper test case that
> >> > > shows the externalization of the address of &a done by the module
> >> > > parameter macros, gcc should not make this optimization or we should
> >> > > define the module parameter macros in a way that makes this properly
> >> > > clear to the compiler.
> >> > >
> >> > > It makes no sense to hack around this locally in drivers and other
> >> > > modules.
> >> >
> >> > If you see the warning in the original patch,
> >> >
> >> > https://lore.kernel.org/netdev/1562959401-19815-1-git-send-email-...@lca.pw/
> >> >
> >> > GCC definitely optimize rx_frag_size  to be a constant while I just 
> >> > confirmed clang
> >> > -O2 does not. The problem is that I have no clue about how to let GCC 
> >> > not to
> >> > optimize a module parameter.
> >> >
> >> > Though, I have added a few people who might know more of compilers than 
> >> > myself.
> >>
> >> + Bill and James, who probably knows more than they'd like to about
> >> __builtin_constant_p and more than other LLVM folks at this point.
> >>
> >> --
> >> Thanks,
> >> ~Nick Desaulniers
>
>
>
> --
> Thanks,
> ~Nick Desaulniers


Re: [PATCH] be2net: fix adapter->big_page_size miscaculation

2019-07-18 Thread Bill Wendling
[My previous response was marked as spam...]

Top-of-tree clang says that it's const:

$ gcc a.c -O2 && ./a.out
a is a const.

$ clang a.c -O2 && ./a.out
a is a const.
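
A slightly fuller test case, closer to what David asked for, might also
publish the variable's address. The sketch below is only an approximation:
the names are hypothetical and the exported pointer is an assumed stand-in
for what the module parameter macros do. Whether either compiler still
reports a constant depends on its version and optimization level, so no
expected result is given.

```c
/* Hypothetical stand-in for a driver variable such as rx_frag_size. */
static int frag_size_demo = 2048;

/*
 * Assumption: publishing the address loosely imitates the module
 * parameter macros, which make the variable reachable from outside
 * this translation unit.
 */
int *const frag_size_demo_ptr = &frag_size_demo;

/* Returns 1 if the compiler treats frag_size_demo as a constant, else 0. */
int frag_size_is_constant(void)
{
    return __builtin_constant_p(frag_size_demo) ? 1 : 0;
}
```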


On Thu, Jul 18, 2019 at 2:10 PM Nick Desaulniers
 wrote:
>
> On Thu, Jul 18, 2019 at 2:01 PM Qian Cai  wrote:
> >
> >
> >
> > > On Jul 12, 2019, at 8:50 PM, David Miller  wrote:
> > >
> > > From: Qian Cai 
> > > Date: Fri, 12 Jul 2019 20:27:09 -0400
> > >
> > >> Actually, GCC would consider it a const with -O2 optimized level because 
> > >> it found that it was never modified and it does not understand it is a 
> > >> module parameter. Considering the following code.
> > >>
> > >> # cat const.c
> > > >> #include <stdio.h>
> > >>
> > >> static int a = 1;
> > >>
> > >> int main(void)
> > >> {
> > >>  if (__builtin_constant_p(a))
> > >>  printf("a is a const.\n");
> > >>
> > >>  return 0;
> > >> }
> > >>
> > >> # gcc -O2 const.c -o const
> > >
> > > That's not a complete test case, and with a proper test case that
> > > shows the externalization of the address of &a done by the module
> > > parameter macros, gcc should not make this optimization or we should
> > > define the module parameter macros in a way that makes this properly
> > > clear to the compiler.
> > >
> > > It makes no sense to hack around this locally in drivers and other
> > > modules.
> >
> > If you see the warning in the original patch,
> >
> > https://lore.kernel.org/netdev/1562959401-19815-1-git-send-email-...@lca.pw/
> >
> > GCC definitely optimize rx_frag_size  to be a constant while I just 
> > confirmed clang
> > -O2 does not. The problem is that I have no clue about how to let GCC not to
> > optimize a module parameter.
> >
> > Though, I have added a few people who might know more of compilers than 
> > myself.
>
> + Bill and James, who probably knows more than they'd like to about
> __builtin_constant_p and more than other LLVM folks at this point.
>
> --
> Thanks,
> ~Nick Desaulniers


>2G Files

2001-05-02 Thread Bill Wendling

Hi all,

Question: Does Linux support >2G files and, if so, how do I implement
this?

Thanks.

-- 
|| Bill Wendling[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] gcc-3.0 warnings

2001-03-23 Thread Bill Wendling

Also sprach Alan Cox:

} > -   default:
} > +   default:;
} 
} Agree - done
} 
This kind of coding makes me want to cry. What's so wrong with:

default:
break;

instead? The ';' is hard to notice and, if people don't leave the
"default:" at the end, then bad things could happen...

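To make the suggestion concrete, here is a small sketch with an explicit
"break" under the final label. The function is made up for illustration. As
far as I know, the warning exists because a label has to be followed by a
statement, so a bare "default:" right before the closing brace is not valid.

```c
/* Hypothetical example: returns 1 for a lowercase vowel, 0 otherwise. */
int is_vowel_demo(int c)
{
    int vowel = 0;

    switch (c) {
    case 'a':
    case 'e':
    case 'i':
    case 'o':
    case 'u':
        vowel = 1;
        break;
    default:
        break;  /* explicit statement; still correct if a case is added below */
    }
    return vowel;
}
```
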
-- 
|| Bill Wendling[EMAIL PROTECTED]



Re: [rfc] Near-constant time directory index for Ext2

2001-02-21 Thread Bill Wendling

Also sprach H. Peter Anvin:
} Martin Mares wrote:
} > 
} > Hello!
} > 
} > > True.  Note too, though, that on a filesystem (which we are, after all,
} > > talking about), if you assume a large linear space you have to create a
} > > file, which means you need to multiply the cost of all random-access
} > > operations with O(log n).
} > 
} > One could avoid this, but it would mean designing the whole filesystem in a
} > completely different way -- merge all directories to a single gigantic
} > hash table and use (directory ID,file name) as a key, but we were originally
} > talking about extending ext2, so such massive changes are out of question
} > and your log n access argument is right.
} > 
} 
} It would still be tricky since you have to have actual files in the
} filesystem as well.
} 
But that's just a user space issue, isn't it.

(Just kidding :-)

-- 
|| Bill Wendling[EMAIL PROTECTED]



Re: [IDE] meaningless #ifndef?

2001-02-19 Thread Bill Wendling

Also sprach Andre Hedrick:
} On Mon, 19 Feb 2001, Pozsar Balazs wrote:
} 
} > from drivers/ide/ide-features.c:
} > 
} > /*
} >  *  All hosts that use the 80c ribbon mus use!
} >  */
} > byte eighty_ninty_three (ide_drive_t *drive)
} > {
} > return ((byte) ((HWIF(drive)->udma_four) &&
} > #ifndef CONFIG_IDEDMA_IVB
} > (drive->id->hw_config & 0x4000) &&
} > #endif /* CONFIG_IDEDMA_IVB */
} > (drive->id->hw_config & 0x6000)) ? 1 : 0);
} > }
} > 
} > If i see well, then this is always same whether CONFIG_IDEDMA_IVB is
} > defined or not.
} > What's the clue?
} 
[snip...]

The use of the ternary operator is superfluous, though...and makes the
code look ugly IMNSHO :).
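
To illustrate: the result of && in C is already an int that is either 0 or 1,
so "? 1 : 0" adds nothing. The two functions below are simplified,
hypothetical versions of the quoted expression (plain unsigned arguments
instead of the drive structure) and return the same value for every input.

```c
/* Simplified form of the quoted expression, keeping the ternary. */
int cable_check_with_ternary(unsigned udma_four, unsigned hw_config)
{
    return (udma_four && (hw_config & 0x6000)) ? 1 : 0;
}

/* Same logic without the ternary: && already yields 0 or 1. */
int cable_check_plain(unsigned udma_four, unsigned hw_config)
{
    return udma_four && (hw_config & 0x6000);
}
```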

-- 
|| Bill Wendling[EMAIL PROTECTED]



Re: Linux stifles innovation...

2001-02-15 Thread Bill Wendling

Also sprach Alan Olsen:
} I expect the next thing that will happen is that they will get patents on
} key portions of their protocols and then start enforcing them.
} 
Which protocols would that be? TCP/IP wasn't invented by them.

} I wonder what kind of law they will try to push to outlaw Open Source?
} 
With the horrid (pro-Microsoft) Ashcroft in office, who knows what MS can
get away with. Not to mention all of the pro-business, anti-human cronies
in Washington running the Presidency (cause \/\/ just can't do it).

-- 
|| Bill Wendling[EMAIL PROTECTED]



OOPs in 2.4.1 SMP

2001-02-12 Thread Bill Wendling

I noticed this oops on my HP Kayak XU dual Pentium II 300MHz. I added the
stuff right before it, though it may have nothing to do with the actual
Oops. My box is still running, BTW :).

Feb 12 16:00:00 dangermouse CROND[11154]: (root) CMD (   /sbin/rmmod -as) 
Feb 12 16:00:00 dangermouse CROND[11155]: (wendling) CMD (cd
/home/wendling/setiathome ; if [ -z `/sbin/pidof setiathome` ]; then
/home/wendling/setiathome/setiathome -nice 19 -graphics ; fi) 
Feb 12 16:01:00 dangermouse CROND[11159]: (root) CMD (run-parts
/etc/cron.hourly) 
Feb 12 16:03:16 dangermouse kernel: Unable to handle kernel NULL pointer
dereference at virtual address 0008
Feb 12 16:03:16 dangermouse kernel:  printing eip:
Feb 12 16:03:16 dangermouse kernel: c885eac3
Feb 12 16:03:16 dangermouse kernel: *pde = 
Feb 12 16:03:16 dangermouse kernel: Oops: 
Feb 12 16:03:16 dangermouse kernel: CPU:1
Feb 12 16:03:17 dangermouse kernel: EIP:0010:[]
Feb 12 16:03:17 dangermouse kernel: EFLAGS: 00210292
Feb 12 16:03:17 dangermouse kernel: eax:    ebx: c05baa60   ecx:
c2b01060   edx: 
Feb 12 16:03:17 dangermouse kernel: esi: c2b01060   edi: c5816c60   ebp:
004d   esp: c2e1ff44
Feb 12 16:03:17 dangermouse kernel: ds: 0018   es: 0018   ss: 0018
Feb 12 16:03:17 dangermouse kernel: Process cvs (pid: 11161,
stackpage=c2e1f000)
Feb 12 16:03:17 dangermouse kernel: Stack: 004d c2b01060 c885ecfd
c2b01060 08105875  c5816c80 01b6 
Feb 12 16:03:17 dangermouse kernel:081057c0 c724b000 c5816c60
ffea  004d c0134d06 c5816c60 
Feb 12 16:03:17 dangermouse kernel:08105828 004d c5816c80
c0134930 0002 c2e1e000 0027cd82  
Feb 12 16:03:17 dangermouse kernel: Call Trace: []
[sys_write+150/208] [default_llseek+0/128] [sys_lseek+196/208]
[system_call+51/56] 
Feb 12 16:03:17 dangermouse kernel: 
Feb 12 16:03:17 dangermouse kernel: Code: 8b 70 08 8b 46 3c 8b 56 40 89
41 3c 0f ac d0 09 89 41 54 8b 


-- 
|| Bill Wendling[EMAIL PROTECTED]



WaveLAN Gold Driver...Anyone?

2001-02-01 Thread Bill Wendling

Hi all,

Is anyone working on porting the driver for Lucent Technologies's WaveLAN
Gold wireless ethernet card to the 2.4.x kernel?

Just didn't want to duplicate effort.

Thanks!

-- 
|| Bill Wendling[EMAIL PROTECTED]



Re: Poll and Select not scaling

2001-01-10 Thread Bill Wendling

Also sprach Micah Gorrell:
} If I am asking this in the wrong place I apologize in advance.  Please let me
} know where the correct forum would be.
} 
} 
} I have been trying to increase the scalability of an email server that has
} been ported to Linux.  It was originally written for Netware, and there we
} are able to provide over 30,000 connections at any given time.  On Linux
} however select stops working after the first 1024 connections.  I have
} changed include/linux/fs.h and updated NR_FILE to be 81920.  In test
} applications I have been able to create well over 30,000 connections but I
} am unable to do either a select or a poll on them.  Does any one know what I
} can do to fix this?
} 
Hi Micah,

Which kernel version are you using?

-- 
|| Bill Wendling[EMAIL PROTECTED]



Re: Linux 2.4.0ac1

2001-01-05 Thread Bill Wendling

Also sprach Miles Lane:
} Bill Wendling wrote:
} 
} > Also sprach Keith Owens:
} > } On Thu, 04 Jan 2001 21:54:29 -0800, 
} > } Miles Lane <[EMAIL PROTECTED]> wrote:
} > } >make[4]: Entering directory `/usr/src/linux/drivers/acpi'
} > } >/usr/src/linux/Rules.make:224: *** Recursive variable `CFLAGS' references itself (eventually).  Stop.
} > } 
} 
} > Changing that line to:
} > 
} > $(MODINCL)/%.ver: CFLAGS := -I./include $(CFLAGS)
} > 
} > might work as well...
} 
} Seems to work here.  Thanks, Bill.
} 
Great! I just wish I knew where this line was being generated so that I
could send a patch in :).

-- 
|| Bill Wendling[EMAIL PROTECTED]



Re: Linux 2.4.0ac1

2001-01-04 Thread Bill Wendling

Also sprach Keith Owens:
} On Thu, 04 Jan 2001 21:54:29 -0800, 
} Miles Lane <[EMAIL PROTECTED]> wrote:
} >make[4]: Entering directory `/usr/src/linux/drivers/acpi'
} >/usr/src/linux/Rules.make:224: *** Recursive variable `CFLAGS' references itself (eventually).  Stop.
} 
} In drivers/acpi/Makefile, delete the line
} 
} $(MODINCL)/%.ver: CFLAGS = -I./include $(CFLAGS)
} 
} You will be able to compile but acpi may not work with module symbol
} versions, so do not select module symbol versions.
} 
Changing that line to:

$(MODINCL)/%.ver: CFLAGS := -I./include $(CFLAGS)

might work as well...

-- 
|| Bill Wendling[EMAIL PROTECTED]



Laggin Mouse on IBM Thinkpad 240

2000-11-14 Thread Bill Wendling

Hi all,

Whenever my IBM Thinkpad 240 (running 2.2.17 with apmd) goes into
``sleep'' mode and then wakes up, the mouse and typing tend to lag quite
a bit. So much so that it becomes very hard to get anything done on the
machine. This happens mostly when it's in battery mode (not plugged in)
but occasionally occurs when it's plugged in.

The .config file options I have turned on for power management are:

CONFIG_APM=y
# CONFIG_APM_IGNORE_USER_SUSPEND is not set
# CONFIG_APM_DO_ENABLE is not set
CONFIG_APM_CPU_IDLE=y
CONFIG_APM_DISPLAY_BLANK=y
CONFIG_APM_IGNORE_MULTIPLE_SUSPEND=y
CONFIG_APM_IGNORE_SUSPEND_BOUNCE=y
# CONFIG_APM_RTC_IS_GMT is not set
# CONFIG_APM_ALLOWS_INTS is not set
# CONFIG_APM_REAL_MODE_POWER_OFF is not set

I've tried the 2.4.0test* kernels with similar results...

-- 
|| Bill Wendling[EMAIL PROTECTED]



Re: Linux 2.4 Status / TODO page (Updated as of 2.4.0-test10)

2000-11-03 Thread Bill Wendling

Also sprach [EMAIL PROTECTED]:
} > de4x5 is probably also buggy in regard to this.
} 
} de4x5 is hopeless. I added nice comment in softnet to it.
} Unfortunately it was lost. 8)
} 
} Andi, neither you nor me nor Alan nor anyone are able to audit
} all this unnecessarily overcomplicated code. It was buggy, is buggy
} and will be buggy. It is unavoidable, as soon as you have hundreds
} of drivers.
} 
If they are buggy and unsupported, why aren't they being expunged from
the main source tree and placed into a ``contrib'' directory or something
for people who may want those drivers?

-- 
|| Bill Wendling[EMAIL PROTECTED]



[patch(?)] question wrt context switching during disk i/o

2000-10-21 Thread Bill Wendling

Also sprach Mike Galbraith:
} On Fri, 20 Oct 2000, Mark Hahn wrote:
} 
} > > This is something that has been bugging me for a while.  I notice
} > > on my system that during disk write we do much context switching,
} > > but not during disk read.  Why is that?
} > 
} > bdflush is broken in current kernels.  I posted to linux-mm about this,
} > but Rik et al haven't shown any interest.  I normally see bursts of 
} > up to around 40K cs/second when doing writes; I hacked a little 
} > preemption counter into the kernel and verified that they're practically
} > all bdflush...
} 
There's some strangeness in bdflush(). The comment says:

/*
 * If there are still a lot of dirty buffers around,
 * skip the sleep and flush some more. Otherwise, we
 * go to sleep waiting a wakeup.
 */
if (!flushed || balance_dirty_state(NODEV) < 0) {
run_task_queue(&tq_disk);
schedule();
}

but the comment for balance_dirty_state() says:

/* -1 -> no need to flush
0 -> async flush
1 -> sync flush (wait for I/O completation) */
int balance_dirty_state(kdev_t dev)
{

Which leads me to believe that the `<' should be either `==' or `<='. I
tried it with the `<=' and it doesn't seem to be so bad...Here's a patch
to see if it helps you?
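
To spell out what the two comparisons do, here is a small standalone model of
the condition. The function names are made up, and "state" stands for the
documented return value of balance_dirty_state() (-1, 0 or 1). It only
tabulates when bdflush would take the run_task_queue()/schedule() branch and
does not claim which variant is the intended one.

```c
/* Existing condition: sleep only when nothing was flushed or state is -1. */
int bdflush_sleeps_lt(int flushed, int state)
{
    return !flushed || state < 0;
}

/* Patched condition: also sleep when state is 0 (async flush). */
int bdflush_sleeps_le(int flushed, int state)
{
    return !flushed || state <= 0;
}
```

With flushed != 0, the two variants differ only for state == 0: the existing
code keeps looping without sleeping, while the patched code goes to sleep.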

-- 
|| Bill Wendling[EMAIL PROTECTED]


--- linux/fs/buffer.c   Sat Oct 21 02:55:41 2000
+++ linux-2.4.0-test10pre4/fs/buffer.c  Sat Oct 21 12:27:10 2000
@@ -2683,7 +2683,7 @@
 * skip the sleep and flush some more. Otherwise, we
 * go to sleep waiting a wakeup.
 */
-   if (!flushed || balance_dirty_state(NODEV) < 0) {
+   if (!flushed || balance_dirty_state(NODEV) <= 0) {
run_task_queue(&tq_disk);
schedule();
}



Re: Patch to remove undefined C code

2000-10-17 Thread Bill Wendling

Also sprach Tom Leete:
} 
} You are correct that in C the rightmost argument is always
} at the open end of the stack, and that varargs require that.
} The opposite is called the Pascal convention.
} 
Where in the standard does it say this? It's probably done most of the
time in this fashion for convenience, but I don't believe it's in the
standard.
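
What the standard does give you for variable arguments is <stdarg.h>, which
hides wherever the implementation actually puts them. A minimal sketch (the
function name is just an example):

```c
#include <stdarg.h>

/*
 * Sums `count` int arguments. Nothing here assumes a stack layout or an
 * argument push order; va_start/va_arg/va_end hide those details.
 */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);

    return total;
}
```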

-- 
|| Bill Wendling[EMAIL PROTECTED]



Re: Patch to remove undefined C code

2000-10-17 Thread Bill Wendling

Also sprach Bernd Schmidt:
} > Looking at the above code, I noticed that there are a lot of ++ 
} > operations.  I rewrote the code as:
} > 
} >  setup_from[0] = setup_from[1] = eaddrs[0];
} >  setup_from[2] = setup_from[3] = eaddrs[1];
} >  setup_from[4] = setup_from[5] = eaddrs[2];
} >  setup_from += 6;
} > 
} > I compiled using "gcc -S -Wall -O2 -fomit-frame-pointer -m486" to generate 
} > the assembler code.  The old code is 17 instructions long and the new code 
} > is 11 instructions.  As well as being shorter, simple timing test indicate 
} > that the new code is significantly quicker.
} 
} This is something the compiler ought to know about.
} 
A compiler can only guess so much. Don't look to a compiler to fix poorly
written code...

-- 
|| Bill Wendling[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: Patch to remove undefined C code

2000-10-17 Thread Bill Wendling

Also sprach Mark Montague:
} 
} dn[i], i++ evaluates to i, but str[i] = dn[i], i++ sets str[i] to
} dn[i] first, then increments i returning its previous value, which is
} discarded. Which was probably specified this way so
} 
} for(i=1,j=2; something; something else)
} 
} works as expected, rather than setting i and j to 2, and discarding
} the 1.
} 
Um...it sets i to 1 and j to 2.

} Just to encourage empiricism, I usually check stuff like that with
} 
} % cat > foo.c
} main(){
} int i;
} i = 1 , 2;
} printf("%d\n",i);
} }
} % gcc foo.c
} % ./a.out
} 1
} 
} or similar if I'm confused.
}
This isn't a good test of what's proper according to the standard. If you
are testing some undefined part of the standard, a compiler may get it
"right" in one version but is perfectly within its rights to do something
else in the next version. The only way to solve these things is to look
at what the standard says and follow it. All else is hopes and wishes.
The above code is specified as working in the standard (evaluating the
assign and then evaluating the 2).
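
For completeness, the difference comes down to precedence: assignment binds
tighter than the comma operator, so parentheses change which value is stored.
A small sketch (function names are illustrative; the void casts only silence
"expression has no effect" warnings):

```c
/* Parsed as (i = 1), 2: the assignment happens, then the 2 is discarded. */
int comma_unparenthesized(void)
{
    int i;

    i = 1, (void)2;
    return i;
}

/* Here the comma expression is evaluated first and its value, 2, is stored. */
int comma_parenthesized(void)
{
    int i;

    i = ((void)1, 2);
    return i;
}
```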

-- 
|| Bill Wendling[EMAIL PROTECTED]



Re: Is this a valid construct?

2000-10-16 Thread Bill Wendling

Also sprach Matthew Dharm:
} Does the following pseudocode do what I think it does?
} 
} Assume the semaphore is properly initialized to locked.
} 
} int flagvar = 0;
} struct semaphore blocking_sem;
} 
} void function_called_from_kernel_thread(void)
} {
}   chew_on_hardware();
}   flagvar = 1;
}   down(blocking_sem);
} 
}   if (flagvar)
} printk("something went wrong")
}   else
} printk("everything okay")
} }
} 
} void function_called_from_interrupt_context()
} {
}   flagvar = 0;
}   up(blocking_sem);
} }
} 
} void function_to_call_from_timeout()
} {
}   up(blocking_sem);
} }
} 
} The idea is this -- I chew on the hardware, then sleep on the semaphore.  I
} then either get woken up by an IRQ (which may never come), or the timeout.
} I then try to use the flagvar to determine which of the two happened.
} 
} This _looks_ valid to me... but I'm seeing occurrences where I get the IRQ
} (yes, I'm sure of it) but flagvar == 1, which confuses me.
} 
Are you sure there isn't a race on the flagvar variable? Like, the
interrupt happens, it gets set to 0, then before we can ``up'' the
semaphore, it's set to 1 in the function_called_from_kernel_thread()?
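
The interleaving I have in mind can be modelled without any real hardware or
semaphores. The sketch below is a single-threaded simulation with made-up
names, not driver code. It replays the two possible orders of the interrupt
handler's "flagvar = 0" relative to the thread's "flagvar = 1", and then
shows the same thing with the flag set before the hardware is touched.

```c
enum irq_point { IRQ_BEFORE_ARM, IRQ_AFTER_ARM };

static int flagvar;

/* Models function_called_from_interrupt_context() without the up(). */
static void irq_handler_model(void)
{
    flagvar = 0;
}

/* Original order: start the hardware, then set flagvar = 1. */
int original_order(enum irq_point when)
{
    flagvar = 0;
    /* chew_on_hardware() would run here; the IRQ may fire right away */
    if (when == IRQ_BEFORE_ARM)
        irq_handler_model();
    flagvar = 1;
    if (when == IRQ_AFTER_ARM)
        irq_handler_model();
    /* down() would return here */
    return flagvar;
}

/* Alternative order: set the flag first, so the IRQ cannot precede it. */
int arm_first_order(void)
{
    flagvar = 1;
    /* chew_on_hardware() would run here */
    irq_handler_model();
    /* down() would return here */
    return flagvar;
}
```

In this model original_order(IRQ_BEFORE_ARM) ends with flagvar == 1 even
though the interrupt did arrive, which matches the symptom described.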

-- 
|| Bill Wendling[EMAIL PROTECTED]



Re: Patch to remove undefined C code

2000-10-16 Thread Bill Wendling

Also sprach Abramo Bagnara:
} 
} Isn't this more efficient?
}   n = (x>>32) | (x<<32); 
}   n = ((n & 0x0000ffff0000ffffLL)<<16) | (n & 0xffff0000ffff0000LL)>>16; 
}   n = ((n & 0x00ff00ff00ff00ffLL)<<8) | (n & 0xff00ff00ff00ff00LL)>>8; 
} 
} 6 shift
} 4 and
} 3 or
} 
Plus 3 assigns...but they may get optimized out. :)

} instead of
} 
} 8 shift
} 8 and
} 7 or
} 

-- 
|| Bill Wendling[EMAIL PROTECTED]



[patch-2.4.0-test9-pre1] mm/filemap.c

2000-09-16 Thread Bill Wendling

Hi Linus,
 
Here's a resubmitted small optimization for the mm/filemap.c file.
 
- The `curr = curr->next;' statement doesn't need to be executed
  if the repeat is taken. I used the list_for_each() macro to
  accommodate this better.
 
Share and enjoy!

-- 
|| Bill Wendling[EMAIL PROTECTED]


--- linux-2.4.0-test9-pre1/mm/filemap.c Sat Sep 16 02:21:03 2000
+++ linux-2.4.0-test9-pre1-new/mm/filemap.c Sat Sep 16 02:31:54 2000
@@ -193,12 +193,10 @@
 repeat:
head = &mapping->pages;
spin_lock(&pagecache_lock);
-   curr = head->next;
-   while (curr != head) {
+   list_for_each(curr, head) {
unsigned long offset;
 
page = list_entry(curr, struct page, list);
-   curr = curr->next;
offset = page->index;
 
/* Is one of the pages to truncate? */



Re: [PATCH 2.4.0-test8] mm/filemap.c

2000-09-14 Thread Bill Wendling

Also sprach Juan J. Quintela:
} >>>>> "bill" == Bill Wendling <[EMAIL PROTECTED]> writes:
} 
} Hi
} 
} Linus, please don't apply.
} 
} bill> - The `head = &mapping->pages;' statement is useless inside the
} bill>   repeat, since head isn't modified inside the loop.
} 
} No, but we sleep inside the loop, and while we sleep, we don't have
} locked the page cache :
} 
As David pointed out, it's only necessary if mapping->pages changes. But,
then, shouldn't the head = &mapping->pages statement be within the
spinlock?

} If you think that the for is nicer (I think that the while is easier
} to read, but that is question of taste).
} 
It's not really a question of taste. The for loop does the increment
after the block of code. Doing it beforehand is a waste if the repeat is
taken.
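
To show the shape of the loop I mean, here is a tiny standalone list with a
for-style iteration macro. Everything in it is a simplified, hypothetical
stand-in (demo_ names) and not the kernel's <linux/list.h>. The advance step
sits in the for header, so it only runs when an iteration completes.

```c
struct list_node {
    struct list_node *next, *prev;
};

/* Advance happens in the for header, after the loop body has run. */
#define demo_list_for_each(pos, head) \
    for ((pos) = (head)->next; (pos) != (head); (pos) = (pos)->next)

void demo_list_init(struct list_node *head)
{
    head->next = head;
    head->prev = head;
}

void demo_list_add_tail(struct list_node *node, struct list_node *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/*
 * Counts the entries. Caveat: if the body unlinked the current node, the
 * next pointer would have to be fetched before the body runs instead.
 */
int demo_list_count(struct list_node *head)
{
    struct list_node *pos;
    int count = 0;

    demo_list_for_each(pos, head)
        count++;
    return count;
}
```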

Attached is a new patch which just addresses this issue and leaves the
head = &mapping->pages thingy alone...

-- 
|| Bill Wendling[EMAIL PROTECTED]


--- linux-2.4.0-test8/mm/filemap.c  Sat Sep  9 02:35:09 2000
+++ linux-2.4.0-test8-new/mm/filemap.c  Thu Sep 14 12:09:21 2000
@@ -193,12 +193,10 @@
 repeat:
head = &mapping->pages;
spin_lock(&pagecache_lock);
-   curr = head->next;
-   while (curr != head) {
+   list_for_each(curr, head) {
unsigned long offset;
 
page = list_entry(curr, struct page, list);
-   curr = curr->next;
offset = page->index;
 
/* Is one of the pages to truncate? */



Re: [PATCH 2.4.0-test8] mm/filemap.c

2000-09-14 Thread Bill Wendling

Also sprach David Mansfield:
} Bill Wendling wrote:
} > 
} > Hi Linus,
} > 
} > Here's a small optimization for the mm/filemap.c file.
} > 
} > - The `head = &mapping->pages;' statement is useless inside the
} >   repeat, since head isn't modified inside the loop.
} > - The `curr = curr->next;' statement doesn't need to be executed
} >   if the repeat is taken. I changed the while() into a for() loop
} >   to accommodate this better.
} > 
} 
} I spotted the curr = curr->next thing yesterday, too!  I think you're
} right on that one.  But I'm not sure about the head = &mapping thing. 
} The reason we jump back here is that we've been outside the spinlock'ed
} critical section.  Is it possible for the &mapping->pages to change
} during this period of time (when spinlock isn't held?), if not, your
} patch is ok.  If it could change, we need to re-initialize head because
} it could have changed while we didn't have the lock locked.
} 
Doh...Yeah. But...Shouldn't the `head = &mapping->pages;' thing be inside
of a spinlock if &mapping->pages could change? Say it changed between the
assignment and the function grabbing the lock?

Does anyone else know better on this?

-- 
|| Bill Wendling[EMAIL PROTECTED]



[PATCH 2.4.0-test8] mm/filemap.c

2000-09-14 Thread Bill Wendling

Hi Linus,

Here's a small optimization for the mm/filemap.c file.

- The `head = &mapping->pages;' statement is useless inside the
  repeat, since head isn't modified inside the loop.
- The `curr = curr->next;' statement doesn't need to be executed
  if the repeat is taken. I changed the while() into a for() loop
  to accommodate this better.

Share and enjoy!

-- 
|| Bill Wendling[EMAIL PROTECTED]


--- linux-2.4.0-test8/mm/filemap.c  Sat Sep  9 02:35:09 2000
+++ linux-2.4.0-test8-new/mm/filemap.c  Thu Sep 14 03:14:06 2000
@@ -189,16 +189,14 @@
unsigned long start;
 
start = (lstart + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
+   head = &mapping->pages;
 
 repeat:
-   head = &mapping->pages;
spin_lock(&pagecache_lock);
-   curr = head->next;
-   while (curr != head) {
+   for (curr = head->next; curr != head; curr = curr->next) {
unsigned long offset;
 
page = list_entry(curr, struct page, list);
-   curr = curr->next;
offset = page->index;
 
/* Is one of the pages to truncate? */



Re: Linux-2.4.0-test8-pre6

2000-09-07 Thread Bill Wendling

Also sprach dean gaudet:
} On Thu, 7 Sep 2000, Bill Wendling wrote:
} > Don't be stupid.
} 
} dude, i gave at least three hints that i was joking up there.  stupid
} would be if i claimed that it was obvious that a debugger would have
} helped this situation.  instead all i'm claiming is that it's orthogonal.
} 
} > Because some bugs are very hard to fix no matter what tool/brain you
} > throw at it...
} 
} so you got my point after all.
} 
Sorry...overreacted...Won't do it again (well, will try not to)...

-- 
|| Bill Wendling[EMAIL PROTECTED]



Re: Linux-2.4.0-test8-pre6

2000-09-07 Thread Bill Wendling

Also sprach dean gaudet:
} On Wed, 6 Sep 2000, Linus Torvalds wrote:
} 
} > Yeah. Maybe we fixed truncate, and maybe we didn't. I've thought that we
} > fixed it now several times, and I was always wrong.
} 
} obpainintheass:  haven't you anti-debugger-religion folks been claiming
} that if you don't have a debugger you're forced to "think about the code
} to find the correct fix"?  so, like, why are you guessing right now?  :)
} 
Don't be stupid.

Because some bugs are very hard to fix no matter what tool/brain you
throw at it...

-- 
|| Bill Wendling[EMAIL PROTECTED]



Re: Availability of kdb

2000-09-06 Thread Bill Wendling

Also sprach Linus Torvalds:
 
[snip]
} I do realize that others disagree. And I'm not your Mom. You can use a
} kernel debugger if you want to, and I won't give you the cold shoulder
} because you have "sullied" yourself. But I'm not going to help you use
} one, and I would frankly prefer people not to use kernel debuggers that
} much. So I don't make it part of the standard distribution, and if the
} existing debuggers aren't very well known I won't shed a tear over it.
} 
Or, to misquote Feynman (another cantankerous bastard, but proud of it):

"Look at the problem. Think really hard. And write the correct code."

-- 
|| Bill Wendling[EMAIL PROTECTED]



Re: linux-2.4.0-test8-pre5

2000-09-06 Thread Bill Wendling

Also sprach Dan Aloni:
} On Wed, 6 Sep 2000, Peter Samuelson wrote:
} 
} > Can someone explain this line from the VIA update?
} >   #define FIT(v,min,max) (((v)>(max)?(max):(v))<(min)?(min):(v))
} > Barring side effects on the variables, it is equivalent to
} >   #define FIT(v,min,max) ((v)<(min)?(min):(v))
} > 
} > Why do I get the feeling that this was *not* the intent?
} 
} Correct. The last v should be replaced with whatever that we got from
} (v)>(max)?(max):(v), like:
} 
} #define FIT(v,min,max) (((v)>(max)?(max):(v))<(min)?(min):((v)>(max)?(max):(v)))
} 
} Or perhaps this is a lot better:
} 
} #define FIT(v,min,max) ((v)>(max)?(max):((v)<(min)?(min):(v)))
} 
*pukes*

Wouldn't an inline'd function be much much more readable/maintainable??
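
For illustration, a minimal sketch of what such an inline function might
look like. The name fit(), the int argument type, and the FIT_ORIG name
given to the quoted macro are assumptions made for this example; none of
them come from the VIA driver itself:

```c
/*
 * The macro as quoted in the thread, renamed FIT_ORIG here (hypothetical
 * name) so it can sit next to the function. When v > max the outer
 * conditional still yields the raw v, so the upper bound is never applied.
 */
#define FIT_ORIG(v, min, max) \
	(((v) > (max) ? (max) : (v)) < (min) ? (min) : (v))

/*
 * Hypothetical inline replacement: clamp v to the range [min, max].
 * Assumes min <= max. Each argument is evaluated exactly once.
 */
static inline int fit(int v, int min, int max)
{
	if (v < min)
		return min;
	if (v > max)
		return max;
	return v;
}
```

Because the arguments are ordinary function parameters, a call such as
fit(x++, 0, 255) increments x only once, which none of the macro variants
above can promise.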

-- 
|| Bill Wendling[EMAIL PROTECTED]