Re: [PATCH v4 00/15] mm: jit/text allocator

2024-04-11 Thread Kent Overstreet
On Thu, Apr 11, 2024 at 07:00:36PM +0300, Mike Rapoport wrote:
> From: "Mike Rapoport (IBM)" 
> 
> Hi,
> 
> Since v3 I looked into making execmem more of a utility toolbox, as we
> discussed at LPC with Mark Rutland, but it was getting hairier than
> having a struct describing architecture constraints and a type identifying
> the consumer of execmem.
> 
> And I do think that having the description of architecture constraints for
> allocations of executable memory in a single place is better than having it
> spread all over the place.
> 
> The patches are available via git:
> https://git.kernel.org/pub/scm/linux/kernel/git/rppt/linux.git/log/?h=execmem/v4
> 
> v4 changes:
> * rebase on v6.9-rc2
> * rename execmem_params to execmem_info and execmem_arch_params() to
>   execmem_arch_setup()
> * use single execmem_alloc() API instead of execmem_{text,data}_alloc() (Song)
> * avoid extra copy of execmem parameters (Rick)
> * run execmem_init() as core_initcall() except for the architectures that
> may allocate text really early (currently only x86) (Will)
> * add acks for some of arm64 and riscv changes, thanks Will and Alexandre
> * new commits:
>   - drop call to kasan_alloc_module_shadow() on arm64 because it's not
> needed anymore
>   - rename MODULE_START to MODULES_VADDR on MIPS
>   - use CONFIG_EXECMEM instead of CONFIG_MODULES on powerpc as per Christophe:
> 
> https://lore.kernel.org/all/79062fa3-3402-47b3-8920-9231ad05e...@csgroup.eu/
> 
> v3: https://lore.kernel.org/all/20230918072955.2507221-1-r...@kernel.org
> * add type parameter to execmem allocation APIs
> * remove BPF dependency on modules
> 
> v2: https://lore.kernel.org/all/20230616085038.4121892-1-r...@kernel.org
> * Separate "module" and "others" allocations with execmem_text_alloc()
> and jit_text_alloc()
> * Drop ROX entailment on x86
> * Add ack for nios2 changes, thanks Dinh Nguyen
> 
> v1: https://lore.kernel.org/all/20230601101257.530867-1-r...@kernel.org
> 
> = Cover letter from v1 (slightly updated) =
> 
> module_alloc() is used everywhere as a means to allocate memory for code.
> 
> Besides being semantically wrong, this unnecessarily ties all subsystems
> that need to allocate code, such as ftrace, kprobes and BPF, to modules and
> puts the burden of code allocation on the modules code.
> 
> Several architectures override module_alloc() because of various
> constraints on where the executable memory can be located, and this causes
> additional obstacles for improvements of code allocation.
> 
> A centralized infrastructure for code allocation allows allocating
> executable memory as ROX and enables future optimizations, such as caching
> large pages for better iTLB performance and providing sub-page allocations
> for users that only need small jit code snippets.
> 
> Rick Edgecombe proposed perm_alloc extension to vmalloc [1] and Song Liu
> proposed execmem_alloc [2], but both these approaches were targeting BPF
> allocations and lacked the ground work to abstract executable allocations
> and split them from the modules core.
> 
> Thomas Gleixner suggested to express module allocation restrictions and
> requirements as struct mod_alloc_type_params [3] that would define ranges,
> protections and other parameters for different types of allocations used by
> modules and following that suggestion Song separated allocations of
> different types in modules (commit ac3b43283923 ("module: replace
> module_layout with module_memory")) and posted "Type aware module
> allocator" set [4].
> 
> I liked the idea of parametrising code allocation requirements as a
> structure, but I believe the original proposal and Song's module allocator
> were too module centric, so I came up with these patches.
> 
> This set splits code allocation from modules by introducing the
> execmem_alloc() and execmem_free() APIs, replaces call sites of
> module_alloc() and module_memfree() with the new APIs and implements core
> text and related allocations in a central place.
> 
> Instead of architecture specific overrides for module_alloc(), the
> architectures that require non-default behaviour for text allocation must
> fill execmem_info structure and implement execmem_arch_setup() that returns
> a pointer to that structure. If an architecture does not implement
> execmem_arch_setup(), the defaults compatible with the current
> modules::module_alloc() are used.
> 
> Since architectures define different restrictions on placement,
> permissions, alignment and other parameters for memory that can be used by
> different subsystems that allocate executable memory, execmem APIs
> take a type argument that will be used to identify the calling subsystem
> and to allow architectures to define parameters for ranges suitable for that
> subsystem.
> 
> The new infrastructure allows decoupling of BPF, kprobes and ftrace from
> modules, and most importantly it paves the way for ROX allocations for
> executable memory.
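
To make the shape of this concrete, here is a rough sketch of the interface
the cover letter describes; only the names execmem_info, execmem_arch_setup(),
execmem_alloc() and execmem_free() come from the series itself, the field
layout is illustrative:

	/* architecture constraints for one class of allocations */
	struct execmem_range {
		unsigned long	start;
		unsigned long	end;
		pgprot_t	pgprot;
		unsigned int	alignment;
	};

	/* one range per consumer type: module text, kprobes, ftrace, BPF, ... */
	struct execmem_info {
		struct execmem_range	ranges[EXECMEM_TYPE_MAX];
	};

	/* implemented only by architectures with non-default constraints */
	struct execmem_info *execmem_arch_setup(void);

	/* callers identify themselves so the arch can pick a suitable range */
	void *execmem_alloc(enum execmem_type type, size_t size);
	void execmem_free(void *ptr);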

It looks like you're just doing API cleanup first, then 

Re: [FYI][PATCH] tracing/treewide: Remove second parameter of __assign_str()

2024-02-23 Thread Kent Overstreet
On Fri, Feb 23, 2024 at 01:46:53PM -0500, Steven Rostedt wrote:
> On Fri, 23 Feb 2024 10:30:45 -0800
> Jeff Johnson  wrote:
> 
> > On 2/23/2024 9:56 AM, Steven Rostedt wrote:
> > > From: "Steven Rostedt (Google)" 
> > > 
> > > [
> > >This is a treewide change. I will likely re-create this patch again in
> > >the second week of the merge window of v6.9 and submit it then. Hoping
> > >to keep the conflicts that it will cause to a minimum.
> > > ]
> > > 
> > > With the rework of how the __string() handles dynamic strings where it
> > > saves off the source string in a field in the helper structure [1], the
> > > assignment of that value to the trace event field is stored in the helper
> > > value and does not need to be passed in again.  
> > 
> > Just curious if this could be done piecemeal by first changing the
> > macros to be variadic macros which allows you to ignore the extra
> > argument. The callers could then be modified in their separate trees.
> > And then once all the callers have been merged, the macros could be
> > changed to no longer be variadic.
> 
> I weighed doing that, but I think ripping off the band-aid is a better
> approach. One thing I found is that leaving unused parameters in the macros
> can cause bugs itself. I found one case doing my clean up, where an unused
> parameter in one of the macros was bogus, and when I made it a used
> parameter, it broke the build.
> 
> I think for tree-wide changes, the preferred approach is to do one big
> patch at once. And since this only affects TRACE_EVENT() macros, it
> hopefully would not be too much of a burden (although out of tree users may
> suffer from this, but do we care?)

Agreed on doing it all at once, it'll be way less spam for people to
deal with.

Tangentially related though, what would make me really happy is if we
could create the string within the TP_fast_assign() section. I have to
have a bunch of annoying wrappers right now because the string length
has to be known when we invoke the tracepoint.
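
For reference, the variadic transition Jeff describes would look roughly like
this standalone illustration (not the real tracing macros; GNU C and C23
accept the empty variadic argument):

	#include <stdio.h>
	#include <string.h>

	/* stands in for the source value the reworked __string() stashes */
	static const char *saved_src = "example";

	/* keep accepting a trailing source argument and ignore it, so
	 * two-argument callers compile until every tree is converted */
	#define __assign_str(dst, ...)	strcpy((dst), saved_src)

	int main(void)
	{
		char field[16];

		__assign_str(field, saved_src);	/* old style: extra arg dropped */
		__assign_str(field);		/* new style */
		printf("%s\n", field);
		return 0;
	}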


Re: [PATCH 16/22] bcachefs: mark bch2_target_to_text_sb() static

2023-11-08 Thread Kent Overstreet
On Wed, Nov 08, 2023 at 01:58:37PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> bch2_target_to_text_sb() is only called in the file it is defined in,
> and it has no extern prototype:
> 
> fs/bcachefs/disk_groups.c:583:6: error: no previous prototype for 
> 'bch2_target_to_text_sb' [-Werror=missing-prototypes]
> 
> Mark it static to avoid the warning and have the code better optimized.
> 
> Fixes: bf0d9e89de2e ("bcachefs: Split apart bch2_target_to_text(), 
> bch2_target_to_text_sb()")
> Signed-off-by: Arnd Bergmann 

This is already fixed in my tree.


Re: [PATCH] powerpc: Export kvm_guest static key, for bcachefs six locks

2023-09-14 Thread Kent Overstreet
On Thu, Sep 14, 2023 at 12:26:53PM +1000, Michael Ellerman wrote:
> Kent Overstreet  writes:
> > bcachefs's six locks need kvm_guest, via
> >  owner_on_cpu() -> vcpu_is_preempted() -> is_kvm_guest()
> >
> > Signed-off-by: Kent Overstreet 
> > Cc: linuxppc-dev@lists.ozlabs.org
> > ---
> >  arch/powerpc/kernel/firmware.c | 2 ++
> >  1 file changed, 2 insertions(+)
> 
> Acked-by: Michael Ellerman  (powerpc)
> 
> I'm happy for you to take this via your tree.

Thanks!


[PATCH] powerpc: Export kvm_guest static key, for bcachefs six locks

2023-09-13 Thread Kent Overstreet
bcachefs's six locks need kvm_guest, via
 owner_on_cpu() -> vcpu_is_preempted() -> is_kvm_guest()

Signed-off-by: Kent Overstreet 
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/firmware.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/firmware.c b/arch/powerpc/kernel/firmware.c
index 20328f72f9f2..8987eee33dc8 100644
--- a/arch/powerpc/kernel/firmware.c
+++ b/arch/powerpc/kernel/firmware.c
@@ -23,6 +23,8 @@ EXPORT_SYMBOL_GPL(powerpc_firmware_features);
 
 #if defined(CONFIG_PPC_PSERIES) || defined(CONFIG_KVM_GUEST)
 DEFINE_STATIC_KEY_FALSE(kvm_guest);
+EXPORT_SYMBOL_GPL(kvm_guest);
+
 int __init check_kvm_guest(void)
 {
struct device_node *hyper_node;
-- 
2.40.1



Re: [PATCH v2 02/12] mm: introduce execmem_text_alloc() and jit_text_alloc()

2023-06-25 Thread Kent Overstreet
On Sun, Jun 25, 2023 at 08:42:57PM +0300, Mike Rapoport wrote:
> On Sun, Jun 25, 2023 at 09:59:34AM -0700, Andy Lutomirski wrote:
> > 
> > 
> > On Sun, Jun 25, 2023, at 9:14 AM, Mike Rapoport wrote:
> > > On Mon, Jun 19, 2023 at 10:09:02AM -0700, Andy Lutomirski wrote:
> > >> 
> > >> On Sun, Jun 18, 2023, at 1:00 AM, Mike Rapoport wrote:
> > >> > On Sat, Jun 17, 2023 at 01:38:29PM -0700, Andy Lutomirski wrote:
> > >> >> On Fri, Jun 16, 2023, at 1:50 AM, Mike Rapoport wrote:
> > >> >> > From: "Mike Rapoport (IBM)" 
> > >> >> >
> > >> >> > module_alloc() is used everywhere as a means to allocate memory for 
> > >> >> > code.
> > >> >> >
> > >> >> > Beside being semantically wrong, this unnecessarily ties all 
> > >> >> > subsystems
> > >> >> > that need to allocate code, such as ftrace, kprobes and BPF to 
> > >> >> > modules
> > >> >> > and puts the burden of code allocation to the modules code.
> > >> >> >
> > >> >> > Several architectures override module_alloc() because of various
> > >> >> > constraints where the executable memory can be located and this 
> > >> >> > causes
> > >> >> > additional obstacles for improvements of code allocation.
> > >> >> >
> > >> >> > Start splitting code allocation from modules by introducing
> > >> >> > execmem_text_alloc(), execmem_free(), jit_text_alloc(), jit_free() 
> > >> >> > APIs.
> > >> >> >
> > >> >> > Initially, execmem_text_alloc() and jit_text_alloc() are wrappers 
> > >> >> > for
> > >> >> > module_alloc() and execmem_free() and jit_free() are replacements of
> > >> >> > module_memfree() to allow updating all call sites to use the new 
> > >> >> > APIs.
> > >> >> >
> > >> >> > The intended semantics for new allocation APIs:
> > >> >> >
> > >> >> > * execmem_text_alloc() should be used to allocate memory that must 
> > >> >> > reside
> > >> >> >   close to the kernel image, like loadable kernel modules and 
> > >> >> > generated
> > >> >> >   code that is restricted by relative addressing.
> > >> >> >
> > >> >> > * jit_text_alloc() should be used to allocate memory for generated 
> > >> >> > code
> > >> >> >   when there are no restrictions for the code placement. For
> > >> >> >   architectures that require that any code is within certain 
> > >> >> > distance
> > >> >> >   from the kernel image, jit_text_alloc() will be essentially 
> > >> >> > aliased to
> > >> >> >   execmem_text_alloc().
> > >> >> >
> > >> >> 
> > >> >> Is there anything in this series to help users do the appropriate
> > >> >> synchronization when they actually populate the allocated memory with
> > >> >> code?  See here, for example:
> > >> >
> > >> > This series only factors out the executable allocations from modules 
> > >> > and
> > >> > puts them in a central place.
> > >> > Anything else would go on top after this lands.
> > >> 
> > >> Hmm.
> > >> 
> > >> On the one hand, there's nothing wrong with factoring out common code. On
> > >> the other hand, this is probably the right time to at least start
> > >> thinking about synchronization, at least to the extent that it might make
> > >> us want to change this API.  (I'm not at all saying that this series
> > >> should require changes -- I'm just saying that this is a good time to
> > >> think about how this should work.)
> > >> 
> > >> The current APIs, *and* the proposed jit_text_alloc() API, don't actually
> > >> look like the one thing in the Linux ecosystem that actually
> > >> intelligently and efficiently maps new text into an address space:
> > >> mmap().
> > >> 
> > >> On x86, you can mmap() an existing file full of executable code PROT_EXEC
> > >> and jump to it with minimal synchronization (just the standard implicit
> > >> ordering in the kernel that populates the pages before setting up the
> > >> PTEs and whatever user synchronization is needed to avoid jumping into
> > >> the mapping before mmap() finishes).  It works across CPUs, and the only
> > >> possible way userspace can screw it up (for a read-only mapping of
> > >> read-only text, anyway) is to jump to the mapping too early, in which
> > >> case userspace gets a page fault.  Incoherence is impossible, and no one
> > >> needs to "serialize" (in the SDM sense).
> > >> 
> > >> I think the same sequence (from userspace's perspective) works on other
> > >> architectures, too, although I think more cache management is needed on
> > >> the kernel's end.  As far as I know, no Linux SMP architecture needs an
> > >> IPI to map executable text into usermode, but I could easily be wrong.
> > >> (IIRC RISC-V has very developer-unfriendly icache management, but I don't
> > >> remember the details.)
> > >> 
> > >> Of course, using ptrace or any other FOLL_FORCE to modify text on x86 is
> > >> rather fraught, and I bet many things do it wrong when userspace is
> > >> multithreaded.  But not in production because it's mostly not used in
> > >> production.)
> > >> 
> > >> But jit_text_alloc() can't do this, because the order of operations
> > >> doesn't match.  With 

Re: [PATCH v2 02/12] mm: introduce execmem_text_alloc() and jit_text_alloc()

2023-06-19 Thread Kent Overstreet
On Sat, Jun 17, 2023 at 01:38:29PM -0700, Andy Lutomirski wrote:
> On Fri, Jun 16, 2023, at 1:50 AM, Mike Rapoport wrote:
> > From: "Mike Rapoport (IBM)" 
> >
> > module_alloc() is used everywhere as a means to allocate memory for code.
> >
> > Besides being semantically wrong, this unnecessarily ties all subsystems
> > that need to allocate code, such as ftrace, kprobes and BPF, to modules
> > and puts the burden of code allocation on the modules code.
> >
> > Several architectures override module_alloc() because of various
> > constraints where the executable memory can be located and this causes
> > additional obstacles for improvements of code allocation.
> >
> > Start splitting code allocation from modules by introducing
> > execmem_text_alloc(), execmem_free(), jit_text_alloc(), jit_free() APIs.
> >
> > Initially, execmem_text_alloc() and jit_text_alloc() are wrappers for
> > module_alloc() and execmem_free() and jit_free() are replacements of
> > module_memfree() to allow updating all call sites to use the new APIs.
> >
> > The intended semantics for new allocation APIs:
> >
> > * execmem_text_alloc() should be used to allocate memory that must reside
> >   close to the kernel image, like loadable kernel modules and generated
> >   code that is restricted by relative addressing.
> >
> > * jit_text_alloc() should be used to allocate memory for generated code
> >   when there are no restrictions for the code placement. For
> >   architectures that require that any code is within certain distance
> >   from the kernel image, jit_text_alloc() will be essentially aliased to
> >   execmem_text_alloc().
> >
> 
> Is there anything in this series to help users do the appropriate 
> synchronization when they actually populate the allocated memory with code?  
> See here, for example:
> 
> https://lore.kernel.org/linux-fsdevel/cb6533c6-cea0-4f04-95cf-b8240c6ab...@app.fastmail.com/T/#u

We're still in need of an arch independent text_poke() api.


Re: [PATCH v2 06/12] mm/execmem: introduce execmem_data_alloc()

2023-06-18 Thread Kent Overstreet
On Mon, Jun 19, 2023 at 02:43:58AM +0200, Thomas Gleixner wrote:
> Kent!

Hi Thomas :)

> No. I am not.

Ok.

> Whether that's an internal function or not does not make any difference
> at all.

Well, at the risk of this discussion going completely off the rails, I
have to disagree with you there. External interfaces and high level
semantics are more important to get right from the outset, internal
implementation details can be cleaned up later, within reason.

And the discussion on this patchset has been more focused on those
external interfaces, which seems like the right approach to me.

> > ... I made the same mistake reviewing Song's patchset...
> 
> Songs series had rough edges, but was way more data structure driven
> and palatable than this hackery.

I liked that aspect of Song's patchset too, and I'm actually inclined to
agree with you that this patchset might get a bit cleaner with more of
that, but really, this seems like just quibbling over calling convention
for an internal helper function.


Re: [PATCH v2 06/12] mm/execmem: introduce execmem_data_alloc()

2023-06-18 Thread Kent Overstreet
On Mon, Jun 19, 2023 at 12:32:55AM +0200, Thomas Gleixner wrote:
> Mike!
> 
> Sorry for being late on this ...
> 
> On Fri, Jun 16 2023 at 11:50, Mike Rapoport wrote:
> >  
> > +void *execmem_data_alloc(size_t size)
> > +{
> > +   unsigned long start = execmem_params.modules.data.start;
> > +   unsigned long end = execmem_params.modules.data.end;
> > +   pgprot_t pgprot = execmem_params.modules.data.pgprot;
> > +   unsigned int align = execmem_params.modules.data.alignment;
> > +   unsigned long fallback_start = 
> > execmem_params.modules.data.fallback_start;
> > +   unsigned long fallback_end = execmem_params.modules.data.fallback_end;
> > +   bool kasan = execmem_params.modules.flags & EXECMEM_KASAN_SHADOW;
> 
> While I know for sure that you read up on the discussion I had with Song
> about data structures, it seems you completely failed to understand it.
> 
> > +   return execmem_alloc(size, start, end, align, pgprot,
> > +fallback_start, fallback_end, kasan);
> 
> Having _seven_ intermediate variables to fill _eight_ arguments of a
> function instead of handing in @size and a proper struct pointer is
> tasteless and disgusting at best.
> 
> Six out of those seven parameters are from:
> 
> execmem_params.module.data
> 
> while the KASAN shadow part is retrieved from
> 
> execmem_params.module.flags
> 
> So what prevents you from having a uniform data structure, which is
> extensible and describes _all_ types of allocations?
> 
> Absolutely nothing. The flags part can either be in the type-dependent
> part or you make the type configs an array as I had suggested originally
> and then execmem_alloc() becomes:
> 
> void *execmem_alloc(type, size)
> 
> and
> 
> static inline void *execmem_data_alloc(size_t size)
> {
> return execmem_alloc(EXECMEM_TYPE_DATA, size);
> }
> 
> which gets the type independent parts from @execmem_param.
> 
> Just read through your own series and watch the evolution of
> execmem_alloc():
> 
>   static void *execmem_alloc(size_t size)
> 
>   static void *execmem_alloc(size_t size, unsigned long start,
>  unsigned long end, unsigned int align,
>  pgprot_t pgprot)
> 
>   static void *execmem_alloc(size_t len, unsigned long start,
>  unsigned long end, unsigned int align,
>  pgprot_t pgprot,
>  unsigned long fallback_start,
>  unsigned long fallback_end,
>  bool kasan)
> 
> In a month from now this function will have _ten_ parameters and tons of
> horrible wrappers which convert an already existing data structure into
> individual function arguments.
> 
> Seriously?
> 
> If you want this function to be [ab]used outside of the exec_param
> configuration space for whatever non-sensical reasons then this still
> can be either:
> 
> void *execmem_alloc(params, type, size)
> 
> static inline void *execmem_data_alloc(size_t size)
> {
> return execmem_alloc(&execmem_param, EXECMEM_TYPE_DATA, size);
> }
> 
> or
> 
> void *execmem_alloc(type_params, size);
> 
> static inline void *execmem_data_alloc(size_t size)
> {
> return execmem_alloc(&execmem_param.data, size);
> }
> 
> which both allows you to provide alternative params, right?
> 
> Coming back to my conversation with Song:
> 
>"Bad programmers worry about the code. Good programmers worry about
> data structures and their relationships."

Thomas, you're confusing an internal interface with an external one; I made
the same mistake reviewing Song's patchset...


Re: [PATCH v2 07/12] arm64, execmem: extend execmem_params for generated code definitions

2023-06-17 Thread Kent Overstreet
On Sat, Jun 17, 2023 at 09:38:17AM -0700, Song Liu wrote:
> On Sat, Jun 17, 2023 at 8:37 AM Kent Overstreet
>  wrote:
> >
> > On Sat, Jun 17, 2023 at 09:57:59AM +0300, Mike Rapoport wrote:
> > > > This is growing fast. :) We have 3 now: text, data, jit. And it will be
> > > > 5 when we split data into rw data, ro data, ro after init data. I wonder
> > > > whether we should still do some type enum here. But we can revisit
> > > > this topic later.
> > >
> > > I don't think we'd need 5. Four at most :)
> > >
> > > I don't know yet what would be the best way to differentiate RW and RO
> > > data, but ro_after_init surely won't need a new type. It either will be
> > > allocated as RW and then the caller will have to set it RO after
> > > initialization is done, or it will be allocated as RO and the caller will
> > > have to do something like text_poke to update it.
> >
> > Perhaps ro_after_init could use the same allocation interface and share
> > pages with ro pages - if we just added a refcount for "this page
> > currently needs to be rw, module is still loading?"
> 
> If we don't relax rules with read only, we will have to separate rw, ro,
> and ro_after_init. But we can still have page sharing:
> 
> Two modules can put rw data on the same page.
> With text poke (ro data poke to be accurate), two modules can put
> ro data on the same page.
> 
> > text_poke() approach wouldn't be workable, you'd have to audit and fix
> > all module init code in the entire kernel.
> 
> Agreed. For this reason, each module has to have its own page(s) for
> ro_after_init data.

Relaxing page permissions to allow for page sharing could also be a
config option. For archs with 64k pages it seems worthwhile.


Re: [PATCH v2 07/12] arm64, execmem: extend execmem_params for generated code definitions

2023-06-17 Thread Kent Overstreet
On Sat, Jun 17, 2023 at 09:57:59AM +0300, Mike Rapoport wrote:
> > This is growing fast. :) We have 3 now: text, data, jit. And it will be
> > 5 when we split data into rw data, ro data, ro after init data. I wonder
> > whether we should still do some type enum here. But we can revisit
> > this topic later.
> 
> I don't think we'd need 5. Four at most :)
> 
> I don't know yet what would be the best way to differentiate RW and RO
> data, but ro_after_init surely won't need a new type. It either will be
> allocated as RW and then the caller will have to set it RO after
> initialization is done, or it will be allocated as RO and the caller will
> have to do something like text_poke to update it.

Perhaps ro_after_init could use the same allocation interface and share
pages with ro pages - if we just added a refcount for "this page
currently needs to be rw, module is still loading?"

text_poke() approach wouldn't be workable, you'd have to audit and fix
all module init code in the entire kernel.
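
A minimal sketch of that refcount idea (names hypothetical, and the
permission flip is left racy for brevity; a real version would serialize
loaders against each other):

	#include <linux/atomic.h>
	#include <linux/set_memory.h>

	struct shared_ro_page {
		unsigned long	addr;		/* page holding ro_after_init data */
		atomic_t	writers;	/* modules still initializing here */
	};

	static void ro_page_begin_init(struct shared_ro_page *p)
	{
		if (atomic_inc_return(&p->writers) == 1)
			set_memory_rw(p->addr, 1);	/* first loader: make RW */
	}

	static void ro_page_end_init(struct shared_ro_page *p)
	{
		if (atomic_dec_and_test(&p->writers))
			set_memory_ro(p->addr, 1);	/* last loader: seal it */
	}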


Re: [PATCH v2 02/12] mm: introduce execmem_text_alloc() and jit_text_alloc()

2023-06-16 Thread Kent Overstreet
On Fri, Jun 16, 2023 at 11:50:28AM +0300, Mike Rapoport wrote:
> From: "Mike Rapoport (IBM)" 
> 
> module_alloc() is used everywhere as a means to allocate memory for code.
> 
> Besides being semantically wrong, this unnecessarily ties all subsystems
> that need to allocate code, such as ftrace, kprobes and BPF, to modules
> and puts the burden of code allocation on the modules code.
> 
> Several architectures override module_alloc() because of various
> constraints where the executable memory can be located and this causes
> additional obstacles for improvements of code allocation.
> 
> Start splitting code allocation from modules by introducing
> execmem_text_alloc(), execmem_free(), jit_text_alloc(), jit_free() APIs.
> 
> Initially, execmem_text_alloc() and jit_text_alloc() are wrappers for
> module_alloc() and execmem_free() and jit_free() are replacements of
> module_memfree() to allow updating all call sites to use the new APIs.
> 
> The intended semantics for new allocation APIs:
> 
> * execmem_text_alloc() should be used to allocate memory that must reside
>   close to the kernel image, like loadable kernel modules and generated
>   code that is restricted by relative addressing.
> 
> * jit_text_alloc() should be used to allocate memory for generated code
>   when there are no restrictions for the code placement. For
>   architectures that require that any code is within certain distance
>   from the kernel image, jit_text_alloc() will be essentially aliased to
>   execmem_text_alloc().
> 
> The names execmem_text_alloc() and jit_text_alloc() emphasize that the
> allocated memory is for executable code; the allocations of the
> associated data, like data sections of a module, will use the
> execmem_data_alloc() interface that will be added later.

I like the API split - at the risk of further bikeshedding, perhaps
near_text_alloc() and far_text_alloc()? Would be more explicit.

Reviewed-by: Kent Overstreet 


Re: [PATCH 00/13] mm: jit/text allocator

2023-06-13 Thread Kent Overstreet
On Thu, Jun 08, 2023 at 09:41:16PM +0300, Mike Rapoport wrote:
> On Tue, Jun 06, 2023 at 11:21:59AM -0700, Song Liu wrote:
> > On Mon, Jun 5, 2023 at 3:09 AM Mark Rutland  wrote:
> > 
> > [...]
> > 
> > > > > > Can you give more detail on what parameters you need? If the only 
> > > > > > extra
> > > > > > parameter is just "does this allocation need to live close to kernel
> > > > > > text", that's not that big of a deal.
> > > > >
> > > > > My thinking was that we at least need the start + end for each 
> > > > > caller. That
> > > > > might be it, tbh.
> > > >
> > > > Do you mean that modules will have something like
> > > >
> > > >   jit_text_alloc(size, MODULES_START, MODULES_END);
> > > >
> > > > and kprobes will have
> > > >
> > > >   jit_text_alloc(size, KPROBES_START, KPROBES_END);
> > > > ?
> > >
> > > Yes.
> > 
> > How about we start with two APIs:
> >  jit_text_alloc(size);
> >  jit_text_alloc_range(size, start, end);
> > 
> > AFAICT, arm64 is the only arch that requires the latter API. And TBH, I am
> > not quite convinced it is needed.
>  
> Right now arm64 and riscv override bpf and kprobes allocations to use the
> entire vmalloc address space, but having the ability to allocate generated
> code outside of modules area may be useful for other architectures.
> 
> Still the start + end for the callers feels backwards to me because the
> callers do not define the ranges, but rather the architectures, so we still
> need a way for architectures to define how they want allocate memory for
> the generated code.

So, the start + end just comes from the need to keep relative pointers
under a certain size. I think this could be just a flag, I see no reason
to expose actual addresses here.
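
In other words, something like this instead of caller-supplied addresses
(flag name hypothetical):

	/* allocation must stay within relative-branch range of kernel text;
	 * the architecture, not the caller, decides what range that implies */
	#define JIT_ALLOC_NEAR_KERNEL	(1U << 0)

	void *jit_text_alloc(size_t size, unsigned int flags);

	static void *module_text_alloc(size_t size)
	{
		/* modules link to core kernel symbols with relative relocs */
		return jit_text_alloc(size, JIT_ALLOC_NEAR_KERNEL);
	}

	static void *bpf_text_alloc(size_t size)
	{
		/* JITed BPF typically has no placement constraints of its own */
		return jit_text_alloc(size, 0);
	}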


Re: [PATCH 00/13] mm: jit/text allocator

2023-06-05 Thread Kent Overstreet
On Mon, Jun 05, 2023 at 12:20:40PM +0300, Mike Rapoport wrote:
> On Fri, Jun 02, 2023 at 10:35:09AM +0100, Mark Rutland wrote:
> > On Thu, Jun 01, 2023 at 02:14:56PM -0400, Kent Overstreet wrote:
> > > On Thu, Jun 01, 2023 at 05:12:03PM +0100, Mark Rutland wrote:
> > > > For a while I have wanted to give kprobes its own allocator so that it 
> > > > can work
> > > > even with CONFIG_MODULES=n, and so that it doesn't have to waste VA 
> > > > space in
> > > > the modules area.
> > > > 
> > > > Given that, I think these should have their own allocator functions 
> > > > that can be
> > > > provided independently, even if those happen to use common 
> > > > infrastructure.
> > > 
> > > How much memory can kprobes conceivably use? I think we also want to try
> > > to push back on combinatorial new allocators, if we can.
> > 
> > That depends on who's using it, and how (e.g. via BPF).
> > 
> > To be clear, I'm not necessarily asking for entirely different allocators, 
> > but
> > I do thinkg that we want wrappers that can at least pass distinct start+end
> > parameters to a common allocator, and for arm64's modules code I'd expect 
> > that
> > we'd keep the range falblack logic out of the common allcoator, and just 
> > call
> > it twice.
> > 
> > > > > Several architectures override module_alloc() because of various
> > > > > constraints where the executable memory can be located and this causes
> > > > > additional obstacles for improvements of code allocation.
> > > > > 
> > > > > This set splits code allocation from modules by introducing
> > > > > jit_text_alloc(), jit_data_alloc() and jit_free() APIs, replaces call
> > > > > sites of module_alloc() and module_memfree() with the new APIs and
> > > > > implements core text and related allocation in a central place.
> > > > > 
> > > > > Instead of architecture specific overrides for module_alloc(), the
> > > > > architectures that require non-default behaviour for text allocation 
> > > > > must
> > > > > fill jit_alloc_params structure and implement jit_alloc_arch_params() 
> > > > > that
> > > > > returns a pointer to that structure. If an architecture does not 
> > > > > implement
> > > > > jit_alloc_arch_params(), the defaults compatible with the current
> > > > > modules::module_alloc() are used.
> > > > 
> > > > As above, I suspect that each of the callsites should probably be using 
> > > > common
> > > > infrastructure, but I don't think that a single jit_alloc_arch_params() 
> > > > makes
> > > > sense, since the parameters for each case may need to be distinct.
> > > 
> > > I don't see how that follows. The whole point of function parameters is
> > > that they may be different :)
> > 
> > What I mean is that jit_alloc_arch_params() tries to aggregate common
> > parameters, but they aren't actually common (e.g. the actual start+end range
> > for allocation).
> 
> jit_alloc_arch_params() tries to aggregate architecture constraints and
> requirements for allocations of executable memory and this exactly what
> the first 6 patches of this set do.
> 
> A while ago Thomas suggested to use a structure that parametrizes
> architecture constraints by the memory type used in modules [1] and Song
> implemented the infrastructure for it and x86 part [2].
> 
> I liked the idea of defining parameters in a single structure, but I
> thought that approaching the problem from the arch side rather than from
> modules perspective will be better starting point, hence these patches.
> 
> I don't see a fundamental reason why a single structure cannot describe
> what is needed for different code allocation cases, be it modules, kprobes
> or bpf. There is of course an assumption that the core allocations will be
> the same for all the users, and it seems to me that something like 
> 
> * allocate physical memory if allocator caches are empty
> * map it in vmalloc or modules address space
> * return memory from the allocator cache to the caller
> 
> will work for all usecases.
> 
> We might need separate caches for different cases on different
> architectures, and a way to specify what cache should be used in the
> allocator API, but that does not contradict a single structure for arch
> specific parameters, but only makes it more elaborate, e.g. something like
> 

Re: [PATCH 00/13] mm: jit/text allocator

2023-06-04 Thread Kent Overstreet
On Sun, Jun 04, 2023 at 02:22:30PM -0700, Song Liu wrote:
> On Sun, Jun 4, 2023 at 11:02 AM Kent Overstreet
>  wrote:
> >
> > On Fri, Jun 02, 2023 at 11:20:58AM -0700, Song Liu wrote:
> > > IIUC, arm64 uses VMALLOC address space for BPF programs. The reason
> > > is each BPF program uses at least 64kB (one page) out of the 128MB
> > > address space. Puranjay Mohan (CC'ed) is working on enabling
> > > bpf_prog_pack for arm64. Once this work is done, multiple BPF programs
> > > will be able to share a page. Will this improvement remove the need to
> > > specify a different address range for BPF programs?
> >
> > Can we please stop working on BPF specific sub page allocation and focus
> > on doing this in mm/? This never should have been in BPF in the first
> > place.
> 
> That work is mostly independent of the allocator work we are discussing here.
> The goal of Puranjay's work is to enable the arm64 BPF JIT engine to use a
> ROX allocator. The allocator could be the bpf_prog_pack allocator, or 
> jitalloc,
> or module_alloc_type. Puranjay is using bpf_prog_alloc for now. But once
> jitalloc or module_alloc_type (either one) is merged, we will migrate BPF
> JIT engines (x86_64 and arm64) to the new allocator and then tear down
> bpf_prog_pack.
> 
> Does this make sense?

Yeah, as long as that's the plan. Maybe one of you could tell us what
issues were preventing prog_pack from being used in the first place; it
might be relevant - this is the time to get the new allocator API right.


Re: [PATCH 12/13] x86/jitalloc: prepare to allocate executable memory as ROX

2023-06-04 Thread Kent Overstreet
On Thu, Jun 01, 2023 at 08:50:39PM +, Edgecombe, Rick P wrote:
> > Ahh! Thanks for that; perhaps the comment in text_poke() about IPIs
> > could be a bit clearer.
> > 
> > What is it (if anything) you don't like about text_poke() then? It
> > looks
> > like it's doing broadly similar things to kmap_local(), so should be
> > in the same ballpark from a performance POV?
> 
> The way text_poke() is used here, it is creating a new writable alias
> and flushing it for *each* write to the module (like for each write of
> an individual relocation, etc). I was just thinking it might warrant
> some batching or something.

Ah, I see. A kmap_local type interface might get us that kind of
batching, if it supported mapping compound pages - currently kmap_local
still only maps single pages, but with folios getting plumbed around I
assume someone will make it handle compound pages eventually.


Re: [PATCH 00/13] mm: jit/text allocator

2023-06-04 Thread Kent Overstreet
On Fri, Jun 02, 2023 at 11:20:58AM -0700, Song Liu wrote:
> IIUC, arm64 uses VMALLOC address space for BPF programs. The reason
> is each BPF program uses at least 64kB (one page) out of the 128MB
> address space. Puranjay Mohan (CC'ed) is working on enabling
> bpf_prog_pack for arm64. Once this work is done, multiple BPF programs
> will be able to share a page. Will this improvement remove the need to
> specify a different address range for BPF programs?

Can we please stop working on BPF specific sub page allocation and focus
on doing this in mm/? This never should have been in BPF in the first
place.


Re: [PATCH 12/13] x86/jitalloc: prepare to allocate executable memory as ROX

2023-06-01 Thread Kent Overstreet
On Thu, Jun 01, 2023 at 06:13:44PM +, Edgecombe, Rick P wrote:
> > text_poke() _does_ create a separate RW mapping.
> 
> Sorry, I meant a separate RW allocation.

Ah yes, that makes sense


> 
> > 
> > The thing that sucks about text_poke() is that it always does a full
> > TLB
> > flush, and AFAICT that's not remotely needed. What it really wants to
> > be
> > doing is conceptually just
> > 
> > kmap_local()
memcpy()
kunmap_local()
> > flush_icache();
> > 
> > ...except that kmap_local() won't actually create a new mapping on
> > non-highmem architectures, so text_poke() open codes it.
> 
> Text poke creates only a local CPU RW mapping. It's more secure because
> other threads can't write to it.

*nod*, same as kmap_local

> It also only needs to flush the local core when it's done since it's
> not using a shared MM.
 
Ahh! Thanks for that; perhaps the comment in text_poke() about IPIs
could be a bit clearer.

What is it (if anything) you don't like about text_poke() then? It looks
like it's doing broadly similar things to kmap_local(), so should be
in the same ballpark from a performance POV?


Re: [PATCH 00/13] mm: jit/text allocator

2023-06-01 Thread Kent Overstreet
On Thu, Jun 01, 2023 at 05:12:03PM +0100, Mark Rutland wrote:
> For a while I have wanted to give kprobes its own allocator so that it can 
> work
> even with CONFIG_MODULES=n, and so that it doesn't have to waste VA space in
> the modules area.
> 
> Given that, I think these should have their own allocator functions that can 
> be
> provided independently, even if those happen to use common infrastructure.

How much memory can kprobes conceivably use? I think we also want to try
to push back on combinatorial new allocators, if we can.

> > Several architectures override module_alloc() because of various
> > constraints where the executable memory can be located and this causes
> > additional obstacles for improvements of code allocation.
> > 
> > This set splits code allocation from modules by introducing
> > jit_text_alloc(), jit_data_alloc() and jit_free() APIs, replaces call
> > sites of module_alloc() and module_memfree() with the new APIs and
> > implements core text and related allocation in a central place.
> > 
> > Instead of architecture specific overrides for module_alloc(), the
> > architectures that require non-default behaviour for text allocation must
> > fill jit_alloc_params structure and implement jit_alloc_arch_params() that
> > returns a pointer to that structure. If an architecture does not implement
> > jit_alloc_arch_params(), the defaults compatible with the current
> > modules::module_alloc() are used.
> 
> As above, I suspect that each of the callsites should probably be using common
> infrastructure, but I don't think that a single jit_alloc_arch_params() makes
> sense, since the parameters for each case may need to be distinct.

I don't see how that follows. The whole point of function parameters is
that they may be different :)

Can you give more detail on what parameters you need? If the only extra
parameter is just "does this allocation need to live close to kernel
text", that's not that big of a deal.


Re: [PATCH 12/13] x86/jitalloc: prepare to allocate executable memory as ROX

2023-06-01 Thread Kent Overstreet
On Thu, Jun 01, 2023 at 04:54:27PM +, Edgecombe, Rick P wrote:
> It is just a local flush, but I wonder how much text_poke()ing is too
> much. A lot of them are even inside loops. Can't it do the batch version
> at least?
> 
> The other thing, and maybe this is in paranoia category, but it's
> probably at least worth noting. Before the modules were not made
> executable until all of the code was finalized. Now they are made
> executable in an intermediate state and then patched later. It might
> weaken the CFI stuff, but also it just kind of seems a bit unbounded
> for dealing with executable code.

I believe bpf starts out by initializing new executable memory with
illegal opcodes; maybe we should steal that and make it standard.
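
Something like the x86 BPF JIT's fill-hole callback, sketched here (other
architectures would substitute their own trapping instruction):

	/* poison fresh executable memory so a stray jump into a region that
	 * hasn't been written yet faults instead of running leftover bytes */
	static void jit_fill_hole(void *mem, size_t size)
	{
		memset(mem, 0xcc, size);	/* 0xcc = int3 on x86 */
	}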

> Preparing the modules in a separate RW mapping, and then text_poke()ing
> the whole thing in when you are done would resolve both of these.

text_poke() _does_ create a separate RW mapping.

The thing that sucks about text_poke() is that it always does a full TLB
flush, and AFAICT that's not remotely needed. What it really wants to be
doing is conceptually just

kmap_local()
memcpy()
kunmap_local()
flush_icache();

...except that kmap_local() won't actually create a new mapping on
non-highmem architectures, so text_poke() open codes it.
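
Roughly this, as a sketch; the icache flush has to target the final execution
address, and an arch without highmem would need an explicit temporary PTE
where kmap_local_page() just hands back the direct map:

	#include <linux/highmem.h>
	#include <linux/cacheflush.h>
	#include <linux/string.h>

	/* copy code into @page through a short-lived local alias rather than
	 * a persistent RWX mapping, then flush where it will execute */
	static void jit_copy_insns(struct page *page, unsigned int offset,
				   const void *insns, size_t len,
				   unsigned long exec_addr)
	{
		void *dst = kmap_local_page(page);

		memcpy(dst + offset, insns, len);
		kunmap_local(dst);
		flush_icache_range(exec_addr, exec_addr + len);
	}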


Re: [PATCH 12/13] x86/jitalloc: prepare to allocate executable memory as ROX

2023-06-01 Thread Kent Overstreet
On Thu, Jun 01, 2023 at 12:30:50PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 01, 2023 at 01:12:56PM +0300, Mike Rapoport wrote:
> 
> > +static void __init_or_module do_text_poke(void *addr, const void *opcode, 
> > size_t len)
> > +{
> > +   if (system_state < SYSTEM_RUNNING) {
> > +   text_poke_early(addr, opcode, len);
> > +   } else {
> > +   mutex_lock(&text_mutex);
> > +   text_poke(addr, opcode, len);
> > +   mutex_unlock(&text_mutex);
> > +   }
> > +}
> 
> So I don't much like do_text_poke(); why?

Could you share why?

I think the implementation sucks but conceptually it's the right idea -
create a new temporary mapping to avoid the need for RWX mappings.


Re: [RFC PATCH RESEND 00/28] per-VMA locks proposal

2022-09-05 Thread Kent Overstreet
On Mon, Sep 05, 2022 at 11:32:48AM -0700, Suren Baghdasaryan wrote:
> On Mon, Sep 5, 2022 at 5:32 AM 'Michal Hocko' via kernel-team
>  wrote:
> >
> > Unless I am missing something, this is not based on the Maple tree
> > rewrite, right? Does the change in the data structure make any
> > difference to the approach? I remember discussions at LSFMM where it has
> > been pointed out that some issues with the vma tree are considerably
> > simpler to handle with the maple tree.
> 
> Correct, this does not use the Maple tree yet but once Maple tree
> transition happens and it supports RCU-safe lookups, my code in
> find_vma_under_rcu() becomes really simple.
> 
> >
> > On Thu 01-09-22 10:34:48, Suren Baghdasaryan wrote:
> > [...]
> > > One notable way the implementation deviates from the proposal is the way
> > > VMAs are marked as locked. Because during some mm updates multiple
> > > VMAs need to be locked until the end of the update (e.g. vma_merge,
> > > split_vma, etc).
> >
> > I think it would be really helpful to spell out those issues in a greater
> > detail. Not everybody is aware of those vma related subtleties.
> 
> Ack. I'll expand the description of the cases when multiple VMAs need
> to be locked in the same update. The main difficulties are:
> 1. Multiple VMAs might need to be locked within one
> mmap_write_lock/mmap_write_unlock session (will call it an update
> transaction).
> 2. Figuring out when it's safe to unlock a previously locked VMA is
> tricky because that might be happening in different functions and at
> different call levels.
> 
> So, instead of the usual lock/unlock pattern, the proposed solution
> marks a VMA as locked and provides an efficient way to:
> 1. Identify locked VMAs.
> 2. Unlock all locked VMAs in bulk.
> 
> We also postpone unlocking the locked VMAs until the end of the update
> transaction, when we do mmap_write_unlock. Potentially this keeps a
> VMA locked for longer than is absolutely necessary but it results in a
> big reduction of code complexity.

Correct me if I'm wrong, but it looks like any time multiple VMAs need to be
locked we need mmap_lock anyways, which is what makes your approach so sweet.

If however we ever want to lock multiple VMAs without taking mmap_lock, then
deadlock avoidance algorithms aren't that bad - there's the ww_mutex approach,
which is simple and works well when there isn't much expected contention (the
advantage of the ww_mutex approach is that it doesn't have to track all held
locks). I've also written full cycle detection; that approach gets you fewer
restarts, at the cost of needing a list of all currently held locks.
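
For the two-lock case the ww_mutex pattern is short enough to sketch (see
<linux/ww_mutex.h>; the class name here is made up):

	static DEFINE_WW_CLASS(vma_ww_class);

	static void lock_pair(struct ww_mutex *a, struct ww_mutex *b)
	{
		struct ww_acquire_ctx ctx;

		ww_acquire_init(&ctx, &vma_ww_class);
		ww_mutex_lock(a, &ctx);		/* first lock can't deadlock */
		while (ww_mutex_lock(b, &ctx) == -EDEADLK) {
			/* we lost: back off, sleep on the contended lock,
			 * then take the remaining one */
			ww_mutex_unlock(a);
			ww_mutex_lock_slow(b, &ctx);
			swap(a, b);
		}
		ww_acquire_done(&ctx);
	}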


Re: [RFC PATCH RESEND 00/28] per-VMA locks proposal

2022-09-01 Thread Kent Overstreet
On Thu, Sep 01, 2022 at 10:34:48AM -0700, Suren Baghdasaryan wrote:
> Resending to fix the issue with the In-Reply-To tag in the original
> submission at [4].
> 
> This is a proof of concept for per-vma locks idea that was discussed
> during SPF [1] discussion at LSF/MM this year [2], which concluded with
> suggestion that “a reader/writer semaphore could be put into the VMA
> itself; that would have the effect of using the VMA as a sort of range
> lock. There would still be contention at the VMA level, but it would be an
> improvement.” This patchset implements this suggested approach.
> 
> When handling page faults we lookup the VMA that contains the faulting
> page under RCU protection and try to acquire its lock. If that fails we
> fall back to using mmap_lock, similar to how SPF handled this situation.
> 
> One notable way the implementation deviates from the proposal is the way
> VMAs are marked as locked. Because during some mm updates multiple
> VMAs need to be locked until the end of the update (e.g. vma_merge,
> split_vma, etc). Tracking all the locked VMAs, avoiding recursive locks
> and other complications would make the code more complex. Therefore we
> provide a way to "mark" VMAs as locked and then unmark all locked VMAs
> all at once. This is done using two sequence numbers - one in the
> vm_area_struct and one in the mm_struct. VMA is considered locked when
> these sequence numbers are equal. To mark a VMA as locked we set the
> sequence number in vm_area_struct to be equal to the sequence number
> in mm_struct. To unlock all VMAs we increment mm_struct's seq number.
> This allows for an efficient way to track locked VMAs and to drop the
> locks on all VMAs at the end of the update.

I like it - the sequence numbers are a stroke of genius. For what it's doing
the patchset seems almost small.
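
Spelling the scheme out as a simplified model (a paraphrase of the
description above, not the patchset's actual code):

	struct mm_ish  { int mm_lock_seq; };
	struct vma_ish { int vm_lock_seq; struct mm_ish *mm; };

	/* called with mmap_write_lock() held; mm_lock_seq only moves under
	 * that lock, which is what makes the plain assignment safe */
	static void vma_mark_locked(struct vma_ish *vma)
	{
		vma->vm_lock_seq = vma->mm->mm_lock_seq;
	}

	static bool vma_is_locked(struct vma_ish *vma)
	{
		return vma->vm_lock_seq == vma->mm->mm_lock_seq;
	}

	/* done in mmap_write_unlock(): one increment unmarks every VMA
	 * locked during the update transaction */
	static void vma_unlock_all(struct mm_ish *mm)
	{
		mm->mm_lock_seq++;
	}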

Two complaints so far:
 - I don't like the vma_mark_locked() name. To me it says that the caller
   already took or is taking the lock and this function is just marking that
   we're holding the lock, but it's really taking a different type of lock. But
   this function can block, it really is taking a lock, so it should say that.
   
   This is AFAIK a new concept, not sure I'm going to have anything good either,
   but perhaps vma_lock_multiple()?

 - I don't like the #ifdef and the separate fallback path in the fault handlers.

   Can we make find_and_lock_anon_vma() do the right thing, and not fail unless
   e.g. there isn't a vma at that address? Just have it wait for vm_lock_seq to
   change and then retry if needed.


Re: [RFC PATCH RESEND 03/28] mm: introduce __find_vma to be used without mmap_lock protection

2022-09-01 Thread Kent Overstreet
On Thu, Sep 01, 2022 at 10:34:51AM -0700, Suren Baghdasaryan wrote:
> Add __find_vma function to be used for VMA lookup under rcu protection.

So it was news to me that the rb tree code can be used for lockless lookups -
not having looked at lib/rbtree.c in over 10 years :) - I still think it should
be mentioned in the commit message that that's what you're doing and why it's
safe, because it's not exactly common knowledge and lockless stuff deserves
extra scrutiny.

Probably worth a comment, too.

Reviewed-by: Kent Overstreet 


Re: [RFC PATCH RESEND 04/28] mm: move mmap_lock assert function definitions

2022-09-01 Thread Kent Overstreet
On Thu, Sep 01, 2022 at 10:34:52AM -0700, Suren Baghdasaryan wrote:
> Move mmap_lock assert function definitions up so that they can be used
> by other mmap_lock routines.
> 
> Signed-off-by: Suren Baghdasaryan 
> ---
>  include/linux/mmap_lock.h | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
> index 96e113e23d04..e49ba91bb1f0 100644
> --- a/include/linux/mmap_lock.h
> +++ b/include/linux/mmap_lock.h
> @@ -60,6 +60,18 @@ static inline void __mmap_lock_trace_released(struct 
> mm_struct *mm, bool write)
>  
>  #endif /* CONFIG_TRACING */
>  
> +static inline void mmap_assert_locked(struct mm_struct *mm)
> +{
> + lockdep_assert_held(&mm->mmap_lock);
> + VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_lock), mm);

These look redundant to me - maybe there's a reason the VM developers want both,
but I would drop the VM_BUG_ON() and just keep the lockdep_assert_held(), since
that's the standard way to write that assertion.


Re: [RFC PATCH RESEND 23/28] x86/mm: define ARCH_SUPPORTS_PER_VMA_LOCK

2022-09-01 Thread Kent Overstreet
On Thu, Sep 01, 2022 at 10:35:11AM -0700, Suren Baghdasaryan wrote:
> Set ARCH_SUPPORTS_PER_VMA_LOCK so that the per-VMA lock support can be
> compiled on this architecture.
> 
> Signed-off-by: Suren Baghdasaryan 
> ---
>  arch/x86/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index f9920f1341c8..ee19de020b27 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -27,6 +27,7 @@ config X86_64
>   # Options that are inherently 64-bit kernel only:
>   select ARCH_HAS_GIGANTIC_PAGE
>   select ARCH_SUPPORTS_INT128 if CC_HAS_INT128
> + select ARCH_SUPPORTS_PER_VMA_LOCK
>   select ARCH_USE_CMPXCHG_LOCKREF
>   select HAVE_ARCH_SOFT_DIRTY
>   select MODULES_USE_ELF_RELA

I think you could combine this with the previous patch (and similarly on other
architectures) - they logically go together.


[PATCH 04/11] powerpc: Convert to printbuf

2022-08-15 Thread Kent Overstreet
From: Kent Overstreet 

This converts from seq_buf to printbuf. We're using printbuf in external
buffer mode, so it's a direct conversion, aside from some trivial
refactoring in cpu_show_meltdown() to make the code more consistent.

Signed-off-by: Kent Overstreet 
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/process.c | 16 +++--
 arch/powerpc/kernel/security.c| 75 ++-
 arch/powerpc/platforms/pseries/papr_scm.c | 34 +-
 3 files changed, 57 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 0fbda89cd1..05654dbeb2 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -37,7 +37,7 @@
 #include 
 #include 
 #include 
-#include <linux/seq_buf.h>
+#include <linux/printbuf.h>
 
 #include 
 #include 
@@ -1396,32 +1396,30 @@ void show_user_instructions(struct pt_regs *regs)
 {
unsigned long pc;
int n = NR_INSN_TO_PRINT;
-   struct seq_buf s;
char buf[96]; /* enough for 8 times 9 + 2 chars */
+   struct printbuf s = PRINTBUF_EXTERN(buf, sizeof(buf));
 
pc = regs->nip - (NR_INSN_TO_PRINT * 3 / 4 * sizeof(int));
 
-   seq_buf_init(&s, buf, sizeof(buf));
-
while (n) {
int i;
 
-   seq_buf_clear(&s);
+   printbuf_reset(&s);
 
for (i = 0; i < 8 && n; i++, n--, pc += sizeof(int)) {
int instr;
 
if (copy_from_user_nofault(&instr, (void __user *)pc,
sizeof(instr))) {
-   seq_buf_printf(&s, "XXXXXXXX ");
+   prt_printf(&s, "XXXXXXXX ");
continue;
}
-   seq_buf_printf(&s, regs->nip == pc ? "<%08x> " : "%08x ", instr);
+   prt_printf(&s, regs->nip == pc ? "<%08x> " : "%08x ", instr);
}
 
-   if (!seq_buf_has_overflowed(&s))
+   if (printbuf_remaining(&s))
pr_info("%s[%d]: code: %s\n", current->comm,
-   current->pid, s.buffer);
+   current->pid, s.buf);
}
 }
 
diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index d96fd14bd7..b34de62e65 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -10,7 +10,7 @@
 #include 
 #include 
 #include 
-#include <linux/seq_buf.h>
+#include <linux/printbuf.h>
 #include 
 
 #include 
@@ -144,31 +144,28 @@ void __init setup_spectre_v2(void)
 #ifdef CONFIG_PPC_BOOK3S_64
 ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, 
char *buf)
 {
+   struct printbuf s = PRINTBUF_EXTERN(buf, PAGE_SIZE);
bool thread_priv;
 
thread_priv = security_ftr_enabled(SEC_FTR_L1D_THREAD_PRIV);
 
if (rfi_flush) {
-   struct seq_buf s;
-   seq_buf_init(&s, buf, PAGE_SIZE - 1);
 
-   seq_buf_printf(, "Mitigation: RFI Flush");
+   prt_printf(, "Mitigation: RFI Flush");
if (thread_priv)
-   seq_buf_printf(, ", L1D private per thread");
-
-   seq_buf_printf(, "\n");
-
-   return s.len;
+   prt_printf(, ", L1D private per thread");
+
+   prt_printf(, "\n");
+   } else if (thread_priv) {
+   prt_printf(, "Vulnerable: L1D private per thread\n");
+   } else if (!security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) &&
+  !security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR)) {
+   prt_printf(, "Not affected\n");
+   } else {
+   prt_printf(, "Vulnerable\n");
}
 
-   if (thread_priv)
-   return sprintf(buf, "Vulnerable: L1D private per thread\n");
-
-   if (!security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) &&
-   !security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR))
-   return sprintf(buf, "Not affected\n");
-
-   return sprintf(buf, "Vulnerable\n");
+   return printbuf_written(&s);
 }
 
 ssize_t cpu_show_l1tf(struct device *dev, struct device_attribute *attr, char 
*buf)
@@ -179,70 +176,66 @@ ssize_t cpu_show_l1tf(struct device *dev, struct 
device_attribute *attr, char *b
 
 ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, 
char *buf)
 {
-   struct seq_buf s;
-
-   seq_buf_init(&s, buf, PAGE_SIZE - 1);
+   struct printbuf s = PRINTBUF_EXTERN(buf, PAGE_SIZE);
 
if (security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR)) {
if (barrier_nospec_enabled)
-   seq_buf_printf(, "Mitigation: __user pointer 
sanitization");
+  

[PATCH v4 27/34] powerpc: Convert to printbuf

2022-06-19 Thread Kent Overstreet
This converts from seq_buf to printbuf. We're using printbuf in external
buffer mode, so it's a direct conversion, aside from some trivial
refactoring in cpu_show_meltdown() to make the code more consistent.

Signed-off-by: Kent Overstreet 
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/process.c | 16 +++--
 arch/powerpc/kernel/security.c| 75 ++-
 arch/powerpc/platforms/pseries/papr_scm.c | 34 +-
 3 files changed, 57 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 984813a4d5..fb8ba50223 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -39,7 +39,7 @@
 #include 
 #include 
 #include 
-#include <linux/seq_buf.h>
+#include <linux/printbuf.h>
 
 #include 
 #include 
@@ -1399,32 +1399,30 @@ void show_user_instructions(struct pt_regs *regs)
 {
unsigned long pc;
int n = NR_INSN_TO_PRINT;
-   struct seq_buf s;
char buf[96]; /* enough for 8 times 9 + 2 chars */
+   struct printbuf s = PRINTBUF_EXTERN(buf, sizeof(buf));
 
pc = regs->nip - (NR_INSN_TO_PRINT * 3 / 4 * sizeof(int));
 
-   seq_buf_init(&s, buf, sizeof(buf));
-
while (n) {
int i;
 
-   seq_buf_clear(&s);
+   printbuf_reset(&s);
 
for (i = 0; i < 8 && n; i++, n--, pc += sizeof(int)) {
int instr;
 
if (copy_from_user_nofault(&instr, (void __user *)pc,
sizeof(instr))) {
-   seq_buf_printf(&s, "XXXXXXXX ");
+   prt_printf(&s, "XXXXXXXX ");
continue;
}
-   seq_buf_printf(&s, regs->nip == pc ? "<%08x> " : "%08x ", instr);
+   prt_printf(&s, regs->nip == pc ? "<%08x> " : "%08x ", instr);
}
 
-   if (!seq_buf_has_overflowed(&s))
+   if (printbuf_remaining(&s))
pr_info("%s[%d]: code: %s\n", current->comm,
-   current->pid, s.buffer);
+   current->pid, s.buf);
}
 }
 
diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index d96fd14bd7..b34de62e65 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -10,7 +10,7 @@
 #include 
 #include 
 #include 
-#include <linux/seq_buf.h>
+#include <linux/printbuf.h>
 #include 
 
 #include 
@@ -144,31 +144,28 @@ void __init setup_spectre_v2(void)
 #ifdef CONFIG_PPC_BOOK3S_64
 ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, 
char *buf)
 {
+   struct printbuf s = PRINTBUF_EXTERN(buf, PAGE_SIZE);
bool thread_priv;
 
thread_priv = security_ftr_enabled(SEC_FTR_L1D_THREAD_PRIV);
 
if (rfi_flush) {
-   struct seq_buf s;
-   seq_buf_init(&s, buf, PAGE_SIZE - 1);
 
-   seq_buf_printf(, "Mitigation: RFI Flush");
+   prt_printf(, "Mitigation: RFI Flush");
if (thread_priv)
-   seq_buf_printf(, ", L1D private per thread");
-
-   seq_buf_printf(, "\n");
-
-   return s.len;
+   prt_printf(, ", L1D private per thread");
+
+   prt_printf(, "\n");
+   } else if (thread_priv) {
+   prt_printf(, "Vulnerable: L1D private per thread\n");
+   } else if (!security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) &&
+  !security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR)) {
+   prt_printf(, "Not affected\n");
+   } else {
+   prt_printf(, "Vulnerable\n");
}
 
-   if (thread_priv)
-   return sprintf(buf, "Vulnerable: L1D private per thread\n");
-
-   if (!security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) &&
-   !security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR))
-   return sprintf(buf, "Not affected\n");
-
-   return sprintf(buf, "Vulnerable\n");
+   return printbuf_written(&s);
 }
 
 ssize_t cpu_show_l1tf(struct device *dev, struct device_attribute *attr, char 
*buf)
@@ -179,70 +176,66 @@ ssize_t cpu_show_l1tf(struct device *dev, struct 
device_attribute *attr, char *b
 
 ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, 
char *buf)
 {
-   struct seq_buf s;
-
-   seq_buf_init(&s, buf, PAGE_SIZE - 1);
+   struct printbuf s = PRINTBUF_EXTERN(buf, PAGE_SIZE);
 
if (security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR)) {
if (barrier_nospec_enabled)
-   seq_buf_printf(, "Mitigation: __user pointer 
sanitization");
+  

[PATCH v3 27/33] powerpc: Convert to printbuf

2022-06-05 Thread Kent Overstreet
This converts from seq_buf to printbuf. We're using printbuf in external
buffer mode, so it's a direct conversion, aside from some trivial
refactoring in cpu_show_meltdown() to make the code more consistent.
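
For readers new to the API, the external-buffer pattern the converted
show() handlers rely on looks roughly like this (a minimal sketch using
only the printbuf calls visible in the diff below; example_show() and
its output string are made up):

#include <linux/device.h>
#include <linux/printbuf.h>

/* Sketch: format into a caller-provided page, sysfs show() style. */
static ssize_t example_show(struct device *dev,
			    struct device_attribute *attr, char *buf)
{
	/* External mode: printbuf formats into buf and never allocates. */
	struct printbuf s = PRINTBUF_EXTERN(buf, PAGE_SIZE);

	prt_printf(&s, "Mitigation: example\n");

	/* Bytes written so far, i.e. what a show() handler returns. */
	return printbuf_written(&s);
}

Note that the PAGE_SIZE - 1 passed to seq_buf_init() becomes a plain
PAGE_SIZE here, presumably because printbuf reserves space for the
terminating NUL itself.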

Signed-off-by: Kent Overstreet 
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/process.c | 16 +++--
 arch/powerpc/kernel/security.c| 75 ++-
 arch/powerpc/platforms/pseries/papr_scm.c | 34 +-
 3 files changed, 57 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 984813a4d5..fb8ba50223 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -39,7 +39,7 @@
 #include 
 #include 
 #include 
-#include <linux/seq_buf.h>
+#include <linux/printbuf.h>
 
 #include 
 #include 
@@ -1399,32 +1399,30 @@ void show_user_instructions(struct pt_regs *regs)
 {
unsigned long pc;
int n = NR_INSN_TO_PRINT;
-   struct seq_buf s;
char buf[96]; /* enough for 8 times 9 + 2 chars */
+   struct printbuf s = PRINTBUF_EXTERN(buf, sizeof(buf));
 
pc = regs->nip - (NR_INSN_TO_PRINT * 3 / 4 * sizeof(int));
 
-   seq_buf_init(&s, buf, sizeof(buf));
-
while (n) {
int i;
 
-   seq_buf_clear(&s);
+   printbuf_reset(&s);
 
for (i = 0; i < 8 && n; i++, n--, pc += sizeof(int)) {
int instr;
 
if (copy_from_user_nofault(&instr, (void __user *)pc,
sizeof(instr))) {
-   seq_buf_printf(&s, "XXXXXXXX ");
+   prt_printf(&s, "XXXXXXXX ");
continue;
}
-   seq_buf_printf(&s, regs->nip == pc ? "<%08x> " : "%08x ", instr);
+   prt_printf(&s, regs->nip == pc ? "<%08x> " : "%08x ", instr);
}
 
-   if (!seq_buf_has_overflowed(&s))
+   if (printbuf_remaining(&s))
pr_info("%s[%d]: code: %s\n", current->comm,
-   current->pid, s.buffer);
+   current->pid, s.buf);
}
 }
 
diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index d96fd14bd7..b34de62e65 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -10,7 +10,7 @@
 #include 
 #include 
 #include 
-#include <linux/seq_buf.h>
+#include <linux/printbuf.h>
 #include 
 
 #include 
@@ -144,31 +144,28 @@ void __init setup_spectre_v2(void)
 #ifdef CONFIG_PPC_BOOK3S_64
ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)
 {
+   struct printbuf s = PRINTBUF_EXTERN(buf, PAGE_SIZE);
bool thread_priv;
 
thread_priv = security_ftr_enabled(SEC_FTR_L1D_THREAD_PRIV);
 
if (rfi_flush) {
-   struct seq_buf s;
-   seq_buf_init(&s, buf, PAGE_SIZE - 1);
 
-   seq_buf_printf(&s, "Mitigation: RFI Flush");
+   prt_printf(&s, "Mitigation: RFI Flush");
if (thread_priv)
-   seq_buf_printf(&s, ", L1D private per thread");
-
-   seq_buf_printf(&s, "\n");
-
-   return s.len;
+   prt_printf(&s, ", L1D private per thread");
+
+   prt_printf(&s, "\n");
+   } else if (thread_priv) {
+   prt_printf(&s, "Vulnerable: L1D private per thread\n");
+   } else if (!security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) &&
+  !security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR)) {
+   prt_printf(&s, "Not affected\n");
+   } else {
+   prt_printf(&s, "Vulnerable\n");
}
 
-   if (thread_priv)
-   return sprintf(buf, "Vulnerable: L1D private per thread\n");
-
-   if (!security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) &&
-   !security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR))
-   return sprintf(buf, "Not affected\n");
-
-   return sprintf(buf, "Vulnerable\n");
+   return printbuf_written(&s);
 }
 
ssize_t cpu_show_l1tf(struct device *dev, struct device_attribute *attr, char *buf)
@@ -179,70 +176,66 @@ ssize_t cpu_show_l1tf(struct device *dev, struct device_attribute *attr, char *b
 
ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, char *buf)
 {
-   struct seq_buf s;
-
-   seq_buf_init(&s, buf, PAGE_SIZE - 1);
+   struct printbuf s = PRINTBUF_EXTERN(buf, PAGE_SIZE);
 
if (security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR)) {
if (barrier_nospec_enabled)
-   seq_buf_printf(&s, "Mitigation: __user pointer sanitization");
+   prt_printf(&s, "Mitigation: __user pointer sanitization");

[PATCH v2 26/28] powerpc: Convert to printbuf

2022-05-19 Thread Kent Overstreet
This converts from seq_buf to printbuf. We're using printbuf in external
buffer mode, so it's a direct conversion, aside from some trivial
refactoring in cpu_show_meltdown() to make the code more consistent.
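
The other recurring shape in this conversion is resetting one stack
buffer across loop iterations, as in show_user_instructions(); roughly
(a sketch built from the calls in the diff below, with made-up data;
pr_buf() is the older name for what the v3 posting above calls
prt_printf()):

#include <linux/printbuf.h>
#include <linux/printk.h>
#include <linux/types.h>

/* Sketch: reuse one stack buffer for each printed row. */
static void print_rows(const u32 *vals, int n)
{
	char buf[96];
	struct printbuf s = PRINTBUF_EXTERN(buf, sizeof(buf));
	int i;

	for (i = 0; i < n; i++) {
		printbuf_reset(&s);	/* rewind; keep the same buffer */
		pr_buf(&s, "row %d: %08x", i, vals[i]);
		if (printbuf_remaining(&s))	/* nonzero: nothing was truncated */
			pr_info("%s\n", s.buf);
	}
}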

Signed-off-by: Kent Overstreet 
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/process.c | 16 +++--
 arch/powerpc/kernel/security.c| 75 ++-
 arch/powerpc/platforms/pseries/papr_scm.c | 34 +-
 3 files changed, 57 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 984813a4d5..f6f7804516 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -39,7 +39,7 @@
 #include 
 #include 
 #include 
-#include <linux/seq_buf.h>
+#include <linux/printbuf.h>
 
 #include 
 #include 
@@ -1399,32 +1399,30 @@ void show_user_instructions(struct pt_regs *regs)
 {
unsigned long pc;
int n = NR_INSN_TO_PRINT;
-   struct seq_buf s;
char buf[96]; /* enough for 8 times 9 + 2 chars */
+   struct printbuf s = PRINTBUF_EXTERN(buf, sizeof(buf));
 
pc = regs->nip - (NR_INSN_TO_PRINT * 3 / 4 * sizeof(int));
 
-   seq_buf_init(&s, buf, sizeof(buf));
-
while (n) {
int i;
 
-   seq_buf_clear(&s);
+   printbuf_reset(&s);
 
for (i = 0; i < 8 && n; i++, n--, pc += sizeof(int)) {
int instr;
 
if (copy_from_user_nofault(&instr, (void __user *)pc,
sizeof(instr))) {
-   seq_buf_printf(&s, "XXXXXXXX ");
+   pr_buf(&s, "XXXXXXXX ");
continue;
}
-   seq_buf_printf(&s, regs->nip == pc ? "<%08x> " : "%08x ", instr);
+   pr_buf(&s, regs->nip == pc ? "<%08x> " : "%08x ", instr);
}
 
-   if (!seq_buf_has_overflowed(&s))
+   if (printbuf_remaining(&s))
pr_info("%s[%d]: code: %s\n", current->comm,
-   current->pid, s.buffer);
+   current->pid, s.buf);
}
 }
 
diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index e159d4093d..5c9bad138c 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -10,7 +10,7 @@
 #include 
 #include 
 #include 
-#include <linux/seq_buf.h>
+#include <linux/printbuf.h>
 #include 
 
 #include 
@@ -144,31 +144,28 @@ void __init setup_spectre_v2(void)
 #ifdef CONFIG_PPC_BOOK3S_64
ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)
 {
+   struct printbuf s = PRINTBUF_EXTERN(buf, PAGE_SIZE);
bool thread_priv;
 
thread_priv = security_ftr_enabled(SEC_FTR_L1D_THREAD_PRIV);
 
if (rfi_flush) {
-   struct seq_buf s;
-   seq_buf_init(&s, buf, PAGE_SIZE - 1);
 
-   seq_buf_printf(&s, "Mitigation: RFI Flush");
+   pr_buf(&s, "Mitigation: RFI Flush");
if (thread_priv)
-   seq_buf_printf(&s, ", L1D private per thread");
-
-   seq_buf_printf(&s, "\n");
-
-   return s.len;
+   pr_buf(&s, ", L1D private per thread");
+
+   pr_buf(&s, "\n");
+   } else if (thread_priv) {
+   pr_buf(&s, "Vulnerable: L1D private per thread\n");
+   } else if (!security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) &&
+  !security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR)) {
+   pr_buf(&s, "Not affected\n");
+   } else {
+   pr_buf(&s, "Vulnerable\n");
}
 
-   if (thread_priv)
-   return sprintf(buf, "Vulnerable: L1D private per thread\n");
-
-   if (!security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV) &&
-   !security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR))
-   return sprintf(buf, "Not affected\n");
-
-   return sprintf(buf, "Vulnerable\n");
+   return printbuf_written(&s);
 }
 
ssize_t cpu_show_l1tf(struct device *dev, struct device_attribute *attr, char *buf)
@@ -179,70 +176,66 @@ ssize_t cpu_show_l1tf(struct device *dev, struct device_attribute *attr, char *b
 
ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, char *buf)
 {
-   struct seq_buf s;
-
-   seq_buf_init(&s, buf, PAGE_SIZE - 1);
+   struct printbuf s = PRINTBUF_EXTERN(buf, PAGE_SIZE);
 
if (security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR)) {
if (barrier_nospec_enabled)
-   seq_buf_printf(&s, "Mitigation: __user pointer sanitization");
+   pr_buf(&s, "Mitigation: __user pointer sanitization");

[PATCH 04/22] block: Convert bio_for_each_segment() to bvec_iter

2013-03-27 Thread Kent Overstreet
More prep work for immutable biovecs - with immutable bvecs drivers
won't be able to use the biovec directly; they'll need to use helpers
that take into account bio->bi_iter.bi_bvec_done.

This updates callers for the new usage without changing the
implementation yet.
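
Concretely, a converted caller iterates like this (a sketch of the
post-conversion pattern only; count_bio_bytes() is a made-up example,
not a function from this series):

#include <linux/bio.h>

/* Sketch: bio_for_each_segment() now yields a struct bio_vec by value,
 * computed through the iterator, instead of a pointer into the biovec
 * array, so it stays correct when bi_iter.bi_bvec_done is nonzero. */
static unsigned int count_bio_bytes(struct bio *bio)
{
	struct bio_vec bvec;
	struct bvec_iter iter;
	unsigned int bytes = 0;

	bio_for_each_segment(bvec, bio, iter)
		bytes += bvec.bv_len;

	return bytes;
}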

Signed-off-by: Kent Overstreet koverstr...@google.com
Cc: Jens Axboe ax...@kernel.dk
Cc: Geert Uytterhoeven ge...@linux-m68k.org
Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
Cc: Paul Mackerras pau...@samba.org
Cc: Ed L. Cashin ecas...@coraid.com
Cc: Nick Piggin npig...@kernel.dk
Cc: Lars Ellenberg drbd-...@lists.linbit.com
Cc: Jiri Kosina jkos...@suse.cz
Cc: Paul Clements paul.cleme...@steeleye.com
Cc: Jim Paris j...@jtan.com
Cc: Geoff Levand ge...@infradead.org
Cc: Yehuda Sadeh yeh...@inktank.com
Cc: Sage Weil s...@inktank.com
Cc: Alex Elder el...@inktank.com
Cc: ceph-de...@vger.kernel.org
Cc: Joshua Morris josh.h.mor...@us.ibm.com
Cc: Philip Kelleher pjk1...@linux.vnet.ibm.com
Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
Cc: Jeremy Fitzhardinge jer...@goop.org
Cc: Neil Brown ne...@suse.de
Cc: Martin Schwidefsky schwidef...@de.ibm.com
Cc: Heiko Carstens heiko.carst...@de.ibm.com
Cc: linux...@de.ibm.com
Cc: Nagalakshmi Nandigama nagalakshmi.nandig...@lsi.com
Cc: Sreekanth Reddy sreekanth.re...@lsi.com
Cc: supp...@lsi.com
Cc: James E.J. Bottomley jbottom...@parallels.com
Cc: Greg Kroah-Hartman gre...@linuxfoundation.org
Cc: Alexander Viro v...@zeniv.linux.org.uk
Cc: Steven Whitehouse swhit...@redhat.com
Cc: Kent Overstreet koverstr...@google.com
Cc: Herton Ronaldo Krzesinski herton.krzesin...@canonical.com
Cc: Tejun Heo t...@kernel.org
Cc: Andrew Morton a...@linux-foundation.org
Cc: Guo Chao y...@linux.vnet.ibm.com
Cc: Asai Thambi S P asamymuth...@micron.com
Cc: Selvan Mani sm...@micron.com
Cc: Sam Bradshaw sbrads...@micron.com
Cc: Matthew Wilcox matthew.r.wil...@intel.com
Cc: Keith Busch keith.bu...@intel.com
Cc: Stephen Hemminger shemmin...@vyatta.com
Cc: Quoc-Son Anh quoc-sonx@intel.com
Cc: Sebastian Ott seb...@linux.vnet.ibm.com
Cc: Nitin Gupta ngu...@vflare.org
Cc: Minchan Kim minc...@kernel.org
Cc: Jerome Marchand jmarc...@redhat.com
Cc: Seth Jennings sjenn...@linux.vnet.ibm.com
Cc: Martin K. Petersen martin.peter...@oracle.com
Cc: Mike Snitzer snit...@redhat.com
Cc: Vivek Goyal vgo...@redhat.com
Cc: Darrick J. Wong darrick.w...@oracle.com
Cc: Chris Metcalf cmetc...@tilera.com
Cc: Jan Kara j...@suse.cz
Cc: linux-m...@lists.linux-m68k.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: drbd-u...@lists.linbit.com
Cc: nbd-gene...@lists.sourceforge.net
Cc: cbe-oss-...@lists.ozlabs.org
Cc: xen-de...@lists.xensource.com
Cc: virtualizat...@lists.linux-foundation.org
Cc: linux-r...@vger.kernel.org
Cc: linux-s...@vger.kernel.org
Cc: dl-mptfusionli...@lsi.com
Cc: linux-s...@vger.kernel.org
Cc: de...@driverdev.osuosl.org
Cc: linux-fsde...@vger.kernel.org
Cc: cluster-de...@redhat.com
Cc: linux...@kvack.org
---
 arch/m68k/emu/nfblock.c  | 11 ---
 arch/powerpc/sysdev/axonram.c| 18 +--
 block/blk-merge.c| 45 ++-
 drivers/block/aoe/aoecmd.c   | 16 +-
 drivers/block/brd.c  | 12 
 drivers/block/drbd/drbd_main.c   | 27 +
 drivers/block/drbd/drbd_receiver.c   | 13 
 drivers/block/drbd/drbd_worker.c |  8 ++---
 drivers/block/floppy.c   | 12 
 drivers/block/loop.c | 23 +++---
 drivers/block/mtip32xx/mtip32xx.c| 13 
 drivers/block/nbd.c  | 12 
 drivers/block/nvme.c | 27 +
 drivers/block/ps3vram.c  | 10 +++---
 drivers/block/rbd.c  | 36 +++---
 drivers/block/rsxx/dma.c | 11 ---
 drivers/block/xen-blkfront.c | 14 -
 drivers/md/md.c  | 16 +-
 drivers/md/raid5.c   | 12 
 drivers/s390/block/dcssblk.c | 14 -
 drivers/s390/block/xpram.c   | 10 +++---
 drivers/scsi/mpt2sas/mpt2sas_transport.c | 31 ++-
 drivers/scsi/mpt3sas/mpt3sas_transport.c | 31 ++-
 drivers/staging/zram/zram_drv.c  | 19 ++--
 fs/bio-integrity.c   | 30 +-
 fs/bio.c | 16 +-
 fs/gfs2/lops.c   | 10 +++---
 include/linux/bio.h  | 25 ---
 include/linux/blkdev.h   |  7 +++--
 mm/bounce.c  | 52 +++-
 30 files changed, 296 insertions(+), 285 deletions(-)

diff --git a/arch/m68k/emu/nfblock.c b/arch/m68k/emu/nfblock.c
index 9070d6c..d2b260c 100644
--- a/arch/m68k/emu/nfblock.c
+++ b/arch/m68k/emu/nfblock.c
@@ -62,17 +62,18 @@ struct nfhd_device {
 static void