from:"Russell King"

Re: [PATCH v1 1/3] mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PTE_PTLOCKS into Kconfig options

2024-07-29 Thread Russell King (Oracle)

On Fri, Jul 26, 2024 at 05:07:26PM +0200, David Hildenbrand wrote:
> Let's clean that up a bit and prepare for depending on
> CONFIG_SPLIT_PMD_PTLOCKS in other Kconfig options.
> 
> More cleanups would be reasonable (like the arch-specific "depends on"
> for CONFIG_SPLIT_PTE_PTLOCKS), but we'll leave that for another day.
> 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Russell King (Oracle) 

Thanks!

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v2 00/14] Introducing TIF_NOTIFY_IPI flag

2024-06-15 Thread Russell King (Oracle)

On Thu, Jun 13, 2024 at 06:15:59PM +, K Prateek Nayak wrote:
> o Dropping the ARM results since I never got my hands on the ARM64
>   system I used in my last testing. If I do manage to get my hands on it
>   again, I'll rerun the experiments and share the results on the thread.
>   To test the case where TIF_NOTIFY_IPI is not enabled for a particular
>   architecture, I applied the series only until Patch 3 and tested the
>   same on my x86 machine with a WARN_ON_ONCE() in do_idle() to check if
>   tif_notify_ipi() ever return true and then repeated the same with
>   Patch 4 applied.

Confused. ARM (32-bit) or ARM64? You patch 32-bit ARM, but you don't
touch 64-bit Arm. "ARM" on its own in the context above to me suggests
32-bit, since you refer to ARM64 later.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 1/1] [RFC] ethernet: Convert from tasklet to BH workqueue

2024-05-07 Thread Russell King (Oracle)

On Tue, May 07, 2024 at 07:01:11PM +, Allen Pais wrote:
> The only generic interface to execute asynchronously in the BH context is
> tasklet; however, it's marked deprecated and has some design flaws. To
> replace tasklets, BH workqueue support was recently added. A BH workqueue
> behaves similarly to regular workqueues except that the queued work items
> are executed in the BH context.
> 
> This patch converts drivers/ethernet/* from tasklet to BH workqueue.

I doubt you're going to get many comments on this patch, being so large
and spread across all drivers. I'm not going to bother trying to edit
this down to something more sensible, I'll just plonk my comment here.

For the mvpp2 driver, you're only updating a comment - and looking at
it, the comment no longer reflects the code. It doesn't make use of
tasklets at all. That makes the comment wrong whether or not it's
updated. So I suggest rather than doing a search and replace for
"tasklet" to "BH blahblah" (sorry, I don't remember what you replaced
it with) just get rid of that bit of the comment.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 1/1] arch/mm/fault: fix major fault accounting when retrying under per-VMA lock

2024-01-20 Thread Russell King (Oracle)

On Sat, Jan 20, 2024 at 09:09:47PM +, patchwork-bot+linux-ri...@kernel.org 
wrote:
> Hello:
> 
> This patch was applied to riscv/linux.git (fixes)
> by Andrew Morton :
> 
> On Tue, 26 Dec 2023 13:46:10 -0800 you wrote:
> > A test [1] in Android test suite started failing after [2] was merged.
> > It turns out that after handling a major fault under per-VMA lock, the
> > process major fault counter does not register that fault as major.
> > Before [2] read faults would be done under mmap_lock, in which case
> > FAULT_FLAG_TRIED flag is set before retrying. That in turn causes
> > mm_account_fault() to account the fault as major once retry completes.
> > With per-VMA locks we often retry because a fault can't be handled
> > without locking the whole mm using mmap_lock. Therefore such retries
> > do not set FAULT_FLAG_TRIED flag. This logic does not work after [2]
> > because we can now handle read major faults under per-VMA lock and
> > upon retry the fact there was a major fault gets lost. Fix this by
> > setting FAULT_FLAG_TRIED after retrying under per-VMA lock if
> > VM_FAULT_MAJOR was returned. Ideally we would use an additional
> > VM_FAULT bit to indicate the reason for the retry (could not handle
> > under per-VMA lock vs other reason) but this simpler solution seems
> > to work, so keeping it simple.
> > 
> > [...]
> 
> Here is the summary with links:
>   - [1/1] arch/mm/fault: fix major fault accounting when retrying under 
> per-VMA lock
> https://git.kernel.org/riscv/c/46e714c729c8
> 
> You are awesome, thank you!

Now that 32-bit ARM has support for the per-VMA lock, does that also
need to be patched?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH bpf-next 5/6] bpf, arm32: Always zero extend for LDX with B/H/W

2023-09-12 Thread Russell King (Oracle)

On Tue, Sep 12, 2023 at 10:46:53PM +, Puranjay Mohan wrote:
> The JITs should not depend on the verifier for zero extending the upper
> 32 bits of the destination register when loading a byte, half-word, or
> word.
> 
> A following patch will make the verifier stop patching zext instructions
> after LDX.

This was introduced by:

163541e6ba34 ("arm: bpf: eliminate zero extension code-gen")

along with an additional function. So three points:

1) the commit should probably explain why it has now become undesirable
to access this verifier state, whereas it appears it was explicitly
added to permit this optimisation.
2) you state that jits should not depend on this state, but the above
commit adds more references than you're removing, so aren't there still
references to the verifier remaining after this patch? I count a total
of 10, and the patch below removes three.
3) what about the bpf_jit_needs_zext() function that was added to
support the export of this zext state?

Essentially, the logic stated in the commit message doesn't seem to be
reflected by the proposed code change.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v2] syscalls: Cleanup references to sys_lookup_dcookie()

2023-07-25 Thread Russell King (Oracle)

On Mon, Jul 10, 2023 at 06:51:24PM +, Sohil Mehta wrote:
> diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
> index 8ebed8a13874..cb7ea3bf18cf 100644
> --- a/arch/arm/tools/syscall.tbl
> +++ b/arch/arm/tools/syscall.tbl
> @@ -263,7 +263,7 @@
>  246  common  io_submit   sys_io_submit
>  247  common  io_cancel   sys_io_cancel
>  248  common  exit_group  sys_exit_group
> -249  common  lookup_dcookie  sys_lookup_dcookie
> +249  common  lookup_dcookie  sys_ni_syscall

Acked-by: Russell King (Oracle) 

Thanks!

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v3 01/13] kexec: consolidate kexec and crash options into kernel/Kconfig.kexec

2023-06-26 Thread Russell King (Oracle)

On Mon, Jun 26, 2023 at 12:13:20PM -0400, Eric DeVolder wrote:
> +config KEXEC
> + bool "Enable kexec system call"
> + default ARCH_DEFAULT_KEXEC
> + depends on ARCH_SUPPORTS_KEXEC
> + select KEXEC_CORE
> + help
> +   kexec is a system call that implements the ability to shutdown your
> +   current kernel, and to start another kernel. It is like a reboot
> +   but it is independent of the system firmware. And like a reboot
> +   you can start any kernel with it, not just Linux.
> +
> +   The name comes from the similarity to the exec system call.
> +
> +   It is an ongoing process to be certain the hardware in a machine
> +   is properly shutdown, so do not be surprised if this code does not
> +   initially work for you. As of this writing the exact hardware
> +   interface is strongly in flux, so no good recommendation can be
> +   made.

Is this last paragraph still true? Is the hardware interface still
"strongly in flux" ?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 20/21] ARM: dma-mapping: split out arch_dma_mark_clean() helper

2023-03-31 Thread Russell King (Oracle)

On Fri, Mar 31, 2023 at 04:06:37PM +0200, Arnd Bergmann wrote:
> On Mon, Mar 27, 2023, at 17:01, Russell King (Oracle) wrote:
> > On Mon, Mar 27, 2023 at 02:13:16PM +0200, Arnd Bergmann wrote:
> >> From: Arnd Bergmann 
> >> 
> >> The arm version of the arch_sync_dma_for_cpu() function annotates pages as
> >> PG_dcache_clean after a DMA, but no other architecture does this here.
> >
> > ... because this is an arm32 specific feature. Generically, it's
> > PG_arch_1, which is a page flag free for architecture use. On arm32
> > we decided to use this to mark whether we can skip dcache writebacks
> > when establishing a PTE - and thus it was decided to call it
> > PG_dcache_clean to reflect how arm32 decided to use that bit.
> >
> > This isn't just a DMA thing, there are other places that we update
> > the bit, such as flush_dcache_page() and copy_user_highpage().
> >
> > So thinking that the arm32 PG_dcache_clean is something for DMA is
> > actually wrong.
> >
> > Other architectures are free to do their own other optimisations
> > using that bit, and their implementations may be DMA-centric.
> 
> The flag is used the same way on most architectures, though some
> use the opposite polarity and call it PG_dcache_dirty. The only
> other architecture that uses it for DMA is ia64, with the difference
> being that this also marks the page as clean even for coherent
> DMA, not just when doing a flush as part of noncoherent DMA.
> 
> Based on Robin's reply it sounds that this is not a valid assumption
> on Arm, if a coherent DMA can target a dirty dcache line without
> cleaning it.

The other thing to note here is that PG_dcache_clean doesn't have
much meaning on modern CPUs with PIPT caches. For these,
cache_is_vipt_nonaliasing() will be true, and
cache_ops_need_broadcast() will be false.

Firstly, if we're using coherent DMA, then PG_dcache_clean is
intentionally not touched, because the data cache isn't cleaned
in any way by DMA operations.

flush_dcache_page() turns into a no-op apart from clearing
PG_dcache_clean if it was set.

__sync_icache_dcache() will do nothing for non-executable pages,
but will write-back a page that isn't marked PG_dcache_clean to
ensure that it is visible to the instruction stream. This is only
used to ensure that a the instructions are visible to a newly
established executable mapping when e.g. the page has been DMA'd
in. The default state of PG_dcache_clean is zero on any new
allocation, so this has the effect of causing any executable page
to be flushed such that the instruction stream can see the
instructions, but only for the first establishment of the mapping.
That means that e.g. libc text pages don't keep getting flushed on
the start of every program.

update_mmu_cache() isn't compiled, so it's use of PG_dcache_clean
is irrelevant.

v6_copy_user_highpage_aliasing() won't be called because we're not
using an aliasing cache.

So, for modern ARM systems with DMA-coherent PG_dcache_clean only
serves for the __sync_icache_dcache() optimisation.

ARMs use of this remains valid in this circumstance.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 15/21] ARM: dma-mapping: always invalidate WT caches before DMA

2023-03-31 Thread Russell King (Oracle)

On Fri, Mar 31, 2023 at 12:38:45PM +0200, Arnd Bergmann wrote:
> On Fri, Mar 31, 2023, at 11:35, Russell King (Oracle) wrote:
> > On Fri, Mar 31, 2023 at 10:07:28AM +0100, Russell King (Oracle) wrote:
> >> On Mon, Mar 27, 2023 at 02:13:11PM +0200, Arnd Bergmann wrote:
> >> > From: Arnd Bergmann 
> >> > 
> >> > Most ARM CPUs can have write-back caches and that require
> >> > cache management to be done in the dma_sync_*_for_device()
> >> > operation. This is typically done in both writeback and
> >> > writethrough mode.
> >> > 
> >> > The cache-v4.S (arm720/740/7tdmi/9tdmi) and cache-v4wt.S
> >> > (arm920t, arm940t) implementations are the exception here,
> >> > and only do the cache management after the DMA is complete,
> >> > in the dma_sync_*_for_cpu() operation.
> >> > 
> >> > Change this for consistency with the other platforms. This
> >> > should have no user visible effect.
> >> 
> >> NAK...
> >> 
> >> The reason we do cache management _after_ is to ensure that there
> >> is no stale data. The kernel _has_ (at the very least in the past)
> >> performed DMA to data structures that are embedded within other
> >> data structures, resulting in cache lines being shared. If one of
> >> those cache lines is touched while DMA is progressing, then we
> >> must to cache management _after_ the DMA operation has completed.
> >> Doing it before is no good.
> 
> What I'm trying to address here is the inconsistency between
> implementations. If we decide that we always want to invalidate
> after FROM_DEVICE, I can do that as part of the series, but then
> I have to change most of the other arm implementations.

Why?

First thing to say is that DMA to buffers where the cache lines are
shared with data the CPU may be accessing need to be outlawed - they
are a recipe for data corruption - always have been. Sadly, some folk
don't see it that way because of a passed "x86 just works and we demand
that all architectures behave like x86!" attitude. The SCSI sense
buffer has historically been a big culpret for that.

For WT, FROM_DEVICE, invalidating after DMA is the right thing to do,
because we want to ensure that the DMA'd data is properly readable upon
completion of the DMA. If overlapping cache lines have been touched
while DMA is progressing, and we invalidate before DMA, then the cache
will contain stale data that will remain in the cache after DMA has
completed. Invalidating a WT cache does not destroy any data, so is
safe to do. So the safest approach is to invalidate after DMA has
completed in this instance.

For WB, FROM_DEVICE, we have the problem of dirty cache lines which
we have to get rid of. For the overlapping cache lines, we have to
clean those before DMA begins to ensure that data written to the
non-DMA-buffer part is preserved. All other cache lines need to be
invalidated before DMA begins to ensure that writebacks do not
corrupt data from the device. Hence why it's different.

And hence why the ARM implementation is based around buffer ownership.
And hence why they're called dma_map_area()/dma_unmap_area() rather
than the cache operations themselves. This is an intentional change,
one that was done when ARMv6 came along.

> OTOH, most machines that are actually in use today (armv6+,
> powerpc, later mips, microblaze, riscv, nios2) also have to
> deal with speculative accesses, so they end up having to
> invalidate or flush both before and after a DMA_FROM_DEVICE
> and DMA_BIDIRECTIONAL.

Again, these are implementation details of the cache, and this is
precisely why having the map/unmap interface is so much better than
having generic code explicitly call "clean" and "invalidate"
interfaces into arch code.

If we treat everything as a speculative cache, then we're doing
needless extra work for those caches that aren't speculative. So,
ARM would have to step through every cache line for every DMA
buffer at 32-byte intervals performing cache maintenance whether
the cache is speculative or not. That is expensive, and hurts
performance.

I put a lot of thought into this when I updated the ARM DMA
implementation when we started seeing these different cache types
particularly when ARMv6 came along. I really don't want that work
wrecked.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 15/21] ARM: dma-mapping: always invalidate WT caches before DMA

2023-03-31 Thread Russell King (Oracle)

On Fri, Mar 31, 2023 at 10:07:28AM +0100, Russell King (Oracle) wrote:
> On Mon, Mar 27, 2023 at 02:13:11PM +0200, Arnd Bergmann wrote:
> > From: Arnd Bergmann 
> > 
> > Most ARM CPUs can have write-back caches and that require
> > cache management to be done in the dma_sync_*_for_device()
> > operation. This is typically done in both writeback and
> > writethrough mode.
> > 
> > The cache-v4.S (arm720/740/7tdmi/9tdmi) and cache-v4wt.S
> > (arm920t, arm940t) implementations are the exception here,
> > and only do the cache management after the DMA is complete,
> > in the dma_sync_*_for_cpu() operation.
> > 
> > Change this for consistency with the other platforms. This
> > should have no user visible effect.
> 
> NAK...
> 
> The reason we do cache management _after_ is to ensure that there
> is no stale data. The kernel _has_ (at the very least in the past)
> performed DMA to data structures that are embedded within other
> data structures, resulting in cache lines being shared. If one of
> those cache lines is touched while DMA is progressing, then we
> must to cache management _after_ the DMA operation has completed.
> Doing it before is no good.

It looks like the main offender of "touching cache lines shared
with DMA" has now been resolved - that was the SCSI sense buffer,
and was fixed some time ago:

commit de25deb18016f66dcdede165d07654559bb332bc
Author: FUJITA Tomonori 
Date:   Wed Jan 16 13:32:17 2008 +0900

/if/ that is the one and only case, then we're probably fine, but
having been through an era where this kind of thing was the norm
and requests to fix it did not get great responses from subsystem
maintainers, I just don't trust the kernel not to want to DMA to
overlapping cache lines.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 15/21] ARM: dma-mapping: always invalidate WT caches before DMA

2023-03-31 Thread Russell King (Oracle)

On Mon, Mar 27, 2023 at 02:13:11PM +0200, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> Most ARM CPUs can have write-back caches and that require
> cache management to be done in the dma_sync_*_for_device()
> operation. This is typically done in both writeback and
> writethrough mode.
> 
> The cache-v4.S (arm720/740/7tdmi/9tdmi) and cache-v4wt.S
> (arm920t, arm940t) implementations are the exception here,
> and only do the cache management after the DMA is complete,
> in the dma_sync_*_for_cpu() operation.
> 
> Change this for consistency with the other platforms. This
> should have no user visible effect.

NAK...

The reason we do cache management _after_ is to ensure that there
is no stale data. The kernel _has_ (at the very least in the past)
performed DMA to data structures that are embedded within other
data structures, resulting in cache lines being shared. If one of
those cache lines is touched while DMA is progressing, then we
must to cache management _after_ the DMA operation has completed.
Doing it before is no good.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 20/21] ARM: dma-mapping: split out arch_dma_mark_clean() helper

2023-03-27 Thread Russell King (Oracle)

On Mon, Mar 27, 2023 at 02:13:16PM +0200, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> The arm version of the arch_sync_dma_for_cpu() function annotates pages as
> PG_dcache_clean after a DMA, but no other architecture does this here.

... because this is an arm32 specific feature. Generically, it's
PG_arch_1, which is a page flag free for architecture use. On arm32
we decided to use this to mark whether we can skip dcache writebacks
when establishing a PTE - and thus it was decided to call it
PG_dcache_clean to reflect how arm32 decided to use that bit.

This isn't just a DMA thing, there are other places that we update
the bit, such as flush_dcache_page() and copy_user_highpage().

So thinking that the arm32 PG_dcache_clean is something for DMA is
actually wrong.

Other architectures are free to do their own other optimisations
using that bit, and their implementations may be DMA-centric.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 16/21] ARM: dma-mapping: bring back dmac_{clean,inv}_range

2023-03-27 Thread Russell King (Oracle)

On Mon, Mar 27, 2023 at 02:13:12PM +0200, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> These were remove ages ago in commit 702b94bff3c5 ("ARM: dma-mapping:
> remove dmac_clean_range and dmac_inv_range") in an effort to sanitize
> the dma-mapping API.

Really no, please no. Let's not go back to this, let's keep the
buffer ownership model that came at around that time.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v3 03/24] arm: Remove COMMAND_LINE_SIZE from uapi

2023-02-15 Thread Russell King (Oracle)

On Tue, Feb 14, 2023 at 08:49:04AM +0100, Alexandre Ghiti wrote:
> From: Palmer Dabbelt 
> 
> As far as I can tell this is not used by userspace and thus should not
> be part of the user-visible API.
> 
> Signed-off-by: Palmer Dabbelt 

Looks good to me. What's the merge plan for this?

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v2 03/24] arm/cpu: Make sure arch_cpu_idle_dead() doesn't return

2023-02-14 Thread Russell King (Oracle)

On Tue, Feb 14, 2023 at 10:39:26AM -0800, Josh Poimboeuf wrote:
> On Tue, Feb 14, 2023 at 11:15:23AM +0000, Russell King (Oracle) wrote:
> > On Mon, Feb 13, 2023 at 11:05:37PM -0800, Josh Poimboeuf wrote:
> > > arch_cpu_idle_dead() doesn't return.  Make that more explicit with a
> > > BUG().
> > > 
> > > BUG() is preferable to unreachable() because BUG() is a more explicit
> > > failure mode and avoids undefined behavior like falling off the edge of
> > > the function into whatever code happens to be next.
> > 
> > This is silly. Just mark the function __noreturn and be done with it.
> > If the CPU ever executes code past the "b" instruction, it's already
> > really broken that the extra instructions that BUG() gives will be
> > meaningless.
> > 
> > This patch does nothing except add yet more bloat the kernel.
> > 
> > Sorry, but NAK.
> 
> Problem is, the compiler can't read inline asm.  So you'd get a 
> "'noreturn' function does return" warning.
> 
> We can do an unreachable() instead of a BUG() here if you prefer
> undefined behavior.

That's fine.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v2 03/24] arm/cpu: Make sure arch_cpu_idle_dead() doesn't return

2023-02-14 Thread Russell King (Oracle)

On Mon, Feb 13, 2023 at 11:05:37PM -0800, Josh Poimboeuf wrote:
> arch_cpu_idle_dead() doesn't return.  Make that more explicit with a
> BUG().
> 
> BUG() is preferable to unreachable() because BUG() is a more explicit
> failure mode and avoids undefined behavior like falling off the edge of
> the function into whatever code happens to be next.

This is silly. Just mark the function __noreturn and be done with it.
If the CPU ever executes code past the "b" instruction, it's already
really broken that the extra instructions that BUG() gives will be
meaningless.

This patch does nothing except add yet more bloat the kernel.

Sorry, but NAK.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH mm-unstable v1 04/26] arm/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2023-01-13 Thread Russell King (Oracle)

On Fri, Jan 13, 2023 at 06:10:04PM +0100, David Hildenbrand wrote:
> Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE by stealing one bit from the
> offset. This reduces the maximum swap space per file to 64 GiB (was 128
> GiB).
> 
> While at it drop the PTE_TYPE_FAULT from __swp_entry_to_pte() which is
> defined to be 0 and is rather confusing because we should be dealing
> with "Linux PTEs" not "hardware PTEs". Also, properly mask the type in
> __swp_entry().
> 
> Cc: Russell King 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Russell King (Oracle) 

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v2 3/3] treewide: use get_random_u32_between() when possible

2022-11-17 Thread Russell King (Oracle)

On Thu, Nov 17, 2022 at 03:05:14AM +0100, Jason A. Donenfeld wrote:
> On Thu, Nov 17, 2022 at 12:55:47AM +0100, Jason A. Donenfeld wrote:
> > 1) How/whether to make f(0, UR2_MAX) safe,
> >- without additional 64-bit arithmetic,
> >- minimizing the number of branches.
> >I have a few ideas I'll code golf for a bit.
> > I think I can make progress with (1) alone by fiddling around with
> > godbolt enough, like usual.
> 
> The code gen is definitely worse.
> 
> Original half-open interval:
> 
> return floor + get_random_u32_below(ceil - floor);
> 
> Suggested fully closed interval:
>   
> ceil = ceil - floor + 1;
> return likely(ceil) ? floor + get_random_u32_below(ceil) : 
> get_random_u32();

How many of these uses are going to have ceil and floor as a variable?
If they're constants (e.g. due to being in an inline function with
constant arguments) then the compiler will optimise all of the above
and the assembly code will just be either:

1. a call to get_random_u32()
2. a call to get_random_u32_below() and an addition.

If one passes ceil or floor as a variable, then yes, the code gen is
going to be as complicated as you suggest above.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v2 1/3] treewide: use get_random_u32_below() instead of deprecated function

2022-11-14 Thread Russell King (Oracle)

On Mon, Nov 14, 2022 at 05:45:56PM +0100, Jason A. Donenfeld wrote:
> This is a simple mechanical transformation done by:
> 
> @@
> expression E;
> @@
> - prandom_u32_max
> + get_random_u32_below
>   (E)
> 
> Reviewed-by: Greg Kroah-Hartman 
> Acked-by: Darrick J. Wong  # for xfs
> Reviewed-by: SeongJae Park  # for damon
> Reviewed-by: Jason Gunthorpe  # for infiniband
> Signed-off-by: Jason A. Donenfeld 
> ---
>  arch/arm/kernel/process.c     |  2 +-

Reviewed-by: Russell King (Oracle)  # for arm

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH net-next v6 6/9] net: dpaa: Convert to phylink

2022-10-04 Thread Russell King (Oracle)

On Fri, Sep 30, 2022 at 04:09:30PM -0400, Sean Anderson wrote:
> +static void memac_validate(struct phylink_config *config,
> +unsigned long *supported,
> +struct phylink_link_state *state)
> +{
> + __ETHTOOL_DECLARE_LINK_MODE_MASK(mask) = { 0, };
> + struct fman_mac *memac = fman_config_to_mac(config)->fman_mac;
> +
> + phylink_generic_validate(config, supported, state);
> +
> + if (phy_interface_mode_is_rgmii(state->interface) &&
> + memac->rgmii_no_half_duplex) {
> + phylink_caps_to_linkmodes(mask, MAC_10HD | MAC_100HD);
> + linkmode_andnot(supported, supported, mask);
> + linkmode_andnot(state->advertising, state->advertising, mask);
> + }
> +}

Having been through the rest of this with a fine tooth comb, nothing
else stands out with the exception of the above, which I think could
be done better with this patch:

http://git.armlinux.org.uk/cgit/linux-arm.git/commit/?h=net-queue=e65a47c4053255bd51715d5550e21c869971258c

Since the above would become:

static void memac_validate(struct phylink_config *config,
   unsigned long *supported,
   struct phylink_link_state *state)
{
struct mac_device *mac_dev = fman_config_to_mac(config);
struct fman_mac *memac = mac_dev->fman_mac;
unsigned long caps;

caps = mac_dev->phylink_config.capabilities;

if (phy_interface_mode_is_rgmii(state->interface) &&
memac->rgmii_no_half_duplex)
caps &= ~(MAC_10HD | MAC_100HD);

phylink_validate_mask_caps(supported, state, caps);
}

If you want to pick up my patch that adds this helper into your series,
please do.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH net-next v6 6/9] net: dpaa: Convert to phylink

2022-10-04 Thread Russell King (Oracle)

On Fri, Sep 30, 2022 at 04:09:30PM -0400, Sean Anderson wrote:
> @@ -1064,43 +1061,50 @@ static struct phylink_pcs *memac_pcs_create(struct 
> device_node *mac_node,
>   return pcs;
>  }
>  
> +static bool memac_supports(struct mac_device *mac_dev, phy_interface_t iface)
> +{
> + /* If there's no serdes device, assume that it's been configured for
> +  * whatever the default interface mode is.
> +  */
> + if (!mac_dev->fman_mac->serdes)
> + return mac_dev->phy_if == iface;
> + /* Otherwise, ask the serdes */
> + return !phy_validate(mac_dev->fman_mac->serdes, PHY_MODE_ETHERNET,
> +  iface, NULL);
> +}
> +
>  int memac_initialization(struct mac_device *mac_dev,
>struct device_node *mac_node,
>struct fman_mac_params *params)
>  {
>   int  err;
> + struct device_node  *fixed;
>   struct phylink_pcs  *pcs;
> - struct fixed_phy_status *fixed_link;
>   struct fman_mac *memac;
> + unsigned longcapabilities;
> + unsigned long   *supported;
>  
> + mac_dev->phylink_ops= _mac_ops;
>   mac_dev->set_promisc= memac_set_promiscuous;
>   mac_dev->change_addr= memac_modify_mac_address;
>   mac_dev->add_hash_mac_addr  = memac_add_hash_mac_address;
>   mac_dev->remove_hash_mac_addr   = memac_del_hash_mac_address;
> - mac_dev->set_tx_pause   = memac_set_tx_pause_frames;
> - mac_dev->set_rx_pause   = memac_accept_rx_pause_frames;
>   mac_dev->set_exception  = memac_set_exception;
>   mac_dev->set_allmulti   = memac_set_allmulti;
>   mac_dev->set_tstamp = memac_set_tstamp;
>   mac_dev->set_multi  = fman_set_multi;
> - mac_dev->adjust_link= adjust_link_memac;
>   mac_dev->enable = memac_enable;
>   mac_dev->disable= memac_disable;
>  
> - if (params->max_speed == SPEED_1)
> - mac_dev->phy_if = PHY_INTERFACE_MODE_XGMII;
> -
>   mac_dev->fman_mac = memac_config(mac_dev, params);
> - if (!mac_dev->fman_mac) {
> - err = -EINVAL;
> - goto _return;
> - }
> + if (!mac_dev->fman_mac)
> + return -EINVAL;
>  
>   memac = mac_dev->fman_mac;
>   memac->memac_drv_param->max_frame_length = fman_get_max_frm();
>   memac->memac_drv_param->reset_on_init = true;
>  
> - err = of_property_match_string(mac_node, "pcs-names", "xfi");
> + err = of_property_match_string(mac_node, "pcs-handle-names", "xfi");

While reading through the patch, I stumbled upon this - in the previous
patch, you introduce this code with "pcs-names" and then in this patch
you change the name of the property. I don't think this was mentioned in
the commit message (searching it for "pcs" didn't reveal anything) so
I'm wondering whether this name update should've been merged into the
previous patch instead of this one?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH] kernel: exit: cleanup release_thread()

2022-08-19 Thread Russell King (Oracle)

On Fri, Aug 19, 2022 at 09:44:06AM +0800, Kefeng Wang wrote:
> Only x86 has own release_thread(), introduce a new weak
> release_thread() function to clean empty definitions in
> other ARCHs.
> 
> Signed-off-by: Kefeng Wang 
...
>  arch/arm/include/asm/processor.h| 3 ---
>  arch/arm/kernel/process.c   | 4 ----

Acked-by: Russell King (Oracle) 

Thanks!

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH] kprobes: Enable tracing for mololithic kernel images

2022-06-10 Thread Russell King (Oracle)

Masahiro Yamada , Sami Tolvanen 
, "Naveen N. Rao" , Marco 
Elver , Kees Cook , Steven Rostedt 
, Nathan Chancellor , Mark Brown 
, Borislav Petkov , Alexander Egorenkov 
, Thomas Bogendoerfer , 
linux-par...@vger.kernel.org, Nathaniel McCallum , 
Dmitry Torokhov , "David S. Miller" 
, "Kirill A. Shutemov" , 
Tobias Huschle , "Peter Zijlstra \(Intel\)" 
, "H. Peter Anvin" , 
sparcli...@vger.kernel.org, Tiezhu Yang , Miroslav 
Benes , Chen Zhongjin , Ard Biesheuvel 
, x...@kernel.org, linux-ri...@lists.infradead.org, Ing
 o Molnar , Aaron Tomlin , Albert Ou 
, Heiko Carstens , Liao Chang 
, Paul Walmsley , Josh 
Poimboeuf , Thomas Richter , 
linux-m...@vger.kernel.org, Changbin Du , Palmer Dabbelt 
, linuxppc-dev@lists.ozlabs.org, 
linux-modu...@vger.kernel.org
Errors-To: linuxppc-dev-bounces+archive=mail-archive@lists.ozlabs.org
Sender: "Linuxppc-dev" 


On Wed, Jun 08, 2022 at 02:59:27AM +0300, Jarkko Sakkinen wrote:
> diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
> index 553866751e1a..d2bb954cd54f 100644
> --- a/arch/arm/kernel/Makefile
> +++ b/arch/arm/kernel/Makefile
> @@ -44,6 +44,11 @@ obj-$(CONFIG_CPU_IDLE) += cpuidle.o
>  obj-$(CONFIG_ISA_DMA_API)+= dma.o
>  obj-$(CONFIG_FIQ)+= fiq.o fiqasm.o
>  obj-$(CONFIG_MODULES)+= armksyms.o module.o
> +ifeq ($(CONFIG_MODULES),y)
> +obj-y+= module_alloc.o
> +else
> +obj-$(CONFIG_KPROBES)+= module_alloc.o
> +endif

Doesn't:

obj-$(CONFIG_MODULES)   += module_alloc.o
obj-$(CONFIG_KPROBES)   += module_alloc.o

work just as well? The kbuild modules.rst documentation says:

The order of files in $(obj-y) is significant.  Duplicates in
the lists are allowed: the first instance will be linked into
built-in.a and succeeding instances will be ignored.

so you should be fine... or the documentation is wrong!

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v5] mm: Avoid unnecessary page fault retires on shared memory types

2022-05-31 Thread Russell King (Oracle)

l Simek , Thomas Bogendoerfer , 
linux-par...@vger.kernel.org, Max Filippov , 
linux-ker...@vger.kernel.org, Dinh Nguyen , 
linux-ri...@lists.infradead.org, Palmer Dabbelt , Sven 
Schnelle , Guo Ren , 
linux-hexa...@vger.kernel.org, Ivan Kokshaysky , 
Johannes Berg , linuxppc-dev@lists.ozlabs.org, 
"David S . Miller" 
Errors-To: linuxppc-dev-bounces+archive=mail-archive@lists.ozlabs.org
Sender: "Linuxppc-dev" 


On Mon, May 30, 2022 at 02:34:50PM -0400, Peter Xu wrote:
> I observed that for each of the shared file-backed page faults, we're very
> likely to retry one more time for the 1st write fault upon no page.  It's
> because we'll need to release the mmap lock for dirty rate limit purpose
> with balance_dirty_pages_ratelimited() (in fault_dirty_shared_page()).
> 
> Then after that throttling we return VM_FAULT_RETRY.
> 
> We did that probably because VM_FAULT_RETRY is the only way we can return
> to the fault handler at that time telling it we've released the mmap lock.
> 
> However that's not ideal because it's very likely the fault does not need
> to be retried at all since the pgtable was well installed before the
> throttling, so the next continuous fault (including taking mmap read lock,
> walk the pgtable, etc.) could be in most cases unnecessary.
> 
> It's not only slowing down page faults for shared file-backed, but also add
> more mmap lock contention which is in most cases not needed at all.
> 
> To observe this, one could try to write to some shmem page and look at
> "pgfault" value in /proc/vmstat, then we should expect 2 counts for each
> shmem write simply because we retried, and vm event "pgfault" will capture
> that.
> 
> To make it more efficient, add a new VM_FAULT_COMPLETED return code just to
> show that we've completed the whole fault and released the lock.  It's also
> a hint that we should very possibly not need another fault immediately on
> this page because we've just completed it.
> 
> This patch provides a ~12% perf boost on my aarch64 test VM with a simple
> program sequentially dirtying 400MB shmem file being mmap()ed and these are
> the time it needs:
> 
>   Before: 650.980 ms (+-1.94%)
>   After:  569.396 ms (+-1.38%)
> 
> I believe it could help more than that.
> 
> We need some special care on GUP and the s390 pgfault handler (for gmap
> code before returning from pgfault), the rest changes in the page fault
> handlers should be relatively straightforward.
> 
> Another thing to mention is that mm_account_fault() does take this new
> fault as a generic fault to be accounted, unlike VM_FAULT_RETRY.
> 
> I explicitly didn't touch hmm_vma_fault() and break_ksm() because they do
> not handle VM_FAULT_RETRY even with existing code, so I'm literally keeping
> them as-is.
> 
> Acked-by: Geert Uytterhoeven 
> Acked-by: Peter Zijlstra (Intel) 
> Acked-by: Johannes Weiner 
> Acked-by: Vineet Gupta 
> Acked-by: Guo Ren 
> Acked-by: Max Filippov 
> Acked-by: Christian Borntraeger 
> Acked-by: Michael Ellerman  (powerpc)
> Acked-by: Catalin Marinas 
> Reviewed-by: Alistair Popple 
> Reviewed-by: Ingo Molnar 
> Signed-off-by: Peter Xu 

For:

> diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
> index a062e07516dd..46cccd6bf705 100644
> --- a/arch/arm/mm/fault.c
> +++ b/arch/arm/mm/fault.c
> @@ -322,6 +322,10 @@ do_page_fault(unsigned long addr, unsigned int fsr, 
> struct pt_regs *regs)
>   return 0;
>   }
>  
> + /* The fault is fully completed (including releasing mmap lock) */
> + if (fault & VM_FAULT_COMPLETED)
> + return 0;
> +
>   if (!(fault & VM_FAULT_ERROR)) {
>   if (fault & VM_FAULT_RETRY) {
>   flags |= FAULT_FLAG_TRIED;

Acked-by: Russell King (Oracle) 

Thanks!

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 02/30] ARM: kexec: Disable IRQs/FIQs also on crash CPUs shutdown path

2022-04-29 Thread Russell King (Oracle)

On Fri, Apr 29, 2022 at 06:38:19PM -0300, Guilherme G. Piccoli wrote:
> Thanks Marc and Michael for the review/discussion.
> 
> On 29/04/2022 15:20, Marc Zyngier wrote:
> > [...]
> 
> > My expectations would be that, since we're getting here using an IPI,
> > interrupts are already masked. So what reenabled them the first place?
> > 
> > Thanks,
> > 
> > M.
> > 
> 
> Marc, I did some investigation in the code (and tried/failed in the ARM
> documentation as well heh), but this is still not 100% clear for me.
> 
> You're saying IPI calls disable IRQs/FIQs by default in the the target
> CPUs? Where does it happen? I'm a bit confused if this a processor
> mechanism, or it's in code.

When we taken an IRQ, IRQs will be masked, FIQs will not. IPIs are
themselves interrupts, so IRQs will be masked while the IPI is being
processed. Therefore, there should be no need to re-disable the
already disabled interrupts.

> But crash_smp_send_stop() is different, it seems to IPI the other CPUs
> with the flag IPI_CALL_FUNC, which leads to calling
> generic_smp_call_function_interrupt() - does it disable interrupts/FIQs
> as well? I couldn't find it.

It's buried in the architecture behaviour. When the CPU takes an
interrupt and jumps to the interrupt vector in the vectors page, it is
architecturally defined that interrupts will be disabled. If they
weren't architecturally disabled at this point, then as soon as the
first instruction is processed (at the interrupt vector, likely a
branch) the CPU would immediately take another jump to the interrupt
vector, and this process would continue indefinitely, making interrupt
handling utterly useless.

So, you won't find an explicit instruction in the code path from the
vectors to the IPI handler that disables interrupts - because it's
written into the architecture that this is what must happen.

IRQs are a lower priority than FIQs, so FIQs remain unmasked.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH net-next v2 05/18] net: dsa: mv88e6xxx: remove redundant check in mv88e6xxx_port_vlan()

2022-04-12 Thread Russell King (Oracle)

On Tue, Apr 12, 2022 at 12:58:17PM +0200, Jakob Koschel wrote:
> We know that "dev > dst->last_switch" in the "else" block.
> In other words, that "dev - dst->last_switch" is > 0.
> 
> dsa_port_bridge_num_get(dp) can be 0, but the check
> "if (bridge_num + dst->last_switch != dev) continue", rewritten as
> "if (bridge_num != dev - dst->last_switch) continue", aka
> "if (bridge_num != something which cannot be 0) continue",
> makes it redundant to have the extra "if (!bridge_num) continue" logic,
> since a bridge_num of zero would have been skipped anyway.
> 
> Signed-off-by: Jakob Koschel 
> Signed-off-by: Vladimir Oltean 

Isn't this Vladimir's patch?

If so, it should be commited as Vladimir as the author, and Vladimir's
sign-off should appear before yours. When sending out such patches,
there should be a From: line for Vladimir as the first line in the body
of the patch email.

The same goes for the next mv88e6xxx patch in this series - I think
both of these are the patches Vladimir sent in his email:

https://lore.kernel.org/r/20220408123101.p33jpynhqo67hebe@skbuf

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT

2022-03-02 Thread Russell King (Oracle)

On Wed, Mar 02, 2022 at 04:36:52PM +0530, Anshuman Khandual wrote:
> On 3/2/22 3:35 PM, Geert Uytterhoeven wrote:
> > I doubt the switch() variant would give better code on any platform.
> > 
> > What about using tables everywhere, using designated initializers
> > to improve readability?
> 
> Designated initializers ? Could you please be more specific. A table look
> up on arm platform would be something like this and arm_protection_map[]
> needs to be updated with user_pgprot like before.

There is *absolutely* nothing wrong with that. Updating it once during
boot is way more efficient than having to compute the value each time
vm_get_page_prot() gets called.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH] net: Remove branch in csum_shift()

2022-03-01 Thread Russell King (Oracle)

On Tue, Mar 01, 2022 at 11:41:06AM +, David Laight wrote:
> From: Christophe Leroy
> > Sent: 01 March 2022 11:15
> ...
> > Looks like ARM also does better code with the generic implementation as
> > it seems to have some looking like conditional instructions 'rorne' and
> > 'strne'.
> 
> In arm32 (and I think arm64) every instruction is conditional.

Almost every instruction in arm32. There are a number of unconditional
instructions that were introduced.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT

2022-02-28 Thread Russell King (Oracle)

On Tue, Mar 01, 2022 at 05:30:41AM +0530, Anshuman Khandual wrote:
> On 2/28/22 4:27 PM, Russell King (Oracle) wrote:
> > On Mon, Feb 28, 2022 at 04:17:32PM +0530, Anshuman Khandual wrote:
> >> This defines and exports a platform specific custom vm_get_page_prot() via
> >> subscribing ARCH_HAS_VM_GET_PAGE_PROT. Subsequently all __SXXX and __PXXX
> >> macros can be dropped which are no longer needed.
> > 
> > What I would really like to know is why having to run _code_ to work out
> > what the page protections need to be is better than looking it up in a
> > table.
> > 
> > Not only is this more expensive in terms of CPU cycles, it also brings
> > additional code size with it.
> > 
> > I'm struggling to see what the benefit is.
> 
> Currently vm_get_page_prot() is also being _run_ to fetch required page
> protection values. Although that is being run in the core MM and from a
> platform perspective __SXXX, __PXXX are just being exported for a table.
> Looking it up in a table (and applying more constructs there after) is
> not much different than a clean switch case statement in terms of CPU
> usage. So this is not more expensive in terms of CPU cycles.

I disagree.

However, let's base this disagreement on some evidence. Here is the
present 32-bit ARM implementation:

0048 :
  48:   e20fand r0, r0, #15
  4c:   e3003000movwr3, #0
4c: R_ARM_MOVW_ABS_NC   .LANCHOR1
  50:   e3403000movtr3, #0
50: R_ARM_MOVT_ABS  .LANCHOR1
  54:   e7930100ldr r0, [r3, r0, lsl #2]
  58:   e12fff1ebx  lr

That is five instructions long.

Please show that your new implementation is not more expensive on
32-bit ARM. Please do so by building a 32-bit kernel, and providing
the disassembly.

I think you will find way more than five instructions in your version -
the compiler will have to issue code to decode the protection bits,
probably using a table of branches or absolute PC values, or possibly
the worst case of using multiple comparisons and branches. It then has
to load constants that may be moved using movw on ARMv7, but on
older architectures would have to be created from multiple instructions
or loaded from the literal pool. Then there'll be instructions to load
the address of "user_pgprot", retrieve its value, and bitwise or that.

Therefore, I fail to see how your approach of getting rid of the table
is somehow "better" than what we currently have in terms of the effect
on the resulting code.

If you don't like the __P and __S stuff and two arch_* hooks, you could
move the table into arch code along with vm_get_page_prot() without the
additional unnecessary hooks, while keeping all the benefits of the
table lookup.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH V3 09/30] arm/mm: Enable ARCH_HAS_VM_GET_PAGE_PROT

2022-02-28 Thread Russell King (Oracle)

On Mon, Feb 28, 2022 at 04:17:32PM +0530, Anshuman Khandual wrote:
> This defines and exports a platform specific custom vm_get_page_prot() via
> subscribing ARCH_HAS_VM_GET_PAGE_PROT. Subsequently all __SXXX and __PXXX
> macros can be dropped which are no longer needed.

What I would really like to know is why having to run _code_ to work out
what the page protections need to be is better than looking it up in a
table.

Not only is this more expensive in terms of CPU cycles, it also brings
additional code size with it.

I'm struggling to see what the benefit is.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v4 16/20] Kbuild: add Rust support

2022-02-12 Thread Russell King (Oracle)

On Sat, Feb 12, 2022 at 04:47:33PM +0100, Miguel Ojeda wrote:
> Hi Russell,
> 
> On Sat, Feb 12, 2022 at 3:17 PM Russell King (Oracle)
>  wrote:
> >
> > Please don't use CPU_32v6* here.
> >
> > It probably makes more sense to add a symbol "HAVE_RUST" and have the
> > appropriate architecture Kconfig files select HAVE_RUST.
> 
> We can do it whatever way arch maintainers prefer, of course. Why
> would you prefer one over the other?

It would be cleaner, rather than the "depends" line getting longer
and longer over time - and if different architecture maintainers
change it, it will lead to conflicts.

> > Does Rust support Thumb on ARMv6 and ARMv7 architectures?
> 
> Yes, the main backend is LLVM. Some built-in targets and their support
> level are listed here, if you want to take a look:

Right, so why made it dependent on CPU_32v6 || CPU_32v6K if ARMv7 is
supported? What about CPU_32v7? What about CPU_32v7M?

I think it would be saner to use the CPU_V6, CPU_V6K, CPU_V7 and maybe
CPU_V7M here - even bettern to select "HAVE_RUST" from these symbols,
since I'm sure you'd start to see the issue behind my "HAVE_RUST"
suggestion as it means having four symbols just for 32-bit ARM on your
dependency line.

> https://doc.rust-lang.org/nightly/rustc/platform-support.html

Interestingly, it does not list arm-unknown-linux-gnueabihf, which
is the "tuple" commonly used to build 32-bit ARM kernels. 

> > Please remove every utterance of "default n" from your patch; n is the
> > default default which default defaults to, so you don't need to specify
> > default n to make the option default to n. It will default to n purely
> > because n is the default when no default is specified.
> 
> Certainly. I am curious, though: is there a reason for most of the
> other 500+ instances in the kernel tree?

Probably because people incorrectly think it's required or some other
minor reason. As I say:

config FOO
bool/tristate ...

always defaults to 'n' without needing "default n" to be specified.

Let's do some proper research on this. There are 19148 "config"
statements in the kernel tree, 521 "default n" and 4818 that
specify any kind of "default". That means there are about 14330
config statements that do not specify any kind of default.

So, there are about 27 times more config statements that specify no
default than those that specify "default n", so using the argument
that there are "500+ instances" and therefore should be seen as
correct is completely misguided.

> > As Rust doesn't support all the architectures that the kernel supports,
> > Rust must not be used for core infrastructure.
> 
> Yeah, although I am not sure I understand what you are getting at here.

I mean, if we end up with, e.g. a filesystem coded in Rust, that
filesystem will not be available on architectures that the kernel
supports until either (a) Rust gains support for that architecture
or (b) someone re-codes the filesystem in C - at which point, what
is the point of having Rust in the kernel?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v4 16/20] Kbuild: add Rust support

2022-02-12 Thread Russell King (Oracle)

On Sat, Feb 12, 2022 at 02:03:42PM +0100, Miguel Ojeda wrote:
> +config RUST
> + bool "Rust support"
> + depends on RUST_IS_AVAILABLE
> + depends on ARM64 || CPU_32v6 || CPU_32v6K || (PPC64 && 
> CPU_LITTLE_ENDIAN) || X86_64 || RISCV

Please don't use CPU_32v6* here.

It probably makes more sense to add a symbol "HAVE_RUST" and have the
appropriate architecture Kconfig files select HAVE_RUST.

Does Rust support Thumb on ARMv6 and ARMv7 architectures?

> + depends on !MODVERSIONS
> + depends on !GCC_PLUGIN_RANDSTRUCT
> + select CONSTRUCTORS
> + default n

Please remove every utterance of "default n" from your patch; n is the
default default which default defaults to, so you don't need to specify
default n to make the option default to n. It will default to n purely
because n is the default when no default is specified.

> + help
> +   Enables Rust support in the kernel.
> +
> +   This allows other Rust-related options, like drivers written in Rust,
> +   to be selected.

As Rust doesn't support all the architectures that the kernel supports,
Rust must not be used for core infrastructure.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v2 10/45] ARM: Use do_kernel_power_off()

2021-10-28 Thread Russell King (Oracle)

On Thu, Oct 28, 2021 at 12:16:40AM +0300, Dmitry Osipenko wrote:
> Kernel now supports chained power-off handlers. Use do_kernel_power_off()
> that invokes chained power-off handlers. It also invokes legacy
> pm_power_off() for now, which will be removed once all drivers will
> be converted to the new power-off API.
> 
> Signed-off-by: Dmitry Osipenko 

Reviewed-by: Russell King (Oracle) 

Thanks!

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH -next] trap: Cleanup trap_init()

2021-08-12 Thread Russell King (Oracle)

On Thu, Aug 12, 2021 at 08:36:02PM +0800, Kefeng Wang wrote:
> There are some empty trap_init() in different ARCHs, introduce
> a new weak trap_init() function to cleanup them.
> 
> Cc: Vineet Gupta 
> Cc: Russell King 
> Cc: Yoshinori Sato 
> Cc: Ley Foon Tan 
> Cc: Jonas Bonn 
> Cc: Stefan Kristiansson 
> Cc: Stafford Horne 
> Cc: James E.J. Bottomley 
> Cc: Helge Deller 
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Paul Walmsley 
> Cc: Jeff Dike 
> Cc: Richard Weinberger 
> Cc: Anton Ivanov 
> Cc: Andrew Morton 
> Signed-off-by: Kefeng Wang 

For 32-bit arm:

Acked-by: Russell King (Oracle) 

Thanks!

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v1 00/16] .map_sg() error cleanup

2021-07-15 Thread Russell King (Oracle)

On Thu, Jul 15, 2021 at 10:45:28AM -0600, Logan Gunthorpe wrote:
> Hi,
> 
> This series is spun out and expanded from my work to add P2PDMA support
> to DMA map operations[1].
> 
> The P2PDMA work requires distinguishing different error conditions in
> a map_sg operation. dma_map_sgtable() already allows for returning an
> error code (where as dma_map_sg() is only allowed to return zero)
> however, it currently only returns -EINVAL when a .map_sg() call returns
> zero.
> 
> This series cleans up all .map_sg() implementations to return appropriate
> error codes. After the cleanup, dma_map_sg() will still return zero,
> however dma_map_sgtable() will pass the error code from the .map_sg()
> call. Thanks go to Martn Oliveira for doing a lot of the cleanup of the
> obscure implementations.
> 
> The patch set is based off of v5.14-rc1 and a git repo can be found
> here:

Have all the callers for dma_map_sg() been updated to check for error
codes? If not, isn't that a pre-requisit to this patch set?

>From what I see in Linus' current tree, we still have cases today
where the return value of dma_map_sg() is compared with zero to
detect failure, so I think that needs fixing before we start changing
the dma_map_sg() implementation to return negative numbers.

I also notice that there are various places that don't check the
return value - and returning a negative number instead of zero may
well cause random other bits to be set in fields.

So, I think there's a fair amount of work to do in all the drivers
before this change can be considered.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH] bus: Make remove callback return void

2021-07-06 Thread Russell King (Oracle)

On Tue, Jul 06, 2021 at 11:50:37AM +0200, Uwe Kleine-König wrote:
> The driver core ignores the return value of this callback because there
> is only little it can do when a device disappears.
> 
> This is the final bit of a long lasting cleanup quest where several
> buses were converted to also return void from their remove callback.
> Additionally some resource leaks were fixed that were caused by drivers
> returning an error code in the expectation that the driver won't go
> away.
> 
> With struct bus_type::remove returning void it's prevented that newly
> implemented buses return an ignored error code and so don't anticipate
> wrong expectations for driver authors.
> 
> Signed-off-by: Uwe Kleine-König 

Yay! For ARM, Amba and related parts:

Acked-by: Russell King (Oracle) 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH printk v3 4/6] printk: remove NMI tracking

2021-06-25 Thread Russell King (Oracle)

On Fri, Jun 25, 2021 at 02:36:23PM +0200, Petr Mladek wrote:
> On Thu 2021-06-24 13:17:46, John Ogness wrote:
> > All NMI contexts are handled the same as the safe context: store the
> > message and defer printing. There is no need to have special NMI
> > context tracking for this. Using in_nmi() is enough.
> > 
> > Signed-off-by: John Ogness 
> > Reviewed-by: Petr Mladek 
> > ---
> >  arch/arm/kernel/smp.c   |  2 --
> >  arch/powerpc/kexec/crash.c  |  3 ---
> >  include/linux/hardirq.h |  2 --
> >  include/linux/printk.h  | 12 
> >  init/Kconfig|  5 -
> >  kernel/printk/internal.h|  6 --
> >  kernel/printk/printk_safe.c | 37 +
> >  kernel/trace/trace.c|  2 --
> >  8 files changed, 1 insertion(+), 68 deletions(-)
> > 
> > diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> > index 74679240a9d8..0dd2d733ad62 100644
> > --- a/arch/arm/kernel/smp.c
> > +++ b/arch/arm/kernel/smp.c
> > @@ -668,9 +668,7 @@ static void do_handle_IPI(int ipinr)
> > break;
> >  
> > case IPI_CPU_BACKTRACE:
> > -   printk_nmi_enter();
> > nmi_cpu_backtrace(get_irq_regs());
> > -   printk_nmi_exit();
> 
> It looks to me that in_nmi() returns false here. As a result,
> nmi_cpu_backtrace() might newly call consoles immediately.
> 
> If I recall correctly, arm does not have a proper NMI.
> And this is just some special case of a "normal" IRQ.
> 
> And indeed, nmi_enter() is called only from handle_fiq_as_nmi()
> and it is just a boiler plate.
> 
> If I am right, we should replace printk_nmi_enter() with
> printk_safe_enter_irqsave(flags) or so.
> 
> Even better solution might be to call this within
> nmi_enter()/nmi_exit(). But I am not sure if this is what
> the arm people want.

As I seem to recall, the guy in ARM Ltd who was working on this seemed
to drift away and it never got finished - however, I've always carried
platform specific hacks in my tree to make this work from FIQ on the
platforms I cared about:

http://git.armlinux.org.uk/cgit/linux-arm.git/commit/?h=fiq

Not suitable for mainline like that. I'm not aware of anyone working on
it now.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v2 00/15] init_mm: cleanup ARCH's text/data/brk setup code

2021-06-07 Thread Russell King (Oracle)

On Mon, Jun 07, 2021 at 07:48:54AM +0200, Christophe Leroy wrote:
> Hi Kefeng,
> 
> What you could do is to define a __weak function that architectures can
> override and call that function from mm_init() as suggested by Mike,

The problem with weak functions is that they bloat the kernel. Each
time a weak function is overriden, it becomes dead unreachable code
within the kernel image.

At some point we're probabily going to have to enable -ffunction-sections
to (hopefully) allow the dead code to be discarded.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: Bogus struct page layout on 32-bit

2021-04-10 Thread Russell King - ARM Linux admin

On Sat, Apr 10, 2021 at 03:06:52PM +0100, Matthew Wilcox wrote:
> How about moving the flags into the union?  A bit messy, but we don't
> have to play games with __packed__.

Yes, that is probably the better solution, avoiding the games to try
and get the union appropriately placed on 32-bit systems.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v2] mm: Move mem_init_print_info() into mm_init()

2021-03-31 Thread Russell King - ARM Linux admin

On Wed, Mar 17, 2021 at 09:52:10AM +0800, Kefeng Wang wrote:
> mem_init_print_info() is called in mem_init() on each architecture,
> and pass NULL argument, so using void argument and move it into mm_init().
> 
> Acked-by: Dave Hansen 
> Signed-off-by: Kefeng Wang 

Acked-by: Russell King  # for arm bits
-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig

2021-03-23 Thread Russell King - ARM Linux admin

On Mon, Mar 22, 2021 at 06:10:01PM +0100, Cye Borg wrote:
> PWS 500au:
> 
> snow / # lspci -vvx -s 7.1
> 00:07.1 IDE interface: Contaq Microsystems 82c693 (prog-if 80 [ISA
> Compatibility mode-only controller, supports bus mastering])
> Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr+ Stepping- SERR- FastB2B- DisINTx-
> Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
> >TAbort- SERR-  Latency: 0
> Interrupt: pin A routed to IRQ 0
> Region 0: I/O ports at 01f0 [size=8]
> Region 1: I/O ports at 03f4
> Region 4: I/O ports at 9080 [size=16]
> Kernel driver in use: pata_cypress
> Kernel modules: pata_cypress
> 00: 80 10 93 c6 45 00 80 02 00 80 01 01 00 00 80 00
> 10: f1 01 00 00 f5 03 00 00 00 00 00 00 00 00 00 00
> 20: 81 90 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00
> 
> snow / # lspci -vvx -s 7.2
> 00:07.2 IDE interface: Contaq Microsystems 82c693 (prog-if 00 [ISA
> Compatibility mode-only controller])
> Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr+ Stepping- SERR- FastB2B- DisINTx-
> Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
> >TAbort- SERR-  Latency: 0
> Interrupt: pin B routed to IRQ 0
> Region 0: I/O ports at 0170 [size=8]
> Region 1: I/O ports at 0374
> Region 4: Memory at 0c24 (32-bit, non-prefetchable)
> [disabled] [size=64K]
> Kernel modules: pata_cypress
> 00: 80 10 93 c6 45 00 80 02 00 00 01 01 00 00 80 00
> 10: 71 01 00 00 75 03 00 00 00 00 00 00 00 00 00 00
> 20: 00 00 24 0c 00 00 00 00 00 00 00 00 00 00 00 00
> 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00

Thanks very much.

Could I also ask for the output of:

# lspci -vxxx -s 7.0

as well please - this will dump all 256 bytes for the ISA bridge, which
contains a bunch of configuration registers. Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig

2021-03-23 Thread Russell King - ARM Linux admin

On Mon, Mar 22, 2021 at 04:33:14PM +0100, Christoph Hellwig wrote:
> On Mon, Mar 22, 2021 at 04:18:23PM +0100, Christoph Hellwig wrote:
> > On Mon, Mar 22, 2021 at 03:15:03PM +, Russell King - ARM Linux admin 
> > wrote:
> > > It gets worse than that though - due to a change to remove
> > > pcibios_min_io from the generic code, moving it into the ARM
> > > architecture code, this has caused a regression that prevents the
> > > legacy resources being registered against the bus resource. So even
> > > if they are there, they cause probe failures. I haven't found a
> > > reasonable way to solve this yet, but until there is, there is no
> > > way that the PATA driver can be used as the "legacy mode" support
> > > is effectively done via the PCI code assigning virtual IO port
> > > resources.
> > > 
> > > I'm quite surprised that the CY82C693 even works on Alpha - I've
> > > asked for a lspci for that last week but nothing has yet been
> > > forthcoming from whoever responded to your patch for Alpha - so I
> > > can't compare what I'm seeing with what's happening with Alpha.
> > 
> > That sounds like something we could fix with a quirk for function 2
> > in the PCI resource assignment code.  Can you show what vendor and
> > device ID function 2 has so that I could try to come up with one?
> 
> Something like this:

That solves the problem for the IDE driver, which knows how to deal
with legacy mode, but not the PATA driver, which doesn't. The PATA
driver needs these resources.

As I say, having these resources presents a problem on ARM. A previous
commit (3c5d1699887b) changed the way the bus resources are setup which
results in /proc/ioports containing:

-000f : dma1
0020-003f : pic1
0060-006f : i8042
0070-0073 : rtc_cmos
  0070-0073 : rtc0
0080-008f : dma low page
00a0-00bf : pic2
00c0-00df : dma2
0213-0213 : ISAPnP
02f8-02ff : serial8250.0
  02f8-02ff : serial
03c0-03df : vga+
03f8-03ff : serial8250.0
  03f8-03ff : serial
0480-048f : dma high page
0a79-0a79 : isapnp write
1000- : PCI0 I/O
  1000-107f : :00:08.0
1000-107f : 3c59x
  1080-108f : :00:06.1
  1090-109f : :00:07.0
1090-109f : pata_it821x
  10a0-10a7 : :00:07.0
10a0-10a7 : pata_it821x
  10a8-10af : :00:07.0
10a8-10af : pata_it821x
  10b0-10b3 : :00:07.0
10b0-10b3 : pata_it821x
  10b4-10b7 : :00:07.0
10b4-10b7 : pata_it821x

The "PCI0 I/O" resource is the bus level resource, and the legacy
resources can not be claimed against that.

Without these resources, the PATA cypress driver doesn't work.

As I said previously, the reason this regression was not picked up
earlier is because I don't upgrade the kernel on this machine very
often; the machine has had uptimes into thousands of days.

I need to try reverting Rob's commit to find out if anything breaks
on this platform - it's completely wrong from a technical point of
view for any case where we have a PCI southbridge, since the
southbridge provides ISA based resources. I'm not entirely sure
what the point of it was, since we still have the PCIBIOS_MIN_IO
macro which still uses pcibios_min_io.

I'm looking at some of the other changes Rob made back at that time
which also look wrong, such as 8ef6e6201b26 which has the effect of
locating the 21285 IO resources to PCI address 0, over the top of
the ISA southbridge resources. I've no idea what Rob was thinking
when he removed the csrio allocation code in that commit, but
looking at it to day, it's soo obviously wrong even to a casual
glance.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig

2021-03-22 Thread Russell King - ARM Linux admin

On Mon, Mar 22, 2021 at 05:09:13PM +0100, John Paul Adrian Glaubitz wrote:
> On 3/22/21 4:15 PM, Russell King - ARM Linux admin wrote:
> > I'm quite surprised that the CY82C693 even works on Alpha - I've
> > asked for a lspci for that last week but nothing has yet been
> > forthcoming from whoever responded to your patch for Alpha - so I
> > can't compare what I'm seeing with what's happening with Alpha.
> 
> Here is lspci on my DEC Alpha XP-1000:
> 
> root@tsunami:~> lspci
> :00:07.0 ISA bridge: Contaq Microsystems 82c693
> :00:07.1 IDE interface: Contaq Microsystems 82c693
> :00:07.2 IDE interface: Contaq Microsystems 82c693
> :00:07.3 USB controller: Contaq Microsystems 82c693
> :00:0d.0 VGA compatible controller: Texas Instruments TVP4020 [Permedia 
> 2] (rev 01)
> 0001:01:03.0 Ethernet controller: Digital Equipment Corporation DECchip 
> 21142/43 (rev 41)
> 0001:01:06.0 SCSI storage controller: QLogic Corp. ISP1020 Fast-wide SCSI 
> (rev 06)
> 0001:01:08.0 PCI bridge: Digital Equipment Corporation DECchip 21152 (rev 03)
> 0001:02:09.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet 
> Controller (rev 05)
> root@tsunami:~>

This is no good. What I asked last Thursday was:

"Could you send me the output of lspci -vvx -s 7.1 and lspci -vvx -s 7.2
please?"

so I can see the resources the kernel is using and a dump of the PCI
config space to see what the hardware is using.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig

2021-03-22 Thread Russell King - ARM Linux admin

On Mon, Mar 22, 2021 at 04:18:23PM +0100, Christoph Hellwig wrote:
> On Mon, Mar 22, 2021 at 03:15:03PM +0000, Russell King - ARM Linux admin 
> wrote:
> > It gets worse than that though - due to a change to remove
> > pcibios_min_io from the generic code, moving it into the ARM
> > architecture code, this has caused a regression that prevents the
> > legacy resources being registered against the bus resource. So even
> > if they are there, they cause probe failures. I haven't found a
> > reasonable way to solve this yet, but until there is, there is no
> > way that the PATA driver can be used as the "legacy mode" support
> > is effectively done via the PCI code assigning virtual IO port
> > resources.
> > 
> > I'm quite surprised that the CY82C693 even works on Alpha - I've
> > asked for a lspci for that last week but nothing has yet been
> > forthcoming from whoever responded to your patch for Alpha - so I
> > can't compare what I'm seeing with what's happening with Alpha.
> 
> That sounds like something we could fix with a quirk for function 2
> in the PCI resource assignment code.  Can you show what vendor and
> device ID function 2 has so that I could try to come up with one?

I already have a quirk in arch/arm/kernel/bios32.c for this - but it
is no longer sufficient due to changes in the PCI layer, where much
of this is documented.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig

2021-03-22 Thread Russell King - ARM Linux admin

On Mon, Mar 22, 2021 at 03:54:03PM +0100, Christoph Hellwig wrote:
> On Fri, Mar 19, 2021 at 05:53:12PM +0000, Russell King - ARM Linux admin 
> wrote:
> > If I extend the arch/arm/kernel/bios32.c code to kill BARs 2/3 (which
> > actually are not present on the CY82C693) then the IDE driver works
> > for me, but the PATA driver does not:
> > 
> > cy82c693 :00:06.1: IDE controller (0x1080:0xc693 rev 0x00)
> > cy82c693 :00:06.1: not 100% native mode: will probe irqs later
> > legacy IDE will be removed in 2021, please switch to libata
> > Report any missing HW support to linux-...@vger.kernel.org
> > ide0: BM-DMA at 0x1080-0x1087
> > ide1: BM-DMA at 0x1088-0x108f
> > Probing IDE interface ide0...
> > hda: PIONEER DVD-RW DVR-105, ATAPI CD/DVD-ROM drive
> > hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4
> > ...
> > 
> > (unbind Cypress_IDE and try binding pata_cypress)
> > 
> > pata_cypress :00:06.1: no available native port
> 
> This comes from ata_pci_sff_init_host when it tails to initialize
> a port.  There are three cases why it can't initialize the port:
> 
>  1) because it is marked as dummy, which is the case for the second
> port of the cypress controller, but you're not using that even
> with the old ide driver, and we'd still not get that message just
> because of that second port.
>  2) when ata_resources_present returns false because the BAR has
> a zero start or length
>  3) because pcim_iomap_regions() fails.  This prints a warning to the
> log ("failed to request/iomap BARs for port %d (errno=%d)") that you
> should have seen
> 
> So the problem here has to be number two.  The legacy ide driver OTOH
> seems to lack a lot of these checks, although I'm not sure how it
> manages to actually work without those.
> 
> Can you show how the BAR assignment for the device looks using lscpi
> or a tool of your choice?

There's a big problem here. I have to explicitly zero the resources
(getting rid of the legacy ones assigned by the PCI probe code)
because they are in fact _wrong_ for the CY82C693. The PCI code
assumes that PCI function 1 (primary port) and PCI function 2
(secondary port) are two independent dual-channel IDE ports, and
as the PROG-IF of the class code indicates that all ports are in
legacy mode, the PCI code assigns the legacy ioport resources to
_both_ PCI functions. Essentially, the CY82C693 is a bit of an odd-ball
because it splits the two IDE ports across two functions rather than a
single function.

It gets worse than that though - due to a change to remove
pcibios_min_io from the generic code, moving it into the ARM
architecture code, this has caused a regression that prevents the
legacy resources being registered against the bus resource. So even
if they are there, they cause probe failures. I haven't found a
reasonable way to solve this yet, but until there is, there is no
way that the PATA driver can be used as the "legacy mode" support
is effectively done via the PCI code assigning virtual IO port
resources.

I'm quite surprised that the CY82C693 even works on Alpha - I've
asked for a lspci for that last week but nothing has yet been
forthcoming from whoever responded to your patch for Alpha - so I
can't compare what I'm seeing with what's happening with Alpha.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig

2021-03-19 Thread Russell King - ARM Linux admin

On Fri, Mar 19, 2021 at 05:07:53PM +, Russell King - ARM Linux admin wrote:
> On Thu, Mar 18, 2021 at 05:56:58AM +0100, Christoph Hellwig wrote:
> > footbridge_defconfig enables CONFIG_IDE but no actual host controller
> > driver, so just drop it.
> 
> I have been using the Cypress 82C693 IDE driver on Footbridge for a
> CD ROM drive, and I know it doesn't work with the PATA driver - as
> I need to disable BM DMA, otherwise the 82C693/DC21285 combination
> deadlocks the PCI bus. The PATA driver doesn't support disabling
> BM DMA without disabling it for all PATA ports, which is really
> annoying for my IT821x card in the same machine.
> 
> So, I'm rather stuck using the PATA driver for the HDDs and the
> IDE driver for the CD ROM.
> 
> That said, a commit a while back "cleaning up" the PCI layer appears
> to have totally shafted the 82C693, as the kernel tries to request
> IO resources at the legacy IDE addresses against the PCI bus resource
> which only covers 0x1000-0x. Hence, the 82C693 IDE ports are non-
> functional at the moment.
> 
> I'm debating about trying to find a fix to the PCI breakage that was
> introduced by "ARM: move PCI i/o resource setup into common code".
> 
> I hadn't noticed it because I don't use the CD ROM drive very often,
> and I don't upgrade the kernel that often either on the machine -
> but it has been running 24x7 for almost two decades.

Okay, a bit more on this...

If I extend the arch/arm/kernel/bios32.c code to kill BARs 2/3 (which
actually are not present on the CY82C693) then the IDE driver works
for me, but the PATA driver does not:

cy82c693 :00:06.1: IDE controller (0x1080:0xc693 rev 0x00)
cy82c693 :00:06.1: not 100% native mode: will probe irqs later
legacy IDE will be removed in 2021, please switch to libata
Report any missing HW support to linux-...@vger.kernel.org
ide0: BM-DMA at 0x1080-0x1087
ide1: BM-DMA at 0x1088-0x108f
Probing IDE interface ide0...
hda: PIONEER DVD-RW DVR-105, ATAPI CD/DVD-ROM drive
hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4
...

(unbind Cypress_IDE and try binding pata_cypress)

pata_cypress :00:06.1: no available native port


-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 02/10] ARM: disable CONFIG_IDE in footbridge_defconfig

2021-03-19 Thread Russell King - ARM Linux admin

On Thu, Mar 18, 2021 at 05:56:58AM +0100, Christoph Hellwig wrote:
> footbridge_defconfig enables CONFIG_IDE but no actual host controller
> driver, so just drop it.

I have been using the Cypress 82C693 IDE driver on Footbridge for a
CD ROM drive, and I know it doesn't work with the PATA driver - as
I need to disable BM DMA, otherwise the 82C693/DC21285 combination
deadlocks the PCI bus. The PATA driver doesn't support disabling
BM DMA without disabling it for all PATA ports, which is really
annoying for my IT821x card in the same machine.

So, I'm rather stuck using the PATA driver for the HDDs and the
IDE driver for the CD ROM.

That said, a commit a while back "cleaning up" the PCI layer appears
to have totally shafted the 82C693, as the kernel tries to request
IO resources at the legacy IDE addresses against the PCI bus resource
which only covers 0x1000-0x. Hence, the 82C693 IDE ports are non-
functional at the moment.

I'm debating about trying to find a fix to the PCI breakage that was
introduced by "ARM: move PCI i/o resource setup into common code".

I hadn't noticed it because I don't use the CD ROM drive very often,
and I don't upgrade the kernel that often either on the machine -
but it has been running 24x7 for almost two decades.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v12 01/14] ARM: mm: add missing pud_page define to 2-level page tables

2021-02-02 Thread Russell King - ARM Linux admin

On Tue, Feb 02, 2021 at 07:47:04PM +0800, Ding Tianhong wrote:
> On 2021/2/2 19:13, Russell King - ARM Linux admin wrote:
> > On Tue, Feb 02, 2021 at 09:05:02PM +1000, Nicholas Piggin wrote:
> >> diff --git a/arch/arm/include/asm/pgtable.h 
> >> b/arch/arm/include/asm/pgtable.h
> >> index c02f24400369..d63a5bb6bd0c 100644
> >> --- a/arch/arm/include/asm/pgtable.h
> >> +++ b/arch/arm/include/asm/pgtable.h
> >> @@ -166,6 +166,9 @@ extern struct page *empty_zero_page;
> >>  
> >>  extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
> >>  
> >> +#define pud_page(pud) pmd_page(__pmd(pud_val(pud)))
> >> +#define pud_write(pud)pmd_write(__pmd(pud_val(pud)))
> > 
> > As there is no PUD, does it really make sense to return a valid
> > struct page (which will be the PTE page) for pud_page(), which is
> > several tables above?
> > 
> --- a/arch/arm/include/asm/pgtable-2level.h
> +++ b/arch/arm/include/asm/pgtable-2level.h
> 
> +static inline int pud_none(pud_t pud)
> +{
> +  return 0;
> +}
> 
> I think it could be fix like this.

We already have that.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v12 01/14] ARM: mm: add missing pud_page define to 2-level page tables

2021-02-02 Thread Russell King - ARM Linux admin

On Tue, Feb 02, 2021 at 09:05:02PM +1000, Nicholas Piggin wrote:
> diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
> index c02f24400369..d63a5bb6bd0c 100644
> --- a/arch/arm/include/asm/pgtable.h
> +++ b/arch/arm/include/asm/pgtable.h
> @@ -166,6 +166,9 @@ extern struct page *empty_zero_page;
>  
>  extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
>  
> +#define pud_page(pud)pmd_page(__pmd(pud_val(pud)))
> +#define pud_write(pud)   pmd_write(__pmd(pud_val(pud)))

As there is no PUD, does it really make sense to return a valid
struct page (which will be the PTE page) for pud_page(), which is
several tables above?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()

2020-12-30 Thread Russell King - ARM Linux admin

On Wed, Dec 30, 2020 at 10:00:28AM +, Russell King - ARM Linux admin wrote:
> On Wed, Dec 30, 2020 at 12:33:02PM +1000, Nicholas Piggin wrote:
> > Excerpts from Russell King - ARM Linux admin's message of December 29, 2020 
> > 8:44 pm:
> > > On Tue, Dec 29, 2020 at 01:09:12PM +1000, Nicholas Piggin wrote:
> > >> I think it should certainly be documented in terms of what guarantees
> > >> it provides to application, _not_ the kinds of instructions it may or
> > >> may not induce the core to execute. And if existing API can't be
> > >> re-documented sanely, then deprecatd and new ones added that DTRT.
> > >> Possibly under a new system call, if arch's like ARM want a range
> > >> flush and we don't want to expand the multiplexing behaviour of
> > >> membarrier even more (sigh).
> > > 
> > > The 32-bit ARM sys_cacheflush() is there only to support self-modifying
> > > code, and takes whatever actions are necessary to support that.
> > > Exactly what actions it takes are cache implementation specific, and
> > > should be of no concern to the caller, but the underlying thing is...
> > > it's to support self-modifying code.
> > 
> >Caveat
> >cacheflush()  should  not  be used in programs intended to be 
> > portable.
> >On Linux, this call first appeared on the MIPS architecture, but  
> > nowa‐
> >days, Linux provides a cacheflush() system call on some other 
> > architec‐
> >tures, but with different arguments.
> > 
> > What a disaster. Another badly designed interface, although it didn't 
> > originate in Linux it sounds like we weren't to be outdone so
> > we messed it up even worse.
> > 
> > flushing caches is neither necessary nor sufficient for code modification
> > on many processors. Maybe some old MIPS specific private thing was fine,
> > but certainly before it grew to other architectures, somebody should 
> > have thought for more than 2 minutes about it. Sigh.
> 
> WARNING: You are bordering on being objectionable and offensive with
> that comment.
> 
> The ARM interface was designed by me back in the very early days of
> Linux, probably while you were still in dypers, based on what was
> known at the time.  Back in the early 2000s, ideas such as relaxed
> memory ordering were not known.  All there was was one level of
> harvard cache.

Sorry, I got that slightly wrong. Its origins on ARM were from 12 Dec
1998:

http://www.armlinux.org.uk/developer/patches/viewpatch.php?id=88/1

by Philip Blundell, and first appeared in the ARM
pre-patch-2.1.131-19981214-1.gz. It was subsequently documented in the
kernel sources by me in July 2001 in ARM patch-2.4.6-rmk2.gz. It has
a slightly different signature: the third argument on ARM is a flags
argument, whereas the MIPS code, it is some undocumented "cache"
parameter.

Whether the ARM version pre or post dates the MIPS code, I couldn't say.
Whether it was ultimately taken from the MIPS implementation, again, I
couldn't say.

However, please stop insulting your fellow developers ability to think.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()

2020-12-30 Thread Russell King - ARM Linux admin

On Wed, Dec 30, 2020 at 12:33:02PM +1000, Nicholas Piggin wrote:
> Excerpts from Russell King - ARM Linux admin's message of December 29, 2020 
> 8:44 pm:
> > On Tue, Dec 29, 2020 at 01:09:12PM +1000, Nicholas Piggin wrote:
> >> I think it should certainly be documented in terms of what guarantees
> >> it provides to application, _not_ the kinds of instructions it may or
> >> may not induce the core to execute. And if existing API can't be
> >> re-documented sanely, then deprecatd and new ones added that DTRT.
> >> Possibly under a new system call, if arch's like ARM want a range
> >> flush and we don't want to expand the multiplexing behaviour of
> >> membarrier even more (sigh).
> > 
> > The 32-bit ARM sys_cacheflush() is there only to support self-modifying
> > code, and takes whatever actions are necessary to support that.
> > Exactly what actions it takes are cache implementation specific, and
> > should be of no concern to the caller, but the underlying thing is...
> > it's to support self-modifying code.
> 
>Caveat
>cacheflush()  should  not  be used in programs intended to be portable.
>On Linux, this call first appeared on the MIPS architecture, but  nowa‐
>days, Linux provides a cacheflush() system call on some other architec‐
>tures, but with different arguments.
> 
> What a disaster. Another badly designed interface, although it didn't 
> originate in Linux it sounds like we weren't to be outdone so
> we messed it up even worse.
> 
> flushing caches is neither necessary nor sufficient for code modification
> on many processors. Maybe some old MIPS specific private thing was fine,
> but certainly before it grew to other architectures, somebody should 
> have thought for more than 2 minutes about it. Sigh.

WARNING: You are bordering on being objectionable and offensive with
that comment.

The ARM interface was designed by me back in the very early days of
Linux, probably while you were still in dypers, based on what was
known at the time.  Back in the early 2000s, ideas such as relaxed
memory ordering were not known.  All there was was one level of
harvard cache.

So, juts shut up with your insults.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()

2020-12-29 Thread Russell King - ARM Linux admin

On Tue, Dec 29, 2020 at 01:09:12PM +1000, Nicholas Piggin wrote:
> I think it should certainly be documented in terms of what guarantees
> it provides to application, _not_ the kinds of instructions it may or
> may not induce the core to execute. And if existing API can't be
> re-documented sanely, then deprecatd and new ones added that DTRT.
> Possibly under a new system call, if arch's like ARM want a range
> flush and we don't want to expand the multiplexing behaviour of
> membarrier even more (sigh).

The 32-bit ARM sys_cacheflush() is there only to support self-modifying
code, and takes whatever actions are necessary to support that.
Exactly what actions it takes are cache implementation specific, and
should be of no concern to the caller, but the underlying thing is...
it's to support self-modifying code.

Sadly, because it's existed for 20+ years, and it has historically been
sufficient for other purposes too, it has seen quite a bit of abuse
despite its design purpose not changing - it's been used by graphics
drivers for example. They quickly learnt the error of their ways with
ARMv6+, since it does not do sufficient for their purposes given the
cache architectures found there.

Let's not go around redesigning this after twenty odd years, requiring
a hell of a lot of pain to users. This interface is called by code
generated by GCC, so to change it you're looking at patching GCC as
well as the kernel, and you basically will make new programs
incompatible with older kernels - very bad news for users.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()

2020-12-28 Thread Russell King - ARM Linux admin

On Mon, Dec 28, 2020 at 11:44:33AM -0800, Andy Lutomirski wrote:
> On Mon, Dec 28, 2020 at 11:09 AM Russell King - ARM Linux admin
>  wrote:
> >
> > On Mon, Dec 28, 2020 at 07:29:34PM +0100, Jann Horn wrote:
> > > After chatting with rmk about this (but without claiming that any of
> > > this is his opinion), based on the manpage, I think membarrier()
> > > currently doesn't really claim to be synchronizing caches? It just
> > > serializes cores. So arguably if userspace wants to use membarrier()
> > > to synchronize code changes, userspace should first do the code
> > > change, then flush icache as appropriate for the architecture, and
> > > then do the membarrier() to ensure that the old code is unused?
> > >
> > > For 32-bit arm, rmk pointed out that that would be the cacheflush()
> > > syscall. That might cause you to end up with two IPIs instead of one
> > > in total, but we probably don't care _that_ much about extra IPIs on
> > > 32-bit arm?
> > >
> > > For arm64, I believe userspace can flush icache across the entire
> > > system with some instructions from userspace - "DC CVAU" followed by
> > > "DSB ISH", or something like that, I think? (See e.g.
> > > compat_arm_syscall(), the arm64 compat code that implements the 32-bit
> > > arm cacheflush() syscall.)
> >
> > Note that the ARM cacheflush syscall calls flush_icache_user_range()
> > over the range of addresses that userspace has passed - it's intention
> > since day one is to support cases where userspace wants to change
> > executable code.
> >
> > It will issue the appropriate write-backs to the data cache (DCCMVAU),
> > the invalidates to the instruction cache (ICIMVAU), invalidate the
> > branch target buffer (BPIALLIS or BPIALL as appropriate), and issue
> > the appropriate barriers (DSB ISHST, ISB).
> >
> > Note that neither flush_icache_user_range() nor flush_icache_range()
> > result in IPIs; cache operations are broadcast across all CPUs (which
> > is one of the minimums we require for SMP systems.)
> >
> > Now, that all said, I think the question that has to be asked is...
> >
> > What is the basic purpose of membarrier?
> >
> > Is the purpose of it to provide memory barriers, or is it to provide
> > memory coherence?
> >
> > If it's the former and not the latter, then cache flushes are out of
> > scope, and expecting memory written to be visible to the instruction
> > stream is totally out of scope of the membarrier interface, whether
> > or not the writes happen on the same or a different CPU to the one
> > executing the rewritten code.
> >
> > The documentation in the kernel does not seem to describe what it's
> > supposed to be doing - the only thing I could find is this:
> > Documentation/features/sched/membarrier-sync-core/arch-support.txt
> > which describes it as "arch supports core serializing membarrier"
> > whatever that means.
> >
> > Seems to be the standard and usual case of utterly poor to non-existent
> > documentation within the kernel tree, or even a pointer to where any
> > useful documentation can be found.
> >
> > Reading the membarrier(2) man page, I find nothing in there that talks
> > about any kind of cache coherency for self-modifying code - it only
> > seems to be about _barriers_ and nothing more, and barriers alone do
> > precisely nothing to save you from non-coherent Harvard caches.
> >
> > So, either Andy has a misunderstanding, or the man page is wrong, or
> > my rudimentary understanding of what membarrier is supposed to be
> > doing is wrong...
> 
> Look at the latest man page:
> 
> https://man7.org/linux/man-pages/man2/membarrier.2.html
> 
> for MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE.  The result may not be
> all that enlightening.

   MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE (since Linux 4.16)
  In  addition  to  providing  the  memory ordering guarantees de■
  scribed in MEMBARRIER_CMD_PRIVATE_EXPEDITED,  upon  return  from
  system call the calling thread has a guarantee that all its run■
  ning thread siblings have executed a core  serializing  instruc■
  tion.   This  guarantee is provided only for threads in the same
  process as the calling thread.

  The "expedited" commands complete faster than the  non-expedited
  ones,  they  never block, but have the downside of causing extra
  overhead.

  A process must register its intent to use the priv

Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()

2020-12-28 Thread Russell King - ARM Linux admin

On Mon, Dec 28, 2020 at 07:29:34PM +0100, Jann Horn wrote:
> After chatting with rmk about this (but without claiming that any of
> this is his opinion), based on the manpage, I think membarrier()
> currently doesn't really claim to be synchronizing caches? It just
> serializes cores. So arguably if userspace wants to use membarrier()
> to synchronize code changes, userspace should first do the code
> change, then flush icache as appropriate for the architecture, and
> then do the membarrier() to ensure that the old code is unused?
> 
> For 32-bit arm, rmk pointed out that that would be the cacheflush()
> syscall. That might cause you to end up with two IPIs instead of one
> in total, but we probably don't care _that_ much about extra IPIs on
> 32-bit arm?
> 
> For arm64, I believe userspace can flush icache across the entire
> system with some instructions from userspace - "DC CVAU" followed by
> "DSB ISH", or something like that, I think? (See e.g.
> compat_arm_syscall(), the arm64 compat code that implements the 32-bit
> arm cacheflush() syscall.)

Note that the ARM cacheflush syscall calls flush_icache_user_range()
over the range of addresses that userspace has passed - it's intention
since day one is to support cases where userspace wants to change
executable code.

It will issue the appropriate write-backs to the data cache (DCCMVAU),
the invalidates to the instruction cache (ICIMVAU), invalidate the
branch target buffer (BPIALLIS or BPIALL as appropriate), and issue
the appropriate barriers (DSB ISHST, ISB).

Note that neither flush_icache_user_range() nor flush_icache_range()
result in IPIs; cache operations are broadcast across all CPUs (which
is one of the minimums we require for SMP systems.)

Now, that all said, I think the question that has to be asked is...

What is the basic purpose of membarrier?

Is the purpose of it to provide memory barriers, or is it to provide
memory coherence?

If it's the former and not the latter, then cache flushes are out of
scope, and expecting memory written to be visible to the instruction
stream is totally out of scope of the membarrier interface, whether
or not the writes happen on the same or a different CPU to the one
executing the rewritten code.

The documentation in the kernel does not seem to describe what it's
supposed to be doing - the only thing I could find is this:
Documentation/features/sched/membarrier-sync-core/arch-support.txt
which describes it as "arch supports core serializing membarrier"
whatever that means.

Seems to be the standard and usual case of utterly poor to non-existent
documentation within the kernel tree, or even a pointer to where any
useful documentation can be found.

Reading the membarrier(2) man page, I find nothing in there that talks
about any kind of cache coherency for self-modifying code - it only
seems to be about _barriers_ and nothing more, and barriers alone do
precisely nothing to save you from non-coherent Harvard caches.

So, either Andy has a misunderstanding, or the man page is wrong, or
my rudimentary understanding of what membarrier is supposed to be
doing is wrong...

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()

2020-12-28 Thread Russell King - ARM Linux admin

On Mon, Dec 28, 2020 at 09:14:23AM -0800, Andy Lutomirski wrote:
> On Mon, Dec 28, 2020 at 2:25 AM Russell King - ARM Linux admin
>  wrote:
> >
> > On Sun, Dec 27, 2020 at 01:36:13PM -0800, Andy Lutomirski wrote:
> > > On Sun, Dec 27, 2020 at 12:18 PM Mathieu Desnoyers
> > >  wrote:
> > > >
> > > > - On Dec 27, 2020, at 1:28 PM, Andy Lutomirski l...@kernel.org 
> > > > wrote:
> > > >
> > >
> > > > >
> > > > > I admit that I'm rather surprised that the code worked at all on 
> > > > > arm64,
> > > > > and I'm suspicious that it has never been very well tested.  My 
> > > > > apologies
> > > > > for not reviewing this more carefully in the first place.
> > > >
> > > > Please refer to 
> > > > Documentation/features/sched/membarrier-sync-core/arch-support.txt
> > > >
> > > > It clearly states that only arm, arm64, powerpc and x86 support the 
> > > > membarrier
> > > > sync core feature as of now:
> > >
> > > Sigh, I missed arm (32).  Russell or ARM folks, what's the right
> > > incantation to make the CPU notice instruction changes initiated by
> > > other cores on 32-bit ARM?
> >
> > You need to call flush_icache_range(), since the changes need to be
> > flushed from the data cache to the point of unification (of the Harvard
> > I and D), and the instruction cache needs to be invalidated so it can
> > then see those updated instructions. This will also take care of the
> > necessary barriers that the CPU requires for you.
> 
> With what parameters?   From looking at the header, this is for the
> case in which the kernel writes some memory and then intends to
> execute it.  That's not what membarrier() does at all.  membarrier()
> works like this:

You didn't specify that you weren't looking at kernel memory.

If you're talking about userspace, then the interface you require
is flush_icache_user_range(), which does the same as
flush_icache_range() but takes userspace addresses. Note that this
requires that the memory is currently mapped at those userspace
addresses.

If that doesn't fit your needs, there isn't an interface to do what
you require, and it basically means creating something brand new
on every architecture.

What you are asking for is not "just a matter of a few instructions".
I have stated the required steps to achieve what you require above;
that is the minimum when you have non-snooping harvard caches, which
the majority of 32-bit ARMs have.

> User thread 1:
> 
> write to RWX memory *or* write to an RW alias of an X region.
> membarrier(...);
> somehow tell thread 2 that we're ready (with a store release, perhaps,
> or even just a relaxed store.)
> 
> User thread 2:
> 
> wait for the indication from thread 1.
> barrier();
> jump to the code.
> 
> membarrier() is, for better or for worse, not given a range of addresses.

Then, I'm sorry, it can't work on 32-bit ARM.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()

2020-12-28 Thread Russell King - ARM Linux admin

On Sun, Dec 27, 2020 at 01:36:13PM -0800, Andy Lutomirski wrote:
> On Sun, Dec 27, 2020 at 12:18 PM Mathieu Desnoyers
>  wrote:
> >
> > - On Dec 27, 2020, at 1:28 PM, Andy Lutomirski l...@kernel.org wrote:
> >
> 
> > >
> > > I admit that I'm rather surprised that the code worked at all on arm64,
> > > and I'm suspicious that it has never been very well tested.  My apologies
> > > for not reviewing this more carefully in the first place.
> >
> > Please refer to 
> > Documentation/features/sched/membarrier-sync-core/arch-support.txt
> >
> > It clearly states that only arm, arm64, powerpc and x86 support the 
> > membarrier
> > sync core feature as of now:
> 
> Sigh, I missed arm (32).  Russell or ARM folks, what's the right
> incantation to make the CPU notice instruction changes initiated by
> other cores on 32-bit ARM?

You need to call flush_icache_range(), since the changes need to be
flushed from the data cache to the point of unification (of the Harvard
I and D), and the instruction cache needs to be invalidated so it can
then see those updated instructions. This will also take care of the
necessary barriers that the CPU requires for you.

... as documented in Documentation/core-api/cachetlb.rst and so
should be available on every kernel supported CPU.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v2 1/3] module: Rename module_alloc() to text_alloc() and move to kernel proper

2020-07-14 Thread Russell King - ARM Linux admin

On Tue, Jul 14, 2020 at 03:01:09PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 14, 2020 at 03:19:24PM +0300, Ard Biesheuvel wrote:
> > So perhaps the answer is to have text_alloc() not with a 'where'
> > argument but with a 'why' argument. Or more simply, just have separate
> > alloc/free APIs for each case, with generic versions that can be
> > overridden by the architecture.
> 
> Well, there only seem to be 2 cases here, either the pointer needs to
> fit in some immediate displacement, or not.
> 
> On x86 we seem have the advantage of a fairly large immediate
> displacement as compared to many other architectures (due to our
> variable sized instructions). And thus have been fairly liberal with our
> usage of it (also our indirect jmps/calls suck, double so with
> RETCH-POLINE).
> 
> Still, the indirect jump, as mentioned by Russel should work for
> arbitrarily placed code for us too.
> 
> 
> So I'm thinking that something like:
> 
> enum ptr_type {
>   immediate_displacement,
>   absolute,
> };
> 
> void *text_alloc(unsigned long size, enum ptr_type type)
> {
>   unsigned long vstart = VMALLOC_START;
>   unsigned long vend   = VMALLOC_END;
> 
>   if (type == immediate_displacement) {
>   vstart = MODULES_VADDR;
>   vend   = MODULES_END;
>   }
> 
>   return __vmalloc_node_range(size, TEXT_ALIGN, vstart, vend,
>   GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
>   NUMA_NO_NODE, _RET_IP_);
> }
> 
> void text_free(void *ptr)
> {
>   vfree(ptr);
> }

Beware, however, that on 32-bit ARM, if module PLTs are enabled,
we will try to place the module in the module region (which gives
best performance) but if that allocation fails, we will fall back
to placing it in the vmalloc region and using PLTs.

So, for a module allocation, we would need to make up to two calls
to text_alloc(), once with "immediate_displacement", and if that
fails and we have PLT support enabled, again with "absolute".

Hence, as other people have said, why module_alloc() would need to
stay separate.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 1/3] module: Rename module_alloc() to text_alloc() and move to kernel proper

2020-07-14 Thread Russell King - ARM Linux admin

On Tue, Jul 14, 2020 at 01:17:22PM +0300, Ard Biesheuvel wrote:
> On Tue, 14 Jul 2020 at 12:53, Jarkko Sakkinen
>  wrote:
> >
> > On Mon, Jul 13, 2020 at 10:49:48PM +0300, Ard Biesheuvel wrote:
> > > This patch suggests that there are other reasons why conflating
> > > allocation of module space and allocating  text pages for other uses
> > > is a bad idea, but switching all users to text_alloc() is a step in
> > > the wrong direction. It would be better to stop using module_alloc()
> > > in core code except in the module loader, and have a generic
> > > text_alloc() that can be overridden by the arch if necessary. Note
> > > that x86  and s390 are the only architectures that use module_alloc()
> > > in ftrace code.
> >
> > This series essentially does this: introduces text_alloc() and
> > text_memfree(), which have generic implementations in kernel/text.c.
> > Those can be overriddent by arch specific implementations.
> >
> > What you think should be done differently than in my patch set?
> >
> 
> On arm64, module_alloc is only used by the module loader, and so
> pulling it out and renaming it will cause unused code to be
> incorporated into the kernel when building without module support,
> which is the use case you claim to be addressing.
> 
> Module_alloc has semantics that are intimately tied to the module
> loader, but over the years, it ended up being (ab)used by other
> subsystems, which don't require those semantics but just need n pages
> of vmalloc space with executable permissions.
> 
> So the correct approach is to make text_alloc() implement just that,
> generically, and switch bpf etc to use it. Then, only on architectures
> that need it, override it with an implementation that has the required
> additional semantics.
> 
> Refactoring 10+ architectures like this without any regard for how
> text_alloc() deviates from module_alloc() just creates a lot of churn
> that others will have to clean up after you.

For 32-bit ARM, our bpf code uses "blx/bx" (or equivalent code
sequences) rather than encoding a "bl" or "b", so BPF there doesn't
care where the executable memory is mapped, and doesn't need any
PLTs.  Given that, should bpf always allocate from the vmalloc()
region to preserve the module space for modules?

I'm more concerned about ftrace though, but only because I don't
have the understanding of that subsystem to really say whether there
are any side effects from having the allocations potentially be out
of range of a "bl" or "b" instruction.

If ftrace jumps both to and from the allocated page using a "load
address to register, branch to register" approach like BPF does, then
ftrace should be safe - and again, raises the issue that maybe it
should always come from vmalloc() space.

So, I think we need to keep module_alloc() for allocating memory for
modules, and provide a new text_alloc() for these other cases.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 1/3] module: Rename module_alloc() to text_alloc() and move to kernel proper

2020-07-14 Thread Russell King - ARM Linux admin

On Tue, Jul 14, 2020 at 12:49:28PM +0300, Jarkko Sakkinen wrote:
> On Mon, Jul 13, 2020 at 07:34:10PM +0100, Russell King - ARM Linux admin 
> wrote:
> > On Mon, Jul 13, 2020 at 09:19:37PM +0300, Jarkko Sakkinen wrote:
> > > Rename module_alloc() to text_alloc() and module_memfree() to
> > > text_memfree(), and move them to kernel/text.c, which is unconditionally
> > > compiled to the kernel proper. This allows kprobes, ftrace and bpf to
> > > allocate space for executable code without requiring to compile the 
> > > modules
> > > support (CONFIG_MODULES=y) in.
> > 
> > I'm not sure this is a good idea for 32-bit ARM.  The code you are
> > moving for 32-bit ARM is quite specific to module use in that it also
> > supports allocating from the vmalloc area, where the module code
> > knows to add PLT entries.
> > 
> > If the other proposed users of this text_alloc() do not have the logic
> > to add PLT entries when branches between kernel code and this
> > allocation are not reachable by, e.g. a 26-bit signed offset for 32-bit
> > ARM code, then this code is not suitable for that use.
> 
> My intention is to use this in kprobes code in the place of
> module_alloc().  I'm not sure why moving this code out of the module
> subsystem could possibly break anything.  Unfortunately I forgot to add
> covere letter to my series. Sending v2 with that to explain my use case
> for this.

Ah, so you're merely renaming module_alloc() to text_alloc() everywhere?
It sounded from the initial patch like you were also converting other
users to use module_alloc().

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v2 1/3] module: Rename module_alloc() to text_alloc() and move to kernel proper

2020-07-14 Thread Russell King - ARM Linux admin

On Tue, Jul 14, 2020 at 12:45:36PM +0300, Jarkko Sakkinen wrote:
> Rename module_alloc() to text_alloc() and module_memfree() to
> text_memfree(), and move them to kernel/text.c, which is unconditionally
> compiled to the kernel proper. This allows kprobes, ftrace and bpf to
> allocate space for executable code without requiring to compile the modules
> support (CONFIG_MODULES=y) in.

As I said in response to your first post of this, which seems to have
gone unacknowledged, I don't think this is a good idea.

So, NAK on the whole concept this without some discussion on the 32-bit
ARM issues.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 1/3] module: Rename module_alloc() to text_alloc() and move to kernel proper

2020-07-13 Thread Russell King - ARM Linux admin

On Mon, Jul 13, 2020 at 09:19:37PM +0300, Jarkko Sakkinen wrote:
> Rename module_alloc() to text_alloc() and module_memfree() to
> text_memfree(), and move them to kernel/text.c, which is unconditionally
> compiled to the kernel proper. This allows kprobes, ftrace and bpf to
> allocate space for executable code without requiring to compile the modules
> support (CONFIG_MODULES=y) in.

I'm not sure this is a good idea for 32-bit ARM.  The code you are
moving for 32-bit ARM is quite specific to module use in that it also
supports allocating from the vmalloc area, where the module code
knows to add PLT entries.

If the other proposed users of this text_alloc() do not have the logic
to add PLT entries when branches between kernel code and this
allocation are not reachable by, e.g. a 26-bit signed offset for 32-bit
ARM code, then this code is not suitable for that use.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH v2 5/5] uaccess: Rename user_access_begin/end() to user_full_access_begin/end()

2020-04-21 Thread Russell King - ARM Linux admin

On Tue, Apr 21, 2020 at 03:49:19AM +0100, Al Viro wrote:
>   The only source I'd been able to find speeks of >= 60 cycles
> (and possibly much more) for non-pipelined coprocessor instructions;
> the list of such does contain loads and stores to a bunch of registers.
> However, the register in question (p15/c3) has only store mentioned there,
> so loads might be cheap; no obvious reasons for those to be slow.
> That's a question to arm folks, I'm afraid...  rmk?

I have no information on that; instruction timings are not defined
at architecture level (architecture reference manual), nor do I find
information in the CPU technical reference manual (which would be
specific to the CPU). Instruction timings tend to be implementation
dependent.

I've always consulted Will Deacon when I've needed to know whether
an instruction is expensive or not.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up

Re: [PATCH] soc: fsl: dpio: avoid stack usage warning

2020-04-08 Thread Russell King - ARM Linux admin

On Wed, Apr 08, 2020 at 08:58:16PM +0200, Arnd Bergmann wrote:
> A 1024 byte variable on the stack will warn on any 32-bit architecture
> during compile-testing, and is generally a bad idea anyway:
> 
> fsl/dpio/dpio-service.c: In function 
> 'dpaa2_io_service_enqueue_multiple_desc_fq':
> fsl/dpio/dpio-service.c:495:1: error: the frame size of 1032 bytes is larger 
> than 1024 bytes [-Werror=frame-larger-than=]
> 
> There are currently no callers of this function, so I cannot tell whether
> dynamic memory allocation is allowed once callers are added. Change
> it to kcalloc for now, if anyone gets a warning about calling this in
> atomic context after they start using it, they can fix it later.
> 
> Fixes: 9d98809711ae ("soc: fsl: dpio: Adding QMAN multiple enqueue interface")
> Signed-off-by: Arnd Bergmann 
> ---
>  drivers/soc/fsl/dpio/dpio-service.c | 18 +-
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/soc/fsl/dpio/dpio-service.c 
> b/drivers/soc/fsl/dpio/dpio-service.c
> index cd4f6410e8c2..ff0ef8cbdbff 100644
> --- a/drivers/soc/fsl/dpio/dpio-service.c
> +++ b/drivers/soc/fsl/dpio/dpio-service.c
> @@ -478,12 +478,17 @@ int dpaa2_io_service_enqueue_multiple_desc_fq(struct 
> dpaa2_io *d,
>   const struct dpaa2_fd *fd,
>   int nb)
>  {
> - int i;
> - struct qbman_eq_desc ed[32];
> + struct qbman_eq_desc *ed = kcalloc(sizeof(struct qbman_eq_desc), 32, 
> GFP_KERNEL);

I think you need to rearrange this to be more compliant with the coding
style.

> + int i, ret;
> +
> + if (!ed)
> + return -ENOMEM;
>  
>   d = service_select(d);
> - if (!d)
> - return -ENODEV;
> + if (!d) {
> + ret = -ENODEV;
> + goto out;
> + }
>  
>   for (i = 0; i < nb; i++) {
>   qbman_eq_desc_clear([i]);
> @@ -491,7 +496,10 @@ int dpaa2_io_service_enqueue_multiple_desc_fq(struct 
> dpaa2_io *d,
>   qbman_eq_desc_set_fq([i], fqid[i]);
>   }
>  
> - return qbman_swp_enqueue_multiple_desc(d->swp, [0], fd, nb);
> + ret = qbman_swp_enqueue_multiple_desc(d->swp, [0], fd, nb);
> +out:
> + kfree(ed);
> + return ret;
>  }
>  EXPORT_SYMBOL(dpaa2_io_service_enqueue_multiple_desc_fq);
>  
> -- 
> 2.26.0
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up

Re: decruft the vmalloc API

2020-04-08 Thread Russell King - ARM Linux admin

On Wed, Apr 08, 2020 at 01:58:58PM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> Peter noticed that with some dumb luck you can toast the kernel address
> space with exported vmalloc symbols.
> 
> I used this as an opportunity to decruft the vmalloc.c API and make it
> much more systematic.  This also removes any chance to create vmalloc
> mappings outside the designated areas or using executable permissions
> from modules.  Besides that it removes more than 300 lines of code.

I haven't read all your patches yet.

Have you tested it on 32-bit ARM, where the module area is located
_below_ PAGE_OFFSET and outside of the vmalloc area?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up

Re: [PATCH RESEND 1/4] uaccess: Add user_read_access_begin/end and user_write_access_begin/end

2020-04-03 Thread Russell King - ARM Linux admin

On Fri, Apr 03, 2020 at 12:26:10PM +0100, Catalin Marinas wrote:
> On Fri, Apr 03, 2020 at 01:58:31AM +0100, Al Viro wrote:
> > On Thu, Apr 02, 2020 at 11:35:57AM -0700, Kees Cook wrote:
> > > Yup, I think it's a weakness of the ARM implementation and I'd like to
> > > not extend it further. AFAIK we should never nest, but I would not be
> > > surprised at all if we did.
> > > 
> > > If we were looking at a design goal for all architectures, I'd like
> > > to be doing what the public PaX patchset did for their memory access
> > > switching, which is to alarm if calling into "enable" found the access
> > > already enabled, etc. Such a condition would show an unexpected nesting
> > > (like we've seen with similar constructs with set_fs() not getting reset
> > > during an exception handler, etc etc).
> > 
> > FWIW, maybe I'm misreading the ARM uaccess logics, but... it smells like
> > KERNEL_DS is somewhat more dangerous there than on e.g. x86.
> > 
> > Look: with CONFIG_CPU_DOMAINS, set_fs(KERNEL_DS) tells MMU to ignore
> > per-page permission bits in DOMAIN_KERNEL (i.e. for kernel address
> > ranges), allowing them even if they would normally be denied.  We need
> > that for actual uaccess loads/stores, since those use insns that pretend
> > to be done in user mode and we want them to access the kernel pages.
> > But that affects the normal loads/stores as well; unless I'm misreading
> > that code, it will ignore (supervisor) r/o on a page.  And that's not
> > just for the code inside the uaccess blocks; *everything* done under
> > KERNEL_DS is subject to that.
> 
> That's correct. Luckily this only affects ARMv5 and earlier. From ARMv6
> onwards, CONFIG_CPU_USE_DOMAINS is no longer selected and the uaccess
> instructions are just plain ldr/str.
> 
> Russell should know the details on whether there was much choice. Since
> the kernel was living in the linear map with full rwx permissions, the
> KERNEL_DS overriding was probably not a concern and the ldrt/strt for
> uaccess deemed more secure. We also have weird permission setting
> pre-ARMv6 (or rather v6k) where RO user pages are writable from the
> kernel with standard str instructions (breaking CoW). I don't recall
> whether it was a choice made by the kernel or something the architecture
> enforced. The vectors page has to be kernel writable (and user RO) to
> store the TLS value in the absence of a TLS register but maybe we could
> do this via the linear alias together with the appropriate cache
> maintenance.
> 
> From ARMv6, the domain overriding had the side-effect of ignoring the XN
> bit and causing random instruction fetches from ioremap() areas. So we
> had to remove the domain switching. We also gained a dedicated TLS
> register.

Indeed.  On pre-ARMv6, we have the following choices for protection
attributes:

Page tables Control Reg Privileged  User
AP  S,R permission  permission
00  0,0 No access   No access
00  1,0 Read-only   No access
00  0,1 Read-only   Read-only
00  1,1 Unpredictable   Unpredictable
01  X,X Read/Write  No access
10  X,X Read/Write  Read-only
11  X,X Read/Write  Read/Write

We use S,R=1,0 under Linux because this allows us to read-protect
kernel pages without making them visible to userspace.  If we
changed to S,R=0,1, then we could have our read-only permissions
for both kernel and userspace, drop domain switching, and use the
plain LDR/STR instructions, but we then lose the ability to
write-protect module executable code and other parts of kernel
space without making them visible to userspace.

So, it essentially boils down to making a choice - which set of
security features we think are the most important.

> I think uaccess_enable() could indeed switch the kernel domain if
> KERNEL_DS is set and move this out of set_fs(). It would reduce the
> window the kernel domain permissions are overridden. Anyway,
> uaccess_enable() appeared much later on arm when Russell introduced PAN
> (SMAP) like support by switching the user domain.

Yes, that would be a possibility.  Another possibility would be to
eliminate as much usage of KERNEL_DS as possible - I've just found
one instance in sys_oabi-compat.c that can be eliminated (epoll_ctl)
but there's several there that can't with the current code structure,
and re-coding the contents of some fs/* functions to work around that
is a very bad idea.  If there's some scope for rejigging some of the
fs/* code, it may be possible to elimate some other cases in there.

I notice that the fs/* code seems like some of the last remaining
users of KERNEL_DS, although I suspect that some aren't possible to
eliminate. :(

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps

Re: [PATCH RESEND 1/4] uaccess: Add user_read_access_begin/end and user_write_access_begin/end

2020-04-03 Thread Russell King - ARM Linux admin

On Thu, Apr 02, 2020 at 11:35:57AM -0700, Kees Cook wrote:
> On Thu, Apr 02, 2020 at 06:50:32PM +0100, Al Viro wrote:
> > On Thu, Apr 02, 2020 at 07:03:28PM +0200, Christophe Leroy wrote:
> > 
> > > user_access_begin() grants both read and write.
> > > 
> > > This patch adds user_read_access_begin() and user_write_access_begin() but
> > > it doesn't remove user_access_begin()
> > 
> > Ouch...  So the most generic name is for the rarest case?
> >  
> > > > What should we do about that?  Do we prohibit such blocks outside
> > > > of arch?
> > > > 
> > > > What should we do about arm and s390?  There we want a cookie passed
> > > > from beginning of block to its end; should that be a return value?
> > > 
> > > That was the way I implemented it in January, see
> > > https://patchwork.ozlabs.org/patch/1227926/
> > > 
> > > There was some discussion around that and most noticeable was:
> > > 
> > > H. Peter (hpa) said about it: "I have *deep* concern with carrying state 
> > > in
> > > a "key" variable: it's a direct attack vector for a crowbar attack,
> > > especially since it is by definition live inside a user access region."
> > 
> > > This patch minimises the change by just adding user_read_access_begin() 
> > > and
> > > user_write_access_begin() keeping the same parameters as the existing
> > > user_access_begin().
> > 
> > Umm...  What about the arm situation?  The same concerns would apply there,
> > wouldn't they?  Currently we have
> > static __always_inline unsigned int uaccess_save_and_enable(void)
> > {
> > #ifdef CONFIG_CPU_SW_DOMAIN_PAN
> > unsigned int old_domain = get_domain();
> > 
> > /* Set the current domain access to permit user accesses */
> > set_domain((old_domain & ~domain_mask(DOMAIN_USER)) |
> >domain_val(DOMAIN_USER, DOMAIN_CLIENT));
> > 
> > return old_domain;
> > #else
> > return 0;
> > #endif
> > }
> > and
> > static __always_inline void uaccess_restore(unsigned int flags)
> > {
> > #ifdef CONFIG_CPU_SW_DOMAIN_PAN
> > /* Restore the user access mask */
> > set_domain(flags);
> > #endif
> > }
> > 
> > How much do we need nesting on those, anyway?  rmk?

It's that way because it's easy, logical, and actually *more* efficient
to do it that way, rather than read-modify-write the domain register
each time we want to change it.

> Yup, I think it's a weakness of the ARM implementation and I'd like to
> not extend it further. AFAIK we should never nest, but I would not be
> surprised at all if we did.

There is one known nesting, which is __clear_user() when used with
the (IMHO horrid and I don't care about) UACCESS_WITH_MEMCPY feature.
That's not intentional however.

When I introduced this on ARM, the placement I adopted was to locate
it _as close as sanely possible_ to the userspace access so we
minimised the kernel accesses, so we minimise the number of accesses
that could go stray because of the domain issue - we ideally only
want the access done by the accessor itself to be affected, which
we achieve for most accesses.

Thinking laterally, maybe we should get rid of the whole KERNEL_DS
stuff entirely, and come up with an alternative way of handling the
kernel-wants-to-access-kernelspace-via-user-accessors problem.
Such as, copying some data back to userspace memory?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up

Re: [PATCH RESEND 1/4] uaccess: Add user_read_access_begin/end and user_write_access_begin/end

2020-04-03 Thread Russell King - ARM Linux admin

On Fri, Apr 03, 2020 at 01:58:31AM +0100, Al Viro wrote:
> On Thu, Apr 02, 2020 at 11:35:57AM -0700, Kees Cook wrote:
> 
> > Yup, I think it's a weakness of the ARM implementation and I'd like to
> > not extend it further. AFAIK we should never nest, but I would not be
> > surprised at all if we did.
> > 
> > If we were looking at a design goal for all architectures, I'd like
> > to be doing what the public PaX patchset did for their memory access
> > switching, which is to alarm if calling into "enable" found the access
> > already enabled, etc. Such a condition would show an unexpected nesting
> > (like we've seen with similar constructs with set_fs() not getting reset
> > during an exception handler, etc etc).
> 
> FWIW, maybe I'm misreading the ARM uaccess logics, but... it smells like
> KERNEL_DS is somewhat more dangerous there than on e.g. x86.
> 
> Look: with CONFIG_CPU_DOMAINS, set_fs(KERNEL_DS) tells MMU to ignore
> per-page permission bits in DOMAIN_KERNEL (i.e. for kernel address
> ranges), allowing them even if they would normally be denied.  We need
> that for actual uaccess loads/stores, since those use insns that pretend
> to be done in user mode and we want them to access the kernel pages.
> But that affects the normal loads/stores as well; unless I'm misreading
> that code, it will ignore (supervisor) r/o on a page.  And that's not
> just for the code inside the uaccess blocks; *everything* done under
> KERNEL_DS is subject to that.
> 
> Why do we do that (modify_domain(), that is) inside set_fs() and not
> in uaccess_enable() et.al.?

First, CONFIG_CPU_DOMAINS is used on older ARMs, not ARMv7. Second,
the kernel image itself is not RO-protected on any ARM32 platform.

If we get rid of CONFIG_CPU_DOMAINS, we will use the ARMv7 method of
user access, which is to use normal load/stores for the user accessors
and every access must check against the address limit, even the
__-accessors.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up

Re: [PATCH v2 00/13] mm: remove __ARCH_HAS_5LEVEL_HACK

2020-02-16 Thread Russell King - ARM Linux admin

On Sun, Feb 16, 2020 at 10:18:30AM +0200, Mike Rapoport wrote:
> From: Mike Rapoport 
> 
> Hi,
> 
> These patches convert several architectures to use page table folding and
> remove __ARCH_HAS_5LEVEL_HACK along with include/asm-generic/5level-fixup.h.
> 
> The changes are mostly about mechanical replacement of pgd accessors with p4d
> ones and the addition of higher levels to page table traversals.
> 
> All the patches were sent separately to the respective arch lists and
> maintainers hence the "v2" prefix.

You fail to explain why this change which adds 488 additional lines of
code is desirable.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH V13] mm/debug: Add tests validating architecture page table helpers

2020-02-11 Thread Russell King - ARM Linux admin

On Tue, Feb 11, 2020 at 06:33:47AM +0100, Christophe Leroy wrote:
> 
> 
> Le 11/02/2020 à 03:25, Anshuman Khandual a écrit :
> > 
> > 
> > On 02/10/2020 04:36 PM, Russell King - ARM Linux admin wrote:
> > > There are good reasons for the way ARM does stuff.  The generic crap was
> > > written without regard for the circumstances that ARM has, and thus is
> > > entirely unsuitable for 32-bit ARM.
> > 
> > Since we dont have an agreement here, lets just settle with disabling the
> > test for now on platforms where the build fails. CONFIG_EXPERT is enabling
> > this test for better adaptability and coverage, hence how about re framing
> > the config like this ? This at the least conveys the fact that EXPERT only
> > works when platform is neither IA64 or ARM.
> 
> Agreed
> 
> > 
> > config DEBUG_VM_PGTABLE
> > bool "Debug arch page table for semantics compliance"
> > depends on MMU
> > depends on ARCH_HAS_DEBUG_VM_PGTABLE || (EXPERT &&  !(IA64 || ARM))
> 
> I think it's maybe better to have a dedicated depends line:
> 
> depends on !IA64 && !ARM
> depends on ARCH_HAS_DEBUG_VM_PGTABLE || EXPERT
> 
> The day arm and/or ia64 is ready for building the test, we can remove that
> depends.

Never going to happen as its technically infeasible, sorry.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH V13] mm/debug: Add tests validating architecture page table helpers

2020-02-10 Thread Russell King - ARM Linux admin

On Mon, Feb 10, 2020 at 11:46:23AM +0100, Christophe Leroy wrote:
> 
> 
> Le 10/02/2020 à 11:02, Russell King - ARM Linux admin a écrit :
> > On Mon, Feb 10, 2020 at 07:38:38AM +0100, Christophe Leroy wrote:
> > > 
> > > 
> > > Le 10/02/2020 à 06:35, Anshuman Khandual a écrit :
> > > > 
> > > > 
> > > > On 02/10/2020 10:22 AM, Andrew Morton wrote:
> > > > > On Thu, 6 Feb 2020 13:49:35 +0530 Anshuman Khandual 
> > > > >  wrote:
> > > > > 
> > > > > > 
> > > > > > On 02/06/2020 04:40 AM, kbuild test robot wrote:
> > > > > > > Hi Anshuman,
> > > > > > > 
> > > > > > > Thank you for the patch! Yet something to improve:
> > > > > > > 
> > > > > > > [auto build test ERROR on powerpc/next]
> > > > > > > [also build test ERROR on s390/features linus/master arc/for-next 
> > > > > > > v5.5]
> > > > > > > [cannot apply to mmotm/master tip/x86/core arm64/for-next/core 
> > > > > > > next-20200205]
> > > > > > > [if your patch is applied to the wrong git tree, please drop us a 
> > > > > > > note to help
> > > > > > > improve the system. BTW, we also suggest to use '--base' option 
> > > > > > > to specify the
> > > > > > > base tree in git format-patch, please see 
> > > > > > > https://stackoverflow.com/a/37406982]
> > > > > > > 
> > > > > > > url:
> > > > > > > https://github.com/0day-ci/linux/commits/Anshuman-Khandual/mm-debug-Add-tests-validating-architecture-page-table-helpers/20200205-215507
> > > > > > > base:   
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
> > > > > > > next
> > > > > > > config: ia64-allmodconfig (attached as .config)
> > > > > > > compiler: ia64-linux-gcc (GCC) 7.5.0
> > > > > > > reproduce:
> > > > > > >   wget 
> > > > > > > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross
> > > > > > >  -O ~/bin/make.cross
> > > > > > >   chmod +x ~/bin/make.cross
> > > > > > >   # save the attached .config to linux build tree
> > > > > > >   GCC_VERSION=7.5.0 make.cross ARCH=ia64
> > > > > > > 
> > > > > > > If you fix the issue, kindly add following tag
> > > > > > > Reported-by: kbuild test robot 
> > > > > > > 
> > > > > > > All error/warnings (new ones prefixed by >>):
> > > > > > > 
> > > > > > >  In file included from 
> > > > > > > include/asm-generic/pgtable-nopud.h:8:0,
> > > > > > >   from arch/ia64/include/asm/pgtable.h:586,
> > > > > > >   from include/linux/mm.h:99,
> > > > > > >   from include/linux/highmem.h:8,
> > > > > > >   from mm/debug_vm_pgtable.c:14:
> > > > > > >  mm/debug_vm_pgtable.c: In function 'pud_clear_tests':
> > > > > > > > > include/asm-generic/pgtable-nop4d-hack.h:47:32: error: 
> > > > > > > > > implicit declaration of function '__pgd'; did you mean 
> > > > > > > > > '__p4d'? [-Werror=implicit-function-declaration]
> > > > > > >   #define __pud(x)((pud_t) { __pgd(x) })
> > > > > > >  ^
> > > > > > > > > mm/debug_vm_pgtable.c:141:8: note: in expansion of macro 
> > > > > > > > > '__pud'
> > > > > > >pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
> > > > > > >  ^
> > > > > > > > > include/asm-generic/pgtable-nop4d-hack.h:47:22: warning: 
> > > > > > > > > missing braces around initializer [-Wmissing-braces]
> > > > > > >   #define __pud(x)((pud_t) { __pgd(x) })
> > > > > > >^
> > > > > > > > > mm/debug_vm_pgtable.c:141:8: note: in expansion of macro 
> > > > > > > > > '__pud'
>

Re: [PATCH V13] mm/debug: Add tests validating architecture page table helpers

2020-02-10 Thread Russell King - ARM Linux admin

On Mon, Feb 10, 2020 at 07:38:38AM +0100, Christophe Leroy wrote:
> 
> 
> Le 10/02/2020 à 06:35, Anshuman Khandual a écrit :
> > 
> > 
> > On 02/10/2020 10:22 AM, Andrew Morton wrote:
> > > On Thu, 6 Feb 2020 13:49:35 +0530 Anshuman Khandual 
> > >  wrote:
> > > 
> > > > 
> > > > On 02/06/2020 04:40 AM, kbuild test robot wrote:
> > > > > Hi Anshuman,
> > > > > 
> > > > > Thank you for the patch! Yet something to improve:
> > > > > 
> > > > > [auto build test ERROR on powerpc/next]
> > > > > [also build test ERROR on s390/features linus/master arc/for-next 
> > > > > v5.5]
> > > > > [cannot apply to mmotm/master tip/x86/core arm64/for-next/core 
> > > > > next-20200205]
> > > > > [if your patch is applied to the wrong git tree, please drop us a 
> > > > > note to help
> > > > > improve the system. BTW, we also suggest to use '--base' option to 
> > > > > specify the
> > > > > base tree in git format-patch, please see 
> > > > > https://stackoverflow.com/a/37406982]
> > > > > 
> > > > > url:
> > > > > https://github.com/0day-ci/linux/commits/Anshuman-Khandual/mm-debug-Add-tests-validating-architecture-page-table-helpers/20200205-215507
> > > > > base:   
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
> > > > > config: ia64-allmodconfig (attached as .config)
> > > > > compiler: ia64-linux-gcc (GCC) 7.5.0
> > > > > reproduce:
> > > > >  wget 
> > > > > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross
> > > > >  -O ~/bin/make.cross
> > > > >  chmod +x ~/bin/make.cross
> > > > >  # save the attached .config to linux build tree
> > > > >  GCC_VERSION=7.5.0 make.cross ARCH=ia64
> > > > > 
> > > > > If you fix the issue, kindly add following tag
> > > > > Reported-by: kbuild test robot 
> > > > > 
> > > > > All error/warnings (new ones prefixed by >>):
> > > > > 
> > > > > In file included from include/asm-generic/pgtable-nopud.h:8:0,
> > > > >  from arch/ia64/include/asm/pgtable.h:586,
> > > > >  from include/linux/mm.h:99,
> > > > >  from include/linux/highmem.h:8,
> > > > >  from mm/debug_vm_pgtable.c:14:
> > > > > mm/debug_vm_pgtable.c: In function 'pud_clear_tests':
> > > > > > > include/asm-generic/pgtable-nop4d-hack.h:47:32: error: implicit 
> > > > > > > declaration of function '__pgd'; did you mean '__p4d'? 
> > > > > > > [-Werror=implicit-function-declaration]
> > > > >  #define __pud(x)((pud_t) { __pgd(x) })
> > > > > ^
> > > > > > > mm/debug_vm_pgtable.c:141:8: note: in expansion of macro '__pud'
> > > > >   pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
> > > > > ^
> > > > > > > include/asm-generic/pgtable-nop4d-hack.h:47:22: warning: missing 
> > > > > > > braces around initializer [-Wmissing-braces]
> > > > >  #define __pud(x)((pud_t) { __pgd(x) })
> > > > >   ^
> > > > > > > mm/debug_vm_pgtable.c:141:8: note: in expansion of macro '__pud'
> > > > >   pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
> > > > > ^
> > > > > cc1: some warnings being treated as errors
> > > > 
> > > > This build failure is expected now given that we have allowed 
> > > > DEBUG_VM_PGTABLE
> > > > with EXPERT without platform requiring ARCH_HAS_DEBUG_VM_PGTABLE. This 
> > > > problem
> > > > i.e build failure caused without a platform __pgd(), is known to exist 
> > > > both on
> > > > ia64 and arm (32bit) platforms. Please refer 
> > > > https://lkml.org/lkml/2019/9/24/314
> > > > for details where this was discussed earlier.
> > > > 
> > > 
> > > I'd prefer not to merge a patch which is known to cause build
> > > regressions.  Is there some temporary thing we can do to prevent these
> > > errors until arch maintainers(?) get around to implementing the
> > > long-term fixes?
> > 
> > We could explicitly disable CONFIG_DEBUG_VM_PGTABLE on ia64 and arm 
> > platforms
> > which will ensure that others can still use the EXPERT path.
> > 
> > config DEBUG_VM_PGTABLE
> > bool "Debug arch page table for semantics compliance"
> > depends on MMU
> > depends on !(IA64 || ARM)
> > depends on ARCH_HAS_DEBUG_VM_PGTABLE || EXPERT
> > default n if !ARCH_HAS_DEBUG_VM_PGTABLE
> > default y if DEBUG_VM
> > 
> 
> On both ia32 and arm, the fix is trivial.
> 
> Can we include the fix within this patch, just the same way as the
> mm_p4d_folded() fix for x86 ?

Why should arm include a macro for something that nothing (apart from
this checker) requires?  If the checker requires it but the rest of
the kernel does not, it suggests that the checker isn't actually
correct, and the results can't be relied upon.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH 00/50] Add log level to show_stack()

2019-11-08 Thread Russell King - ARM Linux admin

On Fri, Nov 08, 2019 at 04:28:30PM +, Dmitry Safonov wrote:
> On 11/6/19 8:34 PM, Peter Zijlstra wrote:
> > On Wed, Nov 06, 2019 at 04:27:33PM +, Dmitry Safonov wrote:
> [..]
> >> Sorry, I should have tried to describe better.
> >>
> >> I'm trying to remove external users of console_loglevel by following
> >> reasons:
> >
> > I suppose since all my machines have 'debug ignore_loglevel
> > earlyprintk=serial,ttyS0,115200 console=ttyS0,115200' I don't have this
> > experience.
> 
> Yeah, I remember you avoid all those functionalities of printk(), fair
> enough. On the other side, regular users and I'm betting most of
> the non-tuned distributions use /proc/sys/kernel/printk by default.
> (Checking on my Arch & Fedora - loglevel 4 from the box)
> 
> >> - changing console_loglevel on SMP means that unwanted messages from
> >> other CPUs will appear (that have lower log level)
> >> - on UMP unwanted messages may appear if the code is preempted while it
> >> hasn't set the console_loglevel back to old
> >> - rising console_loglevel to print wanted message(s) may not work at all
> >> if printk() has being delayed and the console_loglevel is already set
> >> back to old value
> >
> > Sure, frobbing the global console_loglevel is bad.
> >
> >> I also have patches in wip those needs to print backtrace with specific
> >> loglevel (higher when it's critical, lower when it's notice and
> >> shouldn't go to serial console).
> >
> > (everything always should go to serial, serial is awesome :-)
> 
> Personally I agree. Unfortunately, here @Arista there are switches (I'm
> speaking about the order of thousands at least) those have baud-rate 9600.
> It's a bit expensive being elaborate with such setup.
> 
> >> Besides on local tests I see hits those have headers (messages like
> >> "Backtrace: ") without an actual backtrace and the reverse - a backtrace
> >> without a reason for it. It's quite annoying and worth addressing by
> >> syncing headers log levels to backtraces.
> >
> > I suppose I'm surprised there are backtraces that are not important.
> > Either badness happened and it needs printing, or the user asked for it
> > and it needs printing.
> >
> > Perhaps we should be removing backtraces if they're not important
> > instead of allowing to print them as lower loglevels?
> 
> Well, the use-case for lower log-level is that everything goes into logs
> (/var/log/dmesg or /var/log/messages whatever rsyslog has settting).
> 
> That has it's value:
> - after a failure (i.e. panic) messages, those were only signs that
> something goes wrong can be seen in logs which can give ideas what has
> happened.

No they don't.  When the kernel panics, userspace generally stops
running, so rsyslog won't be able to write them to /var/log/messages.

How, by "kernel panics" I mean a real kernel panic, which probably
isn't what you're talking about there.  You are probably talking
about the whole shebang of non-fatal kernel oops, kernel warnings
and the like.  If so, I'd ask you to stop confuzzilating terminology.

If you really want to capture such events, then you need to have the
kernel write the panic to (e.g.) flash - see the mtdoops driver.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH 00/50] Add log level to show_stack()

2019-11-06 Thread Russell King - ARM Linux admin

On Wed, Nov 06, 2019 at 09:34:40PM +0100, Peter Zijlstra wrote:
> I suppose I'm surprised there are backtraces that are not important.
> Either badness happened and it needs printing, or the user asked for it
> and it needs printing.

Or utterly meaningless.

> Perhaps we should be removing backtraces if they're not important
> instead of allowing to print them as lower loglevels?

Definitely!  WARN_ON() is well overused - and as is typical, used
without much thought.  Bound to happen after Linus got shirty about
BUG_ON() being over used.  Everyone just grabbed the next nearest thing
to assert().

As a kind of example, I've recently come across one WARN_ON() in a
driver subsystem (that shall remain nameless at the moment) which very
likely has multiple different devices on a platform.  The WARN_ON()
triggers as a result of a problem with the hardware, but because it's a
WARN_ON(), you've no idea which device has a problem.  The backtrace is
mostly meaningless.  So you know that a problem has occurred, but the
kernel prints *useless* backtrace to let you know, and totally omits
the *useful* information.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: Onboard SD card doesn't work anymore after the 'mmc-v5.4-2' updates

2019-10-23 Thread Russell King - ARM Linux admin

On Wed, Oct 23, 2019 at 08:52:33AM -0500, Rob Herring wrote:
> > I think this should have been done the other way around and default to
> > coherent since most traditional OF platforms are coherent, and you
> > can't just require those DTs to change.
> 
> You can blame me. This was really only intended for cases where
> coherency is configurable on a per peripheral basis and can't be
> determined in other ways.
> 
> The simple solution here is simply to use the compatible string for
> the device to determine coherent or not.

It really isn't that simple.

There are two aspects to coherency, both of which must match:

1) The configuration of the device
2) The configuration of the kernel's DMA API

(1) is controlled by the driver, which can make the decision any way
it pleases.

(2) on ARM64 is controlled depending on whether or not "dma-coherent"
is specified in the device tree, since ARM64 can have a mixture of
DMA coherent and non-coherent devices.

A mismatch between (1) and (2) results in data corruption, potentially
eating your filesystem.  So, it's very important that the two match.

These didn't match for the LX2160A, but, due to the way CMA was working,
we sort of got away with it, but it was very dangerous as far as data
safety went.

Then, a change to CMA happened which moved where it was located, which
caused a regression.  Reverting the CMA changes didn't seem to be an
option, so another solution had to be found.

I started a discussion on how best to solve this:

https://archive.armlinux.org.uk/lurker/thread/20190919.041320.1e53541f.en.html

and the solution that the discussion came out with was the one that has
been merged - which we now know caused a regression on PPC.

Using compatible strings doesn't solve the issue: there is no way to
tell the DMA API from the driver that the device is coherent.  The
only way to do that is via the "dma-coherent" property in DT on ARM64.

To say that this is a mess is an under-statement, but we seem to have
ended up here because of a series of piece-meal changes that don't seem
to have been thought through enough.

So, what's the right way to solve this, and ensure that the DMA API and
device match as far as their coherency expectations go?  Revert all the
changes for sdhci-of-esdhc and CMA back to 5.0 or 5.1 state?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: Onboard SD card doesn't work anymore after the 'mmc-v5.4-2' updates

2019-10-23 Thread Russell King - ARM Linux admin

On Wed, Oct 23, 2019 at 08:52:33AM -0500, Rob Herring wrote:
> On Wed, Oct 23, 2019 at 1:41 AM Benjamin Herrenschmidt
>  wrote:
> >
> > On Wed, 2019-10-23 at 16:42 +1100, Michael Ellerman wrote:
> > >
> > > Right, it seems of_dma_is_coherent() has baked in the assumption that
> > > devices are non-coherent unless explicitly marked as coherent.
> > >
> > > Which is wrong on all or at least most existing powerpc systems
> > > according to Ben.
> >
> > This is probably broken on sparc(64) as well and whatever else uses
> > DT and is an intrinsicly coherent architecture (did we ever have
> > DT enabled x86s ? Wasn't OLPC such a beast ?).
> 
> Only if those platforms use one of the 5 drivers that call this function:
> 
> drivers/ata/ahci_qoriq.c:   qoriq_priv->is_dmacoherent =
> of_dma_is_coherent(np);
> drivers/crypto/ccree/cc_driver.c:   new_drvdata->coherent =
> of_dma_is_coherent(np);
> drivers/iommu/arm-smmu-v3.c:if (of_dma_is_coherent(dev->of_node))
> drivers/iommu/arm-smmu.c:   if (of_dma_is_coherent(dev->of_node))
> drivers/mmc/host/sdhci-of-esdhc.c:  if (of_dma_is_coherent(dev->of_node))
> 
> Curious that a PPC specific driver (ahci_qoriq) calls it...

Maybe because it is not PPC specific:

arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi:
compatible = "fsl,lx2160a-ahci";
arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi:
compatible = "fsl,lx2160a-ahci";
arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi:
compatible = "fsl,lx2160a-ahci";
arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi:
compatible = "fsl,lx2160a-ahci";

drivers/ata/ahci_qoriq.c:   { .compatible = "fsl,lx2160a-ahci", .data = 
(void *)AHCI_LX2160A},

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH v2 3/3] arm: Add support for function error injection

2019-08-29 Thread Russell King - ARM Linux admin

I'm sorry, I can't apply this, it produces loads of:

include/linux/error-injection.h:7:10: fatal error: asm/error-injection.h: No 
such file or directory

Since your patch 1 has been merged by the ARM64 people, I can't take
it until next cycle.

On Mon, Aug 19, 2019 at 05:18:08PM +0800, Leo Yan wrote:
> Hi Russell,
> 
> On Tue, Aug 06, 2019 at 06:00:15PM +0800, Leo Yan wrote:
> > This patch implements arm specific functions regs_set_return_value() and
> > override_function_with_return() to support function error injection.
> > 
> > In the exception flow, it updates pt_regs::ARM_pc with pt_regs::ARM_lr
> > so can override the probed function return.
> 
> Gentle ping ...  Could you review this patch?
> 
> Thanks,
> Leo.
> 
> > Signed-off-by: Leo Yan 
> > ---
> >  arch/arm/Kconfig  |  1 +
> >  arch/arm/include/asm/ptrace.h |  5 +
> >  arch/arm/lib/Makefile |  2 ++
> >  arch/arm/lib/error-inject.c   | 19 +++
> >  4 files changed, 27 insertions(+)
> >  create mode 100644 arch/arm/lib/error-inject.c
> > 
> > diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> > index 33b00579beff..2d3d44a037f6 100644
> > --- a/arch/arm/Kconfig
> > +++ b/arch/arm/Kconfig
> > @@ -77,6 +77,7 @@ config ARM
> > select HAVE_EXIT_THREAD
> > select HAVE_FAST_GUP if ARM_LPAE
> > select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
> > +   select HAVE_FUNCTION_ERROR_INJECTION if !THUMB2_KERNEL
> > select HAVE_FUNCTION_GRAPH_TRACER if !THUMB2_KERNEL && !CC_IS_CLANG
> > select HAVE_FUNCTION_TRACER if !XIP_KERNEL
> > select HAVE_GCC_PLUGINS
> > diff --git a/arch/arm/include/asm/ptrace.h b/arch/arm/include/asm/ptrace.h
> > index 91d6b7856be4..3b41f37b361a 100644
> > --- a/arch/arm/include/asm/ptrace.h
> > +++ b/arch/arm/include/asm/ptrace.h
> > @@ -89,6 +89,11 @@ static inline long regs_return_value(struct pt_regs 
> > *regs)
> > return regs->ARM_r0;
> >  }
> >  
> > +static inline void regs_set_return_value(struct pt_regs *regs, unsigned 
> > long rc)
> > +{
> > +   regs->ARM_r0 = rc;
> > +}
> > +
> >  #define instruction_pointer(regs)  (regs)->ARM_pc
> >  
> >  #ifdef CONFIG_THUMB2_KERNEL
> > diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> > index b25c54585048..8f56484a7156 100644
> > --- a/arch/arm/lib/Makefile
> > +++ b/arch/arm/lib/Makefile
> > @@ -42,3 +42,5 @@ ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
> >CFLAGS_xor-neon.o+= $(NEON_FLAGS)
> >obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
> >  endif
> > +
> > +obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
> > diff --git a/arch/arm/lib/error-inject.c b/arch/arm/lib/error-inject.c
> > new file mode 100644
> > index ..2d696dc94893
> > --- /dev/null
> > +++ b/arch/arm/lib/error-inject.c
> > @@ -0,0 +1,19 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include 
> > +#include 
> > +
> > +void override_function_with_return(struct pt_regs *regs)
> > +{
> > +   /*
> > +* 'regs' represents the state on entry of a predefined function in
> > +* the kernel/module and which is captured on a kprobe.
> > +*
> > +* 'regs->ARM_lr' contains the the link register for the probed
> > +* function, when kprobe returns back from exception it will override
> > +* the end of probed function and directly return to the predefined
> > +* function's caller.
> > +*/
> > +   instruction_pointer_set(regs, regs->ARM_lr);
> > +}
> > +NOKPROBE_SYMBOL(override_function_with_return);
> > -- 
> > 2.17.1
> > 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH] dma-mapping: fix page attributes for dma_mmap_*

2019-08-06 Thread Russell King - ARM Linux admin

On Tue, Aug 06, 2019 at 05:08:54PM +0100, Will Deacon wrote:
> On Sat, Aug 03, 2019 at 08:48:12AM +0200, Christoph Hellwig wrote:
> > On Fri, Aug 02, 2019 at 11:38:03AM +0100, Will Deacon wrote:
> > > 
> > > So this boils down to a terminology mismatch. The Arm architecture 
> > > doesn't have
> > > anything called "write combine", so in Linux we instead provide what the 
> > > Arm
> > > architecture calls "Normal non-cacheable" memory for 
> > > pgprot_writecombine().
> > > Amongst other things, this memory type permits speculation, unaligned 
> > > accesses
> > > and merging of writes. I found something in the architecture spec about
> > > non-cachable memory, but it's written in Armglish[1].
> > > 
> > > pgprot_noncached(), on the other hand, provides what the architecture 
> > > calls
> > > Strongly Ordered or Device-nGnRnE memory. This is intended for mapping 
> > > MMIO
> > > (i.e. PCI config space) and therefore forbids speculation, preserves 
> > > access
> > > size, requires strict alignment and also forces write responses to come 
> > > from
> > > the endpoint.
> > > 
> > > I think the naming mismatch is historical, but on arm64 we wanted to use 
> > > the
> > > same names as arm32 so that any drivers using these things directly would 
> > > get
> > > the same behaviour.
> > 
> > That all makes sense, but it totally needs a comment.  I'll try to draft
> > one based on this.  I've also looked at the arm32 code a bit more, and
> > it seems arm always (?) supported Normal non-cacheable attribute, but
> > Linux only optionally uses it for arm v6+ because of fears of drivers
> > missing barriers.
> 
> I think it was also to do with aliasing, but I don't recall all of the
> details.

ARMv6+ is where the architecture significantly changed to introduce
the idea of [Normal, Device, Strongly Ordered] where Normal has the
cache attributes.

Before that, we had just "uncached/unbuffered, uncached/buffered,
cached/unbuffered, cached/buffered" modes.

The write buffer (enabled by buffered modes) has no architected
guarantees about how long writes will sit in it, and there is only
the "drain write buffer" instruction to push writes out.

Up to and including ARMv5, we took the easy approach of just using
the "uncached/unbuffered" mode since that is (a) the safest, and (b)
avoids write buffers that alias when there are multiple different
mappings.

We could have used a different approach, making all IO writes contain
a "drain write buffer" instruction, and map DMA memory as "buffered",
but as there were no Linux barriers defined to order memory accesses
to DMA memory (so, for example, ring buffers can be updated in the
correct order) back in those days, using the uncached/unbuffered mode
was the sanest and most reliable solution.

> 
> > The other really weird things is that in arm32
> > pgprot_dmacoherent incudes the L_PTE_XN bit, which from my understanding
> > is the no-execture bit, but pgprot_writecombine does not.  This seems to
> > not very unintentional.  So minus that the whole DMA_ATTR_WRITE_COMBІNE
> > seems to be about flagging old arm specific drivers as having the proper
> > barriers in places and otherwise is a no-op.
> 
> I think it only matters for Armv7 CPUs, but yes, we should probably be
> setting L_PTE_XN for both of these memory types.

Conventionally, pgprot_writecombine() has only been used to change
the memory type and not the permissions.  Since writecombine memory
is still capable of being executed, I don't see any reason to set XN
for it.

If the user wishes to mmap() using PROT_READ|PROT_EXEC, then is there
really a reason for writecombine to set XN overriding the user?

That said, pgprot_writecombine() is mostly used for framebuffers, which
arguably shouldn't be executable anyway - but who'd want to mmap() the
framebuffer with PROT_EXEC?

> 
> > Here is my tentative plan:
> > 
> >  - respin this patch with a small fix to handle the
> >DMA_ATTR_NON_CONSISTENT (as in ignore it unless actually supported),
> >but keep the name as-is to avoid churn.  This should allow 5.3
> >inclusion and backports
> >  - remove DMA_ATTR_WRITE_COMBINE support from mips, probably also 5.3
> >material.
> >  - move all architectures but arm over to just define
> >pgprot_dmacoherent, including a comment with the above explanation
> >for arm64.
> 
> That would be great, thanks.
> 
> >  - make DMA_ATTR_WRITE_COMBINE a no-op and schedule it for removal,
> >thus removing the last instances of arch_dma_mmap_pgprot
> 
> All sounds good to me, although I suppose 32-bit Arm platforms without
> CONFIG_ARM_DMA_MEM_BUFFERABLE may run into issues if DMA_ATTR_WRITE_COMBINE
> disappears. Only one way to find out...

Looking at the results of grep, I think only OMAP2+ and Exynos may be
affected.

However, removing writecombine support from the DMA API is going to
have a huge impact for framebuffers on earlier ARMs - that's where we
do expect framebuffers to be mapped "uncached/buffered" for

Re: [PATCH] dma-mapping: fix page attributes for dma_mmap_*

2019-08-06 Thread Russell King - ARM Linux admin

On Tue, Aug 06, 2019 at 05:45:03PM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Aug 06, 2019 at 05:08:54PM +0100, Will Deacon wrote:
> > On Sat, Aug 03, 2019 at 08:48:12AM +0200, Christoph Hellwig wrote:
> > > On Fri, Aug 02, 2019 at 11:38:03AM +0100, Will Deacon wrote:
> > > > 
> > > > So this boils down to a terminology mismatch. The Arm architecture 
> > > > doesn't have
> > > > anything called "write combine", so in Linux we instead provide what 
> > > > the Arm
> > > > architecture calls "Normal non-cacheable" memory for 
> > > > pgprot_writecombine().
> > > > Amongst other things, this memory type permits speculation, unaligned 
> > > > accesses
> > > > and merging of writes. I found something in the architecture spec about
> > > > non-cachable memory, but it's written in Armglish[1].
> > > > 
> > > > pgprot_noncached(), on the other hand, provides what the architecture 
> > > > calls
> > > > Strongly Ordered or Device-nGnRnE memory. This is intended for mapping 
> > > > MMIO
> > > > (i.e. PCI config space) and therefore forbids speculation, preserves 
> > > > access
> > > > size, requires strict alignment and also forces write responses to come 
> > > > from
> > > > the endpoint.
> > > > 
> > > > I think the naming mismatch is historical, but on arm64 we wanted to 
> > > > use the
> > > > same names as arm32 so that any drivers using these things directly 
> > > > would get
> > > > the same behaviour.
> > > 
> > > That all makes sense, but it totally needs a comment.  I'll try to draft
> > > one based on this.  I've also looked at the arm32 code a bit more, and
> > > it seems arm always (?) supported Normal non-cacheable attribute, but
> > > Linux only optionally uses it for arm v6+ because of fears of drivers
> > > missing barriers.
> > 
> > I think it was also to do with aliasing, but I don't recall all of the
> > details.
> 
> ARMv6+ is where the architecture significantly changed to introduce
> the idea of [Normal, Device, Strongly Ordered] where Normal has the
> cache attributes.
> 
> Before that, we had just "uncached/unbuffered, uncached/buffered,
> cached/unbuffered, cached/buffered" modes.
> 
> The write buffer (enabled by buffered modes) has no architected
> guarantees about how long writes will sit in it, and there is only
> the "drain write buffer" instruction to push writes out.
> 
> Up to and including ARMv5, we took the easy approach of just using
> the "uncached/unbuffered" mode since that is (a) the safest, and (b)
> avoids write buffers that alias when there are multiple different
> mappings.
> 
> We could have used a different approach, making all IO writes contain
> a "drain write buffer" instruction, and map DMA memory as "buffered",
> but as there were no Linux barriers defined to order memory accesses
> to DMA memory (so, for example, ring buffers can be updated in the
> correct order) back in those days, using the uncached/unbuffered mode
> was the sanest and most reliable solution.
> 
> > 
> > > The other really weird things is that in arm32
> > > pgprot_dmacoherent incudes the L_PTE_XN bit, which from my understanding
> > > is the no-execture bit, but pgprot_writecombine does not.  This seems to
> > > not very unintentional.  So minus that the whole DMA_ATTR_WRITE_COMBІNE
> > > seems to be about flagging old arm specific drivers as having the proper
> > > barriers in places and otherwise is a no-op.
> > 
> > I think it only matters for Armv7 CPUs, but yes, we should probably be
> > setting L_PTE_XN for both of these memory types.
> 
> Conventionally, pgprot_writecombine() has only been used to change
> the memory type and not the permissions.  Since writecombine memory
> is still capable of being executed, I don't see any reason to set XN
> for it.
> 
> If the user wishes to mmap() using PROT_READ|PROT_EXEC, then is there
> really a reason for writecombine to set XN overriding the user?
> 
> That said, pgprot_writecombine() is mostly used for framebuffers, which
> arguably shouldn't be executable anyway - but who'd want to mmap() the
> framebuffer with PROT_EXEC?
> 
> > 
> > > Here is my tentative plan:
> > > 
> > >  - respin this patch with a small fix to handle the
> > >DMA_ATTR_NON_CONSISTENT (as in ignore it unless actually supported),
> > &g

Re: [PATCH v5 3/3] locking/rwsem: Optimize down_read_trylock()

2019-03-22 Thread Russell King - ARM Linux admin

On Fri, Mar 22, 2019 at 10:30:08AM -0400, Waiman Long wrote:
> Modify __down_read_trylock() to optimize for an unlocked rwsem and make
> it generate slightly better code.
> 
> Before this patch, down_read_trylock:
> 
>0x <+0>: callq  0x5 
>0x0005 <+5>: jmp0x18 
>0x0007 <+7>: lea0x1(%rdx),%rcx
>0x000b <+11>:mov%rdx,%rax
>0x000e <+14>:lock cmpxchg %rcx,(%rdi)
>0x0013 <+19>:cmp%rax,%rdx
>0x0016 <+22>:je 0x23 
>0x0018 <+24>:mov(%rdi),%rdx
>0x001b <+27>:test   %rdx,%rdx
>0x001e <+30>:jns0x7 
>0x0020 <+32>:xor%eax,%eax
>0x0022 <+34>:retq
>0x0023 <+35>:mov%gs:0x0,%rax
>0x002c <+44>:or $0x3,%rax
>0x0030 <+48>:mov%rax,0x20(%rdi)
>0x0034 <+52>:mov$0x1,%eax
>0x0039 <+57>:retq
> 
> After patch, down_read_trylock:
> 
>0x <+0>:   callq  0x5 
>0x0005 <+5>:   xor%eax,%eax
>0x0007 <+7>:   lea0x1(%rax),%rdx
>0x000b <+11>:  lock cmpxchg %rdx,(%rdi)
>0x0010 <+16>:  jne0x29 
>0x0012 <+18>:  mov%gs:0x0,%rax
>0x001b <+27>:  or $0x3,%rax
>0x001f <+31>:  mov%rax,0x20(%rdi)
>0x0023 <+35>:  mov$0x1,%eax
>0x0028 <+40>:  retq
>0x0029 <+41>:  test   %rax,%rax
>0x002c <+44>:  jns0x7 
>0x002e <+46>:  xor%eax,%eax
>0x0030 <+48>:  retq
> 
> By using a rwsem microbenchmark, the down_read_trylock() rate (with a
> load of 10 to lengthen the lock critical section) on a x86-64 system
> before and after the patch were:
> 
>  Before PatchAfter Patch
># of Threads rlock   rlock
> -   -
> 1   14,496  14,716
> 28,644   8,453
>   46,799   6,983
>   85,664   7,190
> 
> On a ARM64 system, the performance results were:
> 
>  Before PatchAfter Patch
># of Threads rlock   rlock
> -   -
> 1   23,676  24,488
> 27,697   9,502
> 44,945   3,440
> 82,641   1,603
> 
> For the uncontended case (1 thread), the new down_read_trylock() is a
> little bit faster. For the contended cases, the new down_read_trylock()
> perform pretty well in x86-64, but performance degrades at high
> contention level on ARM64.

So, 70% for 4 threads, 61% for 4 threads - does this trend
continue tailing off as the number of threads (and cores)
increase?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH v2 07/29] ARM: add kexec_file_load system call number

2019-01-25 Thread Russell King - ARM Linux admin

On Fri, Jan 25, 2019 at 03:43:59PM +, Catalin Marinas wrote:
> On Fri, Jan 18, 2019 at 05:18:13PM +0100, Arnd Bergmann wrote:
> > diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
> > index 86de9eb34296..20ed7e026723 100644
> > --- a/arch/arm/tools/syscall.tbl
> > +++ b/arch/arm/tools/syscall.tbl
> > @@ -415,3 +415,4 @@
> >  398common  rseqsys_rseq
> >  399common  io_pgetevents   sys_io_pgetevents
> >  400common  migrate_pages   sys_migrate_pages
> > +401common  kexec_file_load sys_kexec_file_load
> 
> I presume on arm32 this would still return -ENOSYS.

Yes, I already checked.  If CONFIG_KEXEC_FILE is not set, then
kernel/kexec_file.c is not built, and we fall back to the stub
in kernel/sys_ni.c.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH v2 29/29] y2038: add 64-bit time_t syscalls to all 32-bit architectures

2019-01-19 Thread Russell King - ARM Linux admin

On Fri, Jan 18, 2019 at 11:53:25AM -0800, Andy Lutomirski wrote:
> On Fri, Jan 18, 2019 at 11:33 AM Arnd Bergmann  wrote:
> >
> > On Fri, Jan 18, 2019 at 7:50 PM Andy Lutomirski  wrote:
> > > On Fri, Jan 18, 2019 at 8:25 AM Arnd Bergmann  wrote:
> > > > - Once we get to 512, we clash with the x32 numbers (unless
> > > >   we remove x32 support first), and probably have to skip
> > > >   a few more. I also considered using the 512..547 space
> > > >   for 32-bit-only calls (which never clash with x32), but
> > > >   that also seems to add a bit of complexity.
> > >
> > > I have a patch that I'll send soon to make x32 use its own table.  As
> > > far as I'm concerned, 547 is *it*.  548 is just a normal number and is
> > > not special.  But let's please not reuse 512..547 for other purposes
> > > on x86 variants -- that way lies even more confusion, IMO.
> >
> > Fair enough, the space for those numbers is cheap enough here.
> > I take it you mean we also should not reuse that number space if
> > we were to decide to remove x32 soon, but you are not worried
> > about clashing with arch/alpha when everything else uses consistent
> > numbers?
> >
> 
> I think we have two issues if we reuse those numbers for new syscalls.
> First, I'd really like to see new syscalls be numbered consistently
> everywhere, or at least on all x86 variants, and we can't on x32
> because they mean something else.  Perhaps more importantly, due to
> what is arguably a rather severe bug, issuing a native x86_64 syscall
> (x32 bit clear) with nr in the range 512..547 does *not* return
> -ENOSYS on a kernel with x32 enabled.  Instead it does something that
> is somewhat arbitrary.  With my patch applied, it will return -ENOSYS,
> but old kernels will still exist, and this will break syscall probing.
> 
> Can we perhaps just start the consistent numbers above 547 or maybe
> block out 512..547 in the new regime?

I don't think you gain much with that kind of scheme - it won't take
very long before an architecture misses having a syscall added, and
then someone else adds their own.  Been there with ARM - I was keeping
the syscall table in the same order as x86 for new syscalls, but now
that others have been adding syscalls to the table since I converted
ARM to the tabular form, that's now gone out the window.

So, I think it's completely pointless to do what you're suggesting.
We'll just end up with a big hole in the middle of the syscall table
and then revert back to random numbering of syscalls thereafter again.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH 15/15] arch: add pkey and rseq syscall numbers everywhere

2019-01-15 Thread Russell King - ARM Linux admin

On Thu, Jan 10, 2019 at 05:24:35PM +0100, Arnd Bergmann wrote:
> Most architectures define system call numbers for the rseq and pkey system
> calls, even when they don't support the features, and perhaps never will.
> 
> Only a few architectures are missing these, so just define them anyway
> for consistency. If we decide to add them later to one of these, the
> system call numbers won't get out of sync then.

I was lambasted for adding the pkey syscalls for 32-bit ARM in 2016,
which will probably never support it.  Why has the attitude towards
this kind of thing now apparently become acceptable?

> Signed-off-by: Arnd Bergmann 
> ---
>  arch/alpha/include/asm/unistd.h | 4 
>  arch/alpha/kernel/syscalls/syscall.tbl  | 4 
>  arch/ia64/kernel/syscalls/syscall.tbl   | 4 
>  arch/m68k/kernel/syscalls/syscall.tbl   | 4 
>  arch/parisc/include/asm/unistd.h| 3 ---
>  arch/parisc/kernel/syscalls/syscall.tbl | 4 
>  arch/s390/include/asm/unistd.h  | 3 ---
>  arch/s390/kernel/syscalls/syscall.tbl   | 3 +++
>  arch/sh/kernel/syscalls/syscall.tbl | 4 
>  arch/sparc/include/asm/unistd.h | 5 -
>  arch/sparc/kernel/syscalls/syscall.tbl  | 4 
>  arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
>  12 files changed, 28 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/alpha/include/asm/unistd.h b/arch/alpha/include/asm/unistd.h
> index 564ba87bdc38..31ad350b58a0 100644
> --- a/arch/alpha/include/asm/unistd.h
> +++ b/arch/alpha/include/asm/unistd.h
> @@ -29,9 +29,5 @@
>  #define __IGNORE_getppid
>  #define __IGNORE_getuid
>  
> -/* Alpha doesn't have protection keys. */
> -#define __IGNORE_pkey_mprotect
> -#define __IGNORE_pkey_alloc
> -#define __IGNORE_pkey_free
>  
>  #endif /* _ALPHA_UNISTD_H */
> diff --git a/arch/alpha/kernel/syscalls/syscall.tbl 
> b/arch/alpha/kernel/syscalls/syscall.tbl
> index b0e247287908..25b4a7e76943 100644
> --- a/arch/alpha/kernel/syscalls/syscall.tbl
> +++ b/arch/alpha/kernel/syscalls/syscall.tbl
> @@ -452,3 +452,7 @@
>  521  common  pwritev2sys_pwritev2
>  522  common  statx   sys_statx
>  523  common  io_pgetevents   sys_io_pgetevents
> +524  common  pkey_alloc  sys_pkey_alloc
> +525  common  pkey_free   sys_pkey_free
> +526  common  pkey_mprotect   sys_pkey_mprotect
> +527  common  rseqsys_rseq
> diff --git a/arch/ia64/kernel/syscalls/syscall.tbl 
> b/arch/ia64/kernel/syscalls/syscall.tbl
> index 2e93dbdcdb80..84e03de00177 100644
> --- a/arch/ia64/kernel/syscalls/syscall.tbl
> +++ b/arch/ia64/kernel/syscalls/syscall.tbl
> @@ -339,3 +339,7 @@
>  327  common  io_pgetevents   sys_io_pgetevents
>  328  common  perf_event_open sys_perf_event_open
>  329  common  seccomp sys_seccomp
> +330  common  pkey_alloc  sys_pkey_alloc
> +331  common  pkey_free   sys_pkey_free
> +332  common  pkey_mprotect   sys_pkey_mprotect
> +333  common  rseqsys_rseq
> diff --git a/arch/m68k/kernel/syscalls/syscall.tbl 
> b/arch/m68k/kernel/syscalls/syscall.tbl
> index 5354ba02eed2..ae88b85d068e 100644
> --- a/arch/m68k/kernel/syscalls/syscall.tbl
> +++ b/arch/m68k/kernel/syscalls/syscall.tbl
> @@ -388,6 +388,10 @@
>  378  common  pwritev2sys_pwritev2
>  379  common  statx   sys_statx
>  380  common  seccomp sys_seccomp
> +381  common  pkey_alloc  sys_pkey_alloc
> +382  common  pkey_free   sys_pkey_free
> +383  common  pkey_mprotect   sys_pkey_mprotect
> +384  common  rseqsys_rseq
>  # room for arch specific calls
>  393  common  semget  sys_semget
>  394  common  semctl  sys_semctl
> diff --git a/arch/parisc/include/asm/unistd.h 
> b/arch/parisc/include/asm/unistd.h
> index c2c2afb28941..9ec1026af877 100644
> --- a/arch/parisc/include/asm/unistd.h
> +++ b/arch/parisc/include/asm/unistd.h
> @@ -12,9 +12,6 @@
>  
>  #define __IGNORE_select  /* newselect */
>  #define __IGNORE_fadvise64   /* fadvise64_64 */
> -#define __IGNORE_pkey_mprotect
> -#define __IGNORE_pkey_alloc
> -#define __IGNORE_pkey_free
>  
>  #ifndef ASM_LINE_SEP
>  # define ASM_LINE_SEP ;
> diff --git a/arch/parisc/kernel/syscalls/syscall.tbl 
> b/arch/parisc/kernel/syscalls/syscall.tbl
> index 9bbd2f9f56c8..e07231de3597 100644
> --- a/arch/parisc/kernel/syscalls/syscall.tbl
> +++ b/arch/parisc/kernel/syscalls/syscall.tbl
> @@ -367,3 +367,7 @@
>  348  common  pwritev2sys_pwritev2
> compat_sys_pwritev2
>  349  common  statx   sys_statx
>  350  common  io_pgetevents   sys_io_pgetevents   
>

Re: [PATCH] mm/zsmalloc.c: Fix zsmalloc 32-bit PAE support

2018-12-10 Thread Russell King - ARM Linux

On Mon, Dec 10, 2018 at 02:35:55PM +, Robin Murphy wrote:
> On 10/12/2018 14:21, Rafael David Tinoco wrote:
> >On 32-bit systems, zsmalloc uses HIGHMEM and, when PAE is enabled, the
> >physical frame number might be so big that zsmalloc obj encoding (to
> >location) will break, causing:
> >
> >BUG: KASAN: null-ptr-deref in zs_map_object+0xa4/0x2bc
> >Read of size 4 at addr  by task mkfs.ext4/623
> >CPU: 2 PID: 623 Comm: mkfs.ext4 Not tainted 
> >4.19.0-rc8-00017-g8239bc6d3307-dirty #15
> >Hardware name: Generic DT based system
> >[] (unwind_backtrace) from [] (show_stack+0x20/0x24)
> >[] (show_stack) from [] (dump_stack+0xbc/0xe8)
> >[] (dump_stack) from [] (kasan_report+0x248/0x390)
> >[] (kasan_report) from [] (__asan_load4+0x78/0xb4)
> >[] (__asan_load4) from [] (zs_map_object+0xa4/0x2bc)
> >[] (zs_map_object) from [] 
> >(zram_bvec_rw.constprop.2+0x324/0x8e4 [zram])
> >[] (zram_bvec_rw.constprop.2 [zram]) from [] 
> >(zram_make_request+0x234/0x46c [zram])
> >[] (zram_make_request [zram]) from [] 
> >(generic_make_request+0x304/0x63c)
> >[] (generic_make_request) from [] (submit_bio+0x4c/0x1c8)
> >[] (submit_bio) from [] 
> >(submit_bh_wbc.constprop.15+0x238/0x26c)
> >[] (submit_bh_wbc.constprop.15) from [] 
> >(__block_write_full_page+0x524/0x76c)
> >[] (__block_write_full_page) from [] 
> >(block_write_full_page+0x1bc/0x1d4)
> >[] (block_write_full_page) from [] 
> >(blkdev_writepage+0x24/0x28)
> >[] (blkdev_writepage) from [] (__writepage+0x44/0x78)
> >[] (__writepage) from [] (write_cache_pages+0x3b8/0x800)
> >[] (write_cache_pages) from [] 
> >(generic_writepages+0x74/0xa0)
> >[] (generic_writepages) from [] 
> >(blkdev_writepages+0x18/0x1c)
> >[] (blkdev_writepages) from [] (do_writepages+0x68/0x134)
> >[] (do_writepages) from [] 
> >(__filemap_fdatawrite_range+0xb0/0xf4)
> >[] (__filemap_fdatawrite_range) from [] 
> >(file_write_and_wait_range+0x64/0xd0)
> >[] (file_write_and_wait_range) from [] 
> >(blkdev_fsync+0x54/0x84)
> >[] (blkdev_fsync) from [] (vfs_fsync_range+0x70/0xd4)
> >[] (vfs_fsync_range) from [] (do_fsync+0x4c/0x80)
> >[] (do_fsync) from [] (sys_fsync+0x1c/0x20)
> >[] (sys_fsync) from [] (ret_fast_syscall+0x0/0x2c)
> >
> >when trying to decode (the pfn) and map the object.
> >
> >That happens because one architecture might not re-define
> >MAX_PHYSMEM_BITS, like in this ARM 32-bit w/ LPAE enabled example. For
> >32-bit systems, if not re-defined, MAX_POSSIBLE_PHYSMEM_BITS will
> >default to BITS_PER_LONG (32) in most cases, and, with PAE enabled,
> >_PFN_BITS might be wrong: which may cause obj variable to overflow if
> >frame number is in HIGHMEM and referencing a page above the 4GB
> >watermark.
> >
> >commit 6e00ec00b1a7 ("staging: zsmalloc: calculate MAX_PHYSMEM_BITS if
> >not defined") realized MAX_PHYSMEM_BITS depended on SPARSEMEM headers
> >and "fixed" it by calculating it using BITS_PER_LONG if SPARSEMEM wasn't
> >used, like in the example given above.
> >
> >Systems with potential for PAE exist for a long time and assuming
> >BITS_PER_LONG seems inadequate. Defining MAX_PHYSMEM_BITS looks better,
> >however it is NOT a constant anymore for x86.
> >
> >SO, instead, MAX_POSSIBLE_PHYSMEM_BITS should be defined by every
> >architecture using zsmalloc, together with a sanity check for
> >MAX_POSSIBLE_PHYSMEM_BITS being too big on 32-bit systems.
> >
> >Link: https://bugs.linaro.org/show_bug.cgi?id=3765#c17
> >Signed-off-by: Rafael David Tinoco 
> >---
> >  arch/arm/include/asm/pgtable-2level-types.h |  2 ++
> >  arch/arm/include/asm/pgtable-3level-types.h |  2 ++
> >  arch/arm64/include/asm/pgtable-types.h  |  2 ++
> >  arch/ia64/include/asm/page.h|  2 ++
> >  arch/mips/include/asm/page.h|  2 ++
> >  arch/powerpc/include/asm/mmu.h  |  2 ++
> >  arch/s390/include/asm/page.h|  2 ++
> >  arch/sh/include/asm/page.h  |  2 ++
> >  arch/sparc/include/asm/page_32.h|  2 ++
> >  arch/sparc/include/asm/page_64.h|  2 ++
> >  arch/x86/include/asm/pgtable-2level_types.h |  2 ++
> >  arch/x86/include/asm/pgtable-3level_types.h |  3 +-
> >  arch/x86/include/asm/pgtable_64_types.h |  4 +--
> >  mm/zsmalloc.c   | 35 +++--
> >  14 files changed, 45 insertions(+), 19 deletions(-)
> >
> >diff --git a/arch/arm/include/asm/pgtable-2level-types.h 
> >b/arch/arm/include/asm/pgtable-2level-types.h
> >index 66cb5b0e89c5..552dba411324 100644
> >--- a/arch/arm/include/asm/pgtable-2level-types.h
> >+++ b/arch/arm/include/asm/pgtable-2level-types.h
> >@@ -64,4 +64,6 @@ typedef pteval_t pgprot_t;
> >  #endif /* STRICT_MM_TYPECHECKS */
> >+#define MAX_POSSIBLE_PHYSMEM_BITS 32
> >+
> >  #endif /* _ASM_PGTABLE_2LEVEL_TYPES_H */
> >diff --git a/arch/arm/include/asm/pgtable-3level-types.h 
> >b/arch/arm/include/asm/pgtable-3level-types.h
> >index 921aa30259c4..664c39e6517c 100644
> >--- a/arch/arm/include/asm/pgtable-3level-types.h
> >+++

Re: [PATCH v3 0/7] include: add setbits32/clrbits32/clrsetbits32/setbits64/clrbits64/clrsetbits64

2018-11-15 Thread Russell King - ARM Linux

On Thu, Nov 15, 2018 at 10:30:34AM +0100, LABBE Corentin wrote:
> On Wed, Oct 24, 2018 at 09:57:00AM +0100, Russell King - ARM Linux wrote:
> > On Wed, Oct 24, 2018 at 07:35:46AM +, Corentin Labbe wrote:
> > > This patchset adds a new set of functions which are open-coded in lot of
> > > place.
> > > Basicly the pattern is always the same, "read, modify a bit, write"
> > > some driver and the powerpc arch already have thoses pattern them as 
> > > functions. (like ahci_sunxi.c or dwmac-meson8b)
> > 
> > The advantage of them being open-coded is that it's _obvious_ to the
> > reviewer that there is a read-modify-write going on which, in a multi-
> > threaded environment, may need some locking (so it should trigger a
> > review of the locking around that code.)
> > 
> > With it hidden inside a helper which has no locking itself, it becomes
> > much easier to pass over in review, which means that races are much
> > more likely to go unspotted - and that is bad news.
> > 
> 
> Hello
> 
> I understand your fear, but I think the benefit overhaul thoses.
> Furthermore, drivers which I have converted does not need such locking.
> 
> If you want I can rename the header to linux/setbits-non-atomic.h for making 
> obvious the lack of locking.

It'd probably be better in the function name - it then doesn't get
"lost" that it's non-atomic when it's included via other headers.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH 6/9] PCI: consolidate PCI config entry in drivers/pci

2018-11-01 Thread Russell King - ARM Linux

On Fri, Oct 19, 2018 at 09:58:46PM +0900, Masahiro Yamada wrote:
> On Fri, Oct 19, 2018 at 9:23 PM Russell King - ARM Linux
>  wrote:
> 
> > > index a68b34183107..b185794549be 100644
> > > --- a/arch/arm/mach-pxa/Kconfig
> > > +++ b/arch/arm/mach-pxa/Kconfig
> > > @@ -125,7 +125,7 @@ config MACH_ARMCORE
> > >   bool "CompuLab CM-X255/CM-X270 modules"
> > >   select ARCH_HAS_DMA_SET_COHERENT_MASK if PCI
> > >   select IWMMXT
> > > - select MIGHT_HAVE_PCI
> > > + select HAVE_PCI
> > >   select NEED_MACH_IO_H if PCI
> > >   select PXA25x
> > >   select PXA27x
> >
> > This is wrong.  "MIGHT_HAVE_PCI" is _not_ the same as "HAVE_PCI" - we
> > have a bunch of platforms that mandatorily have PCI and these select
> > PCI directly.  "MIGHT_HAVE_PCI" controls the _visibility_ of the PCI
> > menu option, but does not prevent it being selected.  Your patch will
> > cause Kconfig to complain for those which mandatorily have PCI but
> > do not set HAVE_PCI.
> 
> 
> Good catch!
> But, adding a bunch of 'select HAVE_PCI' along with 'select PCI' is ugly.
> 
> Do you have any suggestion?
> 
> How about letting CONFIG_ARM to select HAVE_PCI ?

Well, the situation we have on ARM is rather optimal - when the
rest of the configuration supports PCI, PCI is made visible.  When
there's no hardware support for PCI, PCI is hidden.  When PCI is
mandatory, PCI is selected and mostly always hidden.

It seems that with this consolidation, we lose that - we end up
with PCI being visible for every ARM config, not only those where
PCI is "impossible" but also for those where PCI is forcefully
selected.

That said, offering PCI for platforms that do not have any possibility
of PCI hardware shouldn't cause any compile issues, and would increase
build coverage - but I'd say it wouldn't _usefully_ increase build
coverage.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH 0/7] serial: Finish kgdb on qcom_geni; fix many lockdep splats w/ kgdb

2018-10-30 Thread Russell King - ARM Linux

On Tue, Oct 30, 2018 at 11:56:28AM +, Daniel Thompson wrote:
> On Mon, Oct 29, 2018 at 11:07:00AM -0700, Douglas Anderson wrote:
> > Looking back, this is pretty much two series squashed that could be
> > treated indepdently.  The first is a serial series and the second is a
> > kgdb series.
> 
> Indeed.
> 
> I couldn't work out the link between the first 5 patches and the last 2
> until I read this...
> 
> Is anything in the 01->05 patch set even related to kgdb? From the stack
> traces it looks to me like the lock dep warning would trigger for any
> sysrq. I think separating into two threads for v2 would be sensible.

I'm concerned about calling smp_call_function() from IRQ context with
IRQs disabled - that will block the ability of the _calling_ CPU to
process IPIs from other CPUs in the system.  If we have other CPUs
waiting on their IPIs to complete on _this_ CPU, we could end up
deadlocking while trying to grab the CSD lock.

This is the intention of warnings in smp_call_function*() - to catch
cases where deadlocks _can_ occur, but do not reliably show up.

The exceptions to the warning (disregarding oops_in_progress) are
chosen to allow IRQs-disabled calls when we're sure that the rest of
the system isn't going to be sending the calling CPU an IPI (eg,
because the CPU isn't marked online, and we only send IPIs to online
CPUs, or if we're still early in the kernel boot and hence have no
other CPUs running.)  The exception is oops_in_progress, which can
occur at any time - even with the current CPU already holding some
CSD locks (eg, oops while trying to send an IPI.)

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH 0/8] add generic builtin command line

2018-10-29 Thread Russell King - ARM Linux

On Mon, Oct 29, 2018 at 10:29:15AM +, Will Deacon wrote:
> On Wed, Oct 24, 2018 at 10:07:32AM +0100, Russell King - ARM Linux wrote:
> > On Wed, Oct 24, 2018 at 11:57:44AM +0300, Maksym Kokhan wrote:
> > > Do you mean, that you haven't seen patch for ARM, which I sent on
> > > September 27 along with cover and patch 1? It is strange, because
> > > you was the one from recipients. If so, you can see this patch here:
> > > https://lore.kernel.org/patchwork/patch/992779/
> > 
> > It seems that I have received patch 5, _but_ it's not threaded with
> > the cover message and patch 1.  With 50k messages in my inbox, and 3k
> > messages since you sent the series, it's virtually impossible to find
> > it (I only found it by looking at my mail server logs from September
> > to find the subject, and then searching my mailbox for that subject.)
> > 
> > This is unnecessarily difficult.
> 
> This comes up surprisingly often, and I think part of the issue is that
> different maintainers have different preferences. I also prefer to receive
> the entire series and cover-letter, but I've seen people object to being
> CC'd on the whole series as well (how they manage to review things in
> isolation is another question...!)

This series has the odd situation where patch 1 is threaded to the
cover letter, but nothing else is - that makes it inconsistent.

Where I've seen people disagree with threading is when sending
follow-up series - whether that should be threaded to the previous
series or not - some people want it others hate it.

However, I haven't seen any disagreement is about having the patches
threaded to the cover.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH v2 0/2] arm64: Cut rebuild time when changing CONFIG_BLK_DEV_INITRD

2018-10-25 Thread Russell King - ARM Linux

On Thu, Oct 25, 2018 at 10:38:34AM +0100, Mike Rapoport wrote:
> On Wed, Oct 24, 2018 at 02:55:17PM -0500, Rob Herring wrote:
> > On Wed, Oct 24, 2018 at 2:33 PM Florian Fainelli  
> > wrote:
> > >
> > > Hi all,
> > >
> > > While investigating why ARM64 required a ton of objects to be rebuilt
> > > when toggling CONFIG_DEV_BLK_INITRD, it became clear that this was
> > > because we define __early_init_dt_declare_initrd() differently and we do
> > > that in arch/arm64/include/asm/memory.h which gets included by a fair
> > > amount of other header files, and translation units as well.
> > 
> > I scratch my head sometimes as to why some config options rebuild so
> > much stuff. One down, ? to go. :)
> > 
> > > Changing the value of CONFIG_DEV_BLK_INITRD is a common thing with build
> > > systems that generate two kernels: one with the initramfs and one
> > > without. buildroot is one of these build systems, OpenWrt is also
> > > another one that does this.
> > >
> > > This patch series proposes adding an empty initrd.h to satisfy the need
> > > for drivers/of/fdt.c to unconditionally include that file, and moves the
> > > custom __early_init_dt_declare_initrd() definition away from
> > > asm/memory.h
> > >
> > > This cuts the number of objects rebuilds from 1920 down to 26, so a
> > > factor 73 approximately.
> > >
> > > Apologies for the long CC list, please let me know how you would go
> > > about merging that and if another approach would be preferable, e.g:
> > > introducing a CONFIG_ARCH_INITRD_BELOW_START_OK Kconfig option or
> > > something like that.
> > 
> > There may be a better way as of 4.20 because bootmem is now gone and
> > only memblock is used. This should unify what each arch needs to do
> > with initrd early. We need the physical address early for memblock
> > reserving. Then later on we need the virtual address to access the
> > initrd. Perhaps we should just change initrd_start and initrd_end to
> > physical addresses (or add 2 new variables would be less invasive and
> > allow for different translation than __va()). The sanity checks and
> > memblock reserve could also perhaps be moved to a common location.
> >
> > Alternatively, given arm64 is the only oddball, I'd be fine with an
> > "if (IS_ENABLED(CONFIG_ARM64))" condition in the default
> > __early_init_dt_declare_initrd as long as we have a path to removing
> > it like the above option.
> 
> I think arm64 does not have to redefine __early_init_dt_declare_initrd().
> Something like this might be just all we need (completely untested,
> probably it won't even compile):

The alternative solution would be to replace initrd_start/initrd_end
with physical address versions of these everywhere - that's what
we're passed from DT, it's what 32-bit ARM would prefer, and seemingly
what 64-bit ARM would also like as well.

Grepping for initrd_start in arch/*/mm shows that there's lots of
architectures that have virtual/physical conversions on these, and
a number that have obviously been derived from 32-bit ARM's approach
(with maintaining a phys_initrd_start variable to simplify things).

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH 0/8] add generic builtin command line

2018-10-24 Thread Russell King - ARM Linux

On Wed, Oct 24, 2018 at 11:57:44AM +0300, Maksym Kokhan wrote:
> Do you mean, that you haven't seen patch for ARM, which I sent on
> September 27 along with cover and patch 1? It is strange, because
> you was the one from recipients. If so, you can see this patch here:
> https://lore.kernel.org/patchwork/patch/992779/

It seems that I have received patch 5, _but_ it's not threaded with
the cover message and patch 1.  With 50k messages in my inbox, and 3k
messages since you sent the series, it's virtually impossible to find
it (I only found it by looking at my mail server logs from September
to find the subject, and then searching my mailbox for that subject.)

This is unnecessarily difficult.

> On Tue, Oct 23, 2018 at 5:48 PM Russell King - ARM Linux
>  wrote:
> >
> > On Tue, Oct 23, 2018 at 05:43:18PM +0300, Maksym Kokhan wrote:
> > > We still have no response to patches for x86, arm, arm64 and powerpc.
> > > Is current generic command line implementation appropriate for these
> > > architectures?
> > > Is it possible to merge these patches in the current form (for x86,
> > > arm, arm64 and powerpc)?
> >
> > You may wish to consider your recipients - I seem to only have received
> > the cover and patch 1 (which doesn't include any ARM specific bits).
> > It may be that you're not getting responses because people haven't seen
> > your patches.
> >
> > Thanks.
> >
> > --
> > RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> > FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps 
> > up
> > According to speedtest.net: 11.9Mbps down 500kbps up

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH v3 0/7] include: add setbits32/clrbits32/clrsetbits32/setbits64/clrbits64/clrsetbits64

2018-10-24 Thread Russell King - ARM Linux

On Wed, Oct 24, 2018 at 07:35:46AM +, Corentin Labbe wrote:
> This patchset adds a new set of functions which are open-coded in lot of
> place.
> Basicly the pattern is always the same, "read, modify a bit, write"
> some driver and the powerpc arch already have thoses pattern them as 
> functions. (like ahci_sunxi.c or dwmac-meson8b)

The advantage of them being open-coded is that it's _obvious_ to the
reviewer that there is a read-modify-write going on which, in a multi-
threaded environment, may need some locking (so it should trigger a
review of the locking around that code.)

With it hidden inside a helper which has no locking itself, it becomes
much easier to pass over in review, which means that races are much
more likely to go unspotted - and that is bad news.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH 0/8] add generic builtin command line

2018-10-23 Thread Russell King - ARM Linux

On Tue, Oct 23, 2018 at 05:43:18PM +0300, Maksym Kokhan wrote:
> We still have no response to patches for x86, arm, arm64 and powerpc.
> Is current generic command line implementation appropriate for these
> architectures?
> Is it possible to merge these patches in the current form (for x86,
> arm, arm64 and powerpc)?

You may wish to consider your recipients - I seem to only have received
the cover and patch 1 (which doesn't include any ARM specific bits).
It may be that you're not getting responses because people haven't seen
your patches.

Thanks.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH 6/9] PCI: consolidate PCI config entry in drivers/pci

2018-10-19 Thread Russell King - ARM Linux

On Fri, Oct 19, 2018 at 02:09:49PM +0200, Christoph Hellwig wrote:
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index e33735ce1c14..7495d0a0aa31 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -149,9 +149,6 @@ config ARM_DMA_IOMMU_ALIGNMENT
>  
>  endif
>  
> -config MIGHT_HAVE_PCI
> - bool
> -
>  config SYS_SUPPORTS_APM_EMULATION
>   bool
>  
> @@ -320,7 +317,7 @@ config ARCH_MULTIPLATFORM
>   select COMMON_CLK
>   select GENERIC_CLOCKEVENTS
>   select GENERIC_IRQ_MULTI_HANDLER
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   select PCI_DOMAINS if PCI
>   select SPARSE_IRQ
>   select USE_OF
> @@ -436,7 +433,7 @@ config ARCH_IXP4XX
>   select DMABOUNCE if PCI
>   select GENERIC_CLOCKEVENTS
>   select GPIOLIB
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   select NEED_MACH_IO_H
>   select USB_EHCI_BIG_ENDIAN_DESC
>   select USB_EHCI_BIG_ENDIAN_MMIO
> @@ -449,7 +446,7 @@ config ARCH_DOVE
>   select GENERIC_CLOCKEVENTS
>   select GENERIC_IRQ_MULTI_HANDLER
>   select GPIOLIB
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   select MVEBU_MBUS
>   select PINCTRL
>   select PINCTRL_DOVE
> @@ -1216,14 +1213,6 @@ config ISA_DMA
>  config ISA_DMA_API
>   bool
>  
> -config PCI
> - bool "PCI support" if MIGHT_HAVE_PCI
> - help
> -   Find out whether you have a PCI motherboard. PCI is the name of a
> -   bus system, i.e. the way the CPU talks to the other stuff inside
> -   your box. Other bus systems are ISA, EISA, MicroChannel (MCA) or
> -   VESA. If you have PCI, say Y, otherwise N.
> -
>  config PCI_DOMAINS
>   bool "Support for multiple PCI domains"
>   depends on PCI
> @@ -1252,8 +1241,6 @@ config PCI_HOST_ITE8152
>   default y
>   select DMABOUNCE
>  
> -source "drivers/pci/Kconfig"
> -
>  source "drivers/pcmcia/Kconfig"
>  
>  endmenu
> diff --git a/arch/arm/mach-ks8695/Kconfig b/arch/arm/mach-ks8695/Kconfig
> index a545976bdbd6..b3185c05fffa 100644
> --- a/arch/arm/mach-ks8695/Kconfig
> +++ b/arch/arm/mach-ks8695/Kconfig
> @@ -4,7 +4,7 @@ menu "Kendin/Micrel KS8695 Implementations"
>  
>  config MACH_KS8695
>   bool "KS8695 development board"
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   help
> Say 'Y' here if you want your kernel to run on the original
> Kendin-Micrel KS8695 development board.
> @@ -52,7 +52,7 @@ config MACH_CM4002
>  
>  config MACH_CM4008
>   bool "OpenGear CM4008"
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   help
> Say 'Y' here if you want your kernel to support the OpenGear
> CM4008 Console Server. See http://www.opengear.com for more
> @@ -60,7 +60,7 @@ config MACH_CM4008
>  
>  config MACH_CM41xx
>   bool "OpenGear CM41xx"
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   help
> Say 'Y' here if you want your kernel to support the OpenGear
> CM4016 or CM4048 Console Servers. See http://www.opengear.com for
> @@ -68,7 +68,7 @@ config MACH_CM41xx
>  
>  config MACH_IM4004
>   bool "OpenGear IM4004"
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   help
> Say 'Y' here if you want your kernel to support the OpenGear
> IM4004 Secure Access Server. See http://www.opengear.com for
> @@ -76,7 +76,7 @@ config MACH_IM4004
>  
>  config MACH_IM42xx
>   bool "OpenGear IM42xx"
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   help
> Say 'Y' here if you want your kernel to support the OpenGear
> IM4216 or IM4248 Console Servers. See http://www.opengear.com for
> diff --git a/arch/arm/mach-pxa/Kconfig b/arch/arm/mach-pxa/Kconfig
> index a68b34183107..b185794549be 100644
> --- a/arch/arm/mach-pxa/Kconfig
> +++ b/arch/arm/mach-pxa/Kconfig
> @@ -125,7 +125,7 @@ config MACH_ARMCORE
>   bool "CompuLab CM-X255/CM-X270 modules"
>   select ARCH_HAS_DMA_SET_COHERENT_MASK if PCI
>   select IWMMXT
> - select MIGHT_HAVE_PCI
> + select HAVE_PCI
>   select NEED_MACH_IO_H if PCI
>   select PXA25x
>   select PXA27x

This is wrong.  "MIGHT_HAVE_PCI" is _not_ the same as "HAVE_PCI" - we
have a bunch of platforms that mandatorily have PCI and these select
PCI directly.  "MIGHT_HAVE_PCI" controls the _visibility_ of the PCI
menu option, but does not prevent it being selected.  Your patch will
cause Kconfig to complain for those which mandatorily have PCI but
do not set HAVE_PCI.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH V4 2/15] KVM/MMU: Add tlb flush with range helper function

2018-10-14 Thread Russell King - ARM Linux

On Sun, Oct 14, 2018 at 09:21:23PM +0800, Tianyu Lan wrote:
> Sorry to confuse your. I get from CCers from get_maintainer.pl script.

Unfortunately you seem to have made a mistake.  My email address is
'li...@armlinux.org.uk' not 'li...@armlinux.org'.  There is no
'li...@armlinux.org' in MAINTAINERS, yet your emails appear to be
copied to this address.

Consequently, if it was your intention to copy me with the entire
series, that didn't happen.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH V4 2/15] KVM/MMU: Add tlb flush with range helper function

2018-10-14 Thread Russell King - ARM Linux

On Sun, Oct 14, 2018 at 10:16:56AM +0200, Thomas Gleixner wrote:
> On Sun, 14 Oct 2018, Liran Alon wrote:
> > > On 13 Oct 2018, at 17:53, lantianyu1...@gmail.com wrote:
> > > 
> > > From: Lan Tianyu 
> > > 
> > > This patch is to add wrapper functions for tlb_remote_flush_with_range
> > > callback.
> > > 
> > > Signed-off-by: Lan Tianyu 
> > > ---
> > > Change sicne V3:
> > >   Remove code of updating "tlbs_dirty"
> > > Change since V2:
> > >   Fix comment in the kvm_flush_remote_tlbs_with_range()
> > > ---
> > > arch/x86/kvm/mmu.c | 40 
> > > 1 file changed, 40 insertions(+)
> > > 
> > > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > > index c73d9f650de7..ff656d85903a 100644
> > > --- a/arch/x86/kvm/mmu.c
> > > +++ b/arch/x86/kvm/mmu.c
> > > @@ -264,6 +264,46 @@ static void mmu_spte_set(u64 *sptep, u64 spte);
> > > static union kvm_mmu_page_role
> > > kvm_mmu_calc_root_page_role(struct kvm_vcpu *vcpu);
> > > 
> > > +
> > > +static inline bool kvm_available_flush_tlb_with_range(void)
> > > +{
> > > + return kvm_x86_ops->tlb_remote_flush_with_range;
> > > +}
> > 
> > Seems that kvm_available_flush_tlb_with_range() is not used in this patch…
> 
> What's wrong with that? 
> 
> It provides the implementation and later patches make use of it. It's a
> sensible way to split patches into small, self contained entities.

>From what I can see of the patches that follow _this_ patch in the
series, this function remains unused.  So, not only is it not used
in this patch, it's not used in this series.

I think the real question that needs asking is - what is the plan
for this function, and when will it be used?

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [PATCH V4 2/15] KVM/MMU: Add tlb flush with range helper function

2018-10-14 Thread Russell King - ARM Linux

On Sun, Oct 14, 2018 at 10:27:34AM +0100, Russell King - ARM Linux wrote:
> On Sun, Oct 14, 2018 at 10:16:56AM +0200, Thomas Gleixner wrote:
> > On Sun, 14 Oct 2018, Liran Alon wrote:
> > > > On 13 Oct 2018, at 17:53, lantianyu1...@gmail.com wrote:
> > > > 
> > > > From: Lan Tianyu 
> > > > 
> > > > This patch is to add wrapper functions for tlb_remote_flush_with_range
> > > > callback.
> > > > 
> > > > Signed-off-by: Lan Tianyu 
> > > > ---
> > > > Change sicne V3:
> > > >   Remove code of updating "tlbs_dirty"
> > > > Change since V2:
> > > >   Fix comment in the kvm_flush_remote_tlbs_with_range()
> > > > ---
> > > > arch/x86/kvm/mmu.c | 40 
> > > > 1 file changed, 40 insertions(+)
> > > > 
> > > > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > > > index c73d9f650de7..ff656d85903a 100644
> > > > --- a/arch/x86/kvm/mmu.c
> > > > +++ b/arch/x86/kvm/mmu.c
> > > > @@ -264,6 +264,46 @@ static void mmu_spte_set(u64 *sptep, u64 spte);
> > > > static union kvm_mmu_page_role
> > > > kvm_mmu_calc_root_page_role(struct kvm_vcpu *vcpu);
> > > > 
> > > > +
> > > > +static inline bool kvm_available_flush_tlb_with_range(void)
> > > > +{
> > > > +   return kvm_x86_ops->tlb_remote_flush_with_range;
> > > > +}
> > > 
> > > Seems that kvm_available_flush_tlb_with_range() is not used in this patch…
> > 
> > What's wrong with that? 
> > 
> > It provides the implementation and later patches make use of it. It's a
> > sensible way to split patches into small, self contained entities.
> 
> From what I can see of the patches that follow _this_ patch in the
> series, this function remains unused.  So, not only is it not used
> in this patch, it's not used in this series.

Note - I seem to have only received patches 1 through 4, so this is
based on the patches I've received.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

Re: [RESEND PATCH 0/6] rapidio: move Kconfig menu definition to subsystem

2018-07-31 Thread Russell King - ARM Linux

For the thread associated with this patch set, a review of a previous
patch for ARM posted last Tuesday on this subject asked a series of
questions about the PCI-nature of this.  The review has not been
responded to.

If it is inappropriate to offer RapidIO for any architecture that
happens to has PCI, then it is inappropriate to offer it for any
ARM machine that happens to have PCI.

In light of the lack of explanation on this point so far, I'm naking
the ARM part of this series for now.

I also think that the HAS_RAPIDIO thing is misleading and needs
sorting out (as I've mentioned in other emails, including the one
I refer to above) before rapidio becomes available more widely.

On Tue, Jul 31, 2018 at 10:29:48AM -0400, Alexei Colin wrote:
> Resending the patchset from prior submission:
> https://lkml.org/lkml/2018/7/30/911
> 
> The only change are the Cc tags in all patches now include the mailing
> lists for all affected architectures, and patch 1/6 (which adds the menu
> item to RapdidIO subsystem Kconfig) is CCed to all maintainers who are
> getting this cover letter. The cover letter has been updated with
> explanations to points raised in the feedback.
> 
> 
> 
> The top-level Kconfig entry for RapidIO subsystem is currently
> duplicated in several architecture-specific Kconfig files. This set of
> patches does two things:
> 
> 1. Move the Kconfig menu definition into the RapidIO subsystem and
> remove the duplicate definitions from arch Kconfig files.
> 
> 2. Enable RapidIO Kconfig menu entry for arm and arm64 architectures,
> where it was not enabled before. I tested that subsystem and drivers
> build successfully for both architectures, and tested that the modules
> load on a custom arm64 Qemu model.
> 
> For all architectures, RapidIO menu should be offered when either:
> (1) The platform has a PCI bus (which host a RapidIO module on the bus).
> (2) The platform has a RapidIO IP block (connected to a system bus, e.g.
> AXI on ARM). In this case, 'select HAS_RAPIDIO' should be added to the
> 'config ARCH_*' menu entry for the SoCs that offer the IP block.
> 
> Prior to this patchset, different architectures used different criteria:
> * powerpc: (1) and (2)
> * mips: (1) and (2) after recent commit into next that added (2):
>   https://www.linux-mips.org/archives/linux-mips/2018-07/msg00596.html
>   fc5d988878942e9b42a4de5204bdd452f3f1ce47
>   491ec1553e0075f345fbe476a93775eabcbc40b6
> * x86: (1)
> * arm,arm64: none (RapidIO menus never offered)
> 
> This set of architectures are the ones that implement support for
> RapidIO as system bus. On some platforms RapidIO can be the only system
> bus available replacing PCI/PCIe.  As it is done now, RapidIO is
> configured in "Bus Options" (x86/PPC) or "Bus Support" (ARMs) sub-menu
> and from system configuration option it should be kept this way.
> Current location of RAPIDIO configuration option is familiar to users of
> PowerPC and x86 platforms, and is similarly available in some ARM
> manufacturers kernel code trees. (Alex Bounine)
> 
> HAS_RAPIDIO is not enabled unconditionally, because HAS_RAPIDIO option
> is intended for SOCs that have built in SRIO controllers, like TI
> KeyStoneII or FPGAs. Because RapidIO subsystem core is required during
> RapidIO port driver initialization, having separate option allows us to
> control available build options for RapidIO core and port driver (bool
> vs.  tristate) and disable module option if port driver is configured as
> built-in. (Alex Bounine)
> 
> Responses to feedback from prior submission (thanks for the reviews!):
> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-July/593347.html
> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-July/593349.html
> 
> Changelog:
>   * Moved Kconfig entry into RapidIO subsystem instead of duplicating
> 
> In the current patchset, I took the approach of adding '|| PCI' to the
> depends in the subsystem. I did try the alterantive approach mentioned
> in the reviews for v1 of this patch, where the subsystem Kconfig does
> not add a '|| PCI' and each per-architecture Kconfig has to add a
> 'select HAS_RAPIDIO if PCI' and SoCs with IP blocks have to also add
> 'select HAS_RAPIDIO'. This works too but requires each architecture's
> Kconfig to add the line for RapidIO (whereas current approach does not
> require that involvement) and also may create a false impression that
> the dependency on PCI is strict.
> 
> We appreciate the suggestion for also selecting the RapdiIO subsystem for
> compilation with COMPILE_TEST, but hope to address it in a separate
> patchset, localized to the subsystem, since it will need to change
> depends on all drivers, not just on the top level, and since this
> patch now spans multiple architectures.
> 
> Alexei Colin (6):
>   rapidio: define top Kconfig menu in driver subtree
>   x86: factor out RapidIO Kconfig menu
>   powerpc: factor out RapidIO Kconfig menu entry
>   mips: factor out RapidIO Kconfig entry
>   arm:

Re: [RESEND PATCH 5/6] arm: enable RapidIO menu in Kconfig

2018-07-31 Thread Russell King - ARM Linux

On Tue, Jul 31, 2018 at 10:29:53AM -0400, Alexei Colin wrote:
> Platforms with a PCI bus will be offered the RapidIO menu since they may
> be want support for a RapidIO PCI device. Platforms without a PCI bus
> that might include a RapidIO IP block will need to "select HAS_RAPIDIO"
> in the platform-/machine-specific "config ARCH_*" Kconfig entry.
> 
> Tested that kernel builds for ARM with the RapidIO subsystem and switch
> drivers enabled.
> 
> Cc: Andrew Morton 
> Cc: Russell King 
> Cc: John Paul Walters 
> Cc: linux-ker...@vger.kernel.org
> Cc: x...@kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-m...@linux-mips.org
> Cc: linux-arm-ker...@lists.infradead.org
> Signed-off-by: Alexei Colin 
> ---
>  arch/arm/Kconfig | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index afe350e5e3d9..602a61324890 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1278,6 +1278,8 @@ config PCI_HOST_ITE8152
>  
>  source "drivers/pci/Kconfig"
>  
> +source "drivers/rapidio/Kconfig"
> +
>  source "drivers/pcmcia/Kconfig"

Why not place the new Kconfig file after pcmcia?  That way, it is in
a consistent position wrt architectures such as powerpc, and it is
also in alphabetical order.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up

Re: [RESEND PATCH 1/6] rapidio: define top Kconfig menu in driver subtree

2018-07-31 Thread Russell King - ARM Linux

On Tue, Jul 31, 2018 at 10:29:49AM -0400, Alexei Colin wrote:
> The top-level Kconfig entry for RapidIO subsystem is currently
> duplicated in several architecture-specific Kconfigs. This
> commit re-defines it in the driver subtree, and subsequent
> commits, one per architecture, will remove the duplicated
> definitions from respective per-architecture Kconfigs.
> 
> Cc: Andrew Morton 
> Cc: John Paul Walters 
> Cc: Catalin Marinas 
> Cc: Russell King 
> Cc: Arnd Bergmann 
> Cc: Will Deacon 
> Cc: Ralf Baechle 
> Cc: Paul Burton 
> Cc: Alexander Sverdlin 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Thomas Gleixner 
> Cc: Peter Anvin 
> Cc: x...@kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-m...@linux-mips.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> Signed-off-by: Alexei Colin 
> ---
>  drivers/rapidio/Kconfig | 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/rapidio/Kconfig b/drivers/rapidio/Kconfig
> index d6d2f20c4597..98e301847584 100644
> --- a/drivers/rapidio/Kconfig
> +++ b/drivers/rapidio/Kconfig
> @@ -1,6 +1,21 @@
>  #
>  # RapidIO configuration
>  #
> +
> +config HAS_RAPIDIO
> + bool
> + default n

There's no need to specify this default - the default default defaults to
'n' anyway, so "default n" just respecifies what's already the default.
(next time I'll try to add more "default"s into that! ;) )

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up

Re: [PATCH 0/6] rapidio: move Kconfig menu definition to subsystem

2018-07-30 Thread Russell King - ARM Linux

I only have this message, and patches 5 and 6.  This is meaningless
for me to review - I can't tell what you've done as far as my comments
on your previous iteration.

Please arrange to _at least_ copy all patches the appropriate mailing
lists for the set with your complete patch set if you aren't going to
copy everyone on all patches in the set.

On Mon, Jul 30, 2018 at 06:50:28PM -0400, Alexei Colin wrote:
> In the current patchset, I took the approach of adding '|| PCI' to the
> depends in the subsystem. I did try the alterantive approach mentioned
> in the reviews for v1 of this patch, where the subsystem Kconfig does
> not add a '|| PCI' and each per-architecture Kconfig has to add a
> 'select HAS_RAPIDIO if PCI' and SoCs with IP blocks have to also add
> 'select HAS_RAPIDIO'. This works too but requires each architecture's
> Kconfig to add the line for RapidIO (whereas current approach does not
> require that involvement) and also may create a false impression that
> the dependency on PCI is strict.

... which means, as it stands, I've no idea what you actually came up
with for this.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up

Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

2018-04-15 Thread Russell King - ARM Linux

On Fri, Apr 13, 2018 at 12:53:49PM -0700, Linus Torvalds wrote:
> On Fri, Apr 13, 2018 at 11:45 AM, Dave Martin  wrote:
> >
> > Most uses I've seen do nothing more than use the FPE_xyz value to
> > format diagnostic messages while dying.  I struggled to find code that
> > made a meaningful functional decision based on the value, though that's
> > not proof...
> 
> Yeah. I've seen code that cares about SIGFPE deeply, but it's almost
> invariably about some emulated environment (eg Java VM, or CPU
> emulation).
> 
> And the siginfo data is basically never good enough for those
> environments anyway on its own, so they will go and look at the actual
> instruction that caused the fault and the register state instead,
> because they need *all* the information.
> 
> The cases that use si_code are the ones that just trapped signals in
> order to give a more helpful abort message.
> 
> So I could certainly imagine that si_code is actually used by somebody
> who then decides to actuall act differently on it, but aside from
> perhaps printing out a different message, it sounds far-fetched.

Okay, in that case let's just use FPE_FLTINV.  That makes the patch
easily back-portable for stable kernels.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

1 2 3 4 >

1 - 100 of 397 matches

Mail list logo