date:20231124

Re: [PATCH net-next 01/38] selftests/net: add lib.sh

2023-11-24 Thread Hangbin Liu

On Fri, Nov 24, 2023 at 03:35:51PM +0100, Petr Machata wrote:
> 
> Hangbin Liu  writes:
> 
> > +cleanup_ns()
> > +{
> > +   local ns=""
> > +   local errexit=0
> > +
> > +   # disable errexit temporary
> > +   if [[ $- =~ "e" ]]; then
> > +   errexit=1
> > +   set +e
> > +   fi
> > +
> > +   for ns in "$@"; do
> > +   ip netns delete "${ns}" &> /dev/null
> > +   busywait 2 "ip netns list | grep -vq $1" &> /dev/null
> 
> The grep would get confused by substrings of other names.
> This should be grep -vq "^$ns$".

Thanks. I assumed the ns name would look like foo-xxx, but I forgot this
is a common function, which may be called with an ordinary ns name.
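
The substring problem and the anchored fix can be reproduced without root.
The following is only a sketch: sample_list stands in for "ip netns list",
and ns_listed is a hypothetical helper (not part of the patch) that compares
the first field of each line. That comparison is an assumption, made because
"ip netns list" may append " (id: N)" to a name, which a bare "^$ns$"
pattern would not match.

```bash
# Stand-in for "ip netns list" output, so the sketch runs without root.
sample_list()
{
	printf '%s\n' "ns1-abc123" "ns10 (id: 3)" "foo"
}

# Hypothetical helper: succeed if namespace "$1" is the first field of
# any line read from stdin (one namespace per line).
ns_listed()
{
	awk -v ns="$1" '$1 == ns { found = 1 } END { exit !found }'
}

# Unanchored grep: "ns1" matches ns1-abc123 and ns10, although no
# namespace called exactly "ns1" exists.
sample_list | grep -q "ns1" && echo "unanchored: ns1 looks present"

# Anchored grep, as suggested in the review: no false positive.
sample_list | grep -q "^ns1$" || echo "anchored: ns1 is gone"

# Comparing the first field also copes with the " (id: N)" suffix.
sample_list | ns_listed "ns10" && echo "first field: ns10 is present"
```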

> 
> > +   if ip netns list | grep -q $1; then
> 
> Busywait returns != 0 when the wait condition is not reached within a
> given time. So it should be possible to roll the duplicated if-grep into
> the busywait line like so:
> 
>   if ! busywait 2 "ip netns etc."; then

You are right.
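
Rolling the check into busywait could look like the sketch below. Two
assumptions are made here that are not in the patch: busywait_sketch is a
simplified stand-in that counts whole seconds (the patch's busywait counts
milliseconds via GNU date's %3N), and the pipeline is wrapped in a
hypothetical ns_gone function. The wrapper is needed because busywait runs
its arguments with $($@), where a "|" inside a quoted string is passed to ip
as an ordinary argument instead of starting a pipeline.

```bash
# Simplified stand-in for busywait(): retry "$@" until it succeeds or
# more than timeout_s seconds have elapsed.
busywait_sketch()
{
	local timeout_s=$1; shift
	local start now

	start=$(date +%s)
	while true; do
		if "$@"; then
			return 0
		fi
		now=$(date +%s)
		if [ $((now - start)) -gt "$timeout_s" ]; then
			return 1
		fi
	done
}

# Hypothetical predicate: true once namespace "$1" is no longer listed.
ns_gone()
{
	! ip netns list 2>/dev/null |
		awk -v ns="$1" '$1 == ns { found = 1 } END { exit !found }'
}

# Sketch of the loop body in cleanup_ns() with the if-grep rolled in.
remove_ns()
{
	local ns=$1

	ip netns delete "$ns" > /dev/null 2>&1
	if ! busywait_sketch 2 ns_gone "$ns"; then
		echo "Failed to remove namespace $ns"
		return 1	# the patch returns $ksft_skip here
	fi
}
```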
> 
> > +   echo "Failed to remove namespace $1"
> > +   return $ksft_skip
> 
> This does not restore the errexit.
> 
> I think it might be clearest to have this function as a helper, say
> __cleanup_ns, and then have a wrapper that does the errexit management:
> 
> cleanup_ns()
> {
>   local errexit
>   local rc
> 
>   # disable errexit temporarily
>   if [[ $- =~ "e" ]]; then
>   errexit=1
>   set +e
>   fi
> 
>   __cleanup_ns "$@"
>   rc=$?
> 
>   [ $errexit -eq 1 ] && set -e
>   return $rc
> }
> 
> If this comes up more often, we can have a helper like
> with_disabled_errexit or whatever, that does this management and
> dispatches to "$@", so cleanup_ns() would become:
> 
> cleanup_ns()
> {
>   with_disabled_errexit __cleanup_ns "$@"
> }

Thanks for your suggestion.
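
A with_disabled_errexit helper along those lines might look like the sketch
below. The helper body is an assumption built from the wrapper quoted above,
with errexit initialized to 0 so that the final test also works when errexit
was never enabled. __cleanup_ns is replaced by a stand-in so the example runs
without root.

```bash
with_disabled_errexit()
{
	local errexit=0
	local rc

	# Remember whether errexit was on, then switch it off.
	case "$-" in
	*e*)
		errexit=1
		set +e
		;;
	esac

	"$@"
	rc=$?

	# Restore errexit on every path before returning.
	if [ "$errexit" -eq 1 ]; then
		set -e
	fi
	return $rc
}

# Stand-in for the real __cleanup_ns helper.
__cleanup_ns()
{
	false			# would abort the script under "set -e"
	echo "still running"
	return 4		# e.g. $ksft_skip
}

cleanup_ns()
{
	with_disabled_errexit __cleanup_ns "$@"
}
```

Because the early return lives in the inner helper, every exit path goes
through the restore step in the wrapper.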

> 
> > +   fi
> > +   done
> > +
> > +   [ $errexit -eq 1 ] && set -e
> > +   return 0
> > +}
> > +
> > +# By default, remove all netns before EXIT.
> > +cleanup_all_ns()
> > +{
> > +   cleanup_ns $NS_LIST
> > +}
> > +trap cleanup_all_ns EXIT
> 
> Hmm, OK, this is a showstopper for inclusion from forwarding/lib.sh,
> because basically all users of forwarding/lib.sh use the EXIT trap.
> 
> I wonder if we need something like these push_cleanup / on_exit helpers:
> 
>   https://github.com/pmachata/stuff/blob/master/ptp-test/lib.sh#L15

When I added this, I just wanted to make sure the netns are cleaned up if
the client script forgets. I think the client script's trap function should
cover this one, no?

> 
> But I don't want to force this on your already large patchset :)

Yes, Paolo also told me that this is too large. I will break it into
2 patch sets or merge some small patches together for the next version.

> So just ignore the bit about including from forwarding/lib.sh.

> Actually I take this back. The cleanup should be invoked from where the
> init was called. I don't think the library should be auto-invoking it,
> the client scripts should. Whether through a trap or otherwise.

OK, also makes sense. I will remove this trap.
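
For context on the trap conflict: a shell keeps only one EXIT handler, so a
later "trap ... EXIT" in a client script silently replaces the one installed
by the library. A push_cleanup-style helper avoids that by keeping a list of
handlers behind a single trap. The sketch below is hypothetical and is not
copied from the linked lib.sh; the names push_cleanup and run_cleanups and
the newline-separated list are assumptions chosen for illustration.

```bash
# Newline-separated list of cleanup commands, most recent first.
_cleanups=""

# Register a command string to run at exit.
push_cleanup()
{
	_cleanups="$*
$_cleanups"
}

# Run all registered commands in reverse registration order.
run_cleanups()
{
	local cmd

	while IFS= read -r cmd; do
		[ -n "$cmd" ] && eval "$cmd"
	done <<EOF
$_cleanups
EOF
	_cleanups=""
}

# The only EXIT trap; library and client code both use push_cleanup.
trap run_cleanups EXIT
```

With this in place, a library could call push_cleanup 'cleanup_ns $NS_LIST'
while a client registers its own handler, and neither overwrites the other.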

Thanks for all your comments.
Hangbin

Re: [PATCH net-next 01/38] selftests/net: add lib.sh

2023-11-24 Thread Hangbin Liu

On Fri, Nov 24, 2023 at 03:05:18PM +0100, Petr Machata wrote:
> 
> Hangbin Liu  writes:
> 
> > Add a lib.sh for net selftests. This file can be used to define commonly
> > used variables and functions.
> >
> > Add function setup_ns() for user to create unique namespaces with given
> > prefix name.
> >
> > Signed-off-by: Hangbin Liu 
> > ---
> > diff --git a/tools/testing/selftests/net/lib.sh 
> > b/tools/testing/selftests/net/lib.sh
> > new file mode 100644
> > index ..239ab2beb438
> > --- /dev/null
> > +++ b/tools/testing/selftests/net/lib.sh
> > @@ -0,0 +1,98 @@
> > +#!/bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +
> > +##
> > +# Defines
> > +
> > +# Kselftest framework requirement - SKIP code is 4.
> > +ksft_skip=4
> > +# namespace list created by setup_ns
> > +NS_LIST=""
> > +
> > +##
> > +# Helpers
> > +busywait()
> > +{
> > +   local timeout=$1; shift
> > +
> > +   local start_time="$(date -u +%s%3N)"
> > +   while true
> > +   do
> > +   local out
> > +   out=$($@)
> > +   local ret=$?
> > +   if ((!ret)); then
> > +   echo -n "$out"
> > +   return 0
> > +   fi
> > +
> > +   local current_time="$(date -u +%s%3N)"
> > +   if ((current_time - start_time > timeout)); then
> > +   echo -n "$out"
> > +   return 1
> > +   fi
> > +   done
> > +}
> 
> This is lifted from forwarding/lib.sh, right? Would it make sense to

Yes.

> just source this new file from forwarding/lib.sh instead of copying

Do you mean let net/forwarding/lib.sh source net.lib, and let other net
tests source the net/forwarding/lib.sh?

Or move the busywait() function from net/forwarding/lib.sh to net.lib.
Then let net/forwarding/lib.sh source net.lib?

> stuff around? I imagine there will eventually be more commonality, and
> when that pops up, we can just shuffle the forwarding code to
> net/lib.sh.

Yes, makes sense.

Thanks
Hangbin

Re: [RFC PATCH 0/5] RISC-V: Add dynamic TSO support

2023-11-24 Thread Guo Ren

On Fri, Nov 24, 2023 at 11:15:19AM +0100, Peter Zijlstra wrote:
> On Fri, Nov 24, 2023 at 08:21:37AM +0100, Christoph Muellner wrote:
> > From: Christoph Müllner 
> > 
> > The upcoming RISC-V Ssdtso specification introduces a bit in the senvcfg
> > CSR to switch the memory consistency model at run-time from RVWMO to TSO
> > (and back). The active consistency model can therefore be switched on a
> > per-hart base and managed by the kernel on a per-process/thread base.
> 
> You guys, computers are hartless, nobody told ya?
> 
> > This patch implements basic Ssdtso support and adds a prctl API on top
> > so that user-space processes can switch to a stronger memory consistency
> > model (than the kernel was written for) at run-time.
> > 
> > I am not sure if other architectures support switching the memory
> > consistency model at run-time, but designing the prctl API in an
> > arch-independent way allows reusing it in the future.
> 
> IIRC some Sparc chips could do this, but I don't think anybody ever
> exposed this to userspace (or used it much).
> 
> IA64 had planned to do this, except they messed it up and did it the
> wrong way around (strong first and then relax it later), which led to
> the discovery that all existing software broke (d'uh).
> 
> I think ARM64 approached this problem by adding the
> load-acquire/store-release instructions and for TSO based code,
> translate into those (eg. x86 -> arm64 transpilers).
Keeping global TSO order is easier and faster than mixing
acquire/release and regular load/store. That means when ssdtso is
enabled, the transpiler's load-acquire/store-release becomes regular
load/store. Some micro-architectures could then deliver better performance.

Of course, you may say powerful machines could smooth out the difference
between ssdtso & load-acquire/store-release, but that's not real life.
Adding ssdtso is a flexible way to gain more choices on the cost of chip
design.

> 
> IIRC Risc-V actually has such instructions as well, so *why* are you
> doing this?!?!
>

Re: [PATCH 0/5] pstore: add tty frontend and multi-backend

2023-11-24 Thread Guilherme G. Piccoli

Hi Yuanhe / Kees.

My apologies (and embarrassment) for responding almost 2mo later...

On 29/09/2023 00:49, Kees Cook wrote:
> [...]
>> Another problem is that currently pstore only supports a single backend.
>> For debugging kdump problems, we hope to save the console logs and tty
>> logs to the ramoops backend of pstore, as it will not be lost after
>> rebooting. If the user has enabled another backend, the ramoops backend
>> will not be registered. To this end, we add the multi-backend function
>> to support simultaneous registration of multiple backends.
> 
> Ah very cool; I really like this idea. I'd wanted to do it for a while
> just to make testing easier, but I hadn't had time to attempt it.

I found the idea of multi-backend quite interesting, thanks for that!!!
And to add to what Kees mentioned: not sure about others' opinions, but it
seems to me this is a bit more straightforward / path-of-least-resistance
than the tty frontend, so I'd suggest splitting the series and focusing first
on this and, once accepted, hooking in the tty thingy.

Not that the series can't be sent altogether, reviews could work in
parallel...I just see them as a bit tangential one to the other, personally.

> [...]
> - The multi-backend will enable _all possible_ backends, and that's a
>   big change that will do weird things for some pstore users. I would
>   prefer a pstore option to opt-in to enabling all backends. Perhaps
>   have "pstore.backend=" be parsed with commas, so a list of backends
>   can be provided, or "all" for the "all backends" behavior.
> 
> - Moving the pstorefs files into a subdirectory will break userspace
>   immediately (e.g. systemd-pstore expects very specifically named
>   files). Using subdirectories seems like a good idea, but perhaps
>   we need hardlinks into the root pstorefs for the "first" backend,
>   or some other creative solution here.
>

Big +1 on these two: commas are a very nice idea, and changing the current
sysfs way of exposing pstore logs would break at least kdumpst (the
Steam Deck/Arch pstore / kdump tool), besides systemd-pstore that was
already mentioned (and who knows how many more tools / scripts out in the
field).

Overall, thanks a bunch for this work Yuanhe!
Cheers,

Guilherme

Re: [PATCH ipsec-next v1 7/7] bpf: xfrm: Add selftest for bpf_xdp_get_xfrm_state()

2023-11-24 Thread Daniel Xu

Hi Alexei,

On Wed, Nov 22, 2023 at 03:28:16PM -0800, Alexei Starovoitov wrote:
> On Wed, Nov 22, 2023 at 10:21 AM Daniel Xu  wrote:
> >
> > +
> > +   bpf_printk("replay-window %d\n", x->replay_esn->replay_window);
> 
> Pls no printk in tests. Find a different way to validate.

Ack. I'll migrate the ipsec tunnel tests to test_progs next rev so it
can use mmaped globals.

Thanks,
Daniel

Re: [PATCH v1 1/3] KVM: selftests: aarch64: Make the [create|destroy]_vpmu_vm() can be reused

2023-11-24 Thread Eric Auger

Hi Shaoqin,

On 11/23/23 07:37, Shaoqin Huang wrote:
> Move the [create|destroy]_vpmu_vm() into the lib/, which makes those
some wording suggestions below:

Move the implementation of .. into lib/aarch64/pmu.c and export their
declaration in a header so that they can be reused by other tests. Also
the title may be renamed: Make [create|destroy]_vpmu_vm() public
> function can be used by other tests. Install the handler is specific to
the sync exception handler install is test specific so we move it out of
the helper function.
> the vpmu_counter_access test, so create a wrapper function for it, and
> only move the common part.
> 
> No functional change.
intended ;-)
> 
> Signed-off-by: Shaoqin Huang 
> ---
>  tools/testing/selftests/kvm/Makefile  |   1 +
>  .../kvm/aarch64/vpmu_counter_access.c | 100 +-
>  .../selftests/kvm/include/aarch64/vpmu.h  |  16 +++
>  .../testing/selftests/kvm/lib/aarch64/vpmu.c  |  64 +++
>  4 files changed, 105 insertions(+), 76 deletions(-)
>  create mode 100644 tools/testing/selftests/kvm/include/aarch64/vpmu.h
>  create mode 100644 tools/testing/selftests/kvm/lib/aarch64/vpmu.c
> 
> diff --git a/tools/testing/selftests/kvm/Makefile 
> b/tools/testing/selftests/kvm/Makefile
> index a5963ab9215b..b60852c222ac 100644
> --- a/tools/testing/selftests/kvm/Makefile
> +++ b/tools/testing/selftests/kvm/Makefile
> @@ -57,6 +57,7 @@ LIBKVM_aarch64 += lib/aarch64/processor.c
>  LIBKVM_aarch64 += lib/aarch64/spinlock.c
>  LIBKVM_aarch64 += lib/aarch64/ucall.c
>  LIBKVM_aarch64 += lib/aarch64/vgic.c
> +LIBKVM_aarch64 += lib/aarch64/vpmu.c
>  
>  LIBKVM_s390x += lib/s390x/diag318_test_handler.c
>  LIBKVM_s390x += lib/s390x/processor.c
> diff --git a/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c 
> b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
> index 5ea78986e665..17305408a334 100644
> --- a/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
> +++ b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
> @@ -16,6 +16,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -25,13 +26,7 @@
>  /* The cycle counter bit position that's common among the PMU registers */
>  #define ARMV8_PMU_CYCLE_IDX  31
>  
> -struct vpmu_vm {
> - struct kvm_vm *vm;
> - struct kvm_vcpu *vcpu;
> - int gic_fd;
> -};
> -
> -static struct vpmu_vm vpmu_vm;
> +static struct vpmu_vm *vpmu_vm;
>  
>  struct pmreg_sets {
>   uint64_t set_reg_id;
> @@ -421,64 +416,6 @@ static void guest_code(uint64_t expected_pmcr_n)
>   GUEST_DONE();
>  }
>  
> -#define GICD_BASE_GPA0x800ULL
> -#define GICR_BASE_GPA0x80AULL
> -
> -/* Create a VM that has one vCPU with PMUv3 configured. */
> -static void create_vpmu_vm(void *guest_code)
> -{
> - struct kvm_vcpu_init init;
> - uint8_t pmuver, ec;
> - uint64_t dfr0, irq = 23;
> - struct kvm_device_attr irq_attr = {
> - .group = KVM_ARM_VCPU_PMU_V3_CTRL,
> - .attr = KVM_ARM_VCPU_PMU_V3_IRQ,
> - .addr = (uint64_t)&irq,
> - };
> - struct kvm_device_attr init_attr = {
> - .group = KVM_ARM_VCPU_PMU_V3_CTRL,
> - .attr = KVM_ARM_VCPU_PMU_V3_INIT,
> - };
> -
> - /* The test creates the vpmu_vm multiple times. Ensure a clean state */
> - memset(&vpmu_vm, 0, sizeof(vpmu_vm));
> -
> - vpmu_vm.vm = vm_create(1);
> - vm_init_descriptor_tables(vpmu_vm.vm);
> - for (ec = 0; ec < ESR_EC_NUM; ec++) {
> - vm_install_sync_handler(vpmu_vm.vm, VECTOR_SYNC_CURRENT, ec,
> - guest_sync_handler);
> - }
> -
> - /* Create vCPU with PMUv3 */
> - vm_ioctl(vpmu_vm.vm, KVM_ARM_PREFERRED_TARGET, &init);
> - init.features[0] |= (1 << KVM_ARM_VCPU_PMU_V3);
> - vpmu_vm.vcpu = aarch64_vcpu_add(vpmu_vm.vm, 0, &init, guest_code);
> - vcpu_init_descriptor_tables(vpmu_vm.vcpu);
> - vpmu_vm.gic_fd = vgic_v3_setup(vpmu_vm.vm, 1, 64,
> - GICD_BASE_GPA, GICR_BASE_GPA);
> - __TEST_REQUIRE(vpmu_vm.gic_fd >= 0,
> -"Failed to create vgic-v3, skipping");
> -
> - /* Make sure that PMUv3 support is indicated in the ID register */
> - vcpu_get_reg(vpmu_vm.vcpu,
> -  KVM_ARM64_SYS_REG(SYS_ID_AA64DFR0_EL1), &dfr0);
> - pmuver = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), dfr0);
> - TEST_ASSERT(pmuver != ID_AA64DFR0_EL1_PMUVer_IMP_DEF &&
> - pmuver >= ID_AA64DFR0_EL1_PMUVer_IMP,
> - "Unexpected PMUVER (0x%x) on the vCPU with PMUv3", pmuver);
> -
> - /* Initialize vPMU */
> - vcpu_ioctl(vpmu_vm.vcpu, KVM_SET_DEVICE_ATTR, &irq_attr);
> - vcpu_ioctl(vpmu_vm.vcpu, KVM_SET_DEVICE_ATTR, &init_attr);
> -}
> -
> -static void destroy_vpmu_vm(void)
> -{
> - close(vpmu_vm.gic_fd);
> - kvm_vm_free(vpmu_vm.vm);
> -}
> -
>  static void run_vcpu(struct kvm_vcpu *vcpu, uint64_t pmcr_n)
>  {
>

Re: [PATCH v1 2/3] KVM: selftests: aarch64: Move the pmu helper function into lib/

2023-11-24 Thread Eric Auger

Hi Shaoqin,

On 11/23/23 07:37, Shaoqin Huang wrote:
> Move those pmu helper function into lib/, thus it can be used by other
functions

Not really into lib but rather in vpmu.h header.
> pmu test.

no functional change intended
> 
> Signed-off-by: Shaoqin Huang 
> ---
>  .../kvm/aarch64/vpmu_counter_access.c | 118 -
>  .../selftests/kvm/include/aarch64/vpmu.h  | 119 ++
>  2 files changed, 119 insertions(+), 118 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c 
> b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
> index 17305408a334..62d6315790ab 100644
> --- a/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
> +++ b/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.c
> @@ -20,12 +20,6 @@
>  #include 
>  #include 
>  
> -/* The max number of the PMU event counters (excluding the cycle counter) */
> -#define ARMV8_PMU_MAX_GENERAL_COUNTERS   (ARMV8_PMU_MAX_COUNTERS - 1)
> -
> -/* The cycle counter bit position that's common among the PMU registers */
> -#define ARMV8_PMU_CYCLE_IDX  31
> -
>  static struct vpmu_vm *vpmu_vm;
>  
>  struct pmreg_sets {
> @@ -35,118 +29,6 @@ struct pmreg_sets {
>  
>  #define PMREG_SET(set, clr) {.set_reg_id = set, .clr_reg_id = clr}
>  
> -static uint64_t get_pmcr_n(uint64_t pmcr)
> -{
> - return (pmcr >> ARMV8_PMU_PMCR_N_SHIFT) & ARMV8_PMU_PMCR_N_MASK;
> -}
> -
> -static void set_pmcr_n(uint64_t *pmcr, uint64_t pmcr_n)
> -{
> - *pmcr = *pmcr & ~(ARMV8_PMU_PMCR_N_MASK << ARMV8_PMU_PMCR_N_SHIFT);
> - *pmcr |= (pmcr_n << ARMV8_PMU_PMCR_N_SHIFT);
> -}
> -
> -static uint64_t get_counters_mask(uint64_t n)
> -{
> - uint64_t mask = BIT(ARMV8_PMU_CYCLE_IDX);
> -
> - if (n)
> - mask |= GENMASK(n - 1, 0);
> - return mask;
> -}
> -
> -/* Read PMEVTCNTR_EL0 through PMXEVCNTR_EL0 */
> -static inline unsigned long read_sel_evcntr(int sel)
> -{
> - write_sysreg(sel, pmselr_el0);
> - isb();
> - return read_sysreg(pmxevcntr_el0);
> -}
> -
> -/* Write PMEVTCNTR_EL0 through PMXEVCNTR_EL0 */
> -static inline void write_sel_evcntr(int sel, unsigned long val)
> -{
> - write_sysreg(sel, pmselr_el0);
> - isb();
> - write_sysreg(val, pmxevcntr_el0);
> - isb();
> -}
> -
> -/* Read PMEVTYPER_EL0 through PMXEVTYPER_EL0 */
> -static inline unsigned long read_sel_evtyper(int sel)
> -{
> - write_sysreg(sel, pmselr_el0);
> - isb();
> - return read_sysreg(pmxevtyper_el0);
> -}
> -
> -/* Write PMEVTYPER_EL0 through PMXEVTYPER_EL0 */
> -static inline void write_sel_evtyper(int sel, unsigned long val)
> -{
> - write_sysreg(sel, pmselr_el0);
> - isb();
> - write_sysreg(val, pmxevtyper_el0);
> - isb();
> -}
> -
> -static inline void enable_counter(int idx)
> -{
> - uint64_t v = read_sysreg(pmcntenset_el0);
> -
> - write_sysreg(BIT(idx) | v, pmcntenset_el0);
> - isb();
> -}
> -
> -static inline void disable_counter(int idx)
> -{
> - uint64_t v = read_sysreg(pmcntenset_el0);
> -
> - write_sysreg(BIT(idx) | v, pmcntenclr_el0);
> - isb();
> -}
> -
> -static void pmu_disable_reset(void)
> -{
> - uint64_t pmcr = read_sysreg(pmcr_el0);
> -
> - /* Reset all counters, disabling them */
> - pmcr &= ~ARMV8_PMU_PMCR_E;
> - write_sysreg(pmcr | ARMV8_PMU_PMCR_P, pmcr_el0);
> - isb();
> -}
> -
> -#define RETURN_READ_PMEVCNTRN(n) \
> - return read_sysreg(pmevcntr##n##_el0)
> -static unsigned long read_pmevcntrn(int n)
> -{
> - PMEVN_SWITCH(n, RETURN_READ_PMEVCNTRN);
> - return 0;
> -}
> -
> -#define WRITE_PMEVCNTRN(n) \
> - write_sysreg(val, pmevcntr##n##_el0)
> -static void write_pmevcntrn(int n, unsigned long val)
> -{
> - PMEVN_SWITCH(n, WRITE_PMEVCNTRN);
> - isb();
> -}
> -
> -#define READ_PMEVTYPERN(n) \
> - return read_sysreg(pmevtyper##n##_el0)
> -static unsigned long read_pmevtypern(int n)
> -{
> - PMEVN_SWITCH(n, READ_PMEVTYPERN);
> - return 0;
> -}
> -
> -#define WRITE_PMEVTYPERN(n) \
> - write_sysreg(val, pmevtyper##n##_el0)
> -static void write_pmevtypern(int n, unsigned long val)
> -{
> - PMEVN_SWITCH(n, WRITE_PMEVTYPERN);
> - isb();
> -}
> -
>  /*
>   * The pmc_accessor structure has pointers to PMEV{CNTR,TYPER}_EL0
>   * accessors that test cases will use. Each of the accessors will
> diff --git a/tools/testing/selftests/kvm/include/aarch64/vpmu.h 
> b/tools/testing/selftests/kvm/include/aarch64/vpmu.h
> index 0a56183644ee..e0cc1ca1c4b7 100644
> --- a/tools/testing/selftests/kvm/include/aarch64/vpmu.h
> +++ b/tools/testing/selftests/kvm/include/aarch64/vpmu.h
> @@ -1,10 +1,17 @@
>  /* SPDX-License-Identifier: GPL-2.0 */
>  
>  #include 
> +#include 
>  
>  #define GICD_BASE_GPA0x800ULL
>  #define GICR_BASE_GPA0x80AULL
>  
> +/* The max number of the PMU event counters (excluding the cycle counter) */
> +#define ARMV8_PMU_MAX_GENERAL_COUNTERS   (ARMV8_PMU_MAX_COUNTERS - 1)
> +
> +/* The

Re: [PATCH v3 18/25] arm64/ptrace: add support for FEAT_POE

2023-11-24 Thread Mark Brown

On Fri, Nov 24, 2023 at 04:35:03PM +, Joey Gouly wrote:

> Add a regset for POE containing POR_EL0.

> +++ b/include/uapi/linux/elf.h
> @@ -440,6 +440,7 @@ typedef struct elf64_shdr {
>  #define NT_ARM_SSVE  0x40b   /* ARM Streaming SVE registers */
>  #define NT_ARM_ZA0x40c   /* ARM SME ZA registers */
>  #define NT_ARM_ZT0x40d   /* ARM SME ZT registers */
> +#define NT_ARM_POE   0x40f   /* ARM POE registers */

Not 0x40e?

Otherwise

Reviewed-by: Mark Brown 



[PATCH net 2/4] selftests/net: fix a char signedness issue

2023-11-24 Thread Willem de Bruijn

From: Willem de Bruijn 

char is signed on x86_64, but unsigned on arm64.

Fix the warning seen when building cmsg_sender.c on platforms where char
is signed, or when signedness is forced with -fsigned-char:

cmsg_sender.c:455:12:
error: implicit conversion from 'int' to 'char'
changes value from 128 to -128
[-Werror,-Wconstant-conversion]
buf[0] = ICMPV6_ECHO_REQUEST;

constant ICMPV6_ECHO_REQUEST is 128.

Link: https://lwn.net/Articles/911914
Fixes: de17e305a810 ("selftests: net: cmsg_sender: support icmp and raw 
sockets")
Signed-off-by: Willem de Bruijn 
---
 tools/testing/selftests/net/cmsg_sender.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/cmsg_sender.c 
b/tools/testing/selftests/net/cmsg_sender.c
index 24b21b15ed3fb..6ff3e732f449f 100644
--- a/tools/testing/selftests/net/cmsg_sender.c
+++ b/tools/testing/selftests/net/cmsg_sender.c
@@ -416,9 +416,9 @@ int main(int argc, char *argv[])
 {
struct addrinfo hints, *ai;
struct iovec iov[1];
+   unsigned char *buf;
struct msghdr msg;
char cbuf[1024];
-   char *buf;
int err;
int fd;
 
-- 
2.43.0.rc1.413.gea7ed67945-goog

[PATCH net 4/4] selftests/net: mptcp: fix uninitialized variable warnings

2023-11-24 Thread Willem de Bruijn

From: Willem de Bruijn 

Same init_rng() in both tests. The function reads /dev/urandom to
initialize srand(). In case of failure, it falls back onto the
entropy in the uninitialized variable. Not sure if this is on purpose.
But failure reading urandom should be rare, so just fail hard. While
at it, convert to getrandom(), which man 4 random suggests is simpler
and more robust.

mptcp_inq.c:525:6:
mptcp_connect.c:1131:6:

error: variable 'foo' is used uninitialized
whenever 'if' condition is false
[-Werror,-Wsometimes-uninitialized]

Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp")
Fixes: b51880568f20 ("selftests: mptcp: add inq test case")
Cc: Florian Westphal 
Signed-off-by: Willem de Bruijn 



When input is randomized because this is expected to meaningfully
explore edge cases, should we also
1. log the random seed to stdout and
2. add a command line argument to replay from a specific seed?
I can do this in net-next, if authors find it useful in this case.
---
 tools/testing/selftests/net/mptcp/mptcp_connect.c | 11 ---
 tools/testing/selftests/net/mptcp/mptcp_inq.c | 11 ---
 2 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c 
b/tools/testing/selftests/net/mptcp/mptcp_connect.c
index c7f9ebeebc2c5..d2043ec3bf6d6 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
@@ -18,6 +18,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1125,15 +1126,11 @@ int main_loop_s(int listensock)
 
 static void init_rng(void)
 {
-   int fd = open("/dev/urandom", O_RDONLY);
unsigned int foo;
 
-   if (fd > 0) {
-   int ret = read(fd, &foo, sizeof(foo));
-
-   if (ret < 0)
-   srand(fd + foo);
-   close(fd);
+   if (getrandom(&foo, sizeof(foo), 0) == -1) {
+   perror("getrandom");
+   exit(1);
}
 
srand(foo);
diff --git a/tools/testing/selftests/net/mptcp/mptcp_inq.c 
b/tools/testing/selftests/net/mptcp/mptcp_inq.c
index 8672d898f8cda..218aac4673212 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_inq.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_inq.c
@@ -18,6 +18,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -519,15 +520,11 @@ static int client(int unixfd)
 
 static void init_rng(void)
 {
-   int fd = open("/dev/urandom", O_RDONLY);
unsigned int foo;
 
-   if (fd > 0) {
-   int ret = read(fd, &foo, sizeof(foo));
-
-   if (ret < 0)
-   srand(fd + foo);
-   close(fd);
+   if (getrandom(&foo, sizeof(foo), 0) == -1) {
+   perror("getrandom");
+   exit(1);
}
 
srand(foo);
-- 
2.43.0.rc1.413.gea7ed67945-goog

[PATCH net 3/4] selftests/net: unix: fix unused variable compiler warning

2023-11-24 Thread Willem de Bruijn

From: Willem de Bruijn 

Remove an unused variable.

diag_uid.c:151:24:
error: unused variable 'udr'
[-Werror,-Wunused-variable]

Fixes: ac011361bd4f ("af_unix: Add test for sock_diag and UDIAG_SHOW_UID.")
Signed-off-by: Willem de Bruijn 
---
 tools/testing/selftests/net/af_unix/diag_uid.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/testing/selftests/net/af_unix/diag_uid.c 
b/tools/testing/selftests/net/af_unix/diag_uid.c
index 5b88f7129fea4..79a3dd75590e8 100644
--- a/tools/testing/selftests/net/af_unix/diag_uid.c
+++ b/tools/testing/selftests/net/af_unix/diag_uid.c
@@ -148,7 +148,6 @@ void receive_response(struct __test_metadata *_metadata,
.msg_iov = ,
.msg_iovlen = 1
};
-   struct unix_diag_req *udr;
struct nlmsghdr *nlh;
int ret;
 
-- 
2.43.0.rc1.413.gea7ed67945-goog

[PATCH net 1/4] selftests/net: ipsec: fix constant out of range

2023-11-24 Thread Willem de Bruijn

From: Willem de Bruijn 

Fix a small compiler warning.

nr_process must be a signed long: it is assigned a signed long by
strtol() and is compared against LONG_MIN and LONG_MAX.

ipsec.c:2280:65:
error: result of comparison of constant -9223372036854775808
with expression of type 'unsigned int' is always false
[-Werror,-Wtautological-constant-out-of-range-compare]

  if ((errno == ERANGE && (nr_process == LONG_MAX || nr_process == LONG_MIN))

Fixes: bc2652b7ae1e ("selftest/net/xfrm: Add test for ipsec tunnel")
Cc: Dmitry Safonov <0x7f454...@gmail.com>
Signed-off-by: Willem de Bruijn 
---
 tools/testing/selftests/net/ipsec.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/net/ipsec.c 
b/tools/testing/selftests/net/ipsec.c
index 9a8229abfa026..be4a30a0d02ae 100644
--- a/tools/testing/selftests/net/ipsec.c
+++ b/tools/testing/selftests/net/ipsec.c
@@ -2263,7 +2263,7 @@ static int check_results(void)
 
 int main(int argc, char **argv)
 {
-   unsigned int nr_process = 1;
+   long nr_process = 1;
int route_sock = -1, ret = KSFT_SKIP;
int test_desc_fd[2];
uint32_t route_seq;
@@ -2284,7 +2284,7 @@ int main(int argc, char **argv)
exit_usage(argv);
}
 
-   if (nr_process > MAX_PROCESSES || !nr_process) {
+   if (nr_process > MAX_PROCESSES || nr_process < 1) {
printk("nr_process should be between [1; %u]",
MAX_PROCESSES);
exit_usage(argv);
-- 
2.43.0.rc1.413.gea7ed67945-goog

[PATCH net 0/4] selftests/net: fix a few small compiler warnings

2023-11-24 Thread Willem de Bruijn

From: Willem de Bruijn 

Observed a clang warning when backporting cmsg_sender.
Ran the same build against all the .c files under selftests/net.

This is clang-14 with -Wall
Which is what tools/testing/selftests/net/Makefile also enables.

Willem de Bruijn (4):
  selftests/net: ipsec: fix constant out of range
  selftests/net: fix a char signedness issue
  selftests/net: unix: fix unused variable compiler warning
  selftests/net: mptcp: fix uninitialized variable warnings

 tools/testing/selftests/net/af_unix/diag_uid.c|  1 -
 tools/testing/selftests/net/cmsg_sender.c |  2 +-
 tools/testing/selftests/net/ipsec.c   |  4 ++--
 tools/testing/selftests/net/mptcp/mptcp_connect.c | 11 ---
 tools/testing/selftests/net/mptcp/mptcp_inq.c | 11 ---
 5 files changed, 11 insertions(+), 18 deletions(-)

-- 
2.43.0.rc1.413.gea7ed67945-goog

Re: [PATCH v3 19/25] kselftest/arm64: move get_header()

2023-11-24 Thread Mark Brown

On Fri, Nov 24, 2023 at 04:35:04PM +, Joey Gouly wrote:
> Put this function in the header so that it can be used by other tests, without
> needing to link to testcases.c.

It would've been good to explain a bit more of the context here but

Reviewed-by: Mark Brown 



Re: [PATCH v3 25/25] KVM: selftests: get-reg-list: add Permission Overlay registers

2023-11-24 Thread Mark Brown

On Fri, Nov 24, 2023 at 04:35:10PM +, Joey Gouly wrote:
> Add new system registers:
>   - POR_EL1
>   - POR_EL0

Reviewed-by: Mark Brown 



Re: [PATCH v3 24/25] kselftest/arm64: Add test case for POR_EL0 signal frame records

2023-11-24 Thread Mark Brown

On Fri, Nov 24, 2023 at 04:35:09PM +, Joey Gouly wrote:

> +++ b/tools/testing/selftests/arm64/signal/.gitignore
> @@ -5,6 +5,7 @@ sme_*
>  ssve_*
>  sve_*
>  tpidr2_*
> +poe_siginfo
>  za_*
>  zt_*
>  !*.[ch]

Please keep this sorted, otherwise

Reviewed-by: Mark Brown 



Re: [PATCH v3 22/25] kselftest/arm64: add HWCAP test for FEAT_S1POE

2023-11-24 Thread Mark Brown

On Fri, Nov 24, 2023 at 04:35:07PM +, Joey Gouly wrote:
> Check that when POE is enabled, the POR_EL0 register is accessible.

Reviewed-by: Mark Brown 



[PATCH v3 25/25] KVM: selftests: get-reg-list: add Permission Overlay registers

2023-11-24 Thread Joey Gouly

Add new system registers:
  - POR_EL1
  - POR_EL0

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Marc Zyngier 
Cc: Oliver Upton 
Cc: Shuah Khan 
---
 tools/testing/selftests/kvm/aarch64/get-reg-list.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/tools/testing/selftests/kvm/aarch64/get-reg-list.c 
b/tools/testing/selftests/kvm/aarch64/get-reg-list.c
index 709d7d721760..ac661ebf6859 100644
--- a/tools/testing/selftests/kvm/aarch64/get-reg-list.c
+++ b/tools/testing/selftests/kvm/aarch64/get-reg-list.c
@@ -40,6 +40,18 @@ static struct feature_id_reg feat_id_regs[] = {
ARM64_SYS_REG(3, 0, 0, 7, 3),   /* ID_AA64MMFR3_EL1 */
4,
1
+   },
+   {
+   ARM64_SYS_REG(3, 0, 10, 2, 4),  /* POR_EL1 */
+   ARM64_SYS_REG(3, 0, 0, 7, 3),   /* ID_AA64MMFR3_EL1 */
+   16,
+   1
+   },
+   {
+   ARM64_SYS_REG(3, 3, 10, 2, 4),  /* POR_EL0 */
+   ARM64_SYS_REG(3, 0, 0, 7, 3),   /* ID_AA64MMFR3_EL1 */
+   16,
+   1
}
 };
 
@@ -468,6 +480,7 @@ static __u64 base_regs[] = {
ARM64_SYS_REG(3, 0, 10, 2, 0),  /* MAIR_EL1 */
ARM64_SYS_REG(3, 0, 10, 2, 2),  /* PIRE0_EL1 */
ARM64_SYS_REG(3, 0, 10, 2, 3),  /* PIR_EL1 */
+   ARM64_SYS_REG(3, 0, 10, 2, 4),  /* POR_EL1 */
ARM64_SYS_REG(3, 0, 10, 3, 0),  /* AMAIR_EL1 */
ARM64_SYS_REG(3, 0, 12, 0, 0),  /* VBAR_EL1 */
ARM64_SYS_REG(3, 0, 12, 1, 1),  /* DISR_EL1 */
@@ -475,6 +488,7 @@ static __u64 base_regs[] = {
ARM64_SYS_REG(3, 0, 13, 0, 4),  /* TPIDR_EL1 */
ARM64_SYS_REG(3, 0, 14, 1, 0),  /* CNTKCTL_EL1 */
ARM64_SYS_REG(3, 2, 0, 0, 0),   /* CSSELR_EL1 */
+   ARM64_SYS_REG(3, 3, 10, 2, 4),  /* POR_EL0 */
ARM64_SYS_REG(3, 3, 13, 0, 2),  /* TPIDR_EL0 */
ARM64_SYS_REG(3, 3, 13, 0, 3),  /* TPIDRRO_EL0 */
ARM64_SYS_REG(3, 3, 14, 0, 1),  /* CNTPCT_EL0 */
-- 
2.25.1

[PATCH v3 23/25] kselftest/arm64: parse POE_MAGIC in a signal frame

2023-11-24 Thread Joey Gouly

Teach the signal frame parsing about the new POE frame, which avoids a
warning when it is generated.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Mark Brown 
Cc: Shuah Khan 
Reviewed-by: Mark Brown 
---
 tools/testing/selftests/arm64/signal/testcases/testcases.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/testing/selftests/arm64/signal/testcases/testcases.c 
b/tools/testing/selftests/arm64/signal/testcases/testcases.c
index fe950b6bca6b..5dda753870aa 100644
--- a/tools/testing/selftests/arm64/signal/testcases/testcases.c
+++ b/tools/testing/selftests/arm64/signal/testcases/testcases.c
@@ -161,6 +161,10 @@ bool validate_reserved(ucontext_t *uc, size_t resv_sz, 
char **err)
if (head->size != sizeof(struct esr_context))
*err = "Bad size for esr_context";
break;
+   case POE_MAGIC:
+   if (head->size != sizeof(struct poe_context))
+   *err = "Bad size for poe_context";
+   break;
case TPIDR2_MAGIC:
if (head->size != sizeof(struct tpidr2_context))
*err = "Bad size for tpidr2_context";
-- 
2.25.1

[PATCH v3 21/25] selftests: mm: make protection_keys test work on arm64

2023-11-24 Thread Joey Gouly

The encoding of the pkey register on arm64 differs from that on x86/ppc. On
those platforms, a set bit in the register disables a permission; on arm64,
a set bit in the register indicates that the permission is allowed.

This drops two asserts of the form:
 assert(read_pkey_reg() <= orig_pkey_reg);
Because on arm64 this doesn't hold, due to the encoding.
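
The encoding difference can be sketched as follows. This is only an
illustration, not code from the series: por_set_perm() and pkru_set_disable()
are hypothetical helper names, the arm64 side assumes the 4-bit-per-key layout
and POE_* values defined in pkey-arm64.h, and the x86 side assumes the usual
PKRU layout of two bits per key (access-disable, write-disable).

```c
#include <stdint.h>

/* Permission values as defined in pkey-arm64.h in this patch. */
#define POE_NONE 0x0u
#define POE_X    0x2u
#define POE_RX   0x3u
#define POE_RWX  0x7u

/* arm64 (POR_EL0): 4 bits per key, a set bit grants a permission. */
static inline uint64_t por_set_perm(uint64_t por, int pkey, uint64_t perm)
{
	unsigned int shift = (unsigned int)pkey * 4;

	por &= ~(UINT64_C(0xf) << shift);       /* clear the key's old field */
	por |= (perm & UINT64_C(0xf)) << shift; /* install the new permission */
	return por;
}

/* x86 (PKRU): 2 bits per key, a set bit removes a permission. */
#define X86_PKEY_DISABLE_ACCESS 0x1u
#define X86_PKEY_DISABLE_WRITE  0x2u

static inline uint32_t pkru_set_disable(uint32_t pkru, int pkey, uint32_t flags)
{
	unsigned int shift = (unsigned int)pkey * 2;

	pkru &= ~(UINT32_C(0x3) << shift);        /* clear the key's old field */
	pkru |= (flags & UINT32_C(0x3)) << shift; /* install the new disable bits */
	return pkru;
}
```

Granting full access to pkey 1 clears bits on x86, so the register value can
only shrink, whereas on arm64 it sets bits and the value grows. That is why an
assert of the form "new value <= old value" cannot hold on both.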

The pkey must be reset to both access allow and write allow in the signal
handler. pkey_access_allow() currently works for PowerPC because
PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE have overlapping bits set.

Access to the uc_mcontext is abstracted, as arm64 has a different structure.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Andrew Morton 
Cc: Shuah Khan 
Cc: Dave Hansen 
Cc: Aneesh Kumar K.V 
Acked-by: Dave Hansen 
---
 .../arm64/signal/testcases/testcases.h|   3 +
 tools/testing/selftests/mm/Makefile   |   2 +-
 tools/testing/selftests/mm/pkey-arm64.h   | 139 ++
 tools/testing/selftests/mm/pkey-helpers.h |   8 +
 tools/testing/selftests/mm/pkey-powerpc.h |   2 +
 tools/testing/selftests/mm/pkey-x86.h |   2 +
 tools/testing/selftests/mm/protection_keys.c  | 103 +++--
 7 files changed, 247 insertions(+), 12 deletions(-)
 create mode 100644 tools/testing/selftests/mm/pkey-arm64.h

diff --git a/tools/testing/selftests/arm64/signal/testcases/testcases.h b/tools/testing/selftests/arm64/signal/testcases/testcases.h
index d33154c9a4bd..e445027d5ec2 100644
--- a/tools/testing/selftests/arm64/signal/testcases/testcases.h
+++ b/tools/testing/selftests/arm64/signal/testcases/testcases.h
@@ -25,6 +25,9 @@
 #define HDR_SZ \
sizeof(struct _aarch64_ctx)
 
+#define GET_UC_RESV_HEAD(uc) \
+   (struct _aarch64_ctx *)(&(uc->uc_mcontext.__reserved))
+
 #define GET_SF_RESV_HEAD(sf) \
(struct _aarch64_ctx *)(&(sf).uc.uc_mcontext.__reserved)
 
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 78dfec8bc676..33922ae4bb6e 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -97,7 +97,7 @@ TEST_GEN_FILES += $(BINARIES_64)
 endif
 else
 
-ifneq (,$(findstring $(ARCH),ppc64))
+ifneq (,$(filter $(ARCH),arm64 ppc64))
 TEST_GEN_FILES += protection_keys
 endif
 
diff --git a/tools/testing/selftests/mm/pkey-arm64.h b/tools/testing/selftests/mm/pkey-arm64.h
new file mode 100644
index ..2861564f6415
--- /dev/null
+++ b/tools/testing/selftests/mm/pkey-arm64.h
@@ -0,0 +1,139 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2023 Arm Ltd.
+*/
+
+#ifndef _PKEYS_ARM64_H
+#define _PKEYS_ARM64_H
+
+#include "vm_util.h"
+/* for signal frame parsing */
+#include "../arm64/signal/testcases/testcases.h"
+
+#ifndef SYS_mprotect_key
+# define SYS_mprotect_key  288
+#endif
+#ifndef SYS_pkey_alloc
+# define SYS_pkey_alloc 289
+# define SYS_pkey_free 290
+#endif
+#define MCONTEXT_IP(mc) mc.pc
+#define MCONTEXT_TRAPNO(mc) -1
+
+#define PKEY_MASK  0xf
+
+#define POE_NONE   0x0
+#define POE_X  0x2
+#define POE_RX 0x3
+#define POE_RWX 0x7
+
+#define NR_PKEYS   7
+#define NR_RESERVED_PKEYS  1 /* pkey-0 */
+
+#define PKEY_ALLOW_ALL 0x
+
+#define PKEY_BITS_PER_PKEY 4
+#define PAGE_SIZE  sysconf(_SC_PAGESIZE)
+#undef HPAGE_SIZE
+#define HPAGE_SIZE default_huge_page_size()
+
+/* 4-byte instructions * 16384 = 64K page */
+#define __page_o_noops() asm(".rept 16384 ; nop; .endr")
+
+static inline u64 __read_pkey_reg(void)
+{
+   u64 pkey_reg = 0;
+
+   // POR_EL0
+   asm volatile("mrs %0, S3_3_c10_c2_4" : "=r" (pkey_reg));
+
+   return pkey_reg;
+}
+
+static inline void __write_pkey_reg(u64 pkey_reg)
+{
+   u64 por = pkey_reg;
+
+   dprintf4("%s() changing %016llx to %016llx\n",
+__func__, __read_pkey_reg(), pkey_reg);
+
+   // POR_EL0
+   asm volatile("msr S3_3_c10_c2_4, %0\nisb" :: "r" (por) :);
+
+   dprintf4("%s() pkey register after changing %016llx to %016llx\n",
+   __func__, __read_pkey_reg(), pkey_reg);
+}
+
+static inline int cpu_has_pkeys(void)
+{
+   /* No simple way to determine this */
+   return 1;
+}
+
+static inline u32 pkey_bit_position(int pkey)
+{
+   return pkey * PKEY_BITS_PER_PKEY;
+}
+
+static inline int get_arch_reserved_keys(void)
+{
+   return NR_RESERVED_PKEYS;
+}
+
+void expect_fault_on_read_execonly_key(void *p1, int pkey)
+{
+}
+
+void *malloc_pkey_with_mprotect_subpage(long size, int prot, u16 pkey)
+{
+   return PTR_ERR_ENOTSUP;
+}
+
+#define set_pkey_bits  set_pkey_bits
+static inline u64 set_pkey_bits(u64 reg, int pkey, u64 flags)
+{
+   u32 shift = pkey_bit_position(pkey);
+   u64 new_val = POE_RWX;
+
+   /* mask out bits from pkey in old value */
+

[PATCH v3 24/25] kselftest/arm64: Add test case for POR_EL0 signal frame records

2023-11-24 Thread Joey Gouly

Ensure that we get signal context for POR_EL0 if and only if POE is present
on the system.

Copied from the TPIDR2 test.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Mark Brown 
Cc: Shuah Khan 
---
 .../testing/selftests/arm64/signal/.gitignore |  1 +
 .../arm64/signal/testcases/poe_siginfo.c  | 86 +++
 2 files changed, 87 insertions(+)
 create mode 100644 tools/testing/selftests/arm64/signal/testcases/poe_siginfo.c

diff --git a/tools/testing/selftests/arm64/signal/.gitignore b/tools/testing/selftests/arm64/signal/.gitignore
index 839e3a252629..6bcb27bd506b 100644
--- a/tools/testing/selftests/arm64/signal/.gitignore
+++ b/tools/testing/selftests/arm64/signal/.gitignore
@@ -5,6 +5,7 @@ sme_*
 ssve_*
 sve_*
 tpidr2_*
+poe_siginfo
 za_*
 zt_*
 !*.[ch]
diff --git a/tools/testing/selftests/arm64/signal/testcases/poe_siginfo.c b/tools/testing/selftests/arm64/signal/testcases/poe_siginfo.c
new file mode 100644
index ..d890029304c4
--- /dev/null
+++ b/tools/testing/selftests/arm64/signal/testcases/poe_siginfo.c
@@ -0,0 +1,86 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2023 Arm Limited
+ *
+ * Verify that the POR_EL0 register context in signal frames is set up as
+ * expected.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "test_signals_utils.h"
+#include "testcases.h"
+
+static union {
+   ucontext_t uc;
+   char buf[1024 * 128];
+} context;
+
+#define SYS_POR_EL0 "S3_3_C10_C2_4"
+
+static uint64_t get_por_el0(void)
+{
+   uint64_t val;
+
+   asm volatile (
+   "mrs %0, " SYS_POR_EL0 "\n"
+   : "=r"(val)
+   :
+   : "cc");
+
+   return val;
+}
+
+int poe_present(struct tdescr *td, siginfo_t *si, ucontext_t *uc)
+{
+   struct _aarch64_ctx *head = GET_BUF_RESV_HEAD(context);
+   struct poe_context *poe_ctx;
+   size_t offset;
+   bool in_sigframe;
+   bool have_poe;
+   __u64 orig_poe;
+
+   have_poe = getauxval(AT_HWCAP2) & HWCAP2_POE;
+   if (have_poe)
+   orig_poe = get_por_el0();
+
+   if (!get_current_context(td, &context.uc, sizeof(context)))
+   return 1;
+
+   poe_ctx = (struct poe_context *)
+   get_header(head, POE_MAGIC, td->live_sz, &offset);
+
+   in_sigframe = poe_ctx != NULL;
+
+   fprintf(stderr, "POR_EL0 sigframe %s on system %s POE\n",
+   in_sigframe ? "present" : "absent",
+   have_poe ? "with" : "without");
+
+   td->pass = (in_sigframe == have_poe);
+
+   /*
+* Check that the value we read back was the one present at
+* the time that the signal was triggered.
+*/
+   if (have_poe && poe_ctx) {
+   if (poe_ctx->por_el0 != orig_poe) {
+   fprintf(stderr, "POR_EL0 in frame is %llx, was %llx\n",
+   poe_ctx->por_el0, orig_poe);
+   td->pass = false;
+   }
+   }
+
+   return 0;
+}
+
+struct tdescr tde = {
+   .name = "POR_EL0",
+   .descr = "Validate that POR_EL0 is present as expected",
+   .timeout = 3,
+   .run = poe_present,
+};
-- 
2.25.1

[PATCH v3 22/25] kselftest/arm64: add HWCAP test for FEAT_S1POE

2023-11-24 Thread Joey Gouly

Check that when POE is enabled, the POR_EL0 register is accessible.
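
From userspace, the same hwcap check might look like the sketch below.
hwcap2_has() is a hypothetical helper name; the caller is assumed to pass the
HWCAP2_POE bit from the uapi headers added by this series, whose value is not
shown here.

```c
#include <stdbool.h>
#include <sys/auxv.h>

/*
 * Return true if every bit in mask is set in AT_HWCAP2.
 * An empty mask is treated as "not present" rather than trivially true.
 */
static bool hwcap2_has(unsigned long mask)
{
	return mask != 0 && (getauxval(AT_HWCAP2) & mask) == mask;
}
```

A test would call hwcap2_has(HWCAP2_POE) and only touch POR_EL0 when it
returns true, since the register access is expected to raise SIGILL otherwise.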

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Mark Brown 
Cc: Shuah Khan 
---
 tools/testing/selftests/arm64/abi/hwcap.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/tools/testing/selftests/arm64/abi/hwcap.c b/tools/testing/selftests/arm64/abi/hwcap.c
index 1189e77c8152..9ee7b04f3fbb 100644
--- a/tools/testing/selftests/arm64/abi/hwcap.c
+++ b/tools/testing/selftests/arm64/abi/hwcap.c
@@ -115,6 +115,12 @@ static void pmull_sigill(void)
asm volatile(".inst 0x0ee0e000" : : : );
 }
 
+static void poe_sigill(void)
+{
+   /* mrs x0, POR_EL0 */
+   asm volatile("mrs x0, S3_3_C10_C2_4" : : : "x0");
+}
+
 static void rng_sigill(void)
 {
asm volatile("mrs x0, S3_3_C2_C4_0" : : : "x0");
@@ -426,6 +432,14 @@ static const struct hwcap_data {
.cpuinfo = "pmull",
.sigill_fn = pmull_sigill,
},
+   {
+   .name = "POE",
+   .at_hwcap = AT_HWCAP2,
+   .hwcap_bit = HWCAP2_POE,
+   .cpuinfo = "poe",
+   .sigill_fn = poe_sigill,
+   .sigill_reliable = true,
+   },
{
.name = "RNG",
.at_hwcap = AT_HWCAP2,
-- 
2.25.1

[PATCH v3 20/25] selftests: mm: move fpregs printing

2023-11-24 Thread Joey Gouly

arm64's fpregs are not at a constant offset from sigcontext. Since this is
not an important part of the test, don't print the fpregs pointer on arm64.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Andrew Morton 
Cc: Shuah Khan 
Cc: Dave Hansen 
Cc: Aneesh Kumar K.V 
Acked-by: Dave Hansen 
---
 tools/testing/selftests/mm/pkey-powerpc.h| 1 +
 tools/testing/selftests/mm/pkey-x86.h| 2 ++
 tools/testing/selftests/mm/protection_keys.c | 6 ++
 3 files changed, 9 insertions(+)

diff --git a/tools/testing/selftests/mm/pkey-powerpc.h b/tools/testing/selftests/mm/pkey-powerpc.h
index ae5df26104e5..6275d0f474b3 100644
--- a/tools/testing/selftests/mm/pkey-powerpc.h
+++ b/tools/testing/selftests/mm/pkey-powerpc.h
@@ -9,6 +9,7 @@
 #endif
 #define REG_IP_IDX PT_NIP
 #define REG_TRAPNO PT_TRAP
+#define MCONTEXT_FPREGS
 #define gregs  gp_regs
 #define fpregs fp_regs
 #define si_pkey_offset 0x20
diff --git a/tools/testing/selftests/mm/pkey-x86.h 
b/tools/testing/selftests/mm/pkey-x86.h
index 814758e109c0..b9170a26bfcb 100644
--- a/tools/testing/selftests/mm/pkey-x86.h
+++ b/tools/testing/selftests/mm/pkey-x86.h
@@ -15,6 +15,8 @@
 
 #endif
 
+#define MCONTEXT_FPREGS
+
 #ifndef PKEY_DISABLE_ACCESS
 # define PKEY_DISABLE_ACCESS   0x1
 #endif
diff --git a/tools/testing/selftests/mm/protection_keys.c 
b/tools/testing/selftests/mm/protection_keys.c
index 48dc151f8fca..b3dbd76ea27c 100644
--- a/tools/testing/selftests/mm/protection_keys.c
+++ b/tools/testing/selftests/mm/protection_keys.c
@@ -314,7 +314,9 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
ucontext_t *uctxt = vucontext;
int trapno;
unsigned long ip;
+#ifdef MCONTEXT_FPREGS
char *fpregs;
+#endif
 #if defined(__i386__) || defined(__x86_64__) /* arch */
u32 *pkey_reg_ptr;
int pkey_reg_offset;
@@ -330,7 +332,9 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 
trapno = uctxt->uc_mcontext.gregs[REG_TRAPNO];
ip = uctxt->uc_mcontext.gregs[REG_IP_IDX];
+#ifdef MCONTEXT_FPREGS
fpregs = (char *) uctxt->uc_mcontext.fpregs;
+#endif
 
dprintf2("%s() trapno: %d ip: 0x%016lx info->si_code: %s/%d\n",
__func__, trapno, ip, si_code_str(si->si_code),
@@ -359,7 +363,9 @@ void signal_handler(int signum, siginfo_t *si, void *vucontext)
 #endif /* arch */
 
dprintf1("siginfo: %p\n", si);
+#ifdef MCONTEXT_FPREGS
dprintf1(" fpregs: %p\n", fpregs);
+#endif
 
if ((si->si_code == SEGV_MAPERR) ||
(si->si_code == SEGV_ACCERR) ||
-- 
2.25.1

[PATCH v3 19/25] kselftest/arm64: move get_header()

2023-11-24 Thread Joey Gouly

Put this function in the header so that it can be used by other tests, without
needing to link to testcases.c.
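
For readers unfamiliar with the record walk that get_header() performs, here
is a standalone sketch of the same idea over a plain buffer. It is not the
selftest code: struct rec_hdr and find_record() are hypothetical stand-ins for
struct _aarch64_ctx and get_header(), and this variant checks bounds before
every read, which is stricter than the function being moved.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct _aarch64_ctx: a magic plus the record size. */
struct rec_hdr {
	uint32_t magic;
	uint32_t size;
};

/*
 * Return the first record whose magic matches, or NULL if none is found.
 * The walk gives up at the terminator (magic == 0), at a malformed size,
 * or when the next header would not fit in the buffer.
 */
static const struct rec_hdr *find_record(const void *buf, size_t buf_sz,
					 uint32_t magic, size_t *offset)
{
	size_t offs = 0;

	while (buf && buf_sz >= sizeof(struct rec_hdr) &&
	       offs <= buf_sz - sizeof(struct rec_hdr)) {
		const struct rec_hdr *h =
			(const struct rec_hdr *)((const char *)buf + offs);

		if (h->magic == magic) {
			if (offset)
				*offset = offs;
			return h;
		}
		/* Stop on the terminator or on a size that cannot be valid. */
		if (h->magic == 0 || h->size < sizeof(*h) ||
		    h->size > buf_sz - offs)
			break;
		offs += h->size;
	}
	return NULL;
}
```

Keeping such a helper static inline in a header is what lets other tests use
it without linking against testcases.c.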

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Andrew Morton 
Cc: Shuah Khan 
Cc: Dave Hansen 
Cc: Aneesh Kumar K.V 
---
 .../arm64/signal/testcases/testcases.c| 23 -
 .../arm64/signal/testcases/testcases.h| 25 +--
 2 files changed, 23 insertions(+), 25 deletions(-)

diff --git a/tools/testing/selftests/arm64/signal/testcases/testcases.c b/tools/testing/selftests/arm64/signal/testcases/testcases.c
index 9f580b55b388..fe950b6bca6b 100644
--- a/tools/testing/selftests/arm64/signal/testcases/testcases.c
+++ b/tools/testing/selftests/arm64/signal/testcases/testcases.c
@@ -6,29 +6,6 @@
 
 #include "testcases.h"
 
-struct _aarch64_ctx *get_header(struct _aarch64_ctx *head, uint32_t magic,
-   size_t resv_sz, size_t *offset)
-{
-   size_t offs = 0;
-   struct _aarch64_ctx *found = NULL;
-
-   if (!head || resv_sz < HDR_SZ)
-   return found;
-
-   while (offs <= resv_sz - HDR_SZ &&
-  head->magic != magic && head->magic) {
-   offs += head->size;
-   head = GET_RESV_NEXT_HEAD(head);
-   }
-   if (head->magic == magic) {
-   found = head;
-   if (offset)
-   *offset = offs;
-   }
-
-   return found;
-}
-
 bool validate_extra_context(struct extra_context *extra, char **err,
void **extra_data, size_t *extra_size)
 {
diff --git a/tools/testing/selftests/arm64/signal/testcases/testcases.h b/tools/testing/selftests/arm64/signal/testcases/testcases.h
index a08ab0d6207a..d33154c9a4bd 100644
--- a/tools/testing/selftests/arm64/signal/testcases/testcases.h
+++ b/tools/testing/selftests/arm64/signal/testcases/testcases.h
@@ -87,8 +87,29 @@ struct fake_sigframe {
 
 bool validate_reserved(ucontext_t *uc, size_t resv_sz, char **err);
 
-struct _aarch64_ctx *get_header(struct _aarch64_ctx *head, uint32_t magic,
-   size_t resv_sz, size_t *offset);
+static inline struct _aarch64_ctx *get_header(struct _aarch64_ctx *head, uint32_t magic,
+   size_t resv_sz, size_t *offset)
+{
+   size_t offs = 0;
+   struct _aarch64_ctx *found = NULL;
+
+   if (!head || resv_sz < HDR_SZ)
+   return found;
+
+   while (offs <= resv_sz - HDR_SZ &&
+  head->magic != magic && head->magic) {
+   offs += head->size;
+   head = GET_RESV_NEXT_HEAD(head);
+   }
+   if (head->magic == magic) {
+   found = head;
+   if (offset)
+   *offset = offs;
+   }
+
+   return found;
+}
+
 
 static inline struct _aarch64_ctx *get_terminator(struct _aarch64_ctx *head,
  size_t resv_sz,
-- 
2.25.1

[PATCH v3 16/25] arm64: enable PKEY support for CPUs with S1POE

2023-11-24 Thread Joey Gouly

Now that PKEYs support has been implemented, enable it for CPUs that
support S1POE.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/include/asm/pkeys.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/pkeys.h b/arch/arm64/include/asm/pkeys.h
index a80c654da93d..23c473300058 100644
--- a/arch/arm64/include/asm/pkeys.h
+++ b/arch/arm64/include/asm/pkeys.h
@@ -17,7 +17,7 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 
 static inline bool arch_pkeys_enabled(void)
 {
-   return false;
+   return system_supports_poe();
 }
 
 static inline int vma_pkey(struct vm_area_struct *vma)
-- 
2.25.1

[PATCH v3 18/25] arm64/ptrace: add support for FEAT_POE

2023-11-24 Thread Joey Gouly

Add a regset for POE containing POR_EL0.
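
A tracer could read the new regset with PTRACE_GETREGSET. The following is a
minimal userspace sketch under two assumptions: NT_ARM_POE keeps the 0x40f
value proposed in this patch, and the regset is a single 64-bit value.
read_por_el0() is a hypothetical helper name, and the target must already be
attached and stopped for the call to succeed.

```c
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>

#ifndef NT_ARM_POE
#define NT_ARM_POE 0x40f	/* value proposed by this patch (assumption) */
#endif

/* Read POR_EL0 of a traced, stopped task. Returns 0 on success, -1 on error. */
static int read_por_el0(pid_t pid, uint64_t *val)
{
	struct iovec iov = { .iov_base = val, .iov_len = sizeof(*val) };

	if (ptrace(PTRACE_GETREGSET, pid, (void *)(uintptr_t)NT_ARM_POE, &iov) == -1)
		return -1;

	/* The kernel updates iov_len to the number of bytes it wrote. */
	return iov.iov_len == sizeof(*val) ? 0 : -1;
}
```

On a kernel or CPU without POE the call is expected to fail, matching the
-EINVAL returned by poe_get() when system_supports_poe() is false.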

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/kernel/ptrace.c | 46 ++
 include/uapi/linux/elf.h   |  1 +
 2 files changed, 47 insertions(+)

diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 20d7ef82de90..c3257a5c97f1 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1409,6 +1409,39 @@ static int tagged_addr_ctrl_set(struct task_struct *target, const struct
 }
 #endif
 
+#ifdef CONFIG_ARM64_POE
+static int poe_get(struct task_struct *target,
+  const struct user_regset *regset,
+  struct membuf to)
+{
+   if (!system_supports_poe())
+   return -EINVAL;
+
+   return membuf_write(&to, &target->thread.por_el0,
+   sizeof(target->thread.por_el0));
+}
+
+static int poe_set(struct task_struct *target, const struct
+  user_regset *regset, unsigned int pos,
+  unsigned int count, const void *kbuf, const
+  void __user *ubuf)
+{
+   int ret;
+   long ctrl;
+
+   if (!system_supports_poe())
+   return -EINVAL;
+
+   ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &ctrl, 0, -1);
+   if (ret)
+   return ret;
+
+   target->thread.por_el0 = ctrl;
+
+   return 0;
+}
+#endif
+
 enum aarch64_regset {
REGSET_GPR,
REGSET_FPR,
@@ -1437,6 +1470,9 @@ enum aarch64_regset {
 #ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
REGSET_TAGGED_ADDR_CTRL,
 #endif
+#ifdef CONFIG_ARM64_POE
+   REGSET_POE
+#endif
 };
 
 static const struct user_regset aarch64_regsets[] = {
@@ -1587,6 +1623,16 @@ static const struct user_regset aarch64_regsets[] = {
.set = tagged_addr_ctrl_set,
},
 #endif
+#ifdef CONFIG_ARM64_POE
+   [REGSET_POE] = {
+   .core_note_type = NT_ARM_POE,
+   .n = 1,
+   .size = sizeof(long),
+   .align = sizeof(long),
+   .regset_get = poe_get,
+   .set = poe_set,
+   },
+#endif
 };
 
 static const struct user_regset_view user_aarch64_view = {
diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
index 9417309b7230..f2713efcd81b 100644
--- a/include/uapi/linux/elf.h
+++ b/include/uapi/linux/elf.h
@@ -440,6 +440,7 @@ typedef struct elf64_shdr {
 #define NT_ARM_SSVE 0x40b   /* ARM Streaming SVE registers */
 #define NT_ARM_ZA  0x40c   /* ARM SME ZA registers */
 #define NT_ARM_ZT  0x40d   /* ARM SME ZT registers */
+#define NT_ARM_POE 0x40f   /* ARM POE registers */
 #define NT_ARC_V2  0x600   /* ARCv2 accumulator/extra registers */
 #define NT_VMCOREDD 0x700   /* Vmcore Device Dump Note */
 #define NT_MIPS_DSP 0x800   /* MIPS DSP ASE registers */
-- 
2.25.1

[PATCH v3 15/25] arm64: add POE signal support

2023-11-24 Thread Joey Gouly

Add PKEY support to signals, by saving and restoring POR_EL0 from the
stack frame.
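
As a small aid for reading the frame in a debugger, the record layout and its
magic can be sketched as below. struct hdr and struct poe_record are
hypothetical userspace mirrors of struct _aarch64_ctx and struct poe_context,
and magic_to_str() is an illustrative helper, not part of the patch.

```c
#include <stdint.h>

#define POE_MAGIC 0x504f4530u	/* value from this patch */

/* Hypothetical mirrors of the uapi structures, for illustration only. */
struct hdr {
	uint32_t magic;
	uint32_t size;
};

struct poe_record {
	struct hdr head;
	uint64_t por_el0;
};

/* Write the four bytes of a magic, most significant first, into out[5]. */
static void magic_to_str(uint32_t magic, char out[5])
{
	for (int i = 0; i < 4; i++)
		out[i] = (char)((magic >> (24 - 8 * i)) & 0xff);
	out[4] = '\0';
}
```

Decoded this way, POE_MAGIC reads as the ASCII string "POE0", and the record
occupies 16 bytes: an 8-byte header followed by the 64-bit register value.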

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
Reviewed-by: Mark Brown 
---
 arch/arm64/include/uapi/asm/sigcontext.h |  7 
 arch/arm64/kernel/signal.c   | 51 
 2 files changed, 58 insertions(+)

diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h
index f23c1dc3f002..cef85eeaf541 100644
--- a/arch/arm64/include/uapi/asm/sigcontext.h
+++ b/arch/arm64/include/uapi/asm/sigcontext.h
@@ -98,6 +98,13 @@ struct esr_context {
__u64 esr;
 };
 
+#define POE_MAGIC  0x504f4530
+
+struct poe_context {
+   struct _aarch64_ctx head;
+   __u64 por_el0;
+};
+
 /*
  * extra_context: describes extra space in the signal frame for
  * additional structures that don't fit in sigcontext.__reserved[].
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 0e8beb3349ea..379f364005bf 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -62,6 +62,7 @@ struct rt_sigframe_user_layout {
unsigned long zt_offset;
unsigned long extra_offset;
unsigned long end_offset;
+   unsigned long poe_offset;
 };
 
 #define BASE_SIGFRAME_SIZE round_up(sizeof(struct rt_sigframe), 16)
@@ -182,6 +183,8 @@ struct user_ctxs {
u32 za_size;
struct zt_context __user *zt;
u32 zt_size;
+   struct poe_context __user *poe;
+   u32 poe_size;
 };
 
 static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
@@ -227,6 +230,20 @@ static int restore_fpsimd_context(struct user_ctxs *user)
return err ? -EFAULT : 0;
 }
 
+static int restore_poe_context(struct user_ctxs *user)
+{
+   u64 por_el0;
+   int err = 0;
+
+   if (user->poe_size != sizeof(*user->poe))
+   return -EINVAL;
+
+   __get_user_error(por_el0, &(user->poe->por_el0), err);
+   if (!err)
+   write_sysreg_s(por_el0, SYS_POR_EL0);
+
+   return err;
+}
 
 #ifdef CONFIG_ARM64_SVE
 
@@ -590,6 +607,7 @@ static int parse_user_sigframe(struct user_ctxs *user,
user->tpidr2 = NULL;
user->za = NULL;
user->zt = NULL;
+   user->poe = NULL;
 
if (!IS_ALIGNED((unsigned long)base, 16))
goto invalid;
@@ -640,6 +658,17 @@ static int parse_user_sigframe(struct user_ctxs *user,
/* ignore */
break;
 
+   case POE_MAGIC:
+   if (!system_supports_poe())
+   goto invalid;
+
+   if (user->poe)
+   goto invalid;
+
+   user->poe = (struct poe_context __user *)head;
+   user->poe_size = size;
+   break;
+
case SVE_MAGIC:
if (!system_supports_sve() && !system_supports_sme())
goto invalid;
@@ -812,6 +841,9 @@ static int restore_sigframe(struct pt_regs *regs,
if (err == 0 && system_supports_sme2() && user.zt)
err = restore_zt_context(&user);
 
+   if (err == 0 && system_supports_poe() && user.poe)
+   err = restore_poe_context(&user);
+
return err;
 }
 
@@ -928,6 +960,13 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
}
}
 
+   if (system_supports_poe()) {
+   err = sigframe_alloc(user, &user->poe_offset,
+sizeof(struct poe_context));
+   if (err)
+   return err;
+   }
+
return sigframe_alloc_end(user);
 }
 
@@ -968,6 +1007,15 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
}
 
+   if (system_supports_poe() && err == 0 && user->poe_offset) {
+   struct poe_context __user *poe_ctx =
+   apply_user_offset(user, user->poe_offset);
+
+   __put_user_error(POE_MAGIC, &poe_ctx->head.magic, err);
+   __put_user_error(sizeof(*poe_ctx), &poe_ctx->head.size, err);
+   __put_user_error(read_sysreg_s(SYS_POR_EL0), &poe_ctx->por_el0, err);
+   }
+
/* Scalable Vector Extension state (including streaming), if present */
if ((system_supports_sve() || system_supports_sme()) &&
err == 0 && user->sve_offset) {
@@ -1119,6 +1167,9 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
sme_smstop();
}
 
+   if (system_supports_poe())
+   write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
+
if (ka->sa.sa_flags & SA_RESTORER)
sigtramp = ka->sa.sa_restorer;
else
-- 
2.25.1

[PATCH v3 17/25] arm64: enable POE and PIE to coexist

2023-11-24 Thread Joey Gouly

Set the EL0/userspace indirection encodings to be the overlay enabled
variants of the permissions.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/include/asm/pgtable-prot.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index e9624f6326dd..3007208e04aa 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -137,10 +137,10 @@ extern bool arm64_use_ng_mappings;
 
 #define PIE_E0 ( \
PIRx_ELx_PERM(pte_pi_index(_PAGE_EXECONLY),  PIE_X_O) | \
-   PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_RX)  | \
-   PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC),   PIE_RWX) | \
-   PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY),  PIE_R)   | \
-   PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED),PIE_RW))
+   PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_RX_O)  | \
+   PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC),   PIE_RWX_O) | \
+   PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY),  PIE_R_O)   | \
+   PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED),PIE_RW_O))
 
 #define PIE_E1 ( \
PIRx_ELx_PERM(pte_pi_index(_PAGE_EXECONLY),  PIE_NONE_O) | \
-- 
2.25.1

[PATCH v3 14/25] arm64: implement PKEYS support

2023-11-24 Thread Joey Gouly

Implement the PKEYS interface, using the Permission Overlay Extension.
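
The per-mm allocation bitmap introduced below (pkey_allocation_map, with pkey 0
always reserved) follows a common pattern that can be sketched in isolation.
This is not the kernel implementation, whose pkeys.h hunk is cut off in this
archive: the pkey_map_*() names are hypothetical, and MAX_PKEYS = 8 is an
assumption based on the map being a u8.

```c
#include <stdint.h>

#define MAX_PKEYS 8	/* assumption: one bit per key in a u8 map */

/* New address space: only pkey 0 (the default key) is marked allocated. */
static inline uint8_t pkey_map_init(void)
{
	return 0x1;
}

/* Allocate the lowest free key; return it, or -1 if none is left. */
static inline int pkey_map_alloc(uint8_t *map)
{
	for (int pkey = 0; pkey < MAX_PKEYS; pkey++) {
		if (!(*map & (1u << pkey))) {
			*map |= (uint8_t)(1u << pkey);
			return pkey;
		}
	}
	return -1;
}

/* Free a key; pkey 0 and keys that are not allocated are rejected with -1. */
static inline int pkey_map_free(uint8_t *map, int pkey)
{
	if (pkey <= 0 || pkey >= MAX_PKEYS || !(*map & (1u << pkey)))
		return -1;
	*map &= (uint8_t)~(1u << pkey);
	return 0;
}
```

Copying the map on fork, as arch_dup_pkeys() does, then amounts to a plain
assignment of the one-byte bitmap from the old mm to the new one.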

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/include/asm/mmu.h |  2 +
 arch/arm64/include/asm/mmu_context.h | 32 -
 arch/arm64/include/asm/pgtable.h | 23 +-
 arch/arm64/include/asm/pkeys.h   | 68 +---
 arch/arm64/include/asm/por.h | 33 ++
 arch/arm64/mm/mmu.c  | 35 +-
 6 files changed, 184 insertions(+), 9 deletions(-)
 create mode 100644 arch/arm64/include/asm/por.h

diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 2fcf51231d6e..55338b14b453 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -25,6 +25,8 @@ typedef struct {
refcount_t  pinned;
void *vdso;
unsigned long   flags;
+
+   u8  pkey_allocation_map;
 } mm_context_t;
 
 /*
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 16b48fe9353f..3fc739fb831c 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -211,11 +212,24 @@ init_new_context(struct task_struct *tsk, struct mm_struct *mm)
 {
atomic64_set(&mm->context.id, 0);
refcount_set(&mm->context.pinned, 0);
+
+   // pkey 0 is the default, so always reserve it.
+   mm->context.pkey_allocation_map = 0x1;
+
return 0;
 }
 
+static inline void arch_dup_pkeys(struct mm_struct *oldmm,
+ struct mm_struct *mm)
+{
+   /* Duplicate the oldmm pkey state in mm: */
+   mm->context.pkey_allocation_map = oldmm->context.pkey_allocation_map;
+}
+
 static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
 {
+   arch_dup_pkeys(oldmm, mm);
+
return 0;
 }
 
@@ -317,10 +331,26 @@ static inline unsigned long mm_untag_mask(struct mm_struct *mm)
return -1UL >> 8;
 }
 
+/*
+ * We only want to enforce protection keys on the current process
+ * because we effectively have no access to POR_EL0 for other
+ * processes or any way to tell *which* POR_EL0 in a threaded
+ * process we could use.
+ *
+ * So do not enforce things if the VMA is not from the current
+ * mm, or if we are in a kernel thread.
+ */
 static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
bool write, bool execute, bool foreign)
 {
-   return true;
+   if (!arch_pkeys_enabled())
+   return true;
+
+   /* allow access if the VMA is not one from this process */
+   if (foreign || vma_is_foreign(vma))
+   return true;
+
+   return por_el0_allows_pkey(vma_pkey(vma), write, execute);
 }
 
 #include 
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index e45105336ca0..789c88b138f5 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -30,6 +30,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -143,6 +144,24 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
 #define pte_accessible(mm, pte)\
(mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid(pte))
 
+static inline bool por_el0_allows_pkey(u8 pkey, bool write, bool execute)
+{
+   u64 por;
+
+   if (!system_supports_poe())
+   return true;
+
+   por = read_sysreg_s(SYS_POR_EL0);
+
+   if (write)
+   return por_elx_allows_write(por, pkey);
+
+   if (execute)
+   return por_elx_allows_exec(por, pkey);
+
+   return por_elx_allows_read(por, pkey);
+}
+
 /*
  * p??_access_permitted() is true for valid user mappings (PTE_USER
  * bit set, subject to the write permission check). For execute-only
@@ -151,7 +170,9 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
  * PTE_VALID bit set.
  */
 #define pte_access_permitted(pte, write) \
-   (((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER)) && (!(write) || pte_write(pte)))
+   (((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER)) && \
+(!(write) || pte_write(pte)) && \
+por_el0_allows_pkey(FIELD_GET(PTE_PO_IDX_MASK, pte_val(pte)), write, false))
 #define pmd_access_permitted(pmd, write) \
(pte_access_permitted(pmd_pte(pmd), (write)))
 #define pud_access_permitted(pud, write) \
diff --git a/arch/arm64/include/asm/pkeys.h b/arch/arm64/include/asm/pkeys.h
index 5761fb48fd53..a80c654da93d 100644
--- a/arch/arm64/include/asm/pkeys.h
+++ b/arch/arm64/include/asm/pkeys.h
@@ -10,7 +10,7 @@
 
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2)
 
-#define arch_max_pkey() 0
+#define arch_max_pkey() 7
 
 int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
unsigned long init_val);
@@ -22,33 +22,89 @@

[PATCH v3 13/25] arm64: stop using generic mm_hooks.h

2023-11-24 Thread Joey Gouly

These functions will be extended to support pkeys.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/include/asm/mmu_context.h | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 9ce4200508b1..16b48fe9353f 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -20,7 +20,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -215,6 +214,20 @@ init_new_context(struct task_struct *tsk, struct mm_struct *mm)
return 0;
 }
 
+static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
+{
+   return 0;
+}
+
+static inline void arch_exit_mmap(struct mm_struct *mm)
+{
+}
+
+static inline void arch_unmap(struct mm_struct *mm,
+   unsigned long start, unsigned long end)
+{
+}
+
 #ifdef CONFIG_ARM64_SW_TTBR0_PAN
 static inline void update_saved_ttbr0(struct task_struct *tsk,
  struct mm_struct *mm)
@@ -304,6 +317,12 @@ static inline unsigned long mm_untag_mask(struct mm_struct *mm)
return -1UL >> 8;
 }
 
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
+   bool write, bool execute, bool foreign)
+{
+   return true;
+}
+
 #include 
 
 #endif /* !__ASSEMBLY__ */
-- 
2.25.1

[PATCH v3 12/25] arm64: handle PKEY/POE faults

2023-11-24 Thread Joey Gouly

If a memory fault occurs that is due to an overlay/pkey fault, report that to
userspace with a SEGV_PKUERR.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/include/asm/traps.h |  1 +
 arch/arm64/kernel/traps.c  | 12 --
 arch/arm64/mm/fault.c  | 44 +++---
 3 files changed, 52 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
index eefe766d6161..f6f6f2cb7f10 100644
--- a/arch/arm64/include/asm/traps.h
+++ b/arch/arm64/include/asm/traps.h
@@ -25,6 +25,7 @@ try_emulate_armv8_deprecated(struct pt_regs *regs, u32 insn)
 void force_signal_inject(int signal, int code, unsigned long address, unsigned long err);
 void arm64_notify_segfault(unsigned long addr);
 void arm64_force_sig_fault(int signo, int code, unsigned long far, const char *str);
+void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far, const char *str, int pkey);
 void arm64_force_sig_mceerr(int code, unsigned long far, short lsb, const char *str);
 void arm64_force_sig_ptrace_errno_trap(int errno, unsigned long far, const char *str);
 
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 215e6d7f2df8..1bac6c84d3f5 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -263,16 +263,24 @@ static void arm64_show_signal(int signo, const char *str)
__show_regs(regs);
 }
 
-void arm64_force_sig_fault(int signo, int code, unsigned long far,
-  const char *str)
+void arm64_force_sig_fault_pkey(int signo, int code, unsigned long far,
+  const char *str, int pkey)
 {
arm64_show_signal(signo, str);
if (signo == SIGKILL)
force_sig(SIGKILL);
+   else if (code == SEGV_PKUERR)
+   force_sig_pkuerr((void __user *)far, pkey);
else
force_sig_fault(signo, code, (void __user *)far);
 }
 
+void arm64_force_sig_fault(int signo, int code, unsigned long far,
+  const char *str)
+{
+   arm64_force_sig_fault_pkey(signo, code, far, str, 0);
+}
+
 void arm64_force_sig_mceerr(int code, unsigned long far, short lsb,
const char *str)
 {
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 460d799e1296..efd263f56da7 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -497,6 +498,23 @@ static void do_bad_area(unsigned long far, unsigned long esr,
 #define VM_FAULT_BADMAP ((__force vm_fault_t)0x01)
 #define VM_FAULT_BADACCESS ((__force vm_fault_t)0x02)
 
+static bool fault_from_pkey(unsigned long esr, struct vm_area_struct *vma,
+   unsigned int mm_flags)
+{
+   unsigned long iss2 = ESR_ELx_ISS2(esr);
+
+   if (!arch_pkeys_enabled())
+   return false;
+
+   if (iss2 & ESR_ELx_Overlay)
+   return true;
+
+   return !arch_vma_access_permitted(vma,
+   mm_flags & FAULT_FLAG_WRITE,
+   mm_flags & FAULT_FLAG_INSTRUCTION,
+   mm_flags & FAULT_FLAG_REMOTE);
+}
+
 static vm_fault_t __do_page_fault(struct mm_struct *mm,
  struct vm_area_struct *vma, unsigned long addr,
  unsigned int mm_flags, unsigned long vm_flags,
@@ -688,9 +706,29 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 * Something tried to access memory that isn't in our memory
 * map.
 */
-   arm64_force_sig_fault(SIGSEGV,
- fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR,
- far, inf->name);
+   int fault_kind;
+   /*
+* The pkey value that we return to userspace can be different
+* from the pkey that caused the fault.
+*
+* 1. T1   : mprotect_key(foo, PAGE_SIZE, pkey=4);
+* 2. T1   : set POR_EL0 to deny access to pkey=4, touches page
+* 3. T1   : faults...
+* 4.T2: mprotect_key(foo, PAGE_SIZE, pkey=5);
+* 5. T1   : enters fault handler, takes mmap_lock, etc...
+* 6. T1   : reaches here, sees vma_pkey(vma)=5, when we really
+*   faulted on a pte with its pkey=4.
+*/
+   int pkey = vma_pkey(vma);
+
+   if (fault_from_pkey(esr, vma, mm_flags))
+   fault_kind = SEGV_PKUERR;
+   else
+   fault_kind = fault == VM_FAULT_BADACCESS ? SEGV_ACCERR : SEGV_MAPERR;
+
+   arm64_force_sig_fault_pkey(SIGSEGV,
+ fault_kind,
+
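
The classification performed by the fault handler above can be sketched in
plain C. This is a simplified model under stated assumptions, not the kernel
code: the four boolean parameters stand in for arch_pkeys_enabled(), the
ESR_ELx_Overlay bit, arch_vma_access_permitted() and the VM_FAULT_BADACCESS
check, and the enum and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for SEGV_MAPERR, SEGV_ACCERR and SEGV_PKUERR. */
enum segv_kind { KIND_MAPERR, KIND_ACCERR, KIND_PKUERR };

/*
 * Model of the decision in the patch: a fault is attributed to a protection
 * key when pkeys are enabled and either the hardware reported an overlay
 * fault or the VMA's key does not permit the access. Otherwise the existing
 * ACCERR/MAPERR choice applies.
 */
static inline enum segv_kind classify_segv(bool pkeys_enabled,
                                           bool overlay_bit,
                                           bool vma_access_permitted,
                                           bool bad_access)
{
	if (pkeys_enabled && (overlay_bit || !vma_access_permitted))
		return KIND_PKUERR;

	return bad_access ? KIND_ACCERR : KIND_MAPERR;
}
```

With pkeys disabled the overlay information is ignored entirely, which matches
the early return in fault_from_pkey().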

[PATCH v3 11/25] arm64: enable ARCH_HAS_PKEYS on arm64

2023-11-24 Thread Joey Gouly

Enable the ARCH_HAS_PKEYS config, but provide dummy
functions for the entire interface.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/Kconfig |  2 ++
 arch/arm64/include/asm/pkeys.h | 54 ++
 arch/arm64/mm/mmu.c|  7 +
 3 files changed, 63 insertions(+)
 create mode 100644 arch/arm64/include/asm/pkeys.h

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index d7df6c603190..72a71a9834dd 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2082,6 +2082,8 @@ menu "ARMv8.9 architectural features"
 config ARM64_POE
prompt "Permission Overlay Extension"
def_bool y
+   select ARCH_USES_HIGH_VMA_FLAGS
+   select ARCH_HAS_PKEYS
help
  The Permission Overlay Extension is used to implement Memory
  Protection Keys. Memory Protection Keys provides a mechanism for
diff --git a/arch/arm64/include/asm/pkeys.h b/arch/arm64/include/asm/pkeys.h
new file mode 100644
index ..5761fb48fd53
--- /dev/null
+++ b/arch/arm64/include/asm/pkeys.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2023 Arm Ltd.
+ *
+ * Based on arch/x86/include/asm/pkeys.h
+*/
+
+#ifndef _ASM_ARM64_PKEYS_H
+#define _ASM_ARM64_PKEYS_H
+
+#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2)
+
+#define arch_max_pkey() 0
+
+int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+   unsigned long init_val);
+
+static inline bool arch_pkeys_enabled(void)
+{
+   return false;
+}
+
+static inline int vma_pkey(struct vm_area_struct *vma)
+{
+   return -1;
+}
+
+static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
+   int prot, int pkey)
+{
+   return -1;
+}
+
+static inline int execute_only_pkey(struct mm_struct *mm)
+{
+   return -1;
+}
+
+static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
+{
+   return false;
+}
+
+static inline int mm_pkey_alloc(struct mm_struct *mm)
+{
+   return -1;
+}
+
+static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
+{
+   return -EINVAL;
+}
+
+#endif /* _ASM_ARM64_PKEYS_H */
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 15f6347d23b6..f7bf41eae904 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1486,3 +1486,10 @@ void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr, pte
 {
set_pte_at(vma->vm_mm, addr, ptep, pte);
 }
+
+#ifdef CONFIG_ARCH_HAS_PKEYS
+int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long init_val)
+{
+   return -ENOSPC;
+}
+#endif
-- 
2.25.1

[PATCH v3 09/25] arm64: define VM_PKEY_BIT* for arm64

2023-11-24 Thread Joey Gouly

Define the VM_PKEY_BIT* values for arm64, and convert them into the arm64
specific pgprot values.

Move the current values for x86 and PPC into arch/*.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/include/asm/mman.h   |  8 +++-
 arch/arm64/include/asm/page.h   | 10 ++
 arch/arm64/mm/mmap.c|  9 +
 arch/powerpc/include/asm/page.h | 11 +++
 arch/x86/include/asm/page.h | 10 ++
 fs/proc/task_mmu.c  |  2 ++
 include/linux/mm.h  | 13 -
 7 files changed, 49 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
index 5966ee4a6154..ecb2d18dc4d7 100644
--- a/arch/arm64/include/asm/mman.h
+++ b/arch/arm64/include/asm/mman.h
@@ -7,7 +7,7 @@
 #include 
 
 static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
-   unsigned long pkey __always_unused)
+   unsigned long pkey)
 {
unsigned long ret = 0;
 
@@ -17,6 +17,12 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
if (system_supports_mte() && (prot & PROT_MTE))
ret |= VM_MTE;
 
+#if defined(CONFIG_ARCH_HAS_PKEYS)
+   ret |= pkey & 0x1 ? VM_PKEY_BIT0 : 0;
+   ret |= pkey & 0x2 ? VM_PKEY_BIT1 : 0;
+   ret |= pkey & 0x4 ? VM_PKEY_BIT2 : 0;
+#endif
+
return ret;
 }
 #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 2312e6ee595f..aabfda2516d2 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -49,6 +49,16 @@ int pfn_is_map_memory(unsigned long pfn);
 
 #define VM_DATA_DEFAULT_FLAGS  (VM_DATA_FLAGS_TSK_EXEC | VM_MTE_ALLOWED)
 
+#if defined(CONFIG_ARCH_HAS_PKEYS)
+/* A protection key is a 3-bit value */
+# define VM_PKEY_SHIFT VM_HIGH_ARCH_BIT_2
+# define VM_PKEY_BIT0  VM_HIGH_ARCH_2
+# define VM_PKEY_BIT1  VM_HIGH_ARCH_3
+# define VM_PKEY_BIT2  VM_HIGH_ARCH_4
+# define VM_PKEY_BIT3  0
+# define VM_PKEY_BIT4  0
+#endif
+
 #include 
 
 #endif
diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
index 645fe60d000f..2e2a5a9bcfa1 100644
--- a/arch/arm64/mm/mmap.c
+++ b/arch/arm64/mm/mmap.c
@@ -98,6 +98,15 @@ pgprot_t vm_get_page_prot(unsigned long vm_flags)
if (vm_flags & VM_MTE)
prot |= PTE_ATTRINDX(MT_NORMAL_TAGGED);
 
+#ifdef CONFIG_ARCH_HAS_PKEYS
+   if (vm_flags & VM_PKEY_BIT0)
+   prot |= PTE_PO_IDX_0;
+   if (vm_flags & VM_PKEY_BIT1)
+   prot |= PTE_PO_IDX_1;
+   if (vm_flags & VM_PKEY_BIT2)
+   prot |= PTE_PO_IDX_2;
+#endif
+
return __pgprot(prot);
 }
 EXPORT_SYMBOL(vm_get_page_prot);
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index e5fcc79b5bfb..a5e75ec333ad 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -330,6 +330,17 @@ static inline unsigned long kaslr_offset(void)
 }
 
 #include 
+
+#if defined(CONFIG_ARCH_HAS_PKEYS)
+# define VM_PKEY_SHIFT VM_HIGH_ARCH_BIT_0
+/* A protection key is a 5-bit value */
+# define VM_PKEY_BIT0  VM_HIGH_ARCH_0
+# define VM_PKEY_BIT1  VM_HIGH_ARCH_1
+# define VM_PKEY_BIT2  VM_HIGH_ARCH_2
+# define VM_PKEY_BIT3  VM_HIGH_ARCH_3
+# define VM_PKEY_BIT4  VM_HIGH_ARCH_4
+#endif /* CONFIG_ARCH_HAS_PKEYS */
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_PAGE_H */
diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index d18e5c332cb9..b770db1a21e7 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -87,5 +87,15 @@ static __always_inline u64 __is_canonical_address(u64 vaddr, u8 vaddr_bits)
 
 #define HAVE_ARCH_HUGETLB_UNMAPPED_AREA
 
+#if defined(CONFIG_ARCH_HAS_PKEYS)
+# define VM_PKEY_SHIFT VM_HIGH_ARCH_BIT_0
+/* A protection key is a 4-bit value */
+# define VM_PKEY_BIT0  VM_HIGH_ARCH_0
+# define VM_PKEY_BIT1  VM_HIGH_ARCH_1
+# define VM_PKEY_BIT2  VM_HIGH_ARCH_2
+# define VM_PKEY_BIT3  VM_HIGH_ARCH_3
+# define VM_PKEY_BIT4  0
+#endif /* CONFIG_ARCH_HAS_PKEYS */
+
 #endif /* __KERNEL__ */
 #endif /* _ASM_X86_PAGE_H */
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index ef2eb12906da..8c2790abeffb 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -691,7 +691,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
[ilog2(VM_PKEY_BIT0)]   = "",
[ilog2(VM_PKEY_BIT1)]   = "",
[ilog2(VM_PKEY_BIT2)]   = "",
+#if VM_PKEY_BIT3
[ilog2(VM_PKEY_BIT3)]   = "",
+#endif
 #if VM_PKEY_BIT4
[ilog2(VM_PKEY_BIT4)]   = "",
 #endif
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 418d26608ece..47f42d9893fe 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -328,19 +328,6 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HIGH_ARCH_5 BIT(VM_HIGH_ARCH_BIT_5)
 #endif /*
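
The pkey-to-VMA-flag conversion that this patch adds to
arch_calc_vm_prot_bits() can be sketched outside the kernel as follows. The
concrete bit positions are an assumption (VM_HIGH_ARCH_2..4 are taken to be
bits 34..36), and the helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed positions: VM_HIGH_ARCH_2, _3 and _4 as bits 34, 35 and 36. */
#define VM_PKEY_SHIFT 34
#define VM_PKEY_BIT0 (UINT64_C(1) << 34)
#define VM_PKEY_BIT1 (UINT64_C(1) << 35)
#define VM_PKEY_BIT2 (UINT64_C(1) << 36)
#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2)

/* Spread the low three bits of a pkey over the three VMA flag bits. */
static inline uint64_t pkey_to_vm_flags(unsigned int pkey)
{
	uint64_t ret = 0;

	ret |= (pkey & 0x1) ? VM_PKEY_BIT0 : 0;
	ret |= (pkey & 0x2) ? VM_PKEY_BIT1 : 0;
	ret |= (pkey & 0x4) ? VM_PKEY_BIT2 : 0;
	return ret;
}

/* Inverse direction: recover the 3-bit pkey from the VMA flags. */
static inline unsigned int vm_flags_to_pkey(uint64_t vm_flags)
{
	return (unsigned int)((vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT);
}
```

Because the three flag bits are contiguous in this sketch, the inverse is a
mask and shift. Only the low three bits of the key are encoded, so a key of 8
or more would alias a smaller one.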

[PATCH v3 10/25] arm64: mask out POIndex when modifying a PTE

2023-11-24 Thread Joey Gouly

When a PTE is modified, the POIndex bits must be masked off so that the new
value can be applied.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/include/asm/pgtable.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index b19a8aee684c..e45105336ca0 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -828,7 +828,8 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 */
const pteval_t mask = PTE_USER | PTE_PXN | PTE_UXN | PTE_RDONLY |
  PTE_PROT_NONE | PTE_VALID | PTE_WRITE | PTE_GP |
- PTE_ATTRINDX_MASK;
+ PTE_ATTRINDX_MASK | PTE_PO_IDX_MASK;
+
/* preserve the hardware dirty information */
if (pte_hw_dirty(pte))
pte = set_pte_bit(pte, __pgprot(PTE_DIRTY));
-- 
2.25.1
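
Why the mask matters can be shown with a small model of the
clear-then-apply pattern used by pte_modify(). This is a sketch under
assumptions: PTE_OTHER_MASK is a hypothetical stand-in for the existing
attribute bits, and the function name is illustrative only.

```c
#include <stdint.h>

typedef uint64_t pteval_t;

/* POIndex occupies bits 60..62, as defined elsewhere in this series. */
#define PTE_PO_IDX_MASK (UINT64_C(7) << 60)
/* Hypothetical stand-in for the other bits pte_modify() may change. */
#define PTE_OTHER_MASK  UINT64_C(0xff)

/* Bits inside 'mask' come from newprot; all other bits are kept from pte. */
static inline pteval_t pte_modify_sketch(pteval_t pte, pteval_t newprot,
                                         pteval_t mask)
{
	return (pte & ~mask) | (newprot & mask);
}
```

If PTE_PO_IDX_MASK is left out of the mask, the old POIndex survives the
update and the new protection key never reaches the PTE.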

[PATCH v3 08/25] arm64: add POIndex defines

2023-11-24 Thread Joey Gouly

The 3-bit POIndex is stored in the PTE at bits 60..62.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/include/asm/pgtable-hwdef.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index e4944d517c99..fe270fa39110 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -178,6 +178,16 @@
 #define PTE_PI_IDX_2   53  /* PXN */
 #define PTE_PI_IDX_3   54  /* UXN */
 
+/*
+ * POIndex[2:0] encoding (Permission Overlay Extension)
+ */
+#define PTE_PO_IDX_0   (_AT(pteval_t, 1) << 60)
+#define PTE_PO_IDX_1   (_AT(pteval_t, 1) << 61)
+#define PTE_PO_IDX_2   (_AT(pteval_t, 1) << 62)
+
+#define PTE_PO_IDX_MASK GENMASK_ULL(62, 60)
+
+
 /*
  * Memory Attribute override for Stage-2 (MemAttr[3:0])
  */
-- 
2.25.1
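
The POIndex layout described in this patch (a 3-bit field at PTE bits 60..62)
can be illustrated with two accessors. A minimal sketch: the shift macro and
both function names are hypothetical and do not appear in the series.

```c
#include <stdint.h>

#define PTE_PO_IDX_SHIFT 60 /* hypothetical name for the field offset */
#define PTE_PO_IDX_MASK  (UINT64_C(7) << PTE_PO_IDX_SHIFT) /* bits 62..60 */

/* Replace the POIndex field of a raw PTE value, leaving other bits alone. */
static inline uint64_t pte_set_po_idx(uint64_t pte, unsigned int idx)
{
	return (pte & ~PTE_PO_IDX_MASK) |
	       (((uint64_t)idx & 7) << PTE_PO_IDX_SHIFT);
}

/* Read the POIndex field back out of a raw PTE value. */
static inline unsigned int pte_get_po_idx(uint64_t pte)
{
	return (unsigned int)((pte & PTE_PO_IDX_MASK) >> PTE_PO_IDX_SHIFT);
}
```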

[PATCH v3 06/25] KVM: arm64: Save/restore POE registers

2023-11-24 Thread Joey Gouly

Define the new system registers that POE introduces and context switch them.

Signed-off-by: Joey Gouly 
Cc: Marc Zyngier 
Cc: Oliver Upton 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/include/asm/kvm_arm.h   |  4 ++--
 arch/arm64/include/asm/kvm_host.h  |  4 
 arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 10 ++
 arch/arm64/kvm/sys_regs.c  |  2 ++
 4 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index b85f46a73e21..597470e0b87b 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -346,14 +346,14 @@
  */
 #define __HFGRTR_EL2_RES0  (GENMASK(63, 56) | GENMASK(53, 51))
 #define __HFGRTR_EL2_MASK  GENMASK(49, 0)
-#define __HFGRTR_EL2_nMASK (GENMASK(58, 57) | GENMASK(55, 54) | BIT(50))
+#define __HFGRTR_EL2_nMASK (GENMASK(60, 57) | GENMASK(55, 54) | BIT(50))
 
 #define __HFGWTR_EL2_RES0  (GENMASK(63, 56) | GENMASK(53, 51) |\
 BIT(46) | BIT(42) | BIT(40) | BIT(28) | \
 GENMASK(26, 25) | BIT(21) | BIT(18) |  \
 GENMASK(15, 14) | GENMASK(10, 9) | BIT(2))
 #define __HFGWTR_EL2_MASK  GENMASK(49, 0)
-#define __HFGWTR_EL2_nMASK (GENMASK(58, 57) | GENMASK(55, 54) | BIT(50))
+#define __HFGWTR_EL2_nMASK (GENMASK(60, 57) | GENMASK(55, 54) | BIT(50))
 
 #define __HFGITR_EL2_RES0  GENMASK(63, 57)
 #define __HFGITR_EL2_MASK  GENMASK(54, 0)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 824f29f04916..fa9ebd8fce40 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -401,6 +401,10 @@ enum vcpu_sysreg {
PIR_EL1,   /* Permission Indirection Register 1 (EL1) */
PIRE0_EL1, /*  Permission Indirection Register 0 (EL1) */
 
+   /* Permission Overlay Extension registers */
+   POR_EL1,/* Permission Overlay Register 1 (EL1) */
+   POR_EL0,/* Permission Overlay Register 0 (EL0) */
+
/* 32bit specific registers. */
DACR32_EL2, /* Domain Access Control Register */
IFSR32_EL2, /* Instruction Fault Status Register */
diff --git a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
index bb6b571ec627..22f07ee43e7e 100644
--- a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
+++ b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
@@ -19,6 +19,9 @@
 static inline void __sysreg_save_common_state(struct kvm_cpu_context *ctxt)
 {
ctxt_sys_reg(ctxt, MDSCR_EL1)   = read_sysreg(mdscr_el1);
+
+   if (system_supports_poe())
+   ctxt_sys_reg(ctxt, POR_EL0) = read_sysreg_s(SYS_POR_EL0);
 }
 
 static inline void __sysreg_save_user_state(struct kvm_cpu_context *ctxt)
@@ -59,6 +62,8 @@ static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
ctxt_sys_reg(ctxt, PIR_EL1) = read_sysreg_el1(SYS_PIR);
ctxt_sys_reg(ctxt, PIRE0_EL1)   = read_sysreg_el1(SYS_PIRE0);
}
+   if (system_supports_poe())
+   ctxt_sys_reg(ctxt, POR_EL1) = read_sysreg_el1(SYS_POR);
ctxt_sys_reg(ctxt, PAR_EL1) = read_sysreg_par();
ctxt_sys_reg(ctxt, TPIDR_EL1)   = read_sysreg(tpidr_el1);
 
@@ -89,6 +94,9 @@ static inline void __sysreg_save_el2_return_state(struct kvm_cpu_context *ctxt)
 static inline void __sysreg_restore_common_state(struct kvm_cpu_context *ctxt)
 {
write_sysreg(ctxt_sys_reg(ctxt, MDSCR_EL1),  mdscr_el1);
+
+   if (system_supports_poe())
+   write_sysreg_s(ctxt_sys_reg(ctxt, POR_EL0), SYS_POR_EL0);
 }
 
 static inline void __sysreg_restore_user_state(struct kvm_cpu_context *ctxt)
@@ -135,6 +143,8 @@ static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
write_sysreg_el1(ctxt_sys_reg(ctxt, PIR_EL1),   SYS_PIR);
write_sysreg_el1(ctxt_sys_reg(ctxt, PIRE0_EL1), SYS_PIRE0);
}
+   if (system_supports_poe())
+   write_sysreg_el1(ctxt_sys_reg(ctxt, POR_EL1),   SYS_POR);
write_sysreg(ctxt_sys_reg(ctxt, PAR_EL1),   par_el1);
write_sysreg(ctxt_sys_reg(ctxt, TPIDR_EL1), tpidr_el1);
 
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 4735e1b37fb3..a54e5eadbf29 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2269,6 +2269,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 },
{ SYS_DESC(SYS_PIRE0_EL1), NULL, reset_unknown, PIRE0_EL1 },
{ SYS_DESC(SYS_PIR_EL1), NULL, reset_unknown, PIR_EL1 },
+   { SYS_DESC(SYS_POR_EL1), NULL, reset_unknown, POR_EL1 },
{ SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 },
 
{ SYS_DESC(SYS_LORSA_EL1), trap_loregion },
@@ -2352,6

[PATCH v3 05/25] arm64: context switch POR_EL0 register

2023-11-24 Thread Joey Gouly

POR_EL0 is a register that can be modified by userspace directly,
so it must be context switched.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/include/asm/cpufeature.h |  6 ++
 arch/arm64/include/asm/processor.h  |  1 +
 arch/arm64/include/asm/sysreg.h |  3 +++
 arch/arm64/kernel/process.c | 22 ++
 4 files changed, 32 insertions(+)

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index f6d416fe49b0..6870e4d46334 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -819,6 +819,12 @@ static inline bool system_supports_tlb_range(void)
return alternative_has_cap_unlikely(ARM64_HAS_TLB_RANGE);
 }
 
+static inline bool system_supports_poe(void)
+{
+   return IS_ENABLED(CONFIG_ARM64_POE) &&
+   alternative_has_cap_unlikely(ARM64_HAS_S1POE);
+}
+
 int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
 bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
 
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index e5bc54522e71..b3ad719c2d0c 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -179,6 +179,7 @@ struct thread_struct {
u64 sctlr_user;
u64 svcr;
u64 tpidr2_el0;
+   u64 por_el0;
 };
 
 static inline unsigned int thread_get_vl(struct thread_struct *thread,
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 9c2caf0efdc7..77a4797d0d54 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -1052,6 +1052,9 @@
 #define POE_RXW UL(0x7)
 #define POE_MASK   UL(0xf)
 
+/* Initial value for Permission Overlay Extension for EL0 */
+#define POR_EL0_INIT   POE_RXW
+
 #define ARM64_FEATURE_FIELD_BITS   4
 
 /* Defined for compatibility only, do not add new users. */
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 7387b68c745b..fc899c12d759 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -271,12 +271,19 @@ static void flush_tagged_addr_state(void)
clear_thread_flag(TIF_TAGGED_ADDR);
 }
 
+static void flush_poe(void)
+{
+   if (system_supports_poe())
+   write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
+}
+
 void flush_thread(void)
 {
fpsimd_flush_thread();
tls_thread_flush();
flush_ptrace_hw_breakpoint(current);
flush_tagged_addr_state();
+   flush_poe();
 }
 
 void arch_release_task_struct(struct task_struct *tsk)
@@ -374,6 +381,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
if (system_supports_tpidr2())
p->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
 
+   if (system_supports_poe())
+   p->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
+
if (stack_start) {
if (is_compat_thread(task_thread_info(p)))
childregs->compat_sp = stack_start;
@@ -498,6 +508,17 @@ static void erratum_1418040_new_exec(void)
preempt_enable();
 }
 
+static void permission_overlay_switch(struct task_struct *next)
+{
+   if (system_supports_poe()) {
+   current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
+   if (current->thread.por_el0 != next->thread.por_el0) {
+   write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
+   isb();
+   }
+   }
+}
+
 /*
  * __switch_to() checks current->thread.sctlr_user as an optimisation. Therefore
  * this function must be called with preemption disabled and the update to
@@ -533,6 +554,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
ssbs_thread_switch(next);
erratum_1418040_thread_switch(next);
ptrauth_thread_switch_user(next);
+   permission_overlay_switch(next);
 
/*
 * Complete any pending TLB or cache maintenance on this CPU in case
-- 
2.25.1
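
The save-then-conditionally-write pattern of permission_overlay_switch() can
be modelled in userspace with a simulated register. This is only a sketch:
the fake_* variables and all function names are hypothetical, and the real
code also issues an isb() after the write, which has no equivalent here.

```c
#include <stdint.h>

struct thread_sketch {
	uint64_t por_el0; /* saved copy of the per-thread register */
};

static uint64_t fake_por_el0 = 0x7; /* simulated hardware register */
static unsigned int fake_por_writes; /* counts simulated register writes */

static uint64_t read_por(void)
{
	return fake_por_el0;
}

static void write_por(uint64_t val)
{
	fake_por_el0 = val;
	fake_por_writes++;
}

/*
 * Userspace may have changed the register since the last switch, so the
 * live value is always saved. The write is skipped when the incoming task
 * already expects the same value.
 */
static void permission_overlay_switch_sketch(struct thread_sketch *prev,
                                             const struct thread_sketch *next)
{
	prev->por_el0 = read_por();
	if (prev->por_el0 != next->por_el0)
		write_por(next->por_el0);
}

/* Usage example: two switches, only the first changes the register. */
static unsigned int demo_switches(void)
{
	struct thread_sketch a = { 0 }, b = { 0x77 }, c = { 0x77 };

	fake_por_el0 = 0x7;
	fake_por_writes = 0;
	permission_overlay_switch_sketch(&a, &b); /* 0x7 -> 0x77: one write */
	permission_overlay_switch_sketch(&b, &c); /* 0x77 -> 0x77: skipped */
	return fake_por_writes;
}
```

Skipping the redundant write keeps the common case, where neighbouring tasks
share the same overlay permissions, free of a register write.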

[PATCH v3 07/25] arm64: enable the Permission Overlay Extension for EL0

2023-11-24 Thread Joey Gouly

Expose a HWCAP and ID_AA64MMFR3_EL1_S1POE to userspace, so they can be used to
check if the CPU supports the feature.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 Documentation/arch/arm64/elf_hwcaps.rst |  3 +++
 arch/arm64/include/asm/hwcap.h  |  1 +
 arch/arm64/include/uapi/asm/hwcap.h |  1 +
 arch/arm64/kernel/cpufeature.c  | 14 ++
 arch/arm64/kernel/cpuinfo.c |  1 +
 5 files changed, 20 insertions(+)

diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst
index ced7b335e2e0..fe7350a66cea 100644
--- a/Documentation/arch/arm64/elf_hwcaps.rst
+++ b/Documentation/arch/arm64/elf_hwcaps.rst
@@ -317,6 +317,9 @@ HWCAP2_LRCPC3
 HWCAP2_LSE128
 Functionality implied by ID_AA64ISAR0_EL1.Atomic == 0b0011.
 
+HWCAP2_POE
+Functionality implied by ID_AA64MMFR3_EL1.S1POE == 0b0001.
+
 4. Unused AT_HWCAP bits
 ---
 
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index cd71e09ea14d..9a1aa1e5e25c 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -142,6 +142,7 @@
 #define KERNEL_HWCAP_SVE_B16B16 __khwcap2_feature(SVE_B16B16)
 #define KERNEL_HWCAP_LRCPC3 __khwcap2_feature(LRCPC3)
 #define KERNEL_HWCAP_LSE128 __khwcap2_feature(LSE128)
+#define KERNEL_HWCAP_POE   __khwcap2_feature(POE)
 
 /*
  * This yields a mask that user programs can use to figure out what
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index 5023599fa278..69f09521b553 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -107,5 +107,6 @@
 #define HWCAP2_SVE_B16B16  (1UL << 45)
 #define HWCAP2_LRCPC3  (1UL << 46)
 #define HWCAP2_LSE128  (1UL << 47)
+#define HWCAP2_POE (1UL << 48)
 
 #endif /* _UAPI__ASM_HWCAP_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 00b6d516ed3f..02169cb3b84b 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -402,6 +402,8 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = {
 };
 
 static const struct arm64_ftr_bits ftr_id_aa64mmfr3[] = {
+   ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_POE),
+  FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_S1POE_SHIFT, 4, 0),
 ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_S1PIE_SHIFT, 4, 0),
 ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR3_EL1_TCRX_SHIFT, 4, 0),
ARM64_FTR_END,
@@ -2242,6 +2244,14 @@ static void cpu_enable_mops(const struct arm64_cpu_capabilities *__unused)
sysreg_clear_set(sctlr_el1, 0, SCTLR_EL1_MSCEn);
 }
 
+#ifdef CONFIG_ARM64_POE
+static void cpu_enable_poe(const struct arm64_cpu_capabilities *__unused)
+{
+   sysreg_clear_set(REG_TCR2_EL1, 0, TCR2_EL1x_E0POE);
+   sysreg_clear_set(CPACR_EL1, 0, CPACR_ELx_E0POE);
+}
+#endif
+
 /* Internal helper functions to match cpu capability type */
 static bool
 cpucap_late_cpu_optional(const struct arm64_cpu_capabilities *cap)
@@ -2737,6 +2747,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.capability = ARM64_HAS_S1POE,
.type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
.matches = has_cpuid_feature,
+   .cpu_enable = cpu_enable_poe,
ARM64_CPUID_FIELDS(ID_AA64MMFR3_EL1, S1POE, IMP)
},
 #endif
@@ -2889,6 +2900,9 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
 HWCAP_CAP(ID_AA64SMFR0_EL1, BI32I32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_BI32I32),
 HWCAP_CAP(ID_AA64SMFR0_EL1, F32F32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_F32F32),
 #endif /* CONFIG_ARM64_SME */
+#ifdef CONFIG_ARM64_POE
+   HWCAP_CAP(ID_AA64MMFR3_EL1, S1POE, IMP, CAP_HWCAP, KERNEL_HWCAP_POE),
+#endif
{},
 };
 
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index a257da7b56fe..5515c50f5219 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -130,6 +130,7 @@ static const char *const hwcap_str[] = {
[KERNEL_HWCAP_SVE_B16B16]   = "sveb16b16",
[KERNEL_HWCAP_LRCPC3]   = "lrcpc3",
[KERNEL_HWCAP_LSE128]   = "lse128",
+   [KERNEL_HWCAP_POE]  = "poe",
 };
 
 #ifdef CONFIG_COMPAT
-- 
2.25.1
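
A userspace program could test for the new hwcap roughly as follows. This is
a sketch under assumptions: the bit position is the one proposed in this
revision of the series and may differ in a kernel that eventually merges it,
and the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 48 of AT_HWCAP2, as proposed in this revision of the series. */
#define HWCAP2_POE_SKETCH (UINT64_C(1) << 48)

/* Returns true when the given AT_HWCAP2 word advertises POE. */
static inline bool hwcap2_has_poe(uint64_t hwcap2)
{
	return (hwcap2 & HWCAP2_POE_SKETCH) != 0;
}
```

On Linux the argument would normally come from getauxval(AT_HWCAP2), declared
in <sys/auxv.h>. Keeping the check a pure function of the hwcap word makes it
testable on machines without the feature.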

[PATCH v3 04/25] arm64: disable trapping of POR_EL0 to EL2

2023-11-24 Thread Joey Gouly

Allow EL0 or EL1 to access POR_EL0 without being trapped to EL2.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/include/asm/el2_setup.h | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index b7afaa026842..df5614be4b70 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -184,12 +184,20 @@
 .Lset_pie_fgt_\@:
mrs_s   x1, SYS_ID_AA64MMFR3_EL1
 ubfx    x1, x1, #ID_AA64MMFR3_EL1_S1PIE_SHIFT, #4
-   cbz x1, .Lset_fgt_\@
+   cbz x1, .Lset_poe_fgt_\@
 
/* Disable trapping of PIR_EL1 / PIRE0_EL1 */
orr x0, x0, #HFGxTR_EL2_nPIR_EL1
orr x0, x0, #HFGxTR_EL2_nPIRE0_EL1
 
+.Lset_poe_fgt_\@:
+   mrs_s   x1, SYS_ID_AA64MMFR3_EL1
+   ubfx    x1, x1, #ID_AA64MMFR3_EL1_S1POE_SHIFT, #4
+   cbz x1, .Lset_fgt_\@
+
+   /* Disable trapping of POR_EL0 */
+   orr x0, x0, #HFGxTR_EL2_nPOR_EL0
+
 .Lset_fgt_\@:
msr_s   SYS_HFGRTR_EL2, x0
msr_s   SYS_HFGWTR_EL2, x0
-- 
2.25.1

[PATCH v3 02/25] arm64/sysreg: update CPACR_EL1 register

2023-11-24 Thread Joey Gouly

Add E0POE bit that traps accesses to POR_EL0 from EL0.
Updated according to DDI0601 2023-03.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
Reviewed-by: Mark Brown 
---
 arch/arm64/tools/sysreg | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 940040e82399..5209693239e9 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -1747,7 +1747,8 @@ Field 0   M
 EndSysreg
 
 SysregFields   CPACR_ELx
-Res0   63:29
+Res0   63:30
+Field  29  E0POE
 Field  28  TTA
 Res0   27:26
 Field  25:24   SMEN
-- 
2.25.1

[PATCH v3 03/25] arm64: cpufeature: add Permission Overlay Extension cpucap

2023-11-24 Thread Joey Gouly

This indicates whether the system supports POE. It is an
ARM64_CPUCAP_BOOT_CPU_FEATURE because the boot CPU will enable POE if it has
it, so secondary CPUs must also have this feature.

Add a new config option: ARM64_POE

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm64/Kconfig | 16 
 arch/arm64/kernel/cpufeature.c |  9 +
 arch/arm64/tools/cpucaps   |  1 +
 3 files changed, 26 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7b071a00425d..d7df6c603190 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2078,6 +2078,22 @@ config ARM64_EPAN
  if the cpu does not implement the feature.
 endmenu # "ARMv8.7 architectural features"
 
+menu "ARMv8.9 architectural features"
+config ARM64_POE
+   prompt "Permission Overlay Extension"
+   def_bool y
+   help
+ The Permission Overlay Extension is used to implement Memory
+ Protection Keys. Memory Protection Keys provides a mechanism for
+ enforcing page-based protections, but without requiring modification
+ of the page tables when an application changes protection domains.
+
+ For details, see Documentation/core-api/protection-keys.rst
+
+ If unsure, say y.
+
+endmenu # "ARMv8.9 architectural features"
+
 config ARM64_SVE
bool "ARM Scalable Vector Extension support"
default y
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 646591c67e7a..00b6d516ed3f 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2731,6 +2731,15 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.matches = has_cpuid_feature,
ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, EVT, IMP)
},
+#ifdef CONFIG_ARM64_POE
+   {
+   .desc = "Stage-1 Permission Overlay Extension (S1POE)",
+   .capability = ARM64_HAS_S1POE,
+   .type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
+   .matches = has_cpuid_feature,
+   ARM64_CPUID_FIELDS(ID_AA64MMFR3_EL1, S1POE, IMP)
+   },
+#endif
{},
 };
 
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index b98c38288a9d..bbd2fac9345a 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -43,6 +43,7 @@ HAS_NESTED_VIRT
 HAS_NO_HW_PREFETCH
 HAS_PAN
 HAS_S1PIE
+HAS_S1POE
 HAS_RAS_EXTN
 HAS_RNG
 HAS_SB
-- 
2.25.1

[PATCH v3 00/25] Permission Overlay Extension

2023-11-24 Thread Joey Gouly

Hello everyone,

This series implements the Permission Overlay Extension introduced in the 2022
VMSA enhancements [1]. It is based on v6.7-rc2.

Changes since v2[2]:
# Add ptrace support and selftest
# Add missing POR_EL0 initialisation in fork/clone
# Rebase onto v6.7-rc2
# Add Reviewed-by tags

The Permission Overlay Extension allows permissions on memory regions to be
constrained. This can be done from userspace (EL0) without a system call or
TLB invalidation.

POE is used to implement the Memory Protection Keys [3] Linux syscalls.

The first few patches add the basic framework, then the PKEYS interface is
implemented, and then the selftests are made to work on arm64.

There was discussion about what the 'default' protection key value should be.
I used disallow-all (apart from pkey 0), which matches what x86 does.

I have tested the modified protection_keys test on x86_64, but not on PPC.
I haven't build-tested the x86/ppc arch changes.

Thanks,
Joey

[1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-2022
[2] https://lore.kernel.org/linux-arm-kernel/20231027180850.1068089-1-joey.go...@arm.com/
[3] Documentation/core-api/protection-keys.rst

Joey Gouly (25):
  arm64/sysreg: add system register POR_EL{0,1}
  arm64/sysreg: update CPACR_EL1 register
  arm64: cpufeature: add Permission Overlay Extension cpucap
  arm64: disable trapping of POR_EL0 to EL2
  arm64: context switch POR_EL0 register
  KVM: arm64: Save/restore POE registers
  arm64: enable the Permission Overlay Extension for EL0
  arm64: add POIndex defines
  arm64: define VM_PKEY_BIT* for arm64
  arm64: mask out POIndex when modifying a PTE
  arm64: enable ARCH_HAS_PKEYS on arm64
  arm64: handle PKEY/POE faults
  arm64: stop using generic mm_hooks.h
  arm64: implement PKEYS support
  arm64: add POE signal support
  arm64: enable PKEY support for CPUs with S1POE
  arm64: enable POE and PIE to coexist
  arm64/ptrace: add support for FEAT_POE
  kselftest/arm64: move get_header()
  selftests: mm: move fpregs printing
  selftests: mm: make protection_keys test work on arm64
  kselftest/arm64: add HWCAP test for FEAT_S1POE
  kselftest/arm64: parse POE_MAGIC in a signal frame
  kselftest/arm64: Add test case for POR_EL0 signal frame records
  KVM: selftests: get-reg-list: add Permission Overlay registers

 Documentation/arch/arm64/elf_hwcaps.rst   |   3 +
 arch/arm64/Kconfig|  18 +++
 arch/arm64/include/asm/cpufeature.h   |   6 +
 arch/arm64/include/asm/el2_setup.h|  10 +-
 arch/arm64/include/asm/hwcap.h|   1 +
 arch/arm64/include/asm/kvm_arm.h  |   4 +-
 arch/arm64/include/asm/kvm_host.h |   4 +
 arch/arm64/include/asm/mman.h |   8 +-
 arch/arm64/include/asm/mmu.h  |   2 +
 arch/arm64/include/asm/mmu_context.h  |  51 ++-
 arch/arm64/include/asm/page.h |  10 ++
 arch/arm64/include/asm/pgtable-hwdef.h|  10 ++
 arch/arm64/include/asm/pgtable-prot.h |   8 +-
 arch/arm64/include/asm/pgtable.h  |  26 +++-
 arch/arm64/include/asm/pkeys.h| 110 ++
 arch/arm64/include/asm/por.h  |  33 +
 arch/arm64/include/asm/processor.h|   1 +
 arch/arm64/include/asm/sysreg.h   |  16 ++
 arch/arm64/include/asm/traps.h|   1 +
 arch/arm64/include/uapi/asm/hwcap.h   |   1 +
 arch/arm64/include/uapi/asm/sigcontext.h  |   7 +
 arch/arm64/kernel/cpufeature.c|  23 +++
 arch/arm64/kernel/cpuinfo.c   |   1 +
 arch/arm64/kernel/process.c   |  22 +++
 arch/arm64/kernel/ptrace.c|  46 ++
 arch/arm64/kernel/signal.c|  51 +++
 arch/arm64/kernel/traps.c |  12 +-
 arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h|  10 ++
 arch/arm64/kvm/sys_regs.c |   2 +
 arch/arm64/mm/fault.c |  44 +-
 arch/arm64/mm/mmap.c  |   9 ++
 arch/arm64/mm/mmu.c   |  40 +
 arch/arm64/tools/cpucaps  |   1 +
 arch/arm64/tools/sysreg   |  15 +-
 arch/powerpc/include/asm/page.h   |  11 ++
 arch/x86/include/asm/page.h   |  10 ++
 fs/proc/task_mmu.c|   2 +
 include/linux/mm.h|  13 --
 include/uapi/linux/elf.h  |   1 +
 tools/testing/selftests/arm64/abi/hwcap.c |  14 ++
 .../testing/selftests/arm64/signal/.gitignore |   1 +
 .../arm64/signal/testcases/poe_siginfo.c  |  86 +++
 .../arm64/signal/testcases/testcases.c|  27 +---
 .../arm64/signal/testcases/testcases.h|  28 +++-
 .../selftests/kvm/aarch64/get-reg-list.c  |  14 ++
 tools/testing/selftests/mm/Makefile   |   2 +-

[PATCH v3 01/25] arm64/sysreg: add system register POR_EL{0,1}

2023-11-24 Thread Joey Gouly

Add POR_EL{0,1} according to DDI0601 2023-03.

Signed-off-by: Joey Gouly 
Cc: Catalin Marinas 
Cc: Will Deacon 
Reviewed-by: Mark Brown 
---
 arch/arm64/include/asm/sysreg.h | 13 +
 arch/arm64/tools/sysreg | 12 
 2 files changed, 25 insertions(+)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 5e65f51c10d2..9c2caf0efdc7 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -1039,6 +1039,19 @@
 
 #define PIRx_ELx_PERM(idx, perm)   ((perm) << ((idx) * 4))
 
+/*
+ * Permission Overlay Extension (POE) permission encodings.
+ */
+#define POE_NONE   UL(0x0)
+#define POE_R  UL(0x1)
+#define POE_X  UL(0x2)
+#define POE_RX UL(0x3)
+#define POE_W  UL(0x4)
+#define POE_RW UL(0x5)
+#define POE_XW UL(0x6)
+#define POE_RXW UL(0x7)
+#define POE_MASK   UL(0xf)
+
 #define ARM64_FEATURE_FIELD_BITS   4
 
 /* Defined for compatibility only, do not add new users. */
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 96cbeeab4eec..940040e82399 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -2510,6 +2510,18 @@ Sysreg   PIR_EL2 3   4   10  2   3
 Fields PIRx_ELx
 EndSysreg
 
+Sysreg POR_EL0 3   3   10  2   4
+Fields PIRx_ELx
+EndSysreg
+
+Sysreg POR_EL1 3   0   10  2   4
+Fields PIRx_ELx
+EndSysreg
+
+Sysreg POR_EL12    3   5   10  2   4
+Fields PIRx_ELx
+EndSysreg
+
 Sysreg LORSA_EL1   3   0   10  4   0
 Res0   63:52
 Field  51:16   SA
-- 
2.25.1

Re: [RFC PATCH 0/2] Add a test to verify device probing on ACPI platforms

2023-11-24 Thread Laura Nao

On 11/23/23 16:14, Dan Carpenter wrote:
> On Thu, Nov 23, 2023 at 01:09:42PM +0100, Laura Nao wrote:
>>> Your talk was interesting at Linux Plumbers.
>>>
>>> https://www.youtube.com/watch?v=oE73eVSyFXQ [time +2:35]
>>>
>>> This is probably a stupid question, but why not just add something to
>>> call_driver_probe() which creates a sysfs directory tree with all the
>>> driver information?
>>>
>>
>> Thanks for the feedback!
>>
>> Improving the device driver model to publish driver and devices info
>> was indeed another option we considered. We could have a debugfs entry
>> storing this kind of information, similar to what devices_deferred
>> does and in a standardized format. This would provide an interface
>> that is easier to query at runtime for getting a list of devices that
>> were probed correctly.
>> This would cover devices with a driver that's built into the kernel or
>> as a module; in view of catching also those cases where a device is
>> not probed because the relevant config is not enabled, I think we'd
>> still need another way of building a list of devices present on the
>> platform to be used as reference.
> 
> Yeah.  So we'd still need patch #1 as-is, but patch #2 would probably
> be simpler if we had this information in sysfs.  Or a different solution
> would be to do what someone said in the LPC talk and just save the
> output of the previous boot and complain if there was a regression where
> something didn't probe.
> 

Right. The main drawback of using the status of a known good boot as
reference is that it has to be kept up to date over time. If support for a
peripheral gets added at a later stage, the reference needs to be updated
as well.

>>
>> The solution proposed in this RFC follows the same approach used for
>> dt based platforms for simplicity. But if adding a new sysfs entry
>> storing devices and driver info proves to be a viable option for
>> upstream, we can surely explore it and improve the probe test to
>> leverage that.
> 
> You're saying "simplicity" but I think you mean easiest from a political
> point of view.  It's not the most simple format at all.  It's like
> massive detective work to find the information and then you'll have to
> redo it for DT and for USB.  Are there other kinds of devices which can
> be probed?
> 

Yeah, that's what I meant. The ACPI use case is in a way simpler to
handle than the dt one, as we can get information on non removable
devices on enumerable buses such as PCI from the ACPI
tables (leveraging the _ADR objects). But it still requires quite a lot of
digging in sysfs to get info on what was actually probed.
So having a list of probed devices would help both use cases.

> I feel like you're not valuing your stuff at the right level.  This
> shouldn't be in debugfs.  It should be a first class citizen in sysfs.
> 
> The exact format for this information is slightly tricky and people will
> probably debate that.  But I think most people will agree that it's
> super useful.
>

Right, agreeing on a format will be tricky. Judging by the response here
and at LPC it's still worth a shot though. I'll put some thought into
this and experiment a bit to come up with a proposal to submit in
another RFC.

Again, thanks for the helpful feedback!

Best,
Laura

[PATCH net-next 5/5] selftests: tc-testing: remove unused import

2023-11-24 Thread Pedro Tammela

Remove this leftover from the times we pre-allocated everything.

Signed-off-by: Pedro Tammela 
---
 tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
index 77b1106b8388..bb19b8b76d3b 100644
--- a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
+++ b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
@@ -23,8 +23,6 @@ class SubPlugin(TdcPlugin):
         super().__init__()
 
     def pre_suite(self, testcount, testlist):
-        from itertools import cycle
-
         super().pre_suite(testcount, testlist)
 
     def prepare_test(self, test):
-- 
2.40.1

[PATCH net-next 4/5] selftests: tc-testing: cleanup on Ctrl-C

2023-11-24 Thread Pedro Tammela

Cleanup net namespaces and other resources if we get a SIGINT (Ctrl-C).
As user visible resources are allocated on a per test basis, it's only
required to catch this condition when (possibly) running tests.

So far calling post_suite is enough to free up anything that might
linger.

A missing keyword replacement for nsPlugin is also included.

Signed-off-by: Pedro Tammela 
---
 tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py | 2 +-
 tools/testing/selftests/tc-testing/tdc.py | 6 +-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
index dc7a0597cf44..77b1106b8388 100644
--- a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
+++ b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
@@ -78,7 +78,7 @@ class SubPlugin(TdcPlugin):
             print('{}.post_suite'.format(self.sub_class))
 
         # Make sure we don't leak resources
-        cmd = "$IP -a netns del"
+        cmd = self._replace_keywords("$IP -a netns del")
 
         if self.args.verbose > 3:
             print('_exec_cmd:  command "{}"'.format(cmd))
diff --git a/tools/testing/selftests/tc-testing/tdc.py b/tools/testing/selftests/tc-testing/tdc.py
index c5ec861687b6..caeacc691587 100755
--- a/tools/testing/selftests/tc-testing/tdc.py
+++ b/tools/testing/selftests/tc-testing/tdc.py
@@ -1018,7 +1018,11 @@ def main():
     if args.verbose > 2:
         print('args is {}'.format(args))
 
-    set_operation_mode(pm, parser, args, remaining)
+    try:
+        set_operation_mode(pm, parser, args, remaining)
+    except KeyboardInterrupt:
+        # Cleanup on Ctrl-C
+        pm.call_post_suite(None)
 
 if __name__ == "__main__":
     main()
-- 
2.40.1

[PATCH net-next 3/5] selftests: tc-testing: prefix iproute2 functions with "ipr2"

2023-11-24 Thread Pedro Tammela

As suggested by Simon, prefix the functions that operate on iproute2
commands in contrast with the "nl" netlink prefix.

Cc: Simon Horman 
Signed-off-by: Pedro Tammela 
---
 .../selftests/tc-testing/plugin-lib/nsPlugin.py  | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
index 65c8f3f983b9..dc7a0597cf44 100644
--- a/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
+++ b/tools/testing/selftests/tc-testing/plugin-lib/nsPlugin.py
@@ -37,7 +37,7 @@ class SubPlugin(TdcPlugin):
 if netlink == True:
 self._nl_ns_create()
 else:
-self._ns_create()
+self._ipr2_ns_create()
 
 # Make sure the netns is visible in the fs
 ticks = 20
@@ -71,7 +71,7 @@ class SubPlugin(TdcPlugin):
 if netlink == True:
 self._nl_ns_destroy()
 else:
-self._ns_destroy()
+self._ipr2_ns_destroy()
 
 def post_suite(self, index):
 if self.args.verbose:
@@ -161,7 +161,7 @@ class SubPlugin(TdcPlugin):
 ticks -= 1
 continue
 
-def _ns_create_cmds(self):
+def _ipr2_ns_create_cmds(self):
 cmds = []
 
 ns = self.args.NAMES['NS']
@@ -181,26 +181,26 @@ class SubPlugin(TdcPlugin):
 
 return cmds
 
-def _ns_create(self):
+def _ipr2_ns_create(self):
 '''
 Create the network namespace in which the tests will be run and set up
 the required network devices for it.
 '''
-self._exec_cmd_batched('pre', self._ns_create_cmds())
+self._exec_cmd_batched('pre', self._ipr2_ns_create_cmds())
 
 def _nl_ns_destroy(self):
 ns = self.args.NAMES['NS']
 netns.remove(ns)
 
-def _ns_destroy_cmd(self):
+def _ipr2_ns_destroy_cmd(self):
 return self._replace_keywords('netns delete {}'.format(self.args.NAMES['NS']))
 
-def _ns_destroy(self):
+def _ipr2_ns_destroy(self):
 '''
 Destroy the network namespace for testing (and any associated network
 devices as well)
 '''
-self._exec_cmd('post', self._ns_destroy_cmd())
+self._exec_cmd('post', self._ipr2_ns_destroy_cmd())
 
 @cached_property
 def _proc(self):
-- 
2.40.1

[PATCH net-next 2/5] selftests: tc-testing: remove unnecessary time.sleep

2023-11-24 Thread Pedro Tammela

This operation is redundant: it neither stabilizes nor waits for
anything.

Signed-off-by: Pedro Tammela 
---
 tools/testing/selftests/tc-testing/tdc.py | 5 -
 1 file changed, 5 deletions(-)

diff --git a/tools/testing/selftests/tc-testing/tdc.py b/tools/testing/selftests/tc-testing/tdc.py
index 669ec89ebfe1..c5ec861687b6 100755
--- a/tools/testing/selftests/tc-testing/tdc.py
+++ b/tools/testing/selftests/tc-testing/tdc.py
@@ -497,11 +497,6 @@ def prepare_run(pm, args, testlist):
         pm.call_post_suite(1)
         return emergency_exit_message
 
-    if args.verbose:
-        print('give test rig 2 seconds to stabilize')
-
-    time.sleep(2)
-
 def purge_run(pm, index):
     pm.call_post_suite(index)
 
-- 
2.40.1

[PATCH net-next 1/5] selftests: tc-testing: remove buildebpf plugin

2023-11-24 Thread Pedro Tammela

As tdc only tests loading/deleting and anything more complicated is
better left to the ebpf test suite, provide a pre-compiled version of
'action.c' and don't bother compiling it in kselftests or on the fly
at all.

Cc: Davide Caratti 
Signed-off-by: Pedro Tammela 
---
 tools/testing/selftests/tc-testing/Makefile   |  29 +---
 tools/testing/selftests/tc-testing/README |   2 -
 .../testing/selftests/tc-testing/action-ebpf  | Bin 0 -> 856 bytes
 .../tc-testing/plugin-lib/buildebpfPlugin.py  |  67 --
 .../tc-testing/tc-tests/actions/bpf.json  |  14 ++--
 .../tc-testing/tc-tests/filters/bpf.json  |  10 ++-
 tools/testing/selftests/tc-testing/tdc.sh |   2 +-
 7 files changed, 11 insertions(+), 113 deletions(-)
 create mode 100644 tools/testing/selftests/tc-testing/action-ebpf
 delete mode 100644 tools/testing/selftests/tc-testing/plugin-lib/buildebpfPlugin.py

diff --git a/tools/testing/selftests/tc-testing/Makefile b/tools/testing/selftests/tc-testing/Makefile
index b1fa2e177e2f..e8b3dde4fa16 100644
--- a/tools/testing/selftests/tc-testing/Makefile
+++ b/tools/testing/selftests/tc-testing/Makefile
@@ -1,31 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
-include ../../../scripts/Makefile.include
 
-top_srcdir = $(abspath ../../../..)
-APIDIR := $(top_scrdir)/include/uapi
-TEST_GEN_FILES = action.o
+TEST_PROGS += ./tdc.sh
+TEST_FILES := action-ebpf tdc*.py Tdc*.py plugins plugin-lib tc-tests scripts
 
 include ../lib.mk
-
-PROBE := $(shell $(LLC) -march=bpf -mcpu=probe -filetype=null /dev/null 2>&1)
-
-ifeq ($(PROBE),)
-  CPU ?= probe
-else
-  CPU ?= generic
-endif
-
-CLANG_SYS_INCLUDES := $(shell $(CLANG) -v -E - </dev/null 2>&1 \
-   | sed -n '/<...> search starts here:/,/End of search list./{ s| \(/.*\)|-idirafter \1|p }')
-
-CLANG_FLAGS = -I. -I$(APIDIR) \
- $(CLANG_SYS_INCLUDES) \
- -Wno-compare-distinct-pointer-types
-
-$(OUTPUT)/%.o: %.c
-   $(CLANG) $(CLANG_FLAGS) \
--O2 --target=bpf -emit-llvm -c $< -o - |  \
-   $(LLC) -march=bpf -mcpu=$(CPU) $(LLC_FLAGS) -filetype=obj -o $@
-
-TEST_PROGS += ./tdc.sh
-TEST_FILES := tdc*.py Tdc*.py plugins plugin-lib tc-tests scripts
diff --git a/tools/testing/selftests/tc-testing/README b/tools/testing/selftests/tc-testing/README
index be7b00799b3e..fc8e858ff119 100644
--- a/tools/testing/selftests/tc-testing/README
+++ b/tools/testing/selftests/tc-testing/README
@@ -195,8 +195,6 @@ directory:
   and the other is a test whether the command leaked memory or not.
   (This one is a preliminary version, it may not work quite right yet,
   but the overall template is there and it should only need tweaks.)
-  - buildebpfPlugin.py:
-  builds all programs in $EBPFDIR.
 
 
 ACKNOWLEDGEMENTS
diff --git a/tools/testing/selftests/tc-testing/action-ebpf b/tools/testing/selftests/tc-testing/action-ebpf
new file mode 100644
index 0000000000000000000000000000000000000000..4879479b2ee5c046279be0fe8f9ca313dfb7e618
GIT binary patch
literal 856
zcmb_ayKcfj5L_FFP=-`UX`o1n`2r$0A*IpgTFnLKX%`_!N;U`3b&--wH~R684VW
zGukMra)oDhcIIBb_s4kTdmixc;2Y|SRe;%r7+E=j7CQH2*%9vjGf8`~C9?lCIqPKq
z0V7lbI2>i;4uxB2IQfRywbcWscZm%V+i>M{cKD3|LY-|jB?M`i`k`$r`e-
zC|*}8na?*>z5rF^X|}F1GK49FmEP#&8S!mp@PEb_r>Rd{&-q1E)skfwzsJ=^YYJZ^
zYA*SHxV}g7SDx>m{VgVh?O*Z}>URklWc~pgW_@`FFBFjbmFFLrY 0:
-foutput = serr.decode("utf-8")
-else:
-foutput = rawout.decode("utf-8")
-
-proc.stdout.close()
-proc.stderr.close()
-return proc, foutput
diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/bpf.json b/tools/testing/selftests/tc-testing/tc-tests/actions/bpf.json
index 91832400ddbd..6e00bf32ef9a 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/bpf.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/bpf.json
@@ -54,9 +54,6 @@
 "actions",
 "bpf"
 ],
-"plugins": {
-"requires": "buildebpfPlugin"
-},
 "setup": [
 [
 "$TC action flush action bpf",
@@ -65,10 +62,10 @@
 255
 ]
 ],
-"cmdUnderTest": "$TC action add action bpf object-file $EBPFDIR/action.o section action-ok index 667",
+"cmdUnderTest": "$TC action add action bpf object-file $EBPFDIR/action-ebpf section action-ok index 667",
 "expExitCode": "0",
 "verifyCmd": "$TC action get action bpf index 667",
-"matchPattern": "action order [0-9]*: bpf action.o:\\[action-ok\\] id [0-9].* tag [0-9a-f]{16}( jited)? default-action pipe.*index 667 ref",
+"matchPattern": "action order [0-9]*: bpf action-ebpf:\\[action-ok\\] id [0-9].* tag [0-9a-f]{16}( jited)? default-action pipe.*index 667 ref",
 "matchCount": "1",
 "teardown": [
 "$TC action flush action bpf"
@@ -81,9 +78,6 @@
 "actions",
 "bpf"
 ],
-"plugins": {
-"requires":

[PATCH net-next 0/5] selftests: tc-testing: updates and cleanups for tdc

2023-11-24 Thread Pedro Tammela

Address the recommendations from the previous series and clean up some
leftovers.

Pedro Tammela (5):
  selftests: tc-testing: remove buildebpf plugin
  selftests: tc-testing: remove unnecessary time.sleep
  selftests: tc-testing: prefix iproute2 functions with "ipr2"
  selftests: tc-testing: cleanup on Ctrl-C
  selftests: tc-testing: remove unused import

 tools/testing/selftests/tc-testing/Makefile   |  29 +---
 tools/testing/selftests/tc-testing/README |   2 -
 .../testing/selftests/tc-testing/action-ebpf  | Bin 0 -> 856 bytes
 .../tc-testing/plugin-lib/buildebpfPlugin.py  |  67 --
 .../tc-testing/plugin-lib/nsPlugin.py |  20 +++---
 .../tc-testing/tc-tests/actions/bpf.json  |  14 ++--
 .../tc-testing/tc-tests/filters/bpf.json  |  10 ++-
 tools/testing/selftests/tc-testing/tdc.py |  11 ++-
 tools/testing/selftests/tc-testing/tdc.sh |   2 +-
 9 files changed, 25 insertions(+), 130 deletions(-)
 create mode 100644 tools/testing/selftests/tc-testing/action-ebpf
 delete mode 100644 tools/testing/selftests/tc-testing/plugin-lib/buildebpfPlugin.py

-- 
2.40.1

Re: [PATCH net-next 01/38] selftests/net: add lib.sh

2023-11-24 Thread Petr Machata

Petr Machata  writes:

> Hangbin Liu  writes:
>
>> +# By default, remove all netns before EXIT.
>> +cleanup_all_ns()
>> +{
>> +cleanup_ns $NS_LIST
>> +}
>> +trap cleanup_all_ns EXIT
>
> Hmm, OK, this is a showstopper for inclusion from forwarding/lib.sh,
> because basically all users of forwarding/lib.sh use the EXIT trap.
[...]
> So just ignore the bit about including from forwarding/lib.sh.

Actually I take this back. The cleanup should be invoked from where the
init was called. I don't think the library should be auto-invoking it,
the client scripts should. Whether through a trap or otherwise.

[PATCH] kselftest/arm64: Improve output for skipped TPIDR2 ABI test

2023-11-24 Thread Mark Brown

When TPIDR2 is not supported the tpidr2 ABI test prints the same message
for each skipped test:

  ok 1 skipped, TPIDR2 not supported

which isn't ideal for test automation software since it tracks kselftest
results based on the string used to describe the test. This is also not
standard KTAP output, the expected format is:

  ok 1 # SKIP default_value

Update the program to generate this, using the same set of test names that
we would run if the test actually executed.

Signed-off-by: Mark Brown 
---
 tools/testing/selftests/arm64/abi/tpidr2.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/arm64/abi/tpidr2.c b/tools/testing/selftests/arm64/abi/tpidr2.c
index 351a098b503a..02ee3a91b780 100644
--- a/tools/testing/selftests/arm64/abi/tpidr2.c
+++ b/tools/testing/selftests/arm64/abi/tpidr2.c
@@ -254,6 +254,12 @@ static int write_clone_read(void)
putnum(++tests_run); \
putstr(" " #name "\n");
 
+#define skip_test(name) \
+   tests_skipped++; \
+   putstr("ok ");   \
+   putnum(++tests_run); \
+   putstr(" # SKIP " #name "\n");
+
 int main(int argc, char **argv)
 {
int ret, i;
@@ -283,13 +289,11 @@ int main(int argc, char **argv)
} else {
putstr("# SME support not present\n");
 
-   for (i = 0; i < EXPECTED_TESTS; i++) {
-   putstr("ok ");
-   putnum(i);
-   putstr(" skipped, TPIDR2 not supported\n");
-   }
-
-   tests_skipped += EXPECTED_TESTS;
+   skip_test(default_value);
+   skip_test(write_read);
+   skip_test(write_sleep_read);
+   skip_test(write_fork_read);
+   skip_test(write_clone_read);
}
 
print_summary();

---
base-commit: 98b1cc82c4affc16f5598d4fa14b1858671b2263
change-id: 20231124-kselftest-arm64-tpidr2-skip-43764f4ff4f4

Best regards,
-- 
Mark Brown
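
For readers who want to see the output shape without building the C test, the skip format described in the commit message can be sketched in shell. This is only an illustration: the shell function skip_test below is a hypothetical analogue of the C macro in the patch, and the test names are the ones the patch passes to it.

```shell
tests_run=0

# Emit one KTAP skip line. The name after "# SKIP" is the same on every
# run, so automation can match results by test name.
skip_test()
{
	tests_run=$((tests_run + 1))
	echo "ok $tests_run # SKIP $1"
}

for name in default_value write_read write_sleep_read \
	    write_fork_read write_clone_read; do
	skip_test "$name"
done
```

Each skipped test gets its own numbered line, which is what lets a results database line up a skipped run with a run where the test actually executed.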

Re: [PATCH] kselftest/clone3: Make test names for set_tid test stable

2023-11-24 Thread Christian Brauner

On Wed, Nov 15, 2023 at 02:43:02PM +0000, Mark Brown wrote:
> The test results reported for the clone3_set_tid tests interact poorly with
> automation for running kselftest since the reported test names include TIDs
> dynamically allocated at runtime. A lot of automation for running kselftest
> will compare runs by looking at the test name to identify if the same test
> is being run so changing names make it look like the testsuite has been
> updated to include new tests. This makes the results display less clearly
> and breaks cases like bisection.
> 
> Address this by providing a brief description of the tests and logging that
> along with the stable parameters for the test currently logged. The TIDs
> are already logged separately in existing logging except for the final test
> which has a new log message added. We also tweak the formatting of the
> logging of expected/actual values for clarity.
> 
> There are still issues with the logging of skipped tests (many are simply
> not logged at all when skipped and all are logged with different names) but
> these are less disruptive since the skips are all based on not being run as
> root, a condition likely to be stable for a given test system.
> 
> Signed-off-by: Mark Brown 
> ---

Maybe I already acked this? Not sure.
Acked-by: Christian Brauner

Re: [PATCH net-next 01/38] selftests/net: add lib.sh

2023-11-24 Thread Petr Machata

Hangbin Liu  writes:

> +cleanup_ns()
> +{
> + local ns=""
> + local errexit=0
> +
> + # disable errexit temporary
> + if [[ $- =~ "e" ]]; then
> + errexit=1
> + set +e
> + fi
> +
> + for ns in "$@"; do
> + ip netns delete "${ns}" &> /dev/null
> + busywait 2 "ip netns list | grep -vq $1" &> /dev/null

The grep would get confused by substrings of other names.
This should be grep -vq "^$ns$".

> + if ip netns list | grep -q $1; then

Busywait returns != 0 when the wait condition is not reached within a
given time. So it should be possible to roll the duplicated if-grep into
the busywait line like so:

if ! busywait 2 "ip netns etc."; then
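
Roughly, as a self-contained sketch. busywait is copied from the patch, except that the command is run as "$@" rather than $@ (my change, so arguments keep their quoting). ns_is_gone and the NS_STATE file are hypothetical stand-ins for the real `ip netns list` check, so this runs without root; it assumes GNU date for %3N.

```shell
# Retry "$@" until it succeeds or the timeout (in ms) expires.
busywait()
{
	local timeout=$1; shift

	local start_time="$(date -u +%s%3N)"
	while true
	do
		local out
		out=$("$@")
		local ret=$?
		if ((!ret)); then
			echo -n "$out"
			return 0
		fi

		local current_time="$(date -u +%s%3N)"
		if ((current_time - start_time > timeout)); then
			echo -n "$out"
			return 1
		fi
	done
}

# Hypothetical stand-in for the kernel's namespace list.
NS_STATE=$(mktemp)
printf '%s\n' ns1 ns10 > "$NS_STATE"

# Succeeds once the exact name is no longer listed.
ns_is_gone() { ! grep -q "^$1$" "$NS_STATE"; }

# ns1 is still listed, so the wait times out and the error path runs.
if ! busywait 100 ns_is_gone ns1; then first=failed; else first=removed; fi

# Simulate the deletion completing, then wait again.
printf '%s\n' ns10 > "$NS_STATE"
if ! busywait 100 ns_is_gone ns1; then second=failed; else second=removed; fi

rm -f "$NS_STATE"
echo "first=$first second=$second"
```

Wrapping the condition in a function also sidesteps passing a pipeline as a single quoted string, which word splitting would not turn back into a pipeline.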

> + echo "Failed to remove namespace $1"
> + return $ksft_skip

This does not restore the errexit.

I think it might be clearest to have this function as a helper, say
__cleanup_ns, and then have a wrapper that does the errexit management:

cleanup_ns()
{
	local errexit=0
	local rc

	# disable errexit temporarily
	if [[ $- =~ "e" ]]; then
		errexit=1
		set +e
	fi

	__cleanup_ns "$@"
	rc=$?

	[ $errexit -eq 1 ] && set -e
	return $rc
}

If this comes up more often, we can have a helper like
with_disabled_errexit or whatever, that does this management and
dispatches to "$@", so cleanup_ns() would become:

cleanup_ns()
{
	with_disabled_errexit __cleanup_ns "$@"
}
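
A possible shape for such a helper, as a sketch only. with_disabled_errexit does not exist in the tree yet, and flaky_cleanup is a made-up function used to exercise it.

```shell
# Run "$@" with errexit turned off, then restore the previous setting
# and pass the command's exit status back to the caller.
with_disabled_errexit()
{
	local errexit=0
	local rc

	if [[ $- =~ "e" ]]; then
		errexit=1
		set +e
	fi

	"$@"
	rc=$?

	[ $errexit -eq 1 ] && set -e
	return $rc
}

# Hypothetical cleanup whose first step fails but which should carry on.
flaky_cleanup() { false; steps_done=1; return 0; }

set -e
steps_done=0
with_disabled_errexit flaky_cleanup
case $- in *e*) restored=yes ;; *) restored=no ;; esac
set +e

echo "steps_done=$steps_done restored=$restored"
```

Note that if the wrapped command returns non-zero while errexit is restored, the caller still has to handle that status (for example with `|| rc=$?`), otherwise the script exits at the call site.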

> + fi
> + done
> +
> + [ $errexit -eq 1 ] && set -e
> + return 0
> +}
> +
> +# By default, remove all netns before EXIT.
> +cleanup_all_ns()
> +{
> + cleanup_ns $NS_LIST
> +}
> +trap cleanup_all_ns EXIT

Hmm, OK, this is a showstopper for inclusion from forwarding/lib.sh,
because basically all users of forwarding/lib.sh use the EXIT trap.

I wonder if we need something like these push_cleanup / on_exit helpers:

https://github.com/pmachata/stuff/blob/master/ptp-test/lib.sh#L15

But I don't want to force this on your already large patchset :)
So just ignore the bit about including from forwarding/lib.sh.
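
For reference, the general idea can be sketched as below. This is not the code behind the link above; push_cleanup, run_cleanups and CLEANUP_STACK are names made up for this illustration. A single EXIT trap runs a stack of registered commands, so a library and its client scripts can each register cleanups without overwriting one another's trap.

```shell
# Stack of cleanup commands, each stored as one shell-quoted string.
declare -a CLEANUP_STACK=()

push_cleanup()
{
	CLEANUP_STACK+=("$(printf '%q ' "$@")")
}

# Run the registered commands in reverse order, then empty the stack.
run_cleanups()
{
	local i
	for ((i = ${#CLEANUP_STACK[@]} - 1; i >= 0; i--)); do
		eval "${CLEANUP_STACK[i]}"
	done
	CLEANUP_STACK=()
}

trap run_cleanups EXIT

# Example: the library registers first, the test script second.
log=""
note() { log="$log$1 "; }
push_cleanup note library
push_cleanup note test-script
run_cleanups
echo "$log"
```

Running in reverse order means the test script's resources are torn down before the library's, mirroring the order in which they were set up.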

> +# setup netns with given names as prefix. e.g
> +# setup_ns local remote
> +setup_ns()
> +{
> + local ns=""
> + # the ns list we created in this call
> + local ns_list=""
> + while [ -n "$1" ]; do

I would find it more readable if this used the same iteration approach
as the 'for ns in "$@"' above. The $1/shift approach used here is
somewhat confusing.
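
Something along these lines, reduced to the naming part so it runs without root. make_ns_names is a hypothetical name, and using printf -v in place of eval is my substitution, not what the patch does.

```shell
# For each NAME argument, set the global variable NAME to a unique
# "name-XXXXXX" string, iterating with for/in instead of $1/shift.
make_ns_names()
{
	local ns_name ns

	for ns_name in "$@"; do
		ns="${ns_name,,}-$(mktemp -u XXXXXX)"
		printf -v "$ns_name" '%s' "$ns"
	done
}

make_ns_names LOCAL REMOTE
echo "$LOCAL $REMOTE"
```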

> + # Some test may setup/remove same netns multi times
> + if unset $1 2> /dev/null; then
> + ns="${1,,}-$(mktemp -u XXXXXX)"
> + eval readonly $1=$ns
> + else
> + eval ns='$'$1
> + cleanup_ns $ns
> +
> + fi
> +
> + ip netns add $ns
> + if ! ip netns list | grep -q $ns; then

As above, the grep could get confused. But in fact wouldn't just
checking the exit code of ip netns add be enough?

> + echo "Failed to create namespace $1"
> + cleanup_ns $ns_list
> + return $ksft_skip
> + fi
> + ip -n $ns link set lo up
> + ns_list="$ns_list $ns"
> +
> + shift
> + done
> + NS_LIST="$NS_LIST $ns_list"
> +}

Re: [PATCH net-next 01/38] selftests/net: add lib.sh

2023-11-24 Thread Petr Machata



Hangbin Liu  writes:

> Add a lib.sh for net selftests. This file can be used to define commonly
> used variables and functions.
>
> Add function setup_ns() for user to create unique namespaces with given
> prefix name.
>
> Signed-off-by: Hangbin Liu 
> ---
>  tools/testing/selftests/net/Makefile |  2 +-
>  tools/testing/selftests/net/lib.sh   | 98 
>  2 files changed, 99 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/net/lib.sh
>
> diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
> index 9274edfb76ff..14bd68da7466 100644
> --- a/tools/testing/selftests/net/Makefile
> +++ b/tools/testing/selftests/net/Makefile
> @@ -54,7 +54,7 @@ TEST_PROGS += ip_local_port_range.sh
>  TEST_PROGS += rps_default_mask.sh
>  TEST_PROGS += big_tcp.sh
>  TEST_PROGS_EXTENDED := in_netns.sh setup_loopback.sh setup_veth.sh
> -TEST_PROGS_EXTENDED += toeplitz_client.sh toeplitz.sh
> +TEST_PROGS_EXTENDED += toeplitz_client.sh toeplitz.sh lib.sh
>  TEST_GEN_FILES =  socket nettest
>  TEST_GEN_FILES += psock_fanout psock_tpacket msg_zerocopy reuseport_addr_any
>  TEST_GEN_FILES += tcp_mmap tcp_inq psock_snd txring_overwrite
> diff --git a/tools/testing/selftests/net/lib.sh b/tools/testing/selftests/net/lib.sh
> new file mode 100644
> index 000000000000..239ab2beb438
> --- /dev/null
> +++ b/tools/testing/selftests/net/lib.sh
> @@ -0,0 +1,98 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +
> +##
> +# Defines
> +
> +# Kselftest framework requirement - SKIP code is 4.
> +ksft_skip=4
> +# namespace list created by setup_ns
> +NS_LIST=""
> +
> +##
> +# Helpers
> +busywait()
> +{
> + local timeout=$1; shift
> +
> + local start_time="$(date -u +%s%3N)"
> + while true
> + do
> + local out
> + out=$($@)
> + local ret=$?
> + if ((!ret)); then
> + echo -n "$out"
> + return 0
> + fi
> +
> + local current_time="$(date -u +%s%3N)"
> + if ((current_time - start_time > timeout)); then
> + echo -n "$out"
> + return 1
> + fi
> + done
> +}

This is lifted from forwarding/lib.sh, right? Would it make sense to
just source this new file from forwarding/lib.sh instead of copying
stuff around? I imagine there will eventually be more commonality, and
when that pops up, we can just shuffle the forwarding code to
net/lib.sh.
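
To make the idea concrete, here is a sketch of one file sourcing the other by a path relative to itself. The directory layout is recreated under a temporary directory and the file contents are placeholders; it assumes GNU readlink for -e.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/net/forwarding"

# Placeholder for the shared library.
cat > "$tmp/net/lib.sh" <<'EOF'
ksft_skip=4
EOF

# Placeholder for forwarding/lib.sh: locate net/lib.sh relative to this
# file rather than the caller's working directory.
cat > "$tmp/net/forwarding/lib.sh" <<'EOF'
lib_dir=$(dirname "$(readlink -e "${BASH_SOURCE[0]}")")
source "$lib_dir/../lib.sh"
EOF

source "$tmp/net/forwarding/lib.sh"
echo "ksft_skip=$ksft_skip"
rm -rf "$tmp"
```

Resolving the path from BASH_SOURCE keeps this working when a test is started from another directory.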

Re: [PATCH v7 1/3] iommufd: Add data structure for Intel VT-d stage-1 cache invalidation

2023-11-24 Thread Jason Gunthorpe

On Fri, Nov 24, 2023 at 03:00:45AM +0000, Tian, Kevin wrote:

> > I'm fully expecting that Intel will adopt a direct-DMA flush queue
> > like SMMU and AMD have already done as a performance optimization. In
> > this world it makes no sense that the behavior of the direct DMA queue
> > and driver mediated queue would be different.
> > 
> 
> that's an orthogonal topic. I don't think the value of direct-DMA flush
> queue should prevent possible optimization in the mediation path
> (as long as guest-expected deterministic behavior is sustained).

Okay, well as long as the guest is generating the ATC invalidations we
can always make an iommufd API flag to include or exclude the ATC
invalidation when doing the ASID invalidation. So we aren't trapped.

Jason

Re: [RFC PATCH 0/5] RISC-V: Add dynamic TSO support

2023-11-24 Thread Michael Ellerman

Peter Zijlstra  writes:
> On Fri, Nov 24, 2023 at 12:04:09PM +0100, Jonas Oberhauser wrote:
>
>> > I think ARM64 approached this problem by adding the
>> > load-acquire/store-release instructions and for TSO based code,
>> > translate into those (eg. x86 -> arm64 transpilers).
>> 
>> 
>> Although those instructions have a bit more ordering constraints.
>> 
>> I have heard rumors that the apple chips also have a register that can be
>> set at runtime.
>
> Oh, I thought they made do with the load-acquire/store-release thingies.
> But to be fair, I haven't been paying *that* much attention to the apple
> stuff.
>
> I did read about how they fudged some of the x86 flags thing.
>
>> And there are some IBM machines that have a setting, but not sure how it is
>> controlled.
>
> Cute, I'm assuming this is the Power series (s390 already being TSO)? I
> wasn't aware they had this.

Are you referring to Strong Access Ordering? That is a per-page
attribute, not a CPU mode, and was removed in ISA v3.1 anyway.

cheers

Re: [RFC PATCH 0/5] RISC-V: Add dynamic TSO support

2023-11-24 Thread Peter Zijlstra

On Fri, Nov 24, 2023 at 12:04:09PM +0100, Jonas Oberhauser wrote:

> > I think ARM64 approached this problem by adding the
> > load-acquire/store-release instructions and for TSO based code,
> > translate into those (eg. x86 -> arm64 transpilers).
> 
> 
> Although those instructions have a bit more ordering constraints.
> 
> I have heard rumors that the apple chips also have a register that can be
> set at runtime.

Oh, I thought they made do with the load-acquire/store-release thingies.
But to be fair, I haven't been paying *that* much attention to the apple
stuff.

I did read about how they fudged some of the x86 flags thing.

> And there are some IBM machines that have a setting, but not sure how it is
> controlled.

Cute, I'm assuming this is the Power series (s390 already being TSO)? I
wasn't aware they had this.

> > IIRC Risc-V actually has such instructions as well, so *why* are you
> > doing this?!?!
> 
> 
> Unfortunately, at least last time I checked RISC-V still hadn't gotten such
> instructions.
> What they have is the *semantics* of the instructions, but no actual opcodes
> to encode them.

Well, that sucks..

> I argued for them in the RISC-V memory group, but it was considered to be
> outside the scope of that group.
> 
> Transpiling with sufficient DMB ISH to get the desired ordering is really
> bad for performance.

Ha!, quite dreadful I would imagine.

> That is not to say that linux should support this. Perhaps linux should
> pressure RISC-V into supporting implicit barriers instead.

I'm not sure I count for much in this regard, but yeah, that sounds like
a plan :-)

Re: [RFC PATCH 0/5] RISC-V: Add dynamic TSO support

2023-11-24 Thread Peter Zijlstra

On Fri, Nov 24, 2023 at 11:53:06AM +0100, Christoph Müllner wrote:

> > I think ARM64 approached this problem by adding the
> > load-acquire/store-release instructions and for TSO based code,
> > translate into those (eg. x86 -> arm64 transpilers).
> >
> > IIRC Risc-V actually has such instructions as well, so *why* are you
> > doing this?!?!
> 
> Not needing a transpiler is already a benefit.

This doesn't make sense; native risc-v stuff knows about the weak stuff,
it's your native model. The only reason you would ever need this dynamic
TSO stuff, is if you're going to run code that's written for some other
platform (notably x86).

> And the DTSO approach also covers the cases where transpilers can't be used
> (e.g. binary-only executables or libraries).

Uhh.. have you looked at the x86-on-arm64 things? That's all binary to
binary magic.

> We are also working on extending ld.so such that it switches to DTSO
> (if available) in case the user wants to start an executable that was
> compiled for Ztso or loads a library that was compiled for Ztso.
> This would utilize the API that is introduced in this patchset.

I mean, sure, but *why* would you do this to your users? Who would want
to build a native risc-v tso binary?

Re: [RFC PATCH 0/5] RISC-V: Add dynamic TSO support

2023-11-24 Thread Christoph Müllner

On Fri, Nov 24, 2023 at 11:15 AM Peter Zijlstra  wrote:
>
> On Fri, Nov 24, 2023 at 08:21:37AM +0100, Christoph Muellner wrote:
> > From: Christoph Müllner 
> >
> > The upcoming RISC-V Ssdtso specification introduces a bit in the senvcfg
> > CSR to switch the memory consistency model at run-time from RVWMO to TSO
> > (and back). The active consistency model can therefore be switched on a
> > per-hart basis and managed by the kernel on a per-process/thread basis.
>
> You guys, computers are hartless, nobody told ya?

That's why they came up with RISC-V, the ISA with hart!

> > This patch implements basic Ssdtso support and adds a prctl API on top
> > so that user-space processes can switch to a stronger memory consistency
> > model (than the kernel was written for) at run-time.
> >
> > I am not sure if other architectures support switching the memory
> > consistency model at run-time, but designing the prctl API in an
> > arch-independent way allows reusing it in the future.
>
> IIRC some Sparc chips could do this, but I don't think anybody ever
> exposed this to userspace (or used it much).
>
> IA64 had planned to do this, except they messed it up and did it the
> wrong way around (strong first and then relax it later), which led to
> the discovery that all existing software broke (d'uh).
>
> I think ARM64 approached this problem by adding the
> load-acquire/store-release instructions and for TSO based code,
> translate into those (eg. x86 -> arm64 transpilers).
>
> IIRC Risc-V actually has such instructions as well, so *why* are you
> doing this?!?!

Not needing a transpiler is already a benefit.
And the DTSO approach also covers the cases where transpilers can't be used
(e.g. binary-only executables or libraries).

We are also working on extending ld.so such that it switches to DTSO
(if available) in case the user wants to start an executable that was
compiled for Ztso or loads a library that was compiled for Ztso.
This would utilize the API that is introduced in this patchset.

Re: [RFC PATCH 0/5] RISC-V: Add dynamic TSO support

2023-11-24 Thread Peter Zijlstra

On Fri, Nov 24, 2023 at 08:21:37AM +0100, Christoph Muellner wrote:
> From: Christoph Müllner 
> 
> The upcoming RISC-V Ssdtso specification introduces a bit in the senvcfg
> CSR to switch the memory consistency model at run-time from RVWMO to TSO
> (and back). The active consistency model can therefore be switched on a
> per-hart basis and managed by the kernel on a per-process/thread basis.

You guys, computers are hartless, nobody told ya?

> This patch implements basic Ssdtso support and adds a prctl API on top
> so that user-space processes can switch to a stronger memory consistency
> model (than the kernel was written for) at run-time.
> 
> I am not sure if other architectures support switching the memory
> consistency model at run-time, but designing the prctl API in an
> arch-independent way allows reusing it in the future.

IIRC some Sparc chips could do this, but I don't think anybody ever
exposed this to userspace (or used it much).

IA64 had planned to do this, except they messed it up and did it the
wrong way around (strong first and then relax it later), which led to
the discovery that all existing software broke (d'uh).

I think ARM64 approached this problem by adding the
load-acquire/store-release instructions and for TSO based code,
translate into those (eg. x86 -> arm64 transpilers).

IIRC Risc-V actually has such instructions as well, so *why* are you
doing this?!?!

Re: [PATCH 0/5] tools: selftests: riscv: Fix compiler warnings

2023-11-24 Thread Andrew Jones

On Thu, Nov 23, 2023 at 07:58:16PM +0100, Christoph Muellner wrote:
> From: Christoph Müllner 
> 
> When building the RISC-V selftests with a riscv32 compiler I ran into
> a couple of compiler warnings. While riscv32 support for these tests is
> questionable, the fixes are so trivial that it is probably best to simply
> apply them.
> 
> Note that the missing-include patch and some format string warnings
> are also relevant for riscv64.

I also posted [1] a couple of days ago for the format warnings, but since this
series also includes rv32 fixes, we can drop [1] in favor of it.

[1] https://lore.kernel.org/all/20231122171821.130854-2-ajo...@ventanamicro.com/

For the series,

Reviewed-by: Andrew Jones 

Thanks,
drew

[PATCH net-next 38/38] kselftest/runner.sh: add netns support

2023-11-24 Thread Hangbin Liu

Add a variable, RUN_IN_NETNS, for users who want to run each test in its own
network namespace, in parallel. This can save a lot of testing time.

Nit: the NUM in run_one is not used, rename it to test_num.

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/kselftest/runner.sh | 26 +++--
 tools/testing/selftests/run_kselftest.sh|  4 
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kselftest/runner.sh b/tools/testing/selftests/kselftest/runner.sh
index cd2fb43eea61..4306b716c115 100644
--- a/tools/testing/selftests/kselftest/runner.sh
+++ b/tools/testing/selftests/kselftest/runner.sh
@@ -6,6 +6,7 @@ export skip_rc=4
 export timeout_rc=124
 export logfile=/dev/stdout
 export per_test_logging=
+export RUN_IN_NETNS=
 
 # Defaults for "settings" file fields:
 # "timeout" how many seconds to let each test run before running
@@ -47,7 +48,7 @@ run_one()
 {
DIR="$1"
TEST="$2"
-   NUM="$3"
+   local test_num="$3"
 
BASENAME_TEST=$(basename $TEST)
 
@@ -141,6 +142,21 @@ run_one()
fi
 }
 
+run_in_netns()
+{
+   local netns=$(mktemp -u ${BASENAME_TEST}-XXXXXX)
+   local tmplog="/tmp/$(mktemp -u ${BASENAME_TEST}-XXXXXX)"
+   ip netns add $netns
+   if [ $? -ne 0 ]; then
+   echo "# Warning: Create namespace failed for $BASENAME_TEST"
+   echo "not ok $test_num selftests: $DIR: $BASENAME_TEST # Create NS failed"
+   fi
+   ip netns exec $netns bash -c "BASE_DIR=$BASE_DIR; source $BASE_DIR/kselftest/runner.sh; logfile=$logfile; run_one $DIR $TEST $test_num" &> $tmplog
+   ip netns del $netns &> /dev/null
+   cat $tmplog
+   rm -f $tmplog
+}
+
 run_many()
 {
echo "TAP version 13"
@@ -155,6 +171,12 @@ run_many()
logfile="/tmp/$BASENAME_TEST"
cat /dev/null > "$logfile"
fi
-   run_one "$DIR" "$TEST" "$test_num"
+   if [ -n "$RUN_IN_NETNS" ]; then
+   run_in_netns &
+   else
+   run_one "$DIR" "$TEST" "$test_num"
+   fi
done
+
+   wait
 }
diff --git a/tools/testing/selftests/run_kselftest.sh b/tools/testing/selftests/run_kselftest.sh
index 92743980e553..637aaa9e474a 100755
--- a/tools/testing/selftests/run_kselftest.sh
+++ b/tools/testing/selftests/run_kselftest.sh
@@ -25,6 +25,7 @@ Usage: $0 [OPTIONS]
   -c | --collection COLLECTION Run all tests from COLLECTION
   -l | --list  List the available collection:test entries
   -d | --dry-run   Don't actually run any tests
+  -n | --netns Run each test in namespace
   -h | --help  Show this usage info
   -o | --override-timeout  Number of seconds after which we timeout
 EOF
@@ -53,6 +54,9 @@ while true; do
-d | --dry-run)
dryrun="echo"
shift ;;
+   -n | --netns)
+   RUN_IN_NETNS=1
+   shift ;;
-o | --override-timeout)
kselftest_override_timeout="$2"
shift 2 ;;
-- 
2.41.0
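
The buffering idea in run_in_netns above (each background job writes to its
own temporary log, which is printed in one piece once the job has finished)
can be sketched without any namespaces. This is only an illustration under
stated assumptions, not the patch's code: run_buffered, run_all and fake_test
are hypothetical names, and fake_test merely stands in for a real selftest.

```shell
#!/bin/bash
# Hypothetical stand-in for a selftest: prints two lines with a short pause.
fake_test()
{
	local name="$1"
	echo "# start $name"
	sleep 0.1
	echo "ok $name"
}

# Run a command with all of its output captured in a private temp file,
# then emit the file in one go, so that output from jobs running in
# parallel is less likely to be interleaved line by line.
run_buffered()
{
	local tmplog
	tmplog=$(mktemp /tmp/buffered-XXXXXX) || return 1
	"$@" &> "$tmplog"
	cat "$tmplog"
	rm -f "$tmplog"
}

# Start every job in the background, then block until all of them are done,
# in the same spirit as the "wait" added at the end of run_many.
run_all()
{
	local t
	for t in "$@"; do
		run_buffered fake_test "$t" &
	done
	wait
}
```

Calling "run_all a b c" prints the two lines of each job together, although
the order of the jobs themselves is not deterministic.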

[PATCH net-next 37/38] selftests/net: convert xfrm_policy.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./xfrm_policy.sh
PASS: policy before exception matches
PASS: ping to .254 bypassed ipsec tunnel (exceptions)
PASS: direct policy matches (exceptions)
PASS: policy matches (exceptions)
PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies)
PASS: direct policy matches (exceptions and block policies)
PASS: policy matches (exceptions and block policies)
PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies after hresh changes)
PASS: direct policy matches (exceptions and block policies after hresh changes)
PASS: policy matches (exceptions and block policies after hresh changes)
PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies after hthresh change in ns3)
PASS: direct policy matches (exceptions and block policies after hthresh change in ns3)
PASS: policy matches (exceptions and block policies after hthresh change in ns3)
PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies after htresh change to normal)
PASS: direct policy matches (exceptions and block policies after htresh change to normal)
PASS: policy matches (exceptions and block policies after htresh change to normal)
PASS: policies with repeated htresh change

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/xfrm_policy.sh | 138 ++---
 1 file changed, 69 insertions(+), 69 deletions(-)

diff --git a/tools/testing/selftests/net/xfrm_policy.sh b/tools/testing/selftests/net/xfrm_policy.sh
index bdf450eaf60c..457789530645 100755
--- a/tools/testing/selftests/net/xfrm_policy.sh
+++ b/tools/testing/selftests/net/xfrm_policy.sh
@@ -18,8 +18,7 @@
 # ns1: ping 10.0.2.254: does NOT pass via ipsec tunnel (exception)
 # ns2: ping 10.0.1.254: does NOT pass via ipsec tunnel (exception)
 
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
+source lib.sh
 ret=0
 policy_checks_ok=1
 
@@ -204,24 +203,24 @@ check_xfrm() {
ip=$2
local lret=0
 
-   ip netns exec ns1 ping -q -c 1 10.0.2.$ip > /dev/null
+   ip netns exec ${ns[1]} ping -q -c 1 10.0.2.$ip > /dev/null
 
-   check_ipt_policy_count ns3
+   check_ipt_policy_count ${ns[3]}
if [ $? -ne $rval ] ; then
lret=1
fi
-   check_ipt_policy_count ns4
+   check_ipt_policy_count ${ns[4]}
if [ $? -ne $rval ] ; then
lret=1
fi
 
-   ip netns exec ns2 ping -q -c 1 10.0.1.$ip > /dev/null
+   ip netns exec ${ns[2]} ping -q -c 1 10.0.1.$ip > /dev/null
 
-   check_ipt_policy_count ns3
+   check_ipt_policy_count ${ns[3]}
if [ $? -ne $rval ] ; then
lret=1
fi
-   check_ipt_policy_count ns4
+   check_ipt_policy_count ${ns[4]}
if [ $? -ne $rval ] ; then
lret=1
fi
@@ -270,11 +269,11 @@ check_hthresh_repeat()
i=0
 
for i in $(seq 1 10);do
-   ip -net ns1 xfrm policy update src e000:0001:: dst ff01::0014::0001 dir in tmpl src :: dst :: proto esp mode tunnel priority 100 action allow || break
-   ip -net ns1 xfrm policy set hthresh6 0 28 || break
+   ip -net ${ns[1]} xfrm policy update src e000:0001:: dst ff01::0014::0001 dir in tmpl src :: dst :: proto esp mode tunnel priority 100 action allow || break
+   ip -net ${ns[1]} xfrm policy set hthresh6 0 28 || break
 
-   ip -net ns1 xfrm policy update src e000:0001:: dst ff01::01 dir in tmpl src :: dst :: proto esp mode tunnel priority 100 action allow || break
-   ip -net ns1 xfrm policy set hthresh6 0 28 || break
+   ip -net ${ns[1]} xfrm policy update src e000:0001:: dst ff01::01 dir in tmpl src :: dst :: proto esp mode tunnel priority 100 action allow || break
+   ip -net ${ns[1]} xfrm policy set hthresh6 0 28 || break
done
 
if [ $i -ne 10 ] ;then
@@ -347,79 +346,80 @@ if [ $? -ne 0 ];then
exit $ksft_skip
 fi
 
-for i in 1 2 3 4; do
-ip netns add ns$i
-ip -net ns$i link set lo up
-done
+setup_ns ns1 ns2 ns3 ns4
+ns[1]=$ns1
+ns[2]=$ns2
+ns[3]=$ns3
+ns[4]=$ns4
 
 DEV=veth0
-ip link add $DEV netns ns1 type veth peer name eth1 netns ns3
-ip link add $DEV netns ns2 type veth peer name eth1 netns ns4
+ip link add $DEV netns ${ns[1]} type veth peer name eth1 netns ${ns[3]}
+ip link add $DEV netns ${ns[2]} type veth peer name eth1 netns ${ns[4]}
 
-ip link add $DEV netns ns3 type veth peer name veth0 netns ns4
+ip link add $DEV netns ${ns[3]} type veth peer name veth0 netns ${ns[4]}
 
 DEV=veth0
 for i in 1 2; do
-ip -net ns$i link set $DEV up
-ip -net ns$i addr add 10.0.$i.2/24 dev $DEV
-ip -net ns$i addr add dead:$i::2/64 dev $DEV
-
-ip -net ns$i addr add 10.0.$i.253 dev $DEV
-ip -net ns$i addr add 10.0.$i.254 dev $DEV
-ip -net ns$i addr add dead:$i::fd dev $DEV
-ip -net ns$i addr add dead:$i::fe dev $DEV
+ip -net ${ns[$i]} link set $DEV

[PATCH net-next 36/38] selftests/net: convert traceroute.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./traceroute.sh
TEST: IPV6 traceroute   [ OK ]
TEST: IPV4 traceroute   [ OK ]

Tests passed:   2
Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/traceroute.sh | 82 ++-
 1 file changed, 36 insertions(+), 46 deletions(-)

diff --git a/tools/testing/selftests/net/traceroute.sh b/tools/testing/selftests/net/traceroute.sh
index de9ca97abc30..282f14760940 100755
--- a/tools/testing/selftests/net/traceroute.sh
+++ b/tools/testing/selftests/net/traceroute.sh
@@ -4,6 +4,7 @@
 # Run traceroute/traceroute6 tests
 #
 
+source lib.sh
 VERBOSE=0
 PAUSE_ON_FAIL=no
 
@@ -69,9 +70,6 @@ create_ns()
[ -z "${addr}" ] && addr="-"
[ -z "${addr6}" ] && addr6="-"
 
-   ip netns add ${ns}
-
-   ip netns exec ${ns} ip link set lo up
if [ "${addr}" != "-" ]; then
ip netns exec ${ns} ip addr add dev lo ${addr}
fi
@@ -160,12 +158,7 @@ connect_ns()
 
 cleanup_traceroute6()
 {
-   local ns
-
-   for ns in host-1 host-2 router-1 router-2
-   do
-   ip netns del ${ns} 2>/dev/null
-   done
+   cleanup_ns $h1 $h2 $r1 $r2
 }
 
 setup_traceroute6()
@@ -176,33 +169,34 @@ setup_traceroute6()
cleanup_traceroute6
 
set -e
-   create_ns host-1
-   create_ns host-2
-   create_ns router-1
-   create_ns router-2
+   setup_ns h1 h2 r1 r2
+   create_ns $h1
+   create_ns $h2
+   create_ns $r1
+   create_ns $r2
 
# Setup N3
-   connect_ns router-2 eth3 - 2000:103::2/64 host-2 eth3 - 2000:103::4/64
-   ip netns exec host-2 ip route add default via 2000:103::2
+   connect_ns $r2 eth3 - 2000:103::2/64 $h2 eth3 - 2000:103::4/64
+   ip netns exec $h2 ip route add default via 2000:103::2
 
# Setup N2
-   connect_ns router-1 eth2 - 2000:102::1/64 router-2 eth2 - 2000:102::2/64
-   ip netns exec router-1 ip route add default via 2000:102::2
+   connect_ns $r1 eth2 - 2000:102::1/64 $r2 eth2 - 2000:102::2/64
+   ip netns exec $r1 ip route add default via 2000:102::2
 
# Setup N1. host-1 and router-2 connect to a bridge in router-1.
-   ip netns exec router-1 ip link add name ${brdev} type bridge
-   ip netns exec router-1 ip link set ${brdev} up
-   ip netns exec router-1 ip addr add 2000:101::1/64 dev ${brdev}
+   ip netns exec $r1 ip link add name ${brdev} type bridge
+   ip netns exec $r1 ip link set ${brdev} up
+   ip netns exec $r1 ip addr add 2000:101::1/64 dev ${brdev}
 
-   connect_ns host-1 eth0 - 2000:101::3/64 router-1 eth0 - -
-   ip netns exec router-1 ip link set dev eth0 master ${brdev}
-   ip netns exec host-1 ip route add default via 2000:101::1
+   connect_ns $h1 eth0 - 2000:101::3/64 $r1 eth0 - -
+   ip netns exec $r1 ip link set dev eth0 master ${brdev}
+   ip netns exec $h1 ip route add default via 2000:101::1
 
-   connect_ns router-2 eth1 - 2000:101::2/64 router-1 eth1 - -
-   ip netns exec router-1 ip link set dev eth1 master ${brdev}
+   connect_ns $r2 eth1 - 2000:101::2/64 $r1 eth1 - -
+   ip netns exec $r1 ip link set dev eth1 master ${brdev}
 
# Prime the network
-   ip netns exec host-1 ping6 -c5 2000:103::4 >/dev/null 2>&1
+   ip netns exec $h1 ping6 -c5 2000:103::4 >/dev/null 2>&1
 
set +e
 }
@@ -217,7 +211,7 @@ run_traceroute6()
setup_traceroute6
 
# traceroute6 host-2 from host-1 (expects 2000:102::2)
-   run_cmd host-1 "traceroute6 2000:103::4 | grep -q 2000:102::2"
+   run_cmd $h1 "traceroute6 2000:103::4 | grep -q 2000:102::2"
log_test $? 0 "IPV6 traceroute"
 
cleanup_traceroute6
@@ -240,12 +234,7 @@ run_traceroute6()
 
 cleanup_traceroute()
 {
-   local ns
-
-   for ns in host-1 host-2 router
-   do
-   ip netns del ${ns} 2>/dev/null
-   done
+   cleanup_ns $h1 $h2 $router
 }
 
 setup_traceroute()
@@ -254,24 +243,25 @@ setup_traceroute()
cleanup_traceroute
 
set -e
-   create_ns host-1
-   create_ns host-2
-   create_ns router
+   setup_ns h1 h2 router
+   create_ns $h1
+   create_ns $h2
+   create_ns $router
 
-   connect_ns host-1 eth0 1.0.1.3/24 - \
-  router eth1 1.0.3.1/24 -
-   ip netns exec host-1 ip route add default via 1.0.1.1
+   connect_ns $h1 eth0 1.0.1.3/24 - \
+  $router eth1 1.0.3.1/24 -
+   ip netns exec $h1 ip route add default via 1.0.1.1
 
-   ip netns exec router ip addr add 1.0.1.1/24 dev eth1
-   ip netns exec router sysctl -qw \
+   ip netns exec $router ip addr add 1.0.1.1/24 dev eth1
+   ip netns exec $router sysctl -qw \
net.ipv4.icmp_errors_use_inbound_ifaddr=1
 
-   connect_ns host-2 eth0 1.0.2.4/24 - \
-

[PATCH net-next 35/38] selftests/net: convert vrf-xfrm-tests.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./vrf-xfrm-tests.sh

No qdisc on VRF device
TEST: IPv4 no xfrm policy   [ OK ]
TEST: IPv6 no xfrm policy   [ OK ]
TEST: IPv4 xfrm policy based on address [ OK ]
TEST: IPv6 xfrm policy based on address [ OK ]
TEST: IPv6 xfrm policy with VRF in selector [ OK ]
TEST: IPv4 xfrm policy with xfrm device [ OK ]
TEST: IPv6 xfrm policy with xfrm device [ OK ]

netem qdisc on VRF device
TEST: IPv4 no xfrm policy   [ OK ]
TEST: IPv6 no xfrm policy   [ OK ]
TEST: IPv4 xfrm policy based on address [ OK ]
TEST: IPv6 xfrm policy based on address [ OK ]
TEST: IPv6 xfrm policy with VRF in selector [ OK ]
TEST: IPv4 xfrm policy with xfrm device [ OK ]
TEST: IPv6 xfrm policy with xfrm device [ OK ]

Tests passed:  14
Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/vrf-xfrm-tests.sh | 77 +--
 1 file changed, 36 insertions(+), 41 deletions(-)

diff --git a/tools/testing/selftests/net/vrf-xfrm-tests.sh b/tools/testing/selftests/net/vrf-xfrm-tests.sh
index 452638ae8aed..b64dd891699d 100755
--- a/tools/testing/selftests/net/vrf-xfrm-tests.sh
+++ b/tools/testing/selftests/net/vrf-xfrm-tests.sh
@@ -3,9 +3,7 @@
 #
 # Various combinations of VRF with xfrms and qdisc.
 
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
-
+source lib.sh
 PAUSE_ON_FAIL=no
 VERBOSE=0
 ret=0
@@ -67,7 +65,7 @@ run_cmd_host1()
printf "COMMAND: $cmd\n"
fi
 
-   out=$(eval ip netns exec host1 $cmd 2>&1)
+   out=$(eval ip netns exec $host1 $cmd 2>&1)
rc=$?
if [ "$VERBOSE" = "1" ]; then
if [ -n "$out" ]; then
@@ -116,9 +114,6 @@ create_ns()
[ -z "${addr}" ] && addr="-"
[ -z "${addr6}" ] && addr6="-"
 
-   ip netns add ${ns}
-
-   ip -netns ${ns} link set lo up
if [ "${addr}" != "-" ]; then
ip -netns ${ns} addr add dev lo ${addr}
fi
@@ -177,25 +172,25 @@ connect_ns()
 
 cleanup()
 {
-   ip netns del host1
-   ip netns del host2
+   cleanup_ns $host1 $host2
 }
 
 setup()
 {
-   create_ns "host1"
-   create_ns "host2"
+   setup_ns host1 host2
+   create_ns "$host1"
+   create_ns "$host2"
 
-   connect_ns "host1" eth0 ${HOST1_4}/24 ${HOST1_6}/64 \
-  "host2" eth0 ${HOST2_4}/24 ${HOST2_6}/64
+   connect_ns "$host1" eth0 ${HOST1_4}/24 ${HOST1_6}/64 \
+  "$host2" eth0 ${HOST2_4}/24 ${HOST2_6}/64
 
-   create_vrf "host1" ${VRF} ${TABLE}
-   ip -netns host1 link set dev eth0 master ${VRF}
+   create_vrf "$host1" ${VRF} ${TABLE}
+   ip -netns $host1 link set dev eth0 master ${VRF}
 }
 
 cleanup_xfrm()
 {
-   for ns in host1 host2
+   for ns in $host1 $host2
do
for x in state policy
do
@@ -218,57 +213,57 @@ setup_xfrm()
#
 
# host1 - IPv4 out
-   ip -netns host1 xfrm policy add \
+   ip -netns $host1 xfrm policy add \
  src ${h1_4} dst ${h2_4} ${devarg} dir out \
  tmpl src ${HOST1_4} dst ${HOST2_4} proto esp mode tunnel
 
# host2 - IPv4 in
-   ip -netns host2 xfrm policy add \
+   ip -netns $host2 xfrm policy add \
  src ${h1_4} dst ${h2_4} dir in \
  tmpl src ${HOST1_4} dst ${HOST2_4} proto esp mode tunnel
 
# host1 - IPv4 in
-   ip -netns host1 xfrm policy add \
+   ip -netns $host1 xfrm policy add \
  src ${h2_4} dst ${h1_4} ${devarg} dir in \
  tmpl src ${HOST2_4} dst ${HOST1_4} proto esp mode tunnel
 
# host2 - IPv4 out
-   ip -netns host2 xfrm policy add \
+   ip -netns $host2 xfrm policy add \
  src ${h2_4} dst ${h1_4} dir out \
  tmpl src ${HOST2_4} dst ${HOST1_4} proto esp mode tunnel
 
 
# host1 - IPv6 out
-   ip -6 -netns host1 xfrm policy add \
+   ip -6 -netns $host1 xfrm policy add \
  src ${h1_6} dst ${h2_6} ${devarg} dir out \
  tmpl src ${HOST1_6} dst ${HOST2_6} proto esp mode tunnel
 
# host2 - IPv6 in
-   ip -6 -netns host2 xfrm policy add \
+   ip -6 -netns $host2 xfrm policy add \
  src ${h1_6} dst ${h2_6} dir in \
  tmpl src ${HOST1_6} dst ${HOST2_6} proto esp mode tunnel
 
# host1 - IPv6 in
-   ip -6 -netns host1 xfrm policy add \
+   ip -6 -netns $host1 xfrm policy add \
  src ${h2_6} dst ${h1_6} ${devarg} dir in \
  tmpl src ${HOST2_6} dst ${HOST1_6} proto esp mode tunnel
 
# host2 - IPv6 out
-   ip -6 -netns host2

[PATCH net-next 34/38] selftests/net: convert vrf_strict_mode_test.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

 ]# ./vrf_strict_mode_test.sh

 

 TEST SECTION: VRF strict_mode test on init network namespace
 


 TEST: init: net.vrf.strict_mode is available[ OK ]

 TEST: init: strict_mode=0 by default, 0 vrfs[ OK ]

 ...

 TEST: init: check strict_mode=1 [ OK ]

 TEST: testns-HvoZkB: check strict_mode=0[ OK ]

 Tests passed:  37
 Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 .../selftests/net/vrf_strict_mode_test.sh | 47 +--
 1 file changed, 22 insertions(+), 25 deletions(-)

diff --git a/tools/testing/selftests/net/vrf_strict_mode_test.sh b/tools/testing/selftests/net/vrf_strict_mode_test.sh
index 417d214264f3..01552b542544 100755
--- a/tools/testing/selftests/net/vrf_strict_mode_test.sh
+++ b/tools/testing/selftests/net/vrf_strict_mode_test.sh
@@ -3,9 +3,7 @@
 
 # This test is designed for testing the new VRF strict_mode functionality.
 
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
-
+source lib.sh
 ret=0
 
 # identifies the "init" network namespace which is often called root network
@@ -247,13 +245,12 @@ setup()
 {
modprobe vrf
 
-   ip netns add testns
-   ip netns exec testns ip link set lo up
+   setup_ns testns
 }
 
 cleanup()
 {
-   ip netns del testns 2>/dev/null
+   ip netns del $testns 2>/dev/null
 
ip link del vrf100 2>/dev/null
ip link del vrf101 2>/dev/null
@@ -298,28 +295,28 @@ vrf_strict_mode_tests_testns()
 {
log_section "VRF strict_mode test on testns network namespace"
 
-   vrf_strict_mode_check_support testns
+   vrf_strict_mode_check_support $testns
 
-   strict_mode_check_default testns
+   strict_mode_check_default $testns
 
-   enable_strict_mode_and_check testns
+   enable_strict_mode_and_check $testns
 
-   add_vrf_and_check testns vrf100 100
-   config_vrf_and_check testns 10.0.100.1/24 vrf100
+   add_vrf_and_check $testns vrf100 100
+   config_vrf_and_check $testns 10.0.100.1/24 vrf100
 
-   add_vrf_and_check_fail testns vrf101 100
+   add_vrf_and_check_fail $testns vrf101 100
 
-   add_vrf_and_check_fail testns vrf102 100
+   add_vrf_and_check_fail $testns vrf102 100
 
-   add_vrf_and_check testns vrf200 200
+   add_vrf_and_check $testns vrf200 200
 
-   disable_strict_mode_and_check testns
+   disable_strict_mode_and_check $testns
 
-   add_vrf_and_check testns vrf101 100
+   add_vrf_and_check $testns vrf101 100
 
-   add_vrf_and_check testns vrf102 100
+   add_vrf_and_check $testns vrf102 100
 
-   #the strict_mode is disabled in the testns
+   #the strict_mode is disabled in the $testns
 }
 
 vrf_strict_mode_tests_mix()
@@ -328,25 +325,25 @@ vrf_strict_mode_tests_mix()
 
read_strict_mode_compare_and_check init 1
 
-   read_strict_mode_compare_and_check testns 0
+   read_strict_mode_compare_and_check $testns 0
 
-   del_vrf_and_check testns vrf101
+   del_vrf_and_check $testns vrf101
 
-   del_vrf_and_check testns vrf102
+   del_vrf_and_check $testns vrf102
 
disable_strict_mode_and_check init
 
-   enable_strict_mode_and_check testns
+   enable_strict_mode_and_check $testns
 
enable_strict_mode_and_check init
enable_strict_mode_and_check init
 
-   disable_strict_mode_and_check testns
-   disable_strict_mode_and_check testns
+   disable_strict_mode_and_check $testns
+   disable_strict_mode_and_check $testns
 
read_strict_mode_compare_and_check init 1
 
-   read_strict_mode_compare_and_check testns 0
+   read_strict_mode_compare_and_check $testns 0
 }
 
 

-- 
2.41.0
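
The conversions in this series call a helper with a variable name (for
example "setup_ns testns") and then use "$testns", which holds a generated
name such as testns-HvoZkB in the output above. The naming half of that idea
can be sketched as follows. This is a rough sketch, not the lib.sh
implementation: assign_unique_names is a hypothetical function, and it
deliberately skips creating the namespace so that it runs unprivileged.

```shell
#!/bin/bash
# Hypothetical helper: for each variable name given, store "<name>-<random>"
# in the variable of that name. A real helper would also create the
# namespace (for example with "ip netns add"); that step is left out here.
assign_unique_names()
{
	local var suffix
	for var in "$@"; do
		# "mktemp -u" only generates a name; it does not create a file.
		suffix=$(mktemp -u XXXXXX) || return 1
		# "printf -v" assigns to the variable whose name is in $var.
		printf -v "$var" '%s-%s' "$var" "$suffix"
	done
}

assign_unique_names testns h1
echo "testns=$testns h1=$h1"
```

Because callers only ever refer to "$testns", the rest of a test script does
not need to know what the generated suffix is.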

[PATCH net-next 33/38] selftests/net: convert vrf_route_leaking.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

 ]# ./vrf_route_leaking.sh

 ###
 IPv4 (sym route): VRF ICMP ttl error route lookup ping
 ###

 TEST: Basic IPv4 connectivity   [ OK ]
 TEST: Ping received ICMP ttl exceeded   [ OK ]

 ...

 TEST: Basic IPv6 connectivity   [ OK ]
 TEST: Traceroute6 reports a hop on r1   [ OK ]

 Tests passed:  18
 Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 .../selftests/net/vrf_route_leaking.sh| 201 +-
 1 file changed, 96 insertions(+), 105 deletions(-)

diff --git a/tools/testing/selftests/net/vrf_route_leaking.sh b/tools/testing/selftests/net/vrf_route_leaking.sh
index dedc52562b4f..2da32f4c479b 100755
--- a/tools/testing/selftests/net/vrf_route_leaking.sh
+++ b/tools/testing/selftests/net/vrf_route_leaking.sh
@@ -58,6 +58,7 @@
 # to send an ICMP error back to the source when the ttl of a packet reaches 1
 # while it is forwarded between different vrfs.
 
+source lib.sh
 VERBOSE=0
 PAUSE_ON_FAIL=no
 DEFAULT_TTYPE=sym
@@ -171,11 +172,7 @@ run_cmd_grep()
 
 cleanup()
 {
-   local ns
-
-   for ns in h1 h2 r1 r2; do
-   ip netns del $ns 2>/dev/null
-   done
+   cleanup_ns $h1 $h2 $r1 $r2
 }
 
 setup_vrf()
@@ -212,72 +209,69 @@ setup_sym()
 
#
# create nodes as namespaces
-   #
-   for ns in h1 h2 r1; do
-   ip netns add $ns
-   ip -netns $ns link set lo up
-
-   case "${ns}" in
-   h[12]) ip netns exec $ns sysctl -q -w net.ipv6.conf.all.forwarding=0
-  ip netns exec $ns sysctl -q -w net.ipv6.conf.all.keep_addr_on_down=1
-   ;;
-   r1) ip netns exec $ns sysctl -q -w net.ipv4.ip_forward=1
-  ip netns exec $ns sysctl -q -w net.ipv6.conf.all.forwarding=1
-   esac
+   setup_ns h1 h2 r1
+   for ns in $h1 $h2 $r1; do
+   if echo $ns | grep -q h[12]-; then
+   ip netns exec $ns sysctl -q -w net.ipv6.conf.all.forwarding=0
+   ip netns exec $ns sysctl -q -w net.ipv6.conf.all.keep_addr_on_down=1
+   else
+   ip netns exec $ns sysctl -q -w net.ipv4.ip_forward=1
+   ip netns exec $ns sysctl -q -w net.ipv6.conf.all.forwarding=1
+   fi
done
 
#
# create interconnects
#
-   ip -netns h1 link add eth0 type veth peer name r1h1
-   ip -netns h1 link set r1h1 netns r1 name eth0 up
+   ip -netns $h1 link add eth0 type veth peer name r1h1
+   ip -netns $h1 link set r1h1 netns $r1 name eth0 up
 
-   ip -netns h2 link add eth0 type veth peer name r1h2
-   ip -netns h2 link set r1h2 netns r1 name eth1 up
+   ip -netns $h2 link add eth0 type veth peer name r1h2
+   ip -netns $h2 link set r1h2 netns $r1 name eth1 up
 
#
# h1
#
-   ip -netns h1 addr add dev eth0 ${H1_N1_IP}/24
-   ip -netns h1 -6 addr add dev eth0 ${H1_N1_IP6}/64 nodad
-   ip -netns h1 link set eth0 up
+   ip -netns $h1 addr add dev eth0 ${H1_N1_IP}/24
+   ip -netns $h1 -6 addr add dev eth0 ${H1_N1_IP6}/64 nodad
+   ip -netns $h1 link set eth0 up
 
# h1 to h2 via r1
-   ip -netns h1 route add ${H2_N2} via ${R1_N1_IP} dev eth0
-   ip -netns h1 -6 route add ${H2_N2_6} via "${R1_N1_IP6}" dev eth0
+   ip -netns $h1 route add ${H2_N2} via ${R1_N1_IP} dev eth0
+   ip -netns $h1 -6 route add ${H2_N2_6} via "${R1_N1_IP6}" dev eth0
 
#
# h2
#
-   ip -netns h2 addr add dev eth0 ${H2_N2_IP}/24
-   ip -netns h2 -6 addr add dev eth0 ${H2_N2_IP6}/64 nodad
-   ip -netns h2 link set eth0 up
+   ip -netns $h2 addr add dev eth0 ${H2_N2_IP}/24
+   ip -netns $h2 -6 addr add dev eth0 ${H2_N2_IP6}/64 nodad
+   ip -netns $h2 link set eth0 up
 
# h2 to h1 via r1
-   ip -netns h2 route add default via ${R1_N2_IP} dev eth0
-   ip -netns h2 -6 route add default via ${R1_N2_IP6} dev eth0
+   ip -netns $h2 route add default via ${R1_N2_IP} dev eth0
+   ip -netns $h2 -6 route add default via ${R1_N2_IP6} dev eth0
 
#
# r1
#
-   setup_vrf r1
-   create_vrf r1 blue 1101
-   create_vrf r1 red 1102
-   ip -netns r1 link set mtu 1400 dev eth1
-   ip -netns r1 link set eth0 vrf blue up
-   ip -netns r1 link set eth1 vrf red up
-   ip -netns r1 addr add dev eth0 ${R1_N1_IP}/24
-   ip -netns r1 -6 addr add dev eth0 ${R1_N1_IP6}/64 nodad
-   ip -netns r1 addr add dev eth1 ${R1_N2_IP}/24
-   ip -netns r1 -6 addr add dev eth1 ${R1_N2_IP6}/64 nodad
+   setup_vrf $r1
+   create_vrf $r1 blue
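
The host/router split in setup_sym above pipes the generated name through
grep with an unquoted h[12]- pattern. One possible alternative, assuming
generated names of the form <name>-<suffix> as seen in the test output
elsewhere in this series, is a case pattern. It needs no subprocess and the
pattern is not subject to pathname expansion. Note that it matches at the
start of the name only, whereas the grep matches anywhere. node_role is a
hypothetical name used only for this sketch.

```shell
#!/bin/bash
# Hypothetical classifier: names starting with "h1-" or "h2-" are hosts,
# everything else is treated as a router.
node_role()
{
	case "$1" in
	h[12]-*) echo host ;;
	*)       echo router ;;
	esac
}

for ns in h1-AbCdEf h2-123456 r1-zzzzzz; do
	echo "$ns: $(node_role "$ns")"
done
```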

[PATCH net-next 32/38] selftests/net: convert unicast_extensions.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

 # ./unicast_extensions.sh
 /usr/bin/which: no nettest in (/root/.local/bin:/root/bin:/usr/share/Modules/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin)
 ###
 Unicast address extensions tests (behavior of reserved IPv4 addresses)
 ###
 TEST: assign and ping within 240/4 (1 of 2) (is allowed)[ OK ]
 TEST: assign and ping within 240/4 (2 of 2) (is allowed)[ OK ]
 TEST: assign and ping within 0/8 (1 of 2) (is allowed)  [ OK ]

 ...

 TEST: assign and ping class D address (is forbidden)[ OK ]
 TEST: routing using class D (is forbidden)  [ OK ]
 TEST: routing using 127/8 (is forbidden)[ OK ]

Signed-off-by: Hangbin Liu 
---
 .../selftests/net/unicast_extensions.sh   | 99 +--
 1 file changed, 46 insertions(+), 53 deletions(-)

diff --git a/tools/testing/selftests/net/unicast_extensions.sh b/tools/testing/selftests/net/unicast_extensions.sh
index 2d10ccac898a..b7a2cb9e7477 100755
--- a/tools/testing/selftests/net/unicast_extensions.sh
+++ b/tools/testing/selftests/net/unicast_extensions.sh
@@ -28,8 +28,7 @@
 # These tests provide an easy way to flip the expected result of any
 # of these behaviors for testing kernel patches that change them.
 
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
+source ./lib.sh
 
 # nettest can be run from PATH or from same directory as this selftest
 if ! which nettest >/dev/null; then
@@ -61,20 +60,20 @@ _do_segmenttest(){
# foo --- bar
# Arguments: ip_a ip_b prefix_length test_description
#
-   # Caller must set up foo-ns and bar-ns namespaces
+   # Caller must set up $foo_ns and $bar_ns namespaces
# containing linked veth devices foo and bar,
# respectively.
 
-   ip -n foo-ns address add $1/$3 dev foo || return 1
-   ip -n foo-ns link set foo up || return 1
-   ip -n bar-ns address add $2/$3 dev bar || return 1
-   ip -n bar-ns link set bar up || return 1
+   ip -n $foo_ns address add $1/$3 dev foo || return 1
+   ip -n $foo_ns link set foo up || return 1
+   ip -n $bar_ns address add $2/$3 dev bar || return 1
+   ip -n $bar_ns link set bar up || return 1
 
-   ip netns exec foo-ns timeout 2 ping -c 1 $2 || return 1
-   ip netns exec bar-ns timeout 2 ping -c 1 $1 || return 1
+   ip netns exec $foo_ns timeout 2 ping -c 1 $2 || return 1
+   ip netns exec $bar_ns timeout 2 ping -c 1 $1 || return 1
 
-   nettest -B -N bar-ns -O foo-ns -r $1 || return 1
-   nettest -B -N foo-ns -O bar-ns -r $2 || return 1
+   nettest -B -N $bar_ns -O $foo_ns -r $1 || return 1
+   nettest -B -N $foo_ns -O $bar_ns -r $2 || return 1
 
return 0
 }
@@ -88,31 +87,31 @@ _do_route_test(){
# Arguments: foo_ip foo1_ip bar1_ip bar_ip prefix_len test_description
# Displays test result and returns success or failure.
 
-   # Caller must set up foo-ns, bar-ns, and router-ns
+   # Caller must set up $foo_ns, $bar_ns, and $router_ns
# containing linked veth devices foo-foo1, bar1-bar
-   # (foo in foo-ns, foo1 and bar1 in router-ns, and
-   # bar in bar-ns).
-
-   ip -n foo-ns address add $1/$5 dev foo || return 1
-   ip -n foo-ns link set foo up || return 1
-   ip -n foo-ns route add default via $2 || return 1
-   ip -n bar-ns address add $4/$5 dev bar || return 1
-   ip -n bar-ns link set bar up || return 1
-   ip -n bar-ns route add default via $3 || return 1
-   ip -n router-ns address add $2/$5 dev foo1 || return 1
-   ip -n router-ns link set foo1 up || return 1
-   ip -n router-ns address add $3/$5 dev bar1 || return 1
-   ip -n router-ns link set bar1 up || return 1
-
-   echo 1 | ip netns exec router-ns tee /proc/sys/net/ipv4/ip_forward
-
-   ip netns exec foo-ns timeout 2 ping -c 1 $2 || return 1
-   ip netns exec foo-ns timeout 2 ping -c 1 $4 || return 1
-   ip netns exec bar-ns timeout 2 ping -c 1 $3 || return 1
-   ip netns exec bar-ns timeout 2 ping -c 1 $1 || return 1
-
-   nettest -B -N bar-ns -O foo-ns -r $1 || return 1
-   nettest -B -N foo-ns -O bar-ns -r $4 || return 1
+   # (foo in $foo_ns, foo1 and bar1 in $router_ns, and
+   # bar in $bar_ns).
+
+   ip -n $foo_ns address add $1/$5 dev foo || return 1
+   ip -n $foo_ns link set foo up || return 1
+   ip -n $foo_ns route add default via $2 || return 1
+   ip -n $bar_ns address add $4/$5 dev bar || return 1
+   ip -n $bar_ns link set bar up || return 1
+   ip -n $bar_ns route add default via $3 || return 1
+   ip -n $router_ns address add $2/$5 dev foo1 || return 1
+   ip -n $router_ns link set foo1 up || return 1
+   ip -n $router_ns address add

[PATCH net-next 31/38] selftests/net: convert toeplitz.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

I have no valid NIC for testing, but the result looks good.

]# ./toeplitz.sh -i eno1 -t -6
carrier ready
count: pass=0 nohash=0 fail=0
./toeplitz: too few frames for verification
setup_loopback.sh: line 68: 902542 Killed  ip netns exec $client_ns ./toeplitz_client.sh "${PROTO_FLAG}" "${IP_FLAG}" "${SERVER_IP%%/*}" "${PORT}"
carrier ready

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/toeplitz.sh | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/net/toeplitz.sh b/tools/testing/selftests/net/toeplitz.sh
index da5bfd834eff..4a70c9c6ad1c 100755
--- a/tools/testing/selftests/net/toeplitz.sh
+++ b/tools/testing/selftests/net/toeplitz.sh
@@ -12,6 +12,7 @@
 # [(-rss -irq_prefix )|(-rps )]
 
 source setup_loopback.sh
+source lib.sh
 readonly SERVER_IP4="192.168.1.200/24"
 readonly SERVER_IP6="fda8::1/64"
 readonly SERVER_MAC="aa:00:00:00:00:02"
@@ -146,15 +147,16 @@ parse_opts() {
 setup() {
setup_loopback_environment "${DEV}"
 
+   setup_ns server_ns client_ns
# Set up server_ns namespace and client_ns namespace
-   setup_macvlan_ns "${DEV}" server_ns server \
+   setup_macvlan_ns "${DEV}" $server_ns server \
"${SERVER_MAC}" "${SERVER_IP}"
-   setup_macvlan_ns "${DEV}" client_ns client \
+   setup_macvlan_ns "${DEV}" $client_ns client \
"${CLIENT_MAC}" "${CLIENT_IP}"
 }
 
 cleanup() {
-   cleanup_macvlan_ns server_ns server client_ns client
+   cleanup_macvlan_ns $server_ns server $client_ns client
cleanup_loopback "${DEV}"
 }
 
@@ -170,22 +172,22 @@ if [[ "${TEST_RSS}" = true ]]; then
# RPS/RFS must be disabled because they move packets between cpus,
# which breaks the PACKET_FANOUT_CPU identification of RSS decisions.
eval "$(get_disable_rfs_cmd) $(get_disable_rps_cmd)" \
- ip netns exec server_ns ./toeplitz "${IP_FLAG}" "${PROTO_FLAG}" \
+ ip netns exec $server_ns ./toeplitz "${IP_FLAG}" "${PROTO_FLAG}" \
  -d "${PORT}" -i "${DEV}" -k "${KEY}" -T 1000 \
  -C "$(get_rx_irq_cpus)" -s -v &
 elif [[ ! -z "${RPS_MAP}" ]]; then
eval "$(get_disable_rfs_cmd) $(get_set_rps_bitmaps_cmd ${RPS_MAP})" \
- ip netns exec server_ns ./toeplitz "${IP_FLAG}" "${PROTO_FLAG}" \
+ ip netns exec $server_ns ./toeplitz "${IP_FLAG}" "${PROTO_FLAG}" \
  -d "${PORT}" -i "${DEV}" -k "${KEY}" -T 1000 \
  -r "0x${RPS_MAP}" -s -v &
 else
-   ip netns exec server_ns ./toeplitz "${IP_FLAG}" "${PROTO_FLAG}" \
+   ip netns exec $server_ns ./toeplitz "${IP_FLAG}" "${PROTO_FLAG}" \
  -d "${PORT}" -i "${DEV}" -k "${KEY}" -T 1000 -s -v &
 fi
 
 server_pid=$!
 
-ip netns exec client_ns ./toeplitz_client.sh "${PROTO_FLAG}" \
+ip netns exec $client_ns ./toeplitz_client.sh "${PROTO_FLAG}" \
   "${IP_FLAG}" "${SERVER_IP%%/*}" "${PORT}" &
 
 client_pid=$!
-- 
2.41.0

[PATCH net-next 30/38] selftests/net: convert test_vxlan_vnifiltering.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./test_vxlan_vnifiltering.sh
TEST: Create traditional vxlan device   [ OK ]
TEST: Cannot create vnifilter device without external flag  [ OK ]
TEST: Creating external vxlan device with vnifilter flag[ OK ]
...
TEST: VM connectivity over traditional vxlan (ipv6 default rdst)[ OK ]
TEST: VM connectivity over metadata nonfiltering vxlan (ipv4 default rdst)  [ OK ]

Tests passed:  27
Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 .../selftests/net/test_vxlan_vnifiltering.sh  | 154 +++---
 1 file changed, 95 insertions(+), 59 deletions(-)
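
Not part of the patch: a small sketch of the ${hv[$id]} and ${vm[$vmid]}
lookups used below. It assumes the script fills the arrays from the
setup_ns variables in the same way patch 29/38 does (hv[1]=$hv_1); the
namespace names and the ns_for_hv helper are made-up placeholders.

```shell
#!/bin/bash
# Placeholder names standing in for what setup_ns would generate.
hv_1="hv_1-aaaaaa"; hv_2="hv_2-bbbbbb"
vm_11="vm_11-cccccc"; vm_21="vm_21-dddddd"

# Bash indexed arrays may be sparse, so the numeric IDs that the helpers
# already take as arguments (1, 2, 11, 21, ...) can be used as indexes.
hv[1]=$hv_1;   hv[2]=$hv_2
vm[11]=$vm_11; vm[21]=$vm_21

# Hypothetical helper: print the namespace a hypervisor ID maps to.
ns_for_hv()
{
	local id=$1
	echo "${hv[$id]}"
}

ns_for_hv 2          # prints hv_2-bbbbbb
echo "${vm[11]}"     # prints vm_11-cccccc
echo "${#vm[@]}"     # prints 2: only the assigned indexes exist
```

This keeps the existing numeric arguments of setup-hv-networking and
setup-vm unchanged while the real namespace names become random.
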

diff --git a/tools/testing/selftests/net/test_vxlan_vnifiltering.sh b/tools/testing/selftests/net/test_vxlan_vnifiltering.sh
index 8c3ac0a72545..6127a78ee988 100755
--- a/tools/testing/selftests/net/test_vxlan_vnifiltering.sh
+++ b/tools/testing/selftests/net/test_vxlan_vnifiltering.sh
@@ -78,10 +78,8 @@
 #
 #
 # This test tests the new vxlan vnifiltering api
-
+source lib.sh
 ret=0
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
 
 # all tests in this script. Can be overridden with -t option
 TESTS="
@@ -148,18 +146,18 @@ run_cmd()
 }
 
 check_hv_connectivity() {
-   ip netns exec hv-1 ping -c 1 -W 1 $1 &>/dev/null
+   ip netns exec $hv_1 ping -c 1 -W 1 $1 &>/dev/null
sleep 1
-   ip netns exec hv-1 ping -c 1 -W 1 $2 &>/dev/null
+   ip netns exec $hv_1 ping -c 1 -W 1 $2 &>/dev/null
 
return $?
 }
 
 check_vm_connectivity() {
-   run_cmd "ip netns exec vm-11 ping -c 1 -W 1 10.0.10.12"
+   run_cmd "ip netns exec $vm_11 ping -c 1 -W 1 10.0.10.12"
log_test $? 0 "VM connectivity over $1 (ipv4 default rdst)"
 
-   run_cmd "ip netns exec vm-21 ping -c 1 -W 1 10.0.10.22"
+   run_cmd "ip netns exec $vm_21 ping -c 1 -W 1 10.0.10.22"
log_test $? 0 "VM connectivity over $1 (ipv6 default rdst)"
 }
 
@@ -167,26 +165,23 @@ cleanup() {
ip link del veth-hv-1 2>/dev/null || true
ip link del vethhv-11 vethhv-12 vethhv-21 vethhv-22 2>/dev/null || true
 
-   for ns in hv-1 hv-2 vm-11 vm-21 vm-12 vm-22 vm-31 vm-32; do
-   ip netns del $ns 2>/dev/null || true
-   done
+   cleanup_ns $hv_1 $hv_2 $vm_11 $vm_21 $vm_12 $vm_22 $vm_31 $vm_32
 }
 
 trap cleanup EXIT
 
 setup-hv-networking() {
-   hv=$1
+   id=$1
local1=$2
mask1=$3
local2=$4
mask2=$5
 
-   ip netns add hv-$hv
-   ip link set veth-hv-$hv netns hv-$hv
-   ip -netns hv-$hv link set veth-hv-$hv name veth0
-   ip -netns hv-$hv addr add $local1/$mask1 dev veth0
-   ip -netns hv-$hv addr add $local2/$mask2 dev veth0
-   ip -netns hv-$hv link set veth0 up
+   ip link set veth-hv-$id netns ${hv[$id]}
+   ip -netns ${hv[$id]} link set veth-hv-$id name veth0
+   ip -netns ${hv[$id]} addr add $local1/$mask1 dev veth0
+   ip -netns ${hv[$id]} addr add $local2/$mask2 dev veth0
+   ip -netns ${hv[$id]} link set veth0 up
 }
 
 # Setups a "VM" simulated by a netns an a veth pair
@@ -208,21 +203,20 @@ setup-vm() {
lastvxlandev=""
 
# create bridge
-   ip -netns hv-$hvid link add br$brid type bridge vlan_filtering 1 vlan_default_pvid 0 \
+   ip -netns ${hv[$hvid]} link add br$brid type bridge vlan_filtering 1 vlan_default_pvid 0 \
mcast_snooping 0
-   ip -netns hv-$hvid link set br$brid up
+   ip -netns ${hv[$hvid]} link set br$brid up
 
# create vm namespace and interfaces and connect to hypervisor
# namespace
-   ip netns add vm-$vmid
hvvethif="vethhv-$vmid"
vmvethif="veth-$vmid"
ip link add $hvvethif type veth peer name $vmvethif
-   ip link set $hvvethif netns hv-$hvid
-   ip link set $vmvethif netns vm-$vmid
-   ip -netns hv-$hvid link set $hvvethif up
-   ip -netns vm-$vmid link set $vmvethif up
-   ip -netns hv-$hvid link set $hvvethif master br$brid
+   ip link set $hvvethif netns ${hv[$hvid]}
+   ip link set $vmvethif netns ${vm[$vmid]}
+   ip -netns ${hv[$hvid]} link set $hvvethif up
+   ip -netns ${vm[$vmid]} link set $vmvethif up
+   ip -netns ${hv[$hvid]} link set $hvvethif master br$brid
 
# configure VM vlan/vni filtering on hypervisor
for vmap in $(echo $vattrs | cut -d "," -f1- --output-delimiter=' ')
@@ -234,9 +228,9 @@ setup-vm() {
local vtype=$(echo $vmap | awk -F'-' '{print ($5)}')
local port=$(echo $vmap | awk -F'-' '{print ($6)}')
 
-   ip -netns vm-$vmid link add name $vmvethif.$vid link $vmvethif type vlan id $vid
-   ip -netns vm-$vmid addr add 10.0.$vid.$vmid/24 dev $vmvethif.$vid
-   ip -netns vm-$vmid link set $vmvethif.$vid up
+   ip -netns ${vm[$vmid]} link add name $vmvethif.$vid link $vmvethif type vlan id $vid
+   ip -netns ${vm[$vmid]} addr add 10.0.$vid.$vmid/24 dev $vmvethif.$vid
+   ip -netns

[PATCH net-next 29/38] selftests/net: convert test_vxlan_under_vrf.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./test_vxlan_under_vrf.sh
Checking HV connectivity   [ OK ]
Check VM connectivity through VXLAN (underlay in the default VRF)  [ OK ]
Check VM connectivity through VXLAN (underlay in a VRF)[ OK ]

Signed-off-by: Hangbin Liu 
---
 .../selftests/net/test_vxlan_under_vrf.sh | 70 ++-
 1 file changed, 36 insertions(+), 34 deletions(-)
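
Not part of the patch: this script runs under "set -e" and calls cleanup_ns
from its EXIT trap. The review of patch 01/38 suggested a
with_disabled_errexit wrapper for such helpers; the sketch below is one
possible shape of it, not the lib.sh implementation. might_fail is a
hypothetical stand-in for a cleanup step.

```shell
#!/bin/bash
# Run "$@" with errexit switched off, then restore the caller's setting
# and pass the command's exit status back.
with_disabled_errexit()
{
	local errexit=0
	local rc

	case $- in
	*e*)
		errexit=1
		set +e
		;;
	esac

	"$@"
	rc=$?

	[ "$errexit" -eq 1 ] && set -e
	return $rc
}

# Hypothetical cleanup step whose first command fails.
might_fail()
{
	false
	echo "still running after the failed command"
}

set -e
with_disabled_errexit might_fail
echo "shell flags after the call: $-"
```

Because the wrapper returns the wrapped command's status, a caller running
under "set -e" still exits on a real failure unless it tests the result.
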

diff --git a/tools/testing/selftests/net/test_vxlan_under_vrf.sh b/tools/testing/selftests/net/test_vxlan_under_vrf.sh
index 1fd1250ebc66..ae8fbe3f0779 100755
--- a/tools/testing/selftests/net/test_vxlan_under_vrf.sh
+++ b/tools/testing/selftests/net/test_vxlan_under_vrf.sh
@@ -43,15 +43,14 @@
 # This tests both the connectivity between vm-1 and vm-2, and that the underlay
 # can be moved in and out of the vrf by unsetting and setting veth0's master.
 
+source lib.sh
 set -e
 
 cleanup() {
 ip link del veth-hv-1 2>/dev/null || true
 ip link del veth-tap 2>/dev/null || true
 
-for ns in hv-1 hv-2 vm-1 vm-2; do
-ip netns del $ns 2>/dev/null || true
-done
+cleanup_ns $hv_1 $hv_2 $vm_1 $vm_2
 }
 
 # Clean start
@@ -60,72 +59,75 @@ cleanup &> /dev/null
 [[ $1 == "clean" ]] && exit 0
 
 trap cleanup EXIT
+setup_ns hv_1 hv_2 vm_1 vm_2
+hv[1]=$hv_1
+hv[2]=$hv_2
+vm[1]=$vm_1
+vm[2]=$vm_2
 
 # Setup "Hypervisors" simulated with netns
 ip link add veth-hv-1 type veth peer name veth-hv-2
 setup-hv-networking() {
-hv=$1
+id=$1
 
-ip netns add hv-$hv
-ip link set veth-hv-$hv netns hv-$hv
-ip -netns hv-$hv link set veth-hv-$hv name veth0
+ip link set veth-hv-$id netns ${hv[$id]}
+ip -netns ${hv[$id]} link set veth-hv-$id name veth0
 
-ip -netns hv-$hv link add vrf-underlay type vrf table 1
-ip -netns hv-$hv link set vrf-underlay up
-ip -netns hv-$hv addr add 172.16.0.$hv/24 dev veth0
-ip -netns hv-$hv link set veth0 up
+ip -netns ${hv[$id]} link add vrf-underlay type vrf table 1
+ip -netns ${hv[$id]} link set vrf-underlay up
+ip -netns ${hv[$id]} addr add 172.16.0.$id/24 dev veth0
+ip -netns ${hv[$id]} link set veth0 up
 
-ip -netns hv-$hv link add br0 type bridge
-ip -netns hv-$hv link set br0 up
+ip -netns ${hv[$id]} link add br0 type bridge
+ip -netns ${hv[$id]} link set br0 up
 
-ip -netns hv-$hv link add vxlan0 type vxlan id 10 local 172.16.0.$hv dev veth0 dstport 4789
-ip -netns hv-$hv link set vxlan0 master br0
-ip -netns hv-$hv link set vxlan0 up
+ip -netns ${hv[$id]} link add vxlan0 type vxlan id 10 local 172.16.0.$id dev veth0 dstport 4789
+ip -netns ${hv[$id]} link set vxlan0 master br0
+ip -netns ${hv[$id]} link set vxlan0 up
 }
 setup-hv-networking 1
 setup-hv-networking 2
 
 # Check connectivity between HVs by pinging hv-2 from hv-1
 echo -n "Checking HV connectivity   "
-ip netns exec hv-1 ping -c 1 -W 1 172.16.0.2 &> /dev/null || (echo "[FAIL]"; false)
+ip netns exec $hv_1 ping -c 1 -W 1 172.16.0.2 &> /dev/null || (echo "[FAIL]"; false)
 echo "[ OK ]"
 
 # Setups a "VM" simulated by a netns an a veth pair
 setup-vm() {
 id=$1
 
-ip netns add vm-$id
 ip link add veth-tap type veth peer name veth-hv
 
-ip link set veth-tap netns hv-$id
-ip -netns hv-$id link set veth-tap master br0
-ip -netns hv-$id link set veth-tap up
+ip link set veth-tap netns ${hv[$id]}
+ip -netns ${hv[$id]} link set veth-tap master br0
+ip -netns ${hv[$id]} link set veth-tap up
 
 ip link set veth-hv address 02:1d:8d:dd:0c:6$id
 
-ip link set veth-hv netns vm-$id
-ip -netns vm-$id addr add 10.0.0.$id/24 dev veth-hv
-ip -netns vm-$id link set veth-hv up
+ip link set veth-hv netns ${vm[$id]}
+ip -netns ${vm[$id]} addr add 10.0.0.$id/24 dev veth-hv
+ip -netns ${vm[$id]} link set veth-hv up
 }
 setup-vm 1
 setup-vm 2
 
 # Setup VTEP routes to make ARP work
-bridge -netns hv-1 fdb add 00:00:00:00:00:00 dev vxlan0 dst 172.16.0.2 self permanent
-bridge -netns hv-2 fdb add 00:00:00:00:00:00 dev vxlan0 dst 172.16.0.1 self permanent
+bridge -netns $hv_1 fdb add 00:00:00:00:00:00 dev vxlan0 dst 172.16.0.2 self permanent
+bridge -netns $hv_2 fdb add 00:00:00:00:00:00 dev vxlan0 dst 172.16.0.1 self permanent
 
 echo -n "Check VM connectivity through VXLAN (underlay in the default VRF)  "
-ip netns exec vm-1 ping -c 1 -W 1 10.0.0.2 &> /dev/null || (echo "[FAIL]"; false)
+ip netns exec $vm_1 ping -c 1 -W 1 10.0.0.2 &> /dev/null || (echo "[FAIL]"; false)
 echo "[ OK ]"
 
 # Move the underlay to a non-default VRF
-ip -netns hv-1 link set veth0 vrf vrf-underlay
-ip -netns hv-1 link set vxlan0 down
-ip -netns hv-1 link set vxlan0 up
-ip -netns hv-2 link set veth0 vrf vrf-underlay
-ip -netns hv-2 link set vxlan0 down
-ip -netns hv-2 link set vxlan0 up
+ip -netns $hv_1 link set veth0 vrf vrf-underlay
+ip -netns $hv_1 link set vxlan0 down
+ip -netns $hv_1 link set vxlan0 up

[PATCH net-next 28/38] selftests/net: convert test_vxlan_nolocalbypass.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./test_vxlan_nolocalbypass.sh
TEST: localbypass enabled   [ OK ]
TEST: Packet received by local VXLAN device - localbypass   [ OK ]
TEST: localbypass disabled  [ OK ]
TEST: Packet not received by local VXLAN device - nolocalbypass [ OK ]
TEST: localbypass enabled   [ OK ]
TEST: Packet received by local VXLAN device - localbypass   [ OK ]

Tests passed:   6
Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 .../selftests/net/test_vxlan_nolocalbypass.sh | 48 +--
 1 file changed, 23 insertions(+), 25 deletions(-)

diff --git a/tools/testing/selftests/net/test_vxlan_nolocalbypass.sh b/tools/testing/selftests/net/test_vxlan_nolocalbypass.sh
index f75212bf142c..b8805983b728 100755
--- a/tools/testing/selftests/net/test_vxlan_nolocalbypass.sh
+++ b/tools/testing/selftests/net/test_vxlan_nolocalbypass.sh
@@ -9,9 +9,8 @@
 # option and verifies that packets are no longer received by the second VXLAN
 # device.
 
+source lib.sh
 ret=0
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
 
 TESTS="
nolocalbypass
@@ -98,20 +97,19 @@ tc_check_packets()
 
 setup()
 {
-   ip netns add ns1
+   setup_ns ns1
 
-   ip -n ns1 link set dev lo up
-   ip -n ns1 address add 192.0.2.1/32 dev lo
-   ip -n ns1 address add 198.51.100.1/32 dev lo
+   ip -n $ns1 address add 192.0.2.1/32 dev lo
+   ip -n $ns1 address add 198.51.100.1/32 dev lo
 
-   ip -n ns1 link add name vx0 up type vxlan id 100 local 198.51.100.1 \
+   ip -n $ns1 link add name vx0 up type vxlan id 100 local 198.51.100.1 \
dstport 4789 nolearning
-   ip -n ns1 link add name vx1 up type vxlan id 100 dstport 4790
+   ip -n $ns1 link add name vx1 up type vxlan id 100 dstport 4790
 }
 
 cleanup()
 {
-   ip netns del ns1 &> /dev/null
+   cleanup_ns $ns1
 }
 
 

@@ -122,40 +120,40 @@ nolocalbypass()
local smac=00:01:02:03:04:05
local dmac=00:0a:0b:0c:0d:0e
 
-   run_cmd "bridge -n ns1 fdb add $dmac dev vx0 self static dst 192.0.2.1 port 4790"
+   run_cmd "bridge -n $ns1 fdb add $dmac dev vx0 self static dst 192.0.2.1 port 4790"
 
-   run_cmd "tc -n ns1 qdisc add dev vx1 clsact"
-   run_cmd "tc -n ns1 filter add dev vx1 ingress pref 1 handle 101 proto all flower src_mac $smac dst_mac $dmac action pass"
+   run_cmd "tc -n $ns1 qdisc add dev vx1 clsact"
+   run_cmd "tc -n $ns1 filter add dev vx1 ingress pref 1 handle 101 proto all flower src_mac $smac dst_mac $dmac action pass"
 
-   run_cmd "tc -n ns1 qdisc add dev lo clsact"
-   run_cmd "tc -n ns1 filter add dev lo ingress pref 1 handle 101 proto ip flower ip_proto udp dst_port 4790 action drop"
+   run_cmd "tc -n $ns1 qdisc add dev lo clsact"
+   run_cmd "tc -n $ns1 filter add dev lo ingress pref 1 handle 101 proto ip flower ip_proto udp dst_port 4790 action drop"
 
-   run_cmd "ip -n ns1 -d -j link show dev vx0 | jq -e '.[][\"linkinfo\"][\"info_data\"][\"localbypass\"] == true'"
+   run_cmd "ip -n $ns1 -d -j link show dev vx0 | jq -e '.[][\"linkinfo\"][\"info_data\"][\"localbypass\"] == true'"
log_test $? 0 "localbypass enabled"
 
-   run_cmd "ip netns exec ns1 mausezahn vx0 -a $smac -b $dmac -c 1 -p 100 -q"
+   run_cmd "ip netns exec $ns1 mausezahn vx0 -a $smac -b $dmac -c 1 -p 100 -q"
 
-   tc_check_packets "ns1" "dev vx1 ingress" 101 1
+   tc_check_packets "$ns1" "dev vx1 ingress" 101 1
log_test $? 0 "Packet received by local VXLAN device - localbypass"
 
-   run_cmd "ip -n ns1 link set dev vx0 type vxlan nolocalbypass"
+   run_cmd "ip -n $ns1 link set dev vx0 type vxlan nolocalbypass"
 
-   run_cmd "ip -n ns1 -d -j link show dev vx0 | jq -e '.[][\"linkinfo\"][\"info_data\"][\"localbypass\"] == false'"
+   run_cmd "ip -n $ns1 -d -j link show dev vx0 | jq -e '.[][\"linkinfo\"][\"info_data\"][\"localbypass\"] == false'"
log_test $? 0 "localbypass disabled"
 
-   run_cmd "ip netns exec ns1 mausezahn vx0 -a $smac -b $dmac -c 1 -p 100 -q"
+   run_cmd "ip netns exec $ns1 mausezahn vx0 -a $smac -b $dmac -c 1 -p 100 -q"
 
-   tc_check_packets "ns1" "dev vx1 ingress" 101 1
+   tc_check_packets "$ns1" "dev vx1 ingress" 101 1
log_test $? 0 "Packet not received by local VXLAN device - nolocalbypass"
 
-   run_cmd "ip -n ns1 link set dev vx0 type vxlan localbypass"
+   run_cmd "ip -n $ns1 link set dev vx0 type vxlan localbypass"
 
-   run_cmd "ip -n ns1 -d -j link show dev vx0 | jq -e '.[][\"linkinfo\"][\"info_data\"][\"localbypass\"] == true'"
+   run_cmd "ip -n $ns1 -d -j link show dev vx0 | jq -e '.[][\"linkinfo\"][\"info_data\"][\"localbypass\"] == true'"
log_test $? 0 "localbypass

[PATCH net-next 27/38] selftests/net: convert test_vxlan_mdb.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./test_vxlan_mdb.sh

Control path: Basic (*, G) operations - IPv4 overlay / IPv4 underlay

TEST: MDB entry addition[ OK ]

...

Data path: MDB torture test - IPv6 overlay / IPv6 underlay
--
TEST: Torture test  [ OK ]

Tests passed: 620
Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/test_vxlan_mdb.sh | 202 +-
 1 file changed, 99 insertions(+), 103 deletions(-)

diff --git a/tools/testing/selftests/net/test_vxlan_mdb.sh b/tools/testing/selftests/net/test_vxlan_mdb.sh
index 6e996f8063cd..6725fd9157b9 100755
--- a/tools/testing/selftests/net/test_vxlan_mdb.sh
+++ b/tools/testing/selftests/net/test_vxlan_mdb.sh
@@ -55,9 +55,8 @@
 # | ns2_v4 | | ns2_v6 |
 # ++ ++
 
+source lib.sh
 ret=0
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
 
 CONTROL_PATH_TESTS="
basic_star_g_ipv4_ipv4
@@ -260,9 +259,6 @@ setup_common()
local local_addr1=$1; shift
local local_addr2=$1; shift
 
-   ip netns add $ns1
-   ip netns add $ns2
-
ip link add name veth0 type veth peer name veth1
ip link set dev veth0 netns $ns1 name veth0
ip link set dev veth1 netns $ns2 name veth0
@@ -273,36 +269,36 @@ setup_common()
 
 setup_v4()
 {
-   setup_common ns1_v4 ns2_v4 192.0.2.1 192.0.2.2
+   setup_ns ns1_v4 ns2_v4
+   setup_common $ns1_v4 $ns2_v4 192.0.2.1 192.0.2.2
 
-   ip -n ns1_v4 address add 192.0.2.17/28 dev veth0
-   ip -n ns2_v4 address add 192.0.2.18/28 dev veth0
+   ip -n $ns1_v4 address add 192.0.2.17/28 dev veth0
+   ip -n $ns2_v4 address add 192.0.2.18/28 dev veth0
 
-   ip -n ns1_v4 route add default via 192.0.2.18
-   ip -n ns2_v4 route add default via 192.0.2.17
+   ip -n $ns1_v4 route add default via 192.0.2.18
+   ip -n $ns2_v4 route add default via 192.0.2.17
 }
 
 cleanup_v4()
 {
-   ip netns del ns2_v4
-   ip netns del ns1_v4
+   cleanup_ns $ns2_v4 $ns1_v4
 }
 
 setup_v6()
 {
-   setup_common ns1_v6 ns2_v6 2001:db8:1::1 2001:db8:1::2
+   setup_ns ns1_v6 ns2_v6
+   setup_common $ns1_v6 $ns2_v6 2001:db8:1::1 2001:db8:1::2
 
-   ip -n ns1_v6 address add 2001:db8:2::1/64 dev veth0 nodad
-   ip -n ns2_v6 address add 2001:db8:2::2/64 dev veth0 nodad
+   ip -n $ns1_v6 address add 2001:db8:2::1/64 dev veth0 nodad
+   ip -n $ns2_v6 address add 2001:db8:2::2/64 dev veth0 nodad
 
-   ip -n ns1_v6 route add default via 2001:db8:2::2
-   ip -n ns2_v6 route add default via 2001:db8:2::1
+   ip -n $ns1_v6 route add default via 2001:db8:2::2
+   ip -n $ns2_v6 route add default via 2001:db8:2::1
 }
 
 cleanup_v6()
 {
-   ip netns del ns2_v6
-   ip netns del ns1_v6
+   cleanup_ns $ns2_v6 $ns1_v6
 }
 
 setup()
@@ -433,7 +429,7 @@ basic_common()
 
 basic_star_g_ipv4_ipv4()
 {
-   local ns1=ns1_v4
+   local ns1=$ns1_v4
local grp_key="grp 239.1.1.1"
local vtep_ip=198.51.100.100
 
@@ -446,7 +442,7 @@ basic_star_g_ipv4_ipv4()
 
 basic_star_g_ipv6_ipv4()
 {
-   local ns1=ns1_v4
+   local ns1=$ns1_v4
local grp_key="grp ff0e::1"
local vtep_ip=198.51.100.100
 
@@ -459,7 +455,7 @@ basic_star_g_ipv6_ipv4()
 
 basic_star_g_ipv4_ipv6()
 {
-   local ns1=ns1_v6
+   local ns1=$ns1_v6
local grp_key="grp 239.1.1.1"
local vtep_ip=2001:db8:1000::1
 
@@ -472,7 +468,7 @@ basic_star_g_ipv4_ipv6()
 
 basic_star_g_ipv6_ipv6()
 {
-   local ns1=ns1_v6
+   local ns1=$ns1_v6
local grp_key="grp ff0e::1"
local vtep_ip=2001:db8:1000::1
 
@@ -485,7 +481,7 @@ basic_star_g_ipv6_ipv6()
 
 basic_sg_ipv4_ipv4()
 {
-   local ns1=ns1_v4
+   local ns1=$ns1_v4
local grp_key="grp 239.1.1.1 src 192.0.2.129"
local vtep_ip=198.51.100.100
 
@@ -498,7 +494,7 @@ basic_sg_ipv4_ipv4()
 
 basic_sg_ipv6_ipv4()
 {
-   local ns1=ns1_v4
+   local ns1=$ns1_v4
local grp_key="grp ff0e::1 src 2001:db8:100::1"
local vtep_ip=198.51.100.100
 
@@ -511,7 +507,7 @@ basic_sg_ipv6_ipv4()
 
 basic_sg_ipv4_ipv6()
 {
-   local ns1=ns1_v6
+   local ns1=$ns1_v6
local grp_key="grp 239.1.1.1 src 192.0.2.129"
local vtep_ip=2001:db8:1000::1
 
@@ -524,7 +520,7 @@ basic_sg_ipv4_ipv6()
 
 basic_sg_ipv6_ipv6()
 {
-   local ns1=ns1_v6
+   local ns1=$ns1_v6
local grp_key="grp ff0e::1 src 2001:db8:100::1"
local vtep_ip=2001:db8:1000::1
 
@@ -694,7 +690,7 @@ star_g_common()
 
 star_g_ipv4_ipv4()
 {
-   local ns1=ns1_v4
+   local ns1=$ns1_v4
local grp=239.1.1.1
local src1=192.0.2.129
local

[PATCH net-next 26/38] selftests/net: convert test_bridge_neigh_suppress.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./test_bridge_neigh_suppress.sh
declare -- sip="192.0.2.1"
declare -- tip="192.0.2.2"
declare -- vid="10"

Per-port ARP suppression - VLAN 10
--
TEST: arping[ OK ]
TEST: ARP suppression   [ OK ]

...

TEST: NS suppression (VLAN 20)  [ OK ]

Tests passed: 148
Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 .../net/test_bridge_neigh_suppress.sh | 333 +-
 1 file changed, 163 insertions(+), 170 deletions(-)

diff --git a/tools/testing/selftests/net/test_bridge_neigh_suppress.sh b/tools/testing/selftests/net/test_bridge_neigh_suppress.sh
index d80f2cd87614..cd8629b476e2 100755
--- a/tools/testing/selftests/net/test_bridge_neigh_suppress.sh
+++ b/tools/testing/selftests/net/test_bridge_neigh_suppress.sh
@@ -45,9 +45,8 @@
 # | sw1| | sw2|
 # ++ ++
 
+source lib.sh
 ret=0
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
 
 # All tests in this script. Can be overridden with -t option.
 TESTS="
@@ -140,9 +139,6 @@ setup_topo_ns()
 {
local ns=$1; shift
 
-   ip netns add $ns
-   ip -n $ns link set dev lo up
-
ip netns exec $ns sysctl -qw net.ipv6.conf.all.keep_addr_on_down=1
ip netns exec $ns sysctl -qw net.ipv6.conf.default.ignore_routes_with_linkdown=1
ip netns exec $ns sysctl -qw net.ipv6.conf.all.accept_dad=0
@@ -153,21 +149,22 @@ setup_topo()
 {
local ns
 
-   for ns in h1 h2 sw1 sw2; do
+   setup_ns h1 h2 sw1 sw2
+   for ns in $h1 $h2 $sw1 $sw2; do
setup_topo_ns $ns
done
 
ip link add name veth0 type veth peer name veth1
-   ip link set dev veth0 netns h1 name eth0
-   ip link set dev veth1 netns sw1 name swp1
+   ip link set dev veth0 netns $h1 name eth0
+   ip link set dev veth1 netns $sw1 name swp1
 
ip link add name veth0 type veth peer name veth1
-   ip link set dev veth0 netns sw1 name veth0
-   ip link set dev veth1 netns sw2 name veth0
+   ip link set dev veth0 netns $sw1 name veth0
+   ip link set dev veth1 netns $sw2 name veth0
 
ip link add name veth0 type veth peer name veth1
-   ip link set dev veth0 netns h2 name eth0
-   ip link set dev veth1 netns sw2 name swp1
+   ip link set dev veth0 netns $h2 name eth0
+   ip link set dev veth1 netns $sw2 name swp1
 }
 
 setup_host_common()
@@ -291,11 +288,7 @@ setup()
 
 cleanup()
 {
-   local ns
-
-   for ns in h1 h2 sw1 sw2; do
-   ip netns del $ns &> /dev/null
-   done
+   cleanup_ns $h1 $h2 $sw1 $sw2
 }
 
 

@@ -312,80 +305,80 @@ neigh_suppress_arp_common()
echo "Per-port ARP suppression - VLAN $vid"
echo "--"
 
-   run_cmd "tc -n sw1 qdisc replace dev vx0 clsact"
-   run_cmd "tc -n sw1 filter replace dev vx0 egress pref 1 handle 101 proto 0x0806 flower indev swp1 arp_tip $tip arp_sip $sip arp_op request action pass"
+   run_cmd "tc -n $sw1 qdisc replace dev vx0 clsact"
+   run_cmd "tc -n $sw1 filter replace dev vx0 egress pref 1 handle 101 proto 0x0806 flower indev swp1 arp_tip $tip arp_sip $sip arp_op request action pass"
 
# Initial state - check that ARP requests are not suppressed and that
# ARP replies are received.
-   run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip"
+   run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip"
log_test $? 0 "arping"
-   tc_check_packets sw1 "dev vx0 egress" 101 1
+   tc_check_packets $sw1 "dev vx0 egress" 101 1
log_test $? 0 "ARP suppression"
 
# Enable neighbor suppression and check that nothing changes compared
# to the initial state.
-   run_cmd "bridge -n sw1 link set dev vx0 neigh_suppress on"
-   run_cmd "bridge -n sw1 -d link show dev vx0 | grep \"neigh_suppress on\""
+   run_cmd "bridge -n $sw1 link set dev vx0 neigh_suppress on"
+   run_cmd "bridge -n $sw1 -d link show dev vx0 | grep \"neigh_suppress on\""
log_test $? 0 "\"neigh_suppress\" is on"
 
-   run_cmd "ip netns exec h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip"
+   run_cmd "ip netns exec $h1 arping -q -b -c 1 -w 5 -s $sip -I eth0.$vid $tip"
log_test $? 0 "arping"
-   tc_check_packets sw1 "dev vx0 egress" 101 2
+   tc_check_packets $sw1 "dev vx0 egress" 101 2
log_test $? 0 "ARP suppression"
 
# Install an FDB entry for the remote host and check that nothing
# changes compared to the initial state.
-   h2_mac=$(ip -n h2 -j -p link show

[PATCH net-next 25/38] selftests/net: convert test_bridge_backup_port.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

This test never creates the h1 and h2 namespaces, so drop them from
cleanup(). Here is the test result after conversion.

]# ./test_bridge_backup_port.sh

Backup port
---
TEST: Forwarding out of swp1[ OK ]
TEST: No forwarding out of vx0  [ OK ]
TEST: swp1 carrier off  [ OK ]
TEST: No forwarding out of swp1 [ OK ]
...
Backup nexthop ID - ping

TEST: Ping with backup nexthop ID   [ OK ]
TEST: Ping after disabling backup nexthop ID[ OK ]

Backup nexthop ID - torture test

TEST: Torture test  [ OK ]

Tests passed:  83
Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 .../selftests/net/test_bridge_backup_port.sh  | 368 +-
 1 file changed, 182 insertions(+), 186 deletions(-)

diff --git a/tools/testing/selftests/net/test_bridge_backup_port.sh b/tools/testing/selftests/net/test_bridge_backup_port.sh
index 112cfd8a10ad..5fb7c5612dd3 100755
--- a/tools/testing/selftests/net/test_bridge_backup_port.sh
+++ b/tools/testing/selftests/net/test_bridge_backup_port.sh
@@ -35,9 +35,8 @@
 # | sw1| | sw2|
 # ++ ++
 
+source lib.sh
 ret=0
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
 
 # All tests in this script. Can be overridden with -t option.
 TESTS="
@@ -145,13 +144,14 @@ setup_topo()
 {
local ns
 
-   for ns in sw1 sw2; do
+   setup_ns sw1 sw2
+   for ns in $sw1 $sw2; do
setup_topo_ns $ns
done
 
ip link add name veth0 type veth peer name veth1
-   ip link set dev veth0 netns sw1 name veth0
-   ip link set dev veth1 netns sw2 name veth0
+   ip link set dev veth0 netns $sw1 name veth0
+   ip link set dev veth1 netns $sw2 name veth0
 }
 
 setup_sw_common()
@@ -229,11 +229,7 @@ setup()
 
 cleanup()
 {
-   local ns
-
-   for ns in h1 h2 sw1 sw2; do
-   ip netns del $ns &> /dev/null
-   done
+   cleanup_ns $sw1 $sw2
 }
 
 

@@ -248,85 +244,85 @@ backup_port()
echo "Backup port"
echo "---"
 
-   run_cmd "tc -n sw1 qdisc replace dev swp1 clsact"
-   run_cmd "tc -n sw1 filter replace dev swp1 egress pref 1 handle 101 proto ip flower src_mac $smac dst_mac $dmac action pass"
+   run_cmd "tc -n $sw1 qdisc replace dev swp1 clsact"
+   run_cmd "tc -n $sw1 filter replace dev swp1 egress pref 1 handle 101 proto ip flower src_mac $smac dst_mac $dmac action pass"
 
-   run_cmd "tc -n sw1 qdisc replace dev vx0 clsact"
-   run_cmd "tc -n sw1 filter replace dev vx0 egress pref 1 handle 101 proto ip flower src_mac $smac dst_mac $dmac action pass"
+   run_cmd "tc -n $sw1 qdisc replace dev vx0 clsact"
+   run_cmd "tc -n $sw1 filter replace dev vx0 egress pref 1 handle 101 proto ip flower src_mac $smac dst_mac $dmac action pass"
 
-   run_cmd "bridge -n sw1 fdb replace $dmac dev swp1 master static vlan 10"
+   run_cmd "bridge -n $sw1 fdb replace $dmac dev swp1 master static vlan 10"
 
# Initial state - check that packets are forwarded out of swp1 when it
# has a carrier and not forwarded out of any port when it does not have
# a carrier.
-   run_cmd "ip netns exec sw1 mausezahn br0.10 -a $smac -b $dmac -A 198.51.100.1 -B 198.51.100.2 -t ip -p 100 -q -c 1"
-   tc_check_packets sw1 "dev swp1 egress" 101 1
+   run_cmd "ip netns exec $sw1 mausezahn br0.10 -a $smac -b $dmac -A 198.51.100.1 -B 198.51.100.2 -t ip -p 100 -q -c 1"
+   tc_check_packets $sw1 "dev swp1 egress" 101 1
log_test $? 0 "Forwarding out of swp1"
-   tc_check_packets sw1 "dev vx0 egress" 101 0
+   tc_check_packets $sw1 "dev vx0 egress" 101 0
log_test $? 0 "No forwarding out of vx0"
 
-   run_cmd "ip -n sw1 link set dev swp1 carrier off"
+   run_cmd "ip -n $sw1 link set dev swp1 carrier off"
log_test $? 0 "swp1 carrier off"
 
-   run_cmd "ip netns exec sw1 mausezahn br0.10 -a $smac -b $dmac -A 198.51.100.1 -B 198.51.100.2 -t ip -p 100 -q -c 1"
-   tc_check_packets sw1 "dev swp1 egress" 101 1
+   run_cmd "ip netns exec $sw1 mausezahn br0.10 -a $smac -b $dmac -A 198.51.100.1 -B 198.51.100.2 -t ip -p 100 -q -c 1"
+   tc_check_packets $sw1 "dev swp1 egress" 101 1
log_test $? 0 "No forwarding out of swp1"
-   tc_check_packets sw1 "dev vx0 egress" 101 0
+   tc_check_packets $sw1 "dev vx0 egress" 101 0
log_test $? 0 "No forwarding out of vx0"
 
-   run_cmd "ip -n sw1 link set dev swp1 carrier on"
+   run_cmd "ip -n $sw1 link set dev swp1 carrier on"
log_test $?

[PATCH net-next 24/38] selftests/net: convert stress_reuseport_listen.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./stress_reuseport_listen.sh
listen 24000 socks took 0.47714

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/stress_reuseport_listen.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/net/stress_reuseport_listen.sh b/tools/testing/selftests/net/stress_reuseport_listen.sh
index 4de11da4092b..94d5d1a1c90f 100755
--- a/tools/testing/selftests/net/stress_reuseport_listen.sh
+++ b/tools/testing/selftests/net/stress_reuseport_listen.sh
@@ -2,18 +2,18 @@
 # SPDX-License-Identifier: GPL-2.0
 # Copyright (c) 2022 Meta Platforms, Inc. and affiliates.
 
-NS='stress_reuseport_listen_ns'
+source lib.sh
 NR_FILES=24100
 SAVED_NR_FILES=$(ulimit -n)
 
 setup() {
-   ip netns add $NS
+   setup_ns NS
ip netns exec $NS sysctl -q -w net.ipv6.ip_nonlocal_bind=1
ulimit -n $NR_FILES
 }
 
 cleanup() {
-   ip netns del $NS
+   cleanup_ns $NS
ulimit -n $SAVED_NR_FILES
 }
 
-- 
2.41.0

[PATCH net-next 23/38] selftests/net: use unique netns name for setup_loopback.sh setup_veth.sh

2023-11-24 Thread Hangbin Liu

Rename server_ns/client_ns to unique names so that the tests can run in
parallel.

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/setup_loopback.sh | 8 +---
 tools/testing/selftests/net/setup_veth.sh | 9 ++---
 2 files changed, 11 insertions(+), 6 deletions(-)
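
Not part of the patch: a sketch of generating per-run namespace names with
"mktemp -u", which prints a unique name without creating a file. GNU mktemp
requires at least three consecutive X placeholders in the template, so the
XXXXXX suffix used here is an assumption about the intended template;
unique_ns_name is a hypothetical helper.

```shell
#!/bin/bash
# Print "<prefix>-<6 random characters>"; -u means no file is created.
unique_ns_name()
{
	mktemp -u "$1-XXXXXX"
}

server_ns=$(unique_ns_name server)
client_ns=$(unique_ns_name client)
echo "server namespace: $server_ns"
echo "client namespace: $client_ns"
```

Two test scripts that source the same setup file then get different
namespace names, so they no longer collide when run at the same time.
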

diff --git a/tools/testing/selftests/net/setup_loopback.sh b/tools/testing/selftests/net/setup_loopback.sh
index e57bbfbc5208..6b1150bf3995 100755
--- a/tools/testing/selftests/net/setup_loopback.sh
+++ b/tools/testing/selftests/net/setup_loopback.sh
@@ -5,6 +5,8 @@ readonly FLUSH_PATH="/sys/class/net/${dev}/gro_flush_timeout"
 readonly IRQ_PATH="/sys/class/net/${dev}/napi_defer_hard_irqs"
 readonly FLUSH_TIMEOUT="$(< ${FLUSH_PATH})"
 readonly HARD_IRQS="$(< ${IRQ_PATH})"
+readonly server_ns=$(mktemp -u server-)
+readonly client_ns=$(mktemp -u client-)
 
 netdev_check_for_carrier() {
local -r dev="$1"
@@ -97,12 +99,12 @@ setup_interrupt() {
 
 setup_ns() {
# Set up server_ns namespace and client_ns namespace
-   setup_macvlan_ns "${dev}" server_ns server "${SERVER_MAC}"
-   setup_macvlan_ns "${dev}" client_ns client "${CLIENT_MAC}"
+   setup_macvlan_ns "${dev}" ${server_ns} server "${SERVER_MAC}"
+   setup_macvlan_ns "${dev}" ${client_ns} client "${CLIENT_MAC}"
 }
 
 cleanup_ns() {
-   cleanup_macvlan_ns server_ns server client_ns client
+   cleanup_macvlan_ns ${server_ns} server ${client_ns} client
 }
 
 setup() {
diff --git a/tools/testing/selftests/net/setup_veth.sh b/tools/testing/selftests/net/setup_veth.sh
index 1003ddf7b3b2..a9a1759e035c 100644
--- a/tools/testing/selftests/net/setup_veth.sh
+++ b/tools/testing/selftests/net/setup_veth.sh
@@ -1,6 +1,9 @@
 #!/bin/bash
 # SPDX-License-Identifier: GPL-2.0
 
+readonly server_ns=$(mktemp -u server-)
+readonly client_ns=$(mktemp -u client-)
+
 setup_veth_ns() {
local -r link_dev="$1"
local -r ns_name="$2"
@@ -19,14 +22,14 @@ setup_ns() {
# Set up server_ns namespace and client_ns namespace
ip link add name server type veth peer name client
 
-   setup_veth_ns "${dev}" server_ns server "${SERVER_MAC}"
-   setup_veth_ns "${dev}" client_ns client "${CLIENT_MAC}"
+   setup_veth_ns "${dev}" ${server_ns} server "${SERVER_MAC}"
+   setup_veth_ns "${dev}" ${client_ns} client "${CLIENT_MAC}"
 }
 
 cleanup_ns() {
local ns_name
 
-   for ns_name in client_ns server_ns; do
+   for ns_name in ${client_ns} ${server_ns}; do
[[ -e /var/run/netns/"${ns_name}" ]] && ip netns del "${ns_name}"
done
 }
-- 
2.41.0
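
A note on the naming scheme used in this patch: the following is a minimal sketch, assuming GNU coreutils mktemp, of how `mktemp -u` can produce a unique namespace name without creating a file. GNU mktemp rejects templates with fewer than three X's. The `gen_ns_name` helper and the eight-character suffix are illustrative assumptions, not part of the patch.

```shell
#!/bin/bash
# Hypothetical helper: build a namespace name from a prefix plus a
# random suffix. "mktemp -u" only prints the name; nothing is created.
gen_ns_name()
{
	local prefix="$1"

	mktemp -u "${prefix}-XXXXXXXX"
}

server_ns=$(gen_ns_name server)
client_ns=$(gen_ns_name client)
echo "server namespace: ${server_ns}"
echo "client namespace: ${client_ns}"
```

Since each run gets a fresh suffix, two instances of the test running at the same time should not collide on namespace names.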

[PATCH net-next 22/38] selftests/net: convert sctp_vrf.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./sctp_vrf.sh
Testing For SCTP VRF:
TEST 01: nobind, connect from client 1, l3mdev_accept=1, Y [PASS]
...
TEST 12: bind vrf-2 & 1 in server, connect from client 1 & 2, N [PASS]
***v6 Tests Done***

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/sctp_vrf.sh | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/net/sctp_vrf.sh b/tools/testing/selftests/net/sctp_vrf.sh
index c721e952e5f3..c854034b6aa1 100755
--- a/tools/testing/selftests/net/sctp_vrf.sh
+++ b/tools/testing/selftests/net/sctp_vrf.sh
@@ -6,13 +6,11 @@
 #  SERVER_NS
 #   CLIENT_NS2 (veth1) <---> (veth2) -> vrf_s2
 
-CLIENT_NS1="client-ns1"
-CLIENT_NS2="client-ns2"
+source lib.sh
 CLIENT_IP4="10.0.0.1"
 CLIENT_IP6="2000::1"
 CLIENT_PORT=1234
 
-SERVER_NS="server-ns"
 SERVER_IP4="10.0.0.2"
 SERVER_IP6="2000::2"
 SERVER_PORT=1234
@@ -20,9 +18,7 @@ SERVER_PORT=1234
 setup() {
modprobe sctp
modprobe sctp_diag
-   ip netns add $CLIENT_NS1
-   ip netns add $CLIENT_NS2
-   ip netns add $SERVER_NS
+   setup_ns CLIENT_NS1 CLIENT_NS2 SERVER_NS
 
ip net exec $CLIENT_NS1 sysctl -w net.ipv6.conf.default.accept_dad=0 2>&1 >/dev/null
ip net exec $CLIENT_NS2 sysctl -w net.ipv6.conf.default.accept_dad=0 2>&1 >/dev/null
@@ -67,9 +63,7 @@ setup() {
 
 cleanup() {
ip netns exec $SERVER_NS pkill sctp_hello 2>&1 >/dev/null
-   ip netns del "$CLIENT_NS1"
-   ip netns del "$CLIENT_NS2"
-   ip netns del "$SERVER_NS"
+   cleanup_ns $CLIENT_NS1 $CLIENT_NS2 $SERVER_NS
 }
 
 wait_server() {
-- 
2.41.0
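
For readers following the conversion pattern: the callers in this series pass a bare variable name to setup_ns and later use the expanded value. Below is a rough sketch of that calling convention only. `setup_ns_sketch` is a hypothetical stub, the name format `<lowercased name>-<random>` is an assumption, and no namespace is actually created.

```shell
#!/bin/bash
# Stub standing in for lib.sh's setup_ns (assumed behaviour): for each
# variable NAME passed in, generate "<lowercased name>-<random>" and
# store it in that variable. No namespace is created in this sketch.
setup_ns_sketch()
{
	local var ns

	for var in "$@"; do
		ns="${var,,}-$(mktemp -u XXXXXX)"
		# Assign to the caller's variable by name.
		printf -v "$var" '%s' "$ns"
	done
}

setup_ns_sketch CLIENT_NS1 SERVER_NS
echo "CLIENT_NS1=${CLIENT_NS1}"
echo "SERVER_NS=${SERVER_NS}"
```

This is why the diff calls `setup_ns CLIENT_NS1 ...` without a `$` but `cleanup_ns $CLIENT_NS1 ...` with one: the first takes names to fill in, the second takes the generated values.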

[PATCH net-next 21/38] selftests/net: convert rtnetlink.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./rtnetlink.sh
PASS: address proto IPv4
PASS: address proto IPv6

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/rtnetlink.sh | 21 -
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/net/rtnetlink.sh b/tools/testing/selftests/net/rtnetlink.sh
index 38be9706c45f..3c94faf735a8 100755
--- a/tools/testing/selftests/net/rtnetlink.sh
+++ b/tools/testing/selftests/net/rtnetlink.sh
@@ -517,9 +517,8 @@ kci_test_encap_fou()
 # test various encap methods, use netns to avoid unwanted interference
 kci_test_encap()
 {
-   testns="testns"
local ret=0
-   run_cmd ip netns add "$testns"
+   run_cmd setup_ns testns
if [ $? -ne 0 ]; then
end_test "SKIP encap tests: cannot add net namespace $testns"
return $ksft_skip
@@ -836,11 +835,10 @@ EOF
 
 kci_test_gretap()
 {
-   testns="testns"
DEV_NS=gretap00
local ret=0
 
-   run_cmd ip netns add "$testns"
+   run_cmd setup_ns testns
if [ $? -ne 0 ]; then
end_test "SKIP gretap tests: cannot add net namespace $testns"
return $ksft_skip
@@ -878,11 +876,10 @@ kci_test_gretap()
 
 kci_test_ip6gretap()
 {
-   testns="testns"
DEV_NS=ip6gretap00
local ret=0
 
-   run_cmd ip netns add "$testns"
+   run_cmd setup_ns testns
if [ $? -ne 0 ]; then
end_test "SKIP ip6gretap tests: cannot add net namespace $testns"
return $ksft_skip
@@ -920,7 +917,6 @@ kci_test_ip6gretap()
 
 kci_test_erspan()
 {
-   testns="testns"
DEV_NS=erspan00
local ret=0
run_cmd_grep "^Usage:" ip link help erspan
@@ -928,7 +924,7 @@ kci_test_erspan()
end_test "SKIP: erspan: iproute2 too old"
return $ksft_skip
fi
-   run_cmd ip netns add "$testns"
+   run_cmd setup_ns testns
if [ $? -ne 0 ]; then
end_test "SKIP erspan tests: cannot add net namespace $testns"
return $ksft_skip
@@ -970,7 +966,6 @@ kci_test_erspan()
 
 kci_test_ip6erspan()
 {
-   testns="testns"
DEV_NS=ip6erspan00
local ret=0
run_cmd_grep "^Usage:" ip link help ip6erspan
@@ -978,7 +973,7 @@ kci_test_ip6erspan()
end_test "SKIP: ip6erspan: iproute2 too old"
return $ksft_skip
fi
-   run_cmd ip netns add "$testns"
+   run_cmd setup_ns testns
if [ $? -ne 0 ]; then
end_test "SKIP ip6erspan tests: cannot add net namespace $testns"
return $ksft_skip
@@ -1022,8 +1017,6 @@ kci_test_ip6erspan()
 
 kci_test_fdb_get()
 {
-   IP="ip -netns testns"
-   BRIDGE="bridge -netns testns"
brdev="test-br0"
vxlandev="vxlan10"
test_mac=de:ad:be:ef:13:37
@@ -1037,11 +1030,13 @@ kci_test_fdb_get()
return $ksft_skip
fi
 
-   run_cmd ip netns add testns
+   run_cmd setup_ns testns
if [ $? -ne 0 ]; then
end_test "SKIP fdb get tests: cannot add net namespace $testns"
return $ksft_skip
fi
+   IP="ip -netns $testns"
+   BRIDGE="bridge -netns $testns"
run_cmd $IP link add "$vxlandev" type vxlan id 10 local $localip \
 dstport 4789
run_cmd $IP link add name "$brdev" type bridge
-- 
2.41.0
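
The last hunk moves the IP= and BRIDGE= assignments below the setup_ns call. A small sketch of why the order matters: the string is expanded when it is assigned, so the namespace variable has to hold its generated value first. `fake_setup_ns` and the name `testns-abc123` are made up for illustration.

```shell
#!/bin/bash
unset testns

# Hypothetical stand-in for setup_ns: assigns a generated name to "testns".
fake_setup_ns()
{
	testns="testns-abc123"
}

# Assigned before setup: $testns is still empty when the string is built.
IP_EARLY="ip -netns $testns"

fake_setup_ns

# Assigned after setup: the generated name is captured.
IP_LATE="ip -netns $testns"

echo "early: '${IP_EARLY}'"
echo "late:  '${IP_LATE}'"
```

The same reasoning applies to the other conversions in this series that move IP= or NS_EXEC= from the top of the script into setup().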

[PATCH net-next 20/38] selftests/net: convert fdb_flush.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.
]# ./fdb_flush.sh
TEST: vx10: Expected 5 FDB entries, got 5   [ OK ]
TEST: vx20: Expected 5 FDB entries, got 5   [ OK ]
...
TEST: vx10: Expected 5 FDB entries, got 5   [ OK ]
TEST: Test entries with dst 192.0.2.1   [ OK ]

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/fdb_flush.sh | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/net/fdb_flush.sh b/tools/testing/selftests/net/fdb_flush.sh
index 90e7a29e0476..d5e3abb8658c 100755
--- a/tools/testing/selftests/net/fdb_flush.sh
+++ b/tools/testing/selftests/net/fdb_flush.sh
@@ -5,6 +5,8 @@
 # Check that flush works as expected with all the supported arguments and verify
 # some combinations of arguments.
 
+source lib.sh
+
 FLUSH_BY_STATE_TESTS="
vxlan_test_flush_by_permanent
vxlan_test_flush_by_nopermanent
@@ -739,10 +741,9 @@ bridge_vxlan_test_flush()
 
 setup()
 {
-   IP="ip -netns ns1"
-   BRIDGE="bridge -netns ns1"
-
-   ip netns add ns1
+   setup_ns NS
+   IP="ip -netns ${NS}"
+   BRIDGE="bridge -netns ${NS}"
 
$IP link add name vx10 type vxlan id 1000 dstport "$VXPORT"
$IP link add name vx20 type vxlan id 2000 dstport "$VXPORT"
@@ -759,7 +760,7 @@ cleanup()
$IP link del dev vx20
$IP link del dev vx10
 
-   ip netns del ns1
+   cleanup_ns ${NS}
 }
 
 

-- 
2.41.0

[PATCH net-next 19/38] selftests/net: convert netns-name.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

This test moves the device to netns 1 (the namespace of PID 1). Add a new
namespace, test_ns, as the move target instead.
Here is the test result after conversion.

]# ./netns-name.sh
netns-name.sh   [  OK  ]

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/netns-name.sh | 44 +++
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/tools/testing/selftests/net/netns-name.sh b/tools/testing/selftests/net/netns-name.sh
index 7d3d3fc99461..6974474c26f3 100755
--- a/tools/testing/selftests/net/netns-name.sh
+++ b/tools/testing/selftests/net/netns-name.sh
@@ -1,9 +1,9 @@
 #!/bin/bash
 # SPDX-License-Identifier: GPL-2.0
 
+source lib.sh
 set -o pipefail
 
-NS=netns-name-test
 DEV=dummy-dev0
 DEV2=dummy-dev1
 ALT_NAME=some-alt-name
@@ -11,7 +11,7 @@ ALT_NAME=some-alt-name
 RET_CODE=0
 
 cleanup() {
-ip netns del $NS
+cleanup_ns $NS $test_ns
 }
 
 trap cleanup EXIT
@@ -21,50 +21,50 @@ fail() {
 RET_CODE=1
 }
 
-ip netns add $NS
+setup_ns NS test_ns
 
 #
 # Test basic move without a rename
 #
 ip -netns $NS link add name $DEV type dummy || fail
-ip -netns $NS link set dev $DEV netns 1 ||
+ip -netns $NS link set dev $DEV netns $test_ns ||
 fail "Can't perform a netns move"
-ip link show dev $DEV >> /dev/null || fail "Device not found after move"
-ip link del $DEV || fail
+ip -netns $test_ns link show dev $DEV >> /dev/null || fail "Device not found after move"
+ip -netns $test_ns link del $DEV || fail
 
 #
 # Test move with a conflict
 #
-ip link add name $DEV type dummy
+ip -netns $test_ns link add name $DEV type dummy
 ip -netns $NS link add name $DEV type dummy || fail
-ip -netns $NS link set dev $DEV netns 1 2> /dev/null &&
+ip -netns $NS link set dev $DEV netns $test_ns 2> /dev/null &&
 fail "Performed a netns move with a name conflict"
-ip link show dev $DEV >> /dev/null || fail "Device not found after move"
+ip -netns $test_ns link show dev $DEV >> /dev/null || fail "Device not found after move"
 ip -netns $NS link del $DEV || fail
-ip link del $DEV || fail
+ip -netns $test_ns link del $DEV || fail
 
 #
 # Test move with a conflict and rename
 #
-ip link add name $DEV type dummy
+ip -netns $test_ns link add name $DEV type dummy
 ip -netns $NS link add name $DEV type dummy || fail
-ip -netns $NS link set dev $DEV netns 1 name $DEV2 ||
+ip -netns $NS link set dev $DEV netns $test_ns name $DEV2 ||
 fail "Can't perform a netns move with rename"
-ip link del $DEV2 || fail
-ip link del $DEV || fail
+ip -netns $test_ns link del $DEV2 || fail
+ip -netns $test_ns link del $DEV || fail
 
 #
 # Test dup alt-name with netns move
 #
-ip link add name $DEV type dummy || fail
-ip link property add dev $DEV altname $ALT_NAME || fail
+ip -netns $test_ns link add name $DEV type dummy || fail
+ip -netns $test_ns link property add dev $DEV altname $ALT_NAME || fail
 ip -netns $NS link add name $DEV2 type dummy || fail
 ip -netns $NS link property add dev $DEV2 altname $ALT_NAME || fail
 
-ip -netns $NS link set dev $DEV2 netns 1 2> /dev/null &&
+ip -netns $NS link set dev $DEV2 netns $test_ns 2> /dev/null &&
 fail "Moved with alt-name dup"
 
-ip link del $DEV || fail
+ip -netns $test_ns link del $DEV || fail
 ip -netns $NS link del $DEV2 || fail
 
 #
@@ -72,11 +72,11 @@ ip -netns $NS link del $DEV2 || fail
 #
 ip -netns $NS link add name $DEV type dummy || fail
 ip -netns $NS link property add dev $DEV altname $ALT_NAME || fail
-ip -netns $NS link set dev $DEV netns 1 || fail
-ip link show dev $ALT_NAME >> /dev/null || fail "Can't find alt-name after move"
-ip  -netns $NS link show dev $ALT_NAME 2> /dev/null &&
+ip -netns $NS link set dev $DEV netns $test_ns || fail
+ip -netns $test_ns link show dev $ALT_NAME >> /dev/null || fail "Can't find alt-name after move"
+ip -netns $NS link show dev $ALT_NAME 2> /dev/null &&
 fail "Can still find alt-name after move"
-ip link del $DEV || fail
+ip -netns $test_ns link del $DEV || fail
 
 echo -ne "$(basename $0) \t\t\t\t"
 if [ $RET_CODE -eq 0 ]; then
-- 
2.41.0

[PATCH net-next 18/38] selftests/net: convert ndisc_unsolicited_na_test.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./ndisc_unsolicited_na_test.sh
TEST: test_unsolicited_na:  drop_unsolicited_na=0  accept_untracked_na=1  forwarding=1  [ OK ]
TEST: test_unsolicited_na:  drop_unsolicited_na=0  accept_untracked_na=0  forwarding=0  [ OK ]
TEST: test_unsolicited_na:  drop_unsolicited_na=0  accept_untracked_na=0  forwarding=1  [ OK ]
TEST: test_unsolicited_na:  drop_unsolicited_na=0  accept_untracked_na=1  forwarding=0  [ OK ]
TEST: test_unsolicited_na:  drop_unsolicited_na=1  accept_untracked_na=0  forwarding=0  [ OK ]
TEST: test_unsolicited_na:  drop_unsolicited_na=1  accept_untracked_na=0  forwarding=1  [ OK ]
TEST: test_unsolicited_na:  drop_unsolicited_na=1  accept_untracked_na=1  forwarding=0  [ OK ]
TEST: test_unsolicited_na:  drop_unsolicited_na=1  accept_untracked_na=1  forwarding=1  [ OK ]

Tests passed:   8
Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 .../net/ndisc_unsolicited_na_test.sh  | 19 +++
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/net/ndisc_unsolicited_na_test.sh b/tools/testing/selftests/net/ndisc_unsolicited_na_test.sh
index 86e621b7b9c7..5db69dad0cfc 100755
--- a/tools/testing/selftests/net/ndisc_unsolicited_na_test.sh
+++ b/tools/testing/selftests/net/ndisc_unsolicited_na_test.sh
@@ -10,16 +10,12 @@
 #01   0  Don't update NC
 #01   1  Add a STALE NC entry
 
+source lib.sh
 ret=0
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
 
 PAUSE_ON_FAIL=no
 PAUSE=no
 
-HOST_NS="ns-host"
-ROUTER_NS="ns-router"
-
 HOST_INTF="veth-host"
 ROUTER_INTF="veth-router"
 
@@ -29,11 +25,6 @@ SUBNET_WIDTH=64
 ROUTER_ADDR_WITH_MASK="${ROUTER_ADDR}/${SUBNET_WIDTH}"
 HOST_ADDR_WITH_MASK="${HOST_ADDR}/${SUBNET_WIDTH}"
 
-IP_HOST="ip -6 -netns ${HOST_NS}"
-IP_HOST_EXEC="ip netns exec ${HOST_NS}"
-IP_ROUTER="ip -6 -netns ${ROUTER_NS}"
-IP_ROUTER_EXEC="ip netns exec ${ROUTER_NS}"
-
 tcpdump_stdout=
 tcpdump_stderr=
 
@@ -76,8 +67,12 @@ setup()
 
# Setup two namespaces and a veth tunnel across them.
# On end of the tunnel is a router and the other end is a host.
-   ip netns add ${HOST_NS}
-   ip netns add ${ROUTER_NS}
+   setup_ns HOST_NS ROUTER_NS
+   IP_HOST="ip -6 -netns ${HOST_NS}"
+   IP_HOST_EXEC="ip netns exec ${HOST_NS}"
+   IP_ROUTER="ip -6 -netns ${ROUTER_NS}"
+   IP_ROUTER_EXEC="ip netns exec ${ROUTER_NS}"
+
${IP_ROUTER} link add ${ROUTER_INTF} type veth \
 peer name ${HOST_INTF} netns ${HOST_NS}
 
-- 
2.41.0

[PATCH net-next 17/38] selftests/net: convert l2tp.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./l2tp.sh
TEST: IPv4 basic L2TP tunnel[ OK ]
TEST: IPv4 route through L2TP tunnel[ OK ]
TEST: IPv6 basic L2TP tunnel[ OK ]
TEST: IPv6 route through L2TP tunnel[ OK ]
TEST: IPv4 basic L2TP tunnel - with IPsec   [ OK ]
TEST: IPv4 route through L2TP tunnel - with IPsec   [ OK ]
TEST: IPv6 basic L2TP tunnel - with IPsec   [ OK ]
TEST: IPv6 route through L2TP tunnel - with IPsec   [ OK ]
TEST: IPv4 basic L2TP tunnel[ OK ]
TEST: IPv4 route through L2TP tunnel[ OK ]
TEST: IPv6 basic L2TP tunnel - with IPsec   [ OK ]
TEST: IPv6 route through L2TP tunnel - with IPsec   [ OK ]
TEST: IPv4 basic L2TP tunnel - after IPsec teardown [ OK ]
TEST: IPv4 route through L2TP tunnel - after IPsec teardown [ OK ]
TEST: IPv6 basic L2TP tunnel - after IPsec teardown [ OK ]
TEST: IPv6 route through L2TP tunnel - after IPsec teardown [ OK ]

Tests passed:  16
Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/l2tp.sh | 130 +---
 1 file changed, 62 insertions(+), 68 deletions(-)

diff --git a/tools/testing/selftests/net/l2tp.sh b/tools/testing/selftests/net/l2tp.sh
index 5782433886fc..88de7166c8ae 100755
--- a/tools/testing/selftests/net/l2tp.sh
+++ b/tools/testing/selftests/net/l2tp.sh
@@ -13,6 +13,7 @@
 #10.1.1.1||   10.1.2.1
 #  2001:db8:1::1 || 2001:db8:2::1
 
+source lib.sh
 VERBOSE=0
 PAUSE_ON_FAIL=no
 
@@ -80,9 +81,6 @@ create_ns()
[ -z "${addr}" ] && addr="-"
[ -z "${addr6}" ] && addr6="-"
 
-   ip netns add ${ns}
-
-   ip -netns ${ns} link set lo up
if [ "${addr}" != "-" ]; then
ip -netns ${ns} addr add dev lo ${addr}
fi
@@ -133,12 +131,7 @@ connect_ns()
 
 cleanup()
 {
-   local ns
-
-   for ns in host-1 host-2 router
-   do
-   ip netns del ${ns} 2>/dev/null
-   done
+   cleanup_ns $host_1 $host_2 $router
 }
 
 setup_l2tp_ipv4()
@@ -146,28 +139,28 @@ setup_l2tp_ipv4()
#
# configure l2tpv3 tunnel on host-1
#
-   ip -netns host-1 l2tp add tunnel tunnel_id 1041 peer_tunnel_id 1042 \
+   ip -netns $host_1 l2tp add tunnel tunnel_id 1041 peer_tunnel_id 1042 \
 encap ip local 10.1.1.1 remote 10.1.2.1
-   ip -netns host-1 l2tp add session name l2tp4 tunnel_id 1041 \
+   ip -netns $host_1 l2tp add session name l2tp4 tunnel_id 1041 \
 session_id 1041 peer_session_id 1042
-   ip -netns host-1 link set dev l2tp4 up
-   ip -netns host-1 addr add dev l2tp4 172.16.1.1 peer 172.16.1.2
+   ip -netns $host_1 link set dev l2tp4 up
+   ip -netns $host_1 addr add dev l2tp4 172.16.1.1 peer 172.16.1.2
 
#
# configure l2tpv3 tunnel on host-2
#
-   ip -netns host-2 l2tp add tunnel tunnel_id 1042 peer_tunnel_id 1041 \
+   ip -netns $host_2 l2tp add tunnel tunnel_id 1042 peer_tunnel_id 1041 \
 encap ip local 10.1.2.1 remote 10.1.1.1
-   ip -netns host-2 l2tp add session name l2tp4 tunnel_id 1042 \
+   ip -netns $host_2 l2tp add session name l2tp4 tunnel_id 1042 \
 session_id 1042 peer_session_id 1041
-   ip -netns host-2 link set dev l2tp4 up
-   ip -netns host-2 addr add dev l2tp4 172.16.1.2 peer 172.16.1.1
+   ip -netns $host_2 link set dev l2tp4 up
+   ip -netns $host_2 addr add dev l2tp4 172.16.1.2 peer 172.16.1.1
 
#
# add routes to loopback addresses
#
-   ip -netns host-1 ro add 172.16.101.2/32 via 172.16.1.2
-   ip -netns host-2 ro add 172.16.101.1/32 via 172.16.1.1
+   ip -netns $host_1 ro add 172.16.101.2/32 via 172.16.1.2
+   ip -netns $host_2 ro add 172.16.101.1/32 via 172.16.1.1
 }
 
 setup_l2tp_ipv6()
@@ -175,28 +168,28 @@ setup_l2tp_ipv6()
#
# configure l2tpv3 tunnel on host-1
#
-   ip -netns host-1 l2tp add tunnel tunnel_id 1061 peer_tunnel_id 1062 \
+   ip -netns $host_1 l2tp add tunnel tunnel_id 1061 peer_tunnel_id 1062 \
 encap ip local 2001:db8:1::1 remote 2001:db8:2::1
-   ip -netns host-1 l2tp add session name l2tp6 tunnel_id 1061 \
+   ip -netns $host_1 l2tp add session name l2tp6 tunnel_id 1061 \
 session_id 1061 peer_session_id 1062
-   ip -netns host-1 link set dev l2tp6 up
-   ip -netns host-1 addr add dev l2tp6 fc00:1::1 peer fc00:1::2
+   ip -netns $host_1 link set dev l2tp6 up
+   ip -netns $host_1 addr add dev l2tp6 fc00:1::1 peer fc00:1::2
 
#

[PATCH net-next 16/38] selftests/net: convert ioam6.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./ioam6.sh

--
OUTPUT tests
--
TEST: Unknown IOAM namespace (inline mode)  [ OK ]
TEST: Unknown IOAM namespace (encap mode)   [ OK ]
TEST: Missing trace room (inline mode)  [ OK ]
TEST: Missing trace room (encap mode)   [ OK ]
TEST: Trace type with bit 0 only (inline mode)  [ OK ]
...
TEST: Full supported trace (encap mode) [ OK ]

--
GLOBAL tests
--
TEST: Forward - Full supported trace (inline mode)  [ OK ]
TEST: Forward - Full supported trace (encap mode)   [ OK ]

- Tests passed: 88
- Tests failed: 0

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/ioam6.sh | 247 +--
 1 file changed, 121 insertions(+), 126 deletions(-)

diff --git a/tools/testing/selftests/net/ioam6.sh b/tools/testing/selftests/net/ioam6.sh
index 4ceb401da1bf..c2ea3ed43a93 100755
--- a/tools/testing/selftests/net/ioam6.sh
+++ b/tools/testing/selftests/net/ioam6.sh
@@ -117,8 +117,7 @@
 #| Schema Data | |
 #+---+
 
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
+source lib.sh
 
 

 #  
#
@@ -195,32 +194,32 @@ TESTS_GLOBAL="
 
 check_kernel_compatibility()
 {
-  ip netns add ioam-tmp-node
-  ip link add name veth0 netns ioam-tmp-node type veth \
- peer name veth1 netns ioam-tmp-node
+  setup_ns ioam_tmp_node
+  ip link add name veth0 netns $ioam_tmp_node type veth \
+ peer name veth1 netns $ioam_tmp_node
 
-  ip -netns ioam-tmp-node link set veth0 up
-  ip -netns ioam-tmp-node link set veth1 up
+  ip -netns $ioam_tmp_node link set veth0 up
+  ip -netns $ioam_tmp_node link set veth1 up
 
-  ip -netns ioam-tmp-node ioam namespace add 0
+  ip -netns $ioam_tmp_node ioam namespace add 0
   ns_ad=$?
 
-  ip -netns ioam-tmp-node ioam namespace show | grep -q "namespace 0"
+  ip -netns $ioam_tmp_node ioam namespace show | grep -q "namespace 0"
   ns_sh=$?
 
   if [[ $ns_ad != 0 || $ns_sh != 0 ]]
   then
 echo "SKIP: kernel version probably too old, missing ioam support"
 ip link del veth0 2>/dev/null || true
-ip netns del ioam-tmp-node || true
+ip netns del $ioam_tmp_node || true
 exit $ksft_skip
   fi
 
-  ip -netns ioam-tmp-node route add db02::/64 encap ioam6 mode inline \
+  ip -netns $ioam_tmp_node route add db02::/64 encap ioam6 mode inline \
  trace prealloc type 0x80 ns 0 size 4 dev veth0
   tr_ad=$?
 
-  ip -netns ioam-tmp-node -6 route | grep -q "encap ioam6"
+  ip -netns $ioam_tmp_node -6 route | grep -q "encap ioam6"
   tr_sh=$?
 
   if [[ $tr_ad != 0 || $tr_sh != 0 ]]
@@ -228,12 +227,12 @@ check_kernel_compatibility()
 echo "SKIP: cannot attach an ioam trace to a route, did you compile" \
  "without CONFIG_IPV6_IOAM6_LWTUNNEL?"
 ip link del veth0 2>/dev/null || true
-ip netns del ioam-tmp-node || true
+ip netns del $ioam_tmp_node || true
 exit $ksft_skip
   fi
 
   ip link del veth0 2>/dev/null || true
-  ip netns del ioam-tmp-node || true
+  ip netns del $ioam_tmp_node || true
 
   lsmod | grep -q "ip6_tunnel"
   ip6tnl_loaded=$?
@@ -265,9 +264,7 @@ cleanup()
   ip link del ioam-veth-alpha 2>/dev/null || true
   ip link del ioam-veth-gamma 2>/dev/null || true
 
-  ip netns del ioam-node-alpha || true
-  ip netns del ioam-node-beta || true
-  ip netns del ioam-node-gamma || true
+  cleanup_ns $ioam_node_alpha $ioam_node_beta $ioam_node_gamma
 
   if [ $ip6tnl_loaded != 0 ]
   then
@@ -277,69 +274,67 @@ cleanup()
 
 setup()
 {
-  ip netns add ioam-node-alpha
-  ip netns add ioam-node-beta
-  ip netns add ioam-node-gamma
-
-  ip link add name ioam-veth-alpha netns ioam-node-alpha type veth \
- peer name ioam-veth-betaL netns ioam-node-beta
-  ip link add name ioam-veth-betaR netns ioam-node-beta type veth \
- peer name ioam-veth-gamma netns ioam-node-gamma
-
-  ip -netns ioam-node-alpha link set ioam-veth-alpha name veth0
-  ip -netns ioam-node-beta link set ioam-veth-betaL name veth0
-  ip -netns ioam-node-beta link set ioam-veth-betaR name veth1
-  ip -netns ioam-node-gamma link set ioam-veth-gamma name veth0
-
-  ip -netns ioam-node-alpha addr add db01::2/64 dev veth0
-  ip -netns ioam-node-alpha link set veth0 up
-  ip -netns ioam-node-alpha link set lo up
-  ip -netns ioam-node-alpha route add db02::/64 via

[PATCH net-next 15/38] selftests/net: convert icmp.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./icmp.sh
OK

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/icmp.sh | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/net/icmp.sh b/tools/testing/selftests/net/icmp.sh
index e4b04cd1644a..824cb0e35eff 100755
--- a/tools/testing/selftests/net/icmp.sh
+++ b/tools/testing/selftests/net/icmp.sh
@@ -18,8 +18,8 @@
 # that address space, so the kernel should substitute the dummy address
 # 192.0.0.8 defined in RFC7600.
 
-NS1=ns1
-NS2=ns2
+source lib.sh
+
 H1_IP=172.16.0.1/32
 H1_IP6=2001:db8:1::1
 RT1=172.16.1.0/24
@@ -32,15 +32,13 @@ TMPFILE=$(mktemp)
 cleanup()
 {
 rm -f "$TMPFILE"
-ip netns del $NS1
-ip netns del $NS2
+cleanup_ns $NS1 $NS2
 }
 
 trap cleanup EXIT
 
 # Namespaces
-ip netns add $NS1
-ip netns add $NS2
+setup_ns NS1 NS2
 
 # Connectivity
 ip -netns $NS1 link add veth0 type veth peer name veth0 netns $NS2
-- 
2.41.0
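
icmp.sh relies on `trap cleanup EXIT` so that the namespaces are removed even when the test fails. Here is a small sketch of that pattern with cleanup_ns stubbed out so nothing is touched on the host. The namespace names and the `run_demo` wrapper are assumptions for illustration only.

```shell
#!/bin/bash
# Stub: the real cleanup_ns would delete the namespaces; this one only
# reports what it was asked to remove.
cleanup_ns()
{
	echo "removing: $*"
}

run_demo()
(
	# Subshell body: the EXIT trap fires when this subshell exits,
	# regardless of the exit status.
	NS1="ns1-aaaaaa"
	NS2="ns2-bbbbbb"
	trap 'cleanup_ns $NS1 $NS2' EXIT
	echo "test body"
	exit 1
)

rc=0
output=$(run_demo) || rc=$?
echo "$output"
echo "exit status: $rc"
```

The trap body is single-quoted so that $NS1 and $NS2 are expanded when the trap runs, after they have been assigned.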

[PATCH net-next 14/38] selftests/net: convert icmp_redirect.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

 # ./icmp_redirect.sh

 ###
 Legacy routing
 ###

 TEST: IPv4: redirect exception  [ OK ]

 ...

 TEST: IPv4: mtu exception plus redirect [ OK ]
 TEST: IPv6: mtu exception plus redirect [ OK ]

 Tests passed:  40
 Tests failed:   0
 Tests xfailed:   0

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/icmp_redirect.sh | 182 +--
 1 file changed, 88 insertions(+), 94 deletions(-)

diff --git a/tools/testing/selftests/net/icmp_redirect.sh b/tools/testing/selftests/net/icmp_redirect.sh
index 7b9d6e31b8e7..d6f0e449c029 100755
--- a/tools/testing/selftests/net/icmp_redirect.sh
+++ b/tools/testing/selftests/net/icmp_redirect.sh
@@ -19,6 +19,7 @@
 # Route on r1 changed to go to r2 via eth0. This causes a redirect to be sent
 # from r1 to h1 telling h1 to use r2 when talking to h2.
 
+source lib.sh
 VERBOSE=0
 PAUSE_ON_FAIL=no
 
@@ -140,11 +141,7 @@ get_linklocal()
 
 cleanup()
 {
-   local ns
-
-   for ns in h1 h2 r1 r2; do
-   ip netns del $ns 2>/dev/null
-   done
+   cleanup_ns $h1 $h2 $r1 $r2
 }
 
 create_vrf()
@@ -171,102 +168,99 @@ setup()
 
#
# create nodes as namespaces
-   #
-   for ns in h1 h2 r1 r2; do
-   ip netns add $ns
-   ip -netns $ns li set lo up
-
-   case "${ns}" in
-   h[12]) ip netns exec $ns sysctl -q -w net.ipv4.conf.all.accept_redirects=1
-  ip netns exec $ns sysctl -q -w net.ipv6.conf.all.forwarding=0
-  ip netns exec $ns sysctl -q -w net.ipv6.conf.all.accept_redirects=1
-  ip netns exec $ns sysctl -q -w net.ipv6.conf.all.keep_addr_on_down=1
-   ;;
-   r[12]) ip netns exec $ns sysctl -q -w net.ipv4.ip_forward=1
-  ip netns exec $ns sysctl -q -w net.ipv4.conf.all.send_redirects=1
-  ip netns exec $ns sysctl -q -w net.ipv4.conf.default.rp_filter=0
-  ip netns exec $ns sysctl -q -w net.ipv4.conf.all.rp_filter=0
-
-  ip netns exec $ns sysctl -q -w net.ipv6.conf.all.forwarding=1
-  ip netns exec $ns sysctl -q -w net.ipv6.route.mtu_expires=10
-   esac
+   setup_ns h1 h2 r1 r2
+   for ns in $h1 $h2 $r1 $r2; do
+   if echo $ns | grep -q h[12]-; then
+   ip netns exec $ns sysctl -q -w net.ipv4.conf.all.accept_redirects=1
+   ip netns exec $ns sysctl -q -w net.ipv6.conf.all.forwarding=0
+   ip netns exec $ns sysctl -q -w net.ipv6.conf.all.accept_redirects=1
+   ip netns exec $ns sysctl -q -w net.ipv6.conf.all.keep_addr_on_down=1
+   else
+   ip netns exec $ns sysctl -q -w net.ipv4.ip_forward=1
+   ip netns exec $ns sysctl -q -w net.ipv4.conf.all.send_redirects=1
+   ip netns exec $ns sysctl -q -w net.ipv4.conf.default.rp_filter=0
+   ip netns exec $ns sysctl -q -w net.ipv4.conf.all.rp_filter=0
+
+   ip netns exec $ns sysctl -q -w net.ipv6.conf.all.forwarding=1
+   ip netns exec $ns sysctl -q -w net.ipv6.route.mtu_expires=10
+   fi
+   fi
done
 
#
# create interconnects
#
-   ip -netns h1 li add eth0 type veth peer name r1h1
-   ip -netns h1 li set r1h1 netns r1 name eth0 up
+   ip -netns $h1 li add eth0 type veth peer name r1h1
+   ip -netns $h1 li set r1h1 netns $r1 name eth0 up
 
-   ip -netns h1 li add eth1 type veth peer name r2h1
-   ip -netns h1 li set r2h1 netns r2 name eth0 up
+   ip -netns $h1 li add eth1 type veth peer name r2h1
+   ip -netns $h1 li set r2h1 netns $r2 name eth0 up
 
-   ip -netns h2 li add eth0 type veth peer name r2h2
-   ip -netns h2 li set eth0 up
-   ip -netns h2 li set r2h2 netns r2 name eth2 up
+   ip -netns $h2 li add eth0 type veth peer name r2h2
+   ip -netns $h2 li set eth0 up
+   ip -netns $h2 li set r2h2 netns $r2 name eth2 up
 
-   ip -netns r1 li add eth1 type veth peer name r2r1
-   ip -netns r1 li set eth1 up
-   ip -netns r1 li set r2r1 netns r2 name eth1 up
+   ip -netns $r1 li add eth1 type veth peer name r2r1
+   ip -netns $r1 li set eth1 up
+   ip -netns $r1 li set r2r1 netns $r2 name eth1 up
 
#
# h1
#
if [ "${WITH_VRF}" = "yes" ]; then
-   create_vrf "h1"
+   create_vrf "$h1"
H1_VRF_ARG="vrf ${VRF}"
H1_PING_ARG="-I ${VRF}"
else
H1_VRF_ARG=
H1_PING_ARG=
fi
-
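
On the host/router split in setup(): the `grep -q h[12]-` check implies the generated names look like `h1-<suffix>`. A possible alternative, sketched here under that assumption, is a `case` pattern on the generated name, which avoids the pipe and the unquoted glob. `node_role` and the sample names are hypothetical.

```shell
#!/bin/bash
# Sketch: classify generated names such as "h1-Ab3xYz" by prefix using
# a case pattern instead of piping through grep. Names are made up.
node_role()
{
	case "$1" in
	h[12]-*) echo host ;;
	r[12]-*) echo router ;;
	*)       echo unknown ;;
	esac
}

for ns in h1-Ab3xYz h2-Qw9eRt r1-Zx8cVb r2-Mn7bVc; do
	echo "${ns}: $(node_role "$ns")"
done
```

This keeps the shape of the original `case "${ns}" in h[12]) ... r[12]) ...` block while matching the suffixed names.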

[PATCH net-next 12/38] selftests/net: convert fib_tests.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./fib_tests.sh

Single path route test
Start point
TEST: IPv4 fibmatch [ OK ]

...

Fib6 garbage collection test
TEST: ipv6 route garbage collection [ OK ]

IPv4 multipath list receive tests
TEST: Multipath route hit ratio (1.00)  [ OK ]

IPv6 multipath list receive tests
TEST: Multipath route hit ratio (1.00)  [ OK ]

Tests passed: 225
Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/fib_tests.sh | 184 +++
 1 file changed, 87 insertions(+), 97 deletions(-)

diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
index 66d0db7a2614..b3ecccbbfcd2 100755
--- a/tools/testing/selftests/net/fib_tests.sh
+++ b/tools/testing/selftests/net/fib_tests.sh
@@ -3,10 +3,8 @@
 
 # This test is for checking IPv4 and IPv6 FIB behavior in response to
 # different events.
-
+source lib.sh
 ret=0
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
 
 # all tests in this script. Can be overridden with -t option
 TESTS="unregister down carrier nexthop suppress ipv6_notify ipv4_notify \
@@ -18,8 +16,6 @@ TESTS="unregister down carrier nexthop suppress ipv6_notify ipv4_notify \
 VERBOSE=0
 PAUSE_ON_FAIL=no
 PAUSE=no
-IP="$(which ip) -netns ns1"
-NS_EXEC="$(which ip) netns exec ns1"
 
 which ping6 > /dev/null 2>&1 && ping6=$(which ping6) || ping6=$(which ping)
 
@@ -55,11 +51,11 @@ log_test()
 setup()
 {
set -e
-   ip netns add ns1
-   ip netns set ns1 auto
-   $IP link set dev lo up
-   ip netns exec ns1 sysctl -qw net.ipv4.ip_forward=1
-   ip netns exec ns1 sysctl -qw net.ipv6.conf.all.forwarding=1
+   setup_ns ns1
+   IP="$(which ip) -netns $ns1"
+   NS_EXEC="$(which ip) netns exec $ns1"
+   ip netns exec $ns1 sysctl -qw net.ipv4.ip_forward=1
+   ip netns exec $ns1 sysctl -qw net.ipv6.conf.all.forwarding=1
 
$IP link add dummy0 type dummy
$IP link set dev dummy0 up
@@ -72,8 +68,7 @@ setup()
 cleanup()
 {
$IP link del dev dummy0 &> /dev/null
-   ip netns del ns1 &> /dev/null
-   ip netns del ns2 &> /dev/null
+   cleanup_ns $ns1 $ns2
 }
 
 get_linklocal()
@@ -448,28 +443,25 @@ fib_rp_filter_test()
setup
 
set -e
-   ip netns add ns2
-   ip netns set ns2 auto
-
-   ip -netns ns2 link set dev lo up
+   setup_ns ns2
 
$IP link add name veth1 type veth peer name veth2
-   $IP link set dev veth2 netns ns2
+   $IP link set dev veth2 netns $ns2
$IP address add 192.0.2.1/24 dev veth1
-   ip -netns ns2 address add 192.0.2.1/24 dev veth2
+   ip -netns $ns2 address add 192.0.2.1/24 dev veth2
$IP link set dev veth1 up
-   ip -netns ns2 link set dev veth2 up
+   ip -netns $ns2 link set dev veth2 up
 
$IP link set dev lo address 52:54:00:6a:c7:5e
$IP link set dev veth1 address 52:54:00:6a:c7:5e
-   ip -netns ns2 link set dev lo address 52:54:00:6a:c7:5e
-   ip -netns ns2 link set dev veth2 address 52:54:00:6a:c7:5e
+   ip -netns $ns2 link set dev lo address 52:54:00:6a:c7:5e
+   ip -netns $ns2 link set dev veth2 address 52:54:00:6a:c7:5e
 
# 1. (ns2) redirect lo's egress to veth2's egress
-   ip netns exec ns2 tc qdisc add dev lo parent root handle 1: fq_codel
-   ip netns exec ns2 tc filter add dev lo parent 1: protocol arp basic \
+   ip netns exec $ns2 tc qdisc add dev lo parent root handle 1: fq_codel
+   ip netns exec $ns2 tc filter add dev lo parent 1: protocol arp basic \
action mirred egress redirect dev veth2
-   ip netns exec ns2 tc filter add dev lo parent 1: protocol ip basic \
+   ip netns exec $ns2 tc filter add dev lo parent 1: protocol ip basic \
action mirred egress redirect dev veth2
 
# 2. (ns1) redirect veth1's ingress to lo's ingress
@@ -487,24 +479,24 @@ fib_rp_filter_test()
action mirred egress redirect dev veth1
 
# 4. (ns2) redirect veth2's ingress to lo's ingress
-   ip netns exec ns2 tc qdisc add dev veth2 ingress
-   ip netns exec ns2 tc filter add dev veth2 ingress protocol arp basic \
+   ip netns exec $ns2 tc qdisc add dev veth2 ingress
+   ip netns exec $ns2 tc filter add dev veth2 ingress protocol arp basic \
action mirred ingress redirect dev lo
-   ip netns exec ns2 tc filter add dev veth2 ingress protocol ip basic \
+   ip netns exec $ns2 tc filter add dev veth2 ingress protocol ip basic \
action mirred ingress redirect dev lo
 
$NS_EXEC sysctl -qw net.ipv4.conf.all.rp_filter=1
$NS_EXEC sysctl -qw net.ipv4.conf.all.accept_local=1
$NS_EXEC sysctl -qw net.ipv4.conf.all.route_localnet=1
-   ip netns exec ns2 sysctl -qw

[PATCH net-next 13/38] selftests/net: convert gre_gso.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./gre_gso.sh
TEST: GREv6/v4 - copy file w/ TSO   [ OK ]
TEST: GREv6/v4 - copy file w/ GSO   [ OK ]
TEST: GREv6/v6 - copy file w/ TSO   [ OK ]
TEST: GREv6/v6 - copy file w/ GSO   [ OK ]

Tests passed:   4
Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/gre_gso.sh | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/tools/testing/selftests/net/gre_gso.sh b/tools/testing/selftests/net/gre_gso.sh
index 3224651db97b..5100d90f92d2 100755
--- a/tools/testing/selftests/net/gre_gso.sh
+++ b/tools/testing/selftests/net/gre_gso.sh
@@ -2,10 +2,8 @@
 # SPDX-License-Identifier: GPL-2.0
 
 # This test is for checking GRE GSO.
-
+source lib.sh
 ret=0
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
 
 # all tests in this script. Can be overridden with -t option
 TESTS="gre_gso"
@@ -13,8 +11,6 @@ TESTS="gre_gso"
 VERBOSE=0
 PAUSE_ON_FAIL=no
 PAUSE=no
-IP="ip -netns ns1"
-NS_EXEC="ip netns exec ns1"
 TMPFILE=`mktemp`
 PID=
 
@@ -50,13 +46,13 @@ log_test()
 setup()
 {
set -e
-   ip netns add ns1
-   ip netns set ns1 auto
-   $IP link set dev lo up
+   setup_ns ns1
+   IP="ip -netns $ns1"
+   NS_EXEC="ip netns exec $ns1"
 
ip link add veth0 type veth peer name veth1
ip link set veth0 up
-   ip link set veth1 netns ns1
+   ip link set veth1 netns $ns1
$IP link set veth1 name veth0
$IP link set veth0 up
 
@@ -70,7 +66,7 @@ cleanup()
[ -n "$PID" ] && kill $PID
ip link del dev gre1 &> /dev/null
ip link del dev veth0 &> /dev/null
-   ip netns del ns1
+   cleanup_ns $ns1
 }
 
 get_linklocal()
@@ -145,7 +141,7 @@ gre6_gso_test()
setup
 
a1=$(get_linklocal veth0)
-   a2=$(get_linklocal veth0 ns1)
+   a2=$(get_linklocal veth0 $ns1)
 
gre_create_tun $a1 $a2
 
-- 
2.41.0

[PATCH net-next 11/38] selftests/net: convert fib_rule_tests.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./fib_rule_tests.sh

TEST: rule6 check: oif redirect to table  [ OK ]

...

TEST: rule4 dsfield tcp connect (dsfield 0x07)[ OK ]

Tests passed:  66
Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/fib_rule_tests.sh | 36 +--
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/tools/testing/selftests/net/fib_rule_tests.sh b/tools/testing/selftests/net/fib_rule_tests.sh
index 63c3eaec8d30..2ff8534fe353 100755
--- a/tools/testing/selftests/net/fib_rule_tests.sh
+++ b/tools/testing/selftests/net/fib_rule_tests.sh
@@ -3,14 +3,9 @@
 
 # This test is for checking IPv4 and IPv6 FIB rules API
 
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
-
+source lib.sh
 ret=0
-
 PAUSE_ON_FAIL=${PAUSE_ON_FAIL:=no}
-IP="ip -netns testns"
-IP_PEER="ip -netns peerns"
 
 RTABLE=100
 RTABLE_PEER=101
@@ -84,8 +79,8 @@ check_nettest()
 setup()
 {
set -e
-   ip netns add testns
-   $IP link set dev lo up
+   setup_ns testns
+   IP="ip -netns $testns"
 
$IP link add dummy0 type dummy
$IP link set dev dummy0 up
@@ -98,18 +93,19 @@ setup()
 cleanup()
 {
$IP link del dev dummy0 &> /dev/null
-   ip netns del testns
+   ip netns del $testns
 }
 
 setup_peer()
 {
set -e
 
-   ip netns add peerns
+   setup_ns peerns
+   IP_PEER="ip -netns $peerns"
$IP_PEER link set dev lo up
 
-   ip link add name veth0 netns testns type veth \
-   peer name veth1 netns peerns
+   ip link add name veth0 netns $testns type veth \
+   peer name veth1 netns $peerns
$IP link set dev veth0 up
$IP_PEER link set dev veth1 up
 
@@ -131,7 +127,7 @@ setup_peer()
 cleanup_peer()
 {
$IP link del dev veth0
-   ip netns del peerns
+   ip netns del $peerns
 }
 
 fib_check_iproute_support()
@@ -270,11 +266,11 @@ fib_rule6_connect_test()
# (Not-ECT: 0, ECT(1): 1, ECT(0): 2, CE: 3).
# The ECN bits shouldn't influence the result of the test.
for dsfield in 0x04 0x05 0x06 0x07; do
-   nettest -q -6 -B -t 5 -N testns -O peerns -U -D \
+   nettest -q -6 -B -t 5 -N $testns -O $peerns -U -D \
-Q "${dsfield}" -l 2001:db8::1:11 -r 2001:db8::1:11
log_test $? 0 "rule6 dsfield udp connect (dsfield ${dsfield})"
 
-   nettest -q -6 -B -t 5 -N testns -O peerns -Q "${dsfield}" \
+   nettest -q -6 -B -t 5 -N $testns -O $peerns -Q "${dsfield}" \
-l 2001:db8::1:11 -r 2001:db8::1:11
log_test $? 0 "rule6 dsfield tcp connect (dsfield ${dsfield})"
done
@@ -337,11 +333,11 @@ fib_rule4_test()
 
# need enable forwarding and disable rp_filter temporarily as all the
# addresses are in the same subnet and egress device == ingress device.
-   ip netns exec testns sysctl -qw net.ipv4.ip_forward=1
-   ip netns exec testns sysctl -qw net.ipv4.conf.$DEV.rp_filter=0
+   ip netns exec $testns sysctl -qw net.ipv4.ip_forward=1
+   ip netns exec $testns sysctl -qw net.ipv4.conf.$DEV.rp_filter=0
match="from $SRC_IP iif $DEV"
fib_rule4_test_match_n_redirect "$match" "$match" "iif redirect to table"
-   ip netns exec testns sysctl -qw net.ipv4.ip_forward=0
+   ip netns exec $testns sysctl -qw net.ipv4.ip_forward=0
 
# Reject dsfield (tos) options which have ECN bits set
for cnt in $(seq 1 3); do
@@ -407,11 +403,11 @@ fib_rule4_connect_test()
# (Not-ECT: 0, ECT(1): 1, ECT(0): 2, CE: 3).
# The ECN bits shouldn't influence the result of the test.
for dsfield in 0x04 0x05 0x06 0x07; do
-   nettest -q -B -t 5 -N testns -O peerns -D -U -Q "${dsfield}" \
+   nettest -q -B -t 5 -N $testns -O $peerns -D -U -Q "${dsfield}" \
-l 198.51.100.11 -r 198.51.100.11
log_test $? 0 "rule4 dsfield udp connect (dsfield ${dsfield})"
 
-   nettest -q -B -t 5 -N testns -O peerns -Q "${dsfield}" \
+   nettest -q -B -t 5 -N $testns -O $peerns -Q "${dsfield}" \
-l 198.51.100.11 -r 198.51.100.11
log_test $? 0 "rule4 dsfield tcp connect (dsfield ${dsfield})"
done
-- 
2.41.0

[PATCH net-next 10/38] selftests/net: convert fib-onlink-tests.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Remove PEER_CMD, which is not used in this test.

Here is the test result after conversion.

 ]# ./fib-onlink-tests.sh
 Error: ipv4: FIB table does not exist.
 Flush terminated
 Error: ipv6: FIB table does not exist.
 Flush terminated

 
 Configuring interfaces

   ...

 TEST: Gateway resolves to wrong nexthop device - VRF  [ OK ]

 Tests passed:  38
 Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/fib-onlink-tests.sh | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/net/fib-onlink-tests.sh b/tools/testing/selftests/net/fib-onlink-tests.sh
index c287b90b8af8..8b04f8282480 100755
--- a/tools/testing/selftests/net/fib-onlink-tests.sh
+++ b/tools/testing/selftests/net/fib-onlink-tests.sh
@@ -3,6 +3,7 @@
 
 # IPv4 and IPv6 onlink tests
 
+source lib.sh
 PAUSE_ON_FAIL=${PAUSE_ON_FAIL:=no}
 VERBOSE=0
 
@@ -74,9 +75,6 @@ TEST_NET4IN6[2]=10.2.1.254
 # mcast address
 MCAST6=ff02::1
 
-
-PEER_NS=bart
-PEER_CMD="ip netns exec ${PEER_NS}"
 VRF=lisa
 VRF_TABLE=1101
 PBR_TABLE=101
@@ -176,8 +174,7 @@ setup()
set -e
 
# create namespace
-   ip netns add ${PEER_NS}
-   ip -netns ${PEER_NS} li set lo up
+   setup_ns PEER_NS
 
# add vrf table
ip li add ${VRF} type vrf table ${VRF_TABLE}
-- 
2.41.0

[PATCH net-next 09/38] selftests/net: convert fib_nexthops.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./fib_nexthops.sh

Basic functional tests
--
TEST: List with nothing defined [ OK ]
TEST: Nexthop get on non-existent id[ OK ]

...

TEST: IPv6 resilient nexthop group torture test [ OK ]

Tests passed: 234
Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/fib_nexthops.sh | 142 ++--
 1 file changed, 69 insertions(+), 73 deletions(-)

diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh
index a6f2c0b9555d..d5a281aadbac 100755
--- a/tools/testing/selftests/net/fib_nexthops.sh
+++ b/tools/testing/selftests/net/fib_nexthops.sh
@@ -14,6 +14,7 @@
 # objects. Device reference counts and network namespace cleanup tested
 # by use of network namespace for peer.
 
+source lib.sh
 ret=0
 # Kselftest framework requirement - SKIP code is 4.
 ksft_skip=4
@@ -148,13 +149,7 @@ create_ns()
 {
local n=${1}
 
-   ip netns del ${n} 2>/dev/null
-
set -e
-   ip netns add ${n}
-   ip netns set ${n} $((nsid++))
-   ip -netns ${n} addr add 127.0.0.1/8 dev lo
-   ip -netns ${n} link set lo up
 
ip netns exec ${n} sysctl -qw net.ipv4.ip_forward=1
ip netns exec ${n} sysctl -qw net.ipv4.fib_multipath_use_neigh=1
@@ -173,12 +168,13 @@ setup()
 {
cleanup
 
-   create_ns me
-   create_ns peer
-   create_ns remote
+   setup_ns me peer remote
+   create_ns $me
+   create_ns $peer
+   create_ns $remote
 
-   IP="ip -netns me"
-   BRIDGE="bridge -netns me"
+   IP="ip -netns $me"
+   BRIDGE="bridge -netns $me"
set -e
$IP li add veth1 type veth peer name veth2
$IP li set veth1 up
@@ -190,24 +186,24 @@ setup()
$IP addr add 172.16.2.1/24 dev veth3
$IP -6 addr add 2001:db8:92::1/64 dev veth3 nodad
 
-   $IP li set veth2 netns peer up
-   ip -netns peer addr add 172.16.1.2/24 dev veth2
-   ip -netns peer -6 addr add 2001:db8:91::2/64 dev veth2 nodad
+   $IP li set veth2 netns $peer up
+   ip -netns $peer addr add 172.16.1.2/24 dev veth2
+   ip -netns $peer -6 addr add 2001:db8:91::2/64 dev veth2 nodad
 
-   $IP li set veth4 netns peer up
-   ip -netns peer addr add 172.16.2.2/24 dev veth4
-   ip -netns peer -6 addr add 2001:db8:92::2/64 dev veth4 nodad
+   $IP li set veth4 netns $peer up
+   ip -netns $peer addr add 172.16.2.2/24 dev veth4
+   ip -netns $peer -6 addr add 2001:db8:92::2/64 dev veth4 nodad
 
-   ip -netns remote li add veth5 type veth peer name veth6
-   ip -netns remote li set veth5 up
-   ip -netns remote addr add dev veth5 172.16.101.1/24
-   ip -netns remote -6 addr add dev veth5 2001:db8:101::1/64 nodad
-   ip -netns remote ro add 172.16.0.0/22 via 172.16.101.2
-   ip -netns remote -6 ro add 2001:db8:90::/40 via 2001:db8:101::2
+   ip -netns $remote li add veth5 type veth peer name veth6
+   ip -netns $remote li set veth5 up
+   ip -netns $remote addr add dev veth5 172.16.101.1/24
+   ip -netns $remote -6 addr add dev veth5 2001:db8:101::1/64 nodad
+   ip -netns $remote ro add 172.16.0.0/22 via 172.16.101.2
+   ip -netns $remote -6 ro add 2001:db8:90::/40 via 2001:db8:101::2
 
-   ip -netns remote li set veth6 netns peer up
-   ip -netns peer addr add dev veth6 172.16.101.2/24
-   ip -netns peer -6 addr add dev veth6 2001:db8:101::2/64 nodad
+   ip -netns $remote li set veth6 netns $peer up
+   ip -netns $peer addr add dev veth6 172.16.101.2/24
+   ip -netns $peer -6 addr add dev veth6 2001:db8:101::2/64 nodad
set +e
 }
 
@@ -215,7 +211,7 @@ cleanup()
 {
local ns
 
-   for ns in me peer remote; do
+   for ns in $me $peer $remote; do
ip netns del ${ns} 2>/dev/null
done
 }
@@ -779,7 +775,7 @@ ipv6_grp_refs()
run_cmd "$IP route add 2001:db8:101::1/128 nhid 102"
 
# create per-cpu dsts through nh 100
-   run_cmd "ip netns exec me mausezahn -6 veth1.10 -B 2001:db8:101::1 -A 2001:db8:91::1 -c 5 -t tcp "dp=1-1023, flags=syn" >/dev/null 2>&1"
+   run_cmd "ip netns exec $me mausezahn -6 veth1.10 -B 2001:db8:101::1 -A 2001:db8:91::1 -c 5 -t tcp "dp=1-1023, flags=syn" >/dev/null 2>&1"
 
# remove nh 100 from the group to delete the route potentially leaving
# a stale per-cpu dst which holds a reference to the nexthop's net
@@ -805,7 +801,7 @@ ipv6_grp_refs()
 
# if a reference was lost this command will hang because the net device
# cannot be removed
-   timeout -s KILL 5 ip netns exec me ip link del veth1.10 >/dev/null 2>&1
+   timeout -s KILL 5 ip netns exec $me ip link del veth1.10 >/dev/null 2>&1
 
# we can't cleanup if the command is hung trying to delete the netdev
if [ $? -eq 137 ]; then
@@ -1012,13

[PATCH net-next 08/38] selftests/net: convert fib_nexthop_nongw.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./fib_nexthop_nongw.sh
TEST: nexthop: get route with nexthop without gw[ OK ]
TEST: nexthop: ping through nexthop without gw  [ OK ]

Signed-off-by: Hangbin Liu 
---
 .../selftests/net/fib_nexthop_nongw.sh| 34 ---
 1 file changed, 15 insertions(+), 19 deletions(-)

diff --git a/tools/testing/selftests/net/fib_nexthop_nongw.sh b/tools/testing/selftests/net/fib_nexthop_nongw.sh
index b7b928b38ce4..1ccf56f10171 100755
--- a/tools/testing/selftests/net/fib_nexthop_nongw.sh
+++ b/tools/testing/selftests/net/fib_nexthop_nongw.sh
@@ -8,6 +8,7 @@
 #veth0 <---|---> veth1
 # Validate source address selection for route without gateway
 
+source lib.sh
 PAUSE_ON_FAIL=no
 VERBOSE=0
 ret=0
@@ -64,35 +65,31 @@ run_cmd()
 # config
 setup()
 {
-   ip netns add h1
-   ip -n h1 link set lo up
-   ip netns add h2
-   ip -n h2 link set lo up
+   setup_ns h1 h2
 
# Add a fake eth0 to support an ip address
-   ip -n h1 link add name eth0 type dummy
-   ip -n h1 link set eth0 up
-   ip -n h1 address add 192.168.0.1/24 dev eth0
+   ip -n $h1 link add name eth0 type dummy
+   ip -n $h1 link set eth0 up
+   ip -n $h1 address add 192.168.0.1/24 dev eth0
 
# Configure veths (same @mac, arp off)
-   ip -n h1 link add name veth0 type veth peer name veth1 netns h2
-   ip -n h1 link set veth0 up
+   ip -n $h1 link add name veth0 type veth peer name veth1 netns $h2
+   ip -n $h1 link set veth0 up
 
-   ip -n h2 link set veth1 up
+   ip -n $h2 link set veth1 up
 
# Configure @IP in the peer netns
-   ip -n h2 address add 192.168.1.1/32 dev veth1
-   ip -n h2 route add default dev veth1
+   ip -n $h2 address add 192.168.1.1/32 dev veth1
+   ip -n $h2 route add default dev veth1
 
# Add a nexthop without @gw and use it in a route
-   ip -n h1 nexthop add id 1 dev veth0
-   ip -n h1 route add 192.168.1.1 nhid 1
+   ip -n $h1 nexthop add id 1 dev veth0
+   ip -n $h1 route add 192.168.1.1 nhid 1
 }
 
 cleanup()
 {
-   ip netns del h1 2>/dev/null
-   ip netns del h2 2>/dev/null
+   cleanup_ns $h1 $h2
 }
 
 trap cleanup EXIT
@@ -108,12 +105,11 @@ do
esac
 done
 
-cleanup
 setup
 
-run_cmd ip -netns h1 route get 192.168.1.1
+run_cmd ip -netns $h1 route get 192.168.1.1
 log_test $? 0 "nexthop: get route with nexthop without gw"
-run_cmd ip netns exec h1 ping -c1 192.168.1.1
+run_cmd ip netns exec $h1 ping -c1 192.168.1.1
 log_test $? 0 "nexthop: ping through nexthop without gw"
 
 exit $ret
-- 
2.41.0

[PATCH net-next 07/38] selftests/net: convert fib_nexthop_multiprefix to run it in unique namespace

2023-11-24 Thread Hangbin Liu

There is no need for an explicit cleanup, since the lib trap will clean up
all created namespaces.
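
The reliance on an EXIT trap can be pictured with a small sketch. This is
not the lib.sh code: cleanup_all_ns_sketch and remember_ns are hypothetical
names, and the "delete" step only echoes, so it runs without root. The
assumption is that the library records every created namespace in NS_LIST
and removes them all when the script exits.

```shell
#!/bin/bash
# Sketch only: names are recorded and "deleted" with echo.
NS_LIST=""

# hypothetical helper: record a namespace name for later cleanup
remember_ns() { NS_LIST="${NS_LIST} $1"; }

cleanup_all_ns_sketch()
{
	local ns

	for ns in ${NS_LIST}; do
		# real code would delete the namespace here
		echo "would delete ${ns}"
	done
	NS_LIST=""
}

out="$(
	trap cleanup_all_ns_sketch EXIT
	remember_ns h0-aaaaaa
	remember_ns r1-bbbbbb
	# no explicit cleanup call; the trap fires when the subshell exits
)"
echo "$out"
```

With a trap like this in the sourced library, a test that has no other
teardown work does not need its own cleanup() function.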

Here is the test result after conversion.

]# ./fib_nexthop_multiprefix.sh
TEST: IPv4: host 0 to host 1, mtu 1300  [ OK ]
TEST: IPv6: host 0 to host 1, mtu 1300  [ OK ]

TEST: IPv4: host 0 to host 2, mtu 1350  [ OK ]
TEST: IPv6: host 0 to host 2, mtu 1350  [ OK ]

TEST: IPv4: host 0 to host 3, mtu 1400  [ OK ]
TEST: IPv6: host 0 to host 3, mtu 1400  [ OK ]

TEST: IPv4: host 0 to host 1, mtu 1300  [ OK ]
TEST: IPv6: host 0 to host 1, mtu 1300  [ OK ]

TEST: IPv4: host 0 to host 2, mtu 1350  [ OK ]
TEST: IPv6: host 0 to host 2, mtu 1350  [ OK ]

TEST: IPv4: host 0 to host 3, mtu 1400  [ OK ]
TEST: IPv6: host 0 to host 3, mtu 1400  [ OK ]

Signed-off-by: Hangbin Liu 
---
 .../selftests/net/fib_nexthop_multiprefix.sh  | 104 --
 1 file changed, 47 insertions(+), 57 deletions(-)

diff --git a/tools/testing/selftests/net/fib_nexthop_multiprefix.sh b/tools/testing/selftests/net/fib_nexthop_multiprefix.sh
index 51df5e305855..58efeccc96f7 100755
--- a/tools/testing/selftests/net/fib_nexthop_multiprefix.sh
+++ b/tools/testing/selftests/net/fib_nexthop_multiprefix.sh
@@ -12,6 +12,7 @@
 #
 # routing in h0 to hN is done with nexthop objects.
 
+source lib.sh
 PAUSE_ON_FAIL=no
 VERBOSE=0
 
@@ -72,12 +73,6 @@ create_ns()
 {
local ns=${1}
 
-   ip netns del ${ns} 2>/dev/null
-
-   ip netns add ${ns}
-   ip -netns ${ns} addr add 127.0.0.1/8 dev lo
-   ip -netns ${ns} link set lo up
-
ip netns exec ${ns} sysctl -q -w net.ipv6.conf.all.keep_addr_on_down=1
case ${ns} in
h*)
@@ -97,7 +92,13 @@ setup()
 
#set -e
 
-   for ns in h0 r1 h1 h2 h3
+   setup_ns h0 r1 h1 h2 h3
+   h[0]=$h0
+   h[1]=$h1
+   h[2]=$h2
+   h[3]=$h3
+   r[1]=$r1
+   for ns in ${h[0]} ${r[1]} ${h[1]} ${h[2]} ${h[3]}
do
create_ns ${ns}
done
@@ -108,55 +109,47 @@ setup()
 
for i in 0 1 2 3
do
-   ip -netns h${i} li add eth0 type veth peer name r1h${i}
-   ip -netns h${i} li set eth0 up
-   ip -netns h${i} li set r1h${i} netns r1 name eth${i} up
-
-   ip -netns h${i}    addr add dev eth0 172.16.10${i}.1/24
-   ip -netns h${i} -6 addr add dev eth0 2001:db8:10${i}::1/64
-   ip -netns r1    addr add dev eth${i} 172.16.10${i}.254/24
-   ip -netns r1 -6 addr add dev eth${i} 2001:db8:10${i}::64/64
+   ip -netns ${h[$i]} li add eth0 type veth peer name r1h${i}
+   ip -netns ${h[$i]} li set eth0 up
+   ip -netns ${h[$i]} li set r1h${i} netns ${r[1]} name eth${i} up
+
+   ip -netns ${h[$i]}    addr add dev eth0 172.16.10${i}.1/24
+   ip -netns ${h[$i]} -6 addr add dev eth0 2001:db8:10${i}::1/64
+   ip -netns ${r[1]}    addr add dev eth${i} 172.16.10${i}.254/24
+   ip -netns ${r[1]} -6 addr add dev eth${i} 2001:db8:10${i}::64/64
done
 
-   ip -netns h0 nexthop add id 4 via 172.16.100.254 dev eth0
-   ip -netns h0 nexthop add id 6 via 2001:db8:100::64 dev eth0
+   ip -netns ${h[0]} nexthop add id 4 via 172.16.100.254 dev eth0
+   ip -netns ${h[0]} nexthop add id 6 via 2001:db8:100::64 dev eth0
 
-   # routing from h0 to h1-h3 and back
+   # routing from ${h[0]} to h1-h3 and back
for i in 1 2 3
do
-   ip -netns h0    ro add 172.16.10${i}.0/24 nhid 4
-   ip -netns h${i} ro add 172.16.100.0/24 via 172.16.10${i}.254
+   ip -netns ${h[0]}  ro add 172.16.10${i}.0/24 nhid 4
+   ip -netns ${h[$i]} ro add 172.16.100.0/24 via 172.16.10${i}.254
 
-   ip -netns h0    -6 ro add 2001:db8:10${i}::/64 nhid 6
-   ip -netns h${i} -6 ro add 2001:db8:100::/64 via 2001:db8:10${i}::64
+   ip -netns ${h[0]}  -6 ro add 2001:db8:10${i}::/64 nhid 6
+   ip -netns ${h[$i]} -6 ro add 2001:db8:100::/64 via 2001:db8:10${i}::64
done
 
if [ "$VERBOSE" = "1" ]; then
echo
echo "host 1 config"
-   ip -netns h0 li sh
-   ip -netns h0 ro sh
-   ip -netns h0 -6 ro sh
+   ip -netns ${h[0]} li sh
+   ip -netns ${h[0]} ro sh
+   ip -netns ${h[0]} -6 ro sh
fi
 
#set +e
 }
 
-cleanup()
-{
-   for n in h0 r1 h1 h2 h3
-   do
-   ip netns del ${n} 2>/dev/null
-   done
-}
-
 change_mtu()
 {
local hostid=$1
local mtu=$2
 
run_cmd ip -netns h${hostid} li set eth0 mtu ${mtu}
-

[PATCH net-next 06/38] selftests/net: convert fcnal-test.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion. There are some failures, but they
also occur on my system without this patch. So they are not caused by
this patch, and I will check the reason later.

  ]# time ./fcnal-test.sh
  /usr/bin/which: no nettest in (/root/.local/bin:/root/bin:/usr/share/Modules/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin)

  ###
  IPv4 ping
  ###

  #
  No VRF

  SYSCTL: net.ipv4.raw_l3mdev_accept=0

  TEST: ping out - ns-B IP                      [ OK ]
  TEST: ping out, device bind - ns-B IP         [ OK ]
  TEST: ping out, address bind - ns-B IP        [ OK ]
  ...

  #
  SNAT on VRF

  TEST: IPv4 TCP connection over VRF with SNAT  [ OK ]
  TEST: IPv6 TCP connection over VRF with SNAT  [ OK ]

  Tests passed: 893
  Tests failed:  21

  real    52m48.178s
  user    0m34.158s
  sys     1m42.976s

BTW, this test needs a really long time. So expand the timeout to 1h.
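
As a quick check of the numbers, the run time reported above can be
compared with the old and new values in the settings file. The variable
names below are only for illustration; the figures are copied from the
"real" line and from the settings diff in this patch.

```shell
#!/bin/bash
# Values copied from the "real 52m48.178s" line and the settings diff.
runtime_s=$((52 * 60 + 48))   # whole seconds, fraction dropped
old_timeout=1500
new_timeout=3600

echo "runtime: ${runtime_s}s"
[ "$runtime_s" -gt "$old_timeout" ] && echo "run exceeds the old timeout"
[ "$runtime_s" -lt "$new_timeout" ] && echo "run fits within the new timeout"
```

So a full run on this machine would have been cut off by the old limit,
while one hour leaves some headroom.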

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/fcnal-test.sh | 30 ++-
 tools/testing/selftests/net/settings  |  2 +-
 2 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/tools/testing/selftests/net/fcnal-test.sh b/tools/testing/selftests/net/fcnal-test.sh
index d32a14ba069a..0d4f252427e2 100755
--- a/tools/testing/selftests/net/fcnal-test.sh
+++ b/tools/testing/selftests/net/fcnal-test.sh
@@ -37,9 +37,7 @@
 #
 # server / client nomenclature relative to ns-A
 
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
-
+source lib.sh
 VERBOSE=0
 
 NSA_DEV=eth1
@@ -82,14 +80,6 @@ MCAST=ff02::1
 NSA_LINKIP6=
 NSB_LINKIP6=
 
-NSA=ns-A
-NSB=ns-B
-NSC=ns-C
-
-NSA_CMD="ip netns exec ${NSA}"
-NSB_CMD="ip netns exec ${NSB}"
-NSC_CMD="ip netns exec ${NSC}"
-
 which ping6 > /dev/null 2>&1 && ping6=$(which ping6) || ping6=$(which ping)
 
 # Check if FIPS mode is enabled
@@ -406,9 +396,6 @@ create_ns()
local addr=$2
local addr6=$3
 
-   ip netns add ${ns}
-
-   ip -netns ${ns} link set lo up
if [ "${addr}" != "-" ]; then
ip -netns ${ns} addr add dev lo ${addr}
fi
@@ -467,13 +454,12 @@ cleanup()
ip -netns ${NSA} link del dev ${NSA_DEV}
 
ip netns pids ${NSA} | xargs kill 2>/dev/null
-   ip netns del ${NSA}
+   cleanup_ns ${NSA}
fi
 
ip netns pids ${NSB} | xargs kill 2>/dev/null
-   ip netns del ${NSB}
ip netns pids ${NSC} | xargs kill 2>/dev/null
-   ip netns del ${NSC} >/dev/null 2>&1
+   cleanup_ns ${NSB} ${NSC}
 }
 
 cleanup_vrf_dup()
@@ -487,6 +473,8 @@ setup_vrf_dup()
 {
# some VRF tests use ns-C which has the same config as
# ns-B but for a device NOT in the VRF
+   setup_ns NSC
+   NSC_CMD="ip netns exec ${NSC}"
create_ns ${NSC} "-" "-"
connect_ns ${NSA} ${NSA_DEV2} ${NSA_IP}/24 ${NSA_IP6}/64 \
   ${NSC} ${NSC_DEV} ${NSB_IP}/24 ${NSB_IP6}/64
@@ -503,6 +491,10 @@ setup()
log_debug "Configuring network namespaces"
set -e
 
+   setup_ns NSA NSB
+   NSA_CMD="ip netns exec ${NSA}"
+   NSB_CMD="ip netns exec ${NSB}"
+
create_ns ${NSA} ${NSA_LO_IP}/32 ${NSA_LO_IP6}/128
create_ns ${NSB} ${NSB_LO_IP}/32 ${NSB_LO_IP6}/128
connect_ns ${NSA} ${NSA_DEV} ${NSA_IP}/24 ${NSA_IP6}/64 \
@@ -545,6 +537,10 @@ setup_lla_only()
log_debug "Configuring network namespaces"
set -e
 
+   setup_ns NSA NSB NSC
+   NSA_CMD="ip netns exec ${NSA}"
+   NSB_CMD="ip netns exec ${NSB}"
+   NSC_CMD="ip netns exec ${NSC}"
create_ns ${NSA} "-" "-"
create_ns ${NSB} "-" "-"
create_ns ${NSC} "-" "-"
diff --git a/tools/testing/selftests/net/settings b/tools/testing/selftests/net/settings
index dfc27cdc6c05..ed8418e8217a 100644
--- a/tools/testing/selftests/net/settings
+++ b/tools/testing/selftests/net/settings
@@ -1 +1 @@
-timeout=1500
+timeout=3600
-- 
2.41.0

[PATCH net-next 05/38] selftests/net: convert drop_monitor_tests.sh to run it in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./drop_monitor_tests.sh

Software drops test
TEST: Capturing active software drops   [ OK ]
TEST: Capturing inactive software drops [ OK ]

Hardware drops test
TEST: Capturing active hardware drops   [ OK ]
TEST: Capturing inactive hardware drops [ OK ]

Tests passed:   4
Tests failed:   0

Signed-off-by: Hangbin Liu 
---
 .../selftests/net/drop_monitor_tests.sh   | 21 ++-
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/net/drop_monitor_tests.sh b/tools/testing/selftests/net/drop_monitor_tests.sh
index b7650e30d18b..7c4818c971fc 100755
--- a/tools/testing/selftests/net/drop_monitor_tests.sh
+++ b/tools/testing/selftests/net/drop_monitor_tests.sh
@@ -2,10 +2,8 @@
 # SPDX-License-Identifier: GPL-2.0
 
 # This test is for checking drop monitor functionality.
-
+source lib.sh
 ret=0
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
 
 # all tests in this script. Can be overridden with -t option
 TESTS="
@@ -13,10 +11,6 @@ TESTS="
hw_drops
 "
 
-IP="ip -netns ns1"
-TC="tc -netns ns1"
-DEVLINK="devlink -N ns1"
-NS_EXEC="ip netns exec ns1"
 NETDEVSIM_PATH=/sys/bus/netdevsim/
 DEV_ADDR=1337
 DEV=netdevsim${DEV_ADDR}
@@ -43,7 +37,7 @@ setup()
modprobe netdevsim &> /dev/null
 
set -e
-   ip netns add ns1
+   setup_ns NS1
$IP link add dummy10 up type dummy
 
$NS_EXEC echo "$DEV_ADDR 1" > ${NETDEVSIM_PATH}/new_device
@@ -57,7 +51,7 @@ setup()
 cleanup()
 {
$NS_EXEC echo "$DEV_ADDR" > ${NETDEVSIM_PATH}/del_device
-   ip netns del ns1
+   cleanup_ns ${NS1}
 }
 
 sw_drops_test()
@@ -194,8 +188,15 @@ if [ $? -ne 0 ]; then
exit $ksft_skip
 fi
 
-# start clean
+# create netns first so we can get the namespace name
+setup_ns NS1
 cleanup &> /dev/null
+trap cleanup EXIT
+
+IP="ip -netns ${NS1}"
+TC="tc -netns ${NS1}"
+DEVLINK="devlink -N ${NS1}"
+NS_EXEC="ip netns exec ${NS1}"
 
 for t in $TESTS
 do
-- 
2.41.0

[PATCH net-next 04/38] selftests/net: convert cmsg tests to make them run in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./cmsg_ipv6.sh
OK
]# ./cmsg_so_mark.sh
OK
]# ./cmsg_time.sh
OK

Signed-off-by: Hangbin Liu 
---
 tools/testing/selftests/net/cmsg_ipv6.sh| 10 --
 tools/testing/selftests/net/cmsg_so_mark.sh |  7 ---
 tools/testing/selftests/net/cmsg_time.sh|  7 ---
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/net/cmsg_ipv6.sh b/tools/testing/selftests/net/cmsg_ipv6.sh
index 330d0b1ceced..f30bd57d5e38 100755
--- a/tools/testing/selftests/net/cmsg_ipv6.sh
+++ b/tools/testing/selftests/net/cmsg_ipv6.sh
@@ -1,9 +1,8 @@
 #!/bin/bash
 # SPDX-License-Identifier: GPL-2.0
 
-ksft_skip=4
+source lib.sh
 
-NS=ns
 IP6=2001:db8:1::1/64
 TGT6=2001:db8:1::2
 TMPF=$(mktemp --suffix ".pcap")
@@ -11,13 +10,11 @@ TMPF=$(mktemp --suffix ".pcap")
 cleanup()
 {
 rm -f $TMPF
-ip netns del $NS
+cleanup_ns $NS
 }
 
 trap cleanup EXIT
 
-NSEXE="ip netns exec $NS"
-
 tcpdump -h | grep immediate-mode >> /dev/null
 if [ $? -ne 0 ]; then
 echo "SKIP - tcpdump with --immediate-mode option required"
@@ -25,7 +22,8 @@ if [ $? -ne 0 ]; then
 fi
 
 # Namespaces
-ip netns add $NS
+setup_ns NS
+NSEXE="ip netns exec $NS"
 
 $NSEXE sysctl -w net.ipv4.ping_group_range='0 2147483647' > /dev/null
 
diff --git a/tools/testing/selftests/net/cmsg_so_mark.sh b/tools/testing/selftests/net/cmsg_so_mark.sh
index 1650b8622f2f..772ad0cc2630 100755
--- a/tools/testing/selftests/net/cmsg_so_mark.sh
+++ b/tools/testing/selftests/net/cmsg_so_mark.sh
@@ -1,7 +1,8 @@
 #!/bin/bash
 # SPDX-License-Identifier: GPL-2.0
 
-NS=ns
+source lib.sh
+
 IP4=172.16.0.1/24
 TGT4=172.16.0.2
 IP6=2001:db8:1::1/64
@@ -10,13 +11,13 @@ MARK=1000
 
 cleanup()
 {
-ip netns del $NS
+cleanup_ns $NS
 }
 
 trap cleanup EXIT
 
 # Namespaces
-ip netns add $NS
+setup_ns NS
 
 ip netns exec $NS sysctl -w net.ipv4.ping_group_range='0 2147483647' > /dev/null
 
diff --git a/tools/testing/selftests/net/cmsg_time.sh b/tools/testing/selftests/net/cmsg_time.sh
index 91161e1da734..af85267ad1e3 100755
--- a/tools/testing/selftests/net/cmsg_time.sh
+++ b/tools/testing/selftests/net/cmsg_time.sh
@@ -1,7 +1,8 @@
 #!/bin/bash
 # SPDX-License-Identifier: GPL-2.0
 
-NS=ns
+source lib.sh
+
 IP4=172.16.0.1/24
 TGT4=172.16.0.2
 IP6=2001:db8:1::1/64
@@ -9,13 +10,13 @@ TGT6=2001:db8:1::2
 
 cleanup()
 {
-ip netns del $NS
+cleanup_ns $NS
 }
 
 trap cleanup EXIT
 
 # Namespaces
-ip netns add $NS
+setup_ns NS
 
 ip netns exec $NS sysctl -w net.ipv4.ping_group_range='0 2147483647' > /dev/null
 
-- 
2.41.0

[PATCH net-next 03/38] selftest: arp_ndisc_untracked_subnets.sh convert to run test in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

Note that the 2 failing tests also fail without this patch.

]# ./arp_ndisc_untracked_subnets.sh
TEST: test_arp:  accept_arp=0   [ OK ]
TEST: test_arp:  accept_arp=1   [FAIL]
TEST: test_arp:  accept_arp=2  same_subnet=0[ OK ]
TEST: test_arp:  accept_arp=2  same_subnet=1[FAIL]
TEST: test_ndisc:  accept_untracked_na=0[ OK ]
TEST: test_ndisc:  accept_untracked_na=1[ OK ]
TEST: test_ndisc:  accept_untracked_na=2  same_subnet=0 [ OK ]
TEST: test_ndisc:  accept_untracked_na=2  same_subnet=1 [ OK ]

Signed-off-by: Hangbin Liu 
---
 .../net/arp_ndisc_untracked_subnets.sh | 18 ++
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/net/arp_ndisc_untracked_subnets.sh b/tools/testing/selftests/net/arp_ndisc_untracked_subnets.sh
index c899b446acb6..5fda2344e14a 100755
--- a/tools/testing/selftests/net/arp_ndisc_untracked_subnets.sh
+++ b/tools/testing/selftests/net/arp_ndisc_untracked_subnets.sh
@@ -5,16 +5,14 @@
 # garp to the router. Router accepts or ignores based on its arp_accept
 # or accept_untracked_na configuration.
 
+source lib.sh
+
 TESTS="arp ndisc"
 
-ROUTER_NS="ns-router"
-ROUTER_NS_V6="ns-router-v6"
 ROUTER_INTF="veth-router"
 ROUTER_ADDR="10.0.10.1"
 ROUTER_ADDR_V6="2001:db8:abcd:0012::1"
 
-HOST_NS="ns-host"
-HOST_NS_V6="ns-host-v6"
 HOST_INTF="veth-host"
 HOST_ADDR="10.0.10.2"
 HOST_ADDR_V6="2001:db8:abcd:0012::2"
@@ -23,13 +21,11 @@ SUBNET_WIDTH=24
 PREFIX_WIDTH_V6=64
 
 cleanup() {
-   ip netns del ${HOST_NS}
-   ip netns del ${ROUTER_NS}
+   cleanup_ns ${HOST_NS} ${ROUTER_NS}
 }
 
 cleanup_v6() {
-   ip netns del ${HOST_NS_V6}
-   ip netns del ${ROUTER_NS_V6}
+   cleanup_ns ${HOST_NS_V6} ${ROUTER_NS_V6}
 }
 
 setup() {
@@ -37,8 +33,7 @@ setup() {
local arp_accept=$1
 
# Set up two namespaces
-   ip netns add ${ROUTER_NS}
-   ip netns add ${HOST_NS}
+   setup_ns HOST_NS ROUTER_NS
 
# Set up interfaces veth0 and veth1, which are pairs in separate
# namespaces. veth0 is veth-router, veth1 is veth-host.
@@ -72,8 +67,7 @@ setup_v6() {
local accept_untracked_na=$1
 
# Set up two namespaces
-   ip netns add ${ROUTER_NS_V6}
-   ip netns add ${HOST_NS_V6}
+   setup_ns HOST_NS_V6 ROUTER_NS_V6
 
# Set up interfaces veth0 and veth1, which are pairs in separate
# namespaces. veth0 is veth-router, veth1 is veth-host.
-- 
2.41.0

[PATCH net-next 02/38] selftests/net: arp_ndisc_evict_nocarrier.sh convert to run test in unique namespace

2023-11-24 Thread Hangbin Liu

Here is the test result after conversion.

]# ./arp_ndisc_evict_nocarrier.sh
run arp_evict_nocarrier=1 test
ok
run arp_evict_nocarrier=0 test
ok
run all.arp_evict_nocarrier=0 test
ok
run ndisc_evict_nocarrier=1 test
ok
run ndisc_evict_nocarrier=0 test
ok
run all.ndisc_evict_nocarrier=0 test
ok

Signed-off-by: Hangbin Liu 
---
 .../net/arp_ndisc_evict_nocarrier.sh  | 46 +++
 1 file changed, 16 insertions(+), 30 deletions(-)

diff --git a/tools/testing/selftests/net/arp_ndisc_evict_nocarrier.sh b/tools/testing/selftests/net/arp_ndisc_evict_nocarrier.sh
index 4a110bb01e53..92eb880c52f2 100755
--- a/tools/testing/selftests/net/arp_ndisc_evict_nocarrier.sh
+++ b/tools/testing/selftests/net/arp_ndisc_evict_nocarrier.sh
@@ -12,7 +12,8 @@
 # {arp,ndisc}_evict_nocarrer=0 should still contain the single ARP/ND entry
 #
 
-readonly PEER_NS="ns-peer-$(mktemp -u XX)"
+source lib.sh
+
 readonly V4_ADDR0=10.0.10.1
 readonly V4_ADDR1=10.0.10.2
 readonly V6_ADDR0=2001:db8:91::1
@@ -22,43 +23,29 @@ ret=0
 
 cleanup_v6()
 {
-ip netns del me
-ip netns del peer
+cleanup_ns ${me} ${peer}
 
 sysctl -w net.ipv6.conf.veth1.ndisc_evict_nocarrier=1 >/dev/null 2>&1
 sysctl -w net.ipv6.conf.all.ndisc_evict_nocarrier=1 >/dev/null 2>&1
 }
 
-create_ns()
-{
-local n=${1}
-
-ip netns del ${n} 2>/dev/null
-
-ip netns add ${n}
-ip netns set ${n} $((nsid++))
-ip -netns ${n} link set lo up
-}
-
-
 setup_v6() {
-create_ns me
-create_ns peer
+setup_ns me peer
 
-IP="ip -netns me"
+IP="ip -netns ${me}"
 
 $IP li add veth1 type veth peer name veth2
 $IP li set veth1 up
 $IP -6 addr add $V6_ADDR0/64 dev veth1 nodad
-$IP li set veth2 netns peer up
-ip -netns peer -6 addr add $V6_ADDR1/64 dev veth2 nodad
+$IP li set veth2 netns ${peer} up
+ip -netns ${peer} -6 addr add $V6_ADDR1/64 dev veth2 nodad
 
-ip netns exec me sysctl -w $1 >/dev/null 2>&1
+ip netns exec ${me} sysctl -w $1 >/dev/null 2>&1
 
 # Establish an ND cache entry
-ip netns exec me ping -6 -c1 -Iveth1 $V6_ADDR1 >/dev/null 2>&1
+ip netns exec ${me} ping -6 -c1 -Iveth1 $V6_ADDR1 >/dev/null 2>&1
 # Should have the veth1 entry in ND table
-ip netns exec me ip -6 neigh get $V6_ADDR1 dev veth1 >/dev/null 2>&1
+ip netns exec ${me} ip -6 neigh get $V6_ADDR1 dev veth1 >/dev/null 2>&1
 if [ $? -ne 0 ]; then
 cleanup_v6
 echo "failed"
@@ -66,11 +53,11 @@ setup_v6() {
 fi
 
 # Set veth2 down, which will put veth1 in NOCARRIER state
-ip netns exec peer ip link set veth2 down
+ip netns exec ${peer} ip link set veth2 down
 }
 
 setup_v4() {
-ip netns add "${PEER_NS}"
+setup_ns PEER_NS
 ip link add name veth0 type veth peer name veth1
 ip link set dev veth0 up
 ip link set dev veth1 netns "${PEER_NS}"
@@ -99,8 +86,7 @@ setup_v4() {
 cleanup_v4() {
 ip neigh flush dev veth0
 ip link del veth0
-local -r ns="$(ip netns list|grep $PEER_NS)"
-[ -n "$ns" ] && ip netns del $ns 2>/dev/null
+cleanup_ns $PEER_NS
 
 sysctl -w net.ipv4.conf.veth0.arp_evict_nocarrier=1 >/dev/null 2>&1
 sysctl -w net.ipv4.conf.all.arp_evict_nocarrier=1 >/dev/null 2>&1
@@ -163,7 +149,7 @@ run_ndisc_evict_nocarrier_enabled() {
 
 setup_v6 "net.ipv6.conf.veth1.ndisc_evict_nocarrier=1"
 
-ip netns exec me ip -6 neigh get $V6_ADDR1 dev veth1 >/dev/null 2>&1
+ip netns exec ${me} ip -6 neigh get $V6_ADDR1 dev veth1 >/dev/null 2>&1
 
 if [ $? -eq 0 ];then
 echo "failed"
@@ -180,7 +166,7 @@ run_ndisc_evict_nocarrier_disabled() {
 
 setup_v6 "net.ipv6.conf.veth1.ndisc_evict_nocarrier=0"
 
-ip netns exec me ip -6 neigh get $V6_ADDR1 dev veth1 >/dev/null 2>&1
+ip netns exec ${me} ip -6 neigh get $V6_ADDR1 dev veth1 >/dev/null 2>&1
 
 if [ $? -eq 0 ];then
 echo "ok"
@@ -197,7 +183,7 @@ run_ndisc_evict_nocarrier_disabled_all() {
 
 setup_v6 "net.ipv6.conf.all.ndisc_evict_nocarrier=0"
 
-ip netns exec me ip -6 neigh get $V6_ADDR1 dev veth1 >/dev/null 2>&1
+ip netns exec ${me} ip -6 neigh get $V6_ADDR1 dev veth1 >/dev/null 2>&1
 
 if [ $? -eq 0 ];then
 echo "ok"
-- 
2.41.0

[PATCH net-next 01/38] selftests/net: add lib.sh

2023-11-24 Thread Hangbin Liu

Add a lib.sh for net selftests. This file can be used to define commonly
used variables and functions.

Add the function setup_ns() so that users can create uniquely named
namespaces from a given name prefix.

Signed-off-by: Hangbin Liu 
---
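Note (not part of the commit): a minimal sketch of the naming scheme that
setup_ns() relies on. It only builds the name and never calls
"ip netns add", so it can be run without root. The helper name
make_ns_name is hypothetical, and the six-X mktemp template is an
assumption made for this sketch.

```shell
#!/bin/bash
# Hypothetical helper (not in lib.sh): derive a unique namespace name from
# a variable name and store it in that variable, similar to setup_ns().
make_ns_name()
{
	local var=$1
	local ns

	# Lower-case the variable name and append a random suffix.
	# mktemp -u only prints a candidate name; nothing is created.
	ns="${var,,}-$(mktemp -u XXXXXX)"

	# printf -v assigns to the variable named by $var without eval.
	printf -v "$var" '%s' "$ns"
}

make_ns_name PEER_NS
echo "PEER_NS=${PEER_NS}"
```

The suffix differs on every run, so two tests that both ask for PEER_NS
should not collide on the namespace name.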
 tools/testing/selftests/net/Makefile |  2 +-
 tools/testing/selftests/net/lib.sh   | 98 
 2 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/net/lib.sh

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 9274edfb76ff..14bd68da7466 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -54,7 +54,7 @@ TEST_PROGS += ip_local_port_range.sh
 TEST_PROGS += rps_default_mask.sh
 TEST_PROGS += big_tcp.sh
 TEST_PROGS_EXTENDED := in_netns.sh setup_loopback.sh setup_veth.sh
-TEST_PROGS_EXTENDED += toeplitz_client.sh toeplitz.sh
+TEST_PROGS_EXTENDED += toeplitz_client.sh toeplitz.sh lib.sh
 TEST_GEN_FILES =  socket nettest
 TEST_GEN_FILES += psock_fanout psock_tpacket msg_zerocopy reuseport_addr_any
 TEST_GEN_FILES += tcp_mmap tcp_inq psock_snd txring_overwrite
diff --git a/tools/testing/selftests/net/lib.sh b/tools/testing/selftests/net/lib.sh
new file mode 100644
index ..239ab2beb438
--- /dev/null
+++ b/tools/testing/selftests/net/lib.sh
@@ -0,0 +1,98 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+##
+# Defines
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+# namespace list created by setup_ns
+NS_LIST=""
+
+##
+# Helpers
+busywait()
+{
+   local timeout=$1; shift
+
+   local start_time="$(date -u +%s%3N)"
+   while true
+   do
+   local out
+   out=$($@)
+   local ret=$?
+   if ((!ret)); then
+   echo -n "$out"
+   return 0
+   fi
+
+   local current_time="$(date -u +%s%3N)"
+   if ((current_time - start_time > timeout)); then
+   echo -n "$out"
+   return 1
+   fi
+   done
+}
+
+cleanup_ns()
+{
+   local ns=""
+   local errexit=0
+
+   # disable errexit temporarily
+   if [[ $- =~ "e" ]]; then
+   errexit=1
+   set +e
+   fi
+
+   for ns in "$@"; do
+   ip netns delete "${ns}" &> /dev/null
+   busywait 2 "ip netns list | grep -vq $ns" &> /dev/null
+   if ip netns list | grep -q $ns; then
+   echo "Failed to remove namespace $ns"
+   return $ksft_skip
+   fi
+   done
+
+   [ $errexit -eq 1 ] && set -e
+   return 0
+}
+
+# By default, remove all netns before EXIT.
+cleanup_all_ns()
+{
+   cleanup_ns $NS_LIST
+}
+trap cleanup_all_ns EXIT
+
+# Set up netns with the given names as prefixes, e.g.
+# setup_ns local remote
+setup_ns()
+{
+   local ns=""
+   # the ns list we created in this call
+   local ns_list=""
+   while [ -n "$1" ]; do
+   # Some tests may set up/remove the same netns multiple times
+   if unset $1 2> /dev/null; then
+   ns="${1,,}-$(mktemp -u XXXXXX)"
+   eval readonly $1=$ns
+   else
+   eval ns='$'$1
+   cleanup_ns $ns
+
+   fi
+
+   ip netns add $ns
+   if ! ip netns list | grep -q $ns; then
+   echo "Failed to create namespace $1"
+   cleanup_ns $ns_list
+   return $ksft_skip
+   fi
+   ip -n $ns link set lo up
+   ns_list="$ns_list $ns"
+
+   shift
+   done
+   NS_LIST="$NS_LIST $ns_list"
+}
-- 
2.41.0
