there have been discussions for years (and even some diffs!) about how we
should let drivers establish interrupts on multiple cpus.
the simple approach is to let every driver look at the number of cpus in
a box and just pin an interrupt on each of them, which is what pretty much
everyone else started with, but which we have never seemed to get past
bikeshedding about. from what i can tell, the principal objections to
this are:
1. interrupts will tend to land on low numbered cpus.
ie, if drivers try to establish n interrupts on m cpus, they'll
start at cpu 0 and go to cpu n-1, which means cpu 0 will end up with more
interrupts than cpu m-1. apparently this is terrible, even though
currently we have all the interrupts on cpu0 anyway and the world
hasn't ended.
2. some cpus shouldn't be used for interrupts.
why a cpu should or shouldn't be used for interrupts can be pretty
arbitrary, but in practical terms i'm going to borrow from the scheduler
and say that we shouldn't run work on hyperthreads. discussions about
big.little configs and so on can wait.
3. making all the drivers make the same decisions about the above is
a lot of maintenance overhead.
either we will have a bunch of inconsistencies, or we'll have a lot
of untested commits to keep everything the same.
my proposed solution to the above is this diff to provide the intrmap
api. drivers that want to establish multiple interrupts ask the api for
a set of cpus it can use, and the api considers the above issues when
generating a set of cpus for the driver to use. drivers then establish
interrupts on cpus with the info provided by the map.
it is based on the if_ringmap api in dragonflybsd, but generalised so it
could be used by something like nvme(4) in the future.
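
to make the "considers the above issues" part concrete, here is a rough
userland model of the two steps the diff applies before it hands out cpus:
skip smt siblings, then clamp the requested count. the model_* names and the
plain array of smt ids are made up for illustration; the real code walks
struct cpu_info and uses fls(). treat it as a sketch under those assumptions,
not as the implementation.

```c
#include <assert.h>

/* Userland model only; these names are illustrative, not part of the diff. */

/*
 * Copy the ids of CPUs that are not SMT siblings (smt id 0) into out[],
 * mirroring the ci_smt_id > 0 check in the diff. Returns how many were kept.
 */
static unsigned int
model_usable_cpus(const unsigned int *smt_id, unsigned int ncpus,
    unsigned int *out)
{
	unsigned int i, n = 0;

	for (i = 0; i < ncpus; i++) {
		if (smt_id[i] > 0)
			continue;
		out[n++] = i;
	}
	return (n);
}

/* Largest power of two that is <= v; v must be non-zero. */
static unsigned int
model_rounddown_pow2(unsigned int v)
{
	unsigned int p = 1;

	assert(v > 0);
	while (p <= v / 2)
		p *= 2;
	return (p);
}

/*
 * Clamp the requested interrupt count to the driver limit and to the
 * number of usable CPUs, then optionally round down to a power of two.
 * A request of 0 means "use the driver limit".
 */
static unsigned int
model_nintrs(unsigned int nintr, unsigned int maxintr,
    unsigned int usable, int pow2)
{
	assert(maxintr > 0 && usable > 0);

	if (nintr == 0 || nintr > maxintr)
		nintr = maxintr;
	if (nintr > usable)
		nintr = usable;
	if (pow2)
		nintr = model_rounddown_pow2(nintr);
	return (nintr);
}
```

with smt ids alternating 0 and 1 across 16 threads, model_usable_cpus()
keeps the 8 even numbered cpus, and a request for 6 interrupts with the
power of 2 option comes back as 4.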
jmatthew@ and i have been working on implementing a
pci_intr_establish_cpu() api on a few archs, and tweaking some drivers
to see if it works out well, and so far the conclusion is "yes, yes it
does".
the best example so far is vmx hacked up to establish interrupts on
multiple cpus using the api. i changed the interrupt name string so it
includes the ring and cpuid it is establishing on. each vmx is also
limited to 8 rings overall, no matter how big the system is.
on a machine with 2 vmx interfaces, 16 "real" CPUs, and no hyperthreads,
the mappings look like this:
dlg@kbuild ~$ vmstat -zi | grep vmx
irq114/vmx0 0 0
irq115/vmx0:0:0 207 5
irq116/vmx0:1:15 22 0
irq117/vmx0:2:14 11 0
irq118/vmx0:3:13 24 0
irq119/vmx0:4:12 39 0
irq120/vmx0:5:11 1 0
irq121/vmx0:6:10 12 0
irq122/vmx0:7:9 7 0
irq126/vmx1 0 0
irq127/vmx1:0:8 0 0
irq128/vmx1:1:7 0 0
irq129/vmx1:2:6 0 0
irq130/vmx1:3:5 0 0
irq131/vmx1:4:4 0 0
irq132/vmx1:5:3 0 0
irq133/vmx1:6:2 0 0
irq134/vmx1:7:1 0 0
if you move it to 8 cores and 16 threads:
dlg@kbuild ~$ sysctl hw.{ncpu,ncpufound,ncpuonline}
hw.ncpu=16
hw.ncpufound=16
hw.ncpuonline=8
dlg@kbuild ~$ vmstat -zi | grep vmx
irq114/vmx0 0 0
irq115/vmx0:0:0 40 0
irq116/vmx0:1:14 15 0
irq117/vmx0:2:12 33 0
irq118/vmx0:3:10 64 0
irq119/vmx0:4:8 23 0
irq120/vmx0:5:6 32 0
irq121/vmx0:6:4 137 1
irq122/vmx0:7:2 245 3
irq126/vmx1 0 0
irq127/vmx1:0:0 0 0
irq128/vmx1:1:14 0 0
irq129/vmx1:2:12 0 0
irq130/vmx1:3:10 0 0
irq131/vmx1:4:8 0 0
irq132/vmx1:5:6 0 0
irq133/vmx1:6:4 0 0
irq134/vmx1:7:2 0 0
dlg@kbuild ~$ dmesg | grep smt
cpu0: smt 0, core 0, package 0
cpu1: smt 1, core 0, package 0
cpu2: smt 0, core 1, package 0
cpu3: smt 1, core 1, package 0
cpu4: smt 0, core 2, package 0
cpu5: smt 1, core 2, package 0
cpu6: smt 0, core 3, package 0
cpu7: smt 1, core 3, package 0
cpu8: smt 0, core 4, package 0
cpu9: smt 1, core 4, package 0
cpu10: smt 0, core 5, package 0
cpu11: smt 1, core 5, package 0
cpu12: smt 0, core 6, package 0
cpu13: smt 1, core 6, package 0
cpu14: smt 0, core 7, package 0
cpu15: smt 1, core 7, package 0
in the first case you can see it spreading the vmx interfaces over the
cpus. in the latter, there aren't enough real cpus so it stacks the
interrupts.
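
the spreading and stacking falls out of the grid calculation in
intrmap_create() and intrmap_set_grid(). below is a small userland model of
that arithmetic, assuming the interrupt count has already been clamped to the
number of usable cpus. model_grid() and model_slot() are names invented for
this sketch. the slot it returns is an index into the filtered cpu list, not
a cpu id.

```c
#include <assert.h>

/* Userland model of the grid selection; names are illustrative only. */

/*
 * Walk the divisors of ncpus from largest to smallest and stop at the
 * last one that still holds nintrs interrupts, following the loop in
 * intrmap_create(). Requires 0 < nintrs <= ncpus.
 */
static unsigned int
model_grid(unsigned int ncpus, unsigned int nintrs)
{
	unsigned int i, grid = 0, prev_grid = ncpus;

	assert(nintrs > 0 && nintrs <= ncpus);

	for (i = 0; i < ncpus; i++) {
		if (ncpus % (i + 1) != 0)
			continue;	/* only divisors of ncpus are grids */

		grid = ncpus / (i + 1);
		if (nintrs > grid) {
			/* too small, fall back to the previous divisor */
			grid = prev_grid;
			break;
		}
		if (nintrs > ncpus / (i + 2))
			break;		/* nothing smaller can fit */
		prev_grid = grid;
	}
	return (grid);
}

/* Index into the usable cpu list for interrupt `ring` of device `unit`. */
static unsigned int
model_slot(unsigned int ncpus, unsigned int grid, unsigned int unit,
    unsigned int ring)
{
	return ((grid * unit) % ncpus + ring);
}
```

with 16 usable cpus and 8 rings the grid is 8, so unit 0 starts at slot 0
and unit 1 at slot 8. with only 8 usable cpus the grid is still 8, the
offset for unit 1 wraps back to slot 0, and both devices share the same
cpus.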
jmatthew@ and i have the following question we can't resolve
ourselves: should the api provide struct cpu_info pointers instead
of numeric cpu ids?
our experience so far is that pci_intr_establish_cpuid() immediately
maps the id to a pointer anyway, and intrmap iterates over struct
cpu_info pointers to build the list of ids, so we could just remove the
numbers in the middle. pci_intr_establish_cpu() could take a cpu_info
pointer, and intrmap could provide cpu_info pointers.
the only caveat to this i can think of is if we need to establish
interrupts before cpus are attached, which might be useful on arm
archs. we can also change this in the tree.
if it's not obvious, i'm kind of sick of talking about this stuff,
so i'd rather shut up and hack on multiq support in the tree as
much as possible.
ok?
Index: share/man/man9/intrmap_create.9
===================================================================
RCS file: share/man/man9/intrmap_create.9
diff -N share/man/man9/intrmap_create.9
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ share/man/man9/intrmap_create.9 16 Jun 2020 00:13:50 -0000
@@ -0,0 +1,125 @@
+.\" $OpenBSD$
+.\"
+.\" Copyright (c) 2020 David Gwynne <[email protected]>
+.\"
+.\" Permission to use, copy, modify, and distribute this software for any
+.\" purpose with or without fee is hereby granted, provided that the above
+.\" copyright notice and this permission notice appear in all copies.
+.\"
+.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+.\"
+.Dd $Mdocdate: June 16 2020 $
+.Dt INTRMAP_CREATE 9
+.Os
+.Sh NAME
+.Nm intrmap_create ,
+.Nm intrmap_destroy ,
+.Nm intrmap_count ,
+.Nm intrmap_cpu
+.Nd interrupt to CPU mapping API
+.Sh SYNOPSIS
+.In sys/intrmap.h
+.Ft struct intrmap *
+.Fo intrmap_create
+.Fa "const struct device *dv"
+.Fa "unsigned int nintr"
+.Fa "unsigned int maxintr"
+.Fa "unsigned int flags"
+.Fc
+.Ft void
+.Fn intrmap_destroy "struct intrmap *im"
+.Ft unsigned int
+.Fn intrmap_count "const struct intrmap *im"
+.Ft unsigned int
+.Fn intrmap_cpu "const struct intrmap *im" "unsigned int index"
+.Sh DESCRIPTION
+The interrupt to CPU mapping API supports the use of multiple CPUs
+by hardware drivers.
+Drivers that can use multiple interrupts use the API to request a
+set of CPUs that they can establish those interrupts on.
+The API limits the requested number of interrupts to what is available
+on the system, and attempts to distribute the requested interrupts
+over those CPUs.
+On some platforms the API will filter the set of available CPUs.
+.\" to avoid hyperthreads, basically.
+.Pp
+.Fn intrmap_create
+allocates an interrupt map data structure for use by the driver
+identified by
+.Fa dv .
+The number of interrupts the hardware supports is specified via the
+.Fa nintr
+argument.
+The driver supplies the maximum number of interrupts it can support
+via
+.Fa maxintr ,
+which, along with the number of available CPUs at the time the
+function is called, is used as a constraint on the number of requested
+interrupts.
+.Fa nintr
+may be zero to use the driver limit as the number of requested
+interrupts.
+The
+.Fa flags
+argument may have the following defines OR'ed together:
+.Bl -tag -width xxx -offset indent
+.It Dv INTRMAP_POWEROF2
+The hardware only supports a power of 2 number of interrupts, so
+constrain the number of supplied interrupts after the system and
+driver limits are applied.
+.El
+.Pp
+.Fn intrmap_destroy
+frees the memory associated with the interrupt map data structure
+passed via
+.Fa im .
+.Pp
+.Fn intrmap_count
+returns the number of interrupts that the driver can establish
+according to the
+.Fa im
+interrupt map.
+.Pp
+.Fn intrmap_cpu
+returns which CPU the interrupt specified in
+.Fa index
+should be established on according to the
+.Fa im
+interrupt map.
+Interrupts are numbered from 0 to one less than the value returned by
+.Fn intrmap_count .
+.Sh CONTEXT
+.Fn intrmap_create ,
+.Fn intrmap_destroy ,
+.Fn intrmap_count ,
+and
+.Fn intrmap_cpu
+can be called during autoconf, or from process context.
+.Sh RETURN VALUES
+.Fn intrmap_create
+returns a pointer to an interrupt mapping structure on success, or
+.Dv NULL
+on failure.
+.Pp
+.Fn intrmap_count
+returns the number of interrupts that were allocated for the driver
+to use.
+.Pp
+.Fn intrmap_cpu
+returns an identifier for the CPU that the interrupt should be
+established on.
+.\" .Sh SEE ALSO
+.\" .Xr pci_intr_establish_cpuid 9
+.Sh HISTORY
+The interrupt mapping API is based on the if_ringmap API in
+.Dx .
+It was ported to
+.Ox 6.8
+by
+.An David Gwynne Aq Mt [email protected] .
Index: share/man/man9/Makefile
===================================================================
RCS file: /cvs/src/share/man/man9/Makefile,v
retrieving revision 1.300
diff -u -p -r1.300 Makefile
--- share/man/man9/Makefile 5 Jun 2020 02:24:12 -0000 1.300
+++ share/man/man9/Makefile 16 Jun 2020 00:13:50 -0000
@@ -20,7 +20,8 @@ MAN= aml_evalnode.9 atomic_add_int.9 ato
ieee80211_node.9 ieee80211_output.9 ieee80211_proto.9 \
ieee80211_radiotap.9 if_addrhook_add.9 if_get.9 if_rxr_init.9 \
ifiq_input.9 ifq_enqueue.9 \
- ifq_deq_begin.9 imax.9 iic.9 intro.9 inittodr.9 intr_barrier.9 \
+ ifq_deq_begin.9 imax.9 iic.9 intro.9 inittodr.9 \
+ intr_barrier.9 intrmap_create.9 \
KASSERT.9 km_alloc.9 knote.9 kthread.9 ktrace.9 \
lim_cur.9 loadfirmware.9 log.9 \
malloc.9 membar_sync.9 memcmp.9 mbuf.9 mbuf_tags.9 md5.9 mi_switch.9 \
Index: sys/sys/intrmap.h
===================================================================
RCS file: sys/sys/intrmap.h
diff -N sys/sys/intrmap.h
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ sys/sys/intrmap.h 16 Jun 2020 00:13:50 -0000
@@ -0,0 +1,38 @@
+/* $OpenBSD$ */
+
+/*
+ * Copyright (c) 2020 David Gwynne <[email protected]>
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#ifndef _SYS_INTRMAP_H_
+#define _SYS_INTRMAP_H_
+
+struct intrmap;
+
+#define INTRMAP_POWEROF2 (1 << 0)
+
+struct intrmap *intrmap_create(const struct device *,
+ unsigned int, unsigned int, unsigned int);
+void intrmap_destroy(struct intrmap *);
+
+void intrmap_match(const struct device *,
+ struct intrmap *, struct intrmap *);
+void intrmap_align(const struct device *,
+ struct intrmap *, struct intrmap *);
+
+unsigned int intrmap_count(const struct intrmap *);
+unsigned int intrmap_cpu(const struct intrmap *, unsigned int);
+
+#endif /* _SYS_INTRMAP_H_ */
Index: sys/kern/kern_intrmap.c
===================================================================
RCS file: sys/kern/kern_intrmap.c
diff -N sys/kern/kern_intrmap.c
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ sys/kern/kern_intrmap.c 16 Jun 2020 00:13:50 -0000
@@ -0,0 +1,347 @@
+/* $OpenBSD$ */
+
+/*
+ * Copyright (c) 1980, 1986, 1993
+ * The Regents of the University of California. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of the University nor the names of its contributors
+ * may be used to endorse or promote products derived from this software
+ * without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * @(#)if.c 8.3 (Berkeley) 1/4/94
+ * $FreeBSD: src/sys/net/if.c,v 1.185 2004/03/13 02:35:03 brooks Exp $
+ */
+
+/*
+ * This code is adapted from the if_ringmap code in DragonflyBSD,
+ * but generalised for use by all types of devices, not just network
+ * cards.
+ */
+
+#include <sys/param.h>
+#include <sys/systm.h>
+#include <sys/device.h>
+#include <sys/malloc.h>
+#include <sys/rwlock.h>
+
+#include <sys/intrmap.h>
+
+struct intrmap_cpus {
+ struct refcnt ic_refs;
+ unsigned int ic_count;
+ unsigned int *ic_cpumap;
+};
+
+struct intrmap {
+ unsigned int im_count;
+ unsigned int im_grid;
+ struct intrmap_cpus *
+ im_cpus;
+ unsigned int *im_cpumap;
+};
+
+/*
+ * The CPUs that should be used for interrupts may be a subset of all CPUs.
+ */
+
+struct rwlock intrmap_lock = RWLOCK_INITIALIZER("intrcpus");
+struct intrmap_cpus *intrmap_cpus = NULL;
+int intrmap_ncpu = 0;
+
+static void
+intrmap_cpus_put(struct intrmap_cpus *ic)
+{
+ if (ic == NULL)
+ return;
+
+ if (refcnt_rele(&ic->ic_refs)) {
+ free(ic->ic_cpumap, M_DEVBUF,
+ ic->ic_count * sizeof(*ic->ic_cpumap));
+ free(ic, M_DEVBUF, sizeof(*ic));
+ }
+}
+
+static struct intrmap_cpus *
+intrmap_cpus_get(void)
+{
+ struct intrmap_cpus *oic = NULL;
+ struct intrmap_cpus *ic;
+
+ rw_enter_write(&intrmap_lock);
+ if (intrmap_ncpu != ncpus) {
+ unsigned int icpus = 0;
+ unsigned int *cpumap;
+ CPU_INFO_ITERATOR cii;
+ struct cpu_info *ci;
+
+ /*
+ * there's a new "version" of the set of CPUs available, so
+ * we need to figure out which ones we can use for interrupts.
+ */
+
+ cpumap = mallocarray(ncpus, sizeof(*cpumap),
+ M_DEVBUF, M_WAITOK);
+
+ CPU_INFO_FOREACH(cii, ci) {
+#ifdef __HAVE_CPU_TOPOLOGY
+ if (ci->ci_smt_id > 0)
+ continue;
+#endif
+ cpumap[icpus++] = CPU_INFO_UNIT(ci);
+ }
+
+ if (icpus < ncpus) {
+ /* this is mostly about free(9) needing a size */
+ unsigned int *icpumap = mallocarray(icpus,
+ sizeof(*icpumap), M_DEVBUF, M_WAITOK);
+ memcpy(icpumap, cpumap, icpus * sizeof(*icpumap));
+ free(cpumap, M_DEVBUF, ncpus * sizeof(*cpumap));
+ cpumap = icpumap;
+ }
+
+ ic = malloc(sizeof(*ic), M_DEVBUF, M_WAITOK);
+ refcnt_init(&ic->ic_refs);
+ ic->ic_count = icpus;
+ ic->ic_cpumap = cpumap;
+ intrmap_ncpu = ncpus; /* remember which "version" this is */
+ oic = intrmap_cpus;
+ intrmap_cpus = ic; /* give this ref to the global. */
+ } else
+ ic = intrmap_cpus;
+
+ refcnt_take(&ic->ic_refs); /* take a ref for the caller */
+ rw_exit_write(&intrmap_lock);
+
+ intrmap_cpus_put(oic);
+
+ return (ic);
+}
+
+static int
+intrmap_nintrs(const struct intrmap_cpus *ic, unsigned int nintrs,
+ unsigned int maxintrs)
+{
+ KASSERTMSG(maxintrs > 0, "invalid maximum interrupt count %u",
+ maxintrs);
+
+ if (nintrs == 0 || nintrs > maxintrs)
+ nintrs = maxintrs;
+ if (nintrs > ic->ic_count)
+ nintrs = ic->ic_count;
+ return (nintrs);
+}
+
+static void
+intrmap_set_grid(struct intrmap *im, unsigned int unit, unsigned int grid)
+{
+ unsigned int i, offset;
+ unsigned int *cpumap = im->im_cpumap;
+ const struct intrmap_cpus *ic = im->im_cpus;
+
+ KASSERTMSG(grid > 0, "invalid intrmap grid %u", grid);
+ KASSERTMSG(grid >= im->im_count, "invalid intrmap grid %u, count %u",
+ grid, im->im_count);
+ im->im_grid = grid;
+
+ offset = (grid * unit) % ic->ic_count;
+ for (i = 0; i < im->im_count; i++) {
+ cpumap[i] = offset + i;
+ KASSERTMSG(cpumap[i] < ic->ic_count,
+ "invalid cpumap[%u] = %u, offset %u (ncpu %u)", i,
+ cpumap[i], offset, ic->ic_count);
+ }
+}
+
+struct intrmap *
+intrmap_create(const struct device *dv,
+ unsigned int nintrs, unsigned int maxintrs, unsigned int flags)
+{
+ struct intrmap *im;
+ unsigned int unit = dv->dv_unit;
+ unsigned int i, grid = 0, prev_grid;
+ struct intrmap_cpus *ic;
+
+ ic = intrmap_cpus_get();
+
+ nintrs = intrmap_nintrs(ic, nintrs, maxintrs);
+ if (ISSET(flags, INTRMAP_POWEROF2))
+ nintrs = 1 << (fls(nintrs) - 1);
+ im = malloc(sizeof(*im), M_DEVBUF, M_WAITOK | M_ZERO);
+ im->im_count = nintrs;
+ im->im_cpus = ic;
+ im->im_cpumap = mallocarray(nintrs, sizeof(*im->im_cpumap), M_DEVBUF,
+ M_WAITOK | M_ZERO);
+
+ prev_grid = ic->ic_count;
+ for (i = 0; i < ic->ic_count; i++) {
+ if (ic->ic_count % (i + 1) != 0)
+ continue;
+
+ grid = ic->ic_count / (i + 1);
+ if (nintrs > grid) {
+ grid = prev_grid;
+ break;
+ }
+
+ if (nintrs > ic->ic_count / (i + 2))
+ break;
+ prev_grid = grid;
+ }
+ intrmap_set_grid(im, unit, grid);
+
+ return (im);
+}
+
+void
+intrmap_destroy(struct intrmap *im)
+{
+ free(im->im_cpumap, M_DEVBUF, im->im_count * sizeof(*im->im_cpumap));
+ intrmap_cpus_put(im->im_cpus);
+ free(im, M_DEVBUF, sizeof(*im));
+}
+
+/*
+ * Align the two ringmaps.
+ *
+ * e.g. 8 netisrs, rm0 contains 4 rings, rm1 contains 2 rings.
+ *
+ * Before:
+ *
+ * CPU      0  1  2  3  4  5  6  7
+ * NIC_RX              n0 n1 n2 n3
+ * NIC_TX        N0 N1
+ *
+ * After:
+ *
+ * CPU      0  1  2  3  4  5  6  7
+ * NIC_RX              n0 n1 n2 n3
+ * NIC_TX              N0 N1
+ */
+void
+intrmap_align(const struct device *dv,
+ struct intrmap *im0, struct intrmap *im1)
+{
+ unsigned int unit = dv->dv_unit;
+
+ KASSERT(im0->im_cpus == im1->im_cpus);
+
+ if (im0->im_grid > im1->im_grid)
+ intrmap_set_grid(im1, unit, im0->im_grid);
+ else if (im0->im_grid < im1->im_grid)
+ intrmap_set_grid(im0, unit, im1->im_grid);
+}
+
+void
+intrmap_match(const struct device *dv,
+ struct intrmap *im0, struct intrmap *im1)
+{
+ unsigned int unit = dv->dv_unit;
+ const struct intrmap_cpus *ic;
+ unsigned int subset_grid, cnt, divisor, mod, offset, i;
+ struct intrmap *subset_im, *im;
+ unsigned int old_im0_grid, old_im1_grid;
+
+ KASSERT(im0->im_cpus == im1->im_cpus);
+ if (im0->im_grid == im1->im_grid)
+ return;
+
+ /* Save grid for later use */
+ old_im0_grid = im0->im_grid;
+ old_im1_grid = im1->im_grid;
+
+ intrmap_align(dv, im0, im1);
+
+ /*
+ * Re-shuffle rings to get more even distribution.
+ *
+ * e.g. 12 netisrs, rm0 contains 4 rings, rm1 contains 2 rings.
+ *
+ * CPU      0  1  2  3  4  5  6  7  8  9 10 11
+ *
+ * NIC_RX  a0 a1 a2 a3 b0 b1 b2 b3 c0 c1 c2 c3
+ * NIC_TX  A0 A1       B0 B1       C0 C1
+ *
+ * NIC_RX  d0 d1 d2 d3 e0 e1 e2 e3 f0 f1 f2 f3
+ * NIC_TX        D0 D1       E0 E1       F0 F1
+ */
+
+ if (im0->im_count >= (2 * old_im1_grid)) {
+ cnt = im0->im_count;
+ subset_grid = old_im1_grid;
+ subset_im = im1;
+ im = im0;
+ } else if (im1->im_count > (2 * old_im0_grid)) {
+ cnt = im1->im_count;
+ subset_grid = old_im0_grid;
+ subset_im = im0;
+ im = im1;
+ } else {
+ /* No space to shuffle. */
+ return;
+ }
+
+ ic = im0->im_cpus;
+
+ mod = cnt / subset_grid;
+ KASSERT(mod >= 2);
+ divisor = ic->ic_count / im->im_grid;
+ offset = ((unit / divisor) % mod) * subset_grid;
+
+ for (i = 0; i < subset_im->im_count; i++) {
+ subset_im->im_cpumap[i] += offset;
+ KASSERTMSG(subset_im->im_cpumap[i] < ic->ic_count,
+ "match: invalid cpumap[%u] = %u, offset %u",
+ i, subset_im->im_cpumap[i], offset);
+ }
+#ifdef DIAGNOSTIC
+ for (i = 0; i < subset_im->im_count; i++) {
+ unsigned int j;
+
+ for (j = 0; j < im->im_count; j++) {
+ if (im->im_cpumap[j] == subset_im->im_cpumap[i])
+ break;
+ }
+ KASSERTMSG(j < im->im_count,
+ "subset cpumap[%u] = %u not found in superset",
+ i, subset_im->im_cpumap[i]);
+ }
+#endif
+}
+
+unsigned int
+intrmap_count(const struct intrmap *im)
+{
+ return (im->im_count);
+}
+
+unsigned int
+intrmap_cpu(const struct intrmap *im, unsigned int ring)
+{
+ const struct intrmap_cpus *ic = im->im_cpus;
+ unsigned int icpu;
+ KASSERTMSG(ring < im->im_count, "invalid ring %u", ring);
+ icpu = im->im_cpumap[ring];
+ KASSERTMSG(icpu < ic->ic_count, "invalid interrupt cpu %u for ring %u"
+ " (intrmap %p)", icpu, ring, im);
+ return (ic->ic_cpumap[icpu]);
+}
Index: sys/conf/files
===================================================================
RCS file: /cvs/src/sys/conf/files,v
retrieving revision 1.686
diff -u -p -r1.686 files
--- sys/conf/files 15 Apr 2020 09:26:49 -0000 1.686
+++ sys/conf/files 16 Jun 2020 00:13:50 -0000
@@ -20,6 +20,7 @@ define i2cbus {}
define gpiobus {}
define onewirebus {}
define video {}
+define intrmap
# filesystem firmware loading attribute
define firmload
@@ -691,6 +692,7 @@ file kern/kern_resource.c
file kern/kern_pledge.c
file kern/kern_unveil.c
file kern/kern_sched.c
+file kern/kern_intrmap.c intrmap
file kern/kern_sensors.c
file kern/kern_sig.c
file kern/kern_smr.c