Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-04-04 Thread Christopher Lameter
On Wed, 13 Mar 2019, Barret Rhoden wrote:

> > It is very expensive. VSMP exchanges 4K segments via RDMA between servers
> > to build a large address space and run a kernel in the large address
> > space. Using smaller segments can cause a lot of
> > "cacheline" bouncing (meaning transfers of 4K segments back and forth
> > between servers).
> >
>
> Given that these are large machines, would it be OK to statically reserve 64K
> on them for modules' percpu data?

Likely.

> The bug that led me to here was from someone running on a non-VSMP machine but
> had that config set.  Perhaps we make it more clear in the Kconfig option to
> not set it on other machines.  That might make it less likely anyone on a
> non-VSMP machine pays the 64K overhead.

Right.

> Are there any other alternatives?  Not using static SRCU in any code that
> could be built as a module seems a little harsh.

Sorry this ended up in my spam folder somehow. Just fished it out.



Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-18 Thread Paul E. McKenney
On Mon, Mar 18, 2019 at 10:18:48AM +0200, Eial Czerwacki wrote:
> Greetings Paul,
> 
> On 3/15/19 12:19 AM, Paul E. McKenney wrote:
> > On Thu, Mar 14, 2019 at 10:36:19AM -0700, Tejun Heo wrote:
> >> On Wed, Mar 13, 2019 at 04:11:55PM -0700, Paul E. McKenney wrote:
> >>> commit 34f67df09cc0c6bf082a7cfca435373caeeb8d82
> >>> Author: Paul E. McKenney 
> >>> Date:   Wed Mar 13 16:06:22 2019 -0700
> >>>
> >>> srcu: Forbid DEFINE{,_STATIC}_SRCU() from modules
> >>> 
> >>> Adding DEFINE_SRCU() or DEFINE_STATIC_SRCU() to a loadable module
> >>> requires that the size of the reserved region be increased, which is
> >>> not something we want to be doing all that often.  Instead, loadable
> >>> modules should define an srcu_struct and invoke init_srcu_struct()
> >>> from their module_init function and cleanup_srcu_struct() from their
> >>> module_exit function.  Note that modules using call_srcu() will also
> >>> need to invoke srcu_barrier() from their module_exit function.
> >>> 
> >>> This commit enforces this advice by refusing to define DEFINE_SRCU()
> >>> and DEFINE_STATIC_SRCU() within loadable modules.
> >>> 
> >>> Suggested-by: Barret Rhoden 
> >>> Signed-off-by: Paul E. McKenney 
> >>
> >> Looks-great-to-me-by: Tejun Heo 
> > 
> > Applied.  ;-)
> > 
> > Thanx, Paul
> > 
> >> Thanks. :)
> >>
> >> -- 
> >> tejun
> >>
> > 
> > 
> 
> When can this patch be found in the kernel mainline git repo? I'd like
> to test and see if the issue that started this mail thread still occurs.

Thank you for your interest!

It is a3f5f4fae725 ("srcu: Forbid DEFINE{,_STATIC}_SRCU() from modules")
in my -rcu tree.  If all goes well, I will submit it to the v5.2 merge
window.  I do not expect it to be submitted to -stable.

And -rcu is here:

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git

Thanx, Paul



Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-18 Thread Eial Czerwacki
Greetings Paul,

On 3/15/19 12:19 AM, Paul E. McKenney wrote:
> On Thu, Mar 14, 2019 at 10:36:19AM -0700, Tejun Heo wrote:
>> On Wed, Mar 13, 2019 at 04:11:55PM -0700, Paul E. McKenney wrote:
>>> commit 34f67df09cc0c6bf082a7cfca435373caeeb8d82
>>> Author: Paul E. McKenney 
>>> Date:   Wed Mar 13 16:06:22 2019 -0700
>>>
>>> srcu: Forbid DEFINE{,_STATIC}_SRCU() from modules
>>> 
>>> Adding DEFINE_SRCU() or DEFINE_STATIC_SRCU() to a loadable module
>>> requires that the size of the reserved region be increased, which is
>>> not something we want to be doing all that often.  Instead, loadable
>>> modules should define an srcu_struct and invoke init_srcu_struct()
>>> from their module_init function and cleanup_srcu_struct() from their
>>> module_exit function.  Note that modules using call_srcu() will also
>>> need to invoke srcu_barrier() from their module_exit function.
>>> 
>>> This commit enforces this advice by refusing to define DEFINE_SRCU()
>>> and DEFINE_STATIC_SRCU() within loadable modules.
>>> 
>>> Suggested-by: Barret Rhoden 
>>> Signed-off-by: Paul E. McKenney 
>>
>> Looks-great-to-me-by: Tejun Heo 
> 
> Applied.  ;-)
> 
>   Thanx, Paul
> 
>> Thanks. :)
>>
>> -- 
>> tejun
>>
> 
> 

When can this patch be found in the kernel mainline git repo? I'd like
to test and see if the issue that started this mail thread still occurs.

Thanks,

Eial.


Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-14 Thread Paul E. McKenney
On Thu, Mar 14, 2019 at 10:36:19AM -0700, Tejun Heo wrote:
> On Wed, Mar 13, 2019 at 04:11:55PM -0700, Paul E. McKenney wrote:
> > commit 34f67df09cc0c6bf082a7cfca435373caeeb8d82
> > Author: Paul E. McKenney 
> > Date:   Wed Mar 13 16:06:22 2019 -0700
> > 
> > srcu: Forbid DEFINE{,_STATIC}_SRCU() from modules
> > 
> > Adding DEFINE_SRCU() or DEFINE_STATIC_SRCU() to a loadable module
> > requires that the size of the reserved region be increased, which is
> > not something we want to be doing all that often.  Instead, loadable
> > modules should define an srcu_struct and invoke init_srcu_struct()
> > from their module_init function and cleanup_srcu_struct() from their
> > module_exit function.  Note that modules using call_srcu() will also
> > need to invoke srcu_barrier() from their module_exit function.
> > 
> > This commit enforces this advice by refusing to define DEFINE_SRCU()
> > and DEFINE_STATIC_SRCU() within loadable modules.
> > 
> > Suggested-by: Barret Rhoden 
> > Signed-off-by: Paul E. McKenney 
> 
> Looks-great-to-me-by: Tejun Heo 

Applied.  ;-)

Thanx, Paul

> Thanks. :)
> 
> -- 
> tejun
> 



Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-14 Thread Tejun Heo
On Wed, Mar 13, 2019 at 04:11:55PM -0700, Paul E. McKenney wrote:
> commit 34f67df09cc0c6bf082a7cfca435373caeeb8d82
> Author: Paul E. McKenney 
> Date:   Wed Mar 13 16:06:22 2019 -0700
> 
> srcu: Forbid DEFINE{,_STATIC}_SRCU() from modules
> 
> Adding DEFINE_SRCU() or DEFINE_STATIC_SRCU() to a loadable module
> requires that the size of the reserved region be increased, which is
> not something we want to be doing all that often.  Instead, loadable
> modules should define an srcu_struct and invoke init_srcu_struct()
> from their module_init function and cleanup_srcu_struct() from their
> module_exit function.  Note that modules using call_srcu() will also
> need to invoke srcu_barrier() from their module_exit function.
> 
> This commit enforces this advice by refusing to define DEFINE_SRCU()
> and DEFINE_STATIC_SRCU() within loadable modules.
> 
> Suggested-by: Barret Rhoden 
> Signed-off-by: Paul E. McKenney 

Looks-great-to-me-by: Tejun Heo 

Thanks. :)

-- 
tejun


Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-13 Thread Paul E. McKenney
On Wed, Mar 13, 2019 at 02:29:12PM -0700, Tejun Heo wrote:
> Hello,
> 
> On Wed, Mar 13, 2019 at 02:22:55PM -0700, Paul E. McKenney wrote:
> > Should I define DEFINE_SRCU() and DEFINE_STATIC_SRCU() only if
> > !defined(MODULE)?
> 
> Yeah, that sounds like a great idea with comments explaining why it's
> like that.

Like this?

 * Build-time srcu_struct definition is not allowed in modules because
 * otherwise it is necessary to increase the size of the reserved region
 * each time a DEFINE_SRCU() or DEFINE_STATIC_SRCU() is added to a
 * kernel module.  Kernel modules should instead declare an srcu_struct
 * and then invoke init_srcu_struct() from their module_init function and
 * cleanup_srcu_struct() from their module_exit function.  Note that modules
 * using call_srcu() will also need to invoke srcu_barrier() from their
 * module_exit function.

Also, it looks like Barret beat me to this suggestion.  ;-)

In addition, rcutorture and rcuperf needed to be updated because
they used to use DEFINE_STATIC_SRCU() whether built in or built
as a loadable module.

How does the (very lightly tested) patch below look to you all?

Thanx, Paul



commit 34f67df09cc0c6bf082a7cfca435373caeeb8d82
Author: Paul E. McKenney 
Date:   Wed Mar 13 16:06:22 2019 -0700

srcu: Forbid DEFINE{,_STATIC}_SRCU() from modules

Adding DEFINE_SRCU() or DEFINE_STATIC_SRCU() to a loadable module
requires that the size of the reserved region be increased, which is
not something we want to be doing all that often.  Instead, loadable
modules should define an srcu_struct and invoke init_srcu_struct()
from their module_init function and cleanup_srcu_struct() from their
module_exit function.  Note that modules using call_srcu() will also
need to invoke srcu_barrier() from their module_exit function.

This commit enforces this advice by refusing to define DEFINE_SRCU()
and DEFINE_STATIC_SRCU() within loadable modules.

Suggested-by: Barret Rhoden 
Signed-off-by: Paul E. McKenney 

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 7f7c8c050f63..ac5ea1c72e97 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -105,6 +105,15 @@ struct srcu_struct {
  * Define and initialize a srcu struct at build time.
  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
  *
+ * Build-time srcu_struct definition is not allowed in modules because
+ * otherwise it is necessary to increase the size of the reserved region
+ * each time a DEFINE_SRCU() or DEFINE_STATIC_SRCU() is added to a
+ * kernel module.  Kernel modules should instead declare an srcu_struct
+ * and then invoke init_srcu_struct() from their module_init function and
+ * cleanup_srcu_struct() from their module_exit function.  Note that modules
+ * using call_srcu() will also need to invoke srcu_barrier() from their
+ * module_exit function.
+ *
  * Note that although DEFINE_STATIC_SRCU() hides the name from other
  * files, the per-CPU variable rules nevertheless require that the
  * chosen name be globally unique.  These rules also prohibit use of
@@ -120,11 +129,13 @@ struct srcu_struct {
  *
  * See include/linux/percpu-defs.h for the rules on per-CPU variables.
  */
-#define __DEFINE_SRCU(name, is_static) \
-   static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);\
+#ifndef MODULE
+#  define __DEFINE_SRCU(name, is_static)   \
+   static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);  \
    is_static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name##_srcu_data)
-#define DEFINE_SRCU(name)  __DEFINE_SRCU(name, /* not static */)
-#define DEFINE_STATIC_SRCU(name)   __DEFINE_SRCU(name, static)
+#  define DEFINE_SRCU(name)        __DEFINE_SRCU(name, /* not static */)
+#  define DEFINE_STATIC_SRCU(name) __DEFINE_SRCU(name, static)
+#endif
 
 void synchronize_srcu_expedited(struct srcu_struct *ssp);
 void srcu_barrier(struct srcu_struct *ssp);
diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index c29761152874..b44208b3bf95 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -139,6 +139,7 @@ struct rcu_perf_ops {
void (*sync)(void);
void (*exp_sync)(void);
const char *name;
+   const char *altname;
 };
 
 static struct rcu_perf_ops *cur_ops;
@@ -186,8 +187,16 @@ static struct rcu_perf_ops rcu_ops = {
  * Definitions for srcu perf testing.
  */
 
+static struct srcu_struct *srcu_ctlp;
+
+#ifndef MODULE
 DEFINE_STATIC_SRCU(srcu_ctl_perf);
-static struct srcu_struct *srcu_ctlp = &srcu_ctl_perf;
+
+static void srcu_sync_perf_init(void)
+{
+   srcu_ctlp = &srcu_ctl_perf;
+}
+#endif
 
 static int srcu_perf_read_lock(void) __acquires(srcu_ctlp)
 {
@@ -224,9 +233,10 @@ static void 

Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-13 Thread Tejun Heo
Hello,

On Wed, Mar 13, 2019 at 02:22:55PM -0700, Paul E. McKenney wrote:
> Should I define DEFINE_SRCU() and DEFINE_STATIC_SRCU() only if
> !defined(MODULE)?

Yeah, that sounds like a great idea with comments explaining why it's
like that.

Thanks.

-- 
tejun


Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-13 Thread Paul E. McKenney
On Wed, Mar 13, 2019 at 01:26:40PM -0700, Tejun Heo wrote:
> Hello,
> 
> On Wed, Mar 13, 2019 at 03:40:04PM -0400, Barret Rhoden wrote:
> > Are there any other alternatives?  Not using static SRCU in any code
> > that could be built as a module seems a little harsh.
> 
> Yes, allocate the srcu dynamically on module init and destroy on
> module exit.  That's how the other similar case got solved too.  We
> can't keep bumping up reserved size by the number of static SRCUs in
> modules.  It's mostly there to make trivial small things easier.  We
> don't lose anything meaningful by allocating srcu dynamically.

Should I define DEFINE_SRCU() and DEFINE_STATIC_SRCU() only if
!defined(MODULE)?

Untested (probably doesn't even build) patch below.

Thanx, Paul



diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 7f7c8c050f63..a979da9cf71f 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -105,6 +105,8 @@ struct srcu_struct {
  * Define and initialize a srcu struct at build time.
  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
  *
+ * Build-time srcu_struct definition is not allowed in modules.
+ *
  * Note that although DEFINE_STATIC_SRCU() hides the name from other
  * files, the per-CPU variable rules nevertheless require that the
  * chosen name be globally unique.  These rules also prohibit use of
@@ -120,11 +122,13 @@ struct srcu_struct {
  *
  * See include/linux/percpu-defs.h for the rules on per-CPU variables.
  */
-#define __DEFINE_SRCU(name, is_static) \
-   static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);\
+#ifndef MODULE
+#  define __DEFINE_SRCU(name, is_static)   \
+   static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);  \
    is_static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name##_srcu_data)
-#define DEFINE_SRCU(name)  __DEFINE_SRCU(name, /* not static */)
-#define DEFINE_STATIC_SRCU(name)   __DEFINE_SRCU(name, static)
+#  define DEFINE_SRCU(name)        __DEFINE_SRCU(name, /* not static */)
+#  define DEFINE_STATIC_SRCU(name) __DEFINE_SRCU(name, static)
+#endif
 
 void synchronize_srcu_expedited(struct srcu_struct *ssp);
 void srcu_barrier(struct srcu_struct *ssp);
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 5ff797fd3715..7cf1e3aed695 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -496,9 +496,18 @@ static struct rcu_torture_ops rcu_busted_ops = {
  * Definitions for srcu torture testing.
  */
 
-DEFINE_STATIC_SRCU(srcu_ctl);
 static struct srcu_struct srcu_ctld;
-static struct srcu_struct *srcu_ctlp = &srcu_ctl;
+static struct srcu_struct *srcu_ctlp;
+
+#ifndef MODULE
+DEFINE_STATIC_SRCU(srcu_ctl);
+
+static void srcu_torture_init(void)
+{
+   rcu_sync_torture_init();
+   srcu_ctlp = &srcu_ctl;
+}
+#endif
 
 static int srcu_torture_read_lock(void) __acquires(srcu_ctlp)
 {
@@ -565,9 +574,10 @@ static void srcu_torture_synchronize_expedited(void)
synchronize_srcu_expedited(srcu_ctlp);
 }
 
+#ifndef MODULE
 static struct rcu_torture_ops srcu_ops = {
.ttype  = SRCU_FLAVOR,
-   .init   = rcu_sync_torture_init,
+   .init   = srcu_torture_init,
.readlock   = srcu_torture_read_lock,
.read_delay = srcu_read_delay,
.readunlock = srcu_torture_read_unlock,
@@ -581,25 +591,25 @@ static struct rcu_torture_ops srcu_ops = {
.irq_capable= 1,
.name   = "srcu"
 };
+#endif
 
-static void srcu_torture_init(void)
+static void srcud_torture_init(void)
 {
rcu_sync_torture_init();
WARN_ON(init_srcu_struct(&srcu_ctld));
srcu_ctlp = &srcu_ctld;
 }
 
-static void srcu_torture_cleanup(void)
+static void srcud_torture_cleanup(void)
 {
cleanup_srcu_struct(&srcu_ctld);
-   srcu_ctlp = &srcu_ctl; /* In case of a later rcutorture run. */
 }
 
 /* As above, but dynamically allocated. */
 static struct rcu_torture_ops srcud_ops = {
.ttype  = SRCU_FLAVOR,
-   .init   = srcu_torture_init,
-   .cleanup= srcu_torture_cleanup,
+   .init   = srcud_torture_init,
+   .cleanup= srcud_torture_cleanup,
.readlock   = srcu_torture_read_lock,
.read_delay = srcu_read_delay,
.readunlock = srcu_torture_read_unlock,
@@ -617,8 +627,8 @@ static struct rcu_torture_ops srcud_ops = {
 /* As above, but broken due to inappropriate reader extension. */
 static struct rcu_torture_ops busted_srcud_ops = {
.ttype  = SRCU_FLAVOR,
-   .init   = srcu_torture_init,
-   .cleanup= srcu_torture_cleanup,
+   .init   = srcud_torture_init,
+   .cleanup= srcud_torture_cleanup,
.readlock   = srcu_torture_read_lock,

Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-13 Thread Tejun Heo
Hello,

On Wed, Mar 13, 2019 at 03:40:04PM -0400, Barret Rhoden wrote:
> Are there any other alternatives?  Not using static SRCU in any code
> that could be built as a module seems a little harsh.

Yes, allocate the srcu dynamically on module init and destroy on
module exit.  That's how the other similar case got solved too.  We
can't keep bumping up reserved size by the number of static SRCUs in
modules.  It's mostly there to make trivial small things easier.  We
don't lose anything meaningful by allocating srcu dynamically.

Thanks.

-- 
tejun
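
For readers following the thread, the dynamic approach described above
(and spelled out in Paul's commit log) can be sketched roughly as below.
This is only an illustration, not code from any posted patch. The
srcu_struct type and the three SRCU calls are replaced here by trivial
userspace stand-ins so that the ordering can be compiled and checked
outside the kernel, and the names demo_srcu, demo_init() and demo_exit()
are hypothetical. In a real module the two functions would be registered
with module_init() and module_exit(), and the real definitions would come
from <linux/srcu.h>.

```c
#include <stdbool.h>

/*
 * Userspace stand-ins (assumptions for illustration only).  In the kernel,
 * struct srcu_struct, init_srcu_struct(), srcu_barrier() and
 * cleanup_srcu_struct() are provided by <linux/srcu.h>.
 */
struct srcu_struct {
	bool initialized;
	int pending_callbacks;	/* models callbacks queued by call_srcu() */
};

static int init_srcu_struct(struct srcu_struct *ssp)
{
	ssp->initialized = true;
	ssp->pending_callbacks = 0;
	return 0;		/* the real function can fail, e.g. -ENOMEM */
}

static void srcu_barrier(struct srcu_struct *ssp)
{
	ssp->pending_callbacks = 0;	/* models waiting for queued callbacks */
}

static void cleanup_srcu_struct(struct srcu_struct *ssp)
{
	ssp->initialized = false;
}

/* Hypothetical module state: declared, but not built with DEFINE_SRCU(). */
static struct srcu_struct demo_srcu;

/* Would be passed to module_init() in a real module. */
static int demo_init(void)
{
	return init_srcu_struct(&demo_srcu);
}

/* Would be passed to module_exit() in a real module. */
static void demo_exit(void)
{
	srcu_barrier(&demo_srcu);	/* only needed if call_srcu() was used */
	cleanup_srcu_struct(&demo_srcu);
}
```

The point of the ordering in demo_exit() is that any outstanding
callbacks are drained before the structure is torn down; nothing here
consumes the reserved per-CPU region at build time.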


Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-13 Thread Barret Rhoden

Hi -

On 03/01/2019 04:54 PM, Christopher Lameter wrote:
> On Fri, 1 Mar 2019, Barret Rhoden wrote:
>
> > I'm not familiar with VSMP - how bad is it to use L1 cache alignment instead
> > of 4K page alignment?  Maybe some structures can use the smaller alignment?
> > Or maybe have VSMP require SRCU-using modules to be built-in?
>
> It is very expensive. VSMP exchanges 4K segments via RDMA between servers
> to build a large address space and run a kernel in the large address
> space. Using smaller segments can cause a lot of
> "cacheline" bouncing (meaning transfers of 4K segments back and forth
> between servers).



Given that these are large machines, would it be OK to statically 
reserve 64K on them for modules' percpu data?


The bug that led me to here was from someone running on a non-VSMP 
machine but had that config set.  Perhaps we make it more clear in the 
Kconfig option to not set it on other machines.  That might make it less 
likely anyone on a non-VSMP machine pays the 64K overhead.


Are there any other alternatives?  Not using static SRCU in any code 
that could be built as a module seems a little harsh.


Thanks,

Barret



Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-03 Thread Eial Czerwacki
Greetings Barret,

On 3/1/19 8:30 PM, Barret Rhoden wrote:
> Hi -
> 
> On 01/21/2019 06:47 AM, Eial Czerwacki wrote:
>>
> 
> Your main issue was that you only sent this patch to LKML, but not the
> maintainers of the file.  If you don't, your patch might get lost.  To
> get the appropriate people and lists, run:
> 
> scripts/get_maintainer.pl YOUR_PATCH.patch.
> 
> For this patch, you'll get this:
> 
> Dennis Zhou  (maintainer:PER-CPU MEMORY ALLOCATOR)
> Tejun Heo  (maintainer:PER-CPU MEMORY ALLOCATOR)
> Christoph Lameter  (maintainer:PER-CPU MEMORY ALLOCATOR)
> linux-kernel@vger.kernel.org (open list)
> 
> I added the three maintainers to this email.
> 
> I have a few minor comments below.
> 
thanks, I did not know that, I'll use it next time.

>> [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
> 
> You misspelled 'reservation'.  Also, I'd just say: "percpu: increase
> module reservation size if X86_VSMP is set".  ('change' -> 'increase'),
> only says 'reservation' once.)
> 
>> as reported in bug #201339
>> (https://bugzilla.kernel.org/show_bug.cgi?id=201339)
> 
> I think you can add a tag for this right above your Signed-off-by tags.
> e.g.:
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201339
> 
>> by enabling X86_VSMP, INTERNODE_CACHE_BYTES's definition differs from
>> the default one
>> causing the struct size to exceed the size ok 8KB.
>     ^of
> 
will fix, thanks.

> Which struct are you talking about?  I have one in mind, but others
> might not know from reading the commit message.
> 
> I ran into this on https://bugzilla.kernel.org/show_bug.cgi?id=202511.
> In that case, it was because modules (drm and amdkfd) were using
> DEFINE_SRCU, which does a DEFINE_PER_CPU on struct srcu_data, and that
> used cacheline_internodealigned_in_smp.
you are correct, the structure in question is struct srcu_data.

> 
>>
>> in order to avoid such issue, increse PERCPU_MODULE_RESERVE to 64KB if
>> CONFIG_X86_VSMP is set.
>    ^increase
> 
>>
>> the value was caculated on linux 4.20.3, make allmodconfig all and the
>> following oneliner:
>    ^calculated
> 
will fix, thanks.

>> for f in `find -name *.ko`; do echo $f; readelf -S $f  |grep perc;
>> done |grep data..percpu -B 1 |grep ko |while read r; do echo -n "$r:
>> "; objdump --syms --section=.data..percpu $r|grep data |sort -n  |awk
>> '{c++; d=strtonum("0x" $1) + strtonum("0x" $5); if (m < d) m = d;} END
>> {printf("%d vars-> last addr 0x%x ( %d )\n", c, m, m)}' ; done |column
>> -t |sort -k 8 -n | awk '{print $8}'| paste -sd+ | bc
> 
> Not sure how useful the one-liner is, versus a description of what
> you're doing.  i.e. "the size of all module percpu data sections, or
> something."
I thought an easy way to reproduce it would suffice; I'll take that into account.

> 
> Also, how close was that calculated value to 64K?  If more modules start
> using DEFINE_SRCU, each of which uses 8K, then that 64K might run out.
the biggest module was 12472 bytes in size. since multiple modules share
the same percpu reserve, more is needed; 64K was the only size I found
that made everything fit.

of course it is possible that in some specific scenario 64K will not be
enough, but we have yet to encounter such a scenario.

> 
> Thanks,
> Barret
> 
>> Signed-off-by: Eial Czerwacki 
>> Signed-off-by: Shai Fultheim 
>> Signed-off-by: Oren Twaig 
>> ---
>>   include/linux/percpu.h | 4 
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
>> index 70b7123..6b79693 100644
>> --- a/include/linux/percpu.h
>> +++ b/include/linux/percpu.h
>> @@ -14,7 +14,11 @@
>>     /* enough to cover all DEFINE_PER_CPUs in modules */
>>   #ifdef CONFIG_MODULES
>> +#ifdef X86_VSMP
>> +#define PERCPU_MODULE_RESERVE    (1 << 16)
>> +#else
>>   #define PERCPU_MODULE_RESERVE    (8 << 10)
>> +#endif
>>   #else
>>   #define PERCPU_MODULE_RESERVE    0
>>   #endif
>>
> 
> 



Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-01 Thread Christopher Lameter
On Fri, 1 Mar 2019, Barret Rhoden wrote:

> I'm not familiar with VSMP - how bad is it to use L1 cache alignment instead
> of 4K page alignment?  Maybe some structures can use the smaller alignment?
> Or maybe have VSMP require SRCU-using modules to be built-in?

It is very expensive. VSMP exchanges 4K segments via RDMA between servers
to build a large address space and run a kernel in the large address
space. Using smaller segments can cause a lot of
"cacheline" bouncing (meaning transfers of 4K segments back and forth
between servers).
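
The cost described above is the reason some structures are aligned to the
internode cache size, and that same alignment is what inflates per-CPU
usage in modules. The userspace sketch below only illustrates the effect.
Both structs and both alignment constants are assumptions chosen to
mirror the 4K figure mentioned in this thread; they are not the kernel's
actual definitions, and the aligned attribute is the GCC/Clang spelling.

```c
#include <stddef.h>

#define DEMO_L1_ALIGN        64		/* assumed ordinary cache-line size */
#define DEMO_INTERNODE_ALIGN 4096	/* assumed VSMP-style 4K segment size */

/* A small payload aligned to an ordinary cache line. */
struct demo_data_l1 {
	long counters[4];
} __attribute__((aligned(DEMO_L1_ALIGN)));

/* The same payload aligned to a 4K segment. */
struct demo_data_internode {
	long counters[4];
} __attribute__((aligned(DEMO_INTERNODE_ALIGN)));

/* sizeof() is always rounded up to a multiple of the type's alignment. */
static size_t demo_size_l1(void)
{
	return sizeof(struct demo_data_l1);
}

static size_t demo_size_internode(void)
{
	return sizeof(struct demo_data_internode);
}
```

With these assumed values the first struct occupies 64 bytes and the
second a full 4096, so only a couple of such variables are enough to
exhaust an 8K reserve.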


Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-01 Thread Barret Rhoden

Hi -

On 03/01/2019 03:34 PM, Dennis Zhou wrote:

Hi Barret,

On Fri, Mar 01, 2019 at 01:30:15PM -0500, Barret Rhoden wrote:

Hi -

On 01/21/2019 06:47 AM, Eial Czerwacki wrote:




Your main issue was that you only sent this patch to LKML, but not the
maintainers of the file.  If you don't, your patch might get lost.  To get
the appropriate people and lists, run:

scripts/get_maintainer.pl YOUR_PATCH.patch.

For this patch, you'll get this:

Dennis Zhou  (maintainer:PER-CPU MEMORY ALLOCATOR)
Tejun Heo  (maintainer:PER-CPU MEMORY ALLOCATOR)
Christoph Lameter  (maintainer:PER-CPU MEMORY ALLOCATOR)
linux-kernel@vger.kernel.org (open list)

I added the three maintainers to this email.

I have a few minor comments below.


[PATCH] percpu/module resevation: change resevation size iff X86_VSMP is

set

You misspelled 'reservation'.  Also, I'd just say: "percpu: increase module
reservation size if X86_VSMP is set".  ('change' -> 'increase'), only says
'reservation' once.)


as reported in bug #201339 (https://bugzilla.kernel.org/show_bug.cgi?id=201339)


I think you can add a tag for this right above your Signed-off-by tags.
e.g.:

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201339


by enabling X86_VSMP, INTERNODE_CACHE_BYTES's definition differs from the 
default one
causing the struct size to exceed the size ok 8KB.

 ^of

Which struct are you talking about?  I have one in mind, but others might
not know from reading the commit message.

I ran into this on https://bugzilla.kernel.org/show_bug.cgi?id=202511. In
that case, it was because modules (drm and amdkfd) were using DEFINE_SRCU,
which does a DEFINE_PER_CPU on struct srcu_data, and that used
cacheline_internodealigned_in_smp.



in order to avoid such issue, increse PERCPU_MODULE_RESERVE to 64KB if 
CONFIG_X86_VSMP is set.

^increase



the value was caculated on linux 4.20.3, make allmodconfig all and the 
following oneliner:

^calculated


for f in `find -name *.ko`; do echo $f; readelf -S $f  |grep perc; done |grep data..percpu -B 1 |grep ko |while read r; do 
echo -n "$r: "; objdump --syms --section=.data..percpu $r|grep data |sort -n  |awk '{c++; d=strtonum("0x" 
$1) + strtonum("0x" $5); if (m < d) m = d;} END {printf("%d vars-> last addr 0x%x ( %d )\n", c, m, 
m)}' ; done |column -t |sort -k 8 -n | awk '{print $8}'| paste -sd+ | bc


Not sure how useful the one-liner is, versus a description of what you're
doing.  i.e. "the size of all module percpu data sections, or something."

Also, how close was that calculated value to 64K?  If more modules start
using DEFINE_SRCU, each of which uses 8K, then that 64K might run out.

Thanks,
Barret


Signed-off-by: Eial Czerwacki 
Signed-off-by: Shai Fultheim 
Signed-off-by: Oren Twaig 
---
   include/linux/percpu.h | 4 
   1 file changed, 4 insertions(+)

diff --git a/include/linux/percpu.h b/include/linux/percpu.h
index 70b7123..6b79693 100644
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -14,7 +14,11 @@
   /* enough to cover all DEFINE_PER_CPUs in modules */
   #ifdef CONFIG_MODULES
+#ifdef X86_VSMP
+#define PERCPU_MODULE_RESERVE  (1 << 16)
+#else
   #define PERCPU_MODULE_RESERVE(8 << 10)
+#endif
   #else
   #define PERCPU_MODULE_RESERVE0
   #endif





> Thanks for sending this to me.
>
> I must say, I really do not want to expand the reserved region. In most
> cases, it can easily end up unused and thus wasted memory as it is hard
> allocated on boot. This is done because code gen assumes static
> variables are close to the program counter. This would not be true with
> dynamic allocations, which begin at the end of the vmalloc area
> (Summarized from Tejun's account in [1]).
>
> Another note on the reserved region. It starts at the end of the static
> region which means it generally isn't page aligned. So while an 8kb
> allocation would fit, a 4kb alignment more than likely would fail.
> Something as large as 8kb should probably be dynamically allocated as
> well.
>
> I read through the bugzilla report and it seems that the culprits are:
>   drivers/gpu/drm/amd/amdkfd/kfd_process.c:DEFINE_SRCU(kfd_processes_srcu);
>   drivers/gpu/drm/drm_drv.c:DEFINE_STATIC_SRCU(drm_unplug_srcu);
>
> Is there a reason we cannot dynamically initialize these structs? I've
> cced Paul McKenney because we saw an issue with ipmi in December [1].


I looked at the AMD driver, and it looks like they could dynamically 
initialize it.  It would require a little extra plumbing.  I imagine the 
DRM one is the same way.


To catch this in the future, should we disallow DEFINE_SRCU in modules 
or something?  Otherwise, this will pop up again the next time someone 
uses DEFINE_SRCU in a module and builds with CONFIG_X86_VSMP.


That might be a little much, and it still won't be sufficient to catch 
all cases.  Thi

Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-01 Thread Dennis Zhou
Hi Barret,

On Fri, Mar 01, 2019 at 01:30:15PM -0500, Barret Rhoden wrote:
> Hi -
> 
> On 01/21/2019 06:47 AM, Eial Czerwacki wrote:
> >
> 
> Your main issue was that you only sent this patch to LKML, but not the
> maintainers of the file.  If you don't, your patch might get lost.  To get
> the appropriate people and lists, run:
> 
>   scripts/get_maintainer.pl YOUR_PATCH.patch.
> 
> For this patch, you'll get this:
> 
> Dennis Zhou  (maintainer:PER-CPU MEMORY ALLOCATOR)
> Tejun Heo  (maintainer:PER-CPU MEMORY ALLOCATOR)
> Christoph Lameter  (maintainer:PER-CPU MEMORY ALLOCATOR)
> linux-kernel@vger.kernel.org (open list)
> 
> I added the three maintainers to this email.
> 
> I have a few minor comments below.
> 
> > [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
> 
> You misspelled 'reservation'.  Also, I'd just say: "percpu: increase module
> reservation size if X86_VSMP is set".  ('change' -> 'increase'), only says
> 'reservation' once.)
> 
> > as reported in bug #201339 
> > (https://bugzilla.kernel.org/show_bug.cgi?id=201339)
> 
> I think you can add a tag for this right above your Signed-off-by tags.
> e.g.:
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201339
> 
> > by enabling X86_VSMP, INTERNODE_CACHE_BYTES's definition differs from the 
> > default one
> > causing the struct size to exceed the size ok 8KB.
> ^of
> 
> Which struct are you talking about?  I have one in mind, but others might
> not know from reading the commit message.
> 
> I ran into this on https://bugzilla.kernel.org/show_bug.cgi?id=202511. In
> that case, it was because modules (drm and amdkfd) were using DEFINE_SRCU,
> which does a DEFINE_PER_CPU on struct srcu_data, and that used
> cacheline_internodealigned_in_smp.
> 
> > 
> > in order to avoid such issue, increse PERCPU_MODULE_RESERVE to 64KB if 
> > CONFIG_X86_VSMP is set.
>^increase
> 
> > 
> > the value was caculated on linux 4.20.3, make allmodconfig all and the 
> > following oneliner:
>^calculated
> 
> > for f in `find -name *.ko`; do echo $f; readelf -S $f  |grep perc; done 
> > |grep data..percpu -B 1 |grep ko |while read r; do echo -n "$r: "; objdump 
> > --syms --section=.data..percpu $r|grep data |sort -n  |awk '{c++; 
> > d=strtonum("0x" $1) + strtonum("0x" $5); if (m < d) m = d;} END {printf("%d 
> > vars-> last addr 0x%x ( %d )\n", c, m, m)}' ; done |column -t |sort -k 8 -n 
> > | awk '{print $8}'| paste -sd+ | bc
> 
> Not sure how useful the one-liner is, versus a description of what you're
> doing.  i.e. "the size of all module percpu data sections, or something."
> 
> Also, how close was that calculated value to 64K?  If more modules start
> using DEFINE_SRCU, each of which uses 8K, then that 64K might run out.
> 
> Thanks,
> Barret
> 
> > Signed-off-by: Eial Czerwacki 
> > Signed-off-by: Shai Fultheim 
> > Signed-off-by: Oren Twaig 
> > ---
> >   include/linux/percpu.h | 4 
> >   1 file changed, 4 insertions(+)
> > 
> > diff --git a/include/linux/percpu.h b/include/linux/percpu.h
> > index 70b7123..6b79693 100644
> > --- a/include/linux/percpu.h
> > +++ b/include/linux/percpu.h
> > @@ -14,7 +14,11 @@
> >   /* enough to cover all DEFINE_PER_CPUs in modules */
> >   #ifdef CONFIG_MODULES
> > +#ifdef X86_VSMP
> > +#define PERCPU_MODULE_RESERVE  (1 << 16)
> > +#else
> >   #define PERCPU_MODULE_RESERVE (8 << 10)
> > +#endif
> >   #else
> >   #define PERCPU_MODULE_RESERVE 0
> >   #endif
> > 
> 

Thanks for sending this to me.

I must say, I really do not want to expand the reserved region. In most
cases, it can easily end up unused and thus wasted memory as it is hard
allocated on boot. This is done because code gen assumes static
variables are close to the program counter. This would not be true with
dynamic allocations, which begin at the end of the vmalloc area
(Summarized from Tejun's account in [1]).

Another note on the reserved region: it starts at the end of the static
region, which means it generally isn't page aligned. So while an 8KB
allocation would fit, one that also needs 4KB alignment more than likely
would fail. Something as large as 8KB should probably be dynamically
allocated as well.
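
To make the alignment point concrete, here is a minimal userspace sketch
of the arithmetic. The offsets are made up for illustration (the real
size of the static region depends on the kernel build), and align_up()
and fits_in_reserve() are hypothetical helpers, not percpu allocator
functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Round off up to the next multiple of align (align must be a power of two). */
static inline size_t align_up(size_t off, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (off + align - 1) & ~(align - 1);
}

/*
 * Hypothetical check: does an object of obj_size bytes that needs
 * obj_align alignment fit in a reserve of reserve_size bytes that
 * starts at reserve_start?
 */
static inline bool fits_in_reserve(size_t reserve_start, size_t reserve_size,
				   size_t obj_size, size_t obj_align)
{
	size_t start = align_up(reserve_start, obj_align);

	return start + obj_size <= reserve_start + reserve_size;
}
```

With a reserve starting at the made-up offset 0x1a340,
fits_in_reserve(0x1a340, 8 << 10, 8 << 10, 8) holds, but the same call
with 4096 alignment does not: rounding the start up to 0x1b000 pushes
the end of the object past the end of the reserve.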

I read through the bugzilla report and it seems that the culprits are:
  drivers/gpu/drm/amd/amdkfd/kfd_process.c:DEFINE_SRCU(kfd_processes_srcu);
  drivers/gpu/drm/drm_drv.c:DEFINE_STATIC_SRCU(drm_unplug_srcu);

Is there a reason we cannot dynamically initialize these structs? I've
cced Paul McKenney because we saw an issue with ipmi in December [1].

[1] https://lore.kernel.org/linux-mm/CAJM9R-JWO1P_qJzw2JboMH2dgPX7K1tF49nO5ojvf=iwgdd...@mail.gmail.com/

Thanks,
Dennis


Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-03-01 Thread Barret Rhoden

Hi -

On 01/21/2019 06:47 AM, Eial Czerwacki wrote:
>

Your main issue was that you only sent this patch to LKML, but not to the
maintainers of the file.  If you don't CC them, your patch might get lost.
To get the appropriate people and lists, run:

    scripts/get_maintainer.pl YOUR_PATCH.patch

For this patch, you'll get this:

Dennis Zhou  (maintainer:PER-CPU MEMORY ALLOCATOR)
Tejun Heo  (maintainer:PER-CPU MEMORY ALLOCATOR)
Christoph Lameter  (maintainer:PER-CPU MEMORY ALLOCATOR)
linux-kernel@vger.kernel.org (open list)

I added the three maintainers to this email.

I have a few minor comments below.

> [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set


You misspelled 'reservation'.  Also, I'd just say: "percpu: increase
module reservation size if X86_VSMP is set" ('change' -> 'increase', and
it only says 'reservation' once).



> as reported in bug #201339 (https://bugzilla.kernel.org/show_bug.cgi?id=201339)


I think you can add a tag for this right above your Signed-off-by tags. 
e.g.:


Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201339


> by enabling X86_VSMP, INTERNODE_CACHE_BYTES's definition differs from the
> default one
> causing the struct size to exceed the size ok 8KB.
                                             ^of

Which struct are you talking about?  I have one in mind, but others 
might not know from reading the commit message.


I ran into this on https://bugzilla.kernel.org/show_bug.cgi?id=202511.
In that case, it was because modules (drm and amdkfd) were using
DEFINE_SRCU, which does a DEFINE_PER_CPU on struct srcu_data, and that
uses ____cacheline_internodealigned_in_smp.
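
A minimal userspace sketch of why that alignment inflates the struct,
assuming a 4K internode alignment under X86_VSMP (in line with the 4K
segments mentioned elsewhere in this thread) and 64 bytes otherwise. The
demo struct is a made-up stand-in with one aligned member, not the real
layout of struct srcu_data.

```c
#include <stddef.h>

/* Assumed alignments: 64 bytes for a typical cache line, 4096 for vSMP. */
#define DEMO_ALIGN_DEFAULT	64
#define DEMO_ALIGN_VSMP		4096

/* Hypothetical stand-in, not the real struct srcu_data layout. */
#define DEFINE_DEMO_DATA(name, align)					\
	struct name {							\
		unsigned long counters[4];				\
		unsigned long state __attribute__((aligned(align)));	\
	}

DEFINE_DEMO_DATA(demo_data_default, DEMO_ALIGN_DEFAULT);
DEFINE_DEMO_DATA(demo_data_vsmp, DEMO_ALIGN_VSMP);

/* The aligned member starts a new unit, and sizeof rounds up to the alignment. */
_Static_assert(sizeof(struct demo_data_default) == 128, "two 64-byte units");
_Static_assert(sizeof(struct demo_data_vsmp) == 8192, "two 4K units");
```

Even this small stand-in fills the whole default 8K reserve once the
alignment is 4K, so anything larger, or anything placed after it, no
longer fits.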




> in order to avoid such issue, increse PERCPU_MODULE_RESERVE to 64KB if
> CONFIG_X86_VSMP is set.
   ^increase



> the value was caculated on linux 4.20.3, make allmodconfig all and the
> following oneliner:
   ^calculated


> for f in `find -name *.ko`; do echo $f; readelf -S $f  |grep perc; done |grep data..percpu -B 1 |grep ko |while read r; do
> echo -n "$r: "; objdump --syms --section=.data..percpu $r|grep data |sort -n  |awk '{c++; d=strtonum("0x"
> $1) + strtonum("0x" $5); if (m < d) m = d;} END {printf("%d vars-> last addr 0x%x ( %d )\n", c, m,
> m)}' ; done |column -t |sort -k 8 -n | awk '{print $8}'| paste -sd+ | bc


Not sure how useful the one-liner is, versus a description of what 
you're doing.  i.e. "the size of all module percpu data sections, or 
something."
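
As far as the awk part can be read, the one-liner takes, for each
module, the highest end address (symbol value plus size) among the
symbols in .data..percpu, and then sums those per-module extents. A
minimal C sketch of that arithmetic follows; struct percpu_sym and both
helpers are hypothetical names standing in for the objdump output and
the awk/bc steps.

```c
#include <stddef.h>

/* Hypothetical record for one symbol in a module's .data..percpu section. */
struct percpu_sym {
	unsigned long value;	/* offset of the symbol within the section */
	unsigned long size;	/* size of the symbol in bytes */
};

/* Extent of one module's percpu data: the highest value + size seen. */
static inline unsigned long module_percpu_extent(const struct percpu_sym *syms,
						 size_t nr)
{
	unsigned long max_end = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		unsigned long end = syms[i].value + syms[i].size;

		if (end > max_end)
			max_end = end;
	}
	return max_end;
}

/* Total across modules: the sum of the per-module extents. */
static inline unsigned long total_percpu_extent(const unsigned long *extents,
						size_t nr)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		total += extents[i];
	return total;
}
```

Note that this sums over every module in the build, which overstates
what any one machine needs, since only the loaded modules draw on the
reserve.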


Also, how close was that calculated value to 64K?  If more modules start 
using DEFINE_SRCU, each of which uses 8K, then that 64K might run out.
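
As a rough illustration of the headroom question, assume each
DEFINE_SRCU instance takes exactly 8K of module percpu space and nothing
else uses the reserve. Both are simplifications, and the names below are
made up.

```c
/* Reserve sizes from the patch; the per-instance footprint is an assumption. */
#define DEMO_RESERVE_DEFAULT	(8 << 10)
#define DEMO_RESERVE_VSMP	(1 << 16)
#define DEMO_SRCU_FOOTPRINT	(8 << 10)

/* How many whole instances fit, ignoring alignment padding. */
static inline unsigned int demo_max_instances(unsigned int reserve,
					      unsigned int footprint)
{
	return reserve / footprint;
}
```

Under those assumptions,
demo_max_instances(DEMO_RESERVE_VSMP, DEMO_SRCU_FOOTPRINT) comes to 8,
so a ninth static SRCU user across all loaded modules would not fit.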


Thanks,
Barret


> Signed-off-by: Eial Czerwacki
> Signed-off-by: Shai Fultheim
> Signed-off-by: Oren Twaig
> ---
>   include/linux/percpu.h | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
> index 70b7123..6b79693 100644
> --- a/include/linux/percpu.h
> +++ b/include/linux/percpu.h
> @@ -14,7 +14,11 @@
>   /* enough to cover all DEFINE_PER_CPUs in modules */
>   #ifdef CONFIG_MODULES
> +#ifdef X86_VSMP
> +#define PERCPU_MODULE_RESERVE  (1 << 16)
> +#else
>   #define PERCPU_MODULE_RESERVE (8 << 10)
> +#endif
>   #else
>   #define PERCPU_MODULE_RESERVE 0
>   #endif





Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-01-30 Thread Eial Czerwacki
Greetings,

On 1/21/19 1:47 PM, Eial Czerwacki wrote:
> as reported in bug #201339 
> (https://bugzilla.kernel.org/show_bug.cgi?id=201339)
> by enabling X86_VSMP, INTERNODE_CACHE_BYTES's definition differs from the 
> default one
> causing the struct size to exceed the size ok 8KB.
> 
> in order to avoid such issue, increse PERCPU_MODULE_RESERVE to 64KB if 
> CONFIG_X86_VSMP is set.
> 
> the value was caculated on linux 4.20.3, make allmodconfig all and the 
> following oneliner:
> for f in `find -name *.ko`; do echo $f; readelf -S $f  |grep perc; done |grep 
> data..percpu -B 1 |grep ko |while read r; do echo -n "$r: "; objdump --syms 
> --section=.data..percpu $r|grep data |sort -n  |awk '{c++; d=strtonum("0x" 
> $1) + strtonum("0x" $5); if (m < d) m = d;} END {printf("%d vars-> last addr 
> 0x%x ( %d )\n", c, m, m)}' ; done |column -t |sort -k 8 -n | awk '{print 
> $8}'| paste -sd+ | bc
> 
> Signed-off-by: Eial Czerwacki 
> Signed-off-by: Shai Fultheim 
> Signed-off-by: Oren Twaig 
> ---
>  include/linux/percpu.h | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
> index 70b7123..6b79693 100644
> --- a/include/linux/percpu.h
> +++ b/include/linux/percpu.h
> @@ -14,7 +14,11 @@
>  
>  /* enough to cover all DEFINE_PER_CPUs in modules */
>  #ifdef CONFIG_MODULES
> +#ifdef X86_VSMP
> +#define PERCPU_MODULE_RESERVE  (1 << 16)
> +#else
>  #define PERCPU_MODULE_RESERVE  (8 << 10)
> +#endif
>  #else
>  #define PERCPU_MODULE_RESERVE  0
>  #endif
> 
Is it possible to push this patch to mainline?
There seem to be no objections or comments regarding it, and
we'd like to fix the bug mentioned above.


[PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

2019-01-21 Thread Eial Czerwacki
as reported in bug #201339 (https://bugzilla.kernel.org/show_bug.cgi?id=201339)
by enabling X86_VSMP, INTERNODE_CACHE_BYTES's definition differs from the 
default one
causing the struct size to exceed the size ok 8KB.

in order to avoid such issue, increse PERCPU_MODULE_RESERVE to 64KB if 
CONFIG_X86_VSMP is set.

the value was caculated on linux 4.20.3, make allmodconfig all and the 
following oneliner:
for f in `find -name *.ko`; do echo $f; readelf -S $f  |grep perc; done |grep 
data..percpu -B 1 |grep ko |while read r; do echo -n "$r: "; objdump --syms 
--section=.data..percpu $r|grep data |sort -n  |awk '{c++; d=strtonum("0x" $1) 
+ strtonum("0x" $5); if (m < d) m = d;} END {printf("%d vars-> last addr 0x%x ( 
%d )\n", c, m, m)}' ; done |column -t |sort -k 8 -n | awk '{print $8}'| paste 
-sd+ | bc

Signed-off-by: Eial Czerwacki 
Signed-off-by: Shai Fultheim 
Signed-off-by: Oren Twaig 
---
 include/linux/percpu.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/percpu.h b/include/linux/percpu.h
index 70b7123..6b79693 100644
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -14,7 +14,11 @@
 
 /* enough to cover all DEFINE_PER_CPUs in modules */
 #ifdef CONFIG_MODULES
+#ifdef X86_VSMP
+#define PERCPU_MODULE_RESERVE  (1 << 16)
+#else
 #define PERCPU_MODULE_RESERVE  (8 << 10)
+#endif
 #else
 #define PERCPU_MODULE_RESERVE  0
 #endif
-- 
2.7.4
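
One detail worth checking in the hunk above: Kconfig symbols reach C
code with a CONFIG_ prefix (via the generated autoconf.h), so a guard
spelled "#ifdef X86_VSMP" would presumably never be true even when
CONFIG_X86_VSMP=y. A minimal userspace sketch of the difference, where
the CONFIG_X86_VSMP define stands in for autoconf.h and the RESERVE_*
names are made up for the comparison:

```c
/* Stand-in for what autoconf.h would provide with the option enabled. */
#define CONFIG_X86_VSMP 1

/* Guard as spelled in the posted hunk: tests the bare Kconfig name. */
#ifdef X86_VSMP
#define RESERVE_AS_POSTED	(1 << 16)
#else
#define RESERVE_AS_POSTED	(8 << 10)
#endif

/* Guard using the CONFIG_ prefix, matching the commit message. */
#ifdef CONFIG_X86_VSMP
#define RESERVE_WITH_PREFIX	(1 << 16)
#else
#define RESERVE_WITH_PREFIX	(8 << 10)
#endif

_Static_assert(RESERVE_AS_POSTED == 8192, "bare name is not defined");
_Static_assert(RESERVE_WITH_PREFIX == 65536, "prefixed name is defined");
```

If that reading is right, the reserve would stay at 8K on a vSMP build
unless the guard tests CONFIG_X86_VSMP.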