systat(1) counter overflow

2021-07-01 Thread Anindya Mukherjee
Hi,

I noticed that if I leave the system running for more than about a month, some
of the counters in the uvm view of systat(1) overflow and become negative. This
is because the members of struct uvmexp in sys/uvm/uvmexp.h are ints. The
kernel's internal counters are of course uint64_t so they don't overflow. It
only happens during the uvm_sysctl(9) call which casts the numbers to integers.
The function is uvmexp_read.

In the attached diff I took the path of least resistance and promoted some of
the counters to unsigned int. Ideally I would have liked to use int64_t or even
uint64_t, but I hit an issue in some of the architecture dependent code. An
example is:
/usr/src/sys/arch/alpha/alpha/trap.c:536 atomic_add_int(&uvmexp.syscalls, 1);
In other places the ++ operator is used to increment the counters and the 64 bit
types can be used.

I am not completely sure this is the best way to proceed, but even if this diff
is horrifying, I'd appreciate some feedback and advice, thanks!

Anindya

Index: sys/uvm/uvm_meter.c
===
RCS file: /cvs/src/sys/uvm/uvm_meter.c,v
retrieving revision 1.42
diff -u -p -r1.42 uvm_meter.c
--- sys/uvm/uvm_meter.c 28 Dec 2020 14:01:23 -  1.42
+++ sys/uvm/uvm_meter.c 28 Jun 2021 05:24:56 -
@@ -329,7 +329,7 @@ uvmexp_read(struct uvmexp *uexp)
counters_read(uvmexp_counters, counters, exp_ncounters);
 
/* stat counters */
-   uexp->faults = (int)counters[faults];
+   uexp->faults = (unsigned int)counters[faults];
uexp->pageins = (int)counters[pageins];
 
/* fault subcounters */
@@ -379,10 +379,10 @@ uvmexp_print(int (*pr)(const char *, ...
(*pr)("  freemin=%d, free-target=%d, inactive-target=%d, "
"wired-max=%d\n", uexp.freemin, uexp.freetarg, uexp.inactarg,
uexp.wiredmax);
-   (*pr)("  faults=%d, traps=%d, intrs=%d, ctxswitch=%d fpuswitch=%d\n",
+   (*pr)("  faults=%u, traps=%u, intrs=%u, ctxswitch=%u fpuswitch=%d\n",
uexp.faults, uexp.traps, uexp.intrs, uexp.swtch,
uexp.fpswtch);
-   (*pr)("  softint=%d, syscalls=%d, kmapent=%d\n",
+   (*pr)("  softint=%u, syscalls=%u, kmapent=%d\n",
uexp.softs, uexp.syscalls, uexp.kmapent);
 
(*pr)("  fault counts:\n");
Index: sys/uvm/uvmexp.h
===
RCS file: /cvs/src/sys/uvm/uvmexp.h,v
retrieving revision 1.9
diff -u -p -r1.9 uvmexp.h
--- sys/uvm/uvmexp.h4 Mar 2021 09:00:03 -   1.9
+++ sys/uvm/uvmexp.h28 Jun 2021 05:24:56 -
@@ -90,12 +90,12 @@ struct uvmexp {
int unused06;   /* formerly nfreeanon */
 
/* stat counters */
-   int faults; /* page fault count */
-   int traps;  /* trap count */
-   int intrs;  /* interrupt count */
-   int swtch;  /* context switch count */
-   int softs;  /* software interrupt count */
-   int syscalls;   /* system calls */
+   unsigned int faults;/* page fault count */
+   unsigned int traps; /* trap count */
+   unsigned int intrs; /* interrupt count */
+   unsigned int swtch; /* context switch count */
+   unsigned int softs; /* software interrupt count */
+   unsigned int syscalls;  /* system calls */
int pageins;/* pagein operation count */
/* pageouts are in pdpageouts below */
int unused07;   /* formerly obsolete_swapins */
Index: usr.bin/systat/uvm.c
===
RCS file: /cvs/src/usr.bin/systat/uvm.c,v
retrieving revision 1.5
diff -u -p -r1.5 uvm.c
--- usr.bin/systat/uvm.c28 Jun 2019 13:35:04 -  1.5
+++ usr.bin/systat/uvm.c28 Jun 2021 05:24:57 -
@@ -37,22 +37,23 @@ void print_uvm(void);
 int  read_uvm(void);
 int  select_uvm(void);
 
-void print_uvmexp_field(field_def *, field_def *, int *, int *, const char *);
+void print_uvmexp_field(field_def *, field_def *, unsigned int *,
+unsigned int *, const char *);
 void print_uvmexp_line(int);
 
 struct uvmexp uvmexp;
 struct uvmexp last_uvmexp;
 
 struct uvmline {
-   int *v1;
-   int *ov1;
-   char*n1;
-   int *v2;
-   int *ov2;
-   char*n2;
-   int *v3;
-   int *ov3;
-   char*n3;
+   unsigned int*v1;
+   unsigned int*ov1;
+   char*n1;
+   unsigned int*v2;
+   unsigned int*ov2;
+   char*n2;
+   unsigned int*v3;
+   unsigned int*ov3;
+   char*n3;
 };
 
 struct uvmline uvmline[] = {
@@ -214,8 +215,8 @@ read_uvm(void)
 }
 
 void
-print_uvmexp_field(field_def *fvalue, field_def *fname, int *new, int *old,
-const char *name)
+print_uvmexp_field(field_def *fvalue, fi

Re: systat(1) counter overflow

2021-07-02 Thread Martin Pieuchot
On 01/07/21(Thu) 13:53, Anindya Mukherjee wrote:
> Hi,
> 
> I noticed that if I leave the system running for more than about a month, some
> of the counters in the uvm view of systat(1) overflow and become negative. 
> This
> is because the members of struct uvmexp in sys/uvm/uvmexp.h are ints. The
> kernel's internal counters are of course uint64_t so they don't overflow. It
> only happens during the uvm_sysctl(9) call which casts the numbers to 
> integers.
> The function is uvmexp_read.
> 
> In the attached diff I took the path of least resistance and promoted some of
> the counters to unsigned int. Ideally I would have liked to use int64_t or 
> even
> uint64_t, but I hit an issue in some of the architecture dependent code. An
> example is:
> /usr/src/sys/arch/alpha/alpha/trap.c:536 atomic_add_int(&uvmexp.syscalls, 1);
> In other places the ++ operator is used to increment the counters and the 64 
> bit
> types can be used.
> 
> I am not completely sure this is the best way to proceed, but even if this 
> diff
> is horrifying, I'd appreciate some feedback and advice, thanks!

I wonder if we shouldn't use uint64_t for those and embrace the ABI
break, that would at least simplify the kernel side and remove a
truncation.

Do you have an idea of the different consumers of the "struct uvmexp"
apart from systat(1)?  What is the impact in userland & ports of such
change?

> Index: sys/uvm/uvm_meter.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_meter.c,v
> retrieving revision 1.42
> diff -u -p -r1.42 uvm_meter.c
> --- sys/uvm/uvm_meter.c   28 Dec 2020 14:01:23 -  1.42
> +++ sys/uvm/uvm_meter.c   28 Jun 2021 05:24:56 -
> @@ -329,7 +329,7 @@ uvmexp_read(struct uvmexp *uexp)
>   counters_read(uvmexp_counters, counters, exp_ncounters);
>  
>   /* stat counters */
> - uexp->faults = (int)counters[faults];
> + uexp->faults = (unsigned int)counters[faults];
>   uexp->pageins = (int)counters[pageins];
>  
>   /* fault subcounters */
> @@ -379,10 +379,10 @@ uvmexp_print(int (*pr)(const char *, ...
>   (*pr)("  freemin=%d, free-target=%d, inactive-target=%d, "
>   "wired-max=%d\n", uexp.freemin, uexp.freetarg, uexp.inactarg,
>   uexp.wiredmax);
> - (*pr)("  faults=%d, traps=%d, intrs=%d, ctxswitch=%d fpuswitch=%d\n",
> + (*pr)("  faults=%u, traps=%u, intrs=%u, ctxswitch=%u fpuswitch=%d\n",
>   uexp.faults, uexp.traps, uexp.intrs, uexp.swtch,
>   uexp.fpswtch);
> - (*pr)("  softint=%d, syscalls=%d, kmapent=%d\n",
> + (*pr)("  softint=%u, syscalls=%u, kmapent=%d\n",
>   uexp.softs, uexp.syscalls, uexp.kmapent);
>  
>   (*pr)("  fault counts:\n");
> Index: sys/uvm/uvmexp.h
> ===
> RCS file: /cvs/src/sys/uvm/uvmexp.h,v
> retrieving revision 1.9
> diff -u -p -r1.9 uvmexp.h
> --- sys/uvm/uvmexp.h  4 Mar 2021 09:00:03 -   1.9
> +++ sys/uvm/uvmexp.h  28 Jun 2021 05:24:56 -
> @@ -90,12 +90,12 @@ struct uvmexp {
>   int unused06;   /* formerly nfreeanon */
>  
>   /* stat counters */
> - int faults; /* page fault count */
> - int traps;  /* trap count */
> - int intrs;  /* interrupt count */
> - int swtch;  /* context switch count */
> - int softs;  /* software interrupt count */
> - int syscalls;   /* system calls */
> + unsigned int faults;/* page fault count */
> + unsigned int traps; /* trap count */
> + unsigned int intrs; /* interrupt count */
> + unsigned int swtch; /* context switch count */
> + unsigned int softs; /* software interrupt count */
> + unsigned int syscalls;  /* system calls */
>   int pageins;/* pagein operation count */
>   /* pageouts are in pdpageouts below */
>   int unused07;   /* formerly obsolete_swapins */
> Index: usr.bin/systat/uvm.c
> ===
> RCS file: /cvs/src/usr.bin/systat/uvm.c,v
> retrieving revision 1.5
> diff -u -p -r1.5 uvm.c
> --- usr.bin/systat/uvm.c  28 Jun 2019 13:35:04 -  1.5
> +++ usr.bin/systat/uvm.c  28 Jun 2021 05:24:57 -
> @@ -37,22 +37,23 @@ void print_uvm(void);
>  int  read_uvm(void);
>  int  select_uvm(void);
>  
> -void print_uvmexp_field(field_def *, field_def *, int *, int *, const char 
> *);
> +void print_uvmexp_field(field_def *, field_def *, unsigned int *,
> +unsigned int *, const char *);
>  void print_uvmexp_line(int);
>  
>  struct uvmexp uvmexp;
>  struct uvmexp last_uvmexp;
>  
>  struct uvmline {
> - int *v1;
> - int *ov1;
> - char*n1;
> - int *v2;
> - int *ov2;
> - char*n2;
> - int *v3;
> - int *ov3;
> - char*n3;
> + un

Re: systat(1) counter overflow

2021-07-02 Thread Claudio Jeker
On Fri, Jul 02, 2021 at 01:09:05PM +0200, Martin Pieuchot wrote:
> On 01/07/21(Thu) 13:53, Anindya Mukherjee wrote:
> > Hi,
> > 
> > I noticed that if I leave the system running for more than about a month, 
> > some
> > of the counters in the uvm view of systat(1) overflow and become negative. 
> > This
> > is because the members of struct uvmexp in sys/uvm/uvmexp.h are ints. The
> > kernel's internal counters are of course uint64_t so they don't overflow. It
> > only happens during the uvm_sysctl(9) call which casts the numbers to 
> > integers.
> > The function is uvmexp_read.
> > 
> > In the attached diff I took the path of least resistance and promoted some 
> > of
> > the counters to unsigned int. Ideally I would have liked to use int64_t or 
> > even
> > uint64_t, but I hit an issue in some of the architecture dependent code. An
> > example is:
> > /usr/src/sys/arch/alpha/alpha/trap.c:536 atomic_add_int(&uvmexp.syscalls, 
> > 1);
> > In other places the ++ operator is used to increment the counters and the 
> > 64 bit
> > types can be used.
> > 
> > I am not completely sure this is the best way to proceed, but even if this 
> > diff
> > is horrifying, I'd appreciate some feedback and advice, thanks!
> 
> I wonder if we shouldn't use uint64_t for those and embrace the ABI
> break, that would at least simplify the kernel side and remove a
> truncation.
> 
> Do you have an idea of the different consumers of the "struct uvmexp"
> apart from systat(1)?  What is the impact in userland & ports of such
> change?

I know that golang has its definition of uvmexp and so if you change the
ABI then you would break at least that. struct uvmexp is used more often
than we would like.

> > Index: sys/uvm/uvm_meter.c
> > ===
> > RCS file: /cvs/src/sys/uvm/uvm_meter.c,v
> > retrieving revision 1.42
> > diff -u -p -r1.42 uvm_meter.c
> > --- sys/uvm/uvm_meter.c 28 Dec 2020 14:01:23 -  1.42
> > +++ sys/uvm/uvm_meter.c 28 Jun 2021 05:24:56 -
> > @@ -329,7 +329,7 @@ uvmexp_read(struct uvmexp *uexp)
> > counters_read(uvmexp_counters, counters, exp_ncounters);
> >  
> > /* stat counters */
> > -   uexp->faults = (int)counters[faults];
> > +   uexp->faults = (unsigned int)counters[faults];
> > uexp->pageins = (int)counters[pageins];
> >  
> > /* fault subcounters */
> > @@ -379,10 +379,10 @@ uvmexp_print(int (*pr)(const char *, ...
> > (*pr)("  freemin=%d, free-target=%d, inactive-target=%d, "
> > "wired-max=%d\n", uexp.freemin, uexp.freetarg, uexp.inactarg,
> > uexp.wiredmax);
> > -   (*pr)("  faults=%d, traps=%d, intrs=%d, ctxswitch=%d fpuswitch=%d\n",
> > +   (*pr)("  faults=%u, traps=%u, intrs=%u, ctxswitch=%u fpuswitch=%d\n",
> > uexp.faults, uexp.traps, uexp.intrs, uexp.swtch,
> > uexp.fpswtch);
> > -   (*pr)("  softint=%d, syscalls=%d, kmapent=%d\n",
> > +   (*pr)("  softint=%u, syscalls=%u, kmapent=%d\n",
> > uexp.softs, uexp.syscalls, uexp.kmapent);
> >  
> > (*pr)("  fault counts:\n");
> > Index: sys/uvm/uvmexp.h
> > ===
> > RCS file: /cvs/src/sys/uvm/uvmexp.h,v
> > retrieving revision 1.9
> > diff -u -p -r1.9 uvmexp.h
> > --- sys/uvm/uvmexp.h4 Mar 2021 09:00:03 -   1.9
> > +++ sys/uvm/uvmexp.h28 Jun 2021 05:24:56 -
> > @@ -90,12 +90,12 @@ struct uvmexp {
> > int unused06;   /* formerly nfreeanon */
> >  
> > /* stat counters */
> > -   int faults; /* page fault count */
> > -   int traps;  /* trap count */
> > -   int intrs;  /* interrupt count */
> > -   int swtch;  /* context switch count */
> > -   int softs;  /* software interrupt count */
> > -   int syscalls;   /* system calls */
> > +   unsigned int faults;/* page fault count */
> > +   unsigned int traps; /* trap count */
> > +   unsigned int intrs; /* interrupt count */
> > +   unsigned int swtch; /* context switch count */
> > +   unsigned int softs; /* software interrupt count */
> > +   unsigned int syscalls;  /* system calls */
> > int pageins;/* pagein operation count */
> > /* pageouts are in pdpageouts below */
> > int unused07;   /* formerly obsolete_swapins */
> > Index: usr.bin/systat/uvm.c
> > ===
> > RCS file: /cvs/src/usr.bin/systat/uvm.c,v
> > retrieving revision 1.5
> > diff -u -p -r1.5 uvm.c
> > --- usr.bin/systat/uvm.c28 Jun 2019 13:35:04 -  1.5
> > +++ usr.bin/systat/uvm.c28 Jun 2021 05:24:57 -
> > @@ -37,22 +37,23 @@ void print_uvm(void);
> >  int  read_uvm(void);
> >  int  select_uvm(void);
> >  
> > -void print_uvmexp_field(field_def *, field_def *, int *, int *, const char 
> > *);
> > +void print_uvmexp_field(field_def 

Re: systat(1) counter overflow

2021-07-02 Thread Stuart Henderson
On 2021/07/02 13:09, Martin Pieuchot wrote:
> On 01/07/21(Thu) 13:53, Anindya Mukherjee wrote:
> > Hi,
> > 
> > I noticed that if I leave the system running for more than about a month, 
> > some
> > of the counters in the uvm view of systat(1) overflow and become negative. 
> > This
> > is because the members of struct uvmexp in sys/uvm/uvmexp.h are ints. The
> > kernel's internal counters are of course uint64_t so they don't overflow. It
> > only happens during the uvm_sysctl(9) call which casts the numbers to 
> > integers.
> > The function is uvmexp_read.
> > 
> > In the attached diff I took the path of least resistance and promoted some 
> > of
> > the counters to unsigned int. Ideally I would have liked to use int64_t or 
> > even
> > uint64_t, but I hit an issue in some of the architecture dependent code. An
> > example is:
> > /usr/src/sys/arch/alpha/alpha/trap.c:536 atomic_add_int(&uvmexp.syscalls, 
> > 1);
> > In other places the ++ operator is used to increment the counters and the 
> > 64 bit
> > types can be used.
> > 
> > I am not completely sure this is the best way to proceed, but even if this 
> > diff
> > is horrifying, I'd appreciate some feedback and advice, thanks!
> 
> I wonder if we shouldn't use uint64_t for those and embrace the ABI
> break, that would at least simplify the kernel side and remove a
> truncation.
> 
> Do you have an idea of the different consumers of the "struct uvmexp"
> apart from systat(1)?  What is the impact in userland & ports of such
> change?

It's used by a couple of dozen ports, from a search over 1-year-old
ports source:

lcdproc
libuv, embedded copies (incl electron, node, ruby-passenger)
py-uv
py-psutil, embedded copies (incl spidermonkey + firefox ports)
libstatgrab, embedded copies (digikam)
jdk
libgtop2
py-gevent
net-snmp
zabbix
plan9port
conky
bubblemon-dockapp
facter
gkrellm
free
htop
monit
node_exporter
p5-Sys-MemInfo
slant
tmux-mem-cpu-load
xuvmstat
xfce4-systemload
xfce4-taskmanager

Most of these would just need a recompile, so would either need a
REVISION bump in the port, or in most cases bumping libc would be
enough to trigger pkg_add -u to update them (some exceptions e.g.
py-psutil, py-uv, py-gevent don't depend on libc and would need to
be bumped separately).

Go has its own translated copy of structs from system headers (e.g.
in golang.org/x/sys/unix/zsysctl_openbsd_*) and these are bundled in
many ports that use go (even core system libraries are not exempt from
"vendoring" or having old versions used by software using modules).
Obviously uvmexp isn't used in all of the software which includes these
files though.

Some of those ports include gopsutil which increases the chance
things from struct uvmexp might actually be used (including sqlc,
keybase, vault, consul, nomad, packer).

Some other go ports are either very likely to use uvmexp things e.g.
at least some of the sysutils/beats/XXX ports. (And I listed
node_exporter in the main list above as it's almost certain).



Re: systat(1) counter overflow

2021-07-02 Thread Stuart Henderson
On 2021/07/02 13:43, Stuart Henderson wrote:
> Go has its own translated copy of structs from system headers (e.g.
> in golang.org/x/sys/unix/zsysctl_openbsd_*) and these are bundled in
> many ports that use go (even core system libraries are not exempt from
> "vendoring" or having old versions used by software using modules).
> Obviously uvmexp isn't used in all of the software which includes these
> files though.

FWIW in the year-old ports source that was 41 ports (special prize for
sysutils/nomad which managed to include 3 separate copies), there are
probably a few more now.



Re: systat(1) counter overflow

2021-07-02 Thread Theo de Raadt
Claudio Jeker  wrote:

> I know that golang has its definition of uvmexp and so if you change the
> ABI then you would break at least that. struct uvmexp is used more often
> than we would like.


It is a huge ABI change.  If we are going to change the size of the subunits,
we have to get it right in one step.

Are there issues with using 64-bit types for some fields on the non-atomic
architectures?



Re: systat(1) counter overflow

2021-07-03 Thread Anindya Mukherjee
Thanks for the discussion. This has been very illuminating. I have been digging
around in /usr/src/ and ignoring the atomic architectures (where I got stuck) it
looks like it should be possible to use uint64_t everywhere. I'm playing with
some changes on my machine to see if I can get at least systat(1) and vmstat(8)
to work with uint64_t. The ddb printer (uvmexp_print)is another consumer.

If it works in the base system then ideally every relevant port should be
updated to be consistent. That is indeed quite a big change; more than I
realised so thanks for setting me straight on that.

Regards,
Anindya



Re: systat(1) counter overflow

2021-07-03 Thread Stuart Henderson
On 2021/07/03 01:09, Anindya Mukherjee wrote:
> Thanks for the discussion. This has been very illuminating. I have been 
> digging
> around in /usr/src/ and ignoring the atomic architectures (where I got stuck) 
> it
> looks like it should be possible to use uint64_t everywhere. I'm playing with
> some changes on my machine to see if I can get at least systat(1) and 
> vmstat(8)
> to work with uint64_t. The ddb printer (uvmexp_print)is another consumer.
> 
> If it works in the base system then ideally every relevant port should be
> updated to be consistent. That is indeed quite a big change; more than I
> realised so thanks for setting me straight on that.

We have coped with bigger changes in structs like this before,
it didn't used to be too difficult, but that was before go...



Re: systat(1) counter overflow

2021-07-07 Thread Anindya Mukherjee
One issue I have on amd64 while trying to build the kernel with all uvmexp
members promoted to uint64_t is the use of atomic operations. For example,
swpgonly is modified atomically in several places and those expect an int. I
think only a few members are accessed in this way.

On Sat, Jul 03, 2021 at 11:20:42AM +0100, Stuart Henderson wrote:
> On 2021/07/03 01:09, Anindya Mukherjee wrote:
> > Thanks for the discussion. This has been very illuminating. I have been 
> > digging
> > around in /usr/src/ and ignoring the atomic architectures (where I got 
> > stuck) it
> > looks like it should be possible to use uint64_t everywhere. I'm playing 
> > with
> > some changes on my machine to see if I can get at least systat(1) and 
> > vmstat(8)
> > to work with uint64_t. The ddb printer (uvmexp_print)is another consumer.
> > 
> > If it works in the base system then ideally every relevant port should be
> > updated to be consistent. That is indeed quite a big change; more than I
> > realised so thanks for setting me straight on that.
> 
> We have coped with bigger changes in structs like this before,
> it didn't used to be too difficult, but that was before go...
> 



Re: systat(1) counter overflow

2021-07-13 Thread Anindya Mukherjee
On Sat, Jul 03, 2021 at 11:20:42AM +0100, Stuart Henderson wrote:
> On 2021/07/03 01:09, Anindya Mukherjee wrote:
> > Thanks for the discussion. This has been very illuminating. I have been 
> > digging
> > around in /usr/src/ and ignoring the atomic architectures (where I got 
> > stuck) it
> > looks like it should be possible to use uint64_t everywhere. I'm playing 
> > with
> > some changes on my machine to see if I can get at least systat(1) and 
> > vmstat(8)
> > to work with uint64_t. The ddb printer (uvmexp_print)is another consumer.
> > 
> > If it works in the base system then ideally every relevant port should be
> > updated to be consistent. That is indeed quite a big change; more than I
> > realised so thanks for setting me straight on that.
> 
> We have coped with bigger changes in structs like this before,
> it didn't used to be too difficult, but that was before go...
> 

Hi,

I have been running for a week with the following diff. This is just a POC and
hence there are a few ugly hacks. So far top(1), systat(1), and vmstat(8) seem
to be happy. I haven't hit the 32-bit overflow point for any counters yet but
the counts look right. I have completely ignored ports, but it looks like the
base system can run with this change. This was mostly to satisfy my curiosity.

Regards,
Anindya

? usr.bin/systat/vim_session
Index: sys/arch/amd64/include/atomic.h
===
RCS file: /cvs/src/sys/arch/amd64/include/atomic.h,v
retrieving revision 1.21
diff -u -p -r1.21 atomic.h
--- sys/arch/amd64/include/atomic.h 11 Mar 2021 11:16:55 -  1.21
+++ sys/arch/amd64/include/atomic.h 13 Jul 2021 07:42:51 -
@@ -150,6 +150,14 @@ _atomic_inc_long(volatile unsigned long 
 #define atomic_inc_long(_p) _atomic_inc_long(_p)
 
 static inline void
+_atomic_inc_uint64(volatile uint64_t *p)
+{
+   __asm volatile(_LOCK " incq %0"
+   : "+m" (*p));
+}
+#define atomic_inc_uint64(_p) _atomic_inc_uint64(_p)
+
+static inline void
 _atomic_dec_int(volatile unsigned int *p)
 {
__asm volatile(_LOCK " decl %0"
@@ -166,6 +174,14 @@ _atomic_dec_long(volatile unsigned long 
 #define atomic_dec_long(_p) _atomic_dec_long(_p)
 
 static inline void
+_atomic_dec_uint64(volatile uint64_t *p)
+{
+   __asm volatile(_LOCK " decq %0"
+   : "+m" (*p));
+}
+#define atomic_dec_uint64(_p) _atomic_dec_uint64(_p)
+
+static inline void
 _atomic_add_int(volatile unsigned int *p, unsigned int v)
 {
__asm volatile(_LOCK " addl %1,%0"
@@ -182,6 +198,15 @@ _atomic_add_long(volatile unsigned long 
: "a" (v));
 }
 #define atomic_add_long(_p, _v) _atomic_add_long(_p, _v)
+
+static inline void
+_atomic_add_uint64(volatile uint64_t *p, uint64_t v)
+{
+   __asm volatile(_LOCK " addq %1,%0"
+   : "+m" (*p)
+   : "a" (v));
+}
+#define atomic_add_uint64(_p, _v) _atomic_add_uint64(_p, _v)
 
 static inline void
 _atomic_sub_int(volatile unsigned int *p, unsigned int v)
Index: sys/sys/sysctl.h
===
RCS file: /cvs/src/sys/sys/sysctl.h,v
retrieving revision 1.218
diff -u -p -r1.218 sysctl.h
--- sys/sys/sysctl.h17 May 2021 17:54:31 -  1.218
+++ sys/sys/sysctl.h13 Jul 2021 07:42:52 -
@@ -38,7 +38,8 @@
 #ifndef _SYS_SYSCTL_H_
 #define_SYS_SYSCTL_H_
 
-#include 
+/*#include */
+#include "/usr/src/sys/uvm/uvmexp.h"
 
 /*
  * Definitions for sysctl call.  The sysctl call uses a hierarchical name
Index: sys/uvm/uvm_anon.c
===
RCS file: /cvs/src/sys/uvm/uvm_anon.c,v
retrieving revision 1.54
diff -u -p -r1.54 uvm_anon.c
--- sys/uvm/uvm_anon.c  26 Mar 2021 13:40:05 -  1.54
+++ sys/uvm/uvm_anon.c  13 Jul 2021 07:42:52 -
@@ -119,7 +119,7 @@ uvm_anfree_list(struct vm_anon *anon, st
if (anon->an_swslot != 0) {
/* This page is no longer only in swap. */
KASSERT(uvmexp.swpgonly > 0);
-   atomic_dec_int(&uvmexp.swpgonly);
+   atomic_dec_uint64(&uvmexp.swpgonly);
}
}
anon->an_lock = NULL;
Index: sys/uvm/uvm_aobj.c
===
RCS file: /cvs/src/sys/uvm/uvm_aobj.c,v
retrieving revision 1.99
diff -u -p -r1.99 uvm_aobj.c
--- sys/uvm/uvm_aobj.c  28 Jun 2021 11:19:01 -  1.99
+++ sys/uvm/uvm_aobj.c  13 Jul 2021 07:42:52 -
@@ -1516,6 +1516,6 @@ uao_dropswap_range(struct uvm_object *uo
 */
if (swpgonlydelta > 0) {
KASSERT(uvmexp.swpgonly >= swpgonlydelta);
-   atomic_add_int(&uvmexp.swpgonly, -swpgonlydelta);
+   atomic_add_uint64(&uvmexp.swpgonly, -swpgonlydelta);
}
 }
Index: sys/uvm/uvm_km.c
===
RCS file: /cvs/src/sys/uvm/uvm_km.c,v
retrieving revision 1.145
diff 

Re: systat(1) counter overflow

2021-07-22 Thread Anindya Mukherjee
Hi, just a quick update. After about two weeks, various counters have exceeded
the 32 bit int limit, and haven't wrapped. There might eventually be some issues
with the field width but that will take a long time to reach.

Regards,
Anindya

On Sat, Jul 03, 2021 at 11:20:42AM +0100, Stuart Henderson wrote:
> On 2021/07/03 01:09, Anindya Mukherjee wrote:
> > Thanks for the discussion. This has been very illuminating. I have been 
> > digging
> > around in /usr/src/ and ignoring the atomic architectures (where I got 
> > stuck) it
> > looks like it should be possible to use uint64_t everywhere. I'm playing 
> > with
> > some changes on my machine to see if I can get at least systat(1) and 
> > vmstat(8)
> > to work with uint64_t. The ddb printer (uvmexp_print)is another consumer.
> > 
> > If it works in the base system then ideally every relevant port should be
> > updated to be consistent. That is indeed quite a big change; more than I
> > realised so thanks for setting me straight on that.
> 
> We have coped with bigger changes in structs like this before,
> it didn't used to be too difficult, but that was before go...
> 



Re: systat(1) counter overflow

2021-08-30 Thread Martin Pieuchot
On 13/07/21(Tue) 00:55, Anindya Mukherjee wrote:
> On Sat, Jul 03, 2021 at 11:20:42AM +0100, Stuart Henderson wrote:
> > On 2021/07/03 01:09, Anindya Mukherjee wrote:
> > > Thanks for the discussion. This has been very illuminating. I have been 
> > > digging
> > > around in /usr/src/ and ignoring the atomic architectures (where I got 
> > > stuck) it
> > > looks like it should be possible to use uint64_t everywhere. I'm playing 
> > > with
> > > some changes on my machine to see if I can get at least systat(1) and 
> > > vmstat(8)
> > > to work with uint64_t. The ddb printer (uvmexp_print)is another consumer.
> > > 
> > > If it works in the base system then ideally every relevant port should be
> > > updated to be consistent. That is indeed quite a big change; more than I
> > > realised so thanks for setting me straight on that.
> > 
> > We have coped with bigger changes in structs like this before,
> > it didn't used to be too difficult, but that was before go...
> > 
> 
> Hi,
> 
> I have been running for a week with the following diff. This is just a POC and
> hence there are a few ugly hacks. So far top(1), systat(1), and vmstat(8) seem
> to be happy. I haven't hit the 32-bit overflow point for any counters yet but
> the counts look right. I have completely ignored ports, but it looks like the
> base system can run with this change. This was mostly to satisfy my curiosity.

Thanks for your work.

There's no guarantee that 64bit value can be updated atomically on 32bit
architecture, so we can't follow this road.

It seems to me that this issue requires a bit more investigation.  The
existing "struct uvmexp" contains multiple fields with multiple purposes:

1. constants (pagesize, pagemask, pageshift, ...)

2. counters frequently updated, only incremented and only used
   for accounting purposes (faults, pageinsm fltnoram, ...)

3. counters rarely updated, incremented and decremented
   (swpgonly, nswapdev, ...)

4. global variables, that are incremented/decremented and used for
   making decisions (wired, free, zeropages...)

I don't believe all of them need to be of type uint64_t in the kernel.

It's also not clear to me if keeping "struct uvmexp" as it is but
bumping the size to 64bit is the best way forward.  Did you look at
other userland consumer of "struct uvmexp"?  Which fields do they care
about?  If it is the way forward we could simply use a different layout
in the kernel and do the conversion during the syscall.

> ? usr.bin/systat/vim_session
> Index: sys/arch/amd64/include/atomic.h
> ===
> RCS file: /cvs/src/sys/arch/amd64/include/atomic.h,v
> retrieving revision 1.21
> diff -u -p -r1.21 atomic.h
> --- sys/arch/amd64/include/atomic.h   11 Mar 2021 11:16:55 -  1.21
> +++ sys/arch/amd64/include/atomic.h   13 Jul 2021 07:42:51 -
> @@ -150,6 +150,14 @@ _atomic_inc_long(volatile unsigned long 
>  #define atomic_inc_long(_p) _atomic_inc_long(_p)
>  
>  static inline void
> +_atomic_inc_uint64(volatile uint64_t *p)
> +{
> + __asm volatile(_LOCK " incq %0"
> + : "+m" (*p));
> +}
> +#define atomic_inc_uint64(_p) _atomic_inc_uint64(_p)
> +
> +static inline void
>  _atomic_dec_int(volatile unsigned int *p)
>  {
>   __asm volatile(_LOCK " decl %0"
> @@ -166,6 +174,14 @@ _atomic_dec_long(volatile unsigned long 
>  #define atomic_dec_long(_p) _atomic_dec_long(_p)
>  
>  static inline void
> +_atomic_dec_uint64(volatile uint64_t *p)
> +{
> + __asm volatile(_LOCK " decq %0"
> + : "+m" (*p));
> +}
> +#define atomic_dec_uint64(_p) _atomic_dec_uint64(_p)
> +
> +static inline void
>  _atomic_add_int(volatile unsigned int *p, unsigned int v)
>  {
>   __asm volatile(_LOCK " addl %1,%0"
> @@ -182,6 +198,15 @@ _atomic_add_long(volatile unsigned long 
>   : "a" (v));
>  }
>  #define atomic_add_long(_p, _v) _atomic_add_long(_p, _v)
> +
> +static inline void
> +_atomic_add_uint64(volatile uint64_t *p, uint64_t v)
> +{
> + __asm volatile(_LOCK " addq %1,%0"
> + : "+m" (*p)
> + : "a" (v));
> +}
> +#define atomic_add_uint64(_p, _v) _atomic_add_uint64(_p, _v)
>  
>  static inline void
>  _atomic_sub_int(volatile unsigned int *p, unsigned int v)
> Index: sys/sys/sysctl.h
> ===
> RCS file: /cvs/src/sys/sys/sysctl.h,v
> retrieving revision 1.218
> diff -u -p -r1.218 sysctl.h
> --- sys/sys/sysctl.h  17 May 2021 17:54:31 -  1.218
> +++ sys/sys/sysctl.h  13 Jul 2021 07:42:52 -
> @@ -38,7 +38,8 @@
>  #ifndef _SYS_SYSCTL_H_
>  #define  _SYS_SYSCTL_H_
>  
> -#include 
> +/*#include */
> +#include "/usr/src/sys/uvm/uvmexp.h"
>  
>  /*
>   * Definitions for sysctl call.  The sysctl call uses a hierarchical name
> Index: sys/uvm/uvm_anon.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_anon.c,v
> retrieving revision 1.54
> diff -u -p -r1.54 uvm_anon.c
> --- sys/uvm/

Re: systat(1) counter overflow

2021-09-02 Thread Anindya Mukherjee
On Mon, Aug 30, 2021 at 11:11:55AM +0200, Martin Pieuchot wrote:
> On 13/07/21(Tue) 00:55, Anindya Mukherjee wrote:
> > On Sat, Jul 03, 2021 at 11:20:42AM +0100, Stuart Henderson wrote:
> > > On 2021/07/03 01:09, Anindya Mukherjee wrote:
> > > > Thanks for the discussion. This has been very illuminating. I have been 
> > > > digging
> > > > around in /usr/src/ and ignoring the atomic architectures (where I got 
> > > > stuck) it
> > > > looks like it should be possible to use uint64_t everywhere. I'm 
> > > > playing with
> > > > some changes on my machine to see if I can get at least systat(1) and 
> > > > vmstat(8)
> > > > to work with uint64_t. The ddb printer (uvmexp_print)is another 
> > > > consumer.
> > > > 
> > > > If it works in the base system then ideally every relevant port should 
> > > > be
> > > > updated to be consistent. That is indeed quite a big change; more than I
> > > > realised so thanks for setting me straight on that.
> > > 
> > > We have coped with bigger changes in structs like this before,
> > > it didn't used to be too difficult, but that was before go...
> > > 
> > 
> > Hi,
> > 
> > I have been running for a week with the following diff. This is just a POC 
> > and
> > hence there are a few ugly hacks. So far top(1), systat(1), and vmstat(8) 
> > seem
> > to be happy. I haven't hit the 32-bit overflow point for any counters yet 
> > but
> > the counts look right. I have completely ignored ports, but it looks like 
> > the
> > base system can run with this change. This was mostly to satisfy my 
> > curiosity.
> 
> Thanks for your work.
> 
> There's no guarantee that 64bit value can be updated atomically on 32bit
> architecture, so we can't follow this road.
> 
> It seems to me that this issue requires a bit more investigation.  The
> existing "struct uvmexp" contains multiple fields with multiple purposes:
> 
> 1. constants (pagesize, pagemask, pageshift, ...)
> 
> 2. counters frequently updated, only incremented and only used
>for accounting purposes (faults, pageinsm fltnoram, ...)
> 
> 3. counters rarely updated, incremented and decremented
>(swpgonly, nswapdev, ...)
> 
> 4. global variables, that are incremented/decremented and used for
>making decisions (wired, free, zeropages...)
> 
> I don't believe all of them need to be of type uint64_t in the kernel.

Agreed. In my test diff I promoted everything because of the way systat(1)'s uvm
module is implemented currently, namely all uvmexp members are assumed to be of
the same type. Ideally I would only like to change the stat and fault counters,
which increase monotonically. The uvm module can be accordingly updated.

If this is done we can leave swpgonly untouched and hence remove the 64-bit
atomic functions from my diff. However, this doesn't solve the issue with atomic
increments because in certain architectures some of the monotonic counters use
atomic increments. An example:
/usr/src/sys/arch/mips64/mips64/trap.c|184| atomic_inc_int(&uvmexp.traps);
I'm not sure how to get around this; traps specifically is a counter which
overflows quickly. Maybe if the architecture is 64 bit and it uses atomic
increment as above then we can use a 64-bit atomic function in the arch specific
code?

I don't fully understand the details here, but it seems that some of the
counters are maintained by the kernel as 64-bit variables internally. These get
converted during a sysctl(8) call in uvmexp_read and are being cast to int
currently. Along with the counters which are only incremented using ++, these
are the easiest to convert to 64-bit.

> 
> It's also not clear to me if keeping "struct uvmexp" as it is but
> bumping the size to 64bit is the best way forward.  Did you look at
> other userland consumer of "struct uvmexp"?  Which fields do they care
> about?  If it is the way forward we could simply use a different layout
> in the kernel and do the conversion during the syscall.

I searched for usages of the stat and fault counters in /usr/src (again ignoring
ports) and the only consumers in /usr/src/usr.bin are systat(1), vmstat(8),
iostat(8), and top(1). The rest of the references in /usr/src are mostly from
code which increments these counters directly or converts them in uvmexp_read.
The function uvmexp_print in uvm_meter.c is used by ddb(4) to print these
values.

Regards,
Anindya

> 
> > ? usr.bin/systat/vim_session
> > Index: sys/arch/amd64/include/atomic.h
> > ===
> > RCS file: /cvs/src/sys/arch/amd64/include/atomic.h,v
> > retrieving revision 1.21
> > diff -u -p -r1.21 atomic.h
> > --- sys/arch/amd64/include/atomic.h 11 Mar 2021 11:16:55 -  1.21
> > +++ sys/arch/amd64/include/atomic.h 13 Jul 2021 07:42:51 -
> > @@ -150,6 +150,14 @@ _atomic_inc_long(volatile unsigned long 
> >  #define atomic_inc_long(_p) _atomic_inc_long(_p)
> >  
> >  static inline void
> > +_atomic_inc_uint64(volatile uint64_t *p)
> > +{
> > +   __asm volatile(_LOCK " incq %0"