On 16 Oct 13:41 , Henk Langeveld wrote:
Before I create an issue, I'd like to know if anyone else can reproduce
this on an active server?

The following kstat/awk script reports kstat anomalies where
[rw](len)time fields inside a kstat_io[1] structure actually *decrease*
in value:

kstat -c disk -p sd:::*time  1 |
     nawk '/time/ {
         if (old[$1] && old[$1] > $2)
             print $1,"old: "old[$1],"new:"$2
             old[$1]=$2
     }'
sd:39:sd39:rlentime old: 7223147599853 new:7223146752623
sd:39:sd39:rtime old: 1556295270928 new:1556295116623
sd:21:sd21:rlentime old: 44879798465601 new:44879798422271
sd:24:sd24:wlentime old: 15995502763 new:15995464571
sd:24:sd24:wtime old: 15595171901 new:15595120298
sd:38:sd38:wlentime old: 16762450959 new:16762280835
sd:38:sd38:wtime old: 16293043067 new:16292872942
sd:15:sd15:wlentime old: 16274495238 new:16274415688
sd:15:sd15:wtime old: 15857746407 new:15857664000
sd:2:sd2:wlentime old: 16300070951 new:16299959181
sd:2:sd2:wtime old: 15882033262 new:15881917149
sd:31:sd31:rlentime old: 7260803325344 new:7260802679629
sd:31:sd31:rtime old: 1558548730649 new:1558548606359


Here's the story behind this:

Several times now I've seen iostat -xn <interval> report broken values
for the wait, wsvc_t and %w columns.

The outrageously large values led me to believe we're dealing with
a 64bit signed/unsigned overflow.

Looking at the code and the definitions in kstat.h, these columns are
derived from data in the kstat_io structure.

According to the description of the structure, all fields in there
should increase monotonously.

I also have examples for the old perl-based kstat; I wanted to make
sure I wasn't seeing any artefact of the conversion to floating point.

Anyone else seeing this?

Is this a case of improper reporting (inappropriate caching) or
improper recording in the kernel (missing locks)?


rmustacc on #illumos (irc) pointed me at gethrtime_unscaled(), which according to uts/common/os/msacct.c is inconsistent between different CPUs.

This appears to break the assumptions implied by kstat.h on 'Accumulated time and queue length statistics'.


637      * All times are 64-bit nanoseconds (hrtime_t), as returned by
638      * gethrtime().
639      *
640      * The units of cumulative busy time are accumulated nanoseconds.
641      * The units of cumulative length*time products are elapsed time
642      * times queue length.
643      *
644      * Updates to the fields below are performed implicitly by calls to
645      * these five functions:
646      *
647      *      kstat_waitq_enter()
648      *      kstat_waitq_exit()
649      *      kstat_runq_enter()
650      *      kstat_runq_exit()
651      *
652      *      kstat_waitq_to_runq()           (see below)
653      *      kstat_runq_back_to_waitq()      (see below)
654      *


Each of these functions computes new and delta, assuming delta > 0.

        new = gethrtime_unscaled();
        delta = new - kiop->wlastupdate;

IMO, the [rw]lastupdate, [rw]lentime, and [rw]time fields in the kiop structure should be protected with a test on (delta > 0).

Does this look sensible?


diff --git a/usr/src/uts/common/os/kstat_fr.c b/usr/src/uts/common/os/kstat_fr.c
index 83b817e..473cd71 100644
--- a/usr/src/uts/common/os/kstat_fr.c
+++ b/usr/src/uts/common/os/kstat_fr.c
@@ -1325,11 +1325,14 @@ kstat_waitq_enter(kstat_io_t *kiop)

        new = gethrtime_unscaled();
        delta = new - kiop->wlastupdate;
-       kiop->wlastupdate = new;
        wcnt = kiop->wcnt++;
-       if (wcnt != 0) {
-               kiop->wlentime += delta * wcnt;
-               kiop->wtime += delta;
+
+       if (delta > 0) {
+           kiop->wlastupdate = new;
+           if (wcnt != 0) {
+                   kiop->wlentime += delta * wcnt;
+                   kiop->wtime += delta;
+           }
        }
 }

@@ -1341,11 +1344,13 @@ kstat_waitq_exit(kstat_io_t *kiop)

        new = gethrtime_unscaled();
        delta = new - kiop->wlastupdate;
-       kiop->wlastupdate = new;
        wcnt = kiop->wcnt--;
        ASSERT((int)wcnt > 0);
-       kiop->wlentime += delta * wcnt;
-       kiop->wtime += delta;
+       if (delta > 0) {
+           kiop->wlastupdate = new;
+           kiop->wlentime += delta * wcnt;
+           kiop->wtime += delta;
+       }
 }

 void
@@ -1356,11 +1361,13 @@ kstat_runq_enter(kstat_io_t *kiop)

        new = gethrtime_unscaled();
        delta = new - kiop->rlastupdate;
-       kiop->rlastupdate = new;
        rcnt = kiop->rcnt++;
-       if (rcnt != 0) {
-               kiop->rlentime += delta * rcnt;
-               kiop->rtime += delta;
+       if (delta > 0) {
+           kiop->rlastupdate = new;
+           if (rcnt != 0) {
+                   kiop->rlentime += delta * rcnt;
+                   kiop->rtime += delta;
+           }
        }
 }

@@ -1372,11 +1379,14 @@ kstat_runq_exit(kstat_io_t *kiop)

        new = gethrtime_unscaled();
        delta = new - kiop->rlastupdate;
-       kiop->rlastupdate = new;
        rcnt = kiop->rcnt--;
+
        ASSERT((int)rcnt > 0);
-       kiop->rlentime += delta * rcnt;
-       kiop->rtime += delta;
+       if (delta > 0) {
+           kiop->rlastupdate = new;
+           kiop->rlentime += delta * rcnt;
+           kiop->rtime += delta;
+       }
 }

 void
@@ -1388,18 +1398,22 @@ kstat_waitq_to_runq(kstat_io_t *kiop)
        new = gethrtime_unscaled();

        delta = new - kiop->wlastupdate;
-       kiop->wlastupdate = new;
        wcnt = kiop->wcnt--;
        ASSERT((int)wcnt > 0);
-       kiop->wlentime += delta * wcnt;
-       kiop->wtime += delta;
+       if (delta > 0) {
+           kiop->wlastupdate = new;
+           kiop->wlentime += delta * wcnt;
+           kiop->wtime += delta;
+       }

        delta = new - kiop->rlastupdate;
-       kiop->rlastupdate = new;
        rcnt = kiop->rcnt++;
-       if (rcnt != 0) {
-               kiop->rlentime += delta * rcnt;
-               kiop->rtime += delta;
+       if (delta > 0) {
+           kiop->rlastupdate = new;
+           if (rcnt != 0) {
+                   kiop->rlentime += delta * rcnt;
+                   kiop->rtime += delta;
+           }
        }
 }

@@ -1412,18 +1426,22 @@ kstat_runq_back_to_waitq(kstat_io_t *kiop)
        new = gethrtime_unscaled();

        delta = new - kiop->rlastupdate;
-       kiop->rlastupdate = new;
        rcnt = kiop->rcnt--;
        ASSERT((int)rcnt > 0);
-       kiop->rlentime += delta * rcnt;
-       kiop->rtime += delta;
+       if (delta > 0) {
+           kiop->rlastupdate = new;
+           kiop->rlentime += delta * rcnt;
+           kiop->rtime += delta;
+       }

        delta = new - kiop->wlastupdate;
-       kiop->wlastupdate = new;
        wcnt = kiop->wcnt++;
-       if (wcnt != 0) {
-               kiop->wlentime += delta * wcnt;
-               kiop->wtime += delta;
+       if (delta > 0) {
+           kiop->wlastupdate = new;
+           if (wcnt != 0) {
+                   kiop->wlentime += delta * wcnt;
+                   kiop->wtime += delta;
+           }
        }
 }

Regards,

Henk


-------------------------------------------
illumos-discuss
Archives: https://www.listbox.com/member/archive/182180/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182180/21175430-2e6923be
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=21175430&id_secret=21175430-6a77cda4
Powered by Listbox: http://www.listbox.com

Reply via email to