Maybe these should just use gethrtime()? I think a check for negative values
is insufficient… if the times are not consistent and the values are taken on
different snapshots, then the entire results cannot be trusted. gethrtime()
would fix that … but may make the cost of these snapshots larger.
- Garrett
On Oct 16, 2013, at 2:34 PM, Henk Langeveld <[email protected]> wrote:
> On 16 Oct 13:41 , Henk Langeveld wrote:
>> Before I create an issue, I'd like to know if anyone else can reproduce
>> this on an active server?
>>
>> The following kstat/awk script reports kstat anomalies where
>> [rw](len)time fields inside a kstat_io[1] structure actually *decrease*
>> in value:
>>
>> kstat -c disk -p sd:::*time 1 |
>> nawk '/time/ {
>> if (old[$1] && old[$1] > $2)
>> print $1,"old: "old[$1],"new:"$2
>> old[$1]=$2
>> }'
>> sd:39:sd39:rlentime old: 7223147599853 new:7223146752623
>> sd:39:sd39:rtime old: 1556295270928 new:1556295116623
>> sd:21:sd21:rlentime old: 44879798465601 new:44879798422271
>> sd:24:sd24:wlentime old: 15995502763 new:15995464571
>> sd:24:sd24:wtime old: 15595171901 new:15595120298
>> sd:38:sd38:wlentime old: 16762450959 new:16762280835
>> sd:38:sd38:wtime old: 16293043067 new:16292872942
>> sd:15:sd15:wlentime old: 16274495238 new:16274415688
>> sd:15:sd15:wtime old: 15857746407 new:15857664000
>> sd:2:sd2:wlentime old: 16300070951 new:16299959181
>> sd:2:sd2:wtime old: 15882033262 new:15881917149
>> sd:31:sd31:rlentime old: 7260803325344 new:7260802679629
>> sd:31:sd31:rtime old: 1558548730649 new:1558548606359
>>
>>
>> Here's the story behind this:
>>
>> Several times now I've seen iostat -xn <interval> report broken values
>> for the wait, wsvc_t and %w columns.
>>
>> The outrageously large values led me to believe we're dealing with
>> a 64bit signed/unsigned overflow.
>>
>> Looking at the code and the definitions in kstat.h, these columns are
>> derived from data in the kstat_io structure.
>>
>> According to the description of the structure, all fields in there
>> should increase monotonously.
>>
>> I also have examples for the old perl-based kstat; I wanted to make
>> sure I wasn't seeing any artefact of the conversion to floating point.
>>
>> Anyone else seeing this?
>>
>> Is this a case of improper reporting (inappropriate caching) or
>> improper recording in the kernel (missing locks)?
>
>
> rmustacc on #illumos (irc) pointed me at gethrtime_unscaled(), which
> according to uts/common/os/msacct.c is inconsistent between different CPUs.
>
> This appears to break the assumptions implied by kstat.h on 'Accumulated time
> and queue length statistics'.
>
>
> 637 * All times are 64-bit nanoseconds (hrtime_t), as returned by
> 638 * gethrtime().
> 639 *
> 640 * The units of cumulative busy time are accumulated nanoseconds.
> 641 * The units of cumulative length*time products are elapsed time
> 642 * times queue length.
> 643 *
> 644 * Updates to the fields below are performed implicitly by calls to
> 645 * these five functions:
> 646 *
> 647 * kstat_waitq_enter()
> 648 * kstat_waitq_exit()
> 649 * kstat_runq_enter()
> 650 * kstat_runq_exit()
> 651 *
> 652 * kstat_waitq_to_runq() (see below)
> 653 * kstat_runq_back_to_waitq() (see below)
> 654 *
>
>
> Each of these functions computes new and delta, assuming delta > 0.
>
> new = gethrtime_unscaled();
> delta = new - kiop->wlastupdate;
>
> IMO, the [rw]lastupdate, [rw]lentime, and [rw]time fields in the kiop
> structure should be protected with a test on (delta > 0).
>
> Does this look sensible?
>
>
> diff --git a/usr/src/uts/common/os/kstat_fr.c
> b/usr/src/uts/common/os/kstat_fr.c
> index 83b817e..473cd71 100644
> --- a/usr/src/uts/common/os/kstat_fr.c
> +++ b/usr/src/uts/common/os/kstat_fr.c
> @@ -1325,11 +1325,14 @@ kstat_waitq_enter(kstat_io_t *kiop)
>
> new = gethrtime_unscaled();
> delta = new - kiop->wlastupdate;
> - kiop->wlastupdate = new;
> wcnt = kiop->wcnt++;
> - if (wcnt != 0) {
> - kiop->wlentime += delta * wcnt;
> - kiop->wtime += delta;
> +
> + if (delta > 0) {
> + kiop->wlastupdate = new;
> + if (wcnt != 0) {
> + kiop->wlentime += delta * wcnt;
> + kiop->wtime += delta;
> + }
> }
> }
>
> @@ -1341,11 +1344,13 @@ kstat_waitq_exit(kstat_io_t *kiop)
>
> new = gethrtime_unscaled();
> delta = new - kiop->wlastupdate;
> - kiop->wlastupdate = new;
> wcnt = kiop->wcnt--;
> ASSERT((int)wcnt > 0);
> - kiop->wlentime += delta * wcnt;
> - kiop->wtime += delta;
> + if (delta > 0) {
> + kiop->wlastupdate = new;
> + kiop->wlentime += delta * wcnt;
> + kiop->wtime += delta;
> + }
> }
>
> void
> @@ -1356,11 +1361,13 @@ kstat_runq_enter(kstat_io_t *kiop)
>
> new = gethrtime_unscaled();
> delta = new - kiop->rlastupdate;
> - kiop->rlastupdate = new;
> rcnt = kiop->rcnt++;
> - if (rcnt != 0) {
> - kiop->rlentime += delta * rcnt;
> - kiop->rtime += delta;
> + if (delta > 0) {
> + kiop->rlastupdate = new;
> + if (rcnt != 0) {
> + kiop->rlentime += delta * rcnt;
> + kiop->rtime += delta;
> + }
> }
> }
>
> @@ -1372,11 +1379,14 @@ kstat_runq_exit(kstat_io_t *kiop)
>
> new = gethrtime_unscaled();
> delta = new - kiop->rlastupdate;
> - kiop->rlastupdate = new;
> rcnt = kiop->rcnt--;
> +
> ASSERT((int)rcnt > 0);
> - kiop->rlentime += delta * rcnt;
> - kiop->rtime += delta;
> + if (delta > 0) {
> + kiop->rlastupdate = new;
> + kiop->rlentime += delta * rcnt;
> + kiop->rtime += delta;
> + }
> }
>
> void
> @@ -1388,18 +1398,22 @@ kstat_waitq_to_runq(kstat_io_t *kiop)
> new = gethrtime_unscaled();
>
> delta = new - kiop->wlastupdate;
> - kiop->wlastupdate = new;
> wcnt = kiop->wcnt--;
> ASSERT((int)wcnt > 0);
> - kiop->wlentime += delta * wcnt;
> - kiop->wtime += delta;
> + if (delta > 0) {
> + kiop->wlastupdate = new;
> + kiop->wlentime += delta * wcnt;
> + kiop->wtime += delta;
> + }
>
> delta = new - kiop->rlastupdate;
> - kiop->rlastupdate = new;
> rcnt = kiop->rcnt++;
> - if (rcnt != 0) {
> - kiop->rlentime += delta * rcnt;
> - kiop->rtime += delta;
> + if (delta > 0) {
> + kiop->rlastupdate = new;
> + if (rcnt != 0) {
> + kiop->rlentime += delta * rcnt;
> + kiop->rtime += delta;
> + }
> }
> }
>
> @@ -1412,18 +1426,22 @@ kstat_runq_back_to_waitq(kstat_io_t *kiop)
> new = gethrtime_unscaled();
>
> delta = new - kiop->rlastupdate;
> - kiop->rlastupdate = new;
> rcnt = kiop->rcnt--;
> ASSERT((int)rcnt > 0);
> - kiop->rlentime += delta * rcnt;
> - kiop->rtime += delta;
> + if (delta > 0) {
> + kiop->rlastupdate = new;
> + kiop->rlentime += delta * rcnt;
> + kiop->rtime += delta;
> + }
>
> delta = new - kiop->wlastupdate;
> - kiop->wlastupdate = new;
> wcnt = kiop->wcnt++;
> - if (wcnt != 0) {
> - kiop->wlentime += delta * wcnt;
> - kiop->wtime += delta;
> + if (delta > 0) {
> + kiop->wlastupdate = new;
> + if (wcnt != 0) {
> + kiop->wlentime += delta * wcnt;
> + kiop->wtime += delta;
> + }
> }
> }
>
> Regards,
>
> Henk
>
>
> -------------------------------------------
> illumos-discuss
> Archives: https://www.listbox.com/member/archive/182180/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/182180/22003744-9012f59c
> Modify Your Subscription: https://www.listbox.com/member/?&
> Powered by Listbox: http://www.listbox.com
-------------------------------------------
illumos-discuss
Archives: https://www.listbox.com/member/archive/182180/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182180/21175430-2e6923be
Modify Your Subscription:
https://www.listbox.com/member/?member_id=21175430&id_secret=21175430-6a77cda4
Powered by Listbox: http://www.listbox.com