Hi all,
Please code review following simple fix (6 lines changed):
http://cr.opensolaris.org/~aragorn/6873852/
Background (root cause):
========================
The problem is with uninitialized ksum_kstat in nfsstat.c file.
First, ksum_kstat is allocated using malloc() in setup(). Then, cn_print() is
called and in turn the nfsstat_kstat_sum() is invoked. As a third parameter in
the nfsstat_kstat_sum() call the ksum_kstat is passed. Please note the
cn_print()/nfsstat_kstat_sum() pair can be called multiple times in a loop.
nfsstat_kstat_sum() assumes several things for its third parameter (named sum
inside the nfsstat_kstat_sum() implementation) which are not always true:
1. sum is allocated (it is not NULL); Why? malloc() in setup() can fail and
return NULL
2. sum->ks_data is NULL when nfsstat_kstat_sum() is called first time; Why?
nobody is zeroing the ksum_kstat after malloc() in setup()
The assumption #2 caused the core reported here. libumem filled the ksum_kstat
in the malloc() by some data, then the nfsstat_kstat_sum() did not called
nfsstat_kstat_copy() leaving sum->ks_data to point to nowhere. And finally we
failed and cored in following "for" statement while trying to dereference
sum->ks_data (aliased here to knpsum).
999 nfsstat_kstat_sum(kstat_t *kstat1, kstat_t *kstat2, kstat_t *sum)
1000 {
1001 int i;
1002 kstat_named_t *knp1, *knp2, *knpsum;
1003 if (kstat1 == NULL || kstat2 == NULL)
1004 return;
1005
1006 knp1 = KSTAT_NAMED_PTR(kstat1);
1007 knp2 = KSTAT_NAMED_PTR(kstat2);
1008 if (sum->ks_data == NULL)
1009 nfsstat_kstat_copy(kstat1, sum, 0);
1010 knpsum = KSTAT_NAMED_PTR(sum);
1011
1012 for (i = 0; i < (kstat1->ks_ndata); i++)
1013 knpsum[i].value.ui64 = knp1[i].value.ui64 +
knp2[i].value.ui64;
1014 }
The change aims to fix both wrong assumptions in nfsstat_kstat_sum() - see
above. In addition one small bug in safe_zalloc() is fixed too.
Thanks for the review.
--
Marcel Telka
Solaris RPE