Hi,
I recently started to get huge delays when booting my opensolaris box,
since upgrading to 2009.06 although not for the first couple of boots
afterwards. Thanks to a pointer from someone in #opensolaris, I've
discovered it's the zpool import of my storage pool that is taking ages.
I re-installed again, and have captured everything that I know how to
below as I import the pool.
Initial ZFS status after install
--------------------------------
Last login: Sun Jul 19 05:01:35 2009
Sun Microsystems Inc. SunOS 5.11 snv_111b November 2008
dunc at prefect:~$ pfexec bash
dunc at prefect:~# zpool list
NAME SIZE USED AVAIL CAP HEALTH ALTROOT
rpool 37G 3.90G 33.1G 10% ONLINE -
dunc at prefect:~# zfs list
NAME USED AVAIL REFER MOUNTPOINT
rpool 4.74G 31.7G 77.5K /rpool
rpool/ROOT 2.86G 31.7G 19K legacy
rpool/ROOT/opensolaris 2.86G 31.7G 2.86G /
rpool/dump 959M 31.7G 959M -
rpool/export 890K 31.7G 21K /export
rpool/export/home 869K 31.7G 21K /export/home
rpool/export/home/dunc 848K 31.7G 848K /export/home/dunc
rpool/swap 959M 32.5G 101M -
Importing the pool
------------------
dunc at prefect:~# zpool import
pool: store
id: 6301690472664648097
state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and
the '-f' flag.
see: http://www.sun.com/msg/ZFS-8000-EY
config:
store ONLINE
mirror ONLINE
c10d0 ONLINE
c9d0 ONLINE
mirror ONLINE
c11d0 ONLINE
c12d0 ONLINE
dunc at prefect:~#
dunc at prefect:~# date && zpool import -f store && date
Sunday, 19 July 2009 05:07:19 BST
Sunday, 19 July 2009 06:05:53 BST
dunc at prefect:~#
As you can see the import took approximately 1 hour. I'm sure this can't
be right, but I don't know why it's happening.
While this was going on, devfsadm was busy the whole time, please see
below some captured output from top at intervals.
Just after I ran the import:-
load averages: 0.66, 0.45, 0.22; up 0+00:04:36
05:07:59
54 processes: 52 sleeping, 1 running, 1 on cpu
CPU states: 16.0% idle, 78.2% user, 21.8% kernel, 0.0% iowait, 0.0% swap
Kernel: 1050 ctxsw, 319 trap, 498 intr, 8951 syscall, 235 flt
Memory: 1920M phys mem, 1078M free mem, 959M total swap, 959M free swap
PID USERNAME NLWP PRI NICE SIZE RES STATE TIME CPU COMMAND
63 root 7 31 0 13M 5944K run 0:22 76.16% devfsadm
472 root 19 59 0 35M 29M sleep 0:07 18.62% fmd
630 root 1 59 0 7364K 3044K sleep 0:00 1.05% zpool
123 root 17 59 0 7268K 2716K sleep 0:00 0.17% syseventd
641 dunc 1 59 0 3988K 2756K cpu 0:00 0.10% top
At this point, the pool seems to have imported OK though, as I can do this:-
dunc at prefect:~$ zpool list
NAME SIZE USED AVAIL CAP HEALTH ALTROOT
rpool 37G 3.90G 33.1G 10% ONLINE -
store 1.13T 691G 469G 59% ONLINE -
dunc at prefect:~$
More top output.....
load averages: 1.08, 1.29, 1.08; up 0+00:28:31
05:31:54
57 processes: 55 sleeping, 1 running, 1 on cpu
CPU states: 5.8% idle, 82.0% user, 12.2% kernel, 0.0% iowait, 0.0% swap
Kernel: 352 ctxsw, 265 trap, 491 intr, 4838 syscall, 178 flt
Memory: 1920M phys mem, 427M free mem, 959M total swap, 959M free swap
PID USERNAME NLWP PRI NICE SIZE RES STATE TIME CPU COMMAND
63 root 7 32 0 14M 7184K run 20:37 87.04% devfsadm
472 root 19 59 0 226M 220M sleep 1:36 5.37% fmd
630 root 1 59 0 7364K 3044K sleep 0:04 0.25% zpool
650 dunc 1 59 0 4596K 3220K cpu 0:01 0.15% top
load averages: 1.09, 1.15, 1.10; up 0+00:39:48
05:43:11
57 processes: 55 sleeping, 1 running, 1 on cpu
CPU states: 1.4% idle, 89.0% user, 9.6% kernel, 0.0% iowait, 0.0% swap
Kernel: 233 ctxsw, 132 trap, 426 intr, 4379 syscall, 67 flt
Memory: 1920M phys mem, 334M free mem, 959M total swap, 959M free swap
PID USERNAME NLWP PRI NICE SIZE RES STATE TIME CPU COMMAND
63 root 7 11 0 14M 7280K run 30:13 94.17% devfsadm
472 root 19 59 0 283M 276M sleep 2:00 3.16% fmd
650 dunc 1 59 0 4644K 3268K cpu 0:02 0.16% top
630 root 1 59 0 7364K 3044K sleep 0:05 0.09% zpool
load averages: 1.03, 1.09, 1.12; up 0+00:56:19
05:59:42
57 processes: 55 sleeping, 1 running, 1 on cpu
CPU states: 6.4% idle, 85.6% user, 8.0% kernel, 0.0% iowait, 0.0% swap
Kernel: 170 ctxsw, 116 trap, 401 intr, 4089 syscall, 65 flt
Memory: 1920M phys mem, 233M free mem, 959M total swap, 959M free swap
PID USERNAME NLWP PRI NICE SIZE RES STATE TIME CPU COMMAND
63 root 7 21 0 15M 8200K run 45:00 90.86% devfsadm
472 root 19 59 0 363M 357M sleep 2:32 1.90% fmd
650 dunc 1 59 0 4700K 3332K cpu 0:04 0.17% top
630 root 1 59 0 7364K 3044K sleep 0:07 0.10% zpool
All that seems to happen is fmd uses more and more memory.
Eventually, it completes as you can see at the beginning, and I can use
the pool with no problems. At each reboot from now on though, I have to
wait an hour.
Sorry for the long email, I wanted to capture all the information.
Can anybody shed some light on this, or tell me how to start to debug
this problem?
Thanks in advance,
Dunc