Re: [zfs-discuss] zfs receive slowness - lots of systime spent in genunix`list_next ?

2011-12-05 Thread Lachlan Mulcahy
Hi All,

Just a follow-up: it seems that whatever it was doing, it eventually got
done and the speed picked back up again. The send/recv finally
finished -- I guess I could do with a little patience :)

Lachlan

On Mon, Dec 5, 2011 at 10:47 AM, Lachlan Mulcahy  wrote:

> Hi All,
>
> We are currently doing a zfs send/recv with mbuffer to send incremental
> changes across, and it seems to be running quite slowly, with zfs receive
> the apparent bottleneck.
>
> The process itself seems to be using almost 100% of a single CPU in "sys"
> time.
>
> Wondering if anyone has any ideas whether this is normal, or whether it is
> just going to run forever and never finish...
>
>
> Details: two machines connected via Gigabit Ethernet on the same LAN.
>
> Sending server:
>
> zfs send -i 20111201_1 data@20111205_1 | mbuffer -s 128k -m 1G -O
> tdp03r-int:9090
>
> Receiving server:
>
> mbuffer -s 128k -m 1G -I 9090 | zfs receive -vF tank/db/data
>
> mbuffer showing:
>
> in @  256 KiB/s, out @  256 KiB/s,  306 GiB total, buffer 100% full
>
>
>
> My debug:
>
> DTraceToolkit hotkernel reports:
>
> zfs`lzjb_decompress                        10   0.0%
> unix`page_nextn                            31   0.0%
> genunix`fsflush_do_pages                   37   0.0%
> zfs`dbuf_free_range                       183   0.1%
> genunix`list_next                        5822   3.7%
> unix`mach_cpu_idle                     150261  96.1%
>
>
> Top shows:
>
>    PID USERNAME NLWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
>  22945 root    1  60    0   13M 3004K cpu/6  144:21  3.79% zfs
>    550 root   28  59    0   39M   22M sleep   10:19  0.06% fmd
>
> I'd say the 3.7% or so here is so low because this is aggregate CPU usage
> rather than per-CPU usage. mpstat seems to show the real story.
>
> mpstat 1 shows output much like this each second:
>
> CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
>   00   00   329  108   830   1730 00   0   0 100
>   10   00   1001   940   2310 00   0   0 100
>   20   00320   280510 00   0   0 100
>   30   00180   110000 00   0   0 100
>   40   00166   100200 00   0   0 100
>   50   00 6020000 00   0   0 100
>   60   00 2000000 00   0   0 100
>   70   00 9040000160   0   0 100
>   80   00 6030000 00   3   0  97
>   90   00 3100000 00   0   0 100
>  100   00222   350110 00  89   0  11
>  110   00 2000000 00   0   0 100
>  120   00 3020100 20   0   0 100
>  130   00 2000000 00   0   0 100
>  140   0024   1760020610   0   0 100
>  150   00140   240010 20   0   0 100
>  160   00 2000000 00   0   0 100
>  170   0010280050780   1   0  99
>  180   00 2000000 00   0   0 100
>  190   00 5120000100   0   0 100
>  200   00 2000000 00   0   0 100
>  210   00 9240000 40   0   0 100
>  220   00 4000000 00   0   0 100
>  230   00 2000000 00   0   0 100
>
>
> So I'm led to believe that zfs receive is spending almost 100% of a
> single CPU's time doing a lot of genunix`list_next ...
>
> Any ideas what is going on here?
>
> Best Regards,
> --
> Lachlan Mulcahy
> Senior DBA,
> Marin Software Inc.
> San Francisco, USA
>
> AU Mobile: +61 458 448 721
> US Mobile: +1 (415) 867 2839
> Office : +1 (415) 671 6080
>
>


-- 
Lachlan Mulcahy
Senior DBA,
Marin Software Inc.
San Francisco, USA

AU Mobile: +61 458 448 721
US Mobile: +1 (415) 867 2839
Office : +1 (415) 671 6080


Re: [zfs-discuss] zfs receive slowness - lots of systime spent in genunix`list_next ?

2011-12-05 Thread Lachlan Mulcahy
Hi Bob,


On Mon, Dec 5, 2011 at 12:31 PM, Bob Friesenhahn <
bfrie...@simple.dallas.tx.us> wrote:

> On Mon, 5 Dec 2011, Lachlan Mulcahy wrote:
>
>>
>> Anything else you suggest I'd check for faults? (Though I'm sort of
>> doubting it is an issue, I'm happy to be thorough.)
>>
>
> Try running
>
>  fmdump -ef
>
> and see if new low-level fault events are coming in during the zfs
> receive.
>
>
Just a bunch of what I would guess are unrelated USB errors?

Dec 05 20:58:09.0246 ereport.io.usb.epse
Dec 05 20:58:50.9207 ereport.io.usb.epse
Dec 05 20:59:36.0242 ereport.io.usb.epse
Dec 05 20:59:39.0230 ereport.io.usb.epse
Dec 05 21:00:21.0223 ereport.io.usb.epse
Dec 05 21:01:06.0215 ereport.io.usb.epse
Dec 05 21:01:09.0314 ereport.io.usb.epse
Dec 05 21:01:50.9213 ereport.io.usb.epse
Dec 05 21:02:36.0299 ereport.io.usb.epse
Dec 05 21:02:39.0298 ereport.io.usb.epse
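
(If it's worth digging further, something along these lines -- just a
sketch, using fmdump's class filter -- should dump the full detail of those
USB ereports so I can confirm they really are unrelated:

  fmdump -eV -c ereport.io.usb.epse

The verbose output should at least say which device path the errors are
coming from.)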

Regards,
-- 
Lachlan Mulcahy
Senior DBA,
Marin Software Inc.
San Francisco, USA

AU Mobile: +61 458 448 721
US Mobile: +1 (415) 867 2839
Office : +1 (415) 671 6080


Re: [zfs-discuss] zfs receive slowness - lots of systime spent in genunix`list_next ?

2011-12-05 Thread Bob Friesenhahn

On Mon, 5 Dec 2011, Lachlan Mulcahy wrote:


Anything else you suggest I'd check for faults? (Though I'm sort of doubting
it is an issue, I'm happy to be thorough.)


Try running

  fmdump -ef

and see if new low-level fault events are coming in during the zfs
receive.
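
(As a quick cross-check -- just a sketch, assuming a stock FMA setup -- you
can also ask the fault manager what, if anything, it has already diagnosed:

  fmadm faulty

and use 'fmdump -eV' to see the full detail of individual error events.)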


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/


Re: [zfs-discuss] zfs receive slowness - lots of systime spent in genunix`list_next ?

2011-12-05 Thread Lachlan Mulcahy
Hi Bob,

On Mon, Dec 5, 2011 at 11:19 AM, Bob Friesenhahn <
bfrie...@simple.dallas.tx.us> wrote:

> On Mon, 5 Dec 2011, Lachlan Mulcahy wrote:
>
>> genunix`list_next      5822   3.7%
>> unix`mach_cpu_idle   150261  96.1%
>>
>
> Rather idle.


Actually this is a bit misleading, since it is averaged over all the cores
in the system.

This system has 24 cores, so roughly 90% system time on a single core works
out to about 3.75% of the aggregate:

90 / 24 = 3.75

Similarly, 23 of the 24 cores are close to 100% idle.
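
(A per-thread microstate view along the lines of

  prstat -mL -p 22945

-- just a sketch, using the zfs pid from the top output below -- should show
the same ~90% directly in the SYS column for that single zfs receive thread.)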

> Top shows:
>
>    PID USERNAME NLWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
>  22945 root    1  60    0   13M 3004K cpu/6  144:21  3.79% zfs
>    550 root   28  59    0   39M   22M sleep   10:19  0.06% fmd
>

> Having 'fmd' (fault manager daemon) show up in top at all is rather ominous
> since it implies that faults are actively being reported.  You might want
> to look into what is making it active.


Nothing in /var/adm/messages -- these are the only entries from the last two days:

Dec  4 18:46:20 mslvstdp03r sshd[20926]: [ID 800047 auth.crit] fatal: Read
from socket failed: Connection reset by peer
Dec  5 01:40:07 mslvstdp03r sshd[21808]: [ID 800047 auth.crit] fatal: Read
from socket failed: Connection reset by peer


Anything else you suggest I'd check for faults? (Though I'm sort of
doubting it is an issue, I'm happy to be thorough.)



> So I'm led to believe that zfs receive is spending almost 100% of a
> single CPU's time doing a lot of
> genunix`list_next ...
>

> Or maybe it is only doing 3.7% of this.  There seems to be a whole lot of
> nothing going on.


See the math above -- mpstat also shows a single CPU at around 90% system
time, and since top reports zfs as the only active process, the
circumstantial evidence is fairly strong.


> Any ideas what is going on here?
>

> Definitely check if low-level faults are being reported to fmd.
>

Are there any logs other than /var/adm/messages that I should check?
(Apologies, I'm quite new to (Open)Solaris.)

Regards,
-- 
Lachlan Mulcahy
Senior DBA,
Marin Software Inc.
San Francisco, USA

AU Mobile: +61 458 448 721
US Mobile: +1 (415) 867 2839
Office : +1 (415) 671 6080


Re: [zfs-discuss] zfs receive slowness - lots of systime spent in genunix`list_next ?

2011-12-05 Thread Bill Sommerfeld
On 12/05/11 10:47, Lachlan Mulcahy wrote:
> zfs`lzjb_decompress                        10   0.0%
> unix`page_nextn                            31   0.0%
> genunix`fsflush_do_pages                   37   0.0%
> zfs`dbuf_free_range                       183   0.1%
> genunix`list_next                        5822   3.7%
> unix`mach_cpu_idle                     150261  96.1%

Your best bet in a situation like this -- where there's a lot of CPU time
spent in a generic routine -- is to use an alternate profiling method that
shows complete stack traces rather than just the top function on the stack.

Often the names of functions two, three, or four frames deep in the stack
will point at what's really responsible.

Something as simple as:

dtrace -n 'profile-1001 { @[stack()] = count(); }'

(let it run for a bit, then interrupt it) should show who's calling
list_next() so much.
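
If the output is noisy, a variant along these lines (just a sketch -- the
kernel-only predicate, run length, and truncation limit are arbitrary) keeps
only the 20 hottest kernel stacks and exits by itself after 30 seconds:

dtrace -n 'profile-1001 /arg0/ { @[stack()] = count(); } tick-30s { trunc(@, 20); exit(0); }'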

- Bill


Re: [zfs-discuss] zfs receive slowness - lots of systime spent in genunix`list_next ?

2011-12-05 Thread Bob Friesenhahn

On Mon, 5 Dec 2011, Lachlan Mulcahy wrote:

genunix`list_next     5822   3.7%
unix`mach_cpu_idle  150261  96.1%


Rather idle.


Top shows:

   PID USERNAME NLWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 22945 root    1  60    0   13M 3004K cpu/6  144:21  3.79% zfs
   550 root   28  59    0   39M   22M sleep   10:19  0.06% fmd


Having 'fmd' (fault manager daemon) show up in top at all is rather
ominous since it implies that faults are actively being reported.  You
might want to look into what is making it active.



So I'm led to believe that zfs receive is spending almost 100% of a single
CPU's time doing a lot of
genunix`list_next ...


Or maybe it is only doing 3.7% of this.  There seems to be a whole lot 
of nothing going on.



Any ideas what is going on here?


Definitely check if low-level faults are being reported to fmd.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/


[zfs-discuss] zfs receive slowness - lots of systime spent in genunix`list_next ?

2011-12-05 Thread Lachlan Mulcahy
Hi All,

We are currently doing a zfs send/recv with mbuffer to send incremental
changes across, and it seems to be running quite slowly, with zfs receive
the apparent bottleneck.

The process itself seems to be using almost 100% of a single CPU in "sys"
time.

Wondering if anyone has any ideas whether this is normal, or whether it is
just going to run forever and never finish...


Details: two machines connected via Gigabit Ethernet on the same LAN.

Sending server:

zfs send -i 20111201_1 data@20111205_1 | mbuffer -s 128k -m 1G -O
tdp03r-int:9090

Receiving server:

mbuffer -s 128k -m 1G -I 9090 | zfs receive -vF tank/db/data

mbuffer showing:

in @  256 KiB/s, out @  256 KiB/s,  306 GiB total, buffer 100% full
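
(Side note: on the receiving side, something like

  zpool iostat tank 5

-- just a sketch, with the pool name taken from the receive target -- would
show whether the incoming stream is actually making it out to disk or
whether writes have stalled entirely.)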



My debug:

DTraceToolkit hotkernel reports:

zfs`lzjb_decompress                        10   0.0%
unix`page_nextn                            31   0.0%
genunix`fsflush_do_pages                   37   0.0%
zfs`dbuf_free_range                       183   0.1%
genunix`list_next                        5822   3.7%
unix`mach_cpu_idle                     150261  96.1%


Top shows:

   PID USERNAME NLWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 22945 root    1  60    0   13M 3004K cpu/6  144:21  3.79% zfs
   550 root   28  59    0   39M   22M sleep   10:19  0.06% fmd

I'd say the 3.7% or so here is so low because this is aggregate CPU usage
rather than per-CPU usage. mpstat seems to show the real story.

mpstat 1 shows output much like this each second:

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  00   00   329  108   830   1730 00   0   0 100
  10   00   1001   940   2310 00   0   0 100
  20   00320   280510 00   0   0 100
  30   00180   110000 00   0   0 100
  40   00166   100200 00   0   0 100
  50   00 6020000 00   0   0 100
  60   00 2000000 00   0   0 100
  70   00 9040000160   0   0 100
  80   00 6030000 00   3   0  97
  90   00 3100000 00   0   0 100
 100   00222   350110 00  89   0  11
 110   00 2000000 00   0   0 100
 120   00 3020100 20   0   0 100
 130   00 2000000 00   0   0 100
 140   0024   1760020610   0   0 100
 150   00140   240010 20   0   0 100
 160   00 2000000 00   0   0 100
 170   0010280050780   1   0  99
 180   00 2000000 00   0   0 100
 190   00 5120000100   0   0 100
 200   00 2000000 00   0   0 100
 210   00 9240000 40   0   0 100
 220   00 4000000 00   0   0 100
 230   00 2000000 00   0   0 100


So I'm led to believe that zfs receive is spending almost 100% of a single
CPU's time doing a lot of genunix`list_next ...

Any ideas what is going on here?

Best Regards,
-- 
Lachlan Mulcahy
Senior DBA,
Marin Software Inc.
San Francisco, USA

AU Mobile: +61 458 448 721
US Mobile: +1 (415) 867 2839
Office : +1 (415) 671 6080