Re: Issue with hast replication
Mikolaj Golub (to.my.trociny) writes:
>
> I just tried to reproduce this and failed. For me a new resource was added
> without problems on reload.
>
> Mar 17 20:04:24 kopusha hastd[52678]: Reloading configuration...
> Mar 17 20:04:24 kopusha hastd[52678]: Keep listening on address 0.0.0.0:7771.
> Mar 17 20:04:24 kopusha hastd[52678]: Resource rtest added.
> Mar 17 20:04:24 kopusha hastd[52678]: Configuration reloaded successfully.
>
> You sent SIGHUP to the master process and on both hosts, didn't you?

    Nope :-| Duh.

> Could you please provide more details if you still fail to add new resources
> on the fly (configuration, log messages).

    I'll look. Right now, I need to try and reproduce the original
    hast-over-zvol problem.

    Thanks,
    Phil

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Issue with hast replication
On Tue, 13 Mar 2012 00:22:23 +0100 Phil Regnauld wrote:

PR> (side note: hastd doesn't pick up configuration changes even with SIGHUP,
PR> which makes it hard to provision new resources on the fly)

I just tried to reproduce this and failed. For me a new resource was added
without problems on reload.

Mar 17 20:04:24 kopusha hastd[52678]: Reloading configuration...
Mar 17 20:04:24 kopusha hastd[52678]: Keep listening on address 0.0.0.0:7771.
Mar 17 20:04:24 kopusha hastd[52678]: Resource rtest added.
Mar 17 20:04:24 kopusha hastd[52678]: Configuration reloaded successfully.

You sent SIGHUP to the master process and on both hosts, didn't you?

Could you please provide more details if you still fail to add new resources
on the fly (configuration, log messages).

--
Mikolaj Golub
Re: Issue with hast replication
Mikolaj Golub (to.my.trociny) writes:
>
> What about failed counters like mbuf_alloc_failed_count,
> dma_map_addr_rx_failed_count, dma_map_addr_tx_failed_count?

    dev.bce.0.l2fhdr_error_count: 0
    dev.bce.0.mbuf_alloc_failed_count: 0
    dev.bce.0.mbuf_frag_count: 0
    dev.bce.0.dma_map_addr_rx_failed_count: 0
    dev.bce.0.dma_map_addr_tx_failed_count: 0
    dev.bce.0.unexpected_attention_count: 0
Re: Issue with hast replication
On Tue, 13 Mar 2012 22:19:28 +0100 Phil Regnauld wrote:

PR> dev.bce.0.l2fhdr_error_count: 0
PR> dev.bce.0.stat_emac_tx_stat_dot3statsinternalmactransmiterrors: 0
PR> dev.bce.0.stat_Dot3StatsCarrierSenseErrors: 0
PR> dev.bce.0.stat_Dot3StatsFCSErrors: 0
PR> dev.bce.0.stat_Dot3StatsAlignmentErrors: 0

What about failed counters like mbuf_alloc_failed_count,
dma_map_addr_rx_failed_count, dma_map_addr_tx_failed_count?

--
Mikolaj Golub
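[Editor's note: spotting which of these counters move during a failure window is easier with a diff of two snapshots. A hypothetical helper, not from the thread; the function name and file names are illustrative:]

```shell
# Hypothetical helper: print only the counters whose value changed between
# two "name: value" snapshots, e.g.:
#   sysctl dev.bce.0 > before.txt; <run synchronization>; sysctl dev.bce.0 > after.txt
#   diff_counters before.txt after.txt
diff_counters() {
    awk -F': ' '
        NR == FNR { before[$1] = $2; next }     # first file: remember values
        ($1 in before) && before[$1] != $2 {    # second file: report changes
            printf "%s: %s -> %s\n", $1, before[$1], $2
        }' "$1" "$2"
}
```

Any counter it prints is one that grew while the ENOMEM errors were being logged, which narrows down what to stare at.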
Re: Issue with hast replication
Mikolaj Golub (to.my.trociny) writes:
>
> Ok. So it is send(2). I suppose the network driver could generate the
> error. Did you mention which network adapter you have?

    Not yet.

    bce0: mem 0xf400-0xf5ff irq 16 at device 0.0 on pci2
    bce0: ASIC (0x57092003); Rev (C0); Bus (PCIe x2, 2.5Gbps); B/C (4.6.4);
          Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|MFW); MFW (NCSI 1.0.3)

> PR> No obvious errors there either, but again what should I look out for?
>
> I would look at the sysctl -a dev. statistics and try to find whether there
> is a correlation between the ENOMEM failures and growing error counters.

    0 errors:

    dev.bce.0.l2fhdr_error_count: 0
    dev.bce.0.stat_emac_tx_stat_dot3statsinternalmactransmiterrors: 0
    dev.bce.0.stat_Dot3StatsCarrierSenseErrors: 0
    dev.bce.0.stat_Dot3StatsFCSErrors: 0
    dev.bce.0.stat_Dot3StatsAlignmentErrors: 0

> Looking at buffer usage from 'netstat -nax' output taken during
> synchronization (on both hosts) could provide useful info on where the
> bottleneck is. top -HS output might be useful too.

    Good point. I'll have to attempt to recreate the problem, as the volume
    has replicated without errors. Typical.

    Cheers,
    Phil
Re: Issue with hast replication
On Tue, 13 Mar 2012 00:22:23 +0100 Phil Regnauld wrote:

PR> Mikolaj Golub (to.my.trociny) writes:
>>
>> It looks like in the case of hastd it was send(2) that returned ENOMEM, but
>> it would be good to check. Could you please start synchronization again,
>> ktrace the primary worker process when the ENOMEM errors are observed, and
>> show the output here?

PR> Ok, took a little while, as running ktrace on the hastd does slow it down
PR> significantly, and the error normally occurs at 30-90 sec intervals.

PR> 0x0f90 b2f3 3ad5 e657 7f0f 3e50 698f 5deb 12af |..:..W..>Pi.]...|
PR> 0x0fa0 740d c343 6e80 75f3 e1a7 bfdf a4c1 f6a6 |t..Cn.u.|
PR> 0x0fb0 ea85 655d e423 bd5e 42f7 7e9a 05d2 363a |..e].#.^B.~...6:|
PR> 0x0fc0 025e a7b5 0956 417c f31c a6eb 2cd9 d073 |.^...VA|,..s|
PR> 0x0fd0 2589 e8c0 d76a 889f 8345 eeaf f2a0 c2d6 |%j...E..|
PR> 0x0fe0 b89e aaef fee2 6593 e515 7271 88aa cf66 |..e...rq...f|
PR> 0x0ff0 d272 411a 7289 d6c9 6643 bdbe 3c8c 8ae8 |.rA.r...fC..<...|
PR> 50959 hastd CALL sendto(0x6,0x8024bf000,0x8000,0x2,0,0)
PR> 50959 hastd RET  sendto 32768/0x8000
PR> 50959 hastd CALL sendto(0x6,0x8024bf000,0x8000,0x2,0,0)
PR> 50959 hastd RET  sendto -1 errno 12 Cannot allocate memory
PR> 50959 hastd CALL clock_gettime(0xd,0x7f3f86f0)
PR> 50959 hastd RET  clock_gettime 0
PR> 50959 hastd CALL getpid
PR> 50959 hastd RET  getpid 50959/0xc70f
PR> 50959 hastd CALL sendto(0x3,0x7f3f8780,0x84,0,0,0)
PR> 50959 hastd GIO  fd 3 wrote 132 bytes
PR>       "<27>Mar 12 23:42:43 hastd[50959]: [hvol] (primary) Unable to sen\
PR>        d request (Cannot allocate memory): WRITE(8626634752, 131072)."
PR> 50959 hastd RET  sendto 132/0x84
PR> 50959 hastd CALL close(0x7)
PR> 50959 hastd RET  close 0

Ok. So it is send(2). I suppose the network driver could generate the
error. Did you mention which network adapter you have?

>> If it is send(2) that fails then monitoring netstat and network driver
>> statistics might be helpful. Something like
>>
>> netstat -nax
>> netstat -naT
>> netstat -m
>> netstat -nid

PR> I could run this in a loop, but that would be a lot of data, and might
PR> not be appropriate to paste here.

PR> I didn't see any obvious errors, but I'm not sure what I'm looking for.
PR> netstat -m didn't show anything close to running out of buffers or
PR> clusters...

>> sysctl -a dev.
>>
>> And maybe
>>
>> vmstat -m
>> vmstat -z

PR> No obvious errors there either, but again what should I look out for?

I would look at the sysctl -a dev. statistics and try to find whether there
is a correlation between the ENOMEM failures and growing error counters.

PR> In the meantime, I've also experimented with a few different scenarios,
PR> and I'm quite puzzled.

PR> For instance, I configured one of the other gigabit cards on each host to
PR> provide a dedicated replication network. The main difference is that up
PR> until now this has been running using tagged vlans. To be on the safe
PR> side, I decided to use an untagged interface (the second gigabit adapter
PR> in each machine).
PR>
PR> Here's what I observed, and it is very odd:
PR>
PR> - doing a dd ... | ssh dd fails in the same fashion as before
PR> - I created a second zvol + hast resource of just 1 GB, and it replicated
PR>   without any problems, peaking at 75 MB/sec (!) - maybe 1 GB is too
PR>   small?
PR>
PR> (side note: hastd doesn't pick up configuration changes even with SIGHUP,
PR>  which makes it hard to provision new resources on the fly)

PR> - I restarted replication on the 100 G hast resource, and it's currently
PR>   replicating without any problems over the second ethernet, but it's
PR>   dragging along at 9-10 MB/sec, peaking at 29 MB/sec occasionally.

Looking at buffer usage from 'netstat -nax' output taken during
synchronization (on both hosts) could provide useful info on where the
bottleneck is. top -HS output might be useful too.

PR> Earlier, I was observing peaks at 65-70 MB/sec in between failures...

PR> So I don't really know what to conclude :-|

--
Mikolaj Golub
Re: Issue with hast replication
Mikolaj Golub (to.my.trociny) writes:
>
> It looks like in the case of hastd it was send(2) that returned ENOMEM, but
> it would be good to check. Could you please start synchronization again,
> ktrace the primary worker process when the ENOMEM errors are observed, and
> show the output here?

    Ok, took a little while, as running ktrace on the hastd does slow it
    down significantly, and the error normally occurs at 30-90 sec
    intervals.

    0x0f90 b2f3 3ad5 e657 7f0f 3e50 698f 5deb 12af |..:..W..>Pi.]...|
    0x0fa0 740d c343 6e80 75f3 e1a7 bfdf a4c1 f6a6 |t..Cn.u.|
    0x0fb0 ea85 655d e423 bd5e 42f7 7e9a 05d2 363a |..e].#.^B.~...6:|
    0x0fc0 025e a7b5 0956 417c f31c a6eb 2cd9 d073 |.^...VA|,..s|
    0x0fd0 2589 e8c0 d76a 889f 8345 eeaf f2a0 c2d6 |%j...E..|
    0x0fe0 b89e aaef fee2 6593 e515 7271 88aa cf66 |..e...rq...f|
    0x0ff0 d272 411a 7289 d6c9 6643 bdbe 3c8c 8ae8 |.rA.r...fC..<...|
    50959 hastd CALL sendto(0x6,0x8024bf000,0x8000,0x2,0,0)
    50959 hastd RET  sendto 32768/0x8000
    50959 hastd CALL sendto(0x6,0x8024bf000,0x8000,0x2,0,0)
    50959 hastd RET  sendto -1 errno 12 Cannot allocate memory
    50959 hastd CALL clock_gettime(0xd,0x7f3f86f0)
    50959 hastd RET  clock_gettime 0
    50959 hastd CALL getpid
    50959 hastd RET  getpid 50959/0xc70f
    50959 hastd CALL sendto(0x3,0x7f3f8780,0x84,0,0,0)
    50959 hastd GIO  fd 3 wrote 132 bytes
          "<27>Mar 12 23:42:43 hastd[50959]: [hvol] (primary) Unable to sen\
           d request (Cannot allocate memory): WRITE(8626634752, 131072)."
    50959 hastd RET  sendto 132/0x84
    50959 hastd CALL close(0x7)
    50959 hastd RET  close 0

> If it is send(2) that fails then monitoring netstat and network driver
> statistics might be helpful. Something like
>
> netstat -nax
> netstat -naT
> netstat -m
> netstat -nid

    I could run this in a loop, but that would be a lot of data, and might
    not be appropriate to paste here.

    I didn't see any obvious errors, but I'm not sure what I'm looking for.
    netstat -m didn't show anything close to running out of buffers or
    clusters...

> sysctl -a dev.
>
> And maybe
>
> vmstat -m
> vmstat -z

    No obvious errors there either, but again what should I look out for?

    In the meantime, I've also experimented with a few different scenarios,
    and I'm quite puzzled.

    For instance, I configured one of the other gigabit cards on each host
    to provide a dedicated replication network. The main difference is that
    up until now this has been running using tagged vlans. To be on the safe
    side, I decided to use an untagged interface (the second gigabit adapter
    in each machine).

    Here's what I observed, and it is very odd:

    - doing a dd ... | ssh dd fails in the same fashion as before
    - I created a second zvol + hast resource of just 1 GB, and it
      replicated without any problems, peaking at 75 MB/sec (!) - maybe
      1 GB is too small?

    (side note: hastd doesn't pick up configuration changes even with
     SIGHUP, which makes it hard to provision new resources on the fly)

    - I restarted replication on the 100 G hast resource, and it's currently
      replicating without any problems over the second ethernet, but it's
      dragging along at 9-10 MB/sec, peaking at 29 MB/sec occasionally.

    Earlier, I was observing peaks at 65-70 MB/sec in between failures...

    So I don't really know what to conclude :-|
Re: Issue with hast replication
On Mon, 12 Mar 2012 15:31:27 +0100 Phil Regnauld wrote:

PR> Phil Regnauld (regnauld) writes:
>>
>> 7) ktrace on the destination dd:
>>
>> fstat(0,{ mode=p- ,inode=5,size=16384,blksize=4096 }) = 0 (0x0)
>> lseek(0,0x0,SEEK_CUR) ERR#29 'Illegal seek'

PR> [...]

>> Illegal seek, eh? Any clues?
>>
>> The boxes are identical (HP DL380 G6), though the RAM config is
>> different.
>>
>> Summary:
>>
>> - ssh works fine
>> - h1 zvol to h2 zvol over ssh fails
>> - h1 zvol to h2 /tmp/x over ssh is fine
>> - h2 /dev/zero locally to h2 zvol is fine
>> - h2 /tmp/x locally to h2 zvol fails at first, but works afterwards...

PR> A few more data points: dd from a local zvol to a local zvol on either
PR> machine works fine.

PR> Using nc instead of ssh, this time it's the sender nc dying:

PR> ktrace on the sender:

PR> 47704 nc CALL write(0x3,0x7fff5450,0x800)
PR> 47704 nc RET  write -1 errno 32 Broken pipe
PR> 47704 nc PSIG SIGPIPE SIG_DFL code=0x10006

PR> truss on the sender:

PR> poll({3/POLLIN 0/POLLIN},2,-1) = 2 (0x2)
PR> read(3,0x7fff5450,2048) ERR#54 'Connection reset by peer'
PR> close(3) = 0 (0x0)

PR> On tcpdump, I do see the receiver send a FIN when using nc.
PR> When using ssh, the sender is sending the FIN.

PR> Anything else I can look for?

It looks like in the case of hastd it was send(2) that returned ENOMEM, but
it would be good to check. Could you please start synchronization again,
ktrace the primary worker process when the ENOMEM errors are observed, and
show the output here?

If it is send(2) that fails then monitoring netstat and network driver
statistics might be helpful. Something like

netstat -nax
netstat -naT
netstat -m
netstat -nid

sysctl -a dev.

And maybe

vmstat -m
vmstat -z

--
Mikolaj Golub
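[Editor's note: the statistics Mikolaj lists are only useful if captured repeatedly while the synchronization runs. A hypothetical snapshot loop, not from the thread; the function name, count, and interval are illustrative:]

```shell
# Hypothetical: take timestamped snapshots of the suggested statistics so
# counter jumps can be lined up with ENOMEM entries in the hastd log.
snapshot_stats() {
    count=${1:-12} interval=${2:-5}
    i=0
    while [ "$i" -lt "$count" ]; do
        date
        for cmd in 'netstat -nax' 'netstat -m' 'vmstat -z'; do
            echo "== $cmd =="
            $cmd || true   # tolerate a command being unavailable
        done
        sleep "$interval"
        i=$((i + 1))
    done
}
# e.g. run on both hosts during a sync:
#   snapshot_stats 60 5 > /var/tmp/hast-stats.log 2>&1
```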
Re: Issue with hast replication
Phil Regnauld (regnauld) writes:
>
> 7) ktrace on the destination dd:
>
> fstat(0,{ mode=p- ,inode=5,size=16384,blksize=4096 }) = 0 (0x0)
> lseek(0,0x0,SEEK_CUR) ERR#29 'Illegal seek'
[...]
> Illegal seek, eh ? Any clues ?
>
> The boxes are identical (HP DL380 G6), though the RAM config is different.
>
> Summary:
>
> - ssh works fine
> - h1 zvol to h2 zvol over ssh fails
> - h1 zvol to h2 /tmp/x over ssh is fine
> - h2 /dev/zero locally to h2 zvol is fine
> - h2 /tmp/x locally to h2 zvol fails at first, but works afterwards...

    A few more data points: dd from a local zvol to a local zvol on either
    machine works fine.

    Using nc instead of ssh, this time it's the sender nc dying:

    ktrace on the sender:

    47704 nc CALL write(0x3,0x7fff5450,0x800)
    47704 nc RET  write -1 errno 32 Broken pipe
    47704 nc PSIG SIGPIPE SIG_DFL code=0x10006

    truss on the sender:

    poll({3/POLLIN 0/POLLIN},2,-1) = 2 (0x2)
    read(3,0x7fff5450,2048) ERR#54 'Connection reset by peer'
    close(3) = 0 (0x0)

    On tcpdump, I do see the receiver send a FIN when using nc.
    When using ssh, the sender is sending the FIN.

    Anything else I can look for?
Re: Issue with hast replication
Mikolaj Golub (trociny) writes:
>
> PR> Mar 11 02:02:30 h1 hastd[2282]: [hvol] (primary) Disconnected from tcp4://192.168.1.200.
> PR> Mar 11 02:02:30 h1 hastd[2282]: [hvol] (primary) Unable to write synchronization data: Cannot allocate memory.
> PR> Mar 11 02:02:41 h1 hastd[2282]: [hvol] (primary) Unable to send request (Cannot allocate memory): WRITE(31642091520, 131072).
>
> 31642091520 looks like a rather large offset for a 10 GB volume...

    Sorry, that should have been 100 G - I typed from memory instead of
    copy-pasting.

> Just to be more confident that this is a HAST issue could you please try
> the following experiment?
>
> 1) Stop hastd on h2.
>
> 2) On h1 run something like below:
>
> dd if=/dev/zvol/zfs/hvol bs=131072 | ssh h2 dd bs=131072 of=/dev/zvol/zfs/hvol
>
> (copy hvol from h1 to h2 without hastd to see if it will succeed).
>
> Note: you will need to recreate the HAST provider on the secondary after
> this.

    Ok, this is interesting. (For debugging purposes I've renamed the target
    zvol to "junk"; you'll see why below.)

    1) As you suggested:

    h1# dd if=/dev/zvol/zfs/hvol bs=131072 | ssh h2 dd bs=131072 of=/dev/zvol/zfs/junk
    dd: /dev/zvol/zfs/junk: Invalid argument
    0+6 records in
    0+5 records out
    131072 bytes transferred in 0.002344 secs (55920640 bytes/sec)

    To be certain which dd was complaining, I renamed the target zvol.

    2) Tried repeatedly; sometimes the number of bytes is a bit different:

    0+7 records in
    0+6 records out
    147456 bytes transferred in 0.002448 secs (60233277 bytes/sec)

    And yes, hastd is stopped on h2.

    3) I tried dd'ing zero to the zvol locally on h2:

    h2# dd if=/dev/zero of=/dev/zvol/zfs/junk bs=131072
    ^C1817+0 records in
    1816+0 records out
    238026752 bytes transferred in 1.582006 secs (150458820 bytes/sec)

    That works, until I ^C it.

    4) I tried redirecting the output of the dd | ssh to a file on the h2
    side:

    h1# dd if=/dev/zvol/zfs/hvol bs=131072 | ssh h2 dd bs=131072 of=/tmp/x
    ^C653+0 records in
    652+0 records out
    85458944 bytes transferred in 2.408074 secs (35488506 bytes/sec)

    That works too, until I ^C it.

    5) Things get even weirder - if I then go over to h2 and dd the "/tmp/x"
    test file over to the zvol:

    h2# dd if=x bs=131072 of=/dev/zvol/zfs/junk
    dd: /dev/zvol/zfs/junk: Invalid argument
    652+1 records in
    652+0 records out
    85458944 bytes transferred in 0.444571 secs (192227879 bytes/sec)

    Note that the file /tmp/x is 86917120 bytes long.

    6) I try to copy more data into /tmp/x - it's now 291946496 (~280 MB):

    h2# dd if=x bs=131072 of=/dev/zvol/zfs/junk
    2227+1 records in
    2227+1 records out
    291946496 bytes transferred in 3.564129 secs (81912441 bytes/sec)

    No more "invalid argument"...

    7) ktrace on the destination dd:

    [...]
    \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\
    \0"
     5807 dd RET  read 17992/0x4648
     5807 dd CALL write(0x3,0x800c09000,0x4648)
     5807 dd RET  write -1 errno 22 Invalid argument
     5807 dd CALL write(0x2,0x7fffd300,0x4)
     5807 dd GIO  fd 2 wrote 4 bytes
           "dd: "
     5807 dd RET  write 4
     5807 dd CALL write(0x2,0x7fffd3e0,0x12)
     5807 dd GIO  fd 2 wrote 18 bytes
           "/dev/zvol/zfs/junk"

    truss is a bit more informative:

    fstat(0,{ mode=p- ,inode=5,size=16384,blksize=4096 }) = 0 (0x0)
    lseek(0,0x0,SEEK_CUR) ERR#29 'Illegal seek'

    Illegal seek, eh? Any clues?

    The boxes are identical (HP DL380 G6), though the RAM config is
    different.

    Summary:

    - ssh works fine
    - h1 zvol to h2 zvol over ssh fails
    - h1 zvol to h2 /tmp/x over ssh is fine
    - h2 /dev/zero locally to h2 zvol is fine
    - h2 /tmp/x locally to h2 zvol fails at first, but works afterwards...
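[Editor's note: one plausible reading of the "0+6 records in" counts and the errno 22 after a 17992-byte write, offered here as an assumption rather than a confirmed diagnosis: dd reading from a pipe gets short reads, so bs=131072 alone does not guarantee 131072-byte writes, and a raw device that requires sector-multiple writes would reject the odd-sized ones with EINVAL. A sketch, no zvol needed and file names illustrative, showing how obs= makes dd reassemble full output blocks:]

```shell
# Assumption being illustrated: with only bs=, each short read from the pipe
# becomes one (possibly odd-sized) write. With ibs=/obs=, dd buffers input
# and emits full obs-sized output blocks, which a sector-strict device accepts.
dd if=/dev/zero bs=512 count=256 2>/dev/null |
    dd of=/tmp/reblocked ibs=512 obs=131072 2>/dev/null
# /tmp/reblocked is exactly 256*512 = 131072 bytes, emitted as one full block.
```

If this explanation holds, `dd ... | ssh h2 dd obs=131072 of=/dev/zvol/zfs/junk` (or `conv=osync` to pad the final block) would avoid the "Invalid argument" without involving hastd at all.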
Re: Issue with hast replication
On Sun, 11 Mar 2012 19:54:57 +0100 Phil Regnauld wrote:

PR> Hi,

PR> I've got a fairly simple setup: two hosts running 9.0-R (will upgrade to
PR> stable if told to, but want to check here first), ZFS and HAST. HAST is
PR> configured to run on top of zvols configured on each host, as illustrated:

PR>      FS                     FS
PR>   +------+               +------+
PR>   | hvol |  <- hastd ->  | hvol |
PR>   +------+               +------+
PR>   | zvol |               | zvol |
PR>   +------+               +------+
PR>   | zfs  |               | zfs  |
PR>   +------+               +------+
PR>     h1                     h2

PR> Connection is gigabit to the same switch. No issues with large TCP
PR> transfers such as SCP/FTP.

PR> Config is vanilla:

PR> # zfs create -V 10G zfs/hvol

PR> hast.conf:

PR> resource hvol {
PR>     on h1 {
PR>         local /dev/zvol/zfs/hvol
PR>         remote tcp4://192.168.1.100
PR>     }
PR>     on h2 {
PR>         local /dev/zvol/zfs/hvol
PR>         remote tcp4://192.168.1.200
PR>     }
PR> }

PR> h1 is behaving fine as primary, either with h2 turned off or in init -
PR> but as soon as I set the role to secondary for h2, the receiver
PR> repeatedly crashes and restarts - see the traces below.

PR> Primary:

PR> Mar 11 02:02:30 h1 hastd[2282]: [hvol] (primary) Disconnected from tcp4://192.168.1.200.
PR> Mar 11 02:02:30 h1 hastd[2282]: [hvol] (primary) Unable to write synchronization data: Cannot allocate memory.
PR> Mar 11 02:02:41 h1 hastd[2282]: [hvol] (primary) Unable to send request (Cannot allocate memory): WRITE(31642091520, 131072).

31642091520 looks like a rather large offset for a 10 GB volume...

Just to be more confident that this is a HAST issue could you please try the
following experiment?

1) Stop hastd on h2.

2) On h1 run something like below:

dd if=/dev/zvol/zfs/hvol bs=131072 | ssh h2 dd bs=131072 of=/dev/zvol/zfs/hvol

(copy hvol from h1 to h2 without hastd to see if it will succeed).

Note: you will need to recreate the HAST provider on the secondary after this.

--
Mikolaj Golub
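[Editor's note: the "rather large offset" remark checks out by quick arithmetic - the WRITE offset is well past a 10 GB volume (and comfortably inside the 100 GB one Phil later says the zvol actually is):]

```shell
# Convert the WRITE offset from the hastd log to GiB:
# 31642091520 / 2^30 is about 29.5 GiB, i.e. far beyond a 10 GB zvol.
offset=31642091520
awk -v off="$offset" 'BEGIN { printf "%.1f GiB\n", off / (2^30) }'
# prints "29.5 GiB"
```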