Re: ggate still broken on 6.2-RC1 for amd64.
On Mon, Dec 11, 2006 at 02:47:41AM -0500, David Gilbert wrote:
> That doesn't square with my experience. Although bigger buffers could
> be involved in a performance problem, what we're dealing with here is
> a _zero_ traffic situation. It seems that it works enough for tasting
> to be successful, but any significant load wedges it hard.

The problem I observed was also a zero traffic situation. A quick way
to test is to do something like this (assuming you don't care about the
contents of the device!):

    dd if=/dev/zero of=/dev/ggateX bs=1m

and watch the network traffic to see what happens. When I ran into it,
small block sizes worked fine, but anything bigger than the send buffer
size would cause the entire ggate device to wedge with zero traffic.
The ggatec logs in my mail archive say 128k, which itself is a little
odd because I thought GEOM broke big transfers into 64k chunks.

In any case, ggatec got stuck in a loop getting EAGAIN from send(), so
the packets never made it out to the wire.

However, checking my mail archive also indicates that was a year ago,
so chances are this is a different problem. The symptoms just sounded a
little familiar.

Craig

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
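A loop that keeps getting EAGAIN from send() is what one would expect
when a non-blocking socket is retried without waiting for buffer space
to free up. The sketch below shows one generic way to handle that case
in Python. It is not ggatec's code: the helper name
send_all_nonblocking and the timeout value are made up for
illustration, and it only demonstrates the general pattern of waiting
for writability and resuming after a partial send.

```python
import select
import socket


def send_all_nonblocking(sock, payload, timeout=5.0):
    """Send all of payload on a non-blocking socket (hypothetical helper).

    When the send buffer is full, send() raises BlockingIOError
    (EAGAIN/EWOULDBLOCK). Instead of retrying in a tight loop, wait in
    select() until the socket is writable again, then continue from the
    first unsent byte.
    """
    view = memoryview(payload)
    sent = 0
    while sent < len(view):
        try:
            # send() may accept only part of the data; track the offset.
            sent += sock.send(view[sent:])
        except BlockingIOError:
            # No room in the send buffer: sleep until there is, or give up.
            _, writable, _ = select.select([], [sock], [], timeout)
            if not writable:
                raise TimeoutError(
                    "peer stopped draining after %d bytes" % sent
                )
    return sent
```

With a payload much larger than the send buffer, the function only
returns once the peer has drained enough for every byte to be handed
to the kernel; if the peer never reads, it raises TimeoutError rather
than spinning.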
Re: ggate still broken on 6.2-RC1 for amd64.
Craig == Craig Boston [EMAIL PROTECTED] writes:

Craig> On Mon, Dec 11, 2006 at 02:47:41AM -0500, David Gilbert wrote:
Craig> > That doesn't square with my experience. Although bigger
Craig> > buffers could be involved in a performance problem, what we're
Craig> > dealing with here is a _zero_ traffic situation. It seems that
Craig> > it works enough for tasting to be successful, but any
Craig> > significant load wedges it hard.

Craig> The problem I observed was also a zero traffic situation. A
Craig> quick way to test is to do something like this (assuming you
Craig> don't care about the contents of the device!)

Craig> dd if=/dev/zero of=/dev/ggateX bs=1m

Craig> and watch the network traffic to see what happens. When I ran
Craig> into it, small block sizes worked fine, but anything bigger
Craig> than the send buffer size would cause the entire ggate device
Craig> to wedge with zero traffic. The ggatec logs in my mail archive
Craig> say 128k, which itself is a little odd because I thought GEOM
Craig> broke big transfers into 64k chunks.

Craig> In any case, ggatec got stuck in a loop getting EAGAIN from
Craig> send(), so the packets never made it out to the wire.

Craig> However checking my mail archive also indicates that was a year
Craig> ago so chances are this is a different problem. The symptoms
Craig> just sounded a little familiar.

Urm... what would be the transfer size that the filesystem prefers to
use? Also, what transfer size does the gmirror sync use?

Dave.

--
|David Gilbert, Independent Contractor. | Two things can be         |
|Mail: [EMAIL PROTECTED]                | equal if and only if they |
|http://daveg.ca                        | are precisely opposite.   |
=GLO
Re: ggate still broken on 6.2-RC1 for amd64.
Craig Boston wrote:
> Have you tried increasing the send/receive buffer size? In my local
> ggate setup I'm running both the client and server with the options
> -R 196608 -S 196608. I added it a while back after discovering that
> the default buffer size was inadequate in certain situations and
> would sometimes cause large block sized I/O to hang.

Heh, this is funny. I have reports from another source, who
_decreases_ bufsize to 8kB, because that is giving him the most
performance.

Since I'm using HPS' USB stack I can't use my uplcom device and
therefore cannot usefully test some more ggate/gmirror scenarios on
-CURRENT ...

But I'll whip up a ggate test case.

Ulrich Spoerlein
--
A: Yes.
Q: Are you sure?
A: Because it reverses the logical flow of conversation.
Q: Why is top posting frowned upon?
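Since the thread has one report that larger buffers help and another
that 8kB works best, it can be useful to check what buffer sizes a
socket actually ends up with after a request. The Python sketch below
assumes that the -R and -S options correspond to the standard
SO_RCVBUF and SO_SNDBUF socket options; that mapping has not been
checked against the ggate source, and the helper name
effective_buffer_sizes is hypothetical. The kernel is free to round,
scale, or cap the requested values, so the sizes read back may differ
from the ones asked for.

```python
import socket


def effective_buffer_sizes(sock, rcvbuf, sndbuf):
    """Request buffer sizes and return what the kernel reports back.

    Hypothetical helper: returns a (receive, send) tuple in bytes as
    reported by getsockopt(), which may not equal the requested sizes.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
    return (
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
    )
```

Calling it on a fresh TCP socket with 196608 for both arguments (the
value Craig mentions) and comparing the result with an 8192 request
shows how much room a single send() has before it would block or
return EAGAIN on that system.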
Re: ggate still broken on 6.2-RC1 for amd64.
Ulrich Spoerlein wrote:
> But I'll whip up a ggate test case.

Very strange ... I thought I would work through different buffer sizes,
starting with some low value. Here's what gives:

    igor# ggated -a localhost -v -R8k -S8k /tmp/ggate_exports
    igor# ggatec create -v -R8k -S8k localhost /tmp/ggate_test
    info: Reading exports file (/tmp/ggate_exports).
    info: Connected to the server: localhost:3080.
    debug: Added 127.0.0.1/32 /tmp/ggate_test RW to exports list.
    debug: Sending version packet.
    info: Exporting 1 object(s).
    info: Listen on port: 3080.
    info: Connection from: 127.0.0.1.
    debug: Receiving version packet.
    debug: Version packet received.
    debug: Receiving initial packet.
    VERY LONG PAUSE
    debug: Initial packet received.
    debug: Sending initial packet.
    debug: Connection created [127.0.0.1, /tmp/ggate_test].
    debug: Receiving initial packet.
    debug: New connection created (token=226910802).
    debug: Received initial packet.
    debug: Sending initial packet.
    info: Connected to the server: localhost:3080.
    debug: Sending version packet.
    VERY LONG PAUSE
    g_gate_send: EAGAIN
    g_gate_send: EAGAIN
    g_gate_send: EAGAIN
    g_gate_send: EAGAIN
    info: Connection from: 127.0.0.1.
    ^C
    debug: Receiving version packet.
    ^C

Now try with 16k.

    igor# ggated -a localhost -v -R16k -S16k /tmp/ggate_exports
    igor# ggatec create -v -R16k -S16k localhost /tmp/ggate_test
    info: Reading exports file (/tmp/ggate_exports).
    info: Connected to the server: localhost:3080.
    debug: Added 127.0.0.1/32 /tmp/ggate_test RW to exports list.
    debug: Sending version packet.
    info: Exporting 1 object(s).
    info: Listen on port: 3080.
    info: Connection from: 127.0.0.1.
    debug: Receiving version packet.
    debug: Version packet received.
    debug: Receiving initial packet.
    LONG PAUSE
    debug: Initial packet received.
    debug: Sending initial packet.
    debug: Connection created [127.0.0.1, /tmp/ggate_test].
    debug: Receiving initial packet.
    debug: New connection created (token=2294332471).
    debug: Received initial packet.
    debug: Sending initial packet.
    info: Connected to the server: localhost:3080.
    info: Connection from: 127.0.0.1.
    debug: Sending version packet.
    debug: Receiving version packet.
    debug: Version packet received.
    debug: Receiving initial packet.
    LONG PAUSE
    debug: Initial packet received.
    debug: Sending initial packet.
    debug: Found existing connection (token=2294332471).
    debug: Receiving initial packet.
    debug: Connection added [127.0.0.1, /tmp/ggate_test].
    debug: Received initial packet.
    debug: Sending initial packet.
    ggate5
    debug: Connection removed [127.0.0.1 /tmp/ggate_test].
    notice: send_thread: started!
    debug: Process created [/tmp/ggate_test].
    notice: recv_thread: started!
    notice: disk_thread: started [/tmp/ggate_test]!
    notice: send_thread: started [/tmp/ggate_test]!
    notice: recv_thread: started [/tmp/ggate_test]!
    debug: Process 1140 exiting.
    ^C

I wanted to use something like the following for a first draft of a
benchmark, but I just I/O deadlocked the system (6.2 and CURRENT),
simply by running ggated/ggatec in various combinations.

    db> show alllocks
    Process 1333 (ggatel) thread 0xc2767510 (100081)
    exclusive sx sysctl lock r = 0 (0xc078c420) locked @ /vol/src/sys/kern/kern_sysctl.c:1376
    db> trace 1333
    Tracing pid 1333 tid 100081 td 0xc2767510
    sched_switch(c2767510,0,1) at sched_switch+0xe7
    mi_switch(1,0) at mi_switch+0x27c
    sleepq_switch(c2b3e680,c078bdd0,0,c070e413,236,...) at sleepq_switch+0xc9
    sleepq_timedwait(c2b3e680) at sleepq_timedwait+0x4a
    msleep(c2b3e680,0,4c,c07028f3,64) at msleep+0x281
    g_waitfor_event(c050d120,c2b6c300,2,0,0,0,0,1) at g_waitfor_event+0x73
    sysctl_kern_geom_confxml(c07485e0,0,0,d1781b9c,c07485e0,...) at sysctl_kern_geom_confxml+0x26
    sysctl_root(0,d1781c1c,3,d1781b9c) at sysctl_root+0x12f
    userland_sysctl(c2767510,d1781c1c,3,830,bfbfe3d8,0,0,0,d1781c18,0,c078bde8,0,c070bc1f,522) at userland_sysctl+0xf4
    __sysctl(c2767510,d1781d04) at __sysctl+0x77
    syscall(3b,3b,3b,3,bfbfe3d8,...) at syscall+0x27e
    Xint0x80_syscall() at Xint0x80_syscall+0x1f
    --- syscall (202, FreeBSD ELF32, __sysctl), eip = 0x2816ba7f, esp = 0xbfbfe2bc, ebp = 0xbfbfe2f8 ---
    db> ps
      pid  ppid  pgrp  uid  state  wmesg     wchan       cmd
     1348   800   800    0  S      sysctl l  0xc078c444  cron
     1347   800   800    0  S      sysctl l  0xc078c444  cron
     1346   800   800    0  S      sysctl l  0xc078c444  cron
     1345
Re: ggate still broken on 6.2-RC1 for amd64.
Craig == Craig Boston [EMAIL PROTECTED] writes:

Craig> On Sun, Dec 03, 2006 at 06:12:21PM +0100, Ulrich Spoerlein wrote:
Craig> > David Gilbert wrote:
Craig> > > But the ggated/ggatec in 6.2-RC1 connects now (and is happy
Craig> > > about that). In fact, the tasting on the ggatec side that
Craig> > > happens due to new disks showing up works, too. However, any
Craig> > > attempt to pass significant traffic causes ggatec to
Craig> > > seemingly lock up.
Craig> >
Craig> > /me too. Though I tested this on two FreeBSD/i386 SMP machines
Craig> > with gmirror + ggated combination. There *is* traffic going
Craig> > on, but it is somewhere around 50kB/s (sic! no kidding!).

Craig> Have you tried increasing the send/receive buffer size? In my
Craig> local ggate setup I'm running both the client and server with
Craig> the options -R 196608 -S 196608. I added it a while back
Craig> after discovering that the default buffer size was inadequate
Craig> in certain situations and would sometimes cause large block
Craig> sized I/O to hang.

Craig> This was a while ago and I mentioned it to pjd@ so the issue
Craig> may have been corrected, but it's something that shouldn't
Craig> take long to try.

That doesn't square with my experience. Although bigger buffers could
be involved in a performance problem, what we're dealing with here is a
_zero_ traffic situation. It seems that it works enough for tasting to
be successful, but any significant load wedges it hard.

Dave.

--
|David Gilbert, Independent Contractor. | Two things can be         |
|Mail: [EMAIL PROTECTED]                | equal if and only if they |
|http://daveg.ca                        | are precisely opposite.   |
=GLO
Re: ggate still broken on 6.2-RC1 for amd64.
David Gilbert wrote:
> GGate is still broken on 6.2-RC1 for amd64. I have verified that the
> patch in kern/104829 has been applied (it's in the tree). I have also
> added the patch in amd64/91799 --- without it, ggated doesn't work at
> all. This should definitely make it into 6.2.
>
> But the ggated/ggatec in 6.2-RC1 connects now (and is happy about
> that). In fact, the tasting on the ggatec side that happens due to
> new disks showing up works, too. However, any attempt to pass
> significant traffic causes ggatec to seemingly lock up.
>
> In my configuration, I have a gmirror running with a local disk
> (already) and I want to gmirror insert the ggate disk. When I do so,
> I get 50 write requests queued (I upped the gmirror buffer count to
> 50 to make synchronization happen faster) and things never move from
> there.

/me too. Though I tested this on two FreeBSD/i386 SMP machines with
gmirror + ggated combination. There *is* traffic going on, but it is
somewhere around 50kB/s (sic! no kidding!).

Also, forcefully removing the ggate0 provider (ggatec destroy -fu0),
which should not impact the mirror operation in any way, panicked the
system.

I can't rebuild this test scenario on -CURRENT right now, but will do
so time permitting. Maybe this is related to the gmirror deadlock I
reported. But I no longer have SMP hardware to play with ...

Ulrich Spoerlein
--
A: Yes.
Q: Are you sure?
A: Because it reverses the logical flow of conversation.
Q: Why is top posting frowned upon?