Re: ggate still broken on 6.2-RC1 for amd64.

2006-12-11 Thread Craig Boston
On Mon, Dec 11, 2006 at 02:47:41AM -0500, David Gilbert wrote:
> That doesn't square with my experience.  Although bigger buffers could
> be involved in a performance problem, what we're dealing with here is
> a _zero_ traffic situation.  It seems that it works enough for tasting
> to be successful, but any significant load wedges it hard.

The problem I observed was also a zero traffic situation.  A quick way
to test is to do something like this (assuming you don't care about the
contents of the device!):

dd if=/dev/zero of=/dev/ggateX bs=1m

and watch the network traffic to see what happens.  When I ran into it,
small block sizes worked fine, but anything bigger than the send buffer
size would cause the entire ggate device to wedge with zero traffic.
The ggatec logs in my mail archive say 128k, which itself is a little
odd because I thought GEOM broke big transfers into 64k chunks.
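
To repeat the dd test above across several block sizes in one go, a rough
Python sketch along these lines could be used.  It is only an illustration:
the name sweep_block_sizes, the list of sizes, and the temporary file used
as a stand-in target are assumptions, not part of ggate.  If the device
wedges, the write simply never returns, so the last size printed is the
largest one that still completed.

```python
import os
import tempfile
import time


def sweep_block_sizes(path, sizes, count=4):
    """Yield (size, seconds) after writing `count` zero-filled blocks of each size."""
    for size in sizes:
        block = bytes(size)
        fd = os.open(path, os.O_WRONLY)
        start = time.monotonic()
        try:
            for _ in range(count):
                written = 0
                while written < size:  # os.write may accept fewer bytes than asked
                    written += os.write(fd, block[written:])
        finally:
            os.close(fd)
        yield size, time.monotonic() - start


if __name__ == "__main__":
    # Hypothetical stand-in target so the sketch is harmless to run.  Point it
    # at a ggate device only if the contents of that device are expendable.
    with tempfile.NamedTemporaryFile() as target:
        sizes = [4096, 65536, 131072, 1 << 20]
        for size, seconds in sweep_block_sizes(target.name, sizes):
            print(f"bs={size:>8}  {seconds:.4f}s", flush=True)
```

Against a regular file the timings mostly measure the page cache, so the
numbers only become meaningful when the target is the device under test.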

In any case, ggatec got stuck in a loop getting EAGAIN from send(), so
the packets never made it out to the wire.
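
For reference, on a non-blocking socket EAGAIN from send() only means the
send buffer is full at that moment, so a loop that retries immediately just
spins.  The usual pattern is to wait until the socket is writable and then
continue from where the last send stopped.  The sketch below shows that
pattern in Python on a local socket pair.  It is not the ggatec code, and
send_all, demo, and the 8 kB buffer are made up for illustration.

```python
import select
import socket
import threading


def send_all(sock, data, timeout=5.0):
    """Send all of `data` on a non-blocking socket, waiting while the buffer is full."""
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        try:
            sent += sock.send(view[sent:])  # may send only part of the data
        except BlockingIOError:  # EAGAIN / EWOULDBLOCK
            # Sleep until the kernel reports buffer space instead of spinning.
            _, writable, _ = select.select([], [sock], [], timeout)
            if not writable:
                raise TimeoutError("peer is not draining the socket")
    return sent


def demo(total=1 << 20, sndbuf=8192):
    """Push `total` bytes through a small send buffer; return bytes received."""
    left, right = socket.socketpair()
    left.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
    left.setblocking(False)
    received = []

    def drain():
        count = 0
        while count < total:
            chunk = right.recv(65536)
            if not chunk:
                break
            count += len(chunk)
        received.append(count)

    reader = threading.Thread(target=drain)
    reader.start()
    send_all(left, bytes(total))
    reader.join()
    left.close()
    right.close()
    return received[0]


if __name__ == "__main__":
    print(demo())
```

demo() sends far more data than the send buffer can hold, so it only
finishes if the EAGAIN case is handled by waiting rather than by giving up
or looping forever.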

However, checking my mail archive also indicates that was a year ago, so
chances are this is a different problem.  The symptoms just sounded a
little familiar.

Craig
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ggate still broken on 6.2-RC1 for amd64.

2006-12-11 Thread David Gilbert
>>>>> "Craig" == Craig Boston [EMAIL PROTECTED] writes:

Craig> On Mon, Dec 11, 2006 at 02:47:41AM -0500, David Gilbert wrote:
>> That doesn't square with my experience.  Although bigger buffers
>> could be involved in a performance problem, what we're dealing with
>> here is a _zero_ traffic situation.  It seems that it works enough
>> for tasting to be successful, but any significant load wedges it
>> hard.

Craig> The problem I observed was also a zero traffic situation.  A
Craig> quick way to test is to do something like this (assuming you
Craig> don't care about the contents of the device!):

Craig> dd if=/dev/zero of=/dev/ggateX bs=1m

Craig> and watch the network traffic to see what happens.  When I ran
Craig> into it, small block sizes worked fine, but anything bigger
Craig> than the send buffer size would cause the entire ggate device
Craig> to wedge with zero traffic.  The ggatec logs in my mail archive
Craig> say 128k, which itself is a little odd because I thought GEOM
Craig> broke big transfers into 64k chunks.

Craig> In any case, ggatec got stuck in a loop getting EAGAIN from
Craig> send(), so the packets never made it out to the wire.

Craig> However, checking my mail archive also indicates that was a
Craig> year ago, so chances are this is a different problem.  The
Craig> symptoms just sounded a little familiar.

Urm... what transfer size does the filesystem prefer to use?
Also, what transfer size does the gmirror sync use?

Dave.

-- 

|David Gilbert, Independent Contractor.   | Two things can be  |
|Mail:   [EMAIL PROTECTED]|  equal if and only if they |
|http://daveg.ca  |   are precisely opposite.  |
=GLO


Re: ggate still broken on 6.2-RC1 for amd64.

2006-12-11 Thread Ulrich Spoerlein
Craig Boston wrote:
> Have you tried increasing the send/receive buffer size?  In my local
> ggate setup I'm running both the client and server with the options
> -R 196608 -S 196608.  I added it a while back after discovering that
> the default buffer size was inadequate in certain situations and would
> sometimes cause large block sized I/O to hang.

Heh, this is funny. I have reports from another source, who _decreases_
bufsize to 8kB, because that is giving him the most performance.
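
As far as I can tell from ggatec(8), the -R and -S options are requests for
the socket receive and send buffer sizes, and the kernel is free to round or
clamp what it actually grants.  The following Python sketch shows how to
request a size on a plain TCP socket and read back the effective value.
The helper name request_buffers is made up for illustration, and the values
reported will differ between systems.

```python
import socket


def request_buffers(sock, rcvbuf, sndbuf):
    """Ask for the given buffer sizes and return what the kernel reports back."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
    return (
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
        sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
    )


if __name__ == "__main__":
    # The three sizes mentioned in this thread: 8k, 16k and 196608 bytes.
    for size in (8 * 1024, 16 * 1024, 196608):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            rcv, snd = request_buffers(s, size, size)
            print(f"requested {size:>6}: SO_RCVBUF={rcv} SO_SNDBUF={snd}")
```

Comparing the requested and granted sizes on both machines is a cheap first
check before concluding that a larger or smaller buffer helps.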

Since I'm using HPS' USB stack I can't use my uplcom device and
therefore cannot usefully test some more ggate/gmirror scenarios on
-CURRENT ...

But I'll whip up a ggate test case.

Ulrich Spoerlein
-- 
A: Yes.
> Q: Are you sure?
>> A: Because it reverses the logical flow of conversation.
>>> Q: Why is top posting frowned upon?


Re: ggate still broken on 6.2-RC1 for amd64.

2006-12-11 Thread Ulrich Spoerlein
Ulrich Spoerlein wrote:
> But I'll whip up a ggate test case.

Very strange ... I thought I would work through different buffer sizes,
starting with some low value. Here's what gives:

igor# ggated -a localhost -v -R8k -S8k /tmp/ggate_exports     | igor# ggatec create -v -R8k -S8k localhost /tmp/ggate_test
info: Reading exports file (/tmp/ggate_exports).              | info: Connected to the server: localhost:3080.
debug: Added 127.0.0.1/32 /tmp/ggate_test RW to exports list. | debug: Sending version packet.
info: Exporting 1 object(s).
info: Listen on port: 3080.
info: Connection from: 127.0.0.1.
debug: Receiving version packet.
debug: Version packet received.
debug: Receiving initial packet.

VERY LONG PAUSE

debug: Initial packet received.                               | debug: Sending initial packet.
debug: Connection created [127.0.0.1, /tmp/ggate_test].       | debug: Receiving initial packet.
debug: New connection created (token=226910802).              | debug: Received initial packet.
debug: Sending initial packet.                                | info: Connected to the server: localhost:3080.
                                                              | debug: Sending version packet.
VERY LONG PAUSE

g_gate_send: EAGAIN
g_gate_send: EAGAIN
g_gate_send: EAGAIN
g_gate_send: EAGAIN
info: Connection from: 127.0.0.1.                             | ^C
debug: Receiving version packet.
^C

Now try with 16k.

igor# ggated -a localhost -v -R16k -S16k /tmp/ggate_exports   | igor# ggatec create -v -R16k -S16k localhost /tmp/ggate_test
info: Reading exports file (/tmp/ggate_exports).              | info: Connected to the server: localhost:3080.
debug: Added 127.0.0.1/32 /tmp/ggate_test RW to exports list. | debug: Sending version packet.
info: Exporting 1 object(s).
info: Listen on port: 3080.
info: Connection from: 127.0.0.1.
debug: Receiving version packet.
debug: Version packet received.
debug: Receiving initial packet.

LONG PAUSE

debug: Initial packet received.                               | debug: Sending initial packet.
debug: Connection created [127.0.0.1, /tmp/ggate_test].       | debug: Receiving initial packet.
debug: New connection created (token=2294332471).             | debug: Received initial packet.
debug: Sending initial packet.                                | info: Connected to the server: localhost:3080.
info: Connection from: 127.0.0.1.                             | debug: Sending version packet.
debug: Receiving version packet.
debug: Version packet received.
debug: Receiving initial packet.

LONG PAUSE

debug: Initial packet received.                               | debug: Sending initial packet.
debug: Found existing connection (token=2294332471).          | debug: Receiving initial packet.
debug: Connection added [127.0.0.1, /tmp/ggate_test].         | debug: Received initial packet.
debug: Sending initial packet.                                | ggate5
debug: Connection removed [127.0.0.1 /tmp/ggate_test].        | notice: send_thread: started!
debug: Process created [/tmp/ggate_test].                     | notice: recv_thread: started!
notice: disk_thread: started [/tmp/ggate_test]!
notice: send_thread: started [/tmp/ggate_test]!
notice: recv_thread: started [/tmp/ggate_test]!
debug: Process 1140 exiting.
^C


I wanted to use something like the following for a first draft of a
benchmark, but I just I/O deadlocked the system (6.2 and CURRENT),
simply by running ggated/ggatec in various combinations.

db> show alllocks
Process 1333 (ggatel) thread 0xc2767510 (100081)
exclusive sx sysctl lock r = 0 (0xc078c420) locked @ /vol/src/sys/kern/kern_sysctl.c:1376
db> trace 1333
Tracing pid 1333 tid 100081 td 0xc2767510
sched_switch(c2767510,0,1) at sched_switch+0xe7
mi_switch(1,0) at mi_switch+0x27c
sleepq_switch(c2b3e680,c078bdd0,0,c070e413,236,...) at sleepq_switch+0xc9
sleepq_timedwait(c2b3e680) at sleepq_timedwait+0x4a
msleep(c2b3e680,0,4c,c07028f3,64) at msleep+0x281
g_waitfor_event(c050d120,c2b6c300,2,0,0,0,0,1) at g_waitfor_event+0x73
sysctl_kern_geom_confxml(c07485e0,0,0,d1781b9c,c07485e0,...) at sysctl_kern_geom_confxml+0x26
sysctl_root(0,d1781c1c,3,d1781b9c) at sysctl_root+0x12f
userland_sysctl(c2767510,d1781c1c,3,830,bfbfe3d8,0,0,0,d1781c18,0,c078bde8,0,c070bc1f,522) at userland_sysctl+0xf4
__sysctl(c2767510,d1781d04) at __sysctl+0x77
syscall(3b,3b,3b,3,bfbfe3d8,...) at syscall+0x27e
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (202, FreeBSD ELF32, __sysctl), eip = 0x2816ba7f, esp = 0xbfbfe2bc, ebp = 0xbfbfe2f8 ---
db> ps
  pid  ppid  pgrp   uid   state   wmesg    wchan      cmd
 1348   800   800     0   S       sysctl l 0xc078c444 cron
 1347   800   800     0   S       sysctl l 0xc078c444 cron
 1346   800   800     0   S       sysctl l 0xc078c444 cron
 1345   

Re: ggate still broken on 6.2-RC1 for amd64.

2006-12-10 Thread David Gilbert
>>>>> "Craig" == Craig Boston [EMAIL PROTECTED] writes:

Craig> On Sun, Dec 03, 2006 at 06:12:21PM +0100, Ulrich Spoerlein
Craig> wrote:
>> David Gilbert wrote:
>>> But the ggated/ggatec in 6.2-RC1 connects now (and is happy about
>>> that).  In fact, the tasting on the ggatec side that happens due to
>>> new disks showing up works, too.  However, any attempt to pass
>>> significant traffic causes ggatec to seemingly lock up.
>>
>> /me too. Though I tested this on two FreeBSD/i386 SMP machines with
>> gmirror + ggated combination. There *is* traffic going on, but it
>> is somewhere around 50kB/s (sic! no kidding!).

Craig> Have you tried increasing the send/receive buffer size?  In my
Craig> local ggate setup I'm running both the client and server with
Craig> the options -R 196608 -S 196608.  I added it a while back
Craig> after discovering that the default buffer size was inadequate
Craig> in certain situations and would sometimes cause large block
Craig> sized I/O to hang.

Craig> This was a while ago and I mentioned it to pjd@ so the issue
Craig> may have been corrected, but it's something that shouldn't
Craig> take long to try.

That doesn't square with my experience.  Although bigger buffers could
be involved in a performance problem, what we're dealing with here is
a _zero_ traffic situation.  It seems that it works enough for tasting
to be successful, but any significant load wedges it hard.

Dave.

-- 

|David Gilbert, Independent Contractor.   | Two things can be  |
|Mail:   [EMAIL PROTECTED]|  equal if and only if they |
|http://daveg.ca  |   are precisely opposite.  |
=GLO


Re: ggate still broken on 6.2-RC1 for amd64.

2006-12-05 Thread Ulrich Spoerlein
David Gilbert wrote:
> GGate is still broken on 6.2-RC1 for amd64.
>
> I have verified that the patch in kern/104829 has been applied (it's
> in the tree).
>
> I have also added the patch in amd64/91799 --- without it, ggated
> doesn't work at all.  This should definitely make it into 6.2.
>
> But the ggated/ggatec in 6.2-RC1 connects now (and is happy about
> that).  In fact, the tasting on the ggatec side that happens due to
> new disks showing up works, too.  However, any attempt to pass
> significant traffic causes ggatec to seemingly lock up.
>
> In my configuration, I have a gmirror running with a local disk
> (already) and I want to gmirror insert the ggate disk.  When I do
> so, I get 50 write requests queued (I upped the gmirror buffer count
> to 50 to make synchronization happen faster) and things never move from
> there.

/me too. Though I tested this on two FreeBSD/i386 SMP machines with
gmirror + ggated combination. There *is* traffic going on, but it is
somewhere around 50kB/s (sic! no kidding!).

Also, forcefully removing the ggate0 provider (ggatec destroy -fu0),
which should not impact the mirror operation in any way, panicked the
system.

I can't rebuild this test scenario on -CURRENT right now, but will do so
time permitting. Maybe this is related to the gmirror deadlock I
reported. But I no longer have SMP hardware to play with ...

Ulrich Spoerlein
-- 
A: Yes.
> Q: Are you sure?
>> A: Because it reverses the logical flow of conversation.
>>> Q: Why is top posting frowned upon?