On Wed, Jul 06, 2011 at 12:23:39PM +1000, Peter Ross wrote:
> Quoting "Jeremy Chadwick" <free...@jdc.parodius.com>:
>
> >On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote:
> >>I'm running virtualbox 3.2.12_1 if that has anything to do with it.
> >>
> >>sysctl vfs.zfs.arc_max: 6200000000
> >>
> >>While I'm trying to scp, kstat.zfs.misc.arcstats.size is
> >>hovering right around that value, sometimes above, sometimes
> >>below (that's as it should be, right?). I don't think that it
> >>dies when crossing over arc_max. I can run the same scp 10 times
> >>and it might fail 1-3 times, with no correlation to the
> >>arcstats.size being above/below arc_max that I can see.
> >>
> >>Scott
> >>
> >>On Jul 5, 2011, at 3:00 AM, Peter Ross wrote:
> >>
> >>>Hi all,
> >>>
> >>>just as an addition: an upgrade to last Friday's
> >>>FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the
> >>>problem.
> >>>
> >>>I will experiment a bit more tomorrow after hours and grab some statistics.
> >>>
> >>>Regards
> >>>Peter
> >>>
> >>>Quoting "Peter Ross" <peter.r...@bogen.in-berlin.de>:
> >>>
> >>>>Hi all,
> >>>>
> >>>>I noticed a similar problem last week. It is also very
> >>>>similar to one reported last year:
> >>>>
> >>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html
> >>>>
> >>>>My server is a Dell T410 server with the same bge card (the
> >>>>same pciconf -lvc output as described by Mahlon):
> >>>>
> >>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html
> >>>>
> >>>>Yours, Scott, is an em(4).
> >>>>
> >>>>Another similarity: In all cases we are using VirtualBox. I
> >>>>just want to mention it, in case it matters. I am still
> >>>>running VirtualBox 3.2.
> >>>>
> >>>>Most of the time kstat.zfs.misc.arcstats.size was reaching
> >>>>vfs.zfs.arc_max then, but I could catch one or two cases
> >>>>when the value was still below.
> >>>>
> >>>>I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it does not help.
> >>>>
> >>>>BTW: It looks as if ARC only gives back the memory when I
> >>>>destroy the ZFS (a cloned snapshot containing virtual
> >>>>machines). Even if nothing happens for hours the buffer
> >>>>isn't released.
> >>>>
> >>>>My machine was still running 8.2-PRERELEASE so I am upgrading.
> >>>>
> >>>>I am happy to give information gathered on old/new kernel if it helps.
> >>>>
> >>>>Regards
> >>>>Peter
> >>>>
> >>>>Quoting "Scott Sipe" <csco...@gmail.com>:
> >>>>
> >>>>>
> >>>>>On Jul 2, 2011, at 12:54 AM, jhell wrote:
> >>>>>
> >>>>>>On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote:
> >>>>>>>On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote:
> >>>>>>>>I'm running 8.2-RELEASE and am having new problems
> >>>>>>>>with scp. When scping
> >>>>>>>>files to a ZFS directory on the FreeBSD server --
> >>>>>>>>most notably large files
> >>>>>>>>-- the transfer frequently dies after just a few
> >>>>>>>>seconds. In my last test, I
> >>>>>>>>tried to scp an 800mb file to the FreeBSD system and
> >>>>>>>>the transfer died after
> >>>>>>>>200mb. It completely copied the next 4 times I
> >>>>>>>>tried, and then died again on
> >>>>>>>>the next attempt.
> >>>>>>>>
> >>>>>>>>On the client side:
> >>>>>>>>
> >>>>>>>>"Connection to home closed by remote host.
> >>>>>>>>lost connection"
> >>>>>>>>
> >>>>>>>>In /var/log/auth.log:
> >>>>>>>>
> >>>>>>>>Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write
> >>>>>>>>failed: Cannot allocate
> >>>>>>>>memory
> >>>>>>>>
> >>>>>>>>I've never seen this before and have used scp before
> >>>>>>>>to transfer large files
> >>>>>>>>without problems. This computer has been used in
> >>>>>>>>production for months and
> >>>>>>>>has a current uptime of 36 days. I have not been
> >>>>>>>>able to notice any problems
> >>>>>>>>copying files to the server via samba or netatalk, or any problems in
> >>>>>>>>apache.
> >>>>>>>>
> >>>>>>>>Uname:
> >>>>>>>>
> >>>>>>>>FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat
> >>>>>>>>Feb 19 01:02:54 EST
> >>>>>>>>2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64
> >>>>>>>>
> >>>>>>>>I've attached my dmesg and output of vmstat -z.
> >>>>>>>>
> >>>>>>>>I have not restarted the sshd daemon or rebooted the computer.
> >>>>>>>>
> >>>>>>>>Am glad to provide any other information or test anything else.
> >>>>>>>>
> >>>>>>>>{snip vmstat -z and dmesg}
> >>>>>>>
> >>>>>>>You didn't provide details about your networking setup (rc.conf,
> >>>>>>>ifconfig -a, etc.). netstat -m would be useful too.
> >>>>>>>
> >>>>>>>Next, please see this thread circa September 2010, titled "Network
> >>>>>>>memory allocation failures":
> >>>>>>>
> >>>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708
> >>>>>>>
> >>>>>>>The user in that thread is using rsync, which relies on scp by default.
> >>>>>>>I believe this problem is similar, if not identical, to yours.
> >>>>>>>
> >>>>>>
> >>>>>>Please also provide your output of ( /usr/bin/limits -a ) for the server
> >>>>>>end and the client.
> >>>>>>
> >>>>>>I am not quite sure I agree with the need for ifconfig -a but some
> >>>>>>information about the networking driver you're using for the interface
> >>>>>>would be helpful, uptime of the boxes. And configuration of the pool.
> >>>>>>e.g. ( zpool status -a ;zfs get all <poolname> ) You should probably
> >>>>>>prop this information up somewhere so you can reference by URL whenever
> >>>>>>needed.
> >>>>>>
> >>>>>>rsync(1) does not rely on scp(1) whatsoever but rsync(1) can be made to
> >>>>>>use ssh(1) instead of rsh(1) and I believe that is what Jeremy is
> >>>>>>stating here but correct me if I am wrong. It does use ssh(1) by
> >>>>>>default.
> >>>>>>
> >>>>>>It's a possibility as well that if using tmpfs(5) or mdmfs(8) for /tmp
> >>>>>>type filesystems that rsync(1) may be just filling up your temp ram area
> >>>>>>and causing the connection abort which would be
> >>>>>>expected. ( df -h ) would
> >>>>>>help here.
> >>>>>
> >>>>>Hello,
> >>>>>
> >>>>>I'm not using tmpfs/mdmfs at all. The clients yesterday
> >>>>>were 3 different OSX computers (over gigabit). The FreeBSD
> >>>>>server has 12gb of ram and no bce adapter. For what it's
> >>>>>worth, the server is backed up remotely every night with
> >>>>>rsync (remote FreeBSD uses rsync to pull) to an offsite
> >>>>>(slow cable connection) FreeBSD computer, and I have not
> >>>>>seen any errors in the nightly rsync.
> >>>>>
> >>>>>Sorry for the omission of networking info, here's the
> >>>>>output of the requested commands and some that popped up
> >>>>>in the other thread:
> >>>>>
> >>>>>http://www.cap-press.com/misc/
> >>>>>
> >>>>>In rc.conf: ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0"
> >>>>>
> >>>>>Scott
> >
> >Just to make it crystal clear to everyone:
> >
> >There is no correlation between this problem and use of ZFS. People are
> >attempting to correlate "cannot allocate memory" messages with "anything
> >on the system that uses memory". The VM is much more complex than that.
> >
> >Given the nature of this problem, it's much more likely the issue is
> >"somewhere" within a networking layer within FreeBSD, whether it be
> >driver-level or some sort of intermediary layer.
> >
> >Two people who have this issue in this thread are both using VirtualBox.
> >Can one, or both, of you remove VirtualBox from the configuration
> >entirely (kernel, etc. -- not sure what is required) and then see if the
> >issue goes away?
>
> On the machine in question I only can do it after hours so I will do
> it tonight.
>
> I was _successfully_ sending the file over the loopback interface using
>
> cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null"
>
> I did it, btw, with the IPv6 localhost address first (accidentally),
> and then using IPv4. Both worked.
>
> It always fails if I am sending it through the bce(4) interface,
> even if my target is the VirtualBox bridged to the bce card (so it
> does not "leave" the computer physically).
>
> Below the uname -a, ifconfig -a, netstat -rn, pciconf -lv and kldstat output.
>
> I have another box where I do not see that problem. It copies files
> happily over the net using ssh.
>
> It is an older HP ML 150 with 3GB RAM only but with a bge(4)
> driver instead. It runs the same last week's RELENG_8. I installed
> VirtualBox and enabled vboxnet (so it loads the kernel modules). But
> I do not run VirtualBox on it (because it hasn't enough RAM).
>
> Regards
> Peter
>
> DellT410one# uname -a
> FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Jun 30 17:07:18 EST 2011 r...@dellt410one.vv.fda:/usr/obj/usr/src/sys/GENERIC amd64
> DellT410one# ifconfig -a
> bce0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=c01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
>         ether 84:2b:2b:68:64:e4
>         inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255
>         inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255
>         inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255
>         inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255
>         inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255
>         inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255
>         inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255
>         inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255
>         media: Ethernet autoselect (1000baseT <full-duplex>)
>         status: active
> bce1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=c01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
>         ether 84:2b:2b:68:64:e5
>         media: Ethernet autoselect
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>         options=3<RXCSUM,TXCSUM>
>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb
>         inet6 ::1 prefixlen 128
>         inet 127.0.0.1 netmask 0xff000000
>         nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
> vboxnet0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         ether 0a:00:27:00:00:00
> DellT410one# netstat -rn
> Routing tables
>
> Internet:
> Destination        Gateway            Flags    Refs      Use  Netif Expire
> default            192.168.50.201     UGS         0    52195   bce0
> 127.0.0.1          link#11            UH          0        6    lo0
> 192.168.50.0/24    link#1             U           0  1118212   bce0
> 192.168.50.219     link#1             UHS         0     9670    lo0
> 192.168.50.220     link#1             UHS         0     8347    lo0
> 192.168.50.221     link#1             UHS         0   103024    lo0
> 192.168.50.223     link#1             UHS         0    43614    lo0
> 192.168.50.224     link#1             UHS         0     8358    lo0
> 192.168.50.225     link#1             UHS         0     8438    lo0
> 192.168.50.226     link#1             UHS         0     8338    lo0
> 192.168.50.227     link#1             UHS         0     8333    lo0
> 192.168.165.0/24   192.168.50.200     UGS         0     3311   bce0
> 192.168.166.0/24   192.168.50.200     UGS         0      699   bce0
> 192.168.167.0/24   192.168.50.200     UGS         0     3012   bce0
> 192.168.168.0/24   192.168.50.200     UGS         0      552   bce0
>
> Internet6:
> Destination        Gateway            Flags      Netif Expire
> ::1                ::1                UH           lo0
> fe80::%lo0/64      link#11            U            lo0
> fe80::1%lo0        link#11            UHS          lo0
> ff01::%lo0/32      fe80::1%lo0        U            lo0
> ff02::%lo0/32      fe80::1%lo0        U            lo0
> DellT410one# kldstat
> Id Refs Address            Size     Name
>  1   19 0xffffffff80100000 dbf5d0   kernel
>  2    3 0xffffffff80ec0000 4c358    vboxdrv.ko
>  3    1 0xffffffff81012000 131998   zfs.ko
>  4    1 0xffffffff81144000 1ff1     opensolaris.ko
>  5    2 0xffffffff81146000 2940     vboxnetflt.ko
>  6    2 0xffffffff81149000 8e38     netgraph.ko
>  7    1 0xffffffff81152000 153c     ng_ether.ko
>  8    1 0xffffffff81154000 e70      vboxnetadp.ko
> DellT410one# pciconf -lv
> ..
> bce0@pci0:1:0:0: class=0x020000 card=0x028d1028 chip=0x163b14e4 rev=0x20 hdr=0x00
>     vendor     = 'Broadcom Corporation'
>     class      = network
>     subclass   = ethernet
> bce1@pci0:1:0:1: class=0x020000 card=0x028d1028 chip=0x163b14e4 rev=0x20 hdr=0x00
>     vendor     = 'Broadcom Corporation'
>     class      = network
>     subclass   = ethernet

Could you please provide "pciconf -lvcb" output instead, specific to the
bce chips?  Thanks.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
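
For anyone collecting the requested output, here is a minimal sketch of one way to cut "pciconf -lvcb" down to the bce devices only. It is not from the original posters. The helper name filter_bce is hypothetical, and the filter assumes that each device entry starts in column 0 (for example "bce0@pci0:1:0:0:") with its detail lines indented beneath it, as in the pciconf -lv output quoted above.

```shell
#!/bin/sh
# filter_bce (hypothetical helper): read pciconf-style output on stdin and
# print only the entries whose device name is bceN.
# Assumption: device header lines start in column 0, detail lines are indented.
filter_bce() {
    awk '
        # An unindented line starts a new device entry; decide whether to keep it.
        /^[^ \t]/ { keep = ($0 ~ /^bce[0-9]+@/) }
        # Print the header and every following detail line of a kept entry.
        keep
    '
}

# Usage on the FreeBSD host (commented out so the sketch runs anywhere):
# pciconf -lvcb | filter_bce
```

The same filter works for other drivers if the bce pattern is changed, for example to em or bge.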