Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY
On Fri, Sep 26, 2008 at 10:35:57PM -0700, Derek Kuliński wrote:
> Hello Jeremy,
>
> Friday, September 26, 2008, 10:14:13 PM, you wrote:
>
> >> Actually what's the advantage of having fsck run in background if it
> >> isn't capable of fixing things?
> >> Isn't it more dangerous to have it like that? i.e. the administrator might
> >> not notice the problem; also the filesystem could break even further...
>
> > This question should really be directed at a set of different folks,
> > e.g. actual developers of said stuff (UFS2 and soft updates in
> > particular), because it's opening up a can of worms.
>
> > I believe it has to do with the fact that there is much faith given to
> > UFS2 soft updates -- the ability to background fsck allows the user to
> > boot their system and have it up and working (able to log in, etc.) in a
> > much shorter amount of time[1]. It makes the assumption that "everything
> > will work just fine", which is faulty.
>
> As far as I know (at least ideally, when write caching is disabled)

Re: write caching: wheelies and burn-outs in empty parking lots detected.

Let's be realistic. We're talking about ATA and SATA hard disks, hooked up to on-board controllers -- these are the majority of users. Those with ATA/SATA RAID controllers (not on-board RAID either; most/all of those do not let you disable drive write caching) *might* have a RAID BIOS menu item for disabling said feature. FreeBSD atacontrol does not let you toggle such features (although "cap" will show you if the feature is available and if it's enabled or not). Users using SCSI will most definitely have the ability to disable said feature (either via SCSI BIOS or via camcontrol). But the majority of users are not using SCSI disks, because the majority of users are not going to spend hundreds of dollars on a controller followed by hundreds of dollars for a small (~74GB) disk.

Regardless of all of this, end-users should, in no way, shape or form, be expected to go to great lengths to disable their disk's write cache. They will not, I can assure you. Thus, we must assume: write caching on a disk will be enabled, period. If a filesystem is engineered with that fact ignored, then the filesystem is either 1) worthless, or 2) serves a very niche purpose and should not be the default filesystem. Do we agree?

> the data should always be consistent, and all fsck is supposed to be
> doing is to free unreferenced blocks that were allocated.

fsck does a heck of a lot more than that, and there's no guarantee that's all fsck is going to do on a UFS2+SU filesystem. I'm under the impression it does a lot more than just looking for unref'd blocks.

> Wouldn't it be possible for background fsck to do that while the
> filesystem is mounted, and if there's some unrepairable error, that
> somehow happens (while in theory it should be impossible) just
> periodically scream on the emergency log level?

The system is already up and the filesystems mounted. If the error in question is of such severity that it would impact a user's ability to reliably use the filesystem, how do you expect constant screaming on the console will help? A user won't know what it means; there is already evidence of this happening (re: mysterious ATA DMA errors which still cannot be figured out[6]).

IMHO, a dirty filesystem should not be mounted until it's been fully analysed/scanned by fsck. So again, people are putting faith into UFS2+SU despite actual evidence proving that it doesn't handle all scenarios.

> > It also gives the impression of a journalled filesystem, which UFS2 soft
> > updates are not. gjournal(8) on the other hand, is, and doesn't require
> > fsck at all[2].
>
> > I also think this further adds fuel to the "so why are we enabling soft
> > updates by default and using UFS2 as a filesystem again?" fire. I'm
> > sure someone will respond to this with "So use ZFS and shut up". *sigh*
>
> I think the reason for using Soft Updates by default is that it was
> a pretty hard thing to implement, and (at least in theory) it is supposed
> to be as reliable as journaling.

The problem here is that when it was created, it was sort of an "experiment". Now, when someone installs FreeBSD, UFS2 is the default filesystem used, and SU is enabled on every filesystem except the root fs. Thus, we have now put ourselves into a situation where said feature ***must*** be reliable in all cases.

You're also forgetting a huge focus of SU -- snapshots[1]. However, there are more than enough facts on the table at this point concluding that snapshots are causing more problems[7] than previously expected. And there's further evidence filesystem snapshots shouldn't even be used in this way[8].

> Also, if I remember correctly, PJD said that gjournal is performing
> much better with small files, while softupdates is faster with big
> ones.

Okay, so now we want to talk about benchmarks. The benchmarks you're talking about are in two places[2][3]. The benchmarks pjd@ provided were v
Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY
Hello Jeremy,

Friday, September 26, 2008, 10:14:13 PM, you wrote:

>> Actually what's the advantage of having fsck run in background if it
>> isn't capable of fixing things?
>> Isn't it more dangerous to have it like that? i.e. the administrator might
>> not notice the problem; also the filesystem could break even further...

> This question should really be directed at a set of different folks,
> e.g. actual developers of said stuff (UFS2 and soft updates in
> particular), because it's opening up a can of worms.

> I believe it has to do with the fact that there is much faith given to
> UFS2 soft updates -- the ability to background fsck allows the user to
> boot their system and have it up and working (able to log in, etc.) in a
> much shorter amount of time[1]. It makes the assumption that "everything
> will work just fine", which is faulty.

As far as I know (at least ideally, when write caching is disabled) the data should always be consistent, and all fsck is supposed to be doing is to free unreferenced blocks that were allocated.

Wouldn't it be possible for background fsck to do that while the filesystem is mounted, and if there's some unrepairable error, that somehow happens (while in theory it should be impossible) just periodically scream on the emergency log level?

> It also gives the impression of a journalled filesystem, which UFS2 soft
> updates are not. gjournal(8) on the other hand, is, and doesn't require
> fsck at all[2].

> I also think this further adds fuel to the "so why are we enabling soft
> updates by default and using UFS2 as a filesystem again?" fire. I'm
> sure someone will respond to this with "So use ZFS and shut up". *sigh*

I think the reason for using Soft Updates by default is that it was a pretty hard thing to implement, and (at least in theory) it is supposed to be as reliable as journaling.

Also, if I remember correctly, PJD said that gjournal is performing much better with small files, while softupdates is faster with big ones.

--
Best regards,
 Derek                          mailto:[EMAIL PROTECTED]

Programmers are tools for converting caffeine into code.

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY
On Fri, Sep 26, 2008 at 09:33:41PM -0700, Derek Kuliński wrote:
> Hello Jeremy,
>
> Sunday, September 21, 2008, 3:07:20 PM, you wrote:
>
> > Consider using background_fsck="no" in /etc/rc.conf if you prefer the
> > old behaviour. Otherwise, boot single-user then do the fsck.
>
> Actually what's the advantage of having fsck run in background if it
> isn't capable of fixing things?
> Isn't it more dangerous to have it like that? i.e. the administrator might
> not notice the problem; also the filesystem could break even further...

This question should really be directed at a set of different folks, e.g. actual developers of said stuff (UFS2 and soft updates in particular), because it's opening up a can of worms.

I believe it has to do with the fact that there is much faith given to UFS2 soft updates -- the ability to background fsck allows the user to boot their system and have it up and working (able to log in, etc.) in a much shorter amount of time[1]. It makes the assumption that "everything will work just fine", which is faulty.

It also gives the impression of a journalled filesystem, which UFS2 soft updates are not. gjournal(8) on the other hand, is, and doesn't require fsck at all[2].

I also think this further adds fuel to the "so why are we enabling soft updates by default and using UFS2 as a filesystem again?" fire. I'm sure someone will respond to this with "So use ZFS and shut up". *sigh*

[1]: http://lists.freebsd.org/pipermail/freebsd-questions/2004-December/069114.html
[2]: http://lists.freebsd.org/pipermail/freebsd-questions/2008-April/173501.html

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
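For anyone following along, the rc.conf suggestion quoted above might look like the fragment below. This is only a minimal sketch: the variable name and the /etc/rc.conf location come from the thread, the comments are an editorial reading of what the setting does, and nothing else about your boot configuration is assumed.

```sh
# /etc/rc.conf (fragment; the file is read as plain sh variable assignments)
# Check dirty filesystems in the foreground during boot, before they are
# mounted read-write, instead of deferring the check to a background fsck.
background_fsck="no"
```

With this set, a filesystem that fsck cannot repair automatically should stop the boot and leave you in single-user mode, which is the "old behaviour" referred to above.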
Re: UNEXPECTED SOFT UPDATE INCONSISTENCY; RUN fsck MANUALLY
Hello Jeremy,

Sunday, September 21, 2008, 3:07:20 PM, you wrote:

> Consider using background_fsck="no" in /etc/rc.conf if you prefer the
> old behaviour. Otherwise, boot single-user then do the fsck.

Actually what's the advantage of having fsck run in background if it isn't capable of fixing things?
Isn't it more dangerous to have it like that? i.e. the administrator might not notice the problem; also the filesystem could break even further...

--
Best regards,
 Derek                          mailto:[EMAIL PROTECTED]

I tried to daydream, but my mind kept wandering.
Re: sysctl maxfiles
On Sat, Sep 27, 2008 at 11:10:01AM +1000, Aristedes Maniatis wrote:
> By default FreeBSD 7.0 shipped with the sysctls set to:
>
> kern.maxfiles: 12328
> kern.maxfilesperproc: 11095
>
> We recently bumped up against these limits in an unfortunate way and we
> are going to raise them. I have some questions:
>
> * why are the numbers set the way they are? They aren't round numbers,
> they aren't powers of 2. But they were arrived at somehow with planning
> and thought presumably, so when I increase them I'd like to know a bit
> more about why these numbers were chosen.

The values are calculated when the kernel is loaded, based on many other parameters; you won't find "12328" hard-coded anywhere in the kernel source, for example. The Handbook goes over this fact:

http://www.freebsd.org/doc/en/books/handbook/configtuning-kernel-limits.html

By the way, DO NOT let the term "maxusers" make you think that has something to do with the number of users which can be logged in simultaneously or added to a box. It has nothing to do with that.

Anyway, I'd like to know why you have so many fds open simultaneously in the first place. We're talking over 11,000 fds actively open at once -- this is not a small number. What exactly is this machine doing? Are you absolutely certain tuning this higher is justified? Have you looked into the possibility that you have a program which is exhausting fds by not closing them when finished? (Yes, this is quite common; I've seen bad Java code cause this problem on Solaris.)

> * why are the numbers so close together? Surely there should be more gap
> between max files per process and the max files for the whole system.
> What happens with one runaway broken process is that it hits
> 11095 and the 1233 files left for everything else is not enough (on many
> servers) to allow the admin to log in using ssh. That gets very ugly very
> quickly.

Others will have to comment on this.

> * Under OSX (both server and client), these numbers are 12288 and 10240.
> A bit more of a gap, but not terribly different to FreeBSD. Still
> interesting that someone changed these numbers just slightly.

OS X isn't based on FreeBSD 7. The calculation logic has changed over time.

> * why do these controls exist at all? That is, if they were set to
> infinite what part of the system would be exhausted by a runaway process
> which kept opening files? Would the kernel run out of memory? What memory
> setting would be relevant here? I don't want to set maxfiles too high and
> then run out of some other resource which this maxfiles was protecting.

You're asking for trouble setting these values to the equivalent of unlimited. Instead of asking "what would happen", you should be asking "why would I need to do that".

Regarding memory implications, the Handbook goes over it.

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
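To make the "calculated, not hard-coded" point concrete, here is a rough sketch of how the two numbers in the question could fall out of maxusers. The formulas and the maxusers value of 384 are recalled from sys/kern/subr_param.c of that era and are not stated anywhere in this thread, so treat them as assumptions; they do happen to reproduce 12328, 11095 and the 1233 descriptors left over.

```sh
#!/bin/sh
# Sketch of how the FreeBSD 7.0 defaults appear to be derived.
# Assumed (recalled from sys/kern/subr_param.c, not confirmed by the thread):
#   maxproc         = 20 + 16 * maxusers
#   maxfiles        = 2 * maxproc
#   maxfilesperproc = 9/10 of maxfiles (integer division)
maxusers=384    # assumed auto-tuned ceiling on a machine with plenty of RAM

maxproc=$((20 + 16 * maxusers))
maxfiles=$((2 * maxproc))
maxfilesperproc=$((maxfiles * 9 / 10))
headroom=$((maxfiles - maxfilesperproc))

echo "kern.maxfiles:            $maxfiles"
echo "kern.maxfilesperproc:     $maxfilesperproc"
echo "left for everything else: $headroom"
```

If that reading is right, the narrow gap is simply the fixed 10% margin, and anyone raising kern.maxfiles by hand would probably want to choose kern.maxfilesperproc separately rather than rely on the same ratio.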
Re: 7.1-PRERELEASE freezes
On Fri, Sep 26, 2008 at 06:21:01PM +0200, Christian Laursen wrote:
> I decided to give 7.1-PRERELEASE a try on one of my machines to find
> out if there might be any problems I should be aware of.
>
> I quickly ran into problems. After a while the system freezes
> completely. It seems to be somehow related to the load of the machine
> as it doesn't seem to happen when it is idle. I built a kernel with
> software watchdog enabled and enabled watchdog which had the nice
> effect of turning the freeze into a panic. Hopefully that will be of
> some help.
>
> I first encountered the problem using SCHED_ULE and then checked whether
> SCHED_4BSD made any difference. But the freeze happens with either
> scheduler.
>
> I have disabled xorg and the nvidia driver but that doesn't help
> either. I can cut down on various other stuff too, but first I hope
> that someone here has a more educated guess about what could be the
> cause of the freezes.
>
> I have placed the backtraces from the most recent crashes as well as
> the dmesg output from the most recent boot at this URL:
> http://borderworlds.dk/~xi/7.1-PRERELEASE.freeze.txt
>
> My kernel config is also included.
>
> #0 doadump () at pcpu.h:196
> #1 0xc05abd03 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
> #2 0xc05abeff in panic (fmt=Variable "fmt" is not available.) at /usr/src/sys/kern/kern_shutdown.c:572
> #3 0xc0570d18 in hardclock (usermode=0, pc=3231434181) at /usr/src/sys/kern/kern_clock.c:642
> #4 0xc07d194f in clkintr (frame=0xe38e1c68) at /usr/src/sys/i386/isa/clock.c:164
> #5 0xc07c0465 in intr_execute_handlers (isrc=0xc0866700, frame=0xe38e1c68) at /usr/src/sys/i386/i386/intr_machdep.c:366
> #6 0xc07d0fa8 in atpic_handle_intr (vector=0, frame=0xe38e1c68) at /usr/src/sys/i386/isa/atpic.c:596
> #7 0xc07bbf41 in Xatpic_intr0 () at atpic_vector.s:62
> #8 0xc09bc5c5 in acpi_cpu_c1 () at /usr/src/sys/modules/acpi/acpi/../../../i386/acpica/acpi_machdep.c:550
> #9 0xc09b54f4 in acpi_cpu_idle () at /usr/src/sys/modules/acpi/acpi/../../../dev/acpica/acpi_cpu.c:945
> #10 0xc07c35b6 in cpu_idle () at /usr/src/sys/i386/i386/machdep.c:1183
> #11 0xc05c9275 in sched_idletd (dummy=0x0) at /usr/src/sys/kern/sched_4bsd.c:1429
> #12 0xc05895d6 in fork_exit (callout=0xc05c9260 , arg=0x0, frame=0xe38e1d38) at /usr/src/sys/kern/kern_fork.c:804
> #13 0xc07bbf10 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:264

A couple of generic things, although I think jhb@ might be able to figure out what's going on here:

1) Is this machine running the latest BIOS available?
2) Are you running powerd(8) on this box?
3) Does disabling ACPI (it's a menu option when booting) help?
4) Does removing "device cpufreq" help?

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
sysctl maxfiles
By default FreeBSD 7.0 shipped with the sysctls set to:

kern.maxfiles: 12328
kern.maxfilesperproc: 11095

We recently bumped up against these limits in an unfortunate way and we are going to raise them. I have some questions:

* why are the numbers set the way they are? They aren't round numbers, they aren't powers of 2. But they were arrived at somehow with planning and thought presumably, so when I increase them I'd like to know a bit more about why these numbers were chosen.

* why are the numbers so close together? Surely there should be more gap between max files per process and the max files for the whole system. What happens with one runaway broken process is that it hits 11095 and the 1233 files left for everything else is not enough (on many servers) to allow the admin to log in using ssh. That gets very ugly very quickly.

* Under OSX (both server and client), these numbers are 12288 and 10240. A bit more of a gap, but not terribly different to FreeBSD. Still interesting that someone changed these numbers just slightly.

* why do these controls exist at all? That is, if they were set to infinite what part of the system would be exhausted by a runaway process which kept opening files? Would the kernel run out of memory? What memory setting would be relevant here? I don't want to set maxfiles too high and then run out of some other resource which this maxfiles was protecting.

Thanks
Ari Maniatis

-->
ish
http://www.ish.com.au
Level 1, 30 Wilson Street Newtown 2042 Australia
phone +61 2 9550 5001 fax +61 2 9550 4001
GPG fingerprint CBFB 84B4 738D 4E87 5E5C 5EFA EF6A 7D2E 3E49 102A
Re: HELP DEBUG: FreeBSD 6.3-RELEASE-p3 TIMEOUT - WRITE_DMA + other strange behaviour!
On 2008-Sep-26 13:12:14 +0300, Anton - Valqk <[EMAIL PROTECTED]> wrote:
>1. I get a lot of DMA timeouts, mostly on ad5 and ad7 where I keep
...
>dmesg.today:ad7: FAILURE - WRITE_DMA48 status=51
>error=10 LBA=374303456

This is a bad sign and suggests a dying disk but...

>2. The other strange issue is that when (I guess) it starts timing out
>*sometimes*, not every time, I'm losing connection to xl0 or fxp0

You have an awful lot of hardware in this box. Are you sure the power supply and cooling are up to scratch? Sagging power could cause the problems you report, as could overheating.

--
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement an MTA that is either RFC2821-compliant or matches their claimed behaviour.
Re: bad NFS/UDP performance
David,

You beat me to it.

Danny, read the iperf man page:

  -b, --bandwidth n[KM]
      set target bandwidth to n bits/sec (default 1 Mbit/sec). This setting requires UDP (-u).

The page needs updating, though. It should read "-b, --bandwidth n[KMG]". It also does NOT require -u. If you use -b, UDP is assumed.

--
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED] Phone: +1 510 486-8634
Key fingerprint: 059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751
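As an illustration of the -b point, the sketch below only prints a client command line with an explicit target rate; it does not run iperf itself. The host name and the 900M rate are placeholders rather than values from the thread, and -u is passed explicitly even though, per the note above, -b alone should imply UDP.

```sh
#!/bin/sh
# Build an iperf (version 2 style) UDP client command line with an explicit rate.
# Host name and rate below are placeholders, not values from this thread.
iperf_udp_cmd() {
    server=$1
    rate=$2     # e.g. 100M or 1G; without -b, UDP is sent at about 1 Mbit/sec
    echo "iperf -c $server -u -b $rate"
}

# Print the command for a 900 Mbit/sec test. Run the printed line by hand on
# the client, with "iperf -s -u" already listening on the server.
iperf_udp_cmd nfs-server.example.org 900M
```

Printing rather than executing keeps the sketch harmless on a machine without iperf installed; the rate to ask for depends on the link being tested.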
Re: HELP DEBUG: FreeBSD 6.3-RELEASE-p3 TIMEOUT - WRITE_DMA + other strange behaviour!
Hi Anton,

On 2008-Sep-26 15:13:19 +0300, Anton - Valqk <[EMAIL PROTECTED]> wrote:
>you are right that the machine has *lots* of hardware in it,
>I was thinking of the power supply as a reason and measured the 5 and 12
>volts - seemed to be ok 11.8 and 5.2 with all hardware in it.

A multimeter won't show noise or load spikes. That said, if the PSU is reasonably new and running well within its ratings, it shouldn't be a problem.

>1. remove rl0 and run only one isp for the test.

It's definitely worthwhile getting rid of rl(4) cards. Read the top of the driver source for reasons.

>3. try to replace the ATA100 cables (the one with 80 wires) with
>older ones with only 40 cables?

I wouldn't recommend this. The 80-wire cables are electrically much better than the 40-wire ones. You might like to try a different cable. You should verify that the master/slave/MB sockets on the cable are plugged into the correct device. If you want to slow down the ATA bus, I suggest you do it in software.

>4. ? anything else?

Try disconnecting some of the disks and see if the problem goes away - this would help rule out PSU problems.

--
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement an MTA that is either RFC2821-compliant or matches their claimed behaviour.
Re: bad NFS/UDP performance
On Fri, Sep 26, 2008 at 04:35:17PM +0300, Danny Braniss wrote:
> I know, but I get about 1mgb, which seems somewhat low :-(

Since UDP has no way to know how fast to send, you need to tell iperf how fast to send the packets. I think 1Mbps is the default speed.

David.
[RELENG_6] Works Fine For Me!
Just an effort to test RELENG_6. No issues noted on my Dell server. Nice work folks!

Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.4-PRERELEASE #1 r183385M: Fri Sep 26 11:13:48 PDT 2008
    [EMAIL PROTECTED]:/usr/obj/home/sbruno/bsd/6/sys/GENERIC
ACPI APIC Table:
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2793.19-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf41  Stepping = 1
  Features=0xbfebfbff
  Features2=0x641d
  AMD Features=0x2010
  Logical CPUs per core: 2
real memory  = 1073479680 (1023 MB)
avail memory = 1037283328 (989 MB)
ioapic0: Changing APIC ID to 8
ioapic1: Changing APIC ID to 9
ioapic2: Changing APIC ID to 10
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 32-55 on motherboard
ioapic2 irqs 64-87 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
hptrr: HPT RocketRAID controller driver v1.1 (Sep 26 2008 11:10:49)
acpi0: on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
acpi_hpet0: iomem 0xfed0-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
cpu0: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pcib1: at device 2.0 on pci0
pci1: on pcib1
pcib2: at device 0.0 on pci1
pci2: on pcib2
mpt0: port 0xec00-0xecff mem 0xfe9f-0xfe9f,0xfe9e-0xfe9e irq 34 at device 5.0 on pci2
mpt0: [GIANT-LOCKED]
mpt0: MPI Version=1.2.12.0
pcib3: at device 0.2 on pci1
pci3: on pcib3
ahc0: port 0xdc00-0xdcff mem 0xfe7ff000-0xfe7f irq 37 at device 11.0 on pci3
ahc0: [GIANT-LOCKED]
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
pcib4: at device 4.0 on pci0
pci4: on pcib4
pcib5: at device 5.0 on pci0
pci5: on pcib5
pcib6: at device 0.0 on pci5
pci6: on pcib6
em0: port 0xccc0-0xccff mem 0xfe4e-0xfe4f irq 64 at device 7.0 on pci6
em0: Ethernet address: 00:11:43:e2:ff:fd
pcib7: at device 0.2 on pci5
pci7: on pcib7
em1: port 0xbcc0-0xbcff mem 0xfe2e-0xfe2f irq 65 at device 8.0 on pci7
em1: Ethernet address: 00:11:43:e2:ff:fe
pcib8: at device 6.0 on pci0
pci8: on pcib8
uhci0: port 0x9ce0-0x9cff irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0x9cc0-0x9cdf irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: port 0x9ca0-0x9cbf irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
ehci0: mem 0xfeb0-0xfeb003ff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb3: EHCI version 1.0
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: on ehci0
usb3: USB revision 2.0
uhub3: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub3: 6 ports with 6 removable, self powered
uhub4: vendor 0x413c product 0xa001, class 9/0, rev 2.00/0.00, addr 2
uhub4: multiple transaction translators
uhub4: 2 ports with 2 removable, self powered
pcib9: at device 30.0 on pci0
pci9: on pcib9
pci9: at device 13.0 (no driver attached)
isab0: at device 31.0 on pci0
isa0: on isab0
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 31.1 on pci0
ata0: on atapci0
ata1: on atapci0
fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse Explorer, device ID 4
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
pmtimer0 on isa0
orm0: at iomem 0xc-0xcafff,0xcb000-0xcbfff,0xcc000-0xc,0xd-0xd0fff,0xec000-0xe on isa0
ppc0: parallel port not found.
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: at port 0x3c0-0x3df iomem 0xa-0xb on isa0
Timecounter "TSC" frequency 2793191800 Hz quality 800
Timecounters tick every 1.000 msec
hptrr: no controller detected.
Waiting 5 seconds for SCSI devices to settle
acd0: CDROM at ata0-master UDMA33
ses0 at mpt0 bus 0 target 6 lun 0
ses0: Fixed Processor SCSI-2 device
ses0: 3.300MB/s transfers
ses0: SAF-TE Compliant Device
da2 at ahc0 bus 0 target 1 lun 0
da2: Fixed Direct Access SCSI-3 device
da2: 160
Re: bad NFS/UDP performance
:> -vfs.nfs.realign_test: 22141777
:> +vfs.nfs.realign_test: 498351
:>
:> -vfs.nfsrv.realign_test: 5005908
:> +vfs.nfsrv.realign_test: 0
:>
:> +vfs.nfsrv.commit_miss: 0
:> +vfs.nfsrv.commit_blks: 0
:>
:> changing them did nothing - or at least with respect to nfs throughput :-)
:
:I'm not sure what any of these do, as NFS is a bit out of my league.
::-) I'll be following this thread though!
:
:--
:| Jeremy Chadwick                                jdc at parodius.com |

A non-zero nfs_realign_count is bad; it means NFS had to copy the mbuf chain to fix the alignment. nfs_realign_test is just the number of times it checked. So nfs_realign_test is irrelevant. It's nfs_realign_count that matters.

Several things can cause NFS payloads to be improperly aligned. Anything from older network drivers which can't start DMA on a 2-byte boundary, resulting in the 14-byte encapsulation header causing improper alignment of the IP header & payload, to RPCs embedded in NFS TCP streams winding up being misaligned.

Modern network hardware either supports 2-byte-aligned DMA, allowing the encapsulation to be 2-byte aligned so the payload winds up being 4-byte aligned, or supports DMA chaining, allowing the payload to be placed in its own mbuf, or pads, etc.

--

One thing I would check is to be sure a couple of nfsiod's are running on the client when doing your tests. If none are running, the RPCs wind up being more synchronous and less pipelined. Another thing I would check is IP fragment reassembly statistics (for UDP) - there should be none for TCP connections no matter what the NFS I/O size selected.

(It does seem more likely to be scheduler-related, though).

-Matt
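On the fragment-reassembly point, a back-of-the-envelope sketch may help show why the NFS I/O size matters for UDP mounts. It estimates how many IP fragments a single NFS-over-UDP request of a given size becomes. All of the numbers are assumptions for illustration rather than anything from the thread: a 1500-byte MTU, a 20-byte IPv4 header without options, an 8-byte UDP header, and a guessed 128-byte allowance for RPC/NFS headers.

```sh
#!/bin/sh
# Estimate IP fragments per NFS/UDP datagram.
# Assumptions (illustrative only): IPv4 without options (20-byte header),
# 8-byte UDP header, and a guessed 128 bytes of RPC/NFS header overhead.
fragments() {
    iosize=$1                               # NFS rsize/wsize in bytes
    mtu=${2:-1500}                          # link MTU, defaults to Ethernet
    per_frag=$(( (mtu - 20) / 8 * 8 ))      # fragment payload is a multiple of 8
    datagram=$(( iosize + 128 + 8 ))        # data + RPC allowance + UDP header
    echo $(( (datagram + per_frag - 1) / per_frag ))   # round up
}

for size in 8192 16384 32768; do
    echo "$size-byte I/O -> $(fragments "$size") fragments"
done
```

Since losing any one fragment costs the whole datagram, larger I/O sizes over UDP are more exposed to drops; on FreeBSD the reassembly counters can be inspected with `netstat -s -p ip`.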
Re: rl0: watchdog timeout + 40, 000 ms ping with 7.1-BETA-i386-disc1.iso
----- Original Message -----
> From: Julian Stacey <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Sent: Friday, September 26, 2008 8:16:57 PM
> Subject: Re: rl0: watchdog timeout + 40, 000 ms ping with 7.1-BETA-i386-disc1.iso
>
> > > I'm remaking binaries,
>
> New generic kernel built & installed, & install of all src/ done too.
> No improvement.
>
> > Is there reliable way to reproduce the issue?
>
> It's continuous, the machine virtually never does a ping in less
> than 10 seconds.
>
> > Anyway, would you try attached patch and let me know result?
>
> Thanks
> Done, doesn't help.
> Seeing a new message now too:
> ping: sendto: No buffer space available.
>
> Output of vmstat -i and pciconf -lv look the same as before
>
> It's a small card. Weighs 46 grams. I was going to write
> I could simply post it to you, & you could keep it if you
> want. As I had guessed it might be some new kind of card
> unexperienced before,
> RTL8139D, card just says made in China
>
> But I just grabbed another card
> card says Level One.
> chip 8139B
> & with both patched kernel & original no improvement.
> So I tried a totally different card xl0 fails too,
> I think that 3com xl0 card was OK before in another box,
> so I'd guess not an rl problem, Sorry.
>
> Probably not 7.1 either, but probably a BIOS config problem of some sort.
>
> IRQ 12 was listed in Award BIOS as Primary, options were also secondary or
> disabled, so I've set it disabled.
> PNP OS Yes
> Resources: Auto
> "Reset config data" to Enabled (I forgot before after card changes)
>
> Did another restore BIOS factory defaults, no help.
> Moved xl0 to another slot (all other 3 slots never used I guess, as
> chassis plates not torn off on what I guess is original chassis).
> No luck with xl0
> I'm out of ideas.
>
>
> Cheers,
> Julian
> --
> Julian Stacey: BSDUnixLinux C Prog Admin SysEng Consult Munich www.berklix.com
> Mail plain ASCII text. HTML & Base64 text are spam. www.asciiribbon.org

Just a shot in the dark. Do you have polling enabled for rl0?

Regards,
-Abdullah Ibn Hamad Al-Marri
Arab Portal
http://www.WeArab.Net/
Re: rl0: watchdog timeout + 40, 000 ms ping with 7.1-BETA-i386-disc1.iso
Hi,
Reference:
> From: "Julian Stacey" <[EMAIL PROTECTED]>
> Date: Fri, 26 Sep 2008 19:16:57 +0200
> Message-id: <[EMAIL PROTECTED]>

"Julian Stacey" wrote:
> > > I'm remaking binaries,
>
> New generic kernel built & installed, & install of all src/ done too.
> No improvement.
>
> > Is there reliable way to reproduce the issue?
>
> It's continuous, the machine virtually never does a ping in less
> than 10 seconds.
>
> > Anyway, would you try attached patch and let me know result?
>
> Thanks
> Done, doesn't help.
> Seeing a new message now too:
> ping: sendto: No buffer space available.
>
> Output of vmstat -i and pciconf -lv look the same as before
>
> It's a small card. Weighs 46 grams. I was going to write
> I could simply post it to you, & you could keep it if you
> want. As I had guessed it might be some new kind of card
> unexperienced before,
> RTL8139D, card just says made in China
>
> But I just grabbed another card
> card says Level One.
> chip 8139B
> & with both patched kernel & original no improvement.
> So I tried a totally different card xl0 fails too,
> I think that 3com xl0 card was OK before in another box,
> so I'd guess not an rl problem, Sorry.
>
> Probably not 7.1 either, but probably a BIOS config problem of some sort.
>
> IRQ 12 was listed in Award BIOS as Primary, options were also secondary or
> disabled, so I've set it disabled.
> PNP OS Yes
> Resources: Auto
> "Reset config data" to Enabled (I forgot before after card changes)
>
> Did another restore BIOS factory defaults, no help.
> Moved xl0 to another slot (all other 3 slots never used I guess, as
> chassis plates not torn off on what I guess is original chassis).
> No luck with xl0
> I'm out of ideas.

Got it working on xl0, an interrupt problem: I turned off lpt, com2 & something else in the BIOS.
Got to go out now; I'll go back to rl0 too & report back soon. Thanks for the help, both!

Cheers,
Julian
--
Julian Stacey: BSDUnixLinux C Prog Admin SysEng Consult Munich www.berklix.com
Mail plain ASCII text. HTML & Base64 text are spam. www.asciiribbon.org
Re: rl0: watchdog timeout + 40,000 ms ping with 7.1-BETA-i386-disc1.iso
> > I'm remaking binaries,

New generic kernel built & installed, & install of all src/ done too.
No improvement.

> Is there a reliable way to reproduce the issue?

It's continuous, the machine virtually never does a ping in less
than 10 seconds.

> Anyway, would you try attached patch and let me know result?
> Thanks

Done, doesn't help.
Seeing a new message now too:
  ping: sendto: No buffer space available.

Output of vmstat -i and pciconf -lv look the same as before.

It's a small card. Weighs 46 gram. I was going to write
I could simply post it to you, & you could keep it if you
want, as I had guessed it might be some new kind of card
not seen before:
  RTL8139D, card just says made in China

But I just grabbed another card:
  card says Level One.
  chip 8139B
& with both patched kernel & original no improvement.
So I tried a totally different card: xl0 fails too.
I think that 3com xl0 card was OK before in another box,
so I'd guess not an rl problem, sorry.

Probably not 7.1 either, but probably a BIOS config problem of some sort.

IRQ 12 was listed in Award BIOS as Primary, options were also Secondary or
Disabled, so I've set it to Disabled.
  PNP OS: Yes
  Resources: Auto
  "Reset config data" to Enabled (I had forgotten this after the earlier card changes)

Did another restore of BIOS factory defaults, no help.
Moved xl0 to another slot (the other 3 slots were never used, I guess, as
the chassis plates are not torn off on what I guess is the original chassis).
No luck with xl0.
I'm out of ideas.

Cheers,
Julian
--
Julian Stacey: BSDUnixLinux C Prog Admin SysEng Consult Munich www.berklix.com
 Mail plain ASCII text. HTML & Base64 text are spam. www.asciiribbon.org
7.1-PRERELEASE freezes
Hello,

I decided to give 7.1-PRERELEASE a try on one of my machines to find
out if there might be any problems I should be aware of.

I quickly ran into problems. After a while the system freezes completely.
It seems to be somehow related to the load of the machine, as it doesn't
seem to happen when it is idle.

I built a kernel with the software watchdog enabled and enabled watchdog,
which had the nice effect of turning the freeze into a panic. Hopefully
that will be of some help.

I first encountered the problem using SCHED_ULE and then tried SCHED_4BSD
to see whether it made any difference. But the freeze happens with either
scheduler.

I have disabled xorg and the nvidia driver, but that doesn't help either.

I can cut down on various other stuff too, but first I hope that someone
here has a more educated guess about what could be the cause of the
freezes.

I have placed the backtraces from the most recent crashes as well as the
dmesg output from the most recent boot at this URL:

http://borderworlds.dk/~xi/7.1-PRERELEASE.freeze.txt

My kernel config is also included.

As far as I can tell the two backtraces are identical and look like this:

#0  doadump () at pcpu.h:196
#1  0xc05abd03 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xc05abeff in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:572
#3  0xc0570d18 in hardclock (usermode=0, pc=3231434181) at /usr/src/sys/kern/kern_clock.c:642
#4  0xc07d194f in clkintr (frame=0xe38e1c68) at /usr/src/sys/i386/isa/clock.c:164
#5  0xc07c0465 in intr_execute_handlers (isrc=0xc0866700, frame=0xe38e1c68) at /usr/src/sys/i386/i386/intr_machdep.c:366
#6  0xc07d0fa8 in atpic_handle_intr (vector=0, frame=0xe38e1c68) at /usr/src/sys/i386/isa/atpic.c:596
#7  0xc07bbf41 in Xatpic_intr0 () at atpic_vector.s:62
#8  0xc09bc5c5 in acpi_cpu_c1 () at /usr/src/sys/modules/acpi/acpi/../../../i386/acpica/acpi_machdep.c:550
#9  0xc09b54f4 in acpi_cpu_idle () at /usr/src/sys/modules/acpi/acpi/../../../dev/acpica/acpi_cpu.c:945
#10 0xc07c35b6 in cpu_idle () at /usr/src/sys/i386/i386/machdep.c:1183
#11 0xc05c9275 in sched_idletd (dummy=0x0) at /usr/src/sys/kern/sched_4bsd.c:1429
#12 0xc05895d6 in fork_exit (callout=0xc05c9260 , arg=0x0, frame=0xe38e1d38) at /usr/src/sys/kern/kern_fork.c:804
#13 0xc07bbf10 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:264

I can provide more information as needed.

Any help will be greatly appreciated.

Thanks.

--
Christian Laursen
Re: buildworld fails in csh
On 09/26/08 12:49, Jeremy Chadwick wrote:
> Being as I just rebuilt world only 2 days ago and I did not run into
> this problem, I'm concluding the issue must be with your system. :-)
>
> It's possible you've done some bizarre tuning in /etc/make.conf or
> /etc/src.conf which is somehow breaking the build.

I checked make.conf already, since that is usually the cause when I have
such problems. I didn't know about src.conf; I'll have a look at its
manpage (and since I don't have one, that can't be the cause of my
problem either).

>> I'll wipe out /usr/src as well and re-cvsup, then build from single user
>> mode for minimal intervention by shells and environments and see whether
>> that might help.
>
> I don't see how booting single-user is going to help with any of this.

I was finally able to do a buildworld by doing it from single user mode.
My guess is that the root of the problem was either the shell I was
using or some environment variables. Going to single user mode was just
the safest way to remove all those possible effects, since I'm not quite
sure how to do it any other way. But I agree, single user mode itself is
not likely to help other than that.

> And do not forget to remove /var/db/sup/src-all if you remove all of
> /usr/src. People often forget this fact.

I forgot it as well :-)

Thanks,
Tobias

--
Tobias Roth || http://fsck.ch || PGP: 0xCE599B4D
| You can't have everything. Where would you put it? | - Steven Wright
Re: vm.kmem_size settings doesn't affect loader?
On Sep 26, 2008, at 4:43 AM, Bartosz Stec wrote:
> Jeremy Chadwick wrote:
>> These are the tuning settings I use:
>>
>> vm.kmem_size="1536M"
>> vm.kmem_size_max="1536M"
>> vfs.zfs.arc_min="16M"
>> vfs.zfs.arc_max="64M"
>
> Yesterday I added 512 MB of memory to the box (1.5GB in total), and set
> vm.kmem_size and vm.kmem_size_max to "1024M". With pieces of 1024MB,
> 512MB, 256MB, 256MB available and 3 memory slots it is hard to have
> 2GB RAM ;)
>
> So far it has survived world cleaning/building/installing, bonnie++
> benchmarking, fs scrubbing and general usage. Memory usage seems stable.
> If kmem exhaustion unfortunately happens again, I will experiment with
> ARC settings.
>
> IMHO you've gently explained a lot of ZFS tuning concerns in this thread,
> and they should be added to the tuning guide, especially the explanation
> of ARC and prefetch settings.
>
> Thanks again!

Did you increase KVA_PAGES in your kernel config as well? The default of
256 only allows 1GB of kernel memory total. Setting KVA_PAGES to 384
would probably be good for a kmem_size of 1GB. This would leave you with
512MB of space for other things in the kernel.

In your kernel config:

options KVA_PAGES=384

Sorry if you already knew this. I know it's in the ZFS tuning guide. I
just hadn't seen it mentioned in the thread yet and wanted to make sure
it wasn't missed.

Hope that helps.

- Ben
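The arithmetic behind those KVA_PAGES numbers can be sketched as follows. This is only an illustration: it assumes a non-PAE i386 kernel, where each KVA_PAGES unit maps 4 MB of kernel virtual address space (the thread itself only states the 256 = 1GB relationship), and the helper names are made up for this example.

```python
# Assumption: non-PAE i386, where one KVA_PAGES unit maps 4 MB of kernel
# virtual address space. Helper names are illustrative, not FreeBSD APIs.
MB_PER_KVA_PAGE = 4


def kva_size_mb(kva_pages: int) -> int:
    """Total kernel virtual address space, in MB, for a KVA_PAGES value."""
    return kva_pages * MB_PER_KVA_PAGE


def headroom_mb(kva_pages: int, kmem_size_mb: int) -> int:
    """Kernel address space left over once vm.kmem_size is carved out."""
    return kva_size_mb(kva_pages) - kmem_size_mb


if __name__ == "__main__":
    for pages in (256, 384):
        print(f"KVA_PAGES={pages}: {kva_size_mb(pages)} MB total, "
              f"{headroom_mb(pages, 1024)} MB left with kmem_size=1024M")
```

With the default of 256 there would be no address space left beside a 1GB kmem_size, which is presumably why 384 is suggested.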
Re: rl0: watchdog timeout + 40,000 ms ping with 7.1-BETA-i386-disc1.iso
Hi All

Jeremy Chadwick wrote:
> On Thu, Sep 25, 2008 at 05:36:44PM +0200, Julian Stacey wrote:
> > Hi stable@,
> > I just imported an old tower from a friend. Used to run Linux OK.
> > Reset BIOS to defaults, turned off power saving etc, installed
> > 7.1-BETA-i386-disc1.iso
> > I now see
> > rl0: watchdog timeout + 40,000 ms ping outgoing.
> > ping incoming fails,
> > it's not my net switch, I've moved to different segments etc & all else fine
> >
> > I'm remaking binaries, & will look around for netstat r whatever
> > commands later, meanwhile here's dmesg (via a floppy)
> >
> > Of course it could somehow be a bad hardware config, it's a new box to me.
>
> It's a "new box" with hardware from the late 90s? :-)

Yes, new to me :-)
The offer I got was "Do you want this or shall I dump it ?" :-)

> > Copyright (c) 1992-2008 The FreeBSD Project.
> > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> > The Regents of the University of California. All rights reserved.
> > FreeBSD is a registered trademark of The FreeBSD Foundation.
> > FreeBSD 7.1-BETA #0: Sun Sep 7 13:49:18 UTC 2008 > > [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC > > Timecounter "i8254" frequency 1193182 Hz quality 0 > > CPU: Intel Pentium III (651.48-MHz 686-class CPU) > > Origin = "GenuineIntel" Id = 0x681 Stepping = 1 > > > > Features=0x383f9ff > > real memory = 134152192 (127 MB) > > avail memory = 117157888 (111 MB) > > kbd1 at kbdmux0 > > ath_hal: 0.9.20.3 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413) > > acpi0: on motherboard > > acpi0: [ITHREAD] > > ACPI Error (psargs-0459): [INX_] Namespace lookup failure, AE_NOT_FOUND > > ACPI Error (psparse-0626): Method parse/execution failed [\\_SB_.PCI0._PRW] > > (Node 0xc1bd6700), AE_NOT_FOUND > > acpi0: Power Button (fixed) > > acpi0: reservation of 0, a (3) failed > > acpi0: reservation of 10, 7ef (3) failed > > Timecounter "ACPI-safe" frequency 3579545 Hz quality 850 > > acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0 > > pcib0: port > > 0xcf8-0xcff,0x4000-0x407f,0x4080-0x40ff,0x5000-0x500f on acpi0 > > pci0: on pcib0 > > agp0: on hostb0 > > agp0: aperture size is 256M > > pcib1: at device 1.0 on pci0 > > pci1: on pcib1 > > vgapci0: port 0xc000-0xc0ff mem > > 0xe000-0xe7ff,0xed00-0xed00 irq 11 at device 0.0 on pci1 > > isab0: at device 7.0 on pci0 > > isa0: on isab0 > > atapci0: port > > 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xd000-0xd00f at device 7.1 on pci0 > > ata0: on atapci0 > > ata0: [ITHREAD] > > ata1: on atapci0 > > ata1: [ITHREAD] > > uhci0: port 0xd400-0xd41f irq 10 at device 7.2 > > on pci0 > > uhci0: [GIANT-LOCKED] > > uhci0: [ITHREAD] > > usb0: on uhci0 > > usb0: USB revision 1.0 > > uhub0: on usb0 > > uhub0: 2 ports with 2 removable, self powered > > pci0: at device 7.3 (no driver attached) > > rl0: port 0xd800-0xd8ff mem > > 0xee00-0xeeff irq 12 at device 10.0 on pci0 > > miibus0: on rl0 > > rlphy0: PHY 0 on miibus0 > > rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto > > rl0: Ethernet address: 00:08:a1:6d:65:07 
> > rl0: [ITHREAD] > > pci0: at device 11.0 (no driver attached) > > cpu0: on acpi0 > > acpi_throttle0: on cpu0 > > acpi_button0: on acpi0 > > acpi_tz0: on acpi0 > > fdc0: port 0x3f2-0x3f5,0x3f7 irq 6 drq 2 on acpi0 > > fdc0: [FILTER] > > fd0: <1440-KB 3.5" drive> on fdc0 drive 0 > > sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on > > acpi0 > > sio0: type 16550A > > sio0: [FILTER] > > sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0 > > sio1: type 16550A > > sio1: [FILTER] > > atkbdc0: port 0x60,0x64 irq 1 on acpi0 > > atkbd0: irq 1 on atkbdc0 > > kbd0 at atkbd0 > > atkbd0: [GIANT-LOCKED] > > atkbd0: [ITHREAD] > > ACPI Error (psargs-0459): [INX_] Namespace lookup failure, AE_NOT_FOUND > > ACPI Error (psparse-0626): Method parse/execution failed [\\_SB_.PCI0._PRW] > > (Node 0xc1bd6700), AE_NOT_FOUND > > ACPI Error (psargs-0459): [INX_] Namespace lookup failure, AE_NOT_FOUND > > ACPI Error (psparse-0626): Method parse/execution failed [\\_SB_.PCI0._PRW] > > (Node 0xc1bd6700), AE_NOT_FOUND > > pmtimer0 on isa0 > > orm0: at iomem 0xc-0xccfff pnpid ORM on isa0 > > sc0: at flags 0x100 on isa0 > > sc0: VGA <16 virtual consoles, flags=0x300> > > vga0: at port 0x3c0-0x3df iomem 0xa-0xb on isa0 > > Timecounter "TSC" frequency 651482522 Hz quality 800 > > Timecounters tick every 1.000 msec > > ad0: 4110MB at ata0-master UDMA33 > > acd0: CDROM at ata1-master UDMA33 > > Trying to mount root from ufs:/dev/ad0s1a > > rl0: link state changed to UP > > rl0: watchdog timeout > > rl0: link state changed to DOWN > > rl0: link state changed to UP > > rl0: link state changed to DOWN > > rl0: link state changed to UP > > rl0: watchdog timeout > > rl0: watchdog timeout > > I've CC'd PYUN Yong-Hyeon (surname is Pyun), who helps maintain the rl(4) > driver.
Re: bad NFS/UDP performance
> On Fri, Sep 26, 2008 at 12:27:08PM +0300, Danny Braniss wrote:
> > > On Fri, Sep 26, 2008 at 10:04:16AM +0300, Danny Braniss wrote:
> > > > Hi,
> > > > There seems to be some serious degradation in performance.
> > > > Under 7.0 I get about 90 MB/s (on write), while, on the same machine
> > > > under 7.1 it drops to 20!
> > > > Any ideas?
> > >
> > > 1) Network card driver changes,
> > could be, but at least iperf/tcp is ok - can't get udp numbers, do you
> > know of any tool to measure udp performance?
> > BTW, I also checked on different hardware, and the badness is there.
>
> According to INDEX, benchmarks/iperf does UDP bandwidth testing.
>
> benchmarks/nttcp should as well.
>
> What network card is in use? If Intel, what driver version (should be
> in dmesg).
>
> > > 2) This could be relevant, but rwatson@ will need to help determine
> > >    that.
> > >
> > > http://lists.freebsd.org/pipermail/freebsd-stable/2008-September/045109.html
> >
> > gut feeling is that it's somewhere else:
> >
> > Writing 16 MB file
> >         BS  Count  /------ 7.0 ------/  /------ 7.1 ------/
> >      1*512  32768  0.16s   98.11MB/s   0.43s   37.18MB/s
> >      2*512  16384  0.17s   92.04MB/s   0.46s   34.79MB/s
> >      4*512   8192  0.16s  101.88MB/s   0.43s   37.26MB/s
> >      8*512   4096  0.16s   99.86MB/s   0.44s   36.41MB/s
> >     16*512   2048  0.16s  100.11MB/s   0.50s   32.03MB/s
> >     32*512   1024  0.26s   61.71MB/s   0.46s   34.79MB/s
> >     64*512    512  0.22s   71.45MB/s   0.45s   35.41MB/s
> >    128*512    256  0.21s   77.84MB/s   0.51s   31.34MB/s
> >    256*512    128  0.19s   82.47MB/s   0.43s   37.22MB/s
> >    512*512     64  0.18s   87.77MB/s   0.49s   32.69MB/s
> >   1024*512     32  0.18s   89.24MB/s   0.47s   34.02MB/s
> >   2048*512     16  0.17s   91.81MB/s   0.30s   53.41MB/s
> >   4096*512      8  0.16s  100.56MB/s   0.42s   38.07MB/s
> >   8192*512      4  0.82s   19.56MB/s   0.80s   19.95MB/s
> >  16384*512      2  0.82s   19.63MB/s   0.95s   16.80MB/s
> >  32768*512      1  0.81s   19.69MB/s   0.96s   16.64MB/s
> >
> >   Average:                 75.86               33.00
> >
> > the nfs filer is a NetWork Appliance, and is in use, so i get
> > fluctuations in the measurements, but the relations are similar:
> > good on 7.0, bad on 7.1
>
> Do you have any NFS-related tunings in /etc/rc.conf or /etc/sysctl.conf?
>

after more testing, it seems it's related to changes made between Aug 4
and Aug 29, i.e. a kernel built on Aug 4 works fine, Aug 29 is slow.
I'll now try and close the gap.

danny
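The two averages in the quoted table can be re-derived from the per-row throughput figures. A minimal sketch, with the MB/s columns transcribed from the "Writing 16 MB file" table; the variable and function names are illustrative only.

```python
# MB/s columns transcribed from the quoted "Writing 16 MB file" table,
# in row order (1*512 through 32768*512).
RATE_7_0 = [98.11, 92.04, 101.88, 99.86, 100.11, 61.71, 71.45, 77.84,
            82.47, 87.77, 89.24, 91.81, 100.56, 19.56, 19.63, 19.69]
RATE_7_1 = [37.18, 34.79, 37.26, 36.41, 32.03, 34.79, 35.41, 31.34,
            37.22, 32.69, 34.02, 53.41, 38.07, 19.95, 16.80, 16.64]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def slowdown(old, new):
    """How many times lower the average of `new` is than that of `old`."""
    return mean(old) / mean(new)


if __name__ == "__main__":
    print(f"7.0 average: {mean(RATE_7_0):.2f} MB/s")
    print(f"7.1 average: {mean(RATE_7_1):.2f} MB/s")
    print(f"7.0 / 7.1 ratio: {slowdown(RATE_7_0, RATE_7_1):.2f}")
```

The averages agree with the 75.86 and 33.00 reported in the table, so on this measurement 7.1 writes at well under half the 7.0 rate.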
Re: bad NFS/UDP performance
On Fri, Sep 26, 2008 at 04:35:17PM +0300, Danny Braniss wrote:
> > On Fri, Sep 26, 2008 at 12:27:08PM +0300, Danny Braniss wrote:
> > > > On Fri, Sep 26, 2008 at 10:04:16AM +0300, Danny Braniss wrote:
> > > > > Hi,
> > > > > There seems to be some serious degradation in performance.
> > > > > Under 7.0 I get about 90 MB/s (on write), while, on the same machine
> > > > > under 7.1 it drops to 20!
> > > > > Any ideas?
> > > >
> > > > 1) Network card driver changes,
> > > could be, but at least iperf/tcp is ok - can't get udp numbers, do you
> > > know of any tool to measure udp performance?
> > > BTW, I also checked on different hardware, and the badness is there.
> >
> > According to INDEX, benchmarks/iperf does UDP bandwidth testing.
>
> I know, but I get about 1mgb, which seems somewhat low :-(
>
> >
> > benchmarks/nttcp should as well.
> >
> > What network card is in use? If Intel, what driver version (should be
> > in dmesg).
>
> bge:
> and
> bce:
> and intels, but haven't tested there yet.

Both bge(4) and bce(4) claim to support checksum offloading. You might
try disabling it (ifconfig ... -txcsum -rxcsum) to see if things
improve. If not, more troubleshooting is needed.

You might also try turning off TSO if it's supported (check your
ifconfig output for TSO in the options=<> section, then use
ifconfig ... -tso).

> > Do you have any NFS-related tunings in /etc/rc.conf or /etc/sysctl.conf?
> >
> no, but diffing the sysctl output shows:
>
> -vfs.nfs.realign_test: 22141777
> +vfs.nfs.realign_test: 498351
>
> -vfs.nfsrv.realign_test: 5005908
> +vfs.nfsrv.realign_test: 0
>
> +vfs.nfsrv.commit_miss: 0
> +vfs.nfsrv.commit_blks: 0
>
> changing them did nothing - or at least with respect to nfs throughput :-)

I'm not sure what any of these do, as NFS is a bit out of my league.
:-) I'll be following this thread though!

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
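Checking the options=<> section for TSO is easy to do by eye, but the same check can be sketched in code. This is only an illustration: the sample line is hypothetical (it is not output from any machine in this thread), and interface_options and has_tso are made-up helper names.

```python
import re


def interface_options(ifconfig_text: str) -> set:
    """Return the capability names listed in an options=<...> section."""
    match = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", ifconfig_text)
    if match is None:
        return set()
    return {name for name in match.group(1).split(",") if name}


def has_tso(ifconfig_text: str) -> bool:
    """True if any listed capability name starts with TSO (e.g. TSO4)."""
    return any(name.startswith("TSO")
               for name in interface_options(ifconfig_text))


if __name__ == "__main__":
    # Hypothetical sample line, for illustration only.
    sample = "\toptions=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>"
    print(sorted(interface_options(sample)))
    print("TSO enabled:", has_tso(sample))
```

If TSO shows up, ifconfig ... -tso (as suggested above) is the thing to try next.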
Re: bad NFS/UDP performance
On Friday 26 September 2008 03:04:16 am Danny Braniss wrote:
> Hi,
> There seems to be some serious degradation in performance.
> Under 7.0 I get about 90 MB/s (on write), while, on the same machine
> under 7.1 it drops to 20!
> Any ideas?
>
> thanks,
> danny

Perhaps use nfsstat to see if 7.1 is performing more on-the-wire requests?
Also, if you can, do a binary search to narrow down when the regression
occurred in RELENG_7.

--
John Baldwin
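The binary search suggested here can be sketched as follows. It is only an outline of the idea: is_slow is a hypothetical predicate standing in for "check out RELENG_7 as of this date, build a kernel, and rerun the NFS write test", the cutoff date in the usage example is made up, and the sketch assumes that once the regression is introduced every later build is also slow.

```python
from datetime import date, timedelta


def first_bad_date(good: date, bad: date, is_slow) -> date:
    """Return the earliest date whose build is slow.

    Preconditions: the build from `good` is fast, the build from `bad`
    is slow, and every build after the first slow one is also slow.
    """
    while (bad - good).days > 1:
        # Test the midpoint and keep the half that still brackets the change.
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_slow(mid):
            bad = mid
        else:
            good = mid
    return bad


if __name__ == "__main__":
    # Stand-in for a real build-and-benchmark run; this cutoff is made up.
    pretend_regression = date(2008, 8, 19)
    culprit = first_bad_date(date(2008, 8, 4), date(2008, 8, 29),
                             lambda d: d >= pretend_regression)
    print("first slow build:", culprit)
```

For a window of 25 days, such as the Aug 4 to Aug 29 range reported elsewhere in this thread, this needs about five kernel builds rather than one per day.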
Re: bad NFS/UDP performance
> On Fri, 2008-09-26 at 10:04 +0300, Danny Braniss wrote:
> > Hi,
> > There seems to be some serious degradation in performance.
> > Under 7.0 I get about 90 MB/s (on write), while, on the same machine
> > under 7.1 it drops to 20!
> > Any ideas?
>
> The scheduler has been changed to ULE, and NFS has historically been
> very sensitive to changes like that. You could try switching back to
> the 4BSD scheduler and seeing if that makes a difference. If it does,
> toggling PREEMPTION would also be interesting to see the results of.
>
> Gavin

I'm testing 7.0-stable vs 7.1-prerelease, and both have ULE.

BTW, the nfs client hosts I'm testing are idle.

danny
Re: bad NFS/UDP performance
> On Fri, Sep 26, 2008 at 12:27:08PM +0300, Danny Braniss wrote:
> > > On Fri, Sep 26, 2008 at 10:04:16AM +0300, Danny Braniss wrote:
> > > > Hi,
> > > > There seems to be some serious degradation in performance.
> > > > Under 7.0 I get about 90 MB/s (on write), while, on the same machine
> > > > under 7.1 it drops to 20!
> > > > Any ideas?
> > >
> > > 1) Network card driver changes,
> > could be, but at least iperf/tcp is ok - can't get udp numbers, do you
> > know of any tool to measure udp performance?
> > BTW, I also checked on different hardware, and the badness is there.
>
> According to INDEX, benchmarks/iperf does UDP bandwidth testing.

I know, but I get about 1mgb, which seems somewhat low :-(

>
> benchmarks/nttcp should as well.
>
> What network card is in use? If Intel, what driver version (should be
> in dmesg).

bge:
and
bce:
and intels, but haven't tested there yet.

>
> > > 2) This could be relevant, but rwatson@ will need to help determine
> > >    that.
> > >
> > > http://lists.freebsd.org/pipermail/freebsd-stable/2008-September/045109.html
> >
> > gut feeling is that it's somewhere else:
> >
> > Writing 16 MB file
> >         BS  Count  /------ 7.0 ------/  /------ 7.1 ------/
> >      1*512  32768  0.16s   98.11MB/s   0.43s   37.18MB/s
> >      2*512  16384  0.17s   92.04MB/s   0.46s   34.79MB/s
> >      4*512   8192  0.16s  101.88MB/s   0.43s   37.26MB/s
> >      8*512   4096  0.16s   99.86MB/s   0.44s   36.41MB/s
> >     16*512   2048  0.16s  100.11MB/s   0.50s   32.03MB/s
> >     32*512   1024  0.26s   61.71MB/s   0.46s   34.79MB/s
> >     64*512    512  0.22s   71.45MB/s   0.45s   35.41MB/s
> >    128*512    256  0.21s   77.84MB/s   0.51s   31.34MB/s
> >    256*512    128  0.19s   82.47MB/s   0.43s   37.22MB/s
> >    512*512     64  0.18s   87.77MB/s   0.49s   32.69MB/s
> >   1024*512     32  0.18s   89.24MB/s   0.47s   34.02MB/s
> >   2048*512     16  0.17s   91.81MB/s   0.30s   53.41MB/s
> >   4096*512      8  0.16s  100.56MB/s   0.42s   38.07MB/s
> >   8192*512      4  0.82s   19.56MB/s   0.80s   19.95MB/s
> >  16384*512      2  0.82s   19.63MB/s   0.95s   16.80MB/s
> >  32768*512      1  0.81s   19.69MB/s   0.96s   16.64MB/s
> >
> >   Average:                 75.86               33.00
> >
> > the nfs filer is a NetWork Appliance, and is in use, so i get
> > fluctuations in the measurements, but the relations are similar:
> > good on 7.0, bad on 7.1
>
> Do you have any NFS-related tunings in /etc/rc.conf or /etc/sysctl.conf?
>

no, but diffing the sysctl output shows:

-vfs.nfs.realign_test: 22141777
+vfs.nfs.realign_test: 498351

-vfs.nfsrv.realign_test: 5005908
+vfs.nfsrv.realign_test: 0

+vfs.nfsrv.commit_miss: 0
+vfs.nfsrv.commit_blks: 0

changing them did nothing - or at least with respect to nfs throughput :-)

danny
Re: Rare problems in upgrade process (corrupted FS?) [SOLVED]
On Fri, Sep 26, 2008 at 02:12:08PM +0200, Jordi Espasa Clofent wrote:
> Finally I've modified the stable-supfile TAG from
>
> *default release=cvs tag=RELENG_7_0
>
> to
>
> *default release=cvs tag=RELENG_7
>
> and... voilà!... it works!
>
> I've interrupted the csup process (^C) and changed the tag again to
>
> *default release=cvs tag=RELENG_7_0
>
> and it works perfectly.
>
> Maybe it's something as stupid as the first tag being mistyped... but I
> think not. I checked it several times.
> It's solved, but I don't understand it yet.

The part that doesn't make sense to me is why csup using
/usr/share/examples/cvsup/stable-supfile did not work for you. That file
contains tag=RELENG_7.

Are you modifying this file? If so, please don't. Make a copy of it
somewhere and refer to that location. /root might be a good place.

The next time you install world, /usr/share/examples will be
overwritten, and you'll lose your changes.

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
Re: HELP DEBUG: FreeBSD 6.3-RELEASE-p3 TIMEOUT - WRITE_DMA + other strange behaviour!
Thanks Jeremy and Peter,

you are right that the machine has *lots* of hardware in it. I was
thinking of the power supply as a cause and measured the 5 and 12 volt
lines; they seemed to be OK: 11.8 and 5.2 with all hardware in it.

The shared IRQ is the one I had thought of, and that's why I posted
vmstat -i, to hear your opinion.

[forgot to mention that I've read the wiki, and the next step is to patch
the kernel with this patch:
http://freenas.svn.sourceforge.net/viewvc/freenas/branches/0.69/build/kernel-patches/ata/files/patch-ata.diff?view=markup
(anything bad to say about this patch, or can I just run it and nothing
bad can happen?)]

Yes, I have 3 NICs (2 on PCI) + a PCI IDE Promise card. I'll get a smart
switch with VLANs and, one of these days, leave just the integrated xl0
and fxp0 with both external IPs on it, but first I'll patch the kernel if
Jeremy says it won't hurt (as far as I saw, it just moves a timeout from
a hardcoded value to a sysctl?)...

I have another Promise card that is a RAID controller, but when I started
looking for one I asked here, and the answers named the PROMISE ULTRA
ATA133 as a good card for my FreeBSD
( http://docs.freebsd.org/cgi/getmsg.cgi?fetch=290848+0+archive/2008/freebsd-stable/20080316.freebsd-stable )
(hmm, I just saw that Jeremy pointed out the Promise card: 'Their Ultra133
TX2 card works fine on 33MHz PCI bus machines; don't worry about the card
being 66MHz, it will downthrottle correctly.'), so maybe the problem will
be solved if I leave just two NICs and no rl0...

Actually I'm using 6.3 here because I didn't want this to happen, and I
was aware of such problems happening on 7-current.

So tests must be done... please just tell me whether the patch will be
helpful, or whether I should try:

1. remove rl0 and run only one ISP for the test.
2. replace the Ultra 133 card with another one.
3. try to replace the ATA100 cables (the ones with 80 wires) with older
   ones with only 40 wires?
4. ? anything else?
Anton - Valqk wrote:
> Hello,
> I have a VERY strangely behaving 6.3-p3 with DMA timeouts and network cards
> 'dropping traffic'.
> Following is the explanation of the hardware and the things that are happening.
> The machine is a DELL Optiplex PII 300MHz with 512MB RAM.
> It has 3 NICs:
> fxp0: flags=8843 mtu 1500
> options=8
> inet 7.8.9.10 netmask 0xf000 broadcast 7.8.9.255
> ether 00:91:21:16:14:bf
> media: Ethernet autoselect (100baseTX )
> status: active
> rl0: flags=8843 mtu 1500
> options=8
> inet 8.9.10.11 netmask 0xffe0 broadcast 8.9.10.255
> ether 00:02:44:73:2a:fa
> media: Ethernet autoselect (100baseTX )
> status: active
> xl0: flags=8843 mtu 1500
> options=9
> inet 192.168.123.2 netmask 0xff00 broadcast 192.168.123.255
> inet 192.168.123.5 netmask 0xff00 broadcast 192.168.123.255
> inet 192.168.123.6 netmask 0xff00 broadcast 192.168.123.255
> ether 00:c0:4f:20:66:a3
> media: Ethernet autoselect (100baseTX )
> status: active
> fxp0 and rl0 are external links to the world and are plugged into pci slots
> xl0 is the internal interface and is integrated on the motherboard.
> It also has 1 PROMISE ULTRA133 ATA pci IDE controller plugged into the
> pci slot.
> It has 5 disks in it - 4 connected to the PROMISE card and 1 to the
> motherboard ide.
>
> they are as follows:
> ad0 and ad6 are two identical hitachi disks in gmirror for the system
> and a partition that I keep backups on.
>
> ad4, ad5 and ad7 are storage disks - seagates 500GB 8mb cache that I
> keep isos etc files on and are the problematic ones (maybe because of high
> traffic operations compared to the other two?).
>
> What is the problem:
> Actually there are two problems:
> 1. I get a lot of dma timeouts, mostly on ad5 and ad7 where I keep
> files over 4-5MBs and write/read very often with 3-6-8MB/s from the
> disk. I don't use ad4 so I can not tell if there are going to be timeouts, but I
> suppose there will (it currently has linux partitions on it and is not
> mounted). I get these errors:
> dmesg.today:ad7: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=5554848
> dmesg.today:ad7: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=5914112
> dmesg.today:ad7: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=14924096
> dmesg.today:ad7: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=374303456
> dmesg.today:ad7: FAILURE - WRITE_DMA48 status=51
> error=10 LBA=374303456
> dmesg.today:g_vfs_done():ad7[WRITE(offset=191643369472,
> length=131072)]error = 5
> dmesg.today:ad5: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=50757760
> dmesg.today:ad5: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=50760192
> dmesg.today:ad5: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=12032
> dmesg.today:ad5: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=50769792
>
> strange thing is that I'm seeing the g_vfs_done just recently and this
> problem is from the very start of this hardware setup of the machine.
> The machine used to
Re: Rare problems in upgrade process (corrupted FS?) [SOLVED]
Finally I've modified the stable-supfile TAG from

*default release=cvs tag=RELENG_7_0

to

*default release=cvs tag=RELENG_7

and... voilà!... it works!

I've interrupted the csup process (^C) and changed the tag again to

*default release=cvs tag=RELENG_7_0

and it works perfectly.

Maybe it's something as stupid as the first tag being mistyped... but I
think not. I checked it several times.
It's solved, but I don't understand it yet.

--
Thanks,
Jordi Espasa Clofent
Re: bad NFS/UDP performance
On Fri, 2008-09-26 at 10:04 +0300, Danny Braniss wrote:
> Hi,
> There seems to be some serious degradation in performance.
> Under 7.0 I get about 90 MB/s (on write), while, on the same machine
> under 7.1 it drops to 20!
> Any ideas?

The scheduler has been changed to ULE, and NFS has historically been
very sensitive to changes like that. You could try switching back to
the 4BSD scheduler and seeing if that makes a difference. If it does,
toggling PREEMPTION would also be interesting to see the results of.

Gavin
Re: Rare problems in upgrade process (corrupted FS?)
On Fri, Sep 26, 2008 at 01:23:12PM +0200, Jordi Espasa Clofent wrote:
>> I would do the following:
>>
>> rm -fr /usr/src/*
>> rm -fr /var/db/sup/src-all
>> csup -h -L 2 -g /usr/share/examples/stable-supfile
>
> I've done it. But the results are, at least, curious...
>
> # csup -h cvsup.de.FreeBSD.org -L 2 -g
> /usr/share/examples/cvsup/stable-supfile
> Parsing supfile "/usr/share/examples/cvsup/stable-supfile"
> Connecting to cvsup.de.FreeBSD.org
> Connected to 212.19.57.134
> Server software version: SNAP_16_1h
> Negotiating file attribute support
> Exchanging collection information
> Establishing multiplexed-mode data connection
> Running
> Updating collection src-all/cvs
> Shutting down connection to server
> Finished successfully
>
> # cd /usr/src ; ls -la
> total 0

What's df -k have to say about this?

This is truly bizarre. Can you truss the csup process? Something like
this should work:

truss -o truss.out -s 256 csup {...flags from above...}

Then put truss.out up somewhere where we can get to it?

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
Re: Rare problems in upgrade process (corrupted FS?)
On 2008-Sep-26 13:23:12 +0200, Jordi Espasa Clofent <[EMAIL PROTECTED]> wrote:
>Connecting to cvsup.de.FreeBSD.org

Edwin's script reports this as up-to-date.

># cd /usr/src ; ls -la
>total 0

But something is obviously wrong. Can you post your supfile please.

--
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement
an MTA that is either RFC2821-compliant or matches their claimed behaviour.
Re: Rare problems in upgrade process (corrupted FS?)
> I would do the following:
>
> rm -fr /usr/src/*
> rm -fr /var/db/sup/src-all
> csup -h -L 2 -g /usr/share/examples/stable-supfile

I've done it. But the results are, at least, curious...

# csup -h cvsup.de.FreeBSD.org -L 2 -g /usr/share/examples/cvsup/stable-supfile
Parsing supfile "/usr/share/examples/cvsup/stable-supfile"
Connecting to cvsup.de.FreeBSD.org
Connected to 212.19.57.134
Server software version: SNAP_16_1h
Negotiating file attribute support
Exchanging collection information
Establishing multiplexed-mode data connection
Running
Updating collection src-all/cvs
Shutting down connection to server
Finished successfully

# cd /usr/src ; ls -la
total 0

Nothing exists now in /usr/src. I've tried again using another mirror
and cvsup(1) instead of csup(1). Same result: nothing in /usr/src.

It's disconcerting.

> I can assure you /sys/amd64/conf/GENERIC exists, and is on the cvsup
> mirrors.

Yes, of course. I've checked it from cvsweb.

> Superblock problems wouldn't explain this; there are hundreds of
> superblocks available (you wouldn't be able to use your machine if they
> were all horked).

I supposed so; your words confirm it.

--
Thanks,
Jordi Espasa Clofent
Re: Rare problems in upgrade process (corrupted FS?)
On 2008-Sep-26 12:22:55 +0200, Jordi Espasa Clofent <[EMAIL PROTECTED]> wrote:
>1) I do the sync process with csup(1); next I go into
>/usr/src/sys/amd64/conf to edit the GENERIC file (I use a customized
>kernels) and this file doesn't exists.

You might like to check your CVSup site against http://www.mavetju.org/unix/freebsd-mirrors/ to confirm it is updating correctly. GENERIC should exist.

>* I reboot the machine (because of I suspect a very weird FS problem),
>boot in single user mode and do a 'fsck -fy'. Effectively, the fsck(8)
>found and repair several errors. Especially, one error claims my
>attention: SUPERBLOCK.

It might have been useful if you had kept a record of the exact messages. If you repeat the fsck, does it now report any problems?

If you are using an up-to-date CVSup mirror, my next suggestion would be hardware problems.

-- 
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement an MTA that is either RFC2821-compliant or matches their claimed behaviour.
Re: HELP DEBUG: FreeBSD 6.3-RELEASE-p3 TIMEOUT - WRITE_DMA + other strange behaviour!
On Fri, Sep 26, 2008 at 01:12:14PM +0300, Anton - Valqk wrote:
> Hello,
> I have a VERY strange behaving 6-3p3 with DMA timeouts and network cards
> 'dropping traffic'.

The disk errors you see are well-known, but the reasons for them happening differ per person. Some people replace cables and the problem goes away. Others change controller cards. Others found no solution and went to Linux.

http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting

Here are some facts:

1) The LBAs reported to have problems are scattered, which indicates to me there are probably not bad blocks on your disks,

2) You have two separate disks showing the above behaviour, decreasing the probability of it being bad blocks/sectors,

3) Your dmesg.today doesn't include timestamps, so I have to assume the problems all happen at once or within short moments of one another, rather than at random moments throughout a 24 hour period.

> strange thing is that I'm seeing the g_vfs_done just recently and this
> problem is from the very start of this hardware setup of the machine.

I believe the g_vfs_done issues can be attributed either to the disk errors you're seeing or to oddities with gmirror/GEOM. I've seen people report this before, and GEOM often spits back an error on an index/offset which seems way too large to be realistic.

> The machine used to work with two hitachi disks connected to the ad0 and
> ad1 (integrated ide) and only one - xl0 - nic perfectly.
> The problems started when I plugged in the PROMISE and other nic cards
> and started using it as router, fileserver and backup server (each in
> separate jail, except the pf firewall).
> ...
>
> 2. The other strange issue is that when (I guess) it starts timeouting
> *sometimes* not everytime I'm losing connection to xl0 or fxp0
> (sometimes the rl0 works and accepts connections from the outside,
> sometimes - not). When I go to the machine and plug a monitor - there
> are no messages from kernel, no logs in /var/log/messages or debug -
> nothing. Strange thing is that I ping host from the local net and it time
> outs, ifconfig shows that interface is connected at fd 100mbit and
> everything seems ok. I've tried ifconfig xl0 down up but doesn't help,
> tried plugging out the cable and it got connected but not packets passed
> - timeout again!

I've looked at your dmesg and vmstat output, and I have a feeling the problem is an obvious one. Your system has no APIC (this is not a typo), so your system *must* share IRQs. You have ***four*** devices on IRQ 11: a USB controller, your fxp0 card, your rl0 card, and your xl0 card.

> http://valqk.ath.cx/tmp/dmesg
> http://valqk.ath.cx/tmp/vmstat
> http://valqk.ath.cx/tmp/smartctl
>
> please give any ideas/hints/solutions!

I would recommend you start yanking PCI cards out of the system and see which removal solves the problem. You did state that the problems began once you added the Promise card (which makes your system have FIVE PCI cards in it?!? Sheesh). I can't imagine you'll have a stable system with that many cards in the box all sharing a single IRQ -- especially on a board that old.

I'd recommend decreasing the number of cards you have in that system, or getting a motherboard that has an APIC and preferably some reliable on-board networking (read: Intel chips). Toss the rl0 card if possible, and consider replacing the Promise controller with a different one.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
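The IRQ-sharing check described in the reply above can be sketched in shell. This is only a minimal sketch, assuming dmesg probe lines of the usual form "name: <description> ... irq N at device ...". The function name irq_sharing is hypothetical, and the sample lines are made up for illustration; only the device names and the shared IRQ 11 come from the thread.

```sh
#!/bin/sh
# irq_sharing: read dmesg-style probe lines on stdin and print, per IRQ,
# how many devices sit on it and which ones.
irq_sharing() {
    awk '
        {
            # Look for the token pair "irq N" anywhere on the line.
            for (i = 1; i < NF; i++) {
                if ($i == "irq" && $(i + 1) ~ /^[0-9]+$/) {
                    dev = $1
                    sub(/:$/, "", dev)          # "fxp0:" -> "fxp0"
                    irq = $(i + 1)
                    count[irq]++
                    if (irq in names)
                        names[irq] = names[irq] " " dev
                    else
                        names[irq] = dev
                    break
                }
            }
        }
        END {
            for (irq in count)
                printf "irq %s: %d device(s): %s\n", irq, count[irq], names[irq]
        }
    ' | sort -k2,2n
}

# Made-up sample lines (NOT the real dmesg from this thread).
irq_sharing <<'EOF'
uhci0: <USB controller> port 0xdce0-0xdcff irq 11 at device 7.2 on pci0
fxp0: <Ethernet> port 0xd800-0xd83f irq 11 at device 13.0 on pci0
rl0: <Ethernet> port 0xd400-0xd4ff irq 11 at device 14.0 on pci0
xl0: <Ethernet> port 0xdc00-0xdc3f irq 11 at device 17.0 on pci0
atapci0: <ATA controller> port 0x1f0-0x1f7 irq 14 at device 7.1 on pci0
EOF
```

On a live system the same function could presumably be fed with "dmesg | irq_sharing"; any IRQ listed with more than one device is shared.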
Re: buildworld fails in csh
On Fri, Sep 26, 2008 at 12:14:49PM +0200, Tobias Roth wrote: > On 09/26/08 11:59, Jeremy Chadwick wrote: >> On Fri, Sep 26, 2008 at 11:46:28AM +0200, Tobias Roth wrote: >>> On 09/25/08 15:14, Andreas Rudisch wrote: On Thu, 25 Sep 2008 12:49:42 +0200 Tobias Roth <[EMAIL PROTECTED]> wrote: > heh, that should be RELENG_7. Update your source tree again and clean up the build dirs. http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html#Q23.4.14.6. Could be caused by some left overs from a previous build. >>> That didn't work. What else could I try? >> >> Did you rm -fr /usr/obj/* before rebuilding world? "That didn't work" >> is too ambiguous. > > I followed the above URL and did what was suggested there. So "That > didn't work" was refering to > > # chflags -R noschg /usr/obj/usr > # rm -rf /usr/obj/usr > # cd /usr/src > # make cleandir > # make cleandir > >> The build is failing because it claims ICONV_CONST is undefined. >> >> ICONV_CONST is found here: >> >> $ grep -r ICONV_CONST /usr/src/contrib/tcsh /usr/src/bin/csh >> /usr/src/contrib/tcsh/config.h.in:#undef ICONV_CONST >> /usr/src/contrib/tcsh/configure:#define ICONV_CONST $am_cv_proto_iconv_arg1 >> /usr/src/contrib/tcsh/sh.func.c:ICONV_CONST char *src; >> /usr/src/bin/csh/config.h:#define ICONV_CONST const >> >> src/bin/csh/config.h declares it. >> >> The proper include files are only included if HAVE_ICONV is declared, >> which it is (in src/bin/csh/Makefile), as you can see from -DHAVE_ICONV. > > Nothing seems to be wrong here really. Being as I just rebuilt world only 2 days ago and I did not run into this problem, I'm concluding the issue must be with your system. :-) It's possible you've done some bizarre tuning in /etc/make.conf or /etc/src.conf which is somehow breaking the build. >> You might have to end up giving someone access to your box to solve this >> problem. > > That will not be possible. 
>
> I'll wipe out /usr/src as well and re-cvsup, then build from single user
> mode for minimal intervention by shells and environments and see whether
> that might help.

I don't see how booting single-user is going to help with any of this.

And do not forget to remove /var/db/sup/src-all if you remove all of /usr/src. People often forget this step.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
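The grep output quoted earlier in this message shows both a "#define" and an "#undef" of ICONV_CONST, so it can be useful to test one specific header for an actual definition. A minimal sketch follows; defines_macro is a hypothetical helper, not part of the FreeBSD build, and the two paths are simply the ones named in the grep output.

```sh
#!/bin/sh
# defines_macro FILE MACRO: succeed if FILE has a "#define MACRO" line.
# An "#undef MACRO" line does not count.
defines_macro() {
    grep -Eq "^[[:space:]]*#[[:space:]]*define[[:space:]]+$2([[:space:]]|\$)" "$1"
}

# Check the two headers mentioned in the thread, if they are present.
for f in /usr/src/bin/csh/config.h /usr/src/contrib/tcsh/config.h.in; do
    if [ -f "$f" ] && defines_macro "$f" ICONV_CONST; then
        echo "$f defines ICONV_CONST"
    else
        echo "$f does not define ICONV_CONST (or is missing)"
    fi
done
```

If the first header does not report a definition on the affected box, that would point at a damaged or stale source tree rather than at the compiler.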
Re: Rare problems in upgrade process (corrupted FS?)
On Fri, Sep 26, 2008 at 12:22:55PM +0200, Jordi Espasa Clofent wrote: > Hi all, > > I'm traying to update a FreeBSD server box from 6.3p11 to 7.0 and I've > found a rare problems. > > 1) I do the sync process with csup(1); next I go into > /usr/src/sys/amd64/conf to edit the GENERIC file (I use a custimized > kernels) and this file doesn't exists. Mmmm I decide to repeat the > process againt other cvsup mirror but I get the same results: GENERIC > file isn't there. > > 2) I go to FreeBSD CVSWeb , locate the GENERIC file under the 7_0 tag, > copy and paste. Yes, I know: a very nasty process. The big problem > appears when I try to do 'make cleandir' and others. I get the next > outputs: > > # pwd > /usr/src > # make cleandir > make: don't know how to make cleandir. Stop > # make buildworld > make: don't know how to make buildworld. Stop > # ls -l /usr/bin/make > -r-xr-xr-x 1 root wheel 351024 Aug 18 13:19 /usr/bin/make > # file /usr/bin/make > /usr/bin/make: ELF 64-bit LSB executable, AMD x86-64, version 1 > (FreeBSD), for FreeBSD 6.3, statically linked, stripped Looks to me like you have no /usr/src/Makefile. > * After the theorical FS reparation I'm again in the point 1. None of the information you provided in your above output, however, shows anything about the filesystem (other than /usr/bin/make). But this sounds honestly like some sort of corrupted supdb, or a cvsup mirror that's broken. I would do the following: rm -fr /usr/src/* rm -fr /var/db/sup/src-all csup -h -L 2 -g /usr/share/examples/stable-supfile I can assure you /sys/amd64/conf/GENERIC exists, and is on the cvsup mirrors. > * I reboot the machine (because of I suspect a very weird FS problem), > boot in single user mode and do a 'fsck -fy'. Effectively, the fsck(8) > found and repair several errors. Epecially, one error claims my > attention: SUPERBLOCK. 
Superblock problems wouldn't explain this; there are hundreds of superblocks available (you wouldn't be able to use your machine if they were all horked).

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
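After a re-sync it may be worth confirming that the tree really contains the two files this thread turns on: the top-level Makefile (without it, make cannot know targets such as cleandir or buildworld) and the amd64 GENERIC config. A minimal sketch; check_src_tree is a hypothetical name, and the list of files is an assumption limited to what the messages above mention.

```sh
#!/bin/sh
# check_src_tree [DIR]: report whether the files discussed in this thread
# exist under DIR (default /usr/src). Returns non-zero if any is missing.
check_src_tree() {
    src=${1:-/usr/src}
    status=0
    for f in Makefile sys/amd64/conf/GENERIC; do
        if [ -f "$src/$f" ]; then
            echo "ok: $src/$f"
        else
            echo "MISSING: $src/$f"
            status=1
        fi
    done
    return "$status"
}

check_src_tree /usr/src || echo "source tree looks incomplete; re-sync before building"
```

A tree that fails this check would explain the "don't know how to make" errors without involving the filesystem at all.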
Rare problems in upgrade process (corrupted FS?)
Hi all,

I'm trying to update a FreeBSD server box from 6.3p11 to 7.0 and I've run into some strange problems.

1) I do the sync process with csup(1); next I go into /usr/src/sys/amd64/conf to edit the GENERIC file (I use customized kernels), and this file doesn't exist. I decided to repeat the process against another cvsup mirror, but I get the same result: the GENERIC file isn't there.

2) I go to FreeBSD CVSWeb, locate the GENERIC file under the 7_0 tag, and copy and paste it. Yes, I know: a very nasty process. The big problem appears when I try to run 'make cleandir' and other targets. I get the following output:

# pwd
/usr/src
# make cleandir
make: don't know how to make cleandir. Stop
# make buildworld
make: don't know how to make buildworld. Stop
# ls -l /usr/bin/make
-r-xr-xr-x 1 root wheel 351024 Aug 18 13:19 /usr/bin/make
# file /usr/bin/make
/usr/bin/make: ELF 64-bit LSB executable, AMD x86-64, version 1 (FreeBSD), for FreeBSD 6.3, statically linked, stripped

???

* I rebooted the machine (because I suspect a very weird FS problem), booted into single-user mode and ran 'fsck -fy'. Indeed, fsck(8) found and repaired several errors. One error in particular caught my attention: SUPERBLOCK.

* After the supposed FS repair I'm back at point 1.

Any clues?

-- 
Thanks,
Jordi Espasa Clofent
HELP DEBUG: FreeBSD 6.3-RELEASE-p3 TIMEOUT - WRITE_DMA + other strange behaviour!
Hello,
I have a VERY strangely behaving 6.3-p3 machine with DMA timeouts and network cards 'dropping traffic'. The following is a description of the hardware and of the things that are happening.

The machine is a DELL OptiPlex PII 300 MHz with 512 MB RAM. It has 3 NICs:

fxp0: flags=8843 mtu 1500
        options=8
        inet 7.8.9.10 netmask 0xf000 broadcast 7.8.9.255
        ether 00:91:21:16:14:bf
        media: Ethernet autoselect (100baseTX )
        status: active
rl0: flags=8843 mtu 1500
        options=8
        inet 8.9.10.11 netmask 0xffe0 broadcast 8.9.10.255
        ether 00:02:44:73:2a:fa
        media: Ethernet autoselect (100baseTX )
        status: active
xl0: flags=8843 mtu 1500
        options=9
        inet 192.168.123.2 netmask 0xff00 broadcast 192.168.123.255
        inet 192.168.123.5 netmask 0xff00 broadcast 192.168.123.255
        inet 192.168.123.6 netmask 0xff00 broadcast 192.168.123.255
        ether 00:c0:4f:20:66:a3
        media: Ethernet autoselect (100baseTX )
        status: active

fxp0 and rl0 are external links to the world and are plugged into PCI slots. xl0 is the internal interface and is integrated on the motherboard.

It also has one PROMISE ULTRA133 ATA PCI IDE controller plugged into a PCI slot.

It has 5 disks in it: 4 connected to the PROMISE card and 1 to the motherboard IDE. They are as follows:

ad0 and ad6 are two identical Hitachi disks in a gmirror, holding the system and a partition that I keep backups on.
ad4, ad5 and ad7 are storage disks (Seagate 500GB, 8MB cache) that I keep ISOs and similar files on. These are the problematic ones (maybe because of high-traffic operations compared to the other two?).

What is the problem? Actually there are two problems:

1. I get a lot of DMA timeouts, mostly on ad5 and ad7, where I keep files over 4-5MB and read/write very often at 3-6-8MB/s. I don't use ad4, so I cannot tell whether there are going to be timeouts there too, but I suppose there will be (it currently has Linux partitions on it and is not mounted).
I get these errors:

dmesg.today:ad7: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=5554848
dmesg.today:ad7: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=5914112
dmesg.today:ad7: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=14924096
dmesg.today:ad7: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=374303456
dmesg.today:ad7: FAILURE - WRITE_DMA48 status=51 error=10 LBA=374303456
dmesg.today:g_vfs_done():ad7[WRITE(offset=191643369472, length=131072)]error = 5
dmesg.today:ad5: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=50757760
dmesg.today:ad5: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=50760192
dmesg.today:ad5: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=12032
dmesg.today:ad5: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=50769792

The strange thing is that I have only recently started seeing the g_vfs_done errors, while the timeout problem has been there from the very start of this hardware setup.

The machine used to work perfectly with two Hitachi disks connected to ad0 and ad1 (integrated IDE) and only one NIC, xl0. The problems started when I plugged in the PROMISE and the other NIC cards and started using it as a router, file server and backup server (each in a separate jail, except the pf firewall).

2. The other strange issue is that when (I guess) the timeouts start, *sometimes*, not every time, I lose the connection to xl0 or fxp0 (sometimes rl0 works and accepts connections from the outside, sometimes not). When I go to the machine and plug in a monitor, there are no messages from the kernel and nothing in /var/log/messages or debug. The strange thing is that when I ping the host from the local net it times out, yet ifconfig shows that the interface is connected at full-duplex 100 Mbit and everything seems OK. I've tried ifconfig xl0 down/up, but it doesn't help. I tried unplugging and replugging the cable and the link came up, but no packets passed - timeout again! After I rebooted, the NIC came up.
These 'drops' have become more and more common recently. Last night I wasn't able to log in for about an hour, and after that the machine came back up again by itself! That was on the LAN; from the outside it wasn't accessible at all. The strange thing is that it replied to ping, but I wasn't even able to open a connection to the ssh port, and the NAT wasn't working?! Afterwards I remembered that at that time a cron job was running that records an online radio broadcast to a file for an hour. Weird!

The machine also has rtorrent, apache22 and samba (in a jail) running.

Some output from it can be found here:
http://valqk.ath.cx/tmp/dmesg
http://valqk.ath.cx/tmp/vmstat
http://valqk.ath.cx/tmp/smartctl

Please give any ideas/hints/solutions!
Thanks a lot to everyone!

cheers,
valqk.
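To see how the errors in this report are spread across disks, the quoted log lines can be tallied per device. A minimal sketch, assuming lines of the form shown in the post ("adN: TIMEOUT - ..." or "adN: FAILURE - ...", optionally prefixed by a file name from grep); ata_error_tally is a hypothetical name.

```sh
#!/bin/sh
# ata_error_tally: count TIMEOUT and FAILURE lines per ATA device.
# Prints one "device kind count" line per combination, sorted.
ata_error_tally() {
    awk '
        match($0, /ad[0-9]+: (TIMEOUT|FAILURE)/) {
            hit = substr($0, RSTART, RLENGTH)   # e.g. "ad7: TIMEOUT"
            split(hit, part, ": ")
            n[part[1] " " part[2]]++
        }
        END { for (k in n) print k, n[k] }
    ' | sort
}

# The log lines quoted in the post above.
ata_error_tally <<'EOF'
dmesg.today:ad7: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=5554848
dmesg.today:ad7: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=5914112
dmesg.today:ad7: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=14924096
dmesg.today:ad7: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=374303456
dmesg.today:ad7: FAILURE - WRITE_DMA48 status=51 error=10 LBA=374303456
dmesg.today:g_vfs_done():ad7[WRITE(offset=191643369472, length=131072)]error = 5
dmesg.today:ad5: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=50757760
dmesg.today:ad5: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=50760192
dmesg.today:ad5: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=12032
dmesg.today:ad5: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=50769792
EOF
```

The g_vfs_done line is deliberately not counted, since it reports the filesystem-level consequence rather than a separate ATA event.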
ssh problems when upgrading 5.5 to 6.3
Hi all,

I'm trying to remotely upgrade a 5.5 system to 6.3 and have run into an issue with userland not matching my kernel. (Yes, I know I'm a bad guy for even trying to do an upgrade remotely, but this is a dress rehearsal for future such scenarios.)

Symptoms: when trying to ssh to the machine with a 6.3 kernel and a 5.5 userland I get:

% ssh machine
Password:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.

Then comes the motd, and after that the session is stuck. I can manage to do "ssh machine csh"; I don't get a prompt but am able to execute commands.

A tail of /var/log/messages reveals:

Sep 26 12:00:36 web sshd[3012]: error: openpty: Invalid argument
Sep 26 12:00:36 web sshd[3015]: error: session_pty_req: session 0 alloc failed

OK, let's do a "su" and reboot the machine (I have used nextboot to try the new kernel out), but su gives me a "su: Sorry" straight away. Looking in messages I see:

Sep 26 11:14:14 web su: in prompt_echo_off(): tcgetattr(): Operation not supported
Sep 26 11:14:14 web su: BAD SU chris to root on tty

OK, I'm totally aware that this is caused by running the wrong userland for the kernel. But I would still like to explore this problem a bit. Thus these questions:

A) Is this issue related to going directly from 5.5 to 6.3? That is, could I have avoided these problems by upgrading to 6.0 first and then heading on to 6.3?

B) Do you think I would have been able to do an "su" or even log in if I had had /usr/ports/misc/compat5x installed?

C) Does anyone have a creative way to reboot the machine remotely? You were all waiting for this, weren't you ;-) (Or is there a way to get su to survive long enough to do a reboot?)

/Chris
Re: buildworld fails in csh
On 09/26/08 11:59, Jeremy Chadwick wrote: On Fri, Sep 26, 2008 at 11:46:28AM +0200, Tobias Roth wrote: On 09/25/08 15:14, Andreas Rudisch wrote: On Thu, 25 Sep 2008 12:49:42 +0200 Tobias Roth <[EMAIL PROTECTED]> wrote: heh, that should be RELENG_7. Update your source tree again and clean up the build dirs. http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html#Q23.4.14.6. Could be caused by some left overs from a previous build. That didn't work. What else could I try? Did you rm -fr /usr/obj/* before rebuilding world? "That didn't work" is too ambiguous. I followed the above URL and did what was suggested there. So "That didn't work" was refering to # chflags -R noschg /usr/obj/usr # rm -rf /usr/obj/usr # cd /usr/src # make cleandir # make cleandir The build is failing because it claims ICONV_CONST is undefined. ICONV_CONST is found here: $ grep -r ICONV_CONST /usr/src/contrib/tcsh /usr/src/bin/csh /usr/src/contrib/tcsh/config.h.in:#undef ICONV_CONST /usr/src/contrib/tcsh/configure:#define ICONV_CONST $am_cv_proto_iconv_arg1 /usr/src/contrib/tcsh/sh.func.c:ICONV_CONST char *src; /usr/src/bin/csh/config.h:#define ICONV_CONST const src/bin/csh/config.h declares it. The proper include files are only included if HAVE_ICONV is declared, which it is (in src/bin/csh/Makefile), as you can see from -DHAVE_ICONV. Nothing seems to be wrong here really. You might have to end up giving someone access to your box to solve this problem. That will not be possible. I'll wipe out /usr/src as well and re-cvsup, then build from single user mode for minimal intervention by shells and environments and see whether that might help. Thanks, Tobias -- Tobias Roth || http://fsck.ch || PGP: 0xCE599B4D | Percusive Maintenance: | The art of tuning or repairing equipment by hitting it. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: Problems with FreeBSD 7.1 Pre-Release after Upgrade from 7.0
On Thu, 25 Sep 2008, [EMAIL PROTECTED] wrote:
> After cvsuping the source and recompiling the kernel from 7.0
>
> pid 971 (kldstat), uid 0: exited on signal 11 (core dumped)
> fuse4bsd: version 0.3.9-pre1, FUSE ABI 7.8
> pid 977 (mdconfig), uid 0: exited on signal 11 (core dumped)
> pid 978 (mdconfig), uid 0: exited on signal 11 (core dumped)
> acpi_ec0: warning: EC done before starting event wait
> pid 1371 (kldstat), uid 1001: exited on signal 11 (core dumped)
> pid 4485 (kldstat), uid 0: exited on signal 11 (core dumped)

Just checking: have you rebuilt your userland too? (And I see you use fuse; you might want to rebuild that too.)

/Chris

-- 
http://www.arnold.se/chris/
Re: buildworld fails in csh
On Fri, Sep 26, 2008 at 11:46:28AM +0200, Tobias Roth wrote: > On 09/25/08 15:14, Andreas Rudisch wrote: >> On Thu, 25 Sep 2008 12:49:42 +0200 >> Tobias Roth <[EMAIL PROTECTED]> wrote: >> >>> heh, that should be RELENG_7. >> >> Update your source tree again and clean up the build dirs. >> http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html#Q23.4.14.6. >> >> Could be caused by some left overs from a previous build. > > That didn't work. What else could I try? Did you rm -fr /usr/obj/* before rebuilding world? "That didn't work" is too ambiguous. The build is failing because it claims ICONV_CONST is undefined. ICONV_CONST is found here: $ grep -r ICONV_CONST /usr/src/contrib/tcsh /usr/src/bin/csh /usr/src/contrib/tcsh/config.h.in:#undef ICONV_CONST /usr/src/contrib/tcsh/configure:#define ICONV_CONST $am_cv_proto_iconv_arg1 /usr/src/contrib/tcsh/sh.func.c:ICONV_CONST char *src; /usr/src/bin/csh/config.h:#define ICONV_CONST const src/bin/csh/config.h declares it. The proper include files are only included if HAVE_ICONV is declared, which it is (in src/bin/csh/Makefile), as you can see from -DHAVE_ICONV. You might have to end up giving someone access to your box to solve this problem. -- | Jeremy Chadwickjdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: bad NFS/UDP performance
On Fri, Sep 26, 2008 at 12:27:08PM +0300, Danny Braniss wrote:
> > On Fri, Sep 26, 2008 at 10:04:16AM +0300, Danny Braniss wrote:
> > > Hi,
> > > There seems to be some serious degradation in performance.
> > > Under 7.0 I get about 90 MB/s (on write), while, on the same machine
> > > under 7.1 it drops to 20!
> > > Any ideas?
> >
> > 1) Network card driver changes,
> could be, but at least iperf/tcp is ok - can't get udp numbers, do you
> know of any tool to measure udp performance?
> BTW, I also checked on different hardware, and the badness is there.

According to INDEX, benchmarks/iperf does UDP bandwidth testing. benchmarks/nttcp should as well.

What network card is in use? If Intel, what driver version (should be in dmesg).

> > 2) This could be relevant, but rwatson@ will need to help determine
> >    that.
> >
> > http://lists.freebsd.org/pipermail/freebsd-stable/2008-September/045109.html
>
> gut feeling is that it's somewhere else:
>
> Writing 16 MB file
>        BS  Count   /----- 7.0 -----/   /----- 7.1 -----/
>     1*512  32768   0.16s   98.11MB/s   0.43s   37.18MB/s
>     2*512  16384   0.17s   92.04MB/s   0.46s   34.79MB/s
>     4*512   8192   0.16s  101.88MB/s   0.43s   37.26MB/s
>     8*512   4096   0.16s   99.86MB/s   0.44s   36.41MB/s
>    16*512   2048   0.16s  100.11MB/s   0.50s   32.03MB/s
>    32*512   1024   0.26s   61.71MB/s   0.46s   34.79MB/s
>    64*512    512   0.22s   71.45MB/s   0.45s   35.41MB/s
>   128*512    256   0.21s   77.84MB/s   0.51s   31.34MB/s
>   256*512    128   0.19s   82.47MB/s   0.43s   37.22MB/s
>   512*512     64   0.18s   87.77MB/s   0.49s   32.69MB/s
>  1024*512     32   0.18s   89.24MB/s   0.47s   34.02MB/s
>  2048*512     16   0.17s   91.81MB/s   0.30s   53.41MB/s
>  4096*512      8   0.16s  100.56MB/s   0.42s   38.07MB/s
>  8192*512      4   0.82s   19.56MB/s   0.80s   19.95MB/s
> 16384*512      2   0.82s   19.63MB/s   0.95s   16.80MB/s
> 32768*512      1   0.81s   19.69MB/s   0.96s   16.64MB/s
>
> Average:                   75.86               33.00
>
> the nfs filer is a NetWork Appliance, and is in use, so i get fluctuations in
> the measurements, but the relation are similar, good on 7.0, bad on 7.1

Do you have any NFS-related tunings in /etc/rc.conf or /etc/sysctl.conf?

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
Re: buildworld fails in csh
On 09/25/08 15:14, Andreas Rudisch wrote:
> On Thu, 25 Sep 2008 12:49:42 +0200
> Tobias Roth <[EMAIL PROTECTED]> wrote:
>
>> heh, that should be RELENG_7.
>
> Update your source tree again and clean up the build dirs.
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html#Q23.4.14.6.
>
> Could be caused by some left overs from a previous build.

That didn't work. What else could I try?

Thanks,
Tobias

-- 
Tobias Roth || http://fsck.ch || PGP: 0xCE599B4D
| God is a comedian playing to an audience too afraid to laugh.
| - Voltaire
Re: bad NFS/UDP performance
> On Fri, Sep 26, 2008 at 10:04:16AM +0300, Danny Braniss wrote:
> > Hi,
> > There seems to be some serious degradation in performance.
> > Under 7.0 I get about 90 MB/s (on write), while, on the same machine
> > under 7.1 it drops to 20!
> > Any ideas?
>
> 1) Network card driver changes,

could be, but at least iperf/tcp is ok - can't get udp numbers, do you know of any tool to measure udp performance?
BTW, I also checked on different hardware, and the badness is there.

> 2) This could be relevant, but rwatson@ will need to help determine
>    that.
>
> http://lists.freebsd.org/pipermail/freebsd-stable/2008-September/045109.html

gut feeling is that it's somewhere else:

Writing 16 MB file
       BS  Count   /----- 7.0 -----/   /----- 7.1 -----/
    1*512  32768   0.16s   98.11MB/s   0.43s   37.18MB/s
    2*512  16384   0.17s   92.04MB/s   0.46s   34.79MB/s
    4*512   8192   0.16s  101.88MB/s   0.43s   37.26MB/s
    8*512   4096   0.16s   99.86MB/s   0.44s   36.41MB/s
   16*512   2048   0.16s  100.11MB/s   0.50s   32.03MB/s
   32*512   1024   0.26s   61.71MB/s   0.46s   34.79MB/s
   64*512    512   0.22s   71.45MB/s   0.45s   35.41MB/s
  128*512    256   0.21s   77.84MB/s   0.51s   31.34MB/s
  256*512    128   0.19s   82.47MB/s   0.43s   37.22MB/s
  512*512     64   0.18s   87.77MB/s   0.49s   32.69MB/s
 1024*512     32   0.18s   89.24MB/s   0.47s   34.02MB/s
 2048*512     16   0.17s   91.81MB/s   0.30s   53.41MB/s
 4096*512      8   0.16s  100.56MB/s   0.42s   38.07MB/s
 8192*512      4   0.82s   19.56MB/s   0.80s   19.95MB/s
16384*512      2   0.82s   19.63MB/s   0.95s   16.80MB/s
32768*512      1   0.81s   19.69MB/s   0.96s   16.64MB/s

Average:                   75.86               33.00

the nfs filer is a Network Appliance, and is in use, so i get fluctuations in the measurements, but the relations are similar: good on 7.0, bad on 7.1

Cheers,
danny
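The "Average" line of the benchmark in this message can be re-derived from the sixteen rows with a few lines of awk. A minimal sketch, assuming each data row has six whitespace-separated fields (block size, count, 7.0 time, 7.0 rate, 7.1 time, 7.1 rate) and that the rows are unquoted, i.e. carry no leading "> "; avg_rates is a hypothetical name.

```sh
#!/bin/sh
# avg_rates: average the 4th and 6th fields (rates such as "98.11MB/s")
# over all rows that look like benchmark data.
avg_rates() {
    awk '
        # awk takes the leading numeric part of "98.11MB/s" when adding 0.
        $4 ~ /MB\/s$/ && $6 ~ /MB\/s$/ {
            a += $4 + 0
            b += $6 + 0
            n++
        }
        END { if (n) printf "%.2f %.2f\n", a / n, b / n }
    '
}

# The sixteen rows from the post above.
avg_rates <<'EOF'
1*512 32768 0.16s 98.11MB/s 0.43s 37.18MB/s
2*512 16384 0.17s 92.04MB/s 0.46s 34.79MB/s
4*512 8192 0.16s 101.88MB/s 0.43s 37.26MB/s
8*512 4096 0.16s 99.86MB/s 0.44s 36.41MB/s
16*512 2048 0.16s 100.11MB/s 0.50s 32.03MB/s
32*512 1024 0.26s 61.71MB/s 0.46s 34.79MB/s
64*512 512 0.22s 71.45MB/s 0.45s 35.41MB/s
128*512 256 0.21s 77.84MB/s 0.51s 31.34MB/s
256*512 128 0.19s 82.47MB/s 0.43s 37.22MB/s
512*512 64 0.18s 87.77MB/s 0.49s 32.69MB/s
1024*512 32 0.18s 89.24MB/s 0.47s 34.02MB/s
2048*512 16 0.17s 91.81MB/s 0.30s 53.41MB/s
4096*512 8 0.16s 100.56MB/s 0.42s 38.07MB/s
8192*512 4 0.82s 19.56MB/s 0.80s 19.95MB/s
16384*512 2 0.82s 19.63MB/s 0.95s 16.80MB/s
32768*512 1 0.81s 19.69MB/s 0.96s 16.64MB/s
EOF
```

Note that the averages are dominated by the small-count rows; the three largest block sizes are equally slow on both releases, so the regression is in the rows above them.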
Re: vm.kmem_size settings doesn't affect loader?
Jeremy Chadwick wrote: On Thu, Sep 25, 2008 at 04:14:02PM +0200, Bartosz Stec wrote: Your options are: 1) Consider increasing it from 512M to something like 1.5GB; do not increase it past that on RELENG_7, as there isn't support for more than 2GB total. For example, on a 1GB memory machine, I often recommend 768M. On 2GB machines, 1536M. You will need to run -CURRENT if you want more. 2) Tune ZFS aggressively. Start by setting vfs.zfs.arc_min="16M" and vfs.zfs.arc_max="64M". If your machine has some small amount of memory (768MB, 1GB, etc.), then you probably shouldn't be using ZFS. Problem occured on i386 machine with 1GB of memory and 7.1-pre (3HDD, 40GB, RAIDZ1). I know that i386 is highly unrecommended for ZFS, but it's just a home box for testing and learning purposes - I just want to know what I'm doing and what should I expect when I decide to put ZFS on server machines :) Currently, from posts on freebsd-fs, I conclude that even with a gigs of kmem and using AMD64, we still can experience panic from kmem_malloc. The i386 vs. amd64 argument is bogus, if you ask me. ZFS works on both. amd64 is recommended because ZFS contains code that makes heavy use of 64-bit values, and because amd64 offers large amounts of addressed memory without disgusting hacks like PAE. That said -- yes, even with "gigs of kmem and using AMD64", you can still panic due to kmem exhaustion. I have fairly decent experience with this problem, because it haunted me for quite some time. A large portion of the problem is that kmem_max, on i386 and amd64 (yes, you read that right) has a 2GB limit on RELENG_7. I repeat: a 2GB limit, regardless of i386 or amd64. This limit has been increased to 512GB on CURRENT, but there are no plans to MFC those changes, as they are too major. Let me tell you something I did this weekend. 
I had to copy literally 200GB of data from a ZFS raidz1 pool (spread across 3 disks) to two different places: 1) a UFS2 filesystem on a different disk, and 2) across a gigE network to a Windows machine. I had to do this because I was adding a disk to the vdev, which cannot be done without re-creating the pool (this is a known problem with ZFS, and has nothing to do with FreeBSD). The machine hosting the data runs RELENG_7 with amd64, and contains 4GB of memory. However, I've accomplished the same task with only 2GB of memory as well. These are the tuning settings I use: vm.kmem_size="1536M" vm.kmem_size_max="1536M" vfs.zfs.arc_min="16M" vfs.zfs.arc_max="64M" The entire copying process took almost 2 hours. Not once did I experience kmem exhaustion. I can *guarantee* that I would have crashed the box numerous times had I not tuned the machine with the values above. Manual tuning is hard for me because I'm not familiar with BSD kernel code nor kernel memory management. I'm just an end-user who love concepts of ZFS and wait for it to be (more) stable. Of course I've followed tuning guide carefully. I'm an "experienced" end-user who has very little experience with BSD kernel code and absolutely no experience with kernel memory management. Proper tuning is all that's needed, regardless of your knowledge set. Please try installing 2GB of memory in your i386 box, and then use the exact loader.conf values I specified above. Thank you for hints. Yesterday I've added 512 MB memory to box (sum 1,5GB), and set vm.kmem_size and vm.kmem_size to "1024M". With pieces of 1024MB, 512MB, 256MB, 256MB available and 3 memory slots it is hard to have 2GB RAM ;) Until now it survived world cleaning/building/installing/bonnie++ benchmarkink/fs scrubing and general usage. Memory usage seems stable. If unfortunately kmem exhaustion will happen again I will experiment with ARC settings. 
IMHO you've kindly explained a lot of ZFS tuning concerns in this thread, and they should be added to the tuning guide, especially the explanation of the ARC and prefetch settings. Thanks again!

-- 
Bartosz Stec
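The sizing advice quoted in this thread (768M on a 1GB machine, 1536M on a 2GB machine, and nothing larger on RELENG_7) can be turned into a small helper that prints candidate loader.conf lines. This is only a sketch: the three-quarters-of-RAM rule is an extrapolation from those two data points, not something stated in the thread, and suggest_kmem is a hypothetical name. The ARC values are the ones quoted above.

```sh
#!/bin/sh
# suggest_kmem RAM_MB: print candidate /boot/loader.conf lines.
# Assumption: kmem = 3/4 of RAM, capped at 1536M (the RELENG_7 ceiling
# recommended in the thread).
suggest_kmem() {
    ram_mb=$1
    kmem_mb=$((ram_mb * 3 / 4))
    if [ "$kmem_mb" -gt 1536 ]; then
        kmem_mb=1536
    fi
    printf 'vm.kmem_size="%sM"\n' "$kmem_mb"
    printf 'vm.kmem_size_max="%sM"\n' "$kmem_mb"
    printf 'vfs.zfs.arc_min="16M"\n'
    printf 'vfs.zfs.arc_max="64M"\n'
}

# Example: the 4GB amd64 machine described above hits the cap.
suggest_kmem 4096
```

The output is meant to be reviewed by hand before it goes anywhere near /boot/loader.conf; for the 1.5GB box in this message the rule gives a slightly higher figure than the 1024M actually chosen.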
Re: bad NFS/UDP performance
On Fri, Sep 26, 2008 at 10:04:16AM +0300, Danny Braniss wrote:
> Hi,
> There seems to be some serious degradation in performance.
> Under 7.0 I get about 90 MB/s (on write), while, on the same machine
> under 7.1 it drops to 20!
> Any ideas?

1) Network card driver changes,

2) This could be relevant, but rwatson@ will need to help determine that.

http://lists.freebsd.org/pipermail/freebsd-stable/2008-September/045109.html

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
Re: bad NFS/UDP performance
> There seems to be some serious degradation in performance.
> Under 7.0 I get about 90 MB/s (on write), while, on the same machine
> under 7.1 it drops to 20!
> Any ideas?

Can you compare performance with tcp?

-- 
regards
Claus

When lenity and cruelty play for a kingdom,
the gentler gamester is the soonest winner.

Shakespeare
bad NFS/UDP performance
Hi,
There seems to be some serious degradation in performance. Under 7.0 I get about 90 MB/s (on write), while, on the same machine under 7.1 it drops to 20!
Any ideas?

thanks,
danny