Re: 5.4 -> 6.0 buildworld failure
On Mon, 06 Feb 2006 14:13:13 +0100 Markus Buretorp <[EMAIL PROTECTED]> wrote:
> For me, the problem was caused by some stupid envvars I had
> in my shell config. I removed these and the problem was solved.
>
> export INCLUDE_PATH=/usr/include:/usr/local/include
> export C_INCLUDE_PATH=/usr/include:/usr/local/include:/usr/X11R6/include
> export CPLUS_INCLUDE_PATH=$C_INCLUDE_PATH
> export LIBRARY_PATH=/usr/lib:/usr/local/lib
> export LD_LIBRARY_PATH=.

Thanks for the info, Markus. I didn't have anything of the sort in my shell config, but fortunately, after I upgraded to 5.4-RELEASE-p11, I executed a successful buildworld for 6.0-RELEASE.

Cheers!

--
Anthony Chavez http://anthonychavez.org/
mailto:[EMAIL PROTECTED] jabber:[EMAIL PROTECTED]
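For anyone who wants to rule out the same cause before starting a buildworld, here is a minimal Python sketch that reports whether any of the variables Markus listed are set. The variable list is taken only from his message. The names `SUSPECT_VARS`, `find_suspect_vars` and `clean_environment` are made up for illustration, and nothing in this thread confirms that these variables are the only ones that can interfere with a build.

```python
import os

# Variable names copied from Markus's message; this list is an assumption,
# not an exhaustive catalogue of variables that can disturb a build.
SUSPECT_VARS = (
    "INCLUDE_PATH",
    "C_INCLUDE_PATH",
    "CPLUS_INCLUDE_PATH",
    "LIBRARY_PATH",
    "LD_LIBRARY_PATH",
)


def find_suspect_vars(environ=None):
    """Return {name: value} for every suspect variable that is set."""
    if environ is None:
        environ = os.environ
    return {name: environ[name] for name in SUSPECT_VARS if name in environ}


def clean_environment(environ=None):
    """Return a copy of the environment without the suspect variables."""
    if environ is None:
        environ = os.environ
    return {k: v for k, v in environ.items() if k not in SUSPECT_VARS}


if __name__ == "__main__":
    found = find_suspect_vars()
    if found:
        print("Set in this shell (consider unsetting before buildworld):")
        for name, value in found.items():
            print(f"  {name}={value}")
    else:
        print("None of the listed variables are set.")
```

The dictionary returned by `clean_environment()` could be passed as the `env` argument of `subprocess.run` to start the build from a scrubbed environment, which avoids editing the shell config just to test the theory.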
Re: 5.4 -> 6.0 buildworld failure
Markus and freebsd-stable:

I have encountered a situation exactly the same as the one described in this thread when attempting a source upgrade from 5.4-RELEASE-p4 to 6.0-RELEASE-p4.

I have had nothing but success on 10 other machines that were initially running 5.4-RELEASE, 5.4-RELEASE-p6, and -p8. Each of these machines has a unique hardware configuration, and the one that fails to buildworld is no exception.

I have tried an empty /etc/make.conf as well as specifically including "CFLAGS=-O -pipe" therein. I have also tried a default /etc/profile. The build still fails.

I'm thinking that the best course of action might be to upgrade to 5.4-RELEASE-p11 (which builds successfully), but I'm very interested to know what's causing this error in case my intended course of action doesn't work. The commit logs show no changes for this particular file since well before this problem was reported, so the problem must lie somewhere else in the source tree.

Any ideas what could be causing this? Am I on the right track? I have included my dmesg.boot below.

Cheers!

On Sat, 05 Nov 2005 23:49:34 +0100 Markus Buretorp <[EMAIL PROTECTED]> wrote:
> Peter Jeremy wrote:
>
>>On Sat, 2005-Nov-05 21:17:58 +0100, Markus Buretorp wrote:
>>
>>> I'm trying to upgrade from FreeBSD 5.4-STABLE to 6.0. I've done a
>>> cvsup to RELENG_6 and RELENG_6_0, I've run make cleanworld, make
>>> clean, rm -rf /usr/obj/*, etc; but nothing helps.
>>>
>>>...
>>>
>>> /usr/src/lib/libkvm/kvm_proc.c:108: error: storage size of 't_cdev'
>>> isn't known
>>
>>Where is this error occurring during the buildworld? (What are the
>>latest lines beginning '>>>' and '===>')
>>What non-standard bits do you have in your command line, /etc/make.conf
>>or MAKEOBJDIRPREFIX?
>
> >>> stage 4.2: building libraries
> ...
> ===> lib/libkvm (depend,all,install)
>
> make.conf:
>
> WITHOUT_X11=yes
> CPUTYPE?=athlon-xp
> CFLAGS=-O2 -pipe
> COPTFLAGS=-O -pipe
> # added by use.perl 2005-06-24 23:01:50
> PERL_VER=5.8.7
> PERL_VERSION=5.8.7
>
> Note, I've tried without the first four lines.
>
> $ cd lib/libkvm
> /usr/src [EMAIL PROTECTED]
> $ make
> ...r/src/lib/libkvm [EMAIL PROTECTED]
> cc -O -pipe -DLIBC_SCCS -I/usr/src/lib/libkvm -c
> /usr/src/lib/libkvm/kvm_proc.c
> /usr/src/lib/libkvm/kvm_proc.c: In function `kvm_proclist':
> /usr/src/lib/libkvm/kvm_proc.c:108: error: storage size of 't_cdev'
> isn't known
> /usr/src/lib/libkvm/kvm_proc.c:114: error: storage size of 'pr'
> isn't known
> /usr/src/lib/libkvm/kvm_proc.c:176: error: structure has no member
> named `ki_jid'
> /usr/src/lib/libkvm/kvm_proc.c:377: error: structure has no member
> named `p_rux'
> *** Error code 1
>
> Stop in /usr/src/lib/libkvm.
>
> I found this, http://www.freebsd.org/cgi/query-pr.cgi?pr=77821 , but
> it doesn't help me much. I don't know what I've done. I've used cvsup
> and buildworld several times.

--
Anthony Chavez http://anthonychavez.org/
mailto:[EMAIL PROTECTED] jabber:[EMAIL PROTECTED]

--8<---cut here---start->8---
Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.4-RELEASE-p4 #1: Sun Sep 11 20:13:50 MDT 2005
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/MYBOX
WARNING: debug.mpsafenet forced to 0 as ipsec requires Giant
WARNING: MPSAFE network stack disabled, expect reduced performance.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz (3010.67-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0xf41 Stepping = 1
  Features=0xbfebfbff
  Hyperthreading: 2 logical CPUs
real memory = 1073414144 (1023 MB)
avail memory = 1040855040 (992 MB)
ACPI APIC Table:
ioapic0 irqs 0-23 on motherboard
npx0: on motherboard
npx0: INT 16 interface
acpi0: on motherboard
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi0: Power Button (fixed)
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: on acpi0
acpi_throttle0: on cpu0
pcib0
Re: Stress testing and TIMEOUT - WRITE_DMA
On Mon, 12 Sep 2005 08:19:18 +0200 martin hudec <[EMAIL PROTECTED]> wrote:
> On Sun, Sep 11, 2005 at 10:33:47PM +0200 or thereabouts, Daniel Gerzo wrote:
>> On Fri, 26 Aug 2005 03:21:35 -0600 Anthony Chavez <[EMAIL PROTECTED]>
>> wrote:
>> > Sep 6 11:35:27 mybox kernel: ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=8348191
>> > ...
>> > Sep 6 18:59:09 mybox kernel: ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=8348383
>> > Sep 6 19:04:58 mybox kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=61749183
>>
>> > The READ_DMA timeouts are happening very infrequently, but it's worth
>> > mentioning that I'm seeing them now in addition.
>>
>> > This is quite disturbing, particularly when the machine in question is
>> > *in*production.*
>>
>> I think you should back up your data really quickly. When
>> I was seeing this kind of message last time, my disk died 3
>> days after they started showing up in my log files. I wasn't able
>> to write any data to the disk (the system just suddenly panicked when
>> I tried to mount it rw, but I was able to mount it ro and copy most of
>> the data). Note that I wasn't able to copy about 10GB out of 30GB. So
>> don't ignore them, and good luck.
>
> Hmmm, before trashing that disk, you could surely consider running
> smartmontools to see what they have to say about health condition of
> your disk :).. go for sysutils/smartmontools.

Okay, I've actually got 3 identical drives (SAMSUNG SP0802N) in 3 identical systems, running identical hardware using Intel ICH4 controllers. Only one of these machines managed to spit 81 errors at me over a period of about 6.5 hours (so far). This particular machine produced the warnings approximately 8 days after installing FreeBSD.

Ironically, another one of these machines produced only 1 warning after nearly 21 days and then another solitary warning 14 days after that (which occurred as I was drafting this response).

smartctl reports that each of these drives passes the "SMART overall-health self-assessment test" but goes on to report that exactly 6 "SET MAX ADDRESS [OBS-6]" errors occur for each drive within 1 hour of uptime. I do not think that any of these errors occurred at the same time the DMA warnings did.

> After that can one make assumptions whether it is faulty hardware or
> ata patches :).

Well, the drives are pretty much brand new. I think that it's safe to assume that the health of these drives is not a concern, and smartctl seems to confirm this.

On Mon, 12 Sep 2005 15:53:27 +0200 MaXX <[EMAIL PROTECTED]> wrote:
> On Fri, 26 Aug 2005 03:21:35 -0600 Anthony Chavez <[EMAIL PROTECTED]>
> wrote:
>> My question is simply this: is the fact that I received 4 TIMEOUT
>> warnings in the space of roughly 2 weeks significant cause for concern?
> Hi,
> You may have a look at this PR: 85603 (FS corruption and 'uncorrectable' DMA
> errors on ATA disks after unclean shutdown) and see if that applies to you.

Thanks. My hardware doesn't match, but I'll keep it in mind.

> Are you running a kernel built around mid June this year?

The machine that gave me 81 warnings after applying ata-mk3n:

FreeBSD 5.4-RELEASE-p6 #0: Sun Sep 11 21:57:16 MDT 2005
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/MYBOX1

The machine that's been in commission the longest:

FreeBSD 5.4-RELEASE #0: Sun Sep 11 21:46:18 MDT 2005
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/MYBOX2

New kid on the block:

FreeBSD 5.4-RELEASE-p6 #0: Sun Sep 11 21:58:08 MDT 2005
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/MYBOX3

FWIW, although they have different names, the kernel configs are exactly the same.

> Did your machine panic before the DMA problems appeared (I think a power
> failure can do the trick too)?

No panic. However, I recall reading that these warnings are a good indication that a panic may be imminent, hence my call for help.

> In our case this problem was fixed by newfs, even smartctl
> (sysutils/smartmontools) did report errors at the drive level. After newfs'ing
> the disk no more messages (but they are still in the drive's log).

That seems very strange, particularly since I newfs'ed the disks when installing FreeBSD. Furthermore, this solution is not sufficient. The machines that are giving me this error are in crucial locations, and I need to know what causes these errors, whether a fix is available, and whether I really should worry about a few popping up now and then.

--
Anthony Chavez http://anthonychavez.org/
mailto:[EMAIL PROTECTED] jabber:[EMAIL PROTECTED]
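Since the discussion turns on how many warnings each machine produced and of which kind, a small tally script may help when comparing boxes. The sketch below is only an illustration: the regular expression is modelled on the three log lines quoted in this thread, and `TIMEOUT_RE`, `count_timeouts` and `SAMPLE` are hypothetical names, not part of any FreeBSD tool. Other releases may word the message differently.

```python
import re
from collections import Counter

# Pattern modelled on the kernel messages quoted in this thread (an assumption;
# the exact wording may differ on other FreeBSD versions).
TIMEOUT_RE = re.compile(
    r"kernel: (?P<dev>\w+): TIMEOUT - (?P<cmd>\w+) retrying "
    r"\((?P<retries>\d+) retr(?:y|ies) left\) LBA=(?P<lba>\d+)"
)


def count_timeouts(lines):
    """Count TIMEOUT warnings per (device, command) pair."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("dev"), match.group("cmd"))] += 1
    return counts


# The three log lines quoted in the thread, used as sample input.
SAMPLE = [
    "Sep 6 11:35:27 mybox kernel: ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=8348191",
    "Sep 6 18:59:09 mybox kernel: ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=8348383",
    "Sep 6 19:04:58 mybox kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=61749183",
]

if __name__ == "__main__":
    for (dev, cmd), n in sorted(count_timeouts(SAMPLE).items()):
        print(f"{dev} {cmd}: {n}")
```

To run it against a real log, replace `SAMPLE` with an open file object, for example `open("/var/log/messages")`, assuming that is where syslog writes kernel messages on the machine in question.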
Re: Stress testing and TIMEOUT - WRITE_DMA
On Sun, 11 Sep 2005 23:02:43 +0200 Matthias Buelow <[EMAIL PROTECTED]> wrote:
> Anthony Chavez wrote:
>
>> Sep 6 11:35:27 mybox kernel: ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=8348191
> [...]
>> Has anyone who has experienced this pain found solace in 5-STABLE's ATA
>> drivers?
>
> Is this with the ATA mkIII patches?

As I mentioned in my first post to -questions, the system in question is tracking RELENG_5 and is currently at version 5.4-RELEASE-p6. I have applied Soeren's mkIII revision n patchset, available at http://people.freebsd.org/~sos/ATA/, and I'm still seeing the messages, although *much* less frequently than before applying the patches.

The question I have is: should I revert to an unaffected 5.x-RELEASE (which version would that be?) or should I consider tracking 5-STABLE instead?

> I assume you're acquainted with the ATA DMA timeout discussions of the
> last couple months concerning 5.x.

Yes, I have read through the discussions. Is it safe yet to assume that the issues (at least for the ICH controllers) have been fixed in -CURRENT?

Thanks.

--
Anthony Chavez http://anthonychavez.org/
mailto:[EMAIL PROTECTED] jabber:[EMAIL PROTECTED]
Re: Stress testing and TIMEOUT - WRITE_DMA
I'm not seeing much in the way of responses to this post from freebsd-questions, so I thought I'd take it to freebsd-stable, where it is probably more relevant. ;-) Please see my original thread on freebsd-questions for context.

On Fri, 26 Aug 2005 03:21:35 -0600 Anthony Chavez <[EMAIL PROTECTED]> wrote:
> My question is simply this: is the fact that I received 4 TIMEOUT
> warnings in the space of roughly 2 weeks significant cause for concern?

Apparently, the fact that the stress tool produced so few warnings may have given me a false sense of security. I'm being treated to the following messages (81 in total) today, after 8 days uptime:

Sep 6 11:35:27 mybox kernel: ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=8348191
...
Sep 6 18:59:09 mybox kernel: ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=8348383
Sep 6 19:04:58 mybox kernel: ad0: TIMEOUT - READ_DMA retrying (2 retries left) LBA=61749183

The READ_DMA timeouts are happening very infrequently, but it's worth mentioning that I'm seeing them now in addition.

This is quite disturbing, particularly when the machine in question is *in*production.*

Has anyone who has experienced this pain found solace in 5-STABLE's ATA drivers?

dmesg below.

--
Anthony Chavez http://anthonychavez.org/
mailto:[EMAIL PROTECTED] jabber:[EMAIL PROTECTED]

Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.4-RELEASE-p6 #0: Fri Aug 26 02:23:19 MDT 2005
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC
ACPI APIC Table:
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Celeron(R) CPU 2.40GHz (2392.25-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0xf29 Stepping = 9
  Features=0xbfebfbff
real memory = 266813440 (254 MB)
avail memory = 251445248 (239 MB)
ioapic0: Changing APIC ID to 1
ioapic0 irqs 0-23 on motherboard
npx0: on motherboard
npx0: INT 16 interface
acpi0: on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: on acpi0
acpi_button0: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
agp0: mem 0xfeb8-0xfebf,0xe800-0xefff irq 16 at device 2.0 on pci0
agp0: detected 892k stolen memory
agp0: aperture size is 128M
uhci0: port 0xff80-0xff9f irq 16 at device 29.0 on pci0
usb0: on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0xff60-0xff7f irq 19 at device 29.1 on pci0
usb1: on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: port 0xff40-0xff5f irq 18 at device 29.2 on pci0
usb2: on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
pci0: at device 29.7 (no driver attached)
pcib1: at device 30.0 on pci0
pci1: on pcib1
pci1: at device 5.0 (no driver attached)
xl0: <3Com 3c900-TPO Etherlink XL> port 0xddc0-0xddff irq 18 at device 6.0 on pci1
xl0: selecting 10baseT transceiver, half duplex
xl0: Ethernet address: 00:60:97:74:a8:6d
bfe0: mem 0xfe9fe000-0xfe9f irq 17 at device 9.0 on pci1
miibus0: on bfe0
bmtphy0: on miibus0
bmtphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
bfe0: Ethernet address: 00:12:3f:d4:21:75
isab0: at device 31.0 on pci0
isa0: on isab0
atapci0: port 0xffa0-0xffaf,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 irq 18 at device 31.1 on pci0
ata0: on atapci0
ata1: on atapci0
pci0: at device 31.3 (no driver attached)
pci0: at device 31.5 (no driver attached)
fdc0: port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
atkbdc0: port 0x64,0x60 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
ppc0: port 0x778-0x77f,0x378-0x37f irq 7 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0: on ppc0
plip0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
orm0: at iomem 0xcd000-0xc,0xcb800-0xccfff,0xc-0xcb7ff on isa0
pmtimer0 on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: at port 0x3c0-0x3df iomem 0xa-0xb on isa0
Timecounter "TSC" frequency 2392248384 Hz quality 800
Timecounters tick every 10.000 msec
ad0: 76293MB at ata0-master UDMA100
acd0: CDROM at ata1-master UDMA33
ATA PseudoRAID loaded
Mounting root from ufs:/dev/ad0s1a