VFS_BIO_DEBUG and 4.11
Hi-diddly-ho!

Is VFS_BIO_DEBUG still supposed to work in 4.11? I'm trying to debug a data corruption problem that could be a bug in the cd9660 file system and thought that enabling VFS_BIO_DEBUG might help. Instead it complains a lot about directories and character devices being VMIO'd nowadays, then panics with "biodone: zero vnode ref count" before it even finishes booting.

I have reason to believe this was a useful flag back in 4.4 (because I saw a kernel config from Matt Dillon that included it), but have not found any evidence of use more recent than that. So, is it obsolete now? Or is it just only a little bit broken? I don't (yet) understand the invariants it is trying to enforce, so perhaps none of them apply any more.

Stephen.

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
Winbind NT domain authentication
Hi list,

Sorry for the cross-post; I'm not sure which list is the better fit, as my question touches Samba, configuration, and FreeBSD.

I'm trying to configure NT domain authentication on FreeBSD 5.4 with Samba 3.0.12 (installed from the ports collection). I've followed the Samba 3 HOWTO and have managed the following:

wbinfo -g correctly returns the domain groups.
wbinfo -u returns all the users (including those from the domain).
ntlm_auth authenticates the user correctly:

    ntlm_auth --username=usr1
    password:
    NT_STATUS_OK: Success (0x0)

and in the winbind log I get:

    rpc: trusted_domains
    [ 3141]: request interface version
    [ 3141]: request location of privileged pipe
    [ 3141]: request domain name
    [ 3141]: request misc info
    [ 3141]: pam auth MYDOMAIN\usr1
    rpc_dc_name: Returning DC PASSV_SERV (_the_ip_) for domain MYDOMAIN
    IPC$ connections done anonymously
    Connecting to host=PASSV_SERV
    Connecting to _the_ip_ at port 445

I suspect this means that my samba/winbind configuration is correct. The trouble is that I still can't log in (login or ssh) with usernames from the domain. If I try with MYDOMAIN\usr1 I just get an Access Denied. Worse, I'm not sure that I'm looking for the logs in the right place: neither auth.log nor messages shows any trace of winbind being called.

My smb.conf:

    workgroup = MYDOMAIN
    netbios name = MY_BSD
    password server = passwd_serv_ip
    security = domain
    encrypt passwords = yes
    #passdb backend = tdbsam guest
    server string = MY_BSD Samba Server
    # separate domain and username with '\', like DOMAIN\username
    winbind separator = \\
    # use uids from 1 to 2 for domain users
    idmap uid = 1-2
    # use gids from 1 to 2 for domain groups
    idmap gid = 1-2
    # allow enumeration of winbind users and groups
    winbind enum users = yes
    winbind enum groups = yes
    # give winbind users a real shell (only needed if they have telnet access)
    template homedir = /home/winnt/%D%U
    template shell = /usr/local/bin/bash

My nsswitch.conf:

    group: compat winbind
    group_compat: nis
    hosts: files dns winbind
    networks: files
    passwd: compat winbind
    passwd_compat: nis
    shells: files

and finally my /etc/pam.d/sshd:

    # auth
    auth      required    pam_nologin.so       no_warn
    #auth     sufficient  pam_opie.so          no_warn no_fake_prompts
    #auth     requisite   pam_opieaccess.so    no_warn allow_local
    #auth     sufficient  pam_krb5.so          no_warn try_first_pass
    #auth     sufficient  pam_ssh.so           no_warn try_first_pass
    #auth     required    pam_unix.so          no_warn try_first_pass
    #tfa
    auth      sufficient  pam_winbind.so       debug try_first_pass
    auth      sufficient  pam_unix.so          no_warn try_first_pass

    # account
    #account  required    pam_krb5.so
    account   required    pam_login_access.so
    account   sufficient  pam_winbind.so       debug
    account   sufficient  pam_unix.so

    # session
    #session  optional    pam_ssh.so
    session   required    pam_permit.so

    # password
    #password sufficient  pam_krb5.so          no_warn try_first_pass
    password  sufficient  pam_winbind.so       debug try_first_pass
    password  sufficient  pam_unix.so          no_warn try_first_pass

I hope this question is not silly, but for NT authentication alone it is not necessary to run smbd/nmbd, is it? Winbind should do the job.

This is the 2nd week I have been trying to set this up, and it is one of the most frustrating experiences ever... Can anybody give me some hints (other than going to a psychiatrist)?

Thomas
Portupgrade in Xfree86 pkg failed
ln -s /usr/ports/graphics/xfree86-dri/work/xc/programs/Xserver/hw/xfree86/os-support/linux/drm/xf86drmRandom.c xf86drmRandom.c
rm -f xf86drmSL.c
ln -s /usr/ports/graphics/xfree86-dri/work/xc/programs/Xserver/hw/xfree86/os-support/linux/drm/xf86drmSL.c xf86drmSL.c
make: don't know how to make /drm.h. Stop
*** Error code 2

Stop in /usr/ports/graphics/xfree86-dri/work/xc/lib/GL.
*** Error code 1

Stop in /usr/ports/graphics/xfree86-dri.

--
Yours Sincerely
Shinjii
http://www.shinji.nq.nu
Re: Portupgrade in Xfree86 pkg failed
On Fri, 24 Jun 2005 20:35, Warren wrote:
> ln -s /usr/ports/graphics/xfree86-dri/work/xc/programs/Xserver/hw/xfree86/os-support/linux/drm/xf86drmRandom.c xf86drmRandom.c
> rm -f xf86drmSL.c
> ln -s /usr/ports/graphics/xfree86-dri/work/xc/programs/Xserver/hw/xfree86/os-support/linux/drm/xf86drmSL.c xf86drmSL.c
> make: don't know how to make /drm.h. Stop
> *** Error code 2
>
> Stop in /usr/ports/graphics/xfree86-dri/work/xc/lib/GL.
> *** Error code 1
>
> Stop in /usr/ports/graphics/xfree86-dri.

What command did you run?
What version of FreeBSD are you running?
When did you last cvsup your ports tree?
Did you read /usr/ports/UPDATING?

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C
Re: Portupgrade in Xfree86 pkg failed
On Fri, 24 Jun 2005 9:11 pm, Daniel O'Connor wrote:
> On Fri, 24 Jun 2005 20:35, Warren wrote:
> > ln -s /usr/ports/graphics/xfree86-dri/work/xc/programs/Xserver/hw/xfree86/os-support/linux/drm/xf86drmRandom.c xf86drmRandom.c
> > rm -f xf86drmSL.c
> > ln -s /usr/ports/graphics/xfree86-dri/work/xc/programs/Xserver/hw/xfree86/os-support/linux/drm/xf86drmSL.c xf86drmSL.c
> > make: don't know how to make /drm.h. Stop
> > *** Error code 2
> >
> > Stop in /usr/ports/graphics/xfree86-dri/work/xc/lib/GL.
> > *** Error code 1
> >
> > Stop in /usr/ports/graphics/xfree86-dri.
>
> What command did you run?

portupgrade -aDk -m BATCH=yes

> What version of FreeBSD are you running?

5.4-STABLE

> When did you last cvsup your ports tree?

Just before running portupgrade, before sending the 1st email.

> Did you read /usr/ports/UPDATING?

Can't say as I did.

--
Yours Sincerely
Shinjii
http://www.shinji.nq.nu
Simultaneously two versions of one port?
Dear All,

I'm trying to install gdesklets, which depends on py24-orbit. However, py23-orbit is already installed. I googled and tried various ways with portupgrade et al. What to do?

Thanks
Tom

[EMAIL PROTECTED] gdesklets]# make install
=== gdesklets-0.34.3 depends on file: /usr/local/bin/python - found
---snip---
=== Installing for py24-orbit-2.0.1_1
=== py24-orbit-2.0.1_1 depends on file: /usr/local/bin/python2.4 - found
=== py24-orbit-2.0.1_1 depends on executable: pkg-config - found
=== py24-orbit-2.0.1_1 depends on shared library: glib-2.0.600 - found
=== py24-orbit-2.0.1_1 depends on shared library: IDL-2.0 - found
=== py24-orbit-2.0.1_1 depends on shared library: ORBit-2.0 - found
=== Generating temporary packing list
=== Checking if devel/py-orbit2 already installed
=== An older version of devel/py-orbit2 is already installed (py23-orbit-2.0.1_1)
You may wish to ``make deinstall'' and install this port again
by ``make reinstall'' to upgrade it properly.
If you really wish to overwrite the old port of devel/py-orbit2
without deleting it first, set the variable FORCE_PKG_REGISTER
in your environment or the make install command line.
*** Error code 1

Stop in /usr/ports/devel/py-orbit2.
*** Error code 1

Stop in /usr/ports/x11-toolkits/py-gnome2.
*** Error code 1

Stop in /usr/ports/deskutils/gdesklets.

--
-- Which is worse: ignorance or apathy?
-- Don't know. Don't care.
Re: Simultaneously two versions of one port?
Dear Thom,

What I do when something like this happens is one of two things:

Either 'cd' to the port that is causing the problem (devel/py-orbit2 in this case) and type 'make deinstall', then go back to your original port and issue a make again,

or 'cd' to the port that is causing the problem (devel/py-orbit2 in this case) and type 'make -DFORCE_PKG_REGISTER install clean'. This forces an upgrade of the troublesome port, but may result in a double-registered port. This can be checked by issuing 'pkgdb -F'.

Good luck,
Pascal

On Fri, 24 Jun 2005 14:17:29 +0200, Thomas Beer wrote
> Dear All,
>
> I'm trying to install gdesklets which depends on py24-orbit. However,
> py23-orbit is already installed. I googled and tried various ways with
> portupgrade et al. What to do?
>
> [...]
> === Checking if devel/py-orbit2 already installed
> === An older version of devel/py-orbit2 is already installed (py23-orbit-2.0.1_1)
> [...]
Data corruption in cd9660 on FreeBSD 4.11?
Hi!

I'm experiencing data corruption when reading CDs and DVDs on FreeBSD 4.11. My best theory so far is that cd9660 or perhaps the VFS layer is mishandling 2048-byte buffers (since they are smaller than one virtual memory page), occasionally writing them to the wrong location in RAM. Read on for why I think so.

First up, I don't think this is the usual hardware problem since the machine has done huge numbers of buildworlds (in 4.x and -current) without any of the telltale signs (e.g. bus errors and segmentation violations). There are no error messages in /var/log/messages. Also, it moonlights as a games machine and plays Doom 3, Battlefield 1942, Neverwinter Nights and so forth like a champ. Memory, cpu, video, disk, networking are all just fine 100% of the time.

The hardware is an ASUS P4P800 mobo (including onboard Marvell Yukon gigabit ethernet) with a P4 2.8GHz cpu, 1GB RAM, Maxtor 120GB disk, Pioneer 103S DVD-ROM, LiteOn SOHW-1673S DVD burner in an Antec Sonata case.

Now that I have a DVD burner, I make backups of my main machines (over NFS) but have found that they often don't verify as 100% correct. The symptom is that, for some files, an entire 2048-byte DVD sector is replaced with different (non-zero) data. This occurs both when reading with the Pioneer DVD-ROM and when reading with the LiteOn burner (though I don't test with the Pioneer much as it is slower).

I emphasise that all burns have been 100% correct (i.e. the burning process worked and this can be verified by reading on, say, my iBook), so all of the hardware seems to be operating correctly (and swiftly, I might add). The problem is that reading the iso9660 file system is not safe.

After some experimenting, I've found that the problem also occurs when reading CDs, and I built a test CD (of photos of a recent wedding) and in testing I read this CD over and over. I compare the CD with the original files (via NFS) using diff. When diff finds a difference, I save copies of the differing files before they can be flushed from the cache.

I have calculated checksums for all 2048-byte blocks on the CD, so I can know if any given block of 2048 bytes came from the CD and if so which file it came from. In all cases so far, the 2048-byte error has been a block from another file, not a random corruption.

I am starting to believe that, under high load, the cd9660 file system code tells the ata driver to put a 2K block in the wrong spot in memory, leaving some old junk in the gap in the file being read, and blasting some other 2K block of memory. It may not be cd9660 code per se that is wrong, but a bug in the complex buffer handling code (getblk, getnewbuf, allocbuf, etc).

Why do I believe it is writing to the wrong memory, rather than any number of other flaws? In two runs (out of many), unusual things occurred that are consistent with memory being overwritten, rather than, say, a 2K block just not being read at all: in one, an innocent sshd core-dumped (which is something that has never happened except when running my cd9660 tests), and in another, a previously OK cached NFS file became corrupted.

Explaining that last case further: I had been running a test script that would mount the CD, compare files, unmount the CD, and repeat. This meant that the NFS copy of the files was read over and over and hence became memory resident (there being enough space in 1GB of RAM for one copy of the files, plus my normal programs). Several tests passed without fault (hence all the NFS files were cached and correct), when suddenly there were multiple corruptions; call them file A and file B. File A was the usual corruption where a 2K block of another file was unexpectedly present in the copy read from the CD, but in file B it was the NFS file that was wrong. In fact it contained the missing block from file A! In short, the fully memory resident NFS file B had been corrupted by reading file A from the CD.

It's been pretty interesting hunting this problem, but now I'm sort of stuck. I believe that some 2K reads from DVDs and CDs end up in the wrong place in RAM, but I can't find where this happens in the code (it's pretty hard to work out just by reading it), and I can't rule out the possibility that there's a hardware error here that I've just never run across before.

So, can anyone suggest any more tests I could try? Or is there a kind of hardware fault that could cause this substitution of whole blocks read from CDs without causing any other problems? And does anyone know of any commits made anywhere in the 5 years since 4.x split off from 5.x that may be relevant? Yep. 5 years. I have started looking, but there's a fair bit of stuff in there...

Stephen.
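The per-block checksum technique described in this message (checksum every 2048-byte block, then look up where a foreign block really came from) could be sketched roughly as follows. This is only an illustration under assumptions, not Stephen's actual script: the function names, the choice of SHA-256, and the idea of indexing the known-good original files rather than the raw disc image are all hypothetical.

```python
import hashlib
from collections import defaultdict

BLOCK = 2048  # ISO 9660 logical sector size discussed in the post


def iter_blocks(path, block=BLOCK):
    """Yield (block_number, data) for each block of a file."""
    with open(path, "rb") as f:
        n = 0
        while True:
            chunk = f.read(block)
            if not chunk:
                break
            yield n, chunk
            n += 1


def build_index(paths, block=BLOCK):
    """Map each block digest to every (path, block_number) where it occurs."""
    index = defaultdict(list)
    for path in paths:
        for n, chunk in iter_blocks(path, block):
            index[hashlib.sha256(chunk).hexdigest()].append((path, n))
    return index


def explain_mismatches(good, suspect, index, block=BLOCK):
    """Compare two copies block by block.

    For each differing block, return (block_number, origins), where origins
    lists the known-good locations holding the data that showed up in the
    suspect copy (an empty list means the data matches no indexed block).
    """
    report = []
    pairs = zip(iter_blocks(good, block), iter_blocks(suspect, block))
    for (n, a), (_, b) in pairs:
        if a != b:
            digest = hashlib.sha256(b).hexdigest()
            report.append((n, list(index.get(digest, []))))
    return report
```

With an index built once over the known-good originals, each file saved after a failed diff can be passed to explain_mismatches(); a non-empty origins list matches the "block from another file" pattern reported above, while an empty one would point to random corruption instead.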
Re: Simultaneously two versions of one port?
> either, 'cd' to the port that is causing the problem, devel/py-orbit2, in
> this case and type 'make deinstall', then go back to your original port
> and issue a make again

What about the existing dependencies?

> or, 'cd' to the port that is causing the problem, devel/py-orbit2, in this
> case and type 'make -DFORCE_PKG_REGISTER install clean'. This forces an
> upgrade of the troublesome port, but may result in a double registered
> port. This can be checked by issuing 'pkgdb -F'
>
> Good luck,
> Pascal
>
> On Fri, 24 Jun 2005 14:17:29 +0200, Thomas Beer wrote
> > I'm trying to install gdesklets which depends on py24-orbit. However,
> > py23-orbit is already installed.
> > [...]

--
-- Which is worse: ignorance or apathy?
-- Don't know. Don't care.
EHCI: mtools stuck in state 'physrd' or panic
Hi,

I just updated to this morning's RELENG_5 and thought I'd give USB 2.0 a try to speed up data exchange with my USB sticks. The controller is correctly identified, it seems:

ehci0: VIA VT6202 USB 2.0 controller mem 0xdbfdf700-0xdbfdf7ff irq 3 at device 16.3 on pci0

When plugging in a USB stick, it is correctly identified, too:

umass0: USB Flash Disk, rev 2.00/2.00, addr 2
da2 at umass-sim0 bus 0 target 0 lun 0
da2: USB BAR 2.00 Removable Direct Access SCSI-2 device
da2: 40.000MB/s transfers
da2: 124MB (255744 512 byte sectors: 64H 32S/T 124C)

I can also list the content of the FAT filesystem with mtools' mdir command. When trying to copy a file from the stick to a local filesystem, however, mcopy is almost immediately stuck in state physrd (according to top(1)) after copying a varying number of bytes (between 100 and 2200 KB is what I've seen so far). I cannot kill the mtools process, but pulling out the USB stick helps - it panics after a few times of doing that, though.

I thought it might have to do with IRQ sharing first, but according to vmstat -i and dmesg, ehci0 doesn't share its IRQ with anything else.

I know that the ehci(4) man page says the driver is not finished and quite buggy, but that doesn't mean I shouldn't report a problem, right? ;) Any ideas?

Stefan
Re: Simultaneously two versions of one port?
Thomas Beer in gmane.os.freebsd.stable:
> I'm trying to install gdesklets which depends on py24-orbit. However,
> py23-orbit is already installed. I googled and tryed various ways with
> portupgrade et al. What to do?

Do you still need the Python 2.3 stuff? If not, you could just upgrade all packages installed for 2.3 with 'portupgrade py23*', for instance.

Stefan

--
No reading beyond this point
Re: EHCI: mtools stuck in state 'physrd' or panic
At 08:50 AM 24/06/2005, Stefan Walter wrote:
> I can also list the content of the FAT filesystem with mtools' mdir
> command. When trying to copy a file from the stick to a local filesystem,
> however, mcopy is almost immediately stuck in state physrd (according to
> top(1)) after copying a varying number of bytes (between 100 and 2200 KB
> is what I've seen so far). I cannot kill the mtools process, but pulling
> out the USB stick helps - it panics after a few times of doing that,
> though.

If you reformat the USB stick with UFS2, does IO still hang the box?

---Mike
Re: Portupgrade in Xfree86 pkg failed
On Fri, 24 Jun 2005 20:47, Warren wrote:
> Just before doing PortUpgrade before sending the 1st email
>
> > Did you read /usr/ports/UPDATING?
>
> cant say as i did.

Well that was silly.. Not that I think there is a specific entry in this case, but it is a good habit to get into.

Do you have the kernel source installed? I think you may need that to build the xfree86-dri port (I don't know why it doesn't check).

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C
Re: EHCI: mtools stuck in state 'physrd' or panic
Mike Tancsa in gmane.os.freebsd.stable:
> If you reformat the USB stick with UFS2, does IO still hang the box ?

I haven't tried that, but instead tried to just dump the whole USB stick to a file with

dd if=/dev/da2 of=stickimage bs=1024

The dd process also hung in state physrd eventually, and about a minute after pulling out the USB stick the system panic'd.

Furthermore, I tried the same (both mcopy and dd) on my notebook (Centrino - Intel ICH4 chipset), which didn't have ehci in its kernel until then, either. It worked flawlessly, multiple times.

Stefan

--
No reading beyond this point
Re: ATA DMA timeouts [NOT FIXED] (forget my last mail)
Hi (again),

believe it or not: after sending the last mail and closing firefox, the system hung for a few seconds and spat this out:

ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=226025086

I'm back on the kernel of May 26th again. This time I did not get a corrupted file system, or at least not one corrupted badly enough to panic the kernel, like the first time. Usually my system has DMA timeouts very early while booting up and makes a mess of the file systems.

Sorry for causing noise. :( I'm going to test a kernel every week and report when the problems are gone.

Martin
dmesg queries
Hi All,

I'm trying to figure out a couple of things and would like some advice please.

My server was on FreeBSD 5.3 STABLE #1 this morning (I only took it to STABLE because, at the time, my GigNIC was not fully supported at RELEASE). I upgraded today and it's now looking like this:

% uname -a
FreeBSD venus.rainbow-it.net 5.4-RELEASE-p2 FreeBSD 5.4-RELEASE-p2 #2: Fri Jun 24 13:43:08 BST 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/VENUS i386

It's a lowly Dell PowerEdge 800 (PE800).

Here's what has me writing... looking in /var/run/dmesg I see:

kernel: ioapic0: Changing APIC ID to 2
kernel: ioapic1: Changing APIC ID to 3
kernel: ioapic1: WARNING: intbase 32 != expected base 24
kernel: ioapic0 Version 2.0 irqs 0-23 on motherboard
kernel: ioapic1 Version 2.0 irqs 32-55 on motherboard

Should I be worried by that WARNING?

Also in /var/run/dmesg I see:

kernel: Interrupt storm detected on irq19: uhci0 uhci2; throttling interrupt source
kernel: Interrupt storm detected on irq18: bge0 uhci1+; throttling interrupt source

I think the top device is a Logitech QuickCam Express and the bottom one (using irq18) is my onboard Gigabit NIC. Would these lines in dmesg suggest that there is a problem and, if so, is there anything that I can do to combat it? Is it likely that this 'throttling' is slowing my NIC at all?

I have seen this kind of notification (throttling) when printing. This is the dmesg output for that device:

kernel: ppc0: ECP parallel printer port port 0x778-0x77f,0x378-0x37f irq 7 drq 1 on acpi0
kernel: ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
kernel: ppc0: FIFO with 16/16/8 bytes threshold
kernel: ppbus0: Parallel port bus on ppc0
kernel: ppbus0: IEEE1284 device found /NIBBLE/ECP
kernel: Probing for PnP devices on ppbus0:
kernel: ppbus0: HEWLETT-PACKARD DESKJET 990C PRINTER MLC,PCL,PML
kernel: lpt0: Printer on ppbus0
kernel: lpt0: Interrupt-driven port
kernel: ppi0: Parallel I/O on ppbus0

Now, the upgrade may well have stopped the behavior mentioned below, but if not, does anyone know what I might have done wrong to be getting (a lot of) messages like:

sio0: 1848 more interrupt-level buffer overflows (total 11269)

This is the device:

kernel: sio0: 16550A-compatible COM port port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
kernel: sio0: type 16550A

Finally, is it considered bad form to ask multiple questions like I have, or should I have separated them and sent them in multiple emails?

Kind Regards,
Chris Phillips
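To see at a glance which devices sit on a throttled IRQ, the "Interrupt storm" lines quoted in this message can be grouped per IRQ. The following is a minimal sketch; the function name is hypothetical, the regular expression is written only against the two lines shown here, and stripping the trailing "+" assumes it merely marks a device list that was cut short.

```python
import re
from collections import defaultdict

# Pattern written against the message format quoted in the post.
STORM_RE = re.compile(
    r"Interrupt storm detected on irq(\d+): (.*?); throttling interrupt source"
)


def storms_by_irq(lines):
    """Return {irq_number: sorted device names} for every storm line."""
    found = defaultdict(set)
    for line in lines:
        m = STORM_RE.search(line)
        if m is None:
            continue  # not a storm message
        irq = int(m.group(1))
        # Assumption: a trailing '+' only marks a truncated device list.
        found[irq].update(name.rstrip("+") for name in m.group(2).split())
    return {irq: sorted(names) for irq, names in found.items()}
```

Feeding it the saved dmesg output line by line gives one entry per IRQ, which makes it easier to tell whether the NIC (bge0) is being throttled because of its own traffic or because of a USB controller sharing the line.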
pxeboot, NFS and root-path: bug or documentation error?
I have been setting up a pxeboot jumpstart environment for FreeBSD 4.11, following the instructions at
http://www.freebsd.org/doc/en_US.ISO8859-1/articles/pxe/

Rather than build pxeboot like this:

# rm -rf /usr/obj/*
# cd /usr/src/sys/boot
# make
# cp /usr/src/sys/boot/i386/pxeldr/pxeboot /usr/tftpboot

I just copied /boot/pxeboot from the FreeBSD-4.11 CD-ROM. Otherwise I followed the instructions very closely.

The pxeboot client machine is a Compaq ProLiant DL380. On first attempt, it got as far as pxeboot starting, and then:

pxe_open: server addr: 192.168.0.1
pxe_open: server path: /pxeroot
pxe_open: gateway ip: 0.0.0.0
Booting [kernel]...
can't load 'kernel'
can't load 'kernel.old'

And my NFS server logs a failed attempt to mount /pxeroot:

Jun 24 16:32:40 sr-mon-00 mountd[642]: mount request from 192.168.0.240 for non existent path /pxeroot
Jun 24 16:32:49 sr-mon-00 last message repeated 59 times

This is strange; I thought that at this stage pxeboot would be pulling across the kernel and ramdisk via TFTP from /usr/tftpboot, although the documentation is far from clear. pxeboot(8) says:

"pxeboot recognizes next-server and option root-path directives as the server and path to NFS mount for file requests, respectively, or the server to make TFTP requests to."

(Erm, so exactly how do I choose whether to use NFS or to use TFTP for the next stage?)

Anyway, assuming that I'm forced to use NFS at this point, I added another DHCP option:

option root-path 192.168.0.1:/usr/tftpboot;

This option is not shown in the example dhcpd.conf in pxeboot(8), nor in the article referred to above. However, if I also put an entry in the NFS server's /etc/hosts file for the client DHCP address, it then works properly.

So the question is: when pxeboot runs on the client, is it able to fetch loader.rc, the kernel and ramdisk via TFTP, or only via NFS?

If it's only NFS, then I think the pxeboot(8) manpage, and the pxeboot article, ought to be updated. If it *can* use TFTP, does anyone have any suggestions for what I was doing wrong?

Regards,
Brian Candler.
Re: dmesg queries
On Jun 24, 2005, at 12:09 PM, Chris Phillips wrote:
> Here's what has me writing... looking in /var/run/dmesg I see: -
>
> kernel: ioapic0: Changing APIC ID to 2
> kernel: ioapic1: Changing APIC ID to 3
> kernel: ioapic1: WARNING: intbase 32 != expected base 24
> kernel: ioapic0 Version 2.0 irqs 0-23 on motherboard
> kernel: ioapic1 Version 2.0 irqs 32-55 on motherboard
>
> Should I be worried by that WARNING?

I see this as well on my PE800.

> Also in /var/run/dmesg I see: -
>
> kernel: Interrupt storm detected on irq19: uhci0 uhci2; throttling interrupt source
> kernel: Interrupt storm detected on irq18: bge0 uhci1+; throttling interrupt source

I get this as well. I don't use USB so I turned that off at the BIOS and removed it from my kernel, but I still get an interrupt storm on my bge0 device. No idea why.

Performance is not all that great on this box disk-wise, but CPU-wise it is acceptably fast. I run FreeBSD/amd64 on it rather than FreeBSD/i386.

Vivek Khera, Ph.D. +1-301-869-4449 x806
Re: pxeboot, NFS and root-path: bug or documentation error?
On Friday 24 June 2005 01:03 pm, Brian Candler wrote:
> I have been setting up a pxeboot jumpstart environment for FreeBSD 4.11,
> following the instructions at
> http://www.freebsd.org/doc/en_US.ISO8859-1/articles/pxe/
>
> [...]
>
> So the question is: when pxeboot runs on the client, is it able to fetch
> loader.rc, the kernel and ramdisk via TFTP, or only via NFS?
>
> If it's only NFS, then I think the pxeboot(8) manpage, and the pxeboot
> article, ought to be updated. If it *can* use TFTP, does anyone have any
> suggestions for what I was doing wrong?

It uses TFTP to fetch the pxeboot binary itself. After that, it uses either NFS or TFTP. By default it uses NFS to access /boot/loader and friends. If you want it to just use TFTP and not use NFS at all, you need to recompile pxeboot with LOADER_TFTP_SUPPORT=yes defined in make. That is:

% cd /sys/boot
% make clean
% make LOADER_TFTP_SUPPORT=yes
% cp /usr/obj/usr/src/sys/boot/i386/pxeldr/pxeboot /usr/tftpboot

--
John Baldwin [EMAIL PROTECTED] http://www.baldwin.cx/~john/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org

--
John Baldwin [EMAIL PROTECTED] http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org
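For the default (NFS) case described in this reply, the server side might look roughly like the dhcpd.conf fragment below. It is a sketch, not a tested configuration: the server address, client address, and /usr/tftpboot path come from Brian's post, while the subnet, netmask, and address range are assumptions, and the quotes around the root-path value follow ISC dhcpd's usual syntax for string options.

```
subnet 192.168.0.0 netmask 255.255.255.0 {         # assumed /24 network
    range 192.168.0.240 192.168.0.250;             # assumed pool; .240 is the client seen in the mountd log
    next-server 192.168.0.1;                       # server the client fetches pxeboot from via TFTP
    filename "pxeboot";                            # path relative to the TFTP root (/usr/tftpboot in the post)
    option root-path "192.168.0.1:/usr/tftpboot";  # NFS path pxeboot mounts for the loader, kernel, etc.
}
```

The directory named in root-path also has to be exported over NFS to the client, which is why the mountd log in the original post is the place to look when the kernel cannot be loaded.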
Network/fxp related panic in 5.4?
Hi all,

I recently re-enabled SMP on one of my 5.4 servers (dual Intel P3), and after a relatively short while (a couple of days) it starts acting up. Today it was frozen and had dropped into the kernel debugger on the serial console. The problem is that my serial console was controlled by a terminal at work, and when I got home it seemed that the work terminal had disconnected. All I could do was a 'trace' - I don't have the panic screen (if any), nor do I have any other output, because the watchdog triggered the power-switch cycle just after I got the trace:

Tracing pid 29 tid 10 td 0xc22a
fxp_intr_body(c2404000,c2404000,40,,8) at fxp_intr_body+0xd0
fxp_intr(c2404000,0,0,0,0) at fxp_intr+0x14e
ithread_loop(c22f6500,e3384d38,0,0,0) at ithread_loop+0x1b8
fork_exit(c06a9150,c22f6500,e3384d38) at fork_exit+0x80
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe3384d6c, ebp = 0 ---
db>

What makes me wonder is this: when I connected the serial console, the db> prompt was already there. Does that mean that the work terminal disconnect somehow sent a telnet break and triggered the kernel debugger? I.e., was this no panic at all, but a stupid serial console hiccup? Is there any way to prevent this in the future, like changing the control character that triggers the kernel debugger? (I have BREAK_TO_DEBUGGER in my kernel config.)

Thanks,
/Eirik
Re: ATA_DMA errors
twesky wrote:
> I am having ATA_DMA errors on 5.4R and 5 STABLE up to June 16 (haven't
> done a cvsup again). It doesn't happen on 5.3R or lower.

I've just upgraded my fileserver from 5.1-R to 5.4-R, and I'm seeing this problem too now on 3 out of 4 drives.

> The exact error message is below:
> It happens within a few hours of use. The laptop will then reboot, and
> fsck must be run. After fsck the timeouts happen within a few seconds of
> booting.

My system uses a SiI 0680 UDMA133 controller in addition to the old built-in Intel PIIX4 UDMA33 controller. My system drive hangs off the PIIX4 controller and I see no issues with it, only with the drives off the SiI:

ad0: 8207MB ST38641A/3.29 [16676/16/63] at ata0-master UDMA33
ad4: 57241MB ST360021A/3.05 [116301/16/63] at ata2-master UDMA100
ad6: 76319MB ST380021A/3.19 [155061/16/63] at ata3-master UDMA100
ad7: 152627MB WDC WD1600JB-00DUA3/75.13B75 [310101/16/63] at ata3-slave UDMA100

Right after the upgrade things worked well for a couple of hours, and then I got a reboot all of a sudden. Upon inspection I found tons of both "READ_DMA timed out" and "WRITE_DMA UDMA ICRC error" messages in the log prior to the reboot. After the reboot it went to do the fsck and made it perhaps halfway through before it started churning out "READ_DMA timed out" messages again, followed by the "ad7: warning - removed from configuration" message.

Things did not get better from there; with each successive reboot more and more started going wrong. In order to get the system to boot at all I eventually had to physically disconnect the ad7 drive, but even so I'm getting "READ_DMA timed out" messages for ad4 and ad6.

Since I'm getting WRITE_DMA errors on both ad6 and ad7 now (I haven't written anything to ad4 yet, so I don't know if I'll get errors on that one too), and I wasn't a few hours ago when I was running 5.1-R, I refuse to believe that two disks have gone bad in that timespan!
I'm not sure what I should do at this point - theoretically I could roll back to 5.1 to prevent further data loss, but I'm guessing it'd be good if I kept it for a little while so that I could run tests for patches :-/

Seeing the comments about possible failing controller hardware, I might see if I can find a replacement controller tomorrow... any ideas in the meantime will be appreciated though! It still feels very iffy that this started happening right after the upgrade... I was expecting to get rid of some of the quirks from the early preview, not get far worse ones! :-(

Oh, btw, using smartmontools' smartctl, I've gotten the information that ad4 has had 32 write errors in total, ad6 has had 0 (despite seeing the WRITE_DMA errors in the system log), and ad7 refuses to even talk SMART.

### Here's the contents of the dmesg from before I pulled ad7 out:

Jun 24 18:22:19 kernel: FreeBSD 5.4-RELEASE #0: Sun May 8 10:21:06 UTC 2005
Jun 24 18:22:19 kernel: [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC
Jun 24 18:22:19 kernel: Timecounter i8254 frequency 1193182 Hz quality 0
Jun 24 18:22:19 kernel: CPU: Pentium II/Pentium II Xeon/Celeron (467.73-MHz 686-class CPU)
Jun 24 18:22:19 kernel: Origin = GenuineIntel Id = 0x665 Stepping = 5
Jun 24 18:22:19 kernel: Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
Jun 24 18:22:19 kernel: real memory = 805240832 (767 MB)
Jun 24 18:22:19 kernel: avail memory = 778231808 (742 MB)
Jun 24 18:22:19 kernel: npx0: math processor on motherboard
Jun 24 18:22:19 kernel: npx0: INT 16 interface
Jun 24 18:22:19 kernel: acpi0: AWARD AWRDACPI on motherboard
Jun 24 18:22:19 kernel: acpi0: Power Button (fixed)
Jun 24 18:22:19 kernel: Timecounter ACPI-safe frequency 3579545 Hz quality 1000
Jun 24 18:22:19 kernel: acpi_timer0: 24-bit timer at 3.579545MHz port 0x4008-0x400b on acpi0
Jun 24 18:22:19 kernel: cpu0: ACPI CPU (3 Cx states) on acpi0
Jun 24 18:22:19 kernel: acpi_throttle0: ACPI CPU Throttling on cpu0
Jun 24 18:22:19 kernel: acpi_button0: Power Button on acpi0
Jun 24 18:22:19 kernel: pcib0: ACPI Host-PCI bridge port 0x5000-0x500f,0x4000-0x4041,0xcf8-0xcff on acpi0
Jun 24 18:22:19 kernel: pci0: ACPI PCI bus on pcib0
Jun 24 18:22:19 kernel: agp0: Intel 82443BX (440 BX) host to PCI bridge mem 0xe000-0xe3ff at device 0.0 on pci0
Jun 24 18:22:19 kernel: pcib1: PCI-PCI bridge at device 1.0 on pci0
Jun 24 18:22:19 kernel: pci1: PCI bus on pcib1
Jun 24 18:22:19 kernel: isab0: PCI-ISA bridge at device 7.0 on pci0
Jun 24 18:22:19 kernel: isa0: ISA bus on isab0
Jun 24 18:22:19 kernel: atapci0: Intel PIIX4 UDMA33 controller port 0xf000-0xf00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 7.1 on pci0
Jun 24 18:22:19 kernel: ata0: channel #0 on atapci0
Jun 24 18:22:19 kernel: ata1: channel #1 on atapci0
Jun 24 18:22:19 kernel: uhci0: Intel 82371AB/EB (PIIX4) USB controller port 0x9000-0x901f irq 11 at device 7.2 on pci0
Jun 24 18:22:19 kernel: usb0: Intel 82371AB/EB (PIIX4) USB
Re: ATA_DMA errors
I don't think it is a hardware problem. Unless you replace it with the exact same hardware, it'll be difficult to determine if it was the hardware. I haven't had any issues with 5.3R or any stable version before April 15. I am going to do some checking this weekend and see whether it is hardware or software that is causing my timeouts.
Re: pxeboot, NFS and root-path: bug or documentation error?
On Fri, Jun 24, 2005 at 01:50:37PM -0400, John Baldwin wrote:
> It uses TFTP to fetch the pxeboot binary itself. After that, it uses
> either NFS or TFTP. By default it uses NFS to access /boot/loader and
> friends. If you want it to just use TFTP and not use NFS at all, you
> need to recompile pxeboot with LOADER_TFTP_SUPPORT=yes defined in make.
> That is:
>
>   % cd /sys/boot
>   % make clean
>   % make LOADER_TFTP_SUPPORT=yes
>   % cp /usr/obj/usr/src/sys/boot/i386/pxeldr/pxeboot /usr/tftpboot

Thank you, that's very clear. Re-reading the manpage I do now see the phrase "selectable through compile-time options"; perhaps it would be worth also showing those options.

Is there any fundamental reason why both couldn't be compiled in at once, e.g. limitations on the pxeboot binary size? Or is it just awkward to implement? I would have no objection to

  options root-path = tftp://192.168.0.1/usr/tftpboot;

I would also have no objection to pxeboot.nfs and pxeboot.tftp being built :-)

I'll try building the tftp version when back in the office next week.

Thanks again,
Brian.
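[Editor's sketch] Brian's URL-style root-path idea amounts to picking the transport from the value's prefix. The following is purely an illustration of that proposal, not a description of how pxeboot behaves; the function name rootpath_proto and the nfs:// prefix are hypothetical and appear nowhere in the thread or in pxeboot(8).

```shell
#!/bin/sh
# Hypothetical: choose a transport from a root-path value, following the
# URL-style syntax proposed in the message above. pxeboot itself selects
# NFS or TFTP at compile time; this only models the suggestion.
rootpath_proto() {
    case "$1" in
        tftp://*) echo tftp ;;
        nfs://*)  echo nfs ;;   # assumed counterpart, not in the thread
        *)        echo nfs ;;   # plain "host:/path" keeps the NFS default
    esac
}

rootpath_proto "tftp://192.168.0.1/usr/tftpboot"
rootpath_proto "192.168.0.1:/usr/tftpboot"
```

Keeping the unprefixed form on NFS means existing configurations would continue to work unchanged, which is the main attraction of a prefix-based scheme over a second binary.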
Re: panic in RELENG_5 UMA
All,

Can someone confirm whether or not the following stack trace shows the same problem? I can reproduce the problem with the custom kernel config included below (which is basically GENERIC stripped of devices I don't have or need, with IPFILTER added), but not with a stock GENERIC kernel. To cause the crash I'm running 20-30 instances of the following script:

d5# cat arping.sh
#!/bin/sh
while :
do
    arp -d 192.168.4.$1 > /dev/null 2>&1;
    ping -c 1 -t 1 192.168.4.$1 > /dev/null 2>&1;
done

d5# uname -a
FreeBSD d5.bidx.com 5.4-RELEASE FreeBSD 5.4-RELEASE #6: Thu Jun 23 13:45:20 EDT 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/DB-DUAL-AMD64-RAID5 amd64

d5# kgdb /usr/obj/usr/src/sys/DB-DUAL-AMD64-RAID5/kernel.debug ./vmcore.5
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol ps_pglobal_lookup]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".
#0 doadump () at pcpu.h:167
167 pcpu.h: No such file or directory.
    in pcpu.h
(kgdb) bt
#0 doadump () at pcpu.h:167
#1 0x in ?? ()
#2 0x802557b7 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:410
#3 0x80255fef in panic (fmt=0xff00b5907500 ë6µ) at /usr/src/sys/kern/kern_shutdown.c:566
#4 0x8029ad2a in sbdrop_locked (sb=0xb6274860, len=1146) at /usr/src/sys/kern/uipc_socket2.c:1149
#5 0x8029afe2 in sbflush_locked (sb=0xb6274860) at /usr/src/sys/kern/uipc_socket2.c:1116
#6 0x8029b049 in sbrelease_locked (sb=0xb6274860, so=0xff00a0a2a8a0) at /usr/src/sys/kern/uipc_socket2.c:564
#7 0x8029b0d5 in sbrelease (sb=0xb6274860, so=0xff00a0a2a8a0) at /usr/src/sys/kern/uipc_socket2.c:577
#8 0x80297b03 in sorflush (so=0xff00a0a2a8a0) at /usr/src/sys/kern/uipc_socket.c:1483
#9 0x80297e42 in sofree (so=0xff00a0a2a8a0) at /usr/src/sys/kern/uipc_socket.c:407
#10 0x80298467 in soclose (so=0xff00a0a2a8a0) at /usr/src/sys/kern/uipc_socket.c:485
#11 0x802847b5 in soo_close (fp=0xff009ca95b60, td=0x0) at /usr/src/sys/kern/sys_socket.c:299
#12 0x8022c2c0 in fdrop_locked (fp=0xff009ca95b60, td=0xff00b5907500) at file.h:288
#13 0x8022c40a in closef (fp=0xff009ca95b60, td=0xff00b5907500) at /usr/src/sys/kern/kern_descrip.c:1920
#14 0x8022e5be in fdfree (td=0xff00b5907500) at /usr/src/sys/kern/kern_descrip.c:1624
#15 0x80238bd0 in exit1 (td=0xff00b5907500, rv=0) at /usr/src/sys/kern/kern_exit.c:236
#16 0x8023a04e in sys_exit (td=0x0, uap=0x0) at /usr/src/sys/kern/kern_exit.c:93
#17 0x8035cd8c in syscall (frame=
    {tf_rdi = 0, tf_rsi = 5263360, tf_rdx = 0, tf_rcx = 34366596768, tf_r8 = 0, tf_r9 = 140737488350136, tf_rax = 1, tf_rbx = 0, tf_rbp = 3, tf_r10 = -1099499764224, tf_r11 = 515, tf_r12 = 140737488350376, tf_r13 = 0, tf_r14 = 0, tf_r15 = 0, tf_trapno = 12, tf_addr = 34368259080, tf_flags = 0, tf_err = 2, tf_rip = 34366590280, tf_cs = 43, tf_rflags = 514, tf_rsp = 140737488350296, tf_ss = 35}) at /usr/src/sys/amd64/amd64/trap.c:771
#18 0x80349f88 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:248
#19 0x in ?? ()
#20 0x00505000 in ?? ()
#21 0x in ?? ()
#22 0x00080068a6a0 in ?? ()
#23 0x in ?? ()
#24 0x7fffebb8 in ?? ()
#25 0x0001 in ?? ()
#26 0x in ?? ()
#27 0x0003 in ?? ()
#28 0xffb50600 in ?? ()
#29 0x0203 in ?? ()
#30 0x7fffeca8 in ?? ()
#31 0x in ?? ()
#32 0x in ?? ()
#33 0x in ?? ()
#34 0x000c in ?? ()
#35 0x000800820408 in ?? ()
#36 0x in ?? ()
#37 0x0002 in ?? ()
#38 0x000800688d48 in ?? ()
#39 0x002b in ?? ()
#40 0x0202 in ?? ()
#41 0x7fffec58 in ?? ()
#42 0x0023 in ?? ()
#43 0x7fffe968 in ?? ()
#44 0x0023 in ?? ()
#45 0x in ?? ()
#46 0x in ?? ()
#47 0x in ?? ()
#48 0x in ?? ()
#49 0x in ?? ()
#50 0x in ?? ()
#51 0x in ?? ()
#52 0x in ?? ()
#53 0xa14b4000 in ?? ()
#54 0xb6274c40 in ?? ()
#55 0x0101 in ?? ()
#56 0x in ?? ()
#57 0xff00b536eba0 in ?? ()
#58 0xff00ec19a780 in ?? ()
#59 0xb6274b58 in
Re: panic in RELENG_5 UMA
Sorry, I forgot to add that this is a Tyan Thunder K8SPRO w/dual AMD Opteron Processors, model no. 246, 4GB of RAM and an Adaptec 2200S RAID controller. The NIC being used is the onboard Broadcom Gigabit Ethernet (bge).

Thanks,
Gary
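[Editor's sketch] The reproduction in this thread depends on running 20-30 copies of the per-host script at once. A launcher along the following lines would do that; it is a sketch, and the function name launch_instances, the argument order and the example host range are assumptions, not something Gary posted. On an affected kernel it may panic the machine, so it belongs on a test box only.

```shell
#!/bin/sh
# Hypothetical launcher: start COUNT background copies of SCRIPT, passing
# each one a different number (FIRST, FIRST+1, ...) as its only argument.
launch_instances() {
    script=$1
    count=$2
    first=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        sh "$script" $((first + i)) &
        i=$((i + 1))
    done
}

# Example (not run here): 25 copies covering 192.168.4.10 to 192.168.4.34
#   launch_instances ./arping.sh 25 10
#   wait
```

Each copy runs until it is killed, so the example's `wait` only returns once the background jobs have been terminated.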
Re: Data corruption in cd9660 on FreeBSD 4.11?
On Fri, 2005-Jun-24 22:31:06 +1000, Stephen McKay wrote:
> I'm experiencing data corruption when reading CDs and DVDs on FreeBSD 4.11.
> ...
> So, can anyone suggest any more tests I could try? Or is there a kind of
> hardware fault that could cause this substitution of whole blocks read
> from CDs without causing any other problems?

You might like to post the relevant sections of a verbose boot - the ATA and CD probes.

Are you running the CD/DVD drives in PIO or UDMA modes? In the former, the CPU is reading the data from the CD and writing it to memory. In the latter, the CPU tells the disk controller where to write. It could be instructive to change modes and see what happens.

Have you tried anything other than ISO9660 filesystems on a physical CD? What happens if you just dd the CD-ROM? What happens if you use a vnode mount (see vnconfig(8)) of an ISO filesystem sitting in a UFS filesystem?

Anything unusual in your kernel config file? Have you tried building a kernel with WITNESS and/or DIAGNOSTIC?

Any chance of you repeating the tests with a 5.x system? Maybe on a spare small partition, or using a 5.4-RELEASE disk1 as a live filesystem.

--
Peter Jeremy
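[Editor's sketch] The "just dd the CD-ROM" test suggested above can be turned into a simple comparison: read the raw device with dd, read a known-good image the same way, and compare checksums. The sketch below assumes a reference ISO is available; the function name same_content and the device name in the usage comment are assumptions. A disc read through the device may carry trailing padding that the image lacks, so a mismatch alone does not prove corruption.

```shell
#!/bin/sh
# Hypothetical check: succeed only if two sources (files or raw devices)
# yield identical bytes when read sequentially with dd.
#   $1, $2 = the sources to compare
#   $3     = block size, defaulting to the 2048-byte CD sector size
same_content() {
    a=$1
    b=$2
    bs=${3:-2048}
    sum_a=$(dd if="$a" bs="$bs" 2>/dev/null | cksum)
    sum_b=$(dd if="$b" bs="$bs" 2>/dev/null | cksum)
    [ "$sum_a" = "$sum_b" ]
}

# Example (device name is an assumption for a 4.x ATAPI drive):
#   same_content /dev/acd0c /path/to/original.iso && echo identical
```

Because cksum reports both a CRC and a byte count, a length difference shows up as a mismatch as well, which helps separate padding from substituted blocks.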
Re[2]: ATA DMA timeouts [NOT FIXED] (forget my last mail)
Hello Martin,

M> believe it or not. After sending the last mail and closing
M> firefox, the system has been hanging for a few seconds and
M> spit this out:

M> ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=226025086

Yeah, I updated and rebuilt on seeing your email, but found it did little. I had TIMEOUTs again within 10 minutes of rebooting.

M> Sorry for causing noise. :(
M> I'm going to test a kernel every week and report, when the
M> problems are gone.

I'd certainly be interested in hearing about your results, but from the general lack of discussion of the problem on this list, I'm guessing we may be on our own. I'm not sure anyone is actually working on a fix, because I'm not sure the problem is recognized. I have two separate Intel ICH5 based boxes affected by DMA TIMEOUTs, both with SATA drives, and I'd like to get to the bottom of this.

Regards,
Tony.

--
Tony Byrne
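[Editor's sketch] For week-by-week kernel testing like Martin describes, a per-drive count of timeout messages makes results easier to compare. The sketch below matches only the message format quoted in this thread ("adN: TIMEOUT - WRITE_DMA retrying ..."); the function name count_ata_timeouts and the log path in the usage comment are assumptions.

```shell
#!/bin/sh
# Hypothetical summary: print "device count" for every adN device that
# logged a "TIMEOUT - <command>" message in the given log file.
count_ata_timeouts() {
    awk '
        /TIMEOUT - [A-Z_]+/ {
            # Find the "adN:" field wherever syslog prefixes have put it.
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^ad[0-9]+:$/) {
                    dev = $i
                    sub(/:$/, "", dev)
                    n[dev]++
                    break
                }
            }
        }
        END { for (d in n) print d, n[d] }
    ' "$1" | sort
}

# Example (log location is an assumption):
#   count_ata_timeouts /var/log/messages
```

Counting per device rather than in total also shows whether the timeouts follow one controller, which is the pattern reported in the ATA_DMA thread above.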
Re: 5.4-p1 crash
On Mon, 20 Jun 2005, Philippe PEGON wrote:

> Philippe PEGON wrote:
> > Mitch Parks wrote:
> > > On Sun, 19 Jun 2005, Doug White wrote:
> > > > On Fri, 17 Jun 2005, Mitch Parks wrote:
> > > > > Below are details regarding another crash on a Dell 2600 SMP
> > > > > (HTT and USB disabled). It has been 9 days since the last
> > > > > crash. I didn't have the serial console in place for this last
> > > > > crash, but it is now.
> > > >
> > > > As noted, the ttwakeup() panic is a known bug. The best thing we
> > > > have for a fix is this patch:
> > > >
> > > > http://people.freebsd.org/~mlaier/tty.t_pgrp.diff
> > > >
> > > > Please give it a try and report back if you have any more panics
> > > > (or don't :-) ). Thanks!
> > >
> > > This patch appears to be for 5.3, but I manually applied the chunk
> > > of the patch that didn't apply cleanly and the countdown is on.
> > > I'll report back in 10 days unless something bad happens before
> > > then. Below is the patch chunk #10 that I actually applied rather
> > > than the one given. If I've done something bad here by removing
> > > the PGRP_LOCK please let me know.
> >
> > I'm not a kernel developer, but if you remove the
> > PGRP_LOCK(tp->t_pgrp); and the PGRP_UNLOCK(tp->t_pgrp); in the if
> > condition (removed by the original patch), there may be another
> > PGRP_UNLOCK(tp->t_pgrp); to remove for the case where the if
> > condition doesn't match, at line 2528 in the original 5.4-p1 tty.c?
>
> After having applied the patch (with your modification), there is no
> sx_sunlock(&proctree_lock) in the ttyinfo function if the three
> conditions fail. Maybe we just have to replace the
> PGRP_UNLOCK(tp->t_pgrp); at line 2528 with sx_sunlock(&proctree_lock)?
> I think that we need the help of a kernel developer.

No, that would be a leaked lock, which would cause hangs. More likely it's some other case that got missed that needs locks extended to it, or the aliased pgrp isn't the underlying problem. I've run out of time to debug this, unfortunately...

> > > Hunk #6 succeeded at 1154 (offset -51 lines).
> > > Hunk #7 succeeded at 1215 (offset -6 lines).
> > > Hunk #8 succeeded at 1203 (offset -51 lines).
> > > Hunk #9 succeeded at 1946 (offset -5 lines).
> > > Hunk #10 failed at 2562.
> > > Hunk #11 succeeded at 2847 (offset -212 lines).
> > > 1 out of 11 hunks failed--saving rejects to tty.c.rej
> > >
> > > @@ -2495,19 +2511,21 @@
> > >   * On return following a ttyprintf(), we set tp->t_rocount to 0 so
> > >   * that pending input will be retyped on BS.
> > >   */
> > > + sx_slock(&proctree_lock);
> > >   if (tp->t_session == NULL) {
> > > +     sx_sunlock(&proctree_lock);
> > >       ttyprintf(tp, "not a controlling terminal\n");
> > >       tp->t_rocount = 0;
> > >       return;
> > >   }
> > >   if (tp->t_pgrp == NULL) {
> > > +     sx_sunlock(&proctree_lock);
> > >       ttyprintf(tp, "no foreground process group\n");
> > >       tp->t_rocount = 0;
> > >       return;
> > >   }
> > > - PGRP_LOCK(tp->t_pgrp);
> > > - if ((p = LIST_FIRST(&tp->t_pgrp->pg_members)) == 0) {
> > > -     PGRP_UNLOCK(tp->t_pgrp);
> > > + if ((p = LIST_FIRST(&tp->t_pgrp->pg_members)) == NULL) {
> > > +     sx_sunlock(&proctree_lock);
> > >       ttyprintf(tp, "empty foreground process group\n");
> > >       tp->t_rocount = 0;
> > >       return;
> > >
> > > Or the complete patch:
> > >
> > > http://kuoi.asui.uidaho.edu/~mitch/crash/tty_5.4.patch
> > >
> > > Mitch Parks
> > > [EMAIL PROTECTED]

--
Doug White                    | FreeBSD: The Power to Serve
[EMAIL PROTECTED]             | www.FreeBSD.org