VFS_BIO_DEBUG and 4.11

2005-06-24 Thread Stephen McKay
Hi-diddly-ho!

Is VFS_BIO_DEBUG still supposed to work in 4.11?  I'm trying to debug a
data corruption problem that could be a bug in the cd9660 file system and
thought that enabling VFS_BIO_DEBUG might help.  Instead it complains a
lot about directories and character devices being VMIO'd nowadays, then
panics with biodone: zero vnode ref count before it even finishes
booting.

I have reason to believe this was a useful flag back in 4.4 (because I
saw a kernel config from Matt Dillon that included it), but have not found
any evidence of use more recent than that.

So, is it obsolete now?  Or is it just only a little bit broken?  I don't
(yet) understand the invariants it is trying to enforce, so perhaps none
of them apply any more.

Stephen.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Winbind NT domain authentication

2005-06-24 Thread Thomas Fazekas
Hi list,

Sorry for the cross-post; I'm not sure which list is the better fit,
as my question touches on Samba, configuration, and FreeBSD.

I'm trying to configure NT authentication on FreeBSD 5.4 with
Samba 3.0.12 (installed from the ports collection).
I've followed the Samba 3 HOWTO and have managed the following:

wbinfo -g returns correctly the domain groups

wbinfo -u returns all the users (including those ones from the domain)

ntlm_auth authenticates the user correctly:
ntlm_auth --username=usr1
password:
NT_STATUS_OK: Success (0x0)
and in the winbind log I get :
rpc: trusted_domains
[ 3141]: request interface version
[ 3141]: request location of privileged pipe
[ 3141]: request domain name
[ 3141]: request misc info
[ 3141]: pam auth MYDOMAIN\usr1
rpc_dc_name: Returning DC PASSV_SERV (_the_ip_) for domain MYDOMAIN
IPC$ connections done anonymously
Connecting to host=PASSV_SERV
Connecting to _the_ip_ at port 445

I suspect this means that my samba/winbind configuration is correct.
The trouble is that I still can't log in (via login or ssh) with usernames
from the domain.
If I try with MYDOMAIN\usr1 I just get an Access Denied.
Worse, I'm not sure that I'm looking for the logs in the right place:
neither auth.log nor messages shows any trace of winbind being called.

My smb.conf :

workgroup = MYDOMAIN
netbios name = MY_BSD
password server = passwd_serv_ip
security = domain
encrypt passwords = yes
#passdb backend = tdbsam guest
server string = MY_BSD Samba Server

# separate domain and username with '\', like DOMAIN\username
winbind separator = \\
# use uids from 1 to 2 for domain users
idmap uid = 1-2
# use gids from 1 to 2 for domain groups
idmap gid = 1-2
# allow enumeration of winbind users and groups
winbind enum users = yes
winbind enum groups = yes
# give winbind users a real shell (only needed if they have telnet access)
template homedir = /home/winnt/%D%U
template shell = /usr/local/bin/bash

My nsswitch.conf

group: compat winbind
group_compat: nis
hosts: files dns winbind
networks: files
passwd: compat winbind
passwd_compat: nis
shells: files

and finally my /etc/pam.d/sshd

# auth
auth            required        pam_nologin.so          no_warn
#auth           sufficient      pam_opie.so             no_warn no_fake_prompts
#auth           requisite       pam_opieaccess.so       no_warn allow_local
#auth           sufficient      pam_krb5.so             no_warn try_first_pass
#auth           sufficient      pam_ssh.so              no_warn try_first_pass
#auth           required        pam_unix.so             no_warn try_first_pass
#tfa
auth            sufficient      pam_winbind.so          debug try_first_pass
auth            sufficient      pam_unix.so             no_warn try_first_pass

# account
#account        required        pam_krb5.so
account         required        pam_login_access.so
account         sufficient      pam_winbind.so          debug
account         sufficient      pam_unix.so

# session
#session        optional        pam_ssh.so
session         required        pam_permit.so

# password
#password       sufficient      pam_krb5.so             no_warn try_first_pass
password        sufficient      pam_winbind.so          debug try_first_pass
password        sufficient      pam_unix.so             no_warn try_first_pass

I hope this question is not silly, but for NT authentication alone it
isn't necessary to run smbd/nmbd, is it? Winbind should do the job.

This is the second week I've been trying to set this up, and it has been
one of the most frustrating experiences ever...
Can anybody give me some hints (other than going to a psychiatrist)?

Thomas



Portupgrade in Xfree86 pkg failed

2005-06-24 Thread Warren
ln -s /usr/ports/graphics/xfree86-dri/work/xc/programs/Xserver/hw/xfree86/os-support/linux/drm/xf86drmRandom.c xf86drmRandom.c
rm -f xf86drmSL.c
ln -s /usr/ports/graphics/xfree86-dri/work/xc/programs/Xserver/hw/xfree86/os-support/linux/drm/xf86drmSL.c xf86drmSL.c
make: don't know how to make /drm.h. Stop
*** Error code 2

Stop in /usr/ports/graphics/xfree86-dri/work/xc/lib/GL.
*** Error code 1

Stop in /usr/ports/graphics/xfree86-dri.

-- 
Yours Sincerely
Shinjii
http://www.shinji.nq.nu


Re: Portupgrade in Xfree86 pkg failed

2005-06-24 Thread Daniel O'Connor
On Fri, 24 Jun 2005 20:35, Warren wrote:
> ln -s /usr/ports/graphics/xfree86-dri/work/xc/programs/Xserver/hw/xfree86/os-support/linux/drm/xf86drmRandom.c xf86drmRandom.c
> rm -f xf86drmSL.c
> ln -s /usr/ports/graphics/xfree86-dri/work/xc/programs/Xserver/hw/xfree86/os-support/linux/drm/xf86drmSL.c xf86drmSL.c
> make: don't know how to make /drm.h. Stop
> *** Error code 2
>
> Stop in /usr/ports/graphics/xfree86-dri/work/xc/lib/GL.
> *** Error code 1
>
> Stop in /usr/ports/graphics/xfree86-dri.

What command did you run?
What version of FreeBSD are you running?
When did you last cvsup your ports tree?
Did you read /usr/ports/UPDATING?

-- 
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there
are so many of them to choose from.
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C




Re: Portupgrade in Xfree86 pkg failed

2005-06-24 Thread Warren
On Fri, 24 Jun 2005 9:11 pm, Daniel O'Connor wrote:
> On Fri, 24 Jun 2005 20:35, Warren wrote:
> > ln -s /usr/ports/graphics/xfree86-dri/work/xc/programs/Xserver/hw/xfree86/os-support/linux/drm/xf86drmRandom.c xf86drmRandom.c
> > rm -f xf86drmSL.c
> > ln -s /usr/ports/graphics/xfree86-dri/work/xc/programs/Xserver/hw/xfree86/os-support/linux/drm/xf86drmSL.c xf86drmSL.c
> > make: don't know how to make /drm.h. Stop
> > *** Error code 2
> >
> > Stop in /usr/ports/graphics/xfree86-dri/work/xc/lib/GL.
> > *** Error code 1
> >
> > Stop in /usr/ports/graphics/xfree86-dri.
>
> What command did you run?

portupgrade -aDk -m BATCH=yes

> What version of FreeBSD are you running?

5.4-STABLE

> When did you last cvsup your ports tree?

Just before running portupgrade, shortly before sending the first email.

> Did you read /usr/ports/UPDATING?

Can't say that I did.

-- 
Yours Sincerely
Shinjii
http://www.shinji.nq.nu


Simultaneously two versions of one port?

2005-06-24 Thread Thomas Beer
Dear All,

I'm trying to install gdesklets which depends on
py24-orbit. However, py23-orbit is already
installed. I googled and tried various ways
with portupgrade et al. What to do?

Thanks Tom

[EMAIL PROTECTED] gdesklets]# make install
===>   gdesklets-0.34.3 depends on file: /usr/local/bin/python - found

---snip---

===>  Installing for py24-orbit-2.0.1_1
===>   py24-orbit-2.0.1_1 depends on file: /usr/local/bin/python2.4 - found
===>   py24-orbit-2.0.1_1 depends on executable: pkg-config - found
===>   py24-orbit-2.0.1_1 depends on shared library: glib-2.0.600 - found
===>   py24-orbit-2.0.1_1 depends on shared library: IDL-2.0 - found
===>   py24-orbit-2.0.1_1 depends on shared library: ORBit-2.0 - found
===>   Generating temporary packing list
===>  Checking if devel/py-orbit2 already installed
===>   An older version of devel/py-orbit2 is already installed (py23-orbit-2.0.1_1)
      You may wish to ``make deinstall'' and install this port again
      by ``make reinstall'' to upgrade it properly.
      If you really wish to overwrite the old port of devel/py-orbit2
      without deleting it first, set the variable FORCE_PKG_REGISTER
      in your environment or the make install command line.
*** Error code 1

Stop in /usr/ports/devel/py-orbit2.
*** Error code 1

Stop in /usr/ports/x11-toolkits/py-gnome2.
*** Error code 1

Stop in /usr/ports/deskutils/gdesklets.

-- 
--  Which is worse:  ignorance or apathy?
--  Don't know.  Don't care.


Re: Simultaneously two versions of one port?

2005-06-24 Thread Pascal Groenen
Dear Thom,

What I do when something like this happens is one of the following:

Either 'cd' to the port that is causing the problem (devel/py-orbit2 in
this case) and type 'make deinstall', then go back to your original port
and issue a make again.

Or 'cd' to the port that is causing the problem (devel/py-orbit2 in this
case) and type 'make -DFORCE_PKG_REGISTER install clean'. This forces an
upgrade of the troublesome port, but may leave it registered twice.
You can check for that by running 'pkgdb -F'.
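
The first option could be scripted roughly as follows. This is only a sketch:
the function name replace_conflict is invented for illustration, the
PORTSDIR default of /usr/ports is an assumption, and nothing here deals with
other installed packages that still depend on the py23 build.

```sh
#!/bin/sh
# Hypothetical wrapper around the two manual steps of the first option.

# replace_conflict CONFLICTING_ORIGIN WANTED_ORIGIN
# e.g. replace_conflict devel/py-orbit2 deskutils/gdesklets
replace_conflict() {
    rc_ports=${PORTSDIR:-/usr/ports}
    # Step 1: remove the old package registered by the conflicting port.
    (cd "$rc_ports/$1" && make deinstall) || return 1
    # Step 2: return to the port that was wanted in the first place and
    # build it again; it pulls in the conflicting port as a fresh dependency.
    (cd "$rc_ports/$2" && make install clean) || return 1
}
```

The function stops at the first step that fails, so a failed deinstall never
leads to a half-finished install.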

Good luck,

Pascal



Data corruption in cd9660 on FreeBSD 4.11?

2005-06-24 Thread Stephen McKay
Hi!

I'm experiencing data corruption when reading CDs and DVDs on FreeBSD 4.11.
My best theory so far is that cd9660 or perhaps the VFS layer is mishandling
2048 byte buffers (since they are smaller than one virtual memory page),
occasionally writing them to the wrong location in RAM.  Read on for why
I think so.

First up, I don't think this is the usual hardware problem since the machine
has done huge numbers of buildworlds (in 4.x and -current) without any of
the telltale signs (eg bus errors and segmentation violations).  There are
no error messages in /var/log/messages.  Also, it moonlights as a games
machine and plays Doom 3, Battlefield 1942, Neverwinter Nights and so forth
like a champ.  Memory, cpu, video, disk, networking are all just fine 100%
of the time.

The hardware is an ASUS P4P800 mobo (including onboard Marvell Yukon gigabit
ethernet) with a P4 2.8GHz cpu, 1GB RAM, Maxtor 120GB disk, Pioneer 103S
DVD-ROM, LiteOn SOHW-1673S DVD burner in an Antec Sonata case.

Now that I have a DVD burner, I make backups of my main machines (over NFS)
but have found that they often don't verify as 100% correct.  The symptom
is that, for some files, an entire 2048-byte DVD sector is replaced with
different (non-zero) data.  This occurs both when reading with the Pioneer
DVD-ROM and when reading with the LiteOn burner (though I don't test with
the Pioneer much as it is slower).

I emphasise that all burns have been 100% correct (ie the burning process
worked and this can be verified by reading on, say, my iBook), so all of
the hardware seems to be operating correctly (and swiftly, I might add).
The problem is that reading the iso9660 file system is not safe.

After some experimenting, I've found that the problem also occurs when
reading CDs, and I built a test CD (of photos of a recent wedding) and in
testing I read this CD over and over.  I compare the CD with the original
files (via NFS) using diff.  When diff finds a difference, I save copies
of the differing files before they can be flushed from the cache.

I have calculated checksums for all 2048-byte blocks on the CD, so I can know
if any given block of 2048 bytes came from the CD and if so which file it
came from.  In all cases so far, the 2048 byte error has been a block from
another file, not a random corruption.
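
For anyone wanting to repeat this kind of bookkeeping, a minimal sh sketch
follows. It is not the tooling used for the tests described here: the
function names are made up, and it uses the POSIX cksum CRC rather than a
stronger hash. block_sums prints one line per 2048-byte block, from which an
index of every block on the disc can be built; diff_blocks lists which
2048-byte blocks differ between a good and a suspect copy of a file.

```sh
#!/bin/sh
# Hypothetical helpers, not the original test tooling.

BLOCK=2048    # data sector size on CDs and DVDs

# block_sums FILE
# Print "<crc> <file> <block index>" for each 2048-byte block of FILE.
block_sums() {
    bs_file=$1
    bs_size=$(( $(wc -c < "$bs_file") ))
    bs_count=$(( (bs_size + BLOCK - 1) / BLOCK ))
    bs_i=0
    while [ "$bs_i" -lt "$bs_count" ]; do
        bs_crc=$(dd if="$bs_file" bs="$BLOCK" skip="$bs_i" count=1 2>/dev/null |
            cksum | awk '{ print $1 }')
        echo "$bs_crc $bs_file $bs_i"
        bs_i=$((bs_i + 1))
    done
}

# diff_blocks GOOD SUSPECT
# Print the index of every 2048-byte block in which the two files differ.
diff_blocks() {
    cmp -l "$1" "$2" |
        awk -v bs="$BLOCK" 'BEGIN { last = -1 }
            { b = int(($1 - 1) / bs); if (b != last) { print b; last = b } }'
}
```

Running block_sums over every file in the original tree and saving the
output gives a table that can be searched for the checksum of a stray
block, which shows the file and offset it really belongs to.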

I am starting to believe that, under high load, the cd9660 file system
code tells the ata driver to put a 2K block in the wrong spot in memory,
leaving some old junk in the gap in the file being read, and blasting some
other 2K block of memory.  It may not be cd9660 code per se that is wrong,
but a bug in the complex buffer handling code (getblk, getnewbuf, allocbuf,
etc).

Why do I believe it is writing to the wrong memory, rather than any number
of other flaws?  In two runs (out of many), unusual things occurred that
are consistent with memory being overwritten, rather than, say, a 2K block
just not being read at all: In one, an innocent sshd core-dumped (which
is something that has never happened except when running my cd9660 tests),
and in another, a previously OK cached NFS file became corrupted.

Explaining that last case further: I had been running a test script that
would mount the CD, compare files, unmount the CD, and repeat.  This meant
that the NFS copy of the files was read over and over and hence became
memory resident (there being enough space in 1GB of RAM for one copy of
the files, plus my normal programs).  Several tests passed without fault
(hence all the NFS files were cached and correct), when suddenly there
were multiple corruptions; call them file A and file B.  File A was the
usual corruption where a 2K block of another file was unexpectedly present
in the copy read from the CD, but in file B it was the NFS file that was
wrong.  In fact it contained the missing block from file A!  In short, the
fully memory resident NFS file B had been corrupted by reading file A from
the CD.
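
The mount, compare, unmount cycle described above might look roughly like
the following sh sketch. It is a reconstruction for illustration, not the
actual test script: the device node, mount point and directory names are
assumptions, and files are compared one by one with cmp rather than diff.

```sh
#!/bin/sh
# Illustrative only; paths and device names below are assumptions.

# compare_tree REF_DIR CD_DIR SAVE_DIR
# Compare every regular file under REF_DIR with its counterpart under
# CD_DIR. On a mismatch, copy both versions into SAVE_DIR straight away
# (while the bad data is still cached) and print the relative path.
# Returns non-zero if any file differed.
compare_tree() {
    ct_ref=$1 ct_cd=$2 ct_save=$3
    ct_list=$(mktemp) || return 2
    (cd "$ct_ref" && find . -type f) > "$ct_list"
    ct_bad=0
    while IFS= read -r ct_f; do
        ct_f=${ct_f#./}
        if ! cmp -s "$ct_ref/$ct_f" "$ct_cd/$ct_f"; then
            ct_name=$(printf '%s\n' "$ct_f" | tr '/' '_')
            cp "$ct_ref/$ct_f" "$ct_save/$ct_name.ref"
            cp "$ct_cd/$ct_f" "$ct_save/$ct_name.cd"
            printf '%s\n' "$ct_f"
            ct_bad=1
        fi
    done < "$ct_list"
    rm -f "$ct_list"
    return "$ct_bad"
}

# run_cycles COUNT
# Mount, compare, unmount, repeat. Needs root and a disc in the drive.
run_cycles() {
    rc_n=0
    while [ "$rc_n" -lt "$1" ]; do
        mount_cd9660 /dev/acd0c /cdrom || return 1    # assumed device node
        compare_tree /net/photos /cdrom /var/tmp/bad  # assumed directories
        umount /cdrom || return 1
        rc_n=$((rc_n + 1))
    done
}
```

Unmounting between passes is what forces the CD data to be read from the
drive again, while the NFS copy stays in the cache.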

It's been pretty interesting hunting this problem, but now I'm sort of
stuck.  I believe that some 2K reads from DVDs and CDs end up in the wrong
place in RAM, but I can't find where this happens in the code (it's pretty
hard to work out just by reading it), and I can't rule out the possibility
that there's a hardware error here that I've just never run across before.

So, can anyone suggest any more tests I could try?  Or is there a kind of
hardware fault that could cause this substitution of whole blocks read from
CDs without causing any other problems?

And does anyone know of any commits made anywhere in the 5 years since
4.x split off from 5.x that may be relevant?  Yep.  5 years.  I have
started looking, but there's a fair bit of stuff in there...

Stephen.


Re: Simultaneously two versions of one port?

2005-06-24 Thread Thomas Beer
> either, 'cd' to the port that is causing the problem, devel/py-orbit2, in
> this case and type 'make deinstall', then go back to your original port and
> issue a make again

What about existing dependencies?

 
 


-- 
--  Which is worse:  ignorance or apathy?
--  Don't know.  Don't care.


EHCI: mtools stuck in state 'physrd' or panic

2005-06-24 Thread Stefan Walter
Hi,

I just updated to this morning's RELENG_5 and thought I'd give USB 2.0 a
try to speed up data exchange with my USB sticks. The controller is
correctly identified, it seems:

ehci0: VIA VT6202 USB 2.0 controller mem 0xdbfdf700-0xdbfdf7ff irq 3 at device 16.3 on pci0

When plugging in a USB stick, it is correctly identified, too:

umass0: USB Flash Disk, rev 2.00/2.00, addr 2
da2 at umass-sim0 bus 0 target 0 lun 0
da2:  USB BAR 2.00 Removable Direct Access SCSI-2 device 
da2: 40.000MB/s transfers
da2: 124MB (255744 512 byte sectors: 64H 32S/T 124C)

I can also list the content of the FAT filesystem with mtools' mdir
command. When trying to copy a file from the stick to a local filesystem,
however, mcopy is almost immediately stuck in state physrd (according to
top(1)) after copying a varying number of bytes (between 100 and 2200 KB
is what I've seen so far). I cannot kill the mtools process, but pulling
out the USB stick helps - it panics after a few times of doing that,
though.

I thought it might have to do with IRQ sharing first, but according to
vmstat -i and dmesg, ehci0 doesn't share its IRQ with anything else.

I know that the ehci(4) man page says the driver is not finished and quite
buggy, but that doesn't mean I shouldn't report a problem, right? ;)

Any ideas?

Stefan




Re: Simultaneously two versions of one port?

2005-06-24 Thread Stefan Walter
Thomas Beer in gmane.os.freebsd.stable:

> I'm trying to install gdesklets which depends on
> py24-orbit. However, py23-orbit is already
> installed. I googled and tried various ways
> with portupgrade et al. What to do?

Do you still need the Python 2.3 stuff? If not, you could just upgrade all
packages installed for 2.3 with 'portupgrade py23*', for instance.

Stefan
-- 
No reading beyond this point




Re: EHCI: mtools stuck in state 'physrd' or panic

2005-06-24 Thread Mike Tancsa

At 08:50 AM 24/06/2005, Stefan Walter wrote:

> I can also list the content of the FAT filesystem with mtools' mdir
> command. When trying to copy a file from the stick to a local filesystem,
> however, mcopy is almost immediately stuck in state physrd (according to
> top(1)) after copying a varying number of bytes (between 100 and 2200 KB
> is what I've seen so far). I cannot kill the mtools process, but pulling
> out the USB stick helps - it panics after a few times of doing that,
> though.

If you reformat the USB stick with UFS2, does I/O still hang the box?

---Mike 




Re: Portupgrade in Xfree86 pkg failed

2005-06-24 Thread Daniel O'Connor
On Fri, 24 Jun 2005 20:47, Warren wrote:
> Just before doing PortUpgrade before sending the 1st email
>
> > Did you read /usr/ports/UPDATING?
>
> cant say as i did.

Well, that was silly.
Not that I think there is a specific entry for this case, but it is a good
habit to get into.

Do you have the kernel source installed? I think you may need it to build
the xfree86-dri port (I don't know why it doesn't check).

-- 
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
The nice thing about standards is that there
are so many of them to choose from.
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C




Re: EHCI: mtools stuck in state 'physrd' or panic

2005-06-24 Thread Stefan Walter
Mike Tancsa in gmane.os.freebsd.stable:

> If you reformat the USB stick with UFS2, does IO still hang the box ?

I haven't tried that, but instead tried to just dump the whole USB stick
to a file with dd if=/dev/da2 of=stickimage bs=1024. The dd process also
hung in state physrd eventually, and about a minute after pulling out
the USB stick the system panic'd.

Furthermore, I tried the same (both mcopy and dd) on my notebook (Centrino
- Intel ICH4 chipset), which didn't have ehci in its kernel until then,
either. It worked flawlessly, multiple times.

Stefan
-- 
No reading beyond this point




Re: ATA DMA timeouts [NOT FIXED] (forget my last mail)

2005-06-24 Thread Martin

Hi (again),

Believe it or not, after sending the last mail and closing
Firefox, the system hung for a few seconds and
spat this out:

ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=226025086

I'm back on kernel of May 26th again. This time I did not have
a corrupted file system or at least not as badly corrupted that
it panics the kernel, like the first time.

Usually my system has DMA timeouts very early while booting up,
which makes a mess of the file systems.

Sorry for causing noise. :(
I'm going to test a kernel every week and report, when the
problems are gone.

Martin


dmesg queries

2005-06-24 Thread Chris Phillips

Hi All,

I'm trying to figure out a couple of things and would like some advice, please.

My server was on FreeBSD 5.3 STABLE #1 this morning (I only took it to
STABLE because, at the time, my gigabit NIC was not fully supported
in RELEASE).  I upgraded today and it now looks like this: -


% uname -a
FreeBSD venus.rainbow-it.net 5.4-RELEASE-p2 FreeBSD 5.4-RELEASE-p2 #2: Fri Jun 24 13:43:08 BST 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/VENUS  i386


It's a lowly Dell PowerEdge 800 (PE800).


Here's what has me writing... looking in /var/run/dmesg I see: -

kernel: ioapic0: Changing APIC ID to 2
kernel: ioapic1: Changing APIC ID to 3
kernel: ioapic1: WARNING: intbase 32 != expected base 24
kernel: ioapic0 Version 2.0 irqs 0-23 on motherboard
kernel: ioapic1 Version 2.0 irqs 32-55 on motherboard

Should I be worried by that WARNING?


Also in /var/run/dmesg I see: -

kernel: Interrupt storm detected on irq19: uhci0 uhci2; throttling interrupt source
kernel: Interrupt storm detected on irq18: bge0 uhci1+; throttling interrupt source


I think the top device is a Logitech QuickCam Express and the bottom one
(using irq18) is my onboard Gigabit NIC.  Do these lines in dmesg
suggest that there is a problem and, if so, is there anything that I can
do to combat it?  Is it likely that this 'throttling' is slowing my NIC
at all?
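
To see at a glance which IRQ lines are being throttled, how often, and which
drivers share them, the storm messages can be pulled out of the log. A rough
sh/awk sketch, assuming each message sits on a single line in the format
quoted above (the function name storm_summary is made up):

```sh
#!/bin/sh
# storm_summary [FILE...]
# Print "<irq> <times reported> <devices>" for each IRQ named in an
# "Interrupt storm detected" message. Reads stdin if no file is given.
storm_summary() {
    awk '/Interrupt storm detected on irq[0-9]+:/ {
            line = $0
            sub(/.*Interrupt storm detected on irq/, "", line)
            irq = line;  sub(/:.*/, "", irq)
            devs = line; sub(/^[0-9]+: */, "", devs); sub(/;.*/, "", devs)
            count[irq]++
            names[irq] = devs
         }
         END { for (i in count) print i, count[i], names[i] }' "$@" | sort -n
}
```

It can be fed from a pipe, for example 'dmesg | storm_summary', and the
result set beside the per-device rates from 'vmstat -i'.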



I have seen this kind of notification (throttling), when printing.  This 
is the dmesg output for that device: -


kernel: ppc0: ECP parallel printer port port 0x778-0x77f,0x378-0x37f irq 7 drq 1 on acpi0

kernel: ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
kernel: ppc0: FIFO with 16/16/8 bytes threshold
kernel: ppbus0: Parallel port bus on ppc0
kernel: ppbus0: IEEE1284 device found /NIBBLE/ECP
kernel: Probing for PnP devices on ppbus0:
kernel: ppbus0: HEWLETT-PACKARD DESKJET 990C PRINTER MLC,PCL,PML
kernel: lpt0: Printer on ppbus0
kernel: lpt0: Interrupt-driven port
kernel: ppi0: Parallel I/O on ppbus0


Now, the upgrade may well have stopped the behavior mentioned below, but 
if not, does anyone know what I might have done wrong, to be getting (a 
lot of) messages like: -


sio0: 1848 more interrupt-level buffer overflows (total 11269) ?

This is the device: -

kernel: sio0: 16550A-compatible COM port port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0

kernel: sio0: type 16550A


Finally, is it considered bad form to ask multiple questions like I
have, or should I have separated them and sent them in multiple emails?


Kind Regards,


Chris Phillips


pxeboot, NFS and root-path: bug or documentation error?

2005-06-24 Thread Brian Candler
I have been setting up a pxeboot jumpstart environment for FreeBSD 4.11, 
following the instructions at 
http://www.freebsd.org/doc/en_US.ISO8859-1/articles/pxe/

Rather than build pxeboot like this:
# rm -rf /usr/obj/*
# cd /usr/src/sys/boot
# make
# cp /usr/src/sys/boot/i386/pxeldr/pxeboot /usr/tftpboot
I just copied /boot/pxeboot from the FreeBSD-4.11 CD-ROM. Otherwise I followed 
the instructions very closely.

The pxeboot client machine is a Compaq ProLiant DL380.

On first attempt, it got as far as pxeboot starting, and then:

  pxe_open: server addr: 192.168.0.1
  pxe_open: server path: /pxeroot
  pxe_open: gateway ip: 0.0.0.0
  Booting [kernel]...
  can't load 'kernel'
  can't load 'kernel.old'

And my NFS server logs a failed attempt to mount /pxeroot:

  Jun 24 16:32:40 sr-mon-00 mountd[642]: mount request from 192.168.0.240 for non existent path /pxeroot
  Jun 24 16:32:49 sr-mon-00 last message repeated 59 times

This is strange; I thought that at this stage pxeboot would be pulling across 
the kernel and ramdisk via TFTP from /usr/tftpboot, although the 
documentation is far from clear. pxeboot(8) says:

 pxeboot recognizes next-server and option root-path directives as the
 server and path to NFS mount for file requests, respectively, or the
 server to make TFTP requests to.

(Erm, so exactly how do I choose whether to use NFS or to use TFTP for the 
next stage?)

Anyway, assuming that I'm forced to use NFS at this point, I added another 
DHCP option:

  option root-path 192.168.0.1:/usr/tftpboot;

This option is not shown in the example dhcpd.conf in pxeboot(8), nor in the 
article referred to above. However, if I also put an entry in the NFS 
server's /etc/hosts file for the client DHCP address, it then works properly.

So the question is: when pxeboot runs on the client, is it able to fetch 
loader.rc, the kernel and ramdisk via TFTP, or only via NFS?

If it's only NFS, then I think the pxeboot(8) manpage, and the pxeboot 
article, ought to be updated. If it *can* use TFTP, does anyone have any 
suggestions for what I was doing wrong?

Regards,

Brian Candler.


Re: dmesg queries

2005-06-24 Thread Vivek Khera


On Jun 24, 2005, at 12:09 PM, Chris Phillips wrote:


> Here's what has me writing... looking in /var/run/dmesg I see: -
>
> kernel: ioapic0: Changing APIC ID to 2
> kernel: ioapic1: Changing APIC ID to 3
> kernel: ioapic1: WARNING: intbase 32 != expected base 24
> kernel: ioapic0 Version 2.0 irqs 0-23 on motherboard
> kernel: ioapic1 Version 2.0 irqs 32-55 on motherboard
>
> Should I be worried by that WARNING?

I see this as well on my PE800.

> Also in /var/run/dmesg I see: -
>
> kernel: Interrupt storm detected on irq19: uhci0 uhci2; throttling interrupt source
> kernel: Interrupt storm detected on irq18: bge0 uhci1+; throttling interrupt source

I get this as well.  I don't use USB so I turned that off at the BIOS
and removed it from my kernel, but I still get interrupt storm on
my bge0 device.  No idea why.


Performance is not all that great on this box disk-wise, but CPU wise  
it is acceptably fast.  I run FreeBSD/amd64 on it rather than FreeBSD/ 
i386.



Vivek Khera, Ph.D.
+1-301-869-4449 x806




Re: pxeboot, NFS and root-path: bug or documentation error?

2005-06-24 Thread John Baldwin
On Friday 24 June 2005 01:03 pm, Brian Candler wrote:
 I have been setting up a pxeboot jumpstart environment for FreeBSD 4.11,
 following the instructions at
 http://www.freebsd.org/doc/en_US.ISO8859-1/articles/pxe/

 Rather than build pxeboot like this:
 # rm -rf /usr/obj/*
 # cd /usr/src/sys/boot
 # make
 # cp /usr/src/sys/boot/i386/pxeldr/pxeboot /usr/tftpboot
 I just copied /boot/pxeboot from the FreeBSD-4.11 CD-ROM. Otherwise I
 followed the instructions very closely.

 The pxeboot client machine is a Compaq ProLiant DL380.

 On first attempt, it got as far as pxeboot starting, and then:

   pxe_open: server addr: 192.168.0.1
   pxe_open: server path: /pxeroot
   pxe_open: gateway ip: 0.0.0.0
   Booting [kernel]...
   can't load 'kernel'
   can't load 'kernel.old'

 And my NFS server logs a failed attempt to mount /pxeroot:

   Jun 24 16:32:40 sr-mon-00 mountd[642]: mount request from 192.168.0.240
 for non existent path /pxeroot
   Jun 24 16:32:49 sr-mon-00 last message repeated 59 times

 This is strange; I thought that at this stage pxeboot would be pulling
 across the kernel and ramdisk via TFTP from /usr/tftpboot, although the
 documentation is far from clear. pxeboot(8) says:

  pxeboot recognizes next-server and option root-path directives as the
  server and path to NFS mount for file requests, respectively, or the
  server to make TFTP requests to.

 (Erm, so exactly how do I choose whether to use NFS or to use TFTP for the
 next stage?)

 Anyway, assuming that I'm forced to use NFS at this point, I added another
 DHCP option:

   option root-path "192.168.0.1:/usr/tftpboot";

 This option is not shown in the example dhcpd.conf in pxeboot(8), nor in
 the article referred to above. However, if I also put an entry in the NFS
 server's /etc/hosts file for the client DHCP address, it then works
 properly.

 So the question is: when pxeboot runs on the client, is it able to fetch
 loader.rc, the kernel and ramdisk via TFTP, or only via NFS?

 If it's only NFS, then I think the pxeboot(8) manpage, and the pxeboot
 article, ought to be updated. If it *can* use TFTP, does anyone have any
 suggestions for what I was doing wrong?

It uses TFTP to fetch the pxeboot binary itself.  After that, it uses either 
NFS or TFTP.  By default it uses NFS to access /boot/loader and friends.  If 
you want it to just use TFTP and not use NFS at all, you need to recompile 
pxeboot with LOADER_TFTP_SUPPORT=yes defined in make.  That is:

% cd /sys/boot
% make clean
% make LOADER_TFTP_SUPPORT=yes
% cp /usr/obj/usr/src/sys/boot/i386/pxeldr/pxeboot /usr/tftpboot
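
With the default NFS build, pxeboot looks for /boot/loader and friends under the directory named by root-path, so an incomplete export shows up as "can't load" errors. A minimal sketch of a sanity check for that directory follows. The function name check_pxe_root and the file list (boot/loader, boot/loader.rc, kernel) are assumptions made for illustration and are not taken from pxeboot(8); adjust the list to your own layout.

```shell
#!/bin/sh
# Hypothetical helper: report files that appear to be missing from the
# directory exported as root-path.  The file list is an assumption.
check_pxe_root() {
    root=$1
    missing=0
    for f in boot/loader boot/loader.rc kernel; do
        if [ ! -f "$root/$f" ]; then
            echo "missing: $root/$f"
            missing=1
        fi
    done
    # Exit status 0 means every listed file was found.
    return "$missing"
}
```

Run on the NFS server, something like `check_pxe_root /usr/tftpboot && echo complete` would confirm the tree before the client is booted.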

-- 
John Baldwin [EMAIL PROTECTED]  http://www.baldwin.cx/~john/
Power Users Use the Power to Serve  =  http://www.FreeBSD.org

-- 
John Baldwin [EMAIL PROTECTED]  http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve  =  http://www.FreeBSD.org


Network/fxp related panic in 5.4?

2005-06-24 Thread Eirik Øverby

Hi all,

I recently re-enabled SMP on one of my 5.4 servers (dual Intel P3),
and after a relatively short while (a couple of days) it started acting
up. Today it was frozen and had dropped into the kernel debugger on the
serial console. The problem is that my serial console was controlled by a
terminal at work, and when I got home it seemed that the work
terminal had disconnected. All I could do was a 'trace' - I don't
have the panic screen (if any), nor do I have any other output, because
the watchdog triggered the power-switch cycle just after I got the trace:


Tracing pid 29 tid 10 td 0xc22a
fxp_intr_body(c2404000,c2404000,40,,8) at fxp_intr_body+0xd0
fxp_intr(c2404000,0,0,0,0) at fxp_intr+0x14e
ithread_loop(c22f6500,e3384d38,0,0,0) at ithread_loop+0x1b8
fork_exit(c06a9150,c22f6500,e3384d38) at fork_exit+0x80
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe3384d6c, ebp = 0 ---
db>

What makes me wonder is ... When I connected the serial console, the  
db prompt was already there. Does that mean that the work terminal  
disconnect somehow sent a telnet break, and triggered the kernel  
debugger? I.e. - this was no panic, but a stupid serial console hiccup?


Is there any way to prevent this in the future - like changing the  
control character that would trigger the kernel debugger? (I have  
BREAK_TO_DEBUGGER in my kernel config..)


Thanks,
/Eirik


Re: ATA_DMA errors

2005-06-24 Thread Johny Mattsson

twesky wrote:

I am having ATA_DMA errors on 5.4R and 5 STABLE up to June 16 (haven't
done a cvsup again).  It doesn't happen on 5.3R or lower.


I've just upgraded my fileserver from 5.1-R to 5.4-R, and I'm seeing 
this problem too now on 3 out of 4 drives.




The exact error message is below:

It happens within a few hours of use.  The laptop will then reboot,
and fsck must be run.  After fsck the timeouts happen within a few
seconds of booting.


My system uses a SiI 0680 UDMA133 controller in addition to the old
built-in Intel PIIX4 UDMA33 controller. My system drive hangs off the
PIIX4 controller and I see no issues with it, only with drives off the SiI:


ad0: 8207MB ST38641A/3.29 [16676/16/63] at ata0-master UDMA33
ad4: 57241MB ST360021A/3.05 [116301/16/63] at ata2-master UDMA100
ad6: 76319MB ST380021A/3.19 [155061/16/63] at ata3-master UDMA100
ad7: 152627MB WDC WD1600JB-00DUA3/75.13B75 [310101/16/63] at ata3-slave UDMA100



Right after the upgrade things worked well for a couple of hours, and
then I got a reboot all of a sudden. Upon inspection I found tons of
both "READ_DMA timed out" and "WRITE_DMA UDMA ICRC error"
messages in the log prior to the reboot. After the reboot it went to do the
fsck and made it perhaps halfway through before it started churning
out "READ_DMA timed out" messages again, followed by the "ad7: warning -
removed from configuration" message.


Things did not get better from there; with each successive reboot
more and more started going wrong. In the end, to get the system
to boot at all, I had to physically disconnect the ad7 drive,
but even so I'm getting READ_DMA timed out messages for ad4 and ad6.


Since I'm getting WRITE_DMA errors on both ad6 and ad7 now (I haven't 
written anything to ad4 yet, so I don't know if I'll get errors on that 
one too), and I wasn't a few hours ago when I was running 5.1-R, I 
refuse to believe that two disks have gone bad in that timespan!
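
To see whether the errors cluster on one drive or one controller, the DMA messages in the system log can be tallied per device. The following is a rough sketch only: the function name count_dma_errors is made up for illustration, and it assumes each message carries the device name as a separate field such as `ad6:` alongside the string READ_DMA or WRITE_DMA.

```shell
#!/bin/sh
# Hypothetical helper: count READ_DMA / WRITE_DMA messages per ATA device.
# $1 = log file to scan (for example /var/log/messages)
count_dma_errors() {
    awk '
        /(READ|WRITE)_DMA/ {
            # Find the first field that looks like a device name, e.g. "ad6:".
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^ad[0-9]+:$/) {
                    dev = substr($i, 1, length($i) - 1)
                    op = ($0 ~ /WRITE_DMA/) ? "WRITE_DMA" : "READ_DMA"
                    count[dev " " op]++
                    break
                }
            }
        }
        END {
            for (k in count) print k, count[k]
        }
    ' "$1" | sort
}
```

Each output line has the form `device operation count`. Counts that are all on the drives behind one controller would point at the controller or its driver rather than at the individual disks.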


I'm not sure what I should do at this point - theoretically I could 
proceed to roll back to 5.1 to prevent further data loss, but I'm 
guessing it'd be good if I kept it for a little while so that I could 
run tests for patches :-/



Seeing the comments about possible failing controller hardware, I might 
see if I can find a replacement controller tomorrow... any ideas in the 
meantime will be appreciated though!


Still feels very iffy that this started happening right after the 
upgrade... I was expecting to get rid of some of the quirks from the 
early preview, not get far worse ones! :-(



Oh, btw, using smartmontools' smartctl, I've learned that
ad4 has had 32 write errors in total, ad6 has had 0 (despite the
WRITE_DMA errors in the system log), and ad7 refuses to even talk SMART.



###

Here's the contents of the dmesg from before I pulled ad7 out:

Jun 24 18:22:19 kernel: FreeBSD 5.4-RELEASE #0: Sun May 8 10:21:06 UTC 2005
Jun 24 18:22:19 kernel: [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC
Jun 24 18:22:19 kernel: Timecounter "i8254" frequency 1193182 Hz quality 0
Jun 24 18:22:19 kernel: CPU: Pentium II/Pentium II Xeon/Celeron (467.73-MHz 686-class CPU)
Jun 24 18:22:19 kernel: Origin = "GenuineIntel" Id = 0x665 Stepping = 5
Jun 24 18:22:19 kernel: Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
Jun 24 18:22:19 kernel: real memory = 805240832 (767 MB)
Jun 24 18:22:19 kernel: avail memory = 778231808 (742 MB)
Jun 24 18:22:19 kernel: npx0: <math processor> on motherboard
Jun 24 18:22:19 kernel: npx0: INT 16 interface
Jun 24 18:22:19 kernel: acpi0: <AWARD AWRDACPI> on motherboard
Jun 24 18:22:19 kernel: acpi0: Power Button (fixed)
Jun 24 18:22:19 kernel: Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
Jun 24 18:22:19 kernel: acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
Jun 24 18:22:19 kernel: cpu0: <ACPI CPU (3 Cx states)> on acpi0
Jun 24 18:22:19 kernel: acpi_throttle0: <ACPI CPU Throttling> on cpu0
Jun 24 18:22:19 kernel: acpi_button0: <Power Button> on acpi0
Jun 24 18:22:19 kernel: pcib0: <ACPI Host-PCI bridge> port 0x5000-0x500f,0x4000-0x4041,0xcf8-0xcff on acpi0
Jun 24 18:22:19 kernel: pci0: <ACPI PCI bus> on pcib0
Jun 24 18:22:19 kernel: agp0: <Intel 82443BX (440 BX) host to PCI bridge> mem 0xe000-0xe3ff at device 0.0 on pci0
Jun 24 18:22:19 kernel: pcib1: <PCI-PCI bridge> at device 1.0 on pci0
Jun 24 18:22:19 kernel: pci1: <PCI bus> on pcib1
Jun 24 18:22:19 kernel: isab0: <PCI-ISA bridge> at device 7.0 on pci0
Jun 24 18:22:19 kernel: isa0: <ISA bus> on isab0
Jun 24 18:22:19 kernel: atapci0: <Intel PIIX4 UDMA33 controller> port 0xf000-0xf00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 7.1 on pci0
Jun 24 18:22:19 kernel: ata0: channel #0 on atapci0
Jun 24 18:22:19 kernel: ata1: channel #1 on atapci0
Jun 24 18:22:19 kernel: uhci0: <Intel 82371AB/EB (PIIX4) USB controller> port 0x9000-0x901f irq 11 at device 7.2 on pci0
Jun 24 18:22:19 kernel: usb0: Intel 82371AB/EB (PIIX4) USB 

Re: ATA_DMA errors

2005-06-24 Thread twesky
I don't think it is a hardware problem.  Unless you replace it with
the exact same hardware, it'll be difficult to determine if it was the
hardware.

I haven't had any issues with 5.3R or any stable version before April
15.  I am going to do some checking this weekend and see whether it is
hardware or software that is causing my timeouts.


Re: pxeboot, NFS and root-path: bug or documentation error?

2005-06-24 Thread Brian Candler
On Fri, Jun 24, 2005 at 01:50:37PM -0400, John Baldwin wrote:
 It uses TFTP to fetch the pxeboot binary itself.  After that, it uses either 
 NFS or TFTP.  By default it uses NFS to access /boot/loader and friends.  If 
 you want it to just use TFTP and not use NFS at all, you need to recompile 
 pxeboot with LOADER_TFTP_SUPPORT=yes defined in make.  That is:
 
 % cd /sys/boot
 % make clean
 % make LOADER_TFTP_SUPPORT=yes
 % cp /usr/obj/usr/src/sys/boot/i386/pxeldr/pxeboot /usr/tftpboot

Thank you, that's very clear. Re-reading the manpage I do now see the phrase
"selectable through compile-time options"; perhaps it would be worth also
showing those options.

Is there any fundamental reason why both couldn't be compiled in at once,
e.g. limitations on the pxeboot binary size? Or is it just awkward to
implement?

I would have no objection to

options root-path = tftp://192.168.0.1/usr/tftpboot;

I would also have no objection to pxeboot.nfs and pxeboot.tftp being
built :-)

I'll try building the tftp version when back in the office next week.

Thanks again,

Brian.


Re: panic in RELENG_5 UMA

2005-06-24 Thread Gary Mu1der

All,

Can someone confirm that the following stack trace is showing the same 
problem, or not?


I can reproduce the problem with the custom kernel config included below 
(which is basically GENERIC stripped of devices I don't have or need and 
IPFILTER added), but not with a stock GENERIC kernel.


To cause the crash I'm running 20-30 instances of the following script:

d5# cat arping.sh
#!/bin/sh

while :
do
    arp -d 192.168.4.$1 >/dev/null 2>&1;
    ping -c 1 -t 1 192.168.4.$1 >/dev/null 2>&1;
done
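
For anyone trying to reproduce this, a rough sketch of starting the instances follows. It is not part of the original report: the function name start_instances is invented here, and it assumes arping.sh sits in the current directory and takes the last octet of the target address as its only argument.

```shell
#!/bin/sh
# Hypothetical launcher: start COUNT background copies of a script,
# passing 1..COUNT as the argument, and print the PID of each copy.
# $1 = number of instances, $2 = script to run (default ./arping.sh)
start_instances() {
    count=$1
    script=${2:-./arping.sh}
    i=1
    while [ "$i" -le "$count" ]; do
        "$script" "$i" >/dev/null 2>&1 &
        echo "$!"
        i=$((i + 1))
    done
}
```

Since arping.sh loops forever, keep the PIDs so the copies can be stopped again, for example `pids=$(start_instances 25)` followed later by `kill $pids`.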

d5# uname -a
FreeBSD d5.bidx.com 5.4-RELEASE FreeBSD 5.4-RELEASE #6: Thu Jun 23
13:45:20 EDT 2005
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/DB-DUAL-AMD64-RAID5  amd64

d5# kgdb /usr/obj/usr/src/sys/DB-DUAL-AMD64-RAID5/kernel.debug ./vmcore.5
[GDB will not be able to debug user-mode threads:
/usr/lib/libthread_db.so: Undefined symbol ps_pglobal_lookup]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".
#0  doadump () at pcpu.h:167
167 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:167
#1  0x in ?? ()
#2  0x802557b7 in boot (howto=260) at
/usr/src/sys/kern/kern_shutdown.c:410
#3  0x80255fef in panic (fmt=0xff00b5907500  ë6µ)
at /usr/src/sys/kern/kern_shutdown.c:566
#4  0x8029ad2a in sbdrop_locked (sb=0xb6274860, len=1146)
at /usr/src/sys/kern/uipc_socket2.c:1149
#5  0x8029afe2 in sbflush_locked (sb=0xb6274860)
at /usr/src/sys/kern/uipc_socket2.c:1116
#6  0x8029b049 in sbrelease_locked (sb=0xb6274860,
so=0xff00a0a2a8a0)
at /usr/src/sys/kern/uipc_socket2.c:564
#7  0x8029b0d5 in sbrelease (sb=0xb6274860,
so=0xff00a0a2a8a0)
at /usr/src/sys/kern/uipc_socket2.c:577
#8  0x80297b03 in sorflush (so=0xff00a0a2a8a0)
at /usr/src/sys/kern/uipc_socket.c:1483
#9  0x80297e42 in sofree (so=0xff00a0a2a8a0) at
/usr/src/sys/kern/uipc_socket.c:407
#10 0x80298467 in soclose (so=0xff00a0a2a8a0) at
/usr/src/sys/kern/uipc_socket.c:485
#11 0x802847b5 in soo_close (fp=0xff009ca95b60, td=0x0)
at /usr/src/sys/kern/sys_socket.c:299
#12 0x8022c2c0 in fdrop_locked (fp=0xff009ca95b60,
td=0xff00b5907500)
at file.h:288
#13 0x8022c40a in closef (fp=0xff009ca95b60,
td=0xff00b5907500)
at /usr/src/sys/kern/kern_descrip.c:1920
#14 0x8022e5be in fdfree (td=0xff00b5907500)
at /usr/src/sys/kern/kern_descrip.c:1624
#15 0x80238bd0 in exit1 (td=0xff00b5907500, rv=0)
at /usr/src/sys/kern/kern_exit.c:236
#16 0x8023a04e in sys_exit (td=0x0, uap=0x0) at
/usr/src/sys/kern/kern_exit.c:93
#17 0x8035cd8c in syscall (frame=
  {tf_rdi = 0, tf_rsi = 5263360, tf_rdx = 0, tf_rcx = 34366596768,
tf_r8 = 0, tf_r9 = 140737488350136, tf_rax = 1, tf_rbx = 0, tf_rbp = 3,
tf_r10 = -1099499764224, tf_r11 = 515, tf_r12 = 140737488350376,
tf_r13 = 0, tf_r14 = 0, tf_r15 = 0, tf_trapno = 12,
tf_addr = 34368259080, tf_flags = 0, tf_err = 2, tf_rip = 34366590280,
tf_cs = 43, tf_rflags = 514, tf_rsp = 140737488350296, tf_ss = 35}) at
/usr/src/sys/amd64/amd64/trap.c:771
#18 0x80349f88 in Xfast_syscall () at
/usr/src/sys/amd64/amd64/exception.S:248
#19 0x in ?? ()
#20 0x00505000 in ?? ()
#21 0x in ?? ()
#22 0x00080068a6a0 in ?? ()
#23 0x in ?? ()
#24 0x7fffebb8 in ?? ()
#25 0x0001 in ?? ()
#26 0x in ?? ()
#27 0x0003 in ?? ()
#28 0xffb50600 in ?? ()
#29 0x0203 in ?? ()
#30 0x7fffeca8 in ?? ()
#31 0x in ?? ()
#32 0x in ?? ()
#33 0x in ?? ()
#34 0x000c in ?? ()
#35 0x000800820408 in ?? ()
#36 0x in ?? ()
#37 0x0002 in ?? ()
#38 0x000800688d48 in ?? ()
#39 0x002b in ?? ()
#40 0x0202 in ?? ()
#41 0x7fffec58 in ?? ()
#42 0x0023 in ?? ()
#43 0x7fffe968 in ?? ()
#44 0x0023 in ?? ()
#45 0x in ?? ()
#46 0x in ?? ()
#47 0x in ?? ()
#48 0x in ?? ()
#49 0x in ?? ()
#50 0x in ?? ()
#51 0x in ?? ()
#52 0x in ?? ()
#53 0xa14b4000 in ?? ()
#54 0xb6274c40 in ?? ()
#55 0x0101 in ?? ()
#56 0x in ?? ()
#57 0xff00b536eba0 in ?? ()
#58 0xff00ec19a780 in ?? ()
#59 0xb6274b58 in 

Re: panic in RELENG_5 UMA

2005-06-24 Thread Gary Mu1der
Sorry, I forgot to add that this is a Tyan Thunder K8SPRO w/dual AMD 
Opteron Processors, model no. 246, 4GB of RAM and an Adaptec 2200S RAID 
controller.


The NIC being used is the onboard Broadcom Gigabit Ethernet (bge).

Thanks,
Gary




Re: Data corruption in cd9660 on FreeBSD 4.11?

2005-06-24 Thread Peter Jeremy
On Fri, 2005-Jun-24 22:31:06 +1000, Stephen McKay wrote:
I'm experiencing data corruption when reading CDs and DVDs on FreeBSD 4.11.
...
So, can anyone suggest any more tests I could try?  Or is there a kind of
hardware fault that could cause this substitution of whole blocks read from
CDs without causing any other problems?

You might like to post the relevant sections of a verbose boot - the
ATA and CD probes.

Are you running the CD/DVD drives in PIO or UDMA modes?  In the former,
the CPU is reading the data from the CD and writing it to memory.  In
the latter, the CPU tells the disk controller where to write.  It could
be instructive to change modes and see what happens.

Have you tried anything other than ISO9660 filesystems on a physical CD?
What happens if you just dd the CD-ROM?  What happens if you use a vnode
mount (see vnconfig(8)) of an ISO filesystem sitting in a UFS filesystem?
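
One way to run the dd test is to read the same device or image twice and compare checksums, which takes the cd9660 code out of the picture. This is a minimal sketch under stated assumptions: the function names read_checksum and reads_agree are made up for illustration, the 2048-byte block size is the usual CD data sector size, and the device name in the usage note is only a guess at what your system calls the drive.

```shell
#!/bin/sh
# Hypothetical helpers for comparing two raw reads of the same source.
# $1 = device or image file, $2 = block size in bytes (default 2048)
read_checksum() {
    dd if="$1" bs="${2:-2048}" 2>/dev/null | cksum
}

# Succeeds (exit status 0) when two consecutive reads give the same checksum.
reads_agree() {
    first=$(read_checksum "$1")
    second=$(read_checksum "$1")
    [ "$first" = "$second" ]
}
```

For example, `reads_agree /dev/acd0c || echo "reads differ"`. Caching may hide a difference between two back-to-back reads, so comparing against a checksum taken on another machine is a stronger test.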

Anything unusual in your kernel config file?

Have you tried building a kernel with WITNESS and/or DIAGNOSTIC?

Any chance of you repeating the tests with a 5.x system?  Maybe
on a spare small partition or using a 5.4-RELEASE disk1 as a live
filesystem.

-- 
Peter Jeremy


Re[2]: ATA DMA timeouts [NOT FIXED] (forget my last mail)

2005-06-24 Thread Tony Byrne
Hello Martin,

M> believe it or not. After sending the last mail and closing
M> firefox, the system has been hanging for a few seconds and
M> spit this out:

M> ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=226025086

Yeah, I updated and rebuilt on seeing your email, but found it did
little.  I had TIMEOUTs again within 10 minutes of rebooting.

M> Sorry for causing noise. :(
M> I'm going to test a kernel every week and report, when the
M> problems are gone.

I'd certainly be interested in hearing about your results, but from
the general lack of discussion of the problem on this list, I'm
guessing we may be on our own. I'm not sure anyone is actually working
on a fix, because I'm not sure the problem is recognized.

I have two separate Intel ICH5-based boxes affected by DMA TIMEOUTs,
both with SATA drives, and I'd like to get to the bottom of this.

Regards,

Tony.

-- 
Tony Byrne




Re: 5.4-p1 crash

2005-06-24 Thread Doug White
On Mon, 20 Jun 2005, Philippe PEGON wrote:

 Philippe PEGON wrote:
  Mitch Parks wrote:
 
  On Sun, 19 Jun 2005, Doug White wrote:
 
  On Fri, 17 Jun 2005, Mitch Parks wrote:
 
  Below are details regarding another crash on a Dell 2600 SMP (HTT
  and USB
  disabled). It has been 9 days since the last crash. I didn't have
  the serial
  console in place for this last crash, but it is now.
 
 
 
  As noted, the ttwakeup() panic is a known bug. The best thing we have
  for
  a fix is this patch:
 
  http://people.freebsd.org/~mlaier/tty.t_pgrp.diff
 
  Please give it a try and report back if you have any more panics (or
  don't :-) ).
 
 
 
  Thanks! This patch appears to be for 5.3, but I manually applied the
  chunk of the patch that didn't apply cleanly and the countdown is on.
 
  I'll report back in 10 days unless something bad happens before then.
 
  Below is the patch chunk #10 that I actually applied rather than the
  one given. If I've done something bad here by removing the PGRP_LOCK
  please let me know.
 
 
  I'm not a kernel developer, but if you remove
 
  PGRP_LOCK(tp->t_pgrp);
 
  and the PGRP_UNLOCK(tp->t_pgrp) in the if condition (removed by the
  original patch)
 
  there is maybe another PGRP_UNLOCK(tp->t_pgrp); to remove if the if
  condition doesn't match, line 2528 in the original 5.4-p1 tty.c ?

 after having applied the patch (with your modification), there is no
 sx_sunlock(&proctree_lock) in the ttyinfo function if the three
 conditions failed. Maybe we have just to replace
 PGRP_UNLOCK(tp->t_pgrp); line 2528 by sx_sunlock(&proctree_lock) ?
 I think that we need the help of a kernel developer.

No, that would be a leaked lock, which would cause hangs.  More likely it's
some other case that got missed and needs locks extended to it, or the
aliased pgrp isn't the underlying problem.

I've run out of time to debug this, unfortunately...


 
 
  
  Hunk #6 succeeded at 1154 (offset -51 lines).
  Hunk #7 succeeded at 1215 (offset -6 lines).
  Hunk #8 succeeded at 1203 (offset -51 lines).
  Hunk #9 succeeded at 1946 (offset -5 lines).
  Hunk #10 failed at 2562.
  Hunk #11 succeeded at 2847 (offset -212 lines).
  1 out of 11 hunks failed--saving rejects to tty.c.rej
 
 
  @@ -2495,19 +2511,21 @@
   * On return following a ttyprintf(), we set tp->t_rocount to 0 so
   * that pending input will be retyped on BS.
   */
  +   sx_slock(&proctree_lock);
  if (tp->t_session == NULL) {
  +   sx_sunlock(&proctree_lock);
  ttyprintf(tp, "not a controlling terminal\n");
  tp->t_rocount = 0;
  return;
  }
  if (tp->t_pgrp == NULL) {
  +   sx_sunlock(&proctree_lock);
  ttyprintf(tp, "no foreground process group\n");
  tp->t_rocount = 0;
  return;
  }
  -   PGRP_LOCK(tp->t_pgrp);
  -   if ((p = LIST_FIRST(&tp->t_pgrp->pg_members)) == 0) {
  -   PGRP_UNLOCK(tp->t_pgrp);
  +   if ((p = LIST_FIRST(&tp->t_pgrp->pg_members)) == NULL) {
  +   sx_sunlock(&proctree_lock);
  ttyprintf(tp, "empty foreground process group\n");
  tp->t_rocount = 0;
  return;
 
  Or the complete patch:
  http://kuoi.asui.uidaho.edu/~mitch/crash/tty_5.4.patch
 
  Mitch Parks
  [EMAIL PROTECTED]
 
 




-- 
Doug White|  FreeBSD: The Power to Serve
[EMAIL PROTECTED]  |  www.FreeBSD.org