Re: atapicam hangs system

2003-07-18 Thread Daniel Lang
Dear SCSI experts,

I am currently discussing a problem I am experiencing with
Thomas' atapicam(4) driver. To simplify the discussion and to
collect opinions and findings in one place, I have opened a PR
on the matter.

Please have a look at

kern/54616

and tell us your opinions.

Many thanks!

Daniel
-- 
IRCnet: Mr-Spock
 - kommst du siehst du, gehst du hast du, weisst du, krass! -
Daniel Lang * [EMAIL PROTECTED] * +49 89 289 18532 * http://www.leo.org/~dl/


smime.p7s
Description: S/MIME cryptographic signature


Re: atapicam hangs system

2003-07-17 Thread Thomas Quinot
On 2003-07-17, Daniel Lang wrote:

> I found out that the hangs do not occur (or are far less likely)
> if the writing speed is <= 12. But they are very likely to occur
> if the (attempted) writing speed is as high as 48.

Hmm, nasty, nasty. It looks like the number of interrupts caused by
high-speed burning might trigger a race condition between two
instances of camisr(). camisr() does splcam(), but maybe this is not
sufficient to correctly prevent concurrent execution when interrupts
from the ATA driver occur. Maybe the freebsd-scsi people will have a
clearer idea of what is going on here.

Thomas.

-- 
[EMAIL PROTECTED]
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: atapicam hangs system

2003-07-17 Thread Daniel Lang
Hi,

I managed to do some more investigations.

Daniel Lang wrote on Tue, Jul 15, 2003 at 04:25:56PM +0200:
[..]
> After two successful writes, the system hung again, as before.
> _No_ messages on the console.

I found out that the hangs do not occur (or are far less likely)
if the writing speed is <= 12. But they are very likely to occur
if the (attempted) writing speed is as high as 48.

Maybe this is an important hint. Although the drive and the
media claim to support speed 48, the overall throughput seems
to be lower in practice, around 20-30 as far as I can tell.
I still manage to write a CD now and then with this setting,
but maybe something gets confused after a while if the
application tries to sustain a high writing speed that the
drive (or the rest of the system, bus, etc.) cannot keep up
with. Does this sound reasonable, or am I poking around in utter
darkness here?
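
For orientation, the speed factors mentioned above can be converted into
data rates. This is only a rough sketch: it assumes the usual convention
that 1x means 75 sectors per second and that each sector carries 2048 bytes
of user data (Mode 1 data; audio or raw sectors are larger). The function
name is made up for illustration, and the 20x and 30x rows are just the
estimate given above, not measurements.

```python
SECTORS_PER_SECOND_1X = 75      # 1x CD speed: 75 sectors per second
BYTES_PER_SECTOR_MODE1 = 2048   # assumption: Mode 1 data sectors

def cd_speed_to_kib_per_s(speed_factor):
    """Return the sustained data rate in KiB/s for a CD speed factor."""
    return speed_factor * SECTORS_PER_SECOND_1X * BYTES_PER_SECTOR_MODE1 / 1024

# Compare the speed that works (12), the estimated real throughput (20-30)
# and the requested speed (48).
for factor in (12, 20, 30, 48):
    print(f"{factor:2d}x -> {cd_speed_to_kib_per_s(factor):7.0f} KiB/s")
```

Under these assumptions, speed 48 corresponds to 7200 KiB/s, four times
the rate at speed 12 (1800 KiB/s).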

> I entered the debugger with Ctrl-Alt-Esc and did a trace.
> I did not copy everything, because I thought I could use gdb -k
> later on (I was wrong). However, what I've saved from the trace was:
> 
> Apparently the system hung in
> 
> camisr(c02f3250,c02b7078,c253aa3,0,10) at camisr+0x8f
> 
> eip: 0xc01279d7, esp: 0xc0297008, ebp: 0xc0297020
> 
> Please advise on what to examine and how.
[..]
> Remote-GDB debugging is not an option, unfortunately. I don't have
[..]

I withdraw that statement! I did set up a remote gdb session
successfully!

But it was sort of useless.

After the system hung again, I used Ctrl-Alt-Esc to enter DDB.
I fired up the remote gdb and told it to remote connect.
Then I issued the 'gdb' command to DDB.
The remote gdb took over and I was in control.

But it does not seem to help, because the stack contains only
the DDB routines. I include the (apparently useless)
backtrace here:

Program received signal SIGTRAP, Trace/breakpoint trap.
Debugger (msg=0xc02a7c49 "manual escape to debugger")
    at /usr/src/sys/i386/i386/db_interface.c:319
319  * XXX
(kgdb) bt
#0  Debugger (msg=0xc02a7c49 "manual escape to debugger")
    at /usr/src/sys/i386/i386/db_interface.c:319
#1  0xc024ce92 in scgetc (sc=0xc030cb20, flags=2)
    at /usr/src/sys/dev/syscons/syscons.c:3164
#2  0xc0249645 in sckbdevent (thiskbd=0xc0305540, event=0, arg=0xc030cb20)
    at /usr/src/sys/dev/syscons/syscons.c:617
#3  0xc0240ea6 in atkbd_intr (kbd=0xc0305540, arg=0x0)
    at /usr/src/sys/dev/kbd/atkbd.c:462
#4  0xc026c48c in atkbd_isa_intr (arg=0xc0305540)
    at /usr/src/sys/isa/atkbd_isa.c:140
#5  0xc02531ef in Xresume1 ()

How do I get to the hanging routine from here?

I'm willing to trace the problem from here, but I need advice
on how to proceed.

Thanks a lot.

Daniel
-- 
IRCnet: Mr-Spock - "I hear that, if you play the WindowsXP CD
backwards, you get a Satanic message!"
- "That's nothing. If you play it forward, it installs WindowsXP!"
 Daniel Lang * [EMAIL PROTECTED] * +49 89 289 18532 * http://www.leo.org/~dl/




Re: atapicam hangs system

2003-07-16 Thread Daniel Lang
Hi hackers,

below is a problem I am currently discussing with Thomas
(maintainer of atapicam). Since it has turned into a
kernel debugging issue, I think I could use input from
-hackers as well.

Summary: The system (SCSI bus?) hangs after using an ATAPI CD-RW
         device via the atapicam layer.

atapicam is the only ATA device option in the kernel
(so no acd# devices, just cd#).

Original Message follows:

Thomas Quinot wrote on Tue, Jul 08, 2003 at 08:48:38PM +0200:
[..]
> Your problem sounds like the ATA bus hanging. When it hangs and you can
> still access your xterms, it would be nice to see what dmesg says.
> You can also try to trigger the hang outside of X, and see if there are
> messages on the console. Another possible solution is to drop into DDB
> using Ctrl+Alt+Esc, and then manually trigger a panic.
[..]
> BTW, you're using the latest -STABLE kernel, right?
[..]

So, back in the office, I went back to the issue. Before anything
else, I've updated my system to 4.8-STABLE, then built a
debugging kernel with options DDB.

uname -a:
FreeBSD atrbg11.informatik.tu-muenchen.de 4.8-STABLE FreeBSD 4.8-STABLE #19: Mon Jul 14
18:17:20 CEST 2003 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/ATRBG11  i386

I did not start X but worked on the console. I used cdrdao to write
a cdimage:

# camcontrol devlist
at scbus0 target 1 lun 0 (pass0,da0)
at scbus0 target 2 lun 0 (pass1,da1)
at scbus1 target 0 lun 0 (pass2,cd0)
at scbus2 target 0 lun 0 (pass3,cd1)

The last one is the cd writer I've used with the following command:

# cdrdao write --device 2,0,0 --driver generic-mmc --speed 48 -v 2 -n bla.toc
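
The "2,0,0" passed to --device is the bus,target,lun triple of the last
devlist entry (scbus2 target 0 lun 0, i.e. cd1). A minimal sketch of that
mapping, working only on the part of each devlist line that is shown above;
the function name and regular expression are illustrative and not part of
camcontrol or cdrdao.

```python
import re

# Matches the portion of a `camcontrol devlist` line shown above,
# e.g. "at scbus2 target 0 lun 0 (pass3,cd1)".
DEVLIST_RE = re.compile(
    r"at scbus(?P<bus>\d+) target (?P<target>\d+) lun (?P<lun>\d+) "
    r"\((?P<names>[^)]*)\)"
)

def parse_devlist(text):
    """Map each peripheral name (e.g. 'cd1') to its 'bus,target,lun' string."""
    mapping = {}
    for line in text.splitlines():
        match = DEVLIST_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like devlist entries
        address = "{bus},{target},{lun}".format(**match.groupdict())
        for name in match.group("names").split(","):
            mapping[name.strip()] = address
    return mapping

SAMPLE = """\
at scbus0 target 1 lun 0 (pass0,da0)
at scbus0 target 2 lun 0 (pass1,da1)
at scbus1 target 0 lun 0 (pass2,cd0)
at scbus2 target 0 lun 0 (pass3,cd1)
"""

print(parse_devlist(SAMPLE)["cd1"])  # the writer used in the command above
```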

After two successful writes, the system hung again, as before.
Unfortunately there are _no_ messages on the console.

I entered the debugger with Ctrl-Alt-Esc and did a trace.
I did not copy everything, because I thought I could use gdb -k
later on (I was wrong). However, what I've saved from the trace was:

Apparently the system hung in

camisr(c02f3250,c02b7078,c253aa3,0,10) at camisr+0x8f

eip: 0xc01279d7, esp: 0xc0297008, ebp: 0xc0297020

Please advise on what to examine and how.
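
One thing that can be derived from the saved trace line is the start
address of camisr in the running kernel. A small sketch of the arithmetic,
assuming the reported eip belongs to the camisr frame listed with it (the
trace was only partially copied, so this is not certain); the helper name
is made up for illustration.

```python
def symbol_base(address, offset):
    """Return a function's start address from 'symbol+offset' trace data."""
    return address - offset

EIP = 0xC01279D7        # eip reported by DDB
CAMISR_OFFSET = 0x8F    # from "camisr+0x8f"

base = symbol_base(EIP, CAMISR_OFFSET)
print(hex(base))
```

If a debug kernel matching the running one is available, the result could
be cross-checked against its symbol table (for example with nm -n), and
gdb's "list *0xc01279d7" could show the corresponding source line.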

Forcing a panic did not work. I could call panic, but
'continue' did not write a crash dump to disk; the system hung again.

'call boot(0)' did not work either, and from then on I could not even
get back into the debugger and had to hit the reset button.

So it seems that SCSI disk operations are not working either,
so maybe it is not the ATA bus that is hanging but the SCSI subsystem?

Remote GDB debugging is not an option, unfortunately. I don't have
another RELENG_4 machine ready. However, I have a laptop with
a half-working 5.1-CURRENT. There are no 4.x sources on it...
Maybe it could work if I copy the source tree, but I would like
to have some confirmation that it works before I put effort
into this.

Please advise. :)

Best regards,
 Daniel
-- 
IRCnet: Mr-Spock  - May His Shadow fall upon thee - 
 Daniel Lang * [EMAIL PROTECTED] * +49 89 289 18532 * http://www.leo.org/~dl/

