Re: [2.2.18pre17 OOPS Report] Linux' musical taste (ide-cdrom / autofs related) (Repost)

2000-11-10 Thread Jens Axboe

On Fri, Nov 10 2000, Henning P. Schmiedehausen wrote:
[snip]
> Running 2.2.18pre17 completely modular built + 20001027 IDE patch from
> kernel.org + Andreas' 2.2.18pre17aa1 patch + some more but I think not
> related patches. Complete Kernel SRPMS and RPMS on request. :-)

> The Mitsumi CDROM is used only for listening to music. There is still
> an entry in my automount table for /mnt/misc mounting /dev/hdd to
> /mnt/misc/cdrom1 if I ever desire to.
> 
> % cat /etc/auto.misc
> [...]
> cdrom1  -fstype=iso9660,ro  :/dev/hdd

[snip cdrom_analyze_sense oops]

Could you try with this patch? It's against 2.2.18-pre21, but will
probably apply cleanly to your current kernel.

-- 
* Jens Axboe <[EMAIL PROTECTED]>
* SuSE Labs


--- /opt/kernel/linux-2.2.17.SuSE/drivers/block/ide-cd.c  Fri Oct 27 12:22:01 2000
+++ drivers/block/ide-cd.c  Fri Nov 10 17:16:14 2000
@@ -324,41 +325,50 @@
 	info->nsectors_buffered = 0;
 }
 
+static int cdrom_log_sense(ide_drive_t *drive, struct packet_command *pc,
+			   struct request_sense *sense)
+{
+	int log = 0;
+
+	if (sense == NULL)
+		return 0;
+
+	switch (sense->sense_key) {
+		case NO_SENSE: case RECOVERED_ERROR:
+			break;
+		case NOT_READY:
+			/*
+			 * don't care about tray state messages for
+			 * e.g. capacity commands or in-progress or
+			 * becoming ready
+			 */
+			if (sense->asc == 0x3a || sense->asc == 0x04)
+				break;
+			log = 1;
+			break;
+		case UNIT_ATTENTION:
+			/*
+			 * Make good and sure we've seen this potential media
+			 * change. Some drives (i.e. Creative) fail to present
+			 * the correct sense key in the error register.
+			 */
+			cdrom_saw_media_change(drive);
+			break;
+		default:
+			log = 1;
+			break;
+	}
+	return log;
+}
 
 static
 void cdrom_analyze_sense_data(ide_drive_t *drive,
  struct packet_command *failed_command,
  struct request_sense *sense)
 {
-	if (sense->sense_key == NOT_READY ||
-	    sense->sense_key == UNIT_ATTENTION) {
-		/* Make good and sure we've seen this potential media change.
-		   Some drives (i.e. Creative) fail to present the correct
-		   sense key in the error register. */
-		cdrom_saw_media_change (drive);
-
-
-		/* Don't print not ready or unit attention errors for
-		   READ_SUBCHANNEL.  Workman (and probably other programs)
-		   uses this command to poll the drive, and we don't want
-		   to fill the syslog with useless errors. */
-		if (failed_command &&
-		    (failed_command->c[0] == GPCMD_READ_SUBCHANNEL ||
-		     failed_command->c[0] == GPCMD_TEST_UNIT_READY))
-			return;
-	}
-
-	if (sense->error_code == 0x70 && sense->sense_key  == 0x02
-	    && ((sense->asc  == 0x3a && sense->ascq   == 0x00) ||
-		(sense->asc  == 0x04 && sense->ascq   == 0x01)))
-	{
-		/*
-		 * Suppress the following errors:
-		 * "Medium not present", "in progress of becoming ready",
-		 * and "writing" to keep the noise level down to a dull roar.
-		 */
+
+	if (!cdrom_log_sense(drive, failed_command, sense))
 		return;
-	}
 
 #if VERBOSE_IDE_CD_ERRORS
 	{
@@ -1105,7 +1115,13 @@
 
 	if (retry && jiffies - info->start_seek > IDECD_SEEK_TIMER) {
 		if (--retry == 0) {
+			/*
+			 * this condition is far too common, to bother
+			 * users about it
+			 */
+#if 0
 			printk("%s: disabled DSC seek overlap\n", drive->name);
+#endif
 			drive->dsc_overlap = 0;
 		}
 	}
@@ -1329,8 +1345,12 @@
 static
 void cdrom_sleep (int time)
 {
-	current->state = TASK_INTERRUPTIBLE;
-	schedule_timeout(time);
+	int sleep = time;
+
+	do {
+		set_current_state(TASK_INTERRUPTIBLE);
+		sleep = schedule_timeout(sleep);
+	} while (sleep);
 }
 
 static
@@ -1848,6 +1868,9 @@
 	struct cdrom_info *info = drive->driver_data;
 	struct atapi_toc *toc = info->toc;
 	int ntracks;
+
+	if (!CDROM_STATE_FLAGS(drive)->toc_valid)
+		return -EINVAL;
 
 	/* Check validity of requested track number. */
 	ntracks = toc->hdr.las
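
For readers who want to see the filtering rule from the first hunk outside the kernel, here is a minimal userspace sketch of the same decision. It is not the driver code: struct sense_info and should_log_sense() are hypothetical names made up for this illustration, and the media-change side effect (cdrom_saw_media_change() in the patch) is modelled as a plain flag. The sense key numbers are the standard SCSI/ATAPI values; the two ASC codes are the ones the patch tests.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Standard SCSI/ATAPI sense key values. */
enum {
	SK_NO_SENSE        = 0x00,
	SK_RECOVERED_ERROR = 0x01,
	SK_NOT_READY       = 0x02,
	SK_UNIT_ATTENTION  = 0x06
};

/* Hypothetical, trimmed-down stand-in for the kernel's struct request_sense. */
struct sense_info {
	uint8_t sense_key;
	uint8_t asc;	/* additional sense code */
};

/*
 * Returns true when the sense data deserves a log message.
 * A NULL sense pointer is tolerated and simply means "nothing to log".
 * On UNIT ATTENTION, *media_changed is set (if the pointer is non-NULL)
 * instead of calling into the driver.
 */
static bool should_log_sense(const struct sense_info *sense,
			     bool *media_changed)
{
	if (sense == NULL)
		return false;

	switch (sense->sense_key) {
	case SK_NO_SENSE:
	case SK_RECOVERED_ERROR:
		return false;
	case SK_NOT_READY:
		/* 0x3a: medium not present, 0x04: unit not ready yet */
		return !(sense->asc == 0x3a || sense->asc == 0x04);
	case SK_UNIT_ATTENTION:
		if (media_changed != NULL)
			*media_changed = true;
		return false;
	default:
		return true;
	}
}
```

Unlike the code being removed, this version checks the sense pointer before reading any field, so a caller that has no sense data can pass NULL without faulting.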

[2.2.18pre17 OOPS Report] Linux' musical taste (ide-cdrom / autofs related) (Repost)

2000-11-10 Thread Henning P. Schmiedehausen

[ Ok, so my first mail seems never to have made it to the list. :-( ]

Hi,

the following situation:

Intel Celeron 667, 128 MB RAM, 440BX-based board (ASUS CUBX)
IBM 30 GB Disk and TEAC CDROM on ide0
LS120 Floppy and a Mitsumi CDROM on ide1   (see boot messages below for details)

Once upon a time this was a RedHat 6.2 box.

Running 2.2.18pre17 completely modular built + 20001027 IDE patch from
kernel.org + Andreas' 2.2.18pre17aa1 patch + some more but I think not
related patches. Complete Kernel SRPMS and RPMS on request. :-)


The following modules loaded (only the interesting ones):

Module                  Size  Used by
nfs                    75736   4 (autoclean)
sr_mod                 15844   0 (autoclean) (unused)
ide-cd                 25756   0 (autoclean)
cdrom                  28348   0 (autoclean) [sr_mod ide-cd]
isofs                  18304   0 (autoclean) (unused)
autofs                  9328   2 (autoclean)
lockd                  45200   0 (autoclean) [nfs]
sunrpc                 57988   1 (autoclean) [nfs lockd]
3c59x                  20872   1 (autoclean)
ide-disk                6160   6

The Mitsumi CDROM is used only for listening to music. There is still
an entry in my automount table for /mnt/misc mounting /dev/hdd to
/mnt/misc/cdrom1 if I ever desire to.

% cat /etc/auto.misc
[...]
cdrom1  -fstype=iso9660,ro  :/dev/hdd
[...]

And after a reinstall of this box, I forgot this line in /etc/fstab,
which is bad but should not do any harm with an audio cd in the drive.

/dev/hdd /mnt/cdrom1 iso9660 noauto,owner,ro 0 0

Nothing really weird happened until today, when I had to reboot the box and
a certain [1] music cd was in the Mitsumi drive. I'm pretty sure that
I've booted with lots of other audio cds in this drive before. This
time, however, the box crashed completely with this oops:

CPU:0
EIP:0010:[]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010097
eax:    ebx: c02b5270   ecx: c6e42810   edx: c6e42805
esi: 0001   edi: c02b5270   ebp: 0282   esp: c026fe2c
ds: 0018   es: 0018   ss: 0018
Process swapper (pid: 0, process nr: 0, stackpage=c026f000)
Stack: c02b5270 0282  c8917969 c02b5270 c89183ec 03e8 c89177d8 
   c02b5270 c6e42800  c6e42850  c02b5198 0258 c891845d 
   c02b5270 0012 c89183ec  c891896e c02b5270 c02b5270 c02b5270 
Call Trace: [] [] [] [] [] 
[] [] 
   [] [] [] [] [] [] 
Code: 83 78 0c 00 0f 85 f2 03 00 00 8b 84 24 88 00 00 00 8a 08 80 

>>EIP; c89170c6 <[ide-cd]cdrom_analyze_sense_data+5a/460>   <=
Trace; c0181723 
Trace; c010a4d6 
Trace; c010a28f 
Trace; c010a5f8 
Trace; c010a2d0 
Trace; c0107b15 
Trace; c0106000 
Trace; c0107b38 
Trace; c0109224 
Trace; c0106000 
Trace; c010607b 
Trace; c0106000 
Trace; c0100175 <_stext+175/6000>
Code;  c89170c6 <[ide-cd]cdrom_analyze_sense_data+5a/460>
 <_EIP>:
Code;  c89170c6 <[ide-cd]cdrom_analyze_sense_data+5a/460>   <=
   0:   83 78 0c 00   cmpl   $0x0,0xc(%eax)   <=
Code;  c89170ca <[ide-cd]cdrom_analyze_sense_data+5e/460>
   4:   0f 85 f2 03 00 00 jne    3fc <_EIP+0x3fc> c89174c2 <[ide-cd]cdrom_analyze_sense_data+456/460>
Code;  c89170d0 <[ide-cd]cdrom_analyze_sense_data+64/460>
   a:   8b 84 24 88 00 00 00  mov    0x88(%esp,1),%eax
Code;  c89170d7 <[ide-cd]cdrom_analyze_sense_data+6b/460>
  11:   8a 08 mov    (%eax),%cl
Code;  c89170d9 <[ide-cd]cdrom_analyze_sense_data+6d/460>
  13:   80 00 00  addb   $0x0,(%eax)

Aiee, killing interrupt handler
Kernel panic: Attempted to kill the idle task!
In swapper task - not syncing

This is the complete serial capture from booting to the crash:


LILO boot: 
Loading linux..
Linux version 2.2.18pre17-2t (root@babsi) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #1 Wed Nov 8 01:09:45 MET 2000
Detected 668197 kHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 1333.65 BogoMIPS
Memory: 127024k/130944k available (1100k kernel code, 412k reserved, 1892k data, 84k init, 0k bigmem)
Dentry hash table entries: 16384 (order 5, 128k)
Buffer cache hash table entries: 131072 (order 7, 512k)
Page cache hash table entries: 32768 (order 5, 128k)
Inode hash table entries: 16384 (order 5, 128k)
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: Intel Pentium III (Coppermine) stepping 03
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 386/387 coupling... OK, FPU using exception 16 error reporting.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.35a (19990819) Richard Gooch ([EMAIL PROTECTED])
PCI: PCI BIOS revision 2.10 entry at 0xf08c0
PCI: Using configuration type 1
PCI: Probing PCI hardware
Linux NET4.0 for Linux 2.2
Based upon Swansea University Computer Society NET3.039
NET4: Unix domain sockets 1.0 for Linux