I am having some corruption problems that seem to be in my SCSI system.  We
are running Debian 2.1 with a pure 2.0.37 kernel, have an Adaptec AHA-2940U2
with five 9GB Seagate drives attached.  We are running an Oracle database and did
not have any problems until we started running large data loads.  Then we
started getting data corruption in the database and file system.  We are also
getting errors in the syslog:
kernel: attempt to access beyond end of device
kernel: 08:41: rw=0, want=551158321, limit=8883913
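
To make lines like this easier to check against the partition table, here is a
rough sketch of a decoder.  It assumes that the device field is printed as
major:minor in hexadecimal and that SCSI disks use major 8 with 16 minors per
disk (minor = disk index * 16 + partition number); both are assumptions about
the kernel's output, not something confirmed here.  The function name decode()
and its minors_per_disk parameter are made up for illustration.

```python
import re

# Matches the device field and the three counters in a line such as:
#   kernel: 08:41: rw=0, want=551158321, limit=8883913
LINE_RE = re.compile(
    r"(?P<major>[0-9a-fA-F]{2}):(?P<minor>[0-9a-fA-F]{2}): "
    r"rw=(?P<rw>\d+), want=(?P<want>\d+), limit=(?P<limit>\d+)"
)


def decode(line, minors_per_disk=16):
    """Split one 'beyond end of device' detail line into its fields.

    Assumption: major/minor are hexadecimal, and for major 8 the minor
    encodes disk index and partition number (16 minors per disk).
    Returns None if the line does not match.
    """
    m = LINE_RE.search(line)
    if m is None:
        return None
    major = int(m.group("major"), 16)
    minor = int(m.group("minor"), 16)
    disk_index, partition = divmod(minor, minors_per_disk)
    want = int(m.group("want"))
    limit = int(m.group("limit"))
    return {
        "major": major,
        "minor": minor,
        # Only name the disk when the major is the assumed SCSI disk major.
        "disk": ("sd" + chr(ord("a") + disk_index)) if major == 8 else None,
        "partition": partition,  # 0 would mean the whole disk
        "rw": int(m.group("rw")),
        "want": want,
        "limit": limit,
        "overshoot": want - limit,  # how far past the end the request was
    }


if __name__ == "__main__":
    print(decode("kernel: 08:41: rw=0, want=551158321, limit=8883913"))
```

Running every such syslog line through this and tallying the (disk, partition)
pairs would show whether the bad requests cluster on one drive or are spread
across all of them.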

The major and minor numbers change often, and they never refer to a valid
partition, always one of the last partitions (15) on one of the SCSI drives.
These messages normally come in groups during heavy database usage.  Random
messages about I/O errors and files not existing show up in the Oracle logs.  I
have run a bunch of tests on the memory and replaced the CPU.  I have done
multiple bad block scans on the disks and the media does not appear to be the
problem.  Strangely, in some cases I can shut down the database, remount the
drives (which flushes the cache), restart the database, and the problem is
gone.  I have tried to reproduce the problem by
cat'ing multiple drives to /dev/null simultaneously and such, but have been
unsuccessful.  The only way I have found to reproduce it is to do a large
data load and wait a few hours.  Unfortunately we don't have a spare Adaptec
card and since we are a small business, don't want to order another until we
have established that that is the problem.
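
Since cat'ing drives to /dev/null only exercises reads, one thing that might
get closer to the data-load pattern without Oracle is a write-then-verify
test.  The following is only a sketch under that assumption: it writes a file
of reproducible pseudo-random data, forces it out with fsync, then reads it
back and compares checksums.  The function names and the 1 MB block size are
arbitrary choices for illustration.  Note that the read-back may be served
from the buffer cache rather than the disk, so remounting the file system
between the write and the read would make it a stricter test.

```python
import hashlib
import os
import random

BLOCK_SIZE = 1 << 20  # 1 MB per write; an arbitrary choice


def write_pattern(path, total_bytes, seed, block_size=BLOCK_SIZE):
    """Write total_bytes of seeded pseudo-random data; return its SHA-256."""
    rng = random.Random(seed)
    digest = hashlib.sha256()
    remaining = total_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(block_size, remaining)
            # Build n reproducible pseudo-random bytes from the seeded RNG.
            block = rng.getrandbits(8 * n).to_bytes(n, "little")
            f.write(block)
            digest.update(block)
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # ask the kernel to push the data to the device
    return digest.hexdigest()


def read_digest(path, block_size=BLOCK_SIZE):
    """Read the file back in blocks and return its SHA-256."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            digest.update(block)
    return digest.hexdigest()


def write_and_verify(path, total_bytes, seed=0):
    """Return True if the data read back matches what was written."""
    return write_pattern(path, total_bytes, seed) == read_digest(path)
```

Running several of these in parallel, one per drive and with files of a few
gigabytes each, would be the nearest equivalent to the load; a False result
from write_and_verify() would point at the I/O path rather than at Oracle.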

How should I proceed in tracking down this problem?

Thanks,
Josha Foust

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
