Re: System Crash when using Amanda

2006-01-11 Thread Stefan G. Weichinger

Freels, James D. wrote:
The problem I had started at kernels greater than 2.6.12.x (starting at 
2.6.13.0) and was finally cleared at 2.6.15.0.


Interesting ... I am using that module with kernel 2.6.13 and have no 
problems. Maybe this is related to the old hardware I use, or maybe the 
SuSE guys have already patched it (assumption: I have done NO research 
on this ...)


Stefan



Re: System Crash when using Amanda

2006-01-11 Thread Stefan G. Weichinger

Stefan G. Weichinger wrote:

Freels, James D. wrote:

The problem I had started at kernels greater than 2.6.12.x (starting 
at 2.6.13.0) and was finally cleared at 2.6.15.0.



Interesting ... I am using that module with kernel 2.6.13 and have no 
problems. Maybe this is related to the old hardware I use, or maybe the 
SuSE guys have already patched it (assumption: I have done NO research 
on this ...)


That rang a bell ...

Could someone point me to any related bug report on this?
Maybe this has to do with some strange symptoms I see at a customer's 
site. They use aic7xxx with linux-2.6.8 on SuSE 9.2 ...


No complete lockups, but strange tape errors all over the place.
Everything has been swapped already; the only thing that helped so far 
was taking the drive out of the box and laying it on top of the case 
with the SCSI cables running through the air ;) Smooth backups since then.


Might be temperature; we'll try an external case for that drive.

But maybe it has to do with some module/kernel problems as well ...

Stefan



Re: System Crash when using Amanda

2006-01-11 Thread Freels, James D.

The following text was included in the 2.6.15 kernel changelog. The
change came from the Adaptec developers. Note the 2.6.13+ reference,
the word "deadlock", and the final sign-offs:

commit e5508c13ac25b07585229b144a45cf64a990171e
Author: Salyzyn, Mark [EMAIL PROTECTED]
Date:   Sat Dec 17 19:26:30 2005 -0800

  [PATCH] dpt_i2o fix for deadlock condition

  Miquel van Smoorenburg [EMAIL PROTECTED] forwarded me this fix to
  resolve a deadlock condition that occurs due to the API change in
  2.6.13+ kernels dropping the host locking when entering the error
  handling.  They all end up calling adpt_i2o_post_wait(), which if  you
  call it unlocked, might return with host_lock locked anyway and that
  causes a deadlock.

  Signed-off-by: Mark Salyzyn [EMAIL PROTECTED]
  Cc: James Bottomley [EMAIL PROTECTED]
  Cc: [EMAIL PROTECTED]
  Signed-off-by: Andrew Morton [EMAIL PROTECTED]
  Signed-off-by: Linus Torvalds [EMAIL PROTECTED]
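
For readers unfamiliar with this failure mode, here is a toy Python model of what the commit message describes: a routine that is called without the lock but returns with it held, so the next attempt to take the lock never succeeds. This is only a sketch. It is not the dpt_i2o code and not the actual patch; the names `post_wait_buggy`, `post_wait_balanced`, and `lock_is_free` are made up for illustration, and a `threading.Lock` stands in for the kernel's `host_lock`.

```python
import threading


def post_wait_buggy(host_lock):
    """Toy stand-in: re-takes the lock on exit, assuming the caller held it on entry."""
    # ... the wait for the adapter's reply would happen here ...
    host_lock.acquire()  # wrong if the caller entered WITHOUT the lock


def post_wait_balanced(host_lock, caller_holds_lock):
    """Hypothetical balanced variant: leaves the lock in the state it was found in."""
    if caller_holds_lock:
        host_lock.release()  # drop the lock while waiting
    # ... the wait for the adapter's reply would happen here ...
    if caller_holds_lock:
        host_lock.acquire()  # restore the caller's lock state


def lock_is_free(lock, timeout=0.1):
    """Try to take the lock briefly; a timeout here models the deadlock."""
    got_it = lock.acquire(timeout=timeout)
    if got_it:
        lock.release()
    return got_it


if __name__ == "__main__":
    lock = threading.Lock()
    post_wait_buggy(lock)  # called unlocked, as in the 2.6.13+ error handling
    print("lock free after buggy call:", lock_is_free(lock))

    lock = threading.Lock()
    post_wait_balanced(lock, caller_holds_lock=False)
    print("lock free after balanced call:", lock_is_free(lock))
```

In the kernel the lock is a spinlock with no timeout, so the second taker would simply spin forever, which matches a machine that stops responding.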

I can definitely say that kernel 2.6.15 fixed my problems.  I can also
definitely say that the last kernel that worked before that was 2.6.12.6.
My guess is that 2.6.8 would also work.

--
James D. Freels, Ph.D.
Oak Ridge National Laboratory
[EMAIL PROTECTED]
http://www.comsol.com/stories/hfir/




Re: System Crash when using Amanda

2006-01-10 Thread Freels, James D.




Do you happen to be using the aic7xxx driver? If so, I had the same problem until I moved to the new 2.6.15 kernel.
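
One way to answer that question on a running system is to look at `/proc/modules`, where the first field of each line is a loaded module's name. The sketch below assumes the driver was built as a module (a driver compiled into the kernel will not show up there); the helper names `loaded_modules` and `uses_driver` are made up for this example.

```python
from pathlib import Path


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # first field is the module name
    return names


def uses_driver(proc_modules_text, driver="aic7xxx"):
    """True if the named driver appears among the loaded modules."""
    return driver in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    path = Path("/proc/modules")
    if path.exists():
        print("aic7xxx loaded:", uses_driver(path.read_text()))
    else:
        print("/proc/modules not available on this system")
```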




--
James D. Freels, Ph.D.
Oak Ridge National Laboratory
[EMAIL PROTECTED]
http://www.comsol.com/stories/hfir/







---BeginMessage---
I hope this is not a rerun of a previous topic but I have not found anything
on the net yet. I have amanda 2.4.5 running on 2 different servers. The
problem started about 5 or 6 weeks after implementing amanda. Kernel is 2.6.
Several times, shortly after amanda started a backup the server locked up. I
could not log on at the console and it did not respond to any services. It
seems to be related to when amanda accesses the tape drive to write to tape.
Can anyone point me in the right direction as to the cause of (and a possible
solution to) this problem? I have had to suspend backups on both servers
since it happened on both.

In fact, the first one ended up crashing 3 times and somehow the server
became unrecoverable. I installed a totally new server with a new SCSI card
for the tape drive, and it locked up after 1 week.

Hopefully someone will have some insight.

Thanks,
Gordon


---End Message---


Re: System Crash when using Amanda

2006-01-10 Thread Michael Loftis



--On January 10, 2006 3:43:56 PM -0500 Freels, James D. 
[EMAIL PROTECTED] wrote:



Do you happen to be using the aic7xxx driver? If so, I had the same
problem until the new kernel 2.6.15.



I'm having problems on Debian with 2.6.8 related to aic7xxx as well. I use 
an aic7xxx HVD SCSI adapter connected to a tape library. Occasionally, since 
going to 2.6, it totally locks up one of the tape drives, to the point that 
I have to shut down and re-init the drive (which in this case means power 
cycling the whole library) and then reboot the tape server.


Re: System Crash when using Amanda

2006-01-10 Thread Freels, James D.




The problem I had started at kernels greater than 2.6.12.x (starting at 2.6.13.0) and was finally cleared at 2.6.15.0.
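
As a rough aid, the range reported above (first bad at 2.6.13, fixed at 2.6.15) can be turned into a quick check on a kernel release string. This is only a sketch based on the versions reported in this thread; the function names are made up for illustration, and a distribution kernel may carry patches that make the plain version number misleading.

```python
import platform
import re

# Bounds taken from the report in this thread, not from any official advisory.
FIRST_BAD = (2, 6, 13)
FIRST_FIXED = (2, 6, 15)


def parse_kernel_version(release):
    """Extract (major, minor, patch) from a string such as '2.6.12-1-smp'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part) for part in match.groups())


def in_reported_range(release):
    """True if the release falls in the range reported as problematic here."""
    return FIRST_BAD <= parse_kernel_version(release) < FIRST_FIXED


if __name__ == "__main__":
    release = platform.release()
    try:
        print(release, "in reported range:", in_reported_range(release))
    except ValueError as err:
        print(err)
```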




--
James D. Freels, Ph.D.
Oak Ridge National Laboratory
[EMAIL PROTECTED]
http://www.comsol.com/stories/hfir/









RE: System Crash when using Amanda

2006-01-10 Thread Gordon J. Mills III



Yes, I am using the aic7xxx driver. I am using the Adaptec 39160 card. My
tape drive is a Dell 112T, basically a 1U unit with 2 DLT 40/80 drives in
it. The other server had the same SCSI card with a Compaq AIT-35 8-tape
autoloader. I will try installing the new kernel to see whether that helps.
The 2 servers that it happened to recently had the Debian package kernel
2.6.12-1-smp.

Thanks for the info.

Regards,
Gordon



Do you happen to be using the aic7xxx driver? If so, I had the same 
problem until the new kernel 2.6.15.

  
  
--
James D. Freels, Ph.D.
Oak Ridge National Laboratory
[EMAIL PROTECTED]
http://www.comsol.com/stories/hfir/


RE: System Crash when using Amanda

2006-01-10 Thread Joshua Baker-LePain

On Tue, 10 Jan 2006 at 8:50pm, Gordon J. Mills III wrote


Yes, I am using the aic7xxx driver. I am using the Adaptec 39160 card. My
tape drive is a Dell 112T, basically a 1U unit with 2 DLT 40/80 drives in
it. The other server had the same SCSI card with a Compaq AIT-35 8-tape
autoloader. I will try installing the new kernel to see. The 2 servers that
it happened to recently had the Debian package kernel 2.6.12-1-smp.


FWIW (and so it's in the archives), if you decide to switch host adapters, 
I've had very good luck with LSI-based products.  My AIT3 loader is 
hanging off a 53c1010-based board (U160) using the sym53c8xx_2 driver, and 
my LTO3 loader is on a 53c1030-based board (U320) using the mptscsih 
driver.  They run CentOS 3 and CentOS 4, respectively.


--
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University