Tag starvation with HPPA kernel

2004-09-16 Thread Stuart Brady
On Thu, Sep 16, 2004 at 12:31:38PM +0200, W. Borgert wrote:
 a colleague tried to install Debian on his HP712/60 with standard SCSI
 adaptor (whatever that is).  He used the d-i testing image of 2004-08-20.
 
 The d-i went smoothly until the creation of file systems.  The kernel
 output on the fourth console said:
 
 SCSI device sda: 8467200 512-bytes hdwr sectors (4335 MB)
  p6 
 adding swap: 191044k swap-space (priority -1)
 scsi0 (3:0) target is suffering from tag starvation.
 scsi0 (3:0) broken device is looping in contingent allegiance: ignoring
 
 After that, the system is stone dead.  As he tried also with two more
 712/60s and also tried other d-i versions, I believe, there must be a
 fundamental problem.  Any ideas?

I think this is a known problem.

To get an idea of the numbers, could anyone affected by this please mail
me privately with the number of machines affected? If you can also give
me extra details, such as what machine and drive you have, and what
value of NCR_700_MAX_TAGS works, that'd be great!

For the time being, he may want to do what I do, which is to install
woody and then upgrade to testing.

The workaround requires rebuilding the kernel with NCR_700_MAX_TAGS
changed to 1 in drivers/scsi/53c700.h.
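
A minimal sketch of that edit, assuming the macro is set by a plain
#define in the header (the value your tree ships with is not shown here
and may differ between kernel versions):

```c
/* drivers/scsi/53c700.h (excerpt, sketch only) */

/* Workaround: allow only one outstanding tagged command per device.
 * Replace whatever value the existing #define carries with 1, then
 * rebuild the kernel (or the driver, if it is built as a module). */
#define NCR_700_MAX_TAGS 1
```

Only the value of the existing definition changes; nothing else in the
header needs to be touched for the workaround.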

Perhaps this workaround should be used for the release if a proper fix
isn't found by then?  Does the freeze prevent that?

BTW, this only affects my machine with the 2.4 kernel (not 2.6).  Does
anyone have this problem with 2.6 at all?
-- 
Stuart Brady




Re: Tag starvation with HPPA kernel

2004-09-16 Thread Joel Soete
Hello Stuart,

The problem I met was not exactly 'tag starvation', but it was directly
related to 53c700.c and NCR_700_MAX_TAGS.

I don't use 2.4 much on my c110, and I use this driver (53c700) only a
little, but recently, while testing glibc with a chroot disk connected
to this narrow SE SCSI controller, I encountered this kind of problem
with a recent 2.6.9-rc2-pa2 plus Carlos's exception-handling revamp patch:
http://lists.parisc-linux.org/pipermail/parisc-linux/2004-September/024736.html

BTW, I found the reference to a similar problem with 2.4:
http://lists.parisc-linux.org/pipermail/parisc-linux/2004-August/024595.html 

and applied this tip:
#define NCR_700_MAX_TAGS 128 (in 53c700.h)
That helped: at least the system no longer crashed, even though I still
get a few of these:
scsi1: (3:0) phase mismatch at 01e8, phase IO CD MSG BSY REQ MSG IN
scsi1: Bus Reset detected, executing command 2fd58480, slot 2fd988a4, dsp 001281e8[01e8]
failing command because of reset, slot 2fd98520, cmnd 2fd58b60
failing command because of reset, slot 2fd9864c, cmnd 2fd588a0
failing command because of reset, slot 2fd98778, cmnd 2fd581c0
failing command because of reset, slot 2fd988a4, cmnd 2fd58480

I would also try ggg's advice:
http://lists.parisc-linux.org/pipermail/parisc-linux/2004-September/024742.html
hth,
   Joel
PS: Hmm, BTW, on this c110 the actual controller is an NCR53C710, and
AFAIK the NCR53C720 is just a 16-bit version of the 710? My thought is:
would it be possible to 'extend' ncr53c8xx to the Lasi chips?