Re: [PATCH] add transport class symlink to device object
On Wednesday 31 August 2005 16:43, Greg KH wrote:
> On Thu, Aug 18, 2005 at 02:50:19PM -0500, Dmitry Torokhov wrote:
> > On 8/18/05, Greg KH <[EMAIL PROTECTED]> wrote:
> > > @@ -500,9 +519,13 @@ int class_device_add(struct class_device
> > >  	}
> > >
> > >  	class_device_add_attrs(class_dev);
> > > -	if (class_dev->dev)
> > > +	if (class_dev->dev) {
> > > +		class_name = make_class_name(class_dev);
> > >  		sysfs_create_link(&class_dev->kobj,
> > >  				  &class_dev->dev->kobj, "device");
> > > +		sysfs_create_link(&class_dev->dev->kobj, &class_dev->kobj,
> > > +				  class_name);
> > > +	}
> > >
> >
> > I wonder if we need to grab a reference to class_dev->dev here:
> >
> >         dev = device_get(class_dev->dev);
> >         if (dev) {
> >
> >         }
> >
> > Otherwise, if device gets unregistered/deleted before class device is
> > deleted we'll get into trouble when removing the link since
> > class_dev->dev will be garbage.
> >
> > .. But grabbing that reference will cause pains in SCSI system which,
> > when I looked, removed class devices from device's release function.
>
> No the sysfs_create_link() call increments the kobject reference on the
> target of the symlink.  See sysfs_add_link() for details.  So this
> should be just fine, right?
>

Yes, you are right. Sorry for the noise.

--
Dmitry
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] libata new EH document
On Thu, Sep 01, 2005 at 02:44:00PM +0900, Tejun Heo wrote:
> Can you please elaborate why getting sense data from EH is bad idea
> for ATAPI?  For more advanced SCSI transports, I agree with you that
> autosensing is necessary with queueing and multiple initiator and etc,
> but I don't really see how requesting sense from EH would be bad for ATAPI.

The long term direction for the SCSI core seems to be that of requiring
auto-sensing.  libata is simply being lazy: while the SCSI core
continues to support kicking the EH thread when sense is missing, it's
preferred for libata to reuse that infrastructure.

Auto-sensing (and READ LOG EXT for NCQ errors) requires either an FSM or
a kernel thread, to initiate a secondary qc for REQUEST SENSE.  Since
the common infrastructure already exists for this, libata reuses the
existing SCSI EH kernel thread.

We should move libata-scsi to auto-sensing, but it's not an urgent
priority.

	Jeff
Re: [RFC] libata new EH document
Hello, Luben.

Luben Tuikov wrote:
> --- Tejun Heo <[EMAIL PROTECTED]> wrote:
>
>> As implementing autosensing will probably need rewriting failed qc
>> for REQUEST SENSE command, I'm opposing it.  My proposal is to do the
>> following, which, in effect, should be equivalent to autosensing.
>>
>> 1. ATAPI CHECK SENSE occurs
>> 2. libata fails the command
>> 3. SCSI sees failure code but no sense data, SCSI EH invoked
>> 4. libata EH invoked
>> 5. REQUEST SENSE
>> 6. sense data acquired
>> 7. scsi_decide_disposition() called (this needs to be exported from SCSI)
>> 8. libata handles the failed qc according to the verdict.
>
> Hmm, yes.  It sounds good, except can you make it so that step 3 doesn't
> exist, ever.  This means that you would _reduce_ the double "bouncing"
> between eh's _and_ implement autosense.

libata EH is invoked from SCSI EH via hostt->eh_strategy_handler(), so
they're one - libata EH uses the SCSI EH framework to operate.  I'm
having a hard time understanding what you mean by 'double bouncing'.

> SCSI Core should never know what happened.  I.e. if the command has
> completed with CHECK SENSE, sense data _is_ present => "autosense".
>
>> This is very similar to what SCSI EH currently does for commands
>> without sense data.
>
> Yes, you're right -- it is very similar to what SCSI EH currently does.
> Unfortunately it isn't quite correct.

Can you please elaborate why getting sense data from EH is bad idea for
ATAPI?  For more advanced SCSI transports, I agree with you that
autosensing is necessary with queueing and multiple initiator and etc,
but I don't really see how requesting sense from EH would be bad for
ATAPI.

>> As ATAPI device's queue depth is always one (ignoring SERVICE cruft
>> everyone seems to hate), I don't think there will be any noticeable
>> performance penalty as James was describing in the other mail in this
>> thread.
>
> What you can do is keep a qc around to request sense immediately
> afterwards.  If _that_ qc fails, then you know you need the big hammer.

Yes, that is also a possibility, but I was opting for REQUEST SENSE from
EH for the following two reasons.

a. As we're gonna have facilities to issue EH cmds from EH, ATAPI can
   just join the crowd without implementing separate mechanism to issue
   REQUEST SENSE.

b. It's not a hot path and I think performance gain from implementing
   autosense would be negligible.

Thanks.

--
tejun
Re: [RFC] libata new EH document
--- Tejun Heo <[EMAIL PROTECTED]> wrote:
> As implementing autosensing will probably need rewriting failed qc
> for REQUEST SENSE command, I'm opposing it.  My proposal is to do the
> following, which, in effect, should be equivalent to autosensing.
>
> 1. ATAPI CHECK SENSE occurs
> 2. libata fails the command
> 3. SCSI sees failure code but no sense data, SCSI EH invoked
> 4. libata EH invoked
> 5. REQUEST SENSE
> 6. sense data acquired
> 7. scsi_decide_disposition() called (this needs to be exported from SCSI)
> 8. libata handles the failed qc according to the verdict.

Hmm, yes.  It sounds good, except can you make it so that step 3 doesn't
exist, ever.  This means that you would _reduce_ the double "bouncing"
between eh's _and_ implement autosense.

SCSI Core should never know what happened.  I.e. if the command has
completed with CHECK SENSE, sense data _is_ present => "autosense".

> This is very similar to what SCSI EH currently does for commands
> without sense data.

Yes, you're right -- it is very similar to what SCSI EH currently does.
Unfortunately it isn't quite correct.

> As ATAPI device's queue depth is always one (ignoring SERVICE cruft
> everyone seems to hate), I don't think there will be any noticeable
> performance penalty as James was describing in the other mail in this
> thread.

What you can do is keep a qc around to request sense immediately
afterwards.  If _that_ qc fails, then you know you need the big hammer.

	Luben
Re: [RFC] libata new EH document
Hi, Luben.

On Wed, Aug 31, 2005 at 08:30:27PM -0700, Luben Tuikov wrote:
> --- Tejun Heo <[EMAIL PROTECTED]> wrote:
> > IMHO, it's a good idea to maintain one qc to one ATA/ATAPI command
> > mapping as long as possible.  And, in the suggested framework, it's
>
> Yes, that makes sense.
>
> > guaranteed that no other command can come inbetween CHECK_SENSE and
> > REQUEST_SENSE.
>
> That's good.
>
> > Requesting sense from EH,
>
> Done in an ATA eh handler.
>
> > calling scsi_decide_disposition() on the
> > sense
>
> Done in SCSI Core.
>
> > and following the verdict should achieve the same effect as
> > emulating autosense.
>
> Yes, precisely.
>
> > Is there any compelling reason to break one qc to
> > one command mapping?
>
> ?
>

I wasn't clear enough.  I'll try again.  :-)

As implementing autosensing will probably need rewriting failed qc for
REQUEST SENSE command, I'm opposing it.  My proposal is to do the
following, which, in effect, should be equivalent to autosensing.

1. ATAPI CHECK SENSE occurs
2. libata fails the command
3. SCSI sees failure code but no sense data, SCSI EH invoked
4. libata EH invoked
5. REQUEST SENSE
6. sense data acquired
7. scsi_decide_disposition() called (this needs to be exported from SCSI)
8. libata handles the failed qc according to the verdict.

This is very similar to what SCSI EH currently does for commands without
sense data.

As ATAPI device's queue depth is always one (ignoring SERVICE cruft
everyone seems to hate), I don't think there will be any noticeable
performance penalty as James was describing in the other mail in this
thread.

Thanks.

--
tejun
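The eight steps proposed above can be modelled in a few lines of
userspace C.  This is only a sketch under stated assumptions: every name
in it (toy_cmd, toy_dev, toy_decide, toy_eh) is hypothetical,
toy_decide() is a simplified stand-in for scsi_decide_disposition(), and
the "retry on MEDIUM ERROR" rule is an illustrative policy borrowed from
the discussion in this thread, not the SCSI core's real decision table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sense key values as defined by SPC. */
#define SK_NO_SENSE     0x00
#define SK_MEDIUM_ERROR 0x03

enum toy_verdict { TOY_SUCCESS, TOY_NEEDS_RETRY, TOY_FAILED };

/* Hypothetical model of a command as the SCSI midlayer sees it. */
struct toy_cmd {
	int check_condition;	/* device reported an error */
	int have_sense;		/* sense buffer has been filled in */
	uint8_t sense_key;	/* valid only when have_sense is set */
};

/* Hypothetical model of an ATAPI device holding pending sense data. */
struct toy_dev {
	uint8_t pending_sense_key;
};

/* Simplified stand-in for scsi_decide_disposition() (assumption). */
static enum toy_verdict toy_decide(const struct toy_cmd *cmd)
{
	assert(cmd != NULL);
	if (!cmd->check_condition)
		return TOY_SUCCESS;
	if (!cmd->have_sense)
		return TOY_FAILED;	/* step 3: no sense data, EH is invoked */
	if (cmd->sense_key == SK_MEDIUM_ERROR)
		return TOY_NEEDS_RETRY;	/* illustrative policy only */
	return TOY_FAILED;
}

/* Steps 5-6: REQUEST SENSE reads and clears the device's pending sense. */
static void toy_request_sense(struct toy_dev *dev, struct toy_cmd *cmd)
{
	cmd->sense_key = dev->pending_sense_key;
	cmd->have_sense = 1;
	dev->pending_sense_key = SK_NO_SENSE;
}

/* Steps 4-8: EH fetches the sense data, then re-runs the disposition. */
static enum toy_verdict toy_eh(struct toy_dev *dev, struct toy_cmd *cmd)
{
	toy_request_sense(dev, cmd);
	return toy_decide(cmd);
}
```

In this model a command that failed without sense data first gets
TOY_FAILED from toy_decide(); calling toy_eh() afterwards yields the
verdict that autosense would have produced on the first pass.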
Re: [RFC] libata new EH document
--- Jeff Garzik <[EMAIL PROTECTED]> wrote:
> Tejun Heo wrote:
> > IMHO, it's a good idea to maintain one qc to one ATA/ATAPI command
> > mapping as long as possible.  And, in the suggested framework, it's
> > guaranteed that no other command can come inbetween CHECK_SENSE and
> > REQUEST_SENSE.
> >
> > Requesting sense from EH, calling scsi_decide_disposition() on the
> > sense and following the verdict should achieve the same effect as
> > emulating autosense.  Is there any compelling reason to break one qc to
> > one command mapping?
>
>
> Yes, you should have one qc <-> one ATA/ATAPI command.  That's why, in

Agree.

> the NCQ scenario, I wanted to make sure that one qc was always reserved
> for error handling:  REQUEST SENSE or READ LOG EXT, most importantly.

Yes.

> For SAT layer MODE SELECT translations, that implies multiple calls to
> qc_new/qc_issue/qc_complete before completing the overall SCSI command.
> The same for handling sata_sil mod15write:  I am beginning to feel
> like the mod15write workaround might be best implemented in a manner
> that caused libata-scsi (not sata_sil) to create/issue/complete multiple
> ATA commands.
>
> The only problem you run into is that a qc may be active during EH, when
> you need another qc.  So avoiding recursive details becomes an issue.

Hmm...

	Luben
Re: [RFC] libata new EH document
--- Tejun Heo <[EMAIL PROTECTED]> wrote:
> IMHO, it's a good idea to maintain one qc to one ATA/ATAPI command
> mapping as long as possible.  And, in the suggested framework, it's

Yes, that makes sense.

> guaranteed that no other command can come inbetween CHECK_SENSE and
> REQUEST_SENSE.

That's good.

> Requesting sense from EH,

Done in an ATA eh handler.

> calling scsi_decide_disposition() on the
> sense

Done in SCSI Core.

> and following the verdict should achieve the same effect as
> emulating autosense.

Yes, precisely.

> Is there any compelling reason to break one qc to
> one command mapping?

?

	Luben
Re: fc remote port timeout with qla2xxx driver
On Wed, Aug 31, 2005 at 03:44:09PM -0700, Andrew Vasquez wrote:
> Hmm, could you try the attached small patch?  This should close that
> hole where the fc_remote_port state is restored to a correct state.

This seems to fix the problem. The debug now shows:
...
Sep 1 10:05:15 baku kernel: scsi(0): LOOP READY
Sep 1 10:05:15 baku kernel: scsi(0): qla2x00_loop_resync - end
Sep 1 10:05:36 baku kernel: scsi(0): Port Update -- creating RSCN fcport f7c2a080 for 81/7/6000.
Sep 1 10:05:36 baku kernel: scsi(0): Handle RSCN -- process RSCN for fcport [ff].
Sep 1 10:05:36 baku kernel: scsi(0): Handle RSCN -- attempting login to [81/ff].
Sep 1 10:05:36 baku kernel: scsi(0): Sending Login IOCB (a0004000) to [81/ff].
Sep 1 10:05:36 baku kernel: scsi(0): Port login retry: 21d02367d125, id = 0x0081 retry cnt=10
Sep 1 10:05:36 baku kernel: scsi(0): Process IODesc -- processing a0004000.
Sep 1 10:05:36 baku kernel: scsi(0): Login IOCB -- loop id [81] used by port id [0b1132].
Sep 1 10:05:36 baku kernel: scsi(0): Login IOCB -- retrying login to [81/0b1132] (2).
Sep 1 10:05:36 baku kernel: scsi(0): Sending Login IOCB (a0005000) to [81/0b1132].
Sep 1 10:05:36 baku kernel: scsi(0): Process IODesc -- processing a0005000.
Sep 1 10:05:36 baku kernel: scsi(0): Login IOCB -- status=0 mb1=0 pn=21d02367d125.
Sep 1 10:05:36 baku kernel: scsi(0): fcport-0 - port retry count: 29 remaining
Sep 1 10:05:36 baku kernel: scsi(0): qla2x00_port_login()
Sep 1 10:05:36 baku kernel: scsi(0): Trying Fabric Login w/loop id 0x0081 for port 0b1132.
Sep 1 10:05:36 baku kernel: scsi(0): Login IOCB -- found RSCN fcport in fcports list [f7db8100].
Sep 1 10:05:36 baku kernel: scsi(0): Login IOCB -- marking existing fcport [81/0b1132] online.
Sep 1 10:05:36 baku kernel: scsi(0): Login IOCB -- Freeing RSCN fcport f7c2a080 [81/0b1132].
Sep 1 10:05:36 baku kernel: scsi(0): port login OK: logged in ID 0x81
Sep 1 10:05:36 baku kernel: scsi(0): qla2x00_port_login - end

One thing that I forgot to mention is that I'm prodding the scsi layer
to rescan for devices by doing:

echo "1" > '/sys/class/fc_remote_ports/rport-0:0-0/device/target0:0:0/0:0:0:1/rescan'

I did this above at 10:05:36, as shown in the log, which led to the
port_login. This explains the delay between loop_resync and relogin.

Apologies for the basic question, but is this what one is supposed to
do? (I believe the dm-multipath stuff does this when it tries to update
devices)

If so, it seems like there might be a reference counting issue hanging
around, as I am able to do a rescan _after_ the FC port is blocked (as
indicated in the debug output), whereas I'd expect the fc_remote_port
sysfs stuff to have disappeared.  Related to that, when the port is
disconnected, /sys/class/fc_remote_ports/rport-0:0-0/ still exists - I
presume this is part of the same issue.

In any case, thanks for the patch, as it seems to fix the real issue for
me.
Re: [RFC] libata new EH document
Hello, Jeff.

On Wed, Aug 31, 2005 at 10:22:17PM -0400, Jeff Garzik wrote:
> Tejun Heo wrote:
> > IMHO, it's a good idea to maintain one qc to one ATA/ATAPI command
> >mapping as long as possible.  And, in the suggested framework, it's
> >guaranteed that no other command can come inbetween CHECK_SENSE and
> >REQUEST_SENSE.
> >
> > Requesting sense from EH, calling scsi_decide_disposition() on the
> >sense and following the verdict should achieve the same effect as
> >emulating autosense.  Is there any compelling reason to break one qc to
> >one command mapping?
>
>
> Yes, you should have one qc <-> one ATA/ATAPI command.  That's why, in
> the NCQ scenario, I wanted to make sure that one qc was always reserved
> for error handling:  REQUEST SENSE or READ LOG EXT, most importantly.

Having an extra (as opposed to reserved) EH qc doesn't break one qc <->
one command mapping.

a. All EH commands are non-NCQ.
b. Inside EH, no other command is allowed.

So, we can allocate a qc which does not have a corresponding NCQ tag.
This qc will never be used for normal commands.  It's used only for
internal commands when no other qc can be active.

If we don't have an extra qc for EH, as non-NCQ devices have only one
qc, we should either,

a. Rewrite failed qc to issue recovery command
b. Complete failed qc and issue recovery command

Both are not too attractive, IMHO.  I currently don't understand very
well why you don't like extra qc approach.  Can you please elaborate?

>
> For SAT layer MODE SELECT translations, that implies multiple calls to
> qc_new/qc_issue/qc_complete before completing the overall SCSI command.
> The same for handling sata_sil mod15write:  I am beginning to feel
> like the mod15write workaround might be best implemented in a manner
> that caused libata-scsi (not sata_sil) to create/issue/complete multiple
> ATA commands.

That's what I've done for multi-qc SCSI cmd translation patch I've
posted the other day, and I think it would be really neat to do similar
thing for m15w.  However, to do so, we'll need some callbacks at libata
scsi/core layers (say, driver-overridable command translation
callbacks?) at the very least and I'm not sure about adding those just
for m15w.

> The only problem you run into is that a qc may be active during EH, when
> you need another qc.  So avoiding recursive details becomes an issue.

I guess this means the same thing I've described above about non-NCQ
devices, right?

Thanks.

--
tejun
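The "extra EH qc" idea argued for above can be sketched as a small
allocator in userspace C.  This is a toy model, not libata code: the
names (toy_port, toy_qc, toy_qc_new, toy_eh_qc_get, TOY_NO_TAG) are all
hypothetical, and the only properties it tries to capture are the two
rules stated in the message, namely that EH commands carry no NCQ tag
and that no normal command may be allocated while EH is running.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_TAGS 32
#define TOY_NO_TAG   (-1)

struct toy_qc {
	int in_use;
	int tag;
};

/* Hypothetical per-port state. */
struct toy_port {
	int nr_tags;			/* 1 for non-NCQ devices */
	int in_eh;			/* set while EH owns the port */
	struct toy_qc qcs[TOY_MAX_TAGS];/* one qc per tag, normal I/O */
	struct toy_qc eh_qc;		/* extra qc, never given a tag */
};

/* Normal path: hand out a tagged qc, but never while EH is running. */
static struct toy_qc *toy_qc_new(struct toy_port *ap)
{
	int i;

	assert(ap != NULL && ap->nr_tags <= TOY_MAX_TAGS);
	if (ap->in_eh)
		return NULL;
	for (i = 0; i < ap->nr_tags; i++) {
		if (!ap->qcs[i].in_use) {
			ap->qcs[i].in_use = 1;
			ap->qcs[i].tag = i;
			return &ap->qcs[i];
		}
	}
	return NULL;
}

/* EH path: the extra qc exists only inside EH, one user at a time. */
static struct toy_qc *toy_eh_qc_get(struct toy_port *ap)
{
	assert(ap != NULL);
	if (!ap->in_eh || ap->eh_qc.in_use)
		return NULL;
	ap->eh_qc.in_use = 1;
	ap->eh_qc.tag = TOY_NO_TAG;
	return &ap->eh_qc;
}

static void toy_qc_free(struct toy_qc *qc)
{
	assert(qc != NULL);
	qc->in_use = 0;
}
```

With nr_tags set to 1 (a non-NCQ device), the failed qc can stay
allocated and untouched while EH issues a recovery command through
eh_qc, which is the case that otherwise forces rewriting or completing
the failed qc.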
Re: [RFC] libata new EH document
BTW I still have three of your documents to review and comment on.
Haven't forgotten about them.

	Jeff
Re: [RFC] libata new EH document
Tejun Heo wrote:
> IMHO, it's a good idea to maintain one qc to one ATA/ATAPI command
> mapping as long as possible.  And, in the suggested framework, it's
> guaranteed that no other command can come inbetween CHECK_SENSE and
> REQUEST_SENSE.
>
> Requesting sense from EH, calling scsi_decide_disposition() on the
> sense and following the verdict should achieve the same effect as
> emulating autosense.  Is there any compelling reason to break one qc to
> one command mapping?

Yes, you should have one qc <-> one ATA/ATAPI command.  That's why, in
the NCQ scenario, I wanted to make sure that one qc was always reserved
for error handling:  REQUEST SENSE or READ LOG EXT, most importantly.

For SAT layer MODE SELECT translations, that implies multiple calls to
qc_new/qc_issue/qc_complete before completing the overall SCSI command.
The same for handling sata_sil mod15write:  I am beginning to feel
like the mod15write workaround might be best implemented in a manner
that caused libata-scsi (not sata_sil) to create/issue/complete multiple
ATA commands.

The only problem you run into is that a qc may be active during EH, when
you need another qc.  So avoiding recursive details becomes an issue.

	Jeff
Re: [RFC] libata new EH document
Luben Tuikov wrote:
> On 08/30/05 06:26, Tejun Heo wrote:
>> Albert Lee wrote:
>>>> 4. Corresponding scmd's result code is set to
>>>>    SAM_STAT_CHECK_CONDITION and qc->scsidone() callback is called
>>>>    directly.  As we haven't filled sense data,
>>>>    scsi_decide_disposition() will return FAILED and SCSI EH will be
>>>>    scheduled.  Note that as we directly call qc->scsidone(), qc is
>>>>    left intact.
>>>
>>> Could we get the sense data before calling qc->scsidone()?
>>> (Using the proposed separate EH qc can keep the original qc intact.)
>>>
>>> The issue:
>>> When a DVD drive returns MEDIUM_ERROR in the sense data, libata
>>> doesn't retry the command.
>>>
>>> For libata, when scsi_softirq() calls scsi_decide_disposition() and
>>> scsi_check_sense() to determine how to handle the result,
>>> scsi_check_sense() always returns "fail" since the sense data is not
>>> there yet.  The sense data is requested later in the libata error
>>> handler.  But the command has already been considered as an "error".
>>>
>>> By having the sense data ready before calling qc->scsidone(), we can
>>> make the NEEDS_RETRY work in scsi_softirq().  So, for things like
>>> MEDIUM_ERROR, the device has a chance to retry/recover the error.
>>> This seems to be important for devices with built-in defect
>>> management system.
>>
>> There are two ways a scmd can leave EH - retry by scsi_queue_insert()
>> and finish by scsi_finish_cmd().  I think the problem you described
>> can be easily solved by choosing the former method when finishing the
>> qc from EH.  Note that other advanced EH stuff like reconfiguring
>> transport speed also requires retrying, so we will surely have a
>> mechanism for retrying failed qc's from EH.
>
> What is needed is autosense simulation for ATA, so that SCSI Core
> doesn't know that the device doesn't support autosense.
>
> So, before a failed command reaches SCSI Core recovery, it should pass
> by ATA layer recovery to get sense.
>
> Note: if you send another command for execution after the failed
> command _and_ no autosense is provided, then any sense data is lost --
> this is further subject to more rules set forth in SAM and SPC.

IMHO, it's a good idea to maintain one qc to one ATA/ATAPI command
mapping as long as possible.  And, in the suggested framework, it's
guaranteed that no other command can come inbetween CHECK_SENSE and
REQUEST_SENSE.

Requesting sense from EH, calling scsi_decide_disposition() on the
sense and following the verdict should achieve the same effect as
emulating autosense.  Is there any compelling reason to break one qc to
one command mapping?

--
tejun
Re: adp94xx driver for 2.6.13?
On Wed, Aug 31, 2005 at 04:08:32PM -0400, James Bottomley wrote:
> On Tue, 2005-08-30 at 22:33 -0700, Ravikiran G Thirumalai wrote:
> > Are drivers/sources available for Adaptec's AIC-94XX SAS controllers
> > someplace? I am trying to run 2.6.13 on a x460.  Any pointers to the latest
> > driver sources appreciated.
>
> There is no driver source for Adaptec SATA hardware.  You need to get
> the latest binary drivers from Adaptec support.

Someone from Adaptec had posted the sources to linux-scsi some time
back, but unfortunately, it is a humongous 27-patch patchset, with some
malformed patches.  Also, one of the patches is missing (#6, or the
patch numbering was bad).  So I was wondering if folks from Adaptec or
other people stuck with this adapter could point me to the sources, if
they are available...

Thanks,
Kiran
Re: [PATCH] add transport class symlink to device object
On Thu, Aug 18, 2005 at 02:50:19PM -0500, Dmitry Torokhov wrote:
> On 8/18/05, Greg KH <[EMAIL PROTECTED]> wrote:
> > @@ -500,9 +519,13 @@ int class_device_add(struct class_device
> >  	}
> >
> >  	class_device_add_attrs(class_dev);
> > -	if (class_dev->dev)
> > +	if (class_dev->dev) {
> > +		class_name = make_class_name(class_dev);
> >  		sysfs_create_link(&class_dev->kobj,
> >  				  &class_dev->dev->kobj, "device");
> > +		sysfs_create_link(&class_dev->dev->kobj, &class_dev->kobj,
> > +				  class_name);
> > +	}
> >
>
> I wonder if we need to grab a reference to class_dev->dev here:
>
>         dev = device_get(class_dev->dev);
>         if (dev) {
>
>         }
>
> Otherwise, if device gets unregistered/deleted before class device is
> deleted we'll get into trouble when removing the link since
> class_dev->dev will be garbage.
>
> .. But grabbing that reference will cause pains in SCSI system which,
> when I looked, removed class devices from device's release function.

No the sysfs_create_link() call increments the kobject reference on the
target of the symlink.  See sysfs_add_link() for details.  So this
should be just fine, right?

thanks,

greg k-h
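The lifetime argument in the reply above (a symlink pins its target, so
the target cannot turn into garbage while the link exists) can be
illustrated with a userspace toy.  Nothing here is kernel API: toy_kobj,
toy_link and the toy_* helpers are hypothetical names, and the model
simply assumes, as the message states, that creating a link takes a
reference on the target and removing it drops that reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kobject: a counter and a "freed" flag. */
struct toy_kobj {
	int refcount;
	int released;
};

static struct toy_kobj *toy_get(struct toy_kobj *k)
{
	assert(k != NULL && !k->released);
	k->refcount++;
	return k;
}

static void toy_put(struct toy_kobj *k)
{
	assert(k != NULL && k->refcount > 0);
	if (--k->refcount == 0)
		k->released = 1;	/* a real release() callback would run here */
}

/* Hypothetical symlink record: it pins its target for its own lifetime. */
struct toy_link {
	struct toy_kobj *target;
};

static void toy_link_create(struct toy_link *l, struct toy_kobj *target)
{
	assert(l != NULL);
	l->target = toy_get(target);
}

static void toy_link_remove(struct toy_link *l)
{
	assert(l != NULL && l->target != NULL);
	toy_put(l->target);
	l->target = NULL;
}
```

If the owner drops its own reference first, the target stays alive until
toy_link_remove() runs, which is why no extra explicit reference is
needed in this model.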
Re: adp94xx driver for 2.6.13?
On Wed, 2005-08-31 at 16:08 -0400, James Bottomley wrote:
> On Tue, 2005-08-30 at 22:33 -0700, Ravikiran G Thirumalai wrote:
> > Are drivers/sources available for Adaptec's AIC-94XX SAS controllers
> > someplace? I am trying to run 2.6.13 on a x460.  Any pointers to the latest
> > driver sources appreciated.
>
> There is no driver source for Adaptec SATA hardware.  You need to get
> the latest binary drivers from Adaptec support.
>
> James
>
>

James & Ravikiran,

Adaptec does have source code available under the GPL for the AIC 94xx.
In fact it was submitted to this list and rejected.

It appears to work reasonably well under our stress testing environments
here at mvista.  It does have some problems on x86_64 with 8gb of ram
but works well with 6gb on x86_64.  It also has a significant amount of
nits which make it unacceptable for kernel.org merging.

Regards
-steve
Re: adp94xx driver for 2.6.13?
On Tue, 2005-08-30 at 22:33 -0700, Ravikiran G Thirumalai wrote:
> Are drivers/sources available for Adaptec's AIC-94XX SAS controllers
> someplace? I am trying to run 2.6.13 on a x460.  Any pointers to the latest
> driver sources appreciated.

There is no driver source for Adaptec SATA hardware.  You need to get
the latest binary drivers from Adaptec support.

James
Re: fc remote port timeout with qla2xxx driver
On Wed, 31 Aug 2005, Rudolph Pereira wrote:
> Aug 31 15:55:16 baku kernel: scsi(0): Sending Login IOCB (a0005000) to
> [81/0b1132].
> Aug 31 15:55:16 baku kernel: scsi(0): Process IODesc -- processing a0005000.
> Aug 31 15:55:16 baku kernel: scsi(0): Login IOCB -- status=0 mb1=0
> pn=21d02367d125.
> Aug 31 15:55:16 baku kernel: scsi(0): Login IOCB -- found RSCN fcport in
> fcports list [f7c84600].
> Aug 31 15:55:16 baku kernel: scsi(0): Login IOCB -- marking existing fcport
> [81/0b1132] online.
> Aug 31 15:55:16 baku kernel: scsi(0): Login IOCB -- Freeing RSCN fcport
> f5c12d80 [81/0b1132].
> Aug 31 15:55:16 baku kernel: scsi(0): port login OK: logged in ID 0x81
> Aug 31 15:55:16 baku kernel: scsi(0): qla2x00_port_login - end
> Aug 31 15:55:50 baku kernel:  rport-0:0-0: blocked FC remote port time out:
> removing target
>
> at this point, the path is no longer available, whereas it should be
> (everything's physically connected). The most worrying indication is the
> final "blocked FC remote port time out" which seems like the port is not
> being unblocked when it should.
>
> Has anyone seen this issue, and is it known, and if so, are there any
> fixes for it?

Hmm, could you try the attached small patch?  This should close that
hole where the fc_remote_port state is restored to a correct state.

---

diff --git a/drivers/scsi/qla2xxx/qla_rscn.c b/drivers/scsi/qla2xxx/qla_rscn.c
--- a/drivers/scsi/qla2xxx/qla_rscn.c
+++ b/drivers/scsi/qla2xxx/qla_rscn.c
@@ -330,6 +330,8 @@ qla2x00_update_login_fcport(scsi_qla_hos
 	fcport->flags &= ~FCF_FAILOVER_NEEDED;
 	fcport->iodesc_idx_sent = IODESC_INVALID_INDEX;
 	atomic_set(&fcport->state, FCS_ONLINE);
+	if (fcport->rport)
+		fc_remote_port_unblock(fcport->rport);
 }
[PATCH 3/3] lpfc: use wwn_to_u64() transport helper
Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
---

diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -965,21 +965,21 @@ static void
 lpfc_get_host_fabric_name (struct Scsi_Host *shost)
 {
 	struct lpfc_hba *phba = (struct lpfc_hba*)shost->hostdata[0];
-	u64 nodename;
+	u64 node_name;
 
 	spin_lock_irq(shost->host_lock);
 
 	if ((phba->fc_flag & FC_FABRIC) ||
 	    ((phba->fc_topology == TOPOLOGY_LOOP) &&
 	     (phba->fc_flag & FC_PUBLIC_LOOP)))
-		memcpy(&nodename, &phba->fc_fabparam.nodeName, sizeof(u64));
+		node_name = wwn_to_u64(phba->fc_fabparam.nodeName.wwn);
 	else
 		/* fabric is local port if there is no F/FL_Port */
-		memcpy(&nodename, &phba->fc_nodename, sizeof(u64));
+		node_name = wwn_to_u64(phba->fc_nodename.wwn);
 
 	spin_unlock_irq(shost->host_lock);
 
-	fc_host_fabric_name(shost) = be64_to_cpu(nodename);
+	fc_host_fabric_name(shost) = node_name;
 }
 
 
@@ -1101,21 +1101,20 @@ lpfc_get_starget_node_name(struct scsi_t
 {
 	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
 	struct lpfc_hba *phba = (struct lpfc_hba *) shost->hostdata[0];
-	uint64_t node_name = 0;
+	u64 node_name = 0;
 	struct lpfc_nodelist *ndlp = NULL;
 
 	spin_lock_irq(shost->host_lock);
 	/* Search the mapped list for this target ID */
 	list_for_each_entry(ndlp, &phba->fc_nlpmap_list, nlp_listp) {
 		if (starget->id == ndlp->nlp_sid) {
-			memcpy(&node_name, &ndlp->nlp_nodename,
-						sizeof(struct lpfc_name));
+			node_name = wwn_to_u64(ndlp->nlp_nodename.wwn);
 			break;
 		}
 	}
 	spin_unlock_irq(shost->host_lock);
 
-	fc_starget_node_name(starget) = be64_to_cpu(node_name);
+	fc_starget_node_name(starget) = node_name;
 }
 
 static void
@@ -1123,21 +1122,20 @@ lpfc_get_starget_port_name(struct scsi_t
 {
 	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
 	struct lpfc_hba *phba = (struct lpfc_hba *) shost->hostdata[0];
-	uint64_t port_name = 0;
+	u64 port_name = 0;
 	struct lpfc_nodelist *ndlp = NULL;
 
 	spin_lock_irq(shost->host_lock);
 	/* Search the mapped list for this target ID */
 	list_for_each_entry(ndlp, &phba->fc_nlpmap_list, nlp_listp) {
 		if (starget->id == ndlp->nlp_sid) {
-			memcpy(&port_name, &ndlp->nlp_portname,
-						sizeof(struct lpfc_name));
+			port_name = wwn_to_u64(ndlp->nlp_portname.wwn);
 			break;
 		}
 	}
 	spin_unlock_irq(shost->host_lock);
 
-	fc_starget_port_name(starget) = be64_to_cpu(port_name);
+	fc_starget_port_name(starget) = port_name;
 }
 
 static void
diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c
--- a/drivers/scsi/lpfc/lpfc_hbadisc.c
+++ b/drivers/scsi/lpfc/lpfc_hbadisc.c
@@ -1016,13 +1016,10 @@ lpfc_register_remote_port(struct lpfc_hb
 	struct fc_rport *rport;
 	struct lpfc_rport_data *rdata;
 	struct fc_rport_identifiers rport_ids;
-	uint64_t wwn;
 
 	/* Remote port has reappeared. Re-register w/ FC transport */
-	memcpy(&wwn, &ndlp->nlp_nodename, sizeof(uint64_t));
-	rport_ids.node_name = be64_to_cpu(wwn);
-	memcpy(&wwn, &ndlp->nlp_portname, sizeof(uint64_t));
-	rport_ids.port_name = be64_to_cpu(wwn);
+	rport_ids.node_name = wwn_to_u64(ndlp->nlp_nodename.wwn);
+	rport_ids.port_name = wwn_to_u64(ndlp->nlp_portname.wwn);
 	rport_ids.port_id = ndlp->nlp_DID;
 	rport_ids.roles = FC_RPORT_ROLE_UNKNOWN;
 	if (ndlp->nlp_type & NLP_FCP_TARGET)
diff --git a/drivers/scsi/lpfc/lpfc_hw.h b/drivers/scsi/lpfc/lpfc_hw.h
--- a/drivers/scsi/lpfc/lpfc_hw.h
+++ b/drivers/scsi/lpfc/lpfc_hw.h
@@ -262,12 +262,14 @@ struct lpfc_sli_ct_request {
 #define FF_FRAME_SIZE     2048
 
 struct lpfc_name {
+	union {
+		struct {
 #ifdef __BIG_ENDIAN_BITFIELD
-	uint8_t nameType:4;	/* FC Word 0, bit 28:31 */
-	uint8_t IEEEextMsn:4;	/* FC Word 0, bit 24:27, bit 8:11 of IEEE ext */
+			uint8_t nameType:4;	/* FC Word 0, bit 28:31 */
+			uint8_t IEEEextMsn:4;	/* FC Word 0, bit 24:27, bit 8:11 of IEEE ext */
 #else	/*  __LITTLE_ENDIAN_BITFIELD */
-	uint8_t IEEEextMsn:4;	/* FC Word 0, bit 24:27, bit 8:11 of IEEE ext */
-	uint8_t nameType:4;	/* FC Word 0, bit 28:31 */
+			uint8_t IEEEextMsn:4;	/* FC Word 0, bit 24:27, bit 8:11 of IEEE ext */
+			uint8_t nameType:4;	/* FC Word 0, bit 28:31 */
 #endif
 #defi
[PATCH 2/3] qla2xxx: use wwn_to_u64() transport helper
Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
---

diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
--- a/drivers/scsi/qla2xxx/qla_attr.c
+++ b/drivers/scsi/qla2xxx/qla_attr.c
@@ -360,16 +360,16 @@ qla2x00_get_starget_node_name(struct scs
 	struct Scsi_Host *host = dev_to_shost(starget->dev.parent);
 	scsi_qla_host_t *ha = to_qla_host(host);
 	fc_port_t *fcport;
-	uint64_t node_name = 0;
+	u64 node_name = 0;
 
 	list_for_each_entry(fcport, &ha->fcports, list) {
 		if (starget->id == fcport->os_target_id) {
-			node_name = *(uint64_t *)fcport->node_name;
+			node_name = wwn_to_u64(fcport->node_name);
 			break;
 		}
 	}
 
-	fc_starget_node_name(starget) = be64_to_cpu(node_name);
+	fc_starget_node_name(starget) = node_name;
 }
 
 static void
@@ -378,16 +378,16 @@ qla2x00_get_starget_port_name(struct scs
 	struct Scsi_Host *host = dev_to_shost(starget->dev.parent);
 	scsi_qla_host_t *ha = to_qla_host(host);
 	fc_port_t *fcport;
-	uint64_t port_name = 0;
+	u64 port_name = 0;
 
 	list_for_each_entry(fcport, &ha->fcports, list) {
 		if (starget->id == fcport->os_target_id) {
-			port_name = *(uint64_t *)fcport->port_name;
+			port_name = wwn_to_u64(fcport->port_name);
 			break;
 		}
 	}
 
-	fc_starget_port_name(starget) = be64_to_cpu(port_name);
+	fc_starget_port_name(starget) = port_name;
 }
 
 static void
@@ -460,9 +460,7 @@ struct fc_function_template qla2xxx_tran
 void
 qla2x00_init_host_attr(scsi_qla_host_t *ha)
 {
-	fc_host_node_name(ha->host) =
-	    be64_to_cpu(*(uint64_t *)ha->init_cb->node_name);
-	fc_host_port_name(ha->host) =
-	    be64_to_cpu(*(uint64_t *)ha->init_cb->port_name);
+	fc_host_node_name(ha->host) = wwn_to_u64(ha->init_cb->node_name);
+	fc_host_port_name(ha->host) = wwn_to_u64(ha->init_cb->port_name);
 	fc_host_supported_classes(ha->host) = FC_COS_CLASS3;
 }
diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -2070,8 +2070,8 @@ qla2x00_reg_remote_port(scsi_qla_host_t
 		return;
 	}
 
-	rport_ids.node_name = be64_to_cpu(*(uint64_t *)fcport->node_name);
-	rport_ids.port_name = be64_to_cpu(*(uint64_t *)fcport->port_name);
+	rport_ids.node_name = wwn_to_u64(fcport->node_name);
+	rport_ids.port_name = wwn_to_u64(fcport->port_name);
 	rport_ids.port_id = fcport->d_id.b.domain << 16 |
 	    fcport->d_id.b.area << 8 | fcport->d_id.b.al_pa;
 	rport_ids.roles = FC_RPORT_ROLE_UNKNOWN;
[PATCH 1/3] Generalize WWN to u64 integer conversions.
> > --- a/drivers/scsi/qla2xxx/qla_attr.c
> > +++ b/drivers/scsi/qla2xxx/qla_attr.c
> > @@ -345,6 +345,15 @@ struct class_device_attribute *qla2x00_h
> >
> >  /* Host attributes. */
> >
> > +static u64
> > +wwn_to_u64(uint8_t *wwn)
> > +{
> > +	return (u64)wwn[0] << 56 | (u64)wwn[1] << 48 |
> > +	    (u64)wwn[2] << 40 | (u64)wwn[3] << 32 |
> > +	    (u64)wwn[4] << 24 | (u64)wwn[5] << 16 |
> > +	    (u64)wwn[6] << 8 | (u64)wwn[7];
> > +}
>
> Shouldn't this go into the transport class? Could probably be an inline
> as well.

Ok, how about this generic patchset:

1) add helper function to transport class
2) add support to qla2xxx
3) add support to lpfc -- I'll let James S. decide on whether
   the union {} changes to lpfc_name are acceptable

---

Generalize WWN to u64 integer conversions.

On some platforms the hard-casting of 8 byte node_name and port_name
arrays to a u64 would cause unaligned-access warnings. Generalize
the conversions with a transport helper function which performs
consistent shifting of WWN bytes.

Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
---

diff --git a/include/scsi/scsi_transport_fc.h b/include/scsi/scsi_transport_fc.h
--- a/include/scsi/scsi_transport_fc.h
+++ b/include/scsi/scsi_transport_fc.h
@@ -439,4 +439,12 @@ int fc_remote_port_block(struct fc_rport
 void fc_remote_port_unblock(struct fc_rport *rport);
 int scsi_is_fc_rport(const struct device *);

+static inline u64 wwn_to_u64(u8 *wwn)
+{
+	return (u64)wwn[0] << 56 | (u64)wwn[1] << 48 |
+	    (u64)wwn[2] << 40 | (u64)wwn[3] << 32 |
+	    (u64)wwn[4] << 24 | (u64)wwn[5] << 16 |
+	    (u64)wwn[6] << 8 | (u64)wwn[7];
+}
+
 #endif /* SCSI_TRANSPORT_FC_H */
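For anyone who wants to try the shifting approach outside the kernel tree, here is a rough userspace sketch. It is an illustration only: the sketch_ names are made up for this note, uint64_t/uint8_t stand in for the kernel's u64/u8, and the inverse helper is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace sketch of the shift-based conversion (hypothetical name).
 * wwn[0] is the most significant byte, so the result is the same on
 * little- and big-endian hosts. Only single bytes are loaded, so the
 * alignment of the array does not matter.
 */
static uint64_t sketch_wwn_to_u64(const uint8_t *wwn)
{
	return (uint64_t)wwn[0] << 56 | (uint64_t)wwn[1] << 48 |
	    (uint64_t)wwn[2] << 40 | (uint64_t)wwn[3] << 32 |
	    (uint64_t)wwn[4] << 24 | (uint64_t)wwn[5] << 16 |
	    (uint64_t)wwn[6] << 8 | (uint64_t)wwn[7];
}

/* Illustrative inverse, not part of the patch: store a value as 8 WWN bytes. */
static void sketch_u64_to_wwn(uint64_t value, uint8_t *wwn)
{
	int i;

	for (i = 7; i >= 0; i--) {
		wwn[i] = (uint8_t)(value & 0xff);
		value >>= 8;
	}
}

/* Usage: convert a WWN stored at an odd offset, then round-trip it. */
static void sketch_wwn_selfcheck(void)
{
	/* buf + 1 is deliberately misaligned for a 64-bit load. */
	uint8_t buf[9] = { 0xff, 0x21, 0x00, 0x00, 0xe0, 0x8b, 0x01, 0x02, 0x03 };
	uint8_t out[8];
	uint64_t v = sketch_wwn_to_u64(buf + 1);

	assert(v == 0x210000e08b010203ULL);
	sketch_u64_to_wwn(v, out);
	assert(sketch_wwn_to_u64(out) == v);
}
```

The point of the sketch is that no 64-bit load is ever issued against the byte array, which is what the cast-based code did and what triggers the unaligned-access warnings on stricter platforms.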
Re: Possible bug in qla2xxx rev. 8.00.02
On Mon, 29 Aug 2005, Frank Borich wrote:

> I have a controller with a firmware bug that mishandles
> underrun situations for inquiry commands. It correctly
> sets the FCP_RESID_UNDER flag however it miscalculates
> the FCP_RESID count. The 7.05.00 Qlogic driver for 2.4.x
> kernels thinks frame(s) were lost and retries the command
> 20 times with no success:
>
>
>         if (!(scsi_status & SS_RESIDUAL_UNDER)) {
>                 ha->dropped_frame_error_cnt++;
>                 CMD_RESULT(cp) = DID_BUS_BUSY << 16;
>                 DEBUG2(printk(KERN_INFO
>                     "scsi(%ld): Dropped "
>                     "frame(s) detected (%x of %x "
>                     "bytes)...retrying command.\n",
>                     ha->host_no, resid,
>                     CMD_XFRLEN(cp));)
>                 break;
>         }
>

Interesting, could you submit a problem report here:

	http://connection.qlogic.com/support/report/index.asp?id=csg

This will ensure it gets routed through the proper channels. 7.05.00
uses an older 23xx firmware. It might be best if we had an FC trace as
well; I'm sure the FW folks will be interested to see what's coming
back on the wire.

> Just wondering why the 8.00.02 driver for 2.6.x kernels did not
> detect this transport error ?
>

Did you enable the proper debug settings for this driver? This
driver has the dropped-frame check as well...

IAC: route this through tech-support.

> Latest Emulex and LSI drivers did
> not detect this either ???
>

--
av
[PATCH] SCSI tape signed/unsigned fix
This patch fixes in st.c the bug in the signed/unsigned int comparison
reported by Doug Gilbert and fixed by him in sg.c (see [PATCH] sg direct
io/mmap oops). Doug fixed the comparison in sg.c. This fix for st.c does
not touch the comparison but makes both arguments signed to remove the
problem. The new code is adapted from linux/fs/bio.c.

Doug's fix is correct and simple, no question about it. I just have
learned to hate these signed/unsigned comparisons where they can be
avoided (having originally created this bug is one more reason ;-)

Signed-off-by: Kai Makisara <[EMAIL PROTECTED]>

--- linux-2.6.13/drivers/scsi/st.c	2005-08-29 21:04:57.0 +0300
+++ linux-2.6.13-k1/drivers/scsi/st.c	2005-08-30 21:35:14.0 +0300
@@ -17,7 +17,7 @@
    Last modified: 18-JAN-1998 Richard Gooch <[EMAIL PROTECTED]> Devfs support
  */

-static char *verstr = "20050501";
+static char *verstr = "20050830";

 #include

@@ -4348,12 +4348,12 @@ static int st_map_user_pages(struct scat
 static int sgl_map_user_pages(struct scatterlist *sgl, const unsigned int max_pages,
 			      unsigned long uaddr, size_t count, int rw)
 {
+	unsigned long end = (uaddr + count + PAGE_SIZE - 1) >> PAGE_SHIFT;
+	unsigned long start = uaddr >> PAGE_SHIFT;
+	const int nr_pages = end - start;
 	int res, i, j;
-	unsigned int nr_pages;
 	struct page **pages;

-	nr_pages = ((uaddr & ~PAGE_MASK) + count + ~PAGE_MASK) >> PAGE_SHIFT;
-
 	/* User attempted Overflow! */
 	if ((uaddr + count) < uaddr)
 		return -EINVAL;
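To make the signed/unsigned pitfall concrete, here is a small userspace sketch. It rests on assumptions of mine rather than on the post: that the comparison in question is between a possibly negative int result (for example an error code from a page-pinning call) and the page count, and that pages are 4 KiB. All sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page geometry for the sketch (4 KiB pages); the kernel's
 * PAGE_SHIFT and PAGE_SIZE are architecture dependent. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Number of pages touched by [uaddr, uaddr + count), as a signed int,
 * computed the same way as the new code in the patch.
 * Returns -1 if uaddr + count wraps around. */
static int sketch_nr_pages(unsigned long uaddr, size_t count)
{
	unsigned long end, start;

	if (uaddr + count < uaddr)
		return -1;
	end = (uaddr + count + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
	start = uaddr >> SKETCH_PAGE_SHIFT;
	return (int)(end - start);
}

/* With an unsigned page count, C converts the signed result to unsigned
 * before comparing. The cast spells out that implicit conversion. */
static int sketch_short_unsigned(int res, unsigned int nr_pages)
{
	return (unsigned int)res < nr_pages;
}

/* With both operands signed, a negative result compares as expected. */
static int sketch_short_signed(int res, int nr_pages)
{
	return res < nr_pages;
}

/* Usage: a negative error code slips past the unsigned comparison. */
static void sketch_pages_selfcheck(void)
{
	assert(sketch_nr_pages(0x1000, 4096) == 1);	/* exactly one page */
	assert(sketch_nr_pages(0x1ff0, 0x20) == 2);	/* straddles a boundary */
	assert(sketch_short_unsigned(-14, 4) == 0);	/* error goes unnoticed */
	assert(sketch_short_signed(-14, 4) == 1);	/* error is caught */
}
```

Either fix closes the hole: casting at the comparison (the sg.c route) or making the page count signed (the st.c route here).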
Re: [PATCH] Fix crash in aic79xx probing in scsi-misc when no hardware is present
On Wed, 2005-08-31 at 10:45 +0200, Andi Kleen wrote:
> aic79xx in scsi-misc would oops when no hardware was present.
> Reason was a duplicated call to free the spi transport object -
> it was done both in ahd_linux_exit and in the cleanup part
> of ahd_linux_init.
>
> Just remove the superfluous call.

Actually, the fix is slightly wrong. The correct thing to do is to
remove the call to ahd_linux_exit(), which is an __exit function (in
the failure case it really does nothing except release the transport).

The one I plan to push is here:

http://www.kernel.org/git/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commit;h=a80b3424d9fde3c4b6d62adaf6dda78128dc5c27

James
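A rough userspace sketch of the shape described above, where the init failure path releases the transport once and does not call the exit function. Everything here is a hypothetical stand-in (none of the sketch_ names come from aic79xx), and the release counter exists only so a second release would be visible.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the transport template object. */
struct sketch_transport { int attached; };

static struct sketch_transport *sketch_tmpl;
static int sketch_release_calls;	/* counts releases for the self-check */

static void sketch_release_transport(void)
{
	free(sketch_tmpl);
	sketch_tmpl = NULL;
	sketch_release_calls++;
}

/* Module-exit analogue: only meant to run after a successful init. */
static void sketch_exit(void)
{
	sketch_release_transport();
}

/* Module-init analogue; hardware_found stands in for the probe result. */
static int sketch_init(int hardware_found)
{
	sketch_tmpl = malloc(sizeof(*sketch_tmpl));
	if (sketch_tmpl == NULL)
		return -ENOMEM;
	if (hardware_found > 0)
		return 0;
	/* Failure path: release once, directly, and do not call
	 * sketch_exit() as well. */
	sketch_release_transport();
	return -ENODEV;
}

/* Usage: one release on probe failure, one on a normal unload. */
static void sketch_init_selfcheck(void)
{
	assert(sketch_init(0) == -ENODEV);
	assert(sketch_release_calls == 1);
	assert(sketch_tmpl == NULL);

	assert(sketch_init(1) == 0);
	sketch_exit();
	assert(sketch_release_calls == 2);
}
```

In this sketch a second release would only bump the counter, since free(NULL) is harmless; in the driver the equivalent double release was the reported oops.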
usb flash disks w/o partition tables + 2.4 kernel can hang
I'm not on this list but I thought someone here might be interested.
(please cc me if you respond)

I was tracking down a problem where some USB flash disks could not be
used with a 2.4.21 kernel. Apparently lots of "READ CAPACITY failed"
messages are generated.

It turns out the problem was caused by USB flash disks which have no
partition table. They have a valid FAT-16 file system starting at block
zero (0).

If the disk is inserted and the "partition 1" block device
(/dev/discs/disc0/part1) is read, the disk will 'hang' after a read and
not respond. Subsequent accesses get "READ CAPACITY failed" messages and
the disk will get into a state where TEST_UNIT_READY will always return
"not ready".

I think I have a trace where I can dig out the actual SCSI read command
and see what block number it's asking for. I suspect it's some gigantic
number. I'm guessing the invalid partition map caused the scsi code to
ask the disk to read off into space and the disk went south. (technical
description :-)

Note that if the 'raw' partition (/dev/discs/disc0/disc) is used, all is
well.

I realize it's hard/impossible to validate the partition map. In this
case it has the correct '55 aa' signature. But none of the partition
'types' are valid and most are zero.

I'm a little surprised that reading from */part1 worked at all, but
perhaps it's just the perils of using an ms-dos partition map...

I'm also surprised a bad partition map could generate a read which would
'crash' the disk, so to speak. I may dig into it a little further. I
believe the traces I have showed the original READ CAPACITY worked, so
the scsi code should know how big the drive actually is.

-brad
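As an illustration of the kind of sanity check being wondered about, here is a userspace sketch that looks at one sector using the classic MS-DOS partition table layout (four 16-byte entries at offset 446, type byte at entry offset 4, little-endian start and length at offsets 8 and 12, signature 55 aa at offset 510). It is not what the 2.4 partition code does; the sketch_ names and the plausibility rule are assumptions made up for this note.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SECTOR_SIZE  512
#define SKETCH_TABLE_OFFSET 446	/* first of four 16-byte entries */
#define SKETCH_ENTRY_SIZE   16

/* Read a 32-bit little-endian value one byte at a time. */
static uint32_t sketch_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* A FAT boot sector also ends in 55 aa, so this test alone cannot tell
 * a partition table from a filesystem that starts at block zero. */
static int sketch_mbr_has_signature(const uint8_t *sector)
{
	return sector[510] == 0x55 && sector[511] == 0xaa;
}

/* Hypothetical rule: an entry is plausible only if it has a non-zero
 * type and lies entirely within the capacity the device reported. */
static int sketch_mbr_entry_plausible(const uint8_t *sector, int index,
				      uint64_t capacity_sectors)
{
	const uint8_t *e;
	uint64_t start, len;

	if (index < 0 || index > 3)
		return 0;
	e = sector + SKETCH_TABLE_OFFSET + index * SKETCH_ENTRY_SIZE;
	start = sketch_le32(e + 8);
	len = sketch_le32(e + 12);
	if (e[4] == 0 || len == 0)
		return 0;
	return start < capacity_sectors && len <= capacity_sectors - start;
}

/* Usage: signature present, but only an in-range typed entry passes. */
static void sketch_mbr_selfcheck(void)
{
	uint8_t sector[SKETCH_SECTOR_SIZE] = { 0 };
	uint8_t *e = sector + SKETCH_TABLE_OFFSET;

	sector[510] = 0x55;
	sector[511] = 0xaa;
	assert(sketch_mbr_has_signature(sector));
	assert(!sketch_mbr_entry_plausible(sector, 0, 2000));	/* type 0 */

	/* type 0x06, start sector 63, length 1000 (0x03e8) sectors */
	e[4] = 0x06;
	e[8] = 63;
	e[12] = 0xe8;
	e[13] = 0x03;
	assert(sketch_mbr_entry_plausible(sector, 0, 2000));
	assert(!sketch_mbr_entry_plausible(sector, 0, 500));	/* past the end */
}
```

The capacity argument is the number READ CAPACITY already returned, which is why a bounds check of this sort is at least possible in principle.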
[PATCH] Fix crash in aic79xx probing in scsi-misc when no hardware is present
aic79xx in scsi-misc would oops when no hardware was present.
Reason was a duplicated call to free the spi transport object -
it was done both in ahd_linux_exit and in the cleanup part
of ahd_linux_init.

Just remove the superfluous call.

Signed-off-by: Andi Kleen <[EMAIL PROTECTED]>

Index: linux-2.6.13/drivers/scsi/aic7xxx/aic79xx_osm.c
===================================================================
--- linux-2.6.13.orig/drivers/scsi/aic7xxx/aic79xx_osm.c
+++ linux-2.6.13/drivers/scsi/aic7xxx/aic79xx_osm.c
@@ -2771,7 +2771,6 @@ ahd_linux_init(void)
 				sizeof(struct ahd_linux_device));
 	if (ahd_linux_detect(&aic79xx_driver_template) > 0)
 		return 0;
-	spi_release_transport(ahd_linux_transport_template);
 	ahd_linux_exit();
 	return -ENODEV;
 }