Re: [systemd-devel] [RFC v2 3/6] kthread: warn on kill signal if not OOM

2014-09-11 Thread Tom Gundersen
On Fri, Sep 12, 2014 at 12:26 AM, Luis R. Rodriguez
 wrote:
> On Thu, Sep 11, 2014 at 2:43 PM, Tom Gundersen  wrote:
>> How about simply introducing a new flag to finit_module() to indicate
>> that the caller does not care about asynchronicity. We could then pass
>> this from udev, but existing scripts calling modprobe/insmod will not
>> be affected.
>
> Do you mean that you *do want asynchronicity*?

Precisely, udev would opt-in, but existing scripts etc would not.

>> But isn't finit_module() taking a long time a serious problem given
>> that it means no other module can be loaded in parallel?
>
> Indeed but having a desire to make the init() complete fast is
> different than the desire to have the combination of both init and
> probe fast synchronously.

I guess no one is arguing that probe should somehow be required to be
fast, but rather:

> If userspace wants init to be fast and let
> probe be async then userspace has no option but to deal with the fact
> that async probe will be async, and it should then use other methods
> to match any dependencies if its doing that itself.

Correct. And this therefore likely needs to be opt-in behaviour per
finit_module() invocation to avoid breaking old assumptions.

> For example
> networking should not kick off after a network driver is loaded but
> rather one the device creeps up on udev. We should be good with
> networking dealing with this correctly today but not sure about other
> subsystems. depmod should be able to load the required modules in
> order and if bus drivers work right then probe of the remnant devices
> should happen asynchronously. The one case I can think of that is a
> bit different is modules-load.d things but those *do not rely on the
> timeout*, but are loaded prior to a service requirement. Note though
> that if those modules had probe and they then run async'd then systemd
> service would probably need to consider that the requirements may not
> be there until later. If this is not carefully considered that could
> introduce regression to users of modules-load.d when async probe is
> fully deployed. The same applies to systemd making assumptions of kmod
> loading a module and a dependency being complete as probe would have
> run it before.

Yeah, these all needs to be considered when deciding whether or not to
enable async in each specific case.

> I believe one concern here lies in on whether or not userspace
> is properly equipped to deal with the requirements on module loading
> doing async probing and that possibly failing. Perhaps systemd might
> think all userspace is ready for that but are we sure that's the case?

There almost certainly are custom things out there relying on the
synchronous behaviour, but if we make it opt-in we should not have a
problem.

Cheers,

Tom
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 16/17] arcmsr: support new adapter ARC12x4 series

2014-09-11 Thread Ching Huang
On Thu, 2014-09-11 at 16:21 +0200, Tomas Henzl wrote:
> On 09/11/2014 05:59 AM, Ching Huang wrote:
> > On Wed, 2014-09-10 at 11:58 +0200, Tomas Henzl wrote:
> >> On 09/09/2014 06:30 PM, Christoph Hellwig wrote:
> >>> Ching,
> >>>
> >>> do you have a chance to address Thomas second concern below?  As
> >>> far as I can tell (Thomas, please correct me) that's the last
> >>> outstanding concern, and I'd really like to merge the arcmsr updates
> >>> for the Linux 3.18 merge window.
> >> Correct, still awaiting a response.
> > Christoph, Tomas,
> >
> > Sorry for the late reply.
> >
> > I think I misunderstand Tomas' meaning.
> > The spin lock in arcmsr_hbaD_polling_ccbdone() is necessary to protect 
> > doneq_index and have to be modified as following.
> 
> OK, so you are going to repost 16/17 ? If so, please describe all changes 
> you'll do, in that new post.
> 
By previous review, I will post two patches.
One for 13/17 and another for 16/17.
These patches are relative to 
http://git.infradead.org/users/hch/scsi-queue.git/tree/arcmsr-for-3.18:/drivers/scsi/arcmsr
> >
> > static int arcmsr_hbaD_polling_ccbdone(struct AdapterControlBlock *acb,
> > struct CommandControlBlock *poll_ccb)
> > {
> > bool error;
> > uint32_t poll_ccb_done = 0, poll_count = 0, flag_ccb, ccb_cdb_phy;
> > int rtn, doneq_index, index_stripped, outbound_write_pointer, toggle;
> > unsigned long flags;
> > struct ARCMSR_CDB *arcmsr_cdb;
> > struct CommandControlBlock *pCCB;
> > struct MessageUnit_D *pmu = acb->pmuD;
> >
> > polling_hbaD_ccb_retry:
> > poll_count++;
> > while (1) {
> > spin_lock_irqsave(&acb->doneq_lock, flags);
> > outbound_write_pointer = pmu->done_qbuffer[0].addressLow + 1;
> > doneq_index = pmu->doneq_index;
> > if ((outbound_write_pointer & 0xFFF) == (doneq_index & 0xFFF)) {
> > spin_unlock_irqrestore(&acb->doneq_lock, flags);
> > if (poll_ccb_done) {
> > rtn = SUCCESS;
> > break;
> > } else {
> > msleep(25);
> > if (poll_count > 40) {
> > rtn = FAILED;
> > break;
> > }
> > goto polling_hbaD_ccb_retry;
> > }
> > }
> > toggle = doneq_index & 0x4000;
> > index_stripped = (doneq_index & 0xFFF) + 1;
> > index_stripped %= ARCMSR_MAX_ARC1214_DONEQUEUE;
> > pmu->doneq_index = index_stripped ? (index_stripped | toggle) :
> > ((index_stripped + 1) | (toggle ^ 0x4000));
> > spin_unlock_irqrestore(&acb->doneq_lock, flags);
> > doneq_index = pmu->doneq_index;
> > flag_ccb = pmu->done_qbuffer[doneq_index & 0xFFF].addressLow;
> > ccb_cdb_phy = (flag_ccb & 0xFFF0);
> > arcmsr_cdb = (struct ARCMSR_CDB *)(acb->vir2phy_offset +
> > ccb_cdb_phy);
> > pCCB = container_of(arcmsr_cdb, struct CommandControlBlock,
> > arcmsr_cdb);
> > poll_ccb_done |= (pCCB == poll_ccb) ? 1 : 0;
> > if ((pCCB->acb != acb) ||
> > (pCCB->startdone != ARCMSR_CCB_START)) {
> > if (pCCB->startdone == ARCMSR_CCB_ABORTED) {
> > pr_notice("arcmsr%d: scsi id = %d "
> > "lun = %d ccb = '0x%p' poll command "
> > "abort successfully\n"
> > , acb->host->host_no
> > , pCCB->pcmd->device->id
> > , (u32)pCCB->pcmd->device->lun
> > , pCCB);
> > pCCB->pcmd->result = DID_ABORT << 16;
> > arcmsr_ccb_complete(pCCB);
> > continue;
> > }
> > pr_notice("arcmsr%d: polling an illegal "
> > "ccb command done ccb = '0x%p' "
> > "ccboutstandingcount = %d\n"
> > , acb->host->host_no
> > , pCCB
> > , atomic_read(&acb->ccboutstandingcount));
> > continue;
> > }
> > error = (flag_ccb & ARCMSR_CCBREPLY_FLAG_ERROR_MODE1)
> > ? true : false;
> > arcmsr_report_ccb_state(acb, pCCB, error);
> > }
> > return rtn;
> > }
> >
> >
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> >>> the body of a message to majord...@vger.kernel.org
> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 14/14] sd: Honor block layer integrity handling flags

2014-09-11 Thread Martin K. Petersen
> "Sagi" == Sagi Grimberg  writes:

Sagi,

Sagi> That's still a dependence on prot_type (type 0...). Notice that
Sagi> you set SCSI_PROT_REF_INCREMENT in every op index (except
Sagi> SCSI_PROT_NORMAL) so my point is that it's strange to see this
Sagi> association.

I still don't understand your point. The mask table indicates which
flags are valid for a given protection operation. They are explicitly
DIX flags and have nothing to do with the target protection type.

Sagi> P.S.  Now drivers can stop understanding prot_type to set DIF
Sagi> operations...  so once drivers stop referencing it we can remove
Sagi> it from scsi_cmnd.

Yes. Second part of this installment removes this and a few other things
and converts qla2xxx, lpfc, mptNsas, etc. to the new interface.

-- 
Martin K. Petersen  Oracle Linux Engineering
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH V3 01/16] scsi: support well known logical units

2014-09-11 Thread Subhash Jadavani
> This is the defintion of the field in SPC (does the UFS spec duplicate all
of SPC?), 
Yes, UFS device specification (both 1.1 and 2.0) supports
[SAM], INCITS T10 draft standard: SCSI Architecture Model - 5 (SAM-5),
Revision 05, 19 May 2010  
[SPC], INCITS T10 draft standard: SCSI Primary Commands - 4 (SPC-4),
Revision 27, 11 October 2010 
[SBC], INCITS T10 draft standard: SCSI Block Commands - 3 (SBC-3), Revision
24, 05 August 2010  

In addition, this is what specification says explicitly " SCSI SBC and SPC
commands are the baseline for UFS. UFS will not modify the SBC and SPC
Compliant commands.  The goal is to maximize re-use and leverage of the
software codebase available on platforms (PC, netbook, MID) that are already
supporting SCSI."

> but still doesn't explain why you want it.

UFS device has supports 4 different well known logical units: "REPORT_LUNS"
(address: 01h), "UFS Device" (address: 50h), "RPMB" (address: 44h) and
"BOOT" (address: 30h).

UFS device's power management needs to be controlled by "POWER CONDITION"
field of SSU (START STOP UNIT) command. But this "power condition"  field
will take effect only when its sent to "UFS device" well known logical unit
(address: 50h) hence we require the scsi_device instance to represent this
logical unit in order for the UFS host driver to send the SSU command for
power management.

We also require the scsi_device instance for "RPMB" (Replay Protected Memory
Block) LU so user space process can control this LU.

> Also the SELECT REPORT field only appeared in SPC-3, so we should not set
it for devices that report older standards compliance.

Thanks for pointing this out, was not aware of this. We will add appropriate
check (if (sdev->scsi_level >= SCSI_SPC_3))  before setting the SELECT
REPORT field.

Regards,
Subhash

-Original Message-
From: linux-scsi-ow...@vger.kernel.org
[mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Christoph Hellwig
Sent: Wednesday, September 10, 2014 11:59 AM
To: Dolev Raviv
Cc: james.bottom...@hansenpartnership.com; h...@infradead.org;
linux-scsi@vger.kernel.org; linux-scsi-ow...@vger.kernel.org;
linux-arm-...@vger.kernel.org; santos...@gmail.com; Subhash Jadavani; Sujit
Reddy Thumma; Martin K. Petersen
Subject: Re: [PATCH V3 01/16] scsi: support well known logical units

On Wed, Sep 10, 2014 at 02:54:08PM +0300, Dolev Raviv wrote:
> From: Subhash Jadavani 
> 
> REPORT LUNS command has "SELECT REPORT" field which controls what type 
> of logical units to be reported by device server. According to UFS 
> device standard, if this field is set to 0, REPORT LUNS would report 
> only report standard logical units. If it's set to 1 then it would 
> report only well known logical unit and if it's set to 2 then device 
> would report both standard and well known logical units.

This is the defintion of the field in SPC (does the UFS spec duplicate all
of SPC?), but still doesn't explain why you want it.

Also the SELECT REPORT field only appeared in SPC-3, so we should not set it
for devices that report older standards compliance.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the
body of a message to majord...@vger.kernel.org More majordomo info at
http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [systemd-devel] [RFC v2 3/6] kthread: warn on kill signal if not OOM

2014-09-11 Thread Luis R. Rodriguez
On Thu, Sep 11, 2014 at 2:43 PM, Tom Gundersen  wrote:
> On Wed, Sep 10, 2014 at 11:10 PM, Luis R. Rodriguez
>  wrote:
 More than two years
 have gone by on growing design and assumptions on top of that original
 commit. I'm not sure if *systemd folks* yet believe its was a design
 regression?
>>>
>>> I don't think so. udev should not allow its workers to run for an
>>> unbounded length of time. Whether the upper bound should be 30, 60,
>>> 180 seconds or something else is up for debate (currently it is 60,
>>> but if that is too short for some drivers we could certainly revisit
>>> that).
>>
>> That's the thing -- the timeout was put in place under the assumption
>> probe was asyncronous and its not, the driver core issues both module
>> init *and* probe together, the loader has to wait. That alone makes
>> the timeout a design flaw, and then systemd carried on top of that
>> design over two years after that. Its not systemd's fault, its just
>> that we never spoke about this as a design thing broadly and we should
>> have, and I will mention that even when the first issues creeped up,
>> the issue was still tossed back a driver problems. It was only until
>> recently that we realized that both init and probe run together that
>> we've been thinking about this problem differently. Systemd was trying
>> to ensure init on driver don't take long but its not init that is
>> taking long, its probe, and probe gets then penalized as the driver
>> core batches both init and probe synchronously before finishing module
>> loading.
>
> Just to clarify: udev/systemd is not trying to get into the business
> of what the kernel does on finit_module(), we just need to make sure
> that none of our workers stay around forever, which is why we have a
> global timeout. If necessary we can bump this higher (as mentioned in
> another thread I just bumped it to 180 secs), but we cannot abolish it
> entirely.

180 seconds is certainly better than 30, but let me be clear here on
the extent to which the timeout at least for kmod built-in command can
be an issue. The driver core not only batches init and probe together
synchronously, it also runs probe for *all* devices that the device
driver can claim and all those series of probes run synchronously
within itself, that is bus_for_each_dev() runs synchronously on each
device. So, if a init takes 1 second and probe for each device takes
120 seconds and the system has 2 devices with the new timeout the
second device would not be successfully probed (and in fact I'm not
sure if this would kill the first).

>> Furthermore as clarified by Tejun random userland is known to
>> exist that will wait indefinitely for module loading under the simple
>> assumption things *are done synchronously*, and its precisely why we
>> can't just blindly enable async probe upstream through a new driver
>> boolean as it can be unfair to this old userland. What is being
>> evaluated is to enable aync probe for *all* drivers through a new
>> general system-wide option. We cannot regress old userspace and
>> assumptions but we can create a new shiny universe.
>
> How about simply introducing a new flag to finit_module() to indicate
> that the caller does not care about asynchronicity. We could then pass
> this from udev, but existing scripts calling modprobe/insmod will not
> be affected.

Do you mean that you *do want asynchronicity*?

>>> Moreover, it seems from this discussion that the aim is (still)
>>> that insmod should be near-instantaneous (i.e., not wait for probe),
>>
>> The only reason that is being discussed is that systemd has not
>> accepted the timeout as a system design despite me pointing out the
>> original design flaw recently and at this point even if was accepted
>> as a design flaw it seems its too late. The approach taken to help
>> make all drivers async is simply an afterthought to give systemd what
>> it *thought* was in place, and it by no measure should be considered
>> the proper fix to the regression introduced by systemd, it may perhaps
>> be the right step long term for systemd systems given it goes with
>> what it assumed was there, but the timeout was flawed. Its not clear
>> if systemd can help with old kernels, it seems the ship has sailed and
>> there seems no options but for folks to work around that -- unless of
>> course some reasonable solution is found which doesn't break current
>> systemd design?
>
> If I read the git logs correctly the hard timeout was introduced in
> April 2011, so reverting it now seems indeed not to help much with all
> the running systems out there.

yeah figured :(

>> As part of this series I addressed hunting for the  "misbehaving
>> drivers" in-kernel as I saw no progress on the systemd side of things
>> to non-fatally detect "misbehaving drivers" despite my original RFCs
>> and request for advice. I quote  "misbehaving drivers" as its a flawed
>> view to consider them misbehaving now in light of clarifications of
>> how the

Re: [systemd-devel] [RFC v2 3/6] kthread: warn on kill signal if not OOM

2014-09-11 Thread Tom Gundersen
On Wed, Sep 10, 2014 at 11:10 PM, Luis R. Rodriguez
 wrote:
>>> More than two years
>>> have gone by on growing design and assumptions on top of that original
>>> commit. I'm not sure if *systemd folks* yet believe its was a design
>>> regression?
>>
>> I don't think so. udev should not allow its workers to run for an
>> unbounded length of time. Whether the upper bound should be 30, 60,
>> 180 seconds or something else is up for debate (currently it is 60,
>> but if that is too short for some drivers we could certainly revisit
>> that).
>
> That's the thing -- the timeout was put in place under the assumption
> probe was asyncronous and its not, the driver core issues both module
> init *and* probe together, the loader has to wait. That alone makes
> the timeout a design flaw, and then systemd carried on top of that
> design over two years after that. Its not systemd's fault, its just
> that we never spoke about this as a design thing broadly and we should
> have, and I will mention that even when the first issues creeped up,
> the issue was still tossed back a driver problems. It was only until
> recently that we realized that both init and probe run together that
> we've been thinking about this problem differently. Systemd was trying
> to ensure init on driver don't take long but its not init that is
> taking long, its probe, and probe gets then penalized as the driver
> core batches both init and probe synchronously before finishing module
> loading.

Just to clarify: udev/systemd is not trying to get into the business
of what the kernel does on finit_module(), we just need to make sure
that none of our workers stay around forever, which is why we have a
global timeout. If necessary we can bump this higher (as mentioned in
another thread I just bumped it to 180 secs), but we cannot abolish it
entirely.

> Furthermore as clarified by Tejun random userland is known to
> exist that will wait indefinitely for module loading under the simple
> assumption things *are done synchronously*, and its precisely why we
> can't just blindly enable async probe upstream through a new driver
> boolean as it can be unfair to this old userland. What is being
> evaluated is to enable aync probe for *all* drivers through a new
> general system-wide option. We cannot regress old userspace and
> assumptions but we can create a new shiny universe.

How about simply introducing a new flag to finit_module() to indicate
that the caller does not care about asynchronicity. We could then pass
this from udev, but existing scripts calling modprobe/insmod will not
be affected.

>> Moreover, it seems from this discussion that the aim is (still)
>> that insmod should be near-instantaneous (i.e., not wait for probe),
>
> The only reason that is being discussed is that systemd has not
> accepted the timeout as a system design despite me pointing out the
> original design flaw recently and at this point even if was accepted
> as a design flaw it seems its too late. The approach taken to help
> make all drivers async is simply an afterthought to give systemd what
> it *thought* was in place, and it by no measure should be considered
> the proper fix to the regression introduced by systemd, it may perhaps
> be the right step long term for systemd systems given it goes with
> what it assumed was there, but the timeout was flawed. Its not clear
> if systemd can help with old kernels, it seems the ship has sailed and
> there seems no options but for folks to work around that -- unless of
> course some reasonable solution is found which doesn't break current
> systemd design?

If I read the git logs correctly the hard timeout was introduced in
April 2011, so reverting it now seems indeed not to help much with all
the running systems out there.

> As part of this series I addressed hunting for the  "misbehaving
> drivers" in-kernel as I saw no progress on the systemd side of things
> to non-fatally detect "misbehaving drivers" despite my original RFCs
> and request for advice. I quote  "misbehaving drivers" as its a flawed
> view to consider them misbehaving now in light of clarifications of
> how the driver core works in that it batches both init and probe
> together always and we can't be penalizing long probes due to the fact
> long probes are simply fine. My patch to pick up "misbehaving drivers"
> drivers on the kernel front by picking up systemd's signal was
> reactive but it was also simply braindead given the same exact reasons
> why systemd's original timeout was flawed. We want a general solution
> and we don't want to work around the root cause, in this case it was
> systemd's assumption on how drivers work.

Would your ongoing work to make probing asynchronous solve this
problem in the long-term? In the short-term I guess bumping the udev
timeout should be sufficient.

> Keep in mind that the above just addresses kmod built-in cmd on
> systemd, its where the timeout was introduced but as has been
> clarified here assuming the same ti

Re: [PATCH] uas: Add missing le16_to_cpu calls to asm1051 / asm1053 usb-id check

2014-09-11 Thread Greg Kroah-Hartman
On Thu, Sep 11, 2014 at 11:06:12AM +0200, Hans de Goede wrote:
> Cc: sta...@vger.kernel.org # 3.16
> Signed-off-by: Hans de Goede 
> ---
>  drivers/usb/storage/uas-detect.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

You forgot the reported-by line, I'll go add it...
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM

2014-09-11 Thread Luis R. Rodriguez
On Thu, Sep 11, 2014 at 1:53 PM, Dmitry Torokhov
 wrote:
> On Thu, Sep 11, 2014 at 01:42:20PM -0700, Luis R. Rodriguez wrote:
>> On Thu, Sep 11, 2014 at 1:23 PM, Dmitry Torokhov
>>  wrote:
>> >
>> >>  There are elements in common, but by and
>> >> large the biggest headaches at least in large device number boots have
>> >> already been tackled by the enterprise crowd (they don't like their
>> >> S390's or 1024 core NUMA systems taking half an hour to come up).
>> >
>> > Please do not position this as a mostly solved large systems problem,
>> > For us it is touchpad detection stalling kernel for 0.5-1 sec. Which is
>> > a lot given that we boot in seconds.
>>
>> Dmitry, would working on top of the aysnc series be reasonable? Then
>> we could address these as separate things which we'd build on top of.
>> The one aspect I see us needing to share is the "async probe universe
>> is OK" flag.
>
> Sure. Are you planning on refreshing your series?

Yes.

> I think the code-related discussion kind of stalled...

I was just waiting for any possible brain farts to flush out before a
new respin. I'll tackle this now.

 Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM

2014-09-11 Thread Dmitry Torokhov
On Thu, Sep 11, 2014 at 01:42:20PM -0700, Luis R. Rodriguez wrote:
> On Thu, Sep 11, 2014 at 1:23 PM, Dmitry Torokhov
>  wrote:
> >
> >>  There are elements in common, but by and
> >> large the biggest headaches at least in large device number boots have
> >> already been tackled by the enterprise crowd (they don't like their
> >> S390's or 1024 core NUMA systems taking half an hour to come up).
> >
> > Please do not position this as a mostly solved large systems problem,
> > For us it is touchpad detection stalling kernel for 0.5-1 sec. Which is
> > a lot given that we boot in seconds.
> 
> Dmitry, would working on top of the aysnc series be reasonable? Then
> we could address these as separate things which we'd build on top of.
> The one aspect I see us needing to share is the "async probe universe
> is OK" flag.

Sure. Are you planning on refreshing your series? I think the
code-related discussion kind of stalled...

-- 
Dmitry
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM

2014-09-11 Thread Luis R. Rodriguez
On Thu, Sep 11, 2014 at 1:23 PM, Dmitry Torokhov
 wrote:
>
>>  There are elements in common, but by and
>> large the biggest headaches at least in large device number boots have
>> already been tackled by the enterprise crowd (they don't like their
>> S390's or 1024 core NUMA systems taking half an hour to come up).
>
> Please do not position this as a mostly solved large systems problem,
> For us it is touchpad detection stalling kernel for 0.5-1 sec. Which is
> a lot given that we boot in seconds.

Dmitry, would working on top of the aysnc series be reasonable? Then
we could address these as separate things which we'd build on top of.
The one aspect I see us needing to share is the "async probe universe
is OK" flag.

 Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM

2014-09-11 Thread Dmitry Torokhov
On Thu, Sep 11, 2014 at 12:59:25PM -0700, James Bottomley wrote:
> 
> On Tue, 2014-09-09 at 16:01 -0700, Dmitry Torokhov wrote:
> > On Tuesday, September 09, 2014 03:46:23 PM James Bottomley wrote:
> > > On Wed, 2014-09-10 at 07:41 +0900, Tejun Heo wrote:
> > > > 
> > > > The thing is that we have to have dynamic mechanism to listen for
> > > > device attachments no matter what and such mechanism has been in place
> > > > for a long time at this point.  The synchronous wait simply doesn't
> > > > serve any purpose anymore and kinda gets in the way in that it makes
> > > > it a possibly extremely slow process to tell whether loading of a
> > > > module succeeded or not because the wait for the initial round of
> > > > probe is piggybacked.
> > > 
> > > OK, so we just fire and forget in userland ... why bother inventing an
> > > elaborate new infrastructure in the kernel to do exactly what
> > > 
> > > modprobe  &
> > > 
> > > would do?
> > 
> > Just so we do not forget: we also want the no-modules case to also be able
> > to probe asynchronously so that a slow device does not stall kernel booting.
> 
> Yes, but we mostly do this anyway.  SCSI for instance does asynchronous
> scanning of attached devices (once the cards are probed)

What would it do it card was a bit slow to probe?

> but has a sync
> point for ordering.

Quite often we do not really care about ordering of devices. I mean,
does it matter if your mouse is discovered before your keyboard or
after?

>
> The problem of speeding up boot is different from the one of init
> processes killing modprobes.

Right. One is systemd doing stupid things, another is kernel could be
smarter.

>  There are elements in common, but by and
> large the biggest headaches at least in large device number boots have
> already been tackled by the enterprise crowd (they don't like their
> S390's or 1024 core NUMA systems taking half an hour to come up).

Please do not position this as a mostly solved large systems problem,
For us it is touchpad detection stalling kernel for 0.5-1 sec. Which is
a lot given that we boot in seconds.

Thanks.

-- 
Dmitry
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM

2014-09-11 Thread James Bottomley

On Tue, 2014-09-09 at 16:01 -0700, Dmitry Torokhov wrote:
> On Tuesday, September 09, 2014 03:46:23 PM James Bottomley wrote:
> > On Wed, 2014-09-10 at 07:41 +0900, Tejun Heo wrote:
> > > 
> > > The thing is that we have to have dynamic mechanism to listen for
> > > device attachments no matter what and such mechanism has been in place
> > > for a long time at this point.  The synchronous wait simply doesn't
> > > serve any purpose anymore and kinda gets in the way in that it makes
> > > it a possibly extremely slow process to tell whether loading of a
> > > module succeeded or not because the wait for the initial round of
> > > probe is piggybacked.
> > 
> > OK, so we just fire and forget in userland ... why bother inventing an
> > elaborate new infrastructure in the kernel to do exactly what
> > 
> > modprobe  &
> > 
> > would do?
> 
> Just so we do not forget: we also want the no-modules case to also be able
> to probe asynchronously so that a slow device does not stall kernel booting.

Yes, but we mostly do this anyway.  SCSI for instance does asynchronous
scanning of attached devices (once the cards are probed) but has a sync
point for ordering.

The problem of speeding up boot is different from the one of init
processes killing modprobes.  There are elements in common, but by and
large the biggest headaches at least in large device number boots have
already been tackled by the enterprise crowd (they don't like their
S390's or 1024 core NUMA systems taking half an hour to come up).

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 10/11] megaraid_sas : MFI MPT linked list corruption fix

2014-09-11 Thread Kashyap Desai
> -Original Message-
> From: Tomas Henzl [mailto:the...@redhat.com]
> Sent: Friday, September 12, 2014 12:50 AM
> To: Kashyap Desai
> Cc: Sumit Saxena; linux-scsi@vger.kernel.org; Martin K. Petersen;
> Christoph
> Hellwig; jbottom...@parallels.com; aradford
> Subject: Re: [PATCH 10/11] megaraid_sas : MFI MPT linked list corruption
> fix
>
> On 09/11/2014 08:41 PM, Kashyap Desai wrote:
> > On Thu, Sep 11, 2014 at 5:53 PM, Tomas Henzl 
> wrote:
> >> On 09/11/2014 04:48 AM, Kashyap Desai wrote:
> >>> On Wed, Sep 10, 2014 at 8:36 PM, Tomas Henzl 
> wrote:
>  On 09/06/2014 03:25 PM, sumit.sax...@avagotech.com wrote:
> > Problem statement:
> > MFI link list in megaraid_sas driver is used from mfi-mpt
> > pass-through
> commands.
> > This list can be corrupted due to many possible race conditions in
> > driver and eventually we may see kernel panic.
> >
> > One example -
> > MFI frame is freed from calling process as driver send command via
> > polling method and interrupt for that command comes after driver
> > free mfi frame (actually even after some other context reuse the
> > mfi frame). When driver receive MPT frame in ISR, driver will be
> > using
> the index of MFI and access that MFI frame and finally in-used MFI frame’s
> list will be corrupted.
> >
> > High level description of new solution - Free MFI and MPT command
> > from same context.
> > Free both the command either from process (from where mfi-mpt
> > pass-through was called) or from ISR context. Do not split freeing
> > of MFI and MPT, because it creates the race condition which will do
> MFI/MPT list corruption.
> >
> > Renamed the cmd_pool_lock which is used in instance as well as
> fusion with below name.
> > mfi_pool_lock and mpt_pool_lock to add more code readability.
> >
> > Signed-off-by: Sumit Saxena 
> > Signed-off-by: Kashyap Desai 
> > ---
> >  drivers/scsi/megaraid/megaraid_sas.h|  25 +++-
> >  drivers/scsi/megaraid/megaraid_sas_base.c   | 196
> 
> >  drivers/scsi/megaraid/megaraid_sas_fusion.c |  95 ++
> >  drivers/scsi/megaraid/megaraid_sas_fusion.h |   2 +-
> >  4 files changed, 235 insertions(+), 83 deletions(-)
> >
> > diff --git a/drivers/scsi/megaraid/megaraid_sas.h
> > b/drivers/scsi/megaraid/megaraid_sas.h
> > index 156d4b9..f99db18 100644
> > --- a/drivers/scsi/megaraid/megaraid_sas.h
> > +++ b/drivers/scsi/megaraid/megaraid_sas.h
> > @@ -1016,6 +1016,12 @@ struct megasas_ctrl_info {
> >
> >  #define VD_EXT_DEBUG 0
> >
> > +enum MR_MFI_MPT_PTHR_FLAGS {
> > + MFI_MPT_DETACHED = 0,
> > + MFI_LIST_ADDED = 1,
> > + MFI_MPT_ATTACHED = 2,
> > +};
> > +
> >  /* Frame Type */
> >  #define IO_FRAME 0
> >  #define PTHRU_FRAME  1
> > @@ -1033,7 +1039,7 @@ struct megasas_ctrl_info {
> >  #define MEGASAS_IOCTL_CMD0
> >  #define MEGASAS_DEFAULT_CMD_TIMEOUT  90
> >  #define MEGASAS_THROTTLE_QUEUE_DEPTH 16
> > -
> > +#define MEGASAS_BLOCKED_CMD_TIMEOUT  60
> >  /*
> >   * FW reports the maximum of number of commands that it can
> accept (maximum
> >   * commands that can be outstanding) at any time. The driver must
> > report a @@ -1652,7 +1658,7 @@ struct megasas_instance {
> >   struct megasas_cmd **cmd_list;
> >   struct list_head cmd_pool;
> >   /* used to sync fire the cmd to fw */
> > - spinlock_t cmd_pool_lock;
> > + spinlock_t mfi_pool_lock;
> >   /* used to sync fire the cmd to fw */
> >   spinlock_t hba_lock;
> >   /* used to synch producer, consumer ptrs in dpc */ @@
> > -1839,6 +1845,11 @@ struct megasas_cmd {
> >
> >   struct list_head list;
> >   struct scsi_cmnd *scmd;
> > +
> > + void *mpt_pthr_cmd_blocked;
> > + atomic_t mfi_mpt_pthr;
> > + u8 is_wait_event;
> > +
> >   struct megasas_instance *instance;
> >   union {
> >   struct {
> > @@ -1927,4 +1938,14 @@ int
> megasas_set_crash_dump_params(struct
> > megasas_instance *instance,  void
> > megasas_free_host_crash_buffer(struct megasas_instance
> *instance);
> > void megasas_fusion_crash_dump_wq(struct work_struct *work);
> >
> > +void megasas_return_cmd_fusion(struct megasas_instance
> *instance,
> > + struct megasas_cmd_fusion *cmd); int
> > +megasas_issue_blocked_cmd(struct megasas_instance *instance,
> > + struct megasas_cmd *cmd, int timeout); void
> > +__megasas_return_cmd(struct megasas_instance *instance,
> > + struct megasas_cmd *cmd);
> > +
> > +void megasas_return_mfi_mpt_pthr(struct megasas_instance
> *instance,
> > + struct megasas_cmd *cmd_m

Re: [PATCH 10/11] megaraid_sas : MFI MPT linked list corruption fix

2014-09-11 Thread Tomas Henzl
On 09/11/2014 08:41 PM, Kashyap Desai wrote:
> On Thu, Sep 11, 2014 at 5:53 PM, Tomas Henzl  wrote:
>> On 09/11/2014 04:48 AM, Kashyap Desai wrote:
>>> On Wed, Sep 10, 2014 at 8:36 PM, Tomas Henzl  wrote:
 On 09/06/2014 03:25 PM, sumit.sax...@avagotech.com wrote:
> Problem statement:
> MFI link list in megaraid_sas driver is used from mfi-mpt pass-through 
> commands.
> This list can be corrupted due to many possible race conditions in driver 
> and
> eventually we may see kernel panic.
>
> One example -
> MFI frame is freed from calling process as driver send command via 
> polling method and interrupt
> for that command comes after driver free mfi frame (actually even after 
> some other context reuse
> the mfi frame). When driver receive MPT frame in ISR, driver will be 
> using the index of MFI and
> access that MFI frame and finally in-used MFI frame’s list will be 
> corrupted.
>
> High level description of new solution -
> Free MFI and MPT command from same context.
> Free both the command either from process (from where mfi-mpt 
> pass-through was called) or from
> ISR context. Do not split freeing of MFI and MPT, because it creates the 
> race condition which
> will do MFI/MPT list corruption.
>
> Renamed the cmd_pool_lock which is used in instance as well as fusion 
> with below name.
> mfi_pool_lock and mpt_pool_lock to add more code readability.
>
> Signed-off-by: Sumit Saxena 
> Signed-off-by: Kashyap Desai 
> ---
>  drivers/scsi/megaraid/megaraid_sas.h|  25 +++-
>  drivers/scsi/megaraid/megaraid_sas_base.c   | 196 
> 
>  drivers/scsi/megaraid/megaraid_sas_fusion.c |  95 ++
>  drivers/scsi/megaraid/megaraid_sas_fusion.h |   2 +-
>  4 files changed, 235 insertions(+), 83 deletions(-)
>
> diff --git a/drivers/scsi/megaraid/megaraid_sas.h 
> b/drivers/scsi/megaraid/megaraid_sas.h
> index 156d4b9..f99db18 100644
> --- a/drivers/scsi/megaraid/megaraid_sas.h
> +++ b/drivers/scsi/megaraid/megaraid_sas.h
> @@ -1016,6 +1016,12 @@ struct megasas_ctrl_info {
>
>  #define VD_EXT_DEBUG 0
>
> +enum MR_MFI_MPT_PTHR_FLAGS {
> + MFI_MPT_DETACHED = 0,
> + MFI_LIST_ADDED = 1,
> + MFI_MPT_ATTACHED = 2,
> +};
> +
>  /* Frame Type */
>  #define IO_FRAME 0
>  #define PTHRU_FRAME  1
> @@ -1033,7 +1039,7 @@ struct megasas_ctrl_info {
>  #define MEGASAS_IOCTL_CMD0
>  #define MEGASAS_DEFAULT_CMD_TIMEOUT  90
>  #define MEGASAS_THROTTLE_QUEUE_DEPTH 16
> -
> +#define MEGASAS_BLOCKED_CMD_TIMEOUT  60
>  /*
>   * FW reports the maximum of number of commands that it can accept 
> (maximum
>   * commands that can be outstanding) at any time. The driver must report 
> a
> @@ -1652,7 +1658,7 @@ struct megasas_instance {
>   struct megasas_cmd **cmd_list;
>   struct list_head cmd_pool;
>   /* used to sync fire the cmd to fw */
> - spinlock_t cmd_pool_lock;
> + spinlock_t mfi_pool_lock;
>   /* used to sync fire the cmd to fw */
>   spinlock_t hba_lock;
>   /* used to synch producer, consumer ptrs in dpc */
> @@ -1839,6 +1845,11 @@ struct megasas_cmd {
>
>   struct list_head list;
>   struct scsi_cmnd *scmd;
> +
> + void *mpt_pthr_cmd_blocked;
> + atomic_t mfi_mpt_pthr;
> + u8 is_wait_event;
> +
>   struct megasas_instance *instance;
>   union {
>   struct {
> @@ -1927,4 +1938,14 @@ int megasas_set_crash_dump_params(struct 
> megasas_instance *instance,
>  void megasas_free_host_crash_buffer(struct megasas_instance *instance);
>  void megasas_fusion_crash_dump_wq(struct work_struct *work);
>
> +void megasas_return_cmd_fusion(struct megasas_instance *instance,
> + struct megasas_cmd_fusion *cmd);
> +int megasas_issue_blocked_cmd(struct megasas_instance *instance,
> + struct megasas_cmd *cmd, int timeout);
> +void __megasas_return_cmd(struct megasas_instance *instance,
> + struct megasas_cmd *cmd);
> +
> +void megasas_return_mfi_mpt_pthr(struct megasas_instance *instance,
> + struct megasas_cmd *cmd_mfi, struct megasas_cmd_fusion *cmd_fusion);
> +
>  #endif   /*LSI_MEGARAID_SAS_H */
> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c 
> b/drivers/scsi/megaraid/megaraid_sas_base.c
> index 086beee..50d69eb 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> @@ -210,43 +210,66 @@ struct megasas_cmd *megasas_get_cmd(struct 
> megasas_instance
>   unsigned long

Re: [PATCH 04/11] megaraid_sas : Firmware crash dump feature support

2014-09-11 Thread Kashyap Desai
On Fri, Sep 12, 2014 at 12:28 AM, Tomas Henzl  wrote:
> On 09/11/2014 06:39 PM, Kashyap Desai wrote:
>> On Thu, Sep 11, 2014 at 4:50 PM, Tomas Henzl  wrote:
>>> On 09/11/2014 11:02 AM, Kashyap Desai wrote:
> -Original Message-
> From: Vivek Goyal [mailto:vgo...@redhat.com]
> Sent: Wednesday, September 10, 2014 9:17 PM
> To: Tomas Henzl
> Cc: Elliott, Robert (Server Storage); Sumit Saxena;
 linux-scsi@vger.kernel.org;
> martin.peter...@oracle.com; h...@infradead.org;
> jbottom...@parallels.com; Kashyap Desai; aradf...@gmail.com; Michal
> Schmidt; am...@mellanox.com; Jens Axboe 
> (ax...@kernel.dk); scame...@beardog.cce.hp.com
> Subject: Re: [PATCH 04/11] megaraid_sas : Firmware crash dump feature
> support
>
> On Wed, Sep 10, 2014 at 05:28:40PM +0200, Tomas Henzl wrote:
>> On 09/10/2014 05:06 PM, Elliott, Robert (Server Storage) wrote:
 From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
 ow...@vger.kernel.org] On Behalf Of Sumit Saxena

> From: Tomas Henzl [mailto:the...@redhat.com]
>
> With several controllers in a system this may take a lot memory,
> could you also in case when a kdump kernel is running lower it, by
> not using this feature?
>
 Agreed, we will disable this feature for kdump kernel by adding
 "reset_devices" global varaiable.
 That check is required for only one place, throughout the code,
 this feature will remain disabled.  Code snippet for the same-

 instance->crash_dump_drv_support = (!reset_devices) &&
 crashdump_enable &&
 instance->crash_dump_fw_support &&
 instance->crash_dump_buf);
 if(instance->crash_dump_drv_support) {
 printk(KERN_INFO "megaraid_sas: FW Crash dump is
 supported\n");
 megasas_set_crash_dump_params(instance,
 MR_CRASH_BUF_TURN_OFF);

 } else {
 ..
 }
>>> Network drivers have been running into similar problems.
>>>
>>> There's a new patch from Amir coming through net-next to make
>>> is_kdump_kernel() (in crash_dump.h) accessible to modules.
>>> That may be a better signal than reset_devices that the driver
>>> should use minimal resources.
>>>
>>> http://comments.gmane.org/gmane.linux.network/324737
>>>
>>> I'm not sure about the logistics of a SCSI patch depending on a
>>> net-next patch.
>> Probably better to start with reset_devices and switch to
>> is_kdump_kernel() later.
>> This is not a discussion about reset_devices versus is_kdump_kernel,
>> but while it looks good to have it distinguished - is the
>> reset_devices actually used anywhere else than in kdump kernel?
> I think usage of reset_devices for lowering memory footprint of driver
 is
> plain wrong. It tells driver to only reset the device as BIOS might not
 have
> done it right or we skipped BIOS completely.
>
> Using is_kdump_kernel() is also not perfect either but atleast better
 than
> reset_devices.
 We will use is_kdump_kernel() not to enable this feature in Kdump kernel.
>>> OK, just keep in mind that the is_kdump_kernel() is not yet in mainline.
>> Sure I read that part, but when I check kernel source I found
>> is_kdump_kernel() is available.
>>
>> http://lxr.free-electrons.com/source/include/linux/crash_dump.h#L55
>> Let's do one thing - We will submit a separate patch for this to avoid
>> any confusion.
>>
>>
 MegaRaid Driver will not allocate Host memory (which is used to collect
 complete FW crash image) until and unless associated FW is crashed/IO
 timeout.
>>> This is new for the driver? A previous reply stated that the driver will
>>> allocate 1MB for each MR controller at the 'start of the day'.
>>> If I misunderstood it somehow then nothing is needed.
>> This is correct. 1MB DMA buffer per controller will be allocated
>> irrespective of FW is crashed or not.
>> When FW actually crash driver will go and try to allocate upto 512MB
>> memory. I though you queried for 512 MB memory.
>
> Memory allocated for a crashkernel depends on system configuration,
> values below 100MB are often used - the less the better.
> Every MB counts.

It is true that we don't need full feature set in kdump mode and as
specially FW crash dump feature is better to disable in kdump mode.
Couple of places megaraid_sas driver use  reset_devices variable, so
will continue to use that and will resend patch
Later when we have better API like "is_kdump_kernel()" we can replace
in whole megaraid_sas driver.

>
>>
 Driver do allocation only if required. Also, it will break the function
 immediately after memory allocation is failed and operate on available
 memory.

 To 

Re: [PATCH 04/11] megaraid_sas : Firmware crash dump feature support

2014-09-11 Thread Tomas Henzl
On 09/11/2014 06:39 PM, Kashyap Desai wrote:
> On Thu, Sep 11, 2014 at 4:50 PM, Tomas Henzl  wrote:
>> On 09/11/2014 11:02 AM, Kashyap Desai wrote:
 -Original Message-
 From: Vivek Goyal [mailto:vgo...@redhat.com]
 Sent: Wednesday, September 10, 2014 9:17 PM
 To: Tomas Henzl
 Cc: Elliott, Robert (Server Storage); Sumit Saxena;
>>> linux-scsi@vger.kernel.org;
 martin.peter...@oracle.com; h...@infradead.org;
 jbottom...@parallels.com; Kashyap Desai; aradf...@gmail.com; Michal
 Schmidt; am...@mellanox.com; Jens Axboe 
 (ax...@kernel.dk); scame...@beardog.cce.hp.com
 Subject: Re: [PATCH 04/11] megaraid_sas : Firmware crash dump feature
 support

 On Wed, Sep 10, 2014 at 05:28:40PM +0200, Tomas Henzl wrote:
> On 09/10/2014 05:06 PM, Elliott, Robert (Server Storage) wrote:
>>> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
>>> ow...@vger.kernel.org] On Behalf Of Sumit Saxena
>>>
 From: Tomas Henzl [mailto:the...@redhat.com]

 With several controllers in a system this may take a lot memory,
 could you also in case when a kdump kernel is running lower it, by
 not using this feature?

>>> Agreed, we will disable this feature for kdump kernel by adding
>>> "reset_devices" global varaiable.
>>> That check is required for only one place, throughout the code,
>>> this feature will remain disabled.  Code snippet for the same-
>>>
>>> instance->crash_dump_drv_support = (!reset_devices) &&
>>> crashdump_enable &&
>>> instance->crash_dump_fw_support &&
>>> instance->crash_dump_buf);
>>> if(instance->crash_dump_drv_support) {
>>> printk(KERN_INFO "megaraid_sas: FW Crash dump is
>>> supported\n");
>>> megasas_set_crash_dump_params(instance,
>>> MR_CRASH_BUF_TURN_OFF);
>>>
>>> } else {
>>> ..
>>> }
>> Network drivers have been running into similar problems.
>>
>> There's a new patch from Amir coming through net-next to make
>> is_kdump_kernel() (in crash_dump.h) accessible to modules.
>> That may be a better signal than reset_devices that the driver
>> should use minimal resources.
>>
>> http://comments.gmane.org/gmane.linux.network/324737
>>
>> I'm not sure about the logistics of a SCSI patch depending on a
>> net-next patch.
> Probably better to start with reset_devices and switch to
> is_kdump_kernel() later.
> This is not a discussion about reset_devices versus is_kdump_kernel,
> but while it looks good to have it distinguished - is the
> reset_devices actually used anywhere else than in kdump kernel?
 I think usage of reset_devices for lowering memory footprint of driver
>>> is
 plain wrong. It tells driver to only reset the device as BIOS might not
>>> have
 done it right or we skipped BIOS completely.

 Using is_kdump_kernel() is also not perfect either but atleast better
>>> than
 reset_devices.
>>> We will use is_kdump_kernel() not to enable this feature in Kdump kernel.
>> OK, just keep in mind that the is_kdump_kernel() is not yet in mainline.
> Sure I read that part, but when I check kernel source I found
> is_kdump_kernel() is available.
>
> http://lxr.free-electrons.com/source/include/linux/crash_dump.h#L55
> Let's do one thing - We will submit a separate patch for this to avoid
> any confusion.
>
>
>>> MegaRaid Driver will not allocate Host memory (which is used to collect
>>> complete FW crash image) until and unless associated FW is crashed/IO
>>> timeout.
>> This is new for the driver? A previous reply stated that the driver will
>> allocate 1MB for each MR controller at the 'start of the day'.
>> If I misunderstood it somehow then nothing is needed.
> This is correct. 1MB DMA buffer per controller will be allocated
> irrespective of FW is crashed or not.
> When FW actually crash driver will go and try to allocate upto 512MB
> memory. I though you queried for 512 MB memory.

Memory allocated for a crashkernel depends on system configuration,
values below 100MB are often used - the less the better.
Every MB counts.

>
>>> Driver do allocation only if required. Also, it will break the function
>>> immediately after memory allocation is failed and operate on available
>>> memory.
>>>
>>> To be on safer side, we can disable this feature in Kdump mode.
>>>
>>> ~ Kashyap
 Thanks
 Vivek
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/11] megaraid_sas : MFI MPT linked list corruption fix

2014-09-11 Thread Kashyap Desai
On Thu, Sep 11, 2014 at 5:53 PM, Tomas Henzl  wrote:
> On 09/11/2014 04:48 AM, Kashyap Desai wrote:
>> On Wed, Sep 10, 2014 at 8:36 PM, Tomas Henzl  wrote:
>>> On 09/06/2014 03:25 PM, sumit.sax...@avagotech.com wrote:
 Problem statement:
 MFI link list in megaraid_sas driver is used from mfi-mpt pass-through 
 commands.
 This list can be corrupted due to many possible race conditions in driver 
 and
 eventually we may see kernel panic.

 One example -
 MFI frame is freed from calling process as driver send command via polling 
 method and interrupt
 for that command comes after driver free mfi frame (actually even after 
 some other context reuse
 the mfi frame). When driver receive MPT frame in ISR, driver will be using 
 the index of MFI and
 access that MFI frame and finally in-used MFI frame’s list will be 
 corrupted.

 High level description of new solution -
 Free MFI and MPT command from same context.
 Free both the command either from process (from where mfi-mpt pass-through 
 was called) or from
 ISR context. Do not split freeing of MFI and MPT, because it creates the 
 race condition which
 will do MFI/MPT list corruption.

 Renamed the cmd_pool_lock which is used in instance as well as fusion with 
 below name.
 mfi_pool_lock and mpt_pool_lock to add more code readability.

 Signed-off-by: Sumit Saxena 
 Signed-off-by: Kashyap Desai 
 ---
  drivers/scsi/megaraid/megaraid_sas.h|  25 +++-
  drivers/scsi/megaraid/megaraid_sas_base.c   | 196 
 
  drivers/scsi/megaraid/megaraid_sas_fusion.c |  95 ++
  drivers/scsi/megaraid/megaraid_sas_fusion.h |   2 +-
  4 files changed, 235 insertions(+), 83 deletions(-)

 diff --git a/drivers/scsi/megaraid/megaraid_sas.h 
 b/drivers/scsi/megaraid/megaraid_sas.h
 index 156d4b9..f99db18 100644
 --- a/drivers/scsi/megaraid/megaraid_sas.h
 +++ b/drivers/scsi/megaraid/megaraid_sas.h
 @@ -1016,6 +1016,12 @@ struct megasas_ctrl_info {

  #define VD_EXT_DEBUG 0

 +enum MR_MFI_MPT_PTHR_FLAGS {
 + MFI_MPT_DETACHED = 0,
 + MFI_LIST_ADDED = 1,
 + MFI_MPT_ATTACHED = 2,
 +};
 +
  /* Frame Type */
  #define IO_FRAME 0
  #define PTHRU_FRAME  1
 @@ -1033,7 +1039,7 @@ struct megasas_ctrl_info {
  #define MEGASAS_IOCTL_CMD0
  #define MEGASAS_DEFAULT_CMD_TIMEOUT  90
  #define MEGASAS_THROTTLE_QUEUE_DEPTH 16
 -
 +#define MEGASAS_BLOCKED_CMD_TIMEOUT  60
  /*
   * FW reports the maximum of number of commands that it can accept 
 (maximum
   * commands that can be outstanding) at any time. The driver must report a
 @@ -1652,7 +1658,7 @@ struct megasas_instance {
   struct megasas_cmd **cmd_list;
   struct list_head cmd_pool;
   /* used to sync fire the cmd to fw */
 - spinlock_t cmd_pool_lock;
 + spinlock_t mfi_pool_lock;
   /* used to sync fire the cmd to fw */
   spinlock_t hba_lock;
   /* used to synch producer, consumer ptrs in dpc */
 @@ -1839,6 +1845,11 @@ struct megasas_cmd {

   struct list_head list;
   struct scsi_cmnd *scmd;
 +
 + void *mpt_pthr_cmd_blocked;
 + atomic_t mfi_mpt_pthr;
 + u8 is_wait_event;
 +
   struct megasas_instance *instance;
   union {
   struct {
 @@ -1927,4 +1938,14 @@ int megasas_set_crash_dump_params(struct 
 megasas_instance *instance,
  void megasas_free_host_crash_buffer(struct megasas_instance *instance);
  void megasas_fusion_crash_dump_wq(struct work_struct *work);

 +void megasas_return_cmd_fusion(struct megasas_instance *instance,
 + struct megasas_cmd_fusion *cmd);
 +int megasas_issue_blocked_cmd(struct megasas_instance *instance,
 + struct megasas_cmd *cmd, int timeout);
 +void __megasas_return_cmd(struct megasas_instance *instance,
 + struct megasas_cmd *cmd);
 +
 +void megasas_return_mfi_mpt_pthr(struct megasas_instance *instance,
 + struct megasas_cmd *cmd_mfi, struct megasas_cmd_fusion *cmd_fusion);
 +
  #endif   /*LSI_MEGARAID_SAS_H */
 diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c 
 b/drivers/scsi/megaraid/megaraid_sas_base.c
 index 086beee..50d69eb 100644
 --- a/drivers/scsi/megaraid/megaraid_sas_base.c
 +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
 @@ -210,43 +210,66 @@ struct megasas_cmd *megasas_get_cmd(struct 
 megasas_instance
   unsigned long flags;
   struct megasas_cmd *cmd = NULL;

 - spin_lock_irqsave(&instance->cmd_pool_lock, flags);
 + spin_lock_irqsave(&instance->mfi_pool_l

Re: [PATCH 04/11] megaraid_sas : Firmware crash dump feature support

2014-09-11 Thread Vivek Goyal
On Thu, Sep 11, 2014 at 10:09:27PM +0530, Kashyap Desai wrote:

[..]
> > OK, just keep in mind that the is_kdump_kernel() is not yet in mainline.
> 
> Sure I read that part, but when I check kernel source I found
> is_kdump_kernel() is available.
> 
> http://lxr.free-electrons.com/source/include/linux/crash_dump.h#L55
> Let's do one thing - We will submit a separate patch for this to avoid
> any confusion.

is_kdump_kernel() has been there for long time just that it was not
exported. Now a patch is sitting in dave miller's tree to export it
and make it available to drivers.

Thanks
Vivek
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 04/11] megaraid_sas : Firmware crash dump feature support

2014-09-11 Thread Kashyap Desai
On Thu, Sep 11, 2014 at 4:50 PM, Tomas Henzl  wrote:
> On 09/11/2014 11:02 AM, Kashyap Desai wrote:
>>> -Original Message-
>>> From: Vivek Goyal [mailto:vgo...@redhat.com]
>>> Sent: Wednesday, September 10, 2014 9:17 PM
>>> To: Tomas Henzl
>>> Cc: Elliott, Robert (Server Storage); Sumit Saxena;
>> linux-scsi@vger.kernel.org;
>>> martin.peter...@oracle.com; h...@infradead.org;
>>> jbottom...@parallels.com; Kashyap Desai; aradf...@gmail.com; Michal
>>> Schmidt; am...@mellanox.com; Jens Axboe 
>>> (ax...@kernel.dk); scame...@beardog.cce.hp.com
>>> Subject: Re: [PATCH 04/11] megaraid_sas : Firmware crash dump feature
>>> support
>>>
>>> On Wed, Sep 10, 2014 at 05:28:40PM +0200, Tomas Henzl wrote:
 On 09/10/2014 05:06 PM, Elliott, Robert (Server Storage) wrote:
>> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
>> ow...@vger.kernel.org] On Behalf Of Sumit Saxena
>>
>>> From: Tomas Henzl [mailto:the...@redhat.com]
>>>
>>> With several controllers in a system this may take a lot memory,
>>> could you also in case when a kdump kernel is running lower it, by
>>> not using this feature?
>>>
>> Agreed, we will disable this feature for kdump kernel by adding
>> "reset_devices" global varaiable.
>> That check is required for only one place, throughout the code,
>> this feature will remain disabled.  Code snippet for the same-
>>
>> instance->crash_dump_drv_support = (!reset_devices) &&
>> crashdump_enable &&
>> instance->crash_dump_fw_support &&
>> instance->crash_dump_buf);
>> if(instance->crash_dump_drv_support) {
>> printk(KERN_INFO "megaraid_sas: FW Crash dump is
>> supported\n");
>> megasas_set_crash_dump_params(instance,
>> MR_CRASH_BUF_TURN_OFF);
>>
>> } else {
>> ..
>> }
> Network drivers have been running into similar problems.
>
> There's a new patch from Amir coming through net-next to make
> is_kdump_kernel() (in crash_dump.h) accessible to modules.
> That may be a better signal than reset_devices that the driver
> should use minimal resources.
>
> http://comments.gmane.org/gmane.linux.network/324737
>
> I'm not sure about the logistics of a SCSI patch depending on a
> net-next patch.
 Probably better to start with reset_devices and switch to
 is_kdump_kernel() later.
 This is not a discussion about reset_devices versus is_kdump_kernel,
 but while it looks good to have it distinguished - is the
 reset_devices actually used anywhere else than in kdump kernel?
>>> I think usage of reset_devices for lowering memory footprint of driver
>> is
>>> plain wrong. It tells driver to only reset the device as BIOS might not
>> have
>>> done it right or we skipped BIOS completely.
>>>
>>> Using is_kdump_kernel() is also not perfect either but atleast better
>> than
>>> reset_devices.
>> We will use is_kdump_kernel() not to enable this feature in Kdump kernel.
>
> OK, just keep in mind that the is_kdump_kernel() is not yet in mainline.

Sure I read that part, but when I check kernel source I found
is_kdump_kernel() is available.

http://lxr.free-electrons.com/source/include/linux/crash_dump.h#L55
Let's do one thing - We will submit a separate patch for this to avoid
any confusion.


>
>>
>> MegaRaid Driver will not allocate Host memory (which is used to collect
>> complete FW crash image) until and unless associated FW is crashed/IO
>> timeout.
>
> This is new for the driver? A previous reply stated that the driver will
> allocate 1MB for each MR controller at the 'start of the day'.
> If I misunderstood it somehow then nothing is needed.

This is correct. 1MB DMA buffer per controller will be allocated
irrespective of FW is crashed or not.
When FW actually crash driver will go and try to allocate upto 512MB
memory. I though you queried for 512 MB memory.

>
>> Driver do allocation only if required. Also, it will break the function
>> immediately after memory allocation is failed and operate on available
>> memory.
>>
>> To be on safer side, we can disable this feature in Kdump mode.
>>
>> ~ Kashyap
>>> Thanks
>>> Vivek
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



-- 
Device Driver Developer @ Avagotech
Kashyap D. Desai
Note - my new email address
kashyap.de...@avagotech.com
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [REGRESSION 3.17] scsi (uas) disks no longer using tagged command queuing

2014-09-11 Thread Christoph Hellwig
On Thu, Sep 11, 2014 at 12:01:13PM +0200, Hans de Goede wrote:
> > So we're initializing the tag map, but scsi_activate_tcq doesn't pick it
> > up.  I can't really come up with a good explanation for it, but there
> > even without that there is an elephant in the room:  as part of the
> > scsi-mq series I moved the bqt field used for this into a union with the
> > new blk_mq_tag_set.  Below is a patch to get rid of that union, can you
> > try if that fixes it?
> 
> Unfortunately that does not fix it :|

Alright, I guess we need the big bisect hammer unfortunately.

We should be able to start with trying the first and last commit
of the big SCSI merge:

 start: fcc95a763444017288b318d48367098850c23c0d
 end:   c21a2c1a4973c8dde32557970fdb44eaa9489aeb

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 16/17] arcmsr: support new adapter ARC12x4 series

2014-09-11 Thread Tomas Henzl
On 09/11/2014 05:59 AM, Ching Huang wrote:
> On Wed, 2014-09-10 at 11:58 +0200, Tomas Henzl wrote:
>> On 09/09/2014 06:30 PM, Christoph Hellwig wrote:
>>> Ching,
>>>
>>> do you have a chance to address Thomas second concern below?  As
>>> far as I can tell (Thomas, please correct me) that's the last
>>> outstanding concern, and I'd really like to merge the arcmsr updates
>>> for the Linux 3.18 merge window.
>> Correct, still awaiting a response.
> Christoph, Tomas,
>
> Sorry for the late reply.
>
> I think I misunderstand Tomas' meaning.
> The spin lock in arcmsr_hbaD_polling_ccbdone() is necessary to protect 
> doneq_index and have to be modified as following.

OK, so you are going to repost 16/17 ? If so, please describe all changes 
you'll do, in that new post.

>
> static int arcmsr_hbaD_polling_ccbdone(struct AdapterControlBlock *acb,
>   struct CommandControlBlock *poll_ccb)
> {
>   bool error;
>   uint32_t poll_ccb_done = 0, poll_count = 0, flag_ccb, ccb_cdb_phy;
>   int rtn, doneq_index, index_stripped, outbound_write_pointer, toggle;
>   unsigned long flags;
>   struct ARCMSR_CDB *arcmsr_cdb;
>   struct CommandControlBlock *pCCB;
>   struct MessageUnit_D *pmu = acb->pmuD;
>
> polling_hbaD_ccb_retry:
>   poll_count++;
>   while (1) {
>   spin_lock_irqsave(&acb->doneq_lock, flags);
>   outbound_write_pointer = pmu->done_qbuffer[0].addressLow + 1;
>   doneq_index = pmu->doneq_index;
>   if ((outbound_write_pointer & 0xFFF) == (doneq_index & 0xFFF)) {
>   spin_unlock_irqrestore(&acb->doneq_lock, flags);
>   if (poll_ccb_done) {
>   rtn = SUCCESS;
>   break;
>   } else {
>   msleep(25);
>   if (poll_count > 40) {
>   rtn = FAILED;
>   break;
>   }
>   goto polling_hbaD_ccb_retry;
>   }
>   }
>   toggle = doneq_index & 0x4000;
>   index_stripped = (doneq_index & 0xFFF) + 1;
>   index_stripped %= ARCMSR_MAX_ARC1214_DONEQUEUE;
>   pmu->doneq_index = index_stripped ? (index_stripped | toggle) :
>   ((index_stripped + 1) | (toggle ^ 0x4000));
>   spin_unlock_irqrestore(&acb->doneq_lock, flags);
>   doneq_index = pmu->doneq_index;
>   flag_ccb = pmu->done_qbuffer[doneq_index & 0xFFF].addressLow;
>   ccb_cdb_phy = (flag_ccb & 0xFFF0);
>   arcmsr_cdb = (struct ARCMSR_CDB *)(acb->vir2phy_offset +
>   ccb_cdb_phy);
>   pCCB = container_of(arcmsr_cdb, struct CommandControlBlock,
>   arcmsr_cdb);
>   poll_ccb_done |= (pCCB == poll_ccb) ? 1 : 0;
>   if ((pCCB->acb != acb) ||
>   (pCCB->startdone != ARCMSR_CCB_START)) {
>   if (pCCB->startdone == ARCMSR_CCB_ABORTED) {
>   pr_notice("arcmsr%d: scsi id = %d "
>   "lun = %d ccb = '0x%p' poll command "
>   "abort successfully\n"
>   , acb->host->host_no
>   , pCCB->pcmd->device->id
>   , (u32)pCCB->pcmd->device->lun
>   , pCCB);
>   pCCB->pcmd->result = DID_ABORT << 16;
>   arcmsr_ccb_complete(pCCB);
>   continue;
>   }
>   pr_notice("arcmsr%d: polling an illegal "
>   "ccb command done ccb = '0x%p' "
>   "ccboutstandingcount = %d\n"
>   , acb->host->host_no
>   , pCCB
>   , atomic_read(&acb->ccboutstandingcount));
>   continue;
>   }
>   error = (flag_ccb & ARCMSR_CCBREPLY_FLAG_ERROR_MODE1)
>   ? true : false;
>   arcmsr_report_ccb_state(acb, pCCB, error);
>   }
>   return rtn;
> }
>
>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>>> the body of a message to majord...@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.ker

Re: [PATCH] xen-scsifront: don't deadlock if the ring becomes full

2014-09-11 Thread Juergen Gross

On 09/11/2014 03:31 PM, David Vrabel wrote:

scsifront_action_handler() will deadlock on host->host_lock, if the
ring is full and it has to wait for entries to become available.

Signed-off-by: David Vrabel 
---
This was found with sparse. I've not tested it.


Test might be difficult. :-)

Reviewed-by: Juergen Gross 

Thanks for spotting this.

Juergen


---
  drivers/scsi/xen-scsifront.c |3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/xen-scsifront.c b/drivers/scsi/xen-scsifront.c
index 7e88659..cc14c8d 100644
--- a/drivers/scsi/xen-scsifront.c
+++ b/drivers/scsi/xen-scsifront.c
@@ -541,8 +541,9 @@ static int scsifront_action_handler(struct scsi_cmnd *sc, 
uint8_t act)
if (!shadow)
return FAILED;

+   spin_lock_irq(host->host_lock);
+
for (;;) {
-   spin_lock_irq(host->host_lock);
if (!RING_FULL(&info->ring)) {
ring_req = scsifront_command2ring(info, sc, shadow);
if (ring_req)



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] xen-scsifront: don't deadlock if the ring becomes full

2014-09-11 Thread David Vrabel
scsifront_action_handler() will deadlock on host->host_lock, if the
ring is full and it has to wait for entries to become available.

Signed-off-by: David Vrabel 
---
This was found with sparse. I've not tested it.
---
 drivers/scsi/xen-scsifront.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/xen-scsifront.c b/drivers/scsi/xen-scsifront.c
index 7e88659..cc14c8d 100644
--- a/drivers/scsi/xen-scsifront.c
+++ b/drivers/scsi/xen-scsifront.c
@@ -541,8 +541,9 @@ static int scsifront_action_handler(struct scsi_cmnd *sc, 
uint8_t act)
if (!shadow)
return FAILED;
 
+   spin_lock_irq(host->host_lock);
+
for (;;) {
-   spin_lock_irq(host->host_lock);
if (!RING_FULL(&info->ring)) {
ring_req = scsifront_command2ring(info, sc, shadow);
if (ring_req)
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V3 11/16] scsi: ufs: refactor configuring power mode

2014-09-11 Thread Akinobu Mita
2014-09-10 20:54 GMT+09:00 Dolev Raviv :
> +static int ufshcd_config_pwr_mode(struct ufs_hba *hba,
> +   struct ufs_pa_layer_attr *desired_pwr_mode)
> +{
> +   struct ufs_pa_layer_attr final_params = { 0 };
> +   int ret;
> +
> +   if (hba->vops->pwr_change_notify)

If hba->vops is null as no vendor specific callbacks, this causes
a null pointer dereference.  So null check for hba->vops is also needed
just like other callbacks.

> +   hba->vops->pwr_change_notify(hba,
> +PRE_CHANGE, desired_pwr_mode, &final_params);
> +   else
> +   memcpy(&final_params, desired_pwr_mode, sizeof(final_params));
> +
> /*
>  * Configure attributes for power mode change with below.
>  * - PA_RXGEAR, PA_ACTIVERXDATALANES, PA_RXTERMINATION,
>  * - PA_TXGEAR, PA_ACTIVETXDATALANES, PA_TXTERMINATION,
>  * - PA_HSSERIES
>  */
> -   ufshcd_dme_set(hba, UIC_ARG_MIB(PA_RXGEAR), gear[RX]);
> -   ufshcd_dme_set(hba, UIC_ARG_MIB(PA_ACTIVERXDATALANES), lanes[RX]);
> -   if (pwr[RX] == FASTAUTO_MODE)
> +   ufshcd_dme_set(hba, UIC_ARG_MIB(PA_RXGEAR), final_params.gear_rx);
> +   ufshcd_dme_set(hba, UIC_ARG_MIB(PA_ACTIVERXDATALANES),
> +   final_params.lane_rx);
> +   if (final_params.pwr_rx == FASTAUTO_MODE ||
> +   final_params.pwr_rx == FAST_MODE)
> ufshcd_dme_set(hba, UIC_ARG_MIB(PA_RXTERMINATION), TRUE);
> +   else
> +   ufshcd_dme_set(hba, UIC_ARG_MIB(PA_RXTERMINATION), FALSE);
>
> -   ufshcd_dme_set(hba, UIC_ARG_MIB(PA_TXGEAR), gear[TX]);
> -   ufshcd_dme_set(hba, UIC_ARG_MIB(PA_ACTIVETXDATALANES), lanes[TX]);
> -   if (pwr[TX] == FASTAUTO_MODE)
> +   ufshcd_dme_set(hba, UIC_ARG_MIB(PA_TXGEAR), final_params.gear_tx);
> +   ufshcd_dme_set(hba, UIC_ARG_MIB(PA_ACTIVETXDATALANES),
> +   final_params.lane_tx);
> +   if (final_params.pwr_tx == FASTAUTO_MODE ||
> +   final_params.pwr_tx == FAST_MODE)
> ufshcd_dme_set(hba, UIC_ARG_MIB(PA_TXTERMINATION), TRUE);
> -
> -   if (pwr[RX] == FASTAUTO_MODE || pwr[TX] == FASTAUTO_MODE)
> -   ufshcd_dme_set(hba, UIC_ARG_MIB(PA_HSSERIES), PA_HS_MODE_B);
> -
> -   ret = ufshcd_uic_change_pwr_mode(hba, pwr[RX] << 4 | pwr[TX]);
> -   if (ret)
> +   else
> +   ufshcd_dme_set(hba, UIC_ARG_MIB(PA_TXTERMINATION), FALSE);
> +
> +   if ((final_params.pwr_rx == FASTAUTO_MODE ||
> +   final_params.pwr_tx == FASTAUTO_MODE ||
> +   final_params.pwr_rx == FAST_MODE ||
> +   final_params.pwr_tx == FAST_MODE) &&
> +   final_params.hs_rate == PA_HS_MODE_B)
> +   ufshcd_dme_set(hba, UIC_ARG_MIB(PA_HSSERIES),
> +   final_params.hs_rate);
> +
> +   ret = ufshcd_uic_change_pwr_mode(hba, final_params.pwr_rx << 4
> +   | final_params.pwr_tx);
> +   if (ret) {
> dev_err(hba->dev,
> "pwr_mode: power mode change failed %d\n", ret);
> +   } else {
> +   if (hba->vops->pwr_change_notify)

Ditto.

> +   hba->vops->pwr_change_notify(hba,
> +   POST_CHANGE, NULL, &final_params);
> +
> +   memcpy(&hba->pwr_info, &final_params, sizeof(final_params));
> +   }
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V3 09/16] scsi: ufs: introduce well known logical unit in ufs

2014-09-11 Thread Akinobu Mita
2014-09-10 20:54 GMT+09:00 Dolev Raviv :
> +static void ufshcd_set_queue_depth(struct scsi_device *sdev)
> +{
> +   int ret = 0;
> +   u8 lun_qdepth;
> +   struct ufs_hba *hba;
> +
> +   hba = shost_priv(sdev->host);
> +
> +   lun_qdepth = hba->nutrs;
> +   ret = ufshcd_read_unit_desc_param(hba,
> + ufshcd_scsi_to_upiu_lun(sdev->lun),
> + UNIT_DESC_PARAM_LU_Q_DEPTH,
> + &lun_qdepth,
> + sizeof(lun_qdepth));
> +
> +   /* Some WLUN doesn't support unit descriptor */
> +   if (ret == -EOPNOTSUPP)
> +   lun_qdepth = 1;
> +   else if (!lun_qdepth)
> +   /* eventually, we can figure out the real queue depth */
> +   lun_qdepth = hba->nutrs;
> +   else
> +   lun_qdepth = min_t(int, lun_qdepth, hba->nutrs);

If ufshcd_read_unit_desc_param() failed and its error code was not
-EOPNOTSUPP, lun_qdepth is undefined.  In such cases lun_qdepth
should be 1?
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/11] megaraid_sas : MFI MPT linked list corruption fix

2014-09-11 Thread Tomas Henzl
On 09/11/2014 04:48 AM, Kashyap Desai wrote:
> On Wed, Sep 10, 2014 at 8:36 PM, Tomas Henzl  wrote:
>> On 09/06/2014 03:25 PM, sumit.sax...@avagotech.com wrote:
>>> Problem statement:
>>> MFI link list in megaraid_sas driver is used from mfi-mpt pass-through 
>>> commands.
>>> This list can be corrupted due to many possible race conditions in driver 
>>> and
>>> eventually we may see kernel panic.
>>>
>>> One example -
>>> MFI frame is freed from calling process as driver send command via polling 
>>> method and interrupt
>>> for that command comes after driver free mfi frame (actually even after 
>>> some other context reuse
>>> the mfi frame). When driver receive MPT frame in ISR, driver will be using 
>>> the index of MFI and
>>> access that MFI frame and finally in-used MFI frame’s list will be 
>>> corrupted.
>>>
>>> High level description of new solution -
>>> Free MFI and MPT command from same context.
>>> Free both the command either from process (from where mfi-mpt pass-through 
>>> was called) or from
>>> ISR context. Do not split freeing of MFI and MPT, because it creates the 
>>> race condition which
>>> will do MFI/MPT list corruption.
>>>
>>> Renamed the cmd_pool_lock which is used in instance as well as fusion with 
>>> below name.
>>> mfi_pool_lock and mpt_pool_lock to add more code readability.
>>>
>>> Signed-off-by: Sumit Saxena 
>>> Signed-off-by: Kashyap Desai 
>>> ---
>>>  drivers/scsi/megaraid/megaraid_sas.h|  25 +++-
>>>  drivers/scsi/megaraid/megaraid_sas_base.c   | 196 
>>> 
>>>  drivers/scsi/megaraid/megaraid_sas_fusion.c |  95 ++
>>>  drivers/scsi/megaraid/megaraid_sas_fusion.h |   2 +-
>>>  4 files changed, 235 insertions(+), 83 deletions(-)
>>>
>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h 
>>> b/drivers/scsi/megaraid/megaraid_sas.h
>>> index 156d4b9..f99db18 100644
>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>>> @@ -1016,6 +1016,12 @@ struct megasas_ctrl_info {
>>>
>>>  #define VD_EXT_DEBUG 0
>>>
>>> +enum MR_MFI_MPT_PTHR_FLAGS {
>>> + MFI_MPT_DETACHED = 0,
>>> + MFI_LIST_ADDED = 1,
>>> + MFI_MPT_ATTACHED = 2,
>>> +};
>>> +
>>>  /* Frame Type */
>>>  #define IO_FRAME 0
>>>  #define PTHRU_FRAME  1
>>> @@ -1033,7 +1039,7 @@ struct megasas_ctrl_info {
>>>  #define MEGASAS_IOCTL_CMD0
>>>  #define MEGASAS_DEFAULT_CMD_TIMEOUT  90
>>>  #define MEGASAS_THROTTLE_QUEUE_DEPTH 16
>>> -
>>> +#define MEGASAS_BLOCKED_CMD_TIMEOUT  60
>>>  /*
>>>   * FW reports the maximum of number of commands that it can accept (maximum
>>>   * commands that can be outstanding) at any time. The driver must report a
>>> @@ -1652,7 +1658,7 @@ struct megasas_instance {
>>>   struct megasas_cmd **cmd_list;
>>>   struct list_head cmd_pool;
>>>   /* used to sync fire the cmd to fw */
>>> - spinlock_t cmd_pool_lock;
>>> + spinlock_t mfi_pool_lock;
>>>   /* used to sync fire the cmd to fw */
>>>   spinlock_t hba_lock;
>>>   /* used to synch producer, consumer ptrs in dpc */
>>> @@ -1839,6 +1845,11 @@ struct megasas_cmd {
>>>
>>>   struct list_head list;
>>>   struct scsi_cmnd *scmd;
>>> +
>>> + void *mpt_pthr_cmd_blocked;
>>> + atomic_t mfi_mpt_pthr;
>>> + u8 is_wait_event;
>>> +
>>>   struct megasas_instance *instance;
>>>   union {
>>>   struct {
>>> @@ -1927,4 +1938,14 @@ int megasas_set_crash_dump_params(struct 
>>> megasas_instance *instance,
>>>  void megasas_free_host_crash_buffer(struct megasas_instance *instance);
>>>  void megasas_fusion_crash_dump_wq(struct work_struct *work);
>>>
>>> +void megasas_return_cmd_fusion(struct megasas_instance *instance,
>>> + struct megasas_cmd_fusion *cmd);
>>> +int megasas_issue_blocked_cmd(struct megasas_instance *instance,
>>> + struct megasas_cmd *cmd, int timeout);
>>> +void __megasas_return_cmd(struct megasas_instance *instance,
>>> + struct megasas_cmd *cmd);
>>> +
>>> +void megasas_return_mfi_mpt_pthr(struct megasas_instance *instance,
>>> + struct megasas_cmd *cmd_mfi, struct megasas_cmd_fusion *cmd_fusion);
>>> +
>>>  #endif   /*LSI_MEGARAID_SAS_H */
>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c 
>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>>> index 086beee..50d69eb 100644
>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>>> @@ -210,43 +210,66 @@ struct megasas_cmd *megasas_get_cmd(struct 
>>> megasas_instance
>>>   unsigned long flags;
>>>   struct megasas_cmd *cmd = NULL;
>>>
>>> - spin_lock_irqsave(&instance->cmd_pool_lock, flags);
>>> + spin_lock_irqsave(&instance->mfi_pool_lock, flags);
>>>
>>>   if (!list_empty(&instance->cmd_pool)) {
>>>   cmd = list_entry((&instance->cmd_pool)->next,
>>>struct megas

Re: [PATCH 04/11] megaraid_sas : Firmware crash dump feature support

2014-09-11 Thread Tomas Henzl
On 09/11/2014 11:02 AM, Kashyap Desai wrote:
>> -Original Message-
>> From: Vivek Goyal [mailto:vgo...@redhat.com]
>> Sent: Wednesday, September 10, 2014 9:17 PM
>> To: Tomas Henzl
>> Cc: Elliott, Robert (Server Storage); Sumit Saxena;
> linux-scsi@vger.kernel.org;
>> martin.peter...@oracle.com; h...@infradead.org;
>> jbottom...@parallels.com; Kashyap Desai; aradf...@gmail.com; Michal
>> Schmidt; am...@mellanox.com; Jens Axboe 
>> (ax...@kernel.dk); scame...@beardog.cce.hp.com
>> Subject: Re: [PATCH 04/11] megaraid_sas : Firmware crash dump feature
>> support
>>
>> On Wed, Sep 10, 2014 at 05:28:40PM +0200, Tomas Henzl wrote:
>>> On 09/10/2014 05:06 PM, Elliott, Robert (Server Storage) wrote:
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Sumit Saxena
>
>> From: Tomas Henzl [mailto:the...@redhat.com]
>>
>> With several controllers in a system this may take a lot memory,
>> could you also in case when a kdump kernel is running lower it, by
>> not using this feature?
>>
> Agreed, we will disable this feature for kdump kernel by adding
> "reset_devices" global varaiable.
> That check is required for only one place, throughout the code,
> this feature will remain disabled.  Code snippet for the same-
>
> instance->crash_dump_drv_support = (!reset_devices) &&
> crashdump_enable &&
> instance->crash_dump_fw_support &&
> instance->crash_dump_buf);
> if(instance->crash_dump_drv_support) {
> printk(KERN_INFO "megaraid_sas: FW Crash dump is
> supported\n");
> megasas_set_crash_dump_params(instance,
> MR_CRASH_BUF_TURN_OFF);
>
> } else {
> ..
> }
 Network drivers have been running into similar problems.

 There's a new patch from Amir coming through net-next to make
 is_kdump_kernel() (in crash_dump.h) accessible to modules.
 That may be a better signal than reset_devices that the driver
 should use minimal resources.

 http://comments.gmane.org/gmane.linux.network/324737

 I'm not sure about the logistics of a SCSI patch depending on a
 net-next patch.
>>> Probably better to start with reset_devices and switch to
>>> is_kdump_kernel() later.
>>> This is not a discussion about reset_devices versus is_kdump_kernel,
>>> but while it looks good to have it distinguished - is the
>>> reset_devices actually used anywhere else than in kdump kernel?
>> I think usage of reset_devices for lowering memory footprint of driver
> is
>> plain wrong. It tells driver to only reset the device as BIOS might not
> have
>> done it right or we skipped BIOS completely.
>>
>> Using is_kdump_kernel() is also not perfect either but atleast better
> than
>> reset_devices.
> We will use is_kdump_kernel() not to enable this feature in Kdump kernel.

OK, just keep in mind that the is_kdump_kernel() is not yet in mainline.

>
> MegaRaid Driver will not allocate Host memory (which is used to collect
> complete FW crash image) until and unless associated FW is crashed/IO
> timeout.

This is new for the driver? A previous reply stated that the driver will
allocate 1MB for each MR controller at the 'start of the day'.
If I misunderstood it somehow then nothing is needed.

> Driver do allocation only if required. Also, it will break the function
> immediately after memory allocation is failed and operate on available
> memory.
>
> To be on safer side, we can disable this feature in Kdump mode.
>
> ~ Kashyap
>> Thanks
>> Vivek
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] uas: Add missing le16_to_cpu calls to asm1051 / asm1053 usb-id check

2014-09-11 Thread Bjørn Mork
Hans de Goede  writes:

> --- a/drivers/usb/storage/uas-detect.h
> +++ b/drivers/usb/storage/uas-detect.h
> @@ -73,8 +73,8 @@ static int uas_use_uas_driver(struct usb_interface *intf,
>* broken on the ASM1051, use the number of streams to differentiate.
>* New ASM1053-s also support 32 streams, but have a different prod-id.
>*/
> - if (udev->descriptor.idVendor == 0x174c &&
> - udev->descriptor.idProduct == 0x55aa) {
> + if (le16_to_cpu(udev->descriptor.idVendor) == 0x174c &&
> + le16_to_cpu(udev->descriptor.idProduct) == 0x55aa) {
>   if (udev->speed < USB_SPEED_SUPER) {
>   /* No streams info, assume ASM1051 */
>   flags |= US_FL_IGNORE_UAS;


Not that it matters much in this case, but I believe converting the
constants is preferred so that this can be resolved at build time
instead of runtime.

I.e.
if (udev->descriptor.idVendor == cpu_to_le16(0x174c) &&
etc



Bjørn
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [REGRESSION 3.17] scsi (uas) disks no longer using tagged command queuing

2014-09-11 Thread Hans de Goede
Hi,

On 09/10/2014 05:45 PM, Christoph Hellwig wrote:
> On Wed, Sep 10, 2014 at 09:21:24AM +0200, Hans de Goede wrote:
>> I've applied the patch, this results in the following new dmesg output
>> when using uas:
>>
>> [  120.602632] initialized host-wide tag map!
>>
>> Thank you for looking into this.
> 
> So we're initializing the tag map, but scsi_activate_tcq doesn't pick it
> up.  I can't really come up with a good explanation for it, but there
> even without that there is an elephant in the room:  as part of the
> scsi-mq series I moved the bqt field used for this into a union with the
> new blk_mq_tag_set.  Below is a patch to get rid of that union, can you
> try if that fixes it?

Unfortunately that does not fix it :|

Regards,

Hans

> 
> 
> diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
> index d0f69a3..bc2 100644
> --- a/include/scsi/scsi_host.h
> +++ b/include/scsi/scsi_host.h
> @@ -584,10 +584,8 @@ struct Scsi_Host {
>* Area to keep a shared tag map (if needed, will be
>* NULL if not).
>*/
> - union {
> - struct blk_queue_tag*bqt;
> - struct blk_mq_tag_set   tag_set;
> - };
> + struct blk_queue_tag*bqt;
> + struct blk_mq_tag_set   tag_set;
>  
>   atomic_t host_busy;/* commands actually active on 
> low-level */
>   atomic_t host_blocked;
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] uas: Add missing le16_to_cpu calls to asm1051 / asm1053 usb-id check

2014-09-11 Thread Hans de Goede
Cc: sta...@vger.kernel.org # 3.16
Signed-off-by: Hans de Goede 
---
 drivers/usb/storage/uas-detect.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/storage/uas-detect.h b/drivers/usb/storage/uas-detect.h
index 1e298ec..8a6f371 100644
--- a/drivers/usb/storage/uas-detect.h
+++ b/drivers/usb/storage/uas-detect.h
@@ -73,8 +73,8 @@ static int uas_use_uas_driver(struct usb_interface *intf,
 * broken on the ASM1051, use the number of streams to differentiate.
 * New ASM1053-s also support 32 streams, but have a different prod-id.
 */
-   if (udev->descriptor.idVendor == 0x174c &&
-   udev->descriptor.idProduct == 0x55aa) {
+   if (le16_to_cpu(udev->descriptor.idVendor) == 0x174c &&
+   le16_to_cpu(udev->descriptor.idProduct) == 0x55aa) {
if (udev->speed < USB_SPEED_SUPER) {
/* No streams info, assume ASM1051 */
flags |= US_FL_IGNORE_UAS;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 04/11] megaraid_sas : Firmware crash dump feature support

2014-09-11 Thread Kashyap Desai
> -Original Message-
> From: Vivek Goyal [mailto:vgo...@redhat.com]
> Sent: Wednesday, September 10, 2014 9:17 PM
> To: Tomas Henzl
> Cc: Elliott, Robert (Server Storage); Sumit Saxena;
linux-scsi@vger.kernel.org;
> martin.peter...@oracle.com; h...@infradead.org;
> jbottom...@parallels.com; Kashyap Desai; aradf...@gmail.com; Michal
> Schmidt; am...@mellanox.com; Jens Axboe 
> (ax...@kernel.dk); scame...@beardog.cce.hp.com
> Subject: Re: [PATCH 04/11] megaraid_sas : Firmware crash dump feature
> support
>
> On Wed, Sep 10, 2014 at 05:28:40PM +0200, Tomas Henzl wrote:
> > On 09/10/2014 05:06 PM, Elliott, Robert (Server Storage) wrote:
> > >> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> > >> ow...@vger.kernel.org] On Behalf Of Sumit Saxena
> > >>
> > >>> From: Tomas Henzl [mailto:the...@redhat.com]
> > >>>
> > >>> With several controllers in a system this may take a lot memory,
> > >>> could you also in case when a kdump kernel is running lower it, by
> > >>> not using this feature?
> > >>>
> > >> Agreed, we will disable this feature for kdump kernel by adding
> > >> "reset_devices" global varaiable.
> > >> That check is required for only one place, throughout the code,
> > >> this feature will remain disabled.  Code snippet for the same-
> > >>
> > >> instance->crash_dump_drv_support = (!reset_devices) &&
> > >> crashdump_enable &&
> > >> instance->crash_dump_fw_support &&
> > >> instance->crash_dump_buf);
> > >> if(instance->crash_dump_drv_support) {
> > >> printk(KERN_INFO "megaraid_sas: FW Crash dump is
> > >> supported\n");
> > >> megasas_set_crash_dump_params(instance,
> > >> MR_CRASH_BUF_TURN_OFF);
> > >>
> > >> } else {
> > >> ..
> > >> }
> > > Network drivers have been running into similar problems.
> > >
> > > There's a new patch from Amir coming through net-next to make
> > > is_kdump_kernel() (in crash_dump.h) accessible to modules.
> > > That may be a better signal than reset_devices that the driver
> > > should use minimal resources.
> > >
> > > http://comments.gmane.org/gmane.linux.network/324737
> > >
> > > I'm not sure about the logistics of a SCSI patch depending on a
> > > net-next patch.
> >
> > Probably better to start with reset_devices and switch to
> > is_kdump_kernel() later.
> > This is not a discussion about reset_devices versus is_kdump_kernel,
> > but while it looks good to have it distinguished - is the
> > reset_devices actually used anywhere else than in kdump kernel?
>
> I think usage of reset_devices for lowering memory footprint of driver
is
> plain wrong. It tells driver to only reset the device as BIOS might not
have
> done it right or we skipped BIOS completely.
>
> Using is_kdump_kernel() is also not perfect either but atleast better
than
> reset_devices.

We will use is_kdump_kernel() not to enable this feature in Kdump kernel.

MegaRaid Driver will not allocate Host memory (which is used to collect
complete FW crash image) until and unless associated FW is crashed/IO
timeout.
Driver do allocation only if required. Also, it will break the function
immediately after memory allocation is failed and operate on available
memory.

To be on safer side, we can disable this feature in Kdump mode.

~ Kashyap
>
> Thanks
> Vivek
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Updated linux uas driver, please test

2014-09-11 Thread Hans de Goede
Hi,

On 09/10/2014 10:54 PM, Douglas Gilbert wrote:
> On 14-09-10 08:13 AM, Hans de Goede wrote:
>> Hi All,
>>
>> I'm mailing all of you because you've reported various problems
>> with the new uas support in kernel 3.16 and later.
>>
>> I've been working on making the uas driver more resilient to
>> errors, as well as improved logging so we can easier figure
>> out the cause of errors.
>>
>> I would like to ask you all to test a standalone version of
>> the new uas driver with the devices you've been having
>> trouble with before, and report the results to me.
>>
>> Testing instructions:
>>
>> 1) Remove any usb-storage.quirks= setting from the kernel commandline,
>> and remove any /etc/modprobe.conf* files doing the same, boot your
>> machine without the uas device attached.
>>
>> 2) Make sure your machine is set up for building kernel modules,
>> usually this means installing kernel-devel and gcc packages, see
>> your distributions documentation for more info
>>
>> 3) Download all files from here:
>> https://fedorapeople.org/~jwrdegoede/uas/
>> And put them all in a single directory, named e.g. uas
>>
>> 4) Start a terminal, cd into the uas directory
>>
>> 5) Run the following commands:
>> make
>> sudo rmmod uas
>> sudo insmod ./uas.ko
>>
>> 6) Connect your uas device
>>
>> 7) Wait for the disk to show up (wait circa 1 minute max), then do:
>>
>> dmesg > dmesg.log
>> lsusb -v > lsusb.log
>>
>> 8) Test the uas disk
>>
>> Once done please send me a mail, in this mail please
>>
>> 1) Describe how the disk worked, did it show up in a reasonable time,
>> and did it work?
>>
>> 2) Attach dmesg.log and lsusb.log
> 
> Could you give some sort of indication of the dd throughput
> time (on READs) with UAS given a recent SATA SSD that can
> source data faster than 300 MB/sec (say)?

I've done tests with 3 dd running simultaneously, reading data
from 3 different parts of the disk, and I'm getting about 110 MB/s
for each dd, so about 330 MB / s in total.

This is with a Crucial M500 ssd, with both asm1053 and nec chipsets.

> My ASM1051 based dock under W7 got an underwhelming maximum
> of 42 MB/sec with such a SATA SSD. And I saw enough stupid
> meta-data from SCSI commands to suggest that product should
> be binned.

42 MB/s sounds like you're running over USB-2 not USB-3 (although for
USB-2 it is very good, but uas should be able to get pretty close
to saturating the bus with USB-2).

Regards,

Hans
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 14/14] sd: Honor block layer integrity handling flags

2014-09-11 Thread Sagi Grimberg

On 9/11/2014 3:07 AM, Martin K. Petersen wrote:

"Sagi" == Sagi Grimberg  writes:


[Back from vacation]


Hey Martin,




+ [SCSI_PROT_WRITE_PASS] = SCSI_PROT_TRANSFER_PI |
+ SCSI_PROT_GUARD_CHECK |
+ SCSI_PROT_REF_CHECK |
+ SCSI_PROT_REF_INCREMENT |
+ SCSI_PROT_IP_CHECKSUM,


Sagi> A bit strange to me that you put REF_CHECK & REF_INCREMENT flag
Sagi> depending on the prot_op while it really depends on the prot_type.

It doesn't just depend on the prot_type. You need to be able to tell the
HBA to increment/test the ref tag even for a Type 0 device.



That's still a dependence on prot_type (type 0...). Notice that you
set SCSI_PROT_REF_INCREMENT in every op index (except SCSI_PROT_NORMAL)
so my point is that it's strange to see this association.

P.S.
Now drivers can stop understanding prot_type to set DIF operations...
so once drivers stop referencing it we can remove it from scsi_cmnd.

Sagi.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html