Re: [RFC PATCH] mpt3sas: mpt3sas_scsih_enclosure_find_by_handle can be static

2018-04-10 Thread Jaco Kroon
Hi Martin, Bart,

I've not seen additional feedback on this (I may simply not be CCed).

I've applied the patch to one of our hosts where we've had endless IO
lockups (with MQ enabled the host died within a day or two, sometimes in
under an hour; without MQ it typically ran for about two weeks).  With
this patch (on top of 4.16) we're now at four days and 17 hours, with IO
still going strong (including an mdadm reshape to add a disk, as well as
a rebuild on a drive that failed - concurrently, on two different arrays
on the same controller).  Very subjective, but the host also feels more
responsive under heavy IO load.

What can I do from my side (I've got some development experience) to
help push this patch forward?

Kind Regards,
Jaco


On 28/03/2018 23:54, Martin K. Petersen wrote:
> Bart,
>
>> Are you aware that if the 0-day test infrastructure suggests an improvement
>> for a patch that the patch that that improvement applies to gets ignored
>> unless either the patch is reposted with the improvement applied or that it
>> is explained why the suggested improvement is inappropriate?
> Correct. I don't apply anything that causes a 0-day warning. The patch
> will be closed with "Changes Required" status in patchwork.
>
> Always build patch submissions to linux-scsi with:
>
>   make C=1 CF="-D__CHECK_ENDIAN__"
>



Re: [RFC PATCH] mpt3sas: mpt3sas_scsih_enclosure_find_by_handle can be static

2018-04-05 Thread Jaco Kroon
Hi,

Further to that, in the second-to-last hunk there is a very clear
functionality change:

@@ -8756,12 +8859,12 @@ _scsih_mark_responding_expander(struct MPT3SAS_ADAPTER *ioc,
 			continue;
 		sas_expander->responding = 1;
 
-		if (!encl_pg0_rc)
+		if (enclosure_dev) {
 			sas_expander->enclosure_logical_id =
-			    le64_to_cpu(enclosure_pg0.EnclosureLogicalID);
-
-		sas_expander->enclosure_handle =
-		    le16_to_cpu(expander_pg0->EnclosureHandle);
+			    le64_to_cpu(enclosure_dev->pg0.EnclosureLogicalID);
+			sas_expander->enclosure_handle =
+			    le16_to_cpu(expander_pg0->EnclosureHandle);
+		}
 
 		if (sas_expander->handle == handle)
 			goto out;

Note that the assignment to sas_expander->enclosure_handle is now
dependent on enclosure_dev being non-NULL.
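
To spell out the behavioural difference, here is the same hunk restated
as plain C (simplified from the diff above; the surrounding loop and
exact indentation are omitted):

	/* Before: enclosure_handle was assigned unconditionally, even when
	 * the enclosure page read had failed (encl_pg0_rc non-zero). */
	if (!encl_pg0_rc)
		sas_expander->enclosure_logical_id =
		    le64_to_cpu(enclosure_pg0.EnclosureLogicalID);

	sas_expander->enclosure_handle =
	    le16_to_cpu(expander_pg0->EnclosureHandle);

	/* After: both assignments happen only when the cached enclosure_dev
	 * was found, so enclosure_handle is no longer set when the lookup
	 * fails. */
	if (enclosure_dev) {
		sas_expander->enclosure_logical_id =
		    le64_to_cpu(enclosure_dev->pg0.EnclosureLogicalID);
		sas_expander->enclosure_handle =
		    le16_to_cpu(expander_pg0->EnclosureHandle);
	}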

I'm busy applying the patch to 4.16, and now I have no idea whether that
functionality change should be part of the change or not.  Having worked
through the rest of the patch, it seems good otherwise (keeping in mind
that I'm not familiar with the code in question, nor do I normally work
on kernel code, and this is definitely the first time I've taken a peek
anywhere near the IO subsystem).

Kind Regards,
Jaco

On 28/03/2018 23:54, Martin K. Petersen wrote:
> Bart,
>
>> Are you aware that if the 0-day test infrastructure suggests an improvement
>> for a patch that the patch that that improvement applies to gets ignored
>> unless either the patch is reposted with the improvement applied or that it
>> is explained why the suggested improvement is inappropriate?
> Correct. I don't apply anything that causes a 0-day warning. The patch
> will be closed with "Changes Required" status in patchwork.
>
> Always build patch submissions to linux-scsi with:
>
>   make C=1 CF="-D__CHECK_ENDIAN__"
>



Re: mpt3sas: sleeping function called from invalid context

2018-03-19 Thread Jaco Kroon
Hi All,

On 14/03/2018 03:29, Bart Van Assche wrote:
> (+Jaco)
Bart, thanks for adding me.
>
> On Tue, 2018-03-13 at 16:18 +0530, Suganath Prabu Subramani wrote:
>> We have root-caused the issue and it is the same as you mentioned.
>> "_scsih_get_enclosure_logicalid_chassis_slot()" is called with interrupts
>> disabled and this function
>> "_scsih_get_enclosure_logicalid_chassis_slot" again calls
>> _config_request(), with mutex_lock().
>>
>> We have a patch ready along with a few other changes, and we'll be
>> posting it by tomorrow after covering BST.
Has there been any progress?  We're currently seeing our server go down
again, and we'd like to eliminate this as the cause.  IO is still
flowing, but some of it has started to deadlock.
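
For anyone not familiar with this class of bug: the report above boils
down to taking a mutex (which may sleep) while interrupts are disabled.
A minimal, generic sketch of that pattern follows - the names are made
up for illustration and are not the actual mpt3sas functions:

	#include <linux/spinlock.h>
	#include <linux/mutex.h>

	static DEFINE_SPINLOCK(ioc_lock);
	static DEFINE_MUTEX(config_mutex);

	/* Stand-in for a config page request that serialises via a mutex. */
	static void config_request_like(void)
	{
		mutex_lock(&config_mutex);	/* may sleep */
		/* ... issue the request and wait for completion ... */
		mutex_unlock(&config_mutex);
	}

	static void caller_in_atomic_context(void)
	{
		unsigned long flags;

		spin_lock_irqsave(&ioc_lock, flags);	/* interrupts disabled */
		/*
		 * Calling into code that takes a mutex here triggers
		 * "BUG: sleeping function called from invalid context".
		 */
		config_request_like();
		spin_unlock_irqrestore(&ioc_lock, flags);
	}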

Kind Regards,
Jaco