[Bug 199435] HPSA + P420i resetting logical Direct-Access never complete

2018-04-22 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199435

--- Comment #13 from lober...@redhat.com ---
Apr 18 01:29:16 kernel: cmaidad D0  3442  1 0x
Apr 18 01:29:16 kernel: Call Trace:
Apr 18 01:29:16 kernel:  __schedule+0x3b9/0x8f0
Apr 18 01:29:16 kernel:  schedule+0x36/0x80
Apr 18 01:29:16 kernel:  scsi_block_when_processing_errors+0xd5/0x110
Apr 18 01:29:16 kernel:  ? wake_atomic_t_function+0x60/0x60
Apr 18 01:29:16 kernel:  sg_open+0x14a/0x5c0

 * Likely a pass-through from the cma* management daemons

Can you try to reproduce with all the HP Health daemons disabled?

-- 
You are receiving this mail because:
You are the assignee for the bug.


RE: [Patch v2] Storvsc: Select channel based on available percentage of ring buffer to write

2018-04-22 Thread Michael Kelley (EOSG)
> -----Original Message-----
> From: linux-kernel-ow...@vger.kernel.org On Behalf Of Long Li
> Sent: Thursday, April 19, 2018 2:54 PM
> To: KY Srinivasan; Haiyang Zhang; Stephen Hemminger;
>     James E . J . Bottomley; Martin K . Petersen;
>     de...@linuxdriverproject.org; linux-s...@vger.kernel.org;
>     linux-ker...@vger.kernel.org
> Cc: Long Li
> Subject: [Patch v2] Storvsc: Select channel based on available percentage of ring buffer to write
> 
> From: Long Li 
> 
> This is a best effort for estimating on how busy the ring buffer is for
> that channel, based on available buffer to write in percentage. It is still
> possible that at the time of actual ring buffer write, the space may not be
> available due to other processes may be writing at the time.
> 
> Selecting a channel based on how full it is can reduce the possibility that
> a ring buffer write will fail, and avoid the situation a channel is over
> busy.
> 
> Now it's possible that storvsc can use a smaller ring buffer size
> (e.g. 40k bytes) to take advantage of cache locality.
> 
> Changes.
> v2: Pre-allocate struct cpumask on the heap.
> Struct cpumask is a big structure (1k bytes) when CONFIG_NR_CPUS=8192 (default
> value when CONFIG_MAXSMP=y). Don't use kernel stack for it by pre-allocating
> them using kmalloc when channels are first initialized.
> 
> Signed-off-by: Long Li 
> ---
>  drivers/scsi/storvsc_drv.c | 90 
> --
>  1 file changed, 72 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> index a2ec0bc9e9fa..2a9fff94dd1a 100644
> --- a/drivers/scsi/storvsc_drv.c
> +++ b/drivers/scsi/storvsc_drv.c
> @@ -395,6 +395,12 @@ MODULE_PARM_DESC(storvsc_ringbuffer_size, "Ring buffer size (bytes)");
> 
>  module_param(storvsc_vcpus_per_sub_channel, int, S_IRUGO);
>  MODULE_PARM_DESC(storvsc_vcpus_per_sub_channel, "Ratio of VCPUs to subchannels");
> +
> +static int ring_avail_percent_lowater = 10;
> +module_param(ring_avail_percent_lowater, int, S_IRUGO);
> +MODULE_PARM_DESC(ring_avail_percent_lowater,
> + "Select a channel if available ring size > this in percent");
> +
>  /*
>   * Timeout in seconds for all devices managed by this driver.
>   */
> @@ -468,6 +474,13 @@ struct storvsc_device {
>* Mask of CPUs bound to subchannels.
>*/
>   struct cpumask alloced_cpus;
> + /*
> +  * Pre-allocated struct cpumask for each hardware queue.
> +  * struct cpumask is used by selecting out-going channels. It is a
> +  * big structure, default to 1024k bytes when CONFIG_MAXSMP=y.

I think you mean "1024 bytes" or "1k bytes" in the above comment.

> +  * Pre-allocate it to avoid allocation on the kernel stack.
> +  */
> + struct cpumask *cpumask_chns;
>   /* Used for vsc/vsp channel reset process */
>   struct storvsc_cmd_request init_request;
>   struct storvsc_cmd_request reset_request;
> @@ -872,6 +885,13 @@ static int storvsc_channel_init(struct hv_device *device, bool is_fc)
>   if (stor_device->stor_chns == NULL)
>   return -ENOMEM;
> 
> + stor_device->cpumask_chns = kcalloc(num_possible_cpus(),
> + sizeof(struct cpumask), GFP_KERNEL);

Note that num_possible_cpus() is 240 for a Hyper-V 2016 guest unless
overridden on the kernel boot line, so this is going to allocate 240 Kbytes
for each synthetic SCSI controller.  On an Azure VM, which has two IDE and
two SCSI controllers, this is nearly 1 Mbyte.  It's unfortunate to have to
allocate this much memory for what is essentially a temporary variable.
Further down in these comments, I've proposed an alternate implementation of
the code that avoids the need for the temporary variable, and hence avoids
the need for this allocation.
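
The figures above can be checked with a small user-space sketch. It is not
kernel code: the helper names are hypothetical, the constants are simply the
values quoted in this thread, and the mask size is approximated as one bit per
possible CPU rounded up to whole bytes (the kernel rounds to whole longs, which
gives the same result for 8192 CPUs).

```c
#include <stddef.h>

/* Assumed values, taken from the numbers quoted in this thread. */
#define NR_CPUS_MAXSMP     8192u  /* CONFIG_NR_CPUS when CONFIG_MAXSMP=y */
#define POSSIBLE_CPUS_HV   240u   /* num_possible_cpus() on a Hyper-V 2016 guest */
#define CONTROLLERS_AZURE  4u     /* two IDE + two SCSI controllers */

/* Approximate sizeof(struct cpumask): one bit per CPU, rounded up to bytes. */
static size_t cpumask_bytes(unsigned int nr_cpus)
{
	return (nr_cpus + 7u) / 8u;
}

/* Size of one kcalloc(num_possible_cpus(), sizeof(struct cpumask)) request. */
static size_t per_controller_bytes(unsigned int nr_cpus,
				   unsigned int possible_cpus)
{
	return (size_t)possible_cpus * cpumask_bytes(nr_cpus);
}

/* Total across all storage controllers in the guest. */
static size_t total_bytes(unsigned int nr_cpus, unsigned int possible_cpus,
			  unsigned int controllers)
{
	return (size_t)controllers * per_controller_bytes(nr_cpus, possible_cpus);
}
```

With these inputs, each mask is 1024 bytes, each controller needs
240 * 1024 = 245760 bytes (240 Kbytes), and four controllers need 983040 bytes
(960 Kbytes), which matches the "nearly 1 Mbyte" estimate.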

> + if (stor_device->cpumask_chns == NULL) {
> + kfree(stor_device->stor_chns);
> + return -ENOMEM;
> + }
> +
>   stor_device->stor_chns[device->channel->target_cpu] = device->channel;
>   cpumask_set_cpu(device->channel->target_cpu,
>   &stor_device->alloced_cpus);
> @@ -1232,6 +1252,7 @@ static int storvsc_dev_remove(struct hv_device *device)
>   vmbus_close(device->channel);
> 
>   kfree(stor_device->stor_chns);
> + kfree(stor_device->cpumask_chns);
>   kfree(stor_device);
>   return 0;
>  }
> @@ -1241,7 +1262,7 @@ static struct vmbus_channel *get_og_chn(struct storvsc_device *stor_device,
>  {
>   u16 slot = 0;
>   u16 hash_qnum;
> - struct cpumask alloced_mask;
> + struct cpumask *alloced_mask = &stor_device->cpumask_chns[q_num];
>   int num_channels, tgt_cpu;
> 
>   if (stor_device->num_sc == 0)
> @@ -1257,10 

[Bug 199435] HPSA + P420i resetting logical Direct-Access never complete

2018-04-22 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199435

lober...@redhat.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lober...@redhat.com

--- Comment #12 from lober...@redhat.com ---
We had a bunch of issues with the HPSA as already mentioned above.
The specific issue that we had to revert was this commit
8b834bff1b73dce46f4e9f5e84af6f73fed8b0ef


I assume your array has a charged battery (capacitor) and the writeback cache
is enabled on the P420i.

Are you only seeing this when you have cmaeventd running? That daemon can use
pass-through commands and has been known to cause issues.
I am not running any of the HPE ProLiant SPP daemons on my system.

I have not seen this load-related issue (without those daemons running) on my
DL380G7 or DL380G8 here, so I will work on trying to reproduce it and assist.

Thanks
Laurence



[Bug 199435] HPSA + P420i resetting logical Direct-Access never complete

2018-04-22 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199435

--- Comment #11 from Anthony Hausman (anthonyhaussm...@gmail.com) ---
The only patch that I'm sure I have is the "scsi: hpsa: fix selection of
reply queue" one.
For the rest, I'm using an out-of-the-box 4.11 kernel, so I'm really not sure
that the other patches are present.


Unfortunately, the module does not compile using 4.11.0-14-generic headers.

# make -C /lib/modules/4.11.0-14-generic/build M=$(pwd)
--makefile="/root/hpsa-3.4.20-136/hpsa-3.4.20/drivers/scsi/Makefile.alt"
make: Entering directory '/usr/src/linux-headers-4.11.0-14-generic'
make -C /lib/modules/4.4.0-96-generic/build
M=/usr/src/linux-headers-4.11.0-14-generic EXTRA_CFLAGS+=-DKCLASS4A modules
make[1]: Entering directory '/usr/src/linux-headers-4.4.0-96-generic'
make[2]: *** No rule to make target 'kernel/bounds.c', needed by
'kernel/bounds.s'.  Stop.
Makefile:1423: recipe for target
'_module_/usr/src/linux-headers-4.11.0-14-generic' failed
make[1]: *** [_module_/usr/src/linux-headers-4.11.0-14-generic] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-4.4.0-96-generic'
/root/hpsa-3.4.20-136/hpsa-3.4.20/drivers/scsi/Makefile.alt:96: recipe for
target 'default' failed
make: *** [default] Error 2
make: Leaving directory '/usr/src/linux-headers-4.11.0-14-generic'

But if you tell me the main problem is the 4.11 kernel, I can upgrade to the
4.16.3 kernel.

If I do, must I use the out-of-box 3.4.20-136 hpsa driver, or your previous
patch on top of the last 3.4.20-125?



Re: [PATCH 15/39] acpi/battery: simplify procfs code

2018-04-22 Thread Rafael J. Wysocki
On Thu, Apr 19, 2018 at 2:41 PM, Christoph Hellwig  wrote:
> Use remove_proc_subtree to remove the whole subtree on cleanup, and
> unwind the registration loop into individual calls.  Switch to use
> proc_create_seq where applicable.
>
> Signed-off-by: Christoph Hellwig 

It is OK AFAICS.

Reviewed-by: Rafael J. Wysocki 


Re: [PATCH] bsg referencing bus driver module

2018-04-22 Thread James Bottomley
On Fri, 2018-04-20 at 16:44 -0600, Anatoliy Glagolev wrote:
>  
> > This patch isn't applyable because your mailer has changed all the
> > tabs to spaces.
> > 
> > I also think there's no need to do it this way.  I think what we
> > need is for fc_bsg_remove() to wait until the bsg queue is
> > drained.  It does look like the author thought this happened
> > otherwise the code wouldn't have the note.  If we fix it that way
> > we can do the same thing in all the other transport classes that
> > use bsg (which all have a similar issue).
> > 
> > James
> > 
> 
> Thanks, James. Sorry about the tabs; re-sending.
> 
> On fc_bsg_remove()...: are you suggesting to implement the whole fix
> in scsi_transport_fc.c?

Yes, but it's not just scsi_transport_fc, scsi_transport_sas has the
same issue.  I think it's probably just the one liner addition of
blk_drain_queue() that fixes this.  There should probably be a block
primitive that does the correct queue reference dance and calls
blk_cleanup_queue() and blk_drain_queue() in order.

>  That would be nice, but I do not see how that
> is possible. Even with the queue drained bsg still holds a reference
> to the Scsi_Host via bsg_class_device; bsg_class_device itself is
> referenced on bsg_open and kept around while a user-mode process
> keeps a handle to bsg.

Once you've called bsg_unregister_queue(), the queue will be destroyed
and the reference released once the last job is drained, meaning the
user can keep the bsg device open, but it will just return errors
because of the lack of queue.  This scenario allows removal to proceed
without being held hostage by open devices.

> Even if we somehow implement the waiting the call may be stuck
> forever if the user-mode process keeps the handle.

No it won't: after blk_cleanup_queue(), the queue is in bypass mode: no
requests queued after this do anything other than complete with error,
so they never make it into SCSI.
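
The behaviour described above can be pictured with a toy user-space model. This
is only a sketch under stated assumptions: struct toy_queue and the toy_*
helpers are hypothetical and are not the block layer's data structures or API,
the model is single-threaded, and -ENODEV is an arbitrary choice of error code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy queue; NOT the kernel's struct request_queue. */
struct toy_queue {
	bool dead;      /* set once cleanup has run */
	int in_flight;  /* requests accepted but not yet completed */
};

/* Submit a request: accepted while alive, failed at once after cleanup. */
static int toy_submit(struct toy_queue *q)
{
	if (q->dead)
		return -ENODEV; /* completes with error, never reaches the driver */
	q->in_flight++;
	return 0;
}

/* Complete one outstanding request, if there is one. */
static void toy_complete(struct toy_queue *q)
{
	if (q->in_flight > 0)
		q->in_flight--;
}

/*
 * Cleanup: stop accepting new work first, then drain what was already
 * accepted. In this single-threaded model, draining simply completes the
 * outstanding requests in a loop.
 */
static void toy_cleanup(struct toy_queue *q)
{
	q->dead = true;
	while (q->in_flight > 0)
		toy_complete(q);
}
```

In this model a caller that still holds a handle can keep calling toy_submit()
after toy_cleanup(), but every call fails immediately and in_flight stays at
zero, so the wait in the cleanup path is bounded by the work accepted before
cleanup started.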

> I think handling it via a reference to the module is more consistent
> with the way things are done in Linux. You suggested the approach
> yourself back in the "Waiting for scsi_host_template release" discussion.

That was before I analyzed the code paths.  Module release is tricky,
because the module exit won't be called until the references drop to
zero, so you have to be careful not to create a situation where module
exit never gets called.  To make up for the reference circularity
problem, module exit code should force stuff to detach and wait for the
forcing to complete.  If you do it purely by refcounting, the module
actually may never release (that's why scsi_remove_host works the way it
does, for instance).

James