Re: 答复: Re: [PATCH v2] scsi: Introduce sdev_printk_ratelimited to throttlefrequent printk

2018-04-06 Thread Petr Mladek
On Fri 2018-04-06 10:30:16, Petr Mladek wrote:
> On Tue 2018-04-03 14:19:43, wen.yan...@zte.com.cn wrote:
> > On the other hand,queue_lock is big, looping doing something under spinlock 
> > 
> > may locked many things and taking a long time, may cause some problems.
> > 
> > So This code needs to be optimized later:
> > 
> > scsi_request_fn()
> > {
> > for (;;) {
> > int rtn;
> > /*
> >  * get next queueable request.  We do this early to make sure
> >  * that the request is fully prepared even if we cannot
> >  * accept it.
> >  */
> > 
> > req = blk_peek_request(q);
> > 
> > if (!req)
> > break;
> > 
> > if (unlikely(!scsi_device_online(sdev))) {
> > sdev_printk(KERN_ERR, sdev,
> > "rejected I/O to offline device\n");
> > scsi_kill_request(req, q);
> > continue;
> > 
> > ^ still under spinlock
> > }
> 
> I wonder if the following might be the best solution after all:
> 
>   if (unlikely(!scsi_device_online(sdev))) {
>   scsi_kill_request(req, q);
> 
>   /*
>* printk() might take a while on slow consoles.
>* Prevent solftlockups by releasing the lock.
>*/
>   spin_unlock_irq(q->queue_lock);
>   sdev_printk(KERN_ERR, sdev,
>   "rejecting I/O to offline device\n");
>   spin_lock_irq(q->queue_lock);
>   continue;
>   }
> 
> I see that the lock is released also in several other situations.
> Therefore it looks safe. Also handling too many requests without
> releasing the lock seems to be a bad idea in general. I think
> that this solution was already suggested earlier.

Just to be sure. Is it safe to kill first few requests and proceed
the others?

I wonder if the device could actually get online without releasing
the queue lock. If not, we normally killed all requests.

I wonder if a local flag might actually help to reduce the number
of messages but keep the existing behavior. I mean something like

static void scsi_request_fn(struct request_queue *q)
{
struct scsi_device *sdev = q->queuedata;
   ^
   The device is the same for each request
   in this queue.


struct request *req;
+   bool offline_reported = false;

/*
 * To start with, we keep looping until the queue is empty, or until
 * the host is no longer able to accept any more requests.
 */
shost = sdev->host;
for (;;) {
int rtn;
req = blk_peek_request(q);
if (!req)
break;

if (unlikely(!scsi_device_online(sdev))) {
+   if (!offline_reported) {
sdev_printk(KERN_ERR, sdev,
"rejecting I/O to offline device\n");
+   offline_reported = true;
+   }
scsi_kill_request(req, q);
continue;
}


Please, note that I am not familiar with the scsi code. I am involved
because this is printk related. Unfortunately, we could not make
printk() faster. The main principle is to get messages on the console
ASAP. Nobody knows when the system might die and any message might
be important.

Best Regards,
Petr


Re: 答复: Re: [PATCH v2] scsi: Introduce sdev_printk_ratelimited to throttlefrequent printk

2018-04-06 Thread Petr Mladek
On Tue 2018-04-03 14:19:43, wen.yan...@zte.com.cn wrote:
> On the other hand,queue_lock is big, looping doing something under spinlock 
> 
> may locked many things and taking a long time, may cause some problems.
> 
> So This code needs to be optimized later:
> 
> scsi_request_fn()
> {
>   for (;;) {
>   int rtn;
>   /*
>* get next queueable request.  We do this early to make sure
>* that the request is fully prepared even if we cannot
>* accept it.
>*/
> 
>   req = blk_peek_request(q);
> 
>   if (!req)
>   break;
> 
>   if (unlikely(!scsi_device_online(sdev))) {
>   sdev_printk(KERN_ERR, sdev,
>   "rejected I/O to offline device\n");
>   scsi_kill_request(req, q);
>   continue;
> 
>   ^ still under spinlock
>   }

I wonder if the following might be the best solution after all:

if (unlikely(!scsi_device_online(sdev))) {
scsi_kill_request(req, q);

/*
 * printk() might take a while on slow consoles.
 * Prevent solftlockups by releasing the lock.
 */
spin_unlock_irq(q->queue_lock);
sdev_printk(KERN_ERR, sdev,
"rejecting I/O to offline device\n");
spin_lock_irq(q->queue_lock);
continue;
}

I see that the lock is released also in several other situations.
Therefore it looks safe. Also handling too many requests without
releasing the lock seems to be a bad idea in general. I think
that this solution was already suggested earlier.

Please, note that I moved scsi_kill_request() up. It looks natural
to remove it from the queue before we release the queue lock.

Best Regards,
Petr

BTW: Your mail had strange formatting. Please, try to avoid using
html.


Re: 答复: Re: [PATCH v2] scsi: Introduce sdev_printk_ratelimited to throttlefrequent printk

2018-04-02 Thread Jason Yan



On 2018/4/2 13:29, Sergey Senozhatsky wrote:

On (04/02/18 13:14), wen.yan...@zte.com.cn wrote:


> It's true that this print for the same device is useless. But it's
> useful for different devices. Is it possible to limit the print only
> for the same device?

In our scene, it's  just for the same device (q->queuedata), Thanks.


Yes, what Jason meant was that rate limit struct is shared by different call
sites - including scsi_request_fn() from different devices.

If device1->scsi_request_fn()->sdev_printk_ratelimited() causes printk rate
limit, then messages from device2->scsi_request_fn()->sdev_printk_ratelimited()
may be lost entirely, unless you have enough of them.

-ss



Yes, that's exactly what I mean.

Thanks,

Jason


.





Re: 答复: Re: [PATCH v2] scsi: Introduce sdev_printk_ratelimited to throttlefrequent printk

2018-04-02 Thread Jason Yan



On 2018/4/2 13:14, wen.yan...@zte.com.cn wrote:

Hello,

 > It's true that this print for the same device is useless. But it's

 > useful for different devices. Is it possible to limit the print only

 > for the same device?


In our scene, it's  just for the same device (q->queuedata), Thanks.



I mean the print limit you add will affect all devices. One device's 
print *may* cause another device's print limited even if it only printed 
one message.




Re: 答复: Re: [PATCH v2] scsi: Introduce sdev_printk_ratelimited to throttlefrequent printk

2018-04-01 Thread Sergey Senozhatsky
On (04/02/18 13:14), wen.yan...@zte.com.cn wrote:
> 
>> It's true that this print for the same device is useless. But it's
>> useful for different devices. Is it possible to limit the print only
>> for the same device?
> 
>In our scene, it's  just for the same device (q->queuedata), Thanks.

Yes, what Jason meant was that rate limit struct is shared by different call
sites - including scsi_request_fn() from different devices.

If device1->scsi_request_fn()->sdev_printk_ratelimited() causes printk rate
limit, then messages from device2->scsi_request_fn()->sdev_printk_ratelimited()
may be lost entirely, unless you have enough of them.

-ss