Re: [PATCH] Maintain a list of nop-out PDUs that almost timed out

Erez Zilber Tue, 15 Dec 2009 03:23:39 -0800

On Tue, Dec 15, 2009 at 10:34 AM, Mike Christie <micha...@cs.wisc.edu> wrote:
> Erez Zilber wrote:
>> Maintain a list of nop-out PDUs that almost timed out.
>> With this information, you can understand and debug the
>> whole system: you can check your target and see what caused
>> it to be so slow on that specific time, you can see if your
>> network was very busy during that time etc.
>>
>> Signed-off-by: Erez Zilber <erezzi.l...@gmail.com>
>>
>
> Sorry for the late reply. Thanks for doing this work!
>
>
> @@ -88,11 +89,12 @@ static struct option const long_options[] =
>         {"killiscsid", required_argument, NULL, 'k'},
>         {"debug", required_argument, NULL, 'd'},
>         {"show", no_argument, NULL, 'S'},
> +       {"noopinfo", required_argument, NULL, 'N'},
>         {"version", no_argument, NULL, 'V'},
>         {"help", no_argument, NULL, 'h'},
>         {NULL, 0, NULL, 0},
>  };
>
>
> Do you think you want something more generic like a get-debug-info type
> of feature? Maybe in the future we could also show how many times a scsi
> command has timed out or almost timed out (user can adjust scsi timer
> then) or how many times the scsi eh has fired and how many times a
> abort, lu or target reset has timed out?
>
> So maybe it would be get like get stats where then the noop info is the
> first stat and in the future we will have more?


Sounds good. Let's change it to {"debuginfo", required_argument, NULL, 'i'}. OK?
>
>
>
>
> @@ -994,8 +997,25 @@ static int iscsi_nop_out_rsp(struct iscsi_task *task,
>                 if (iscsi_recv_pdu(conn->cls_conn, (struct iscsi_hdr *)nop,
>                                    data, datalen))
>                         rc = ISCSI_ERR_CONN_FAILED;
> -       } else
> +       } else {
> +               ping_delay = jiffies - conn->last_ping;
> +
> +               /* Check if the ping has almost timed out */
> +               if (ping_delay >=
> +                   (session->noop_threshold * (conn->ping_timeout * HZ)) /
> +                   100) {
> +                       mutex_lock(&conn->noop_info_mutex);
>
>
>
> We annot use a mutex here, because we run from a bottom half (softirq
> for iscsi and a tasklet for iser), and we cannot sleep in a bh since
> there is no process context.

You're right. I'll replace the mutex with a spinlock.

>
>
> +                       idx = conn->noop_info_arr_idx;
> +                       conn->noop_info_arr[idx].tv = conn->tmp_noop_tv;
> +                       conn->noop_info_arr[idx].elapsed_time_msec =
> ping_delay;
>
> I think ping_delay is just in jiffies, so you have to do some sort of
> jiffies_to_msec call (I cannot rememeber the name of the helper but
> there is one).

I think it should be: conn->noop_info_arr[idx].elapsed_time_msec =
ping_delay * HZ / 1000

>
>
> +                       conn->noop_info_arr[idx].init = 1;
> +                       conn->noop_info_arr_idx =
> +                               (conn->noop_info_arr_idx + 1) %
> NOOP_INFO_ARR_LEN;
>
>
> I am not getting the reason for the division?

Are you talking about "conn->noop_info_arr_idx =
(conn->noop_info_arr_idx + 1) % NOOP_INFO_ARR_LEN"? It's a cyclic
array.

>
>
> +                       mutex_unlock(&conn->noop_info_mutex);
> +               }
> +
>                 mod_timer(&conn->transport_timer, jiffies +
> conn->recv_timeout);
> +       }
>         iscsi_complete_task(task, ISCSI_TASK_COMPLETED);
>         return rc;
>  }
>
>
>
> @@ -2026,6 +2046,12 @@ static void
> iscsi_check_transport_timeouts(unsigned long data)
>         if (time_before_eq(last_recv + recv_timeout, jiffies)) {
>                 /* send a ping to try to provoke some traffic */
>                 ISCSI_DBG_CONN(conn, "Sending nopout as ping\n");
> +
> +               /* First, save the time of the ping for later use */
> +               mutex_lock(&conn->noop_info_mutex);
>
> Cannot use a mutex here either, because this is run from a timer. Timers
> are run from bhs too.

mutex -> spinlock.

Let me know if you're OK with my suggestions and I'll resend the patch.

Erez

--

You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to open-is...@googlegroups.com.
To unsubscribe from this group, send email to 
open-iscsi+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/open-iscsi?hl=en.

Re: [PATCH] Maintain a list of nop-out PDUs that almost timed out

Reply via email to