On Tue, Dec 15, 2009 at 10:34 AM, Mike Christie <micha...@cs.wisc.edu> wrote: > Erez Zilber wrote: >> Maintain a list of nop-out PDUs that almost timed out. >> With this information, you can understand and debug the >> whole system: you can check your target and see what caused >> it to be so slow on that specific time, you can see if your >> network was very busy during that time etc. >> >> Signed-off-by: Erez Zilber <erezzi.l...@gmail.com> >> > > Sorry for the late reply. Thanks for doing this work! > > > @@ -88,11 +89,12 @@ static struct option const long_options[] = > {"killiscsid", required_argument, NULL, 'k'}, > {"debug", required_argument, NULL, 'd'}, > {"show", no_argument, NULL, 'S'}, > + {"noopinfo", required_argument, NULL, 'N'}, > {"version", no_argument, NULL, 'V'}, > {"help", no_argument, NULL, 'h'}, > {NULL, 0, NULL, 0}, > }; > > > Do you think you want something more generic like a get-debug-info type > of feature? Maybe in the future we could also show how many times a scsi > command has timed out or almost timed out (user can adjust scsi timer > then) or how many times the scsi eh has fired and how many times a > abort, lu or target reset has timed out? > > So maybe it would be get like get stats where then the noop info is the > first stat and in the future we will have more?
Sounds good. Let's change it to {"debuginfo", required_argument, NULL, 'i'}. OK? > > > > > @@ -994,8 +997,25 @@ static int iscsi_nop_out_rsp(struct iscsi_task *task, > if (iscsi_recv_pdu(conn->cls_conn, (struct iscsi_hdr *)nop, > data, datalen)) > rc = ISCSI_ERR_CONN_FAILED; > - } else > + } else { > + ping_delay = jiffies - conn->last_ping; > + > + /* Check if the ping has almost timed out */ > + if (ping_delay >= > + (session->noop_threshold * (conn->ping_timeout * HZ)) / > + 100) { > + mutex_lock(&conn->noop_info_mutex); > > > > We annot use a mutex here, because we run from a bottom half (softirq > for iscsi and a tasklet for iser), and we cannot sleep in a bh since > there is no process context. You're right. I'll replace the mutex with a spinlock. > > > + idx = conn->noop_info_arr_idx; > + conn->noop_info_arr[idx].tv = conn->tmp_noop_tv; > + conn->noop_info_arr[idx].elapsed_time_msec = > ping_delay; > > I think ping_delay is just in jiffies, so you have to do some sort of > jiffies_to_msec call (I cannot rememeber the name of the helper but > there is one). I think it should be: conn->noop_info_arr[idx].elapsed_time_msec = ping_delay * HZ / 1000 > > > + conn->noop_info_arr[idx].init = 1; > + conn->noop_info_arr_idx = > + (conn->noop_info_arr_idx + 1) % > NOOP_INFO_ARR_LEN; > > > I am not getting the reason for the division? Are you talking about "conn->noop_info_arr_idx = (conn->noop_info_arr_idx + 1) % NOOP_INFO_ARR_LEN"? It's a cyclic array. > > > + mutex_unlock(&conn->noop_info_mutex); > + } > + > mod_timer(&conn->transport_timer, jiffies + > conn->recv_timeout); > + } > iscsi_complete_task(task, ISCSI_TASK_COMPLETED); > return rc; > } > > > > @@ -2026,6 +2046,12 @@ static void > iscsi_check_transport_timeouts(unsigned long data) > if (time_before_eq(last_recv + recv_timeout, jiffies)) { > /* send a ping to try to provoke some traffic */ > ISCSI_DBG_CONN(conn, "Sending nopout as ping\n"); > + > + /* First, save the time of the ping for later use */ > + mutex_lock(&conn->noop_info_mutex); > > Cannot use a mutex here either, because this is run from a timer. Timers > are run from bhs too. mutex -> spinlock. Let me know if you're OK with my suggestions and I'll resend the patch. Erez -- You received this message because you are subscribed to the Google Groups "open-iscsi" group. To post to this group, send email to open-is...@googlegroups.com. To unsubscribe from this group, send email to open-iscsi+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/open-iscsi?hl=en.