Re: [Qemu-devel] [PATCH] block/iscsi: add support for request timeouts

2015-06-26 Thread Paolo Bonzini


On 25/06/2015 23:21, Peter Lieven wrote:
> I will send a v2 that is compatible with 1.9.0 and enables the timeout
> stuff only for libiscsi >= 1.15.0

Since Stefan has already applied the patch, restoring 1.9.0
compatibility (and disabling timeouts between 1.9.0 and 1.14.x) on top
of this patch would be enough.
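
To make the version split concrete, here is a minimal sketch of the policy described above: build against anything from 1.9.0 on, but only honour a timeout from 1.15.0 on, and otherwise warn and ignore it. Everything prefixed with example_ or EXAMPLE_ is hypothetical, including the version encoding. None of it is a macro or function provided by libiscsi or QEMU, and a real patch would more likely decide this at configure or compile time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical encoding: major * 100 + minor, so 1.9 -> 109, 1.15 -> 115.
 * This is NOT something libiscsi exports. */
#define EXAMPLE_VERSION(major, minor) ((major) * 100 + (minor))
#define EXAMPLE_MIN_BUILD_VERSION     EXAMPLE_VERSION(1, 9)
#define EXAMPLE_MIN_TIMEOUT_VERSION   EXAMPLE_VERSION(1, 15)

/* Can the driver be built against this library version at all? */
static bool example_can_build(int version)
{
    return version >= EXAMPLE_MIN_BUILD_VERSION;
}

/* Return the timeout in seconds that would actually be applied.
 * 0 means "no timeout", which is also the default in the patch. */
static int example_effective_timeout(int version, int requested)
{
    assert(version > 0); /* an encoded version is always positive */

    if (requested <= 0) {
        return 0;
    }
    if (version < EXAMPLE_MIN_TIMEOUT_VERSION) {
        /* "warn and ignore" rather than refusing to start */
        fprintf(stderr, "iSCSI: request timeouts need libiscsi >= 1.15, "
                        "ignoring timeout=%d\n", requested);
        return 0;
    }
    return requested;
}
```

With this sketch, a 1.14 build asked for timeout=30 prints the warning and runs without a timeout, while a 1.15 build passes the 30 seconds through unchanged.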

Thanks for your understanding!

Paolo



Re: [Qemu-devel] [PATCH] block/iscsi: add support for request timeouts

2015-06-25 Thread Peter Lieven
On 25.06.2015 at 23:08, Paolo Bonzini wrote:
>
> On 16/06/2015 13:45, Peter Lieven wrote:
>> libiscsi starting with 1.15 will properly support timeout of iscsi
>> commands. The default will remain no timeout, but this can
>> be changed via cmdline parameters, e.g.:
>>
>> qemu -iscsi timeout=30 -drive file=iscsi://...
>>
>> If a timeout occurs a reconnect is scheduled and the timed out command
>> will be requeued for processing after a successful reconnect.
>>
>> The required API call iscsi_set_timeout is present since libiscsi
>> 1.10 which was released in October 2013. However, due to some bugs
>> in the libiscsi code the use is not recommended before version 1.15.
> If so, QEMU should not allow it if libiscsi is older than 1.15.

Not accept a timeout parameter or ignore it and print a warning?

>
>> Please note that this patch bumps the libiscsi requirement to 1.10
>> to have all functions and macros defined.
> This is not acceptable, unfortunately.  I explained this two months ago
> (https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg01847.html)
> and it is still true.

Sorry, I missed that. Can you verify whether 1.15.0 has a soname that
makes it possible to jump to 1.15.0 at some point in the future?

>
> libiscsi keeps breaking ABI compatibility and for a while did not even
> bump the soname when it did.  This makes it completely impossible for
> distros to upgrade to a newer libiscsi, and RHEL7 is thus stuck with 1.9.
>
> Yes, it is 2 years old.  It doesn't matter.  If libiscsi upstream only
> _tried_ to preserve ABI compatibility, they wouldn't be in this
> situation.  And I know that it is not even trying, because it broke
> again sometime between 1.11 and 1.14 for a totally trivial reason:
>
> --- a/iscsi/iscsi.h
> +++ b/iscsi/iscsi.h
> @@ -91,6 +136,8 @@ struct iscsi_url {
> char target[MAX_STRING_SIZE + 1];
> char user[MAX_STRING_SIZE + 1];
> char passwd[MAX_STRING_SIZE + 1];
> +   char target_user[MAX_STRING_SIZE + 1];
> +   char target_passwd[MAX_STRING_SIZE + 1];
> int lun;
> struct iscsi_context *iscsi;
>  };
>
>
> This is the only change between these releases that breaks the ABI, but
> it is already one too many. :(
>
> (Also, the parsing of URLs into iscsi_url doesn't even try to obey the
> RFCs...).
>
>> The patch also fixes an
>> off-by-one error in the NOP timeout calculation, which was fixed
>> while touching these code parts.
> Can you please separate this part anyway?

Sure.

I will send a v2 that is compatible with 1.9.0 and enables the timeout
stuff only for libiscsi >= 1.15.0

Peter



Re: [Qemu-devel] [PATCH] block/iscsi: add support for request timeouts

2015-06-25 Thread Paolo Bonzini


On 16/06/2015 13:45, Peter Lieven wrote:
> libiscsi starting with 1.15 will properly support timeout of iscsi
> commands. The default will remain no timeout, but this can
> be changed via cmdline parameters, e.g.:
> 
> qemu -iscsi timeout=30 -drive file=iscsi://...
> 
> If a timeout occurs a reconnect is scheduled and the timed out command
> will be requeued for processing after a successful reconnect.
> 
> The required API call iscsi_set_timeout is present since libiscsi
> 1.10 which was released in October 2013. However, due to some bugs
> in the libiscsi code the use is not recommended before version 1.15.

If so, QEMU should not allow it if libiscsi is older than 1.15.

> Please note that this patch bumps the libiscsi requirement to 1.10
> to have all functions and macros defined.

This is not acceptable, unfortunately.  I explained this two months ago
(https://lists.gnu.org/archive/html/qemu-devel/2015-04/msg01847.html)
and it is still true.

libiscsi keeps breaking ABI compatibility and for a while did not even
bump the soname when it did.  This makes it completely impossible for
distros to upgrade to a newer libiscsi, and RHEL7 is thus stuck with 1.9.

Yes, it is 2 years old.  It doesn't matter.  If libiscsi upstream only
_tried_ to preserve ABI compatibility, they wouldn't be in this
situation.  And I know that it is not even trying, because it broke
again sometime between 1.11 and 1.14 for a totally trivial reason:

--- a/iscsi/iscsi.h
+++ b/iscsi/iscsi.h
@@ -91,6 +136,8 @@ struct iscsi_url {
char target[MAX_STRING_SIZE + 1];
char user[MAX_STRING_SIZE + 1];
char passwd[MAX_STRING_SIZE + 1];
+   char target_user[MAX_STRING_SIZE + 1];
+   char target_passwd[MAX_STRING_SIZE + 1];
int lun;
struct iscsi_context *iscsi;
 };


This is the only change between these releases that breaks the ABI, but
it is already one too many. :(
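
To illustrate why inserting fields in the middle of a public struct breaks the ABI, here is a small sketch with two stand-in layouts. The struct names, the stand-in context type and the value 255 for the string size are assumptions made for illustration, and only the fields visible in the hunk above are reproduced. An application compiled against the old header still reads lun at the old offset, while a newer library stores it two string buffers further down.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration; the real constant is defined in iscsi.h. */
#define EXAMPLE_MAX_STRING_SIZE 255

struct example_ctx; /* stand-in for struct iscsi_context */

/* Layout an application was compiled against (old header). */
struct example_url_old {
    char target[EXAMPLE_MAX_STRING_SIZE + 1];
    char user[EXAMPLE_MAX_STRING_SIZE + 1];
    char passwd[EXAMPLE_MAX_STRING_SIZE + 1];
    int lun;
    struct example_ctx *iscsi;
};

/* Layout used by the upgraded shared library (new header). */
struct example_url_new {
    char target[EXAMPLE_MAX_STRING_SIZE + 1];
    char user[EXAMPLE_MAX_STRING_SIZE + 1];
    char passwd[EXAMPLE_MAX_STRING_SIZE + 1];
    char target_user[EXAMPLE_MAX_STRING_SIZE + 1];
    char target_passwd[EXAMPLE_MAX_STRING_SIZE + 1];
    int lun;
    struct example_ctx *iscsi;
};

/* Number of bytes by which 'lun' moved between the two layouts. */
static size_t example_lun_shift(void)
{
    size_t old_off = offsetof(struct example_url_old, lun);
    size_t new_off = offsetof(struct example_url_new, lun);

    assert(new_off > old_off); /* guards the unsigned subtraction */
    return new_off - old_off;
}
```

In this sketch the shift is exactly the size of the two inserted buffers. Appending the new fields at the end would have kept the existing offsets, although the struct size would still change for callers that allocate it themselves.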

(Also, the parsing of URLs into iscsi_url doesn't even try to obey the
RFCs...).

> The patch also fixes an
> off-by-one error in the NOP timeout calculation, which was fixed
> while touching these code parts.

Can you please separate this part anyway?
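
For reference, the off-by-one in question comes down to a `>` versus `>=` comparison against MAX_NOP_FAILURES. The sketch below isolates just that comparison. The helper names are hypothetical, and the value 3 is taken from the defines in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NOP_FAILURES 3 /* value from the patch */

/* Comparison before the patch. */
static bool example_nop_limit_hit_old(int nops_in_flight)
{
    return nops_in_flight > MAX_NOP_FAILURES;
}

/* Comparison after the patch. */
static bool example_nop_limit_hit_new(int nops_in_flight)
{
    return nops_in_flight >= MAX_NOP_FAILURES;
}

/* Smallest number of unanswered NOPs for which the given check fires. */
static int example_first_trigger(bool (*check)(int))
{
    int n = 0;

    assert(check != NULL);
    while (!check(n)) {
        n++;
    }
    return n;
}
```

The old comparison only fires once a fourth NOP is outstanding, one more than the constant's name suggests, while the new one fires at the third.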

Paolo



Re: [Qemu-devel] [PATCH] block/iscsi: add support for request timeouts

2015-06-24 Thread Peter Lieven

On 23.06.2015 at 01:03, ronnie sahlberg wrote:
> LGTM
>
> It is good to finally have timeouts that work in libiscsi, and a consumer
> that can use and benefit from it.


Paolo, Kevin, Stefan, do you think this is something for 2.4?

Peter



Re: [Qemu-devel] [PATCH] block/iscsi: add support for request timeouts

2015-06-22 Thread ronnie sahlberg
LGTM

It is good to finally have timeouts that work in libiscsi, and a consumer
that can use and benefit from it.

On Tue, Jun 16, 2015 at 4:45 AM, Peter Lieven wrote:

> libiscsi starting with 1.15 will properly support timeout of iscsi
> commands. The default will remain no timeout, but this can
> be changed via cmdline parameters, e.g.:
>
> qemu -iscsi timeout=30 -drive file=iscsi://...
>
> If a timeout occurs a reconnect is scheduled and the timed out command
> will be requeued for processing after a successful reconnect.
>
> The required API call iscsi_set_timeout is present since libiscsi
> 1.10 which was released in October 2013. However, due to some bugs
> in the libiscsi code the use is not recommended before version 1.15.
>
> Please note that this patch bumps the libiscsi requirement to 1.10
> to have all functions and macros defined. The patch also fixes an
> off-by-one error in the NOP timeout calculation, which was fixed
> while touching these code parts.
>
> Signed-off-by: Peter Lieven 
> ---
>  block/iscsi.c   | 87 ++---
>  configure   |  6 ++--
>  qemu-options.hx |  4 +++
>  3 files changed, 72 insertions(+), 25 deletions(-)
>
> diff --git a/block/iscsi.c b/block/iscsi.c
> index 14e97a6..f19a56a 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -69,6 +69,7 @@ typedef struct IscsiLun {
>  bool dpofua;
>  bool has_write_same;
>  bool force_next_flush;
> +bool request_timed_out;
>  } IscsiLun;
>
>  typedef struct IscsiTask {
> @@ -99,7 +100,8 @@ typedef struct IscsiAIOCB {
>  #endif
>  } IscsiAIOCB;
>
> -#define EVENT_INTERVAL 250
> +/* libiscsi uses time_t so its enough to process events every second */
> +#define EVENT_INTERVAL 1000
>  #define NOP_INTERVAL 5000
>  #define MAX_NOP_FAILURES 3
>  #define ISCSI_CMD_RETRIES ARRAY_SIZE(iscsi_retry_times)
> @@ -186,13 +188,18 @@ iscsi_co_generic_cb(struct iscsi_context *iscsi, int status,
>  iTask->do_retry = 1;
>  goto out;
>  }
> -/* status 0x28 is SCSI_TASK_SET_FULL. It was first introduced
> - * in libiscsi 1.10.0. Hardcode this value here to avoid
> - * the need to bump the libiscsi requirement to 1.10.0 */
> -if (status == SCSI_STATUS_BUSY || status == 0x28) {
> +if (status == SCSI_STATUS_BUSY || status == SCSI_STATUS_TIMEOUT ||
> +status == SCSI_STATUS_TASK_SET_FULL) {
>  unsigned retry_time =
>  exp_random(iscsi_retry_times[iTask->retries - 1]);
> -error_report("iSCSI Busy/TaskSetFull (retry #%u in %u ms): %s",
> +if (status == SCSI_STATUS_TIMEOUT) {
> +/* make sure the request is rescheduled AFTER the
> + * reconnect is initiated */
> +retry_time = EVENT_INTERVAL * 2;
> +iTask->iscsilun->request_timed_out = true;
> +}
> +error_report("iSCSI Busy/TaskSetFull/TimeOut"
> + " (retry #%u in %u ms): %s",
>   iTask->retries, retry_time,
>   iscsi_get_error(iscsi));
>  aio_timer_init(iTask->iscsilun->aio_context,
> @@ -276,20 +283,26 @@ iscsi_set_events(IscsiLun *iscsilun)
> iscsilun);
>  iscsilun->events = ev;
>  }
> -
> -/* newer versions of libiscsi may return zero events. In this
> - * case start a timer to ensure we are able to return to service
> - * once this situation changes. */
> -if (!ev) {
> -timer_mod(iscsilun->event_timer,
> -  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + EVENT_INTERVAL);
> -}
>  }
>
> -static void iscsi_timed_set_events(void *opaque)
> +static void iscsi_timed_check_events(void *opaque)
>  {
>  IscsiLun *iscsilun = opaque;
> +
> +/* check for timed out requests */
> +iscsi_service(iscsilun->iscsi, 0);
> +
> +if (iscsilun->request_timed_out) {
> +iscsilun->request_timed_out = false;
> +iscsi_reconnect(iscsilun->iscsi);
> +}
> +
> +/* newer versions of libiscsi may return zero events. Ensure we are able
> + * to return to service once this situation changes. */
>  iscsi_set_events(iscsilun);
> +
> +timer_mod(iscsilun->event_timer,
> +  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + EVENT_INTERVAL);
>  }
>
>  static void
> @@ -1096,16 +1109,37 @@ static char *parse_initiator_name(const char *target)
>  return iscsi_name;
>  }
>
> +static int parse_timeout(const char *target)
> +{
> +QemuOptsList *list;
> +QemuOpts *opts;
> +const char *timeout;
> +
> +list = qemu_find_opts("iscsi");
> +if (list) {
> +opts = qemu_opts_find(list, target);
> +if (!opts) {
> +opts = QTAILQ_FIRST(&list->head);
> +}
> +if (opts) {
> +timeout = qe

[Qemu-devel] [PATCH] block/iscsi: add support for request timeouts

2015-06-16 Thread Peter Lieven
libiscsi starting with 1.15 will properly support timeout of iscsi
commands. The default will remain no timeout, but this can
be changed via cmdline parameters, e.g.:

qemu -iscsi timeout=30 -drive file=iscsi://...

If a timeout occurs a reconnect is scheduled and the timed out command
will be requeued for processing after a successful reconnect.
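
As a rough sketch of that sequence (not the driver code itself): the completion callback only marks the LUN and picks a retry delay of two event intervals, and the periodic timer, which runs once per event interval, consumes the mark and starts the reconnect. Because the timer fires within one interval of the mark being set, the retry is scheduled after the reconnect has been initiated. The struct and function names below are hypothetical, and a counter stands in for the call to iscsi_reconnect().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EVENT_INTERVAL 1000 /* ms, as in this patch */

/* Hypothetical stand-in for the per-LUN state. */
struct example_lun {
    bool request_timed_out;
    int reconnects; /* counts simulated reconnects */
};

/* Mirrors the timeout branch of the completion callback:
 * mark the LUN and return the retry delay in milliseconds. */
static unsigned example_on_timeout(struct example_lun *lun)
{
    assert(lun != NULL);
    lun->request_timed_out = true;
    return EVENT_INTERVAL * 2;
}

/* Mirrors the periodic timer: consume the mark and reconnect once. */
static void example_timer_tick(struct example_lun *lun)
{
    assert(lun != NULL);
    if (lun->request_timed_out) {
        lun->request_timed_out = false;
        lun->reconnects++; /* stands in for iscsi_reconnect() */
    }
}
```

Several requests timing out between two ticks set the same flag, so in this sketch they share a single reconnect.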

The required API call iscsi_set_timeout is present since libiscsi
1.10 which was released in October 2013. However, due to some bugs
in the libiscsi code the use is not recommended before version 1.15.

Please note that this patch bumps the libiscsi requirement to 1.10
to have all functions and macros defined. The patch also fixes an
off-by-one error in the NOP timeout calculation, which was fixed
while touching these code parts.

Signed-off-by: Peter Lieven 
---
 block/iscsi.c   | 87 ++---
 configure   |  6 ++--
 qemu-options.hx |  4 +++
 3 files changed, 72 insertions(+), 25 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 14e97a6..f19a56a 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -69,6 +69,7 @@ typedef struct IscsiLun {
 bool dpofua;
 bool has_write_same;
 bool force_next_flush;
+bool request_timed_out;
 } IscsiLun;
 
 typedef struct IscsiTask {
@@ -99,7 +100,8 @@ typedef struct IscsiAIOCB {
 #endif
 } IscsiAIOCB;
 
-#define EVENT_INTERVAL 250
+/* libiscsi uses time_t so its enough to process events every second */
+#define EVENT_INTERVAL 1000
 #define NOP_INTERVAL 5000
 #define MAX_NOP_FAILURES 3
 #define ISCSI_CMD_RETRIES ARRAY_SIZE(iscsi_retry_times)
@@ -186,13 +188,18 @@ iscsi_co_generic_cb(struct iscsi_context *iscsi, int status,
 iTask->do_retry = 1;
 goto out;
 }
-/* status 0x28 is SCSI_TASK_SET_FULL. It was first introduced
- * in libiscsi 1.10.0. Hardcode this value here to avoid
- * the need to bump the libiscsi requirement to 1.10.0 */
-if (status == SCSI_STATUS_BUSY || status == 0x28) {
+if (status == SCSI_STATUS_BUSY || status == SCSI_STATUS_TIMEOUT ||
+status == SCSI_STATUS_TASK_SET_FULL) {
 unsigned retry_time =
 exp_random(iscsi_retry_times[iTask->retries - 1]);
-error_report("iSCSI Busy/TaskSetFull (retry #%u in %u ms): %s",
+if (status == SCSI_STATUS_TIMEOUT) {
+/* make sure the request is rescheduled AFTER the
+ * reconnect is initiated */
+retry_time = EVENT_INTERVAL * 2;
+iTask->iscsilun->request_timed_out = true;
+}
+error_report("iSCSI Busy/TaskSetFull/TimeOut"
+ " (retry #%u in %u ms): %s",
  iTask->retries, retry_time,
  iscsi_get_error(iscsi));
 aio_timer_init(iTask->iscsilun->aio_context,
@@ -276,20 +283,26 @@ iscsi_set_events(IscsiLun *iscsilun)
iscsilun);
 iscsilun->events = ev;
 }
-
-/* newer versions of libiscsi may return zero events. In this
- * case start a timer to ensure we are able to return to service
- * once this situation changes. */
-if (!ev) {
-timer_mod(iscsilun->event_timer,
-  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + EVENT_INTERVAL);
-}
 }
 
-static void iscsi_timed_set_events(void *opaque)
+static void iscsi_timed_check_events(void *opaque)
 {
 IscsiLun *iscsilun = opaque;
+
+/* check for timed out requests */
+iscsi_service(iscsilun->iscsi, 0);
+
+if (iscsilun->request_timed_out) {
+iscsilun->request_timed_out = false;
+iscsi_reconnect(iscsilun->iscsi);
+}
+
+/* newer versions of libiscsi may return zero events. Ensure we are able
+ * to return to service once this situation changes. */
 iscsi_set_events(iscsilun);
+
+timer_mod(iscsilun->event_timer,
+  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + EVENT_INTERVAL);
 }
 
 static void
@@ -1096,16 +1109,37 @@ static char *parse_initiator_name(const char *target)
 return iscsi_name;
 }
 
+static int parse_timeout(const char *target)
+{
+QemuOptsList *list;
+QemuOpts *opts;
+const char *timeout;
+
+list = qemu_find_opts("iscsi");
+if (list) {
+opts = qemu_opts_find(list, target);
+if (!opts) {
+opts = QTAILQ_FIRST(&list->head);
+}
+if (opts) {
+timeout = qemu_opt_get(opts, "timeout");
+if (timeout) {
+return atoi(timeout);
+}
+}
+}
+
+return 0;
+}
+
 static void iscsi_nop_timed_event(void *opaque)
 {
 IscsiLun *iscsilun = opaque;
 
-if (iscsi_get_nops_in_flight(iscsilun->iscsi) > MAX_NOP_FAILURES) {
+if (iscsi_get_nops_in_flight(iscsilun->iscsi) >= MAX_NOP_FAILURES) {
 error_report("iSCSI: NO