Re: [PATCH V2 01/15] Drivers: hv: vmbus: Raise retry/wait limits in vmbus_post_msg()

2016-12-06 Thread Greg KH
On Sat, Dec 03, 2016 at 12:34:28PM -0800, k...@exchange.microsoft.com wrote:
> From: Vitaly Kuznetsov 
> 
> DoS protection conditions were altered in WS2016 and now it's easy to get
> -EAGAIN returned from vmbus_post_msg() (e.g. when we try changing MTU on a
> netvsc device in a loop). All vmbus_post_msg() callers don't retry the
> operation and we usually end up with a non-functional device or crash.
> 
> While host's DoS protection conditions are unknown to me my tests show that
> it can take up to 10 seconds before the message is sent so doing udelay()
> is not an option, we really need to sleep. Almost all vmbus_post_msg()
> callers are ready to sleep but there is one special case:
> vmbus_initiate_unload() which can be called from interrupt/NMI context and
> we can't sleep there. I'm also not sure about the lonely
> vmbus_send_tl_connect_request() which has no in-tree users but its external
> users are most likely waiting for the host to reply so sleeping there is
> also appropriate.
> 
> Signed-off-by: Vitaly Kuznetsov 
> Signed-off-by: K. Y. Srinivasan 

Shouldn't this go to stable kernels so that 4.9 and earlier can work
properly on WS2016?

thanks,

greg k-h
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


[PATCH V2 01/15] Drivers: hv: vmbus: Raise retry/wait limits in vmbus_post_msg()

2016-12-03 Thread kys
From: Vitaly Kuznetsov 

DoS protection conditions were altered in WS2016 and now it's easy to get
-EAGAIN returned from vmbus_post_msg() (e.g. when we try changing MTU on a
netvsc device in a loop). All vmbus_post_msg() callers don't retry the
operation and we usually end up with a non-functional device or crash.

While host's DoS protection conditions are unknown to me my tests show that
it can take up to 10 seconds before the message is sent so doing udelay()
is not an option, we really need to sleep. Almost all vmbus_post_msg()
callers are ready to sleep but there is one special case:
vmbus_initiate_unload() which can be called from interrupt/NMI context and
we can't sleep there. I'm also not sure about the lonely
vmbus_send_tl_connect_request() which has no in-tree users but its external
users are most likely waiting for the host to reply so sleeping there is
also appropriate.

Signed-off-by: Vitaly Kuznetsov 
Signed-off-by: K. Y. Srinivasan 
---
 drivers/hv/channel.c  |   17 +
 drivers/hv/channel_mgmt.c |   10 ++
 drivers/hv/connection.c   |   17 -
 drivers/hv/hyperv_vmbus.h |2 +-
 4 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 5fb4c6d..d5b8d9f 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -181,7 +181,7 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 
send_ringbuffer_size,
spin_unlock_irqrestore(_connection.channelmsg_lock, flags);
 
ret = vmbus_post_msg(open_msg,
-  sizeof(struct vmbus_channel_open_channel));
+sizeof(struct vmbus_channel_open_channel), true);
 
if (ret != 0) {
err = ret;
@@ -233,7 +233,7 @@ int vmbus_send_tl_connect_request(const uuid_le 
*shv_guest_servie_id,
conn_msg.guest_endpoint_id = *shv_guest_servie_id;
conn_msg.host_service_id = *shv_host_servie_id;
 
-   return vmbus_post_msg(_msg, sizeof(conn_msg));
+   return vmbus_post_msg(_msg, sizeof(conn_msg), true);
 }
 EXPORT_SYMBOL_GPL(vmbus_send_tl_connect_request);
 
@@ -419,7 +419,7 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, 
void *kbuffer,
spin_unlock_irqrestore(_connection.channelmsg_lock, flags);
 
ret = vmbus_post_msg(gpadlmsg, msginfo->msgsize -
-  sizeof(*msginfo));
+sizeof(*msginfo), true);
if (ret != 0)
goto cleanup;
 
@@ -433,8 +433,8 @@ int vmbus_establish_gpadl(struct vmbus_channel *channel, 
void *kbuffer,
gpadl_body->gpadl = next_gpadl_handle;
 
ret = vmbus_post_msg(gpadl_body,
-submsginfo->msgsize -
-sizeof(*submsginfo));
+submsginfo->msgsize - sizeof(*submsginfo),
+true);
if (ret != 0)
goto cleanup;
 
@@ -485,8 +485,8 @@ int vmbus_teardown_gpadl(struct vmbus_channel *channel, u32 
gpadl_handle)
list_add_tail(>msglistentry,
  _connection.chn_msg_list);
spin_unlock_irqrestore(_connection.channelmsg_lock, flags);
-   ret = vmbus_post_msg(msg,
-  sizeof(struct vmbus_channel_gpadl_teardown));
+   ret = vmbus_post_msg(msg, sizeof(struct vmbus_channel_gpadl_teardown),
+true);
 
if (ret)
goto post_msg_err;
@@ -557,7 +557,8 @@ static int vmbus_close_internal(struct vmbus_channel 
*channel)
msg->header.msgtype = CHANNELMSG_CLOSECHANNEL;
msg->child_relid = channel->offermsg.child_relid;
 
-   ret = vmbus_post_msg(msg, sizeof(struct vmbus_channel_close_channel));
+   ret = vmbus_post_msg(msg, sizeof(struct vmbus_channel_close_channel),
+true);
 
if (ret) {
pr_err("Close failed: close post msg return is %d\n", ret);
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index cbb96f2..dc6b675 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -321,7 +321,8 @@ static void vmbus_release_relid(u32 relid)
memset(, 0, sizeof(struct vmbus_channel_relid_released));
msg.child_relid = relid;
msg.header.msgtype = CHANNELMSG_RELID_RELEASED;
-   vmbus_post_msg(, sizeof(struct vmbus_channel_relid_released));
+   vmbus_post_msg(, sizeof(struct vmbus_channel_relid_released),
+  true);
 }
 
 void hv_event_tasklet_disable(struct vmbus_channel *channel)
@@ -726,7 +727,8 @@ void vmbus_initiate_unload(bool crash)
init_completion(_connection.unload_event);
memset(, 0, sizeof(struct vmbus_channel_message_header));
hdr.msgtype = CHANNELMSG_UNLOAD;
-   vmbus_post_msg(, sizeof(struct