Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
On 11/20/2014 9:52 AM, Yishai Hadas wrote: can we avoid the fatal_event_raised flag? We need this flag to prevent generating duplicate event. This may happen when a fatal error occurred and just later the remove one was called when there were active applications. If your only concern is not generating duplicate fatal error events, I would strongly recommend to go and drop this flag to make the code clearer/simpler, object-I_already_did_operation_X flags should be avoided in correct system programming IMHO. Or. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
On 11/20/2014 9:52 AM, Yishai Hadas wrote: can we avoid the fatal_event_raised flag? We need this flag to prevent generating duplicate event. This may happen when a fatal error occurred and just later the remove one was called when there were active applications. If your only concern is not generating duplicate fatal error events, I would strongly recommend to go a drop this flag to make the code clearer/simpler, object-I_already_did_operation_X flags should be avoided in correct system programming IMHO. Or. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
On 11/18/2014 11:42 PM, Or Gerlitz wrote: On Tue, Nov 18, 2014 at 6:24 PM, Yishai Hadasyish...@dev.mellanox.co.il The IB device pointer is not valid at all spots, for example after the HW Device was removed, that's why we set this information on the uverbs device. In 2nd thought, they must be point @ time where it is valid, right? so you can query/cache the whole 32bit device caps on that minute and use it later. Or. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
On 11/18/2014 2:11 PM, Yishai Hadas wrote: @@ -533,6 +552,12 @@ void ib_uverbs_event_handler(struct ib_event_handler *handler, struct ib_uverbs_file *file = container_of(handler, struct ib_uverbs_file, event_handler); + if (event-event == IB_EVENT_DEVICE_FATAL) { + if (file-fatal_event_raised) + return; + file-fatal_event_raised = 1; + } + can we avoid the fatal_event_raised flag? In a similar manner to what defined for rdma-cm consumers, let's add IB_EVENT_DEVICE_REMOVAL event and generate both events from here towards the uberbs consumer (the fatal for legacy apps and the device removal for newer apps that maybe want to distinguish between the two) ib_uverbs_async_handler(file, event-element.port_num, event-event, NULL, NULL); } -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
On 11/19/2014 4:49 PM, Or Gerlitz wrote: On 11/18/2014 2:11 PM, Yishai Hadas wrote: @@ -533,6 +552,12 @@ void ib_uverbs_event_handler(struct ib_event_handler *handler, struct ib_uverbs_file *file = container_of(handler, struct ib_uverbs_file, event_handler); +if (event-event == IB_EVENT_DEVICE_FATAL) { +if (file-fatal_event_raised) +return; +file-fatal_event_raised = 1; +} + can we avoid the fatal_event_raised flag? We need this flag to prevent generating duplicate event. This may happen when a fatal error occurred and just later the remove one was called when there were active applications. In a similar manner to what defined for rdma-cm consumers, let's add IB_EVENT_DEVICE_REMOVAL event and generate both events from here towards the uberbs consumer (the fatal for legacy apps and the device removal for newer apps that maybe want to distinguish between the two) Need to consider whether it's important for newer applications to distinguish, can be opened for other input. In case we believe so, can be added as an extra patch to that series. ib_uverbs_async_handler(file, event-element.port_num, event-event, NULL, NULL); } -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
Enables the uverbs_remove_one to succeed despite the fact that there are running IB applications working with the given ib device. This functionality enables a HW device to be unbind/reset despite the fact that there are running user space applications using it. It exposes a new IB kernel API named 'disassociate_ucontext' which lets a driver detaching its HW resources from a given user context without crashing/terminating the application. In case a driver implemented the above API and registered with ib_uverb there will be no dependency between its device to its uverbs_device. Upon calling remove_one of ib_uverbs the call should return after disassociating the open HW resources without waiting to clients disconnecting. In case driver didn't implement this API there will be no change to current behaviour and uverbs_remove_one will return only when last client has disconnected and reference count on uverbs device became 0. In case the lower driver device was removed any application will continue working over some zombie HCA, further calls will ended with an immediate error. Signed-off-by: Yishai Hadas yish...@mellanox.com Signed-off-by: Jack Morgenstein ja...@mellanox.com --- drivers/infiniband/core/uverbs.h |9 + drivers/infiniband/core/uverbs_cmd.c |8 + drivers/infiniband/core/uverbs_main.c | 324 +++-- include/rdma/ib_verbs.h |2 + 4 files changed, 287 insertions(+), 56 deletions(-) diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h index 643c08a..e485e67 100644 --- a/drivers/infiniband/core/uverbs.h +++ b/drivers/infiniband/core/uverbs.h @@ -94,6 +94,12 @@ struct ib_uverbs_device { struct cdev cdev; struct rb_root xrcd_tree; struct mutexxrcd_tree_mutex; + struct mutexdisassociate_mutex; /* protect lists of files. */ + int disassociated; + int disassociated_supported; + struct srcu_struct disassociate_srcu; + struct list_headuverbs_file_list; + struct list_headuverbs_events_file_list; }; struct ib_uverbs_event_file { @@ -105,6 +111,7 @@ struct ib_uverbs_event_file { wait_queue_head_t poll_wait; struct fasync_struct *async_queue; struct list_headevent_list; + struct list_headlist; }; struct ib_uverbs_file { @@ -114,6 +121,8 @@ struct ib_uverbs_file { struct ib_ucontext *ucontext; struct ib_event_handler event_handler; struct ib_uverbs_event_file*async_file; + struct list_headlist; + int fatal_event_raised; }; struct ib_uverbs_event { diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c index 5ba2a86..0b19361 100644 --- a/drivers/infiniband/core/uverbs_cmd.c +++ b/drivers/infiniband/core/uverbs_cmd.c @@ -38,6 +38,7 @@ #include linux/slab.h #include asm/uaccess.h +#include linux/sched.h #include uverbs.h #include core_priv.h @@ -326,6 +327,7 @@ ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file, INIT_LIST_HEAD(ucontext-xrcd_list); INIT_LIST_HEAD(ucontext-rule_list); ucontext-closing = 0; + ucontext-tgid = get_task_pid(current-group_leader, PIDTYPE_PID); resp.num_comp_vectors = file-device-num_comp_vectors; @@ -1286,6 +1288,12 @@ ssize_t ib_uverbs_create_comp_channel(struct ib_uverbs_file *file, return -EFAULT; } + /* Taking ref count on uverbs_file to make sure that file won't be freed till + * that event file is closed. It will enable accessing the uverbs_device fields as part of + * closing the events file and making sure that uverbs device is available by that time as well. + * Note: similar is already done for the async event file. + */ + kref_get(file-ref); fd_install(resp.fd, filp); return in_len; } diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c index 71ab83f..1d33e46 100644 --- a/drivers/infiniband/core/uverbs_main.c +++ b/drivers/infiniband/core/uverbs_main.c @@ -133,7 +133,12 @@ static void ib_uverbs_release_dev(struct kref *ref) struct ib_uverbs_device *dev = container_of(ref, struct ib_uverbs_device, ref); - complete(dev-comp); + if (dev-disassociated) { + cleanup_srcu_struct(dev-disassociate_srcu); + kfree(dev); + } else { + complete(dev-comp); + } } static void ib_uverbs_release_event_file(struct kref
Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
On 11/18/2014 2:11 PM, Yishai Hadas wrote: @@ -923,6 +1062,7 @@ static void ib_uverbs_add_one(struct ib_device *device) if (device_create_file(uverbs_dev-dev, dev_attr_abi_version)) goto err_class; + uverbs_dev-disassociated_supported = device-disassociate_ucontext ? 1 : 0; ib_set_client_data(device, uverbs_client, uverbs_dev); return; please no, for object-yyy_supported flags (here and elsewhere too). You can add IB device capability (say, named IB_DEVICE_DISASSOCIATE) and just look it up directly from uverbs throug the IB device pointer existing in the uverbs device object in the spots you need this (remove uverbs_dev-disassociated_supported field). Or. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
On 11/18/2014 2:11 PM, Yishai Hadas wrote: @@ -599,6 +638,8 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf, struct ib_uverbs_file *file = filp-private_data; struct ib_uverbs_cmd_hdr hdr; __u32 flags; + int srcu_key; + ssize_t ret; if (count sizeof hdr) return -EINVAL; @@ -606,6 +647,12 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf, if (copy_from_user(hdr, buf, sizeof hdr)) return -EFAULT; + srcu_key = srcu_read_lock(file-device-disassociate_srcu); + if (file-device-disassociated) { + ret = -EIO; + goto out; + } + flags = (hdr.command IB_USER_VERBS_CMD_FLAGS_MASK) IB_USER_VERBS_CMD_FLAGS_SHIFT; @@ -613,26 +660,36 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf, __u32 command; if (hdr.command ~(__u32)(IB_USER_VERBS_CMD_FLAGS_MASK | - IB_USER_VERBS_CMD_COMMAND_MASK)) - return -EINVAL; + IB_USER_VERBS_CMD_COMMAND_MASK)) { + ret = -EINVAL; + goto out; + } put this change and it's such in a pre-patch that does refactoring of the ib_uverbs_write() code, to make the volume and amount of changes introduced by this patch smaller command = hdr.command IB_USER_VERBS_CMD_COMMAND_MASK; if (command = ARRAY_SIZE(uverbs_cmd_table) || - !uverbs_cmd_table[command]) - return -EINVAL; + !uverbs_cmd_table[command]) { + ret = -EINVAL; + goto out; + } if (!file-ucontext - command != IB_USER_VERBS_CMD_GET_CONTEXT) - return -EINVAL; + command != IB_USER_VERBS_CMD_GET_CONTEXT) { + ret = -EINVAL; + goto out; + } - if (!(file-device-ib_dev-uverbs_cmd_mask (1ull command))) - return -ENOSYS; + if (!(file-device-ib_dev-uverbs_cmd_mask (1ull command))) { + ret = -ENOSYS; + goto out; + } - if (hdr.in_words * 4 != count) - return -EINVAL; + if (hdr.in_words * 4 != count) { + ret = -EINVAL; + goto out; + } - return uverbs_cmd_table[command](file, + ret = uverbs_cmd_table[command](file, buf + sizeof(hdr), hdr.in_words * 4, hdr.out_words * 4); @@ -647,47 +704,69 @@ static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf, size_t written_count = count; if (hdr.command ~(__u32)(IB_USER_VERBS_CMD_FLAGS_MASK | - IB_USER_VERBS_CMD_COMMAND_MASK)) - return -EINVAL; + IB_USER_VERBS_CMD_COMMAND_MASK)) { + ret = -EINVAL; + goto out; + } command = hdr.command IB_USER_VERBS_CMD_COMMAND_MASK; if (command = ARRAY_SIZE(uverbs_ex_cmd_table) || - !uverbs_ex_cmd_table[command]) - return -ENOSYS; + !uverbs_ex_cmd_table[command]) { + ret = -ENOSYS; + goto out; + } - if (!file-ucontext) - return -EINVAL; + if (!file-ucontext) { + ret = -EINVAL; + goto out; + } - if (!(file-device-ib_dev-uverbs_ex_cmd_mask (1ull command))) - return -ENOSYS; + if (!(file-device-ib_dev-uverbs_ex_cmd_mask (1ull command))) { + ret = -ENOSYS; + goto out; + } - if (count (sizeof(hdr) + sizeof(ex_hdr))) - return -EINVAL; + if (count (sizeof(hdr) + sizeof(ex_hdr))) { + ret = -EINVAL; + goto out; + } - if (copy_from_user(ex_hdr, buf + sizeof(hdr), sizeof(ex_hdr))) - return -EFAULT; + if (copy_from_user(ex_hdr, buf + sizeof(hdr), sizeof(ex_hdr))) { + ret = -EFAULT; + goto out; + } count -= sizeof(hdr) + sizeof(ex_hdr); buf += sizeof(hdr) + sizeof(ex_hdr); - if ((hdr.in_words + ex_hdr.provider_in_words) * 8 != count) - return -EINVAL; + if ((hdr.in_words
Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
On 11/18/2014 4:38 PM, Or Gerlitz wrote: On 11/18/2014 2:11 PM, Yishai Hadas wrote: @@ -923,6 +1062,7 @@ static void ib_uverbs_add_one(struct ib_device *device) if (device_create_file(uverbs_dev-dev, dev_attr_abi_version)) goto err_class; +uverbs_dev-disassociated_supported = device-disassociate_ucontext ? 1 : 0; ib_set_client_data(device, uverbs_client, uverbs_dev); return; please no, for object-yyy_supported flags (here and elsewhere too). You can add IB device capability (say, named IB_DEVICE_DISASSOCIATE) and just look it up directly from uverbs throug the IB device pointer existing in the uverbs device object in the spots you need this (remove uverbs_dev-disassociated_supported field). The IB device pointer is not valid at all spots, for example after the HW Device was removed, that's why we set this information on the uverbs device. Or. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH V1 for-next 1/2] IB/uverbs: Enable device removal when there are active user space applications
On Tue, Nov 18, 2014 at 6:24 PM, Yishai Hadas yish...@dev.mellanox.co.il The IB device pointer is not valid at all spots, for example after the HW Device was removed, that's why we set this information on the uverbs device. let this (flag on the uverbs object) be, but still, no-no for object-xxx_supported object-yyy_supported object-zzz_supported but rather do object-flags XXX object-flags YYY object-flags ZZZ where each of XXX/YYY/ZZZ are bits that mark support for features xxx/yyy/zzz -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html