Re: [PATCH] lib/scatterlist: Provide a DMA page iterator
On Sat, Jan 12, 2019 at 06:37:58PM +, Jason Gunthorpe wrote: > On Sat, Jan 12, 2019 at 12:27:05PM -0600, Shiraz Saleem wrote: > > On Fri, Jan 04, 2019 at 10:35:43PM +, Jason Gunthorpe wrote: > > > Commit 2db76d7c3c6d ("lib/scatterlist: sg_page_iter: support sg lists w/o > > > backing pages") introduced the sg_page_iter_dma_address() function without > > > providing a way to use it in the general case. If the sg_dma_len is not > > > equal to the dma_length callers cannot safely use the > > > for_each_sg_page/sg_page_iter_dma_address combination. > > > > > > Resolve this API mistake by providing a DMA specific iterator, > > > for_each_sg_dma_page(), that uses the right length so > > > sg_page_iter_dma_address() works as expected with all sglists. A new > > > iterator type is introduced to provide compile-time safety against wrongly > > > mixing accessors and iterators. > > [..] > > > > > > > > +/* > > > + * sg page iterator for DMA addresses > > > + * > > > + * This is the same as sg_page_iter however you can call > > > + * sg_page_iter_dma_address(@dma_iter) to get the page's DMA > > > + * address. sg_page_iter_page() cannot be called on this iterator. > > > + */ > > Does it make sense to have a variant of sg_page_iter_page() to get the > > page descriptor with this dma_iter? This can be used when walking DMA-mapped > > SG lists with for_each_sg_dma_page. > > I think that would be a complicated cacluation to find the right > offset into the page sg list to get the page pointer back. We can't > just naively use the existing iterator location. > > Probably places that need this are better to run with two iterators, > less computationally expensive. > > Did you find a need for this? > Well I was trying convert the RDMA drivers to use your new iterator variant and saw the need for it in locations where we need virtual address of the pages contained in the SGEs. diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c b/drivers/infiniband/hw/bnxt_re/qplib_res.c index 59eeac5..7d26903 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_res.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c @@ -85,7 +85,7 @@ static void __free_pbl(struct pci_dev *pdev, struct bnxt_qplib_pbl *pbl, static int __alloc_pbl(struct pci_dev *pdev, struct bnxt_qplib_pbl *pbl, struct scatterlist *sghead, u32 pages, u32 pg_size) { - struct scatterlist *sg; + struct sg_dma_page_iter sg_iter; bool is_umem = false; int i; @@ -116,12 +116,13 @@ static int __alloc_pbl(struct pci_dev *pdev, struct bnxt_qplib_pbl *pbl, } else { i = 0; is_umem = true; - for_each_sg(sghead, sg, pages, i) { - pbl->pg_map_arr[i] = sg_dma_address(sg); - pbl->pg_arr[i] = sg_virt(sg); + for_each_sg_dma_page(sghead, &sg_iter, pages, 0) { + pbl->pg_map_arr[i] = sg_page_iter_dma_address(&sg_iter); + pbl->pg_arr[i] = page_address(sg_page_iter_page(&sg_iter.base)); ??? ^ if (!pbl->pg_arr[i]) goto fail; + i++; pbl->pg_count++; } } -- @@ -191,16 +190,16 @@ int rxe_mem_init_user(struct rxe_pd *pd, u64 start, goto err1; } - mem->page_shift = umem->page_shift; - mem->page_mask = BIT(umem->page_shift) - 1; + mem->page_shift = PAGE_SHIFT; + mem->page_mask = PAGE_SIZE - 1; num_buf = 0; map = mem->map; if (length > 0) { buf = map[0]->buf; - for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) { - vaddr = page_address(sg_page(sg)); + for_each_sg_dma_page(umem->sg_head.sgl, &sg_iter, umem->nmap, 0) { + vaddr = page_address(sg_page_iter_page(&sg_iter.base)); ? if (!vaddr) { pr_warn("null vaddr\n"); err = -ENOMEM; @@ -208,7 +207,7 @@ int rxe_mem_init_user(struct rxe_pd *pd, u64 start, } buf->addr = (uintptr_t)vaddr; - buf->size = BIT(umem->page_shift); + buf->size = PAGE_SIZE; 1.8.3.1
Re: [PATCH] lib/scatterlist: Provide a DMA page iterator
On Fri, Jan 04, 2019 at 10:35:43PM +, Jason Gunthorpe wrote: > Commit 2db76d7c3c6d ("lib/scatterlist: sg_page_iter: support sg lists w/o > backing pages") introduced the sg_page_iter_dma_address() function without > providing a way to use it in the general case. If the sg_dma_len is not > equal to the dma_length callers cannot safely use the > for_each_sg_page/sg_page_iter_dma_address combination. > > Resolve this API mistake by providing a DMA specific iterator, > for_each_sg_dma_page(), that uses the right length so > sg_page_iter_dma_address() works as expected with all sglists. A new > iterator type is introduced to provide compile-time safety against wrongly > mixing accessors and iterators. [..] > > +/* > + * sg page iterator for DMA addresses > + * > + * This is the same as sg_page_iter however you can call > + * sg_page_iter_dma_address(@dma_iter) to get the page's DMA > + * address. sg_page_iter_page() cannot be called on this iterator. > + */ Does it make sense to have a variant of sg_page_iter_page() to get the page descriptor with this dma_iter? This can be used when walking DMA-mapped SG lists with for_each_sg_dma_page. > +struct sg_dma_page_iter { > + struct sg_page_iter base; > +}; > + >
Re: mutex and rcu list traversal idiosyncrasy
On Thu, Sep 13, 2018 at 05:55:44AM -0600, HÃ¥kon Bugge wrote: > Hi Faisal, > > > In commit f27b4746f378 ("i40iw: add connection management code") you have in > i40iw_add_mqh_6(): > > rtnl_lock(); > for_each_netdev_rcu(...) { > [] > } > rtnl_unlock(); > > Shouldn't this read: > rtnl_lock(); > for_each_netdev(...) { > [] > } > rtnl_unlock(); > Yes.
Re: [PATCH 1/3] infiniband: i40iw: Replace GFP_ATOMIC with GFP_KERNEL in i40iw_add_mqh_4
On Wed, Apr 11, 2018 at 10:53:13AM -0400, Dennis Dalessandro wrote: > On 4/11/2018 3:32 AM, Jia-Ju Bai wrote: > > i40iw_add_mqh_4() is never called in atomic context, because it > > calls rtnl_lock() that can sleep. > > > > Despite never getting called from atomic context, > > i40iw_add_mqh_4() calls kzalloc() with GFP_ATOMIC, > > which does not sleep for allocation. > > GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, > > which can sleep and improve the possibility of sucessful allocation. > > Just a general comment. I don't know that this is the greatest idea. I can > imagine instances where sleeping is OK as far as how the code is written, > but for performance reasons you would rather fail than sleep. > > As to whether that is the case here I'll let the i40iw folks comment. > In this case, the changes Jia made look safe and not in the perf. path. Thanks! > > This is found by a static analysis tool named DCNS written by myself. > > And I also manually check it. > You should probably post a pointer to your tool. > > -Denny
Re: [PATCH 3/3] infiniband: i40iw: Replace GFP_ATOMIC with GFP_KERNEL in i40iw_l2param_change
On Wed, Apr 11, 2018 at 03:33:06PM +0800, Jia-Ju Bai wrote: > i40iw_l2param_change() is never called in atomic context. > > i40iw_make_listen_node() is only set as ".l2_param_change" > in struct i40e_client_ops, and this function pointer is not called > in atomic context. > > Despite never getting called from atomic context, > i40iw_l2param_change() calls kzalloc() with GFP_ATOMIC, > which does not sleep for allocation. > GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, > which can sleep and improve the possibility of sucessful allocation. > > This is found by a static analysis tool named DCNS written by myself. > And I also manually check it. > > Signed-off-by: Jia-Ju Bai > --- > Acked-by: Shiraz Saleem
Re: [PATCH 1/3] infiniband: i40iw: Replace GFP_ATOMIC with GFP_KERNEL in i40iw_add_mqh_4
On Wed, Apr 11, 2018 at 03:32:25PM +0800, Jia-Ju Bai wrote: > i40iw_add_mqh_4() is never called in atomic context, because it > calls rtnl_lock() that can sleep. > > Despite never getting called from atomic context, > i40iw_add_mqh_4() calls kzalloc() with GFP_ATOMIC, > which does not sleep for allocation. > GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, > which can sleep and improve the possibility of sucessful allocation. > > This is found by a static analysis tool named DCNS written by myself. > And I also manually check it. > > Signed-off-by: Jia-Ju Bai > --- Acked-by: Shiraz Saleem
Re: [PATCH 2/3] infiniband: i40iw: Replace GFP_ATOMIC with GFP_KERNEL in i40iw_make_listen_node
On Wed, Apr 11, 2018 at 03:32:48PM +0800, Jia-Ju Bai wrote: > i40iw_make_listen_node() is never called in atomic context. > > i40iw_make_listen_node() is only called by i40iw_create_listen, which is > set as ".create_listen" in struct iw_cm_verbs. > > Despite never getting called from atomic context, > i40iw_make_listen_node() calls kzalloc() with GFP_ATOMIC, > which does not sleep for allocation. > GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, > which can sleep and improve the possibility of sucessful allocation. > > This is found by a static analysis tool named DCNS written by myself. > And I also manually check it. > > Signed-off-by: Jia-Ju Bai > --- > Acked-by: Shiraz Saleem
Re: [PATCH v4 3/6] RDMA/i40iw: Eliminate duplicate barriers on weakly-ordered archs
On Mon, Mar 19, 2018 at 08:47:45PM -0600, Sinan Kaya wrote: > Code includes wmb() followed by writel(). writel() already has a barrier on > some architectures like arm64. > > This ends up CPU observing two barriers back to back before executing the > register write. > > Create a new wrapper function with relaxed write operator. Use the new > wrapper when a write is following a wmb(). > > Since code already has an explicit barrier call, changing writel() to > writel_relaxed(). > > Signed-off-by: Sinan Kaya Acked-by: Shiraz Saleem
Re: [PATCH] 40iw: include linux/irq.h
On Tue, Mar 13, 2018 at 01:06:20PM +0100, Arnd Bergmann wrote: > We get a build failure on ARM unless the header is included explicitly: > > drivers/infiniband/hw/i40iw/i40iw_verbs.c: In function > 'i40iw_get_vector_affinity': > drivers/infiniband/hw/i40iw/i40iw_verbs.c:2747:9: error: implicit declaration > of function 'irq_get_affinity_mask'; did you mean > 'irq_create_affinity_masks'? [-Werror=implicit-function-declaration] > return irq_get_affinity_mask(msix_vec->irq); > ^ > irq_create_affinity_masks > drivers/infiniband/hw/i40iw/i40iw_verbs.c:2747:9: error: returning 'int' from > a function with return type 'const struct cpumask *' makes pointer from > integer without a cast [-Werror=int-conversion] > return irq_get_affinity_mask(msix_vec->irq); > > Fixes: 7e952b19eb63 ("i40iw: Implement get_vector_affinity API") > Signed-off-by: Arnd Bergmann > --- Looks like you beat me to it. 0-day caught this on rdma-nxt and I was going to send the fix today. Typo in the subject. 40iw --> i40iw. Othwerwise, looks good. Thanks for the patch, Arnd! Shiraz
Re: [PATCH] i40iw: Replace mdelay with msleep in i40iw_wait_pe_ready
On Sun, Dec 24, 2017 at 03:39:40AM -0700, Jia-Ju Bai wrote: > i40iw_wait_pe_ready is not called in an interrupt handler > nor holding a spinlock. > The function mdelay in it can be replaced with msleep, > to reduce busy wait. > > Signed-off-by: Jia-Ju Bai > --- > drivers/infiniband/hw/i40iw/i40iw_main.c |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > Thanks! Acked-by: Shiraz Saleem
Re: [PATCH 1/2] i40iw: remove unused 'timeval' struct member
On Mon, Nov 27, 2017 at 04:38:09AM -0700, Arnd Bergmann wrote: > There is a stale entry in i40iw_cm_tcp_context that apparently > was copied from the 'nes' driver but never used in i40iw. > I'm trying to kill off all users of timeval as part of the > y2038-safety work, so let's just remove this one. > > Signed-off-by: Arnd Bergmann > --- Thanks! Acked-by: Shiraz Saleem
Re: [PATCH] i40w: Remove garbage at end of INFINIBAND_I40IW Kconfig section
On Sun, Nov 19, 2017 at 07:59:21PM +0100, Geert Uytterhoeven wrote: > Remove leftover garbage (containing Kconfig dependencies for another > symbol?) > > Signed-off-by: Geert Uytterhoeven > --- Acked-by: Shiraz Saleem
Re: [PATCH] RDMA/i40iw: Convert timers to use timer_setup()
On Fri, Oct 06, 2017 at 06:17:23PM -0500, Shiraz Saleem wrote: > On Wed, Oct 04, 2017 at 05:45:41PM -0700, Kees Cook wrote: > > In preparation for unconditionally passing the struct timer_list pointer to > > all timer callbacks, switch to using the new timer_setup() and from_timer() > > to pass the timer pointer explicitly. > > > > Cc: Faisal Latif > > Cc: Shiraz Saleem > > Cc: Doug Ledford > > Cc: Sean Hefty > > Cc: Hal Rosenstock > > Cc: linux-r...@vger.kernel.org > > Cc: Thomas Gleixner > > Signed-off-by: Kees Cook > > --- > > This requires commit 686fef928bba ("timer: Prepare to change timer > > callback argument type") in v4.14-rc3, but should be otherwise > > stand-alone. > > --- > > Patch looks ok. Did some minimal testing and looks good. > > Acked-by: Shiraz Saleem > Hi Kees, Sorry, I didnt notice this earlier, but, you made the change only to the stats timer to use the new timer init APIs. Can you do the same for the cm_timer and terminate_timer too for i40iw; so that things are consistent? [ssaleem@linbuild6081 i40iw]$ grep "setup_timer" * i40iw_cm.c: setup_timer(&cm_core->tcp_timer, i40iw_cm_timer_tick, i40iw_utils.c: setup_timer(&iwqp->terminate_timer, i40iw_terminate_timeout, i40iw_utils.c: setup_timer(&devstat->stats_timer, i40iw_hw_stats_timeout, Shiraz
Re: [PATCH] RDMA/i40iw: Convert timers to use timer_setup()
On Wed, Oct 04, 2017 at 05:45:41PM -0700, Kees Cook wrote: > In preparation for unconditionally passing the struct timer_list pointer to > all timer callbacks, switch to using the new timer_setup() and from_timer() > to pass the timer pointer explicitly. > > Cc: Faisal Latif > Cc: Shiraz Saleem > Cc: Doug Ledford > Cc: Sean Hefty > Cc: Hal Rosenstock > Cc: linux-r...@vger.kernel.org > Cc: Thomas Gleixner > Signed-off-by: Kees Cook > --- > This requires commit 686fef928bba ("timer: Prepare to change timer > callback argument type") in v4.14-rc3, but should be otherwise > stand-alone. > --- Patch looks ok. Did some minimal testing and looks good. Acked-by: Shiraz Saleem
Re: [PATCH] i40iw: make some structures const
On Mon, Aug 28, 2017 at 09:51:23PM +0530, Bhumika Goyal wrote: > Make some structures const as they are only used during a copy > operation. > > Signed-off-by: Bhumika Goyal Thanks! Acked-by: Shiraz Saleem
Re: [PATCH] i40iw: Simplify code
On Sun, Jul 16, 2017 at 01:09:23PM +0200, Christophe JAILLET wrote: > Axe a few lines of code and re-use existing error handling path to avoid > code duplication. > Acked-by: Shiraz Saleem
Re: [PATCH] i40iw: Add a value assignment to avoid sleep-in-atomic bug caused by uninitialized value
On Thu, Jun 01, 2017 at 10:11:16AM +0800, Jia-Ju Bai wrote: > The value "cqp_request->waiting" indicates whether the sleeping operation > should be performed, and it is not assigned in i40iw_get_cqp_request, so > the driver may sleep in interrupt handling. The function call path is: > > i40iw_dpc (tasklet_init indicates it handles interrupt) > i40iw_process_aeq > i40iw_next_iw_state > i40iw_hw_modify_qp (call i40iw_get_cqp_request) > i40iw_handle_cqp_op > i40iw_wait_event --> may sleep > > To fix it, "cqp_request->waiting" is assigned in "else" branch in > i40iw_get_cqp_request. > > Signed-off-by: Jia-Ju Bai > --- > drivers/infiniband/hw/i40iw/i40iw_utils.c |1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/infiniband/hw/i40iw/i40iw_utils.c > b/drivers/infiniband/hw/i40iw/i40iw_utils.c > index 409a378..0f4e633 100644 > --- a/drivers/infiniband/hw/i40iw/i40iw_utils.c > +++ b/drivers/infiniband/hw/i40iw/i40iw_utils.c > @@ -326,6 +326,7 @@ struct i40iw_cqp_request *i40iw_get_cqp_request(struct > i40iw_cqp *cqp, bool wait > cqp_request->waiting = true; > } else { > atomic_set(&cqp_request->refcount, 1); > + cqp_request->waiting = false; > } > return cqp_request; > } > -- Hi Jia - cqp_request->waiting is always initialized. If cqp_request object is allocated using kzalloc, it is initialized to 0. For those cqp_request object that are retrieved from the cqp list 'cqp->cqp_avail_reqs', the waiting flag is set to false when it is put back on the list via i40iw_free_cqp_request. So there should be no issue here. Shiraz
Re: [PATCH] infiniband: hw: i40iw: fix duplicated code for different branches
On Thu, May 18, 2017 at 01:11:17PM -0500, Gustavo A. R. Silva wrote: > Refactor code to avoid identical code for different branches. > > Addresses-Coverity-ID: 1357356 > Reviewed-by: Yuval Shaia > Signed-off-by: Gustavo A. R. Silva > --- > drivers/infiniband/hw/i40iw/i40iw_virtchnl.c | 5 + > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/drivers/infiniband/hw/i40iw/i40iw_virtchnl.c > b/drivers/infiniband/hw/i40iw/i40iw_virtchnl.c > index f4d1368..48fd327 100644 > --- a/drivers/infiniband/hw/i40iw/i40iw_virtchnl.c > +++ b/drivers/infiniband/hw/i40iw/i40iw_virtchnl.c > @@ -443,10 +443,7 @@ enum i40iw_status_code i40iw_vchnl_recv_pf(struct > i40iw_sc_dev *dev, > if (!dev->vchnl_up) > return I40IW_ERR_NOT_READY; > if (vchnl_msg->iw_op_code == I40IW_VCHNL_OP_GET_VER) { > - if (vchnl_msg->iw_op_ver != I40IW_VCHNL_OP_GET_VER_V0) > - vchnl_pf_send_get_ver_resp(dev, vf_id, vchnl_msg); > - else > - vchnl_pf_send_get_ver_resp(dev, vf_id, vchnl_msg); > + vchnl_pf_send_get_ver_resp(dev, vf_id, vchnl_msg); > return I40IW_SUCCESS; > } > for (iw_vf_idx = 0; iw_vf_idx < I40IW_MAX_PE_ENABLED_VF_COUNT; > iw_vf_idx++) { > -- Acked-by: Shiraz Saleem