Re: Mlx4: BUG: unable to handle kernel at ffffffffa02be210

2015-07-08 Thread Or Gerlitz
On 7/8/2015 12:42 PM, Jack Wang wrote: We're using MLX OFED 2.4-1.0.4 together on top of 3.18.14. So this list is for upstream things.. still, let's see We hit bug below spontaneously, our test trigger this bug around 1 in 5 times. and what is your test if I may ask?! HCA 'mlx4_0' CA

RE: [PATCH v7 4/4] IB/sa: Route SA pathrecord query through netlink

2015-07-08 Thread Wan, Kaike
-----Original Message----- From: Jason Gunthorpe [mailto:jguntho...@obsidianresearch.com] Sent: Friday, July 03, 2015 5:38 PM To: Wan, Kaike Cc: linux-rdma@vger.kernel.org; Fleck, John; Weiny, Ira Subject: Re: [PATCH v7 4/4] IB/sa: Route SA pathrecord query through netlink On Tue, Jun 30,

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread Sagi Grimberg
On 7/8/2015 1:20 PM, 'Christoph Hellwig' wrote: On Wed, Jul 08, 2015 at 01:05:28PM +0300, Sagi Grimberg wrote: If we agree to consolidate on a single MR allocation API, I don't see how this wrapper is moving us forward. But if you guys prefer to have it then I don't have a hard objection.

Re: [PATCH 0/2] update ocrdma to dual license

2015-07-08 Thread Christoph Hellwig
On Wed, Jul 08, 2015 at 12:26:56PM +0530, Devesh Sharma wrote: We (Emulex/Avago) were lobbied by the Open-Fabrics Alliance (OFA) to change the licensing from just GPLv2 to a dual GPLv2/BSD license. They would prefer the elements in the OFED stack all be dual licensed. We're trying to move to

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread 'Christoph Hellwig'
On Tue, Jul 07, 2015 at 09:05:15AM -0500, Steve Wise wrote: I took the feedback from Christoph and Jason to mean I should remove ib_get_dma_mr() entirely and pull its guts into rdma_get_dma_mr(), and change all the users of ib_get_dma_mr() to use rdma_get_dma_mr(). So the net result isn't a

Re: Mlx4: BUG: unable to handle kernel at ffffffffa02be210

2015-07-08 Thread Or Gerlitz
On 7/8/2015 3:47 PM, Jack Wang wrote: static void mlx4_ib_cq_comp(struct mlx4_cq *cq) { struct ib_cq *ibcq = &to_mibcq(cq)->ibcq; ibcq->comp_handler(ibcq, ibcq->cq_context); } Looks like cq use-after-free? I have no idea where. see if you have in the code base you're using (why not

Re: Mlx4: BUG: unable to handle kernel at ffffffffa02be210

2015-07-08 Thread Jack Wang
Hi Or, We're testing our RDMA kernel module; the test loads the module, creates an RDMA connection, does some traffic, and unloads the module. No mlx4_en is involved; in fact we disable mlx4_en in the kernel build because we don't need it. I did some debugging with gdb: (gdb) list *mlx4_test_interrupts+0x84a 0xb0ea

Re: Mlx4: BUG: unable to handle kernel at ffffffffa02be210

2015-07-08 Thread Jack Wang
Thanks for your time. Looks the last one is missing in OFED 2.4 driver, I just checked the history of mainline commit bf1bac5b7882daa41249f85fbc97828f0597de5c Author: Eli Cohen e...@dev.mellanox.co.il Date: Thu Oct 23 15:57:27 2014 +0300 net/mlx4_core: Call synchronize_irq() before

[PATCH] IB/core: Destroy multcast_idr on moudle exit

2015-07-08 Thread Johannes Thumshirn
Destroy multcast_idr on module exit, reclaiming the allocated memory. This was detected by the following semantic patch (written by Luis Rodriguez mcg...@suse.com) SmPL @ defines_module_init @ declarer name module_init, module_exit; declarer name DEFINE_IDR; identifier init; @@

[PATCH] IB/core: Destroy ocrdma_dev_id IDR on module exit

2015-07-08 Thread Johannes Thumshirn
Destroy ocrdma_dev_id IDR on module exit, reclaiming the allocated memory. This was detected by the following semantic patch (written by Luis Rodriguez mcg...@suse.com) SmPL @ defines_module_init @ declarer name module_init, module_exit; declarer name DEFINE_IDR; identifier init; @@

Re: Mlx4: BUG: unable to handle kernel at ffffffffa02be210

2015-07-08 Thread Or Gerlitz
On Wed, Jul 8, 2015 at 5:07 PM, Jack Wang xjtu...@gmail.com wrote: Looks the last one is missing in OFED 2.4 driver, I just checked the history of mainline commit bf1bac5b7882daa41249f85fbc97828f0597de5c Author: Eli Cohen e...@dev.mellanox.co.il Date: Thu Oct 23 15:57:27 2014 +0300

RE: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread Hefty, Sean
I am still not clear if all of us agree that we need it. Sean and Steve had some disclaimers... A single entry point doesn't help a whole lot if the app must deal with different behavior based on how the API is used. We have a single entry point for post_send, for example, and that makes

Re: [PATCH 0/2] update ocrdma to dual license

2015-07-08 Thread Christoph Hellwig
On Wed, Jul 08, 2015 at 04:15:00PM -0400, Doug Ledford wrote: On 07/08/2015 04:02 PM, Christoph Hellwig wrote: So how about someone tells OFED to stop trying to enforce this BS? Unfortunately, simply not enforcing a bylaw of a multi-company organization isn't really a valid option, you

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread 'Christoph Hellwig'
On Wed, Jul 08, 2015 at 01:08:42PM -0600, Jason Gunthorpe wrote: Then, what is left is all remote MRs and maybe it will be clearer what to do about them then... From looking at that for a while the APIs needed seem pretty simple to me from a consumer perspective: struct rdma_mr

Re: [PATCH v1 02/12] IB/core: Find the network device matching connection parameters

2015-07-08 Thread Jason Gunthorpe
On Mon, Jun 22, 2015 at 03:42:31PM +0300, Haggai Eran wrote: +/** + * ib_get_net_dev_by_params() - Return the appropriate net_dev + * for a received CM request + * @dev: An RDMA device on which the request has been received. + * @port:Port number on the RDMA device. + * @pkey:The

Re: [PATCH 0/2] update ocrdma to dual license

2015-07-08 Thread Doug Ledford
On 07/08/2015 04:25 PM, Christoph Hellwig wrote: On Wed, Jul 08, 2015 at 04:15:00PM -0400, Doug Ledford wrote: On 07/08/2015 04:02 PM, Christoph Hellwig wrote: So how about someone tells OFED to stop trying to enforce this BS? Unfortunately, simply not enforcing a bylaw of a multi-company

Re: [PATCH] IB/core: Destroy ocrdma_dev_id IDR on module exit

2015-07-08 Thread Doug Ledford
On 07/08/2015 11:23 AM, Johannes Thumshirn wrote: Destroy ocrdma_dev_id IDR on module exit, reclaiming the allocated memory. Thanks, applied. -- Doug Ledford dledf...@redhat.com GPG KeyID: 0E572FDD signature.asc Description: OpenPGP digital signature

Re: [PATCH 1/1] infiniband: Remove redundant NULL check before kfree

2015-07-08 Thread Doug Ledford
On 07/08/2015 12:23 AM, Maninder Singh wrote: Hello, + for (i = 0; i < dev->caps.num_ports; i++) + kfree(dm[i]); goto out; } } -- 1.7.9.5 If you are going to change this, you might as well make it 100%

Re: [PATCH 0/2] update ocrdma to dual license

2015-07-08 Thread Doug Ledford
On 07/03/2015 11:38 AM, Weiny, Ira wrote: Christoph, Apologies, I misspoke in my response to you. There was a study of the code and we thought it was reasonable to post. However, in retrospect we should have used more due diligence. We're going back to seek explicit consent from key

Re: [PATCH 0/2] update ocrdma to dual license

2015-07-08 Thread Christoph Hellwig
So how about someone tells OFED to stop trying to enforce this BS? This just confirms my bias that Open-Fabrics Alliance are a bunch of idiots making life hard, similar to all their horrible OFED driver distributions that created a total mess for everyone involved.

Re: [PATCH 0/2] update ocrdma to dual license

2015-07-08 Thread Doug Ledford
On 07/08/2015 04:02 PM, Christoph Hellwig wrote: So how about someone tells OFED to stop trying to enforce this BS? Unfortunately, simply not enforcing a bylaw of a multi-company organization isn't really a valid option, you should know that. You have to work to change the bylaw, which usually

Re: [PATCH v1 01/12] IB/core: pass client data to remove() callbacks

2015-07-08 Thread Jason Gunthorpe
On Mon, Jun 22, 2015 at 03:42:30PM +0300, Haggai Eran wrote: An ib_client callback that is called with the lists_rwsem locked only for read is protected from changes to the IB client lists, but not from ib_unregister_device() freeing its client data. This is because ib_unregister_device() will

RE: [PATCH 36/41] IB/hfi1: add low level page locking

2015-07-08 Thread Marciniszyn, Mike
anything wrong with the umem services provided by the IB core that requires this implementation? what? The current level of the API is mismatched with the PSM SDMA. The ib_umem api: - maps an SG list which isn't required by PSM since DMA mapping is done by the low level SDMA - the mapping

Re: [PATCH 0/2] update ocrdma to dual license

2015-07-08 Thread Doug Ledford
On 07/08/2015 04:01 AM, Christoph Hellwig wrote: On Wed, Jul 08, 2015 at 12:26:56PM +0530, Devesh Sharma wrote: We (Emulex/Avago) were lobbied by the Open-Fabrics Alliance (OFA) to change the licensing from just GPLv2 to a dual GPLv2/BSD license. They would prefer the elements in the OFED

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread 'Christoph Hellwig'
On Wed, Jul 08, 2015 at 01:32:05PM -0700, 'Christoph Hellwig' wrote: /* updates *sg if the SG couldn't be fully registered due to offsets */ int rdma_register_sg(struct rdma_mr *mr, struct scatterlist **sg, u32 *pkey, u32 *offset, u32 *len); plus an enum dma_data_direction

Re: [PATCH] IB/core: Destroy multcast_idr on moudle exit

2015-07-08 Thread Doug Ledford
On 07/08/2015 11:21 AM, Johannes Thumshirn wrote: Destroy multcast_idr on module exit, reclaiming the allocated memory. Thanks, applied. -- Doug Ledford dledf...@redhat.com GPG KeyID: 0E572FDD

RE: [PATCH 06/41] IB/hfi1: add char device instantiation code

2015-07-08 Thread Marciniszyn, Mike
netlink is a reasonable low speed format to use for this kind of serialization, either via the common mux or via your own char device. A couple of follow-up netlink questions. 1. I assume you are talking about generic netlink vs. say the RDMA netlink. The generic netlink handles dynamic

Re: [PATCH v1 01/12] IB/core: pass client data to remove() callbacks

2015-07-08 Thread Jason Gunthorpe
On Wed, Jul 08, 2015 at 02:29:10PM -0600, Jason Gunthorpe wrote: On Mon, Jun 22, 2015 at 03:42:30PM +0300, Haggai Eran wrote: An ib_client callback that is called with the lists_rwsem locked only for read is protected from changes to the IB client lists, but not from ib_unregister_device()

Re: [PATCH v2] infiniband: free only allocated items

2015-07-08 Thread Doug Ledford
On 07/08/2015 12:13 AM, Maninder Singh wrote: o If allocation of dm fails, no need to free it. o Free only allocated items. I've taken the patch, but I reworked your commit message. The v1 version of the patch had a more correct commit message, but it could have used a little rework as well.
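The two bullet points in the quoted commit message can be illustrated with a small userspace sketch. This is not the driver code from the patch: `struct demux`, `alloc_all()` and the `fail_at` failure-injection parameter are hypothetical names, and `malloc()`/`free()` stand in for the kernel allocators. The idea is only that the error path unwinds the entries that were actually allocated and nothing else.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-port object; not the driver's type. */
struct demux {
	int port;
};

/*
 * Allocate one object per port. fail_at injects a failure at that index
 * (pass -1 for no failure) so the unwind path can be exercised.
 * Returns 0 on success, -1 on failure with nothing left allocated.
 */
static int alloc_all(struct demux **dm, int num_ports, int fail_at)
{
	int i;

	for (i = 0; i < num_ports; i++) {
		dm[i] = (i == fail_at) ? NULL : malloc(sizeof(*dm[i]));
		if (!dm[i])
			goto err_unwind;
		dm[i]->port = i + 1;
	}
	return 0;

err_unwind:
	/* Slots i..num_ports-1 were never allocated; free only 0..i-1. */
	while (--i >= 0) {
		free(dm[i]);
		dm[i] = NULL;
	}
	return -1;
}
```

Like `kfree()`, `free()` accepts a NULL pointer, so a NULL check before the call is not needed in either case.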

RE: [PATCH 06/41] IB/hfi1: add char device instantiation code

2015-07-08 Thread Marciniszyn, Mike
so what's the role of the char-device? why should a low-level driver which is part of the upstream RDMA stack contain a char-device? Or. Contains additional char devices: - PSM character device - diagnostic character devices The nature of the hardware requires both of these additional

Re: [PATCH 0/2] update ocrdma to dual license

2015-07-08 Thread Devesh Sharma
On Fri, Jul 3, 2015 at 9:08 PM, Weiny, Ira ira.we...@intel.com wrote: Christoph, Apologies, I misspoke in my response to you. There was a study of the code and we thought it was reasonable to post. However, in retrospect we should have used more due diligence. We're going back to seek

Re: [PATCH 0/2] update ocrdma to dual license

2015-07-08 Thread Devesh Sharma
Hi Christoph, On Fri, Jul 3, 2015 at 9:22 PM, Christoph Hellwig h...@infradead.org wrote: On Fri, Jul 03, 2015 at 03:38:55PM +, Weiny, Ira wrote: Christoph, Apologies, I misspoke in my response to you. There was a study of the code and we thought it was reasonable to post.

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread Sagi Grimberg
On 7/8/2015 12:36 AM, Jason Gunthorpe wrote: On Tue, Jul 07, 2015 at 07:27:47PM +0300, Sagi Grimberg wrote: Doesn't it look odd to you? Sure, but the oddness is that rdma_device_access_flags exists at all, not the wrapper. The wrapper is what we want the API to look like, I don't

[PATCH for-4.2] IB/mlx4: Fix and optimize SRIOV slave init

2015-07-08 Thread Doug Ledford
In mlx4_main.c:do_slave_init(), the function is supposed to queue up each work struct. However, it checks to make sure the sriov support isn't going down first. When it is going down, it doesn't queue up the work struct, which results in us leaking the work struct at the end of the function. As
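The leak described above can be sketched in userspace C. This is an illustration under stated assumptions, not the patch itself: `struct slave_work`, `mock_queue_work()` and `do_slave_init_sketch()` are hypothetical names, and the mock queue simply frees each item to model the worker taking ownership. The point is that an item that is allocated but not queued has no other owner, so the function that skipped the queueing has to free it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-port work item. */
struct slave_work {
	int port;
};

static int queued;	/* number of items handed to the mock queue */

/* Mock queue: takes ownership and, for brevity, consumes the item at once. */
static void mock_queue_work(struct slave_work *w)
{
	queued++;
	free(w);
}

/*
 * Allocate one work item per port, then either queue each item or,
 * when the device is going down, free it. Returns the number of items
 * freed locally, or -1 on allocation failure.
 */
static int do_slave_init_sketch(int num_ports, bool going_down)
{
	struct slave_work **dm;
	int i, freed = 0;

	dm = calloc(num_ports, sizeof(*dm));
	if (!dm)
		return -1;

	for (i = 0; i < num_ports; i++) {
		dm[i] = malloc(sizeof(*dm[i]));
		if (!dm[i]) {
			while (--i >= 0)
				free(dm[i]);
			free(dm);
			return -1;
		}
		dm[i]->port = i + 1;
	}

	for (i = 0; i < num_ports; i++) {
		if (!going_down) {
			mock_queue_work(dm[i]);	/* ownership passes to the queue */
		} else {
			free(dm[i]);		/* never queued: free here or it leaks */
			freed++;
		}
	}

	free(dm);	/* the pointer array itself is always released */
	return freed;
}
```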

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread Jason Gunthorpe
On Wed, Jul 08, 2015 at 05:38:05PM -0400, Tom Talpey wrote: On 7/8/2015 3:08 PM, Jason Gunthorpe wrote: The MR stuff was never really designed, the HW people provided some capability and the SW side just raw exposed it, thoughtlessly. Jason, I don't disagree that the API can be improved. I

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread Jason Gunthorpe
On Wed, Jul 08, 2015 at 01:32:05PM -0700, 'Christoph Hellwig' wrote: On Wed, Jul 08, 2015 at 01:08:42PM -0600, Jason Gunthorpe wrote: Then, what is left is all remote MRs and maybe it will be clearer what to do about them then... From looking at that for a while the APIs needed seem pretty

Re: [PATCH] IB: Add rdma_cap_ib_switch helper and use where appropriate

2015-07-08 Thread Doug Ledford
On 06/29/2015 09:57 AM, Hal Rosenstock wrote: Pursuant to Liran's comments on node_type on linux-rdma mailing list: In an effort to reform the RDMA core and ULPs to minimize use of node_type in struct ib_device, an additional bit is added to struct ib_device for is_switch (IB switch). This

Re: [PATCH 0/2] lockdep warning fixes

2015-07-08 Thread Doug Ledford
On 07/07/2015 10:45 AM, Haggai Eran wrote: Hi, These two patches fix a couple of lockdep warnings I ran into, in IPoIB and RDMA CM. Regards, Haggai Haggai Eran (2): IB/ucma: Fix lockdep warning in ucma_lock_files IB/ipoib: Prevent lockdep warning in __ipoib_ib_dev_flush

RE: [PATCH 00/41] Add OPA gen1 driver

2015-07-08 Thread Marciniszyn, Mike
Ummm.. Could we get some more descriptions as to what this code is for? The next set will contain a great deal more background info in the cover letter. Do we have a new OmniPath protocol here as well or is it IB? Which standards are followed? This will be covered in the additional

Re: [PATCH V3] IB/mad: Fix 0-day build

2015-07-08 Thread Doug Ledford
On 06/25/2015 12:04 PM, ira.we...@intel.com wrote: From: Ira Weiny ira.we...@intel.com The define OPA_LID_PERMISSIVE is big endian and was compared to cpu value opa_drslid. 0-day build caught this while building with the OPA (hfi1) driver which was recently sent to the list. Fixes:
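The class of bug described in the commit message, comparing a big-endian constant against a CPU-order value, can be shown with a short userspace sketch. The constant name and its value below are hypothetical and chosen only so that the two byte orders differ on a little-endian host; `htonl()` stands in for the kernel's byte-order helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>	/* htonl() */

/* Hypothetical constant stored in big-endian (wire) order. */
#define LID_PERMISSIVE_BE	htonl(0x0000FFFFu)

/*
 * lid_cpu is in host byte order. Convert it to wire order before
 * comparing, so both operands use the same representation on any host.
 */
static int is_permissive(uint32_t lid_cpu)
{
	return htonl(lid_cpu) == LID_PERMISSIVE_BE;
}
```

Comparing `lid_cpu` directly against the big-endian constant would only give the right answer on hosts where the two byte orders happen to coincide.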

Re: [PATCH v2 39/49] IB/hfi1: add sysfs routines

2015-07-08 Thread ira.weiny
On Wed, Jul 08, 2015 at 10:32:44PM +, Marciniszyn, Mike wrote: These sysfs entries are used by PSM2 to form packets from user space. You didn't explain what's SC and what's SC-to-VL and why PSM2 can't talk to the SM to query that. Like IB the SL to SC to VL maps are available via SMA

Re: [PATCH for-4.2 1/1] RDMA/nes: Fix for incorrect recording of the MAC address

2015-07-08 Thread Doug Ledford
On 07/02/2015 01:52 PM, Tatyana Nikolova wrote: Fix for incorrect recording of the MAC address Signed-off-by: Tatyana Nikolova tatyana.e.nikol...@intel.com --- drivers/infiniband/hw/nes/nes_hw.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git

Re: [PATCH v1 03/12] IB/ipoib: Return IPoIB devices matching connection parameters

2015-07-08 Thread Jason Gunthorpe
On Mon, Jun 22, 2015 at 03:42:32PM +0300, Haggai Eran wrote: + if (net_dev) { + ipoib_warn(priv, "matching net_dev found: %s\n", + net_dev->name); Is that a debug print? + default: + dev_warn(dev->dev, "duplicate IP

Re: [PATCH for-4.2 1/1] RDMA/nes: Fix for resolving the neigh

2015-07-08 Thread Doug Ledford
On 07/02/2015 01:49 PM, Tatyana Nikolova wrote: Neighbor resolution doesn't work without this fix Signed-off-by: Tatyana Nikolova tatyana.e.nikol...@intel.com Thanks, applied. -- Doug Ledford dledf...@redhat.com GPG KeyID: 0E572FDD

RE: [PATCH v2 39/49] IB/hfi1: add sysfs routines

2015-07-08 Thread Marciniszyn, Mike
These sysfs entries are used by PSM2 to form packets from user space. You didn't explain what's SC and what's SC-to-VL and why PSM2 can't talk to the SM to query that. SC stands for Service Channel and is a Fabric wide concept. These tables are used to perform the following mapping:
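The message is cut off before the mapping itself, so the following is only a guess at its general shape, based on the SL to SC to VL chain mentioned elsewhere in the thread. Table size, table contents and the function name `sl_to_vl()` are all made up for illustration; none of them come from the hfi1 driver.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_ENTRIES 4	/* hypothetical table size, kept small for illustration */

/* Made-up table contents, for illustration only. */
static const uint8_t sl_to_sc[NUM_ENTRIES] = { 0, 1, 1, 3 };
static const uint8_t sc_to_vl[NUM_ENTRIES] = { 0, 0, 1, 2 };

/* Resolve SL -> SC -> VL; returns -1 if an index is out of range. */
static int sl_to_vl(unsigned int sl)
{
	unsigned int sc;

	if (sl >= NUM_ENTRIES)
		return -1;
	sc = sl_to_sc[sl];
	if (sc >= NUM_ENTRIES)
		return -1;
	return sc_to_vl[sc];
}
```

A two-step lookup like this is why user space would need both tables to fill in packet fields on its own.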

Re: [PATCH for-4.2 1/1] RDMA/core: Fixes for port mapper client registration

2015-07-08 Thread Doug Ledford
On 07/02/2015 01:47 PM, Tatyana Nikolova wrote: Fixes to allow clients to make remove mapping requests, after they have provided the user space service with the mapping information, they are using when the service is restarted. 1) Adding IWPM_REG_VALID, IWPM_REG_INCOMPL and IWPM_REG_UNDEF

Re: [PATCH 0/2] update ocrdma to dual license

2015-07-08 Thread Christoph Hellwig
On Wed, Jul 08, 2015 at 03:33:03PM -0400, Doug Ledford wrote: I am not a lawyer, but this has been explained to me on numerous occasions, so I relay the layman's interpretation here: No, you don't always need everyone's approval. There are contributions that are not legally copyright

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread Tom Talpey
On 7/8/2015 3:08 PM, Jason Gunthorpe wrote: The MR stuff was never really designed, the HW people provided some capability and the SW side just raw exposed it, thoughtlessly. Jason, I don't disagree that the API can be improved. I have some responses to your statements below though. Why is

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread Jason Gunthorpe
On Wed, Jul 08, 2015 at 10:29:56AM +0300, Sagi Grimberg wrote: Sure, but the oddness is that rdma_device_access_flags exists at all, not the wrapper. The wrapper is what we want the API to look like, I don't necessarily agree. The API we'd want is a single API at all the call sites to all