Re: [PATCH v3 06/17] IB/core: Add support for extended query device caps

2014-12-16 Thread Haggai Eran
On 16/12/2014 14:33, Yann Droneaud wrote:
> On Thursday, 11 December 2014 at 17:04 +0200, Haggai Eran wrote:
>>  static inline int ib_copy_to_udata(struct ib_udata *udata, void *src, size_t len)
>>  {
>> -	return copy_to_user(udata->outbuf, src, len) ? -EFAULT : 0;
>> +	size_t copy_sz;
>> +
>> +	copy_sz = min_t(size_t, len, udata->outlen);
>> +	return copy_to_user(udata->outbuf, src, copy_sz) ? -EFAULT : 0;
>>  }
> 
> 
> This is not the place to do this. Judging from '[PATCH v3 07/17]
> IB/core: Add flags for on demand paging support', I guess the purpose
> of this change is to handle uverbs calls from a userspace program
> using a previous, shorter ABI.

Yes, that was my intention.

> 
> But that hides bugs where userspace passes the wrong buffer or size
> in any of the other uverbs calls.
> 
> It cannot work that way.
> 
> In a previous patchset [1], I suggested adding a check to
> ib_copy_{from,to}_udata() [2][3] so that the kernel never reads or
> writes past the boundaries of the buffer provided by userspace: on a
> size mismatch, an error would be returned to userspace.
> 
> With the suggested change here, buffer overflow won't happen,
> but the error is silently ignored, allowing uverb to return a
> partial result, which is likely not expected by userspace as
> it's a bit difficult to handle it gracefully.
> 
> So this has to be removed, and a check on userspace response
> buffer must be added to ib_uverbs_ex_query_device() instead.
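
To make the difference between the two policies concrete, here is a minimal userspace model, not kernel code. `struct udata_model` and both helper names are made up for illustration, `memcpy` stands in for `copy_to_user`, and `-ENOSPC` is only an assumed error code; the thread does not settle which error should be returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct ib_udata: only the output side. */
struct udata_model {
	void   *outbuf;   /* response buffer supplied by userspace */
	size_t  outlen;   /* size userspace claims for that buffer */
};

/* Policy from the patch: clamp to outlen and report success. */
static int copy_to_udata_truncating(struct udata_model *u,
				    const void *src, size_t len)
{
	size_t copy_sz = len < u->outlen ? len : u->outlen;

	memcpy(u->outbuf, src, copy_sz);
	return 0;
}

/* Policy from the review: refuse a response that does not fit. */
static int copy_to_udata_strict(struct udata_model *u,
				const void *src, size_t len)
{
	if (len > u->outlen)
		return -ENOSPC;   /* assumed error code */

	memcpy(u->outbuf, src, len);
	return 0;
}

/* Exercise both policies with a response twice the buffer size. */
static void check_policies(void)
{
	unsigned char resp[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
	unsigned char small[4] = { 0 };
	struct udata_model u = { small, sizeof(small) };

	/* Truncating copy: caller sees success but only half the data. */
	assert(copy_to_udata_truncating(&u, resp, sizeof(resp)) == 0);
	assert(memcmp(small, resp, sizeof(small)) == 0);

	/* Strict copy: the size mismatch is reported to the caller. */
	assert(copy_to_udata_strict(&u, resp, sizeof(resp)) == -ENOSPC);
}
```

In this model the truncating helper cannot tell an old, shorter ABI apart from a userspace bug, which is the objection raised above.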

I agree that we shouldn't silently ignore bugs in userspace, but I'm not
sure the alternative is maintainable. If we have N new extensions to this
verb in the future, will we need to validate that the output buffer given
by userspace is one of the N possible sizes?
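
One possible middle ground, sketched here purely as an illustration and not as what the thread agreed on, is to validate only a minimum size and to tell userspace how much was written. The struct, its fields, `ex_query_model()` and the `-ENOSPC` choice are all hypothetical; `memcpy` again stands in for `copy_to_user`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical extended response: a fixed base plus later additions. */
struct ex_resp_model {
	uint32_t response_length; /* bytes actually filled in */
	uint32_t base_caps;       /* part of the original ABI */
	uint32_t ext1_caps;       /* added by a first extension */
	uint32_t ext2_caps;       /* added by a second extension */
};

/* Everything before the first extension field must always fit. */
#define EX_RESP_BASE_LEN offsetof(struct ex_resp_model, ext1_caps)

static int ex_query_model(void *outbuf, size_t outlen)
{
	struct ex_resp_model resp = { 0, 0x1, 0x2, 0x4 };
	size_t len;

	/* One check, independent of how many extensions exist. */
	if (outlen < EX_RESP_BASE_LEN)
		return -ENOSPC;

	len = outlen < sizeof(resp) ? outlen : sizeof(resp);
	len -= len % sizeof(uint32_t);   /* copy whole fields only */
	resp.response_length = (uint32_t)len;
	memcpy(outbuf, &resp, len);
	return 0;
}

static void check_ex_query(void)
{
	struct ex_resp_model out;

	/* Too small for even the base fields: rejected. */
	assert(ex_query_model(&out, 4) == -ENOSPC);

	/* Old userspace with a base-sized buffer. */
	memset(&out, 0, sizeof(out));
	assert(ex_query_model(&out, EX_RESP_BASE_LEN) == 0);
	assert(out.response_length == EX_RESP_BASE_LEN);
	assert(out.ext1_caps == 0);

	/* Newer userspace that knows only the first extension. */
	memset(&out, 0, sizeof(out));
	assert(ex_query_model(&out, 12) == 0);
	assert(out.response_length == 12 && out.ext1_caps == 0x2);
	assert(out.ext2_caps == 0);
}
```

With this shape the number of checks does not grow with N, but a buffer shorter than the base response is still reported as an error rather than silently truncated.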

Regards,
Haggai
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] mlx4_core: Fix device capabilities dumping

2014-12-16 Thread Or Gerlitz
We dump only the device capabilities that are supported by both the
firmware and the driver. Align the array that holds the capability
strings with this practice.

Reported-by: Yuval Shaia 
Signed-off-by: Or Gerlitz 
---
 drivers/net/ethernet/mellanox/mlx4/fw.c |8 ++--
 1 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index ef3b95b..4a4e652 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -84,13 +84,10 @@ static void dump_dev_cap_flags(struct mlx4_dev *dev, u64 flags)
[ 1] = "UC transport",
[ 2] = "UD transport",
[ 3] = "XRC transport",
-   [ 4] = "reliable multicast",
-   [ 5] = "FCoIB support",
[ 6] = "SRQ support",
[ 7] = "IPoIB checksum offload",
[ 8] = "P_Key violation counter",
[ 9] = "Q_Key violation counter",
-   [10] = "VMM",
[12] = "Dual Port Different Protocol (DPDP) support",
[15] = "Big LSO headers",
[16] = "MW support",
@@ -99,12 +96,11 @@ static void dump_dev_cap_flags(struct mlx4_dev *dev, u64 flags)
[19] = "Raw multicast support",
[20] = "Address vector port checking support",
[21] = "UD multicast support",
-   [24] = "Demand paging support",
-   [25] = "Router support",
[30] = "IBoE support",
[32] = "Unicast loopback support",
[34] = "FCS header control",
-   [38] = "Wake On LAN support",
+   [37] = "Wake On LAN (port1) support",
+   [38] = "Wake On LAN (port2) support",
[40] = "UDP RSS support",
[41] = "Unicast VEP steering support",
[42] = "Multicast VEP steering support",
-- 
1.7.1
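
The patch relies on a sparse string table indexed by bit number, where bits without a string are simply not reported. The following userspace sketch shows that idea only; it is not the body of dump_dev_cap_flags(), and `cap_names`, `CAP_NAMES_LEN` and `dump_cap_flags_model()` are names invented for the example. The table holds a subset of the entries from the patched array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* A few entries copied from the patched table; gaps stay NULL. */
static const char *const cap_names[] = {
	[ 1] = "UC transport",
	[ 2] = "UD transport",
	[ 3] = "XRC transport",
	[ 6] = "SRQ support",
	[37] = "Wake On LAN (port1) support",
	[38] = "Wake On LAN (port2) support",
};

#define CAP_NAMES_LEN (sizeof(cap_names) / sizeof(cap_names[0]))

/* Print every set bit that has a name; return how many were printed. */
static int dump_cap_flags_model(uint64_t flags)
{
	int printed = 0;
	size_t i;

	for (i = 0; i < CAP_NAMES_LEN; i++) {
		if (cap_names[i] && (flags & (1ULL << i))) {
			printf("    %s\n", cap_names[i]);
			printed++;
		}
	}
	return printed;
}
```

Calling `dump_cap_flags_model((1ULL << 1) | (1ULL << 4) | (1ULL << 38))` prints two lines: bit 4 is set but has no string after the patch, so it is skipped.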



Re: [PATCH for-next] RDMA/CMA: Mark IPv4 addresses correctly when the listener is IPv6

2014-12-16 Thread Or Gerlitz

On 12/16/2014 8:38 PM, Hefty, Sean wrote:
>> Any comment? can you please ack the patch or continue the discussion?
>
> My concerns were addressed, so I have no further issues.


OK, thanks. Roland, can you please move forward and pick this one up for
the merge window too?


Or.


[ANNOUNCE] dapl-2.1.3-1 release

2014-12-16 Thread Davis, Arlin R
New release for DAPL (2.1.3) is available at 
http://www.openfabrics.org/downloads/dapl

Vlad, please pull into OFED 3.18 daily builds.

Latest Packages (see ChangeLog for recent changes, see README.mcm for MIC 
support):

   md5sum: 04537bdd405b89c562d73bfdd6027c2b dapl-2.1.3.tar.gz

To install from packages, use the following RPMs:

   dapl-2.1.3-1
   dapl-utils-2.1.3-1
   dapl-devel-2.1.3-1
   dapl-debuginfo-2.1.3-1

Full list of changes since last release:

Amir Hanania (2):
  common: add srq support for openib verbs providers
  dtest: add dtestsrq for SRQ example and provider testing

Arlin Davis (21):
  add provider and proxy support for GUID across platform
  mpxyd: log warning if running in COMPAT mode
  mpxyd/mcm: add provider specific attribute DAT_IB_PROXY_VERSION
  extension: add IB UD extensions to reduce provider CM and AH memory footprint
  openib: add new TIMEWAIT state for CM
  openib: add IB UD cm_free/ah_free extension support in UCM provider
  dtestx: update IB extension example test with new v2.0.9 features
  mcm: provide CPU family/model attribute on both host and mic sides
  openib: add port_num to provider named attributes
  mpxyd: set global seg_sz to 128KB for proxy data service
  mcm: add segmentation to HST->MXS mode for improved performance
  mcm: HST->MXS mode incorrectly signals multiple fragments per WR
  mpxyd: DTO completion ERR: status 12, op RDMA_WRITE running MPI alltoall test
  mpxyd: increase max open files for service
  ucm: RTU not retransmitted in TIMEWAIT state
  dtestx: allow scale up to 1000 EP's
  common: dapl_ep_free must serialize CM object destroy
  ucm: add time wait override capability for CM services
  dapl: add rdma_write_imm and write only option to dtest
  dapl: mpxyd service changes to support multi-thread single-core

Regards,

Arlin



RE: [PATCH] RDMA/cma: fix first byte overwritten for AF_IB

2014-12-16 Thread Hefty, Sean
> If the user attaches private data for AF_IB, its first byte is
> overwritten, because the cma version is always set, even when the
> family is AF_IB. Move the version assignment inside the if conditions.
> 
> Reported-by: Fabian Holler 
> Signed-off-by: Jack Wang 
> ---
>  drivers/infiniband/core/cma.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index d570030..22a22e2 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -2618,10 +2618,10 @@ static int cma_format_hdr(void *hdr, struct
> rdma_id_private *id_priv)
>   struct cma_hdr *cma_hdr;
> 
>   cma_hdr = hdr;
> - cma_hdr->cma_version = CMA_VERSION;
>   if (cma_family(id_priv) == AF_INET) {
>   struct sockaddr_in *src4, *dst4;
> 
> + cma_hdr->cma_version = CMA_VERSION;
>   src4 = (struct sockaddr_in *) cma_src_addr(id_priv);
>   dst4 = (struct sockaddr_in *) cma_dst_addr(id_priv);
> 
> @@ -2632,6 +2632,7 @@ static int cma_format_hdr(void *hdr, struct
> rdma_id_private *id_priv)
>   } else if (cma_family(id_priv) == AF_INET6) {
>   struct sockaddr_in6 *src6, *dst6;
> 
> + cma_hdr->cma_version = CMA_VERSION;
>   src6 = (struct sockaddr_in6 *) cma_src_addr(id_priv);
>   dst6 = (struct sockaddr_in6 *) cma_dst_addr(id_priv);

I don't think this is sufficient.  The RDMA CM private data header is defined 
by the IB spec.  If the service ID starts with the prefix 0x01, it's 
reasonable to assume that the header is part of the private data.  The receive 
side should probably even check the version and discard any unknown values.
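
A rough userspace sketch of the receive-side check suggested here follows. It reads "prefix 0x01" as a service ID whose upper five bytes are 00 00 00 00 01, and it assumes CMA_VERSION is 0x00 (which would match the zero byte seen in the bug report); both are assumptions, as are the helper names and the sample service ID built from the reporter's port 1234.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CMA_VERSION_MODEL 0x00   /* assumed value of CMA_VERSION */

/* True when the upper five bytes of the service ID are 00 00 00 00 01. */
static bool sid_has_rdma_cm_prefix(uint64_t sid)
{
	return (sid >> 24) == 0x01;
}

/*
 * Accept a request unless its service ID implies an RDMA CM header
 * and the version byte at the start of the private data is unknown.
 */
static bool cm_req_acceptable(uint64_t sid, const uint8_t *priv, size_t len)
{
	if (!sid_has_rdma_cm_prefix(sid))
		return true;    /* no header expected, nothing to check */
	if (len < 1)
		return false;   /* header expected but missing */
	return priv[0] == CMA_VERSION_MODEL;
}

static void check_req_filter(void)
{
	const uint8_t good[] = { 0x00, 0x48, 0x41 };
	const uint8_t bad[]  = { 0x57, 0x48, 0x41 };
	const uint64_t rdma_cm_sid = 0x00000000013F04D2ULL; /* assumed value */

	assert(cm_req_acceptable(rdma_cm_sid, good, sizeof(good)));
	assert(!cm_req_acceptable(rdma_cm_sid, bad, sizeof(bad)));
	assert(!cm_req_acceptable(rdma_cm_sid, good, 0));
	/* A service ID without the prefix is left alone. */
	assert(cm_req_acceptable(0x1234, bad, sizeof(bad)));
}
```

Under these assumptions, an AF_IB sender that puts arbitrary data in the first byte while using a prefixed service ID would have its request discarded, which is why moving the version assignment alone may not be enough.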

- Sean


[ANNOUNCE] libibumad 1.3.10 release

2014-12-16 Thread Hal Rosenstock
There is a new 1.3.10 release of libibumad.

Tarball is available in:
http://www.openfabrics.org/downloads/management/
(listed in http://www.openfabrics.org/downloads/management/latest.txt)

md5sum:
3a4046ddaaf43bbb82f124808232a294  libibumad-1.3.10.tar.gz

All component versions are from recent master branch. Full list of
changes is below.

Alex Netes (1):
  libibumad: Fix memory leak in resolve_ca_port

Dan Ben Yosef (1):
  umad.c: Buffer not null terminated

Hal Rosenstock (14):
  libibumad: Minor fixups for previous umad_str patch
  libibumad: Fix issues causing const warnings for strings
  libibumad: Add recent/missing SM/SA attributes
  libibumad: Rename attributes UMAD_SM_ATTR_XXX rather than UMAD_SMP_ATTR_XXX
  libibumad: update shared library version
  libibumad: package version update for 1.3.9 release
  umad_sm.h: Add Mellanox extended port info SM attribute ID to enum
  umad_[sm sa].h: Add PortInfoExtended SM and PortInfoExtendedRecord SA attributes
  umad_sa.h: Add some new (at IBA 1.3) SA CapabilityMask2 bit definitions
  umad_cm.h: Add new SAP and SPR CM attributes
  umad_str.c: Add strings for newly added attributes
  Added/updated some Mellanox copyrights
  libibumad.ver: Update shared library version
  configure.in: package version update for 1.3.10 release

Ilya Nelkenbaum (1):
  libibumad/umad.c: In resolve_ca_port, skip ethernet link layer ports

Ira Weiny (11):
  libibumad: fix umad_register man page
  libibumad: update umad_[send|recv] man pages to document how rmpp is handled
  libibumad: change UMAD_METHOD_RESP to UMAD_METHOD_RESP_MASK
  libibumad: add UMAD_SA_STATUS_PRI_SUGGESTED to SA status.
  libibumad: add string functions for various enums
  libibumad: document the setting of errno for umad_send and umad_recv
  libibumad: add SA CAP MASK[2] definitions
  libibumad: add ClassPortInfo struct
  umad_types.h: fix status type in umad_hdr
  Add support for new registration ioctl
  Add make check with umad register tests

Line Holen (1):
  umad_sm.h Add SM trap definitions

Nick Mills (1):
  configure.in: Remove unused --disable-libcheck configure option

Sean Hefty (10):
  libibumad: Provide MAD definitions with libibumad
  libibumad: Add SA MAD definitions to umad
  libibumad: Add basic SM definitions to umad
  libibumad: Add CM definitions to umad
  libibumad: Add new umad header files to release
  libibumad: Define ntohll/htonll
  libibumad: Define data type to indicate values are in big-endian
  libibumad/sa: Add SA specific status values
  libibumad: Define well known QKEY
  Fix export of umad_sa_mad_status_str

sean.he...@intel.com (1):
  Add UMAD_RMPP_FLAG_ACTIVE into umad_types.h



Re: [PATCH v3 06/17] IB/core: Add support for extended query device caps

2014-12-16 Thread Yann Droneaud
Hi,

On Tuesday, 16 December 2014 at 22:07 +0200, Or Gerlitz wrote:
> On Tue, Dec 16, 2014 at 7:41 PM, Roland Dreier  wrote:
> > On Tue, Dec 16, 2014 at 4:33 AM, Yann Droneaud  wrote:
> >>
> >> With the suggested change here, buffer overflow won't happen,
> >> but the error is silently ignored, allowing uverb to return a
> >> partial result, which is likely not expected by userspace as
> >> it's a bit difficult to handle it gracefully.
> >>
> >> So this has to be removed, and a check on userspace response
> >> buffer must be added to ib_uverbs_ex_query_device() instead.
> >
> > I'm not sure of the specifics of the change you're suggesting here.
> > Would it be OK to go forward with the patch set we have, and then fix
> > this issue before 3.19-rc2?
> 
> Roland,
> 
> Haggai will address the change in an incremental patch against your
> for-next (3.19-rc1) so the fix will be ready on time for 3.19-rc2
> 

That would be great.

Thanks

-- 
Yann Droneaud
OPTEYA




Re: [PATCH v3 06/17] IB/core: Add support for extended query device caps

2014-12-16 Thread Or Gerlitz
On Tue, Dec 16, 2014 at 7:41 PM, Roland Dreier  wrote:
> On Tue, Dec 16, 2014 at 4:33 AM, Yann Droneaud  wrote:
>>
>> With the suggested change here, buffer overflow won't happen,
>> but the error is silently ignored, allowing uverb to return a
>> partial result, which is likely not expected by userspace as
>> it's a bit difficult to handle it gracefully.
>>
>> So this has to be removed, and a check on userspace response
>> buffer must be added to ib_uverbs_ex_query_device() instead.
>
> I'm not sure of the specifics of the change you're suggesting here.
> Would it be OK to go forward with the patch set we have, and then fix
> this issue before 3.19-rc2?

Roland,

Haggai will address the change in an incremental patch against your
for-next (3.19-rc1) so the fix will be ready on time for 3.19-rc2

Or.


RE: [PATCH for-next] RDMA/CMA: Mark IPv4 addresses correctly when the listener is IPv6

2014-12-16 Thread Hefty, Sean
> Any comment? can you please ack the patch or continue the discussion?

My concerns were addressed, so I have no further issues.


Re: [PATCH v3 06/17] IB/core: Add support for extended query device caps

2014-12-16 Thread Roland Dreier
On Tue, Dec 16, 2014 at 4:33 AM, Yann Droneaud  wrote:
>
> With the suggested change here, buffer overflow won't happen,
> but the error is silently ignored, allowing uverb to return a
> partial result, which is likely not expected by userspace as
> it's a bit difficult to handle it gracefully.
>
> So this has to be removed, and a check on userspace response
> buffer must be added to ib_uverbs_ex_query_device() instead.

I'm not sure of the specifics of the change you're suggesting here.
Would it be OK to go forward with the patch set we have, and then fix
this issue before 3.19-rc2?


Re: Query regarding MAD_DEMUX and Secure Host

2014-12-16 Thread Bob Biloxi
Hi Jack,

Thank you so much for clarifying this. Now I understand: it all comes
down to QP1. To support QP1, we need to support MAD_DEMUX.

This is really helpful.

Best Regards,
Bob


On Tue, Dec 16, 2014 at 9:03 PM, Jack Morgenstein
 wrote:
> On Mon, 15 Dec 2014 15:07:58 +0530
> Bob Biloxi  wrote:
>
>> am I correct in my understanding
>> when i say that MAD_DEMUX feature is not required to be
>> supported/implemented in Mellanox RoCE Drivers?
>>
>> It is required only for Infiniband drivers?
>
> Actually, you will need to support MAD_DEMUX anyway. If not, the
> CONF_SPECIAL_QP command will fail if Secure Host mode is operating.
>
> CONF_SPECIAL_QP is required for RoCE as well, since if it is not called
> we will not have QP1.  However, since this command maps QP0 as well to
> a QP, the MAD_DEMUX command is still required.
>
> -Jack


[PATCH] RDMA/cma: fix first byte overwritten for AF_IB

2014-12-16 Thread Jack Wang
If the user attaches private data for AF_IB, its first byte is
overwritten, because the cma version is always set, even when the
family is AF_IB. Move the version assignment inside the if conditions.

Reported-by: Fabian Holler 
Signed-off-by: Jack Wang 
---
 drivers/infiniband/core/cma.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d570030..22a22e2 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2618,10 +2618,10 @@ static int cma_format_hdr(void *hdr, struct 
rdma_id_private *id_priv)
struct cma_hdr *cma_hdr;
 
cma_hdr = hdr;
-   cma_hdr->cma_version = CMA_VERSION;
if (cma_family(id_priv) == AF_INET) {
struct sockaddr_in *src4, *dst4;
 
+   cma_hdr->cma_version = CMA_VERSION;
src4 = (struct sockaddr_in *) cma_src_addr(id_priv);
dst4 = (struct sockaddr_in *) cma_dst_addr(id_priv);
 
@@ -2632,6 +2632,7 @@ static int cma_format_hdr(void *hdr, struct 
rdma_id_private *id_priv)
} else if (cma_family(id_priv) == AF_INET6) {
struct sockaddr_in6 *src6, *dst6;
 
+   cma_hdr->cma_version = CMA_VERSION;
src6 = (struct sockaddr_in6 *) cma_src_addr(id_priv);
dst6 = (struct sockaddr_in6 *) cma_dst_addr(id_priv);
 
-- 
1.9.1
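
The effect described in the commit message can be modelled in a few lines of userspace C. This is a simplified sketch, not cma_format_hdr() itself: it assumes that for AF_IB the header pointer refers directly to the start of the user's private data, that the version is the first byte of the header, and that CMA_VERSION is 0x00, which matches the zero byte seen in the report. The enum and function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum family_model { FAM_INET, FAM_INET6, FAM_IB };  /* stand-ins for AF_* */

#define CMA_VERSION_MODEL 0x00   /* assumed value of CMA_VERSION */

/* Before the patch: the version byte is written for every family. */
static void format_hdr_before(uint8_t *hdr, enum family_model fam)
{
	(void)fam;   /* address fields are omitted in this model */
	hdr[0] = CMA_VERSION_MODEL;
}

/* After the patch: written only where a CMA header is really used. */
static void format_hdr_after(uint8_t *hdr, enum family_model fam)
{
	if (fam == FAM_INET || fam == FAM_INET6)
		hdr[0] = CMA_VERSION_MODEL;
}

static void check_first_byte(void)
{
	uint8_t before[16] = "WHAZ UP SERVER?";
	uint8_t after[16]  = "WHAZ UP SERVER?";

	/* AF_IB before the patch: 'W' is replaced by the version byte. */
	format_hdr_before(before, FAM_IB);
	assert(before[0] == 0x00);
	assert(memcmp(before + 1, "HAZ UP SERVER?", 15) == 0);

	/* AF_IB after the patch: private data is left untouched. */
	format_hdr_after(after, FAM_IB);
	assert(after[0] == 'W');

	/* IPv4 still gets its version byte. */
	format_hdr_after(after, FAM_INET);
	assert(after[0] == CMA_VERSION_MODEL);
}
```

The `before` buffer ends up as `00 48 41 5a ...`, the same pattern as the server-side dump in the original report.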



Re: Query regarding MAD_DEMUX and Secure Host

2014-12-16 Thread Jack Morgenstein
On Mon, 15 Dec 2014 15:07:58 +0530
Bob Biloxi  wrote:

> am I correct in my understanding
> when i say that MAD_DEMUX feature is not required to be
> supported/implemented in Mellanox RoCE Drivers?
> 
> It is required only for Infiniband drivers?

Actually, you will need to support MAD_DEMUX anyway. If not, the
CONF_SPECIAL_QP command will fail if Secure Host mode is operating.

CONF_SPECIAL_QP is required for RoCE as well, since if it is not called
we will not have QP1.  However, since this command maps QP0 as well to
a QP, the MAD_DEMUX command is still required.

-Jack


Re: First byte of private_data transferred with rdma_connect() is always 0

2014-12-16 Thread Jack Wang
Found the bug.
commit 56e620c453f2588cfc9898a41b110477f6417a5d
Author: Jack Wang 
Date:   Tue Dec 16 15:44:17 2014 +0100

RDMA/cma: fix first byte overwritten for AF_IB

If the user attaches private data for AF_IB, its first byte is
overwritten, because the cma version is always set, even when the
family is AF_IB. Move the version assignment inside the if conditions.

Reported-by: Fabian Holler 
Signed-off-by: Jack Wang 

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d570030..22a22e2 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2618,10 +2618,10 @@ static int cma_format_hdr(void *hdr, struct
rdma_id_private *id_priv)
struct cma_hdr *cma_hdr;

cma_hdr = hdr;
-   cma_hdr->cma_version = CMA_VERSION;
if (cma_family(id_priv) == AF_INET) {
struct sockaddr_in *src4, *dst4;

+   cma_hdr->cma_version = CMA_VERSION;
src4 = (struct sockaddr_in *) cma_src_addr(id_priv);
dst4 = (struct sockaddr_in *) cma_dst_addr(id_priv);

@@ -2632,6 +2632,7 @@ static int cma_format_hdr(void *hdr, struct
rdma_id_private *id_priv)
} else if (cma_family(id_priv) == AF_INET6) {
struct sockaddr_in6 *src6, *dst6;

+   cma_hdr->cma_version = CMA_VERSION;
src6 = (struct sockaddr_in6 *) cma_src_addr(id_priv);
dst6 = (struct sockaddr_in6 *) cma_dst_addr(id_priv);


2014-12-16 15:32 GMT+01:00 Fabian Holler :
> Hello,
>
> we are using the conn_param->private_data field to transfer data with the
> rdma_connect() call to the server.
> When it's done on a RDMA_PS_IB rdma_cm_id the first Byte of the
> private_data, that is received by the server is _always_ 0.
> When a RDMA_PS_TCP rdma_cm_id is used, the data is received correctly on the
> server.
>
> We are using:
> - kernel 3.14.13
> - Mellanox Technologies MT26428 HCA
> - mlx4_0 driver
>
> I attached a simple client and server module to reproduce the behaviour.
> Can somebody have a look? Is there a problem in our modules? Or is it a bug?
>
> --
> Connection establishment via GID (RDMA_PS_IB):
> client:
> # insmod client.ko gid_addr=fe80::::0002:c903:0010:c0f5
> [ 7328.586773] private_data88022a263c50: 57 48 41 5a 20 55 50 20 53 45 52 56 45 52 3f 00  WHAZ UP SERVER?.
>
> server:
> [ 1658.208238] private_data8800b93e3bec: 00 48 41 5a 20 55 50 20 53 45 52 56 45 52 3f 00  .HAZ UP SERVER?.
> [ 1658.208239] private_data8800b93e3bfc: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 1658.208241] private_data8800b93e3c0c: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 1658.208242] private_data8800b93e3c1c: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 1658.208244] private_data8800b93e3c2c: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 1658.208245] private_data8800b93e3c3c: 00 00 00 00 00 00 00 00 00 00 00 00
>
>
> Connection establishment via IPv4 address (RDMA_PS_TCP):
> client:
> # insmod client.ko ip_addr=10.50.100.62
> [ 7179.219773] private_data88022a263c50: 57 48 41 5a 20 55 50 20 53 45 52 56 45 52 3f 00  WHAZ UP SERVER?.
>
> server:
> [ 1508.840508] private_data8800b8d25b90: 57 48 41 5a 20 55 50 20 53 45 52 56 45 52 3f 00  WHAZ UP SERVER?.
> [ 1508.840509] private_data8800b8d25ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 1508.840511] private_data8800b8d25bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [ 1508.840512] private_data8800b8d25bc0: 00 00 00 00 00 00 00 00
> 
> --
>
>
> thanks
>
> Fabian


[PATCH]RDMA/cma: fix first byte overwritten for AF_IB

2014-12-16 Thread Jinpu Wang
commit 56e620c453f2588cfc9898a41b110477f6417a5d
Author: Jack Wang 
Date:   Tue Dec 16 15:44:17 2014 +0100

RDMA/cma: fix first byte overwritten for AF_IB

If the user attaches private data for AF_IB, its first byte is
overwritten, because the cma version is always set, even when the
family is AF_IB. Move the version assignment inside the if conditions.

Reported-by: Fabian Holler 
Signed-off-by: Jack Wang 

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d570030..22a22e2 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2618,10 +2618,10 @@ static int cma_format_hdr(void *hdr, struct
rdma_id_private *id_priv)
struct cma_hdr *cma_hdr;

cma_hdr = hdr;
-   cma_hdr->cma_version = CMA_VERSION;
if (cma_family(id_priv) == AF_INET) {
struct sockaddr_in *src4, *dst4;

+   cma_hdr->cma_version = CMA_VERSION;
src4 = (struct sockaddr_in *) cma_src_addr(id_priv);
dst4 = (struct sockaddr_in *) cma_dst_addr(id_priv);

@@ -2632,6 +2632,7 @@ static int cma_format_hdr(void *hdr, struct
rdma_id_private *id_priv)
} else if (cma_family(id_priv) == AF_INET6) {
struct sockaddr_in6 *src6, *dst6;

+   cma_hdr->cma_version = CMA_VERSION;
src6 = (struct sockaddr_in6 *) cma_src_addr(id_priv);
dst6 = (struct sockaddr_in6 *) cma_dst_addr(id_priv);



-- 
Best Regards,

Jack Wang

Linux Kernel Developer Storage
ProfitBricks GmbH  The IaaS-Company.

ProfitBricks GmbH
Greifswalder Str. 207
D - 10405 Berlin
Tel: +49 30 5770083-42
Fax: +49 30 5770085-98
Email: jinpu.w...@profitbricks.com
URL: http://www.profitbricks.de

Registered office: Berlin.
Commercial register: Amtsgericht Charlottenburg, HRB 125506 B.
Managing directors: Andreas Gauger, Achim Weiss.


First byte of private_data transferred with rdma_connect() is always 0

2014-12-16 Thread Fabian Holler
Hello,

we are using the conn_param->private_data field to transfer data to the
server with the rdma_connect() call.
When this is done on an RDMA_PS_IB rdma_cm_id, the first byte of the
private_data received by the server is _always_ 0.
When an RDMA_PS_TCP rdma_cm_id is used, the data is received correctly on
the server.

We are using:
- kernel 3.14.13
- Mellanox Technologies MT26428 HCA
- mlx4_0 driver

I attached a simple client and server module to reproduce the behaviour.
Can somebody have a look? Is there a problem in our modules? Or is it a bug?

--
Connection establishment via GID (RDMA_PS_IB):
client:
# insmod client.ko gid_addr=fe80::::0002:c903:0010:c0f5
[ 7328.586773] private_data88022a263c50: 57 48 41 5a 20 55 50 20 53 45 52 56 45 52 3f 00  WHAZ UP SERVER?.

server:
[ 1658.208238] private_data8800b93e3bec: 00 48 41 5a 20 55 50 20 53 45 52 56 45 52 3f 00  .HAZ UP SERVER?.
[ 1658.208239] private_data8800b93e3bfc: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1658.208241] private_data8800b93e3c0c: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1658.208242] private_data8800b93e3c1c: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1658.208244] private_data8800b93e3c2c: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1658.208245] private_data8800b93e3c3c: 00 00 00 00 00 00 00 00 00 00 00 00


Connection establishment via IPv4 address (RDMA_PS_TCP):
client:
# insmod client.ko ip_addr=10.50.100.62
[ 7179.219773] private_data88022a263c50: 57 48 41 5a 20 55 50 20 53 45 52 56 45 52 3f 00  WHAZ UP SERVER?.

server:
[ 1508.840508] private_data8800b8d25b90: 57 48 41 5a 20 55 50 20 53 45 52 56 45 52 3f 00  WHAZ UP SERVER?.
[ 1508.840509] private_data8800b8d25ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1508.840511] private_data8800b8d25bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1508.840512] private_data8800b8d25bc0: 00 00 00 00 00 00 00 00
--


thanks

Fabian
#include 
#include 
#include 
#include 
#include 

MODULE_LICENSE("GPL");

static char *ip_addr;
module_param(ip_addr, charp, 0444);

static char *gid_addr;
module_param(gid_addr, charp, 0444);

#define SERVER_PORT 1234

#define my_printk(level, format, arg...) printk(level KBUILD_MODNAME \
		" %s(), %d: " \
		format, __func__, __LINE__, ## arg)
#define LOG(format, arg...) my_printk(KERN_INFO, format, ## arg)

struct con {
	struct rdma_cm_id	*cm_id;
	enum rdma_cm_event_type	ev;
	wait_queue_head_t	wait_q;
};

static struct con con;

#define XX(a) case (a): return #a
static inline const char *rdma_event_str(enum rdma_cm_event_type event)
{
	switch (event) {
	XX(RDMA_CM_EVENT_ADDR_RESOLVED);
	XX(RDMA_CM_EVENT_ADDR_ERROR);
	XX(RDMA_CM_EVENT_ROUTE_RESOLVED);
	XX(RDMA_CM_EVENT_ROUTE_ERROR);
	XX(RDMA_CM_EVENT_CONNECT_REQUEST);
	XX(RDMA_CM_EVENT_CONNECT_RESPONSE);
	XX(RDMA_CM_EVENT_CONNECT_ERROR);
	XX(RDMA_CM_EVENT_UNREACHABLE);
	XX(RDMA_CM_EVENT_REJECTED);
	XX(RDMA_CM_EVENT_ESTABLISHED);
	XX(RDMA_CM_EVENT_DISCONNECTED);
	XX(RDMA_CM_EVENT_DEVICE_REMOVAL);
	XX(RDMA_CM_EVENT_MULTICAST_JOIN);
	XX(RDMA_CM_EVENT_MULTICAST_ERROR);
	XX(RDMA_CM_EVENT_ADDR_CHANGE);
	XX(RDMA_CM_EVENT_TIMEWAIT_EXIT);
	default: return "RDMA_CM_UNKNOWN";
	}
}

static int rdma_cm_ev_handler(struct rdma_cm_id *cm_id,
			  struct rdma_cm_event *event)
{
	struct con *con = cm_id->context;

	LOG("CM event %s, error %d\n", rdma_event_str(event->event),
	 event->status);

	con->ev = event->event;
	wake_up_interruptible(&con->wait_q);
	return 0;
}

static int str_gid_to_sockaddr(const char *gid, struct sockaddr_ib *dst)
{
	int ret;

	ret = in6_pton(gid, strlen(gid),
		   dst->sib_addr.sib_raw, '\n', NULL);
	if (ret == 0)
		return -EINVAL;

	dst->sib_family = AF_IB;
	dst->sib_sid = cpu_to_be64(RDMA_IB_IP_PS_IB | SERVER_PORT);
	dst->sib_sid_mask = cpu_to_be64(0xffffffffffffffffULL);
	dst->sib_pkey = cpu_to_be16(0xffff);
	LOG("Converted %s to binary GID %pI6\n", gid, dst->sib_addr.sib_raw);
	return 0;
}

static int str_ip_to_sockaddr(const char *ipaddr, struct sockaddr_storage *dst)
{
	int ret;
	u8 ip4[4];

	ret = in6_pton(ipaddr, strlen(ipaddr),
		   ((struct sockaddr_in6 *)dst)->sin6_addr.s6_addr,
		   '\n', NULL);
	if (ret == 1) {
		dst->ss_family = AF_INET6;
		((struct sockaddr_in6 *)dst)->sin6_port =
		htons(SERVER_PORT);
		LOG("Converted %s to binary IPv6 %pI6\n", ipaddr,
		((struct sockaddr_in6 *)dst)->sin6_addr.s6_addr);
		return 0;
	}

	ret = in4_pton(ipaddr, strlen(ipaddr), ip4, '\n', NULL);
	if (ret == 0)
		return -EINVAL;

	memcpy(&((struct sockaddr_in *)dst)->sin_addr.s_addr, ip4,
	   sizeof(((struct sockaddr_in *)dst)->sin_addr.s_addr));
	dst->ss_family = AF_INET;
	((struct sockaddr_in *)dst)->sin_port = h

Re: [PATCH v3 06/17] IB/core: Add support for extended query device caps

2014-12-16 Thread Yann Droneaud
On Thursday, 11 December 2014 at 17:04 +0200, Haggai Eran wrote:
> From: Eli Cohen 
> 
> Add extensible query device capabilities verb to allow adding new features.
> ib_uverbs_ex_query_device is added and copy_query_dev_fields is used to copy
> capability fields to be used by both ib_uverbs_query_device and
> ib_uverbs_ex_query_device.
> 
> Signed-off-by: Eli Cohen 
> Signed-off-by: Haggai Eran 
> ---
>  drivers/infiniband/core/uverbs.h  |   1 +
>  drivers/infiniband/core/uverbs_cmd.c  | 124 
> +++---
>  drivers/infiniband/core/uverbs_main.c |   3 +-
>  include/rdma/ib_verbs.h   |   5 +-
>  include/uapi/rdma/ib_user_verbs.h |  14 +++-
>  5 files changed, 103 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/infiniband/core/uverbs.h 
> b/drivers/infiniband/core/uverbs.h
> index 643c08a025a5..b716b0815644 100644
> --- a/drivers/infiniband/core/uverbs.h
> +++ b/drivers/infiniband/core/uverbs.h
> @@ -258,5 +258,6 @@ IB_UVERBS_DECLARE_CMD(close_xrcd);
>  
>  IB_UVERBS_DECLARE_EX_CMD(create_flow);
>  IB_UVERBS_DECLARE_EX_CMD(destroy_flow);
> +IB_UVERBS_DECLARE_EX_CMD(query_device);
>  
>  #endif /* UVERBS_H */
> diff --git a/drivers/infiniband/core/uverbs_cmd.c 
> b/drivers/infiniband/core/uverbs_cmd.c
> index 5ba2a86aab6a..c7a43624c96b 100644
> --- a/drivers/infiniband/core/uverbs_cmd.c
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -378,6 +378,52 @@ err:
>   return ret;
>  }
>  
> +static void copy_query_dev_fields(struct ib_uverbs_file *file,
> +   struct ib_uverbs_query_device_resp *resp,
> +   struct ib_device_attr *attr)
> +{
> + resp->fw_ver = attr->fw_ver;
> + resp->node_guid = file->device->ib_dev->node_guid;
> + resp->sys_image_guid = attr->sys_image_guid;
> + resp->max_mr_size = attr->max_mr_size;
> + resp->page_size_cap = attr->page_size_cap;
> + resp->vendor_id = attr->vendor_id;
> + resp->vendor_part_id = attr->vendor_part_id;
> + resp->hw_ver = attr->hw_ver;
> + resp->max_qp = attr->max_qp;
> + resp->max_qp_wr = attr->max_qp_wr;
> + resp->device_cap_flags = attr->device_cap_flags;
> + resp->max_sge = attr->max_sge;
> + resp->max_sge_rd = attr->max_sge_rd;
> + resp->max_cq = attr->max_cq;
> + resp->max_cqe = attr->max_cqe;
> + resp->max_mr = attr->max_mr;
> + resp->max_pd = attr->max_pd;
> + resp->max_qp_rd_atom = attr->max_qp_rd_atom;
> + resp->max_ee_rd_atom = attr->max_ee_rd_atom;
> + resp->max_res_rd_atom = attr->max_res_rd_atom;
> + resp->max_qp_init_rd_atom = attr->max_qp_init_rd_atom;
> + resp->max_ee_init_rd_atom = attr->max_ee_init_rd_atom;
> + resp->atomic_cap = attr->atomic_cap;
> + resp->max_ee = attr->max_ee;
> + resp->max_rdd = attr->max_rdd;
> + resp->max_mw = attr->max_mw;
> + resp->max_raw_ipv6_qp = attr->max_raw_ipv6_qp;
> + resp->max_raw_ethy_qp = attr->max_raw_ethy_qp;
> + resp->max_mcast_grp = attr->max_mcast_grp;
> + resp->max_mcast_qp_attach = attr->max_mcast_qp_attach;
> + resp->max_total_mcast_qp_attach = attr->max_total_mcast_qp_attach;
> + resp->max_ah = attr->max_ah;
> + resp->max_fmr = attr->max_fmr;
> + resp->max_map_per_fmr = attr->max_map_per_fmr;
> + resp->max_srq = attr->max_srq;
> + resp->max_srq_wr = attr->max_srq_wr;
> + resp->max_srq_sge = attr->max_srq_sge;
> + resp->max_pkeys = attr->max_pkeys;
> + resp->local_ca_ack_delay = attr->local_ca_ack_delay;
> + resp->phys_port_cnt = file->device->ib_dev->phys_port_cnt;
> +}
> +
>  ssize_t ib_uverbs_query_device(struct ib_uverbs_file *file,
>  const char __user *buf,
>  int in_len, int out_len)
> @@ -398,47 +444,7 @@ ssize_t ib_uverbs_query_device(struct ib_uverbs_file *file,
>   return ret;
>  
>   memset(&resp, 0, sizeof resp);
> -
> - resp.fw_ver = attr.fw_ver;
> - resp.node_guid = file->device->ib_dev->node_guid;
> - resp.sys_image_guid = attr.sys_image_guid;
> - resp.max_mr_size = attr.max_mr_size;
> - resp.page_size_cap = attr.page_size_cap;
> - resp.vendor_id = attr.vendor_id;
> - resp.vendor_part_id = attr.vendor_part_id;
> - resp.hw_ver = attr.hw_ver;
> - resp.max_qp = attr.max_qp;
> - resp.max_qp_wr = attr.max_qp_wr;
> - resp.device_cap_flags = attr.device_cap_flags;
> - 

Re: [PATCH v3 07/17] IB/core: Add flags for on demand paging support

2014-12-16 Thread Yann Droneaud
On Tuesday, 16 December 2014 at 13:02 +0100, Yann Droneaud wrote:
> On Thursday, 11 December 2014 at 17:04 +0200, Haggai Eran wrote:
> > From: Sagi Grimberg 
> > 
> > * Add a configuration option for enabling on-demand paging support in the
> >   infiniband subsystem (CONFIG_INFINIBAND_ON_DEMAND_PAGING). In a later patch,
> >   this configuration option will select the MMU_NOTIFIER configuration option
> >   to enable mmu notifiers.
> > * Add a flag for on demand paging (ODP) support in the IB device capabilities.
> > * Add a flag to request ODP MR in the access flags to reg_mr.
> > * Fail registrations done with the ODP flag when the low-level driver doesn't
> >   support this.
> > * Change the conditions in which an MR will be writable to explicitly
> >   specify the access flags. This is to avoid making an MR writable just
> >   because it is an ODP MR.
> > * Add ODP capabilities to the extended query device verb.
> > 
> > Signed-off-by: Sagi Grimberg 
> > Signed-off-by: Shachar Raindel 
> > Signed-off-by: Haggai Eran 
> > ---
> >  drivers/infiniband/Kconfig   | 10 ++
> >  drivers/infiniband/core/umem.c   |  8 +---
> >  drivers/infiniband/core/uverbs_cmd.c | 25 +
> >  include/rdma/ib_verbs.h  | 28 ++--
> >  include/uapi/rdma/ib_user_verbs.h| 15 +++
> >  5 files changed, 81 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
> > index 77089399359b..089a2c2af329 100644
> > --- a/drivers/infiniband/Kconfig
> > +++ b/drivers/infiniband/Kconfig
> > @@ -38,6 +38,16 @@ config INFINIBAND_USER_MEM
> > depends on INFINIBAND_USER_ACCESS != n
> > default y
> >  
> > +config INFINIBAND_ON_DEMAND_PAGING
> > +   bool "InfiniBand on-demand paging support"
> > +   depends on INFINIBAND_USER_MEM
> > +   default y
> > +   ---help---
> > + On demand paging support for the InfiniBand subsystem.
> > + Together with driver support this allows registration of
> > + memory regions without pinning their pages, fetching the
> > + pages on demand instead.
> > +
> >  config INFINIBAND_ADDR_TRANS
> > bool
> > depends on INFINIBAND
> > diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> > index 6f152628e0d2..c328e4693d14 100644
> > --- a/drivers/infiniband/core/umem.c
> > +++ b/drivers/infiniband/core/umem.c
> > @@ -107,13 +107,15 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
> > umem->page_size = PAGE_SIZE;
> > umem->pid   = get_task_pid(current, PIDTYPE_PID);
> > /*
> > -* We ask for writable memory if any access flags other than
> > -* "remote read" are set.  "Local write" and "remote write"
> > +* We ask for writable memory if any of the following
> > +* access flags are set.  "Local write" and "remote write"
> >  * obviously require write access.  "Remote atomic" can do
> >  * things like fetch and add, which will modify memory, and
> >  * "MW bind" can change permissions by binding a window.
> >  */
> > -   umem->writable  = !!(access & ~IB_ACCESS_REMOTE_READ);
> > +   umem->writable  = !!(access &
> > +   (IB_ACCESS_LOCAL_WRITE   | IB_ACCESS_REMOTE_WRITE |
> > +IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
> >  
> > /* We assume the memory is from hugetlb until proved otherwise */
> > umem->hugetlb   = 1;
> > diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> > index c7a43624c96b..f9326ccda4b5 100644
> > --- a/drivers/infiniband/core/uverbs_cmd.c
> > +++ b/drivers/infiniband/core/uverbs_cmd.c
> > @@ -953,6 +953,18 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
> > goto err_free;
> > }
> >  
> > +   if (cmd.access_flags & IB_ACCESS_ON_DEMAND) {
> > +   struct ib_device_attr attr;
> > +
> > +   ret = ib_query_device(pd->device, &attr);
> > +   if (ret || !(attr.device_cap_flags &
> > +   IB_DEVICE_ON_DEMAND_PAGING)) {
> > +   pr_debug("ODP support not available\n");
> > +   ret = -EINVAL;
> > +   goto err_put;
> > +   }
> > +   }
> > +
> > mr = pd->device->reg_user_mr(pd, cmd.start, cmd.length, cmd.hca_va,
> >  cmd.access_flags, &udata);
> > if (IS_ERR(mr)) {
> > @@ -3289,6 +3301,19 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
> > copy_query_dev_fields(file, &resp.base, &attr);
> > resp.comp_mask = 0;
> >  
> > +#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
> > +   if (cmd.comp_mask & IB_USER_VERBS_EX_QUERY_DEVICE_ODP) {
> > +   resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> > +   resp.odp_caps.per_transport_caps.rc_odp_caps =
> > +   attr.odp_caps.per_transport_caps.rc_odp_caps;
> > +   resp.odp_cap

Re: [PATCH v3 07/17] IB/core: Add flags for on demand paging support

2014-12-16 Thread Yann Droneaud
Hi,

On Thursday, 11 December 2014 at 17:04 +0200, Haggai Eran wrote:
> From: Sagi Grimberg 
> 
> * Add a configuration option for enabling on-demand paging support in the
>   infiniband subsystem (CONFIG_INFINIBAND_ON_DEMAND_PAGING). In a later patch,
>   this configuration option will select the MMU_NOTIFIER configuration option
>   to enable mmu notifiers.
> * Add a flag for on demand paging (ODP) support in the IB device capabilities.
> * Add a flag to request ODP MR in the access flags to reg_mr.
> * Fail registrations done with the ODP flag when the low-level driver doesn't
>   support this.
> * Change the conditions in which an MR will be writable to explicitly
>   specify the access flags. This is to avoid making an MR writable just
>   because it is an ODP MR.
> * Add ODP capabilities to the extended query device verb.
> 
> Signed-off-by: Sagi Grimberg 
> Signed-off-by: Shachar Raindel 
> Signed-off-by: Haggai Eran 
> ---
>  drivers/infiniband/Kconfig   | 10 ++
>  drivers/infiniband/core/umem.c   |  8 +---
>  drivers/infiniband/core/uverbs_cmd.c | 25 +
>  include/rdma/ib_verbs.h  | 28 ++--
>  include/uapi/rdma/ib_user_verbs.h| 15 +++
>  5 files changed, 81 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
> index 77089399359b..089a2c2af329 100644
> --- a/drivers/infiniband/Kconfig
> +++ b/drivers/infiniband/Kconfig
> @@ -38,6 +38,16 @@ config INFINIBAND_USER_MEM
>   depends on INFINIBAND_USER_ACCESS != n
>   default y
>  
> +config INFINIBAND_ON_DEMAND_PAGING
> + bool "InfiniBand on-demand paging support"
> + depends on INFINIBAND_USER_MEM
> + default y
> + ---help---
> +   On demand paging support for the InfiniBand subsystem.
> +   Together with driver support this allows registration of
> +   memory regions without pinning their pages, fetching the
> +   pages on demand instead.
> +
>  config INFINIBAND_ADDR_TRANS
>   bool
>   depends on INFINIBAND
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index 6f152628e0d2..c328e4693d14 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -107,13 +107,15 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
>   umem->page_size = PAGE_SIZE;
>   umem->pid   = get_task_pid(current, PIDTYPE_PID);
>   /*
> -  * We ask for writable memory if any access flags other than
> -  * "remote read" are set.  "Local write" and "remote write"
> +  * We ask for writable memory if any of the following
> +  * access flags are set.  "Local write" and "remote write"
>* obviously require write access.  "Remote atomic" can do
>* things like fetch and add, which will modify memory, and
>* "MW bind" can change permissions by binding a window.
>*/
> - umem->writable  = !!(access & ~IB_ACCESS_REMOTE_READ);
> + umem->writable  = !!(access &
> + (IB_ACCESS_LOCAL_WRITE   | IB_ACCESS_REMOTE_WRITE |
> +  IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
>  
>   /* We assume the memory is from hugetlb until proved otherwise */
>   umem->hugetlb   = 1;
> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> index c7a43624c96b..f9326ccda4b5 100644
> --- a/drivers/infiniband/core/uverbs_cmd.c
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -953,6 +953,18 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
>   goto err_free;
>   }
>  
> + if (cmd.access_flags & IB_ACCESS_ON_DEMAND) {
> + struct ib_device_attr attr;
> +
> + ret = ib_query_device(pd->device, &attr);
> + if (ret || !(attr.device_cap_flags &
> + IB_DEVICE_ON_DEMAND_PAGING)) {
> + pr_debug("ODP support not available\n");
> + ret = -EINVAL;
> + goto err_put;
> + }
> + }
> +
>   mr = pd->device->reg_user_mr(pd, cmd.start, cmd.length, cmd.hca_va,
>cmd.access_flags, &udata);
>   if (IS_ERR(mr)) {
> @@ -3289,6 +3301,19 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
>   copy_query_dev_fields(file, &resp.base, &attr);
>   resp.comp_mask = 0;
>  
> +#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
> + if (cmd.comp_mask & IB_USER_VERBS_EX_QUERY_DEVICE_ODP) {
> + resp.odp_caps.general_caps = attr.odp_caps.general_caps;
> + resp.odp_caps.per_transport_caps.rc_odp_caps =
> + attr.odp_caps.per_transport_caps.rc_odp_caps;
> + resp.odp_caps.per_transport_caps.uc_odp_caps =
> + attr.odp_caps.per_transport_caps.uc_odp_caps;
> + resp.odp_caps.per_transport_caps.ud_odp_caps =
> + 
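
The umem->writable change quoted above can be illustrated with a small
user-space sketch. The flag names and bit values below are assumptions
chosen for illustration only (they are not copied from ib_verbs.h), and
writable_old()/writable_new() are hypothetical helpers that mirror the
two expressions in the diff. The point is only that, under the old
"anything except remote read" rule, a read-only registration that also
sets the on-demand flag would be treated as writable.

```c
#include <assert.h>

/* Illustrative access flags; names and bit positions are assumed. */
enum access_flags {
	ACCESS_LOCAL_WRITE   = 1 << 0,
	ACCESS_REMOTE_WRITE  = 1 << 1,
	ACCESS_REMOTE_READ   = 1 << 2,
	ACCESS_REMOTE_ATOMIC = 1 << 3,
	ACCESS_MW_BIND       = 1 << 4,
	ACCESS_ON_DEMAND     = 1 << 6,
};

/* Old rule: any flag other than "remote read" implies writable. */
static int writable_old(int access)
{
	return !!(access & ~ACCESS_REMOTE_READ);
}

/* New rule: only flags that can modify memory imply writable. */
static int writable_new(int access)
{
	return !!(access & (ACCESS_LOCAL_WRITE | ACCESS_REMOTE_WRITE |
			    ACCESS_REMOTE_ATOMIC | ACCESS_MW_BIND));
}

/* Compare both rules on a read-only ODP registration. */
static void self_check(void)
{
	int ro_odp = ACCESS_REMOTE_READ | ACCESS_ON_DEMAND;

	assert(writable_old(ro_odp) == 1);	/* ODP bit alone forces writable */
	assert(writable_new(ro_odp) == 0);	/* stays read-only */
	assert(writable_new(ro_odp | ACCESS_LOCAL_WRITE) == 1);
}
```

Calling self_check() from a test program exercises the three cases; any
new flag that does not modify memory behaves like ACCESS_ON_DEMAND here.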

Re: [PATCH for-next] RDMA/CMA: Mark IPv4 addresses correctly when the listener is IPv6

2014-12-16 Thread Or Gerlitz

On 11/16/2014 11:04 AM, Shachar Raindel wrote:



-----Original Message-----
From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Hefty, Sean
Sent: Thursday, November 13, 2014 6:24 PM
To: Or Gerlitz
Cc: linux-rdma@vger.kernel.org; Roland Dreier; Yotam Kenneth
Subject: RE: [PATCH for-next] RDMA/CMA: Mark IPv4 addresses correctly
when the listener is IPv6


From: Yotam Kenneth 

When accepting a new connection with the listener being IPv6, the
family of the new connection is set as IPv6. This causes the
cma_zero_addr function to return true on a non-zero address. As a
result, the wrong code path is taken. This causes the connection
request to be rejected, as the RDMA-CM code looks for the wrong type
of device.

Since copying the IP address is done in a different function depending
on the family (cma_save_ip4_info/cma_save_ip6_info), this is fixed by
hard coding the family of the IP address according to the function.

Signed-off-by: Yotam Kenneth 
Signed-off-by: Or Gerlitz 
---
  drivers/infiniband/core/cma.c |8 
  1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index d570030..6e5e11c 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -866,12 +866,12 @@ static void cma_save_ip4_info(struct rdma_cm_id *id, struct rdma_cm_id *listen_i

listen4 = (struct sockaddr_in *) &listen_id->route.addr.src_addr;
ip4 = (struct sockaddr_in *) &id->route.addr.src_addr;
-   ip4->sin_family = listen4->sin_family;
+   ip4->sin_family = AF_INET;
ip4->sin_addr.s_addr = hdr->dst_addr.ip4.addr;
ip4->sin_port = listen4->sin_port;

ip4 = (struct sockaddr_in *) &id->route.addr.dst_addr;
-   ip4->sin_family = listen4->sin_family;
+   ip4->sin_family = AF_INET;
ip4->sin_addr.s_addr = hdr->src_addr.ip4.addr;
ip4->sin_port = hdr->port;
  }
@@ -883,12 +883,12 @@ static void cma_save_ip6_info(struct rdma_cm_id *id, struct rdma_cm_id *listen_i

listen6 = (struct sockaddr_in6 *) &listen_id->route.addr.src_addr;
ip6 = (struct sockaddr_in6 *) &id->route.addr.src_addr;
-   ip6->sin6_family = listen6->sin6_family;
+   ip6->sin6_family = AF_INET6;
ip6->sin6_addr = hdr->dst_addr.ip6;
ip6->sin6_port = listen6->sin6_port;

ip6 = (struct sockaddr_in6 *) &id->route.addr.dst_addr;
-   ip6->sin6_family = listen6->sin6_family;
+   ip6->sin6_family = AF_INET6;
ip6->sin6_addr = hdr->src_addr.ip6;
ip6->sin6_port = hdr->port;

I can't say that I understand what the problem is or how the change
fixes it.  Is listen4->sin_family above not AF_INET?  If that's the case,
then aren't we still taking the wrong code path and just masking some
bug further up in the code path?


An IPv6 listener socket can accept IPv4 connections, especially
when listening to the "any" interface. This behavior is
configurable with the net.ipv6.bindv6only sysctl flag. In
RDMA-CM, we use this flag as the baseline value for the CM-ID
flag "afonly". If afonly is set to 0, a listening RDMA-CM ID for
IPv6 any address will happily accept an IPv4 connection. The IP
addresses saved in cma_save_ip[46]_info are only used internally
in the RDMA-CM code for selecting the appropriate device for
answering and sending a reply. As such, there is no problem with
hard coding the appropriate address family in the functions.
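
The failure mode described in the patch and in the explanation above can
be modelled in user space. The following is a minimal sketch, not the
kernel code: zero_addr() is a hypothetical stand-in for a zero-address
check that dispatches on the stored family, and save_ip4() is a
hypothetical stand-in for an IPv4 save helper that takes the family as a
parameter. It assumes the usual sockaddr_in/sockaddr_in6 layouts, where
the IPv4 address does not overlap the sin6_addr field.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical check: is the stored address the "any" address? */
static int zero_addr(const struct sockaddr_storage *ss)
{
	static const struct in6_addr any6;	/* zero-initialized, i.e. "::" */
	struct sockaddr_in ip4;
	struct sockaddr_in6 ip6;

	switch (ss->ss_family) {
	case AF_INET:
		memcpy(&ip4, ss, sizeof(ip4));
		return ip4.sin_addr.s_addr == htonl(INADDR_ANY);
	case AF_INET6:
		memcpy(&ip6, ss, sizeof(ip6));
		return memcmp(&ip6.sin6_addr, &any6, sizeof(any6)) == 0;
	default:
		return 0;
	}
}

/* Hypothetical helper: store an IPv4 address, tagged with 'family'. */
static void save_ip4(struct sockaddr_storage *ss, sa_family_t family,
		     const char *dotted)
{
	struct sockaddr_in ip4;

	memset(&ip4, 0, sizeof(ip4));
	ip4.sin_family = family;
	inet_pton(AF_INET, dotted, &ip4.sin_addr);

	memset(ss, 0, sizeof(*ss));
	memcpy(ss, &ip4, sizeof(ip4));
}

/* Compare a family copied from an IPv6 listener with a hard-coded one. */
static void self_check(void)
{
	struct sockaddr_storage ss;

	save_ip4(&ss, AF_INET6, "192.0.2.1");	/* family taken from listener */
	assert(zero_addr(&ss) == 1);		/* non-zero IPv4 misread as "::" */

	save_ip4(&ss, AF_INET, "192.0.2.1");	/* family hard-coded */
	assert(zero_addr(&ss) == 0);
}
```

With the family copied from an IPv6 listener, the check inspects the
bytes where sin6_addr would be and finds only zeros; with AF_INET
hard-coded, the same address is correctly reported as non-zero.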




Hi Sean,

Any comment? Can you please ack the patch or continue the discussion?


Or.


Re: linux-next: build failure after merge of the infiniband tree

2014-12-16 Thread Or Gerlitz

On 12/16/2014 3:56 AM, Roland Dreier wrote:
> On Mon, Dec 15, 2014 at 5:47 PM, Stephen Rothwell wrote:
>> Hi all,
>>
>> After merging the infiniband tree, today's linux-next build (x86_64
>> allmodconfig) failed like this:
>>
>> drivers/infiniband/hw/mlx5/main.c: In function 'mlx5_ib_query_device':
>> drivers/infiniband/hw/mlx5/main.c:248:34: error: 'MLX5_DEV_CAP_FLAG_ON_DMND_PG' undeclared (first use in this function)
>>    if (dev->mdev->caps.gen.flags & MLX5_DEV_CAP_FLAG_ON_DMND_PG)
>>    ^
>> [...]
>> Really?  Code added halfway through the merge window not even build
>> tested?
>
> It's not quite as bad as it seems.  The infiniband tree itself builds,
> the problem is the merged tree.
>
> The Mellanox guys merged the "cleanup"

Hi Roland,

So shit happens... Eli is away this week, but it's clear that this
portion of the cleanup was terribly wrong and done by mistake. Sorry
for that, and thanks for addressing it quickly.

Or.

> commit 0c7aac854f52
> Author: Eli Cohen
> Date:   Tue Dec 2 02:26:14 2014
>
>  net/mlx5_core: Remove unused dev cap enum fields
>
>  These enumerations are not used so remove them.
>
>  Signed-off-by: Eli Cohen
>  Signed-off-by: David S. Miller
>
> through davem's tree, and then went ahead and used at least
> MLX5_DEV_CAP_FLAG_ON_DMND_PG (which that patch removes) in patches
> they merged through my tree.
>
> I'll add a partial revert of that patch to my tree to get back the
> now-used enum values.
>
>   - R.

