[ewg] Re: [PATCH 1/2 - v2] IB/iser: move open-iscsi crypto functions to kernel_addons (SLES10, RHEL5)

2007-08-06 Thread Michael S. Tsirkin
> Quoting Erez Zilber <[EMAIL PROTECTED]>:
> Subject: [PATCH 1/2 - v2] IB/iser: move open-iscsi crypto functions to 
> kernel_addons (SLES10, RHEL5)
> 
> move open-iscsi crypto functions to kernel_addons (SLES10, RHEL5)
> 
> Signed-off-by: Erez Zilber <[EMAIL PROTECTED]>

Couldn't apply this.
Applying IB/iser: move open-iscsi crypto functions to kernel_addons (SLES10, 
RHEL5)

error: patch failed: 
kernel_patches/backport/2.6.16_sles10/open-iscsi-tx-hash-fixes.patch:1
error: kernel_patches/backport/2.6.16_sles10/open-iscsi-tx-hash-fixes.patch: 
patch does not apply
error: patch failed: 
kernel_patches/backport/2.6.16_sles10_sp1/open-iscsi-tx-hash-fixes.patch:1
error: 
kernel_patches/backport/2.6.16_sles10_sp1/open-iscsi-tx-hash-fixes.patch: patch 
does not apply
error: patch failed: 
kernel_patches/backport/2.6.18_FC6/open-iscsi-tx-hash-fixes.patch:1
error: kernel_patches/backport/2.6.18_FC6/open-iscsi-tx-hash-fixes.patch: patch 
does not apply
Patch failed at 0001.

Erez, could you publish a git tree with these patches applied?

-- 
MST
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] Re: [PATCH 1/2 - v2] IB/iser: move open-iscsi crypto functions to kernel_addons (SLES10, RHEL5)

2007-08-06 Thread Michael S. Tsirkin
> Quoting Michael S. Tsirkin <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH 1/2 - v2] IB/iser: move open-iscsi crypto functions to 
> kernel_addons (SLES10, RHEL5)
> 
> > Quoting Erez Zilber <[EMAIL PROTECTED]>:
> > Subject: [PATCH 1/2 - v2] IB/iser: move open-iscsi crypto functions to 
> > kernel_addons (SLES10, RHEL5)
> > 
> > move open-iscsi crypto functions to kernel_addons (SLES10, RHEL5)
> > 
> > Signed-off-by: Erez Zilber <[EMAIL PROTECTED]>
> 
> Couldn't apply this.
> Applying IB/iser: move open-iscsi crypto functions to kernel_addons (SLES10, 
> RHEL5)
> 
> error: patch failed: 
> kernel_patches/backport/2.6.16_sles10/open-iscsi-tx-hash-fixes.patch:1
> error: kernel_patches/backport/2.6.16_sles10/open-iscsi-tx-hash-fixes.patch: 
> patch does not apply
> error: patch failed: 
> kernel_patches/backport/2.6.16_sles10_sp1/open-iscsi-tx-hash-fixes.patch:1
> error: 
> kernel_patches/backport/2.6.16_sles10_sp1/open-iscsi-tx-hash-fixes.patch: 
> patch does not apply
> error: patch failed: 
> kernel_patches/backport/2.6.18_FC6/open-iscsi-tx-hash-fixes.patch:1
> error: kernel_patches/backport/2.6.18_FC6/open-iscsi-tx-hash-fixes.patch: 
> patch does not apply
> Patch failed at 0001.
> 
> Erez, could you publish a git tree with these patches applied?

Same with quilt:

Applying patch patcha.patch
patching file kernel_addons/backport/2.6.16_sles10/include/linux/crypto.h
patching file kernel_addons/backport/2.6.16_sles10_sp1/include/linux/crypto.h
patching file kernel_addons/backport/2.6.18_FC6/include/linux/crypto.h
patching file 
kernel_patches/backport/2.6.16_sles10/open-iscsi-tx-hash-fixes.patch
Hunk #1 FAILED at 1.
File kernel_patches/backport/2.6.16_sles10/open-iscsi-tx-hash-fixes.patch is 
not empty after patch, as expected
1 out of 1 hunk FAILED -- rejects in file 
kernel_patches/backport/2.6.16_sles10/open-iscsi-tx-hash-fixes.patch
patching file 
kernel_patches/backport/2.6.16_sles10_sp1/open-iscsi-tx-hash-fixes.patch
Hunk #1 FAILED at 1.
File kernel_patches/backport/2.6.16_sles10_sp1/open-iscsi-tx-hash-fixes.patch 
is not empty after patch, as expected
1 out of 1 hunk FAILED -- rejects in file 
kernel_patches/backport/2.6.16_sles10_sp1/open-iscsi-tx-hash-fixes.patch
patching file kernel_patches/backport/2.6.18_FC6/open-iscsi-tx-hash-fixes.patch
Hunk #1 FAILED at 1.
File kernel_patches/backport/2.6.18_FC6/open-iscsi-tx-hash-fixes.patch is not 
empty after patch, as expected
1 out of 1 hunk FAILED -- rejects in file 
kernel_patches/backport/2.6.18_FC6/open-iscsi-tx-hash-fixes.patch
Patch patcha.patch does not apply (enforce with -f)

Should the whole of open-iscsi-tx-hash-fixes.patch be removed?

-- 
MST


[ewg] Re: ofa_1_2_c_kernel 20070802-0201 daily build status

2007-08-06 Thread Hoang-Nam Nguyen
Hello Michael and Vladimir!
> ehca backports for kernel.org kernels seem to be broken.
> 1. Does anyone care enough to fix them? If not we'll disable
>    ehca in build for these kernels.
I downloaded the daily build package ofa_1_2_c_kernel-20070804-0200.tgz
and followed the build scheme (configure, make) on 2.6.19, 2.6.18, 2.6.17
and 2.6.16/sles10/sles10_sp1. Except for 2.6.16/sles10/sles10_sp1, where
a patch for kmem_cache_zalloc() is required for ehca, all of them
built without errors; see below. So I'm wondering what I'm doing
differently from your daily build script?
PS: I will also run the build on rhel5 (I still have to set up the kernel tree).
Thanks
Nam


* linux-2.6.19
[EMAIL PROTECTED] ofa_1_2_c_kernel-20070804-0200]# uname -r
2.6.19

  gcc -m64 
-Wp,-MD,/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/ulp/ipoib/.ib_ipoib.mod.o.d
  -nostdinc -isystem /usr/lib/gcc/ppc64-redhat-linux/4.1.1/include -D__KERNEL__ 
\
-I/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/kernel_addons/backport/2.6.19/include/
 \
-I/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/include \
-I/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/include \
-Iinclude \
 \
-include include/linux/autoconf.h \
-include /root/2.6.19/ofa_1_2_c_kernel-20070804-0200/include/linux/autoconf.h \
  -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing 
-fno-common -O2 -msoft-float -pipe -mminimal-toc -mtraceback=none  
-mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring 
-Wa,-maltivec -fomit-frame-pointer  -fno-stack-protector 
-Wdeclaration-after-statement -Wno-pointer-sign
-I/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/include 
-I/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/include   
-I/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/ulp/ipoib  
-I/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/debug  
-I/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/cxgb3/core  
-I/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/drivers/net/cxgb3 
-I/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/net/rds   
-I/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/drivers/net/mlx4  
-I/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/mlx4
-D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ib_ipoib.mod)"  
-D"KBUILD_MODNAME=KBUILD_STR(ib_ipoib)" -DMODULE -c -o 
/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/ulp/ipoib/ib_ipoib.mod.o
 
/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/ulp/ipoib/ib_ipoib.mod.c
  ld -m elf64ppc  -r -o 
/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/ulp/ipoib/ib_ipoib.ko
 
/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/ulp/ipoib/ib_ipoib.o
 
/root/2.6.19/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/ulp/ipoib/ib_ipoib.mod.o
make[1]: Leaving directory `/usr/src/linux-2.6.19'
[EMAIL PROTECTED] ofa_1_2_c_kernel-20070804-0200]# find . -name '*.ko'
./drivers/infiniband/ulp/ipoib/ib_ipoib.ko
./drivers/infiniband/hw/ehca/ib_ehca.ko
./drivers/infiniband/core/ib_mad.ko
./drivers/infiniband/core/iw_cm.ko
./drivers/infiniband/core/ib_uverbs.ko
./drivers/infiniband/core/ib_ucm.ko
./drivers/infiniband/core/ib_cm.ko
./drivers/infiniband/core/ib_sa.ko
./drivers/infiniband/core/ib_core.ko

* 2.6.18
[EMAIL PROTECTED] 2.6.18]# uname -r
2.6.18
  gcc -m64 
-Wp,-MD,/root/2.6.18/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/ulp/ipoib/.ib_ipoib.mod.o.d
  -nostdinc -isystem /usr/lib/gcc/ppc64-redhat-linux/4.1.1/include -D__KERNEL__ 
\
-I/root/2.6.18/ofa_1_2_c_kernel-20070804-0200/kernel_addons/backport/2.6.18/include/
 \
-I/root/2.6.18/ofa_1_2_c_kernel-20070804-0200/include \
-I/root/2.6.18/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/include \
-Iinclude \
 \
-include include/linux/autoconf.h \
-include /root/2.6.18/ofa_1_2_c_kernel-20070804-0200/include/linux/autoconf.h \
  -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing 
-fno-common -O2 -msoft-float -pipe -mminimal-toc -mtraceback=none  
-mcall-aixdesc -mtune=power4 -mno-altivec -funit-at-a-time -mstring 
-Wa,-maltivec -fomit-frame-pointer  -fno-stack-protector 
-Wdeclaration-after-statement -Wno-pointer-sign
-I/root/2.6.18/ofa_1_2_c_kernel-20070804-0200/include 
-I/root/2.6.18/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/include   
-I/root/2.6.18/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/ulp/ipoib  
-I/root/2.6.18/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/debug  
-I/root/2.6.18/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/cxgb3/core  
-I/root/2.6.18/ofa_1_2_c_kernel-20070804-0200/drivers/net/cxgb3 
-I/root/2.6.18/ofa_1_2_c_kernel-20070804-0200/net/rds   
-I/root/2.6.18/ofa_1_2_c_kernel-20070804-0200/drivers/net/mlx4  
-I/root/2.6.18/ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/mlx4
-D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ib_ipoib.mod)"  
-D"KBUILD_MODNAME=KBUILD_STR(ib_ipoib)" -DMODULE -c -o 
/root/2.6.18/ofa_1_2_c_kernel-2

[ewg] Re: ofa_1_2_c_kernel 20070802-0201 daily build status

2007-08-06 Thread Hoang-Nam Nguyen
Hello Doug and Scott!

On Thursday 02 August 2007 18:08, Michael S. Tsirkin wrote:
> ehca backports for kernel.org kernels seem to be broken.
> 1. Does anyone care enough to fix them? If not we'll disable
>    ehca in build for these kernels.
> 
> 2. Could you upload kernels for RHEL4U5 and SLES10 ppc64?
>    This would make it possible for us to add it to nightly builds.
Could you please provide Michael and Vladimir with a URL for downloading
the above kernel source trees, so that they can run the daily build of the
ofed code suite against them? This will help us catch build issues at a
very early stage.
Thanks much in advance!

Regards
Nam





[ewg] RFCv2: SRC API

2007-08-06 Thread Michael S. Tsirkin
This is version 2 of the proposal, addressing comments
from version 1.

Changelog:
- Use oflags to make API smaller
- Clarify sharing semantics
- Add documentation

This is the API proposal for support of the SRC
(scalable reliable connected) protocol extension in libibverbs.

This adds APIs to:
- manage SRC domains

- share SRC domains between processes,
by means of creating a 1:1 association
between an SRC domain and an inode.

Notes:
- The inode is specified by means of a file descriptor,
this makes it possible for the user to manage file
creation/deletion in the most flexible manner
(e.g. tmpfile can be used).

- I envision implementing this sharing mechanism in kernel by means
of a per-device tree, with inode as a key and domain object
as a value.
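
To illustrate the sharing semantics described in the notes, here is a minimal
userspace sketch. It is not part of the proposed API: the names
src_domain_open() and struct src_domain are hypothetical, and a plain linked
list stands in for the per-device tree. The only point is that the key is the
inode behind the file descriptor (obtained with fstat()), not the descriptor
itself.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical domain object; the real one would live in the kernel. */
struct src_domain {
	dev_t dev;               /* device that holds the inode */
	ino_t ino;               /* inode used as the sharing key */
	int refcount;            /* how many times the domain was opened */
	struct src_domain *next; /* list link (the proposal envisions a tree) */
};

static struct src_domain *domain_list;

/* Return the domain associated with the inode behind fd, creating it
 * on first use. Returns NULL on error. */
struct src_domain *src_domain_open(int fd)
{
	struct stat st;
	struct src_domain *d;

	if (fstat(fd, &st) != 0)
		return NULL;

	for (d = domain_list; d; d = d->next) {
		if (d->dev == st.st_dev && d->ino == st.st_ino) {
			d->refcount++;
			return d;
		}
	}

	d = calloc(1, sizeof(*d));
	if (!d)
		return NULL;
	d->dev = st.st_dev;
	d->ino = st.st_ino;
	d->refcount = 1;
	d->next = domain_list;
	domain_list = d;
	return d;
}

/* Usage example: two descriptors of one temporary file share a domain. */
void src_domain_demo(void)
{
	FILE *f = tmpfile();
	struct src_domain *a, *b;
	int fd2;

	assert(f != NULL);
	fd2 = dup(fileno(f));
	assert(fd2 >= 0);

	a = src_domain_open(fileno(f));
	b = src_domain_open(fd2);
	assert(a != NULL && a == b);

	close(fd2);
	fclose(f);
}
```

The sketch never releases a domain; how the association ends when the last
user goes away is left open here.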

Please comment.

Signed-off-by: Michael S. Tsirkin <[EMAIL PROTECTED]>



diff --git a/SRC.txt b/SRC.txt
new file mode 100644
index 0000000..3881477
--- /dev/null
+++ b/SRC.txt
@@ -0,0 +1,133 @@
+Here's some documentation on Scalable Reliable Connections.
+
+ * * *
+
+SRC is an extension supported by recent Mellanox hardware
+which is geared toward reducing the number of QPs
+required for all-to-all communication on systems
+with a high number of jobs per node.
+
+===
+Motivation:
+===
+Given N nodes with J jobs per node, number of QPs required
+for all-to-all communication is:
+
+With RC:
+   O((N * J) ^ 2)
+
+   Since each job out of O(N * J) jobs must create a single QP
+   to communicate with each one of O(N * J) other jobs.
+
+With SRC:
+   O(N ^ 2 * J)
+
+   This is achieved by using a single send queue (per job, out of O(N * J) jobs)
+   to send data to all J jobs running on a specific node (out of O(N) nodes).
+   Hardware uses new "SRQ number" field in packet header to
+   multiplex receive WRs and WCs to private memory of each job.
+
+This is a similar idea to IB RD.
+Q: Why not use RD then?
+A: Because no hardware supports it.
+
+Details:
+
+===
+Verbs extension:
+===
+
+- There is a new transport/QP type "SRC".
+- There is a new object type "SRC domain"
+- Each SRQ gets new (optional) attributes:
+   SRC domain
+   SRC SRQ number
+   SRC CQ
+  SRQ must have either all 3 of these or none of these attributes
+
+- QPs of type SRC have all the same attributes as regular RC QPs
+  connected to SRQ, except that:
+  A. Each SRC QP has a new required attribute "SRC domain"
+  B. SRC QPs do *not* have "SRQ" attribute
+   (do not have a specific SRQ associated with them)
+
+===
+Protocol extension:
+===
+SRC QP behaviour: Requestor
+- Post send WR for this QP type is extended with SRQ number field
+  This number is sent as part of packet header
+- SRC Packets follow rules for RC packets on the wire, exactly
+  What is different is their handling at the responder side
+
+SRC QP behaviour: Responder
+Each incoming packet passes transport checks with respect
+to the SRC QP, following RC rules, exactly.
+
+After this, SRQ number in packet header is used to look up
+a specific SRQ. SRC domain of the resulting SRQ must be equal
+to SRC domain of the QP, otherwise a NAK is sent,
+and QP moves to error state.
+
+If the SRC domains match, receive WR and receive WC processing
+are as follows:
+
+- RC Send
+  - Rather than using SRQ to which the QP is attached,
+SRQ is looked up by SRQ number in the packet.
+Receive WR is taken from this SRQ.
+  - Completions are generated on the CQ specified in the SRQ
+
+- RDMA/Atomic
+  - Rather than using PD to which the QP is attached,
+SRQ is looked up by SRQ number in the packet.
+PD of this SRQ is used for protection checks.
+
+===
+Pseudo code:
+===
+
+Consider again a setup where there are N nodes with J jobs per node.
+All N * J jobs need to perform all-to-all communication.
+Using RC QPs, this would call for O((N * J) ^ 2) QPs.
+Here is how SRC can be used to reduce the number of QPs to O(N ^ 2 * J).
+
+At startup:
+1. All jobs on each node share a single SRC domain
+2. Each job creates a CQ for receive WCs
+3. Each job creates a SRQ attached to this CQ and to the shared domain
+
+When job j1 needs to transmit to job j2 on remote node n for the first time:
+1. Test: does job j1 have an existing connection to some job on node n?
+- If no:
+   j1 creates an SRC QP qp1 (send QP)
+   qp1 is only used to post send WRs
+   j2 creates an SRC QP qp2
+   qp2 is part of SRC

[ewg] Re: ofa_1_2_c_kernel 20070802-0201 daily build status

2007-08-06 Thread Michael S. Tsirkin

> Quoting Hoang-Nam Nguyen <[EMAIL PROTECTED]>:
> Subject: Re: ofa_1_2_c_kernel 20070802-0201 daily build status
> 
> Hello Michael and Vladimir!
> > ehca backports for kernel.org kernels seem to be broken.
> > 1. Does anyone care enough to fix them? If not we'll disable
> >    ehca in build for these kernels.
> I downloaded daily build package ofa_1_2_c_kernel-20070804-0200.tgz
> and followed the build scheme configure, make on 2.6.19, 2.6.18, 2.6.17
> and 2.6.16/sles10/sles10_sp1. Except for 2.6.16/sles10/sles10_sp1
> a patch for kmem_cache_zalloc() is required for ehca the others were
> built without errors, see below. Thus, I'm wondering what I'm doing
> differently than your daily build script?

Could be different kernel configs or compiler version?
Can you please build on ofa server against kernels in ~vlad/kernel.org/?
The cross tool chain is here: /home/vlad/cross/

-- 
MST


[ewg] [PATCH ofed-1.2.c] ehca: backport kmem_cache_zalloc() for 2.6.10/sles10/sles10_sp1

2007-08-06 Thread Hoang-Nam Nguyen
Hello Michael and Vladimir!
The patch below adds an ehca backport patch to the directories 2.6.16,
2.6.16_sles10 and 2.6.16_sles10_sp1 underneath kernel_patches/backport
of the ofed-1.2.c source tree.
Thanks!
Nam



backport kmem_cache_zalloc() to 2.6.16, 2.6.16_sles10 and 2.6.16_sles10_sp1

Signed-off-by: Hoang-Nam Nguyen <[EMAIL PROTECTED]>
---

 2.6.16/ehca_kmem_cache_zalloc_to_2_6_16.patch|   97 +++
 2.6.16_sles10/ehca_kmem_cache_zalloc_to_2_6_16.patch |   97 +++
 2.6.16_sles10_sp1/ehca_kmem_cache_zalloc_to_2_6_16.patch |   97 +++
 3 files changed, 291 insertions(+)

diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/kernel_patches/backport/2.6.16/ehca_kmem_cache_zalloc_to_2_6_16.patch
 
ofa_1_2_c_kernel-20070804-0200/kernel_patches/backport/2.6.16/ehca_kmem_cache_zalloc_to_2_6_16.patch
--- 
ofa_1_2_c_kernel-20070804-0200_orig/kernel_patches/backport/2.6.16/ehca_kmem_cache_zalloc_to_2_6_16.patch
   1970-01-01 01:00:00.0 +0100
+++ 
ofa_1_2_c_kernel-20070804-0200/kernel_patches/backport/2.6.16/ehca_kmem_cache_zalloc_to_2_6_16.patch
2007-08-06 00:53:59.0 +0200
@@ -0,0 +1,97 @@
+diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_cq.c 
ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_cq.c
+--- ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_cq.c   
2007-08-04 11:00:05.0 +0200
 ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_cq.c
2007-08-06 00:41:50.0 +0200
+@@ -134,13 +134,14 @@ struct ib_cq *ehca_create_cq(struct ib_d
+   if (cqe >= 0x - 64 - additional_cqe)
+   return ERR_PTR(-EINVAL);
+ 
+-  my_cq = kmem_cache_zalloc(cq_cache, GFP_KERNEL);
++  my_cq = kmem_cache_alloc(cq_cache, GFP_KERNEL);
+   if (!my_cq) {
+   ehca_err(device, "Out of memory for ehca_cq struct device=%p",
+device);
+   return ERR_PTR(-ENOMEM);
+   }
+ 
++  memset(my_cq, 0, sizeof(*my_cq));
+   memset(&param, 0, sizeof(struct ehca_alloc_cq_parms));
+ 
+   spin_lock_init(&my_cq->spinlock);
+diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_main.c 
ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_main.c
+--- ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_main.c 
2007-08-04 11:00:05.0 +0200
 ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_main.c  
2007-08-06 00:40:58.0 +0200
+@@ -113,9 +113,11 @@ static struct kmem_cache *ctblk_cache = 
+ 
+ void *ehca_alloc_fw_ctrlblock(gfp_t flags)
+ {
+-  void *ret = kmem_cache_zalloc(ctblk_cache, flags);
++  void *ret = kmem_cache_alloc(ctblk_cache, flags);
+   if (!ret)
+   ehca_gen_err("Out of memory for ctblk");
++  else
++  memset(ret, 0, EHCA_PAGESIZE);
+   return ret;
+ }
+ 
+diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_mrmw.c 
ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_mrmw.c
+--- ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_mrmw.c 
2007-08-04 11:00:05.0 +0200
 ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_mrmw.c  
2007-08-06 00:39:30.0 +0200
+@@ -55,8 +55,9 @@ static struct ehca_mr *ehca_mr_new(void)
+ {
+   struct ehca_mr *me;
+ 
+-  me = kmem_cache_zalloc(mr_cache, GFP_KERNEL);
++  me = kmem_cache_alloc(mr_cache, GFP_KERNEL);
+   if (me) {
++  memset(me, 0, sizeof(*me));
+   spin_lock_init(&me->mrlock);
+   } else
+   ehca_gen_err("alloc failed");
+@@ -73,8 +74,9 @@ static struct ehca_mw *ehca_mw_new(void)
+ {
+   struct ehca_mw *me;
+ 
+-  me = kmem_cache_zalloc(mw_cache, GFP_KERNEL);
++  me = kmem_cache_alloc(mw_cache, GFP_KERNEL);
+   if (me) {
++  memset(me, 0, sizeof(*me));
+   spin_lock_init(&me->mwlock);
+   } else
+   ehca_gen_err("alloc failed");
+diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_pd.c 
ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_pd.c
+--- ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_pd.c   
2007-08-04 11:00:05.0 +0200
 ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_pd.c
2007-08-06 00:38:14.0 +0200
+@@ -50,13 +50,14 @@ struct ib_pd *ehca_alloc_pd(struct ib_de
+ {
+   struct ehca_pd *pd;
+ 
+-  pd = kmem_cache_zalloc(pd_cache, GFP_KERNEL);
++  pd = kmem_cache_alloc(pd_cache, GFP_KERNEL);
+   if (!pd) {
+   ehca_err(device, "device=%p context=%p out of memory",
+device, context);
+   return ERR_PTR(-ENOMEM);
+   }
+ 
++  memset(pd, 0, sizeof(*pd));
+   pd->ownpid = current->tgid;
+ 
+   /*
+diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/dri

[ewg] Re: [PATCH ofed-1.2.c] ehca: backport kmem_cache_zalloc() for 2.6.10/sles10/sles10_sp1

2007-08-06 Thread Michael S. Tsirkin
Let's not do it this way.

I think the right thing is to implement kmem_cache_zalloc
by means of kmem_cache_alloc and memset in kernel_addons.
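
A minimal sketch of what such a helper could look like. This is a userspace
mock so that it can be compiled and tried outside the kernel: struct
kmem_cache, gfp_t, GFP_KERNEL, kmem_cache_alloc() and kmem_cache_size() below
are stand-ins, not the real kernel definitions. It also assumes the target
kernels provide some way to query the object size of a cache, such as
kmem_cache_size(); the actual kernel_addons header would have to be checked
against each backported kernel.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-ins so the sketch compiles outside the kernel. */
typedef unsigned int gfp_t;
#define GFP_KERNEL 0u

struct kmem_cache {
	size_t obj_size; /* size of one object in this cache */
};

/* Stand-in allocator. Slab memory is not zeroed, so mimic that by
 * filling the object with a poison byte. */
void *kmem_cache_alloc(struct kmem_cache *cache, gfp_t flags)
{
	void *obj;

	assert(cache != NULL);
	(void)flags;
	obj = malloc(cache->obj_size);
	if (obj)
		memset(obj, 0xAA, cache->obj_size);
	return obj;
}

/* Stand-in for querying the object size of a cache. */
size_t kmem_cache_size(struct kmem_cache *cache)
{
	return cache->obj_size;
}

/* The backport itself: allocate, then zero the whole object. */
void *kmem_cache_zalloc(struct kmem_cache *cache, gfp_t flags)
{
	void *obj = kmem_cache_alloc(cache, flags);

	if (obj)
		memset(obj, 0, kmem_cache_size(cache));
	return obj;
}
```

With a helper like this in one place, the driver sources could keep calling
kmem_cache_zalloc() unchanged instead of carrying a per-call-site patch.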



Quoting Hoang-Nam Nguyen <[EMAIL PROTECTED]>:
Subject: [PATCH ofed-1.2.c] ehca: backport kmem_cache_zalloc() for 
2.6.10/sles10/sles10_sp1

Hello Michael and Vladimir!
This patch below adds a backport patch for ehca to the dirs 2.6.16, 
2.6.16_sles10
and 2.6.16_sles10_sp1 underneath kernel_patches/backport of ofed-1.2.c source 
tree.
Thanks!
Nam



backport kmem_cache_zalloc() to 2.6.10, 2.6.10_sles10 and 2.6.10_sles10_sp1

Signed-off-by: Hoang-Nam Nguyen <[EMAIL PROTECTED]>
---

 2.6.16/ehca_kmem_cache_zalloc_to_2_6_16.patch|   97 +++
 2.6.16_sles10/ehca_kmem_cache_zalloc_to_2_6_16.patch |   97 +++
 2.6.16_sles10_sp1/ehca_kmem_cache_zalloc_to_2_6_16.patch |   97 +++
 3 files changed, 291 insertions(+)

diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/kernel_patches/backport/2.6.16/ehca_kmem_cache_zalloc_to_2_6_16.patch
 
ofa_1_2_c_kernel-20070804-0200/kernel_patches/backport/2.6.16/ehca_kmem_cache_zalloc_to_2_6_16.patch
--- 
ofa_1_2_c_kernel-20070804-0200_orig/kernel_patches/backport/2.6.16/ehca_kmem_cache_zalloc_to_2_6_16.patch
   1970-01-01 01:00:00.0 +0100
+++ 
ofa_1_2_c_kernel-20070804-0200/kernel_patches/backport/2.6.16/ehca_kmem_cache_zalloc_to_2_6_16.patch
2007-08-06 00:53:59.0 +0200
@@ -0,0 +1,97 @@
+diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_cq.c 
ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_cq.c
+--- ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_cq.c   
2007-08-04 11:00:05.0 +0200
 ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_cq.c
2007-08-06 00:41:50.0 +0200
+@@ -134,13 +134,14 @@ struct ib_cq *ehca_create_cq(struct ib_d
+   if (cqe >= 0x - 64 - additional_cqe)
+   return ERR_PTR(-EINVAL);
+ 
+-  my_cq = kmem_cache_zalloc(cq_cache, GFP_KERNEL);
++  my_cq = kmem_cache_alloc(cq_cache, GFP_KERNEL);
+   if (!my_cq) {
+   ehca_err(device, "Out of memory for ehca_cq struct device=%p",
+device);
+   return ERR_PTR(-ENOMEM);
+   }
+ 
++  memset(my_cq, 0, sizeof(*my_cq));
+   memset(&param, 0, sizeof(struct ehca_alloc_cq_parms));
+ 
+   spin_lock_init(&my_cq->spinlock);
+diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_main.c 
ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_main.c
+--- ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_main.c 
2007-08-04 11:00:05.0 +0200
 ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_main.c  
2007-08-06 00:40:58.0 +0200
+@@ -113,9 +113,11 @@ static struct kmem_cache *ctblk_cache = 
+ 
+ void *ehca_alloc_fw_ctrlblock(gfp_t flags)
+ {
+-  void *ret = kmem_cache_zalloc(ctblk_cache, flags);
++  void *ret = kmem_cache_alloc(ctblk_cache, flags);
+   if (!ret)
+   ehca_gen_err("Out of memory for ctblk");
++  else
++  memset(ret, 0, EHCA_PAGESIZE);
+   return ret;
+ }
+ 
+diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_mrmw.c 
ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_mrmw.c
+--- ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_mrmw.c 
2007-08-04 11:00:05.0 +0200
 ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_mrmw.c  
2007-08-06 00:39:30.0 +0200
+@@ -55,8 +55,9 @@ static struct ehca_mr *ehca_mr_new(void)
+ {
+   struct ehca_mr *me;
+ 
+-  me = kmem_cache_zalloc(mr_cache, GFP_KERNEL);
++  me = kmem_cache_alloc(mr_cache, GFP_KERNEL);
+   if (me) {
++  memset(me, 0, sizeof(*me));
+   spin_lock_init(&me->mrlock);
+   } else
+   ehca_gen_err("alloc failed");
+@@ -73,8 +74,9 @@ static struct ehca_mw *ehca_mw_new(void)
+ {
+   struct ehca_mw *me;
+ 
+-  me = kmem_cache_zalloc(mw_cache, GFP_KERNEL);
++  me = kmem_cache_alloc(mw_cache, GFP_KERNEL);
+   if (me) {
++  memset(me, 0, sizeof(*me));
+   spin_lock_init(&me->mwlock);
+   } else
+   ehca_gen_err("alloc failed");
+diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_pd.c 
ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_pd.c
+--- ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_pd.c   
2007-08-04 11:00:05.0 +0200
 ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_pd.c
2007-08-06 00:38:14.0 +0200
+@@ -50,13 +50,14 @@ struct ib_pd *ehca_alloc_pd(struct ib_de
+ {
+   struct ehca_pd *pd;
+ 
+-  pd = kmem_cache_zalloc(pd_cache, GFP_KERNEL);
++  pd = kmem_cache_alloc(pd_cache, GFP_KERNEL);
+   if (!pd) {
+   ehca_

[ewg] RE: [ofa-general] RFCv2: SRC API

2007-08-06 Thread Tang, Changqing
 
> +When job j1 needs to transmit to job j2 on remote node n for 
> the first time:
> +1. Test: does job j1 have an existing connection to some job 
> on node n?
> +- If no:
> + j1 creates an SRC QP qp1 (send QP)
> + qp1 is only used to post send WRs
> + j2 creates an SRC QP qp2
> + qp2 is part of SRC domain
> + qp2 is only used to do transport checks:
> + neither send nor receive WRs 
> are posted on qp2
> + j1 and j2 create a connection between qp1 and qp2
> + - If yes:
> + let qp1 be the QP which belongs to j1 and is connected
> + to some qp on node n
> +
> +2. j1 gets SRQ number from j2
> +3. j1 can now use QP qp2 from step 1
> +   and SRQ number from step 3 to send data to j2
> +
> +Cleanup:
> +When job j1 does not need to communicate to any jobs on node n, it 
> +disconnects qp1 from qp2, and asks j2 to destroy qp2.

Suppose remote node n has j2/qp2, j3/qp3 and j4/qp4, and qp1 on j1 is connected
to qp2. Then there is no need to connect qp1 to qp3 or qp1 to qp4; they are
automatically connected, right? And how can j3 know that j2 has already
connected to j1, so that it does not need to make a connection again?

qp1 finds its destination by SRQ number only, so what does "+3. j1 can now use
QP qp2 from step 1" mean?

Can we destroy qp2 on j2 first, and let j1 and j3 continue to
communicate?

--CQ



[ewg] RE: [ofa-general] RFCv2: SRC API

2007-08-06 Thread Tang, Changqing

Maybe I missed something here.

Only one of the jobs among j2, j3, j4 on remote node n needs to create a
receiving qp2 for j1, right?

--CQ


> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Tang, Changqing
> Sent: Monday, August 06, 2007 10:31 AM
> To: Michael S. Tsirkin
> Cc: Pavel Shamis; Gil Bloch; ewg@lists.openfabrics.org; 
> [EMAIL PROTECTED]; Ishai Rabinovitz
> Subject: RE: [ofa-general] RFCv2: SRC API
> 
>  
> > +When job j1 needs to transmit to job j2 on remote node n for
> > the first time:
> > +1. Test: does job j1 have an existing connection to some job
> > on node n?
> > +- If no:
> > +   j1 creates an SRC QP qp1 (send QP)
> > +   qp1 is only used to post send WRs
> > +   j2 creates an SRC QP qp2
> > +   qp2 is part of SRC domain
> > +   qp2 is only used to do transport checks:
> > +   neither send nor receive WRs
> > are posted on qp2
> > +   j1 and j2 create a connection between qp1 and qp2
> > +   - If yes:
> > +   let qp1 be the QP which belongs to j1 and is connected
> > +   to some qp on node n
> > +
> > +2. j1 gets SRQ number from j2
> > +3. j1 can now use QP qp2 from step 1
> > +   and SRQ number from step 3 to send data to j2
> > +
> > +Cleanup:
> > +When job j1 does not need to communicate to any jobs on node n, it 
> > +disconnects qp1 from qp2, and asks j2 to destroy qp2.
> 
> Suppose remote node n has j2/qp2, j3/qp3, j4/qp4, qp1 on j1 
> is connected to qp2, then there is no need to make connection 
> between qp1 and qp3,
> qp1 and qp4, there are automatically connected, right ?  Then 
> how can j3 know that j2 has connected to j1, it does not need 
> to make connection again ?
> 
> qp1 find destination by SRQ number only, so "+3. j1 can now 
> use QP qp2 from step 1", what does it mean ?
> 
> Can we destroy qp2 on j2 first, and keep j1 and j3 continue 
> to communicate ?
> 
> --CQ
> 
> ___
> general mailing list
> [EMAIL PROTECTED]
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit 
> http://openib.org/mailman/listinfo/openib-general
> 


[ewg] RE: [ofa-general] RFCv2: SRC API

2007-08-06 Thread Tang, Changqing
 
> > +When job j1 needs to transmit to job j2 on remote node n for
> > the first time:
> > +1. Test: does job j1 have an existing connection to some job
> > on node n?
> > +- If no:
> > +   j1 creates an SRC QP qp1 (send QP)
> > +   qp1 is only used to post send WRs
> > +   j2 creates an SRC QP qp2
> > +   qp2 is part of SRC domain
> > +   qp2 is only used to do transport checks:
> > +   neither send nor receive WRs
> > are posted on qp2
> > +   j1 and j2 create a connection between qp1 and qp2
> > +   - If yes:
> > +   let qp1 be the QP which belongs to j1 and is connected
> > +   to some qp on node n
> > +
> > +2. j1 gets SRQ number from j2
> > +3. j1 can now use QP qp2 from step 1
> > +   and SRQ number from step 3 to send data to j2
> > +
> > +Cleanup:
> > +When job j1 does not need to communicate to any jobs on node n, it 
> > +disconnects qp1 from qp2, and asks j2 to destroy qp2.

OK, I was wrong before, here is my question.

Suppose remote node n has j2, j3 and j4, and j2 is the job that creates qp2
and connects it to qp1 in j1.
If j2 finishes before j3 and j4, then we cannot let j2 destroy qp2,
because j3 and j4 are still communicating with
j1. Since j2 owns qp2, j2 needs to be the last job to clean up.

Am I right?



[ewg] Re: RFCv2: SRC API

2007-08-06 Thread Michael S. Tsirkin
> Only one of the jobs among j2, j3, j4 on remote node n needs to create a
> receiving qp2 for j1, right?

Correct. A single QP can be used to send data to any SRQ that shares the
same domain.
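
A small sketch of the responder-side rule from SRC.txt may help here: the SRQ
is looked up by the SRQ number in the packet, and its SRC domain must match
the domain of the QP. All names below (struct srq, struct src_qp,
src_find_srq()) are hypothetical and only model the check; returning NULL
stands for "send a NAK and move the QP to the error state". Treating an
unknown SRQ number the same way is an assumption, the document does not say.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model objects, not the libibverbs types. */
struct src_domain {
	int id;
};

struct srq {
	unsigned int number;       /* SRQ number carried in the packet header */
	struct src_domain *domain; /* NULL for a regular, non-SRC SRQ */
};

struct src_qp {
	struct src_domain *domain; /* required attribute of an SRC QP */
};

/* Find the SRQ named by the packet and check that it shares the QP's
 * SRC domain. Returns NULL when the packet must be rejected. */
struct srq *src_find_srq(struct srq *table, size_t count,
			 const struct src_qp *qp, unsigned int srq_number)
{
	size_t i;

	assert(qp != NULL);
	for (i = 0; i < count; i++) {
		if (table[i].number != srq_number)
			continue;
		if (table[i].domain == NULL || table[i].domain != qp->domain)
			return NULL; /* domain mismatch */
		return &table[i];
	}
	return NULL; /* unknown SRQ number (assumed to be rejected too) */
}
```

With the SRQs of j2, j3 and j4 all attached to the domain of qp2, packets
arriving on qp2 can reach any of the three, and only the SRQ number in the
packet selects the receiving job.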

-- 
MST


[ewg] Re: RFCv2: SRC API

2007-08-06 Thread Michael S. Tsirkin
> Quoting Tang, Changqing <[EMAIL PROTECTED]>:
> Subject: RE: RFCv2: SRC API
> 
>  
> > +When job j1 needs to transmit to job j2 on remote node n for 
> > the first time:
> > +1. Test: does job j1 have an existing connection to some job 
> > on node n?
> > +- If no:
> > +   j1 creates an SRC QP qp1 (send QP)
> > +   qp1 is only used to post send WRs
> > +   j2 creates an SRC QP qp2
> > +   qp2 is part of SRC domain
> > +   qp2 is only used to do transport checks:
> > +   neither send nor receive WRs 
> > are posted on qp2
> > +   j1 and j2 create a connection between qp1 and qp2
> > +   - If yes:
> > +   let qp1 be the QP which belongs to j1 and is connected
> > +   to some qp on node n
> > +
> > +2. j1 gets SRQ number from j2
> > +3. j1 can now use QP qp2 from step 1
> > +   and SRQ number from step 3 to send data to j2
> > +
> > +Cleanup:
> > +When job j1 does not need to communicate to any jobs on node n, it 
> > +disconnects qp1 from qp2, and asks j2 to destroy qp2.
> 
> Suppose remote node n has j2/qp2, j3/qp3, j4/qp4,

You got it wrong, j3 and j4 do not need to create QPs
to get packets from j1.

> qp1 on j1 is connected
> to qp2, then there is no need to make connection between qp1 and qp3,
> qp1 and qp4, there are automatically connected, right ?  Then how can j3
> know that j2 has connected to j1, it does not need to make connection
> again ?
> 
> qp1 find destination by SRQ number only, so "+3. j1 can now use QP qp2
> from step 1", what does it mean ?
> 
> Can we destroy qp2 on j2 first, and keep j1 and j3 continue to
> communicate ?

Communication is only possible as long as both qp1 and qp2 exist.
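
The connection rule discussed above (one send QP per remote node, reused for
every job on that node) can be sketched in C. This is a minimal userspace
model, not verbs code: `struct sender`, `get_send_qp()` and the integer QP
handles are hypothetical names introduced for illustration, and real QP
creation and connection setup are replaced by a counter.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 16

/* Hypothetical model of job j1: at most one send QP per remote node. */
struct sender {
    int qp_for_node[MAX_NODES]; /* 0 means "no connection to this node yet" */
    int qps_created;            /* how many send QPs j1 has created so far */
};

/*
 * Step 1 of the flow: reuse the QP already connected to some job on
 * `node`, or create and connect a new one if none exists.
 */
static int get_send_qp(struct sender *s, size_t node)
{
    assert(node < MAX_NODES);
    if (s->qp_for_node[node] == 0) {
        /* Stand-in for "j1 creates qp1, j2 creates qp2, connect them". */
        s->qp_for_node[node] = ++s->qps_created;
    }
    return s->qp_for_node[node];
}
```

In this model, sends to j2, j3 or j4 on the same node all go through the one
QP returned for that node; only the SRQ number attached to each send differs.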

-- 
MST


[ewg] Re: RFCv2: SRC API

2007-08-06 Thread Michael S. Tsirkin
> Quoting Tang, Changqing <[EMAIL PROTECTED]>:
> Subject: RE: RFCv2: SRC API
> 
>  
> > > +When job j1 needs to transmit to job j2 on remote node n for
> > > the first time:
> > > +1. Test: does job j1 have an existing connection to some job
> > > on node n?
> > > +- If no:
> > > + j1 creates an SRC QP qp1 (send QP)
> > > + qp1 is only used to post send WRs
> > > + j2 creates an SRC QP qp2
> > > + qp2 is part of SRC domain
> > > + qp2 is only used to do transport checks:
> > > + neither send nor receive WRs
> > > are posted on qp2
> > > + j1 and j2 create a connection between qp1 and qp2
> > > + - If yes:
> > > + let qp1 be the QP which belongs to j1 and is connected
> > > + to some qp on node n
> > > +
> > > +2. j1 gets SRQ number from j2
> > > +3. j1 can now use QP qp2 from step 1
> > > +   and SRQ number from step 3 to send data to j2
> > > +
> > > +Cleanup:
> > > +When job j1 does not need to communicate to any jobs on node n, it 
> > > +disconnects qp1 from qp2, and asks j2 to destroy qp2.
> 
> OK, I was wrong before, here is my question.
> 
> If remote node n has j2, j3, and j4, and j2 is the job that creates qp2 and
> makes the connection with qp1 in j1, then if j2 is done before j3 and j4,
> we cannot let j2 destroy qp2, because j3 and j4 are still communicating
> with j1. Since j2 owns qp2, j2 needs to be the last job to clean up.
> 
> Am I right ?

Correct. Is this clear from the text, or is some kind of
additional clarification necessary?

-- 
MST


[ewg] RE: RFCv2: SRC API

2007-08-06 Thread Tang, Changqing
 

> > OK, I was wrong before, here is my question.
> > 
> > If remote node n has j2, j3, and j4, and j2 is the job that creates qp2 and
> > makes the connection with qp1 in j1, then if j2 is done before j3 and j4,
> > we cannot let j2 destroy qp2, because j3 and j4 are still communicating
> > with j1. Since j2 owns qp2, j2 needs to be the last job to clean up.
> > 
> > Am I right ?
> 
> Correct. Is this clear from the text, or is some kind of 
> additional clarification necessary?

It is not clear on first read, so please add one sentence to clarify it.

If j2 is the last job to clean up, how can it know that all other jobs on
the same node have called ibv_close_src_domain(), and that it is time for
it to clean up?

Is this something that is up to the application to do?

--CQ


> 
> --
> MST
> 


[ewg] Re: RFCv2: SRC API

2007-08-06 Thread Michael S. Tsirkin
> Quoting Tang, Changqing <[EMAIL PROTECTED]>:
> Subject: RE: RFCv2: SRC API
> 
>  
> 
> > > OK, I was wrong before, here is my question.
> > > 
> > > If remote node n has j2, j3, and j4, and j2 is the job that creates qp2
> > > and makes the connection with qp1 in j1, then if j2 is done before j3 and
> > > j4, we cannot let j2 destroy qp2, because j3 and j4 are still
> > > communicating with j1. Since j2 owns qp2, j2 needs to be the last job to
> > > clean up.
> > > 
> > > Am I right ?
> > 
> > Correct. Is this clear from the text, or is some kind of 
> > additional clarification necessary?
> 
> It is not clear on first read, so please add one sentence to clarify it.

Would something like this help?

Cleanup:
When job j1 does not need to communicate to any jobs on node n,
it disconnects qp1 from qp2, and asks j2 to destroy qp2.
+
+Note: both qp1 and qp2 must exist for the communication to take place.
+Thus, j2 should not destroy qp2 (and in particular, should not exit)
+until j1 has completed communication with node n and
+has asked j2 to disconnect.


> If j2 is the last job to clean up, how can it know that all other jobs on
> the same node have called ibv_close_src_domain(), and that it is time for
> it to clean up?
> 
> Is this something that is up to the application to do?

No, this is handled automatically.
Have you seen this text?
 * ibv_close_src_domain - close an SRC domain
 * If this is the last reference, destroys the domain.
 
So, each job has a reference to the domain.
Once the last reference is gone, the domain is destroyed.
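
The reference counting described here can be sketched as follows. This is
only a userspace illustration under assumptions: `struct src_domain`,
`domain_open()` and `domain_close()` are hypothetical stand-ins and not the
proposed verbs API, which is only quoted above by its comment text.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an SRC domain shared by jobs on one node. */
struct src_domain {
    int  refs;       /* one reference per job that has the domain open */
    bool destroyed;  /* set once the last reference is dropped */
};

/* A job opens (attaches to) the domain and takes a reference. */
static void domain_open(struct src_domain *d)
{
    assert(!d->destroyed);
    d->refs++;
}

/* A job closes the domain; the last close destroys it. */
static void domain_close(struct src_domain *d)
{
    assert(d->refs > 0);
    if (--d->refs == 0)
        d->destroyed = true;
}
```

With three jobs attached, the first two closes leave the domain alive and
only the third one destroys it, so no job has to track what the others did.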


-- 
MST


[ewg] RE: RFCv2: SRC API

2007-08-06 Thread Tang, Changqing
 
> Cleanup:
> When job j1 does not need to communicate to any jobs on node n,
> it disconnects qp1 from qp2, and asks j2 to destroy qp2.
> +
> +Note: both qp1 and qp2 must exist for the communication to take place.
> +Thus, j2 should not destroy qp2 (and in particular, should not exit)
> +until j1 has completed communication with node n and
> +has asked j2 to disconnect.
> 
Thanks. 

Another question. If a node n has 8 jobs, say j2-j9, usually the first job,
j2, is the one that creates the SRC domain (other jobs just attach and
share), and it makes sense to let j2 create all the receiving QPs for all
the remote jobs and make all the connections. (We could do this round-robin,
but that is more work.)

Is there any performance concern in letting j2 (the first job on a node) do
all the "work"?

What is the latency of SRC+SRQ?

--CQ


[ewg] Re: RFCv2: SRC API

2007-08-06 Thread Michael S. Tsirkin
> Quoting Tang, Changqing <[EMAIL PROTECTED]>:
> Subject: RE: RFCv2: SRC API
> 
>  
> > Cleanup:
> > When job j1 does not need to communicate to any jobs on node n,
> > it disconnects qp1 from qp2, and asks j2 to destroy qp2.
> > +
> > +Note: both qp1 and qp2 must exist for the communication to take place.
> > +Thus, j2 should not destroy qp2 (and in particular, should not exit)
> > +until j1 has completed communication with node n and
> > +has asked j2 to disconnect.
> > 
> Thanks. 
> 
> Another question. If a node n has 8 jobs, say j2-j9, usually the first job,
> j2, is the one that creates the SRC domain (other jobs just attach and
> share), and it makes sense to let j2 create all the receiving QPs for all
> the remote jobs and make all the connections. (We could do this round-robin,
> but that is more work.)

Sure, creating all connections up front will work too; this is just a usage
example.

> Is there any performance concern in letting j2 (the first job on a node) do
> all the "work"?

How do you mean?

> What is the latency of SRC+SRQ ?

I'd expect it to be more or less the same as regular SRQ.

-- 
MST


[ewg] [PATCH 0/5 VNIC] VNIC patch series for OFED-1.2 and OFED-1.2.c

2007-08-06 Thread Ramachandra K

Vlad,

Please apply this VNIC patch series to both the OFED-1.2 and OFED-1.2.c
branches.

This series contains changes to the VNIC driver for supporting iPath and the
new version of the VEx hardware, the Ethernet Virtual I/O Controller (EVIC).

The first four patches are for the VNIC kernel driver. They have dependencies
and need to be applied in the following order:


[PATCH 1/5 VNIC] DMA interface changes to support iPath

[PATCH 2/5 VNIC] Changes to VNIC control state machine (Depends on Patch 1)

[PATCH 3/5 VNIC] Improvements for EVIC scalability (Depends on Patch 2)

[PATCH 4/5 VNIC] Delay reconnects to prevent flooding of EVIC in failure
cases (Depends on Patch 3)

The last patch is for the build environment script, build_env.sh, to enable
VNIC installation on RHEL 5:

[PATCH 5/5 build_env.sh] Enable building VNIC on RHEL 5

Regards,
Ram



[ewg] [PATCH 1/5 VNIC] DMA interface changes to support iPath

2007-08-06 Thread Ramachandra K
Use the ib_dma interface for compatibility with the ipath driver.

From: Poornima Kamath <[EMAIL PROTECTED]>
Signed-off-by: Ramachandra K <[EMAIL PROTECTED]>

---

 drivers/infiniband/ulp/vnic/vnic_control.c |  352 ++--
 drivers/infiniband/ulp/vnic/vnic_data.c|  192 ---
 2 files changed, 276 insertions(+), 268 deletions(-)

diff --git a/drivers/infiniband/ulp/vnic/vnic_control.c b/drivers/infiniband/ulp/vnic/vnic_control.c
index a199380..02e8fa5 100644
--- a/drivers/infiniband/ulp/vnic/vnic_control.c
+++ b/drivers/infiniband/ulp/vnic/vnic_control.c
@@ -78,9 +78,9 @@ static void control_recv_complete(struct
CONTROL_FUNCTION("%s: control_recv_complete()\n",
 control_ifcfg_name(control));
 
-   dma_sync_single_for_cpu(control->parent->config->ibdev->dma_device,
-   control->recv_dma, control->recv_len,
-   DMA_FROM_DEVICE);
+   ib_dma_sync_single_for_cpu(control->parent->config->ibdev,
+  control->recv_dma, control->recv_len,
+  DMA_FROM_DEVICE);
control_note_rsptime_stats(&response_time);
CONTROL_PACKET(pkt);
spin_lock_irqsave(&control->io_lock, flags);
@@ -110,9 +110,9 @@ static void control_recv_complete(struct
spin_unlock_irqrestore(&control->io_lock, flags);
viport_kick(control->parent);
}
-   dma_sync_single_for_device(control->parent->config->ibdev->dma_device,
-  control->recv_dma, control->recv_len,
-  DMA_FROM_DEVICE);
+   ib_dma_sync_single_for_device(control->parent->config->ibdev,
+ control->recv_dma, control->recv_len,
+ DMA_FROM_DEVICE);
 }
 
 static void control_timeout(unsigned long data)
@@ -196,9 +196,9 @@ void control_process_async(struct contro
 
CONTROL_FUNCTION("%s: control_process_async()\n",
 control_ifcfg_name(control));
-   dma_sync_single_for_cpu(control->parent->config->ibdev->dma_device,
-   control->recv_dma, control->recv_len,
-   DMA_FROM_DEVICE);
+   ib_dma_sync_single_for_cpu(control->parent->config->ibdev,
+  control->recv_dma, control->recv_len,
+  DMA_FROM_DEVICE);
 
spin_lock_irqsave(&control->io_lock, flags);
recv_io = control->info;
@@ -262,9 +262,9 @@ void control_process_async(struct contro
spin_lock_irqsave(&control->io_lock, flags);
}
spin_unlock_irqrestore(&control->io_lock, flags);
-   dma_sync_single_for_device(control->parent->config->ibdev->dma_device,
-  control->recv_dma, control->recv_len,
-  DMA_FROM_DEVICE);
+   ib_dma_sync_single_for_device(control->parent->config->ibdev,
+ control->recv_dma, control->recv_len,
+ DMA_FROM_DEVICE);
 
CONTROL_INFO("%s: done control_process_async\n",
 control_ifcfg_name(control));
@@ -340,9 +340,9 @@ int control_init_vnic_req(struct control
struct vnic_control_packet  *pkt;
struct vnic_cmd_init_vnic_req   *init_vnic_req;
 
-   dma_sync_single_for_cpu(control->parent->config->ibdev->dma_device,
-   control->send_dma, control->send_len,
-   DMA_TO_DEVICE);
+   ib_dma_sync_single_for_cpu(control->parent->config->ibdev,
+  control->send_dma, control->send_len,
+  DMA_TO_DEVICE);
 
send_io = control_init_hdr(control, CMD_INIT_VNIC);
if (!send_io)
@@ -363,15 +363,15 @@ int control_init_vnic_req(struct control
 
control->rsp_expected = pkt->hdr.pkt_cmd;
 
-   dma_sync_single_for_device(control->parent->config->ibdev->dma_device,
-  control->send_dma, control->send_len,
-  DMA_TO_DEVICE);
+   ib_dma_sync_single_for_device(control->parent->config->ibdev,
+ control->send_dma, control->send_len,
+ DMA_TO_DEVICE);
 
return control_send(control, send_io);
 failure:
-   dma_sync_single_for_device(control->parent->config->ibdev->dma_device,
-  control->send_dma, control->send_len,
-  DMA_TO_DEVICE);
+   ib_dma_sync_single_for_device(control->parent->config->ibdev,
+ control->send_dma, control->send_len,
+ DMA_TO_DEVICE);
return -1;
 }
 
@@ -434,9 +434,9 @@ int control_init_vnic_rsp(struct control
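
The point of the conversion in this patch, as far as it can be read from the
hunks, is that the ib_dma_* calls take the IB device itself rather than its
underlying dma_device, which lets a driver such as ipath supply its own DMA
handling. A minimal userspace mock of that dispatch idea is sketched below;
every name in it (`mock_ib_device`, `mock_dma_ops`, `mock_ib_dma_sync_for_cpu`)
is hypothetical and only models the concept, it is not the kernel API.

```c
#include <assert.h>
#include <stddef.h>

struct mock_ib_device;

/* Optional per-driver DMA hooks (hypothetical, models the idea only). */
struct mock_dma_ops {
    void (*sync_for_cpu)(struct mock_ib_device *dev,
                         unsigned long addr, size_t len);
};

struct mock_ib_device {
    const struct mock_dma_ops *dma_ops; /* NULL: use the generic DMA path */
    int generic_syncs;                  /* calls that took the generic path */
    int driver_syncs;                   /* calls handled by the driver hook */
};

/* Wrapper: prefer the driver's own hook, else fall back to the generic path. */
static void mock_ib_dma_sync_for_cpu(struct mock_ib_device *dev,
                                     unsigned long addr, size_t len)
{
    if (dev->dma_ops)
        dev->dma_ops->sync_for_cpu(dev, addr, len);
    else
        dev->generic_syncs++;
}

/* Example driver hook that just counts how often it is called. */
static void driver_sync(struct mock_ib_device *dev,
                        unsigned long addr, size_t len)
{
    (void)addr;
    (void)len;
    dev->driver_syncs++;
}
```

A caller written against the wrapper does not need to know which kind of
device it is talking to, which is why the VNIC call sites only change their
first argument.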
 
   

[ewg] [PATCH 2/5 VNIC] Changes to VNIC control state machine (Depends on Patch 1)

2007-08-06 Thread Ramachandra K
Reimplement the control state machine to fix "IB send not completed" errors.
Also simplify the control command processing logic and remove all control QP
retries.

From: Poornima Kamath <[EMAIL PROTECTED]>
Signed-off-by: Ramachandra K <[EMAIL PROTECTED]>
---

 drivers/infiniband/ulp/vnic/vnic_config.c  |1 
 drivers/infiniband/ulp/vnic/vnic_config.h  |2 
 drivers/infiniband/ulp/vnic/vnic_control.c |  388 +++-
 drivers/infiniband/ulp/vnic/vnic_control.h |   39 ++
 drivers/infiniband/ulp/vnic/vnic_control_pkt.h |1 
 5 files changed, 282 insertions(+), 149 deletions(-)

diff --git a/drivers/infiniband/ulp/vnic/vnic_config.c b/drivers/infiniband/ulp/vnic/vnic_config.c
index d3b02d4..0447427 100644
--- a/drivers/infiniband/ulp/vnic/vnic_config.c
+++ b/drivers/infiniband/ulp/vnic/vnic_config.c
@@ -133,7 +133,6 @@ static void config_control_defaults(stru
control_config->vnic_instance = params->instance;
control_config->max_address_entries = MAX_ADDRESS_ENTRIES;
control_config->min_address_entries = MIN_ADDRESS_ENTRIES;
-   control_config->req_retry_count = CONTROL_REQ_RETRY_COUNT;
control_config->rsp_timeout = msecs_to_jiffies(CONTROL_RSP_TIMEOUT);
 }
 
diff --git a/drivers/infiniband/ulp/vnic/vnic_config.h b/drivers/infiniband/ulp/vnic/vnic_config.h
index f9850e7..03cbf9c 100644
--- a/drivers/infiniband/ulp/vnic/vnic_config.h
+++ b/drivers/infiniband/ulp/vnic/vnic_config.h
@@ -120,7 +120,6 @@ enum {
 #define VNIC_USE_RX_CSUM   1
 #define VNIC_USE_TX_CSUM   1
 #define DEFAULT_PREFER_PRIMARY  0
-#define CONTROL_REQ_RETRY_COUNT 4
 
 struct path_param {
__be64  ioc_guid;
@@ -155,7 +154,6 @@ struct control_config {
u16 max_address_entries;
u16 min_address_entries;
u32 rsp_timeout;
-   u8  req_retry_count;
 };
 
 struct data_config {
diff --git a/drivers/infiniband/ulp/vnic/vnic_control.c b/drivers/infiniband/ulp/vnic/vnic_control.c
index 02e8fa5..5d582db 100644
--- a/drivers/infiniband/ulp/vnic/vnic_control.c
+++ b/drivers/infiniband/ulp/vnic/vnic_control.c
@@ -75,8 +75,8 @@ static void control_recv_complete(struct
unsigned long   flags;
cycles_t response_time;
 
-   CONTROL_FUNCTION("%s: control_recv_complete()\n",
-control_ifcfg_name(control));
+   CONTROL_FUNCTION("%s: control_recv_complete() State=%d\n",
+control_ifcfg_name(control), control->req_state);
 
ib_dma_sync_single_for_cpu(control->parent->config->ibdev,
   control->recv_dma, control->recv_len,
@@ -92,18 +92,67 @@ static void control_recv_complete(struct
if (last_recv_io)
control_recv(control, last_recv_io);
} else if (c_hdr->pkt_type == TYPE_RSP) {
-   if (control->rsp_expected
-   && (c_hdr->pkt_seq_num == control->seq_num)) {
-   control->response = recv_io;
-   control->rsp_expected = 0;
-   spin_unlock_irqrestore(&control->io_lock, flags);
-   control_update_rsptime_stats(control,
-response_time);
+   u8 repost = 0;
+   u8 fail = 0;
+   u8 kick = 0;
+
+   switch (control->req_state) {
+   case REQ_INACTIVE:
+   case RSP_RECEIVED:
+   case REQ_COMPLETED:
+   CONTROL_ERROR("%s: Unexpected control "
+ "response received: CMD = %d\n",
+ control_ifcfg_name(control), c_hdr->pkt_cmd);
+   control_log_control_packet(pkt);
+   control->req_state = REQ_FAILED;
+   fail = 1;
+   break;
+   case REQ_POSTED:
+   case REQ_SENT:
+   if (c_hdr->pkt_cmd != control->last_cmd
+   || c_hdr->pkt_seq_num != control->seq_num) {
+   CONTROL_ERROR("%s: Incorrect Control Response "
+ "received\n",
+ control_ifcfg_name(control));
+   CONTROL_ERROR("%s: Sent control request:\n",
+ control_ifcfg_name(control));
+   control_log_control_packet(control_last_req(control));
+   CONTROL_ERROR("%s: Received control response:\n",
+ control_ifcfg_name(control));
+   control_log_control_packet(pkt);
+   contro
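
The response handling visible in the (truncated) hunk above can be
summarized with a small userspace sketch. The state names come from the
patch; everything else is an assumption made for illustration: `struct ctl`
and `on_response()` are hypothetical, and the transitions taken on a
matching or mismatching response are guesses, because that part of the hunk
is cut off.

```c
#include <assert.h>
#include <stdbool.h>

/* State names are taken from the patch; numeric values are arbitrary here. */
enum req_state {
    REQ_INACTIVE, REQ_POSTED, REQ_SENT,
    RSP_RECEIVED, REQ_COMPLETED, REQ_FAILED
};

struct ctl {
    enum req_state state;
    int last_cmd;   /* command of the outstanding request */
    int seq_num;    /* sequence number of the outstanding request */
};

/*
 * A response is only acceptable while a request is outstanding
 * (REQ_POSTED or REQ_SENT) and must match both the command and the
 * sequence number. Returns true if the response was accepted.
 */
static bool on_response(struct ctl *c, int cmd, int seq)
{
    switch (c->state) {
    case REQ_POSTED:
    case REQ_SENT:
        if (cmd != c->last_cmd || seq != c->seq_num) {
            c->state = REQ_FAILED;   /* assumption: mismatch fails the request */
            return false;
        }
        c->state = RSP_RECEIVED;     /* assumption: not shown in the hunk */
        return true;
    default:
        /*
         * REQ_INACTIVE, RSP_RECEIVED, REQ_COMPLETED: unexpected response.
         * REQ_FAILED is not visible in the hunk; this sketch treats it
         * the same way.
         */
        c->state = REQ_FAILED;
        return false;
    }
}
```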

[ewg] [PATCH 3/5 VNIC] Improvements for EVIC scalability (Depends on Patch 2)

2007-08-06 Thread Ramachandra K
Improve EVIC scalability by delaying host connections. Increase the time to
wait for a control response to handle the case when there are many hosts
trying to reconnect to the EVIC.

From: Poornima Kamath <[EMAIL PROTECTED]>
Signed-off-by: Ramachandra K <[EMAIL PROTECTED]>
---

 drivers/infiniband/ulp/vnic/vnic_config.c  |   15 +--
 drivers/infiniband/ulp/vnic/vnic_config.h  |4 ++-
 drivers/infiniband/ulp/vnic/vnic_main.c|5 ++--
 drivers/infiniband/ulp/vnic/vnic_netpath.h |1 +
 drivers/infiniband/ulp/vnic/vnic_viport.c  |   39 ++--
 drivers/infiniband/ulp/vnic/vnic_viport.h  |3 +-
 6 files changed, 52 insertions(+), 15 deletions(-)

diff --git a/drivers/infiniband/ulp/vnic/vnic_config.c b/drivers/infiniband/ulp/vnic/vnic_config.c
index 0447427..e9db52d 100644
--- a/drivers/infiniband/ulp/vnic/vnic_config.c
+++ b/drivers/infiniband/ulp/vnic/vnic_config.c
@@ -55,7 +55,6 @@ static u16 max_mtu = MAX_MTU;
 
 static u32 default_no_path_timeout = DEFAULT_NO_PATH_TIMEOUT;
 static u32 sa_path_rec_get_timeout = SA_PATH_REC_GET_TIMEOUT;
-
 static u32 default_primary_reconnect_timeout =
DEFAULT_PRIMARY_RECONNECT_TIMEOUT;
 static u32 default_primary_switch_timeout = DEFAULT_PRIMARY_SWITCH_TIMEOUT;
@@ -64,6 +63,7 @@ static int default_prefer_primary
 static int use_rx_csum = VNIC_USE_RX_CSUM;
 static int use_tx_csum = VNIC_USE_TX_CSUM;
 
+static u32 control_response_timeout = CONTROL_RSP_TIMEOUT;
 module_param(max_mtu, ushort, 0444);
 MODULE_PARM_DESC(max_mtu, "Maximum MTU size (1500-9500). Default is 9500");
 
@@ -91,6 +91,11 @@ module_param(sa_path_rec_get_timeout, ui
 MODULE_PARM_DESC(sa_path_rec_get_timeout, "Time out value in milliseconds"
 " for SA path record get queries");
 
+module_param(control_response_timeout, uint, 0444);
+MODULE_PARM_DESC(control_response_timeout, "Time out value in milliseconds"
+" to wait for response to control requests");
+
+
 static void config_control_defaults(struct control_config *control_config,
struct path_param *params)
 {
@@ -133,7 +138,7 @@ static void config_control_defaults(stru
control_config->vnic_instance = params->instance;
control_config->max_address_entries = MAX_ADDRESS_ENTRIES;
control_config->min_address_entries = MIN_ADDRESS_ENTRIES;
-   control_config->rsp_timeout = msecs_to_jiffies(CONTROL_RSP_TIMEOUT);
+   control_config->rsp_timeout = msecs_to_jiffies(control_response_timeout);
 }
 
 static void config_data_defaults(struct data_config *data_config,
@@ -331,6 +336,12 @@ int config_start(void)
sa_path_rec_get_timeout = max_t(u32, sa_path_rec_get_timeout,
MIN_SA_TIMEOUT);
 
+   control_response_timeout = max_t(u32, control_response_timeout,
+MIN_CONTROL_RSP_TIMEOUT);
+
+   control_response_timeout = min_t(u32, control_response_timeout,
+MAX_CONTROL_RSP_TIMEOUT);
+
if (!default_no_path_timeout)
default_no_path_timeout = DEFAULT_NO_PATH_TIMEOUT;
 
diff --git a/drivers/infiniband/ulp/vnic/vnic_config.h b/drivers/infiniband/ulp/vnic/vnic_config.h
index 03cbf9c..e0a473d 100644
--- a/drivers/infiniband/ulp/vnic/vnic_config.h
+++ b/drivers/infiniband/ulp/vnic/vnic_config.h
@@ -100,7 +100,9 @@ enum {
 };
 
 enum {
-   CONTROL_RSP_TIMEOUT = 1000  /* 1 sec */
+   CONTROL_RSP_TIMEOUT = 1000, /* 1 sec */
+   MIN_CONTROL_RSP_TIMEOUT = 1000, /* 1  sec */
+   MAX_CONTROL_RSP_TIMEOUT = 60000 /* 60 sec */
 };
 
 /* infiniband connection parameters */
diff --git a/drivers/infiniband/ulp/vnic/vnic_main.c b/drivers/infiniband/ulp/vnic/vnic_main.c
index e15d3f9..eab123b 100644
--- a/drivers/infiniband/ulp/vnic/vnic_main.c
+++ b/drivers/infiniband/ulp/vnic/vnic_main.c
@@ -465,9 +465,10 @@ static void update_path_and_reconnect(st
netpath->path_idx = config->path_idx;
netpath->connect_time = jiffies;
delay = 0;
-   } else if (config->path_idx != netpath->path_idx)
+   } else if (config->path_idx != netpath->path_idx) {
delay = 0;
-
+   netpath->path_idx = config->path_idx;
+   }
viport_connect(netpath->viport, delay);
 }
 
diff --git a/drivers/infiniband/ulp/vnic/vnic_netpath.h b/drivers/infiniband/ulp/vnic/vnic_netpath.h
index a5ee45c..51fa3a8 100644
--- a/drivers/infiniband/ulp/vnic/vnic_netpath.h
+++ b/drivers/infiniband/ulp/vnic/vnic_netpath.h
@@ -53,6 +53,7 @@ struct netpath {
size_t  path_idx;
u32 connect_time;
int second_bias;
+   u8  is_primary_path;
struct timer_list   timer;
enum netpath_ts timer_state;
struct class_dev_info   class_dev_info;
di
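
The patch bounds the new control_response_timeout module parameter between
MIN_CONTROL_RSP_TIMEOUT and MAX_CONTROL_RSP_TIMEOUT. A minimal userspace
sketch of that clamping is given below, assuming the values are in
milliseconds and that the 60 sec maximum is 60000; `clamp_rsp_timeout()` and
the `*_MS` names are hypothetical. Note that the lower bound is enforced by
taking the larger of the two values and the upper bound by taking the
smaller.

```c
#include <assert.h>
#include <stdint.h>

enum {
    MIN_RSP_TIMEOUT_MS = 1000,   /* 1 sec, as in the patch */
    MAX_RSP_TIMEOUT_MS = 60000   /* 60 sec (assumed value in milliseconds) */
};

/* Clamp a requested timeout into [MIN_RSP_TIMEOUT_MS, MAX_RSP_TIMEOUT_MS]. */
static uint32_t clamp_rsp_timeout(uint32_t ms)
{
    if (ms < MIN_RSP_TIMEOUT_MS)
        ms = MIN_RSP_TIMEOUT_MS;   /* raise values that are too small */
    if (ms > MAX_RSP_TIMEOUT_MS)
        ms = MAX_RSP_TIMEOUT_MS;   /* cap values that are too large */
    return ms;
}
```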

[ewg] [PATCH 4/5 VNIC] Delay reconnects to prevent flooding of EVIC (Depends on Patch 3)

2007-08-06 Thread Ramachandra K
Delay the reconnects to prevent the flooding of EVIC in case of a failure.

From: Poornima Kamath <[EMAIL PROTECTED]>
Signed-off-by: Ramachandra K <[EMAIL PROTECTED]>
---

 drivers/infiniband/ulp/vnic/vnic_main.c|7 +--
 drivers/infiniband/ulp/vnic/vnic_netpath.h |1 +
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/vnic/vnic_main.c b/drivers/infiniband/ulp/vnic/vnic_main.c
index eab123b..e44f0f0 100644
--- a/drivers/infiniband/ulp/vnic/vnic_main.c
+++ b/drivers/infiniband/ulp/vnic/vnic_main.c
@@ -464,11 +464,14 @@ static void update_path_and_reconnect(st
  vnic->config->no_path_timeout) {
netpath->path_idx = config->path_idx;
netpath->connect_time = jiffies;
+   netpath->delay_reconnect = 0;
delay = 0;
} else if (config->path_idx != netpath->path_idx) {
-   delay = 0;
+   delay = netpath->delay_reconnect;
netpath->path_idx = config->path_idx;
-   }
+   netpath->delay_reconnect = 1;
+   } else
+   delay = 1;
viport_connect(netpath->viport, delay);
 }
 
diff --git a/drivers/infiniband/ulp/vnic/vnic_netpath.h b/drivers/infiniband/ulp/vnic/vnic_netpath.h
index 51fa3a8..cc43c83 100644
--- a/drivers/infiniband/ulp/vnic/vnic_netpath.h
+++ b/drivers/infiniband/ulp/vnic/vnic_netpath.h
@@ -54,6 +54,7 @@ struct netpath {
u32 connect_time;
int second_bias;
u8  is_primary_path;
+   u8  delay_reconnect;
struct timer_list   timer;
enum netpath_ts timer_state;
struct class_dev_info   class_dev_info;
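
The delay decision that update_path_and_reconnect() makes after this patch
can be modelled in userspace as follows. This is a sketch under assumptions:
`struct path_state` and `reconnect_delay()` are hypothetical, and the
`fresh_connect` flag stands in for the first condition of the if-chain,
which is only partly visible in the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Subset of the netpath fields touched by the patch. */
struct path_state {
    size_t path_idx;
    int    delay_reconnect;
};

/*
 * Returns the delay flag that would be handed to viport_connect().
 * `fresh_connect` models the first branch of the if-chain (assumption).
 */
static int reconnect_delay(struct path_state *p, size_t new_idx,
                           bool fresh_connect)
{
    int delay;

    if (fresh_connect) {
        p->path_idx = new_idx;
        p->delay_reconnect = 0;
        delay = 0;
    } else if (new_idx != p->path_idx) {
        /* First path change reconnects at once; later ones are delayed. */
        delay = p->delay_reconnect;
        p->path_idx = new_idx;
        p->delay_reconnect = 1;
    } else {
        /* Same path again: always delay the reconnect. */
        delay = 1;
    }
    return delay;
}
```

In this model only the first path switch after a fresh connect is immediate;
repeated failures on changing or unchanged paths are delayed, which matches
the stated goal of not flooding the EVIC.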


[ewg] [PATCH 5/5 VNIC] Enable building VNIC on RHEL 5

2007-08-06 Thread Ramachandra K
Modify build_env.sh to add RHEL 5 to the list of kernels supported by VNIC.

From: Poornima Kamath <[EMAIL PROTECTED]>
Signed-off-by: Ramachandra K <[EMAIL PROTECTED]>
---

 build_env.sh |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/build_env.sh b/build_env.sh
index 71e26fc..8c4acd4 100644
--- a/build_env.sh
+++ b/build_env.sh
@@ -142,7 +142,7 @@ esac
 
 # Vnic
 case ${K_VER} in
-2.6.9-34*|2.6.9-42*|2.6.9-55*|2.6.16.*-*-*|2.6.19*)
+2.6.9-34*|2.6.9-42*|2.6.9-55*|2.6.16.*-*-*|2.6.19*|2.6.18*)
 IB_KERNEL_PACKAGES="${IB_KERNEL_PACKAGES} vnic"
 ;;
 esac
@@ -2004,7 +2004,7 @@ set_package_deps()
 ;;
 vnic)
 case ${K_VER} in
-2.6.19*|2.6.9-34*|2.6.9-42*|2.6.9-55*|2.6.16.*-*-*)
+2.6.19*|2.6.9-34*|2.6.9-42*|2.6.9-55*|2.6.16.*-*-*|2.6.18*)
 OFA_KERNEL_PACKAGES=$(echo "$OFA_KERNEL_PACKAGES ib_verbs ${ll_driver} vnic" | tr -s ' ' '\n' | sort -n | uniq)
 OFA_PACKAGES=$(echo "$OFA_PACKAGES kernel-ib" | tr -s ' ' '\n' | sort -n | uniq)
 ;;
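
The effect of adding the `2.6.18*` alternative to the case pattern can be
checked outside the script. The sketch below uses POSIX fnmatch(), whose
glob rules are close to those of shell case patterns, with the pattern list
copied from the patched script; `vnic_supported()` is a hypothetical helper
and the kernel version strings used to exercise it are examples only.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Alternatives of the patched "# Vnic" case pattern in build_env.sh. */
static const char *const vnic_kernels[] = {
    "2.6.9-34*", "2.6.9-42*", "2.6.9-55*", "2.6.16.*-*-*", "2.6.19*", "2.6.18*"
};

/* True if k_ver matches any of the supported-kernel glob patterns. */
static bool vnic_supported(const char *k_ver)
{
    size_t n = sizeof(vnic_kernels) / sizeof(vnic_kernels[0]);

    assert(k_ver != NULL);
    for (size_t i = 0; i < n; i++) {
        if (fnmatch(vnic_kernels[i], k_ver, 0) == 0)
            return true;
    }
    return false;
}
```

A RHEL 5 style version such as 2.6.18-8.el5 matches only through the new
`2.6.18*` alternative, so before the patch it would have been skipped.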


[ewg] Re: [PATCH 0/5 VNIC] VNIC patch series for OFED-1.2 and OFED-1.2.c

2007-08-06 Thread Michael S. Tsirkin
> Quoting Ramachandra K <[EMAIL PROTECTED]>:
> Subject: [PATCH 0/5 VNIC] VNIC patch series for OFED-1.2 and OFED-1.2.c
> 
> 
> Vlad,
> 
> Please apply this VNIC patch series to both the OFED-1.2 and OFED-1.2.c
> branches.
> 
> This series contains changes to the VNIC driver for supporting iPath and the
> new version of the VEx hardware, the Ethernet Virtual I/O Controller (EVIC).

I don't see how adding features to 1.2 *at this stage* can be justified.

-- 
MST


[ewg] [GIT PULL] Please pull qlvnictools.git

2007-08-06 Thread Ramachandra K

Vlad,

For inclusion in OFED-1.2 and OFED-1.2.c, please also pull qlvnictools.git:

git://git.openfabrics.org/~ramachandrak/qlvnictools.git

Changes include:

  - Removing the check for MTHCA in vnic_parser.pl, as iPath and ConnectX are
    now supported by the VNIC driver

  - Fix in ibvexdm to check for a valid value of sm_lid.


Regards,
Ram


[ewg] RE: [PATCH 0/5 VNIC] VNIC patch series for OFED-1.2 and OFED-1.2.c

2007-08-06 Thread Kuchimanchi, Ramachandra
-Original Message-
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Mon 8/6/2007 11:48 PM
To: Kuchimanchi, Ramachandra
Cc: [EMAIL PROTECTED]; ewg@lists.openfabrics.org
Subject: Re: [PATCH 0/5 VNIC] VNIC patch series for OFED-1.2 and OFED-1.2.c
 
> Quoting Ramachandra K <[EMAIL PROTECTED]>:
> Subject: [PATCH 0/5 VNIC] VNIC patch series for OFED-1.2 and OFED-1.2.c
> 
> 
> Vlad,
> 
> Please apply this VNIC patch series to both the OFED-1.2 and OFED-1.2.c
> branches.
> 
> This series contains changes to the VNIC driver for supporting iPath and the
> new version of the VEx hardware, the Ethernet Virtual I/O Controller (EVIC).

> I don't see how adding features to 1.2 *at this stage* can be justified.

Just to clarify, when I said OFED-1.2, I meant the next release in the 1.2
series of OFED, i.e. OFED-1.2.1, and ultimately OFED-1.3 down the line. Is
there any other branch designated for that?

And I hope there is no objection for inclusion of these patches in OFED-1.2.c
branch.

Regards,
Ram



[ewg] Re: [PATCH 0/5 VNIC] VNIC patch series for OFED-1.2 and OFED-1.2.c

2007-08-06 Thread Michael S. Tsirkin
> Quoting Kuchimanchi, Ramachandra <[EMAIL PROTECTED]>:
> Subject: RE: [PATCH 0/5 VNIC] VNIC patch series for OFED-1.2 and OFED-1.2.c
> 
> -Original Message-
> From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
> Sent: Mon 8/6/2007 11:48 PM
> To: Kuchimanchi, Ramachandra
> Cc: [EMAIL PROTECTED]; ewg@lists.openfabrics.org
> Subject: Re: [PATCH 0/5 VNIC] VNIC patch series for OFED-1.2 and OFED-1.2.c
>  
> > Quoting Ramachandra K <[EMAIL PROTECTED]>:
> > Subject: [PATCH 0/5 VNIC] VNIC patch series for OFED-1.2 and OFED-1.2.c
> > 
> > 
> > Vlad,
> > 
> > Please apply this VNIC patch series to both the OFED-1.2 and OFED-1.2.c
> > branches.
> > 
> > This series contains changes to the VNIC driver for supporting iPath and the
> > new version of the VEx hardware, the Ethernet Virtual I/O Controller (EVIC).
> 
> > I don't see how adding features to 1.2 *at this stage* can be justified.
> 
> Just to clarify, when I said OFED-1.2, I meant the next release in the 1.2
> series of OFED, i.e. OFED-1.2.1, and ultimately OFED-1.3 down the line. Is
> there any other branch designated for that?

I think EWG decided that the next release in the 1.2 series will be 1.2.c.
So far, the definition of 1.2.c was "1.2 plus bugfixes plus connectx support".

Stuff intended for 1.3 should go here for now:
git://openfabrics.org/~vlad/ofed_kernel ofed_kernel
This has been updated to 2.6.23-rc2, but otherwise is tracking ofed_1_2_c.

> And I hope there is no objection for inclusion of these patches in OFED-1.2.c
> branch.

This looks like a change of methodology so this might be something EWG
would have to agree on. Right?

-- 
MST


[ewg] RE: [PATCH 0/5 VNIC] VNIC patch series for OFED-1.2 and OFED-1.2.c

2007-08-06 Thread Kuchimanchi, Ramachandra
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Tue 8/7/2007 12:16 AM
To: Kuchimanchi, Ramachandra
Cc: Michael S. Tsirkin; [EMAIL PROTECTED]; ewg@lists.openfabrics.org
Subject: Re: [PATCH 0/5 VNIC] VNIC patch series for OFED-1.2 and OFED-1.2.c
 
> Quoting Kuchimanchi, Ramachandra <[EMAIL PROTECTED]>:
> Subject: RE: [PATCH 0/5 VNIC] VNIC patch series for OFED-1.2 and OFED-1.2.c
> > 
> > Vlad,
> > 
> > Please apply this VNIC patch series to both the OFED-1.2 and OFED-1.2.c
> > branches.
> > 
> > This series contains changes to the VNIC driver for supporting iPath and the
> > new version of the VEx hardware, the Ethernet Virtual I/O Controller (EVIC).
> 
> > I don't see how adding features to 1.2 *at this stage* can be justified.
> 
> Just to clarify, when I said OFED-1.2, I meant the next release in the 1.2
> series of OFED, i.e. OFED-1.2.1, and ultimately OFED-1.3 down the line. Is
> there any other branch designated for that?

> I think EWG decided that the next release in the 1.2 series will be 1.2.c.
> So far, the definition of 1.2.c was "1.2 plus bugfixes plus connectx support".

I am not entirely clear about this 1.2.c = 1.2.1 thing.
This is my understanding, please correct me if I am wrong - OFED-1.2 is the
official OFED release, OFED-1.2.c is a parallel line of development to
support Connect X which has been recently made available through a GIT tree
on the OFED server. I presume this OFED-1.2.c would ultimately merge into
the OFED-1.2 line of development to become the OFED-1.3 release.

Also, OFED-1.2.c does not seem like an incremental release of OFED-1.2 with
some minor changes that could be equated with something like an OFED-1.2.1,
considering the changes for Connect X support and other kernel changes.

>> And I hope there is no objection for inclusion of these patches in OFED-1.2.c
>> branch.

> This looks like a change of methodology so this might be something EWG
> would have to agree on. Right?

In the July 30th meeting minutes sent by Tziporet
(http://lists.openfabrics.org/pipermail/ewg/2007-July/004094.html), it was
mentioned that the only release in August would be that of OFED-1.2.c. The
rationale given was that it would focus QA efforts and ensure wider test
coverage.

It is for this same reason that I have submitted these patches, so that we
can get wider test coverage of the changes to VNIC code as part of a complete
OFED package.

I hope I have made the intention behind submitting these patches clear.

Also, all of these patches are specific to VNIC and all changes are within
the VNIC driver, and we have been testing them for some time now with the
few previous releases of OFED-1.2.c.

Regards,
Ram

[ewg] Re: [PATCH 0/5 VNIC] VNIC patch series for OFED-1.2 and OFED-1.2.c

2007-08-06 Thread Michael S. Tsirkin
> > > So far, the definition of 1.2.c was "1.2 plus bugfixes plus connectx
> > > support".
> > 
> > I am not entirely clear about this 1.2.c = 1.2.1 thing.
> > This is my understanding, please correct me if I am wrong - OFED-1.2 is the
> > official OFED release, OFED-1.2.c is a parallel line of development to
> > support Connect X which has been recently made available through a GIT tree
> > on the OFED server. I presume, this OFED-1.2.c would ultimately merge into
> > the OFED-1.2 line of development to become the OFED-1.3 release.

Here's how I understand it:

1.2 should not see any development - it's a major bugfix only branch.
1.2.c is 1.2 code updated to kernel 2.6.22 plus bugfixes.
1.3 is 1.2.c updated to latest kernel (currently 2.6.23) plus new features.

-- 
MST


[ewg] Re: [PATCH ofed-1.2.c] ehca: backport kmem_cache_zalloc() for 2.6.10/sles10/sles10_sp1

2007-08-06 Thread Michael S. Tsirkin
Hmm, I thought about it some more.
kmem_cache struct is not exported on recent kernels,
so this might be hard to do.

So I think the patch is probably the right approach, after all.

Quoting Michael S. Tsirkin <[EMAIL PROTECTED]>:
Subject: Re: [PATCH ofed-1.2.c] ehca: backport kmem_cache_zalloc() for 
2.6.10/sles10/sles10_sp1

Let's not do it this way.

I think the right thing is to implement kmem_cache_zalloc
by means of kmem_cache_alloc and memset in kernel_addons.
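
The idea of building a zeroing allocator from alloc plus memset can be
sketched in userspace as below. This is only an illustration under
assumptions: `struct mock_cache`, `mock_cache_alloc()` and
`mock_cache_zalloc()` are hypothetical stand-ins built on malloc(), not the
kernel slab API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for a slab cache. */
struct mock_cache {
    size_t obj_size;   /* size of each object handed out by the cache */
};

/* Plain allocation: contents are uninitialized. */
static void *mock_cache_alloc(struct mock_cache *c)
{
    assert(c->obj_size > 0);
    return malloc(c->obj_size);
}

/* zalloc built from alloc + memset; it must know the object size. */
static void *mock_cache_zalloc(struct mock_cache *c)
{
    void *obj = mock_cache_alloc(c);

    if (obj)
        memset(obj, 0, c->obj_size);
    return obj;
}
```

The sketch also shows where the difficulty raised in this thread comes from:
a generic wrapper has to read the object size out of the cache, which is a
problem when the cache structure is not visible to it. The ehca patch avoids
this by calling memset at each call site with sizeof(*my_cq).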



Quoting Hoang-Nam Nguyen <[EMAIL PROTECTED]>:
Subject: [PATCH ofed-1.2.c] ehca: backport kmem_cache_zalloc() for 
2.6.10/sles10/sles10_sp1

Hello Michael and Vladimir!
This patch below adds a backport patch for ehca to the dirs 2.6.16,
2.6.16_sles10 and 2.6.16_sles10_sp1 underneath kernel_patches/backport of
ofed-1.2.c source tree.
Thanks!
Nam



backport kmem_cache_zalloc() to 2.6.10, 2.6.10_sles10 and 2.6.10_sles10_sp1

Signed-off-by: Hoang-Nam Nguyen <[EMAIL PROTECTED]>
---

 2.6.16/ehca_kmem_cache_zalloc_to_2_6_16.patch|   97 +++
 2.6.16_sles10/ehca_kmem_cache_zalloc_to_2_6_16.patch |   97 +++
 2.6.16_sles10_sp1/ehca_kmem_cache_zalloc_to_2_6_16.patch |   97 +++
 3 files changed, 291 insertions(+)

diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/kernel_patches/backport/2.6.16/ehca_kmem_cache_zalloc_to_2_6_16.patch
 
ofa_1_2_c_kernel-20070804-0200/kernel_patches/backport/2.6.16/ehca_kmem_cache_zalloc_to_2_6_16.patch
--- 
ofa_1_2_c_kernel-20070804-0200_orig/kernel_patches/backport/2.6.16/ehca_kmem_cache_zalloc_to_2_6_16.patch
   1970-01-01 01:00:00.0 +0100
+++ 
ofa_1_2_c_kernel-20070804-0200/kernel_patches/backport/2.6.16/ehca_kmem_cache_zalloc_to_2_6_16.patch
2007-08-06 00:53:59.0 +0200
@@ -0,0 +1,97 @@
+diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_cq.c 
ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_cq.c
+--- ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_cq.c   
2007-08-04 11:00:05.0 +0200
++++ ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_cq.c
2007-08-06 00:41:50.0 +0200
+@@ -134,13 +134,14 @@ struct ib_cq *ehca_create_cq(struct ib_d
+   if (cqe >= 0x - 64 - additional_cqe)
+   return ERR_PTR(-EINVAL);
+ 
+-  my_cq = kmem_cache_zalloc(cq_cache, GFP_KERNEL);
++  my_cq = kmem_cache_alloc(cq_cache, GFP_KERNEL);
+   if (!my_cq) {
+   ehca_err(device, "Out of memory for ehca_cq struct device=%p",
+device);
+   return ERR_PTR(-ENOMEM);
+   }
+ 
++  memset(my_cq, 0, sizeof(*my_cq));
+   memset(&param, 0, sizeof(struct ehca_alloc_cq_parms));
+ 
+   spin_lock_init(&my_cq->spinlock);
+diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_main.c 
ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_main.c
+--- ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_main.c 
2007-08-04 11:00:05.0 +0200
++++ ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_main.c
2007-08-06 00:40:58.0 +0200
+@@ -113,9 +113,11 @@ static struct kmem_cache *ctblk_cache = 
+ 
+ void *ehca_alloc_fw_ctrlblock(gfp_t flags)
+ {
+-  void *ret = kmem_cache_zalloc(ctblk_cache, flags);
++  void *ret = kmem_cache_alloc(ctblk_cache, flags);
+   if (!ret)
+   ehca_gen_err("Out of memory for ctblk");
++  else
++  memset(ret, 0, EHCA_PAGESIZE);
+   return ret;
+ }
+ 
+diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_mrmw.c 
ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_mrmw.c
+--- ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_mrmw.c 
2007-08-04 11:00:05.0 +0200
++++ ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_mrmw.c
2007-08-06 00:39:30.0 +0200
+@@ -55,8 +55,9 @@ static struct ehca_mr *ehca_mr_new(void)
+ {
+   struct ehca_mr *me;
+ 
+-  me = kmem_cache_zalloc(mr_cache, GFP_KERNEL);
++  me = kmem_cache_alloc(mr_cache, GFP_KERNEL);
+   if (me) {
++  memset(me, 0, sizeof(*me));
+   spin_lock_init(&me->mrlock);
+   } else
+   ehca_gen_err("alloc failed");
+@@ -73,8 +74,9 @@ static struct ehca_mw *ehca_mw_new(void)
+ {
+   struct ehca_mw *me;
+ 
+-  me = kmem_cache_zalloc(mw_cache, GFP_KERNEL);
++  me = kmem_cache_alloc(mw_cache, GFP_KERNEL);
+   if (me) {
++  memset(me, 0, sizeof(*me));
+   spin_lock_init(&me->mwlock);
+   } else
+   ehca_gen_err("alloc failed");
+diff -Nurp 
ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_pd.c 
ofa_1_2_c_kernel-20070804-0200/drivers/infiniband/hw/ehca/ehca_pd.c
+--- ofa_1_2_c_kernel-20070804-0200_orig/drivers/infiniband/hw/ehca/ehca_pd.c   
2007-08-04 11:00:05.0 +0200
 ofa_1_2_c_kernel-20070804-0