Fwd: [ewg] Re: [PATCH v3] mlx4_ib: Optimize hugetlab pages support

2009-03-31 Thread Olga Shern (Voltaire)
Vlad,

Can you please replace mlx4_1070-optimize-huge_tlb.patch
with Yossi's patch? It fixes Eli's patch.

Thanks
Olga


-- Forwarded message --
From: Yossi Etigin yos...@voltaire.com
Date: Mon, Mar 30, 2009 at 6:49 PM
Subject: [ewg] Re: [PATCH v3] mlx4_ib: Optimize hugetlab pages support
To: Eli Cohen e...@mellanox.co.il
Cc: Roland Dreier rdre...@cisco.com, ewg
ewg@lists.openfabrics.org, general-list
gene...@lists.openfabrics.org


Eli Cohen wrote:

 Since Linux may not merge adjacent pages into a single scatter entry through
 calls to dma_map_sg(), we check the special case of hugetlb pages which are
 likely to be mapped to contiguous dma addresses and if they are, take advantage
 of this. This will result in a significantly lower number of MTT segments used
 for registering hugetlb memory regions.


How about the one below? It fixes bugzilla #1569 (fix mapping for a
size that is not on a page boundary):

---

Since Linux may not merge adjacent pages into a single scatter entry through
calls to dma_map_sg(), we check the special case of hugetlb pages which are
likely to be mapped to contiguous dma addresses and if they are, take advantage
of this. This will result in a significantly lower number of MTT segments used
for registering hugetlb memory regions.

Signed-off-by: Eli Cohen e...@mellanox.co.il
---
drivers/infiniband/hw/mlx4/mr.c |   81 ++
1 files changed, 72 insertions(+), 9 deletions(-)

Index: b/drivers/infiniband/hw/mlx4/mr.c
===
--- a/drivers/infiniband/hw/mlx4/mr.c   2008-11-19 21:32:15.0 +0200
+++ b/drivers/infiniband/hw/mlx4/mr.c   2009-03-30 18:29:55.0 +0300
@@ -119,6 +119,70 @@ out:
       return err;
}

+static int handle_hugetlb_user_mr(struct ib_pd *pd, struct mlx4_ib_mr *mr,
+                                 u64 start, u64 virt_addr, int access_flags)
+{
+#if defined(CONFIG_HUGETLB_PAGE) && !defined(__powerpc__) && !defined(__ia64__)
+       struct mlx4_ib_dev *dev = to_mdev(pd->device);
+       struct ib_umem_chunk *chunk;
+       unsigned dsize;
+       dma_addr_t daddr;
+       unsigned cur_size = 0;
+       dma_addr_t uninitialized_var(cur_addr);
+       int n;
+       struct ib_umem  *umem = mr->umem;
+       u64 *arr;
+       int err = 0;
+       int i;
+       int j = 0;
+       int off = start & (HPAGE_SIZE - 1);
+
+       n = DIV_ROUND_UP(off + umem->length, HPAGE_SIZE);
+       arr = kmalloc(n * sizeof *arr, GFP_KERNEL);
+       if (!arr)
+               return -ENOMEM;
+
+       list_for_each_entry(chunk, &umem->chunk_list, list)
+               for (i = 0; i < chunk->nmap; ++i) {
+                       daddr = sg_dma_address(&chunk->page_list[i]);
+                       dsize = sg_dma_len(&chunk->page_list[i]);
+                       if (!cur_size) {
+                               cur_addr = daddr;
+                               cur_size = dsize;
+                       } else if (cur_addr + cur_size != daddr) {
+                               err = -EINVAL;
+                               goto out;
+                       } else
+                               cur_size += dsize;
+
+                       if (cur_size > HPAGE_SIZE) {
+                               err = -EINVAL;
+                               goto out;
+                       } else if (cur_size == HPAGE_SIZE) {
+                               cur_size = 0;
+                               arr[j++] = cur_addr;
+                       }
+               }
+
+       if (cur_size) {
+               arr[j++] = cur_addr;
+       }
+
+       err = mlx4_mr_alloc(dev->dev, to_mpd(pd)->pdn, virt_addr, umem->length,
+                           convert_access(access_flags), n, HPAGE_SHIFT, &mr->mmr);
+       if (err)
+               goto out;
+
+       err = mlx4_write_mtt(dev->dev, &mr->mmr.mtt, 0, n, arr);
+
+out:
+       kfree(arr);
+       return err;
+#else
+       return -ENOSYS;
+#endif
+}
+
struct ib_mr *mlx4_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
                                 u64 virt_addr, int access_flags,
                                 struct ib_udata *udata)
@@ -140,17 +204,20 @@ struct ib_mr *mlx4_ib_reg_user_mr(struct
               goto err_free;
       }

-       n = ib_umem_page_count(mr->umem);
-       shift = ilog2(mr->umem->page_size);
-
-       err = mlx4_mr_alloc(dev->dev, to_mpd(pd)->pdn, virt_addr, length,
-                           convert_access(access_flags), n, shift, &mr->mmr);
-       if (err)
-               goto err_umem;
-
-       err = mlx4_ib_umem_write_mtt(dev, &mr->mmr.mtt, mr->umem);
-       if (err)
-               goto err_mr;
+       if (!mr->umem->hugetlb ||
+           handle_hugetlb_user_mr(pd, mr, start, virt_addr, access_flags)) {
+               n = ib_umem_page_count(mr->umem);
+               shift = ilog2(mr->umem->page_size);
+
+               err = mlx4_mr_alloc(dev->dev, to_mpd(pd)->pdn, virt_addr, length,
+                                   
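
[Archive note] The quoted patch is cut off here. As a rough illustration of the coalescing idea described in the commit message, the following is a minimal user-space sketch, not the driver code. The `struct seg` type, the `coalesce_hugepages` name, and the `out_cap` bound are assumptions made for this example; the kernel patch walks `ib_umem_chunk` scatterlists instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct seg {
    uint64_t addr;  /* DMA address of the segment */
    uint32_t len;   /* segment length in bytes */
};

/*
 * Merge consecutive segments into runs of at most hpage_size bytes.
 * The start address of each run is written to out[]. Returns the number
 * of runs, or -1 if segments inside a run are not contiguous, a run
 * overshoots hpage_size, or out[] is too small.
 */
static int coalesce_hugepages(const struct seg *segs, size_t nsegs,
                              uint64_t hpage_size,
                              uint64_t *out, size_t out_cap)
{
    uint64_t cur_addr = 0;
    uint64_t cur_size = 0;
    size_t j = 0;

    for (size_t i = 0; i < nsegs; ++i) {
        if (cur_size == 0) {
            cur_addr = segs[i].addr;          /* start a new run */
            cur_size = segs[i].len;
        } else if (cur_addr + cur_size != segs[i].addr) {
            return -1;                        /* gap inside a huge page */
        } else {
            cur_size += segs[i].len;          /* extend the current run */
        }

        if (cur_size > hpage_size)
            return -1;                        /* run crosses a huge page */
        if (cur_size == hpage_size) {
            if (j == out_cap)
                return -1;
            out[j++] = cur_addr;              /* one entry per huge page */
            cur_size = 0;
        }
    }

    if (cur_size) {                           /* trailing partial huge page */
        if (j == out_cap)
            return -1;
        out[j++] = cur_addr;
    }
    return (int)j;
}
```

A negative return corresponds to the case where the patch gives up on the huge-page path and the caller falls back to registering the region page by page.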

Re: [ewg] RE: Delaying next Monday OFED meeting

2009-03-05 Thread Olga Shern (Voltaire)
Both dates are OK with us

On Thu, Mar 5, 2009 at 4:02 PM, John Russo john.ru...@qlogic.com wrote:
 Let’s go for the 12th.



 From: ewg-boun...@lists.openfabrics.org
 [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Tziporet Koren
 Sent: Thursday, March 05, 2009 7:23 AM
 To: ewg@lists.openfabrics.org
 Subject: [ewg] Delaying next Monday OFED meeting



 Hello,

 Due to Purim holiday in Israel I wish to delay the next Monday OFED meeting.

 We can do it next week on Thursday (12 March) 9am PST or delay to a week
 after on Monday (March 16 ) 9am PST

 Can you reply with your availability?

 Sorry for the inconvenience.

 Tziporet

 ___
 ewg mailing list
 ewg@lists.openfabrics.org
 http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] OFED (EWG) meeting agenda for tomorrow (Jan 26)

2009-01-25 Thread Olga Shern (Voltaire)

 3. OFED 1.5 schedule

 Betsy from QLogic suggested moving the release earlier.

 On the other hand, Olga from Voltaire asked to stay with the July time
 frame.

 Based on the decisions in 1 & 2 we should decide on the release schedule.

We should decide whether we want one or two OFED releases per year.
If we decide to go for one OFED release per year, I
think we should postpone the OFED 1.5 release to October
and have a dot release in the middle.


Re: [ewg] OFED Jan 5, 2009 meeting minutes on OFED plans

2009-01-08 Thread Olga Shern (Voltaire)
- Kernel base will be 2.6.29

Hi,

The kernel 2.6.29 merge window will close very soon, which means that we
cannot get any new features into this kernel.
Therefore there would be no new features in OFED 1.5.
I think we should be based on 2.6.30.
And I agree with Tziporet regarding the OFED 1.5 schedule: no need to
rush. OFED is mature enough, so there is no need to have a release every
half year.

Olga


Re: [ewg] OFED Nov 24, 2008 meeting minutes

2008-11-27 Thread Olga Shern (Voltaire)

 OFED 1.4 release: RC6 on Nov 28, GA on Dec 8

Hi,

Are you going to build RC6 today/tomorrow?
I see that there are still a lot of major bugs. Maybe we should wait?

Olga


Re: [ewg] RE: Do we have an EWG meeting today?

2008-11-24 Thread Olga Shern (Voltaire)
I got email from Jeff:

Friendly reminder: the OFED teleconference is today (24 November 2008).

1. Noon US Eastern / 9am US Pacific / 7pm Israel
  Monday, November 24, code 210020028 (*** TODAY ***)
2. Noon US Eastern / 9am US Pacific / 7pm Israel
  Monday, December 1, code 210020028

US/Canada:  +1.866.432.9903
India:  +91.80.4103.3979
Israel: +972.9.892.7026
Others: http://cisco.com/en/US/about/doing_business/conferencing/



On Mon, Nov 24, 2008 at 6:57 PM, Woodruff, Robert J
[EMAIL PROTECTED] wrote:
 I can set up a bridge number if we want to meet.

 woody


 -Original Message-
 From: Tziporet Koren [mailto:[EMAIL PROTECTED]
 Sent: Monday, November 24, 2008 5:10 AM
 To: Woodruff, Robert J; Betsy Zeller; Olga Shern
 Subject: Do we have an EWG meeting today?

 I thought we decided to have one but I don't see such a meeting in my calendar

 Tziporet


Re: [ewg] OFED 1.4 - delay the GA to Dec 4

2008-11-20 Thread Olga Shern (Voltaire)

 1370  blo  [EMAIL PROTECTED]  Ping over IPoIB I/F fails
 after ifconfig down and up


Yossi has sent a patch that fixes this.

 1198  cri  [EMAIL PROTECTED]  hang during ipoib
 create_child/ifdown

We sent a patch to Roland some time ago, but it was decided in the EWG meeting
that we will not add the patch to OFED 1.4 because:
1. Users will rarely run such a test.
2. This is an old bug that wasn't introduced in OFED 1.4.

If you think this is another bug, we should open a new one.


 1289  maj  [EMAIL PROTECTED]  Ib and ipoib doesnt respond while
 running multiple tests ...


It seems that this was already fixed; we only need to retest and
verify that it is indeed fixed.


Re: [ewg] OFED 1.4 bugs status and OFED meetings

2008-11-17 Thread Olga Shern (Voltaire)
Hi Vlad,

Was this bug fixed?
1349  maj  [EMAIL PROTECTED]  Kernel panic on sdp

Olga


Re: [ewg] rhel4.6 testing

2008-11-07 Thread Olga Shern (Voltaire)
I assume you mean OFED 1.4
We have tested it - regression tests.
Do you see any problem?

On Thu, Nov 6, 2008 at 9:12 PM, Steve Wise [EMAIL PROTECTED] wrote:
 Has anyone tested the core rdma stuff on rhel4.6?
 Thanks,

 Steve.


Re: [ewg] OFED October 27 2008 meeting summary on OFED 1.4 status

2008-10-29 Thread Olga Shern (Voltaire)
 2. We had a discussion on NFS-RDMA since both RHEL 5.1 and SLES10 SP2
 backports are not working well
 We had a debate - do we take it out of OFED since it is not working on
 the distros
 Leave it in: We can have bug fixes for 1.4.1, and give customers a
 platform to play with
 Take it out: If someone will try it on the distro experience can be
 problematic
 Decision: We will leave it for 2.6.27 kernel only.
 All testing should be done on this kernel mainly to see that basic
 functionality is working

We have tested NFSoRDMA on 2.6.27 and didn't see any of the issues
that we see on Distros.
So basic functionality is working


Re: [ewg] NFS-RDMA compilation problem

2008-10-25 Thread Olga Shern (Voltaire)
Hi Amar,

I suggest you open a bug in the OpenFabrics Bugzilla:
https://bugs.openfabrics.org/.

Thanks
Olga

On Thu, Oct 23, 2008 at 4:50 PM, Amar Mudrankit
[EMAIL PROTECTED] wrote:
 While I was trying to install OFED-1.4-rc3 over SLES 10 SP 2 with
 NFS-RDMA selected for installation, I got the following error message:

 nfs-utils-1.1.1 rpm is required to build kernel-ib

 I have downloaded and installed successfully, the nfs-utils-1.1.4
 **source .tgz** from   http://www.kernel.org/pub/linux/utils/nfs,
 still I was hit with the same error message.

 I was not able to find out nfs-utils rpm that would install over SLES
 10 SP 2.  Can anybody please point me to the location of rpm? Why is
 OFED installation unable to detect the latest installation of nfs
 utils compiled from source and is fully dependent upon the rpm
 installation?

 Regards,
 Amar


[ewg] Re: [ofa-general] OFED-1.4-rc3 is available

2008-10-23 Thread Olga Shern (Voltaire)
 - 27 bugs fixed (see attached for details)

Hi Vlad,

I don't see the attached file.

Olga


Re: [ewg] Re: Continue of defer skb_orphan() until irqs enabled

2008-10-02 Thread Olga Shern (Voltaire)
We ran the regression tests and they were OK.
We will continue the testing and update if we see any issues.

Olga

On Sun, Sep 28, 2008 at 2:40 PM, Olga Shern (Voltaire)
[EMAIL PROTECTED] wrote:
 Hi Eli,

 We also want to run regression tests with this patch.
 Please let me know when OFED daily build will include it.

 Thanks
 Olga

 On Sun, Sep 28, 2008 at 2:39 PM, Eli Cohen [EMAIL PROTECTED] wrote:
 On Fri, Sep 26, 2008 at 01:19:00PM -0700, Roland Dreier wrote:
 How about this?  Instead of trying to rely on some complicated and
 fragile reasoning about when some race might occur, let's just do what
 we want to do anyway and get rid of LLTX.  We change from priv->tx_lock
 (taken with IRQ disabling) to netif_tx_lock (taken on with
 BH-disabling).  And then we can keep the skb_orphan in the place it is,
 since our xmit routine runs with IRQs enabled.


 We'll integrate this into ofed 1.4 and monitor this through our
 regression system.


Re: [ewg] OFED build hangs when trying to build sdpnetstat

2008-10-01 Thread Olga Shern (Voltaire)
Hi,

Please see bugzilla:
https://bugs.openfabrics.org/show_bug.cgi?id=1238

Olga


On Mon, Sep 29, 2008 at 10:17 PM, Woodruff, Robert J
[EMAIL PROTECTED] wrote:

 Has anyone else seen a problem with the OFED install in
 today's daily build hanging while trying to build sdpnetstat ?

 Here are the last few lines in the log file after I did a
 Ctrl-C.
 The hang seems to happen both on EL 5.2 (2.6.18-92.el5) and EL 5.1
 (2.6.18-53.el5).

 + unset DISPLAY
 + make netstat
 Configuring the Linux net-tools (NET-3 Base Utilities)...

 *
 *
 *  Internationalization
 *
 * The net-tools package has currently been translated to French,
 * German and Brazilian Portugese.  Other translations are, of
 * course, welcome.  Answer `n' here if you have no support for
 * internationalization on your system.
 *
 Does your system support GNU gettext? (I18N) [n] *
 *
 * Protocol Families.
 *
 UNIX protocol family (HAVE_AFUNIX) [y] INET (TCP/IP) protocol family
 (HAVE_AFINET) [y] INET6 (IPv6) protocol family (HAVE_AFINET6) [n] make:
 *** Deleting file `config.h'
 make: *** wait: No child processes.  Stop.
 make: *** Waiting for unfinished jobs
 make: *** wait: No child processes.  Stop.



Re: [ewg] Re: Continue of defer skb_orphan() until irqs enabled

2008-09-28 Thread Olga Shern (Voltaire)
Hi Eli,

We also want to run regression tests with this patch.
Please let me know when OFED daily build will include it.

Thanks
Olga

On Sun, Sep 28, 2008 at 2:39 PM, Eli Cohen [EMAIL PROTECTED] wrote:
 On Fri, Sep 26, 2008 at 01:19:00PM -0700, Roland Dreier wrote:
 How about this?  Instead of trying to rely on some complicated and
 fragile reasoning about when some race might occur, let's just do what
 we want to do anyway and get rid of LLTX.  We change from priv->tx_lock
 (taken with IRQ disabling) to netif_tx_lock (taken on with
 BH-disabling).  And then we can keep the skb_orphan in the place it is,
 since our xmit routine runs with IRQs enabled.


 We'll integrate this into ofed 1.4 and monitor this through our
 regression system.


Re: [ewg] Regarding: bonding issue.

2008-09-24 Thread Olga Shern (Voltaire)
Hi Gnana,

First, I would recommend using OFED 1.3.1.
How did you configure bonding?
Please check whether your configuration follows the instructions
in /usr/share/doc/packages/ib-bonding-0.9.0/ib-bonding.txt

Best Regards,
Olga


[ewg] OFED installation on SLES 10

2008-08-28 Thread Olga Shern (Voltaire)
Hi Vlad,

I tested the OFED 1.4 beta installation on a SLES 10 minimal installation,
and all dependency checks were OK except the kernel sources check.
I think we should add a check for SLES that the kernel-source rpm is installed.
I attached a patch that should fix it.

Thanks
Olga


install.diff
Description: Binary data

Re: [ewg] OFED installation on SLES 10

2008-08-28 Thread Olga Shern (Voltaire)
Thanks Vlad,

It works :)


Re: [ewg] OFED installation on RH5 UP2

2008-08-28 Thread Olga Shern (Voltaire)
Hi Vlad,

I found another issue with openmpi rpm removal on RH5 UP2.
See patch attached

Thanks
Olga


uninstall.patch
Description: Binary data

[ewg] Fwd: [ofa-general] [PATCH v3] ib/core: fix for send multicast group send leave retry

2008-08-27 Thread Olga Shern (Voltaire)
Hi Vlad,

Please add this patch to OFED 1.4
It is in Roland's tree for 2.6.28

Thanks
Olga


-- Forwarded message --
From: Yossi Etigin [EMAIL PROTECTED]
Date: Aug 11, 2008 7:35 PM
Subject: [ofa-general] [PATCH v3] ib/core: fix for send multicast
group send leave retry
To: Roland Drier [EMAIL PROTECTED]
Cc: Olga Shern [EMAIL PROTECTED], general list
[EMAIL PROTECTED], Ron Livne [EMAIL PROTECTED]


Until now, only if joining a multicast group failed there was a retry
mechanism.
This patch will add a mechanism that will retry to leave a multicast
group before giving up.

Changes from v1:
- Save the leave state because it's overridden
- use 'else'

Changes from v2:
- Call mcast_work_handler() when send_leave() fails

Signed-off-by: Ron Livne [EMAIL PROTECTED]
Signed-off-by: Yossi Etigin [EMAIL PROTECTED]


Index: b/drivers/infiniband/core/multicast.c
===
--- a/drivers/infiniband/core/multicast.c   2008-08-11 19:13:26.0 +0300
+++ b/drivers/infiniband/core/multicast.c   2008-08-11 19:34:21.0 +0300
@@ -106,6 +106,8 @@ struct mcast_group {
   struct ib_sa_query  *query;
   int query_id;
   u16 pkey_index;
+   u8  leave_state;
+   int retries;
};

struct mcast_member {
@@ -350,6 +352,7 @@ static int send_leave(struct mcast_group

   rec = group->rec;
   rec.join_state = leave_state;
+   group->leave_state = leave_state;

   ret = ib_sa_mcmember_rec_query(&sa_client, port->dev->device,
                                  port->port_num, IB_SA_METHOD_DELETE, &rec,
@@ -542,7 +545,11 @@ static void leave_handler(int status, st
{
   struct mcast_group *group = context;

-   mcast_work_handler(&group->work);
+   if (status && (group->retries > 0) &&
+       !send_leave(group, group->leave_state))
+           group->retries--;
+   else
+           mcast_work_handler(&group->work);
}

static struct mcast_group *acquire_group(struct mcast_port *port,
@@ -565,6 +572,7 @@ static struct mcast_group *acquire_group
   if (!group)
   return NULL;

+   group->retries = 3;
   group->port = port;
   group->rec.mgid = *mgid;
   group->pkey_index = MCAST_INVALID_PKEY_INDEX;
-- 
--Yossi
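
[Archive note] The retry logic added to leave_handler() can be modelled outside the kernel. The sketch below is only an illustration under stated assumptions: `struct group_model`, `send_fn`, `send_ok`, `send_fail` and `on_leave_done` are hypothetical names, and the real code calls send_leave() and mcast_work_handler() instead of the stand-ins used here.

```c
#include <stdbool.h>

/* Simplified model of the per-group retry state (not the kernel struct). */
struct group_model {
    int  retries;     /* remaining leave retries, 3 in the patch */
    int  leave_sent;  /* number of leave requests re-issued */
    bool done;        /* set once normal work handling resumes */
};

/* Hypothetical send hook: returns 0 if the leave request was queued. */
typedef int (*send_fn)(struct group_model *g);

static int send_ok(struct group_model *g)
{
    g->leave_sent++;
    return 0;
}

static int send_fail(struct group_model *g)
{
    (void)g;
    return -1;
}

/*
 * Completion callback for a leave request. On failure, and while retries
 * remain, re-send the leave; consume one retry only if the re-send was
 * queued. In every other case fall through to normal work handling.
 */
static void on_leave_done(struct group_model *g, int status, send_fn send)
{
    if (status && g->retries > 0 && send(g) == 0)
        g->retries--;
    else
        g->done = true;
}
```

With three retries, a leave that keeps failing is re-sent three times and the fourth failure resumes normal handling. A re-send that cannot be queued also resumes normal handling immediately, which matches the "Changes from v2" item above.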

___
general mailing list
[EMAIL PROTECTED]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [ewg] Ofed1.3 bonding problem

2008-07-15 Thread Olga Shern (Voltaire)
On 7/15/08, Acero Fernandez Alicia [EMAIL PROTECTED] wrote:

 Hi everybody,

 I have tried what is explained in the ib-bonding.txt file, but it doesn't
 work. What are the lines in the /etc/modprobe.conf for Redhat Enterprise
 linux 4 up 5? Or perhaps there is some more information needed.

There is no need to add anything to /etc/modprobe.conf if your OS is RH4 UP5

 Could anyone help me?


Can you please send your network scripts for bonding and ifconfig
output and dmesg.


Thanks
Olga


Re: [ewg] Ofed1.3 bonding problem

2008-07-10 Thread Olga Shern (Voltaire)
Hi Alicia,

You are right: the bonding package inside OFED replaces the bonding package
that is installed on your OS (ib-bonding is installed under
/lib/modules/`uname -r`/updates, so if you remove it,
native bonding will work).

I assume that your OS is RH 4.
Indeed, OFED 1.3 had a bug where Ethernet bonding didn't work
on RH4, but there is a workaround in OFED 1.3.1.
The ib-bonding rpm includes an ib-bonding.txt file with instructions
on how to configure Ethernet bonding:
(/usr/share/doc/packages/ib-bonding-0.9.0/ib-bonding.txt)

3.3 Configuring Ethernet slaves
---
It is not possible to have a mix of Ethernet slaves and IPoIB slaves under the
same bonding master. It is possible however that a bonding master of Ethernet
slaves and a bonding master of IPoIB slaves will co-exist in one machine.
To configure Ethernet slaves under a bonding master use the same instructions
as for IPoIB slaves (according  to the OS) with one exception. When working
under Redhat-AS4 do the following when configuring a bonding  master with
Ethernet slaves

- In the master configuration file add the line
SLAVEDEV=1
- In the slave configuration file leave the line
TYPE=InfiniBand

This bug will be fixed in OFED 1.4.

Please let me know if it helps.

Best Regards
Olga


On 7/10/08, Acero Fernandez Alicia [EMAIL PROTECTED] wrote:

 Hi everybody,


 I am trying to install ofed1.3 in my cluster. I have done ethernet bonding
 in some network interfaces and I would like to install ofed1.3, but when I
 try it ethernet network connection is lost. I have been looking for a
 solution, but I have found that the module name for ethernet bonding and for
 infiniband bonding is the same, then perhaps it is the reason, is it true?
 In that case, how could I solve it? It doesn't seem to be solved in
 ofed1.3.1 because I have tried to install it and the same happens.

 Could you help me, please?

 Regards
 Alicia


Re: [ewg] Agenda for the OFED meeting today (May 5)

2008-05-19 Thread Olga Shern (Voltaire)
On 5/19/08, Tziporet Koren [EMAIL PROTECTED] wrote:

 Hi,

 This is the agenda for the OFED meeting today:
 1. OFED 1.3.1:
   1.1 Schedule:
rc1 - done on May 6
rc2 - May 22 == I propose to delay to Thursday since there are
 few IPOIB bugs on work
GA  - May 29
   1.2 OS support:
SLES10 SP2 backports were done (thanks to Moshe from Voltaire)
There is a request for RHEL 5.2 - who has this OS and can help
 with the backports?
   1.3 Bugs status
Please set release version 1.3.1 for all bugs that should be
 resolved in 1.3.1
In the way the bugs are assigned today it is very hard to
 extract the relevant bugs for the release.
This is the list of bugs that should be resolved to my best
 knowledge (please add more):


There is also bug number 1004
1004 https://bugs.openfabrics.org/show_bug.cgi?id=1004 maj P2 RHEL
[EMAIL PROTECTED]  IPoIB failed on stress testing

 1024  normal    [EMAIL PROTECTED]  Bonding-Ping not recovery after
 reconnect the non active interface
 1027  normal    [EMAIL PROTECTED]  kernel panic in mad.c
 handle_outgoing_dr_smp with RESULT_CONSUMED
 1031  normal    [EMAIL PROTECTED]  OpenSM fat tree routing thinks
 fat tree isn't
 1032  critical  [EMAIL PROTECTED]  RHEL  5.1 and OFED 1.3
 cannot write IO blocks greater than 1024.
 1038  normal    [EMAIL PROTECTED]  Kernel panic while running
 tcp/ip ltp tests
 1040  normal    [EMAIL PROTECTED]  Kernel Oops during port up/down
 test
 1041  normal    [EMAIL PROTECTED]  Install Failed with memtrack
 flag in the conf file
 1042  normal    [EMAIL PROTECTED]  ofed-1.3.1 install fails
 2. OFED 1.4:
- Kernel rebase status: we have prepared the new tree, make-dist
 pass but compilation still fails.
  Any help to resolve compilation issues is welcome.
  URL: git://git.openfabrics.org/ofed_1_4/linux-2.6.git
 ofed_kernel
- Update from the participants (mainly on new
 components/features):
  - NFSoRDMA - Jeff
  - Management - Sasha
  - Multiple EQs to best fit multi-core systems - we try to
 define it with Roland
  - RDMA CM to support IPv6 - Woody any news on this?
  - IB BMME and iWARP equivalent memory extensions - under
 progress on the general list

 3. Open discussion
   - Upgrade memory in the OFA server:
 This request raised long time ago and we had a promise to do it
 after 1.3 release. What is the status?
   - Other topics ...

 Tziporet



Re: [ewg] Fwd: [ofa-general] [PATCH/RFC] IPoIB: Handle case when P_Key is deleted and re-added at same index

2008-05-14 Thread Olga Shern (Voltaire)
On 5/14/08, Vladimir Sokolovsky [EMAIL PROTECTED] wrote:

 Olga Shern (Voltaire) wrote:

 
 
 
 
 Hello Olga,
 This patch can't be applied as is to the ofed-1.3.1 git tree:
 
 patching file drivers/infiniband/ulp/ipoib/ipoib_cm.c
 Hunk #1 succeeded at 847 (offset -160 lines).
 patching file drivers/infiniband/ulp/ipoib/ipoib_ib.c
 Hunk #1 succeeded at 488 (offset -106 lines).
 Hunk #2 FAILED at 729.
 1 out of 2 hunks FAILED -- saving rejects to file
 drivers/infiniband/ulp/ipoib/ipoib_ib.c.rej
 
 Can you recreate this patch against
 git://git.openfabrics.org/ofed_1_3/linux-2.6.git
 http://git.openfabrics.org/ofed_1_3/linux-2.6.git ofed_kernel?
 
 Regards,
 Vladimir
 
 
   It was applied without issues on OFED 1.3, therefore I sent it as is.
  I will recreate it against OFED 1.3.1
   Olga
 
 
 I added this patch (as is) to ofed-1.3.1 kernel git tree as
 kernel_patches/fixes/ipoib_0360_Handle_case_when_P_Key_is_deleted.patch.
 Thanks,

 Regards,
 Vladimir



Great,

Thanks
Olga

Re: [ewg] Fwd: [ofa-general] [PATCH/RFC] IPoIB: Handle case when P_Key is deleted and re-added at same index

2008-05-13 Thread Olga Shern (Voltaire)

 

 Hello Olga,
 This patch can't be applied as is to the ofed-1.3.1 git tree:

 patching file drivers/infiniband/ulp/ipoib/ipoib_cm.c
 Hunk #1 succeeded at 847 (offset -160 lines).
 patching file drivers/infiniband/ulp/ipoib/ipoib_ib.c
 Hunk #1 succeeded at 488 (offset -106 lines).
 Hunk #2 FAILED at 729.
 1 out of 2 hunks FAILED -- saving rejects to file
 drivers/infiniband/ulp/ipoib/ipoib_ib.c.rej

 Can you recreate this patch against
 git://git.openfabrics.org/ofed_1_3/linux-2.6.git ofed_kernel?

 Regards,
 Vladimir



It was applied without issues on OFED 1.3, therefore I sent it as is.
I will recreate it against OFED 1.3.1

Olga

Re: [ewg] RE: [ofa-general] OFED May 5 meeting summary

2008-05-12 Thread Olga Shern (Voltaire)
On 5/12/08, Tziporet Koren [EMAIL PROTECTED] wrote:

 Moshe Kazir wrote:

 
  I have checked OFED-1.3.1-rc1 on SLES10 SP 2 Beta3.
 
  ib-bonding compile failed.  Everything else is compiled o.k.
  Attached : ib-bonding error log.
 
 
  I'll take the backport of ib-bonding to sles10 sp 2 on me (if needed,
  I'll get Moni's help).
 
 
 
 Thanks
 Please update when done.
 Any need for a change in the install script?


It seems that there is no need for changes in the install script,
I will update you

Tziporet





[ewg] Fwd: [ofa-general] [PATCH/RFC] IPoIB: Handle case when P_Key is deleted and re-added at same index

2008-05-12 Thread Olga Shern (Voltaire)
Hi Vlad,

Please add this patch to OFED 1.3.1
In addition to its main purpose, this patch also fixes issues we
saw with partitioning and SM failover, because of:

*Also, switch to using ib_find_pkey() instead of ib_find_cached_pkey()
everywhere in IPoIB, since none of the places that look for P_Keys are
in a fast path or in non-sleeping context, and in general we want to
kill off the whole caching infrastructure eventually.  This also fixes
consistency problems caused because some IPoIB queries were cached and
some were uncached during the window where the cache was not updated.*
Thanks
Olga


-- Forwarded message --
From: Roland Dreier [EMAIL PROTECTED]
Date: Apr 15, 2008 8:55 AM
Subject: [ofa-general] [PATCH/RFC] IPoIB: Handle case when P_Key is deleted
and re-added at same index
To: [EMAIL PROTECTED]

If a P_Key is deleted and then re-added at the same index, then IPoIB
gets confused because __ipoib_ib_dev_flush() only checks whether the
index is the same without checking whether the P_Key was present, so
the interface is stopped when the P_Key is deleted, but the event when
the P_Key is re-added gets ignored and the interface never gets
restarted.

Also, switch to using ib_find_pkey() instead of ib_find_cached_pkey()
everywhere in IPoIB, since none of the places that look for P_Keys are
in a fast path or in non-sleeping context, and in general we want to
kill off the whole caching infrastructure eventually.  This also fixes
consistency problems caused because some IPoIB queries were cached and
some were uncached during the window where the cache was not updated.

Thanks to Venkata Subramonyam [EMAIL PROTECTED] for debugging this
problem and testing this fix.

Signed-off-by: Roland Dreier [EMAIL PROTECTED]
---
drivers/infiniband/ulp/ipoib/ipoib_cm.c |4 ++--
drivers/infiniband/ulp/ipoib/ipoib_ib.c |   10 +-
2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 9d411f2..9db7b0b 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -1007,9 +1007,9 @@ static int ipoib_cm_modify_tx_init(struct net_device *dev,
    struct ipoib_dev_priv *priv = netdev_priv(dev);
    struct ib_qp_attr qp_attr;
    int qp_attr_mask, ret;
-   ret = ib_find_cached_pkey(priv->ca, priv->port, priv->pkey, &qp_attr.pkey_index);
+   ret = ib_find_pkey(priv->ca, priv->port, priv->pkey, &qp_attr.pkey_index);
    if (ret) {
-       ipoib_warn(priv, "pkey 0x%x not in cache: %d\n", priv->pkey, ret);
+       ipoib_warn(priv, "pkey 0x%x not found: %d\n", priv->pkey, ret);
        return ret;
    }

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 8b4ff69..0205eb7 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -594,7 +594,7 @@ static void ipoib_pkey_dev_check_presence(struct net_device *dev)
    struct ipoib_dev_priv *priv = netdev_priv(dev);
    u16 pkey_index = 0;

-   if (ib_find_cached_pkey(priv->ca, priv->port, priv->pkey, &pkey_index))
+   if (ib_find_pkey(priv->ca, priv->port, priv->pkey, &pkey_index))
        clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
    else
        set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
@@ -835,13 +835,13 @@ static void __ipoib_ib_dev_flush(struct ipoib_dev_priv *priv, int pkey_event)
        clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
        ipoib_ib_dev_down(dev, 0);
        ipoib_ib_dev_stop(dev, 0);
-       ipoib_pkey_dev_delay_open(dev);
-       return;
+       if (ipoib_pkey_dev_delay_open(dev))
+           return;
    }
-   set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);

    /* restart QP only if P_Key index is changed */
-   if (new_index == priv->pkey_index) {
+   if (test_and_set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags) &&
+       new_index == priv->pkey_index) {
        ipoib_dbg(priv, "Not flushing - P_Key index not changed.\n");
        return;
    }
--
1.5.5
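
[Archive note] The behavioural change in __ipoib_ib_dev_flush() comes down to one decision: skip the restart only when the P_Key was already marked as assigned and its index is unchanged. A minimal user-space model of that decision follows. It is a sketch under assumptions: `struct pkey_state`, `needs_flush` and `pkey_removed` are hypothetical names, and a plain `bool` stands in for the atomic IPOIB_PKEY_ASSIGNED bit.

```c
#include <stdbool.h>

/* Simplified model of the per-interface P_Key state. */
struct pkey_state {
    bool assigned;  /* models the IPOIB_PKEY_ASSIGNED flag */
    int  index;     /* last known P_Key table index */
};

/* The P_Key disappeared from the table: forget that it was assigned. */
static void pkey_removed(struct pkey_state *st)
{
    st->assigned = false;
}

/*
 * Called when the P_Key is found at new_index. Returns true when the QP
 * must be restarted. Mirrors test_and_set_bit(): read the old flag value,
 * then set the flag unconditionally.
 */
static bool needs_flush(struct pkey_state *st, int new_index)
{
    bool was_assigned = st->assigned;

    st->assigned = true;
    if (was_assigned && new_index == st->index)
        return false;   /* present before and index unchanged: no restart */

    st->index = new_index;
    return true;
}
```

Before the patch the comparison looked only at the index, so a P_Key that was deleted and re-added at the same index never triggered a restart. In this model that case returns true because `assigned` was cleared in between.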


Re: [ewg] OFED May 5 meeting summary

2008-05-11 Thread Olga Shern (Voltaire)
On 5/6/08, Tziporet Koren [EMAIL PROTECTED] wrote:


 May 5 OFED meeting summary:
 ===

 1. OFED 1.3.1:
1.1  Status of changes:
IB-bonding - on work
SRP failover - done (need more testing)
SDP crashes - on work (not clear if we will have
 something on time)
RDS fixes for RDMA API - done
librdmacm 1.0.7 - done
uDAPL updates - done
Open MPI 1.2.6 - done
MVAPICH 1.0.1 - done
MVAPICH2 1.0.3 - done
IPoIB - 2 bugs fixed. There are still two issue that
 should be resolved.
Low level drivers: Changes that already committed:
nes
mlx4
cxgb3
ehca

1.2 Schedule:
rc1 - was released today
rc2 - May 20
GA  - May 29

1.3 Discussion:
- ipath driver is going to be updated
- There is an issue of bonding and Ethernet drivers on RHEL4 -
 under debug
- We wish to add support for SLES10 SP2. Already got an approval
 from Novell
Any volunteer to provide the new backport patches?



Tziporet, we will do it.
We have already started; it seems that everything compiles, and only
bonding still needs to be backported.

Olga

2. OFED 1.4:
   Updated that the new tree will be ready next week - based on
 2.6.26-rc

 3. Update on OpenSuSE build system - Yiftah updated on the work that is
 done and problems:
   - The system requires clean RPMs only (no use of install script) -
 they work to resolve
   - We target this system toward releases (and not to replace the daily
 build system).
   - we may try now with OFED 1.3.1


 Tziporet