Re: [openib-general] [RFC] [PATCH v2] IB/ipoib: Add bonding support to IPoIB

2007-02-27 Thread Moni Shoua

Thanks for the comments 

>> To fix it, this patch adds a dev field to struct ipoib_neigh which is used
>> instead of the struct neighbour dev one.
> 
> It seems that in this design, if multiple ipoib interfaces are present, we 
> might
> get an skb such that skb->dev will be different from the new dev field in 
> struct
> ipoib_neigh.
> 
> It seems that the result will be that the packet will be sent on a wrong 
> interface.
> Right?
> 
I don't see how. The dev field in ipoib_neigh doesn't take part in interface
selection.
As I see it, the skb travels this path:
1. It is passed to bond_dev->hard_start_xmit.
2. bond_dev->hard_start_xmit chooses the currently active interface, changes
skb->dev and requeues the skb for transmission.
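
To make the path above concrete, here is a minimal userspace sketch of the interface selection step. It is only a model: the mock_* types and mock_bond_xmit are hypothetical stand-ins invented for illustration, not the bonding driver's real code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's net_device and sk_buff. */
struct mock_net_device {
    const char *name;
    int is_up;
};

struct mock_skb {
    struct mock_net_device *dev;    /* device the packet is queued on */
};

struct mock_bond {
    struct mock_net_device  master;     /* e.g. bond0 */
    struct mock_net_device *slaves[2];  /* e.g. ib0, ib1 */
    int                     active;     /* index of the active slave */
};

/*
 * Model of the bond transmit step: pick the active slave and retarget
 * the skb at it. Returns 0 on success, -1 if the active slave is unusable.
 */
int mock_bond_xmit(struct mock_bond *bond, struct mock_skb *skb)
{
    struct mock_net_device *slave = bond->slaves[bond->active];

    if (slave == NULL || !slave->is_up)
        return -1;

    /* Interface selection happens here, independent of ipoib_neigh->dev. */
    skb->dev = slave;
    return 0;
}
```

In this model the outgoing interface is decided only by the bond's active slave, which is why the dev field stored in ipoib_neigh has no say in it.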

>> In addition, if an IPoIB device is removed before bonding is unloaded it may 
>> cause bond0 neighbours (neighbours that point to bond0) to exist after the 
>> IPoIB
>> device no longer exist. This is why a neighbour cleanup is required during 
>> device 
>> cleanup. This cleanup scans the arp cache and the ndisc cache to find there 
>> neighbours of bond0 which refer also to the relevant ibX. Also, when 
>> ib_ipoib module is
>> unloaded, the neighbour destructor must be set to NULL because the neighbour 
>> function is in
>> ib_ipoib.
>> For this neigh table cleanup, it is required to export the symbol nd_tbl 
>> just like the symbol arp_tbl is.
> 
> I wonder about this: is it really true that any allocated neighbour is always 
> in
> either arp_tbl or nd_tbl? For example, could some code have called neigh_hold
> and retained a neighbour that is not in either one of these tables?
> 
I got the assumption about neighbours living in one of these 2 tables from
observation and code reading.
I preferred that to keeping track of all ipoib_neighs and putting them in
a list. However, I could
do that instead of scanning the neigh tables. Do you think it's better?
As for the example... I didn't understand it. Could you please explain?

>> During my tests I found that when running 
>>
>>  1. modprobe -r ib_mthca (to delete IPoIB interfaces)
>>  2. ping somewhere on the subnet of bond0
>>
>> I get this stack dump (which ends with kernel death)
>>   [] skb_under_panic+0x5c/0x60
>>   [] :ib_ipoib:ipoib_hard_header+0xa6/0xc0
>>   [] arp_create+0x120/0x226
>>   [] arp_send+0x25/0x3b
>>   [] arp_solicit+0x186/0x195
>>   [] neigh_timer_handler+0x2b5/0x309
>>   [] neigh_timer_handler+0x0/0x309
>>   [] run_timer_softirq+0x130/0x19e
>>   [] __do_softirq+0x55/0xc3
>>   [] call_softirq+0x1c/0x28
>>   [] do_softirq+0x2c/0x7d
>>   [] smp_apic_timer_interrupt+0x57/0x6a
>>   [] mwait_idle+0x0/0x45
>>   [] apic_timer_interrupt+0x66/0x70
>> [] mwait_idle+0x42/0x45
>>   [] cpu_idle+0x8b/0xae
>>   [] start_secondary+0x47f/0x48f
>>
>> The only way I found to avoid this (for now) is to check skb headroom in
>> ipoib_hard_header. I guess that this safety check doesn't harm regular IPoIB 
>> operation and it seems to solve my problem. However, I would be happy to 
>> hear what
>> others think of this last issue.
> 
> As I said, this seems to indicate a problem in the bonding code.
> But what will happen after you error out in ipoib_hard_header?
> Is the packet dropped? What might break as a result?
> 
I will check the hard_header_len issue in the bonding code more carefully. At
first look it seems that bonding does borrow the hard_header_len.
Also, my checks show that it is safe to return an error from hard_header().
For example, in neigh_connected_output:

        err = dev->hard_header(skb, dev, ntohs(skb->protocol),
                               neigh->ha, NULL, skb->len);
        read_unlock_bh(&neigh->lock);
        if (err >= 0)
                err = neigh->ops->queue_xmit(skb);
        else {
                err = -EINVAL;
                kfree_skb(skb);
        }
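
The error path in the excerpt above can be modelled in userspace as follows. This is a sketch under assumptions: mock_skb, the function-pointer type and mock_connected_output are hypothetical names, and the queued/freed flags merely stand in for queue_xmit() and kfree_skb().

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical skb: we only record what happened to it. */
struct mock_skb {
    int queued;     /* set when the packet would be handed to queue_xmit */
    int freed;      /* set when the packet would be dropped by kfree_skb */
};

typedef int (*mock_hard_header_fn)(struct mock_skb *skb);

/* Two hypothetical header builders: one succeeds, one reports an error. */
int mock_header_ok(struct mock_skb *skb)   { (void)skb; return 0; }
int mock_header_fail(struct mock_skb *skb) { (void)skb; return -1; }

/* Mirrors the control flow of the neigh_connected_output excerpt. */
int mock_connected_output(struct mock_skb *skb, mock_hard_header_fn hard_header)
{
    int err = hard_header(skb);

    if (err >= 0) {
        skb->queued = 1;    /* stands in for neigh->ops->queue_xmit(skb) */
        err = 0;
    } else {
        err = -EINVAL;
        skb->freed = 1;     /* stands in for kfree_skb(skb) */
    }
    return err;
}
```

In this model a failing header builder means the packet is dropped and -EINVAL is returned to the caller; nothing is transmitted with a bad header.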
 
>> I would really appreciate comments.
>>
>> thanks
>>
>>  -MoniS
> 



___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] [RFC] [PATCH v2] IB/ipoib: Add bonding support to IPoIB

2007-02-26 Thread Moni Shoua
Hi,

This post follows a previous one, regarding required changes to IPoIB to enable
it to work with bonding. Please find it here: 
http://openib.org/pipermail/openib-general/2007-February/032598.html

This version of the patch addresses the comments from Michael Tsirkin on the
last post.

IPoIB uses a two layer neighboring scheme, such that for each struct neighbour
whose device is an ipoib one, there is a struct ipoib_neigh buddy which is
created on demand at the tx flow by an ipoib_neigh_alloc(skb->dst->neighbour)
call.

When using the bonding driver, neighbours are created by the net stack on behalf
of the bonding (master) device. On the tx flow the bonding code gets an skb such
that skb->dev points to the master device; it changes this skb to point to the
slave device and calls the slave's hard_start_xmit function.

Combining these two flows, there is a hole if some code in ipoib
(ipoib_neigh_destructor) assumes that for each struct neighbour it gets, n->dev
is an ipoib device, so that, for example, netdev_priv(n->dev) would be of type
struct ipoib_dev_priv.

To fix it, this patch adds a dev field to struct ipoib_neigh which is used
instead of the struct neighbour dev one.
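
As a rough illustration of the fix described above, the following userspace sketch models a destructor that takes the device from the ipoib_neigh rather than from the neighbour. All mock_* names and the neigh_count field are hypothetical and only stand in for the real structures and for state kept in ipoib_dev_priv.

```c
#include <stddef.h>

struct mock_net_device {
    const char *name;
    int neigh_count;    /* stands in for state kept in ipoib_dev_priv */
};

struct mock_neighbour;

struct mock_ipoib_neigh {
    struct mock_neighbour  *neighbour;
    struct mock_net_device *dev;    /* the new field: always the ibX device */
};

struct mock_neighbour {
    struct mock_net_device  *dev;   /* bond0 when bonding is in use */
    struct mock_ipoib_neigh *ipoib; /* stands in for *to_ipoib_neigh(n) */
};

/* Attach caller-provided ipoib_neigh storage to a neighbour. */
void mock_ipoib_neigh_alloc(struct mock_ipoib_neigh *in,
                            struct mock_neighbour *n,
                            struct mock_net_device *dev)
{
    in->neighbour = n;
    in->dev = dev;          /* remember the real IPoIB device */
    n->ipoib = in;
    dev->neigh_count++;
}

/* Returns the device whose private state was touched, or NULL if none. */
struct mock_net_device *mock_neigh_destructor(struct mock_neighbour *n)
{
    struct mock_ipoib_neigh *in = n->ipoib;

    if (in == NULL)
        return NULL;

    in->dev->neigh_count--; /* uses in->dev, never n->dev */
    n->ipoib = NULL;
    return in->dev;
}
```

The point of the model is that n->dev may be bond0, whose private area is not an ipoib_dev_priv, so the destructor must not interpret it as one.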

In addition, if an IPoIB device is removed before bonding is unloaded, bond0
neighbours (neighbours that point to bond0) may still exist after the IPoIB
device no longer exists. This is why a neighbour cleanup is required during
device cleanup. This cleanup scans the arp cache and the ndisc cache to find
neighbours of bond0 that also refer to the relevant ibX. Also, when the ib_ipoib
module is unloaded, the neighbour destructor must be set to NULL because the
destructor function lives in ib_ipoib.
For this neigh table cleanup, the symbol nd_tbl must be exported, just as the
symbol arp_tbl is.

During my tests I found that when running 

1. modprobe -r ib_mthca (to delete IPoIB interfaces)
2. ping somewhere on the subnet of bond0

I get this stack dump (which ends with kernel death)
 [] skb_under_panic+0x5c/0x60
 [] :ib_ipoib:ipoib_hard_header+0xa6/0xc0
 [] arp_create+0x120/0x226
 [] arp_send+0x25/0x3b
 [] arp_solicit+0x186/0x195
 [] neigh_timer_handler+0x2b5/0x309
 [] neigh_timer_handler+0x0/0x309
 [] run_timer_softirq+0x130/0x19e
 [] __do_softirq+0x55/0xc3
 [] call_softirq+0x1c/0x28
 [] do_softirq+0x2c/0x7d
 [] smp_apic_timer_interrupt+0x57/0x6a
 [] mwait_idle+0x0/0x45
 [] apic_timer_interrupt+0x66/0x70
   [] mwait_idle+0x42/0x45
 [] cpu_idle+0x8b/0xae
 [] start_secondary+0x47f/0x48f

The only way I found to avoid this (for now) is to check the skb headroom in
ipoib_hard_header. I guess that this safety check doesn't harm regular IPoIB
operation, and it seems to solve my problem. However, I would be happy to hear
what others think of this last issue.
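
A userspace sketch of such a headroom check is shown below. It is not the patch's code: the mock_skb layout, the function names and the 4-byte header length (MOCK_IPOIB_HLEN) are assumptions made for illustration.

```c
#include <stddef.h>

#define MOCK_IPOIB_HLEN 4   /* assumed size of the encapsulation header */

/* Hypothetical skb: 'head' is the buffer start, 'data' the payload start. */
struct mock_skb {
    unsigned char  buf[64];
    unsigned char *head;
    unsigned char *data;
};

/* Bytes available in front of the payload. */
size_t mock_skb_headroom(const struct mock_skb *skb)
{
    return (size_t)(skb->data - skb->head);
}

/*
 * Prepend a header only if it fits. Returns 0 on success and -1 when the
 * headroom is too small, instead of running past the start of the buffer.
 */
int mock_hard_header(struct mock_skb *skb, unsigned short proto)
{
    if (mock_skb_headroom(skb) < MOCK_IPOIB_HLEN)
        return -1;

    skb->data -= MOCK_IPOIB_HLEN;
    skb->data[0] = (unsigned char)(proto >> 8);
    skb->data[1] = (unsigned char)(proto & 0xff);
    skb->data[2] = 0;   /* reserved */
    skb->data[3] = 0;   /* reserved */
    return 0;
}
```

Without the check, moving data back by the header length when less headroom is available is what the skb_under_panic in the trace above reports.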

I would really appreciate comments.

thanks

 -MoniS

--
diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h 
b/drivers/infiniband/ulp/ipoib/ipoib.h
index 07deee8..31bc6d8 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -216,6 +216,7 @@ struct ipoib_neigh {
struct sk_buff_head queue;
 
struct neighbour   *neighbour;
+   struct net_device *dev;
 
struct list_headlist;
 };
@@ -232,7 +233,8 @@ static inline struct ipoib_neigh **to_ip
 INFINIBAND_ALEN, sizeof(void *));
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh);
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh,
+ struct net_device *dev);
 void ipoib_neigh_free(struct net_device *dev, struct ipoib_neigh *neigh);
 
 extern struct workqueue_struct *ipoib_workqueue;
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c 
b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 705eb1d..0e3953e 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -48,6 +48,8 @@ #include 
 #include 
 
 #include 
+#include 
+#include 
 
 #define IPOIB_QPN(ha) (be32_to_cpup((__be32 *) ha) & 0xff)
 
@@ -70,6 +72,7 @@ module_param_named(debug_level, ipoib_de
 MODULE_PARM_DESC(debug_level, "Enable debug tracing if > 0");
 #endif
 
+static int ipoib_at_exit = 0;
 struct ipoib_path_iter {
struct net_device *dev;
struct ipoib_path  path;
@@ -490,7 +493,7 @@ static void neigh_add_path(struct sk_buf
struct ipoib_path *path;
struct ipoib_neigh *neigh;
 
-   neigh = ipoib_neigh_alloc(skb->dst->neighbour);
+   neigh = ipoib_neigh_alloc(skb->dst->neighbour, skb->dev);
if (!neigh) {
++priv->stats.tx_dropped;
dev_kfree_skb_any(skb);
@@ -735,6 +738,9 @@ static int ipoib_hard_header(struct sk_b
 {
struct ipoib_header *header;
 

Re: [openib-general] [PATCH] IB/ipoib get net_device from ipoib_neigh instead of linux neighbour

2007-02-08 Thread Moni Shoua
Michael S. Tsirkin wrote:
>> Quoting Moni Shoua <[EMAIL PROTECTED]>:
>> Subject: Re: [PATCH] IB/ipoib get net_device from ipoib_neigh instead of 
>> linux neighbour
>>
>>
>>> Another concern: assume that one device goes away (e.g. hotplug).
>>> It seems that neighbours whose dev field point to another device, will not 
>>> be destroyed.
>>> Correct?
>> I agree.
>>
>>> Therefore in your design, it seems that to_ipoib_neigh()->dev
>>> will get us a pointer to device that has been removed already.
>>>
>> I agree that this is a problem.
> 
> I think we can solve this if we track all ipoib neighbours, like we do for 
> old kernels,
> and then flush ipoib neighbours on any hotplug event.
> Roland, does this sound too awful?
> 
>> It think it would be best to prevent an IPoIB device
>> from disappearing or from ib_ipoib from being unloaded as long as IPoIB
>> device is a slave. Unfortunately, I don't see how this can be done just
>> by fixing something in bonding or IPoIB. 
> 
> So hotplug is blocked potentially forever?
> This does not sound good.
OK, so I'm dropping this thought.
> 
>> However, any slave knows he has a master (dev->master). 
>> What do you think about a solution where IPoIB first tries to clean up the
>> neighbours that belong to it's master before deleting the IPoIB device?
> 
> How?
Let me know what you think about this. I hope it makes sense.
In IPoIB, before calling unregister_netdev, do:
    for each kernel neighbour n
        if n->dev == ib_dev->master
            delete n

Michael, as I see it we have to deal with 2 cases.
1. The IPoIB device is deleted (unregister_netdev) - the IPoIB netdev is no
longer in the kernel's address space.
We have to make sure that no one holds a pointer to it after it is
deleted.
2. The ib_ipoib module is unloaded (modprobe -r) - ipoib_neigh_destructor is no
longer in the kernel's address space.
We have to make sure no one calls it after the module is unloaded.
I think that if nothing prevents the execution of the "code" above, it serves
both cases.
Do you see any problem with that?
Do I have to maintain my own list of neighbours, or can I use the kernel's arp
table for that?

I am trying to study the neighbour cleanup function and do something like that 
but
I would be happy to learn from others as well.
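
The cleanup loop proposed above could look roughly like this in a userspace model. The singly linked "table" and every mock_* name are hypothetical; the real neighbour tables are hashed and need proper locking, which this sketch ignores.

```c
#include <stddef.h>

struct mock_net_device {
    const char *name;
    struct mock_net_device *master; /* bond0 for an enslaved ibX, else NULL */
};

struct mock_neighbour {
    struct mock_net_device *dev;
    struct mock_neighbour  *next;
};

/*
 * Unlink every neighbour whose dev is the master of ib_dev.
 * Returns the number of entries removed.
 */
int mock_flush_master_neighbours(struct mock_neighbour **table,
                                 const struct mock_net_device *ib_dev)
{
    struct mock_neighbour **pp = table;
    int removed = 0;

    if (ib_dev->master == NULL)
        return 0;               /* not a slave: nothing to clean up */

    while (*pp != NULL) {
        if ((*pp)->dev == ib_dev->master) {
            *pp = (*pp)->next;  /* "delete n" */
            removed++;
        } else {
            pp = &(*pp)->next;
        }
    }
    return removed;
}
```

Note that, like the pseudocode, this removes all neighbours of the master, including any that were resolved through a different slave.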


>>>> Furthermore, bond_setup_by_slave is called only for non
>>>> Ethernet devices (we consider to change the logic to "called only for
>>>> IPoIB devices just for safety).
>>> Why is this necessary, BTW?
>>>
>> If we don't do that, we get a memory leak because the neigh destructor will
>> never be called for non IPoIB devices although they carry ipoib_neigh
>> with them.
> 
> How can this happen? If it does, I think we are back to where we started:
> to_ipoib_neigh is broken for non-IPoIB device.
> I thought you said only devices of the same type can be paired?
> 
> 
The scenario is:
1. The kernel allocates a neighbour structure for bond0, attaches it to an skb
and passes it to the bond xmit function.
2. bond0 passes the skb to ipoib.
3. ipoib allocates an ipoib_neigh and hangs it on the linux neighbour.
4. A while after that, the kernel wants to destroy the neighbour (cleanup) but
doesn't call ipoib_neigh_destructor, because neigh setup registered the
destructor only for the ibX device.








Re: [openib-general] [PATCH] IB/ipoib get net_device from ipoib_neigh instead of linux neighbour

2007-02-07 Thread Moni Shoua

> Another concern: assume that one device goes away (e.g. hotplug).
> It seems that neighbours whose dev field point to another device, will not be 
> destroyed.
> Correct?
I agree.
> 
> Therefore in your design, it seems that to_ipoib_neigh()->dev
> will get us a pointer to device that has been removed already.
> 
I agree that this is a problem. I think it would be best to prevent an IPoIB
device from disappearing, or ib_ipoib from being unloaded, as long as the IPoIB
device is a slave. Unfortunately, I don't see how this can be done just
by fixing something in bonding or IPoIB.
However, any slave knows it has a master (dev->master).
What do you think about a solution where IPoIB first tries to clean up the
neighbours that belong to its master before deleting the IPoIB device?

>> Furthermore, bond_setup_by_slave is called only for non
>> Ethernet devices (we consider to change the logic to "called only for
>> IPoIB devices just for safety).
> 
> Why is this necessary, BTW?
> 
If we don't do that, we get a memory leak because the neigh destructor will
never be called for non IPoIB devices although they carry ipoib_neigh
with them.






Re: [openib-general] [PATCH] IB/ipoib get net_device from ipoib_neigh instead of linux neighbour

2007-02-06 Thread Moni Shoua
Michael S. Tsirkin wrote:
>>--
>>IPoIB uses a two layer neighboring scheme, such that for each struct neighbour
>>whose device is an ipoib one, there is a struct ipoib_neigh buddy which is
>>created on demand at the tx flow by an ipoib_neigh_alloc(skb->dst->neighbour)
>>call.
>>
>>When using the bonding driver, neighbours are created by the net stack on 
>>behalf
>>of the bonding (master) device. On the tx flow the bonding code gets an skb 
>>such
>>that skb->dev points to the master device, it changes this skb to point on the
>>slave device and calls the slave hard_start_xmit function.
>>
>>Combing these two flows, there is a hole if some code at ipoib
>>(ipoib_neigh_destructor) assumes that for each struct neighbour it gets, 
>>n->dev
>>is an ipoib device so for example netdev_priv(n->dev) would be of type struct
>>ipoib_dev_priv.
> 
> 
> Could you plese elaborate how ipoib_neigh_destructor comes to be called at 
> all?
> At what point does ipoib_neigh_setup_dev get called?
> 
> 
The bond device uses its slave's neigh_setup function.
Please look at line 19 below from the bonding code.
 10 +static void bond_setup_by_slave(struct net_device *bond_dev,
 11 +                                struct net_device *slave_dev)
 12 +{
 13 +   bond_dev->hard_header         = slave_dev->hard_header;
 14 +   bond_dev->rebuild_header      = slave_dev->rebuild_header;
 15 +   bond_dev->hard_header_cache   = slave_dev->hard_header_cache;
 16 +   bond_dev->header_cache_update = slave_dev->header_cache_update;
 17 +   bond_dev->hard_header_parse   = slave_dev->hard_header_parse;
 18 +
 19 +   bond_dev->neigh_setup         = slave_dev->neigh_setup;
 20 +
 21 +   bond_dev->type                = slave_dev->type;
 22 +   bond_dev->hard_header_len     = slave_dev->hard_header_len;
 23 +   bond_dev->addr_len            = slave_dev->addr_len;
 24 +
 25 +   memcpy(bond_dev->broadcast, slave_dev->broadcast,
 26 +          slave_dev->addr_len);
 27 +}
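
To show why borrowing neigh_setup matters, here is a simplified userspace model. It is a sketch under assumptions: in the kernel the hook works through neighbour parameters, whereas here a hypothetical per-neighbour destructor pointer is set directly, and all mock_* names are invented for illustration.

```c
#include <stddef.h>

struct mock_neighbour;
typedef void (*mock_destructor_fn)(struct mock_neighbour *n);

struct mock_neighbour {
    mock_destructor_fn destructor;
    int destroyed;  /* set by the IPoIB destructor when it runs */
};

struct mock_net_device {
    unsigned short type;
    /* Lets a driver install per-neighbour hooks at creation time. */
    void (*neigh_setup)(struct mock_neighbour *n);
};

void mock_ipoib_destructor(struct mock_neighbour *n)
{
    n->destroyed = 1;
}

void mock_ipoib_neigh_setup(struct mock_neighbour *n)
{
    n->destructor = mock_ipoib_destructor;
}

/* Model of the lines 19 and 21 in the listing above. */
void mock_bond_setup_by_slave(struct mock_net_device *bond,
                              const struct mock_net_device *slave)
{
    bond->neigh_setup = slave->neigh_setup;
    bond->type = slave->type;
}

/* What the stack does when it creates a neighbour on a device. */
void mock_neigh_create(struct mock_neighbour *n,
                       const struct mock_net_device *dev)
{
    n->destructor = NULL;
    n->destroyed = 0;
    if (dev->neigh_setup != NULL)
        dev->neigh_setup(n);
}

/* What the stack does when it destroys a neighbour. */
void mock_neigh_destroy(struct mock_neighbour *n)
{
    if (n->destructor != NULL)
        n->destructor(n);
}
```

In the model, a neighbour created on bond0 only gets the IPoIB destructor after mock_bond_setup_by_slave has copied the slave's neigh_setup hook.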
>>To fix it, this patch adds a dev field to struct ipoib_neigh which is used
>>instead of the struct neighbour dev one.
> 
> 
> What I am concerned with is - if the master is not an IPoIB device,
> what guarantee do we have that to_ipoib_neigh will return 0
> and not part of an actual hardware address?
> 
> Without bonding, the reason is that dev points to an ipoib device,
> so we know hw address is 20 bytes.
> 

I guess you meant "if the slave is not an IPoIB device"...

The bond device doesn't allow devices of different types to be grouped
together as its slaves. Furthermore, bond_setup_by_slave is called only for
non-Ethernet devices (we are considering changing the logic to "called only for
IPoIB devices", just for safety).





[openib-general] [PATCH] IB/ipoib get net_device from ipoib_neigh instead of linux neighbour

2007-02-06 Thread Moni Shoua
Michael, Roland,

I'd appreciate it if you took a look at this and gave your comments.

The patch here refers to this thread about adding bonding 
support for IPoIB interfaces and is necessary for it to work properly.
http://openib.org/pipermail/openib-general/2007-January/031934.html

The patch here is for upstream kernel while there is a version of the patch 
for OFED as well (for kernels up to 2.6.16)
http://openib.org/pipermail/openib-general/2007-January/031935.html

thanks
- MoniS

--
IPoIB uses a two layer neighboring scheme, such that for each struct neighbour
whose device is an ipoib one, there is a struct ipoib_neigh buddy which is
created on demand at the tx flow by an ipoib_neigh_alloc(skb->dst->neighbour)
call.

When using the bonding driver, neighbours are created by the net stack on behalf
of the bonding (master) device. On the tx flow the bonding code gets an skb such
that skb->dev points to the master device; it changes this skb to point to the
slave device and calls the slave's hard_start_xmit function.

Combining these two flows, there is a hole if some code in ipoib
(ipoib_neigh_destructor) assumes that for each struct neighbour it gets, n->dev
is an ipoib device, so that, for example, netdev_priv(n->dev) would be of type
struct ipoib_dev_priv.

To fix it, this patch adds a dev field to struct ipoib_neigh which is used
instead of the struct neighbour dev one.

Signed-off-by: Moni Shoua <[EMAIL PROTECTED]>
Signed-off-by: Or Gerlitz <[EMAIL PROTECTED]>
---
 ipoib.h   |4 +++-
 ipoib_main.c  |   23 +--
 ipoib_multicast.c |2 +-
 3 files changed, 17 insertions(+), 12 deletions(-)

Index: infiniband/drivers/infiniband/ulp/ipoib/ipoib.h
===
--- infiniband.orig/drivers/infiniband/ulp/ipoib/ipoib.h2007-01-22 
12:11:25.0 +0200
+++ infiniband/drivers/infiniband/ulp/ipoib/ipoib.h 2007-01-22 
12:18:06.101698456 +0200
@@ -216,6 +216,7 @@ struct ipoib_neigh {
struct sk_buff_head queue;
 
struct neighbour   *neighbour;
+   struct net_device *dev;
 
struct list_headlist;
 };
@@ -232,7 +233,8 @@ static inline struct ipoib_neigh **to_ip
 INFINIBAND_ALEN, sizeof(void *));
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh);
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh,
+ struct net_device *dev);
 void ipoib_neigh_free(struct net_device *dev, struct ipoib_neigh *neigh);
 
 extern struct workqueue_struct *ipoib_workqueue;
Index: infiniband/drivers/infiniband/ulp/ipoib/ipoib_main.c
===
--- infiniband.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c   2007-01-22 
12:11:33.0 +0200
+++ infiniband/drivers/infiniband/ulp/ipoib/ipoib_main.c2007-01-22 
12:34:57.599156580 +0200
@@ -490,7 +490,7 @@ static void neigh_add_path(struct sk_buf
struct ipoib_path *path;
struct ipoib_neigh *neigh;
 
-   neigh = ipoib_neigh_alloc(skb->dst->neighbour);
+   neigh = ipoib_neigh_alloc(skb->dst->neighbour, skb->dev);
if (!neigh) {
++priv->stats.tx_dropped;
dev_kfree_skb_any(skb);
@@ -769,32 +769,34 @@ static void ipoib_set_mcast_list(struct 
 static void ipoib_neigh_destructor(struct neighbour *n)
 {
struct ipoib_neigh *neigh;
-   struct ipoib_dev_priv *priv = netdev_priv(n->dev);
+   struct ipoib_dev_priv *priv;
unsigned long flags;
struct ipoib_ah *ah = NULL;
 
-   ipoib_dbg(priv,
- "neigh_destructor for %06x " IPOIB_GID_FMT "\n",
- IPOIB_QPN(n->ha),
- IPOIB_GID_RAW_ARG(n->ha + 4));
-
-   spin_lock_irqsave(&priv->lock, flags);
 
neigh = *to_ipoib_neigh(n);
if (neigh) {
+   priv = netdev_priv(neigh->dev);
+   ipoib_dbg(priv,
+ "neigh_destructor for %06x " IPOIB_GID_FMT "\n",
+ IPOIB_QPN(n->ha),
+ IPOIB_GID_RAW_ARG(n->ha + 4));
+
+   spin_lock_irqsave(&priv->lock, flags);
if (neigh->ah)
ah = neigh->ah;
list_del(&neigh->list);
ipoib_neigh_free(n->dev, neigh);
+   spin_unlock_irqrestore(&priv->lock, flags);
}
 
-   spin_unlock_irqrestore(&priv->lock, flags);
 
if (ah)
ipoib_put_ah(ah);
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour)
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour,
+ struct net_device *dev)
 {
struc

Re: [openib-general] IB/mthca: question about HCA profile module parameters

2007-02-04 Thread Moni Shoua
Dotan Barak wrote:
> Hi Moni.
> 
> I tried to use the mthca module parameter: for example i tried to change
> the number of QPs.
> 
> I got several failures when i used the HCA 25204:
> * sometimes i got the following error message (when using big values,
> for example 512K QPs):
> ib_mthca: :0c: INIT_HCA command failed aborting.
> ib_mthca: probe of :0c: failed with error -16
> * when i tried to use small amount of QPs (1024) the machine just hanged
> and i noticed a kernel oops message on the console
> 
OK. So I ran more tests on my setup, which now includes
- Dual x86_64 processor (Intel Xeon)
- 1GB RAM
- 25204 HCA - fw_ver=1.1.0

In the range of 16K to 256K for num_qp I got no errors.
For lower and higher values I got errors from INIT_HCA and (not always, and only
for very low values) a machine hang.
Do you have the Oops saved somewhere? Can you post it here, please?


> 
> Did you verify the HCA profile module parameter feature?
As I mentioned earlier, I verified that non-default values can be assigned
and that the HCA works for some selected values.
I also noticed that illegal values cause the driver to write a message to the
kernel log.
However, I didn't test the exact behaviour of all possible values for each
profile variable.
> Is there is any known limitation for the values that should be used?
> (for example: only values which are power of two)
> 
> 
I guess it is clear that there are hardware limitations that don't allow
setting arbitrary values.
Unfortunately, even after looking for them in the PRM, I couldn't figure out
what they are.
The software limits the value to a power of 2 and corrects users if they
try to set a wrong value (to the nearest power of 2). In that case a warning
message is written to the kernel log.
> thanks
> Dotan
> 
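
The power-of-2 correction described in the reply above can be sketched as follows. This is not the mthca code: the function names are hypothetical, and rounding ties upward is an assumption, since the message only says "nearest".

```c
#include <limits.h>

/* True if v is a non-zero power of two. */
int mock_is_pow2(unsigned long v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Nearest power of two to v; ties are rounded up (an assumption). */
unsigned long mock_nearest_pow2(unsigned long v)
{
    unsigned long lo = 1;
    unsigned long hi;

    if (v <= 1)
        return 1;

    /* Largest power of two not exceeding v, computed without overflow. */
    while (lo <= v / 2)
        lo *= 2;

    if (lo == v || lo > ULONG_MAX / 2)
        return lo;

    hi = lo * 2;
    return (v - lo < hi - v) ? lo : hi;
}

/*
 * Returns the value the driver would use. *warned is set to 1 when the
 * request had to be corrected, which is where a log message would go.
 */
unsigned long mock_fix_profile_value(unsigned long requested, int *warned)
{
    *warned = !mock_is_pow2(requested);
    return *warned ? mock_nearest_pow2(requested) : requested;
}
```

For example, a request of 1000 would be corrected to 1024 with a warning, while 65536 would be accepted unchanged.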










Re: [openib-general] IB/mthca: question about HCA profile module parameters

2007-02-04 Thread Moni Shoua
Dotan Barak wrote:
> Hi Moni.
> 
> I tried to use the mthca module parameter: for example i tried to change
> the number of QPs.
> 
> I got several failures when i used the HCA 25204:
> * sometimes i got the following error message (when using big values,
> for example 512K QPs):
> ib_mthca: :0c: INIT_HCA command failed aborting.
> ib_mthca: probe of :0c: failed with error -16
> * when i tried to use small amount of QPs (1024) the machine just hanged
> and i noticed a kernel oops message on the console
> 
> 
> Did you verify the HCA profile module parameter feature?
> Is there is any known limitation for the values that should be used?
> (for example: only values which are power of two)
> 
> 
> thanks
> Dotan
> 

Hi Dotan,
I verified the profile feature up to the level of successful modprobe.
I am working now to look into your report.
thanks





Re: [openib-general] Add bonding suuport to OFED

2007-01-28 Thread Moni Shoua
Vladimir Sokolovsky wrote:
> Hi Moni,
> Please review the following patch to ib-bonding.spec:
> 
> Use %{_prefix} in RPM spec file instead of hard-coded /usr/local/ofed.
> 
> Signed-off-by: Vladimir Sokolovsky <[EMAIL PROTECTED]>
> ---
> 
> diff --git a/ib-bonding.spec b/ib-bonding.spec
> index db02fe8..77e51e0 100644
> --- a/ib-bonding.spec
> +++ b/ib-bonding.spec
> @@ -5,6 +5,8 @@
>  
>  %define _build_name_fmt 
> %%{ARCH}/%%{NAME}-%%{VERSION}-%%{RELEASE}-%%{DISTRIBUTION}-%%{ARCH}.rpm
>  
> +%{!?_prefix: %define _prefix /usr/local/ofed}
> +
>  Summary : ib_bonding patch and modules.
>  Name: %{name}
>  Version : %{version}
> @@ -39,11 +41,11 @@ fi
>  %install
>  [ "${RPM_BUILD_ROOT}" != "/" -a -d ${RPM_BUILD_ROOT} ] && rm -rf 
> ${RPM_BUILD_ROOT}
>  mkdir -p 
> ${RPM_BUILD_ROOT}/lib/modules/%{kversion}/kernel/drivers/net/bonding/
> -mkdir -p ${RPM_BUILD_ROOT}/usr/local/ofed/bin
> -mkdir -p ${RPM_BUILD_ROOT}/usr/local/ofed/docs
> +mkdir -p ${RPM_BUILD_ROOT}%{_prefix}/bin
> +mkdir -p ${RPM_BUILD_ROOT}%{_prefix}/docs
>  install  -m 755 linux/drivers/net/bonding/bonding.ko 
> ${RPM_BUILD_ROOT}/lib/modules/%{kversion}/kernel/drivers/net/bonding/
> -install  -m 755 bin/bond-init.sh ${RPM_BUILD_ROOT}/usr/local/ofed/bin
> -install  -m 755 docs/ib-bonding.txt ${RPM_BUILD_ROOT}/usr/local/ofed/docs
> +install  -m 755 bin/bond-init.sh ${RPM_BUILD_ROOT}%{_prefix}/bin
> +install  -m 755 docs/ib-bonding.txt ${RPM_BUILD_ROOT}%{_prefix}/docs
>  
>  
>  
> @@ -51,7 +53,7 @@ install  -m 755 docs/ib-bonding.txt ${RP
>  if [ ! -z $STACK_PREFIX ] ; then
>  backup_dir=$STACK_PREFIX/backup
>  else
> -backup_dir=/usr/local/ofed/backup
> +backup_dir=%{_prefix}/backup
>  fi
>  
>  
> @@ -69,7 +71,7 @@ STACK_PREFIX=$(test -x /etc/infiniband/i
>  if [ ! -z $STACK_PREFIX ] ; then
>  backup_dir=$STACK_PREFIX/backup
>  else
> -backup_dir=/usr/local/ofed/backup
> +backup_dir=%{_prefix}/backup
>  fi
>  cd $backup_dir
>  found_file=$(find -name bonding.ko)
> @@ -81,6 +83,6 @@ fi
>  
>  %files 
>  /lib/modules/%{kversion}/kernel/drivers/net/bonding/bonding.ko
> -/usr/local/ofed/bin/bond-init.sh
> -/usr/local/ofed/docs/ib-bonding.txt
> +%{_prefix}/bin/bond-init.sh
> +%{_prefix}/docs/ib-bonding.txt
>  
> 
> 
Thanks.
I applied that.





[openib-general] The neigh_setup patch for upstream

2007-01-24 Thread Moni Shoua
Hi,
This is the upstream version of the patch that I sent in for OFED. 
Please comment.
thanks
 - MoniS



IPoIB uses a two layer neighboring scheme, such that for each struct neighbour
whose device is an ipoib one, there is a struct ipoib_neigh buddy which is
created on demand at the tx flow by an ipoib_neigh_alloc(skb->dst->neighbour)
call.

When using the bonding driver, neighbours are created by the net stack on behalf
of the bonding (master) device. On the tx flow the bonding code gets an skb such
that skb->dev points to the master device; it changes this skb to point to the
slave device and calls the slave's hard_start_xmit function.

Combining these two flows, there is a hole if some code in ipoib
(ipoib_neigh_destructor) assumes that for each struct neighbour it gets, n->dev
is an ipoib device, so that, for example, netdev_priv(n->dev) would be of type
struct ipoib_dev_priv.

To fix it, this patch adds a dev field to struct ipoib_neigh which is used
instead of the struct neighbour dev one.

Signed-off-by: Moni Shoua <[EMAIL PROTECTED]>
Signed-off-by: Or Gerlitz <[EMAIL PROTECTED]>
---
 ipoib.h   |4 +++-
 ipoib_main.c  |   23 +--
 ipoib_multicast.c |2 +-
 3 files changed, 17 insertions(+), 12 deletions(-)

Index: infiniband/drivers/infiniband/ulp/ipoib/ipoib.h
===
--- infiniband.orig/drivers/infiniband/ulp/ipoib/ipoib.h2007-01-22 
12:11:25.0 +0200
+++ infiniband/drivers/infiniband/ulp/ipoib/ipoib.h 2007-01-22 
12:18:06.101698456 +0200
@@ -216,6 +216,7 @@ struct ipoib_neigh {
struct sk_buff_head queue;
 
struct neighbour   *neighbour;
+   struct net_device *dev;
 
struct list_headlist;
 };
@@ -232,7 +233,8 @@ static inline struct ipoib_neigh **to_ip
 INFINIBAND_ALEN, sizeof(void *));
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh);
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh,
+ struct net_device *dev);
 void ipoib_neigh_free(struct net_device *dev, struct ipoib_neigh *neigh);
 
 extern struct workqueue_struct *ipoib_workqueue;
Index: infiniband/drivers/infiniband/ulp/ipoib/ipoib_main.c
===
--- infiniband.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c   2007-01-22 
12:11:33.0 +0200
+++ infiniband/drivers/infiniband/ulp/ipoib/ipoib_main.c2007-01-22 
12:34:57.599156580 +0200
@@ -490,7 +490,7 @@ static void neigh_add_path(struct sk_buf
struct ipoib_path *path;
struct ipoib_neigh *neigh;
 
-   neigh = ipoib_neigh_alloc(skb->dst->neighbour);
+   neigh = ipoib_neigh_alloc(skb->dst->neighbour, skb->dev);
if (!neigh) {
++priv->stats.tx_dropped;
dev_kfree_skb_any(skb);
@@ -769,32 +769,34 @@ static void ipoib_set_mcast_list(struct 
 static void ipoib_neigh_destructor(struct neighbour *n)
 {
struct ipoib_neigh *neigh;
-   struct ipoib_dev_priv *priv = netdev_priv(n->dev);
+   struct ipoib_dev_priv *priv;
unsigned long flags;
struct ipoib_ah *ah = NULL;
 
-   ipoib_dbg(priv,
- "neigh_destructor for %06x " IPOIB_GID_FMT "\n",
- IPOIB_QPN(n->ha),
- IPOIB_GID_RAW_ARG(n->ha + 4));
-
-   spin_lock_irqsave(&priv->lock, flags);
 
neigh = *to_ipoib_neigh(n);
if (neigh) {
+   priv = netdev_priv(neigh->dev);
+   ipoib_dbg(priv,
+ "neigh_destructor for %06x " IPOIB_GID_FMT "\n",
+ IPOIB_QPN(n->ha),
+ IPOIB_GID_RAW_ARG(n->ha + 4));
+
+   spin_lock_irqsave(&priv->lock, flags);
if (neigh->ah)
ah = neigh->ah;
list_del(&neigh->list);
ipoib_neigh_free(n->dev, neigh);
+   spin_unlock_irqrestore(&priv->lock, flags);
}
 
-   spin_unlock_irqrestore(&priv->lock, flags);
 
if (ah)
ipoib_put_ah(ah);
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour)
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour,
+ struct net_device *dev)
 {
struct ipoib_neigh *neigh;
 
@@ -803,6 +805,7 @@ struct ipoib_neigh *ipoib_neigh_alloc(st
return NULL;
 
neigh->neighbour = neighbour;
+   neigh->dev = dev;
*to_ipoib_neigh(neighbour) = neigh;
skb_queue_head_init(&neigh->queue);
 
Index: infiniband/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
===

Re: [openib-general] Add bonding suuport to OFED

2007-01-24 Thread Moni Shoua
Hi, Vlad,
Can you please pull this to OFED-1.2?
I guess this requires some changes in the build scripts and configuration files.

I'd be happy to help with that in any way I can. Please let me know.

thanks
 - MoniS

Moni Shoua wrote:
> Originally, bonding is a High Availability solution for Ethernet network 
> interfaces.
> It is a module that implements a virtual network device (not bounded to
> hardware) and enslaves "real" devices. Bonding device  controls its slaves 
> according
> to the bonding policy and the slave's health.
> 
> I am adding a bonding device which is good for IPoIB interfaces. Feel free to 
> install it 
> send comments.
> 
> You just have to build  source RPM, rebuild it and install the binary.
> 
> For now, I have tested the module under RH4-UP3 and SLES10 with OFED-1.1.
> 
> HOW TO BUILD THE SOURCE RPM
> ===
> git clone git://staging.openfabrics.org/~monis/ofed-bond-pkg.git mydir
> cd mydir/
> ./build_rpm.sh 
> ./build_rpm.sh  OR  ./build_rpm.sh --git-url
> 
> 
> After installing the binary RPM read the instructions in
> /usr/local/ofed/docs/ib-bonding.txt
> 
> Note: Using ib-bonding requires applying a patch for IPoIB and replacing
> ib_ipoib.ko. Please find the patch in the following message. 
> Please also note that the patch should be applied after 
> ipoib_8111_to_2_6_16.patch.
> 
>  - MoniS
> 
> 
> ___
> openib-general mailing list
> openib-general@openib.org
> http://openib.org/mailman/listinfo/openib-general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> 
> 



___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [PATCH] IB/ipoib: Add field dev to struct ipoib_neigh

2007-01-24 Thread Moni Shoua
Michael S. Tsirkin wrote:
>>>
>>>Just to clarify - you previously mentioned you saw problems with 2.6.16
>>>backport. Is this an issue you see with 2.6.20 as well?
>>
>>Yes, the same thing happens with kernel 2.6.20. However, the patch for 2.6.20
>>looks a little bit different. I will post it today or tomorrow.
> 
> 
> Let's see that first. I prefer to first look at upstream code, then think
> about backporting.
> 
OK, I will post this patch today.

> But this would hardly help if ipoib module is unloaded while neighbour
> for bonding device is still around and has a pointer to 
> ipoib_neigh_destructor.
> 
> 
>>For later kernels, bond device "borrows" the slave's neigh_setup
>>function in the bond's setup function.
>>
>> ==> bond_dev->neigh_setup = slave_dev->neigh_setup;
>>
>>So even if the neighbour points to the bond device, the
>>ipoib_neigh_destructor will be called.
> 
> 
> Same applies here.
> 
This is a good point. The right solution, in my opinion, is to enforce a correct
order of unloading the modules: first bonding and then IPoIB. We still have to
think about how we want to implement this.
> Further, in both cases, it seems that accessing data at to_ipoib_neigh on a 
> neighbour for
> non-ipoib device can cause a crash if hardware address is !=0 at offset 20.
> 
I don't see such a risk. The ipoib_neigh_destructor is called only for
neighbours that were passed as an argument to ipoib_neigh_alloc (for kernels
<= 2.6.16) or for devices that set their neigh_setup function to
ipoib_neigh_setup_dev (for later kernels). The only one (besides IPoIB, of
course) that does that is bonding, and bonding cannot enslave devices of
different types. So, once bonding sets its neigh_setup to
ipoib_neigh_setup_dev, it means it has enslaved an IPoIB device and won't
enslave devices of other types.
However, it might be a good idea to change the condition under which bonding
"borrows" the neigh_setup function.
Currently it is (slave_type != Ethernet) but it should be (slave_type == IPoIB).
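
To make the proposed condition concrete, here is a minimal user-space sketch.
It is not the bonding driver's code: struct net_device is reduced to two
fields, the hook takes a void * instead of the kernel's parameter type, and
bond_borrow_neigh_setup() and fake_ipoib_neigh_setup() are hypothetical names
used only for this illustration. The ARPHRD_* values are the ones from
linux/if_arp.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hardware types, same values as in linux/if_arp.h. */
#define ARPHRD_ETHER        1
#define ARPHRD_INFINIBAND  32

struct net_device;
/* Simplified hook type (assumption: the real hook takes neigh parms). */
typedef int (*neigh_setup_fn)(struct net_device *dev, void *parms);

/* Simplified stand-in for the kernel's struct net_device. */
struct net_device {
	unsigned short type;        /* ARPHRD_* hardware type */
	neigh_setup_fn neigh_setup; /* per-device neighbour setup hook */
};

/* Hypothetical stand-in for ipoib_neigh_setup_dev(). */
static int fake_ipoib_neigh_setup(struct net_device *dev, void *parms)
{
	(void)dev;
	(void)parms;
	return 0;
}

/*
 * Borrow the slave's neigh_setup only when the slave is an IPoIB device
 * (slave_type == IPoIB), instead of for every non-Ethernet slave.
 * Returns 1 if the hook was borrowed, 0 otherwise.
 */
static int bond_borrow_neigh_setup(struct net_device *bond,
				   const struct net_device *slave)
{
	assert(bond != NULL && slave != NULL);
	if (slave->type != ARPHRD_INFINIBAND)
		return 0;
	bond->neigh_setup = slave->neigh_setup;
	return 1;
}
```

With this check, a non-Ethernet slave of some other type leaves the bond's
hook untouched, so the IPoIB destructor can only be reached through a bond
that really enslaves IPoIB devices.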






Re: [openib-general] [PATCH] IB/ipoib: Add field dev to struct ipoib_neigh

2007-01-24 Thread Moni Shoua

> 
> 
> Just to clarify - you previously mentioned you saw problems with 2.6.16
> backport. Is this an issue you see with 2.6.20 as well?
Yes, the same thing happens with kernel 2.6.20. However, the patch for 2.6.20
looks a little bit different. I will post it today or tomorrow.

> 
> Also - in your approach, what prevents the device from going away while there
> are still ipoib_neigh objects around?
Nothing prevents it. You can modprobe -r bonding whenever you want (even when
IPoIB is up) and still be safe from leaks. I think my answer for that is below.

> Also - if neigh does not point to ipoib device, our neigh destructor won't be 
> called
> for it, will it? What will clean the ipoib neigh then?
> 
With kernels up to 2.6.16, patch ipoib_8111_to_2_6_16 adds this to 
ipoib_neigh_alloc
  ==> neigh->neighbour->ops->destructor = ipoib_neigh_destructor;
So I guess there is no such problem here.

For later kernels, bond device "borrows" the slave's neigh_setup
function in the bond's setup function.

 ==> bond_dev->neigh_setup = slave_dev->neigh_setup;

So even if the neighbour points to the bond device, the
ipoib_neigh_destructor will be called.






[openib-general] [PATCH] IB/ipoib: Add field dev to struct ipoib_neigh

2007-01-23 Thread Moni Shoua
IPoIB uses a two layer neighboring scheme, such that for each struct neighbour
whose device is an ipoib one, there is a struct ipoib_neigh buddy which is
created on demand at the tx flow by an ipoib_neigh_alloc(skb->dst->neighbour)
call.

When using the bonding driver, neighbours are created by the net stack on behalf
of the bonding (master) device. On the tx flow the bonding code gets an skb such
that skb->dev points to the master device; it changes this skb to point to the
slave device and calls the slave's hard_start_xmit function.

Combining these two flows, there is a hole if some code in ipoib
(ipoib_neigh_destructor) assumes that for each struct neighbour it gets, n->dev
is an ipoib device, so that, for example, netdev_priv(n->dev) would be of type
struct ipoib_dev_priv.

To fix it, this patch adds a dev field to struct ipoib_neigh which is used
instead of the struct neighbour dev one.
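
For illustration only (this is not part of the patch), the idea can be shown
in a few lines of user-space C. The structs are reduced stand-ins for the
kernel ones, and ipoib_neigh_init() and ipoib_neigh_owner() are hypothetical
helper names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the kernel structures. */
struct net_device {
	const char *name;
};

struct neighbour {
	struct net_device *dev;   /* with bonding this is the master (bond0) */
};

struct ipoib_neigh {
	struct neighbour *neighbour;
	struct net_device *dev;   /* new field: the ipoib device that owns us */
};

/* Record the transmitting ipoib device at allocation time. */
static void ipoib_neigh_init(struct ipoib_neigh *neigh,
			     struct neighbour *n, struct net_device *dev)
{
	assert(neigh != NULL && n != NULL && dev != NULL);
	neigh->neighbour = n;
	neigh->dev = dev;
}

/* Device whose private data the destructor should use. */
static struct net_device *ipoib_neigh_owner(const struct ipoib_neigh *neigh)
{
	assert(neigh != NULL);
	return neigh->dev;        /* not neigh->neighbour->dev */
}
```

When the neighbour belongs to bond0 and the skb was sent on ib0, the owner
returned here is ib0, while neigh->neighbour->dev is still bond0.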

Signed-off-by: Moni Shoua <[EMAIL PROTECTED]>
Signed-off-by: Or Gerlitz <[EMAIL PROTECTED]>

 ipoib.h   |3 ++-
 ipoib_main.c  |   22 +++---
 ipoib_multicast.c |2 +-
 3 files changed, 14 insertions(+), 13 deletions(-)
---
Index: openib-1.1/drivers/infiniband/ulp/ipoib/ipoib.h
===
--- openib-1.1.orig/drivers/infiniband/ulp/ipoib/ipoib.h  2007-01-10 17:53:02.744225722 +0200
+++ openib-1.1/drivers/infiniband/ulp/ipoib/ipoib.h  2007-01-10 17:55:04.121544018 +0200
@@ -218,6 +218,7 @@ struct ipoib_neigh {
struct sk_buff_head queue;
 
struct neighbour   *neighbour;
+   struct net_device *dev;
 
struct list_headall_neigh_list;
struct list_headlist;
@@ -235,7 +236,7 @@ static inline struct ipoib_neigh **to_ip
 INFINIBAND_ALEN, sizeof(void *));
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh);
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neigh, struct net_device *dev);
 void ipoib_neigh_free(struct ipoib_neigh *neigh);
 
 extern struct workqueue_struct *ipoib_workqueue;
Index: openib-1.1/drivers/infiniband/ulp/ipoib/ipoib_main.c
===
--- openib-1.1.orig/drivers/infiniband/ulp/ipoib/ipoib_main.c  2007-01-10 17:53:02.717230544 +0200
+++ openib-1.1/drivers/infiniband/ulp/ipoib/ipoib_main.c  2007-01-10 17:58:55.531209253 +0200
@@ -516,7 +516,7 @@ static void neigh_add_path(struct sk_buf
struct ipoib_path *path;
struct ipoib_neigh *neigh;
 
-   neigh = ipoib_neigh_alloc(skb->dst->neighbour);
+   neigh = ipoib_neigh_alloc(skb->dst->neighbour, skb->dev);
if (!neigh) {
++priv->stats.tx_dropped;
dev_kfree_skb_any(skb);
@@ -799,7 +799,7 @@ static void ipoib_set_mcast_list(struct 
 static void ipoib_neigh_destructor(struct neighbour *n)
 {
struct ipoib_neigh *neigh;
-   struct ipoib_dev_priv *priv = netdev_priv(n->dev);
+   struct ipoib_dev_priv *priv;
unsigned long flags;
struct ipoib_ah *ah = NULL;
 
@@ -808,12 +808,14 @@ static void ipoib_neigh_destructor(struc
list_for_each_entry(tn, &ipoib_all_neigh_list, all_neigh_list)
if (tn->neighbour == n) {
nn = tn;
+   neigh = *to_ipoib_neigh(n);
break;
}
spin_unlock(&ipoib_all_neigh_list_lock);
-   if (!nn)
+   if (!nn || !neigh)
return;
 
+   priv = netdev_priv(neigh->dev);
ipoib_dbg(priv,
  "neigh_destructor for %06x " IPOIB_GID_FMT "\n",
  be32_to_cpup((__be32 *) n->ha),
@@ -821,13 +823,9 @@ static void ipoib_neigh_destructor(struc
 
spin_lock_irqsave(&priv->lock, flags);
 
-   neigh = *to_ipoib_neigh(n);
-   if (neigh) {
-   if (neigh->ah)
-   ah = neigh->ah;
-   list_del(&neigh->list);
-   ipoib_neigh_free(neigh);
-   }
+   ah = neigh->ah;
+   list_del(&neigh->list);
+   ipoib_neigh_free(neigh);
 
spin_unlock_irqrestore(&priv->lock, flags);
 
@@ -835,7 +833,8 @@ static void ipoib_neigh_destructor(struc
ipoib_put_ah(ah);
 }
 
-struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour)
+struct ipoib_neigh *ipoib_neigh_alloc(struct neighbour *neighbour,
+ struct net_device *dev)
 {
struct ipoib_neigh *neigh;
 
@@ -849,6 +848,7 @@ struct ipoib_neigh *ipoib_neigh_alloc(st
spin_lock(&ipoib_all_neigh_list_lock);
list_add_tail(&neigh->all_neigh_list, &ipoib_all_neigh_list);
neigh->neighbour->ops->destructor = ipoib_neigh_destructor;
+   neigh->

[openib-general] Add bonding support to OFED

2007-01-23 Thread Moni Shoua
Originally, bonding is a High Availability solution for Ethernet network
interfaces.
It is a module that implements a virtual network device (not bound to
hardware) and enslaves "real" devices. The bonding device controls its slaves
according to the bonding policy and the slaves' health.

I am adding a bonding device that works with IPoIB interfaces. Feel free to
install it and send comments.

You just have to build the source RPM, rebuild it and install the binary.

For now, I have tested the module under RH4-UP3 and SLES10 with OFED-1.1.

HOW TO BUILD THE SOURCE RPM
===
git clone git://staging.openfabrics.org/~monis/ofed-bond-pkg.git mydir
cd mydir/
./build_rpm.sh 
./build_rpm.sh  OR  ./build_rpm.sh --git-url


After installing the binary RPM read the instructions in
/usr/local/ofed/docs/ib-bonding.txt

Note: Using ib-bonding requires applying a patch for IPoIB and replacing
ib_ipoib.ko. Please find the patch in the following message. 
Please also note that the patch should be applied after 
ipoib_8111_to_2_6_16.patch.

 - MoniS





Re: [openib-general] [PATCH/RFC] IB/ipoib: add selective tx signaling

2007-01-11 Thread Moni Shoua

> Unless you can come up with a way that makes sure that all skbs are
> completed even in low-traffic situations, I don't think this is
> mergeable -- it's just too much of a usability nightmare to have a
> flag that is essentially "break some workloads in a mysterious way to
> make some benchmarks run a little faster."

Thanks for the comment.
My thinking on how to address this issue is: add a periodic task that 
checks if there are uncompleted sends beyond some threshold. If there 
are such, it sets a flag that causes the ipoib tx logic to enforce a 
signal on the next post and sends a packet which is practically a 
NO-OP.

This packet can be, for example, a unicast arp (reply) with src and dst
both being this interface's IP.





Re: [openib-general] [PATCH/RFC] IB/ipoib: add selective tx signaling

2007-01-11 Thread Moni Shoua

> Thinking about this more - why does this patch help some benchmarks?
> The amount of work it takes for the hardware to generate a completion
> is likely negligeable, and we still are scanning the same amount
> of TX WRs in a loop to unmap/free them.
This makes sense, but I think you should also consider the fact that
the tx_lock is taken once per tx completion, so with the patch
the driver spends less time under the lock.


> If you think about it this way, it becomes clear that your workload,
> for some reason, hits a path where you get an event very fast
> after the first completion and there is only a small number of completions
> to handle. So your patch helps just by delaying the event handler until
> there's more work to do. And I expect it wouldn't help TCP much if at all
> as there are RX WRs per each couple of TX WRs.
> 
This is a good point to check. I hope I can get to it and spend time over it 
next week.






Re: [openib-general] [PATCH/RFC] IB/ipoib: add selective tx signaling - performance measurements

2007-01-10 Thread Moni Shoua
Michael S. Tsirkin wrote:
>>>>Tests with iperf and netperf for unicast and multicast destinations show
>>>>an improvement in the ability of user applications to xmit packets.
>>>>
>>>>Examples: Number of successful writes as reported by 30 seconds UDP_STREAM
>>>>of 100 byte packets.
>>>>Tested with netperf on Dual CPU (64bit Intel Xeon 3GHz) running
>>>>linux-2.6.20-rc1 (sender) and
>>>>OFED-1.1 (receiver)
>>>
>>>
>>>IMO netperf reporting is actually not too informative without stats settings.
>>>Try running with e.g. -i 10,2 -I 99,5 - you might discover that your numbers 
>>>are
>>>only accurate within 30%
>>
>>I tried that and I am getting a warning about confidence level not being
>>achieved.  I am still trying to learn about that and trying to understand why
>>(any ideas?) but for the meantime can you explain why do I need statistics 
>>when
>>I am only trying to count the number of successful writes?
> 
> 
> Otherwise your results could be just noise.
I'm sorry, but I don't understand how it can be noise. I am not measuring an
average, PPS or BW, but a true counter (the total number of sent packets), so
confidence seems irrelevant here.
Anyway, port counters and device counters show the same number as netperf, so I
guess this is the real confidence.

> 
> 
>>>>Note that the results below show improvement only for TX so we see an end
>>>>to end packet loss.
>>>
>>>
>>>Hmm, as long as packet drops increase, BW improvements in UDP don't sound
>>>too convincing, do they? You can get infinite BW at 100% drop ...
>>>
>>>
>>>
>>>>Improving the receiver (NAPI) will reduce the packet loss.
>>>
>>>
>>>Needs testing with NAPI patch then?
>>
>>I tried NAPI and I get better results for the receiver but my opinion is that
>>the receiver side is less important here since all I'm trying to improve is
>>the ability to send packets. Am I right?
> 
> 
> Only if you are sure something else is not dropping the packets (e.g.
> buffer overruns triggered).
> 






Re: [openib-general] [PATCH/RFC] IB/ipoib: add selective tx signaling - performance measurements

2007-01-09 Thread Moni Shoua
Michael S. Tsirkin wrote:
>>>>Tests with iperf and netperf for unicast and multicast destinations show
>>>>an improvement in the ability of user applications to xmit packets.
>>>>
>>>>Examples: Number of successful writes as reported by 30 seconds UDP_STREAM
>>>>of 100 byte packets.
>>>>Tested with netperf on Dual CPU (64bit Intel Xeon 3GHz) running
>>>>linux-2.6.20-rc1 (sender) and
>>>>OFED-1.1 (receiver)
>>>
>>>
>>>IMO netperf reporting is actually not too informative without stats settings.
>>>Try running with e.g. -i 10,2 -I 99,5 - you might discover that your numbers 
>>>are
>>>only accurate within 30%
>>
>>I tried that and I am getting a warning about confidence level not being
>>achieved.  I am still trying to learn about that and trying to understand why
>>(any ideas?) but for the meantime can you explain why do I need statistics 
>>when
>>I am only trying to count the number of successful writes?
> 
> 
> Otherwise your results could be just noise.
> 
> 
>>>>Note that the results below show improvement only for TX so we see an end
>>>>to end packet loss.
>>>
>>>
>>>Hmm, as long as packet drops increase, BW improvements in UDP don't sound
>>>too convincing, do they? You can get infinite BW at 100% drop ...
>>>
>>>
>>>
>>>>Improving the receiver (NAPI) will reduce the packet loss.
>>>
>>>
>>>Needs testing with NAPI patch then?
>>
>>I tried NAPI and I get better results for the receiver but my opinion is that
>>the receiver side is less important here since all I'm trying to improve is
>>the ability to send packets. Am I right?
> 
> 
> Only if you are sure something else is not dropping the packets (e.g.
> buffer overruns triggered).
> 
The number of sent packets reported by netperf is equal to the number of sent
packets reported by netdev stats (from running ifconfig before and after
netperf) and to the number of sent packets reported by the port (perfquery).






Re: [openib-general] [PATCH/RFC] IB/ipoib: add selective tx signaling - performance measurements

2007-01-09 Thread Moni Shoua
Michael S. Tsirkin wrote:
>>Tests with iperf and netperf for unicast and multicast destinations show
>>an improvement in the ability of user applications to xmit packets. 
>>
>>Examples: Number of successful writes as reported by 30 seconds UDP_STREAM of 
>>100 byte packets.
>>Tested with netperf on Dual CPU (64bit Intel Xeon 3GHz) running 
>>linux-2.6.20-rc1 (sender) and
>>OFED-1.1 (receiver)
> 
> 
> IMO netperf reporting is actually not too informative without stats settings.
> Try running with e.g. -i 10,2 -I 99,5 - you might discover that your numbers 
> are
> only accurate within 30%
I tried that and I am getting a warning about the confidence level not being
achieved.
I am still trying to learn about that and to understand why (any ideas?), but
in the meantime can you explain why I need statistics when I am only trying to
count the number of successful writes?
> 
> 
>>Note that the results below show improvement only for TX so we see an end to 
>>end packet loss.
> 
> 
> Hmm, as long as packet drops increase, BW improvements in UDP don't sound
> too convincing, do they? You can get infinite BW at 100% drop ...
> 
> 
>>Improving the receiver (NAPI) will reduce the packet loss. 
> 
> 
> Needs testing with NAPI patch then?
> 
I tried NAPI and I get better results for the receiver, but my opinion is that
the receiver side is less important here since all I'm trying to improve is the
ability to send packets. Am I right?






Re: [openib-general] [PATCH/RFC] IB/ipoib: add selective tx signaling

2007-01-09 Thread Moni Shoua
Michael S. Tsirkin wrote:
>>I don't think that holding the skb too long is a trigger for something.
> 
> 
> 
> Are you sure? We are not talking about too long here - unsignalled TX packet
> will never get a completion. As far as I can see, __kfree_skb will
> 1. call dst_release - so this patch might keep a reference on dst 
> indefinitely?
I don't think that holding dst for a long time is unsafe. Imagine a constant
stream of packets to the same destination. In this case there will always be a
reference to a dst struct.
> 2. call skb->destructor if not NULL - this is responsible for socket buffer
>accounting
I addressed the issue of the socket buffer accounting in the opening message.
I don't see it as a problem but rather as a note to the user. Don't you think?
> 3. Releases reference to lots of other objects related to netfiltering
> 
> Are you sure keeping all these references indefinitely is safe?
I can't say I'm 100% sure but please see my comment below.
> 
A comment regarding the word "indefinitely": I understand that theoretically
there is a chance that no further packet will be sent through the ib interface,
leaving resources allocated unnecessarily, as you described. I think, however,
that the chance of that is very small and that the price is worth paying for
the performance increase. This is true, of course, only if the penalty is just
resource allocation and not system safety. In this context I can say that my
tests didn't cause any bad system behavior and my senses tell me there
shouldn't be any.
However, I would be glad to learn more from those who know more.
 






Re: [openib-general] [PATCH/RFC] IB/ipoib: add selective tx signaling

2007-01-09 Thread Moni Shoua
Michael S. Tsirkin wrote:
>>This patch implements selective tx signaling for IPoIB.
> 
> 
> Let's assume that the last tx packet you have sent is marked unsignalled.
> Since you never free the skb, won't the TX watchdog get triggered?
> 

AFAIK, tx_timeout is called when
(jiffies - dev->trans_start) > dev->watchdog_timeo.
I don't think that holding the skb too long is a trigger for something.
Anyway, I never saw ipoib_timeout being called during my tests.
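
Spelled out as a small user-space sketch, the condition looks like this.
tx_watchdog_expired() is a hypothetical name for the illustration; as far as I
know the kernel's own watchdog additionally requires the tx queue to be
stopped, which is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * True when the time since the last transmit start exceeds the watchdog
 * timeout. All values are in jiffies. Unsigned subtraction keeps the
 * result correct when the jiffies counter wraps around.
 */
static bool tx_watchdog_expired(unsigned long now,
				unsigned long trans_start,
				unsigned long watchdog_timeo)
{
	assert(watchdog_timeo > 0);
	return (now - trans_start) > watchdog_timeo;
}
```

Note that nothing in this condition looks at how long an skb has been held;
only the time of the last transmit matters.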





Re: [openib-general] [PATCH/RFC] IB/ipoib: add selective tx signaling - performance measurements

2007-01-09 Thread Moni Shoua

Tests with iperf and netperf for unicast and multicast destinations show
an improvement in the ability of user applications to xmit packets. 

Examples: Number of successful writes as reported by 30 seconds UDP_STREAM of
100 byte packets.
Tested with netperf on Dual CPU (64bit Intel Xeon 3GHz) running
linux-2.6.20-rc1 (sender) and OFED-1.1 (receiver)

Note that the results below show improvement only for TX so we see an end to
end packet loss.
Improving the receiver (NAPI) will reduce the packet loss.

--
Without the patch
PPS=230507

linux:~ # netperf -H 192.168.11.234 -t UDP_STREAM -l 30 -- -m 100
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.11.234 (192.168.11.234) port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

262144     100   30.00     6915225      0     184.40
135168           30.00     6366068            169.75

--
tx_signal_rate=1
PPS=244116

linux:~ # netperf -H 192.168.11.234 -t UDP_STREAM -l 30 -- -m 100
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.11.234 (192.168.11.234) port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

262144     100   30.00     7323482      0     195.27
135168           30.00     6905764            184.13

--
tx_signal_rate=4
PPS=254748

linux:~ # netperf -H 192.168.11.234 -t UDP_STREAM -l 30 -- -m 100
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.11.234 (192.168.11.234) port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

262144     100   30.00     7642461      0     203.77
135168           30.00     6741080            179.74

--
tx_signal_rate=8
PPS=278458

linux:~ # netperf -H 192.168.11.234 -t UDP_STREAM -l 30 -- -m 100
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.11.234 (192.168.11.234) port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

262144     100   30.01     8353760      0     222.73
135168           30.01     6884056            183.54

--
tx_signal_rate=16
PPS=316418

linux:~ # netperf -H 192.168.11.234 -t UDP_STREAM -l 30 -- -m 100
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.11.234 (192.168.11.234) port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

262144     100   30.00     9492551      0     253.11
135168           30.00     6501771            173.37

--
tx_signal_rate=32
PPS=328316

linux:~ # netperf -H 192.168.11.234 -t UDP_STREAM -l 30 -- -m 100
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.11.234 (192.168.11.234) port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

262144     100   30.00     9849480      0     262.62
135168           30.00     6006394            160.15

--
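
For anyone who wants to re-derive the numbers above, here is a small sketch of
the arithmetic. It assumes that PPS is the sender's "Okay" count divided by
the nominal 30 second test length, and that the second row of each netperf
table is the receiver's count. The function names are made up for this
example.

```c
#include <assert.h>

/* Packets per second over the nominal test length (the netperf -l value). */
static unsigned long packets_per_second(unsigned long messages,
					unsigned long test_secs)
{
	assert(test_secs > 0);
	return messages / test_secs;   /* integer division truncates */
}

/* Throughput in 10^6 bits/sec, the unit netperf prints. */
static double throughput_mbps(unsigned long messages, unsigned msg_bytes,
			      double elapsed_secs)
{
	assert(elapsed_secs > 0.0);
	return (double)messages * msg_bytes * 8.0 / elapsed_secs / 1e6;
}

/* Percentage of sent messages that the receiver did not report. */
static double loss_percent(unsigned long sent, unsigned long received)
{
	assert(sent > 0 && received <= sent);
	return 100.0 * (double)(sent - received) / (double)sent;
}
```

For the unpatched run, packets_per_second(6915225, 30) gives 230507, matching
the PPS line, and loss_percent(6915225, 6366068) is roughly 7.9%. For
tx_signal_rate=32 the same calculation gives a loss of roughly 39%, which is
the end to end packet loss mentioned at the top.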






[openib-general] [PATCH/RFC] IB/ipoib: add selective tx signaling

2007-01-09 Thread Moni Shoua

This patch implements selective tx signaling for IPoIB.
It lets the user set the ratio between the number of 
sent packets and the number of TX completion signals.
This optimization has the following advantages:
+ increase the packet per second (PPS) rate
+ reduce the number of interrupts related to ipoib tx completions  

Since the IB HCA HW executes work requests posted to a QP in order, we can
assume that a completion of a work request means that all the work
requests posted before it are also completed and hence their
associated resources (skbs in this context) can be recycled.

The current driver implementation asks for a completion signaling for every 
sent packet (a ratio of 1). 
This patch enables the user to set a higher ratio.

Asking for a completion signal for every n (>1) packets saves the following:
1. fewer interrupts to the host
2. the amortized cost of tx completion handling is lowered
3. the tx_lock is taken less often
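
To show the mechanism in isolation, here is a user-space sketch of a TX ring
that requests a completion on every n-th post and, on a completion, reclaims
that slot together with all earlier ones. This is not the patch's code:
RING_SIZE, struct tx_ring, tx_post() and tx_complete() are hypothetical, and
the in_use flag stands in for the skb held in each slot.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 64u   /* hypothetical ring size; must be a power of two */

struct tx_ring {
	unsigned head;           /* free-running index of the next post */
	unsigned tail;           /* free-running index of the oldest held slot */
	unsigned signal_rate;    /* n: request a completion every n posts */
	bool in_use[RING_SIZE];  /* stand-in for "slot holds an skb" */
};

/* Post one send. Returns true if this work request asks for a completion. */
static bool tx_post(struct tx_ring *r)
{
	assert(r->signal_rate > 0);
	assert(r->head - r->tail < RING_SIZE);   /* ring must not be full */

	r->in_use[r->head & (RING_SIZE - 1)] = true;
	r->head++;
	return (r->head % r->signal_rate) == 0;
}

/*
 * Handle the completion for slot wr_id. Because the HCA completes work
 * requests in order, every slot from tail up to wr_id is done as well.
 * Returns the number of slots reclaimed.
 */
static unsigned tx_complete(struct tx_ring *r, unsigned wr_id)
{
	unsigned last = wr_id & (RING_SIZE - 1);
	unsigned freed = 0;
	bool done = false;

	while (r->tail != r->head && !done) {
		unsigned slot = r->tail & (RING_SIZE - 1);

		r->in_use[slot] = false;   /* here the driver would free the skb */
		r->tail++;
		freed++;
		done = (slot == last);
	}
	return freed;
}
```

With signal_rate set to 4, three posts go out unsignaled, the fourth asks for
a completion, and that single completion reclaims all four slots in one pass
under one lock acquisition.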


The cost of selective signaling is in the average amount of memory that the
IPoIB driver consumes, since skbs are freed in the TX completion handler (which
is now executed less often).
So, while the current driver holds only a few skbs at any given time (and
normally not more than one), the new driver holds up to n skbs (n being the
ratio between sent packets and tx completions).
For a reasonable value of n this can lead to extra consumption of a few tens of
Kbytes, but the real issue is elsewhere. Applications that set the socket
buffer to a small size (with setsockopt()) may suffer from ENOBUFS failures
when calling sendto() or sendmsg(). A good example of this is ping with a
signaling ratio of 16 packets to 1 completion request. In this case a few
successful pings are followed by an endless sequence of errors (until ping
restarts). The solution is to set n with attention to the specific user
applications and to use setsockopt() with care (ping, for instance, can be run
with -S).

Another issue is related to the ipoib_ib_dev_stop() operation. This function
checks that the tail and head of the tx_ring are equal, and if they are not it
assumes that there are uncompleted work requests.
With this patch it is normal for the tail and head of the tx_ring to differ,
since we are not always asking for a completion notification. Since I don't see
a way to tell whether the tail/head gap is normal or due to a failure, I only
reduce the message severity from warn to dbg if the condition for an expected
gap is true.
However, I still see a tiny chance that a completion notification would arrive
after the timeout in ipoib_ib_dev_stop() expires and then it tries to free the
skbs in the tx_ring. Solutions to that can be:
   1. protect the code with a lock - but I started out trying to avoid locks
   2. reduce the hazard by adding more to the timeout and calling
      test_bit(IPOIB_FLAG_INITIALIZED, &priv->flags); in the TX completion
      handler to check if ipoib_ib_dev_stop() had started.


I would be happy to get comments on the last issue and on the rest of the
patch, of course.
Thanks
 - MoniS



 ipoib.h   |2 ++
 ipoib_ib.c|   39 ---
 ipoib_main.c  |   10 +-
 ipoib_verbs.c |4 ++--
 4 files changed, 41 insertions(+), 14 deletions(-)

Signed-off-by: Moni Shoua <[EMAIL PROTECTED]>
---

Index: infiniband/drivers/infiniband/ulp/ipoib/ipoib.h
===
--- infiniband.orig/drivers/infiniband/ulp/ipoib/ipoib.h  2007-01-07 15:39:49.421190295 +0200
+++ infiniband/drivers/infiniband/ulp/ipoib/ipoib.h  2007-01-07 15:42:33.768824668 +0200
@@ -164,6 +164,7 @@ struct ipoib_dev_priv {
struct ipoib_tx_buf *tx_ring;
unsigned tx_head;
unsigned tx_tail;
+   unsigned tx_completion_mark;
struct ib_sgetx_sge;
struct ib_send_wrtx_wr;
 
@@ -335,6 +336,7 @@ static inline void ipoib_unregister_debu
 
 extern int ipoib_sendq_size;
 extern int ipoib_recvq_size;
+extern int num_unsignal_tx;
 
 extern struct ib_sa_client ipoib_sa_client;
 
Index: infiniband/drivers/infiniband/ulp/ipoib/ipoib_ib.c
===
--- infiniband.orig/drivers/infiniband/ulp/ipoib/ipoib_ib.c  2007-01-07 15:39:49.443186365 +0200
+++ infiniband/drivers/infiniband/ulp/ipoib/ipoib_ib.c  2007-01-07 19:29:21.885896644 +0200
@@ -256,29 +256,32 @@ static void ipoib_ib_handle_tx_wc(struct
return;
}
 
-   tx_req = &priv->tx_ring[wr_id];
+   do {
+   tx_req = &priv->tx_ring[wr_id];
 
-   ib_dma_unmap_single(priv->ca, tx_req->mapping,
-   tx_req->skb->len, DMA_TO_DEVICE);
+   ib_dma_unmap_single(priv->ca, tx_req->mapping,
+   tx_req->skb->len, DMA_TO_DEVICE);
 

[openib-general] [PATCH v2] IB_mthca HCA profile module parameters

2006-11-16 Thread Moni Shoua
From: Leonid Arsh <[EMAIL PROTECTED]>

Adds module parameters that enable setting some of the HCA
profile values.
Signed-off-by: Leonid Arsh <[EMAIL PROTECTED]>
Signed-off-by: Moni Shoua <[EMAIL PROTECTED]>
---
  mthca_main.c |  139 ++-
  1 files changed, 128 insertions(+), 11 deletions(-)
--- mthca_main.c.orig   2006-11-14 22:07:58.0 -0500
+++ mthca_main.c2006-11-16 11:27:17.683513163 -0500
@@ -80,21 +80,134 @@
  module_param(tune_pci, int, 0444);
  MODULE_PARM_DESC(tune_pci, "increase PCI burst from the default set by BIOS if nonzero");

+#define MTHCA_DEFAULT_NUM_QP(1 << 16)
+#define MTHCA_DEFAULT_RDB_PER_QP(1 << 2)
+#define MTHCA_DEFAULT_NUM_CQ(1 << 16)
+#define MTHCA_DEFAULT_NUM_MCG   (1 << 13)
+#define MTHCA_DEFAULT_NUM_MPT   (1 << 17)
+#define MTHCA_DEFAULT_NUM_MTT   (1 << 20)
+#define MTHCA_DEFAULT_NUM_UDAV  (1 << 15)
+#define MTHCA_DEFAULT_NUM_RESERVED_MTTS (1 << 18)
+#define MTHCA_DEFAULT_NUM_UARC_SIZE (1 << 18)
+
+static struct mthca_profile default_profile = {
+   .num_qp= MTHCA_DEFAULT_NUM_QP,
+   .rdb_per_qp= MTHCA_DEFAULT_RDB_PER_QP,
+   .num_cq= MTHCA_DEFAULT_NUM_CQ,
+   .num_mcg   = MTHCA_DEFAULT_NUM_MCG,
+   .num_mpt   = MTHCA_DEFAULT_NUM_MPT,
+   .num_mtt   = MTHCA_DEFAULT_NUM_MTT,
+   .num_udav  = MTHCA_DEFAULT_NUM_UDAV, /* Tavor only */
+   .fmr_reserved_mtts = MTHCA_DEFAULT_NUM_RESERVED_MTTS,   /* Tavor only */
+   .uarc_size = MTHCA_DEFAULT_NUM_UARC_SIZE,   /* Arbel only */
+};
+
+module_param_named(num_qp, default_profile.num_qp, int, 0444);
+MODULE_PARM_DESC(num_qp, "maximum number of available QPs per HCA");
+
+module_param_named(rdb_per_qp, default_profile.rdb_per_qp, int, 0444);
+MODULE_PARM_DESC(rdb_per_qp, "number of RDB buffers per QP");
+
+module_param_named(num_cq, default_profile.num_cq, int, 0444);
+MODULE_PARM_DESC(num_cq, "maximum number of CQs per HCA");
+
+module_param_named(num_mcg, default_profile.num_mcg, int, 0444);
+MODULE_PARM_DESC(num_mcg, "maximum number of multicast groups per HCA");
+
+module_param_named(num_mpt, default_profile.num_mpt, int, 0444);
+MODULE_PARM_DESC(num_mpt,
+   "maximum number of memory protection table entries per HCA");
+
+module_param_named(num_mtt, default_profile.num_mtt, int, 0444);
+MODULE_PARM_DESC(num_mtt,
+"maximum number of memory translation table segments per HCA");
+/* Tavor only */
+module_param_named(num_udav, default_profile.num_udav, int, 0444);
+MODULE_PARM_DESC(num_udav, "maximum number of UD address vectors per HCA");
+
+/* Tavor only */
+module_param_named(fmr_reserved_mtts, default_profile.fmr_reserved_mtts, int, 0444);
+MODULE_PARM_DESC(fmr_reserved_mtts,
+"number of memory translation table segments reserved for FMR");
+
  static const char mthca_version[] __devinitdata =
DRV_NAME ": Mellanox InfiniBand HCA driver v"
DRV_VERSION " (" DRV_RELDATE ")\n";

-static struct mthca_profile default_profile = {
-   .num_qp= 1 << 16,
-   .rdb_per_qp= 4,
-   .num_cq= 1 << 16,
-   .num_mcg   = 1 << 13,
-   .num_mpt   = 1 << 17,
-   .num_mtt   = 1 << 20,
-   .num_udav  = 1 << 15,   /* Tavor only */
-   .fmr_reserved_mtts = 1 << 18,   /* Tavor only */
-   .uarc_size = 1 << 18,   /* Arbel only */
-};
+#define is_power_of_2(x) (!((x) & ((x) - 1)))
+
+static int __devinit mthca_check_profile_value(int *pval, int pval_default)
+{
+        /* value must be positive and power of 2 */
+        int old_pval = *pval;
+
+        if (old_pval <= 0) {
+                *pval = pval_default;
+        } else if (!is_power_of_2(old_pval)) {
+                *pval = roundup_pow_of_two(old_pval);
+        }
+        /* nonzero when the value had to be changed */
+        return old_pval - *pval;
+}
+
+static int __devinit mthca_validate_profile(struct mthca_dev *mdev,
+   struct mthca_profile *profile)
+{
+if (mthca_check_profile_value(&default_profile.num_qp,
+  MTHCA_DEFAULT_NUM_QP)){
+   mthca_warn(mdev,"invalid num_qp passed. changed to %d.\n",
+   default_profile.num_qp); 
+   }
+
+   if (mthca_check_profile_value(&default_profile.rdb_per_qp,
+  MTHCA_DEFAULT_RDB_PER_QP)){
+mthca_warn(mdev,"invalid rdb_per_qp passed. changed to %d\n",
+   default_profile.rdb_per_qp); 
+   }
+
+   if (mthca_check_profile_value(&d

Re: [openib-general] [PATCH] IB/mthca: HCA profile module parameters

2006-11-16 Thread Moni Shoua
Roland Dreier wrote:

>The patch is line-wrapped and bizarrely corrupted and won't apply, eg:
>
> > +   mthca_warn(mdev, "num_qp rounded to power of 2 (%d).\n",
> > + default_profile.num_qp); +}
>
>This is completely unnecessary:
>
> > +#define to_up_power_of_2(x) (x = roundup_pow_of_two(x))
>
>...just open code this.
>
>And this seems strange:
>
> > +#define is_power_of_2(x) (x>0 &&(x & (x - 1)))
>
>so there's no warning if someone passes in a negative value??  and
>it's backwards too, (x & (x - 1)) is 0 precisely for the powers of 2.
>Was this patch tested at all?
>
>Anyway, all this
>
> > +   if (!is_power_of_2(default_profile.num_qp)){
> > +   to_up_power_of_2(default_profile.num_qp);
> > +   mthca_warn(mdev, "num_qp rounded to power of 2 (%d).\n",
> > + default_profile.num_qp); +}
>
>seems very repetitive.  Can't it be wrapped up in a function so we just
>do something like
>
>   mthca_check_profile_value(&default_profile.num_qp);
>   mthca_check_profile_value(&default_profile.rdb_per_qp);
>   mthca_check_profile_value(&default_profile.num_cq);
>
>etc.
>
> - R.
>
>  
>
Thanks for the comments.
The lines got wrapped because I used the "wrong" email client. I'll 
re-submit with another client, but that will be in a new thread
because I still have problems reading mail with it and therefore can't 
reply to this thread. Sorry for the bother...

The patch was tested, but unfortunately I sent the wrong one (not the 
final version).
The new version is the one I should have sent, plus changes according to the 
comments here.

thanks
MoniS
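
For reference, the power-of-two check discussed above can be sketched in plain userspace C. This is only an illustration, not the driver code: check_profile_value() and round_up_pow2() are hypothetical stand-ins (the kernel patch would use roundup_pow_of_two()), and inputs are assumed to be at most 1 << 30 so that the shift cannot overflow.

```c
#include <stdio.h>

/* x is a power of two exactly when x > 0 and (x & (x - 1)) == 0 */
static int is_power_of_2(int x)
{
        return x > 0 && (x & (x - 1)) == 0;
}

/*
 * Hypothetical userspace stand-in for the kernel's roundup_pow_of_two().
 * Assumes 0 < x <= (1 << 30).
 */
static int round_up_pow2(int x)
{
        int p = 1;

        while (p < x)
                p <<= 1;
        return p;
}

/*
 * Replace a non-positive value with the default and round any other
 * value up to a power of two.  Returns nonzero if *pval was changed,
 * so the caller can print a single warning per parameter.
 */
static int check_profile_value(int *pval, int pval_default)
{
        int old = *pval;

        if (old <= 0)
                *pval = pval_default;
        else if (!is_power_of_2(old))
                *pval = round_up_pow2(old);
        return old != *pval;
}
```

With this shape, a caller only needs one line per profile field, plus a warning when the return value is nonzero. Note that a negative value is caught by the first branch, which addresses the "no warning for a negative value" concern.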


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] [PATCH] IB/mthca: HCA profile module parameters

2006-11-15 Thread Moni Shoua
From: Leonid Arsh <[EMAIL PROTECTED]>

Adds module parameters that enable setting some of the HCA
profile values.
Signed-off-by: Leonid Arsh <[EMAIL PROTECTED]>
Signed-off-by: Moni Shoua <[EMAIL PROTECTED]>
---
 mthca_main.c |  104 +--
 1 files changed, 101 insertions(+), 3 deletions(-)
--- mthca_main.c.orig   2006-11-14 22:07:58.0 -0500
+++ mthca_main.c2006-11-15 09:42:30.151093815 -0500
@@ -80,9 +80,6 @@
 module_param(tune_pci, int, 0444);
 MODULE_PARM_DESC(tune_pci, "increase PCI burst from the default set by BIOS if 
nonzero");
 
-static const char mthca_version[] __devinitdata =
-   DRV_NAME ": Mellanox InfiniBand HCA driver v"
-   DRV_VERSION " (" DRV_RELDATE ")\n";
 
 static struct mthca_profile default_profile = {
.num_qp= 1 << 16,
@@ -96,6 +93,103 @@
.uarc_size = 1 << 18,   /* Arbel only */
 };
 
+module_param_named(num_qp, default_profile.num_qp, int, 0444);
+MODULE_PARM_DESC(num_qp, "maximum number of available QPs per HCA");
+
+module_param_named(rdb_per_qp, default_profile.rdb_per_qp, int, 0444);
+MODULE_PARM_DESC(rdb_per_qp, "number of RDB buffers per QP");
+
+module_param_named(num_cq, default_profile.num_cq, int, 0444);
+MODULE_PARM_DESC(num_cq, "maximum number of CQs per HCA");
+
+module_param_named(num_mcg, default_profile.num_mcg, int, 0444);
+MODULE_PARM_DESC(num_mcg, "maximum number of multicast groups per HCA");
+
+module_param_named(num_mpt, default_profile.num_mpt, int, 0444);
+MODULE_PARM_DESC(num_mpt,
+   "maximum number of memory protection table entries per HCA");
+
+module_param_named(num_mtt, default_profile.num_mtt, int, 0444);
+MODULE_PARM_DESC(num_mtt,
+"maximum number of memory translation table segments per HCA");
+/* Tavor only */
+module_param_named(num_udav, default_profile.num_udav, int, 0444);
+MODULE_PARM_DESC(num_udav, "maximum number of UD address vectors per HCA");
+
+/* Tavor only */
+module_param_named(fmr_reserved_mtts, default_profile.fmr_reserved_mtts, int, 
0444);
+MODULE_PARM_DESC(fmr_reserved_mtts,
+"number of memory translation table segments reserved for 
FMR");
+
+static const char mthca_version[] __devinitdata =
+   DRV_NAME ": Mellanox InfiniBand HCA driver v"
+   DRV_VERSION " (" DRV_RELDATE ")\n";
+
+#define is_power_of_2(x) (x>0 &&(x & (x - 1)))
+#define to_up_power_of_2(x) (x = roundup_pow_of_two(x))
+static int __devinit mthca_validate_profile(struct mthca_dev *mdev,
+   struct mthca_profile *profile)
+{
+   if (!is_power_of_2(default_profile.num_qp)){
+   to_up_power_of_2(default_profile.num_qp);
+   mthca_warn(mdev, "num_qp rounded to power of 2 (%d).\n",
+ default_profile.num_qp); 
+   }
+
+   if (!is_power_of_2(default_profile.rdb_per_qp)){
+   to_up_power_of_2(default_profile.rdb_per_qp);
+   mthca_warn(mdev, "rdb_per_qp rounded to power of 2 (%d)\n",
+ default_profile.rdb_per_qp); 
+   }
+
+   if (!is_power_of_2(default_profile.num_cq)){
+   to_up_power_of_2(default_profile.num_cq);
+   mthca_warn(mdev, "num_cq rounded to power of 2 (%d)\n",
+ default_profile.num_cq); 
+   }
+
+   if (!is_power_of_2(default_profile.num_mcg)){
+   to_up_power_of_2(default_profile.num_mcg);
+   mthca_warn(mdev, "num_mcg rounded to power of 2 (%d)\n",
+ default_profile.num_mcg); 
+   }
+   if (!is_power_of_2(default_profile.num_mpt)){
+   to_up_power_of_2(default_profile.num_mpt);
+   mthca_warn(mdev, "num_mpt rounded to power of 2 (%d)\n",
+ default_profile.num_mpt); 
+   }
+
+   if (!is_power_of_2(default_profile.num_mtt)){
+   to_up_power_of_2(default_profile.num_mtt);
+   mthca_warn(mdev, "num_mtt rounded to power of 2 (%d)\n",
+ default_profile.num_mtt); 
+   }
+
+   if (mthca_is_memfree(mdev)) {
+   if (!is_power_of_2(default_profile.num_udav)){
+   to_up_power_of_2(default_profile.num_udav);
+   mthca_warn(mdev, "num_udav rounded to power of 2 
(%d)\n",
+ default_profile.num_udav); 
+   }
+
+   if (!is_power_of_2(default_profile.fmr_reserved_mtts)){
+   to_up_power_of_2(default_profile.fmr_reserved_mtts);
+   mthca_warn(mdev, "fmr_reserved_mtts rounded to power of 
2 (%d)\n",
+

[openib-general] Add module params to mthca to control the HCA profile

2006-11-15 Thread Moni Shoua
Hi,
A few months ago, Leonid Arsh submitted a patch to mthca that makes it 
possible to control some of the HCA profile values.
This patch was discussed here (see references below) but wasn't accepted 
and somehow got lost, so I'd like to re-submit it.

http://openib.org/pipermail/openib-general/2006-May/021821.html
http://openib.org/pipermail/openib-general/2006-May/022424.html


MoniS







Re: [openib-general] Add module params to mthca to control the HCA profile

2006-11-15 Thread Moni Shoua
Moni Shoua wrote:

>Hi,
>A few months ago, Leonid Arsh submitted a patch to mthca that makes it 
>possible to control some of the HCA profile values.
>This patch was discussed here (see references below) but wasn't accepted 
>and somehow got lost, so I'd like to re-submit it.
>
>http://openib.org/pipermail/openib-general/2006-May/021821.html
>http://openib.org/pipermail/openib-general/2006-May/022424.html
>
>
>
>
>  
>
Sorry, submitted to the wrong place





[openib-general] Add module params to mthca to control the HCA profile

2006-11-15 Thread Moni Shoua
Hi,
A few months ago, Leonid Arsh submitted a patch to mthca that makes it 
possible to control some of the HCA profile values.
This patch was discussed here (see references below) but wasn't accepted 
and somehow got lost, so I'd like to re-submit it.

http://openib.org/pipermail/openib-general/2006-May/021821.html
http://openib.org/pipermail/openib-general/2006-May/022424.html





Re: [openib-general] ipoib mtu problem with UDP

2006-11-07 Thread Moni Shoua
Michael S. Tsirkin wrote:

>I tried using ifconfig to limit the ipoib mtu.
>Once I do this on *either* both server and client, or only on the client side,
>UDP seems to stop working:
>
>#ifconfig ib0 mtu 512
>#netperf -c -C -H 11.4.3.68 -f M -t UDP_STREAM
>UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.4.3.68
>(11.4.3.68) port 0 AF_INET : demo
>Socket  Message  Elapsed  Messages   CPU  Service
>SizeSize Time Okay Errors   Throughput   Util Demand
>bytes   bytessecs#  #   MBytes/sec % SS us/KB
>
>118784   65507   10.00   27582  0  172.2 26.33inf
>118784   10.00   0   0.0 23.40inf
>
>Things work fine if the mtu on the client side is 2044:
># ifconfig ib0 mtu 2044
># netperf -c -C -H 11.4.3.68 -f M -t UDP_STREAM
>UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 
>11.4.3.68 (11.4.3.68) port 0 AF_INET : demo
>Socket  Message  Elapsed  Messages   CPU  Service
>SizeSize Time Okay Errors   Throughput   Util Demand
>bytes   bytessecs#  #   MBytes/sec % SS us/KB
>
>118784   65507   10.00   78488  0  490.1 25.312.310
>118784   10.00   68534 428.0 24.552.241
>
>Tested with kernel 2.6.19-rc4 and netperf 2.4.2.
>
>  
>
I get the same results with iperf.
However, both tools succeed with smaller datagrams (netperf uses 65507 bytes by default).

dodly5:/home/shared/testing-tools/x86_64/netperf/netperf-2.4.1 # 
ifconfig ib0
ib0   Link encap:UNSPEC  HWaddr 
00-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
  inet addr:192.168.11.235  Bcast:192.168.11.255  Mask:255.255.255.0
  UP BROADCAST RUNNING MULTICAST  MTU:512  Metric:1
  RX packets:42 errors:0 dropped:0 overruns:0 frame:0
  TX packets:14077513 errors:0 dropped:5 overruns:0 carrier:0
  collisions:0 txqueuelen:128
  RX bytes:5776 (5.6 Kb)  TX bytes:6717604780 (6406.4 Mb)

dodly5:/home/shared/testing-tools/x86_64/netperf/netperf-2.4.1 # 
./netperf   -H 192.168.11.233  -t UDP_STREAM -- -m 3
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 
192.168.11.233 (192.168.11.233) port 0 AF_INET
Socket  Message  Elapsed  Messages
SizeSize Time Okay Errors   Throughput
bytes   bytessecs#  #   10^6bits/sec

262144   3   10.00   52533  01260.59
262144   10.00   22956550.86


dodly5:/home/shared/testing-tools/x86_64/iperf-2.0.2 # ./iperf -uc 
192.168.11.233 -l 65000

Client connecting to 192.168.11.233, UDP port 5001
Sending 65000 byte datagrams
UDP buffer size:   256 KByte (default)

[  3] local 192.168.11.235 port 32769 connected with 192.168.11.233 port 
5001
[  3]  0.0-10.9 sec  1.36 MBytes  1.05 Mbits/sec
[  3] Sent 22 datagrams
[  3] WARNING: did not receive ack of last datagram after 10 tries.
dodly5:/home/shared/testing-tools/x86_64/iperf-2.0.2 # ./iperf -uc 
192.168.11.233

Client connecting to 192.168.11.233, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size:   256 KByte (default)

[  3] local 192.168.11.235 port 32769 connected with 192.168.11.233 port 
5001
[  3]  0.0-10.0 sec  1.25 MBytes  1.05 Mbits/sec
[  3] Sent 893 datagrams
[  3] Server Report:
[  3]  0.0-10.0 sec  1.25 MBytes  1.05 Mbits/sec  0.002 ms0/  893 (0%)







Re: [openib-general] OFED 1.1 Build Issue

2006-11-01 Thread Moni Shoua
Vladimir Sokolovsky wrote:

>
> Ramachandra K wrote:
>
>> Moni Shoua wrote:
>>
>>> We already tried to go this way and found that a local 
>>> Module.symvers is not always generated (but we might have missed 
>>> something though).
>>> I suggest that you check that this alternative way works under all 
>>> OSs compilation (SuSE and RedHat to be precise)...
>>>  
>>>
>> I think Module.symvers generation for external modules was added sometime
>> around 2.6.16, so it's not generated on older kernels (e.g. 2.6.9 kernels
>> on RHEL).
>>
>> In this scenario, when there is no Module.symvers file, I guess the other
>> option is to use a single Kbuild file to build both modules,
>> as explained in section 7.3 of Documentation/kbuild/modules.txt.
>>
>> But this may not be feasible always. Come to think of it, why does the
>> OFED installation procedure not update the kernel Module.symvers file
>> when it replaces the old kernel modules present in /lib/modules/
>> with the new ones ?
>>
>>> BTW, why not update the kernel Module.symvers when kernel-ib-devel 
>>> is installed? This would free the developer from copying it to 
>>> his/her private directory.
>>>  
>>>
>> It might be a good idea to update the Module.symvers file as part of the
>> normal installation and not only kernel-ib-devel. Because if the kernel
>> modules are being replaced (or new modules are being added), shouldn't
>> the Module.symvers file also be updated ?
>> Regards,
>> Ram
>
> Agree,
> Module.symvers should be updated by kernel-ib RPM.
> So, need to implement Moni's suggestion with light changes: update 
> kernel-ib RPM %post and %preun sections instead of kernel-ib-devel RPM 
> %pre and %postun.
>
> Regards,
> Vladimir
>
I agree, although an updated Module.symvers is of no use when the 
devel RPM is not installed.

This is part of the shell script that updates Module.symvers, which you 
can use if you don't find a way to generate Module.symvers in 2.4 
kernels:

# collect the __crc_ symbols of every module below the current directory
for mod in $(find . -name '*.ko') ; do
    nm -o $mod |grep __crc >> /tmp/syms
    n_mods=$((n_mods+1))
done

n_syms=$(wc -l /tmp/syms |cut -f1 -d" ")
echo found $n_syms InfiniBand symbols in $n_mods InfiniBand modules
n=1

MOD_SYMVERS_IB=./Module.symvers.ib
MOD_SYMVERS_PATCH=./Module.symvers.patch
# locate the kernel's Module.symvers
if [ -f /lib/modules/$K_VER/source/Module.symvers ] ; then
    MOD_SYMVERS_KERNEL=/lib/modules/$K_VER/source/Module.symvers
elif [ -f /lib/modules/$K_VER/build/Module.symvers ] ; then
    MOD_SYMVERS_KERNEL=/lib/modules/$K_VER/build/Module.symvers
else
    echo file Module.symvers not found
fi
if [ ! -z "$MOD_SYMVERS_KERNEL" ] ; then
    rm -f $MOD_SYMVERS_IB

    # turn each "file:address A __crc_symbol" line into "0xCRC<TAB>symbol<TAB>file"
    while [ $n -le $n_syms ] ; do
        line=$(head -$n /tmp/syms|tail -1)
        line1=$(echo $line|cut -f1 -d:)
        line2=$(echo $line|cut -f2 -d:)
        file=$(echo $line1|cut -f6- -d/)
        file=$(echo $file|cut -f1 -d.)

        crc=$(echo $line2|cut -f1 -d" ")
        crc=${crc:8}
        sym=$(echo $line2|cut -f3 -d" ")
        sym=${sym:6}
        echo -e  "0x$crc\t$sym\t$file" >> $MOD_SYMVERS_IB
        if [ -z "$allsyms" ] ; then
            allsyms=$sym
        else
            allsyms="$allsyms|$sym"
        fi
        n=$((n+1))
    done
    # keep all kernel entries that the IB modules do not override
    egrep -v "$allsyms" $MOD_SYMVERS_KERNEL >> $MOD_SYMVERS_IB
    diff -u $MOD_SYMVERS_KERNEL $MOD_SYMVERS_IB > $MOD_SYMVERS_PATCH
    patch -d $(dirname $MOD_SYMVERS_KERNEL)  < $MOD_SYMVERS_PATCH
    # save the patch so that the original file can be restored later
    mkdir -p /usr/voltaire/backup
    cp $MOD_SYMVERS_PATCH /usr/voltaire/backup
fi







Re: [openib-general] OFED 1.1 Build Issue

2006-10-31 Thread Moni Shoua
Vladimir Sokolovsky wrote:

> The alternative way to resolve this issue is the following:
> Save the Module.symvers file generated by the OFED kernel modules compilation 
> (drivers/infiniband/Module.symvers).
> It can be added to the kernel-ib-devel RPM in the next OFED release.
> Then, in order to compile an external module, copy this Module.symvers to 
> the directory where the external module is built.
>
> Regards,
> Vladimir
>
>
> Moni Shoua wrote:
>
>> We managed to solve this issue without rebuilding the kernel.
>>
>> Before building any IB-dependent modules (outside of OFED), Module.symvers
>> must be updated.
>> The new CRC values for the symbols can be taken from the modules
>> themselves (nm IB_MODULE | grep __crc_).
>> Once Module.symvers is up to date, there should be no problem building
>> and installing the IB-dependent modules.
>>
>> The solution, step by step:
>> 1. The procedure should run after the kernel-ib-devel RPM is installed. It
>> can be run in the %pre section of the spec file.
>> 2. For each IB module (.ko) listed in $(rpm -ql kernel-ib):
>> 2.1 take out the __crc_ symbols
>> 2.2 extract the symbol name and its CRC value (simple parsing)
>> 2.3 add it to Module.symvers, or replace the existing entry (usually
>> under /lib/modules/$(uname -r)/build/ or /lib/modules/$(uname
>> -r)/source/)
>> 3. Save the diff between the updated Module.symvers and the original (for
>> a future restore).
>> 4. When the kernel-ib-devel RPM is uninstalled, use the patch from (3) to
>> restore Module.symvers. This can be done in the %postun section of the spec
>> file.
>>
>> I'd be glad to get comments about this.
>>
>>
>>
>>
>> -Original Message-
>> From: [EMAIL PROTECTED]
>> [mailto:[EMAIL PROTECTED] On Behalf Of Tom Tucker
>> Sent: Friday, October 27, 2006 5:30 PM
>> To: openib-general
>> Subject: [openib-general] OFED 1.1 Build Issue
>>
>>
>> I've been testing some code against the OFED 1.1 release and noticed
>> that if you build anything that depends on IB (RNFS in this case) into
>> the kernel, that the OFED kit doesn't work correctly. This is because
>> the dependent modules (ib_core, etc...) get sucked into the kernel
>> automagically and will cause the subsequent modprobe of the OFED module
>> to fail.
>>
>> I don't think you can fix this without rebuilding the kernel so it
>> should probably be listed in the OFED_release_notes as a known issue.
>> Providing a mechanism to rebuild the kernel as part of the OFED install
>> would be great too, sorry if it's already there and I missed it.
>>
>> Tom
>>
>>
>>   
>
>
>
We already tried this approach and found that a local Module.symvers is 
not always generated (though we might have missed something).
I suggest that you check that this alternative works when compiling under 
all OSs (SuSE and RedHat, to be precise)...

BTW, why not update the kernel Module.symvers when kernel-ib-devel is 
installed? This would free the developer from copying it to his/her 
private directory.





Re: [openib-general] OFED 1.1 Build Issue

2006-10-29 Thread Moni Shoua
We managed to solve this issue without rebuilding the kernel.

Before building any IB-dependent modules (outside of OFED), Module.symvers
must be updated.
The new CRC values for the symbols can be taken from the modules
themselves (nm IB_MODULE | grep __crc_).
Once Module.symvers is up to date, there should be no problem building
and installing the IB-dependent modules.

The solution, step by step:
1. The procedure should run after the kernel-ib-devel RPM is installed. It
can be run in the %pre section of the spec file.
2. For each IB module (.ko) listed in $(rpm -ql kernel-ib):
2.1 take out the __crc_ symbols
2.2 extract the symbol name and its CRC value (simple parsing)
2.3 add it to Module.symvers, or replace the existing entry (usually
under /lib/modules/$(uname -r)/build/ or /lib/modules/$(uname
-r)/source/)
3. Save the diff between the updated Module.symvers and the original (for
a future restore).
4. When the kernel-ib-devel RPM is uninstalled, use the patch from (3) to
restore Module.symvers. This can be done in the %postun section of the spec
file.
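
The "simple parsing" in steps 2.1 and 2.2 can be sketched as follows. This is a minimal userspace illustration, not part of any OFED script: crc_line_to_symvers() is a hypothetical helper, and it assumes the input is one line of "nm -o module.ko | grep __crc_" output in the form "path:address type __crc_symbol". It keeps the last 8 hex digits of the address as the 32-bit CRC and writes a tab-separated "0xCRC symbol module" record.

```c
#include <stdio.h>
#include <string.h>

/*
 * Convert one "nm -o" line such as
 *   "ib_core.ko:00000000a1b2c3d4 A __crc_ib_register_client"
 * into a record of the form "0x<crc>\t<symbol>\t<module>".
 * Returns 0 on success, -1 if the line is not a __crc_ symbol line
 * or the output buffer is too small.
 */
static int crc_line_to_symvers(const char *line, char *out, size_t outlen)
{
        char path[256], addr[32], type[8], sym[128];
        const char *crc;
        size_t len;
        int n;

        if (sscanf(line, "%255[^:]:%31s %7s %127s", path, addr, type, sym) != 4)
                return -1;
        if (strncmp(sym, "__crc_", 6) != 0)
                return -1;

        /* the CRC is 32 bits wide, so keep only the last 8 hex digits */
        len = strlen(addr);
        if (len < 8)
                return -1;
        crc = addr + len - 8;

        /* drop a trailing ".ko" from the module path */
        len = strlen(path);
        if (len > 3 && strcmp(path + len - 3, ".ko") == 0)
                path[len - 3] = '\0';

        /* sym + 6 skips the "__crc_" prefix */
        n = snprintf(out, outlen, "0x%s\t%s\t%s", crc, sym + 6, path);
        if (n < 0 || (size_t)n >= outlen)
                return -1;
        return 0;
}
```

Each record produced this way would then be added to Module.symvers, or replace the existing entry for the same symbol, as described in step 2.3.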

I'd be glad to get comments about this.




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Tom Tucker
Sent: Friday, October 27, 2006 5:30 PM
To: openib-general
Subject: [openib-general] OFED 1.1 Build Issue


I've been testing some code against the OFED 1.1 release and noticed
that if you build anything that depends on IB (RNFS in this case) into
the kernel, that the OFED kit doesn't work correctly. This is because
the dependent modules (ib_core, etc...) get sucked into the kernel
automagically and will cause the subsequent modprobe of the OFED module
to fail.

I don't think you can fix this without rebuilding the kernel so it
should probably be listed in the OFED_release_notes as a known issue.
Providing a mechanism to rebuild the kernel as part of the OFED install
would be great too, sorry if it's already there and I missed it.

Tom





[openib-general] Re: [openfabrics-ewg] IPoIB ifconfig HWaddr blank on RHEL4 U3?

2006-05-04 Thread Moni Shoua




Scott Weitzenkamp (sweitzen) wrote:

OFED 1.0 rc3 on RHEL4 U3.  IPoIB is working, but I just noticed the HWaddr
is 00-00-00-00-00-00-00-00-00-00-00-00-00-00-0, shouldn't this have the
GID?

[EMAIL PROTECTED] ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:13:72:50:B7:D1
          inet addr:172.29.238.49  Bcast:172.29.239.255  Mask:255.255.252.0
          inet6 addr: fe80::213:72ff:fe50:b7d1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:31232 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13122 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:20539406 (19.5 MiB)  TX bytes:1415914 (1.3 MiB)
          Base address:0xdcc0 Memory:dfae-dfb0

ib0       Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-0
          inet addr:192.168.2.49  Bcast:192.168.3.255  Mask:255.255.252.0
          inet6 addr: fe80::202:c902:21:51d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:839425 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4384118 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128
          RX bytes:44110930 (42.0 MiB)  TX bytes:8046551416 (7.4 GiB)

ib1       Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-0
          inet addr:192.168.4.49  Bcast:192.168.5.255  Mask:255.255.254.0
          inet6 addr: fe80::202:c902:21:51e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
          RX packets:364 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128
          RX bytes:46824 (45.7 KiB)  TX bytes:408 (408.0 b)

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

___
openfabrics-ewg mailing list
[EMAIL PROTECTED]
http://openib.org/mailman/listinfo/openfabrics-ewg

That's probably a bug in ifconfig when dealing with such a long address.
Run "ip address show" and you'll see that the correct HW address
is shown.




[openib-general] Calculation of maximum number of FMR remaps in mthca driver

2006-03-30 Thread Moni Shoua

Hi,
Below is a suggested patch that makes ib_query_device (or, more 
precisely, mthca_query_device) return a real value for max_map_per_fmr 
instead of zero.
ib_create_fmr_pool then uses this value as the maximum number of allowed FMR 
remaps instead of the constant IB_FMR_MAX_REMAPS.


Since this is only a suggestion for now, I have allowed myself not to take 
care of the other drivers yet.
I would be happy to get feedback from the Mellanox driver guys 
on the correctness of the calculation.


thanks

Moni S.



Index: infiniband/core/fmr_pool.c
===
--- infiniband/core/fmr_pool.c(revision 8504)
+++ infiniband/core/fmr_pool.c(working copy)
@@ -214,6 +214,7 @@
{
struct ib_device   *device;
struct ib_fmr_pool *pool;
+struct ib_device_attr device_attr;
int i;
int ret;

@@ -228,6 +229,12 @@
return ERR_PTR(-ENOSYS);
}

+ret = ib_query_device(device, &device_attr);
+if (ret) {
+printk(KERN_WARNING "couldn't query device");
+return ERR_PTR(ret);
+}
+
pool = kmalloc(sizeof *pool, GFP_KERNEL);
if (!pool) {
printk(KERN_WARNING "couldn't allocate pool struct");
@@ -279,7 +286,7 @@
struct ib_pool_fmr *fmr;
struct ib_fmr_attr attr = {
.max_pages  = params->max_pages_per_fmr,
-.max_maps   = IB_FMR_MAX_REMAPS,
+.max_maps   = device_attr.max_map_per_fmr,
.page_shift = params->page_shift
};

Index: infiniband/hw/mthca/mthca_provider.c
===
--- infiniband/hw/mthca/mthca_provider.c(revision 8504)
+++ infiniband/hw/mthca/mthca_provider.c(working copy)
@@ -116,6 +116,8 @@
props->max_total_mcast_qp_attach = props->max_mcast_qp_attach *
   props->max_mcast_grp;

+props->max_map_per_fmr = (1 << (32 - long_log2(mdev->limits.num_mpts))) - 1;
+
err = 0;
 out:
kfree(in_mad)
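
To make the calculation in the mthca_provider.c hunk easier to review, here is a small userspace sketch of the same formula. It is an illustration only: ilog2_u32() is a hypothetical stand-in for the kernel's long_log2(), and num_mpts is assumed to be a nonzero power of two. The 64-bit intermediate is my addition so that the shift stays well defined even when the exponent reaches 32.

```c
#include <stdint.h>

/* floor(log2(x)) for x > 0; hypothetical stand-in for long_log2() */
static unsigned int ilog2_u32(uint32_t x)
{
        unsigned int r = 0;

        while (x >>= 1)
                r++;
        return r;
}

/*
 * (1 << (32 - log2(num_mpts))) - 1, as in the proposed patch.
 * The shift is done in 64 bits so that num_mpts == 1 does not
 * shift a 32-bit value by 32.
 */
static uint32_t max_map_per_fmr(uint32_t num_mpts)
{
        unsigned int bits = 32 - ilog2_u32(num_mpts);

        return (uint32_t)((1ULL << bits) - 1);
}
```

If limits.num_mpts matches the default profile value of 1 << 17, this yields (1 << 15) - 1 = 32767 remaps per FMR.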



Re: [openib-general] Please give 1.0 RC1 a whirl

2006-03-15 Thread Moni Shoua

James Lentini wrote:


Arlin,

This email was garbled. I'm 99% certain it was from you, but the from 
field reads "Moni Shoua":


http://openib.org/pipermail/openib-general/2006-March/018318.html

The patch was also mangled. 


Could you resend please?

Thanks,
james

On Wed, 15 Mar 2006, Moni Shoua wrote:

 


Davis, Arlin R wrote:

   

James, 
I am in the process of building the autotools stuff for DAT and DAPL so
it builds exactly like the rest of OpenIB user libraries. I should have
something by the end of the day or tomorrow first thing.

-arlin


 


-Original Message-
From: James Lentini [mailto:[EMAIL PROTECTED]
Sent: Monday, March 13, 2006 10:30 AM
To: Woodruff, Robert J
Cc: Bryan O'Sullivan; openib-general@openib.org; Davis, Arlin R
Subject: RE: [openib-general] Please give 1.0 RC1 a whirl


There are two parts of udapl, the registry and the provider.

There is a provider .spec file at

https://openib.org/svn/gen2/trunk/src/userspace/dapl/dat/udat/linux/dat-registry-1.1.spec

 


If you build the dat registry with "make rpm" an rpm will be
automatically created.

I need to put together a .spec file for the provider.

Do we need to do anything else for 1.0 packaging purposes?

On Thu, 9 Mar 2006, Woodruff, Robert J wrote:

  
   


James/Arlin ?

woody


-Original Message-
From: Bryan O'Sullivan [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 09, 2006 3:00 PM
To: Woodruff, Robert J
Cc: openib-general@openib.org; Davis, Arlin R; 'James Lentini'
Subject: RE: [openib-general] Please give 1.0 RC1 a whirl

On Thu, 2006-03-09 at 14:53 -0800, Bob Woodruff wrote:


 


Where are the uDAPL RPMs ?
  
   


Nobody has fixed uDAPL to be autotools buildable or written a .spec.in
file for it.  That will be up to someone other than me to do :-)


 




 


This is a patch that automates the build of dat and udapl. It also modifies
the packaging of librdmacm.

To build dat/dapl, run the following commands in
src/userspace/dapl/dat and src/userspace/dapl/dapl:
# sh ./autogen.sh
# ./configure
# make dist-gzip
# rpmbuild -ta *.gz

Building dapl requires that the libibverbs-devel, librdmacm and dat RPMs are
installed.



=
diff --exclude=.svn -urN openib.org.fresh/src/userspace/dapl/dapl/autogen.sh
openib.org/src/userspace/dapl/dapl/autogen.sh
--- openib.org.fresh/src/userspace/dapl/dapl/autogen.sh1970-01-01
02:00:00.0 +0200
+++ openib.org/src/userspace/dapl/dapl/autogen.sh2006-03-14
17:03:45.0 +0200
@@ -0,0 +1,9 @@
+#! /bin/sh
+
+set -x
+aclocal -I config
+libtoolize --force --copy
+autoheader
+automake --foreign --add-missing --copy
+autoconf
+
diff --exclude=.svn -urN openib.org.fresh/src/userspace/dapl/dapl/configure.in
openib.org/src/userspace/dapl/dapl/configure.in
--- openib.org.fresh/src/userspace/dapl/dapl/configure.in1970-01-01
02:00:00.0 +0200
+++ openib.org/src/userspace/dapl/dapl/configure.in2006-03-14
17:03:45.0 +0200
@@ -0,0 +1,41 @@
+dnl Process this file with autoconf to produce a configure script.
+
+AC_PREREQ(2.57)
+AC_INIT(dapl, 0.9.0, openib-general@openib.org)
+AC_CONFIG_SRCDIR([udapl/dapl_init.c])
+AC_CONFIG_AUX_DIR(config)
+AM_CONFIG_HEADER(config.h)
+AM_INIT_AUTOMAKE(dapl, 0.9.0)
+
+dnl Checks for programs
+AC_PROG_CXX
+AC_PROG_CC
+AC_PROG_CPP
+AC_PROG_INSTALL
+AC_PROG_LN_S
+AC_PROG_MAKE_SET
+AM_PROG_LIBTOOL
+
+dnl Checks for header files.
+AC_HEADER_STDC
+
+dnl Checks for library functions
+AC_TYPE_SIGNAL
+AC_FUNC_VPRINTF
+
+dnl Checks for typedefs, structures, and compiler characteristics.
+AC_C_CONST
+AC_C_INLINE
+AC_STRUCT_TM
+
+AC_CACHE_CHECK(whether ld accepts --version-script, ac_cv_version_script,
+if test -n "`$LD --help < /dev/null 2>/dev/null | grep version-script`";
then
+ac_cv_version_script=yes
+else
+ac_cv_version_script=no
+fi)
+
+AM_CONDITIONAL(HAVE_LD_VERSION_SCRIPT, test "$ac_cv_version_script" = "yes")
+
+AC_CONFIG_FILES([Makefile dapl.spec])
+AC_OUTPUT
diff --exclude=.svn -urN openib.org.fresh/src/userspace/dapl/dapl/dapl.spec.in
openib.org/src/userspace/dapl/dapl/dapl.spec.in
--- openib.org.fresh/src/userspace/dapl/dapl/dapl.spec.in1970-01-01
02:00:00.0 +0200
+++ openib.org/src/userspace/dapl/dapl/dapl.spec.in2006-03-15
15:25:53.0 +0200
@@ -0,0 +1,41 @@
+# $Id: ipoibcfg.spec.in 28 2004-04-07 20:00:33Z roland $
+
+%define prefix /usr
+%define ver  @VERSION@
+%define  RELEASE 1
+%define  rel %{?CUSTOM_RELEASE} %{!?CUSTOM_RELEASE:%RELEASE}
+
+
+Summary: This package contains the U

Re: [openib-general] Please give 1.0 RC1 a whirl

2006-03-15 Thread Moni Shoua

Davis, Arlin R wrote:

James, 


I am in the process of building the autotools stuff for DAT and DAPL so
it builds exactly like the rest of OpenIB user libraries. I should have
something by the end of the day or tomorrow first thing.

-arlin

 


-Original Message-
From: James Lentini [mailto:[EMAIL PROTECTED]
Sent: Monday, March 13, 2006 10:30 AM
To: Woodruff, Robert J
Cc: Bryan O'Sullivan; openib-general@openib.org; Davis, Arlin R
Subject: RE: [openib-general] Please give 1.0 RC1 a whirl


There are two parts of udapl, the registry and the provider.

There is a provider .spec file at

https://openib.org/svn/gen2/trunk/src/userspace/dapl/dat/udat/linux/dat-registry-1.1.spec
 


If you build the dat registry with "make rpm" an rpm will be
automatically created.

I need to put together a .spec file for the provider.

Do we need to do anything else for 1.0 packaging purposes?

On Thu, 9 Mar 2006, Woodruff, Robert J wrote:

   


James/Arlin ?

woody


-Original Message-
From: Bryan O'Sullivan [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 09, 2006 3:00 PM
To: Woodruff, Robert J
Cc: openib-general@openib.org; Davis, Arlin R; 'James Lentini'
Subject: RE: [openib-general] Please give 1.0 RC1 a whirl

On Thu, 2006-03-09 at 14:53 -0800, Bob Woodruff wrote:

 


Where are the uDAPL RPMs ?
   


Nobody has fixed uDAPL to be autostools buildable or written a
 


.spec.in
 


file for it.  That will be up to someone other than me to do :-)

 


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

 


Sending the patch again, this time as an attachment.

This patch automates the build of dat and udapl. It also modifies the
packaging of librdmacm.

To build dat/dapl, run the following commands in
src/userspace/dapl/dat and src/userspace/dapl/dapl:

# sh ./autogen.sh
# ./configure
# make dist-gzip
# rpmbuild -ta *.gz

Building dapl requires that the RPMs libibverbs-devel, librdmacm and dat are
installed.



diff --exclude=.svn -urN openib.org.fresh/src/userspace/dapl/dapl/autogen.sh openib.org/src/userspace/dapl/dapl/autogen.sh
--- openib.org.fresh/src/userspace/dapl/dapl/autogen.sh  1970-01-01 02:00:00.0 +0200
+++ openib.org/src/userspace/dapl/dapl/autogen.sh  2006-03-14 17:03:45.0 +0200
@@ -0,0 +1,9 @@
+#! /bin/sh
+
+set -x
+aclocal -I config
+libtoolize --force --copy
+autoheader
+automake --foreign --add-missing --copy
+autoconf
+
diff --exclude=.svn -urN openib.org.fresh/src/userspace/dapl/dapl/configure.in openib.org/src/userspace/dapl/dapl/configure.in
--- openib.org.fresh/src/userspace/dapl/dapl/configure.in  1970-01-01 02:00:00.0 +0200
+++ openib.org/src/userspace/dapl/dapl/configure.in  2006-03-14 17:03:45.0 +0200
@@ -0,0 +1,41 @@
+dnl Process this file with autoconf to produce a configure script.
+
+AC_PREREQ(2.57)
+AC_INIT(dapl, 0.9.0, openib-general@openib.org)
+AC_CONFIG_SRCDIR([udapl/dapl_init.c])
+AC_CONFIG_AUX_DIR(config)
+AM_CONFIG_HEADER(config.h)
+AM_INIT_AUTOMAKE(dapl, 0.9.0)
+
+dnl Checks for programs
+AC_PROG_CXX
+AC_PROG_CC
+AC_PROG_CPP
+AC_PROG_INSTALL
+AC_PROG_LN_S
+AC_PROG_MAKE_SET
+AM_PROG_LIBTOOL
+
+dnl Checks for header files.
+AC_HEADER_STDC
+
+dnl Checks for library functions
+AC_TYPE_SIGNAL
+AC_FUNC_VPRINTF
+
+dnl Checks for typedefs, structures, and compiler characteristics.
+AC_C_CONST
+AC_C_INLINE
+AC_STRUCT_TM
+
+AC_CACHE_CHECK(whether ld accepts --version-script, ac_cv_version_script,
+if test -n "`$LD --help < /dev/null 2>/dev/null | grep version-script`"; then
+ac_cv_version_script=yes
+else
+ac_cv_version_script=no
+fi)
+
+AM_CONDITIONAL(HAVE_LD_VERSION_SCRIPT, test "$ac_cv_version_script" = "yes")
+
+AC_CONFIG_FILES([Makefile dapl.spec])
+AC_OUTPUT
diff --exclude=.svn -urN openib.org.fresh/src/userspace/dapl/dapl/dapl.spec.in openib.org/src/userspace/dapl/dapl/dapl.spec.in
--- openib.org.fresh/src/userspace/dapl/dapl/dapl.spec.in  1970-01-01 02:00:00.0 +0200
+++ openib.org/src/userspace/dapl/dapl/dapl.spec.in  2006-03-15 15:25:53.0 +0200
@@ -0,0 +1,41 @@
+# $Id: ipoibcfg.spec.in 28 2004-04-07 20:00:33Z roland $
+
+%define prefix /usr
+%define ver  @VERSION@
+%define  RELEASE 1
+%define  rel %{?CUSTOM_RELEASE} %{!?CUSTOM_RELEASE:%RELEASE}
+
+
+Summary: This package contains the User Direct Access Programming Library (uDAPL)
+Name: dapl
+Version: %ver
+Release: %{rel}%{?dist}
+License: GPL/BSD
+Group: Applications/System
+BuildRoot: %{_tmppath}/%{name}-%{version}-root
+Source: http://openib.org/downloads/%{name}-%{version}.tar.gz
+Url: http://openib.org/
+
+%description
+udat is 
+
+%prep
+%setup -q
+
+%build
+%configure
+pwd
+make -C udapl clean
+make -C udapl
+
+%install
+make -C udapl PREFIX=${RPM_BUILD_ROOT} install
+
+%clean
+rm -rf

Re: [openib-general] Please give 1.0 RC1 a whirl

2006-03-15 Thread Moni Shoua

Davis, Arlin R wrote:

James,

I am in the process of building the autotools stuff for DAT and DAPL so
it builds exactly like the rest of OpenIB user libraries. I should have
something by the end of the day or tomorrow first thing.

-arlin

-----Original Message-----
From: James Lentini [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 13, 2006 10:30 AM
To: Woodruff, Robert J
Cc: Bryan O'Sullivan; openib-general@openib.org; Davis, Arlin R
Subject: RE: [openib-general] Please give 1.0 RC1 a whirl

There are two parts of udapl, the registry and the provider.

There is a provider .spec file at

https://openib.org/svn/gen2/trunk/src/userspace/dapl/dat/udat/linux/dat-registry-1.1.spec

If you build the dat registry with "make rpm" an rpm will be
automatically created.

I need to put together a .spec file for the provider.

Do we need to do anything else for 1.0 packaging purposes?

On Thu, 9 Mar 2006, Woodruff, Robert J wrote:

James/Arlin ?

woody

-----Original Message-----
From: Bryan O'Sullivan [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 09, 2006 3:00 PM
To: Woodruff, Robert J
Cc: openib-general@openib.org; Davis, Arlin R; 'James Lentini'
Subject: RE: [openib-general] Please give 1.0 RC1 a whirl

On Thu, 2006-03-09 at 14:53 -0800, Bob Woodruff wrote:

Where are the uDAPL RPMs ?

Nobody has fixed uDAPL to be autotools buildable or written a .spec.in
file for it.  That will be up to someone other than me to do :-)


This is a patch that automates the build of dat and udapl. It also
modifies the packaging of librdmacm.

To build dat/dapl, run the following commands in
src/userspace/dapl/dat and src/userspace/dapl/dapl:

# sh ./autogen.sh
# ./configure
# make dist-gzip
# rpmbuild -ta *.gz

Building dapl requires that the RPMs libibverbs-devel, librdmacm and dat are
installed.




=
diff --exclude=.svn -urN openib.org.fresh/src/userspace/dapl/dapl/autogen.sh openib.org/src/userspace/dapl/dapl/autogen.sh
--- openib.org.fresh/src/userspace/dapl/dapl/autogen.sh  1970-01-01 02:00:00.0 +0200
+++ openib.org/src/userspace/dapl/dapl/autogen.sh  2006-03-14 17:03:45.0 +0200
@@ -0,0 +1,9 @@
+#! /bin/sh
+
+set -x
+aclocal -I config
+libtoolize --force --copy
+autoheader
+automake --foreign --add-missing --copy
+autoconf
+
diff --exclude=.svn -urN openib.org.fresh/src/userspace/dapl/dapl/configure.in openib.org/src/userspace/dapl/dapl/configure.in
--- openib.org.fresh/src/userspace/dapl/dapl/configure.in  1970-01-01 02:00:00.0 +0200
+++ openib.org/src/userspace/dapl/dapl/configure.in  2006-03-14 17:03:45.0 +0200
@@ -0,0 +1,41 @@
+dnl Process this file with autoconf to produce a configure script.
+
+AC_PREREQ(2.57)
+AC_INIT(dapl, 0.9.0, openib-general@openib.org)
+AC_CONFIG_SRCDIR([udapl/dapl_init.c])
+AC_CONFIG_AUX_DIR(config)
+AM_CONFIG_HEADER(config.h)
+AM_INIT_AUTOMAKE(dapl, 0.9.0)
+
+dnl Checks for programs
+AC_PROG_CXX
+AC_PROG_CC
+AC_PROG_CPP
+AC_PROG_INSTALL
+AC_PROG_LN_S
+AC_PROG_MAKE_SET
+AM_PROG_LIBTOOL
+
+dnl Checks for header files.
+AC_HEADER_STDC
+
+dnl Checks for library functions
+AC_TYPE_SIGNAL
+AC_FUNC_VPRINTF
+
+dnl Checks for typedefs, structures, and compiler characteristics.
+AC_C_CONST
+AC_C_INLINE
+AC_STRUCT_TM
+
+AC_CACHE_CHECK(whether ld accepts --version-script, ac_cv_version_script,
+if test -n "`$LD --help < /dev/null 2>/dev/null | grep version-script`"; then
+ac_cv_version_script=yes
+else
+ac_cv_version_script=no
+fi)
+
+AM_CONDITIONAL(HAVE_LD_VERSION_SCRIPT, test "$ac_cv_version_script" = "yes")
+
+AC_CONFIG_FILES([Makefile dapl.spec])
+AC_OUTPUT
diff --exclude=.svn -urN openib.org.fresh/src/userspace/dapl/dapl/dapl.spec.in openib.org/src/userspace/dapl/dapl/dapl.spec.in
--- openib.org.fresh/src/userspace/dapl/dapl/dapl.spec.in  1970-01-01 02:00:00.0 +0200
+++ openib.org/src/userspace/dapl/dapl/dapl.spec.in  2006-03-15 15:25:53.0 +0200
@@ -0,0 +1,41 @@
+# $Id: ipoibcfg.spec.in 28 2004-04-07 20:00:33Z roland $
+
+%define prefix /usr
+%define ver  @VERSION@
+%define  RELEASE 1
+%define  rel %{?CUSTOM_RELEASE} %{!?CUSTOM_RELEASE:%RELEASE}
+
+
+Summary: This package contains the User Direct Access Programming Library (uDAPL)
+Name: dapl
+Version: %ver
+Release: %{rel}%{?dist}
+License: GPL/BSD
+Group: Applications/System
+BuildRoot: %{_tmppath}/%{name}-%{version}-root
+Source: http://openib.org/downloads/%{name}-%{version}.tar.gz
+Url: http://openib.org/
+
+%description
+udat is
+
+%prep
+%setup -q
+
+%build
+%configure
+pwd
+make -C udapl clean
+make -C udapl
+
+%install
+make -C udapl PREFIX=${RPM_BUILD_ROOT} instal

Re: [openib-general] Please give 1.0 RC1 a whirl

2006-03-14 Thread Moni Shoua




Davis, Arlin R wrote:

James,

I am in the process of building the autotools stuff for DAT and DAPL so
it builds exactly like the rest of OpenIB user libraries. I should have
something by the end of the day or tomorrow first thing.

-arlin

-----Original Message-----
From: James Lentini [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 13, 2006 10:30 AM
To: Woodruff, Robert J
Cc: Bryan O'Sullivan; openib-general@openib.org; Davis, Arlin R
Subject: RE: [openib-general] Please give 1.0 RC1 a whirl

There are two parts of udapl, the registry and the provider.

There is a provider .spec file at

https://openib.org/svn/gen2/trunk/src/userspace/dapl/dat/udat/linux/dat-registry-1.1.spec

If you build the dat registry with "make rpm" an rpm will be
automatically created.

I need to put together a .spec file for the provider.

Do we need to do anything else for 1.0 packaging purposes?

On Thu, 9 Mar 2006, Woodruff, Robert J wrote:

James/Arlin ?

woody

-----Original Message-----
From: Bryan O'Sullivan [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 09, 2006 3:00 PM
To: Woodruff, Robert J
Cc: openib-general@openib.org; Davis, Arlin R; 'James Lentini'
Subject: RE: [openib-general] Please give 1.0 RC1 a whirl

On Thu, 2006-03-09 at 14:53 -0800, Bob Woodruff wrote:

Where are the uDAPL RPMs ?

Nobody has fixed uDAPL to be autotools buildable or written a .spec.in
file for it.  That will be up to someone other than me to do :-)


I have also done some work on automating the build of dat and udapl.
I hope to send a patch for this soon.




[openib-general] ib1 takes ib0 configuration

2006-03-06 Thread Moni Shoua




After building and installing the openib stack on SUSE Linux Enterprise
Server 9.90 Beta5, I noticed that the ib1 interface has an IP configuration
identical to that of ib0.

ib0   Link encap:UNSPEC  HWaddr 00-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
      inet addr:192.68.3.238  Bcast:192.68.255.255  Mask:255.255.0.0
ib1   Link encap:UNSPEC  HWaddr 00-00-04-05-FE-80-00-00-00-00-00-00-00-00-00-00
      inet addr:192.68.3.238  Bcast:192.68.255.255  Mask:255.255.0.0

This causes problems in the machine's IP configuration even if the link
of ib1 is down.
The fact that a network script for ib1 exists doesn't make a difference;
ib1 still takes the ib0 configuration.

When trying to query ib1 status (with /sbin/ifstatus ib1) I get this:

ib1   device: Mellanox Technologies MT25208 InfiniHost III Ex HCA (Tavor compatibility mode) (rev a0)
ib1   configuration: ib0
ib1 is up
10: ib1:  mtu 1500 qdisc pfifo_fast qlen 128
    link/infiniband 00:00:04:05:fe:80:00:00:00:00:00:00:00:08:f1:04:03:97:08:ea brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 192.68.3.238/16 brd 192.68.255.255 scope global ib1
ib1   IP address: 192.68.3.238/16

When looking for the reason for this behavior I found that when the network
is started, ifup calls getcfg like this:

/sbin/getcfg -d . -f ifcfg- -- ib1

and one of the lines in the output is HWD_CONFIG_0="ib0". The HWD_CONFIG_0
variable is parsed in ifup and the variable CONFIG is set to its value.
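
A minimal sketch of how the mismatch described above could be checked from
the shell. It assumes only what is stated here, namely that the getcfg output
contains a line of the form HWD_CONFIG_0="name". The helper names
hwd_config_value and check_iface_config are hypothetical and are not part of
sysconfig or ifup.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first HWD_CONFIG_0="..." line
# found in getcfg-style output read from stdin.
hwd_config_value() {
    sed -n 's/^HWD_CONFIG_0="\(.*\)"$/\1/p' | head -n 1
}

# Hypothetical helper: read getcfg-style output on stdin and report whether
# the configuration name equals the interface name given as $1.
check_iface_config() {
    iface=$1
    cfg=$(hwd_config_value)
    if [ "$cfg" = "$iface" ]; then
        echo "$iface: configuration name matches"
    else
        echo "$iface: configuration name is '$cfg', differs from interface name"
    fi
}
```

On an affected machine this might be run as
"/sbin/getcfg -d . -f ifcfg- -- ib1 | check_iface_config ib1"; with the output
quoted above it would report that the configuration name is 'ib0'.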
 
None of the above happens on SUSE Linux Enterprise Server 9, and I suspect
that the difference is in the version of the sysconfig rpm (0.50.6 vs. 0.31.0),
which packages getcfg.

Does anyone have an idea how to solve this? Is this a bug in getcfg, a bug in
IPoIB, or just a wrong IP configuration?

thanks

____
Moni Shoua |  +972-9-9718630 (o)   |   +072-52-8232979 (m)
SW Engineer, Mainstream IB host stack
Voltaire – The Grid Backbone
www.voltaire.com
  
 