[openib-general] [PATCH] rping: Erroneous check for minimum ping buffer size

2006-06-08 Thread Pradipta Kumar Banerjee
rping did not correctly check the minimum size of the ping
buffer, resulting in the following error from glibc:

"*** glibc detected *** free(): invalid next size (fast)"

Signed-off-by: Pradipta Kumar Banerjee <[EMAIL PROTECTED]>
---

Index: rping.c
=
--- rping.org   2006-06-09 10:57:43.0 +0530
+++ rping.c 2006-06-09 11:00:28.0 +0530
@@ -96,6 +96,12 @@ struct rping_rdma_info {
 #define RPING_BUFSIZE 64*1024
 #define RPING_SQ_DEPTH 16
 
+/* Default string for print data and
+ * minimum buffer size
+ */
+#define RPING_MSG_FMT   "rdma-ping-%d: "
+#define RPING_MIN_BUFSIZE   sizeof(itoa(INT_MAX))+sizeof(RPING_MSG_FMT)
+
 /*
  * Control block struct.
  */
@@ -774,7 +780,7 @@ static void rping_test_client(struct rpi
cb->state = RDMA_READ_ADV;
 
/* Put some ascii text in the buffer. */
-   cc = sprintf(cb->start_buf, "rdma-ping-%d: ", ping);
+   cc = sprintf(cb->start_buf, RPING_MSG_FMT, ping);
for (i = cc, c = start; i < cb->size; i++) {
cb->start_buf[i] = c;
c++;
@@ -977,11 +983,11 @@ int main(int argc, char *argv[])
break;
case 'S':
cb->size = atoi(optarg);
-   if ((cb->size < 1) ||
+   if ((cb->size < RPING_MIN_BUFSIZE) ||
(cb->size > (RPING_BUFSIZE - 1))) {
fprintf(stderr, "Invalid size %d "
-  "(valid range is 1 to %d)\n",
-  cb->size, RPING_BUFSIZE);
+  "(valid range is %d to %d)\n",
+  cb->size, RPING_MIN_BUFSIZE, 
RPING_BUFSIZE);
ret = EINVAL;
} else
DEBUG_LOG("size %d\n", (int) atoi(optarg));
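
A note on the RPING_MIN_BUFSIZE macro in the patch above: sizeof applied to a function call yields the size of the call's return type (the operand is not evaluated), so sizeof(itoa(INT_MAX)) measures a pointer or an int rather than the printed width of INT_MAX, and itoa is not part of standard C. One possible alternative, sketched below under those assumptions, is to ask snprintf how long the formatted header would be. The helper name rping_min_bufsize is hypothetical and does not exist in rping.c.

```c
#include <limits.h>
#include <stdio.h>

#define RPING_MSG_FMT "rdma-ping-%d: "

/* Hypothetical helper: bytes needed to hold the longest header rping
 * can print (ping == INT_MAX), including the terminating NUL.
 * Returns -1 if snprintf reports an encoding error. */
static int rping_min_bufsize(void)
{
	/* With a NULL buffer and size 0, C99 snprintf writes nothing and
	 * returns the number of characters it would have written. */
	int len = snprintf(NULL, 0, RPING_MSG_FMT, INT_MAX);

	return len < 0 ? -1 : len + 1;
}
```

The -S range check could then compare cb->size against this value. Because the result is an int, it can also be passed to the "%d" conversions in the error message, whereas a sizeof expression has type size_t.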

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] Re: [PATCH 1/2] multicast: notify users on membership errors

2006-06-08 Thread Michael S. Tsirkin
Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> These should eliminate any races with ipoib leaving,
> then quickly re-joining a group as a result of an event.

Is there a chance this will fix the crashes me and Or were seeing?

-- 
MST




[openib-general] [PATCH 2/2] ipoib: handle multicast group reset notification

2006-06-08 Thread Sean Hefty
Ipoib already checks for events that require rejoining multicast groups.
We just need to add code to handle (i.e. ignore) multicast group reset
notifications.

Signed-off-by: Sean Hefty <[EMAIL PROTECTED]>
---
Ignoring the callback is a simple fix.  I didn't try to see what it would
take to have ipoib use the ib_multicast event to trigger a re-join.  My
guess is that it would be less efficient, since ipoib would get a callback
for every group on the affected port.

Index: ipoib_multicast.c
===
--- ipoib_multicast.c   (revision 7758)
+++ ipoib_multicast.c   (working copy)
@@ -306,6 +306,10 @@ ipoib_mcast_sendonly_join_complete(int s
struct net_device *dev = mcast->dev;
struct ipoib_dev_priv *priv = netdev_priv(dev);
 
+   /* We trap for port events ourselves. */
+   if (status == -ENETRESET)
+   return 0;
+
if (!status)
status = ipoib_mcast_join_finish(mcast, &multicast->rec);

@@ -390,6 +394,10 @@ static int ipoib_mcast_join_complete(int
" (status %d)\n",
IPOIB_GID_ARG(mcast->mcmember.mgid), status);
 
+   /* We trap for port events ourselves. */
+   if (status == -ENETRESET)
+   return 0;
+
if (!status)
status = ipoib_mcast_join_finish(mcast, &multicast->rec);
 





[openib-general] [PATCH 1/2] multicast: notify users on membership errors

2006-06-08 Thread Sean Hefty
Modify ib_multicast module to detect events that require clients to rejoin
multicast groups.  Add tracking of clients which are members of any groups,
and provide notification to those clients when such an event occurs.

This patch tracks all active members of a group.  When an event occurs that
requires clients to rejoin a multicast group, the active members are moved
into an error state, and the clients are notified of a network reset error.
The group is then reset to force additional join requests to generate requests
to the SA.

Signed-off-by: Sean Hefty <[EMAIL PROTECTED]>
---
Hal, can you apply these patches and see if they fix the issues that you
are experiencing?  These should eliminate any races with ipoib leaving,
then quickly re-joining a group as a result of an event.
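
The notification flow described in this message can be modelled in userspace roughly as follows. This is only an illustrative sketch: the names (struct member, struct group, notify_group_error, record_status) are hypothetical, and it leaves out the locking, reference counting and workqueue handling that the real ib_multicast code needs.

```c
#include <errno.h>
#include <stddef.h>

enum member_state { MEMBER_JOINING, MEMBER_ACTIVE, MEMBER_ERROR };

struct member {
	enum member_state state;
	int (*callback)(int status, struct member *m);
	struct member *next;	/* link in the group's active list */
	int last_status;	/* recorded only for demonstration */
};

struct group {
	struct member *active;	/* members that joined successfully */
	int join_state;		/* non-zero once the group is joined */
};

/* Move every active member to the error state and tell its owner that
 * the membership was lost to a network reset.  Afterwards the group is
 * reset so that the next join has to go back to the SA.
 * Returns the number of members that were notified. */
static int notify_group_error(struct group *g)
{
	int notified = 0;

	while (g->active) {
		struct member *m = g->active;

		g->active = m->next;	/* unlink before calling out */
		m->next = NULL;
		m->state = MEMBER_ERROR;
		m->callback(-ENETRESET, m);
		notified++;
	}
	g->join_state = 0;
	return notified;
}

/* Example client callback: remember the status it was given. */
static int record_status(int status, struct member *m)
{
	m->last_status = status;
	return 0;
}
```

Each member is unlinked before its callback runs, so the callback never sees itself still on the active list.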

Index: multicast.c
===
--- multicast.c (revision 7805)
+++ multicast.c (working copy)
@@ -61,6 +61,7 @@ static struct ib_client mcast_client = {
.remove = mcast_remove_one
 };
 
+static struct ib_event_handler event_handler;
 static struct workqueue_struct *mcast_wq;
 
 struct mcast_device;
@@ -86,6 +87,7 @@ enum mcast_state {
MCAST_JOINING,
MCAST_MEMBER,
MCAST_BUSY,
+   MCAST_ERROR
 };
 
 struct mcast_member;
@@ -97,6 +99,7 @@ struct mcast_group {
spinlock_t  lock;
struct work_struct  work;
struct list_head    pending_list;
+   struct list_head    active_list;
struct mcast_member *last_join;
int members[3];
atomic_t    refcount;
@@ -338,6 +341,8 @@ static void join_group(struct mcast_grou
group->rec.join_state |= join_state;
member->multicast.rec = group->rec;
member->multicast.rec.join_state = join_state;
+   list_del(&member->list);
+   list_add(&member->list, &group->active_list);
 }
 
 static int fail_join(struct mcast_group *group, struct mcast_member *member,
@@ -349,6 +354,34 @@ static int fail_join(struct mcast_group 
return member->multicast.callback(status, &member->multicast);
 }
 
+static void process_group_error(struct mcast_group *group)
+{
+   struct mcast_member *member;
+   int ret;
+
+   spin_lock_irq(&group->lock);
+   while (!list_empty(&group->active_list)) {
+   member = list_entry(group->active_list.next,
+   struct mcast_member, list);
+   atomic_inc(&member->refcount);
+   list_del_init(&member->list);
+   adjust_membership(group, member->multicast.rec.join_state, -1);
+   member->state = MCAST_ERROR;
+   spin_unlock_irq(&group->lock);
+
+   ret = member->multicast.callback(-ENETRESET,
+&member->multicast);
+   deref_member(member);
+   if (ret)
+   ib_free_multicast(&member->multicast);
+   spin_lock_irq(&group->lock);
+   }
+
+   group->rec.join_state = 0;
+   group->state = MCAST_BUSY;
+   spin_unlock_irq(&group->lock);
+}
+
 static void mcast_work_handler(void *data)
 {
struct mcast_group *group = data;
@@ -359,6 +392,12 @@ static void mcast_work_handler(void *dat
 
 retest:
spin_lock_irq(&group->lock);
+   if (group->state == MCAST_ERROR) {
+   spin_unlock_irq(&group->lock);
+   process_group_error(group);
+   goto retest;
+   }
+
while (!list_empty(&group->pending_list)) {
member = list_entry(group->pending_list.next,
struct mcast_member, list);
@@ -371,8 +410,8 @@ retest:
 multicast->comp_mask);
if (!status)
join_group(group, member, join_state);
-
-   list_del_init(&member->list);
+   else
+   list_del_init(&member->list);
spin_unlock_irq(&group->lock);
ret = multicast->callback(status, multicast);
} else {
@@ -467,6 +506,7 @@ static struct mcast_group *acquire_group
group->port = port;
group->rec.mgid = *mgid;
INIT_LIST_HEAD(&group->pending_list);
+   INIT_LIST_HEAD(&group->active_list);
INIT_WORK(&group->work, mcast_work_handler, group);
spin_lock_init(&group->lock);
 
@@ -551,16 +591,10 @@ void ib_free_multicast(struct ib_multica
group = member->group;
 
spin_lock_irq(&group->lock);
-   switch (member->state) {
-   case MCAST_MEMBER:
+   if (member->state == MCAST_MEMBER)
adjust_membership(group, multicast->rec.join_state, -1);
-   break;
-   case MCAST_JOINING:
-   list_del_init(&member->list);
-   break;
-   default:
-

RE: [openib-general] OFED-1.0-rc6 is available

2006-06-08 Thread Scott Weitzenkamp (sweitzen)



The MTU change undoes the changes for bug 81, so I have reopened bug 81
(http://openib.org/bugzilla/show_bug.cgi?id=81).

With rc6, PCI-X osu_bw and osu_bibw performance is bad, and PCI-E osu_bibw
performance is bad.  I've enclosed some performance data; look at rc4 vs
rc5 vs rc6 for Cougar/Cheetah/LionMini.

Are there other benchmarks driving the changes in rc6 (and rc4)?

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
 

  
   
  OSU MPI:
  - Added mpi_alltoall fine tuning parameters
  - Added default configuration/documentation file
    $MPIHOME/etc/mvapich.conf
  - Added shell configuration files $MPIHOME/etc/mvapich.csh,
    $MPIHOME/etc/mvapich.csh
  - Default MTU was changed back to 2K for InfiniHost III Ex and
    InfiniHost III Lx HCAs. For InfiniHost card recommended value
    is: VIADEV_DEFAULT_MTU=MTU1024


Attachment: mpi_perf.xls

[openib-general] Re: race in mthca_cq.c?

2006-06-08 Thread Roland Dreier
Michael> So we need something like mthca_clean_eq?

Roland> That's one obvious way to handle it.

Actually that looks very hard without adding locks to the interrupt
handling fast path.

 - R.




[openib-general] Re: race in mthca_cq.c?

2006-06-08 Thread Roland Dreier
Roland> there's no guarantee the upper bits won't repeat -- or
Roland> someone could be using 24 bits for index

Michael> So we need something like mthca_clean_eq?

That's one obvious way to handle it.

We could also keep a list of freed CQNs and make sure we don't reuse
the CQNs until their associated EQ has been drained once.

Or just call the handler for that EQ an extra time after freeing the
CQ.  But I guess that would lead to tricky races against the regular
interrupt handler.
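
The "list of freed CQNs" idea above could look something like the following userspace sketch. It is a simplified model, not mthca code: every name in it (eq_model, cqn_quarantine, quarantine_cqn, eq_drained, release_cqn) is hypothetical, the fixed-size array stands in for a real list, and it ignores the locking that the interrupt path would require.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 16	/* arbitrary capacity for the sketch */

struct eq_model {
	unsigned long drain_count;	/* bumped each time the EQ is emptied */
};

struct pending_cqn {
	unsigned int cqn;
	unsigned long freed_at;		/* eq->drain_count when the CQ was freed */
};

struct cqn_quarantine {
	struct pending_cqn slot[MAX_PENDING];
	size_t used;
};

/* Park a freed CQN instead of returning it to the allocator. */
static bool quarantine_cqn(struct cqn_quarantine *q, const struct eq_model *eq,
			   unsigned int cqn)
{
	if (q->used == MAX_PENDING)
		return false;
	q->slot[q->used].cqn = cqn;
	q->slot[q->used].freed_at = eq->drain_count;
	q->used++;
	return true;
}

/* Called by the (modelled) interrupt handler after it has emptied the EQ. */
static void eq_drained(struct eq_model *eq)
{
	eq->drain_count++;
}

/* Release one CQN whose EQ has been drained since it was freed.
 * Returns true and stores the number in *cqn if one is ready. */
static bool release_cqn(struct cqn_quarantine *q, const struct eq_model *eq,
			unsigned int *cqn)
{
	for (size_t i = 0; i < q->used; i++) {
		if (eq->drain_count > q->slot[i].freed_at) {
			*cqn = q->slot[i].cqn;
			q->slot[i] = q->slot[--q->used];
			return true;
		}
	}
	return false;
}
```

A CQN parked this way only becomes available again after at least one complete drain of its EQ, which is the condition described above.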

 - R.




[openib-general] Re: race in mthca_cq.c?

2006-06-08 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> there's no guarantee the upper bits won't
> repeat -- or someone could be using 24 bits for index

So we need something like mthca_clean_eq?


-- 
MST




Re: [openib-general] [ANNOUNCE] New iWARP Branch

2006-06-08 Thread Tom Tucker
Steve is fishing in the Florida Keys right now (or will be by morning),
but if he were here, I think he would say -- "...sounds like you've
found an rping bug, please post a patch" ;-)

I would prefer the #define you proposed, e.g.
 
#define RPING_MSG_FMT   "rdma-ping-%d"
#define RPING_MIN_BUFSIZ   sizeof(itoa(INT_MAX))+sizeof(RPING_MSG_FMT)

Then use the RPING_MSG_FMT symbol in the code that prepares the
contents of the message. Then if someone decides to change the string,
the error checking still works.

Tom

On Thu, 2006-06-08 at 23:12 +0530, Pradipta Kumar Banerjee wrote: 
> Sundeep Narravula wrote:
> > Hi,
> > 
> >> I don't see this problem at all. I am using kernel 2.6.16.16, SLES 9 glibc
> >> version 2.3.3-98, gcc version 3.3.3 and AMSO1100 RNIC.
> > 
> > The versions I used are glibc 2.3.4, kernel 2.6.16 and gcc 3.4.3 and
> > AMSO1100 RNIC.
> > 
> >> Will running it under gdb be of some help ?
> > 
> > I am able to reproduce this error with/without gdb. The glibc error
> > disappears with higher number of iterations.
> > 
> > (gdb) r -c -vV -C10 -S10 -a 150.111.111.100 -p 
> 
> The problem is due to specifying an insufficient size (-S10, -S4) for the
> buffer. Look at the following lines from the function rping_test_client
> in rping.c:
> 
> for (ping = 0; !cb->count || ping < cb->count; ping++) {
>  cb->state = RDMA_READ_ADV;
> 
> /* Put some ascii text in the buffer. */
> -->cc = sprintf(cb->start_buf, "rdma-ping-%d: ", ping);
> 
> From the above it's clear that the minimum size for start_buf should be at
> least sufficient to hold the string, which in the invocations mentioned here
> (-S10 or -S4) is not the case. Hence you notice the glibc errors.
> 
> 
> cb->start_buf is allocated in rping_setup_buffers() as
>   cb->start_buf = malloc(cb->size);
> 
> Basically the check
> 
> if ((cb->size < 1) ||
> (cb->size > (RPING_BUFSIZE - 1))) {
> 
> in the main()  should be changed to something like this
> 
> #define RPING_MIN_BUFSIZE   sizeof(itoa(INT_MAX)) + sizeof("rdma-ping-%d: ")
> 
> ---> 'ping' is defined as a signed int; its maximum permissible value is
> defined in limits.h (INT_MAX = 2147483647).
> We can even hardcode the RPING_MIN_BUFSIZE to '19' if desired.
> 
> if ((cb->size < RPING_MIN_BUFSIZE) ||
> (cb->size > (RPING_BUFSIZE - 1))) {
> 
> Steve what do you say ??
> 
> 
> Thanks,
> Pradipta Kumar.
> 
> 
> > Starting program: /usr/local/bin/rping -c -vV -C10 -S10 -a 150.111.111.100
> > -p 
> > Reading symbols from shared object read from target memory...done.
> > Loaded system supplied DSO at 0xe000
> > [Thread debugging using libthread_db enabled]
> > [New Thread -1208465728 (LWP 23960)]
> > libibverbs: Warning: no userspace device-specific driver found for uverbs1
> > driver search path: /usr/local/lib/infiniband
> > libibverbs: Warning: no userspace device-specific driver found for uverbs0
> > driver search path: /usr/local/lib/infiniband
> > [New Thread -1208468560 (LWP 23963)]
> > [New Thread -1216861264 (LWP 23964)]
> > ping data: rdma-ping
> > ping data: rdma-ping
> > ping data: rdma-ping
> > ping data: rdma-ping
> > ping data: rdma-ping
> > ping data: rdma-ping
> > ping data: rdma-ping
> > ping data: rdma-ping
> > ping data: rdma-ping
> > ping data: rdma-ping
> > cq completion failed status 5
> > DISCONNECT EVENT...
> > *** glibc detected *** free(): invalid next size (fast): 0x0804ea80 ***
> > 
> > Program received signal SIGABRT, Aborted.
> > [Switching to Thread -1208465728 (LWP 23960)]
> > 0xe410 in __kernel_vsyscall ()
> > (gdb)
> > 
> >   --Sundeep.
> > 
> >> Thanks
> >> Pradipta Kumar.
>  Thanx,
> 
> 
>  Steve.
> 
> 
>  On Mon, 2006-06-05 at 00:43 -0400, Sundeep Narravula wrote:
> > Hi Steve,
> >We are trying the new iwarp branch on ammasso adapters. The installation
> > has gone fine. However, on running rping there is an error during the
> > disconnect phase.
> >
> > $ rping -c -vV -C4 -S4 -a 150.10.108.100 -p 
> > libibverbs: Warning: no userspace device-specific driver found for uverbs1
> >  driver search path: /usr/local/lib/infiniband
> > libibverbs: Warning: no userspace device-specific driver found for uverbs0
> >  driver search path: /usr/local/lib/infiniband
> > ping data: rdm
> > ping data: rdm
> > ping data: rdm
> > ping data: rdm
> > cq completion failed status 5
> > DISCONNECT EVENT...
> > *** glibc detected *** free(): invalid next size (fast): 0x0804ea80 ***
> > Aborted
> >
> > There are no apparent errors showing up in dmesg. Is this error
> > currently expected?
> >
> > Thanks,
> >--Sundeep.
> >
> > 
> > 

[openib-general] Re: race in mthca_cq.c?

2006-06-08 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: race in mthca_cq.c?
> 
> Michael> Not in the driver I have: mthca_array_clear is at line
> Michael> 1351, mthca_cq_clean at line 1372.  Isn't
> Michael> mthca_array_clear freeing the slot in QP table?
> 
> Nope, the bitmap slot isn't freed until mthca_free().

Oh. Right. I see it now.

> Michael> But there might be more EQEs for this CQN outstanding in
> Michael> the EQ which we have not seen yet.
> 
> Now that you mention it, that could be a real problem I guess.
> synchronize_irq() isn't enough because the interrupt handler might not
> have even started yet.
> 
> But on the other hand a CQ can't be destroyed until after all
> associated QPs have been destroyed.  So could we really miss EQEs for
> that long?

Yes, I think there might be spurious EQEs and they might get delayed
in HW for a long time. Destroying QPs does not flush completion events out.

So just this bit?

--

Check that an EQE is not for a stale CQ number.  Since the high bits of the CQ
number are allocated round-robin, we can be reasonably sure the CQ number is
different even for CQs which share a slot in the CQ table.

Signed-off-by: Michael S. Tsirkin <[EMAIL PROTECTED]>


--- openib/drivers/infiniband/hw/mthca/mthca_cq.c   2006-05-09 21:07:28.623383000 +0300
+++ /mswg/work/mst/tmp/infiniband1/hw/mthca/mthca_cq.c  2006-06-08 23:46:52.404499000 +0300
@@ -217,9 +217,9 @@ void mthca_cq_completion(struct mthca_de
 {
struct mthca_cq *cq;
 
cq = mthca_array_get(&dev->cq_table.cq, cqn & (dev->limits.num_cqs - 1));
 
-   if (!cq) {
+   if (!cq || cq->cqn != cqn) {
mthca_warn(dev, "Completion event for bogus CQ %08x\n", cqn);
return;
}

-- 
MST




[openib-general] Re: Failed multicast join with new multicast module

2006-06-08 Thread Sean Hefty

Hal Rosenstock wrote:

> 2. There is lazy deletion of MC groups allowed so the reclamation may be
> difficult.


I'm not familiar with the switch programming.  Does the SM set the entire 
MulticastForwardingTable for a switch every time a new group is created, or a 
new member joins?  If the SM loses track of all multicast groups, how are the 
stale groups on the switches deleted?



> The endport SMAs are claiming they do support client reregistration but
> it does take more than that for the endport/node to behave properly.


My original plan was to have the ib_multicast module rejoin all groups, but 
since the MLIDs can change I can't see any way to handle reregistration safely 
without involving the application.  My latest changes are just to report errors 
on existing multicast groups on an affected port.



> I know it is a conceptual rather than actual compliance. One issue would
> be defining what it means to respect all existing communication. Then we
> would need to look at whether that was feasible or not and perhaps
> rescope what it means to a set of things achievable. Another issue would
> be defining where it is possible or not. If that is totally vendor
> dependent, then this would have no substance to it. It is largely a
> matter of being a "better" SM.


We could use the phrase, "except where such communication is no longer 
realizable" instead of "where possible".  Where unrealizable means impossible 
because the communication uses properties that are physically impossible to 
achieve given the hardware configuration of the subnet.  (See bottom of page 910 
of the spec.)


If an SM could just query switches for their MulticastForwardingTables or the 
end nodes, would we be able to avoid these issues?


- Sean




[openib-general] Re: race in mthca_cq.c?

2006-06-08 Thread Roland Dreier
Michael> Not in the driver I have: mthca_array_clear is at line
Michael> 1351, mthca_cq_clean at line 1372.  Isn't
Michael> mthca_array_clear freeing the slot in QP table?

Nope, the bitmap slot isn't freed until mthca_free().

Michael> But there might be more EQEs for this CQN outstanding in
Michael> the EQ which we have not seen yet.

Now that you mention it, that could be a real problem I guess.
synchronize_irq() isn't enough because the interrupt handler might not
have even started yet.

But on the other hand a CQ can't be destroyed until after all
associated QPs have been destroyed.  So could we really miss EQEs for
that long?

 - R.




[openib-general] Re: race in mthca_cq.c?

2006-06-08 Thread Roland Dreier
 > The following seems to work. How does it look?

I don't think it's needed, and anyway I don't see how it fixes
things.  The problem only happens when the new CQ or QP has the same
number as an old CQ/QP, so the test of cq->cqn == cqn might still pass
even if the cq has changed (there's no guarantee the upper bits won't
repeat -- or someone could be using 24 bits for index)
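
To put a number on the objection above, here is a small arithmetic sketch. It assumes 24-bit CQ numbers (the width mentioned in this message) and a table indexed by the low bits of the number; the function name numbers_per_slot is hypothetical.

```c
#define CQN_BITS 24	/* assumption: 24-bit CQ numbers, as mentioned above */

/* Number of distinct CQNs that map to the same table slot when the slot
 * index is taken from the low index_bits bits of the number. */
static unsigned long numbers_per_slot(unsigned int index_bits)
{
	if (index_bits >= CQN_BITS)
		return 1;	/* no spare upper bits: every reuse repeats the CQN */
	return 1UL << (CQN_BITS - index_bits);
}
```

With a 16-bit index, a slot cycles through 256 distinct numbers before the first one comes around again. With a 24-bit index there are no upper bits at all, so a cq->cqn == cqn test cannot tell an old CQ from a new one in the same slot.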

 - R.




[openib-general] Re: race in mthca_cq.c?

2006-06-08 Thread Michael S. Tsirkin
Quoting r. Michael S. Tsirkin <[EMAIL PROTECTED]>:
> Subject: race in mthca_cq.c?
> 
> Roland, I think I see a race in mthca: let's assume that
> a QP is destroyed. We remove the qpn from qp_table.
> 
> Before we have the chance to cleanup the CQ, another QP is created
> and put in the same slot in table. If the user now polls the CQ he'll see a
> completion for a wrong QP, since poll CQ does:
> 
>*cur_qp = mthca_array_get(&dev->qp_table.qp,
>   be32_to_cpu(cqe->my_qpn) &
>   (dev->limits.num_qps - 1));
> 
> Is this analysis right?
> If yes, I think we can fix this by testing (*cur_qp)->qpn ==
> be32_to_cpu(cqe->my_qpn), does this make sense?
> 
> Same for userspace I guess?
> 
> It seems a similar issue exists for CQs, does it not?
> And I think it can be solved in a similar way, checking the CQN?


The following seems to work. How does it look?

---

Make sure completion/completion event is not for a stale QP/CQ
before reporting to user.

Signed-off-by: Michael S. Tsirkin <[EMAIL PROTECTED]>

--- openib/drivers/infiniband/hw/mthca/mthca_cq.c   2006-05-09 21:07:28.623383000 +0300
+++ /mswg/work/mst/tmp/infiniband1/hw/mthca/mthca_cq.c  2006-06-08 23:46:52.404499000 +0300
@@ -217,9 +217,9 @@ void mthca_cq_completion(struct mthca_de
 {
struct mthca_cq *cq;
 
cq = mthca_array_get(&dev->cq_table.cq, cqn & (dev->limits.num_cqs - 1));
 
-   if (!cq) {
+   if (!cq || cq->cqn != cqn) {
mthca_warn(dev, "Completion event for bogus CQ %08x\n", cqn);
return;
}
@@ -513,10 +515,10 @@ static inline int mthca_poll_one(struct 
 * because CQs will be locked while QPs are removed
 * from the table.
 */
*cur_qp = mthca_array_get(&dev->qp_table.qp,
  be32_to_cpu(cqe->my_qpn) &
  (dev->limits.num_qps - 1));
-   if (!*cur_qp) {
+   if (!*cur_qp || (*cur_qp)->qpn != be32_to_cpu(cqe->my_qpn)) {
mthca_warn(dev, "CQ entry for unknown QP %06x\n",
   be32_to_cpu(cqe->my_qpn) & 0xff);
err = -EINVAL;

-- 
MST




RE: [openib-general] Compilation issues on rhel4 u3 ppc64 sysfs.o

2006-06-08 Thread Scott Weitzenkamp (sweitzen)
This is working for us on RHEL4 U3, thanks!

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
 

> -----Original Message-----
> From: Vladimir Sokolovsky [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, May 25, 2006 2:49 AM
> To: Scott Weitzenkamp (sweitzen)
> Cc: Paul; openib-general@openib.org
> Subject: Re: [openib-general] Compilation issues on rhel4 u3 
> ppc64 sysfs.o
> 
> In OFED-1.0-rc5 all binaries and libraries will be compiled on ppc64
> with the -m64 flag.
> This requires the sysfsutils and sysfsutils-devel 64-bit RPMs to be
> installed (in order to build libibverbs).
> Also, the pciutils and pciutils-devel 64-bit RPMs are required for the
> tvflash package.
>
> libsdp will be built as both 32-bit and 64-bit libraries.
>
> Note: in order to build the sysfsutils 64-bit RPM, run:
>   CC="gcc -m64" rpmbuild --rebuild sysfsutils-1.3.0-1.2.1.src.rpm
>   (This was tested on Fedora C4 PPC64)
> 
> Regards,
> Vladimir
> 
> Scott Weitzenkamp (sweitzen) wrote:
> > I know Vlad made some changes for rc5 in this area, at least for 
> > libsdp, not sure if other libs got changed as well.
> >  
> > Scott Weitzenkamp
> > SQA and Release Manager
> > Server Virtualization Business Unit
> > Cisco Systems
> >  
> >
> > 
> > From: Paul [mailto:[EMAIL PROTECTED]
> > Sent: Wednesday, May 24, 2006 11:00 AM
> > To: Scott Weitzenkamp (sweitzen)
> > Cc: openib-general@openib.org
> > Subject: Re: [openib-general] Compilation issues on rhel4 u3
> > ppc64 sysfs.o
> >
> > Scott,
> >   Upon further inspection the build.sh and install.sh scripts
> > built 32bit libraries and binaries. If I export CFLAGS (and the
> > like) to include -m64 then the build dies while looking for a
> > 64bit libsysfs. rhel4 u3 does not include a ppc64 sysfsutils, nor
> > have I been able to find an actual 64bit version of it. Is there a
> > workaround for getting things to build actual ppc64
> > binaries/libraries ?
> >
> > The actual error is:
> > checking for dlsym in -ldl... yes
> > checking for pthread_mutex_init in -lpthread... yes
> > checking for sysfs_open_class in -lsysfs... no
> > configure: error: sysfs_open_class() not found. libibverbs
> > requires libsysfs. 
> >
> 




[openib-general] Re: race in mthca_cq.c?

2006-06-08 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: race in mthca_cq.c?
> 
>  > Roland, I think I see a race in mthca: let's assume that
>  > a QP is destroyed. We remove the qpn from qp_table.
>  > 
>  > Before we have the chance to cleanup the CQ, another QP is created
>  > and put in the same slot in table. If the user now polls the CQ he'll see a
>  > completion for a wrong QP, since poll CQ does:
>  > 
>  >*cur_qp = mthca_array_get(&dev->qp_table.qp,
>  >   be32_to_cpu(cqe->my_qpn) &
>  >   (dev->limits.num_qps - 1));
>  > 
>  > Is this analysis right?
> 
> I don't think so.  There's no way for another QP to be assigned the
> same number, since the mthca_free() to clear out the QPN bitmap
> doesn't happen until after the CQs are cleaned up.

Not in the driver I have:
mthca_array_clear is at line 1351, mthca_cq_clean at line 1372.
Isn't mthca_array_clear freeing the slot in QP table?

>  > It seems a similar issue exists for CQs, does it not?
>  > And I think it can be solved in a similar way, checking the CQN?
> 
> I don't see anything there either.  When destroying a CQ, mthca does
> HW2SW_CQ and synchronize_irq() before a new CQ could be created with
> the same number.

But there might be more EQEs for this CQN outstanding in the EQ
which we have not seen yet.

-- 
MST




[openib-general] RE: Failed multicast join with new multicast module

2006-06-08 Thread Hal Rosenstock
On Thu, 2006-06-08 at 12:49, Sean Hefty wrote:
> >If this comment is directed at client reregister mechanism, you should
> >note that when this was brought up there was resistance to it based on
> >the recommendation (probably not a strong enough word for this) that SMs
> >be redundant in the subnet. There was a fair bit of anecdotal evidence
> >that this was not how they were being used at the time but it may have
> >been a chicken and egg problem.
> 
> Even with redundant SMs, we wouldn't want them to reassign all of the LIDs in
> the subnet just because of failover.  I don't think of MLIDs as being any
> different.  

Do you mean without redundant SMs (rather than with)?

There are a couple of things about MLIDs that are different:
1. There are far fewer of them (not necessarily architecturally
but in some implementations)
2. There is lazy deletion of MC groups allowed so the reclamation may be
difficult.

This is not to say it can't be done but there are some hurdles to clear.

> Client reregister support is optional, so what if the node(s) that
> need to re-create the group doesn't support it?

The endport SMAs are claiming they do support client reregistration but
it does take more than that for the endport/node to behave properly.

> What if we started with something like the following compliance statement, and
> tried to add this to the spec?
> 
> An SM, upon becoming the master, shall respect all existing communication 
> in the fabric, where possible.

At the 50K level, I can see where you are coming from and think there is
merit in this but first, I'm not sure I know how to define this and
second, whether that is achievable but could wait to see whether some
definition could be achieved.

I know it is a conceptual rather than actual compliance. One issue would
be defining what it means to respect all existing communication. Then we
would need to look at whether that was feasible or not and perhaps
rescope what it means to a set of things achievable. Another issue would
be defining where it is possible or not. If that is totally vendor
dependent, then this would have no substance to it. It is largely a
matter of being a "better" SM.

-- Hal

> - Sean





[openib-general] Re: race in mthca_cq.c?

2006-06-08 Thread Roland Dreier
 > Roland, I think I see a race in mthca: let's assume that
 > a QP is destroyed. We remove the qpn from qp_table.
 > 
 > Before we have the chance to cleanup the CQ, another QP is created
 > and put in the same slot in table. If the user now polls the CQ he'll see a
 > completion for a wrong QP, since poll CQ does:
 > 
 >*cur_qp = mthca_array_get(&dev->qp_table.qp,
 >   be32_to_cpu(cqe->my_qpn) &
 >   (dev->limits.num_qps - 1));
 > 
 > Is this analysis right?

I don't think so.  There's no way for another QP to be assigned the
same number, since the mthca_free() to clear out the QPN bitmap
doesn't happen until after the CQs are cleaned up.

 > It seems a similar issue exists for CQs, does it not?
 > And I think it can be solved in a similar way, checking the CQN?

I don't see anything there either.  When destroying a CQ, mthca does
HW2SW_CQ and synchronize_irq() before a new CQ could be created with
the same number.

 - R.




[openib-general] race in mthca_cq.c?

2006-06-08 Thread Michael S. Tsirkin
Roland, I think I see a race in mthca: let's assume that
a QP is destroyed. We remove the qpn from qp_table.

Before we have the chance to cleanup the CQ, another QP is created
and put in the same slot in table. If the user now polls the CQ he'll see a
completion for a wrong QP, since poll CQ does:

   *cur_qp = mthca_array_get(&dev->qp_table.qp,
  be32_to_cpu(cqe->my_qpn) &
  (dev->limits.num_qps - 1));

Is this analysis right?
If yes, I think we can fix this by testing (*cur_qp)->qpn ==
be32_to_cpu(cqe->my_qpn), does this make sense?

Same for userspace I guess?

It seems a similar issue exists for CQs, does it not?
And I think it can be solved in a similar way, checking the CQN?
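
For readers following the thread, the slot aliasing and the proposed check can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the driver code: qp_model, qp_table, lookup_slot and lookup_checked are hypothetical names, and NUM_QPS is an arbitrary power-of-two table size.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_QPS 8	/* hypothetical table size; must be a power of two */

struct qp_model {
	uint32_t qpn;	/* full QP number stored in the object */
};

static struct qp_model *qp_table[NUM_QPS];

/* Look up by the low bits only, as the quoted mthca_array_get() call does. */
static struct qp_model *lookup_slot(uint32_t qpn)
{
	return qp_table[qpn & (NUM_QPS - 1)];
}

/* Same lookup, but reject an entry whose full number differs from the
 * one carried in the completion. */
static struct qp_model *lookup_checked(uint32_t qpn)
{
	struct qp_model *qp = lookup_slot(qpn);

	return (qp && qp->qpn == qpn) ? qp : NULL;
}
```

If a QP numbered 0x13 now occupies the slot that 0x0b used to occupy (both map to slot 3 when NUM_QPS is 8), lookup_slot(0x0b) returns the new QP while lookup_checked(0x0b) returns NULL. The check only helps when the two numbers actually differ.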

-- 
MST




[openib-general] [Bug 122] mad layer problem

2006-06-08 Thread bugzilla-daemon
http://openib.org/bugzilla/show_bug.cgi?id=122





--- Comment #2 from [EMAIL PROTECTED]  2006-06-08 12:51 ---
I'm not aware of any relationship between ARP and MADs.  I'd like to verify
that this is indeed a MAD layer issue, and not a problem in the user-to-kernel
interface, or lower level driver.  After the hang, were any applications able
to run?  Did you try running any kernel tests, like grmpp or cmatose?  Loading
madeye after connectivity is lost could also be helpful.  How easily is this
reproduced?




--- You are receiving this mail because: ---
You are on the CC list for the bug, or are watching someone who is.



[openib-general] [Bug 122] mad layer problem

2006-06-08 Thread bugzilla-daemon
http://openib.org/bugzilla/show_bug.cgi?id=122





--- Comment #1 from [EMAIL PROTECTED]  2006-06-08 12:44 ---
To debug this we probably need to know where sminfo and/or opensm were getting
stuck.  sysrq-T output for the stuck processes would probably be the most
helpful.







[openib-general] [Bug 122] New: mad layer problem

2006-06-08 Thread bugzilla-daemon
http://openib.org/bugzilla/show_bug.cgi?id=122

   Summary: mad layer problem
   Product: OpenFabrics Linux
   Version: gen2
  Platform: All
OS/Version: Other
Status: NEW
  Severity: blocker
  Priority: P2
 Component: IB Core
AssignedTo: [EMAIL PROTECTED]
ReportedBy: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]


We were running polygraph http://freshmeat.net/projects/polygraph/ over ipoib
and at some point ipoib connectivity was lost. When looking at the state of the
machines (two machines connected through a switch) I noticed that on one of
the machines I could not run any program that uses MADs. Specifically, I tried
sminfo and then opensm, and both got stuck. I assume what happened is that at
some point the kernel refreshed its ARP cache while there was already a
problem sending MADs, so the kernel could not resolve the address and ipoib
connectivity was lost.







[openib-general] Re: RFC: ib_cache_event problems

2006-06-08 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: RFC: ib_cache_event problems
> 
>  > > Ah, that looks like the bug I guess.  What's the situation?  SM clears
>  > > P_Key table and then later readds a P_Key?
> 
>  > Any ideas on how to fix this?
> 
> Does it work to just start the pkey_task if ipoib_ib_dev_flush() wants
> for a P_Key that's not there?  Or is it trickier?

If this works, why is dev_up playing with pkey_check_presence at all?
Can we kill all of this then?

ipoib_pkey_dev_check_presence(dev);

if (!test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags)) {
ipoib_dbg(priv, "PKEY is not assigned.\n");
return 0;
}

It seems we must avoid joining multicast groups while key isn't assigned ...


-- 
MST




[openib-general] Re: RFC: ib_cache_event problems

2006-06-08 Thread Roland Dreier
 > > Ah, that looks like the bug I guess.  What's the situation?  SM clears
 > > P_Key table and then later readds a P_Key?

 > Any ideas on how to fix this?

Does it work to just start the pkey_task if ipoib_ib_dev_flush() wants
for a P_Key that's not there?  Or is it trickier?

 - R.




[openib-general] RE: Failed multicast join with new multicast module

2006-06-08 Thread Sean Hefty
>> An SM, upon becoming the master, shall respect all existing communication in
>> the fabric, where possible.
>
>To me, "where possible" doesn't sound like an appropriate language for a
>compliance statement. Is there precedent for this in IB spec?

I was trying to express a concept, not formulate exact wording here...




[openib-general] Re: Failed multicast join with new multicast module

2006-06-08 Thread Michael S. Tsirkin
Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> What if we started with something like the following compliance statement, and
> tried to add this to the spec?
> 
> An SM, upon becoming the master, shall respect all existing communication in
> the fabric, where possible.

To me, "where possible" doesn't sound like an appropriate language for a
compliance statement. Is there precedent for this in IB spec?

-- 
MST




[openib-general] Re: RFC: ib_cache_event problems

2006-06-08 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: RFC: ib_cache_event problems
> 
> Michael> But ipoib_ib_dev_flush doesn't?
> 
> Ah, that looks like the bug I guess.  What's the situation?  SM clears
> P_Key table and then later readds a P_Key?

Any ideas on how to fix this?

-- 
MST




Re: [openib-general] RE: Failed multicast join with new multicast module

2006-06-08 Thread Sean Hefty

Greg Lindahl wrote:
> Isn't this a quality of implementation issue? It's hard to imagine a
> SM author not realizing this is a good thing to do.

I don't know if any SM implementation actually does this today.  I think that
all of them break all multicast groups.

> If it was in the standard, how would you test it for compliance?

Stopping and restarting the SM to see if any existing RC, UD, or MCast
communication breaks could be an easy first test.


- Sean




Re: [openib-general] [PATCH 0/4] Add support for UD QPs

2006-06-08 Thread Sean Hefty

Roland Dreier wrote:
> I haven't looked too carefully yet.
>
> What's the motivation?  It seems strange to put an IB-only transport
> into the RDMA CM -- iWARP can't handle datagrams, can it?


This allows using the address translation to locate the remote service.  The 
RDMA CM also provides an IP based interface for IB.  From a user's perspective, 
this extends the RDMA CM to include the UDP port space, in addition to TCP.


- Sean




Re: [openib-general] RE: Failed multicast join with new multicast module

2006-06-08 Thread Greg Lindahl
On Thu, Jun 08, 2006 at 09:49:35AM -0700, Sean Hefty wrote:

> What if we started with something like the following compliance statement, and
> tried to add this to the spec?
> 
> An SM, upon becoming the master, shall respect all existing communication in
> the fabric, where possible.

Isn't this a quality of implementation issue? It's hard to imagine a
SM author not realizing this is a good thing to do.

If it was in the standard, how would you test it for compliance?

-- g




Re: [openib-general] [PATCH 0/4] Add support for UD QPs

2006-06-08 Thread Roland Dreier
Sean> Do you see any issues with this patch series or the related
Sean> userspace changes? There's a small change to uverbs, and new
Sean> APIs added to libibverbs.

I haven't looked too carefully yet.

What's the motivation?  It seems strange to put an IB-only transport
into the RDMA CM -- iWARP can't handle datagrams, can it?

 - R.




Re: [openib-general] [PATCH 0/4] Add support for UD QPs

2006-06-08 Thread Sean Hefty

> The following patch series adds support for UD QPs to userspace through the RDMA
> CM.  UD QPs are referenced by an IP address, UDP port number.  The RDMA CM
> abstracts SIDR for Infiniband clients.


Roland,

Do you see any issues with this patch series or the related userspace changes? 
There's a small change to uverbs, and new APIs added to libibverbs.


- Sean




Re: [openib-general] [ANNOUNCE] New iWARP Branch

2006-06-08 Thread Pradipta Kumar Banerjee

Sundeep Narravula wrote:

Hi,


I don't see this problem at all. I am using kernel 2.6.16.16, SLES 9 glibc
version 2.3.3-98, gcc version 3.3.3 and AMSO1100 RNIC.


The versions I used are glibc 2.3.4, kernel 2.6.16 and gcc 3.4.3 and
AMSO1100 RNIC.


Will running it under gdb be of some help ?


I am able to reproduce this error with/without gdb. The glibc error
disappears with higher number of iterations.

(gdb) r -c -vV -C10 -S10 -a 150.111.111.100 -p 


The problem is due to specifying an insufficient size (-S10, -S4) for the
buffer. Look at the following lines from the function rping_test_client
in rping.c:


for (ping = 0; !cb->count || ping < cb->count; ping++) {
cb->state = RDMA_READ_ADV;

   /* Put some ascii text in the buffer. */
-->cc = sprintf(cb->start_buf, "rdma-ping-%d: ", ping);

From the above it's clear that start_buf must be at least large enough
to hold the formatted string, which is not the case in the invocations mentioned
here (-S10 or -S4). Hence you notice the glibc errors.



cb->start_buf is allocated in rping_setup_buffers() as
cb->start_buf = malloc(cb->size);

Basically the check

if ((cb->size < 1) ||
   (cb->size > (RPING_BUFSIZE - 1))) {

in the main()  should be changed to something like this

#define RPING_MIN_BUFSIZE   sizeof(itoa(INT_MAX)) + sizeof("rdma-ping-%d: ")

---> 'ping' is defined as a signed int; its maximum permissible value is defined
in limits.h (INT_MAX = 2147483647)

We can even hardcode the RPING_MIN_BUFSIZE to '19' if desired.

if ((cb->size < RPING_MIN_BUFSIZE) ||
   (cb->size > (RPING_BUFSIZE - 1))) {
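
One caveat on the macro above: sizeof applied to a function call gives the size of the call's return type, not the length of the string it would produce, so sizeof(itoa(INT_MAX)) does not measure the digits of INT_MAX. A possible alternative, sketched here under the assumption that a C99 snprintf() is available, is to ask snprintf() how long the formatted prefix would be. The helper names rping_min_bufsize() and rping_size_ok() are made up for this sketch and are not part of rping.

```c
#include <limits.h>
#include <stdio.h>

#define RPING_MSG_FMT "rdma-ping-%d: "

/*
 * Bytes needed to hold the formatted prefix for the largest ping count,
 * including the terminating NUL. snprintf() with a NULL buffer and a
 * size of 0 returns the length the output would have had (C99).
 */
static int rping_min_bufsize(void)
{
    return snprintf(NULL, 0, RPING_MSG_FMT, INT_MAX) + 1;
}

/* Returns 1 if size is acceptable for the -S option, 0 otherwise. */
static int rping_size_ok(int size, int max_size)
{
    return size >= rping_min_bufsize() && size <= max_size - 1;
}
```

With a 32-bit int this comes out to 23 bytes (10 for "rdma-ping-", 10 digits, 2 for ": ", 1 for the NUL), so -S10 and -S4 would both be rejected by such a check.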

Steve what do you say ??


Thanks,
Pradipta Kumar.



Starting program: /usr/local/bin/rping -c -vV -C10 -S10 -a 150.111.111.100
-p 
Reading symbols from shared object read from target memory...done.
Loaded system supplied DSO at 0xe000
[Thread debugging using libthread_db enabled]
[New Thread -1208465728 (LWP 23960)]
libibverbs: Warning: no userspace device-specific driver found for uverbs1
driver search path: /usr/local/lib/infiniband
libibverbs: Warning: no userspace device-specific driver found for uverbs0
driver search path: /usr/local/lib/infiniband
[New Thread -1208468560 (LWP 23963)]
[New Thread -1216861264 (LWP 23964)]
ping data: rdma-ping
ping data: rdma-ping
ping data: rdma-ping
ping data: rdma-ping
ping data: rdma-ping
ping data: rdma-ping
ping data: rdma-ping
ping data: rdma-ping
ping data: rdma-ping
ping data: rdma-ping
cq completion failed status 5
DISCONNECT EVENT...
*** glibc detected *** free(): invalid next size (fast): 0x0804ea80 ***

Program received signal SIGABRT, Aborted.
[Switching to Thread -1208465728 (LWP 23960)]
0xe410 in __kernel_vsyscall ()
(gdb)

  --Sundeep.


Thanks
Pradipta Kumar.

Thanx,


Steve.


On Mon, 2006-06-05 at 00:43 -0400, Sundeep Narravula wrote:

Hi Steve,
   We are trying the new iwarp branch on ammasso adapters. The installation
has gone fine. However, on running rping there is an error during
disconnect phase.

$ rping -c -vV -C4 -S4 -a 150.10.108.100 -p 
libibverbs: Warning: no userspace device-specific driver found for uverbs1
 driver search path: /usr/local/lib/infiniband
libibverbs: Warning: no userspace device-specific driver found for uverbs0
 driver search path: /usr/local/lib/infiniband
ping data: rdm
ping data: rdm
ping data: rdm
ping data: rdm
cq completion failed status 5
DISCONNECT EVENT...
*** glibc detected *** free(): invalid next size (fast): 0x0804ea80 ***
Aborted

There are no apparent errors showing up in dmesg. Is this error
currently expected?

Thanks,
   --Sundeep.







[openib-general] RE: Failed multicast join with new multicast module

2006-06-08 Thread Sean Hefty
>If this comment is directed at client reregister mechanism, you should
>note that when this was brought up there was resistance to it based on
>the recommendation (probably not a strong enough word for this) that SMs
>be redundant in the subnet. There was a fair bit of anecdotal evidence
>that this was not how they were being used at the time but it may have
>been a chicken and egg problem.

Even with redundant SMs, we wouldn't want them to reassign all of the LIDs in
the subnet just because of failover.  I don't think of MLIDs as being any
different.  Client reregister support is optional, so what if the node(s) that
need to re-create the group doesn't support it?

What if we started with something like the following compliance statement, and
tried to add this to the spec?

An SM, upon becoming the master, shall respect all existing communication in the
fabric, where possible.

- Sean




[openib-general] Re: [PATCH] ib_uverbs_get_context does not unlock file->mutex in error path

2006-06-08 Thread Roland Dreier
Good catch.  Applied.




Re: [openib-general] OFED-1.0-rc6 is available

2006-06-08 Thread Roland Dreier
Thanks... one further fix for Cisco gateways: sometimes the IsDM bit
is set on switch ports as well, so ibsrpdm should not be limited to
just CA ports.

Here's the patch, also on the trunk as r7836.

--- srptools/ChangeLog  (revision 7803)
+++ srptools/ChangeLog  (working copy)
@@ -1,3 +1,10 @@
+2006-06-08  Roland Dreier  <[EMAIL PROTECTED]>
+
+   * src/srp-dm.c (get_port_list): In some setups (eg Cisco SFS 3001
+   with an FC gateway), there will be switches with the IsDM bit set
+   on port 0.  So the initial get of NodeRecords must retrieve all
+   records, not just CA ports.
+
 2006-06-07  Roland Dreier  <[EMAIL PROTECTED]>
* src/srp-dm.c (do_port): Use correct endianness when comparing
GUID against Topspin OUI.
--- srptools/src/srp-dm.c   (revision 7803)
+++ srptools/src/srp-dm.c   (working copy)
@@ -523,11 +523,9 @@ static int get_port_list(int fd, uint32_
out_sa_mad->mgmt_class= SRP_MGMT_CLASS_SA;
out_sa_mad->method= SRP_SA_METHOD_GET_TABLE;
out_sa_mad->class_version = 2;
-   out_sa_mad->comp_mask = htonll(1ul << 4); /* node type */
+   out_sa_mad->comp_mask = 0; /* Get all end ports */
out_sa_mad->rmpp_version  = 1;
out_sa_mad->rmpp_type = 1;
-   node  = (void *) out_sa_mad->data;
-   node->type= 1; /* CA */
 
len = send_and_get(fd, &out_mad, in_mad, node_table_response_size);
if (len < 0)




Re: [openib-general] Re: [PATCH] uDAPL openib-cma provider - add support for IB_CM_REQ_OPTIONS

2006-06-08 Thread James Lentini


On Thu, 8 Jun 2006, Jack Morgenstein wrote:

> On Wednesday 07 June 2006 18:26, James Lentini wrote:
> > On Wed, 7 Jun 2006, Jack Morgenstein wrote:
> > > This (bug fix) can still be included in next-week's release, if you
> > > think it is important (I have extracted it from the changes checked
> > > in at svn 7755)
> >
> > If you are going to make another release anyway, then I would include
> > it.
> 
> Do you mean -- include the fix in next week's release -- or -- wait 
> with the fix for the following release?

I'd include the fix in the next release, but I wouldn't create a 
special release just for this fix.




[openib-general] Re: [PATCH] osm: fix num of blocks of GUIDInfo GetTable query

2006-06-08 Thread Hal Rosenstock
Hi Eitan,

On Thu, 2006-06-08 at 07:24, Eitan Zahavi wrote:
> Hi Hal
> 
> I'm working on passing osmtest check. Found a bug in the new
> GUIDInfoRecord query: If you had a physical port with zero guid_cap
> the code would loop on blocks 0..255 instead of trying the next port.

OK; that's definitely a problem.

> I am still looking for why we might have a guid_cap == 0 on some
> ports.

PortInfo:GuidCap is not used for switch external ports.

> This patch resolves this new problem. osmtest passes on some arbitrary
> networks.
> 
> Eitan
> 
> Signed-off-by:  Eitan Zahavi <[EMAIL PROTECTED]>
> 
> Index: opensm/osm_sa_guidinfo_record.c
> ===
> --- opensm/osm_sa_guidinfo_record.c   (revision 7703)
> +++ opensm/osm_sa_guidinfo_record.c   (working copy)
> @@ -255,6 +255,10 @@ __osm_sa_gir_create_gir(
>continue;
>  
>  p_pi = osm_physp_get_port_info_ptr( p_physp );
> +
> +if ( p_pi->guid_cap == 0 )  
> +  continue;
> +

I think the right fix is to detect switch external ports and use the
VLCap from port 0 rather than from the switch external port (unless that
concept is broken in which case it should return 0 records).

-- Hal

>  num_blocks = p_pi->guid_cap / 8;
>  if ( p_pi->guid_cap % 8 )
>num_blocks++;
> 





[openib-general] [PATCH] ib_uverbs_get_context does not unlock file->mutex in error path

2006-06-08 Thread CH Ganapathi
Hi,

If ibdev->alloc_ucontext(ibdev, &udata) fails then
ib_uverbs_get_context
does not unlock file->mutex before returning error.

Thanks,
Ganapathi
Novell Inc.


Signed-off-by: Ganapathi CH <[EMAIL PROTECTED]>

Index: linux-kernel/infiniband/core/uverbs_cmd.c
===
--- infiniband/core/uverbs_cmd.c 2006-06-08 11:52:29.0 +0530
+++ infiniband-fix/core/uverbs_cmd.c 2006-06-08 17:16:10.0 +0530
@@ -80,8 +80,10 @@ ssize_t ib_uverbs_get_context(struct ib_
   in_len - sizeof cmd, out_len - sizeof resp);
 
ucontext = ibdev->alloc_ucontext(ibdev, &udata);
-   if (IS_ERR(ucontext))
-   return PTR_ERR(file->ucontext);
+   if (IS_ERR(ucontext)) {
+   ret = PTR_ERR(file->ucontext);
+   goto err;
+   }
 
ucontext->device = ibdev;
INIT_LIST_HEAD(&ucontext->pd_list);
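
For readers less familiar with the pattern, the shape of this fix can be shown with a small userspace analogue. This is only a sketch using a pthread mutex: struct file_ctx, get_context() and the two allocator stubs are hypothetical names invented here, not the uverbs code. The point is that every failure after the lock is taken goes through one exit label that releases it.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

struct file_ctx {
    pthread_mutex_t mutex;
    void *ucontext;
};

/* Hypothetical allocator callback: returns NULL on failure. */
typedef void *(*alloc_fn)(void);

/* Two stub allocators, only here to exercise both paths. */
static void *alloc_fail(void) { return NULL; }
static void *alloc_ok(void) { static int dummy; return &dummy; }

static int get_context(struct file_ctx *file, alloc_fn alloc)
{
    int ret = 0;
    void *ctx;

    pthread_mutex_lock(&file->mutex);

    if (file->ucontext) {
        ret = -EINVAL;      /* a context is already set */
        goto err;
    }

    ctx = alloc();
    if (!ctx) {
        ret = -ENOMEM;
        goto err;           /* not a bare return: the mutex is still held */
    }

    file->ucontext = ctx;

err:
    pthread_mutex_unlock(&file->mutex);
    return ret;
}
```

After a failed get_context(&f, alloc_fail), the mutex can be taken again, which is the property the original code lost by returning early.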





[openib-general] [PATCH] osm: fix num of blocks of GUIDInfo GetTable query

2006-06-08 Thread Eitan Zahavi
Hi Hal

I'm working on passing osmtest check. Found a bug in the new
GUIDInfoRecord query: If you had a physical port with zero guid_cap
the code would loop on blocks 0..255 instead of trying the next port.

I am still looking for why we might have a guid_cap == 0 on some
ports.

This patch resolves this new problem. osmtest passes on some arbitrary
networks.

Eitan

Signed-off-by:  Eitan Zahavi <[EMAIL PROTECTED]>

Index: opensm/osm_sa_guidinfo_record.c
===
--- opensm/osm_sa_guidinfo_record.c (revision 7703)
+++ opensm/osm_sa_guidinfo_record.c (working copy)
@@ -255,6 +255,10 @@ __osm_sa_gir_create_gir(
   continue;
 
 p_pi = osm_physp_get_port_info_ptr( p_physp );
+
+if ( p_pi->guid_cap == 0 )  
+  continue;
+
 num_blocks = p_pi->guid_cap / 8;
 if ( p_pi->guid_cap % 8 )
   num_blocks++;
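
The block arithmetic in the patch context can be written as a single rounding-up division. The sketch below is a standalone illustration, not OpenSM code: guid_blocks() and GUIDS_PER_BLOCK are names chosen here, and the 8-bit type for guid_cap is an assumption. It shows that a guid_cap of 0 gives 0 blocks, which is the case the new check skips.

```c
#include <stdint.h>

#define GUIDS_PER_BLOCK 8u   /* the patch context divides guid_cap by 8 */

/*
 * Number of GUIDInfo blocks needed to cover guid_cap entries, rounding up.
 * A guid_cap of 0 yields 0 blocks, so the caller can skip the port.
 */
static unsigned guid_blocks(uint8_t guid_cap)
{
    return (guid_cap + GUIDS_PER_BLOCK - 1) / GUIDS_PER_BLOCK;
}
```

For example, guid_cap values of 1 and 8 both need one block, and 9 needs two.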





[openib-general] RE: Failed multicast join with new multicast module

2006-06-08 Thread Hal Rosenstock
On Wed, 2006-06-07 at 22:48, Sean Hefty wrote:
> >I might be missing your point but UD is unreliable so the sends can be
> >dropped. The delay/retry is to make sure the join does occur,
> 
> This is different than a dropped request or reply.  In this case, the receiver
> gets a reply, but it will be a failure from the SA to join the group.

By receiver, I think you are referring to SA requester. Yes, the SA
would reject the request with a status ERR_REQ_INSUFFICIENT_COMPONENTS.

> For example, a NonMember tries to re-join before a FullMember which would have
> created the group does.  The result is that requests that receive a reply also
> need to be retried, with the timeout dependent on some remote node in the
> fabric creating the group.

and it is unknown when such a multicast registration (to create the
group) would occur. So the proper timeout is unknown. That's why IPoIB
has a couple of different strategies for handling this depending on the
JoinState.

> >> So, the only safe thing to do is for all multicast clients to detach from
> >> all multicast groups, destroy all address handles,
> >
> >Why all groups ?
> 
> Because the SM has lost track that any groups in the fabric existed, so those
> groups must be recreated, all potentially with different mlids.

Yes, in the case of client reregister.

> >> possibly wait for a new group to be created, and then start all over again.
> >
> >Start what all over again ?
> 
> I meant attach the QP to the new group and allocate a new address handle.

Couldn't it modify the old one as an alternative strategy ?

> This is a general comment, and not directed at anyone specific,

Don't worry. I'm not taking it personally. Just want to give you my
$0.02 worth on what I think you are saying below:

> but is this
> really the architecture and implementation that we want to aim for?  I really
> think that we need to look at solutions that don't break existing
> communication, unless the links providing that communication actually go down,
> even if this means extending the architecture.

If this comment is directed at client reregister mechanism, you should
note that when this was brought up there was resistance to it based on
the recommendation (probably not a strong enough word for this) that SMs
be redundant in the subnet. There was a fair bit of anecdotal evidence
that this was not how they were being used at the time but it may have
been a chicken and egg problem.

-- Hal

> - Sean





[openib-general] RE: [PATCH] osm: fix mlx vendor rmpp sender fail to send zero size RMPP

2006-06-08 Thread Eitan Zahavi
This does not have to get into OFED. 
I did not see these failures there. 

Eitan Zahavi
Senior Engineering Director, Software Architect
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL


> -Original Message-
> From: Hal Rosenstock [mailto:[EMAIL PROTECTED]
> Sent: Thursday, June 08, 2006 1:42 PM
> To: Eitan Zahavi
> Cc: OPENIB
> Subject: Re: [PATCH] osm: fix mlx vendor rmpp sender fail to send zero size RMPP
> 
> Hi Eitan,
> 
> On Thu, 2006-06-08 at 04:40, Eitan Zahavi wrote:
> > Hi Hal
> >
> > Ran into this by chance. Some changes introduced lately to the SA queries
> > now send zero size RMPP (single segment with only headers). It used to send
> > them as non-RMPP responses.
> 
> Not sure what that change was.
> 
> > Anyway, this broke the mlx vendor code that I use
> > for simulation.
> >
> > This patch resolves this new problem.
> 
> Thanks. Applied to trunk only. Any idea if OFED RC6 has this issue?
> 
> -- Hal
> 
> > Eitan





[openib-general] Re: [PATCH] osm: fix mlx vendor rmpp sender fail to send zero size RMPP

2006-06-08 Thread Hal Rosenstock
Hi Eitan,

On Thu, 2006-06-08 at 04:40, Eitan Zahavi wrote:
> Hi Hal
> 
> Ran into this by chance. Some changes introduced lately to the SA queries
> now send zero size RMPP (single segment with only headers). It used to send
> them as non-RMPP responses.

Not sure what that change was.

> Anyway, this broke the mlx vendor code that I use 
> for simulation.
> 
> This patch resolves this new problem.

Thanks. Applied to trunk only. Any idea if OFED RC6 has this issue?

-- Hal

> Eitan






RE: [openib-general] OFED-1.0-rc6 is available

2006-06-08 Thread Tziporet Koren
Roland did the fix on Trunk and I took it to OFED 1.0 branch.

Tziporet

-Original Message-
From: Ramachandra K [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, June 07, 2006 8:28 PM
To: Tziporet Koren
Cc: [EMAIL PROTECTED]; openib-general; Ramachandra K
Subject: Re: [openib-general] OFED-1.0-rc6 is available

Tziporet Koren wrote:
> Hi All,
> 
> We have prepared OFED 1.0 RC6.
> 
 From the openib source tar ball in OFED RC6, it looks like
the SRP kernel changes (ulp/srp/ib_srp.c) in the trunk for
supporting Rev 10 targets have been included in RC6, but the 
corresponding changes to the userspace srptool--ibsrpdm
(userspace/srptools/src/srp-dm.c) for displaying the IO class
of the target have not been made part of RC6.

The changes to ibsrpdm were committed to the SVN repository trunk in 
revision number 7758.

Will the latest version of ibsrpdm make it to the next OFED release ?

Regards,
Ram




[openib-general] [PATCH] osm: fix mlx vendor rmpp sender fail to send zero size RMPP

2006-06-08 Thread Eitan Zahavi
Hi Hal

Ran into this by chance. Some changes introduced lately to the SA queries
now send zero size RMPP (single segment with only headers). It used to send
them as non-RMPP responses. Anyway, this broke the mlx vendor code that I use
for simulation.

This patch resolves this new problem.

Eitan

Signed-off-by:  Eitan Zahavi <[EMAIL PROTECTED]>

Index: libvendor/osm_vendor_mlx_sar.c
===
--- libvendor/osm_vendor_mlx_sar.c  (revision 7703)
+++ libvendor/osm_vendor_mlx_sar.c  (working copy)
@@ -91,7 +91,7 @@ osmv_rmpp_sar_get_mad_seg(
 num_segs++;
   }
 
-  if ( seg_idx > num_segs)
+  if ( (seg_idx > num_segs) && (seg_idx != 1) )
   {
 return IB_NOT_FOUND;
   }
@@ -102,18 +102,14 @@ osmv_rmpp_sar_get_mad_seg(
   /* attach header */
   memcpy(p_buf,p_sar->p_arbt_mad,p_sar->hdr_sz);
 
-
   /* fill data */
   p_seg = (char*)p_sar->p_arbt_mad + p_sar->hdr_sz + ((seg_idx-1) * p_sar->data_sz);
   sz_left = p_sar->data_len - ((seg_idx -1) * p_sar->data_sz);
   if (sz_left > p_sar->data_sz)
-  {
 memcpy((char*)p_buf+p_sar->hdr_sz,(char*)p_seg,p_sar->data_sz);
-  }
   else
 memcpy((char*)p_buf+ p_sar->hdr_sz, (char*)p_seg, sz_left);
 
-
   return IB_SUCCESS;
 }
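
To make the boundary case concrete, here is a standalone sketch of the segment check, written under the assumption that num_segs is the rounded-up quotient of data length by segment payload size (the patch context suggests this but does not show the computation). The function names are made up for this sketch, and the rejection of index 0 is an addition that is not in the patch.

```c
#include <stddef.h>

/* Number of data segments needed for data_len bytes, data_sz bytes each. */
static size_t num_segments(size_t data_len, size_t data_sz)
{
    size_t n = data_len / data_sz;

    if (data_len % data_sz)
        n++;
    return n;
}

/*
 * Returns 1 if segment seg_idx (1-based) may be sent. Segment 1 is always
 * valid, so a payload of zero bytes still produces one header-only segment.
 */
static int segment_valid(size_t seg_idx, size_t data_len, size_t data_sz)
{
    if (seg_idx == 0)
        return 0;   /* assumption of this sketch: index 0 is never valid */
    return seg_idx == 1 || seg_idx <= num_segments(data_len, data_sz);
}
```

With zero bytes of data, num_segments() is 0, so a plain "seg_idx > num_segs" test would reject segment 1; the extra condition lets that single header-only segment through while still rejecting segment 2.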
 





[openib-general] Re: [PATCH v2 4/7] AMSO1100 Memory Management.

2006-06-08 Thread Andrew Morton
On Wed, 07 Jun 2006 15:06:55 -0500
Steve Wise <[EMAIL PROTECTED]> wrote:

> 
> +void c2_free(struct c2_alloc *alloc, u32 obj)
> +{
> + spin_lock(&alloc->lock);
> + clear_bit(obj, alloc->table);
> + spin_unlock(&alloc->lock);
> +}

The spinlock is unneeded here.


What does all the code in this file do, anyway?  It looks totally generic
(and hence inappropriate for drivers/infiniband/hw/amso1100/) and somewhat
similar to idr trees, perhaps.

> +int c2_array_set(struct c2_array *array, int index, void *value)
> +{
> + int p = (index * sizeof(void *)) >> PAGE_SHIFT;
> +
> + /* Allocate with GFP_ATOMIC because we'll be called with locks held. */
> + if (!array->page_list[p].page)
> + array->page_list[p].page =
> + (void **) get_zeroed_page(GFP_ATOMIC);
> +
> + if (!array->page_list[p].page)
> + return -ENOMEM;

This _will_ happen under load.  What will the result of that be, in the
context of this driver?

This function is incorrectly designed - it should receive a gfp_t argument.
Because you don't *know* that the caller will always hold a spinlock.  And
GFP_KERNEL is far, far stronger than GFP_ATOMIC.

> +static int c2_alloc_mqsp_chunk(gfp_t gfp_mask, struct sp_chunk **head)
> +{
> + int i;
> + struct sp_chunk *new_head;
> +
> + new_head = (struct sp_chunk *) __get_free_page(gfp_mask | GFP_DMA);

Why is __GFP_DMA in there?  Unless you've cornered the ISA bus infiniband
market, it's likely to be wrong.






[openib-general] Re: [PATCH v2 1/2] iWARP Connection Manager.

2006-06-08 Thread Andrew Morton
On Wed, 07 Jun 2006 15:06:05 -0500
Steve Wise <[EMAIL PROTECTED]> wrote:

> 
> This patch provides the new files implementing the iWARP Connection
> Manager.
> 
> Review Changes:
> 
> - sizeof -> sizeof()
> 
> - removed printks
> 
> - removed TT debug code
> 
> - cleaned up lock/unlock around switch statements.
> 
> - waitqueue -> completion for destroy path.
>
> ...
>
> +/* 
> + * This function is called on interrupt context. Schedule events on
> + * the iwcm_wq thread to allow callback functions to downcall into
> + * the CM and/or block.  Events are queued to a per-CM_ID
> + * work_list. If this is the first event on the work_list, the work
> + * element is also queued on the iwcm_wq thread.
> + *
> + * Each event holds a reference on the cm_id. Until the last posted
> + * event has been delivered and processed, the cm_id cannot be
> + * deleted. 
> + */
> +static void cm_event_handler(struct iw_cm_id *cm_id,
> +  struct iw_cm_event *iw_event) 
> +{
> + struct iwcm_work *work;
> + struct iwcm_id_private *cm_id_priv;
> + unsigned long flags;
> +
> + work = kmalloc(sizeof(*work), GFP_ATOMIC);
> + if (!work)
> + return;

This allocation _will_ fail sometimes.  The driver must recover from it. 
Will it do so?
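
One possible shape for a handler that reports the failure instead of dropping
the event silently is sketched below in userspace C. This is an assumption
about how recovery could look, not what the patch does: every sketch_* name is
hypothetical, malloc() stands in for kmalloc(), and sketch_fail_alloc only
exists to simulate memory pressure.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct sketch_work {
	int event;
	struct sketch_work *next;
};

struct sketch_cm_id {
	struct sketch_work *work_list;
	int dropped_events;	/* lets the owner notice lost events */
};

/* Set to non-zero to make the next allocations fail (simulation only). */
static int sketch_fail_alloc;

static void *sketch_alloc(size_t size)
{
	return sketch_fail_alloc ? NULL : malloc(size);
}

/*
 * Returns 0 on success or -ENOMEM when the work element cannot be
 * allocated, so the caller can retry or tear the connection down.
 */
static int sketch_event_handler(struct sketch_cm_id *cm_id, int event)
{
	struct sketch_work *work;

	assert(cm_id != NULL);

	work = sketch_alloc(sizeof(*work));
	if (!work) {
		cm_id->dropped_events++;
		return -ENOMEM;
	}

	/* Pushed on the front of the list for brevity; a real queue is FIFO. */
	work->event = event;
	work->next = cm_id->work_list;
	cm_id->work_list = work;
	return 0;
}
```

Whether the caller can act on the error code depends on the provider
interface, which this sketch does not model.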

> +EXPORT_SYMBOL(iw_cm_init_qp_attr);

This file exports a ton of symbols.  It's usual to provide some justifying
commentary in the changelog when this happens.

> +/*
> + * Copyright (c) 2005 Network Appliance, Inc. All rights reserved.
> + * Copyright (c) 2005 Open Grid Computing, Inc. All rights reserved.
> + *
> + * This software is available to you under a choice of one of two
> + * licenses.  You may choose to be licensed under the terms of the GNU
> + * General Public License (GPL) Version 2, available from the file
> + * COPYING in the main directory of this source tree, or the
> + * OpenIB.org BSD license below:
> + *
> + * Redistribution and use in source and binary forms, with or
> + * without modification, are permitted provided that the following
> + * conditions are met:
> + *
> + *  - Redistributions of source code must retain the above
> + *copyright notice, this list of conditions and the following
> + *disclaimer.
> + *
> + *  - Redistributions in binary form must reproduce the above
> + *copyright notice, this list of conditions and the following
> + *disclaimer in the documentation and/or other materials
> + *provided with the distribution.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> + * SOFTWARE.
> + */
> +#if !defined(IW_CM_PRIVATE_H)
> +#define IW_CM_PRIVATE_H

We normally use #ifndef here.


