Re: [openib-general] RHEL5 and OFED ...

2006-10-18 Thread Shirley Ma

[EMAIL PROTECTED] wrote on 10/16/2006 01:50:49 PM:
> On Mon, 2006-10-16 at 15:25 +0200, Michael S. Tsirkin wrote:
> > Quoting r. Maestas, Christopher Daniel <[EMAIL PROTECTED]>:
> > > Subject: Re: [openib-general] RHEL5 and OFED ...
> > > 
> > > > Now for userspace - does RHEL5 include at least libibverbs-1.0?
> > > > This has been released a while back, and Roland makes regular bugfix
> > > > releases.
> > > 
> > > Here's what I see on a rhel4 u4 system:
> > > ---
> > > $ rpm -q libibverbs
> > > libibverbs-1.0.3-1
> > > ---
> > > 
> > > So I would think rhel5 would have at least that or greater.  When I
> > > compiled rpms for 1.1rc7 it generated:
> > > ---
> > > # ls libibverbs-*
> > > libibverbs-1.0.4-0.x86_64.rpm        libibverbs-utils-1.0.4-0.x86_64.rpm
> > > libibverbs-devel-1.0.4-0.x86_64.rpm
> > 
> > Doug, would it be possible to update this + libmthca?
> 
> Possibly.  What's the justification?  What's in 1.0.4 that is the
> primary reason for wanting to update from 1.0.3?
> 
> -- 
> Doug Ledford <[EMAIL PROTECTED]>

I am not sure whether this already has an answer.
The justification is that recent packages use madvise(..., MADV_DONTFORK) to make fork() work for verbs consumers. I hope the same patch will be in libehca.
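
(A minimal sketch of the call in question, not the actual libibverbs change: it maps a page-aligned buffer and marks it MADV_DONTFORK so that the pages are not inherited by a child created with fork(). The helper name alloc_dontfork_buffer() is made up for illustration, and Linux 2.6.16 or later is assumed.)

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for MADV_DONTFORK and MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map an anonymous, page-aligned buffer of at least
 * len bytes and mark it MADV_DONTFORK. The mapped size (len rounded up to
 * a whole number of pages) is stored in *mapped_len.
 * Returns NULL on failure; release with munmap(buf, *mapped_len).
 */
static void *alloc_dontfork_buffer(size_t len, size_t *mapped_len)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t rounded;
	void *buf;

	assert(mapped_len != NULL);
	if (page <= 0 || len == 0)
		return NULL;

	/* madvise() works on whole pages, so round the length up. */
	rounded = (len + (size_t)page - 1) / (size_t)page * (size_t)page;

	buf = mmap(NULL, rounded, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return NULL;

	/* Ask the kernel not to map these pages into forked children. */
	if (madvise(buf, rounded, MADV_DONTFORK) != 0) {
		munmap(buf, rounded);
		return NULL;
	}

	*mapped_len = rounded;
	return buf;
}
```

A verbs consumer would register such a buffer as usual; the child simply has no mapping at that address after fork().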

thanks
Shirley Ma
IBM Linux Technology Center
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...

2006-10-18 Thread Michael S. Tsirkin
Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> Subject: Re: [openfabrics-ewg] RHEL5 and OFED ...
> 
> On Wed, 2006-10-18 at 09:29 +0200, Michael S. Tsirkin wrote:
> > Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> > > > From our discussion, it seems we should be able to just push the
> > > > small number of missing bits into RHEL5 directly. That would be
> > > > nicer of course.
> > > 
> > > It depends.  If there's lots of individual changes, it might be easier
> > > to push the OFED 1.1 change.  But, that depends on when the final OFED
> > > 1.1 comes out and how much it varies from the existing RPMs.
> > 
> > OFED is in deep freeze, so you can already look at it to estimate the
> > amount of changes against 2.6.18.
> > Could you look at the diff please so that I know whether it's worth it
> > to invest in building the minimal patch set for pushing into RHEL5,
> > or whether you'll push OFED 1.1 into RHEL kernel as is?
> 
> Yeah, I'll look over the diff today.

How does it look?

-- 
MST




[openib-general] [PATCH] [REVOKE] If addr_handler() got error, do not set state as OK

2006-10-18 Thread Krishna Kumar
This was originally sent with the following intention:
If addr_handler() is invoked with an error status,
do not set id_priv->state to success and then
reset it to the old value (redundant code).
Also encapsulate some common code.

But when I followed Sean's suggestion to avoid using extra
flags, the result is not very appealing (see below). The
code is too complicated (multiple overwrites of 'status')
to do this neatly.

I suggest we drop this patch, as it is not easy to achieve
the above intention cleanly by either re-write method :)

diff -ruNp org/drivers/infiniband/core/cma.c new/drivers/infiniband/core/cma.c
--- org/drivers/infiniband/core/cma.c   2006-10-10 15:45:27.0 +0530
+++ new/drivers/infiniband/core/cma.c   2006-10-10 15:59:53.0 +0530
@@ -1520,6 +1518,13 @@ static void addr_handler(int status, str
 
 	atomic_inc(&id_priv->dev_remove);
 
+	if (status) { /* We got called with an error */
+		if (!cma_comp(id_priv, CMA_ADDR_QUERY)) /* Invalid state */
+			goto out;
+		event = RDMA_CM_EVENT_ADDR_ERROR;
+		goto notify;
+	}
+
 	/*
 	 * Grab mutex to block rdma_destroy_id() from removing the device while
 	 * we're trying to acquire it.
@@ -1529,9 +1534,8 @@ static void addr_handler(int status, str
 		mutex_unlock(&lock);
 		goto out;
 	}
-
-	if (!status && !id_priv->cma_dev)
-		status = cma_acquire_dev(id_priv);
+	if (!id_priv->cma_dev)
+		status = cma_acquire_dev(id_priv);
 	mutex_unlock(&lock);
 
 	if (status) {
@@ -1544,16 +1548,15 @@ static void addr_handler(int status, str
 		event = RDMA_CM_EVENT_ADDR_RESOLVED;
 	}
 
-	if (cma_notify_user(id_priv, event, status, NULL, 0)) {
+notify:
+	if (cma_notify_user(id_priv, event, status, NULL, 0))
 		cma_exch(id_priv, CMA_DESTROYING);
-		cma_release_remove(id_priv);
-		cma_deref_id(id_priv);
-		rdma_destroy_id(&id_priv->id);
-		return;
-	}
+
 out:
 	cma_release_remove(id_priv);
 	cma_deref_id(id_priv);
+	if (cma_comp(id_priv, CMA_DESTROYING))
+		rdma_destroy_id(&id_priv->id);
 }
 
 static int cma_resolve_loopback(struct rdma_id_private *id_priv)




Re: [openib-general] [PATCH] Fix some cancellation problems in process_req().

2006-10-18 Thread Krishna Kumar2
> The other changes look fine.  But note that if req->status == -ECANCELED and
> time_after() is true, then it seems like a toss up as to which one can be
> reported to the user.

I felt that since the time_after() check matched (in all likelihood) due to
the processing of the cancellation, -ECANCELED is more appropriate to return.
If both conditions are true, it is most likely that the cancelled operation
led to the time_after() match (cancel sets the timeout to jiffies, resulting
in this time_after() match). The chance of a cancellation and a genuine
timeout happening together is almost zero.
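
To make the argument concrete, here is a small userspace sketch (not the addr.c code). It copies the kernel's wraparound-safe time_after() comparison and shows why a cancelled request also looks timed out, and how giving -ECANCELED precedence resolves the tie. The struct and function names are hypothetical, and -ETIMEDOUT as the timeout status is my assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace copy of the kernel idiom: true if a is later than b,
 * even when the counter has wrapped around. */
#define time_after(a, b) ((long)((b) - (a)) < 0)

struct req {			/* hypothetical stand-in for a pending request */
	int status;
	unsigned long timeout;	/* expiry time in jiffies-like ticks */
};

/* Cancelling stamps the request with the current time, so the next
 * expiry check will also see it as timed out. */
static void cancel_req(struct req *r, unsigned long now)
{
	assert(r != NULL);
	r->status = -ECANCELED;
	r->timeout = now;
}

/* Status to report to the user: a recorded cancellation wins over
 * the timeout that the cancellation itself provoked. */
static int final_status(const struct req *r, unsigned long now)
{
	assert(r != NULL);
	if (r->status == -ECANCELED)
		return -ECANCELED;
	if (time_after(now, r->timeout))
		return -ETIMEDOUT;
	return r->status;
}
```

With this ordering, a request that was never cancelled still reports the timeout once its expiry time has passed.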

Do you agree ? Otherwise I can re-work the patch as suggested.

thanks,

- KK





[openib-general] [PATCH] RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count

2006-10-18 Thread Krishna Kumar
rdma_bind_addr() leaks a cma_dev reference count
in failure case. Also hold lock when doing a
cma_detach_from_dev() as pointed out by Sean.

Signed-off-by: Krishna Kumar <[EMAIL PROTECTED]>
---
diff -ruNp org/drivers/infiniband/core/cma.c new/drivers/infiniband/core/cma.c
--- org/drivers/infiniband/core/cma.c   2006-10-09 17:13:41.0 +0530
+++ new/drivers/infiniband/core/cma.c   2006-10-09 19:42:31.0 +0530
@@ -1750,6 +1750,7 @@ static int cma_get_port(struct rdma_id_p
 int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr)
 {
 	struct rdma_id_private *id_priv;
+	int did_acquire_dev = 0;
 	int ret;
 
 	if (addr->sa_family != AF_INET)
@@ -1768,6 +1769,7 @@ int rdma_bind_addr(struct rdma_cm_id *id
 		}
 		if (ret)
 			goto err;
+		did_acquire_dev = 1;
 	}
 
 	memcpy(&id->route.addr.src_addr, addr, ip_addr_size(addr));
@@ -1777,6 +1779,11 @@ int rdma_bind_addr(struct rdma_cm_id *id
 
 	return 0;
 err:
+	if (did_acquire_dev) {
+		mutex_lock(&lock);
+		cma_detach_from_dev(id_priv);
+		mutex_unlock(&lock);
+	}
 	cma_comp_exch(id_priv, CMA_ADDR_BOUND, CMA_IDLE);
 	return ret;
 }




Re: [openib-general] [PATCH] rdma_bind_addr() leaks a cma_dev reference count

2006-10-18 Thread Krishna Kumar2
Hi Sean,

> Let's try something like this then (untested):
> 
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 18a4366..0d06431 100755
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -1859,16 +1859,20 @@ int rdma_bind_addr(struct rdma_cm_id *id
>  			mutex_unlock(&lock);
>  		}
>  		if (ret)
> -			goto err;
> +			goto err1;
>  	}
> 
>  	memcpy(&id->route.addr.src_addr, addr, ip_addr_size(addr));
>  	ret = cma_get_port(id_priv);
>  	if (ret)
> -		goto err;
> +		goto err2;
> 
>  	return 0;
> -err:
> +err2:
> +	mutex_lock(&lock);
> +	cma_detach_from_dev(id_priv);
> +	mutex_unlock(&lock);
> +err1:
>  	cma_comp_exch(id_priv, CMA_ADDR_BOUND, CMA_IDLE);
>  	return ret;
>  }

This will mean that a deref is wrongly done if a loopback or zero address is
passed to this function, without it having done a ref inc. I do think this case
requires a variable to indicate whether a ref was got or not. Assuming that is
true, I will submit a patch with your comment about holding the lock.
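
A stripped-down sketch of the problem, with made-up names (struct dev, bind_sketch) rather than the cma.c code: the reference is only taken on one path, so the error path needs to know whether this call took it before dropping it.

```c
#include <assert.h>

struct dev {			/* hypothetical device with a reference count */
	int refcnt;
};

enum addr_kind { ADDR_ANY, ADDR_SPECIFIC };

/*
 * Sketch of a bind that takes a device reference only for a specific
 * address. port_ok simulates the later step that can still fail
 * (cma_get_port() in the real function).
 * Returns 0 on success, -1 on failure.
 */
static int bind_sketch(struct dev *d, enum addr_kind kind, int port_ok)
{
	int did_acquire_dev = 0;

	if (kind == ADDR_SPECIFIC) {
		d->refcnt++;		/* reference taken on this path only */
		did_acquire_dev = 1;
	}

	if (!port_ok)
		goto err;

	return 0;
err:
	if (did_acquire_dev) {		/* undo only what this call did */
		d->refcnt--;
		assert(d->refcnt >= 0);
	}
	return -1;
}
```

Without the flag, the ADDR_ANY failure case would decrement a count it never incremented.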

thanks,

- KK





Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-18 Thread Michael S. Tsirkin
Quoting Roland Dreier <[EMAIL PROTECTED]>:
> I think the real question is whether we expect to have complex config
> options that would be hard to stick in an environment variable.  At
> this point I suspect not, so I guess I'll go with $sysconfdir/libibverbs.d
> and $IBV_DRIVERS.

OK, both make sense.

-- 
MST




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-18 Thread Roland Dreier
 > Sure. But this configuration of a program (x11 font server), not a library, is
 > that right? So user has a chance to know he's running it and read the man page
 > to figure which files are read.  It seems for libraries conf files are not
 > common.

No, actually I snipped the next few lines of the man page:

Description
   Fontconfig is a library designed to provide system-wide font configuration,
   customization and application access.

Off the top of my head I can also think of GTK+ (~/.gtkrc), and I seem
to have a ~/.gstreamer too.

I think the real question is whether we expect to have complex config
options that would be hard to stick in an environment variable.  At
this point I suspect not, so I guess I'll go with $sysconfdir/libibverbs.d
and $IBV_DRIVERS.

 - R.




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-18 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] use mmiowb after doorbell ring
> 
>  > > I dunno what's better.  Maybe separate environment variables for
>  > > user-specific configs are just as good -- eg that's what ld.so does.
>  > 
>  > Hmm.
>  > I guess what I'm trying to say is - let's follow some precedent.
>  > ld.so example is good. Are there others?
> 
> I think there are plenty of precedents for putting configuration in
> dotfiles in $HOME.  For example on my system, 'man fonts-conf' shows
> 
> NAME
>fonts.conf - Font configuration files
> 
> SYNOPSIS
>   /etc/fonts/fonts.conf
>   /etc/fonts/fonts.dtd
>   /etc/fonts/conf.d
>   ~/.fonts.conf
> 
> But I'm sure there are plenty of environment variable uses too.

Sure. But this configuration of a program (x11 font server), not a library, is
that right? So user has a chance to know he's running it and read the man page
to figure which files are read.  It seems for libraries conf files are not
common.

-- 
MST




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-18 Thread Roland Dreier
 > > I dunno what's better.  Maybe separate environment variables for
 > > user-specific configs are just as good -- eg that's what ld.so does.
 > 
 > Hmm.
 > I guess what I'm trying to say is - let's follow some precedent.
 > ld.so example is good. Are there others?

I think there are plenty of precedents for putting configuration in
dotfiles in $HOME.  For example on my system, 'man fonts-conf' shows

NAME
   fonts.conf - Font configuration files

SYNOPSIS
  /etc/fonts/fonts.conf
  /etc/fonts/fonts.dtd
  /etc/fonts/conf.d
  ~/.fonts.conf

But I'm sure there are plenty of environment variable uses too.

 - R.




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-18 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> libraries don't stick anything in home directories -- I'm just
> suggesting $HOME/.libibverbs.conf as a place to stick extra configs
> that users might want to add.
> 
> I'm kind of thinking that we might want other config options beyond
> just driver names someday.  Otherwise we might as well have
> /etc/libibverbs.drivers.d and an environment variable IBV_DRIVERS, I
> guess.  But it might be nice to be able to add a line like
> 
> default-fork-safe true
> 
> somewhere in libibverbs.conf.d to set a system-wide default.
> 
> I dunno what's better.  Maybe separate environment variables for
> user-specific configs are just as good -- eg that's what ld.so does.

Possible usage examples:
I was thinking about some networked filesystem to have all boxes in the lab get
stuff from central place before the run, instead of copying stuff over.
I don't want to consider NFS-based home directory though.
Using the environment makes it easier for me to avoid the need to install stuff
on local disks, at all.

-- 
MST




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-18 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] use mmiowb after doorbell ring
> 
>  > > > Hopefully that is under prefix:
>  > > > $prefix/etc/libibverbs.conf.d/
>  > > 
>  > > Well, $sysconfdir/libibverbs.conf.d
>  > 
>  > Ugh, is that a problem if I want to build and run as non-root?
>  > I'm used to be able to set --prefix on config line for all libs
>  > to some directory, put LD_LIBRARY_PATH to point there, then
>  > if I like I just blow all of it away and I get a clean system.
>  > Scattering config files around in home directory etc will break this.
> 
> I'm not following the objection: what's wrong with using $sysconfdir?
> It defaults to $prefix/etc like you want, and it can be overridden
> with the --sysconfdir parameter to configure.

Sorry, looks like I was confused.

>  > > > Finally, it might be nice to be able to just specify the list of
>  > > > plugins at configure time for people like me who build everything
>  > > > from source and who want less flexibility
>  > > > but also less files to install.
>  > > 
>  > > Again, is that really any easier
>  > 
>  > Well, I'm thinking of distributed systems mainly where copying extra
>  > files around is additional pain.
>  > Consider myself: I'm building things on my laptop, then pushing them out to
>  > machines in the lab over rsync for testing. Less files - less headache.
>  > 
>  > > than putting whatever you want into
>  > > your .libibverbs.conf?
>  > 
>  > I really don't think a library sticking things in user's home directory
>  > is such a great idea - typical users don't really know they link against
>  > some library, this is just an extra place that users can break:
>  > move to another machine, things stop working, and your app's
>  > manual does not say anything of course.
> 
> libraries don't stick anything in home directories -- I'm just
> suggesting $HOME/.libibverbs.conf as a place to stick extra configs
> that users might want to add.
> 
> I'm kind of thinking that we might want other config options beyond
> just driver names someday.  Otherwise we might as well have
> /etc/libibverbs.drivers.d and an environment variable IBV_DRIVERS, I
> guess.  But it might be nice to be able to add a line like
> 
> default-fork-safe true
> 
> somewhere in libibverbs.conf.d to set a system-wide default.

I see. Looks somewhat useful - do you really intend something like this?
Then we'd need an API for app to set fork support state explicitly -
we currently only make it possible to enable it, not to disable.

> I dunno what's better.  Maybe separate environment variables for
> user-specific configs are just as good -- eg that's what ld.so does.

Hmm.
I guess what I'm trying to say is - let's follow some precedent.
ld.so example is good. Are there others?

>  > 
>  > > I definitely plan to make it so a missing plug-in is not fatal, so it
>  > > shouldn't hurt to have extra drivers declared that you don't build
>  > > every time.
>  > 
>  > Not until someone decides to rename a plugin for some reason - then you
>  > have to hunt down and kill the old file name to prevent an old version
>  > stuck in library path for some reason from being loaded - easy with the
>  > central location, but good luck walking all user's home directories.
> 
> Hmm, this seems to argue against allowing environment variables or
> anything but a single directory built into libibverbs.  Because
> otherwise you have to grep every .bashrc .cshrc and so on.

Hmm, good point.
I like it that with environment I can just pass it on command line and
not worry about any files which might be left behind.

-- 
MST




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-18 Thread Roland Dreier
 > > > Hopefully that is under prefix:
 > > > $prefix/etc/libibverbs.conf.d/
 > > 
 > > Well, $sysconfdir/libibverbs.conf.d
 > 
 > Ugh, is that a problem if I want to build and run as non-root?
 > I'm used to be able to set --prefix on config line for all libs
 > to some directory, put LD_LIBRARY_PATH to point there, then
 > if I like I just blow all of it away and I get a clean system.
 > Scattering config files around in home directory etc will break this.

I'm not following the objection: what's wrong with using $sysconfdir?
It defaults to $prefix/etc like you want, and it can be overridden
with the --sysconfdir parameter to configure.

 > > > Finally, it might be nice to be able to just specify the list of
 > > > plugins at configure time for people like me who build everything
 > > > from source and who want less flexibility
 > > > but also less files to install.
 > > 
 > > Again, is that really any easier
 > 
 > Well, I'm thinking of distributed systems mainly where copying extra
 > files around is additional pain.
 > Consider myself: I'm building things on my laptop, then pushing them out to
 > machines in the lab over rsync for testing. Less files - less headache.
 > 
 > > than putting whatever you want into
 > > your .libibverbs.conf?
 > 
 > I really don't think a library sticking things in user's home directory
 > is such a great idea - typical users don't really know they link against
 > some library, this is just an extra place that users can break:
 > move to another machine, things stop working, and your app's
 > manual does not say anything of course.

libraries don't stick anything in home directories -- I'm just
suggesting $HOME/.libibverbs.conf as a place to stick extra configs
that users might want to add.

I'm kind of thinking that we might want other config options beyond
just driver names someday.  Otherwise we might as well have
/etc/libibverbs.drivers.d and an environment variable IBV_DRIVERS, I
guess.  But it might be nice to be able to add a line like

default-fork-safe true

somewhere in libibverbs.conf.d to set a system-wide default.
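
Just to show how little parsing such a line would need, here is a sketch. The "key value" format, the '#' comment convention and the function name are assumptions, not a settled file format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one "key value" line such as "default-fork-safe true".
 * Returns 1 if the line sets default-fork-safe (result stored in *value),
 * 0 if the line is blank, a '#' comment, or sets some other key, and
 * -1 if the value is missing or is neither "true" nor "false".
 */
static int parse_fork_safe_line(const char *line, int *value)
{
	char key[64], val[64];
	int n;

	assert(line != NULL && value != NULL);

	n = sscanf(line, "%63s %63s", key, val);
	if (n < 1 || key[0] == '#')
		return 0;		/* blank line or comment */
	if (strcmp(key, "default-fork-safe") != 0)
		return 0;		/* some other option */
	if (n < 2)
		return -1;		/* key without a value */

	if (strcmp(val, "true") == 0) {
		*value = 1;
		return 1;
	}
	if (strcmp(val, "false") == 0) {
		*value = 0;
		return 1;
	}
	return -1;
}
```

The library would call this for each line of each file it finds and keep the last value seen.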

I dunno what's better.  Maybe separate environment variables for
user-specific configs are just as good -- eg that's what ld.so does.

 > 
 > > I definitely plan to make it so a missing plug-in is not fatal, so it
 > > shouldn't hurt to have extra drivers declared that you don't build
 > > every time.
 > 
 > Not until someone decides to rename a plugin for some reason - then you have to
 > hunt down and kill the old file name to prevent an old version stuck in library
 > path for some reason from being loaded - easy with the central location, but
 > good luck walking all user's home directories.

Hmm, this seems to argue against allowing environment variables or
anything but a single directory built into libibverbs.  Because
otherwise you have to grep every .bashrc .cshrc and so on.

 - R.




[openib-general] [PATCH] [TRIVIAL] OpenSM/osm_port_info_rcv.c: Remove duplicate dump of received PortInfo

2006-10-18 Thread Hal Rosenstock
OpenSM/osm_port_info_rcv.c: Remove duplicate dump of received PortInfo
in osm_pi_rcv_process

Signed-off-by: Hal Rosenstock <[EMAIL PROTECTED]>

Index: opensm/osm_port_info_rcv.c
===
--- opensm/osm_port_info_rcv.c  (revision 9884)
+++ opensm/osm_port_info_rcv.c  (working copy)
@@ -710,8 +710,9 @@ osm_pi_rcv_process(
   port_guid = p_context->port_guid;
   node_guid = p_context->node_guid;
 
-  osm_dump_port_info(
-p_rcv->p_log, node_guid, port_guid, port_num, p_pi, OSM_LOG_DEBUG);
+  osm_dump_port_info( p_rcv->p_log,
+  node_guid, port_guid, port_num, p_pi,
+  OSM_LOG_DEBUG );
 
   /* 
  we might get a response during a light sweep looking for a change in 
@@ -829,10 +830,6 @@ osm_pi_rcv_process(
 p_smp->hop_count, p_smp->initial_path );
 }
 
-osm_dump_port_info( p_rcv->p_log,
-node_guid, port_guid, port_num, p_pi,
-OSM_LOG_DEBUG );
-
 /*
   Check if the update_sm_base_lid in the context is TRUE.
   If it is - then update the master_sm_base_lid of the variable







Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-18 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> > Hopefully that is under prefix:
> > $prefix/etc/libibverbs.conf.d/
> 
> Well, $sysconfdir/libibverbs.conf.d

Ugh, is that a problem if I want to build and run as non-root?
I'm used to be able to set --prefix on config line for all libs
to some directory, put LD_LIBRARY_PATH to point there, then
if I like I just blow all of it away and I get a clean system.
Scattering config files around in home directory etc will break this.

> > Finally, it might be nice to be able to just specify the list of
> > plugins at configure time for people like me who build everything
> > from source and who want less flexibility
> > but also less files to install.
> 
> Again, is that really any easier

Well, I'm thinking of distributed systems mainly where copying extra
files around is additional pain.
Consider myself: I'm building things on my laptop, then pushing them out to
machines in the lab over rsync for testing. Less files - less headache.

> than putting whatever you want into
> your .libibverbs.conf?

I really don't think a library sticking things in user's home directory
is such a great idea - typical users don't really know they link against
some library, this is just an extra place that users can break:
move to another machine, things stop working, and your app's
manual does not say anything of course.

> I definitely plan to make it so a missing plug-in is not fatal, so it
> shouldn't hurt to have extra drivers declared that you don't build
> every time.

Not until someone decides to rename a plugin for some reason - then you have to
hunt down and kill the old file name to prevent an old version stuck in library
path for some reason from being loaded - easy with the central location, but
good luck walking all user's home directories.

-- 
MST




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-18 Thread Roland Dreier
 > Hopefully that is under prefix:
 > $prefix/etc/libibverbs.conf.d/

Well, $sysconfdir/libibverbs.conf.d

 > and I think an environment with a list of additional directories
 > would also be helpful.

Is that really necessary?  Just stick whatever you want into
$HOME/.libibverbs.conf.

 > Finally, it might be nice to be able to just specify the list of
 > plugins at configure time for people like me who build everything
 > from source and who want less flexibility
 > but also less files to install.

Again, is that really any easier than putting whatever you want into
your .libibverbs.conf?

I definitely plan to make it so a missing plug-in is not fatal, so it
shouldn't hurt to have extra drivers declared that you don't build
every time.

 - R.




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-18 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> could have each plugin drop a file in /etc/libibverbs.conf.d/ with the
> name -- something like

OK, feature request time :)

Hopefully that is under prefix:
$prefix/etc/libibverbs.conf.d/

and I think an environment with a list of additional directories
would also be helpful.

Finally, it might be nice to be able to just specify the list of
plugins at configure time for people like me who build everything
from source and who want less flexibility
but also less files to install.

-- 
MST




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-18 Thread Jason Gunthorpe
On Wed, Oct 18, 2006 at 04:25:21PM -0700, Roland Dreier wrote:
>  > AC_DEFUN(rc_LIBSTDCPP_VER,

> Thanks -- this actually solves the easiest part of my problem, and
> does it in a way that's not really useful for me (libibverbs needs to
> know what extra bits are getting added to plugin names, and with this
> technique, it would have to know what the final libary name was going
> to be, before it got built).  So I think I need to stick the extra
> plugin library name into a define in .

Right, that's exactly what should be done in ibverbs.

The general technique from that example is what you'd put in
userspace/libmthca/configure.in if I'm grokking this build process
properly..

Jason




[openib-general] test results from Mellanox, Voltaire, QLogic, and IBM for OFED 1.1 rc7?

2006-10-18 Thread Scott Weitzenkamp (sweitzen)

What testing did these companies do with rc7?  I'd kinda like to see
performance data for the QLogic and IBM HCAs...

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems


Re: [openib-general] [openfabrics-ewg] Cisco SQA Results for OFED 1.1 rc7

2006-10-18 Thread Scott Weitzenkamp (sweitzen)

SRP got broken in rc7 for the Cisco Fibre Channel gateway, so we couldn't
test it with that.

We have started testing with DDN IB storage, but don't have test results
to share yet.

I'm sad to report no SRP HA testing in Cisco SQA yet.  It's next on the
todo list (right after IPoIB HA).

Scott

  From: Sujal Das [mailto:[EMAIL PROTECTED]
  Sent: Wednesday, October 18, 2006 4:09 PM
  To: Scott Weitzenkamp (sweitzen)
  Cc: openib-general@openib.org
  Subject: RE: [openfabrics-ewg] Cisco SQA Results for OFED 1.1 rc7

  Scott, thanks for the report.  Based on this, it looks like Cisco did
  not test the SRP initiator and HA functions with any SRP targets.  Is
  that a fair assessment?

  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Weitzenkamp (sweitzen)
  Sent: Wednesday, October 18, 2006 2:24 PM
  To: [EMAIL PROTECTED]
  Cc: openib-general@openib.org
  Subject: [openfabrics-ewg] Cisco SQA Results for OFED 1.1 rc7

  Regression testing went well, using Cisco switches and Cisco (Mellanox)
  HCAs.  See attached spreadsheet for more details.

  The following increase in testing happened:

    Started testing SLES10 IA32 (will have IA64 and PPC64 results for pre1).
    Switched to HP MPI 2.2.5, which is first version to support OF.

  The following bugs were tested and closed.

    247 OFED IPoIB HA not working on RHEL4 U3
    259 problems with OFED IPoIB HA on SLES10
    173 OFED mpitests: add osu_{bw,latency,bibw,bcast}.c examples

  The following bugs were opened, but all have been marked fixed in pre1,
  thanks Mellanox folks for the quick response.

    273 OFED 1.1 rc7 does not work with Cisco FC Gateway
    274 OFED 1.1: RENICE_IB_MAD=yes hangs dual-HCA system with dual-port HCAs
    277 OFED 1.1 rc7: uninitialized value during IPoIB failover in ipoib_ha.pl
    278 OFED 1.1: two copies of openib.spec in openib-1.1.tgz

  Scott Weitzenkamp
  SQA and Release Manager
  Server Virtualization Business Unit
  Cisco Systems


Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-18 Thread Roland Dreier
 > AC_DEFUN(rc_LIBSTDCPP_VER,

Thanks -- this actually solves the easiest part of my problem, and
does it in a way that's not really useful for me (libibverbs needs to
know what extra bits are getting added to plugin names, and with this
technique, it would have to know what the final library name was going
to be, before it got built).  So I think I need to stick the extra
plugin library name into a define in .

But seeing this code led me to information that solves everything else
I was worried about.  The libtool flag "-release" is what I need to
add gunk to the final .so's name, and I think backward compatibility
can be handled pretty easily too.  So thanks...

 > That bit goes in aclocal.m4

Yeah, I'd hide code like that too rather than let anyone see it in my
configure.in ;)

 - R.




Re: [openib-general] [openfabrics-ewg] Cisco SQA Results for OFED 1.1 rc7

2006-10-18 Thread Sujal Das

Scott, thanks for the report.  Based on this, it looks like Cisco did not
test the SRP initiator and HA functions with any SRP targets.  Is that a
fair assessment?

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Weitzenkamp (sweitzen)
Sent: Wednesday, October 18, 2006 2:24 PM
To: [EMAIL PROTECTED]
Cc: openib-general@openib.org
Subject: [openfabrics-ewg] Cisco SQA Results for OFED 1.1 rc7

Regression testing went well, using Cisco switches and Cisco (Mellanox)
HCAs.  See attached spreadsheet for more details.

The following increase in testing happened:

 Started testing SLES10 IA32 (will have IA64 and PPC64 results for pre1).
 Switched to HP MPI 2.2.5, which is first version to support OF.

The following bugs were tested and closed.

 247 OFED IPoIB HA not working on RHEL4 U3
 259 problems with OFED IPoIB HA on SLES10
 173 OFED mpitests: add osu_{bw,latency,bibw,bcast}.c examples

The following bugs were opened, but all have been marked fixed in pre1,
thanks Mellanox folks for the quick response.

 273 OFED 1.1 rc7 does not work with Cisco FC Gateway
 274 OFED 1.1: RENICE_IB_MAD=yes hangs dual-HCA system with dual-port HCAs
 277 OFED 1.1 rc7: uninitialized value during IPoIB failover in ipoib_ha.pl
 278 OFED 1.1: two copies of openib.spec in openib-1.1.tgz

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems


[openib-general] ibv_reg_mr temporary vs permanent errors

2006-10-18 Thread Troy Benjegerdes
If ibv_reg_mr fails, can an application (or library, such as pvfs)  
assume that this is just a temporary error, and try to deregister  
some memory, then try again?

How can we differentiate between the case where the hardware (such as  
ehca) actually has more information about why the memory registration  
failed, and the application can act on that information (by  
coalescing memory regions, for example), vs cases where something is  
just plain broken and the application should give up and exit.
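
A minimal sketch of how a caller might separate the two cases, assuming
the provider leaves a meaningful errno behind when ibv_reg_mr() returns
NULL. That is an assumption, not something this thread confirms for ehca
or mthca, and the names reg_failure and classify_reg_failure are made up
for illustration:

```c
#include <errno.h>

/* Hypothetical classification of a failed memory registration. */
enum reg_failure {
    REG_RETRY_AFTER_FREE,  /* resource exhaustion: free something, retry */
    REG_GIVE_UP            /* bad arguments or an unknown condition */
};

/*
 * Map an errno value, saved right after the failed registration call, to
 * a coarse retry decision.  Which errno values a given provider actually
 * reports is an assumption here, not documented behaviour.
 */
enum reg_failure classify_reg_failure(int saved_errno)
{
    switch (saved_errno) {
    case ENOMEM:   /* out of memory or HCA translation resources */
    case EAGAIN:   /* transient shortage */
        return REG_RETRY_AFTER_FREE;
    default:       /* EINVAL, EFAULT, EACCES, ... */
        return REG_GIVE_UP;
    }
}
```

A caller would save errno immediately after the failed ibv_reg_mr() call
and, on REG_RETRY_AFTER_FREE, deregister or coalesce cached regions
before trying again.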




Re: [openib-general] [PATCH/RFC 1/2] IB: Return "maybe_missed_event" hint from ib_req_notify_cq()

2006-10-18 Thread Shirley Ma

Roland Dreier <[EMAIL PROTECTED]> wrote on 10/18/2006 01:55:13 PM:
> I would like to understand why there's a throughput difference with
> scaling turned off, since the NAPI code doesn't change the interrupt
> handling all that much, and should lower the CPU usage if anything.
That's what I am trying to understand now.
Yes, the send-side rate dropped significantly, and CPU usage is lower as well.

> Does changing the netdev weight value affect anything?
> 
>  - R.
No, it doesn't.

Thanks
Shirley Ma
IBM Linux Technology Center

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-18 Thread Jason Gunthorpe
On Wed, Oct 18, 2006 at 01:43:03PM -0700, Roland Dreier wrote:

> The only two things I need to figure out, I hope with help from
> smarter people:

I'm by no means an expert, but this might be helpful to someone who
is:

AC_DEFUN(rc_LIBSTDCPP_VER,
[AC_MSG_CHECKING([libstdc++ version])
dummy=if$$
cat <<_LIBSTDCPP_>$dummy.cc
#include 
#include 
#include 
int main(int argc, char **argv) { exit(0); }
_LIBSTDCPP_
${CXX-c++} $dummy.cc -o $dummy > /dev/null 2>&1

if test "$?" = 0; then
soname=`objdump -p ./$dummy |grep NEEDED|grep libstd`
LIBSTDCPP_VER=`echo $soname | sed -e 's/.*NEEDED.*libstdc++\(-libc.*\(-.*\)\)\?.so.\(.*\)/\3\2/'`
fi
rm -f $dummy $dummy.cc

if test -z "$LIBSTDCPP_VER"; then
AC_MSG_WARN([cannot determine standard C++ library version number])
else
AC_MSG_RESULT([$LIBSTDCPP_VER])
LIBSTDCPP_VER="-$LIBSTDCPP_VER"
fi
AC_SUBST(LIBSTDCPP_VER)
])

This is a fragment from another project I have that stamps a soname
with the libstdc++ soname (libstdc++ causes a similar issue). The
basic idea is to compile a dummy program and link it with the target
library then use objdump to extract the soname and assign a
substitution variable. That bit goes in aclocal.m4.

Once you have the substitution I think a conditional fragment in the
makefile should be enough to solve the second problem.

Jason




Re: [openib-general] [PATCH/RFC 1/2] IB: Return "maybe_missed_event" hint from ib_req_notify_cq()

2006-10-18 Thread Roland Dreier
 > Thanks. The touch test results are not good. This NAPI patch induces huge
 > latency for ehca driver scaling code, the throughput performance is not
 > good. (I am not fully convinced the huge latency is because of raising NAPI
 > in thread context.) Then I tried ehca no scaling driver, the latency looks
 > good, but the throughput is still a problem. We are working on these
 > issues. Hopefully we can get the answer soon.

Hmm, the results with "scaling" on are not that unexpected, since the
idea of scheduling a thread round-robin (to kill all cache locality)
is pretty dubious anyway.

I would like to understand why there's a throughput difference with
scaling turned off, since the NAPI code doesn't change the interrupt
handling all that much, and should lower the CPU usage if anything.
Does changing the netdev weight value affect anything?

 - R.




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-18 Thread Roland Dreier
 > I just took a quick look at the directory setup and if you are
 > changing things I'd say you should also arrange to have the libibverbs
 > soname stamped into the plugin path and soname. Something like
 > libmthca-libibverbs.2.so.0. Once you do that it is pretty safe
 > to put it in /usr/lib* 

That makes sense (although I guess it would be
libmthca-libibverbs.2.so without the .0, since libmthca is just a
plugin that doesn't have an independent soname of its own).  Then we
could have each plugin drop a file in /etc/libibverbs.conf.d/ with the
name -- something like

driver mthca

(and possibly also read $HOME/.libibverbs.conf if desired)

The only two things I need to figure out, I hope with help from
smarter people:
 - What is the autoconf/automake chicanery needed to make the
   libmthca figure out the right libibverbs soname to stick in the
   name of the .so it installs?
 - And what is the autoconf/automake chicanery needed to fall back to
   having libmthca install plain mthca.so under /usr/lib/infiniband
   when it detects that it is being built against libibverbs 1.0?

 - R.
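
As a rough illustration of the scheme described above, the sketch below
parses a "driver mthca" line and builds the matching plugin file name.
The file format details, the helper names parse_driver_line and
plugin_file_name, and the buffer sizes are all assumptions; this is not
code from libibverbs:

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of a hypothetical libibverbs config file.  Returns 1 and
 * copies the driver name into `name` if the line starts with
 * "driver <name>", otherwise returns 0.
 */
int parse_driver_line(const char *line, char *name, size_t name_len)
{
    char keyword[16];
    char value[64];

    if (name_len == 0)
        return 0;
    /* %15s and %63s bound the copies to the buffer sizes above. */
    if (sscanf(line, "%15s %63s", keyword, value) != 2)
        return 0;
    if (strcmp(keyword, "driver") != 0)
        return 0;
    if (strlen(value) >= name_len)
        return 0;
    strcpy(name, value);
    return 1;
}

/*
 * Build a plugin file name of the suggested form, e.g.
 * "libmthca-libibverbs.2.so" for driver "mthca" and version "2".
 * Returns 1 on success, 0 if the result does not fit in `out`.
 */
int plugin_file_name(const char *driver, const char *verbs_so_version,
                     char *out, size_t out_len)
{
    int n = snprintf(out, out_len, "lib%s-libibverbs.%s.so",
                     driver, verbs_so_version);
    return n >= 0 && (size_t)n < out_len;
}
```

Anything after the driver name on the line is ignored in this sketch; a
real parser would probably want to reject or interpret it.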




Re: [openib-general] [PATCH OFED-1.1-rc7] libehca configure: fix missing check of libsysfs

2006-10-18 Thread Hoang-Nam Nguyen
Hi,
> Do we really want generated files in svn? Why?
No. I was unsure if it's in ofed branch. And you're right, no need to.
Ignore this!
Thanks
Nam





Re: [openib-general] Catastrophic error detected.

2006-10-18 Thread Michael S. Tsirkin
Quoting r. Ira Weiny <[EMAIL PROTECTED]>:
> Subject: Catastrophic error detected.
> 
> I got the following error running with OFED 1.1 on a modified 2.6.9 RHEL4
> kernel.  Hal mentioned that there might be a catastrophic error recovery patch
> submitted since then?  I can't find a mention of that in the mailing list.  If
> possible I would like to try such a patch.
> 
> Thanks,
> Ira
> 
> 2006-10-17 21:31:47 ib_mthca :07:00.0: Catastrophic error detected: unknown error
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[00]: 
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[01]: 
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[02]: 
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[03]: 
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[04]: 
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[05]: 
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[06]: 
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[07]: 
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[08]: 
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[09]: 
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[0a]: 
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[0b]: 
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[0c]: 
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[0d]: 
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[0e]: 
> 2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[0f]: 

OFED 1.1 will already try to recover. But the fact that you got 
indicates it's a hard error that we couldn't recover from.

-- 
MST




[openib-general] Catastrophic error detected.

2006-10-18 Thread Ira Weiny
I got the following error running with OFED 1.1 on a modified 2.6.9 RHEL4
kernel.  Hal mentioned that there might be a catastrophic error recovery patch
submitted since then?  I can't find a mention of that in the mailing list.  If
possible I would like to try such a patch.

Thanks,
Ira

2006-10-17 21:31:47 ib_mthca :07:00.0: Catastrophic error detected: unknown error
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[00]: 
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[01]: 
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[02]: 
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[03]: 
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[04]: 
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[05]: 
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[06]: 
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[07]: 
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[08]: 
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[09]: 
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[0a]: 
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[0b]: 
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[0c]: 
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[0d]: 
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[0e]: 
2006-10-17 21:31:47 ib_mthca :07:00.0:   buf[0f]: 

# rhea277 /root > /sbin/lspci -vv -s 07:00.0
07:00.0 InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (rev 20)
Subsystem: Mellanox Technologies MT25208 InfiniHost III Ex
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-



Re: [openib-general] Tools for development

2006-10-18 Thread Sasha Khapyorsky
On 08:12 Wed 18 Oct , Jeff Squyres wrote:
> I was not on the call last week, but I understand that there was some  
> discussion about exactly this point (ditch SVN and go 100% git): the  
> decision was to stick with SVN for userspace stuff and stick with git  
> for kernel stuff.
> 
> However, this is a larger audience than was on the call.  Is there a  
> significant movement here from the developers to move to 100% git?

Moving (or not moving) userspace to git could be done on per project
basis (as actually suggested by Michael).

Personally I'm voting for git.

Sasha

> 
> (I don't really care)
> 
> 
> On Oct 17, 2006, at 12:21 PM, Sasha Khapyorsky wrote:
> 
> >On 17:04 Tue 17 Oct , Michael S. Tsirkin wrote:
> >>Quoting r. Steve Wise <[EMAIL PROTECTED]>:
> >>>At the risk of opening a can of worms, is there any reason we  
> >>>don't move
> >>>the user stuff into its own git tree?  This would get rid of svn
> >>>altogether...
> >>
> >>If we do, that should probably be multiple git trees - verbs,  
> >>management,
> >>tests are all more or less independent and developed mostly by  
> >>different people.
> >
> >Reasonable. And generally this should not be too bad.
> >
> >Sasha
> >
> 
> 
> -- 
> Jeff Squyres
> Server Virtualization Business Unit
> Cisco Systems
> 




Re: [openib-general] [PATCH OFED-1.1-rc7] libehca configure: fix missing check of libsysfs

2006-10-18 Thread Michael S. Tsirkin
Quoting r. Hoang-Nam Nguyen <[EMAIL PROTECTED]>:
> Subject: [PATCH OFED-1.1-rc7] libehca configure: fix missing check of libsysfs
> 
> Hello,
> here is the patch of configure in libehca as a result of the patch
> "libehca configure.in and config.h.in". It is generated by autogen.sh
> and pretty lengthy. Hence, I'm attaching it here for completeness.
> Vlad, do you want me to check it in svn or send you the whole file?
> Thanks!
> Nam

Do we really want generated files in svn? Why?

-- 
MST




Re: [openib-general] [PATCH/RFC 1/2] IB: Return "maybe_missed_event" hint from ib_req_notify_cq()

2006-10-18 Thread Shirley Ma

Roland Dreier <[EMAIL PROTECTED]> wrote on 10/17/2006 08:41:59 PM:
> Anyway, I'm eagerly awaiting your NAPI results with ehca.
> 
> Thanks,
>   Roland

Thanks. The touch test results are not good. This NAPI patch induces huge
latency for ehca driver scaling code, the throughput performance is not
good. (I am not fully convinced the huge latency is because of raising NAPI
in thread context.) Then I tried ehca no scaling driver, the latency looks
good, but the throughput is still a problem. We are working on these
issues. Hopefully we can get the answer soon.

Thanks
Shirley Ma

[openib-general] [PATCH OFED-1.1-rc7] libehca configure.in and config.h.in: fix missing check of libsysfs.h

2006-10-18 Thread Hoang-Nam Nguyen
Hello,
below is a patch to configure.in and config.h.in in libehca. It checks
for the presence of libsysfs.h properly. Unfortunately I only noticed this
bug recently, after I had fixed the "openib.spec" issues and tested OFED
on a clean system.
Thanks!
Nam


Signed-off-by: Hoang-Nam Nguyen <[EMAIL PROTECTED]>
---


 config.h.in  |3 +++
 configure.in |5 +
 2 files changed, 8 insertions(+)


diff -Nurp openib-1.1/src/userspace/libehca/config.h.in openib-1.1_patch/src/userspace/libehca/config.h.in
--- openib-1.1/src/userspace/libehca/config.h.in 2006-10-05 15:07:36.0 +0200
+++ openib-1.1_patch/src/userspace/libehca/config.h.in 2006-10-18 17:31:37.0 +0200
@@ -27,6 +27,9 @@
 /* Define to 1 if you have the  header file. */
 #undef HAVE_STRING_H
 
+/* Define to 1 if you have the  header file. */
+#undef HAVE_SYSFS_LIBSYSFS_H
+
 /* Define to 1 if you have the  header file. */
 #undef HAVE_SYS_STAT_H
 
diff -Nurp openib-1.1/src/userspace/libehca/configure.in openib-1.1_patch/src/userspace/libehca/configure.in
--- openib-1.1/src/userspace/libehca/configure.in 2006-10-05 15:07:03.0 +0200
+++ openib-1.1_patch/src/userspace/libehca/configure.in 2006-10-18 17:31:37.0 +0200
@@ -25,9 +25,14 @@ AC_CHECK_LIB(ibverbs, 
  [], 
  AC_MSG_ERROR([libibverbs not installed]))
 
+dnl Checks for header files.
+AC_CHECK_HEADER(infiniband/driver.h, [],
+AC_MSG_ERROR([ not found.  libehca requires libibverbs.]))
+
 dnl Checks for library functions
 AC_CHECK_FUNCS(ibv_read_sysfs_file)
 fi
+AC_CHECK_HEADERS(sysfs/libsysfs.h)
 
 dnl Checks for programs.
 AC_PROG_CC
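
For context, a header check like AC_CHECK_HEADERS(sysfs/libsysfs.h) is
normally consumed in C through the HAVE_SYSFS_LIBSYSFS_H macro that the
config.h.in hunk above declares. A small sketch, assuming a hypothetical
EHCA_HAVE_LIBSYSFS convenience macro and helper that are not part of
libehca:

```c
/*
 * config.h would normally be generated by configure and included first;
 * it is left out here so the sketch compiles on its own.
 */
#ifdef HAVE_SYSFS_LIBSYSFS_H
#include <sysfs/libsysfs.h>
#define EHCA_HAVE_LIBSYSFS 1   /* hypothetical convenience macro */
#else
#define EHCA_HAVE_LIBSYSFS 0
#endif

/* Report at run time whether the build found the libsysfs header. */
int built_with_libsysfs(void)
{
    return EHCA_HAVE_LIBSYSFS;
}
```

Code that needs libsysfs calls would sit behind the same #ifdef, with a
fallback or a build-time error on systems where the header is missing.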

Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?

2006-10-18 Thread Troy Benjegerdes
(I am taking this back to the openib list because I think the list  
needs to hear about real applications that are hitting memory  
registration limits)

What are the limits on the ehca memory registrations?

Is there a limit to the number of regions that can be registered? Is  
there any way (with kernel hacks) that we can register the entire  
address space of the application? We would like to be able to do RDMA  
sends and receives from anywhere in the application address space  
eventually, and only register it once.

What is the point of RDMA for memory-intensive applications if you  
have to copy the data to a registered buffer before sending it anyway?


On Oct 18, 2006, at 11:27 AM, Kyle Schochenmaier wrote:

> Hoang-Nam Nguyen wrote:
>> Hi Troy!
>>
>>> I am running PVFS2 on OpenIB, with IBM's ehca.
>>> When we start writing/reading large files, either with the NetPIPE
>>> PVFS module we have or a modified GAMESS executable that uses
>>> libpvfs2 directly, the 'ibv_reg_mr' function fails, and we get an  
>>> error.
>>> This is also correlated with kernel log messages like this:
>>> Oct 16 11:14:45 p5l8 kernel: PU0003 000e0091:ehca_hcall_7arg_7ret
>>> HCAD_ERROR  opco
>>> de=160 ret=fff7 arg1=1304 arg2=5
>>> arg3=14f0ebc8 arg4=1
>>> arg5=e0 arg6=e3e9f200 arg7=0 out1=0 out2=0 out3=0 out4=0
>>> out5=0 out6=0
>>> out7=0
>>>
>> Return code f7 from firmware/hvcall means H_NO_MEM. I'm wondering
>> if you could provide me with some history of this problem.
>> Is this a permanent problem? If yes, could you give me more information
>> on your test case or scenario, e.g. large file size, NetPIPE options?
>> Which version of ehca are you using? And which kernel version?
>> Thanks!
>> Hoang-Nam Nguyen
>>
>>
> I think Troy could better explain what is happening here, so I'm  
> taking this off-list for now -- we're trying to get this working  
> for SC'06, so time is limited :) -- if Troy wants to forward this  
> on to the list after looking at it, that's fine too.
> Our app writes out a file once, then reads it in many times through  
> the pvfs2 system.  In the pvfs2 layers, there is memory caching  
> done at the network level, so memory is registered by the app, and  
> attempts are made to re-register and/or re-use these memory regions  
> to save on memory reg overhead.  The problem occurs only while  
> writing files, so while memory is being initially registered with  
> the nic/app and cached?  Also, our tests show that the app runs  
> normally to completion on identical machines using mellanox hca's  
> instead of the eHCA.  The file sizes are generally >16GByte,  
> however our failures usually appear by the time ~220-250MBytes have  
> been written(possibly also all registered)?
>
> I'm not sure the standard OpenIB NetPIPE runs can reproduce this
> type of workload.  However, we have developed a working
> PVFS2-NetPIPE module which can reproduce this problem on occasion.
> If there is interest in further testing this on your end, I can
> make it available.
>
> Our ehca's have the following revision info:
>vendor_id:  0x5076
>vendor_part_id: 0
>hw_ver: 0x103
> Kernel version is debian 2.6.17
>
> I hope this is enough info to get some more insight from everyone.





[openib-general] [Bug 263] OFED 1.1 rc6: IPoIB Oops during IPoIB failover loop

2006-10-18 Thread bugzilla-daemon
http://openib.org/bugzilla/show_bug.cgi?id=263





--- Comment #11 from [EMAIL PROTECTED]  2006-10-18 09:56 ---
Roland, I enabled debug_level=1 with OFED 1.1 rc7 RHEL4 U3 x86_64, and got same
crash (netserver machine).

I could only see the debug_level=1 info by running dmesg in a loop, and the
info did not get saved into any /var/log files.  Is there some extra
configuration needed for syslog?  Shouldn't IPoIB debug_level=1 info go into a
syslog file by default?

Here's what I saw from dmesg loop right before crash.

ib1: Port state change event
ib0: Port state change event
ib1: Port state change event
ib0: flushing
ib0: downing ib_dev
ib1: flushing
ib1: downing ib_dev
ib0: Created ah 0101beffa800
ib1: Created ah 0101be636800
ib0: Created ah 0101be5724c0
ib1: Created ah 0101be9c8a80
ib0: Created ah 0101bfc57100
ib1: Created ah 0101be49f700
ib0: Created ah 0101beffa3c0
ib1: Created ah 0101beffae80
ib0: Created ah 0101be636b40
ib1: Created ah 01019dfecd40
ib0: Start path record lookup for fe80::::0005:ad00:0020:0861 MTU > 1024
ib0: PathRec LID 0x0006 for GID fe80::::0005:ad00:0020:0861
ib0: Created ah 01019dfec600
ib0: created address handle 01019dfecac0 for LID 0x0006, SL 0
ib0: Port state change event
ib1: Port state change event
ib0: flushing
ib0: downing ib_dev
ib1: flushing
ib1: downing ib_dev
ib0: Start path record lookup for fe80::::0005:ad00:0020:0861 MTU > 1024
ib0: PathRec LID 0x0006 for GID fe80::::0005:ad00:0020:0861
ib0: Created ah 0101beffa300
ib0: created address handle 01019dfec1c0 for LID 0x0006, SL 0
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: Created ah 0101bfc55e80
ib0: Created ah 0101bfc4cc80
ib0: Created ah 01019dfec480
ib0: Created ah 01019dfec3c0
ib0: Created ah 01019dfec100
Tue Oct 17 01:05:42 PDT 2006

Message from [EMAIL PROTECTED] at Tue Oct 17 01:05:43 2006 ...
svbu-qa-pcie-1 kernel: general protection fault:  [1] SMP


Here's serial console output from netserver machine.

ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
ib0: dev_queue_xmit failed to requeue packet
general protection fault:  [1] SMP
CPU 0
Modules linked in: rdma_ucm(U) rdma_cm(U) ib_addr(U) ib_ipoib(U)
ib_mthca<7>Losi
ng some ticks... checking if CPU frequency changed.
(U) ib_umad(U) ib_ucm(U) ib_uverbs(U) ib_cm(U) ib_sa(U) ib_mad(U) ib_core(U)
md5
 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core nfs lockd nfs_acl sunrpc
ds
 yenta_socket pcmcia_core dm_mirror dm_multipath dm_mod button battery ac
uhci_h
cd ehci_hcd hw_random shpchp e1000 floppy sg ext3 jbd aic79xx sd_mod scsi_mod
Pid: 7838, comm: ib_mad1 Not tainted 2.6.9-34.ELsmp
RIP: 0010:[]
{:ib_ipoib:path_rec_completion+
178}
RSP: 0018:0101a756bc70  EFLAGS: 00010202
warning: many lost ticks.
Your time source seems to be instable or some driver is hogging interupts
rip mwait_idle+0x56/0x7c
RAX:  RBX:  RCX: 
RDX: 0101bbeffc80 RSI:  RDI: fffc
RBP: 0101bbeffc80 R08: 0003 R09: 0101bbeffca0
R10: 8011dfe0 R11: 8011dfe0 R12: 1b60167f
R13: fffc R14:  R15: 1b6012ff
FS:  () GS:804d7b00() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 006cf5e8 CR3: 00101000 CR4: 06e0
Process ib_mad1 (pid: 7838, threadinfo 0101a756a000, task 0101bdc3b030)
Stack: a00e547d 0101afda5000 0002 0101afda5380
   0246 0246 802ab017 0101bc16a500
   0101bbeffca0 0101bbeffc80
Call Trace:{:ib_sa:ib_sa_path_rec_callback+0}
   {dev_queue_xmit+525}
{:ib_ipoib:path_
rec_completion+885}
   {:ib_sa:ib_sa_path_rec_callback+64}
   {:ib_sa:send_handler+74}
{:ib_mad:ib_
mad_complete_send_wr+418}
   {:ib_mad:ib_mad_completion_handler+979}
   {:ib_mad:ib_mad_completion_handler+0}
   {worker_thread+419}
{default_wake_fun
ction+0}
   {default_wake_function+0}
{keventd_cr
eate_kthread+0}
   {worker_thread+0}
{keventd_create_kth
read+0}
   {kthread+200} {child_rip+8}
   {keventd_create_kthread+0}
{kthread+0
}
   {child_rip+0}

Code: 49 8b 74 24 08 50 0f b6 42 16 50 0f b6 42 15 50 0f b6 42 14
RIP {:ib_ipoib:path_rec_completion+178} RSP
<0101a756bc70>
 <0>Kernel panic - not syncing: Oops




--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.

Re: [openib-general] [PATCH] RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count

2006-10-18 Thread Sean Hefty
Krishna Kumar wrote:
>   struct rdma_id_private *id_priv;
> + int did_acquire_dev = 0;

See my other mail that gets rid of this flag.

> @@ -1776,6 +1778,8 @@ int rdma_bind_addr(struct rdma_cm_id *id
>  
>   return 0;
>  err:
> + if (did_acquire_dev)
> + cma_detach_from_dev(id_priv);

We need to lock around cma_detach_from_dev().

- Sean




Re: [openib-general] [PATCH] rdma_bind_addr() leaks a cma_dev reference count

2006-10-18 Thread Sean Hefty
>Actually that will not work, since the undo operation is for when the
>next operation (cma_get_port()) fails after we did an acquire_dev,
>and in that case the refcount needs to be dropped. So I am not
>able to avoid using an extra flag to indicate that a ref was got some
>time in the past, and drop it in the error path. I will send that out now.

Let's try something like this then (untested):

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 18a4366..0d06431 100755
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1859,16 +1859,20 @@ int rdma_bind_addr(struct rdma_cm_id *id
mutex_unlock(&lock);
}
if (ret)
-   goto err;
+   goto err1;
}
 
memcpy(&id->route.addr.src_addr, addr, ip_addr_size(addr));
ret = cma_get_port(id_priv);
if (ret)
-   goto err;
+   goto err2;
 
return 0;
-err:
+err2:
+   mutex_lock(&lock);
+   cma_detach_from_dev(id_priv);
+   mutex_unlock(&lock);
+err1:
cma_comp_exch(id_priv, CMA_ADDR_BOUND, CMA_IDLE);
return ret;
 }

- Sean
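
The two-label unwind in the patch can be illustrated outside the kernel.
This is only a userspace sketch with stand-in types; struct fake_id, the
helper functions, and the failure flags are hypothetical, and the mutex
around the detach step is left out:

```c
/* Stand-ins for the kernel objects; all names here are illustrative. */
struct fake_id {
    int state_bound;   /* set once the id moved to "address bound" */
    int dev_refs;      /* references held on the device */
};

static int acquire_dev(struct fake_id *id)
{
    id->dev_refs++;
    return 0;
}

static void detach_from_dev(struct fake_id *id)
{
    id->dev_refs--;
}

/*
 * err2 drops the device reference and then falls through to err1, which
 * undoes the state change.  A failure before the reference was taken
 * jumps straight to err1, so the reference count is never touched twice.
 */
int bind_addr(struct fake_id *id, int acquire_fails, int get_port_fails)
{
    int ret;

    id->state_bound = 1;

    ret = acquire_fails ? -1 : acquire_dev(id);
    if (ret)
        goto err1;

    ret = get_port_fails ? -1 : 0;   /* stands in for the port lookup */
    if (ret)
        goto err2;

    return 0;
err2:
    detach_from_dev(id);
err1:
    id->state_bound = 0;
    return ret;
}
```

Ordering the labels in reverse order of acquisition is what removes the
need for a did_acquire_dev style flag in this sketch.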




Re: [openib-general] Tools for development

2006-10-18 Thread Eitan Zahavi
Steve Wise wrote:
> On Wed, 2006-10-18 at 14:39 +0200, Michael S. Tsirkin wrote:
>   
>> Quoting r. Jeff Squyres <[EMAIL PROTECTED]>:
>> 
>>> However, this is a larger audience than was on the call.  Is there a  
>>> significant movement here from the developers to move to 100% git?
>>>   
>> Life would be somewhat easier for me with 100% git.
>>
>> 
>
> Probably for everyone.
>   
Not for me. I hate to move from SVN.
>





Re: [openib-general] [PATCH] osm: port to WinIB stack - moving win-specific to config.h

2006-10-18 Thread Hal Rosenstock
On Wed, 2006-10-18 at 12:15, Yevgeny Kliteynik wrote:
> Hi Hal
> 
> As we discussed previously, I've added config.h in
> windows, and removed windows-specific defines from 
> the common OSM files:
>   opensm/osm_log.c
>   opensm/osm_prtn.c
>   opensm/osm_subnet.c
> 
> --
> Yevgeny
> 
> Signed-off-by:  Yevgeny Kliteynik <[EMAIL PROTECTED]>

Excellent. Thanks. Applied.

-- Hal






[openib-general] [PATCH] osm: port to WinIB stack - moving win-specific to config.h

2006-10-18 Thread Yevgeny Kliteynik
Hi Hal

As we discussed previously, I've added config.h in
windows, and removed windows-specific defines from 
the common OSM files:
opensm/osm_log.c
opensm/osm_prtn.c
opensm/osm_subnet.c

--
Yevgeny

Signed-off-by:  Yevgeny Kliteynik <[EMAIL PROTECTED]>
 
Index: opensm/osm_log.c
===
--- opensm/osm_log.c(revision 9869)
+++ opensm/osm_log.c(working copy)
@@ -96,10 +96,6 @@ static void truncate_log_file(osm_log_t*
 
 #else /* Windows */
 
-#define fstat _fstat
-#define stat _stat
-#define fileno _fileno
-
 static void truncate_log_file(osm_log_t* const p_log)
 {
fprintf(stderr, "truncate_log_file: cannot truncate on windows system (yet)\n");
Index: opensm/osm_prtn.c
===
--- opensm/osm_prtn.c   (revision 9869)
+++ opensm/osm_prtn.c   (working copy)
@@ -61,11 +61,6 @@
 #include 
 #include 
 
-#ifdef WIN32
-#define snprintf _snprintf
-#define stat _stat
-#endif
-
 extern int osm_prtn_config_parse_file(osm_log_t * const p_log,
  osm_subn_t * const p_subn,
  const char *file_name);
Index: opensm/osm_subnet.c
===
--- opensm/osm_subnet.c (revision 9869)
+++ opensm/osm_subnet.c (working copy)
@@ -658,10 +658,6 @@ __osm_subn_opts_unpack_charp(
   }
 }
 
-#ifdef WIN32
-#define snprintf _snprintf
-#endif
-
 /**
  **/
 static void
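
The patch does not show the Windows config.h itself. A plausible sketch,
assuming it simply gathers the defines removed above, with a hypothetical
helper format_log_name to show common code using the POSIX spelling:

```c
#include <stdio.h>

/*
 * Sketch of what a Windows-only config.h might collect; the real file's
 * contents are not shown in the patch, so this list is an assumption.
 */
#ifdef WIN32
#define snprintf _snprintf
#define fstat    _fstat
#define stat     _stat
#define fileno   _fileno
#endif

/*
 * Common code keeps using the POSIX name on either platform.
 * Returns 1 if "dir/file" fit into buf, 0 if it was truncated or failed.
 */
int format_log_name(char *buf, size_t len, const char *dir, const char *file)
{
    int n = snprintf(buf, len, "%s/%s", dir, file);
    return n >= 0 && (size_t)n < len;
}
```

Keeping the mapping in one header means files such as osm_log.c and
osm_prtn.c need no #ifdef WIN32 blocks of their own.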




Re: [openib-general] OpenFabrics Developer Summit at SC06, Tampa Nov 16 - 17

2006-10-18 Thread Tang, Changqing
 
Has the registration site been set up?

--CQ

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jeff Squyres
Sent: Tuesday, October 17, 2006 6:57 AM
To: Bill Boas
Cc: Open Fabrics; openib-general@openib.org; [EMAIL PROTECTED]
Subject: Re: [openib-general] OpenFabrics Developer Summit at SC06,
Tampa Nov 16 - 17

I have copied this information to the wiki -- please make all updates
there so that there is a single reference point to find all the
information about the meeting.  Thanks!

 https://openib.org/tiki/tiki-index.php?page=Meeting+Minutes



On Oct 15, 2006, at 5:02 PM, Bill Boas wrote:

> To all in the OpenFabrics Community
>
>
>
> We will be holding our first Developer Summit in the Tampa Convention 
> Center courtesy of SC06 starting at 1.30PM in Room 17 on Thursday 
> November 16, 2006. On Friday November 17, we will start in Room 13 at 
> 8.00 AM and continue till 5.00PM. We have had to schedule into these 
> time slots because no other usable space is available at any other 
> times during the week of SC06!
>
>
>
> OpenFabrics will cater food and beverages for afternoon break and
> supper on Thursday, breakfast, lunch and two breaks on Friday. We will
> set up a registration site at Acteva to collect $$ to cover our out of
> pocket expenses - I'll email out the URL for that site in the next day
> or two.
>
>
>
> Please review the attached Strawman purposes, suggested attendees and
> agenda. Please email any changes or comments to the community so that
> all can comment on them.
>
>
>
> The Summit has several dimensions and themes throughout our work  
> there:
>
> 1) - consistency and robustness of the Linux and Windows software  
> stacks for Release 2.0 of OpenFabrics;
>
> 2) - feature selection, development resources and timelines for  
> Release 2.0;
>
> 3) - activities, features and processes of the Enterprise Working
> Group on OFED 1.x until Release 2.0 is ready for hand-off to the EWG;
>
> 4) - enhancing the resources of the EWG to be ready for 2.0 so
> that it may subsequently be distributed as OFED 2.0 and adopted
> by the OpenFabrics vendor and customer communities for production use.
>
>
>
> This is far too much work for just a day and a half! PLEASE START
> NOW exchanging ideas for additional features, contacting peer
> engineers from companies and customers to discuss work sizing and
> development resources, and identifying volunteer developers for items
> so that when we meet on the 16th we're not starting from a blank sheet!
>
>
>
> Sujal Das, Johann George, Matt Leininger, Pramod Srivatsa, Hal
> Rosenstock, Tom Tucker and Bob Woodruff are leading the pre-meeting
> STRAWMAN collation of requirements, feature prioritization,
> developer assignments, sizing and processes so that we have the
> list largely complete prior to the meeting and people know who
> has already volunteered for items from the list.
>
>
>
> Bill Boas
>
> VP, Business Development | System Fabric Works
>
> [EMAIL PROTECTED] | 510-375-8840
>
>
>
> 
> 
> ___
> openfabrics-ewg mailing list
> [EMAIL PROTECTED]
> http://openib.org/mailman/listinfo/openfabrics-ewg


-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems





Re: [openib-general] Tools for development

2006-10-18 Thread Steve Wise
On Wed, 2006-10-18 at 14:39 +0200, Michael S. Tsirkin wrote:
> Quoting r. Jeff Squyres <[EMAIL PROTECTED]>:
> > However, this is a larger audience than was on the call.  Is there a  
> > significant movement here from the developers to move to 100% git?
> 
> Life would be somewhat easier for me with 100% git.
> 

Probably for everyone.





[openib-general] IPoIB multicast neighbour?!

2006-10-18 Thread Or Gerlitz
While debugging something, I turned on IPoIB debug messages and saw:

ib0: neigh_destructor for ff ff12:601b::::::0002

Do you have any idea what the source of this neighbour is? Why is it created, and
is there a way to eliminate it (my guess is that removing IPv6 support
from the kernel would do that)?

It's a RH4 U3 system with OFED 1.1 rc7.

more info below, thanks.

Or.

# ip a s ib0
9: ib0:  mtu 1500 qdisc pfifo_fast qlen 128
link/[32] 00:02:04:04:fe:80:00:00:00:00:00:00:00:08:f1:04:03:97:08:c5
  brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
inet 192.169.3.235/24 brd 192.169.3.255 scope global ib0
inet6 fe80::208:f104:397:8c5/64 scope link
   valid_lft forever preferred_lft forever

# ip m s ib0
9:  ib0
link  00:ff:ff:ff:ff:12:40:1b:00:00:00:00:00:00:00:00:00:00:00:01
link  00:ff:ff:ff:ff:12:60:1b:00:00:00:00:00:00:00:01:ff:97:08:c5
link  00:ff:ff:ff:ff:12:60:1b:00:00:00:00:00:00:00:00:00:00:00:01
inet  224.0.0.1
inet6 ff02::1:ff97:8c5
inet6 ff02::1





Re: [openib-general] OFED-1.1-pre1 is ready

2006-10-18 Thread Or Gerlitz
Michael S. Tsirkin wrote:
> Quoting r. Or Gerlitz <[EMAIL PROTECTED]>:
>>> Try something like
>>> git log v2.6.18-rc6..936b9fc0bd1411b52826213a5d89e2ceb4f52a78
>>> to get the list of OFED changes against v2.6.18-rc6.
>> Thanks for all the info. However, I think the OFED docs must state which
>> upstream version the OFED kernel IB drivers are based on (i.e. in this
>> case the 2.6.18-rc6 tag of Linus' tree), so that one can determine that
>> from reading the docs alone (i.e. without using git).
> 
> Makes sense. Care to formulate the appropriate wording?
> Which document should this go into?

OK, something in the spirit of the text below (remove the XXX):

I will be able to produce something better tomorrow morning.

Or.

# Kernel

Based on the XXX=2.6.18-rc6 mainline kernel. The patches to this mainline
kernel are included within the OFED sources; please see the YYY doc for
their location and how to apply them to the kernel sources.








Re: [openib-general] [PATCH TRIVIAL] opensm: fix function name in the comment

2006-10-18 Thread Hal Rosenstock
On Wed, 2006-10-18 at 09:44, Sasha Khapyorsky wrote:
> This trivially fixes the function name (osm_switch_set_min_lid_size) in the
> comment.
> 
> Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>

Thanks. Applied.

-- Hal





[openib-general] [PATCH TRIVIAL] opensm: fix function name in the comment

2006-10-18 Thread Sasha Khapyorsky

This trivially fixes the function name (osm_switch_set_min_lid_size) in the
comment.

Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
---
 osm/include/opensm/osm_switch.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/osm/include/opensm/osm_switch.h b/osm/include/opensm/osm_switch.h
index 8c4799f..0cf7542 100644
--- a/osm/include/opensm/osm_switch.h
+++ b/osm/include/opensm/osm_switch.h
@@ -440,9 +440,9 @@ osm_switch_set_hops(
 * SEE ALSO
 */
 
-/f* OpenSM: Switch/osm_switch_set_hops
+/f* OpenSM: Switch/osm_switch_set_min_lid_size
 * NAME
-*  osm_switch_set_hops
+*  osm_switch_set_min_lid_size
 *
 * DESCRIPTION
 *  Sets the size of the switch's routing table to at least accomodate the
-- 
1.4.2.3.g128e





Re: [openib-general] Tools for development

2006-10-18 Thread Michael S. Tsirkin
Quoting r. Jeff Squyres <[EMAIL PROTECTED]>:
> Does anyone have any sysadmin cycles to do this kind of stuff?  I  
> would expect it to be a flurry of activity here in the beginning  
> followed by short bursts of activity separated by long periods of  
> nothing.

FWIW, I can help out keeping the git tool updated - I'm doing it at
Mellanox now and it's quite trivial. In particular, this can be
done without central admin privileges - git does not need to be suid root, and
it's easy to set git up to run from some "git-admin" user's home directory.

Playing with softlinks makes it quite easy for this user to update
git for everyone.
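
One way the softlink approach could look is sketched below. This is only an
illustration: the function name, the "current" link, and the versioned
directory names are hypothetical and do not describe the actual Mellanox
setup. The idea is that everyone's PATH points at a stable symlink, and the
"git-admin" user repoints that symlink after building a new version.

```python
import os


def switch_current(link_path, new_target):
    """Atomically repoint the symlink at link_path to new_target.

    A temporary symlink is created next to the real one and then renamed
    over it, so users never observe a missing link during the update.
    """
    tmp_link = link_path + ".tmp"
    # Clean up a stale temporary link left by an interrupted earlier run.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(new_target, tmp_link)
    # rename() over an existing symlink replaces the link itself on POSIX.
    os.replace(tmp_link, link_path)


if __name__ == "__main__":
    # Hypothetical layout: ~git-admin/git-<version>/ plus a "current" link.
    home = os.path.expanduser("~")
    print("would repoint", os.path.join(home, "current"),
          "to a freshly built", os.path.join(home, "git-<version>"))
```

With such a layout, users would only need the "current" link's bin directory
on their PATH, and an upgrade becomes a single call to switch_current().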

-- 
MST




Re: [openib-general] Tools for development

2006-10-18 Thread Jeff Squyres
On Oct 17, 2006, at 12:45 PM, Michael S. Tsirkin wrote:

>> Developers had requested git 1.4, but Ubuntu had an older version.  We
>> went ahead and installed git from source.  I'd prefer to stick to Ubuntu
>> packages if possible.
>
> We have much to gain from newer versions - just look at the gitweb
> change log.
> But my assumption here was that someone will keep the built-from-source
> tools updated. I don't have a problem alerting the list when new
> versions come out.
>
> If, as Roland suggested, we'll be stuck at this version, it's better
> to stick with distro-supplied ones, assuming that *that* is updated
> in a timely fashion.
>
> So, I guess the question is how is the system supported/updated?

This is probably the operative question.  I volunteered to
set up and maintain trac if the group decides to use it.

I don't know what the plan is for supporting the other software
packages.  I, too, would side with Michael: relatively recent
versions of svn (although this may become moot) and trac tend to be
beneficial to developers (I assume the same is true for git, but I
have no direct experience).

Does anyone have any sysadmin cycles to do this kind of stuff?  I  
would expect it to be a flurry of activity here in the beginning  
followed by short bursts of activity separated by long periods of  
nothing.

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems





Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...

2006-10-18 Thread Doug Ledford
On Wed, 2006-10-18 at 09:29 +0200, Michael S. Tsirkin wrote:
> Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> > > > From our discussion, it seems we should be able to just push the
> > > small number of missing bits into RHEL5 directly. That would be
> > > nicer of course.
> > 
> > It depends.  If there's lots of individual changes, it might be easier
> > to push the OFED 1.1 change.  But, that depends on when the final OFED
> > 1.1 comes out and how much it varies from the existing RPMs.
> 
> OFED is in deep freeze, so you can already look at it to estimate the
> amount of changes against 2.6.18.
> Could you look at the diff please so that I know whether it's worth it
> to invest in building the minimal patch set for pushing into RHEL5,
> or whether you'll push OFED 1.1 into RHEL kernel as is?

Yeah, I'll look over the diff today.

-- 
Doug Ledford <[EMAIL PROTECTED]>
  GPG KeyID: CFBFF194
  http://people.redhat.com/dledford

Infiniband specific RPMs available at
  http://people.redhat.com/dledford/Infiniband



Re: [openib-general] [openfabrics-ewg] OFED 1.1-RC7 build problem on SLES10

2006-10-18 Thread Erez Zilber
When I ran the install script without having the kernel sources rpm on 
SLES 10, I had to wait several minutes before it failed. Shouldn't the 
script check such dependencies before starting the build process?

Erez

Vladimir Sokolovsky wrote:
>
> Hi,
> The OFED installation script checks that the directory
> /lib/modules/`uname -r`/build/ and the file
> /lib/modules/`uname -r`/build/Makefile exist.
> It does not check for the kernel-source RPM because kernels from
> kernel.org are also supported.
>
>
> --
>
> Regards,
> Vladimir
>
> On Tue, 2006-10-17 at 15:45 -0700, Scott Weitzenkamp (sweitzen) wrote:
> > You need the kernel-source RPM, I guess the OFED install.sh should check
> > for that RPM.
> >
> > svbu-qa-opteron-1:~ # uname -a
> > Linux svbu-qa-opteron-1 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 i686 athlon i386 GNU/Linux
> > svbu-qa-opteron-1:~ # rpm -qa | fgrep kernel
> > kernel-source-2.6.16.21-0.8
> > kernel-smp-2.6.16.21-0.8
> > kernel-ib-1.1-2.6.16.21_0.8_smp
> > kernel-ib-devel-1.1-2.6.16.21_0.8_smp
> > svbu-qa-opteron-1:~ # ls /usr/src/linux-2.6.16.21-0.8-obj/i386/smp
> > .config         Makefile        arch     include2
> > .kernelrelease  Module.symvers  include  scripts
> >
> > Scott Weitzenkamp
> > SQA and Release Manager
> > Server Virtualization Business Unit
> > Cisco Systems
> > 
> >
> > > -Original Message-
> > > From: [EMAIL PROTECTED]
> > > [mailto:[EMAIL PROTECTED] On Behalf Of Chris Dennett
> > > Sent: Tuesday, October 17, 2006 12:46 PM
> > > To: [EMAIL PROTECTED]; openib-general@openib.org
> > > Subject: [openib-general] OFED 1.1-RC7 build problem on SLES10
> > >
> > > I've been trying to install OFED 1.1 RC7 on an x86 server
> > > with a fresh install
> > > of SLES10 (32-bit).  It errors out when trying to build the
> > > kernel modules. 
> > > I've included what I think are the relevant log messages
> > > below.  I've tried
> > > installing everything (minus iser and tvflash) or just the
> > > modules needed for
> > > SRP.  I've installed 1.1 RC7 successfully on other RedHat
> > > servers without any
> > > problems.  I am installing as root.  Any help would be appreciated.
> > >
> > > Thanks.
> > >
> > > -Chris
> > >
> > > ==
> > > + make kernel
> > > Building kernel modules
> > > Kernel version: 2.6.16.21-0.8-smp
> > > Modules directory: //lib/modules/2.6.16.21-0.8-smp
> > > Kernel sources: /lib/modules/2.6.16.21-0.8-smp/build
> > > env EXTRA_CFLAGS=" -I/var/tmp/OFEDRPM/BUILD/openib-1.1/include
> > > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/include \
> > >
> > > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/ulp/ipoib \
> > >
> > > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/debug" \
> > > make -C /lib/modules/2.6.16.21-0.8-smp/build
> > > SUBDIRS="/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband"
> > > KERNELRELEASE=2.6.16.21-0.8-smp \
> > > EXTRAVERSION=.21-0.8-smp V=1  \
> > > CONFIG_INFINIBAND=m \
> > > CONFIG_INFINIBAND_IPOIB=m \
> > > CONFIG_INFINIBAND_SDP= \
> > > CONFIG_INFINIBAND_SRP=m \
> > > CONFIG_INFINIBAND_USER_MAD=m \
> > > CONFIG_INFINIBAND_USER_ACCESS=m \
> > > CONFIG_INFINIBAND_ADDR_TRANS=y \
> > > CONFIG_INFINIBAND_MTHCA=m \
> > > CONFIG_INFINIBAND_IPOIB_DEBUG=y \
> > > CONFIG_INFINIBAND_ISER= \
> > > CONFIG_INFINIBAND_EHCA= \
> > > CONFIG_INFINIBAND_RDS= \
> > > CONFIG_INFINIBAND_RDS_DEBUG= \
> > > CONFIG_INFINIBAND_IPOIB_DEBUG_DATA= \
> > > CONFIG_INFINIBAND_SDP_SEND_ZCOPY= \
> > > CONFIG_INFINIBAND_SDP_RECV_ZCOPY= \
> > > CONFIG_INFINIBAND_SDP_DEBUG= \
> > > CONFIG_INFINIBAND_SDP_DEBUG_DATA= \
> > > CONFIG_INFINIBAND_IPATH= \
> > > CONFIG_INFINIBAND_MTHCA_DEBUG=y \
> > > CONFIG_INFINIBAND_MADEYE= \
> > > LINUXINCLUDE='-I/var/tmp/OFEDRPM/BUILD/openib-1.1/include \
> > >
> > > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/include \
> > > -Iinclude \
> > > $(if $(KBUILD_SRC),-Iinclude2 -I$(srctree)/include) \
> > > -include include/linux/autoconf.h \
> > > -include
> > > /var/tmp/OFEDRPM/BUILD/openib-1.1/include/linux/autoconf.h \
> > > ' \
> > > modules
> > > make[1]: Entering directory
> > > `/usr/src/linux-2.6.16.21-0.8-obj/i386/smp'
> > > make[1]: *** No rule to make target `modules'.  Stop.
> > > make[1]: Leaving directory `/usr/src/linux-2.6.16.21-0.8-obj/i386/smp'
> > > make: *** [kernel] Error 2
> > > error: Bad exit status from /var/tmp/rpm-tmp.92052 (%install)
> > >
> > >
> > > RPM build errors:
> > > user vlad does not exist - using root
> > > group mtl does not exist - using root
> > > user vlad does not exist - using root
> > > group mtl does not exist - using root
> > > Bad exit status from /var/tmp/rpm-tmp.92052 (%install)
> > > ERROR: Failed executing "rp

Re: [openib-general] Tools for development

2006-10-18 Thread Michael S. Tsirkin
Quoting r. Jeff Squyres <[EMAIL PROTECTED]>:
> However, this is a larger audience than was on the call.  Is there a  
> significant movement here from the developers to move to 100% git?

Life would be somewhat easier for me with 100% git.

-- 
MST




[openib-general] New DNS name for openfabrics.org

2006-10-18 Thread Jeff Squyres
Who runs the DNS for openfabrics.org?

Could we get a new DNS A record added for the new server:
staging.openfabrics.org -> 69.55.231.195?

Thanks!

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems





Re: [openib-general] Tools for development

2006-10-18 Thread Jeff Squyres
I was not on the call last week, but I understand that there was some  
discussion about exactly this point (ditch SVN and go 100% git): the  
decision was to stick with SVN for userspace stuff and stick with git  
for kernel stuff.

However, this is a larger audience than was on the call.  Is there a  
significant movement here from the developers to move to 100% git?

(I don't really care)


On Oct 17, 2006, at 12:21 PM, Sasha Khapyorsky wrote:

> On 17:04 Tue 17 Oct , Michael S. Tsirkin wrote:
>> Quoting r. Steve Wise <[EMAIL PROTECTED]>:
>>> At the risk of opening a can of worms, is there any reason we don't move
>>> the user stuff into its own git tree?  This would get rid of svn
>>> altogether...
>>
>> If we do, that should probably be multiple git trees - verbs, management,
>> tests are all more or less independent and developed mostly by
>> different people.
>
> Reasonable. And generally this should not be too bad.
>
> Sasha
>


-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems





Re: [openib-general] Tools for development

2006-10-18 Thread Jeff Squyres
On Oct 17, 2006, at 9:45 AM, Michael S. Tsirkin wrote:

>> It seems like trac can integrate with both SVN and git and would also
>> provide us with integrated wiki capabilities.
>
> One feature that bugzilla has (and that seems to be disabled in the
> openib bugzilla :() is mail integration, where I can Cc bugzilla and the
> mail contents will get attached to the bug report. I was hoping that the
> new server will have this capability. Does trac have this?

Good question; I don't know.  I'll find out.

-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems





Re: [openib-general] [PATCH] opensm: misc fixes in lft dump file parser

2006-10-18 Thread Hal Rosenstock
On Tue, 2006-10-17 at 14:28, Sasha Khapyorsky wrote:
> There are misc small fixes for lft dump parser:
> - merge ERROR and SYS logging in single osm_log() call
> - more strict strtoul() results checking
> - fix potential bugs with invalid dump files
> - break too long lines
> 
> Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>

Thanks. Applied.

-- Hal





Re: [openib-general] sysfs exposure of port counters useless?

2006-10-18 Thread Hal Rosenstock
On Tue, 2006-10-17 at 23:49, Hal Rosenstock wrote:
[snip...]
> > For IB counters in a Cisco switch, we read and reset the 32-bit counters
> > once per second and keep 64-bit counters internally.
> 
> 32 bit byte counters can be pegged in only 16 seconds on a 4x SDR link
> and there are 4x DDR links now (8 seconds) and 12x links (5 seconds) so
> that strategy is inaccurate on busy networks.

I was a little sleepy... I take back the last part of the comment. Polling
once per second should be frequent enough. The only issue with this approach
is skew in the reading of the port counters, since the interval is not as
precise as it could be, but that is likely to be good enough for these purposes.
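
The saturation times quoted above can be roughly reproduced with a short
calculation. This is a back-of-the-envelope sketch, not taken from the
thread: it assumes 2.5 Gb/s signalling per SDR lane, 8b/10b encoding, and a
data counter that ticks once per 4 bytes. Under those assumptions the
results land in the same ballpark as the 16, 8 and 5 second figures.

```python
# Assumptions (not stated in the thread): SDR lanes signal at 2.5 Gb/s,
# 8b/10b encoding leaves 80% of that for data, and the 32-bit data
# counter increments once per 4 bytes transferred.
LANE_SIGNAL_BITS_PER_S = 2.5e9
ENCODING_EFFICIENCY = 0.8
BYTES_PER_TICK = 4
COUNTER_MAX = 2**32 - 1


def seconds_to_saturate(lanes, speed_multiplier=1):
    """Time for a 32-bit data counter to peg at full line rate."""
    data_bits_per_s = (lanes * speed_multiplier
                       * LANE_SIGNAL_BITS_PER_S * ENCODING_EFFICIENCY)
    ticks_per_s = data_bits_per_s / 8 / BYTES_PER_TICK
    return COUNTER_MAX / ticks_per_s


for name, lanes, mult in [("4x SDR", 4, 1), ("4x DDR", 4, 2), ("12x SDR", 12, 1)]:
    print(f"{name}: about {seconds_to_saturate(lanes, mult):.1f} s")
```

Every case comes out well above one second, which is consistent with the
conclusion that a one-second read-and-reset interval does not lose counts at
these link speeds.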

-- Hal





Re: [openib-general] [openfabrics-ewg] OFED 1.1-RC7 build problem on SLES10

2006-10-18 Thread Vladimir Sokolovsky
Hi,
The OFED installation script checks that the directory
/lib/modules/`uname -r`/build/ and the file
/lib/modules/`uname -r`/build/Makefile exist.
It does not check for the kernel-source RPM because kernels from
kernel.org are also supported.
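
The check described above can be sketched as follows. This is not the actual
OFED install script; the function name and the modules_root parameter are
made up for illustration.

```python
import os


def kernel_build_tree_problems(release=None, modules_root="/lib/modules"):
    """Return a list of problems with the kernel build tree.

    An empty list means both the build directory and its Makefile exist.
    The function name and parameters are hypothetical.
    """
    release = release or os.uname().release
    build_dir = os.path.join(modules_root, release, "build")
    makefile = os.path.join(build_dir, "Makefile")
    problems = []
    if not os.path.isdir(build_dir):
        problems.append(f"missing directory: {build_dir}")
    elif not os.path.isfile(makefile):
        problems.append(f"missing file: {makefile}")
    return problems


if __name__ == "__main__":
    for problem in kernel_build_tree_problems():
        print("ERROR:", problem)
```

Note that the build log quoted below fails with "No rule to make target
`modules'" inside the build directory, which suggests that a check of this
kind can pass while the build still fails later; an early test of the
`modules' target itself might catch that case sooner.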


-- 

Regards,
Vladimir

On Tue, 2006-10-17 at 15:45 -0700, Scott Weitzenkamp (sweitzen) wrote:
> You need the kernel-source RPM, I guess the OFED install.sh should check
> for that RPM.
> 
> svbu-qa-opteron-1:~ # uname -a
> Linux svbu-qa-opteron-1 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 i686 athlon i386 GNU/Linux
> svbu-qa-opteron-1:~ # rpm -qa | fgrep kernel
> kernel-source-2.6.16.21-0.8
> kernel-smp-2.6.16.21-0.8
> kernel-ib-1.1-2.6.16.21_0.8_smp
> kernel-ib-devel-1.1-2.6.16.21_0.8_smp
> svbu-qa-opteron-1:~ # ls /usr/src/linux-2.6.16.21-0.8-obj/i386/smp
> .config         Makefile        arch     include2
> .kernelrelease  Module.symvers  include  scripts
> 
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
>  
> 
> > -Original Message-
> > From: [EMAIL PROTECTED] 
> > [mailto:[EMAIL PROTECTED] On Behalf Of Chris Dennett
> > Sent: Tuesday, October 17, 2006 12:46 PM
> > To: [EMAIL PROTECTED]; openib-general@openib.org
> > Subject: [openib-general] OFED 1.1-RC7 build problem on SLES10
> > 
> > I've been trying to install OFED 1.1 RC7 on an x86 server 
> > with a fresh install 
> > of SLES10 (32-bit).  It errors out when trying to build the 
> > kernel modules.  
> > I've included what I think are the relevant log messages 
> > below.  I've tried 
> > installing everything (minus iser and tvflash) or just the 
> > modules needed for 
> > SRP.  I've installed 1.1 RC7 successfully on other RedHat 
> > servers without any 
> > problems.  I am installing as root.  Any help would be appreciated.
> > 
> > Thanks.
> > 
> > -Chris
> > 
> > ==
> > + make kernel
> > Building kernel modules
> > Kernel version: 2.6.16.21-0.8-smp
> > Modules directory: //lib/modules/2.6.16.21-0.8-smp
> > Kernel sources: /lib/modules/2.6.16.21-0.8-smp/build
> > env EXTRA_CFLAGS=" -I/var/tmp/OFEDRPM/BUILD/openib-1.1/include 
> > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/include \
> > 
> > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/ulp/ipoib \
> > 
> > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/debug" \
> > make -C /lib/modules/2.6.16.21-0.8-smp/build 
> > SUBDIRS="/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband" 
> > KERNELRELEASE=2.6.16.21-0.8-smp \
> > EXTRAVERSION=.21-0.8-smp V=1  \
> > CONFIG_INFINIBAND=m \
> > CONFIG_INFINIBAND_IPOIB=m \
> > CONFIG_INFINIBAND_SDP= \
> > CONFIG_INFINIBAND_SRP=m \
> > CONFIG_INFINIBAND_USER_MAD=m \
> > CONFIG_INFINIBAND_USER_ACCESS=m \
> > CONFIG_INFINIBAND_ADDR_TRANS=y \
> > CONFIG_INFINIBAND_MTHCA=m \
> > CONFIG_INFINIBAND_IPOIB_DEBUG=y \
> > CONFIG_INFINIBAND_ISER= \
> > CONFIG_INFINIBAND_EHCA= \
> > CONFIG_INFINIBAND_RDS= \
> > CONFIG_INFINIBAND_RDS_DEBUG= \
> > CONFIG_INFINIBAND_IPOIB_DEBUG_DATA= \
> > CONFIG_INFINIBAND_SDP_SEND_ZCOPY= \
> > CONFIG_INFINIBAND_SDP_RECV_ZCOPY= \
> > CONFIG_INFINIBAND_SDP_DEBUG= \
> > CONFIG_INFINIBAND_SDP_DEBUG_DATA= \
> > CONFIG_INFINIBAND_IPATH= \
> > CONFIG_INFINIBAND_MTHCA_DEBUG=y \
> > CONFIG_INFINIBAND_MADEYE= \
> > LINUXINCLUDE='-I/var/tmp/OFEDRPM/BUILD/openib-1.1/include \
> > 
> > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/include \
> > -Iinclude \
> > $(if $(KBUILD_SRC),-Iinclude2 -I$(srctree)/include) \
> > -include include/linux/autoconf.h \
> > -include 
> > /var/tmp/OFEDRPM/BUILD/openib-1.1/include/linux/autoconf.h \
> > ' \
> > modules
> > make[1]: Entering directory 
> > `/usr/src/linux-2.6.16.21-0.8-obj/i386/smp'
> > make[1]: *** No rule to make target `modules'.  Stop.
> > make[1]: Leaving directory `/usr/src/linux-2.6.16.21-0.8-obj/i386/smp'
> > make: *** [kernel] Error 2
> > error: Bad exit status from /var/tmp/rpm-tmp.92052 (%install)
> > 
> > 
> > RPM build errors:
> > user vlad does not exist - using root
> > group mtl does not exist - using root
> > user vlad does not exist - using root
> > group mtl does not exist - using root
> > Bad exit status from /var/tmp/rpm-tmp.92052 (%install)
> > ERROR: Failed executing "rpmbuild --rebuild --define '_topdir 
> > /var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' --define 
> > 'build_root 
> > /var/tmp/OFED' --define 'configure_options --with-libibcommon 
> > --with-libibmad 
> > --with-libibumad --with-libibverbs --with-libmthca --with-opensm 
> > --with-librdmacm --with-openib-diags --with-srptools --with-mstflint 
> > --with-perftest --with-ipoib-mod --with-mthca-mod --with-srp-mod 
> > --with-core-mod --with-us

Re: [openib-general] srp trouble on RHEL4 U4

2006-10-18 Thread Lakshmanan, Madhu
> From: Mirochnick Natalia [mailto:[EMAIL PROTECTED] 
> Subject: Re: [openib-general] srp trouble on RHEL4 U4
> 
> I've changed the string as you've advised, but it didn't work. The only
> difference is that the string "" was added to /var/log/messages.
> 
> [EMAIL PROTECTED] echo -n 
> id_ext=0001,ioc_guid=00066a01380001de,dgid=fe8
> 66a0261de,pkey=,service_id=49435353525
> 0,io_class=ff00 
>  > /sys/class/infiniband_srp/srp-mthca0-1/add_target
> 
> [EMAIL PROTECTED]  tail /var/log/messages
> Oct 18 14:01:26 ... kernel:   REJ reason 0x0
> Oct 18 14:01:26 ... kernel: ib_srp: Connection failed
> 
> By the way, the OFED srp_release_notes.txt does not say that io_class
> is a mandatory parameter.
> 
> Natalia

That is an error in the OFED 1.0 srp_release_notes.txt. For all
Silverstorm SRP targets (like the Silverstorm 5000 switch with Fibre
Channel IOC), the 'io_class=ff00' parameter is mandatory.

Let me investigate a little more, and get back to you.

Madhu




Re: [openib-general] srp trouble on RHEL4 U4

2006-10-18 Thread Mirochnick Natalia
I've changed the string as you've advised, but it didn't work. The only
difference is that the string "" was added to /var/log/messages.

[EMAIL PROTECTED] echo -n 
id_ext=0001,ioc_guid=00066a01380001de,dgid=fe866a0261de,pkey=,service_id=494353535250,io_class=ff00
 
 > /sys/class/infiniband_srp/srp-mthca0-1/add_target

[EMAIL PROTECTED]  tail /var/log/messages
Oct 18 14:01:26 ... kernel:   REJ reason 0x0
Oct 18 14:01:26 ... kernel: ib_srp: Connection failed

By the way, the OFED srp_release_notes.txt does not say that io_class is a
mandatory parameter.

Natalia
- Original Message - 

>> Madhu Lakshmanan wrote:
>> Which SRP target are you using? Could you also give some more details on
>> the fabric setup; i.e. what IB switch / gateway your host is connected
>> to, and what kind of storage you wish to access? The full command that
>> you used (echo -n  > /add_target) to configure the SRP target
>> would be very useful as well.
>
> From: Mirochnick Natalia [mailto:[EMAIL PROTECTED]
> Subject: Re: [openib-general] srp trouble on RHEL4 U4
>
> 2. Here's the details:
>
> IB switch: Silverstorm 5000
> Storage: NetApp FAS 320
>
> [EMAIL PROTECTED] /usr/ofed/sbin/ibsrpdm -c
> id_ext=0001,ioc_guid=00066a01380001de,dgid=fe8
> 66a0261de,pkey=,service_id=49435353525
> 0,io_class=ff00
> id_ext=0001,ioc_guid=00066a02380001de,dgid=fe8
> 66a0261de,pkey=,service_id=49435353525
> 0,io_class=ff00
>
> [EMAIL PROTECTED] echo -n
> id_ext=0001,ioc_guid=00066a01380001de,dgid=fe8
> 66a0261de,pkey=,service_id=494353535250
>  > /sys/class/infiniband_srp/srp-mthca0-1/add_target
>

The problem is with the echo string you are giving. The command should
be invoked as:

echo -n id_ext=0001,ioc_guid=00066a01380001de,dgid=fe8
66a0261de,pkey=,service_id=494353535250,
io_class=ff00 > /sys/class/infiniband_srp/srp-mthca0-1/add_target

You were missing the 'io_class=ff00' bit.

Let me know if it works.

Madhu







Re: [openib-general] srp trouble on RHEL4 U4

2006-10-18 Thread Lakshmanan, Madhu
>> Madhu Lakshmanan wrote:
>> Which SRP target are you using? Could you also give some more details on
>> the fabric setup; i.e. what IB switch / gateway your host is connected
>> to, and what kind of storage you wish to access? The full command that
>> you used (echo -n  > /add_target) to configure the SRP target
>> would be very useful as well.
> 
> From: Mirochnick Natalia [mailto:[EMAIL PROTECTED] 
> Subject: Re: [openib-general] srp trouble on RHEL4 U4
> 
> 2. Here's the details:
> 
> IB switch: Silverstorm 5000
> Storage: NetApp FAS 320
> 
> [EMAIL PROTECTED] /usr/ofed/sbin/ibsrpdm -c
> id_ext=0001,ioc_guid=00066a01380001de,dgid=fe8
> 66a0261de,pkey=,service_id=49435353525
> 0,io_class=ff00
> id_ext=0001,ioc_guid=00066a02380001de,dgid=fe8
> 66a0261de,pkey=,service_id=49435353525
> 0,io_class=ff00
> 
> [EMAIL PROTECTED] echo -n 
> id_ext=0001,ioc_guid=00066a01380001de,dgid=fe8
> 66a0261de,pkey=,service_id=494353535250 
>  > /sys/class/infiniband_srp/srp-mthca0-1/add_target
> 

The problem is with the echo string you are giving. The command should
be invoked as:

echo -n id_ext=0001,ioc_guid=00066a01380001de,dgid=fe8
66a0261de,pkey=,service_id=494353535250,
io_class=ff00 > /sys/class/infiniband_srp/srp-mthca0-1/add_target

You were missing the 'io_class=ff00' bit.
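
A missing key like this can be caught before writing to add_target. The
sketch below is only illustrative: the helper names are hypothetical, the
list of required keys is taken from the target strings shown in this thread
(with io_class included because it is needed for this kind of target), and
the placeholder values in the example are not real identifiers.

```python
# Keys seen in the ibsrpdm output in this thread; treating io_class as
# required is an assumption that matches the advice given above.
REQUIRED_KEYS = ("id_ext", "ioc_guid", "dgid", "pkey", "service_id", "io_class")


def parse_target(line):
    """Split a 'key=value,key=value' target description into a dict."""
    fields = {}
    for item in line.strip().split(","):
        key, _, value = item.partition("=")
        fields[key.strip()] = value.strip()
    return fields


def missing_keys(line, required=REQUIRED_KEYS):
    """Return the required keys that do not appear in the target string."""
    fields = parse_target(line)
    return [key for key in required if key not in fields]


if __name__ == "__main__":
    # Placeholder values only; substitute the output of ibsrpdm -c.
    candidate = "id_ext=ID,ioc_guid=GUID,dgid=DGID,pkey=PKEY,service_id=SID"
    print("missing:", missing_keys(candidate))
```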

Let me know if it works.

Madhu




Re: [openib-general] srp trouble on RHEL4 U4

2006-10-18 Thread Mirochnick Natalia
1. Thanks a lot for your attention.

2. Here are the details:

IB switch: Silverstorm 5000
Storage: NetApp FAS 320

[EMAIL PROTECTED] /usr/ofed/sbin/ibsrpdm -c
id_ext=0001,ioc_guid=00066a01380001de,dgid=fe866a0261de,pkey=,service_id=494353535250,io_class=ff00
id_ext=0001,ioc_guid=00066a02380001de,dgid=fe866a0261de,pkey=,service_id=494353535250,io_class=ff00

[EMAIL PROTECTED] echo -n 
id_ext=0001,ioc_guid=00066a01380001de,dgid=fe866a0261de,pkey=,service_id=494353535250
 
 > /sys/class/infiniband_srp/srp-mthca0-1/add_target

Thanks in advance,

Natalia Mirochnick
- Original Message - 
From: "Lakshmanan, Madhu" <[EMAIL PROTECTED]>
To: "Mirochnick Natalia" <[EMAIL PROTECTED]>; 
Sent: Wednesday, October 18, 2006 11:04 AM
Subject: RE: [openib-general] srp trouble on RHEL4 U4


Which SRP target are you using? Could you also give some more details on
the fabric setup; i.e. what IB switch / gateway your host is connected
to, and what kind of storage you wish to access? The full command that
you used (echo -n  > /add_target) to configure the SRP target
would be very useful as well.

The Silverstorm 7000 is an HCA (Host Channel Adapter). By itself, it
should in most cases not be the primary reason for the error code you
are seeing. The issue is more likely to be due to the SRP target you are
attempting to connect to.

Thanks,

Madhu Lakshmanan
Silverstorm Technologies, Inc.
[EMAIL PROTECTED]


> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of
> Mirochnick Natalia
> Sent: Tuesday, October 17, 2006 10:47 AM
> To: openib-general@openib.org
> Subject: [openib-general] srp trouble on RHEL4 U4
>
> Hello,
>
> I'm trying to set up an SRP connection (SRP in OFED 1.0).
> The IB card is a Silverstorm 7000.
>
> The ib_srp module is loaded, but after an attempt to create an
> SRP device (as described in the manual srp_release_notes.txt),
> this error appears in /var/log/messages:
>  kernel:   REJ reason 0x0
>
> What's wrong?
> -- 
> Thanks in advance,
> Mirochnick Natalia



Re: [openib-general] RHEL5 and OFED ...

2006-10-18 Thread Michael S. Tsirkin
Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> > Hmm, no, I really want to take a srpm from amd64 and get a 32 bit
> > gcc executable that will build 64 bit binaries that exactly match those
> > built on a native amd64 system.
> 
> Between just i386 and x86_64, you might be able to do that.

I guess what I would like is for Red Hat to enable -m64 in the gcc/binutils
of the 32 bit distribution.

Then once I have a 64 bit machine, I could boot a 32 bit distro but
change just the kernel to 64 bit.

-- 
MST




Re: [openib-general] OFED-1.1-pre1 is ready

2006-10-18 Thread Michael S. Tsirkin
Quoting r. Or Gerlitz <[EMAIL PROTECTED]>:
> > Try something like
> > git log v2.6.18-rc6..936b9fc0bd1411b52826213a5d89e2ceb4f52a78
> > to get the list of OFED changes against v2.6.18-rc6.
> 
> Thanks for all the info. However, I think the OFED docs must state which
> upstream version the OFED kernel IB drivers are based on (i.e. in this
> case the 2.6.18-rc6 tag of Linus's tree), so that one can determine that
> from reading the docs alone (i.e. without using Git).

Makes sense. Care to formulate the appropriate wording?
Which document should this go into?

-- 
MST




Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...

2006-10-18 Thread Michael S. Tsirkin
Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> > From our discussion, it seems we should be able to just push the
> > small number of missing bits into RHEL5 directly. That would be
> > nicer of course.
> 
> It depends.  If there's lots of individual changes, it might be easier
> to push the OFED 1.1 change.  But, that depends on when the final OFED
> 1.1 comes out and how much it varies from the existing RPMs.

OFED is in deep freeze, so you can already look at it to estimate the amount of
changes against 2.6.18.
Could you look at the diff please so that I know whether it's worth it
to invest in building the minimal patch set for pushing into RHEL5,
or whether you'll push OFED 1.1 into RHEL kernel as is?

-- 
MST




Re: [openib-general] OFED-1.1-pre1 is ready

2006-10-18 Thread Or Gerlitz
Michael S. Tsirkin wrote:
> Quoting r. Or Gerlitz <[EMAIL PROTECTED]>:
>> Subject: Re: OFED-1.1-pre1 is ready
>>
>> Tziporet Koren wrote:
>>> OFED 1.1-pre1 is available:
>>> URL:
>>> https://openib.org/svn/gen2/branches/1.1/ofed/releases/OFED-1.1-pre1.tgz
>>> Release details:
>>>
>>> BUILD_ID:
>>> OFED-1.1-pre1
>>>
>>> openib-1.1 (REV=9854)
>>> # User space
>>> https://openib.org/svn/gen2/branches/1.1/src/userspace
>>> Git:
>>> ref: refs/heads/ofed_1_1
>>> commit 936b9fc0bd1411b52826213a5d89e2ceb4f52a78
>> Hi Tziporet,
>>
>> I asked Michael this a few days ago and have not had a reply yet: can
>> you clarify where the version of the OFED IB ***kernel*** drivers is
>> stated?
> 
> That's the commit 936b9fc0bd1411b52826213a5d89e2ceb4f52a78 part.

I see.

>> I understand they are typically based on some tag of Linus GIT tree (for 
>> example OFED1.1 uses 2.6.18 - correct?) but i could not find any notice 
>> for that in the docs nor in the per rc emails.

> OFED1.1 was last rebased against 2.6.18-rc6 + a couple of small patches
> touching cma + adding scripts out of kernel modules backports etc. 2.6.18
> wasn't out by code freeze time, but all fixes in 2.6.18 are also in
> OFED 1.1.

> Try something like
> git log v2.6.18-rc6..936b9fc0bd1411b52826213a5d89e2ceb4f52a78
> to get the list of OFED changes against v2.6.18-rc6.

Thanks for all the info. However, I think the OFED docs must state which
upstream version the OFED kernel IB drivers are based on (i.e. in this
case the 2.6.18-rc6 tag of Linus's tree), so that one can determine that
from reading the docs alone (i.e. without using Git).
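
Until the docs say so, the list could be generated for pasting into a
release note with something like the sketch below, built on the git log
command Michael gave. The function name ofed_changes and the optional path
arguments are assumptions made for this example; it has to be run inside a
clone that contains both the upstream tag and the OFED commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of OFED): list commits reachable from the
# OFED commit but not from the upstream base, one line per commit.
# Usage: ofed_changes <base-tag> <ofed-commit> [path...]
ofed_changes() {
    base=$1
    tip=$2
    shift 2
    # Any remaining arguments restrict the log to those paths.
    git log --oneline "$base..$tip" -- "$@"
}
```

For instance, ofed_changes v2.6.18-rc6 936b9fc0bd1411b52826213a5d89e2ceb4f52a78 drivers/infiniband
would limit the list to the IB drivers directory.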

Or.






Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...

2006-10-18 Thread Doug Ledford
On Wed, 2006-10-18 at 08:58 +0200, Michael S. Tsirkin wrote:
> Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> > Subject: Re: [openfabrics-ewg] RHEL5 and OFED ...
> > 
> > On Sun, 2006-10-15 at 12:13 -0400, Doug Ledford wrote:
> > 
> > > > Now for userspace - does RHEL5 include at least libibverbs-1.0?
> > > > This has been released a while back, and Roland makes regular bugfix 
> > > > releases.
> > > 
> > > It includes the OFED 1.0 libibverbs (which makes openmpi complain about
> > > lack of out of band data support, but otherwise seems to work).
> 
> What's out of band data BTW?

Probably just me misremembering the error message. The actual
message is:

[0,1,1][btl_openib_endpoint.c:945:mca_btl_openib_endpoint_create_qp]
ibv_create_qp: returned 0 byte(s) for max inline data
[0,1,1][btl_openib_endpoint.c:945:mca_btl_openib_endpoint_create_qp]
ibv_create_qp: returned 0 byte(s) for max inline data

> > I built the OFED-1.1-pre1 user space RPMs for RHEL5.  They are available
> > at my web site.
> 
> Thanks!
> 
> > Kernel RPMs with the OFED 1.1 code will come a little
> > later.
> 
> From our discussion, it seems we should be able to just push the
> small number of missing bits into RHEL5 directly. That would be
> nicer of course.

It depends.  If there's lots of individual changes, it might be easier
to push the OFED 1.1 change.  But, that depends on when the final OFED
1.1 comes out and how much it varies from the existing RPMs.

-- 
Doug Ledford <[EMAIL PROTECTED]>
  GPG KeyID: CFBFF194
  http://people.redhat.com/dledford

Infiniband specific RPMs available at
  http://people.redhat.com/dledford/Infiniband



Re: [openib-general] srp trouble on RHEL4 U4

2006-10-18 Thread Lakshmanan, Madhu
Which SRP target are you using? Could you also give some more details on
the fabric setup; i.e. what IB switch / gateway your host is connected
to, and what kind of storage you wish to access? The full command that
you used (echo -n  > /add_target) to configure the SRP target
would be very useful as well.

The Silverstorm 7000 is an HCA (Host Channel Adapter). By itself, it
should in most cases not be the primary reason for the error code you
are seeing. The issue is more likely to be due to the SRP target you are
attempting to connect to. 

Thanks,

Madhu Lakshmanan
Silverstorm Technologies, Inc.
[EMAIL PROTECTED]
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Mirochnick Natalia
> Sent: Tuesday, October 17, 2006 10:47 AM
> To: openib-general@openib.org
> Subject: [openib-general] srp trouble on RHEL4 U4
> 
> Hello,
> 
> I'm trying to set up an SRP connection (SRP in OFED 1.0).
> The IB card is a Silverstorm 7000.
>
> The ib_srp module is loaded, but after an attempt to create an
> SRP device (as described in the manual srp_release_notes.txt),
> this error appears in /var/log/messages:
>  kernel:   REJ reason 0x0
> 
> What's wrong?
> -- 
> Thanks in advance,
> Mirochnick Natalia 