Re: [openib-general] RHEL5 and OFED ...
[EMAIL PROTECTED] wrote on 10/16/2006 01:50:49 PM:

> On Mon, 2006-10-16 at 15:25 +0200, Michael S. Tsirkin wrote:
> > Quoting r. Maestas, Christopher Daniel <[EMAIL PROTECTED]>:
> > > Subject: Re: [openib-general] RHEL5 and OFED ...
> > >
> > > > Now for userspace - does RHEL5 include at least libibverbs-1.0?
> > > > This has been released a while back, and Roland makes regular bugfix
> > > > releases.
> > >
> > > Here's what I see on a rhel4 u4 system:
> > > ---
> > > $ rpm -q libibverbs
> > > libibverbs-1.0.3-1
> > > ---
> > >
> > > So I would think rhel5 would have at least that or greater. When I
> > > compiled rpms for 1.1rc7 it generated:
> > > ---
> > > # ls libibverbs-*
> > > libibverbs-1.0.4-0.x86_64.rpm libibverbs-utils-1.0.4-0.x86_64.rpm
> > > libibverbs-devel-1.0.4-0.x86_64.rpm
> >
> > Doug, would it be possible to update this + libmthca?
>
> Possibly. What's the justification? What's in 1.0.4 that is the
> primary reason for wanting to update from 1.0.3?
>
> --
> Doug Ledford <[EMAIL PROTECTED]>

I am not sure whether this already has an answer. The justification is that madvise(..., MADV_DONTFORK) is used to make fork() work for verbs consumers in the recent packages. I hope the same patch will be in libehca.

thanks
Shirley Ma
IBM Linux Technology Center

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
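To make the justification concrete: madvise(..., MADV_DONTFORK) tells the kernel not to map the given pages into a child created by fork(), which keeps copy-on-write from giving the parent a different physical page than the one registered with the HCA. Below is a minimal sketch, assuming Linux 2.6.16 or later. The helper name map_dontfork() is made up for illustration and is not how libibverbs itself is structured.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not libibverbs code): map `len` bytes of anonymous
 * memory and mark the range so that children created by fork() do not
 * inherit it. Returns the mapping, or NULL on failure.
 */
void *map_dontfork(size_t len)
{
	void *buf;

	if (len == 0)
		return NULL;

	/* mmap() returns page-aligned memory, which madvise() requires. */
	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return NULL;

	if (madvise(buf, len, MADV_DONTFORK) != 0) {
		munmap(buf, len);
		return NULL;
	}
	return buf;
}
```

A caller would register the returned buffer as usual and release it with munmap(buf, len) when done.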
Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...
Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> Subject: Re: [openfabrics-ewg] RHEL5 and OFED ...
>
> On Wed, 2006-10-18 at 09:29 +0200, Michael S. Tsirkin wrote:
> > Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> > > > From our discussion, it seems we should be able to just push the
> > > > small number of missing bits into RHEL5 directly. That would be
> > > > nicer of course.
> > >
> > > It depends. If there's lots of individual changes, it might be easier
> > > to push the OFED 1.1 change. But, that depends on when the final OFED
> > > 1.1 comes out and how much it varies from the existing RPMs.
> >
> > OFED is in deep freeze, so you can already look at it to estimate the
> > amount of changes against 2.6.18.
> > Could you look at the diff please so that I know whether it's worth it
> > to invest in building the minimal patch set for pushing into RHEL5,
> > or whether you'll push OFED 1.1 into RHEL kernel as is?
>
> Yeah, I'll look over the diff today.

How does it look?

--
MST
[openib-general] [PATCH] [REVOKE] If addr_handler() got error, do not set state as OK
This was originally sent with the following intention: if addr_handler() is invoked with an error status, do not set id_priv->state to success and then reset it to the old value (redundant code). Also encapsulate some common code.

But when I followed Sean's suggestion to avoid using extra flags, the result is not very appealing (see below). The code is too complicated (multiple overwrites of 'status') to do this neatly. I suggest we drop this patch, as it is not easy to achieve the above intention cleanly by either re-write method :)

diff -ruNp org/drivers/infiniband/core/cma.c new/drivers/infiniband/core/cma.c
--- org/drivers/infiniband/core/cma.c	2006-10-10 15:45:27.0 +0530
+++ new/drivers/infiniband/core/cma.c	2006-10-10 15:59:53.0 +0530
@@ -1520,6 +1518,13 @@ static void addr_handler(int status, str
 
 	atomic_inc(&id_priv->dev_remove);
 
+	if (status) {	/* We got called with an error */
+		if (!cma_comp(id_priv, CMA_ADDR_QUERY))	/* Invalid state */
+			goto out;
+		event = RDMA_CM_EVENT_ADDR_ERROR;
+		goto notify;
+	}
+
 	/*
 	 * Grab mutex to block rdma_destroy_id() from removing the device while
 	 * we're trying to acquire it.
@@ -1529,9 +1534,8 @@ static void addr_handler(int status, str
 		mutex_unlock(&lock);
 		goto out;
 	}
-
-	if (!status && !id_priv->cma_dev)
-		status = cma_acquire_dev(id_priv);
+	if (!id_priv->cma_dev)
+		status = cma_acquire_dev(id_priv);
 	mutex_unlock(&lock);
 
 	if (status) {
@@ -1544,16 +1548,15 @@ static void addr_handler(int status, str
 		event = RDMA_CM_EVENT_ADDR_RESOLVED;
 	}
 
-	if (cma_notify_user(id_priv, event, status, NULL, 0)) {
+notify:
+	if (cma_notify_user(id_priv, event, status, NULL, 0))
 		cma_exch(id_priv, CMA_DESTROYING);
-		cma_release_remove(id_priv);
-		cma_deref_id(id_priv);
-		rdma_destroy_id(&id_priv->id);
-		return;
-	}
+
 out:
 	cma_release_remove(id_priv);
 	cma_deref_id(id_priv);
+	if (cma_comp(id_priv, CMA_DESTROYING))
+		rdma_destroy_id(&id_priv->id);
 }
 
 static int cma_resolve_loopback(struct rdma_id_private *id_priv)
Re: [openib-general] [PATCH] Fix some cancellation problems in process_req().
> The other changes look fine. But note that if req->status == -ECANCELED and
> time_after() is true, then it seems like a toss up as to which one can be
> reported to the user.

I felt that since the time_after() check matched (in all likelihood) due to the processing of the cancellation, -ECANCELED is more appropriate to return. If both conditions are true, it is most likely that a cancelled operation led to the time_after() match (cancel sets the time to jiffies, resulting in this time_after() match). The chance of both happening together independently is almost zero.

Do you agree? Otherwise I can re-work the patch as suggested.

thanks,

- KK
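The precedence argued for above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: the time_after() macro is a simplified copy of the kernel's wraparound-safe comparison (without its type checking), and resolve_status() is a hypothetical stand-in for the decision being discussed, not the actual process_req() code.

```c
#include <errno.h>

/* Simplified userspace copy of the kernel macro: true if a is after b. */
#define time_after(a, b) ((long)((b) - (a)) < 0)

/*
 * Hypothetical model of the status decision. A cancelled request keeps
 * -ECANCELED even if its deadline has also passed; any other pending
 * error becomes -ETIMEDOUT once the deadline is in the past.
 */
int resolve_status(int status, unsigned long now, unsigned long timeout)
{
	if (status == -ECANCELED)
		return status;
	if (status && time_after(now, timeout))
		return -ETIMEDOUT;
	return status;
}
```

With this ordering, a cancel that also makes the deadline appear expired is still reported to the user as a cancellation.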
[openib-general] [PATCH] RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count
rdma_bind_addr() leaks a cma_dev reference count in failure case. Also hold lock when doing a cma_detach_from_dev() as pointed out by Sean.

Signed-off-by: Krishna Kumar <[EMAIL PROTECTED]>
---
diff -ruNp org/drivers/infiniband/core/cma.c new/drivers/infiniband/core/cma.c
--- org/drivers/infiniband/core/cma.c	2006-10-09 17:13:41.0 +0530
+++ new/drivers/infiniband/core/cma.c	2006-10-09 19:42:31.0 +0530
@@ -1750,6 +1750,7 @@ static int cma_get_port(struct rdma_id_p
 int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr)
 {
 	struct rdma_id_private *id_priv;
+	int did_acquire_dev = 0;
 	int ret;
 
 	if (addr->sa_family != AF_INET)
@@ -1768,6 +1769,7 @@ int rdma_bind_addr(struct rdma_cm_id *id
 		}
 		if (ret)
 			goto err;
+		did_acquire_dev = 1;
 	}
 
 	memcpy(&id->route.addr.src_addr, addr, ip_addr_size(addr));
@@ -1777,6 +1779,11 @@ int rdma_bind_addr(struct rdma_cm_id *id
 
 	return 0;
 err:
+	if (did_acquire_dev) {
+		mutex_lock(&lock);
+		cma_detach_from_dev(id_priv);
+		mutex_unlock(&lock);
+	}
 	cma_comp_exch(id_priv, CMA_ADDR_BOUND, CMA_IDLE);
 	return ret;
 }
Re: [openib-general] [PATCH] rdma_bind_addr() leaks a cma_dev reference count
Hi Sean,

> Let's try something like this then (untested):
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 18a4366..0d06431 100755
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -1859,16 +1859,20 @@ int rdma_bind_addr(struct rdma_cm_id *id
>  			mutex_unlock(&lock);
>  		}
>  		if (ret)
> -			goto err;
> +			goto err1;
>  	}
>
>  	memcpy(&id->route.addr.src_addr, addr, ip_addr_size(addr));
>  	ret = cma_get_port(id_priv);
>  	if (ret)
> -		goto err;
> +		goto err2;
>
>  	return 0;
> -err:
> +err2:
> +	mutex_lock(&lock);
> +	cma_detach_from_dev(id_priv);
> +	mutex_unlock(&lock);
> +err1:
>  	cma_comp_exch(id_priv, CMA_ADDR_BOUND, CMA_IDLE);
>  	return ret;
>  }

This will mean that a deref is wrongly done if a loopback or zero address is passed to this function, without it having done a ref inc. I do think this case requires a variable to indicate whether a ref was taken or not. Assuming that is true, I will submit a patch with your comment about holding the lock.

thanks,

- KK
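The concern above (an unconditional detach on the error path drops a reference that was never taken) can be shown with a toy model. Everything in this sketch is hypothetical: struct toy_dev stands in for the cma_dev reference count, and toy_bind() only mimics the shape of rdma_bind_addr(), where a zero or loopback address skips the device acquire.

```c
/* Toy stand-in for a device reference count; all names are hypothetical. */
struct toy_dev {
	int refcount;
};

enum bind_kind { BIND_ZERO_ADDR, BIND_SPECIFIC_ADDR };

/*
 * Takes a device reference only for a specific address, then may hit a
 * later failure (port_fails) that has to undo exactly what this call
 * acquired. Returns 0 on success, -1 on failure.
 */
int toy_bind(struct toy_dev *dev, enum bind_kind kind, int port_fails)
{
	int did_acquire_dev = 0;

	if (kind == BIND_SPECIFIC_ADDR) {
		dev->refcount++;	/* analogous to acquiring the device */
		did_acquire_dev = 1;
	}

	if (port_fails)
		goto err;
	return 0;

err:
	/* Drop the reference only if this call took one. */
	if (did_acquire_dev)
		dev->refcount--;
	return -1;
}
```

Without the did_acquire_dev check, the BIND_ZERO_ADDR failure path would decrement a count it never incremented.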
Re: [openib-general] [PATCH] use mmiowb after doorbell ring
Quoting Roland Dreier <[EMAIL PROTECTED]>:
> I think the real question is whether we expect to have complex config
> options that would be hard to stick in an environment variable. At
> this point I suspect not, so I guess I'll go with $sysconfdir/libibverbs.d
> and $IBV_DRIVERS.

OK, both make sense.

--
MST
Re: [openib-general] [PATCH] use mmiowb after doorbell ring
> Sure. But this is configuration of a program (x11 font server), not a
> library, is that right? So user has a chance to know he's running it and
> read the man page to figure which files are read. It seems for libraries
> conf files are not common.

No, actually I snipped the next few lines of the man page:

    Description
        Fontconfig is a library designed to provide system-wide font
        configuration, customization and application access.

Off the top of my head I can also think of GTK+ (~/.gtkrc), and I seem to have a ~/.gstreamer too.

I think the real question is whether we expect to have complex config options that would be hard to stick in an environment variable. At this point I suspect not, so I guess I'll go with $sysconfdir/libibverbs.d and $IBV_DRIVERS.

 - R.
Re: [openib-general] [PATCH] use mmiowb after doorbell ring
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] use mmiowb after doorbell ring
>
> > > I dunno what's better. Maybe separate environment variables for
> > > user-specific configs are just as good -- eg that's what ld.so does.
> >
> > Hmm.
> > I guess what I'm trying to say is - let's follow some precedent.
> > ld.so example is good. Are there others?
>
> I think there are plenty of precedents for putting configuration in
> dotfiles in $HOME. For example on my system, 'man fonts-conf' shows
>
> NAME
>     fonts.conf - Font configuration files
>
> SYNOPSIS
>     /etc/fonts/fonts.conf
>     /etc/fonts/fonts.dtd
>     /etc/fonts/conf.d
>     ~/.fonts.conf
>
> But I'm sure there are plenty of environment variable uses too.

Sure. But this is configuration of a program (x11 font server), not a library, is that right? So user has a chance to know he's running it and read the man page to figure which files are read. It seems for libraries conf files are not common.

--
MST
Re: [openib-general] [PATCH] use mmiowb after doorbell ring
> > I dunno what's better. Maybe separate environment variables for
> > user-specific configs are just as good -- eg that's what ld.so does.
>
> Hmm.
> I guess what I'm trying to say is - let's follow some precedent.
> ld.so example is good. Are there others?

I think there are plenty of precedents for putting configuration in dotfiles in $HOME. For example on my system, 'man fonts-conf' shows

    NAME
        fonts.conf - Font configuration files

    SYNOPSIS
        /etc/fonts/fonts.conf
        /etc/fonts/fonts.dtd
        /etc/fonts/conf.d
        ~/.fonts.conf

But I'm sure there are plenty of environment variable uses too.

 - R.
Re: [openib-general] [PATCH] use mmiowb after doorbell ring
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> libraries don't stick anything in home directories -- I'm just
> suggesting $HOME/.libibverbs.conf as a place to stick extra configs
> that users might want to add.
>
> I'm kind of thinking that we might want other config options beyond
> just driver names someday. Otherwise we might as well have
> /etc/libibverbs.drivers.d and an environment variable IBV_DRIVERS, I
> guess. But it might be nice to be able to add a line like
>
> default-fork-safe true
>
> somewhere in libibverbs.conf.d to set a system-wide default.
>
> I dunno what's better. Maybe separate environment variables for
> user-specific configs are just as good -- eg that's what ld.so does.

Possible usage examples: I was thinking about some networked filesystem to have all boxes in the lab get stuff from a central place before the run, instead of copying stuff over. I don't want to consider an NFS-based home directory though. Using the environment makes it easier for me to avoid the need to install stuff on local disks at all.

--
MST
Re: [openib-general] [PATCH] use mmiowb after doorbell ring
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] use mmiowb after doorbell ring
>
> > > > Hopefully that is under prefix:
> > > > $prefix/etc/libibverbs.conf.d/
> > >
> > > Well, $sysconfdir/libibverbs.conf.d
> >
> > Ugh, is that a problem if I want to build and run as non-root?
> > I'm used to be able to set --prefix on config line for all libs
> > to some directory, put LD_LIBRARY_PATH to point there, then
> > if I like I just blow all of it away and I get a clean system.
> > Scattering config files around in home directory etc will break this.
>
> I'm not following the objection: what's wrong with using $sysconfdir?
> It defaults to $prefix/etc like you want, and it can be overridden
> with the --sysconfdir parameter to configure.

Sorry, looks like I was confused.

> > > > Finally, it might be nice to be able to just specify the list of
> > > > plugins at configure time for people like me who build everything
> > > > from source and who want less flexibility
> > > > but also less files to install.
> > >
> > > Again, is that really any easier
> >
> > Well, I'm thinking of distributed systems mainly where copying extra
> > files around is additional pain.
> > Consider myself: I'm building things on my laptop, then pushing them out to
> > machines in the lab over rsync for testing. Less files - less headache.
> >
> > > than putting whatever you want into
> > > your .libibverbs.conf?
> >
> > I really don't think a library sticking things in user's home directory
> > is such a great idea - typical users don't really know they link against
> > some library, this is just an extra place that users can break:
> > move to another machine, things stop working, and your app's
> > manual does not say anything of course.
>
> libraries don't stick anything in home directories -- I'm just
> suggesting $HOME/.libibverbs.conf as a place to stick extra configs
> that users might want to add.
>
> I'm kind of thinking that we might want other config options beyond
> just driver names someday. Otherwise we might as well have
> /etc/libibverbs.drivers.d and an environment variable IBV_DRIVERS, I
> guess. But it might be nice to be able to add a line like
>
> default-fork-safe true
>
> somewhere in libibverbs.conf.d to set a system-wide default.

I see. Looks somewhat useful - do you really intend something like this? Then we'd need an API for app to set fork support state explicitly - we currently only make it possible to enable it, not to disable.

> I dunno what's better. Maybe separate environment variables for
> user-specific configs are just as good -- eg that's what ld.so does.

Hmm.
I guess what I'm trying to say is - let's follow some precedent.
ld.so example is good. Are there others?

> > > I definitely plan to make it so a missing plug-in is not fatal, so it
> > > shouldn't hurt to have extra drivers declared that you don't build
> > > every time.
> >
> > Not until someone decides to rename a plugin for some reason - then you
> > have to hunt down and kill the old file name to prevent an old version
> > stuck in library path for some reason from being loaded - easy with the
> > central location, but good luck walking all user's home directories.
>
> Hmm, this seems to argue against allowing environment variables or
> anything but a single directory built into libibverbs. Because
> otherwise you have to grep every .bashrc .cshrc and so on.

Hmm, good point.
I like it that with environment I can just pass it on command line and not worry about any files which might be left behind.

--
MST
Re: [openib-general] [PATCH] use mmiowb after doorbell ring
> > > Hopefully that is under prefix:
> > > $prefix/etc/libibverbs.conf.d/
> >
> > Well, $sysconfdir/libibverbs.conf.d
>
> Ugh, is that a problem if I want to build and run as non-root?
> I'm used to be able to set --prefix on config line for all libs
> to some directory, put LD_LIBRARY_PATH to point there, then
> if I like I just blow all of it away and I get a clean system.
> Scattering config files around in home directory etc will break this.

I'm not following the objection: what's wrong with using $sysconfdir? It defaults to $prefix/etc like you want, and it can be overridden with the --sysconfdir parameter to configure.

> > > Finally, it might be nice to be able to just specify the list of
> > > plugins at configure time for people like me who build everything
> > > from source and who want less flexibility
> > > but also less files to install.
> >
> > Again, is that really any easier
>
> Well, I'm thinking of distributed systems mainly where copying extra
> files around is additional pain.
> Consider myself: I'm building things on my laptop, then pushing them out to
> machines in the lab over rsync for testing. Less files - less headache.
>
> > than putting whatever you want into
> > your .libibverbs.conf?
>
> I really don't think a library sticking things in user's home directory
> is such a great idea - typical users don't really know they link against
> some library, this is just an extra place that users can break:
> move to another machine, things stop working, and your app's
> manual does not say anything of course.

libraries don't stick anything in home directories -- I'm just suggesting $HOME/.libibverbs.conf as a place to stick extra configs that users might want to add.

I'm kind of thinking that we might want other config options beyond just driver names someday. Otherwise we might as well have /etc/libibverbs.drivers.d and an environment variable IBV_DRIVERS, I guess. But it might be nice to be able to add a line like

    default-fork-safe true

somewhere in libibverbs.conf.d to set a system-wide default.

I dunno what's better. Maybe separate environment variables for user-specific configs are just as good -- eg that's what ld.so does.

> > I definitely plan to make it so a missing plug-in is not fatal, so it
> > shouldn't hurt to have extra drivers declared that you don't build
> > every time.
>
> Not until someone decides to rename a plugin for some reason - then you
> have to hunt down and kill the old file name to prevent an old version
> stuck in library path for some reason from being loaded - easy with the
> central location, but good luck walking all user's home directories.

Hmm, this seems to argue against allowing environment variables or anything but a single directory built into libibverbs. Because otherwise you have to grep every .bashrc .cshrc and so on.

 - R.
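For a sense of how little parsing a line like "default-fork-safe true" would need, here is a minimal sketch. The file format is only floated in this thread, so both the "key value" layout and the function name parse_default_fork_safe() are assumptions, not an existing libibverbs interface.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one "key value" line of a hypothetical libibverbs config file.
 * Recognises only "default-fork-safe true|false". Returns 1 or 0 for the
 * value, or -1 if the line is anything else (blank, comment, unknown key,
 * unknown value).
 */
int parse_default_fork_safe(const char *line)
{
	char key[32], value[16];

	if (sscanf(line, "%31s %15s", key, value) != 2)
		return -1;
	if (strcmp(key, "default-fork-safe") != 0)
		return -1;
	if (strcmp(value, "true") == 0)
		return 1;
	if (strcmp(value, "false") == 0)
		return 0;
	return -1;
}
```

Anything beyond a handful of such flags is where the thread's question about environment variables versus config files starts to matter.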
[openib-general] [PATCH] [TRIVIAL] OpenSM/osm_port_info_rcv.c: Remove duplicate dump of received PortInfo
OpenSM/osm_port_info_rcv.c: Remove duplicate dump of received PortInfo in osm_pi_rcv_process

Signed-off-by: Hal Rosenstock <[EMAIL PROTECTED]>

Index: opensm/osm_port_info_rcv.c
===
--- opensm/osm_port_info_rcv.c	(revision 9884)
+++ opensm/osm_port_info_rcv.c	(working copy)
@@ -710,8 +710,9 @@ osm_pi_rcv_process(
   port_guid = p_context->port_guid;
   node_guid = p_context->node_guid;
 
-  osm_dump_port_info(
-    p_rcv->p_log, node_guid, port_guid, port_num, p_pi, OSM_LOG_DEBUG);
+  osm_dump_port_info( p_rcv->p_log,
+                      node_guid, port_guid, port_num, p_pi,
+                      OSM_LOG_DEBUG );
 
   /*
     we might get a response during a light sweep looking for a change in
@@ -829,10 +830,6 @@ osm_pi_rcv_process(
              p_smp->hop_count, p_smp->initial_path );
   }
 
-  osm_dump_port_info( p_rcv->p_log,
-                      node_guid, port_guid, port_num, p_pi,
-                      OSM_LOG_DEBUG );
-
   /*
     Check if the update_sm_base_lid in the context is TRUE.
     If it is - then update the master_sm_base_lid of the variable
Re: [openib-general] [PATCH] use mmiowb after doorbell ring
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> > Hopefully that is under prefix:
> > $prefix/etc/libibverbs.conf.d/
>
> Well, $sysconfdir/libibverbs.conf.d

Ugh, is that a problem if I want to build and run as non-root?
I'm used to be able to set --prefix on config line for all libs to some directory, put LD_LIBRARY_PATH to point there, then if I like I just blow all of it away and I get a clean system.
Scattering config files around in home directory etc will break this.

> > Finally, it might be nice to be able to just specify the list of
> > plugins at configure time for people like me who build everything
> > from source and who want less flexibility
> > but also less files to install.
>
> Again, is that really any easier

Well, I'm thinking of distributed systems mainly where copying extra files around is additional pain.
Consider myself: I'm building things on my laptop, then pushing them out to machines in the lab over rsync for testing. Less files - less headache.

> than putting whatever you want into
> your .libibverbs.conf?

I really don't think a library sticking things in user's home directory is such a great idea - typical users don't really know they link against some library, this is just an extra place that users can break: move to another machine, things stop working, and your app's manual does not say anything of course.

> I definitely plan to make it so a missing plug-in is not fatal, so it
> shouldn't hurt to have extra drivers declared that you don't build
> every time.

Not until someone decides to rename a plugin for some reason - then you have to hunt down and kill the old file name to prevent an old version stuck in library path for some reason from being loaded - easy with the central location, but good luck walking all user's home directories.

--
MST
Re: [openib-general] [PATCH] use mmiowb after doorbell ring
> Hopefully that is under prefix:
> $prefix/etc/libibverbs.conf.d/

Well, $sysconfdir/libibverbs.conf.d

> and I think an environment with a list of additional directories
> would also be helpful.

Is that really necessary? Just stick whatever you want into $HOME/.libibverbs.conf.

> Finally, it might be nice to be able to just specify the list of
> plugins at configure time for people like me who build everything
> from source and who want less flexibility
> but also less files to install.

Again, is that really any easier than putting whatever you want into your .libibverbs.conf?

I definitely plan to make it so a missing plug-in is not fatal, so it shouldn't hurt to have extra drivers declared that you don't build every time.

 - R.
Re: [openib-general] [PATCH] use mmiowb after doorbell ring
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> could have each plugin drop a file in /etc/libibverbs.conf.d/ with the
> name -- something like

OK, feature request time :)
Hopefully that is under prefix:
$prefix/etc/libibverbs.conf.d/

and I think an environment with a list of additional directories would also be helpful.

Finally, it might be nice to be able to just specify the list of plugins at configure time for people like me who build everything from source and who want less flexibility but also less files to install.

--
MST
Re: [openib-general] [PATCH] use mmiowb after doorbell ring
On Wed, Oct 18, 2006 at 04:25:21PM -0700, Roland Dreier wrote:
> > AC_DEFUN(rc_LIBSTDCPP_VER,
>
> Thanks -- this actually solves the easiest part of my problem, and
> does it in a way that's not really useful for me (libibverbs needs to
> know what extra bits are getting added to plugin names, and with this
> technique, it would have to know what the final library name was going
> to be, before it got built). So I think I need to stick the extra
> plugin library name into a define in .

Right, that's exactly what should be done in ibverbs. The general technique from that example is what you'd put in userspace/libmthca/configure.in if I'm grokking this build process properly..

Jason
[openib-general] test results from Mellanox, Voltaire, QLogic, and IBM for OFED 1.1 rc7?
What testing did these companies do with rc7? I'd kinda like to see performance data for the QLogic and IBM HCAs...

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
Re: [openib-general] [openfabrics-ewg] Cisco SQA Results for OFED 1.1 rc7
SRP got broken in rc7 for the Cisco Fibre Channel gateway, so we couldn't test it with that. We have started testing with DDN IB storage, but don't have test results to share yet.

I'm sad to report no SRP HA testing in Cisco SQA yet. It's next on the todo list (right after IPoIB HA).

Scott

From: Sujal Das [mailto:[EMAIL PROTECTED]
Sent: Wednesday, October 18, 2006 4:09 PM
To: Scott Weitzenkamp (sweitzen)
Cc: openib-general@openib.org
Subject: RE: [openfabrics-ewg] Cisco SQA Results for OFED 1.1 rc7

Scott, thanks for the report. Based on this, it looks like Cisco did not test the SRP initiator and HA functions with any SRP targets. Is that a fair assessment?

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Weitzenkamp (sweitzen)
Sent: Wednesday, October 18, 2006 2:24 PM
To: [EMAIL PROTECTED]
Cc: openib-general@openib.org
Subject: [openfabrics-ewg] Cisco SQA Results for OFED 1.1 rc7

Regression testing went well, using Cisco switches and Cisco (Mellanox) HCAs. See attached spreadsheet for more details.

The following increase in testing happened:

Started testing SLES10 IA32 (will have IA64 and PPC64 results for pre1).
Switched to HP MPI 2.2.5, which is first version to support OF.

The following bugs were tested and closed.

247 OFED IPoIB HA not working on RHEL4 U3
259 problems with OFED IPoIB HA on SLES10
173 OFED mpitests: add osu_{bw,latency,bibw,bcast}.c examples

The following bugs were opened, but all have been marked fixed in pre1, thanks Mellanox folks for the quick response.

273 OFED 1.1 rc7 does not work with Cisco FC Gateway
274 OFED 1.1: RENICE_IB_MAD=yes hangs dual-HCA system with dual-port HCAs
277 OFED 1.1 rc7: uninitialized value during IPoIB failover in ipoib_ha.pl
278 OFED 1.1: two copies of openib.spec in openib-1.1.tgz

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
Re: [openib-general] [PATCH] use mmiowb after doorbell ring
> AC_DEFUN(rc_LIBSTDCPP_VER,

Thanks -- this actually solves the easiest part of my problem, and does it in a way that's not really useful for me (libibverbs needs to know what extra bits are getting added to plugin names, and with this technique, it would have to know what the final library name was going to be, before it got built). So I think I need to stick the extra plugin library name into a define in .

But seeing this code led me to information that solves everything else I was worried about. The libtool flag "-release" is what I need to add gunk to the final .so's name, and I think backward compatibility can be handled pretty easily too. So thanks...

> That bit goes in aclocal.m4

Yeah, I'd hide code like that too rather than let anyone see it in my configure.in ;)

 - R.
Re: [openib-general] [openfabrics-ewg] Cisco SQA Results for OFED 1.1 rc7
Scott, thanks for the report. Based on this, it looks like Cisco did not test the SRP initiator and HA functions with any SRP targets. Is that a fair assessment?

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Scott Weitzenkamp (sweitzen)
Sent: Wednesday, October 18, 2006 2:24 PM
To: [EMAIL PROTECTED]
Cc: openib-general@openib.org
Subject: [openfabrics-ewg] Cisco SQA Results for OFED 1.1 rc7

Regression testing went well, using Cisco switches and Cisco (Mellanox) HCAs. See attached spreadsheet for more details.

The following increase in testing happened:

Started testing SLES10 IA32 (will have IA64 and PPC64 results for pre1).
Switched to HP MPI 2.2.5, which is first version to support OF.

The following bugs were tested and closed.

247 OFED IPoIB HA not working on RHEL4 U3
259 problems with OFED IPoIB HA on SLES10
173 OFED mpitests: add osu_{bw,latency,bibw,bcast}.c examples

The following bugs were opened, but all have been marked fixed in pre1, thanks Mellanox folks for the quick response.

273 OFED 1.1 rc7 does not work with Cisco FC Gateway
274 OFED 1.1: RENICE_IB_MAD=yes hangs dual-HCA system with dual-port HCAs
277 OFED 1.1 rc7: uninitialized value during IPoIB failover in ipoib_ha.pl
278 OFED 1.1: two copies of openib.spec in openib-1.1.tgz

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
[openib-general] ibv_reg_mr temporary vs permanent errors
If ibv_reg_mr fails, can an application (or library, such as pvfs) assume that this is just a temporary error, and try to deregister some memory, then try again?

How can we differentiate between the case where the hardware (such as ehca) actually has more information about why the memory registration failed, and the application can act on that information (by coalescing memory regions, for example), vs cases where something is just plain broken and the application should give up and exit?
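One way an application might structure that decision is sketched below, assuming the failure reason is available as an errno value after ibv_reg_mr() returns NULL. Which values a given provider actually reports is exactly the open question here, so the mapping (ENOMEM and EAGAIN as worth a retry after freeing registrations, anything else as fatal) is an assumption for illustration only, and classify_reg_failure() is a hypothetical helper.

```c
#include <errno.h>

enum reg_failure {
	REG_RETRY_AFTER_FREEING,	/* deregister something, then retry */
	REG_GIVE_UP			/* treat as a permanent error */
};

/*
 * Hypothetical policy, not documented provider behaviour: treat apparent
 * resource exhaustion as temporary and everything else as permanent.
 */
enum reg_failure classify_reg_failure(int err)
{
	switch (err) {
	case ENOMEM:
	case EAGAIN:
		return REG_RETRY_AFTER_FREEING;
	default:
		return REG_GIVE_UP;
	}
}
```

A library such as pvfs would call this with the saved errno right after the failed registration and bound the number of retries, so that a misclassified permanent error cannot loop forever.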
Re: [openib-general] [PATCH/RFC 1/2] IB: Return "maybe_missed_event" hint from ib_req_notify_cq()
Roland Dreier <[EMAIL PROTECTED]> wrote on 10/18/2006 01:55:13 PM:

> I would like to understand why there's a throughput difference with
> scaling turned off, since the NAPI code doesn't change the interrupt
> handling all that much, and should lower the CPU usage if anything.

That's what I am trying to understand now. Yes, the send-side rate dropped significantly, and CPU usage is lower as well.

> Does changing the netdev weight value affect anything?
>
> - R.

No, it doesn't.

Thanks
Shirley Ma
IBM Linux Technology Center
Re: [openib-general] [PATCH] use mmiowb after doorbell ring
On Wed, Oct 18, 2006 at 01:43:03PM -0700, Roland Dreier wrote:

> The only two things I need to figure out, I hope with help from
> smarter people:

I'm by no means an expert, but this might be helpful to someone who is:

AC_DEFUN(rc_LIBSTDCPP_VER,
[AC_MSG_CHECKING([libstdc++ version])
dummy=if$$
cat <<_LIBSTDCPP_>$dummy.cc
#include
#include
#include
int main(int argc, char **argv) { exit(0); }
_LIBSTDCPP_
${CXX-c++} $dummy.cc -o $dummy > /dev/null 2>&1
if test "$?" = 0; then
  soname=`objdump -p ./$dummy |grep NEEDED|grep libstd`
  LIBSTDCPP_VER=`echo $soname | sed -e 's/.*NEEDED.*libstdc++\(-libc.*\(-.*\)\)\?.so.\(.*\)/\3\2/'`
fi
rm -f $dummy $dummy.cc
if test -z "$LIBSTDCPP_VER"; then
  AC_MSG_WARN([cannot determine standard C++ library version number])
else
  AC_MSG_RESULT([$LIBSTDCPP_VER])
  LIBSTDCPP_VER="-$LIBSTDCPP_VER"
fi
AC_SUBST(LIBSTDCPP_VER)
])

This is a fragment from another project of mine that stamps a soname with the libstdc++ soname (libstdc++ causes a similar issue). The basic idea is to compile a dummy program and link it with the target library, then use objdump to extract the soname and assign it to a substitution variable. That bit goes in aclocal.m4.

Once you have the substitution, I think a conditional fragment in the makefile should be enough to solve the second problem.

Jason
Re: [openib-general] [PATCH/RFC 1/2] IB: Return "maybe_missed_event" hint from ib_req_notify_cq()
> Thanks. The touch test results are not good. This NAPI patch induces huge
> latency for ehca driver scaling code, the throughput performance is not
> good. (I am not fully convinced the huge latency is because of raising NAPI
> in thread context.) Then I tried ehca no scaling driver, the latency looks
> good, but the throughput is still a problem. We are working on these
> issues. Hopefully we can get the answer soon.

Hmm, the results with "scaling" on are not that unexpected, since the idea of scheduling a thread round-robin (to kill all cache locality) is pretty dubious anyway.

I would like to understand why there's a throughput difference with scaling turned off, since the NAPI code doesn't change the interrupt handling all that much, and should lower the CPU usage if anything.

Does changing the netdev weight value affect anything?

- R.
Re: [openib-general] [PATCH] use mmiowb after doorbell ring
> I just took a quick look at the directory setup and if you are
> changing things I'd say you should also arrange to have the libibverbs
> soname stamped into the plugin path and soname. Something like
> libmthca-libibverbs.2.so.0. Once you do that it is pretty safe
> to put it in /usr/lib*

That makes sense (although I guess it would be libmthca-libibverbs.2.so without the .0, since libmthca is just a plugin that doesn't have an independent soname of its own).

Then we could have each plugin drop a file in /etc/libibverbs.conf.d/ with the name -- something like

    driver mthca

(and possibly also read $HOME/.libibverbs.conf if desired)

The only two things I need to figure out, I hope with help from smarter people:

- What is the autoconf/automake chicanery needed to make libmthca figure out the right libibverbs soname to stick in the name of the .so it installs?

- And what is the autoconf/automake chicanery needed to fall back to having libmthca install plain mthca.so under /usr/lib/infiniband when it detects that it is being built against libibverbs 1.0?

- R.
Re: [openib-general] [PATCH OFED-1.1-rc7] libehca configure: fix missing check of libsysfs
Hi,

> Do we really want generated files in svn? Why?

No. I was unsure whether it's in the ofed branch. And you're right, there is no need to. Ignore this!

Thanks
Nam
Re: [openib-general] Catastrophic error detected.
Quoting r. Ira Weiny <[EMAIL PROTECTED]>:
> Subject: Catastrophic error detected.
>
> I got the following error running with OFED 1.1 on a modified 2.6.9 RHEL4
> kernel. Hal mentioned that there might be a catastrophic error recovery patch
> submitted since then? I can't find a mention of that in the mailing list. If
> possible I would like to try such a patch.
>
> Thanks,
> Ira
>
> 2006-10-17 21:31:47 ib_mthca :07:00.0: Catastrophic error detected: unknown error
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[00]:
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[01]:
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[02]:
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[03]:
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[04]:
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[05]:
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[06]:
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[07]:
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[08]:
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[09]:
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[0a]:
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[0b]:
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[0c]:
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[0d]:
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[0e]:
> 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[0f]:

OFED 1.1 will already try to recover. But the fact that you got this indicates it's a hard error that we couldn't recover from.

--
MST
[openib-general] Catastrophic error detected.
I got the following error running with OFED 1.1 on a modified 2.6.9 RHEL4 kernel. Hal mentioned that there might be a catastrophic error recovery patch submitted since then? I can't find a mention of that in the mailing list. If possible I would like to try such a patch. Thanks, Ira 2006-10-17 21:31:47 ib_mthca :07:00.0: Catastrophic error detected: unknown error 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[00]: 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[01]: 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[02]: 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[03]: 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[04]: 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[05]: 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[06]: 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[07]: 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[08]: 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[09]: 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[0a]: 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[0b]: 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[0c]: 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[0d]: 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[0e]: 2006-10-17 21:31:47 ib_mthca :07:00.0: buf[0f]: # rhea277 /root > /sbin/lspci -vv -s 07:00.0 07:00.0 InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (rev 20) Subsystem: Mellanox Technologies MT25208 InfiniHost III Ex Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Tools for development
On 08:12 Wed 18 Oct , Jeff Squyres wrote:
> I was not on the call last week, but I understand that there was some
> discussion about exactly this point (ditch SVN and go 100% git): the
> decision was to stick with SVN for userspace stuff and stick with git
> for kernel stuff.
>
> However, this is a larger audience than was on the call. Is there a
> significant movement here from the developers to move to 100% git?

Moving (or not moving) userspace to git could be done on a per-project basis (as actually suggested by Michael). Personally I'm voting for git.

Sasha

>
> (I don't really care)
>
> On Oct 17, 2006, at 12:21 PM, Sasha Khapyorsky wrote:
>
> >On 17:04 Tue 17 Oct , Michael S. Tsirkin wrote:
> >>Quoting r. Steve Wise <[EMAIL PROTECTED]>:
> >>>At the risk of opening a can of worms, is there any reason we don't move
> >>>the user stuff into its own git tree? This would get rid of svn
> >>>altogether...
> >>
> >>If we do, that should probably be multiple git trees - verbs, management,
> >>tests are all more or less independent and developed mostly by
> >>different people.
> >
> >Reasonable. And generally this should not be too bad.
> >
> >Sasha
>
> --
> Jeff Squyres
> Server Virtualization Business Unit
> Cisco Systems
Re: [openib-general] [PATCH OFED-1.1-rc7] libehca configure: fix missing check of libsysfs
Quoting r. Hoang-Nam Nguyen <[EMAIL PROTECTED]>:
> Subject: [PATCH OFED-1.1-rc7] libehca configure: fix missing check of libsysfs
>
> Hello,
> here is the patch of configure in libehca as a result of the patch
> "libehca configure.in and config.h.in". It is generated by autogen.sh
> and pretty lengthy. Hence, I'm attaching it here for completeness.
> Vlad, do you want me to check it in svn or send you the whole file?
> Thanks!
> Nam

Do we really want generated files in svn? Why?

--
MST
Re: [openib-general] [PATCH/RFC 1/2] IB: Return "maybe_missed_event" hint from ib_req_notify_cq()
Roland Dreier <[EMAIL PROTECTED]> wrote on 10/17/2006 08:41:59 PM:

> Anyway, I'm eagerly awaiting your NAPI results with ehca.
>
> Thanks,
> Roland

Thanks. The touch test results are not good. This NAPI patch induces huge latency in the ehca driver scaling code, and the throughput is not good either. (I am not fully convinced that the huge latency is caused by raising NAPI in thread context.) Then I tried the ehca driver with scaling off: the latency looks good, but the throughput is still a problem. We are working on these issues. Hopefully we can get the answer soon.

Thanks
Shirley Ma
[openib-general] [PATCH OFED-1.1-rc7] libehca configure.in and config.h.in: fix missing check of libsysfs.h
Hello,
below is a patch to configure.in and config.h.in in libehca. It properly checks for the presence of libsysfs.h. Unfortunately, I noticed this bug only after I had fixed the "openib.spec" issues and tested OFED on a clean system.
Thanks!
Nam

Signed-off-by: Hoang-Nam Nguyen <[EMAIL PROTECTED]>
---

 config.h.in  |    3 +++
 configure.in |    5 +++++
 2 files changed, 8 insertions(+)

diff -Nurp openib-1.1/src/userspace/libehca/config.h.in openib-1.1_patch/src/userspace/libehca/config.h.in
--- openib-1.1/src/userspace/libehca/config.h.in 2006-10-05 15:07:36.0 +0200
+++ openib-1.1_patch/src/userspace/libehca/config.h.in 2006-10-18 17:31:37.0 +0200
@@ -27,6 +27,9 @@
 /* Define to 1 if you have the <string.h> header file. */
 #undef HAVE_STRING_H
 
+/* Define to 1 if you have the <sysfs/libsysfs.h> header file. */
+#undef HAVE_SYSFS_LIBSYSFS_H
+
 /* Define to 1 if you have the <sys/stat.h> header file. */
 #undef HAVE_SYS_STAT_H
 
diff -Nurp openib-1.1/src/userspace/libehca/configure.in openib-1.1_patch/src/userspace/libehca/configure.in
--- openib-1.1/src/userspace/libehca/configure.in 2006-10-05 15:07:03.0 +0200
+++ openib-1.1_patch/src/userspace/libehca/configure.in 2006-10-18 17:31:37.0 +0200
@@ -25,9 +25,14 @@
 AC_CHECK_LIB(ibverbs,
 [],
 AC_MSG_ERROR([libibverbs not installed]))
 
+dnl Checks for header files.
+AC_CHECK_HEADER(infiniband/driver.h, [],
+AC_MSG_ERROR([<infiniband/driver.h> not found. libehca requires libibverbs.]))
+
 dnl Checks for library functions
 AC_CHECK_FUNCS(ibv_read_sysfs_file)
 fi
+AC_CHECK_HEADERS(sysfs/libsysfs.h)
 
 dnl Checks for programs.
 AC_PROG_CC
Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?
(I am taking this back to the openib list because I think the list needs to hear about real applications that are hitting memory registration limits.)

What are the limits on the ehca memory registrations? Is there a limit to the number of regions that can be registered? Is there any way (with kernel hacks) that we can register the entire address space of the application? We would like to be able to do RDMA sends and receives from anywhere in the application address space eventually, and only register it once.

What is the point of RDMA for memory-intensive applications if you have to copy the data to a registered buffer before sending it anyway?

On Oct 18, 2006, at 11:27 AM, Kyle Schochenmaier wrote:

> Hoang-Nam Nguyen wrote:
>> Hi Troy!
>>
>>> I am running PVFS2 on OpenIB, with IBM's ehca.
>>> When we start writing/reading large files, either with the NetPIPE
>>> PVFS module we have or a modified GAMESS executable that uses
>>> libpvfs2 directly, the 'ibv_reg_mr' function fails, and we get an
>>> error.
>>> This is also correlated with kernel log messages like this:
>>> Oct 16 11:14:45 p5l8 kernel: PU0003 000e0091:ehca_hcall_7arg_7ret
>>> HCAD_ERROR opcode=160 ret=fff7 arg1=1304 arg2=5
>>> arg3=14f0ebc8 arg4=1
>>> arg5=e0 arg6=e3e9f200 arg7=0 out1=0 out2=0 out3=0 out4=0
>>> out5=0 out6=0
>>> out7=0
>>>
>> Return code f7 from firmware/hvcall means H_NO_MEM. I'm wondering
>> if you could provide me with some pre-history of this problem.
>> Is this a permanent problem? If yes, could you give me more info
>> on your testcase resp. scenario, e.g. large file size, NetPIPE options?
>> Which version of ehca are you using? And which kernel version?
>> Thanks!
>> Hoang-Nam Nguyen
>>
> I think Troy could better explain what is happening here, so I'm
> taking this off-list for now -- we're trying to get this working
> for SC'06, so time is limited :) -- if Troy wants to forward this
> on to the list after looking at it, that's fine too.
>
> Our app writes out a file once, then reads it in many times through
> the pvfs2 system. In the pvfs2 layers, there is memory caching
> done at the network level, so memory is registered by the app, and
> attempts are made to re-register and/or re-use these memory regions
> to save on memory reg overhead. The problem occurs only while
> writing files, so while memory is being initially registered with
> the nic/app and cached? Also, our tests show that the app runs
> normally to completion on identical machines using Mellanox HCAs
> instead of the eHCA. The file sizes are generally >16GByte,
> however our failures usually appear by the time ~220-250MBytes have
> been written (possibly also all registered)?
>
> I'm not sure the standard OpenIB NetPIPE runs can reproduce this
> type of workload. However, we have developed a working PVFS2-
> NetPIPE module which can reproduce this problem on occasion. If
> there is interest in further testing this on your end, I can make
> it available.
>
> Our ehca's have the following revision info:
>    vendor_id: 0x5076
>    vendor_part_id: 0
>    hw_ver: 0x103
> Kernel version is debian 2.6.17
>
> I hope this is enough info to get some more insight from everyone.
[openib-general] [Bug 263] OFED 1.1 rc6: IPoIB Oops during IPoIB failover loop
http://openib.org/bugzilla/show_bug.cgi?id=263 --- Comment #11 from [EMAIL PROTECTED] 2006-10-18 09:56 --- Roland, I enabled debug_level=1 with OFED 1.1 rc7 RHEL4 U3 x86_64, and got same crash (netserver machine). I could only see the debug_level=1 info by running dmesg in a loop, and the info did not get saved into any /var/log files. Is there some extra configuration needed for syslog? Shouldn't IPoIB debug_level=1 info go into a syslog file by default? Here's what I saw from dmesg loop right before crash. ib1: Port state change event ib0: Port state change event ib1: Port state change event ib0: flushing ib0: downing ib_dev ib1: flushing ib1: downing ib_dev ib0: Created ah 0101beffa800 ib1: Created ah 0101be636800 ib0: Created ah 0101be5724c0 ib1: Created ah 0101be9c8a80 ib0: Created ah 0101bfc57100 ib1: Created ah 0101be49f700 ib0: Created ah 0101beffa3c0 ib1: Created ah 0101beffae80 ib0: Created ah 0101be636b40 ib1: Created ah 01019dfecd40 ib0: Start path record lookup for fe80::::0005:ad00:0020:0861 MTU > 1024 ib0: PathRec LID 0x0006 for GID fe80::::0005:ad00:0020:0861 ib0: Created ah 01019dfec600 ib0: created address handle 01019dfecac0 for LID 0x0006, SL 0 ib0: Port state change event ib1: Port state change event ib0: flushing ib0: downing ib_dev ib1: flushing ib1: downing ib_dev ib0: Start path record lookup for fe80::::0005:ad00:0020:0861 MTU > 1024 ib0: PathRec LID 0x0006 for GID fe80::::0005:ad00:0020:0861 ib0: Created ah 0101beffa300 ib0: created address handle 01019dfec1c0 for LID 0x0006, SL 0 ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: Created ah 0101bfc55e80 ib0: Created ah 0101bfc4cc80 ib0: Created ah 01019dfec480 ib0: Created ah 01019dfec3c0 ib0: Created ah 01019dfec100 Tue Oct 17 01:05:42 PDT 2006 Message from [EMAIL PROTECTED] at Tue Oct 17 01:05:43 2006 ... 
svbu-qa-pcie-1 kernel: general protection fault: [1] SMP Here's serial console output from netserver machine. ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet ib0: dev_queue_xmit failed to requeue packet general protection fault: [1] SMP CPU 0 Modules linked in: rdma_ucm(U) rdma_cm(U) ib_addr(U) ib_ipoib(U) ib_mthca<7>Losi ng some ticks... checking if CPU frequency changed. (U) ib_umad(U) ib_ucm(U) ib_uverbs(U) ib_cm(U) ib_sa(U) ib_mad(U) ib_core(U) md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core nfs lockd nfs_acl sunrpc ds yenta_socket pcmcia_core dm_mirror dm_multipath dm_mod button battery ac uhci_h cd ehci_hcd hw_random shpchp e1000 floppy sg ext3 jbd aic79xx sd_mod scsi_mod Pid: 7838, comm: ib_mad1 Not tainted 2.6.9-34.ELsmp RIP: 0010:[] {:ib_ipoib:path_rec_completion+ 178} RSP: 0018:0101a756bc70 EFLAGS: 00010202 warning: many lost ticks. Your time source seems to be instable or some driver is hogging interupts rip mwait_idle+0x56/0x7c RAX: RBX: RCX: RDX: 0101bbeffc80 RSI: RDI: fffc RBP: 0101bbeffc80 R08: 0003 R09: 0101bbeffca0 R10: 8011dfe0 R11: 8011dfe0 R12: 1b60167f R13: fffc R14: R15: 1b6012ff FS: () GS:804d7b00() knlGS: CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b CR2: 006cf5e8 CR3: 00101000 CR4: 06e0 Process ib_mad1 (pid: 7838, threadinfo 0101a756a000, task 0101bdc3b030) Stack: a00e547d 0101afda5000 0002 0101afda5380 0246 0246 802ab017 0101bc16a500 0101bbeffca0 0101bbeffc80 Call Trace:{:ib_sa:ib_sa_path_rec_callback+0} {dev_queue_xmit+525} {:ib_ipoib:path_ rec_completion+885} {:ib_sa:ib_sa_path_rec_callback+64} {:ib_sa:send_handler+74} {:ib_mad:ib_ mad_complete_send_wr+418} {:ib_mad:ib_mad_completion_handler+979} {:ib_mad:ib_mad_completion_handler+0} {worker_thread+419} {default_wake_fun ction+0} {default_wake_function+0} {keventd_cr eate_kthread+0} {worker_thread+0} {keventd_create_kth read+0} {kthread+200} {child_rip+8} {keventd_create_kthread+0} {kthread+0 } 
{child_rip+0} Code: 49 8b 74 24 08 50 0f b6 42 16 50 0f b6 42 15 50 0f b6 42 14 RIP {:ib_ipoib:path_rec_completion+178} RSP <0101a756bc70> <0>Kernel panic - not syncing: Oops --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/lis
Re: [openib-general] [PATCH] RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count
Krishna Kumar wrote:
> struct rdma_id_private *id_priv;
> + int did_acquire_dev = 0;

See my other mail that gets rid of this flag.

> @@ -1776,6 +1778,8 @@ int rdma_bind_addr(struct rdma_cm_id *id
>
> return 0;
> err:
> + if (did_acquire_dev)
> + cma_detach_from_dev(id_priv);

We need to lock around cma_detach_from_dev().

- Sean
Re: [openib-general] [PATCH] rdma_bind_addr() leaks a cma_dev reference count
>Actually that will not work, since the undo operation is for when the
>next operation (cma_get_port()) fails after we did an acquire_dev,
>and in that case the refcount needs to be dropped. So I am not
>able to avoid using an extra flag to indicate that a ref was got some
>time in the past, and drop it in the error path. I will send that out now.

Let's try something like this then (untested):

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 18a4366..0d06431 100755
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1859,16 +1859,20 @@ int rdma_bind_addr(struct rdma_cm_id *id
 			mutex_unlock(&lock);
 		}
 		if (ret)
-			goto err;
+			goto err1;
 	}
 
 	memcpy(&id->route.addr.src_addr, addr, ip_addr_size(addr));
 	ret = cma_get_port(id_priv);
 	if (ret)
-		goto err;
+		goto err2;
 
 	return 0;
-err:
+err2:
+	mutex_lock(&lock);
+	cma_detach_from_dev(id_priv);
+	mutex_unlock(&lock);
+err1:
 	cma_comp_exch(id_priv, CMA_ADDR_BOUND, CMA_IDLE);
 	return ret;
 }

- Sean
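For readers less familiar with the idiom, the two-label unwind proposed above can be sketched as a small userspace C program. This is only an illustration of the pattern, not the kernel code: struct binding, bind_sketch and the acquire_fails/port_fails switches are hypothetical stand-ins (for rdma_id_private, rdma_bind_addr and real failures), and a pthread mutex stands in for the kernel mutex. The point is that each label undoes exactly the stages that had succeeded before the failure, so no "did_acquire" flag is needed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-ID state kept by the CMA. */
enum { IDLE, ADDR_BOUND };
struct binding {
    int state;      /* IDLE or ADDR_BOUND */
    int dev_refs;   /* references held on the device */
};

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* acquire_fails / port_fails simulate a failure in stage 1 or stage 2. */
int bind_sketch(struct binding *b, int acquire_fails, int port_fails)
{
    int ret;

    assert(b != NULL);
    b->state = ADDR_BOUND;

    /* Stage 1: take a device reference under the lock. */
    pthread_mutex_lock(&lock);
    ret = acquire_fails ? -1 : 0;
    if (!ret)
        b->dev_refs++;
    pthread_mutex_unlock(&lock);
    if (ret)
        goto err1;          /* nothing acquired yet */

    /* Stage 2: allocate a port; on failure stage 1 must be undone. */
    ret = port_fails ? -2 : 0;
    if (ret)
        goto err2;

    return 0;

err2:
    /* Undo stage 1, again under the lock. */
    pthread_mutex_lock(&lock);
    b->dev_refs--;
    pthread_mutex_unlock(&lock);
err1:
    /* Undo the state transition made at the top. */
    b->state = IDLE;
    return ret;
}
```

The labels are ordered so that a later failure falls through the cleanup for every earlier stage, which is why err2 sits above err1.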
Re: [openib-general] Tools for development
Steve Wise wrote:
> On Wed, 2006-10-18 at 14:39 +0200, Michael S. Tsirkin wrote:
>
>> Quoting r. Jeff Squyres <[EMAIL PROTECTED]>:
>>
>>> However, this is a larger audience than was on the call. Is there a
>>> significant movement here from the developers to move to 100% git?
>>>
>> Life would be somewhat easier for me with 100% git.
>>
>
> Probably for everyone.

Not for me. I hate to move from SVN.
Re: [openib-general] [PATCH] osm: port to WinIB stack - moving win-specific to config.h
On Wed, 2006-10-18 at 12:15, Yevgeny Kliteynik wrote:
> Hi Hal
>
> As we discussed previously, I've added config.h in
> windows, and removed windows-specific defines from
> the common OSM files:
> opensm/osm_log.c
> opensm/osm_prtn.c
> opensm/osm_subnet.c
>
> --
> Yevgeny
>
> Signed-off-by: Yevgeny Kliteynik <[EMAIL PROTECTED]>

Excellent. Thanks. Applied.

-- Hal
[openib-general] [PATCH] osm: port to WinIB stack - moving win-specific to config.h
Hi Hal

As we discussed previously, I've added config.h on Windows, and removed the Windows-specific defines from the common OSM files:
  opensm/osm_log.c
  opensm/osm_prtn.c
  opensm/osm_subnet.c

--
Yevgeny

Signed-off-by: Yevgeny Kliteynik <[EMAIL PROTECTED]>

Index: opensm/osm_log.c
===================================================================
--- opensm/osm_log.c	(revision 9869)
+++ opensm/osm_log.c	(working copy)
@@ -96,10 +96,6 @@ static void truncate_log_file(osm_log_t*
 
 #else /* Windows */
 
-#define fstat _fstat
-#define stat _stat
-#define fileno _fileno
-
 static void truncate_log_file(osm_log_t* const p_log)
 {
   fprintf(stderr, "truncate_log_file: cannot truncate on windows system (yet)\n");
Index: opensm/osm_prtn.c
===================================================================
--- opensm/osm_prtn.c	(revision 9869)
+++ opensm/osm_prtn.c	(working copy)
@@ -61,11 +61,6 @@
 #include
 #include
 
-#ifdef WIN32
-#define snprintf _snprintf
-#define stat _stat
-#endif
-
 extern int osm_prtn_config_parse_file(osm_log_t * const p_log,
 				      osm_subn_t * const p_subn,
 				      const char *file_name);
Index: opensm/osm_subnet.c
===================================================================
--- opensm/osm_subnet.c	(revision 9869)
+++ opensm/osm_subnet.c	(working copy)
@@ -658,10 +658,6 @@ __osm_subn_opts_unpack_charp(
   }
 }
 
-#ifdef WIN32
-#define snprintf _snprintf
-#endif
-
 /** **/
 static void
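As an illustration of the idea behind this change, here is a minimal sketch of how the removed defines could be collected in one place so that the common sources keep using the portable names. The actual contents of the Windows config.h are not shown in this thread, so the layout below is an assumption; only the four macro mappings are taken from the lines the patch removes. The helper functions log_file_size and format_port_name are hypothetical and exist only to show common code calling the portable names.

```c
/* Sketch only: the real Windows config.h is not shown in the thread. */
#ifndef WIN32
#define _POSIX_C_SOURCE 200809L   /* fileno() and fstat() on POSIX systems */
#endif

#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* --- what a Windows-only config.h might carry (assumption) --- */
#ifdef WIN32
#define snprintf _snprintf
#define fstat    _fstat
#define stat     _stat
#define fileno   _fileno
#endif

/* --- common code: no #ifdef WIN32 needed past this point --- */

/* Hypothetical helper: size in bytes of an open log file, or -1. */
long log_file_size(FILE *f)
{
    struct stat st;

    assert(f != NULL);
    if (fstat(fileno(f), &st) != 0)
        return -1;
    return (long)st.st_size;
}

/* Hypothetical helper: formats "port<N>" and returns the length. */
int format_port_name(char *buf, size_t len, int port)
{
    return snprintf(buf, len, "port%d", port);
}
```

Keeping the mappings in a single header means a new common source file cannot forget one of them, which is the maintenance problem the scattered #ifdef blocks had.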
Re: [openib-general] OpenFabrics Developer Summit at SC06, Tampa Nov 16 - 17
Has the registration site been set up ? --CQ -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jeff Squyres Sent: Tuesday, October 17, 2006 6:57 AM To: Bill Boas Cc: Open Fabrics; openib-general@openib.org; [EMAIL PROTECTED] Subject: Re: [openib-general] OpenFabrics Developer Summit at SC06, Tampa Nov 16 - 17 I have copied this information to the wiki -- please make all updates there so that there is a single reference point to find all the information about the meeting. Thanks! https://openib.org/tiki/tiki-index.php?page=Meeting+Minutes On Oct 15, 2006, at 5:02 PM, Bill Boas wrote: > To all in the OpenFabrics Community > > > > We will be holding our first Developer Summit in the Tampa Convention > Center courtesy of SC06 starting at 1.30PM in Room 17 on Thursday > November 16, 2006. On Friday November 17, we will start in Room 13 at > 8.00 AM and continue till 5.00PM. We have had to schedule into these > time slots because no other usable space is available at any other > times during the week of SC06! > > > > OpenFabrics will cater food and beverages for afternoon break and > supper on Thursday, breakfast, lunch and two breaks on Friday. We will > set up a registration site at Acteva to collect $$ to cover our out of > pocket expenses - I'll email out the URL for that site in the next day > or two. > > > > Please review attached Strawman purposes, suggested attendees and > agenda. Any changes or comments, please email them to the community > for all to comment on please. 
>
> The Summit has several dimensions and themes throughout our work there:
>
> 1) Consistency and robustness of the Linux and Windows software stacks for Release 2.0 of OpenFabrics;
>
> 2) Feature selection, development resources and timelines for Release 2.0;
>
> 3) Activities, features and processes of the Enterprise Working Group on OFED 1.x until Release 2.0 is ready for hand-off to the EWG;
>
> 4) Enhancing the resources of the EWG to be ready for 2.0, so that it may subsequently be distributed as OFED 2.0 and adopted by the OpenFabrics vendor and customer communities for production use.
>
> This is far too much work for just a day and a half! PLEASE START NOW exchanging ideas for additional features, contact peer engineers from companies and customers to discuss work sizing, development resources, identify volunteer developers for items so that when we meet on the 16th we're not starting from a blank sheet!
>
> Sujal Das, Johann George, Matt Leininger, Pramod Srivatsa, Hal Rosenstock, Tom Tucker and Bob Woodruff are leading the pre-meeting STRAWMAN collation of requirements, feature prioritization, developer assignments, sizing and processes so that we have the list largely complete prior to the meeting and people know who has already volunteered for items from the list.
>
> Bill Boas
> VP, Business Development | System Fabric Works
> [EMAIL PROTECTED] | 510-375-8840
>
> ___
> openfabrics-ewg mailing list
> [EMAIL PROTECTED]
> http://openib.org/mailman/listinfo/openfabrics-ewg

--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
Re: [openib-general] Tools for development
On Wed, 2006-10-18 at 14:39 +0200, Michael S. Tsirkin wrote:
> Quoting r. Jeff Squyres <[EMAIL PROTECTED]>:
> > However, this is a larger audience than was on the call. Is there a
> > significant movement here from the developers to move to 100% git?
>
> Life would be somewhat easier for me with 100% git.
>

Probably for everyone.
[openib-general] IPoIB multicast neighbour?!
While debugging something, I enabled the ipoib debug messages and see

ib0: neigh_destructor for ff ff12:601b::::::0002

Do you have an idea what the source of this neighbour is? Why is it created, and is there a way to eliminate it somehow (my guess is that removing IPv6 support from the kernel would do that)? It's a RH4 U3 system with OFED 1.1 rc7; more info below.

thanks.

Or.

# ip a s ib0
9: ib0: mtu 1500 qdisc pfifo_fast qlen 128
    link/[32] 00:02:04:04:fe:80:00:00:00:00:00:00:00:08:f1:04:03:97:08:c5 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
    inet 192.169.3.235/24 brd 192.169.3.255 scope global ib0
    inet6 fe80::208:f104:397:8c5/64 scope link
       valid_lft forever preferred_lft forever

# ip m s ib0
9: ib0
    link 00:ff:ff:ff:ff:12:40:1b:00:00:00:00:00:00:00:00:00:00:00:01
    link 00:ff:ff:ff:ff:12:60:1b:00:00:00:00:00:00:00:01:ff:97:08:c5
    link 00:ff:ff:ff:ff:12:60:1b:00:00:00:00:00:00:00:00:00:00:00:01
    inet 224.0.0.1
    inet6 ff02::1:ff97:8c5
    inet6 ff02::1
Re: [openib-general] OFED-1.1-pre1 is ready
Michael S. Tsirkin wrote: > Quoting r. Or Gerlitz <[EMAIL PROTECTED]>: >>> Try something like >>> git log v2.6.18-rc6..936b9fc0bd1411b52826213a5d89e2ceb4f52a78 >>> to get the list of OFED changes against v2.6.18-rc6. >> thanks for all the info, however i think the OFED docs must state on >> what upstream version are the OFED kernel IB drivers based (ie in this >> case 2.6.18-rc6 tag of linus tree) so one is able to determine that from >> reading the docs only (ie without using GIT). > > Makes sense. Care to formulate the appropriate wording? > Which document should this go into? OK, something in the spirit of (remove the XXX) the below: I will be able to produce something better tomorrow morning. Or. # Kernel based on XXX=2.6.18-rc6 mainline kernel. The patches to this mainline kernel are included within the OFED sources, please see the YYY doc for their location and how to apply them on the kernel sources. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [PATCH TRIVIAL] opensm: fix function name in the comment
On Wed, 2006-10-18 at 09:44, Sasha Khapyorsky wrote: > This trivially fixes function name (osm_switch_set_min_lid_size) in the > comment. > > Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]> Thanks. Applied. -- Hal
[openib-general] [PATCH TRIVIAL] opensm: fix function name in the comment
This trivially fixes function name (osm_switch_set_min_lid_size) in the comment. Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]> --- osm/include/opensm/osm_switch.h |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/osm/include/opensm/osm_switch.h b/osm/include/opensm/osm_switch.h index 8c4799f..0cf7542 100644 --- a/osm/include/opensm/osm_switch.h +++ b/osm/include/opensm/osm_switch.h @@ -440,9 +440,9 @@ osm_switch_set_hops( * SEE ALSO */ -/f* OpenSM: Switch/osm_switch_set_hops +/f* OpenSM: Switch/osm_switch_set_min_lid_size * NAME -* osm_switch_set_hops +* osm_switch_set_min_lid_size * * DESCRIPTION * Sets the size of the switch's routing table to at least accomodate the -- 1.4.2.3.g128e ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Tools for development
Quoting r. Jeff Squyres <[EMAIL PROTECTED]>: > Does anyone have any sysadmin cycles to do this kind of stuff? I > would expect it to be a flurry of activity here in the beginning > followed by short bursts of activity separated by long periods of > nothing. FWIW, I can help out keeping the git tool updated - I'm doing it at Mellanox now and it's quite trivial. In particular, this can be done without central admin privileges - git does not need to be suid root, and it's easy to set git up to run from some "git-admin" user's home directory. Playing with softlinks makes it quite easy for this user to update git for everyone. -- MST
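The softlink approach described above could be sketched roughly as follows. This is only an illustration: the `versions/` and `current` layout and the `activate_git_version` helper are assumptions made for the example, not anything specified in the message. The idea is that each source-built git release lives in its own directory under the unprivileged user's home, and a single symlink selects the active one.

```shell
#!/bin/sh
# Sketch: keep several source-built git versions side by side and switch
# the active one by repointing a single symlink.
# Hypothetical layout: $root/versions/<version>/bin/git
#                      $root/current -> versions/<version>

activate_git_version() {
    root=$1      # e.g. the home directory of a "git-admin" user
    version=$2   # e.g. 1.4.2.3
    if [ ! -d "$root/versions/$version" ]; then
        echo "git $version is not installed under $root/versions" >&2
        return 1
    fi
    # -n replaces the existing link itself instead of following it
    ln -sfn "versions/$version" "$root/current"
}
```

Everyone else would put `$root/current/bin` on their PATH once; later upgrades then only touch the symlink.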
Re: [openib-general] Tools for development
On Oct 17, 2006, at 12:45 PM, Michael S. Tsirkin wrote: >> Developers had requested git 1.4, but Ubuntu had an older >> version. We >> went ahead and installed git from source. I'd prefer to stick to >> Ubuntu >> packages if possible. > > We have much to gain from newer versions - just look at gitweb > change log. > But my assumption here was that someone will keep the built from > source > tools updated. I don't have a problem alerting the list when new > versions come out. > > If, as Roland suggested, we'll be stuck at this version, it's better > to stick with distro-supplied ones, assuming that *that* is updated > in a timely fashion. > > So, I guess the question is how is the system supported/updated? This is probably quite the operative question. I volunteered to set up and maintain trac if the group decides to use it. I don't know what the plan is for supporting the other software packages. I, too, would side with Michael that relatively recent versions of svn (although this may become moot) and trac tend to be beneficial to developers (I assume the same is true for git, but I have no direct experience). Does anyone have any sysadmin cycles to do this kind of stuff? I would expect it to be a flurry of activity here in the beginning followed by short bursts of activity separated by long periods of nothing. -- Jeff Squyres Server Virtualization Business Unit Cisco Systems
Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...
On Wed, 2006-10-18 at 09:29 +0200, Michael S. Tsirkin wrote: > Quoting r. Doug Ledford <[EMAIL PROTECTED]>: > > > >From our dicussion, it seems we should be able to just push the > > > small number of missing bits into RHEL5 directly. That would be > > > nicer of course. > > > > It depends. If there's lots of individual changes, it might be easier > > to push the OFED 1.1 change. But, that depends on when the final OFED > > 1.1 comes out and how much it varies from the existing RPMs. > > OFED is in deep freeze, so you can already look at it to estimate the amount > of > changes against 2.6.18. > Could you look at the diff please so that I know whether it's worth it > to invest in building the minimal patch set for pushing into RHEL5, > or whether you'll push OFED 1.1 into RHEL kernel as is? Yeah, I'll look over the diff today. -- Doug Ledford <[EMAIL PROTECTED]> GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband signature.asc Description: This is a digitally signed message part ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [openfabrics-ewg] OFED 1.1-RC7 build problem on SLES10
When I ran the install script without having the kernel sources rpm on SLES 10, I had to wait several minutes before it failed. Shouldn't the script check such dependencies before starting the build process? Erez Vladimir Sokolovsky wrote: > > Hi, > OFED installation script check that the directory > /lib/modules/`uname -r`/build/ and the file > /lib/modules/`uname -r`/build/Makefle exist. > It does not check for kernel-source RPM because of kernels from > kernel.org support. > > > -- > > Regards, > Vladimir > > On Tue, 2006-10-17 at 15:45 -0700, Scott Weitzenkamp (sweitzen) wrote: > > You need the kernel-source RPM, I guess the OFED install.sh should check > > for that RPM. > > > > svbu-qa-opteron-1:~ # uname -a > > Linux svbu-qa-opteron-1 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC > > 2006 i68 > > 6 athlon i386 GNU/Linux > > svbu-qa-opteron-1:~ # rpm -qa | fgrep kernel > > kernel-source-2.6.16.21-0.8 > > kernel-smp-2.6.16.21-0.8 > > kernel-ib-1.1-2.6.16.21_0.8_smp > > kernel-ib-devel-1.1-2.6.16.21_0.8_smp > > svbu-qa-opteron-1:~ # ls /usr/src/linux-2.6.16.21-0.8-obj/i386/smp > > .config Makefilearch include2 > > .kernelrelease Module.symvers include scripts > > > > Scott Weitzenkamp > > SQA and Release Manager > > Server Virtualization Business Unit > > Cisco Systems > > > > > > > -Original Message- > > > From: [EMAIL PROTECTED] > > > [mailto:[EMAIL PROTECTED] On Behalf Of Chris Dennett > > > Sent: Tuesday, October 17, 2006 12:46 PM > > > To: [EMAIL PROTECTED]; openib-general@openib.org > > > Subject: [openib-general] OFED 1.1-RC7 build problem on SLES10 > > > > > > I've been trying to install OFED 1.1 RC7 on an x86 server > > > with a fresh install > > > of SLES10 (32-bit). It errors out when trying to build the > > > kernel modules. > > > I've included what I think are the relevant log messages > > > below. I've tried > > > installing everything (minus iser and tvflash) or just the > > > modules needed for > > > SRP. 
I've installed 1.1 RC7 successfully on other RedHat > > > servers without any > > > problems. I am installing as root. Any help would be appreciated. > > > > > > Thanks. > > > > > > -Chris > > > > > > == > > > + make kernel > > > Building kernel modules > > > Kernel version: 2.6.16.21-0.8-smp > > > Modules directory: //lib/modules/2.6.16.21-0.8-smp > > > Kernel sources: /lib/modules/2.6.16.21-0.8-smp/build > > > env EXTRA_CFLAGS=" -I/var/tmp/OFEDRPM/BUILD/openib-1.1/include > > > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/include \ > > > > > > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/ulp/ipoib \ > > > > > > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/debug" \ > > > make -C /lib/modules/2.6.16.21-0.8-smp/build > > > SUBDIRS="/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband" > > > KERNELRELEASE=2.6.16.21-0.8-smp \ > > > EXTRAVERSION=.21-0.8-smp V=1 \ > > > CONFIG_INFINIBAND=m \ > > > CONFIG_INFINIBAND_IPOIB=m \ > > > CONFIG_INFINIBAND_SDP= \ > > > CONFIG_INFINIBAND_SRP=m \ > > > CONFIG_INFINIBAND_USER_MAD=m \ > > > CONFIG_INFINIBAND_USER_ACCESS=m \ > > > CONFIG_INFINIBAND_ADDR_TRANS=y \ > > > CONFIG_INFINIBAND_MTHCA=m \ > > > CONFIG_INFINIBAND_IPOIB_DEBUG=y \ > > > CONFIG_INFINIBAND_ISER= \ > > > CONFIG_INFINIBAND_EHCA= \ > > > CONFIG_INFINIBAND_RDS= \ > > > CONFIG_INFINIBAND_RDS_DEBUG= \ > > > CONFIG_INFINIBAND_IPOIB_DEBUG_DATA= \ > > > CONFIG_INFINIBAND_SDP_SEND_ZCOPY= \ > > > CONFIG_INFINIBAND_SDP_RECV_ZCOPY= \ > > > CONFIG_INFINIBAND_SDP_DEBUG= \ > > > CONFIG_INFINIBAND_SDP_DEBUG_DATA= \ > > > CONFIG_INFINIBAND_IPATH= \ > > > CONFIG_INFINIBAND_MTHCA_DEBUG=y \ > > > CONFIG_INFINIBAND_MADEYE= \ > > > LINUXINCLUDE='-I/var/tmp/OFEDRPM/BUILD/openib-1.1/include \ > > > > > > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/include \ > > > -Iinclude \ > > > $(if $(KBUILD_SRC),-Iinclude2 -I$(srctree)/include) \ > > > -include include/linux/autoconf.h \ > > > -include > > > 
/var/tmp/OFEDRPM/BUILD/openib-1.1/include/linux/autoconf.h \ > > > ' \ > > > modules > > > make[1]: Entering directory > > > `/usr/src/linux-2.6.16.21-0.8-obj/i386/smp' > > > make[1]: *** No rule to make target `modules'. Stop. > > > make[1]: Leaving directory `/usr/src/linux-2.6.16.21-0.8-obj/i386/smp' > > > make: *** [kernel] Error 2 > > > error: Bad exit status from /var/tmp/rpm-tmp.92052 (%install) > > > > > > > > > RPM build errors: > > > user vlad does not exist - using root > > > group mtl does not exist - using root > > > user vlad does not exist - using root > > > group mtl does not exist - using root > > > Bad exit status from /var/tmp/rpm-tmp.92052 (%install) > > > ERROR: Failed executing "rp
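A pre-build check of the kind asked about at the top of this message might look like the sketch below. It tests only what the thread says the installer already looks at (the `build` directory and its `Makefile` under `/lib/modules/<release>`), and it does so before anything is compiled. The function name `check_kernel_build_tree` and the optional modules-root argument are hypothetical and are not part of the OFED install script.

```shell
#!/bin/sh
# Sketch: fail fast if the kernel build tree needed for module builds is missing.
# modroot defaults to /lib/modules; it is a parameter only so that the check
# can be exercised against a scratch directory.

check_kernel_build_tree() {
    release=${1:-$(uname -r)}
    modroot=${2:-/lib/modules}
    build="$modroot/$release/build"

    if [ ! -d "$build" ]; then
        echo "missing kernel build directory: $build" >&2
        return 1
    fi
    if [ ! -f "$build/Makefile" ]; then
        echo "missing $build/Makefile (kernel sources not installed?)" >&2
        return 1
    fi
    return 0
}
```

Note that this makes no attempt to verify that the Makefile actually provides a `modules` target, so it is a cheap first gate rather than a guarantee that the build will succeed.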
Re: [openib-general] Tools for development
Quoting r. Jeff Squyres <[EMAIL PROTECTED]>: > However, this is a larger audience than was on the call. Is there a > significant movement here from the developers to move to 100% git? Life would be somewhat easier for me with 100% git. -- MST
[openib-general] New DNS name for openfabrics.org
Who runs the DNS for openfabrics.org? Could we get a new DNS A record added for the new server: staging.openfabrics.org, pointing at 69.55.231.195? Thanks! -- Jeff Squyres Server Virtualization Business Unit Cisco Systems
Re: [openib-general] Tools for development
I was not on the call last week, but I understand that there was some discussion about exactly this point (ditch SVN and go 100% git): the decision was to stick with SVN for userspace stuff and stick with git for kernel stuff. However, this is a larger audience than was on the call. Is there a significant movement here from the developers to move to 100% git? (I don't really care) On Oct 17, 2006, at 12:21 PM, Sasha Khapyorsky wrote: > On 17:04 Tue 17 Oct , Michael S. Tsirkin wrote: >> Quoting r. Steve Wise <[EMAIL PROTECTED]>: >>> At the risk of opening a can of worms, is there any reason we >>> don't move >>> the user stuff into its own git tree? This would get rid of svn >>> altogether... >> >> If we do, that should probably be multiple git trees - verbs, >> management, >> tests are all more or less independent and developed mostly by >> different people. > > Reasonable. And generally this should not be too bad. > > Sasha > > ___ > openib-general mailing list > openib-general@openib.org > http://openib.org/mailman/listinfo/openib-general > > To unsubscribe, please visit http://openib.org/mailman/listinfo/ > openib-general -- Jeff Squyres Server Virtualization Business Unit Cisco Systems ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Tools for development
On Oct 17, 2006, at 9:45 AM, Michael S. Tsirkin wrote: >> It seems like trac can integrate with both SVN and git and would also >> provide us with integrated wiki capabilities. > > One feature that bugzilla has (and that seems to be disabled in > openib bugzilla > :() is mail integration, where I can Cc bugzilla and mail contents > will get > attached to bug report. I was hoping that new server will have this > capability. Does trac have this? Good question; I don't know. I'll find out. -- Jeff Squyres Server Virtualization Business Unit Cisco Systems
Re: [openib-general] [PATCH] opensm: misc fixes in lft dump file parser
On Tue, 2006-10-17 at 14:28, Sasha Khapyorsky wrote: > There are misc small fixes for lft dump parser: > - merge ERROR and SYS logging in single osm_log() call > - more strict strtoul() results checking > - fix potential bugs with invalid dump files > - break too long lines > > Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]> Thanks. Applied. -- Hal
Re: [openib-general] sysfs exposure of port counters useless?
On Tue, 2006-10-17 at 23:49, Hal Rosenstock wrote: [snip...] > > For IB counters in a Cisco switch, we read and reset the 32-bit counters > > once per second and keep 64-bit counters internally. > > 32 bit byte counters can be pegged in only 16 seconds on a 4x SDR link > and there are 4x DDR links now (8 seconds) and 12x links (5 seconds) so > that strategy is inaccurate on busy networks. I was a little sleepy... I take back the last part of the comment. 1 sec should be frequent enough. The only issue with this approach is the skew in the reading of the port counters, as the interval is not as precise as it could be, but that is likely to be good enough for these purposes. -- Hal
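For what it's worth, the saturation times quoted above can be roughly reproduced with a little arithmetic. The sketch below rests on two assumptions that are not stated in the thread: that the 32-bit port data counters advance once per 4 bytes of data, and that the usable data rates are about 1, 2 and 3 GB/s for 4x SDR, 4x DDR and 12x SDR links respectively. The function name `seconds_to_peg` is made up for the example.

```shell
#!/bin/sh
# Sketch: whole seconds until a 32-bit port data counter saturates at line rate.
# Assumption: the counter advances once per 4 bytes of data.
seconds_to_peg() {
    data_bytes_per_sec=$1
    echo $(( (1 << 32) * 4 / data_bytes_per_sec ))
}

seconds_to_peg 1000000000   # 4x SDR, assumed ~1 GB/s of data
seconds_to_peg 2000000000   # 4x DDR, assumed ~2 GB/s
seconds_to_peg 3000000000   # 12x SDR, assumed ~3 GB/s
```

Under those assumptions the results come out as 17, 8 and 5 whole seconds, close to the 16, 8 and 5 seconds mentioned, which is why a read-and-reset once per second leaves a comfortable margin.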
Re: [openib-general] [openfabrics-ewg] OFED 1.1-RC7 build problem on SLES10
Hi, The OFED installation script checks that the directory /lib/modules/`uname -r`/build/ and the file /lib/modules/`uname -r`/build/Makefile exist. It does not check for the kernel-source RPM because kernels from kernel.org are also supported. -- Regards, Vladimir On Tue, 2006-10-17 at 15:45 -0700, Scott Weitzenkamp (sweitzen) wrote: > You need the kernel-source RPM, I guess the OFED install.sh should check > for that RPM. > > svbu-qa-opteron-1:~ # uname -a > Linux svbu-qa-opteron-1 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC > 2006 i68 > 6 athlon i386 GNU/Linux > svbu-qa-opteron-1:~ # rpm -qa | fgrep kernel > kernel-source-2.6.16.21-0.8 > kernel-smp-2.6.16.21-0.8 > kernel-ib-1.1-2.6.16.21_0.8_smp > kernel-ib-devel-1.1-2.6.16.21_0.8_smp > svbu-qa-opteron-1:~ # ls /usr/src/linux-2.6.16.21-0.8-obj/i386/smp > .config Makefile arch include2 > .kernelrelease Module.symvers include scripts > > Scott Weitzenkamp > SQA and Release Manager > Server Virtualization Business Unit > Cisco Systems > > > > -Original Message- > > From: [EMAIL PROTECTED] > > [mailto:[EMAIL PROTECTED] On Behalf Of Chris Dennett > > Sent: Tuesday, October 17, 2006 12:46 PM > > To: [EMAIL PROTECTED]; openib-general@openib.org > > Subject: [openib-general] OFED 1.1-RC7 build problem on SLES10 > > > > I've been trying to install OFED 1.1 RC7 on an x86 server > > with a fresh install > > of SLES10 (32-bit). It errors out when trying to build the > > kernel modules. > > I've included what I think are the relevant log messages > > below. I've tried > > installing everything (minus iser and tvflash) or just the > > modules needed for > > SRP. I've installed 1.1 RC7 successfully on other RedHat > > servers without any > > problems. I am installing as root. Any help would be appreciated. > > > > Thanks. 
> > > > -Chris > > > > == > > + make kernel > > Building kernel modules > > Kernel version: 2.6.16.21-0.8-smp > > Modules directory: //lib/modules/2.6.16.21-0.8-smp > > Kernel sources: /lib/modules/2.6.16.21-0.8-smp/build > > env EXTRA_CFLAGS=" -I/var/tmp/OFEDRPM/BUILD/openib-1.1/include > > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/include \ > > > > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/ulp/ipoib \ > > > > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/debug" \ > > make -C /lib/modules/2.6.16.21-0.8-smp/build > > SUBDIRS="/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband" > > KERNELRELEASE=2.6.16.21-0.8-smp \ > > EXTRAVERSION=.21-0.8-smp V=1 \ > > CONFIG_INFINIBAND=m \ > > CONFIG_INFINIBAND_IPOIB=m \ > > CONFIG_INFINIBAND_SDP= \ > > CONFIG_INFINIBAND_SRP=m \ > > CONFIG_INFINIBAND_USER_MAD=m \ > > CONFIG_INFINIBAND_USER_ACCESS=m \ > > CONFIG_INFINIBAND_ADDR_TRANS=y \ > > CONFIG_INFINIBAND_MTHCA=m \ > > CONFIG_INFINIBAND_IPOIB_DEBUG=y \ > > CONFIG_INFINIBAND_ISER= \ > > CONFIG_INFINIBAND_EHCA= \ > > CONFIG_INFINIBAND_RDS= \ > > CONFIG_INFINIBAND_RDS_DEBUG= \ > > CONFIG_INFINIBAND_IPOIB_DEBUG_DATA= \ > > CONFIG_INFINIBAND_SDP_SEND_ZCOPY= \ > > CONFIG_INFINIBAND_SDP_RECV_ZCOPY= \ > > CONFIG_INFINIBAND_SDP_DEBUG= \ > > CONFIG_INFINIBAND_SDP_DEBUG_DATA= \ > > CONFIG_INFINIBAND_IPATH= \ > > CONFIG_INFINIBAND_MTHCA_DEBUG=y \ > > CONFIG_INFINIBAND_MADEYE= \ > > LINUXINCLUDE='-I/var/tmp/OFEDRPM/BUILD/openib-1.1/include \ > > > > -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/include \ > > -Iinclude \ > > $(if $(KBUILD_SRC),-Iinclude2 -I$(srctree)/include) \ > > -include include/linux/autoconf.h \ > > -include > > /var/tmp/OFEDRPM/BUILD/openib-1.1/include/linux/autoconf.h \ > > ' \ > > modules > > make[1]: Entering directory > > `/usr/src/linux-2.6.16.21-0.8-obj/i386/smp' > > make[1]: *** No rule to make target `modules'. Stop. 
> > make[1]: Leaving directory `/usr/src/linux-2.6.16.21-0.8-obj/i386/smp' > > make: *** [kernel] Error 2 > > error: Bad exit status from /var/tmp/rpm-tmp.92052 (%install) > > > > > > RPM build errors: > > user vlad does not exist - using root > > group mtl does not exist - using root > > user vlad does not exist - using root > > group mtl does not exist - using root > > Bad exit status from /var/tmp/rpm-tmp.92052 (%install) > > ERROR: Failed executing "rpmbuild --rebuild --define '_topdir > > /var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' --define > > 'build_root > > /var/tmp/OFED' --define 'configure_options --with-libibcommon > > --with-libibmad > > --with-libibumad --with-libibverbs --with-libmthca --with-opensm > > --with-librdmacm --with-openib-diags --with-srptools --with-mstflint > > --with-perftest --with-ipoib-mod --with-mthca-mod --with-srp-mod > > --with-core-mod --with-us
Re: [openib-general] srp trouble on RHEL4 U4
> From: Mirochnick Natalia [mailto:[EMAIL PROTECTED] > Subject: Re: [openib-general] srp trouble on RHEL4 U4 > > I've changed the string as you've advised, but it didn't > work. The only > difference is that string "" was added in /var/log/messages. > > [EMAIL PROTECTED] echo -n > id_ext=0001,ioc_guid=00066a01380001de,dgid=fe8 > 66a0261de,pkey=,service_id=49435353525 > 0,io_class=ff00 > > /sys/class/infiniband_srp/srp-mthca0-1/add_target > > [EMAIL PROTECTED] tail /var/log/messages > Oct 18 14:01:26 ... kernel: REJ reason 0x0 > Oct 18 14:01:26 ... kernel: ib_srp: Connection failed > > By the way, in ofed srp_release_notes.txt hasn't been said > that io_class is > mandatory parameter. > > Natalia It is an error in the OFED 1.0 srp_release_notes.txt. For all Silverstorm SRP targets (like the Silverstorm 5000 switch with Fiber Channel IOC), the 'io_class=ff00' parameter is mandatory. Let me investigate a little more, and get back to you. Madhu ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] srp trouble on RHEL4 U4
I've changed the string as you've advised, but it didn't work. The only difference is that string "" was added in /var/log/messages. [EMAIL PROTECTED] echo -n id_ext=0001,ioc_guid=00066a01380001de,dgid=fe866a0261de,pkey=,service_id=494353535250,io_class=ff00 > /sys/class/infiniband_srp/srp-mthca0-1/add_target [EMAIL PROTECTED] tail /var/log/messages Oct 18 14:01:26 ... kernel: REJ reason 0x0 Oct 18 14:01:26 ... kernel: ib_srp: Connection failed By the way, in ofed srp_release_notes.txt hasn't been said that io_class is mandatory parameter. Natalia - Original Message - >> Madhu Lakshmanan wrote: >> Which SRP target are you using? Could you also give some more >> details on >> the fabric setup; i.e. what IB switch / gateway your host is connected >> to, and what kind of storage you wish to access? The full command that >> you used (echo -n > /add_target) to configure the SRP target >> would be very useful as well. > > From: Mirochnick Natalia [mailto:[EMAIL PROTECTED] > Subject: Re: [openib-general] srp trouble on RHEL4 U4 > > 2. Here's the details: > > IB switch: Silverstorm 5000 > Storage: NetApp FAS 320 > > [EMAIL PROTECTED] /usr/ofed/sbin/ibsrpdm -c > id_ext=0001,ioc_guid=00066a01380001de,dgid=fe8 > 66a0261de,pkey=,service_id=49435353525 > 0,io_class=ff00 > id_ext=0001,ioc_guid=00066a02380001de,dgid=fe8 > 66a0261de,pkey=,service_id=49435353525 > 0,io_class=ff00 > > [EMAIL PROTECTED] echo -n > id_ext=0001,ioc_guid=00066a01380001de,dgid=fe8 > 66a0261de,pkey=,service_id=494353535250 > > /sys/class/infiniband_srp/srp-mthca0-1/add_target > The problem is with the echo string you are giving. The command should be invoked as: echo -n id_ext=0001,ioc_guid=00066a01380001de,dgid=fe8 66a0261de,pkey=,service_id=494353535250, io_class=ff00 > /sys/class/infiniband_srp/srp-mthca0-1/add_target You were missing the 'io_class=ff00' bit. Let me know if it works. 
Madhu ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] srp trouble on RHEL4 U4
>> Madhu Lakshmanan wrote: >> Which SRP target are you using? Could you also give some more >> details on >> the fabric setup; i.e. what IB switch / gateway your host is connected >> to, and what kind of storage you wish to access? The full command that >> you used (echo -n > /add_target) to configure the SRP target >> would be very useful as well. > > From: Mirochnick Natalia [mailto:[EMAIL PROTECTED] > Subject: Re: [openib-general] srp trouble on RHEL4 U4 > > 2. Here's the details: > > IB switch: Silverstorm 5000 > Storage: NetApp FAS 320 > > [EMAIL PROTECTED] /usr/ofed/sbin/ibsrpdm -c > id_ext=0001,ioc_guid=00066a01380001de,dgid=fe8 > 66a0261de,pkey=,service_id=49435353525 > 0,io_class=ff00 > id_ext=0001,ioc_guid=00066a02380001de,dgid=fe8 > 66a0261de,pkey=,service_id=49435353525 > 0,io_class=ff00 > > [EMAIL PROTECTED] echo -n > id_ext=0001,ioc_guid=00066a01380001de,dgid=fe8 > 66a0261de,pkey=,service_id=494353535250 > > /sys/class/infiniband_srp/srp-mthca0-1/add_target > The problem is with the echo string you are giving. The command should be invoked as: echo -n id_ext=0001,ioc_guid=00066a01380001de,dgid=fe8 66a0261de,pkey=,service_id=494353535250, io_class=ff00 > /sys/class/infiniband_srp/srp-mthca0-1/add_target You were missing the 'io_class=ff00' bit. Let me know if it works. Madhu ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
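The fix suggested above amounts to making sure the string written to `add_target` carries every key the target needs. A minimal sketch of such a check follows. The helper name `has_target_keys` and the placeholder values are made up for illustration, and the list of keys is simply the set that appears in the `ibsrpdm -c` output quoted in this thread, not an authoritative statement of what ib_srp requires.

```shell
#!/bin/sh
# Sketch: verify that an SRP target string contains each expected "key="
# before it is written to .../add_target.
has_target_keys() {
    target=$1
    shift
    for key in "$@"; do
        case ",$target," in
            *",$key="*) ;;                            # key is present
            *) echo "missing $key" >&2; return 1 ;;
        esac
    done
    return 0
}

# Placeholder values only; real ones would come from `ibsrpdm -c`.
t="id_ext=ID,ioc_guid=GUID,dgid=DGID,pkey=PKEY,service_id=SID,io_class=CLASS"
has_target_keys "$t" id_ext ioc_guid dgid pkey service_id io_class && echo "ok"
```

Running the check before the `echo -n ... > add_target` step would flag an omitted `io_class` up front instead of leaving only a rejected connection in the kernel log.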
Re: [openib-general] srp trouble on RHEL4 U4
1. Thank alot for your attention. 2. Here's the details: IB switch: Silverstorm 5000 Storage: NetApp FAS 320 [EMAIL PROTECTED] /usr/ofed/sbin/ibsrpdm -c id_ext=0001,ioc_guid=00066a01380001de,dgid=fe866a0261de,pkey=,service_id=494353535250,io_class=ff00 id_ext=0001,ioc_guid=00066a02380001de,dgid=fe866a0261de,pkey=,service_id=494353535250,io_class=ff00 [EMAIL PROTECTED] echo -n id_ext=0001,ioc_guid=00066a01380001de,dgid=fe866a0261de,pkey=,service_id=494353535250 > /sys/class/infiniband_srp/srp-mthca0-1/add_target Thanks in advance, Natalia Mirochnick - Original Message - From: "Lakshmanan, Madhu" <[EMAIL PROTECTED]> To: "Mirochnick Natalia" <[EMAIL PROTECTED]>; Sent: Wednesday, October 18, 2006 11:04 AM Subject: RE: [openib-general] srp trouble on RHEL4 U4 Which SRP target are you using? Could you also give some more details on the fabric setup; i.e. what IB switch / gateway your host is connected to, and what kind of storage you wish to access? The full command that you used (echo -n > /add_target) to configure the SRP target would be very useful as well. The Silverstorm 7000 is an HCA (Host Channel Adapter). By itself, it should in most cases not be the primary reason for the error code you are seeing. The issue is more likely to be due to the SRP target you are attempting to connect to. Thanks, Madhu Lakshmanan Silverstorm Technologies, Inc. [EMAIL PROTECTED] > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of > Mirochnick Natalia > Sent: Tuesday, October 17, 2006 10:47 AM > To: openib-general@openib.org > Subject: [openib-general] srp trouble on RHEL4 U4 > > Hello, > > I'm trying to setup SRP connection (SRP in OFED 1.0). > IB card is Silverstorm 7000. > > ib_srp module is loaded, but after attempt to to create an > SRP device (as it > was described in manual srp_release_notes.txt) the error appears in > /var/log/messages: > kernel: REJ reason 0x0 > > What's wrong? 
> -- > Thanks in advance, > Mirochnick Natalia > > > ___ > openib-general mailing list > openib-general@openib.org > http://openib.org/mailman/listinfo/openib-general > > To unsubscribe, please visit > http://openib.org/mailman/listinfo/openib-general > > __ NOD32 1.1808 (20061017) Information __ This message was checked by NOD32 antivirus system. http://www.eset.com ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] RHEL5 and OFED ...
Quoting r. Doug Ledford <[EMAIL PROTECTED]>: > > Hmm, no, I really want to take a srpm from amd64 and get a 32 bit > > gcc executable that will build 64 bit binaries that match those > > built on native amd64 system exactly. > > Between just i386 and x86_64, you might be able to do that. I guess what I would like is for Red Hat to enable -m64 in gcc/binutils from the 32 bit distribution. Then once I have a 64 bit machine, I could boot a 32 bit distro but change just the kernel to 64 bit. -- MST
Re: [openib-general] OFED-1.1-pre1 is ready
Quoting r. Or Gerlitz <[EMAIL PROTECTED]>: > > Try something like > > git log v2.6.18-rc6..936b9fc0bd1411b52826213a5d89e2ceb4f52a78 > > to get the list of OFED changes against v2.6.18-rc6. > > thanks for all the info, however i think the OFED docs must state on > what upstream version are the OFED kernel IB drivers based (ie in this > case 2.6.18-rc6 tag of linus tree) so one is able to determine that from > reading the docs only (ie without using GIT). Makes sense. Care to formulate the appropriate wording? Which document should this go into? -- MST ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...
Quoting r. Doug Ledford <[EMAIL PROTECTED]>: > > >From our dicussion, it seems we should be able to just push the > > small number of missing bits into RHEL5 directly. That would be > > nicer of course. > > It depends. If there's lots of individual changes, it might be easier > to push the OFED 1.1 change. But, that depends on when the final OFED > 1.1 comes out and how much it varies from the existing RPMs. OFED is in deep freeze, so you can already look at it to estimate the amount of changes against 2.6.18. Could you look at the diff please so that I know whether it's worth it to invest in building the minimal patch set for pushing into RHEL5, or whether you'll push OFED 1.1 into RHEL kernel as is? -- MST ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] OFED-1.1-pre1 is ready
Michael S. Tsirkin wrote: > Quoting r. Or Gerlitz <[EMAIL PROTECTED]>: >> Subject: Re: OFED-1.1-pre1 is ready >> >> Tziporet Koren wrote: >>> OFED 1.1-pre1 is available: >>> URL: >>> https://openib.org/svn/gen2/branches/1.1/ofed/releases/OFED-1.1-pre1.tgz >>> Release details: >>> >>> BUILD_ID: >>> OFED-1.1-pre1 >>> >>> openib-1.1 (REV=9854) >>> # User space >>> https://openib.org/svn/gen2/branches/1.1/src/userspace >>> Git: >>> ref: refs/heads/ofed_1_1 >>> commit 936b9fc0bd1411b52826213a5d89e2ceb4f52a78 >> Hi Tziporet, >> >> I have asked this Michael few days ago and did not get a reply yet: can >> you clarify where is the version of the OFED IB ***kernel*** drivers >> stated? > > That's the commit 936b9fc0bd1411b52826213a5d89e2ceb4f52a78 part. I see. >> I understand they are typically based on some tag of Linus GIT tree (for >> example OFED1.1 uses 2.6.18 - correct?) but i could not find any notice >> for that in the docs nor in the per rc emails. > OFED1.1 was last rebased against 2.6.18-rc6 + a couple of small patches > touching > cma + adding scripts out of kernel modules backports etc. 2.6.18 wasn't out > by code freeze time, but all fixes in 2.6.18 are also in OFED 1.1. > Try something like > git log v2.6.18-rc6..936b9fc0bd1411b52826213a5d89e2ceb4f52a78 > to get the list of OFED changes against v2.6.18-rc6. thanks for all the info, however i think the OFED docs must state on what upstream version are the OFED kernel IB drivers based (ie in this case 2.6.18-rc6 tag of linus tree) so one is able to determine that from reading the docs only (ie without using GIT). Or. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
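The `git log` range quoted above lists every commit reachable from the OFED commit but not from the upstream tag. A small wrapper along these lines could print that list in short form, optionally limited to one directory. The function name `changes_since` is hypothetical; only standard `git log` features are used (`--oneline`, a `base..tip` range, and an optional trailing path).

```shell
#!/bin/sh
# Sketch: list commits that are in tip but not in base,
# optionally restricted to a path inside the tree.
changes_since() {
    base=$1
    tip=$2
    # ${3:+"$3"} expands to the path only when a third argument was given
    git log --oneline "$base..$tip" -- ${3:+"$3"}
}

# Example (run inside a clone that has both refs):
#   changes_since v2.6.18-rc6 936b9fc0bd1411b52826213a5d89e2ceb4f52a78 drivers/infiniband
```

This still needs git and a clone, so it does not replace stating the base version in the docs; it only makes the comparison easier for those who do have the tree.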
Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...
On Wed, 2006-10-18 at 08:58 +0200, Michael S. Tsirkin wrote:
> Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> > Subject: Re: [openfabrics-ewg] RHEL5 and OFED ...
> >
> > On Sun, 2006-10-15 at 12:13 -0400, Doug Ledford wrote:
> >
> > > > Now for userspace - does RHEL5 include at least libibverbs-1.0?
> > > > This has been released a while back, and Roland makes regular bugfix
> > > > releases.
> > >
> > > It includes the OFED 1.0 libibverbs (which makes openmpi complain about
> > > lack of out of band data support, but otherwise seems to work).
>
> What's out of band data BTW?

Probably just me misremembering the error message... here the actual
message is:

[0,1,1][btl_openib_endpoint.c:945:mca_btl_openib_endpoint_create_qp] ibv_create_qp: returned 0 byte(s) for max inline data
[0,1,1][btl_openib_endpoint.c:945:mca_btl_openib_endpoint_create_qp] ibv_create_qp: returned 0 byte(s) for max inline data

> > I built the OFED-1.1-pre1 user space RPMs for RHEL5. They are available
> > at my web site.
>
> Thanks!
>
> > Kernel RPMs with the OFED 1.1 code will come a little
> > later.
>
> From our discussion, it seems we should be able to just push the
> small number of missing bits into RHEL5 directly. That would be
> nicer of course.

It depends. If there's lots of individual changes, it might be easier
to push the OFED 1.1 change. But, that depends on when the final OFED
1.1 comes out and how much it varies from the existing RPMs.

--
Doug Ledford <[EMAIL PROTECTED]>
GPG KeyID: CFBFF194
http://people.redhat.com/dledford

Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband
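The Open MPI lines quoted in the message above follow a fixed pattern, so the reported inline size can be pulled out of a log mechanically. This is a minimal sketch in Python, assuming the message format is exactly as quoted in this thread. The regular expression and the function name are illustrative and not part of Open MPI. The thread does not say what the number means; it presumably corresponds to the `max_inline_data` capability of the queue pair, but treat that reading as an assumption.

```python
import re

# Pattern for the message as quoted in the thread. The format is assumed from
# that single sample; other Open MPI versions may word it differently.
INLINE_RE = re.compile(
    r"\[(?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\]\s+"
    r"ibv_create_qp: returned (?P<bytes>\d+) byte\(s\) for max inline data"
)


def inline_data_reports(log_text):
    """Return the max-inline-data byte counts reported in log_text, in order."""
    return [int(m.group("bytes")) for m in INLINE_RE.finditer(log_text)]


sample = (
    "[0,1,1][btl_openib_endpoint.c:945:mca_btl_openib_endpoint_create_qp] "
    "ibv_create_qp: returned 0 byte(s) for max inline data"
)
print(inline_data_reports(sample))  # prints [0]
```

Running the same function over logs from the OFED 1.0 and OFED 1.1 user-space packages would give a quick way to check whether the warning goes away after an update.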
Re: [openib-general] srp trouble on RHEL4 U4
Which SRP target are you using? Could you also give some more details on
the fabric setup; i.e. what IB switch / gateway your host is connected to,
and what kind of storage you wish to access? The full command that you
used (echo -n > /add_target) to configure the SRP target would be very
useful as well.

The Silverstorm 7000 is an HCA (Host Channel Adapter). By itself, it
should in most cases not be the primary reason for the error code you are
seeing. The issue is more likely to be due to the SRP target you are
attempting to connect to.

Thanks,
Madhu Lakshmanan
Silverstorm Technologies, Inc.
[EMAIL PROTECTED]

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of
> Mirochnick Natalia
> Sent: Tuesday, October 17, 2006 10:47 AM
> To: openib-general@openib.org
> Subject: [openib-general] srp trouble on RHEL4 U4
>
> Hello,
>
> I'm trying to setup SRP connection (SRP in OFED 1.0).
> IB card is Silverstorm 7000.
>
> ib_srp module is loaded, but after attempt to create an
> SRP device (as it was described in manual srp_release_notes.txt)
> the error appears in /var/log/messages:
> kernel: REJ reason 0x0
>
> What's wrong?
> --
> Thanks in advance,
> Mirochnick Natalia