RE: [openib-general] RC2 delayed a bit

2006-04-05 Thread Moshe Kazir
Bob Woodruff wrote -> > BTW. I built some kernel RPMs based on the 1.0 branch kernel code and the backport patches for RedHat EL4.0 U3. If someone wants me to post them somewhere, I will. I'll be glad to test them on a PPC machine. Can you put the rpm on an ftp somewhere? Moshe

Re: [openib-general] Re: [PATCH] ipoib_mcast_restart_task

2006-04-05 Thread Eli Cohen
On Wednesday 05 April 2006 18:43, Roland Dreier wrote: > Michael> Not sure I read you. It'd still be use after free, won't it? > > It's definitely a bug. But it doesn't explain the specific oops we > saw. In other words, doing: > > kfree(mcast); > dev = mcast->dev; > > shouldn't c

Re: [openib-general] RE: Re: [PATCH] ipoib_flush_paths

2006-04-05 Thread Fabian Tillier
On 4/5/06, Sean Hefty <[EMAIL PROTECTED]> wrote: > >I don't see the connection (since we only need de-register, let's just call > >it flush and be done with it), but fine. Please propose an API then. > > The issue that I see is that ib_sa_flush() requires ib_mad_flush(), but the > MAD > layer API

[openib-general] Re: RC2 delayed a bit

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Shirley Ma <[EMAIL PROTECTED]>: > It would be risky if Distros target future OF releases if for some reason > OF slips its release schedule. I think distros learned to live with this by now. Or they can just play it safe and target a stable release. -- MST

[openib-general] Re: Re: [PATCH] ipoib_flush_paths

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Sean Hefty <[EMAIL PROTECTED]>: > Subject: Re: Re: [PATCH] ipoib_flush_paths > > Michael S. Tsirkin wrote: > >The only easy way out that I see is some kind of sa_flush function that > >will flush MAD wqs, and ask all users to call that. > >And I think we have the same bug in addr so thi

Re: [openib-general] RC2 delayed a bit

2006-04-05 Thread Shirley Ma
Roland Dreier <[EMAIL PROTECTED]> wrote on 04/05/2006 05:25:54 PM: >     Shirley> How to handle this OF1.0 release to be synced with >     Shirley> distros' releases?  Without clear milestones and >     Shirley> schedules, it would be tough to target distros. > > I think it's up to OF to publis

[openib-general] Re: RC2 delayed a bit

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Shirley Ma <[EMAIL PROTECTED]>: > How to handle this OF1.0 release to be synced with distros' releases? 1. Publish tarballs of stable versions on openib website 2. Send an announcement by mail 3. Have distros pick it up Seems to work for most every open-source project out there. > With

Re: [openib-general] RC2 delayed a bit

2006-04-05 Thread Roland Dreier
Shirley> How to handle this OF1.0 release to be synced with Shirley> distros' releases? Without clear milestones and Shirley> schedules, it would be tough to target distros. I think it's up to OF to publish its release schedule, and then distros can decide what to ship. It's no diffe

Re: [openib-general] Re: [PATCH] ipoib_flush_paths

2006-04-05 Thread Sean Hefty
Michael S. Tsirkin wrote: The only easy way out that I see is some kind of sa_flush function that will flush MAD wqs, and ask all users to call that. And I think we have the same bug in addr so this needs a flush function too. My preference would be to add a registration function to ib_sa and i

[openib-general] Re: Re: [PATCH] ipoib_flush_paths

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Sean Hefty <[EMAIL PROTECTED]>: > Subject: Re: Re: [PATCH] ipoib_flush_paths > > >>By the way, I was thinking about SA queries, and I came to a conclusion > >>that we have an unfixable race at module unload: nothing I as the user of > >>SA do in my callback can ensure that my callback i

Re: [openib-general] Re: [PATCH] ipoib_flush_paths

2006-04-05 Thread Sean Hefty
By the way, I was thinking about SA queries, and I came to a conclusion that we have an unfixable race at module unload: nothing I as the user of SA do in my callback can ensure that my callback is not still running when my module is unloaded. I missed what you were saying before. I agree, I do

[openib-general] Re: Re: [PATCH] ipoib_flush_paths

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Sean Hefty <[EMAIL PROTECTED]>: > >By the way, I was thinking about SA queries, and I came to a conclusion > >that we have an unfixable race at module unload: nothing I as the user of > >SA do in my callback can ensure that my callback is not still running > >when my module is unloaded.

Re: [openib-general] Re: [PATCH] ipoib_flush_paths

2006-04-05 Thread Sean Hefty
Michael S. Tsirkin wrote: By the way, I was thinking about SA queries, and I came to a conclusion that we have an unfixable race at module unload: nothing I as the user of SA do in my callback can ensure that my callback is not still running when my module is unloaded. The problem here is lack o

Re: [openib-general] RC2 delayed a bit

2006-04-05 Thread Shirley Ma
> The initial mistake was probably to think that the OF1.0 release would > be something targeted at end users.  It would be much better to do as > nearly all other free software projects do and produce a release > targeted at distributors, and let distributors get it to end users. > >  - R. How

[openib-general] Re: [PATCH] ipoib_flush_paths

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: [PATCH] ipoib_flush_paths > > Michael> See what happens if you pass a stale id (query finished) > Michael> and a NULL query? > > Ah, I see. It can deal with a stale id or a NULL query, but both at > once leads to a bad coincide

[openib-general] Re: [PATCH] ipoib_flush_paths

2006-04-05 Thread Roland Dreier
Michael> See what happens if you pass a stale id (query finished) Michael> and a NULL query? Ah, I see. It can deal with a stale id or a NULL query, but both at once leads to a bad coincidence. I guess that's not quite a bug in ib_sa_cancel_query(), although it might be nice if it dealt
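The "bad coincidence" discussed in this thread can be illustrated with a small userspace sketch. It assumes the cancel routine guards itself only by comparing the result of an id lookup with the caller's pointer; the table, the structures and both function names below are hypothetical stand-ins, not the ib_sa code. With a stale id the lookup yields NULL, and a NULL query pointer then compares equal to it, so the guard lets the call through.

```c
#include <stddef.h>

#define MAX_QUERIES 8

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct agent { int cancelled; };
struct query { struct agent *agent; };

/* A slot is NULL once its query has finished, so a stale id looks up NULL. */
static struct query *query_table[MAX_QUERIES];

static struct query *find_query(int id)
{
    if (id < 0 || id >= MAX_QUERIES)
        return NULL;
    return query_table[id];
}

enum cancel_result {
    CANCEL_DONE,
    CANCEL_SKIPPED,
    CANCEL_NULL_DEREF   /* the point where real code would dereference NULL */
};

/* Guard that only compares the lookup result with the caller's pointer. */
static enum cancel_result cancel_compare_only(int id, struct query *query)
{
    if (find_query(id) != query)
        return CANCEL_SKIPPED;
    if (query == NULL)
        return CANCEL_NULL_DEREF;   /* NULL == NULL slipped past the guard */
    query->agent->cancelled = 1;
    return CANCEL_DONE;
}

/* Variant that also rejects a failed lookup before touching the query. */
static enum cancel_result cancel_checked(int id, struct query *query)
{
    struct query *found = find_query(id);

    if (found == NULL || found != query)
        return CANCEL_SKIPPED;
    found->agent->cancelled = 1;
    return CANCEL_DONE;
}
```

In this sketch a stale id alone, or a NULL pointer alone, is skipped by both variants; only the combination reaches the dereference point in the compare-only version.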

[openib-general] Re: CMA and SDP hh

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Sean Hefty <[EMAIL PROTECTED]>: > Subject: RE: CMA and SDP hh > > >*ip_ver = sdp_get_ip_ver(hdr); > >*port = ((struct sdp_hh *) hdr)->port; > >*src= &((struct sdp_hh *) hdr)->src_addr; > >*dst= &((struct sdp_hh *)

[openib-general] Re: [PATCH] ipoib_flush_paths

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > It looks safe to ib_sa_cancel_query() with a stale or NULL query pointer. I don't think so. Look into ib_sa_cancel_query: void ib_sa_cancel_query(int id, struct ib_sa_query *query) { unsigned long flags; struct ib_mad_agent *agent;

[openib-general] RE: CMA and SDP hh

2006-04-05 Thread Sean Hefty
>*ip_ver = sdp_get_ip_ver(hdr); >*port = ((struct sdp_hh *) hdr)->port; >*src= &((struct sdp_hh *) hdr)->src_addr; >*dst= &((struct sdp_hh *) hdr)->dst_addr; > >seems to assume that SDP places the HH message at the beginning

[openib-general] CMA and SDP hh

2006-04-05 Thread Michael S. Tsirkin
Sean, the following code in CMA *ip_ver = sdp_get_ip_ver(hdr); *port = ((struct sdp_hh *) hdr)->port; *src= &((struct sdp_hh *) hdr)->src_addr; *dst= &((struct sdp_hh *) hdr)->dst_addr; seems to assume that SDP places the H

[openib-general] Re: [PATCH] ipoib_flush_paths

2006-04-05 Thread Roland Dreier
This makes sense but I'm trying to see exactly what goes wrong without it. Suppose path->query gets set to NULL between testing it and calling ib_sa_cancel_query(). What's the worst that can happen? It looks safe to ib_sa_cancel_query() with a stale or NULL query pointer. - R.

[openib-general] Re: [PATCH] mthca: disable pci_tune

2006-04-05 Thread Roland Dreier
Thanks, applied. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

[openib-general] [PATCH] [DAPL] [RFC] - remove duplicate disconnect event.

2006-04-05 Thread Steve Wise
James, Running a 4 thread, 8 ep/thread dapltest (the last test in regress.sh), I was intermittently seeing a seg fault in dapltest. This is running over the chelsio rnic using the iwarp branch. After debugging I found out that dapltest was freeing an already freed endpoint due to it receiving du

Re: [openib-general] Re: [PATCH] repost: IPoIB queue size tune patch

2006-04-05 Thread Roland Dreier
Thanks, here's the version I committed to svn and queued for 2.6.17. I made the module parameters "send_queue_size" and "recv_queue_size" because I think that "sendq_size" might be too obscure for people to understand. I made the queue size variables __read_mostly to avoid false sharing of cache

RE: [openib-general] how can i know whether my appl'n is using SDP ornot?

2006-04-05 Thread Scott Weitzenkamp (sweitzen)
1) You can use tcpdump to see if there is IPoIB traffic (assuming your tcpdump supports OpenIB IPoIB, RHEL4 tcpdump does not yet). 2) You can strace the application see if it is using AF_INET or AF_INET_SDP. Scott > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTE

Re: [openib-general] how can i know whether my appl'n is using SDP or not?

2006-04-05 Thread Grant Grundler
On Wed, Apr 05, 2006 at 09:29:13AM +0100, keshetti mahesh wrote: > i am working on the infiniband cluster and the openIB stack is > installed in hosts (including SDP) > > can anybody tell me when i run an appl'n over infiniband > how can i know whether it is using SDP or IPoIB? Are you using li

[openib-general] Re: [PATCH] ipoib_mcast_restart_task

2006-04-05 Thread Roland Dreier
Thanks, looks really good (consolidating the code even shrinks the .text of the driver). Applied to svn and 2.6.17

RE: [openib-general] RC2 delayed a bit

2006-04-05 Thread Bob Woodruff
Roland wrote, >The initial mistake was probably to think that the OF1.0 release would >be something targeted at end users. It would be much better to do as >nearly all other free software projects do and produce a release > > - R. I agree. OF1.0 should probably target distros, rather than end u

[openib-general] Re: ipath module compilation on 2.6.15 and 2.6.16

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: ipath module compilation on 2.6.15 and 2.6.16 > > Vladimir> I am not compiling kernel, but only ipath using SUBDIRS: > > Vladimir> CONFIG_IPATH_CORE=m \ CONFIG_INFINIBAND_IPATH=m \ > Vladimir> CONFIG_IPATH_ETHER =m > >

[openib-general] Re: [PATCH] ipoib_mcast_restart_task

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Maybe it is just random corruption that causes the crash > right at the beginning of ipoib_mcast_sendonly_join_complete(). Anyway, the updated patch Eli posted looks good, doesn't it? -- MST

RE: [openib-general] RC2 delayed a bit

2006-04-05 Thread Bob Woodruff
Bryan wrote, >So, we went from having no openib release to now having two? That's confusing. >Are these vendors members of openib? >- Sean I know that I am confused. Can someone from the ibed (openfabrics-ewg) people please enlighten us ? BTW. I built some kernel RPMs based on the 1.0 branch

Re: [openib-general] Re: [DAPL] Provider initialialization

2006-04-05 Thread James Lentini
On Wed, 5 Apr 2006, Vladimir Sokolovsky wrote: > Hi James, > Does uDAPL support PPC64 architecture? On PPC64, the code expects __PPC64__ to be defined. The PPC64 support isn't as well tested as other architectures, so there may be some problems.

Re: [openib-general] RC2 delayed a bit

2006-04-05 Thread Roland Dreier
Sean> So, we went from having no openib release to now having two? Sean> That's confusing. I think the right way to think about it is that we have one OpenFabrics release, namely the 1.0 release managed by Bryan, and one distribution (so far) of that release plus other components (MPI, ker

Re: [openib-general] RC2 delayed a bit

2006-04-05 Thread Bryan O'Sullivan
On Wed, 2006-04-05 at 11:02 -0700, Sean Hefty wrote: > So, we went from having no openib release to now having two? That's > confusing. Indeed. > Are these vendors members of openib? Yes.

Re: [openib-general] Re: SPAM: [PATCH] [RFC] - dapl - dat_ep_free() can return without freeing the endpoint

2006-04-05 Thread Steve Wise
On Wed, 2006-04-05 at 10:56 -0700, Sean Hefty wrote: > Arlin Davis wrote: > > I did not see the original thread/patch from Steve so I don't have the > > entire context of this issue but it sounds like we need to fix the code > > so that the destroy QP (dat_ep_free) blocks until the event processi

Re: [openib-general] RC2 delayed a bit

2006-04-05 Thread Sean Hefty
Bryan O'Sullivan wrote: The intention is to ship the same basic userspace components as the OpenIB 1.0 software release, with some additional parts. In addition, some people want to ship kernel components that are not upstream, such as SDP and iSER. So, we went from having no openib release to

Re: [openib-general] Re: the cma is not in the for-2.6.18 branch of the git tree

2006-04-05 Thread Sean Hefty
Roland Dreier wrote: Yes, makes sense. I think there have been some updates and fixes to CMA code (loopback handling, etc). Sean, when you get a chance, can send me updates for the rdma_cm branch? Even a single rolled-up patch against the head of that branch is fine -- it's easy for me to spli

Re: [openib-general] RC2 delayed a bit

2006-04-05 Thread Bryan O'Sullivan
On Wed, 2006-04-05 at 10:43 -0700, Shirley Ma wrote: > Today is April 5. What the current plan for RC2? Several vendors have formed an "Enterprise Working Group" (EWG) to produce a set of sources and binary packages that they can provide to customers until such time as the Linux distribution ven

Re: [openib-general] Re: SPAM: [PATCH] [RFC] - dapl - dat_ep_free() can return without freeing the endpoint

2006-04-05 Thread Sean Hefty
Arlin Davis wrote: I did not see the original thread/patch from Steve so I don't have the entire context of this issue but it sounds like we need to fix the code so that the destroy QP (dat_ep_free) blocks until the event processing is complete, always destroy the QP and cm_id from this call, a

Re: [openib-general] Re: SPAM: [PATCH] [RFC] - dapl - dat_ep_free() can return without freeing the endpoint

2006-04-05 Thread Steve Wise
> I did not see the original thread/patch from Steve so I don't have the > entire context of this issue but it sounds like we need to fix the code > so that the destroy QP (dat_ep_free) blocks until the event processing > is complete, always destroy the QP and cm_id from this call, and remove

Re: [openib-general] Re: [PATCH] repost: IPoIB queue size tune patch

2006-04-05 Thread Roland Dreier
Shirley> Hello Roland, I have been working hard on this patch. Do Shirley> you think it is ready to be merged? Yes, I will fix it up and apply it. Hopefully we can come up with something better for 2.6.18. - R.

Re: [openib-general] Re: [PATCH] repost: IPoIB queue size tune patch

2006-04-05 Thread Shirley Ma
Hello Roland, I have been working hard on this patch. Do you think it is ready to be merged? Thanks Shirley Ma IBM Linux Technology Center 15300 SW Koll Parkway Beaverton, OR 97006-6063 Phone(Fax): (503) 578-7638

Re: [openib-general] RC2 delayed a bit

2006-04-05 Thread Shirley Ma
Hello Bryan, [EMAIL PROTECTED] wrote on 03/28/2006 04:10:36 PM: > Hi - > > Due to being swamped with several different things at once, I will not > be able to release 1.0 RC2 this week.  I expect that I will be able to > release it by April 4. > > Apologies for the inconvenience, > >     Tod

Re: [openib-general] ipath module compilation on 2.6.15 and 2.6.16

2006-04-05 Thread Bryan O'Sullivan
On Wed, 2006-04-05 at 10:01 -0700, Roland Dreier wrote: > Huh?? This error seemed to have been because writeq() doesn't exist > in a 32-bit kernel. I didn't see that error, just the IB_NODE_CA one. Vladimir, please set LANG=C when you're going to compile a test case for posting the output, so t

[openib-general] Re: ipath module compilation on 2.6.15 and 2.6.16

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > But what I don't understand is how you were able to > configure your kernel to build the driver, since the Kconfig has > "depends on 64BIT" in it. Vlad is compiling things as an out of kernel module. KConfig does not work in this configuration. --

Re: [openib-general] Re: SPAM: [PATCH] [RFC] - dapl - dat_ep_free() can return without freeing the endpoint

2006-04-05 Thread Arlin Davis
Sean Hefty wrote: James Lentini wrote: void dapli_destroy_conn(struct dapl_cm_id *conn) { int in_callback; +struct rdma_cm_id *cm_id; dapl_dbg_log(DAPL_DBG_TYPE_CM, " destroy_conn: conn %p id %d\n", conn,conn->cm_id); - dapl_os_lock(&conn->lock);

Re: [openib-general] ipath module compilation on 2.6.15 and 2.6.16

2006-04-05 Thread Roland Dreier
Vladimir> I am not compiling kernel, but only ipath using SUBDIRS: Vladimir> CONFIG_IPATH_CORE=m \ CONFIG_INFINIBAND_IPATH=m \ Vladimir> CONFIG_IPATH_ETHER =m I see. I guess in that case if you create a broken config then there's no one else to blame... - R.

Re: [openib-general] ipath module compilation on 2.6.15 and 2.6.16

2006-04-05 Thread Vladimir Sokolovsky
Roland Dreier wrote: Vladimir> I tried compilation on 32 bit machine. Should it be Vladimir> supported by ipath? No, PathScale hasn't done the work to make the driver work on 32-bit archs yet. But what I don't understand is how you were able to configure your kernel to build the driver

[openib-general] Re: [PATCH] static rate encoding changes

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: [PATCH] static rate encoding changes > > OK, I went ahead and committed this to svn and queued it for 2.6.17. > If you see any problems, let me know. (It will be a while before this > gets merged upstream anyway because Linus is away th

[openib-general] Re: [PATCH] ipoib_mcast_restart_task

2006-04-05 Thread Roland Dreier
Michael> Assume that you have mcast point to random kernel data. Michael> doing things like skb_dequeue(&mcast->pkt_queue) will now Michael> do random things to random memory locations, it could be Michael> stack or anything else. Yes, true. Maybe it is just random corruption that

[openib-general] Re: [PATCH] ipoib_mcast_restart_task

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: [PATCH] ipoib_mcast_restart_task > > Michael> The mcast pointer comes from stack. Surely we could have > Michael> use after free in ipoib_mcast_join_complete trigger data > Michael> corruption on stack and then trip on it? >

Re: [openib-general] ipath module compilation on 2.6.15 and 2.6.16

2006-04-05 Thread Roland Dreier
Vladimir> I tried compilation on 32 bit machine. Should it be Vladimir> supported by ipath? No, PathScale hasn't done the work to make the driver work on 32-bit archs yet. But what I don't understand is how you were able to configure your kernel to build the driver, since the Kconfig has

[openib-general] Re: the cma is not in the for-2.6.18 branch of the git tree

2006-04-05 Thread Roland Dreier
Or> As part of the work to push iser for 2.6.18 I am going to send Or> RFC on it to linux-scsi (and ofcourse resend RFC to Or> open-ib). To have the people who review it in linux-scsi being Or> able to compile iser, they would need rdma_cm.h and the Or> associated changes in wha

Re: [openib-general] ipath module compilation on 2.6.15 and 2.6.16

2006-04-05 Thread Vladimir Sokolovsky
Roland Dreier wrote: Looks like you are building with a 32-bit compiler. But in the Kconfig, ipath depends on 64BIT. So how are you enabling the driver? - R. I tried compilation on 32 bit machine. Should it be supported by ipath? Vladimir

Re: [openib-general] Re: [PATCH 1 of 3] core: static rate encoding changesupport

2006-04-05 Thread Roland Dreier
I merged the core static rate patch into svn. This means that the struct ib_ah_attr.static_rate field is now defined to be an absolute rate (as defined in enum ib_rate), rather than a relative inter-packet delay (IPD) value. I believe ehca will need to be updated to handle this in modify QP and c
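The difference between an absolute rate and a relative IPD value can be sketched as a conversion. This is a minimal illustration only: it assumes the common reading of IPD in which a value n leaves n packet-times idle between packets, so the effective rate is the port rate divided by n + 1. The function name and the choice of expressing rates as multiples of 2.5 Gb/s are hypothetical and are not taken from the patch or from enum ib_rate.

```c
/*
 * Hypothetical helper: both rates are given as multiples of 2.5 Gb/s
 * (for example 1 = 2.5 Gb/s, 4 = 10 Gb/s, 12 = 30 Gb/s).
 *
 * Returns the smallest inter-packet delay n such that
 * port_mult / (n + 1) does not exceed target_mult, or 0 when no
 * throttling is needed or the target rate is unknown.
 */
static int ipd_from_rates(int port_mult, int target_mult)
{
    if (target_mult <= 0 || port_mult <= target_mult)
        return 0;
    /* ceil(port_mult / target_mult) - 1, in integer arithmetic */
    return (port_mult + target_mult - 1) / target_mult - 1;
}
```

Under these assumptions a 10 Gb/s port sending to a 2.5 Gb/s destination would use a delay of 3, while a driver handed an absolute rate can derive whatever encoding its hardware expects from the port's own speed.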

Re: [openib-general] [PATCH] static rate encoding changes

2006-04-05 Thread Roland Dreier
OK, I went ahead and committed this to svn and queued it for 2.6.17. If you see any problems, let me know. (It will be a while before this gets merged upstream anyway because Linus is away this week) - R.

Re: [openib-general] ipath module compilation on 2.6.15 and 2.6.16

2006-04-05 Thread Roland Dreier
Bryan> I haven't figured out what to do about this. People quite Bryan> reasonably don't like having ifdefs in drivers, but when Bryan> macro names change out from under us in header files, I Bryan> don't know what else to do. Bryan> Roland, what's your preference? Huh?? Thi

Re: [openib-general] ipath module compilation on 2.6.15 and 2.6.16

2006-04-05 Thread Roland Dreier
Looks like you are building with a 32-bit compiler. But in the Kconfig, ipath depends on 64BIT. So how are you enabling the driver? - R.

[openib-general] Re: [PATCH] ipoib_mcast_restart_task

2006-04-05 Thread Roland Dreier
Michael> The mcast pointer comes from stack. Surely we could have Michael> use after free in ipoib_mcast_join_complete trigger data Michael> corruption on stack and then trip on it? Now you're confusing me. Isn't the mcast pointer kmalloc()ed? - R.

Re: [openib-general] ipath module compilation on 2.6.15 and 2.6.16

2006-04-05 Thread Bryan O'Sullivan
On Wed, 2006-04-05 at 19:52 +0300, Vladimir Sokolovsky wrote: > I tried to compile ipath module taken from trunk (REV=6237) on 2.6.16 > and on 2.6.15 kernels and it fails with the following errors: Ah. You're using a kernel patched with SVN headers. I haven't figured out what to do about this.

[openib-general] ipath module compilation on 2.6.15 and 2.6.16

2006-04-05 Thread Vladimir Sokolovsky
Hi Bryan, I tried to compile ipath module taken from trunk (REV=6237) on 2.6.16 and on 2.6.15 kernels and it fails with the following errors: gcc -m32 -Wp,-MD,/var/tmp/IBED/tmp/openib/openib/src/linux-kernel/infiniband/hw/ipath/.ipath_verbs.o.d -nostdinc -isystem /usr/lib/gcc/i586-suse-lin

[openib-general] Re: [PATCH] ipoib_mcast_restart_task

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: [PATCH] ipoib_mcast_restart_task > > Michael> Not sure I read you. It'd still be use after free, won't it? > > It's definitely a bug. But it doesn't explain the specific oops we > saw. The mcast pointer comes from stack. Surely w

[openib-general] Re: IPoIB destructor for 2.6.16-stable?

2006-04-05 Thread Roland Dreier
Michael> I don't see any way to fix crashes in ipoib in 2.6.16, Michael> then. Do you? Unfortunately no. If we could get to the bottom of Hal's crash then I would be fine with adding something like this to 2.6.16.stable. But I don't have much interest in debugging code that's already obs

Re: [openib-general] IPoIB descructor for 2.6.16-stable?

2006-04-05 Thread Shirley Ma
Michael, I have tested your workaround patch. It has been working pretty well. It's easy to hit this problem with ehca driver. I need to relook at the patch to see whether it's the same or not. thanks Shirley Ma IBM Linux Technology Center 15300 SW Koll Parkway Beaverton, OR 97006-6063 Phone(Fax

Re: [openib-general] how to execute the dtest?

2006-04-05 Thread Arlin Davis
Dotan Barak wrote: Some more info: when i changed the dat.conf to be: OpenIB-cma u1.2 nonthreadsafe default /usr/local/lib/libdaplcma.so mv_dapl.1.2 "ib0 0" "" OpenIB-cma-ip u1.2 nonthreadsafe default /usr/local/lib/libdaplcma.so mv_dapl.1.2 "192.168.0.22 0" "" OpenIB-cma-name u1.2 nonthreads

[openib-general] Re: IPoIB destructor for 2.6.16-stable?

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: IPoIB destructor for 2.6.16-stable? > > Michael> Is this small/obvious enough to be considered for stable? > Michael> What do you think? > > I'd be a little worried. Hal had an oops that looked related to this, > but I was neve

[openib-general] Re: [PATCH] mthca: disable pci_tune

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: [PATCH] mthca: disable pci_tune > > Looks reasonable to me. How about the PCI Express max read request > value? Does that matter? Do we want to disable setting that by > default too? (that's what your patch does) Yes, I disable that

[openib-general] Re: IPoIB descructor for 2.6.16-stable?

2006-04-05 Thread Roland Dreier
Michael> Is this small/obvious enough to be considered for stable? Michael> What do you think? I'd be a little worried. Hal had an oops that looked related to this, but I was never able to reproduce it or figure it out. But in any case I'm uncomfortable that this patch is replacing one p

[openib-general] Re: [PATCH] mthca: disable pci_tune

2006-04-05 Thread Roland Dreier
Looks reasonable to me. How about the PCI Express max read request value? Does that matter? Do we want to disable setting that by default too? (that's what your patch does) - R.

[openib-general] Re: [PATCH] ipoib_mcast_restart_task

2006-04-05 Thread Roland Dreier
Michael> Not sure I read you. It'd still be use after free, won't it? It's definitely a bug. But it doesn't explain the specific oops we saw. In other words, doing: kfree(mcast); dev = mcast->dev; shouldn't cause an oops, because mcast is still a valid kernel pointer, even

[openib-general] [DAPL] tests

2006-04-05 Thread Vladimir Sokolovsky
Hi James, Can you add dapl tests to EXTRA_DIST list in the dapl/Makefile.am? Thanks, Regards, Vladimir Vladimir Sokolovsky wrote: Hi James, Does uDAPL support PPC64 architecture? Description: OS: Fedora Core release 4 (Stentz) Kernel: 2.6.11-1.1369_FC4 Arch: ppc64 I have the foll

Re: [openib-general] how to execute the dtest?

2006-04-05 Thread Steve Wise
Here is what works for me (using mthca): vic17:~ # cat /etc/dat.conf OpenIB-cma-ip u1.2 nonthreadsafe default /usr/local/lib/libdapl.so mv_dapl.1.2 "192.168.79.147 0" "" vic17:~ # And ib1 has the above ip address: vic17:~ # ifconfig ib1 ib1 Link encap:UNSPEC HWaddr 00-00-04-05-FE-80-00-

Re: [openib-general] how to execute the dtest?

2006-04-05 Thread Dotan Barak
Some more info: when i changed the dat.conf to be: OpenIB-cma u1.2 nonthreadsafe default /usr/local/lib/libdaplcma.so mv_dapl.1.2 "ib0 0" "" OpenIB-cma-ip u1.2 nonthreadsafe default /usr/local/lib/libdaplcma.so mv_dapl.1.2 "192.168.0.22 0" "" OpenIB-cma-name u1.2 nonthreadsafe default /usr/local

[openib-general] how to execute the dtest?

2006-04-05 Thread Dotan Barak
Hi. We would like to start executing the dtest. When I tried to execute it, I got the following output: DAT Registry: Started (dat_init) DAT Registry: static registry file DAT Registry: token type eor value <> DAT Registry: token type string value DAT Registry:

Re: [openib-general] [PATCH] [DAPL] - dapl doesn't set max read iov attributes

2006-04-05 Thread Steve Wise
Ignore this patch. max_sge_rd is not the correct attribute... Sorry... STeve. On Wed, 2006-04-05 at 09:01 -0500, Steve Wise wrote: > Set the IA attribute max_iov_segments_per_rdma_read and the EP attribute > max_rdma_read_iov based on the openib max_sge_rd device attribute. > > Signed-off-by:

Re: [openib-general] [PATCH] [MTHCA] - set max_sge_rd attribute

2006-04-05 Thread Steve Wise
On Wed, 2006-04-05 at 17:47 +0300, Dotan Barak wrote: > On Wednesday 05 April 2006 17:19, Steve Wise wrote: > > While testing the cxgb3 iwarp driver, I noticed that mthca is not > > setting the max_sge_rd attribute. Dunno if the limits.max_sg is the > > correct value, but it seemed like the only r

Re: [openib-general] [PATCH] [MTHCA] - set max_sge_rd attribute

2006-04-05 Thread Dotan Barak
On Wednesday 05 April 2006 17:19, Steve Wise wrote: > While testing the cxgb3 iwarp driver, I noticed that mthca is not > setting the max_sge_rd attribute. Dunno if the limits.max_sg is the > correct value, but it seemed like the only reasonable one... This value is not being set because RD (Reli

[openib-general] [PATCH] [MTHCA] - set max_sge_rd attribute

2006-04-05 Thread Steve Wise
While testing the cxgb3 iwarp driver, I noticed that mthca is not setting the max_sge_rd attribute. Dunno if the limits.max_sg is the correct value, but it seemed like the only reasonable one... Signed-off-by: Steve Wise <[EMAIL PROTECTED]> Index: mthca_provider.c =

[openib-general] [PATCH] [DAPL] - dapl doesn't set max read iov attributes

2006-04-05 Thread Steve Wise
Set the IA attribute max_iov_segments_per_rdma_read and the EP attribute max_rdma_read_iov based on the openib max_sge_rd device attribute. Signed-off-by: Steve Wise <[EMAIL PROTECTED]> Index: openib_cma/dapl_ib_util.c === --- open

[openib-general] Re: [PATCH] ipoib_flush_paths

2006-04-05 Thread Michael S. Tsirkin
Quoting r. Eli Cohen <[EMAIL PROTECTED]>: > Subject: [PATCH] ipoib_flush_paths > > ib_sa_cancel_query must be called with priv->lock held since > a completion might arrive and set path->query to NULL > > Signed-off-by: Eli Cohen <[EMAIL PROTECTED]> Wow. This looks very similar to what was fixed

[openib-general] [PATCHv2] OpenSM: Fix osm_vendor_send for GSI classes

2006-04-05 Thread Hal Rosenstock
Hi Yael, Below is a slightly modified version of the previous patch. It is a complete fix for the problem you identified. Let me know if this works for you and I will check it into both the trunk and 1.0 branch. Thanks. -- Hal OpenSM: Fix osm_vendor_send for GSI classes Currently, the default

[openib-general] [PATCH] ipoib_flush_paths

2006-04-05 Thread Eli Cohen
ib_sa_cancel_query must be called with priv->lock held since a completion might arrive and set path->query to NULL Signed-off-by: Eli Cohen <[EMAIL PROTECTED]> Index: latest/drivers/infiniband/ulp/ipoib/ipoib_main.c === --- latest.or
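The locking rule in the patch description can be sketched in userspace as follows. This is only an illustration of the idea, with a pthread mutex standing in for priv->lock and a flag standing in for the cancel call; the structure layout and the names query_complete and flush_path are hypothetical and do not come from the IPoIB source. The point is that the test of path->query and the cancel happen under the same lock the completion takes before clearing the pointer.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's structures. */
struct query { int cancelled; };

struct path {
    pthread_mutex_t lock;   /* plays the role of priv->lock */
    struct query *query;    /* NULL once the completion has run */
};

/* Completion side: clears path->query while holding the lock. */
static void query_complete(struct path *path)
{
    pthread_mutex_lock(&path->lock);
    path->query = NULL;
    pthread_mutex_unlock(&path->lock);
}

/*
 * Flush side: the NULL test and the cancel happen under the same lock,
 * so the completion cannot clear the pointer in between.
 * Returns 1 if a cancel was issued, 0 if the query had already finished.
 */
static int flush_path(struct path *path)
{
    int issued = 0;

    pthread_mutex_lock(&path->lock);
    if (path->query != NULL) {
        path->query->cancelled = 1;   /* stand-in for the cancel call */
        issued = 1;
    }
    pthread_mutex_unlock(&path->lock);
    return issued;
}
```

Without the lock around the test and the cancel, the completion could run between the two steps, which is the window the patch description refers to.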

[openib-general] RE: [PATCH] OpenSM - fix osm_vendor_send on vendor mads

2006-04-05 Thread Hal Rosenstock
On Wed, 2006-04-05 at 08:13, Yael Kalka wrote: > When we are sending MAD with ManagementClass Vendor, then it isn't > defined with rmpp head. Thus when looking at the rmpp header we are > actually looking at part of the mad data. Right. See subsequent email and proposed patch. -- Hal > > Yael >

[openib-general] [PATCH] mthca: disable pci_tune

2006-04-05 Thread Michael S. Tsirkin
Roland, here's a patch for stability problems with mthca that were reported to Mellanox by some of our customers. I know how you think about options, and I agree, but this seems the best we can do in this case: it seems too risky to remove this tuning outright. --- PCI spec recommends against d

[openib-general] what should be the result of create a CQ with size 0?

2006-04-05 Thread Dotan Barak
Hi. We have a test case in which we try to create a CQ with 0 entries. Over the gen2 driver the creation of this CQ fails. what is the expected behavior of the driver? a) the CQ creation should fail b) the CQ should be created (with 1 or 2 entries) Thanks Dotan

[openib-general] RE: [PATCH] OpenSM - fix osm_vendor_send on vendor mads

2006-04-05 Thread Yael Kalka
When we are sending MAD with ManagementClass Vendor, then it isn't defined with rmpp head. Thus when looking at the rmpp header we are actually looking at part of the mad data. Yael > -Original Message- > From: Hal Rosenstock [mailto:[EMAIL PROTECTED] > Sent: Wednesday, April 05, 2006 2:1

[openib-general] Re: [PATCH] ipoib_mcast_restart_task

2006-04-05 Thread Michael S. Tsirkin
Quoting Eli Cohen <[EMAIL PROTECTED]>: > > This could explain the oops in ipoib_mcast_sendonly_join_complete(), > > but only if a send-only group is being replaced by a full-member > > join. Is Eli's test doing that? > > No, not deliberately but it did not happen again after a full night runs. I

[openib-general] [PATCH] OpenSM: Fix osm_vendor_send for GSI classes

2006-04-05 Thread Hal Rosenstock
Hi Yael, Below is a complete fix for the problem you identified. Let me know if this works for you and I will check it into both the trunk and 1.0 branch. Thanks. -- Hal OpenSM: Fix osm_vendor_send for GSI classes Currently, the default for GSI classes assumes RMPP. There are two groups of GSI

Re: [openib-general] Re: [DAPL] Provider initialization

2006-04-05 Thread Vladimir Sokolovsky
Hi James, Does uDAPL support PPC64 architecture? Description: OS: Fedora Core release 4 (Stentz) Kernel: 2.6.11-1.1369_FC4 Arch: ppc64 I have the following compilation error: make -C src/userspace/dapl \ CPPFLAGS="-I../libibverbs/include/infiniband -I../librdmacm/include \ -I../libib

[openib-general] [PATCH] ipoib_mcast_restart_task

2006-04-05 Thread Eli Cohen
ipoib_mcast_restart_task might free an mcast object while a join request is still outstanding, leading to an oops when the query completes. Fix this by waiting for query to complete, similar to what ipoib_stop_thread is doing. The wait for mcast completion code is consolidated in wait_join_comple

[openib-general] IPoIB destructor for 2.6.16-stable?

2006-04-05 Thread Michael S. Tsirkin
Roland, given that the cleaner backport from 2.6.17 got voted down from 2.6.16, how about pushing a work-around from subversion in there? Something along the lines of the patch below? Is this small/obvious enough to be considered for stable? What do you think? Signed-off-by: Michael S. Tsirkin <[

[openib-general] Re: [PATCH] ipoib_mcast_restart_task

2006-04-05 Thread Eli Cohen
> Yes, looks like there might be problem here. However, is there any > way to consolidate the "cancel and wait for done" code in one place, > rather than just cut-and-pasting it from ipoib_stop_thread()? An appropriate patch will follow. > This could explain the oops in ipoib_mcast_sendonly_join_

Re: [openib-general] Re: [PATCH] OpenSM - fix osm_vendor_send on vendor mads

2006-04-05 Thread Hal Rosenstock
On Wed, 2006-04-05 at 07:11, Hal Rosenstock wrote: > Hi Yael, > > On Wed, 2006-04-05 at 04:54, Yael Kalka wrote: > > Hi Hal, > > > > We saw the following problem in the osm_vendor_send mad (in > > osm_vendor_ibumad.c). Currently, there is a case on the Management > > Class values, where the cases

[openib-general] Re: [PATCH] OpenSM - fix osm_vendor_send on vendor mads

2006-04-05 Thread Hal Rosenstock
Hi Yael, On Wed, 2006-04-05 at 04:54, Yael Kalka wrote: > Hi Hal, > > We saw the following problem in the osm_vendor_send mad (in > osm_vendor_ibumad.c). Currently, there is a case on the Management > Class values, where the cases are > IB_MCLASS_SUBN_DIR/IB_MCLASS_SUBN_LID and default, when the

[openib-general] Re: [PATCH] OpenSM - complib fix for branch

2006-04-05 Thread Hal Rosenstock
Hi Yael, On Wed, 2006-04-05 at 02:24, Yael Kalka wrote: > Hi Hal, > > I saw that the complib patch (removal of constructor and destructor > attribute), wasn't fully added to the branch. > Attached is a patch for the branch. Is this needed for 1.0 ? Is this safe to add ? Was there more to it tha

Re: [openib-general] Re: [PATCH] repost: IPoIB queue size tune patch

2006-04-05 Thread Shirley Ma
Thanks, updated. Signed-off-by: Shirley Ma <[EMAIL PROTECTED]> diff -urpN infiniband/ulp/ipoib/ipoib.h infiniband-queue/ulp/ipoib/ipoib.h --- infiniband/ulp/ipoib/ipoib.h        2006-03-26 11:57:15.0 -0800 +++ infiniband-queue/ulp/ipoib/ipoib.h        2006-04-04 16:53:24.0 -0700

[openib-general] Re: [PATCH] OpenSM - osm_vendor_mlx_svc.h fix for branch1.0

2006-04-05 Thread Hal Rosenstock
On Wed, 2006-04-05 at 02:31, Yael Kalka wrote: > Hi Hal, > > Attached is a patch for branch 1.0, for the osm_vendor_mlx_svc.h that > was applied on the trunk but not on the branch 1.0. > > Yael > > OpenSM/osm_vendor_mlx_svc.h: Identify RMPP MADs > > RMPP mads can only be sent in 4 MAD classes: >

[openib-general] [PATCH] OpenSM - fix osm_vendor_send on vendor mads

2006-04-05 Thread Yael Kalka
Hi Hal, We saw the following problem in the osm_vendor_send mad (in osm_vendor_ibumad.c). Currently, there is a case on the Management Class values, where the cases are IB_MCLASS_SUBN_DIR/IB_MCLASS_SUBN_LID and default, when the assumption is that in the default case the management class is IB_MC

[openib-general] the cma is not in the for-2.6.18 branch of the git tree

2006-04-05 Thread Or Gerlitz
Roland, From the 2.6.17 related discussion I understand that the kernel portion of the cma is ready for upstream. As part of the work to push iser for 2.6.18 I am going to send RFC on it to linux-scsi (and of course resend RFC to open-ib). To have the people who review it in linux-scsi bein

[openib-general] how can i know whether my appl'n is using SDP or not?

2006-04-05 Thread keshetti mahesh
I am working on an InfiniBand cluster, and the OpenIB stack is installed on the hosts (including SDP). Can anybody tell me, when I run an application over InfiniBand, how can I know whether it is using SDP or IPoIB? Regards, K.Mahesh
