Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Doug Ledford <[EMAIL PROTECTED]>: > Subject: Re: [openfabrics-ewg] RHEL5 and OFED ... > > On Sun, 2006-10-15 at 12:13 -0400, Doug Ledford wrote: > > > > Now for userspace - does RHEL5 include at least libibverbs-1.0? > > > This has been released a while back, and Roland makes regular b

Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...

2006-10-17 Thread Doug Ledford
On Sun, 2006-10-15 at 12:13 -0400, Doug Ledford wrote: > > Now for userspace - does RHEL5 include at least libibverbs-1.0? > > This has been released a while back, and Roland makes regular bugfix > > releases. > > It includes the OFED 1.0 libibverbs (which makes openmpi complain about > lack of

Re: [openib-general] OFED-1.1-pre1 is ready

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Or Gerlitz <[EMAIL PROTECTED]>: > Subject: Re: OFED-1.1-pre1 is ready > > Tziporet Koren wrote: > > OFED 1.1-pre1 is available: > > URL: > > https://openib.org/svn/gen2/branches/1.1/ofed/releases/OFED-1.1-pre1.tgz > > Release details: > > > > BUILD_ID: > > OFED-1.1-pre1 > > > > openib

Re: [openib-general] OFED-1.1-pre1 is ready

2006-10-17 Thread Or Gerlitz
Tziporet Koren wrote: > OFED 1.1-pre1 is available: > URL: > https://openib.org/svn/gen2/branches/1.1/ofed/releases/OFED-1.1-pre1.tgz > Release details: > > BUILD_ID: > OFED-1.1-pre1 > > openib-1.1 (REV=9854) > # User space > https://openib.org/svn/gen2/branches/1.1/src/userspace > Git: > ref: re

Re: [openib-general] OFED 1.1-RC7 build problem on SLES10

2006-10-17 Thread Erez Zilber
I reported the same problem last week: http://openib.org/pipermail/openfabrics-ewg/2006-October/001714.html -- Erez Zilber | 972-9-971-7689 Software Engineer, Storage Team Voltaire – _The Grid Backbone_ __ www.voltaire.com

[openib-general] [PATCH] IB/SRP Userspace: srptools/srp_daemon - Fix connect bug and add support for user specified initiator extension

2006-10-17 Thread Lakshmanan, Madhu
The patch addresses 3 issues: 1. Fixes bug in srp_daemon for the case where if it is invoked with the '-e' option, it fails to connect to the SRP targets because of a newline character in the parameter string. 2. Changes the name of the constant 'MAX_TRAGET_CONFIG_STR_STRING' to 'MAX_TARGET_CONFIG_
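The first issue above (a stray newline in the parameter string) is a common pitfall when a line read from a file or pipe is passed on unchanged. A minimal sketch of stripping it follows; `strip_newline` is a hypothetical helper name, and nothing here is taken from the actual srp_daemon patch:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: cut the string at the first '\r' or '\n', if any.
 * Returns the resulting length. Not the code from the srp_daemon patch. */
size_t strip_newline(char *s)
{
    assert(s != NULL);
    s[strcspn(s, "\r\n")] = '\0';   /* strcspn stops at the first CR or LF */
    return strlen(s);
}
```

Calling this on a buffer before handing it to whatever consumes the parameter string leaves strings without a newline untouched.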

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Doug Ledford <[EMAIL PROTECTED]>: > Subject: Re: RHEL5 and OFED ... > > On Wed, 2006-10-18 at 06:01 +0200, Michael S. Tsirkin wrote: > > Quoting r. Doug Ledford <[EMAIL PROTECTED]>: > > > Far easier would be to go the other way around, > > > run on x86_64 and build for i386, in which ca

Re: [openib-general] [PATCH] If addr_handler() got error, do not set state as OK

2006-10-17 Thread Krishna Kumar2
Sean Hefty <[EMAIL PROTECTED]> wrote on 10/17/2006 10:33:41 PM: > Can you rework this patch without adding in extra flags to indicate what has or > has not been executed? OK, will fix it accordingly. thanks, - KK > Krishna Kumar wrote: > > diff -ruNp org/drivers/infiniband/core/cma.c new/d

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Doug Ledford
On Wed, 2006-10-18 at 06:01 +0200, Michael S. Tsirkin wrote: > Quoting r. Doug Ledford <[EMAIL PROTECTED]>: > > Far easier would be to go the other way around, > > run on x86_64 and build for i386, in which case gcc supports that out of > > the box. > > All that's left is to convince Lenovo there'

[openib-general] [PATCH] RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count

2006-10-17 Thread Krishna Kumar
rdma_bind_addr() leaks a cma_dev reference count in failure case. Signed-off-by: Krishna Kumar <[EMAIL PROTECTED]> --- diff -ruNp org/drivers/infiniband/core/cma.c new/drivers/infiniband/core/cma.c --- org/drivers/infiniband/core/cma.c 2006-10-09 17:13:41.0 +0530 +++ new/drivers/infiniba

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Jason Gunthorpe
On Wed, Oct 18, 2006 at 06:43:54AM +0200, Michael S. Tsirkin wrote: > The difference here is that libibverbs insists on putting all plugins > in a separate directory and passing full path to dlopen, which of course > breaks this. Yeah, plugins in a separate dir are not well supported by all the fa

Re: [openib-general] [PATCH] rdma_bind_addr() leaks a cma_dev reference count

2006-10-17 Thread Krishna Kumar2
> Something similar to: > > if (cma_any_addr...) { >ret = rdma_translate_ip(..); >if (ret) > goto err1; > >mutex_lock >ret = cma_acquire_dev >mutex_unlock >if (ret) > goto err2; > } > > should work fine. Actually that will not work, since the undo operation i

Re: [openib-general] [PATCH] Use time_after_eq() instead of time_after() in queue_req()

2006-10-17 Thread Krishna Kumar2
> Please add something like "RDMA/addr: " before the "Use" there, so > that someone skimming the kernel log knows what subsystem/specific > area the patch touches. (I added that by hand) > Git just wants three -s like "---" between changelog entry and actual patch. > the last line in the origin

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Jason Gunthorpe <[EMAIL PROTECTED]>: > Subject: Re: [PATCH] use mmiowb after doorbell ring > > On Tue, Oct 17, 2006 at 08:44:34PM -0700, Roland Dreier wrote: > > Jason> I think the typical way this is done would be to use > > Jason> ld.so's 'hwcap' handling and stick an optimize

Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...

2006-10-17 Thread Roland Dreier
Michael> All that's left is to convince Lenovo there's a market Michael> for x86_64 thinkpads. Actually you just have to wait a few months -- Core 2 (Merom) is 64-bit capable so once Lenovo catches up to everyone else, you'll be able to get a 64-bit thinkpad. - R.

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Jason Gunthorpe
On Tue, Oct 17, 2006 at 08:44:34PM -0700, Roland Dreier wrote: > Jason> I think the typical way this is done would be to use > Jason> ld.so's 'hwcap' handling and stick an optimized library in > Jason> /usr/lib/sse2. > It's a good suggestion, but the problem is that the CPU-dependent

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: [PATCH] use mmiowb after doorbell ring > > Roland> For now I just used lock; addl %0 to implement rmb on > Roland> i386. I'm really not comfortable making libmthca depend > Roland> on sse2, and I don't see a good way to dete

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Doug Ledford <[EMAIL PROTECTED]>: > Far easier would be to go the other way around, > run on x86_64 and build for i386, in which case gcc supports that out of > the box. All that's left is to convince Lenovo there's a market for x86_64 thinkpads. -- MST

Re: [openib-general] sysfs exposure of port counters useless?

2006-10-17 Thread Hal Rosenstock
On Tue, 2006-10-17 at 20:18, Scott Weitzenkamp (sweitzen) wrote: > I agree the 32-bit byte and packet counters are useless as they get > pegged in a few seconds on a busy IB network. I thought there was an > effort in IBTA to fix this. The fix at least in terms of the spec has been there for a w

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
Roland> For now I just used lock; addl %0 to implement rmb on Roland> i386. I'm really not comfortable making libmthca depend Roland> on sse2, and I don't see a good way to detect and use sse2 Roland> at runtime. Jason> I think the typical way this is done would be to use

Re: [openib-general] sysfs exposure of port counters useless?

2006-10-17 Thread Hal Rosenstock
On Tue, 2006-10-17 at 20:10, Michael Newton wrote: > On Tue, 17 Oct 2006, Hal Rosenstock wrote: > > On Tue, 2006-10-17 at 09:55, Rimmer, Todd wrote: > > > > From: Michael Newton > > > > Sent: Tuesday, October 17, 2006 3:02 AM > > > > To: openib-general@openib.org > > > > Subject: [openib-general] s

Re: [openib-general] [PATCH/RFC 1/2] IB: Return "maybe_missed_event" hint from ib_req_notify_cq()

2006-10-17 Thread Roland Dreier
Sorry, I just noticed my cross-compilation test setup was messed up, so I never actually built the modified ehca, even though I thought I did. Anyway, the patch below on top of what I sent out should fix everything up. I've also merged this into my ipoib-napi branch, so what's there should be OK

Re: [openib-general] sysfs exposure of port counters useless?

2006-10-17 Thread Greg Lindahl
On Tue, Oct 17, 2006 at 05:18:34PM -0700, Scott Weitzenkamp (sweitzen) wrote: > I agree the 32-bit byte and packet counters are useless as they get > pegged in a few seconds on a busy IB network. I thought there was an > effort in IBTA to fix this. Yes, it's in the management working group. >

Re: [openib-general] sysfs exposure of port counters useless?

2006-10-17 Thread Scott Weitzenkamp (sweitzen)
I agree the 32-bit byte and packet counters are useless as they get pegged in a few seconds on a busy IB network. I thought there was an effort in IBTA to fix this. For IB counters in a Cisco switch, we read and reset the 32-bit counters once per second and keep 64-bit counters internally. This

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Jason Gunthorpe
On Tue, Oct 17, 2006 at 04:31:00PM -0700, Roland Dreier wrote: > For now I just used lock; addl %0 to implement rmb on i386. I'm > really not comfortable making libmthca depend on sse2, and I don't see > a good way to detect and use sse2 at runtime. I think the typical way this is done would be

Re: [openib-general] sysfs exposure of port counters useless?

2006-10-17 Thread Michael Newton
On Tue, 17 Oct 2006, Hal Rosenstock wrote: > On Tue, 2006-10-17 at 09:55, Rimmer, Todd wrote: > > > From: Michael Newton > > > Sent: Tuesday, October 17, 2006 3:02 AM > > > To: openib-general@openib.org > > > Subject: [openib-general] sysfs exposure of port counters useless? > > > > > > > > > These

Re: [openib-general] [PATCH/RFC 1/2] IB: Return "maybe_missed_event" hint from ib_req_notify_cq()

2006-10-17 Thread Shirley Ma
Hi, Roland, There were a couple of errors and a warning when I applied this patch to OFED-1.1-rc7. 1. ehca_req_notify_cq() in ehca_iverbs.h is not updated. 2. *maybe_missed_event = ipz_qeit_is_valid(my_cq->ipz_queue) should be = ipz_qeit_is_valid(&my_cq->ipz_queue) 3. a compile warning this line retur

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
OK, you convinced me to add rmb()/wmb() and use it in libmthca. I just checked a bunch of changes to do that into svn. Please survey the wreckage of libibverbs/libmthca and let me know if you see where I broke anything. For now I just used lock; addl %0 to implement rmb on i386. I'm really not

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Doug Ledford
On Tue, 2006-10-17 at 23:48 +0200, Michael S. Tsirkin wrote: > Quoting r. Doug Ledford <[EMAIL PROTECTED]>: > > Subject: Re: RHEL5 and OFED ... > > > > On Tue, 2006-10-17 at 22:28 +0200, Michael S. Tsirkin wrote: > > > On a tangent, is there a way to set up a cross-build environment that will > >

Re: [openib-general] OFED 1.1-RC7 build problem on SLES10

2006-10-17 Thread Scott Weitzenkamp (sweitzen)
You need the kernel-source RPM, I guess the OFED install.sh should check for that RPM. svbu-qa-opteron-1:~ # uname -a Linux svbu-qa-opteron-1 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 i686 athlon i386 GNU/Linux svbu-qa-opteron-1:~ # rpm -qa | fgrep kernel kernel-source-2.6.16.21-0.8 ke

Re: [openib-general] Race in mthca_cmd_post()

2006-10-17 Thread John Partridge
I'm going back and comparing analyzer traces with the fix and without and the machine doing an MCA. John Roland Dreier wrote: > chas> i would guess the read to the mmio region is flushing the > chas> writes to the config register but the read happens "too > chas> soon" after those wri

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Parks Fields
Have we ever seen silent data corruption in CHECKSUM_HW? Here at LANL we have seen silent corruption on other types of networks but not IB yet that we know of. So we are a little gun shy... * Correspondence * This email contains no programmatic content that requir

Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?

2006-10-17 Thread Hoang-Nam Nguyen
Hi Troy! > I am running PVFS2 on OpenIB, with IBM's ehca. > When we start writing/reading large files, either with the NetPIPE > PVFS module we have or a modified GAMESS executable that uses > libpvfs2 directly, the 'ibv_reg_mr' function fails, and we get an error. > This is also correlated with ke

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Detecting SSE2 is easy -- we could just do the cpuid ourselves if we > wanted to. The problem is what do you do when you see that the CPU > does or doesn't have the instruction? The runtime patching that the > kernel does is way too complicated, and

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > > as for mb() - I don't think our kernel code uses that so I think userspace > > should switch to wmb as well. wmb is just a compiler barrier on most > > architectures. > > I'm not sure it's worth the trouble to split up the two cases at this > point

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
> Well, at startup we can read /proc/cpuinfo and look for sse2 in the flags: > line. > Seems simple enough. Detecting SSE2 is easy -- we could just do the cpuid ourselves if we wanted to. The problem is what do you do when you see that the CPU does or doesn't have the instruction? The runtim
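As the message says, the detection step itself is simple. A hedged sketch of "doing the cpuid ourselves" follows, assuming a GCC or Clang toolchain that ships `<cpuid.h>`; `cpu_has_sse2` is a hypothetical name, and this covers only detection, not the harder question of what to do with the answer:

```c
#include <stdbool.h>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang wrapper around the CPUID instruction */
#endif

/* Returns true if CPUID reports SSE2 (leaf 1, EDX bit 26).
 * On non-x86 builds this sketch simply reports false. */
bool cpu_has_sse2(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;                 /* CPUID leaf 1 not available */
    return (edx & (1u << 26)) != 0;   /* SSE2 feature flag */
#else
    return false;
#endif
}
```

The result would typically be computed once at library startup and cached.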

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Doug Ledford <[EMAIL PROTECTED]>: > Subject: Re: RHEL5 and OFED ... > > On Tue, 2006-10-17 at 22:28 +0200, Michael S. Tsirkin wrote: > > On a tangent, is there a way to set up a cross-build environment that will > > build kernel modules for e.g. RHEL amd64 kernel on a 32 bit machine? >

[openib-general] [GIT PULL] please pull infiniband.git

2006-10-17 Thread Roland Dreier
Linus, please pull from master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This tree is also available from kernel.org mirrors at: git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This includes various fixes found since 2.6.19-rc2:

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > But of course not all x86 processors > support lfence/mfence which leads to some ugly issues of how to handle > this lfence seems to be part of SSE2, and I don't think we really need sfence/mfence. We can just require SSE2 support: http://en.wikipedi

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
> > Another confusing thing is that asm-i386 defines mb() and rmb() just > > to be compiler barriers, > > I see: > #define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", > X86_FEATURE_XMM2) Oops, you're right. I misread that file. OK, we probably want mb() to be more than a compil

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
Michael> True, but I don't think anyone is still running libibverbs Michael> on processors that don't. What happens on an older Michael> processor when you call lfence? You get an illegal instruction signal and the process dies I guess.

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Another confusing thing is that asm-i386 defines mb() and rmb() just > to be compiler barriers, I see: #define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2) as for mb() - I don't think our kernel code uses that so I think us

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Doug Ledford
On Tue, 2006-10-17 at 22:28 +0200, Michael S. Tsirkin wrote: > On a tangent, is there a way to set up a cross-build environment that will > build kernel modules for e.g. RHEL amd64 kernel on a 32 bit machine? > I'm doing this now with gcc and kernel.org kernel I built myself from source. > I guess

Re: [openib-general] [PATCH] Rewrite cma_req_handler() to encapsulate common code.

2006-10-17 Thread Roland Dreier
OK, queued for 2.6.20 ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Re: [openib-general] [PATCH] Use time_after_eq() instead of time_after() in queue_req()

2006-10-17 Thread Roland Dreier
> Roland, this looks good for 2.6.20. How would you like to handle > pulling in patches like these? Once OFA has git up, would it be > easier to pull them into my git tree, then request that you pull from > there, or does this work okay? Git pulls are definitely the easiest, but I'm fine wit

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > But of course not all x86 processors support lfence/mfence True, but I don't think anyone is still running libibverbs on processors that don't. What happens on an older processor when you call lfence? -- MST

Re: [openib-general] client-server small message performance issues

2006-10-17 Thread Roland Dreier
> Basic ping pong is 25 us. That's fine as this is not a particularly > optimal way to communicate. Each additional server adds 6 us. That > seems like a lot of overhead just to do another pair of posts and > polls, but not my major complaint. Look at the jump from 6 to 7 > servers, 41 us.

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
> Very strange. Let's consider amd64: libibverbs has > > #elif defined(__x86_64__) > > #define mb()asm volatile("" ::: "memory") > > So its just a compiler barrier there. > > While linux has asm-x86_64/system.h > > #define rmb() asm volatile("lfence":::"memory") > > So rmb s
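To make the contrast above concrete, here is a sketch of what separate userspace barriers could look like. The `sketch_*` macro names and the mailbox example are hypothetical; the `lfence` form mirrors the kernel rmb() quoted above and the locked add mirrors the i386 variant mentioned elsewhere in the thread, while the use of `sfence`/`mfence` is an assumption (whether they are needed at all is part of what the thread debates).

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__)
/* Real fence instructions rather than a compiler-only barrier. */
#define sketch_mb()  __asm__ __volatile__("mfence" ::: "memory")
#define sketch_rmb() __asm__ __volatile__("lfence" ::: "memory")
#define sketch_wmb() __asm__ __volatile__("sfence" ::: "memory")
#elif defined(__i386__)
/* No SSE2 assumed: a locked add to the stack acts as a full barrier. */
#define sketch_mb()  __asm__ __volatile__("lock; addl $0,0(%%esp)" ::: "memory", "cc")
#define sketch_rmb() sketch_mb()
#define sketch_wmb() sketch_mb()
#else
/* Other architectures: fall back to the compiler's full-barrier builtin. */
#define sketch_mb()  __sync_synchronize()
#define sketch_rmb() __sync_synchronize()
#define sketch_wmb() __sync_synchronize()
#endif

/* Hypothetical producer/consumer pair showing where the barriers go. */
struct mailbox {
    volatile int ready;
    int payload;
};

void mailbox_post(struct mailbox *m, int value)
{
    assert(m != NULL);
    m->payload = value;
    sketch_wmb();          /* payload must be visible before the flag */
    m->ready = 1;
}

int mailbox_take(struct mailbox *m, int *out)
{
    assert(m != NULL && out != NULL);
    if (!m->ready)
        return 0;
    sketch_rmb();          /* do not read payload ahead of the flag check */
    *out = m->payload;
    return 1;
}
```

With a compiler-only barrier the same code compiles, but nothing stops the CPU from reordering the accesses on architectures where that matters.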

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Roland Dreier
Shirley> Have we ever seen silent data corruption in CHECKSUM_HW? Well, a quick web search finds stuff like http://my.adsm.org/modules.php?op=modload&name=phpBB_14&file=index&action=viewtopic&topic=2362&0 But what I was really talking about was the risk of sending IP packets without a checksu

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Shirley Ma
Parks Fields <[EMAIL PROTECTED]> wrote on 10/17/2006 01:12:48 PM: > > > > >No, it's never a good idea to turn off TCP or IP checksums.  That > >leads to possibilities of silent data corruption too easily. > > I totally agree... Have we ever seen silent data corruption in CHECKSUM_HW? Thanks Sh

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Doug Ledford
On Tue, 2006-10-17 at 22:23 +0200, Michael S. Tsirkin wrote: > Quoting Doug Ledford <[EMAIL PROTECTED]>: > > Evidently, I was mistaken and rhn is still populated with the beta1 > > rpms. So, I've made the latest kernel available on my web page as > > referenced below (amongst other rpms as well).

Re: [openib-general] [PATCH] osm: reviewing osmtest - osmt_multicast.c

2006-10-17 Thread Hal Rosenstock
On Tue, 2006-10-17 at 16:21, Yevgeny Kliteynik wrote: > Hal Rosenstock wrote: > > On Tue, 2006-10-17 at 12:07, Yevgeny Kliteynik wrote: > >> Hi Hal > >> > >> Fixing more things in the multicast test flow. > >> > >> Still have things to do in case when multicast group removal > >> fails, and have

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
On a tangent, is there a way to set up a cross-build environment that will build kernel modules for e.g. RHEL amd64 kernel on a 32 bit machine? I'm doing this now with gcc and kernel.org kernel I built myself from source. I guess I mostly need to get gcc and binutils SRPMs to generate cross-compili

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
Quoting Doug Ledford <[EMAIL PROTECTED]>: > Evidently, I was mistaken and rhn is still populated with the beta1 > rpms. So, I've made the latest kernel available on my web page as > referenced below (amongst other rpms as well). However, it may still be > a while before the rpms are fully populat

Re: [openib-general] [PATCH] osm: reviewing osmtest - osmt_multicast.c

2006-10-17 Thread Yevgeny Kliteynik
Hal Rosenstock wrote: > On Tue, 2006-10-17 at 12:07, Yevgeny Kliteynik wrote: >> Hi Hal >> >> Fixing more things in the multicast test flow. >> >> Still have things to do in case when multicast group removal >> fails, and have to add some cleanup (as we've discussed previously). >> -- >> Yevgeny

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Doug Ledford
On Tue, 2006-10-17 at 17:09 +0200, Michael S. Tsirkin wrote: > > Yeah, this is the rolling updates thing I was telling you about. The > > Beta1 kernel was 2.6.17+several git repos and patches. We've since > > updated to 2.6.18, but that was released as an update to the Beta1 isos > > and trees v

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Parks Fields
> >No, it's never a good idea to turn off TCP or IP checksums. That >leads to possibilities of silent data corruption too easily. I totally agree...

[openib-general] OFED 1.1-RC7 build problem on SLES10

2006-10-17 Thread Chris Dennett
I've been trying to install OFED 1.1 RC7 on an x86 server with a fresh install of SLES10 (32-bit). It errors out when trying to build the kernel modules. I've included what I think are the relevant log messages below. I've tried installing everything (minus iser and tvflash) or just the modul

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: [PATCH] use mmiowb after doorbell ring > > Michael> kernel code does rmb rather than mb there. > > OK, but that's an optimization rather than a correctness issue: mb is > stronger than rmb. Very strange. Let's consider amd64: libib

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
Michael> kernel code does rmb rather than mb there. OK, but that's an optimization rather than a correctness issue: mb is stronger than rmb. The reason I did it that way was because I wasn't sure it was worth defining mb, rmb and wmb for userspace. - R.

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread akepner
On Tue, 17 Oct 2006, Roland Dreier wrote: > OK, here's what I actually put in my tree. Can you eyeball this and > maybe give it a quick test? If it looks good to you, I'll send it on > to the stable team for 2.6.18.x. > Yep, looks fine, and it works on my Altix. Thanks, Roland. -- Arthur

[openib-general] OFED-1.1-pre1 is ready

2006-10-17 Thread Tziporet Koren
Hi All, OFED 1.1-pre1 is available: URL: https://openib.org/svn/gen2/branches/1.1/ofed/releases/OFED-1.1-pre1.tgz According to the 1.1 release schedule I published yesterday and got all partners approval (Qlogic have not answered so I assumed its OK with them too). Each company has 3 days for ba

[openib-general] client-server small message performance issues

2006-10-17 Thread Pete Wyckoff
I'm trying to understand some performance variation in an Openib application, and wrote a small test program to simulate its behavior. Attached are the code and a plot of some results. Each dot in the plot shows the time for a single iteration in the code explained below. One client communicates

Re: [openib-general] [ucma] executing the ucmatose with local IPoIB IP address of port 2 fails

2006-10-17 Thread Sean Hefty
Sean Hefty wrote: > This is a ROUTE_ERROR (path record query fails). Are the IP addresses on > different subnets? Are you having ucmatose bind to the port 2 ip address. Another thing to check is what port ucmatose binds to after calling rdma_resolve_addr(). - Sean

Re: [openib-general] [PATCH] osm: reviewing osmtest - osmt_multicast.c

2006-10-17 Thread Hal Rosenstock
On Tue, 2006-10-17 at 12:07, Yevgeny Kliteynik wrote: > Hi Hal > > Fixing more things in the multicast test flow. > > Still have things to do in case when multicast group removal > fails, and have to add some cleanup (as we've discussed previously). > -- > Yevgeny > > Signed-off-by: Yevgeny Kl

Re: [openib-general] [ucma] executing the ucmatose with local IPoIB IP address of port 2 fails

2006-10-17 Thread Sean Hefty
> scenario 2: fails > SM was executed on port 2 > i executed ucmatose server and ucmatose client with IPoIB IP address > of port 2 > > here is the output of the client: > ucmatose: starting client > ucmatose: connecting > ucmatose: event: 3, error: 0 > receiving data transfers > sending

[openib-general] [PATCH] opensm: misc fixes in lft dump file parser

2006-10-17 Thread Sasha Khapyorsky
There are misc small fixes for lft dump parser: - merge ERROR and SYS logging in single osm_log() call - more strict strtoul() results checking - fix potential bugs with invalid dump files - break too long lines Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]> --- osm/opensm/osm_ucast_file.c
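For readers wondering what "more strict strtoul() results checking" usually involves, here is a generic sketch. It is not the code from the opensm patch; `parse_ulong_strict` is a hypothetical helper that shows the checks commonly needed around `strtoul()`:

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Parse an unsigned long strictly. Returns 0 on success, -1 on any of:
 * empty input, leading '-', no digits, out-of-range value, trailing junk. */
int parse_ulong_strict(const char *s, int base, unsigned long *out)
{
    char *end;
    unsigned long val;

    assert(s != NULL && out != NULL);

    while (isspace((unsigned char)*s))
        s++;
    if (*s == '\0' || *s == '-')
        return -1;              /* strtoul would silently negate "-1" */

    errno = 0;
    val = strtoul(s, &end, base);
    if (end == s)
        return -1;              /* no digits consumed */
    if (errno == ERANGE)
        return -1;              /* value does not fit in unsigned long */
    while (isspace((unsigned char)*end))
        end++;
    if (*end != '\0')
        return -1;              /* trailing non-space characters */

    *out = val;
    return 0;
}
```

The key points are resetting `errno` before the call, checking the end pointer, and rejecting whatever is left after the number, which matters when parsing a dump file that may be malformed.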

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: [PATCH] use mmiowb after doorbell ring > > > > I don't think an mmiowb() equivalent is available from userspace. > > > > Isn't this just an asm() command? > > Nope, look at the kernel source, specifically arch/ia64/sn/kernel/iomv.c

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
> > I don't think an mmiowb() equivalent is available from userspace. > > Isn't this just an asm() command? Nope, look at the kernel source, specifically arch/ia64/sn/kernel/iomv.c > BTW, I think we really should implement proper rmb/wmb in arch.h. > Last time I looked we only had compiler

Re: [openib-general] [PATCH] Re-send ARP as prev ARP request could have got dropped.

2006-10-17 Thread Sean Hefty
Krishna Kumar wrote: > Re-send ARP, since earlier ARP request could have got > dropped/lost. This should be done in addr_resolve_remote() > as doing it in rdma_resolve_ip() means sending ARP only > once. This was intentional. Users can call rdma_resolve_ip() again to retry a timed out request.

Re: [openib-general] [PATCH] Rewrite cma_req_handler() to encapsulate common code.

2006-10-17 Thread Sean Hefty
Acked-by: Sean Hefty <[EMAIL PROTECTED]> Let me see how Roland would like to handle merging the patches going forward, but this one looks fine.

Re: [openib-general] [PATCH] Fix some cancellation problems in process_req().

2006-10-17 Thread Sean Hefty
Krishna Kumar wrote: > mutex_lock(&lock); > list_for_each_entry_safe(req, temp_req, &req_list, list) { > - if (req->status) { > + if (req->status && req->status != -ECANCELED) { I think we just need: if (req->status == -ENODATA) { >

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Shirley Ma
What I suggested here is when it's connected mode with large MTU, set ib interface flag to CHECKSUM_UNNECESSARY. But this only works on packets not being routed off-net at the TCP layer. Thanks Shirley Ma IBM Linux Technology Center 15300 SW Koll Parkway Beaverton, OR 97006-6063 Phone(Fax): (503)

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread akepner
On Tue, 17 Oct 2006, Michael S. Tsirkin wrote: > Quoting r. Roland Dreier <[EMAIL PROTECTED]>: >> Subject: Re: [PATCH] use mmiowb after doorbell ring >> >> Michael> BTW, something like this will be needed for userspace too? >> >> Ugh, I forgot about that. >> >> I don't think an mmiowb() equiva

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > No, it's never a good idea to turn off TCP or IP checksums. That > leads to possibilities of silent data corruption too easily. "never" is probably too strong a word - hardware checksum offloading turns off checksumming in software, moving that to h

Re: [openib-general] [PATCH] Use time_after_eq() instead of time_after() in queue_req()

2006-10-17 Thread Sean Hefty
Acked-by: Sean Hefty <[EMAIL PROTECTED]> Roland, this looks good for 2.6.20. How would you like to handle pulling in patches like these? Once OFA has git up, would it be easier to pull them into my git tree, then request that you pull from there, or does this work okay? > In queue_req(), use
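As background on the two helpers named in the subject line: both compare tick counts in a way that survives counter wraparound, and they differ only when the two values are equal. A userspace sketch of that comparison follows; `ticks_after` and `ticks_after_eq` are hypothetical names, and this is not the kernel macro text verbatim:

```c
/* True if tick count a is strictly later than b, tolerating wraparound.
 * The unsigned subtraction wraps; the signed view gives the direction. */
int ticks_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* True if a is later than or equal to b. */
int ticks_after_eq(unsigned long a, unsigned long b)
{
    return (long)(a - b) >= 0;
}
```

For equal inputs `ticks_after()` is false while `ticks_after_eq()` is true, which is the only case where swapping one for the other changes behavior.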

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: [PATCH] use mmiowb after doorbell ring > > Michael> BTW, something like this will be needed for userspace too? > > Ugh, I forgot about that. > > I don't think an mmiowb() equivalent is available from userspace. Isn't this just an

Re: [openib-general] [PATCH] [RFC] cma_new_id can kfree on error instead of destroy_id

2006-10-17 Thread Sean Hefty
Krishna Kumar wrote: > cma_new_id() does not require to do destroy_id(), instead > it can kfree(), since nothing is allocated on that id. > Posting this as an RFC in case anyone feels that create_id > should be cleaned up by destroy_id (even if redundant). I can go either way on this. It's a litt

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Roland Dreier
Shirley> I read the discussion in net-dev. Since IB packet has its Shirley> own CRC (ICRC, VCRC). Is it a good idea to enable Shirley> checksum unnecessary in a pure IB Fabrics for large MTU Shirley> 64K. It requires some negotiation. Does your prototype Shirley> implementation

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
Michael> BTW, something like this will be needed for userspace too? Ugh, I forgot about that. I don't think an mmiowb() equivalent is available from userspace. However, the problem only arises if userspace uses the same QP/CQ/SRQ from multiple nodes at the same time -- so maybe we can live wi

Re: [openib-general] uDAPL problem

2006-10-17 Thread Arlin Davis
Stephen Smaldone wrote: > > > Arlin Davis wrote: > >> Steve Smaldone wrote: >> >>> Hi, >>> >>> Sorry for replying to myself, but I loaded rdma_ucm and the rdma_cm >>> device appears. However, it now fails with the following: >>> >>> $ ./dapltest -T S -D IB1 >>> ... >>> DAT Registry: dat_ia_openv

Re: [openib-general] OFED 1.1 release schedule

2006-10-17 Thread Sean Hefty
Michael S. Tsirkin wrote: > Could be a compiler thing: maybe cm_issue_rej is used in more than > one place? To make sure, you can try removing the static > keyword and see if this appears. That could be. cm_issue_rej is called from multiple locations, whereas cm_issue_drep is not. - Sean

Re: [openib-general] [PATCH] If addr_handler() got error, do not set state as OK

2006-10-17 Thread Sean Hefty
Krishna Kumar wrote: > diff -ruNp org/drivers/infiniband/core/cma.c new/drivers/infiniband/core/cma.c > --- org/drivers/infiniband/core/cma.c 2006-10-10 15:45:27.0 +0530 > +++ new/drivers/infiniband/core/cma.c 2006-10-10 15:59:53.0 +0530 > @@ -1515,6 +1515,8 @@ static void addr_hand

Re: [openib-general] OFED 1.1 release schedule

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Sean Hefty <[EMAIL PROTECTED]>: > Subject: Re: OFED 1.1 release schedule > > Tziporet Koren wrote: > > I checked it and saw that the patch is applied, but since in the patch > > Sean put the cm_issue_drep as a static, thus nm does not show it. > > from the patch: +static int cm_issue_d

Re: [openib-general] [PATCH] rdma_bind_addr() leaks a cma_dev reference count

2006-10-17 Thread Sean Hefty
Krishna Kumar2 wrote: > Hmmm, OK, I will re-phrase this patch to reduce nesting. Something similar to: if (cma_any_addr...) { ret = rdma_translate_ip(..); if (ret) goto err1; mutex_lock ret = cma_acquire_dev mutex_unlock if (ret)

Re: [openib-general] OFED 1.1 release schedule

2006-10-17 Thread Sean Hefty
Tziporet Koren wrote: > I checked it and saw that the patch is applied, but since in the patch > Sean put the cm_issue_drep as a static, thus nm does not show it. > from the patch: +static int cm_issue_drep(struct cm_port *port, cm_issue_rej is also static, but shows up. > Do you really need the

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread glebn
On Tue, Oct 17, 2006 at 06:48:22PM +0200, Michael S. Tsirkin wrote: > Quoting r. [EMAIL PROTECTED] <[EMAIL PROTECTED]>: > > Subject: Re: [openfabrics-ewg] OFED 1.1 RC7 fork() issue. > > > > On Tue, Oct 17, 2006 at 10:34:23AM -0500, Tang, Changqing wrote: > > > > > > >3. Fork support from kernel 2

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Michael S. Tsirkin
Quoting r. [EMAIL PROTECTED] <[EMAIL PROTECTED]>: > Subject: Re: [openfabrics-ewg] OFED 1.1 RC7 fork() issue. > > On Tue, Oct 17, 2006 at 10:34:23AM -0500, Tang, Changqing wrote: > > > > >3. Fork support from kernel 2.6.12 and above is available > > >provided that applications do not use threads

Re: [openib-general] Tools for development

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Matt Leininger <[EMAIL PROTECTED]>: > Developers had requested git 1.4, but Ubuntu had an older version. We > went ahead and installed git from source. I'd prefer to stick to Ubuntu > packages if possible. We have much to gain from newer versions - just look at gitweb change log. But

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Shirley Ma <[EMAIL PROTECTED]>: > > > /* can be added later once ipoib support sg > > > .get_sg = ethtool_op_get_sg, > > > .set_sg = ethtool_op_set_sg, > > > */ > > > > The difficulty here is that sg currently requires checksum offloading in > > netdevice. > > I read the discussion in

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Tang, Changqing
Thanks for the clarification. --CQ >> >You need to distinguish between full fork support that >will be available only in libibverbs1.1 and the system /fork & >exec fork support that depends on the kernel only and >is available from kernel 2.6.12. > >See also the explanation from Gleb

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Tziporet Koren
Tang, Changqing wrote: > Thanks, I still use 2.6.9-34, Or Gerlitz told me that fork() support is > only in libibverbs1.1 which is not released yet. Both OFED 1.0 and 1.1 > use libibverbs1.0, is it still true ? > > --CQ > > > You need to make a difference between full fork support that will be a

Re: [openib-general] Tools for development

2006-10-17 Thread Sasha Khapyorsky
On 17:04 Tue 17 Oct , Michael S. Tsirkin wrote: > Quoting r. Steve Wise <[EMAIL PROTECTED]>: > > At the risk of opening a can of worms, is there any reason we don't move > > the user stuff into its own git tree? This would get rid of svn > > altogether... > > If we do, that should probably be

Re: [openib-general] Tools for development

2006-10-17 Thread Sasha Khapyorsky
On 09:17 Tue 17 Oct , Jeff Squyres wrote: > Per the teleconference last week, I'd like to survey the developers > about the tools that should be installed on the new OFA server (is > there a plan to migrate there yet?). > > As I understand it (please correct me if I get this wrong): > > -

[openib-general] [PATCH] osm: reviewing osmtest - osmt_multicast.c

2006-10-17 Thread Yevgeny Kliteynik
Hi Hal Fixing more things in the multicast test flow. Still have things to do in case when multicast group removal fails, and have to add some cleanup (as we've discussed previously). -- Yevgeny Signed-off-by: Yevgeny Kliteynik <[EMAIL PROTECTED]> Index: osmtest/osmt_multicast.c =

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread glebn
On Tue, Oct 17, 2006 at 10:34:23AM -0500, Tang, Changqing wrote: > > >3. Fork support from kernel 2.6.12 and above is available > >provided that applications do not use threads. The fork() is > >supported as long as parent process does not run before child > >exits or calls exec(). > > After f

Re: [openib-general] Tools for development

2006-10-17 Thread Matt Leininger
On Tue, 2006-10-17 at 07:49 -0700, Roland Dreier wrote: > Michael> The tool versions installed on openib are ancient. Can > Michael> site admins please install latest svn and git versions > Michael> from source? > > What distro is on the new openfabrics.org server? Ubuntu. > If i

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Tang, Changqing
>3. Fork support from kernel 2.6.12 and above is available >provided that applications do not use threads. The fork() is >supported as long as parent process does not run before child >exits or calls exec(). After fork(), in child, before exec(), can we call printf(), putenv(), or even re-dire
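The rule quoted above (the parent must not run before the child exits or calls exec()) can be illustrated with a minimal sketch in plain POSIX C. The helper name spawn_and_wait is hypothetical and nothing here touches libibverbs. The sketch assumes that a parent which does nothing after fork() except block in waitpid() is close enough to "not running" for the purposes of the rule; that is an assumption of this example, not something the thread confirms.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libibverbs or OFED): fork, exec
 * immediately in the child, and keep the parent blocked until the
 * child has exited. Returns the child's exit status, or -1 on error.
 */
int spawn_and_wait(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;              /* fork() itself failed */

    if (pid == 0) {
        /* Child: do no other work, just replace the process image. */
        execvp(argv[0], argv);
        _exit(127);             /* reached only if exec failed */
    }

    /* Parent: do nothing else until the child is gone. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would pass a NULL-terminated argument vector such as {"sh", "-c", "exit 3", NULL} and get 3 back. Whether calls like printf() or putenv() are safe in the child before exec() is the question raised in the message and is not answered by this sketch.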

Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Shirley Ma
"Michael S. Tsirkin" <[EMAIL PROTECTED]> wrote on 10/16/2006 11:12:03 PM: > Quoting r. Shirley Ma <[EMAIL PROTECTED]>: > > /* can be added later once ipoib support sg > > .get_sg = ethtool_op_get_sg, > > .set_sg = ethtool_op_set_sg, > > */ > > The difficulty here is that sg currently requires ch

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: [PATCH] use mmiowb after doorbell ring > > OK, here's what I actually put in my tree. Can you eyeball this and > maybe give it a quick test? If it looks good to you, I'll send it on > to the stable team for 2.6.18.x. BTW, something li
