Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> Subject: Re: [openfabrics-ewg] RHEL5 and OFED ...
> 
> On Sun, 2006-10-15 at 12:13 -0400, Doug Ledford wrote:
> 
> > > Now for userspace - does RHEL5 include at least libibverbs-1.0?
> > > This has been released a while back, and Roland makes regular bugfix 
> > > releases.
> > 
> > It includes the OFED 1.0 libibverbs (which makes openmpi complain about
> > lack of out of band data support, but otherwise seems to work).

What's out of band data BTW?

> I built the OFED-1.1-pre1 user space RPMs for RHEL5.  They are available
> at my web site.

Thanks!

> Kernel RPMs with the OFED 1.1 code will come a little
> later.

From our discussion, it seems we should be able to just push the
small number of missing bits into RHEL5 directly. That would be
nicer of course.

-- 
MST

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...

2006-10-17 Thread Doug Ledford
On Sun, 2006-10-15 at 12:13 -0400, Doug Ledford wrote:

> > Now for userspace - does RHEL5 include at least libibverbs-1.0?
> > This has been released a while back, and Roland makes regular bugfix 
> > releases.
> 
> It includes the OFED 1.0 libibverbs (which makes openmpi complain about
> lack of out of band data support, but otherwise seems to work).

I built the OFED-1.1-pre1 user space RPMs for RHEL5.  They are available
at my web site.  Kernel RPMs with the OFED 1.1 code will come a little
later.

-- 
Doug Ledford <[EMAIL PROTECTED]>
  GPG KeyID: CFBFF194
  http://people.redhat.com/dledford

Infiniband specific RPMs available at
  http://people.redhat.com/dledford/Infiniband



Re: [openib-general] OFED-1.1-pre1 is ready

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Or Gerlitz <[EMAIL PROTECTED]>:
> Subject: Re: OFED-1.1-pre1 is ready
> 
> Tziporet Koren wrote:
> > OFED 1.1-pre1 is available:
> > URL:
> > https://openib.org/svn/gen2/branches/1.1/ofed/releases/OFED-1.1-pre1.tgz
> > Release details:
> > 
> > BUILD_ID:
> > OFED-1.1-pre1
> > 
> > openib-1.1 (REV=9854)
> > # User space
> > https://openib.org/svn/gen2/branches/1.1/src/userspace
> > Git:
> > ref: refs/heads/ofed_1_1
> > commit 936b9fc0bd1411b52826213a5d89e2ceb4f52a78
> 
> Hi Tziporet,
> 
> I asked Michael this a few days ago and did not get a reply yet: can 
> you clarify where the version of the OFED IB ***kernel*** drivers is 
> stated?

That's the commit 936b9fc0bd1411b52826213a5d89e2ceb4f52a78 part.

> I understand they are typically based on some tag of Linus' git tree (for 
> example OFED 1.1 uses 2.6.18 - correct?) but I could not find any mention 
> of that in the docs or in the per-rc emails.
> 
> Or.

OFED 1.1 was last rebased against 2.6.18-rc6, plus a couple of small patches
touching cma, plus added scripts, out-of-kernel modules, backports etc. 2.6.18
wasn't out by code freeze time, but all fixes in 2.6.18 are also in OFED 1.1.

Try something like
git log v2.6.18-rc6..936b9fc0bd1411b52826213a5d89e2ceb4f52a78
to get the list of OFED changes against v2.6.18-rc6.

-- 
MST




Re: [openib-general] OFED-1.1-pre1 is ready

2006-10-17 Thread Or Gerlitz
Tziporet Koren wrote:
> OFED 1.1-pre1 is available:
> URL:
> https://openib.org/svn/gen2/branches/1.1/ofed/releases/OFED-1.1-pre1.tgz
> Release details:
> 
> BUILD_ID:
> OFED-1.1-pre1
> 
> openib-1.1 (REV=9854)
> # User space
> https://openib.org/svn/gen2/branches/1.1/src/userspace
> Git:
> ref: refs/heads/ofed_1_1
> commit 936b9fc0bd1411b52826213a5d89e2ceb4f52a78

Hi Tziporet,

I asked Michael this a few days ago and did not get a reply yet: can 
you clarify where the version of the OFED IB ***kernel*** drivers is 
stated?

I understand they are typically based on some tag of Linus' git tree (for 
example OFED 1.1 uses 2.6.18 - correct?) but I could not find any mention 
of that in the docs or in the per-rc emails.

Or.





Re: [openib-general] OFED 1.1-RC7 build problem on SLES10

2006-10-17 Thread Erez Zilber
I reported the same problem last week:

http://openib.org/pipermail/openfabrics-ewg/2006-October/001714.html

-- 
Erez Zilber | 972-9-971-7689
Software Engineer, Storage Team
Voltaire – The Grid Backbone
www.voltaire.com


Scott Weitzenkamp (sweitzen) wrote:
> You need the kernel-source RPM, I guess the OFED install.sh should check
> for that RPM.
>
> svbu-qa-opteron-1:~ # uname -a
> Linux svbu-qa-opteron-1 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 i686 athlon i386 GNU/Linux
> svbu-qa-opteron-1:~ # rpm -qa | fgrep kernel
> kernel-source-2.6.16.21-0.8
> kernel-smp-2.6.16.21-0.8
> kernel-ib-1.1-2.6.16.21_0.8_smp
> kernel-ib-devel-1.1-2.6.16.21_0.8_smp
> svbu-qa-opteron-1:~ # ls /usr/src/linux-2.6.16.21-0.8-obj/i386/smp
> .config         Makefile        arch     include2
> .kernelrelease  Module.symvers  include  scripts
>
> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
>  
>
>   
>> -Original Message-
>> From: [EMAIL PROTECTED] 
>> [mailto:[EMAIL PROTECTED] On Behalf Of Chris Dennett
>> Sent: Tuesday, October 17, 2006 12:46 PM
>> To: [EMAIL PROTECTED]; openib-general@openib.org
>> Subject: [openib-general] OFED 1.1-RC7 build problem on SLES10
>>
>> I've been trying to install OFED 1.1 RC7 on an x86 server with a fresh install
>> of SLES10 (32-bit).  It errors out when trying to build the kernel modules.
>> I've included what I think are the relevant log messages below.  I've tried
>> installing everything (minus iser and tvflash) or just the modules needed for
>> SRP.  I've installed 1.1 RC7 successfully on other RedHat servers without any
>> problems.  I am installing as root.  Any help would be appreciated.
>>
>> Thanks.
>>
>> -Chris
>>
>> ==
>> + make kernel
>> Building kernel modules
>> Kernel version: 2.6.16.21-0.8-smp
>> Modules directory: //lib/modules/2.6.16.21-0.8-smp
>> Kernel sources: /lib/modules/2.6.16.21-0.8-smp/build
>> env EXTRA_CFLAGS=" -I/var/tmp/OFEDRPM/BUILD/openib-1.1/include 
>> -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/include \
>> 
>> -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/ulp/ipoib \
>> 
>> -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/debug" \
>> make -C /lib/modules/2.6.16.21-0.8-smp/build 
>> SUBDIRS="/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband" 
>> KERNELRELEASE=2.6.16.21-0.8-smp \
>> EXTRAVERSION=.21-0.8-smp V=1  \
>> CONFIG_INFINIBAND=m \
>> CONFIG_INFINIBAND_IPOIB=m \
>> CONFIG_INFINIBAND_SDP= \
>> CONFIG_INFINIBAND_SRP=m \
>> CONFIG_INFINIBAND_USER_MAD=m \
>> CONFIG_INFINIBAND_USER_ACCESS=m \
>> CONFIG_INFINIBAND_ADDR_TRANS=y \
>> CONFIG_INFINIBAND_MTHCA=m \
>> CONFIG_INFINIBAND_IPOIB_DEBUG=y \
>> CONFIG_INFINIBAND_ISER= \
>> CONFIG_INFINIBAND_EHCA= \
>> CONFIG_INFINIBAND_RDS= \
>> CONFIG_INFINIBAND_RDS_DEBUG= \
>> CONFIG_INFINIBAND_IPOIB_DEBUG_DATA= \
>> CONFIG_INFINIBAND_SDP_SEND_ZCOPY= \
>> CONFIG_INFINIBAND_SDP_RECV_ZCOPY= \
>> CONFIG_INFINIBAND_SDP_DEBUG= \
>> CONFIG_INFINIBAND_SDP_DEBUG_DATA= \
>> CONFIG_INFINIBAND_IPATH= \
>> CONFIG_INFINIBAND_MTHCA_DEBUG=y \
>> CONFIG_INFINIBAND_MADEYE= \
>> LINUXINCLUDE='-I/var/tmp/OFEDRPM/BUILD/openib-1.1/include \
>> 
>> -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/include \
>> -Iinclude \
>> $(if $(KBUILD_SRC),-Iinclude2 -I$(srctree)/include) \
>> -include include/linux/autoconf.h \
>> -include 
>> /var/tmp/OFEDRPM/BUILD/openib-1.1/include/linux/autoconf.h \
>> ' \
>> modules
>> make[1]: Entering directory 
>> `/usr/src/linux-2.6.16.21-0.8-obj/i386/smp'
>> make[1]: *** No rule to make target `modules'.  Stop.
>> make[1]: Leaving directory `/usr/src/linux-2.6.16.21-0.8-obj/i386/smp'
>> make: *** [kernel] Error 2
>> error: Bad exit status from /var/tmp/rpm-tmp.92052 (%install)
>>
>>
>> RPM build errors:
>> user vlad does not exist - using root
>> group mtl does not exist - using root
>> user vlad does not exist - using root
>> group mtl does not exist - using root
>> Bad exit status from /var/tmp/rpm-tmp.92052 (%install)
>> ERROR: Failed executing "rpmbuild --rebuild --define '_topdir 
>> /var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' --define 
>> 'build_root 
>> /var/tmp/OFED' --define 'configure_options --with-libibcommon 
>> --with-libibmad 
>> --with-libibumad --with-libibverbs --with-libmthca --with-opensm 
>> --with-librdmacm --with-openib-diags --with-srptools --with-mstflint 
>> --with-perftest --with-ipoib-mod --with-mthca-mod --with-srp-mod 
>> --with-core-mod --with-user_mad-mod --with-user_access-mod 
>> --with-addr_trans-mod' --de

[openib-general] [PATCH] IB/SRP Userspace: srptools/srp_daemon - Fix connect bug and add support for user specified initiator extension

2006-10-17 Thread Lakshmanan, Madhu
The patch addresses 3 issues:
1. Fixes bug in srp_daemon for the case where if it is invoked with the
'-e' option, it fails to connect to the SRP targets because of a newline
character in the parameter string.
2. Changes the name of the constant 'MAX_TRAGET_CONFIG_STR_STRING' to
'MAX_TARGET_CONFIG_STR_STRING'.
3. Changes the behavior of the '-n' option to srp_daemon. The earlier
behavior printed the initiator extension. The new behavior allows the
user to specify an initiator extension as an argument to the '-n'
option.
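
To illustrate issue 1, here is a minimal sketch of removing a trailing
newline from a parameter string before it is handed on. The helper name
strip_trailing_newline() is hypothetical and is not part of srp_daemon;
the patch below appears to take a different route and moves the newline
into the printf() call instead.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of srp_daemon: remove any trailing
 * '\n' or '\r' characters from s in place and return the new length. */
size_t strip_trailing_newline(char *s)
{
    size_t len;

    assert(s != NULL);
    len = strlen(s);
    while (len > 0 && (s[len - 1] == '\n' || s[len - 1] == '\r'))
        s[--len] = '\0';
    return len;
}
```

A caller would run this on the target string before the write() to the
add_target file, and append "\n" only when printing for the user.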

Signed-off-by: Madhu Lakshmanan <[EMAIL PROTECTED]>
---
--- 1.1-orig/src/userspace/srptools/srp_daemon/srp_daemon.c 2006-10-17 06:32:14.0 -0400
+++ 1.1/src/userspace/srptools/srp_daemon/srp_daemon.c  2006-10-17 06:10:12.0 -0400
@@ -78,7 +78,7 @@ static char *sysfs_path = "/sys";
 
 static void usage(const char *argv0)
 {
-   fprintf(stderr, "Usage: %s [-vVceo] [-d  | -i
 [-p ]] [-t ] [-r ]
[-R ]\n", argv0);
+   fprintf(stderr, "Usage: %s [-vVceo] [-d  | -i
 [-p ]] [-n ] [-t ] [-r ] [-R ]\n", argv0);
fprintf(stderr, "-v Verbose\n");
fprintf(stderr, "-V debug Verbose\n");
fprintf(stderr, "-c prints connection
Commands\n");
@@ -91,7 +91,7 @@ static void usage(const char *argv0)
fprintf(stderr, "-Rperform complete Rescan
every  seconds\n");
fprintf(stderr, "-t   Timeout for mad response
in milisec \n");
fprintf(stderr, "-rnumber of send Retries
for each mad\n");
-   fprintf(stderr, "-n New print - prints also
initiator extention\n");
+   fprintf(stderr, "-n  New: use initiator
extension\n");
fprintf(stderr, "\nExample: srp_daemon -e -i mthca0 -p 1 -R
60\n");
 }
 
@@ -114,7 +114,7 @@ void pr_cmd(char *target_str, int not_co
int ret;
 
if (config->cmd)
-   printf("%s", target_str);
+   printf("%s\n", target_str);
 
if (config->execute && not_connected) {
int fd = open(config->add_target_file, O_WRONLY);
@@ -122,6 +122,7 @@ void pr_cmd(char *target_str, int not_co
pr_err("unable to open %s, maybe ib_srp is not
loaded\n", config->add_target_file);
return;
}
+   pr_debug("Add target str: %s\n", target_str);
ret = write(fd, target_str, strlen(target_str));
pr_debug("Adding target returned %d\n", ret);
close(fd);
@@ -174,8 +175,8 @@ static void add_non_exist_traget(char *i
char *subdir_name_ptr;
int prefix_len;
uint8_t dgid_val[16];
-   const int MAX_TRAGET_CONFIG_STR_STRING = 255;
-   char target_config_str[MAX_TRAGET_CONFIG_STR_STRING];
+   const int MAX_TARGET_CONFIG_STR_STRING = 255;
+   char target_config_str[MAX_TARGET_CONFIG_STR_STRING];
int len, len_left;
int not_connected = 1;
 
@@ -190,8 +191,7 @@ static void add_non_exist_traget(char *i
prefix_len = strlen(scsi_host_dir);
subdir_name_ptr = scsi_host_dir + prefix_len;
 
-   subdir = (void *) 1; /* Dummy value to enter the loop */
-   while (subdir) {
+   do {
subdir = readdir(dir);

if (!subdir)
@@ -237,9 +237,9 @@ static void add_non_exist_traget(char *i
 
return;
 
-   }
+   } while (subdir);
 
-   len = snprintf(target_config_str, MAX_TRAGET_CONFIG_STR_STRING,
"id_ext=%s,"
+   len = snprintf(target_config_str, MAX_TARGET_CONFIG_STR_STRING,
"id_ext=%s,"
"ioc_guid=%016llx,"
"dgid=%016llx%016llx,"
"pkey=,"
@@ -249,41 +249,40 @@ static void add_non_exist_traget(char *i
(unsigned long long) subnet_prefix,
(unsigned long long) h_guid,
(unsigned long long) h_service_id);
-   if (len >= MAX_TRAGET_CONFIG_STR_STRING) {
+   if (len >= MAX_TARGET_CONFIG_STR_STRING) {
pr_err("Target conifg string is too long, ignoring
target\n");
closedir(dir);
return;
}
 
if (ioc_prof.io_class != htons(SRP_REV16A_IB_IO_CLASS)) {
-   len_left = MAX_TRAGET_CONFIG_STR_STRING - len;
+   len_left = MAX_TARGET_CONFIG_STR_STRING - len;
len += snprintf(target_config_str+len, 
-   MAX_TRAGET_CONFIG_STR_STRING - len,
+   MAX_TARGET_CONFIG_STR_STRING - len,
",io_class=%04hx",
ntohs(ioc_prof.io_class));
 
-   if (len >= MAX_TRAGET_CONFIG_STR_STRING) {
+   if (len >= MAX_TARGET_CONFIG_STR_STRING) {
pr_err("Target conifg string is too long,
ignoring target\n");
closedir(dir);
return;
}
}
 
-   if (config->print_initiator_

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> Subject: Re: RHEL5 and OFED ...
> 
> On Wed, 2006-10-18 at 06:01 +0200, Michael S. Tsirkin wrote:
> > Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> > > Far easier would be to go the other way around,
> > > run on x86_64 and build for i386, in which case gcc supports that out of
> > > the box.
> > 
> > All that's left is to convince Lenovo there's a market for x86_64
> > thinkpads.
> 
> I would place that behind convincing them not to ship exploding
> batteries (thankfully, they actually saw the light on that after some
> spectacular examples of why they should).

Sigh. BTW, the utility they supplied to check the battery didn't run under wine,
had to open the case, copy the S/N to their web page manually.

And I wondered - what does the utility do on windows? Heats the thing
up and tests whether it explodes?

-- 
MST




Re: [openib-general] [PATCH] If addr_handler() got error, do not set state as OK

2006-10-17 Thread Krishna Kumar2
Sean Hefty <[EMAIL PROTECTED]> wrote on 10/17/2006 10:33:41 PM:

> Can you rework this patch without adding in extra flags to indicate what has or
> has not been executed?

OK, will fix it accordingly.

thanks,

- KK


> Krishna Kumar wrote:
> > diff -ruNp org/drivers/infiniband/core/cma.c new/drivers/infiniband/core/cma.c
> > --- org/drivers/infiniband/core/cma.c   2006-10-10 15:45:27.0 +0530
> > +++ new/drivers/infiniband/core/cma.c   2006-10-10 15:59:53.0 +0530
> > @@ -1515,6 +1515,8 @@ static void addr_handler(int status, str
> >  {
> > struct rdma_id_private *id_priv = context;
> > enum rdma_cm_event_type event;
> > +   int did_comp_exch = 0;
> > +   int destroy = 0;
> 
> As a general comment, I really don't think that we need to be overly concerned
> about optimizing error handling at the expense of code readability.
> 

> 
> Thanks,
> - Sean





Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Doug Ledford
On Wed, 2006-10-18 at 06:01 +0200, Michael S. Tsirkin wrote:
> Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> > Far easier would be to go the other way around,
> > run on x86_64 and build for i386, in which case gcc supports that out of
> > the box.
> 
> All that's left is to convince Lenovo there's a market for x86_64
> thinkpads.

I would place that behind convincing them not to ship exploding
batteries (thankfully, they actually saw the light on that after some
spectacular examples of why they should).

-- 
Doug Ledford <[EMAIL PROTECTED]>
  GPG KeyID: CFBFF194
  http://people.redhat.com/dledford

Infiniband specific RPMs available at
  http://people.redhat.com/dledford/Infiniband



[openib-general] [PATCH] RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count

2006-10-17 Thread Krishna Kumar
rdma_bind_addr() leaks a cma_dev reference count
in failure case.

Signed-off-by: Krishna Kumar <[EMAIL PROTECTED]>
---
diff -ruNp org/drivers/infiniband/core/cma.c new/drivers/infiniband/core/cma.c
--- org/drivers/infiniband/core/cma.c   2006-10-09 17:13:41.0 +0530
+++ new/drivers/infiniband/core/cma.c   2006-10-09 19:42:31.0 +0530
@@ -1749,6 +1749,7 @@ static int cma_get_port(struct rdma_id_p
 int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr)
 {
struct rdma_id_private *id_priv;
+   int did_acquire_dev = 0;
int ret;
 
if (addr->sa_family != AF_INET)
@@ -1767,6 +1768,7 @@ int rdma_bind_addr(struct rdma_cm_id *id
}
if (ret)
goto err;
+   did_acquire_dev = 1;
}
 
memcpy(&id->route.addr.src_addr, addr, ip_addr_size(addr));
@@ -1776,6 +1778,8 @@ int rdma_bind_addr(struct rdma_cm_id *id
 
return 0;
 err:
+   if (did_acquire_dev)
+   cma_detach_from_dev(id_priv);
cma_comp_exch(id_priv, CMA_ADDR_BOUND, CMA_IDLE);
return ret;
 }




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Jason Gunthorpe
On Wed, Oct 18, 2006 at 06:43:54AM +0200, Michael S. Tsirkin wrote:
> The difference here is that libibverbs insists on putting all plugins
> in a separate directory and passing full path to dlopen, which of course
> breaks this.

Yeah, plugins in a separate dir are not well supported by all the
fancy things that dl does behind the scenes.. Unfortunately
dlopen will not permute a relative path with the search parameters
so there is no way to make it work other than fiddling with LD_LIBRARY_PATH
prior to calling dlopen:
  old = setenv("LD_LIBRARY_PATH",dirname(foo));
  dlopen(basename(foo));
  setenv("LD_LIBRARY_PATH",old);

I just took a quick look at the directory setup and if you are
changing things I'd say you should also arrange to have the libibverbs
soname stamped into the plugin path and soname. Something like
libmthca-libibverbs.2.so.0. Once you do that it is pretty safe
to put it in /usr/lib* 
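
The naming scheme suggested above could be generated along these lines.
This is only a sketch: the helper plugin_soname() and the exact format
string are assumptions based on the single example name
libmthca-libibverbs.2.so.0, not an agreed libibverbs convention.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format
 *   lib<driver>-libibverbs.<verbs_major>.so.<plugin_major>
 * into buf. Returns 0 on success, -1 if the name did not fit. */
int plugin_soname(char *buf, size_t buflen, const char *driver,
                  int verbs_major, int plugin_major)
{
    int len;

    assert(buf != NULL && driver != NULL);
    len = snprintf(buf, buflen, "lib%s-libibverbs.%d.so.%d",
                   driver, verbs_major, plugin_major);
    /* snprintf returns the length it wanted to write, so a value
     * >= buflen means the output was truncated. */
    if (len < 0 || (size_t)len >= buflen)
        return -1;
    return 0;
}
```

With driver "mthca", verbs_major 2 and plugin_major 0 this yields the
example name from the paragraph above, so two libibverbs major versions
can each find their own plugin in the same directory.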

For libraries it is always best to design in support for multiple
major versions being installed at once since invariably someone will
need to do that down the road.

Jason




Re: [openib-general] [PATCH] rdma_bind_addr() leaks a cma_dev reference count

2006-10-17 Thread Krishna Kumar2
> Something similar to:
> 
> if (cma_any_addr...) {
>ret = rdma_translate_ip(..);
>if (ret)
>   goto err1;
> 
>mutex_lock
>ret = cma_acquire_dev
>mutex_unlock
>if (ret)
>   goto err2;
> }
> 
> should work fine.

Actually that will not work, since the undo operation is for when the
next operation (cma_get_port()) fails after we did an acquire_dev,
and in that case the refcount needs to be dropped. So I am not
able to avoid using an extra flag to indicate that a ref was got some
time in the past, and drop it in the error path. I will send that out now.

Thanks,

- KK





Re: [openib-general] [PATCH] Use time_after_eq() instead of time_after() in queue_req()

2006-10-17 Thread Krishna Kumar2
> Please add something like "RDMA/addr: " before the "Use" there, so
> that someone skimming the kernel log knows what subsystem/specific
> area the patch touches.  (I added that by hand)

> Git just wants three -s like "---" between changelog entry and actual patch.

> the last line in the original mail was blank, when it should have a
> single space.  This makes git complain (correctly) about a corrupt
> patch.  Please make sure your mailer doesn't corrupt whitespace.

OK, all points noted. Sorry for the extra work :)

- KK





Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Jason Gunthorpe <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] use mmiowb after doorbell ring
> 
> On Tue, Oct 17, 2006 at 08:44:34PM -0700, Roland Dreier wrote:
> > Jason> I think the typical way this is done would be to use
> > Jason> ld.so's 'hwcap' handling and stick an optimized library in
> > Jason> /usr/lib/sse2.
>  
> > It's a good suggestion, but the problem is that the CPU-dependent code
> > is in the mthca.so driver-dependent plugin, which libibverbs dlopen()s
> > at runtime.  Do you know how to use the hwcap stuff with dlopen()?
> > I'm not thrilled about creating an sse2 special case in libibverbs
> > just to handle libmthca on i386.
> 
> It is automatic, I just double-checked to be sure:

The difference here is that libibverbs insists on putting all plugins
in a separate directory and passing full path to dlopen, which of course
breaks this.

Roland, I've been looking at changing the way we handle plugins
and this might be a good reason to finally do this before 1.1:
rather than look for plugins in a pre-configured path, let's just have
a config file (or files) and ask users to put the list of plugins there.

As it is, it is already painful to keep both 32 and 64 bit libibverbs on the
same system - we have to invent a methodology for where to put 64/32 bit
libraries. And when I have to keep several library versions around for testing
it's much easier (at least, for me) to just use LD_LIBRARY_PATH for everything
and stick each version in a separate directory, than to remember playing with
special environment that was invented just for libibverbs.

Does this make sense?

-- 
MST




Re: [openib-general] [openfabrics-ewg] RHEL5 and OFED ...

2006-10-17 Thread Roland Dreier
Michael> All that's left is to convince Lenovo there's a market
Michael> for x86_64 thinkpads.

Actually you just have to wait a few months -- Core 2 (Merom) is
64-bit capable so once Lenovo catches up to everyone else, you'll be
able to get a 64-bit thinkpad.

 - R.




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Jason Gunthorpe
On Tue, Oct 17, 2006 at 08:44:34PM -0700, Roland Dreier wrote:
> Jason> I think the typical way this is done would be to use
> Jason> ld.so's 'hwcap' handling and stick an optimized library in
> Jason> /usr/lib/sse2.
 
> It's a good suggestion, but the problem is that the CPU-dependent code
> is in the mthca.so driver-dependent plugin, which libibverbs dlopen()s
> at runtime.  Do you know how to use the hwcap stuff with dlopen()?
> I'm not thrilled about creating an sse2 special case in libibverbs
> just to handle libmthca on i386.

It is automatic, I just double-checked to be sure:

$ cat t.c 
#include <dlfcn.h>
int main(int argc, const char *argv[])
{
   dlopen(argv[1],RTLD_NOW);
}

$ find /usr/lib -name "libcrypto.so.0.9.8"
./i486/libcrypto.so.0.9.8
./libcrypto.so.0.9.8
./i586/libcrypto.so.0.9.8
./i686/cmov/libcrypto.so.0.9.8

$ strace ./t libcrypto.so.0.9.8
[..]
open("/usr/lib/i686/cmov/libcrypto.so.0.9.8", O_RDONLY) = 3
$ mv libcrypto.so.0.9.8 /tmp/libcrypto.so.0.9.8.x
$ strace ./t libcrypto.so.0.9.8
[.. 34 occurrences of open /usr//libcrypto.so.0.9.8 ..]
open("/usr/lib/libcrypto.so.0.9.8", O_RDONLY) = 3
$ ldconfig
$ strace ./t libcrypto.so.0.9.8
[..]
open("/usr/lib/libcrypto.so.0.9.8", O_RDONLY) = 3

Undocumented, but it does something very close to what you'd
want.. ldconfig caches the soname mapping in /etc/ld.so.cache so you
have to be careful when experimenting. Several packages that get big
gains with specific optimizations use this already. Strace on the
above test program with a non-existing library shows the search path
and all permutations.

It is also probably worth benchmarking a full cmov+i686+sse2 build of
everything and, if it is faster, looking at always providing it, as is
often done for glibc.

Jason




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] use mmiowb after doorbell ring
> 
> Roland> For now I just used lock; addl %0 to implement rmb on
> Roland> i386.  I'm really not comfortable making libmthca depend
> Roland> on sse2, and I don't see a good way to detect and use sse2
> Roland> at runtime.
> 
> Jason> I think the typical way this is done would be to use
> Jason> ld.so's 'hwcap' handling and stick an optimized library in
> Jason> /usr/lib/sse2.
> 
> It's a good suggestion, but the problem is that the CPU-dependent code
> is in the mthca.so driver-dependent plugin, which libibverbs dlopen()s
> at runtime.  Do you know how to use the hwcap stuff with dlopen()?
> I'm not thrilled about creating an sse2 special case in libibverbs
> just to handle libmthca on i386.

Off the top of my head, an easy way seems to be to split the sse2-dependent code
into a separate library, which can then be installed on the ld path, and have
mthca pull that in.

-- 
MST




Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> Far easier would be to go the other way around,
> run on x86_64 and build for i386, in which case gcc supports that out of
> the box.

All that's left is to convince Lenovo there's a market for x86_64
thinkpads.

-- 
MST




Re: [openib-general] sysfs exposure of port counters useless?

2006-10-17 Thread Hal Rosenstock
On Tue, 2006-10-17 at 20:18, Scott Weitzenkamp (sweitzen) wrote:
> I agree the 32-bit byte and packet counters are useless as they get
> pegged in a few seconds on a busy IB networks.  I thought there was an
> effort in IBTA to fix this.

The fix at least in terms of the spec has been there for a while.
PortCountersExtended are in the 1.2 spec but not all hardware/PMA
supports these (they are optional).

> For IB counters in a Cisco switch, we read and reset the 32-bit counters
> once per second and keep 64-bit counters internally.

32 bit byte counters can be pegged in only 16 seconds on a 4x SDR link
and there are 4x DDR links now (8 seconds) and 12x links (5 seconds) so
that strategy is inaccurate on busy networks.
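
The arithmetic behind these figures can be sketched as follows. It
assumes, as stated elsewhere in this thread, that the data counters count
32-bit words and peg at all 1s, and it assumes a 4x SDR link carries
about 8 Gbit/s of data after 8b/10b encoding. The function name
seconds_to_peg() is illustrative only.

```c
#include <assert.h>

/* Seconds until a saturating 32-bit counter of 32-bit words pegs when
 * the link moves data continuously at data_rate_bps bits per second. */
double seconds_to_peg(double data_rate_bps)
{
    const double max_words = 4294967295.0;   /* 2^32 - 1 */
    const double bits_per_word = 32.0;       /* counter unit is 4 bytes */

    assert(data_rate_bps > 0.0);
    return (max_words * bits_per_word) / data_rate_bps;
}
```

Under those assumptions seconds_to_peg(8e9) is roughly 17 seconds for 4x
SDR, and doubling or tripling the rate (16e9 for 4x DDR, 24e9 for 12x
SDR) gives roughly 8.6 and 5.7 seconds, in line with the numbers above.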

> This would be possible in OF too, right?

This is part of a performance manager (which is part of fabric
management) and is not standardized (specific to each fabric management
offering). Most offer this manager as part of their solution.

OpenSM will be adding a performance manager in the not too distant future.
An RFC will initially be published on this list so I look forward to
comments since this seems to be an area of interest.
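
For discussion, the read-and-reset scheme mentioned in this thread could
be sketched like this. It is a hypothetical accumulator, not OpenSM or
vendor code; struct counter64 and counter64_add_sample() are made-up
names, and how the 32-bit sample is read and reset is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define COUNTER32_PEGGED 0xFFFFFFFFu   /* counters stick at all 1s */

struct counter64 {
    uint64_t total;   /* running 64-bit sum of the 32-bit samples */
    bool     lossy;   /* set once any sample was read while pegged */
};

/* Fold one read-and-reset sample into the 64-bit total. */
void counter64_add_sample(struct counter64 *c, uint32_t sample)
{
    assert(c != NULL);
    if (sample == COUNTER32_PEGGED)
        c->lossy = true;   /* saturated: the real count is unknown */
    c->total += sample;
}
```

The lossy flag is the point of the sketch: if the polling interval is
longer than the time the link needs to peg the counter, the 64-bit total
silently undercounts, so a consumer should at least be able to tell.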

-- Hal

> Scott Weitzenkamp
> SQA and Release Manager
> Server Virtualization Business Unit
> Cisco Systems
>  
> 
> > -Original Message-
> > From: [EMAIL PROTECTED] 
> > [mailto:[EMAIL PROTECTED] On Behalf Of Michael Newton
> > Sent: Tuesday, October 17, 2006 5:10 PM
> > To: Hal Rosenstock
> > Cc: openib-general@openib.org
> > Subject: Re: [openib-general] sysfs exposure of port counters useless?
> > 
> > On Tue, 17 Oct 2006, Hal Rosenstock wrote:
> > > On Tue, 2006-10-17 at 09:55, Rimmer, Todd wrote:
> > > > > From: Michael Newton
> > > > > Sent: Tuesday, October 17, 2006 3:02 AM
> > > > > To: openib-general@openib.org
> > > > > Subject: [openib-general] sysfs exposure of port counters useless?
> > > > >
> > > > >
> > > > > These are 32 bit counters. The rcv/xmit_data counters count 32-bit
> > > > > blocks. Also, these counts do not wrap: they peg at all 1s.
> > > > > At infiniband speeds, these counts can peg out very quickly indeed,
> > > > > to the point they can really only be of use if they can be reset each time
> > > > > there read. Now if anyone who wants to use them has to go the CLI to reset
> > > > > them, and theres little point in reading them without reset, why would
> > > > > anyone read them via sysfs? so why have them?
> > > > >
> > > >
> > > > We have found that while your comment is true for the data movement
> > > > counters, the error counters should not peg quickly, hence it is valid
> > 
> > its true i overstated the case just a little;) .. yes error counters
> > should be fine and its mainly the data counters that are problematic
> > (tho now im not sure i havent seen the packet counters freeze when the
> > data ones peg out)..
> > 
> > > > to read them without resetting.  However it is also useful to have an
> > > > ability to reset them.  Of course if there are other CLI commands which
> > > > do this easily, the sysfs info is of less value.
> > >
> > > There are diag tools for this.
> > 
> > thats where we started.. the point im making is that exposing the data
> > counters in sysfs is of little use, because if you have to go to other
> > tools to reset, why wouldnt you use them to read as well?
> > 
> > i was looking at exposing infiniband stats via PCP
> > (http://oss.sgi.com/projects/pcp/). This would be useful for folk doing IB
> > performance testing. Its very easy to just feed in the sysfs values..
> > unfortunately they turn out to be of little value. Life would be so much
> > easier if there were 64 bit counters available. Instead I will probably
> > need to have an additional daemon to construct them.
> > 
> > 





Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
Roland> For now I just used lock; addl %0 to implement rmb on
Roland> i386.  I'm really not comfortable making libmthca depend
Roland> on sse2, and I don't see a good way to detect and use sse2
Roland> at runtime.

Jason> I think the typical way this is done would be to use
Jason> ld.so's 'hwcap' handling and stick an optimized library in
Jason> /usr/lib/sse2.

It's a good suggestion, but the problem is that the CPU-dependent code
is in the mthca.so driver-dependent plugin, which libibverbs dlopen()s
at runtime.  Do you know how to use the hwcap stuff with dlopen()?
I'm not thrilled about creating an sse2 special case in libibverbs
just to handle libmthca on i386.

 - R.




Re: [openib-general] sysfs exposure of port counters useless?

2006-10-17 Thread Hal Rosenstock
On Tue, 2006-10-17 at 20:10, Michael Newton wrote:
> On Tue, 17 Oct 2006, Hal Rosenstock wrote:
> > On Tue, 2006-10-17 at 09:55, Rimmer, Todd wrote:
> > > > From: Michael Newton
> > > > Sent: Tuesday, October 17, 2006 3:02 AM
> > > > To: openib-general@openib.org
> > > > Subject: [openib-general] sysfs exposure of port counters useless?
> > > >
> > > >
> > > > These are 32 bit counters. The rcv/xmit_data counters count 32-bit
> > > > blocks. Also, these counts do not wrap: they peg at all 1s.
> > > > At infiniband speeds, these counts can peg out very quickly indeed,
> > > > to the point they can really only be of use if they can be reset each
> > > time
> > > > there read. Now if anyone who wants to use them has to go the CLI to
> > > reset
> > > > them, and theres little point in reading them without reset, why would
> > > > anyone read them via sysfs? so why have them?
> > > >
> > >
> > > We have found that while your comment is true for the data movement
> > > counters, the error counters should not peg quickly, hence it is valid
> 
> its true i overstated the case just a little;) .. yes error counters
> should be fine and its mainly the data counters that are problematic
> (tho now im not sure i havent seen the packet counters freeze when the
> data ones peg out)..
> 
> > > to read them without resetting.  However it is also useful to have an
> > > ability to reset them.  Of course if there are other CLI commands which
> > > do this easily, the sysfs info is of less value.
> >
> > There are diag tools for this.
> 
> thats where we started.. 

Guess I missed that.

> the point im making is that exposing the data
> counters in sysfs is of little use, because if you have to go to other
> tools to reset, why wouldnt you use them to read as well?

You can. They support this.

> i was looking at exposing infiniband stats via PCP
> (http://oss.sgi.com/projects/pcp/). This would be useful for folk doing IB
> performance testing. Its very easy to just feed in the sysfs values..
> unfortunately they turn out to be of little value. Life would be so much
> easier if there were 64 bit counters available. Instead I will probably
> need to have an additional daemon to construct them.

Depends on what you mean by available. They are defined in the IB spec
(PortCountersExtended) but are optional and not available in all PMAs.

-- Hal






Re: [openib-general] [PATCH/RFC 1/2] IB: Return "maybe_missed_event" hint from ib_req_notify_cq()

2006-10-17 Thread Roland Dreier
Sorry, I just noticed my cross-compilation test setup was messed up,
so I never actually built the modified ehca, even though I thought I
did.  Anyway, the patch below on top of what I sent out should fix
everything up.

I've also merged this into my ipoib-napi branch, so what's there
should be OK for ehca now.

Anyway, I'm eagerly awaiting your NAPI results with ehca.

Thanks,
  Roland




Re: [openib-general] sysfs exposure of port counters useless?

2006-10-17 Thread Greg Lindahl
On Tue, Oct 17, 2006 at 05:18:34PM -0700, Scott Weitzenkamp (sweitzen) wrote:

> I agree the 32-bit byte and packet counters are useless as they get
> pegged in a few seconds on a busy IB networks.  I thought there was an
> effort in IBTA to fix this.

Yes, it's in the management working group.

> For IB counters in a Cisco switch, we read and reset the 32-bit counters
> once per second and keep 64-bit counters internally.  This would be
> possible in OF too, right?

Yep. We keep 64 bit counters internally and dumb them down as required
to meet the standard.

-- greg





Re: [openib-general] sysfs exposure of port counters useless?

2006-10-17 Thread Scott Weitzenkamp (sweitzen)
I agree the 32-bit byte and packet counters are useless, as they get
pegged in a few seconds on a busy IB network.  I thought there was an
effort in IBTA to fix this.

For IB counters in a Cisco switch, we read and reset the 32-bit counters
once per second and keep 64-bit counters internally.  This would be
possible in OF too, right?
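
The read-and-reset scheme described above can be sketched roughly as
follows. This is only an illustration: the struct and function names are
hypothetical, and how the 32-bit sample is actually read and reset (a
PMA query, a diag tool, etc.) is left out. Because the hardware counters
peg at all 1s instead of wrapping, a sample equal to the maximum value
is flagged as having possibly lost data.

```c
#include <stdint.h>

/* Hypothetical software-maintained 64-bit counter. */
struct counter64 {
    uint64_t total;     /* running sum of all samples */
    int      saturated; /* set once any sample was pegged */
};

/*
 * Fold one read-and-reset sample of a 32-bit pegging counter into the
 * 64-bit total. A sample of all 1s means the hardware counter pegged
 * during the interval, so the total is only a lower bound from then on.
 */
void counter64_add(struct counter64 *c, uint32_t sample)
{
    if (sample == UINT32_MAX)
        c->saturated = 1;
    c->total += sample;
}
```

The polling interval has to be short enough that the 32-bit counter
cannot peg between two reads; once per second is the figure given above.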

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Michael Newton
> Sent: Tuesday, October 17, 2006 5:10 PM
> To: Hal Rosenstock
> Cc: openib-general@openib.org
> Subject: Re: [openib-general] sysfs exposure of port counters useless?
> 
> On Tue, 17 Oct 2006, Hal Rosenstock wrote:
> > On Tue, 2006-10-17 at 09:55, Rimmer, Todd wrote:
> > > > From: Michael Newton
> > > > Sent: Tuesday, October 17, 2006 3:02 AM
> > > > To: openib-general@openib.org
> > > > Subject: [openib-general] sysfs exposure of port counters useless?
> > > >
> > > >
> > > > These are 32 bit counters. The rcv/xmit_data counters count 32-bit
> > > > blocks. Also, these counts do not wrap: they peg at all 1s.
> > > > At infiniband speeds, these counts can peg out very quickly indeed,
> > > > to the point they can really only be of use if they can be reset each time
> > > > there read. Now if anyone who wants to use them has to go the CLI to reset
> > > > them, and theres little point in reading them without reset, why would
> > > > anyone read them via sysfs? so why have them?
> > > >
> > >
> > > We have found that while your comment is true for the data movement
> > > counters, the error counters should not peg quickly, hence it is valid
> 
> its true i overstated the case just a little;) .. yes error counters
> should be fine and its mainly the data counters that are problematic
> (tho now im not sure i havent seen the packet counters freeze when the
> data ones peg out)..
> 
> > > to read them without resetting.  However it is also useful to have an
> > > ability to reset them.  Of course if there are other CLI commands which
> > > do this easily, the sysfs info is of less value.
> >
> > There are diag tools for this.
> 
> thats where we started.. the point im making is that exposing the data
> counters in sysfs is of little use, because if you have to go to other
> tools to reset, why wouldnt you use them to read as well?
> 
> i was looking at exposing infiniband stats via PCP
> (http://oss.sgi.com/projects/pcp/). This would be useful for folk doing IB
> performance testing. Its very easy to just feed in the sysfs values..
> unfortunately they turn out to be of little value. Life would be so much
> easier if there were 64 bit counters available. Instead I will probably
> need to have an additional daemon to construct them.



Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Jason Gunthorpe
On Tue, Oct 17, 2006 at 04:31:00PM -0700, Roland Dreier wrote:

> For now I just used lock; addl %0 to implement rmb on i386.  I'm
> really not comfortable making libmthca depend on sse2, and I don't see
> a good way to detect and use sse2 at runtime.

I think the typical way this is done would be to use ld.so's 'hwcap'
handling and stick an optimized library in /usr/lib/sse2.

Jason




Re: [openib-general] sysfs exposure of port counters useless?

2006-10-17 Thread Michael Newton
On Tue, 17 Oct 2006, Hal Rosenstock wrote:
> On Tue, 2006-10-17 at 09:55, Rimmer, Todd wrote:
> > > From: Michael Newton
> > > Sent: Tuesday, October 17, 2006 3:02 AM
> > > To: openib-general@openib.org
> > > Subject: [openib-general] sysfs exposure of port counters useless?
> > >
> > >
> > > These are 32 bit counters. The rcv/xmit_data counters count 32-bit
> > > blocks. Also, these counts do not wrap: they peg at all 1s.
> > > At infiniband speeds, these counts can peg out very quickly indeed,
> > > to the point they can really only be of use if they can be reset each
> > time
> > > there read. Now if anyone who wants to use them has to go the CLI to
> > reset
> > > them, and theres little point in reading them without reset, why would
> > > anyone read them via sysfs? so why have them?
> > >
> >
> > We have found that while your comment is true for the data movement
> > counters, the error counters should not peg quickly, hence it is valid

It's true I overstated the case just a little ;) .. yes, error counters
should be fine and it's mainly the data counters that are problematic
(though now I'm not sure I haven't seen the packet counters freeze when
the data ones peg out)..

> > to read them without resetting.  However it is also useful to have an
> > ability to reset them.  Of course if there are other CLI commands which
> > do this easily, the sysfs info is of less value.
>
> There are diag tools for this.

That's where we started.. the point I'm making is that exposing the data
counters in sysfs is of little use, because if you have to go to other
tools to reset, why wouldn't you use them to read as well?

I was looking at exposing InfiniBand stats via PCP
(http://oss.sgi.com/projects/pcp/). This would be useful for folk doing IB
performance testing. It's very easy to just feed in the sysfs values..
unfortunately they turn out to be of little value. Life would be so much
easier if there were 64 bit counters available. Instead I will probably
need to have an additional daemon to construct them.
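
How quickly the data counters peg can be estimated from the figures in
this thread: a 32-bit counter that counts 32-bit blocks tops out just
under 16 GiB of traffic. The sketch below does that arithmetic; the
function name and the example data rate are assumptions for
illustration, since no link speed is named in the thread.

```c
#include <stdint.h>

/* Each tick of an rcv/xmit_data counter is one 32-bit block (4 bytes). */
#define BYTES_PER_TICK 4.0

/*
 * Seconds until a 32-bit pegging data counter reaches all 1s when the
 * port carries a sustained `bytes_per_second` of traffic.
 */
double seconds_to_peg(double bytes_per_second)
{
    return ((double)UINT32_MAX * BYTES_PER_TICK) / bytes_per_second;
}
```

At an assumed sustained 1 GB/s (roughly the data rate of a 4x SDR
link), the counter would peg after about 17 seconds; faster links
shorten that proportionally.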





Re: [openib-general] [PATCH/RFC 1/2] IB: Return "maybe_missed_event" hint from ib_req_notify_cq()

2006-10-17 Thread Shirley Ma

Hi, Roland,

There were a couple of errors and a warning when I applied this patch to OFED-1.1-rc7:
1. ehca_req_notify_cq() in ehca_iverbs.h is not updated.
2. *maybe_missed_event = ipz_qeit_is_valid(my_cq->ipz_queue) should be = ipz_qeit_is_valid(&my_cq->ipz_queue)
3. There is a compile warning on this line: return cqe_flags >> 7 == queue->toggle_state & 1;
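
The warning in item 3 is most likely about operator precedence: in C,
== binds more tightly than &, so the expression is parsed as
((cqe_flags >> 7) == queue->toggle_state) & 1. The sketch below
contrasts that with what was presumably intended. The struct and field
types are assumptions made for illustration; the real ehca definitions
are not shown in this thread.

```c
#include <stdint.h>

/* Hypothetical stand-in for the real queue structure. */
struct sketch_queue {
    uint32_t toggle_state;
};

/* What the compiler does with the line as written. */
int valid_as_written(uint8_t cqe_flags, const struct sketch_queue *queue)
{
    return ((cqe_flags >> 7) == queue->toggle_state) & 1;
}

/* Presumed intent: compare the top flag bit with the low toggle bit. */
int valid_parenthesized(uint8_t cqe_flags, const struct sketch_queue *queue)
{
    return (cqe_flags >> 7) == (queue->toggle_state & 1);
}
```

The two versions only differ when toggle_state has bits set above bit
0, which may be why the code could appear to work despite the warning.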

Thanks
Shirley Ma

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
OK, you convinced me to add rmb()/wmb() and use it in libmthca.  I
just checked a bunch of changes to do that into svn.  Please survey
the wreckage of libibverbs/libmthca and let me know if you see where I
broke anything.

For now I just used lock; addl %0 to implement rmb on i386.  I'm
really not comfortable making libmthca depend on sse2, and I don't see
a good way to detect and use sse2 at runtime.

Thanks,
  Roland
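
For readers unfamiliar with the trick mentioned above, here is a rough
sketch of a read barrier built from a locked add to the top of the
stack. The macro name is made up for illustration and is not the
libibverbs macro; the x86-64 and generic branches are additions so the
example builds elsewhere.

```c
/*
 * Sketch of a user-space read barrier. On i386 a locked add of zero to
 * the stack top works on every CPU, with or without SSE2.
 */
#if defined(__i386__)
#define sketch_rmb() \
    __asm__ __volatile__("lock; addl $0,0(%%esp)" ::: "memory", "cc")
#elif defined(__x86_64__)
/* Every x86-64 CPU has SSE2, so lfence is always available. */
#define sketch_rmb() __asm__ __volatile__("lfence" ::: "memory")
#else
/* Compiler builtin full barrier as a generic fallback. */
#define sketch_rmb() __sync_synchronize()
#endif

/* Hypothetical entry written by a device: flag first, then payload. */
struct sketch_entry {
    volatile int valid;
    volatile int payload;
};

/* Read the payload only after the valid flag has been observed. */
int sketch_read_entry(const struct sketch_entry *e, int *out)
{
    if (!e->valid)
        return 0;
    sketch_rmb(); /* do not let the payload read pass the flag read */
    *out = e->payload;
    return 1;
}
```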




Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Doug Ledford
On Tue, 2006-10-17 at 23:48 +0200, Michael S. Tsirkin wrote:
> Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> > Subject: Re: RHEL5 and OFED ...
> > 
> > On Tue, 2006-10-17 at 22:28 +0200, Michael S. Tsirkin wrote:
> > > On a tangent, is there a way to set up a cross-build environment that will
> > > build kernel modules for e.g. RHEL amd64 kernel on a 32 bit machine?
> > > I'm doing this now with gcc and kernel.org kernel I built myself from 
> > > source.
> > > I guess I mostly need to get gcc and binutils SRPMs to generate
> > > cross-compiling tools - has anyone done that?
> > 
> > At least for Red Hat, rpm already mostly supports this with only a few
> > examples of breakage (apps needing gfortran like openmpi are an example
> > that might break depending on usage).
> 
> So, you are saying I shuld rpmbuild binutils and gcc rpms?
> Hmm, I'll give it a try.

No.  Building an i686 binary on x86_64 only requires passing -m32 to
the compiler.  It will then build the 32bit variant instead of the
64bit.  I'm not saying rpmbuild gcc and binutils, because the variants
installed will already do what you want (assuming building 32bit on
64bit is what you want, but I see further down that it isn't).

> >  You can
> > call rpmbuild with the --target option to specify the mode you want the
> > package built as
> 
> Hmm, no, I really want to take a srpm from amd64 and get a 32 bit
> gcc executable that will build 64 bit binaries that match these
> built on native amd64 system exectly.

Between just i386 and x86_64, you might be able to do that.  However, in
general, byte for byte identical cross compiling can't be done.  That's
one of the reasons we build all the packages on the arch they are being
built for, so if a user rebuilds a package on the arch then it will most
likely match the package we built.  If we cross compiled, that would be
false more often than true.  Minor things like variance in how CPUs
handle floating point math and precision of said math affect gcc
optimization decisions and change the generated byte code.

In any case, I'm certainly no gcc build expert, so I don't know the
magic incantations to get the gcc sources to spit out a 32bit binary
that builds 64bit code.  Far easier would be to go the other way around,
run on x86_64 and build for i386, in which case gcc supports that out of
the box.

-- 
Doug Ledford <[EMAIL PROTECTED]>
  GPG KeyID: CFBFF194
  http://people.redhat.com/dledford

Infiniband specific RPMs available at
  http://people.redhat.com/dledford/Infiniband



Re: [openib-general] OFED 1.1-RC7 build problem on SLES10

2006-10-17 Thread Scott Weitzenkamp (sweitzen)
You need the kernel-source RPM.  I guess the OFED install.sh should check
for that RPM.

svbu-qa-opteron-1:~ # uname -a
Linux svbu-qa-opteron-1 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 i686 athlon i386 GNU/Linux
svbu-qa-opteron-1:~ # rpm -qa | fgrep kernel
kernel-source-2.6.16.21-0.8
kernel-smp-2.6.16.21-0.8
kernel-ib-1.1-2.6.16.21_0.8_smp
kernel-ib-devel-1.1-2.6.16.21_0.8_smp
svbu-qa-opteron-1:~ # ls /usr/src/linux-2.6.16.21-0.8-obj/i386/smp
.config         Makefile        arch     include2
.kernelrelease  Module.symvers  include  scripts

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Chris Dennett
> Sent: Tuesday, October 17, 2006 12:46 PM
> To: [EMAIL PROTECTED]; openib-general@openib.org
> Subject: [openib-general] OFED 1.1-RC7 build problem on SLES10
> 
> I've been trying to install OFED 1.1 RC7 on an x86 server 
> with a fresh install 
> of SLES10 (32-bit).  It errors out when trying to build the 
> kernel modules.  
> I've included what I think are the relevant log messages 
> below.  I've tried 
> installing everything (minus iser and tvflash) or just the 
> modules needed for 
> SRP.  I've installed 1.1 RC7 successfully on other RedHat 
> servers without any 
> problems.  I am installing as root.  Any help would be appreciated.
> 
> Thanks.
> 
> -Chris
> 
> ==
> + make kernel
> Building kernel modules
> Kernel version: 2.6.16.21-0.8-smp
> Modules directory: //lib/modules/2.6.16.21-0.8-smp
> Kernel sources: /lib/modules/2.6.16.21-0.8-smp/build
> env EXTRA_CFLAGS=" -I/var/tmp/OFEDRPM/BUILD/openib-1.1/include 
> -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/include \
> 
> -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/ulp/ipoib \
> 
> -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/debug" \
> make -C /lib/modules/2.6.16.21-0.8-smp/build 
> SUBDIRS="/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband" 
> KERNELRELEASE=2.6.16.21-0.8-smp \
> EXTRAVERSION=.21-0.8-smp V=1  \
> CONFIG_INFINIBAND=m \
> CONFIG_INFINIBAND_IPOIB=m \
> CONFIG_INFINIBAND_SDP= \
> CONFIG_INFINIBAND_SRP=m \
> CONFIG_INFINIBAND_USER_MAD=m \
> CONFIG_INFINIBAND_USER_ACCESS=m \
> CONFIG_INFINIBAND_ADDR_TRANS=y \
> CONFIG_INFINIBAND_MTHCA=m \
> CONFIG_INFINIBAND_IPOIB_DEBUG=y \
> CONFIG_INFINIBAND_ISER= \
> CONFIG_INFINIBAND_EHCA= \
> CONFIG_INFINIBAND_RDS= \
> CONFIG_INFINIBAND_RDS_DEBUG= \
> CONFIG_INFINIBAND_IPOIB_DEBUG_DATA= \
> CONFIG_INFINIBAND_SDP_SEND_ZCOPY= \
> CONFIG_INFINIBAND_SDP_RECV_ZCOPY= \
> CONFIG_INFINIBAND_SDP_DEBUG= \
> CONFIG_INFINIBAND_SDP_DEBUG_DATA= \
> CONFIG_INFINIBAND_IPATH= \
> CONFIG_INFINIBAND_MTHCA_DEBUG=y \
> CONFIG_INFINIBAND_MADEYE= \
> LINUXINCLUDE='-I/var/tmp/OFEDRPM/BUILD/openib-1.1/include \
> 
> -I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/include \
> -Iinclude \
> $(if $(KBUILD_SRC),-Iinclude2 -I$(srctree)/include) \
> -include include/linux/autoconf.h \
> -include 
> /var/tmp/OFEDRPM/BUILD/openib-1.1/include/linux/autoconf.h \
> ' \
> modules
> make[1]: Entering directory 
> `/usr/src/linux-2.6.16.21-0.8-obj/i386/smp'
> make[1]: *** No rule to make target `modules'.  Stop.
> make[1]: Leaving directory `/usr/src/linux-2.6.16.21-0.8-obj/i386/smp'
> make: *** [kernel] Error 2
> error: Bad exit status from /var/tmp/rpm-tmp.92052 (%install)
> 
> 
> RPM build errors:
> user vlad does not exist - using root
> group mtl does not exist - using root
> user vlad does not exist - using root
> group mtl does not exist - using root
> Bad exit status from /var/tmp/rpm-tmp.92052 (%install)
> ERROR: Failed executing "rpmbuild --rebuild --define '_topdir 
> /var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' --define 
> 'build_root 
> /var/tmp/OFED' --define 'configure_options --with-libibcommon 
> --with-libibmad 
> --with-libibumad --with-libibverbs --with-libmthca --with-opensm 
> --with-librdmacm --with-openib-diags --with-srptools --with-mstflint 
> --with-perftest --with-ipoib-mod --with-mthca-mod --with-srp-mod 
> --with-core-mod --with-user_mad-mod --with-user_access-mod 
> --with-addr_trans-mod' --define 'configure_options32 %{nil}' --define 
> 'KVERSION 2.6.16.21-0.8-smp' --define 'KSRC 
> /lib/modules/2.6.16.21-0.8-smp/build' --define 
> 'build_kernel_ib 1' --define 
> 'build_kernel_ib_devel 0' --define 'NETWORK_CONF_DIR 
> /etc/sysconfig/network' 
> --define 'modprobe_update 1' --define 'include_ipoib_conf 0' --define 
> 'build_32bit 0' /root/OFED-1.1-rc7/SRPMS/openib-1.1-0.src.rpm"
> 
> ===
> 
> smx32:~ # uname -a
> Linux linux-yeez

Re: [openib-general] Race in mthca_cmd_post()

2006-10-17 Thread John Partridge
I'm going back and comparing analyzer traces with the fix and without and the
machine doing an MCA.

John

Roland Dreier wrote:
> chas> i would guess the read to the mmio region is flushing the
> chas> writes to the config register but the read happens "too
> chas> soon" after those writes.  on a more mundance computer, the
> chas> write/write/read probably wouldnt be batched together.
> 
> config writes can't be posted though, so that doesn't make sense.
> 
>  - R.


-- 
John Partridge

Silicon Graphics Inc
Tel:  651-683-3428
Vnet: 233-3428
E-Mail: [EMAIL PROTECTED]




Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Parks Fields


Have we ever seen silent data corruption in CHECKSUM_HW?


Here at LANL we have seen silent corruption on other types of networks,
but not on IB yet that we know of. So we are a little gun-shy...

  
* Correspondence * 
This email contains no programmatic content that requires independent ADC
review 


Re: [openib-general] ibv_reg_mr failure with pvfs on ehca?

2006-10-17 Thread Hoang-Nam Nguyen
Hi Troy!
> I am running PVFS2 on OpenIB, with IBM's ehca.
> When we start writing/reading large files, either with the NetPIPE
> PVFS module we have or a modified GAMESS executable that uses
> libpvfs2 directly, the 'ibv_reg_mr' function fails, and we get an error.
> This is also correlated with kernel log messages like this:
> Oct 16 11:14:45 p5l8 kernel: PU0003 000e0091:ehca_hcall_7arg_7ret
> HCAD_ERROR  opcode=160 ret=fff7 arg1=1304 arg2=5
> arg3=14f0ebc8 arg4=1
> arg5=e0 arg6=e3e9f200 arg7=0 out1=0 out2=0 out3=0 out4=0
> out5=0 out6=0
> out7=0
Return code f7 from firmware/hvcall means H_NO_MEM. Could you give me
some history of this problem?
Is this a permanent problem? If yes, could you give me more information
on your test case or scenario, e.g. the large file size and NetPIPE options?
Which version of ehca are you using? And which kernel version?
Thanks!
Hoang-Nam Nguyen
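
As a side note on reading such return codes: hcall status values are
signed, so a value ending in f7 is a small negative number once it is
sign-extended. The sketch below shows the conversion. The full 64-bit
input used in the usage note is an assumption (the log line in this
thread is truncated), the helper name is made up, and the belief that
H_NO_MEM is -9 is based on my recollection of the Linux powerpc headers
and should be checked against them.

```c
#include <stdint.h>

/*
 * Interpret a raw 64-bit hcall return value as a signed status code.
 * Relies on the usual two's-complement conversion.
 */
long long sketch_hcall_status(uint64_t raw)
{
    return (long long)(int64_t)raw;
}
```

With an assumed raw value of 0xfffffffffffffff7, this yields -9, which
is consistent with the H_NO_MEM reading given above.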





Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Detecting SSE2 is easy -- we could just do the cpuid ourselves if we
> wanted to.  The problem is what do you do when you see that the CPU
> does or doesn't have the instruction?  The runtime patching that the
> kernel does is way too complicated, and if you're going to move mb()
> out of line then just doing a regular serializing instruction is
> probably just as good.

Maybe just do the test, print a warning and exit for now?
I don't think anyone is going to run libibverbs on a Pentium III.

-- 
MST




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
>  > as for mb() - I don't think our kernel code uses that so I think userspace
>  > should switch to wmb as well. wmb is just a compiler barrier on most
>  > architectures.
> 
> I'm not sure it's worth the trouble to split up the two cases at this
> point.

Shouldn't be hard - just look at the kernel code and make userspace match.

> Does it make a big performance difference?

I imagine it does, latency-wise.

-- 
MST




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
 > Well, at startup we can read /proc/cpuinfo and look for sse2 in the flags: 
 > line.
 > Seems simple enough.

Detecting SSE2 is easy -- we could just do the cpuid ourselves if we
wanted to.  The problem is what do you do when you see that the CPU
does or doesn't have the instruction?  The runtime patching that the
kernel does is way too complicated, and if you're going to move mb()
out of line then just doing a regular serializing instruction is
probably just as good.

 > I hope we can do something without compile flags - most people
 > don't know enough to turn them on, and distros commonly
 > compile for least common denominator.

And rightfully so -- the default compile of libibverbs/libmthca should
run on a least common denominator CPU.

 - R.
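
To make the trade-off discussed above concrete, here is a rough sketch
of "do the cpuid ourselves" combined with an out-of-line barrier chosen
at startup. All function names are hypothetical; __get_cpuid() comes
from the <cpuid.h> header shipped with GCC and clang, and the fallback
uses the compiler's __sync_synchronize() builtin in place of a
hand-written serializing instruction.

```c
#include <stddef.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>

/* CPUID leaf 1 reports SSE2 in EDX bit 26. */
static int sketch_cpu_has_sse2(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (edx >> 26) & 1;
}

/* Only safe to call when SSE2 is present. */
static void sketch_rmb_lfence(void)
{
    __asm__ __volatile__("lfence" ::: "memory");
}
#endif

/* Works on any CPU: a full barrier generated by the compiler. */
static void sketch_rmb_fallback(void)
{
    __sync_synchronize();
}

typedef void (*sketch_barrier_fn)(void);

/* Pick the barrier implementation once, e.g. at library load time. */
sketch_barrier_fn sketch_pick_rmb(void)
{
#if defined(__i386__) || defined(__x86_64__)
    if (sketch_cpu_has_sse2())
        return sketch_rmb_lfence;
#endif
    return sketch_rmb_fallback;
}
```

Every barrier then costs an indirect call, which is the point made
above: once the barrier is out of line, an always-available serializing
instruction is probably just as good as the dispatch.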




Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Doug Ledford <[EMAIL PROTECTED]>:
> Subject: Re: RHEL5 and OFED ...
> 
> On Tue, 2006-10-17 at 22:28 +0200, Michael S. Tsirkin wrote:
> > On a tangent, is there a way to set up a cross-build environment that will
> > build kernel modules for e.g. RHEL amd64 kernel on a 32 bit machine?
> > I'm doing this now with gcc and kernel.org kernel I built myself from 
> > source.
> > I guess I mostly need to get gcc and binutils SRPMs to generate
> > cross-compiling tools - has anyone done that?
> 
> At least for Red Hat, rpm already mostly supports this with only a few
> examples of breakage (apps needing gfortran like openmpi are an example
> that might break depending on usage).

So, you are saying I should rpmbuild binutils and gcc rpms?
Hmm, I'll give it a try.

>  You can
> call rpmbuild with the --target option to specify the mode you want the
> package built as

Hmm, no, I really want to take a srpm from amd64 and get a 32 bit
gcc executable that will build 64 bit binaries that match those
built on a native amd64 system exactly.

-- 
MST




[openib-general] [GIT PULL] please pull infiniband.git

2006-10-17 Thread Roland Dreier
Linus, please pull from

master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This tree is also available from kernel.org mirrors at:

git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus

This includes various fixes found since 2.6.19-rc2:

Adrian Bunk:
  RDMA/amso1100: Fix a NULL dereference in error path

Arthur Kepner:
  IB/mthca: Use mmiowb after doorbell ring

Henrik Kretzschmar:
  RDMA/amso1100: pci_module_init() conversion

Robert Walsh:
  IB/ipath: Initialize diagpkt file on device init only

 drivers/infiniband/hw/amso1100/c2.c|2 -
 drivers/infiniband/hw/amso1100/c2_rnic.c   |4 +-
 drivers/infiniband/hw/ipath/ipath_diag.c   |   65 
 drivers/infiniband/hw/ipath/ipath_driver.c |   10 
 drivers/infiniband/hw/ipath/ipath_kernel.h |3 -
 drivers/infiniband/hw/mthca/mthca_cq.c |7 +++
 drivers/infiniband/hw/mthca/mthca_qp.c |   19 
 drivers/infiniband/hw/mthca/mthca_srq.c|8 +++
 8 files changed, 75 insertions(+), 43 deletions(-)


diff --git a/drivers/infiniband/hw/amso1100/c2.c b/drivers/infiniband/hw/amso1100/c2.c
index dc1ebea..9e7bd94 100644
--- a/drivers/infiniband/hw/amso1100/c2.c
+++ b/drivers/infiniband/hw/amso1100/c2.c
@@ -1243,7 +1243,7 @@ static struct pci_driver c2_pci_driver =
 
 static int __init c2_init_module(void)
 {
-   return pci_module_init(&c2_pci_driver);
+   return pci_register_driver(&c2_pci_driver);
 }
 
 static void __exit c2_exit_module(void)
diff --git a/drivers/infiniband/hw/amso1100/c2_rnic.c b/drivers/infiniband/hw/amso1100/c2_rnic.c
index e37c568..30409e1 100644
--- a/drivers/infiniband/hw/amso1100/c2_rnic.c
+++ b/drivers/infiniband/hw/amso1100/c2_rnic.c
@@ -150,8 +150,8 @@ static int c2_rnic_query(struct c2_dev *
(struct c2wr_rnic_query_rep *) (unsigned long) (vq_req->reply_msg);
if (!reply)
err = -ENOMEM;
-
-   err = c2_errno(reply);
+   else
+   err = c2_errno(reply);
if (err)
goto bail2;
 
diff --git a/drivers/infiniband/hw/ipath/ipath_diag.c b/drivers/infiniband/hw/ipath/ipath_diag.c
index 29958b6..28c087b 100644
--- a/drivers/infiniband/hw/ipath/ipath_diag.c
+++ b/drivers/infiniband/hw/ipath/ipath_diag.c
@@ -67,19 +67,54 @@ static struct file_operations diag_file_
.release = ipath_diag_release
 };
 
+static ssize_t ipath_diagpkt_write(struct file *fp,
+  const char __user *data,
+  size_t count, loff_t *off);
+
+static struct file_operations diagpkt_file_ops = {
+   .owner = THIS_MODULE,
+   .write = ipath_diagpkt_write,
+};
+
+static atomic_t diagpkt_count = ATOMIC_INIT(0);
+static struct cdev *diagpkt_cdev;
+static struct class_device *diagpkt_class_dev;
+
 int ipath_diag_add(struct ipath_devdata *dd)
 {
char name[16];
+   int ret = 0;
+
+   if (atomic_inc_return(&diagpkt_count) == 1) {
+   ret = ipath_cdev_init(IPATH_DIAGPKT_MINOR,
+ "ipath_diagpkt", &diagpkt_file_ops,
+ &diagpkt_cdev, &diagpkt_class_dev);
+
+   if (ret) {
+   ipath_dev_err(dd, "Couldn't create ipath_diagpkt "
+ "device: %d", ret);
+   goto done;
+   }
+   }
 
snprintf(name, sizeof(name), "ipath_diag%d", dd->ipath_unit);
 
-   return ipath_cdev_init(IPATH_DIAG_MINOR_BASE + dd->ipath_unit, name,
-  &diag_file_ops, &dd->diag_cdev,
-  &dd->diag_class_dev);
+   ret = ipath_cdev_init(IPATH_DIAG_MINOR_BASE + dd->ipath_unit, name,
+ &diag_file_ops, &dd->diag_cdev,
+ &dd->diag_class_dev);
+   if (ret)
+   ipath_dev_err(dd, "Couldn't create %s device: %d",
+ name, ret);
+
+done:
+   return ret;
 }
 
 void ipath_diag_remove(struct ipath_devdata *dd)
 {
+   if (atomic_dec_and_test(&diagpkt_count))
+   ipath_cdev_cleanup(&diagpkt_cdev, &diagpkt_class_dev);
+
ipath_cdev_cleanup(&dd->diag_cdev, &dd->diag_class_dev);
 }
 
@@ -275,30 +310,6 @@ bail:
return ret;
 }
 
-static ssize_t ipath_diagpkt_write(struct file *fp,
-  const char __user *data,
-  size_t count, loff_t *off);
-
-static struct file_operations diagpkt_file_ops = {
-   .owner = THIS_MODULE,
-   .write = ipath_diagpkt_write,
-};
-
-static struct cdev *diagpkt_cdev;
-static struct class_device *diagpkt_class_dev;
-
-int __init ipath_diagpkt_add(void)
-{
-   return ipath_cdev_init(IPATH_DIAGPKT_MINOR,
-  "ipath_diagpkt", &diagpkt_file_ops,
-  &diagpkt_cdev, &diagpkt_class_dev);
-}
-
-void __exit ipat

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> But of course not all x86 processors
> support lfence/mfence which leads to some ugly issues of how to handle
> this

lfence seems to be part of SSE2,
and I don't think we really need sfence/mfence.
We can just require SSE2 support:
http://en.wikipedia.org/wiki/SSE2#CPUs_supporting_SSE2

>
> -- runtime detection seems important but I don't know a good way
> to do that.

Well, at startup we can read /proc/cpuinfo and look for sse2 in the flags: line.
Seems simple enough.

> Probably the best thing would be just to do "lock; addl
> $0,0(%%esp)" by default and add a special compile flag or something to
> enable mfence.

I hope we can do something without compile flags - most people
don't know enough to turn them on, and distros commonly
compile for least common denominator.

-- 
MST

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
 > > Another confusing thing is that asm-i386 defines mb() and rmb() just
 > > to be compiler barriers,
 > 
 > I see:
 > #define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2)

Oops, you're right.  I misread that file.

OK, we probably want mb() to be more than a compiler barrier on i386
and x86-64.  I'll fix up the libibverbs code.

 > as for mb() - I don't think our kernel code uses that so I think userspace
 > should switch to wmb as well. wmb is just a compiler barrier on most
 > architectures.

I'm not sure it's worth the trouble to split up the two cases at this
point.  Does it make a big performance difference?

 - R.




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
Michael> True, but I don't think anyone is still running libibverbs
Michael> on processors that don't.  What happens on an older
Michael> processor when you call lfence?

You get an illegal instruction signal and the process dies I guess.
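
One way to avoid that signal is to pick the barrier once at startup.  The
sketch below is only an illustration, not libibverbs code: it assumes a GCC
or clang toolchain (for <cpuid.h> and inline asm), and the names
cpu_has_sse2(), barrier_init() and read_barrier are invented here.  CPUID
leaf 1 reports SSE2 in EDX bit 26.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>

/* CPUID leaf 1, EDX bit 26 is the SSE2 feature bit. */
static int cpu_has_sse2(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (edx >> 26) & 1;
}

static void rmb_lfence(void)
{
    __asm__ __volatile__("lfence" ::: "memory");
}

/* Fallback for CPUs without SSE2: a locked no-op on the top of the stack. */
static void rmb_locked_add(void)
{
#if defined(__x86_64__)
    __asm__ __volatile__("lock; addl $0,0(%%rsp)" ::: "memory", "cc");
#else
    __asm__ __volatile__("lock; addl $0,0(%%esp)" ::: "memory", "cc");
#endif
}
#else
/* Not x86: use the compiler's full-barrier builtin. */
static void rmb_generic(void)
{
    __sync_synchronize();
}
#endif

/* Set once by barrier_init(), then called wherever a read barrier is needed. */
static void (*read_barrier)(void);

static void barrier_init(void)
{
#if defined(__i386__) || defined(__x86_64__)
    read_barrier = cpu_has_sse2() ? rmb_lfence : rmb_locked_add;
#else
    read_barrier = rmb_generic;
#endif
    assert(read_barrier != NULL);
}
```

The indirect call costs something on every barrier, which is the trade-off
against a compile-time choice.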




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Another confusing thing is that asm-i386 defines mb() and rmb() just
> to be compiler barriers,

I see:
#define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2)

as for mb() - I don't think our kernel code uses that so I think userspace
should switch to wmb as well. wmb is just a compiler barrier on most
architectures.

-- 
MST




Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Doug Ledford
On Tue, 2006-10-17 at 22:28 +0200, Michael S. Tsirkin wrote:
> On a tangent, is there a way to set up a cross-build environment that will
> build kernel modules for e.g. RHEL amd64 kernel on a 32 bit machine?
> I'm doing this now with gcc and kernel.org kernel I built myself from source.
> I guess I mostly need to get gcc and binutils SRPMs to generate
> cross-compiling tools - has anyone done that?

At least for Red Hat, rpm already mostly supports this with only a few
examples of breakage (apps needing gfortran like openmpi are an example
that might break depending on usage).  This is one of the reasons I
totally ignore the install.sh script in the OFED releases.  For any arch
that supports multiple run time variants, the default installation
installs compilers for all supported run time variants, but not
necessarily for other variants (aka, on x86_64 gcc will support x86_64
or ia32, but ia32 won't necessarily support x86_64, and neither will
necessarily support building ppc or ppc64 or ia64 or s390(x)).  You can
call rpmbuild with the --target option to specify the mode you want the
package built as, and in the process that automatically changes all of
the configure options present as part of the %configure macro of rpm to
the right paths (hence why I also strip out all of the %_libdir and
friends settings from the spec file, rpm gets this right itself) and
changes the CFLAGS and CPPFLAGS environment variables to force compiling
in the right mode.

Now, that being said, kernel modules in particular are a different
beast.  I originally had a module build kit for the 2.4 kernels that
would cross build kernel modules.  That's long since deprecated though.
Nowadays, the rule is to build a proper kernel-looking source tree,
install the kernel-devel package, then the build command is basically
something like:

cd /lib/modules/`uname -r`/build
make SUBDIRS=

In order to cross compile, you would likely need to pass ARCH= to the
make command line.  In addition, for a cross compile you very well may
need to install the kernel-devel from that arch instead of the native
one.  However, given the limitations I listed above about cross
compiling, it's likely that you can only cross compile from certain
arches to certain other arches.

All that being said, a kernel-ib spec file could include something like
this:

BuildConflicts: kernel-devel (I'm not sure build conflicts is a proper
tag, if not, you might have to script a little more carefully in %setup)

%prep
%setup -q
rpm -i /usr/src/redhat/RPMS/%{arch}/kernel-devel-`uname -r`.%{arch}.rpm

%build
cd /lib/modules/`uname -r`/build
make ARCH=%{arch} SUBDIRS=${RPM_BUILD_DIR}/%{name}-%{version} modules

%install
cd /lib/modules/`uname -r`/build
make ARCH=%{arch} SUBDIRS=${RPM_BUILD_DIR}/%{name}-%{version} \
    INSTALL_MOD_PATH=${RPM_BUILD_ROOT} modules_install

%clean
rm -fr ${RPM_BUILD_ROOT}
rpm -e kernel-devel

Then, to cross compile, you would simply run rpmbuild multiple times:

for i in i686 x86_64; do
  rpmbuild -ba --target=$i kernel-ib.spec
done

BTW, all the rpms are live on my site now.

-- 
Doug Ledford <[EMAIL PROTECTED]>
  GPG KeyID: CFBFF194
  http://people.redhat.com/dledford

Infiniband specific RPMs available at
  http://people.redhat.com/dledford/Infiniband



Re: [openib-general] [PATCH] Rewrite cma_req_handler() to encapsulate common code.

2006-10-17 Thread Roland Dreier
OK, queued for 2.6.20




Re: [openib-general] [PATCH] Use time_after_eq() instead of time_after() in queue_req()

2006-10-17 Thread Roland Dreier
 > Roland, this looks good for 2.6.20.  How would you like to handle
 > pulling in patches like these?  Once OFA has git up, would it be
 > easier to pull them into my git tree, then request that you pull from
 > there, or does this work okay?

Git pulls are definitely the easiest, but I'm fine with applying
patches from email too (git has good tools for that).  However it does
make my life easier if the patch applies cleanly.  In this case I had
the following problems (I applied it to for-2.6.20 anyway):

 > [PATCH] Use time_after_eq() instead of time_after() in queue_req()

Please add something like "RDMA/addr: " before the "Use" there, so
that someone skimming the kernel log knows what subsystem/specific
area the patch touches.  (I added that by hand)

 > 

Git just wants three -s like "---" between changelog entry and actual patch.

 > diff -ruNp org/drivers/infiniband/core/addr.c new/drivers/infiniband/core/addr.c
 > --- org/drivers/infiniband/core/addr.c   2006-10-09 16:54:37.0 +0530
 > +++ new/drivers/infiniband/core/addr.c   2006-10-09 16:55:36.0 +0530
 > @@ -118,7 +118,7 @@ static void queue_req(struct addr_req *r
 >  
 >  mutex_lock(&lock);
 >  list_for_each_entry_reverse(temp_req, &req_list, list) {
 > -if (time_after(req->timeout, temp_req->timeout))
 > +if (time_after_eq(req->timeout, temp_req->timeout))
 >  break;
 >  }
 > 

The last line in the original mail was blank, when it should have a
single space.  This makes git complain (correctly) about a corrupt
patch.  Please make sure your mailer doesn't corrupt whitespace.
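
For anyone wondering what the one-line change does in practice, here is a
small stand-alone sketch.  It is not the kernel code: the macros are
simplified versions of the linux/jiffies.h ones, and an array stands in for
the kernel list.  It mimics the tail-to-head scan in queue_req() and shows
where a request lands when an entry with the same timeout is already
queued.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, wraparound-safe comparisons in the style of linux/jiffies.h. */
#define time_after(a, b)    ((long)((b) - (a)) < 0)
#define time_after_eq(a, b) ((long)((a) - (b)) >= 0)

struct req {
    unsigned long timeout;
    int id;
};

/*
 * Insert `r` into `q`, which holds `n` entries sorted by timeout and has
 * room for one more.  The scan runs from the tail, as queue_req() does,
 * and `r` is placed right after the entry where the scan stops.
 * Returns the new number of entries.
 */
static size_t queue_insert(struct req *q, size_t n, struct req r, int use_eq)
{
    size_t pos = n;    /* index where r will be stored */

    while (pos > 0) {
        unsigned long t = q[pos - 1].timeout;
        int stop = use_eq ? time_after_eq(r.timeout, t)
                          : time_after(r.timeout, t);

        if (stop)
            break;
        pos--;
    }
    memmove(&q[pos + 1], &q[pos], (n - pos) * sizeof(q[0]));
    q[pos] = r;
    return n + 1;
}
```

With time_after() a new request is queued ahead of existing requests that
have the same timeout; with time_after_eq() it goes behind them, so equal
timeouts keep their arrival order.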

 - R.




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> But of course not all x86 processors support lfence/mfence

True, but I don't think anyone is still running libibverbs on processors that
don't.  What happens on an older processor when you call lfence?

-- 
MST




Re: [openib-general] client-server small message performance issues

2006-10-17 Thread Roland Dreier
 > Basic ping pong is 25 us.  That's fine as this is not a particularly
 > optimal way to communicate.  Each additional server adds 6 us.  That
 > seems like a lot of overhead just to do another pair of posts and
 > polls, but not my major complaint.  Look at the jump from 6 to 7
 > servers, 41 us.  Beyond that, too.  And the standard deviation
 > becomes huge.  A plot of the individual values shows a large spread,
 > not just a few outliers.

 > The hardware is all Mellanox MT25204

I would guess you are seeing the effect of exceeding the size of some
internal HCA cache, maybe the QP state cache.  But I don't know enough
details of the HCA internals to know if this is true and if so which
limit you're hitting.

 - R.




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
 > Very strange. Let's consider amd64: libibverbs has
 > 
 > #elif defined(__x86_64__)
 > 
 > #define mb() asm volatile("" ::: "memory")
 > 
 > So it's just a compiler barrier there.
 > 
 > While linux has asm-x86_64/system.h
 > 
 > #define rmb()  asm volatile("lfence":::"memory")
 > 
 > So rmb seems to be stronger than mb: it will prevent the CPU from reordering
 > reads while mb won't.

OK, that's a difference between the kernel and libibverbs -- and it
may be a bug.  I have a faint memory of deciding when I wrote the code
that mfence/lfence were only needed for dealing with non-temporal
stores, but looking at asm-x86_64 I see

/*
 * Force strict CPU ordering.
 * And yes, this is required on UP too when we're talking
 * to devices.
 */
#define mb()    asm volatile("mfence":::"memory")
#define rmb()   asm volatile("lfence":::"memory")

so maybe this is wrong.  I know that x86 can do loads speculatively
and out of order, so perhaps we are living dangerously.

Another confusing thing is that asm-i386 defines mb() and rmb() just
to be compiler barriers, but I would think that the same ordering
issues apply in 32-bit mode.  But of course not all x86 processors
support lfence/mfence which leads to some ugly issues of how to handle
this -- runtime detection seems important but I don't know a good way
to do that.  Probably the best thing would be just to do "lock; addl
$0,0(%%esp)" by default and add a special compile flag or something to
enable mfence.
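
A minimal sketch of that compile-time arrangement follows.  It is not what
libibverbs ships: the opt-in macro IBV_USE_FENCE and the mailbox example
are invented for illustration, and the non-x86 branch simply uses the GCC
builtin.

```c
#include <assert.h>

#if defined(__x86_64__)
/* SSE2 is part of the x86-64 baseline, so the fence instructions always exist. */
#define mb()  __asm__ __volatile__("mfence" ::: "memory")
#define rmb() __asm__ __volatile__("lfence" ::: "memory")
#elif defined(__i386__) && defined(IBV_USE_FENCE)  /* hypothetical opt-in flag */
#define mb()  __asm__ __volatile__("mfence" ::: "memory")
#define rmb() __asm__ __volatile__("lfence" ::: "memory")
#elif defined(__i386__)
/* Works on every 32-bit x86: a locked no-op on the top of the stack. */
#define mb()  __asm__ __volatile__("lock; addl $0,0(%%esp)" ::: "memory", "cc")
#define rmb() mb()
#else
/* Other architectures: fall back to the compiler's full-barrier builtin. */
#define mb()  __sync_synchronize()
#define rmb() mb()
#endif

/* Illustrative consumer: check a ready flag, then read the payload. */
struct mailbox {
    volatile int ready;
    int payload;
};

static int mailbox_read(struct mailbox *m)
{
    assert(m->ready);
    rmb();    /* keep the payload load from passing the flag load */
    return m->payload;
}
```

The default 32-bit build never executes an instruction that an older
processor lacks; only builds that define the flag take that risk.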

 - R.




Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Roland Dreier
Shirley> Have we ever seen silent data corruption in CHECKSUM_HW?

Well, a quick web search finds stuff like
http://my.adsm.org/modules.php?op=modload&name=phpBB_14&file=index&action=viewtopic&topic=2362&0

But what I was really talking about was the risk of sending IP packets
without a checksum.  It may be fine within the IB fabric, because the
ICRC protects traffic end-to-end.  But if you have a gateway/router
between the IPoIB-connected-mode fabric and some other IP network,
then that router would have to generate the TCP checksums, which means
that router potentially becomes a source of silent corruption.

There are interesting surveys such as
http://portal.acm.org/citation.cfm?id=347561
about this.
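
For reference, the check being discussed is a 16-bit one's-complement sum.
The sketch below follows the RFC 1071 description; it leaves out the TCP
pseudo-header, and the function name inet_checksum() is just a label chosen
here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit one's-complement checksum over `len` bytes, RFC 1071 style. */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint64_t sum = 0;
    size_t i;

    assert(data != NULL || len == 0);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;  /* pad an odd trailing byte with zero */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);   /* fold the carries back in */
    return (uint16_t)~sum;
}
```

Because the sum is commutative, two 16-bit words swapped in transit produce
the same checksum, which is one reason the point at which the checksum is
generated matters.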

 - R.




Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Shirley Ma

Parks Fields <[EMAIL PROTECTED]> wrote on 10/17/2006 01:12:48 PM:
> 
> >
> >No, it's never a good idea to turn off TCP or IP checksums.  That
> >leads to possibilities of silent data corruption too easily.
> 
> I totally agree...

Have we ever seen silent data corruption in CHECKSUM_HW?

Thanks
Shirley Ma

Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Doug Ledford
On Tue, 2006-10-17 at 22:23 +0200, Michael S. Tsirkin wrote:
> Quoting Doug Ledford <[EMAIL PROTECTED]>:
> > Evidently, I was mistaken and rhn is still populated with the beta1
> > rpms.  So, I've made the latest kernel available on my web page as
> > referenced below (amongst other rpms as well).  However, it may still be
> > a while before the rpms are fully populated as I've had to request an
> > increase to my quota limit on that web server in order to hold the
> > kernel rpms tree.
> 
> When available, I gather they will be here:
> http://people.redhat.com/dledford/Infiniband/kernel/2.6.18/1.2729.el5/src/
> is that right?
> 

Yep.

-- 
Doug Ledford <[EMAIL PROTECTED]>
  GPG KeyID: CFBFF194
  http://people.redhat.com/dledford

Infiniband specific RPMs available at
  http://people.redhat.com/dledford/Infiniband



Re: [openib-general] [PATCH] osm: reviewing osmtest - osmt_multicast.c

2006-10-17 Thread Hal Rosenstock
On Tue, 2006-10-17 at 16:21, Yevgeny Kliteynik wrote:
> Hal Rosenstock wrote:
> > On Tue, 2006-10-17 at 12:07, Yevgeny Kliteynik wrote:
> >> Hi Hal
> >>
> >> Fixing more things in the multicast test flow.
> >>  
> >> Still have things to do in case when multicast group removal
> >> fails, and have to add some cleanup (as we've discussed previously).
> >> --
> >> Yevgeny
> >>
> >> Signed-off-by:  Yevgeny Kliteynik <[EMAIL PROTECTED]>
> > 
> > Looks good. One question below.
> > 
> > -- Hal
> > 
> >> Index: osmtest/osmt_multicast.c
> >> ===
> >> --- osmtest/osmt_multicast.c   (revision 9856)
> >> +++ osmtest/osmt_multicast.c   (working copy)
> > 
> > [snip...]
> > 
> >> @@ -3261,6 +3306,7 @@ osmt_run_mcast_flow( IN osmtest_t * cons
> >>  &mc_req_rec,
> >>  comp_mask,
> >>  &res_sa_mad );
> >> +  status = IB_SUCCESS;
> > 
> > This doesn't look right to me.
> 
> Right, this must be some cut-and-paste bug.
> This line shouldn't be there. Good catch.
> Thanks.

Thanks. Applied with some cosmetic changes.

-- Hal

> --
> Yevgeny.
> 
> >>if (status != IB_SUCCESS)
> >>{
> >>  osm_log( &p_osmt->log, OSM_LOG_ERROR,
> >> @@ -3274,6 +3320,10 @@ osmt_run_mcast_flow( IN osmtest_t * cons
> >>  fail_to_delete_mcg++;
> >>}
> >>  }
> >> +else
> >> +{
> >> +  end_ipoib_cnt++;
> >> +}
> >>  p_mgrp = (osmtest_mgrp_t*)cl_qmap_next( &p_mgrp->map_item );
> >>}
> >>  
> > 
> > [snip...]
> > 
> > 





Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
On a tangent, is there a way to set up a cross-build environment that will
build kernel modules for e.g. RHEL amd64 kernel on a 32 bit machine?
I'm doing this now with gcc and kernel.org kernel I built myself from source.
I guess I mostly need to get gcc and binutils SRPMs to generate
cross-compiling tools - has anyone done that?

-- 
MST




Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Michael S. Tsirkin
Quoting Doug Ledford <[EMAIL PROTECTED]>:
> Evidently, I was mistaken and rhn is still populated with the beta1
> rpms.  So, I've made the latest kernel available on my web page as
> referenced below (amongst other rpms as well).  However, it may still be
> a while before the rpms are fully populated as I've had to request an
> increase to my quota limit on that web server in order to hold the
> kernel rpms tree.

When available, I gather they will be here:
http://people.redhat.com/dledford/Infiniband/kernel/2.6.18/1.2729.el5/src/
is that right?

-- 
MST




Re: [openib-general] [PATCH] osm: reviewing osmtest - osmt_multicast.c

2006-10-17 Thread Yevgeny Kliteynik

Hal Rosenstock wrote:
> On Tue, 2006-10-17 at 12:07, Yevgeny Kliteynik wrote:
>> Hi Hal
>>
>> Fixing more things in the multicast test flow.
>>  
>> Still have things to do in case when multicast group removal
>> fails, and have to add some cleanup (as we've discussed previously).
>> --
>> Yevgeny
>>
>> Signed-off-by:  Yevgeny Kliteynik <[EMAIL PROTECTED]>
> 
> Looks good. One question below.
> 
> -- Hal
> 
>> Index: osmtest/osmt_multicast.c
>> ===
>> --- osmtest/osmt_multicast.c (revision 9856)
>> +++ osmtest/osmt_multicast.c (working copy)
> 
> [snip...]
> 
>> @@ -3261,6 +3306,7 @@ osmt_run_mcast_flow( IN osmtest_t * cons
>>  &mc_req_rec,
>>  comp_mask,
>>  &res_sa_mad );
>> +  status = IB_SUCCESS;
> 
> This doesn't look right to me.

Right, this must be some cut-and-paste bug.
This line shouldn't be there. Good catch.
Thanks.

--
Yevgeny.

>>if (status != IB_SUCCESS)
>>{
>>  osm_log( &p_osmt->log, OSM_LOG_ERROR,
>> @@ -3274,6 +3320,10 @@ osmt_run_mcast_flow( IN osmtest_t * cons
>>  fail_to_delete_mcg++;
>>}
>>  }
>> +else
>> +{
>> +  end_ipoib_cnt++;
>> +}
>>  p_mgrp = (osmtest_mgrp_t*)cl_qmap_next( &p_mgrp->map_item );
>>}
>>  
> 
> [snip...]
> 
> 




Re: [openib-general] RHEL5 and OFED ...

2006-10-17 Thread Doug Ledford
On Tue, 2006-10-17 at 17:09 +0200, Michael S. Tsirkin wrote:

> > Yeah, this is the rolling updates thing I was telling you about.  The
> > Beta1 kernel was 2.6.17+several git repos and patches.  We've since
> > updated to 2.6.18, but that was released as an update to the Beta1 isos
> > and trees via RHN.  So, I don't think you'll see the kernel unless you
> > either 1) use up2date to refresh the beta system
> 
> Will that get me the sources too?

Evidently, I was mistaken and rhn is still populated with the beta1
rpms.  So, I've made the latest kernel available on my web page as
referenced below (amongst other rpms as well).  However, it may still be
a while before the rpms are fully populated as I've had to request an
increase to my quota limit on that web server in order to hold the
kernel rpms tree.

> > or 2) download later
> > iso images and look at the kernel present.  The current kernel version
> > is 2.6.18-1.2717.el5.
> 
> So, I'd like to help, but how can one get the updated kernel source?
> Are the iso's with updated sources available somewhere?
> 
-- 
Doug Ledford <[EMAIL PROTECTED]>
  GPG KeyID: CFBFF194
  http://people.redhat.com/dledford

Infiniband specific RPMs available at
  http://people.redhat.com/dledford/Infiniband



Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Parks Fields

>
>No, it's never a good idea to turn off TCP or IP checksums.  That
>leads to possibilities of silent data corruption too easily.


I totally agree...


* Correspondence *

This email contains no programmatic content that requires independent 
ADC review  






[openib-general] OFED 1.1-RC7 build problem on SLES10

2006-10-17 Thread Chris Dennett
I've been trying to install OFED 1.1 RC7 on an x86 server with a fresh install 
of SLES10 (32-bit).  It errors out when trying to build the kernel modules.  
I've included what I think are the relevant log messages below.  I've tried 
installing everything (minus iser and tvflash) or just the modules needed for 
SRP.  I've installed 1.1 RC7 successfully on other RedHat servers without any 
problems.  I am installing as root.  Any help would be appreciated.

Thanks.

-Chris

==
+ make kernel
Building kernel modules
Kernel version: 2.6.16.21-0.8-smp
Modules directory: //lib/modules/2.6.16.21-0.8-smp
Kernel sources: /lib/modules/2.6.16.21-0.8-smp/build
env EXTRA_CFLAGS=" -I/var/tmp/OFEDRPM/BUILD/openib-1.1/include 
-I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/include \
-I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/ulp/ipoib \
-I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/debug" \
make -C /lib/modules/2.6.16.21-0.8-smp/build 
SUBDIRS="/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband" 
KERNELRELEASE=2.6.16.21-0.8-smp \
EXTRAVERSION=.21-0.8-smp V=1  \
CONFIG_INFINIBAND=m \
CONFIG_INFINIBAND_IPOIB=m \
CONFIG_INFINIBAND_SDP= \
CONFIG_INFINIBAND_SRP=m \
CONFIG_INFINIBAND_USER_MAD=m \
CONFIG_INFINIBAND_USER_ACCESS=m \
CONFIG_INFINIBAND_ADDR_TRANS=y \
CONFIG_INFINIBAND_MTHCA=m \
CONFIG_INFINIBAND_IPOIB_DEBUG=y \
CONFIG_INFINIBAND_ISER= \
CONFIG_INFINIBAND_EHCA= \
CONFIG_INFINIBAND_RDS= \
CONFIG_INFINIBAND_RDS_DEBUG= \
CONFIG_INFINIBAND_IPOIB_DEBUG_DATA= \
CONFIG_INFINIBAND_SDP_SEND_ZCOPY= \
CONFIG_INFINIBAND_SDP_RECV_ZCOPY= \
CONFIG_INFINIBAND_SDP_DEBUG= \
CONFIG_INFINIBAND_SDP_DEBUG_DATA= \
CONFIG_INFINIBAND_IPATH= \
CONFIG_INFINIBAND_MTHCA_DEBUG=y \
CONFIG_INFINIBAND_MADEYE= \
LINUXINCLUDE='-I/var/tmp/OFEDRPM/BUILD/openib-1.1/include \
-I/var/tmp/OFEDRPM/BUILD/openib-1.1/drivers/infiniband/include \
-Iinclude \
$(if $(KBUILD_SRC),-Iinclude2 -I$(srctree)/include) \
-include include/linux/autoconf.h \
-include /var/tmp/OFEDRPM/BUILD/openib-1.1/include/linux/autoconf.h \
' \
modules
make[1]: Entering directory `/usr/src/linux-2.6.16.21-0.8-obj/i386/smp'
make[1]: *** No rule to make target `modules'.  Stop.
make[1]: Leaving directory `/usr/src/linux-2.6.16.21-0.8-obj/i386/smp'
make: *** [kernel] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.92052 (%install)


RPM build errors:
user vlad does not exist - using root
group mtl does not exist - using root
user vlad does not exist - using root
group mtl does not exist - using root
Bad exit status from /var/tmp/rpm-tmp.92052 (%install)
ERROR: Failed executing "rpmbuild --rebuild --define '_topdir 
/var/tmp/OFEDRPM' --define '_prefix /usr/local/ofed' --define 'build_root 
/var/tmp/OFED' --define 'configure_options --with-libibcommon --with-libibmad 
--with-libibumad --with-libibverbs --with-libmthca --with-opensm 
--with-librdmacm --with-openib-diags --with-srptools --with-mstflint 
--with-perftest --with-ipoib-mod --with-mthca-mod --with-srp-mod 
--with-core-mod --with-user_mad-mod --with-user_access-mod 
--with-addr_trans-mod' --define 'configure_options32 %{nil}' --define 
'KVERSION 2.6.16.21-0.8-smp' --define 'KSRC 
/lib/modules/2.6.16.21-0.8-smp/build' --define 'build_kernel_ib 1' --define 
'build_kernel_ib_devel 0' --define 'NETWORK_CONF_DIR /etc/sysconfig/network' 
--define 'modprobe_update 1' --define 'include_ipoib_conf 0' --define 
'build_32bit 0' /root/OFED-1.1-rc7/SRPMS/openib-1.1-0.src.rpm"

===

smx32:~ # uname -a
Linux linux-yeez 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 18:25:39 UTC 2006 i686 
i686 i386 GNU/Linux
smx32:~ # ls /usr/src/linux-2.6.16.21-0.8-obj/i386/smp
Module.symvers



-- 
Chris Dennett
Design Engineer
Texas Memory Systems, Inc.
713-266-3200 x430
[EMAIL PROTECTED]





Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] use mmiowb after doorbell ring
> 
> Michael> kernel code does rmb rather than mb there.
> 
> OK, but that's an optimization rather than a correctness issue: mb is
> stronger than rmb.

Very strange. Let's consider amd64: libibverbs has

#elif defined(__x86_64__)

#define mb() asm volatile("" ::: "memory")

So it's just a compiler barrier there.

While linux has asm-x86_64/system.h

#define rmb()  asm volatile("lfence":::"memory")

So rmb seems to be stronger than mb: it will prevent the CPU from reordering
reads while mb won't.
Hmm?

-- 
MST




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
Michael> kernel code does rmb rather than mb there.

OK, but that's an optimization rather than a correctness issue: mb is
stronger than rmb.

The reason I did it that way was because I wasn't sure it was worth
defining mb, rmb and wmb for userspace.

 - R.




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread akepner
On Tue, 17 Oct 2006, Roland Dreier wrote:

> OK, here's what I actually put in my tree.  Can you eyeball this and
> maybe give it a quick test?  If it looks good to you, I'll send it on
> to the stable team for 2.6.18.x.
>

Yep, looks fine, and it works on my Altix.

Thanks, Roland.

-- 
Arthur





[openib-general] OFED-1.1-pre1 is ready

2006-10-17 Thread Tziporet Koren
Hi All,

OFED 1.1-pre1 is available:
URL:
https://openib.org/svn/gen2/branches/1.1/ofed/releases/OFED-1.1-pre1.tgz

This follows the 1.1 release schedule I published yesterday, which all
partners have approved (QLogic has not answered, so I assumed it is OK with
them too).

Each company has 3 days for basic "dead or alive tests" and making sure
that no blocker issues are still open.

If everything goes well we will do the release at the end of this
Thursday.

Component owners: please remember to update the release notes by
Wednesday.
Documents should be the only component that will be changed from this
pre-release to the official release.

Tziporet & Vlad




Release details:

BUILD_ID:
OFED-1.1-pre1

openib-1.1 (REV=9854)
# User space
https://openib.org/svn/gen2/branches/1.1/src/userspace
Git:
ref: refs/heads/ofed_1_1
commit 936b9fc0bd1411b52826213a5d89e2ceb4f52a78

# MPI
mpi_osu-0.9.7-mlx2.2.0.tgz
openmpi-1.1.1-1.src.rpm
mpitests-2.0-0.src.rpm


Fixed bugs:
BUG 273: OFED 1.1 rc7 does not work with Cisco FC Gateway
BUG 274: OFED 1.1: RENICE_IB_MAD=yes hangs dual-HCA system with
dual-port HCAs
BUG 277: OFED 1.1 rc7: uninitialized value during IPoIB failover in
ipoib_ha.pl
BUG 278: OFED 1.1: two copies of openib.spec in openib-1.1.tgz

Other changes from OFED-1.1-rc7:
- Fix in ibdiagnet to support SM on a switch 
- Activate scaling code of ehca as default in the install 
- Documentation update
- Dapl: removed SCM from the configuration file dat.conf.







[openib-general] client-server small message performance issues

2006-10-17 Thread Pete Wyckoff
I'm trying to understand some performance variation in an OpenIB
application, and wrote a small test program to simulate its
behavior.  Attached are the code and a plot of some results.  Each
dot in the plot shows the time for a single iteration in the code
explained below.

One client communicates with some number of servers.  In the plot,
anywhere from 1 to 10 servers.  There is one RC QP from the client
to each server, and no QPs between servers.  Everything is set up in
advance using TCP to exchange lid/key information.  Look for the
function "multiping" to see the client do the following:

start timer
foreach s in server:
post recv to QP[s]
foreach s in server:
post send to QP[s], 200 bytes
wait for 2 * numserver completions (one for each send and recv)
stop timer

Each server meanwhile has a preposted number of receives, 20 is
plenty.  Their loop is:

wait for receive completion
post send to client, 200 bytes
post recv to client QP
wait for send completion

The results for 1000 iterations (and 20 untimed warmups), invoked as
   mpiexec -comm=none -pernode -np 11 multiping -s 200 -n 1000 ib30 > x
   egrep '^#' x
look like:

#   +/-  median , all in usec
#  1 24.744 +/- 2.117 median 24.080 us
#  2 30.352 +/- 2.241 median 30.041 us
#  3 36.202 +/- 2.774 median 35.048 us
#  4 45.475 +/- 2.347 median 45.061 us
#  5 51.843 +/- 2.598 median 51.022 us
#  6 58.552 +/- 2.407 median 57.936 us
#  7 97.751 +/- 16.427 median 95.129 us
#  8 114.346 +/- 16.568 median 113.010 us
#  9 188.962 +/- 52.061 median 192.881 us
# 10 230.065 +/- 48.299 median 215.054 us

Basic ping pong is 25 us.  That's fine as this is not a particularly
optimal way to communicate.  Each additional server adds 6 us.  That
seems like a lot of overhead just to do another pair of posts and
polls, but not my major complaint.  Look at the jump from 6 to 7
servers, 41 us.  Beyond that, too.  And the standard deviation
becomes huge.  A plot of the individual values shows a large spread,
not just a few outliers.
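
The per-server increments can be read straight off the means in the table
above.  A small sketch, with the means copied from that table and the
helper names step_cost() and first_big_step() invented for the purpose:

```c
#include <assert.h>
#include <stddef.h>

/* Mean completion times in usec for 1..10 servers, copied from the table. */
static const double mean_us[] = {
    24.744, 30.352, 36.202, 45.475, 51.843,
    58.552, 97.751, 114.346, 188.962, 230.065,
};

#define NUM_POINTS (sizeof(mean_us) / sizeof(mean_us[0]))

/* Extra time for going from `servers` to `servers + 1` servers. */
static double step_cost(size_t servers)
{
    assert(servers >= 1 && servers < NUM_POINTS);
    return mean_us[servers] - mean_us[servers - 1];
}

/* Smallest server count whose next step costs more than `limit_us`; 0 if none. */
static size_t first_big_step(double limit_us)
{
    size_t n;

    for (n = 1; n < NUM_POINTS; n++)
        if (step_cost(n) > limit_us)
            return n;
    return 0;
}
```

Calling first_big_step(10.0) picks out the 6-to-7 transition as the first
step that costs more than 10 us.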

I was hoping to see each additional server add a fixed amount of
overhead to the overall time.  The same application on ethernet
starts slower, but scales much better as the number of servers is
increased.

The hardware is all Mellanox MT25204, with 18-port MT47396 switches.
I tried 11 hosts all connected to the same switch, and another 11
hosts to a different switch.  Also mixing hosts across switches.  No
perceptible changes to the results.  Also played around with QP attr
timeout and retry_count to discover that retries do happen, so the
retry_count must be at least 2, but that a timeout from 2 to 10
doesn't have an effect.  Software is stock kernel 2.6.17.6 and
libibverbs-1.0.3-1.fc4, libmthca-1.0.2-1.fc4.

Any suggestions on how to avoid these big jumps?  Explanations as to
the cause?

-- Pete

/*
 * Test completion time for lots of small conversations.  Task 0 of this
 * parallel code is the "client", who does a number of iterations of a
 * test involving small transactions with "servers".  At each iteration,
 * the client pre-posts a receive on each QP, posts a small send on each,
 * then polls until all sends and receives are completed.  Each server
 * keeps a constant number of receives posted, waits for a message to
 * arrive and immediately responds.
 *
 * Copyright (C) Pete Wyckoff, 2006.  <[EMAIL PROTECTED]>
 *
 * Built like this:
 * gcc -O3 -c -o multiping.o multiping.c
 * gcc -o multiping multiping.o -libverbs -lm
 *
 * Somehow get it started on many nodes, pointing them all to one
 * which is designated as the master, e.g.:
 *
 *for i in piv002 piv004 piv006 ; do rsh -n $i multiping piv002 & done
 *
 * Mpiexec users inside a PBS job:
 *
 *mpiexec --comm=none -pernode -nostdin multiping $(hostname)
 *
 * Bproc users could do:
 *
 *bpsh 3-31 ./multiping n3
 *
 * Run the code with no args to see the usage() message.  Numbers
 * can be given with suffix "k", "m", or "g" to scale by 10^3, 6, or 9,
 * e.g.:  multiping -n 1k -s 1m $(hostname)
 *
 * Two environment variables adjust QP settings, e.g.:
 *   ARDMA_RETRY_COUNT=2
 *   ARDMA_TIMEOUT=4
 *
 */
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 

/*
 * Debugging support.
 */
#if 0
#define DEBUG_LEVEL 2
#define debug(lvl,fmt,args...) \
do { \
if (lvl <= DEBUG_LEVEL) \
info(fmt,##args); \
} while (0)
#define assert(cond,fmt,args...) \
do { \
if (__builtin_expect(!(cond),0)) \
error(fmt,##args); \
} while (0)
#else  /* no debug version */
#  define debug(lvl,fmt,...)
#  define assert(cond,fmt,...)
#endif

/*
 * Handy macros.
 */
#define ptr_from_int64(p) (void *)(unsigned long)(p)
#define int64_from_ptr(p) (u_int64_t)(unsigned long)(p)

/*
 * Some shared variables.
 */
const char *progname;
int myid, numproc;
char *myhos

Re: [openib-general] [ucma] executing the ucmatose with local IPoIB IP address of port 2 fails

2006-10-17 Thread Sean Hefty
Sean Hefty wrote:
> This is a ROUTE_ERROR (path record query fails).  Are the IP addresses on 
> different subnets?  Are you having ucmatose bind to the port 2 IP address?

Another thing to check is what port ucmatose binds to after calling 
rdma_resolve_addr().

- Sean




Re: [openib-general] [PATCH] osm: reviewing osmtest - osmt_multicast.c

2006-10-17 Thread Hal Rosenstock
On Tue, 2006-10-17 at 12:07, Yevgeny Kliteynik wrote:
> Hi Hal
> 
> Fixing more things in the multicast test flow.
>  
> Still have things to do in case when multicast group removal
> fails, and have to add some cleanup (as we've discussed previously).
> --
> Yevgeny
> 
> Signed-off-by:  Yevgeny Kliteynik <[EMAIL PROTECTED]>

Looks good. One question below.

-- Hal

> Index: osmtest/osmt_multicast.c
> ===
> --- osmtest/osmt_multicast.c  (revision 9856)
> +++ osmtest/osmt_multicast.c  (working copy)

[snip...]

> @@ -3261,6 +3306,7 @@ osmt_run_mcast_flow( IN osmtest_t * cons
>  &mc_req_rec,
>  comp_mask,
>  &res_sa_mad );
> +  status = IB_SUCCESS;

This doesn't look right to me.

>if (status != IB_SUCCESS)
>{
>  osm_log( &p_osmt->log, OSM_LOG_ERROR,
> @@ -3274,6 +3320,10 @@ osmt_run_mcast_flow( IN osmtest_t * cons
>  fail_to_delete_mcg++;
>}
>  }
> +else
> +{
> +  end_ipoib_cnt++;
> +}
>  p_mgrp = (osmtest_mgrp_t*)cl_qmap_next( &p_mgrp->map_item );
>}
>  

[snip...]






Re: [openib-general] [ucma] executing the ucmatose with local IPoIB IP address of port 2 fails

2006-10-17 Thread Sean Hefty
> scenario 2:  fails
> SM was executed on port 2
> i executed ucmatose server and ucmatose client with IPoIB IP address 
> of port 2
> 
> here is the output of the client:
> ucmatose: starting client
> ucmatose: connecting
> ucmatose: event: 3, error: 0
> receiving data transfers
> sending replies
> data transfers complete
> test complete
> return status 0

This is a ROUTE_ERROR (path record query fails).  Are the IP addresses on 
different subnets?  Are you having ucmatose bind to the port 2 IP address?

> It seems that when using the IPoIB IP address of port 2 in the client 
> side and there is an SM only on port 2 the test fails but if i add an SM 
> on port 1 the test passes.
> 
> Did you notice this behavior before?

I have not tested this configuration.

- Sean




[openib-general] [PATCH] opensm: misc fixes in lft dump file parser

2006-10-17 Thread Sasha Khapyorsky

There are misc small fixes for lft dump parser:
- merge ERROR and SYS logging in single osm_log() call
- more strict strtoul() results checking
- fix potential bugs with invalid dump files
- break too long lines
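
The stricter strtoul() check can be illustrated outside of OpenSM.  The
sketch below is not the patched function; parse_hex_lid() is a
hypothetical helper that applies the same "q == p || !isspace(*q)" test
to a token such as "0x001a" followed by whitespace.  The range check is
an extra assumption and is not part of the patch.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "0x<hex>" followed by whitespace.  Returns 0 and stores the
 * value in *lid on success, -1 on any malformed input.
 */
int parse_hex_lid(const char *p, uint16_t *lid)
{
    char *q;
    unsigned long val;

    if (strncmp(p, "0x", 2))
        return -1;
    p += 2;                      /* skip the prefix before converting */

    val = strtoul(p, &q, 16);
    /* q == p: no digits consumed; !isspace(*q): junk or end of string */
    if (q == p || !isspace((unsigned char)*q))
        return -1;
    if (val > UINT16_MAX)
        return -1;               /* extra range check, not in the patch */

    *lid = (uint16_t)val;
    return 0;
}
```

The older test "q && !isspace(*q)" could not detect an empty conversion,
because strtoul() always stores a non-NULL end pointer; comparing q
against p is what catches input with no digits at all.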

Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>
---
 osm/opensm/osm_ucast_file.c |   69 +++
 1 files changed, 37 insertions(+), 32 deletions(-)

diff --git a/osm/opensm/osm_ucast_file.c b/osm/opensm/osm_ucast_file.c
index da39d1a..446c243 100644
--- a/osm/opensm/osm_ucast_file.c
+++ b/osm/opensm/osm_ucast_file.c
@@ -132,21 +132,19 @@ static int do_ucast_file_load(void *cont
 
file_name = p_osm->subn.opt.ucast_dump_file;
if (!file_name) {
-   osm_log(&p_osm->log, OSM_LOG_SYS,
-   "ucast dump file name is not defined; using default routing algorithm\n");
-   osm_log(&p_osm->log, OSM_LOG_ERROR,
+   osm_log(&p_osm->log, OSM_LOG_ERROR|OSM_LOG_SYS,
"do_ucast_file_load: ERR 6301: "
-   "ucast dump file name is not defined; using default routing algorithm\n");
+   "ucast dump file name is not defined; "
+   "using default routing algorithm\n");
return -1;
}
 
file = fopen(file_name, "r");
if (!file) {
-   osm_log(&p_osm->log, OSM_LOG_SYS,
-   "Cannot open ucast dump file \'%s\'; using default routing algorithm\n", file_name);
-   osm_log(&p_osm->log, OSM_LOG_ERROR,
+   osm_log(&p_osm->log, OSM_LOG_ERROR|OSM_LOG_SYS,
"do_ucast_file_load: ERR 6302: "
-   "cannot open ucast dump file \'%s\'; using default routing algorithm\n", file_name);
+   "cannot open ucast dump file \'%s\'; "
+   "using default routing algorithm\n", file_name);
return -1;
}
 
@@ -167,25 +165,25 @@ static int do_ucast_file_load(void *cont
continue;
 
if (!strncmp(p, "Multicast mlids", 15)) {
-   osm_log(&p_osm->log, OSM_LOG_SYS,
-   "Multicast dump file detected; "
-   "skipping parsing. Using default routing algorithm\n");
-   osm_log(&p_osm->log, OSM_LOG_ERROR,
+   osm_log(&p_osm->log, OSM_LOG_ERROR|OSM_LOG_SYS,
"do_ucast_file_load: ERR 6303: "
"Multicast dump file detected; "
-   "skipping parsing. Using default routing algorithm\n");
+   "skipping parsing. Using default "
+   "routing algorithm\n");
} else if (!strncmp(p, "Unicast lids", 12)) {
q = strstr(p, " guid 0x");
if (!q) {
-   osm_log(&p_osm->log, OSM_LOG_ERROR, "PARSE ERROR: %s:%u: "
+   osm_log(&p_osm->log, OSM_LOG_ERROR,
+   "PARSE ERROR: %s:%u: "
"cannot parse switch definition\n",
file_name, lineno);
return -1;
}
-   p = q + 6;
+   p = q + 8;
sw_guid = strtoull(p, &q, 16);
-   if (q && !isspace(*q)) {
-   osm_log(&p_osm->log, OSM_LOG_ERROR, "PARSE ERROR: %s:%u: "
+   if (q == p || !isspace(*q)) {
+   osm_log(&p_osm->log, OSM_LOG_ERROR,
+   "PARSE ERROR: %s:%u: "
"cannot parse switch guid: \'%s\'\n",
file_name, lineno, p);
return -1;
@@ -204,39 +202,46 @@ static int do_ucast_file_load(void *cont
continue;
}
} else if (p_sw && !strncmp(p, "0x", 2)) {
+   p += 2;
lid = (uint16_t)strtoul(p, &q, 16);
-   if (q && !isspace(*q)) {
-   osm_log(&p_osm->log, OSM_LOG_ERROR, "PARSE ERROR: %s:%u: "
-   "cannot parse lid: \'%s\'\n", file_name, lineno, p);
+   if (q == p || !isspace(*q)) {
+   osm_log(&p_osm->log, OSM_LOG_ERROR,
+   "PARSE ERROR: %s:%u: "
+   "cannot parse lid: \'%s\'\n",
+   file_name, lineno, p);
return -1;
}
p = q;
while 

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] use mmiowb after doorbell ring
> 
>  > > I don't think an mmiowb() equivalent is available from userspace.
>  > 
>  > Isn't this just an asm() command?
> 
> Nope, look at the kernel source, specifically arch/ia64/sn/kernel/iomv.c
> 
>  > BTW, I think we really should implement proper rmb/wmb in arch.h.
>  > Last time I looked we only had compiler barriers here, and
>  > this means, I think, that a read from e.g. CQE contents could bypass
>  > the read of the CQE valid bit.
> 
> I'm not absolutely sure everything there is correct but I did my best,
> for example
> 
> #elif defined(__ia64__)
> 
> #define mb() asm volatile("mf" ::: "memory")
> 
> Do you know of any specific archs that are broken?

Look e.g. at mthca/cq.c


cqe = next_cqe_sw(cq);
if (!cqe)
return CQ_EMPTY;

/*
   * Make sure we read CQ entry contents after we've checked the
   * ownership bit.
   */
mb();

qpn = ntohl(cqe->my_qpn);


The kernel code does rmb() rather than mb() there.
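
The ordering requirement can be sketched in portable C11, with an
acquire fence standing in for a read barrier.  The struct layout, the
CQE_OWNER_HW flag and poll_one() are hypothetical and not taken from
libmthca, and the C11 memory model does not formally cover device DMA,
so this only illustrates where the barrier sits:

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define CQE_OWNER_HW 0x80u  /* hypothetical: set while hardware owns the entry */

struct cqe {
    uint32_t my_qpn;          /* payload written by the device */
    _Atomic uint8_t owner;    /* ownership byte, written last by the device */
};

/* Returns 0 and stores the QP number if the entry is valid, -1 if empty. */
int poll_one(struct cqe *cqe, uint32_t *qpn)
{
    uint8_t owner = atomic_load_explicit(&cqe->owner, memory_order_relaxed);

    if (owner & CQE_OWNER_HW)
        return -1;            /* still owned by hardware: CQ is empty */

    /*
     * Read barrier: loads after this point may not be satisfied before
     * the ownership load above.  A compiler-only barrier would not stop
     * a weakly ordered CPU from reordering the two loads.
     */
    atomic_thread_fence(memory_order_acquire);

    *qpn = cqe->my_qpn;
    return 0;
}
```

Only loads need to be ordered here, which is why a read barrier is
enough at this point and a full barrier is stronger than required.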



-- 
MST




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
 > > I don't think an mmiowb() equivalent is available from userspace.
 > 
 > Isn't this just an asm() command?

Nope, look at the kernel source, specifically arch/ia64/sn/kernel/iomv.c

 > BTW, I think we really should implement proper rmb/wmb in arch.h.
 > Last time I looked we only had compiler barriers here, and
 > this means, I think, that a read from e.g. CQE contents could bypass
 > the read of the CQE valid bit.

I'm not absolutely sure everything there is correct but I did my best,
for example

#elif defined(__ia64__)

#define mb() asm volatile("mf" ::: "memory")

Do you know of any specific archs that are broken?

 - R.




Re: [openib-general] [PATCH] Re-send ARP as prev ARP request could have got dropped.

2006-10-17 Thread Sean Hefty
Krishna Kumar wrote:
> Re-send ARP, since earlier ARP request could have got
> dropped/lost. This should be done in addr_resolve_remote()
> as doing it in rdma_resolve_ip() means sending ARP only
> once.

This was intentional.  Users can call rdma_resolve_ip() again to retry a timed
out request.

In any case, we want to avoid resending an ARP until after the first request
has timed out.  addr_resolve_remote() can be called multiple times for the same
destination within the specified time out window.

- Sean




Re: [openib-general] [PATCH] Rewrite cma_req_handler() to encapsulate common code.

2006-10-17 Thread Sean Hefty
Acked-by: Sean Hefty <[EMAIL PROTECTED]>

Let me see how Roland would like to handle merging the patches going forward, 
but this one looks fine.




Re: [openib-general] [PATCH] Fix some cancellation problems in process_req().

2006-10-17 Thread Sean Hefty
Krishna Kumar wrote:
>   mutex_lock(&lock);
>   list_for_each_entry_safe(req, temp_req, &req_list, list) {
> - if (req->status) {
> + if (req->status && req->status != -ECANCELED) {

I think we just need:

if (req->status == -ENODATA) {

>   src_in = (struct sockaddr_in *) &req->src_addr;
>   dst_in = (struct sockaddr_in *) &req->dst_addr;
>   req->status = addr_resolve_remote(src_in, dst_in,
> req->addr);
> + if (req->status && time_after_eq(jiffies, req->timeout))
> + req->status = -ETIMEDOUT;
> + else if (req->status == -ENODATA)
> + continue;
>   }
> - if (req->status && time_after(jiffies, req->timeout))
> - req->status = -ETIMEDOUT;
> - else if (req->status == -ENODATA)
> - continue;
> -

The other changes look fine.  But note that if req->status == -ECANCELED and 
time_after() is true, then it seems like a toss up as to which one can be 
reported to the user.
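
One way to make the outcome deterministic is sketched below.  This is
not the addr.c code; next_status() is a hypothetical helper, and the
choice that a cancelled request stays cancelled even after its timeout
has passed is an assumption of the sketch, not something decided in
this thread:

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Decide the status of one queued request.  resolve_result is what a
 * retry of the lookup would return; it is consulted only for requests
 * that are still waiting (-ENODATA).  A return value of -ENODATA means
 * "leave the request on the list".
 */
int next_status(int status, int resolve_result, bool timed_out)
{
    if (status == -ENODATA) {
        status = resolve_result;
        if (status && timed_out)
            status = -ETIMEDOUT;
    }
    return status;   /* 0, -ECANCELED and other errors pass through */
}
```

With the timeout test inside the -ENODATA branch, only requests that
are still pending can turn into -ETIMEDOUT.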

- Sean




Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Shirley Ma

What I suggested here is when it's connected mode with large MTU, set ib interface flag to CHECKSUM_UNNECESSARY. But this only works on packets not being routed off-net at the TCP layer.

Thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread akepner
On Tue, 17 Oct 2006, Michael S. Tsirkin wrote:

> Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
>> Subject: Re: [PATCH] use mmiowb after doorbell ring
>>
>> Michael> BTW, something like this will be needed for userspace too?
>>
>> Ugh, I forgot about that.
>>
>> I don't think an mmiowb() equivalent is available from userspace.
>
> Isn't this just an asm() command?
>

Depends on the architecture, but on sn2, it's not.

(Actually, on most architectures, it's a no-op. See
arch/ia64/sn/kernel/iomv.c:__sn_mmiowb() for the sn2 
version.)

-- 
Arthur






Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> No, it's never a good idea to turn off TCP or IP checksums.  That
> leads to possibilities of silent data corruption too easily.

"never" is probably too strong a word - hardware checksum offloading
turns off checksumming in software, moving that to hardware.
Some people dislike that, too, but it's not a universal thing.
Another example is loopback interface which sets NETIF_F_NO_CSUM. But this
might be a Linux-only thing.

Right?

-- 
MST




Re: [openib-general] [PATCH] Use time_after_eq() instead of time_after() in queue_req()

2006-10-17 Thread Sean Hefty
Acked-by: Sean Hefty <[EMAIL PROTECTED]>

Roland, this looks good for 2.6.20.  How would you like to handle pulling in 
patches like these?  Once OFA has git up, would it be easier to pull them into 
my git tree, then request that you pull from there, or does this work okay?


> In queue_req(), use time_after_eq() instead of time_after()
> for following reasons :
> 
> - Improves insert time if multiple entries with same time are
>   present.
> - set_timeout need not be called if entry with same time
>   is added to the list (and that happens to be the entry
>   with the smallest time), saving atomic/locking operations.
> - Earlier entries with same time are deleted first (fifo).
> 
> Signed-off-by: Krishna Kumar <[EMAIL PROTECTED]>
> 
> diff -ruNp org/drivers/infiniband/core/addr.c new/drivers/infiniband/core/addr.c
> --- org/drivers/infiniband/core/addr.c 2006-10-09 16:54:37.0 +0530
> +++ new/drivers/infiniband/core/addr.c 2006-10-09 16:55:36.0 +0530
> @@ -118,7 +118,7 @@ static void queue_req(struct addr_req *r
>  
>   mutex_lock(&lock);
>   list_for_each_entry_reverse(temp_req, &req_list, list) {
> - if (time_after(req->timeout, temp_req->timeout))
> + if (time_after_eq(req->timeout, temp_req->timeout))
>   break;
>   }
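
The effect on equal timeouts can be checked with a small user-space
model.  The comparison helpers below follow the usual wraparound-safe
signed-difference idiom; insert_pos() is a hypothetical stand-in for
the reverse list walk in queue_req(), operating on a sorted array
instead of a kernel list:

```c
#include <stddef.h>

/* Wraparound-safe comparisons in the style of the kernel's jiffies helpers. */
static int after(unsigned long a, unsigned long b)    { return (long)(b - a) < 0; }
static int after_eq(unsigned long a, unsigned long b) { return (long)(a - b) >= 0; }

/*
 * Given timeouts sorted in ascending order, scan from the tail and return
 * the index at which a new timeout would be inserted.  *steps counts how
 * many existing entries were examined.
 */
size_t insert_pos(const unsigned long *timeouts, size_t n,
                  unsigned long new_timeout, int use_eq, size_t *steps)
{
    size_t i = n;

    *steps = 0;
    while (i > 0) {
        unsigned long t = timeouts[i - 1];

        (*steps)++;
        if (use_eq ? after_eq(new_timeout, t) : after(new_timeout, t))
            break;            /* insert right after entry i - 1 */
        i--;
    }
    return i;
}
```

For the queue { 10, 20, 20, 20 } and a new timeout of 20, the _eq
variant stops after one comparison and appends at the tail (FIFO among
equals), while the strict variant walks past all three equal entries
and inserts ahead of them.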




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] use mmiowb after doorbell ring
> 
> Michael> BTW, something like this will be needed for userspace too?
> 
> Ugh, I forgot about that.
> 
> I don't think an mmiowb() equivalent is available from userspace.

Isn't this just an asm() command?

> However, the problem only arises if userspace uses the same QP/CQ/SRQ
> from multiple nodes at the same time -- so maybe we can live with this.

BTW, I think we really should implement proper rmb/wmb in arch.h.
Last time I looked we only had compiler barriers here, and
this means, I think, that a read from e.g. CQE contents could bypass
the read of the CQE valid bit.

-- 
MST




Re: [openib-general] [PATCH] [RFC] cma_new_id can kfree on error instead of destroy_id

2006-10-17 Thread Sean Hefty
Krishna Kumar wrote:
> cma_new_id() does not require to do destroy_id(), instead
> it can kfree(), since nothing is allocated on that id.
> Posting this as an RFC in case anyone feels that create_id
> should be cleaned up by destroy_id (even if redundant).

I can go either way on this.  It's a little cleaner to match rdma_create_id() 
with rdma_destroy_id(), rather than matching it with kfree().  It makes 
maintenance easier.

- Sean




Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Roland Dreier
Shirley> I read the discussion in net-dev. Since IB packet has its
Shirley> own CRC (ICRC, VCRC). Is it a good idea to enable
Shirley> checksum unnecessary in a pure IB Fabrics for large MTU
Shirley> 64K. It requires some negotiation. Does your prototype
Shirley> implementation for large MTU requires both ends
Shirley> agreement?

No, it's never a good idea to turn off TCP or IP checksums.  That
leads to possibilities of silent data corruption too easily.
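
For reference, the check under discussion is the 16-bit one's-complement
sum described in RFC 1071.  A minimal sketch follows; the function name
inet_checksum() is made up for this example and is not a kernel API:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * RFC 1071 Internet checksum over len bytes, returned in host order.
 * The 32-bit accumulator is sufficient for buffers up to 64K bytes.
 */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += (uint32_t)data[0] << 8 | data[1];   /* big-endian 16-bit word */
        data += 2;
        len -= 2;
    }
    if (len)
        sum += (uint32_t)data[0] << 8;             /* pad odd byte with zero */

    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);        /* fold carries back in */

    return (uint16_t)~sum;
}
```

A single flipped bit anywhere in the buffer changes the result, which is
the end-to-end protection that disappears if the checksum is skipped.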




Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Roland Dreier
Michael> BTW, something like this will be needed for userspace too?

Ugh, I forgot about that.

I don't think an mmiowb() equivalent is available from userspace.
However, the problem only arises if userspace uses the same QP/CQ/SRQ
from multiple nodes at the same time -- so maybe we can live with this.

 - R.




Re: [openib-general] uDAPL problem

2006-10-17 Thread Arlin Davis
Stephen Smaldone wrote:

>
>
> Arlin Davis wrote:
>
>> Steve Smaldone wrote:
>>
>>> Hi,
>>>
>>> Sorry for replying to myself, but I loaded rdma_ucm and the rdma_cm 
>>> device appears.  However, it now fails with the following:
>>>
>>> $ ./dapltest -T S -D IB1
>>> ...
>>> DAT Registry: dat_ia_openv (IB1,1:2,0) called
>>> DAT Registry: IA IB1, trying to load library /usr/local/lib/libdapl.so
>>> DAT Registry: dat_registry_add_provider (IB1,1:2,0)
>>> libibverbs: Warning: no userspace device-specific driver found for 
>>> uverbs0
>>>driver search path: /usr/local/lib/infiniband
>>> libibverbs: Warning: no userspace device-specific driver found for 
>>> uverbs0
>>>driver search path: /usr/local/lib/infiniband
>>> DT_cs_Server: Could not open IB1 (DAT_INVALID_ADDRESS )
>>> DT_cs_Server (IB1):  Exiting.
>>> DAT Registry: Stopped (dat_fini)
>>>
>>> The configuration remains the same otherwise.
>>>  
>>>
>>>> My dat.conf:
>>>> IB1 u1.2 nonthreadsafe default /usr/local/lib/libdapl.so mv_dapl.1.2 "hora-1-ib0 0" ""
>>>>
>>>
>> Do you have an entry in your /etc/hosts for hora-1-ib0 and 10.2.2.135?
>>
>> there seems to be problems resolving "hora-1-ib0"
>>
>> -arlin
>
> Yes.  There is an entry as follows:
> 10.2.2.135  hora-1-ib0

Could you change the "hora-1-ib0 0" to just "ib0 0" in your dat.conf and
retry? There may be an issue parsing a hostname instead of a netdev name.

>
> Thanks,
>
> Steve
>




Re: [openib-general] OFED 1.1 release schedule

2006-10-17 Thread Sean Hefty
Michael S. Tsirkin wrote:
> Could be a compiler thing: maybe cm_issue_rej is used in more than
> one place? To make sure, you can try removing the static
> keyword and see if this appears.

That could be.  cm_issue_rej is called from multiple locations, whereas 
cm_issue_drep is not.

- Sean




Re: [openib-general] [PATCH] If addr_handler() got error, do not set state as OK

2006-10-17 Thread Sean Hefty
Krishna Kumar wrote:
> diff -ruNp org/drivers/infiniband/core/cma.c new/drivers/infiniband/core/cma.c
> --- org/drivers/infiniband/core/cma.c 2006-10-10 15:45:27.0 +0530
> +++ new/drivers/infiniband/core/cma.c 2006-10-10 15:59:53.0 +0530
> @@ -1515,6 +1515,8 @@ static void addr_handler(int status, str
>  {
>   struct rdma_id_private *id_priv = context;
>   enum rdma_cm_event_type event;
> + int did_comp_exch = 0;
> + int destroy = 0;

As a general comment, I really don't think that we need to be overly concerned 
about optimizing error handling at the expense of code readability.

Can you rework this patch without adding in extra flags to indicate what has or 
has not been executed?

Thanks,
- Sean




Re: [openib-general] OFED 1.1 release schedule

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> Subject: Re: OFED 1.1 release schedule
> 
> Tziporet Koren wrote:
> > I checked it and saw that the patch is applied, but since in the patch 
> > Sean put the cm_issue_drep as a static, thus nm does not show it.
> > from the patch: +static int cm_issue_drep(struct cm_port *port,
> 
> cm_issue_rej is also static, but shows up.

Could be a compiler thing: maybe cm_issue_rej is used in more than
one place? To make sure, you can try removing the static
keyword and see if this appears.

-- 
MST




Re: [openib-general] [PATCH] rdma_bind_addr() leaks a cma_dev reference count

2006-10-17 Thread Sean Hefty
Krishna Kumar2 wrote:
> Hmmm, OK, I will re-phrase this patch to reduce nesting.

Something similar to:

if (cma_any_addr...) {
ret = rdma_translate_ip(..);
if (ret)
goto err1;

mutex_lock
ret = cma_acquire_dev
mutex_unlock
if (ret)
goto err2;
}

should work fine.
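
Spelled out as compilable C, the shape outlined above looks roughly like
the following.  The fake_* helpers and struct are stand-ins made up so
that the control flow can be run on its own; they are not the real cma
functions, the need_device flag replaces the elided condition, and what
the real code undoes at each label is not shown in this thread:

```c
#include <stdbool.h>

struct fake_id { bool translated; bool has_dev; };

static int fake_translate_ip(struct fake_id *id, bool fail)
{
    if (fail)
        return -1;
    id->translated = true;
    return 0;
}

static int fake_acquire_dev(struct fake_id *id, bool fail)
{
    if (fail)
        return -1;
    id->has_dev = true;
    return 0;
}

int fake_bind(struct fake_id *id, bool need_device,
              bool fail_translate, bool fail_acquire)
{
    int ret = 0;

    if (need_device) {
        ret = fake_translate_ip(id, fail_translate);
        if (ret)
            goto err1;

        /* the real code holds a mutex around this call */
        ret = fake_acquire_dev(id, fail_acquire);
        if (ret)
            goto err2;
    }
    return 0;

err2:
    id->translated = false;   /* undo the first step */
err1:
    return ret;
}
```

Each failure jumps to the label that undoes exactly the steps already
completed, so the success path stays flat and no extra flags are needed.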

- Sean




Re: [openib-general] OFED 1.1 release schedule

2006-10-17 Thread Sean Hefty
Tziporet Koren wrote:
> I checked it and saw that the patch is applied, but since in the patch 
> Sean put the cm_issue_drep as a static, thus nm does not show it.
> from the patch: +static int cm_issue_drep(struct cm_port *port,

cm_issue_rej is also static, but shows up.

> Do you really need the symbol to be exported out of the ib_cm module, or 
> is it enough this way.

The symbol does not need to be exported.

- Sean




Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread glebn
On Tue, Oct 17, 2006 at 06:48:22PM +0200, Michael S. Tsirkin wrote:
> Quoting r. [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
> > Subject: Re: [openfabrics-ewg] OFED 1.1 RC7 fork() issue.
> > 
> > On Tue, Oct 17, 2006 at 10:34:23AM -0500, Tang, Changqing wrote:
> > > 
> > > >3. Fork support from kernel 2.6.12 and above is available 
> > > >provided that applications do not use threads. The fork() is 
> > > >supported as long as parent process does not run before child 
> > > >exits or calls exec().
> > > 
> > > After fork(), in child, before exec(), can we call printf(), putenv(),
> > > or even re-direct stdout/stderr ?
> > > 
> > Child can do whatever he wants (except using verbs), but parent can't use
> > verbs until child exits() or execs().
> 
> Or even write to registered pages at all.
> 
Right. Forgot that. So parent better be doing nothing, but waiting.
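
A minimal sketch of that pattern in plain POSIX C, with no verbs calls
at all.  The helper name spawn_and_wait() is hypothetical; the point is
only that the parent blocks in waitpid() and touches nothing else until
the child is gone (waiting for exit is stricter than required, since an
exec in the child would already be enough):

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork, exec argv[0] in the child, and block in the parent until the
 * child has exited.  Returns the child's exit status, or -1 on error.
 */
int spawn_and_wait(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: no verbs calls here; exec as soon as possible. */
        execvp(argv[0], argv);
        _exit(127);           /* exec failed */
    }

    /*
     * Parent: do nothing but wait, and in particular do not write to
     * registered memory, until the child has exited.
     */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would pass a NULL-terminated argument vector, for example
{ "sh", "-c", "exit 7", NULL }, and get the child's exit code back.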

--
Gleb.




Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Michael S. Tsirkin
Quoting r. [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
> Subject: Re: [openfabrics-ewg] OFED 1.1 RC7 fork() issue.
> 
> On Tue, Oct 17, 2006 at 10:34:23AM -0500, Tang, Changqing wrote:
> > 
> > >3. Fork support from kernel 2.6.12 and above is available 
> > >provided that applications do not use threads. The fork() is 
> > >supported as long as parent process does not run before child 
> > >exits or calls exec().
> > 
> > After fork(), in child, before exec(), can we call printf(), putenv(),
> > or even re-direct stdout/stderr ?
> > 
> Child can do whatever he wants (except using verbs), but parent can't use
> verbs until child exits() or execs().

Or even write to registered pages at all.

-- 
MST




Re: [openib-general] Tools for development

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Matt Leininger <[EMAIL PROTECTED]>:
> Developers had requested git 1.4, but Ubuntu had an older version.  We
> went ahead and installed git from source.  I'd prefer to stick to Ubuntu
> packages if possible.

We have much to gain from newer versions - just look at gitweb change log.
But my assumption here was that someone will keep the built from source
tools updated. I don't have a problem alerting the list when new
versions come out.

If, as Roland suggested, we'll be stuck at this version, it's better
to stick with distro-supplied ones, assuming that *that* is updated
in a timely fashion.

So, I guess the question is how is the system supported/updated?

-- 
MST




Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Shirley Ma <[EMAIL PROTECTED]>:
> > > /* can be added later once ipoib support sg
> > > .get_sg = ethtool_op_get_sg,
> > > .set_sg = ethtool_op_set_sg,
> > > */
> > 
> > The difficulty here is that sg currently requires checksum offloading in
> > netdevice.
> 
> I read the discussion in net-dev.

Hmm, any suggestions?

> Since IB packet has its own CRC (ICRC, VCRC). Is it a good idea to enable
> checksum unnecessary in a pure IB Fabrics for large MTU 64K.  It requires some
> negotiation.

Not sure what you are saying here.

> Does your prototype implementation for large MTU requires both ends agreement?
> Practically it can be implemented, but I don't know what RFCs have defined.

Look up IPoIB connected mode.

-- 
MST




Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Tang, Changqing

Thanks for the clarification. --CQ


>>   
>You need to distinguish between full fork support, which will be
>available only in libibverbs1.1, and the system/fork & exec
>support, which depends only on the kernel and is available from
>kernel 2.6.12.
>
>See also the explanation from Gleb on this
>
>Tziporet
>




Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Tziporet Koren
Tang, Changqing wrote:
> Thanks, I still use 2.6.9-34, Or Gerlitz told me that fork() support is
> only in libibverbs1.1 which is not released yet. Both OFED 1.0 and 1.1
> use libibverbs1.0, is it still true ?
>
> --CQ
>
>
>   
You need to distinguish between full fork support, which will be
available only in libibverbs1.1, and the system/fork & exec support,
which depends only on the kernel and is available from kernel 2.6.12.

See also the explanation from Gleb on this

Tziporet




Re: [openib-general] Tools for development

2006-10-17 Thread Sasha Khapyorsky
On 17:04 Tue 17 Oct , Michael S. Tsirkin wrote:
> Quoting r. Steve Wise <[EMAIL PROTECTED]>:
> > At the risk of opening a can of worms, is there any reason we don't move
> > the user stuff into its own git tree?  This would get rid of svn
> > altogether...
> 
> If we do, that should probably be multiple git trees - verbs, management,
> tests are all more or less independent and developed mostly by different 
> people.

Reasonable. And generally this should not be too bad.

Sasha




Re: [openib-general] Tools for development

2006-10-17 Thread Sasha Khapyorsky
On 09:17 Tue 17 Oct , Jeff Squyres wrote:
> Per the teleconference last week, I'd like to survey the developers  
> about the tools that should be installed on the new OFA server (is  
> there a plan to migrate there yet?).
> 
> As I understand it (please correct me if I get this wrong):
> 
> - The community has decided to stay with git for kernel level  
> development
>--> Was there a plan for any consolidation of the various git  
> repositories?)
> - The community decided to stay with svn for user space level  
> development

This would be nice to have automatic svn -> git mirroring for user space
projects too (at least for those projects where developers will like it).
Then developers will be able to choose between svn and git.

I currently use such svn -> git mirroring privately for
src/userspace/management; I have some scripts and can of course help
with the setup.

Sasha




[openib-general] [PATCH] osm: reviewing osmtest - osmt_multicast.c

2006-10-17 Thread Yevgeny Kliteynik
Hi Hal

Fixing more things in the multicast test flow.
 
There is still work to do for the case where multicast group removal
fails, and some cleanup to add (as we've discussed previously).
--
Yevgeny

Signed-off-by:  Yevgeny Kliteynik <[EMAIL PROTECTED]>

Index: osmtest/osmt_multicast.c
===
--- osmtest/osmt_multicast.c(revision 9856)
+++ osmtest/osmt_multicast.c(working copy)
@@ -418,6 +418,12 @@ osmt_init_mc_query_rec(IN  osmtest_t * c
  * o15.0.1.16:
  * - Try GetTable with PortGUID wildcarded and get back some groups.
  ***/
+
+/* The following macro can be used only within the osmt_run_mcast_flow() function */
+#define IS_IPOIB_MGID(p_mgid) \
+   ( !memcmp(&osm_ipoib_good_mgid ,   (p_mgid) , sizeof(osm_ipoib_good_mgid)) || \
+ !memcmp(&osm_ts_ipoib_good_mgid ,(p_mgid) , sizeof(osm_ts_ipoib_good_mgid)) )
+
 ib_api_status_t
 osmt_run_mcast_flow( IN osmtest_t * const p_osmt ) {
   ib_api_status_t status;
@@ -433,13 +439,15 @@ osmt_run_mcast_flow( IN osmtest_t * cons
   ib_net16_t max_mlid = cl_hton16(0xFFFE),tmp_mlid;
   boolean_t ReachedMlidLimit = FALSE;
   int start_cnt = 0, cnt, middle_cnt = 0, end_cnt = 0;
-  int IPoIBIsFound = 0, mcg_outside_test_cnt = 0, fail_to_delete_mcg = 0;
+  int start_ipoib_cnt = 0, end_ipoib_cnt = 0;
+  int mcg_outside_test_cnt = 0, fail_to_delete_mcg = 0;
   osmtest_req_context_t context;
   ib_node_record_t *p_rec;
   uint32_t num_recs = 0, i;
   uint8_t mtu_phys = 0, rate_phys = 0;
   cl_map_t test_created_mlids; /* List of all mlids created in this test */
   ib_member_rec_t* p_recvd_rec;
+  boolean_t got_error = FALSE;
 
   static ib_gid_t good_mgid = {
 {
@@ -538,13 +546,19 @@ osmt_run_mcast_flow( IN osmtest_t * cons
   while( p_mgrp != (osmtest_mgrp_t*)cl_qmap_end( p_mgrp_mlid_tbl ) )
   {
 /* search for ipoib mgid */
-if (!memcmp(&osm_ipoib_good_mgid,&p_mgrp->mcmember_rec.mgid,sizeof(osm_ipoib_good_mgid)) ||
-!memcmp(&osm_ts_ipoib_good_mgid,&p_mgrp->mcmember_rec.mgid,sizeof(osm_ts_ipoib_good_mgid)))
+if (IS_IPOIB_MGID(&p_mgrp->mcmember_rec.mgid))
 {
-  IPoIBIsFound = 1;
+  start_ipoib_cnt++;
 }
 else
+{
+  osm_log( &p_osmt->log, OSM_LOG_INFO,
+   "osmt_run_mcast_flow: "
+   "Non-IPoIB MC Groups exist: mgid=0x%016" PRIx64 ":0x%016" PRIx64 "\n",
+   cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.prefix),
+   cl_ntoh64(p_mgrp->mcmember_rec.mgid.unicast.interface_id) );
mcg_outside_test_cnt++;
+}
 
 p_mgrp = (osmtest_mgrp_t*)cl_qmap_next( &p_mgrp->map_item );
   }
@@ -553,7 +567,7 @@ osmt_run_mcast_flow( IN osmtest_t * cons
"osmt_run_mcast_flow: "
"Found %d non-IPoIB MC Groups\n", mcg_outside_test_cnt);
 
-  if (IPoIBIsFound)
+  if (start_ipoib_cnt)
   {
 /* o15-0.2.4 - Check a join request to already created MCG */
 osm_log( &p_osmt->log, OSM_LOG_INFO,
@@ -576,6 +590,9 @@ osmt_run_mcast_flow( IN osmtest_t * cons
 
 osm_log(&p_osmt->log, OSM_LOG_INFO,
 "osmt_run_mcast_flow: "
+"Joining to an existing IPoIB multicast group\n");
+osm_log(&p_osmt->log, OSM_LOG_INFO,
+"osmt_run_mcast_flow: "
 "Sent Join request with :\n\t\tport_gid=0x%016"PRIx64
 ":0x%016" PRIx64 ", mgid=0x%016" PRIx64 ":0x%016" PRIx64
 "\n\t\tjoin state= 0x%x, response is : %s\n",
@@ -585,6 +602,14 @@ osmt_run_mcast_flow( IN osmtest_t * cons
 cl_ntoh64(mc_req_rec.mgid.unicast.interface_id),
 (mc_req_rec.scope_state & 0x0F),
 ib_get_err_str(status));
+if (status != IB_SUCCESS)
+{
+   osm_log( &p_osmt->log, OSM_LOG_ERROR,
+"osmt_run_mcast_flow : ERR 02B3: "
+"Failed joining existing IPoIB MCGroup - got %s\n",
+ib_get_err_str(status));
+   goto Exit;
+}
 /* Check MTU & Rate Value and resend with SA suggested values */
 p_mc_res = ib_sa_mad_get_payload_ptr(&res_sa_mad);
 
@@ -1473,7 +1498,7 @@ osmt_run_mcast_flow( IN osmtest_t * cons
   {
 osm_log( &p_osmt->log, OSM_LOG_ERROR,
  "osmt_run_mcast_flow: ERR 02A5: "
- "Failed to create MCG for MGID=0 with higher than minimum RATE\n",
+ "Failed to create MCG for MGID=0 with higher than minimum RATE - got %s/%s\n",
  ib_get_err_str( status ),
  ib_get_mad_status_str( (ib_mad_t*)(&res_sa_mad) )
  );
@@ -1518,7 +1543,7 @@ osmt_run_mcast_flow( IN osmtest_t * cons
   {
 osm_log( &p_osmt->log, OSM_LOG_ERROR,
  "osmt_run_mcast_flow: ERR 0211: "
- "Failed to create MCG for MGID=0 with less than highest RATE\n",
+ "Failed to create MCG for MGID=0 with less than highest RATE - got %s/%s\n",
  ib_get_err_str( status ),
  ib_get_mad_status_str( (

Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread glebn
On Tue, Oct 17, 2006 at 10:34:23AM -0500, Tang, Changqing wrote:
> 
> >3. Fork support from kernel 2.6.12 and above is available 
> >provided that applications do not use threads. The fork() is 
> >supported as long as parent process does not run before child 
> >exits or calls exec().
> 
> After fork(), in child, before exec(), can we call printf(), putenv(),
> or even re-direct stdout/stderr ?
> 
The child can do whatever it wants (except use verbs), but the parent
can't use verbs until the child exits or calls exec().
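
The rule above can be sketched roughly as follows. This is only an
illustration under the constraints stated in this thread, not code from
OFED: the child limits itself to ordinary libc calls (putenv(), stdout
and stderr redirection) before exec(), and the parent blocks in
waitpid() before it would touch verbs again. The function name
spawn_and_wait(), the environment entry, and the log_path parameter are
all hypothetical.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, let the child adjust its environment and
 * output streams, then exec. The parent does nothing (in particular, no
 * verbs calls) until waitpid() reports that the child is gone.
 * Returns the child's exit status, or -1 on error. */
int spawn_and_wait(const char *path, char *const argv[], const char *log_path)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: ordinary libc calls only, no verbs. */
        static char env_entry[] = "SPAWNED_BY_PARENT=1";
        putenv(env_entry);

        if (log_path) {
            /* Redirect stdout and stderr to a file before exec. */
            int fd = open(log_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
            if (fd < 0)
                _exit(126);
            dup2(fd, STDOUT_FILENO);
            dup2(fd, STDERR_FILENO);
            close(fd);
        }
        execv(path, argv);
        _exit(127);             /* reached only if exec failed */
    }

    /* Parent: block here; verbs may be used again after this returns. */
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Blocking in waitpid() is the simplest way to guarantee that the parent
does not run before the child exits; an application that cannot block
would need its own means of knowing that the child has already called
exec().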

--
Gleb.




Re: [openib-general] Tools for development

2006-10-17 Thread Matt Leininger
On Tue, 2006-10-17 at 07:49 -0700, Roland Dreier wrote:
> Michael> The tool versions installed on openib are ancient.  Can
> Michael> site admins please install latest svn and git versions
> Michael> from source?
> 
> What distro is on the new openfabrics.org server?

  Ubuntu.

>  If it's something
> like Fedora or Ubuntu, then it would probably be better to install the
> distro's versions of svn and git, so that keeping up with security
> updates is easier.

  Developers had requested git 1.4, but Ubuntu had an older version.  We
went ahead and installed git from source.  I'd prefer to stick to Ubuntu
packages if possible.

  - Matt





Re: [openib-general] [openfabrics-ewg] OFED 1.1 RC7 fork() issue.

2006-10-17 Thread Tang, Changqing

>3. Fork support from kernel 2.6.12 and above is available 
>provided that applications do not use threads. The fork() is 
>supported as long as parent process does not run before child 
>exits or calls exec().

After fork(), in child, before exec(), can we call printf(), putenv(),
or even re-direct stdout/stderr ?

--CQ


>The former can be achieved by calling wait(childpid); the latter 
>can be achieved by application-specific means.  The POSIX system() 
>call is supported.
>
>Something along those lines.
>
>--
>   Gleb.
>




Re: [openib-general] ethtool support for ipoib

2006-10-17 Thread Shirley Ma

"Michael S. Tsirkin" <[EMAIL PROTECTED]> wrote on 10/16/2006 11:12:03 PM:

> Quoting r. Shirley Ma <[EMAIL PROTECTED]>:
> > /* can be added later once ipoib support sg
> > .get_sg = ethtool_op_get_sg,
> > .set_sg = ethtool_op_set_sg,
> > */
> 
> The difficulty here is that sg currently requires checksum offloading in
> netdevice.
> 
> -- 
> MST

I read the discussion on net-dev. Since an IB packet has its own CRCs
(ICRC, VCRC), is it a good idea to enable "checksum unnecessary" in a
pure IB fabric for a large 64K MTU? It requires some negotiation. Does
your prototype implementation for large MTU require agreement from both
ends? Practically it can be implemented, but I don't know what the RFCs
define.

Re: [openib-general] [PATCH] use mmiowb after doorbell ring

2006-10-17 Thread Michael S. Tsirkin
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] use mmiowb after doorbell ring
> 
> OK, here's what I actually put in my tree.  Can you eyeball this and
> maybe give it a quick test?  If it looks good to you, I'll send it on
> to the stable team for 2.6.18.x.

BTW, will something like this be needed for userspace too?

-- 
MST



