Re: [openib-general] RHEL 4 U3 - lost completions
On Tue, Oct 03, 2006 at 07:58:50AM +0200, Or Gerlitz wrote: > Bill Hartner wrote: > > > > Roland Dreier wrote: > >> Bill> At 1st, I thought that was the case, a fork, however, I do > >> Bill> not think get_user_pages(), and the increment of the ref > >> Bill> count, will guarantee the page struct does not change for > >> Bill> RHEL 4 U3, I need to verify that though. > >> > >> Are you doing a fork()? If so then, yes, you will not be able to make > >> your app work on a RHEL4 kernel. After get_user_pages(), if you do a > >> fork() then a copy-on-write will still happen, which will cause the > >> physical page to move as you have discovered. > > > > There is no fork that I am aware of in the code. The pthread that > > created the EVD and any other thread in the process that executes the > > debug code sees the changed page struct. I will try to recreate this in > > a test app. > AFAIR there is a bug in kernel 2.6.9 that makes it possible for page to be changed in process's VM even though it is locked by get_user_pages(). That is why Mellanox driver used mlock() in addition to get_user_pages(). I think this bug was fixed somewhere around 2.6.11. -- Gleb. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] [Bug 261] New: can't configure IPoIB pkey interfaces at boot time
http://openib.org/bugzilla/show_bug.cgi?id=261

Summary: can't configure IPoIB pkey interfaces at boot time
Product: OpenFabrics Linux
Version: 1.1rc6
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P3
Component: IPoIB
AssignedTo: [EMAIL PROTECTED]
ReportedBy: [EMAIL PROTECTED]

It would be nice if /etc/init.d/openibd could configure pkey interfaces like ib0.8001, perhaps using config files like ifcfg-ib0.8001.

--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.
[openib-general] IB multicast
Hi,

I have a configuration of 6 nodes connected through a single IB switch. I am sending data from one node to the remaining 5 nodes using IB multicast, and I get a bandwidth of no more than 1.5 - 2 Gbps. I was expecting around 10 Gbps (i.e. the same as point-to-point bandwidth). Bandwidth here is defined as (total data sent from the source) / (time until completion acks have arrived from all 5 nodes that receive the source data).

1. What could be the reason?
2. What is the expected bandwidth?

regards,
-Chev
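For reference, the bandwidth definition in the message above can be written as a small C helper. This is only a sketch of the arithmetic: the function name mcast_bw_gbps and its parameters are made up for illustration, and it counts the bytes sent by the source once, not once per receiver.

```c
/*
 * Source-side multicast bandwidth in Gbit/s.
 * bytes_sent: total payload bytes sent by the source (counted once).
 * seconds_until_last_ack: time until the last receiver's completion ack.
 * Both names are hypothetical; they are not part of any IB API.
 */
double mcast_bw_gbps(double bytes_sent, double seconds_until_last_ack)
{
    return bytes_sent * 8.0 / seconds_until_last_ack / 1e9;
}
```

With this definition, 1.25e9 bytes acknowledged by all receivers after one second is 10 Gbit/s, and the reported 1.5 - 2 Gbps corresponds to roughly 190 to 250 MB of source data per second.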
Re: [openib-general] [openfabrics-ewg] OFED Status
Quoting r. Robert Walsh <[EMAIL PROTECTED]>:
> Subject: Re: [openfabrics-ewg] OFED Status
>
> The attached patch fixes this problem by deferring creation of our
> diagpkt device until at least one piece of hardware has been found.
>
> Michael: this will fix the OFED testing problem you were seeing.
>
> Roland: please queue for 2.6.19.

Just saw this, thanks, I'll try. Do you want to update the patch following Roland's comments?

--
MST
Re: [openib-general] Drop in performance on Mellanox MT25204 single port DDR HCA
Quoting r. Woodruff, Robert J <[EMAIL PROTECTED]>:
> Subject: Drop in performance on Mellanox MT25204 single port DDR HCA
>
> Hi Roland/Michael,
>
> One of my coworkers in Champaign is seeing a performance
> issue with the latest SVN driver and the OFED 1.1 Mellanox
> driver on certain platforms.
>
> On the older SVN somewhere around 7500 the Mellanox driver
> did not save and restore certain PCI registers before a reset.
> Somewhere around SVN 8000 a patch was added to save and
> restore these registers. However on our Alcolu platform
> this patch causes the MaxReadReq to be set to 128 bytes (rather than 512),
> which limits bandwidth to 650 MBytes/sec. If I remove the
> save/restore of these registers (attached patch), the
> bandwidth is back to where we would expect it, 1250 MBytes/sec.
>
> Is there some problem with this patch or do you think it is
> some BIOS issue in the platform ?

This is a BIOS issue - it should set the MaxReadReq register for maximum performance and stability. As a work-around, you can use the setpci utility to modify MaxReadReq before loading the driver.

Unfortunately, mthca has no way to know which values are legal and which will give the best performance, and previous behaviour was out of spec, reportedly causing stability issues on compliant platforms. I will look into adding this info in release notes.

--
MST
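As background for the work-around above: in the PCI Express Device Control register (offset 0x08 from the start of the PCI Express capability), Max_Read_Request_Size is the 3-bit field in bits 14:12, and an encoding of n means 128 << n bytes. The sketch below shows how a 16-bit register value could be decoded and modified before it is written back with a tool such as setpci. The helper and macro names are hypothetical, and the register values in the note after the code are made up, not taken from the Alcolu platform.

```c
#include <assert.h>
#include <stdint.h>

/* Max_Read_Request_Size: bits 14:12 of the PCIe Device Control register. */
#define DEVCTL_READRQ_SHIFT 12
#define DEVCTL_READRQ_MASK  (0x7u << DEVCTL_READRQ_SHIFT)

/*
 * Decode the field into bytes. Encodings 0..5 mean 128..4096 bytes;
 * 6 and 7 are reserved and are not handled specially here.
 */
unsigned int devctl_get_readrq(uint16_t devctl)
{
    return 128u << ((devctl & DEVCTL_READRQ_MASK) >> DEVCTL_READRQ_SHIFT);
}

/* Return devctl with the field set to `bytes` (a power of two, 128..4096). */
uint16_t devctl_set_readrq(uint16_t devctl, unsigned int bytes)
{
    unsigned int enc = 0;

    assert(bytes >= 128 && bytes <= 4096 && (bytes & (bytes - 1)) == 0);
    while ((128u << enc) < bytes)
        enc++;
    /* Clear the old field, keep every other bit, insert the new encoding. */
    return (uint16_t)((devctl & ~DEVCTL_READRQ_MASK) |
                      (enc << DEVCTL_READRQ_SHIFT));
}
```

For example, a register value of 0x0810 decodes to 128 bytes, and setting 512 bytes turns it into 0x2810. Whether 512 is a legal value on a given platform is exactly the question raised above, so this only illustrates the bit layout.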
Re: [openib-general] rdma_cm branch
Sean Hefty wrote:
> Steve Wise wrote:
>> What is the status of the rdma_cm branch in Roland's infiniband.git
>> tree? It doesn't have the iwarp stuff in it. I'm wondering if it can
>> be merged with the 2.6.19 stuff to create a branch that was iwarp + ucma
>> support? Or is that a dumb idea?
>
> I'm currently working on moving the rdma_cm code that's in svn forward to
> what's upstream. (I was just typing a message on this...) My plan is to ask
> Roland to host one, maybe two, branches in the infiniband.git tree. Here are
> the main pieces missing from the kernel:
>
> 1. We need to add rdma_establish() and expose the rdma_conn_param values as
> part of the connection event. I'm working on a patch for the latter.
>
> 2. We need a ucma branch. To merge upstream, it makes sense to include item 1
> first, but this leads to a conflict with the OFED releases. OFED ABI version 1
> includes RC QP support, but without item 1 changes, and SVN ABI version 2
> includes multicast support.
>
> 3. There have been requests for an rdma_cm branch that includes UD QP /
> multicast.
>
> The cleanest solution from an ABI perspective is to merge multicast support
> before the ucma; however, I'm not sure that makes the most sense for merging
> upstream. Thoughts?

Since the ucma will not make it for the 2.6.19 feature merge window, why not target both the ucma and the cma ud/ud-multicast support for 2.6.20? This way you would be doing one big ABI change and would not carry this HUGE svn/git diff.

As I have mentioned in the other thread, once it makes sense from your schedule to do the patch preparation work, it would be good to push it into the for-2.6.20 branch of Roland's tree, from where it can go to the -mm tree so people can start testing it.

Once the code is in the for-2.6.20 branch, it would also be possible to include it in the OFED 1.2 release, which is expected in December this year.

Or.
[openib-general] Problems with OFED IPoIB HA on SLES10
Vlad, I filed a bug for these issues.

1) If I start IPoIB HA with the ib0 IB port shut down (from the IB switch) and the ib1 IB port enabled, then IPoIB does not work because "ip monitor link all" does not report NO-CARRIER at startup like ipoib_ha.pl is looking for. This is a major hole.

2) /etc/init.d/openibd runs ipoib_ha.pl with its stdout and stderr redirected to /dev/null. Should we run with -v for verbose instead and redirect the output to a log file in /var/log?

# fgrep ipoib_ha.pl /etc/init.d/openibd
ipoib_ha.pl -p ${PRIMARY_IPOIB_DEV} -s ${SECONDARY_IPOIB_DEV} --with-arping --with-multicast > /dev/null 2>&1 &

3) I got IPoIB HA working on SLES 10, but the documentation is a little lacking. Looks like I have to put the same IP address in ifcfg-ib0 and ifcfg-ib1, is this correct?

# pwd
/etc/sysconfig/network
# cat ifcfg-ib0
DEVICE=ib0
BOOTPROTO=static
IPADDR=192.168.2.46
NETMASK=255.255.255.0
# cat ifcfg-ib1
DEVICE=ib1
BOOTPROTO=static
IPADDR=192.168.2.46
NETMASK=255.255.255.0

4) If I shut down the ib0 IB port, I see this from "/usr/local/ofed/bin/ipoib_ha.pl -v --with-arping --with-multicast":

Use of uninitialized value in concatenation (.) or string at /usr/local/ofed/bin/ipoib_ha.pl line 287.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
Re: [openib-general] RHEL 4 U3 - lost completions
Bill Hartner wrote: > > Roland Dreier wrote: >> Bill> At 1st, I thought that was the case, a fork, however, I do >> Bill> not think get_user_pages(), and the increment of the ref >> Bill> count, will guarantee the page struct does not change for >> Bill> RHEL 4 U3, I need to verify that though. >> >> Are you doing a fork()? If so then, yes, you will not be able to make >> your app work on a RHEL4 kernel. After get_user_pages(), if you do a >> fork() then a copy-on-write will still happen, which will cause the >> physical page to move as you have discovered. > > There is no fork that I am aware of in the code. The pthread that > created the EVD and any other thread in the process that executes the > debug code sees the changed page struct. I will try to recreate this in > a test app. Bill - Is there a chance your code uses daemonize()? Roland - If indeed, does it make sense that the problem does not reproduce with single threaded runs? Or. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [openfabrics-ewg] OFED Status
> modprobe would go into the D state and stay there. Why? What was the process stuck sleeping on? > From: Robert Walsh <[EMAIL PROTECTED]> I assume this is supposed to be Signed-off-by: ? > +void ipath_diagpkt_add(void) > +{ > +if (diagpkt_count == 0) > +ipath_cdev_init(IPATH_DIAGPKT_MINOR, > +"ipath_diagpkt", &diagpkt_file_ops, > +&diagpkt_cdev, &diagpkt_class_dev); > + > +diagpkt_count++; > +} This seems dangerous, especially now that we have PCI_MULTITHREAD_PROBE: nothing prevents ipath_cdev_init() from being called twice. Better to use something like test_and_set_bit() to make sure this is done exactly once. - R. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [PATCH 0/10] [RFC] Support for SilverStorm Virtual Ethernet I/O controller (VEx)
Ramachandra> In that case, can you please consider this for the Ramachandra> for-2.6.20 branch ? > I'm happy to keep this in a vex branch or something like that, but as > the emails I just sent show, this is not ready for merging yet (which > is to be expected -- it's never been reviewed). > Thanks. That's pretty much what we are expecting at this early stage. I fully agree that it is not ready for merging yet. We'll work on the items pointed out by the various IB reviewers and then take it from there. It is premature at this point to discuss when, where and how to merge. Madhu
Re: [openib-general] [PATCH 0 of 28] ipath patches for 2.6.19
"Bryan O'Sullivan" <[EMAIL PROTECTED]> writes: > Eric W. Biederman wrote: > >> Have you tested your driver against the -mm tree? > > No. > >> To the best of my knowledge the irq handling of your hypertransport card >> is a complete and total hack that works only by chance. > > And a happy Monday morning to you, too :-) :) >> In the -mm tree I have added a first pass at proper support for the >> hypertranport interrupt capability. As this code is slated to go into >> 2.6.19 could you please test against that? > > I'm on vacation for a few weeks. We'll find someone to do it. Sure. I talked to Dave Olson about this a while ago, and I couldn't get anything happening. Eric ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [openfabrics-ewg] problems running MVAPICH on OFED 1.1 rc6 with SLES10 x86_64
Aviram, can I try Mellanox binary RPMs? Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems > -Original Message- > From: Scott Weitzenkamp (sweitzen) > Sent: Sunday, October 01, 2006 9:31 PM > To: 'Aviram Gutman'; Scott Weitzenkamp (sweitzen) > Cc: OpenFabricsEWG; openib; [EMAIL PROTECTED] > Subject: RE: [openfabrics-ewg] problems running MVAPICH on > OFED 1.1 rc6 with SLES10 x86_64 > > $ uname -a > Linux svbu-qa1850-3 2.6.16.21-0.8-smp #1 SMP Mon Jul 3 > 18:25:39 UTC 2006 x86_64 > x86_64 x86_64 GNU/Linux > $ > /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh > -np 2 192.168.2.46 192.168.2.49 hostname > svbu-qa1850-4 > svbu-qa1850-3 > $ > /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/bin/mpirun_rsh > -np 2 192.168.2.46 192.168.2.49 > /usr/local/ofed/mpi/gcc/mvapich-0.9.7-mlx2.2.0/tests/osu_bench > marks-2.2/osu_latency > > The last command just hangs. Can I try your binary RPMs? > > Scott Weitzenkamp > SQA and Release Manager > Server Virtualization Business Unit > Cisco Systems > > > > -Original Message- > > From: Aviram Gutman [mailto:[EMAIL PROTECTED] > > Sent: Sunday, October 01, 2006 2:29 AM > > To: Scott Weitzenkamp (sweitzen) > > Cc: OpenFabricsEWG; openib; [EMAIL PROTECTED] > > Subject: Re: [openfabrics-ewg] problems running MVAPICH on > > OFED 1.1 rc6 with SLES10 x86_64 > > > > Can you please elaborate on MVAPICH issues, can you send > > command line? > > We ran it here on 32 Opteron nodes each quad core and also rigorous > > tests on the many other nodes? > > > > > > > > Scott Weitzenkamp (sweitzen) wrote: > > > We are just getting started with OFED testing on SLES10, first > > > platform is x86_64. > > > > > > IPoIB, SDP, SRP, Open MPI, HP MPI, and Intel MPI are > > working so far. > > > MVAPICH with OSU benchmarks just hang.This same > hardware works > > > fine with OFED and RHEL4 U3. > > > > > > Has anyone else seen this? 
> > > > > > Scott Weitzenkamp > > > SQA and Release Manager > > > Server Virtualization Business Unit > > > Cisco Systems > > > > > > > > -- > > -- > > > > > > ___ > > > openfabrics-ewg mailing list > > > [EMAIL PROTECTED] > > > http://openib.org/mailman/listinfo/openfabrics-ewg > > > > > > ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [openfabrics-ewg] OFED Status
The attached patch fixes this problem by deferring creation of our diagpkt device until at least one piece of hardware has been found. Michael: this will fix the OFED testing problem you were seeing. Roland: please queue for 2.6.19. Regards, Robert.
Re: [openib-general] [openfabrics-ewg] OFED Status
Robert Walsh wrote: > The attached patch fixes this problem by deferring creation of our > diagpkt device until at least one piece of hardware has been found. Of course, if I'd actually attached the patch, it might have been a bit more useful :-) IB/ipath - initialize diagpkt file on device init only Don't attempt to set up the diagpkt device in the module init code. Instead, wait until a piece of hardware is initted. Fixes a problem when loading the ib_ipath module when no InfiniPath hardware is present: modprobe would go into the D state and stay there. From: Robert Walsh <[EMAIL PROTECTED]> diff -r 2ed7140d5700 drivers/infiniband/hw/ipath/ipath_diag.c --- a/drivers/infiniband/hw/ipath/ipath_diag.c Mon Oct 02 16:56:55 2006 -0700 +++ b/drivers/infiniband/hw/ipath/ipath_diag.c Mon Oct 02 16:58:29 2006 -0700 @@ -286,17 +286,23 @@ static struct file_operations diagpkt_fi static struct cdev *diagpkt_cdev; static struct class_device *diagpkt_class_dev; - -int __init ipath_diagpkt_add(void) -{ - return ipath_cdev_init(IPATH_DIAGPKT_MINOR, - "ipath_diagpkt", &diagpkt_file_ops, - &diagpkt_cdev, &diagpkt_class_dev); -} - -void __exit ipath_diagpkt_remove(void) -{ - ipath_cdev_cleanup(&diagpkt_cdev, &diagpkt_class_dev); +static int diagpkt_count; + +void ipath_diagpkt_add(void) +{ + if (diagpkt_count == 0) + ipath_cdev_init(IPATH_DIAGPKT_MINOR, + "ipath_diagpkt", &diagpkt_file_ops, + &diagpkt_cdev, &diagpkt_class_dev); + + diagpkt_count++; +} + +void ipath_diagpkt_remove(void) +{ + diagpkt_count--; + if (diagpkt_count == 0) + ipath_cdev_cleanup(&diagpkt_cdev, &diagpkt_class_dev); } /** diff -r 2ed7140d5700 drivers/infiniband/hw/ipath/ipath_driver.c --- a/drivers/infiniband/hw/ipath/ipath_driver.cMon Oct 02 16:56:55 2006 -0700 +++ b/drivers/infiniband/hw/ipath/ipath_driver.cMon Oct 02 17:00:39 2006 -0700 @@ -559,6 +559,7 @@ static int __devinit ipath_init_one(stru ipathfs_add_device(dd); ipath_user_add(dd); ipath_diag_add(dd); + ipath_diagpkt_add(); 
ipath_register_ib_device(dd); /* Check that we have a LID in LID_TIMEOUT seconds. */ @@ -700,6 +701,7 @@ static void __devexit ipath_remove_one(s if (dd->verbs_dev) ipath_unregister_ib_device(dd->verbs_dev); + ipath_diagpkt_remove(); ipath_diag_remove(dd); ipath_user_remove(dd); ipathfs_remove_device(dd); @@ -2183,17 +2185,7 @@ static int __init infinipath_init(void) goto bail_group; } - ret = ipath_diagpkt_add(); - if (ret < 0) { - printk(KERN_ERR IPATH_DRV_NAME ": Unable to create " - "diag data device: error %d\n", -ret); - goto bail_ipathfs; - } - goto bail; - -bail_ipathfs: - ipath_exit_ipathfs(); bail_group: ipath_driver_remove_group(&ipath_driver.driver); diff -r 2ed7140d5700 drivers/infiniband/hw/ipath/ipath_kernel.h --- a/drivers/infiniband/hw/ipath/ipath_kernel.hMon Oct 02 16:56:55 2006 -0700 +++ b/drivers/infiniband/hw/ipath/ipath_kernel.hMon Oct 02 16:58:29 2006 -0700 @@ -889,7 +889,7 @@ void ipath_device_remove_group(struct de void ipath_device_remove_group(struct device *, struct ipath_devdata *); int ipath_expose_reset(struct device *); -int ipath_diagpkt_add(void); +void ipath_diagpkt_add(void); void ipath_diagpkt_remove(void); int ipath_init_ipathfs(void); ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
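The review in this thread points out that the plain diagpkt_count check in ipath_diagpkt_add() does not stop two concurrent probes from both calling ipath_cdev_init(), and suggests something like test_and_set_bit(). The sketch below shows the same idea in userspace C11, using atomic_flag_test_and_set() as a stand-in for the kernel primitive. It is not the driver code: fake_cdev_init(), diagpkt_add_once() and the call counter are hypothetical names used only for illustration.

```c
#include <stdatomic.h>

/* Set by the first caller; later callers see it already set. */
static atomic_flag diagpkt_created = ATOMIC_FLAG_INIT;

/* Counts how often the stand-in init routine actually ran. */
int cdev_init_calls;

/* Hypothetical stand-in for ipath_cdev_init(); it only records the call. */
static void fake_cdev_init(void)
{
    cdev_init_calls++;
}

void diagpkt_add_once(void)
{
    /*
     * Test-and-set returns the previous value atomically, so exactly one
     * caller observes "false" and performs the initialization.
     */
    if (!atomic_flag_test_and_set(&diagpkt_created))
        fake_cdev_init();
}
```

Calling diagpkt_add_once() any number of times, from any number of threads, runs the init routine once. Note that this only covers the add side; a caller that loses the race does not wait for the winner to finish, and the matching remove path would still need its own accounting.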
Re: [openib-general] Kernel Oops in user-mad, mad
Hal Rosenstock wrote:
>> Is there a possibility that there is a double deletion from a list
>> somewhere?
>
> Perhaps but I don't see it. Sean ? Roland ?

I looked at this and couldn't find anything obviously wrong. I was waiting to hear back on Michael's question about module unload being involved.

- Sean
Re: [openib-general] Kernel Oops in user-mad, mad
On Sun, 2006-10-01 at 05:53, Jack Morgenstein wrote: > We received the following kernel Oops while running regression > (see console picture attached). > > This looks like a possible race condition between handling umad send > completions > and ib_unregister_mad_agent. > > The Oops is at the list_del line of dequeue_send (user_mad.c: 186) > Note that ib_unregister_mad_agent invokes unregister_mad_agent->cancel_mads > -> agent send handler. > > Is there a possibility that there is a double deletion from a list somewhere? Perhaps but I don't see it. Sean ? Roland ? -- Hal > Jack ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] rdma_cm branch
Steve Wise wrote:
> What is the status of the rdma_cm branch in Roland's infiniband.git
> tree? It doesn't have the iwarp stuff in it. I'm wondering if it can
> be merged with the 2.6.19 stuff to create a branch that was iwarp + ucma
> support? Or is that a dumb idea?

I'm currently working on moving the rdma_cm code that's in svn forward to what's upstream. (I was just typing a message on this...) My plan is to ask Roland to host one, maybe two, branches in the infiniband.git tree. Here are the main pieces missing from the kernel:

1. We need to add rdma_establish() and expose the rdma_conn_param values as part of the connection event. I'm working on a patch for the latter.

2. We need a ucma branch. To merge upstream, it makes sense to include item 1 first, but this leads to a conflict with the OFED releases. OFED ABI version 1 includes RC QP support, but without item 1 changes, and SVN ABI version 2 includes multicast support.

3. There have been requests for an rdma_cm branch that includes UD QP / multicast.

The cleanest solution from an ABI perspective is to merge multicast support before the ucma; however, I'm not sure that makes the most sense for merging upstream. Thoughts?

- Sean
Re: [openib-general] [openfabrics-ewg] OFED Status
Woodruff, Robert J wrote: >Aviram wrote, > > >>Pending that IPoIB HA is solved would like to issue RC7 that suppose to >> >> >>be final. Is everyone OK with this approach? >> >> >>Aviram >> >> > >Sounds good, > >What is the target date for RC7 ? > > Do we have a new target date? ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] [PATCH ofed-1.1 2/2] ehca: improved ehca debug format
Michael, here is the 2nd patch of ehca with a small format improvement in the ehca debug function. It would be great if we could include it for ofed-1.1. Note that I created this patch against the dir openib-1.1 extracted from ofed-1.1-rc6/SOURCES/openib-1.1.tgz.

Thanks!
Nam Nguyen

Signed-off-by: Hoang-Nam Nguyen <[EMAIL PROTECTED]>
---
 ehca_tools.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -Nurp openib-1.1_orig/drivers/infiniband/hw/ehca/ehca_tools.h openib-1.1_work/drivers/infiniband/hw/ehca/ehca_tools.h
--- openib-1.1_orig/drivers/infiniband/hw/ehca/ehca_tools.h 2006-09-20 06:28:56.0 -0700
+++ openib-1.1_work/drivers/infiniband/hw/ehca/ehca_tools.h 2006-10-02 09:29:53.0 -0700
@@ -117,7 +117,7 @@ extern int ehca_debug_level;
 	unsigned int l = (unsigned int)(len); \
 	unsigned char *deb = (unsigned char*)(adr); \
 	for (x = 0; x < l; x += 16) { \
-		printk("EHCA_DMP:%s" format \
+		printk("EHCA_DMP:%s " format \
 		       " adr=%p ofs=%04x %016lx %016lx\n", \
 		       __FUNCTION__, ##args, deb, x, \
 		       *((u64 *)&deb[0]), *((u64 *)&deb[8])); \
[openib-general] [PATCH ofed-1.1 1/2] ehca: fix ehca device registration
Hi Michael! Please consider this ehca patch for ofed-1.1, as it fixes a bug (crash) that occurred when ib_ehca is loaded after ib_ipoib. This patch initializes struct ehca_shca with struct device*, then creates internal resources and finally registers the ehca IB device, which is the proper sequence to implement. I wanted to create this patch against the ofed git tree branch ehca_branch, but saw that ehca_main.c there has version SVNEHCA_0012, which is much older than the version SVNEHCA_0015 in ofed-1.1-rc6. I tried to do a pull and git said that it is already up to date, so I don't know what I did wrong. Anyway, I created this patch against the dir openib-1.1 extracted from ofed-1.1-rc6/SOURCES/openib-1.1.tgz. I hope that it still works for you. Thanks! Nam Nguyen Signed-off-by: Hoang-Nam Nguyen <[EMAIL PROTECTED]> --- ehca_main.c | 35 +++ 1 file changed, 19 insertions(+), 16 deletions(-) diff -Nurp openib-1.1_orig/drivers/infiniband/hw/ehca/ehca_main.c openib-1.1_work/drivers/infiniband/hw/ehca/ehca_main.c --- openib-1.1_orig/drivers/infiniband/hw/ehca/ehca_main.c 2006-09-20 06:28:56.0 -0700 +++ openib-1.1_work/drivers/infiniband/hw/ehca/ehca_main.c 2006-10-02 15:24:48.010001888 -0700 @@ -5,6 +5,7 @@ * * Authors: Heiko J Schick <[EMAIL PROTECTED]> * Hoang-Nam Nguyen <[EMAIL PROTECTED]> + * Joachim Fenkes <[EMAIL PROTECTED]> * * Copyright (c) 2005 IBM Corporation * @@ -238,7 +239,7 @@ init_node_guid1: return ret; } -int ehca_register_device(struct ehca_shca *shca) +int ehca_init_device(struct ehca_shca *shca) { int ret; @@ -316,11 +317,6 @@ int ehca_register_device(struct ehca_shc /* shca->ib_device.process_mad = ehca_process_mad; */ shca->ib_device.mmap = ehca_mmap; - ret = ib_register_device(&shca->ib_device); - if (ret) - ehca_err(&shca->ib_device, -"ib_register_device() failed ret=%x", ret); - return ret; } @@ -446,7 +442,7 @@ static ssize_t ehca_show_##name(struct kfree(rblock);\ return 0;\ } \ - \ +\ data = rblock->name; \ kfree(rblock); \ \ @@ -560,9 +556,9 
@@ static int __devinit ehca_probe(struct i goto probe1; } - ret = ehca_register_device(shca); + ret = ehca_init_device(shca); if (ret) { - ehca_gen_err("Cannot register Infiniband device"); + ehca_gen_err("Cannot init ehca device struct"); goto probe1; } @@ -570,7 +566,7 @@ static int __devinit ehca_probe(struct i ret = ehca_create_eq(shca, &shca->eq, EHCA_EQ, 2048); if (ret) { ehca_err(&shca->ib_device, "Cannot create EQ."); - goto probe2; + goto probe1; } ret = ehca_create_eq(shca, &shca->neq, EHCA_NEQ, 513); @@ -599,6 +595,13 @@ static int __devinit ehca_probe(struct i goto probe5; } + ret = ib_register_device(&shca->ib_device); + if (ret) { + ehca_err(&shca->ib_device, +"ib_register_device() failed ret=%x", ret); + goto probe6; + } + /* create AQP1 for port 1 */ if (ehca_open_aqp1 == 1) { shca->sport[0].port_state = IB_PORT_DOWN; @@ -606,7 +609,7 @@ static int __devinit ehca_probe(struct i if (ret) { ehca_err(&shca->ib_device, "Cannot create AQP1 for port 1."); - goto probe6; + goto probe7; } } @@ -617,7 +620,7 @@ static int __devinit ehca_probe(struct i if (ret) { ehca_err(&shca->ib_device, "Cannot create AQP1 for port 2."); - goto probe7; + goto probe8; } } @@ -629,12 +632,15 @@ static int __devinit ehca_probe(struct i return 0; -probe7: +probe8: ret = ehca_destroy_aqp1(&shca->sport[0]); if (ret) ehca_err(&shca->ib_device, "Cannot destroy AQP1 for port 1. ret=%x", ret); +probe7: + ib_unregister_device(&shca->ib_device); + probe6: ret = ehca_dereg_internal_maxmr(shca); if (ret) @@ -659,9 +665,6 @@ probe3: ehca_err(&shca->ib_device, "Cannot destroy EQ. 
ret=%x", ret); -probe2: - ib_unregister_device(&shca->ib_device); - probe1: ib_dealloc_device(&shca->ib_device);
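The ordering this patch describes (set up the device structure, create the internal resources, register with the IB core last, and unwind in reverse order on failure) can be sketched in a few lines of standalone C. This is an illustration of the goto-unwind pattern only, not the ehca code: setup(), demo_probe(), the step names and the probe_trace buffer are all hypothetical.

```c
#include <string.h>

/* Records setup and teardown steps in the order they happen. */
char probe_trace[96];

static void note(const char *s)
{
    strcat(probe_trace, s);
}

/* Hypothetical stand-in for one setup call: fails with -1 when told to. */
static int setup(const char *name, int fail)
{
    if (fail)
        return -1;
    note(name);
    return 0;
}

/* fail_at selects which step (1..4) fails; 0 means every step succeeds. */
int demo_probe(int fail_at)
{
    int ret;

    probe_trace[0] = '\0';

    ret = setup("alloc ", fail_at == 1);
    if (ret)
        goto out;
    ret = setup("eq ", fail_at == 2);
    if (ret)
        goto undo_alloc;
    ret = setup("mr ", fail_at == 3);
    if (ret)
        goto undo_eq;
    /* Registration comes last: only now can clients see the device. */
    ret = setup("register ", fail_at == 4);
    if (ret)
        goto undo_mr;
    return 0;

    /* Each label undoes one step and falls through to the earlier ones. */
undo_mr:
    note("~mr ");
undo_eq:
    note("~eq ");
undo_alloc:
    note("~alloc ");
out:
    return ret;
}
```

If registration fails (fail_at == 4), the trace reads "alloc eq mr ~mr ~eq ~alloc ", so nothing that a client such as ib_ipoib could have touched is left half-initialized. Registering first, as the old code did, would expose the device before its event queues and memory region exist.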
Re: [openib-general] rdma_cm branch
Steve> Hey Roland/Sean, What is the status of the rdma_cm branch Steve> in Roland's infiniband.git tree? It doesn't have the iwarp Steve> stuff in it. I'm wondering if it can be merged with the Steve> 2.6.19 stuff to create a branch that was iwarp + ucma Steve> support? Or is that a dumb idea? I'm waiting for a ucma patch from Sean to fix things up. What's there doesn't even build... - R.
[openib-general] rdma_cm branch
Hey Roland/Sean, What is the status of the rdma_cm branch in Roland's infiniband.git tree? It doesn't have the iwarp stuff in it. I'm wondering if it can be merged with the 2.6.19 stuff to create a branch that was iwarp + ucma support? Or is that a dumb idea? Steve.
Re: [openib-general] [PATCH] IB/SRP: Enable multichannel
Michael> We could just let the user specify the Id Ext when adding Michael> the device. How does this sound? Yes, that makes the most sense -- just add another optional option for use when adding a target. - R.
[openib-general] [GIT PULL] please pull infiniband.git
Linus, please pull from master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband.git for-linus This tree is also available from kernel.org mirrors at: git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git for-linus We're through the bulk of our 2.6.19 merge, but this will get some fixes for drivers and the RDMA CM: Hoang-Nam Nguyen: IB/ehca: Fix device registration IB/ehca: Tweak trace message format Krishna Kumar: RDMA/cma: Fix leak of cm_ids in case of failures RDMA/cma: Fix device removal race RDMA/cma: Eliminate unnecessary remove_list RDMA/cma: Optimize error handling Ralph Campbell: IB/ipath: Fix RDMA reads Sean Hefty: RDMA/cma: Set status correctly on route resolution error drivers/infiniband/core/cma.c | 47 +++-- drivers/infiniband/hw/ehca/ehca_main.c | 36 ++- drivers/infiniband/hw/ehca/ehca_tools.h |2 + drivers/infiniband/hw/ipath/ipath_rc.c | 59 +-- 4 files changed, 80 insertions(+), 64 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 1178bd4..9ae4f3a 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -874,23 +874,25 @@ static struct rdma_id_private *cma_new_i __u16 port; u8 ip_ver; + if (cma_get_net_info(ib_event->private_data, listen_id->ps, +&ip_ver, &port, &src, &dst)) + goto err; + id = rdma_create_id(listen_id->event_handler, listen_id->context, listen_id->ps); if (IS_ERR(id)) - return NULL; + goto err; + + cma_save_net_info(&id->route.addr, &listen_id->route.addr, + ip_ver, port, src, dst); rt = &id->route; rt->num_paths = ib_event->param.req_rcvd.alternate_path ? 
2 : 1; - rt->path_rec = kmalloc(sizeof *rt->path_rec * rt->num_paths, GFP_KERNEL); + rt->path_rec = kmalloc(sizeof *rt->path_rec * rt->num_paths, + GFP_KERNEL); if (!rt->path_rec) - goto err; + goto destroy_id; - if (cma_get_net_info(ib_event->private_data, listen_id->ps, -&ip_ver, &port, &src, &dst)) - goto err; - - cma_save_net_info(&id->route.addr, &listen_id->route.addr, - ip_ver, port, src, dst); rt->path_rec[0] = *ib_event->param.req_rcvd.primary_path; if (rt->num_paths == 2) rt->path_rec[1] = *ib_event->param.req_rcvd.alternate_path; @@ -903,8 +905,10 @@ static struct rdma_id_private *cma_new_i id_priv = container_of(id, struct rdma_id_private, id); id_priv->state = CMA_CONNECT; return id_priv; -err: + +destroy_id: rdma_destroy_id(id); +err: return NULL; } @@ -932,6 +936,7 @@ static int cma_req_handler(struct ib_cm_ mutex_unlock(&lock); if (ret) { ret = -ENODEV; + cma_exch(conn_id, CMA_DESTROYING); cma_release_remove(conn_id); rdma_destroy_id(&conn_id->id); goto out; @@ -1307,6 +1312,7 @@ static void cma_query_handler(int status work->old_state = CMA_ROUTE_QUERY; work->new_state = CMA_ADDR_RESOLVED; work->event.event = RDMA_CM_EVENT_ROUTE_ERROR; + work->event.status = status; } queue_work(cma_wq, &work->work); @@ -1862,6 +1868,11 @@ static int cma_connect_ib(struct rdma_id ret = ib_send_cm_req(id_priv->cm_id.ib, &req); out: + if (ret && !IS_ERR(id_priv->cm_id.ib)) { + ib_destroy_cm_id(id_priv->cm_id.ib); + id_priv->cm_id.ib = NULL; + } + kfree(private_data); return ret; } @@ -1889,10 +1900,8 @@ static int cma_connect_iw(struct rdma_id cm_id->remote_addr = *sin; ret = cma_modify_qp_rtr(&id_priv->id); - if (ret) { - iw_destroy_cm_id(cm_id); - return ret; - } + if (ret) + goto out; iw_param.ord = conn_param->initiator_depth; iw_param.ird = conn_param->responder_resources; @@ -1904,6 +1913,10 @@ static int cma_connect_iw(struct rdma_id iw_param.qpn = conn_param->qp_num; ret = iw_cm_connect(cm_id, &iw_param); out: + if (ret && !IS_ERR(cm_id)) { + 
iw_destroy_cm_id(cm_id); + id_priv->cm_id.iw = NULL; + } return ret; } @@ -2142,12 +2155,9 @@ static int cma_remove_id_dev(struct rdma static void cma_process_remove(struct cma_device *cma_dev) { - struct list_head remove_list; struct rdma_id_private *id_priv; int ret; - INIT_LIST_HEAD(&remove_list); - mutex_lock(&lock); while (!list_empty(&cma_dev->id_list)) {
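An illustrative aside on the cma_new_id hunk above (not part of the pull request): the patch parses the network info before creating the id and splits the single err label in two, so each failure path undoes only what has been set up so far. Below is a minimal userspace sketch of that label ordering. The names conn, conn_new, conn_create and conn_destroy are hypothetical stand-ins, not kernel APIs, and malloc/calloc stand in for kmalloc.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the rdma_cm id and its path records. */
struct conn {
	int *paths;
};

static int destroyed;	/* counts conn_destroy() calls, for illustration */

static struct conn *conn_create(void)
{
	return calloc(1, sizeof(struct conn));
}

static void conn_destroy(struct conn *c)
{
	free(c->paths);
	free(c);
	destroyed++;
}

/*
 * parse_ok models the early "get net info" step succeeding,
 * alloc_ok models the path record allocation succeeding.
 */
static struct conn *conn_new(int parse_ok, int alloc_ok)
{
	struct conn *c;

	if (!parse_ok)
		goto err;		/* nothing allocated yet */

	c = conn_create();
	if (!c)
		goto err;

	c->paths = alloc_ok ? malloc(2 * sizeof *c->paths) : NULL;
	if (!c->paths)
		goto destroy_conn;	/* the object exists, so destroy it */

	return c;

destroy_conn:
	conn_destroy(c);
err:
	return NULL;
}
```

The labels appear in reverse order of setup, so a jump to a later label skips the cleanup for steps that never ran.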
Re: [openib-general] [PATCH 2.6.19-rc1 2/2] ehca: improved ehca debug format
Thanks, applied both patches.
Re: [openib-general] [PATCH 0/10] [RFC] Support for SilverStorm Virtual Ethernet I/O controller (VEx)
> From: Scott Weitzenkamp (sweitzen)
> Sent: Monday, October 02, 2006 4:22 PM
> To: Kuchimanchi, Ramachandra; Roland Dreier (rdreier)
> Cc: openib-General
> Subject: Re: [openib-general] [PATCH 0/10] [RFC] Support for SilverStorm
> Virtual Ethernet I/O controller (VEx)
>
> Is this communication protocol documented anywhere? How does this
> feature compare to IPoIB and SDP?

This protocol is distinct from IPoIB and SDP. In brief:

IPoIB treats an IB fabric as a LAN. As such it has UD semantics.

SDP essentially treats the HCA as a TOE and leverages IB's RC semantics to emulate TCP/IP SOCK_STREAM sockets.

This protocol implements the interface used to communicate with the SilverStorm VEx Ethernet Virtual IO Controllers. The VEx card presents a true Ethernet NIC to the host and essentially treats IB as an IO bus to allow a host CPU to use the VEx card as its NIC.

Todd Rimmer
Re: [openib-general] Drop in performance on Mellanox MT25204 single port DDR HCA
    Robert> Yes. 1250Mbytes/sec is what we expect. You say the 128
    Robert> value comes from the BIOS? If so, we need to discuss this
    Robert> with our BIOS team to find out why they limit it to 128,
    Robert> perhaps it is a BIOS bug.

Yes, I believe that the BIOS is the only place that would set that value. We know that resetting the device makes it go back to a different default value, and nothing in the kernel that I know of is going to set it down to 128.

 - R.
Re: [openib-general] Drop in performance on Mellanox MT25204 single port DDR HCA
Roland wrote,
>Is that good? I lost track from the beginning of the thread.
>I would suggest working with your platform people to figure out why
>the BIOS is setting the PCI Express parameters to non-optimal values.
> - R.

Yes. 1250Mbytes/sec is what we expect. You say the 128 value comes from the BIOS? If so, we need to discuss this with our BIOS team to find out why they limit it to 128; perhaps it is a BIOS bug.

woody
Re: [openib-general] Drop in performance on Mellanox MT25204 single port DDR HCA
> Adding:
> Options ib_mthca tune_pci=1
>
> Puts MaxReadReq = 4096.
>
> I get 1250MB/s bandwidth.

Is that good? I lost track from the beginning of the thread.

I would suggest working with your platform people to figure out why the BIOS is setting the PCI Express parameters to non-optimal values.

 - R.
Re: [openib-general] [PATCH 9/10] Driver utility file - implements various utility macros
Ramachandra K wrote:

> +#define PRINT(level, x, fmt, arg...) \
> +	printk(level "%s: %s: %s, line %d: " fmt, \
> +	       MODULE_NAME, x, __FILE__, __LINE__, ##arg)

Use dev_info and friends instead of printk.

> +#define hton8(x) (x)
> +#define hton16(x) __cpu_to_be16(x)
> +#define hton32(x) __cpu_to_be32(x)
> +#define hton64(x) __cpu_to_be64(x)

Drop these macros.

> +#define get_sksport(sk) inet_sk(sk)->sport
> +#define get_skdport(sk) inet_sk(sk)->dport

And these.

> +typedef unsigned long uintn; /* __WORDSIZE/pointer sized integer */

And this typedef.

> +/* round down value to align, align must be a power of 2 */
> +#ifndef ROUNDDOWNP2
> +#define ROUNDDOWNP2(val, align) \
> +	(((uintn)(val)) & (~((uintn)(align)-1)))
> +#endif

Perhaps introduce a generic ALIGN_DOWN macro here.

> +/* round up value to align, align must be a power of 2 */
> +#ifndef ROUNDUPP2
> +#define ROUNDUPP2(val, align) \
> +	(((uintn)(val) + (uintn)(align) - 1) & (~((uintn)(align)-1)))
> +#endif

Use ALIGN instead of this macro.

> +#define BOOLEAN u8
> +#define TRUE 1
> +#define FALSE 0

Yeuch. These have to go.

> +#define MAXU32 0x
> +#define MAXU64 ((u64)(~0ULL))

Drop these.

> +#if BITS_PER_LONG == 64
> +#define PTR64(what) ((u64)(what))
> +#define PTR(what) ((void *)(u64)(what))
> +#elif BITS_PER_LONG == 32
> +#define PTR64(what) ((u64)(u32)(what))
> +#define PTR(what) ((void *)(u32)(what))

And these.

> +#if BITS_PER_LONG == 64
> +#ifdef __ia64__
> +#define __PRI64_PREFIX "l"
> +#else
> +#define __PRI64_PREFIX "ll"
> +#endif
> +#define PRISZT "lu"
> +#elif BITS_PER_LONG == 32
> +#define __PRI64_PREFIX "L"
> +#define PRISZT "u"
> +#else
> +#error "BITS_PER_LONG not 64 nor 32"
> +#endif
> +#define __PRIN_PREFIX "l"

Just cast 64-bit values to unsigned long long, use %lld etc everywhere, and drop all of this.

> +/* source time is 100ths of a sec */
> +#define CONV2JIFFIES(time) (((time) * HZ) / 100)
> +#define CONV2USEC(time) ((time) * 1)
> +
> +#ifndef min
> +#define min(a,b) ((a)<(b)?(a):(b))
> +#endif

Use the standard macros for these.
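For readers following the ROUNDDOWNP2/ROUNDUPP2 comments, here is a small userspace sketch of the power-of-two rounding those macros perform, written as inline functions. The names align_down, align_up and is_power_of_2 are chosen for illustration and are an assumption, not the kernel's definitions; in-tree code would use the kernel's own ALIGN() helper as suggested above.

```c
#include <stdint.h>

/* Nonzero only for 1, 2, 4, 8, ...; zero is explicitly rejected. */
static inline int is_power_of_2(uintptr_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Round val down to a multiple of align; align must be a power of two. */
static inline uintptr_t align_down(uintptr_t val, uintptr_t align)
{
	return val & ~(align - 1);
}

/* Round val up to a multiple of align; align must be a power of two. */
static inline uintptr_t align_up(uintptr_t val, uintptr_t align)
{
	return (val + align - 1) & ~(align - 1);
}
```

Both helpers rely on align - 1 being a mask of the low bits, which only holds when align is a power of two.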
Re: [openib-general] [PATCH 8/10] sysfs interface implementation
Ramachandra K wrote:

> +/*
> + * target eiocs are added by writing
> + *
> + * ioc_guid=,dgid=,pkey=,name=
> + * to the create_primary sysfs attribute.
> + */
> +enum {
> +	VNIC_OPT_ERR = 0,
> +	VNIC_OPT_IOC_GUID = 1 << 0,
> +	VNIC_OPT_DGID = 1 << 1,
> +	VNIC_OPT_PKEY = 1 << 2,
> +	VNIC_OPT_NAME = 1 << 3,
> +	VNIC_OPT_INSTANCE = 1 << 4,
> +	VNIC_OPT_RXCSUM = 1 << 5,
> +	VNIC_OPT_TXCSUM = 1 << 6,
> +	VNIC_OPT_HEARTBEAT = 1 << 7,
> +	VNIC_OPT_ALL = (VNIC_OPT_IOC_GUID |
> +			VNIC_OPT_DGID | VNIC_OPT_NAME | VNIC_OPT_PKEY),
> +};

This is not OK. You can't pass in multiple values to a sysfs file. Either set the values separately or (if they have to be set all at once) find some other way to do this work.

Also, putting all of this parsing cruft in a driver is a sign you're trying to do something you shouldn't be.

> +static int avg_ticks_as_time(cycles_t ticks, u32 count, char *buffer)

Leave out all the pretty printing. Just print a number in standard units, and let userspace do the parsing.

> +static int setup_vnic_stats_files(struct vnic *vnic)
> +{

This code needs to use sysfs_create_group instead.

> +static int create_netpath(struct netpath *n_pdest, struct path_param
> *p_params)
> +{

Why does this not return any error values?

> +struct vnic *create_vnic(struct path_param *param)
> +{

Ditto with the sysfs_create_group.
Re: [openib-general] Drop in performance on Mellanox MT25204 single port DDR HCA
Adding:
Options ib_mthca tune_pci=1

Puts MaxReadReq = 4096.

I get 1250MB/s bandwidth.

--
Peter

-----Original Message-----
From: Woodruff, Robert J
Sent: Monday, October 02, 2006 3:51 PM
To: Roland Dreier; Hartman, Peter
Cc: Michael S. Tsirkin; openib-general; EWG; Hartman, Peter
Subject: RE: Drop in performance on Mellanox MT25204 single port DDR HCA

Roland wrote,
>However tune_pci=1 will make the driver override this setting if you
>really know what you're doing.
> - R.

Peter, can you give this a try? I think you set this in /etc/modprobe.conf; add the line,

options mthca tune_pci=1

Also, we need to understand why the BIOS in your platform is setting it to 128 rather than 512. Is this an oversight in the BIOS, or are they doing it for a reason?

woody
Re: [openib-general] [PATCH 5/10] Implementation of Data path of the communication protocol
Ramachandra K wrote:
> Adds the files that implement the data transfer part of the
> communication protocol with the VEx. The RDMA of ethernet
> packets is implemented in here.

I see no sparse annotations to indicate endianness or user visibility of any data throughout the driver. The driver should pass

make C=1 CF=-D__CHECK_ENDIAN__

cleanly, which it looks like it won't right now.

Also, I see a number of non-standard macros like ntoh16 and so on. Please use the normal cpu_to_be16 etc instead.
Re: [openib-general] Drop in performance on Mellanox MT25204 single port DDR HCA
Roland wrote,
>However tune_pci=1 will make the driver override this setting if you
>really know what you're doing.
> - R.

Peter, can you give this a try? I think you set this in /etc/modprobe.conf; add the line,

options mthca tune_pci=1

Also, we need to understand why the BIOS in your platform is setting it to 128 rather than 512. Is this an oversight in the BIOS, or are they doing it for a reason?

woody
Re: [openib-general] [PATCH 3/10] Driver viport files - implementation of communication protocol with VEx
    Bryan> This looks like a cut-and-paste of the main driver file,
    Bryan> and has the same big problem of a single huge state machine
    Bryan> function and a bunch of tiny trivial stubs that all serve
    Bryan> to obfuscate the code.

Yes, in general it seems like this all could be made quite a bit smaller and easier to understand by removing some of the extraneous layering -- almost all the functions look like trivial pass-throughs to lower layers.

 - R.
Re: [openib-general] [PATCH 3/10] Driver viport files - implementation of communication protocol with VEx
Ramachandra K wrote:
> Adds the driver viport files. These files implement the state machine
> for the communication protocol with the VEx.

This looks like a cut-and-paste of the main driver file, and has the same big problem of a single huge state machine function and a bunch of tiny trivial stubs that all serve to obfuscate the code.
Re: [openib-general] [PATCH 1/10] Driver Main files - netdev functions and corresponding state maintenance
Ramachandra K wrote:

> +#include

Not needed.

> +#include

Not needed.

> +#ifdef CONFIG_INFINIBAND_VNIC_STATS
> +	if (vnic->statistics.conn_time == 0) {
> +		vnic->statistics.conn_time =
> +			get_cycles() - vnic->statistics.start_time;
> +	}
> +	if (vnic->statistics.disconn_ref != 0) {
> +		vnic->statistics.disconn_time +=
> +			get_cycles() - vnic->statistics.disconn_ref;
> +		vnic->statistics.disconn_num++;
> +		vnic->statistics.disconn_ref = 0;
> +	}
> +#endif /* CONFIG_INFINIBAND_VNIC_STATS */

Why does none of your stats code use locks?

> +static int vnic_open(struct net_device *device)
> +{
> +	struct vnic *vnic;
> +	int ret = 0;
> +	struct netpath *np;
> +
> +	VNIC_FUNCTION("vnic_open()\n");
> +	vnic = (struct vnic *)device->priv;
> +	np = vnic->current_path;
> +
> +	if (vnic->state != VNIC_REGISTERED) {
> +		ret = -ENODEV;
> +	}
> +
> +	vnic->open++;
> +	vnic_npevent_queue_evt(&vnic->primary_path, VNIC_NP_SETLINK);
> +	vnic->xmit_started = 1;
> +	netif_start_queue(&vnic->netdevice);
> +
> +	return ret;
> +}

If you're returning an error value, you shouldn't be finishing the open call as if nothing happened.

> +static int vnic_hard_start_xmit(struct sk_buff *skb, struct net_device
> *device)
> +{
>
> +	dev_kfree_skb(skb);
> +	return 0; /* TBD: what should I return? */
> +}

Any non-zero value means "try again".

> +static void vnic_tx_timeout(struct net_device *device)
>
> +	return;

Not needed.

> +static int vnic_do_ioctl(struct net_device *device, struct ifreq *ifr, int
> cmd)
> +{
> +	struct vnic *vnic;
> +	int ret = 0;
> +
> +	VNIC_FUNCTION("vnic_do_ioctl()\n");
> +	vnic = (struct vnic *)device->priv;
> +
> +	/* TBD */
> +
> +	return ret;
> +}

If you don't do anything, don't implement this. And especially don't return success no matter what you're passed.
> +static int vnic_set_config(struct net_device *device, struct ifmap *map)
> +{
> +	struct vnic *vnic;
> +	int ret = 0;
> +
> +	VNIC_FUNCTION("vnic_set_config()\n");
> +	vnic = (struct vnic *)device->priv;
> +
> +	/* TBD */
> +
> +	return ret;
> +}

Likewise.

> +static BOOLEAN vnic_npevent_register(struct vnic *vnic, struct netpath
> *netpath)

There's no BOOLEAN type in the kernel; please don't add one.

> +	if (register_netdev(&vnic->netdevice) != 0) {
> +		VNIC_ERROR("failed registering netdev\n");
> +		return FALSE;
> +	}

Propagate the error value instead.

> +	vnic->state = VNIC_REGISTERED;
> +	vnic->carrier = 2; /* special value to force
> netif_carrier_(on|off) */
> +	return TRUE;
> +}

And return 0 on success.

> +	BOOLEAN delay = TRUE;

No BOOLEANs, please.

> +	if (!vnic->carrier) {
> +		switch (netpath->timer_state) {
> +		case NETPATH_TS_IDLE:
> +			netpath->timer_state =
> +				NETPATH_TS_ACTIVE;
> +			if (vnic->state == VNIC_UNINITIALIZED)
> +				netpath_timer(netpath,

This is a very deep nesting of conditionals. Please restructure into something more comprehensible.

A general comment: I don't understand why you've moved a bunch of code with well-defined entry points into this big ugly single-function state machine. It means you have a whole lot of trivial wrapper code that serves no purpose, and decreases the readability of the driver significantly.
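To make the vnic_open() and vnic_npevent_register() comments concrete, here is a minimal userspace sketch of the suggested shape: return the error before touching any state, and return 0 or a negative errno rather than a BOOLEAN. The struct vnic_model type and both function names are hypothetical, and fake_register_netdev() merely stands in for register_netdev(); none of this is the driver's actual code.

```c
#include <errno.h>

enum vnic_state { VNIC_UNINITIALIZED, VNIC_REGISTERED };

/* Hypothetical, stripped-down model of the driver's per-device state. */
struct vnic_model {
	enum vnic_state state;
	int open;
	int xmit_started;
};

/* Stand-in for register_netdev(): 0 on success, negative errno on failure. */
static int fake_register_netdev(int should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

/* Bail out before modifying anything if the device is not registered. */
static int vnic_model_open(struct vnic_model *v)
{
	if (v->state != VNIC_REGISTERED)
		return -ENODEV;

	v->open++;
	v->xmit_started = 1;
	return 0;
}

/* Propagate the callee's error code instead of collapsing it to FALSE. */
static int vnic_model_register(struct vnic_model *v, int should_fail)
{
	int ret = fake_register_netdev(should_fail);

	if (ret)
		return ret;

	v->state = VNIC_REGISTERED;
	return 0;
}
```

With this shape a caller can tell -ENODEV from -ENOMEM, and a failed open leaves the counters untouched.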
Re: [openib-general] Drop in performance on Mellanox MT25204 single port DDR HCA
Does using the "tune_pci=1" module option for ib_mthca bring the performance back up?

The reason the driver was changed to work this way is that presumably the BIOS is setting the PCI configuration as it does for a reason. So you might want to investigate why the BIOS sets MaxReadReq down to 128 in the first place. (Removing the save/restore across reset lets the HCA pick a new default for all the settings, but may cause problems by getting rid of BIOS settings, which we assume were done for a reason.)

However, tune_pci=1 will make the driver override this setting if you really know what you're doing.

 - R.
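As background for the MaxReadReq values quoted in this thread (128, 512, 4096): in the PCI Express Device Control register the maximum read request size is a 3-bit field in bits 14:12, and the size in bytes is 128 shifted left by the field value. The sketch below only decodes and encodes that field on a plain 16-bit value. The function names are illustrative assumptions, and it does not touch real hardware or show what the mthca driver does internally.

```c
#include <stdint.h>

/* Max_Read_Request_Size: bits 14:12 of the PCIe Device Control register. */
#define DEVCTL_READRQ_MASK	0x7000u
#define DEVCTL_READRQ_SHIFT	12

/* Decode the field into bytes; encodings 0..5 map to 128..4096. */
static unsigned int readrq_bytes(uint16_t devctl)
{
	return 128u << ((devctl & DEVCTL_READRQ_MASK) >> DEVCTL_READRQ_SHIFT);
}

/*
 * Return devctl with the field set for the requested size. If bytes is
 * not one of 128, 256, ..., 4096, the value is returned unchanged.
 */
static uint16_t devctl_with_readrq(uint16_t devctl, unsigned int bytes)
{
	unsigned int field;

	for (field = 0; field <= 5; field++)
		if ((128u << field) == bytes)
			return (uint16_t)((devctl & ~DEVCTL_READRQ_MASK) |
					  (field << DEVCTL_READRQ_SHIFT));
	return devctl;
}
```

On a live system the same field can usually be read from the DevCtl line of lspci -vv output for the HCA.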
Re: [openib-general] [PATCH 0/10] [RFC] Support for SilverStorm Virtual Ethernet I/O controller (VEx)
    Ramachandra> In that case, can you please consider this for the
    Ramachandra> for-2.6.20 branch ?

I'm happy to keep this in a vex branch or something like that, but as the emails I just sent show, this is not ready for merging yet (which is to be expected -- it's never been reviewed).

I think Scott's question about protocol documentation is a good one. And also as I said this needs to be sent to lkml and netdev for full review by everyone.

 - R.
Re: [openib-general] [PATCH 7/10] Handling of various configurable parameters of the driver
> +	sid = 0x10LL << 56 |
> +	      0x00LL << 48 |
> +	      0x06LL << 40 |
> +	      0x6aLL << 32 |
> +	      0x00LL << 24 |
> +	      0x00LL << 16 |
> +	      0x00LL << 8 | ((be64_to_cpu(params->ioc_guid) >> 32) & 0xFF);

What is this magic number code doing?? Wouldn't it be clearer just to use the constant 0x166a rather than making it by hand? What does that value mean?

 - R.
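A worked version of the point above: the shifted terms in the quoted code evaluate to the single 64-bit value 0x1000066a00000000, so one named constant can replace the whole expression. (The constant as printed in the archived reply appears shortened; the value used below is simply what the quoted shifts compute.) The name VNIC_SID_PREFIX is hypothetical, and the sketch assumes ioc_guid has already been converted to host byte order.

```c
#include <stdint.h>

/* Hypothetical name; the value is what the hand-built shifts evaluate to. */
#define VNIC_SID_PREFIX	0x1000066a00000000ULL

/* The expression as posted, with the GUID already in host byte order. */
static uint64_t sid_by_hand(uint64_t ioc_guid)
{
	return 0x10ULL << 56 |
	       0x00ULL << 48 |
	       0x06ULL << 40 |
	       0x6aULL << 32 |
	       0x00ULL << 24 |
	       0x00ULL << 16 |
	       0x00ULL << 8 | ((ioc_guid >> 32) & 0xFF);
}

/* The same value built from one named constant. */
static uint64_t sid_from_constant(uint64_t ioc_guid)
{
	return VNIC_SID_PREFIX | ((ioc_guid >> 32) & 0xFF);
}
```

A named constant also gives the driver one obvious place to document what the prefix means, which is the other half of the question asked above.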
[openib-general] [PATCH 2.6.19-rc1 2/2] ehca: improved ehca debug format
Hi, here is the 2nd patch of ehca with a small format improvement in ehca debug function. Thanks! Nam Nguyen Signed-off-by: Hoang-Nam Nguyen <[EMAIL PROTECTED]> --- ehca_tools.h |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff -Nurp infiniband_orig/drivers/infiniband/hw/ehca/ehca_tools.h infiniband_work/drivers/infiniband/hw/ehca/ehca_tools.h --- infiniband_orig/drivers/infiniband/hw/ehca/ehca_tools.h 2006-10-02 22:08:57.0 +0200 +++ infiniband_work/drivers/infiniband/hw/ehca/ehca_tools.h 2006-10-02 18:29:53.0 +0200 @@ -117,7 +117,7 @@ extern int ehca_debug_level; unsigned int l = (unsigned int)(len); \ unsigned char *deb = (unsigned char*)(adr); \ for (x = 0; x < l; x += 16) { \ - printk("EHCA_DMP:%s" format \ + printk("EHCA_DMP:%s " format \ " adr=%p ofs=%04x %016lx %016lx\n", \ __FUNCTION__, ##args, deb, x, \ *((u64 *)&deb[0]), *((u64 *)&deb[8])); \
Re: [openib-general] [PATCH 9/10] Driver utility file - implements various utility macros
> +#define hton8(x) (x)
> +#define hton16(x) __cpu_to_be16(x)
> +#define hton32(x) __cpu_to_be32(x)
> +#define hton64(x) __cpu_to_be64(x)
> +
> +#define ntoh8(x) (x)
> +#define ntoh16(x) __be16_to_cpu(x)
> +#define ntoh32(x) __be32_to_cpu(x)
> +#define ntoh64(x) __be64_to_cpu(x)

Please just use the standard cpu_to_beXX / beXX_to_cpu functions directly (without the __).

> +#define is_power_of2(value) (((value) & ((value - 1))) == 0)
> +
> +typedef unsigned long uintn; /* __WORDSIZE/pointer sized integer */
> +
> +/* round down value to align, align must be a power of 2 */
> +#ifndef ROUNDDOWNP2
> +#define ROUNDDOWNP2(val, align) \
> +	(((uintn)(val)) & (~((uintn)(align)-1)))
> +#endif
> +/* round up value to align, align must be a power of 2 */
> +#ifndef ROUNDUPP2
> +#define ROUNDUPP2(val, align) \
> +	(((uintn)(val) + (uintn)(align) - 1) & (~((uintn)(align)-1)))
> +#endif

If you need this stuff it should probably go in some common kernel include.

> +#if BITS_PER_LONG == 64
> +#define PTR64(what) ((u64)(what))
> +#define PTR(what) ((void *)(u64)(what))
> +#elif BITS_PER_LONG == 32
> +#define PTR64(what) ((u64)(u32)(what))
> +#define PTR(what) ((void *)(u32)(what))
> +#else
> +#error "BITS_PER_LONG not 32 nor 64"
> +#endif

umm.. what the heck is this trying to do? If you want to cast a pointer to an integer, just use 'unsigned long' to hold it.

> +#endif
> +#define __PRIN_PREFIX "l"
> +#define PRId64 __PRI64_PREFIX"d"
> +#define PRIo64 __PRI64_PREFIX"o"
> +#define PRIu64 __PRI64_PREFIX"u"
> +#define PRIx64 __PRI64_PREFIX"x"
> +#define PRIX64 __PRI64_PREFIX"X"
> +#define PRIdN __PRIN_PREFIX"d"
> +#define PRIoN __PRIN_PREFIX"o"
> +#define PRIuN __PRIN_PREFIX"u"
> +#define PRIxN __PRIN_PREFIX"x"

kernel style is just to use "%llx" or whatever for printing 64-bit values, and cast them to unsigned long long to avoid warnings about printf formats.

> +/* source time is 100ths of a sec */
> +#define CONV2JIFFIES(time) (((time) * HZ) / 100)
> +#define CONV2USEC(time) ((time) * 1)

Why are you using such a wacky unit? This looks really error-prone -- the conversions in should be good enough I think.

> +#ifndef min
> +#define min(a,b) ((a)<(b)?(a):(b))
> +#endif

Unneeded since the kernel _does_ have a better definition of min() (type-safe, evaluates parameters only once, etc)

 - R.
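The "%llx" advice above carries over directly to userspace printf-style functions, so it can be shown with snprintf. This is only an illustrative sketch: format_u64_hex is a made-up helper name, and in kernel code the same cast would be used with printk.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit value as hex. The cast makes the argument type match
 * the %llx conversion on both 32-bit and 64-bit builds, so no
 * per-architecture prefix macros are needed.
 */
static int format_u64_hex(char *buf, size_t len, uint64_t val)
{
	return snprintf(buf, len, "%llx", (unsigned long long)val);
}
```

The return value is the number of characters written (excluding the terminating NUL), as with any snprintf call.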
[openib-general] [PATCH 2.6.19-rc1 1/2] ehca: fix ehca device registration
Hi Roland! Below is a patch of ehca, which fixes a bug (crash) that occurred when ib_ehca is loaded after ib_ipoib. This patch initializes struct ehca_shca with struct device*, then creates internal resources and finally registers the ehca IB device. And that is the proper sequence to do. Thanks! Nam Nguyen Signed-off-by: Hoang-Nam Nguyen <[EMAIL PROTECTED]> --- ehca_main.c | 36 +++- 1 file changed, 19 insertions(+), 17 deletions(-) diff -Nurp infiniband_orig/drivers/infiniband/hw/ehca/ehca_main.c infiniband_work/drivers/infiniband/hw/ehca/ehca_main.c --- infiniband_orig/drivers/infiniband/hw/ehca/ehca_main.c 2006-10-02 22:08:57.0 +0200 +++ infiniband_work/drivers/infiniband/hw/ehca/ehca_main.c 2006-10-02 18:29:53.0 +0200 @@ -49,7 +49,7 @@ MODULE_LICENSE("Dual BSD/GPL"); MODULE_AUTHOR("Christoph Raisch <[EMAIL PROTECTED]>"); MODULE_DESCRIPTION("IBM eServer HCA InfiniBand Device Driver"); -MODULE_VERSION("SVNEHCA_0016"); +MODULE_VERSION("SVNEHCA_0017"); int ehca_open_aqp1 = 0; int ehca_debug_level = 0; @@ -239,7 +239,7 @@ init_node_guid1: return ret; } -int ehca_register_device(struct ehca_shca *shca) +int ehca_init_device(struct ehca_shca *shca) { int ret; @@ -317,11 +317,6 @@ int ehca_register_device(struct ehca_shc /* shca->ib_device.process_mad = ehca_process_mad; */ shca->ib_device.mmap = ehca_mmap; - ret = ib_register_device(&shca->ib_device); - if (ret) - ehca_err(&shca->ib_device, -"ib_register_device() failed ret=%x", ret); - return ret; } @@ -561,9 +556,9 @@ static int __devinit ehca_probe(struct i goto probe1; } - ret = ehca_register_device(shca); + ret = ehca_init_device(shca); if (ret) { - ehca_gen_err("Cannot register Infiniband device"); + ehca_gen_err("Cannot init ehca device struct"); goto probe1; } @@ -571,7 +566,7 @@ static int __devinit ehca_probe(struct i ret = ehca_create_eq(shca, &shca->eq, EHCA_EQ, 2048); if (ret) { ehca_err(&shca->ib_device, "Cannot create EQ."); - goto probe2; + goto probe1; } ret = ehca_create_eq(shca, &shca->neq, EHCA_NEQ, 
513); @@ -600,6 +595,13 @@ static int __devinit ehca_probe(struct i goto probe5; } + ret = ib_register_device(&shca->ib_device); + if (ret) { + ehca_err(&shca->ib_device, +"ib_register_device() failed ret=%x", ret); + goto probe6; + } + /* create AQP1 for port 1 */ if (ehca_open_aqp1 == 1) { shca->sport[0].port_state = IB_PORT_DOWN; @@ -607,7 +609,7 @@ static int __devinit ehca_probe(struct i if (ret) { ehca_err(&shca->ib_device, "Cannot create AQP1 for port 1."); - goto probe6; + goto probe7; } } @@ -618,7 +620,7 @@ static int __devinit ehca_probe(struct i if (ret) { ehca_err(&shca->ib_device, "Cannot create AQP1 for port 2."); - goto probe7; + goto probe8; } } @@ -630,12 +632,15 @@ static int __devinit ehca_probe(struct i return 0; -probe7: +probe8: ret = ehca_destroy_aqp1(&shca->sport[0]); if (ret) ehca_err(&shca->ib_device, "Cannot destroy AQP1 for port 1. ret=%x", ret); +probe7: + ib_unregister_device(&shca->ib_device); + probe6: ret = ehca_dereg_internal_maxmr(shca); if (ret) @@ -660,9 +665,6 @@ probe3: ehca_err(&shca->ib_device, "Cannot destroy EQ. 
ret=%x", ret); -probe2: - ib_unregister_device(&shca->ib_device); - probe1: ib_dealloc_device(&shca->ib_device); @@ -750,7 +752,7 @@ int __init ehca_module_init(void) int ret; printk(KERN_INFO "eHCA Infiniband Device Driver " - "(Rel.: SVNEHCA_0016)\n"); + "(Rel.: SVNEHCA_0017)\n"); idr_init(&ehca_qp_idr); idr_init(&ehca_cq_idr); spin_lock_init(&ehca_qp_idr_lock);
[openib-general] Drop in performance on Mellanox MT25204 single port DDR HCA
Hi Roland/Michael,

One of my coworkers in Champaign is seeing a performance issue with the latest SVN driver and the OFED 1.1 Mellanox driver on certain platforms.

On the older SVN, somewhere around 7500, the Mellanox driver did not save and restore certain PCI registers before a reset. Somewhere around SVN 8000 a patch was added to save and restore these registers. However, on our Alcolu platform this patch causes the MaxReadReq to be set to 128 bytes (rather than 512), which limits bandwidth to 650MBytes/sec. If I remove the save/restore of these registers (attached patch), the bandwidth is back to where we would expect it, 1250 Mbytes/sec.

Is there some problem with this patch, or do you think it is some BIOS issue in the platform?

woody

pci_regs.patch
Description: pci_regs.patch
Re: [openib-general] [PATCH 3/10] Driver viport files - implementation of communication protocol with VEx
> +	viport = (struct viport *)kmalloc(sizeof(struct viport), GFP_KERNEL);
> +	memset(viport, 0, sizeof(struct viport));

cast from void * is not necessary. memset can be replaced by just using kzalloc().

 - R.
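A userspace analogy for the kzalloc() suggestion, for illustration only: calloc() plays the role of kzalloc() here, returning zeroed memory in one call with no cast from void * and no separate memset. struct viport_model and viport_model_alloc are hypothetical names, not the driver's types.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the driver's struct viport. */
struct viport_model {
	int state;
	unsigned int retries;
};

/*
 * One call replaces the malloc + memset pair; void * converts to the
 * struct pointer implicitly, so no cast is written. The caller must
 * still check the result for NULL.
 */
static struct viport_model *viport_model_alloc(void)
{
	return calloc(1, sizeof(struct viport_model));
}
```

In the kernel the equivalent would be a single kzalloc(sizeof(*viport), GFP_KERNEL) call followed by a NULL check.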
Re: [openib-general] [PATCH 2/10] Driver netpath files - abstraction of connection to VEx
> +	if (netpath->timer_state == NETPATH_TS_ACTIVE) {
> +		del_timer_sync(&netpath->timer);
> +	}

kernel style is just to do

	if (netpath->timer_state == NETPATH_TS_ACTIVE)
		del_timer_sync(&netpath->timer);

this could be fixed in many places.

> +void netpath_connected(struct netpath *netpath, struct viport *viport)
> +{
> +	vnic_connected(netpath->parent, netpath);
> +	return;
> +}
> +
> +void netpath_disconnected(struct netpath *netpath, struct viport *viport)
> +{
> +	vnic_disconnected(netpath->parent, netpath);
> +	return;
> +}

what do the return; statements accomplish here? In fact what do these wrappers accomplish?

 - R.
Re: [openib-general] [PATCH 2.6.19-rc1] ehca: fix ehca_probe if module loaded after ib_ipoib
> Looks OK but your mailer mangled the patch. Please resend in a form
> that can be applied...
> please send unrelated changes as separate patches.
> So this should come as two patches -- one to fix the device
> registration, and one to change your debug formatting.

ok, will resend those two patches soon.
Re: [openib-general] [PATCH 1/10] Driver Main files - netdev functions and corresponding state maintenance
> +#ifdef CONFIG_INFINIBAND_VNIC_STATS
> +	extern cycles_t recv_ref;
> +#endif /* CONFIG_INFINIBAND_VNIC_STATS */

put this declaration in a header file somewhere, not inside a function in a .c file.

Also is it really worth having CONFIG_INFINIBAND_VNIC_STATS? Who would use it? Or would anyone turn it off? All the #ifdefs make the code much harder to read so I think you need to figure out a better way to make it conditional if you really want it to be configurable.

> +	/* TBD */
> +	/* TBD */

Umm...

> +static BOOLEAN vnic_npevent_register(struct vnic *vnic, struct netpath
> *netpath)

What do you gain from having this shouting "BOOLEAN" type?
Re: [openib-general] [PATCH 0/10] [RFC] Support for SilverStorm Virtual Ethernet I/O controller (VEx)
Roland Dreier wrote:

>    Ramachandra> This patch series is intended for your infiniband.git
>    Ramachandra> for-2.6.19 branch. It also has been tested against
>    Ramachandra> the for-2.6.20 branch.
>
>Well, no way is this going to be merged into 2.6.19 at this stage in
>the release cycle (the merge window is closing in a few days and this
>has never been reviewed at all).

In that case, can you please consider this for the for-2.6.20 branch ?

Regards,
Ram
Re: [openib-general] [PATCH 0/10] [RFC] Support for SilverStorm Virtual Ethernet I/O controller (VEx)
Is this communication protocol documented anywhere? How does this
feature compare to IPoIB and SDP?

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ramachandra K
> Sent: Monday, October 02, 2006 12:58 PM
> To: Roland Dreier (rdreier)
> Cc: [EMAIL PROTECTED]; openib-General
> Subject: [openib-general] [PATCH 0/10] [RFC] Support for
> SilverStorm Virtual Ethernet I/O controller (VEx)
>
> Hi Roland,
>
> This patch series adds support for the SilverStorm Virtual Ethernet I/O
> Controllers (VEx) by adding a new kernel level driver.
>
> This kernel driver:
>
> 1. Communicates with the VEx on the SilverStorm fabric
>    switches/directors using SilverStorm's native protocol
> 2. Presents a standard Ethernet NIC interface to the system
> 3. Uses IB reliable connection semantics
> 4. Is tuned for high performance and throughput
>
> The SilverStorm VEx and the associated communication protocol is in
> wide use amongst users of SilverStorm IB fabric solutions.
>
> This patch series is intended for your infiniband.git for-2.6.19
> branch. It also has been tested against the for-2.6.20 branch.
>
> Signed-off-by: Ramachandra K <[EMAIL PROTECTED]>
>
> Regards,
> Ram
Re: [openib-general] [PATCH 0/10] [RFC] Support for SilverStorm Virtual Ethernet I/O controller (VEx)
    Ramachandra> This patch series is intended for your infiniband.git
    Ramachandra> for-2.6.19 branch. It also has been tested against
    Ramachandra> the for-2.6.20 branch.

Well, no way is this going to be merged into 2.6.19 at this stage in
the release cycle (the merge window is closing in a few days and this
has never been reviewed at all).

Also, you're going to want to cross-post this to lkml and netdev as
well so that people subscribed there can review it.

 - R
[openib-general] [PATCH 10/10] Driver Kconfig/Makefile. Modifications to toplevel Kconfig/Makefile
Adds the Kconfig and Makefile for the driver. Modifies the top level
Infiniband Kconfig and Makefile to include VNIC.

Signed-off-by: Ramachandra K <[EMAIL PROTECTED]>
---

 drivers/infiniband/Kconfig           |    2 ++
 drivers/infiniband/Makefile          |    1 +
 drivers/infiniband/ulp/vnic/Kconfig  |   28 ++++++++++++++++++++++++++++
 drivers/infiniband/ulp/vnic/Makefile |   11 +++++++++++
 4 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index 9edface..5676c6a 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -45,4 +45,6 @@ source "drivers/infiniband/ulp/srp/Kconf
 
 source "drivers/infiniband/ulp/iser/Kconfig"
 
+source "drivers/infiniband/ulp/vnic/Kconfig"
+
 endmenu
diff --git a/drivers/infiniband/Makefile b/drivers/infiniband/Makefile
index 2b5d109..5407878 100644
--- a/drivers/infiniband/Makefile
+++ b/drivers/infiniband/Makefile
@@ -6,3 +6,4 @@ obj-$(CONFIG_INFINIBAND_AMSO1100) += hw/
 obj-$(CONFIG_INFINIBAND_IPOIB) += ulp/ipoib/
 obj-$(CONFIG_INFINIBAND_SRP) += ulp/srp/
 obj-$(CONFIG_INFINIBAND_ISER) += ulp/iser/
+obj-$(CONFIG_INFINIBAND_VNIC) += ulp/vnic/
diff --git a/drivers/infiniband/ulp/vnic/Kconfig b/drivers/infiniband/ulp/vnic/Kconfig
new file mode 100644
index 000..3be14ff
--- /dev/null
+++ b/drivers/infiniband/ulp/vnic/Kconfig
@@ -0,0 +1,28 @@
+config INFINIBAND_VNIC
+	tristate "VNIC - Support for SilverStorm Virtual Ethernet I/O Controller"
+	depends on INFINIBAND && NETDEVICES && INET
+	---help---
+	  Support for the SilverStorm Virtual Ethernet I/O Controller
+	  (VEx). In conjunction with the VEx, this provides virtual
+	  ethernet interfaces and transports ethernet packets over
+	  InfiniBand so that you can communicate with Ethernet networks
+	  using your IB device.
+
+config INFINIBAND_VNIC_DEBUG
+	bool "VNIC Verbose debugging"
+	depends on INFINIBAND_VNIC
+	default n
+	---help---
+	  This option causes verbose debugging code to be compiled
+	  into the VNIC driver. The output can be turned on via the
+	  vnic_debug module parameter.
+
+config INFINIBAND_VNIC_STATS
+	bool "VNIC Statistics"
+	depends on INFINIBAND_VNIC
+	default n
+	---help---
+	  This option compiles statistics collecting code into the
+	  data path of the VNIC driver to help in profiling and fine
+	  tuning. This adds some overhead in the interest of gathering
+	  data.
diff --git a/drivers/infiniband/ulp/vnic/Makefile b/drivers/infiniband/ulp/vnic/Makefile
new file mode 100644
index 000..253d167
--- /dev/null
+++ b/drivers/infiniband/ulp/vnic/Makefile
@@ -0,0 +1,11 @@
+obj-$(CONFIG_INFINIBAND_VNIC) += ib_vnic.o
+
+ib_vnic-y := vnic_main.o \
+	vnic_ib.o \
+	vnic_viport.o \
+	vnic_control.o \
+	vnic_data.o \
+	vnic_netpath.o \
+	vnic_config.o \
+	vnic_sys.o
+
[openib-general] [PATCH 9/10] Driver utility file - implements various utility macros
Adds the driver utility file. This file contains utility macros for debugging etc Signed-off-by: Ramachandra K <[EMAIL PROTECTED]> --- drivers/infiniband/ulp/vnic/vnic_util.h | 286 +++ 1 files changed, 286 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/vnic/vnic_util.h b/drivers/infiniband/ulp/vnic/vnic_util.h new file mode 100644 index 000..ca35fa0 --- /dev/null +++ b/drivers/infiniband/ulp/vnic/vnic_util.h @@ -0,0 +1,286 @@ +/* + * Copyright (c) 2006 SilverStorm Technologies Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + *copyright notice, this list of conditions and the following + *disclaimer. + * + * - Redistributions in binary form must reproduce the above + *copyright notice, this list of conditions and the following + *disclaimer in the documentation and/or other materials + *provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ */ + +#ifndef VNIC_UTIL_H_INCLUDED +#define VNIC_UTIL_H_INCLUDED + +#define MODULE_NAME "VNIC" + +extern u32 vnic_debug; + +#define DEBUG_IB_INFO 0x0001 +#define DEBUG_IB_FUNCTION 0x0002 +#define DEBUG_IB_FSTATUS 0x0004 +#define DEBUG_IB_ASSERTS 0x0008 +#define DEBUG_CONTROL_INFO 0x0010 +#define DEBUG_CONTROL_FUNCTION 0x0020 +#define DEBUG_CONTROL_PACKET 0x0040 +#define DEBUG_CONFIG_INFO 0x0100 +#define DEBUG_DATA_INFO0x1000 +#define DEBUG_DATA_FUNCTION0x2000 +#define DEBUG_NETPATH_INFO 0x0001 +#define DEBUG_VIPORT_INFO 0x0010 +#define DEBUG_VIPORT_FUNCTION 0x0020 +#define DEBUG_LINK_STATE 0x0040 +#define DEBUG_VNIC_INFO0x0100 +#define DEBUG_VNIC_FUNCTION0x0200 +#define DEBUG_SYS_INFO 0x1000 +#define DEBUG_SYS_VERBOSE 0x4000 + +#ifdef CONFIG_INFINIBAND_VNIC_DEBUG +#define PRINT(level, x, fmt, arg...) \ + printk(level "%s: %s: %s, line %d: " fmt, \ + MODULE_NAME, x, __FILE__, __LINE__, ##arg) + +#define PRINT_CONDITIONAL(level, x, condition, fmt, arg...)\ + do {\ + if (condition) \ + printk(level "%s: %s: %s, line %d: " fmt, \ + MODULE_NAME, x, __FILE__, __LINE__, \ + ##arg); \ + } while(0) +#else +#define PRINT(level, x, fmt, arg...) \ + printk( level "%s: " fmt, MODULE_NAME, ##arg) + +#define PRINT_CONDITIONAL(level, x, condition, fmt, arg...)\ + do {\ +if (condition) \ + printk(level "%s: %s: " fmt,\ + MODULE_NAME, x, ##arg); \ + } while(0) +#endif /*CONFIG_INFINIBAND_VNIC_DEBUG*/ + +#define IB_PRINT(fmt, arg...) PRINT(KERN_INFO, "IB", fmt, ##arg) +#define IB_ERROR(fmt, arg...) PRINT(KERN_ERR, "IB", fmt, ##arg) + +#define IB_FUNCTION(fmt, arg...) \ + PRINT_CONDITIONAL(KERN_INFO,\ + "IB", \ + (vnic_debug & DEBUG_IB_FUNCTION), \ + fmt, ##arg) + +#define IB_INFO(fmt, arg...) \ + PRINT_CONDITIONAL(KERN_INFO,\ + "IB", \ + (vnic_debug & DEBUG_IB_INFO), \ + fmt, ##arg) + +#define IB_ASSERT(x)
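The mask-gated debug macros in this header can be illustrated with a small userspace sketch. It is not the driver's code: fprintf() stands in for printk(), dbg_mask plays the role of vnic_debug, and the DBG_* names, the flag values and dbg_enable() are hypothetical. The flag values in the archived patch look damaged by the archive, so this sketch simply assumes one distinct bit per category.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flag layout: one distinct bit per category. */
#define DBG_IB_INFO	0x00000001u
#define DBG_IB_FUNCTION	0x00000002u
#define DBG_DATA_INFO	0x00001000u

static unsigned int dbg_mask;	/* plays the role of vnic_debug */

static void dbg_enable(unsigned int flags)
{
	assert(flags != 0);	/* enabling nothing is a caller bug */
	dbg_mask |= flags;
}

/*
 * Print only when the category bit is set. Evaluates to the number of
 * characters written, or 0 when the message was filtered out.
 * ##__VA_ARGS__ is a GNU extension, as used throughout the kernel.
 */
#define DBG_PRINT(flag, subsys, fmt, ...)				\
	((dbg_mask & (flag))						\
	 ? fprintf(stderr, "VNIC: %s: %s, line %d: " fmt,		\
		   subsys, __FILE__, __LINE__, ##__VA_ARGS__)		\
	 : 0)

#define IB_INFO(fmt, ...) \
	DBG_PRINT(DBG_IB_INFO, "IB", fmt, ##__VA_ARGS__)
#define IB_FUNCTION(fmt, ...) \
	DBG_PRINT(DBG_IB_FUNCTION, "IB", fmt, ##__VA_ARGS__)
```

Because the mask test is part of the macro, a disabled category costs one AND and one branch at each call site, and the format arguments are not evaluated.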
[openib-general] [PATCH 8/10] sysfs interface implementation
Adds the files that implement the sysfs interface of the driver. Signed-off-by: Ramachandra K <[EMAIL PROTECTED]> --- drivers/infiniband/ulp/vnic/vnic_sys.c | 1118 drivers/infiniband/ulp/vnic/vnic_sys.h | 51 + 2 files changed, 1169 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/vnic/vnic_sys.c b/drivers/infiniband/ulp/vnic/vnic_sys.c new file mode 100644 index 000..052783e --- /dev/null +++ b/drivers/infiniband/ulp/vnic/vnic_sys.c @@ -0,0 +1,1118 @@ +/* + * Copyright (c) 2006 SilverStorm Technologies Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + *copyright notice, this list of conditions and the following + *disclaimer. + * + * - Redistributions in binary form must reproduce the above + *copyright notice, this list of conditions and the following + *disclaimer in the documentation and/or other materials + *provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ */ + +#include +#include +#include + +#ifdef CONFIG_INFINIBAND_VNIC_STATS +#include +#endif + +#include "vnic_util.h" +#include "vnic_config.h" +#include "vnic_ib.h" +#include "vnic_viport.h" +#include "vnic_main.h" + +extern struct list_head vnic_list; + +/* + * target eiocs are added by writing + * + * ioc_guid=,dgid=,pkey=,name= + * to the create_primary sysfs attribute. + */ +enum { + VNIC_OPT_ERR = 0, + VNIC_OPT_IOC_GUID = 1 << 0, + VNIC_OPT_DGID = 1 << 1, + VNIC_OPT_PKEY = 1 << 2, + VNIC_OPT_NAME = 1 << 3, + VNIC_OPT_INSTANCE = 1 << 4, + VNIC_OPT_RXCSUM = 1 << 5, + VNIC_OPT_TXCSUM = 1 << 6, + VNIC_OPT_HEARTBEAT = 1 << 7, + VNIC_OPT_ALL = (VNIC_OPT_IOC_GUID | + VNIC_OPT_DGID | VNIC_OPT_NAME | VNIC_OPT_PKEY), +}; + +static match_table_t vnic_opt_tokens = { + {VNIC_OPT_IOC_GUID, "ioc_guid=%s"}, + {VNIC_OPT_DGID, "dgid=%s"}, + {VNIC_OPT_PKEY, "pkey=%x"}, + {VNIC_OPT_NAME, "name=%s"}, + {VNIC_OPT_INSTANCE, "instance=%d"}, + {VNIC_OPT_RXCSUM, "rx_csum=%s"}, + {VNIC_OPT_TXCSUM, "tx_csum=%s"}, + {VNIC_OPT_HEARTBEAT, "heartbeat=%d"}, + {VNIC_OPT_ERR, NULL} +}; + +static void vnic_release_class_dev(struct class_device *class_dev) +{ + struct class_dev_info *cdev_info = + container_of(class_dev, struct class_dev_info, class_dev); + + complete(&cdev_info->released); + +} + +struct class vnic_class = { + .name = "infiniband_vnic", + .release = vnic_release_class_dev +}; + +struct class_dev_info interface_cdev; + +static int vnic_parse_options(const char *buf, struct path_param *param) +{ + char *options, *sep_opt; + char *p; + char dgid[3]; + substring_t args[MAX_OPT_ARGS]; + int opt_mask = 0; + int token; + int ret = -EINVAL; + int i; + + options = kstrdup(buf, GFP_KERNEL); + if (!options) + return -ENOMEM; + + sep_opt = options; + while ((p = strsep(&sep_opt, ",")) != NULL) { + if (!*p) + continue; + + token = match_token(p, vnic_opt_tokens, args); + opt_mask |= token; + + switch (token) { + case VNIC_OPT_IOC_GUID: + p = match_strdup(args); + param->ioc_guid = 
cpu_to_be64(simple_strtoull(p, NULL, + 16)); + kfree(p); + break; + + case VNIC_OPT_DGID: + p = match_strdup(args); + if (strlen(p) != 32) { + printk(KERN_WARNING PFX + "bad dest GID parameter '%s'\n", p); + kfree(p); + goto out; + } + + for (i = 0; i < 16; ++i) { +
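The dgid option is checked for a length of 32 hex digits and then walked in 16 steps with a 3-byte scratch buffer, which suggests two digits are converted per byte. The rest of that loop is cut off in the archive, so the following is only a userspace sketch of that idea, not the patch's code: parse_dgid() is a hypothetical name and strtoul() stands in for the kernel helper.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Convert a 32-digit hex string into a 16-byte GID.
 * Returns 0 on success, -1 if the length or any digit is invalid.
 */
static int parse_dgid(const char *hex, uint8_t gid[16])
{
	char byte[3] = { 0 };	/* two hex digits plus the terminator */
	size_t i;

	assert(hex != NULL && gid != NULL);

	if (strlen(hex) != 32)
		return -1;

	for (i = 0; i < 16; ++i) {
		byte[0] = hex[i * 2];
		byte[1] = hex[i * 2 + 1];
		if (!isxdigit((unsigned char)byte[0]) ||
		    !isxdigit((unsigned char)byte[1]))
			return -1;
		gid[i] = (uint8_t)strtoul(byte, NULL, 16);
	}
	return 0;
}
```

Validating each digit before conversion means a malformed string is rejected instead of silently producing zero bytes.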
[openib-general] [PATCH 7/10] Handling of various configurable parameters of the driver
Adds the files that handle various configurable parameters of the driver configuration of virtual NIC, control, data connections to the VEx and general IB connection parameters. Signed-off-by: Ramachandra K <[EMAIL PROTECTED]> --- drivers/infiniband/ulp/vnic/vnic_config.c | 739 + drivers/infiniband/ulp/vnic/vnic_config.h | 215 2 files changed, 954 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/vnic/vnic_config.c b/drivers/infiniband/ulp/vnic/vnic_config.c new file mode 100644 index 000..61db4ee --- /dev/null +++ b/drivers/infiniband/ulp/vnic/vnic_config.c @@ -0,0 +1,739 @@ +/* + * Copyright (c) 2006 SilverStorm Technologies Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + *copyright notice, this list of conditions and the following + *disclaimer. + * + * - Redistributions in binary form must reproduce the above + *copyright notice, this list of conditions and the following + *disclaimer in the documentation and/or other materials + *provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + */ + +#include +#include +#include + +#include + +#include "vnic_util.h" +#include "vnic_config.h" +#include "vnic_trailer.h" + +#define CONFIG_PARAM(x) u32 x = 0x; +#define DEFAULT_PARAM(x, y)\ + do {\ + if (x == 0x)\ +x = y; \ + } while(0) + +#define boolean_range_check(x) __range_check(x, 0, 1, #x) +#define u32_zero_range_check(x)__range_check(x, 0, 0x7FFF, #x) +#define u32_range_check(x) __range_check(x, 1, 0x7FFF, #x) +#define u16_zero_range_check(x)__range_check(x, 0, 0x, #x) +#define u16_range_check(x) __range_check(x, 1, 0x, #x) +#define u8_zero_range_check(x) __range_check(x, 0, 0xFF, #x) +#define u8_range_check(x) __range_check(x, 1, 0xFF, #x) + +#define range_check(x, min, max) __range_check(x, min, max, #x) +#define less_or_equal_check(lo, hi)__less_or_equal_check(lo, hi, #lo, #hi) +#define less_than_check(lo, hi)__less_than_check(lo, hi, #lo, #hi) +#define power_of_2_check(num) __power_of_2_check(num, #num) + +CONFIG_PARAM(max_address_entries); +CONFIG_PARAM(min_address_entries); + +CONFIG_PARAM(min_mtu); +module_param(min_mtu, int, 0444); + +CONFIG_PARAM(max_mtu); +module_param(max_mtu, int, 0444); + +CONFIG_PARAM(host_recv_pool_entries); +module_param(host_recv_pool_entries, int, 0444); + +CONFIG_PARAM(min_host_pool_sz); +module_param(min_host_pool_sz, int, 0444); + +CONFIG_PARAM(min_eioc_pool_sz); +module_param(min_eioc_pool_sz, int, 0444); + +CONFIG_PARAM(max_eioc_pool_sz); +module_param(max_eioc_pool_sz, int, 0444); + +CONFIG_PARAM(min_host_kick_timeout); +module_param(min_host_kick_timeout, int, 0444); + +CONFIG_PARAM(max_host_kick_timeout); +module_param(max_host_kick_timeout, int, 0444); + +CONFIG_PARAM(min_host_kick_entries); +module_param(min_host_kick_entries, int, 
0444); + +CONFIG_PARAM(max_host_kick_entries); +module_param(max_host_kick_entries, int, 0444); + +CONFIG_PARAM(min_host_kick_bytes); +module_param(min_host_kick_bytes, int, 0444); + +CONFIG_PARAM(max_host_kick_bytes); +module_param(max_host_kick_bytes, int, 0444); + +CONFIG_PARAM(min_host_update_sz); +module_param(min_host_update_sz, int, 0444); + +CONFIG_PARAM(max_host_update_sz); +module_param(max_host_update_sz, int, 0444); + +CONFIG_PARAM(min_eioc_update_sz); +module_param(min_eioc_update_sz, int, 0444); + +CONFIG_PARAM(max_eioc_update_sz); +module_param(max_eioc_update_sz, int, 0444); + +CONFIG_PARAM(notify_bundle_sz); +module_param(notify_bundle_sz, int, 0444); + +CONFIG_PARAM(viport_stats_interval); +CONFIG_PARAM(viport_hb_interval); +CONFIG_PARAM(viport_hb_timeout); +CONFIG_PARAM(control_rsp_timeout); +CONFIG_PARAM(control_req_retry_count); + +/* Infiniband connection values
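The CONFIG_PARAM/DEFAULT_PARAM pair initialises each parameter to a sentinel and fills in a default only if the user left it untouched, after which the *_range_check macros validate the result. A minimal userspace sketch of that flow follows. The sentinel value PARAM_UNSET, the function names and the MTU limits are assumptions for illustration, since the constants in the archived patch are truncated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PARAM_UNSET 0xFFFFFFFFu	/* hypothetical "not set by user" sentinel */

/* Fill in a default only when the user left the parameter untouched. */
static void default_param(uint32_t *param, uint32_t def)
{
	if (*param == PARAM_UNSET)
		*param = def;
}

static bool range_check(uint32_t v, uint32_t min, uint32_t max)
{
	assert(min <= max);
	return v >= min && v <= max;
}

/* Exactly one bit set; zero is rejected. */
static bool power_of_2_check(uint32_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Validate a min/max MTU pair after defaults are applied (limits made up). */
static bool validate_mtu(uint32_t *min_mtu, uint32_t *max_mtu)
{
	default_param(min_mtu, 1500);
	default_param(max_mtu, 9500);
	return range_check(*min_mtu, 60, 9500) &&
	       range_check(*max_mtu, 60, 9500) &&
	       *min_mtu <= *max_mtu;
}
```

Applying defaults before the range checks keeps the sentinel itself from ever being tested against the limits.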
[openib-general] [PATCH 6/10] Driver IB files - IB core stack interaction
Adds the files that implement interaction with the core IB stack. Signed-off-by: Ramachandra K <[EMAIL PROTECTED]> --- drivers/infiniband/ulp/vnic/vnic_ib.c | 709 + drivers/infiniband/ulp/vnic/vnic_ib.h | 167 2 files changed, 876 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/vnic/vnic_ib.c b/drivers/infiniband/ulp/vnic/vnic_ib.c new file mode 100644 index 000..0c50b83 --- /dev/null +++ b/drivers/infiniband/ulp/vnic/vnic_ib.c @@ -0,0 +1,709 @@ +/* + * Copyright (c) 2006 SilverStorm Technologies Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + *copyright notice, this list of conditions and the following + *disclaimer. + * + * - Redistributions in binary form must reproduce the above + *copyright notice, this list of conditions and the following + *disclaimer in the documentation and/or other materials + *provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ */ + +#include +#include +#include +#include +#include + +#include "vnic_util.h" +#include "vnic_config.h" +#include "vnic_ib.h" +#include "vnic_viport.h" +#include "vnic_sys.h" +#include "vnic_main.h" + +static int vnic_ib_inited = 0; + +static void vnic_add_one(struct ib_device *device); +static void vnic_remove_one(struct ib_device *device); + +static struct ib_client vnic_client = { + .name = "vnic", + .add = vnic_add_one, + .remove = vnic_remove_one +}; + +static struct ib_sa_client vnic_sa_client; + +static CLASS_DEVICE_ATTR(create_primary, S_IWUSR, NULL, vnic_create_primary); +static CLASS_DEVICE_ATTR(create_secondary, S_IWUSR, NULL, +vnic_create_secondary); + +static CLASS_DEVICE_ATTR(delete_vnic, S_IWUSR, NULL, vnic_delete); + +static struct vnic_ib_port *vnic_add_port(struct vnic_ib_device *device, u8 port_num) +{ + struct vnic_ib_port *port; + + port = kzalloc(sizeof *port, GFP_KERNEL); + if (!port) + return NULL; + + init_completion(&port->cdev_info.released); + port->dev = device; + port->port_num = port_num; + + port->cdev_info.class_dev.class = &vnic_class; + port->cdev_info.class_dev.dev = device->dev->dma_device; + snprintf(port->cdev_info.class_dev.class_id, BUS_ID_SIZE, "vnic-%s-%d", +device->dev->name, port_num); + + if (class_device_register(&port->cdev_info.class_dev)) + goto free_port; + + if (class_device_create_file(&port->cdev_info.class_dev, +&class_device_attr_create_primary)) + goto err_class; + if (class_device_create_file(&port->cdev_info.class_dev, +&class_device_attr_create_secondary)) + goto err_class; + + return port; +err_class: + class_device_unregister(&port->cdev_info.class_dev); + +free_port: + kfree(port); + + return NULL; +} + +static void vnic_add_one(struct ib_device *device) +{ + struct vnic_ib_device *vnic_dev; + struct vnic_ib_port *port; + int s, e, p; + + vnic_dev = kmalloc(sizeof *vnic_dev, GFP_KERNEL); + vnic_dev->dev = device; + if (!vnic_dev) + return; + + INIT_LIST_HEAD(&vnic_dev->dev_list); + if 
(device->node_type == RDMA_NODE_IB_SWITCH) { + s = 0; + e = 0; + + } else { + s = 1; + e = device->phys_port_cnt; + + } + + for (p = s; p <= e; p++) { + port = vnic_add_port(vnic_dev, p); + if (port) + list_add_tail(&port->list, &vnic_dev->dev_list); + } + + ib_set_client_data(device, &vnic_client, vnic_dev); + +} + +static void vnic_remove_one(struct ib_device *device) +{ + struct vnic_ib_device *vnic_dev; + struct vnic_ib_port *port, *tmp_port; + + vnic_dev = ib_get_client_data(device, &vnic_client); + list
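In vnic_add_one() as posted, vnic_dev->dev is assigned before the result of kmalloc() is tested, so an allocation failure would dereference NULL. The usual ordering is sketched here in userspace C, with calloc() standing in for the kernel allocator. The struct and function names are hypothetical stand-ins, not the driver's types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct ib_device_stub {		/* hypothetical stand-in for struct ib_device */
	int phys_port_cnt;
};

struct vnic_dev_stub {		/* hypothetical stand-in for struct vnic_ib_device */
	struct ib_device_stub *dev;
	int nports;
};

/* Allocate, test the pointer, and only then touch the fields. */
static struct vnic_dev_stub *vnic_dev_alloc(struct ib_device_stub *device)
{
	struct vnic_dev_stub *vd;

	assert(device != NULL);

	vd = calloc(1, sizeof(*vd));
	if (!vd)
		return NULL;	/* nothing has been dereferenced yet */

	vd->dev = device;
	vd->nports = device->phys_port_cnt;
	return vd;
}
```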
[openib-general] [PATCH 5/10] Implementation of Data path of the communication protocol
Adds the files that implement the data transfer part of the communication protocol with the VEx. The RDMA of ethernet packets is implemented in here. Signed-off-by: Ramachandra K <[EMAIL PROTECTED]> --- drivers/infiniband/ulp/vnic/vnic_data.c| 1065 drivers/infiniband/ulp/vnic/vnic_data.h| 179 + drivers/infiniband/ulp/vnic/vnic_trailer.h | 63 ++ 3 files changed, 1307 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/vnic/vnic_data.c b/drivers/infiniband/ulp/vnic/vnic_data.c new file mode 100644 index 000..e3b9739 --- /dev/null +++ b/drivers/infiniband/ulp/vnic/vnic_data.c @@ -0,0 +1,1065 @@ +/* + * Copyright (c) 2006 SilverStorm Technologies Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + *copyright notice, this list of conditions and the following + *disclaimer. + * + * - Redistributions in binary form must reproduce the above + *copyright notice, this list of conditions and the following + *disclaimer in the documentation and/or other materials + *provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + */ + +#include +#include + +#include "vnic_util.h" +#include "vnic_viport.h" +#include "vnic_config.h" +#include "vnic_data.h" +#include "vnic_trailer.h" + +static void data_received_kick(struct io *io); +static void data_xmit_complete(struct io *io); + +#define LOCAL_IO(x) PTR64((x)) + +#define INBOUND_COPY + +#ifdef INBOUND_COPY +u32 min_rcv_skb = 60; +module_param(min_rcv_skb, int, 0444); +#endif + +u32 min_xmt_skb = 60; +module_param(min_xmt_skb, int, 0444); + +#ifdef CONFIG_INFINIBAND_VNIC_STATS +cycles_t recv_ref; +#endif /* CONFIG_INFINIBAND_VNIC_STATS */ + +BOOLEAN data_init(struct data * data, struct viport * viport, + struct data_config * config, struct ib_pd *pd, u64 guid) +{ + DATA_FUNCTION("data_init()\n"); + + data->parent = viport; + data->config = config; + data->ib_conn.viport = viport; + data->ib_conn.ib_config = &config->ib_config; + data->ib_conn.state = IB_CONN_UNINITTED; + + if ((min_xmt_skb < 60) || (min_xmt_skb > 9000)) { + DATA_ERROR("min_xmt_skb (%d) must be between 60 and 9000\n", + min_xmt_skb); + goto failure; + } + if (!vnic_ib_conn_init(&data->ib_conn, viport, pd, guid, + &config->ib_config)) { + goto failure; + } + data->mr = ib_get_dma_mr(pd, +IB_ACCESS_LOCAL_WRITE | +IB_ACCESS_REMOTE_READ | +IB_ACCESS_REMOTE_WRITE); + if (IS_ERR(data->mr)) { + DATA_ERROR("failed to register memory for data connection\n"); + goto destroy_conn; + } + + data->ib_conn.cm_id = ib_create_cm_id(viport->config->ibdev, + vnic_ib_cm_handler, + &data->ib_conn); + + if (IS_ERR(data->ib_conn.cm_id)) { + DATA_ERROR("creating data CM ID failed\n"); + return FALSE; + } + + return TRUE; + +destroy_conn: + ib_destroy_qp(data->ib_conn.qp); + ib_destroy_cq(data->ib_conn.cq); +failure: + return 
FALSE; +} + +static void data_post_recvs(struct data *data) +{ + unsigned long flags; + + DATA_FUNCTION("data_post_recvs()\n"); + spin_lock_irqsave(&data->recv_ios_lock, flags); + while (!list_empty(&data->recv_ios)) { + struct io *io = list_entry(data->recv_ios.next, + struct io, list_ptrs); + struct recv_io *recv_io = (struct recv_io *)io; + + list_del(&recv_io->io.list_ptrs); + spin_unlock_irqrestore(&data->recv_ios_lock, flags); + if (!vnic_ib_post_recv(&data->ib_conn, &recv_io->io)) { + viport_failur
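In data_init() as posted, the branch that handles a failed ib_create_cm_id() returns FALSE directly instead of jumping to the cleanup labels, which appears to leave the earlier resources allocated. The goto-unwind pattern that the function already half uses can be sketched in userspace as follows. The resource slots, acquire(), release() and the fail_at parameter are hypothetical and exist only to make each failure path testable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical resources standing in for the connection, MR and CM ID. */
struct res_state {
	int conn;
	int mr;
	int cm_id;
};

static bool acquire(int *slot, bool ok)
{
	if (ok)
		*slot = 1;
	return ok;
}

static void release(int *slot)
{
	*slot = 0;
}

/* fail_at: 0 = every step succeeds, 1..3 = the step that fails. */
static bool data_init_sketch(struct res_state *r, int fail_at)
{
	assert(fail_at >= 0 && fail_at <= 3);

	if (!acquire(&r->conn, fail_at != 1))
		goto failure;
	if (!acquire(&r->mr, fail_at != 2))
		goto destroy_conn;
	if (!acquire(&r->cm_id, fail_at != 3))
		goto dereg_mr;
	return true;

	/* Unwind in reverse order of acquisition. */
dereg_mr:
	release(&r->mr);
destroy_conn:
	release(&r->conn);
failure:
	return false;
}
```

Each label releases exactly what was acquired before the failing step, so every error exit leaves the structure as it was on entry.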
[openib-general] [PATCH 3/10] Driver viport files - implementation of communication protocol with VEx
Adds the driver viport files. These files implement the state machine for the communication protocol with the VEx. Signed-off-by: Ramachandra K <[EMAIL PROTECTED]> --- drivers/infiniband/ulp/vnic/vnic_viport.c | 936 + drivers/infiniband/ulp/vnic/vnic_viport.h | 175 + 2 files changed, insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/vnic/vnic_viport.c b/drivers/infiniband/ulp/vnic/vnic_viport.c new file mode 100644 index 000..516e802 --- /dev/null +++ b/drivers/infiniband/ulp/vnic/vnic_viport.c @@ -0,0 +1,936 @@ +/* + * Copyright (c) 2006 SilverStorm Technologies Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + *copyright notice, this list of conditions and the following + *disclaimer. + * + * - Redistributions in binary form must reproduce the above + *copyright notice, this list of conditions and the following + *disclaimer in the documentation and/or other materials + *provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ */ + +#include +#include +#include +#include +#include + +#include "vnic_util.h" +#include "vnic_main.h" +#include "vnic_viport.h" +#include "vnic_netpath.h" +#include "vnic_control.h" +#include "vnic_data.h" +#include "vnic_config.h" +#include "vnic_control_pkt.h" + +DECLARE_WAIT_QUEUE_HEAD(viport_queue); +LIST_HEAD(viport_list); +DECLARE_COMPLETION(viport_thread_exit); +spinlock_t viport_list_lock = SPIN_LOCK_UNLOCKED; + +int viport_thread = -1; +int viport_thread_end = 0; + +struct viport *viport_allocate(struct viport_config *config) +{ + struct viport *viport; + + VIPORT_FUNCTION("viport_allocate()\n"); + viport = (struct viport *)kmalloc(sizeof(struct viport), GFP_KERNEL); + if (!viport) { + VIPORT_ERROR("failed allocating viport structure\n"); + config_free_viport(viport->config); + return NULL; + } + memset(viport, 0, sizeof(struct viport)); + + viport->state = VIPORT_DISCONNECTED; + viport->link_state = LINK_RETRYWAIT; + viport->connect = WAIT; + viport->new_mtu = 1500; + viport->new_flags = 0; + viport->config = config; + + spin_lock_init(&viport->lock); + init_waitqueue_head(&viport->stats_queue); + init_waitqueue_head(&viport->disconnect_queue); + INIT_LIST_HEAD(&viport->list_ptrs); + + viport_kick(viport); + + return viport; +} + +BOOLEAN viport_connect(struct viport * viport, BOOLEAN delay) +{ + VIPORT_FUNCTION("viport_connect()\n"); + if (viport->parent == NULL) { + return FALSE; + } + if (delay) + viport->connect = DELAY; + else + viport->connect = NOW; + viport_kick(viport); + return TRUE; +} + +BOOLEAN viport_set_parent(struct viport *viport, struct netpath *netpath) +{ + VIPORT_FUNCTION("viport_set_parent()\n"); + if (viport->parent != NULL) { + return FALSE; + } + + viport->parent = netpath; + viport_kick(viport); + return TRUE; +} + +BOOLEAN viport_unset_parent(struct viport * viport, struct netpath * netpath) +{ + VIPORT_FUNCTION("viport_unset_parent()\n"); + if (viport->parent != netpath) { + return FALSE; + } + viport_free(viport); + 
return TRUE; +} + +void viport_free(struct viport *viport) +{ + VIPORT_FUNCTION("viport_free()\n"); + viport_disconnect(viport); /* NOTE: this can sleep */ + config_free_viport(viport->config); + kfree(viport); + return; +} + +void viport_disconnect(struct viport *viport) +{ + VIPORT_FUNCTION("viport_disconnect()\n"); + viport->disconnect = 1; + viport_failure(viport); + wait_event(viport->disconnect_queue, viport->disconnect == 0); + return; +} + +BOOLEAN viport_set_link(struct viport * viport, u16 flags, u16 mtu) +{ + unsigned long localflags
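In viport_allocate() as posted, the failure branch calls config_free_viport(viport->config) at a point where viport is NULL. The function receives config as a parameter, so the error path can release that instead. The following userspace sketch shows the idea. The stub types, config_release() and the injectable allocator are hypothetical and are there only so the failure path can be exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct viport_config_stub {
	int released;
};

struct viport_stub {
	struct viport_config_stub *config;
	int new_mtu;
};

typedef void *(*zalloc_fn)(size_t nmemb, size_t size);

/* Allocator that always fails, to exercise the error path. */
static void *failing_zalloc(size_t nmemb, size_t size)
{
	(void)nmemb;
	(void)size;
	return NULL;
}

static void config_release(struct viport_config_stub *config)
{
	config->released = 1;
}

static struct viport_stub *viport_alloc(struct viport_config_stub *config,
					zalloc_fn zalloc)
{
	struct viport_stub *viport;

	assert(config != NULL && zalloc != NULL);

	viport = zalloc(1, sizeof(*viport));	/* zeroed, no memset needed */
	if (!viport) {
		config_release(config);	/* use the argument: viport is NULL */
		return NULL;
	}
	viport->config = config;
	viport->new_mtu = 1500;
	return viport;
}
```

A zeroing allocator also replaces the kmalloc() plus memset() pair in the original; in kernel code of that era the equivalent is kzalloc(), which the same patch series already uses in vnic_add_port().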
[openib-general] [PATCH 2/10] Driver netpath files - abstraction of connection to VEx
Adds the driver netpath files. These files implement the netpath layer.
Netpath is an abstraction of a connection to the VEx.

Signed-off-by: Ramachandra K <[EMAIL PROTECTED]>
---

 drivers/infiniband/ulp/vnic/vnic_netpath.c |  250
 drivers/infiniband/ulp/vnic/vnic_netpath.h |  103
 2 files changed, 353 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/ulp/vnic/vnic_netpath.c b/drivers/infiniband/ulp/vnic/vnic_netpath.c
new file mode 100644
index 000..e02d602
--- /dev/null
+++ b/drivers/infiniband/ulp/vnic/vnic_netpath.c
@@ -0,0 +1,250 @@
+/*
+ * Copyright (c) 2006 SilverStorm Technologies Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */ + +#include +#include + +#include "vnic_util.h" +#include "vnic_main.h" +#include "vnic_viport.h" +#include "vnic_netpath.h" + +void netpath_init(struct netpath *netpath, struct vnic *vnic, int second_bias) +{ + netpath->parent = vnic; + netpath->carrier = 0; + netpath->viport = NULL; + netpath->second_bias = second_bias; + netpath->timer_state = NETPATH_TS_IDLE; + init_timer(&netpath->timer); + return; +} + +void vnic_npevent_timeout(unsigned long data) +{ + struct netpath *netpath = (struct netpath *)data; + vnic_npevent_queue_evt(netpath, VNICNP_TIMEREXPIRED); +} + +void netpath_timer(struct netpath *netpath, int timeout) +{ + if (netpath->timer_state == NETPATH_TS_ACTIVE) { + del_timer_sync(&netpath->timer); + } + if (timeout) { + init_timer(&netpath->timer); + netpath->timer_state = NETPATH_TS_ACTIVE; + netpath->timer.expires = jiffies + timeout; + netpath->timer.data = (unsigned long)netpath; + netpath->timer.function = vnic_npevent_timeout; + add_timer(&netpath->timer); + } else { + vnic_npevent_timeout((unsigned long)netpath); + } + return; +} + +void netpath_timer_stop(struct netpath *netpath) +{ + if (netpath->timer_state == NETPATH_TS_ACTIVE) { + del_timer_sync(&netpath->timer); + vnic_npevent_dequeue_evt(netpath, VNICNP_TIMEREXPIRED); + netpath->timer_state = NETPATH_TS_IDLE; + } +} + +void netpath_free(struct netpath *netpath) +{ + if (netpath->viport) { + netpath_remove_path(netpath, netpath->viport); + class_device_unregister(&netpath->class_dev_info.class_dev); + wait_for_completion(&netpath->class_dev_info.released); + } + + return; +} + +BOOLEAN netpath_add_path(struct netpath * netpath, struct viport * viport) +{ + if (netpath->viport) { + return FALSE; + } else { + netpath->viport = viport; + viport_set_parent(viport, netpath); + return TRUE; + } +} + +BOOLEAN netpath_remove_path(struct netpath * netpath, struct viport * viport) +{ + if (netpath->viport != viport) { + return FALSE; + } else { + netpath->viport = NULL; + 
viport_unset_parent(viport, netpath); + return TRUE; + } +} + +void netpath_connected(struct netpath *netpath, struct viport *viport) +{ + vnic_connected(netpath->parent, netpath); + return; +} + +void netpath_disconnected(struct netpath *netpath, struct viport *viport) +{ + vnic_disconnected(netpath->parent, netpath); + return; +} + +BOOLEAN netpath_set_link(struct netpath * netpath, u16 flags, u16 mtu) +{ + BOOLEAN ret = FALSE; + + NETPA
[openib-general] [PATCH 1/10] Driver Main files - netdev functions and corresponding state maintenance
Adds the driver main files. These files implement netdev registration, netdev functions and state maintenance of the virtual NIC corresponding to the various events associated with the Virtual Ethernet IOC (VEx) connection. Signed-off-by: Ramachandra K <[EMAIL PROTECTED]> --- drivers/infiniband/ulp/vnic/vnic_main.c | 1040 +++ drivers/infiniband/ulp/vnic/vnic_main.h | 152 + 2 files changed, 1192 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/vnic/vnic_main.c b/drivers/infiniband/ulp/vnic/vnic_main.c new file mode 100644 index 000..b87e00b --- /dev/null +++ b/drivers/infiniband/ulp/vnic/vnic_main.c @@ -0,0 +1,1040 @@ +/* + * Copyright (c) 2006 SilverStorm Technologies Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + *copyright notice, this list of conditions and the following + *disclaimer. + * + * - Redistributions in binary form must reproduce the above + *copyright notice, this list of conditions and the following + *disclaimer in the documentation and/or other materials + *provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include "vnic_util.h" +#include "vnic_main.h" +#include "vnic_netpath.h" +#include "vnic_viport.h" +#include "vnic_ib.h" + +#define MODULEVERSION "0.1" +#define MODULEDETAILS "Virtual NIC driver version " MODULEVERSION + +MODULE_AUTHOR("SilverStorm Technologies Inc."); +MODULE_DESCRIPTION(MODULEDETAILS); +MODULE_LICENSE("Dual BSD/GPL"); +MODULE_SUPPORTED_DEVICE("SilverStorm Ethernet Virtual I/O Controller"); + +u32 vnic_debug = 0x0; + +module_param(vnic_debug, uint, 0444); + +LIST_HEAD(vnic_list); + +const char driver[] = "vnic"; + +void vnic_connected(struct vnic *vnic, struct netpath *netpath) +{ + VNIC_FUNCTION("vnic_connected()\n"); + vnic_npevent_queue_evt(netpath, VNICNP_CONNECTED); +#ifdef CONFIG_INFINIBAND_VNIC_STATS + if (vnic->statistics.conn_time == 0) { + vnic->statistics.conn_time = + get_cycles() - vnic->statistics.start_time; + } + if (vnic->statistics.disconn_ref != 0) { + vnic->statistics.disconn_time += + get_cycles() - vnic->statistics.disconn_ref; + vnic->statistics.disconn_num++; + vnic->statistics.disconn_ref = 0; + } +#endif /* CONFIG_INFINIBAND_VNIC_STATS */ +} + +void vnic_disconnected(struct vnic *vnic, struct netpath *netpath) +{ + VNIC_FUNCTION("vnic_disconnected()\n"); + vnic_npevent_queue_evt(netpath, VNICNP_DISCONNECTED); +} + +void vnic_link_up(struct vnic *vnic, struct netpath *netpath) +{ + VNIC_FUNCTION("vnic_link_up()\n"); + vnic_npevent_queue_evt(netpath, VNICNP_LINKUP); +} + +void vnic_link_down(struct vnic *vnic, struct netpath *netpath) +{ + VNIC_FUNCTION("vnic_link_down()\n"); + vnic_npevent_queue_evt(netpath, 
VNICNP_LINKDOWN); +} + +void vnic_stop_xmit(struct vnic *vnic, struct netpath *netpath) +{ + VNIC_FUNCTION("vnic_stop_xmit()\n"); + if (netpath == vnic->current_path) { + if (vnic->xmit_started) { + netif_stop_queue(&vnic->netdevice); + vnic->xmit_started = 0; + } +#ifdef CONFIG_INFINIBAND_VNIC_STATS + if (vnic->statistics.xmit_ref == 0) { + vnic->statistics.xmit_ref = get_cycles(); + } +#endif /* CONFIG_INFINIBAND_VNIC_STATS */ + } + return; +} + +void vnic_restart_xmit(struct vnic *vnic, struct netpath *netpath) +{ + VNIC_FUNCTION("vnic_restart_xmit()\n"); + if (netpath == vnic->current_path) { + if (!vnic->xmit_started) { + netif_wake_queue(&vnic->netdevice); + vnic->xmit_started = 1; + } +#ifdef CONFIG_INFINIBAND_VNI
[openib-general] [PATCH 0/10] [RFC] Support for SilverStorm Virtual Ethernet I/O controller (VEx)
Hi Roland, This patch series adds support for the SilverStorm Virtual Ethernet I/O Controllers (VEx) by adding a new kernel level driver. This kernel driver: 1. Communicates with the VEx on the SilverStorm fabric switches/directors using SilverStorm's native protocol 2. Presents a standard Ethernet NIC interface to the system 3. Uses IB reliable connection semantics 4. Is tuned for high performance and throughput The SilverStorm VEx and the associated communication protocol are in wide use amongst users of SilverStorm IB fabric solutions. This patch series is intended for your infiniband.git for-2.6.19 branch. It also has been tested against the for-2.6.20 branch. Signed-off-by: Ramachandra K <[EMAIL PROTECTED]> Regards, Ram ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] FW: Mstflint - not working on ppc64 andwhendriver is notloaded on AMD
On Sun, 2006-10-01 at 09:50 +0200, Michael S. Tsirkin wrote: > Quoting r. Tseng-Hui (Frank) Lin <[EMAIL PROTECTED]>: > > Subject: RE: FW: Mstflint - not working on ppc64 andwhendriver is notloaded > > on AMD > > > > The ppc64 problem is actually in pci_64.c. Here is the patch: > > > > cut here = > > diff --git a/arch/powerpc/kernel/pci_64.c b/arch/powerpc/kernel/pci_64.c > > index 4c4449b..490403c 100644 > > --- a/arch/powerpc/kernel/pci_64.c > > +++ b/arch/powerpc/kernel/pci_64.c > > @@ -734,9 +734,7 @@ static struct resource *__pci_mmap_make_ > > if (hose == 0) > > return NULL;/* should never happen */ > > > > - /* If memory, add on the PCI bridge address offset */ > > if (mmap_state == pci_mmap_mem) { > > - *offset += hose->pci_mem_offset; > > res_bit = IORESOURCE_MEM; > > } else { > > io_offset = (unsigned long)hose->io_base_virt - pci_io_base; > > = end cut = > > > > The mmap() system call on resource0 does not work on ppc64 without this > > patch. PowerMAC G5 got away with this because its hose->pci_mem_offset > > was set to 0. > > > > The fix is made on 8/21. It may be able to make it into 2.6.19. But it > > certainly won't get into SLES10, SLES9-SP3, or REHL4-U4 which have > > already been released. > > > > To cover both cases with and without the fix, my patch try to > > mmap /sys/bus/pci//resource0 first. It it failed it tries > > mmap /proc/bus/pci/ If it failed again, we have no choice but fall > > back to use PCI config space. > > OK, so for OFED just mmap from /proc/bus/pci/ should be sufficient > work-around - it will make things work when driver is loaded. > Correct? > Michael: No. Without the above patch for pci_64.c, mmap() is broken on ppc64 whether you mmap() from /sys/bus/pci//resource0 or from /proc/bus/pci/. The only way that works is "mstflint -d /proc/bus/pci/", which uses pread() and pwrite() instead of mmap(). With the patch, mmap() from /sys/bus/pci//resource0 and from /proc/bus/pci/ both work when the mthca driver is loaded. No workaround is needed. 
Note that "-d " uses mmap from /proc/bus/pci/. "-d /proc/bus/pci/" uses pread() and pwrite().
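The fallback order described in the quoted text (try one mmap source, then the next, and only then give up on mmap) can be sketched in user space as below. This is a minimal illustration, not mstflint's actual code: the function name map_first_usable and its parameters are hypothetical, the device paths are left to the caller because they are elided in the thread, and any extra setup a real PCI mapping may need is omitted.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to mmap each candidate path in order.
 * On success, stores the mapping in *out and returns the index of the
 * path that worked. Returns -1 if no path could be mapped, in which
 * case the caller would fall back to PCI config space access.
 */
int map_first_usable(const char *const *paths, size_t npaths,
		     size_t len, void **out)
{
	for (size_t i = 0; i < npaths; i++) {
		int fd = open(paths[i], O_RDWR);
		if (fd < 0)
			continue;	/* path missing or not accessible */

		void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_SHARED, fd, 0);
		close(fd);		/* an established mapping survives close() */
		if (p != MAP_FAILED) {
			*out = p;
			return (int)i;
		}
	}
	return -1;
}
```

A caller would pass the resource0 path first and the /proc/bus/pci path second, and treat a return value of -1 as the signal to use the pread()/pwrite() or config-space route instead.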
Re: [openib-general] RHEL 4 U3 - lost completions
Roland Dreier wrote: > > Bill> At 1st, I thought that was the case, a fork, however, I do > Bill> not think get_user_pages(), and the increment of the ref > Bill> count, will guarantee the page struct does not change for > Bill> RHEL 4 U3, I need to verify that though. > > Are you doing a fork()? If so then, yes, you will not be able to make > your app work on a RHEL4 kernel. After get_user_pages(), if you do a > fork() then a copy-on-write will still happen, which will cause the > physical page to move as you have discovered. There is no fork that I am aware of in the code. The pthread that created the EVD and any other thread in the process that executes the debug code sees the changed page struct. I will try to recreate this in a test app. -Bill
Re: [openib-general] RHEL 4 U3 - lost completions
Bill> At 1st, I thought that was the case, a fork, however, I do Bill> not think get_user_pages(), and the increment of the ref Bill> count, will guarantee the page struct does not change for Bill> RHEL 4 U3, I need to verify that though. Are you doing a fork()? If so then, yes, you will not be able to make your app work on a RHEL4 kernel. After get_user_pages(), if you do a fork() then a copy-on-write will still happen, which will cause the physical page to move as you have discovered. This is fixed on newer kernels with libibverbs 1.1 (not yet released though). I don't think there's any real way to make it work on RHEL4's 2.6.9 kernel. - R.
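The message does not say which kernel facility the newer-kernel fix relies on. One mechanism that Linux kernels newer than RHEL4's 2.6.9 provide for keeping a region out of fork()'s copy-on-write handling is madvise() with MADV_DONTFORK; treating that as the relevant mechanism here is an assumption, not something stated in the thread. A minimal user-space sketch of marking a buffer that way (the helper name map_dontfork is hypothetical):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of anonymous memory and mark the
 * range so that a child created by fork() does not inherit it. The
 * parent's pages then never become copy-on-write because of a fork.
 * Returns 0 on success, -1 on failure.
 */
int map_dontfork(size_t len, void **out)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	if (madvise(p, len, MADV_DONTFORK) != 0) {
		munmap(p, len);	/* kernel without MADV_DONTFORK support */
		return -1;
	}
	*out = p;
	return 0;
}
```

On a kernel that lacks MADV_DONTFORK the madvise() call is expected to fail, which is consistent with the statement above that there is no real way to make this work on 2.6.9.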
Re: [openib-general] 2.6.18 kernel support in the main trunk.
James Lentini wrote: >>> It is likely that companies will restrict developers at HCA >>> vendor X from contributing code to HCA vendor Y's repository. >> I doubt it. > > Unfortunately this does happen. Sean has already said he can only > access git trees at kernel.org. That appears to be a matter of Intel's lamebrained firewall rules getting in his way, not a "thou shalt not poke at competitor's open code" restriction.
Re: [openib-general] RHEL 4 U3 - lost completions
Roland Dreier wrote: > > Bill> I am testing an app in development on RHEL 4 U3 using uDAPL. > Bill> The app runs OK on gen1 stacks, but cannot run on any OFED > Bill> based stack I have tried on RHEL 4 U3. The symptom is RDMAs > Bill> not getting completion. A completion notification is sent, > Bill> but mthca_poll_cq() finds no completion. I debugged the > Bill> problem to this: the memory for the completion queue is not > Bill> pinned and at some point the page struct changes *after* the > Bill> HCA has been handed the address of the completion queue, so > Bill> subsequent completions are written elsewhere in memory and > Bill> the app hangs waiting for completion. > > The memory should be pinned by the call to __mthca_reg_mr() in > mthca_create_cq(), since the kernel will do get_user_pages() on the > memory. > > By any chance, does your app do fork() or system() or something like that? At 1st, I thought that was the case, a fork, however, I do not think get_user_pages(), and the increment of the ref count, will guarantee the page struct does not change for RHEL 4 U3, I need to verify that though. I dumped the page struct in ib_umem_get() when the completion queue memory was 1st registered. Then my DTO event thread, on a 10 second timeout, would go ahead and create another EVD (not used) so I could then dump the page struct of the 1st completion queue again in ib_umem_get(), and sure enough the page struct changed. If I wrote some code that mapped an address to the original page struct, I would probably see the completions there. -Bill
Re: [openib-general] [PATCH 0 of 28] ipath patches for 2.6.19
Eric W. Biederman wrote: > Have you tested your driver against the -mm tree? No. > To the best of my knowledge the irq handling of your hypertransport card > is a complete and total hack that works only by chance. And a happy Monday morning to you, too :-) > In the -mm tree I have added a first pass at proper support for the > hypertranport interrupt capability. As this code is slated to go into > 2.6.19 could you please test against that? I'm on vacation for a few weeks. We'll find someone to do it.
Re: [openib-general] RHEL 4 U3 - lost completions
Bill> I am testing an app in development on RHEL 4 U3 using uDAPL. Bill> The app runs OK on gen1 stacks, but cannot run on any OFED Bill> based stack I have tried on RHEL 4 U3. The symptom is RDMAs Bill> not getting completion. A completion notification is sent, Bill> but mthca_poll_cq() finds no completion. I debugged the Bill> problem to this: the memory for the completion queue is not Bill> pinned and at some point the page struct changes *after* the Bill> HCA has been handed the address of the completion queue, so Bill> subsequent completions are written elsewhere in memory and Bill> the app hangs waiting for completion. The memory should be pinned by the call to __mthca_reg_mr() in mthca_create_cq(), since the kernel will do get_user_pages() on the memory. By any chance, does your app do fork() or system() or something like that? - R.
[openib-general] RHEL 4 U3 - lost completions
I am testing an app in development on RHEL 4 U3 using uDAPL. The app runs OK on gen1 stacks, but cannot run on any OFED based stack I have tried on RHEL 4 U3. The symptom is RDMAs not getting completion. A completion notification is sent, but mthca_poll_cq() finds no completion. I debugged the problem to this: the memory for the completion queue is not pinned and at some point the page struct changes *after* the HCA has been handed the address of the completion queue, so subsequent completions are written elsewhere in memory and the app hangs waiting for completion. I hacked in the following to get the app running, I replaced the allocation of the completion buffer in libmthca, ret = posix_memalign(memptr, alignment, size); with, size = (size + (4096-1)) & ~(4096-1); *memptr = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS | MAP_LOCKED,0,0); Is there a restriction on using completion queues on a RHEL 4 Update 3 kernel ? Am I missing a patch ? Details in http://openib.org/bugzilla/show_bug.cgi?id=147 -Bill
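Laid out as a standalone function, the workaround above looks roughly like the sketch below. It is an illustration under stated assumptions, not the libmthca code: the helper names round_up_to_page and alloc_locked_buf are hypothetical, the page size is passed in rather than hard-coded to 4096, and the file descriptor argument is -1 (the conventional value for anonymous mappings) where the message passes 0.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Round size up to a multiple of page_size (page_size must be a power of two). */
size_t round_up_to_page(size_t size, size_t page_size)
{
	return (size + (page_size - 1)) & ~(page_size - 1);
}

/*
 * Hypothetical replacement for the posix_memalign() call: returns a
 * page-aligned, zero-filled buffer that the kernel is asked to keep
 * locked in RAM, or NULL on failure (for example when the locked-memory
 * limit, RLIMIT_MEMLOCK, is too low). *mapped_len receives the rounded
 * length, which is needed later for munmap().
 */
void *alloc_locked_buf(size_t size, size_t page_size, size_t *mapped_len)
{
	size_t len = round_up_to_page(size, page_size);
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_SHARED | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);

	if (p == MAP_FAILED)
		return NULL;
	*mapped_len = len;
	return p;
}
```

Unlike memory from posix_memalign(), a buffer obtained this way has to be released with munmap(), using the rounded length, rather than with free().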
Re: [openib-general] [PATCH 2.6.19-rc1] ehca: fix ehca_probe if module loaded after ib_ipoib
Looks OK but your mailer mangled the patch. Please resend in a form that can be applied... Also: > In addition to that this patch contains a very small format improvement > in our tracing function. please send unrelated changes as separate patches. So this should come as two patches -- one to fix the device registration, and one to change your debug formatting. Thanks, Roland
[openib-general] [PATCH 2.6.19-rc1] ehca: fix ehca_probe if module loaded after ib_ipoib
Hello Roland! Below is a patch of ehca, which fixes a bug (crash) that occurred when ib_ehca is loaded after ib_ipoib. This patch initializes struct ehca_shca with struct device*, then creates internal resources and finally registers the ehca IB device. That is the proper sequence. In addition to that this patch contains a very small format improvement in our tracing function. Thanks! Nam Nguyen Signed-off-by: Hoang-Nam Nguyen <[EMAIL PROTECTED]> --- ehca_main.c | 36 +++- ehca_tools.h |2 +- 2 files changed, 20 insertions(+), 18 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_main.c b/drivers/infiniband/hw/ehca/ehca_main.c index 2380994..024d511 100644 --- a/drivers/infiniband/hw/ehca/ehca_main.c +++ b/drivers/infiniband/hw/ehca/ehca_main.c @@ -49,7 +49,7 @@ #include "hcp_if.h" MODULE_LICENSE("Dual BSD/GPL"); MODULE_AUTHOR("Christoph Raisch <[EMAIL PROTECTED]>"); MODULE_DESCRIPTION("IBM eServer HCA InfiniBand Device Driver"); -MODULE_VERSION("SVNEHCA_0016"); +MODULE_VERSION("SVNEHCA_0017"); int ehca_open_aqp1 = 0; int ehca_debug_level = 0; @@ -239,7 +239,7 @@ init_node_guid1: return ret; } -int ehca_register_device(struct ehca_shca *shca) +int ehca_init_device(struct ehca_shca *shca) { int ret; @@ -317,11 +317,6 @@ int ehca_register_device(struct ehca_shc /* shca->ib_device.process_mad = ehca_process_mad; */ shca->ib_device.mmap = ehca_mmap; - ret = ib_register_device(&shca->ib_device); - if (ret) - ehca_err(&shca->ib_device, -"ib_register_device() failed ret=%x", ret); - return ret; } @@ -561,9 +556,9 @@ static int __devinit ehca_probe(struct i goto probe1; } - ret = ehca_register_device(shca); + ret = ehca_init_device(shca); if (ret) { - ehca_gen_err("Cannot register Infiniband device"); + ehca_gen_err("Cannot init ehca device struct"); goto probe1; } @@ -571,7 +566,7 @@ static int __devinit ehca_probe(struct i ret = ehca_create_eq(shca, &shca->eq, EHCA_EQ, 2048); if (ret) { ehca_err(&shca->ib_device, "Cannot create EQ."); - goto probe2; + goto 
probe1; } ret = ehca_create_eq(shca, &shca->neq, EHCA_NEQ, 513); @@ -600,6 +595,13 @@ static int __devinit ehca_probe(struct i goto probe5; } + ret = ib_register_device(&shca->ib_device); + if (ret) { + ehca_err(&shca->ib_device, +"ib_register_device() failed ret=%x", ret); + goto probe6; + } + /* create AQP1 for port 1 */ if (ehca_open_aqp1 == 1) { shca->sport[0].port_state = IB_PORT_DOWN; @@ -607,7 +609,7 @@ static int __devinit ehca_probe(struct i if (ret) { ehca_err(&shca->ib_device, "Cannot create AQP1 for port 1."); - goto probe6; + goto probe7; } } @@ -618,7 +620,7 @@ static int __devinit ehca_probe(struct i if (ret) { ehca_err(&shca->ib_device, "Cannot create AQP1 for port 2."); - goto probe7; + goto probe8; } } @@ -630,12 +632,15 @@ static int __devinit ehca_probe(struct i return 0; -probe7: +probe8: ret = ehca_destroy_aqp1(&shca->sport[0]); if (ret) ehca_err(&shca->ib_device, "Cannot destroy AQP1 for port 1. ret=%x", ret); +probe7: + ib_unregister_device(&shca->ib_device); + probe6: ret = ehca_dereg_internal_maxmr(shca); if (ret) @@ -660,9 +665,6 @@ probe3: ehca_err(&shca->ib_device, "Cannot destroy EQ. ret=%x", ret); -probe2: - ib_unregister_device(&shca->ib_device); - probe1: ib_dealloc_device(&shca->ib_device); @@ -750,7 +752,7 @@ int __init ehca_module_init(void) int ret; printk(KERN_INFO "eHCA Infiniband Device Driver " - "(Rel.: SVNEHCA_0016)\n"); + "(Rel.: SVNEHCA_0017)\n"); idr_init(&ehca_qp_idr); idr_init(&ehca_cq_idr); spin_lock_init(&ehca_qp_idr_lock); diff --git a/drivers/infiniband/hw/ehca/ehca_tools.h b/drivers/infiniband/hw/ehca/ehca_tools.h index 9f56bb8..809da3e 100644 --- a/drivers/infiniband/hw/ehca/ehca_tools.h +++ b/drivers/infiniband/hw/ehca/ehca_tools.h @@ -117,7 +117,7 @@ #define ehca_dmp(adr, len, format, args. 
unsigned int l = (unsigned int)(len); \ unsigned char *deb = (unsigned char*)(adr); \ for (x = 0; x < l; x += 16) { \ - printk("EHCA_DMP:%s" format \ + printk("EHCA_DMP:%s " format \ " adr=%p ofs=%04x %016lx %016lx\n", \ __FUNCTION__, ##args, deb, x, \ *((u64 *)&deb[0]), *((u64 *)&deb[8])); \
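The ordering the patch description argues for (fill in the device structure, create internal resources, register last, and unwind in reverse on failure) can be illustrated with a small stand-alone model. Everything below is hypothetical scaffolding rather than ehca code: the struct, the step constants and the probe() function only mimic the shape of the goto-based error path in the patch.

```c
#include <assert.h>

/* Hypothetical step identifiers; step_fails selects which step fails (0 = none). */
enum { STEP_INIT = 1, STEP_RESOURCES, STEP_REGISTER };

/* Records which setup steps are currently in effect. */
struct probe_log {
	int initialized;
	int resources;
	int registered;
};

int probe(struct probe_log *log, int step_fails)
{
	/* 1. Fill in the device structure; nothing is visible to clients yet. */
	if (step_fails == STEP_INIT)
		goto err_init;
	log->initialized = 1;

	/* 2. Create internal resources (event queues, internal MR and so on). */
	if (step_fails == STEP_RESOURCES)
		goto err_resources;
	log->resources = 1;

	/* 3. Register last, so clients only ever see a fully set up device. */
	if (step_fails == STEP_REGISTER)
		goto err_register;
	log->registered = 1;
	return 0;

	/* Unwind in reverse order of setup. */
err_register:
	log->resources = 0;	/* undo step 2 */
err_resources:
	log->initialized = 0;	/* undo step 1 */
err_init:
	return -1;
}
```

Calling probe() with step_fails set to 0 leaves all three flags set, while any non-zero value returns -1 with every flag cleared again. Registering last matters presumably because an already loaded client such as ib_ipoib may start using the device as soon as it is registered, which matches the crash described in the message.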
[openib-general] [PATCH v2] ib_cm: fix module unload race with timewait
Updated patch based on Roland's feedback - converted a couple uses of spinlock_irqsave to spinlock_irq, and used list manipulation routine for cleanup. Signed-off-by: Sean Hefty <[EMAIL PROTECTED]> --- Index: cm.c === --- cm.c(revision 9680) +++ cm.c(working copy) @@ -75,6 +75,7 @@ static struct ib_cm { struct rb_root remote_sidr_table; struct idr local_id_table; __be32 random_id_operand; + struct list_head timewait_list; struct workqueue_struct *wq; } cm; @@ -112,6 +113,7 @@ struct cm_work { struct cm_timewait_info { struct cm_work work;/* Must be first. */ + struct list_head list; struct rb_node remote_qp_node; struct rb_node remote_id_node; __be64 remote_ca_guid; @@ -648,13 +650,6 @@ static inline int cm_convert_to_ms(int i static void cm_cleanup_timewait(struct cm_timewait_info *timewait_info) { - unsigned long flags; - - if (!timewait_info->inserted_remote_id && - !timewait_info->inserted_remote_qp) - return; - - spin_lock_irqsave(&cm.lock, flags); if (timewait_info->inserted_remote_id) { rb_erase(&timewait_info->remote_id_node, &cm.remote_id_table); timewait_info->inserted_remote_id = 0; @@ -664,7 +659,6 @@ static void cm_cleanup_timewait(struct c rb_erase(&timewait_info->remote_qp_node, &cm.remote_qp_table); timewait_info->inserted_remote_qp = 0; } - spin_unlock_irqrestore(&cm.lock, flags); } static struct cm_timewait_info * cm_create_timewait_info(__be32 local_id) @@ -685,8 +679,12 @@ static struct cm_timewait_info * cm_crea static void cm_enter_timewait(struct cm_id_private *cm_id_priv) { int wait_time; + unsigned long flags; + spin_lock_irqsave(&cm.lock, flags); cm_cleanup_timewait(cm_id_priv->timewait_info); + list_add_tail(&cm_id_priv->timewait_info->list, &cm.timewait_list); + spin_unlock_irqrestore(&cm.lock, flags); /* * The cm_id could be destroyed by the user before we exit timewait. 
@@ -702,9 +700,13 @@ static void cm_enter_timewait(struct cm_ static void cm_reset_to_idle(struct cm_id_private *cm_id_priv) { + unsigned long flags; + cm_id_priv->id.state = IB_CM_IDLE; if (cm_id_priv->timewait_info) { + spin_lock_irqsave(&cm.lock, flags); cm_cleanup_timewait(cm_id_priv->timewait_info); + spin_unlock_irqrestore(&cm.lock, flags); kfree(cm_id_priv->timewait_info); cm_id_priv->timewait_info = NULL; } @@ -1308,6 +1310,7 @@ static struct cm_id_private * cm_match_r if (timewait_info) { cur_cm_id_priv = cm_get_id(timewait_info->work.local_id, timewait_info->work.remote_id); + cm_cleanup_timewait(cm_id_priv->timewait_info); spin_unlock_irqrestore(&cm.lock, flags); if (cur_cm_id_priv) { cm_dup_req_handler(work, cur_cm_id_priv); @@ -1316,7 +1319,8 @@ static struct cm_id_private * cm_match_r cm_issue_rej(work->port, work->mad_recv_wc, IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REQ, NULL, 0); - goto error; + listen_cm_id_priv = NULL; + goto out; } /* Find matching listen request. */ @@ -1324,21 +1328,20 @@ static struct cm_id_private * cm_match_r req_msg->service_id, req_msg->private_data); if (!listen_cm_id_priv) { + cm_cleanup_timewait(cm_id_priv->timewait_info); spin_unlock_irqrestore(&cm.lock, flags); cm_issue_rej(work->port, work->mad_recv_wc, IB_CM_REJ_INVALID_SERVICE_ID, CM_MSG_RESPONSE_REQ, NULL, 0); - goto error; + goto out; } atomic_inc(&listen_cm_id_priv->refcount); atomic_inc(&cm_id_priv->refcount); cm_id_priv->id.state = IB_CM_REQ_RCVD; atomic_inc(&cm_id_priv->work_count); spin_unlock_irqrestore(&cm.lock, flags); +out: return listen_cm_id_priv; - -error: cm_cleanup_timewait(cm_id_priv->timewait_info); - return NULL; } static int cm_req_handler(struct cm_work *work) @@ -2630,28 +2633,29 @@ static int cm_timewait_handler(struct cm { struct cm_timewait_info *timewait_info; struct cm_id_private *cm_id_priv; - unsigned long flags; int ret; timewait_info = (struct cm_timewait_info *)work; - cm_cleanup_timewait(timewait_info); + spin_lock_irq(&cm.lock); + 
list_del(&timewait_info->list); +
Re: [openib-general] 2.6.18 kernel support in the main trunk.
>James> Unfortunately this does happen. Sean has already said he >James> can only access git trees at kernel.org. > >I think he just said that he can only access git trees via http://. I can access git://git.kernel.org or http://git.somewhere.else. - Sean
Re: [openib-general] 2.6.18 kernel support in the main trunk.
James> Unfortunately this does happen. Sean has already said he James> can only access git trees at kernel.org. I think he just said that he can only access git trees via http://. - R.
Re: [openib-general] 2.6.18 kernel support in the main trunk.
On Fri, 29 Sep 2006, Bryan O'Sullivan wrote: > On Fri, 2006-09-29 at 12:26 -0400, James Lentini wrote: > > > Balkanizing the OFA repository into corporate repositories would be a > > mistake. > > Nobody is suggesting this. However, separating the mess that is the > current SVN trunk into a set of well-understood branches, each of which > sees some testing by its authors in isolation, can *only* be a good > thing for ensuring a higher-quality OF process in general. > > > It is likely that companies will restrict developers at HCA > > vendor X from contributing code to HCA vendor Y's repository. > > I doubt it. Unfortunately this does happen. Sean has already said he can only access git trees at kernel.org. > As a practical matter, having your driver in the kernel tree means > it's open season for anyone who wants to take a crack at it. Just > look at the number of IB/10gbE/iWarp hardware vendors that have > fingerprints all over each other's code in drivers/infiniband/hw for > an example.
Re: [openib-general] Question about ehca CQ handling
> While looking over the ehca driver from the perspective of adding a > "peek CQ" operation, I noticed some code that looked funny. > > In hipz_set_cqx_n0() and hipz_set_cqx_n1(), what is the point of the > calls to hipz_galpa_load_cq()? The return value is discarded. I see > that hipz_galpa_load_cq() dereferences a volatile pointer internally, > so I'm guessing this is some sort of ordering constraint. But would > it be just as good to do "barrier()" there? > > - R. No, barrier won't help, the I/O bus connection is theoretically allowed to reorder and aggregate writes in a defined pattern. The recommended way to ensure that the ehca chip actually has seen the write is doing a read on the same address. Gruss / Regards . . . Christoph R
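The write-then-read-back idiom described in the reply can be sketched as follows. This is a user-space illustration only: write_then_read_back is a hypothetical name, an ordinary variable stands in for the device register, and a real driver would go through its platform's MMIO accessors (as the hipz_galpa_* helpers do) rather than plain pointer dereferences.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: store val to a register, then load from the same
 * address. On a real I/O bus the store may be posted (buffered and
 * possibly combined with other writes); the load of the same address is
 * what confirms the device has seen it. A compiler-only barrier() would
 * not provide that, since it constrains the compiler, not the bus.
 */
uint64_t write_then_read_back(volatile uint64_t *reg, uint64_t val)
{
	*reg = val;	/* posted write */
	return *reg;	/* read back from the same address */
}
```

The return value can be ignored by the caller, as in the driver code being discussed; the read is performed for its ordering effect, and volatile keeps the compiler from optimizing it away.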