Re: [ewg] RDS problematic on RC2

2008-01-16 Thread Olaf Kirch
On Thursday 17 January 2008 04:15, Johann George wrote: > RDS/IB: completion on 10.1.1.205 had status 9, disconnecting and reconnecting > > Note that this is using RDS over IB. Our minimal experience with the > non-IB version of RDS was worse. We only tried it with RC1 and it > crashed one of th

Re: [ewg] [PATCH 1/2] fmr_pool flush serials can get out of sync

2008-01-16 Thread Olaf Kirch
On Wednesday 16 January 2008 22:54, Roland Dreier wrote: > Thanks, good catch, and I applied this (except I removed the BUG_ON, > since I don't think killing the system with minimal info available on > how the counts got out of sync is that useful...) Can you turn it into a rate limited printk ins

Re: [ewg] RDS problematic on RC2

2008-01-16 Thread Olaf Kirch
On Thursday 17 January 2008 04:15, Johann George wrote: > We've been testing the OFED 1.3 pre-releases on a 12 node cluster here > at UNH-IOL. RDS seemed largely functional (other than problems we > were aware of) on OFED 1.3 RC1. When we installed RC2, RDS stopped > working. A dmesg indicates t

Re: [ewg] [PATCH 1/2] fmr_pool flush serials can get out of sync

2008-01-16 Thread Olaf Kirch
Hi Roland, On Wednesday 16 January 2008 22:54, Roland Dreier wrote: > However I'm a little puzzled about how this can lead to memory > corruption in practice: the only thing that flushing FMRs should do is > make memory keys that should no longer be in use anyway become > invalid. So the only eff

Re: [ewg] RDS problematic on RC2

2008-01-16 Thread Vladimir Sokolovsky
Johann George wrote: We've been testing the OFED 1.3 pre-releases on a 12 node cluster here at UNH-IOL. RDS seemed largely functional (other than problems we were aware of) on OFED 1.3 RC1. When we installed RC2, RDS stopped working. A dmesg indicates the following message repeatedly on the co

Re: [ewg] RE: [ofa-general] OFED Jan 14 meeting summary on RC2readiness

2008-01-16 Thread Roland Dreier
> Roland, you said that XRC API is ugly, are you going to push it upstream > in its present form? That's a good question. Since there is no 'present form' for XRC as far as I can tell, it's hard to make a definitive answer. Certainly I haven't made up my mind in advance one way or another. In

Re: [ewg] RDS problematic on RC2

2008-01-16 Thread Richard Frank
copying rds-dev. Johann George wrote: We've been testing the OFED 1.3 pre-releases on a 12 node cluster here at UNH-IOL. RDS seemed largely functional (other than problems we were aware of) on OFED 1.3 RC1. When we installed RC2, RDS stopped working. A dmesg indicates the following message re

[ewg] RDS problematic on RC2

2008-01-16 Thread Johann George
We've been testing the OFED 1.3 pre-releases on a 12 node cluster here at UNH-IOL. RDS seemed largely functional (other than problems we were aware of) on OFED 1.3 RC1. When we installed RC2, RDS stopped working. A dmesg indicates the following message repeatedly on the console: RDS/IB: complet

Re: [ewg] [PATCH 1/2] fmr_pool flush serials can get out of sync

2008-01-16 Thread Roland Dreier
> Normally, the serial numbers for flush requests and flushes > executed should be in sync. > > When we decide to flush dirty MRs because there are too many of them, we > wake up the cleanup thread and let it do its stuff. As a side effect, it > increments pool->flush_ser, which leaves it o

[ewg] dapltest segfault after calling inet_ntoa

2008-01-16 Thread Allen Hubbe
Here is a typical command I use to run dapltest: dapltest -T T -s 10.1.1.202 -D OpenIB-cma \ -i 100 -t 1 -w 1 -R BE client SR 256 Starting with some of the OFED-1.3 releases, this gives me a segmentation fault. I do not get a segmentation fault on OFED 1.2.5.5. The offending lines of co
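The snippet above is cut off before the offending lines, so the cause of the segfault is not visible here. As a general illustration only, the sketch below formats an IPv4 address into a caller-supplied buffer with inet_ntop(), with <arpa/inet.h> included so the prototype is visible. On 64-bit systems a missing prototype for inet_ntoa() is one common way to get a truncated pointer, but that is an assumption and is not confirmed by this message. The helper name format_ipv4 is hypothetical and is not part of dapltest.

```c
#include <arpa/inet.h>   /* inet_pton, inet_ntop (and the inet_ntoa prototype) */
#include <netinet/in.h>  /* struct in_addr, INET_ADDRSTRLEN */
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from dapltest: parse a dotted-quad string
 * and format it back into a buffer owned by the caller. Unlike inet_ntoa(),
 * inet_ntop() does not return a pointer to a shared static buffer.
 * Returns buf on success, NULL if the input is not a valid IPv4 address
 * or the buffer is too small.
 */
static const char *format_ipv4(const char *dotted, char *buf, size_t len)
{
    struct in_addr addr;

    assert(dotted != NULL && buf != NULL);

    if (inet_pton(AF_INET, dotted, &addr) != 1)
        return NULL;

    return inet_ntop(AF_INET, &addr, buf, (socklen_t)len);
}
```

A caller would pass a buffer of at least INET_ADDRSTRLEN bytes, for example the server address 10.1.1.202 from the command line above.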

[ewg] Re: [PATCH 2/2] fmr_pool_flush didn't flush all MRs

2008-01-16 Thread Olaf Kirch
From: Olaf Kirch <[EMAIL PROTECTED]> Subject: [fmr_pool] fmr_pool_flush didn't flush all MRs When a FMR is released via ib_fmr_pool_unmap, the FMR usually ends up on the free_list rather than the dirty_list (because we allow a certain number of remappings before actually requiring a flush). Howev

[ewg] [PATCH 1/2] fmr_pool flush serials can get out of sync

2008-01-16 Thread Olaf Kirch
From: Olaf Kirch <[EMAIL PROTECTED]> Subject: [fmr_pool] fmr_pool flush serials can get out of sync Normally, the serial numbers for flush requests and flushes executed should be in sync. When we decide to flush dirty MRs because there are too many of them, we wake up the cleanup thread and let i
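The description above is truncated, so the following is only a toy user-space model of how a request serial and a flush serial could drift apart. Only the names req_ser and flush_ser echo the patch text; the struct, the helper functions, and the assumption that a watermark-triggered wakeup is not counted as a request are all hypothetical and may not match the actual fmr_pool code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two counters; not the kernel data structure. */
struct pool_model {
    int req_ser;    /* number of flushes requested */
    int flush_ser;  /* number of flushes executed  */
};

/* Explicit flush request: record it and return the serial to wait for. */
static int request_flush(struct pool_model *p)
{
    return ++p->req_ser;
}

/* Wakeup caused by too many dirty MRs. Whether this counts as a request
 * is the assumption being explored, so it is a parameter here. */
static void watermark_wakeup(struct pool_model *p, bool count_request)
{
    if (count_request)
        p->req_ser++;
}

/* One pass of the cleanup thread: it flushes, then bumps flush_ser. */
static void cleanup_pass(struct pool_model *p)
{
    p->flush_ser++;
}

/* Waiter's check: has the flush with this serial been executed yet? */
static bool flush_done(const struct pool_model *p, int serial)
{
    return p->flush_ser - serial >= 0;
}

/* Returns true if an explicit request looks complete although no cleanup
 * pass has run since it was issued. */
static bool premature_completion(bool count_request)
{
    struct pool_model p = { 0, 0 };

    watermark_wakeup(&p, count_request);
    cleanup_pass(&p);                 /* serves the watermark wakeup   */
    int serial = request_flush(&p);   /* later explicit flush request  */

    assert(p.req_ser == serial);
    return flush_done(&p, serial);    /* nothing flushed since request */
}
```

In this model, premature_completion(false) reports that the waiter would return without a flush having run for its request, while premature_completion(true) keeps the two serials in step.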

[ewg] Issues with fmr_pool

2008-01-16 Thread Olaf Kirch
Hi all, I've been debugging a memory corruption in the RDS zerocopy code for the past several days - basically, when we tear down a socket and destroy any existing MRs, RDMA writes that are in progress continue well after we've freed the MR and flushed the fmr_pool. After chasing several schools

RE: [ewg] Re: Can you send explanation how to work with bonding and the standard bonding setting

2008-01-16 Thread Scott Weitzenkamp (sweitzen)
Or, I don't see /sbin/call_ifenslave in my OFED-1.3-20080115-0600 ib-bonding package. [EMAIL PROTECTED] ~]# uname -a Linux svbu-qa1850-1 2.6.9-55.ELsmp #1 SMP Fri Apr 20 16:36:54 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux [EMAIL PROTECTED] ~]# rpm -ql ib-bonding /lib/modules/2.6.9-55.ELsmp/updates/

[ewg] RE: [ofa-general] OFED 1.3 RC2 release is available

2008-01-16 Thread Scott Weitzenkamp (sweitzen)
Isn't RHEL4 up6 supported, too? I have added Version 1.3rc2 to bugzilla. Scott Weitzenkamp SQA and Release Manager Server Virtualization Business Unit Cisco Systems > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of > Tziporet Koren > Sent: Wedn

[ewg] OFED 1.3 RC2 release is available

2008-01-16 Thread Tziporet Koren
Hi, OFED 1.3 RC2 release is available on http://www.openfabrics.org/builds/ofed-1.3/release/OFED-1.3-rc2.tgz To get BUILD_ID run ofed_info Please report any issues in bugzilla https://bugs.openfabrics.org/ The RC3 release is expected on January 30 Tziporet & Vlad ===

[ewg] Re: [GIT PULL] ~sashak/management.git

2008-01-16 Thread Vladimir Sokolovsky
Sasha Khapyorsky wrote: Hi Vlad, Please pull recent ofed_1_3 branch of ~sashak/management.git. The changes are: Sasha Khapyorsky (4): infiniband-diags/configure.in: complib doesn't have opensm dependencies anymore opensm/perfmgr: use pkey at index 0 libibumad: increase the

[ewg] Re: Can you send explanation how to work with bonding and the standard bonding setting

2008-01-16 Thread Or Gerlitz
Tziporet Koren wrote: We wish to test in this way and also add this explanation to OFED docs Sure, it's all documented in the ib-bonding.txt file that comes with the ib-bonding package, so I suggest just adding a note in the docs pointing to it. Or. # rpm -ql ib-bonding /etc/sysconfig/net