Re: [ewg] with the ipoib patches, debug prints spam the system log

2008-02-07 Thread Eli Cohen

On Thu, 2008-02-07 at 10:01 +0200, Eli Cohen wrote:
 On Thu, 2008-02-07 at 09:48 +0200, Or Gerlitz wrote:
  Or Gerlitz wrote:
   You have left somehow too many... debug prints in the last patches,
   please clean this up. See for example how the system log after less
   than a minute when ipoib debug prints are opened, it has one original
   print (ib0: Send unicast ARP to 0023) and all the rest are yours.
  
   Feb  6 14:39:23  kernel: ib0: posting zlen send, wrid = 4: head = 2756, 
   tail = 2752
   Feb  6 14:39:23  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2757
  
  Hi Eli,
  
  Just a reminder to remove this for RC4, using last night snapshot I 
  still see it.
  
  Or.
  
 
 I have to look at last night's build - it should have been there already.

Sorry - it will be in the next build.

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] Re: traffic jittery, send queue full reports from mthca driver

2008-02-07 Thread Eli Cohen

On Thu, 2008-02-07 at 09:42 +0200, Or Gerlitz wrote:
  ib_mthca :03:00.0: SQ 000404 full (756910656 head, 756910592 tail, 64 
  max, 0 nreq)
  ib0: failed to post zlen send
 
 OK, Eli, taking the kernel bits from OFED-1.3-20080206-0751.tgz I don't
 see these prints any more. When probing out the driver in order to replace
 it with the drop, I have got the following:
 
   ib0: timing out; will leak address handles
   ib0: ib_dealloc_pd failed
 
 so, is it another issue or related to the room-for-zlen-in-ring-accounting 
 fix?
 

I am not sure but I will look into it.



[ewg] Re: traffic jittery, send queue full reports from mthca driver

2008-02-07 Thread Eli Cohen

On Thu, 2008-02-07 at 09:42 +0200, Or Gerlitz wrote:
  ib_mthca :03:00.0: SQ 000404 full (756910656 head, 756910592 tail, 64 
  max, 0 nreq)
  ib0: failed to post zlen send
 
 OK, Eli, taking the kernel bits from OFED-1.3-20080206-0751.tgz I don't
 see these prints any more. When probing out the driver in order to replace
 it with the drop, I have got the following:
 
   ib0: timing out; will leak address handles
   ib0: ib_dealloc_pd failed
 
 so, is it another issue or related to the room-for-zlen-in-ring-accounting 
 fix?
 

Or,

does it happen on mthca or connectx? Does it happen when running iperf
in the way you reported in bugzilla?



Re: [ewg] Re: [ofa-general] [ANNOUNCE] open iSCSI over iSER target RPM is available

2008-02-07 Thread Erez Zilber

  * READ: 920 MB/sec
  * WRITE: 850 MB/sec

 Not getting anything even remotely close to this.  Are there more
 details on configuration somewhere?  I followed the web page as indicated.


Are you running iSCSI over TCP or iSCSI over iSER (over InfiniBand)? Our
results are with iSER.

Erez


Re: [ewg] with the ipoib patches, debug prints spam the system log

2008-02-07 Thread Eli Cohen

On Thu, 2008-02-07 at 09:48 +0200, Or Gerlitz wrote:
 Or Gerlitz wrote:
  You have left somehow too many... debug prints in the last patches,
  please clean this up. See for example how the system log after less
  than a minute when ipoib debug prints are opened, it has one original
  print (ib0: Send unicast ARP to 0023) and all the rest are yours.
 
  Feb  6 14:39:23  kernel: ib0: posting zlen send, wrid = 4: head = 2756, 
  tail = 2752
  Feb  6 14:39:23  kernel: ib0: ipoib_ib_tx_timer_func-427: head = 2757
 
 Hi Eli,
 
 Just a reminder to remove this for RC4, using last night snapshot I 
 still see it.
 
 Or.
 

I have to look at last night's build - it should have been there already.



[ewg] MVAPICH1 1.0.0 SRPM available

2008-02-07 Thread Pavel Shamis (Pasha)

New srpm for MVAPICH1 was uploaded.
Please check ~pasha/ofed_1_3/ (see latest.txt for the build number)
Bugfix for: 883, 884, 888, 887, 889, 893

--
Pavel Shamis (Pasha)
Mellanox Technologies



Re: [ofa-general] Re: [ewg] OFED 1.3 rc4 update

2008-02-07 Thread Tziporet Koren

Eli Cohen wrote:

I have tried to reproduce this when using ib_mthca and mlx4_ib but
could not see this problem. Could you try to dig more into this and
provide more details?

Please reproduce the issue on our HCAs since we do not have any ehca.
Note that Eli tried the code when using the non-SRQ path.

Tziporet


Re: [ofa-general] Re: [ewg] OFED 1.3 rc4 update

2008-02-07 Thread Pradeep Satyanarayana
Eli Cohen wrote:
 This problem was seen on a ehca that supports SRQ.

 
 Please reply with how many scatter entries ehca supports when working
 in SRQ mode. Also any piece of info I might need to try and mimic ehca
 behaviour on Mellanox devices. I would appreciate it if you could repeat
 the exact sequence of actions you do to reproduce this.

Hello Eli,

Ehca supports fewer than 16 s/g entries, hence the srq patch addresses that 
issue. 
The sequence of steps that I followed for the touch test:
1. On a freshly booted system, configure ib0 and assign an IP address
2. Switch to connected mode and change mtu
3. ping remote ib interface (already in CM mode)
4. modprobe -r ib_ehca

I see a series of cascading failures in /var/log/messages, starting with 
the issue of not being able to destroy the cq (specifically rcq)

Pradeep



Re: [ewg][PATCH][0/2] SRP multipath failover within 60 seconds,

2008-02-07 Thread David Dillow

On Thu, 2008-02-07 at 08:18 +0200, Vladimir Sokolovsky wrote:
 Vu Pham wrote:
  The following patches assist SRP/dm-multipath to failover within 60 
  seconds (bugzilla #577) without data corruption, read/write error
[snip]
 Applied,
 kernel_patches/fixes/srp_2_disconnect_without_wait.patch
 kernel_patches/fixes/srp_3_qp_err_timer_reconnect_target.patch

Are there plans for these (and the ones they build on) to make their way
to the upstream kernel?
-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office




Re: [ofa-general] Re: [ewg] OFED 1.3 rc4 update

2008-02-07 Thread Eli Cohen
 Ehca supports fewer than 16 s/g entries, hence the srq patch addresses that 
 issue.
 The sequence of steps that I followed for the touch test:
 1. On a freshly booted system, configure ib0 and assign an IP address
 2. Switch to connected mode and change mtu
 3. ping remote ib interface (already in CM mode)
 4. modprobe -r ib_ehca

 I see a series of cascading failures in /var/log/messages, starting with
 the issue of not being able to destroy the cq (specifically rcq)

I followed the procedure you describe with an Arbel device. I changed the
code such that it will publish 12 scatter entries for the SRQ. I did
not see this problem, however, so I don't know how to debug this. Could it
be a problem in the ehca driver?


[ewg] Update (Re: open iSCSI over iSER target RPM ...)

2008-02-07 Thread Joe Landman

Update:

[EMAIL PROTECTED] etc]# dd if=/dev/zero of=/big/local.file bs=256k count=100000
100000+0 records in
100000+0 records out
26214400000 bytes (26 GB) copied, 58.7484 seconds, 446 MB/s

Better. I rebuilt OFED 1.2.5.5.  Are there specific recommended tuning 
guides for iSER?  Backing store in this case are real disks, and we can 
sink/source 750 MB/s on them, so I am not worried about disk IO 
bottlenecks, more worried about bad config of iSCSI/iSER.


BTW:  the 2TB LUN limit I asked about is still here in this code.  Same 
machines (initiator and target) used for SRP reported correct LUN sizes. 
 Here we are using the -868 open-iscsi initiator, and the tgt RPM 
announced.  I would like to dig into this.


This is what I am getting in dmesg for this iSER target:

iscsi: registered transport (tcp)
iscsi: registered transport (iser)
iser: iser_connect:connecting to: 10.2.1.2, port 0xbc0c
iser: iser_cma_handler:event 0 conn 81024b9f69c0 id 810209748c00
iser: iser_cma_handler:event 2 conn 81024b9f69c0 id 810209748c00
iser: iser_create_ib_conn_res:setting conn 81024b9f69c0 cma_id 
810209748c00: fmr_pool 81024bfb32c0 qp 8101cb16d600

iser: iser_cma_handler:event 9 conn 81024b9f69c0 id 810209748c00
iser: iscsi_iser_ep_poll:ib conn 81024b9f69c0 rc = 1
scsi13 : iSCSI Initiator over iSER, v.0.1
iser: iscsi_iser_conn_bind:binding iscsi conn 81021b65fa90 to 
iser_conn 81024b9f69c0

  Vendor: IET   Model: Controller  Rev: 0001
  Type:   RAID   ANSI SCSI revision: 05
scsi 13:0:0:0: Attached scsi generic sg2 type 12
  Vendor: IET   Model: VIRTUAL-DISK  Rev: 0001
  Type:   Direct-Access  ANSI SCSI revision: 05
sdc : very big device. try to use READ CAPACITY(16).
sdc : READ CAPACITY(16) failed.
sdc : status=1, message=00, host=0, driver=08
sdc : use 0xffffffff as device size
SCSI device sdc: 4294967296 512-byte hdwr sectors (2199023 MB)
sdc: Write Protect is off
sdc: Mode Sense: 79 00 00 08
SCSI device sdc: drive cache: write back
sdc : very big device. try to use READ CAPACITY(16).
sdc : READ CAPACITY(16) failed.
sdc : status=1, message=00, host=0, driver=08
sdc : use 0xffffffff as device size
SCSI device sdc: 4294967296 512-byte hdwr sectors (2199023 MB)
sdc: Write Protect is off
sdc: Mode Sense: 79 00 00 08
SCSI device sdc: drive cache: write back
 sdc: unknown partition table
sd 13:0:0:1: Attached scsi disk sdc
sd 13:0:0:1: Attached scsi generic sg3 type 0


and this is what we get in SRP

scsi6 : SRP.T10:0008F104039862A4
  Vendor: SCST_BIO  Model: vdisk0  Rev:  096
  Type:   Direct-Access  ANSI SCSI revision: 04
sdc : very big device. try to use READ CAPACITY(16).
SCSI device sdc: 12693355130 512-byte hdwr sectors (6498998 MB)
sdc: Write Protect is off
sdc: Mode Sense: 6b 00 10 08
SCSI device sdc: drive cache: write back w/ FUA


This looks suspiciously like a 2^32 limit somewhere.


Our exported device is

[EMAIL PROTECTED] ~]# parted /dev/sdb print

Model: Areca jrvs1 (scsi)
Disk /dev/sdb: 6500GB
Sector size (logical/physical): 512B/512B
Partition Table: loop

Number  Start   End     Size    File system  Flags
 1      0.00kB  6500GB  6500GB  xfs


and this is what tgtadm reports

[EMAIL PROTECTED] ~]# tgtadm --lld iscsi --op show --mode target
Target 1: iqn.2001-04.com.jr1-jackrabbit.small
    System information:
        Driver: iscsi
        Status: running
    I_T nexus information:
        I_T nexus: 4
            Initiator: iqn.1996-04.voltaire.com:01:dfaa3fd
            Connection: 0
                RDMA IP Address: 10.2.1.1
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: deadbeaf1:0
            SCSI SN: beaf10
            Size: 0
            Online: No
            Poweron/Reset: Yes
            Removable media: No
            Backing store: No backing store
        LUN: 1
            Type: disk
            SCSI ID: deadbeaf1:1
            SCSI SN: beaf11
            Size: 5T
            Online: Yes
            Poweron/Reset: No
            Removable media: No
            Backing store: /dev/sdb
    Account information:
    ACL information:
        10.2.1.1

So it looks like the LUN 1 is approximately correct (5T ???) on the 
target, and incorrect when the initiator asks for it.


Please note that I have successfully used the full 6+TB as an iSCSI 
target using the SCST-iscsi code, so I do know that the initiator works 
correctly.


Is there a source RPM/tree for this target?

Joe Landman wrote:

Hi Erez

Erez Zilber wrote:

stgt (SCSI target) is an open-source framework for storage target
drivers. It supports iSCSI over iSER among other storage target drivers.

Voltaire added a git tree for stgt that will be added to OFED 1.4:
http://www2.openfabrics.org/git/?p=~dorons/tgt.git;a=summary

Until OFED 1.4 gets released, it is possible to install the stgt RPM on
top of OFED 1.3. For more details about how to install and 

Re: [ofa-general] Re: [ewg] OFED 1.3 rc4 update

2008-02-07 Thread Eli Cohen

 This problem was seen on a ehca that supports SRQ.


Please reply with how many scatter entries ehca supports when working
in SRQ mode. Also any piece of info I might need to try and mimic ehca
behaviour on Mellanox devices. I would appreciate it if you could repeat
the exact sequence of actions you do to reproduce this.

thanks.


Re: [ewg] Re: [ofa-general] [ANNOUNCE] open iSCSI over iSER target RPM is available

2008-02-07 Thread Joe Landman

Erez Zilber wrote:

* READ: 920 MB/sec
* WRITE: 850 MB/sec

Not getting anything even remotely close to this.  Are there more
details on configuration somewhere?  I followed the web page as indicated.



Are you running iSCSI over TCP or iSCSI over iSER (over InfiniBand)? Our
results are with iSER.


I followed the instructions on the web pages that were pointed to for 
iSER.  Are there updated pages?  Is there a way to tell whether or not 
the RDMA path is being used?


Thanks.

Joe



--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
   http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615


[ewg] OFED 1.3 RC4 release is available

2008-02-07 Thread Tziporet Koren
Hi, 
OFED 1.3 RC4 release is available on 
http://www.openfabrics.org/builds/ofed-1.3/release/OFED-1.3-rc4.tgz


To get BUILD_ID run ofed_info 

Please report any issues in bugzilla https://bugs.openfabrics.org/ 
The RC5 (Gold) release is expected on February 18


Tziporet & Vlad 




Release information: 
 
Linux Operating Systems:

   - RedHat EL4 up4:   2.6.9-42.ELsmp
   - RedHat EL4 up5:   2.6.9-55.ELsmp
   - RedHat EL4 up6:   2.6.9-67.ELsmp  *
   - RedHat EL5:       2.6.18-8.el5
   - RedHat EL5 up1:   2.6.18-53.el5
   - Fedora C6:        2.6.18-8.fc6  *
   - SLES10:           2.6.16.21-0.8-smp
   - SLES10 SP1:       2.6.16.46-0.12-smp
   - SLES10 SP1 up1:   2.6.16.53-0.16-smp
   - OpenSuSE 10.3:    2.6.22-*-*  *
   - kernel.org:       2.6.23 and 2.6.24

 * OSes that are partially tested

Systems: 
	* x86_64 
	* x86 
	* ia64 
	* ppc64 


Main Changes from OFED 1.3-RC3
=== 
* Fixed 13 Bugs (see attachment)

* MPI packages update: mvapich-1.0.0-1981.src.rpm
* Updated libraries:
 * uDAPL 2.0.6
 * libibcm 1.0.2
 * librdmacm 1.0.6
* IPoIB enhancements: 
 * Non-SRQ for CM mode
 * 4K MTU support
 * Enhancements to improve small messages BW

Tasks that should be completed for RC5: 
===
1. Fix critical and major bugs 
2. Update all documents


bug_id,bug_severity,op_sys,assigned_to,resolution,short_short_desc
794,normal,Other,[EMAIL PROTECTED],FIXED,Kernel panic while unload driver
883,normal,RHEL 4,[EMAIL PROTECTED],FIXED,mvapich gets killed during alltoall, 32nodes
884,normal,RHEL 4,[EMAIL PROTECTED],FIXED,mvapich doesn't report non-active ports
893,blocker,Other,[EMAIL PROTECTED],FIXED,Dynamic library supprot is broken
892,blocker,SLES 10,[EMAIL PROTECTED],FIXED,openibd does not remove cxgb3 module
897,critical,SLES 10,[EMAIL PROTECTED],FIXED,traffic is jittery, send queue full reports from mthca
891,critical,RHEL 4,[EMAIL PROTECTED],FIXED,ib_sa panics system when enabled
878,critical,Other,[EMAIL PROTECTED],FIXED,slow failover with bonding and connected mode
887,critical,All,[EMAIL PROTECTED],FIXED,IMB benchmark stuck
577,critical,All,[EMAIL PROTECTED],FIXED,SRP multipath failover too slow (minutes, not seconds)
761,major,Other,[EMAIL PROTECTED],FIXED,Poor and jittery UDP performance at small messages
889,minor,Other,[EMAIL PROTECTED],FIXED,Intel test stuck fortran-datatype-functional-MPI_Type_contiguous_idispls
888,minor,Other,[EMAIL PROTECTED],FIXED,OSU latency benchmark (old version with iteration and message size parameter) stuck sometime

[ewg] [Fwd: Re: [ofa-general] Problem with latest OFED 1.3 build... IPoIB and iPATH]

2008-02-07 Thread Ralph Campbell
I forgot to CC EWG on my reply to Arlin Davis.
---BeginMessage---
We can reproduce the problem here.
We haven't made any ib_ipath driver changes between RC3 and RC4
so some recent patch has broken us.
I'm in the process of looking at it.

On Wed, 2008-02-06 at 17:17 -0800, Arlin Davis wrote:
 I cannot ifconfig ib0 on ipath when using the latest build
 (ofed20080206).
  
 ifup ib0
 SIOCSIFFLAGS: Invalid argument
 Failed to bring up ib0.
 
   ib0: failed to create own ah
  
 CA 'ipath0'
 CA type: InfiniPath_QLE7140
 Number of ports: 1
 Firmware version:
 Hardware version: 2
 Node GUID: 0x001175ffd75b
 System image GUID: 0x001175ffd75b
 Port 1:
 State: Active
 Physical state: LinkUp
 Rate: 10
 Base lid: 14
 LMC: 0
 SM lid: 1
 Capability mask: 0x02010800
 Port GUID: 0x001175ffd75b
  
 It works fine on mthca adapters. Anyone else see this problem?
 
 
 -arlin
 
 
  
 ___
 general mailing list
 [EMAIL PROTECTED]
 http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
 
 To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
---End Message---

Re: [ewg] [Fwd: Re: [ofa-general] Problem with latest OFED 1.3 build... IPoIB and iPATH]

2008-02-07 Thread Shirley Ma
Hello Ralph,

What's ifconfig ib0 output?

  We can reproduce the problem here.
  We haven't made any ib_ipath driver changes between RC3 and RC4
  so some recent patch has broken us.
  I'm in the process of looking at it.
  
  On Wed, 2008-02-06 at 17:17 -0800, Arlin Davis wrote:
   I cannot ifconfig ib0 on ipath when using the latest build
   (ofed20080206).

   ifup ib0
   SIOCSIFFLAGS: Invalid argument
   Failed to bring up ib0.
   
 ib0: failed to create own ah

int ipoib_ib_dev_open(struct net_device *dev)
{
	struct ipoib_dev_priv *priv = netdev_priv(dev);
	int ret;

	if (ib_find_pkey(priv->ca, priv->port, priv->pkey,
			 &priv->pkey_index)) {
		ipoib_warn(priv, "P_Key 0x%04x not found\n",
			   priv->pkey);
		clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
		return -1;
	}
	set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);

	ret = create_own_ah(priv);
	if (ret) {
		priv->own_ah = NULL;
		ipoib_warn(priv, "failed to create own ah\n");
		return -1;
	}

Looks like the ipath driver returns an error from the create_own_ah() call.
Are you sure there are no ipath driver changes between RC3 and RC4?

On which kernel did you hit this problem? What's the kernel PAGE_SIZE?

thanks
Shirley



[ewg] Re: [Stgt-devel] Update (Re: open iSCSI over iSER target RPM ...)

2008-02-07 Thread FUJITA Tomonori
On Thu, 07 Feb 2008 11:05:03 -0500
Joe Landman [EMAIL PROTECTED] wrote:

 Update:
 
 [EMAIL PROTECTED] etc]# dd if=/dev/zero of=/big/local.file bs=256k 
 count=100000
 100000+0 records in
 100000+0 records out
 26214400000 bytes (26 GB) copied, 58.7484 seconds, 446 MB/s
 
 Better. I rebuilt OFED 1.2.5.5.  Are there specific recommended tuning 
 guides for iSER?  Backing store in this case are real disks, and we can 
 sink/source 750 MB/s on them, so I am not worried about disk IO 
 bottlenecks, more worried about bad config of iSCSI/iSER.
 
 BTW:  the 2TB LUN limit I asked about is still here in this code.  Same 
 machines (initiator and target) used for SRP reported correct LUN sizes. 
   Here we are using the -868 open-iscsi initiator, and the tgt RPM 
 announced.  I would like to dig into this.

Thanks a lot,

I thought that I tested tgt with 2TB devices but it seems that I
didn't. I'll try to fix the problem shortly.


Re: [ewg] [Fwd: Re: [ofa-general] Problem with latest OFED 1.3 build... IPoIB and iPATH]

2008-02-07 Thread Ralph Campbell
On Thu, 2008-02-07 at 08:29 -0800, Shirley Ma wrote:
 On Thu, 2008-02-07 at 18:16 -0800, Ralph Campbell wrote:
  # cat /etc/*release
  Red Hat Enterprise Linux Server release 5 (Tikanga)
  # uname -r
  2.6.18-8.el5
  
  4K PAGE_SIZE
 I don't have ipath driver here. Otherwise I could try them out. 
 
 A couple suggestions here, could you please try out?
 
 1. try this on 64K page size, like RHEL5U1 to see whether you have the
 same issue.

We don't have any systems with 64K page size at hand.

 2. Can you put a debug message in ipath_create_ah() to see whether this
 is a memory allocation failure?

I'm working on it.

 3. How many IB cards in your system? If you have several, just leave
 one ipath there to see whether you can hit this problem.

only one card with one IB port.



Re: [ewg] [Fwd: Re: [ofa-general] Problem with latest OFED 1.3 build... IPoIB and iPATH]

2008-02-07 Thread Shirley Ma
On Thu, 2008-02-07 at 18:16 -0800, Ralph Campbell wrote:
 # cat /etc/*release
 Red Hat Enterprise Linux Server release 5 (Tikanga)
 # uname -r
 2.6.18-8.el5
 
 4K PAGE_SIZE
I don't have ipath driver here. Otherwise I could try them out. 

A couple suggestions here, could you please try out?

1. try this on 64K page size, like RHEL5U1 to see whether you have the
same issue.

2. Can you put a debug message in ipath_create_ah() to see whether this
is a memory allocation failure?

3. How many IB cards in your system? If you have several, just leave
one ipath there to see whether you can hit this problem.

Thanks
Shirley
