Hi Ole,

Have you found a solution for this? I think we have exactly the same problem on an x4600 with ofed-1.4.1-rc4.

lspci output:

03:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX IB DDR, PCIe 2.0 2.5GT/s] (rev a0)

and error messages from dmesg:

mlx4_core: Mellanox ConnectX core driver v1.0 (April 4, 2008)
mlx4_core: Initializing 0000:03:00.0
mlx4_core 0000:03:00.0: Requested number of MACs is too much for port 1, reducing to 1.
mlx4_core 0000:03:00.0: command 0x13 failed: fw status = 0x1
mlx4_core 0000:03:00.0: SW2HW_EQ failed (-5)
mlx4_core 0000:03:00.0: Failed to initialize event queue table, aborting.
mlx4_core: probe of 0000:03:00.0 failed with error -5

Thanks
Liang

Ole Widar Saastad wrote:
> I have problems using the OFED 1.4 software on the Sun x4600 nodes.
> Need help to get this to work. We plan to run GPFS over IB on these
> nodes in addition to MPI.
>
> Sun 4600 nodes with 8 quad core cpus,
> Quad-Core AMD Opteron(tm) Processor 8380
>
> OS is Rocks release 4.
> centos-release-4-4.2/x86_64/
>
> Linux compute-0-0.local 2.6.9-67.0.15.ELlargesmp #1 SMP Thu May 8
> 11:03:57 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
>
>
> Needless to say our 300+ nodes (SUN x2200 with quad core) run fine with
> OFED 1.4 (and 1.3), they have almost the same kernel:
> Linux compute-4-0.local 2.6.9-67.0.15.ELsmp #1 SMP Thu May 8 10:50:20
> EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
> Same except ELsmp and not ELlargesmp.
>
> More information:
>
> dmesg prints out the following error message:
>
> Losing some ticks... checking if CPU frequency changed.
> modulecmd[17499]: segfault at 0000007fc0b01688 rip 000000000060aa38 rsp
> 0000007fbfffcfd8 error 6
> mlx4_core: Mellanox ConnectX core driver v1.0 (April 4, 2008)
> mlx4_core: Initializing 0000:02:00.0
> ACPI: PCI Interrupt 0000:02:00.0[A] -> GSI 19 (level, low) -> IRQ 193
> PCI: Setting latency timer of device 0000:02:00.0 to 64
> mlx4_core 0000:02:00.0: Requested number of MACs is too much for port 1,
> reducing to 1.
> MSI INIT SUCCESS
> mlx4_core 0000:02:00.0: command 0x13 failed: fw status = 0x1
> mlx4_core 0000:02:00.0: SW2HW_EQ failed (-5)
> mlx4_core 0000:02:00.0: Failed to initialize event queue table, aborting.
> mlx4_core: probe of 0000:02:00.0 failed with error -5
>
> The following software is installed:
>
> Select Option [1-5]:3
> kernel-ib
> libibverbs
> libibverbs-devel
> libibverbs-utils
> libmthca
> libmlx4
> libcxgb3
> libnes
> libipathverbs
> libibcommon
> libibcommon-devel
> libibumad
> libibumad-devel
> ofed-docs
> ofed-scripts
> ibvexdmtools
> qlgc_vnic_daemon
>
>
> Just to be sure the card is present :
> lspci returns :
> 02:00.0 InfiniBand: Mellanox Technologies: Unknown device 634a (rev a0)
>
>
_______________________________________________
general mailing list
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
