I moved our BG/L to jumbo frames several weeks ago, and we ran into issues similar to the ones you reported. We ended up establishing two networks: one at 9000 MTU, and one for all of the equipment that must remain at 1500 (FC RAID controllers, PXE boot infrastructure for x86 nodes, loaner equipment, etc.).
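(A quick sanity check that a host can actually push full-size frames across the 9000 MTU network, sketched here with Linux iputils ping and placeholder host names, is a don't-fragment ping at the target frame size:)

  # 9000 MTU path: 8972-byte ICMP payload + 8-byte ICMP header + 20-byte IP header = 9000
  ping -M do -s 8972 io-node-1

  # standard 1500 MTU path for comparison: 1472 + 8 + 20 = 1500
  ping -M do -s 1472 raid-controller-1

If the 8972-byte ping reports a message-too-long error or simply gets no replies while the 1472-byte ping succeeds, some hop on that path is still limited to a 1500-byte MTU.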
Your note about having issues at or near 9000 MTU is concerning - can you give us some more insight into the problems that you experienced, and any ways that we could test for the same issues in our environment?

Thanks,

Michael Oberg
Research Systems Evaluation Team
National Center for Atmospheric Research (NCAR)
Office: 303.497.1268, Cell: 720.938.6585
[EMAIL PROTECTED]


Andrew Cherry wrote:
> Matthew, Sam-
>
> FYI, using jumbo frames is not necessarily that simple. The IBM file
> servers that came with our BG/L don't support jumbo frames on the
> internal NICs. Ours are IBM x346 type 8840, and the built-in Broadcom
> NICs couldn't handle jumbo frames. I imagine other xSeries boxes with
> integrated Broadcom NICs may have similar issues. We ended up buying
> PCI network cards in order to implement jumbo frames in our
> environment. Also, you'll need to make sure your network switch can
> handle jumbo frames (ours is a Force10, don't know the exact model off
> the top of my head, but it supports jumbo frames).
>
> The other thing you need to be aware of is that switching to jumbo
> frames is an all-or-nothing proposition; if you do it, you'll have to
> do it for *all* of the hardware on the involved network segment. You
> can't just change a couple of servers.
>
> I'm Cc:ing a couple of folks at Argonne who worked on getting jumbo
> frames working for our environment; they might be able to warn you of
> any other gotchas. We're using 8000-byte frames, but if I were
> starting from scratch I'd try something closer to 8300 so that an
> entire 8192-byte NFS packet can fit in a single frame (avoiding
> fragmentation if you're using an 8192-byte NFS rsize/wsize). Note,
> 8300 is just a ballpark guess that I haven't been able to confirm.
>
> Be warned -- in our environment, we started to have problems when we
> got close to 9000-byte frames, so don't go too high.
>
> -Andrew Cherry
> BG/L Support
> Argonne National Laboratory
>
> On Apr 20, 2007, at 5:02 PM, Sam Lang wrote:
>
>> Hi Matthew,
>>
>> I think the version of PVFS in the Zepto release is pvfs2-1.5.1.
>> Besides some performance improvements in the latest release
>> (pvfs-2.6.3), there was a specific bugfix made in PVFS for largish
>> mpi-io jobs. If you could try the latest (at http://www.pvfs.org/),
>> it would help us to verify that you're not running into the same
>> problem.
>>
>> Regarding config options for PVFS on BGL, make sure you have jumbo
>> frames enabled, i.e.
>>
>> ifconfig eth0 mtu 8000 up
>>
>> Also, you should probably set the tcp buffer sizes explicitly in the
>> pvfs config file, fs.conf:
>>
>> <Defaults>
>> ...
>> TCPBufferSend 524288
>> TCPBufferReceive 1048576
>> </Defaults>
>>
>> You might also see better performance with an alternative trove
>> method for doing disk io:
>>
>> <StorageHints>
>> ...
>> TroveMethod alt-aio
>> </StorageHints>
>>
>> Thanks,
>>
>> -sam
>>
>> On Apr 20, 2007, at 4:25 PM, Matthew Woitaszek wrote:
>>
>>> Good afternoon,
>>>
>>> Michael Oberg and I are attempting to get PVFS2 working on NCAR's
>>> 1-rack BlueGene/L system using ZeptoOS. We ran into a snag at over
>>> 8 BG/L I/O nodes (>256 compute nodes).
>>>
>>> We've been using the mpi-io-test program shipped with PVFS2 to test
>>> the system. For cases up to and including 8 I/O nodes (256
>>> coprocessor or 512 virtual node mode tasks), everything works fine.
>>> Larger jobs fail with file not found error messages, such as:
>>>
>>> MPI_File_open: File does not exist, error stack:
>>> ADIOI_BGL_OPEN(54): File /pvfs2/mattheww/_file_0512_co does not exist
>>>
>>> The file is created on the PVFS2 filesystem and has a zero-byte
>>> size. We've run the tests with 512 tasks on 256 nodes, and it
>>> successfully created an 8589934592-byte file. Going to 257 nodes
>>> fails.
>>>
>>> Has anyone seen this behavior before? Are there any PVFS2 server or
>>> client configuration options that you would recommend for a BG/L
>>> installation like this?
>>>
>>> Thanks for your time,
>>>
>>> Matthew
>>
>

_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
