Hi Leigh,

On Mar 7, 2011, at 2:24 PM, Leigh Orf wrote:
> On Mon, Mar 7, 2011 at 9:28 AM, Quincey Koziol <koz...@hdfgroup.org> wrote:
>> Hi Leigh,
>>
>> On Mar 5, 2011, at 2:56 PM, Leigh Orf wrote:
>>
>>> On Sat, Mar 5, 2011 at 12:14 PM, Tjf (mobile) <tfo...@sci.utah.edu> wrote:
>>>> I'm not up on HDF5 internals, but I can't imagine any API would deal
>>>> with such small writes effectively, because the OS/disks aren't going
>>>> to cope with them.
>>>>
>>>> If HDF5 can coalesce writes, try enabling that. Otherwise, forward your
>>>> data to a subset of nodes for writing, such that each write is large.
>>>> Generally larger is better, but I would say shoot for 16 MB per write.
>>>
>>> As I understand it from Mark & Quincey, when you write in collective
>>> mode, it assigns writers and collects data to the writers so that the
>>> chunks are larger, and aligns the data to the underlying FS stripe size
>>> (at least with Lustre, which I am using). However, the details of this
>>> are a mystery to me.
>>
>> No, this isn't quite accurate. The chunks in the file are always set
>> at the size you use when creating the dataset, even when collective I/O
>> is used. You should use H5Pset_alignment() (as you mentioned in your
>> other email) to align the chunks on a "good" boundary for Lustre. Also,
>> if the datasets are fixed size, you can compute the number of chunks
>> that will be produced and pass 1/2 of that value to H5Pset_istore_k(),
>> so that there is only one B-tree node for the chunked dataset's index,
>> which will speed up metadata operations for the dataset (this is being
>> addressed with new chunk indexing methods in the next major release of
>> HDF5, 1.10.0). Also, you should move up to the recently released 1.8.6,
>> which has all the performance improvements that we implemented for the
>> paper that Mark Howison wrote with us last year.
>
> That is very useful information. I assumed the H5Pset_alignment() call
> was made "under the hood." Clearly I am therefore doing unaligned
> writes, which is causing the bad performance. I will follow your
> suggestions and let you know how it turns out.
>
> Didn't version 1.8.5 have the performance improvements? You do mention
> that version in the paper. Regardless, I will ask to have 1.8.6 built
> on kraken as well.

Some of the improvements made it into 1.8.5, but some took longer and only made it into the 1.8.6 release.
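For concreteness, something like the sketch below should work. The helper name is just for illustration, and the 64 KB threshold and 1 MB alignment are placeholders; substitute your actual Lustre stripe size, and pass the actual chunk count your fixed-size dataset will produce:

    #include "hdf5.h"

    /* Sketch: create a file whose allocations are aligned for Lustre
     * and whose chunk B-tree 'K' value is sized for a known chunk
     * count.  (create_aligned_file is a hypothetical helper name.) */
    hid_t create_aligned_file(const char *name, unsigned nchunks)
    {
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        /* Align every allocation of 64 KB or more on a 1 MB boundary;
         * 1 MB is an assumed stripe size, substitute yours. */
        H5Pset_alignment(fapl, 65536, 1048576);

        hid_t fcpl = H5Pcreate(H5P_FILE_CREATE);
        /* Half the (known, fixed) chunk count keeps the chunked
         * dataset's index in a single B-tree node. */
        H5Pset_istore_k(fcpl, nchunks / 2);

        hid_t file = H5Fcreate(name, H5F_ACC_TRUNC, fcpl, fapl);
        H5Pclose(fcpl);
        H5Pclose(fapl);
        return file;
    }

If each of your 30,000 cores writes one chunk, the call would be something like create_aligned_file("out.h5", 30000).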
	Quincey

> Thanks,
>
> Leigh
>
>> Quincey
>>
>>> Leigh
>>>
>>>> -tom
>>>>
>>>> On Mar 4, 2011, at 5:03 PM, Leigh Orf <leigh....@gmail.com> wrote:
>>>>
>>>>> What is the size of a "write operation" with parallel HDF5? That
>>>>> terminology comes up a lot in my sole source of guidance for Lustre
>>>>> on the machine I'm running on ( http://www.nics.tennessee.edu/io-tips ).
>>>>>
>>>>> I am trying to choose ideal parameters for the Lustre file system.
>>>>>
>>>>> I experienced abysmal performance with my first attempt at writing
>>>>> one file containing 3D data with 30,000 cores, and I want to choose
>>>>> better parameters. After 11 minutes, 62 GB had been written, and I
>>>>> killed the job.
>>>>>
>>>>> Each 3D array that I write from a core is 435,600 bytes. I have my
>>>>> chunk dimensions the same as my array dimensions. Does that mean
>>>>> that each core writes a chunk of data 435,600 bytes long? Would I
>>>>> therefore wish to set my stripe size to 435,600 bytes? That is
>>>>> smaller than the default of 1 MB.
>>>>>
>>>>> It seems that Lustre performs best when each "write operation" is
>>>>> large (say 32 MB) and the stripe size matches it. However, our cores
>>>>> are each writing comparatively much smaller chunks of data.
>>>>>
>>>>> I am going to see if the folks on the kraken machine can help me
>>>>> with optimizing Lustre, but I want to understand as much as possible
>>>>> about how pHDF5 works before I do.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Leigh
>
> --
> Leigh Orf
> Associate Professor of Atmospheric Science
> Department of Geology and Meteorology
> Central Michigan University
> Currently on sabbatical at the National Center for Atmospheric
> Research in Boulder, CO
> NCAR office phone: (303) 497-8200

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
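For reference, a self-contained sketch of the write pattern discussed in this thread: one shared file opened through the MPI-IO driver, chunk dimensions equal to each core's array dimensions, and a collective transfer. The dimensions are an assumption (33 x 55 x 60 elements of 4-byte float is 435,600 bytes, matching the per-core size Leigh quotes, but his actual dimensions aren't given in the thread), and the file name, dataset name, and 1-D decomposition are hypothetical:

    #include "hdf5.h"
    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Assumed per-core dimensions: 33*55*60 floats = 435,600 bytes. */
        hsize_t local[3]  = {33, 55, 60};
        hsize_t global[3] = {local[0] * (hsize_t)nprocs, local[1], local[2]};

        /* One shared file through the MPI-IO driver.  (H5Pset_alignment
         * on this fapl, as in the earlier sketch, would align the chunks
         * to the Lustre stripe.) */
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
        hid_t file = H5Fcreate("out.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

        /* Chunk dimensions equal to each core's array dimensions. */
        hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
        H5Pset_chunk(dcpl, 3, local);

        hid_t filespace = H5Screate_simple(3, global, NULL);
        hid_t dset = H5Dcreate2(file, "u", H5T_NATIVE_FLOAT, filespace,
                                H5P_DEFAULT, dcpl, H5P_DEFAULT);

        /* Each rank selects its own block: a 1-D decomposition along
         * the slowest-varying dimension. */
        hsize_t start[3] = {(hsize_t)rank * local[0], 0, 0};
        H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL,
                            local, NULL);
        hid_t memspace = H5Screate_simple(3, local, NULL);

        float *data = calloc(local[0] * local[1] * local[2], sizeof(float));

        /* Collective transfer: MPI-IO may aggregate the ranks' small
         * writes into larger, stripe-friendly operations. */
        hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
        H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
        H5Dwrite(dset, H5T_NATIVE_FLOAT, memspace, filespace, dxpl, data);

        free(data);
        H5Pclose(dxpl); H5Sclose(memspace); H5Sclose(filespace);
        H5Dclose(dset); H5Pclose(dcpl); H5Fclose(file); H5Pclose(fapl);
        MPI_Finalize();
        return 0;
    }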