Hi Leigh,

On Mar 7, 2011, at 2:24 PM, Leigh Orf wrote:
> On Mon, Mar 7, 2011 at 9:28 AM, Quincey Koziol <koz...@hdfgroup.org> wrote:
>> Hi Leigh,
>>
>> On Mar 5, 2011, at 2:56 PM, Leigh Orf wrote:
>>
>>> On Sat, Mar 5, 2011 at 12:14 PM, Tjf (mobile) <tfo...@sci.utah.edu> wrote:
>>>> I'm not up on HDF5 internals, but I can't imagine any API would deal
>>>> with such small writes effectively, because the OS/disks aren't going
>>>> to cope with them.
>>>>
>>>> If HDF5 can coalesce writes, try enabling that. Otherwise, forward your
>>>> data to a subset of nodes for writing, such that each write is large.
>>>> Generally larger is better, but I would say shoot for 16 MB per write.
>>>
>>> As I understand it from Mark & Quincey, when you write in collective
>>> mode, it assigns writers and collects data to the writers so that the
>>> chunks are larger, and aligns the data to the underlying FS stripe size
>>> (at least with Lustre, which I am using). However, the details of this
>>> are a mystery to me.
>>
>> No, this isn't quite accurate. The chunks in the file are always set
>> at the size you use when creating the dataset, even when collective I/O
>> is used. You should use H5Pset_alignment() (as you mentioned in your
>> other email) to align the chunks on a "good" boundary for Lustre. Also,
>> if the datasets are fixed size, you can compute the number of chunks
>> that will be produced and pass 1/2 of that value to H5Pset_istore_k(),
>> so that there is only one B-tree node for the chunked dataset's index,
>> which will speed up metadata operations for the dataset (this is being
>> addressed with new chunk indexing methods in the next major release of
>> HDF5, 1.10.0). Also, you should move up to the recently released 1.8.6,
>> which has all the performance improvements that we implemented for the
>> paper that Mark Howison wrote with us last year.
>
> That is very useful information. I assumed the H5Pset_alignment() call
> was made "under the hood." Clearly I am therefore doing unaligned
> writes, which is causing the bad performance. I will follow your
> suggestions and let you know how it turns out.
>
> Didn't version 1.8.5 have the performance improvements? You do mention
> that version in the paper. Regardless, I will ask to have 1.8.6 built
> on kraken as well.

Some of the improvements made it into 1.8.5, but some took longer and only made it into the 1.8.6 release.
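For concreteness, something like the sketch below should work. The helper name is just for illustration, and the 64 KB threshold and 1 MB alignment are placeholders; substitute your actual Lustre stripe size, and pass the actual chunk count your fixed-size dataset will produce:

    #include "hdf5.h"

    /* Sketch: create a file whose allocations are aligned for Lustre
     * and whose chunk B-tree 'K' value is sized for a known chunk
     * count.  (create_aligned_file is a hypothetical helper name.) */
    hid_t create_aligned_file(const char *name, unsigned nchunks)
    {
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        /* Align every allocation of 64 KB or more on a 1 MB boundary;
         * 1 MB is an assumed stripe size, substitute yours. */
        H5Pset_alignment(fapl, 65536, 1048576);

        hid_t fcpl = H5Pcreate(H5P_FILE_CREATE);
        /* Half the (known, fixed) chunk count keeps the chunked
         * dataset's index in a single B-tree node. */
        H5Pset_istore_k(fcpl, nchunks / 2);

        hid_t file = H5Fcreate(name, H5F_ACC_TRUNC, fcpl, fapl);
        H5Pclose(fcpl);
        H5Pclose(fapl);
        return file;
    }

If each of your 30,000 cores writes one chunk, the call would be something like create_aligned_file("out.h5", 30000).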
	Quincey

> Thanks,
>
> Leigh
>
>> Quincey
>>
>>> Leigh
>>>
>>>> -tom
>>>>
>>>> On Mar 4, 2011, at 5:03 PM, Leigh Orf <leigh....@gmail.com> wrote:
>>>>
>>>>> What is the size of a "write operation" with parallel HDF5? That
>>>>> terminology comes up a lot in my sole source of guidance for Lustre
>>>>> on the machine I'm running on ( http://www.nics.tennessee.edu/io-tips ).
>>>>>
>>>>> I am trying to choose ideal parameters for the Lustre file system.
>>>>>
>>>>> I experienced abysmal performance with my first attempt at writing
>>>>> one file containing 3D data with 30,000 cores, and I want to choose
>>>>> better parameters. After 11 minutes, 62 GB had been written, and I
>>>>> killed the job.
>>>>>
>>>>> Each 3D array that I write from a core is 435,600 bytes. I have my
>>>>> chunk dimensions the same as my array dimensions. Does that mean
>>>>> that each core writes a chunk of data 435,600 bytes long? Would I
>>>>> therefore wish to set my stripe size to 435,600 bytes? That is
>>>>> smaller than the default of 1 MB.
>>>>>
>>>>> It seems that Lustre performs best when each "write operation" is
>>>>> large (say 32 MB) and the stripe size matches it. However, our cores
>>>>> are each writing comparatively much smaller chunks of data.
>>>>>
>>>>> I am going to see if the folks on the kraken machine can help me
>>>>> with optimizing Lustre, but I want to understand as much as possible
>>>>> about how pHDF5 works before I do.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Leigh
>
> --
> Leigh Orf
> Associate Professor of Atmospheric Science
> Department of Geology and Meteorology
> Central Michigan University
> Currently on sabbatical at the National Center for Atmospheric
> Research in Boulder, CO
> NCAR office phone: (303) 497-8200

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
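For reference, a self-contained sketch of the write pattern discussed in this thread: one shared file opened through the MPI-IO driver, chunk dimensions equal to each core's array dimensions, and a collective transfer. The dimensions are an assumption (33 x 55 x 60 elements of 4-byte float is 435,600 bytes, matching the per-core size Leigh quotes, but his actual dimensions aren't given in the thread), and the file name, dataset name, and 1-D decomposition are hypothetical:

    #include "hdf5.h"
    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Assumed per-core dimensions: 33*55*60 floats = 435,600 bytes. */
        hsize_t local[3]  = {33, 55, 60};
        hsize_t global[3] = {local[0] * (hsize_t)nprocs, local[1], local[2]};

        /* One shared file through the MPI-IO driver.  (H5Pset_alignment
         * on this fapl, as in the earlier sketch, would align the chunks
         * to the Lustre stripe.) */
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
        hid_t file = H5Fcreate("out.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

        /* Chunk dimensions equal to each core's array dimensions. */
        hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
        H5Pset_chunk(dcpl, 3, local);

        hid_t filespace = H5Screate_simple(3, global, NULL);
        hid_t dset = H5Dcreate2(file, "u", H5T_NATIVE_FLOAT, filespace,
                                H5P_DEFAULT, dcpl, H5P_DEFAULT);

        /* Each rank selects its own block: a 1-D decomposition along
         * the slowest-varying dimension. */
        hsize_t start[3] = {(hsize_t)rank * local[0], 0, 0};
        H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL,
                            local, NULL);
        hid_t memspace = H5Screate_simple(3, local, NULL);

        float *data = calloc(local[0] * local[1] * local[2], sizeof(float));

        /* Collective transfer: MPI-IO may aggregate the ranks' small
         * writes into larger, stripe-friendly operations. */
        hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
        H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
        H5Dwrite(dset, H5T_NATIVE_FLOAT, memspace, filespace, dxpl, data);

        free(data);
        H5Pclose(dxpl); H5Sclose(memspace); H5Sclose(filespace);
        H5Dclose(dset); H5Pclose(dcpl); H5Fclose(file); H5Pclose(fapl);
        MPI_Finalize();
        return 0;
    }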