My tests, based on a modified version of the code below, show that hashing
can take more than 10% of the time required to write a Mesh to HDF5.

What's required is the right abstraction for handling Functions and
files. I think the hashing approach is more of a hack. What about
something along the lines of:

    Function u(V);
    Function w(V);

    HDF5Function hdf5_function_file("my_filename.h5", "w");
    hdf5_function_file.add(u, "u_name");  // ("register" is a reserved C++ keyword)
    hdf5_function_file.add(w, "w_name");

    hdf5_function_file.parameters["common_mesh"] = true;
    hdf5_function_file.parameters["write_mesh_once"] = true;

    // Write all registered functions
    hdf5_function_file.write();

    // Write all registered functions again
    hdf5_function_file.write();

    // Write u only
    hdf5_function_file.write("u_name");

Some HDF5 trickery could be used to link and structure data in the file.
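By "trickery" I mean something like HDF5 soft links. A minimal sketch
with h5py (the group names and file layout here are hypothetical, just
to show the mechanism, not the DOLFIN interface):

```python
import numpy as np
import h5py

# Hypothetical layout: shared mesh/dofmap data is written once, and each
# time step of a registered function refers to it via HDF5 soft links
# instead of duplicating the datasets.
with h5py.File("my_filename.h5", "w") as f:
    mesh = f.create_group("mesh")
    mesh["cells"] = np.arange(12, dtype=np.int64).reshape(4, 3)
    mesh["cell_dofs"] = np.arange(12, dtype=np.int64)

    for step in range(2):
        grp = f.create_group("u_name/%d" % step)
        grp["vector"] = np.random.rand(10)  # time-dependent data
        # Soft links: shared data is stored once, not per time step
        grp["cells"] = h5py.SoftLink("/mesh/cells")
        grp["cell_dofs"] = h5py.SoftLink("/mesh/cell_dofs")

# Reading follows the links transparently
with h5py.File("my_filename.h5", "r") as f:
    assert f["u_name/0/cells"].shape == (4, 3)
    assert f["u_name/1/cells"].shape == (4, 3)
```

A soft link costs only a path string in the file, so the mesh data need
not be rewritten (or hashed) at every time step.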

Garth

On 29 September 2013 18:07, Øyvind Evju <[email protected]> wrote:
> from dolfin import *
> set_log_level(10000)
> if MPI.process_number() == 0:
>     print "%10s --- %10s" %("Cells", "Time (s)")
> for i in range(5,250, 20):
>     mesh = UnitCubeMesh(i,i,i)
>     tic()
>     mesh.hash()
>     t = toc()
>     if MPI.process_number() == 0:
>         print "%10d --- %.10f" %(mesh.size_global(3), t)
>     del mesh
>
>
> 2013/9/29 Garth N. Wells <[email protected]>
>>
>> On 29 September 2013 17:46, Øyvind Evju <[email protected]> wrote:
>> > From a quad core @ 2.20 GHz, calling the mesh.hash() function.
>> >
>>
>> Please post test code.
>>
>> Garth
>>
>> > One process:
>> >      Cells ---   Time (s)
>> >        750 --- 0.0001020432
>> >      93750 --- 0.0019581318
>> >     546750 --- 0.0110230446
>> >    1647750 --- 0.0335328579
>> >    3684750 --- 0.0734529495
>> >    6945750 --- 0.1374619007
>> >   11718750 --- 0.2321729660
>> >   18291750 --- 0.3683109283
>> >   26952750 --- 0.5321540833
>> >   37989750 --- 0.7479040623
>> >   51690750 --- 1.0299670696
>> >   68343750 --- 1.3440520763
>> >   88236750 --- 1.7490680218
>> >
>> > Two processes:
>> >      Cells ---   Time (s)
>> >        750 --- 0.0002639294
>> >      93750 --- 0.0011038780
>> >     546750 --- 0.0128669739
>> >    1647750 --- 0.0124230385
>> >    3684750 --- 0.0274820328
>> >    6945750 --- 0.0780282021
>> >   11718750 --- 0.1386530399
>> > (Out of memory)
>> >
>> >
>> > -Øyvind
>> >
>> >
>> >
>> >
>> > 2013/9/29 Garth N. Wells <[email protected]>
>> >
>> >> On 29 September 2013 17:12, Øyvind Evju <[email protected]> wrote:
>> >> > Wouldn't it be quite messy to suddenly have several vectors
>> >> > associated with a Function?
>> >> >
>> >>
>> >> No. It's very natural for a time-dependent Function.
>> >>
>> >> > Creating a hash of the mesh and finite element and storing cells,
>> >> > cell_dofs and x_cell_dofs there, we could keep the same structure
>> >> > for Functions as today, with links (instead of actual data sets)
>> >> > within each Function to cells, cell_dofs and x_cell_dofs.
>> >> >
>> >> > When writing a Function, a check is done to see if the cells,
>> >> > cell_dofs and x_cell_dofs exist under the relevant hash. If the
>> >> > hash (mesh, distribution or function space) changes, we need to
>> >> > write these data sets under the new hash.
>> >> >
>> >> > Have I misunderstood this hashing? It does seem to be very
>> >> > efficient, more efficient than rewriting those three datasets.
>> >> >
>> >>
>> >> Can you post a benchmark for testing the speed of hashing?
>> >>
>> >> Garth
>> >>
>> >> >
>> >> > -Øyvind
>> >> >
>> >> >
>> >> >
>> >> > 2013/9/28 Chris Richardson <[email protected]>
>> >> >>
>> >> >> On 28/09/2013 13:29, Garth N. Wells wrote:
>> >> >>>
>> >> >>> On 28 September 2013 12:56, Chris Richardson <[email protected]>
>> >> >>> wrote:
>> >> >>>>
>> >> >>>> On 28/09/2013 11:31, Garth N. Wells wrote:
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> On 28 September 2013 10:42, Chris Richardson
>> >> >>>>> <[email protected]>
>> >> >>>>> wrote:
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> This is a continuation of the discussion at:
>> >> >>>>>>
>> >> >>>>>> https://bitbucket.org/fenics-project/dolfin/pull-request/52
>> >> >>>>>>
>> >> >>>>>> The question is how best to save a time-series of Function in
>> >> >>>>>> HDF5, when the cell and dof layout remains constant.
>> >> >>>>>>
>> >> >>>>>> It has been suggested to use:
>> >> >>>>>>
>> >> >>>>>> u = Function(V)
>> >> >>>>>> h0 = HDF5File('Timeseries_of_Function.h5', 'w')
>> >> >>>>>> h0.write(u, '/Function')
>> >> >>>>>> # Then later
>> >> >>>>>> h0.write(u.vector(), "/Vector/0")
>> >> >>>>>> h0.write(u.vector(), "/Vector/1")
>> >> >>>>>>
>> >> >>>>>
>> >> >>>>> Shouldn't this be
>> >> >>>>>
>> >> >>>>>     h0.write(u.vector(), "/Function/Vector/0")
>> >> >>>>>     h0.write(u.vector(), "/Function/Vector/1")
>> >> >>>>>
>> >> >>>>
>> >> >>>> In the HDF5File model, the user is free to put vectors etc
>> >> >>>> wherever they want. There is no explicit meaning to dumping
>> >> >>>> extra vectors inside the "group" of a Function.
>> >> >>>>
>> >> >>>>
>> >> >>>>>
>> >> >>>>>> and to read back:
>> >> >>>>>>
>> >> >>>>>> u = Function(V)
>> >> >>>>>> h0 = HDF5File('Timeseries_of_Function.h5', 'r')
>> >> >>>>>> h0.read(u, "/Function")
>> >> >>>>>> h0.read(u.vector(), "/Function/vector")
>> >> >>>>>>
>> >> >>>>
>> >> >>>> OK, this probably should have been
>> >> >>>>
>> >> >>>> h0.read(u.vector(), "/Vector/1")
>> >> >>>>
>> >> >>>> When reading in a vector, it is just read directly, and not
>> >> >>>> reordered in any way. If the vector was saved from a different
>> >> >>>> set of processors, with different partitioning, the order could
>> >> >>>> be quite different.
>> >> >>>>
>> >> >>>> When reading a Function, the vector is reordered to take this into
>> >> >>>> account.
>> >> >>>>
>> >> >>>> If the vector is already associated with a Function (not all
>> >> >>>> vectors are) then it should be possible to reorder it when
>> >> >>>> reading... maybe that should be an option.
>> >> >>>>
>> >> >>>
>> >> >>> A solution seems very simple - use the HDF5 hierarchical
>> >> >>> structure to associate Vectors with a Function. This is the
>> >> >>> advantage of using a hierarchical storage format.
>> >> >>>
>> >> >>> If a user reads a Vector that is not already associated with a
>> >> >>> Function, then it should be the user's responsibility to take
>> >> >>> care of things.
>> >> >>>
>> >> >>
>> >> >> It could work like this:
>> >> >>
>> >> >> At present, when writing a Function, it creates a group and
>> >> >> populates it with dofmap, cells, and vector. Writing again with
>> >> >> the same name will cause an error. We could allow writes to the
>> >> >> same name, but create more vectors (maybe checking that the
>> >> >> cells/dofs are still compatible) in the same HDF5 group. Or, a
>> >> >> user could just manually dump more vectors in the group (as
>> >> >> described above by Garth).
>> >> >>
>> >> >> For read, reading a Function will still behave the same, but we
>> >> >> could have the additional option of reading a Function by just
>> >> >> giving the vector dataset name - and assuming that cell/dof
>> >> >> information exists in the same HDF5 group. This should be fairly
>> >> >> easy to implement.
>> >> >>
>> >> >>
>> >> >> Chris
>> >> >>
>> >> >> _______________________________________________
>> >> >> fenics mailing list
>> >> >> [email protected]
>> >> >> http://fenicsproject.org/mailman/listinfo/fenics
>> >> >
>> >> >
>> >> >
>> >
>> >
>> >
>
>
