On Wed, Oct 02, 2013 at 06:38:25PM +0100, Garth N. Wells wrote:
> My tests, based on modifying the below code, show that hashing can
> take more than 10% of the time to write a Mesh to HDF5.
>
> What's required is the right abstraction for handling Functions and
> files. I think the hashing approach is more of a hack. What about
> something along the lines of:
>
> Function u(V);
> Function w(V);
>
> HDF5Function hdf5_function_file("my_filename.h5", "w");
HDF5FunctionFile ?
~~~~
--
Anders
> hdf5_function_file.register(u, "u_name");
> hdf5_function_file.register(w, "w_name");
>
> hdf5_function_file.parameters["common_mesh"] = true;
> hdf5_function_file.parameters["write_mesh_once"] = true;
>
> // Write all registered functions
> hdf5_function_file.write();
>
> // Write all registered functions again
> hdf5_function_file.write();
>
> // Write u only
> hdf5_function_file.write("u_name");
>
> Some HDF5 trickery could be used to link and structure data in the file.
>
> Garth
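As a concrete illustration of the proposed interface, here is a plain-Python
mock of the registration pattern. The class and method names
(HDF5FunctionFile, register, write) are taken from the sketch above and are
hypothetical; no actual HDF5 I/O is performed, and this is not the real
DOLFIN API.

```python
class HDF5FunctionFile:
    """Mock of the proposed registration-based function file."""

    def __init__(self, filename, mode):
        self.filename = filename
        self.mode = mode
        self.registered = {}   # name -> function object
        self.datasets = []     # dataset paths "written" so far
        self.counter = {}      # per-name write counter

    def register(self, function, name):
        # Associate a Function with a name in the file.
        self.registered[name] = function

    def write(self, name=None):
        # Write one named function, or all registered ones.
        names = [name] if name is not None else list(self.registered)
        for n in names:
            i = self.counter.get(n, 0)
            self.datasets.append("/%s/vector_%d" % (n, i))
            self.counter[n] = i + 1


f = HDF5FunctionFile("my_filename.h5", "w")
f.register(object(), "u_name")
f.register(object(), "w_name")
f.write()           # write all registered functions
f.write()           # write all registered functions again
f.write("u_name")   # write u only
```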
>
> On 29 September 2013 18:07, Øyvind Evju <[email protected]> wrote:
> > from dolfin import *
> > set_log_level(10000)
> > if MPI.process_number() == 0:
> >     print "%10s --- %10s" % ("Cells", "Time (s)")
> > for i in range(5, 250, 20):
> >     mesh = UnitCubeMesh(i, i, i)
> >     tic()
> >     mesh.hash()
> >     t = toc()
> >     if MPI.process_number() == 0:
> >         print "%10d --- %.10f" % (mesh.size_global(3), t)
> >     del mesh
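For readers without DOLFIN at hand, a stdlib-only sketch of what hashing
mesh-sized connectivity data involves: a single linear pass over the packed
cell data. This is an illustration only; DOLFIN's Mesh::hash is implemented
differently.

```python
import hashlib
import struct

def hash_cells(cells):
    # Pack each tetrahedron's four vertex indices as little-endian
    # 64-bit ints and hash the whole buffer in one pass.
    buf = b"".join(struct.pack("<4q", *c) for c in cells)
    return hashlib.sha1(buf).hexdigest()

# ~100k synthetic cells of four vertex indices each.
cells = [(i, i + 1, i + 2, i + 3) for i in range(100000)]
digest = hash_cells(cells)
```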
> >
> >
> > 2013/9/29 Garth N. Wells <[email protected]>
> >>
> >> On 29 September 2013 17:46, Øyvind Evju <[email protected]> wrote:
> >> > From a quad core @ 2.20 GHz, calling the mesh.hash() function.
> >> >
> >>
> >> Please post test code.
> >>
> >> Garth
> >>
> >> > One process:
> >> > Cells --- Time (s)
> >> > 750 --- 0.0001020432
> >> > 93750 --- 0.0019581318
> >> > 546750 --- 0.0110230446
> >> > 1647750 --- 0.0335328579
> >> > 3684750 --- 0.0734529495
> >> > 6945750 --- 0.1374619007
> >> > 11718750 --- 0.2321729660
> >> > 18291750 --- 0.3683109283
> >> > 26952750 --- 0.5321540833
> >> > 37989750 --- 0.7479040623
> >> > 51690750 --- 1.0299670696
> >> > 68343750 --- 1.3440520763
> >> > 88236750 --- 1.7490680218
> >> >
> >> > Two processes:
> >> > Cells --- Time (s)
> >> > 750 --- 0.0002639294
> >> > 93750 --- 0.0011038780
> >> > 546750 --- 0.0128669739
> >> > 1647750 --- 0.0124230385
> >> > 3684750 --- 0.0274820328
> >> > 6945750 --- 0.0780282021
> >> > 11718750 --- 0.1386530399
> >> > (Out of memory)
> >> >
> >> >
> >> > -Øyvind
> >> >
> >> >
> >> >
> >> >
> >> > 2013/9/29 Garth N. Wells <[email protected]>
> >> >
> >> >> On 29 September 2013 17:12, Øyvind Evju <[email protected]> wrote:
> >> >> > Wouldn't it be quite messy to suddenly have several vectors
> >> >> > associated with a Function?
> >> >> >
> >> >>
> >> >> No. It's very natural for a time-dependent Function.
> >> >>
> >> >> > Creating a hash of the mesh and finite element and storing cells,
> >> >> > cell_dofs and x_cell_dofs there, we could keep the same structure
> >> >> > for Functions as today, with links (instead of actual data sets)
> >> >> > within each Function to cells, cell_dofs and x_cell_dofs.
> >> >> >
> >> >> > When writing a Function, a check is done to see if the cells,
> >> >> > cell_dofs and x_cell_dofs exist under the relevant hash. If the
> >> >> > hash (mesh, distribution or function space) changes, we need to
> >> >> > write these data sets under the new hash.
> >> >> >
> >> >> > Have I misunderstood this hashing? It does seem to be very
> >> >> > efficient, more efficient than rewriting those three datasets.
> >> >> >
> >> >>
> >> >> Can you post a benchmark for testing the speed of hashing?
> >> >>
> >> >> Garth
> >> >>
> >> >> >
> >> >> > -Øyvind
> >> >> >
> >> >> >
> >> >> >
> >> >> > 2013/9/28 Chris Richardson <[email protected]>
> >> >> >>
> >> >> >> On 28/09/2013 13:29, Garth N. Wells wrote:
> >> >> >>>
> >> >> >>> On 28 September 2013 12:56, Chris Richardson <[email protected]>
> >> >> >>> wrote:
> >> >> >>>>
> >> >> >>>> On 28/09/2013 11:31, Garth N. Wells wrote:
> >> >> >>>>>
> >> >> >>>>>
> >> >> >>>>> On 28 September 2013 10:42, Chris Richardson
> >> >> >>>>> <[email protected]>
> >> >> >>>>> wrote:
> >> >> >>>>>>
> >> >> >>>>>>
> >> >> >>>>>>
> >> >> >>>>>> This is a continuation of the discussion at:
> >> >> >>>>>>
> >> >> >>>>>> https://bitbucket.org/fenics-project/dolfin/pull-request/52
> >> >> >>>>>>
> >> >> >>>>>> The question is how best to save a time-series of Functions
> >> >> >>>>>> in HDF5, when the cell and dof layout remains constant.
> >> >> >>>>>>
> >> >> >>>>>> It has been suggested to use:
> >> >> >>>>>>
> >> >> >>>>>> u = Function(V)
> >> >> >>>>>> h0 = HDF5File('Timeseries_of_Function.h5', 'w')
> >> >> >>>>>> h0.write(u, '/Function')
> >> >> >>>>>> # Then later
> >> >> >>>>>> h0.write(u.vector(), "/Vector/0")
> >> >> >>>>>> h0.write(u.vector(), "/Vector/1")
> >> >> >>>>>>
> >> >> >>>>>
> >> >> >>>>> Shouldn't this be
> >> >> >>>>>
> >> >> >>>>> h0.write(u.vector(), "/Function/Vector/0")
> >> >> >>>>> h0.write(u.vector(), "/Function/Vector/1")
> >> >> >>>>>
> >> >> >>>>
> >> >> >>>> In the HDF5File model, the user is free to put vectors etc.
> >> >> >>>> wherever they want. There is no explicit meaning to dumping
> >> >> >>>> extra vectors inside the "group" of a Function.
> >> >> >>>>
> >> >> >>>>>
> >> >> >>>>>> and to read back:
> >> >> >>>>>>
> >> >> >>>>>> u = Function(V)
> >> >> >>>>>> h0 = HDF5File('Timeseries_of_Function.h5', 'r')
> >> >> >>>>>> h0.read(u, "/Function")
> >> >> >>>>>> h0.read(u.vector(), "/Function/vector")
> >> >> >>>>>>
> >> >> >>>>
> >> >> >>>> OK, this probably should have been
> >> >> >>>>
> >> >> >>>> h0.read(u.vector(), "/Vector/1")
> >> >> >>>>
> >> >> >>>> When reading in a vector, it is just read directly and not
> >> >> >>>> reordered in any way. If the vector was saved from a different
> >> >> >>>> set of processors, with different partitioning, the order could
> >> >> >>>> be quite different.
> >> >> >>>>
> >> >> >>>> When reading a Function, the vector is reordered to take this
> >> >> >>>> into account.
> >> >> >>>>
> >> >> >>>> If the vector is already associated with a Function (not all
> >> >> >>>> vectors are), then it should be possible to reorder it when
> >> >> >>>> reading... maybe that should be an option.
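The reordering step described above can be sketched in plain Python: if each
saved value is keyed by a global dof index, reading into a differently
partitioned Function is a gather through that index map. The names here are
illustrative, not DOLFIN's.

```python
def reorder(saved_values, saved_global_indices, target_global_indices):
    # Position of each global dof index in the saved ordering.
    pos = {g: i for i, g in enumerate(saved_global_indices)}
    # Gather the saved values into the ordering the reader expects.
    return [saved_values[pos[g]] for g in target_global_indices]

# Values saved under one partitioning, read back under another.
values = reorder([10.0, 20.0, 30.0], [2, 0, 1], [0, 1, 2])
```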
> >> >> >>>>
> >> >> >>>
> >> >> >>> A solution seems very simple - use the HDF5 hierarchical
> >> >> >>> structure to associate Vectors with a Function. This is the
> >> >> >>> advantage of using a hierarchical storage format.
> >> >> >>>
> >> >> >>> If a user reads a Vector that is not already associated with a
> >> >> >>> Function, then it should be the user's responsibility to take
> >> >> >>> care of things.
> >> >> >>>
> >> >> >>
> >> >> >> It could work like this:
> >> >> >>
> >> >> >> At present, when writing a Function, it creates a group and
> >> >> >> populates it with dofmap, cells, and vector. Writing again with
> >> >> >> the same name will cause an error. We could allow writes to the
> >> >> >> same name, but create more vectors (maybe checking that the
> >> >> >> cells/dofs are still compatible) in the same HDF5 group. Or, a
> >> >> >> user could just manually dump more vectors in the group (as
> >> >> >> described above by Garth).
> >> >> >>
> >> >> >> For read, reading a Function will still behave the same, but we
> >> >> >> could have the additional option of reading a Function by just
> >> >> >> giving the vector dataset name - and assuming that cell/dof
> >> >> >> information exists in the same HDF5 group. This should be fairly
> >> >> >> easy to implement.
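A dict-based mock of the layout Chris describes: the group holds cells and
dofmap once, plus an appending series of vectors, and a read can start from
a vector dataset by assuming its siblings live in the same group. The
function names and layout here are illustrative, not the real HDF5File
interface.

```python
def write_function(h5, group, dofmap, cells, vector):
    # Create the group on first write; later writes append a new vector
    # instead of raising an error.
    g = h5.setdefault(group, {"dofmap": dofmap, "cells": cells,
                              "vectors": []})
    # Mirror the "check that cells/dofs are still compatible" suggestion.
    assert g["dofmap"] == dofmap and g["cells"] == cells, "incompatible"
    g["vectors"].append(vector)

def read_function_from_vector(h5, group, index):
    # Assume cells/dofmap sit alongside the vectors in the same group.
    g = h5[group]
    return g["cells"], g["dofmap"], g["vectors"][index]

h5 = {}  # dict standing in for an HDF5 file
write_function(h5, "/Function", [0, 1, 2], [(0, 1, 2)], [1.0, 2.0, 3.0])
write_function(h5, "/Function", [0, 1, 2], [(0, 1, 2)], [4.0, 5.0, 6.0])
cells, dofmap, vec = read_function_from_vector(h5, "/Function", 1)
```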
> >> >> >>
> >> >> >>
> >> >> >> Chris
> >> >> >>
> >> >> >> _______________________________________________
> >> >> >> fenics mailing list
> >> >> >> [email protected]
> >> >> >> http://fenicsproject.org/mailman/listinfo/fenics
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >
> >> >
> >> >
> >> >
> >
> >