Here are my numbers (see attachment).

Johannes

On Tue, Mar 31, 2015 at 9:46 AM, Garth N. Wells <[email protected]> wrote:
> FEniCS 1.4 package (Ubuntu 14.10)
>
> Summary of timings                                       |  Average time  Total time  Reps
> ------------------------------------------------------------------------------------------
> Apply (PETScMatrix)                                      |    0.00033009    0.079882   242
> Apply (PETScVector)                                      |    6.9951e-06    0.005806   830
> Assemble cells                                           |      0.017927      9.5731   534
> Boost Cuthill-McKee graph ordering (from dolfin::Graph)  |    9.5844e-05  9.5844e-05     1
> Build Boost CSR graph                                    |    7.7009e-05  7.7009e-05     1
> Build mesh number mesh entities                          |             0           0     2
> Build sparsity                                           |     0.0041105   0.0082209     2
> Delete sparsity                                          |    1.0729e-06  2.1458e-06     2
> Init MPI                                                 |      0.055825    0.055825     1
> Init PETSc                                               |      0.056171    0.056171     1
> Init dof vector                                          |    0.00018656  0.00037313     2
> Init dofmap                                              |     0.0064399   0.0064399     1
> Init dofmap from UFC dofmap                              |     0.0017549   0.0035098     2
> Init tensor                                              |     0.0002135  0.00042701     2
> LU solver                                                |       0.11543      27.933   242
> PETSc LU solver                                          |        0.1154      27.926   242
>
>
>
> FEniCS dev (my build, using PETSc dev)
>
> [MPI_AVG] Summary of timings     |  reps    wall avg    wall tot
> ----------------------------------------------------------------
> Apply (PETScMatrix)              |   242  0.00020009    0.048421
> Apply (PETScVector)              |   830  8.5487e-06   0.0070954
> Assemble cells                   |   534    0.017001      9.0787
> Build mesh number mesh entities  |     1    7.35e-07    7.35e-07
> Build sparsity                   |     2   0.0068867    0.013773
> Delete sparsity                  |     2    9.88e-07   1.976e-06
> Init MPI                         |     1   0.0023164   0.0023164
> Init PETSc                       |     1    0.002519    0.002519
> Init dof vector                  |     2  0.00016088  0.00032177
> Init dofmap                      |     1     0.04457     0.04457
> Init dofmap from UFC dofmap      |     1   0.0035997   0.0035997
> Init tensor                      |     2  0.00034076  0.00068153
> LU solver                        |   242    0.097293      23.545
> PETSc LU solver                  |   242    0.097255      23.536
> SCOTCH graph ordering            |     1   0.0005598   0.0005598
> compute connectivity 1 - 2       |     1  0.00088592  0.00088592
> compute entities dim = 1         |     1    0.028021    0.028021
>
> Garth
>
>
> On Mon, Mar 30, 2015 at 11:37 PM, Jan Blechta
> <[email protected]> wrote:
>> Could you guys run it with
>>
>>   list_timings()
>>
>> to get a detailed breakdown of where the time is spent?
>>
>> Jan
>>
>>
>> On Mon, 30 Mar 2015 23:21:41 +0200
>> Johannes Ring <[email protected]> wrote:
>>
>>> On Mon, Mar 30, 2015 at 8:37 PM, Anders Logg <[email protected]> wrote:
>>> > Could you or someone else build FEniCS with fenics-install.sh
>>> > (takes time but is presumably automatic) and compare?
>>>
>>> I got 53s with the Debian packages and 1m5s with the HashDist based
>>> installation.
>>>
>>> > The alternative would be for me to build FEniCS manually but that
>>> > takes a lot of manual effort and it's not clear I can make a "good"
>>> > build. It would be good to get a number, not only to check for a
>>> > possible regression but also to test whether something is
>>> > suboptimal in the HashDist build.
>>> >
>>> > Johannes, is the HashDist build with optimization?
>>>
>>> DOLFIN is built with CMAKE_BUILD_TYPE=Release. The flags for building
>>> PETSc are listed below.
>>>
>>> Johannes
>>>
>>> PETSc flags for Debian package:
>>>
>>> PETSC_DIR=/tmp/src/petsc-3.4.2.dfsg1 PETSC_ARCH=linux-gnu-c-opt \
>>>   ./config/configure.py --with-shared-libraries --with-debugging=0 \
>>>   --useThreads 0 --with-clanguage=C++ --with-c-support \
>>>   --with-fortran-interfaces=1 \
>>>   --with-mpi-dir=/usr/lib/openmpi --with-mpi-shared=1 \
>>>   --with-blas-lib=-lblas --with-lapack-lib=-llapack \
>>>   --with-blacs=1 --with-blacs-include=/usr/include \
>>>   --with-blacs-lib=[/usr/lib/libblacsCinit-openmpi.so,/usr/lib/libblacs-openmpi.so] \
>>>   --with-scalapack=1 --with-scalapack-include=/usr/include \
>>>   --with-scalapack-lib=/usr/lib/libscalapack-openmpi.so \
>>>   --with-mumps=1 --with-mumps-include=/usr/include \
>>>   --with-mumps-lib=[/usr/lib/libdmumps.so,/usr/lib/libzmumps.so,/usr/lib/libsmumps.so,/usr/lib/libcmumps.so,/usr/lib/libmumps_common.so,/usr/lib/libpord.so] \
>>>   --with-umfpack=1 --with-umfpack-include=/usr/include/suitesparse \
>>>   --with-umfpack-lib=[/usr/lib/libumfpack.so,/usr/lib/libamd.so] \
>>>   --with-cholmod=1 --with-cholmod-include=/usr/include/suitesparse \
>>>   --with-cholmod-lib=/usr/lib/libcholmod.so \
>>>   --with-spooles=1 --with-spooles-include=/usr/include/spooles \
>>>   --with-spooles-lib=/usr/lib/libspooles.so \
>>>   --with-hypre=1 --with-hypre-dir=/usr \
>>>   --with-ptscotch=1 --with-ptscotch-include=/usr/include/scotch \
>>>   --with-ptscotch-lib=[/usr/lib/libptesmumps.so,/usr/lib/libptscotch.so,/usr/lib/libptscotcherr.so] \
>>>   --with-fftw=1 --with-fftw-include=/usr/include \
>>>   --with-fftw-lib=[/usr/lib/x86_64-linux-gnu/libfftw3.so,/usr/lib/x86_64-linux-gnu/libfftw3_mpi.so] \
>>>   --with-hdf5=1 --with-hdf5-dir=/usr/lib/x86_64-linux-gnu/hdf5/openmpi \
>>>   --CXX_LINKER_FLAGS="-Wl,--no-as-needed"
>>>
>>>
>>> PETSc flags for HashDist based build:
>>>
>>> mkdir ${PWD}/_tmp && TMPDIR=${PWD}/_tmp \
>>>   ./configure --prefix="${ARTIFACT}" \
>>>   COPTFLAGS=-O2 \
>>>   --with-shared-libraries=1 \
>>>   --with-debugging=0 \
>>>   --with-ssl=0 \
>>>   --with-blas-lapack-lib=${OPENBLAS_DIR}/lib/libopenblas.so \
>>>   --with-metis-dir=$PARMETIS_DIR \
>>>   --with-parmetis-dir=$PARMETIS_DIR \
>>>   --with-scotch-dir=${SCOTCH_DIR} \
>>>   --with-ptscotch-dir=${SCOTCH_DIR} \
>>>   --with-suitesparse=1 \
>>>   --with-suitesparse-include=${SUITESPARSE_DIR}/include/suitesparse \
>>>   --with-suitesparse-lib=[${SUITESPARSE_DIR}/lib/libumfpack.a,libklu.a,libcholmod.a,libbtf.a,libccolamd.a,libcolamd.a,libcamd.a,libamd.a,libsuitesparseconfig.a] \
>>>   --with-hypre=1 \
>>>   --with-hypre-include=${HYPRE_DIR}/include \
>>>   --with-hypre-lib=${HYPRE_DIR}/lib/libHYPRE.so \
>>>   --with-mpi-compilers \
>>>   CC=$MPICC \
>>>   CXX=$MPICXX \
>>>   F77=$MPIF77 \
>>>   F90=$MPIF90 \
>>>   FC=$MPIF90 \
>>>   --with-patchelf-dir=$PATCHELF_DIR \
>>>   --with-python-dir=$PYTHON_DIR \
>>>   --with-superlu_dist-dir=$SUPERLU_DIST_DIR \
>>>   --download-mumps=1 \
>>>   --download-scalapack=1 \
>>>   --download-blacs=1 \
>>>   --download-ml=1
>>>
>>>
>>> > --
>>> > Anders
>>> >
>>> >
>>> > On Mon, 30 Mar 2015 at 17:05, Garth N. Wells <[email protected]> wrote:
>>> >>
>>> >> On Mon, Mar 30, 2015 at 1:34 PM, Anders Logg <[email protected]>
>>> >> wrote:
>>> >> > See this question on the QA forum:
>>> >> >
>>> >> >
>>> >> > http://fenicsproject.org/qa/6875/ubuntu-compile-from-source-which-provide-better-performance
>>> >> >
>>> >> > The Cahn-Hilliard demo takes 40 seconds with 1.3 Ubuntu packages
>>> >> > and 52 seconds with 1.5+ built from source. Are these
>>> >> > regressions in performance, or
>>> >> > is Johannes that much better at building Debian packages than I
>>> >> > am at building
>>> >> > FEniCS (with HashDist)?
>>> >> >
>>> >>
>>> >> With the 1.4 Ubuntu package (Ubuntu 14.10), I get 42s. With my
>>> >> build of the dev version (I don't use Hashdist) I get 34s.
>>> >>
>>> >> Garth
>>> >>
>>> >> > PS: Looking at the benchbot, there seem to have been some
>>> >> > regressions in the
>>> >> > timing facilities with the recent changes:
>>> >> >
>>> >> > http://fenicsproject.org/benchbot/
>>> >> >
>>> >> > --
>>> >> > Anders
>>> >> >
>>> >> >
>>> >> > _______________________________________________
>>> >> > fenics mailing list
>>> >> > [email protected]
>>> >> > http://fenicsproject.org/mailman/listinfo/fenics
>>> >> >
>>
FEniCS 1.5.0 (Debian package)

Summary of timings                                       |  Average time  Total time  Reps
------------------------------------------------------------------------------------------
Apply (PETScMatrix)                                      |    0.00030709     0.07493   244
Apply (PETScVector)                                      |    1.0215e-05   0.0085604   838
Assemble cells                                           |      0.017019       9.156   538
Boost Cuthill-McKee graph ordering (from dolfin::Graph)  |     0.0015955    0.003191     2
Build Boost CSR graph                                    |    0.00042701  0.00085402     2
Build mesh number mesh entities                          |    1.0729e-06  2.1458e-06     2
Build sparsity                                           |     0.0058496    0.011699     2
Delete sparsity                                          |    4.7684e-07  9.5367e-07     2
Init MPI                                                 |       0.24635     0.24635     1
Init PETSc                                               |    0.00026011  0.00026011     1
Init dof vector                                          |    0.00030649  0.00061297     2
Init dofmap                                              |      0.024765    0.049531     2
Init dofmap from UFC dofmap                              |     0.0036101   0.0072203     2
Init tensor                                              |    0.00024343  0.00048685     2
LU solver                                                |       0.19981      48.754   244
PETSc LU solver                                          |       0.19978      48.746   244
compute connectivity 1 - 2                               |      0.001076    0.001076     1
compute entities dim = 1                                 |      0.018031    0.018031     1

real    1m1.840s
user    0m59.780s
sys     0m2.048s


FEniCS 1.5.0 (built with fenics-install.sh)

Summary of timings               |  Average time  Total time  Reps
------------------------------------------------------------------
Apply (PETScMatrix)              |      0.000407    0.099308   244
Apply (PETScVector)              |    1.5445e-05    0.012943   838
Assemble cells                   |      0.038805      20.877   538
Build mesh number mesh entities  |    1.4305e-06   2.861e-06     2
Build sparsity                   |     0.0055639    0.011128     2
Delete sparsity                  |    9.5367e-07  1.9073e-06     2
Init PETSc                       |    2.8849e-05  2.8849e-05     1
Init dof vector                  |     0.0001756  0.00035119     2
Init dofmap                      |      0.021683    0.043367     2
Init dofmap from UFC dofmap      |     0.0040101   0.0080202     2
Init tensor                      |    0.00035644  0.00071287     2
LU solver                        |       0.19576      47.765   244
PETSc LU solver                  |       0.19572      47.755   244
SCOTCH graph ordering            |    0.00063097   0.0012619     2
compute connectivity 1 - 2       |     0.0011649   0.0011649     1
compute entities dim = 1         |      0.016705    0.016705     1

real    1m14.187s
user    1m41.408s
sys     2m49.892s
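
The `time` figures for the two 1.5.0 runs can be put side by side as a ratio of CPU time (user + sys) to wall time (real). The following is only a rough sketch; `to_seconds` and `cpu_to_wall` are hypothetical helper names, not part of any tool used in this thread. A ratio well above 1 means more than one core was busy on average. The thread does not establish why that happens in the fenics-install.sh build, so treat the number as a pointer for further investigation and not as a diagnosis.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style stamp such as '1m14.187s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time stamp: {stamp!r}")
    return 60 * int(match.group(1)) + float(match.group(2))


def cpu_to_wall(real, user, system):
    """Return (user + sys) / real, i.e. average number of busy cores."""
    return (to_seconds(user) + to_seconds(system)) / to_seconds(real)


# Values copied from the two FEniCS 1.5.0 runs above.
debian = cpu_to_wall("1m1.840s", "0m59.780s", "0m2.048s")
hashdist = cpu_to_wall("1m14.187s", "1m41.408s", "2m49.892s")

print(f"Debian package:    {debian:.2f}")
print(f"fenics-install.sh: {hashdist:.2f}")
```

With these inputs the Debian package run stays close to 1, while the fenics-install.sh run comes out between 3 and 4.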


FEniCS 1.6.0dev (built with fenics-install.sh)

[MPI_AVG] Summary of timings     |  reps    wall avg    wall tot
----------------------------------------------------------------
Apply (PETScMatrix)              |   244  0.00037086    0.090489
Apply (PETScVector)              |   838  1.6145e-05    0.013529
Assemble cells                   |   538    0.036995      19.903
Build mesh number mesh entities  |     2  1.2095e-06   2.419e-06
Build sparsity                   |     2   0.0048764   0.0097528
Delete sparsity                  |     2  1.0975e-06   2.195e-06
Init PETSc                       |     1  3.0598e-05  3.0598e-05
Init dof vector                  |     2  0.00016808  0.00033615
Init dofmap                      |     2    0.025441    0.050881
Init dofmap from UFC dofmap      |     2   0.0035895   0.0071789
Init tensor                      |     2  0.00032729  0.00065458
LU solver                        |   244     0.17353      42.341
PETSc LU solver                  |   244     0.17348      42.328
SCOTCH graph ordering            |     2  0.00057533   0.0011507
compute connectivity 1 - 2       |     1  0.00099166  0.00099166
compute entities dim = 1         |     1     0.02528     0.02528

real    1m7.443s
user    1m33.664s
sys     2m37.236s
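
To see which timers account for the gap between the two FEniCS 1.5.0 builds, the tables above can be compared row by row. Below is a minimal sketch, assuming the older table layout (name | average, total, reps); `parse_totals` and `compare` are hypothetical helpers written for this message, not DOLFIN functions, and only two rows are copied in to keep it short.

```python
# Rows copied from the two FEniCS 1.5.0 tables above.
DEBIAN_1_5 = """\
Summary of timings               |  Average time  Total time  Reps
Assemble cells                   |      0.017019       9.156   538
LU solver                        |       0.19981      48.754   244
"""

HASHDIST_1_5 = """\
Summary of timings               |  Average time  Total time  Reps
Assemble cells                   |      0.038805      20.877   538
LU solver                        |       0.19576      47.765   244
"""


def parse_totals(table):
    """Map each timer name to its total time in seconds."""
    totals = {}
    for line in table.splitlines():
        name, sep, values = line.partition("|")
        fields = values.split()
        if not sep or len(fields) != 3:
            continue  # skips the header and separator lines
        try:
            total = float(fields[1])
        except ValueError:
            continue
        totals[name.strip()] = total
    return totals


def compare(base, other):
    """Return other/base total-time ratios for timers present in both."""
    return {
        name: other[name] / base[name]
        for name in base
        if name in other and base[name] > 0
    }


ratios = compare(parse_totals(DEBIAN_1_5), parse_totals(HASHDIST_1_5))
for name, ratio in sorted(ratios.items()):
    print(f"{name}: {ratio:.2f}x")
```

On these two rows, "Assemble cells" takes roughly 2.3 times as long in the fenics-install.sh build, while "LU solver" is slightly faster, so the assembly timer is the one to look at first.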