Re: [deal.II] Re: Application of inverse mass matrix in matrix-free context using cell_loop instead of scale()

'Maxi Miller' via deal.II User Group Sun, 19 Jan 2020 04:51:21 -0800

Hei,
I attached a working MWE. Changing between the approach suggested above and 
scale() is done by changing the variable use_scale_function. The output 
contains both the expected solution and the obtained solution.
My approach for replacing scale() with a local loop is
            data.cell_loop(&LaplaceOperator::local_apply_cell,
                           this,
                           dst,
                           src,
                           [&](const unsigned int start_range, const 
unsigned int end_range){
                for(size_t i = start_range; i < end_range; ++i){
                    dst.local_element(i) = 0;
                }
            },
            [&](const unsigned int start_range, const unsigned int end_range
){
                for(unsigned int i = start_range; i < end_range; ++i){
                    dst.local_element(i) = dst.local_element(i) * 
inv_mass_matrix.local_element(i);
                }
            })


I expect my result to follow the solution function exp(-2 * pi^2 * t) * 
sin(pi * x) * sin(pi * y). While that works for the scale()-approach, the 
result for integrated approach does not change from time step to time step. 
Currently I am only running it with a single MPI thread, after encountering 
issues when running with several threads.
Thanks!

Am Samstag, 18. Januar 2020 18:42:04 UTC+1 schrieb Daniel Arndt:
>
> Maxi,
>
> As usual, it is much easier to help if you provide a complete minimal 
> example and say how the result differs from what you expect.
> Does it only scale certain vector entries? Are the results correct when 
> running with one MPI process?
> How does your approach differ from 
> https://github.com/dealii/dealii/blob/b84270a1d4099292be5b3d43c2ea65f3ee005919/tests/matrix_free/pre_and_post_loops_01.cc#L100-L121
> ?
>
> Best,
> Daniel
>
> Am Sa., 18. Jan. 2020 um 12:05 Uhr schrieb 'Maxi Miller' via deal.II User 
> Group <dea...@googlegroups.com <javascript:>>:
>
>> I tried implementing it as 
>>         data.cell_loop(&LaplaceOperator::local_apply_cell,
>>                        this,
>>                        dst,
>>                        src,
>>                        //true,
>>                        [&](const unsigned int start_range, const unsigned 
>> int end_range){
>>             for(size_t i = start_range; i < end_range; ++i){
>>                 dst.local_element(i) = 0;
>>             }
>>         },
>>         [&](const unsigned int start_range, const unsigned int end_range
>> ){
>>             for(unsigned int i = start_range; i < end_range; ++i){
>>                 dst.local_element(i) = dst.local_element(i) * 
>> inv_mass_matrix.local_element(i);
>>             }
>>         });
>>
>> but the result was not correct. Thus, I assume I have to do something 
>> else?
>> Thanks!
>>
>> Am Samstag, 18. Januar 2020 17:12:34 UTC+1 schrieb peterrum:
>>>
>>> Yes, like here 
>>> https://github.com/dealii/dealii/blob/b84270a1d4099292be5b3d43c2ea65f3ee005919/tests/matrix_free/pre_and_post_loops_01.cc#L100-L121
>>>
>>> On Saturday, 18 January 2020 12:57:24 UTC+1, Maxi Miller wrote:
>>>>
>>>> In step-48 the inverse mass matrix is applied by moving the inverse 
>>>> data into a vector and applying the function scale(), i.e. as in the 
>>>> following code:
>>>>         data.cell_loop(&LaplaceOperator::local_apply_cell,
>>>>                        this,
>>>>                        dst,
>>>>                        src,
>>>>                        true);
>>>>         computing_times[0] += timer.wall_time();
>>>>         timer.restart();
>>>>
>>>>        dst.scale(inv_mass_matrix);
>>>>
>>>>
>>>>
>>>> Now I would like to do the same, but use a cell_loop instead of 
>>>> scale(). Thus, I created an additional function called 
>>>> "local_apply_inverse_mass_matrix" as 
>>>>     template <int dim, int degree, int n_points_1d>
>>>>     void LaplaceOperator<dim, degree, n_points_1d>::
>>>> local_apply_inverse_mass_matrix(
>>>>             const MatrixFree<dim, Number> &                   data,
>>>>             LinearAlgebra::distributed::Vector<Number> &      dst,
>>>>             const LinearAlgebra::distributed::Vector<Number> &src,
>>>>             const std::pair<unsigned int, unsigned int> &     
>>>> cell_range) const
>>>>     {
>>>>         (void) data;
>>>>         (void) cell_range;
>>>>         dst = src;
>>>>         dst.scale(inv_mass_matrix);
>>>>     }
>>>>
>>>> When running that code, though, using
>>>>         LinearAlgebra::distributed::Vector<Number> tmp(src);
>>>>
>>>>         data.initialize_dof_vector(tmp);
>>>>         data.initialize_dof_vector(dst);
>>>>         data.cell_loop(&LaplaceOperator::local_apply_cell,
>>>>                        this,
>>>>                        tmp,
>>>>                        src,
>>>>                        true);
>>>>         computing_times[0] += timer.wall_time();
>>>>         timer.restart();
>>>>
>>>>         data.cell_loop(&LaplaceOperator::
>>>> local_apply_inverse_mass_matrix,
>>>>                        this,
>>>>                        dst,
>>>>                        tmp,
>>>>                        true);
>>>>         computing_times[1] += timer.wall_time();
>>>>
>>>>         computing_times[3] += 1.;
>>>>
>>>>
>>>> I get the error 
>>>> An error occurred in line <3338> of file </opt/dealii/include/deal.II/
>>>> matrix_free/matrix_free.h> in function
>>>>     void dealii::internal::VectorDataExchange<dim, Number, 
>>>> VectorizedArrayType>::compress_start(unsigned int, VectorType&) [with 
>>>> VectorType = dealii::LinearAlgebra::distributed::Vector<double>; 
>>>> typename std::enable_if<(dealii::internal::has_compress_start<
>>>> VectorType>::value && dealii::internal::has_exchange_on_subset<
>>>> VectorType>::value), VectorType>::type* <anonymous> = 0; int dim = 2; 
>>>> Number = double; VectorizedArrayType = dealii::VectorizedArray<double, 
>>>> 4>]
>>>> The violated condition was: 
>>>>     vec.has_ghost_elements() == false
>>>> Additional information: 
>>>>     You are trying to use functionality in deal.II that is currently 
>>>> not implemented. In many cases, this indicates that there simply didn't 
>>>> appear much of a need for it, or that the author of the original code did 
>>>> not have the time to implement a particular case. If you hit this 
>>>> exception, it is therefore worth the time to look into the code to find 
>>>> out 
>>>> whether you may be able to implement the missing functionality. If you do, 
>>>> please consider providing a patch to the deal.II development sources (see 
>>>> the deal.II website on how to contribute).
>>>>
>>>> Stacktrace:
>>>> -----------
>>>> #0  ./MF_FES_RK4-Test: void dealii::internal::VectorDataExchange<2, 
>>>> double, dealii::VectorizedArray<double, 4> 
>>>> >::compress_start<dealii::LinearAlgebra::distributed::Vector<double, 
>>>> dealii::MemorySpace::Host>, 
>>>> (dealii::LinearAlgebra::distributed::Vector<double, 
>>>> dealii::MemorySpace::Host>*)0>(unsigned int, 
>>>> dealii::LinearAlgebra::distributed::Vector<double, 
>>>> dealii::MemorySpace::Host>&)
>>>> #1  ./MF_FES_RK4-Test: void dealii::internal::compress_start<2, 
>>>> dealii::LinearAlgebra::distributed::Vector<double, 
>>>> dealii::MemorySpace::Host>, double, dealii::VectorizedArray<double, 4>, 
>>>> (dealii::LinearAlgebra::distributed::Vector<double, 
>>>> dealii::MemorySpace::Host>*)0>(dealii::LinearAlgebra::distributed::Vector<double,
>>>>  
>>>> dealii::MemorySpace::Host>&, dealii::internal::VectorDataExchange<2, 
>>>> double, dealii::VectorizedArray<double, 4> >&, unsigned int)
>>>> #2  ./MF_FES_RK4-Test: dealii::internal::MFWorker<dealii::MatrixFree<2, 
>>>> double, dealii::VectorizedArray<double, 4> >, 
>>>> dealii::LinearAlgebra::distributed::Vector<double, 
>>>> dealii::MemorySpace::Host>, 
>>>> dealii::LinearAlgebra::distributed::Vector<double, 
>>>> dealii::MemorySpace::Host>, Step40::LaplaceOperator<2, 2, 4>, 
>>>> true>::vector_compress_start()
>>>> #3  /opt/dealii/lib/libdeal_II.g.so.9.2.0-pre: 
>>>> dealii::internal::MatrixFreeFunctions::MPICommunication::execute()
>>>> #4 
>>>>  
>>>> /opt/intel/compilers_and_libraries_2019.5.281/linux/tbb/lib/intel64_lin/gcc4.7/libtbb_debug.so.2:
>>>>  
>>>>
>>>> #5 
>>>>  
>>>> /opt/intel/compilers_and_libraries_2019.5.281/linux/tbb/lib/intel64_lin/gcc4.7/libtbb_debug.so.2:
>>>>  
>>>>
>>>> #6 
>>>>  
>>>> /opt/intel/compilers_and_libraries_2019.5.281/linux/tbb/lib/intel64_lin/gcc4.7/libtbb_debug.so.2:
>>>>  
>>>>
>>>> #7 
>>>>  
>>>> /opt/intel/compilers_and_libraries_2019.5.281/linux/tbb/lib/intel64_lin/gcc4.7/libtbb_debug.so.2:
>>>>  
>>>>
>>>> #8 
>>>>  
>>>> /opt/intel/compilers_and_libraries_2019.5.281/linux/tbb/lib/intel64_lin/gcc4.7/libtbb_debug.so.2:
>>>>  
>>>>
>>>> #9  /lib64/libpthread.so.0: 
>>>> #10  /lib64/libc.so.6: clone
>>>>
>>>> Why that, and how can I fix it? 
>>>> Thanks!
>>>>
>>> -- 
>> The deal.II project is located at http://www.dealii.org/
>> For mailing list/forum options, see 
>> https://groups.google.com/d/forum/dealii?hl=en
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "deal.II User Group" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to dea...@googlegroups.com <javascript:>.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/dealii/a3c92a70-323e-48a5-9490-58e0b72b8860%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/dealii/a3c92a70-323e-48a5-9490-58e0b72b8860%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>

-- 
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see 
https://groups.google.com/d/forum/dealii?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"deal.II User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to dealii+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/dealii/77ef5f99-3331-4830-8ae6-7d7862091d5c%40googlegroups.com.

/* ---------------------------------------------------------------------
 *
 * Copyright (C) 2009 - 2019 by the deal.II authors
 *
 * This file is part of the deal.II library.
 *
 * The deal.II library is free software; you can use it, redistribute
 * it, and/or modify it under the terms of the GNU Lesser General
 * Public License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 * The full text of the license can be found in the file LICENSE.md at
 * the top level directory of deal.II.
 *
 * ---------------------------------------------------------------------
 *
 * Author: Wolfgang Bangerth, Texas A&M University, 2009, 2010
 *         Timo Heister, University of Goettingen, 2009, 2010
 */
#include <deal.II/base/quadrature_lib.h>
#include <deal.II/base/function.h>
#include <deal.II/base/timer.h>
#include <deal.II/lac/generic_linear_algebra.h>

#include <deal.II/base/time_stepping.h>

#include <deal.II/base/utilities.h>
#include <deal.II/base/conditional_ostream.h>
#include <deal.II/base/index_set.h>
#include <deal.II/base/convergence_table.h>

namespace LA
{
#undef DEAL_II_WITH_PETSC
#if defined(DEAL_II_WITH_PETSC) && !defined(DEAL_II_PETSC_WITH_COMPLEX) && \
	!(defined(DEAL_II_WITH_TRILINOS) && defined(FORCE_USE_OF_TRILINOS))
	using namespace dealii::LinearAlgebraPETSc;
#  define USE_PETSC_LA
#elif defined(DEAL_II_WITH_TRILINOS)
	using namespace dealii::LinearAlgebraTrilinos;
#else
#  error DEAL_II_WITH_PETSC or DEAL_II_WITH_TRILINOS required
#endif
} // namespace LA
#include <deal.II/lac/vector.h>
#include <deal.II/lac/full_matrix.h>
#include <deal.II/lac/solver_cg.h>
#include <deal.II/lac/affine_constraints.h>
#include <deal.II/lac/dynamic_sparsity_pattern.h>
#include <deal.II/lac/sparsity_tools.h>
#include <deal.II/lac/linear_operator.h>
#include <deal.II/lac/parpack_solver.h>
#include <deal.II/lac/solver_gmres.h>
#include <deal.II/lac/sparse_direct.h>
#include <deal.II/lac/la_parallel_vector.h>

#include <deal.II/grid/grid_generator.h>
#include <deal.II/grid/tria_accessor.h>
#include <deal.II/grid/tria_iterator.h>

#include <deal.II/dofs/dof_handler.h>
#include <deal.II/dofs/dof_accessor.h>
#include <deal.II/dofs/dof_tools.h>

#include <deal.II/fe/fe_values.h>
#include <deal.II/fe/fe_q.h>
#include <deal.II/fe/fe_system.h>

#include <deal.II/numerics/vector_tools.h>
#include <deal.II/numerics/data_out.h>
#include <deal.II/numerics/error_estimator.h>

#include <deal.II/matrix_free/fe_evaluation.h>
#include <deal.II/matrix_free/matrix_free.h>
#include <deal.II/matrix_free/operators.h>

#include <deal.II/differentiation/ad.h>

#include <deal.II/distributed/tria.h>
#include <deal.II/distributed/grid_refinement.h>

#include <fstream>
#include <iostream>

constexpr double time_step = 0.01;

constexpr unsigned int fe_degree            = 2;
constexpr unsigned int n_q_points_1d        = fe_degree + 2;
using Number                                = double;
constexpr unsigned int n_components			= 2;

constexpr bool use_scale_function = true;

namespace Step40
{
	using namespace dealii;

	template <int dim, int degree, int n_points_1d>
	class LaplaceOperator
	{
	public:
		static constexpr unsigned int n_quadrature_points_1d = n_points_1d;

		LaplaceOperator() : computing_times(6){

		};

		void reinit(const Mapping<dim> &   mapping,
					const DoFHandler<dim> &dof_handler,
					const AffineConstraints<double> &hanging_node_constraints);

		void print_computing_times() const;

		void apply(const double                                      current_time,
				   const LinearAlgebra::distributed::Vector<Number> &src,
				   LinearAlgebra::distributed::Vector<Number> &      dst) const;

		void project(const Function<dim> &                       function,
					 LinearAlgebra::distributed::Vector<Number> &solution) const;

		void
		initialize_vector(LinearAlgebra::distributed::Vector<Number> &vector) const;

	private:
		MatrixFree<dim, Number>     data;
		mutable std::vector<double> computing_times;

		LinearAlgebra::distributed::Vector<double> inv_mass_matrix;

		void local_apply_inverse_mass_matrix(
				const MatrixFree<dim, Number> &                   data,
				LinearAlgebra::distributed::Vector<Number> &      dst,
				const LinearAlgebra::distributed::Vector<Number> &src,
				const std::pair<unsigned int, unsigned int> &     cell_range) const;

		void local_apply_cell(
				const MatrixFree<dim, Number> &                   data,
				LinearAlgebra::distributed::Vector<Number> &      dst,
				const LinearAlgebra::distributed::Vector<Number> &src,
				const std::pair<unsigned int, unsigned int> &     cell_range) const;
	};

	template <int dim, int degree, int n_points_1d>
	void LaplaceOperator<dim, degree, n_points_1d>::reinit(
			const Mapping<dim> &mapping,
			const DoFHandler<dim> &dof_handler,
			const AffineConstraints<double> &hanging_node_constraints){
		std::vector<const DoFHandler<dim> *>           dof_handlers({&dof_handler});
		AffineConstraints<double>                      dummy(hanging_node_constraints);
		std::vector<const AffineConstraints<double> *> constraints({&hanging_node_constraints});
		std::vector<Quadrature<1>>                     quadratures(
		{QGauss<1>(n_q_points_1d),
		 QGauss<1>(fe_degree + 1)});
		typename MatrixFree<dim, Number>::AdditionalData additional_data;
		additional_data.mapping_update_flags =
				(update_gradients | update_JxW_values | update_quadrature_points |
				 update_values);

		additional_data.tasks_parallel_scheme =
				//MatrixFree<dim, Number>::AdditionalData::none;
				MatrixFree<dim, Number>::AdditionalData::partition_partition;

		data.reinit(mapping,
					dof_handlers,
					constraints,
					quadratures,
					additional_data);

		data.initialize_dof_vector(inv_mass_matrix);
		FEEvaluation<dim, degree, n_points_1d, n_components, Number> fe_eval(data);
		const unsigned int           n_q_points = fe_eval.n_q_points;
		for (unsigned int cell = 0; cell < data.n_cell_batches(); ++cell)
		{
			fe_eval.reinit(cell);
			for (unsigned int q = 0; q < n_q_points; ++q)
				fe_eval.submit_value(Tensor<1, n_components, VectorizedArray<Number>>({make_vectorized_array(1.),
																					   make_vectorized_array(1.)}), q);
			fe_eval.integrate(true, false);
			fe_eval.distribute_local_to_global(inv_mass_matrix);
		}
		inv_mass_matrix.compress(VectorOperation::add);
		for (unsigned int k = 0; k < inv_mass_matrix.local_size(); ++k)
			if (inv_mass_matrix.local_element(k) > 1e-15)
				inv_mass_matrix.local_element(k) =
						1. / inv_mass_matrix.local_element(k);
			else
				inv_mass_matrix.local_element(k) = 1;
	}

	template <int dim, int degree, int n_points_1d>
	void LaplaceOperator<dim, degree, n_points_1d>::initialize_vector(
			LinearAlgebra::distributed::Vector<Number> &vector) const
	{
		data.initialize_dof_vector(vector);
	}

	template <int dim, int degree, int n_points_1d>
	void LaplaceOperator<dim, degree, n_points_1d>::print_computing_times() const
	{
		ConditionalOStream        pcout(std::cout,
										Utilities::MPI::this_mpi_process(MPI_COMM_WORLD) ==
										0);
		Utilities::MPI::MinMaxAvg data;
		if (computing_times[2] + computing_times[3] > 0)
		{
			std::ios_base::fmtflags flags(std::cout.flags());
			std::cout << std::setprecision(3);
			pcout << "   Apply called "
				  << static_cast<std::size_t>(computing_times[3]) << " times"
																  << std::endl;
			pcout << "   RK update called "
				  << static_cast<std::size_t>(computing_times[2]) << " times"
																  << std::endl;
			pcout << "   Evaluate L_h:            ";
			data = Utilities::MPI::min_max_avg(computing_times[0], MPI_COMM_WORLD);
			pcout << std::scientific << std::setw(10) << data.min << " (p"
				  << std::setw(4) << data.min_index << ") " << std::setw(10)
				  << data.avg << std::setw(10) << data.max << " (p" << std::setw(4)
				  << data.max_index << ")" << std::endl;
			pcout << "   Inverse mass (+RK upd):  ";
			data = Utilities::MPI::min_max_avg(computing_times[1], MPI_COMM_WORLD);
			pcout << std::scientific << std::setw(10) << data.min << " (p"
				  << std::setw(4) << data.min_index << ") " << std::setw(10)
				  << data.avg << std::setw(10) << data.max << " (p" << std::setw(4)
				  << data.max_index << ")" << std::endl;
			pcout << "   Compute transport speed: ";
			data = Utilities::MPI::min_max_avg(computing_times[4], MPI_COMM_WORLD);
			pcout << std::scientific << std::setw(10) << data.min << " (p"
				  << std::setw(4) << data.min_index << ") " << std::setw(10)
				  << data.avg << std::setw(10) << data.max << " (p" << std::setw(4)
				  << data.max_index << ")" << std::endl;
			pcout << "   Compute errors/norms:    ";
			data = Utilities::MPI::min_max_avg(computing_times[5], MPI_COMM_WORLD);
			pcout << std::scientific << std::setw(10) << data.min << " (p"
				  << std::setw(4) << data.min_index << ") " << std::setw(10)
				  << data.avg << std::setw(10) << data.max << " (p" << std::setw(4)
				  << data.max_index << ")" << std::endl;
			std::cout.flags(flags);
		}
	}

	template <int dim, int degree, int n_points_1d>
	void LaplaceOperator<dim, degree, n_points_1d>::local_apply_cell(
			const MatrixFree<dim, Number> &                   data,
			LinearAlgebra::distributed::Vector<Number> &      dst,
			const LinearAlgebra::distributed::Vector<Number> &src,
			const std::pair<unsigned int, unsigned int> &     cell_range) const
	{
		FEEvaluation<dim, degree, n_points_1d, n_components, Number> phi(data);

		for (unsigned int cell = cell_range.first; cell < cell_range.second; ++cell)
		{
			phi.reinit(cell);
			phi.read_dof_values_plain(src);
			phi.evaluate(false, true);
			for (unsigned int q = 0; q < phi.n_q_points; ++q)
			{
				phi.submit_gradient(-1. * phi.get_gradient(q), q);
			}
			phi.integrate_scatter(false, true, dst);
		}
	}

	template <int dim, int degree, int n_points_1d>
	void LaplaceOperator<dim, degree, n_points_1d>::local_apply_inverse_mass_matrix(
			const MatrixFree<dim, Number> &                   data,
			LinearAlgebra::distributed::Vector<Number> &      dst,
			const LinearAlgebra::distributed::Vector<Number> &src,
			const std::pair<unsigned int, unsigned int> &     cell_range) const
	{
		(void) data;
		(void) cell_range;
		dst = src;
		dst.scale(inv_mass_matrix);
	}

	template <int dim, int degree, int n_points_1d>
	void LaplaceOperator<dim, degree, n_points_1d>::apply(
			const double                                      current_time,
			const LinearAlgebra::distributed::Vector<Number> &src,
			LinearAlgebra::distributed::Vector<Number> &      dst) const
	{
		Timer timer;
		(void) current_time;
		LinearAlgebra::distributed::Vector<Number> tmp(src);

		data.initialize_dof_vector(tmp);
		data.initialize_dof_vector(dst);

		if(use_scale_function){
			data.cell_loop(&LaplaceOperator::local_apply_cell,
						   this,
						   dst,
						   src,
						   true);
			dst.scale(inv_mass_matrix);
		}
		else
			data.cell_loop(&LaplaceOperator::local_apply_cell,
						   this,
						   dst,
						   src,
						   [&](const unsigned int start_range, const unsigned int end_range){
				for(size_t i = start_range; i < end_range; ++i){
					dst.local_element(i) = 0;
				}
			},
			[&](const unsigned int start_range, const unsigned int end_range){
				for(unsigned int i = start_range; i < end_range; ++i){
					dst.local_element(i) = dst.local_element(i) * inv_mass_matrix.local_element(i);
				}
			});
		computing_times[0] += timer.wall_time();
		computing_times[3] += 1.;
	}

	template <int dim>
	class InitialCondition : public Function<dim>
	{
	public:
		InitialCondition() : Function<dim>(n_components){

		}
		virtual double value(const Point<dim> &p, const unsigned int component) const override;
		virtual Tensor<1, dim> gradient(const Point<dim> &p, const unsigned int component) const override;
	};

	template <int dim>
	double InitialCondition<dim>::value(const Point<dim> &p, const unsigned int) const{
		const double x = p[0];
		const double y = p[1];

		return sin(M_PI * x) * sin(M_PI * y);
	}

	template <int dim>
	Tensor<1, dim> InitialCondition<dim>::gradient(const Point<dim> &p, const unsigned int) const{
		const double x = p[0];
		const double y = p[1];

		Tensor<1, dim> return_value;
		return_value[0] = M_PI * cos(M_PI * x) * sin(M_PI * y);
		return_value[1] = M_PI * sin(M_PI * x) * cos(M_PI * y);

		return return_value;
	}

	template <int dim>
	class Solution : public Function<dim>
	{
	public:
		Solution(const double time) : Function<dim>(n_components), time(time) {

		}
		virtual double value(const Point<dim> &p, const unsigned int component) const override;
		virtual Tensor<1, dim> gradient(const Point<dim> &p, const unsigned int component) const override;
	private:
		const double time;
	};

	template <int dim>
	double Solution<dim>::value(const Point<dim> &p, const unsigned int) const{
		const double x = p[0];
		const double y = p[1];

		return exp(-2 * M_PI * M_PI * time) * sin(M_PI * x) * sin(M_PI * y);
	}

	template <int dim>
	Tensor<1, dim> Solution<dim>::gradient(const Point<dim> &p, const unsigned int) const{
		const double x = p[0];
		const double y = p[1];

		Tensor<1, dim> return_value;
		return_value[0] = exp(-2 * M_PI * M_PI * time) * M_PI * cos(M_PI * x) * sin(M_PI * y);
		return_value[1] = exp(-2 * M_PI * M_PI * time) * M_PI * sin(M_PI * x) * cos(M_PI * y);

		return return_value;
	}

	template <int dim>
	class LaplaceProblem
	{
	public:
		LaplaceProblem();
		void run_with_MF();
	private:
		void setup_system();

		void output_results(const unsigned int cycle, const double time);

		LinearAlgebra::distributed::Vector<Number> evaluate_diffusion(const double          time,
																	  const LinearAlgebra::distributed::Vector<Number> &y) const;

		unsigned int
		embedded_explicit_method(const TimeStepping::runge_kutta_method method,
								 const unsigned int n_time_steps,
								 const double       initial_time,
								 const double       final_time);

		void assemble_mass_matrix();

		void process_solution(const unsigned int cycle);

		MPI_Comm mpi_communicator;
		parallel::distributed::Triangulation<dim> triangulation;
		FESystem<dim>       fe;
		MappingQGeneric<dim> mapping;
		DoFHandler<dim> dof_handler;
		AffineConstraints<double> constraints;
		LaplaceOperator<dim, fe_degree, n_q_points_1d> laplace_operator;
		LinearAlgebra::distributed::Vector<Number> locally_relevant_solution_MF;
		SparseMatrix<double> mass_matrix;
		SparseMatrix<double> system_matrix;
		SparsityPattern sparsity_pattern;

		ConditionalOStream pcout;
		TimerOutput        computing_timer;

		ConvergenceTable convergence_table;
	};

	template <int dim>
	LaplaceProblem<dim>::LaplaceProblem()
		: mpi_communicator(MPI_COMM_WORLD)
		, triangulation(mpi_communicator,
						typename Triangulation<dim>::MeshSmoothing(
							Triangulation<dim>::smoothing_on_refinement |
							Triangulation<dim>::smoothing_on_coarsening))
		, fe(FE_Q<dim>(fe_degree), n_components)
		, mapping(fe.degree + 1)
		, dof_handler(triangulation)
		, pcout(std::cout,
				(Utilities::MPI::this_mpi_process(mpi_communicator) == 0))
		, computing_timer(mpi_communicator,
						  pcout,
						  TimerOutput::summary,
						  TimerOutput::wall_times)
	{}

	template <int dim>
	void LaplaceProblem<dim>::setup_system()
	{
		TimerOutput::Scope t(computing_timer, "setup");
		dof_handler.distribute_dofs(fe);

		IndexSet locally_relevant_dofs;
		DoFTools::extract_locally_relevant_dofs(dof_handler, locally_relevant_dofs);

		constraints.clear();
		constraints.reinit(locally_relevant_dofs);
		DoFTools::make_hanging_node_constraints(dof_handler, constraints);
		FEValuesExtractors::Scalar first_element(0),
				second_element(1);

		VectorTools::interpolate_boundary_values(dof_handler,
												 0,
												 Functions::ZeroFunction<dim>(n_components),
												 constraints,
												 fe.component_mask(first_element));

		VectorTools::interpolate_boundary_values(dof_handler,
												 0,
												 Functions::ZeroFunction<dim>(n_components),
												 constraints,
												 fe.component_mask(second_element));
		constraints.close();

		laplace_operator.reinit(mapping, dof_handler, constraints);
		laplace_operator.initialize_vector(locally_relevant_solution_MF);

		//		DynamicSparsityPattern dsp(locally_relevant_dofs);
		//		DoFTools::make_sparsity_pattern(dof_handler, dsp, constraints, false);
		//		SparsityTools::distribute_sparsity_pattern(
		//					dsp,
		//					dof_handler.compute_n_locally_owned_dofs_per_processor(),
		//					mpi_communicator,
		//					locally_relevant_dofs);
		//		sparsity_pattern.copy_from(dsp);
		//		mass_matrix.reinit(sparsity_pattern);
		//		system_matrix.reinit(sparsity_pattern);
	}

	template <int dim>
	LinearAlgebra::distributed::Vector<Number> LaplaceProblem<dim>::evaluate_diffusion(const double time,
																					   const LinearAlgebra::distributed::Vector<Number> &y) const{
		LinearAlgebra::distributed::Vector<Number> tmp(y), tmp_II(y);
		laplace_operator.initialize_vector(tmp);
		laplace_operator.initialize_vector(tmp_II);
		laplace_operator.apply(time, y, tmp);
		return tmp;
	}

	template <int dim>
	void LaplaceProblem<dim>::assemble_mass_matrix(){
		//		TimerOutput::Scope t(computing_timer, "Mass matrix assembly");
		//		mass_matrix = 0.;
		//		system_matrix = 0.;

		//		const QGauss<dim> quadrature_formula(fe_degree + 1);
		//		FEValues<dim> fe_values(fe,
		//								quadrature_formula,
		//								update_values | update_gradients |
		//								update_quadrature_points | update_JxW_values);
		//		const unsigned int dofs_per_cell = fe.dofs_per_cell;
		//		const unsigned int n_q_points    = quadrature_formula.size();
		//		FullMatrix<double> cell_mass_matrix(dofs_per_cell, dofs_per_cell),
		//				cell_matrix(dofs_per_cell, dofs_per_cell);
		//		std::vector<types::global_dof_index> local_dof_indices(dofs_per_cell);

		//		for (const auto &cell : dof_handler.active_cell_iterators())
		//			if (cell->is_locally_owned())
		//			{
		//				cell_mass_matrix = 0.;
		//				cell_matrix = 0.;
		//				fe_values.reinit(cell);

		//				for (unsigned int q_point = 0; q_point < n_q_points; ++q_point)
		//				{
		//					for (unsigned int i = 0; i < dofs_per_cell; ++i)
		//					{
		//						for (unsigned int j = 0; j < dofs_per_cell; ++j){
		//							cell_mass_matrix(i, j) += ((fe_values.shape_value(i, q_point)
		//														* fe_values.shape_value(j, q_point))
		//													   )
		//									* fe_values.JxW(q_point);
		//							cell_matrix(i, j) += -1.
		//									* fe_values.shape_grad(i, q_point)
		//									* fe_values.shape_grad(j, q_point)
		//									* fe_values.JxW(q_point);
		//						}
		//					}
		//				}
		//				cell->get_dof_indices(local_dof_indices);

		//				constraints.distribute_local_to_global(cell_mass_matrix,
		//													   local_dof_indices,
		//													   mass_matrix);
		//				constraints.distribute_local_to_global(cell_matrix,
		//													   local_dof_indices,
		//													   system_matrix);
		//			}
		//		mass_matrix.compress(VectorOperation::add);
		//		system_matrix.compress(VectorOperation::add);
	}

	template <int dim>
	unsigned int LaplaceProblem<dim>::embedded_explicit_method(const TimeStepping::runge_kutta_method method,
															   const unsigned int n_time_steps,
															   const double       initial_time,
															   const double       final_time){

		//assemble_mass_matrix();
		TimerOutput::Scope t(computing_timer, "Embedded method");
		double time_step =
				(final_time - initial_time) / static_cast<double>(n_time_steps);
		double       time          = initial_time;
		const double coarsen_param = 1.2;
		const double refine_param  = 0.8;
		const double min_delta     = 1e-8;
		const double max_delta     = 10 * time_step;
		const double refine_tol    = 1e-1;
		const double coarsen_tol   = 1e-5;

		TimeStepping::EmbeddedExplicitRungeKutta<LinearAlgebra::distributed::Vector<Number>>
				embedded_explicit_runge_kutta(method,
											  coarsen_param,
											  refine_param,
											  min_delta,
											  max_delta,
											  refine_tol,
											  coarsen_tol);

		unsigned int n_steps = 0;
		static int counter = 0;
		while (time < final_time)
		{
			if (time + time_step > final_time)
				time_step = final_time - time;
			time = embedded_explicit_runge_kutta.evolve_one_time_step(
						[this](const double time, const LinearAlgebra::distributed::Vector<Number> &y) {
				return this->evaluate_diffusion(time, y);
			},
			time,
			time_step,
			locally_relevant_solution_MF);
			constraints.distribute(locally_relevant_solution_MF);

			time_step = embedded_explicit_runge_kutta.get_status().delta_t_guess;
			++n_steps;
		}
		counter++;
		return n_steps;
	}

	template <int dim>
	void LaplaceProblem<dim>::output_results(const unsigned int cycle, const double time)
	{
		TimerOutput::Scope t(computing_timer, "Output results");
		DataOut<dim> data_out;

		LinearAlgebra::distributed::Vector<Number> exact_solution;
		laplace_operator.initialize_vector (exact_solution);
		VectorTools::interpolate(dof_handler,
								 Solution<dim>(time),
								 exact_solution);

		data_out.attach_dof_handler(dof_handler);
		constraints.distribute(locally_relevant_solution_MF);
		locally_relevant_solution_MF.update_ghost_values();
		std::vector<DataComponentInterpretation::DataComponentInterpretation> data_component_interpretation{DataComponentInterpretation::component_is_scalar, DataComponentInterpretation::component_is_scalar};

		data_out.add_data_vector(locally_relevant_solution_MF,
								 std::vector<std::string>{"first_MF_solution", "second_MF_solution"},
								 DataOut<dim>::type_dof_data,
								 data_component_interpretation);
		data_out.add_data_vector(exact_solution,
								 std::vector<std::string>{"first_exact_solution", "second_exact_solution"},
								 DataOut<dim>::type_dof_data,
								 data_component_interpretation);
		Vector<float> subdomain(triangulation.n_active_cells());
		for (unsigned int i = 0; i < subdomain.size(); ++i)
			subdomain(i) = triangulation.locally_owned_subdomain();
		data_out.add_data_vector(subdomain, "subdomain");
		data_out.build_patches();

		data_out.write_vtu_with_pvtu_record("", "solution", cycle);

		if(cycle > 0){
			convergence_table.set_precision("L2", 3);
			convergence_table.set_precision("H1", 3);
			convergence_table.set_precision("Linfty", 3);
			convergence_table.set_scientific("L2", true);
			convergence_table.set_scientific("H1", true);
			convergence_table.set_scientific("Linfty", true);

			convergence_table.add_column_to_supercolumn("cycle", "n cells");
			convergence_table.add_column_to_supercolumn("cells", "n cells");
			std::vector<std::string> new_order;
			new_order.emplace_back("n cells");
			new_order.emplace_back("H1");
			new_order.emplace_back("L2");
			convergence_table.set_column_order(new_order);
			convergence_table.evaluate_convergence_rates(
						"L2", ConvergenceTable::reduction_rate);
			convergence_table.evaluate_convergence_rates(
						"L2", ConvergenceTable::reduction_rate_log2);
			convergence_table.evaluate_convergence_rates(
						"H1", ConvergenceTable::reduction_rate);
			convergence_table.evaluate_convergence_rates(
						"H1", ConvergenceTable::reduction_rate_log2);
			pcout << std::endl;
			convergence_table.write_text(std::cout);
			convergence_table.clear();
		}
	}

	template <int dim>
	void LaplaceProblem<dim>::process_solution(const unsigned int cycle){
		Vector<float> difference_per_cell(triangulation.n_active_cells());
		VectorTools::integrate_difference(dof_handler,
										  locally_relevant_solution_MF,
										  Solution<dim>((cycle + 1) * time_step),
										  difference_per_cell,
										  QGauss<dim>(fe.degree + 1),
										  VectorTools::L2_norm);
		const double L2_error =
				VectorTools::compute_global_error(triangulation,
												  difference_per_cell,
												  VectorTools::L2_norm);

		VectorTools::integrate_difference(dof_handler,
										  locally_relevant_solution_MF,
										  Solution<dim>((cycle + 1) * time_step),
										  difference_per_cell,
										  QGauss<dim>(fe.degree + 1),
										  VectorTools::H1_seminorm);
		const double H1_error =
				VectorTools::compute_global_error(triangulation,
												  difference_per_cell,
												  VectorTools::H1_seminorm);

		const QTrapez<1>     q_trapez;
		const QIterated<dim> q_iterated(q_trapez, fe.degree * 2 + 1);
		VectorTools::integrate_difference(dof_handler,
										  locally_relevant_solution_MF,
										  Solution<dim>((cycle + 1) * time_step),
										  difference_per_cell,
										  q_iterated,
										  VectorTools::Linfty_norm);
		const double Linfty_error =
				VectorTools::compute_global_error(triangulation,
												  difference_per_cell,
												  VectorTools::Linfty_norm);

		const unsigned int n_active_cells = triangulation.n_active_cells();
		const unsigned int n_dofs         = dof_handler.n_dofs();
		//		pcout << "Cycle " << cycle << ':' << std::endl
		//			  << "   Number of active cells:       " << n_active_cells
		//			  << std::endl
		//			  << "   Number of degrees of freedom: " << n_dofs << std::endl;
		convergence_table.add_value("cycle", cycle);
		convergence_table.add_value("cells", n_active_cells);
		convergence_table.add_value("dofs", n_dofs);
		convergence_table.add_value("L2", L2_error);
		convergence_table.add_value("H1", H1_error);
		convergence_table.add_value("Linfty", Linfty_error);
	}

	template <int dim>
	void LaplaceProblem<dim>::run_with_MF(){
		{
			const unsigned int n_vect_number =
					VectorizedArray<Number>::n_array_elements;
			const unsigned int n_vect_bits = 8 * sizeof(Number) * n_vect_number;

			pcout << "Running with "
				  << Utilities::MPI::n_mpi_processes(MPI_COMM_WORLD)
				  << " MPI processes" << std::endl;
			pcout << "Vectorization over " << n_vect_number << " "
				  << (std::is_same<Number, double>::value ? "doubles" : "floats")
				  << " = " << n_vect_bits << " bits ("
				  << Utilities::System::get_current_vectorization_level()
				  << "), VECTORIZATION_LEVEL=" << DEAL_II_COMPILER_VECTORIZATION_LEVEL
				  << std::endl;
		}

		const unsigned int n_cycles = 8;

		double local_time = 0., local_end_time = 0.;

		GridGenerator::hyper_cube(triangulation);
		triangulation.refine_global(5);

		setup_system();

		for (unsigned int cycle = 0; cycle < n_cycles; ++cycle)
		{
			local_time = cycle * time_step;
			local_end_time = (cycle + 1) * time_step;
			pcout << "Cycle " << cycle << ':' << std::endl;

			//triangulation.refine_global(1);


			if(cycle == 0){

				LinearAlgebra::distributed::Vector<Number> local_solution;
				laplace_operator.initialize_vector(local_solution);

				VectorTools::project(dof_handler,
									 constraints,
									 QGauss<dim>(fe.degree + 1),
									 InitialCondition<dim>(),
									 local_solution);

				locally_relevant_solution_MF = local_solution;
			}

			if (Utilities::MPI::n_mpi_processes(mpi_communicator) <= 32 && cycle == 0)
			{
				output_results(cycle, local_time);
			}

			pcout << "   Number of active cells:       "
				  << triangulation.n_global_active_cells() << std::endl
				  << "   Number of degrees of freedom: " << dof_handler.n_dofs()
				  << std::endl;

			Timer        timer;
			unsigned int timestep_number = 1;

			process_solution (0.);

			timestep_number = embedded_explicit_method(TimeStepping::DOPRI,
													   10,
													   local_time,
													   local_end_time);


			pcout << "Number of time steps: " << timestep_number << '\n';
			timestep_number = 1;
			process_solution(1.);
			if (Utilities::MPI::n_mpi_processes(mpi_communicator) <= 32)
			{
				TimerOutput::Scope t(computing_timer, "output");
				output_results(cycle + 1, local_end_time);
			}
			computing_timer.print_summary();
			computing_timer.reset();
			//laplace_operator.print_computing_times();
			pcout << std::endl;
		}

		local_time = 0.;
	}
} // namespace Step40
int main(int argc, char *argv[])
{
	std::cout << "Hello World\n";
	try
	{
		using namespace dealii;
		using namespace Step40;
		Utilities::MPI::MPI_InitFinalize mpi_initialization(argc, argv, dealii::numbers::invalid_unsigned_int);
		LaplaceProblem<2> laplace_problem_2d;
		//laplace_problem_2d.run();
		laplace_problem_2d.run_with_MF();
	}
	catch (std::exception &exc)
	{
		std::cerr << std::endl
				  << std::endl
				  << "----------------------------------------------------"
				  << std::endl;
		std::cerr << "Exception on processing: " << std::endl
				  << exc.what() << std::endl
				  << "Aborting!" << std::endl
				  << "----------------------------------------------------"
				  << std::endl;
		return 1;
	}
	catch (...)
	{
		std::cerr << std::endl
				  << std::endl
				  << "----------------------------------------------------"
				  << std::endl;
		std::cerr << "Unknown exception!" << std::endl
				  << "Aborting!" << std::endl
				  << "----------------------------------------------------"
				  << std::endl;
		return 1;
	}
	return 0;
}

Re: [deal.II] Re: Application of inverse mass matrix in matrix-free context using cell_loop instead of scale()

Reply via email to