Re: Nonlinear term MPI running does not end

Konstantinos Poulios Fri, 23 Jul 2021 04:31:15 -0700

Dear Tetsuo,

Have you tested the mpi-fixes branch? Should we merge it?


Regarding unit tests, we could configure "make check" to run, for example,
test_assembly.cc with different numbers of processes and report assembly
times. Then we could also consider adding a check that computation times
decrease as the number of processes increases.
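
A rough sketch of such a check, as a starting point only. It assumes the
test is built as an executable at ./test_assembly (a hypothetical path) and
launched through mpirun, and it measures the wall-clock time of the whole
run rather than the assembly time alone. All function names are made up for
illustration.

```python
import subprocess
import time


def build_command(n_procs, executable="./test_assembly"):
    """Return the mpirun command line for a given number of processes.

    The default executable path is an assumption, not the real build layout.
    """
    return ["mpirun", "-n", str(n_procs), executable]


def time_command(cmd):
    """Run cmd once and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def is_speeding_up(timings, tolerance=0.0):
    """True if no run is slower than the previous one by more than the
    relative tolerance (0.1 means 10% slack for timing noise)."""
    return all(later <= earlier * (1.0 + tolerance)
               for earlier, later in zip(timings, timings[1:]))


def check_scaling(proc_counts=(1, 2, 4), executable="./test_assembly",
                  tolerance=0.1):
    """Time the executable for each process count and report the result."""
    timings = [time_command(build_command(n, executable))
               for n in proc_counts]
    for n, t in zip(proc_counts, timings):
        print(f"{n} process(es): {t:.3f} s")
    return is_speeding_up(timings, tolerance)
```

Calling check_scaling((1, 2, 4)) would print one timing per process count
and return False if any run is more than 10% slower than the previous one.
The tolerance is there because short runs are noisy, so a strict comparison
would probably fail at random.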

Best regards
Kostas

On Sun, May 23, 2021 at 4:18 PM Tetsuo Koyama <tkoyama...@gmail.com> wrote:

> Dear Kostas
>
> Thanks. Yes, I would like to.
> Are there any points I should keep in mind when adding tests?
>
> Is there a
>
> On Sun, May 23, 2021 at 9:46 PM Konstantinos Poulios
> <logar...@googlemail.com> wrote:
>
>> Dear Tetsuo,
>>
>> You can now test the code in the mpi-fixes branch. Would you like to help
>> with a bit more extensive testing of MPI in GetFEM and maybe also with
>> making some new unit tests for MPI?
>>
>> Best regards
>> Kostas
>>
>> On Sun, May 23, 2021 at 7:29 AM Tetsuo Koyama <tkoyama...@gmail.com>
>> wrote:
>>
>>> Dear Kostas
>>>
>>> Thanks a lot. I will check the code.
>>>
>>> BR
>>> Tetsuo
>>>
>>> On Sun, May 23, 2021 at 10:34 AM Konstantinos Poulios
>>> <logar...@googlemail.com> wrote:
>>>
>>>> I think I have fixed it but need to test it a bit more and tidy it up.
>>>> BR
>>>> Kostas
>>>>
>>>> On Fri, May 21, 2021 at 8:42 AM Konstantinos Poulios <
>>>> logar...@googlemail.com> wrote:
>>>>
>>>>> oh sorry, my fault, I can reproduce the error now. I had forgotten
>>>>> that I had to replace the linear term with a nonlinear one.
>>>>>
>>>>> BR
>>>>> Kostas
>>>>>
>>>>> On Thu, May 20, 2021 at 7:42 AM Tetsuo Koyama <tkoyama...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Sorry for the lack of explanation.
>>>>>>
>>>>>> I build GetFEM on ubuntu:20.04 using the configuration options
>>>>>> "--with-pic --enable-paralevel=2"
>>>>>>
>>>>>> I am using the following packages:
>>>>>> - automake
>>>>>> - libtool
>>>>>> - make
>>>>>> - g++
>>>>>> - libqd-dev
>>>>>> - libqhull-dev
>>>>>> - libmumps-dev
>>>>>> - liblapack-dev
>>>>>> - libopenblas-dev
>>>>>> - libpython3-dev
>>>>>> - gfortran
>>>>>> - libmetis-dev
>>>>>>
>>>>>> I attach a Dockerfile and a Python file to reproduce the issue.
>>>>>> You can reproduce it with the following command.
>>>>>>
>>>>>> $ sudo docker build -t demo_parallel_laplacian_nonlinear_term.py .
>>>>>>
>>>>>> Best Regards
>>>>>> Tetsuo
>>>>>>
>>>>>> On Wed, May 19, 2021 at 11:12 PM Konstantinos Poulios
>>>>>> <logar...@googlemail.com> wrote:
>>>>>>
>>>>>>> I think the instructions page is correct. What distribution do you
>>>>>>> build getfem on and what is your configuration command?
>>>>>>> Best regards
>>>>>>> Kostas
>>>>>>>
>>>>>>> On Wed, May 19, 2021 at 10:44 AM Tetsuo Koyama <tkoyama...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Dear Kostas
>>>>>>>>
>>>>>>>> This page (http://getfem.org/tutorial/install.html) says
>>>>>>>> - Parallel MUMPS, METIS and MPI4PY packages if you want to use the
>>>>>>>> MPI parallelized version of GetFEM.
>>>>>>>>
>>>>>>>> Is there a recommended way to install Parallel MUMPS,
>>>>>>>> METIS and MPI4PY?
>>>>>>>> I could not find this information on the page.
>>>>>>>>
>>>>>>>> If you could give me any information I will add it to the following
>>>>>>>> page.
>>>>>>>> http://getfem.org/install/install_linux.html
>>>>>>>>
>>>>>>>> BR
>>>>>>>> Tetsuo
>>>>>>>>
>>>>>>>> On Wed, May 19, 2021 at 10:45 AM Tetsuo Koyama <tkoyama...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Dear Kostas
>>>>>>>>>
>>>>>>>>> No, I haven't. I am using libmumps-seq-dev from the Ubuntu repository.
>>>>>>>>> I will try again with the parallel version of mumps.
>>>>>>>>>
>>>>>>>>> BR
>>>>>>>>> Tetsuo
>>>>>>>>>
>>>>>>>>> On Wed, May 19, 2021 at 4:50 AM Konstantinos Poulios
>>>>>>>>> <logar...@googlemail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Dear Tetsuo,
>>>>>>>>>>
>>>>>>>>>> Have you compiled GetFEM with the parallel version of mumps? In
>>>>>>>>>> Ubuntu/Debian, for example, you must link to dmumps instead of
>>>>>>>>>> dmumps_seq.
>>>>>>>>>>
>>>>>>>>>> BR
>>>>>>>>>> Kostas
>>>>>>>>>>
>>>>>>>>>> On Tue, May 18, 2021 at 2:09 PM Tetsuo Koyama <
>>>>>>>>>> tkoyama...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Dear Kostas
>>>>>>>>>>>
>>>>>>>>>>> Thank you for your report.
>>>>>>>>>>> I am happy that it runs well on your system.
>>>>>>>>>>> I will put together a procedure that reproduces this error.
>>>>>>>>>>> Please wait.
>>>>>>>>>>>
>>>>>>>>>>> Best Regards Tetsuo
>>>>>>>>>>>
>>>>>>>>>>> On Tue, May 18, 2021 at 6:10 PM Konstantinos Poulios <
>>>>>>>>>>> logar...@googlemail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Dear Tetsuo,
>>>>>>>>>>>> I could not confirm this issue. On my system the example runs
>>>>>>>>>>>> well both on 1 and 2 processes (it doesn't scale well though)
>>>>>>>>>>>> BR
>>>>>>>>>>>> Kostas
>>>>>>>>>>>>
>>>>>>>>>>>> [image: image.png]
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, May 16, 2021 at 10:07 AM Tetsuo Koyama <
>>>>>>>>>>>> tkoyama...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Dear Kostas
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have been looking at the source code.
>>>>>>>>>>>>> > if (generic_expressions.size()) {...}
>>>>>>>>>>>>> Sorry, it looks too complex for me.
>>>>>>>>>>>>>
>>>>>>>>>>>>> FYI, I found that the runs with 1 and 2 MPI processes behave
>>>>>>>>>>>>> differently at the following line.
>>>>>>>>>>>>> >    if (iter.finished(crit)) {
>>>>>>>>>>>>> This is in the "Newton_with_step_control" function in
>>>>>>>>>>>>> getfem_model_solvers.h.
>>>>>>>>>>>>>
>>>>>>>>>>>>> "crit" is calculated as crit = res / approx_eln, and the values of
>>>>>>>>>>>>> res and approx_eln are as follows.
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ mpirun -n 1 python demo_parallel_laplacian.py
>>>>>>>>>>>>> res=1.31449e-11
>>>>>>>>>>>>> approx_eln=6.10757
>>>>>>>>>>>>> crit=2.15222e-12
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ mpirun -n 2 python demo_parallel_laplacian.py
>>>>>>>>>>>>> res=6.02926
>>>>>>>>>>>>> approx_eln=12.2151
>>>>>>>>>>>>> crit=0.493588
>>>>>>>>>>>>>
>>>>>>>>>>>>> res=0.135744
>>>>>>>>>>>>> approx_eln=12.2151
>>>>>>>>>>>>> crit=0.0111128
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am now trying to understand what the correct residual
>>>>>>>>>>>>> value of the Newton(-Raphson) algorithm should be.
>>>>>>>>>>>>> I would be glad to hear your opinion.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best Regards Tetsuo
>>>>>>>>>>>>> On Tue, May 11, 2021 at 7:28 PM Tetsuo Koyama <tkoyama...@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Dear Kostas
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> > The relevant code is in the void model::assembly function
>>>>>>>>>>>>>> > in getfem_models.cc. The relevant code assembling the term
>>>>>>>>>>>>>> > you add with md.add_nonlinear_term(..) must be executed
>>>>>>>>>>>>>> > inside the if condition
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > if (generic_expressions.size()) {...}
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > You can have a look there and ask for further help if it
>>>>>>>>>>>>>> > looks too complex. You should also check if the test works
>>>>>>>>>>>>>> > when you run it with md.add_nonlinear_term but setting the
>>>>>>>>>>>>>> > number of MPI processes to one.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks. I will check it. The following command completed
>>>>>>>>>>>>>> successfully.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> $ mpirun -n 1 python demo_parallel_laplacian.py
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So all we have to do is compare -n 1 with -n 2.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards Tetsuo
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, May 11, 2021 at 6:44 PM Konstantinos Poulios <
>>>>>>>>>>>>>> logar...@googlemail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dear Tetsuo,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The relevant code is in the void model::assembly function in
>>>>>>>>>>>>>>> getfem_models.cc. The relevant code assembling the term you
>>>>>>>>>>>>>>> add with md.add_nonlinear_term(..) must be executed inside
>>>>>>>>>>>>>>> the if condition
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> if (generic_expressions.size()) {...}
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> You can have a look there and ask for further help if it
>>>>>>>>>>>>>>> looks too complex. You should also check if the test works
>>>>>>>>>>>>>>> when you run it with md.add_nonlinear_term but setting the
>>>>>>>>>>>>>>> number of MPI processes to one.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> BR
>>>>>>>>>>>>>>> Kostas
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, May 11, 2021 at 10:44 AM Tetsuo Koyama <
>>>>>>>>>>>>>>> tkoyama...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Dear Kostas
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you for your reply.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> > Interesting. In order to isolate the issue, can you also
>>>>>>>>>>>>>>>> > check with
>>>>>>>>>>>>>>>> > md.add_linear_term(..)
>>>>>>>>>>>>>>>> > ?
>>>>>>>>>>>>>>>> The script ends when using md.add_linear_term(..).
>>>>>>>>>>>>>>>> So it seems to be a problem with md.add_nonlinear_term(..).
>>>>>>>>>>>>>>>> Is there anything in particular I should check?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best regards Tetsuo.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, May 11, 2021 at 5:19 PM Konstantinos Poulios <
>>>>>>>>>>>>>>>> logar...@googlemail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Dear Tetsuo,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Interesting. In order to isolate the issue, can you also
>>>>>>>>>>>>>>>>> check with
>>>>>>>>>>>>>>>>> md.add_linear_term(..)
>>>>>>>>>>>>>>>>> ?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>> Kostas
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, May 11, 2021 at 12:22 AM Tetsuo Koyama <
>>>>>>>>>>>>>>>>> tkoyama...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Dear GetFEM community
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am running the MPI-parallelized version of GetFEM. The
>>>>>>>>>>>>>>>>>> commands I run are
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> $ git clone https://git.savannah.nongnu.org/git/getfem.git
>>>>>>>>>>>>>>>>>> $ cd getfem
>>>>>>>>>>>>>>>>>> $ bash autogen.sh
>>>>>>>>>>>>>>>>>> $ ./configure --with-pic --enable-paralevel=2
>>>>>>>>>>>>>>>>>> $ make
>>>>>>>>>>>>>>>>>> $ make install
>>>>>>>>>>>>>>>>>> $ mpirun -n 2 python demo_parallel_laplacian.py
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The Python script ends correctly. But when I changed the
>>>>>>>>>>>>>>>>>> following linear term to a nonlinear term, the script did
>>>>>>>>>>>>>>>>>> not end.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -md.add_Laplacian_brick(mim, 'u')
>>>>>>>>>>>>>>>>>> +md.add_nonlinear_term(mim, "Grad_u.Grad_Test_u")
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Do you know the reason?
>>>>>>>>>>>>>>>>>> Best regards Tetsuo
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
