On Sat, Sep 26, 2020 at 1:07 PM Matthew Knepley <knep...@gmail.com> wrote:

> On Sat, Sep 26, 2020 at 11:17 AM Mark McClure <m...@resfrac.com> wrote:
>
>> Thank you all for the explanations.
>>
>> Following Matt's suggestion, we'll build with -g (and without
>> --with-debugging=0) for all future builds shipped to users, so that in
>> future we can provide better information.
>>
>> Second, Chris is going to boil our function down to a minimal stub and
>> share it, in case there is some subtle issue with the way the functions
>> are being called.
>>
>> Third, I have a question/request - PETSc is, in fact, detecting an error.
>> As far as I can tell, this is not an uncontrolled 'seg fault'. It seems to
>> me that maybe PETSc could choose to return from the function when it
>> detects this error, returning an error code, rather than dumping the core
>> and terminating the program. If PETSc simply returned with an error
>> message, this would resolve the problem for us. After the PETSc call, we
>> check for PETSc error messages. If PETSc returns an error - that's fine -
>> we use a direct solver as a backup, and the simulation continues. So - I am
>> not sure whether this is feasible - but if PETSc could return with an
>> error message - rather than dumping the core and terminating the program -
>> then that would effectively resolve the issue for us. Would this change be
>> possible?
>>
>
> At some level, I think it is currently doing what you want. CHKERRQ()
> simply returns an error code from that function call, printing an error
> message. Suppressing the message is harder I think,
>

He does not need to suppress the message.


> but for now, if you know what function call is causing the error, you can
> just catch the (ierr != 0) yourself instead of using CHKERRQ.
>

This is what I suggested earlier but maybe I was not clear enough.

Your code calls something like:

ierr = SNESSolve(....); CHKERRQ(ierr);

You can replace this with:

 ierr = SNESSolve(....);
 if (ierr) {
    ....
 }

Earlier I suggested some things to do inside that if block. For example,
call KSPView to inspect the solver. You could even destroy the solver,
create it again from scratch, and see whether the solve then works.
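
For illustration, here is a minimal sketch of that pattern in plain C, with
no PETSc dependency so that it compiles on its own. The names
iterative_solve, direct_solve, and solve_with_fallback are hypothetical
stand-ins, not PETSc API: iterative_solve plays the role of the SNESSolve
call and direct_solve plays the role of your backup direct solver. A real
version would also have to decide whether the solver needs to be reset or
rebuilt before the simulation continues.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the iterative solve (the SNESSolve call).
   Returns 0 on success and nonzero on failure, like a PETSc error code. */
static int iterative_solve(int should_fail, double *x)
{
    assert(x != NULL);
    if (should_fail) return 1;  /* simulate a solver error */
    *x = 1.0;                   /* placeholder "solution" */
    return 0;
}

/* Hypothetical stand-in for the backup direct solver. */
static int direct_solve(double *x)
{
    assert(x != NULL);
    *x = 2.0;                   /* placeholder "solution" */
    return 0;
}

/* Try the iterative solver first. On a nonzero error code, report it and
   fall back to the direct solver instead of propagating the error. */
int solve_with_fallback(int should_fail, double *x, int *used_fallback)
{
    int ierr = iterative_solve(should_fail, x);
    *used_fallback = 0;
    if (ierr) {
        fprintf(stderr, "iterative solve failed (ierr = %d), "
                        "falling back to direct solver\n", ierr);
        *used_fallback = 1;
        ierr = direct_solve(x);
    }
    return ierr;
}
```

The point is only that the caller inspects the returned code itself rather
than passing it straight to CHKERRQ, so a failed solve becomes a branch in
your code instead of an error return up the call stack.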

Mark


> The drawback here is that we might not have cleaned up
> all the state so that restarting makes sense. It should be possible to
> just kill the solve, reset the solver, and retry, although it is not clear
> to me at first glance if MPI will be in an okay state.
>
>
