So long as the error checking is not much more expensive than the computation 
itself, then I see no harm in including it with the if (debug) model. For 
example, checking that a sorted array really is sorted is less expensive than 
the sort itself, so it can be included at the end of the sort algorithm. 
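
As a concrete sketch (MySortInt() is a made-up wrapper name; PetscSortInt() is 
the real PETSc sort):

#include <petscsys.h>

/* Hypothetical wrapper: sort with PetscSortInt(), then in debug builds verify
   the O(n log n) sort with a cheap O(n) post-condition check. */
static PetscErrorCode MySortInt(PetscInt n, PetscInt arr[])
{
  PetscErrorCode ierr;
#if defined(PETSC_USE_DEBUG)
  PetscInt       i;
#endif

  PetscFunctionBegin;
  ierr = PetscSortInt(n, arr);CHKERRQ(ierr);
#if defined(PETSC_USE_DEBUG)
  for (i = 1; i < n; i++) {
    if (arr[i-1] > arr[i]) SETERRQ1(PETSC_COMM_SELF, PETSC_ERR_PLIB, "Array not sorted at index %D", i);
  }
#endif
  PetscFunctionReturn(0);
}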

  For the numerical solvers things are much trickier, because it is difficult 
to check whether the result is "correct", especially given the different 
criteria possible for "convergence" and the different norms that might be used. 
Thus checking the "little" things throughout the code is a good idea, because 
we can't catch the "big" things.

  One check we don't do consistently, because it requires some elbow grease, is 
to use VecGetArrayWrite() instead of VecGetArray() when the routine has to fill 
ALL the values in the array. VecGetArrayWrite() can then fill the array with 
NaN, and VecRestoreArrayWrite() can verify that there are no NaN in the result, 
a basic test that the routine did not miss setting some values. 
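
For example (FillWithSquares() is a made-up routine name; the NaN fill and the 
check would live inside the debug builds of VecGetArrayWrite() and 
VecRestoreArrayWrite(), not in the caller):

#include <petscvec.h>

/* Made-up example of a routine that is responsible for setting EVERY local
   entry of x.  Using VecGetArrayWrite() instead of VecGetArray() declares that
   intent, so a debug build could pre-fill the array with NaN on Get and flag
   any entry still NaN on Restore. */
static PetscErrorCode FillWithSquares(Vec x)
{
  PetscErrorCode ierr;
  PetscScalar    *a;
  PetscInt       i, rstart, rend;

  PetscFunctionBegin;
  ierr = VecGetOwnershipRange(x, &rstart, &rend);CHKERRQ(ierr);
  ierr = VecGetArrayWrite(x, &a);CHKERRQ(ierr);
  for (i = 0; i < rend - rstart; i++) a[i] = (PetscScalar)((rstart + i) * (rstart + i));
  ierr = VecRestoreArrayWrite(x, &a);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}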

We could extend this concept to other places where a routine is supposed to 
"fill up" memory obtained with PetscMalloc(): have something like 
PetscMallocVerifyFilled() that checks the malloced space after it has been 
filled by the code to make sure no locations were missed. Note that valgrind 
does some of this checking, but not all: valgrind only generates messages when 
a resulting unset value is used in an if (something) or something else that 
controls program flow. It will not detect when a numerical location is not 
filled in but is used later in a numerical computation that never controls 
program flow. Hence in debug mode I would like PetscMalloc() to fill all 
numerical arrays with NaN.
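
A rough sketch of what such a helper could look like (PetscMallocVerifyFilled() 
does not exist today; the sketch assumes debug-mode PetscMalloc() had 
pre-filled the space with NaN):

#include <petscsys.h>

/* Hypothetical helper, assuming debug-mode PetscMalloc() pre-filled the space
   with NaN: scan the array after the code has filled it and error out if any
   entry was never overwritten. */
static PetscErrorCode PetscMallocVerifyFilled(PetscInt n, const PetscReal a[])
{
#if defined(PETSC_USE_DEBUG)
  PetscInt i;
#endif

  PetscFunctionBegin;
#if defined(PETSC_USE_DEBUG)
  for (i = 0; i < n; i++) {
    if (PetscIsNanReal(a[i])) SETERRQ1(PETSC_COMM_SELF, PETSC_ERR_PLIB, "Entry %D was never filled in", i);
  }
#endif
  PetscFunctionReturn(0);
}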

   This is the simplest error checking: not even checking that the correct 
values are used, just checking that something was set at all, and we don't even 
do this everywhere yet.

  Barry






> On Aug 4, 2020, at 12:24 PM, Jacob Faibussowitsch <jacob....@gmail.com> wrote:
> 
> Hello All,
> 
> How far should one go in error checking when using #if 
> defined(PETSC_USE_DEBUG)? So far I have gone with the mantra that internal 
> petsc routines (including ones authored by myself) should be considered 
> infallible and that I should only be checking for garbage *input* from the 
> user. But there is not a person in existence that doesn’t write buggy code, 
> as evidenced by the somewhat routine “fixing bug in XYZ” merge requests one 
> sees. 
> 
> On the other hand it is also not reasonable to check every single output with 
> a fine-tooth comb because for the majority of cases code written by petsc 
> developers is working as intended. Take for example writing an array sorting 
> algorithm. Since every operation is "performance critical” these are often 
> written in less logical or less readable formats leading to some subtle bugs 
> that the writer doesn’t immediately catch. If these remain uncaught through 
> CI/CD and then bleed into user code, I see absolutely no chance of the user 
> (or even other devs) ever being able to identify that the sorting algorithm 
> deep in some function stack is the one producing the bug without significant 
> effort. One could include a “dumb” version of the same algorithm that checks 
> a copy of the initial array for missing/misplaced elements but as mentioned 
> above this is a pointless slowdown 99% of the time. 
> 
> CI/CD -- while excellent at catching a lot of machine-specific bugs -- isn’t 
> bulletproof either. It relies on the assumption that the writer knows all 
> possible sources of bugs in their code and provides a test case for each, but 
> to quote Isaac Newton “what we know is a drop, what we don’t know is an 
> ocean”. I have been mulling over this problem for a while now, and have 
> looked through the user manual/developers manual but have not found a 
> definitive answer. 
> 
> Best regards,
> 
> Jacob Faibussowitsch
> (Jacob Fai - booss - oh - vitch)
> Cell: (312) 694-3391
> 
