Re: [Plplot-devel] Error propagation

Phil Rosenberg Sat, 04 Feb 2017 12:09:06 -0800

Hi Andrew
Thanks for adding to the discussion. Further additions below

On 3 February 2017 at 11:15, Andrew Ross
<andrewr...@users.sourceforge.net> wrote:
>
> Hi all,
>
> Sorry I have been silent for so long. Work and life have rather got in
> the way of plplot development. This is something I have raised a couple
> of times in the past and something I have thought about a fair bit.
>
> Comments on Phil's proposals below.
>
> Andrew
>
> On Mon, 2017-01-30 at 11:00 +0000, Phil Rosenberg wrote:
>> Hi all
>> We have had this discussion in the past. But I would like to reopen
>> it.
>>
>> We really really need to get id of the exit() calls in PLplot. I
>> don't
>> think that we can really make any recommendation that our library is
>> "industrial strength" while they are their and I would be very
>> nervous
>> about using PLplot in production software with them in - plus even
>> using PLplot for science work I have been caught out be these calls
>> in
>> the past.
>>
>> I think there are a few options we have got
>>
>> Firstly with our internal propagation we can either do
>> A1) Set up our functions so they return error codes and make sure we
>> check these. I feel there is a tendency for this to be bad because I
>> know that I am inherently lazy if I am frank (and I'm sure we all
>> have
>> our lazy moments) and tend to write code without these checks then
>> (try to) go back and add them later. Missing checks are likely to
>> cause segfaults or memory corruptions which are even worse than the
>> exit calls().
>
> Laziness is no excuse! The advantage of this proposal is that is allows
> the library to clean things up as we unwind the stack and so (if done
> properly) it will minimise any chance of problems down the line. Even
> if this is a fatal error from plplot's perspective (e.g. memory
> allocation failure), then we must return control to the calling
> programme in a clean and controlled manner.


I wasn't saying it's an excuse, and perhaps I am being flippant using
the term laziness. I guess that what I meant is that it is easy to
forget to do the checking and can be painful, error prone and take a
long time to implement/manage and the result of a single forgotten or
missed error check is equal to a single exit call - i.e. likely
segfault and immediate program termination. Currently a single memory
allocation imposes a single plexit() function call - so at least there
aren't many of them and we know where they are. In a returned error
code solution there will be many function calls to deal with for each
of our current exit calls. Is there a way we can manage this with code
style? Maybe could we label internal functions that can result in
failures with a prefix like plcf (for pl can fail - better prefix
suggestions welcome) Then only functions which have the plcf prefix or
API entry points are allowed to call other functions with the plcf
prefix? Could something like this be checked with a script?
Could we have a required format of checking that we can police with a
script? like if a plcf call is made, the next like should contain if(
err ) or something?

>> A2) Set up an error flag in PLStream which can be set on error and
>> checked after every function call. This would avoid having to change
>> functions that already return values, but still relies on manual
>> checks which I think will be a big weak point.
>
> I would agree that this is still subject to human laziness in checking
> return codes. The advantage of explicitly returning values is that it
> is at least more obvious you need to check them. On the other hand, if
> functions already return values this gets tricky. We probably need to
> do something of a code audit to see if this is likely to be an issue
> before making final decisions.

Same comment as above.
>
>> A3) Use setjmp/longjmp calls. In this case we call setjmp in the top
>> level of every API call and on error we call longjmp which returns
>> the
>> execution point back to the point setjmp was called. There is some
>> extra setup time here to avoid memory leaks and other resource leaks,
>> but once that is done we can all write code with almost no worry
>> regarding error propagation.
>
> This has advantages in terms of not requiring explicit error code
> checking. The bigger problem is making sure we tidy up properly to
> avoid leaks. As Phil says, it requires some care thinking about how we
> deal with non-C drivers as well. I can't say I'm too keen on
> setjmp/longjmp. It can be a lazy solution to the problem rather than
> writing better code in the first place, but given we have a fairly
> large heritage codebase it might be an easier solution.
I don't feel like it is a lazy solution - it is using a feature of the
language for exactly the purpose it was designed for. But I know there
are lots of caveats that come with it. One is that we must do some
book keeping to avoid memory leaks or other resource leaks (like
failure to close files properly). But that bookkeeping seems trivial
to me - Because we have a PLStream struct it is almost no more effort
than using exceptions in C++. The big issue for me is dealing with
drivers written in C++, that does seem like it could be hard work.
>
>> Once we decide on our internal method we need to decide how we can
>> inform the user. We have a number of options again:
>>
>> B1) An API change so that all our functions return an error code.
>
> Quite a C-like approach. Requires quite a lot of changes to user code
> to properly check for errors. Huge API change. Could cause problems for
> the few plplot functions which already return a value.
>
>> B2) Make use of our current plabort call and let the user pass in a
>> plabort handler.
>
> This could get messy with some of the bindings. Calling an abort
> handler also makes it hard for the programme to decide what to do. For
> some uses for example you might want to shut down the plot, but
> continue with the program. Harder with a handler. (I'm thinking about
> things like the interactive octave bindings where I find it exceedingly
>  annoying if plplot kills octave. In case of an error I might want to
> return control to octave and proceed to save my work or whatever in an
> orderly manner before choosing to exit or reinitialise plplot.)

Yes this is definitely what we should aim for - in some cases the host
program almost definitely wants to continue - Octave is a good
example.
>
>> B3) If we have an error, store an error code in PLStream and add an
>> API function called plgeterr or similar which allows the user to
>> check
>> for error after any other function call.
>
> Easier to implement in terms of API changes, but makes for ugly end
> user code if the end user really does need to call an additional
> function to check return codes every time. One plus of an internal
> plstream error flag is that we could internally check it whenever a
> function is called. This would prevent the user trying to continue
> after an error (or just failing to check the error code). We could then
> possible provide some mechanism for resetting the error flag and maybe
> reinitialising plplot at the same time depending on the error?

Yep, a bit messy, but probably less painful than the huge API change.
Not sure if people may be interested in this, but could we provide a
#define that users could put before their plplot #include that would
allow them to use either the existing API with a call to plgeterr or a
new API which returns an error code?

>
>> B4) In the C++ binding we could throw an error. Not sure if other
>> languages have similar throw/catch style options.
>
> Other more modern languages such as java and python have exception
> handling. I would very much like to see the bindings take advantage of
> these where possible. It would lead to much more natural, standard and
> cleaner code in these languages.

Agreed. I'm not sure in C++ if there are issues throwing over dll
boundaries. If so we could easily make the C++ binding header only as
there isn't very much code in it.
>
>> B5) Some combination of the above.
>
> I'd definitely say B4 for languages that support it and B3 or B1 where
> not.
>
> We might want to keep the option of a plabort handler (default to exit)
> for now to aid compatibility in the changes. Users who are happy with
> plplot exiting on error can carry on with that behaviour without having
> to change their code, beyond perhaps enabling the handler. Or is that
> just lazy?

I think probably once we have this sorted it will basically just work.
So if we fail during plot initialisation then the plot will act
likethe initialisation didn't happen. Otherwise the data won't get
plotted or something.
>
>> I will add there are some specific things we need to think about for
>> our C++ drivers. These are not so much options, but things we need to
>> deal with anyway.
>>
>> C1) Our C++ drivers cannot be allowed to throw an error out of the
>> driver code and into the main C code. That would potentially end up
>> with the throw going all the way out to the user's top level code
>> which may be in C, Fortran, or any other language that can't deal
>> with
>> it. Note that even if the driver doesn't call throw, it may call
>> either something in the stl or another driver (in Qt or wxWidgets for
>> example) that does throw. This is easy to avoid by wrapping the intry
>> points in try blocks. I have done this for wxWidgets, but haven't
>> checked the other C++ drivers to see if they are the same.
>>
>> C2) If we go for option A3, then we cannot allow longjmp over C++
>> code. This is because nontrivial destructors are not called when
>> longjmp is called and if objects with these destructors are jumped
>> over then the result is undefined behaviour - which again is worse
>> than exit calls.
>>
>> The fix for C2 - and I think this would be a good thing anyway -
>> would
>> be to have a specific driver API. This would allow us to have setjmp
>> calls when we enter that API and avoid jumping over drivers. I think
>> this would be nice in general as it is not well documented how to
>> write a driver and a proper driver API would address that.
>>
>> I am well aware that the use of setjmp and longjmp is very divisive.
>> So my intention is to work this code up only for the case of memory
>> allocation failures. This code will consist of:
>> D2)A set of plmalloc, plrealloc and plfree fuctions which will
>> replace
>> the usual versions. As well as allocating/freeing memory these
>> functions will record the allocations in the PLstream.
>> D1)PLTRY, PLENDTRY macros which will go in each API call. PLTRY will
>> have the setjmp call. This will be followed by the current code in
>> the
>> function as it is now then PLENDTRY will deal with what happens if
>> longjmp has been called - it will check for memory allocations
>> recorded in the PLStream and will free them. For now it will then
>> just
>> call exit() so we get the same behaviour but this can be changed
>> later
>> if we want to go down this route. Once we have this first look at how
>> setjmp/longjmp can work in our code then I think it will be easier
>> for
>> us to make an informed choice about the A options.
>>
>> Does that sound sensible to people? If anyone would like to look at
>> what would be involved in doing a similar thing using the other
>> options then it would be really nice to be able to compare the work
>> and admin/code practices needed for those too.
>>
>> Phil
>>
>> -------------------------------------------------------------------
>> -----------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
>> _______________________________________________
>> Plplot-devel mailing list
>> Plplot-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/plplot-devel

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
Plplot-devel mailing list
Plplot-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/plplot-devel

Re: [Plplot-devel] Error propagation

Reply via email to