On 23 July 2017 at 12:27, Alan W. Irwin <ir...@beluga.phys.uvic.ca> wrote:
> Hi Phil:
>
> Here is further discussion of what you have said as opposed to the
> error report I sent you in my previous post concerning your 3rd
> iteration.
>
> Do keep in mind the possibility for the future that the memory
> allocation part of your commits are worthwhile in their own right and
> could be finished and landed on master much sooner than the rest of
> your commits that are concerned with implementation of C exception
> handling and use of that for handling our errors.

I'm not sure how useful they really are without exception handling.
Without such handling there is no location in the code where the array
of pointers to the allocated memory is reviewed and in addition the
best response to a failed allocation may not be a call to plexit (but
I guess mostly it would be).

>
> I suggested in my big e-mail to you early this week (Mon, 17 Jul 2017
> 14:59:00 -0700 (PDT)) concerning your second iteration a method of doing
> this using a list of plstream heap pointers.  Please respond to that
> suggestion.

When you said a list I'm not sure I understood. Do you mean a linked
list rather than an array?

>
> Also please respond to my suggestion (with appropriate code changes
> while I handle the CMake part) in that e-mail concerning
>
> #define PL_IMPLEMENT_EXCEPTION_HANDLING and
> #ifdef PL_IMPLEMENT_EXCEPTION_HANDLING
>
> On 2017-07-22 22:47+0100 Phil Rosenberg wrote:
>
>> [in that same e-mail] You asked why every API call required a PLTRY block.
>> You are correct
>> that actually they don't. Every API call that may result in a call to
>> plexit needs a PLTRY block.
>
>
> I initially didn't understand this explanation, but after _much_ too
> long a time considering various scenarios to explain my point of view
> the light finally dawned. Exception handling does not allow you to
> transfer control to arbitrary places in the code (a stupid assumption
> I was making). Instead, it merely allows you to unwind the call graph
> from the plexit call (in this case that is the only current use of
> PLTHROW in our source code) to return control to a routine potentially
> far up the call stack from there such as _any_ of the public API
> routines (as you said) that have a possible call to plexit somewhere
> lower down on their call graph.

Yes correct, we just unwind the stack to a certain point in the call
sequence (or whatever terminology you choose as you mentioned in your
ps and pps). This is good, because the user application is still sat
waiting for our API function to return. We don't want to jump to some
specific execution point, we need that function to return and for the
user to find out somehow that there has been an error.

>
> So sorry for my previous noise on this sub-topic, but
> now we are finally on the same page in this regard, I have some
> further questions and comments about your implementation of
> exception handling and how you are using that exception handling
> to deal with bad errors.
>
> * If you miss putting a PLTRY and PLCATCH block in one of those public
>   API cases where a potential call to plexit is somewhere lower on the
>   call graph, then what does the current code do?  Does it simply
>   crash (i.e., is that the potential cause of the serious issues I
>   have reported to you in my previous post) or do you handle that
>   internal error case by, e.g., emitting an "internal error: unhandled
>   exception" message followed by an exit?

I think that currently something very odd. The value of
pls->currentjumpbuffer used in PLTHROW would be either out of date or
NULL, so probably some sort of crash. This is why it is important that
every API entry function includes a PLTRY, PLCATCH, PLENDTRY sequence.
Of course we could set things up to check for this scenario, but then
what would we do? Call exit()? I had been trying to think if there was
a C method to check at compile time, but I cannot think of one (there
would be ways in C++ I think). But the API changes so infrequently
that I think just ensuring all API functions include the PLTRY,
PLCATCH, PLENDTRY code is probably okay - open to suggestions though.

>
> * What happens when an external application calls a "first" public API
>   which has another "second" public API lower down in its call graph
>   and even further down on the call graph of that second API occurs a
>   potential call to plexit? Then if plexit is called that will PLTHROW
>   control to the catch block of that second API rather than the first.
>   So for this particular call graph case for the first API this second
>   API needs to PLTHROW control to that first PLCATCH block rather than
>   simply returning.  And, of course, since that second API is
>   sometimes called directly from external applications and libraries
>   without anything higher in the call graph with a PLCATCH block, it
>   cannot PLTHROW all the time without sometimes running
>   into the internal error discussed above.  So how are you going to figure
> out
>   when to PLTHROW in a public API PLCATCH block and when not to PLTHROW?
>   Or is that case already automatically taken care of by your present code
>   (e.g., by each PLTRY macro incrementing a library status flag
>   so that the PLCATCH macro can check that flag to see if there
>   is a PLCATCH block higher in the call graph that it should PLTHROW
>   to)?

I had considered this ;-) We store a stack of jump buffers which
increases in size by one every time we enter a new PLTRY block. Also
the memory management marks this point with a NULL in the list of
pointers to allocated memory. A PLTHROW unwinds to the first
PLTRY/PLCATCH/PLENDTRY block it finds. At this point memory back to
the previous NULL is released and the top jump buffer is removed from
the stack. If there are any more jump buffers in the stack then part
of the PLENDTRY code does another PLTHROW with the new top jump buffer
of the stack. This repeats until there are no jump buffers left and we
will be at the API function initially called by the user. At least
that is the plan - I am yet to do the testing :-)

>
> * Shouldn't your current plexit call plend (like the master branch
>   version does) to shut down the library?  See further discussion
>   below concerning this suggestion.
My first thought is no - I would like the state of Plplot to be as it
was before the API call was made - although I have just realised that
this will not be the case, because the PLStreams are on the heap, not
the stack, they won't be restored by the stack unwind. I will need to
do this manually and probably do something like delay all memory frees
until PLENDTRY so I can restore the pointers.
>
> * I also suggest you should implement a PLplot library state variable
>   (let's call it plplot_library_status) that is initialized to 0 (good
>   status) when the library is initialized but set to 1 (bad status) by
>   plexit. Then follow up by implementing a c_plstatus() public API
>   that returns the value of library_status and which emits a message
>   such as ("library had bad error so it had to be shut down.  To use
>   it again you need to initialize the library by calling one of the
>   plinit family of routines".) Then an external application could call
>   c_plstatus to check on the status of the library after any call to
>   our public API and act accordingly.
>
>   Or it could be sloppy about that (i.e., our standard C examples
>   might only call plstatus in one example to make sure it works). In
>   which case after the library is shutdown (see above) by an internal
>   call to plexit there would be a blizzard of further plabort (due to
>   plsc->level being 0) and plexit messages after the first one because
>   of that library shutdown.  But there would be no call to the dreaded
>   exit routine so there is at least the chance that the external
>   application will survive that sloppiness.  And if plexit does not
>   call plend, I think the chances of an external application
>   encountering a showstopper error are much worse.

Hmm, this is quite interesting actually. As I said above I would like
to not totally kill the PLStream - a good example could be that maybe
plexit gets called due to a bad path for a shapefile for a map. This
maybe could be fixed if the calling program checks for an error state.
Of course people are lazy :-) so it might not get checked. In my
initial thinking, the map plot would fail but all other plotting would
carry on. maybe it would be better to set the PLStream to an error
state and have all further API calls fail. Maybe the user could clear
the error state to permit plotting to continue. I welcome thoughts on
this.

In terms of an error flag that you suggested. I am thinking perhaps
all calls to plerr result in an error message being logged in the
PLStream, plus when we reach the top level PLTRY/PLCATCH/PLENDTRY we
call a user provided error callback. Then we have an API function to
get the list of errors, another to clear the list of errors and one to
check if there are any errors (which would just just check if the
number of error messages in the log is greater than 0). So it is like
your flag, but with some extra detail.
>
>> However, if we just have a policy that
>> every API call must have a PLTRY block then it makes it easy to ensure
>> nothing gets missed. But perhaps this is overkill. A number of
>> functions in plcore.c, particularly the "get" type ones are very
>> simple and would never call plexit. I'm happy to take advice on where
>> the balance should lie?
>>
>> One other issue at the moment. I currently output a warning when
>> plfree is called on a pointer that is not being managed by the
>> PLStream. At present this will include all plfree calls made in
>> plend/plend1 and can include memory allocated by the contour routines.
>> The reason for this is due to the following chain of events
>> 1) These bits of memory get allocated by plalloc in an API call
>> somewhere (e.g cmap0).
>> 2) On exit from the API call plfreeall gets called
>> 3) plfreeall spots the allocated memory that wasn't freed and checks
>> all the pointers that belong to the PLStream
>> 4) It finds that the pointers belong to PLStream and need to not be
>> freed. It therefore stops tracking them.
>> 5) In plend1 plfree is called on the pointer. plfree cannot see the
>> pointer in the list of tracked pointers so issues a warning.
>>
>> There are a few options for dealing with this, but I haven't decided
>> which to use. For now my apologies for the list of warnings.
>>
>
> I am frankly out of gas, and this is already a pretty long e-mail
> so I will leave it to you to decide on the above topics unless
> someone else here wants to comment.
>
> Anyhow, after I get some sleep, I am looking forward to your further
> comments on (a) my error report for interation 3, and (b) what I have
> had the energy to discuss above.

Thanks for the thoughts and comments so far. They are very welcome.

>
> Alan
>
> __________________________
> Alan W. Irwin
>
> Astronomical research affiliation with Department of Physics and Astronomy,
> University of Victoria (astrowww.phys.uvic.ca).
>
> Programming affiliations with the FreeEOS equation-of-state
> implementation for stellar interiors (freeeos.sf.net); the Time
> Ephemerides project (timeephem.sf.net); PLplot scientific plotting
> software package (plplot.sf.net); the libLASi project
> (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
> and the Linux Brochure Project (lbproject.sf.net).
> __________________________
>
> Linux-powered Science
> __________________________

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Plplot-devel mailing list
Plplot-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/plplot-devel

Reply via email to