Re: Unlikely recoverable failures

2018-10-22 Thread will sanfilippo
Chris:

Sorry was away so did not comment on this prior to your implementation. All 
sounds good; I like DEBUG_PANIC for the name (I liked it better than DEV_PANIC).

Will


> On Oct 19, 2018, at 11:14 AM, Christopher Collins  wrote:
> 
> FYI- I have submitted a PR that implements this new macro:
> https://github.com/apache/mynewt-core/pull/1471
> 
> I went with `DEBUG_PANIC()` for the name.
> 
> Chris
> 
> On Thu, Oct 18, 2018 at 11:13:27AM -0700, Christopher Collins wrote:
>> Hello all,
>> 
>> I think Mynewt lacks a mechanism for dealing with a certain class of
>> errors: unlikely recoverable failures.  For the purposes of this email,
>> I divide failures into two groups:
>> 
>>1. Failures we cannot recover from, or that we don't want to recover
>>   from.
>> 
>> This group includes: immediate bugs in the code, memory corruption,
>> software watchdog expiry, etc.
>> 
>> The best way to handle these failures is to crash immediately, causing
>> gdb to hit a breakpoint and / or creating a core dump.  We already have
>> a tool for this: assert().
>> 
>>2. Failures we want to recover from in production, but not always
>>   during development.
>> 
>> This group includes: failure of nonessential hardware, dynamic memory
>> allocation failures, file system write failures due to insufficient
>> space, etc.
>> 
>> For this second group, I think we need a "crash()" macro that can be
>> enabled or disabled at compile time[1].  During development and debugging,
>> the macro would crash the system.  In production builds, the macro would
>> expand to nothing, allowing the recovery code to execute.
>> 
>> I would like to implement such a macro in mynewt-core.  Here are some of
>> my thoughts on the subject.  As always, please don't hesitate to express
>> any criticism.
>> 
>> A. No conditional.
>> 
>> Unlike `assert()`, this new macro should not evaluate an expression to
>> determine success or failure.  Instead, the calling code should detect
>> failure with an `if` statement or similar, and only invoke the macro in
>> the failure case.  Since these failures are expected, I think it is
>> likely that the application will always want to execute some code in the
>> failure case, regardless of whether the macro is configured to trigger a
>> crash.  For example: logging a message to the console when the failure
>> is detected.  If the macro itself detects the failure, the application
>> doesn't have the opportunity to execute any code before the crash.
>> 
>> B. No severity.
>> 
>> Initially, I was thinking we could have a severity associated with each
>> failure.  The new macro would only trigger a crash if invoked with a
>> severity less than or equal to some `PANIC_LEVEL` setting.  However, I
>> decided that this just complicates the feature without adding any real
>> value.  I can't think of a good way to assign a severity to each
>> failure.  If we ever need this, we can add it on top of the
>> single-severity implementation.
>> 
>> I do think the ability to enable this feature per-package would be
>> useful, so that is something to consider for the future.
>> 
>> C. Name?
>> 
>> This is what has been causing me the most agony: what to name the macro.
>> The names I hate the least are:
>> 
>>* DBG_PANIC() // "Debug panic"
>>* DEV_PANIC() // "Development panic"
>> 
>> But I am not crazy about these.  Any other suggestions?  Even though
>> all-caps is ugly, I do think it should be used here to make it obvious
>> that this construct does very macro-like things (inspects the file and
>> line number, possibly expands to nothing, etc.).
>> 
>> Thanks,
>> Chris
>> 
>> [1] `assert()` can be enabled and disabled via the `NDEBUG` macro.
>> However, I am of the opinion that asserts are too useful to ever
>> disable, so I would like an additional level of configurability here.



Re: Unlikely recoverable failures

2018-10-19 Thread Christopher Collins
FYI- I have submitted a PR that implements this new macro:
https://github.com/apache/mynewt-core/pull/1471

I went with `DEBUG_PANIC()` for the name.

Chris

On Thu, Oct 18, 2018 at 11:13:27AM -0700, Christopher Collins wrote:
> Hello all,
> 
> I think Mynewt lacks a mechanism for dealing with a certain class of
> errors: unlikely recoverable failures.  For the purposes of this email,
> I divide failures into two groups:
> 
> 1. Failures we cannot recover from, or that we don't want to recover
>from.
> 
> This group includes: immediate bugs in the code, memory corruption,
> software watchdog expiry, etc.
> 
> The best way to handle these failures is to crash immediately, causing
> gdb to hit a breakpoint and / or creating a core dump.  We already have
> a tool for this: assert().
> 
> 2. Failures we want to recover from in production, but not always
>during development.
> 
> This group includes: failure of nonessential hardware, dynamic memory
> allocation failures, file system write failures due to insufficient
> space, etc.
> 
> For this second group, I think we need a "crash()" macro that can be
> enabled or disabled at compile time[1].  During development and debugging,
> the macro would crash the system.  In production builds, the macro would
> expand to nothing, allowing the recovery code to execute.
> 
> I would like to implement such a macro in mynewt-core.  Here are some of
> my thoughts on the subject.  As always, please don't hesitate to express
> any criticism.
> 
> A. No conditional.
> 
> Unlike `assert()`, this new macro should not evaluate an expression to
> determine success or failure.  Instead, the calling code should detect
> failure with an `if` statement or similar, and only invoke the macro in
> the failure case.  Since these failures are expected, I think it is
> likely that the application will always want to execute some code in the
> failure case, regardless of whether the macro is configured to trigger a
> crash.  For example: logging a message to the console when the failure
> is detected.  If the macro itself detects the failure, the application
> doesn't have the opportunity to execute any code before the crash.
> 
> B. No severity.
> 
> Initially, I was thinking we could have a severity associated with each
> failure.  The new macro would only trigger a crash if invoked with a
> severity less than or equal to some `PANIC_LEVEL` setting.  However, I
> decided that this just complicates the feature without adding any real
> value.  I can't think of a good way to assign a severity to each
> failure.  If we ever need this, we can add it on top of the
> single-severity implementation.
> 
> I do think the ability to enable this feature per-package would be
> useful, so that is something to consider for the future.
> 
> C. Name?
> 
> This is what has been causing me the most agony: what to name the macro.
> The names I hate the least are:
> 
> * DBG_PANIC() // "Debug panic"
> * DEV_PANIC() // "Development panic"
> 
> But I am not crazy about these.  Any other suggestions?  Even though
> all-caps is ugly, I do think it should be used here to make it obvious
> that this construct does very macro-like things (inspects the file and
> line number, possibly expands to nothing, etc.).
> 
> Thanks,
> Chris
> 
> [1] `assert()` can be enabled and disabled via the `NDEBUG` macro.
> However, I am of the opinion that asserts are too useful to ever
> disable, so I would like an additional level of configurability here.