On a related note (and in full agreement with Michael and Alex) I have always felt not terribly at ease with our current practice of *not* logging at the point of error - relying instead on returning an Error() object instead.
I have now witnessed a large number of instances, when debugging user-reported failures/errors, when we had to "grep" the codebase for the error message and find the actual call point that way, as the logs would not provide a sufficiently clear indication as to *where* the error actually occurred. So, in addition to what Michael suggests, and in line with Alex's exhortation to have "excellent core error reporting facilities," I would also like to propose that we consider adopting a pattern I've seen elsewhere: if (errorOccurred) { string msg = "Dude, something went wrong!"; LOG(ERROR) << msg; return Error(msg); } or a variation thereof. -- *Marco Massenzio* Distributed Systems Engineer http://codetrips.com On Fri, Nov 6, 2015 at 4:55 AM, Alexander Rojas <alexan...@mesosphere.io> wrote: > I also like the idea of having better error reports. On that side, I > always liked the design of boost system_error where one has categories of > errors with a set of error codes (integer) where each of them have a > specific string associated and which can be complemented with extra text. > > > > On 06 Nov 2015, at 09:29, Michael Hopcroft <m...@hopcroft.org> wrote: > > > > Error reporting in Mesos is based on Stout’s Error class and a variety of > > other classes, such as ErrnoError and WindowsError, that derive from > Error. > > One goal of these classes is to simplify the construction and storage of > > error message strings. As an example, the ErrnoError constructor > > automatically incorporates std::string(strerror(errno)) into its error > > message. > > > > > > Stout catches exceptions and converts them into Try<T> and Result<T> > > objects which are used as polymorphic return values that can express > either > > an error condition or a proper return value. One consequence of catching > > exceptions is that call stacks from throw sites and error sites are lost. > > This makes it harder to debug production environment crashes post hoc. > > > > > > One way to improve debugability would be to add more information about > the > > error site to the error message. This could be done by convention, but it > > seems that the vast majority of the error messages in Mesos currently > don’t > > include much information about the site that originally generated an > error. > > > > > > Because much of the code in Mesos uses Error and ErrnoError, we have an > > opportunity to add diagnostic information automatically with only a small > > amount of churn. What we’d need to do is provide Error creation macros > that > > would pass the source file name and line number to the constructors of > the > > various Error classes. > > > > > > Concretely this would involve > > > > > > - class Error > > - Rename declaration and definition to ErrorImpl (or ErrorBase). > This > > change is not error prone. > > - Rename references to class’ type. These changes are mostly limited > > to result.hpp and try.hpp. > > - Don’t rename calls to constructor. These will now just invoke the > > macro. > > - Modify the constructor to take a line number and source filename > in > > addition to a message. > > - Define a macro called Error(message) that has the same prototype > as > > the original Error constructor. This macro would invoke > > ErrorImpl::ErrorImpl, passing the current line number and source > filename > > along with the user-provided message. The macro would make use > > of __FILE__ > > and __LINE__, which are predefined in gcc and Visual Studio. > > - class ErrnoError, WindowsError. > > - Similar pattern as above. > > > > *Pros.* > > > > > > - Codebase is more debuggable in production environment because the > vast > > majority of error sites will document the file name and line number in > the > > error message. > > - Most error sites don’t require a code change. > > > > *Cons.* > > > > > > - Introduces a macro. In general, we’d like to move away from macros > > wherever possible. > > - Discourages subclassing of Error because each subclass would need its > > own special macro to call its constructor. A brief search of the code > base > > seems to indicate that Mesos programmers don’t tend to declare > subclasses > > of Error. Right now, it seems that subclasses exist mainly to provide > > different construction semantics. The big question is whether one would > > want to use subclasses down the road to distinguish between equivalence > > classes of errors, say those that should bring the process down vs > those > > that are recoverable. > > - Some amount of code churn, mainly in error.hpp, result.hpp, and > > try.hpp. > > > > *Limitations.* > > > > > > - This approach by itself won’t address error sites that invoke > > convenience functions such as Try<T>::error() and Result<T>::error(). > These > > could be addressed with a similar macro, but it would need to somehow > deal > > with the template parameter as well. > > > > *Risks and Potential Complications.* > > > > > > - The rename process might fail to correctly distinguish between error > > sites (i.e. calls to Error::Error()) and other uses of the term > “Error”. > > - There may be important compilers that don’t support __FILE__ and > > __LINE__. These predefined macros have been in the ANSI/ISO standard > for a > > while, but it would still be wise to check each compiler we care about. > > - General hazards associated with macro expansion. > > - Users of Mesos may have written diagnostic tools that depend on the > > current values of the error strings. > > - Unit tests may depend on exact values of error strings. If this were > > the case, we'd have to provide a convenience function that strips off > the > > file names and line numbers so that unit tests don't break when error > sites > > move around due to edits elsewhere in the file. > > > > Let me know what you think of this idea. If the approach seems reasonable > > and a worthwhile, I would be willing to prototype it to see if it has > some > > fatal flaw. Thanks for your input! > > > > > > > > -Mike > >