Re: API discussion (revived)

J.Pietschmann Sun, 28 Aug 2005 15:17:06 -0700

Jeremias Maerki wrote:

Uh, yeah, that one. That's a design task by itself. Any takers for a
good exception strategy for FOP and XML Graphics Commons?????


Back in the mists of time, men, computers and animals happily lived
together and everything ran as expected. Then a moth got trapped
between the contacts of a relay and was squashed, and "unusual
operating conditions" were introduced. Ever since, men have struggled
with corrupt databases, printers which were out of paper, dropped
network connections and users insisting that their name contains a
Mongolian hyphen.

Well. The first problem is defining "what's a problem?".
When designing an API, the parameters, return values and the behavior
description form a contract. The general rule is that everything is
a problem if
- it doesn't fit the commonly accepted semantics of the interface
- it's impossible to foresee by the API designer (e.g. because it's
 caused by a foreign sub-provider)
- it can't be handled locally (e.g. needs human intervention, like
 a printer being out of paper)

Let's take the usual file reading API as an example. The standard
behavior is "read as many bytes into the provided array as possible".
A subtle part of the contract is EOF handling: if the file pointer is
not yet at EOF, there is a successful read, possibly capped by the
number of bytes available to EOF. At EOF, there is a successful read
of zero bytes. The next read throws an exception.
What we learn: there may be conditions which occur reasonably often
and which are best handled by the immediate caller, and which therefore
aren't considered "problematic". Note further how "in band" information
was abused for this purpose. It's unreasonable to expect this pattern
to be applicable often.
OTOH, Class.forName is so often used to check for the existence
of a specific class that the design of throwing a ClassNotFoundException
looks ugly.
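
To make the Class.forName complaint concrete, here is a minimal sketch of
the existence check people end up writing. ClassProbe and isAvailable are
hypothetical names made up for illustration, not part of FOP or the JDK;
only the Class.forName call itself is the real API.

```java
// Hypothetical helper (illustration only): turns the exception-based
// answer of Class.forName into the boolean most callers actually want.
final class ClassProbe {
    private ClassProbe() { }

    /** Returns true if the named class can be loaded by this class's loader. */
    static boolean isAvailable(String className) {
        try {
            // initialize=false: only check loadability, do not run static initializers
            Class.forName(className, false, ClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            // "Not found" is an expected answer here, not a problem.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAvailable("java.lang.String"));
        System.out.println(isAvailable("no.such.Clazz"));
    }
}
```

The catch block is pure ceremony: the immediate caller handles the
condition every time, which is exactly the case where an exception
feels out of place.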

== Design choice: return codes vs. throwing ==
In ancient times, it was customary for subroutines to return an integer
enumerating possible outcomes. Of course, one outcome was usually a
catch all "unknown error" or such.
The obvious advantage of this approach is the simplicity. There are
a few drawbacks, of course:
- The immediate caller has to catch the value and act accordingly.
 If the immediate caller can't handle the problem, it has to pass
 it further up in a sensible way.
- The range of possible outcomes (error codes) can't be easily extended
 later. Possible extensions might clash with codes defined elsewhere.
 The only surefire solution for this is a central error code repository
 and recompiling *everything* after changes to the repository. This was
 already a pain 20 years ago, and it's going to hurt real bad today.
- A single integer can't pass all that much information up the chain.
The real death of the error return codes is of course the first
drawback: if even a single routine in the call stack is missing an error
code, upper layers are screwed. With ever more libraries stacked on
top of each other, error codes became too unwieldy.
A nice demonstration of the problem is the java.io.File.delete(), which
returns false if a directory wasn't deleted. The Tomcat webapp deployer
missed checking the return value, probably because the code deleted all
the files in the directory before. Unfortunately for me, I had the
directory tree on an NFS-mounted drive, and some NFS lockfiles didn't get
removed in time, thereby thwarting the attempt to delete the directory.
Nevertheless, upper layers got an "ok", but later redeployment of the
webapp failed due to existing directories (which wasn't caught properly
either). It's definitely unexpected to get an "OK" for an actually
failed deployment.
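
The File.delete() trap can be reproduced without NFS. The sketch below
asks both APIs to delete a directory that still contains a leftover
file. Note the assumption: java.nio.file.Files is a later JDK addition
(Java 7) and is used here only as a contrasting "throwing" design.
DeleteDemo and tryBoth are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustration only: return code vs. exception for the same failed delete.
final class DeleteDemo {
    private DeleteDemo() { }

    /** Tries to delete a non-empty directory with both APIs and reports what happened. */
    static String tryBoth() throws IOException {
        Path dir = Files.createTempDirectory("delete-demo");
        Path leftover = Files.createFile(dir.resolve("leftover.lock"));
        try {
            // Return-code style: the failure is visible only if the caller checks.
            boolean deleted = dir.toFile().delete();

            // Throwing style: the failure cannot be dropped silently.
            boolean threw = false;
            try {
                Files.delete(dir);
            } catch (IOException e) {
                threw = true;
            }
            return "returned=" + deleted + ", threw=" + threw;
        } finally {
            // Clean up the temporary files created for the demonstration.
            Files.deleteIfExists(leftover);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tryBoth());
    }
}
```

A caller that forgets to look at the boolean carries on as if the
directory were gone; a caller that forgets the exception at least gets
a stack trace at the place where things went wrong.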

== Design choice: checked vs. unchecked exceptions ==
Java provides the usual exceptions, which have to be declared in the
API, as well as Errors and RuntimeExceptions, which don't have to be
declared.
The semantics of Error should be easy: An Error "indicates serious
problems that a reasonable application should not try to catch." I.e.
it should go to the very top and abort the application or at least the
operation in progress. Throwing an Error is not an easy decision, in
particular if the library can be expected to be used in an interactive
application where a higher level entity (the user) can be asked whether
to (a)bort, (i)gnore or (c)ontinue. Therefore, I consider the java.lang
Errors basically complete.
The discussion whether to use a checked exception or a subclass of
RuntimeException usually gets near a religious war. There is the "we
don't want method heads to be cluttered with *lots* of exception
declarations" vs. the "exceptions are part of the contract and have
to be declared" factions.
There are reasons for RuntimeException to exist, in particular that
subclasses are thrown by language constructs which are not functions
(NullPointerException, ArrayIndexOutOfBoundsException). But then, did
you really expect String.substring to throw an
IndexOutOfBoundsException?
The sheer amount of possible exceptions which can be thrown by a method
from a foreign class leads even programmers who are not known to be
lazy to catch Exception, or even Throwable, at some places, even though
this is rarely a good idea.
Excalibur had a few places where it caught an Exception, ignored it
and moved on, thereby destroying valuable information for debugging
erroneous configurations.
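
A minimal sketch of the difference between swallowing an exception and
passing the information on. This is not Excalibur's or FOP's code;
SwallowDemo, ConfigException and the port setting are invented for the
illustration.

```java
// Illustration only: the same parse failure, once swallowed, once reported.
final class SwallowDemo {
    private SwallowDemo() { }

    /** Hypothetical checked exception for configuration problems. */
    static class ConfigException extends Exception {
        ConfigException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Anti-pattern: any failure silently turns into a default value. */
    static int parsePortSwallowing(String value) {
        try {
            return Integer.parseInt(value);
        } catch (Exception e) {
            return 8080; // the typo in the configuration is now invisible
        }
    }

    /** Catches only what can happen here and keeps the cause for debugging. */
    static int parsePort(String value) throws ConfigException {
        try {
            return Integer.parseInt(value);
        } catch (NumberFormatException e) {
            throw new ConfigException("Invalid port setting: '" + value + "'", e);
        }
    }
}
```

With the first variant a mistyped "80o" yields a server on port 8080 and
no hint why; with the second the offending value and the original
exception both reach whoever reads the log.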

== Design choice: Reusing existing exceptions vs. roll your own ==
This can also be tough: quite a few of the java.* exceptions are
encouraged to be reused. However, your own exception might have
advantages:
- You control its semantics.
- It is specific to your problem and may contain custom data.
- It might be specifically caught somewhere upstack.
- It can't be expected to be thrown from another library.
OTOH, if there isn't any newsworthy information to pass, a java.lang
exception is often ok, because catching a specific exception from a
library means the catcher can do something about this, which is
rarely the case.
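
As a sketch of the "custom data" advantage: an exception of your own can
carry structured information that a catcher upstack can use without
parsing the message text. The class, its fields and the font-size
example are hypothetical and do not describe FOP's actual
PropertyException.

```java
// Illustration only: a library-specific exception carrying custom data.
final class CustomDataDemo {
    private CustomDataDemo() { }

    /** Hypothetical exception which records where the problem occurred. */
    static class PropertyParseException extends Exception {
        private final String propertyName;
        private final int line;

        PropertyParseException(String message, String propertyName, int line, Throwable cause) {
            super(message + " (property '" + propertyName + "', line " + line + ")", cause);
            this.propertyName = propertyName;
            this.line = line;
        }

        String getPropertyName() {
            return propertyName;
        }

        int getLine() {
            return line;
        }
    }

    /** Parses a font size and attaches the source location to any failure. */
    static int parseFontSize(String value, int line) throws PropertyParseException {
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            throw new PropertyParseException("Not a number: " + value, "font-size", line, e);
        }
    }
}
```

If there were no location to report, letting the plain
NumberFormatException through would say just as much.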

== Design choice: Common root exception vs. multiple roots ==
One extreme of the spectrum, using multiple exceptions which are
immediate subclasses of Exception but otherwise unrelated, favours
specific catches but also encourages catching just Exception to avoid
catch clause inflation.
The other extreme, putting everything in one tree, just encourages
catching the tree root, which is almost as unpleasant.
Recent additions to the Java RTL seem to have standardized on two
exception trees per library (immediate subpackage of java), one for
configuration problems and another one for everything else. I have
yet to see the advantage, especially if a single method may throw
exceptions from either tree, or if the by far most common use case
involves putting methods throwing them into a single try block.
Note that the multiple root approach also may cause exception
declaration inflation, which tempts programmers to catch Exception
as early as possible just to be done. The single (or double) root
approach fares somewhat better in this regard.
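
A small sketch of the single root variant, with invented names
(LibraryException and its two subclasses are not an existing API). The
method declares only the root, while callers may still pick out a
subclass where they can do something useful.

```java
// Illustration only: one root exception with two subclasses.
final class RootDemo {
    private RootDemo() { }

    /** Hypothetical root of the library's exception tree. */
    static class LibraryException extends Exception {
        LibraryException(String message) {
            super(message);
        }
    }

    static class ConfigurationException extends LibraryException {
        ConfigurationException(String message) {
            super(message);
        }
    }

    static class ProcessingException extends LibraryException {
        ProcessingException(String message) {
            super(message);
        }
    }

    /** Declares only the root, however many subclasses get added later. */
    static void run(String mode) throws LibraryException {
        if (mode.equals("bad-config")) {
            throw new ConfigurationException("unknown renderer");
        }
        if (mode.equals("bad-input")) {
            throw new ProcessingException("unbalanced element");
        }
    }

    /** The caller picks the specific case it can act on and falls back to the root. */
    static String handle(String mode) {
        try {
            run(mode);
            return "ok";
        } catch (ConfigurationException e) {
            return "fix config: " + e.getMessage();
        } catch (LibraryException e) {
            return "failed: " + e.getMessage();
        }
    }
}
```

The throws clause stays short, but nothing stops a caller from writing
only the last catch clause, which is the unpleasant part.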

== Design choice: subclassing exceptions vs. a single exception ==
The java.io package is probably the most used library where subclasses
of the package's root exception are actually caught. Nevertheless,
providing finer grained subclasses of the root exception might be a
good idea, if only for library internal purposes.

== Design choice: wrapping lower level exception vs. let them through ==
Again, letting checked exceptions from downstack pass upwards might
cause excessive exception declarations. OTOH, wrapping everything in
library specific exceptions might get in the way of specific catchers.
If the Java RTL is a good example, java.io.IOException should be passed,
while everything else could be wrapped. I'd add
javax.print.PrintException to the pass list for FOP.

Multiple, nested wrappings are OTOH the main cause for loooong, unwieldy
stack traces we all have come to love from certain posts on fop-user.
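
A sketch of the "pass IOException, wrap the rest" rule. RenderException
and writePageCount are hypothetical names; the point is only the shape
of the throws clause and the single level of wrapping.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Illustration only: IOException passes through, other failures are wrapped once.
final class WrapDemo {
    private WrapDemo() { }

    /** Hypothetical library exception wrapping lower level failures. */
    static class RenderException extends Exception {
        RenderException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    static void writePageCount(OutputStream out, String pageCount)
            throws IOException, RenderException {
        int pages;
        try {
            pages = Integer.parseInt(pageCount);
        } catch (NumberFormatException e) {
            // Wrapped exactly once, with the original kept as the cause.
            throw new RenderException("Bad page count: " + pageCount, e);
        }
        // An IOException raised by the stream is not caught and passes upwards as is.
        out.write(("pages=" + pages + "\n").getBytes(StandardCharsets.UTF_8));
    }
}
```

Callers which already deal with IOException for their own streams need
no extra catch clause, and the stack trace has at most one "Caused by".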

== Design choice: library internal exception classes ==
Complex libraries might use library internal exceptions which are not
derived from the main exception(s) used in the official library API.
This might facilitate spin-off of subpackages as stand-alone, separately
reused libraries. OTOH, this may cause excessive exception wrapping.

== Status quo ==
Ok, we have the following exceptions:
 .../apps/FOPException.java
 .../fo/expr/PropertyException.java
 .../fo/ValidationException.java
 .../hyphenation/HyphenationException.java
 .../pdf/PDFFilterException.java
 .../render/rtf/rtflib/exceptions/RtfException.java
 .../render/rtf/rtflib/exceptions/RtfStructureException.java

PropertyException and ValidationException properly extend FOPException.
Well, except that PropertyException should probably have more
constructors and a few methods protected instead.
HyphenationException descends immediately from Exception, which might
make sense, although discriminating between problems while building a
hyphenation pattern tree from XML and mishaps during calculating
hyphenation points might be desirable.
PDFFilterException is also an immediate subclass of Exception, and
in contrast to HyphenationException it's not as easily seen as "root
exception" for a potentially stand-alone sub-library.

The RtfException is even stranger: apart from the violation of
identifier building rules, it is a subclass of IOException. This is
certainly convenient at places, and in some sense even justifiable
(if you want to see RTF generation as part of an output process, just
like encoding characters into bytes). However, I think this should be
redesigned.
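
One possible reading of why subclassing IOException is both convenient
and questionable, as a sketch with stand-in classes. This is not the
rtflib code; FormatStructureException merely plays the role of an
exception derived from IOException.

```java
import java.io.IOException;

// Illustration only: a structural error hiding among genuine I/O errors.
final class IoSubclassDemo {
    private IoSubclassDemo() { }

    /** Hypothetical stand-in for a format exception which extends IOException. */
    static class FormatStructureException extends IOException {
        FormatStructureException(String message) {
            super(message);
        }
    }

    /** Convenient: no extra throws clause is needed for the structural error. */
    static void emit(boolean structureBroken) throws IOException {
        if (structureBroken) {
            throw new FormatStructureException("group closed twice");
        }
        throw new IOException("disk full");
    }

    /** A typical caller cannot tell a logic error from a real I/O failure. */
    static String classify(boolean structureBroken) {
        try {
            emit(structureBroken);
            return "ok";
        } catch (IOException e) {
            return "I/O problem: " + e.getMessage();
        }
    }
}
```

Both failures end up in the same catch clause unless the caller knows
about the subclass and catches it first.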

In order to suggest further changes, a closer look at the places where
a FOPException is thrown, and, more important, where exceptions are
caught, is necessary.

== Final Word ==
Did I miss something important?

Regards
J.Pietschmann
