[general][lang] monolithic components considered harmful

Rodney Waldhoff Mon, 30 Dec 2002 15:47:24 -0800

It may seem that I'm picking on [lang] here, but that's not my intention.
I just feel like I'm watching an impending train-wreck, and intend to
throw the switch while there's still time.


The Jakarta-Commons charter suggests (well, literally requires [0]) that:

"Each package must have a clearly defined purpose, scope, and API -- Do
one thing well, and keep your contracts"

and suggests in a number of ways that small, single-purpose components are
preferable to monolithic ones.  (Perhaps most succinctly as "Place types
that are commonly used, changed, and released together, or mutually
dependant on each other, into the same package [and types that are not
used, changed, and released together, or mutually dependent into different
packages]".)  Yet there seems to be an increasing tendency here toward
lumping discrete units into monolithic components.

Allow me to justify this position.

The arguments in favor of monolithic components I've seen seem to boil
down to concerns about minimizing dependencies and preventing
circularities.  This may seem superficially correct, but it is misguided.
The number of JARs I need to have in my classpath is at best an indirect
metric for the absence or presence of dependency issues, and at worst a
misleading one.  Adding a new JAR to the classpath is a trivial issue, and
tools like Maven [1], Classworlds' UberJar [2], Commons-Combo [3] and even
Java Web Start [4] make it even less of an issue (for better or worse).
The real concerns here should be those of configuration management. For
example, which version of X does Y require, and is that compatible with
the version of X that Z requires?  How many applications will be impacted
by a given change?  How small can I make my (end-user) application?

Monolithic components make configuration management problems worse, not
better.

Here's how:

1) Monolithic components introduce false dependencies.

Let's suppose, as some have suggested, that we release [lang] with new
reflection and math packages.  Suppose further that [cli] uses the
lang.math utilities and that [beanutils] uses the lang.reflect utilities,
and that I've got an application that uses both [cli] and [beanutils].

One might think this gives a simple dependency graph:

      [LANG]
        ^
        |
    .--' '--.
    |       |
  [CLI] [BEANUTILS]
    ^       ^
    |       |
    '--. .--'
        |
     [MY APP]

(where [X] <-- [Y] means Y depends on X)

but the reality is more complicated.  Suppose the latest version of
[beanutils] required some changes to lang.reflect.  In the same period,
some changes have been made to lang.math, but [cli] has not yet been
updated to support that.  This makes the version of [lang] required by
[beanutils] incompatible with the version of [lang] required by [cli].
(And if your solution is "we'll just keep [cli] up-to-date", replace [cli]
in this example with some third-party, possibly closed-source component.)

This means:

  [LANG]  [LANG']
    ^       ^
    |       |
    |       |
  [CLI] [BEANUTILS]
    ^       ^
    |       |
    '--. .--'
        |
     [MY APP]


but since [lang] != [lang'], I can't do that.  This problem isn't caused
by any true incompatibilities, but by an artificial coupling of unrelated
code.

If [reflect] and [math] are teased apart, the artificial problems go away:

  [MATH] [REFLECT]
    ^       ^
    |       |
    |       |
  [CLI] [BEANUTILS]
    ^       ^
    |       |
    '--. .--'
        |
     [MY APP]

I can now replace [reflect] with [reflect'], and I only need to worry
about updating those components that depend upon the [reflect] classes.
This is true even if both [math] and [reflect] depend upon some other
stuff in [lang]:

      [LANG]
        ^
        |
    .--' '--.
    |       |
  [MATH] [REFLECT]
    ^       ^
    |       |
    |       |
  [CLI] [BEANUTILS]
    ^       ^
    |       |
    '--. .--'
        |
     [MY APP]
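
The false-dependency scenario in (1) can be sketched as a small
version-conflict check.  This is only an illustration: the class name,
the exact-version model and the version strings ("1.0", "1.1") are all
assumptions, not anything taken from [lang], [cli] or [beanutils].

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: each component pins one exact version of each
// library it requires. All names and version strings are illustrative.
final class ConflictCheck {

    /** Returns the libraries that are required at more than one version. */
    static List<String> conflicts(Map<String, Map<String, String>> requirements) {
        Map<String, String> seen = new LinkedHashMap<>(); // library -> first version seen
        List<String> conflicting = new ArrayList<>();
        for (Map<String, String> required : requirements.values()) {
            for (Map.Entry<String, String> e : required.entrySet()) {
                String previous = seen.putIfAbsent(e.getKey(), e.getValue());
                if (previous != null && !previous.equals(e.getValue())
                        && !conflicting.contains(e.getKey())) {
                    conflicting.add(e.getKey());
                }
            }
        }
        return conflicting;
    }

    /** Monolithic packaging: both clients must pin all of [lang]. */
    static Map<String, Map<String, String>> monolithic() {
        Map<String, Map<String, String>> reqs = new LinkedHashMap<>();
        // [cli] was built against the old lang.math.
        reqs.put("cli", Collections.singletonMap("lang", "1.0"));
        // [beanutils] needs the changed lang.reflect.
        reqs.put("beanutils", Collections.singletonMap("lang", "1.1"));
        return reqs;
    }

    /** Split packaging: each client pins only the code it actually uses. */
    static Map<String, Map<String, String>> split() {
        Map<String, Map<String, String>> reqs = new LinkedHashMap<>();
        reqs.put("cli", Collections.singletonMap("math", "1.0"));
        reqs.put("beanutils", Collections.singletonMap("reflect", "1.1"));
        return reqs;
    }

    public static void main(String[] args) {
        System.out.println("monolithic: " + conflicts(monolithic()));
        System.out.println("split:      " + conflicts(split()));
    }
}
```

Under these assumptions the monolithic layout reports a conflict on
"lang" while the split layout reports none, even though the code each
client uses is the same in both cases.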


2) Monolithic components encourage superfluous dependencies and
inappropriate coupling.

Bundling unrelated code into a single component inappropriately lowers the
cost of crossing interface boundaries.  Since the code is distributed
together, it would seem that the cost of using, say, a method of
lang.SerializationUtils within lang.functor.FactoryUtils, is negligible.
But the true cost here isn't in getting SerializationUtils into the
classpath, it's in coupling of the two classes--making FactoryUtils
sensitive to changes in SerializationUtils.

Consider, for instance, lang.StringUtils.  There are a number of handy
methods there, some of them non-trivial and all of them offering better
readability than the naive alternative.  I sympathize with the desire for
increased readability and reuse, and in some circumstances it may be a
Good Thing to use, for example, StringUtils.trim(String):

    public static String trim(String str) {
        return (str == null ? null : str.trim());
    }

instead of simply inlining the (str == null ? null : str.trim()) clause.

But when used infrequently in an otherwise unrelated class, the price paid
for this trivial reuse is fairly high, coupling this code with a 1700+
line class to reuse 33 characters of code. (And StringUtils uses
CharSetUtils, which uses CharSet, which uses various java collection
classes, etc.)
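
For comparison, here is a minimal sketch of the inlined alternative in
an otherwise unrelated class.  The class and method names (OptionParser,
normalize) are hypothetical; only the null-safe expression comes from
the text above.

```java
// Hypothetical client class that needs a null-safe trim exactly once.
final class OptionParser {

    /** Null-safe trim, inlined instead of calling StringUtils.trim(String). */
    static String normalize(String raw) {
        return (raw == null ? null : raw.trim());
    }

    public static void main(String[] args) {
        System.out.println("[" + normalize("  --verbose ") + "]");
        System.out.println(normalize(null));
    }
}
```

The class compiles and runs with nothing but java.lang on the classpath,
so changes to StringUtils (or to anything StringUtils uses) cannot
affect it.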

There are times when trivial code is just that.  Lumping together
unrelated code in a monolithic component encourages me to be lazy about
these dependencies and more importantly, these couplings.  Packaging
unrelated code into distinct components forces me to consider whether
introducing a new coupling is justified.

3) Monolithic components slow the pace of development.

When components are small and single purpose, changes are small,
well-contained, readily tested and easily understood. New releases can be
performed more readily, more easily and hence more frequently.

Bundling unrelated code into a monolithic component means I need to
synchronize development of that unrelated code: Maybe I'd like to do a new
release of sub-component X, but I can't since sub-component Y is in the
midst of a major refactoring.  Maybe I'd like to do a major refactoring of
sub-component A but I can't since sub-component B is preparing for a
release.

The more "foundational" a component is, the more this problem multiplies.
E.g., suppose we can't release lang.reflect because we're screwing around
with lang.time, and beanutils can't release without a released version of
lang.reflect, and struts can't release without a released version of beanutils,
etc.

(Decoupling the CVS HEAD of lang.time from the released version of
lang.reflect (i.e., releasing lang with the latest lang.reflect but
without lang.time), as we've done in other circumstances, only
demonstrates that these really are unrelated packages, and causes
problems for those who work from a SNAPSHOT.)

4) Monolithic components make it more difficult for clients to track and
communicate their dependencies.

Following our versioning guidelines [5], non-backward compatible changes
to public APIs require new major version numbers.  Hence a non-backward
compatible change to sub-component X will require a new major version
number, even though sub-component Y may be fully backwards compatible.
Clients that only depend upon Y (and since X and Y are not strongly
related, this is a significant set) will find the contract implied by the
versioning guidelines broken--the version numbers suggest a major change,
but there isn't as far as Y is concerned.  Clients that only depend upon Y
are forced to confirm that nothing has been broken, and perhaps even
update existing deployments even though there has been no change to Y.
This weakens the utility of the versioning heuristics, and makes it more
difficult for clients to track and manage their dependencies.
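
The heuristic a client might apply can be sketched as follows.  This is
an assumed, simplified reading of the guidelines (a change in the major
number means "re-verify before upgrading"); the class name and the
version strings are made up for illustration.

```java
// Hypothetical sketch of a client-side upgrade check based only on the
// major version number. Version strings are illustrative.
final class MajorVersionCheck {

    /** Extracts the leading integer of a "major.minor[.patch]" string. */
    static int major(String version) {
        int dot = version.indexOf('.');
        return Integer.parseInt(dot < 0 ? version : version.substring(0, dot));
    }

    /** A different major number signals a possibly incompatible change. */
    static boolean mustReverify(String deployed, String candidate) {
        return major(deployed) != major(candidate);
    }

    public static void main(String[] args) {
        // Monolith: an incompatible change in X bumps the whole component,
        // so a client that only uses Y sees 1.4 -> 2.0.
        System.out.println("monolith, Y-only client: " + mustReverify("1.4", "2.0"));
        // Separate components: Y itself only went from 1.4 to 1.5.
        System.out.println("separate Y component:    " + mustReverify("1.4", "1.5"));
    }
}
```

In the monolithic case the check fires for a client whose code did not
change at all, which is the false alarm described above.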

5) Monolithic components only hide circularities, and may even encourage
them.

Whenever A depends upon B and B depends on A, we have a circular
dependency, wherever the code for A and B is located.  As with most forms
of strong coupling, such circularities should be avoided whenever
possible.  Building A and B in the same compilation run may make it
possible to deal with a circular dependency, but it doesn't prevent it.
Similarly, placing A and B in different components doesn't create a
circular dependency, it exposes it.

The "circular dependency" issue is largely hypothetical anyway.  In the
case of [lang], for example, several of the sub-packages have literally no
dependency on the rest of the package, and most that do have very weak
coupling at best.  Moreover, it is trivial to combine two previously
independent components.  Following (1) and (2), it may be substantially
more difficult to tease apart classes that were once part of the same
component.
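
The point that a cycle between A and B exists wherever the code lives
can be sketched with a depth-first search over a "depends on" graph.
The class name and the graph contents are assumptions for illustration.
Note that the check takes no packaging information at all.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: detects a cycle in a "depends on" graph. Which
// JAR each node ships in is deliberately not an input.
final class CycleCheck {

    /** Returns true if the graph contains a circular dependency. */
    static boolean hasCycle(Map<String, List<String>> dependsOn) {
        Set<String> visited = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String node : dependsOn.keySet()) {
            if (visit(node, dependsOn, visited, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String node, Map<String, List<String>> dependsOn,
                                 Set<String> visited, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true;              // reached a node already on the current path
        }
        if (!visited.add(node)) {
            return false;             // fully explored earlier, no cycle through it
        }
        onPath.add(node);
        List<String> deps = dependsOn.getOrDefault(node, Collections.<String>emptyList());
        for (String dep : deps) {
            if (visit(dep, dependsOn, visited, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        return false;
    }
}
```

Given A -> B and B -> A, hasCycle returns true whether A and B are
built in one compilation run or shipped as two components; packaging
only decides whether the build tool forces us to notice.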

6) Monolithic components only get bigger, making all of these problems
worse.

For instance, the [lang] proposal that was approved describes its scope
as:

"[A] package of Java utility classes for the classes that are in
java.lang's hierarchy, or are considered to be so standard as to justify
existence in java.lang. The Lang Package also applies to primitives and
arrays." [6]

In the five months since that proposal was accepted, the scope of lang has
expanded significantly ([7], [8], [9], [10], [11]) and now includes or is
proposed to include:

 * math utilities [12]
 * serialization utilities [13]
 * currency and unit classes [14]
 * date and time utilities [15]
 * reflection and introspection utilities [16]
 * functors [17]
 * and much more [18], [19], [20], [21], [22]

And the more the scope expands, the more the scope expands--the existence
of the [lang] monolith has encouraged a reduction in ([23], [24], others)
and discouraged the growth of ([25], [26], others) other components, and
has discouraged the introduction of new components ([27], [28], others).


As above and before, if classes aren't commonly used, changed, and
released together, or mutually dependent on each other, they should be in
distinct components.  If we want a catch-all JAR, we've got one [3].
Given the principles enumerated in the commons guidelines and detrimental
effects enumerated here, I'm not sure why we'd follow any other course.

 - Rod

[0] <http://jakarta.apache.org/commons/charter.html>
[1] <http://jakarta.apache.org/turbine/maven/>
[2] <http://classworlds.werken.com/uberjar.html>
[3] <http://cvs.apache.org/viewcvs/jakarta-commons/combo/>
[4] <http://java.sun.com/products/javawebstart/>
[5] <http://jakarta.apache.org/commons/versioning.html>
[6] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/PROPOSAL.html?rev=1.1&content-type=text/vnd.viewcvs-markup>
[7] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.10&r2=1.12&diff_format=h>
[8] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.25&r2=1.26&diff_format=h>
[9] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.28&r2=1.29&diff_format=h>
[10] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.30&r2=1.31&diff_format=h>
[11] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.31&r2=1.32&diff_format=h>
[12] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=586315>
[13] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=457636>
[14] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=18957>
[15] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=577799>
[16] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=411302>
[17] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=577713>
[18] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=16718>
[19] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=18778>
[20] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=19885>
[21] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=512176>
[22] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=581065>
[23] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=519705>
[24] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=20304>
[25] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=19847>
[26] <http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgNo=19865>
[27] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=551801>
[28] <http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgNo=19221>

