It may seem that I'm picking on [lang] here, but that's not my intention. I just feel like I'm watching an impending train-wreck, and intend to throw the switch while there's still time.
The Jakarta-Commons charter suggests (well, literally requires [0]) that: "Each package must have a clearly defined purpose, scope, and API -- Do one thing well, and keep your contracts" and suggests in a number of ways that small, single-purpose components are preferable to monolithic ones. (Perhaps most succinctly as "Place types that are commonly used, changed, and released together, or mutually dependant on each other, into the same package [and types that are not used, changed, and released together, or mutually dependent into different packages]".) Yet there seems to be an increasing tendency here toward lumping discrete units into monolithic components.

Allow me to justify this position.

The arguments in favor of monolithic components I've seen seem to boil down to concerns about minimizing dependencies and preventing circularities. This may seem superficially correct, but it is misguided. The number of JARs I need to have in my classpath is at best an indirect metric for the absence or presence of dependency issues, and at worst a misleading one. Adding a new JAR to the classpath is a trivial issue, and tools like Maven [1], ClassWorld's UberJar [2], Commons-Combo [3] and even Java Web Start [4] make it even less of an issue (for better or worse).

The real concerns here should be those of configuration management. For example, which version of X does Y require, and is that compatible with the version of X that Z requires? How many applications will be impacted by a given change? How small can I make my (end-user) application?

Monolithic components make configuration management problems worse, not better. Here's how:

1) Monolithic components introduce false dependencies.

Let's suppose, as some have suggested, that we release [lang] with new reflection and math packages. Suppose further that [cli] uses the lang.math utilities and that [beanutils] uses the lang.reflect utilities, and that I've got an application that uses both [cli] and [beanutils].
One might think this gives a simple dependency graph:

          [LANG]
             ^
             |
         .--' '--.
         |       |
       [CLI] [BEANUTILS]
         ^       ^
         |       |
         '--. .--'
             |
          [MY APP]

(where [X] <-- [Y] means Y depends on X)

but the reality is more complicated. Suppose the latest version of [beanutils] required some changes to lang.reflect. In the same period, some changes have been made to lang.math, but [cli] has not yet been updated to support that. This makes the version of [lang] required by [beanutils] incompatible with the version of [lang] required by [cli]. (And if your solution is "we'll just keep [cli] up-to-date", replace [cli] in this example with some third-party, possibly closed-source component.) This means:

      [LANG]  [LANG']
         ^       ^
         |       |
         |       |
       [CLI] [BEANUTILS]
         ^       ^
         |       |
         '--. .--'
             |
          [MY APP]

but since [lang] != [lang'], I can't do that.

This problem isn't caused by any true incompatibilities, but by an artificial coupling of unrelated code. If [reflect] and [math] are teased apart, the artificial problems go away:

      [MATH] [REFLECT]
         ^       ^
         |       |
         |       |
       [CLI] [BEANUTILS]
         ^       ^
         |       |
         '--. .--'
             |
          [MY APP]

I can now replace [reflect] with [reflect'], and I only need to worry about updating those components that depend upon the [reflect] classes. This is true even if both [math] and [reflect] depend upon some other stuff in [lang]:

          [LANG]
             ^
             |
         .--' '--.
         |       |
      [MATH] [REFLECT]
         ^       ^
         |       |
         |       |
       [CLI] [BEANUTILS]
         ^       ^
         |       |
         '--. .--'
             |
          [MY APP]

2) Monolithic components encourage superfluous dependencies and inappropriate coupling.

Bundling unrelated code into a single component inappropriately lowers the cost of crossing interface boundaries. Since the code is distributed together, it would seem that the cost of using, say, a method of lang.SerializationUtils within lang.functor.FactoryUtils, is negligible. But the true cost here isn't in getting SerializationUtils into the classpath, it's in coupling of the two classes--making FactoryUtils sensitive to changes in SerializationUtils.

Consider, for instance, lang.StringUtils.
There are a number of handy methods there, some of them non-trivial and all of them offering better readability than the naive alternative. I sympathize with the desire for increased readability and reuse, and in some circumstances it may be a Good Thing to use, for example, StringUtils.trim(String):

    public static String trim(String str) {
        return (str == null ? null : str.trim());
    }

instead of simply inlining the (str == null ? null : str.trim()) clause. But when used infrequently in an otherwise unrelated class, the price paid for this trivial reuse is fairly high, coupling this code with a 1700+ line class to reuse 33 characters of code. (And StringUtils uses CharSetUtils, which uses CharSet, which uses various java collection classes, etc.) There are times when trivial code is just that.

Lumping together unrelated code in a monolithic component encourages me to be lazy about these dependencies and, more importantly, these couplings. Packaging unrelated code into distinct components forces me to consider whether introducing a new coupling is justified.

3) Monolithic components slow the pace of development.

When components are small and single purpose, changes are small, well-contained, readily tested and easily understood. New releases can be performed more readily, more easily and hence more frequently.

Bundling unrelated code into a monolithic component means I need to synchronize development of that unrelated code: Maybe I'd like to do a new release of sub-component X, but I can't since sub-component Y is in the midst of a major refactoring. Maybe I'd like to do a major refactoring of sub-component A but I can't since sub-component B is preparing for a release.

The more "foundational" a component is, the more this problem multiplies. E.g., suppose we can't release lang.reflect because we're screwing around with lang.time, and beanutils can't release without a released version of lang.reflect, and struts can't release without a released version of beanutils, etc.
(Decoupling the CVS HEAD of lang.time and the released version of lang.reflect (i.e., releasing lang with the latest lang.reflect but without lang.time), as we've done in other circumstances, only demonstrates that these really are unrelated packages, and causes problems for those that work from a SNAPSHOT.)

4) Monolithic components make it more difficult for clients to track and communicate their dependencies.

Following our versioning guidelines [5], non-backward compatible changes to public APIs require new major version numbers. Hence a non-backward compatible change to sub-component X will require a new major version number, even though sub-component Y may be fully backwards compatible. Clients that only depend upon Y (and since X and Y are not strongly related, this is a significant set) will find the contract implied by the versioning guidelines broken--the version numbers suggest a major change, but there isn't one as far as Y is concerned. Clients that only depend upon Y are forced to confirm that nothing has been broken, and perhaps even update existing deployments even though there has been no change to Y. This weakens the utility of the versioning heuristics, and makes it more difficult for clients to track and manage their dependencies.

5) Monolithic components only hide circularities, and may even encourage them.

Whenever A depends upon B and B depends on A, we have a circular dependency, wherever the code for A and B is located. As with most forms of strong coupling, such circularities should be avoided whenever possible. Building A and B in the same compilation run may make it possible to deal with a circular dependency, but it doesn't prevent it. Similarly, placing A and B in different components doesn't create a circular dependency, it exposes it.

The "circular dependency" issue is largely hypothetical anyway.
In the case of [lang] for example, several of the sub-packages have literally no dependency on the rest of the package, and most that do have very weak coupling at best.

Moreover, it is trivial to combine two previously independent components. Following (1) and (2), it may be substantially more difficult to tease apart classes that were once part of the same component.

6) Monolithic components only get bigger, making all of these problems worse.

For instance, the [lang] proposal that was approved describes its scope as:

  "[A] package of Java utility classes for the classes that are in java.lang's hierarchy, or are considered to be so standard as to justify existence in java.lang. The Lang Package also applies to primitives and arrays." [6]

In the five months since that proposal was accepted, the scope of lang has expanded significantly ([7], [8], [9], [10], [11]) and now includes or is proposed to include:

 * math utilities [12]
 * serialization utilities [13]
 * currency and unit classes [14]
 * date and time utilities [15]
 * reflection and introspection utilities [16]
 * functors [17]
 * and much more [18], [19], [20], [21], [22]

And the more the scope expands, the more the scope expands--the existence of the [lang] monolith has encouraged a reduction in ([23], [24], others) and discouraged the growth of ([25], [26], others) other components, and has discouraged the introduction of new components ([27], [28], others).

As above and before, if classes aren't commonly used, changed, and released together, or mutually dependent on each other, they should be in distinct components. If we want a catch-all JAR, we've got one [3]. Given the principles enumerated in the commons guidelines and the detrimental effects enumerated here, I'm not sure why we'd follow any other course.
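
For what it's worth, the false-dependency scenario in (1) can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name FalseDependencyDemo, the version strings "1.0" and "1.1", and the rule that each client pins exactly one version of each component are all hypothetical, and real compatibility rules are looser than exact matching. It simply reports any component that two clients require at different versions, since a flat classpath can hold only one.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: finds components that clients require at more than one version. */
public class FalseDependencyDemo {

    /**
     * requirements maps client name -> (component name -> required version).
     * Returns only the components that are requested at two or more versions.
     */
    public static Map<String, Set<String>> conflicts(Map<String, Map<String, String>> requirements) {
        Map<String, Set<String>> versionsByComponent = new TreeMap<>();
        for (Map<String, String> needs : requirements.values()) {
            for (Map.Entry<String, String> need : needs.entrySet()) {
                versionsByComponent
                        .computeIfAbsent(need.getKey(), k -> new TreeSet<>())
                        .add(need.getValue());
            }
        }
        // A flat classpath holds one version per component, so a component
        // requested at a single version is not a conflict; drop those entries.
        versionsByComponent.values().removeIf(versions -> versions.size() < 2);
        return versionsByComponent;
    }

    /** Monolithic packaging: math and reflect both ship inside [lang]. */
    public static Map<String, Map<String, String>> monolithic() {
        Map<String, Map<String, String>> r = new LinkedHashMap<>();
        r.put("cli", Map.of("lang", "1.0"));        // uses lang.math, not yet updated (assumed version)
        r.put("beanutils", Map.of("lang", "1.1"));  // needs the changed lang.reflect (assumed version)
        return r;
    }

    /** Split packaging: [math] and [reflect] are released independently. */
    public static Map<String, Map<String, String>> split() {
        Map<String, Map<String, String>> r = new LinkedHashMap<>();
        r.put("cli", Map.of("math", "1.0"));
        r.put("beanutils", Map.of("reflect", "1.1"));
        return r;
    }

    public static void main(String[] args) {
        System.out.println("monolithic: " + conflicts(monolithic()));
        System.out.println("split:      " + conflicts(split()));
    }
}
```

With the monolithic packaging the sketch reports [lang] as requested at two versions; with [math] and [reflect] split out it reports nothing, even though neither [cli] nor [beanutils] changed what code it actually uses.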

- Rod

[0] <http://jakarta.apache.org/commons/charter.html>
[1] <http://jakarta.apache.org/turbine/maven/>
[2] <http://classworlds.werken.com/uberjar.html>
[3] <http://cvs.apache.org/viewcvs/jakarta-commons/combo/>
[4] <http://java.sun.com/products/javawebstart/>
[5] <http://jakarta.apache.org/commons/versioning.html>
[6] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/PROPOSAL.html?rev=1.1&content-type=text/vnd.viewcvs-markup>
[7] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.10&r2=1.12&diff_format=h>
[8] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.25&r2=1.26&diff_format=h>
[9] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.28&r2=1.29&diff_format=h>
[10] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.30&r2=1.31&diff_format=h>
[11] <http://cvs.apache.org/viewcvs/jakarta-commons/lang/STATUS.html.diff?r1=1.31&r2=1.32&diff_format=h>
[12] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=586315>
[13] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=457636>
[14] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=18957>
[15] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=577799>
[16] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=411302>
[17] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=577713>
[18] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=16718>
[19] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=18778>
[20] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=19885>
[21] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=512176>
[22] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=581065>
[23] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=519705>
[24] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=20304>
[25] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgNo=19847>
[26] <http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgNo=19865>
[27] <http://archives.apache.org/eyebrowse/ReadMsg?[EMAIL PROTECTED]&msgId=551801>
[28] <http://archives.apache.org/eyebrowse/ReadMsg?listId=15&msgNo=19221>