Hi,

On Saturday, 2012-10-27, Josh Wills wrote:
> On Sat, Oct 27, 2012 at 2:41 PM, Matthias Friedrich <[email protected]> wrote:

[...] 
>> I'd like to avoid a CRUNCH-60 type of situation where I spend lots of
>> time playing through scenarios and planning changes that, as it turns
>> out, aren't welcome. That is *very* frustrating, to say the least, so
>> let's talk ;-)
 
> Yeah, that sucked, let's not do that again.
 
> I think my main concern is that I don't have visibility into the scope
> of API changes that come along with CRUNCH-60, so I don't really see
> what we're building towards. I want to know what the end state of the
> API looks like w/some amount of precision-- what will live where,
> etc., etc.

As far as I'm concerned, CRUNCH-60 is dead because we didn't want to
create any more Maven modules, which is fine by me. My goal is to
provide a clean, stable API to clients, which means we have to decide
which packages are client-facing (and thus, everything in there has to
remain stable) and which packages are internal to Crunch and may be
changed at any time (giving us the opportunity to improve things).

By "stable" I mean that we can still add and remove stuff, but only
in a way that doesn't break compatibility. We can only remove things
that have been deprecated for at least two releases. What we can't do
anymore is move types around.

Re. your question on precision: I don't believe in completely freezing
the API. API design is hard and we don't have Josh Bloch on the team,
so it's unlikely that we get everything right. But I do believe that
we can adhere to the definition of stable above.
 
> I agree w/Gabriel that having PType and PTypeFamily in the base
> package feels right and that having stuff that is essentially
> implementation detail stuff like Converter and OutputHandler there
> feels wrong. It may be that the value of having PType and PTypeFamily
> in base outweighs the cost of having Converter and OutputHandler there
> as well, but since I don't see the big picture here, it feels like
> we're making a series of locally optimal decisions that don't
> necessarily lead to a globally optimal API, where I'm using "optimal"
> in the sense of one that is both cleanly separated as well as easy to
> understand and use. Making a series of locally optimal decisions is
> how I ended up coding into the cul-de-sac that necessitated the
> OutputHandler abstraction in the first place.

OK, I see the problem. As usual we want the same thing, mostly ;-)
 
> If it's possible to spell out exactly what the APIs look like when
> this is done, let's do that; if we're not there yet but could spell
> out a set of principles that we would like to be true of the APIs,
> then let's do that and then start creating some strawman proposals for
> organization of code that satisfies them as closely as we can.
> Implementing the changes required to get to the goals may take longer
> than a single release cycle, but I wouldn't want it to take more than
> two releases, just because I'm loathe to be in a process of
> continually moving things around.

Good idea, let's first agree on a set of principles. In my opinion,
we should limit the scope of these principles to client-facing
packages; everything else can be changed in any way at any time.

My proposal is based on [2]; a very short and incomplete summary can
be found at [3]. For us, it boils down to this:

 * A package must have a clear purpose; it contains either mostly
   abstractions or mostly implementations (this makes it easier
   to explain)
 * A package must not depend on a package that is less stable 
   than itself (meaning a package containing mostly abstractions
   must not depend on one containing mostly implementations)
 * There must be no dependencies from a client-facing package to
   an internal package (that is, javadocs don't have dangling
   references)
 * There must be tight cohesion between classes in a package or
   the package should be split (this doesn't apply to .util)
 * There must be no dependency cycles between client-facing packages
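
To make the last rule concrete, here is a minimal sketch of a cycle
check over a hand-written package graph. It is not Crunch code: the
class PackageCycleCheck, the method findCycle and the edges in main()
are assumptions chosen for illustration, and a real check would derive
the graph from the compiled classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Crunch: finds one dependency cycle
// in a package graph using depth-first search.
class PackageCycleCheck {

  // Returns the packages forming a cycle, or an empty list if there is none.
  static List<String> findCycle(Map<String, List<String>> deps) {
    Set<String> done = new HashSet<>();
    List<String> path = new ArrayList<>();
    for (String pkg : deps.keySet()) {
      List<String> cycle = visit(pkg, deps, done, path);
      if (!cycle.isEmpty()) {
        return cycle;
      }
    }
    return new ArrayList<>();
  }

  private static List<String> visit(String pkg, Map<String, List<String>> deps,
      Set<String> done, List<String> path) {
    int pos = path.indexOf(pkg);
    if (pos >= 0) {
      // pkg is already on the current path, so we walked in a circle.
      return new ArrayList<>(path.subList(pos, path.size()));
    }
    if (done.contains(pkg)) {
      // Fully explored earlier without finding a cycle.
      return new ArrayList<>();
    }
    path.add(pkg);
    for (String dep : deps.getOrDefault(pkg, new ArrayList<>())) {
      List<String> cycle = visit(dep, deps, done, path);
      if (!cycle.isEmpty()) {
        return cycle;
      }
    }
    path.remove(path.size() - 1);
    done.add(pkg);
    return new ArrayList<>();
  }

  public static void main(String[] args) {
    // Illustrative edges only: "a -> b" means package a references package b.
    Map<String, List<String>> deps = new LinkedHashMap<>();
    deps.put("base", Arrays.asList("types"));
    deps.put("types", Arrays.asList("base"));
    deps.put("fn", Arrays.asList("base"));
    System.out.println(findCycle(deps)); // prints [base, types]
  }
}
```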

You can calculate metrics for all of this but it's really just common
sense. Crunch follows these rules in the vast majority of cases
already. Right now I see the following violations:

 * The .types package mixes abstractions and implementations and
   is part of a dependency cycle with base.
 * The base package references the .io implementation package
   causing a dependency cycle.
 * The base package references the .util package causing a
   dependency cycle.
 * There are lots of implementations in CombineFn and other Fns
   that shouldn't be in base (which is for abstractions). We should
   move them to .fn, perhaps into Guava-style CombineFns and FilterFns
   classes. We can even do this in a backward-compatible way.
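
A minimal sketch of what such a backward-compatible move could look
like. The types below are simplified stand-ins, not Crunch's real
CombineFn: Combiner, Combiners and sumLongs() are names I made up
for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch, not Crunch code. Everything is nested in one
// class to keep the example self-contained; in practice Combiner would
// stay in the base package and Combiners would live in .fn.
class CompatDemo {

  // Stand-in for an abstraction in the base package.
  abstract static class Combiner<T> {
    abstract T combine(Iterable<T> values);

    /**
     * Old location of the factory method, kept as a forwarder.
     *
     * @deprecated use {@link Combiners#sumLongs()} instead
     */
    @Deprecated
    static Combiner<Long> sumLongs() {
      return Combiners.sumLongs();
    }
  }

  // Guava-style holder for the implementations (new location).
  static final class Combiners {
    private Combiners() {
      // Static factory methods only, no instances.
    }

    static Combiner<Long> sumLongs() {
      return new Combiner<Long>() {
        @Override
        Long combine(Iterable<Long> values) {
          long sum = 0;
          for (Long v : values) {
            sum += v;
          }
          return sum;
        }
      };
    }
  }

  public static void main(String[] args) {
    // Old and new call sites both keep working during the deprecation period.
    System.out.println(Combiner.sumLongs().combine(Arrays.asList(1L, 2L, 3L)));
    System.out.println(Combiners.sumLongs().combine(Arrays.asList(1L, 2L, 3L)));
  }
}
```

Note that the forwarder keeps a temporary reference from the
abstraction to the implementation holder. That reference goes away
once the deprecated method is removed, which under the stability rule
above means after at least two releases.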

Given that Crunch is a framework, this is not really ambitious. My
teams at work follow a lot more design rules even for applications.
One example: We make everything final by default; extension takes
place either by composition or by template method which makes it
possible to evolve classes without breaking client code. That's
what the DesignForExtension Checkstyle rule mandates (we disabled
it for Crunch).
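
A minimal sketch of the two extension styles, with made-up classes
(Greeter, FormalGreeter and ShoutingGreeter are illustrations, not
Crunch types):

```java
import java.util.Locale;

// Hypothetical sketch of "final by default", not Crunch code.
class TemplateMethodDemo {

  // Template method: the algorithm is final, subclasses only fill in
  // the abstract hook.
  abstract static class Greeter {
    public final String greet(String name) {
      return salutation() + ", " + name + "!";
    }

    // The only extension point.
    protected abstract String salutation();
  }

  // Concrete classes are final; they are not meant to be subclassed.
  static final class FormalGreeter extends Greeter {
    @Override
    protected String salutation() {
      return "Good day";
    }
  }

  // Composition: wraps a Greeter instead of extending a concrete class.
  static final class ShoutingGreeter {
    private final Greeter delegate;

    ShoutingGreeter(Greeter delegate) {
      this.delegate = delegate;
    }

    String greet(String name) {
      return delegate.greet(name).toUpperCase(Locale.ROOT);
    }
  }

  public static void main(String[] args) {
    Greeter formal = new FormalGreeter();
    System.out.println(formal.greet("Josh"));                      // prints Good day, Josh!
    System.out.println(new ShoutingGreeter(formal).greet("Josh")); // prints GOOD DAY, JOSH!
  }
}
```

Because greet() is final, its body can be changed later without
worrying about subclasses that override it.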

Note: Java 8 will be a game changer; as far as I understand it, our
abstract Fns are *not* going to work with lambdas; we would need
interfaces with only a single method. Does anyone think we have to
address this before Java 8 is released?
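
A minimal sketch of the issue, assuming the lambda syntax as currently
proposed for Java 8. The types are simplified stand-ins, not Crunch's
real Fns: AbstractMapFn, MapFunction and wrap() are names I made up
for illustration.

```java
// Hypothetical sketch, not Crunch code. Requires Java 8 or later.
class LambdaDemo {

  // Stand-in for one of our abstract Fn classes.
  abstract static class AbstractMapFn<S, T> {
    abstract T map(S input);
  }

  // A single-method interface, which a lambda can implement.
  @FunctionalInterface
  interface MapFunction<S, T> {
    T map(S input);
  }

  // Adapter from the interface to the abstract class, so existing APIs
  // that expect an AbstractMapFn could still accept lambdas indirectly.
  static <S, T> AbstractMapFn<S, T> wrap(final MapFunction<S, T> f) {
    return new AbstractMapFn<S, T>() {
      @Override
      T map(S input) {
        return f.map(input);
      }
    };
  }

  public static void main(String[] args) {
    // Does not compile, because an abstract class is not a valid lambda target:
    // AbstractMapFn<String, Integer> bad = (String s) -> s.length();

    AbstractMapFn<String, Integer> len = wrap((String s) -> s.length());
    System.out.println(len.map("crunch")); // prints 6
  }
}
```

An adapter like wrap() would let us add lambda support without
changing the existing abstract classes, at the cost of one extra
method call per record.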
 
> I think that once we have a proposal we agree on, we should cut the
> 0.4.0 release and then devote the next release cycle to implementing
> the new API for 0.5.0.

Agreed. BTW, I have verified our release process (with a local Nexus
and all), we're ready to release at any time.

So, what do you think of the principles outlined above?

Regards,
  Matthias

[1] http://tmp.mafr.de/crunch/apidocs/0.4.0-incubating-SNAPSHOT/overview-summary.html
[2] Robert C. Martin, "Agile Principles, Patterns, and Practices",
    chapter "Principles of Package and Component Design".
[3] http://en.wikipedia.org/wiki/Package_Principles
