2009/12/28 Robert Burrell Donkin <[email protected]>:
> we've struggled to find the right balance between power, performance
> and usability. IMHO we haven't yet succeeded.
So we agree we can try to improve things even if this means breaking
backward compatibility.

>> 1) We have a "Field" interface, a RawField and a ParsedField. Most
>> code deal with generic Fields but knows when it is a parsedfield or a
>> rawfield. Nowhere we check the Field implementation to understand if
>> it is already parsed or not, so it seems we always know when it is a
>> parsedfield and when it is a rawfield. Some code calling getName does
>> a trim and a lowercase, some other code simply lowercase without
>> trimming. Why don't we simply canonicalize things in getName and
>> publish a clear contract about what getName returns?
>
> IIRC performance (some downstream application don't care about
> canonicalisation and don't want to pay the cost) and power (some
> downstream apps require uncanonicalised input - this is a requirement
> for round tripping in particular)

I guess all of them simply use "getRaw", don't you think? getName and
getBody should not be used for round-tripping, as they may change
something anyway (getBody is unfolded, so if you fold it again you can't
be sure of obtaining the same result: you could end up folding in a
different place).

> it's important to remember that there are downstream applications that
> use the methods and classes directly. so, even if a method does not
> seem to be used in Mime4J, it may have been added to facilitate a
> downstream use case. equally, it could be legacy. hard to tell since
> everything's bundled up together.

As we are still in 0.x releases and we agree that the exposed
interfaces/code should be improved, we should try to keep track of
current downstream users and understand exactly what they need to do, so
that we can use them as use cases to help us improve the separation of
concerns. We don't want to expose every single class and maintain
backward compatibility for every single class, so we should start
selecting things.
IMHO if we are unable to collect downstream users we should try to
decide on our own and maybe hide some unused methods if we don't think
they should be used outside. Then, after releasing a new version (0.7),
we can wait for "upgraders" to complain about missing features and, if
we find we really removed a useful feature, we can add it back in the
next release (0.8).

>> As I fail to see the current "idea" maybe there is no idea and simply
>> this is the result of too many hands and refactorings done in the
>> years, so before being the next hand and applying the next refactoring
>> I'd like to collect some thought.
>
> IMO to satisfy so many use cases requires low level complexity. no
> one's managed to come with a single idea that can satisfy all
> requirements.

We all know the XML parsers world. We have SAX, DOM, StAX, TrAX, XOM
(and also XML data binding APIs), and so on. There is no API that
satisfies all users, and none of them has been made obsolete by the
others. XML libraries usually expose one or more of those APIs but
(AFAICT) none of them exposes all of the interfaces in a single library.

MimeTokenStream is our StAX parser
MimeStreamParser is our SAX parser
the "message" package is our DOM

In the XML world, SAX and StAX "events" are mainly based on Strings and
at most on "QName". There are no Elements, Nodes or anything else "DOM"
related at this level. In mime4j (wrt streaming APIs) we are almost
there: the model is pretty similar to the XML model, the main difference
being our "Field" interface, which is shared between our DOM and our
S(t)AX.

Talking about "copying" what XML did, we know that we have to
"compromise" on round-tripping (most XML APIs out there let you read XML
or alter XML, but they will lose most of the original formatting during
parsing).

IMHO our current SAX/StAX parser is almost OK and we should only improve
naming, packages and maybe a few other things, like deciding what to do
with the "Field" interface.
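
To make the "strings only" point a bit more concrete, here is a minimal
sketch in Java of what a stream-level field could look like. StreamField
and its parse method are hypothetical names made up for this
illustration, not existing mime4j classes, and the contract shown
(getName trimmed and lower-cased, getBody unfolded, getRaw left
untouched) is only an assumption about what we might agree to publish.

```java
import java.util.Locale;

/**
 * Hypothetical stream-level header field: plain strings only, no DOM types.
 * Illustration only; this is not the mime4j Field interface.
 */
final class StreamField {
    private final String raw;   // the header exactly as read, folding included
    private final String name;  // assumed contract: trimmed and lower-cased
    private final String body;  // assumed contract: unfolded and trimmed

    private StreamField(String raw, String name, String body) {
        this.raw = raw;
        this.name = name;
        this.body = body;
    }

    /** Splits a raw header field at the first colon. */
    static StreamField parse(String raw) {
        int colon = raw.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("no colon in header field: " + raw);
        }
        String name = raw.substring(0, colon).trim().toLowerCase(Locale.ROOT);
        // Unfold: drop every line break that is followed by a space or tab.
        String body = raw.substring(colon + 1)
                .replaceAll("\\r?\\n(?=[ \\t])", "")
                .trim();
        return new StreamField(raw, name, body);
    }

    /** Original text, the only accessor suitable for round-tripping. */
    String getRaw() { return raw; }

    /** Canonical name, so callers never trim or lower-case on their own. */
    String getName() { return name; }

    /** Unfolded body; the original fold positions are lost here. */
    String getBody() { return body; }
}
```

Two headers folded in different places give the same getBody(), so only
getRaw() can reproduce the input exactly; that is the compromise on
round-tripping in a nutshell.
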
In our DOM, instead, I see one big "defect": we don't have interfaces
for some key nodes. We should add that MIME is a different beast than
XML, but I think that we should try to model interfaces in a package and
put there the Message as an interface, each *Field as an interface, and
then have some "builder" service to start a new Message from scratch or
to parse it using SAX (and we already have the MessageBuilder)...

What about creating interfaces for the DOM and splitting the "Field" used
by our S*AX from the "Field" used by our DOM?

>> Do you think all I've written are foolish thoughts or do you think we
>> should try to sort this stuff out before releasing mime4j 1.0 ?
>
> IMO the API isn't stable or good enough for a 1.0 release
>
> some deep design decisions need to be taken about the library. without
> the powerful but unintuitive features, mime4j can't be used for
> downstream applications that require performance and power. perhaps
> mime4j needs to be split into two libraries: a usable, intuitive API
> for non-experts and a low level powerful, quick API for downstream
> applications. this has worked for other applications.

Don't you think that the "current" StAX+SAX+DOM approach works for MIME
too? IMHO what is "unintuitive" is the way we try to implement them now
(especially the field parsing and the DOM handling).

E.g. we currently have package dependency cycles again. I know some of
you couldn't care less about this, but I think that working without
cycles and keeping a clear package dependency tree is the only way to
produce an intuitive result. If I can't create a package tree, or a
module tree, for an application then I can't understand or explain it
"intuitively".

Stefano

> - robert
>
