> Ooh, long e-mail! I'm gonna try and split this up... :-D

Sorry Dude, I got excited. :-) I'll try to keep them shorter or split them. [I'll reply a few times to this one.]

Having slept on what I saw, I do have some serious questions, and (to keep it short, I'll come right to the point, knowing you know I mean it respectfully) I wonder if there is as much significant difference between Gump2 and Gump3 as I first thought. They are much the same.

I have a slight deja-vu feeling here. You've built a nice (clean) start, like Sam did, but to get from this to a live running system will take much the same work that I added last time, and I'm not sure the key problems of Gump2 have been understood/corrected. I'm going to try (over time) to list every place in Gump2 that I feel would be as bad in Gump3 so we can address them. This isn't me being petty, but me trying to pressure test this new approach against my understanding of reality (for all its/my warts).

> I firmly believe there is very little need for different components to
> communicate. If you architect things the IOC way, components will use just
> one or two other components, and their parent can just set up the references
> between all those components.

[BTW: I still could use help with IOC. I have a crude understanding of it, but please don't forget to enlighten me if you see I'm missing a point.]

Sure, I see that components ought not need to communicate directly.

In Gump2 we have a model tree (workspace/modules/projects) and a (theoretically separate, but not) tree of results. That tree is for a few projects, or all, based off the filter of work to do. As components do work on that tree they store data at the right level (run/workspace/module/project), perhaps even setting state (failed, etc.). This is Gump2, and (as I hear it) Gump3, no differences.

I feel it is that tree that is the weakness people consider "bloat". Not its memory size, but its complexity, all the data stored in there -- and the fact it is a "batch". That is a key similarity between Gump2/Gump3 and (IMHO) a key issue to address. The closer I look the more I realize the similarities between Gump2 and Gump3.

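[For concreteness, here is a rough Python sketch of how I read the "IOC way" combined with the results tree described above. Every name in it (Node, StatsLoader, CvsUpdater, the "stats" and "cvs_exit" keys) is made up for illustration; none of it is actual Gump2 or Gump3 code.]

```python
class Node:
    """One level of the tree (run/workspace/module/project). Hypothetical."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.results = {}  # anything any component decides to store here
        if parent is not None:
            parent.children.append(self)


class StatsLoader:
    """Hypothetical component: puts history information on a node."""

    def run(self, node):
        node.results["stats"] = {"previous_successes": 3}  # made-up data


class CvsUpdater:
    """Hypothetical component: its one collaborator is handed in by the parent."""

    def __init__(self, stats_loader):
        self.stats_loader = stats_loader  # reference injected, not looked up

    def run(self, node):
        self.stats_loader.run(node)   # history must be present first
        node.results["cvs_exit"] = 0  # pretend the update succeeded


# The "parent" does the wiring; the components never look each other up.
workspace = Node("workspace")
module = Node("junit", parent=workspace)
updater = CvsUpdater(StatsLoader())
updater.run(module)
```

Even in this toy, the untyped results dict on each node is where everything accumulates, which is the part I worry about.
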
> What will happen is that a component needs a certain kind of result
> available. For example, something that pushes information in the dynagump
> database needs that information, which might be put there by an ant builder
> or something like that. This kind of stuff is trivial in python; you just
> set the property on the relevant part of the model and then retrieve it
> later.
[...]
> Note that such communication is pretty indirect. For example the start of
> the CvsUpdater plugin I did just pushes information into the model (the log
> of the cvs command, exit status, etc) without worrying who uses that
> information (at the moment, it is just ignored).

Part of the problem is ordering/sequencing. A failed CVS update would not halt all efforts on a module (builds would still occur) if the module had a "semi-fresh" copy. (This was due to SF.net CVS being so flaky for so long, even for Gump-wise stable things like JUnit.) As such, prior to CVS updating we needed to bring some "stats/history" information into memory, which enforces an implicit dependency. [Note: Stats Actor today stores Stats on the Tree, so users (CVS Actor) just ask for it from there, they don't talk directly.]

I know you can do "inter component communications" w/ Python properties, Gump2 does, but it has no "contract" (as Stefano would say); it is not clean, it is intricate internals knowledge passed from one component to another. It is stuff like this (and order dependencies like this) that ties components together, and keeps things fat. [Gump2 at least used typed member data/methods on the tree in order to allow some contracts.]

What you are suggesting is almost exactly how Gump2 works, and is (I fear) where the thoughts of "bloat" come from.

> > There
> > were times when building logic wanted to know something historically (had
> > this built before, etc.) in order to determine how much effort (or what
> > switches) to use.
> > Is inter-component communications like this a real no-no,
> > or is this something that might be "coincidentally" allowed via steps in
> > pre-processing, etc.
>
> We don't need "steps". Think unix command line utilities. You can make them
> communicate:
>
>   find . -type f | xargs -v ".svn"

I'm a PIPE lover as much as the next guy, but simple flat stream pipes are not what we are building. Our components use complex results. Do we need contracts for those, or things (like DOM tree/XML structures) that we can persist/stream/validate? [How does Cocoon address this?]

> Without steps. That "|" there in gump is achieved by setting a property on a
> piece of the model.

As with Gump2, but the properties grow and need management. They (and implicit dependencies) are the bloat.

> Plugins
> -------
> > I think that generating plug-ins (perhaps even for loading, and such) is
> > key. I'm not sure (yet) if the new model is any better than the old in
> > allowing the "core steps" (loading, modelling) to be plugged-in, but I think
> > it needs to be investigated.
>
> Yes, it's easy. Change the get_verifier() in config.py to provide a different
> implementation, and that's it!
>
> > I see you have a Maven parser, but could/should
> > that be a plug-in?
>
> I doubt we should be talking about this kind of stuff as a "plugin". There's
> very specific bits of functionality that *need* to be performed (right
> "contracts") for gump to work. To me, a plugin is something you can leave
> out and still have something that basically works.

I think *the* key problem with Gump2 is "what is core" and "what can be plugged in". Maybe I (and you) are getting a little carried away with what can be a plug-in, and maybe too many things are invalidly coded as such. Is "historical information" a fundamental service or some swappable component? [Please forgive me if I fail to know the correct terminology for 'corn concerns' or whatever. Perhaps teach me what I need to communicate more clearly with you.]

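[To make the "contract" question above concrete, a rough Python sketch follows. The names (CvsResult, ModuleNode, set_cvs_result) are hypothetical and not taken from Gump2 or Gump3; it only contrasts an ad hoc property with a typed one that the model knows about.]

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CvsResult:
    """Hypothetical contract for what a CVS updater publishes."""
    exit_status: int
    log: str

    @property
    def ok(self) -> bool:
        return self.exit_status == 0


class ModuleNode:
    """Hypothetical model node that only accepts results it knows about."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._cvs: Optional[CvsResult] = None

    def set_cvs_result(self, result: CvsResult) -> None:
        if not isinstance(result, CvsResult):
            raise TypeError("expected a CvsResult")
        self._cvs = result

    def cvs_result(self) -> CvsResult:
        if self._cvs is None:
            # Makes the ordering dependency explicit instead of implicit.
            raise LookupError(f"no CVS result recorded for {self.name}")
        return self._cvs


module = ModuleNode("junit")

# Ad hoc style: works, but nothing says who sets it or what shape it has.
module.cvs_exit = 1

# Contract style: producer and consumer agree on CvsResult.
module.set_cvs_result(CvsResult(exit_status=1, log="connection timed out"))
failed = not module.cvs_result().ok
```

The second style costs more code per result, which is the trade-off: the contracts are visible, but somebody has to maintain them.
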
The problem with Gump2 (and why it is a batch, and less able to be incremental/split) is that we have metadata loading as a stage, and not "on demand". As such we blast (we hope) through the whole metadata, building a tree, and then work it as a batch. It is hard to allow folks to plug in loaders (e.g. Maven parsers), and harder still to allow them to build/load the in-memory structures themselves. This is true for "loading", for "modelling", and for much of our core. This is where we fail to have a system that we or others can break into pieces and use in pieces.

I think this is where we need components. I don't know if all things can be simple components, or if we need some "interfaces" (e.g. a LoaderComponent, a BuilderComponent, etc.). In Gump2 I tried the latter (if not formally as components) 'cos I felt it was less pure, more practical, and better fitting the need. I'd like to hear viewpoints on that, 'cos I think it is key.

> > Thanks Leo. Good job. [and now my mind is racing w/ thoughts around this,
> > thanks for waking me up! I hope I don't cut a finger off w/ the jaws 'cos
> > I'm distracted. ;-)]
>
> Hehehe. Do let us know you're alright dude!

Fingers all still here, and still as "fat" as always. ;-) Burn building next week, and once I (w/ a too enthusiastic career instructor) melted my helmet in one of those. I'll try to bring my brain back, from next week, w/o too much new frying. :-)

regards,

Adam