Hi Folks,

  Jerome and I have been thinking a bit about the whole issue of "static"
NutchConf, versus removing it and making it a constructor parameter, etc. I
personally think that a lot of this issue stems from the fact that the
actual source code for Nutch, what I would call the "source distribution",
is in the same location as the actual "deployment" distribution. For
example, we have a directory structure like:

$NUTCH_HOME:

src/
build/
lib/
bin/
...

This is both where you run Nutch and where you build it. I think that
this would be drastically improved by defining the notion of a "Nutch
Deployment". For example, in a lot of my projects at JPL, we check out the
source code of our projects from CM, then we construct a "build" of our
project. This build becomes what we then "deploy" to a particular deployment
environment, or location, and that's where the system is run from. A simple
example would be:

I have project A, here is A's source code structure:

/path/to/A/src/java/my/package/Test.java
/path/to/A/build.xml
/path/to/A/....


Then, when I type: ant deploy, the following structure is created:

/path/to/A/build/distribution/lib/<all lib files needed by the deployment>
/path/to/A/build/distribution/bin/<all scripts needed by the deployment>
/path/to/A/build/distribution/LICENSE.txt
/path/to/A/build/distribution/conf/<any conf files needed by the deployment>
...and so on
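
For what it's worth, a deploy target along those lines might look roughly
like the sketch below. The property names (build.dir, dist.dir) and the
assumption that lib/, bin/, conf/ and LICENSE.txt sit at the project root are
mine, purely for illustration; compiling and jarring A's own classes would
live in a separate target that is not shown.

```xml
<project name="A" default="deploy" basedir=".">
  <!-- Hypothetical property names, chosen for this sketch only. -->
  <property name="build.dir" value="build"/>
  <property name="dist.dir" value="${build.dir}/distribution"/>

  <!-- Assumes lib/, bin/, conf/ and LICENSE.txt exist at the project root. -->
  <target name="deploy" description="Assemble a self-contained deployment tree">
    <mkdir dir="${dist.dir}/lib"/>
    <mkdir dir="${dist.dir}/bin"/>
    <mkdir dir="${dist.dir}/conf"/>

    <!-- Libraries needed at run time -->
    <copy todir="${dist.dir}/lib">
      <fileset dir="lib" includes="**/*.jar"/>
    </copy>

    <!-- Launch scripts -->
    <copy todir="${dist.dir}/bin">
      <fileset dir="bin"/>
    </copy>

    <!-- Configuration files that belong to this deployment -->
    <copy todir="${dist.dir}/conf">
      <fileset dir="conf"/>
    </copy>

    <copy file="LICENSE.txt" todir="${dist.dir}"/>

    <!-- Ant's copy task does not keep Unix permissions, so restore them -->
    <chmod dir="${dist.dir}/bin" perm="ugo+rx" includes="**/*"/>
  </target>
</project>
```

Nothing under src/ is copied, so the resulting tree holds only what is
needed to run the system.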

A user could then take the /path/to/A/build/distribution folder and copy it
to a "deployment" directory. That copy is the deployment of the system: it
is separate from the source code, which unties the source distribution from
the deployment distribution. If we had this concept in Nutch today, I think
a lot of the static NutchConf issues would disappear, correct? We would have
the concept of separate deployments, each with its own conf directory,
instead of relying on the same deployment to run a whole bunch of
distributed processes.

I may be misunderstanding this whole conversation, but if I'm right, then I
would propose that we formalize a notion of a "deployment" of Nutch versus
the actual "source distribution", instead of co-mingling them. Thoughts?


Thanks!

Cheers,
  Chris


______________________________________________
Chris A. Mattmann
[EMAIL PROTECTED]
Staff Member
Modeling and Data Management Systems Section (387)
Data Management Systems and Technologies Group

_________________________________________________
Jet Propulsion Laboratory            Pasadena, CA
Office: 171-266B                        Mailstop:  171-246
_______________________________________________________

Disclaimer:  The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.

