Actually, thinking a bit further into this, I kind of agree with you. I
initially thought that the best approach would be to change
PluginRepository.get(Configuration) to PluginRepository.get() where
get() just creates a configuration internally and initializes itself
with it. But then we wouldn't be passing JobConf to PluginRepository
but PluginRepository would do something like a
NutchConfiguration.create(), which is probably wrong.

So, all in all, I've come to believe that my (and Nicolas') patch is a
not-so-bad way of fixing this. It allows us to pass JobConf to
PluginRepository and stops creating new PluginRepository-s again and
again...

What do you think?

IMO a better way would be to add proper equals() and hashCode() methods to Hadoop's Configuration object, with equals() calling getProps().equals(o.getProps()), so that Configuration instances could be used as map keys. Every class that maps keys to values (Properties, HashMap, etc.) already has equals() and hashCode().
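To make the idea concrete, here is a rough sketch of what I mean. It is not Hadoop or Nutch code: SimpleConf is a made-up stand-in for Configuration, RepositoryCache is a hypothetical cache, and a plain Object stands in for the PluginRepository instance. The only real API it relies on is java.util.Properties, which compares its own entries in equals() and hashCode().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for Hadoop's Configuration (NOT the real class).
class SimpleConf {
    private final Properties props = new Properties();

    void set(String name, String value) { props.setProperty(name, value); }
    String get(String name) { return props.getProperty(name); }

    // Two configurations are equal when they hold the same properties.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SimpleConf)) return false;
        return props.equals(((SimpleConf) o).props);
    }

    // Must stay consistent with equals(), so it is derived from the same data.
    @Override
    public int hashCode() { return props.hashCode(); }
}

// Hypothetical cache: one repository object per distinct configuration.
class RepositoryCache {
    private final Map<SimpleConf, Object> cache = new HashMap<>();

    synchronized Object get(SimpleConf conf) {
        // Object is a placeholder for whatever is expensive to build.
        return cache.computeIfAbsent(conf, c -> new Object());
    }
}
```

With this, two separately created configurations carrying the same properties map to the same cached instance. One caveat: the key is mutable, so changing a configuration after it has been used as a key would change its hash code and break the lookup.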

Another nice thing would be to be able to "freeze" a configuration object, preventing anyone from modifying it.
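Freezing could look roughly like the sketch below. Again, this is only an illustration under my own assumptions: FreezableConf and its freeze()/isFrozen() methods are invented names, not anything that exists in Hadoop's Configuration.

```java
import java.util.Properties;

// Hypothetical configuration that can be made read-only (NOT a Hadoop class).
class FreezableConf {
    private final Properties props = new Properties();
    private boolean frozen = false;

    // Writes are rejected once the configuration has been frozen.
    synchronized void set(String name, String value) {
        if (frozen) {
            throw new IllegalStateException("configuration is frozen");
        }
        props.setProperty(name, value);
    }

    // Reads are always allowed.
    synchronized String get(String name) { return props.getProperty(name); }

    // One-way switch: there is deliberately no unfreeze().
    synchronized void freeze() { frozen = true; }

    synchronized boolean isFrozen() { return frozen; }
}
```

Once frozen, the object's contents can no longer change, so it can safely be shared or used as a map key.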
