Andrzej Bialecki wrote:
> Dennis Kubes wrote:
>> [..]
>> So there it is, a new Scoring framework and a new Indexing framework.
>> I believe these two pieces contribute significantly to improving the
>> relevancy in the current Nutch system. These two pieces are currently
>> in Jira as NUTCH-635. I hope to finish up comments, documentation,
>> and other small changes within the next few days and move this into
>> the Nutch core. If anybody has any questions or comments, feel free.
> Fantastic piece of work, Dennis! I'm heading for a week off, but as soon
> as I'm back I'm going to review this patch thoroughly. As we all know
> the current scoring system is wonky, and this patch promises to fix it
> so that it's finally usable and predictable ... Thank you!
> The big question from the point of view of release engineering is
> whether this patch is orthogonal enough so that it can be included in
> 1.0, so that people who are stuck with the current scoring system can
> still continue using it without major changes.
This patch doesn't break the current scoring or indexing systems, and the
two systems can be included and used side by side. People can:
1) use the old system as is
2) use the new scoring but not the new indexing, in which case they would
activate the scoring-link plugin
3) use the new scoring and indexing systems together
The rest of the pieces (generator, fetching, querying) should be unaffected.
> On the other hand, we could say that 1.0 should NOT be released without
> a revamped scoring system, because the current one is broken anyway ...
> Any thoughts on this?
I definitely think we shouldn't release 1.0 until scoring is fixed, and
I think this fixes it. I also think we should probably deprecate the old
systems in 1.0 and remove them in a later release. Ideally these two
new pieces are precursors to a Nutch 2 architecture made to work within
the 1.0 system.
Dennis