Jérôme Charron wrote:
jar. A short-term solution could be to move the core classes (which have no
dependencies on Nutch) to a new lib-plugin (lib-lang, for instance) and add a
dependency on this plugin in the language-identifier, so that this code could
be used as a standalone lib.
Are you OK with such changes?
Perhaps you could isolate the ngram-specific stuff into its own plugin and
the lang-id into another.
Or the other option would be (what I implemented some time ago)
something like this (as an ngram categorizer can also be used for other
interesting stuff):
A new package in core Nutch containing classes like:
NGramProfile - pretty much as is
Categorizer - generic configurable ngram categorizer, configure
profiles, ngram sizes etc.
CategorizerFactory - to get hold of different categorizers
In the LangId plugin you just get a correctly configured categorizer (set up
to use the language ngram profiles and predefined settings for ngram sizes
etc.) from the factory and tell it to do its job when needed.
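
To make the split concrete, here is a rough sketch of how the three classes
could fit together. It is not the code I implemented and not the existing
Nutch NGramProfile: the method names (add, frequency, categorize,
fromSamples) and the scoring (a plain dot product of relative ngram
frequencies) are assumptions chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for NGramProfile: counts character ngrams of a text. */
final class NGramProfile {
    final String name;
    private final Map<String, Integer> counts = new HashMap<>();
    private long total = 0;

    NGramProfile(String name) {
        this.name = name;
    }

    /** Adds every ngram of length minSize..maxSize found in the text. */
    void add(String text, int minSize, int maxSize) {
        String s = text.toLowerCase();
        for (int n = minSize; n <= maxSize; n++) {
            for (int i = 0; i + n <= s.length(); i++) {
                counts.merge(s.substring(i, i + n), 1, Integer::sum);
                total++;
            }
        }
    }

    /** Relative frequency of an ngram in this profile, 0.0 if unseen. */
    double frequency(String ngram) {
        return total == 0 ? 0.0 : counts.getOrDefault(ngram, 0) / (double) total;
    }

    Set<String> ngrams() {
        return counts.keySet();
    }
}

/** Generic categorizer: configured with reference profiles and ngram sizes. */
final class Categorizer {
    private final List<NGramProfile> profiles;
    private final int minSize;
    private final int maxSize;

    Categorizer(List<NGramProfile> profiles, int minSize, int maxSize) {
        this.profiles = profiles;
        this.minSize = minSize;
        this.maxSize = maxSize;
    }

    /** Returns the name of the best-scoring profile, or null if none is configured. */
    String categorize(String text) {
        NGramProfile input = new NGramProfile("input");
        input.add(text, minSize, maxSize);
        String best = null;
        double bestScore = -1.0;
        for (NGramProfile p : profiles) {
            // Assumed scoring: dot product of relative ngram frequencies.
            double score = 0.0;
            for (String g : input.ngrams()) {
                score += input.frequency(g) * p.frequency(g);
            }
            if (score > bestScore) {
                bestScore = score;
                best = p.name;
            }
        }
        return best;
    }
}

/** Factory that hands out categorizers preconfigured for a given purpose. */
final class CategorizerFactory {
    /** Builds a categorizer from (category name -> sample text) pairs. */
    static Categorizer fromSamples(Map<String, String> samples, int minSize, int maxSize) {
        List<NGramProfile> profiles = new ArrayList<>();
        for (Map.Entry<String, String> e : samples.entrySet()) {
            NGramProfile p = new NGramProfile(e.getKey());
            p.add(e.getValue(), minSize, maxSize);
            profiles.add(p);
        }
        return new Categorizer(profiles, minSize, maxSize);
    }
}
```

With something like this, the LangId plugin would only ask the factory for a
categorizer built from the language profiles and call categorize(text) on it;
any other plugin could reuse the same classes with its own profiles and ngram
sizes.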
--
Sami Siren