[ https://issues.apache.org/jira/browse/NUTCH-544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522041 ]

Doğacan Güney commented on NUTCH-544:
-------------------------------------

> Doğacan, would it be a problem if we threw in BeanShell and Dom4j JARs? We
> have been talking about this with Staszek -- this would allow us to
> instantiate clustering algorithms dynamically and would effectively provide
> alternatives for Nutch users to use Lingo, STC or Lingo3G (our commercial
> clusterer).
>
> I'm asking because I remember at the beginning there were concerns about the
> size of Nutch when compiled with all plugin dependencies etc.

I wouldn't want to comment on it, since I wasn't around during those 
discussions. However, as far as I am concerned, we can add those two JARs, 
because being able to use different clustering algorithms sounds useful to me. 
(Though I don't understand why we need BeanShell and Dom4j to provide 
alternatives. Can you elaborate a bit?)

> Same patch, but I added an optional parameter that allows custom clustering 
> processes to be used.

I took a quick look at the code but couldn't find it. Is this a configuration 
parameter (via nutch-site.xml)?


I am going to leave the patch here for a few days, then commit it if there 
are no objections.

> Upgrade Carrot2 clustering plugin to the newest stable release (2.1)
> --------------------------------------------------------------------
>
>                 Key: NUTCH-544
>                 URL: https://issues.apache.org/jira/browse/NUTCH-544
>             Project: Nutch
>          Issue Type: Improvement
>            Reporter: Dawid Weiss
>            Priority: Minor
>         Attachments: clustering-upgrade-2.1.patch, libs-packed.tar.gz
>
>
> This issue upgrades Carrot2 search results clustering plugin to the newest 
> stable version.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
