[ https://issues.apache.org/jira/browse/SOLR-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stanislaw Osinski updated SOLR-1804:
------------------------------------

    Attachment: SOLR-1804-carrot2-3.4.0-dev.patch

Hi,

As we're near the 3.4.0 release of Carrot2, I'm including a patch that upgrades the clustering plugin. The most notable changes are:

* [3.4.0] Carrot2 core no longer depends on Lucene APIs, so the {{build.xml}} can be enabled again. The only class that makes use of the Lucene API, {{LuceneLanguageModelFactory}}, is now included in the plugin's code, so there shouldn't be any problems with refactoring. In fact, I've already updated {{LuceneLanguageModelFactory}} to remove the use of deprecated APIs.
* [3.3.0] The STC algorithm has seen some [significant scalability improvements|http://project.carrot2.org/release-3.3.0-notes.html].
* [3.2.0] Carrot2 core no longer depends on LGPL libraries, so all the JARs can now be included in Solr SVN and SOLR-2007 won't need fixing.

Included is a patch against r966211. A ZIP with JARs will follow in a sec.

A couple of notes:

* The upgrade requires upgrading Google Collections to Guava. This is a drop-in replacement: all tests pass for me after the upgrade, and the upgrade is [recommended|http://code.google.com/p/google-collections/] on the original Google Collections site.
* The patch includes the Carrot2 3.4.0-dev JAR, but I guess it's worth committing already to avoid the library download hassle (SOLR-2007).
* Originally, Carrot2 supports clustering of Chinese content based on the Smart Chinese Tokenizer. This tokenizer would have to be referenced from the {{LuceneLanguageModelFactory}} class in Solr. However, when compiling the code in Ant, smartcn doesn't seem to be available on the classpath. Is it a matter of modifying the build files, or is it a policy on dependencies between plugins?

Let me know if you have any problems applying the patch.

Thanks!

S.
> Upgrade Carrot2 to 3.2.0
> ------------------------
>
>                 Key: SOLR-1804
>                 URL: https://issues.apache.org/jira/browse/SOLR-1804
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - Clustering
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>         Attachments: SOLR-1804-carrot2-3.4.0-dev.patch
>
>
> http://project.carrot2.org/release-3.2.0-notes.html
> Carrot2 is now LGPL free, which means we should be able to bundle the binary!