[ https://issues.apache.org/jira/browse/LUCENE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14192350#comment-14192350 ]
yuanyun.cn commented on LUCENE-4345:
------------------------------------

Hi Tommaso,

Thanks for your great work and contribution to the community :)

However, I can't find solr/contrib/classification in dev/trunk. Has it not been checked in yet?

I also read your presentation:
http://archive.apachecon.com/eu2012/presentations/08-Thursday/L1R-Lucene/aceu-2012-text-categorization-with-lucene-and-solr.pdf
and I like the automatic text categorization during indexing that it describes: the CategorizationUpdateRequestProcessorFactory.

Would it be possible to check that in to Solr as well?

Thanks.

> Create a Classification module
> ------------------------------
>
>                 Key: LUCENE-4345
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4345
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Tommaso Teofili
>            Assignee: Tommaso Teofili
>            Priority: Minor
>             Fix For: Trunk
>
>         Attachments: LUCENE-4345.patch, LUCENE-4345_2.patch, SOLR-3700.patch,
> SOLR-3700_2.patch
>
>
> Lucene/Solr can host huge sets of documents containing lots of information in
> fields so that these can be used as training examples (w/ features) in order
> to very quickly create classifiers algorithms to use on new documents and /
> or to provide an additional service.
> So the idea is to create a contrib module (called 'classification') to host a
> ClassificationComponent that will use already seen data (the indexed
> documents / fields) to classify new documents / text fragments.
> The first version will contain a (simplistic) Lucene based Naive Bayes
> classifier but more implementations should be added in the future.