Can you suggest detailed use cases? In my experience, explicit variable selection is not a common strategy in machine learning at scale. If anything, the use of regularizers has driven things in another direction entirely.
On Tue, Apr 2, 2013 at 10:34 AM, Claudio Reggiani <nop...@gmail.com> wrote:

> After one month I'd like to know if this new feature is interesting for
> Mahout, or I didn't get any reply because nobody noticed it. If it is not
> good enough I could first publish it on github on my account.
>
>
> 2013/3/6 Claudio Reggiani (JIRA) <j...@apache.org>
>
> > Claudio Reggiani created MAHOUT-1152:
> > ----------------------------------------
> >
> >              Summary: mRMR feature selection algorithm
> >                  Key: MAHOUT-1152
> >                  URL: https://issues.apache.org/jira/browse/MAHOUT-1152
> >              Project: Mahout
> >           Issue Type: Improvement
> >           Components: Integration
> >     Affects Versions: 0.7
> >             Reporter: Claudio Reggiani
> >             Priority: Minor
> >              Fix For: 0.8
> >
> >
> > Proposal Title: mRMR Feature Selection Algorithm on Map-Reduce.
> >
> > Student Name: Claudio Reggiani
> >
> > Student E-mail: nop...@gmail.com
> >
> > Proposal Abstract:
> >
> > The mRMR algorithm, described in [1], is a feature selection algorithm
> > that leverages mutual information evaluation to select features. At each
> > iteration, mRMR selects a new feature based on both how much it's strongly
> > correlated to the target output and how much it's less correlated to the
> > features already selected. The correlation is measured by means of mutual
> > information. The project proposes to provide the mRMR algorithm in
> > MapReduce programming framework.
> >
> > Additional information:
> >
> > 1. *The code is already available* with some tests, because I'm working on
> > my master thesis an initial milestone of my research was to implement mRMR
> > algorithm in MapReduce.
> > 2. I'm figuring out if it's possible for me to apply at Google Summer of
> > Code 2013.
> >
> > References:
> >
> > [1] Hanchuan Peng, Fuhui Long, and Chris Ding
> > IEEE Transactions on Pattern Analysis and Machine Intelligence,
> > Vol. 27, No. 8, pp.1226-1238, 2005.
> > Link: http://penglab.janelia.org/papersall/docpdf/2005_TPAMI_FeaSel.pdf
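
For anyone reading along, the greedy selection loop described in the quoted abstract can be sketched roughly as follows. This is a minimal single-machine illustration, not the proposed MapReduce code (which is not shown in this thread). The names mutual_information and mrmr are hypothetical, features are assumed to be discrete, and the score used here is relevance minus mean redundancy, which is only one way to combine the two terms.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log(p(x,y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * log(c * n / (px[x] * py[y]))
    return mi


def mrmr(features, target, k):
    """Greedily pick up to k feature names.

    features: dict mapping a feature name to a list of discrete values.
    target:   list of discrete target values, same length as each column.
    """
    # Relevance: mutual information between each feature and the target.
    relevance = {
        name: mutual_information(col, target) for name, col in features.items()
    }
    selected = []
    remaining = set(features)

    def score(name):
        if not selected:
            return relevance[name]
        # Redundancy: mean mutual information with already selected features.
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        # Sorting makes tie-breaking deterministic.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Toy data (made up for illustration): the target is determined by a and b;
    # a_dup is an exact copy of a and so adds nothing once a is selected.
    a = [0, 0, 1, 1, 0, 0, 1, 1]
    b = [0, 1, 0, 1, 0, 1, 0, 1]
    y = [2 * ai + bi for ai, bi in zip(a, b)]
    print(mrmr({"a": a, "a_dup": list(a), "b": b}, y, 2))
```

On the toy data, the duplicate column is as relevant as the original but fully redundant with it, so the second pick goes to the other informative feature. Each step recomputes pairwise mutual information against everything already selected, which is the part that would need distributing at scale.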