Yes, perfect. I'll look at those, begin reading there, and figure out next steps.
Thanks again for your help in starting this effort.
> Date: Wed, 22 Feb 2012 16:25:27 -0700
> From: [email protected]
> To: [email protected]
> Subject: Re: Helping out with the .7 release
>
> Hi Saikat,
>
> Glad you're excited. Paritosh offered one suggestion below. You could
> look at TestKmeansClustering for patterns you could use to test the
> ClusterClassificationMapper and Driver in MR mode. That should be
> straightforward, but please coordinate with Paritosh so you don't
> duplicate efforts.
>
> Another place you might look into would be the KMeansDriver and
> MAHOUT-930. You could work on refactoring KMeansDriver to use the new
> ClusterClassificationDriver in MAHOUT-929. That would exercise both its
> sequential and MR options. It will be interesting to see how much code
> can be removed.
>
> Finally, you could see if you can wrap your mind around the
> ClusterIterator and how it could be used for further refactoring of the
> KMeansDriver. See TestClusterClassifier for insight.
>
> That enough reading and doing for now?
> Jeff
>
> On 2/22/12 10:06 AM, Saikat Kanjilal wrote:
> > Jeff, I'm pretty excited to help out with this, so as a starter can you
> > point me to where I should begin my readings of the code? I haven't looked
> > too closely, but are there certain classes in the clustering area that this
> > refactoring effort is centered around?
> > Regards
> >
> >> Date: Wed, 22 Feb 2012 08:56:23 -0700
> >> From: [email protected]
> >> To: [email protected]
> >> Subject: Re: Helping out with the .7 release
> >>
> >> Hi Saikat,
> >>
> >> I agree with Paritosh, that a great place to begin would be to write
> >> some unit tests. This will familiarize you with the code base and help
> >> us a lot with our 0.7 housekeeping release. The new clustering
> >> classification components are going to unify many - but not all - of the
> >> existing clustering algorithms to reduce their complexity by factoring
> >> out duplication and streamlining their integration into semi-supervised
> >> classification engines.
> >>
> >> Please feel free to post any questions you may have in reading through
> >> this code. This is a major refactoring effort and we will need all the
> >> help we can get. Thanks for the offer,
> >>
> >> Jeff
> >>
> >> On 2/21/12 10:46 PM, Saikat Kanjilal wrote:
> >>> Hi Paritosh, yes, creating the test case would be a great first start.
> >>> However, are there other tasks you guys need help with that I can do
> >>> before the test creation? I will sync trunk and start reading through the
> >>> code in the meantime.
> >>> Regards
> >>>
> >>>> Date: Wed, 22 Feb 2012 10:57:51 +0530
> >>>> From: [email protected]
> >>>> To: [email protected]
> >>>> Subject: Re: Helping out with the .7 release
> >>>>
> >>>> We are creating clustering as classification components which will help
> >>>> in moving clustering out. Once the component is ready, then the
> >>>> clustering algorithms would need refactoring.
> >>>> The clustering as classification component and the outlier removal
> >>>> component have been created.
> >>>>
> >>>> Most of it is committed, and the rest is available as a patch. See
> >>>> https://issues.apache.org/jira/browse/MAHOUT-929
> >>>> If you apply the latest patch available on Mahout-929 you can see
> >>>> all that is available now.
> >>>>
> >>>> If you want, you can help with the test case of
> >>>> ClusterClassificationMapper available in the patch.
> >>>>
> >>>> On 22-02-2012 10:27, Saikat Kanjilal wrote:
> >>>>> Hi Guys, I was interested in helping out with the clustering component
> >>>>> of mahout. I looked through the JIRA items below and was wondering if
> >>>>> there is a specific one that would be good to start with:
> >>>>>
> >>>>> https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+MAHOUT+AND+resolution+%3D+Unresolved+AND+component+%3D+Clustering+ORDER+BY+priority+DESC&mode=hide
> >>>>>
> >>>>> I initially was thinking to work on Mahout-930 or Mahout-931 but could
> >>>>> work on others if needed.
> >>>>> Best Regards
> >>>
> >
>
