[ https://issues.apache.org/jira/browse/MAHOUT-145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Deneche A. Hakim updated MAHOUT-145:
------------------------------------

    Attachment: partial_Sep_15.patch

* DONE: no need to load the whole dataset in memory just to extract the labels, this should help when dealing with large datasets

> PartialData mapreduce Random Forests
> ------------------------------------
>
>                 Key: MAHOUT-145
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-145
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.2
>            Reporter: Deneche A. Hakim
>            Priority: Minor
>             Fix For: 0.2
>
>         Attachments: partial_August_10.patch, partial_August_13.patch,
> partial_August_15.patch, partial_August_17.patch, partial_August_19.patch,
> partial_August_2.patch, partial_August_24.patch, partial_August_27.patch,
> partial_August_31.patch, partial_August_9.patch, partial_Sep_15.patch
>
>
> This implementation is based on a suggestion by Ted:
> "modify the original algorithm to build multiple trees for different portions
> of the data. That loses some of the solidity of the original method, but
> could actually do better if the splits exposed non-stationary behavior."

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
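
The change noted in this update (extracting the labels without loading the whole dataset in memory) can be illustrated with a minimal sketch. This is not the code in partial_Sep_15.patch, and it is written in Python rather than Mahout's Java purely for brevity. It assumes a comma-separated text file with one record per line and the label in the last field; the names `iter_labels` and `count_labels` are hypothetical.

```python
import csv
import io
from collections import Counter
from typing import Iterable, Iterator


def iter_labels(lines: Iterable[str], label_index: int = -1) -> Iterator[str]:
    """Yield each record's label, reading one line at a time.

    Assumption: records are comma-separated and the label sits at
    `label_index` (the last field by default).
    """
    for row in csv.reader(lines):
        if not row:  # csv.reader yields an empty list for blank lines
            continue
        yield row[label_index].strip()


def count_labels(lines: Iterable[str], label_index: int = -1) -> Counter:
    """Count label occurrences; only the counts are held in memory."""
    return Counter(iter_labels(lines, label_index))


if __name__ == "__main__":
    # A small in-memory stand-in for a large file on disk.
    sample = io.StringIO("1.0,2.0,yes\n3.0,4.0,no\n\n5.0,6.0,yes\n")
    print(count_labels(sample))
```

For a real file, pass the open file object instead of the `StringIO` stand-in, for example `with open(path, newline="") as f: count_labels(f)`. Because the file object is consumed lazily, memory use depends on the number of distinct labels rather than on the number of records.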