Thanks, deneche. I'll try the commands and get back to you if I have more questions.

On Sun, Jul 17, 2011 at 2:00 PM, deneche abdelhakim <[email protected]> wrote:
> without the -p option it will use the In Memory variation: the dataset is
> fully loaded in memory on all the computing nodes
>
> without the -mr option Mahout will still use Hadoop's commands to access the
> files, but I think it won't require a Hadoop cluster if the file is not on
> HDFS; you'll have to give it a try though. But it's easy to set up Hadoop in
> local mode (just take a look at Hadoop's website)
>
> RandomForests use Hadoop's DistributedCache, a mechanism that can copy
> the data onto all computing nodes so that every mapper gets access to it. So
> yes, when using -mr without -p, Hadoop will copy the dataset onto all
> computing nodes
>
> one last piece of information: Mahout's RandomForests are not meant to be
> used without a real computing cluster. If you want to use RandomForests on a
> single machine, I think that Weka's implementation is more suited.
>
> On Sat, Jul 16, 2011 at 8:26 AM, XiaoboGu <[email protected]> wrote:
>
>> Hi,
>>
>> If I call BuildForest without the -p option, what algorithm is used?
>>
>> Regarding the -mr option of TestForest, there are two scenarios:
>> 1. If the -i option is supplied with an HDFS file or path URL, will Mahout
>> use Hadoop to do the classification even without the -mr option?
>> 2. If the -i option is supplied with a local file path, what will the -mr
>> option do: copy the file into the configured Hadoop cluster, or launch a
>> local Hadoop instance?
>>
>> Regards,
>>
>> Xiaobo Gu
