RE: Difference when we don't use partial implementation

2012-07-05 Thread Nowal, Akshay
Regards, Akshay Nowal | -Original Message- From: deneche abdelhakim [mailto:adene...@gmail.com] Sent: Thursday, July 05, 2012 11:23 AM To: user@mahout.apache.org Subject: Re: Difference when we don't use partial implementation Hi Akshay, when you don't use the -p parameter

Re: Difference when we don't use partial implementation

2012-07-05 Thread deneche abdelhakim
Hi Akshay, when you don't use the -p parameter, the builder loads the whole dataset in memory in every computing node, so every tree grown is trained on the whole dataset (of course

Difference when we don't use partial implementation

2012-07-04 Thread Nowal, Akshay
Hi All, I am running Decision Forest in Mahout. Below are the commands I used to run the algorithm. Info file: mahout org.apache.mahout.df.tools.Describe -p /user/an32665/KDD/KDDTrain+.arff -f /user/an32665/KDD/KDDTrain+.info -d N 3 C 2 N C 4 N C 8 N 2 C 19 N L Building
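
For anyone reading the descriptor string passed to -d above, here is a small sketch of how it can be expanded into one entry per column. It assumes the usual reading of the Describe descriptor: N is a numerical attribute, C is a categorical attribute, L is the label, and a number repeats the token that follows it. The function name expand_descriptor is made up for this illustration and is not part of Mahout.

```python
def expand_descriptor(descriptor):
    """Expand a compact descriptor such as 'N 3 C L' into one token per column.

    Assumption: a numeric token is a repeat count for the next type token.
    """
    attributes = []
    repeat = 1
    for token in descriptor.split():
        if token.isdigit():
            # Remember the multiplier for the next type token.
            repeat = int(token)
        else:
            attributes.extend([token] * repeat)
            repeat = 1
    return attributes


kinds = expand_descriptor("N 3 C 2 N C 4 N C 8 N 2 C 19 N L")
print(len(kinds))        # 42 columns in total
print(kinds.count("N"))  # 34 numerical
print(kinds.count("C"))  # 7 categorical
print(kinds.count("L"))  # 1 label
```

Under that reading the descriptor covers 42 columns, so it is worth checking that this matches the number of columns in the .arff file.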

Re: Difference when we don't use partial implementation

2012-07-04 Thread deneche abdelhakim
Hi Akshay, when you don't use the -p parameter, the builder loads the whole dataset in memory in every computing node, so every tree grown is trained on the whole dataset (of course using bagging to select a subset of it). When using -p, every computing node loads a part of the dataset (thus the
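
To make the difference concrete, here is a rough sketch in plain Python (not Mahout code) of where each tree's training sample comes from in the two modes. The function names, the even contiguous split into partitions, and the use of bagging inside each partition are assumptions for illustration only; the reply above only states that without -p every node holds the whole dataset, and with -p every node loads a part of it.

```python
import random


def bootstrap(rows, rng):
    """Draw a bagged sample: len(rows) draws with replacement."""
    return [rng.choice(rows) for _ in rows]


def in_memory_samples(dataset, num_trees, seed=0):
    """Without -p: every tree samples from the whole dataset."""
    rng = random.Random(seed)
    return [bootstrap(dataset, rng) for _ in range(num_trees)]


def partial_samples(dataset, num_partitions, trees_per_partition, seed=0):
    """With -p (assumed behaviour): each node sees only its own partition,
    so its trees can only sample rows from that partition."""
    rng = random.Random(seed)
    size = -(-len(dataset) // num_partitions)  # ceiling division
    partitions = [dataset[i:i + size] for i in range(0, len(dataset), size)]
    return [
        bootstrap(part, rng)
        for part in partitions
        for _ in range(trees_per_partition)
    ]


dataset = list(range(100))
full = in_memory_samples(dataset, num_trees=4)
part = partial_samples(dataset, num_partitions=4, trees_per_partition=1)
print([len(s) for s in full])  # each tree draws 100 rows from all 100
print([len(s) for s in part])  # each tree draws 25 rows from its own 25
```

In this sketch a tree built in partial mode never sees rows outside its partition, which is the practical difference to keep in mind when comparing results from the two runs.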