Re: a bug of fpgrowth?

2012-08-22 Thread tom pierce
Hello, Could you try re-running FP-Growth with the '-2' flag, and let us know if you have more success? This uses an alternate implementation of the FPGrowth algorithm; I have had problems similar to what you are seeing when using the default implementation. I am skeptical of the change yo

Re: Frequent itemset mining

2012-06-06 Thread tom pierce
> mapreduce methods) on a 48GB boxes. Am I doing something wrong? Should it
>> be minutes instead of seconds?
>> --
>> Alex K
>>
>> On Mon, Dec 5, 2011 at 12:50 PM, Isabel Drost wrote:
>>
>>> On 02.12.2011 Tom Pierce wrote:
>>>> These progr

Re: mahout FPGrowth problem

2012-05-29 Thread tom pierce
Hi Jens, Is it necessary to use itemsets with equal length? No - fixed-size itemsets are not required. Is it possible to use itemsets with duplicates in Mahout FPGrowth? Not reliably. This crash looks like it is caused by having more items in one particular itemset than in the set of items
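Since duplicates within an itemset are not handled reliably, one possible workaround is to remove them before the data reaches FPGrowth. The following is only a sketch of such a preprocessing step, not part of Mahout: the helper names `dedupe_transaction` and `dedupe_lines` are hypothetical, and the one-transaction-per-line, comma-separated input format is an assumption that may not match your data.

```python
def dedupe_transaction(items):
    """Return the items with duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


def dedupe_lines(lines, sep=","):
    """Yield each transaction line with repeated items dropped.

    Assumes one transaction per line, items separated by `sep`
    (an assumed format, adjust to whatever your input uses).
    """
    for line in lines:
        # Strip whitespace and skip empty tokens such as those from "a,,b".
        items = [tok.strip() for tok in line.split(sep) if tok.strip()]
        yield sep.join(dedupe_transaction(items))


if __name__ == "__main__":
    for cleaned in dedupe_lines(["a,b,a", "c, c ,d"]):
        print(cleaned)
```

After this step no itemset can contain more entries than there are distinct items, which is the condition the crash appears to depend on; whether that is enough to avoid the crash in a given setup would need to be checked.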

Re: Mahout fpg missing patterns

2011-12-19 Thread Tom Pierce
here a way for me to get all the patterns with support
> strictly greater than a particular value?
>
> Thanks
> Gaurav
>
> On Mon, Dec 19, 2011 at 4:58 PM, Tom Pierce wrote:
>
>> One possible explanation is that Mahout's FPG avoids reporting
>> patterns that are subs

Re: Mahout fpg missing patterns

2011-12-19 Thread Tom Pierce
One possible explanation is that Mahout's FPG avoids reporting patterns that are subsumed by others. For example, if you have pattern [a, b, c] with support 3, you clearly must also have [a, b], [b, c] and [a, c] with support >= 3. Mahout will not report any of those unless the support is strictl
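The subsumption idea in this message can be illustrated with a small brute-force sketch. This is not Mahout's implementation; it only shows, under the assumption that a pattern is dropped when some strict superset has support at least as high, why [a, b] might be missing from the output while [a, b, c] is present. The function names are hypothetical, and the exhaustive enumeration is only practical for tiny inputs.

```python
from itertools import combinations


def support_counts(transactions):
    """Count the support of every non-empty itemset (brute force, tiny inputs only)."""
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return counts


def unsubsumed_patterns(counts, min_support):
    """Keep frequent patterns whose support is strictly greater than that of
    every frequent strict superset (the assumed reporting rule)."""
    frequent = {p: s for p, s in counts.items() if s >= min_support}
    kept = {}
    for pattern, support in frequent.items():
        subsumed = any(
            pattern < other and other_support >= support
            for other, other_support in frequent.items()
        )
        if not subsumed:
            kept[pattern] = support
    return kept


if __name__ == "__main__":
    # [a, b, c] occurs three times; [a, b] occurs once more on its own.
    transactions = [["a", "b", "c"]] * 3 + [["a", "b"]]
    result = unsubsumed_patterns(support_counts(transactions), min_support=3)
    for pattern, support in sorted(result.items(), key=lambda kv: -len(kv[0])):
        print(sorted(pattern), support)
```

In this toy data [a, b] is kept because its support (4) is strictly greater than that of [a, b, c] (3), while [a, c] and [b, c] are dropped because their support equals that of [a, b, c].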

Re: Maximum number of categories in a Bayesian classifier

2011-12-02 Thread Tom Pierce
Hi, I've run into the same or a similar error; I've filed MAHOUT-911 with a set of Wikipedia categories you can use to trigger this condition using the Wikipedia/NaiveBayes example recipe (classifier application fails in either mapreduce or sequential mode). -tom On Wed, Nov 16, 2011 at 7:51 AM,

Re: Frequent itemset mining

2011-12-02 Thread Tom Pierce
These programs are actually exposed through the main mahout program; if you run: $MAHOUT_HOME/bin/mahout fpg it will run the Frequent Pattern Growth algorithm (aka frequent itemset mining). Running the command above will show you what parameters are required/available, including a switch to run i