Note that the trunk version has the FP-Bonsai implementation integrated.
You should see a substantial speed boost for long transactions (more than 10 items).

On Tue, Jan 19, 2010 at 10:28 AM, Robin Anil <[email protected]> wrote:

> Are you running it on trunk or on the 0.2 release version?
> Robin
>
> On Tue, Jan 19, 2010 at 9:43 AM, sej <[email protected]> wrote:
>
>>
>> Hello all,
>>
>> I am running PFP on a fairly large dataset and it works well for smaller
>> subsets of the data.  However, once I attempt larger samples, I run into
>> this error in the reducer phase:
>>
>> 1)  10/01/19 00:25:35 INFO mapred.JobClient: Task Id : attempt_, Status : FAILED
>> Task attempt_ failed to report status for 607 seconds. Killing!
>>
>> I've also noticed that only one reducer is launched for the FP-Tree mining
>> phase.  I've tried passing in -D mapred options, but it doesn't seem like
>> PFPGrowthJob supports them.  Is there any way I can increase the timeout,
>> heap size, and/or number of reducers without explicitly changing the code
>> and recompiling?
>>
> This wouldn't be the case unless you specify number of groups = 1.  Could you
> give some idea about your dataset?
>
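On the timeout and heap-size part of the question, here is a minimal sketch,
assuming a Hadoop 0.20-era setup where mapred.task.timeout (in milliseconds,
default 600000, which is consistent with the kill after roughly 600 seconds
above) and mapred.child.java.opts are read from the job configuration, for
example from mapred-site.xml. The helper name and the values are illustrative
only, and whether PFPGrowthJob picks these settings up is not confirmed
anywhere in this thread.

```python
import xml.etree.ElementTree as ET


def hadoop_properties_xml(props):
    """Render a dict of Hadoop properties as a <configuration> XML string.

    Hypothetical helper for illustration; not part of Hadoop or Mahout.
    """
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Example values only (assumptions, not recommendations from this thread).
overrides = {
    "mapred.task.timeout": 30 * 60 * 1000,  # 30 minutes, in milliseconds
    "mapred.child.java.opts": "-Xmx1024m",  # heap for each child task JVM
}

config_xml = hadoop_properties_xml(overrides)
print(config_xml)
```

The generated property elements would then be merged by hand into the
configuration file that the job client actually reads.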
>
>>
>> Also, from my understanding of the algorithm, as long as the number of
>> groups is higher than the number of features that are above min support,
>> each tree will be able to utilize the maximum available heap resources
>> because each feature will be guaranteed to be mined separately, is that
>> assumption correct?
>>
> No. The number of groups should always be lower than the number of features;
> otherwise it maxes out at the feature count, since there would be no features
> left to fill the extra groups.  How many features are you working with?
> As a rule of thumb, if the data is too large, try to keep 10-20 features
> per group, and set the number of groups accordingly.
>
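The rule of thumb above (10-20 features per group, with the group count capped
at the feature count) can be sketched as a small calculation. The function name
is hypothetical, and using 9091 (the "unique pruned items" figure from the log
further down) as the feature count is only an assumption for the sake of the
example.

```python
import math


def suggest_num_groups(num_features, features_per_group=15):
    """Suggest a group count so each group holds about features_per_group features.

    Hypothetical helper for illustration; not part of Mahout.
    """
    if num_features <= 0:
        raise ValueError("num_features must be positive")
    if features_per_group <= 0:
        raise ValueError("features_per_group must be positive")
    groups = math.ceil(num_features / features_per_group)
    # More groups than features is pointless: the extra groups would be empty.
    return min(groups, num_features)


# Assumed feature count, taken from the log only as an example.
num_features = 9091
fewest_groups = suggest_num_groups(num_features, features_per_group=20)
most_groups = suggest_num_groups(num_features, features_per_group=10)
print(fewest_groups, most_groups)
```

Under that assumption, any group count between the two printed values keeps
each group within the suggested 10-20 features.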
>
>
>> Thanks!
>> --sej
>>
>> p.s.
>> The last few log lines output for the failed reducer:
>> 2010-01-18 13:06:39,418 INFO org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth: Number of unique pruned items 9091
>> 2010-01-18 13:06:39,530 INFO org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth: FPTree Building: Read 10000 Transactions
>> 2010-01-18 13:06:39,649 INFO org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth: FPTree Building: Read 20000 Transactions
>> 2010-01-18 13:06:39,758 INFO org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth: FPTree Building: Read 30000 Transactions
>> 2010-01-18 13:06:39,774 INFO org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth: Number of Nodes in the FP Tree: 40904
>> 2010-01-18 13:06:39,775 INFO org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth: Mining FTree Tree for all patterns with 3393
>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/PFP---failed-to-report-status----of-reducers-tp27220725p27220725.html
>> Sent from the Mahout User List mailing list archive at Nabble.com.
>>
>>
>
>
>



-- 
------
Robin Anil
Blog: http://techdigger.wordpress.com
-------
Try out Swipeball for iPhone
Video: http://www.youtube.com/watch?v=3hvEbWHciwU
iTunes: http://itunes.com/apps/swipeball
