[ https://issues.apache.org/jira/browse/MAHOUT-157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robin Anil updated MAHOUT-157:
------------------------------

    Attachment: MAHOUT-157-Oct-1.patch

Finished the sequential version of FPGrowth. It may need some more documentation and cleanup.

Run ParallelFPGrowth in examples with the following settings:

s = min support
h = max size of heap
g = number of groups (used by the parallel version)

-i /home/robina/Desktop/accidents.dat -o output -s 3 -h 100 -g 1000 -method sequential

The accidents file is a transaction list: each line is one transaction, with features delimited by commas or tabs.
You can download a dataset from http://fimi.cs.helsinki.fi/data/
The dataset needs to be converted to comma-separated or tab-separated form. (Ensure there is a trailing comma at the end of each line.)

The output file will have the top h patterns for each frequent feature.

> Frequent Pattern Mining using Parallel FP-Growth
> ------------------------------------------------
>
>                 Key: MAHOUT-157
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-157
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Frequent Itemset/Association Rule Mining
>    Affects Versions: 0.2
>            Reporter: Robin Anil
>            Assignee: Robin Anil
>             Fix For: 0.2
>
>         Attachments: MAHOUT-157-August-17.patch, MAHOUT-157-August-24.patch,
>                      MAHOUT-157-August-31.patch, MAHOUT-157-August-6.patch,
>                      MAHOUT-157-Combinations-BSD-License.patch,
>                      MAHOUT-157-Combinations-BSD-License.patch,
>                      MAHOUT-157-inProgress-August-5.patch, MAHOUT-157-Oct-1.patch,
>                      MAHOUT-157-September-10.patch, MAHOUT-157-September-18.patch,
>                      MAHOUT-157-September-5.patch
>
>
> Implement: http://infolab.stanford.edu/~echang/recsys08-69.pdf
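
The conversion step mentioned in the comment (turning a downloaded dataset into comma-separated lines with a trailing comma) could be sketched roughly as below. This is not part of the patch. It assumes the downloaded file holds one transaction per line with items separated by whitespace; the function names and the script name are made up for illustration.

```python
import sys


def convert_line(line: str) -> str:
    """Turn one whitespace-separated transaction into comma-separated
    form with a trailing comma. Blank lines yield an empty string."""
    items = line.split()  # assumption: items are separated by spaces/tabs
    if not items:
        return ""
    return ",".join(items) + ","


def convert(src, dst) -> int:
    """Copy transactions from src to dst, skipping blank lines.
    Returns the number of transactions written."""
    count = 0
    for line in src:
        converted = convert_line(line)
        if converted:
            dst.write(converted + "\n")
            count += 1
    return count


# Hypothetical command-line use: python convert_fimi.py accidents.dat accidents.csv
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        convert(src, dst)
```

The converted file can then be passed as the -i argument in the settings above.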