FPGrowth/PFPGrowth giving out wrong results.
---------------------------------------------
Key: MAHOUT-617
URL: https://issues.apache.org/jira/browse/MAHOUT-617
Project: Mahout
Issue Type: Bug
Components: Frequent Itemset/Association Rule Mining
Affects Versions: 0.4
Environment: Mac OS X, Linux
Reporter: Vipul Pandey
PFPGrowth is producing wrong results on my data. Attached are:
- The input data
- The output (sequence file) generated by FPGrowth (PFPGrowth gives the same
results)
- Output as text
$ cat part-r-00000 | grep 1678807047
12 1678807047
38 1678807047 3159925415
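which says that the support (12) for the item (1678807047) is less than the
support (38) of a pair containing that item. That cannot be right: every
transaction that contains the pair also contains the item, so the item's
support must be at least as large as the pair's.
Another example: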
$ cat part-r-00000 | grep 1441690161
12 1441690161 3910019844
18 1604285941 1441690161 3910019844
75 1441690161
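The inconsistency can also be checked mechanically. The following is a minimal Python sketch, not part of Mahout, which assumes the text output has the layout shown above: a support count followed by whitespace-separated item IDs. The names `parse_line` and `find_violations` are made up for illustration. It reports every pattern whose support is lower than that of one of its supersets.

```python
def parse_line(line):
    """Split one text-output line into (support, frozenset of item IDs).

    Assumes the layout shown in the grep output: support first, items after.
    """
    fields = line.split()
    return int(fields[0]), frozenset(fields[1:])


def find_violations(lines):
    """Return (subset, subset_support, superset, superset_support) tuples
    where a proper subset is reported with LOWER support than its superset."""
    patterns = [parse_line(l) for l in lines if l.strip()]
    violations = []
    for sup_a, items_a in patterns:
        for sup_b, items_b in patterns:
            # items_a < items_b is the proper-subset test on frozensets
            if items_a < items_b and sup_a < sup_b:
                violations.append(
                    (sorted(items_a), sup_a, sorted(items_b), sup_b)
                )
    return violations


# The five lines quoted in this report
sample = [
    "12 1678807047",
    "38 1678807047 3159925415",
    "12 1441690161 3910019844",
    "18 1604285941 1441690161 3910019844",
    "75 1441690161",
]

for violation in find_violations(sample):
    print(violation)
```

On the five lines quoted here it flags two pairs: the single item against the 38-support pair, and the 12-support pair against the 18-support triple. The comparison is quadratic in the number of patterns, which is fine for spot checks but may be slow on a full output file.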
Runtime parameters:
-i baskets/part-r-00000 -o patterns -k 50 -method sequential -g 10 -regex '[\t]' -s 10
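To cross-check a reported support against the input, a brute-force count is enough. This is a rough Python sketch that assumes each input line is one basket whose items are separated by the same `[\t]` pattern passed via `-regex`; the function name `support` and the toy baskets are hypothetical and are not taken from the attached data.

```python
import re


def support(baskets, itemset, splitter=r"[\t]"):
    """Count the baskets (lines) that contain every item in `itemset`.

    `splitter` mirrors the -regex runtime parameter (assumed: tab-separated).
    """
    wanted = set(itemset)
    count = 0
    for line in baskets:
        items = set(re.split(splitter, line.rstrip("\n")))
        if wanted <= items:  # all wanted items are present in this basket
            count += 1
    return count


# Hypothetical toy baskets, only to show the expected relationship
toy = ["a\tb\tc\n", "a\tb\n", "b\tc\n", "a\n"]
print(support(toy, ["a"]))       # baskets containing a
print(support(toy, ["a", "b"]))  # baskets containing both a and b
```

For any correct miner the second count can never exceed the first. Running the same function over the lines of the attached input would give reference supports to compare with the FPGrowth output.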