That seems to make sense. What do you mean by "Mahout will not report any
of those unless the support is strictly greater than 3"? Is there a way for
me to get all the patterns with support strictly greater than a particular
value?
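
To make the question concrete, here is a minimal sketch of the subsumption
rule as I understand it from the quoted explanation. It is plain Python, not
Mahout's API, and it does not claim to reproduce Mahout's actual
implementation; the function names frequent_itemsets and drop_subsumed and
the toy transactions are made up for illustration. It enumerates every
itemset by brute force, then drops each pattern that has a strict superset
with the same support.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute force: map every itemset with support >= min_support to its support."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            # Support = number of transactions containing the whole candidate.
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                result[cand] = support
    return result


def drop_subsumed(patterns):
    """Keep a pattern only if no strict superset has the same support."""
    return {
        p: s
        for p, s in patterns.items()
        if not any(p < q and s == patterns[q] for q in patterns)
    }


# Hypothetical toy data: [a, b, c] occurs 3 times, [a, b] occurs 4 times.
transactions = [frozenset(t) for t in (
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
)]

all_patterns = frequent_itemsets(transactions, min_support=3)
reported = drop_subsumed(all_patterns)

print(len(all_patterns))  # 7 itemsets meet the threshold
for pattern, support in sorted(reported.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(pattern), support)
# ['a', 'b'] 4
# ['a', 'b', 'c'] 3
```

With this toy data, a full enumeration gives 7 patterns but only 2 survive
the filter: [a, b] is kept because its support (4) is strictly greater than
that of [a, b, c] (3), while [a, c] and [b, c] are dropped. A gap of this
kind could account for a difference like 65,000 versus 17,500.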

Thanks
Gaurav

On Mon, Dec 19, 2011 at 4:58 PM, Tom Pierce <t...@cloudera.com> wrote:

> One possible explanation is that Mahout's FPG avoids reporting
> patterns that are subsumed by others.
>
> For example, if you have pattern [a, b, c] with support 3, you clearly
> must also have [a, b], [b, c] and [a, c] with support >= 3.  Mahout
> will not report any of those unless the support is strictly greater
> than 3.
>
> Does that help explain your discrepancies?  If not, can you share an
> example data set along with a missed pattern?
>
> -tom
>
> On Mon, Dec 19, 2011 at 1:37 AM, gaurav singh <gauravonlin...@gmail.com>
> wrote:
> > Hi All,
> >
> > I am using Mahout from the repository on Ubuntu 10.04 and running it on a
> > data set of 1,472 rows. I am running it in sequential mode with k=200,000
> > and s=400. I have implemented FP-growth in PHP, but when I compare the
> > output of my implementation with that of Mahout's FPG, I find that Mahout's
> > output consists of just 17,500 patterns, whereas my implementation gives
> > around 65,000 unique patterns (I have verified their uniqueness!) for
> > the same support threshold. I have also verified my output against
> > the actual data set and found that all my patterns are correct
> > and do exist in the data set with the correct support values.
> >
> >
> > Can anyone please explain the reason?
> >
> > Thanks!!
> >
> > --
> > regards
> > Gaurav Singh
>



-- 
regards
Gaurav Singh
