What is the best way to tell if Apache code is being maintained, in
particular the fp-growth algorithm in Spark's MLlib?
My original intent (5 months ago) was to replace the MapReduce portion
of the fp-growth code with an alternative, though I wasn't sure what
that alternative should be.
My motivation for wanting frequent itemsets is that they are downward
closed (every subset of a frequent itemset is itself frequent), and
hence closed under intersection, so they form abstract simplicial
complexes. I've written software for mining simplicial complexes for
their geometry; more precisely, for their 2-dimensional persistent
homology. That means I can look at how the geometry changes as the
support and confidence parameters vary. I'm hoping to take at least
some of the guesswork out of choosing these parameters, which still
seems to be something of an open question.
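The closure claim above can be checked on a toy example. The sketch
below is a hypothetical stand-in, not Spark's FP-growth: a naive
exhaustive miner over made-up transactions, followed by assertions
that the frequent itemsets are downward closed and closed under
intersection, which is exactly the simplicial-complex structure.

```python
from itertools import combinations

# Toy transaction database (hypothetical example data).
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
]

def frequent_itemsets(transactions, min_support):
    """Naively enumerate all itemsets meeting min_support (absolute
    count). Fine for tiny data; FP-growth exists precisely because
    this approach is exponential in general."""
    items = set().union(*transactions)
    frequent = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(sorted(items), k):
            s = frozenset(candidate)
            count = sum(1 for t in transactions if s <= t)
            if count >= min_support:
                frequent[s] = count
    return frequent

freq = frequent_itemsets(transactions, min_support=3)

# Downward closure (the Apriori property): every nonempty subset of a
# frequent itemset is itself frequent ...
for s in freq:
    for k in range(1, len(s)):
        for sub in combinations(s, k):
            assert frozenset(sub) in freq

# ... and therefore the family is also closed under intersection,
# since an intersection of two frequent itemsets is a subset of each.
for s in freq:
    for t in freq:
        if s & t:
            assert (s & t) in freq
```

On this data with min_support=3, the frequent itemsets are the three
singletons {a}, {b}, {c} and the three pairs {a,b}, {a,c}, {b,c}, while
{a,b,c} falls short: the complex is the boundary of a triangle, which
is the kind of geometric object the homology software consumes.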
So for now I'll see whether Spark's implementation generates usable
frequent itemsets, have some fun learning Scala along the way, and
look into getting fp-growth running on top of Flink.
On 04/27/2015 07:59 AM, Ted Dunning wrote:
Ray,
Is the Spark implementation usable? Is it maintained? If not, there is
a decent reason to move forward.
I don't think that we want to revive the old map-reduce implementation.
On Mon, Apr 27, 2015 at 5:48 AM, ray <rtmel...@gmail.com
<mailto:rtmel...@gmail.com>> wrote:
I had it in mind to volunteer to maintain the fp-growth code in
Mahout, but I see that Spark has an fp-growth implementation. So
now that I have the time to work on this, I'm wondering if there is
any point, or if there is still any interest in the Mahout community.
If not, so be it. If so, I volunteer.
Regards, Ray.