Re: bringing back the fp-growth code in mahout

2015-04-29 Thread Ted Dunning
On Mon, Apr 27, 2015 at 8:13 PM, ray rtmel...@gmail.com wrote:

 What is the best way to tell if Apache code is being maintained, in
 particular the fp-growth algorithm in Spark's MLlib?


Ask on the appropriate mailing list.


Re: bringing back the fp-growth code in mahout

2015-04-27 Thread Ted Dunning
Ray,

Is the Spark implementation usable?  Is it maintained?  If not, there is a
decent reason to move forward.

I don't think that we want to revive the old map-reduce implementation.



On Mon, Apr 27, 2015 at 5:48 AM, ray rtmel...@gmail.com wrote:

 I had it in mind to volunteer to maintain the fp-growth code in Mahout,
 but I see that Spark has an fp-growth implementation.  So now that I have
 the time to work on this, I'm wondering if there is any point, or if there
 is still any interest in the Mahout community.

 If not, so be it.  If so, I volunteer.

 Regards, Ray.



Re: bringing back the fp-growth code in mahout

2015-04-27 Thread ray
What is the best way to tell if Apache code is being maintained, in 
particular the fp-growth algorithm in Spark's MLlib?


My original intent (5 months ago) was to replace the map-reduce portion 
of the fp-growth code with an alternative, though I wasn't sure what 
that alternative should be.


My motivation for wanting frequent itemsets is that they are closed with 
respect to intersections, so they form simplicial complexes.  I've 
written software for mining simplicial complexes for their geometry; 
more precisely, for their 2-dimensional persistent homology.  That means 
I can look at how the geometry changes as both the support and 
confidence parameters vary.  I'm hoping to take at least some of the 
guesswork out of choosing good values for these parameters, which still 
seems to be something of an open question.
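The closure property above follows from the downward closure (anti-monotonicity) of support: every subset of a frequent itemset is itself frequent, so the family of frequent itemsets is an abstract simplicial complex, and in particular is closed under intersections.  A minimal sketch in Python (toy data and a brute-force miner, purely for illustration; not the Mahout or Spark implementation):

```python
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]

def support(itemset, transactions):
    """Fraction of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of all frequent itemsets (fine at toy scale;
    fp-growth exists precisely to avoid this exhaustive search)."""
    items = sorted(set().union(*transactions))
    result = []
    for k in range(1, len(items) + 1):
        for cand in combinations(items, k):
            if support(frozenset(cand), transactions) >= min_support:
                result.append(frozenset(cand))
    return result

freq = frequent_itemsets(transactions, min_support=0.6)
family = set(freq)

# Downward closure: the intersection of two frequent itemsets is a subset
# of each, hence frequent -- so the family forms a simplicial complex.
for x in freq:
    for y in freq:
        inter = x & y
        assert not inter or inter in family
```

Here {a, b}, {a, c}, and {b, c} all clear the 0.6 support threshold while {a, b, c} does not, and every nonempty intersection of frequent itemsets lands back in the family.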


So for now I'll see if Spark's implementation generates usable frequent 
itemsets, have some fun learning Scala, and see about maybe getting 
fp-growth running on top of Flink.



On 04/27/2015 07:59 AM, Ted Dunning wrote:


Ray,

Is the Spark implementation usable?  Is it maintained?  If not, there is
a decent reason to move forward.

I don't think that we want to revive the old map-reduce implementation.



On Mon, Apr 27, 2015 at 5:48 AM, ray rtmel...@gmail.com wrote:

I had it in mind to volunteer to maintain the fp-growth code in
Mahout, but I see that Spark has an fp-growth implementation.  So
now that I have the time to work on this, I'm wondering if there is
any point, or if there is still any interest in the Mahout community.

If not, so be it.  If so, I volunteer.

Regards, Ray.