Re: Spark FP-Growth algorithm for frequent sequential patterns

2015-06-28 Thread Xiangrui Meng
Hi Ping,

FYI, we just merged Feynman's PR, https://github.com/apache/spark/pull/6997, which adds sequential pattern support. Please check out the master branch and help test. Thanks!

Best,
Xiangrui

On Wed, Jun 24, 2015 at 2:16 PM, Feynman Liang fli...@databricks.com wrote:
There is a JIRA for this

Re: Spark FP-Growth algorithm for frequent sequential patterns

2015-06-23 Thread Xiangrui Meng
This is on the wish list for Spark 1.5. Assuming that the items within the same transaction are distinct, we can still follow FP-Growth's steps:

1. find frequent items
2. filter transactions and keep only frequent items
3. do NOT order by frequency
4. use suffix to partition the transactions
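To make the four steps concrete, here is a minimal single-machine sketch in plain Python. It is only one reading of the outline above, not the MLlib implementation or the code from any PR. The function name `mine_sequences`, the `min_count` parameter, and the sample click paths are all hypothetical. It assumes, as stated above, that items within a transaction are distinct, and it additionally assumes that a pattern may match with gaps between its items.

```python
from collections import Counter


def mine_sequences(transactions, min_count, suffix=()):
    """Return {pattern: support} for ordered patterns that end with `suffix`.

    Hypothetical helper: assumes items within a transaction are distinct.
    """
    # Step 1: count, for each item, how many transactions contain it.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {item for item, c in counts.items() if c >= min_count}

    # Steps 2 and 3: drop infrequent items but keep the original order
    # (no reordering by frequency, unlike classic FP-Growth).
    filtered = [[item for item in t if item in frequent] for t in transactions]

    patterns = {}
    for item in frequent:
        pattern = (item,) + suffix
        patterns[pattern] = counts[item]
        # Step 4: partition by suffix item. For every transaction that
        # contains `item`, keep only what precedes it, then recurse.
        projected = [t[:t.index(item)] for t in filtered if item in t]
        projected = [t for t in projected if t]
        patterns.update(mine_sequences(projected, min_count, pattern))
    return patterns


# Made-up click paths, for illustration only.
clicks = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"], ["A", "B"]]
result = mine_sequences(clicks, min_count=2)
```

With these sample paths and `min_count=2`, `("A", "B")` is reported with support 3 while `("B", "A")` is not reported at all, because it occurs in only one path. That is the order sensitivity the original question asks for. In a distributed setting, each suffix partition in step 4 could be mined independently, but that is not shown here.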

Spark FP-Growth algorithm for frequent sequential patterns

2015-06-19 Thread ping yan
Hi, I have a use case where I'd like to mine frequent sequential patterns (consider the clickpath scenario), where transaction A-B is not the same as transaction B-A. From what I understand about FP-Growth in general and the MLlib implementation of it, the order of items is not preserved. Anyone can provide