[ https://issues.apache.org/jira/browse/SPARK-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377234#comment-14377234 ]
Xiangrui Meng commented on SPARK-6487:
--------------------------------------

[~Zhang JiaJin] I'm not very familiar with pattern mining, but I don't see many citations of the paper you mentioned. So I need more information to understand the importance/popularity of sequential pattern mining and whether truly scalable algorithms exist. If there are not many requests for this feature, or if there are no scalable algorithms, you can certainly register your implementation as a third-party package on spark-packages.org and maintain it outside Spark for users.

> Add sequential pattern mining algorithm to Spark MLlib
> ------------------------------------------------------
>
>                 Key: SPARK-6487
>                 URL: https://issues.apache.org/jira/browse/SPARK-6487
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Zhang JiaJin
>
> [~mengxr] [~zhangyouhua]
> Sequential pattern mining is an important branch of pattern mining. In
> our past work, we used sequence mining (mainly the PrefixSpan algorithm)
> to find telecommunication signaling sequence patterns, and achieved good
> results. But once the data is too large, the running time is too long and
> cannot even meet the service requirements. We are ready to implement the
> PrefixSpan algorithm in Spark and apply it to our subsequent work.
> The related paper: "Distributed PrefixSpan algorithm based on MapReduce".