[ https://issues.apache.org/jira/browse/SPARK-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiangrui Meng updated SPARK-6487: --------------------------------- Shepherd: Xiangrui Meng Target Version/s: 1.5.0 > Add sequential pattern mining algorithm to Spark MLlib > ------------------------------------------------------ > > Key: SPARK-6487 > URL: https://issues.apache.org/jira/browse/SPARK-6487 > Project: Spark > Issue Type: New Feature > Components: MLlib > Reporter: Zhang JiaJin > > [~mengxr] [~zhangyouhua] > Sequential pattern mining is an important branch in the pattern mining. In > the past the actual work, we use the sequence mining (mainly PrefixSpan > algorithm) to find the telecommunication signaling sequence pattern, achieved > good results. But once the data is too large, the operation time is too long, > even can not meet the the service requirements. We are ready to implement the > PrefixSpan algorithm in spark, and applied to our subsequent work. > The related Paper: > PrefixSpan: > Pei, Jian, et al. "Mining sequential patterns by pattern-growth: The > prefixspan approach." Knowledge and Data Engineering, IEEE Transactions on > 16.11 (2004): 1424-1440. > Parallel Algorithm: > Cong, Shengnan, Jiawei Han, and David Padua. "Parallel mining of closed > sequential patterns." Proceedings of the eleventh ACM SIGKDD international > conference on Knowledge discovery in data mining. ACM, 2005. > Distributed Algorithm: > Wei, Yong-qing, Dong Liu, and Lin-shan Duan. "Distributed PrefixSpan > algorithm based on MapReduce." Information Technology in Medicine and > Education (ITME), 2012 International Symposium on. Vol. 2. IEEE, 2012. > Pattern mining and sequential mining Knowledge: > Han, Jiawei, et al. "Frequent pattern mining: current status and future > directions." Data Mining and Knowledge Discovery 15.1 (2007): 55-86. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org