[ 
https://issues.apache.org/jira/browse/SPARK-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377234#comment-14377234
 ] 

Xiangrui Meng commented on SPARK-6487:
--------------------------------------

[~Zhang JiaJin] I'm not very familiar with pattern mining, but I don't see many 
citations of the paper you mentioned. I need more information to understand 
how important and popular sequential pattern mining is, and whether truly 
scalable algorithms exist. If there are not many requests for this feature, or 
if no scalable algorithm exists, you can certainly register your 
implementation as a third-party package on spark-packages.org and maintain it 
outside Spark for users.

> Add sequential pattern mining algorithm to Spark MLlib
> ------------------------------------------------------
>
>                 Key: SPARK-6487
>                 URL: https://issues.apache.org/jira/browse/SPARK-6487
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Zhang JiaJin
>
> [~mengxr] [~zhangyouhua]
> Sequential pattern mining is an important branch of pattern mining. In 
> our past work, we used sequence mining (mainly the PrefixSpan algorithm) to 
> find telecommunication signaling sequence patterns, and achieved good 
> results. However, once the data becomes too large, the running time is too 
> long and cannot even meet the service requirements. We are ready to implement 
> the PrefixSpan algorithm in Spark and apply it to our subsequent work. 
> Related paper: "Distributed PrefixSpan algorithm based on MapReduce".


