[ https://issues.apache.org/jira/browse/SPARK-9578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-9578:
-----------------------------------

    Assignee:     (was: Apache Spark)

> Stemmer feature transformer
> ---------------------------
>
>                 Key: SPARK-9578
>                 URL: https://issues.apache.org/jira/browse/SPARK-9578
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: Joseph K. Bradley
>            Priority: Minor
>
> Transformer mentioned first in [SPARK-5571] based on suggestion from
> [~aloknsingh]. Very standard NLP preprocessing task.
> From [~aloknsingh]:
> {quote}
> We have one scala stemmer in scalanlp%chalk
> https://github.com/scalanlp/chalk/tree/master/src/main/scala/chalk/text/analyze
> which can easily copied (as it is apache project) and is in scala too.
> I think this will be better alternative than lucene englishAnalyzer or
> opennlp.
> Note: we already use the scalanlp%breeze via the maven dependency so I think
> adding scalanlp%chalk dependency is also the options. But as you had said we
> can copy the code as it is small.
> {quote}
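
For readers unfamiliar with the preprocessing step the issue asks for, here is a minimal sketch of what a stemming transformer does to a column of tokens. It is an illustration only: the suffix rules are a toy assumption, not the Porter algorithm and not the stemmer in scalanlp%chalk, and the names `SUFFIXES`, `stem`, and `stem_tokens` are hypothetical, not part of any Spark API.

```python
from typing import List

# Toy suffix rules, checked in order. This is an assumption made for
# illustration; a real stemmer (e.g. Porter) uses a much larger rule set.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(token: str) -> str:
    """Lowercase the token and strip the first matching suffix,
    provided at least three characters of stem remain."""
    word = token.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_tokens(tokens: List[str]) -> List[str]:
    """Stem every token in a list, the way a feature transformer
    would map one row of a tokenized-text column to a stemmed row."""
    return [stem(t) for t in tokens]


if __name__ == "__main__":
    print(stem_tokens(["Running", "jumped", "boxes", "cats", "is"]))
```

The sketch maps a list of tokens to a list of stems, which matches the usual position of a stemmer in a pipeline: after tokenization and before term counting or hashing.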