spark ml - ngram - how to preserve single word (1-gram)

Nirav Patel Tue, 08 Nov 2016 14:42:07 -0800

Is it possible to preserve the single tokens (1-grams) when using the NGram
feature transformer?


e.g.

Array("Hi", "I", "heard", "about", "Spark")

Becomes

Array("Hi", "I", "heard", "about", "Spark", "Hi I", "I heard",
"heard about", "about Spark")

Currently, to do this I have to transform the column manually with the
existing NGram implementation and then append the 1-gram tokens to each row's
value. Basically, I have to do this outside of the pipeline.
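
To make the desired output concrete, here is a minimal plain-Scala sketch of
the combining step, with no Spark dependency. The names NGramRange and
ngramsUpTo are hypothetical and are not part of Spark. The assumption is that
this logic could be wrapped in a UDF or in a custom Transformer (for example
one extending org.apache.spark.ml.UnaryTransformer) so that it runs as a
pipeline stage, but that wiring is not shown or tested here.

```scala
object NGramRange {
  // Hypothetical helper (not a Spark API): returns every n-gram of the input
  // for n from minN to maxN, ordered by n and then by position.
  def ngramsUpTo(tokens: Seq[String], minN: Int, maxN: Int): Seq[String] =
    (minN to maxN).flatMap { n =>
      // sliding(n) emits one short window when there are fewer than n tokens,
      // so keep only the windows that are exactly n tokens long.
      tokens.sliding(n).filter(_.length == n).map(_.mkString(" "))
    }

  def main(args: Array[String]): Unit = {
    val tokens = Seq("Hi", "I", "heard", "about", "Spark")
    // prints: Hi, I, heard, about, Spark, Hi I, I heard, heard about, about Spark
    println(ngramsUpTo(tokens, 1, 2).mkString(", "))
  }
}
```

Calling ngramsUpTo(tokens, 1, 2) keeps the original tokens and appends the
bigrams, which matches the array shown above.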

