Thanks! Is that standard practice or do people typically serialize their
encoders and then load the binaries later?
On Wed, Jan 7, 2015 at 5:25 PM, Ted Dunning ted.dunn...@gmail.com wrote:
On Wed, Jan 7, 2015 at 2:20 PM, chirag lakhani chirag.lakh...@gmail.com
wrote:
I am trying to vectorize text data for a Naive Bayes classifier that will be
trained in Hadoop; the corresponding model will then be deployed in a
Java app. My basic approach is to tokenize a string of text using
Lucene and then encode each token using a StaticWordValueEncoder; here are
the
In the Mahout in Action book I got the impression that the term name will
seed the random number generator, and I wanted to confirm that this means I
will have consistency if I deploy this vectorizer in both my Hadoop job and
my Java app.
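The property being asked about here is that a hash-based encoder derives each token's vector index purely from the token text (plus the encoder's configured name and feature-vector size), with no per-run random state, so two processes that construct the encoder the same way produce identical vectors. The sketch below illustrates that idea in Python; it is only an illustration of determinism, not Mahout's actual scheme (StaticWordValueEncoder uses its own MurmurHash-based probing internally):

```python
import hashlib

def encode_token(token, num_features=1000):
    """Map a token to a feature index deterministically.

    The index depends only on the token text and the feature-vector
    size -- there is no random state to serialize, so any JVM or
    process that uses the same configuration gets the same mapping.
    (Illustrative only; not Mahout's exact hashing algorithm.)
    """
    digest = hashlib.md5(token.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_features

# The same token lands in the same slot in every run, on every machine,
# which is why the encoder itself need not be serialized -- only its
# configuration (name and vector size) must match between trainer and app.
assert encode_token("hadoop") == encode_token("hadoop")
```

This is why re-instantiating the encoder with the same name and cardinality on the deployment side is sufficient, as opposed to serializing the trained encoder object and shipping the binary.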