Thanks! Is that standard practice or do people typically serialize their encoders and then load the binaries later?
On Wed, Jan 7, 2015 at 5:25 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> On Wed, Jan 7, 2015 at 2:20 PM, chirag lakhani <chirag.lakh...@gmail.com>
> wrote:
>
> > In the Mahout in Action book I got the impression that the term "memo" will
> > seed the random number generator and I wanted to confirm that means I will
> > have consistency if I deploy this vectorizer in both my Hadoop environment
> > as well as my Java app. In particular, I am fixing the vector size to be
> > of length FEATURES and I am using "memo" as the name of my encoder. Will
> > those two things guarantee consistency of my text vectorization?
>
> It should do.
>
> Anything else would be a bug (which is, of course, possible)
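
For anyone following the thread, here is a rough sketch of why a fixed encoder name plus a fixed vector size should give the same vectors in both environments. It is plain Java and deliberately does not use Mahout: the class HashedEncoderSketch, its encode method, and the use of String.hashCode are all assumptions made for illustration, not Mahout's actual hashing. The only point is that the slot for a token depends on nothing but the encoder name, the token, and the vector length.

```java
import java.util.Arrays;

// Illustrative only: a hypothetical stand-in for a hashed feature encoder.
// Mahout's real encoders use their own hash function; this sketch does not.
class HashedEncoderSketch {

    // Adds 1.0 to the slot chosen by hashing (encoderName, token).
    // String.hashCode is specified by the Java language, so the result
    // is the same on every JVM.
    static double[] encode(String encoderName, String[] tokens, int features) {
        double[] vector = new double[features];
        for (String token : tokens) {
            int hash = (encoderName + ":" + token).hashCode();
            // floorMod keeps the index non-negative even for negative hashes.
            int index = Math.floorMod(hash, features);
            vector[index] += 1.0;
        }
        return vector;
    }

    public static void main(String[] args) {
        String[] tokens = {"hello", "world"};
        // Two independent "deployments" with the same name and size.
        double[] hadoopSide = encode("memo", tokens, 100);
        double[] appSide = encode("memo", tokens, 100);
        System.out.println(Arrays.equals(hadoopSide, appSide));
    }
}
```

Under these assumptions nothing needs to be serialized: constructing the encoder again with the same name and the same vector length reproduces the same mapping. Changing either the name or the length would generally move tokens to different slots.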