Thanks! Is that standard practice or do people typically serialize their encoders and then load the binaries later?
On Wed, Jan 7, 2015 at 5:25 PM, Ted Dunning <[email protected]> wrote:

> On Wed, Jan 7, 2015 at 2:20 PM, chirag lakhani <[email protected]>
> wrote:
>
> > In the Mahout in Action book I got the impression that the term "memo" will
> > seed the random number generator and I wanted to confirm that means I will
> > have consistency if I deploy this vectorizer in both my Hadoop environment
> > as well as my Java app. In particular, I am fixing the vector size to be
> > of length FEATURES and I am using "memo" as the name of my encoder. Will
> > those two things guarantee consistency of my text vectorization?
> >
>
> It should do.
>
> Anything else would be a bug (which is, of course, possible)
>
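
For reference, here is a minimal sketch of why a fixed encoder name plus a fixed vector size should give the same vectors in both environments. It is not Mahout's implementation: the function names, the MD5-based hash, and the value of FEATURES are all hypothetical, chosen only to illustrate hashed feature encoding. The assumption is that the slot for a term depends on nothing but the encoder name, the term itself, and the vector length, so no state has to be shared between the Hadoop job and the Java app.

```python
import hashlib

FEATURES = 16  # hypothetical size; the thread only refers to it as FEATURES


def hashed_index(encoder_name, term, num_features):
    """Pick a slot from the encoder name, the term, and the vector size only."""
    key = (encoder_name + ":" + term).encode("utf-8")
    digest = hashlib.md5(key).digest()  # stable across processes and machines
    return int.from_bytes(digest[:8], "big") % num_features


def vectorize(encoder_name, tokens, num_features=FEATURES):
    """Build a count vector by adding 1.0 at each token's hashed slot."""
    vec = [0.0] * num_features
    for tok in tokens:
        vec[hashed_index(encoder_name, tok, num_features)] += 1.0
    return vec


# Two independent calls stand in for the two deployments.
hadoop_side = vectorize("memo", ["mahout", "in", "action"])
app_side = vectorize("memo", ["mahout", "in", "action"])
```

Under these assumptions the two vectors are identical because nothing is learned or stored. Changing either the encoder name or the vector size on one side would generally break that agreement.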
