Hi Stanley,

Can you help with this: You might encode the features as vectors and serialize them to the file system with MapReduce to reduce the cost of data parsing.
I have also started a new thread at http://mail-archives.apache.org/mod_mbox/mahout-dev/201108.mbox/%3cCACOCgckzcAm4V8y3CQhnBWtUy9jVgAbKzE1R+z6zpQAF=8x...@mail.gmail.com%3e

> Best wishes,
> Stanley Xu