Yep, it's all in memory -- it would be too slow to access it out of Mongo.
The purpose is just to make it easy to read and re-read the data from Mongo,
and to facilitate updates.
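
For context, here is a minimal sketch of the kind of in-memory structure
that gets built -- roughly what MongoDBDataModel assembles internally,
though this example constructs a GenericDataModel directly. The user/item
IDs and preference values are made up purely for illustration:

    import java.util.Arrays;
    import org.apache.mahout.cf.taste.impl.common.FastByIDMap;
    import org.apache.mahout.cf.taste.impl.model.GenericDataModel;
    import org.apache.mahout.cf.taste.impl.model.GenericPreference;
    import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray;
    import org.apache.mahout.cf.taste.model.DataModel;
    import org.apache.mahout.cf.taste.model.PreferenceArray;

    public class InMemoryModelSketch {
      public static void main(String[] args) throws Exception {
        // The core map: user ID -> that user's preferences, all held in RAM.
        FastByIDMap<PreferenceArray> userData =
            new FastByIDMap<PreferenceArray>();
        // Made-up example: user 1 rated items 101 and 102.
        userData.put(1L, new GenericUserPreferenceArray(Arrays.asList(
            new GenericPreference(1L, 101L, 4.0f),
            new GenericPreference(1L, 102L, 3.5f))));
        DataModel model = new GenericDataModel(userData);
        System.out.println("users in memory: " + model.getNumUsers());
      }
    }

Everything the recommender touches at runtime lives in that map, which is
why memory, not Mongo access speed, is the binding constraint.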

If the data is too big to fit in memory, you should first look at pruning
your data -- can sampling 10% of it still give you good results?
If not, you're in Hadoop territory and would want to look at a
distributed algorithm.
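
A hypothetical sampling pass could be as simple as the following -- the
user IDs here are placeholders, and loadUserFromMongo is a made-up stand-in
for whatever query fills your preference map, not a real Mahout API:

    import java.util.Random;

    public class SamplingSketch {
      public static void main(String[] args) {
        Random random = new Random(42L); // fixed seed keeps the sample repeatable
        long[] allUserIDs = {1L, 2L, 3L, 4L, 5L}; // placeholder for your Mongo user IDs
        for (long userID : allUserIDs) {
          // Keep roughly 10% of users; drop the rest before building the model.
          if (random.nextDouble() < 0.1) {
            System.out.println("keeping user " + userID);
            // loadUserFromMongo(userID); // hypothetical helper that loads this user's prefs
          }
        }
      }
    }

Sampling whole users (rather than individual preferences) keeps each kept
user's preference history intact, which matters for item-based similarity.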

On Sun, Mar 18, 2012 at 8:12 PM, Mridul Kapoor <mridulkap...@gmail.com> wrote:

> Hi,
> I am up for building an item-based recommender using Mahout. I have a
> humongous amount of data in a MongoDB collection, but I am not sure that
> the MongoDBDataModel provided with Mahout will be able to handle my case. I
> see that in the buildModel() function, it creates a
>
> > FastByIDMap<Collection<Preference>> userIDPrefMap = new FastByIDMap<Collection<Preference>>();
> >
> [line 556]
> Does the subsequent code create an in-memory model of the data from the
> MongoDB collection (which I think it does)? If yes, is there currently any
> alternative to that?
>
> Thanks
> Mridul
>
