Are you using the in-memory recommender (not the Hadoop version)? Note that 
it may not scale well.

The in-memory and Hadoop versions of the recommender *require* user and item 
IDs to be non-negative contiguous integers. You must map your IDs to Mahout-IDs 
and back again. Inside Mahout *only* Mahout-IDs are used. 
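
As a minimal sketch of that mapping (a hypothetical helper, not part of Mahout; 
the class and method names are made up for illustration), something like the 
following assigns each external ID, e.g. a MongoDB ObjectId string, the next 
unused non-negative integer and remembers the reverse lookup:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical helper, not a Mahout class: maps external string IDs to
 * contiguous non-negative longs (0, 1, 2, ...) and back again.
 */
final class IdDictionary {
    // external ID -> assigned Mahout-ID
    private final Map<String, Long> toMahout = new HashMap<>();
    // position in the list == Mahout-ID, value == external ID
    private final List<String> toExternal = new ArrayList<>();

    /** Returns the Mahout-ID for an external ID, assigning the next free integer on first sight. */
    synchronized long toMahoutId(String externalId) {
        Long existing = toMahout.get(externalId);
        if (existing != null) {
            return existing;
        }
        long next = toExternal.size();
        toMahout.put(externalId, next);
        toExternal.add(externalId);
        return next;
    }

    /** Returns the external ID for a Mahout-ID that was handed out earlier. */
    synchronized String toExternalId(long mahoutId) {
        if (mahoutId < 0 || mahoutId >= toExternal.size()) {
            throw new IllegalArgumentException("Unknown Mahout-ID: " + mahoutId);
        }
        return toExternal.get((int) mahoutId);
    }
}
```

You would keep one dictionary for users and one for items, translate IDs on 
the way in, and translate the recommended item IDs on the way out. The sketch 
keeps everything in memory, so the dictionaries would have to be persisted 
somewhere (a MongoDB collection, for example) if the IDs need to stay stable 
across restarts.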

I'm not sure what you are asking about “indexes”.

BTW, the new Spark-based recommender in the Mahout v1.0 snapshot has no such 
restriction on user and item IDs. See the description here: 
http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html It is 
much easier to use with MongoDB, especially if you index certain document 
fields with Solr, which the new recommender requires in order to deliver 
recommendations.

On Feb 11, 2015, at 6:15 AM, 黄雅冠 <[email protected]> wrote:

Hi!

I am using Mahout item-based recommendation with MongoDB. I have been playing
around with it and have several questions.

  - How can I persist the recommender model from memory to disk? I know it
    is an old question and there are already several discussions, such as
    this one:
    <http://mail-archives.apache.org/mod_mbox/mahout-user/201112.mbox/%3ccanq80da42nfr8p5mt-qnbo-ycaxyfrbskyoefairdzyrdy-...@mail.gmail.com%3E>
    The conclusion there was that I have to do it myself. I am just wondering
    whether anything has been implemented in the years since.

  - Should I create indexes on the collection that provides the preference
    data? Reading the source, I found queries on the collection by
    (user_id, item_id), (user_id), and (item_id). Also, when refresh is
    called, it scans the whole collection to find new data, so presumably
    (create_at) as well. Would I benefit from ensuring indexes on these
    fields? If so, which indexes should I create?

  - From what I understand, I can use refreshData to achieve an event-driven
    refresh. That is, when an event occurs (a user rates an item), I can
    call refresh to update the model. This would perform better and keep the
    model up to date. Am I right?

Thanks!

— hyg
