Thanks for the reply.

Yes, I am using the in-memory version because of the learning curve for a
beginner.

The index I mentioned is a MongoDB index on the collection. Will it be
quicker if I ensure some indexes before training?

I use Maven to manage the project. Is 1.0 accessible via Maven? Is it a
beta version or a stable one?
On Feb 12, 2015, at 3:05 AM, "Pat Ferrel" <[email protected]> wrote:

> You are using the in-memory recommender (not Hadoop version)? Note that
> this may not scale well.
>
> The in-memory and Hadoop versions of the recommender *require* user and
> item IDs to be non-negative contiguous integers. You must map your IDs to
> Mahout-IDs and back again. Inside Mahout *only* Mahout-IDs are used.
>
> Not sure what you are asking about “indexes”
>
> BTW the new Spark-Mahout v1.0 snapshot version of the recommender has no
> such restriction on user and item IDs. See a description here:
> http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html
> It is much easier to use with MongoDB especially if you index certain
> document fields with Solr, which it requires to deliver recommendations.
>
> On Feb 11, 2015, at 6:15 AM, 黄雅冠 <[email protected]> wrote:
>
> Hi !
>
> I am using Mahout item-based recommendation with MongoDB. I have played
> around with it and have several questions.
>
>   -
>
>   How do I persist the recommender model from memory to disk? I know it is
>   an old question and there have already been several discussions, such as
>   this one
>   <http://mail-archives.apache.org/mod_mbox/mahout-user/201112.mbox/%3ccanq80da42nfr8p5mt-qnbo-ycaxyfrbskyoefairdzyrdy-...@mail.gmail.com%3E>.
>   The conclusion was that I have to do it myself. I am just wondering
>   whether anything has been implemented in the two years since?
>   -
>
>   Is it better to set indexes on the collection (the one that provides the
>   preference data)? I read the source and found some queries on the
>   collection, such as (user_id, item_id), (user_id), (item_id). Also, when
>   refresh is called, it scans the whole collection to find the new data, so
>   (create_at) as well. Would I benefit from ensuring indexes on these
>   fields? If yes, which indexes should I ensure?
>   -
>
>   From what I understand, I can use refreshData to achieve an event-driven
>   refresh. That is, when an event occurs (a user rates an item), I can call
>   refresh to update the model. This is better for performance and keeps
>   the model up to date. Am I right?
>
> Thanks!
>
> — hyg
>
>
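
For the ID mapping Pat describes above (external IDs to Mahout-IDs and back
again), a minimal sketch in plain Java might look like the following. The
class name IdMapper and its methods are hypothetical and not part of Mahout;
the sketch assumes external IDs are strings (for example MongoDB ObjectId hex
strings) and that the whole mapping fits in memory. It does not persist the
mapping, so after a restart the IDs would have to be assigned again in the
same order or loaded from storage.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical helper (not a Mahout class): assigns each external string ID
 * a contiguous non-negative long, starting at 0, and maps it back again.
 */
final class IdMapper {
    // external ID -> assigned long ID
    private final Map<String, Long> toMahout = new HashMap<>();
    // assigned long ID (used as list index) -> external ID
    private final List<String> fromMahout = new ArrayList<>();

    /** Returns the ID already assigned to externalId, or assigns the next free one. */
    public synchronized long toMahoutId(String externalId) {
        Long existing = toMahout.get(externalId);
        if (existing != null) {
            return existing;
        }
        long next = fromMahout.size();
        toMahout.put(externalId, next);
        fromMahout.add(externalId);
        return next;
    }

    /** Returns the external ID for a previously assigned long ID. */
    public synchronized String toExternalId(long mahoutId) {
        if (mahoutId < 0 || mahoutId >= fromMahout.size()) {
            throw new IllegalArgumentException("Unknown ID: " + mahoutId);
        }
        return fromMahout.get((int) mahoutId);
    }

    /** Number of distinct external IDs seen so far. */
    public synchronized int size() {
        return fromMahout.size();
    }

    public static void main(String[] args) {
        IdMapper users = new IdMapper();
        long id = users.toMahoutId("507f1f77bcf86cd799439011");
        System.out.println(id + " <-> " + users.toExternalId(id));
    }
}
```

Users and items would normally each get their own IdMapper instance, so that
both ID ranges start at 0 and stay contiguous. Map the IDs on the way into
the recommender, and map the recommended item IDs back before returning them
to the application.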
