Re: Universal Recommender. How to rank items returned by query on three types of indicators?

2017-02-06 Thread Pat Ferrel

> is the dot product of the
> normalized vectors

:-)

On Feb 5, 2017, at 2:35 PM, Andrew Evans wrote:

Careful: the claim that dot products are sometimes called “cosine” is false.
Cosine = x.dot(y) / (norm(x) * norm(y)). That is not x.dot(y) unless the
product of the norms is 1 (for example, when both vectors have unit length).


Re: Universal Recommender. How to rank items returned by query on three types of indicators?

2017-02-05 Thread Andrew Evans

Careful: the claim that dot products are sometimes called “cosine” is false.
Cosine = x.dot(y) / (norm(x) * norm(y)). That is not x.dot(y) unless the
product of the norms is 1 (for example, when both vectors have unit length).
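
A small numeric check of the distinction, as a sketch only. The vectors are
made up for illustration and the helper names (dot, norm, cosine) are not
from any library:

```python
import math

def dot(x, y):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Euclidean (L2) norm."""
    return math.sqrt(dot(x, x))

def cosine(x, y):
    """Cosine of the angle between x and y."""
    return dot(x, y) / (norm(x) * norm(y))

# Hypothetical vectors, each with norm 5.
x = [3.0, 4.0]
y = [4.0, 3.0]

raw_dot = dot(x, y)      # not a cosine: it grows with vector length
cos_xy = cosine(x, y)    # always lies in [-1, 1]

# After scaling both vectors to unit length, the dot product IS the cosine.
x_unit = [a / norm(x) for a in x]
y_unit = [b / norm(y) for b in y]
unit_dot = dot(x_unit, y_unit)
```

So the two statements in this thread are compatible: the raw dot product and
the cosine differ in general, and they coincide once the vectors are
normalized.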


Re: Universal Recommender. How to rank items returned by query on three types of indicators?

2017-02-05 Thread Pat Ferrel

Nice, someone does read the math :-)

Content: The personalized “content” indicators discussed in the slides are not
supported by the Universal Recommender, and they have little value unless you
have no collaborative filtering (CF) data. In theory they can be mixed with
other indicators, but you need a history of the content a user has preferred in
some way, and that history can itself be treated as CF data. So that part of
the theory is useful only in very specific edge cases such as personalized
news, where most stories do not get enough events to use for CF. If this is
your case we can talk more. Most people have CF data, so content cannot be used
in this way, but it can be used as “intrinsic” indicators.

Intrinsic: These are things like categories, tags, subjects, and even derived
indicators like LDA topics or popularity. They are attached to items as
metadata. The UR supports them in several ways, including boosts and filters.
Imagine an ecom use case where a user is looking at a piece of “clothing”. At
the bottom of the page you show “people who bought this also bought these”, but
you want only clothing, not the occasional video or electronics item. The
things at the bottom of the page are “item-based” recommendations; they are not
personalized here but could be, it does not matter. The point is that, of all
recommendations, you want to show only items that have “category”:
[“clothing”]. So if you have attached this “intrinsic” indicator to items, you
can query for item-based or user-based recs with category: clothing. You can
filter out all recommendations that do not have the category, or you can boost
items that have it; both are done by changing the “bias” value in the query.
See this page: http://actionml.com/docs/ur_queries
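
To make the filter-versus-boost difference concrete, here is a toy sketch in
plain Python. It is not the UR query API. The function apply_bias, the item
ids, and the scores are all hypothetical, and the convention that a negative
bias means "filter" while a positive bias is a score multiplier is an
assumption that should be checked against the query docs linked above:

```python
def apply_bias(hits, category, bias):
    """Toy model of one biased field.

    hits: list of (item_id, score, categories) tuples.
    ASSUMPTION: bias < 0 filters, bias > 0 multiplies matching scores.
    """
    if bias < 0:
        # Filter: drop every hit that lacks the category.
        kept = [(item, score) for item, score, cats in hits if category in cats]
    else:
        # Boost: multiply the score of matching hits, keep the rest as-is.
        kept = [
            (item, score * bias if category in cats else score)
            for item, score, cats in hits
        ]
    # Re-rank by the adjusted score, best first.
    return sorted(kept, key=lambda hit: hit[1], reverse=True)

# Hypothetical unbiased results for "people who bought this also bought".
hits = [
    ("dvd-1", 3.0, {"video"}),
    ("shirt-1", 2.0, {"clothing"}),
    ("phone-1", 1.5, {"electronics"}),
    ("jeans-1", 1.0, {"clothing"}),
]

filtered = apply_bias(hits, "clothing", bias=-1)  # clothing only
boosted = apply_bias(hits, "clothing", bias=10)   # clothing first, rest kept
```

With the filter, only the two clothing items survive. With the boost, the
clothing items move to the top but the video and electronics items are still
available further down the list.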

Collaborative Filtering based indicators: These are based on any action, bit of
context, or profile info that you think may relate to the user’s taste or
preferences. They are more correctly called indicators when they are gathered,
but they then go through a correlation test that checks whether the individual
events appear to correlate with the conversion/primary event. After the test we
call them correlators, and they are attached to items. So CF correlators of
several types may be attached to each item along with the intrinsic
correlators.
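
The message above does not name the correlation test. As far as I know,
Mahout's cooccurrence code uses Dunning's log-likelihood ratio (LLR) for this,
so the sketch below assumes LLR. The function names and the example counts are
made up for illustration; this is not the Mahout source:

```python
import math

def xlogx(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)

def entropy(*counts):
    """Unnormalized entropy term used by the LLR formula."""
    return xlogx(sum(counts)) - sum(xlogx(c) for c in counts)

def llr(k11, k12, k21, k22):
    """Log-likelihood ratio for a 2x2 contingency table.

    k11: users with both the secondary event and the primary event
    k12: users with the secondary event only
    k21: users with the primary event only
    k22: users with neither
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (row + col - mat))

# Hypothetical counts: events that are independent of each other ...
independent = llr(10, 10, 10, 10)
# ... versus events that almost always occur together.
correlated = llr(100, 0, 0, 100)
```

A score near zero means the secondary event tells you nothing about the
primary event, so it would not be kept as a correlator. A large score means
the cooccurrence is unlikely to be chance.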

The Universal Recommender creates a model in which every item, with all of its
CF and intrinsic correlators attached, is stored in a Lucene index. The index
allows very fast, scalable KNN queries (using cosine similarity). So when you
ask the UR for user-based recommendations for user-1, we look up the recent
events of user-1 and use them to make a KNN query to Lucene (inside
Elasticsearch) for items that have similar correlators. If you ask for
user-based recommendations but bias or boost clothing by 10, the UR will
internally multiply the hit score for “clothing” by 10 and re-rank all results.
This means that “clothing” will be favored in results, but if there are no recs
for clothing, other types of recs may be returned.

Scores: These are literally the sum of the “dot products” of all indicators,
with boosts accounted for. Dot products are sometimes called “cosine” since the
cosine of the angle between two vectors is the dot product of the normalized
vectors. If you refer back to the slides, each indicator is a vector, and
multiplying one vector by the entire matrix gives one dot product per item.
Summing those dot products across indicators gives the score for every item.
Lucene actually does this, but it makes use of special indexing and the
sparseness of the data and query. So the result from Lucene is the items that
are the K Nearest Neighbors to the indicator vectors in the query. Conceptually
Lucene does this for all items in the index, but it skips 99% of them and
distributes queries to produce the answer very quickly. The math in the slides
shows what you would get if you did the matrix math for all data. If you
paginated and returned all recommendations you would get exactly the results in
the slides, but all you care about are the top k, hence KNN.
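
The equivalence described above can be sketched with a toy model in plain
Python. Everything here is hypothetical: the item ids, the two indicator
names, and the helper functions. Correlators are treated as binary vectors, so
a dot product is just an overlap count. Real Lucene scoring also applies term
weighting and normalization, which this sketch ignores:

```python
from collections import defaultdict

# Model: item -> indicator name -> set of correlator ids (hypothetical data).
items = {
    "shirt-1": {"purchase": {"jeans-1", "belt-1"}, "view": {"jeans-1"}},
    "jeans-1": {"purchase": {"shirt-1"}, "view": {"shirt-1", "belt-1"}},
    "dvd-1": {"purchase": {"dvd-2"}, "view": {"phone-1"}},
    "phone-1": {"purchase": set(), "view": {"dvd-1"}},
}

# Query: the user's recent history, one binary vector per indicator.
history = {"purchase": {"belt-1", "jeans-1"}, "view": {"shirt-1"}}

def score_all(items, history):
    """Full matrix math: sum of per-indicator dot products for EVERY item."""
    return {
        item: sum(len(fields.get(name, set()) & ids) for name, ids in history.items())
        for item, fields in items.items()
    }

def build_index(items):
    """Inverted index: (indicator, correlator id) -> items that contain it."""
    index = defaultdict(set)
    for item, fields in items.items():
        for name, ids in fields.items():
            for correlator in ids:
                index[(name, correlator)].add(item)
    return index

def score_sparse(index, history):
    """Same scores, but only items sharing a term with the query are touched."""
    scores = defaultdict(int)
    for name, ids in history.items():
        for correlator in ids:
            for item in index.get((name, correlator), ()):
                scores[item] += 1
    return dict(scores)

def top_k(scores, k):
    """Best k items with a non-zero score; ties broken by item id."""
    ranked = sorted(
        ((score, item) for item, score in scores.items() if score > 0),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [item for _, item in ranked[:k]]

full = score_all(items, history)
sparse = score_sparse(build_index(items), history)
```

Both paths give the same top-k list, but the inverted-index path never visits
the items that share nothing with the query, which is the sense in which most
of the index is skipped.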

TLDR: After the model is created with Mahout, the last phase of the matrix
math, finding the most similar items, is done inside Elasticsearch, so one
query returns the top-ranked results. The scores can be explained (by the math
you read) but are of no real use; only the rank matters.

BTW the CCO algorithm is partly implemented in Mahout, with the last phase in
Elasticsearch, and you can get community support for the Universal Recommender
here: https://groups.google.com/forum/#!forum/actionml-user


Universal Recommender. How to rank items returned by query on three types of indicators?

2017-02-05 Thread Peng Zhang

Hi,

Suppose we have created three types of indicators (cooccurrence, content and
intrinsic) and indexed them into Elasticsearch (ES). Then we query on
these three types of indicators for a user to get recommended items. How
does the Universal Recommender rank the items recommended based on these three
types of indicators?

I have gone through the slides on the Universal Recommender created by Pat.
They are very informative. Here is the link:
https://www.slideshare.net/mobile/pferrel/unified-recommender-39986309

Thanks
-Peng
