TT’ does not solve cold start because you still need user history for personalization. There are several other techniques, which I’ve mentioned many times on the list, that help with cold start, but TT’ is for a slightly different thing. Its use is when you have a user’s history of item preferences but those items are too old to recommend and you only want to recommend new ones that have no history. News is close to being like this, as are patent applications, legal opinions, or judgments. For it to be helpful there needs to be a lot of content for each item, and you must only want new things recommended.
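To make the TT’ case concrete, here is a minimal sketch in Python with NumPy. It is not the Mahout implementation: the toy matrix, the function names, and the use of plain cosine similarity are all assumptions made for illustration. Rows of T are items, columns are content tokens, and the user's history covers only old items while the candidates are new items with no usage data.

```python
import numpy as np

# Hypothetical toy data: rows are items, columns are content tokens (tags, terms).
# Items 0-2 are old items the user may have history with; items 3-4 are new.
T = np.array([
    [1, 1, 0, 0, 0],  # item 0
    [0, 1, 1, 0, 0],  # item 1
    [0, 0, 0, 1, 1],  # item 2
    [1, 1, 1, 0, 0],  # item 3 (new, no usage history)
    [0, 0, 0, 1, 0],  # item 4 (new, no usage history)
], dtype=float)


def item_similarity(T):
    """Item-item content similarity, i.e. a normalized TT'.

    Cosine similarity is an assumption chosen for simplicity.
    """
    norms = np.linalg.norm(T, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid dividing by zero for items with no content
    unit = T / norms
    return unit @ unit.T


def recommend_new_items(T, h, new_items, k=1):
    """Score every item against the user's history h, then rank only the new items."""
    scores = item_similarity(T) @ h
    ranked = sorted(new_items, key=lambda i: scores[i], reverse=True)
    return ranked[:k], scores


# The user read items 0 and 1; both are too old to recommend again.
h = np.array([1, 1, 0, 0, 0], dtype=float)
top, scores = recommend_new_items(T, h, new_items=[3, 4], k=1)
print(top)
```

The point of the sketch is that user history is still required: with an all-zero h every score is zero, which is why this does not help with new users.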
Which cold start do you need to “solve”: new anonymous users with no history, or items with no conversions? Search the PIO list and AML group for past posts on this.

Tag use is implemented as both CF and content similarity (not TT’). If you ask for item-based recommendations and the item has no conversions, you will get popular items by default. If you boost items with the same tags as the item the user is looking at, you get popular items, mostly with similar tags. If you disable the popularity part you get items with similar tags. This requires that you attach tags to the items with $set, and your query should contain the tags (or any other properties) of the example item.

There are many ways of mixing this. You could also just get recs and mix in new inventory by some small random amount. You can use different placements for these so you aren’t ruining recs with too many randomized cold items. Anyway, the best way to do this depends on your GUI and data.

On Jun 4, 2017, at 11:35 AM, Marius Rabenarivo <[email protected]> wrote:

I didn't mean to tell you what it means; I just wanted to make it clear for my part.

As I understand it, the T part is a personalization that we should make if we want to use content-based information when doing recommendation. For my use case, I want to use it to overcome the cold start problem. I was thinking that it was already implemented, as you documented it in the slides, but I didn't find tag use in the code.

Is it SimilarityAnalysis.rowSimilarity() in Mahout that implements TT'? (just to confirm)

2017-06-04 22:06 GMT+04:00 Pat Ferrel <[email protected]>:

No offense Marius, but I wrote the slides and the equation so I do indeed know what they are saying.

Whether a user writes a tag or you are detecting the user's preference for a tag you wrote, they are user indicators of preference.
The LLR filtering of these secondary indicators is what CCO is all about, and it leaves you with a model that can be compared to a user’s history and contains only indicators that correlate with some conversion behavior.

T in the "whole enchilada" is used to personalize content-based recommendations. Each row of T represents an item and its content as tokens. Tokens are stemmed, tokenized text terms, or can be entities in the item’s text (found using some form of NLP), or tags, etc. TT’ then gives you, for each item, the items that are most similar in terms of whatever content you were using in T. Now you take the user’s history of content item preference (which articles they read, for instance) and find the most similar items in TT’. These will be personalized content-based recommendations.

This is not implemented in the UR but is in the CCO tools in Mahout. The reason it is not implemented is that it still requires user history, and content-based recs are worse predictors than collaborative filtering with user history. In CF you treat the terms or tags as indicators of preference; you do not find items similar by content.

The personalized content-based recs may serve for edge conditions where recommending items with no usage behavior is the most common case, like news articles, where you have new items all the time with no usage events. In this case extracting something better than “bag-of-words” for content is quite important, so highly detailed user tagging or NLP techniques can greatly increase the quality of results.

On Jun 4, 2017, at 4:09 AM, Marius Rabenarivo <[email protected]> wrote:

IMHO, T represents an Anonymous tag (or property) labeling task, and what you propose is Personalized tag (or property) labeling, as described in https://arxiv.org/pdf/1203.4487.pdf (Section 1.4.5, "Emerging new classification", p. 40).

2017-06-04 8:14 GMT+04:00 Marius Rabenarivo <[email protected]>:

And what is the T in the slides for? How can we implement it if it is not implemented yet?

2017-06-04 8:11 GMT+04:00 Pat Ferrel <[email protected]>:

By purchasing an item with a tag that you have given it, they are displaying a preference for that tag.

On Jun 3, 2017, at 12:36 PM, Marius Rabenarivo <[email protected]> wrote:

So the tag here is assumed to be a tag given by the user to an item? I was thinking that it was some kind of tag we give to the item by some means (classification, LDA, etc.).

2017-06-03 21:14 GMT+04:00 Pat Ferrel <[email protected]>:

A = history of all purchases (in the e-com case)
B = history of all tag preferences
r = [A’A]h_a + [A’B]h_b

The part in the slides about content-based recs is not needed here because you have captured them as user preferences.

On Jun 2, 2017, at 7:22 PM, Marius Rabenarivo <[email protected]> wrote:

Please correct "side" to "size" in my previous e-mail.

2017-06-03 6:14 GMT+04:00 Marius Rabenarivo <[email protected]>:

What will be the size of the matrix if we send an event like tag-pref? We will get a |U| x |T| matrix, I think (where T is the set of all tags). So [AtA] will be a |T| x |T| matrix and we will do a dot product with the user history hT to get recommendations, right?

I was assuming that A should be of side |U| x |I|, where I is the set of all items, as it should be added to the other terms of the whole enchilada formula afterwards.

Thank you for your guidance Pat.

2017-06-02 21:35 GMT+04:00 Pat Ferrel <[email protected]>:

Please refer to the documents. The “event” is the name of the type of event or indicator of preference; it implies the type of the targetEntityId. So a “tag-pref” event would be accompanied by a targetEntityId = tag-id.
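As a rough illustration of that event shape, the following Python sketch builds one “tag-pref” event per tag and groups the events into arrays of at most 50, the REST batch limit mentioned in this thread. The helper names, the "tag" targetEntityType, and the sample IDs are assumptions made for the example; nothing is sent over the network.

```python
import json

MAX_BATCH = 50  # per-request limit for an array of events, as stated in this thread


def tag_pref_events(user_id, tags, event_name="tag-pref"):
    """Build one preference event per tag; the event name implies the target type."""
    return [
        {
            "event": event_name,
            "entityType": "user",
            "entityId": user_id,
            "targetEntityType": "tag",  # assumed label for the target type
            "targetEntityId": tag,
        }
        for tag in tags
    ]


def batches(events, size=MAX_BATCH):
    """Split a list of events into consecutive chunks of at most `size`."""
    return [events[i:i + size] for i in range(0, len(events), size)]


# Hypothetical user and tag IDs.
events = tag_pref_events("user-1", [f"tag-{n}" for n in range(120)])
payloads = [json.dumps(batch) for batch in batches(events)]
print(len(payloads), "request bodies")
```

Each string in payloads would be the body of one batch request; how it is posted depends on your event server setup and SDK.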
Sending such an event is separate from attaching “tag” properties to items with the $set event for use with filter and boost rules. One looks at the data as a possible preference indicator and the other is used to restrict results. This is why we usually name events so they sound like a user preference of some type, whereas item property values are simply item attributes, intrinsic to the items and independent of any individual user. The event can have any name that makes sense to you.

On Jun 2, 2017, at 9:19 AM, Marius Rabenarivo <[email protected]> wrote:

So, the event field should be the token and targetEntityId the item ID, right?

2017-06-02 20:07 GMT+04:00 Pat Ferrel <[email protected]>:

Yes, each is analyzed separately as a separate event. If you are using REST you can send up to 50 events in a single array. Some SDKs may support this too.

On Jun 2, 2017, at 8:56 AM, Marius Rabenarivo <[email protected]> wrote:

So I have to send an event like category-preference for each tag associated with an item, right?

entityId: user-id
event: category-preference
targetEntityId: tag/token

2017-06-02 19:47 GMT+04:00 Pat Ferrel <[email protected]>:

When a user expresses a preference for a tag, word, or term, as in search or even in content like descriptions, these can be considered secondary events. The most useful are tags and search terms in our experience. Content can be used, but each term/token needs to be sent as a separate preference. Search phrases can be used too, though again turning them into tokens may be better.

Please look through the docs here: http://actionml.com/docs/ur or the slide deck here: https://www.slideshare.net/pferrel/unified-recommender-39986309

The major innovation of CCO, the algorithm behind the UR, is the use of these cross-domain indicators.
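For the matrix-size question quoted further down, a small NumPy sketch of r = [A’A]h_a + [A’B]h_b may help. It uses raw cooccurrence counts and skips the LLR filtering that CCO applies, so it only illustrates the shapes, not the real scoring; the toy data and the function name are made up for the example.

```python
import numpy as np

# Hypothetical toy data: 3 users, 4 items, 2 tags.
A = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]], dtype=float)  # |U| x |I| purchases (primary event)
B = np.array([[1, 0],
              [1, 0],
              [0, 1]], dtype=float)        # |U| x |T| tag preferences (secondary event)


def cco_scores(A, B, h_a, h_b):
    """r = [A'A]h_a + [A'B]h_b using raw counts (no LLR filtering, a simplification)."""
    AtA = A.T @ A  # |I| x |I|: items cooccurring with items
    AtB = A.T @ B  # |I| x |T|: items cooccurring with tags
    return AtA @ h_a + AtB @ h_b  # length |I|: one score per item


h_a = np.array([1, 0, 0, 0], dtype=float)  # this user bought item 0
h_b = np.array([1, 0], dtype=float)        # this user showed a preference for tag 0
r = cco_scores(A, B, h_a, h_b)
print(r.shape)
```

Because the cross-occurrence term is A’B rather than B’B, both terms are indexed by items, so the tag history h_b (length |T|) still produces item scores that can be added to the purchase term.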
These indicators are not guaranteed to predict conversions, but the CCO algo tests them and weights them low if they do not. So we tend to test the strength of prediction of the entire category of indicator and drop it if weak, or set a minLLR threshold and filter weak individual indicators out.

Technically these are not called latent; that has another meaning in Machine Learning, having to do with Latent Factor Analysis.

On Jun 1, 2017, at 11:26 PM, Marius Rabenarivo <[email protected]> wrote:

Hello everyone!

Do you have an idea of how to use latent information associated with items, like tags or word vector embeddings, in Mahout's SimilarityAnalysis.cooccurrences?

Regards,
Marius

--
You received this message because you are subscribed to the Google Groups "actionml-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/actionml-user/CAC-ATVEO_YON-5E95iPJjBR-FUgEv8TQsOA0rtD-xg0u-tNA_g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
