Thanks again for the answer. I will read the paper soon. How can recommendations be configured for content-based filtering (based on item properties) for products that have never been sold, instead of falling back to e.g. popular items?
Boosting with these properties is done with itemBias.

> On 24 May 2017, at 17:54, Pat Ferrel <[email protected]> wrote:
>
> I split answers in 2 since the config is a completely separate thing.
>
> Increasing maxCorrelatorsPerEventType is usually the wrong thing to do. It
> makes the model fuzzier, for lack of a better term. In fact we’d like to
> restrict the correlators to only the best, and maxCorrelatorsPerEventType is
> a crude way to do this that gets worse the more you allow. Another, newer
> method is an LLR threshold, which can be set per indicator to use the
> correlation value as a threshold for inclusion as a correlator.
> maxCorrelatorsPerEventType just takes the top ones even if their scores are
> low. This is why making this number big will not make things better: it will
> include more correlators of lower quality.
>
> Also, a larger maxEventsPerEventType increases memory usage and takes far
> longer to calculate the model for very little if any gain. This is from a
> paper by Sebastian Schelter, one of the inventors of CCO:
> https://ssc.io/pdf/rec11-schelter.pdf
>
> I’d leave those as defaulted and measure a baseline KPI before doing A/B
> tests or cross-validation to try different numbers there.
>
>
> On May 24, 2017, at 8:28 AM, Dennis Honders <[email protected]> wrote:
>
> Current data:
>
> {"event": "cart-transaction", "entityId": "1", "entityType": "user",
>  "targetEntityId": "12", "targetEntityType": "item"},
>
> {"event": "$set", "entityType": "item", "entityId": "12", "properties":
>  {"category": ["1", "2", "3", "4", "5", "6", "7"], "manufacturer": 1,
>   "label": "test", "price": "$1-$2"}}
>
> Questions:
>
> Cart-transaction is the primary event for shopping cart recommendations.
> Should I use user-buy-item as a secondary event, or is there no link between
> the two?
>
> Item-based queries are for similar items. For shopping cart recommendations,
> would complementary recommendations suit better? If so, those are made by
> 'user-id' (cart-id). How can this be done?
>
> I would like to do content-based recommendations for items that haven't been
> in a transaction. I think this can be configured in the engine.json. Any
> advice for doing this?
>
> Engine.json:
>
> {
>   "comment": " This config file uses default settings for all but the required values see README.md for docs",
>   "id": "default",
>   "description": "Default settings",
>   "engineFactory": "com.actionml.RecommendationEngine",
>   "datasource": {
>     "params": {
>       "name": "ur-name",
>       "appName": "Test",
>       "eventNames": ["cart-transaction"]
>     }
>   },
>   "sparkConf": {
>     "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
>     "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
>     "spark.kryo.referenceTracking": "false",
>     "spark.kryoserializer.buffer.mb": "300",
>     "spark.kryoserializer.buffer": "300m",
>     "es.index.auto.create": "true"
>   },
>   "algorithms": [
>     {
>       "comment": "simplest setup where all values are default, popularity based backfill, must add eventsNames",
>       "name": "ur",
>       "params": {
>         "appName": "Test",
>         "indexName": "test",
>         "typeName": "cart",
>         "comment": "must have data for the first event or the model will not build, other events are optional",
>         "eventNames": ["cart-transaction"],
>         "maxEventsPerEventType": 50000,
>         "maxCorrelatorsPerEventType": 5000,
>         "num": 10,
>         "itemBias": 2.0,
>         "rankings": [{
>           "name": "preferredRank",
>           "type": "userDefined"
>         }]
>       }
>     }
>   ]
> }
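
To make the point in the quoted reply concrete, here is a small Python sketch of the difference between keeping the top N correlators and keeping only those above an LLR threshold. It is only an illustration, not how the Universal Recommender implements it: the function names, the item IDs and the scores are all made up for the example.

```python
# Hypothetical LLR scores for candidate correlators of one item.
# Item IDs and values are invented for illustration only.
scores = {"item-a": 42.0, "item-b": 17.5, "item-c": 1.2, "item-d": 0.3}


def top_n(llr_scores, n):
    """Keep the n highest-scoring correlators, however low their scores are."""
    ranked = sorted(llr_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, _ in ranked[:n]]


def above_threshold(llr_scores, min_llr):
    """Keep only correlators whose score reaches min_llr, best first."""
    ranked = sorted(llr_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, score in ranked if score >= min_llr]


print(top_n(scores, 3))              # ['item-a', 'item-b', 'item-c']
print(above_threshold(scores, 5.0))  # ['item-a', 'item-b']
```

With a top-N cut, raising N pulls in weak correlators such as item-c, whereas a threshold leaves them out no matter how many candidates there are.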
