Re: MySQLJDBCDataModel vs FileDataModel
If the item similarities are already precomputed, there's no sense in fetching the candidate items from the data model. You can just use the precomputed set of possibly similar items, since no other items can be recommended anyway, and it's faster to fetch them from a similarity implementation that holds them in memory than from any data model implementation.

--sebastian

2011/7/4 Mark:
> May I ask why you chose to go with AllSimilarItemsCandidateItemsStrategy
> over the default PreferredItemsNeighborhoodCandidateItemsStrategy?
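To make the point above concrete, here is a minimal sketch in plain Java of picking candidates straight from a precomputed, in-memory similarity table. It is not Mahout's code: the class `PrecomputedCandidates`, the method `candidatesFor`, and the `Map<Long, Set<Long>>` layout are all hypothetical names chosen for illustration. The only idea taken from the message is that candidates come from the similarity data rather than from the data model.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration only; not a Mahout class. */
final class PrecomputedCandidates {

    private PrecomputedCandidates() {}

    /**
     * Collects every item that the precomputed table lists as similar to
     * something the user already prefers, then drops the items the user
     * already has. No data model lookup is involved.
     */
    static Set<Long> candidatesFor(List<Long> preferredItemIds,
                                   Map<Long, Set<Long>> similarItems) {
        Set<Long> candidates = new HashSet<>();
        for (Long itemId : preferredItemIds) {
            // Items without an entry contribute no candidates.
            Set<Long> similar = similarItems.get(itemId);
            candidates.addAll(similar != null ? similar : Collections.<Long>emptySet());
        }
        // Items the user already prefers are not worth recommending again.
        candidates.removeAll(preferredItemIds);
        return candidates;
    }
}
```

An item that has no precomputed neighbours can never get a score from an item-based recommender, which is why restricting candidates to this set loses nothing.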
Re: MySQLJDBCDataModel vs FileDataModel
May I ask why you chose to go with AllSimilarItemsCandidateItemsStrategy over the default PreferredItemsNeighborhoodCandidateItemsStrategy?

On 7/4/11 10:23 AM, Sebastian Schelter wrote:
> A look into a recent blog post of mine might be helpful with choosing
> the appropriate data access strategies for your recommender setup. It
> covers a very common use case in great detail:
>
> http://ssc.io/deploying-a-massively-scalable-recommender-system-with-apache-mahout/
>
> --sebastian
Re: MySQLJDBCDataModel vs FileDataModel
A look into a recent blog post of mine might be helpful with choosing the appropriate data access strategies for your recommender setup. It covers a very common use case in great detail:

http://ssc.io/deploying-a-massively-scalable-recommender-system-with-apache-mahout/

--sebastian

2011/7/4 Mark:
> I wouldn't use the in memory JDBC solution.
>
> I was wondering do most people choose the JDBC backed solutions or the
> File backed?
Re: MySQLJDBCDataModel vs FileDataModel
I wouldn't use the in memory JDBC solution.

I was wondering do most people choose the JDBC backed solutions or the File backed?

On 7/4/11 10:17 AM, Sean Owen wrote:
> Yes. Both are just fine to use in production. For speed and avoiding abuse
> of the database, I'd load into memory and tell it to periodically reload.
> But that too is a bit of a choice between how often you want to consume
> new data and how much work you want to do to recompute new values.
Re: MySQLJDBCDataModel vs FileDataModel
Yes. Both are just fine to use in production. For speed and avoiding abuse of the database, I'd load into memory and tell it to periodically reload. But that too is a bit of a choice between how often you want to consume new data and how much work you want to do to recompute new values.

On Mon, Jul 4, 2011 at 6:13 PM, Mark wrote:
> Ahh ok. So if I want everything in memory like the file backed solution I
> should use ReloadFromJDBCDataModel? I'm going to give that a try right now.
>
> Typically which solution is recommended for production use?
>
> Thanks
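The "load into memory and periodically reload" approach can be sketched in plain Java as follows. This is an illustration of the general pattern only, not how ReloadFromJDBCDataModel is implemented; `ReloadingSnapshot`, `get`, and `refresh` are hypothetical names, and the `Supplier` stands in for whatever slow read (for example a full table scan over JDBC) produces the data.

```java
import java.util.function.Supplier;

/** Hypothetical holder for illustration only; not a Mahout class. */
final class ReloadingSnapshot<T> {

    private final Supplier<T> loader;  // the slow read from the backing store
    private volatile T snapshot;       // the copy that requests are served from

    ReloadingSnapshot(Supplier<T> loader) {
        this.loader = loader;
        this.snapshot = loader.get();  // initial load at construction time
    }

    /** Served from memory; never touches the backing store. */
    T get() {
        return snapshot;
    }

    /** Re-reads the backing store and swaps in the new copy. */
    void refresh() {
        snapshot = loader.get();
    }
}
```

Calling `refresh()` on a timer, for instance through `ScheduledExecutorService.scheduleAtFixedRate`, gives the trade-off described above: a shorter interval picks up new data sooner but repeats the full load more often.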
Re: MySQLJDBCDataModel vs FileDataModel
Ahh ok. So if I want everything in memory like the file backed solution I should use ReloadFromJDBCDataModel? I'm going to give that a try right now.

Typically which solution is recommended for production use?

Thanks

On 7/4/11 10:09 AM, Sean Owen wrote:
> Yes, this is trading memory for speed. If you can fit everything in
> memory, then you should. FileDataModel is in memory. MySQLJDBCDataModel is
> not in memory and queries the DB every time. This is pretty slow, though by
> caching item-item similarity as you do, a lot of the load is removed.
> However if you want to go all in memory, use ReloadFromJDBCDataModel. (The
> naming is weirder than the actual structure or logic...)
Re: MySQLJDBCDataModel vs FileDataModel
Yes, this is trading memory for speed. If you can fit everything in memory, then you should. FileDataModel is in memory. MySQLJDBCDataModel is not in memory and queries the DB every time. This is pretty slow, though by caching item-item similarity as you do, a lot of the load is removed.

However, if you want to go all in memory, use ReloadFromJDBCDataModel. (The naming is weirder than the actual structure or logic...)

On Mon, Jul 4, 2011 at 6:05 PM, Mark wrote:
> I've read the source for FileDataModel and it suggested using a JDBC-backed
> implementation for larger datasets, so I decided to upgrade our
> recommendation system to use MySQLJDBCDataModel with
> MySQLJDBCInMemoryItemSimilarity.
>
> I've found that the JDBC-backed versions' performance is actually worse than
> that of the FileDataModel and FileItemSimilarity versions. Should this be
> the case? Which versions are most people using out there? Any
> recommendations?
>
> Thanks