Re: MySQLJDBCDataModel vs FileDataModel

2011-07-04 Thread Sebastian Schelter
If the item similarities are already precomputed, there's no sense in
fetching the candidate items from the data model. You can just use the
precomputed set of possibly similar items, since no other items can be
recommended anyway, and it's faster to fetch them from a similarity
implementation that holds them in memory than from any data model
implementation.
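
As a rough illustration of that reasoning, here is a minimal plain-Java
sketch. It is not Mahout's code: the class and method names
(CandidateItemsSketch, addSimilarity, candidateItems) are hypothetical, and
the in-memory map merely stands in for a similarity implementation that
holds precomputed pairs. The candidates for a user are the items similar to
what the user already prefers, and the data model is never consulted.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Mahout's API: candidate items are taken only
// from precomputed similarities that are held in memory.
final class CandidateItemsSketch {

    // item ID -> IDs of items with a precomputed similarity to it
    private final Map<Long, Set<Long>> similarItems = new HashMap<>();

    // Registers one precomputed pair; similarity is treated as symmetric.
    void addSimilarity(long itemA, long itemB) {
        similarItems.computeIfAbsent(itemA, k -> new HashSet<>()).add(itemB);
        similarItems.computeIfAbsent(itemB, k -> new HashSet<>()).add(itemA);
    }

    // Union of the items similar to each preferred item, minus the items
    // the user already has a preference for.
    Set<Long> candidateItems(List<Long> preferredItems) {
        Set<Long> candidates = new HashSet<>();
        for (Long item : preferredItems) {
            candidates.addAll(
                similarItems.getOrDefault(item, Collections.emptySet()));
        }
        candidates.removeAll(preferredItems);
        return candidates;
    }
}
```

An item with no precomputed similarity to any preferred item never shows up
as a candidate, which is why looking beyond the similarity data would not
add anything.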

--sebastian

2011/7/4 Mark :
> May I ask why you chose to go with AllSimilarItemsCandidateItemsStrategy
> over the default PreferredItemsNeighborhoodCandidateItemsStrategy?
>
> On 7/4/11 10:23 AM, Sebastian Schelter wrote:
>>
>> A look at a recent blog post of mine might be helpful with
>> choosing the appropriate data access strategies for your recommender
>> setup. It covers a very common use case in great detail:
>>
>>
>> http://ssc.io/deploying-a-massively-scalable-recommender-system-with-apache-mahout/
>>
>> --sebastian
>>
>> 2011/7/4 Mark:
>>>
>>> I wouldn't use the in memory JDBC solution.
>>>
>>> I was wondering do most people choose the JDBC backed solutions or the
>>> File backed?
>>>
>>> On 7/4/11 10:17 AM, Sean Owen wrote:
>>>>
>>>> Yes. Both are just fine to use in production. For speed and avoiding
>>>> abuse of the database, I'd load into memory and tell it to periodically
>>>> reload. But that too is a bit of a choice between how often you want to
>>>> consume new data and how much work you want to do to recompute new
>>>> values.
>>>>
>>>> On Mon, Jul 4, 2011 at 6:13 PM, Mark wrote:
>>>>
>>>>> Ahh ok. So if I want everything in memory like the file backed solution
>>>>> I should use ReloadFromJDBCDataModel? I'm going to give that a try right
>>>>> now.
>>>>>
>>>>> Typically which solution is recommended for production use?
>>>>>
>>>>> Thanks
>>>>>


Re: MySQLJDBCDataModel vs FileDataModel

2011-07-04 Thread Mark
May I ask why you chose to go with
AllSimilarItemsCandidateItemsStrategy over the default
PreferredItemsNeighborhoodCandidateItemsStrategy?


On 7/4/11 10:23 AM, Sebastian Schelter wrote:

> A look at a recent blog post of mine might be helpful with
> choosing the appropriate data access strategies for your recommender
> setup. It covers a very common use case in great detail:
>
> http://ssc.io/deploying-a-massively-scalable-recommender-system-with-apache-mahout/
>
> --sebastian
>
> 2011/7/4 Mark:
>>
>> I wouldn't use the in memory JDBC solution.
>>
>> I was wondering do most people choose the JDBC backed solutions or the File
>> backed?
>>
>> On 7/4/11 10:17 AM, Sean Owen wrote:
>>>
>>> Yes. Both are just fine to use in production. For speed and avoiding abuse
>>> of the database, I'd load into memory and tell it to periodically reload.
>>> But that too is a bit of a choice between how often you want to consume
>>> new data and how much work you want to do to recompute new values.
>>>
>>> On Mon, Jul 4, 2011 at 6:13 PM, Mark wrote:
>>>
>>>> Ahh ok. So if I want everything in memory like the file backed solution I
>>>> should use ReloadFromJDBCDataModel? I'm going to give that a try right
>>>> now.
>>>>
>>>> Typically which solution is recommended for production use?
>>>>
>>>> Thanks




Re: MySQLJDBCDataModel vs FileDataModel

2011-07-04 Thread Sebastian Schelter
A look at a recent blog post of mine might be helpful with
choosing the appropriate data access strategies for your recommender
setup. It covers a very common use case in great detail:

http://ssc.io/deploying-a-massively-scalable-recommender-system-with-apache-mahout/

--sebastian

2011/7/4 Mark :
> I wouldn't use the in memory JDBC solution.
>
> I was wondering do most people choose the JDBC backed solutions or the File
> backed?
>
> On 7/4/11 10:17 AM, Sean Owen wrote:
>>
>> Yes. Both are just fine to use in production. For speed and avoiding abuse
>> of the database, I'd load into memory and tell it to periodically reload.
>> But that too is a bit of a choice between how often you want to consume
>> new
>> data and how much work you want to do to recompute new values.
>>
>> On Mon, Jul 4, 2011 at 6:13 PM, Mark  wrote:
>>
>>> Ahh ok. So if I want everything in memory like the file backed solution I
>>> should use ReloadFromJDBCDataModel? I'm going to give that a try right
>>> now.
>>>
>>> Typically which solution is recommended for production use?
>>>
>>> Thanks
>>>
>>>
>


Re: MySQLJDBCDataModel vs FileDataModel

2011-07-04 Thread Mark

I wouldn't use the in memory JDBC solution.

I was wondering do most people choose the JDBC backed solutions or the 
File backed?


On 7/4/11 10:17 AM, Sean Owen wrote:

> Yes. Both are just fine to use in production. For speed and avoiding abuse
> of the database, I'd load into memory and tell it to periodically reload.
> But that too is a bit of a choice between how often you want to consume new
> data and how much work you want to do to recompute new values.
>
> On Mon, Jul 4, 2011 at 6:13 PM, Mark wrote:
>
>> Ahh ok. So if I want everything in memory like the file backed solution I
>> should use ReloadFromJDBCDataModel? I'm going to give that a try right now.
>>
>> Typically which solution is recommended for production use?
>>
>> Thanks




Re: MySQLJDBCDataModel vs FileDataModel

2011-07-04 Thread Sean Owen
Yes. Both are just fine to use in production. For speed and avoiding abuse
of the database, I'd load into memory and tell it to periodically reload.
But that too is a bit of a choice between how often you want to consume new
data and how much work you want to do to recompute new values.
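
A minimal sketch of the "load into memory and periodically reload" idea, in
plain Java rather than Mahout's own classes. ReloadingSnapshot, its loader
argument and the reload period are hypothetical names chosen for
illustration. The loader stands in for whatever full read of the database
you would do, and readers only ever see a complete in-memory snapshot.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch: serve reads from an in-memory snapshot and replace
// the snapshot on a fixed schedule instead of querying on every request.
final class ReloadingSnapshot<T> implements AutoCloseable {

    private final Supplier<T> loader; // e.g. one full read of the source
    private final AtomicReference<T> current;
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "snapshot-reload");
            thread.setDaemon(true); // do not keep the JVM alive for reloads
            return thread;
        });

    ReloadingSnapshot(Supplier<T> loader, long period, TimeUnit unit) {
        this.loader = loader;
        this.current = new AtomicReference<>(loader.get()); // initial load
        scheduler.scheduleWithFixedDelay(this::reload, period, period, unit);
    }

    // Loads a fresh snapshot and swaps it in atomically. If the load fails,
    // the previous snapshot keeps being served.
    void reload() {
        try {
            current.set(loader.get());
        } catch (RuntimeException e) {
            // Deliberately ignored in this sketch; real code would log it.
        }
    }

    // Cheap read path: no I/O, just the latest complete snapshot.
    T get() {
        return current.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

The period is the knob for the trade-off described above: a shorter period
picks up new data sooner but repeats the full load (and any recomputation
that depends on it) more often.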

On Mon, Jul 4, 2011 at 6:13 PM, Mark  wrote:

> Ahh ok. So if I want everything in memory like the file backed solution I
> should use ReloadFromJDBCDataModel? I'm going to give that a try right now.
>
> Typically which solution is recommended for production use?
>
> Thanks
>
>


Re: MySQLJDBCDataModel vs FileDataModel

2011-07-04 Thread Mark
Ahh ok. So if I want everything in memory like the file backed solution 
I should use ReloadFromJDBCDataModel? I'm going to give that a try right 
now.


Typically which solution is recommended for production use?

Thanks

On 7/4/11 10:09 AM, Sean Owen wrote:

> Yes, this is trading memory for speed. If you can fit everything in memory,
> then you should. FileDataModel is in memory.
>
> MySQLJDBCDataModel is not in memory and queries the DB every time. This is
> pretty slow, though by caching item-item similarity as you do, a lot of the
> load is removed. However if you want to go all in memory, use
> ReloadFromJDBCDataModel.
>
> (The naming is weirder than the actual structure or logic...)
>
> On Mon, Jul 4, 2011 at 6:05 PM, Mark wrote:
>
>> I've read the source for FileDataModel and it suggested using a JDBC backed
>> implementation for larger datasets so I decided to upgrade our
>> recommendation system to use MySQLJDBCDataModel with
>> MySQLJDBCInMemoryItemSimilarity.
>>
>> I've found that the performance of the JDBC backed versions is actually
>> worse than that of the FileDataModel and FileItemSimilarity versions.
>> Should this be the case? Which versions are most people using out there?
>> Any recommendations?
>>
>> Thanks



Re: MySQLJDBCDataModel vs FileDataModel

2011-07-04 Thread Sean Owen
Yes, this is trading memory for speed. If you can fit everything in memory,
then you should. FileDataModel is in memory.

MySQLJDBCDataModel is not in memory and queries the DB every time. This is
pretty slow, though by caching item-item similarity as you do, a lot of the
load is removed. However if you want to go all in memory, use
ReloadFromJDBCDataModel.
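
To illustrate why caching item-item similarity takes load off the database,
here is a small plain-Java sketch. It is not how Mahout implements it: the
names CachingSimilaritySketch, PairSimilarity and Cached are hypothetical,
the delegate stands in for any lookup that hits the database, and
similarity is assumed to be symmetric.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: remember each pair's similarity after the first
// lookup so repeated requests do not reach the slow delegate again.
final class CachingSimilaritySketch {

    interface PairSimilarity {
        double similarity(long itemA, long itemB);
    }

    static final class Cached implements PairSimilarity {
        private final PairSimilarity delegate; // e.g. a database query
        private final Map<String, Double> cache = new ConcurrentHashMap<>();

        Cached(PairSimilarity delegate) {
            this.delegate = delegate;
        }

        @Override
        public double similarity(long itemA, long itemB) {
            // Order the IDs so (a, b) and (b, a) share one cache entry.
            long low = Math.min(itemA, itemB);
            long high = Math.max(itemA, itemB);
            return cache.computeIfAbsent(low + ":" + high,
                key -> delegate.similarity(low, high));
        }
    }
}
```

The cache here is unbounded, which only works if all pairs fit in memory.
Preference lookups still go to the delegate data source in this picture,
which is the part an all-in-memory data model removes.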

(The naming is weirder than the actual structure or logic...)

On Mon, Jul 4, 2011 at 6:05 PM, Mark  wrote:

> I've read the source for FileDataModel and it suggested using a JDBC backed
> implementation for larger datasets so I decided to upgrade our
> recommendation system to use MySQLJDBCDataModel with
> MySQLJDBCInMemoryItemSimilarity.
>
> I've found that the performance of the JDBC backed versions is actually
> worse than that of the FileDataModel and FileItemSimilarity versions.
> Should this be the case? Which versions are most people using out there?
> Any recommendations?
>
> Thanks
>