Re: Issue updating a FileDataModel

2014-03-04 Thread Juan José Ramos
Thanks Sebastian. Although I got the FileDataModel updating correctly after following your advice, everything seems to point that I will need to use a database to back my dataModel. On Mon, Mar 3, 2014 at 3:47 PM, Sebastian Schelter s...@apache.org wrote: I think it depends on the difference

Re: Issue updating a FileDataModel

2014-03-03 Thread Sebastian Schelter
Hi Juan, IIRC then FileDataModel has a parameter that determines how much time must have been spent since the last modification of the underlying file. You can also directly append new data to the original file. If you want to have a DataModel that can be concurrently updated, I suggest
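Editorial sketch of the idea behind the parameter mentioned above: a reload is skipped unless the file's modification time has moved far enough past the last one that was acted on. ReloadGate and its method names are hypothetical and are not Mahout code; the exact check FileDataModel performs is not shown in this thread and may differ.

```java
/**
 * Hypothetical helper (not part of Mahout): decides whether a file-backed
 * model is due for a reload, based on a minimum interval in milliseconds.
 */
final class ReloadGate {
    private final long minReloadIntervalMs;
    private long lastSeenModified;

    ReloadGate(long initialLastModified, long minReloadIntervalMs) {
        this.lastSeenModified = initialLastModified;
        this.minReloadIntervalMs = minReloadIntervalMs;
    }

    /**
     * Returns true, and remembers the new timestamp, only when the file's
     * modification time is more than the interval past the recorded one.
     */
    synchronized boolean shouldReload(long currentLastModified) {
        if (currentLastModified > lastSeenModified + minReloadIntervalMs) {
            lastSeenModified = currentLastModified;
            return true;
        }
        return false;
    }
}
```

In a test that rewrites the data file and refreshes straight away, a gate like this would report "no reload" until the interval has passed, which is one plausible reason a refresh can appear to do nothing.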

Re: Issue updating a FileDataModel

2014-03-03 Thread Juan José Ramos
calls to recommender.refresh(null)? Many thanks. On Mon, Mar 3, 2014 at 1:18 PM, Sebastian Schelter s...@apache.org wrote: Hi Juan, IIRC then FileDataModel has a parameter that determines how much time must have been spent since the last modification of the underlying file. You can also

Re: Issue updating a FileDataModel

2014-03-03 Thread Sebastian Schelter
wrote: Hi Juan, IIRC then FileDataModel has a parameter that determines how much time must have been spent since the last modification of the underlying file. You can also directly append new data to the original file. If you want to have a DataModel that can be concurrently updated, I suggest

Issue updating a FileDataModel

2014-03-02 Thread Juan José Ramos
I am having issues refreshing my recommender, in particular with the DataModel. I am using a FileDataModel and a GenericItemBasedRecommender that also has a CachingItemSimilarity wrapping a FileItemSimilarity. But for the test I am running I am making things even simpler. By the time I

Re: How to extend FileDataModel

2013-05-23 Thread huangjia
A follow-up question. I worked around for quite a while and got stuck. It is difficult for me to figure out which are the classes I need to extend. I suppose they are: FileDataModel, Preference, PreferenceArray, am I correct? Thanks! Jia On Fri, May 17, 2013 at 1:20 AM, Manuel Blechschmidt

Re: How to extend FileDataModel

2013-05-17 Thread Manuel Blechschmidt
want to build a recommendation model based on Mahout. My dataset format is in the format of userID, itemID, rating timestamp tag1 tag2 tag3. Thus, I think I need to extend the FileDataModel. I looked into *JesterDataModel* as an example. However, I have a problem with the logic flow. In its

Re: How to extend FileDataModel

2013-05-16 Thread Sean Owen
is in the format of userID, itemID, rating timestamp tag1 tag2 tag3. Thus, I think I need to extend the FileDataModel. I looked into *JesterDataModel* as an example. However, I have a problem with the logic flow. In its *buildModel()* method, an empty map data is first constructed

Re: How to extend FileDataModel

2013-05-16 Thread huangjia
, rating timestamp tag1 tag2 tag3. Thus, I think I need to extend the FileDataModel. I looked into *JesterDataModel* as an example. However, I have a problem with the logic flow. In its *buildModel()* method, an empty map data is first constructed. It is then thrown into processFile. I

How to extend FileDataModel

2013-05-15 Thread huangjia
Hi, I want to build a recommendation model based on Mahout. My dataset format is in the format of userID, itemID, rating timestamp tag1 tag2 tag3. Thus, I think I need to extend the FileDataModel. I looked into *JesterDataModel* as an example. However, I have a problem with the logic flow

Re: FileDataModel

2013-03-03 Thread Sean Owen
be reloaded. On Mar 2, 2013 6:34 AM, Nadia Najjar ned...@gmail.com wrote: I am using a FileDataModel and remove and insert preferences before estimating preferences. Do I need to rebuild the recommender after these methods are called for it to be reflected in the prediction?

Re: FileDataModel

2013-03-02 Thread Sean Owen
Yes to integrate any new data everything must be reloaded. On Mar 2, 2013 6:34 AM, Nadia Najjar ned...@gmail.com wrote: I am using a FileDataModel and remove and insert preferences before estimating preferences. Do I need to rebuild the recommender after these methods are called

Re: FileDataModel

2013-03-02 Thread Nadia Najjar
to integrate any new data everything must be reloaded. On Mar 2, 2013 6:34 AM, Nadia Najjar ned...@gmail.com wrote: I am using a FileDataModel and remove and insert preferences before estimating preferences. Do I need to rebuild the recommender after these methods are called for it to be reflected

FileDataModel

2013-03-01 Thread Nadia Najjar
I am using a FileDataModel and remove and insert preferences before estimating preferences. Do I need to rebuild the recommender after these methods are called for it to be reflected in the prediction?

Re: FileDataModel vs ReloadFromJDBCDataModel

2012-11-11 Thread Sebastian Schelter
the loading time by tuning those. Best, Sebastian On 11.11.2012 11:53, Onur Kuru wrote: Hi all, If I use FileDataModel, it takes about 5 secs to build the data model with 1m movielens data but it takes about 25 secs if I use ReloadFromJDBCDataModel. I know the former uses file and the latter

Re: FileDataModel vs ReloadFromJDBCDataModel

2012-11-11 Thread Sean Owen
, and then it's in memory. On Sun, Nov 11, 2012 at 10:53 AM, Onur Kuru kuru.on...@gmail.com wrote: Hi all, If I use FileDataModel, it takes about 5 secs to build the data model with 1m movielens data but it takes about 25 secs if I use ReloadFromJDBCDataModel. I know the former uses file and the latter

Re: Item Recommender Does not read Filedatamodel

2012-02-29 Thread Sean Owen
: Creating FileDataModel for file datasets\mydb.csv Feb 29, 2012 10:38:08 PM org.slf4j.impl.JCLLoggerAdapter info INFO: Reading file info... [WARNING] java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method

Re: FileDataModel / FileIDMigrator

2011-08-12 Thread Ted Dunning
There is a Dictionary class that might help. Do you have some code to contribute? On Thu, Aug 11, 2011 at 7:30 PM, Charles McBrearty ctm...@gmail.com wrote: After actually having implemented the import/export conversions it makes a little more sense why you didn't want to put this in

Re: FileDataModel / FileIDMigrator

2011-08-11 Thread Sebastian Schelter
set that I have that uses strings as the ItemID's and it looks to me like the suggested way to do this is to subclass FileDataModel and then use FileIdMigrator to manage the String - Long mapping. This seems like a lot of complication to deal with what I would imagine is a pretty common use case

Re: FileDataModel / FileIDMigrator

2011-08-11 Thread Sean Owen
, Charles McBrearty ctm...@gmail.com wrote: Hi, I am taking a look at running some of the recommender examples from Mahout in Action on a data set that I have that uses strings as the ItemID's and it looks to me like the suggested way to do this is to subclass FileDataModel and then use

Re: FileDataModel / FileIDMigrator

2011-08-11 Thread Charles McBrearty
examples from Mahout in Action on a data set that I have that uses strings as the ItemID's and it looks to me like the suggested way to do this is to subclass FileDataModel and then use FileIdMigrator to manage the String - Long mapping. This seems like a lot of complication to deal with what I

Re: FileDataModel / FileIDMigrator

2011-08-11 Thread Sean Owen
and it looks to me like the suggested way to do this is to subclass FileDataModel and then use FileIdMigrator to manage the String - Long mapping. This seems like a lot of complication to deal with what I would imagine is a pretty common use case. Is there something that I'm missing here

Re: FileDataModel / FileIDMigrator

2011-08-11 Thread Ted Dunning
You don't need to rekey those tables. You can use hashes of the strings. Or you can build a dictionary to use at the import/export points. On Thu, Aug 11, 2011 at 3:27 PM, Charles McBrearty ctm...@gmail.com wrote: In any event, your suggestion to switch to numeric IDs is a non-starter. This
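Editorial sketch of the "hashes of the strings" option: derive a 64-bit numeric ID from each string ID at the import point. StringIdHasher is a hypothetical name, and the choice of MD5 with the first 8 bytes taken as the long is an assumption made for illustration; the thread does not say which hash to use.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: maps a string ID to a long by hashing it. */
final class StringIdHasher {
    private StringIdHasher() {}

    /** Packs the first 8 bytes of the MD5 digest into a long, big-endian. */
    static long toLongId(String id) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(id.getBytes(StandardCharsets.UTF_8));
            long value = 0L;
            for (int i = 0; i < 8; i++) {
                value = (value << 8) | (digest[i] & 0xFFL);
            }
            return value;
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the digests every Java platform must provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

A hash is one-way, so the export side still needs a stored long-to-string lookup to turn recommended IDs back into the original strings. Collisions are possible in principle, though unlikely with 64 bits for typical catalogue sizes.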

FileDataModel / FileIDMigrator

2011-08-10 Thread Charles McBrearty
Hi, I am taking a look at running some of the recommender examples from Mahout in Action on a data set that I have that uses strings as the ItemID's and it looks to me like the suggested way to do this is to subclass FileDataModel and then use FileIdMigrator to manage the String - Long

Re: FileDataModel / FileIDMigrator

2011-08-10 Thread Ted Dunning
The issue is that actually supporting strings through the whole process kills performance. Interning the strings to be consecutively assigned integers helps ginormously. On Wed, Aug 10, 2011 at 5:02 PM, Charles McBrearty ctm...@gmail.com wrote: This seems like a lot of complication to deal
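Editorial sketch of interning strings to consecutively assigned integers, as described above. StringIdDictionary and its methods are hypothetical names for illustration, not an existing Mahout class; the sketch is not thread-safe and holds everything in memory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory dictionary: string ID <-> consecutive long ID. */
final class StringIdDictionary {
    private final Map<String, Long> stringToLong = new HashMap<>();
    private final List<String> longToString = new ArrayList<>();

    /** Returns the existing numeric ID, or assigns the next free one (0, 1, 2, ...). */
    long intern(String id) {
        Long existing = stringToLong.get(id);
        if (existing != null) {
            return existing;
        }
        long assigned = longToString.size();
        stringToLong.put(id, assigned);
        longToString.add(id);
        return assigned;
    }

    /** Reverse lookup for the export side; throws if the ID was never assigned. */
    String lookup(long numericId) {
        // The backing list limits this sketch to Integer.MAX_VALUE entries.
        return longToString.get((int) numericId);
    }
}
```

The recommender then only ever sees the long IDs; the strings are touched once on import (intern) and once on export (lookup).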

MySQLJDBCDataModel vs FileDataModel

2011-07-04 Thread Mark
I've read the source for FileDataModel and it suggested using a JDBC backed implementation for larger datasets so I decided to upgrade our recommendation system to use MySQLJDBCDataModel with MySQLJDBCInMemoryItemSimilarity. I've found that the JDBC backed version's performance is actually

Re: MySQLJDBCDataModel vs FileDataModel

2011-07-04 Thread Sean Owen
Yes, this is trading memory for speed. If you can fit everything in memory, then you should. FileDataModel is in memory. MySQLJDBCDataModel is not in memory and queries the DB every time. This is pretty slow, though by caching item-item similarity as you do, a lot of the load is removed. However

Re: MySQLJDBCDataModel vs FileDataModel

2011-07-04 Thread Mark
. If you can fit everything in memory, then you should. FileDataModel is in memory. MySQLJDBCDataModel is not in memory and queries the DB every time. This is pretty slow, though by caching item-item similarity as you do, a lot of the load is removed. However if you want to go all in memory, use

Re: MySQLJDBCDataModel vs FileDataModel

2011-07-04 Thread Sean Owen
Yes. Both are just fine to use in production. For speed and avoiding abuse of the database, I'd load into memory and tell it to periodically reload. But that too is a bit of a choice between how often you want to consume new data and how much work you want to do to recompute new values. On Mon,

Re: MySQLJDBCDataModel vs FileDataModel

2011-07-04 Thread Mark
I wouldn't use the in memory JDBC solution. I was wondering do most people choose the JDBC backed solutions or the File backed? On 7/4/11 10:17 AM, Sean Owen wrote: Yes. Both are just fine to use in production. For speed and avoiding abuse of the database, I'd load into memory and tell it to

Re: MySQLJDBCDataModel vs FileDataModel

2011-07-04 Thread Sebastian Schelter
A look into a recent blogpost of mine might maybe be helpful with choosing the appropriate data access strategies for your recommender setup. It covers a very common usecase in great detail: http://ssc.io/deploying-a-massively-scalable-recommender-system-with-apache-mahout/ --sebastian 2011/7/4

Re: MySQLJDBCDataModel vs FileDataModel

2011-07-04 Thread Mark
May I ask why you choose to go with AllSimilarItemsCandidateItemsStrategy over the default PreferredItemsNeighborhoodCandidateItemsStrategy? On 7/4/11 10:23 AM, Sebastian Schelter wrote: A look into a recent blogpost of mine might maybe be helpful with choosing the appropriate data access

Re: MySQLJDBCDataModel vs FileDataModel

2011-07-04 Thread Sebastian Schelter
If the item similarities are already precomputed there's no sense in fetching them from the data model, you can just use the already precomputed set of possibly similar items as no other items can be recommended anyway and it's faster to fetch them from a similarity implementation that holds

FileDataModel question: loading incremental files

2010-11-15 Thread Jordan, Eric
Hi, This is my second time trying to post this - the first time did not seem to work; my apologies if this ends up being a duplicate post. I'm having an issue with FileDataModel.  In particular, suppose you have a main data file (say, /tmp/data.lst) and two incremental files (say, /tmp/data.1

Re: FileDataModel question: loading incremental files

2010-11-15 Thread Sean Owen
lines of change. On Mon, Nov 15, 2010 at 2:31 PM, Jordan, Eric eric.jor...@navteq.com wrote: Hi, This is my second time trying to post this - the first time did not seem to work; my apologies if this ends up being a duplicate post. I'm having an issue with FileDataModel. In particular

Re: Selection Criteria in FileDataModel

2010-10-08 Thread Steven Bourke
It would be a nice feature to have built into the API for sure. You could use the getPreferencesFromUser to determine which users have the appropriate level of options. On Fri, Oct 8, 2010 at 6:27 PM, Sean Owen sro...@gmail.com wrote: There's nothing built-in. Yeah I'd view that as a step

Re: Selection Criteria in FileDataModel

2010-10-08 Thread Sean Owen
Maybe, my hunch is that it will affect so much in the code as to be hard to support. It is rare you want to filter the data in different ways repeatedly I think. And if you're filtering one way probably better to not have it in memory. On Oct 8, 2010 6:43 PM, Steven Bourke sbou...@gmail.com wrote:

Re: Selection Criteria in FileDataModel

2010-10-08 Thread Otis Gospodnetic
In the past I've extended the FileDataModel (if I recall correctly) that did this exact filtering that ChrisS was asking for. It worked well. Otis

Re: FileDataModel

2010-08-15 Thread Sean Owen
What do you mean by this? I'm not clear yet. On Sun, Aug 15, 2010 at 1:09 PM, Tamas Jambor jambo...@gmail.com wrote: Hi, One more possible bug, in FileDataModel, there is nothing to make sure that the superclass - AbstractDataModel gets the value for maxPreference and minPreference. Tamas

Re: FileDataModel

2010-08-15 Thread Tamas Jambor
DataModel model = new FileDataModel(new File("./data/test.txt"));
//just to make sure it loads the model
model.getNumItems();
System.out.println(model.getMaxPreference());

this prints out a NaN because you have maxPreference/minPreference calculated when it creates the inner DataModel