I should also mention that you likely want to use the samplingRate parameter on the nearest-neighborhood implementation: rather than searching all users for the closest neighbors, you could have it try, say, 10%. The idea is that you still probably sample a good neighborhood. There are many 'levers' like this parameter in the code that let you trade some accuracy, perhaps, for a lot of speed. Without setting them (and they aren't set by default) you will probably find performance poor.
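To make the sampling idea concrete, here is a rough sketch in plain Java. It is not the library's own code: the class name SampledNeighborhood, the nearest() method, and the toy similarity function are all made up for illustration. It only shows the trade-off, which is that you score a random fraction of the candidate users and take the best n from that sample.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.function.LongToDoubleFunction;

// Hypothetical illustration of neighborhood sampling; not the library's implementation.
public class SampledNeighborhood {

    /**
     * Returns up to n candidate IDs with the highest similarity, scoring only a
     * random fraction (samplingRate) of the candidates.
     */
    public static List<Long> nearest(List<Long> candidates,
                                     LongToDoubleFunction similarity,
                                     int n,
                                     double samplingRate,
                                     Random random) {
        if (samplingRate <= 0.0 || samplingRate > 1.0) {
            throw new IllegalArgumentException("samplingRate must be in (0, 1]");
        }
        // Compute similarity only for the sampled users; this is where the time is saved.
        Map<Long, Double> scores = new HashMap<>();
        for (Long id : candidates) {
            // nextDouble() is in [0, 1), so a rate of 1.0 keeps every candidate.
            if (random.nextDouble() < samplingRate) {
                scores.put(id, similarity.applyAsDouble(id));
            }
        }
        // Rank the sampled users, highest similarity first.
        List<Long> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));
        return new ArrayList<>(ranked.subList(0, Math.min(n, ranked.size())));
    }

    public static void main(String[] args) {
        List<Long> users = new ArrayList<>();
        for (long i = 0; i < 1000; i++) {
            users.add(i);
        }
        // Toy similarity (an assumption): users with IDs near 500 count as most similar.
        LongToDoubleFunction similarity = id -> 1.0 / (1.0 + Math.abs(id - 500));

        System.out.println("all users: " + nearest(users, similarity, 5, 1.0, new Random(42)));
        System.out.println("10% sample: " + nearest(users, similarity, 5, 0.1, new Random(42)));
    }
}
```

With a rate of 1.0 you get the exact top n. With 0.1 you compute roughly a tenth of the similarities and get a neighborhood that is usually close to the best one but not guaranteed to be.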
Sean

On Apr 22, 2009 7:12 PM, "Sean Owen" <[email protected]> wrote:

It's going to be slow, but there are things to do to make sure it is not slower than it needs to be. Do you have indexes on the user ID / column ID tables? Is the DB on the same machine as the JVM? Connection pooling? All that sort of thing. Beyond that, caching will help: anything to prevent going to the database.

On Wed, Apr 22, 2009 at 4:46 PM, Mirko Gontek <[email protected]> wrote:

> Yeah, I was wond...
