>> I signed up for the Netflix Prize. (www.netflixprize.com)
>> and downloaded their data and have imported it into PostgreSQL.
>> Here is how I created the table:
>
> I signed up as well, but have the table as follows:
>
> CREATE TABLE rating (
>   movie  SMALLINT NOT NULL,
>   person INTEGER  NOT NULL,
>   rating SMALLINT NOT NULL,
>   viewed DATE     NOT NULL
> );
>
> I also recommend not loading the entire file until you get further
> along in the algorithm solution. :)
>
> Not that I have time to really play with this....

As luck would have it, I wrote a recommendations system based on music
ratings a few years ago.

After reading the NYT article, it seems as though one or more of the guys
behind "Net Perceptions" is either helping them or did their system, I'm
not sure. I wrote my system because Net Perceptions was too slow and did a
lousy job.

I think the notion of "communities" in general is an interesting study in
statistics, but every thing I've seen in the form of bad recommendations
shows that while [N] people may share certain tastes, but that doesn't
nessisarily mean that what one likes the others do. This is especially
flawed with movie rentals because it is seldom a 1:1 ratio of movies to
people. There are often multiple people in a household. Also, movies are
almost always for multiple people.

Anyway, good luck! (Not better than me, of course :-)

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Reply via email to