Hi,
I discovered and downloaded Mahout today. Maybe it's just giddiness, but can
you help me?


This is from the tutorial at http://lucene.apache.org/mahout/taste.html:
"

   1. Download the "1 Million MovieLens Dataset" from
      http://www.grouplens.org/.
   2. Unpack the archive and copy   ->movies.dat<-   and   ->ratings.dat<-
      to
      trunk/taste-web/src/main/resources/org/apache/mahout/cf/taste/example/grouplens
      under the Mahout distribution directory.

"

I downloaded the MovieLens data set, but there is no "movies.dat" or
"ratings.dat" in it. Are the correct files u.data and u.item?
I haven't found any documentation on file formats. There are other things
confusing to new users as well: when I unpacked the downloaded gz file and
built it with Maven following the instructions, the directory was only
partly built; however, when I checked out the source with svn, the full
directory structure was built.

Can Taste incorporate other data files as well, like the ones listed below
(i.e. demographic data, etc.)? Where can I find documentation about the data
file formats accepted by Taste, or do I need to dig into the code?


Thank you,
Brian Wolf
developer
gOgO deVelopment, ltd
Sedona, AZ

u.data     -- The full u data set, 100000 ratings by 943 users on 1682
              items.
              Each user has rated at least 20 movies.  Users and items are
              numbered consecutively from 1.  The data is randomly
              ordered. This is a tab separated list of
              user id | item id | rating | timestamp.
              The time stamps are unix seconds since 1/1/1970 UTC
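
To make the u.data layout above concrete, here is a minimal Python sketch of
reading one line of it. It assumes only what the description says (four
tab-separated fields, timestamp in unix seconds UTC). The names Rating and
parse_rating are my own, not part of Taste or the MovieLens distribution, and
the sample line in the usage note is illustrative.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Rating(NamedTuple):
    """One row of u.data (hypothetical container, not a Taste class)."""
    user_id: int
    item_id: int
    rating: int
    timestamp: datetime


def parse_rating(line: str) -> Rating:
    """Parse 'user id <TAB> item id <TAB> rating <TAB> timestamp'."""
    user_id, item_id, rating, ts = line.rstrip("\n").split("\t")
    return Rating(
        int(user_id),
        int(item_id),
        int(rating),
        # The description says the timestamps are unix seconds, UTC.
        datetime.fromtimestamp(int(ts), tz=timezone.utc),
    )
```

For example, parse_rating("196\t242\t3\t881250949\n") gives a Rating with
user_id 196, item_id 242 and rating 3.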

u.info     -- The number of users, items, and ratings in the u data set.

u.item     -- Information about the items (movies); this is a tab separated
              list of
              movie id | movie title | release date | video release date |
              IMDb URL | unknown | Action | Adventure | Animation |
              Children's | Comedy | Crime | Documentary | Drama | Fantasy |
              Film-Noir | Horror | Musical | Mystery | Romance | Sci-Fi |
              Thriller | War | Western |
              The last 19 fields are the genres, a 1 indicates the movie
              is of that genre, a 0 indicates it is not; movies can be in
              several genres at once.
              The movie ids are the ones used in the u.data data set.
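
As a sketch of how the 19 genre flags described above could be decoded, the
Python below maps the 0/1 fields back to genre names. The description says
"tab separated" but shows "|" between the fields, so the delimiter is left as
a parameter here rather than assumed. GENRES and parse_item are names I made
up for illustration.

```python
# Genre columns in the order given in the u.item description.
GENRES = [
    "unknown", "Action", "Adventure", "Animation", "Children's", "Comedy",
    "Crime", "Documentary", "Drama", "Fantasy", "Film-Noir", "Horror",
    "Musical", "Mystery", "Romance", "Sci-Fi", "Thriller", "War", "Western",
]


def parse_item(line, delimiter="\t"):
    """Return (movie_id, title, genres) for one line of u.item.

    Fields 0-4 are movie id, title, release date, video release date and
    IMDb URL; the next 19 fields are the genre flags.
    """
    fields = line.rstrip("\n").split(delimiter)
    movie_id = int(fields[0])
    title = fields[1]
    flags = fields[5:5 + len(GENRES)]
    # A "1" means the movie is of that genre; several may be set at once.
    genres = [name for name, flag in zip(GENRES, flags) if flag == "1"]
    return movie_id, title, genres
```

If the file turns out to use "|" as the separator, the same function can be
called as parse_item(line, delimiter="|").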

u.genre    -- A list of the genres.

u.user     -- Demographic information about the users; this is a tab
              separated list of
              user id | age | gender | occupation | zip code
              The user ids are the ones used in the u.data data set.

u.occupation -- A list of the occupations.
