Hi Grant,

I'm currently looking into MailToRecMapper to understand the data you extract from the ASF email archives. (I haven't had the time to actually run it yet.)
As far as I understand, it outputs from,msgId,1 for each mail. What exactly is the msgId here?

I'm searching for an example where I have implicit feedback data in the form

<user> <item> <number of observed interactions>

It would be important to have different numbers of interactions, as the algorithm I'm trying to exemplify uses this number to calculate a "confidence" for the data point. E.g. if a user has never seen some movie, you would see 0 interactions, which could mean that he doesn't like the movie, but it could also mean he just doesn't know it exists, so we have low confidence in the observation. On the other hand, if he watched the movie 20 times, we can be pretty sure he likes it.

Would it be possible to extract data in the form

<email> <thread> <number of responses>

from the ASF email archives? I recall a discussion stating that identifying a thread is a pretty hard task...

Best,
Sebastian

On 09.11.2011 16:35, Grant Ingersoll (Commented) (JIRA) wrote:
>
> [ https://issues.apache.org/jira/browse/MAHOUT-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147103#comment-13147103 ]
>
> Grant Ingersoll commented on MAHOUT-878:
> ----------------------------------------
>
> See also the stuff I did for build-asf-email.sh. Would be nice to add into
> that.
>
>> Provide better examples for the parallel ALS recommender code
>> -------------------------------------------------------------
>>
>> Key: MAHOUT-878
>> URL: https://issues.apache.org/jira/browse/MAHOUT-878
>> Project: Mahout
>> Issue Type: Task
>> Components: Collaborative Filtering
>> Affects Versions: 1.0
>> Reporter: Sebastian Schelter
>> Assignee: Sebastian Schelter
>>
>> We should provide examples that show how to apply the parallel ALS
>> recommender to the Netflix or KDD2011 datasets.
>
> --
> This message is automatically generated by JIRA.
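
To make the requested <email> <thread> <number of responses> format concrete, here is a minimal Python sketch. It assumes the hard part (assigning each mail to a thread) has already been done, so the input is simply a list of (sender, thread id) pairs. The function names, the sample addresses, and the linear weighting 1 + alpha * count are illustrative assumptions, not taken from MailToRecMapper or from the Mahout ALS code. The linear form is one weighting that appears in the implicit-feedback literature, and alpha is an arbitrary tuning value here.

```python
from collections import Counter


def interaction_counts(replies):
    """Aggregate (sender, thread_id) pairs into sorted (sender, thread_id, count) triples."""
    counts = Counter(replies)
    return sorted((sender, thread, n) for (sender, thread), n in counts.items())


def confidence(count, alpha=1.0):
    """Hypothetical linear confidence weighting: grows with the number of interactions."""
    return 1.0 + alpha * count


# Made-up sample data: one entry per mail, already mapped to a thread id.
replies = [
    ("alice@example.org", "thread-1"),
    ("alice@example.org", "thread-1"),
    ("bob@example.org", "thread-1"),
    ("alice@example.org", "thread-2"),
]

triples = interaction_counts(replies)
for sender, thread, n in triples:
    # One line per (email, thread) pair: the count and the derived confidence.
    print(sender, thread, n, confidence(n))
```

A pair that never appears in the input has a count of 0 and therefore the lowest confidence under this weighting, which matches the "never seen the movie" case described above.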
