I'm running trunk RecommenderJob (via build-asf-email.sh) and am not getting 
any recommendations due to NaNs being calculated in the AggregateAndRecommend 
step.  I'm not quite sure what is going on, as it seems like this was working as 
recently as two weeks ago (post Sebastian's big change to RecJob), but I don't 
see a whole lot of changes in that part of the code.

The data maps user ids to email thread ids.  My input data is simply a set of 
triples of user id, thread id, 1 (meaning that user participated in that 
thread).  It seems like most of the inputs to the AggregateAndRecommend step 
have good values, except that the entry for one id will be NaN, and this then 
seems to get added in and makes everything NaN (I realize this is a very naive 
understanding).  I sense that I should be looking upstream in the process for a 
fix, but I am not sure where that is.
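
To illustrate the symptom, here is a minimal Python sketch of how a single NaN 
poisons a similarity-weighted average, plus a small check for finding the 
offending entry in an intermediate vector.  This is not Mahout's code: the 
function names, thread ids and similarity values are all made up for 
illustration, and the weighted average is only a simplified stand-in for what 
the aggregation step might be doing.

```python
import math


def weighted_average(similarities, preferences):
    """Similarity-weighted average of preferences (simplified stand-in, not Mahout's code)."""
    numerator = sum(s * p for s, p in zip(similarities, preferences))
    denominator = sum(abs(s) for s in similarities)
    return numerator / denominator


def non_finite_entries(vector):
    """Return (id, value) pairs whose value is NaN or infinite."""
    return [(i, v) for i, v in vector.items() if not math.isfinite(v)]


# Hypothetical similarities keyed by thread id; thread 103 carries a NaN.
similarities = {101: 0.8, 102: 0.5, 103: float("nan"), 104: 0.3}
# Boolean-style input: every observed preference is 1.
preferences = [1.0] * len(similarities)

# One NaN term is enough to turn both sums, and so the result, into NaN.
poisoned = weighted_average(list(similarities.values()), preferences)

# Scanning the vector points at the entry to trace back upstream.
bad = non_finite_entries(similarities)

# Dropping the non-finite entry gives a finite result again.
clean = {i: v for i, v in similarities.items() if math.isfinite(v)}
clean_result = weighted_average(list(clean.values()), [1.0] * len(clean))
```

Running a check like non_finite_entries over the output of each earlier step 
should show the first step at which a NaN appears, which is presumably where 
the real fix belongs.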

Any ideas where I should be looking to eliminate these NaNs?  If you want to 
try this with a small data set, you can get it here: 
http://www.lucidimagination.com/devzone/technical-articles/scaling-mahout (but 
note the companion article is not published yet.)

Thanks,
Grant
