Note: the next version (13df29e4fe97b4370f24d7e91ab5909de76f0f3b) doesn't work. Debugging now.
On Oct 13, 2011, at 9:31 PM, Grant Ingersoll wrote:
> OK, I can confirm that an earlier version
> (54300025dbdd6e688a4eb3d043016eb641067c7e in github/lucidimagination/mahout)
> worked. Now, to figure out why.
>
> -Grant
>
> On Oct 13, 2011, at 4:01 AM, Sebastian Schelter wrote:
>
>> Grant,
>>
>> Can you share a little more details about the results, do you get any
>> exceptions? Or do you just get no results?
>>
>> Using the NaNs inside the similarity matrix vectors has been included in
>> the job for a very long time and should not cause any problems. As Sean
>> already mentioned we have unit tests with toy data that should catch the
>> very obvious errors in this code.
>>
>> Can you share the dataset? I can do a testrun on my research cluster.
>>
>> --sebastian
>>
>> On 13.10.2011 08:37, Sean Owen wrote:
>>> RecommenderJob? The unit tests run it all the time.
>>> There should not be any glitches with static variables -- don't think
>>> there are any.
>>>
>>> On Thu, Oct 13, 2011 at 7:33 AM, Lance Norskog <goks...@gmail.com> wrote:
>>>> Is this job working well for anyone now?
>>>> When was the last time this job worked for someone?
>>>>
>>>> On Wed, Oct 12, 2011 at 11:30 AM, Grant Ingersoll
>>>> <gsing...@apache.org> wrote:
>>>>
>>>>> Both local and on EC2
>>>>>
>>>>> On Oct 12, 2011, at 2:10 PM, Ken Krugler wrote:
>>>>>
>>>>>> Hi Grant,
>>>>>>
>>>>>> Just curious, are you running this locally or distributed?
>>>>>>
>>>>>> I'd run into a similar issue, though in a completely different algorithm
>>>>>> (Jimmy Lin's PageRank implementation) due to the use of a static variable.
>>>>>>
>>>>>> When running locally, this wasn't getting cleared between loops, and thus
>>>>>> I got wonky results.
>>>>>>
>>>>>> The same thing would have happened with JVM reuse enabled.
>>>>>>
>>>>>> -- Ken
>>>>>>
>>>>>> On Oct 12, 2011, at 3:28pm, Grant Ingersoll wrote:
>>>>>>
>>>>>>> Digging some more:
>>>>>>>
>>>>>>> In AggregateAndRecommend, around line 143, I have, for userId 0, a
>>>>>>> simColumn of:
>>>>>>>
>>>>>>> {22966:0.9566912651062012,81901:0.9566912651062012,263375:0.9566912651062012,263374:0.9566912651062012,263376:NaN}
>>>>>>>
>>>>>>> Which then becomes the numerator and the denom.
>>>>>>>
>>>>>>> Looping, my next simCol is:
>>>>>>>
>>>>>>> {22966:0.9566912651062012,81901:0.9566912651062012,263375:NaN,263374:0.9566912651062012,263376:0.9566912651062012}
>>>>>>>
>>>>>>> and then
>>>>>>>
>>>>>>> {22966:0.9566912651062012,81901:0.9566912651062012,263375:0.9566912651062012,263374:NaN,263376:0.9566912651062012}
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>> Each time, those are getting added into the numerators/denoms value,
>>>>>>> such that by the time we are done looping (line 161), we have:
>>>>>>> numerators: {22966:NaN,81901:NaN,263376:NaN,263375:NaN,263374:NaN}
>>>>>>> denoms: {22966:NaN,81901:NaN,263376:NaN,263375:NaN,263374:NaN}
>>>>>>>
>>>>>>> numberOfSimilarItemsUsed:
>>>>>>> {81901:5.0,22966:5.0,263376:5.0,263375:5.0,263374:5.0}
>>>>>>>
>>>>>>> Not sure how to interpret this as I haven't dug into the math here
>>>>>>> yet or figured out where those NaNs are coming from originally.
>>>>>>>
>>>>>>> On Oct 11, 2011, at 2:55 PM, Grant Ingersoll wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> On Oct 11, 2011, at 2:49 PM, Grant Ingersoll wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Oct 11, 2011, at 12:36 PM, Sean Owen wrote:
>>>>>>>>>
>>>>>>>>>> Where is the NaN coming up -- what has this value?
>>>>>>>>>
>>>>>>>>> simColumn seems to be the originator in the Aggregate step. For
>>>>>>>>> instance, my current breakpoint shows:
>>>>>>>>> {309682:0.9566912651062012,42938:0.9566912651062012,309672:NaN}
>>>>>>>>>
>>>>>>>>> I can also see some in the PartialMultiplyMapper via the
>>>>>>>>> similarityMatrixColumn.
>>>>>>>>>
>>>>>>>>> Is that set by SimilarityMatrixRowWrapperMapper?
>>>>>>>>> <code>
>>>>>>>>> /* remove self similarity */
>>>>>>>>> similarityMatrixRow.set(key.get(), Double.NaN);
>>>>>>>>> </code>
>>>>>>>>
>>>>>>>> Ah, but that is just taking care of itself, so maybe not the issue.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> It should be propagated in some cases but not others. I'm not aware of
>>>>>>>>>> any changes here.
>>>>>>>>>
>>>>>>>>> yeah, me neither. This is all related to MAHOUT-798.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Generally small data sets will have this problem of not being able to
>>>>>>>>>> compute much of anything useful, so NaN might be right here.
>>>>>>>>>> But you say it was different recently, which seems to rule that out.
>>>>>>>>>
>>>>>>>>> I also _believe_ I'm seeing it in a much larger data set on Hadoop,
>>>>>>>>> it's just that's a whole lot harder to debug.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Oct 11, 2011 at 5:34 PM, Grant Ingersoll <gsing...@apache.org> wrote:
>>>>>>>>>>> I'm running trunk RecommenderJob (via build-asf-email.sh) and am not
>>>>>>>>>>> getting any recommendations due to NaNs being calculated in the
>>>>>>>>>>> AggregateAndRecommend step. I'm not quite sure what is going on as it
>>>>>>>>>>> seems like this was working as little as two weeks ago (post Sebastian's
>>>>>>>>>>> big change to RecJob), but I don't see a whole lot of changes in that
>>>>>>>>>>> part of the code.
>>>>>>>>>>>
>>>>>>>>>>> The data is user ids mapping to email thread ids. My input data is
>>>>>>>>>>> simply a triple of user id, thread id, 1 (meaning that user participated
>>>>>>>>>>> in that thread). It seems like I will have a lot of good values in the
>>>>>>>>>>> inputs to the AggregateAndRecommend step, except one id will be NaN and
>>>>>>>>>>> this then seems to get added in and makes everything NaN (I realize this
>>>>>>>>>>> is a very naive understanding). I sense that I should be looking upstream
>>>>>>>>>>> in the process for a fix, but I am not sure where that is.
>>>>>>>>>>>
>>>>>>>>>>> Any ideas where I should be looking to eliminate these NaNs? If you
>>>>>>>>>>> want to try this with a small data set, you can get it here:
>>>>>>>>>>> http://www.lucidimagination.com/devzone/technical-articles/scaling-mahout
>>>>>>>>>>> (but note the companion article is not published yet.)
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Grant
>>>>>>>>>
>>>>>>>>
>>>>>>>> --------------------------------------------
>>>>>>>> Grant Ingersoll
>>>>>>>> http://www.lucidimagination.com
>>>>>>>> Lucene Eurocon 2011: http://www.lucene-eurocon.com
>>>>>>>>
>>>>>>>
>>>>>>> --------------------------------------------
>>>>>>> Grant Ingersoll
>>>>>>> http://www.lucidimagination.com
>>>>>>> Lucene Eurocon 2011: http://www.lucene-eurocon.com
>>>>>>>
>>>>>>
>>>>>> --------------------------
>>>>>> Ken Krugler
>>>>>> +1 530-210-6378
>>>>>> http://bixolabs.com
>>>>>> custom big data solutions & training
>>>>>> Hadoop, Cascading, Mahout & Solr
>>>>>>
>>>>>
>>>>> --------------------------------------------
>>>>> Grant Ingersoll
>>>>> http://www.lucidimagination.com
>>>>> Lucene Eurocon 2011: http://www.lucene-eurocon.com
>>>>>
>>>>
>>>> --
>>>> Lance Norskog
>>>> goks...@gmail.com
>>>>
>>
>
> --------------------------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
> Lucene Eurocon 2011: http://www.lucene-eurocon.com
>

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com
Lucene Eurocon 2011: http://www.lucene-eurocon.com
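
For anyone following along, the symptom described in this thread (one NaN per simColumn ending up as all-NaN numerators and denoms) can be reproduced in isolation. The following is only a rough sketch with plain java.util maps, not the actual AggregateAndRecommend code and not Mahout's Vector API. The class and method names are made up for illustration, and it assumes that each item's column carries NaN at the item's own id, which is a guess based on the three columns quoted in the thread and the "remove self similarity" line, not something the thread confirms for every column.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the accumulation discussed in the thread.
class NaNPropagationSketch {

    // Naive accumulation: NaN + x is NaN, so a single NaN poisons that id's sum for good.
    public static void addAll(Map<Integer, Double> sums, Map<Integer, Double> column) {
        for (Map.Entry<Integer, Double> e : column.entrySet()) {
            sums.merge(e.getKey(), e.getValue(), Double::sum);
        }
    }

    // Guarded accumulation: NaN entries are skipped instead of being added.
    public static void addAllSkippingNaN(Map<Integer, Double> sums, Map<Integer, Double> column) {
        for (Map.Entry<Integer, Double> e : column.entrySet()) {
            if (!Double.isNaN(e.getValue())) {
                sums.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }
    }

    // Illustrative helper: a similarity column over ids, with NaN at nanId (assumed pattern).
    public static Map<Integer, Double> column(int[] ids, int nanId, double sim) {
        Map<Integer, Double> col = new LinkedHashMap<>();
        for (int id : ids) {
            col.put(id, id == nanId ? Double.NaN : sim);
        }
        return col;
    }

    public static void main(String[] args) {
        int[] ids = {22966, 81901, 263375, 263374, 263376};
        double sim = 0.9566912651062012;
        Map<Integer, Double> naive = new LinkedHashMap<>();
        Map<Integer, Double> guarded = new LinkedHashMap<>();
        // One column per item, each with NaN in a different position.
        for (int nanId : ids) {
            Map<Integer, Double> col = column(ids, nanId, sim);
            addAll(naive, col);
            addAllSkippingNaN(guarded, col);
        }
        System.out.println("naive:   " + naive);   // every entry is NaN
        System.out.println("guarded: " + guarded); // every entry is finite
    }
}
```

If the columns really do follow that pattern, every id hits a NaN exactly once across the loop, which would be enough to explain the all-NaN maps at line 161 when nothing filters the NaN entries before they are summed. Whether the job is supposed to filter them at that point, or earlier, is exactly the open question in this thread.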