Looks like it is me. Still not sure why, but getting there.

On Oct 13, 2011, at 10:35 PM, Grant Ingersoll wrote:
> Note, the next version (13df29e4fe97b4370f24d7e91ab5909de76f0f3b) doesn't
> work. Debugging.
>
> On Oct 13, 2011, at 9:31 PM, Grant Ingersoll wrote:
>
>> OK, I can confirm that an earlier version
>> (54300025dbdd6e688a4eb3d043016eb641067c7e in github/lucidimagination/mahout)
>> worked. Now, to figure out why.
>>
>> -Grant
>>
>> On Oct 13, 2011, at 4:01 AM, Sebastian Schelter wrote:
>>
>>> Grant,
>>>
>>> Can you share a little more details about the results, do you get any
>>> exceptions? Or do you just get no results?
>>>
>>> Using the NaNs inside the similarity matrix vectors has been included in
>>> the job for a very long time and should not cause any problems. As Sean
>>> already mentioned we have unit tests with toy data that should catch the
>>> very obvious errors in this code.
>>>
>>> Can you share the dataset? I can do a testrun on my research cluster.
>>>
>>> --sebastian
>>>
>>> On 13.10.2011 08:37, Sean Owen wrote:
>>>> RecommenderJob? The unit tests run it all the time.
>>>> There should not be any glitches with static variables -- don't think
>>>> there are any.
>>>>
>>>> On Thu, Oct 13, 2011 at 7:33 AM, Lance Norskog <goks...@gmail.com> wrote:
>>>>> Is this job working well for anyone now?
>>>>> When was the last time this job worked for someone?
>>>>>
>>>>> On Wed, Oct 12, 2011 at 11:30 AM, Grant Ingersoll
>>>>> <gsing...@apache.org> wrote:
>>>>>
>>>>>> Both local and on EC2
>>>>>>
>>>>>> On Oct 12, 2011, at 2:10 PM, Ken Krugler wrote:
>>>>>>
>>>>>>> Hi Grant,
>>>>>>>
>>>>>>> Just curious, are you running this locally or distributed?
>>>>>>>
>>>>>>> I'd run into a similar issue, though in a completely different
>>>>>>> algorithm (Jimmy Lin's PageRank implementation) due to the use of a
>>>>>>> static variable.
>>>>>>>
>>>>>>> When running locally, this wasn't getting cleared between loops, and
>>>>>>> thus I got wonky results.
>>>>>>>
>>>>>>> The same thing would have happened with JVM reuse enabled.
>>>>>>>
>>>>>>> -- Ken
>>>>>>>
>>>>>>> On Oct 12, 2011, at 3:28pm, Grant Ingersoll wrote:
>>>>>>>
>>>>>>>> Digging some more:
>>>>>>>>
>>>>>>>> In AggregateAndRecommend, around lines 143, I have, for userId 0, a
>>>>>>>> simColumn of:
>>>>>>>>
>>>>>>>> {22966:0.9566912651062012,81901:0.9566912651062012,263375:0.9566912651062012,263374:0.9566912651062012,263376:NaN}
>>>>>>>>
>>>>>>>> Which then becomes the numerator and the denom.
>>>>>>>>
>>>>>>>> Looping, my next simCol is:
>>>>>>>>
>>>>>>>> {22966:0.9566912651062012,81901:0.9566912651062012,263375:NaN,263374:0.9566912651062012,263376:0.9566912651062012}
>>>>>>>>
>>>>>>>> and then
>>>>>>>>
>>>>>>>> {22966:0.9566912651062012,81901:0.9566912651062012,263375:0.9566912651062012,263374:NaN,263376:0.9566912651062012}
>>>>>>>>
>>>>>>>> ...
>>>>>>>>
>>>>>>>> Each time, those are getting added into the numerators/denoms value,
>>>>>>>> such that by the time we are done looping (line 161), we have:
>>>>>>>> numerators: {22966:NaN,81901:NaN,263376:NaN,263375:NaN,263374:NaN}
>>>>>>>> denoms: {22966:NaN,81901:NaN,263376:NaN,263375:NaN,263374:NaN}
>>>>>>>>
>>>>>>>> numberOfSimilarItemsUsed:
>>>>>>>> {81901:5.0,22966:5.0,263376:5.0,263375:5.0,263374:5.0}
>>>>>>>>
>>>>>>>> Not sure on how to interpret this as I haven't dug into the math here
>>>>>>>> yet or figured out where those NaN are coming from originally.
>>>>>>>>
>>>>>>>> On Oct 11, 2011, at 2:55 PM, Grant Ingersoll wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Oct 11, 2011, at 2:49 PM, Grant Ingersoll wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Oct 11, 2011, at 12:36 PM, Sean Owen wrote:
>>>>>>>>>>
>>>>>>>>>>> Where is the NaN coming up -- what has this value?
>>>>>>>>>>
>>>>>>>>>> simColumn seems to be the originator in the Aggregate step. For
>>>>>>>>>> instance, my current breakpoint shows:
>>>>>>>>>> {309682:0.9566912651062012,42938:0.9566912651062012,309672:NaN}
>>>>>>>>>>
>>>>>>>>>> I can also see some in the PartialMultiplyMapper via the
>>>>>>>>>> similarityMatrixColumn.
>>>>>>>>>>
>>>>>>>>>> Is that set by SimilarityMatrixRowWrapperMapper?
>>>>>>>>>> <code>
>>>>>>>>>> /* remove self similarity */
>>>>>>>>>> similarityMatrixRow.set(key.get(), Double.NaN);
>>>>>>>>>> </code>
>>>>>>>>>
>>>>>>>>> Ah, but that is just taking care of itself, so maybe not the issue.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> It should be propagated in some cases but not others. I'm not aware of
>>>>>>>>>>> any changes here.
>>>>>>>>>>
>>>>>>>>>> yeah, me neither. This is all related to MAHOUT-798.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Generally small data sets will have this problem of not being able to
>>>>>>>>>>> compute much of anything useful, so NaN might be right here.
>>>>>>>>>>> But you say it was different recently, which seems to rule that out.
>>>>>>>>>>
>>>>>>>>>> I also _believe_ I'm seeing it in a much larger data set on Hadoop,
>>>>>>>>>> it's just that's a whole lot harder to debug.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Oct 11, 2011 at 5:34 PM, Grant Ingersoll <gsing...@apache.org> wrote:
>>>>>>>>>>>> I'm running trunk RecommenderJob (via build-asf-email.sh) and am not
>>>>>>>>>>>> getting any recommendations due to NaNs being calculated in the
>>>>>>>>>>>> AggregateAndRecommend step. I'm not quite sure what is going on as it
>>>>>>>>>>>> seems like this was working as little as two weeks ago (post
>>>>>>>>>>>> Sebastian's big change to RecJob), but I don't see a whole lot of
>>>>>>>>>>>> changes in that part of the code.
>>>>>>>>>>>>
>>>>>>>>>>>> The data is user id's mapping to email thread ids. My input data is
>>>>>>>>>>>> simply a triple of user id, thread id, 1 (meaning that user
>>>>>>>>>>>> participated in that thread) It seems like I will have a lot of good
>>>>>>>>>>>> values in the inputs to the AggregateAndRecommend step, except one id
>>>>>>>>>>>> will be NaN and this then seems to get added in and makes everything
>>>>>>>>>>>> NaN (I realize this is a very naive understanding).
>>>>>>>>>>>> I sense that I should be looking upstream in the process for a fix,
>>>>>>>>>>>> but I am not sure where that is.
>>>>>>>>>>>>
>>>>>>>>>>>> Any ideas where I should be looking to eliminate these NaNs? If you
>>>>>>>>>>>> want to try this with a small data set, you can get it here:
>>>>>>>>>>>> http://www.lucidimagination.com/devzone/technical-articles/scaling-mahout
>>>>>>>>>>>> (but note the companion article is not published yet.)
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Grant
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --------------------------------------------
>>>>>>>>> Grant Ingersoll
>>>>>>>>> http://www.lucidimagination.com
>>>>>>>>> Lucene Eurocon 2011: http://www.lucene-eurocon.com
>>>>>>>>>
>>>>>>>
>>>>>>> --------------------------
>>>>>>> Ken Krugler
>>>>>>> +1 530-210-6378
>>>>>>> http://bixolabs.com
>>>>>>> custom big data solutions & training
>>>>>>> Hadoop, Cascading, Mahout & Solr
>>>>>>>
>>>>>
>>>>> --
>>>>> Lance Norskog
>>>>> goks...@gmail.com
>>>>>

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com
Lucene Eurocon 2011: http://www.lucene-eurocon.com
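
For anyone following along, the accumulation described in the "Digging some more" message can be reproduced outside Mahout. What follows is only a minimal sketch, not the AggregateAndRecommend code: it assumes plain java.util maps in place of Mahout's vectors, and it assumes five similarity columns that each carry NaN for a different item (the thread shows only three of them). The class and method names (NaNAccumulationSketch, addAll, addSkippingNaN, column) are hypothetical. It shows that a plain element-wise sum ends up NaN for every item once each item has been NaN in at least one column, and what the totals look like if NaN entries are skipped instead. Whether skipping is the right fix for RecommenderJob is not established in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NaNAccumulationSketch {

    /** Plain element-wise sum: every entry, NaN included, is added to the totals. */
    static void addAll(Map<Integer, Double> totals, Map<Integer, Double> column) {
        for (Map.Entry<Integer, Double> e : column.entrySet()) {
            totals.merge(e.getKey(), e.getValue(), Double::sum);
        }
    }

    /** Same sum, but NaN entries are skipped (one possible guard, an assumption). */
    static void addSkippingNaN(Map<Integer, Double> totals, Map<Integer, Double> column) {
        for (Map.Entry<Integer, Double> e : column.entrySet()) {
            if (!Double.isNaN(e.getValue())) {
                totals.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }
    }

    /** Builds a column in which every item has similarity sim, except nanItem, which is NaN. */
    static Map<Integer, Double> column(int[] items, double sim, int nanItem) {
        Map<Integer, Double> col = new LinkedHashMap<>();
        for (int item : items) {
            col.put(item, item == nanItem ? Double.NaN : sim);
        }
        return col;
    }

    public static void main(String[] args) {
        // Item ids and similarity value copied from the simColumn dumps in the thread.
        int[] items = {22966, 81901, 263374, 263375, 263376};
        double sim = 0.9566912651062012;

        Map<Integer, Double> naive = new LinkedHashMap<>();
        Map<Integer, Double> guarded = new LinkedHashMap<>();

        // Assumption: one column per item, each with NaN in that item's own slot.
        for (int nanItem : items) {
            Map<Integer, Double> col = column(items, sim, nanItem);
            addAll(naive, col);
            addSkippingNaN(guarded, col);
        }

        System.out.println("naive:   " + naive);
        System.out.println("guarded: " + guarded);
    }
}
```

The naive totals mirror the all-NaN numerators and denoms reported above, because NaN plus any number is NaN in IEEE 754 arithmetic and Double.sum follows that rule.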