Re: RecommenderJob and NaN

Lance Norskog Thu, 13 Oct 2011 13:12:22 -0700

Is the Apache public download bandwidth donated by Amazon? Or should we try
to keep usage within AWS?


On Thu, Oct 13, 2011 at 3:47 AM, Grant Ingersoll <[email protected]> wrote:

>
> On Oct 13, 2011, at 4:01 AM, Sebastian Schelter wrote:
>
> > Grant,
> >
> > Can you share a few more details about the results? Do you get any
> > exceptions, or do you just get no results?
>
> No results.
>
> >
> > Using the NaNs inside the similarity matrix vectors has been included in
> > the job for a very long time and should not cause any problems. As Sean
> > already mentioned we have unit tests with toy data that should catch the
> > very obvious errors in this code.
>
> Yeah, I don't know what happened.  I know I was getting results as little
> as two weeks ago.  I will try rolling back to an earlier commit.
>
> >
> > Can you share the dataset? I can do a testrun on my research cluster.
>
> I already did, earlier in this thread.  There is a small set via the link
> below, or you can use the ASF email public dataset on Amazon or any subset of
> it.
>
>
> >
> > --sebastian
> >
> > On 13.10.2011 08:37, Sean Owen wrote:
> >> RecommenderJob? The unit tests run it all the time.
> >> There should not be any glitches with static variables -- don't think
> >> there are any.
> >>
> >> On Thu, Oct 13, 2011 at 7:33 AM, Lance Norskog <[email protected]> wrote:
> >>> Is this job working well for anyone now?
> >>> When was the last time this job worked for someone?
> >>>
> >>> On Wed, Oct 12, 2011 at 11:30 AM, Grant Ingersoll <[email protected]> wrote:
> >>>
> >>>> Both local and on EC2
> >>>>
> >>>> On Oct 12, 2011, at 2:10 PM, Ken Krugler wrote:
> >>>>
> >>>>> Hi Grant,
> >>>>>
> >>>>> Just curious, are you running this locally or distributed?
> >>>>>
> >>>>> I'd run into a similar issue, though in a completely different algorithm
> >>>>> (Jimmy Lin's PageRank implementation) due to the use of a static variable.
> >>>>>
> >>>>> When running locally, this wasn't getting cleared between loops, and thus
> >>>>> I got wonky results.
> >>>>>
> >>>>> The same thing would have happened with JVM reuse enabled.
> >>>>>
> >>>>> -- Ken
> >>>>>
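Ken's static-variable scenario can be sketched as follows. This is a hypothetical illustration only, not code from Jimmy Lin's PageRank implementation or from Mahout: the class, field, and method names are invented. It shows how state held in a static field survives from one iteration to the next when every iteration runs in the same JVM, which is what happens in local mode or with JVM reuse enabled.

```java
final class StaticStateSketch {

    // Hypothetical accumulator kept in a static field. It lives as long as
    // the JVM does, so it is shared by every iteration run in that JVM.
    private static double carried = 0.0;

    // Adds the values into the static field without clearing it first,
    // so the result also includes whatever earlier calls left behind.
    static double runIterationWithStaticState(double[] values) {
        for (double v : values) {
            carried += v;
        }
        return carried;
    }

    // Same computation with per-call state; nothing leaks between calls.
    static double runIterationWithLocalState(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        double[] values = {1.0, 2.0};
        // Two "loops" in the same JVM: the second static result is larger
        // than the first even though the input is identical.
        System.out.println(runIterationWithStaticState(values));
        System.out.println(runIterationWithStaticState(values));
        // The local-state version returns the same value both times.
        System.out.println(runIterationWithLocalState(values));
        System.out.println(runIterationWithLocalState(values));
    }
}
```

When each task gets a fresh JVM the static field starts from its initial value every time, which would explain why this kind of problem can show up locally but not on a cluster.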
> >>>>> On Oct 12, 2011, at 3:28pm, Grant Ingersoll wrote:
> >>>>>
> >>>>>> Digging some more:
> >>>>>>
> >>>>>> In AggregateAndRecommend, around line 143, I have, for userId 0, a
> >>>>>> simColumn of:
> >>>>>>
> >>>>>> {22966:0.9566912651062012,81901:0.9566912651062012,263375:0.9566912651062012,263374:0.9566912651062012,263376:NaN}
> >>>>>>
> >>>>>> Which then becomes the numerator and the denom.
> >>>>>>
> >>>>>> Looping, my next simCol is:
> >>>>>>
> >>>>>> {22966:0.9566912651062012,81901:0.9566912651062012,263375:NaN,263374:0.9566912651062012,263376:0.9566912651062012}
> >>>>>>
> >>>>>> and then
> >>>>>>
> >>>>>> {22966:0.9566912651062012,81901:0.9566912651062012,263375:0.9566912651062012,263374:NaN,263376:0.9566912651062012}
> >>>>>>
> >>>>>> ...
> >>>>>>
> >>>>>> Each time, those are getting added into the numerators/denoms value,
> >>>>>> such that by the time we are done looping (line 161), we have:
> >>>>>> numerators: {22966:NaN,81901:NaN,263376:NaN,263375:NaN,263374:NaN}
> >>>>>> denoms: {22966:NaN,81901:NaN,263376:NaN,263375:NaN,263374:NaN}
> >>>>>>
> >>>>>> numberOfSimilarItemsUsed:
> >>>>>> {81901:5.0,22966:5.0,263376:5.0,263375:5.0,263374:5.0}
> >>>>>>
> >>>>>> Not sure how to interpret this, as I haven't dug into the math here
> >>>>>> yet or figured out where those NaNs are coming from originally.
> >>>>>>
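The accumulation Grant describes above can be reproduced outside Mahout. The following is a minimal sketch, not the AggregateAndRecommend code: it uses plain double arrays in place of Mahout vectors, and the class name, method names, and three-item data are made up for illustration. Each hypothetical similarity column carries NaN at a different index, as in the simColumn dumps above, so an element-wise sum leaves every slot NaN, because any floating-point addition involving NaN yields NaN.

```java
import java.util.Arrays;

final class NaNAccumulationSketch {

    // Element-wise sum of the columns. A NaN in any addend makes the
    // corresponding slot of the total NaN (IEEE 754 arithmetic).
    static double[] sum(double[][] columns) {
        double[] total = new double[columns[0].length];
        for (double[] column : columns) {
            for (int i = 0; i < column.length; i++) {
                total[i] += column[i];
            }
        }
        return total;
    }

    // Contrast case: the same sum, ignoring NaN entries.
    // Shown for comparison only; not a proposed fix for RecommenderJob.
    static double[] sumSkippingNaN(double[][] columns) {
        double[] total = new double[columns[0].length];
        for (double[] column : columns) {
            for (int i = 0; i < column.length; i++) {
                if (!Double.isNaN(column[i])) {
                    total[i] += column[i];
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        double s = 0.9566912651062012; // similarity value seen in the thread
        // Hypothetical columns: each has NaN at a different index.
        double[][] simColumns = {
            {Double.NaN, s, s},
            {s, Double.NaN, s},
            {s, s, Double.NaN},
        };
        System.out.println(Arrays.toString(sum(simColumns)));
        System.out.println(Arrays.toString(sumSkippingNaN(simColumns)));
    }
}
```

The NaN-skipping variant is included only for contrast; nothing in this thread establishes that skipping NaN is the right behaviour for the job.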
> >>>>>> On Oct 11, 2011, at 2:55 PM, Grant Ingersoll wrote:
> >>>>>>
> >>>>>>>
> >>>>>>> On Oct 11, 2011, at 2:49 PM, Grant Ingersoll wrote:
> >>>>>>>
> >>>>>>>>
> >>>>>>>> On Oct 11, 2011, at 12:36 PM, Sean Owen wrote:
> >>>>>>>>
> >>>>>>>>> Where is the NaN coming up -- what has this value?
> >>>>>>>>
> >>>>>>>> simColumn seems to be the originator in the Aggregate step.  For
> >>>>>>>> instance, my current breakpoint shows:
> >>>>>>>> {309682:0.9566912651062012,42938:0.9566912651062012,309672:NaN}
> >>>>>>>>
> >>>>>>>> I can also see some in the PartialMultiplyMapper via the
> >>>>>>>> similarityMatrixColumn.
> >>>>>>>>
> >>>>>>>> Is that set by SimilarityMatrixRowWrapperMapper?
> >>>>>>>> <code>
> >>>>>>>> /* remove self similarity */
> >>>>>>>> similarityMatrixRow.set(key.get(), Double.NaN);
> >>>>>>>> </code>
> >>>>>>>
> >>>>>>> Ah, but that is just taking care of itself, so maybe not the issue.
> >>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> It should be propagated in some cases but not others. I'm not aware of
> >>>>>>>>> any changes here.
> >>>>>>>>
> >>>>>>>> yeah, me neither.  This is all related to MAHOUT-798.
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Generally small data sets will have this problem of not being able to
> >>>>>>>>> compute much of anything useful, so NaN might be right here.
> >>>>>>>>> But you say it was different recently, which seems to rule that out.
> >>>>>>>>
> >>>>>>>> I also _believe_ I'm seeing it in a much larger data set on Hadoop;
> >>>>>>>> it's just that that's a whole lot harder to debug.
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Tue, Oct 11, 2011 at 5:34 PM, Grant Ingersoll <[email protected]> wrote:
> >>>>>>>>>> I'm running trunk RecommenderJob (via build-asf-email.sh) and am not
> >>>>>>>>>> getting any recommendations due to NaNs being calculated in the
> >>>>>>>>>> AggregateAndRecommend step.  I'm not quite sure what is going on as it
> >>>>>>>>>> seems like this was working as little as two weeks ago (post Sebastian's
> >>>>>>>>>> big change to RecJob), but I don't see a whole lot of changes in that
> >>>>>>>>>> part of the code.
> >>>>>>>>>>
> >>>>>>>>>> The data is user ids mapping to email thread ids.  My input data is
> >>>>>>>>>> simply a triple of user id, thread id, 1 (meaning that user participated
> >>>>>>>>>> in that thread).  It seems like I will have a lot of good values in the
> >>>>>>>>>> inputs to the AggregateAndRecommend step, except one id will be NaN and
> >>>>>>>>>> this then seems to get added in and makes everything NaN (I realize this
> >>>>>>>>>> is a very naive understanding).  I sense that I should be looking
> >>>>>>>>>> upstream in the process for a fix, but I am not sure where that is.
> >>>>>>>>>>
> >>>>>>>>>> Any ideas where I should be looking to eliminate these NaNs?  If you
> >>>>>>>>>> want to try this with a small data set, you can get it here:
> >>>>>>>>>> http://www.lucidimagination.com/devzone/technical-articles/scaling-mahout
> >>>>>>>>>> (but note the companion article is not published yet.)
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Grant
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>> --------------------------------------------
> >>>>>>> Grant Ingersoll
> >>>>>>> http://www.lucidimagination.com
> >>>>>>> Lucene Eurocon 2011: http://www.lucene-eurocon.com
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>> --------------------------
> >>>>> Ken Krugler
> >>>>> +1 530-210-6378
> >>>>> http://bixolabs.com
> >>>>> custom big data solutions & training
> >>>>> Hadoop, Cascading, Mahout & Solr
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>> --
> >>> Lance Norskog
> >>> [email protected]
> >>>
> >
>
>
>
>
>


-- 
Lance Norskog
[email protected]
