Note that the next version (13df29e4fe97b4370f24d7e91ab5909de76f0f3b) doesn't work.
Debugging now.
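
For anyone following along, here is a minimal sketch of the accumulation behavior described in the quoted messages below. It is not the Mahout code: the class and method names (NanAccumulationSketch, naiveSum, nanSkippingSum) are made up for illustration, plain maps stand in for the Mahout vectors, and skipping NaN entries is only one assumed way of handling them, not a confirmed fix. The three columns are the simColumn values quoted further down.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Mahout.
public class NanAccumulationSketch {

    /** Adds every column into the running totals, element by element, with no NaN filtering. */
    static Map<Long, Double> naiveSum(List<Map<Long, Double>> columns) {
        Map<Long, Double> totals = new LinkedHashMap<>();
        for (Map<Long, Double> column : columns) {
            // NaN + x is NaN, so one NaN entry poisons that item's total for good.
            column.forEach((item, sim) -> totals.merge(item, sim, Double::sum));
        }
        return totals;
    }

    /** Same accumulation, but NaN entries are skipped (an assumed policy, for comparison). */
    static Map<Long, Double> nanSkippingSum(List<Map<Long, Double>> columns) {
        Map<Long, Double> totals = new LinkedHashMap<>();
        for (Map<Long, Double> column : columns) {
            column.forEach((item, sim) -> {
                if (!Double.isNaN(sim)) {
                    totals.merge(item, sim, Double::sum);
                }
            });
        }
        return totals;
    }

    public static void main(String[] args) {
        double s = 0.9566912651062012;
        // The three simColumn values from the debugging session quoted below.
        List<Map<Long, Double>> simColumns = List.of(
            Map.of(22966L, s, 81901L, s, 263375L, s, 263374L, s, 263376L, Double.NaN),
            Map.of(22966L, s, 81901L, s, 263375L, Double.NaN, 263374L, s, 263376L, s),
            Map.of(22966L, s, 81901L, s, 263375L, s, 263374L, Double.NaN, 263376L, s));

        System.out.println("naive:        " + naiveSum(simColumns));
        System.out.println("NaN-skipping: " + nanSkippingSum(simColumns));
    }
}
```

With the naive sum, any item whose entry is NaN in at least one column ends up with a NaN total, which matches the way the numerators and denoms fill up with NaN as more columns are added. The skipping variant keeps finite totals for the same input, but whether that is the right behavior for the job is a separate question.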



On Oct 13, 2011, at 9:31 PM, Grant Ingersoll wrote:

> OK, I can confirm that an earlier version 
> (54300025dbdd6e688a4eb3d043016eb641067c7e in github/lucidimagination/mahout) 
> worked.  Now, to figure out why.
> 
> -Grant
> 
> On Oct 13, 2011, at 4:01 AM, Sebastian Schelter wrote:
> 
>> Grant,
>> 
>> Can you share a few more details about the results? Do you get any
>> exceptions, or do you just get no results?
>> 
>> Using the NaNs inside the similarity matrix vectors has been included in
>> the job for a very long time and should not cause any problems. As Sean
>> already mentioned we have unit tests with toy data that should catch the
>> very obvious errors in this code.
>> 
>> Can you share the dataset? I can do a testrun on my research cluster.
>> 
>> --sebastian
>> 
>> On 13.10.2011 08:37, Sean Owen wrote:
>>> RecommenderJob? The unit tests run it all the time.
>>> There should not be any glitches with static variables -- don't think
>>> there are any.
>>> 
>>> On Thu, Oct 13, 2011 at 7:33 AM, Lance Norskog <goks...@gmail.com> wrote:
>>>> Is this job working well for anyone now?
>>>> When was the last time this job worked for someone?
>>>> 
>>>> On Wed, Oct 12, 2011 at 11:30 AM, Grant Ingersoll <gsing...@apache.org> wrote:
>>>> 
>>>>> Both local and on EC2
>>>>> 
>>>>> On Oct 12, 2011, at 2:10 PM, Ken Krugler wrote:
>>>>> 
>>>>>> Hi Grant,
>>>>>> 
>>>>>> Just curious, are you running this locally or distributed?
>>>>>> 
>>>>>> I'd run into a similar issue, though in a completely different algorithm (Jimmy Lin's PageRank implementation) due to the use of a static variable.
>>>>>> 
>>>>>> When running locally, this wasn't getting cleared between loops, and thus I got wonky results.
>>>>>> 
>>>>>> The same thing would have happened with JVM reuse enabled.
>>>>>> 
>>>>>> -- Ken
>>>>>> 
>>>>>> On Oct 12, 2011, at 3:28pm, Grant Ingersoll wrote:
>>>>>> 
>>>>>>> Digging some more:
>>>>>>> 
>>>>>>> In AggregateAndRecommend, around line 143, I have, for userId 0, a simColumn of:
>>>>>>> 
>>>>>>> {22966:0.9566912651062012,81901:0.9566912651062012,263375:0.9566912651062012,263374:0.9566912651062012,263376:NaN}
>>>>>>> 
>>>>>>> Which then becomes the numerator and the denom.
>>>>>>> 
>>>>>>> Looping, my next simCol is:
>>>>>>> 
>>>>>>> {22966:0.9566912651062012,81901:0.9566912651062012,263375:NaN,263374:0.9566912651062012,263376:0.9566912651062012}
>>>>>>> 
>>>>>>> and then
>>>>>>> 
>>>>>>> {22966:0.9566912651062012,81901:0.9566912651062012,263375:0.9566912651062012,263374:NaN,263376:0.9566912651062012}
>>>>>>> 
>>>>>>> ...
>>>>>>> 
>>>>>>> Each time, those are getting added into the numerators/denoms value, such that by the time we are done looping (line 161), we have:
>>>>>>> numerators: {22966:NaN,81901:NaN,263376:NaN,263375:NaN,263374:NaN}
>>>>>>> denoms: {22966:NaN,81901:NaN,263376:NaN,263375:NaN,263374:NaN}
>>>>>>> 
>>>>>>> numberOfSimilarItemsUsed: {81901:5.0,22966:5.0,263376:5.0,263375:5.0,263374:5.0}
>>>>>>> 
>>>>>>> Not sure on how to interpret this as I haven't dug into the math here yet or figured out where those NaN are coming from originally.
>>>>>>> 
>>>>>>> On Oct 11, 2011, at 2:55 PM, Grant Ingersoll wrote:
>>>>>>> 
>>>>>>>> 
>>>>>>>> On Oct 11, 2011, at 2:49 PM, Grant Ingersoll wrote:
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Oct 11, 2011, at 12:36 PM, Sean Owen wrote:
>>>>>>>>> 
>>>>>>>>>> Where is the NaN coming up -- what has this value?
>>>>>>>>> 
>>>>>>>>> simColumn seems to be the originator in the Aggregate step.  For instance, my current breakpoint shows:
>>>>>>>>> {309682:0.9566912651062012,42938:0.9566912651062012,309672:NaN}
>>>>>>>>> 
>>>>>>>>> I can also see some in the PartialMultiplyMapper via the similarityMatrixColumn.
>>>>>>>>> 
>>>>>>>>> Is that set by SimilarityMatrixRowWrapperMapper?
>>>>>>>>> <code>
>>>>>>>>> /* remove self similarity */
>>>>>>>>> similarityMatrixRow.set(key.get(), Double.NaN);
>>>>>>>>> </code>
>>>>>>>> 
>>>>>>>> Ah, but that is just taking care of itself, so maybe not the issue.
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> It should be propagated in some cases but not others. I'm not aware of
>>>>>>>>>> any changes here.
>>>>>>>>> 
>>>>>>>>> yeah, me neither.  This is all related to MAHOUT-798.
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Generally small data sets will have this problem of not being able to
>>>>>>>>>> compute much of anything useful, so NaN might be right here.
>>>>>>>>>> But you say it was different recently, which seems to rule that out.
>>>>>>>>> 
>>>>>>>>> I also _believe_ I'm seeing it in a much larger data set on Hadoop, it's just that's a whole lot harder to debug.
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Tue, Oct 11, 2011 at 5:34 PM, Grant Ingersoll <gsing...@apache.org> wrote:
>>>>>>>>>>> I'm running trunk RecommenderJob (via build-asf-email.sh) and am not getting any recommendations due to NaNs being calculated in the AggregateAndRecommend step.  I'm not quite sure what is going on as it seems like this was working as recently as two weeks ago (post Sebastian's big change to RecJob), but I don't see a whole lot of changes in that part of the code.
>>>>>>>>>>> 
>>>>>>>>>>> The data is user IDs mapping to email thread IDs.  My input data is simply a triple of user id, thread id, 1 (meaning that user participated in that thread).  It seems like I will have a lot of good values in the inputs to the AggregateAndRecommend step, except one id will be NaN and this then seems to get added in and makes everything NaN (I realize this is a very naive understanding).  I sense that I should be looking upstream in the process for a fix, but I am not sure where that is.
>>>>>>>>>>> 
>>>>>>>>>>> Any ideas where I should be looking to eliminate these NaNs?  If you want to try this with a small data set, you can get it here: http://www.lucidimagination.com/devzone/technical-articles/scaling-mahout (but note the companion article is not published yet.)
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Grant
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> --------------------------------------------
>>>>>>>> Grant Ingersoll
>>>>>>>> http://www.lucidimagination.com
>>>>>>>> Lucene Eurocon 2011: http://www.lucene-eurocon.com
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> --------------------------
>>>>>> Ken Krugler
>>>>>> +1 530-210-6378
>>>>>> http://bixolabs.com
>>>>>> custom big data solutions & training
>>>>>> Hadoop, Cascading, Mahout & Solr
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> Lance Norskog
>>>> goks...@gmail.com
>>>> 
>> 
> 
> 

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com
Lucene Eurocon 2011: http://www.lucene-eurocon.com
