Yeah, that was it. I've added a CardinalityCorrectionMapper to MAHOUT-371,
which does the job.
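
For anyone reading this in the archive, the idea is roughly the sketch below.
It is not the actual MAHOUT-371 patch: SparseVec and withCardinality are
hypothetical stand-ins made up for illustration, not Mahout classes, and the
real job would do this inside a Hadoop Mapper over the sequence files. It
only shows the "same data, new cardinality" step, sharing the inner arrays
instead of copying them, as Jake suggests below.

```java
import java.util.Arrays;

public class CardinalityCorrection {

    /** Hypothetical stand-in for a sequential-access sparse vector (not Mahout's class). */
    static final class SparseVec {
        final int cardinality;  // final, so it cannot be modified after construction
        final int[] indices;    // sorted indices of the non-zero entries
        final double[] values;  // values parallel to indices

        SparseVec(int cardinality, int[] indices, double[] values) {
            this.cardinality = cardinality;
            this.indices = indices;
            this.values = values;
        }

        /** Expands to a dense array; only feasible when the cardinality is realistic. */
        double[] toDense() {
            double[] dense = new double[cardinality];
            for (int i = 0; i < indices.length; i++) {
                dense[indices[i]] = values[i];
            }
            return dense;
        }
    }

    /** Returns a vector with the same data but a new cardinality, sharing the inner arrays. */
    static SparseVec withCardinality(SparseVec v, int cardinality) {
        for (int index : v.indices) {
            if (index >= cardinality) {
                throw new IllegalArgumentException(
                    "index " + index + " does not fit in cardinality " + cardinality);
            }
        }
        // Copy references only; the (final) cardinality is set through the constructor.
        return new SparseVec(cardinality, v.indices, v.values);
    }

    public static void main(String[] args) {
        // Vector as emitted with a placeholder cardinality of Integer.MAX_VALUE.
        SparseVec raw = new SparseVec(Integer.MAX_VALUE, new int[] {1, 4}, new double[] {3.0, 5.0});
        SparseVec fixed = withCardinality(raw, 6);
        System.out.println(fixed.cardinality);                 // prints 6
        System.out.println(Arrays.toString(fixed.toDense()));  // prints [0.0, 3.0, 0.0, 0.0, 5.0, 0.0]
    }
}
```

Calling toDense() on the uncorrected vector would try to allocate an array of
Integer.MAX_VALUE doubles, which is why the dense SVD code needs the
corrected cardinality.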

On 16/07/10 16:40, Richard Simon Just wrote:
> Ah, that makes sense. Dawns on me that Sean pointed to this earlier but
> the implications hadn't clicked with me. Will write this now. Cheers
>
>
> On 16/07/10 15:09, Jake Mannix wrote:
>   
>> Richard,
>>
>>   I think this is because the ToItemPrefMapper and ToUserVectorReducer are
>> spitting out SequentialAccessSparseVectors with cardinality MAX_INTEGER,
>> which doesn't matter in all of the Taste stuff, because no dense vector is
>> ever created with this cardinality.  For SVD, you do need dense vectors, so
>> you really need to make sure the SASVectors have the correct cardinality.
>>
>>   A simple M/R job which runs over the output of the ToXYZPrefMapper/Reducer
>> sequence files, and spits out new Vectors with the correct cardinality but
>> the same data should do the trick.  It may require new constructors for these
>> vectors however, to do it most efficiently (i.e. just copy references to
>> inner data structures, but set a new value for the cardinality - you can't
>> just modify it because it is probably final).
>>
>>   -jake
