> On Aug 4, 2013, at 5:54 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> 
> On Sun, Aug 4, 2013 at 5:34 PM, Pat Ferrel <p...@occamsmachete.com> wrote:
> 
>> Actually this brings up another point that I've harped on before. It sure
>> would be nice to have a vector representation where you could attach
>> arbitrary data to items or vectors. Not so memory efficient but it makes
>> things like ID translation and timestamping actions trivial. If these could
>> be attached and survive all the Mahout jobs there would be no need for the
>> in-memory hashmap I'm using to translate IDs and the actions could be
>> timestamped or other metadata could be attached. At present I guess
>> everyone knows that only weights are attached to actions/matrix values and
>> in some cases names to rows/vectors in DRMs.
>> 
> 
> This is where we started, actually.  The memory cost was fairly massive for
> arbitrary objects being attached to sparse matrices.  The problem is that
> the cost of the annotations isn't amortized very far in long-tail
> situations.
> 
No doubt, but they are optional, so it's fine as long as people understand the
cost. But maybe you are talking about the cost of merely allowing arbitrary
attachments.

> If we restrict our attention to text annotations, then a heavily compressed
> form might well be feasible.
> 
That would be fine with me. If we could do ID strings alone that would be super 
helpful.
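
For context, the in-memory ID translation mentioned above looks roughly like the
sketch below. It is only an illustration, not Mahout code: IdDictionary and its
methods are hypothetical names, and it assumes external IDs are strings that get
mapped to dense int indices before a job and translated back afterwards.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Mahout: maps external string IDs to
// dense int indices (usable as row/column indices) and back again.
final class IdDictionary {
    private final Map<String, Integer> toIndex = new HashMap<>();
    private final List<String> toId = new ArrayList<>();

    // Returns the existing index for id, or assigns the next free one.
    int indexOf(String id) {
        Integer existing = toIndex.get(id);
        if (existing != null) {
            return existing;
        }
        int index = toId.size();
        toIndex.put(id, index);
        toId.add(id);
        return index;
    }

    // Translates an index back to the original external ID.
    // Throws IndexOutOfBoundsException if the index was never assigned.
    String idOf(int index) {
        return toId.get(index);
    }

    // Number of distinct IDs seen so far.
    int size() {
        return toId.size();
    }
}
```

The whole dictionary has to be kept around (or serialized) next to the job
output, which is the overhead that ID strings carried with the vectors would
remove.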
