On Thu, Jun 18, 2009 at 3:39 PM, Jeff Eastman<j...@windwardsolutions.com> wrote:
> Shall I change the method to asWritable()?

I'd just be for getting rid of it. Vector implements Writable, so
asWritable() could just be "return this;", which seems gratuitous.

As for actual efficiency:
   
lucene/mahout/trunk/core/src/main/java/org/apache/mahout/clustering/meanshift/MeanShiftCanopy.java

is currently dumping output values as text strings. If there's a
standard dataset, that would be an easy place to do the test.
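
A rough sketch of what such a size comparison could look like, using
only the JDK. Everything in it is an assumption for illustration: the
class and method names are made up, the text layout is a generic
JSON-like index:value map rather than Mahout's actual format string,
and the binary layout (an int length followed by raw doubles) only
approximates what a Writable implementation might write.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Random;

// Hypothetical helper, not part of Mahout: compares the encoded size of a
// dense vector as text versus as raw binary.
public class EncodingSizeSketch {

    // Assumed binary layout: a 4-byte length, then each double as 8 raw bytes.
    public static byte[] toBinary(double[] values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.length);
            for (double v : values) {
                out.writeDouble(v);
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Assumed text layout: {"0":1.0,"1":2.0,...} with index and value as strings.
    public static byte[] toText(double[] values) {
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"').append(i).append("\":").append(values[i]);
        }
        return sb.append('}').toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Random values stand in for a real dataset.
        double[] values = new Random(42).doubles(1000).toArray();
        int textSize = toText(values).length;
        int binarySize = toBinary(values).length;
        System.out.printf("text: %d bytes, binary: %d bytes, ratio: %.2f%n",
                textSize, binarySize, (double) textSize / binarySize);
    }
}
```

The ratio will depend heavily on the data; values that print as short
decimals compress the text side, so a real dataset is the better test.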

- David

> I don't know of any situations where Vectors are used as keys. It hardly
> makes sense to use them as they are so unwieldy. Suggest we could change to
> just Writable and be ahead. In terms of the potential density improvement,
> it will be interesting to see what can typically be achieved.
>
> r786323 just removed all calls to asWritableComparable, replacing them with
> asFormatString which was correct anyway.
>
> Jeff
>
> David Hall wrote:
>>
>> How often does Mahout need the "Comparable" part for Vectors? Are
>> vectors commonly used as map output keys?
>>
>> In terms of space efficiency, I'd bet it's probably a bit better than
>> a factor of two in the average case, especially for dense vectors. The
>> gson format is storing both the int index and the double as raw
>> strings, plus whatever boundary characters.  The writable
>> implementation stores just the bytes of the double, plus a length.
>>
>> -- David
>>
>> On Thu, Jun 18, 2009 at 2:13 PM, Jeff Eastman<j...@windwardsolutions.com>
>> wrote:
>>
>>>
>>> +1 asWritableComparable is a simple implementation that uses
>>> asFormatString.
>>> It would be good to rewrite it for internal communication. A factor of
>>> two
>>> is still a factor of two.
>>>
>>> Jeff
>>>
>>>
>>> Grant Ingersoll wrote:
>>>
>>>>
>>>> On Jun 18, 2009, at 4:45 PM, Ted Dunning wrote:
>>>>
>>>>
>>>>>
>>>>> Writable should be plenty!
>>>>>
>>>>>
>>>>
>>>> +1.  Still nice to have JSON for user facing though.
>>>>
>>>>
>>>>>
>>>>> On Thu, Jun 18, 2009 at 1:15 PM, David Hall <d...@cs.stanford.edu>
>>>>> wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> See my followup on another thread (sorry for the schizophrenic
>>>>>> posting); Vector already implements Writable, so that's all I really
>>>>>> can ask of it. Is there something more you'd like? I'd be happy to do
>>>>>> it.
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>
