On Thu, Jun 18, 2009 at 3:39 PM, Jeff Eastman<j...@windwardsolutions.com> wrote:
> Shall I change the method to asWritable()?

I'd just be for getting rid of it. Vector implements Writable, so
asWritable() could just be "return this;", which seems gratuitous.

As for actual efficiency:
lucene/mahout/trunk/core/src/main/java/org/apache/mahout/clustering/meanshift/MeanShiftCanopy.java
is currently dumping output values as the text strings. If there's a
standard dataset, that would be an easy place to do the test.

- David

> I don't know of any situations where Vectors are used as keys. It hardly
> makes sense to use them as they are so unwieldy. Suggest we could change to
> just Writable and be ahead. In terms of the potential density improvement,
> it will be interesting to see what can typically be achieved.
>
> r786323 just removed all calls to asWritableComparable, replacing them with
> asFormatString which was correct anyway.
>
> Jeff
>
> David Hall wrote:
>>
>> How often does Mahout need the "Comparable" part for Vectors? Are
>> vectors commonly used as map output keys?
>>
>> In terms of space efficiency, I'd bet it's probably a bit better than
>> a factor of two in the average case, especially for densevectors. The
>> gson format is storing both the int index and the double as raw
>> strings, plus whatever boundary characters. The writable
>> implementation stores just the bytes of the double, plus a length.
>>
>> -- David
>>
>> On Thu, Jun 18, 2009 at 2:13 PM, Jeff Eastman<j...@windwardsolutions.com>
>> wrote:
>>
>>> +1 asWritableComparable is a simple implementation that uses
>>> asFormatString.
>>> It would be good to rewrite it for internal communication. A factor of
>>> two is still a factor of two.
>>>
>>> Jeff
>>>
>>> Grant Ingersoll wrote:
>>>
>>>> On Jun 18, 2009, at 4:45 PM, Ted Dunning wrote:
>>>>
>>>>> Writable should be plenty!
>>>>
>>>> +1. Still nice to have JSON for user facing though.
>>>>
>>>>> On Thu, Jun 18, 2009 at 1:15 PM, David Hall <d...@cs.stanford.edu>
>>>>> wrote:
>>>>>
>>>>>> See my followup on another thread (sorry for the schizophrenic
>>>>>> posting); Vector already implements Writable, so that's all I really
>>>>>> can ask of it. Is there something more you'd like? I'd be happy to do
>>>>>> it.
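
For anyone wanting a quick feel for the space-efficiency estimate above (index and value stored as text versus raw double bytes plus a length), here is a rough sketch. It does not use Mahout or Hadoop: textEncode() is a made-up stand-in for an index/value text format, not the real asFormatString() output, and binaryEncode() only imitates what a Writable-style write(DataOutput) might do using plain java.io. The class and method names are hypothetical, and the ratio it prints depends on the data, so treat it as an illustration rather than a measurement.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Random;

public class VectorEncodingSize {

    /** Hypothetical text form: {"0":1.0,"1":2.5}. Index and value are both strings. */
    static byte[] textEncode(double[] values) {
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"').append(i).append("\":").append(values[i]);
        }
        sb.append('}');
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Binary form for a dense vector: a 4-byte length, then 8 bytes per double. */
    static byte[] binaryEncode(double[] values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.length);
            for (double v : values) {
                out.writeDouble(v);
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Synthetic dense vector; a real test would use an actual dataset.
        Random rng = new Random(42);
        double[] values = new double[1000];
        for (int i = 0; i < values.length; i++) {
            values[i] = rng.nextGaussian();
        }
        int textSize = textEncode(values).length;
        int binarySize = binaryEncode(values).length;
        System.out.printf("text: %d bytes, binary: %d bytes, ratio: %.2f%n",
                textSize, binarySize, (double) textSize / binarySize);
    }
}
```

The binary size is fixed at 4 + 8 * n bytes for a dense vector of n values, while the text size grows with the number of digits printed for each index and double, which is where the gap comes from.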