NamedVectorWritable already extends VectorWritable, though honestly I
don't like that and kept it to minimize disruption.

Serialized vector formats aren't exactly "polymorphic". I can't read
an X vector with the code intended to deserialize something that
extends X. So, really, the Writables shouldn't inherit from one
another.
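
To make that concrete, here is a minimal sketch in plain java.io. The
two layouts and the class name FormatMismatch are made up for
illustration and are not Mahout's actual wire format: the "plain"
layout is a count followed by the values, and the "named" layout puts
a name in front of it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical layouts for illustration only; not Mahout's actual wire format.
class FormatMismatch {

  // Plain layout for a vector "X": element count, then the values.
  static byte[] writePlain(double[] values) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      out.writeInt(values.length);
      for (double v : values) {
        out.writeDouble(v);
      }
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Reader for the plain layout.
  static double[] readPlain(DataInputStream in) throws IOException {
    double[] values = new double[in.readInt()];
    for (int i = 0; i < values.length; i++) {
      values[i] = in.readDouble();
    }
    return values;
  }

  // Reader for the "extends X" layout: a name first, then the plain layout.
  static double[] readNamed(DataInputStream in) throws IOException {
    in.readUTF(); // expects a name that plain data never wrote
    return readPlain(in);
  }

  // True only if the plain reader gets the original values back.
  static boolean plainReaderRecovers(double[] values) {
    try {
      byte[] data = writePlain(values);
      return Arrays.equals(values, readPlain(new DataInputStream(new ByteArrayInputStream(data))));
    } catch (IOException e) {
      return false;
    }
  }

  // True only if the named reader gets the original values back from plain bytes.
  static boolean namedReaderRecovers(double[] values) {
    try {
      byte[] data = writePlain(values);
      return Arrays.equals(values, readNamed(new DataInputStream(new ByteArrayInputStream(data))));
    } catch (IOException | RuntimeException e) {
      return false; // the bytes were misread: bad size or premature end of stream
    }
  }

  public static void main(String[] args) {
    double[] values = {1.0, 2.0};
    System.out.println("plain reader recovers: " + plainReaderRecovers(values));
    System.out.println("named reader recovers: " + namedReaderRecovers(values));
  }
}
```

The subclass reader consumes bytes that belong to the count and the
values, so what follows is garbage or an early end of stream. In Java
terms the subclass "is a" superclass, but on disk the two layouts are
simply different.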

VectorWritable is like a meta-format. It writes the class name for
XVector, then serializes with class XVectorWritable. That works for
any vector, which makes it a good default choice.

However, that's a serious storage overhead: it writes the class name
with every instance. For example, I don't use VectorWritable in my most
recent rewrite of co-occurrence based recommenders; I use the
Writable for my vector format directly, since it saved lots of I/O.
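
A rough sketch of where the overhead comes from, again with plain
java.io. The layout, the class ClassNameOverhead, and the name
com.example.XVector are all assumptions for illustration; I'm also
assuming the class name is written as a UTF string, which may not be
exactly what VectorWritable does.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical layouts for illustration only; not Mahout's actual wire format.
class ClassNameOverhead {

  // Assumed payload layout: element count, then the values.
  static void writePayload(DataOutputStream out, double[] values) throws IOException {
    out.writeInt(values.length);
    for (double v : values) {
      out.writeDouble(v);
    }
  }

  // Direct format: just the payload.
  static byte[] writeDirect(double[] values) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      writePayload(new DataOutputStream(bytes), values);
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Meta-format: the class name in front of every instance, then the payload.
  static byte[] writeWithClassName(String className, double[] values) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      out.writeUTF(className);
      writePayload(out, values);
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Extra bytes paid per record for carrying the class name.
  static int overheadBytes(String className, double[] values) {
    return writeWithClassName(className, values).length - writeDirect(values).length;
  }

  public static void main(String[] args) {
    double[] values = {0.5, 1.5, 2.5};
    String className = "com.example.XVector"; // hypothetical class name
    System.out.println("direct bytes:   " + writeDirect(values).length);
    System.out.println("overhead bytes: " + overheadBytes(className, values));
  }
}
```

With writeUTF the cost is two length bytes plus one byte per ASCII
character of the name, paid on every record. For short or sparse
vectors that can be a sizeable fraction of the payload.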

So while VectorWritable exists as a nice default, I don't think it's a
great idea to use in practice. Its generality comes at a price.

Erm, now I've lost track. What was the question? Is it moot, or already answered?



On Sat, Apr 24, 2010 at 8:12 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> Put in other words, this would mean that there is either one or two output
> formats but most importantly only one input format that would always read
> NamedVectorWritables, possibly by inserting default names.  Due to
> inheritance, those objects would serve both purposes.
>
> That sounds good and simple.
>
> On Sat, Apr 24, 2010 at 11:31 AM, Robin Anil <robin.a...@gmail.com> wrote:
>
>> > For algorithms that are accepting arguments of a particular type, it
>> > might be reasonable to let NVW extend VW (I am not at all sure about the
>> > unintended consequences of this, but it sounds plausible).   Then all we
>> > need is a facade that exposes an NVW interface for a wrapped
>> > VectorWritable with some kind of default labels (say the indexes as
>> > strings).
>> >
>> Or the other way around. Let everything be a NamedVectorWritable. during
>> deserializing use explicit methods to use or skip the name
>
