Dmitriy,

  You should be able to specify that your matrices be stored in
SequentialAccessSparseVector format if you need to.  This is
almost always the right thing for HDFS-backed matrices, because
HDFS is write-once, and SASVectors are optimized for read-only
sequential access, which is your exact use case, right?
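
  To make the layout idea concrete, here is a minimal sketch of what a
sequential-access sparse vector could look like. It is not Mahout's
actual SequentialAccessSparseVector: the class SeqSparseSketch and its
methods are hypothetical, and it assumes indices kept sorted in one
array with values in a parallel array.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a sequential-access sparse vector.
// This is NOT Mahout's implementation; all names here are illustrative.
final class SeqSparseSketch {
    final int[] indices;   // non-zero positions, sorted ascending
    final double[] values; // values[i] is the entry at indices[i]

    SeqSparseSketch(int[] indices, double[] values) {
        this.indices = indices;
        this.values = values;
    }

    // Build from an unordered index -> value map, which is roughly what a
    // hash-backed (random-access) vector holds. Sorting happens once here.
    static SeqSparseSketch fromUnordered(Map<Integer, Double> entries) {
        int[] idx = entries.keySet().stream()
                .mapToInt(Integer::intValue)
                .sorted()
                .toArray();
        double[] vals = new double[idx.length];
        for (int i = 0; i < idx.length; i++) {
            vals[i] = entries.get(idx[i]);
        }
        return new SeqSparseSketch(idx, vals);
    }

    // Random lookup costs a binary search; missing entries read as zero.
    double get(int index) {
        int pos = Arrays.binarySearch(indices, index);
        return pos >= 0 ? values[pos] : 0.0;
    }

    // Dot product with a dense array: one forward pass over the non-zeros,
    // visiting them in increasing index order.
    double dot(double[] dense) {
        double sum = 0.0;
        for (int i = 0; i < indices.length; i++) {
            if (indices[i] < dense.length) {
                sum += values[i] * dense[indices[i]];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<Integer, Double> m = new HashMap<>();
        m.put(7, 2.0);
        m.put(1, 3.0);
        m.put(4, 5.0);
        SeqSparseSketch v = fromUnordered(m);
        System.out.println(Arrays.toString(v.indices));
        System.out.println(v.dot(new double[] {0, 1, 0, 0, 2, 0, 0, 3}));
    }
}
```

  The trade-off in this sketch is that the entries are sorted once at
build time, after which they can be streamed in index order, while a
single random lookup costs a binary search rather than a hash probe.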

  -jake

On Mon, Dec 13, 2010 at 4:21 PM, Dmitriy Lyubimov <[email protected]> wrote:

> I don't think sequentiality is a requirement in the case I am working on.
> However, let me peek at the code first. I am guessing it is some form of a
> near-perfect hash, in which case it may not be possible to read it in parts
> at all. Which would be bad, indeed. I would then need to find a completely
> different input format to handle my case.
>
> On Mon, Dec 13, 2010 at 4:01 PM, Ted Dunning <[email protected]>
> wrote:
>
> > I don't think that sequentiality is part of the contract.
> > RandomAccessSparseVectors are likely to produce disordered values
> > when serialized, I think.
> >
> > On Mon, Dec 13, 2010 at 1:48 PM, Dmitriy Lyubimov <[email protected]>
> > wrote:
> >
> > > I will have to look at the details of VectorWritable to make sure all
> > > cases are covered (I have only taken a very brief look so far). But as
> > > long as it is able to produce elements in order of increasing index, the
> > > push technique will certainly work for most algorithms (and in some
> > > cases, notably with SSVD, it would work even if it produces the data in
> > > a non-sequential way).
> > >
> >
>
