Hey Dmitriy,

  I've also been playing around with a VectorWritable format which is
backed by a SequenceFile, but I've been focussed on the case where it's
essentially the entire matrix, and the rows don't fit into memory.  This
seems different than your current use case, however - you just want
(relatively) small vectors to load faster, right?

  -jake

On Mon, Dec 13, 2010 at 10:18 AM, Ted Dunning <[email protected]> wrote:

> Interesting idea.
>
> Would this introduce a new vector type that only allows iterating through
> the elements once?
>
> On Mon, Dec 13, 2010 at 9:49 AM, Dmitriy Lyubimov <[email protected]>
> wrote:
>
> > Hi all,
> >
> > I would like to submit a patch to VectorWritable that allows for
> > streaming access to vector elements without having to prebuffer all of
> > them first. (The current code only allows prebuffering.)
> >
> > That patch would eliminate one of the memory usage issues in the
> > current Stochastic SVD implementation and effectively lift the memory
> > bound on n in the SVD work. (The value I see is not in lifting the
> > bound, though, but in being more efficient in memory use, thus
> > essentially speeding up the computation.)
> >
> > If it's OK, I would like to create a JIRA issue and provide a patch
> > for it.
> >
> > Another issue is to provide an SSVD patch that depends on that patch for
> > VectorWritable.
> >
> > Thank you.
> > -Dmitriy
> >
>
