Nope, I haven't.

I simply thought that would make accessing and setting individual cells
more difficult.

Should I try it? Do you think it would perform better? I would also like to
hear about any other design choices you have in mind.
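
A rough sketch of the trade-off behind that concern, in plain Java with
in-memory maps standing in for HBase tables. This is not the actual code:
the class and method names are hypothetical, and the dense array-of-doubles
encoding is only an assumption about what "binary serialized vector" could
look like.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-ins for two storage layouts (no HBase calls). */
class MatrixLayoutSketch {
    private final int numCols;

    // Layout A: row index -> (column index -> value), one stored entry per cell.
    private final Map<Integer, Map<Integer, Double>> cellPerColumn = new HashMap<>();

    // Layout B: row index -> whole row serialized into a single byte array.
    private final Map<Integer, byte[]> rowAsBlob = new HashMap<>();

    MatrixLayoutSketch(int numCols) {
        this.numCols = numCols;
    }

    // Layout A: a cell write touches only that cell.
    void setCellA(int row, int col, double value) {
        cellPerColumn.computeIfAbsent(row, r -> new HashMap<>()).put(col, value);
    }

    // Layout A: missing cells read as 0.0.
    double getCellA(int row, int col) {
        Map<Integer, Double> cells = cellPerColumn.get(row);
        return cells == null ? 0.0 : cells.getOrDefault(col, 0.0);
    }

    // Layout B: a cell write is read-modify-write of the whole row.
    // Returns the number of bytes written back, to make the cost visible.
    int setCellB(int row, int col, double value) {
        double[] values = getRowB(row);
        values[col] = value;
        byte[] bytes = serialize(values);
        rowAsBlob.put(row, bytes);
        return bytes.length;
    }

    // Layout B: reading a full row is a single lookup plus deserialization.
    double[] getRowB(int row) {
        byte[] stored = rowAsBlob.get(row);
        return stored == null ? new double[numCols] : deserialize(stored);
    }

    // Assumed encoding: the row's doubles written back to back, 8 bytes each.
    private static byte[] serialize(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES);
        for (double v : values) {
            buf.putDouble(v);
        }
        return buf.array();
    }

    private static double[] deserialize(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        double[] values = new double[bytes.length / Double.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = buf.getDouble();
        }
        return values;
    }
}
```

In this sketch a single-cell write in layout B rewrites the entire row,
while a full-row read is one lookup; layout A is the reverse. Whether that
carries over to real HBase access patterns would need measuring.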


On Wed, May 8, 2013 at 12:22 AM, Ted Dunning <[email protected]> wrote:

> Have you experimented with, for instance, row number as id, value as binary
> serialized vector?
>
>
>
>
> On Tue, May 7, 2013 at 2:16 PM, Gokhan Capan <[email protected]> wrote:
>
> > 2 options:
> >
> > 1- row index as the row key, column index as column identifier, and value
> > as value
> > 2- row index and column index combined as the row key, and value in a
> > column called "value"
> >
> > Row indices are kept in a member variable in memory, to make iteration
> > fast.
> >
> >
> >
> > On Wed, May 8, 2013 at 12:11 AM, Ted Dunning <[email protected]> wrote:
> >
> > > How did you store the matrix in HBase?
> > >
> > >
> > > On Tue, May 7, 2013 at 1:08 PM, Gokhan Capan <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > For taking large matrices as input and persisting large models (like
> > > > factor models), I created an HBase-backed version of Mahout matrix.
> > > >
> > > > It allows random access to cells and rows as well as assignment, and
> > > > iteration over rows. viewRow returns a view, and lazy loads actual data
> > > > if a get is actually invoked.
> > > >
> > > > I plan to add a VectorInputFormat on top of it, too.
> > > >
> > > > The code that we need to have for our algorithms is tested, but there
> > > > are still parts of it that are not.
> > > >
> > > > I am going to speak about this at HBaseCon, and I wanted to let you
> > > > know that it can be contributed after some refactoring.
> > > >
> > > > Is there any interest?
> > > >
> > > > --
> > > > Gokhan
> > > >
> > >
> >
> >
> >
> > --
> > Gokhan
> >
>



-- 
Gokhan
