On Wed, Jun 18, 2014 at 11:47 AM, Anand Avati <[email protected]> wrote:

> Supporting Int and Long keys is easy; both should be working shortly.
> String is tricky, as H2O stores only numbers. One suggestion has been to
> break up the string into bytes and store them as separate columns (and
> re-assemble them on demand). I'll look into String support after finishing
> the operators.
>
> How important are the String row keys for the algorithms themselves? Would it
> grossly mess up a workflow if Strings are silently discarded by the
> backend?
>

Like I said, seq2sparse produces them, and postprocessing for stuff like
LSA pipelines would not work without them.
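
For illustration only, here is a minimal sketch of the "break the string into bytes and store them as separate columns" idea quoted above. It is not code from the H2O backend or from Mahout. The helper names (encode_keys, decode_key), the leading length column, and the zero padding are all assumptions made for the example; the only thing taken from the thread is that the store holds nothing but numbers.

```python
from typing import List


def encode_keys(keys: List[str]) -> List[List[float]]:
    """Hypothetical helper: turn each string key into a fixed-width numeric row.

    Assumed layout per row: [byte_length, b0, b1, ..., zero padding].
    """
    raw = [k.encode("utf-8") for k in keys]
    # All rows share one width so they fit a rectangular set of numeric columns.
    width = max((len(r) for r in raw), default=0)
    rows = []
    for r in raw:
        padded = r + b"\x00" * (width - len(r))
        rows.append([float(len(r))] + [float(b) for b in padded])
    return rows


def decode_key(row: List[float]) -> str:
    """Hypothetical helper: re-assemble one string key from its numeric row."""
    n = int(row[0])  # the first column holds the real byte length
    return bytes(int(v) for v in row[1:1 + n]).decode("utf-8")


keys = ["doc-1", "naïve", ""]
rows = encode_keys(keys)
restored = [decode_key(r) for r in rows]
```

The explicit length column is what lets an empty key, or a key containing a zero byte, survive the round trip. The cost is one numeric column per byte of the longest key.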


>
>
> On Wed, Jun 18, 2014 at 10:58 AM, Dmitriy Lyubimov <[email protected]>
> wrote:
>
> > Supporting Int and String keys is perhaps the minimum set (Long is welcome,
> > but a second-class citizen).
> >
> > Support for DrmLike[Int] is required for a lot of things (e.g.
> > Transpose). DrmLike[String] is used in outputs of popular vectorizations
> > in Mahout such as seq2sparse.
> >
> >
> > On Tue, Jun 17, 2014 at 5:22 PM, Anand Avati <[email protected]> wrote:
> >
> > > Still incomplete; not everything works yet. But lots of progress, and the
> > > end is in sight.
> > >
> > > - Development happening at
> > > https://github.com/avati/mahout/commits/MAHOUT-1500. Note that I'm still
> > > doing lots of commit --amend and git push --force as this is my private
> > > tree.
> > >
> > > - Ground level build issues and classloader incompatibilities fixed.
> > >
> > > - Can load a matrix into H2O either from in core (through drmParallelize())
> > > or HDFS (parser does not support seqfile yet)
> > >
> > > - Only Long type support for Row Keys so far.
> > >
> > > - mapBlock() works. This was the trickiest; other ops seem trivial in
> > > comparison.
> > >
> > > Everything else yet to be done. However, I will be putting in more time into
> > > this over the coming days (was working less than part time on this so far).
> > >
> > > Questions/comments welcome.
> > >
> >
>
