On Wed, Jun 18, 2014 at 11:47 AM, Anand Avati <[email protected]> wrote:
> Supporting Int and Long keys are easy, both should be working shortly.
> String is tricky, as H2O stores only numbers. One suggestion has been to
> break up the string into bytes and store them as separate columns (and
> re-assemble them on demand). I'll look into String support after finishing
> the operators.
>
> How important are the String row keys for the algorithms itself? Would it
> grossly mess up a workflow if Strings are silently discarded by the
> backend?

Like I said, seq2sparse produces them, and postprocessing for stuff like
LSA pipelines would not work.

> On Wed, Jun 18, 2014 at 10:58 AM, Dmitriy Lyubimov <[email protected]>
> wrote:
>
> > Supporting Int and String keys are perhaps minimum set (Long is welcome,
> > but a second-class citizen)
> >
> > supporting of DrmLike[Int] is required for a lot of things (e.g.
> > Transpose). DrmLike[String] is used in outputs of popular vectorizations
> > in Mahout such as seq2sparse.
> >
> > On Tue, Jun 17, 2014 at 5:22 PM, Anand Avati <[email protected]> wrote:
> >
> > > Still incomplete, everything does NOT work. But lots of progress and
> > > end is in sight.
> > >
> > > - Development happening at
> > > https://github.com/avati/mahout/commits/MAHOUT-1500. Note that I'm
> > > still doing lots of commit --amend and git push --force as this is my
> > > private tree.
> > >
> > > - Ground level build issues and classloader incompatibilities fixed.
> > >
> > > - Can load a matrix into H2O either from in core (through
> > > drmParallelize()) or HDFS (parser does not support seqfile yet)
> > >
> > > - Only Long type support for Row Keys so far.
> > >
> > > - mapBlock() works. This was the trickiest, other ops seem trivial in
> > > comparison.
> > >
> > > Everything else yet to be done. However I will be putting in more time
> > > into this over the coming days (was working less than part time on this
> > > so far.)
> > >
> > > Questions/comments welcome.
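
For reference, the byte-splitting suggestion quoted above (break the String
key into bytes, store them as separate numeric columns, re-assemble on
demand) could look roughly like the sketch below. This is only an
illustration under my own assumptions: the class and method names
(StringKeyColumns, encode, decode) are hypothetical and are not part of H2O
or Mahout, the fixed column width and the leading length cell are choices I
made so that zero padding can be told apart from real bytes, and UTF-8 is
assumed as the encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not an H2O or Mahout API: shows one way a String row
// key could be spread over a fixed number of numeric cells and read back.
final class StringKeyColumns {

    // Cell 0 holds the byte length of the key; cells 1..width hold the
    // UTF-8 bytes as unsigned values (0..255), zero-padded on the right.
    public static double[] encode(String key, int width) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        if (bytes.length > width) {
            throw new IllegalArgumentException(
                "key needs " + bytes.length + " bytes, but width is " + width);
        }
        double[] cells = new double[width + 1];
        cells[0] = bytes.length;
        for (int i = 0; i < bytes.length; i++) {
            cells[i + 1] = bytes[i] & 0xFF; // store as unsigned byte value
        }
        return cells;
    }

    // Re-assemble the key on demand from the numeric cells.
    public static String decode(double[] cells) {
        int length = (int) cells[0];
        byte[] bytes = new byte[length];
        for (int i = 0; i < length; i++) {
            bytes[i] = (byte) (int) cells[i + 1];
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        double[] cells = encode("doc-42", 8);
        System.out.println(Arrays.toString(cells));
        System.out.println(decode(cells));
    }
}
```

The obvious cost is that the widest key dictates the number of extra
columns for every row, so keys like the document ids coming out of
seq2sparse could make this expensive. It would at least keep the keys
available for the postprocessing steps instead of silently dropping them.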
