Looks cool, I'm already starting to play with it.
On Friday, October 4, 2013, Makoto Yui <yuin...@gmail.com> wrote:
> Hi Dean,
>
> Thank you for your interest in Hivemall.
>
> Twitter's paper actually influenced me in developing Hivemall and I
> initially implemented such functionality as Pig UDFs.
>
> Though my Pig ML library is not released, you can find a similar
> attempt for Pig in
> https://github.com/y-tag/java-pig-MyUDFs
>
> Thanks,
> Makoto
>
> 2013/10/3 Dean Wampler <deanwamp...@gmail.com>:
>> This is great news! I know that Twitter has done something similar with UDFs
>> for Pig, as described in this paper:
>> http://www.umiacs.umd.edu/~jimmylin/publications/Lin_Kolcz_SIGMOD2012.pdf
>>
>> I'm glad to see the same thing start with Hive.
>>
>> Dean
>>
>>
>> On Wed, Oct 2, 2013 at 10:21 AM, Makoto YUI <yuin...@gmail.com> wrote:
>>>
>>> Hello all,
>>>
>>> My employer, AIST, has given the thumbs up to open source our machine
>>> learning library, named Hivemall.
>>>
>>> Hivemall is a scalable machine learning library running on Hive/Hadoop,
>>> licensed under the LGPL 2.1.
>>>
>>> https://github.com/myui/hivemall
>>>
>>> Hivemall provides machine learning functionality as well as feature
>>> engineering functions through UDFs/UDAFs/UDTFs of Hive. It is designed
>>> to be scalable to the number of training instances as well as the number
>>> of training features.
>>>
>>> Hivemall is very easy to use as every machine learning step is done
>>> within HiveQL.
>>>
>>> -- Installation is just as follows:
>>> add jar /tmp/hivemall.jar;
>>> source /tmp/define-all.hive;
>>>
>>> -- Logistic regression is performed by a query.
>>> SELECT
>>>   feature,
>>>   avg(weight) as weight
>>> FROM
>>>   (SELECT logress(features,label) as (feature,weight) FROM
>>>    training_features) t
>>> GROUP BY feature;
>>>
>>> You can find detailed examples on our wiki pages.
>>> https://github.com/myui/hivemall/wiki/_pages
>>>
>>> Though we consider that Hivemall is much easier to use and more scalable
>>> than Mahout for classification/regression tasks, please check it by
>>> yourself. If you have a Hive environment, you can evaluate Hivemall
>>> within 5 minutes or so.
>>>
>>> Hope you enjoy the release! Feedback (and pull request) is always welcome.
>>>
>>> Thank you,
>>> Makoto
>>
>>
>>
>>
>> --
>> Dean Wampler, Ph.D.
>> @deanwampler
>> http://polyglotprogramming.com
>