Hi,

>> Please feel free to contribute documentation to the Apache Hama wiki[1]!
OK. I am new to the open source world, so I am still learning the procedure. Whenever I find something missing, I will edit it.

>> We also maybe work together on it but I have no idea yet. Custom “Modern” or
>> “Classic” Style? Maven website again?
OK. I do not quite understand what you mean by Modern or Classic style. Does Apache provide some kind of CMS to manage the hosted project websites?

>> ADMM is quite interesting, and it looks like more fit into BSP than MapReduce
>> (even if HBase(?) or memory-based shared storage is used).
Yes, ADMM seems to be a natural fit for the BSP model because ADMM algorithms are iterative. In each iteration, different machines process and exchange data, and the algorithm keeps running until a convergence criterion is met. See Chapter 10 (page 78) of the following ADMM paper:
https://web.stanford.edu/~boyd/papers/pdf/admm_distr_stats.pdf
It discusses the implementation details of ADMM on big data systems.

>> But I don't fully understand
My understanding is also limited, but if the cost function of an ML algorithm is convex, then it can be converted to ADMM form. Once it is in ADMM form, we can run it on a distributed system like Hama.

>> and so don't know whether it can be used as abstraction layer of
>> many ML algorithms. We'll need more investigation.
Yes, more investigation is needed. Here are a few ML algorithms already in ADMM form (a, b, c):

a) L1 Linear Regression - https://www.dtc.umn.edu/s/resources/tsp2010oct-dlasso.pdf
b) L2-Logistic Regression - https://intentmedia.github.io/assets/2013-10-09-presenting-at-ieee-big-data/pld_js_ieee_bigdata_2013_admm.pdf
c) SVM - http://www.jmlr.org/papers/volume11/forero10a/forero10a.pdf

Regards,
Behroz Sikander

On Fri, Jun 5, 2015 at 3:19 AM, Edward J. Yoon <[email protected]> wrote:

> Please feel free to contribute documentation to the Apache Hama wiki[1]!
> Ultimately, I'm considering improving our official website[2] on HAMA-960. We
> also maybe work together on it but I have no idea yet. Custom “Modern” or
> “Classic” Style? Maven website again?
>
> ADMM is quite interesting, and it looks like more fit into BSP than MapReduce
> (even if HBase(?) or memory-based shared storage is used). But I don't fully
> understand and so don't know whether it can be used as abstraction layer of
> many ML algorithms. We'll need more investigation.
>
> 1. https://wiki.apache.org/hama
> 2. https://hama.apache.org/
>
> --
> Best Regards, Edward J. Yoon
>
> -----Original Message-----
> From: Behroz Sikander [mailto:[email protected]]
> Sent: Thursday, June 04, 2015 10:24 PM
> To: [email protected]
> Subject: Re: [DISCUSS] Things I'd like to focus on next
>
> Hi,
> +1.
> Yes documentation needs improvement. I also saw that a book on Hama is also
> under progress. I can help with the documentation. I only found the
> following open issue https://issues.apache.org/jira/browse/HAMA-960.
>
> Something like MLBase or Mahout on top of Hama would be really nice and
> will boost the project. Regarding machine learning algorithms can we use
> ADMM(a) to implement the algorithms ?
> Like https://issues.apache.org/jira/browse/SPARK-1543
>
> a) https://web.stanford.edu/~boyd/papers/pdf/admm_distr_stats.pdf
>
> Regards,
> Behroz Sikander
>
> On Wed, Jun 3, 2015 at 9:48 AM, Edward J. Yoon <[email protected]>
> wrote:
>
> > Hey,
> >
> > Here's few things I'd like to focus on next.
> >
> > 1. Add stream input format for listening messages coming from 3rd
> > party applications, and incremental learning algorithms.
> > 2. Improve reliability of system e.g., fault tolerance, HA, ..., etc.
> > 3. More machine learning algorithms, such as ensemble classifier, SVM,
> > DNN, ..., etc
> >
> > Do you have any other suggestions?
> >
> > Thanks!
> >
> > --
> > Best Regards, Edward J. Yoon
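
P.S. To make the point about ADMM and BSP a bit more concrete, here is a minimal sketch of global-consensus ADMM on a toy scalar problem: minimize sum_i 0.5 * (x - a_i)^2 + lam * |x|, where each a_i stands in for the data held by one peer. It is a single-process Python simulation, not Hama code; the function names (soft_threshold, consensus_admm) and the parameters (lam, rho, tol, max_iters) are my own assumptions for illustration. Each loop iteration is meant to mirror one BSP superstep: local computation, then message exchange and a barrier, then a convergence check.

```python
import math


def soft_threshold(v, kappa):
    """Proximal operator of kappa * |x|: shrink v toward zero by kappa."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0


def consensus_admm(local_data, lam=0.0, rho=1.0, tol=1e-8, max_iters=1000):
    """Minimize sum_i 0.5 * (x - a_i)^2 + lam * |x| with consensus ADMM.

    Hypothetical toy setup: each a_i in local_data plays the role of the
    data owned by one peer; x[i] and u[i] are that peer's local state.
    """
    n = len(local_data)
    x = [0.0] * n   # local primal variables, one per simulated peer
    u = [0.0] * n   # scaled dual variables, one per simulated peer
    z = 0.0         # global consensus variable
    for _ in range(max_iters):
        # Local computation: each peer solves its own small subproblem
        # (closed form here because the local cost is quadratic).
        x = [(a + rho * (z - ui)) / (1.0 + rho)
             for a, ui in zip(local_data, u)]
        # Message exchange + barrier: peers' values are averaged and the
        # regularizer is applied to the shared variable.
        z_old = z
        avg = sum(xi + ui for xi, ui in zip(x, u)) / n
        z = soft_threshold(avg, lam / (rho * n))
        # Local dual update on each peer.
        u = [ui + xi - z for xi, ui in zip(x, u)]
        # Convergence criterion: primal and dual residuals both small.
        primal = math.sqrt(sum((xi - z) ** 2 for xi in x))
        dual = rho * math.sqrt(n) * abs(z - z_old)
        if primal < tol and dual < tol:
            break
    return z
```

For example, consensus_admm([1.0, 2.0, 3.0, 6.0], lam=2.0) should settle near 2.5, which is the mean of the data (3.0) shrunk by lam / n = 0.5. In a real Hama job the averaging step would be done with messages between peers followed by a sync, and the local subproblem would be the per-partition part of a lasso, logistic regression, or SVM objective rather than this scalar toy.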
