Re: Features by engine page
That was meant to be normal, of course, not uniform.

On Tue, Aug 26, 2014 at 10:39 AM, Dmitriy Lyubimov wrote:
> scala, if you want) to write something like `new
> MultivariateUniformDistribution(mu,sigma).sample()`, so i really just dsl-
Re: Features by engine page
On distributions, I did not find anything multivariate that is Mahout Matrix-based. Hopefully I just did not look well enough. Everything univariate seems to be pretty spotty. Aside from that, I need Scala traits, plus I find it extremely inelegant (un-Scala, if you want) to write something like `new MultivariateUniformDistribution(mu,sigma).sample()`, so I really just DSL-bridged for the most part. There are enough third-party choices not to bother with filling the gaps.

On step-recorded evolutionary search: after my literature search on the topic, it doesn't look like even a distant third-best choice, in particular under big data training settings.

First, I did not find head-to-head comparisons of it with any of the top choices. It is not included in the Amplab survey of top search choices. GP-EI is Netflix's choice, for example. So there's very little convincing data to go on to begin with, and given the lack of such comparisons, the next best thing is to copy what others do here.

Second, under big data settings, every data point (training) is precious. In Spark specifically, unlike MR, since we want to retain as much data in RAM as possible and avoid spills, best performance is usually achieved by sequentially semaphoring trainings rather than throwing a whole bunch of them out at once. Especially under circumstances where companies are extremely anemic in provisioning the hardware needed, for whatever reason. In that sense, exploration algorithms that are capable of making better inference after each new data point, and arrive at a reasonably performing model in ~20..30 sequential trains, are far preferable to those that require a whole bunch of trainings to happen before they begin to figure out the next centroid of trials. I am not even sure if step-recorded search was ever tried outside SGD, where data points are abundant albeit incomplete.
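As a rough illustration of why GP-EI fits one-training-at-a-time search, here is a minimal Python sketch of the expected-improvement score for a minimization problem. It assumes a Gaussian-process posterior mean and standard deviation are already available for each candidate setting; the names `expected_improvement` and `pick_next` and the candidate numbers are hypothetical and are not taken from Mahout or from the framework described above.

```python
import math


def expected_improvement(mu, sigma, best):
    """Expected improvement over `best` (lower is better) for a candidate
    whose predicted loss is Normal(mu, sigma**2)."""
    if sigma <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf


def pick_next(candidates, best):
    """Return the name of the candidate with the highest expected improvement.

    `candidates` maps a name to a hypothetical (mu, sigma) posterior pair."""
    def score(name):
        mu, sigma = candidates[name]
        return expected_improvement(mu, sigma, best)
    return max(candidates, key=score)
```

After every completed training the posterior is refit and `pick_next` is called again, so each new data point immediately shapes the next trial. Note that a candidate with a slightly worse mean but much larger uncertainty can still win, which is how the score trades exploitation against exploration.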
On Tue, Aug 26, 2014 at 8:32 AM, Ted Dunning wrote:
> On Mon, Aug 25, 2014 at 2:40 PM, Dmitriy Lyubimov wrote:
>
> > This work is obviously also interesting in that it
> > establishes probabilistic framework in Mahout (distributions & gaussian
> > process).
>
> We already have that.
>
> (distributions not GP)
>
> Note that we also have an implementation of recorded step evolutionary
> programming that works really well for hyper-parameter search. I don't
> like the way that the API turned out (too hard to understand).
Re: Features by engine page
On Mon, Aug 25, 2014 at 2:40 PM, Dmitriy Lyubimov wrote:
> This work is obviously also interesting in that it
> establishes probabilistic framework in Mahout (distributions & gaussian
> process).

We already have that. (distributions not GP)

Note that we also have an implementation of recorded step evolutionary programming that works really well for hyper-parameter search. I don't like the way that the API turned out (too hard to understand).
RE: Features by engine page
Right, sorry, I was looking at Weighted Matrix Factorization. I meant that I added "Matrix Factorization with ALS on Implicit Feedback" as "in progress".

> Date: Mon, 25 Aug 2014 15:27:39 -0700
> Subject: Re: Features by engine page
> From: dlie...@gmail.com
> To: dev@mahout.apache.org
>
> On Mon, Aug 25, 2014 at 3:23 PM, Andrew Palumbo wrote:
>
> > Thanks Dmitriy,
> >
> > I've added in SSVD, PCA, QR and Weighted ALS.
>
> I think it is called "regularized ALS"
>
> > To keep it simple, I'll leave them under Spark for right now. (and add
> > "in development" for h2o) since they're in and passing tests.
> >
> > Should I add:
>
> no
>
> > GP-EI
> > BFGS
> >
> > as "in development"
> >
> > bigram co-occurrence (would this be collocations?)
> >
> > as "in development" for spark?
Re: Features by engine page
On Mon, Aug 25, 2014 at 3:23 PM, Andrew Palumbo wrote:

> Thanks Dmitriy,
>
> I've added in SSVD, PCA, QR and Weighted ALS.

I think it is called "regularized ALS"

> To keep it simple, I'll leave them under Spark for right now. (and add
> "in development" for h2o) since they're in and passing tests.
>
> Should I add:

no

> GP-EI
> BFGS
>
> as "in development"
>
> bigram co-occurrence (would this be collocations?)
>
> as "in development" for spark?
RE: Features by engine page
Thanks Dmitriy,

I've added in SSVD, PCA, QR and Weighted ALS.

To keep it simple, I'll leave them under Spark for right now (and add "in development" for h2o), since they're in and passing tests.

Should I add:

GP-EI
BFGS

as "in development"

bigram co-occurrence (would this be collocations?)

as "in development" for spark?

> Date: Mon, 25 Aug 2014 14:40:57 -0700
> Subject: Re: Features by engine page
> From: dlie...@gmail.com
> To: dev@mahout.apache.org
>
> yes SSVD and stochastic PCA as well as thin QR are re-cast in Mahout
> algebra (meaning they are engine-independent, not just spark).
>
> So is regularized ALS (albeit perhaps somewhat naive and thus affecting
> performance).
>
> I also had quasi algebraic implicit feedback ALS (which is in fact implicit
> feedback paper and ALS-WR in the same bottle) but closed the issue due to
> lack of reviews and interest.
>
> Internally I also have framework for doing hyper parameter searches and
> right now am closing on GP-EI which will probably benefit from some
> additions doing estimates chosen by reducing uncertainty (attempts to get
> out of local minimum projected by GP-EI Snoek's algorithm itself). I hope i
> could open it one day. This work is obviously also interesting in that it
> establishes probabilistic framework in Mahout (distributions & gaussian
> process). GP stuff can be also used to evaluate things like RLFM i think.
>
> I also have framework to do line search type of things, including big
> datasets, per Nocedal and Wright, including BFGS, those are probably also
> candidates for contribution. Or not, depending on the moods of my new boss.
>
> Of other interesting things that are done with DSL and may be considered
> for contribution, I also have implementations for bigram co-occurrence
> (both directed and undirected) made in the DSL but it is also
> quasi-algebraic i think (meaning there are Spark-specific parts). This
> (I think) would also include truthful implementation of Surprise &
> Coincidence's paper bigram problem (currently implemented in Mahout MR) but
> also would estimate undirected co-occurrences (as a frequent itemsets
> problem solver/replacement). Again, hopeful it may be contributed, but not
> sure if i'll pursue that if there's lack of interest in my company. It's
> hard to go against the wind, in a way.
>
> By far the most often missing piece is data prep of course, but i think i
> can eventually contribute a couple tutorials of how to do vectorization
> using SparkQL stuff.
>
> -d
Re: Features by engine page
Yes, SSVD and stochastic PCA as well as thin QR are re-cast in Mahout algebra (meaning they are engine-independent, not just Spark).

So is regularized ALS (albeit perhaps somewhat naive, and thus affecting performance).

I also had a quasi-algebraic implicit feedback ALS (which is in fact the implicit feedback paper and ALS-WR in the same bottle) but closed the issue due to lack of reviews and interest.

Internally I also have a framework for doing hyperparameter searches, and right now I am closing in on GP-EI, which will probably benefit from some additions doing estimates chosen by reducing uncertainty (attempts to get out of a local minimum projected by the GP-EI Snoek's algorithm itself). I hope I could open it one day. This work is obviously also interesting in that it establishes a probabilistic framework in Mahout (distributions & Gaussian process). GP stuff can also be used to evaluate things like RLFM, I think.

I also have a framework to do line search type of things, including on big datasets, per Nocedal and Wright, including BFGS; those are probably also candidates for contribution. Or not, depending on the moods of my new boss.

Of other interesting things that are done with the DSL and may be considered for contribution, I also have implementations for bigram co-occurrence (both directed and undirected) made in the DSL, but they are also quasi-algebraic, I think (meaning there are Spark-specific parts). This (I think) would also include a truthful implementation of the Surprise & Coincidence paper's bigram problem (currently implemented in Mahout MR) but would also estimate undirected co-occurrences (as a frequent itemsets problem solver/replacement). Again, hopefully it may be contributed, but I am not sure if I'll pursue that if there's lack of interest in my company. It's hard to go against the wind, in a way.

By far the most often missing piece is data prep of course, but I think I can eventually contribute a couple of tutorials on how to do vectorization using SparkQL stuff.
-d

On Mon, Aug 25, 2014 at 2:19 PM, Pat Ferrel wrote:
> Spark RSJ, MAHOUT-1604 is in development
>
> I thought SSVD with PCA was working on Spark.
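The Surprise & Coincidence approach to bigrams mentioned in this message scores a word pair with a log-likelihood ratio over a 2x2 contingency table of counts. A minimal Python sketch of that statistic follows, written from the standard entropy form of the formula rather than from the Mahout MR code or the DSL implementation described above; the function names are hypothetical.

```python
import math


def x_log_x(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized Shannon entropy of a set of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 table of counts.

    k11 = A and B together, k12 = A without B,
    k21 = B without A,      k22 = neither A nor B."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Guard against tiny negative values from floating-point round-off.
    return max(0.0, 2.0 * (row + col - mat))
```

A table whose rows and columns are independent scores close to zero, while pairs that co-occur far more often than chance score high. For a directed bigram, A and B would be "word at position i" and "word at position i+1"; the undirected case only changes how the four counts are collected, not the score.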
RE: Features by engine page
Thanks Pat,

I added in Row Similarity. Should we keep that under "Miscellaneous"?

I'll add in everything in the Decomposition suite under Spark for now.

> Subject: Re: Features by engine page
> From: pat.fer...@gmail.com
> Date: Mon, 25 Aug 2014 14:19:25 -0700
> To: dev@mahout.apache.org
>
> Spark RSJ, MAHOUT-1604 is in development
>
> I thought SSVD with PCA was working on Spark.
Re: Features by engine page
Spark RSJ, MAHOUT-1604 is in development

I thought SSVD with PCA was working on Spark.

On Aug 25, 2014, at 2:15 PM, Dmitriy Lyubimov wrote:

this is super-cool to hear.
Re: Features by engine page
this is super-cool to hear.

On Mon, Aug 25, 2014 at 1:53 PM, Till Rohrmann wrote:
> Hi Andrew,
>
> I like the overview of the different algorithms. The Flink bindings are
> still under development. We hope to finish them in the next couple of
> weeks.
>
> Best regards,
>
> Till
RE: Features by engine page
Thank you Till,

I will add Flink in as "In Development"

Andy

> Date: Mon, 25 Aug 2014 22:53:22 +0200
> Subject: Re: Features by engine page
> From: trohrm...@apache.org
> To: dev@mahout.apache.org
>
> Hi Andrew,
>
> I like the overview of the different algorithms. The Flink bindings are
> still under development. We hope to finish them in the next couple of weeks.
>
> Best regards,
>
> Till
Re: Features by engine page
Hi Andrew,

I like the overview of the different algorithms. The Flink bindings are still under development. We hope to finish them in the next couple of weeks.

Best regards,

Till

On Mon, Aug 25, 2014 at 9:17 PM, Andrew Palumbo wrote:
> I created a "Features by Engine" table from the Mahout "List of
> Algorithms" page which I'd like to add to the Mahout site once it looks
> good:
>
> https://andrewpalumbo.github.io/algorithms_by_engine
>
> I just copied over the current page, and added in some of the stuff that I
> know is complete/in the works. I wasn't sure about some of the
> Collaborative filtering stuff.
>
> Maybe the whole thing needs to be organized differently? A separate
> totally abstract section for algorithms that will be sitting in math-scala
> and then a section for each engine's implementation?
>
> Also I know that there's been some work done on Flink bindings, but I
> don't see a specific Jira. Should I put Flink down as "In development"?
>
> Any thoughts are appreciated.
Features by engine page
I created a "Features by Engine" table from the Mahout "List of Algorithms" page which I'd like to add to the Mahout site once it looks good:

https://andrewpalumbo.github.io/algorithms_by_engine

I just copied over the current page, and added in some of the stuff that I know is complete/in the works. I wasn't sure about some of the Collaborative filtering stuff.

Maybe the whole thing needs to be organized differently? A separate, totally abstract section for algorithms that will be sitting in math-scala, and then a section for each engine's implementation?

Also, I know that there's been some work done on Flink bindings, but I don't see a specific Jira. Should I put Flink down as "In development"?

Any thoughts are appreciated.