> > (b) Mahout provides in-core support for matrices and, perhaps, data
> > frames, to run both in front and back as needed.
>
> Sounds good. And in accord with the 0xdata proposal.

Is there a written form of that proposal touching on concrete architectural
details? It would help to head off (erroneous) assumptions stemming from
your original post.
> > (d) Such an environment is also algorithmically sound (i.e. it has to
> > be a clean and performant functional programming environment,
> > preferably supporting scripting as well, not just some sort of
> > domain-specific language such as SQL).
>
> The first sentence is good. I am not clear that a full-scale functional
> programming environment is necessary in order to support linear algebra.
> I agree that SQL isn't going to help us much.

It means that if we want to write non-algebra code, we don't have to fight
quirks like S4 objects in R. A combination of the two: a language and an
environment. Criticism of R as a language is well known.

> > (e) In its linalg aspects it is damn close to R or an existing
> > environment (since we are trying to push a new thing onto the same
> > crowd that is accustomed to R-type things).
>
> I think that a corollary here is that there be some buy-in from the
> existing community outside of Mahout. Initial positive reception by this
> community is an indication that this is working.

Yes. That's the thing: pull in people who don't know distributed stuff. It
has already become a bit of a stuck record that "the intersection of
people who know how to fit a model and people who know distributed cluster
computing is very close to zero". This is exactly the situation I face
every day at my office.

> > So, why doesn't e.g. MLI quite fit this vision?
>
> The biggest problem here is that the MLI community isn't offering to
> join forces with us to help out.

Well... it's kind of irrelevant who does what; I am just talking from a
vision perspective here, about how the MLI concept is deficient. It is
deficient because (perhaps most obviously) it doesn't offer the necessary
abstraction to plug in backends, compared to the vision I am laying out.

> > a: Tightly coupled with Spark. No coherent in-core/out-of-core linalg
> > support.
>
> Ah... well that does seem to be a sticking point as well.
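As a side note on why a general-purpose host language suffices here:
operator overloading alone buys an R-like algebraic surface without a
parser or custom semantic trees. A toy sketch (in Python purely for
illustration; the `Matrix` class is hypothetical, not Mahout code):

```python
# Toy sketch of the "DSL inside a general-purpose language" idea.
# Hypothetical Matrix class, not Mahout code: operator overloading in the
# host language gives R-like algebraic expressions with no custom parser.

class Matrix:
    """Minimal dense matrix backed by nested lists."""
    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    @property
    def t(self):
        # Transpose, so Gram-matrix expressions read naturally.
        return Matrix(list(zip(*self.rows)))

    def __matmul__(self, other):
        # Matrix product via the host language's own operator hook.
        cols = list(zip(*other.rows))
        return Matrix([[sum(a * b for a, b in zip(row, col)) for col in cols]
                       for row in self.rows])

a = Matrix([[1, 2], [3, 4]])
gram = a.t @ a          # reads like linear algebra, but is plain host code
print(gram.rows)        # [[10, 14], [14, 20]]
```

In Scala the product operator could just as well be spelled `%*%`, so
expressions read like the R idiom while remaining ordinary compiled code.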
> > But MLI goes to show that people are moving along these lines these
> > days (there are more such projects breeding). Without these steps
> > Mahout will not escape its major criticism: just a library of rigidly
> > built algorithms. Hard to use. Hard to develop on top of. Hard to
> > customize. Hard to validate.
>
> Without which steps?

(a)-(e). In short: being an ML environment first and a set of algorithms
second. (Somewhere in here the performance issue is buried, but let's not
compare with MR-based iterative stuff, shall we? That's why we want to
plug in new backends in particular.)

> > There are a few items to consider as possible developer stories.
> >
> > (1) A Scala DSL fits all the requirements nicely. No parsers, no
> > semantic trees; a mixed environment of a strong functional language
> > and DSL capabilities; if so needed, an interactive shell/script engine
> > (including on-the-fly compilation of classes into byte code, so there
> > aren't even cost-of-iteration penalties here!)
>
> This sounds great. The only thing lacking (right now) is the ability to
> do byte code transformations to get really high performance.

Well, the byte code is regular Java byte code; it does what it is told to
do. What I mean is that Scala scripting can compile a script into a class
and plug it into the JVM, so that it runs as compiled rather than
interpreted code. But that's a minor road-map feature. I am just arguing
why Scala fits the profile here: scripting, performant, functional, an
out-of-the-box DSL as opposed to the necessity of building custom semantic
trees, etc., etc.

> > (2) In-core performance (if it is even a concern). The matrix
> > abstraction can evolve to include JBlas- and GPU-based data sets. In
> > terms of performance, the latest conference papers on the GPU approach
> > demonstrate that GPU-stored mutable data sets will blow the socks off
> > anything written against the CPU and RAM bus.
>
> GPUs have been hobbled until recently by poor main-memory bandwidth.
> For problems that can fit inside GPU memory (possibly multiplied by
> parallelism) performance can be very good.

Yes. That goes back to the mutable operand. Since it can live for a
considerable time span in a pipeline without any attempt to dissolve it
into a persisted form, it is very easy to imagine pipelines running on
data that lives in GPU memory without ever being siphoned off.

> Nothing prevents a system like h2o from taking advantage of GPUs. It
> already matches and often exceeds the speed of JBlas. Having a
> persistent parallel representation for GPU-resident data sets would be
> very interesting for appropriate problems.

I am just saying that the main argument for the merger, the performance,
can easily be surpassed, at least in some categories, as things stand.
Nothing prevents Mahout either; but as things stand, the performance of
vector compression on RAM-stored data may well be overrated.

> GPUs, however, show very different characteristics with highly sparse
> problems. There, the amount of arithmetic per memory operation is
> dramatically lower than in problems like deep learning, and so GPUs
> provide considerably less advantage.

Yes, not 100%. But definitely for matrix block operations, which is
exactly what the backing partitions of the current optimizer are.

> > ...
> >
> > (3) Every multinode system (even allreduce) incurs serialized I/O. So
> > yes, *maybe* our matrices could use better compression -- although I
> > am dubious about that if the cost-based switch to sparse algebra is
> > properly applied in the optimizer. So there may be valuable
> > contributions here, but it is not an architecture-changing thing.
>
> Not sure of the point here.

The point is that on-the-wire compression is important, and all
distributed systems incur I/O, which is likely to be more of a bottleneck
than any in-memory operation -- thus offsetting those concerns even
further.
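To put a rough number on the on-the-wire point, here is a
back-of-the-envelope sketch (Python; the sizes are illustrative
assumptions, not Mahout's actual serialization format) of what a sparse
encoding saves over a dense one for a highly sparse row:

```python
# Back-of-the-envelope sketch (illustrative assumptions, not Mahout's real
# wire format): serialized footprint of a dense row vs. a sparse
# (index, value) encoding for a 1,000,000-column vector with 0.1% non-zeros.

n = 1_000_000          # columns
nnz = 1_000            # non-zero entries (0.1% density)

dense_bytes = n * 8                 # 8 bytes per double
sparse_bytes = nnz * (4 + 8)        # 4-byte int index + 8-byte double each

print(dense_bytes // sparse_bytes)  # 666 -- roughly a 666-fold reduction
```

At that density the wire footprint shrinks by well over two orders of
magnitude, which is why the serialized representation tends to dominate
long before in-memory arithmetic does.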
In the SSVD benchmark, the shuffle I/O of the matrix multiplication in the
power-iteration section actually was/is the heaviest part.

> > (4) A couple of days of work to throw in Stratosphere primitives.
>
> Likewise. If the Stratosphere community would like to step up to help
> with this, I would champion that contribution as well.

They don't have to. That's the beauty of the vision: since the abstraction
is flexible, we don't have to go to a community for this support (just
like we don't have to go to Spark now).

> I think that this is a straw man argument. I am not proposing a full
> scale merger at this point. I am proposing that 0xdata be encouraged to
> contribute a Mahout binding and implementation for a number of key
> algorithms.

As mentioned, looking forward to reading the exact proposal.

-d
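P.S. To make the "flexible abstraction / pluggable backend" point
concrete, here is a toy sketch (Python, with hypothetical names; nothing
like this is implied to exist in Mahout or MLI today). The logical
operation A'A is written once against an abstract engine; each backend
only supplies the per-partition physical plan:

```python
# Toy sketch of the "pluggable backend" idea (hypothetical names only).
# The logical algebra targets an abstract interface; Spark, Stratosphere,
# etc. would each supply a concrete engine, and user code never changes.

from abc import ABC, abstractmethod

class Engine(ABC):
    """Abstract execution backend for distributed matrix ops."""
    @abstractmethod
    def ata(self, blocks):
        """Compute A'A given A as an iterable of row blocks."""

def gram(block):
    # B'B for one row block (plain-Python dense product).
    cols = list(zip(*block))
    return [[sum(a * b for a, b in zip(u, v)) for v in cols] for u in cols]

class LocalEngine(Engine):
    # Stand-in for a real distributed engine: A'A = sum over blocks of B'B,
    # the same per-partition decomposition a cluster backend would ship.
    def ata(self, blocks):
        partials = [gram(b) for b in blocks]
        n = len(partials[0])
        return [[sum(p[i][j] for p in partials) for j in range(n)]
                for i in range(n)]

# Two row blocks of a 4x2 matrix A.
a_blocks = [[[1.0, 0.0], [0.0, 1.0]],
            [[1.0, 1.0], [2.0, 0.0]]]
print(LocalEngine().ata(a_blocks))   # [[6.0, 1.0], [1.0, 2.0]]
```

Swapping `LocalEngine` for a cluster-backed engine would leave the calling
code untouched, which is exactly the property MLI's tight Spark coupling
gives up.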
