as long as tests run, i don't care about H2O. our methods don't have a real published benchmark either (which is what it really needs).
i think we need a good name encompassing not just algebra, but also the
entire scope of base capabilities in an environment, like what is called
R-base in R. This includes basic stats, too. btw i've already done stat
bindings for colt stuff 2 times already. it is not a big deal (especially
if it is just a bridge to something else extra common like apache-math)

On Tue, Mar 17, 2015 at 9:14 AM, Pat Ferrel <p...@occamsmachete.com> wrote:

> Yeah we need a real name that brings no baggage. R-like, based on scala, big
> data linear algebra yada yada. Can’t say that in a descriptive phrase so why
> not a name like Mahout-xyz? Of course with a more catchy search friendly xyz
>
> But AP’s structure seems pretty good
>
> I’m nervous releasing H2O with no one supporting it. Is anyone signing up for
> that?
>
>
> On Mar 17, 2015, at 8:59 AM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
>
> I don't like the term dsl.
>
> It is algebraic optimizer, folks. Calling it dsl brings in wrong and too
> trivial ideas about it.
> On Mar 17, 2015 8:27 AM, "Andrew Palumbo" <ap....@outlook.com> wrote:
>
>>
>> On 03/15/2015 01:42 PM, Pat Ferrel wrote:
>>
>>> Lots of discussion off the record about doing a release but shouldn’t we
>>> plan this?
>>>
>>> What has to be in a release of Mahout 0.10?
>>>
>>> Seems like we could release as-is but it would be nice to have some of
>>> the already completed work that isn’t committed yet:
>>> * mrlegacy refactored out of scala, is it possible to get this in Dmitriy?
>>>
>>> One question is how to package, with which version of Spark. There is a
>>> bug in Spark 1.2.1 and I think in 1.2 (this is the big distro build) that
>>> requires any class that uses the JavaSerializer to set a specific SparkConf
>>> key/value to point to the guava jar on all workers. This only affects
>>> IndexedDatasets since they use Guava’s BiMap. Rumor has it that 1.3 fixes
>>> this but I haven’t tried it yet.
>>>
>>> So we are currently stuck on 1.1.1 but could document how to work around
>>> to use 1.2 for a user who wants to build Mahout from scratch. A user
>>> source build on 1.3 may not require a work around. We seem to be good on
>>> hadoop 2.x, which in itself is a good reason to release since 0.9 was not.
>>>
>>> What else needs to be done:
>>> * rename module math-scala to core?
>>> * create the distribution build. Currently this does not publish the
>>> scaladocs and does not create artifacts for H2O or and Scala.
>>>
>>
>> same problem for javadocs (other than mrlegacy). Is this a question for
>> INFRA? We have MAHOUT-1562 <https://issues.apache.org/jira/browse/MAHOUT-1562>
>> and MAHOUT-1585 <https://issues.apache.org/jira/browse/MAHOUT-1585>
>> open for these. Were javadocs for all modules ever hosted? there were
>> links for them which are now dead so I removed them from the site. I'm
>> wondering because even once we get the scaladocs published in the build
>> will we have the same problem of them not being hosted.
>>
>>> * is H2O really in a form to publish?
>>
>> In terms of scala bindings for the DRM and DSL Linear algebra operations,
>> solvers, etc., H2O should be good to go with the exception of one bug
>> (MAHOUT-1638 <https://issues.apache.org/jira/browse/MAHOUT-1638>). It
>> passes (almost) all math-scala tests. We have no other algorithms (outside
>> of math-scala solvers, decompositions, etc) for H2O. I'm not sure if it's
>> being used or how much real world testing it's had; it does serve at the
>> very least as a proof of concept for the Engine Neutral DSL.
>>
>>
>>> Docs
>>> * IMO we should name the Mahout Spark-Scala DSL and shell. More unique
>>> names are easier to find in searches. Maybe Suneel can polish off his
>>> sanskrit and suggest something.
>>> * we should be ready to do some work here to restructure the CMS since it
>>> is very 0.9 centric with Scala stuff almost an afterthought.
>>>
>>
>> Agreed.
>> What about categorizing the Documentation on the site under tabs
>> like "Mahout-DSL" "Mahout Spark-Environment" and "Mahout Map-Reduce"?