I was thinking 0.10.0 in mid-April, with a 0.10.1 update at the end of spring. I would suggest feature extraction topics for 0.11.x, especially w.r.t. SchemaRDD (aka DataFrame): vectorizing, hashing, ML schema support, imputation of missing data, outlier cleanup, etc. There's a lot.
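To make two of the feature extraction items above concrete (hashing and imputation of missing data), here is a minimal sketch in plain Python. It is only an illustration: `hash_features` and `impute_mean` are hypothetical names, not Mahout, Spark or MLlib API. The use of CRC32 as the hash and the mean as the fill value are assumptions chosen for brevity.

```python
import zlib


def hash_features(tokens, num_buckets=16):
    """Hashing trick: map string tokens to a fixed-length count vector.

    CRC32 is used because it is deterministic across runs (an assumption
    made for this sketch; any stable hash would do).
    """
    vec = [0.0] * num_buckets
    for tok in tokens:
        idx = zlib.crc32(tok.encode("utf-8")) % num_buckets
        vec[idx] += 1.0  # colliding tokens simply share a bucket
    return vec


def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        # Assumption: with nothing observed, fall back to 0.0.
        return [0.0 for _ in column]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


if __name__ == "__main__":
    print(hash_features(["red", "blue", "red"], num_buckets=8))
    print(impute_mean([1.0, None, 3.0]))
```

The output vector length depends only on `num_buckets`, not on the vocabulary size, which is what makes hashing attractive when the set of categorical values is not known up front.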
Hardware backend integration: I will certainly be looking at those, but perhaps the easiest is to start with automatic detection and configuration of capabilities via netlib, since it is already in the path and it seems likely that it will (eventually) support CUDA as well in some form. This is for 0.11 or 0.12.x, depending on availability.
Higher order methods are somewhat a matter of inspiration. I think I could offer some stuff there too, as I have already implemented a lot of those on top of Mahout before. On Mahout algebra I did Bayesian optimization (aka "spearmint", GP-EI etc.), line search, (L-)BFGS, and stats including Gaussian Process support. BFGS and line search are fairly simple methods, and I will give a reference if anybody is interested. Breeze also has line search with strong Wolfe conditions (if a coded reference is needed). All of that is up for grabs as a fairly well understood subject. (5-6 months out)
Once GP-EI is available, it becomes a fairly interesting topic to resurrect the implicit feedback issue. The important insight there is that feature encoding can in fact be done by a custom scheme (not necessarily the encoding scheme used in the paper; in fact, there are 2 of them there; nor the way MLlib encodes it). Once custom encoding schemes are adjusted, using Bayesian optimization becomes increasingly important, especially if there are more than just 2 parameters.
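For readers unfamiliar with the "EI" part of GP-EI, the following is a minimal sketch of the expected improvement acquisition function in plain Python. It assumes a Gaussian Process has already been fitted elsewhere and supplies a posterior mean `mu` and standard deviation `sigma` at each candidate point; the GP itself is not shown. The function names, the candidate tuple layout and the minimization convention are assumptions of this sketch, not the implementation referred to above.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best_so_far):
    """Expected improvement over best_so_far when minimizing.

    mu, sigma: assumed GP posterior mean and standard deviation at a point.
    """
    if sigma <= 0.0:
        # No uncertainty left: the improvement is deterministic.
        return max(best_so_far - mu, 0.0)
    z = (best_so_far - mu) / sigma
    return (best_so_far - mu) * normal_cdf(z) + sigma * normal_pdf(z)


def pick_next(candidates, best_so_far):
    """candidates: list of (params, mu, sigma); return params with highest EI."""
    best = max(candidates,
               key=lambda c: expected_improvement(c[1], c[2], best_so_far))
    return best[0]


if __name__ == "__main__":
    # Hypothetical hyperparameter candidates with made-up posterior values.
    cands = [("rank=10", 1.0, 0.1), ("rank=20", 0.5, 0.1), ("rank=40", 0.9, 0.6)]
    print(pick_next(cands, best_so_far=1.0))
```

Because each evaluation of the objective (here, a full model fit) is expensive, the acquisition function is what decides which parameter combination to try next. That is why this approach pays off more as the number of tuned parameters grows beyond two.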