Also +1

> On Jan 23, 2015, at 18:38, Andrew Palumbo <ap....@outlook.com> wrote:
> 
> +1
> 
> -------- Original message --------
> From: Dmitriy Lyubimov <dlie...@gmail.com>
> Date: 01/23/2015 6:06 PM (GMT-05:00)
> To: dev@mahout.apache.org
> Subject: Codebase refactoring proposal
>
> So right now mahout-spark depends on mr-legacy.
> I did a quick refactoring and it turns out it only _irrevocably_ depends on
> the following classes there:
> 
> MatrixWritable, VectorWritable, Varint/Varlong and VarintWritable, and ...
> *sigh* o.a.m.common.Pair
> 
> So I just dropped those five classes into a new tiny mahout-hadoop
> module (to signify stuff that is directly relevant to serializing things to
> DFS API) and completely removed mrlegacy and its transients from spark and
> spark-shell dependencies.
> 
> So non-cli applications (shell scripts and embedded api use) actually only
> need spark dependencies (which come from SPARK_HOME classpath, of course)
> and mahout jars (mahout-spark, mahout-math(-scala), mahout-hadoop and
> optionally mahout-spark-shell (for running shell)).
> 
> This of course still doesn't address driver problems that want to throw
> more stuff into front-end classpath (such as cli parser) but at least it
> renders transitive luggage of mr-legacy (and the size of worker-shipped
> jars) much more tolerable.
> 
> How does that sound?
