And in case anyone wonders: yes, the shell starts and runs the test script
totally fine without the mr-legacy dependency on the classpath (startup
script modified to use mahout-hadoop instead) -- both in local and
distributed (standalone) mode:

================================================

$ MASTER=spark://localhost:7077 bin/mahout spark-shell

                         _                 _
         _ __ ___   __ _| |__   ___  _   _| |_
        | '_ ` _ \ / _` | '_ \ / _ \| | | | __|
        | | | | | | (_| | | | | (_) | |_| | |_
        |_| |_| |_|\__,_|_| |_|\___/ \__,_|\__|  version 1.0


Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java
1.7.0_71)
Type in expressions to have them evaluated.
Type :help for more information.
15/01/23 15:28:25 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to
another address
15/01/23 15:28:26 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
Created spark context..
Mahout distributed context is available as "implicit val sdc".


mahout> :load spark-shell/src/test/mahout/simple.mscala
Loading spark-shell/src/test/mahout/simple.mscala...
a: org.apache.mahout.math.DenseMatrix =
{
  0  => {0:1.0,1:2.0,2:3.0}
  1  => {0:3.0,1:4.0,2:5.0}
}
drmA: org.apache.mahout.math.drm.CheckpointedDrm[Int] =
org.apache.mahout.sparkbindings.drm.CheckpointedDrmSpark@7940bbc5
drmAtA: org.apache.mahout.math.drm.DrmLike[Int] =
OpAB(OpAt(org.apache.mahout.sparkbindings.drm.CheckpointedDrmSpark@7940bbc5
),org.apache.mahout.sparkbindings.drm.CheckpointedDrmSpark@7940bbc5)
r: org.apache.mahout.math.drm.CheckpointedDrm[Int] =
org.apache.mahout.sparkbindings.drm.CheckpointedDrmSpark@3c46dadf
res4: org.apache.mahout.math.Matrix =
{
  0  => {0:11.0,1:15.0,2:19.0}
  1  => {0:15.0,1:21.0,2:27.0}
  2  => {0:19.0,1:27.0,2:35.0}
}
mahout>
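
(For anyone curious what the session above exercises: drmAtA is the
distributed transpose-product A'A. Below is a minimal in-memory sketch of
that operation in plain Scala -- just the math on a tiny example matrix, not
simple.mscala itself, whose full contents aren't shown here and which may
include additional steps:)

```scala
// Tiny in-memory illustration of the transpose-product A'A.
// (Standalone sketch only -- the real script computes this distributed
// via drmA.t %*% drmA on a CheckpointedDrm.)
val a = Array(Array(1.0, 2.0, 3.0), Array(3.0, 4.0, 5.0))
val m = a.length     // number of rows
val n = a(0).length  // number of columns

// (A'A)(i)(j) = sum over rows k of A(k)(i) * A(k)(j):
// an n x n symmetric Gram matrix.
val ata = Array.tabulate(n, n) { (i, j) =>
  (0 until m).map(k => a(k)(i) * a(k)(j)).sum
}
```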


On Fri, Jan 23, 2015 at 3:07 PM, Suneel Marthi <suneel.mar...@gmail.com>
wrote:

> +1
>
> On Fri, Jan 23, 2015 at 6:04 PM, Dmitriy Lyubimov <dlie...@gmail.com>
> wrote:
>
> > So right now mahout-spark depends on mr-legacy.
> > I did quick refactoring and it turns out it only _irrevocably_ depends on
> > the following classes there:
> >
> > MatrixWritable, VectorWritable, Varint/Varlong and VarintWritable, and
> ...
> > *sigh* o.a.m.common.Pair
> >
> > So I just dropped those five classes into a new tiny mahout-hadoop
> > module (to signify stuff that is directly relevant to serializing things
> to
> > the DFS API) and completely removed mrlegacy and its transients from spark
> and
> > spark-shell dependencies.
> >
> > So non-cli applications (shell scripts and embedded api use) actually
> only
> > need spark dependencies (which come from SPARK_HOME classpath, of course)
> > and mahout jars (mahout-spark, mahout-math(-scala), mahout-hadoop and
> > optionally mahout-spark-shell (for running shell)).
> >
> > This of course still doesn't address the problem of drivers that want to
> > throw more stuff onto the front-end classpath (such as a cli parser), but
> > at least it renders the transitive luggage of mr-legacy (and the size of
> > worker-shipped jars) much more tolerable.
> >
> > How does that sound?
> >
>
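
(Side note on the Varint/Varlong classes mentioned above: they implement the
standard 7-bits-per-byte variable-length integer encoding. A minimal
standalone sketch of that general technique -- not Mahout's actual
implementation, which also handles signed values:)

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream,
                DataInputStream, DataOutputStream}

// Write an unsigned int 7 bits at a time; the high bit of each
// byte means "more bytes follow". Small values take one byte.
def writeVarint(out: DataOutputStream, value: Int): Unit = {
  var v = value
  while ((v & ~0x7f) != 0) {
    out.writeByte((v & 0x7f) | 0x80)
    v >>>= 7  // unsigned shift so Int.MaxValue-sized values terminate
  }
  out.writeByte(v)
}

// Read bytes until one arrives without the continuation bit set.
def readVarint(in: DataInputStream): Int = {
  var result = 0
  var shift = 0
  var more = true
  while (more) {
    val b = in.readByte() & 0xff
    result |= (b & 0x7f) << shift
    shift += 7
    more = (b & 0x80) != 0
  }
  result
}

// Round-trip a few values through a byte buffer.
def roundTrip(values: Seq[Int]): Seq[Int] = {
  val buf = new ByteArrayOutputStream()
  val out = new DataOutputStream(buf)
  values.foreach(writeVarint(out, _))
  val in = new DataInputStream(new ByteArrayInputStream(buf.toByteArray))
  Seq.fill(values.length)(readVarint(in))
}
```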
