Re: Compiling only MLlib?

2016-01-15 Thread Ted Yu
Looks like you didn't have zinc running.

Take a look at install_zinc() in build/mvn, around line 83.
You can use build/mvn instead of running mvn directly.

I normally use the following command line:

build/mvn clean -Phive -Phive-thriftserver -Pyarn -Phadoop-2.4
-Dhadoop.version=2.7.0 package -DskipTests

After one full build, you should be able to build the MLlib module alone.
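
For the MLlib-only rebuild, something along these lines should work from the top of
the source tree (a sketch using plain Maven reactor options, nothing Spark-specific:
-pl selects the mllib module directory and -am also builds the modules it depends on,
such as core):

build/mvn -pl mllib -am -DskipTests package

If memory is the limiting factor, the 1.x build docs suggest MAVEN_OPTS of roughly
-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m; build/mvn is supposed to
set that for you when MAVEN_OPTS isn't already defined, so check that a smaller value
isn't being inherited from your shell.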

Cheers

On Fri, Jan 15, 2016 at 6:13 PM, Colin Woodbury wrote:

> Hi, I'm very much interested in using Spark's MLlib in standalone
> programs. I've never used Hadoop, and don't intend to deploy on massive
> clusters. Building Spark has been an honest nightmare, and I've been on and
> off it for weeks.
>
> The build always runs out of RAM on my laptop (4g of RAM, Arch Linux) when
> I try to build with Scala 2.11 support. No matter how I tweak JVM flags to
> reduce maximum RAM use, the build always crashes.
>
> When trying to build Spark 1.6.0 for Scala 2.10 just now, the build had
> compilation errors. Here is one, as a sample. I've saved the rest:
>
> [error]
> /home/colin/building/apache-spark/spark-1.6.0/repl/scala-2.10/src/main/scala/org/apache/spark/repl/SparkJLineReader.scala:16:
> object jline is not a member of package tools
> [error] import scala.tools.jline.console.completer._
>
> It informs me:
>
> [ERROR] After correcting the problems, you can resume the build with the
> command
> [ERROR]   mvn <goals> -rf :spark-repl_2.10
>
> I don't feel safe doing that, given that I don't know what my "<goals>"
> are.
>
> I've noticed that the build is compiling a lot of things I have no
> interest in. Is it possible to just compile the Spark core, its tools, and
> MLlib? I just want to experiment, and this is causing me a lot of stress.
>
> Thank you kindly,
> Colin
>


Re: Compiling only MLlib?

2016-01-15 Thread Matei Zaharia
Have you tried just downloading a pre-built package, or linking to Spark 
through Maven? You don't need to build it unless you are changing code inside 
it. Check out 
http://spark.apache.org/docs/latest/quick-start.html#self-contained-applications
 for how to link to it.
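
As a rough sketch of what linking looks like (the artifact names, versions, and file
names below are assumptions for a Scala 2.10 / Spark 1.6.0 setup; adjust to whatever
you actually use), the build definition just declares Spark as a dependency:

// build.sbt -- minimal sketch, no Hadoop install or cluster required
scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "1.6.0",
  "org.apache.spark" %% "spark-mllib" % "1.6.0"
)

and a self-contained program can then use MLlib in local mode:

// src/main/scala/MLlibSmokeTest.scala -- hypothetical example
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

object MLlibSmokeTest {
  def main(args: Array[String]): Unit = {
    // local[*] runs Spark in-process on your laptop's cores
    val conf = new SparkConf().setAppName("MLlibSmokeTest").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Two obvious clusters of 2-D points
    val points = sc.parallelize(Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.1)))

    // KMeans.train(data, k, maxIterations)
    val model = KMeans.train(points, 2, 10)
    model.clusterCenters.foreach(println)

    sc.stop()
  }
}

"sbt run" will pull spark-core and spark-mllib from Maven Central, so there is nothing
to build on the Spark side at all.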

Matei

> On Jan 15, 2016, at 6:13 PM, Colin Woodbury wrote:
> 
> Hi, I'm very much interested in using Spark's MLlib in standalone programs. 
> I've never used Hadoop, and don't intend to deploy on massive clusters. 
> Building Spark has been an honest nightmare, and I've been on and off it for 
> weeks.
> 
> The build always runs out of RAM on my laptop (4g of RAM, Arch Linux) when I 
> try to build with Scala 2.11 support. No matter how I tweak JVM flags to 
> reduce maximum RAM use, the build always crashes.
> 
> When trying to build Spark 1.6.0 for Scala 2.10 just now, the build had 
> compilation errors. Here is one, as a sample. I've saved the rest:
> 
> [error] 
> /home/colin/building/apache-spark/spark-1.6.0/repl/scala-2.10/src/main/scala/org/apache/spark/repl/SparkJLineReader.scala:16:
>  object jline is not a member of package tools
> [error] import scala.tools.jline.console.completer._
> 
> It informs me:
> 
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn <goals> -rf :spark-repl_2.10
> 
> I don't feel safe doing that, given that I don't know what my "<goals>" are.
> 
> I've noticed that the build is compiling a lot of things I have no interest 
> in. Is it possible to just compile the Spark core, its tools, and MLlib? I 
> just want to experiment, and this is causing me a lot of stress.
> 
> Thank you kindly,
> Colin



Compiling only MLlib?

2016-01-15 Thread Colin Woodbury
Hi, I'm very much interested in using Spark's MLlib in standalone programs.
I've never used Hadoop, and don't intend to deploy on massive clusters.
Building Spark has been an honest nightmare, and I've been on and off it
for weeks.

The build always runs out of RAM on my laptop (4g of RAM, Arch Linux) when
I try to build with Scala 2.11 support. No matter how I tweak JVM flags to
reduce maximum RAM use, the build always crashes.

When trying to build Spark 1.6.0 for Scala 2.10 just now, the build had
compilation errors. Here is one, as a sample. I've saved the rest:

[error]
/home/colin/building/apache-spark/spark-1.6.0/repl/scala-2.10/src/main/scala/org/apache/spark/repl/SparkJLineReader.scala:16:
object jline is not a member of package tools
[error] import scala.tools.jline.console.completer._

It informs me:

[ERROR] After correcting the problems, you can resume the build with the
command
[ERROR]   mvn <goals> -rf :spark-repl_2.10

I don't feel safe doing that, given that I don't know what my "<goals>"
are.

I've noticed that the build is compiling a lot of things I have no interest
in. Is it possible to just compile the Spark core, its tools, and MLlib? I
just want to experiment, and this is causing me a lot of stress.

Thank you kindly,
Colin