It isn't that bad.  Maven is opinionated (that is a feature, not a defect),
but it isn't that hard to deal with.

The first concept to deal with is that maven has predefined lifecycle
phases.  The most important for most programmers are compile, test, package
and install.  These pretty much mean what they say, except that install
means copy the built artifacts into maven's local cache of artifacts (the
local repository, ~/.m2/repository by default), not install software.

From the top level,

     mvn install

will compile everything, run all the tests, build all the packaged
artifacts and stash them in maven's local cache.  If you want to skip the
tests to save time (tens of minutes), use

      mvn install -DskipTests

instead.  This should create everything you need to use the Mahout command
line tools.  These tools are invoked using a script that interpolates
references to all of the dependencies of mahout into a classpath and
invokes a particular class.  You can see the unfortunately large resulting
classpath by running

     bin/mahout classpath

Most of the development I do on mahout itself involves writing things that
would, in the C world, be command line programs.  During development,
however, I prefer to write these as unit tests.  These are easy to execute
with maven using a command like this:

    mvn test -Dtest=SequentialOutOfCoreSvdTest

If you want to run your own command line program that uses mahout as a
dependency, I recommend that you build mahout as a separate project and
then build your command line program as its own maven project.  For an
example, take a look at https://github.com/tdunning/Chapter-16.  With such
a project, you have the liberty of asking maven to package all of the
dependencies into a single jar file, which allows you to do something like:

     java -jar target/foo-with-dependencies.jar

to run your program.
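
As a rough sketch of such a project (this is not the actual pom from the
Chapter-16 repository), a minimal pom.xml might look like the following.  The
com.example group id, the foo artifact id and the com.example.Foo main class
are placeholders made up for illustration, and mahout-core 0.5 is only one
possible choice of dependency; substitute your own.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- Placeholder coordinates for your own project (assumed names) -->
  <groupId>com.example</groupId>
  <artifactId>foo</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>

  <dependencies>
    <!-- Mahout is pulled in as an ordinary dependency; maven resolves
         its transitive dependencies for you -->
    <dependency>
      <groupId>org.apache.mahout</groupId>
      <artifactId>mahout-core</artifactId>
      <version>0.5</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <!-- Bundles the project classes and all dependencies into one jar.
           The plugin version is left unpinned here; pin one in real use. -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
          <archive>
            <manifest>
              <!-- Hypothetical main class; needed for "java -jar" to work -->
              <mainClass>com.example.Foo</mainClass>
            </manifest>
          </archive>
        </configuration>
        <executions>
          <execution>
            <id>make-assembly</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
```

With this configuration, mvn package should leave
target/foo-1.0-SNAPSHOT-jar-with-dependencies.jar next to the ordinary jar.
The assembly plugin appends the descriptor name to the default file name, so
the exact jar name depends on your artifact id and version.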

On Sun, Jan 15, 2012 at 10:55 PM, Lance Norskog <goks...@gmail.com> wrote:

> Maven does a lot, but it does it the way it wants to. If you want to
> run all of the programs directly with Java you will learn a lot more
> about maven than you do about mahout.  It might be easier to learn
> maven itself from one of the tutorials online, then come back to
> learning Mahout.
>
> On Sat, Jan 14, 2012 at 7:54 PM, Periya.Data <periya.d...@gmail.com>
> wrote:
> > Thanks Lance.
> >
> > The reason(s) why I am asking this specific question :
> > - I am new to Mahout
> > - I am sort of new to Java itself. (core C/C++ programmer). Learning the
> > basics of pom.xml.
> > - I do not want to use Eclipse IDE now...though I have used it before. I
> > really do not know what goes on "behind the scenes" if I use IDE. So, I
> > want to do everything on command line and understand.
> > - I have used the regular shell script for running mahout earlier, like
> > the following:
> >
> >     $MAHOUT_HOME/bin/mahout kmeans \
> >         --input /input/mahout/vectorized/tfidf-vectors \
> >         --output $HDFS_OUTPUT_DIR/bigdata-canopy-centroids \
> >         --clusters $HDFS_OUTPUT_DIR/bigdata-canopy-centroids/clusters-0 \
> >         ....
> >
> > - The Mahout in Action book assumes that the reader can easily compile
> > and run the programs (which I am unable to do). Please see this text in
> > chapter 7 of the book on KMeans.
> >
> > *"7.3.3  Analyzing the output*
> > Compile and run the code in listing 7.2 using your favorite IDE or do it
> > from the command line. Make sure you add all the Mahout dependency JAR
> > files to the classpath. Because our set of data is small, you’ll get the
> > following output in a matter of seconds:"
> >
> >
> > In other words, I really want to write a simple java clustering program
> > (say using Kmeans), compile and run from command-line....just like any
> > other normal java program. I am unable to do this simple stuff now. Any
> > step-by-step instructions on this would give me a good start. At this
> > stage, I need a little spoon-feeding.
> >
> > Appreciate your help,
> > PD
> >
> >
> > On Sat, Jan 14, 2012 at 6:23 PM, Lance Norskog <goks...@gmail.com>
> wrote:
> >
> >> Ah! A command-line invocation works from maven. mahout/bin/mahout is a
> >> shell script which wraps up a bunch of handy things and runs java for
> >> you. You can just say from the top level (if your class has a main):
> >>
> >> bin/mahout org.apache.mahout.package.class arg1 arg2 ... argN
> >>
> >> The problem with 'java -cp' is that the Maven repository downloader
> >> parks every jar in a separate directory. 'mvn' has a wrapper that runs
> >> java apps. Look at the mvn calls in this page:
> >>
> >>
> >>
> http://www.lucidimagination.com/search/link?url=https://cwiki.apache.org/confluence/display/MAHOUT/RecommendationExamples
> >>
> >> On Sat, Jan 14, 2012 at 5:04 PM, Periya.Data <periya.d...@gmail.com>
> >> wrote:
> >> > Thanks. About renaming packages -> I wanted to experiment with
> >> > modified code at a later time and do not want to change the original.
> >> > I am building/compiling from a different place.
> >> >
> >> > Also, as a newbie, I thought I would know exactly what is needed to
> >> > run KMeans if I "gradually" build up my pom.xml, rather than take what
> >> > is already there, which might have a lot of unnecessary modules
> >> > packaged up.
> >> >
> >> > Finally, a sample command line for execution will be helpful: "java
> >> > -cp ...". I shall try the universal job file as well.
> >> >
> >> > Thanks for your feedback,
> >> > PD.
> >> >
> >> > On Sat, Jan 14, 2012 at 3:09 PM, Lance Norskog <goks...@gmail.com>
> >> wrote:
> >> >
> >> >> I conflated two different things: 1) what you said, and 2) a newbie
> >> >> will have a much easier time trying out the MIA code against the 0.5
> >> >> release.
> >> >>
> >> >> On Sat, Jan 14, 2012 at 3:35 AM, Sean Owen <sro...@gmail.com> wrote:
> >> >> > I don't think this has anything to do with using 0.5 vs 0.6 per se.
> >> >> > All of this surgery is unnecessary. You simply need to use the .job
> >> >> > files, which package all dependencies into one .jar, rather than
> >> >> > individual jars.
> >> >> >
> >> >> > utils is now integration.
> >> >> >
> >> >> > You should not need to rename packages, not sure what you mean
> >> >> > there.
> >> >> >
> >> >> > Sean
> >> >> >
> >> >> > On Sat, Jan 14, 2012 at 4:21 AM, Lance Norskog <goks...@gmail.com>
> >> >> wrote:
> >> >> >> The code for Mahout In Action is coded against the Mahout 0.5
> >> >> >> release. The trunk has changed a lot since then. You can change
> >> >> >> your pom.xml dependencies to Mahout 0.5 and it should work better.
> >> >> >>
> >> >> >> You should start with this file, then add your changes.
> >> >> >>
> >> >> >>
> >> >> >> https://github.com/tdunning/MiA/blob/12a0a53757ba49142ab69f94c002ff21650cb3f0/MiA/pom.xml
> >> >> >>
> >> >> >> Lance
> >> >> >>
> >> >> >> On Thu, Jan 12, 2012 at 8:07 PM, Periya.Data <
> periya.d...@gmail.com>
> >> >> wrote:
> >> >> >>> Hi,
> >> >> >>>    I am new to Mahout and began exploring the clustering examples. I
> >> >> >>> basically took the example code of SimpleKMeansClustering (from
> >> >> >>> Mahout in Action) and am trying to run it. The following is what I did:
> >> >> >>>
> >> >> >>> 1 - made sure I renamed the package name in the java file
> >> >> >>> appropriately.
> >> >> >>> 2 - made sure hadoop is running (in pseudo-distributed mode).
> >> >> >>> 3 - mvn clean install. My pom.xml file is pasted at the bottom of
> >> >> >>> this email. The result is as follows:
> >> >> >>>
> >> >> >>> pd@PeriyaData:~/Mahout/clustering/target$ ls -l
> >> >> >>> total 28
> >> >> >>> drwxrwxr-x 3 pd pd 4096 2012-01-12 19:51 classes
> >> >> >>> -rw-rw-r-- 1 pd pd 5173 2012-01-12 19:51 clustering-1.0-SNAPSHOT.jar
> >> >> >>> drwxrwxr-x 4 pd pd 4096 2012-01-12 19:51 generated-sources
> >> >> >>> drwxrwxr-x 2 pd pd 4096 2012-01-12 19:51 maven-archiver
> >> >> >>> drwxrwxr-x 2 pd pd 4096 2012-01-12 19:51 surefire-reports
> >> >> >>> drwxrwxr-x 3 pd pd 4096 2012-01-12 19:51 test-classes
> >> >> >>> pd@PeriyaData:~/Mahout/clustering/target$
> >> >> >>>
> >> >> >>> 3 - Trying to run it by "java -classpath..." etc. Note...my
> >> >> >>> classpath does not have mahout-utils.jar. It is missing in my build.
> >> >> >>>
> >> >> >>> pd@PeriyaData:~/Mahout/clustering/target/classes$ *java -cp
> >> >> >>> ../clustering-1.0-SNAPSHOT.jar:~/CDH3/mahout/core/target/classes:~/CDH3/mahout/core/target/mahout-core-0.6-SNAPSHOT.jar:~/CDH3/mahout/math/target/mahout-math-0.6-SNAPSHOT.jar
> >> >> >>> hw.mahout.kmeans.SimpleKMeansClustering *
> >> >> >>> Exception in thread "main" java.lang.NoClassDefFoundError:
> >> >> >>> org/apache/mahout/common/distance/DistanceMeasure
> >> >> >>> Caused by: java.lang.ClassNotFoundException:
> >> >> >>> org.apache.mahout.common.distance.DistanceMeasure
> >> >> >>>    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> >> >> >>>    at java.security.AccessController.doPrivileged(Native Method)
> >> >> >>>    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> >> >> >>>    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> >> >> >>>    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> >> >> >>>    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> >> >> >>> Could not find the main class: hw.mahout.kmeans.SimpleKMeansClustering.
> >> >> >>> Program will exit.
> >> >> >>> pd@PeriyaData:~/Mahout/clustering/target/classes$
> >> >> >>>
> >> >> >>> =============================
> >> >> >>> questions:
> >> >> >>>
> >> >> >>>
> >> >> >>>   1. I do not have the mahout-utils.jar file, for some strange
> >> >> >>>   reason. I am using Mahout 0.6. I tried recompiling Mahout twice
> >> >> >>>   using mvn clean install. Still I cannot find mahout-utils-0.6.jar.
> >> >> >>>   Perhaps that is a problem. I have mahout-core, mahout-examples and
> >> >> >>>   mahout-math.
> >> >> >>>   2. Is the command syntax "java -cp ..." correct in step 3? Please
> >> >> >>>   advise.
> >> >> >>>   3. Is my pom.xml sufficient for this build? Please note that in
> >> >> >>>   pom.xml, I have mahout core and others as 0.5 version. For some
> >> >> >>>   strange reason, if I have 0.6, the maven build fails and complains
> >> >> >>>   that 4 artifacts are missing: the mahout-core, mahout-math,
> >> >> >>>   mahout-utils and mahout-examples jar files. Is there a fix for this?
> >> >> >>>
> >> >> >>>
> >> >> >>> ==================
> >> >> >>>
> >> >> >>> pom.xml
> >> >> >>>
> >> >> >>> <project xmlns="http://maven.apache.org/POM/4.0.0"
> >> >> >>>  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> >> >> >>>  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
> >> >> >>> http://maven.apache.org/maven-v4_0_0.xsd">
> >> >> >>>  <modelVersion>4.0.0</modelVersion>
> >> >> >>>
> >> >> >>>  <parent>
> >> >> >>>    <artifactId>mahout</artifactId>
> >> >> >>>    <groupId>org.apache.mahout</groupId>
> >> >> >>>    <version>0.4</version>
> >> >> >>>  </parent>
> >> >> >>>
> >> >> >>>
> >> >> >>>  <groupId>hw.mahout.kmeans</groupId>
> >> >> >>>  <artifactId>clustering</artifactId>
> >> >> >>>  <packaging>jar</packaging>
> >> >> >>>  <version>1.0-SNAPSHOT</version>
> >> >> >>>  <name>clustering</name>
> >> >> >>>  <url>http://maven.apache.org</url>
> >> >> >>>
> >> >> >>> <dependencies>
> >> >> >>>    <dependency>
> >> >> >>>      <groupId>org.apache.mahout</groupId>
> >> >> >>>      <artifactId>mahout-core</artifactId>
> >> >> >>>      <version>0.5</version>
> >> >> >>>    </dependency>
> >> >> >>>
> >> >> >>>    <dependency>
> >> >> >>>      <groupId>org.apache.mahout</groupId>
> >> >> >>>      <artifactId>mahout-core</artifactId>
> >> >> >>>      <type>test-jar</type>
> >> >> >>>      <scope>test</scope>
> >> >> >>>      <version>0.5</version>
> >> >> >>>    </dependency>
> >> >> >>>
> >> >> >>>    <dependency>
> >> >> >>>      <groupId>org.apache.mahout</groupId>
> >> >> >>>      <artifactId>mahout-math</artifactId>
> >> >> >>>      <version>0.5</version>
> >> >> >>>    </dependency>
> >> >> >>>
> >> >> >>>    <dependency>
> >> >> >>>      <groupId>org.apache.mahout</groupId>
> >> >> >>>      <artifactId>mahout-math</artifactId>
> >> >> >>>      <type>test-jar</type>
> >> >> >>>      <scope>test</scope>
> >> >> >>>      <version>0.5</version>
> >> >> >>>    </dependency>
> >> >> >>>
> >> >> >>>    <dependency>
> >> >> >>>      <groupId>org.apache.mahout</groupId>
> >> >> >>>      <artifactId>mahout-utils</artifactId>
> >> >> >>>      <version>0.5</version>
> >> >> >>>    </dependency>
> >> >> >>>
> >> >> >>>    <dependency>
> >> >> >>>      <groupId>org.apache.mahout</groupId>
> >> >> >>>      <artifactId>mahout-examples</artifactId>
> >> >> >>>      <version>0.5</version>
> >> >> >>>    </dependency>
> >> >> >>>
> >> >> >>>    <dependency>
> >> >> >>>      <groupId>com.google.guava</groupId>
> >> >> >>>      <artifactId>guava</artifactId>
> >> >> >>>      <version>r03</version>
> >> >> >>>    </dependency>
> >> >> >>>
> >> >> >>>    <dependency>
> >> >> >>>      <groupId>org.apache.thrift</groupId>
> >> >> >>>      <artifactId>libthrift</artifactId>
> >> >> >>>      <version>0.6.1</version>
> >> >> >>>    </dependency>
> >> >> >>>
> >> >> >>>    <dependency>
> >> >> >>>      <groupId>org.slf4j</groupId>
> >> >> >>>      <artifactId>slf4j-log4j12</artifactId>
> >> >> >>>      <version>1.5.11</version>
> >> >> >>>    </dependency>
> >> >> >>>
> >> >> >>>    <dependency>
> >> >> >>>      <groupId>org.apache.hadoop</groupId>
> >> >> >>>      <artifactId>zookeeper</artifactId>
> >> >> >>>      <version>3.3.1</version>
> >> >> >>>    </dependency>
> >> >> >>>
> >> >> >>>    <dependency>
> >> >> >>>      <groupId>org.twitter4j</groupId>
> >> >> >>>      <artifactId>twitter4j-stream</artifactId>
> >> >> >>>      <version>2.2.3</version>
> >> >> >>>    </dependency>
> >> >> >>>
> >> >> >>>    <dependency>
> >> >> >>>        <groupId>commons-io</groupId>
> >> >> >>>        <artifactId>commons-io</artifactId>
> >> >> >>>        <version>2.0.1</version>
> >> >> >>>        <type>jar</type>
> >> >> >>>        <scope>compile</scope>
> >> >> >>>    </dependency>
> >> >> >>>
> >> >> >>>    <dependency>
> >> >> >>>      <groupId>commons-logging</groupId>
> >> >> >>>      <artifactId>commons-logging</artifactId>
> >> >> >>>      <version>1.1.1</version>
> >> >> >>>    </dependency>
> >> >> >>>
> >> >> >>>  </dependencies>
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>> <!--
> >> >> >>>  <build>
> >> >> >>>    <plugins>
> >> >> >>>      <plugin>
> >> >> >>>        <groupId>org.apache.maven.plugins</groupId>
> >> >> >>>        <artifactId>maven-compiler-plugin</artifactId>
> >> >> >>>        <version>2.3.2</version>
> >> >> >>>        <configuration>
> >> >> >>>          <encoding>UTF-8</encoding>
> >> >> >>>          <source>1.6</source>
> >> >> >>>          <target>1.6</target>
> >> >> >>>          <optimize>true</optimize>
> >> >> >>>        </configuration>
> >> >> >>>      </plugin>
> >> >> >>>      <plugin>
> >> >> >>>        <groupId>org.apache.maven.plugins</groupId>
> >> >> >>>        <artifactId>maven-antrun-plugin</artifactId>
> >> >> >>>        <version>1.6</version>
> >> >> >>>      </plugin>
> >> >> >>>      <plugin>
> >> >> >>>        <groupId>org.apache.maven.plugins</groupId>
> >> >> >>>        <artifactId>maven-resources-plugin</artifactId>
> >> >> >>>        <version>2.4.3</version>
> >> >> >>>        <configuration>
> >> >> >>>          <encoding>UTF-8</encoding>
> >> >> >>>        </configuration>
> >> >> >>>      </plugin>
> >> >> >>>
> >> >> >>>      <plugin>
> >> >> >>>        <groupId>org.apache.maven.plugins</groupId>
> >> >> >>>        <artifactId>maven-assembly-plugin</artifactId>
> >> >> >>>        <executions>
> >> >> >>>          <execution>
> >> >> >>>            <id>job</id>
> >> >> >>>            <phase>package</phase>
> >> >> >>>            <goals>
> >> >> >>>              <goal>single</goal>
> >> >> >>>            </goals>
> >> >> >>>            <configuration>
> >> >> >>>              <descriptors>
> >> >> >>>                <descriptor>src/main/assembly/job.xml</descriptor>
> >> >> >>>              </descriptors>
> >> >> >>>            </configuration>
> >> >> >>>          </execution>
> >> >> >>>          <execution>
> >> >> >>>            <id>my-jar-with-dependencies</id>
> >> >> >>>            <phase>package</phase>
> >> >> >>>            <goals>
> >> >> >>>              <goal>single</goal>
> >> >> >>>            </goals>
> >> >> >>>            <configuration>
> >> >> >>>              <descriptorRefs>
> >> >> >>>                <descriptorRef>jar-with-dependencies</descriptorRef>
> >> >> >>>              </descriptorRefs>
> >> >> >>>            </configuration>
> >> >> >>>          </execution>
> >> >> >>>        </executions>
> >> >> >>>      </plugin>
> >> >> >>>    </plugins>
> >> >> >>>  </build>
> >> >> >>>
> >> >> >>> -->
> >> >> >>>
> >> >> >>> </project>
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>> Thanks very much,
> >> >> >>>
> >> >> >>> PD.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> --
> >> >> >> Lance Norskog
> >> >> >> goks...@gmail.com
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Lance Norskog
> >> >> goks...@gmail.com
> >> >>
> >>
> >>
> >>
> >> --
> >> Lance Norskog
> >> goks...@gmail.com
> >>
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>
