Okay.. So here's what I did:

1. Used Maven to create the jar (*mvn jar:jar*).
hadoop@ubuntu:/tmp/mahout-jar$ jar -tf mahout-fpgrowth-1.0-SNAPSHOT.jar
META-INF/
META-INF/MANIFEST.MF
com/
com/musigma/
com/musigma/hpc/
com/musigma/hpc/CallFPGrowth.class
META-INF/maven/
META-INF/maven/com.musigma.hpc/
META-INF/maven/com.musigma.hpc/mahout-fpgrowth/pom.xml
META-INF/maven/com.musigma.hpc/mahout-fpgrowth/pom.properties

2. This jar I have placed in /tmp/mahout-jar and added this folder to
HADOOP_CLASSPATH in hadoop-env.sh:
export HADOOP_CLASSPATH=/tmp/mahout-jar/*:$HADOOP_CLASSPATH
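A quick way to sanity-check step 2 after sourcing hadoop-env.sh (just a sketch; the path and wildcard entry are the ones from this mail, and whether Hadoop expands the `/*` wildcard depends on your Java/Hadoop version):

```shell
# Sketch: confirm the /tmp/mahout-jar/* entry really made it into
# HADOOP_CLASSPATH. Pure shell, no Hadoop needed.
export HADOOP_CLASSPATH="/tmp/mahout-jar/*:$HADOOP_CLASSPATH"
case ":$HADOOP_CLASSPATH:" in
  *":/tmp/mahout-jar/*:"*) echo "jar dir is on HADOOP_CLASSPATH" ;;
  *)                       echo "jar dir is MISSING" ;;
esac
```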

3. Copied driver.classes.props to the /tmp/mahout-jar folder.

4. Added an entry for my class at the end of that file:

com.musigma.hpc.CallFPGrowth = callfpgrowth : calling fpgrowth
5. In the terminal, I used export to point MAHOUT_CONF_DIR at /tmp/mahout-jar:

export MAHOUT_CONF_DIR=/tmp/mahout-jar
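Steps 3-5 are plain file plumbing, so they can be replayed safely. A sketch using a scratch directory instead of /tmp/mahout-jar; the kmeans line is only a made-up stand-in for the stock file's existing entries:

```shell
# Replay of steps 3-5 in a scratch dir; swap in /tmp/mahout-jar and the
# real $MAHOUT_HOME/conf/driver.classes.props on an actual install.
confdir=$(mktemp -d)
# Stand-in for the stock driver.classes.props contents:
echo 'org.apache.mahout.clustering.kmeans.KMeansDriver = kmeans : K-means clustering' \
  > "$confdir/driver.classes.props"
# Append the entry for the custom driver class:
echo 'com.musigma.hpc.CallFPGrowth = callfpgrowth : calling fpgrowth' \
  >> "$confdir/driver.classes.props"
export MAHOUT_CONF_DIR="$confdir"
tail -1 "$MAHOUT_CONF_DIR/driver.classes.props"
```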

6. Now, from /tmp/mahout-jar, I call my Mahout code with:

mahout callfpgrowth

and it's RUNNING...!!! :-)

Alternatively, I was also able to run the code with the following command:

sudo java -classpath \
mahout-fpgrowth-1.0-SNAPSHOT.jar:/usr/local/hadoop/hadoop/hadoop-0.20.2-core.jar:/tmp/mahout-distribution-0.5/core/target/mahout-core-0.5-job.jar \
com.musigma.hpc.CallFPGrowth
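The same classpath, built up piece by piece so it is easier to edit (the paths are the ones from this thread and will differ on other machines; this only prints the command rather than running it, since the jars exist only on the original box):

```shell
# Assemble the -classpath argument step by step, then echo the final
# command instead of executing it.
CP="mahout-fpgrowth-1.0-SNAPSHOT.jar"
CP="$CP:/usr/local/hadoop/hadoop/hadoop-0.20.2-core.jar"
CP="$CP:/tmp/mahout-distribution-0.5/core/target/mahout-core-0.5-job.jar"
echo "java -classpath $CP com.musigma.hpc.CallFPGrowth"
```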

Thanks everyone for your guidance,
Praveenesh


On Sun, Sep 25, 2011 at 3:24 PM, Ted Dunning <[email protected]> wrote:

> A better workflow, in my opinion, is to make a separate maven project for
> code that uses Mahout.  See https://github.com/tdunning/Chapter-16 for an
> example.
>
> Then you can simply compile, test and run your code using Maven, or Eclipse
> or IntelliJ.  Moreover, mvn will handle jarring up your code and all the
> dependencies that you want to include.
>
> If you need changes to Mahout behavior, pop open the Mahout source and use
> maven again.  Write tests to demonstrate the function you want and then use
> maven install to push the mahout jar into your local repo.  If your code on
> the Mahout side is changing often, it probably ought to go into your work
> project instead of inside Mahout anyway.
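A sketch of the dependency such a separate project would declare; the mahout-core 0.5 coordinates are inferred from the mahout-core-0.5-job.jar path used earlier in this thread, so verify them against your own Mahout version:

```shell
# Write the pom.xml dependency fragment a project depending on Mahout
# would need; coordinates are an assumption based on this thread's jars.
f=$(mktemp)
cat > "$f" <<'EOF'
<dependency>
  <groupId>org.apache.mahout</groupId>
  <artifactId>mahout-core</artifactId>
  <version>0.5</version>
</dependency>
EOF
grep '<artifactId>' "$f"
```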
>
> On Sun, Sep 25, 2011 at 2:12 PM, Lance Norskog <[email protected]> wrote:
>
> > For development, you can put the source in the Mahout tree and get it into
> > your job jars with 'mvn install'.
> > If you want your own independent source code, you can make a new Maven
> > project that creates your job.jar.
> > I do not do this until I am happy with how things work inside the Mahout
> > source tree.
> >
> > On Sun, Sep 25, 2011 at 12:52 AM, praveenesh kumar
> > <[email protected]> wrote:
> >
> > > Okay.. Here's what I am trying to do.
> > >
> > > My code is this :
> > >
> > >
> > >  import java.io.File;
> > >  import java.io.IOException;
> > >  import java.nio.charset.Charset;
> > >  import java.util.ArrayList;
> > >  import java.util.Arrays;
> > >  import java.util.Collection;
> > >  import java.util.HashSet;
> > >  import java.util.Map;
> > >  import java.util.Set;
> > >  import java.util.List;
> > >
> > >  import org.apache.hadoop.conf.Configuration;
> > >  import org.apache.hadoop.fs.FileSystem;
> > >  import org.apache.hadoop.fs.Path;
> > >  import org.apache.hadoop.io.SequenceFile;
> > >  import org.apache.hadoop.io.Text;
> > >  //import org.apache.lucene.util.Attribute;
> > >  import org.apache.mahout.common.FileLineIterable;
> > >  import org.apache.mahout.common.StringRecordIterator;
> > >
> > >  import org.apache.mahout.fpm.pfpgrowth.convertors.ContextStatusUpdater;
> > >  import org.apache.mahout.fpm.pfpgrowth.convertors.SequenceFileOutputCollector;
> > >  import org.apache.mahout.fpm.pfpgrowth.convertors.string.StringOutputConverter;
> > >
> > >
> > >
> > >  import org.apache.mahout.fpm.pfpgrowth.convertors.string.TopKStringPatterns;
> > >  import org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth;
> > >  //import org.apache.mahout.math.map.OpenLongObjectHashMap;
> > >
> > >  import org.apache.mahout.common.Pair;
> > >
> > >  public class DellFPGrowth {
> > >
> > >    public static void main(String[] args) throws IOException {
> > >
> > >        Set<String> features = new HashSet<String>();
> > >        String input =
> > > "/mnt/hgfs/Hadoop-automation/new-delltransaction.txt";
> > >        int minSupport = 1;
> > >        int maxHeapSize = 50;//top-k
> > >        String pattern = " \"[ ,\\t]*[,|\\t][ ,\\t]*\" ";
> > >        Charset encoding = Charset.forName("UTF-8");
> > >        FPGrowth<String> fp = new FPGrowth<String>();
> > >        String output = "/tmp/output.txt";
> > >        Path path = new Path(output);
> > >        Configuration conf = new Configuration();
> > >        FileSystem fs = FileSystem.get(conf);
> > >
> > >
> > >        SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf,
> > >            path, Text.class, TopKStringPatterns.class);
> > >
> > >
> > >  fp.generateTopKFrequentPatterns(
> > >                new StringRecordIterator(new FileLineIterable(new
> > > File(input), encoding, false), pattern),
> > >                fp.generateFList(
> > >                    new StringRecordIterator(new FileLineIterable(new
> > > File(input), encoding, false), pattern),
> > >                    minSupport),
> > >                minSupport,
> > >                maxHeapSize,
> > >                features,
> > >                new StringOutputConverter(new
> > > SequenceFileOutputCollector<Text,TopKStringPatterns>(writer)),
> > >                new ContextStatusUpdater(null));
> > >
> > >        writer.close();
> > >
> > >        List<Pair<String,TopKStringPatterns>> frequentPatterns =
> > > FPGrowth.readFrequentPattern(fs, conf, path);
> > >        for (Pair<String,TopKStringPatterns> entry : frequentPatterns) {
> > >              System.out.println(entry.getSecond());
> > >        }
> > >        System.out.print("\nthe end! ");
> > >    }
> > >
> > > }
> > >
> > >
> > > 1. I am able to compile and run this code from eclipse, so I took the
> > > .class file from eclipse's target folder, put it in another directory,
> > > and made a simple jar file using the jar -cvf command.
> > >
> > > 2. Since I am using mahout 0.4 and MAHOUT_CONF_DIR points by default to
> > > $MAHOUT_HOME/conf, I just added my jar directly to the $MAHOUT_HOME/conf
> > > folder and added the entry for my class in the driver.classes.props file.
> > >
> > > I added the following line at the end:
> > > com.musigma.hpc.CallFPGrowth = callfpgrowth : Calls fpgrowth
> > >
> > > com.musigma.hpc.CallFPGrowth is my class that I want to run from the
> > > command line, and it's in the jar.
> > >
> > > 3. Now when I am running bin/mahout, I am getting the following
> > > exception:
> > >
> > > hadoop@ubuntu:/tmp/mahout-distribution-0.4$ bin/mahout
> > >
> > > Running on hadoop, using HADOOP_HOME=/usr/local/hadoop/hadoop
> > > HADOOP_CONF_DIR=/usr/local/hadoop/hadoop/conf
> > > 11/09/25 00:40:07 WARN driver.MahoutDriver: Unable to add class:
> > > com.musigma.hpc.CallFPGrowth
> > > java.lang.ClassNotFoundException: com.musigma.hpc.CallFPGrowth
> > > at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
> > > at java.security.AccessController.doPrivileged(Native Method)
> > > at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
> > > at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
> > > at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
> > > at java.lang.Class.forName0(Native Method)
> > > at java.lang.Class.forName(Class.java:186)
> > > at org.apache.mahout.driver.MahoutDriver.addClass(MahoutDriver.java:207)
> > > at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:117)
> > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > at java.lang.reflect.Method.invoke(Method.java:616)
> > > at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> > >
> > >
> > > How can I resolve this issue ?
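(The fix that eventually worked is in the first mail of this thread.) One quick check for this kind of ClassNotFoundException: the jar entry must match the package path exactly, i.e. jar -tf must show com/musigma/hpc/CallFPGrowth.class. A small sketch of the mapping:

```shell
# A jar is a zip: class com.musigma.hpc.CallFPGrowth must appear at
# entry com/musigma/hpc/CallFPGrowth.class, byte for byte. Convert the
# class name to the entry to grep for in `jar -tf`.
cls="com.musigma.hpc.CallFPGrowth"
entry="$(echo "$cls" | tr '.' '/').class"
echo "$entry"
```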
> > >
> > >
> > > On Sat, Sep 24, 2011 at 2:55 PM, Lance Norskog <[email protected]>
> > > wrote:
> > >
> > > > Ah! That is all off in Maven-land. There is a Maven feature called
> > > > "exec".
> > > >
> > > > http://mojo.codehaus.org/exec-maven-plugin/
> > > >
> > > > There are examples for this in the Mahout wiki. Search for "exec:java".
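The exec-maven-plugin call those wiki examples boil down to, shown as a dry run (mvn may not be installed where this is replayed; the main class is the one from this thread):

```shell
# Print, rather than run, the exec:java invocation; -Dexec.mainClass is
# the exec-maven-plugin property naming the class to launch.
cmd="mvn exec:java -Dexec.mainClass=com.musigma.hpc.CallFPGrowth"
echo "$cmd"
```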
> > > >
> > > > On Sat, Sep 24, 2011 at 2:42 AM, praveenesh kumar
> > > > <[email protected]> wrote:
> > > >
> > > > > Which mahout jars are required to run this code, and where can I
> > > > > find them? I have the src downloaded, but there are no jars in the
> > > > > src?
> > > > >
> > > > >
> > > > > On Sat, Sep 24, 2011 at 2:35 AM, Paritosh Ranjan <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > Just add the mahout jars to the classpath while compiling/executing.
> > > > > > Search "java jar in classpath" on Google.
> > > > > >
> > > > > >
> > > > > > On 24-09-2011 15:01, praveenesh kumar wrote:
> > > > > >
> > > > > >> I mean to say..
> > > > > >>
> > > > > >> I have this code ..
> > > > > >>
> > > > > >>  import java.io.File;
> > > > > >>  import java.io.IOException;
> > > > > >>  import java.nio.charset.Charset;
> > > > > >>  import java.util.ArrayList;
> > > > > >>  import java.util.Arrays;
> > > > > >>  import java.util.Collection;
> > > > > >>  import java.util.HashSet;
> > > > > >>  import java.util.Map;
> > > > > >>  import java.util.Set;
> > > > > >>  import java.util.List;
> > > > > >>
> > > > > >>  import org.apache.hadoop.conf.Configuration;
> > > > > >>  import org.apache.hadoop.fs.FileSystem;
> > > > > >>  import org.apache.hadoop.fs.Path;
> > > > > >>  import org.apache.hadoop.io.SequenceFile;
> > > > > >>  import org.apache.hadoop.io.Text;
> > > > > >>  //import org.apache.lucene.util.Attribute;
> > > > > >>  import org.apache.mahout.common.FileLineIterable;
> > > > > >>  import org.apache.mahout.common.StringRecordIterator;
> > > > > >>
> > > > > >>  import org.apache.mahout.fpm.pfpgrowth.convertors.ContextStatusUpdater;
> > > > > >>  import org.apache.mahout.fpm.pfpgrowth.convertors.SequenceFileOutputCollector;
> > > > > >>  import org.apache.mahout.fpm.pfpgrowth.convertors.string.StringOutputConverter;
> > > > > >>
> > > > > >>  import org.apache.mahout.fpm.pfpgrowth.convertors.string.TopKStringPatterns;
> > > > > >>  import org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth;
> > > > > >>  //import org.apache.mahout.math.map.OpenLongObjectHashMap;
> > > > > >>
> > > > > >>  import org.apache.mahout.common.Pair;
> > > > > >>
> > > > > >>  public class DellFPGrowth {
> > > > > >>
> > > > > >>     public static void main(String[] args) throws IOException {
> > > > > >>
> > > > > >>         Set<String>  features = new HashSet<String>();
> > > > > >>         String input = "/mnt/hgfs/Hadoop-automation/new-delltransaction.txt";
> > > > > >>         int minSupport = 1;
> > > > > >>         int maxHeapSize = 50;//top-k
> > > > > >>         String pattern = " \"[ ,\\t]*[,|\\t][ ,\\t]*\" ";
> > > > > >>         Charset encoding = Charset.forName("UTF-8");
> > > > > >>         FPGrowth<String>  fp = new FPGrowth<String>();
> > > > > >>         String output = "/tmp/output.txt";
> > > > > >>         Path path = new Path(output);
> > > > > >>         Configuration conf = new Configuration();
> > > > > >>         FileSystem fs = FileSystem.get(conf);
> > > > > >>
> > > > > >>
> > > > > >>         SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf,
> > > > > >>             path, Text.class, TopKStringPatterns.class);
> > > > > >>
> > > > > >>
> > > > > >> fp.generateTopKFrequentPatterns(
> > > > > >>                 new StringRecordIterator(new FileLineIterable(new
> > > > > >> File(input), encoding, false), pattern),
> > > > > >>                 fp.generateFList(
> > > > > >>                     new StringRecordIterator(new FileLineIterable(new
> > > > > >> File(input), encoding, false), pattern),
> > > > > >>                     minSupport),
> > > > > >>                 minSupport,
> > > > > >>                 maxHeapSize,
> > > > > >>                 features,
> > > > > >>                 new StringOutputConverter(new
> > > > > >> SequenceFileOutputCollector<Text,TopKStringPatterns>(writer)),
> > > > > >>                 new ContextStatusUpdater(null));
> > > > > >>
> > > > > >>         writer.close();
> > > > > >>
> > > > > >>         List<Pair<String,TopKStringPatterns>> frequentPatterns =
> > > > > >> FPGrowth.readFrequentPattern(fs, conf, path);
> > > > > >>         for (Pair<String,TopKStringPatterns> entry : frequentPatterns) {
> > > > > >>               System.out.println(entry.getSecond());
> > > > > >>         }
> > > > > >>         System.out.print("\nthe end! ");
> > > > > >>     }
> > > > > >>
> > > > > >> }
> > > > > >>
> > > > > >>
> > > > > >> How should I compile and run this using the command line?
> > > > > >> I don't have eclipse on my system. How can I run this code ?
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Praveenesh
> > > > > >>
> > > > > >> On Sat, Sep 24, 2011 at 12:40 PM, Danny Bickson
> > > > > >> <[email protected]> wrote:
> > > > > >>
> > > > > >>> It is very simple: in the root folder you run (for example, for
> > > > > >>> k-means):
> > > > > >>> ./bin/mahout kmeans -i ~/usr7/small_netflix_mahout/ -o
> > > > > >>> ~/usr7/small_netflix_mahout_output/ --numClusters 10
> > > > > >>> -c ~/usr7/small_netflix_mahout/ -x 10
> > > > > >>>
> > > > > >>> where ./bin/mahout is used for any mahout application, and the next
> > > > > >>> keyword (kmeans in this case) defines the algorithm type.
> > > > > >>> The rest of the inputs are algorithm specific.
> > > > > >>>
> > > > > >>> If you want to add a new application to the existing ones, you need
> > > > > >>> to edit the conf/driver.classes.props file and point it at your
> > > > > >>> main class.
> > > > > >>>
> > > > > >>> Best,
> > > > > >>>
> > > > > >>> - Danny Bickson
> > > > > >>>
> > > > > >>>> On Sat, Sep 24, 2011 at 9:59 AM, praveenesh kumar
> > > > > >>>> <[email protected]> wrote:
> > > > > >>>> Hey,
> > > > > >>>> I have this code written using mahout libraries. I am able to run
> > > > > >>>> the code from eclipse.
> > > > > >>>> How can I run the code written in mahout from the command line ?
> > > > > >>>>
> > > > > >>>> My question is: do I have to make a jar file and run it as
> > > > > >>>> "hadoop jar jarfilename.jar class", or shall I run it using a
> > > > > >>>> simple java command ?
> > > > > >>>>
> > > > > >>>> Can anyone solve my confusion ?
> > > > > >>>> I am not able to run this code.
> > > > > >>>>
> > > > > >>>> Thanks,
> > > > > >>>> Praveenesh
> > > > > >>>>
> > > > > >>>>
> > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Lance Norskog
> > > > [email protected]
> > > >
> > >
> >
> >
> >
> > --
> > Lance Norskog
> > [email protected]
> >
>
