For development, you can put your source inside the Mahout source tree;
'mvn install' then builds it into the Mahout job jars.
If you want your code to live independently, you can create a new Maven
project that builds its own job jar.
I do not do that until I am happy with how things work inside the Mahout
source tree.
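
For the ClassNotFoundException quoted below, one thing worth checking is
whether the class name listed in the props file is actually visible on the
classpath. The stack trace shows MahoutDriver resolving the name through
Class.forName, so the name has to be fully qualified, and the .class file
has to sit under the matching package directory inside the jar. Here is a
minimal sketch of that check. ClassPathCheck and its methods are
hypothetical helper names used for illustration; they are not part of
Mahout.

```java
// Hypothetical diagnostic helper (not part of Mahout). It checks whether
// class names resolve on the current classpath via Class.forName.
final class ClassPathCheck {

    // Returns true if the fully qualified class name can be located.
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, ClassPathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class file was found but could not be linked; treat as not loadable.
            return false;
        }
    }

    // Converts a jar entry path such as "a/b/C.class" into the class name "a.b.C".
    static String entryToClassName(String entryPath) {
        String suffix = ".class";
        if (!entryPath.endsWith(suffix)) {
            throw new IllegalArgumentException("not a class entry: " + entryPath);
        }
        String withoutSuffix = entryPath.substring(0, entryPath.length() - suffix.length());
        return withoutSuffix.replace('/', '.');
    }

    public static void main(String[] args) {
        // Each argument is a fully qualified class name to test.
        for (String name : args) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "NOT found"));
        }
    }
}
```

To use it, run it with the same classpath the driver would see and pass the
class name from the props file as the argument. Listing the jar with
'jar -tf' shows the entry paths; the entry for a class in package
com.musigma.hpc has to appear as com/musigma/hpc/CallFPGrowth.class, not
as a bare CallFPGrowth.class at the top level of the jar.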

On Sun, Sep 25, 2011 at 12:52 AM, praveenesh kumar <[email protected]> wrote:

> Okay. Here's what I am trying to do.
>
> My code is this :
>
>
>  import java.io.File;
>  import java.io.IOException;
>  import java.nio.charset.Charset;
>  import java.util.ArrayList;
>  import java.util.Arrays;
>  import java.util.Collection;
>  import java.util.HashSet;
>  import java.util.Map;
>  import java.util.Set;
>  import java.util.List;
>
>  import org.apache.hadoop.conf.Configuration;
>  import org.apache.hadoop.fs.FileSystem;
>  import org.apache.hadoop.fs.Path;
>  import org.apache.hadoop.io.SequenceFile;
>  import org.apache.hadoop.io.Text;
>  //import org.apache.lucene.util.Attribute;
>  import org.apache.mahout.common.FileLineIterable;
>  import org.apache.mahout.common.StringRecordIterator;
>
>  import org.apache.mahout.fpm.pfpgrowth.convertors.ContextStatusUpdater;
>  import org.apache.mahout.fpm.pfpgrowth.convertors.SequenceFileOutputCollector;
>  import org.apache.mahout.fpm.pfpgrowth.convertors.string.StringOutputConverter;
>  import org.apache.mahout.fpm.pfpgrowth.convertors.string.TopKStringPatterns;
>  import org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth;
>  //import org.apache.mahout.math.map.OpenLongObjectHashMap;
>
>  import org.apache.mahout.common.Pair;
>
>  public class DellFPGrowth {
>
>    public static void main(String[] args) throws IOException {
>
>        Set<String> features = new HashSet<String>();
>        String input =
> "/mnt/hgfs/Hadoop-automation/new-delltransaction.txt";
>        int minSupport = 1;
>        int maxHeapSize = 50;//top-k
>        String pattern = " \"[ ,\\t]*[,|\\t][ ,\\t]*\" ";
>        Charset encoding = Charset.forName("UTF-8");
>        FPGrowth<String> fp = new FPGrowth<String>();
>        String output = "/tmp/output.txt";
>        Path path = new Path(output);
>        Configuration conf = new Configuration();
>        FileSystem fs = FileSystem.get(conf);
>
>
>        SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf, path,
>            Text.class, TopKStringPatterns.class);
>
>        fp.generateTopKFrequentPatterns(
>            new StringRecordIterator(
>                new FileLineIterable(new File(input), encoding, false), pattern),
>            fp.generateFList(
>                new StringRecordIterator(
>                    new FileLineIterable(new File(input), encoding, false), pattern),
>                minSupport),
>            minSupport,
>            maxHeapSize,
>            features,
>            new StringOutputConverter(
>                new SequenceFileOutputCollector<Text,TopKStringPatterns>(writer)),
>            new ContextStatusUpdater(null));
>
>        writer.close();
>
>        List<Pair<String,TopKStringPatterns>> frequentPatterns =
> FPGrowth.readFrequentPattern(fs, conf, path);
>        for (Pair<String,TopKStringPatterns> entry : frequentPatterns) {
>              System.out.println(entry.getSecond());
>        }
>        System.out.print("\nthe end! ");
>    }
>
> }
>
>
> 1. I am able to compile and run this code from Eclipse, so I took the
> .class file from the Eclipse target folder, put it in another directory,
> and made a simple jar file using the jar -cvf command.
>
> 2. Since I am using Mahout 0.4 and MAHOUT_CONF_DIR points to
> $MAHOUT_HOME/conf by default, I added my jar directly to the
> $MAHOUT_HOME/conf folder and added an entry for my class to the
> driver.classes.props file.
>
> I added the following line at the end:
> com.musigma.hpc.CallFPGrowth = callfpgrowth : Calls fpgrowth
>
> com.musigma.hpc.CallFPGrowth is the class that I want to run from the
> command line, and it is in the jar.
>
> 3. Now when I run bin/mahout, I get the following exception:
>
> hadoop@ubuntu:/tmp/mahout-distribution-0.4$ bin/mahout
>
> Running on hadoop, using HADOOP_HOME=/usr/local/hadoop/hadoop
> HADOOP_CONF_DIR=/usr/local/hadoop/hadoop/conf
> 11/09/25 00:40:07 WARN driver.MahoutDriver: Unable to add class:
> com.musigma.hpc.CallFPGrowth
> java.lang.ClassNotFoundException: com.musigma.hpc.CallFPGrowth
> at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:186)
> at org.apache.mahout.driver.MahoutDriver.addClass(MahoutDriver.java:207)
> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:117)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:616)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
>
> How can I resolve this issue?
>
>
> On Sat, Sep 24, 2011 at 2:55 PM, Lance Norskog <[email protected]> wrote:
>
> > Ah! That is all off in Maven-land. There is a Maven plugin called
> > "exec".
> >
> > http://mojo.codehaus.org/exec-maven-plugin/
> >
> > There are examples for this in the Mahout wiki. Search for "exec:java".
> >
> > On Sat, Sep 24, 2011 at 2:42 AM, praveenesh kumar <[email protected]> wrote:
> >
> > > Which Mahout jars are required to run this code, and where can I find
> > > them? I have the source downloaded, but there are no jars in it.
> > >
> > >
> > > On Sat, Sep 24, 2011 at 2:35 AM, Paritosh Ranjan <[email protected]>
> > > wrote:
> > >
> > > > > Just add the Mahout jars to the classpath when compiling and executing.
> > > > > Search for "java jar in classpath" on Google.
> > > >
> > > >
> > > > On 24-09-2011 15:01, praveenesh kumar wrote:
> > > >
> > > >> I mean to say..
> > > >>
> > > >> I have this code ..
> > > >>
> > > >>  import java.io.File;
> > > >>  import java.io.IOException;
> > > >>  import java.nio.charset.Charset;
> > > >>  import java.util.ArrayList;
> > > >>  import java.util.Arrays;
> > > >>  import java.util.Collection;
> > > >>  import java.util.HashSet;
> > > >>  import java.util.Map;
> > > >>  import java.util.Set;
> > > >>  import java.util.List;
> > > >>
> > > > >>  import org.apache.hadoop.conf.Configuration;
> > > > >>  import org.apache.hadoop.fs.FileSystem;
> > > > >>  import org.apache.hadoop.fs.Path;
> > > > >>  import org.apache.hadoop.io.SequenceFile;
> > > > >>  import org.apache.hadoop.io.Text;
> > > > >>  //import org.apache.lucene.util.Attribute;
> > > > >>  import org.apache.mahout.common.FileLineIterable;
> > > > >>  import org.apache.mahout.common.StringRecordIterator;
> > > > >>
> > > > >>  import org.apache.mahout.fpm.pfpgrowth.convertors.ContextStatusUpdater;
> > > > >>  import org.apache.mahout.fpm.pfpgrowth.convertors.SequenceFileOutputCollector;
> > > > >>  import org.apache.mahout.fpm.pfpgrowth.convertors.string.StringOutputConverter;
> > > > >>  import org.apache.mahout.fpm.pfpgrowth.convertors.string.TopKStringPatterns;
> > > > >>  import org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth;
> > > > >>  //import org.apache.mahout.math.map.OpenLongObjectHashMap;
> > > >>
> > > >>  import org.apache.mahout.common.Pair;
> > > >>
> > > >>  public class DellFPGrowth {
> > > >>
> > > >>     public static void main(String[] args) throws IOException {
> > > >>
> > > > >>         Set<String> features = new HashSet<String>();
> > > > >>         String input =
> > > > >> "/mnt/hgfs/Hadoop-automation/new-delltransaction.txt";
> > > > >>         int minSupport = 1;
> > > > >>         int maxHeapSize = 50;//top-k
> > > > >>         String pattern = " \"[ ,\\t]*[,|\\t][ ,\\t]*\" ";
> > > > >>         Charset encoding = Charset.forName("UTF-8");
> > > > >>         FPGrowth<String> fp = new FPGrowth<String>();
> > > > >>         String output = "/tmp/output.txt";
> > > > >>         Path path = new Path(output);
> > > > >>         Configuration conf = new Configuration();
> > > > >>         FileSystem fs = FileSystem.get(conf);
> > > > >>
> > > > >>         SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf, path,
> > > > >>             Text.class, TopKStringPatterns.class);
> > > > >>
> > > > >>         fp.generateTopKFrequentPatterns(
> > > > >>             new StringRecordIterator(
> > > > >>                 new FileLineIterable(new File(input), encoding, false), pattern),
> > > > >>             fp.generateFList(
> > > > >>                 new StringRecordIterator(
> > > > >>                     new FileLineIterable(new File(input), encoding, false), pattern),
> > > > >>                 minSupport),
> > > > >>             minSupport,
> > > > >>             maxHeapSize,
> > > > >>             features,
> > > > >>             new StringOutputConverter(
> > > > >>                 new SequenceFileOutputCollector<Text,TopKStringPatterns>(writer)),
> > > > >>             new ContextStatusUpdater(null));
> > > > >>
> > > > >>         writer.close();
> > > > >>
> > > > >>         List<Pair<String,TopKStringPatterns>> frequentPatterns =
> > > > >>             FPGrowth.readFrequentPattern(fs, conf, path);
> > > > >>         for (Pair<String,TopKStringPatterns> entry : frequentPatterns) {
> > > > >>               System.out.println(entry.getSecond());
> > > > >>         }
> > > >>         System.out.print("\nthe end! ");
> > > >>     }
> > > >>
> > > >> }
> > > >>
> > > >>
> > > > >> How should I compile and run this from the command line?
> > > > >> I don't have Eclipse on my system. How can I run this code?
> > > >>
> > > >> Thanks,
> > > >> Praveenesh
> > > >>
> > > > >> On Sat, Sep 24, 2011 at 12:40 PM, Danny Bickson <[email protected]> wrote:
> > > >>
> > > > >>> It is very simple: in the root folder you run (for example, for
> > > > >>> k-means):
> > > > >>> ./bin/mahout kmeans -i ~/usr7/small_netflix_mahout/ -o
> > > > >>> ~/usr7/small_netflix_mahout_output/ --numClusters
> > > > >>> 10 -c ~/usr7/small_netflix_mahout/ -x 10
> > > > >>>
> > > > >>> where ./bin/mahout is used for any Mahout application, and the next
> > > > >>> keyword (kmeans in this case) selects the algorithm.
> > > > >>> The rest of the inputs are algorithm specific.
> > > > >>>
> > > > >>> If you want to add a new application to the existing ones, you need
> > > > >>> to edit the conf/driver.classes.props file and point it to your
> > > > >>> main class.
> > > >>>
> > > >>> Best,
> > > >>>
> > > >>> - Danny Bickson
> > > >>>
> > > > >>> On Sat, Sep 24, 2011 at 9:59 AM, praveenesh kumar <[email protected]> wrote:
> > > > >>>> Hey,
> > > > >>>> I have this code written using the Mahout libraries. I am able to
> > > > >>>> run the code from Eclipse.
> > > > >>>> How can I run code written with Mahout from the command line?
> > > > >>>>
> > > > >>>> My question is: do I have to make a jar file and run it as
> > > > >>>> hadoop jar jarfilename.jar class
> > > > >>>> or should I run it using the plain java command?
> > > > >>>>
> > > > >>>> Can anyone clear up my confusion?
> > > > >>>> I am not able to run this code.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> Praveenesh
> > > >>>>
> > > >>>>
> > > >>
> > > >> -----
> > > >> No virus found in this message.
> > > >> Checked by AVG - www.avg.com
> > > >> Version: 10.0.1410 / Virus Database: 1520/3915 - Release Date:
> > 09/23/11
> > > >>
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > Lance Norskog
> > [email protected]
> >
>



-- 
Lance Norskog
[email protected]
