To clarify, this is the code I have:
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
//import org.apache.lucene.util.Attribute;
import org.apache.mahout.common.FileLineIterable;
import org.apache.mahout.common.StringRecordIterator;
import org.apache.mahout.fpm.pfpgrowth.convertors.ContextStatusUpdater;
import org.apache.mahout.fpm.pfpgrowth.convertors.SequenceFileOutputCollector;
import org.apache.mahout.fpm.pfpgrowth.convertors.string.StringOutputConverter;
import org.apache.mahout.fpm.pfpgrowth.convertors.string.TopKStringPatterns;
import org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth;
//import org.apache.mahout.math.map.OpenLongObjectHashMap;
import org.apache.mahout.common.Pair;

public class DellFPGrowth {
    public static void main(String[] args) throws IOException {
        Set<String> features = new HashSet<String>();
        String input = "/mnt/hgfs/Hadoop-automation/new-delltransaction.txt";
        int minSupport = 1;
        int maxHeapSize = 50; // top-k
        String pattern = " \"[ ,\\t]*[,|\\t][ ,\\t]*\" ";
        Charset encoding = Charset.forName("UTF-8");
        FPGrowth<String> fp = new FPGrowth<String>();
        String output = "/tmp/output.txt";
        Path path = new Path(output);
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf, path,
                Text.class, TopKStringPatterns.class);
        fp.generateTopKFrequentPatterns(
                new StringRecordIterator(
                        new FileLineIterable(new File(input), encoding, false), pattern),
                fp.generateFList(
                        new StringRecordIterator(
                                new FileLineIterable(new File(input), encoding, false), pattern),
                        minSupport),
                minSupport,
                maxHeapSize,
                features,
                new StringOutputConverter(
                        new SequenceFileOutputCollector<Text,TopKStringPatterns>(writer)),
                new ContextStatusUpdater(null));
        writer.close();
        List<Pair<String,TopKStringPatterns>> frequentPatterns =
                FPGrowth.readFrequentPattern(fs, conf, path);
        for (Pair<String,TopKStringPatterns> entry : frequentPatterns) {
            System.out.println(entry.getSecond());
        }
        System.out.print("\nthe end! ");
    }
}
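
A side note on the `pattern` string in DellFPGrowth: as far as I understand, StringRecordIterator compiles it as a regular expression and splits each input line with it. That is an assumption about the library, not something confirmed here. Under that assumption, the sketch below uses only java.util.regex to compare the string as written (with the literal quote characters and surrounding spaces) against the same expression without them. The class name, method name and sample line are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Mahout: shows how a splitter regex
// tokenizes a single transaction line.
final class SplitPatternSketch {

    // Splits one line into items using the given regular expression.
    static String[] splitItems(String line, String regex) {
        return Pattern.compile(regex).split(line);
    }

    public static void main(String[] args) {
        // Made-up sample transaction: items separated by a comma and a tab.
        String line = "milk, bread\tbutter";

        // The expression without the embedded quotes and spaces.
        String plain = "[ ,\\t]*[,|\\t][ ,\\t]*";
        // The string exactly as it appears in DellFPGrowth.
        String quoted = " \"[ ,\\t]*[,|\\t][ ,\\t]*\" ";

        String[] plainItems = splitItems(line, plain);
        String[] quotedItems = splitItems(line, quoted);

        System.out.println(Arrays.toString(plainItems)); // prints [milk, bread, butter]
        System.out.println(plainItems.length);           // prints 3
        System.out.println(quotedItems.length);          // prints 1
    }
}
```

With the quoted form the delimiter only matches where the input itself contains the quote characters, so a line without them comes back as a single item. If the transactions file uses plain commas or tabs, it may be worth checking which form is intended.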
How do I compile and run the DellFPGrowth class from the command line?
I don't have Eclipse on this system.
Thanks,
Praveenesh
On Sat, Sep 24, 2011 at 12:40 PM, Danny Bickson <[email protected]> wrote:
> It is very simple: from the Mahout root folder, run (for example, for k-means):
> ./bin/mahout kmeans -i ~/usr7/small_netflix_mahout/ \
>   -o ~/usr7/small_netflix_mahout_output/ --numClusters 10 \
>   -c ~/usr7/small_netflix_mahout/ -x 10
>
> where ./bin/mahout is the launcher for every Mahout application, and the
> next keyword (kmeans in this case) selects the algorithm.
> The remaining arguments are algorithm-specific.
>
> If you want to add a new application to the existing ones, you need to edit
> the conf/driver.classes.props file and point it to your main class.
>
> Best,
>
> - Danny Bickson
>
> On Sat, Sep 24, 2011 at 9:59 AM, praveenesh kumar <[email protected]> wrote:
>
> > Hey,
> > I have this code written using the Mahout libraries. I am able to run the
> > code from Eclipse.
> > How can I run code written with Mahout from the command line?
> >
> > My question is: do I have to build a jar file and run it as
> > hadoop jar jarfilename.jar class
> > or should I run it with the plain java command?
> >
> > Can anyone clear up my confusion?
> > I am not able to run this code.
> >
> > Thanks,
> > Praveenesh
> >
>