Re: org.apache.maven.plugins:maven-antrun-plugin:1.6:run grief, copy-dependencies and unpack goals not supported by m2e, importing mahout into Eclipse

2011-11-25 Thread Shern Shiou Tan
I have this type of problem too, but everything works out fine if you ignore it. Is there any reason behind this? On 11/26/2011 09:09 AM, Mike Spreitzer wrote: This is on my MacBook Pro, and I have the current JDK that Apple provides (1.6.0_29). I just installed the current Maven, which describes it

org.apache.maven.plugins:maven-antrun-plugin:1.6:run grief, copy-dependencies and unpack goals not supported by m2e, importing mahout into Eclipse

2011-11-25 Thread Mike Spreitzer
This is on my MacBook Pro, and I have the current JDK that Apple provides (1.6.0_29). I just installed the current Maven, which describes itself as Apache Maven 3.0.3 (r1075438; 2011-02-28 12:31:09-0500). This is my first ever install of Maven. I am also just getting started with Mahout. I

Reminder: SF Mahout User Meeting

2011-11-25 Thread Grant Ingersoll
For those in the San Francisco area, there will be a Mahout User Meeting on Nov. 29th at Lucid Imagination's offices. Details and RSVP are at http://sf-mahout-11-11.eventbrite.com/ For those not in the SF area, I _believe_ we will be recording it and posting it.

Re: Load Dataset and Instances from database

2011-11-25 Thread Ted Dunning
On Fri, Nov 25, 2011 at 4:46 AM, Isabel Drost wrote: > On 24.11.2011 Ted Dunning wrote: > > Actually, one of the most reliable ways to kill a database is to use it as input or output for even a small Hadoop cluster. Having hundreds of processes all open connections and read at once is

Re: Facing problem while fetching the document id from cluser

2011-11-25 Thread syed kather
Thanks a lot, it works, using this code: SequenceFile.Reader reader = new SequenceFile.Reader(fs, path1, conf); IntWritable key = new IntWritable(); WeightedVectorWritable value = new WeightedVectorWritable(); while (reader.next(key, value)) { System.out.println(value.toString() + " belongs to clus

Re: Facing problem while fetching the document id from cluser

2011-11-25 Thread Grant Ingersoll
I think you need to add the --clustering (-cl) option to your KMeans step. By default, we only calculate the centroids. FWIW, the ClusterDumper can also dump out the points associated w/ a cluster once you have run the --clustering step. Also I don't think the clusterData/part-randomSeed is

Re: Load Dataset and Instances from database

2011-11-25 Thread Isabel Drost
On 24.11.2011 Sturm, Martin wrote: > Since I only want to try it out "standalone" I was hoping that this was > possible without any Hadoop stuff. Are there any tutorials or examples > available that show how to load a Dataset? Because I do not even know what > files are expected here.. cvs? You ma

Re: Load Dataset and Instances from database

2011-11-25 Thread Isabel Drost
On 24.11.2011 Ted Dunning wrote: > Actually, one of the most reliable ways to kill a database is to use it as > input or output for even a small Hadoop cluster. Having hundreds of > processes all open connections and read at once is fairly abusive. Though that does not mean that data cannot by sy

Facing problem while fetching the document id from cluser

2011-11-25 Thread syed kather
Team, while reading the sequence file, it is returning null. These are the commands which I executed. For converting the sequence file to chunks (sequence vector): raghu@Syed:/media/Work/mahout$ bin/mahout seqdirectory -i /media/Work/mahout/examples/bin/sample/fileList -o /media/Work/mahout/exampl

Re: ItemSimilarityJob's results differ from non-distributed version

2011-11-25 Thread Greg H
Hi Sebastian, I converted the dataset by simply keeping all user/item pairs that had a rating above 3. I'm also using GenericItemBasedRecommender's mostSimilarItems method instead of the recommend method to make recommendations. I'm certainly open to suggestions on better evaluation metrics. I
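
The conversion described in this message (keeping only the user/item pairs rated above 3) could look roughly like the sketch below. It is not the poster's code: the class name RatingFilter, the method keepAbove, and the comma-separated userID,itemID,rating input layout are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RatingFilter {

    // Hypothetical helper: returns "userID,itemID" for every input line
    // whose rating is strictly greater than the threshold.
    // Assumes each line has the form userID,itemID,rating.
    static List<String> keepAbove(List<String> lines, double threshold) {
        List<String> kept = new ArrayList<String>();
        for (String line : lines) {
            String[] fields = line.trim().split(",");
            if (fields.length < 3) {
                continue; // skip blank or malformed lines
            }
            double rating = Double.parseDouble(fields[2].trim());
            if (rating > threshold) {
                // Drop the rating so only the user/item pair remains.
                kept.add(fields[0].trim() + "," + fields[1].trim());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample data, only to show the call.
        List<String> sample = Arrays.asList("1,10,5", "1,11,3", "2,10,4", "2,12,1");
        for (String pair : keepAbove(sample, 3.0)) {
            System.out.println(pair);
        }
    }
}
```

The comparison is strict, so a rating of exactly 3 is dropped, which matches "above 3" as written in the message. Whether the original conversion treated the boundary the same way is not stated in the thread.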