from:"Gaurav"

unable to find the job id.

2011-10-11 Thread Gaurav

Hello, when I try to kill a process, I am unable to find the process ID after running the command: hadoop job -list. It says no jobs are running. I am running the canopy clustering example with the following command: mahout org.apache.mahout.clustering.syntheticcontrol.canopy.Job --input --output

Cluster dumper crashes when run on a large dataset

2011-11-03 Thread gaurav redkar

to build the "result" map. Any idea how I can fix this? I am working on a dataset of size 40 MB. I tried increasing the heap space, but with no luck. Thanks, Gaurav

Re: Cluster dumper crashes when run on a large dataset

2011-11-03 Thread gaurav redkar

the vectors might be too large. How many
> dimensions are you having in your Vector?
>
> On 04-11-2011 10:57, gaurav redkar wrote:
>
>> Hello,
>>
>> I am in a fix with the Clusterdumper utility. The clusterdump utility
>> crashes when it tries to ou

Re: Cluster dumper crashes when run on a large dataset

2011-11-03 Thread gaurav redkar

ich will
> populate only the dimensions which you are using. This can also decrease
> memory consumption.
>
> On 04-11-2011 11:19, gaurav redkar wrote:
>
>> Hi,
>>
>> yes Paritosh..even i think the same. actually i am using a test data set
>> that has 5000 tup

Re: Cluster dumper crashes when run on a large dataset

2011-11-03 Thread gaurav redkar

lusterFilter > 0 , which might help in reducing
> the number of clusters that you are getting as output, which, in turn,
> might help in less memory usage.
>
> On 04-11-2011 11:43, gaurav redkar wrote:
>
>> Actually i have to run the meanshift algorithm on a large dataset

Re: Cluster dumper crashes when run on a large dataset

2011-11-03 Thread gaurav redkar

sterId does not exist){
>    create directory of name clusterId
> }
> writeVectorInDirectoryNamedClusterId();
>
> }
>
> On 04-11-2011 12:09, gaurav redkar wrote:
>
>> Thanks a lot for ur help. Yes i will be running it on a hadoop cluster.
>> Can
>> u elab
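Editor's sketch of the quoted pseudocode: each vector is written into a directory named after its cluster id, and the directory is created on first use. This is a minimal local-filesystem illustration in Python, not Mahout or Hadoop code. The function name write_points_by_cluster, the file name points.txt, and the sample assignments are all hypothetical.

```python
import os
import tempfile


def write_points_by_cluster(assignments, out_root):
    """Write each vector into a directory named after its cluster id."""
    for cluster_id, vector in assignments:
        cluster_dir = os.path.join(out_root, str(cluster_id))
        # Create the directory only if it does not exist yet.
        os.makedirs(cluster_dir, exist_ok=True)
        # Append the vector as one comma-separated line (hypothetical format).
        with open(os.path.join(cluster_dir, "points.txt"), "a") as fh:
            fh.write(",".join(str(x) for x in vector) + "\n")


# Hypothetical (clusterId, vector) pairs, for illustration only.
out_root = tempfile.mkdtemp()
assignments = [(441, [0.1, 0.2]), (770, [0.3, 0.4]), (441, [0.5, 0.6])]
write_points_by_cluster(assignments, out_root)
cluster_dirs = sorted(os.listdir(out_root))
```

On a real Hadoop cluster the same idea would go through the HDFS API rather than the local filesystem.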

meanshift clustering

2011-11-09 Thread gaurav redkar

Hi, I am unable to identify where the clusterPoints() function in the MeanShiftCanopyClusterer.java file is called during the execution of the Meanshift job. What I need to know is where the files in the clusteredPoints and clusters-* directories are written when we run the job on Hadoop. build

Re: meanshift clustering

2011-11-10 Thread gaurav redkar

Wed, Nov 9, 2011 at 11:27 PM, Jeff Eastman wrote:
> See inline,
> Jeff
>
> -----Original Message-
> From: gaurav redkar [mailto:gauravred...@gmail.com]
> Sent: Wednesday, November 09, 2011 4:09 AM
> To: user@mahout.apache.org
> Subject: meanshift clustering

inconsistent output while using clusterdumper

2011-11-11 Thread gaurav redkar

Hello, after using clusterdumper on the output generated by the meanshift algorithm, I see the following type of result:
MSV-441 {n=2 c=[0.003, -0.002,0.005,0.001,etc
MSV-770{n=1 c=[0:-0.025,1:0.011,2:0.032,..etc
As seen above, in MSV-441 there is no ":" in the output, whereas MSV-770 h
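Editor's sketch: the two shapes of output quoted above can be read into one common form. The code assumes (this is not confirmed in the snippet) that entries containing ":" are index:value pairs and that entries without ":" are values listed in index order. The helper name parse_vector is hypothetical, and the inputs are the fragments from the message with a closing bracket added.

```python
def parse_vector(text):
    """Parse a bracketed vector string into an {index: value} dict.

    Assumption: 'i:v' entries are index:value pairs; plain entries are
    values listed in positional order starting at index 0.
    """
    body = text.strip().strip("[]")
    entries = [e.strip() for e in body.split(",") if e.strip()]
    if any(":" in e for e in entries):
        result = {}
        for entry in entries:
            index, value = entry.split(":")
            result[int(index)] = float(value)
        return result
    return {i: float(v) for i, v in enumerate(entries)}


dense = parse_vector("[0.003, -0.002,0.005,0.001]")
sparse = parse_vector("[0:-0.025,1:0.011,2:0.032]")
```

Under that assumption both forms describe the same kind of vector, and only the printed representation differs.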

Mahout fpg missing patterns

2011-12-18 Thread gaurav singh

correct and do exist in the data set with the correct value of their support. Can anyone please explain the reason? Thanks! -- regards Gaurav Singh

Re: Mahout fpg missing patterns

2011-12-19 Thread gaurav singh

That seems to make sense. What do you mean by "Mahout will not report any of those unless the support is strictly greater than 3."? Is there a way for me to get all the patterns with support strictly greater than a particular value? Thanks, Gaurav
On Mon, Dec 19, 2011 at 4:58 PM,

Re: Mahout fpg missing patterns

2011-12-19 Thread gaurav singh

You were a real help Tom! Thanks Gaurav
On Mon, Dec 19, 2011 at 5:33 PM, Tom Pierce wrote:
> Maybe it's easiest to give an example.
>
> If you have input:
>
> a b c
> a b c d
> ac d
> a b c
>
> You should expect Mahout to output (say, for support 2):
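Editor's sketch: the support of every itemset in the four example transactions can be counted by brute force. This is plain enumeration, not Mahout's FP-Growth implementation, and it does not reproduce Mahout's actual output, which is cut off in the snippet. Two assumptions are made: the third transaction "ac d" is read as "a c d", and the threshold is applied with >=. The thread itself discusses whether Mahout applies it strictly.

```python
from collections import Counter
from itertools import combinations

# The four transactions from the example.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c", "d"},
    {"a", "c", "d"},  # assumption: "ac d" is read as "a c d"
    {"a", "b", "c"},
]


def itemset_support(transactions):
    """Count, for every itemset, how many transactions contain it."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return counts


support = itemset_support(transactions)
min_support = 2  # assumption: threshold applied with >=
frequent = {k: v for k, v in support.items() if v >= min_support}
```

Brute-force enumeration is exponential in transaction length, so it is only useful for checking tiny examples like this one by hand.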

Help regarding ClusterOutputPostProcessor

2012-01-06 Thread gaurav redkar

Hello, when I ran the ClusterOutputPostProcessor on synthetic_control_data in mapreduce mode, I observed that one directory contained points belonging to 2 other clusters, and the directories relating to those 2 clusters were not created, as their "part- *" files were empty and the function "movePart

Mahout and Hadoop on Windows

2012-01-21 Thread gaurav singh

this class errors, like hadoop-0.19.1-core.jar, com.google.common.source_1.0.0.201004262004.jar etc., to resolve many of the imported packages like com.google.common.io.Closeable etc. Did I do the right thing? Any detailed light on its functioning would be great. Thanks everyone! -- regards Gaurav Singh

Re: Help regarding ClusterOutputPostProcessor

2012-01-25 Thread gaurav redkar

in the directory for each cluster. Any idea why this is happening? PS: The dataset on which I tested the algorithm has 1000 records with 200 attributes per record. I can share the dataset that I used if needed. Thanks, Gaurav
On Fri, Jan 6, 2012 at 6:12 PM, Paritosh Ranjan wrote

Re: Help regarding ClusterOutputPostProcessor

2012-01-30 Thread gaurav redkar

Hello. As Jeff mentioned, I created a JIRA issue. Kindly check out MAHOUT-966 <https://issues.apache.org/jira/browse/MAHOUT-966> and share your inputs. Thanks, Gaurav
On Wed, Jan 25, 2012 at 8:51 PM, Jeff Eastman wrote:
> Mean Shift accumulates the pointIds of every point assigned to

Re: Apache Mahout 0.6 Released

2012-02-06 Thread gaurav singh

Hi, when you say decision trees, you mean decision forest, right? Is it possible to expect a decision tree algorithm (C4.5) to be in Mahout, since the algorithm is pretty sequential in nature and won't be suitable for distributed processing? Regards, Gaurav
On Tue, Feb 7, 2012 at 2:49 AM, Sh

Re: How to get documents from the clusters?

2012-02-08 Thread gaurav redkar

Hi.. The clusteredPoints directory contains sequence files where each record is a pair. The format of each record is basically, this directory contains the mapping of each point in the dataset and the clusterID of the cluster to which it belongs. IntWritable is the clusterID of the cluster; We
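Editor's sketch: once the (clusterID, point) records described above have been read, collecting the documents of each cluster is a simple grouping step. Reading the sequence files themselves is out of scope here. The records list and the function name group_by_cluster are hypothetical stand-ins for the decoded key/value pairs.

```python
from collections import defaultdict


def group_by_cluster(records):
    """Group (cluster_id, point) pairs into {cluster_id: [points]}."""
    clusters = defaultdict(list)
    for cluster_id, point in records:
        clusters[cluster_id].append(point)
    return dict(clusters)


# Hypothetical decoded records: integer cluster id, then a point label.
records = [(0, "doc-a"), (1, "doc-b"), (0, "doc-c")]
clusters = group_by_cluster(records)
```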

Re: How to use clusterpp?

2012-02-17 Thread gaurav redkar

If that is the only thing that is contained in the part-r-* file, then the reducer responsible for writing that part-r-* file did not receive any input records to write to it. This happens because the program uses the default hash partitioner, which sometimes maps records belonging to different clus
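Editor's sketch of why a reducer can end up with no input: Hadoop's default HashPartitioner assigns a record to reducer (hashCode & Integer.MAX_VALUE) % numReduceTasks. The Python below mimics that formula. It assumes, for illustration only, that the key's hash code equals the integer cluster id, and the cluster ids and reducer count are hypothetical.

```python
def hash_partition(key_hash, num_reducers):
    """Mimic Hadoop's default hash partitioning formula."""
    return (key_hash & 0x7FFFFFFF) % num_reducers


# Hypothetical cluster ids and reducer count.
cluster_ids = [3, 6, 9]
num_reducers = 3

by_reducer = {r: [] for r in range(num_reducers)}
for cluster_id in cluster_ids:
    by_reducer[hash_partition(cluster_id, num_reducers)].append(cluster_id)
```

With these ids every cluster lands on reducer 0, so reducers 1 and 2 receive nothing and their part-r-* files stay empty. Having as many reducers as clusters does not guarantee one cluster per reducer.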

Re: mahout pfp : isSubPatternof() function

2012-02-26 Thread gaurav singh

b 26, 2012 at 9:39 PM, tom wrote:
> Hi Gaurav,
>
> The patterns are accumulated in a heap (see FrequentPatternMaxHeap), which
> uses isSubPatternOf.
>
> That said, I do think the default implementation of PFPGrowth will get you
> many redundant patterns under cer

Re: mahout pfp : isSubPatternof() function

2012-02-26 Thread gaurav singh

to hear if this persists after trying --useFPG2.
>
> -tom
>
> On 02/26/2012 12:06 PM, gaurav singh wrote:
>
>> Hi Tom,
>>
>> I don't understand, why do you say I will get a lot of redundant patterns?
>> In each group dependent shard generates patterns wi

Re: Canopy estimator

2012-05-11 Thread gaurav redkar

ement in each column (ignoring the 0's on the diagonal), which will give me 1.4, 1.36, 1.36. To choose the value of t2, I intend to take the mean of all the minimum elements in each column, which gives t2 = 1.37. Any comments on the approach? Thanks, Gaurav
On Fri, May 1
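Editor's sketch of the t2 estimate described above: take the smallest off-diagonal entry of each column of the pairwise distance matrix, then average those minima. The 3 x 3 matrix is hypothetical. It was chosen only so that the column minima come out as 1.4, 1.36 and 1.36 as in the message, since the original matrix is not shown in the snippet.

```python
# Hypothetical symmetric distance matrix with zeros on the diagonal.
distances = [
    [0.0, 1.4, 2.0],
    [1.4, 0.0, 1.36],
    [2.0, 1.36, 0.0],
]


def estimate_t2(distances):
    """Mean of the per-column minima, ignoring the diagonal entries."""
    n = len(distances)
    column_minima = [
        min(distances[i][j] for i in range(n) if i != j)
        for j in range(n)
    ]
    return sum(column_minima) / n, column_minima


t2, column_minima = estimate_t2(distances)
```

Here t2 is (1.4 + 1.36 + 1.36) / 3, which rounds to the 1.37 quoted in the message.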

Named Entity Extraction.

2012-06-09 Thread Gaurav Sehgal

s for your help, Gaurav

Dense matrix with kmeans

2012-06-20 Thread gaurav singh

. I just wish to know if it can be directly used with kmeans or I will have to write or customize kmeans for my purpose? Thanks for any help offered! -- Regards Gaurav Singh

Re: Dense matrix with kmeans

2012-06-20 Thread gaurav singh

Dunning wrote:
> Yeah... you can probably do this. It will involve storing your matrices as
> vectors and probably requires that they be the same size.
>
> Can you say more about the matrices in terms of size and how you compute
> distance?
>
> On Wed, Jun 20, 2012 at 1:42 AM,

Re: Dense matrix with kmeans

2012-06-20 Thread gaurav singh

subtracting one matrix from another and the elements of resulting matrix should be squared and added.
On Wed, Jun 20, 2012 at 5:38 PM, gaurav singh wrote:
> Hi,
>
> The matrix if sparse can be very large like 1000 X 1000 but it will only
> have at most 20 non-zero elements. That is
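Editor's sketch of the distance described above: subtract one matrix from the other, square each element of the result, and add them up. Because the matrices are said to hold at most about 20 non-zero entries, this sketch stores each one as a dict keyed by (row, column), so only non-zero cells are visited. The representation, the function name squared_distance, and the sample matrices are assumptions for illustration, not Mahout code.

```python
def squared_distance(a, b):
    """Sum of squared element-wise differences of two sparse matrices.

    Each matrix is a dict {(row, col): value}; missing cells count as 0.
    """
    total = 0.0
    for key in a.keys() | b.keys():
        diff = a.get(key, 0.0) - b.get(key, 0.0)
        total += diff * diff
    return total


# Hypothetical sparse matrices with a few non-zero cells each.
m1 = {(0, 0): 1.0, (2, 5): 3.0}
m2 = {(0, 0): 4.0, (7, 7): 2.0}
d = squared_distance(m1, m2)
```

Flattening each matrix into a single sparse vector, as suggested in the thread, gives the same value under a squared Euclidean distance measure.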

Re: Naive Bayes Classifier and probability calculation for a single test example

2011-05-03 Thread gaurav garg

Hi Svetlomir, You can use ClassifierContext class ( https://builds.apache.org/hudson/job/Mahout-Quality/javadoc/index.html?org/apache/mahout/common/StringTuple.html) to get the top N matching result and their respective scores. Hope it helps. Thanks Gaurav

Re: Naive Bayes Classifier and probability calculation for a single test example

2011-05-04 Thread gaurav garg

ill work. Have you read Renny's paper? http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.8572 Thanks gaurav
On Wed, May 4, 2011 at 6:16 PM, Svetlomir Kasabov < skasa...@smail.inf.fh-brs.de> wrote:
> Hello Gaurav,
>

28 matches