Re: Modeling WordCount in a different way

2009-04-07 Thread Aayush Garg
Burger wrote: > Aayush, out of curiosity, why do you want to model wordcount this way? > What benefit do you see? > > Norbert > > On 4/6/09, Aayush Garg wrote: > > Hi, > > > > I want to experiment with the wordcount example in a different way. > >

Re: Modeling WordCount in a different way

2009-04-07 Thread Aayush Garg
> In case you expect the unique words list to be too large to fit in memory, you > could read the previous step's output directly from HDFS, and since it > would be a sorted file you could just walk it and merge the counts in a single > pass in the reduce function. > > - Sharad > -- Aayush Garg, Phone: +41 764822440
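
Sharad's suggestion can be sketched in plain Java: walk two sorted streams of (word, count) records side by side and sum counts in a single pass. The HDFS file reads are replaced here by in-memory sorted maps purely for illustration; real code would open the previous job's output via the FileSystem API.

```java
import java.util.*;

/** Single-pass merge of two sorted (word, count) streams, modeling the
 *  merge of a previous job's sorted output with current reduce counts.
 *  In-memory sorted maps stand in for the HDFS file reads. */
public class SortedMerge {
    public static TreeMap<String, Integer> merge(SortedMap<String, Integer> prev,
                                                 SortedMap<String, Integer> curr) {
        TreeMap<String, Integer> out = new TreeMap<>();
        Iterator<Map.Entry<String, Integer>> ia = prev.entrySet().iterator();
        Iterator<Map.Entry<String, Integer>> ib = curr.entrySet().iterator();
        Map.Entry<String, Integer> a = ia.hasNext() ? ia.next() : null;
        Map.Entry<String, Integer> b = ib.hasNext() ? ib.next() : null;
        // Both inputs are sorted by word, so one forward pass suffices.
        while (a != null || b != null) {
            int cmp = (a == null) ? 1 : (b == null) ? -1
                    : a.getKey().compareTo(b.getKey());
            if (cmp < 0) {
                out.put(a.getKey(), a.getValue());
                a = ia.hasNext() ? ia.next() : null;
            } else if (cmp > 0) {
                out.put(b.getKey(), b.getValue());
                b = ib.hasNext() ? ib.next() : null;
            } else {
                // Same word in both streams: sum the counts.
                out.put(a.getKey(), a.getValue() + b.getValue());
                a = ia.hasNext() ? ia.next() : null;
                b = ib.hasNext() ? ib.next() : null;
            }
        }
        return out;
    }
}
```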

Modeling WordCount in a different way

2009-04-06 Thread Aayush Garg
Hi, I want to experiment with the wordcount example in a different way. Suppose we have very large data. Instead of splitting all the data at one time, we want to feed some splits into the map-reduce job at a time. I want to model the Hadoop job like this: suppose a batch of input splits arrives in t
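
The batched scheme described above can be modeled outside Hadoop as running the same word count over each batch of splits and folding each batch's counts into a running total. This is a minimal plain-Java sketch (in real Hadoop this would be one job per batch, or a custom InputFormat); all names are illustrative.

```java
import java.util.*;

/** Word count over batches of input splits, with each batch's counts
 *  folded into a running total, modeling the incremental scheme above. */
public class BatchWordCount {
    // "map + reduce" over a single batch of lines
    public static Map<String, Integer> countBatch(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines)
            for (String w : line.trim().split("\\s+"))
                if (!w.isEmpty()) counts.merge(w, 1, Integer::sum);
        return counts;
    }

    // fold one batch's counts into the running total
    public static void fold(Map<String, Integer> total, Map<String, Integer> batch) {
        for (Map.Entry<String, Integer> e : batch.entrySet())
            total.merge(e.getKey(), e.getValue(), Integer::sum);
    }
}
```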

Optimized way

2008-12-04 Thread Aayush Garg
Hi, I have a 5-node cluster for Hadoop. All nodes are multi-core. I am running a shell command in the Map function of my program, and this shell command takes one file as input. Many such files are copied into HDFS. So in summary, the map function will run a command like ./run Could y
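
Invoking an external command from inside a map function is usually done with ProcessBuilder, capturing the command's output so it can be emitted as map output. A hedged sketch follows; the "./run" binary from the post is hypothetical, so any command and arguments can be substituted.

```java
import java.io.*;

/** Runs an external command and returns its combined stdout/stderr,
 *  as a map function might do for each input file. */
public class ShellMap {
    public static String runCommand(String... cmd) {
        try {
            ProcessBuilder pb = new ProcessBuilder(cmd);
            pb.redirectErrorStream(true);         // merge stderr into stdout
            Process p = pb.start();
            StringBuilder out = new StringBuilder();
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream()))) {
                String line;
                while ((line = r.readLine()) != null) out.append(line).append('\n');
            }
            p.waitFor();                          // block until the command exits
            return out.toString();
        } catch (IOException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Note that draining the process's output stream before `waitFor()` matters: a command that writes a lot of output can otherwise block on a full pipe.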

Re: Error in start up

2008-04-23 Thread Aayush Garg
I set my username to R61neptun as you suggested, but I am still getting that error: localhost: starting datanode, logging to /home/garga/Documents/hadoop-0.15.3/bin/../logs/hadoop-garga-datanode-R61neptun.out localhost: starting secondarynamenode, logging to /home/garga/Documents/hadoop-0.15.3/bin/

Re: Error in start up

2008-04-21 Thread Aayush Garg
Could anyone please help me with the error below? I am not able to start HDFS due to this. Thanks, On Sat, Apr 19, 2008 at 7:25 PM, Aayush Garg <[EMAIL PROTECTED]> wrote: > My hadoop-site.xml is correct, but it still produces this error > > > On Sat, Apr 19, 2008

Re: Splitting in various files

2008-04-21 Thread Aayush Garg
I just tried the same thing (mapred.task.id) as you suggested, but I am getting one file named null in my directory. On Mon, Apr 21, 2008 at 8:33 AM, Amar Kamat <[EMAIL PROTECTED]> wrote: > Aayush Garg wrote: > > > Could anyone please advise? > > > > On Sat, Apr 19, 2008 a

Re: Splitting in various files

2008-04-20 Thread Aayush Garg
Could anyone please advise? On Sat, Apr 19, 2008 at 1:33 PM, Aayush Garg <[EMAIL PROTECTED]> wrote: > Hi, > > I have written the following code for writing my key/value pairs to a > file, and this file is then read by another MR job. > >Path pth = new Pa

Re: Error in start up

2008-04-19 Thread Aayush Garg
My hadoop-site.xml is correct, but it still produces this error On Sat, Apr 19, 2008 at 6:35 PM, Stuart Sierra <[EMAIL PROTECTED]> wrote: > On Sat, Apr 19, 2008 at 9:53 AM, Aayush Garg <[EMAIL PROTECTED]> > wrote: > > I am getting the following error on start

Error in start up

2008-04-19 Thread Aayush Garg
Hi, I am getting the following error on starting up Hadoop in pseudo-distributed mode: bin/start-all.sh localhost: starting datanode, logging to /home/garga/Documents/hadoop-0.15.3/bin/../logs/hadoop-root-datanode-R61-neptun.out localhost: starting secondarynamenode, logging to /home/garga/Documents/ha

Splitting in various files

2008-04-19 Thread Aayush Garg
Hi, I have written the following code for writing my key,value pairs in the file, and this file is then read by another MR. Path pth = new Path("./dir1/dir2/filename"); FileSystem fs = pth.getFileSystem(jobconf); SequenceFile.Writer sqwrite = new SequenceFile.Writer(fs,conf,pth,Text.clas

Re: Map reduce classes

2008-04-17 Thread Aayush Garg
…want to share data, put it into HDFS. > > > On 4/17/08 4:01 AM, "Aayush Garg" <[EMAIL PROTECTED]> wrote: > > > One more thing: > > Will the HashMap that I am generating in the reduce phase be on a single > node > > or multiple nodes in the distributed e

Re: Map reduce classes

2008-04-17 Thread Aayush Garg
…The file can be very big... so can I write it in such a manner that the file is distributed and I can read it easily in the next MapReduce phase? The other way: can I split the file when it becomes greater than a certain size? Thanks, Aayush On Thu, Apr 17, 2008 at 1:01 PM, Aayush Garg <[EMAIL PROTECTED]>
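
Splitting the output once it grows past a size threshold amounts to a size-based rolling writer: close the current part file and open the next (part-00000, part-00001, ...). The next job can then read the whole directory of parts. This is a java.io stand-in for rolling a SequenceFile.Writer; all names and the threshold are illustrative.

```java
import java.io.*;

/** Size-based rolling writer: once the current part file passes a byte
 *  threshold, close it and start the next one. A java.io stand-in for
 *  rolling a SequenceFile.Writer when the output gets too big. */
public class RollingWriter implements Closeable {
    private final File dir;
    private final long maxBytes;
    private int part = 0;
    private long written = 0;
    private Writer out;

    public RollingWriter(File dir, long maxBytes) {
        this.dir = dir;
        this.maxBytes = maxBytes;
        roll();                                   // open part-00000
    }

    private void roll() {
        try {
            if (out != null) out.close();
            out = new FileWriter(new File(dir, String.format("part-%05d", part++)));
            written = 0;
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    public void write(String key, String value) {
        String rec = key + "\t" + value + "\n";
        // Roll before this record would push the file past the threshold.
        if (written > 0 && written + rec.length() > maxBytes) roll();
        try { out.write(rec); } catch (IOException e) { throw new UncheckedIOException(e); }
        written += rec.length();
    }

    @Override public void close() {
        try { out.close(); } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    public int parts() { return part; }
}
```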

Re: Map reduce classes

2008-04-17 Thread Aayush Garg
…sees the kill > > flag > > at the front of the values, it can avoid processing any extra data. > > > > > > > Ted, > Will this work for the case where the cutoff frequency/count requires a > global picture? I guess not. > > In general, it is better to not tr

Re: Map reduce classes

2008-04-16 Thread Aayush Garg
…In general, it is better to not try to communicate between map and reduce > except via the expected mechanisms. > > > > On 4/16/08 1:33 PM, "Aayush Garg" <[EMAIL PROTECTED]> wrote: > > > We cannot read the HashMap in the configure method of the reducer because > it is

Re: Map reduce classes

2008-04-16 Thread Aayush Garg
…you > do this, you should use whatever format you like. > > > On 4/16/08 12:41 PM, "Aayush Garg" <[EMAIL PROTECTED]> wrote: > > > Hi, > > > > The current structure of my program is: > > Upper class{ > > class Reduce{ > > reduce f

Re: Map reduce classes

2008-04-16 Thread Aayush Garg
…should I choose? Is this design and approach OK? } public static void main() {} } I hope you have got my question. Thanks, On Wed, Apr 16, 2008 at 8:33 AM, Amar Kamat <[EMAIL PROTECTED]> wrote: > Aayush Garg wrote: > > > Hi, > > > > Are you sure that another

Re: Map reduce classes

2008-04-15 Thread Aayush Garg
…how exactly will I write the code snippet? Thanks, On Wed, Apr 16, 2008 at 7:18 AM, Amar Kamat <[EMAIL PROTECTED]> wrote: > Aayush Garg wrote: > > > Hi, > > Could you please suggest which classes to use, and a better way to achieve > > this: > > > > I am g

Map reduce classes

2008-04-15 Thread Aayush Garg
Hi, could you please suggest which classes to use, and a better way to achieve this: I am getting an OutputCollector in my reduce function as: void reduce() { output.collect(key,value); } Here key is Text, and value is a Custom class type that I generated from rcc. 1. After all calls are comp
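
Post-processing all collected pairs after the last reduce() call is usually done by buffering output inside the reducer and emitting it from close() (the end-of-task hook in the old mapred API). A plain-Java model of that pattern follows; the key/value types and the frequency-cutoff rule are illustrative assumptions, not the poster's actual code.

```java
import java.util.*;

/** Buffers reduce output and post-processes it once all reduce() calls
 *  are done, as a Reducer's close() would in the old mapred API. */
public class BufferingReducer {
    private final Map<String, Integer> buffer = new HashMap<>();

    public void reduce(String key, Iterator<Integer> values) {
        int sum = 0;
        while (values.hasNext()) sum += values.next();
        buffer.put(key, sum);             // buffer instead of collecting now
    }

    /** Called once at the end: apply a cutoff that needs the global picture. */
    public Map<String, Integer> close(int minFreq) {
        Map<String, Integer> out = new TreeMap<>();
        for (Map.Entry<String, Integer> e : buffer.entrySet())
            if (e.getValue() >= minFreq) out.put(e.getKey(), e.getValue());
        return out;
    }
}
```

Note the memory caveat raised elsewhere in the thread: buffering only works if the distinct keys fit in the reducer's heap; otherwise a second job is the safer route.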

Re: Sorting the OutputCollector

2008-04-09 Thread Aayush Garg
But the problem is that I need to sort according to freq, which is part of my value field... Any inputs? Could you provide a small piece of code illustrating your thought? On Wed, Apr 9, 2008 at 9:45 AM, Owen O'Malley <[EMAIL PROTECTED]> wrote: > On Apr 8, 2008, at 4:54 AM, Aayush Garg

Sorting the OutputCollector

2008-04-08 Thread Aayush Garg
Hi, I have implemented key and value pairs in the following way: Key (Text class), Value (Custom class): word1 word2 class Custom{ int freq; TreeMap> } I construct this type of key/value pair in the OutputCollector of the reduce phase. Now I want to "SORT" this OutputCollector in
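
Sorting by the freq field inside the value is typically done with a second job that puts freq into the key; the ordering itself can be shown with a comparator over the collected pairs. A plain-Java sketch follows, with the Custom class reduced to just its freq field for illustration.

```java
import java.util.*;

/** Orders (word, Custom) pairs by the freq field of the value,
 *  highest frequency first, ties broken alphabetically. */
public class SortByFreq {
    public static class Custom {
        final int freq;
        public Custom(int freq) { this.freq = freq; }
    }

    public static List<String> wordsByFreqDesc(Map<String, Custom> pairs) {
        List<Map.Entry<String, Custom>> entries = new ArrayList<>(pairs.entrySet());
        entries.sort(Comparator
                .comparingInt((Map.Entry<String, Custom> e) -> e.getValue().freq)
                .reversed()
                .thenComparing(Map.Entry::getKey));
        List<String> words = new ArrayList<>();
        for (Map.Entry<String, Custom> e : entries) words.add(e.getKey());
        return words;
    }
}
```

In Hadoop itself, the equivalent is emitting (freq, word) from a second map and letting the framework's sort do this ordering, with a descending comparator set on the key.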

Re: Hadoop: Multiple map reduce or some better way

2008-04-04 Thread Aayush Garg
…versions of Hadoop? Thanks. > > - Robert Dempsey (new to the list) > > > On Apr 4, 2008, at 5:36 PM, Ted Dunning wrote: > > > > > > > See Nutch. See Nutch run. > > > > http://en.wikipedia.org/wiki/Nutch > > http://lucene.apache.org/nutch/ > > > -- Aayush Garg, Phone: +41 76 482 240

Re: Hadoop: Multiple map reduce or some better way

2008-04-04 Thread Aayush Garg
…build Lucene indexes using Hadoop Map/Reduce. See the index > contrib package in the trunk. Or is it still not something you are > looking for? > > Regards, > Ning > > On 4/4/08, Aayush Garg <[EMAIL PROTECTED]> wrote: > > No, currently my requirement is to solve this prob

Re: Hadoop: Multiple map reduce or some better way

2008-04-04 Thread Aayush Garg
…implementing this for instruction or production? > > If production, why not use Lucene? > > > On 4/3/08 6:45 PM, "Aayush Garg" <[EMAIL PROTECTED]> wrote: > > > Hi Amar, Theodore, Arun, > > > > Thanks for your reply. Actually I am new to Hadoop

Re: Hadoop: Multiple map reduce or some better way

2008-04-03 Thread Aayush Garg
…document. How would I apply multiple maps or multilevel map-reduce jobs programmatically? I guess I need to make another class or add some functions to it? I am not able to figure it out. Any pointers for this type of problem? Thanks, Aayush On Thu, Mar 27, 2008 at 6:14 AM, Amar Kamat <[EMAIL PROT

Hadoop: Multiple map reduce or some better way

2008-03-26 Thread Aayush Garg
…How should I design my program from this stage? I mean, how would I apply multiple MapReduce jobs to this? What would be a better way to perform this? Thanks, Regards, - Aayush Garg, Phone: +41 76 482 240
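
Chaining multiple MapReduce jobs means each job reads the previous job's output directory as its input; in the old API this is just sequential JobClient.runJob calls with wired-up paths. The data flow can be modeled in plain Java as two stages, where the second stage swaps the count into the key, the standard way to get frequency-ordered output. Both stages here are illustrative.

```java
import java.util.*;

/** Two chained "jobs": job 1 counts words, job 2 consumes job 1's output
 *  and inverts it to count -> words (count becomes the key, so the
 *  framework's sort would order by frequency). */
public class JobChain {
    public static Map<String, Integer> countJob(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines)
            for (String w : line.trim().split("\\s+"))
                if (!w.isEmpty()) counts.merge(w, 1, Integer::sum);
        return counts;
    }

    // second job: reads the first job's (word, count) output
    public static TreeMap<Integer, List<String>> invertJob(Map<String, Integer> counts) {
        TreeMap<Integer, List<String>> out = new TreeMap<>(Collections.reverseOrder());
        for (Map.Entry<String, Integer> e : counts.entrySet())
            out.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        return out;
    }
}
```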