Burger wrote:
> Aayush, out of curiosity, why do you want to model wordcount this way?
> What benefit do you see?
>
> Norbert
>
> On 4/6/09, Aayush Garg wrote:
> > Hi,
> >
> > I want to make experiments with wordcount example in a different way.
> >
> In case you expect the unique words list to be too large to fit in memory, you
> could read the previous step's output directly from HDFS; since it is a sorted
> file, you can just walk it and merge the counts in a single pass in the reduce
> function.
>
> - Sharad
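Sharad's single-pass merge can be sketched in plain Java. In the sketch below, in-memory lists stand in for the two sorted files, and the `SortedCountMerge` class and its method name are made up for illustration:

```java
import java.util.*;

// Sketch of the single-pass merge described above: both inputs are sorted
// by word, so we walk them together and sum counts for matching words,
// never needing to hold the whole word list in memory.
public class SortedCountMerge {
    public static List<Map.Entry<String, Integer>> merge(
            List<Map.Entry<String, Integer>> prev,
            List<Map.Entry<String, Integer>> cur) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < prev.size() && j < cur.size()) {
            int c = prev.get(i).getKey().compareTo(cur.get(j).getKey());
            if (c < 0) {
                out.add(prev.get(i++));      // word only in previous output
            } else if (c > 0) {
                out.add(cur.get(j++));       // word only in new counts
            } else {                         // same word: sum the counts
                out.add(Map.entry(prev.get(i).getKey(),
                        prev.get(i).getValue() + cur.get(j).getValue()));
                i++;
                j++;
            }
        }
        while (i < prev.size()) out.add(prev.get(i++));  // drain leftovers
        while (j < cur.size()) out.add(cur.get(j++));
        return out;
    }
}
```

In the actual reduce function, the two sides would be readers over the previous step's sorted output file in HDFS rather than in-memory lists.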
>
--
Aayush Garg,
Phone: +41 764822440
Hi,
I want to make experiments with wordcount example in a different way.
Suppose we have very large data. Instead of splitting all the data at once,
we want to feed a few splits into the map-reduce job at a time. I want to model
the Hadoop job like this:
Suppose a batch of inputsplits arrives in t
Hi,
I have a 5-node cluster for Hadoop use. All nodes are multi-core.
I am running a shell command in the map function of my program, and this shell
command takes one file as input. Many such files are copied into
HDFS.
So, in summary, the map function will run a command like ./run
Could y
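One way a map function could invoke such a command is `ProcessBuilder`. A rough sketch follows; the `runCommand` helper is hypothetical, and any real invocation would pass `./run` plus the local path of the input file:

```java
import java.io.IOException;

// Hypothetical helper a map function could call to run an external command
// (e.g. "./run <file>") and capture its combined stdout/stderr.
public class ShellMap {
    public static String runCommand(String... cmd)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.redirectErrorStream(true);   // fold stderr into stdout
        Process p = pb.start();
        String out = new String(p.getInputStream().readAllBytes()).trim();
        int code = p.waitFor();
        if (code != 0) {
            throw new IOException("command exited with status " + code);
        }
        return out;
    }
}
```

One point worth noting: each map task runs its own child process, so on multi-core nodes several copies of the command may run concurrently, one per input file being processed.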
I set my username to R61neptun as you suggested, but I am still getting that
error:
localhost: starting datanode, logging to
/home/garga/Documents/hadoop-0.15.3/bin/../logs/hadoop-garga-datanode-R61neptun.out
localhost: starting secondarynamenode, logging to
/home/garga/Documents/hadoop-0.15.3/bin/
Could anyone please help me with the error below? I am not able to start
HDFS because of it.
Thanks,
On Sat, Apr 19, 2008 at 7:25 PM, Aayush Garg <[EMAIL PROTECTED]> wrote:
> My hadoop-site.xml is correct, but it still produces the error this way
>
>
> On Sat, Apr 19, 2008
I just tried the same thing (mapred.task.id) as you suggested, but I am getting
one file named "null" in my directory.
On Mon, Apr 21, 2008 at 8:33 AM, Amar Kamat <[EMAIL PROTECTED]> wrote:
> Aayush Garg wrote:
>
> > Could anyone please tell?
> >
> > On Sat, Apr 19, 2008 a
Could anyone please tell?
On Sat, Apr 19, 2008 at 1:33 PM, Aayush Garg <[EMAIL PROTECTED]> wrote:
> Hi,
>
> > I have written the following code for writing my key/value pairs to a file,
> > which is then read by another MR job.
>
>Path pth = new Pa
My hadoop-site.xml is correct, but it still produces the error this way
On Sat, Apr 19, 2008 at 6:35 PM, Stuart Sierra <[EMAIL PROTECTED]>
wrote:
> On Sat, Apr 19, 2008 at 9:53 AM, Aayush Garg <[EMAIL PROTECTED]>
> wrote:
> > I am getting following error on start
Hi,
I am getting the following error on starting up Hadoop in pseudo-distributed mode:
bin/start-all.sh
localhost: starting datanode, logging to
/home/garga/Documents/hadoop-0.15.3/bin/../logs/hadoop-root-datanode-R61-neptun.out
localhost: starting secondarynamenode, logging to
/home/garga/Documents/ha
Hi,
I have written the following code for writing my key/value pairs to a file,
which is then read by another MR job.
Path pth = new Path("./dir1/dir2/filename");
FileSystem fs = pth.getFileSystem(jobconf);
SequenceFile.Writer sqwrite = new
SequenceFile.Writer(fs, jobconf, pth, Text.class, Custom.class);
t to share data, put it into HDFS.
>
>
> On 4/17/08 4:01 AM, "Aayush Garg" <[EMAIL PROTECTED]> wrote:
>
> > One more thing:::
> > The HashMap that I am generating in the reduce phase will be on single
> node
> > or multiple nodes in the distributed e
file can
be very big... so can I write it in such a manner that the file is distributed
and I can read it easily in the next MapReduce phase? Alternatively, can I split
the file when it becomes greater than a certain size?
Thanks,
Aayush
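On the second question (splitting when a size threshold is crossed), one approach is a writer that rolls over to a new part file once the current one grows past a limit. Below is a local-filesystem sketch with made-up names (`RollingWriter`, `part-N` files), not HDFS code; it counts characters as a rough proxy for bytes:

```java
import java.io.*;
import java.nio.file.*;

// Hypothetical size-based rollover: start a new "part-N" file once the
// current one would exceed maxBytes, mimicking "split the file when it
// becomes greater than a certain size".
public class RollingWriter implements Closeable {
    private final Path dir;
    private final long maxBytes;
    private int part = 0;     // number of part files created so far
    private long written = 0; // chars written to the current part
    private BufferedWriter out;

    public RollingWriter(Path dir, long maxBytes) throws IOException {
        this.dir = dir;
        this.maxBytes = maxBytes;
        roll();
    }

    private void roll() throws IOException {
        if (out != null) out.close();
        out = Files.newBufferedWriter(dir.resolve("part-" + part++));
        written = 0;
    }

    public void write(String key, String value) throws IOException {
        String line = key + "\t" + value + "\n";
        if (written > 0 && written + line.length() > maxBytes) roll();
        out.write(line);
        written += line.length();
    }

    public int parts() { return part; }

    @Override
    public void close() throws IOException { out.close(); }
}
```

Each part file stays under the threshold, so the next MapReduce phase can take the whole directory as input and get one split per part.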
On Thu, Apr 17, 2008 at 1:01 PM, Aayush Garg <[EMAIL PROTECTED]>
es the kill
> > flag
> > in the front of the values, it can avoid processing any extra data.
> >
> >
> >
> Ted,
> Will this work for the case where the cutoff frequency/count requires a
> global picture? I guess not.
>
> In general, it is better to not tr
ral, it is better not to try to communicate between map and reduce
> except via the expected mechanisms.
>
>
>
> On 4/16/08 1:33 PM, "Aayush Garg" <[EMAIL PROTECTED]> wrote:
>
> > We cannot read the HashMap in the configure method of the reducer because
> it is
you
> do this, you should use whatever format you like.
>
>
> On 4/16/08 12:41 PM, "Aayush Garg" <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > The current structure of my program is:
> > Upper class{
> > class Reduce{
> > reduce f
should I
choose? Is this design and approach OK?
}
public static void main() {}
}
I hope my question is clear.
Thanks,
On Wed, Apr 16, 2008 at 8:33 AM, Amar Kamat <[EMAIL PROTECTED]> wrote:
> Aayush Garg wrote:
>
> > Hi,
> >
> > Are you sure that another
will I exactly write
the code snippet?
Thanks,
On Wed, Apr 16, 2008 at 7:18 AM, Amar Kamat <[EMAIL PROTECTED]> wrote:
> Aayush Garg wrote:
>
> > Hi,
> > Could you please suggest what classes and another better way to achieve
> > this:-
> >
> > I am g
Hi,
Could you please suggest which classes to use, or a better way to achieve
this:
I am getting an OutputCollector in my reduce function as:
void reduce()
{
output.collect(key,value);
}
Here the key is Text,
and the value is my Custom class type that I generated from rcc.
1. After all calls are comp
But the problem is that I need to sort according to freq, which is part
of my value field...
Any inputs? Could you provide a small piece of code illustrating your idea?
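Since freq lives in the value rather than the key, a plain key sort will not order by it. One simple pattern is to pull the (word, freq) pairs out and sort them by the value; a plain-Java sketch (the `FreqSort` helper is made up, not Hadoop's comparator API):

```java
import java.util.*;

// Sort (word, freq) pairs by descending frequency. In plain Java this is
// just a comparator over the entry's value rather than its key.
public class FreqSort {
    public static List<Map.Entry<String, Integer>> byFreqDesc(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.comparingByValue(Comparator.reverseOrder()));
        return entries;
    }
}
```

On a cluster, the usual equivalent is a second MapReduce pass that emits freq as the key, so the framework's own sort produces the frequency ordering.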
On Wed, Apr 9, 2008 at 9:45 AM, Owen O'Malley <[EMAIL PROTECTED]> wrote:
> On Apr 8, 2008, at 4:54 AM, Aayush Garg
Hi,
I have implemented Key and value pairs in the following way:
Key (Text class) Value(Custom class)
word1
word2
class Custom{
int freq;
TreeMap>
}
I construct this type of key, value pairs in the outputcollector of reduce
phase. Now I want to "SORT" this outputcollector in
sions of Hadoop? Thanks.
>
> - Robert Dempsey (new to the list)
>
>
> On Apr 4, 2008, at 5:36 PM, Ted Dunning wrote:
>
> >
> >
> > See Nutch. See Nutch run.
> >
> > http://en.wikipedia.org/wiki/Nutch
> > http://lucene.apache.org/nutch/
> >
>
--
Aayush Garg,
Phone: +41 76 482 240
ld Lucene indexes using Hadoop Map/Reduce. See the index
> contrib package in the trunk. Or is it still not something you are
> looking for?
>
> Regards,
> Ning
>
> On 4/4/08, Aayush Garg <[EMAIL PROTECTED]> wrote:
> > No, currently my requirement is to solve this prob
menting this for instruction or production?
>
> If production, why not use Lucene?
>
>
> On 4/3/08 6:45 PM, "Aayush Garg" <[EMAIL PROTECTED]> wrote:
>
> > Hi Amar, Theodore, Arun,
> >
> > Thanks for your reply. Actually I am new to hadoop
ocument. How would I apply multiple maps or
multilevel map-reduce jobs programmatically? I guess I need to make another
class or add some functions to it? I am not able to figure it out.
Any pointers for this type of problem?
Thanks,
Aayush
On Thu, Mar 27, 2008 at 6:14 AM, Amar Kamat <[EMAIL PROT
. How should I design my program from this stage?
I mean, how would I apply multiple MapReduce passes to this? What would be a
better way to perform this?
Thanks,
Regards,
-
Aayush Garg,
Phone: +41 76 482 240