Re: Combiner is optional though it is specified?

2008-07-06 Thread novice user
I am guessing it is a bug in hadoop-0.17, because I am able to reproduce the problem. But I am not able to figure out where exactly this can happen. Can someone please help me with this? Thanks. novice user wrote: > > To my surprise, only one output value of mapper is not reaching combiner.

Re: Combiner is optional though it is specified?

2008-07-03 Thread novice user
To my surprise, only one output value of the mapper is not reaching the combiner, and it is consistent when I repeated the experiment. The same record reaches the reducer directly without going through the combiner. I am surprised; how can this happen? novice user wrote: > > Regarding the conclusion,

Re: Combiner is optional though it is specified?

2008-07-01 Thread novice user
I got the exception because it received input as "s:d". I am using hadoop-0.17. I couldn't understand exactly what you meant by "no guarantee on the number of times a combiner is run". Can you please elaborate a bit on this? Thanks. Arun C Murthy-2 wrote: > > > On Jul 1, 2008, at 4:0
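The "no guarantee" the reply refers to is by design: the combiner is an optimization that the framework may run zero, one, or several times on any subset of the map output for a key. A minimal stdlib-only Java sketch (the method names here are illustrative, not Hadoop API) of why a sum-style combiner tolerates that:

```java
import java.util.ArrayList;
import java.util.List;

// Simulates map -> (optional combine) -> reduce for one key and shows
// that an associative, commutative combiner gives the same final result
// whether it runs zero, one, or two times.
public class CombinerDemo {

    // Pretend map output for one key: a list of partial counts.
    static List<Integer> mapOutputs() {
        List<Integer> values = new ArrayList<>();
        for (int v : new int[]{3, 1, 4, 1, 5}) values.add(v);
        return values;
    }

    // A sum "combiner": collapses a run of values into one partial sum.
    static List<Integer> combine(List<Integer> values) {
        int sum = 0;
        for (int v : values) sum += v;
        List<Integer> out = new ArrayList<>();
        out.add(sum);
        return out;
    }

    // The "reducer": final sum over whatever arrives, combined or not.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) sum += v;
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> raw = mapOutputs();
        int noCombine  = reduce(raw);                   // combiner skipped
        int oneCombine = reduce(combine(raw));          // combiner ran once
        int twoCombine = reduce(combine(combine(raw))); // combiner ran twice
        // All three agree because summing is associative and commutative.
        System.out.println(noCombine + " " + oneCombine + " " + twoCombine);
    }
}
```

This is why a record skipping the combiner (as observed above) does not change the job's result, provided the combiner's logic is associative and its output type matches its input type.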

Combiner is optional though it is specified?

2008-07-01 Thread novice user
Hi all, I have a query regarding the functionality of the combiner. Is it possible for some of the outputs of the mapper to skip the combiner code and be sent directly to the reducer even though a combiner is specified in the job configuration? I ask because I figured out that, when I am running on large amounts of dat

Re: java.io.IOException: All datanodes are bad. Aborting...

2008-06-19 Thread novice user
don't run it if you don't want to lose all of that "data") > On Jun 19, 2008, at 4:32 AM, novice user wrote: > >> >> Hi Every one, >> I am running a simple map-red application similar to k-means. But, >> when I >> ran it in on single machin

java.io.IOException: All datanodes are bad. Aborting...

2008-06-19 Thread novice user
Hi everyone, I am running a simple map-red application similar to k-means. When I ran it on a single machine, it went fine without any issues. But when I ran the same on a Hadoop cluster of 9 machines, it fails with java.io.IOException: All datanodes are bad. Aborting... Here is more

Re: Getting "No job jar file set. User classes may not be found." error when running a map-reduce job in hadoop-0.17

2008-06-09 Thread novice user
Hi Brice, I tried without using Eclipse too. I created my own jar file using build.xml and passed this jar to the hadoop jar command, but the error still persists. un_brice wrote: > > novice user a écrit : >> Hi, >> I am running a simple code and I am getting

Getting "No job jar file set. User classes may not be found." error when running a map-reduce job in hadoop-0.17

2008-06-09 Thread novice user
Hi, I am running a simple piece of code and I am getting the error "No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String)". I am not able to figure out what could have gone wrong. Some details of the problem: I have recently upgraded hadoop to 0.17. I have set
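As the message itself hints, this warning means the JobConf was never told which jar contains the user classes; with the 0.17 API the usual fix is to construct the conf as `new JobConf(MyJob.class)` (where `MyJob` is any class packaged inside the job jar) so Hadoop can locate the jar itself. A stdlib-only sketch of that lookup, mimicking the find-containing-jar walk Hadoop performs internally (`locationOf` is an illustrative name, not Hadoop API):

```java
import java.net.URL;

// Shows how a framework can discover where a class was loaded from,
// which is the mechanism behind JobConf(Class): hand it a class that
// lives in your job jar and it can call setJar() for you.
public class FindJarDemo {

    // Returns the classpath location holding the given class, e.g. a
    // "jar:file:.../myjob.jar!..." URL when the class came from a jar,
    // or a plain "file:..." URL when it came from a class directory.
    static String locationOf(Class<?> cls) {
        String resource = cls.getName().replace('.', '/') + ".class";
        URL url = cls.getClassLoader() == null
                ? ClassLoader.getSystemResource(resource)  // bootstrap classes
                : cls.getClassLoader().getResource(resource);
        return url == null ? null : url.toString();
    }

    public static void main(String[] args) {
        System.out.println(locationOf(FindJarDemo.class));
    }
}
```

If the class handed to `JobConf(Class)` is loaded from a directory (as happens when running from an IDE build tree) rather than from the jar, the lookup cannot find a jar and the warning appears, which would match the Eclipse symptom in this thread.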

NoSuchMethodError while running FileInputFormat.setInputPaths

2008-06-08 Thread novice user
Hi, I am getting the error below when using someone else's code that targets hadoop-0.17 and calls the method FileInputFormat.setInputPaths to set the input paths for the job. The exact error is given below. java.lang.NoSuchMethodError: org.apache.hadoop.mapred.FileInputFormat.setInputP
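A NoSuchMethodError on a library method usually means the class was loaded from a different (often older) jar than the one the code was compiled against, e.g. a stale Hadoop jar earlier on the classpath that predates the static setInputPaths method. A stdlib-only sketch of a quick reflective probe for this (shown against a standard-library class as a stand-in; substitute "org.apache.hadoop.mapred.FileInputFormat" and "setInputPaths" on a real Hadoop classpath):

```java
import java.lang.reflect.Method;

// Checks at runtime whether the class on the classpath actually
// exposes the method the code expects, the typical first step when
// diagnosing a NoSuchMethodError.
public class MethodProbe {

    static boolean hasMethod(String className, String methodName) {
        try {
            for (Method m : Class.forName(className).getMethods()) {
                if (m.getName().equals(methodName)) return true;
            }
        } catch (ClassNotFoundException e) {
            // class not on the classpath at all
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasMethod("java.util.Collections", "emptyList"));
        System.out.println(hasMethod("java.util.Collections", "noSuchOne"));
    }
}
```

If the probe reports the method missing on the running classpath even though the source compiles, the runtime is picking up the wrong jar.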

Possibility to specify some type of files in a directory as input

2008-06-05 Thread novice user
Hi, I need help setting up my map-reduce job to consider only a certain type of file as input in a specific directory. For example, suppose there is a directory dir1 with files like type1_1.txt type1_2.txt type1_3.txt type2_1.txt type2_2.txt, and I want to consider only those files w
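Hadoop's input paths accept glob patterns (FileInputFormat expanded globs in this era), so passing something like dir1/type1_* to FileInputFormat.setInputPaths should select only the type1 files. A stdlib-only sketch of the same glob idea, matched against the example file names from the post:

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Demonstrates glob selection of one "type" of file name, the same
// pattern idea a Hadoop input path like dir1/type1_* relies on.
public class GlobDemo {

    // True when the file name matches the type1 glob.
    static boolean matchesType1(String name) {
        PathMatcher m = FileSystems.getDefault()
                .getPathMatcher("glob:type1_*.txt");
        return m.matches(Paths.get(name));
    }

    public static void main(String[] args) {
        String[] names = {"type1_1.txt", "type1_2.txt", "type1_3.txt",
                          "type2_1.txt", "type2_2.txt"};
        for (String name : names) {
            System.out.println(name + " -> " + matchesType1(name));
        }
    }
}
```

The type1 names match and the type2 names do not; in Hadoop the equivalent filtering happens when the framework expands the glob in the configured input path.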

Could not find any valid local directory for taskTracker

2008-05-30 Thread novice user
Hi, I got the error below while running my Hadoop task, but when I tried again after a few hours it worked fine. Can someone please tell me why this error occurred? ERROR below: Error initializing task_200805161358_0158_m_00_0: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find an
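This DiskChecker error is raised when the TaskTracker cannot allocate scratch space in any of the directories listed in mapred.local.dir, commonly because the disks were temporarily full or unwritable, which would also explain why it worked hours later. A hedged hadoop-site.xml sketch (paths are illustrative) spreading scratch space over several disks so a single full disk does not trigger it:

```xml
<!-- hadoop-site.xml: the TaskTracker picks any listed directory with
     free space, so the "no valid local directory" error only fires
     when all of them are full or unwritable. -->
<property>
  <name>mapred.local.dir</name>
  <value>/disk1/hadoop/mapred/local,/disk2/hadoop/mapred/local</value>
</property>
```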

RE: Performance difference over two map-reduce solutions of same problem in different cluster sizes

2008-05-14 Thread novice user
map output onto disk in multiple segments and merge them at the end. > That is very costly. > > Runping > > >> -----Original Message- >> From: novice user [mailto:[EMAIL PROTECTED] >> Sent: Wednesday, May 14, 2008 2:45 AM >> To: core-user@hadoop.ap
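Runping's point about writing map output to disk in multiple segments and merging them concerns the map-side sort buffer: when it is too small for the map's output, each overflow spills a sorted segment to disk and all segments must be merged at the end of the map, which is the cost he calls out. A hedged hadoop-site.xml sketch (values are illustrative, to be sized to the job and available heap) using the sort properties from this era:

```xml
<!-- hadoop-site.xml: a larger map-side sort buffer means fewer on-disk
     spill segments and less merge work at the end of each map. -->
<property>
  <name>io.sort.mb</name>
  <value>200</value>  <!-- sort buffer size in MB -->
</property>
<property>
  <name>io.sort.factor</name>
  <value>100</value>  <!-- streams merged at once -->
</property>
```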

Performance difference over two map-reduce solutions of same problem in different cluster sizes

2008-05-14 Thread novice user
Hi, I have been working on a problem where I have to process particular data and produce three varieties of output, and then process each of them and store each variety in a separate file. To solve this, I have proposed two solutions. One I called un-optimi