RE: how to write custom object using M/R

2011-01-14 Thread MONTMORY Alain
Hi, I think you have to set job.setOutputFormatClass(SequenceFileOutputFormat.class); to make it work. Hope this helps. Alain [@@THALES GROUP RESTRICTED@@] From: Joan [mailto:joan.monp...@gmail.com] Sent: Friday, 14 January 2011 13:58 To: mapreduce-user Subject: how to write cu
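For context on the custom value type discussed in this thread: Hadoop serializes values through the two methods of its Writable interface, write(DataOutput) and readFields(DataInput). The following is a minimal sketch of that contract, not the poster's actual class. The fields name and count and the roundTrip helper are hypothetical, and the class deliberately does not import Hadoop so that it runs on its own; in a real job it would be declared as implementing org.apache.hadoop.io.Writable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical value type. In a real job this would be declared as
// "implements org.apache.hadoop.io.Writable"; the two methods below
// have the same signatures as that interface.
class CustomObject {
    private String name = "";
    private long count;

    // Hadoop instantiates value classes reflectively, so a no-arg
    // constructor is required.
    public CustomObject() {}

    public CustomObject(String name, long count) {
        this.name = name;
        this.count = count;
    }

    public String getName() { return name; }
    public long getCount() { return count; }

    // Serialize every field, in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(name);
        out.writeLong(count);
    }

    // Read the fields back in exactly the order write() emitted them.
    public void readFields(DataInput in) throws IOException {
        name = in.readUTF();
        count = in.readLong();
    }

    // Hypothetical helper: serialize to memory and read back into a new
    // instance, which is roughly what happens to a value between tasks.
    static CustomObject roundTrip(CustomObject original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            CustomObject copy = new CustomObject();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a class like this, the job would typically also declare it as the output value class (job.setOutputValueClass(CustomObject.class)) alongside the output format mentioned above. Whether that was the cause of the original problem cannot be told from the truncated message.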

Re: split locations

2011-01-14 Thread Owen O'Malley
On Fri, Jan 14, 2011 at 3:09 AM, Pedro Costa wrote: > Hi, > > If a split location contains more than one location, it means that > this split file is replicated through all locations, or it means that > a split is divided into several blocks, and each block is in one > location? It requests tha

Re: Compile Scheduler from source code

2011-01-14 Thread Robert Grandl
I am still not able to compile the scheduler code in Hadoop-0.21. I tried to use Eclipse and an svn checkout. I have the trunk, right-clicked the fairshare build.xml file, and ran it. However, I got a bunch of errors like: [ivy:cachepath] :: loading settings :: file = /home/rgrandl/School/Project/hadoop_

Re: JVM reuse / sharing data between mappers

2011-01-14 Thread Hari Sreekumar
Do the mappers modify this data? If they don't, then you can use the distributed cache for this. If they do modify the data and the logic depends on the order of mappers/the modified values, then probably it is not meant to be a mapreduce job, since the basic assumption is that the mappers are inde

Re: Compile Scheduler from source code

2011-01-14 Thread Robert Grandl
Thanks for your reply. However, I don't know what a Maven repository is. Could you be more specific about what exactly I should put where? I would like an easy way to recompile the schedulers' source code. Many thanks, Robert On 01/14/2011 03:42 PM, Harsh J wrote: Hi, On Fri, Jan 14, 2011 at 7:48 PM

Re: Compile Scheduler from source code

2011-01-14 Thread Harsh J
Hi, On Fri, Jan 14, 2011 at 7:48 PM, Robert Grandl wrote: > [ivy:resolve]         :: > [ivy:resolve]         ::          UNRESOLVED DEPENDENCIES         :: > [ivy:resolve]         :: > [ivy:resolve]         ::

Compile Scheduler from source code

2011-01-14 Thread Robert Grandl
Hi all, I was trying to compile FairScheduler from source code. I went to: ~/hadoop-0.21.0/mapred/src/contrib/fairscheduler and ran: ant clean compile However, I get the following errors:

Re: split locations

2011-01-14 Thread Harsh J
An InputSplit is the definition of a Mapper's input and has characteristics similar to an HDFS Block (Offset, Length, Locations). But, an InputSplit is computed by an InputFormat class to suit an input's requirement (such as newline boundaries in Text files, which isn't taken care of while splitting
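To make the distinction between a logical split and a physical block concrete, here is a small self-contained sketch. It does not use Hadoop's classes: Block, Split, and computeSplits are hypothetical stand-ins, and the rule it applies (a split's locations are the hosts holding replicas of the block that contains the split's first byte) is a simplifying assumption for illustration, not a statement of what every InputFormat does.

```java
import java.util.ArrayList;
import java.util.List;

class SplitLocationSketch {
    // Stand-in for an HDFS block: a byte range plus the hosts that each
    // hold one replica of it.
    static final class Block {
        final long offset;
        final long length;
        final String[] hosts;

        Block(long offset, long length, String... hosts) {
            this.offset = offset;
            this.length = length;
            this.hosts = hosts;
        }
    }

    // Stand-in for an input split: a logical byte range plus the hosts
    // where a mapper could read it locally.
    static final class Split {
        final long offset;
        final long length;
        final String[] locations;

        Split(long offset, long length, String[] locations) {
            this.offset = offset;
            this.length = length;
            this.locations = locations;
        }
    }

    // Cut the file into fixed-size logical ranges. Each split borrows its
    // locations from the block containing its first byte (an assumption of
    // this sketch), so several locations mean several replicas.
    static List<Split> computeSplits(List<Block> blocks, long fileLength, long splitSize) {
        List<Split> splits = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            long length = Math.min(splitSize, fileLength - start);
            Block first = blockContaining(blocks, start);
            splits.add(new Split(start, length, first.hosts));
        }
        return splits;
    }

    static Block blockContaining(List<Block> blocks, long position) {
        for (Block b : blocks) {
            if (position >= b.offset && position < b.offset + b.length) {
                return b;
            }
        }
        throw new IllegalArgumentException("No block covers position " + position);
    }
}
```

In this sketch a split size that differs from the block size produces splits that straddle block boundaries, which is one way a logical split can differ from a physical block even though both carry an offset, a length, and a list of locations.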

Re: split locations

2011-01-14 Thread Pedro Costa
What do you mean by that? For example, if the location of an input split is at /DataCenter1/Rack1/Node1, this means that this is the location of the namenode, and not the physical location of the data blocks? On Fri, Jan 14, 2011 at 1:10 PM, Harsh J wrote: > Yes, this is correct. But also, a logi

Re: split locations

2011-01-14 Thread Harsh J
Yes, this is correct. But also, a logical MapReduce InputSplit is very different from a physical HDFS Block. On Fri, Jan 14, 2011 at 5:10 PM, Pedro Costa wrote: > I think that the answer is, each location of the split file > corresponds to a replica. > > On Fri, Jan 14, 2011 at 11:09 AM, Pedro Co

how to write custom object using M/R

2011-01-14 Thread Joan
Hi, I'm trying to write (K,V) where K is a Text object and V is a CustomObject, but it doesn't run. I'm configuring the job output as SequenceFileInputFormat, so I have a job with: job.setMapOutputKeyClass(Text.class); job.setMapOutputValueClass(CustomObject.class); job.setOutpu

Re: split locations

2011-01-14 Thread Pedro Costa
I think that the answer is, each location of the split file corresponds to a replica. On Fri, Jan 14, 2011 at 11:09 AM, Pedro Costa wrote: > Hi, > > If a split location contains more than one location, it means that > this split file is replicated through all locations, or it means that > a split

split locations

2011-01-14 Thread Pedro Costa
Hi, If a split location contains more than one location, it means that this split file is replicated through all locations, or it means that a split is divided into several blocks, and each block is in one location? Thanks, -- Pedro