Re: Can I use MapWritable as a key?

2011-07-19 Thread Harsh J
Btw, also check out Avro's MapReduce components. It's a much better serialization framework, and you'll have fewer issues figuring out datatypes to use, plus more performance from good use of codecs. On Wed, Jul 20, 2011 at 11:37 AM, Harsh J wrote: > If your key is a "fixed" set of four attributes, wh

Re: Can I use MapWritable as a key?

2011-07-19 Thread Harsh J
If your key is a "fixed" set of four attributes, why not simply use an ArrayWritable of Text objects over a MapWritable? On Wed, Jul 20, 2011 at 5:32 AM, Choonho Son wrote: > I am newbie. > > Most of example shows that, > job.setOutputKeyClass(Text.class); > > is it possible job.setOutputKeyClas
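
A minimal sketch of the ArrayWritable-of-Text idea suggested here; the class name TextArrayWritable is made up. Note that ArrayWritable itself does not implement WritableComparable, so to use it as a map output key you would still need to supply a comparator (for example via JobConf.setOutputKeyComparatorClass in the old API or Job.setSortComparatorClass in the new one).

```java
import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.Text;

// Text-backed ArrayWritable; the no-arg constructor declaring the element type
// is required so Hadoop can instantiate and deserialize it.
public class TextArrayWritable extends ArrayWritable {

  public TextArrayWritable() {
    super(Text.class);
  }

  public TextArrayWritable(Text[] values) {
    super(Text.class, values);
  }
}
```

A mapper could then emit new TextArrayWritable(new Text[] { srcIp, srcPort, dstPort, dstIp }) as its key, where the four variables are Text objects holding the attribute values.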

Re: Can I use MapWritable as a key?

2011-07-19 Thread rajesh putta
Hi, as far as my knowledge goes, MapWritable doesn't implement Comparable, so I think it cannot be used as a key. If you want that functionality, then you have to have a subclass that implements Comparable, and there you will define your key comparison logic. Or the other option would be to use Sorted
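
A rough sketch of the subclass approach described here, assuming the map holds the four fields from the original question; ComparableMapWritable and the field names are made up, and the comparison logic is only illustrative.

```java
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;

// Illustrative only: a MapWritable that defines an ordering over a fixed set of
// fields so it can be used where Hadoop expects a WritableComparable key.
public class ComparableMapWritable extends MapWritable
    implements WritableComparable<ComparableMapWritable> {

  // Fields assumed to make up the key, compared in this order.
  private static final Text[] KEY_FIELDS = {
    new Text("srcIP"), new Text("srcPort"), new Text("dstPort"), new Text("dstIP")
  };

  @Override
  public int compareTo(ComparableMapWritable other) {
    for (Text field : KEY_FIELDS) {
      int cmp = String.valueOf(get(field)).compareTo(String.valueOf(other.get(field)));
      if (cmp != 0) {
        return cmp;
      }
    }
    return 0;
  }
  // If the default HashPartitioner is used, equals()/hashCode() should also be
  // consistent with this ordering so equal keys land on the same reducer.
}
```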

Can I use MapWritable as a key?

2011-07-19 Thread Choonho Son
I am a newbie. Most of the examples show job.setOutputKeyClass(Text.class); is it possible to use job.setOutputKeyClass(MapWritable.class)? Because my key is a combination of values (src IP, src Port, dst Port, dst IP), I want to use MapWritable as a key. Example code is like: MapWritable mkey = new
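
For context, a small self-contained guess at the kind of key being built here (field names and values are made up); as the replies above explain, the problem only shows up once the framework has to sort such keys.

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;

public class MapWritableKeyExample {
  public static void main(String[] args) {
    // Building the kind of composite key the question describes.
    MapWritable mkey = new MapWritable();
    mkey.put(new Text("srcIP"), new Text("10.0.0.1"));
    mkey.put(new Text("srcPort"), new IntWritable(51234));
    mkey.put(new Text("dstPort"), new IntWritable(80));
    mkey.put(new Text("dstIP"), new Text("10.0.0.2"));

    // job.setOutputKeyClass(MapWritable.class) compiles, but the shuffle has to
    // sort map output keys, and MapWritable has neither compareTo() nor a
    // registered raw comparator, so such a job fails at runtime.
    System.out.println(mkey.get(new Text("dstIP")));
  }
}
```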

Re: MapReduce jobs hanging or failing near completion

2011-07-19 Thread Arun C Murthy
Is this reproducible? If so, I'd urge you to check your local disks... Arun On Jul 19, 2011, at 12:41 PM, Kai Ju Liu wrote: > Hi Marcos. The issue appears to be the following. A reduce task is unable to > fetch results from a map task on HDFS. The map task is re-run, but the map > task is now

Re: How would you translate this into MapReduce?

2011-07-19 Thread Em
Interesting to see the upper bound for Hadoop. However, I guess this is a rare problem. I'll try to implement what we discussed so far and train myself. Regards, Em On 19.07.2011 21:40, Steve Lewis wrote: > If the size of a record is too big to be processed by a node you > probably need to re-a

Re: MapReduce jobs hanging or failing near completion

2011-07-19 Thread Kai Ju Liu
Hi Marcos. The issue appears to be the following. A reduce task is unable to fetch results from a map task on HDFS. The map task is re-run, but the map task is now unable to retrieve information that it needs to run. Here is the error from the second map task: java.io.FileNotFoundException: /mnt/h

Re: How would you translate this into MapReduce?

2011-07-19 Thread Steve Lewis
If the size of a record is too big to be processed by a node, you probably need to re-architect using a different record which scales better and combines cleanly. You also need to ask at the start what data you need to retrieve and how you intend to retrieve it; at some point a database may start to

Re: How would you translate this into MapReduce?

2011-07-19 Thread Em
Of course it won't scale, or at least not as well as your suggested model. Chances are good that my idea is not an option for a production system and not as useful as the less-complex variant. So you are right! The reason why I asked was to get an idea of what should be done if a record is too bi

Re: How would you translate this into MapReduce?

2011-07-19 Thread Steve Lewis
I assumed the problem was to count the number of people visiting Moscow after London without considering any intermediate stops. This leads to a data structure which is easy to combine. The structure you propose adds more information and is difficult to combine. I doubt it could handle a billion peop

Re: How would you translate this into MapReduce?

2011-07-19 Thread Em
Thanks! So you invert the data and then walk through each inverted result. Good point! What do you think about prefixing each city name with its index in the list? This way you can say: London: 1_Moscow:2, 1_Paris:2, 2_Moscow:1, 2_Riga:4, 2_Paris:1, 3_Berlin:1... From this list you can see that

Re: How would you translate this into MapReduce?

2011-07-19 Thread Steve Lewis
Assume Joe visits Washington, London, Paris and Moscow. You start with records like Joe:Washington:20-Jan-2011 Joe:London:14-Feb-2011 Joe:Paris:9-Mar-2011 You want Joe: Washington, London, Paris and Moscow. For the next step the person is irrelevant; you want Washington: London:1, Paris:1, Mosco
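
A rough sketch of the second job described here, assuming the first job's output is one line per person of the form "Joe<TAB>Washington,London,Paris,Moscow"; the class names, that input format, and the use of the new (org.apache.hadoop.mapreduce) API are all assumptions.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// For every city in a person's ordered list, emit one record per city visited
// later in that list; the reducer then builds the
// "Washington: London:1, Paris:1, Moscow:1" style summary.
public class LaterCityMapper extends Mapper<LongWritable, Text, Text, Text> {
  @Override
  protected void map(LongWritable offset, Text line, Context context)
      throws IOException, InterruptedException {
    String[] parts = line.toString().split("\t");
    if (parts.length != 2) {
      return; // skip malformed lines
    }
    String[] cities = parts[1].split(",");
    for (int i = 0; i < cities.length; i++) {
      for (int j = i + 1; j < cities.length; j++) {
        context.write(new Text(cities[i]), new Text(cities[j]));
      }
    }
  }
}

class LaterCityReducer extends Reducer<Text, Text, Text, Text> {
  @Override
  protected void reduce(Text city, Iterable<Text> laterCities, Context context)
      throws IOException, InterruptedException {
    // Count how often each city appears later in some list than the key city.
    Map<String, Integer> counts = new HashMap<String, Integer>();
    for (Text later : laterCities) {
      String name = later.toString();
      Integer n = counts.get(name);
      counts.put(name, n == null ? 1 : n + 1);
    }
    context.write(city, new Text(counts.toString()));
  }
}
```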

Re: How would you translate this into MapReduce?

2011-07-19 Thread Em
Hi Steven, thanks for your response! For ease of use we can make those assumptions you made; maybe this makes it much easier to help. Those little extras are something for after solving the "easy" version of the task. :) What do you mean by the following? > The second job takes Person : l

Re: How would you translate this into MapReduce?

2011-07-19 Thread Steve Lewis
It is a little unclear what you start with and where you want to end up. Let us assume that you have a collection of triplets of person : place : time; we might imagine this information stored on a line of text. It somewhat simplifies the problem to assume that the number of places visited by one p
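
A rough sketch of the first job implied by this framing ("person : place : time" on each input line), with assumed class names; ordering by the raw time string only works if times sort lexicographically (e.g. ISO dates), so dates like "20-Jan-2011" used elsewhere in the thread would need real parsing.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Turn "person:place:time" lines into one line per person listing places in
// visit order. Sorting the whole list in the reducer is the simple approach;
// a secondary sort would scale better for very long visit lists.
public class PersonPlacesMapper extends Mapper<LongWritable, Text, Text, Text> {
  @Override
  protected void map(LongWritable offset, Text line, Context context)
      throws IOException, InterruptedException {
    String[] parts = line.toString().split(":", 3);
    if (parts.length != 3) {
      return; // skip malformed lines
    }
    // key = person, value = "time<TAB>place" so the reducer can order by time
    context.write(new Text(parts[0].trim()),
                  new Text(parts[2].trim() + "\t" + parts[1].trim()));
  }
}

class PersonPlacesReducer extends Reducer<Text, Text, Text, Text> {
  @Override
  protected void reduce(Text person, Iterable<Text> visits, Context context)
      throws IOException, InterruptedException {
    List<String> timedPlaces = new ArrayList<String>();
    for (Text visit : visits) {
      timedPlaces.add(visit.toString());
    }
    Collections.sort(timedPlaces); // assumes times sort lexicographically
    StringBuilder places = new StringBuilder();
    for (String timedPlace : timedPlaces) {
      if (places.length() > 0) {
        places.append(",");
      }
      places.append(timedPlace.split("\t", 2)[1]); // keep only the place name
    }
    context.write(person, new Text(places.toString()));
  }
}
```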

Re: Too many fetch-failures

2011-07-19 Thread rajesh putta
Yes, mapred.tasktracker.map.tasks.maximum is a per-TaskTracker setting, so we can set it to a different value on each node. Thanks & Regards Rajesh Putta M Tech CSE IIIT-H On Tue, Jul 19, 2011 at 6:36 PM, Mohamed Riadh Trad wrote: > Hi, > > I am running hadoop on a cluster with nodes having different > configurations. Is it possible to set specific
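
For reference, a sketch of how this is usually done: since these are TaskTracker-side settings, put node-specific values in each node's conf/mapred-site.xml and restart that node's TaskTracker. The property names are from the 0.20/1.x line; the values below are only examples.

```xml
<!-- conf/mapred-site.xml on a given node; tune values to that node's hardware -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
</property>
```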

Re: Too many fetch-failures

2011-07-19 Thread Mohamed Riadh Trad
Hi, I am running Hadoop on a cluster with nodes having different configurations. Is it possible to set a specific mapred.tasktracker.map.tasks.maximum for each node? Bests, Trad Mohamed Riadh, M.Sc, Ing. PhD. student INRIA-TELECOM PARISTECH - ENPC School of International Management Office: 11-15