Retrieve node where a map task is running programmatically

2012-12-27 Thread Eduard Skaley
Hi, is there a way to find out, in the setup function of a mapper, on which node of the cluster the current mapper is running? Thank you very much, Eduard
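One way to do this (a hedged sketch, not taken from the thread itself): the task JVM runs on the node hosting its container, so the standard java.net.InetAddress API, called from Mapper.setup(), should report the cluster node's hostname. Shown here as a standalone main() so the snippet runs anywhere:

```java
import java.net.InetAddress;

public class NodeName {
    // In a real job you would call this from setup(Context); assumption:
    // the task JVM executes on the node that hosts its container, so the
    // local hostname identifies the cluster node.
    public static void main(String[] args) throws Exception {
        String host = InetAddress.getLocalHost().getHostName();
        System.out.println("mapper would be running on: " + host);
    }
}
```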

Re: Map Shuffle Bytes

2012-12-26 Thread Eduard Skaley
counter of Map output bytes. Per-partition counters can be constructed on the user side if needed, by pre-computing the partition before emit (using the same partitioner) and counting up the bytes of your objects for its counter. On Tue, Dec 25, 2012 at 6:03 PM, Eduard Skaley e.v.ska...@gmail.com
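The per-partition idea from the reply above can be sketched in plain Java (no Hadoop dependency, so the snippet is runnable as-is). The partitionFor method mirrors the formula of Hadoop's default HashPartitioner; in a real job you would instead call your own Partitioner's getPartition(key, value, numReduceTasks) and increment a counter via context.getCounter(...). The record data is purely illustrative:

```java
import java.nio.charset.StandardCharsets;

public class PartitionByteCounter {
    // Same formula as Hadoop's default HashPartitioner.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 4;
        long[] bytesPerPartition = new long[numReduceTasks];
        // Stand-in for the (key, value) pairs a mapper would emit.
        String[][] records = { {"apple", "1"}, {"banana", "2"}, {"apple", "3"} };
        for (String[] kv : records) {
            // Pre-compute the partition before "emitting", then charge the
            // serialized size of key + value to that partition's counter.
            int p = partitionFor(kv[0], numReduceTasks);
            bytesPerPartition[p] += kv[0].getBytes(StandardCharsets.UTF_8).length
                                  + kv[1].getBytes(StandardCharsets.UTF_8).length;
        }
        for (int p = 0; p < numReduceTasks; p++) {
            System.out.println("partition " + p + ": " + bytesPerPartition[p] + " bytes");
        }
    }
}
```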

Re: Map Shuffle Bytes

2012-12-26 Thread Eduard Skaley
in from the required record reader - for example a TextRecordReader uses a Long key that denotes current offset in file, which you could use as a simple, progressing counter of bytes read thus far. On Wed, Dec 26, 2012 at 5:16 PM, Eduard Skaley e.v.ska...@gmail.com wrote: Hi, I mean
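The offset trick above can be illustrated without Hadoop: with TextInputFormat, the Long key handed to map() is the starting byte offset of each line in the file, so the key itself is a running count of bytes consumed from the split. A standalone sketch of that bookkeeping (the input string is invented for illustration):

```java
public class OffsetProgress {
    public static void main(String[] args) {
        // Simulated split contents; each line's starting offset is what a
        // TextRecordReader would pass to map() as the Long key.
        String data = "first line\nsecond\nthird record\n";
        long offset = 0;
        for (String line : data.split("\n")) {
            System.out.println("key=" + offset + " value=" + line);
            // Advance by the line's bytes plus the newline the reader consumed.
            offset += line.getBytes(java.nio.charset.StandardCharsets.UTF_8).length + 1;
        }
        System.out.println("bytes read so far: " + offset);
    }
}
```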

Map Shuffle Bytes

2012-12-25 Thread Eduard Skaley
Hello guys, I need a counter for bytes shuffled to the mappers. Is there an existing one, or should I define one myself? How can I implement such a counter? Thank you and happy Christmas time, Eduard

Re: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space

2012-11-05 Thread Eduard Skaley
not on MRv1. Each container gets 1GB at the moment. Can you try increasing memory per reducer? On Wed, Oct 31, 2012 at 9:15 PM, Eduard Skaley e.v.ska...@gmail.com wrote: Hello, I'm getting this error during job execution: 16:20:26 INFO [main
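For reference, on MRv2/YARN the container and heap sizes for reduce tasks are controlled by properties like the following (the names are the standard MRv2 ones; the values below are examples only and must be tuned to the cluster):

```xml
<!-- mapred-site.xml: example values only -->
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>2048</value>  <!-- container size for each reduce task -->
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx1638m</value>  <!-- JVM heap, kept below the container size -->
</property>
```

If the OutOfMemoryError happens during the shuffle's in-memory merge, lowering mapreduce.reduce.shuffle.input.buffer.percent is another commonly suggested knob.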

Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space

2012-10-31 Thread Eduard Skaley
Hello, I'm getting this error during job execution: 16:20:26 INFO [main] Job - map 100% reduce 46% 16:20:27 INFO [main] Job - map 100% reduce 51% 16:20:29 INFO [main] Job - map 100% reduce 62% 16:20:30 INFO [main]

Re: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space

2012-10-31 Thread Eduard Skaley
It was installed through Cloudera Manager and we took the default value for the per-reducer memory. Hello, I'm getting this error during job execution: 16:20:26 INFO [main] Job - map 100% reduce 46% 16:20:27 INFO [main] Job - map 100% reduce

Re: Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError Java Heap Space

2012-10-31 Thread Eduard Skaley
each container gets 1GB at the moment. Can you try increasing memory per reducer? On Wed, Oct 31, 2012 at 9:15 PM, Eduard Skaley e.v.ska...@gmail.com wrote: Hello, I'm getting this error during job execution: 16:20:26 INFO [main

Re: Controlling on which node a reducer will be executed

2012-08-30 Thread Eduard Skaley
See https://issues.apache.org/jira/browse/MAPREDUCE-199 if you want to take a shot at implementing this. HBase would love to have this, I think. On Mon, Aug 27, 2012 at 10:41 PM, Eduard Skaley e.v.ska...@gmail.com wrote: Hi, I have a question concerning the execution of reducers. To make effective use of the data

Re: Controlling on which node a reducer will be executed

2012-08-29 Thread Eduard Skaley
unavailable at the moment. See https://issues.apache.org/jira/browse/MAPREDUCE-199 if you want to take a shot at implementing this. HBase would love to have this, I think. On Mon, Aug 27, 2012 at 10:41 PM, Eduard Skaley e.v.ska...@gmail.com wrote: Hi, I have a question concerning the execution

Controlling on which node a reducer will be executed

2012-08-27 Thread Eduard Skaley
Hi, I have a question concerning the execution of reducers. To make effective use of the data locality of blocks in my use case, I want to control on which node a reducer will be executed. In my scenario I have a chain of map-reduce jobs where each job will be executed by exactly N reducers. I want to