Re: max 1 mapper per node

2012-05-03 Thread Radim Kolar
On 27.4.2012 17:30, Robert Evans wrote: Radim, You would want to modify the application master for this, and it is likely to be a bit of a hack because the RM scheduler itself is not really designed for this. What about doing something like this: in the job JAR there would be a loadable plugin

Re: max 1 mapper per node

2012-05-03 Thread Radim Kolar
If a plugin system for the AM is overkill, something simpler could be done instead: job properties such as maximum number of mappers per node, maximum number of reducers per node, maximum percentage of non-data-local tasks, and maximum percentage of rack-local tasks, all set in the job configuration.
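The per-node cap proposed above can be sketched as plain assignment bookkeeping. This is a standalone Python illustration of the idea, not Hadoop code; the property name `job.max.mappers.per.node` is invented for the example and does not exist in stock Hadoop.

```python
class CapAwareAssigner:
    """Tracks running mappers per node and refuses assignments over the cap."""

    def __init__(self, max_mappers_per_node):
        self.max_mappers_per_node = max_mappers_per_node
        self.running = {}  # node -> count of mappers currently assigned there

    def try_assign_mapper(self, node):
        if self.running.get(node, 0) >= self.max_mappers_per_node:
            # Cap reached: the AM would hold the container or request elsewhere.
            return False
        self.running[node] = self.running.get(node, 0) + 1
        return True


# Hypothetical job property, read from the job configuration:
job_props = {"job.max.mappers.per.node": 1}
assigner = CapAwareAssigner(job_props["job.max.mappers.per.node"])
print(assigner.try_assign_mapper("node1"))  # True: first mapper on node1
print(assigner.try_assign_mapper("node1"))  # False: cap of 1 reached
print(assigner.try_assign_mapper("node2"))  # True: a different node is free
```

The same structure extends to reducer caps and locality percentages: each is just another counter checked before an assignment is accepted.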

Re: Getting filename in case of MultipleInputs

2012-05-03 Thread Harsh J
Subbu, The only way I can think of is to use an overriding InputFormat/RecordReader pair that sets the "map.input.file" config value during its initialization, using the received FileSplit object. This should be considered a bug, however, and even 2.x is affected. Can you please file a JIRA o
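The shape of Harsh's workaround can be shown in a standalone Python sketch (this is not Hadoop code; the class names here are hypothetical stand-ins for a delegating RecordReader): during initialization, the wrapper records the split's file path into the configuration under "map.input.file" before handing off to the real reader, so the map function can look it up.

```python
class FileSplit:
    """Toy stand-in for a file split: just carries the input path."""
    def __init__(self, path):
        self.path = path


class ListRecordReader:
    """Toy 'real' reader that yields records from an in-memory list."""
    def __init__(self, records):
        self.records = records

    def initialize(self, split, conf):
        pass  # a real reader would open the file named by the split

    def read(self):
        return list(self.records)


class DelegatingRecordReader:
    """Wraps a real reader; on initialize, publishes the input file name."""
    def __init__(self, delegate):
        self.delegate = delegate

    def initialize(self, split, conf):
        # The key step: record which file this split came from before reading.
        conf["map.input.file"] = split.path
        self.delegate.initialize(split, conf)

    def read(self):
        return self.delegate.read()


conf = {}
reader = DelegatingRecordReader(ListRecordReader(["rec1", "rec2"]))
reader.initialize(FileSplit("/data/input/part-00000"), conf)
# The map function can now find out which file its records came from:
print(conf["map.input.file"])  # prints /data/input/part-00000
```

The reason this wrapper is needed at all with MultipleInputs is that the framework hands the mapper a wrapped split, so the usual per-file context is lost unless something re-publishes it, as above.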

RE: Getting filename in case of MultipleInputs

2012-05-03 Thread Devaraj k
Hi Subbu, I am not sure which input format you are using. If you are using FileInputFormat, you can get the file name this way in the map function: import org.apache.hadoop.mapred.FileSplit; import org.apache.hadoop.mapreduce.InputSplit; import org.apache.hadoop.mapreduce.Mapper; public c

Re: Getting filename in case of MultipleInputs

2012-05-03 Thread Bejoy Ks
Hi Subbu, The file/split processed by a mapper can be obtained from the web UI while the job is running. However, this detail can't be obtained once the job is moved to JT history. Regards Bejoy On Thu, May 3, 2012 at 6:25 PM, Kasi Subrahmanyam wrote: > Hi, > > Could anyone suggest how

Re: MapReduce jobs remotely

2012-05-03 Thread Kevin
I believe I have fixed it. I am using pig-0.9.2. My cluster is using CDH4b2, but I am not using the Pig RPM install on the client. I downloaded the tarball from Apache. Each machine in my cluster has $HADOOP_MAPRED_HOME defined. I cleaned up my Pig configuration directory to only have the follow

kerberos security enabled and hadoop/hdfs/mapred users

2012-05-03 Thread Koert Kuipers
Do I understand it correctly that with Kerberos enabled, the mappers and reducers will be "run as" the actual user that started them, as opposed to the user that runs the TaskTracker, which is mapred or hadoop or something like that?

Re: kerberos security enabled and hadoop/hdfs/mapred users

2012-05-03 Thread Ravi Prakash
Yes Koert! That is correct! On Thu, May 3, 2012 at 6:08 PM, Koert Kuipers wrote: > do i understand it correctly that with kerberos enabled the mappers and > reducers will be "run as" the actual user that started them? as opposed to > the user that runs the tasktracker, which is mapred or hadoop
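For context on where this behaviour comes from on a secure 1.x cluster: with the default DefaultTaskController, tasks run as the TaskTracker's own user (typically mapred); enabling security means switching to LinuxTaskController, which launches tasks as the submitting user via a setuid helper. A mapred-site.xml sketch (property and class names per the Hadoop 1.x security setup; treat the exact values as something to verify against your distribution's docs):

```xml
<!-- mapred-site.xml: run tasks as the submitting user instead of the
     TaskTracker user. Requires the setuid task-controller binary and a
     matching taskcontroller.cfg (log dirs, allowed group, min.user.id). -->
<property>
  <name>mapred.task.tracker.task-controller</name>
  <value>org.apache.hadoop.mapred.LinuxTaskController</value>
</property>
```

This is also why, on secure clusters, every job submitter needs an account on the worker nodes: the task JVM is forked under that user's identity.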