No, you're right - to define the queue names at the cluster level, the mapred.queue.names is the right config. To specify a queue at the job level, mapred.job.queue.name is the right config.
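For anyone following along, here is a rough sketch of how the properties named in this thread might be collected into Hadoop-style <property> entries. It is only an illustration: the helper name build_queue_config and the example queue/ACL values are hypothetical, and it does not decide which file each entry belongs in (on some versions the per-queue ACL entries may need to go in a separate mapred-queue-acls.xml rather than mapred-site.xml, so check the docs for your release).

```python
import xml.etree.ElementTree as ET


def build_queue_config(queues, submit_acls):
    """Return a <configuration> element with the queue-related properties.

    queues: list of queue names, e.g. ["default", "foo", "bar"]
    submit_acls: dict mapping queue name -> "usernames groupnames" ACL string
    (build_queue_config is a hypothetical helper, not part of Hadoop.)
    """
    props = {
        # Cluster-level list of queues (mapred.queue.names, not mapred.job.queues).
        "mapred.queue.names": ",".join(queues),
        # Let FairScheduler pools reuse the per-job queue name.
        "mapred.fairscheduler.poolnameproperty": "mapred.job.queue.name",
        # Turn on ACL checks; the JT needs a restart to pick this up.
        "mapred.acls.enabled": "true",
    }
    for queue, acl in submit_acls.items():
        if queue not in queues:
            raise ValueError(f"ACL given for undefined queue: {queue}")
        props[f"mapred.queue.{queue}.acl-submit-job"] = acl

    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return root


if __name__ == "__main__":
    config = build_queue_config(
        ["default", "foo", "bar"],
        {"default": "patai,foobar users,adm", "foo": "spam eggs"},
    )
    print(ET.tostring(config, encoding="unicode"))
```

At the job level, a user would then still pick the queue with -Dmapred.job.queue.name=foo (or JobConf.setQueueName) when submitting.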
On Wed, Oct 17, 2012 at 11:10 PM, Patai Sangbutsarakum <silvianhad...@gmail.com> wrote:
> Harsh.. i am testing it again according to your last instruction.
>
>>> 2. Define your required queues:
>>> mapred.job.queues set to "default,foo,bar" for example, for 3 queues:
>>> default, foo and bar.
>
> From
> http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u4/cluster_setup.html#Configuring+the+Environment+of+the+Hadoop+Daemons
> I couldn't find "mapred.job.queues" from that link so i have been
> using mapred.queue.names which might be the case that it is my fault.
>
> Please suggest
>
> On Wed, Oct 17, 2012 at 8:43 AM, Harsh J <ha...@cloudera.com> wrote:
>> Hey Robin,
>>
>> Thanks for the detailed post.
>>
>> Just looked at your older thread, and you're right, the JT does write
>> into its system dir for users' job info and token files when
>> initializing the Job. The bug you ran into and the exception+trace you
>> got makes sense now.
>>
>> I just didn't see it on version which Patai seems to be using. I think
>> if he specifies a proper staging directory, he'll go through, cause
>> his trace is different than that of MAPREDUCE-4398 (i.e. system dir
>> vs. staging dir - you had system dir unfortunately).
>>
>> On Wed, Oct 17, 2012 at 8:39 PM, Goldstone, Robin J.
>> <goldsto...@llnl.gov> wrote:
>>> Yes, you would think that users shouldn't need to write to
>>> mapred.system.dir, yet that seems to be the case. I posted details about
>>> my configuration along with full stack traces last week. I won't re-post
>>> everything but essentially I have mapred.system.dir defined as a directory
>>> in HDFS owned by mapred:hadoop. I initially set the permissions to 755
>>> but when the job tracker started up it changed the permissions to 700.
>>> Then when I ran a job as a regular user I got this error:
>>>
>>> 12/10/09 16:27:03 INFO mapred.JobClient: Job Failed: Job initialization
>>> failed:
>>> org.apache.hadoop.security.AccessControlException:
>>> org.apache.hadoop.security.AccessControlException: Permission denied:
>>> user=robing, access=EXECUTE, inode="mapred":mapred:hadoop:rwx------
>>>
>>>
>>> I then manually changed the permissions back to 755 and ran again and got
>>> this error:
>>> 12/10/09 16:31:30 INFO mapred.JobClient: Job Failed: Job initialization
>>> failed:
>>> org.apache.hadoop.security.AccessControlException:
>>> org.apache.hadoop.security.AccessControlException: Permission denied:
>>> user=robing, access=WRITE, inode="mapred":mapred:hadoop:rwxr-xr-x
>>>
>>> I then changed the permissions to 777 and the job ran successfully. This
>>> suggests that some process was trying to write to
>>> mapred.system.dir but did not have sufficient permissions. The
>>> speculation is that this was being attempted under my uid instead of
>>> mapred. Perhaps it is something else. I welcome your suggestions.
>>>
>>>
>>> For completeness, I also have mapred.jobtracker.staging.root.dir set to
>>> /user within HDFS. I can verify the staging files are going there but
>>> something else is still trying to access mapred.system.dir.
>>>
>>> Robin Goldstone, LLNL
>>>
>>> On 10/17/12 12:00 AM, "Harsh J" <ha...@cloudera.com> wrote:
>>>
>>>>Hi,
>>>>
>>>>Regular users never write into the mapred.system.dir AFAICT. That
>>>>directory is just for the JT to use to mark its presence and to
>>>>"expose" the distributed filesystem it will be relying on.
>>>>
>>>>Users write to their respective staging directories, which lie
>>>>elsewhere and are per-user.
>>>>
>>>>Let me post my environment:
>>>>
>>>>- mapred.system.dir (A HDFS Dir for a JT to register itself) set to
>>>>"/tmp/mapred/system".
>>>>The /tmp/mapred and /tmp/mapred/system (or
>>>>whatever you configure it to) is to be owned by mapred:hadoop so that
>>>>the JT can feel free to reconfigure it.
>>>>
>>>>- mapreduce.jobtracker.staging.root.dir (A HDFS dir that represents
>>>>the parent directory for users to write their per-user job stage
>>>>files (JARs, etc.)) is set to "/user". The /user further contains each
>>>>user's home directories, owned all by them. For example:
>>>>
>>>>drwxr-xr-x - harsh harsh 0 2012-09-27 15:51 /user/harsh
>>>>
>>>>All staging files from local user 'harsh' are hence written as the
>>>>proper user under /user/harsh/.staging since that user does have
>>>>permissions to write there. For any user to access HDFS, they'd need a
>>>>home directory created on the HDFS by the admin first - and after that
>>>>things users do under their own directory will work just fine. The JT
>>>>would not have to try to create per-user directories.
>>>>
>>>>On Wed, Oct 17, 2012 at 5:22 AM, Patai Sangbutsarakum
>>>><silvianhad...@gmail.com> wrote:
>>>>> Thanks everyone, Seem like i hit the dead end.
>>>>> It's kind of funny when i read that jira; run it 4 time and everything
>>>>> will work.. where that magic number from..lol
>>>>>
>>>>> respects
>>>>>
>>>>> On Tue, Oct 16, 2012 at 4:12 PM, Arpit Gupta <ar...@hortonworks.com> wrote:
>>>>>> https://issues.apache.org/jira/browse/MAPREDUCE-4398
>>>>>>
>>>>>> is the bug that Robin is referring to.
>>>>>>
>>>>>> --
>>>>>> Arpit Gupta
>>>>>> Hortonworks Inc.
>>>>>> http://hortonworks.com/
>>>>>>
>>>>>> On Oct 16, 2012, at 3:51 PM, "Goldstone, Robin J." <goldsto...@llnl.gov> wrote:
>>>>>>
>>>>>> This is similar to issues I ran into with permissions/ownership of
>>>>>> mapred.system.dir when using the fair scheduler. We are instructed to
>>>>>> set the ownership of mapred.system.dir to mapred:hadoop and then when the job
>>>>>> tracker starts up (running as user mapred) it explicitly sets the
>>>>>> permissions on this directory to 700. Meanwhile when I go to run a job as
>>>>>> a regular user, it is trying to write stuff into mapred.system.dir but it
>>>>>> can't due to the ownership/permissions that have been established.
>>>>>>
>>>>>> Per discussion with Arpit Gupta, this is a bug with the fair scheduler and
>>>>>> it appears from your experience that there are similar issues with
>>>>>> hadoop.tmp.dir. The whole idea of the fair scheduler is to run jobs under
>>>>>> the user's identity rather than as user mapred. This is good from a
>>>>>> security perspective yet it seems no one bothered to account for this in
>>>>>> terms of the permissions that need to be set in the various directories to
>>>>>> enable this.
>>>>>>
>>>>>> Until this is sorted out by the Hadoop developers, I've put my attempts to
>>>>>> use the fair scheduler on hold...
>>>>>>
>>>>>> Regards,
>>>>>> Robin Goldstone, LLNL
>>>>>>
>>>>>> On 10/16/12 3:32 PM, "Patai Sangbutsarakum" <silvianhad...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> Hi Harsh,
>>>>>> Thanks for breaking it down clearly. I would say i am successful 98%
>>>>>> from the instruction.
>>>>>> The 2% is about hadoop.tmp.dir
>>>>>>
>>>>>> let's say i have 2 users
>>>>>> userA is a user that start hdfs and mapred
>>>>>> userB is a regular user
>>>>>>
>>>>>> if i use default value of hadoop.tmp.dir
>>>>>> /tmp/hadoop-${user.name}
>>>>>> I can submit job as userA but not as userB
>>>>>> user=userB, access=WRITE, inode="/tmp/hadoop-userA/mapred/staging":userA:supergroup:drwxr-xr-x
>>>>>>
>>>>>> i googled around; someone recommended to change hadoop.tmp.dir to
>>>>>> /tmp/hadoop.
>>>>>> This way it is almost a yay way; the thing is
>>>>>>
>>>>>> if I submit as userA it will create /tmp/hadoop in local machine which
>>>>>> ownership will be userA.userA,
>>>>>> and once I tried to submit job from the same machine as userB I will
>>>>>> get "Error creating temp dir in hadoop.tmp.dir /tmp/hadoop due to
>>>>>> Permission denied"
>>>>>> (because /tmp/hadoop is owned by userA.userA). Vice versa, if I delete
>>>>>> /tmp/hadoop and let the directory be created by userB, userA will not
>>>>>> be able to submit job.
>>>>>>
>>>>>> Which is the right approach i should work with?
>>>>>> Please suggest
>>>>>>
>>>>>> Patai
>>>>>>
>>>>>>
>>>>>> On Mon, Oct 15, 2012 at 3:18 PM, Harsh J <ha...@cloudera.com> wrote:
>>>>>>
>>>>>> Hi Patai,
>>>>>>
>>>>>> Reply inline.
>>>>>>
>>>>>> On Tue, Oct 16, 2012 at 2:57 AM, Patai Sangbutsarakum
>>>>>> <silvianhad...@gmail.com> wrote:
>>>>>>
>>>>>> Thanks for input,
>>>>>>
>>>>>> I am reading the document; i forget to mention that i am on cdh3u4.
>>>>>>
>>>>>>
>>>>>> That version should have the support for all of this.
>>>>>>
>>>>>> If you point your poolname property to mapred.job.queue.name, then you
>>>>>> can leverage the Per-Queue ACLs
>>>>>>
>>>>>>
>>>>>> Is that mean if i plan to 3 pools of fair scheduler, i have to
>>>>>> configure 3 queues of capacity scheduler. in order to have each pool
>>>>>> can leverage Per-Queue ACL of each queue.?
>>>>>>
>>>>>>
>>>>>> Queues are not hard-tied into CapacityScheduler. You can have generic
>>>>>> queues in MR. And FairScheduler can bind its Pool concept into the
>>>>>> Queue configuration.
>>>>>>
>>>>>> All you need to do is the following:
>>>>>>
>>>>>> 1. Map FairScheduler pool name to reuse queue names itself:
>>>>>>
>>>>>> mapred.fairscheduler.poolnameproperty set to 'mapred.job.queue.name'
>>>>>>
>>>>>> 2. Define your required queues:
>>>>>>
>>>>>> mapred.job.queues set to "default,foo,bar" for example, for 3 queues:
>>>>>> default, foo and bar.
>>>>>>
>>>>>> 3.
>>>>>> Define Submit ACLs for each Queue:
>>>>>>
>>>>>> mapred.queue.default.acl-submit-job set to "patai,foobar users,adm"
>>>>>> (usernames groupnames)
>>>>>>
>>>>>> mapred.queue.foo.acl-submit-job set to "spam eggs"
>>>>>>
>>>>>> Likewise for remaining queues, as you need it...
>>>>>>
>>>>>> 4. Enable ACLs and restart JT.
>>>>>>
>>>>>> mapred.acls.enabled set to "true"
>>>>>>
>>>>>> 5. Users then use the right API to set queue names before submitting
>>>>>> jobs, or use -Dmapred.job.queue.name=value via CLI (if using Tool):
>>>>>>
>>>>>> http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapred/JobConf.html#setQueueName(java.lang.String)
>>>>>>
>>>>>> 6. Done.
>>>>>>
>>>>>> Let us know if this works!
>>>>>>
>>>>>> --
>>>>>> Harsh J
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>>--
>>>>Harsh J
>>>
>>
>>
>>
>> --
>> Harsh J

--
Harsh J