Re: Multiples Jobs
Try the fair scheduler. It shares the cluster among the running jobs, so they will seem more simultaneous than under the default scheduler, which runs jobs in submission order.

http://hadoop.apache.org/mapreduce/docs/r0.21.0/fair_scheduler.html

On Mon, Sep 13, 2010 at 2:33 PM, Rahul Malviya wrote:
> Hi,
>
> I am running Pig jobs on Hadoop cluster.
>
> I just wanted to know whether I can run multiple jobs on hadoop cluster
> simultaneously.
>
> Currently when i start two jobs on hadoop they run in a serial fashion.
>
> Is there a way to run N jobs simultaneously on hadoop ?
>
> Thanks,
> Rahul
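For readers following the suggestion above, enabling the fair scheduler is a JobTracker-side configuration change. The fragment below is a minimal sketch, assuming a 0.20-era property name and that the fair scheduler jar (shipped under contrib) is already on the JobTracker classpath. Several configuration keys were renamed in 0.21, so check the linked documentation page for the exact key in your release.

```xml
<!-- mapred-site.xml on the JobTracker; restart the JobTracker afterwards. -->
<!-- Assumption: key name as used in 0.20-era releases; 0.21 may use a renamed key. -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
```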
Multiples Jobs
Hi,

I am running Pig jobs on a Hadoop cluster.

I just wanted to know whether I can run multiple jobs on a Hadoop cluster simultaneously.

Currently, when I start two jobs on Hadoop, they run in a serial fashion.

Is there a way to run N jobs simultaneously on Hadoop?

Thanks,
Rahul
Re: hadoop.job.ugi backwards compatibility
Moving the discussion over to the more appropriate mapreduce-dev.

On Mon, Sep 13, 2010 at 9:08 AM, Todd Lipcon wrote:
> 1) Groups resolution happens on the server side, where it used to happen on
> the client. Thus, all Hadoop users must exist on the NN/JT machines in order
> for group mapping to succeed (or the user must write a custom group mapper).

There is a plugin that performs the group lookup. See HADOOP-4656. There is no requirement for having the user accounts on the NN/JT, although that is the easiest approach. It is not recommended that the users be allowed to log in.

I think it is important that turning security on and off doesn't drastically change the semantics or protocols. That will become much, much harder to support downstream.

> 2) The hadoop.job.ugi parameter is ignored - instead the user has to use the
> new UGI.createRemoteUser("foo").doAs() API, even in simple security.

User code that counts on hadoop.job.ugi working will be horribly broken once you turn on security. Turning security on and off should not involve testing all of your applications. It is unfortunate that we ever used the configuration value as the user, but continuing to support it will make our users' code much, much more brittle.

-- Owen
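The group-lookup plugin mentioned above is selected through configuration. The fragment below is a sketch only: the property name is the one I understand HADOOP-4656 to have introduced, and `com.example.MyGroupsMapping` is a hypothetical class name standing in for a custom group mapper, not something from this thread.

```xml
<!-- core-site.xml on the NN/JT. -->
<!-- Assumption: com.example.MyGroupsMapping is a placeholder for a site-specific mapper class. -->
<property>
  <name>hadoop.security.group.mapping</name>
  <value>com.example.MyGroupsMapping</value>
</property>
```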
hadoop.job.ugi backwards compatibility
Hi all,

I wanted to start a (hopefully short) discussion around the treatment of the hadoop.job.ugi configuration in Hadoop 0.22 and beyond (as well as the secure 0.20 branch). In the current security implementation, the following incompatible changes have been made even for users who are sticking with "simple" security.

1) Groups resolution happens on the server side, where it used to happen on the client. Thus, all Hadoop users must exist on the NN/JT machines in order for group mapping to succeed (or the user must write a custom group mapper).

2) The hadoop.job.ugi parameter is ignored - instead the user has to use the new UGI.createRemoteUser("foo").doAs() API, even in simple security.

I'm curious whether the general user community feels these are acceptable breaking changes. The potential solutions I can see are:

For 1) Add a configuration like hadoop.security.simple.groupmappinglocation -> "client" or "server". If it's set to "client", the group mapping would continue to happen as it does in prior versions on the client side.

For 2) If security is "simple", we can have the FileSystem and JobClient constructors check for this parameter. If it's set, and there is no Subject object associated with the current AccessControlContext, wrap the creation of the RPC proxy with the correct doAs() call.

Although security is obviously an absolute necessity for many organizations, I know of a lot of people who have small clusters and small teams who don't have any plans to deploy it. For these people, I imagine the above backward-compatibility layer may be very helpful as they adopt the next releases of Hadoop. If we don't want to support these options going forward, we can of course emit deprecation warnings when they are in effect and remove the compatibility layer in the next major release.

Any thoughts here? Do people often make use of the hadoop.job.ugi variable to such an extent that this breaking change would block your organization from upgrading?

Thanks
-Todd

--
Todd Lipcon
Software Engineer, Cloudera
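The decision rule in the proposal for 2) can be sketched in plain Java. This is an illustration only, not Hadoop code: the class `JobUgiCompat`, its methods, and the parameter names are all hypothetical, and it assumes the legacy hadoop.job.ugi value has the form "user,group1,group2". The real change would wrap the RPC proxy creation in a doAs() call; the sketch only shows which user would be chosen.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of the proposed hadoop.job.ugi fallback; not part of Hadoop. */
final class JobUgiCompat {

    /** User and groups parsed from a legacy "user,group1,group2" value (assumed format). */
    static final class LegacyUgi {
        final String user;
        final List<String> groups;

        LegacyUgi(String user, List<String> groups) {
            this.user = user;
            this.groups = Collections.unmodifiableList(groups);
        }
    }

    /** Returns null when the value is unset or names no user. */
    static LegacyUgi parse(String value) {
        if (value == null) {
            return null;
        }
        String[] parts = value.trim().split("\\s*,\\s*");
        if (parts.length == 0 || parts[0].isEmpty()) {
            return null;
        }
        List<String> groups = new ArrayList<>();
        for (int i = 1; i < parts.length; i++) {
            if (!parts[i].isEmpty()) {
                groups.add(parts[i]);
            }
        }
        return new LegacyUgi(parts[0], groups);
    }

    /**
     * Chooses the user to act as.
     * subjectUser: user of the Subject already in the calling context, or null if none.
     * loginUser: the client's own login user, used when no fallback applies.
     */
    static String effectiveUser(String authMode, String jobUgi,
                                String subjectUser, String loginUser) {
        if (subjectUser != null) {
            return subjectUser; // an explicit doAs() by the caller always wins
        }
        if ("simple".equals(authMode)) {
            LegacyUgi legacy = parse(jobUgi);
            if (legacy != null) {
                return legacy.user; // compatibility path: honor hadoop.job.ugi
            }
        }
        return loginUser; // secure mode, or no legacy value set
    }

    public static void main(String[] args) {
        System.out.println(effectiveUser("simple", "foo,users", null, "todd"));
        System.out.println(effectiveUser("kerberos", "foo,users", null, "todd"));
    }
}
```

Keeping the fallback behind the "simple" check means the legacy value is never consulted once security is turned on, which is the case the reply in this thread is most concerned about.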
Re: Support Engineer Job Opening at Datameer
Would it be possible to have a mailing list for hadoop jobs, a j...@hadoop.apache.org? My observation is that the skills for using Hadoop for big data analysis are more specialized, particularly in math, stat, and machine learning.

Sincerely,
David G. Boney
david.g.bo...@gmail.com
http://www.sonicartifacts.com

On Sep 13, 2010, at 9:01 AM, Isabel Drost wrote:

> Hello Teresa,
>
> On Wed, 8 Sep 2010 Teresa Wingfield wrote:
>> Datameer, a provider of big data analytics built on Apache Hadoop, is
>> looking for a Support Engineer to provide technical assistance to
>> customers and to help diagnose and resolve customer support requests.
>> If you're interested in big data analytics and Hadoop, have strong
>> technical, communication and troubleshooting skills, and are
>> passionate about making customers successful, we would like to talk
>> to you.
>
> Just a hint - are you aware of the j...@apache.org mailing list? Maybe
> it makes sense to post the job offer there as well?
>
> Cheers,
> Isabel
Re: Support Engineer Job Opening at Datameer
Hello Teresa,

On Wed, 8 Sep 2010 Teresa Wingfield wrote:
> Datameer, a provider of big data analytics built on Apache Hadoop, is
> looking for a Support Engineer to provide technical assistance to
> customers and to help diagnose and resolve customer support requests.
> If you're interested in big data analytics and Hadoop, have strong
> technical, communication and troubleshooting skills, and are
> passionate about making customers successful, we would like to talk
> to you.

Just a hint - are you aware of the j...@apache.org mailing list? Maybe it makes sense to post the job offer there as well?

Cheers,
Isabel