On May 25, 2012, at 11:56 AM, Ron Chen wrote:

> Ralph: How common are jobs that request dynamic allocations? I have
> never seen Hadoop presentations talk about them at any BigData conference.

It actually occurs on every single MapReduce job - you just don't see it 
because it happens under the covers. The procedure used in Hadoop is:

1. your program (the "client") sets up an MR job, and then calls the Hadoop 
"Job" class to execute it

2. the Job class assembles the required allocation parameters based on your MR 
specifications (e.g., file names, memory) and contacts the JobTracker to get an 
allocation that fits it

3. the Job class in your client receives allocation responses from the 
JobTracker, and then launches your MR executables against those allocations

4. it then waits for job completion, upon which it returns the allocation to 
the JobTracker.

So it all happens "dynamically" in that your app obtains the allocation at
run time - you don't get an allocation beforehand and then execute your app
within it.
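
To make that concrete, a minimal MR driver looks roughly like this (a
sketch; I'm using the built-in identity Mapper/Reducer so it's complete -
a real job would plug in its own classes). Everything past the setup hides
inside waitForCompletion():

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class MRDriver {
        public static void main(String[] args) throws Exception {
            // step 1: the client sets up the MR job
            Job job = new Job(new Configuration(), "identity-example");
            job.setJarByClass(MRDriver.class);
            job.setMapperClass(Mapper.class);    // identity mapper
            job.setReducerClass(Reducer.class);  // identity reducer
            job.setOutputKeyClass(LongWritable.class);
            job.setOutputValueClass(Text.class);
            // the input paths feed the allocation request (step 2)
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            // steps 2-4 all happen in here: contact the JobTracker, launch
            // the tasks against the allocation it hands back, block until
            // completion, then return the allocation
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }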

> 
> 
> Just also want to mention that Moab is not open source, and I don't think we 
> will see much information about the integration from Moab. 

Yes and no - I'm working with Adaptive Computing on it, so while we won't see 
the actual code, a lot of the specs are being defined by me. :-)

> 
> 
>  -Ron
> 
> 
> 
> ----- Original Message -----
> From: Rayson Ho <ray...@scalablelogic.com>
> To: Ralph Castain <r...@open-mpi.org>
> Cc: "users@gridengine.org Group" <users@gridengine.org>
> Sent: Thursday, May 24, 2012 3:01 PM
> Subject: Re: [gridengine users] Gridengine and Hadoop
> 
> On Thu, May 24, 2012 at 1:58 PM, Ralph Castain <r...@open-mpi.org> wrote:
>> In C - nobody wants to use Java on their clusters, as many don't have it (as 
>> you note below) for both security and memory-footprint reasons.
>> 
>> I'm hoping to get everyone to the same API, but can work with it either way.
> 
> It's really good that it is now in C - but wasn't the C API still not
> fully working when you emailed me last time? :-P
> 
> And I did not expect this dynamic allocation feature to get so big -
> last time I looked at your code it was similar in size (and
> functionality!) to the HDFS locality load sensor written by DanT (when
> he was at Sun).
> 
> So if it is just HDFSFileFinder.java & hdfsalloc.pl, then I
> believe we don't need to change DanT's code. But if we are planning
> for the dynamic allocation integration, then it really would help if
> we could look at the APIs used by others.
> 
> Rayson
> 
>> 
>>> 
>>> If the API binding is written in Java, I am interested to know whether
>>> SLURM and Moab are calling the APIs in the job scheduler. While Grid
>>> Engine already has an embedded JVM for JMX, many sites don't enable the
>>> JVM, to save a bit of memory footprint.
>>> 
>>> Rayson
>>> 
>>> 
>>> 
>>>> 
>>>>> 
>>>>> I am wondering how the Hadoop job scheduler handles dynamic allocation
>>>>> - i.e., the file request done inside the MapReduce job with an
>>>>> asynchronous callback.
>>>> 
>>>> Basically, you submit a request for a number of slots and then wait for 
>>>> the response in a blocking "poll". The response generally provides a 
>>>> partial allocation - i.e., you don't get everything you want at once; the 
>>>> Hadoop RM doles slots out as each node contacts it to indicate its 
>>>> availability (which is why the launch takes so long). You then keep 
>>>> looping over the requests, updating the requested number of slots to 
>>>> reflect what you have already been given.
>>>> 
>>>> For MR, they launch your mapper against each slot as it is allocated, so 
>>>> you get a "rolling start". For MPI, we can't do that, so we have to wait 
>>>> until all resources have been allocated before we launch.
>>>> 
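
To sketch what that request loop looks like (the types here are placeholders
for illustration, not the actual yarn API - only the shape of the protocol
matters):

    import java.util.ArrayList;
    import java.util.List;

    // Placeholder protocol: each allocate() call blocks, then returns
    // whatever slots the RM can grant right now - possibly only some.
    interface ResourceManager {
        List<String> allocate(int slotsStillNeeded);
    }

    class SlotAllocator {
        static List<String> allocateAll(ResourceManager rm, int numSlots) {
            List<String> granted = new ArrayList<String>();
            while (granted.size() < numSlots) {
                // re-issue the request, updated to reflect what we already hold
                granted.addAll(rm.allocate(numSlots - granted.size()));
                // MR would launch a mapper on each new slot right here (the
                // "rolling start"); MPI instead falls through and launches
                // only once the allocation is complete, so wireup can happen
                // all at once.
            }
            return granted;
        }
    }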
>>>> 
>>>>> 
>>>>> Rayson
>>>>> 
>>>>> 
>>>>> 
>>>>>> 
>>>>>> Ralph
>>>>>> 
>>>>>> On May 24, 2012, at 8:34 AM, Rayson Ho wrote:
>>>>>> 
>>>>>>> Just want to update everyone - I followed up with Ralph @ EMC, and I
>>>>>>> looked at his code, which is very similar to DanT's code in SGE 6.2u5
>>>>>>> - i.e., they both pull information from HDFS and use the locality info
>>>>>>> to affect scheduling.
>>>>>>> 
>>>>>>> However, the APIs used are different, and we will pay attention to the
>>>>>>> Hadoop 2.x API changes and test DanT's integration again when 2.x
>>>>>>> comes out.
>>>>>>> 
>>>>>>> CB, can you let me know about the multi-user issue? As mentioned
>>>>>>> before, we have HBase, Pig, Hive, etc. tested with our Hadoop setup, but
>>>>>>> we don't have real users on it, so it really would help if you
>>>>>>> could let us know the issues you've encountered.
>>>>>>> 
>>>>>>> Rayson
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Fri, Mar 30, 2012 at 3:18 PM, CB <cbalw...@gmail.com> wrote:
>>>>>>>> I'm very much interested in SGE + Hadoop enhancement.
>>>>>>>> 
>>>>>>>> I'm currently testing Dan T's Hadoop + SGE integration for a multi-user
>>>>>>>> environment on an internal dev cluster, and it's working nicely.
>>>>>>>> But it is not easy to set up: it requires changing file permissions in 
>>>>>>>> various places to make it work in a multi-user environment.
>>>>>>>> 
>>>>>>>> - Chansup
>>>>>>>> 
>>>>>>>> On Fri, Mar 30, 2012 at 1:42 PM, Chris Dagdigian <d...@sonsorol.org> 
>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I'm registering my interest here.
>>>>>>>>> 
>>>>>>>>> Reuti -- if you could pass my email along to Ralph I'd appreciate it.
>>>>>>>>> 
>>>>>>>>> I have several consulting customers using EMC Isilon storage on Grid
>>>>>>>>> Engine HPC clusters, and we've been getting pinged by EMC/Greenplum 
>>>>>>>>> sales reps pushing to show off the combination of native HDFS support 
>>>>>>>>> in Isilon + the Greenplum Hadoop appliance integration.
>>>>>>>>> 
>>>>>>>>> Basically I have a few largish sites that could test & provide 
>>>>>>>>> feedback if
>>>>>>>>> things work out. Some are commercial, some are .gov & all are 
>>>>>>>>> interested in
>>>>>>>>> SGE + Hadoop enhancements.
>>>>>>>>> 
>>>>>>>>> -dag
>>>>>>>>> 
>>>>>>>>> Reuti wrote:
>>>>>>>>>> 
>>>>>>>>>> On behalf of Ralph Castain, whom you may know from the Open MPI
>>>>>>>>>> mailing list, I want to forward this email to your attention.
>>>>>>>>>> 
>>>>>>>>>> -- Reuti
>>>>>>>>>> 
>>>>>>>>>>>> I have a question for the Gridengine community, but thought I'd run
>>>>>>>>>>>> it through you, as I believe you work in that area.
>>>>>>>>>>>> 
>>>>>>>>>>>> As you may know, I am now employed by Greenplum/EMC to work on
>>>>>>>>>>>> resource management for Hadoop as well as MPI. The main concern,
>>>>>>>>>>>> frankly, is that the current Hadoop RM (yarn) scales poorly in
>>>>>>>>>>>> terms of launch and provides no support for MPI wireup, thus
>>>>>>>>>>>> causing MPI jobs to exhibit quadratic scaling of startup times.
>>>>>>>>>>>> 
>>>>>>>>>>>> The only reason for using yarn is that it has the HDFS interface
>>>>>>>>>>>> required to determine file locality, thus allowing users to place
>>>>>>>>>>>> processes network-near to the files they will use. I have initiated
>>>>>>>>>>>> an effort here at GP to create a C library for accessing HDFS to
>>>>>>>>>>>> obtain that locality info, and expect to have it completed in the
>>>>>>>>>>>> next few weeks.
>>>>>>>>>>>> 
>>>>>>>>>>>> Armed with that capability, it would be possible to extend more
>>>>>>>>>>>> capable RMs such as Gridengine so that users could obtain HDFS-based
>>>>>>>>>>>> allocations for their MapReduce applications. This would allow
>>>>>>>>>>>> Gridengine to support Hadoop operations, and make Hadoop clusters
>>>>>>>>>>>> that use Gridengine as their RM "multi-use".
>>>>>>>>>>>> 
>>>>>>>>>>>> Would this be of interest to the community? I can contribute the
>>>>>>>>>>>> C-lib code for their use under a BSD-like license structure, if
>>>>>>>>>>>> that would help.
>>>>>>>>>>>> 
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Ralph
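
For anyone curious what that locality lookup involves, here is the
Java-side equivalent of what such a C library has to replicate (a minimal
sketch against the stock HDFS client API; the class name is mine, and the
C library's actual interface may differ):

    import java.util.Arrays;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsLocality {
        // For each block of an HDFS file, print the hosts holding a
        // replica - exactly the info an RM needs to place processes
        // network-near to the data.
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            FileStatus stat = fs.getFileStatus(new Path(args[0]));
            BlockLocation[] blocks =
                fs.getFileBlockLocations(stat, 0, stat.getLen());
            for (BlockLocation b : blocks) {
                System.out.println("offset " + b.getOffset() + ": "
                    + Arrays.toString(b.getHosts()));
            }
            fs.close();
        }
    }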


_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
