Hi Ralph,

I admit - I've only been half-following the OpenMPI progress. Do you have a technical write-up of what has been done?
Thanks,
Brian

On May 20, 2012, at 9:31 AM, Ralph Castain wrote:

> FWIW: Open MPI now has an initial cut at "MR+" that runs map-reduce under any
> HPC environment. We don't have the Java integration yet to support the Hadoop
> MR class, but you can write a mapper/reducer and execute that programming
> paradigm. We plan to integrate the Hadoop MR class soon.
>
> If you already have that integration, we'd love to help port it over. We
> already have the MPI support completed, so any mapper/reducer could use it.
>
>
> On May 20, 2012, at 7:12 AM, Pierre Antoine DuBoDeNa wrote:
>
>> We run similar infrastructure in a university project.. we plan to install
>> hadoop.. and looking for "alternatives" based on hadoop in case the pure
>> hadoop is not working as expected.
>>
>> Keep us updated on the code release.
>>
>> Best,
>> PA
>>
>> 2012/5/20 Stijn De Weirdt <stijn.dewei...@ugent.be>
>>
>>> hi all,
>>>
>>> i'm part of an HPC group of a university, and we have some users that are
>>> interested in Hadoop to see if it can be useful in their research and we
>>> also have researchers that are using hadoop already on their own
>>> infrastructure, but that is not enough reason for us to start with
>>> dedicated Hadoop infrastructure (we are now only running torque
>>> based clusters with and without shared storage; setting up and properly
>>> maintaining Hadoop infrastructure requires quite some understanding of new
>>> software)
>>>
>>> to be able to support these needs we wanted to do just this: use current
>>> HPC infrastructure to make private hadoop clusters so people can do some
>>> work. if we attract enough interest, we will probably setup dedicated
>>> infrastructure, but by that time we (the admins) will also have a better
>>> understanding of what is required.
>>>
>>> so we used to look at HOD for testing/running hadoop on existing
>>> infrastructure (never really looked at myhadoop though).
>>> but (imho) the current HOD code base is not in such a good state. we did
>>> some work to get it working and added some features, to come to the
>>> conclusion that it was not sufficient (and not maintainable).
>>>
>>> so we wrote something from scratch with same functionality as HOD, and
>>> much more (eg HBase is now possible, with or without MR1; some default
>>> tuning; easy to add support for yarn instead of MR1).
>>> it has some support for torque, but my laptop is also sufficient. (the
>>> torque support is a wrapper to submit the job)
>>> we gave a workshop on hadoop using it (25 people, and each with their own
>>> 5 node hadoop cluster) and it went rather well.
>>>
>>> it's not in a public repo yet, but we could do that. if interested, let me
>>> know, and i see what can be done. (releasing the code is on our todo list,
>>> but if there is some demand, we can do it sooner)
>>>
>>>
>>> stijn
>>>
>>>
>>> On 05/18/2012 05:07 PM, Pierre Antoine DuBoDeNa wrote:
>>>
>>>> I am also interested to learn about myHadoop as I use a shared storage
>>>> system and everything runs on VMs and not actual dedicated servers.
>>>>
>>>> in like amazon EC2 environment which you just have VMs and huge central
>>>> storage, is it any helpful to use hadoop to distribute jobs and maybe
>>>> parallelize algorithms, or is better to go with other technologies?
>>>>
>>>> 2012/5/18 Manu S <manupk...@gmail.com>
>>>>
>>>>> Hi All,
>>>>>
>>>>> Guess HOD could be useful existing HPC cluster with Torque scheduler which
>>>>> needs to run map-reduce jobs.
>>>>>
>>>>> Also read about *myHadoop - Hadoop on demand on traditional HPC
>>>>> resources* will support many HPC schedulers like SGE, PBS etc to overcome the
>>>>> integration of shared-architecture (HPC) & shared-nothing
>>>>> architecture (Hadoop).
>>>>>
>>>>> Any real use case scenarios for integrating hadoop map/reduce in existing
>>>>> HPC cluster and what are the advantages of using hadoop features in HPC
>>>>> cluster?
>>>>>
>>>>> Appreciate your comments on the same.
>>>>>
>>>>> Thanks,
>>>>> Manu S
>>>>>
>>>>>
>>>>> On Fri, May 18, 2012 at 12:41 AM, Merto Mertek <masmer...@gmail.com> wrote:
>>>>>
>>>>>> If I understand it right HOD is mentioned mainly for merging existing HPC
>>>>>> clusters with hadoop and for testing purposes..
>>>>>>
>>>>>> I cannot find what is the role of Torque here (just initial nodes
>>>>>> allocation?) and which is the default scheduler of HOD ? Probably the
>>>>>> scheduler from the hadoop distribution?
>>>>>>
>>>>>> In the doc is mentioned a MAUI scheduler, but probably if there would be an
>>>>>> integration with hadoop there will be any document on it..
>>>>>>
>>>>>> thanks..