Thanks for that Lewis, very useful. Indeed, my question was never meant to be a pros vs cons comparison; I'm just interested to know where people see the differences, as Hadoop clearly rules the roost in "Big Data" stuff.

My background is in Business Intelligence, so I come into contact with plenty of Hadoop + MapReduce PR daily and end up swamped with that stuff (not that I've found much Hadoop in the wild, just press fodder). I'm interested because people clearly see a hole in the Hadoop ecosystem that leaves a gap in the market for the OODT setup, and should that use case arise I'd like to make sure I'm choosing the correct tool for the job.

Cheers

Tom



On 03/11/13 14:27, Lewis John Mcgibbney wrote:
Hi Tom,

On Fri, Nov 1, 2013 at 8:09 AM, Tom Barber <[email protected]> wrote:

    Morning,

    Chris will remember me asking on IRC a couple of years ago about
    how OODT differs from Hadoop in terms of features and
    functionality, to which he gave a great page-long explanation of
    what the differences were. I vowed to copy that information off
    and save it somewhere useful, and of course never did. I then
    asked Sean, who also couldn't dig it up.


What a shame. It would have been great to at least see this, if not get it documented as you mention. Oh well. Community lists are as good as it gets IMHO, so here we go.


    So, fine folks of the OODT community, for a novice like me who
    would be interested in "selling" OODT to users if the correct
    use case came along, when someone says "Isn't OODT just a different
    type of Hadoop?" what do I answer?


I am relatively new to OODT, so my opinion here is pretty abstract. However, I have been using Hadoop much longer and therefore hope that some of what I'm saying contributes to our shared understanding.

OODT
=====
I was attracted to OODT by the modular, component-oriented design of the project as a whole. It is down to the system designer (the initial person/team who picks up OODT) to review and select which parts of the overall project they need to accommodate their data workflow(s). Due to the modular nature of the project, components can be substituted as the nature and/or characteristics of the data workflow change over time. A beautiful aspect of OODT is that many tools and instruments have already been built to meet these data workflow requirements.

Hadoop
======
For me, Hadoop (which I consider a blanket term) is essentially an operating system, as opposed to OODT, which I've described as a modularized data workflow platform. Hadoop provides a filesystem (HDFS), a data processing platform (MapReduce), and an API through which we can submit and execute jobs. Additionally, we all know about the bolt-ons such as workflow monitoring, security and so forth. In this respect it is down to the engineer to build the data workflow around/on top of Hadoop from the components provided. One other thing which I think characterizes Hadoop is that, generally speaking, data follows a 'write-once, read-many' logic, whereas this is not necessarily the case with OODT.
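
To make the MapReduce part of that description a bit more concrete, here is a minimal sketch of the map / shuffle / reduce flow in plain Python. It does not use the Hadoop API at all; the function names (map_phase, shuffle, reduce_phase, word_count) are made up purely for illustration, and a real Hadoop job would read its input from HDFS and run these phases across a cluster.

```python
from collections import defaultdict


def map_phase(records):
    """Emit a (word, 1) pair for every word in every input record."""
    for record in records:
        for word in record.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(records):
    """Run the three illustrative phases in order over in-memory records."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    print(word_count(["big data", "Big deal"]))
```

The input records are only ever read, never modified, which loosely mirrors the 'write-once, read-many' pattern mentioned above.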


    I'd like to document this type of comparison stuff on the Wiki as
    well, as I think it's useful for people to know and understand.


I'm sure that the above is obvious to many and that I'm merely mentioning material from the immediate surroundings, however this is my experience so far using OODT and the comparisons I can draw myself.

When I started responding, it was not my aim to engage in a pros vs cons of each piece of software, so I hope the brief reply above can act as a contribution to the conversation and we can take this onwards.

Thanks
Lewis


--
*Tom Barber* | Technical Director

meteorite bi
*T:* +44 20 8133 3730
*W:* www.meteorite.bi | *Skype:* meteorite.consulting
*A:* Surrey Technology Centre, Surrey Research Park, Guildford, GU2 7YG, UK
