Hello

I am working on a master's thesis project that aims to integrate a custom
MapReduce framework, which has a similar MR interface but its own
implementation and pipeline, with higher-level language frameworks such as
Pig.

The aim is to make the proprietary MR framework usable in as many
scenarios as possible, keeping its own pipeline, with minimal changes to
the applications or frameworks that use Hadoop MapReduce.

Currently, the MR master and workers have been integrated with YARN so that
such jobs can be launched on YARN. The framework is written in C++ and
runs Map and Reduce functions defined in OpenCL.

Given the large landscape of Hadoop projects, I would appreciate pointers to
resources, literature, or documentation on how this has been achieved
elsewhere (Pig can run on top of at least Hadoop MR and Spark) or on which
options could be considered. (I am already reading up on YARN, Pig, and so
on, but some pointers would be really helpful.)

Thank you,

Ion
