[ 
https://issues.apache.org/jira/browse/GORA-346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044429#comment-14044429
 ] 

Moritz Hoffmann commented on GORA-346:
--------------------------------------

You're welcome ;-)

There is a deeper issue with Hadoop dependencies which cannot be solved just by 
switching to the right Hadoop dependency. First of all, the Hadoop API had some 
structural changes between 1.x and 2.x. These structural changes seem to be 
limited to what can be abstracted in a shim layer; as far as I can see, only 
Job/JobContext creation is relevant to Gora. I cannot say anything about 
non-structural changes as I'm not an expert on Hadoop.

The core issue that needs to be solved is that Gora should run on Hadoop 1 and 
2, and it is not entirely under Gora's control which dependency gets pulled in. 
Some of Gora's dependencies work with either Hadoop 1 or 2, so it depends very 
much on the environment where Gora is run. Using a shim layer it is possible 
to create an abstraction over some aspects of Hadoop (actually, with some more 
effort a complete abstraction layer could be created).
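
To illustrate how a shim layer could pick the right implementation at runtime, 
here is a minimal sketch. It assumes (to my knowledge, not verified against 
every release) that org.apache.hadoop.mapreduce.JobContext is a concrete class 
in Hadoop 1.x and an interface in 2.x. The class HadoopVersionProbe and its 
methods are hypothetical names for this example; they are not part of Gora or 
Hadoop and not taken from the attached patches.

```java
// Sketch only: HadoopVersionProbe and its methods are hypothetical names,
// not part of Gora or Hadoop.
final class HadoopVersionProbe {

    // Hadoop type whose shape is assumed to differ between 1.x and 2.x.
    static final String JOB_CONTEXT = "org.apache.hadoop.mapreduce.JobContext";

    private HadoopVersionProbe() {}

    /** Loads a class by name without initializing it; returns null if it is absent. */
    static Class<?> loadOrNull(String className) {
        try {
            return Class.forName(className, false,
                    HadoopVersionProbe.class.getClassLoader());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    /**
     * Maps the JobContext type to an API generation:
     * 0 = Hadoop not on the classpath,
     * 1 = concrete class (assumed 1.x style),
     * 2 = interface (assumed 2.x style).
     */
    static int classify(Class<?> jobContext) {
        if (jobContext == null) {
            return 0;
        }
        return jobContext.isInterface() ? 2 : 1;
    }

    /** Probes the classpath that loaded this class. */
    static int detectHadoopMajorVersion() {
        return classify(loadOrNull(JOB_CONTEXT));
    }
}
```

A shim factory could call detectHadoopMajorVersion() once and then load the 
matching Job/JobContext creation code by name, so that the main Gora code never 
links against version-specific classes directly.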

The second issue is that Hadoop's Maven dependencies changed. It is not 
enough to just set the version to 2.x, because the dependencies also have new 
artifactId values. Hence it is not easy to switch between building against 
Hadoop 1 and 2. I experimented with this and got a half-working solution using 
Maven profiles, but I cannot recommend this solution for production code.

So, in my eyes Gora is left with two choices: either stay at Hadoop 1 and rely 
on the shims layer to provide access to Hadoop 2; or switch to Hadoop 2 and use 
the shims layer to access Hadoop 1. I personally think the second option is to 
be preferred as Hadoop 2 will eventually replace Hadoop 1. The question is 
just when to make the change.

> Create shim layer to support multiple hadoop versions
> -----------------------------------------------------
>
>                 Key: GORA-346
>                 URL: https://issues.apache.org/jira/browse/GORA-346
>             Project: Apache Gora
>          Issue Type: Improvement
>    Affects Versions: 0.5
>            Reporter: Renato Javier Marroquín Mogrovejo
>              Labels: patch
>         Attachments: GORA-346_v1.patch, GORA-346_v2.patch, GORA-346_v3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
