[ https://issues.apache.org/jira/browse/PIG-660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-660:
----------------------------------

    Attachment: pig_660_shims.patch

The attached patch, pig_660_shims.patch, introduces a compatibility layer 
similar to the one in https://issues.apache.org/jira/browse/HIVE-487 . 
HadoopShims.java contains wrappers that hide interface differences between 
Hadoop 18 and Hadoop 20; when an interface change affects Pig, a shim is added 
to this class and used by Pig.

Separate versions of the shims are maintained for different Hadoop versions.
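To make the pattern concrete, here is a minimal sketch of what such a shim 
class looks like. The package, class, and method names below are illustrative 
only and are not necessarily the ones used in pig_660_shims.patch; the point is 
that Pig code calls the wrapper, and the build compiles exactly one 
version-specific copy of the class.

package org.apache.pig.shims;  // hypothetical package name, for illustration

import org.apache.hadoop.mapred.JobConf;

// One copy of this class per supported Hadoop version, kept in
// version-specific source directories; ant compiles exactly one copy,
// selected by the Hadoop version property.
public class HadoopShims {

    private HadoopShims() {}  // static wrappers only, never instantiated

    // Hypothetical shim: Pig code calls HadoopShims.getTaskId(conf) instead
    // of touching the Hadoop API directly. Each version's copy implements
    // the method with whatever call its Hadoop release provides; this
    // placeholder body simply reads the standard job property.
    public static String getTaskId(JobConf conf) {
        return conf.get("mapred.task.id");
    }
}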

This way, Pig users can compile against either Hadoop 18 or Hadoop 20 simply by 
changing an ant property, either via the -D flag or in build.properties, 
instead of having to go through the process of patching.
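For example (the property name below is illustrative; use whatever name the 
patch's build.xml actually defines):

    ant -Dhadoop.version=20 jar

or, equivalently, add a line such as hadoop.version=20 to build.properties and 
run ant jar as usual.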

There has been discussion of officially moving Pig to 0.20; this way, we 
sidestep the whole question, and only need to worry about version compatibility 
when using specific Hadoop APIs.

I propose that we use this mechanism until Pig is moved to use the new, 
future-proofed API.  

Pig compiled against Hadoop 18 won't be able to use some of the newest 
features, such as Zebra storage. Ant can be configured not to build Zebra if 
the Hadoop version is < 20.


> Integration with Hadoop 0.20
> ----------------------------
>
>                 Key: PIG-660
>                 URL: https://issues.apache.org/jira/browse/PIG-660
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.2.0
>         Environment: Hadoop 0.20
>            Reporter: Santhosh Srinivasan
>            Assignee: Santhosh Srinivasan
>             Fix For: 0.4.0
>
>         Attachments: PIG-660-for-branch-0.3.patch, PIG-660.patch, 
> PIG-660_1.patch, PIG-660_2.patch, PIG-660_3.patch, PIG-660_4.patch, 
> PIG-660_5.patch, pig_660_shims.patch
>
>
> With Hadoop 0.20, it will be possible to query the status of each map and 
> reduce task in a MapReduce job. This will allow better error reporting. Some 
> of the other items that could be filed as Hadoop feature requests/bugs are 
> documented here for tracking.
> 1. Hadoop should return objects instead of strings when exceptions are thrown.
> 2. JobControl should handle all exceptions and report them appropriately. For 
> example, when JobControl fails to launch jobs, it should handle the exceptions 
> and provide APIs to query that state, i.e., failure to launch jobs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
