[jira] Commented: (PIG-1140) [zebra] Use of Hadoop 2.0 APIs

Gaurav Jain (JIRA) Wed, 10 Feb 2010 15:28:55 -0800

    [ 
https://issues.apache.org/jira/browse/PIG-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832287#action_12832287
 ]


Gaurav Jain commented on PIG-1140:
----------------------------------


A few suggestions on the implementation:


TableLoader: 
 -- In the initialize() method, we should use
      
   Configuration conf = new Configuration(false), which creates an empty object. 
 
   Configuration conf = new Configuration() populates the object from 
default-*xml, which may contain conflicting properties. 
 
    ( Good to have ) 
 
 -- In the seekNear() method, we might want to check whether 
tableRecordReader is null. ( Good to have ) 
 
 -- In createIndexReader(), since we set the projection, we should not pass a 
null projection as in 
     createTableRecordReader(job, null). 
     It should be createTableRecordReader(job, 
TableInputFormat.getProjection(job)) (need to have) 
 
 -- In setLocation() and getSchema(), if we handle the paths == null case, we 
might want to check paths.isEmpty() as well. (good to have) 
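
A minimal sketch of the two "good to have" guards above, in plain Java. The 
class name LoaderGuards, the Reader interface, and the List<String> type for 
paths are assumptions made for illustration; they are not the actual Zebra 
types, and the real methods may need to throw instead of returning false. 

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LoaderGuards {
    /** Hypothetical stand-in for the record reader held by the loader. */
    interface Reader {
        boolean seekNear(String key);
    }

    /** True only when paths is non-null and holds at least one entry. */
    static boolean hasPaths(List<String> paths) {
        return paths != null && !paths.isEmpty();
    }

    /** Delegates to the reader, but reports "not found" if it was never created. */
    static boolean safeSeekNear(Reader reader, String key) {
        if (reader == null) {
            return false;
        }
        return reader.seekNear(key);
    }

    public static void main(String[] args) {
        System.out.println(hasPaths(null));
        System.out.println(hasPaths(new ArrayList<String>()));
        System.out.println(hasPaths(Arrays.asList("/data/t1")));
        System.out.println(safeSeekNear(null, "k"));
    }
}
```

Checking null before isEmpty() keeps the second call from throwing when the 
paths were never set. 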
 
 
 
 
 TableStorer: 
 
 -- Instead of implementing new classes (TableOutputFormat and 
TableOutputCommitter), we should use BasicTableOutputFormat and 
BasicTableOutputFormat.TableOutputCommitter in the zebra mapreduce package 
( must have ) 
 
    (There would be a separate jira/patch to do the same) 
 
 -- Code from storeSchema should go into 
TableOutputFormat.TableOutputCommitter.cleanupJob(). 
 
 -- Does Pig call OutputCommitter.abortJob() for failed jobs? 
 


> [zebra] Use of Hadoop 2.0 APIs  
> --------------------------------
>
>                 Key: PIG-1140
>                 URL: https://issues.apache.org/jira/browse/PIG-1140
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.6.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.7.0
>
>         Attachments: zebra.0209
>
>
> Currently, Zebra is still using already deprecated Hadoop 1.8 APIs. Need to 
> upgrade to its 2.0 APIs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
