[ 
https://issues.apache.org/jira/browse/PIG-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12869809#action_12869809
 ] 

Richard Ding commented on PIG-1333:
-----------------------------------


I propose that Pig add a new class, PigRunner, whose run method returns a 
PigStats object:

{code}
package org.apache.pig;

public abstract class PigRunner {
    public static PigStats run(String[] args) {...}
}
{code}

The PigStats class will include the following methods:

{code}
boolean isSuccessful() 

int getReturnCode() // a list of return codes will be defined in PigRunner

String getErrorMessage()

int getErrorCode() // PigException's error code

int getNumberJobs() // number of MR jobs for this invocation

JobPlan getJobPlan() // DAG of MR jobs (a.k.a. an OperatorPlan)

List<String> getOutputLocations() 

long getNumberRecords(String location) // number of records in the given output location

long getNumberBytes(String location)  // number of bytes in the given location

... ... // a few more
{code}
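
To illustrate how a caller such as a workflow engine might consume these methods, here is a minimal sketch. It does not use any real Pig class: the nested Stats interface is a stand-in that mirrors only the method names listed above, and PigStatsSketch, summarize, fixed, and all sample values (paths, counts, codes) are hypothetical names and data invented for this example.

```java
import java.util.Arrays;
import java.util.List;

public class PigStatsSketch {

    /** Stand-in for the proposed PigStats; mirrors only methods named in the proposal. */
    public interface Stats {
        boolean isSuccessful();
        int getReturnCode();
        String getErrorMessage();
        int getErrorCode();
        List<String> getOutputLocations();
        long getNumberRecords(String location);
        long getNumberBytes(String location);
    }

    /** Hypothetical helper: builds the one-line report a caller might log. */
    public static String summarize(Stats stats) {
        if (!stats.isSuccessful()) {
            return "FAILED rc=" + stats.getReturnCode()
                    + " error=" + stats.getErrorCode()
                    + " msg=" + stats.getErrorMessage();
        }
        StringBuilder sb = new StringBuilder("OK");
        for (String loc : stats.getOutputLocations()) {
            sb.append(' ').append(loc)
              .append(" records=").append(stats.getNumberRecords(loc))
              .append(" bytes=").append(stats.getNumberBytes(loc));
        }
        return sb.toString();
    }

    /** Canned Stats with made-up values, used only to exercise summarize(). */
    public static Stats fixed(final boolean ok) {
        return new Stats() {
            public boolean isSuccessful() { return ok; }
            public int getReturnCode() { return ok ? 0 : 1; }          // made-up codes
            public String getErrorMessage() { return ok ? null : "boom"; }
            public int getErrorCode() { return ok ? 0 : 42; }          // not a real Pig error code
            public List<String> getOutputLocations() { return Arrays.asList("/out/a"); }
            public long getNumberRecords(String location) { return 10L; }
            public long getNumberBytes(String location) { return 200L; }
        };
    }

    public static void main(String[] args) {
        System.out.println(summarize(fixed(true)));
        System.out.println(summarize(fixed(false)));
    }
}
```

With something like this, a caller would branch on isSuccessful() and read typed values, instead of parsing command-line output.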

A job in the JobPlan will include these methods:

{code}

String getAlias() // the alias associated with this job

String getFeature()  // the Pig feature associated with this job

int getNumberMaps() 

int getNumberReduces() 

... ... // a few more methods on job statistics retrieved from Hadoop

{code}
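
As a rough sketch of how the per-job methods could be used, the example below aggregates task counts over a set of jobs. It makes several assumptions: the Job interface is a stand-in that mirrors only the four method names above, the jobs are passed as a plain list rather than the proposed JobPlan DAG, and JobStatsSketch, totalTasks, describe, job, and the feature strings are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class JobStatsSketch {

    /** Stand-in for a job in the proposed JobPlan; mirrors only the listed methods. */
    public interface Job {
        String getAlias();
        String getFeature();
        int getNumberMaps();
        int getNumberReduces();
    }

    /** Hypothetical helper: total map and reduce tasks across all jobs. */
    public static int totalTasks(List<Job> jobs) {
        int total = 0;
        for (Job j : jobs) {
            total += j.getNumberMaps() + j.getNumberReduces();
        }
        return total;
    }

    /** Hypothetical helper: one-line description of a single job. */
    public static String describe(Job j) {
        return j.getAlias() + " [" + j.getFeature() + "] maps=" + j.getNumberMaps()
                + " reduces=" + j.getNumberReduces();
    }

    /** Canned Job with caller-supplied values, used only for illustration. */
    public static Job job(final String alias, final String feature,
                          final int maps, final int reduces) {
        return new Job() {
            public String getAlias() { return alias; }
            public String getFeature() { return feature; }
            public int getNumberMaps() { return maps; }
            public int getNumberReduces() { return reduces; }
        };
    }

    public static void main(String[] args) {
        // Feature names here are invented placeholders, not Pig's actual values.
        List<Job> jobs = Arrays.asList(job("A", "GROUP_BY", 4, 2), job("B", "ORDER_BY", 3, 1));
        for (Job j : jobs) {
            System.out.println(describe(j));
        }
        System.out.println("total tasks: " + totalTasks(jobs));
    }
}
```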

> API interface to Pig
> --------------------
>
>                 Key: PIG-1333
>                 URL: https://issues.apache.org/jira/browse/PIG-1333
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Olga Natkovich
>            Assignee: Richard Ding
>             Fix For: 0.8.0
>
>
> It would be nice to make Pig friendlier to applications, such as workflow 
> systems, that execute Pig scripts on a user's behalf.
> Currently, they have to use the Pig command line to execute the code; 
> however, this limits the kind of output that can be delivered. 
> For instance, it is hard to produce error information that is easy to use 
> programmatically, or to collect statistics.
> The proposal is to create a class that mimics the behavior of Main but 
> gives users a status object back. The main code of Pig would look 
> something like:
> public static void main(String args[])
> {
>     PigStatus ps = PigMain.exec(args);
>     System.exit(ps.rc);
> }
> We need to define the following:
> - Content of PigStatus. It should at least include
>    * return code
>    * error string
>    * exception 
>    * statistics
> - A way to propagate the status class through pig code

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
