[ 
https://issues.apache.org/jira/browse/PIG-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498184#comment-13498184
 ] 

Billie Rinaldi commented on PIG-3048:
-------------------------------------

I'm thinking of a workflow as a particular DAG.  Each run of a workflow has a 
globally unique ID, and it has a name that distinguishes that DAG from other 
DAGs.  It sounds like the logical plan signature would be more appropriate for 
the workflow name, assuming we want to group together runs of the same DAG with 
different inputs/outputs/arguments.

What is included in the full DAG in addition to the adjacency list?
                
> Add mapreduce workflow information to job configuration
> -------------------------------------------------------
>
>                 Key: PIG-3048
>                 URL: https://issues.apache.org/jira/browse/PIG-3048
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Billie Rinaldi
>         Attachments: PIG-3048.patch
>
>
> Adding workflow properties to the job configuration would enable logging and 
> analysis of workflows in addition to individual MapReduce jobs.  Suggested 
> properties include a workflow ID, workflow name, adjacency list connecting 
> nodes in the workflow, and the name of the current node in the workflow.
>
> mapreduce.workflow.id - a unique ID for the workflow, ideally prepended with 
> the application name
>     e.g. pig_<pigScriptId>
>
> mapreduce.workflow.name - a name for the workflow, to distinguish this 
> workflow from other workflows and to group different runs of the same workflow
>     e.g. pig command line
>
> mapreduce.workflow.adjacency - an adjacency list for the workflow graph, 
> encoded as mapreduce.workflow.adjacency.<source node> = <comma-separated list 
> of target nodes>
>
> mapreduce.workflow.node.name - the name of the node corresponding to this 
> MapReduce job in the workflow adjacency list
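
For illustration only, the encoding described above could be sketched roughly as follows. This is not taken from PIG-3048.patch: a plain Python dict stands in for the job configuration, the helper names (workflow_properties, read_adjacency), the job names, the workflow ID and the script name are all hypothetical, and writing a node with no targets as an empty value is an assumption the description does not settle.

```python
ADJACENCY_PREFIX = "mapreduce.workflow.adjacency."


def workflow_properties(workflow_id, workflow_name, adjacency, node_name):
    """Build the workflow properties for one MapReduce job.

    `adjacency` maps each source node to a list of its target nodes.
    A dict stands in for the real job configuration (assumption).
    """
    props = {
        "mapreduce.workflow.id": workflow_id,
        "mapreduce.workflow.name": workflow_name,
        "mapreduce.workflow.node.name": node_name,
    }
    # One property per source node, value = comma-separated target nodes.
    for source, targets in adjacency.items():
        props[ADJACENCY_PREFIX + source] = ",".join(targets)
    return props


def read_adjacency(props):
    """Recover the adjacency list from the flat properties."""
    return {
        key[len(ADJACENCY_PREFIX):]: value.split(",") if value else []
        for key, value in props.items()
        if key.startswith(ADJACENCY_PREFIX)
    }


# Hypothetical three-job workflow: job1 feeds job2 and job3, job2 feeds job3.
dag = {"job1": ["job2", "job3"], "job2": ["job3"], "job3": []}
props = workflow_properties("pig_1234", "wordcount.pig", dag, "job2")

for key in sorted(props):
    print(key, "=", props[key])
```

Every job in the same run would carry the same ID, name and adjacency entries; only mapreduce.workflow.node.name differs from job to job, which is what lets a log analyzer place each job in the DAG.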

