Hi,
From 3.2 onwards, counters and Hadoop job IDs for Pig and MapReduce
actions can be accessed through the API or an EL function.
First, the following property should be set in the workflow
configuration. It makes Oozie store the Pig/MR statistics in its database.
<property>
    <name>oozie.action.external.stats.write</name>
    <value>true</value>
</property>
Then the stats and job IDs can be accessed using the verbose form of the
job info command:
oozie job -info <jobId> -verbose
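To pick the Hadoop job IDs out of that output programmatically, a small sketch like the one below could work. The parsing function and the sample line are assumptions about the action-data format, not taken from the Oozie docs; in a real script you would pipe `oozie job -info <jobId> -verbose` into the function instead of the sample string.

```shell
# Hypothetical helper: extract the value of the "hadoopJobs" field from
# a line of Oozie verbose/action-data output. Field layout is assumed.
parse_hadoop_jobs() {
  # Match "hadoopJobs=<ids>" up to the next space, then keep the value
  grep -o 'hadoopJobs=[^ ]*' | cut -d= -f2
}

# Sample line standing in for real `oozie job -info <jobId> -verbose` output
sample='pig-node hadoopJobs=job_201209010000_0042,job_201209010000_0043 OK'
echo "$sample" | parse_hadoop_jobs
```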
Also, the Hadoop job IDs for a Pig action can be retrieved through the
EL function:
wf:actionData(<pig-action-name>)["hadoopJobs"]
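As one possible use of that EL function, the IDs could be forwarded to a later action in the same workflow. The fragment below is a sketch, not from the docs: the action names ("my-pig-node", "report") and the recipient address are made up, and it assumes the email action schema uri:oozie:email-action:0.1 is available.

```xml
<!-- Hypothetical follow-up action: "my-pig-node" is an assumed name for
     the preceding Pig action -->
<action name="report">
    <email xmlns="uri:oozie:email-action:0.1">
        <to>admin@example.com</to>
        <subject>Hadoop jobs launched by my-pig-node</subject>
        <body>Launched: ${wf:actionData('my-pig-node')['hadoopJobs']}</body>
    </email>
    <ok to="end"/>
    <error to="fail"/>
</action>
```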
Detailed docs at
http://incubator.apache.org/oozie/docs/3.2.0-incubating/docs/WorkflowFunctionalSpec.html
Look under "4.2.5 Hadoop EL Functions".
Thanks,
Virag
On 8/30/12 10:31 AM, "Eduardo Afonso Ferreira" <[email protected]> wrote:
>Hi there,
>
>I have a Pig script that runs periodically, scheduled by an Oozie
>coordinator with a set frequency.
>I wanted to capture the Pig script output because I need to look at some
>information on the results to keep track of several things.
>I know I can look at the output by doing a whole bunch of clicks starting
>at the oozie web console as follows:
>
>- Open oozie web console (ex.: http://localhost:11000/oozie/)
>- Find and click the specific job under "Workflow Jobs"
>- Select (click) the pig action in the window that pops up
>- Click the magnifying glass icon on the "Console URL" field
>- Click the Map of the launcher job
>- Click the task ID
>- Click All under "Task Logs"
>
>My question is: how can I know the exact name and location of that log
>file in HDFS, so I can programmatically retrieve the file from HDFS and
>parse it for what I need?
>
>Is this something I can determine ahead of time, like pass a
>parameter/argument to the action/pig so that it will store the log where
>I want with the file name I want?
>
>Thanks in advance for your help.
>Eduardo.