Hi Eduardo,

The log file to which the Oozie Pig action's output is written is a local file in the current working directory of the map task, not a file on HDFS. Passing a custom path with "pig -logfile <HDFS_PATH>" is also not allowed. Can you try passing an extra argument at the end of your Pig action's argument list as a redirection to a file of your choice?
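In workflow.xml terms, that extra argument might look like the sketch below. This is untested (as noted), the action name and file names are placeholders, and because Oozie passes arguments to Pig directly rather than through a shell, the redirection may end up treated as a literal argument unless a shell wrapper is involved:

```xml
<!-- Hypothetical sketch only: "my-pig-action", "myscript.pig" and
     "myfile.txt" are placeholders, not names from an actual workflow. -->
<action name="my-pig-action">
  <pig>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <script>myscript.pig</script>
    <param>xyz=1000</param>
    <!-- Extra trailing argument intended as an output redirection; untested. -->
    <argument>&amp;>myfile.txt</argument>
  </pig>
  <ok to="end"/>
  <error to="fail"/>
</action>
```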
E.g. trying to recreate something like the following; note the redirection to 'myfile.txt' at the end:

pig -param $xyz=1000 myscript.pig &>myfile.txt

I haven't tried this myself, but it would be helpful to find out whether it works. Otherwise, Virag's suggestion below can help you access specific stats-related information.

Regards,
--
Mona Chitnis

On 8/30/12 11:59 AM, "Virag Kothari" <[email protected]> wrote:

>Hi,
>
>From 3.2 onwards, counters and Hadoop job IDs for Pig and Map-Reduce
>actions can be accessed through the API or an EL function.
>
>First, the following should be set in the workflow configuration. This
>will store the Pig/MR-related statistics in the DB:
>
><property>
>  <name>oozie.action.external.stats.write</name>
>  <value>true</value>
></property>
>
>Then, the stats and job IDs can be accessed using the verbose API:
>
>oozie job -info <jobId> -verbose
>
>The Hadoop job IDs can also be retrieved for a Pig action through the
>EL function:
>
>wf:actionData(<pig-action-name>)["hadoopJobs"]
>
>Detailed docs at
>http://incubator.apache.org/oozie/docs/3.2.0-incubating/docs/WorkflowFunctionalSpec.html.
>Look under "4.2.5 Hadoop EL Functions".
>
>Thanks,
>Virag
>
>On 8/30/12 10:31 AM, "Eduardo Afonso Ferreira" <[email protected]> wrote:
>
>>Hi there,
>>
>>I have a Pig script that runs periodically via an Oozie coordinator
>>with a set frequency.
>>I want to capture the Pig script's output because I need to look at
>>some information in the results to keep track of several things.
>>I know I can look at the output through a whole series of clicks
>>starting at the Oozie web console, as follows:
>>
>>- Open the Oozie web console (e.g. http://localhost:11000/oozie/)
>>- Find and click the specific job under "Workflow Jobs"
>>- Click the Pig action in the window that pops up
>>- Click the magnifying-glass icon in the "Console URL" field
>>- Click the Map of the launcher job
>>- Click the task ID
>>- Click "All" under "Task Logs"
>>
>>My question is: how can I find the exact name and location of that log
>>file in HDFS so I can programmatically retrieve it, parse it, and look
>>for what I need?
>>
>>Is this something I can determine ahead of time, e.g. by passing a
>>parameter/argument to the action/Pig script so that it stores the log
>>where I want, with the file name I want?
>>
>>Thanks in advance for your help.
>>Eduardo.
>
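For the EL-function route Virag describes, the value returned for wf:actionData(<pig-action-name>)["hadoopJobs"] can be captured (for example, passed into a downstream action's configuration) and split into individual job IDs. A minimal sketch, assuming the value is a comma-separated string of Hadoop job IDs; the sample IDs below are made up for illustration:

```python
def parse_hadoop_job_ids(hadoop_jobs_value):
    """Split a comma-separated "hadoopJobs" value into a list of job ids."""
    return [job_id.strip()
            for job_id in hadoop_jobs_value.split(",")
            if job_id.strip()]

# Made-up sample value for illustration.
sample = "job_201208300001_0001, job_201208300001_0002"
print(parse_hadoop_job_ids(sample))
# → ['job_201208300001_0001', 'job_201208300001_0002']
```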
