Hey,
I'm still interested in learning whether a pre-packaged 3.2 version is available that I can install, but I made some progress by adding another jar to my app, json-simple-1.1.1.jar, which solved the NoClassDefFoundError I was seeing.

Now I see the stats field in the Oozie database (WF_ACTIONS.stats) filled with a JSON representation of the PigStats I'm interested in. But I still can't see it when I run -info with -verbose. Am I missing something?

Thanks.
Eduardo.

________________________________
From: Eduardo Afonso Ferreira <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Thursday, September 6, 2012 12:14 PM
Subject: Re: Capturing Pig action output

Hey, Virag,

I built and installed Oozie 3.2 from http://incubator.apache.org/oozie/Downloads.html. I set the property oozie.action.external.stats.write to true in my WF and deployed/submitted/etc. But I still don't see PigStats when I do the -info request (example below), and I see exceptions related to org.json.simple.JSONObject (NoClassDefFoundError). Maybe a build problem.

What would be the best way of getting version 3.2 up and running? Is there an already-built package out there that we could download and install? I mean, without needing to build/package it and resolve all sorts of dependencies.

eferreira@eferreira-tbs-desktop:~/projects/aspen-core/oozie/apps$ oozie job -oozie http://localhost:11000/oozie -info 0000197-120905170442968-oozie-oozi-W -verbose
Job ID : 0000197-120905170442968-oozie-oozi-W
------------------------------------------------------------------------------------------------------------------------------------
Workflow Name : video_play_counts-wf
App Path      : hdfs://aspendevhdp1.cnn.vgtf.net:54310/user/eferreira/oozie/apps/video_play_counts
Status        : RUNNING
Run           : 0
User          : eferreira
Group         : -
Created       : 2012-09-06 14:53
Started       : 2012-09-06 14:53
Last Modified : 2012-09-06 14:53
Ended         : -
CoordAction ID: 0000196-120905170442968-oozie-oozi-C@1

Actions
------------------------------------------------------------------------------------------------------------------------------------
ID    Console URL    Error Code    Error Message    External ID    External Status    Name    Retries    Tracker URI    Type    Started    Status    Ended
------------------------------------------------------------------------------------------------------------------------------------
0000197-120905170442968-oozie-oozi-W@pig-node    http://aspendevhdp1.cnn.vgtf.net:50030/jobdetails.jsp?jobid=job_201208071502_69799    -    -    job_201208071502_69799    RUNNING    pig-node    0    aspendevhdp1.cnn.vgtf.net:54311    pig    2012-09-06 14:53    RUNNING    -
------------------------------------------------------------------------------------------------------------------------------------

________________________________
From: Virag Kothari <[email protected]>
To: "[email protected]" <[email protected]>; Eduardo Afonso Ferreira <[email protected]>
Sent: Thursday, August 30, 2012 2:59 PM
Subject: Re: Capturing Pig action output

Hi,

From 3.2 onwards, counters and hadoop job ids for Pig and Map-reduce can be accessed through the API or EL function.

First, the following should be set in wf configuration. This will store the Pig/MR related statistics in the DB.

<property>
    <name>oozie.action.external.stats.write</name>
    <value>true</value>
</property>

Then, the stats and job IDs can be accessed using the verbose API:

oozie job -info <jobId> -verbose

Also, the Hadoop job IDs can be retrieved for a Pig action through the EL function:

wf:actionData(<pig-action-name>)["hadoopJobs"]

Detailed docs are at http://incubator.apache.org/oozie/docs/3.2.0-incubating/docs/WorkflowFunctionalSpec.html. Look under "4.2.5 Hadoop EL Functions".

Thanks,
Virag

On 8/30/12 10:31 AM, "Eduardo Afonso Ferreira" <[email protected]> wrote:

>Hi there,
>
>I have a Pig script that is run periodically by Oozie via a coordinator
>with a set frequency.
>I want to capture the Pig script output because I need to look at some
>information in the results to keep track of several things.
>I know I can look at the output by doing a whole bunch of clicks starting
>at the Oozie web console, as follows:
>
>- Open the Oozie web console (ex.: http://localhost:11000/oozie/)
>- Find and click the specific job under "Workflow Jobs"
>- Select (click) the pig action in the window that pops up
>- Click the magnifying glass icon on the "Console URL" field
>- Click the Map of the launcher job
>- Click the task ID
>- Click All under "Task Logs"
>
>My question is: how can I know the exact name and location of that log
>file in HDFS, so I can programmatically retrieve the file from HDFS,
>parse it, and look for what I need?
>
>Is this something I can determine ahead of time, like passing a
>parameter/argument to the action/pig so that it will store the log where
>I want, with the file name I want?
>
>Thanks in advance for your help.
>Eduardo.
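
A note on the still-open question at the top of this thread (stats present in WF_ACTIONS.stats but not shown by -info -verbose): one way to narrow it down might be to ask the Oozie server for the job info over its web services API and look at what each action carries. The following is only a minimal Python sketch under assumptions not confirmed anywhere in this thread: that the job info is served at <oozie-url>/v1/job/<job-id>?show=info, and that each entry in its "actions" list exposes the stats as a JSON string under a "stats" key. Both should be checked against the Web Services API docs of the installed version. The function names are made up for illustration.

```python
import json
from urllib.request import urlopen


def fetch_job_info(oozie_url, job_id):
    """Fetch workflow job info as parsed JSON.

    ASSUMPTION: the endpoint path and the show=info parameter are not
    taken from this thread; verify them for your Oozie version.
    """
    with urlopen(f"{oozie_url}/v1/job/{job_id}?show=info") as resp:
        return json.load(resp)


def action_stats(job_info):
    """Return {action name: parsed stats, or None if the action has none}.

    ASSUMPTION: each action is a dict with a "name" key and an optional
    "stats" key holding a JSON-encoded string.
    """
    result = {}
    for action in job_info.get("actions", []):
        raw = action.get("stats")
        # Stats may be absent or empty, e.g. while the action is still running.
        result[action.get("name")] = json.loads(raw) if raw else None
    return result
```

For example, action_stats(fetch_job_info("http://localhost:11000/oozie", "0000197-120905170442968-oozie-oozi-W")) would return a dict keyed by action name, with None for any action that carries no stats in the response.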
