Also, note that -info -verbose should be run against the pig action and not the workflow job.

On 9/6/12 10:38 AM, "Virag Kothari" <[email protected]> wrote:
>Hi,
>
>I forgot to mention that you need to install the Oozie share library on HDFS.
>The json-simple.jar is bundled with that package.
>Look at 'Oozie-sharelib installation' under
>http://incubator.apache.org/oozie/docs/3.2.0-incubating/docs/DG_QuickStart.html
>
>You should be able to view the stats and external IDs with -info -verbose.
>Please make sure that you are using the 3.2.0 client. The command to check the
>client version is 'oozie version'.
>
>Thanks,
>Virag
>
>On 9/6/12 10:27 AM, "Eduardo Afonso Ferreira" <[email protected]> wrote:
>
>>Hey,
>>
>>I'm still interested in learning whether a pre-packaged 3.2 version is
>>available out there that I can install, but I was able to move a little
>>further by adding another jar to my app, i.e. json-simple-1.1.1.jar, which
>>solved the NoClassDefFoundError I experienced.
>>
>>Now I see the stats field in the Oozie database (WF_ACTIONS.stats) filled
>>with the PigStats JSON that I'm interested in. But I still can't see it
>>when I run -info with -verbose. Am I missing something?
>>
>>Thanks.
>>Eduardo.
>>
>>________________________________
>>From: Eduardo Afonso Ferreira <[email protected]>
>>To: "[email protected]" <[email protected]>
>>Sent: Thursday, September 6, 2012 12:14 PM
>>Subject: Re: Capturing Pig action output
>>
>>Hey, Virag,
>>
>>I built and installed Oozie 3.2 from
>>http://incubator.apache.org/oozie/Downloads.html.
>>I set the property oozie.action.external.stats.write to true in my workflow
>>and deployed/submitted/etc.
>>But I still don't see PigStats when I do the -info request (example below),
>>and I see exceptions related to org.json.simple.JSONObject
>>(NoClassDefFoundError). Maybe a build problem.
>>
>>What would be the best way of getting version 3.2 up and running? Is there
>>any package out there already built that we could download and install? I
>>mean, without needing to build/package it and resolve all sorts of
>>dependencies.
>>
>>eferreira@eferreira-tbs-desktop:~/projects/aspen-core/oozie/apps$ oozie job -oozie http://localhost:11000/oozie -info 0000197-120905170442968-oozie-oozi-W -verbose
>>Job ID : 0000197-120905170442968-oozie-oozi-W
>>------------------------------------------------------------------------------------------------------------------------------------
>>Workflow Name : video_play_counts-wf
>>App Path      : hdfs://aspendevhdp1.cnn.vgtf.net:54310/user/eferreira/oozie/apps/video_play_counts
>>Status        : RUNNING
>>Run           : 0
>>User          : eferreira
>>Group         : -
>>Created       : 2012-09-06 14:53
>>Started       : 2012-09-06 14:53
>>Last Modified : 2012-09-06 14:53
>>Ended         : -
>>CoordAction ID: 0000196-120905170442968-oozie-oozi-C@1
>>
>>Actions
>>------------------------------------------------------------------------------------------------------------------------------------
>>ID    Console URL    Error Code    Error Message    External ID    External Status    Name    Retries    Tracker URI    Type    Started    Status    Ended
>>------------------------------------------------------------------------------------------------------------------------------------
>>0000197-120905170442968-oozie-oozi-W@pig-node    http://aspendevhdp1.cnn.vgtf.net:50030/jobdetails.jsp?jobid=job_201208071502_69799    -    -    job_201208071502_69799    RUNNING    pig-node    0    aspendevhdp1.cnn.vgtf.net:54311    pig    2012-09-06 14:53    RUNNING    -
>>------------------------------------------------------------------------------------------------------------------------------------
>>
>>________________________________
>>From: Virag Kothari <[email protected]>
>>To: "[email protected]" <[email protected]>; Eduardo Afonso Ferreira <[email protected]>
>>Sent: Thursday, August 30, 2012 2:59 PM
>>Subject: Re: Capturing Pig action output
>>
>>Hi,
>>
>>From 3.2 onwards, counters and Hadoop job IDs for Pig and Map-Reduce actions
>>can be accessed through the API or an EL function.
>>
>>First, the following should be set in the workflow configuration. This will
>>store the Pig/MR related statistics in the DB:
>>
>><property>
>>    <name>oozie.action.external.stats.write</name>
>>    <value>true</value>
>></property>
>>
>>Then, the stats and job IDs can be accessed using the verbose API:
>>
>>oozie job -info <jobId> -verbose
>>
>>Also, the Hadoop job IDs can be retrieved for a Pig action through the EL
>>function:
>>
>>wf:actionData(<pig-action-name>)["hadoopJobs"]
>>
>>Detailed docs at
>>http://incubator.apache.org/oozie/docs/3.2.0-incubating/docs/WorkflowFunctionalSpec.html.
>>Look under "4.2.5 Hadoop EL Functions".
>>
>>Thanks,
>>Virag
>>
>>On 8/30/12 10:31 AM, "Eduardo Afonso Ferreira" <[email protected]> wrote:
>>
>>>Hi there,
>>>
>>>I have a Pig script that runs periodically under Oozie via a coordinator
>>>with a set frequency.
>>>I want to capture the Pig script output because I need to look at some
>>>information in the results to keep track of several things.
>>>I know I can look at the output by doing a whole bunch of clicks starting
>>>at the Oozie web console, as follows:
>>>
>>>- Open the Oozie web console (ex.: http://localhost:11000/oozie/)
>>>- Find and click the specific job under "Workflow Jobs"
>>>- Select (click) the pig action in the window that pops up
>>>- Click the magnifying glass icon on the "Console URL" field
>>>- Click the Map of the launcher job
>>>- Click the task ID
>>>- Click "All" under "Task Logs"
>>>
>>>My question is: how can I know the exact name and location of that log
>>>file in HDFS, so I can programmatically retrieve the file from HDFS and
>>>parse it to look for what I need?
>>>
>>>Is this something I can determine ahead of time, e.g. pass a
>>>parameter/argument to the action/pig so that it stores the log where I
>>>want, with the file name I want?
>>>
>>>Thanks in advance for your help.
>>>Eduardo.
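The WF_ACTIONS.stats column discussed in this thread holds the external statistics as a JSON blob. A minimal sketch of reading a counter back out of it, assuming a PigStats-like layout of per-job counter maps (the keys and counter names below are illustrative assumptions, not the exact schema):

```python
import json

# Illustrative stand-in for the JSON stored in WF_ACTIONS.stats when
# oozie.action.external.stats.write is true; the exact PigStats layout
# may differ, so treat the keys below as assumptions.
stats_json = """
{
  "ACTION_TYPE": "pig",
  "job_201208071502_69799": {
    "RECORD_WRITTEN": "12345",
    "HDFS_BYTES_WRITTEN": "987654"
  }
}
"""

stats = json.loads(stats_json)

# Per-job counter groups are the nested objects; scalar fields such as
# ACTION_TYPE sit alongside them at the top level.
job_counters = {k: v for k, v in stats.items() if isinstance(v, dict)}
for job_id, counters in job_counters.items():
    print(job_id, "wrote", counters["HDFS_BYTES_WRITTEN"], "bytes")
```

Once the blob is parsed, any of the reported counters can be tracked over coordinator runs without clicking through the web console.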

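Downstream of the EL function discussed in the thread, wf:actionData(<pig-action-name>)["hadoopJobs"] resolves to a comma-separated string of Hadoop job IDs. A small sketch of splitting that string and rebuilding console URLs in the shape shown in the -info output above (the helper names and the second job ID are our own assumptions for illustration):

```python
def parse_hadoop_jobs(action_data):
    """Split the 'hadoopJobs' entry of an action-data map into job IDs."""
    raw = action_data.get("hadoopJobs", "")
    return [job_id.strip() for job_id in raw.split(",") if job_id.strip()]

def console_url(tracker_host, job_id):
    """Build a JobTracker console URL in the shape seen in the -info output."""
    return f"http://{tracker_host}:50030/jobdetails.jsp?jobid={job_id}"

# Sample action data; the first job ID is taken from the thread, the
# second is a made-up sibling to show the comma-separated case.
jobs = parse_hadoop_jobs(
    {"hadoopJobs": "job_201208071502_69799, job_201208071502_69800"}
)
urls = [console_url("aspendevhdp1.cnn.vgtf.net", j) for j in jobs]
```

A downstream action could consume these IDs, for example, to poll job status or locate per-job logs programmatically instead of clicking through the console.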