Also, -info -verbose should be run on the pig action, not on the workflow job.
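For example, a sketch using the ids from the -info output quoted below (substitute your own): the action id is the workflow id plus '@' and the action's name.

```shell
# Sketch, using the example ids from the thread; substitute your own.
# The action id is the workflow id plus '@' and the pig action's name.
WF_ID="0000197-120905170442968-oozie-oozi-W"
ACTION_ID="${WF_ID}@pig-node"
echo "$ACTION_ID"

# The verbose info request then targets the action, not the workflow job:
# oozie job -oozie http://localhost:11000/oozie -info "$ACTION_ID" -verbose
```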

On 9/6/12 10:38 AM, "Virag Kothari" <[email protected]> wrote:

>Hi,
>
>I forgot to mention that you need to install oozie share library on hdfs.
>The json-simple.jar is bundled with that package.
>Look at 'Oozie-sharelib installation' under
>http://incubator.apache.org/oozie/docs/3.2.0-incubating/docs/DG_QuickStart.html
>
>
>You should be able to view the stats and external ids with -info -verbose.
>Please make sure that you are using the 3.2.0 client. The command to check
>the client version is 'oozie version'.
>
>Thanks,
>Virag
>
>
>
>
>On 9/6/12 10:27 AM, "Eduardo Afonso Ferreira" <[email protected]> wrote:
>
>>Hey,
>>
>>
>>I'm still interested in learning whether a pre-packaged 3.2 version is
>>available out there that I can install, but I was able to move a little
>>further by adding another jar to my app, json-simple-1.1.1.jar, which
>>solved the NoClassDefFoundError I was seeing.
>>
>>Now I see the stats field on the oozie database (WF_ACTIONS.stats) filled
>>with a JSON of the PigStats that I'm interested in. But I still can't see
>>it when I run -info with -verbose. Am I missing something?
>>
>>Thanks.
>>Eduardo.
>>
>>
>>
>>________________________________
>> From: Eduardo Afonso Ferreira <[email protected]>
>>To: "[email protected]" <[email protected]>
>>Sent: Thursday, September 6, 2012 12:14 PM
>>Subject: Re: Capturing Pig action output
>> 
>>Hey, Virag,
>>
>>I built and installed Oozie 3.2 from
>>http://incubator.apache.org/oozie/Downloads.html.
>>I set the property oozie.action.external.stats.write to true on my WF and
>>deployed/submitted/etc.
>>But I still don't see PigStats when I do the -info request (ex. below)
>>and I see exceptions related to org.json.simple.JSONObject
>>(NoClassDefFoundError). Maybe a build problem.
>>
>>What would be the best way of getting version 3.2 up and running? Any
>>package out there already built that we could download and install? I
>>mean, without need to build/package and look for solving all sorts of
>>dependencies.
>>
>>
>>eferreira@eferreira-tbs-desktop:~/projects/aspen-core/oozie/apps$ oozie job -oozie http://localhost:11000/oozie -info 0000197-120905170442968-oozie-oozi-W -verbose
>>Job ID : 0000197-120905170442968-oozie-oozi-W
>>--------------------------------------------------------------------------------
>>Workflow Name : video_play_counts-wf
>>App Path      : hdfs://aspendevhdp1.cnn.vgtf.net:54310/user/eferreira/oozie/apps/video_play_counts
>>Status        : RUNNING
>>Run           : 0
>>User          : eferreira
>>Group         : -
>>Created       : 2012-09-06 14:53
>>Started       : 2012-09-06 14:53
>>Last Modified : 2012-09-06 14:53
>>Ended         : -
>>CoordAction ID: 0000196-120905170442968-oozie-oozi-C@1
>>
>>Actions
>>--------------------------------------------------------------------------------
>>ID    Console URL    Error Code    Error Message    External ID
>>External Status    Name    Retries    Tracker URI    Type    Started
>>Status    Ended
>>--------------------------------------------------------------------------------
>>0000197-120905170442968-oozie-oozi-W@pig-node
>>http://aspendevhdp1.cnn.vgtf.net:50030/jobdetails.jsp?jobid=job_201208071502_69799    -    -    job_201208071502_69799    RUNNING    pig-node    0    aspendevhdp1.cnn.vgtf.net:54311    pig    2012-09-06 14:53    RUNNING    -
>>--------------------------------------------------------------------------------
>>
>>
>>
>>
>>
>>________________________________
>>From: Virag Kothari <[email protected]>
>>To: "[email protected]"
>><[email protected]>; Eduardo Afonso Ferreira
>><[email protected]>
>>Sent: Thursday, August 30, 2012 2:59 PM
>>Subject: Re: Capturing Pig action output
>>
>>Hi,
>>
>>From 3.2 onwards, counters and hadoop job ids for Pig and Map-reduce can
>>be accessed through the API or EL function.
>>
>>First, the following should be set in the workflow configuration. This will
>>store the Pig/MR-related statistics in the DB.
>><property>
>>    <name>oozie.action.external.stats.write</name>
>>    <value>true</value>
>></property>
>>
>>Then, the stats and jobIds can be accessed using the verbose API
>>oozie job -info <jobId> -verbose
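The same information is also exposed over Oozie's web-services API; a hedged sketch of the v1 URL pattern, using the server address and workflow id that appear elsewhere in this thread as illustrative values:

```shell
# Sketch: build the v1 web-services URL for a job's info; ids illustrative.
OOZIE_URL="http://localhost:11000/oozie"
JOB_ID="0000197-120905170442968-oozie-oozi-W"
INFO_URL="${OOZIE_URL}/v1/job/${JOB_ID}?show=info"
echo "$INFO_URL"
# Fetch with e.g.: curl "$INFO_URL"
```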
>>
>>Also, the hadoop job ids can be retrieved for a Pig action through the
>>EL function
>>
>>wf:actionData(<pig-action-name>)["hadoopJobs"]
>>
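For instance, a later action in the same workflow could capture those ids in a configuration property. A minimal sketch, assuming the pig action is named 'pig-node'; the property name is hypothetical:

```xml
<!-- Minimal sketch; 'pig-node' is the pig action's name and
     'launchedHadoopJobs' is a hypothetical property name. -->
<property>
    <name>launchedHadoopJobs</name>
    <value>${wf:actionData('pig-node')['hadoopJobs']}</value>
</property>
```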
>>
>>Detailed docs at
>>http://incubator.apache.org/oozie/docs/3.2.0-incubating/docs/WorkflowFunctionalSpec.html.
>>Look under "4.2.5 Hadoop EL Functions"
>>
>>Thanks,
>>Virag
>>
>>
>>
>>
>>
>>On 8/30/12 10:31 AM, "Eduardo Afonso Ferreira" <[email protected]>
>>wrote:
>>
>>>Hi there,
>>>
>>>I have a Pig script that oozie runs periodically via a coordinator with a
>>>set frequency.
>>>I wanted to capture the Pig script output because I need to look at some
>>>information on the results to keep track of several things.
>>>I know I can look at the output by doing a whole bunch of clicks
>>>starting
>>>at the oozie web console as follows:
>>>
>>>- Open oozie web console (ex.: http://localhost:11000/oozie/)
>>>- Find and click the specific job under "Workflow Jobs"
>>>- Select (click) the pig action in the window that pops up
>>>- Click the magnifying glass icon on the "Console URL" field
>>>- Click the Map of the launcher job
>>>- Click the task ID
>>>- Click All under "Task Logs"
>>>
>>>My question is: how can I know the exact name and location of that log
>>>file in HDFS, so I can programmatically retrieve the file from HDFS and
>>>parse it for what I need?
>>>
>>>Is this something I can determine ahead of time, like pass a
>>>parameter/argument to the action/pig so that it will store the log where
>>>I want with the file name I want?
>>>
>>>Thanks in advance for your help.
>>>Eduardo.
>
