Hi Prashant,

[Replying to the dev list to get others' take on these...]

Just curious, why do you prefer a List of JobStats over the already existing iterator? I hesitate to add one-liner methods if it's something that can be a one-liner by the caller, unless the use case is very common.

Making getSuccessfulJobs() and getFailedJobs() public seems reasonable to me.

I'm not sure about the rationale behind the differences between registerScript() and store(). store() and registerQuery() are able to manually add to the DAG as statements come in, but registerScript() needs parsing for execution. That's probably why execution is delegated to the GruntParser. The resulting DAG for a single-store script should be the same though. It seems like registerScript() should be able to return a list of ExecJobs.

thanks,
Bill

On Tue, Oct 9, 2012 at 11:22 PM, Prashant Kommireddi <[email protected]> wrote:

> Hi Bill,
>
> I am looking at PigStats and JobGraph, and am thinking of adding some
> functions. Let me know what you think.
>
> *getJobList()* returns a List representation of the iterator.
>
>     public List<JobStats> getJobList() {
>         return IteratorUtils.toList(iterator());
>     }
>
> What do you think about making getSuccessfulJobs() and getFailedJobs()
> public and exposing it to the API? Currently they are package-private?
>
> Had another question, seems like the execution flow for
> PigServer.registerScript/Query is different from PigServer.store(). Was
> there a reason to make these different? The function store() returns an
> ExecJob which is great to get info regarding the runs, but registerScript()
> calls the GruntParser for execution which I think is a different flow?
>
> Thanks,
> Prashant
>
>
> On Thu, Oct 4, 2012 at 6:05 PM, Bill Graham <[email protected]> wrote:
>
>> Makes sense to me. We could return a PigStats object.
>>
>> On Thu, Oct 4, 2012 at 1:49 PM, Prashant Kommireddi <[email protected]> wrote:
>>
>> > Hi All,
>> >
>> > I am looking at PigServer methods for running scripts/queries and it seems
>> > like currently their return type is void which does not tell much about job
>> > completion.
>> >
>> >     public void registerScript(InputStream in, Map<String,String> params,
>> >             List<String> paramsFiles) throws IOException {
>> >         try {
>> >             String substituted = doParamSubstitution(in, params, paramsFiles);
>> >             GruntParser grunt = new GruntParser(new StringReader(substituted));
>> >             grunt.setInteractive(false);
>> >             grunt.setParams(this);
>> >             grunt.parseStopOnError(true);
>> >         } catch (org.apache.pig.tools.pigscript.parser.ParseException e) {
>> >             log.error(e.getLocalizedMessage());
>> >             throw new IOException(e.getCause());
>> >         }
>> >     }
>> >
>> >
>> > We do have a handle on number of jobs succeeded/failed as part of the job
>> > run, so that is something we should add as return type?
>> >
>> > Thanks,
>> > Prashant
>> >
>>
>>
>> --
>> *Note that I'm no longer using my Yahoo! email address. Please email me at
>> [email protected] going forward.*
>

--
*Note that I'm no longer using my Yahoo! email address. Please email me at [email protected] going forward.*
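
For reference, a minimal sketch of what the "one-liner by the caller" point in the reply above could look like. It uses only java.util and does not depend on Pig: FakeJobStats is a hypothetical stand-in for JobStats, and toList() is an illustrative helper, not an existing Pig method. It is meant only to show that a caller holding the existing iterator can build the List itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class JobListSketch {

    // Hypothetical stand-in for Pig's JobStats, so the sketch compiles on its own.
    static final class FakeJobStats {
        final String jobId;
        final boolean successful;

        FakeJobStats(String jobId, boolean successful) {
            this.jobId = jobId;
            this.successful = successful;
        }
    }

    // Illustrative caller-side helper (an assumption, not Pig API):
    // drains any iterator into a new list, preserving iteration order.
    static <T> List<T> toList(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // In real code the iterator would come from the job graph; here it is faked.
        Iterator<FakeJobStats> it = Arrays.asList(
                new FakeJobStats("job_1", true),
                new FakeJobStats("job_2", false)).iterator();

        List<FakeJobStats> jobs = toList(it);
        for (FakeJobStats job : jobs) {
            System.out.println(job.jobId + " successful=" + job.successful);
        }
    }
}
```

Whether this belongs in the API or stays with the caller is the open question in the thread; the sketch only shows how small the caller-side version is.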
