Doesn't executeBatch() return exactly what you want?
On Thu, Oct 11, 2012 at 2:12 AM, Prashant Kommireddi <[email protected]> wrote: > I knew I had those negotiation skills :) > > Patch is available, please review. It's a minor one > https://issues.apache.org/jira/browse/PIG-2964 > > -Prashant > > On Wed, Oct 10, 2012 at 5:54 PM, Bill Graham <[email protected]> wrote: > >> Ok, I'm sold. :) >> >> >> On Wed, Oct 10, 2012 at 11:00 AM, Prashant Kommireddi <[email protected] >> > wrote: >> >>> Thanks Bill. >>> >>> The rationale behind providing a List is that it simply provides a lot >>> more methods than an iterator. You are right in saying one could do that in >>> the caller code, I have a feeling providing this helper in the API would be >>> beneficial. For eg, a framework that is used by clients could initiate >>> several pig scripts/store commands at once. At the framework layer, you >>> might want to be able to determine the number of MR jobs in total spawned >>> by these multiple scripts and query stats on those. That's just one >>> use-case, there could be more methods on List that a user could be >>> interested in. >>> >>> -Prashant >>> >>> >>> On Wed, Oct 10, 2012 at 10:28 AM, Bill Graham <[email protected]>wrote: >>> >>>> Hi Prashant, >>>> >>>> [Replying to the dev list to get others take on these...] >>>> >>>> Just curious, why do you prefer a List of JobStats over the already >>>> existing iterator? I hesitate to add one-liner methods if it's something >>>> that can be a one-liner my the caller, unless the use case if very common. >>>> >>>> Making getSuccessfulJobs() and getFailedJobs() public seems reasonable >>>> to me. >>>> >>>> I'm not sure about the rationale behind the differences between >>>> registerScript and store(). Store() and registerQuery() are able to >>>> manually add to the DAG as statements come in, but register script needs >>>> parsing for execution. That's probably why execution is delegated to the >>>> GruntParser. The resulting DAG for a single-store script should be the same >>>> though. It seems like registerScript() should be able to return a list of >>>> ExecJobs. >>>> >>>> thanks, >>>> Bill >>>> >>>> >>>> On Tue, Oct 9, 2012 at 11:22 PM, Prashant Kommireddi < >>>> [email protected]> wrote: >>>> >>>>> Hi Bill, >>>>> >>>>> I am looking at PigStats and JobGraph, and am thinking of adding some >>>>> functions. Let me know what you think. >>>>> >>>>> *getJobList()* returns a List representation of the iterator. >>>>> >>>>> public List<JobStats> getJobList() { >>>>> return IteratorUtils.toList(iterator()); >>>>> } >>>>> >>>>> What do you think about making getSuccessfulJobs() and getFailedJobs() >>>>> public and exposing it to the API? Currently they are package-private? >>>>> >>>>> Had another question, seems like the execution flow for >>>>> PigServer.registerScript/Query is different from PigServer.store(). Was >>>>> there a reason to make these different? The function store() returns an >>>>> ExecJob which is great to get info regarding the runs, but >>>>> registerScript() >>>>> calls the GruntParser for execution which I think is a different flow? >>>>> >>>>> Thanks, >>>>> Prashant >>>>> >>>>> >>>>> On Thu, Oct 4, 2012 at 6:05 PM, Bill Graham <[email protected]>wrote: >>>>> >>>>>> Makes sense to me. We could return a PigStats object. >>>>>> >>>>>> On Thu, Oct 4, 2012 at 1:49 PM, Prashant Kommireddi < >>>>>> [email protected]>wrote: >>>>>> >>>>>> > Hi All, >>>>>> > >>>>>> > I am looking at PigServer methods for running scripts/queries and it >>>>>> seems >>>>>> > like currently theie return type is void which does not tell much >>>>>> about job >>>>>> > completion. >>>>>> > >>>>>> > public void registerScript(InputStream in, Map<String,String> >>>>>> > params,List<String> paramsFiles) throws IOException { >>>>>> > try { >>>>>> > String substituted = doParamSubstitution(in, params, >>>>>> > paramsFiles); >>>>>> > GruntParser grunt = new GruntParser(new >>>>>> > StringReader(substituted)); >>>>>> > grunt.setInteractive(false); >>>>>> > grunt.setParams(this); >>>>>> > grunt.parseStopOnError(true); >>>>>> > } catch >>>>>> (org.apache.pig.tools.pigscript.parser.ParseException e) { >>>>>> > log.error(e.getLocalizedMessage()); >>>>>> > throw new IOException(e.getCause()); >>>>>> > } >>>>>> > } >>>>>> > >>>>>> > >>>>>> > We do have a handle on number of jobs succeeded/failed as part of >>>>>> the job >>>>>> > run, so that is something we should add as return type? >>>>>> > >>>>>> > Thanks, >>>>>> > Prashant >>>>>> > >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> *Note that I'm no longer using my Yahoo! email address. Please email >>>>>> me at >>>>>> [email protected] going forward.* >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> *Note that I'm no longer using my Yahoo! email address. Please email me >>>> at [email protected] going forward.* >>>> >>> >>> >> >> >> -- >> *Note that I'm no longer using my Yahoo! email address. Please email me >> at [email protected] going forward.* >>
