I knew I had those negotiation skills :) Patch is available, please review. It's a minor one https://issues.apache.org/jira/browse/PIG-2964
-Prashant On Wed, Oct 10, 2012 at 5:54 PM, Bill Graham <billgra...@gmail.com> wrote: > Ok, I'm sold. :) > > > On Wed, Oct 10, 2012 at 11:00 AM, Prashant Kommireddi <prash1...@gmail.com > > wrote: > >> Thanks Bill. >> >> The rationale behind providing a List is that it simply provides a lot >> more methods than an iterator. You are right in saying one could do that in >> the caller code, I have a feeling providing this helper in the API would be >> beneficial. For eg, a framework that is used by clients could initiate >> several pig scripts/store commands at once. At the framework layer, you >> might want to be able to determine the number of MR jobs in total spawned >> by these multiple scripts and query stats on those. That's just one >> use-case, there could be more methods on List that a user could be >> interested in. >> >> -Prashant >> >> >> On Wed, Oct 10, 2012 at 10:28 AM, Bill Graham <billgra...@gmail.com>wrote: >> >>> Hi Prashant, >>> >>> [Replying to the dev list to get others take on these...] >>> >>> Just curious, why do you prefer a List of JobStats over the already >>> existing iterator? I hesitate to add one-liner methods if it's something >>> that can be a one-liner my the caller, unless the use case if very common. >>> >>> Making getSuccessfulJobs() and getFailedJobs() public seems reasonable >>> to me. >>> >>> I'm not sure about the rationale behind the differences between >>> registerScript and store(). Store() and registerQuery() are able to >>> manually add to the DAG as statements come in, but register script needs >>> parsing for execution. That's probably why execution is delegated to the >>> GruntParser. The resulting DAG for a single-store script should be the same >>> though. It seems like registerScript() should be able to return a list of >>> ExecJobs. >>> >>> thanks, >>> Bill >>> >>> >>> On Tue, Oct 9, 2012 at 11:22 PM, Prashant Kommireddi < >>> prash1...@gmail.com> wrote: >>> >>>> Hi Bill, >>>> >>>> I am looking at PigStats and JobGraph, and am thinking of adding some >>>> functions. Let me know what you think. >>>> >>>> *getJobList()* returns a List representation of the iterator. >>>> >>>> public List<JobStats> getJobList() { >>>> return IteratorUtils.toList(iterator()); >>>> } >>>> >>>> What do you think about making getSuccessfulJobs() and getFailedJobs() >>>> public and exposing it to the API? Currently they are package-private? >>>> >>>> Had another question, seems like the execution flow for >>>> PigServer.registerScript/Query is different from PigServer.store(). Was >>>> there a reason to make these different? The function store() returns an >>>> ExecJob which is great to get info regarding the runs, but registerScript() >>>> calls the GruntParser for execution which I think is a different flow? >>>> >>>> Thanks, >>>> Prashant >>>> >>>> >>>> On Thu, Oct 4, 2012 at 6:05 PM, Bill Graham <billgra...@gmail.com>wrote: >>>> >>>>> Makes sense to me. We could return a PigStats object. >>>>> >>>>> On Thu, Oct 4, 2012 at 1:49 PM, Prashant Kommireddi < >>>>> prash1...@gmail.com>wrote: >>>>> >>>>> > Hi All, >>>>> > >>>>> > I am looking at PigServer methods for running scripts/queries and it >>>>> seems >>>>> > like currently theie return type is void which does not tell much >>>>> about job >>>>> > completion. >>>>> > >>>>> > public void registerScript(InputStream in, Map<String,String> >>>>> > params,List<String> paramsFiles) throws IOException { >>>>> > try { >>>>> > String substituted = doParamSubstitution(in, params, >>>>> > paramsFiles); >>>>> > GruntParser grunt = new GruntParser(new >>>>> > StringReader(substituted)); >>>>> > grunt.setInteractive(false); >>>>> > grunt.setParams(this); >>>>> > grunt.parseStopOnError(true); >>>>> > } catch >>>>> (org.apache.pig.tools.pigscript.parser.ParseException e) { >>>>> > log.error(e.getLocalizedMessage()); >>>>> > throw new IOException(e.getCause()); >>>>> > } >>>>> > } >>>>> > >>>>> > >>>>> > We do have a handle on number of jobs succeeded/failed as part of >>>>> the job >>>>> > run, so that is something we should add as return type? >>>>> > >>>>> > Thanks, >>>>> > Prashant >>>>> > >>>>> >>>>> >>>>> >>>>> -- >>>>> *Note that I'm no longer using my Yahoo! email address. Please email >>>>> me at >>>>> billgra...@gmail.com going forward.* >>>>> >>>> >>>> >>> >>> >>> -- >>> *Note that I'm no longer using my Yahoo! email address. Please email me >>> at billgra...@gmail.com going forward.* >>> >> >> > > > -- > *Note that I'm no longer using my Yahoo! email address. Please email me > at billgra...@gmail.com going forward.* >