True, that would serve the purpose. However, I feel the abstraction could
sit at a lower level so that callers of other functions such as "store"
could use it too.

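To make the "lower level" idea a bit more concrete, here is a minimal sketch
of a helper that any caller could share, whichever PigServer method produced
the stats. It is only an illustration: JobStatsLists and SimpleJobStats are
hypothetical names that stand in for the real Pig classes, and nothing here
is taken from the patch itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper class; not part of Pig.
final class JobStatsLists {

    // Hypothetical stand-in for Pig's JobStats, reduced to what the sketch needs.
    static final class SimpleJobStats {
        final String jobId;
        final boolean successful;

        SimpleJobStats(String jobId, boolean successful) {
            this.jobId = jobId;
            this.successful = successful;
        }
    }

    private JobStatsLists() { }

    // Drain an iterator into a List so callers get size(), get(i), and so on.
    static <T> List<T> toList(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    // Count failed jobs across the results of several script/store runs.
    static int countFailed(List<List<SimpleJobStats>> runs) {
        int failed = 0;
        for (List<SimpleJobStats> run : runs) {
            for (SimpleJobStats job : run) {
                if (!job.successful) {
                    failed++;
                }
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Two pretend runs, each exposing its jobs only as an iterator.
        Iterator<SimpleJobStats> first = Arrays.asList(
                new SimpleJobStats("job_1", true),
                new SimpleJobStats("job_2", false)).iterator();
        Iterator<SimpleJobStats> second = Arrays.asList(
                new SimpleJobStats("job_3", true)).iterator();

        List<List<SimpleJobStats>> runs = new ArrayList<>();
        runs.add(toList(first));
        runs.add(toList(second));

        int total = 0;
        for (List<SimpleJobStats> run : runs) {
            total += run.size();
        }
        System.out.println("total jobs: " + total + ", failed: " + countFailed(runs));
    }
}
```

A framework layer that launches several scripts or store commands could
collect one such list per run and aggregate over them, which is the use case
discussed further down the thread.
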
On Thu, Oct 11, 2012 at 12:27 PM, Dmitriy Ryaboy <[email protected]> wrote:

> Doesn't executeBatch() return exactly what you want?
>
>
>
> On Thu, Oct 11, 2012 at 2:12 AM, Prashant Kommireddi
> <[email protected]> wrote:
> > I knew I had those negotiation skills :)
> >
> > Patch is available, please review. It's a minor one
> > https://issues.apache.org/jira/browse/PIG-2964
> >
> > -Prashant
> >
> > On Wed, Oct 10, 2012 at 5:54 PM, Bill Graham <[email protected]> wrote:
> >
> >> Ok, I'm sold. :)
> >>
> >>
> >> On Wed, Oct 10, 2012 at 11:00 AM, Prashant Kommireddi <[email protected]> wrote:
> >>
> >>> Thanks Bill.
> >>>
> >>> The rationale behind providing a List is that it simply offers a lot
> >>> more methods than an iterator. You are right that one could do the
> >>> conversion in the caller code, but I have a feeling that providing this
> >>> helper in the API would be beneficial. For example, a framework used by
> >>> clients could initiate several Pig scripts/store commands at once. At
> >>> the framework layer, you might want to determine the total number of MR
> >>> jobs spawned by these scripts and query stats on them. That's just one
> >>> use case; there could be other List methods that a user is interested
> >>> in.
> >>>
> >>> -Prashant
> >>>
> >>>
> >>> On Wed, Oct 10, 2012 at 10:28 AM, Bill Graham <[email protected]> wrote:
> >>>
> >>>> Hi Prashant,
> >>>>
> >>>> [Replying to the dev list to get others' take on these...]
> >>>>
> >>>> Just curious, why do you prefer a List of JobStats over the already
> >>>> existing iterator? I hesitate to add one-liner methods if it's
> >>>> something that can be a one-liner by the caller, unless the use case
> >>>> is very common.
> >>>>
> >>>> Making getSuccessfulJobs() and getFailedJobs() public seems reasonable
> >>>> to me.
> >>>>
> >>>> I'm not sure about the rationale behind the differences between
> >>>> registerScript() and store(). store() and registerQuery() are able to
> >>>> manually add to the DAG as statements come in, but registerScript()
> >>>> needs parsing for execution. That's probably why execution is
> >>>> delegated to the GruntParser. The resulting DAG for a single-store
> >>>> script should be the same though. It seems like registerScript()
> >>>> should be able to return a list of ExecJobs.
> >>>>
> >>>> thanks,
> >>>> Bill
> >>>>
> >>>>
> >>>> On Tue, Oct 9, 2012 at 11:22 PM, Prashant Kommireddi <
> >>>> [email protected]> wrote:
> >>>>
> >>>>> Hi Bill,
> >>>>>
> >>>>> I am looking at PigStats and JobGraph, and am thinking of adding some
> >>>>> functions. Let me know what you think.
> >>>>>
> >>>>> *getJobList()* returns a List representation of the iterator.
> >>>>>
> >>>>> public List<JobStats> getJobList() {
> >>>>>     return IteratorUtils.toList(iterator());
> >>>>> }
> >>>>>
> >>>>> What do you think about making getSuccessfulJobs() and
> >>>>> getFailedJobs() public and exposing them in the API? Currently they
> >>>>> are package-private?
> >>>>>
> >>>>> I had another question: it seems like the execution flow for
> >>>>> PigServer.registerScript/Query is different from PigServer.store().
> >>>>> Was there a reason to make these different? The function store()
> >>>>> returns an ExecJob, which is great for getting info about the runs,
> >>>>> but registerScript() calls the GruntParser for execution, which I
> >>>>> think is a different flow?
> >>>>> Thanks,
> >>>>> Prashant
> >>>>>
> >>>>>
> >>>>> On Thu, Oct 4, 2012 at 6:05 PM, Bill Graham <[email protected]> wrote:
> >>>>>
> >>>>>> Makes sense to me. We could return a PigStats object.
> >>>>>>
> >>>>>> On Thu, Oct 4, 2012 at 1:49 PM, Prashant Kommireddi <[email protected]> wrote:
> >>>>>>
> >>>>>> > Hi All,
> >>>>>> >
> >>>>>> > I am looking at the PigServer methods for running scripts/queries,
> >>>>>> > and it seems that their return type is currently void, which does
> >>>>>> > not tell the caller much about job completion.
> >>>>>> >
> >>>>>> >     public void registerScript(InputStream in, Map<String,String> params,
> >>>>>> >             List<String> paramsFiles) throws IOException {
> >>>>>> >         try {
> >>>>>> >             String substituted = doParamSubstitution(in, params, paramsFiles);
> >>>>>> >             GruntParser grunt = new GruntParser(new StringReader(substituted));
> >>>>>> >             grunt.setInteractive(false);
> >>>>>> >             grunt.setParams(this);
> >>>>>> >             grunt.parseStopOnError(true);
> >>>>>> >         } catch (org.apache.pig.tools.pigscript.parser.ParseException e) {
> >>>>>> >             log.error(e.getLocalizedMessage());
> >>>>>> >             throw new IOException(e.getCause());
> >>>>>> >         }
> >>>>>> >     }
> >>>>>> >
> >>>>>> >
> >>>>>> > We do have a handle on the number of jobs that succeeded/failed as
> >>>>>> > part of the job run, so is that something we should add as the
> >>>>>> > return type?
> >>>>>> >
> >>>>>> > Thanks,
> >>>>>> > Prashant
> >>>>>> >
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> *Note that I'm no longer using my Yahoo! email address. Please email
> >>>>>> me at [email protected] going forward.*
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >>
>
