This stuff is a bit convoluted, isn't it? I think you may be right (I never use registerScript). Try an experiment?
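One cheap way to start that experiment, before touching a cluster, is to model the two conditions from the parseStopOnError snippet quoted below. This is only a sketch and not Pig code: the class BatchModeSketch and the method trace are made up for illustration, and it mirrors nothing beyond the quoted control flow. If the quoted code is accurate, the registerScript path (setInteractive(false), then parseStopOnError(true)) would reach neither setBatchOn() nor executeBatch() inside this method, which matches Rohan's reading. Whether batch mode gets switched on somewhere else is not something the model can show; that needs a real run against PigServer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone model of the quoted control flow; not Pig's actual code.
class BatchModeSketch {

    // Returns the batch-related calls the quoted parseStopOnError would make
    // for a given combination of mInteractive and sameBatch.
    static List<String> trace(boolean interactive, boolean sameBatch) {
        List<String> calls = new ArrayList<>();
        // Mirrors: if (!mInteractive && !sameBatch) { setBatchOn(); }
        if (!interactive && !sameBatch) {
            calls.add("setBatchOn");
        }
        // Stands in for the while(!mDone) { parse(); } loop.
        calls.add("parse");
        // Mirrors: if (!sameBatch) { executeBatch(); }
        if (!sameBatch) {
            calls.add("executeBatch");
        }
        return calls;
    }

    public static void main(String[] args) {
        // The registerScript path as quoted: interactive=false, sameBatch=true.
        System.out.println("registerScript path: " + trace(false, true));
        // Non-interactive caller that does not share a batch.
        System.out.println("sameBatch=false:     " + trace(false, false));
        // Interactive caller that does not share a batch.
        System.out.println("interactive:         " + trace(true, false));
    }
}
```

In this model only the non-interactive, sameBatch=false combination reaches setBatchOn, so for registerScript the batch would have to be turned on and executed by the caller.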
On Thu, Mar 4, 2010 at 11:20 AM, Rohan Rai <rohan....@inmobi.com> wrote:

> In addition
>
> Even
> org.apache.pig.tools.pigscript.parser.PigScriptParser (.jj)
> seems to tell that it's not running in batch mode.
>
> Is the interpretation incorrect
>
> Regards
> Rohan
>
> Rohan Rai wrote:
>
>> That's what makes it confusing
>> If you see, the parameter getting passed is true, which is sameBatch,
>> on which it should ideally not call setBatchOn
>>
>> if (!mInteractive && !sameBatch) {
>>     setBatchOn();
>> }
>>
>> Dmitriy Ryaboy wrote:
>>
>>> Looks like it's on automatically.
>>>
>>> Code below is from trunk, but I don't think this changed recently. I got rid
>>> of exception handling for conciseness.
>>>
>>> In PigServer:
>>>
>>> public void registerScript(String fileName) throws IOException {
>>>     GruntParser grunt = new GruntParser(new FileReader(new File(fileName)));
>>>     *grunt.setInteractive(false);*
>>>     grunt.setParams(this);
>>>     grunt.parseStopOnError(true);
>>> }
>>>
>>> In GruntParser:
>>>
>>> public int[] parseStopOnError(boolean sameBatch) throws IOException,
>>>         ParseException {
>>>     *if (!mInteractive && !sameBatch) {
>>>         setBatchOn();
>>>     }*
>>>     prompt();
>>>     mDone = false;
>>>     while(!mDone) {
>>>         parse();
>>>     }
>>>     if (!sameBatch) {
>>>         executeBatch();
>>>     }
>>>     int [] res = { mNumSucceededJobs, mNumFailedJobs };
>>>     return res;
>>> }
>>>
>>> On Thu, Mar 4, 2010 at 10:00 AM, Rohan Rai <rohan....@inmobi.com> wrote:
>>>
>>>> Thanks Dmitriy
>>>>
>>>> Just one more question
>>>>
>>>> registerScript allows registering a pig script in the embedded mode
>>>> So the confusion was: does it internally try to optimize it,
>>>> or does setBatchOn have to be explicitly called
>>>>
>>>> Regards
>>>> Rohan
>>>>
>>>> Dmitriy Ryaboy wrote:
>>>>
>>>>> 1) Automatically, if you call it right. Look for the setBatchOn and
>>>>> executeBatch methods (I may be slightly off on the method names, going off
>>>>> memory)
>>>>>
>>>>> 2) The optimizer moves stuff around and may be executing things in a
>>>>> slightly different order than what you tell it. This can mean pushing up
>>>>> projections, filters, and limits, inserting casts, and doing all kinds of
>>>>> other manipulations. The logical plan shows you what's going to happen
>>>>> without breaking it down into the MR plan. There are further optimizations
>>>>> at the MR level, so both are worth checking. In practice I usually look at
>>>>> the logical plan for order-of-operations and general sanity checking, and at
>>>>> the MR plan for number of jobs and whether things like algebraic and
>>>>> accumulative interfaces are kicking in.
>>>>>
>>>>> 3) Yes. Roughly speaking, one map per block will be generated. The bigger
>>>>> the block, the more work per mapper. The smaller the block, the more
>>>>> mappers. Depending on the workload, there's an optimal value.
>>>>>
>>>>> 4) Playing with logical plan -- don't :-). It's exposed so that you can look
>>>>> at what's going on, and not intended to let you change execution plans.
>>>>> Unless you actually want to hack Pig guts. If that's the case, look at the
>>>>> optimizer and the MRCompiler classes to see how it's getting modified and
>>>>> used.
>>>>>
>>>>> -D
>>>>>
>>>>> On Thu, Mar 4, 2010 at 9:14 AM, Rohan Rai <rohan....@inmobi.com> wrote:
>>>>>
>>>>>> On using embedded Pig Server and registering a pig script for execution
>>>>>>
>>>>>> 1) Does Multi Query Optimization happen automatically, or does it have to
>>>>>> be explicitly told so.
>>>>>>
>>>>>> 2) Logical Plan. What one can infer out of it.
>>>>>>
>>>>>> 3) Does the Block Size (defined in hadoop) have an effect on performance
>>>>>> or the number of map jobs getting selected.
>>>>>>
>>>>>> Regards
>>>>>> Rohan
>>>>>>
>>>>>> The information contained in this communication is intended solely for the
>>>>>> use of the individual or entity to whom it is addressed and others
>>>>>> authorized to receive it. It may contain confidential or legally privileged
>>>>>> information. If you are not the intended recipient you are hereby notified
>>>>>> that any disclosure, copying, distribution or taking any action in reliance
>>>>>> on the contents of this information is strictly prohibited and may be
>>>>>> unlawful. If you have received this communication in error, please notify us
>>>>>> immediately by responding to this email and then delete it from your system.
>>>>>> The firm is neither liable for the proper and complete transmission of the
>>>>>> information contained in this communication nor for any delay in its
>>>>>> receipt.
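On the block size answer quoted above ("roughly speaking, one map per block"), the arithmetic can be sketched as a ceiling division. This is only a back-of-the-envelope estimate under that one-map-per-block assumption for a single file; the class MapCountSketch, the method estimateMaps, and the file and block sizes are invented for illustration, and the real number of maps depends on how the input is actually split.

```java
// Hypothetical helper; a rough estimate, not how Hadoop or Pig computes splits.
class MapCountSketch {

    // Estimates map tasks for one file as ceil(fileBytes / blockBytes).
    static long estimateMaps(long fileBytes, long blockBytes) {
        if (fileBytes < 0 || blockBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative, block size positive");
        }
        // Integer ceiling division without floating point.
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long fileBytes = 1024L * mib; // made-up example: a 1 GiB input file

        // Smaller blocks: more mappers, less work each.
        System.out.println("64 MiB blocks:  " + estimateMaps(fileBytes, 64 * mib));
        // Bigger blocks: fewer mappers, more work each.
        System.out.println("128 MiB blocks: " + estimateMaps(fileBytes, 128 * mib));
    }
}
```

Under these assumptions the 1 GiB example comes out at 16 maps with 64 MiB blocks and 8 maps with 128 MiB blocks, which is the trade-off described in the answer.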