Thanks Dmitriy

Just one more question.

registerScript lets you register a Pig script in embedded mode.
My confusion was whether it tries to optimize the script internally,
or whether setBatchOn has to be called explicitly.

Regards
Rohan

Dmitriy Ryaboy wrote:
1) Automatically, if you call it right.  Look for the setBatchOn and
executeBatch methods (I may be slightly off on the method names, going off
memory)

2) The optimizer moves stuff around and may execute things in a
slightly different order than what you tell it. This can mean pushing up
projections, filters, and limits, inserting casts, and doing all kinds of
other manipulations. The logical plan shows you what's going to happen
without breaking it down into the MR plan. There are further optimizations
at the MR level, so both are worth checking. In practice I usually look at
the logical plan for order-of-operations and general sanity checking, and at
the MR plan for number of jobs and whether things like algebraic and
accumulative interfaces are kicking in.

3) Yes. Roughly speaking, one map per block will be generated. The bigger
the block, the more work per mapper. The smaller the block, the more
mappers. Depending on the workload, there's an optimal value.

4) Playing with logical plan -- don't :-). It's exposed so that you can look
at what's going on, and not intended to let you change execution plans.
Unless you actually want to hack Pig guts. If that's the case, look at the
optimizer and the MRCompiler classes to see how it's getting modified and
used.
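
To make point 3 concrete, here is a minimal sketch of the "one map per block" rule of thumb as plain arithmetic. It is only an estimate: it ignores split-size settings, non-splittable compressed inputs, and multi-file inputs, and the class name, method name, and 10 GB input size are hypothetical, chosen for illustration.

```java
public class MapperEstimate {

    // Rough estimate: one map task per block, rounding up so that a
    // partial last block still gets its own mapper.
    static long estimateMappers(long fileBytes, long blockBytes) {
        if (blockBytes <= 0) {
            throw new IllegalArgumentException("block size must be positive");
        }
        if (fileBytes <= 0) {
            return 0;
        }
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long inputBytes = 10L * 1024L * mb; // hypothetical 10 GB input file

        // Larger blocks mean fewer mappers, each doing more work.
        for (long blockMb : new long[] {64, 128, 256}) {
            long mappers = estimateMappers(inputBytes, blockMb * mb);
            System.out.println(blockMb + " MB blocks -> " + mappers + " mappers");
        }
    }
}
```

Doubling the block size halves the estimated mapper count for the same input, which is the trade-off described above: fewer, heavier mappers versus more, lighter ones.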

-D

On Thu, Mar 4, 2010 at 9:14 AM, Rohan Rai <rohan....@inmobi.com> wrote:


On using the embedded Pig Server and registering a Pig script for execution:

1) Does multi-query optimization happen automatically, or does it have to
be requested explicitly?

2) The logical plan: what can one infer from it?

3) Does the block size (defined in Hadoop) have an effect on performance
or on the number of map tasks that get launched?

Regards
Rohan

The information contained in this communication is intended solely for the
use of the individual or entity to whom it is addressed and others
authorized to receive it. It may contain confidential or legally privileged
information. If you are not the intended recipient you are hereby notified
that any disclosure, copying, distribution or taking any action in reliance
on the contents of this information is strictly prohibited and may be
unlawful. If you have received this communication in error, please notify us
immediately by responding to this email and then delete it from your system.
The firm is neither liable for the proper and complete transmission of the
information contained in this communication nor for any delay in its
receipt.

