Hi Alan,
Thanks for the detailed review.
After getting Daniel's feedback (and grokking the relationship between
Pig's logical and physical operators, which is a little different from
what is described in the literature), we agree that the proper place to
put the optimizer is at the logical layer, alt
This is a good start at adding a cost-based optimizer to Pig. I have
a number of comments:
1) Your argument for putting it in the physical layer rather than the
logical is that the logical layer does not know physical statistics.
This need not be true. You suggest adding a getStatistics
Daniel, thanks for the information, this is useful.
On Wed, Sep 2, 2009 at 2:06 PM, Jianyong Dai wrote:
> Yes, physical properties are important for an optimizer. To optimize Pig
> well, we need to know the underlying Hadoop execution environment, such as #
> of map-reduce jobs, how many maps/redu
Yes, physical properties are important for an optimizer. To optimize Pig
well, we need to know the underlying Hadoop execution environment, such
as # of map-reduce jobs, how many maps/reducers, how the job is
configured, etc. This is true even for a rule-based optimizer.
Unfortunately, physical
Our initial survey of related literature showed that the usual place
for a CBO tends to be between the physical and logical layers (in fact,
the famous Cascades paper advocates removing the distinction between
physical and logical operators altogether, and using an "is_logical"
and "is_physical" fla
I am still reading, but one interesting question is why you decided to put
the CBO in the physical layer?
Dmitriy Ryaboy wrote:
Whoops :-)
Here's the Google doc:
http://docs.google.com/Doc?docid=0Adqb7pZsloe6ZGM4Z3o1OG1fMjFrZjViZ21jdA&hl=en
-Dmitriy
On Tue, Sep 1, 2009 at 12:51 PM, Santhosh Srinivasan wrote:
> Dmitriy and Gang,
>
> The mailing list does not allow attachments. Can you post it on a
> website and just send t
Dmitriy and Gang,
The mailing list does not allow attachments. Can you post it on a
website and just send the URL?
Thanks,
Santhosh
-----Original Message-----
From: Dmitriy Ryaboy [mailto:dvrya...@gmail.com]
Sent: Tuesday, September 01, 2009 9:48 AM
To: pig-dev@hadoop.apache.org
Subject: Requ