1. Collection is kind of a separate problem. You can write an optimizer from
the position of "if we have stats, we use them" and punt on this. Assume
there is something that provides the stats. Fake them while you are
dealing with the optimization problem.

2. Attach ResourceStatistics to the different operator instances and mutate
them as appropriate while walking down the operators.
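
To make both points concrete, here is a minimal sketch. It is not Pig's
actual API: Stats, Op, Load, Filter, Join, StatsProvider and StatsAnnotator
are all hypothetical stand-ins (Stats plays the role of ResourceStatistics),
and the row-count estimates are deliberately crude. The provider is faked
with a lambda, and the stats are attached to each operator while walking
from the loads down to the join.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for per-operator statistics (not Pig's ResourceStatistics).
final class Stats {
    final long rowCount;
    final boolean sortedOnKey;
    Stats(long rowCount, boolean sortedOnKey) {
        this.rowCount = rowCount;
        this.sortedOnKey = sortedOnKey;
    }
}

// Hypothetical operator node; stats are attached to it during the walk.
abstract class Op {
    Stats stats;
    final List<Op> inputs = new ArrayList<>();
}

final class Load extends Op {
    final String path;
    Load(String path) { this.path = path; }
}

final class Filter extends Op {
    final double selectivity; // assumed fraction of rows kept
    Filter(Op input, double selectivity) {
        inputs.add(input);
        this.selectivity = selectivity;
    }
}

final class Join extends Op {
    String strategy = "hash"; // default until stats say otherwise
    Join(Op left, Op right) {
        inputs.add(left);
        inputs.add(right);
    }
}

// Point 1: "something that provides the stats". Easy to fake in tests.
interface StatsProvider {
    Stats statsFor(String path);
}

// Point 2: visit inputs first, so stats flow from the loads down the plan.
final class StatsAnnotator {
    private final StatsProvider provider;
    StatsAnnotator(StatsProvider provider) { this.provider = provider; }

    Stats annotate(Op op) {
        for (Op in : op.inputs) {
            annotate(in);
        }
        if (op instanceof Load) {
            op.stats = provider.statsFor(((Load) op).path);
        } else if (op instanceof Filter) {
            Stats in = op.inputs.get(0).stats;
            long rows = Math.round(in.rowCount * ((Filter) op).selectivity);
            op.stats = new Stats(rows, in.sortedOnKey); // filtering keeps order
        } else if (op instanceof Join) {
            Stats l = op.inputs.get(0).stats;
            Stats r = op.inputs.get(1).stats;
            boolean bothSorted = l.sortedOnKey && r.sortedOnKey;
            ((Join) op).strategy = bothSorted ? "merge" : "hash";
            // Crude placeholder estimate; a real optimizer would do better.
            op.stats = new Stats(Math.max(l.rowCount, r.rowCount), bothSorted);
        }
        return op.stats;
    }
}

final class JoinChoiceDemo {
    public static void main(String[] args) {
        StatsProvider fake = path -> new Stats(1000, true); // faked stats
        Join join = new Join(new Filter(new Load("a"), 0.5), new Load("b"));
        new StatsAnnotator(fake).annotate(join);
        System.out.println(join.strategy);
    }
}
```

Because the optimizer only sees the StatsProvider interface, the collection
side can be swapped in later without touching the join-selection logic.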

-D

On Tue, Nov 2, 2010 at 10:09 PM, Renato Marroquín Mogrovejo <
[email protected]> wrote:

> A couple of weeks ago, in a list discussion, Alan suggested an interesting
> project to me: switching join operators based on data properties. For
> example, a specific join operator might be chosen at logical plan compile
> time, but that operator may not be the most suitable one for the data. If
> both data sources are ordered by their key, then a merge join would be the
> best operator.
> But at this point I dunno how I should proceed. I have some 'general
> doubts' about the approach that should be taken.
>
> 1. Data statistics can be passed to the LoadFunc by using the LoadMetadata
> interface, right? But how should these statistics be collected? Should I
> modify the LOLoad class to use a different LoadFunc?
> 2. And how would these statistics be passed to the optimizer to change the
> join operator, if that turns out to be needed?
>
> Please correct me if I am wrong (which I probably am), and any suggestions
> or comments are highly appreciated.
> Thanks in advance.
>
>
> Renato M.
>
