Wouldn't this be a first step towards a cost-based optimizer?
I think it would pay off in the long run to start thinking about a general
framework now.
Hacking it into the current framework might provide only a short-term
solution, and not a very elegant one.
Probably the best place to start would be the database literature.
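
To make the idea a bit more concrete, here is a rough, language-neutral sketch
(in Python, not Pig code) of how statistics could drive the choice of join.
Everything in it is an assumption for illustration: InputStats, choose_join and
the size threshold are hypothetical names and values, not part of Pig. It is
only a rule-based first step; a real cost-based optimizer would replace the
if-chain with a cost model that compares estimated costs of alternative plans.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InputStats:
    """Hypothetical per-input statistics; not an existing Pig class."""
    size_bytes: Optional[int] = None    # None means the size is unknown
    sorted_on_join_key: bool = False


# Assumed limit for an input that fits in memory on each task (illustrative).
REPLICATED_LIMIT_BYTES = 100 * 1024 * 1024


def choose_join(left: InputStats, right: InputStats) -> str:
    """Pick a join strategy name from the statistics of both inputs."""
    # Both inputs already sorted on the join key: a merge join fits.
    if left.sorted_on_join_key and right.sorted_on_join_key:
        return "merge"
    # One input is known to be small: replicate it to every task.
    known_sizes = [s.size_bytes for s in (left, right)
                   if s.size_bytes is not None]
    if known_sizes and min(known_sizes) <= REPLICATED_LIMIT_BYTES:
        return "replicated"
    # No usable statistics: fall back to the default hash join.
    return "hash"
```

With no statistics at all the function falls back to the default, so the
planner would behave as it does today whenever a loader reports nothing.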

I am also interested in the topic and would like to help where I can :)

Cheers,
-- Gianmarco



On Wed, Nov 3, 2010 at 06:09, Renato Marroquín Mogrovejo
<renatoj.marroq...@gmail.com> wrote:
> A couple of weeks ago, in a discussion on this list, Alan suggested an
> interesting project to me: switching join operators based on data
> properties. For example, a specific join operator might be chosen when the
> logical plan is compiled, even though that operator is not the most
> suitable one for the data. If both data sources are ordered by their
> key, then a merge join would be the best operator.
> But at this point I don't know how I should proceed. I have some general
> questions about the approach that should be taken.
>
> 1. Data statistics can be passed to the LoadFunc by using the LoadMetadata
> interface, right? But how should these statistics be collected? Should I
> modify the LOLoad class to use a different LoadFunc?
> 2. And how would these statistics be passed to the optimizer so that it can
> change the join operator when appropriate?
>
> Please correct me if I am wrong (which I probably am), and any suggestions
> or comments are highly appreciated.
> Thanks in advance.
>
>
> Renato M.
>
