A couple of weeks ago, in a discussion on this list, Alan suggested an
interesting project to me: switching join operators based on properties of
the data. At logical plan compile time a specific join operator is chosen,
but that operator may not be the most suitable one for the data. For
example, if both data sources are ordered by their join key, then a merge
join would be the best operator.
At this point I do not know how I should proceed. I have some general
questions about the approach that should be taken.

1. Data statistics can be passed to the LoadFunc by using the LoadMetadata
interface, right? But how should these statistics be collected? Should I
modify the LOLoad class to use a different LoadFunc?
2. How would these statistics be passed to the optimizer so that it can
change the join operator when that is warranted?
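
To make the idea concrete, here is a rough sketch of the kind of decision I
have in mind. It is plain Python rather than Pig code, and everything in it
is an assumption made up for illustration: InputStats, choose_join, and the
replicate_limit threshold are hypothetical names and values, not part of
Pig's API. It only shows how per-input statistics might drive the choice of
join strategy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InputStats:
    """Hypothetical per-input statistics; not a Pig class."""
    sorted_on_join_key: bool
    size_bytes: Optional[int] = None  # None means the size is unknown


def choose_join(left: InputStats, right: InputStats,
                replicate_limit: int = 100 * 1024 * 1024) -> str:
    """Return the name of a join strategy based on both inputs' statistics.

    The 100 MB default for replicate_limit is an arbitrary placeholder.
    """
    # Both inputs are already ordered by the join key, so a merge join
    # can stream through them without an extra sort.
    if left.sorted_on_join_key and right.sorted_on_join_key:
        return "merge"

    # At least one input is known to be small enough to hold in memory.
    known_sizes = [s.size_bytes for s in (left, right)
                   if s.size_bytes is not None]
    if known_sizes and min(known_sizes) <= replicate_limit:
        return "replicated"

    # No useful statistics: fall back to the default strategy.
    return "hash"
```

With this sketch, choose_join(InputStats(True), InputStats(True)) picks the
merge join, and inputs without any statistics fall back to the default. What
I do not know is where such a decision should live in the optimizer, which
is what the two questions above are about.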

Please correct me if I am wrong (which I probably am). Any suggestions or
comments are highly appreciated.
Thanks in advance.


Renato M.
