Re: Review calcite-445

Julian Hyde Sun, 22 Feb 2015 09:09:00 -0800

> On Feb 22, 2015, at 4:11 AM, Vladimir Sitnikov <[email protected]> 
> wrote:
> 
> Hi,
> 
>> This change lays the foundations for streaming queries, such as being able
>> to query both the current contents of a table and future deltas to it
> 
> I am sorry, I cannot yet understand how BINDABLE projects/filters are
> the foundation for chi.
> 
>> to generate a different expression
>> for the two forms of the same table,
> 
> Can you pin-point the part of the code?
> As far as I can find, you always call StreamableTable.stream() before
> usage, so I cannot understand why do you need several expressions.


You can see my various previous attempts in the chi.2 branch. You can't 
understand how broken it was unless you run testStream and try to get it 
working. At one point I went as far as to add a "boolean stream" field to 
TableScan, which would have been awful.

The crux is this: EnumerableTableScan has to generate an Expression, and the 
expression is generated by RelOptTable.getExpression(Class). Therefore when I 
called Table.stream() I was forced to create a new RelOptTableImpl with the a 
new expression that somehow represented the fact that I was asking for the 
stream rather than the base table.

Things only got better when I allowed RelOptTable.getExpression to return null. 
For a long time Jacques has been pointing out that regular tables should not 
need to have linq4j expressions.

Longer term I would like EnumerableTableScan to work only on top of a 
QueryableTable. The expression will come from the QueryableTable and 
RelOptTable.getExpression(Class) will become obsolete.

>> more of the complex runtime logic (e.g. negotiating which filters and
>> projects to push down) into BindableTableScan.
> 
> You mean TableScanNode, don't you? (its create method is a bit too long)
> 
> It is interesting that TableScanNode can rescan _multiple_ times in
> case projects are too aggressive.
> It tuns out to be:
> 1) Functional issue: enumerable1 is not closed in case of rescan. This
> might lead to resource leak.

Yeah, I'm actually quite proud of those multiple attempts to find filters. It 
allows the ProjectableFilterableTable to be flaky (changing its mind about 
which filters it can execute) but will always terminate, and projects as few 
columns as possible.

Each "scan" creates an Enumerable, not an Enumerator. Enumerable does not 
acquire resources and hence does not have a close() method.

> 2) Documentation/design flaw. It probably would be nice if PFT could
> return more projections than asked.

If PFT were able to negotiate projections it would be a different interface, 
and more complex for users to implement.

> In general change in calcite-445 looks good, however as you put in
> javadoc of CalcSplitRule: "Not enabled by default, as it works against
> the usual flow". It is bad cost does not accomodate the number of
> pushed filters, etc.
> 
> Comments regarding the code:
> c1) TableScanNode seems to pretend it knows how to map Class to field
> list. I'm not sure if it is in line with table.rowType.
> https://github.com/julianhyde/incubator-calcite/blob/bd0b606178252c90805aff49851f4c9d5eea1a0a/core/src/main/java/org/apache/calcite/interpreter/TableScanNode.java#L180

This is emulating the behavior of ReflectiveTable and FieldSelector. I put it 
in to make existing tests work.

> c2) ImmutableIntList#identity might probably be better expressed in
> terms of ImmutableIntList#range (that is avoid creation of int[])

I thought about having ImmutableIntList#identity create an 
AbstractList<Integer> (as range does). But for code (such as 
TableScanNode.create2) that requires an ImmutableIntList you'd have to write 
"ImmutableIntList.copyOf(ImmutableIntList.identity(x))". That is verbose, less 
efficient, and ends up creating the int[] in the end.

Julian

Re: Review calcite-445

Reply via email to