Hi,

So what you are saying is that when the TableScan gets a zero-length record
(no projections means no fields are projected, so the record has a length of
zero), it treats that as a request to read all columns instead of zero
columns?  That sounds like a bug in the TableScan.
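
To make the distinction concrete, here is a toy sketch in plain Java of the
behaviour I would expect. It is not Calcite's API: `ProjectionDemo`, `scan`
and `countStar` are made-up names, and the in-memory table is only an
assumption for illustration. A null projection means "nothing was pushed
down, read everything", while an empty projection means "read zero columns",
which is all COUNT(*) needs.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a scan that takes an optional list of projected column indexes. */
final class ProjectionDemo {

  /**
   * Reads rows from an in-memory table.
   *
   * @param projection column indexes to read; null means no projection was
   *     pushed down (read all columns), an empty list means read zero columns
   */
  static List<List<Object>> scan(List<List<Object>> table, List<Integer> projection) {
    List<List<Object>> out = new ArrayList<>();
    for (List<Object> row : table) {
      if (projection == null) {
        // No projection at all: fall back to reading the full row.
        out.add(new ArrayList<>(row));
      } else {
        // Explicit projection, possibly empty: copy only the requested columns.
        List<Object> projected = new ArrayList<>();
        for (int i : projection) {
          projected.add(row.get(i));
        }
        out.add(projected);
      }
    }
    return out;
  }

  /** COUNT(*) depends only on how many rows there are, not on their contents. */
  static long countStar(List<List<Object>> rows) {
    return rows.size();
  }

  public static void main(String[] args) {
    List<List<Object>> foo = List.of(
        List.<Object>of(1, "a", 10.0),
        List.<Object>of(2, "b", 20.0));

    List<List<Object>> zeroColumns = scan(foo, List.of());
    List<List<Object>> allColumns = scan(foo, null);

    System.out.println("count over zero-column rows: " + countStar(zeroColumns));
    System.out.println("width with empty projection: " + zeroColumns.get(0).size());
    System.out.println("width with no projection: " + allColumns.get(0).size());
  }
}
```

In this sketch the count is the same either way; only the amount of data read
per row differs, which is the cost Viliam is describing.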

On Wed, Jan 19, 2022 at 10:32 AM Viliam Durina <[email protected]> wrote:

> The issue here is not with zero-field records. The issue is that when doing
> `SELECT COUNT(*)`, the sql-to-rel conversion doesn't produce a projection
> with just a dummy constant (to have one field); instead there's no
> projection at all, so all fields are read from the TableScan.
>
> Viliam
>
> On Wed, 19 Jan 2022 at 03:08, Julian Hyde <[email protected]> wrote:
>
> > As Stamatis said, we don’t have a consistent policy on zero-length
> > records. But in that thread I logged
> > https://issues.apache.org/jira/browse/CALCITE-4597 to clarify the
> > situation. It would be great if someone worked on it.
> >
> > I see Viliam’s point that it makes physical optimization easier if there
> > is an explicit Project telling you which columns (if any) need to be read
> > from the TableScan. AggregateExtractProjectRule [1] may make it easier to
> > accomplish this. But in the usual case, when this rule is not enabled, I
> > don’t think we should create a Project.
> >
> > Julian
> >
> > [1]
> > https://github.com/apache/calcite/blob/d70583c4a8013f878457f82df6dffddd71875900/core/src/main/java/org/apache/calcite/rel/rules/AggregateExtractProjectRule.java#L53
> >
> >
> > > On Jan 15, 2022, at 2:07 PM, Stamatis Zampetakis <[email protected]> wrote:
> > >
> > > Hi Viliam,
> > >
> > > I don't see a problem with the current plan. It seems correct and more
> > > intuitive than the one with the DUMMY projection.
> > >
> > > LogicalAggregate(group=[{}], EXPR$0=[COUNT($0)])
> > >    LogicalTableScan(table=foo)
> > >
> > > > The code you cited in SqlToRelConverter seems to be an attempt to
> > > > handle empty records/tuples, which we are not handling very well in
> > > > general [1]. It doesn't seem related to performance, which is the
> > > > use case you mentioned.
> > >
> > > Best,
> > > Stamatis
> > >
> > > [1] https://lists.apache.org/thread/dtsz159x4nk3l9b3topgykqpsml024tv
> > >
> > > > On Fri, Jan 14, 2022 at 12:57 PM Viliam Durina <[email protected]> wrote:
> > >
> > > >> I noticed these two pieces of code:
> > >>
> > >> 1. in SqlToRelConverter:
> > >>
> > >>  if (preExprs.size() == 0) {
> > >>    // Special case for COUNT(*), where we can end up with no inputs
> > >>    // at all.  The rest of the system doesn't like 0-tuples, so we
> > >>    // select a dummy constant here.
> > >>    final RexNode zero = rexBuilder.makeExactLiteral(BigDecimal.ZERO);
> > >>    preExprs = ImmutableList.of(Pair.of(zero, null));
> > >>  }
> > >>
> > >> 2. in RelBuilder:
> > >>
> > > >>  // Some parts of the system can't handle rows with zero fields, so
> > > >>  // pretend that one field is used.
> > >>  if (fieldsUsed.isEmpty()) {
> > >>    r = ((Project) r).getInput();
> > >>  }
> > >>
> > > >> They run in this order, and the second overrides the first. The end
> > > >> result is that for the query `SELECT COUNT(*) FROM foo`, the result of
> > > >> sql-to-rel conversion is:
> > >>
> > >> LogicalAggregate(group=[{}], EXPR$0=[COUNT($0)])
> > >>    LogicalTableScan(table=foo)
> > >>
> > >> instead of:
> > >>
> > >> LogicalAggregate(group=[{}], EXPR$0=[COUNT($0)])
> > >>  LogicalProject(DUMMY=[0])
> > >>    LogicalTableScan(table=foo)
> > >>
> > > >> In our implementation we push the projection into the table scan.
> > > >> Without the Project, we fetch full rows, even though the aggregation
> > > >> uses none of their fields.
> > >>
> > > >> The code was introduced in
> > > >> https://issues.apache.org/jira/browse/CALCITE-3763, but maybe it was
> > > >> broken later.
> > >>
> > >> Do you think this is an issue?
> > >>
> > >> Viliam
> > >>
> > >> --
> > >> This message contains confidential information and is intended only
> for
> > >> the
> > >> individuals named. If you are not the named addressee you should not
> > >> disseminate, distribute or copy this e-mail. Please notify the sender
> > >> immediately by e-mail if you have received this e-mail by mistake and
> > >> delete this e-mail from your system. E-mail transmission cannot be
> > >> guaranteed to be secure or error-free as information could be
> > intercepted,
> > >> corrupted, lost, destroyed, arrive late or incomplete, or contain
> > viruses.
> > >> The sender therefore does not accept liability for any errors or
> > omissions
> > >> in the contents of this message, which arise as a result of e-mail
> > >> transmission. If verification is required, please request a hard-copy
> > >> version. -Hazelcast
> > >>
> >
> >
>
