Re: Comprehensive Guide for EXPLAIN

James Taylor Mon, 26 Sep 2016 07:37:47 -0700

Hi Bernard,
No, unfortunately there's no documentation for our explain plan
(PHOENIX-1481).

In your example, it's not fetching everything from the server. If it
fetches everything from the server it would say "FULL TABLE SCAN". In this
case, it's doing a "RANGE SCAN" between these two key ranges:

    [0,'2016-08-27 06:27:17.719'] - [0,*]

I suspect your TEST.TABLE is salted, as the [0] above would represent the
salt byte. Maybe you're using a slightly older version of Phoenix too,
because AFAIR we show the salt bucket differently in later version. The
date is based on now() - 30. So the range is everything greater than that
date.

The 1-CHUNK PARALLEL 1-WAY tells you how much parallelization the client is
doing and over how many scans a merge sort is being done over. Your table
must be only one region, or the

Since you're only selecting columns contained in the row key, Phoenix is
pushing a FirstKeyOnlyFilter, that's what this represents:

    SERVER FILTER BY FIRST KEY ONLY

Then, since you're doing an ungrouped aggregation, this tells you that the
aggregation is happening on the server side:

    SERVER AGGREGATE INTO SINGLE ROW

Assuming your table is salted, the scan will be run on each salt bucket and
then the one row we get back from these scans will be summed together to
calculate the final COUNT(1).

HTH. Thanks,

    James

On Sun, Sep 25, 2016 at 11:36 PM, Bernard Quizon <bernard.quizon@
stellarloyalty.com> wrote:

> Hi,
>
>
> I know explain statement results are relatively easy to understand but is
> there a comprehensive guide for it?
>
>
> 0: jdbc:phoenix:localhost> explain select count(1) from test.table where
> insert_ts > now() - 30;
>
> *+------------------------------------------+*
>
> *| **                  PLAN                  ** |*
>
> *+------------------------------------------+*
>
> *| *CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER TEST.TABLE [0,'2016-08-27
> 06:27:17.719'] - [0,*]* |*
>
> *| *    SERVER FILTER BY FIRST KEY ONLY     * |*
>
> *| *    SERVER AGGREGATE INTO SINGLE ROW    * |*
>
> *+------------------------------------------+ *
>
>
>
> Like for example in the query above from what I understand, it fetches
> everything from the server then applies the key range on client side.
>
> But I’m thinking isn’t it less expensive if key range was already applied
> on the server so less data will be transmitted, thus saving network and
> memory overhead.
>
> Or is my understanding wrong?
>
>
> Thanks
>

Re: Comprehensive Guide for EXPLAIN

Reply via email to