Re: Additional SQL metrics

Denis Magda Fri, 03 Mar 2017 10:55:35 -0800

Sergey, agree, good point!

Igniters, any other thoughts before we wrap up the discussion updating the 
ticket content?


—
Denis

> On Mar 3, 2017, at 10:06 AM, Valentin Kulichenko 
> <[email protected]> wrote:
> 
> Sergey, that's great idea! Generally, user is not interested much in some
> average numbers, especially in case of SQL queries. What they need is a
> list of slow queries and detailed information about the execution flow of
> these particular queries.
> 
> -Val
> 
> On Fri, Mar 3, 2017 at 2:50 AM, Sergey Kozlov <[email protected]> wrote:
> 
>> One more comment:
>> 
>> In general the customer is interested in slow queries details thus we can
>> introduce an option which will allow to store only queries executed more
>> than NNN seconds. It may significantly reduce the the memory consumption
>> for history (but logging of all queries is still available if set that
>> option to 0).
>> 
>> On Fri, Mar 3, 2017 at 1:00 AM, Denis Magda <[email protected]> wrote:
>> 
>>> Vovan,
>>> 
>>> When I’m speaking of JOIN metrics I’m simply assume that we need to add
>>> metrics relevant for queries with joins, metrics that will help us get
>> more
>>> insights on non-collocated and collocated joins execution flow.
>>> 
>>>> 1) Query exec count
>>>> 2) Query exec time (first define what "time" means) - min, max, avg
>>> 
>>> Total query execution time might not be helpful in the trickiest cases.
>>> What if you have multiple joins in your query? How do I know which one
>>> contributes to the execution most?
>>> 
>>> So, I would do a breakdown having total time, map time, per-join time,
>>> reduce time. Hope it’s possible. If it’s unclear how to support
>> everything
>>> at the first place then it’s a different question. Let’s create several
>>> tickets and start implementing everything gracefully.
>>> 
>>>> 3) Number of bytes exchanged between nodes during query execution
>>> 
>>> It will be really helpful to make a breakdown showing a number of bytes
>>> exchanged per-join (physical join). Again, if you believe it makes sense
>> to
>>> do the breakdown later then let’s create an additional ticket for the 2
>>> tier metrics then.
>>> 
>>>> 4) Number of returned rows - min, max, avg
>>> 
>>> 
>>> Plus, let’s add the following to the list:
>>> 
>>> 5) Collocated: yes or no
>>> 
>>> —
>>> Denis
>>> 
>>>> On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <[email protected]>
>>> wrote:
>>>> 
>>>> Denis,
>>>> 
>>>> The main problem with suggested metrics is that they implies that
>> ceratin
>>>> internal mechanics work in predefined way. For example, what is JOIN
>>>> metrics? There are no guarantees that JOIN in user's query will be
>>>> translated to a real physical join. What if several different query
>>>> execution pieces happen in parallel? What if we rework our distributed
>>>> query engine from pull to push approach for performance reasons and
>> there
>>>> will be no JOINs in classical sense?
>>>> 
>>>> This is why I think that we should start with very basic things.
>>> Something
>>>> like:
>>>> 1) Query exec count
>>>> 2) Query exec time (first define what "time" means) - min, max, avg
>>>> 3) Number of bytes exchanged between nodes during query execution
>>>> 4) Number of returned rows - min, max, avg
>>>> 
>>>> Once we have base numbers in place, we can think of carefully
>> integrating
>>>> and enhancing all pieces of query execution into more verbose formats,
>>>> similar to query plans with relative weights in classical RDBMS
>> systems.
>>>> 
>>>> Thoughts?
>>>> 
>>>> 
>>>> 
>>>> On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <[email protected]>
>> wrote:
>>>> 
>>>>> Vovan,
>>>>> 
>>>>> Your metrics make perfect sense to me. However, I see a high demand
>> for
>>>>> JOINs based metrics especially from those who give a try to
>>> non-collocated
>>>>> joins in production  and want to measure them somehow. This is why,
>>>>> personally, I prefer to see the metrics below in the top priority list
>>> as
>>>>> well:
>>>>> 
>>>>> if a query was executed in the collocated or non-collocated mode.
>> Three
>>>>> results are valid: collocated, non-collocated, simple query (no
>> joins).
>>>>> non-collocated query: size of the data exchanged between the nodes to
>>>>> complete a specific join. If there are multiple joins in the query we
>>> need
>>>>> to provide this metric for every of them.
>>>>> non-collocated and collocated query: a part of the time spent joining
>>> the
>>>>> data. If there are multiple joins in the query we need to provide this
>>>>> metric for every of them.
>>>>> 
>>>>> As for “unicast” and “broadcast”, agree, let’s ignore it for now.
>>>>> 
>>>>> In any case, can we include timing information (map phase, reduce
>> phase,
>>>>> join phase) into an execution plan produced by H2? Are there any
>>>>> implementation hooks?
>>>>> 
>>>>> —
>>>>> Denis
>>>>> 
>>>>> 
>>>>>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <
>> [email protected]>
>>>>> wrote:
>>>>>> 
>>>>>> I think some of the metrics specified by Denis also make sense, so I
>>>>> would
>>>>>> add them as well. See below...
>>>>>> 
>>>>>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <
>> [email protected]
>>>>> <mailto:[email protected]>>
>>>>>> wrote:
>>>>>> 
>>>>>>> Denis,
>>>>>>> 
>>>>>>> Query execution is complex process involving different stages which
>>> are
>>>>> not
>>>>>>> very easy to match with each other. Especially provided that any
>> node
>>>>> can
>>>>>>> leave topology at any time. Another problem is that engine evolves
>> and
>>>>>>> metrics like "did a query do broadcast or unicast" may easily become
>>>>>>> useless at some point, because for example there will be neither
>>>>> unicast,
>>>>>>> nor broadast, but something different. On the other hand I
>> completely
>>>>> agree
>>>>>>> that performance monitoring is essential part of any mature DBMS.
>>>>>>> 
>>>>>>> I would start with metrics which are both very basic and easy to
>>>>> implement
>>>>>>> at the same time. For example we can add fingerprint (hash) to every
>>>>> query
>>>>>>> which will be used to join "map" and "reduce" parts with each other
>>> and
>>>>> add
>>>>>>> the following basic metrics:
>>>>>>> 1) Execution count for particular query
>>>>>>> 2) Number of map nodes - min, max, avg
>>>>>>> 
>>>>>> 
>>>>>> (1) and (2) makes sense
>>>>>> 
>>>>>> 
>>>>>>> 3) Map step duration (if applicable) - min, max,
>>>>>> 
>>>>>> 4) Reduce step duration (if applicable) - min, max, avg
>>>>>>> 
>>>>>> 
>>>>>> Not sure if (3) and (4) are needed. I would only add them if they are
>>>>> easy
>>>>>> to implement.
>>>>>> 
>>>>>> I would also add these:
>>>>>> 
>>>>>> 5) Collocated: yes/no
>>>>>> 6) last execution time
>>>>>> 7) min/max/average execution duration
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> Once done users will be able to get statistics for particular
>> queries.
>>>>>>> 
>>>>>>> Vladimir.
>>>>>>> 
>>>>>>> 
>>>>>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[email protected]>
>>> wrote:
>>>>>>> 
>>>>>>>> BTW,
>>>>>>>> 
>>>>>>>> What if we expose per-query metrics below as a part of EXPLAIN
>>> ANALYZE?
>>>>>>>> Sergi, is this feasible?
>>>>>>>> 
>>>>>>>> —
>>>>>>>> Denis
>>>>>>>> 
>>>>>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[email protected]>
>> wrote:
>>>>>>>>> 
>>>>>>>>> Igniters,
>>>>>>>>> 
>>>>>>>>> Let’s shed more light on SQL query execution internals
>> introducing a
>>>>>>> set
>>>>>>>> of useful metrics (https://issues.apache.org/
>> jira/browse/IGNITE-4757
>>> ).
>>>>>>>>> 
>>>>>>>>> Per-query metrics. Total history size is defined by
>>>>>>> *CacheConfiguration.
>>>>>>>> getQueryDetailMetricsSize*:
>>>>>>>>> * if a query was executed in the collocated or non-collocated
>> mode.
>>>>>>>> Three results are valid: collocated, non-collocated, simple query
>> (no
>>>>>>>> joins).
>>>>>>>>> * non-collocated query: size of the data exchanged between the
>> nodes
>>>>> to
>>>>>>>> complete a join.
>>>>>>>>> * non-collocated query: did a query do broadcast or unicast to get
>>>>> data
>>>>>>>> needed to complete a join.
>>>>>>>>> * non-collocated and collocated query: a part of the time spent
>>>>> joining
>>>>>>>> the data.
>>>>>>>>> 
>>>>>>>>> CacheMetrics:
>>>>>>>>> * an average number of executed SQL queries (collocated,
>>>>>>> non-collocated,
>>>>>>>> simple query (no joins)).
>>>>>>>>> 
>>>>>>>>> Please don’t hesitate do share suggest another metrics or improve
>>>>>>>> proposed ones.
>>>>>>>>> 
>>>>>>>>> —
>>>>>>>>> Denis
>>>>> 
>>>>> 
>>> 
>>> 
>> 
>> 
>> --
>> Sergey Kozlov
>> GridGain Systems
>> www.gridgain.com
>>

Re: Additional SQL metrics

Reply via email to