Sergey, that's great idea! Generally, user is not interested much in some
average numbers, especially in case of SQL queries. What they need is a
list of slow queries and detailed information about the execution flow of
these particular queries.

-Val

On Fri, Mar 3, 2017 at 2:50 AM, Sergey Kozlov <[email protected]> wrote:

> One more comment:
>
> In general the customer is interested in slow queries details thus we can
> introduce an option which will allow to store only queries executed more
> than NNN seconds. It may significantly reduce the the memory consumption
> for history (but logging of all queries is still available if set that
> option to 0).
>
> On Fri, Mar 3, 2017 at 1:00 AM, Denis Magda <[email protected]> wrote:
>
> > Vovan,
> >
> > When I’m speaking of JOIN metrics I’m simply assume that we need to add
> > metrics relevant for queries with joins, metrics that will help us get
> more
> > insights on non-collocated and collocated joins execution flow.
> >
> > > 1) Query exec count
> > > 2) Query exec time (first define what "time" means) - min, max, avg
> >
> > Total query execution time might not be helpful in the trickiest cases.
> > What if you have multiple joins in your query? How do I know which one
> > contributes to the execution most?
> >
> > So, I would do a breakdown having total time, map time, per-join time,
> > reduce time. Hope it’s possible. If it’s unclear how to support
> everything
> > at the first place then it’s a different question. Let’s create several
> > tickets and start implementing everything gracefully.
> >
> > > 3) Number of bytes exchanged between nodes during query execution
> >
> > It will be really helpful to make a breakdown showing a number of bytes
> > exchanged per-join (physical join). Again, if you believe it makes sense
> to
> > do the breakdown later then let’s create an additional ticket for the 2
> > tier metrics then.
> >
> > > 4) Number of returned rows - min, max, avg
> >
> >
> > Plus, let’s add the following to the list:
> >
> > 5) Collocated: yes or no
> >
> > —
> > Denis
> >
> > > On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <[email protected]>
> > wrote:
> > >
> > > Denis,
> > >
> > > The main problem with suggested metrics is that they implies that
> ceratin
> > > internal mechanics work in predefined way. For example, what is JOIN
> > > metrics? There are no guarantees that JOIN in user's query will be
> > > translated to a real physical join. What if several different query
> > > execution pieces happen in parallel? What if we rework our distributed
> > > query engine from pull to push approach for performance reasons and
> there
> > > will be no JOINs in classical sense?
> > >
> > > This is why I think that we should start with very basic things.
> > Something
> > > like:
> > > 1) Query exec count
> > > 2) Query exec time (first define what "time" means) - min, max, avg
> > > 3) Number of bytes exchanged between nodes during query execution
> > > 4) Number of returned rows - min, max, avg
> > >
> > > Once we have base numbers in place, we can think of carefully
> integrating
> > > and enhancing all pieces of query execution into more verbose formats,
> > > similar to query plans with relative weights in classical RDBMS
> systems.
> > >
> > > Thoughts?
> > >
> > >
> > >
> > > On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <[email protected]>
> wrote:
> > >
> > >> Vovan,
> > >>
> > >> Your metrics make perfect sense to me. However, I see a high demand
> for
> > >> JOINs based metrics especially from those who give a try to
> > non-collocated
> > >> joins in production  and want to measure them somehow. This is why,
> > >> personally, I prefer to see the metrics below in the top priority list
> > as
> > >> well:
> > >>
> > >> if a query was executed in the collocated or non-collocated mode.
> Three
> > >> results are valid: collocated, non-collocated, simple query (no
> joins).
> > >> non-collocated query: size of the data exchanged between the nodes to
> > >> complete a specific join. If there are multiple joins in the query we
> > need
> > >> to provide this metric for every of them.
> > >> non-collocated and collocated query: a part of the time spent joining
> > the
> > >> data. If there are multiple joins in the query we need to provide this
> > >> metric for every of them.
> > >>
> > >> As for “unicast” and “broadcast”, agree, let’s ignore it for now.
> > >>
> > >> In any case, can we include timing information (map phase, reduce
> phase,
> > >> join phase) into an execution plan produced by H2? Are there any
> > >> implementation hooks?
> > >>
> > >> —
> > >> Denis
> > >>
> > >>
> > >>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <
> [email protected]>
> > >> wrote:
> > >>>
> > >>> I think some of the metrics specified by Denis also make sense, so I
> > >> would
> > >>> add them as well. See below...
> > >>>
> > >>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <
> [email protected]
> > >> <mailto:[email protected]>>
> > >>> wrote:
> > >>>
> > >>>> Denis,
> > >>>>
> > >>>> Query execution is complex process involving different stages which
> > are
> > >> not
> > >>>> very easy to match with each other. Especially provided that any
> node
> > >> can
> > >>>> leave topology at any time. Another problem is that engine evolves
> and
> > >>>> metrics like "did a query do broadcast or unicast" may easily become
> > >>>> useless at some point, because for example there will be neither
> > >> unicast,
> > >>>> nor broadast, but something different. On the other hand I
> completely
> > >> agree
> > >>>> that performance monitoring is essential part of any mature DBMS.
> > >>>>
> > >>>> I would start with metrics which are both very basic and easy to
> > >> implement
> > >>>> at the same time. For example we can add fingerprint (hash) to every
> > >> query
> > >>>> which will be used to join "map" and "reduce" parts with each other
> > and
> > >> add
> > >>>> the following basic metrics:
> > >>>> 1) Execution count for particular query
> > >>>> 2) Number of map nodes - min, max, avg
> > >>>>
> > >>>
> > >>> (1) and (2) makes sense
> > >>>
> > >>>
> > >>>> 3) Map step duration (if applicable) - min, max,
> > >>>
> > >>> 4) Reduce step duration (if applicable) - min, max, avg
> > >>>>
> > >>>
> > >>> Not sure if (3) and (4) are needed. I would only add them if they are
> > >> easy
> > >>> to implement.
> > >>>
> > >>> I would also add these:
> > >>>
> > >>> 5) Collocated: yes/no
> > >>> 6) last execution time
> > >>> 7) min/max/average execution duration
> > >>>
> > >>>
> > >>>>
> > >>>> Once done users will be able to get statistics for particular
> queries.
> > >>>>
> > >>>> Vladimir.
> > >>>>
> > >>>>
> > >>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[email protected]>
> > wrote:
> > >>>>
> > >>>>> BTW,
> > >>>>>
> > >>>>> What if we expose per-query metrics below as a part of EXPLAIN
> > ANALYZE?
> > >>>>> Sergi, is this feasible?
> > >>>>>
> > >>>>> —
> > >>>>> Denis
> > >>>>>
> > >>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[email protected]>
> wrote:
> > >>>>>>
> > >>>>>> Igniters,
> > >>>>>>
> > >>>>>> Let’s shed more light on SQL query execution internals
> introducing a
> > >>>> set
> > >>>>> of useful metrics (https://issues.apache.org/
> jira/browse/IGNITE-4757
> > ).
> > >>>>>>
> > >>>>>> Per-query metrics. Total history size is defined by
> > >>>> *CacheConfiguration.
> > >>>>> getQueryDetailMetricsSize*:
> > >>>>>> * if a query was executed in the collocated or non-collocated
> mode.
> > >>>>> Three results are valid: collocated, non-collocated, simple query
> (no
> > >>>>> joins).
> > >>>>>> * non-collocated query: size of the data exchanged between the
> nodes
> > >> to
> > >>>>> complete a join.
> > >>>>>> * non-collocated query: did a query do broadcast or unicast to get
> > >> data
> > >>>>> needed to complete a join.
> > >>>>>> * non-collocated and collocated query: a part of the time spent
> > >> joining
> > >>>>> the data.
> > >>>>>>
> > >>>>>> CacheMetrics:
> > >>>>>> * an average number of executed SQL queries (collocated,
> > >>>> non-collocated,
> > >>>>> simple query (no joins)).
> > >>>>>>
> > >>>>>> Please don’t hesitate do share suggest another metrics or improve
> > >>>>> proposed ones.
> > >>>>>>
> > >>>>>> —
> > >>>>>> Denis
> > >>
> > >>
> >
> >
>
>
> --
> Sergey Kozlov
> GridGain Systems
> www.gridgain.com
>

Reply via email to