On Fri, Mar 3, 2017 at 11:07 AM, Denis Magda <dma...@apache.org> wrote:

> What you’re saying should be default behavior. Plus, we can add a special
> parameter that will gather metrics for queries executed longer that N.
>

Agree. I thought we already have that in some form. I remember seeing a
warning for long queries in the log.


>
> —
> Denis
>
> > On Mar 3, 2017, at 11:01 AM, Dmitriy Setrakyan <dsetrak...@apache.org>
> wrote:
> >
> > Hm... as a user I would be interested to know that, say, 95% of my
> "select
> > * from sometable where..." query executes under 10ms or so.
> >
> > I think holding some history is important and is not that hard to
> implement.
> >
> > D.
> >
> > On Fri, Mar 3, 2017 at 10:55 AM, Denis Magda <dma...@apache.org> wrote:
> >
> >> Sergey, agree, good point!
> >>
> >> Igniters, any other thoughts before we wrap up the discussion updating
> the
> >> ticket content?
> >>
> >> —
> >> Denis
> >>
> >>> On Mar 3, 2017, at 10:06 AM, Valentin Kulichenko <
> >> valentin.kuliche...@gmail.com> wrote:
> >>>
> >>> Sergey, that's great idea! Generally, user is not interested much in
> some
> >>> average numbers, especially in case of SQL queries. What they need is a
> >>> list of slow queries and detailed information about the execution flow
> of
> >>> these particular queries.
> >>>
> >>> -Val
> >>>
> >>> On Fri, Mar 3, 2017 at 2:50 AM, Sergey Kozlov <skoz...@gridgain.com>
> >> wrote:
> >>>
> >>>> One more comment:
> >>>>
> >>>> In general the customer is interested in slow queries details thus we
> >> can
> >>>> introduce an option which will allow to store only queries executed
> more
> >>>> than NNN seconds. It may significantly reduce the the memory
> consumption
> >>>> for history (but logging of all queries is still available if set that
> >>>> option to 0).
> >>>>
> >>>> On Fri, Mar 3, 2017 at 1:00 AM, Denis Magda <dma...@apache.org>
> wrote:
> >>>>
> >>>>> Vovan,
> >>>>>
> >>>>> When I’m speaking of JOIN metrics I’m simply assume that we need to
> add
> >>>>> metrics relevant for queries with joins, metrics that will help us
> get
> >>>> more
> >>>>> insights on non-collocated and collocated joins execution flow.
> >>>>>
> >>>>>> 1) Query exec count
> >>>>>> 2) Query exec time (first define what "time" means) - min, max, avg
> >>>>>
> >>>>> Total query execution time might not be helpful in the trickiest
> cases.
> >>>>> What if you have multiple joins in your query? How do I know which
> one
> >>>>> contributes to the execution most?
> >>>>>
> >>>>> So, I would do a breakdown having total time, map time, per-join
> time,
> >>>>> reduce time. Hope it’s possible. If it’s unclear how to support
> >>>> everything
> >>>>> at the first place then it’s a different question. Let’s create
> several
> >>>>> tickets and start implementing everything gracefully.
> >>>>>
> >>>>>> 3) Number of bytes exchanged between nodes during query execution
> >>>>>
> >>>>> It will be really helpful to make a breakdown showing a number of
> bytes
> >>>>> exchanged per-join (physical join). Again, if you believe it makes
> >> sense
> >>>> to
> >>>>> do the breakdown later then let’s create an additional ticket for
> the 2
> >>>>> tier metrics then.
> >>>>>
> >>>>>> 4) Number of returned rows - min, max, avg
> >>>>>
> >>>>>
> >>>>> Plus, let’s add the following to the list:
> >>>>>
> >>>>> 5) Collocated: yes or no
> >>>>>
> >>>>> —
> >>>>> Denis
> >>>>>
> >>>>>> On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <voze...@gridgain.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> Denis,
> >>>>>>
> >>>>>> The main problem with suggested metrics is that they implies that
> >>>> ceratin
> >>>>>> internal mechanics work in predefined way. For example, what is JOIN
> >>>>>> metrics? There are no guarantees that JOIN in user's query will be
> >>>>>> translated to a real physical join. What if several different query
> >>>>>> execution pieces happen in parallel? What if we rework our
> distributed
> >>>>>> query engine from pull to push approach for performance reasons and
> >>>> there
> >>>>>> will be no JOINs in classical sense?
> >>>>>>
> >>>>>> This is why I think that we should start with very basic things.
> >>>>> Something
> >>>>>> like:
> >>>>>> 1) Query exec count
> >>>>>> 2) Query exec time (first define what "time" means) - min, max, avg
> >>>>>> 3) Number of bytes exchanged between nodes during query execution
> >>>>>> 4) Number of returned rows - min, max, avg
> >>>>>>
> >>>>>> Once we have base numbers in place, we can think of carefully
> >>>> integrating
> >>>>>> and enhancing all pieces of query execution into more verbose
> formats,
> >>>>>> similar to query plans with relative weights in classical RDBMS
> >>>> systems.
> >>>>>>
> >>>>>> Thoughts?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <dma...@apache.org>
> >>>> wrote:
> >>>>>>
> >>>>>>> Vovan,
> >>>>>>>
> >>>>>>> Your metrics make perfect sense to me. However, I see a high demand
> >>>> for
> >>>>>>> JOINs based metrics especially from those who give a try to
> >>>>> non-collocated
> >>>>>>> joins in production  and want to measure them somehow. This is why,
> >>>>>>> personally, I prefer to see the metrics below in the top priority
> >> list
> >>>>> as
> >>>>>>> well:
> >>>>>>>
> >>>>>>> if a query was executed in the collocated or non-collocated mode.
> >>>> Three
> >>>>>>> results are valid: collocated, non-collocated, simple query (no
> >>>> joins).
> >>>>>>> non-collocated query: size of the data exchanged between the nodes
> to
> >>>>>>> complete a specific join. If there are multiple joins in the query
> we
> >>>>> need
> >>>>>>> to provide this metric for every of them.
> >>>>>>> non-collocated and collocated query: a part of the time spent
> joining
> >>>>> the
> >>>>>>> data. If there are multiple joins in the query we need to provide
> >> this
> >>>>>>> metric for every of them.
> >>>>>>>
> >>>>>>> As for “unicast” and “broadcast”, agree, let’s ignore it for now.
> >>>>>>>
> >>>>>>> In any case, can we include timing information (map phase, reduce
> >>>> phase,
> >>>>>>> join phase) into an execution plan produced by H2? Are there any
> >>>>>>> implementation hooks?
> >>>>>>>
> >>>>>>> —
> >>>>>>> Denis
> >>>>>>>
> >>>>>>>
> >>>>>>>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <
> >>>> dsetrak...@apache.org>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> I think some of the metrics specified by Denis also make sense,
> so I
> >>>>>>> would
> >>>>>>>> add them as well. See below...
> >>>>>>>>
> >>>>>>>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <
> >>>> voze...@gridgain.com
> >>>>>>> <mailto:voze...@gridgain.com>>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Denis,
> >>>>>>>>>
> >>>>>>>>> Query execution is complex process involving different stages
> which
> >>>>> are
> >>>>>>> not
> >>>>>>>>> very easy to match with each other. Especially provided that any
> >>>> node
> >>>>>>> can
> >>>>>>>>> leave topology at any time. Another problem is that engine
> evolves
> >>>> and
> >>>>>>>>> metrics like "did a query do broadcast or unicast" may easily
> >> become
> >>>>>>>>> useless at some point, because for example there will be neither
> >>>>>>> unicast,
> >>>>>>>>> nor broadast, but something different. On the other hand I
> >>>> completely
> >>>>>>> agree
> >>>>>>>>> that performance monitoring is essential part of any mature DBMS.
> >>>>>>>>>
> >>>>>>>>> I would start with metrics which are both very basic and easy to
> >>>>>>> implement
> >>>>>>>>> at the same time. For example we can add fingerprint (hash) to
> >> every
> >>>>>>> query
> >>>>>>>>> which will be used to join "map" and "reduce" parts with each
> other
> >>>>> and
> >>>>>>> add
> >>>>>>>>> the following basic metrics:
> >>>>>>>>> 1) Execution count for particular query
> >>>>>>>>> 2) Number of map nodes - min, max, avg
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> (1) and (2) makes sense
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> 3) Map step duration (if applicable) - min, max,
> >>>>>>>>
> >>>>>>>> 4) Reduce step duration (if applicable) - min, max, avg
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Not sure if (3) and (4) are needed. I would only add them if they
> >> are
> >>>>>>> easy
> >>>>>>>> to implement.
> >>>>>>>>
> >>>>>>>> I would also add these:
> >>>>>>>>
> >>>>>>>> 5) Collocated: yes/no
> >>>>>>>> 6) last execution time
> >>>>>>>> 7) min/max/average execution duration
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Once done users will be able to get statistics for particular
> >>>> queries.
> >>>>>>>>>
> >>>>>>>>> Vladimir.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <dma...@apache.org>
> >>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> BTW,
> >>>>>>>>>>
> >>>>>>>>>> What if we expose per-query metrics below as a part of EXPLAIN
> >>>>> ANALYZE?
> >>>>>>>>>> Sergi, is this feasible?
> >>>>>>>>>>
> >>>>>>>>>> —
> >>>>>>>>>> Denis
> >>>>>>>>>>
> >>>>>>>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <dma...@apache.org>
> >>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Igniters,
> >>>>>>>>>>>
> >>>>>>>>>>> Let’s shed more light on SQL query execution internals
> >>>> introducing a
> >>>>>>>>> set
> >>>>>>>>>> of useful metrics (https://issues.apache.org/
> >>>> jira/browse/IGNITE-4757
> >>>>> ).
> >>>>>>>>>>>
> >>>>>>>>>>> Per-query metrics. Total history size is defined by
> >>>>>>>>> *CacheConfiguration.
> >>>>>>>>>> getQueryDetailMetricsSize*:
> >>>>>>>>>>> * if a query was executed in the collocated or non-collocated
> >>>> mode.
> >>>>>>>>>> Three results are valid: collocated, non-collocated, simple
> query
> >>>> (no
> >>>>>>>>>> joins).
> >>>>>>>>>>> * non-collocated query: size of the data exchanged between the
> >>>> nodes
> >>>>>>> to
> >>>>>>>>>> complete a join.
> >>>>>>>>>>> * non-collocated query: did a query do broadcast or unicast to
> >> get
> >>>>>>> data
> >>>>>>>>>> needed to complete a join.
> >>>>>>>>>>> * non-collocated and collocated query: a part of the time spent
> >>>>>>> joining
> >>>>>>>>>> the data.
> >>>>>>>>>>>
> >>>>>>>>>>> CacheMetrics:
> >>>>>>>>>>> * an average number of executed SQL queries (collocated,
> >>>>>>>>> non-collocated,
> >>>>>>>>>> simple query (no joins)).
> >>>>>>>>>>>
> >>>>>>>>>>> Please don’t hesitate do share suggest another metrics or
> improve
> >>>>>>>>>> proposed ones.
> >>>>>>>>>>>
> >>>>>>>>>>> —
> >>>>>>>>>>> Denis
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Sergey Kozlov
> >>>> GridGain Systems
> >>>> www.gridgain.com
> >>>>
> >>
> >>
>
>

Reply via email to