What you’re saying should be default behavior. Plus, we can add a special parameter that will gather metrics for queries executed longer that N.
— Denis > On Mar 3, 2017, at 11:01 AM, Dmitriy Setrakyan <[email protected]> wrote: > > Hm... as a user I would be interested to know that, say, 95% of my "select > * from sometable where..." query executes under 10ms or so. > > I think holding some history is important and is not that hard to implement. > > D. > > On Fri, Mar 3, 2017 at 10:55 AM, Denis Magda <[email protected]> wrote: > >> Sergey, agree, good point! >> >> Igniters, any other thoughts before we wrap up the discussion updating the >> ticket content? >> >> — >> Denis >> >>> On Mar 3, 2017, at 10:06 AM, Valentin Kulichenko < >> [email protected]> wrote: >>> >>> Sergey, that's great idea! Generally, user is not interested much in some >>> average numbers, especially in case of SQL queries. What they need is a >>> list of slow queries and detailed information about the execution flow of >>> these particular queries. >>> >>> -Val >>> >>> On Fri, Mar 3, 2017 at 2:50 AM, Sergey Kozlov <[email protected]> >> wrote: >>> >>>> One more comment: >>>> >>>> In general the customer is interested in slow queries details thus we >> can >>>> introduce an option which will allow to store only queries executed more >>>> than NNN seconds. It may significantly reduce the the memory consumption >>>> for history (but logging of all queries is still available if set that >>>> option to 0). >>>> >>>> On Fri, Mar 3, 2017 at 1:00 AM, Denis Magda <[email protected]> wrote: >>>> >>>>> Vovan, >>>>> >>>>> When I’m speaking of JOIN metrics I’m simply assume that we need to add >>>>> metrics relevant for queries with joins, metrics that will help us get >>>> more >>>>> insights on non-collocated and collocated joins execution flow. >>>>> >>>>>> 1) Query exec count >>>>>> 2) Query exec time (first define what "time" means) - min, max, avg >>>>> >>>>> Total query execution time might not be helpful in the trickiest cases. >>>>> What if you have multiple joins in your query? How do I know which one >>>>> contributes to the execution most? >>>>> >>>>> So, I would do a breakdown having total time, map time, per-join time, >>>>> reduce time. Hope it’s possible. If it’s unclear how to support >>>> everything >>>>> at the first place then it’s a different question. Let’s create several >>>>> tickets and start implementing everything gracefully. >>>>> >>>>>> 3) Number of bytes exchanged between nodes during query execution >>>>> >>>>> It will be really helpful to make a breakdown showing a number of bytes >>>>> exchanged per-join (physical join). Again, if you believe it makes >> sense >>>> to >>>>> do the breakdown later then let’s create an additional ticket for the 2 >>>>> tier metrics then. >>>>> >>>>>> 4) Number of returned rows - min, max, avg >>>>> >>>>> >>>>> Plus, let’s add the following to the list: >>>>> >>>>> 5) Collocated: yes or no >>>>> >>>>> — >>>>> Denis >>>>> >>>>>> On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <[email protected]> >>>>> wrote: >>>>>> >>>>>> Denis, >>>>>> >>>>>> The main problem with suggested metrics is that they implies that >>>> ceratin >>>>>> internal mechanics work in predefined way. For example, what is JOIN >>>>>> metrics? There are no guarantees that JOIN in user's query will be >>>>>> translated to a real physical join. What if several different query >>>>>> execution pieces happen in parallel? What if we rework our distributed >>>>>> query engine from pull to push approach for performance reasons and >>>> there >>>>>> will be no JOINs in classical sense? >>>>>> >>>>>> This is why I think that we should start with very basic things. >>>>> Something >>>>>> like: >>>>>> 1) Query exec count >>>>>> 2) Query exec time (first define what "time" means) - min, max, avg >>>>>> 3) Number of bytes exchanged between nodes during query execution >>>>>> 4) Number of returned rows - min, max, avg >>>>>> >>>>>> Once we have base numbers in place, we can think of carefully >>>> integrating >>>>>> and enhancing all pieces of query execution into more verbose formats, >>>>>> similar to query plans with relative weights in classical RDBMS >>>> systems. >>>>>> >>>>>> Thoughts? >>>>>> >>>>>> >>>>>> >>>>>> On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <[email protected]> >>>> wrote: >>>>>> >>>>>>> Vovan, >>>>>>> >>>>>>> Your metrics make perfect sense to me. However, I see a high demand >>>> for >>>>>>> JOINs based metrics especially from those who give a try to >>>>> non-collocated >>>>>>> joins in production and want to measure them somehow. This is why, >>>>>>> personally, I prefer to see the metrics below in the top priority >> list >>>>> as >>>>>>> well: >>>>>>> >>>>>>> if a query was executed in the collocated or non-collocated mode. >>>> Three >>>>>>> results are valid: collocated, non-collocated, simple query (no >>>> joins). >>>>>>> non-collocated query: size of the data exchanged between the nodes to >>>>>>> complete a specific join. If there are multiple joins in the query we >>>>> need >>>>>>> to provide this metric for every of them. >>>>>>> non-collocated and collocated query: a part of the time spent joining >>>>> the >>>>>>> data. If there are multiple joins in the query we need to provide >> this >>>>>>> metric for every of them. >>>>>>> >>>>>>> As for “unicast” and “broadcast”, agree, let’s ignore it for now. >>>>>>> >>>>>>> In any case, can we include timing information (map phase, reduce >>>> phase, >>>>>>> join phase) into an execution plan produced by H2? Are there any >>>>>>> implementation hooks? >>>>>>> >>>>>>> — >>>>>>> Denis >>>>>>> >>>>>>> >>>>>>>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan < >>>> [email protected]> >>>>>>> wrote: >>>>>>>> >>>>>>>> I think some of the metrics specified by Denis also make sense, so I >>>>>>> would >>>>>>>> add them as well. See below... >>>>>>>> >>>>>>>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov < >>>> [email protected] >>>>>>> <mailto:[email protected]>> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Denis, >>>>>>>>> >>>>>>>>> Query execution is complex process involving different stages which >>>>> are >>>>>>> not >>>>>>>>> very easy to match with each other. Especially provided that any >>>> node >>>>>>> can >>>>>>>>> leave topology at any time. Another problem is that engine evolves >>>> and >>>>>>>>> metrics like "did a query do broadcast or unicast" may easily >> become >>>>>>>>> useless at some point, because for example there will be neither >>>>>>> unicast, >>>>>>>>> nor broadast, but something different. On the other hand I >>>> completely >>>>>>> agree >>>>>>>>> that performance monitoring is essential part of any mature DBMS. >>>>>>>>> >>>>>>>>> I would start with metrics which are both very basic and easy to >>>>>>> implement >>>>>>>>> at the same time. For example we can add fingerprint (hash) to >> every >>>>>>> query >>>>>>>>> which will be used to join "map" and "reduce" parts with each other >>>>> and >>>>>>> add >>>>>>>>> the following basic metrics: >>>>>>>>> 1) Execution count for particular query >>>>>>>>> 2) Number of map nodes - min, max, avg >>>>>>>>> >>>>>>>> >>>>>>>> (1) and (2) makes sense >>>>>>>> >>>>>>>> >>>>>>>>> 3) Map step duration (if applicable) - min, max, >>>>>>>> >>>>>>>> 4) Reduce step duration (if applicable) - min, max, avg >>>>>>>>> >>>>>>>> >>>>>>>> Not sure if (3) and (4) are needed. I would only add them if they >> are >>>>>>> easy >>>>>>>> to implement. >>>>>>>> >>>>>>>> I would also add these: >>>>>>>> >>>>>>>> 5) Collocated: yes/no >>>>>>>> 6) last execution time >>>>>>>> 7) min/max/average execution duration >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Once done users will be able to get statistics for particular >>>> queries. >>>>>>>>> >>>>>>>>> Vladimir. >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[email protected]> >>>>> wrote: >>>>>>>>> >>>>>>>>>> BTW, >>>>>>>>>> >>>>>>>>>> What if we expose per-query metrics below as a part of EXPLAIN >>>>> ANALYZE? >>>>>>>>>> Sergi, is this feasible? >>>>>>>>>> >>>>>>>>>> — >>>>>>>>>> Denis >>>>>>>>>> >>>>>>>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[email protected]> >>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Igniters, >>>>>>>>>>> >>>>>>>>>>> Let’s shed more light on SQL query execution internals >>>> introducing a >>>>>>>>> set >>>>>>>>>> of useful metrics (https://issues.apache.org/ >>>> jira/browse/IGNITE-4757 >>>>> ). >>>>>>>>>>> >>>>>>>>>>> Per-query metrics. Total history size is defined by >>>>>>>>> *CacheConfiguration. >>>>>>>>>> getQueryDetailMetricsSize*: >>>>>>>>>>> * if a query was executed in the collocated or non-collocated >>>> mode. >>>>>>>>>> Three results are valid: collocated, non-collocated, simple query >>>> (no >>>>>>>>>> joins). >>>>>>>>>>> * non-collocated query: size of the data exchanged between the >>>> nodes >>>>>>> to >>>>>>>>>> complete a join. >>>>>>>>>>> * non-collocated query: did a query do broadcast or unicast to >> get >>>>>>> data >>>>>>>>>> needed to complete a join. >>>>>>>>>>> * non-collocated and collocated query: a part of the time spent >>>>>>> joining >>>>>>>>>> the data. >>>>>>>>>>> >>>>>>>>>>> CacheMetrics: >>>>>>>>>>> * an average number of executed SQL queries (collocated, >>>>>>>>> non-collocated, >>>>>>>>>> simple query (no joins)). >>>>>>>>>>> >>>>>>>>>>> Please don’t hesitate do share suggest another metrics or improve >>>>>>>>>> proposed ones. >>>>>>>>>>> >>>>>>>>>>> — >>>>>>>>>>> Denis >>>>>>> >>>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> Sergey Kozlov >>>> GridGain Systems >>>> www.gridgain.com >>>> >> >>
