Sergey, agree, good point! Igniters, any other thoughts before we wrap up the discussion updating the ticket content?
— Denis > On Mar 3, 2017, at 10:06 AM, Valentin Kulichenko > <valentin.kuliche...@gmail.com> wrote: > > Sergey, that's great idea! Generally, user is not interested much in some > average numbers, especially in case of SQL queries. What they need is a > list of slow queries and detailed information about the execution flow of > these particular queries. > > -Val > > On Fri, Mar 3, 2017 at 2:50 AM, Sergey Kozlov <skoz...@gridgain.com> wrote: > >> One more comment: >> >> In general the customer is interested in slow queries details thus we can >> introduce an option which will allow to store only queries executed more >> than NNN seconds. It may significantly reduce the the memory consumption >> for history (but logging of all queries is still available if set that >> option to 0). >> >> On Fri, Mar 3, 2017 at 1:00 AM, Denis Magda <dma...@apache.org> wrote: >> >>> Vovan, >>> >>> When I’m speaking of JOIN metrics I’m simply assume that we need to add >>> metrics relevant for queries with joins, metrics that will help us get >> more >>> insights on non-collocated and collocated joins execution flow. >>> >>>> 1) Query exec count >>>> 2) Query exec time (first define what "time" means) - min, max, avg >>> >>> Total query execution time might not be helpful in the trickiest cases. >>> What if you have multiple joins in your query? How do I know which one >>> contributes to the execution most? >>> >>> So, I would do a breakdown having total time, map time, per-join time, >>> reduce time. Hope it’s possible. If it’s unclear how to support >> everything >>> at the first place then it’s a different question. Let’s create several >>> tickets and start implementing everything gracefully. >>> >>>> 3) Number of bytes exchanged between nodes during query execution >>> >>> It will be really helpful to make a breakdown showing a number of bytes >>> exchanged per-join (physical join). Again, if you believe it makes sense >> to >>> do the breakdown later then let’s create an additional ticket for the 2 >>> tier metrics then. >>> >>>> 4) Number of returned rows - min, max, avg >>> >>> >>> Plus, let’s add the following to the list: >>> >>> 5) Collocated: yes or no >>> >>> — >>> Denis >>> >>>> On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <voze...@gridgain.com> >>> wrote: >>>> >>>> Denis, >>>> >>>> The main problem with suggested metrics is that they implies that >> ceratin >>>> internal mechanics work in predefined way. For example, what is JOIN >>>> metrics? There are no guarantees that JOIN in user's query will be >>>> translated to a real physical join. What if several different query >>>> execution pieces happen in parallel? What if we rework our distributed >>>> query engine from pull to push approach for performance reasons and >> there >>>> will be no JOINs in classical sense? >>>> >>>> This is why I think that we should start with very basic things. >>> Something >>>> like: >>>> 1) Query exec count >>>> 2) Query exec time (first define what "time" means) - min, max, avg >>>> 3) Number of bytes exchanged between nodes during query execution >>>> 4) Number of returned rows - min, max, avg >>>> >>>> Once we have base numbers in place, we can think of carefully >> integrating >>>> and enhancing all pieces of query execution into more verbose formats, >>>> similar to query plans with relative weights in classical RDBMS >> systems. >>>> >>>> Thoughts? >>>> >>>> >>>> >>>> On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <dma...@apache.org> >> wrote: >>>> >>>>> Vovan, >>>>> >>>>> Your metrics make perfect sense to me. However, I see a high demand >> for >>>>> JOINs based metrics especially from those who give a try to >>> non-collocated >>>>> joins in production and want to measure them somehow. This is why, >>>>> personally, I prefer to see the metrics below in the top priority list >>> as >>>>> well: >>>>> >>>>> if a query was executed in the collocated or non-collocated mode. >> Three >>>>> results are valid: collocated, non-collocated, simple query (no >> joins). >>>>> non-collocated query: size of the data exchanged between the nodes to >>>>> complete a specific join. If there are multiple joins in the query we >>> need >>>>> to provide this metric for every of them. >>>>> non-collocated and collocated query: a part of the time spent joining >>> the >>>>> data. If there are multiple joins in the query we need to provide this >>>>> metric for every of them. >>>>> >>>>> As for “unicast” and “broadcast”, agree, let’s ignore it for now. >>>>> >>>>> In any case, can we include timing information (map phase, reduce >> phase, >>>>> join phase) into an execution plan produced by H2? Are there any >>>>> implementation hooks? >>>>> >>>>> — >>>>> Denis >>>>> >>>>> >>>>>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan < >> dsetrak...@apache.org> >>>>> wrote: >>>>>> >>>>>> I think some of the metrics specified by Denis also make sense, so I >>>>> would >>>>>> add them as well. See below... >>>>>> >>>>>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov < >> voze...@gridgain.com >>>>> <mailto:voze...@gridgain.com>> >>>>>> wrote: >>>>>> >>>>>>> Denis, >>>>>>> >>>>>>> Query execution is complex process involving different stages which >>> are >>>>> not >>>>>>> very easy to match with each other. Especially provided that any >> node >>>>> can >>>>>>> leave topology at any time. Another problem is that engine evolves >> and >>>>>>> metrics like "did a query do broadcast or unicast" may easily become >>>>>>> useless at some point, because for example there will be neither >>>>> unicast, >>>>>>> nor broadast, but something different. On the other hand I >> completely >>>>> agree >>>>>>> that performance monitoring is essential part of any mature DBMS. >>>>>>> >>>>>>> I would start with metrics which are both very basic and easy to >>>>> implement >>>>>>> at the same time. For example we can add fingerprint (hash) to every >>>>> query >>>>>>> which will be used to join "map" and "reduce" parts with each other >>> and >>>>> add >>>>>>> the following basic metrics: >>>>>>> 1) Execution count for particular query >>>>>>> 2) Number of map nodes - min, max, avg >>>>>>> >>>>>> >>>>>> (1) and (2) makes sense >>>>>> >>>>>> >>>>>>> 3) Map step duration (if applicable) - min, max, >>>>>> >>>>>> 4) Reduce step duration (if applicable) - min, max, avg >>>>>>> >>>>>> >>>>>> Not sure if (3) and (4) are needed. I would only add them if they are >>>>> easy >>>>>> to implement. >>>>>> >>>>>> I would also add these: >>>>>> >>>>>> 5) Collocated: yes/no >>>>>> 6) last execution time >>>>>> 7) min/max/average execution duration >>>>>> >>>>>> >>>>>>> >>>>>>> Once done users will be able to get statistics for particular >> queries. >>>>>>> >>>>>>> Vladimir. >>>>>>> >>>>>>> >>>>>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <dma...@apache.org> >>> wrote: >>>>>>> >>>>>>>> BTW, >>>>>>>> >>>>>>>> What if we expose per-query metrics below as a part of EXPLAIN >>> ANALYZE? >>>>>>>> Sergi, is this feasible? >>>>>>>> >>>>>>>> — >>>>>>>> Denis >>>>>>>> >>>>>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <dma...@apache.org> >> wrote: >>>>>>>>> >>>>>>>>> Igniters, >>>>>>>>> >>>>>>>>> Let’s shed more light on SQL query execution internals >> introducing a >>>>>>> set >>>>>>>> of useful metrics (https://issues.apache.org/ >> jira/browse/IGNITE-4757 >>> ). >>>>>>>>> >>>>>>>>> Per-query metrics. Total history size is defined by >>>>>>> *CacheConfiguration. >>>>>>>> getQueryDetailMetricsSize*: >>>>>>>>> * if a query was executed in the collocated or non-collocated >> mode. >>>>>>>> Three results are valid: collocated, non-collocated, simple query >> (no >>>>>>>> joins). >>>>>>>>> * non-collocated query: size of the data exchanged between the >> nodes >>>>> to >>>>>>>> complete a join. >>>>>>>>> * non-collocated query: did a query do broadcast or unicast to get >>>>> data >>>>>>>> needed to complete a join. >>>>>>>>> * non-collocated and collocated query: a part of the time spent >>>>> joining >>>>>>>> the data. >>>>>>>>> >>>>>>>>> CacheMetrics: >>>>>>>>> * an average number of executed SQL queries (collocated, >>>>>>> non-collocated, >>>>>>>> simple query (no joins)). >>>>>>>>> >>>>>>>>> Please don’t hesitate do share suggest another metrics or improve >>>>>>>> proposed ones. >>>>>>>>> >>>>>>>>> — >>>>>>>>> Denis >>>>> >>>>> >>> >>> >> >> >> -- >> Sergey Kozlov >> GridGain Systems >> www.gridgain.com >>