Thanks Willy for your detailed review, especially some design related points that I was not aware of. I will go through these and respond accordingly.
Regards, - Krishna On Thu, Dec 22, 2016 at 4:23 PM, Willy Tarreau <w...@1wt.eu> wrote: > Hi Krishna, > > On Thu, Dec 22, 2016 at 09:41:49AM +0530, Krishna Kumar (Engineering) > wrote: > > We have found that the current mechanism of qtime, ctime, rtime, and > ttime > > based on last 1024 requests is not the most suitable to debug/visualize > > latency issues with servers, especially if they happen to last a very > short > > time. For live dashboards showing server timings, we found an additional > > last-'n' seconds metrics useful. The logs could also be parsed to derive > > these > > values, but suffers from delays at high volume, requiring higher > processing > > power and enabling logs. > > > > The 'last-n' seconds metrics per server/backend can be configured as > follows > > in the HAProxy configuration file: > > backend backend-1 > > stats-period 32 > > ... > > > > To retrieve these stats at the CLI (in addition to existing metrics), > run: > > echo show stat-duration time 3 | socat /var/run/admin.sock stdio > > > > These are also available on the GUI. > > > > The justification for this patch are: > > 1. Allows to capture spikes for a server during a short period. This > helps > > having dashboards that show server response times every few seconds > (e.g. > > every 1 second), so as to be able to chart it across timelines. > > 2. Be able to get an average across different time intervals, e.g. the > > configuration file may specify to save the last 32 seconds, but the > cli > > interface can request for average across any interval upto 32 seconds. > > E.g. > > the following command prints the existing metrics appended by the time > > based ones for the last 1 second: > > echo show stat-duration time 1 | socat /var/run/admin.sock stdio > > Running the following existing command appends the time-based metric > > values > > based on the time period configured in the configuration file per > > backend/server: > > echo show stat | socat /var/run/admin.sock stdio > > 3. Option per backend for configuring the server stat's time interval, > and. > > no API breakage to stats (new metrics are added at end of line). > > > > Please review, any feedback on the code/usability/extensibility is very > much > > appreciated. > > First, thanks for this work. I'm having several concerns and comments > however > about it. > > The first one is that the amount of storage is overkill if the output can > only emit an average over a few periods. I mean, the purpose of stats is > to emit what we know internally. Some people might want to see historgrams, > and while we have everything internally with your patch, it's not possible > to produce them. > > For this reason I think we should proceed differently and always emit these > stats over a few hard-coded periods. You proved that they don't take that > much space, and I think it would make sense probably to emit them over a > small series of power of 4 seconds : 1s, 4s, 16s, 64s. That's quite cheap > to store and easy to compute because it's not needed anymore to store all > individual values, you can cascade them while filling a bucket. > > And if you go down that route then there's no need anymore for adding yet > another setting in the configuration, it will be done by default. Another > point regarding this setting "stats-period" is that you currently do not > support the "defaults" section while I guess almost all users will want to > have it there. That's another reason for keeping it hard-coded. > > Now some general comments and guidance on the code itself : > > > diff --git a/include/proto/stats.h b/include/proto/stats.h > > index ac893b8..f33ceb1 100644 > > --- a/include/proto/stats.h > > +++ b/include/proto/stats.h > > @@ -95,7 +95,8 @@ int stats_fill_fe_stats(struct proxy *px, struct field > *stats, int len); > > int stats_fill_li_stats(struct proxy *px, struct listener *l, int flags, > > struct field *stats, int len); > > int stats_fill_sv_stats(struct proxy *px, struct server *sv, int flags, > > - struct field *stats, int len); > > + struct field *stats, int len, > > + struct stream_interface *si); > > Please avoid re-introducing the stream_interface in stats, we're trying > hard > to abstract it. From what I'm seeing you only need it to know if there's a > different period. Better pass the period directly in argument from the > caller. > > > diff --git a/include/types/counters.h b/include/types/counters.h > > index 06ce617..a33e01f 100644 > > --- a/include/types/counters.h > > +++ b/include/types/counters.h > > @@ -105,6 +105,41 @@ struct be_counters { > > } p; /* protocol-specific stats > */ > > }; > > > > +/* > > + * Struct time-stat-counters: > > + * > > + * Implement a last-n seconds based counters for backend/server's > q/c/r/t > > + * times. Total memory consumption on x86_64 to maintain all 4 event > time > > + * values per backend/server is: 16 + #seconds * 56 bytes. > > + * for 1 second: 72 bytes > > + * for 16 seconds: 912 bytes > > + * for 32 seconds: 1808 bytes > > + * for 64 second: 3600 bytes > > + */ > > +struct __tstat_counters { > > Avoid struct names using '__' as a prefix. In general we use such a prefix > for functions that people should use with extreme caution (eg: functions > which do not check for nulls or do not lock). Here there's no reason for > this. > > > + unsigned int timestamp; /* time when this entry was added > */ > > + long t_queue; /* sum of queue times */ > > + long t_connect; /* sum of times to connect to > server */ > > + long t_data; /* sum of times to get response */ > > + long t_close; /* sum of session times */ > > + int t_queue_counter; /* total number of 'queue' entries > */ > > + int t_connect_counter; /* total number of 'connect' > entries */ > > + int t_data_counter; /* total number of 'response' > entries */ > > + int t_close_counter; /* total number of 'session' > entries */ > > +}; > > + > > +struct tstat_counters { > > + int max_entries; /* max number of 'times' entries */ > > + struct __tstat_counters *times; /* Stats for every second */ > > +}; > > (...) > > diff --git a/include/types/proxy.h b/include/types/proxy.h > > index 3bf5a4d..97c5732 100644 > > --- a/include/types/proxy.h > > +++ b/include/types/proxy.h > > @@ -365,6 +365,8 @@ struct proxy { > > struct be_counters be_counters; /* backend statistics > counters */ > > struct fe_counters fe_counters; /* frontend statistics > counters */ > > > > + struct tstat_counters be_tstat_counters; /* last 'n' time counters > for backend */ > > + > > Given that you use it both in the backend and in the server, you should > probably move that directly to the struct be_counters. That will also get > rid of your intermediary tstats_counters struct. > > > diff --git a/src/cfgparse.c b/src/cfgparse.c > > index 3ed2c22..b212ee6 100644 > > --- a/src/cfgparse.c > > +++ b/src/cfgparse.c > > @@ -2551,6 +2551,50 @@ static void free_email_alert(struct proxy *p) > > p->email_alert.myhostname = NULL; > > } > > > > +/* > > + * Parse "stats-period" option in backend specifying how many > > + * seconds of stats to keep per server/backend. > > + */ > > +int parse_stats_period(const char *file, int linenum, char **args, > > + struct proxy *curproxy, struct proxy *defproxy) > > Better register a new keyword from stats.c than further increasing > cfgparse.c. > We're trying to deflate that huge beast by providing a more decentralized > parsing system. Please check "cfg_register_keywords()" for this. > > > diff --git a/src/haproxy.c b/src/haproxy.c > > index 239db80..57536ea 100644 > > --- a/src/haproxy.c > > +++ b/src/haproxy.c > > @@ -860,6 +860,12 @@ static void init(int argc, char **argv) > > } > > #endif > > > > + /* Enable time stats counters for those backends that asked for it > */ > > + if (enable_be_tstat_counters()) { > > + Alert("Unable to allocate backend time-counter stats.\n"); > > + exit(1); > > + } > > + > > At some point all proxies are started (start_proxies()). This should be > done at the same time instead of having it own loop. > > > +/* Initialize data structure for a single server/backend */ > > +int tstat_counters_init(struct tstat_counters *tstat, int num_entries) > > Please always indicate the expected return values. That saves a lot of > time during debugging sessions, otherwise we always have to read the whole > function to know what 1 or 0 means for example. Here I'm seeing that 0 is > OK and non-zero indicates a failure. > > > +{ > > + tstat->max_entries = num_entries; > > + if (!num_entries) { > > + tstat->times = NULL; > > + return 0; > > + } > > + > > + tstat->times = calloc(num_entries, sizeof(*tstat->times)); > > + return tstat->times == NULL; > > +} > > + > > +/* Wrapper function for tstat_counters_init() for all backends */ > > +int enable_be_tstat_counters(void) > > Same here. > > (...) > > diff --git a/src/stats.c b/src/stats.c > > index 976496e..571ec11 100644 > > --- a/src/stats.c > > +++ b/src/stats.c > > @@ -216,6 +216,10 @@ const char *stat_field_names[ST_F_TOTAL_FIELDS] = { > > [ST_F_INTERCEPTED] = "intercepted", > > [ST_F_DCON] = "dcon", > > [ST_F_DSES] = "dses", > > + [ST_F_QTIME_PERIOD] = "qt_period", > > + [ST_F_CTIME_PERIOD] = "ct_period", > > + [ST_F_RTIME_PERIOD] = "rt_period", > > + [ST_F_TTIME_PERIOD] = "tt_period", > > I think that if we go for the multiple periods output, it would make sense > to have the value directly in the name. For example, qt1, qt4, qt16, qt64 > etc... > > > }; > > > > /* one line of info */ > > @@ -385,7 +389,7 @@ static int stats_dump_fields_typed(struct chunk > *out, const struct field *stats) > > * reserved for the checkbox is ST_SHOWADMIN is set in <flags>. Some > extra info > > * are provided if ST_SHLGNDS is present in <flags>. > > */ > > -static int stats_dump_fields_html(struct chunk *out, const struct field > *stats, unsigned int flags) > > +static int stats_dump_fields_html(struct chunk *out, const struct field > *stats, unsigned int flags, struct proxy *px) > > Please don't add the proxy here. We've tried hard to remove any dependency > on internal information in the dump code so that all outputs can contain > all info. That's what will make it possible to have some JSON output for > example. The dump functions are just "output drivers" which translate the > contents of the stats table. Here the only reason why you need this is to > have the duration. Just add another metric in the stats which is the > duration > and an external aggregator will even be able to use it and report it. But > as > you see this is also eliminated if you produce a few values over different > periods. > > (...) > > > +/* > > + * Time stat entries have one of the following values: > > + * - 0 (never used, or was set with a value of 0. May or may not be > valid) > > + * - -1 (time was not relevant at update, this entry is not used to > > + * calculate the average) > > + * - non-zero (used in the past, may be valid or invalid). > > + * To determine whether the value is useful or not, we compare current > > + * time to saved time. We cannot stop once we reach a time outside of > the > > + * interval, as intermediate entries may have old values and not be > updated > > + * if no updates took place at those times. This means going through the > > + * entire array, but it is better to do extra work at (rare) retrieval > times > > + * (control plane), rather than updating unused entries at run time > (data > > + * plane). This leads to fast updates and slower retrievals. > > + */ > > +static void get_tstat_counters(struct tstat_counters *tstat, > > + struct stream_interface *si, > > + unsigned int now, struct field *qtimep, > > + struct field *ctimep, struct field *rtimep, > > + struct field *ttimep) > > +{ > > + int i; > > + int64_t qtime = 0, ctime = 0, rtime = 0, ttime = 0; > > + int64_t qcounter = 0, ccounter = 0, rcounter = 0, tcounter = 0; > > + int duration = tstat->max_entries; > > + > > + /* > > + * Return zero for those backends that did not ask for time based > > + * stats. > > + */ > > + if (!tstat->times) { > > + *qtimep = mkf_u32(FN_AVG, 0); > > + *ctimep = mkf_u32(FN_AVG, 0); > > + *rtimep = mkf_u32(FN_AVG, 0); > > + *ttimep = mkf_u32(FN_AVG, 0); > > + return; > > + } > > + > > + if (si && si->end) { > > + /* > > + * If user via CLI specified a specific time period over > which > > + * to report stats, then retrieve that value instead of the > > + * default value specified in the haproxy configuration > file. > > + */ > > + struct appctx *appctx = objt_appctx(si->end); > > + > > + if (appctx && (appctx->ctx.stats.flags & STAT_DURATION)) > > + duration = MAX(MIN(duration, > > + appctx->ctx.stats.duration), 0); > > + } > > + > > + /* Sum all entries to calculate time window based average */ > > + for (i = 0; i < tstat->max_entries; i++) { > > + if (!within_period(tstat->times[i].timestamp, now, > duration)) { > > + /* Outside the requested period, but cannot break > out */ > > + continue; > > + } > > + > > + /* Valid time period - increment counts/average */ > > + qtime += tstat->times[i].t_queue; > > + ctime += tstat->times[i].t_connect; > > + rtime += tstat->times[i].t_data; > > + ttime += tstat->times[i].t_close; > > + > > + qcounter += tstat->times[i].t_queue_counter; > > + ccounter += tstat->times[i].t_connect_counter; > > + rcounter += tstat->times[i].t_data_counter; > > + tcounter += tstat->times[i].t_close_counter; > > + } > > + > > + /* Avoid divide-by-zero errors */ > > + if (!qcounter) > > + qcounter = 1; > > + if (!ccounter) > > + ccounter = 1; > > + if (!rcounter) > > + rcounter = 1; > > + if (!tcounter) > > + tcounter = 1; > > + > > + /* Convert average times to microsecond */ > > + *qtimep = mkf_u32(FN_AVG, qtime / qcounter); > > + *ctimep = mkf_u32(FN_AVG, ctime / ccounter); > > + *rtimep = mkf_u32(FN_AVG, rtime / rcounter); > > + *ttimep = mkf_u32(FN_AVG, ttime / tcounter); > > Above I'd proceed differently. Since the stats supports the "empty" type > (which is the default), I'd do it this way instead, which has both the > benefit of avoiding the divide by zero and not to emit any value when > there's none (instead of emitting zero) : > > /* Convert average times to microsecond */ > if (qcounter) > *qtimep = mkf_u32(FN_AVG, qtime / qcounter); > if (ccounter) > *ctimep = mkf_u32(FN_AVG, ctime / ccounter); > if (rcounter) > *rtimep = mkf_u32(FN_AVG, rtime / rcounter); > if (tcounter) > *ttimep = mkf_u32(FN_AVG, ttime / tcounter); > > > +/* Update time-stat counter values if enabled for this backend */ > > +void set_tstat_counters(struct tstat_counters *tstat, > > + unsigned int now, int t_queue, > > + int t_connect, int t_data, int t_close) > > +{ > > + if (tstat->times) { > > + unsigned int slot = now % tstat->max_entries; > > + > > + /* > > + * Optimize for heavy load when many requests are made in > > + * the same second. During very low load, the following > > + * 'likely' will become false more often. > > + */ > > + if (likely(now == tstat->times[slot].timestamp)) { > (...) > > Here you can make the code slightly simpler by first checking if the period > is different to reinitialize the slot to zero using a simple memset(), and > then go to the usual increment. That removes a significant amount of logic. > > Cheers, > Willy >