Mike, this might be what you are referring to, maybe not, for time series visualization.
http://square.github.io/cubism/

Also, I found jmxtrans to be useful when writing metrics to ganglia / graphite.

Cheers,
Miguel

On Mon, Apr 22, 2013 at 1:50 PM, Keith Turner <[email protected]> wrote:

> On Mon, Apr 22, 2013 at 12:42 PM, Supun Kamburugamuva <[email protected]> wrote:
>
> > Great.. we could certainly introduce the graph Mike and Keith have
> > mentioned.
>
> I mentioned that it would be useful to display info collected from clients.
> Tracing already collects this info. The graph Mike mentioned may be
> useful for displaying trace info, maybe a plot per trace field.
>
> > Supun..
> >
> > On Mon, Apr 22, 2013 at 12:02 PM, Keith Turner <[email protected]> wrote:
> >
> > > On Mon, Apr 22, 2013 at 11:42 AM, Mike Drob <[email protected]> wrote:
> > >
> > > > Adding on to the comment about summaries, averages, and outliers. If,
> > > > for some reason, you end up with a two-hump population, then simply
> > > > showing averages will mask the split and lose a lot of valuable
> > > > information. It is often valuable to know that a particular set of
> > > > users or servers are experiencing degraded performance while the rest
> > > > of the ecosystem is healthy.
> > > >
> > > > This isn't something that shows up in a regular time series because
> > > > the secondary population is usually very small compared to the total
> > > > population. There was a graph for request latency of a service that I
> > > > saw once that I really wish I could find again, maybe somebody on the
> > > > list will be able to chime in. It had timestamps on the x-axis,
> > > > latency on the y, and each (x,y) point was colored on a gradient
> > > > representing how many requests were fulfilled at time x with latency
> > > > y. This chart made it immediately easy to see that most data points
> > > > fit a normal distribution with a low mean, but there was also a
> > > > cluster at the top for some reason.
> > >
> > > That sounds really cool. Maybe the y-axis/latency could be log scale.
> > > Inevitably a 3004 second operation will finish and obscure the
> > > smaller latencies.
> > >
> > > Sometimes it's more useful to sample this type of info from the clients
> > > rather than tablet servers. A tablet server may report low latencies,
> > > but all clients using it may experience high latencies because of a
> > > network issue. We could certainly consider making the client code
> > > report this info.
> > >
> > > > I'd love to see that type of chart show up for tablet servers
> > > > (probably not as useful for tables).
> > > >
> > > > Mike
> > > >
> > > > On Mon, Apr 22, 2013 at 9:05 AM, Eric Newton <[email protected]> wrote:
> > > >
> > > > > Another thing to consider is scale. On large clusters (many
> > > > > hundreds of nodes), more data is not helpful for visualization.
> > > > > Instead, summaries, averages and outliers are important.
> > > > >
> > > > > For example, if one node is consistently slow, it is better to know
> > > > > that than to see one graph with low numbers in a sea of graphs.
> > > > >
> > > > > If the monitor collects information using JMX, collection time for
> > > > > each node would be a good thing to know, too.
> > > > >
> > > > > -Eric
> > > > >
> > > > > On Sun, Apr 21, 2013 at 10:00 PM, Josh Elser <[email protected]> wrote:
> > > > >
> > > > > > Supun,
> > > > > >
> > > > > > Yup, very much so. Having a way to consume any and all metrics
> > > > > > via JMX would simplify things for any consumers (internal or
> > > > > > external).
> > > > > >
> > > > > > On 04/21/2013 02:15 PM, Supun Kamburugamuva wrote:
> > > > > >
> > > > > >> Hi Josh,
> > > > > >>
> > > > > >> Thanks for the suggestions. I'll incorporate these into the
> > > > > >> proposal.
> > > > > >>
> > > > > >> Another area I would like to work on is JMX. There is a Jira that
> > > > > >> says to replace the Monitor calls from Thrift to JMX (Accumulo
> > > > > >> 694). Do you think this is a good addition to the Monitor?
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Supun..
> > > > > >>
> > > > > >> On Sun, Apr 21, 2013 at 1:45 PM, Josh Elser <[email protected]> wrote:
> > > > > >>
> > > > > >>> Supun,
> > > > > >>>
> > > > > >>> Looks good! Can I make some suggestions/comments?
> > > > > >>>
> > > > > >>> For: "Per table plots: ACCUMULO-594", I'd also like to see minor
> > > > > >>> compactions, major compactions, index cache hit rate, and data
> > > > > >>> cache hit rate per table (same graphs that are displayed
> > > > > >>> system-wide when you visit http://${MONITOR_HOST}:50095/).
> > > > > >>>
> > > > > >>> For "Per tablet [server] plots", it would be neat if you could
> > > > > >>> also extract some general statistics like top N least
> > > > > >>> performing, top N highest performing, etc. tablet servers.
> > > > > >>> Ideally, this could correlate with servers that may be having
> > > > > >>> problems :).
> > > > > >>>
> > > > > >>> Do you see these proposed changes as being sufficient for 3-4
> > > > > >>> months of 40hrs/week work? If you plan to really dig into these
> > > > > >>> changes (perhaps reworking components of the monitor itself), I
> > > > > >>> could perhaps see this. Do you have any ideas for more lofty
> > > > > >>> goals that you could pursue as well? I don't want you/us to get
> > > > > >>> one month into things and see you complete everything we
> > > > > >>> initially planned to accomplish :)
> > > > > >>>
> > > > > >>> - Josh
> > > > > >>>
> > > > > >>> On 04/21/2013 10:37 AM, Supun Kamburugamuva wrote:
> > > > > >>>
> > > > > >>>> Hi all,
> > > > > >>>>
> > > > > >>>> I would like to start writing the proposal for the GSoC. I've
> > > > > >>>> put together some initial high level goals of the project.
> > > > > >>>> Please let me know what I can improve.
> > > > > >>>>
> > > > > >>>> Per table plots: Accumulo 594
> > > > > >>>> ---------------------
> > > > > >>>>
> > > > > >>>> The goal of this is to display plots that explain the various
> > > > > >>>> activities that happen per table. When we go to the tables
> > > > > >>>> page of the monitor and go to a specific table it displays
> > > > > >>>> some information in a table format. We can augment this
> > > > > >>>> information by showing graphs for
> > > > > >>>>
> > > > > >>>> 1. Ingest entries
> > > > > >>>> 2. Ingest data size
> > > > > >>>> 3. Scan entries
> > > > > >>>> 4. Scan data size
> > > > > >>>>
> > > > > >>>> Per tablet plots
> > > > > >>>> ----------------------
> > > > > >>>>
> > > > > >>>> Same as in the table plots we can display information
> > > > > >>>> regarding tablet servers in the tablet server page. The plots
> > > > > >>>> will display the same information as table plots considering
> > > > > >>>> data per tablet server.
> > > > > >>>>
> > > > > >>>> Trace Visualization: Accumulo 1198
> > > > > >>>> ----------------------------
> > > > > >>>>
> > > > > >>>> Since we are displaying graphs about each tablet and each
> > > > > >>>> table we can add major and minor compaction graph to each
> > > > > >>>> table and each tablet.
> > > > > >>>>
> > > > > >>>> Another option is to display this in a single graph in the
> > > > > >>>> overview page with different graph lines for different tables
> > > > > >>>> and tablets.
> > > > > >>>>
> > > > > >>>> Server type information : Accumulo 807
> > > > > >>>> ---------------------------------
> > > > > >>>>
> > > > > >>>> For displaying this information we can add a new page and
> > > > > >>>> display the information as a table. The table should specify
> > > > > >>>> the network address of the server, server type, whether it is
> > > > > >>>> active or in-active etc.
> > > > > >>>>
> > > > > >>>> Thanks,
> > > > > >>>> Supun...
> >
> > --
> > Supun Kamburugamuva
> > Member, Apache Software Foundation; http://www.apache.org
> > E-mail: [email protected]; Mobile: +1 812 369 6762
> > Blog: http://supunk.blogspot.com
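
For reference, the latency chart Mike describes in the thread (time on the x-axis, latency on the y-axis, color showing how many requests landed in each cell), combined with Keith's log-scale suggestion, amounts to a 2D histogram. Below is a minimal sketch in Python, not anything taken from the Accumulo monitor: the function name `latency_heatmap`, the 60-second time bucket, and the four latency buckets per decade are all assumptions made for illustration.

```python
import math
from collections import Counter


def latency_heatmap(samples, time_bucket_s=60, buckets_per_decade=4):
    """Count requests per (time bucket, log-scale latency bucket) cell.

    samples: iterable of (timestamp_s, latency_s) pairs, latency_s > 0.
    Returns a Counter keyed by (time_index, latency_index); the count in
    each cell is what a chart would map onto a color gradient.
    Bucket sizes are illustrative assumptions, not values from the thread.
    """
    cells = Counter()
    for timestamp_s, latency_s in samples:
        # Linear buckets on the time axis.
        time_index = int(timestamp_s // time_bucket_s)
        # Logarithmic buckets on the latency axis, so one very slow
        # operation does not flatten the fast ones into a single row.
        latency_index = math.floor(math.log10(latency_s) * buckets_per_decade)
        cells[(time_index, latency_index)] += 1
    return cells


# Hypothetical samples: mostly fast requests plus one slow outlier.
example = latency_heatmap(
    [(0, 0.02), (10, 0.025), (30, 0.03), (45, 300.0), (70, 0.02)]
)
for cell, count in sorted(example.items()):
    print(cell, count)
```

Because the cells hold counts rather than averages, a small second population (the "two-hump" case) stays visible as its own cluster instead of being averaged away.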
