Re: [Graphite-dev] [Question #223956]: Graphite-Web Refactoring Help Request

Dieter P Sun, 17 Mar 2013 08:16:14 -0700

Question #223956 on Graphite changed:
https://answers.launchpad.net/graphite/+question/223956


Dieter P posted a new comment:
re: "Nodes/Leaves should ID a timeseries with a generic ID field and provide a 
display_name field for GUI use"
1) In graphite, only leaves in the tree point to timeseries (i.e. if you have a
metric named "foo.bar.baz", then "foo.bar" is nothing). I think this is sensible.
Do you want to change this? (If so, what would "foo.bar" be?)
2) What are the reasons why the display name would be different from the metric
name? (I'm assuming the metric name would still be all nodes in the tree,
"as.we.are.used.to", and that it would translate into the TSUID.) I don't see a
need for this, especially in graphite, where you interact so closely with the
metric names (when building graphs and dashboards) that hiding them seems
disadvantageous. (Probably read the tagging section below first.)

=== tagging ===
I was actually going to start a separate discussion, but since you bring it up 
here...
First, you should know about:
* https://github.com/Dieterbe/graph-explorer/tree/master/structured_metrics : a 
library that converts the graphite metric list into a tag space of metrics
* https://github.com/Dieterbe/graph-explorer: a graphite dashboard that takes 
this tag space and provides a query language so you can filter metrics and 
group them into graphs by tag(s)
I will refer to these as 'G-E'

to take the last example from 
http://www.euphoriaaudio.com/opentsdb/http-api-meta.html
that metric can be written as:
{
        "name": "tsd.http.latency_50pct",
        "display_name": "HTTP Latency 50pct",
        "tags": {"host": "hobbes-64bit", "type": "all"}
}
1) As you can see, I cut down the markup for tags substantially. I find the
syntax demonstrated in your opentsdb RFC quite overengineered, which also shows
in how many of its fields are just empty.
2) One thing I learned with G-E is that the more information you can capture in
tags, the better: tags are structured, clearly defined data, so they are more
usable. "name" has no clear meaning for metrics and so can only be used for
text filtering. Luckily, there's no need for a "name" attribute if you add
tags such as protocol=http, what=seconds, type=latency_50pct. This gives more
power for filtering, aggregating and grouping metrics when composing graphs.
As a rule of thumb, I would say never have a 'name' attribute; always aim to
structure data in more specific tags.
(The canonical opentsdb example metric "mysql.bytes_sent schema=foo host=db2"
becomes "service=mysql what=bytes type=sent schema=foo host=db2".)
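
A minimal sketch of the kind of filtering this allows, assuming metrics are
plain dicts of tags. The helper names parse_tags and select are made up for
illustration and are not part of G-E or graphite:

```python
def parse_tags(spec):
    """Turn 'key=value key=value ...' into a dict of tags."""
    return dict(pair.split("=", 1) for pair in spec.split())


def select(metrics, **wanted):
    """Return the metrics whose tags match every wanted key/value pair."""
    return [m for m in metrics
            if all(m.get(k) == v for k, v in wanted.items())]


# Hypothetical metric list, written in the tag style described above.
metrics = [
    parse_tags("service=mysql what=bytes type=sent schema=foo host=db2"),
    parse_tags("service=mysql what=bytes type=received schema=foo host=db2"),
    parse_tags("protocol=http what=seconds type=latency_50pct host=hobbes-64bit"),
]

mysql_metrics = select(metrics, service="mysql")
bytes_sent = select(metrics, what="bytes", type="sent")
```

With a single free-form "name" you would have to fall back to substring
matching for the same two queries.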

Furthermore:

1) I would argue that the display of metric names should not be configured at 
the metric level as in your examples, but can easily be generated.
* in a composer interface you can just list all the tags in a predefined order.
In G-E it's just "%what %type %target_type <other tags alphabetically sorted>
%server %plugin"
* this becomes more apparent when viewing a graph: say you are plotting these 
two metrics on one graph:
{'what': 'bytes', 'service': 'mysql', 'server':'host1', 'type': 'sent'}
{'what': 'bytes', 'service': 'mysql', 'server':'host1', 'type': 'received'}

For this graph, the tags 'what', 'service' and 'server' are constant,
and the 'type' tag is variable.  So the graph title can be computed to
be "host1 mysql bytes" and the entries on the legend would be 'sent' and
'received'.  This is what G-E does. It's trivial to implement and
creates a non-redundant display of information, independent of how you
group metrics into graphs.

2) Nowadays, many metrics in graphite have names that are just too
unclear. Often they don't specify the unit of measurement (seconds? ms?
bits? bytes? elements in a queue? a number of errors?), the prefixes used
(M, G, etc.) or how the value should be interpreted (is this a number per
second? per flush interval, like statsd counts? etc.).  G-E solves this
by making the 'what' and 'target_type' tags mandatory and clearly
defined.

3) The current tree-based organisational paradigm is a bit too simple.  There's
basically no way to organise your tree so that it supports all the ways you may
later want to query it (e.g. "give me all metrics related to service mysql" or
"all metrics that are an amount of errors"), yet people spend too much time
trying to.
(See also the number of statsd issues/PRs related to suffixes, prefixes and
namespacing.) A tag-based system makes this moot.
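
The title/legend computation from point 1 above could be sketched roughly like
this. This is not G-E's actual code; TITLE_ORDER is a hypothetical ordering
picked only so that the example comes out as "host1 mysql bytes":

```python
# Hypothetical display order, chosen only to reproduce the example title.
TITLE_ORDER = ["server", "service", "what", "type"]


def split_tags(metrics):
    """Split tag keys into those constant across all metrics and the rest."""
    keys = set().union(*(m.keys() for m in metrics))
    constant = {k for k in keys
                if len({m.get(k) for m in metrics}) == 1}
    return constant, keys - constant


def order(keys):
    """Known keys first, in TITLE_ORDER, then any others alphabetically."""
    known = [k for k in TITLE_ORDER if k in keys]
    return known + sorted(k for k in keys if k not in TITLE_ORDER)


def title_and_legend(metrics):
    """Constant tags form the graph title, variable tags the legend entries."""
    constant, variable = split_tags(metrics)
    title = " ".join(metrics[0][k] for k in order(constant))
    legend = [" ".join(m.get(k, "") for k in order(variable))
              for m in metrics]
    return title, legend


graph = [
    {'what': 'bytes', 'service': 'mysql', 'server': 'host1', 'type': 'sent'},
    {'what': 'bytes', 'service': 'mysql', 'server': 'host1', 'type': 'received'},
]
title, legend = title_and_legend(graph)
```

Here title is "host1 mysql bytes" and legend is ['sent', 'received'], without
any per-metric display name being configured.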

This is why I'm in favor of deprecating the tree-based method entirely and
moving towards a completely tag-based database and query method.
This can actually be implemented more easily than one would think. Actual
metrics would still be stored based on a key/filename/id; this would be either
a hash of all tag key/value pairs, sorted, or whatever the 'name' tag says (if
you specify 'foo.bar.baz', that's the name tag, which gives instant backwards
compatibility). For all incoming metrics, just store all tags in a database
along with the id of the metric, so it's easy to query for metrics; because of
the hashing, no lookups are needed when storing time series data. This also
has the benefit of being compatible with different (existing or not) backends
such as ceres, whisper, etc.; they don't have to implement the tagging.
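
A rough sketch of that scheme, under a few assumptions of my own: SHA-1 as the
hash (any stable hash would do), "key=value" pairs joined by spaces as the
canonical form, and an in-memory dict standing in for the tag database. The
names metric_id and TagIndex are made up for illustration:

```python
import hashlib


def metric_id(tags):
    """Use the 'name' tag if present, else a hash of the sorted tag pairs."""
    if "name" in tags:
        return tags["name"]  # e.g. 'foo.bar.baz', for backwards compatibility
    canonical = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


class TagIndex:
    """In-memory stand-in for the tag database (an assumption, not a design)."""

    def __init__(self):
        self.tags_by_id = {}

    def register(self, tags):
        """Record the tags under the metric's id and return that id."""
        mid = metric_id(tags)
        self.tags_by_id[mid] = dict(tags)
        return mid

    def find(self, **wanted):
        """Return the ids of all metrics matching every wanted tag."""
        return sorted(mid for mid, tags in self.tags_by_id.items()
                      if all(tags.get(k) == v for k, v in wanted.items()))


index = TagIndex()
legacy = index.register({"name": "foo.bar.baz"})
sent = index.register({"service": "mysql", "what": "bytes", "type": "sent"})
```

The writer can compute metric_id from the incoming tags alone, so storing a
datapoint needs no lookup; only queries touch the index. A real implementation
would need to escape spaces and '=' inside tag values before hashing.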


=== events ===
quote: "B) We’re also adding annotation support to track/mark events. Same 
thing, Graphite could store notes in the DB or get the info from OpenTSDB".
I see no benefit in deep integration between an event/change management
system and a metrics database/management system, because events/changes are
inherently very different things from metrics.  They have different
requirements wrt ingestion, storage, management, GUIs, etc.  (I believe deep
integration leads to feature creep, scope dilution, and harder integration
with other software, i.e. monolithic software.)
They do go alongside each other on graphs, which is AFAICT the only place
where metrics and events meet.  That's why I think it's sensible to have a
separate change/event management system, and a timeseries graphing widget
where they can be rendered together (as directed by dashboard software).
From this conviction, I've written:
* https://github.com/Dieterbe/anthracite : a change/event management system
(inspired by graphite's philosophy)
* https://github.com/Dieterbe/timeserieswidget : renders graphs and
events/changes in a "rich" way (with annotation text etc.). As you would
expect, it supports and targets graphite and anthracite.

Btw, are you going to Monitorama? I will be, as will a bunch of other
graphite devs.

-- 
You received this question notification because you are a member of
graphite-dev, which is an answer contact for Graphite.

_______________________________________________
Mailing list: https://launchpad.net/~graphite-dev
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~graphite-dev
More help   : https://help.launchpad.net/ListHelp
