Re: [Ganglia-developers] g3 (really long)

matt massie Fri, 28 Mar 2003 12:54:44 -0800

i'll respond to your email first since it's shorter.  :)  i hate that i 
sent a quick email without explaining everything in more detail.. i just 
wanted to get the discussion going but i should have waited and finished 
the email.


Today, Federico Sacerdoti wrote forth saying...

> I don't understand how you got 60 bytes for each XDR metric. There is 
> only one header since the four metrics are grouped. In addition, the 
> units/type/tmax/dmax/slope will likely be applicable to all metrics in 
> the group and don't need to be repeated.

if you read the original email.. i was talking about the way we are doing 
things now.  right now we are not grouping metrics in any way.  each 
message is 60 bytes including the header.

i just didn't mention the inheritance model that we've talked about 
before.  in my early tests i used inheritance to make the xdr message 
smaller.. we've discussed that.  most groups of metrics will have very 
similar metric attributes, so we'll only need to send them once. 
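
a rough sketch of that inheritance idea, in python purely for illustration. the dict layout, the metric names and the attribute values are all assumptions, not the g3 wire format: the group carries the shared attributes once, and each metric only adds its value plus whatever differs.

```python
def expand_group(group):
    """Return one full attribute dict per metric, filling gaps from the group."""
    expanded = {}
    for name, metric in group["metrics"].items():
        merged = dict(group["attrs"])  # attributes sent once for the whole group
        merged.update(metric)          # per-metric fields (value, overrides) win
        expanded[name] = merged
    return expanded


# hypothetical group: attribute values here are made up for the example
cpu_group = {
    "attrs": {"units": "%", "type": "float", "tmax": "20", "dmax": "0", "slope": "both"},
    "metrics": {
        "user": {"value": "12.5"},
        "system": {"value": "3.1"},
        "idle": {"value": "84.4"},
        "number": {"value": "2", "units": "CPUs", "type": "uint16"},
    },
}

full = expand_group(cpu_group)
print(full["user"])
print(full["number"])
```

only "number" repeats units/type here; the other three metrics inherit everything but their value.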

> Also, how will you accurately transmit float and double metrics in XML? 
> There are some fundamental problems with floating point accuracy: 
> something is always lost in an IEEE float -> string translation.

yellow flag.  foul.  you cannot use my own argument against me.  :)  we 
talked about this before and i expressed my fear of translating strings to 
doubles and back again... it's not an issue here.  i'm not taking ascii 
and converting it to binary .. storing it.. and then translating it back 
to ascii for xml... instead.. i'm saying.. we store everything as ascii.  
the only time we need to interpret it to binary form is for value 
comparisons in the C library.  if we keep everything in ascii, storing 
and retrieving the data will be much faster.
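
a minimal sketch of the "store ascii, interpret only for comparisons" idea. it is python rather than the C library, and the store, the function names and the metric path are hypothetical: the text that comes in is the text that goes out, and a float only exists for the duration of a comparison.

```python
store = {}  # hypothetical in-memory store: metric path -> ascii value


def put(path, text):
    """Keep the value exactly as it arrived; no parsing on the way in."""
    store[path] = text


def get(path):
    """Hand back the original text, byte for byte."""
    return store[path]


def exceeds(path, threshold):
    """Interpret the text as a number only for the comparison itself."""
    return float(store[path]) > threshold


put("mm56:load:one", "0.10")
print(get("mm56:load:one"))           # still "0.10", not "0.1"
print(exceeds("mm56:load:one", 0.05))
```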

this also raises the question of data typing.. and whether we should make 
it explicit or implicit.

> > Apples, meet Oranges.  Oranges, meet Apples.  :)  I'm sure a 
> > carefully-thought-out XDR scheme wouldn't provide numbers like that...
> 
> I agree with Steve.

i'll respond to this in steve's email...

> > World:US:California:Berkeley:UCB:Millennium::mm56:cpu:number
> >
> > so here is how the URL is read.. i'll call the data between the 
> > delimiters
> > tokens.
> > ----
> > 1. any token before the double delimiter is considered an 
> > organizational
> > unit (yes.. i stole that name from LDAP).  [btw, no more cluster or 
> > grid
> > tags!]
> 
> If this is just locating a part of an XML tree, why do we need the 
> "organizational unit" special delimiter? Why not a straight XPath-like 
> expression.

the example below in XPath would require

/ou[@name='World']/ou[@name='California']/mu[@name='cpu']/metric[@name='number']

i'm not saying this is an either/or situation.  if people want to use
XPath expressions on the exported xml.. more power.  i'm just expressing a
shorthand which will be used in the tree library (and possibly elsewhere).  
it's simpler/faster to parse and easier to read.  i don't want to reinvent
the wheel unless of course the new wheel rolls faster.  each token
translates directly into a key for the hierarchical hash structure.
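
a sketch of how the shorthand could map onto nested hashes. this is python with plain dicts standing in for the tree library, and the function names are hypothetical. only rule 1 (tokens before the double delimiter are organizational units) comes from the thread; treating the first token after the double delimiter as the host, and the rest as metric keys, is an assumption read off the example path.

```python
def parse_path(path, delim=":"):
    """Split a tree path into (organizational units, host, metric keys)."""
    ou_part, sep, rest = path.partition(delim * 2)
    if not sep:
        raise ValueError("no double delimiter in %r" % path)
    ous = ou_part.split(delim) if ou_part else []
    host, *metric_keys = rest.split(delim)
    return ous, host, metric_keys


def insert(tree, path, value):
    """Use each token directly as a key in a hierarchy of dicts."""
    ous, host, metric_keys = parse_path(path)
    keys = ous + [host] + metric_keys
    node = tree
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


path = "World:US:California:Berkeley:UCB:Millennium::mm56:cpu:number"
ous, host, metric_keys = parse_path(path)
tree = {}
insert(tree, path, "2")
print(ous, host, metric_keys)
```

parsing is one partition and two splits, with no predicate evaluation at all.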

another point.. if you look at the XPath expression above.. it would only 
return...

<metric name="number" value="2" ../>

which is fine... but think about what happens when i want to traverse a 
list of hosts and pull out only the number of cpus from each.

<host name="foo">
  <mu name="cpu"><metric name="number" value="2"../></mu>
</host>
<host name="bar">
  ...
</host>
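
for comparison, a sketch of that traversal done against exported xml with python's standard xml.etree.ElementTree. the wrapping ou element and the value for host "bar" are made up for the example. selecting the metric elements alone would drop the host names, so the loop walks the hosts and looks up the metric under each one.

```python
import xml.etree.ElementTree as ET

# hypothetical export; only the host "foo" entry mirrors the fragment above
doc = """
<ou name="Millennium">
  <host name="foo">
    <mu name="cpu"><metric name="number" value="2"/></mu>
  </host>
  <host name="bar">
    <mu name="cpu"><metric name="number" value="4"/></mu>
  </host>
</ou>
"""

root = ET.fromstring(doc)
cpus = {}
for host in root.findall("host"):
    # look up the cpu count relative to this host so the pairing is kept
    metric = host.find("mu[@name='cpu']/metric[@name='number']")
    if metric is not None:
        cpus[host.get("name")] = metric.get("value")

print(cpus)
```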

> I am also a believer of consistency in design. The micro structure 
> (semantics, syntax, addressing) of metrics in a host should be echoed 
> in the macro structure of grids in the world.

i forgot to mention that a host token can be empty for summary
information.  for example...

World:California:::cpu:number
                ^^-------------- empty host token         

would point to the number of cpus available in California... in xml...

<ganglia_xml version="3">
<ou name="World">
  <!--- summary info -->
  <ou name="California">
    <mu name="cpu">
      <metric name="number" value="4 bajillion"/>  <!-- data selected -->
    </mu>
    <ou name="Berkeley">
      <ou name="UCB">
        <!--- specific host info -->
        <host name="www.berkeley.edu">
        ...
      </ou>
    </ou>
  </ou>
</ou>
</ganglia_xml>
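
a sketch of how a lookup could treat the empty host token, again with nested python dicts standing in for the tree library. the function name, the per-host value "2", and keeping summary metrics in the same dict as child units are all assumptions. an empty host token simply adds no key, so the metric keys resolve at the organizational unit itself.

```python
tree = {
    "World": {
        "California": {
            "cpu": {"number": "4 bajillion"},  # summary info for the ou
            "Berkeley": {
                "UCB": {
                    "www.berkeley.edu": {"cpu": {"number": "2"}},  # made-up host data
                },
            },
        },
    },
}


def lookup(tree, path, delim=":"):
    """Resolve a tree path; an empty host token selects summary data."""
    ou_part, _, rest = path.partition(delim * 2)
    host, *metric_keys = rest.split(delim)
    keys = ou_part.split(delim)
    if host:  # empty host token: stay at the organizational unit
        keys.append(host)
    keys.extend(metric_keys)
    node = tree
    for key in keys:
        node = node[key]
    return node


summary = lookup(tree, "World:California:::cpu:number")
per_host = lookup(tree, "World:California:Berkeley:UCB::www.berkeley.edu:cpu:number")
print(summary, per_host)
```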

> > gmetad doesn't simply aggregate though.  it summarizes and delegates.
> > gmetad pushes summary info upstream with links (like hyperlinks) which
> > point to where to get more detailed information.
> 
> This feature exists right now in the webfrontend 2.5.3 release, except 
> for the upstream summary view. To get this we will parse incoming XML 
> at gmetad so it can be queried more efficiently (not just summary 
> views, but any simple subtree).

right now gmetad uses an aggregation model.. not a delegation model.  it 
simply takes the raw xml from each data source and wraps it inside of a 
grid tag.  we don't want upstream gmetad to get any details, only summary 
information with a reference link that points to a gmetad (or group of 
gmetad) that can provide more details.
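
a toy sketch of the difference, in python. the field names, the host data and the address are hypothetical, and this is not how gmetad is implemented: instead of wrapping and forwarding every host record, the downstream side collapses them into a summary and attaches a link saying where the details live.

```python
def summarize(source_name, hosts, details_at):
    """Collapse per-host data into a summary plus a pointer to the details."""
    total_cpus = sum(int(h["cpu_num"]) for h in hosts.values())
    return {
        "name": source_name,
        "hosts": str(len(hosts)),
        "cpu_num": str(total_cpus),
        "details": details_at,  # where to ask for the per-host records
    }


# made-up data source with two hosts
hosts = {"foo": {"cpu_num": "2"}, "bar": {"cpu_num": "4"}}
summary = summarize("Millennium", hosts, "gmetad.example.org")
print(summary)
```

the upstream side only ever sees the four summary fields, however many hosts sit behind them.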

in order to do the summarizing/delegating correctly.. we need the tree 
data structures.  the new gmetad will be a completely different beast.  
we'll be able to very quickly update individual fields in the tree 
structure (which translates directly into xml).  this will allow us to 
mark organizational units as down, for example... we can't do that now.  we 
need to be able to treat groups of nodes in the same way we treat 
individual nodes.
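
a small sketch of what "treat groups like individual nodes" could look like, in python. the Node class and the "status" attribute are hypothetical, not the tree library's api: hosts and organizational units are the same kind of object, any field can be updated in place, and the xml falls straight out of the structure.

```python
from xml.sax.saxutils import quoteattr


class Node:
    """A host or an organizational unit; both carry attributes and children."""

    def __init__(self, tag, name):
        self.tag = tag
        self.attrs = {"name": name}
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child

    def to_xml(self):
        attrs = "".join(" %s=%s" % (k, quoteattr(v)) for k, v in self.attrs.items())
        if not self.children:
            return "<%s%s/>" % (self.tag, attrs)
        inner = "".join(c.to_xml() for c in self.children)
        return "<%s%s>%s</%s>" % (self.tag, attrs, inner, self.tag)


ucb = Node("ou", "UCB")
host = ucb.add(Node("host", "mm56"))
host.attrs["status"] = "down"  # mark a single host as down
ucb.attrs["status"] = "down"   # mark the whole unit the same way
xml = ucb.to_xml()
print(xml)
```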

we could also allow data sources to be collapsed or expanded by default
depending on how an administrator configures them.

> Full XPath is not ready or fast enough. I don't believe we need the full 
> power of regexes to get summary scalability, which is the most juicy 
> fruit on this tree. I believe a simple interactive gmetad that can 
> provide summary views and simple subtrees is a good idea. With Matt's 
> tree code now available, I am confident it won't take much time.

i still have some more work to do on the tree library side of things but
i'm close.  the demo that i showed you doesn't understand the tree path
shorthand... it just assumes all tokens are equal in function.  it will be 
quick for me to add that functionality.  once i have the tree structures 
working as i like.. it shouldn't take me long, as you say, to get gmetad3 
up and going.

> Good that we're thinking about this, g3 looks like it will incorporate 
> some good ideas.

i don't want to give any strict timelines but i feel i'm at the point where 
i can see the light at the end of the tunnel.  i need to do more testing 
but i think i'll be finished with a good working beta in the coming months.  

-matt
