Matt/Martin and all.

I am finding that I am still getting occasional truncated XML from
gmond, even after the EAGAIN patches to gmond.c. Interestingly, when the
data was truncated, it ended with a </HOST> tag, i.e. at a host boundary.

Looking at the code, I see this:
<snip>
  /* Walk the host hash */
  for(hi = apr_hash_first(client_context, hosts);
      hi;
      hi = apr_hash_next(hi))
    {
      apr_hash_this(hi, NULL, NULL, &val);
      status = print_host_start(client, (Ganglia_host *)val);
      if(status != APR_SUCCESS)
        {
          goto close_accept_socket;
        }
</snip>

Ahh. This is another place that we need the EAGAIN retry loop.
In fact to be safe, the print_xml_header code should also be protected.
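
For illustration, here is a minimal sketch of the kind of retry loop meant
here. It is not the gmond code: gmond sends through APR sockets, whereas this
uses plain POSIX write() and poll() so that it stands alone, and the helper
name write_all is hypothetical. The idea is that EAGAIN means "not now, try
again", so the sender waits until the socket is writable rather than giving
up and closing the connection part way through the XML.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of gmond): write all of buf to fd.
 * Partial writes are continued; EAGAIN/EWOULDBLOCK and EINTR are retried.
 * Returns 0 once everything is written, -1 on any other error. */
int write_all(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);

    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n > 0) {
            /* Some or all bytes went out; advance past them. */
            buf += n;
            len -= (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted by a signal, retry */

        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* Non-blocking socket is full: wait until it is writable. */
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };
            if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
                return -1;
            continue;
        }
        return -1;                  /* hard error: give up */
    }
    return 0;
}
```

If every print function (header, host start, metrics, host end) sent its
output through one helper like this, no call site could be missed. A real
version would probably want a timeout on the poll() so that a stalled client
cannot hold the thread forever.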

Do you guys agree with the analysis?

My gmonds run on Windows - for some reason Windows/Cygwin often gives
me the EAGAIN returns while the Linux daemons never seem to.

regards,
Richard

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
Martin Knoblauch
Sent: 21 March 2006 12:14
To: Grevis, Richard: IT (LDN); ganglia-developers@lists.sourceforge.net
Subject: Re: [Ganglia-developers] Possible bug in hosts up calculation
when federating clusters.


Hi Richard,

 oops. You are both right and wrong :-) Looking at the code and comments
for "old", it seems that the whole logic is pre-3.0. It was used to
distinguish between 2.5.x and prior versions. For that purpose the code
is right. Of course, it now fails when it encounters the 3.0.X string.

 Something like this should solve it. Care to test? I am not sure about
setting old to 0 in the other case. In any case, the whole xmldata_t
structure is initialized early on.


diff -u -r1.45 process_xml.c
--- process_xml.c       18 Nov 2004 20:14:31 -0000      1.45
+++ process_xml.c       21 Mar 2006 12:09:16 -0000
@@ -821,7 +821,7 @@
          if (xt->tag == VERSION_TAG)
             {
                   /* Process the version tag later */
-                  if(! strstr( attr[i+1], "2.5." ) )
+                  if( strcmp( attr[i+1], "2.5." ) < 0 )
                      {
                          debug_msg("[%s] is an OLD version",
xmldata->ds->name);
                          xmldata->old = 1;
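
One caveat with a plain string comparison: strcmp() compares character by
character, so a hypothetical version string such as "2.10.0" would sort
before "2.5." and be flagged as old. A minimal sketch of a numeric check,
assuming the version attribute always starts with "major.minor" (the helper
name version_is_old is hypothetical and not in process_xml.c):

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not in gmetad): classify a version string.
 * Returns 1 if the version is older than 2.5, 0 if it is 2.5 or newer,
 * and -1 if the string does not start with "major.minor". */
int version_is_old(const char *version)
{
    int major = 0, minor = 0;

    assert(version != NULL);

    /* Parse the leading "major.minor"; anything after it is ignored. */
    if (sscanf(version, "%d.%d", &major, &minor) != 2)
        return -1;

    if (major != 2)
        return major < 2;   /* 0.x and 1.x are old, 3.x and later are not */
    return minor < 5;       /* within 2.x, only 2.0 to 2.4 are old */
}
```

With this, "2.4.1" counts as old while "2.5.7" and "3.0.2" do not. How an
unparsable string should be treated is left to the caller.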


Cheers
Martin

--- [EMAIL PROTECTED] wrote:

> All,
> 
> when I had debugging turned on in gmetad, the daemon was announcing
> data sources as old when they were not. Looking at the code, 3.0.2 or
> 3.0.3 gmetad/process_xml.c, line 821 or so, function
> startElement_GANGLIA_XML:
> <snip>
>          if (xt->tag == VERSION_TAG)
>             {
>                   /* Process the version tag later */
>                   if(! strstr( attr[i+1], "2.5." ) )
>                      {
>                          debug_msg("[%s] is an OLD version",
>                                    xmldata->ds->name);
>                          xmldata->old = 1;
>                       }
>              }
>        }
> </snip>
> 
> It seems there are two problems here. First, is not the strstr test
> the wrong way round? Second, if it is a new version of ganglia,
> xmldata->old should be explicitly set to zero. This seemed to make it
> better:
> <snip>
>          if (xt->tag == VERSION_TAG)
>             {
>                   /* Process the version tag later */
>                   if( strstr( attr[i+1], "2.5." ) )
>                      {
>                          debug_msg("[%s] version %s is an OLD version",
>                                    xmldata->ds->name, attr[i+1]);
>                          xmldata->old = 1;
>                       } else {
>                          debug_msg("[%s] version %s is a NEW version",
>                                    xmldata->ds->name, attr[i+1]);
>                          xmldata->old = 0;
>                       }
>              }
>        }
> </snip>
> 
> You actually don't notice the problem if your clocks are well
> synchronised everywhere, because when clusters/grids are tagged as
> old, it does the up/down calculation from the current time, and it
> still works.
> 
> What do you guys all think?
> 
> kind regards,
> 
> Richard Grevis
> CTO wallah,
> Barclays Capital
> 


------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de


_______________________________________________
Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers

