Gil,

This patch works on cygwin:
http://bugzilla.ganglia.info/cgi-bin/bugzilla/attachment.cgi?id=26&action=view

However, there are two good suggestions that would make the patch much better. Martin points out that the while loops could spin forever if we always get EAGAIN, so there should be some doomsday logic written in. Richard suggests that we extend the EAGAIN loop to everywhere we print to the client, which seems smart. I count at least 5 places in the TCP code that "should" be patched, because we could get an EAGAIN anywhere we call a print function. A more correct solution is to refactor all calls to apr_socket_send (9 calls) into a new function that calls it again if it returns EAGAIN, but I didn't originally have time to write and test that. I think the patch works because most of the EAGAINs are received in that for() loop.
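
To make the "new function" idea concrete, here is a rough sketch of a retrying send wrapper with doomsday logic. It is not the patch and it does not use APR: it assumes plain POSIX send() and poll() so that it can be compiled on its own. The names set_nonblocking and send_all_retry, and the max_waits and wait_ms parameters, are hypothetical. An APR version would presumably wrap apr_socket_send in the same way.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: put a descriptor into non-blocking mode. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: send all len bytes of buf on a non-blocking socket.
 * On EAGAIN it waits up to wait_ms for the socket to become writable and
 * tries again. The "doomsday" limit is max_waits consecutive waits without
 * any progress; after that it gives up with errno set to ETIMEDOUT.
 * Returns 0 on success, -1 on failure.
 */
static int send_all_retry(int fd, const char *buf, size_t len,
                          int max_waits, int wait_ms)
{
    size_t sent = 0;
    int waits = 0;

    while (sent < len) {
        ssize_t n = send(fd, buf + sent, len - sent, 0);

        if (n > 0) {
            sent += (size_t)n;   /* partial writes are normal */
            waits = 0;           /* progress resets the doomsday counter */
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };

            if (waits++ >= max_waits) {
                errno = ETIMEDOUT;
                return -1;
            }
            if (poll(&pfd, 1, wait_ms) == -1 && errno != EINTR)
                return -1;
            continue;
        }
        return -1;               /* any other failure is fatal */
    }
    return 0;
}
```

With something like this, each print function would call the wrapper instead of sending directly, so the callers would not need their own while(rv == EAGAIN) loops, and a stalled client could not hold the loop forever.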

Ian

[EMAIL PROTECTED] wrote:

Gee,

I thought that was fixed with this patch:
http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=50

Actually, looking at 3.0.3 gmond.c, it looks like the patch did not make it
into the release - that's a shame.

Even looking at the patch, it looks as if it is a partial fix, because while
the patched metric printing is protected like this (gmond.c,
process_tcp_accept_channel):
<snip>
       rv = print_host_metric(client, metric, now);
       while(rv == EAGAIN)
       {
         rv = print_host_metric(client, metric, now);
       }
       if(rv != APR_SUCCESS)
         {
           goto close_accept_socket;
         }
       }
</snip>

the gmetric printing in the same function is not protected:
<snip>

     /* Send the gmetric info for this particular host */
     for(metric_hi = apr_hash_first(client_context, ((Ganglia_host *)val)->gmetrics);
         metric_hi;
         metric_hi = apr_hash_next(metric_hi))
       {
         void *metric;
         apr_hash_this(metric_hi, NULL, NULL, &metric);

         /* Print each of the metrics from gmetric for this host... */
         if(print_host_gmetric(client, metric, now) != APR_SUCCESS)
           {
             goto close_accept_socket;
           }
       }
</snip>

It may be best to talk to the original owner of the patch.
I'm not confident enough to submit a patch myself, although I will try
to submit a bugzilla entry.

kind regards,
Richard

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
Gilad Raphaelli
Sent: 14 March 2006 18:35
To: ganglia-developers@lists.sourceforge.net
Subject: [Ganglia-developers] RE: First prerelease of ganglia-3.0.3
ready for testing


I have tried the new release 3.0.3.200602231926
without success on FreeBSD 4.11 - the xml is still
truncated when attempting to access the data from a
remote host.  Interestingly, this is not the case when
trying from the host running gmond.  Based on the
strace, my colleague commented:

 Default socket buffer is 64K.  It appears that
socket is non-blocking.  That last write is failing
(EAGAIN) because the socket buffer is full.  The
application is ignoring that fact and shutting down
the socket.  Looks to me like an application bug that
just accidentally works on rhel.
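
The behaviour described above can be reproduced with a small sketch. It assumes plain POSIX sockets rather than gmond's APR code, and the function name fill_until_eagain is hypothetical. It writes to a non-blocking socket whose peer never reads, and reports how many bytes the kernel accepted before it returned EAGAIN.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical demo: keep sending on fd until the socket buffer is full.
 * Returns the number of bytes accepted before EAGAIN, or -1 on any
 * other error.
 */
static long fill_until_eagain(int fd)
{
    char chunk[1024];
    long total = 0;
    int flags = fcntl(fd, F_GETFL, 0);

    /* The condition only shows up on a non-blocking socket. */
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    memset(chunk, 'x', sizeof chunk);
    for (;;) {
        ssize_t n = send(fd, chunk, sizeof chunk, 0);

        if (n > 0) {
            total += n;
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return total;   /* buffer full: try again later, not fatal */
        return -1;
    }
}
```

The point is that EAGAIN here only means "try again later". Shutting the socket down at that moment is what would cut the output short.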

 Please let me know if you need any more information.

Thank you,

Gil
-----------------------------------------------------

Running an strace on gmond (on the target host) while
trying to retrieve the data shows:
71160 write(10, "<METRIC NAME=\"swap_free\" VAL=\"41"..., 124) = 124
71160 write(10, "<METRIC NAME=\"bytes_in\" VAL=\"608"..., 129) = -1 EAGAIN (Resource temporarily unavailable)
71160 shutdown(10, 0 /* receive */)     = 0

What this looks like from the requester (not the
exact same transaction):

<METRIC NAME="mem_buffers" VAL="204096" TYPE="uint32" UNITS="KB" TN="119" TMAX="180" DMAX="0" SLOPE="both" SOURCE="gmond"/>
<METRIC NAME="swap_free" VAL="4194136" TYPE="uint32" UNITS="KB" TN="119" TMAX="180" DMAX="0" SLOPE="both" SOURCE="gmond"/>
Connection closed by foreign host.

A normal transaction closes with a closing tag: </GANGLIA_XML>


-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting
language that extends applications into web and mobile media. Attend the
live webcast and join the prime developer group breaking into this new
coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
_______________________________________________
Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers







