Hi Ian,

thanks for updating the patch.
Puuhhh. That behaviour you describe is bad indeed. Seems either Cygwin
or M$ are doing something stupid.

One thought - you are calling apr_socket_send() at a high frequency in
that loop. Have you played with inserting some delay code in the loop?
Maybe waiting a ms or so would increase the chance of success?

Cheers
Martin

--- Ian Cunningham <[EMAIL PROTECTED]> wrote:

> Martin,
>
> Non-scientific numbers here for you. Connecting to the tcp port 600
> times, print_host_metric() called apr_socket_send() at least 90,624
> times. Of those 90,624 times, we got stuck in a EAGAIN while loop
> 1,190 times. On average that while loop looped 29,116.66 times, with
> maximum of 525,705 loops.
>
> Pretty bad in my opinion. But the workaround... works :/
>
> I have refactored all of the apr_socket_sends to use the workaround.
> I have it error out if it loops more than 750,000 times *shakes head*.
> I've posted a patch to the bug that seems to work, it only bombed out
> once in 600 tries.
>
> http://bugzilla.ganglia.info/cgi-bin/bugzilla/attachment.cgi?id=27&action=view
>
> Ian
>
> Martin Knoblauch wrote:
>
> >Hi Richard,
> >
> > correct. I was waiting for a comment from Ian on my concerns about
> >possible endless loops before committing the patch.
> >
> > Ian: what do you think. Do you have any data how often you iterate
> >those EAGAIN loops?
> >
> >Cheers
> >Martin
> >
> >--- [EMAIL PROTECTED] wrote:
> >
> >>Gee,
> >>
> >>I thought that was fixed with this patch:
> >>http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=50
> >>
> >>Actually, looking at 3.0.3 gmond.c, it looks like the patch did not
> >>make it into the release - that's a shame.
> >>
> >>Even looking at the patch, it looks as if it is a partial fix,
> >>because while the patched metric printing is protected like this
> >>(gmond.c, process_tcp_accept_channel):
> >><snip>
> >>          rv = print_host_metric(client, metric, now);
> >>          while(rv == EAGAIN)
> >>            {
> >>              rv = print_host_metric(client, metric, now);
> >>            }
> >>          if(rv != APR_SUCCESS)
> >>            {
> >>              goto close_accept_socket;
> >>            }
> >>        }
> >></snip>
> >>
> >>the gmetric printing in the same function is not protected:
> >><snip>
> >>
> >>      /* Send the gmetric info for this particular host */
> >>      for(metric_hi = apr_hash_first(client_context, ((Ganglia_host *)val)->gmetrics);
> >>          metric_hi;
> >>          metric_hi = apr_hash_next(metric_hi))
> >>        {
> >>          void *metric;
> >>          apr_hash_this(metric_hi, NULL, NULL, &metric);
> >>
> >>          /* Print each of the metrics from gmetric for this host... */
> >>          if(print_host_gmetric(client, metric, now) != APR_SUCCESS)
> >>            {
> >>              goto close_accept_socket;
> >>            }
> >>        }
> >>
> >>It may be best to talk to the original owner of the patch,
> >>I'm not confident to submit a patch myself, although I will try
> >>to submit a bugzilla entry.
> >>
> >>kind regards,
> >>Richard
> >>
> >>-----Original Message-----
> >>From: [EMAIL PROTECTED]
> >>[mailto:[EMAIL PROTECTED] On Behalf Of Gilad Raphaelli
> >>Sent: 14 March 2006 18:35
> >>To: ganglia-developers@lists.sourceforge.net
> >>Subject: [Ganglia-developers] RE: First prerelease of ganglia-3.0.3
> >>ready for testing
> >>
> >>I have tried the new release 3.0.3.200602231926
> >>without success on FreeBSD 4.11 - the xml is still
> >>truncated when attempting to access the data from a
> >>remote host. Interestingly, this is not the case when
> >>trying from the host running gmond. Based on the
> >>strace, my colleague commented:
> >>
> >> Default socket buffer is 64K. It appears that
> >>socket is non-blocking. That last write is failing
> >>(EAGAIN) because the socket buffer is full.
> >>The application is ignoring that fact and shutting down
> >>the socket. Looks to me like an application bug that
> >>just accidentally works on rhel.
> >>
> >> Please let me know if you need any more information.
> >>
> >>Thank you,
> >>
> >>Gil
> >>-----------------------------------------------------
> >>
> >>Running an strace on gmond (on the target host) while
> >>trying to retrieve the data shows:
> >> 71160 write(10, "<METRIC NAME=\"swap_free\" VAL=\"41"..., 124) = 124
> >> 71160 write(10, "<METRIC NAME=\"bytes_in\" VAL=\"608"..., 129) = -1 EAGAIN (Resource temporarily unavailable)
> >> 71160 shutdown(10, 0 /* receive */) = 0
> >>
> >> What this looks like from the requester (not the
> >>exact same transaction):
> >>
> >> <METRIC NAME="mem_buffers" VAL="204096" TYPE="uint32" UNITS="KB"
> >>TN="119" TMAX="180" DMAX="0" SLOPE="both" SOURCE="gmond"/>
> >> <METRIC NAME="swap_free" VAL="4194136" TYPE="uint32" UNITS="KB"
> >>TN="119" TMAX="180" DMAX="0" SLOPE="both" SOURCE="gmond"/>
> >> Connection closed by foreign host.
> >>
> >> A normal transaction closes with a closing tag: </GANGLIA_XML>
> >>_______________________________________________
> >>Ganglia-developers mailing list
> >>Ganglia-developers@lists.sourceforge.net
> >>https://lists.sourceforge.net/lists/listinfo/ganglia-developers
------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de
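
P.S. To make the delay idea from the top of this mail a bit more
concrete, here is a rough sketch. It is not the patch attached to the
bug and it is not APR code: it uses plain POSIX write() and poll() on a
non-blocking descriptor as a stand-in for apr_socket_send(), and the
names send_all_with_wait, wait_ms and max_waits are made up for
illustration. The point is only that each EAGAIN costs one short sleep
until the socket is writable again, and that the number of sleeps is
bounded, so the loop can neither spin hundreds of thousands of times nor
run forever.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not part of gmond or APR): write all of buf to a
 * possibly non-blocking descriptor. When the kernel buffer is full
 * (EAGAIN), wait in poll() for up to wait_ms milliseconds until the
 * descriptor is writable again instead of retrying immediately.
 * Gives up after max_waits waits. Returns 0 on success, -1 on failure
 * with errno set. SIGPIPE handling is deliberately left out. */
static int send_all_with_wait(int fd, const char *buf, size_t len,
                              int wait_ms, int max_waits)
{
    size_t sent = 0;
    int waits = 0;

    assert(buf != NULL);

    while (sent < len) {
        ssize_t n = write(fd, buf + sent, len - sent);

        if (n > 0) {
            sent += (size_t)n;      /* partial writes are normal */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, just retry */
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;              /* a real error, give up */

        /* Buffer full: bound the number of waits, then sleep. */
        if (++waits > max_waits) {
            errno = EAGAIN;
            return -1;
        }
        struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
        (void)poll(&pfd, 1, wait_ms);
    }
    return 0;
}
```

With something like wait_ms = 1 and max_waits in the low thousands, a
stuck client would be dropped after a few seconds at most, which seems
easier to reason about than a raw iteration limit of 750,000. Whether
the same approach behaves well under Cygwin is something I have not
tested.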