Hi Ian,

 thanks for updating the patch.

 Puuhhh. That behaviour you describe is bad indeed. Seems either Cygwin
or M$ are doing something stupid.

 One thought - you are calling apr_socket_send() at a high frequency in
that loop. Have you played with inserting some delay code in the loop?
Maybe waiting a ms or so would increase the chance of success?
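
 For what it's worth, here is a rough sketch of the kind of delay I have
in mind. It is not taken from gmond or from the patch: try_send_fn,
send_with_delay(), sleep_ms() and the fake_sock stand-in are all made-up
names, and the callback is only assumed to wrap a single
apr_socket_send() attempt and hand back EAGAIN when the socket buffer is
full. It uses plain POSIX nanosleep() so that it compiles without APR.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: one non-blocking send attempt.
 * Returns 0 on success, EAGAIN if the socket buffer is full, or
 * another errno-style code on a hard failure. In gmond this would
 * presumably wrap apr_socket_send() (assumption, not from the patch). */
typedef int (*try_send_fn)(void *ctx, const char *buf, size_t len);

/* Sleep for the given number of milliseconds using POSIX nanosleep(). */
static void sleep_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Retry a send while it reports EAGAIN, pausing delay_ms between
 * attempts and giving up after max_retries retries. Returns the final
 * status: 0 on success, EAGAIN if the retries ran out, or the first
 * hard error. */
static int send_with_delay(try_send_fn try_send, void *ctx,
                           const char *buf, size_t len,
                           long delay_ms, unsigned max_retries)
{
    unsigned retries = 0;
    int rv;

    assert(try_send != NULL);

    rv = try_send(ctx, buf, len);
    while (rv == EAGAIN && retries < max_retries) {
        sleep_ms(delay_ms);   /* give the peer time to drain the buffer */
        retries++;
        rv = try_send(ctx, buf, len);
    }
    return rv;
}

/* Stand-in for a socket (illustration only): reports EAGAIN a fixed
 * number of times, then succeeds. It also counts the attempts made. */
struct fake_sock {
    unsigned eagain_left;
    unsigned calls;
};

static int fake_try_send(void *ctx, const char *buf, size_t len)
{
    struct fake_sock *s = ctx;
    (void)buf;
    (void)len;
    s->calls++;
    if (s->eagain_left > 0) {
        s->eagain_left--;
        return EAGAIN;
    }
    return 0;
}
```

 With a 1 ms delay, a cap of a few thousand retries would bound the wait
to a few seconds instead of spinning up to 750,000 times. Whether that
actually raises the success rate on Cygwin is something only a
measurement can tell.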

Cheers
Martin

--- Ian Cunningham <[EMAIL PROTECTED]> wrote:

> Martin,
> 
> Non-scientific numbers here for you. Connecting to the TCP port 600
> times, print_host_metric() called apr_socket_send() at least 90,624
> times. Of those 90,624 times, we got stuck in an EAGAIN while loop
> 1,190 times. On average that while loop looped 29,116.66 times, with
> a maximum of 525,705 loops.
> 
> Pretty bad in my opinion. But the workaround... works :/
> 
> I have refactored all of the apr_socket_send() calls to use the
> workaround. I have it error out if it loops more than 750,000 times
> *shakes head*. I've posted a patch to the bug that seems to work; it
> only bombed out once in 600 tries.
> 
> http://bugzilla.ganglia.info/cgi-bin/bugzilla/attachment.cgi?id=27&action=view
> 
> Ian
> 
> Martin Knoblauch wrote:
> 
> >Hi Richard,
> >
> > correct. I was waiting for a comment from Ian on my concerns about
> >possible endless loops before committing the patch.
> >
> > Ian: what do you think? Do you have any data on how often you iterate
> >those EAGAIN loops?
> >
> >Cheers
> >Martin
> >
> > 
> >--- [EMAIL PROTECTED] wrote:
> >
> >  
> >
> >>Gee,
> >>
> >>I thought that was fixed with this patch:
> >>http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=50
> >>
> >>Actually, looking at 3.0.3 gmond.c, it looks like the patch did not
> >>make it into the release - that's a shame.
> >>
> >>Even looking at the patch, it looks as if it is a partial fix,
> >>because while the patched metric printing is protected like this
> >>(gmond.c, process_tcp_accept_channel):
> >><snip>
> >>        rv = print_host_metric(client, metric, now);
> >>        while(rv == EAGAIN)
> >>        {
> >>          rv = print_host_metric(client, metric, now);
> >>        }
> >>          if(rv != APR_SUCCESS)
> >>            {
> >>              goto close_accept_socket;
> >>            }
> >>        }
> >></snip>
> >>
> >>the gmetric printing in the same function is not protected:
> >><snip>
> >>
> >>      /* Send the gmetric info for this particular host */
> >>      for(metric_hi = apr_hash_first(client_context, ((Ganglia_host *)val)->gmetrics);
> >>          metric_hi;
> >>          metric_hi = apr_hash_next(metric_hi))
> >>        {
> >>          void *metric;
> >>          apr_hash_this(metric_hi, NULL, NULL, &metric);
> >>
> >>          /* Print each of the metrics from gmetric for this host... */
> >>          if(print_host_gmetric(client, metric, now) != APR_SUCCESS)
> >>            {
> >>              goto close_accept_socket;
> >>            }
> >>        }
> >></snip>
> >>
> >>It may be best to talk to the original owner of the patch.
> >>I'm not confident enough to submit a patch myself, although I will
> >>try to submit a Bugzilla entry.
> >>
> >>kind regards,
> >>Richard
> >>
> >>-----Original Message-----
> >>From: [EMAIL PROTECTED]
> >>[mailto:[EMAIL PROTECTED] On Behalf Of Gilad Raphaelli
> >>Sent: 14 March 2006 18:35
> >>To: ganglia-developers@lists.sourceforge.net
> >>Subject: [Ganglia-developers] RE: First prerelease of ganglia-3.0.3
> >>ready for testing
> >>
> >>
> >>I have tried the new release 3.0.3.200602231926
> >>without success on FreeBSD 4.11 - the xml is still
> >>truncated when attempting to access the data from a
> >>remote host.  Interestingly, this is not the case when
> >>trying from the host running gmond.  Based on the
> >>strace, my colleague commented:
> >>
> >>  Default socket buffer is 64K.  It appears that
> >>socket is non-blocking.  That last write is failing
> >>(EAGAIN) because the socket buffer is full.  The
> >>application is ignoring that fact and shutting down
> >>the socket.  Looks to me like an application bug that
> >>just accidentally works on rhel.
> >>
> >>  Please let me know if you need any more information.
> >>
> >>Thank you,
> >>
> >>Gil
> >>-----------------------------------------------------
> >>
> >>Running an strace on gmond (on the target host) while
> >>trying to retrieve the data shows:
> >> 71160 write(10, "<METRIC NAME=\"swap_free\" VAL=\"41"..., 124) = 124
> >> 71160 write(10, "<METRIC NAME=\"bytes_in\" VAL=\"608"..., 129) = -1 EAGAIN (Resource temporarily unavailable)
> >> 71160 shutdown(10, 0 /* receive */)     = 0
> >>
> >> What this looks like from the requester (not the
> >>exact same transaction):
> >>
> >> <METRIC NAME="mem_buffers" VAL="204096" TYPE="uint32" UNITS="KB" TN="119" TMAX="180" DMAX="0" SLOPE="both" SOURCE="gmond"/>
> >> <METRIC NAME="swap_free" VAL="4194136" TYPE="uint32" UNITS="KB" TN="119" TMAX="180" DMAX="0" SLOPE="both" SOURCE="gmond"/>
> >> Connection closed by foreign host.
> >>
> >> A normal transaction closes with a closing tag: </GANGLIA_XML>
> >>
> >>_______________________________________________
> >>Ganglia-developers mailing list
> >>Ganglia-developers@lists.sourceforge.net
> >>https://lists.sourceforge.net/lists/listinfo/ganglia-developers
> >
> >
> >------------------------------------------------------
> >Martin Knoblauch
> >email: k n o b i AT knobisoft DOT de
> >www:   http://www.knobisoft.de
> >
> >  
> >
> 


------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www:   http://www.knobisoft.de
