Just rolled out a build against today's cvs (as a result of Matt's
note) and saw no apparent EAGAIN issues on FreeBSD 4.11 - the full XML
stream is being returned, ~97K of data.

Thanks,

Gil

--- Ian Cunningham <[EMAIL PROTECTED]> wrote:

> Martin,
> 
> Now I have played around with the delay code; this version backs off
> with a growing delay (quadratic in the loop count rather than truly
> exponential). The mode for the number of loops was 1, which implies
> your assumption is correct. The next most frequent loop count was 22,
> so waiting a very short bit either works, or it doesn't and you have
> to wait a lot longer.
> 
> One idea we have been throwing around is to not call send() for each
> metric, but instead gather all the data for one host and then send it
> as one large buffer. We may run benchmarks on that later to see which
> is better in terms of wall-clock time.
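
The batching idea could be sketched roughly as follows: format every metric for a host into one growable buffer and issue a single send afterwards. This is only an illustration under stated assumptions. The type outbuf and the functions outbuf_appendf and outbuf_free are hypothetical names, not anything from gmond or APR, and the metric names in the usage note are placeholders.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable output buffer; not part of gmond or APR. */
typedef struct {
    char  *data;  /* NUL-terminated text, or NULL while empty */
    size_t len;   /* bytes used, excluding the trailing NUL */
    size_t cap;   /* bytes allocated */
} outbuf;

/* Append printf-style formatted text. Returns 0 on success, -1 on failure. */
int outbuf_appendf(outbuf *b, const char *fmt, ...)
{
    va_list ap;
    int need;
    size_t want;

    assert(b != NULL && fmt != NULL);

    /* First pass: measure how many bytes the formatted text needs. */
    va_start(ap, fmt);
    need = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (need < 0)
        return -1;

    /* Grow the buffer geometrically so repeated appends stay cheap. */
    want = b->len + (size_t)need + 1;
    if (want > b->cap) {
        size_t cap = b->cap ? b->cap : 1024;
        char *p;
        while (cap < want)
            cap *= 2;
        p = realloc(b->data, cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = cap;
    }

    /* Second pass: write the text at the current end of the buffer. */
    va_start(ap, fmt);
    vsnprintf(b->data + b->len, (size_t)need + 1, fmt, ap);
    va_end(ap);
    b->len += (size_t)need;
    return 0;
}

/* Release the buffer and reset it to the empty state. */
void outbuf_free(outbuf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}
```

A caller would start from an outbuf initialised to all zeros, append each metric line for the host (for example with a format such as "<METRIC NAME=\"%s\" VAL=\"%d\"/>\n"), and then hand data and len to one send call. Whether that beats one send per metric in wall-clock time is exactly what the proposed benchmark would have to show.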
> 
> Ian
> 
> /* Wraps apr_socket_send() and retries while it reports EAGAIN. */
> apr_status_t socket_send_full(apr_socket_t *sock, const char *buf,
>                               apr_size_t *len)
> {
>   apr_status_t rv;
>   int loop = 0;
>   apr_size_t start_len;
>   apr_interval_time_t t;
> 
>   /* apr_socket_send() overwrites *len with the bytes written, so
>      remember the requested length for the retries. */
>   start_len = (*len);
>   rv = apr_socket_send(sock, buf, len);
> 
>   /* Retry at most 33 times, sleeping loop^2 * 100 microseconds. */
>   while (loop++ < 33 && APR_STATUS_IS_EAGAIN(rv))
>   {
>     t = loop * loop * 100;
>     apr_sleep(t);
>     (*len) = start_len;
>     rv = apr_socket_send(sock, buf, len);
>   }
>   return rv;
> }
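
The wrapper above retries only when the whole call fails with EAGAIN. With 33 retries that each sleep loop * loop * 100 microseconds, the worst case adds up to roughly 1.25 seconds of waiting. A non-blocking socket can also accept just part of a buffer, which the wrapper does not cover. As a rough sketch only, using plain POSIX send() instead of APR, a loop that handles both cases might look like the following. The names set_nonblocking and send_all_backoff are hypothetical and do not come from the gmond source or the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <time.h>

/* Hypothetical helper: switch a descriptor to non-blocking mode. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: send all len bytes on a non-blocking socket.
 * On EAGAIN it sleeps retries^2 * 100 microseconds, mirroring the delay
 * used in the wrapper above, and gives up after max_retries consecutive
 * EAGAIN results. Returns 0 on success, -1 otherwise. */
int send_all_backoff(int fd, const char *buf, size_t len, int max_retries)
{
    size_t sent = 0;
    int retries = 0;

    assert(buf != NULL);

    while (sent < len) {
        ssize_t n = send(fd, buf + sent, len - sent, 0);

        if (n > 0) {
            sent += (size_t)n;  /* partial write: keep the progress */
            retries = 0;        /* buffer drained, restart the back-off */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted by a signal, just retry */
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)
            && retries < max_retries) {
            long usec;
            struct timespec ts;

            retries++;
            usec = (long)retries * retries * 100;
            ts.tv_sec = usec / 1000000;
            ts.tv_nsec = (usec % 1000000) * 1000;
            nanosleep(&ts, NULL);
            continue;
        }
        return -1;              /* hard error or retries exhausted */
    }
    return 0;
}
```

Resetting the retry counter after any progress is a design choice of this sketch: it bounds the wait per stall instead of per call, so a slow but steadily reading peer is not cut off.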
> 
> 
> Martin Knoblauch wrote:
> 
> >Hi Ian,
> >
> > thanks for updating the patch.
> >
> > Puuhhh. That behaviour you describe is bad indeed. Seems either
> >Cygwin or M$ is doing something stupid.
> >
> > One thought - you are calling apr_socket_send() at a high frequency
> >in that loop. Have you played with inserting some delay code in the
> >loop? Maybe waiting a ms or so would increase the chance of success?
> >
> >Cheers
> >Martin
> >
> >--- Ian Cunningham <[EMAIL PROTECTED]> wrote:
> >
> >>Martin,
> >>
> >>Non-scientific numbers here for you. Connecting to the tcp port 600
> >>times, print_host_metric() called apr_socket_send() at least 90,624
> >>times. Of those 90,624 calls, we got stuck in an EAGAIN while loop
> >>1,190 times. On average that while loop looped 29,116.66 times, with
> >>a maximum of 525,705 loops.
> >>
> >>Pretty bad in my opinion. But the workaround... works :/
> >>
> >>I have refactored all of the apr_socket_send() calls to use the
> >>workaround. I have it error out if it loops more than 750,000 times
> >>*shakes head*. I've posted a patch to the bug that seems to work; it
> >>only bombed out once in 600 tries.
> >>
> >>http://bugzilla.ganglia.info/cgi-bin/bugzilla/attachment.cgi?id=27&action=view
> >>
> >>Ian
> >>
> >>Martin Knoblauch wrote:
> >>
> >>>Hi Richard,
> >>>
> >>>correct. I was waiting for a comment from Ian on my concerns about
> >>>possible endless loops before committing the patch.
> >>>
> >>>Ian: what do you think? Do you have any data on how often you
> >>>iterate those EAGAIN loops?
> >>>
> >>>Cheers
> >>>Martin
> >>>
> >>>--- [EMAIL PROTECTED] wrote:
> >>>
> >>>>Gee,
> >>>>
> >>>>I thought that was fixed with this patch:
> >>>>http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=50
> >>>>
> >>>>Actually, looking at 3.0.3 gmond.c, it looks like the patch did not
> >>>>make it into the release - that's a shame.
> >>>>
> >>>>Even looking at the patch, it looks as if it is a partial fix,
> >>>>because while the patched metric printing is protected like this
> >>>>(gmond.c, process_tcp_accept_channel):
> >>>><snip>
> >>>>       rv = print_host_metric(client, metric, now);
> >>>>       while(rv == EAGAIN)
> >>>>       {
> >>>>         rv = print_host_metric(client, metric, now);
> >>>>       }
> >>>>       if(rv != APR_SUCCESS)
> >>>>         {
> >>>>           goto close_accept_socket;
> >>>>         }
> >>>>       }
> >>>></snip>
> >>>>
> >>>>the gmetric printing in the same function is not protected:
> >>>><snip>
> >>>>
> >>>>     /* Send the gmetric info for this particular host */
> >>>>     for(metric_hi = apr_hash_first(client_context,
> >>>>                        ((Ganglia_host *)val)->gmetrics);
> >>>>         metric_hi;
> >>>>         metric_hi = apr_hash_next(metric_hi))
> >>>>       {
> >>>>         void *metric;
> >>>>         apr_hash_this(metric_hi, NULL, NULL, &metric);
> >>>>
> >>>>         /* Print each of the metrics from gmetric for this host... */
> >>>>         if(print_host_gmetric(client, metric, now) != APR_SUCCESS)
> >>>>           {
> >>>>             goto close_accept_socket;
> >>>>           }
> >>>>       }
> >>>></snip>
> >>>>
> >>>>It may be best to talk to the original owner of the patch.
> >>>>I'm not confident enough to submit a patch myself, although I will
> >>>>try to submit a bugzilla entry.
> >>>>
> >>>>kind regards,
> >>>>Richard
> >>>>
> >>>>-----Original Message-----
> >>>>From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of
> >>>>Gilad Raphaelli
> >>>>Sent: 14 March 2006 18:35
> >>>>To: ganglia-developers@lists.sourceforge.net
> >>>>Subject: [Ganglia-developers] RE: First prerelease of ganglia-3.0.3
> >>>>ready for testing
> >>>>
> >>>>
> >>>>I have tried the new release 3.0.3.200602231926 without success on
> >>>>FreeBSD 4.11 - the xml is still truncated when attempting to access
> >>>>the data from a remote host.  Interestingly, this is not the case
> >>>>when trying from the host running gmond.  Based on the strace, my
> >>>>colleague commented:
> >>>>
> >>>> Default socket buffer is 64K.  It appears that the socket is
> >>>> non-blocking.  That last write is failing (EAGAIN) because the
> >>>> socket buffer is full.  The application is ignoring that fact and
> >>>> shutting down the socket.  Looks to me like an application bug
> >>>> that just accidentally works on rhel.
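
The diagnosis above (a full socket buffer on a non-blocking socket, reported as EAGAIN) suggests an alternative to sleeping and retrying: wait until the kernel reports room in the send buffer. The following is a minimal sketch of that approach with POSIX poll(), not the fix that went into gmond. The function name send_all_poll is hypothetical, and the sketch assumes the MSG_DONTWAIT flag is available (it is on Linux and the BSDs) so that each send() is non-blocking regardless of the socket's own mode.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: send all len bytes without blocking in send().
 * When the socket buffer is full (EAGAIN), wait with poll() until the
 * socket is writable again. timeout_ms bounds each individual wait.
 * Returns 0 on success, -1 on failure; a wait that times out sets
 * errno to ETIMEDOUT. */
int send_all_poll(int fd, const char *buf, size_t len, int timeout_ms)
{
    size_t sent = 0;

    assert(buf != NULL);

    while (sent < len) {
        /* MSG_DONTWAIT is an assumption of this sketch, see above. */
        ssize_t n = send(fd, buf + sent, len - sent, MSG_DONTWAIT);

        if (n > 0) {
            sent += (size_t)n;      /* partial writes are expected */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            struct pollfd pfd;
            int rc;

            pfd.fd = fd;
            pfd.events = POLLOUT;
            pfd.revents = 0;
            rc = poll(&pfd, 1, timeout_ms);
            if (rc > 0)
                continue;           /* writable, or send() reports the error */
            if (rc < 0 && errno == EINTR)
                continue;
            if (rc == 0)
                errno = ETIMEDOUT;  /* peer did not drain the buffer in time */
            return -1;
        }
        return -1;                  /* any other error */
    }
    return 0;
}
```

Compared with a fixed or growing sleep, this wakes up as soon as the peer has read enough data, and the timeout gives a clear upper bound per stall rather than a loop count.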
> >>>>
> >>>> Please let me know if you need any more information.
> >>>>
> >>>>Thank you,
> >>>>
> >>>>Gil
> >>>>-----------------------------------------------------
> >>>>
> >>>>Running an strace on gmond (on the target host) while trying to
> >>>>retrieve the data shows:
> >>>>71160 write(10, "<METRIC NAME=\"swap_free\" VAL=\"41"..., 124) = 124
> >>>>71160 write(10, "<METRIC NAME=\"bytes_in\" VAL=\"608"..., 129) = -1 EAGAIN (Resource temporarily unavailable)
> >>>>71160 shutdown(10, 0 /* receive */)     = 0
> >>>>
> >>>>What this looks like from the requester (not the exact same
> >>>>transaction):
> >>>>
> >>>><METRIC NAME="mem_buffers" VAL="204096" TYPE="uint32" UNITS="KB"
> >>>>TN="119" TMAX="180" DMAX="0" SLOPE="both" SOURCE="gmond"/>
> >>>><METRIC NAME="swap_free" VAL="4194136" TYPE="uint32" UNITS="KB"
> >>>>TN="119" TMAX="180" DMAX="0" SLOPE="both" SOURCE="gmond"/>
> >>>>Connection closed by foreign host.
> >>>>
> >>>>A normal transaction closes with a closing tag: </GANGLIA_XML>
> >>>>
> >>>>__________________________________________________
> >>>>Do You Yahoo!?
> >>>>Tired of spam?  Yahoo! Mail has the best spam protection around
> >>>>http://mail.yahoo.com
> >>>>
> >>>>-------------------------------------------------------
> >>>>This SF.Net email is sponsored by xPML, a groundbreaking scripting
> >>>>language that extends applications into web and mobile media.
> >>>>Attend the live webcast and join the prime developer group breaking
> >>>>into this new coding territory!
> >>>>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
> >>>>_______________________________________________
> >>>>Ganglia-developers mailing list
> >>>>Ganglia-developers@lists.sourceforge.net
> >>>>https://lists.sourceforge.net/lists/listinfo/ganglia-developers
> >>>>
> >>>------------------------------------------------------
> >>>Martin Knoblauch
> >>>email: k n o b i AT knobisoft DOT de
> >>>www:   http://www.knobisoft.de
> >>>
> 

