Specifically "query" does not work: http://bugzilla.ganglia.info/cgi-bin/bugzilla/query.cgi
Thanks,
Bernard

---------- Forwarded message ----------
From: Ian Cunningham <[EMAIL PROTECTED]>
Date: Mar 28, 2007 12:12 PM
Subject: Re: [Ganglia-general] gmetad patch to contact random data_source hosts
To: "Witham, Timothy D" <[EMAIL PROTECTED]>
Cc: [EMAIL PROTECTED]

Timothy,

I have filed a bug for this exact behavior. I would direct you to attach this patch to the bug, however bugzilla seems to be down still. If the bugzilla admin is on list, please reply or fix bugzilla.

Thanks,
Ian

Witham, Timothy D wrote:
Hi,

I just had a situation where the first host in a gmetad data_source accepts the connection but offers no data, like this:

poll() timeout for [clustername] data source after 0 bytes read

Gmetad always tries the sources in order and so it just keeps getting stuck on this first one, and losing the data for the entire cluster.

Here is a quick patch that tries random hosts from the list instead, and solved my problem. It is not careful to make sure it tried them all, but if it fails it will just try again next time. If someone wants to fix it to try all the sources in a random order, that would be fine.

Perhaps this could be included in the next release unless someone knows a good reason to always try the sources in order.

Thanks!

-8<---------------------------------------------------------
diff -c -r1.1.1.1 data_thread.c
*** data_thread.c	19 Mar 2007 18:52:32 -0000	1.1.1.1
--- data_thread.c	28 Mar 2007 18:12:08 -0000
***************
*** 18,24 ****
  void *
  data_thread ( void *arg )
  {
!    int i, sleep_time, bytes_read, rval;
     data_source_list_t *d = (data_source_list_t *)arg;
     g_inet_addr *addr;
     g_tcp_socket *sock=0;
--- 18,24 ----
  void *
  data_thread ( void *arg )
  {
!    int i, j, sleep_time, bytes_read, rval;
     data_source_list_t *d = (data_source_list_t *)arg;
     g_inet_addr *addr;
     g_tcp_socket *sock=0;
***************
*** 60,75 ****
           if(d->last_good_index >= 0)
             sock = g_tcp_socket_new ( d->sources[d->last_good_index] );

!          /* If there was no good connection last time or the above connect failed then try each host in the list. */
           if(!sock)
             {
!              for(i=0; i < d->num_sources; i++)
                 {
!                  /* Find first viable source in list. */
!                  sock = g_tcp_socket_new ( d->sources[i] );
                   if( sock )
                     {
!                      d->last_good_index = i;
                       break;
                     }
                 }
--- 60,80 ----
           if(d->last_good_index >= 0)
             sock = g_tcp_socket_new ( d->sources[d->last_good_index] );

!          /* If there was no good connection last time or the above
!             connect failed then try random hosts in the list.  We try
!             random ones in case someone is accepting the connection
!             but refusing to provide any data; we don't want to get
!             stuck with a non-working host. */
           if(!sock)
             {
!              for(i=0; i < d->num_sources * 2; i++)
                 {
!                  /* Find random viable source in list. */
!                  j = d->num_sources * (rand() / (RAND_MAX - 1.0));
!                  sock = g_tcp_socket_new ( d->sources[j] );
                   if( sock )
                     {
!                      d->last_good_index = j;
                       break;
                     }
                 }
-8<----------------------------------------------------------
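
A note on the index computation in the patch, plus a rough sketch of the "try all the sources in a random order" variant mentioned above. Dividing by (RAND_MAX - 1.0) lets the ratio reach 1.0 or slightly more when rand() returns RAND_MAX - 1 or RAND_MAX, so j can come out equal to d->num_sources, one past the end of d->sources. Dividing by (RAND_MAX + 1.0) keeps the ratio strictly below 1. The code below is only an illustration, not part of gmetad: random_index, pick_source and only_source_two_answers are hypothetical names, and the try_connect callback stands in for the g_tcp_socket_new() call so the sketch compiles without the Ganglia headers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: random index in [0, n).  rand() is at most
   RAND_MAX, so rand() / (RAND_MAX + 1.0) is always below 1.0 and the
   result can never equal n. */
int
random_index ( int n )
{
   assert( n > 0 );
   return (int)( n * ( rand() / ( RAND_MAX + 1.0 ) ) );
}

/* Hypothetical helper: try every source exactly once, in a random
   order, and return the index of the first one for which
   try_connect() reports success, or -1 if none does.  In gmetad the
   callback would be replaced by the g_tcp_socket_new() call. */
int
pick_source ( int num_sources, int (*try_connect)( int ) )
{
   int i, j, tmp, found = -1;
   int *order;

   if( num_sources <= 0 )
      return -1;
   order = malloc( num_sources * sizeof *order );
   if( !order )
      return -1;

   /* Start from 0..n-1, then shuffle in place (Fisher-Yates). */
   for(i = 0; i < num_sources; i++)
      order[i] = i;
   for(i = num_sources - 1; i > 0; i--)
      {
         j = random_index( i + 1 );
         tmp = order[i];
         order[i] = order[j];
         order[j] = tmp;
      }

   /* Walk the shuffled list; stop at the first viable source. */
   for(i = 0; i < num_sources; i++)
      {
         if( try_connect( order[i] ) )
            {
               found = order[i];
               break;
            }
      }

   free( order );
   return found;
}

/* Stand-in for a real connection attempt: only source 2 answers. */
int
only_source_two_answers ( int source )
{
   return source == 2;
}
```

With four sources and the stand-in above, pick_source( 4, only_source_two_answers ) returns 2 whatever the shuffle produces, because every source gets exactly one attempt per pass. This does not by itself address a host that accepts the connection and then sends nothing; that case would still depend on not reusing last_good_index after a failed read.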