guys-

i just checked the XML port fix into CVS. this is the "simple" solution where we trust a timeout to prevent a client from hogging gmond. keep in mind this is a timeout per write and there are multiple writes per XML connection.
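
to make the "timeout per write" point concrete, here is a minimal sketch in plain POSIX C. it is not the gmond/APR code: set_write_timeout() and send_in_chunks() are hypothetical helpers, and SO_SNDTIMEO stands in for whatever timeout mechanism the real server uses. the assumption is a blocking socket where each send() gets its own timeout.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of gmond): give every send() on fd
 * its own timeout. A value of 0 would mean "block forever", so it is
 * rejected here. */
static int set_write_timeout(int fd, long timeout_ms)
{
    struct timeval tv;

    assert(timeout_ms > 0);
    tv.tv_sec  = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
}

/* Hypothetical helper: send len bytes as a series of chunk-sized
 * writes, the way one XML connection is served with several writes.
 * Stops at the first write that fails or times out and returns the
 * number of bytes that were actually sent. */
static size_t send_in_chunks(int fd, const char *buf, size_t len, size_t chunk)
{
    size_t sent = 0;

    while (sent < len) {
        size_t want = (len - sent < chunk) ? (len - sent) : chunk;
        ssize_t n = send(fd, buf + sent, want, 0);

        if (n <= 0)
            break;          /* timed out (EAGAIN/EWOULDBLOCK) or error */
        sent += (size_t)n;
    }
    return sent;
}
```

a client that stops reading entirely costs the server one timeout before the loop gives up. a client that keeps reading a little before each timeout expires is never cut off, so the total time for one connection can approach the number of writes multiplied by the timeout.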

the changes are so minimal that i didn't tag CVS. i basically added an (int blocking) parameter to the create_net_server() function in ./lib/apr_net.c (which is not a part of apr but our special hacks). create_tcp_server() also has a blocking flag (that it passes to create_net_server()). i've set all tcp servers (aka XML ports) to be blocking w/timeout.
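
the shape of the blocking flag can be sketched with plain POSIX calls. this is only an illustration under assumptions: make_listener() is a hypothetical stand-in for create_net_server(), it binds to loopback only, and it uses fcntl() with O_NONBLOCK where the real code uses the APR socket option shown in the diff.

```c
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for create_net_server(): plain POSIX, not APR.
 * Returns a listening TCP socket on the loopback address, or -1 on
 * failure. Non-blocking mode is switched on only when the caller asks
 * for it by passing blocking == 0. */
static int make_listener(unsigned short port, int blocking)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

    memset(&addr, 0, sizeof(addr));
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port        = htons(port);   /* 0 lets the kernel pick a port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
        listen(fd, 5) != 0) {
        close(fd);
        return -1;
    }

    if (!blocking) {
        /* This is a non-blocking server */
        int flags = fcntl(fd, F_GETFL, 0);

        if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) != 0) {
            close(fd);
            return -1;
        }
    }
    return fd;
}
```

in this sketch a TCP listener for the XML port would be created with make_listener(port, 1), while a caller that wants the old behavior passes 0, which matches how create_udp_server() passes 0 in the diff.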

please test this code and let me know what you find. i suspect that you will not find any truncated XML as you did before. thanks to the guys out there who are putting gmond through its paces. hope to find 0.00% errors. :)

also, i still want to get back to the gmetric truncated data problem... i hope to find some time soon. been *crazy* busy.

-matt




here is the diff with the previous CVS...

Index: gmond/gmond.c
===================================================================
RCS file: /cvsroot/ganglia/monitor-core/gmond/gmond.c,v
retrieving revision 1.104
diff -r1.104 gmond.c
541c541,542
< socket = create_tcp_server(pool, sock_family, port, bindaddr, interface);
---
> socket = create_tcp_server(pool, sock_family, port, bindaddr, interface,
>                  1);// blocking w/timeout
Index: lib/apr_net.c
===================================================================
RCS file: /cvsroot/ganglia/monitor-core/lib/apr_net.c,v
retrieving revision 1.13
diff -r1.13 apr_net.c
97c97,98
< create_net_server(apr_pool_t *context, int32_t ofamily, int type, apr_port_t port, char *bind_addr)
---
> create_net_server(apr_pool_t *context, int32_t ofamily, int type, apr_port_t
>            port, char *bind_addr, int blocking)
121,127c122,130
<   /* Setup to be non-blocking */
<   stat = apr_setsocketopt(sock, APR_SO_NONBLOCK, 1);
<   if (stat != APR_SUCCESS)
<     {
<       apr_socket_close(sock);
<       return NULL;
<     }
---
>   if(!blocking){
>      /* This is a non-blocking server */
>      stat = apr_setsocketopt(sock, APR_SO_NONBLOCK, 1);
>      if (stat != APR_SUCCESS)
>      {
>            apr_socket_close(sock);
>            return NULL;
>      }
>   }
169c172,173
< create_udp_server(apr_pool_t *context, int32_t family, apr_port_t port, char *bind_addr)
---
> create_udp_server(apr_pool_t *context, int32_t family, apr_port_t port, char
>            *bind_addr)
171c175
< return create_net_server(context, family, SOCK_DGRAM, port, bind_addr);
---
> return create_net_server(context, family, SOCK_DGRAM, port, bind_addr, 0);
175c179,180
< create_tcp_server(apr_pool_t *context, int32_t family, apr_port_t port, char *bind_addr, char *interface)
---
> create_tcp_server(apr_pool_t *context, int32_t family, apr_port_t port, char
>            *bind_addr, char *interface, int blocking)
177c182,183
< apr_socket_t *sock = create_net_server(context, family, SOCK_STREAM, port, bind_addr);
---
> apr_socket_t *sock = create_net_server(context, family, SOCK_STREAM, port,
>              bind_addr, blocking);
Index: lib/apr_net.h
===================================================================
RCS file: /cvsroot/ganglia/monitor-core/lib/apr_net.h,v
retrieving revision 1.7
diff -r1.7 apr_net.h
25c25,26
< create_tcp_server(apr_pool_t *context, int32_t family, apr_port_t port, char *bind, char *interface);
---
> create_tcp_server(apr_pool_t *context, int32_t family, apr_port_t port, char
>            *bind, char *interface, int blocking);



--
[EMAIL PROTECTED]
  http://massie.us



