Re: [Ganglia-developers] Dynamically resizable buffer for slurpfile()
On Thu, Feb 24, 2011 at 10:44:58PM +, Kostas Georgiou wrote: On Thu, Feb 24, 2011 at 12:25:16PM +, Carlo Marcelo Arenas Belon wrote: On Wed, Feb 23, 2011 at 09:42:56AM -0800, Bernard Li wrote: 123 read: 124 read_len = read(fd, db, buflen); 125 if (read_len = 0) 126 { 127if (errno == EINTR) 128 goto read; 129err_ret(slurpfile() read() error on file %s, filename); 130close(fd); 131return SYNAPSE_FAILURE; 132 } this code is not relevant as it is only called when EINTR is received because a signal interrupts the read call (very unlikely) Shouldn't this be if (read_len 0), a return of zero from read is possible (EOF for example). If slurpfile is called with buffer=NULL and buffsize equal or a multiple of the file size then we get SYNAPSE_FAILURE. The errno check will be against an old value of errno in this which makes it more likely to hit (still very unlikely though :) and then we have an infinite loop... good point, eventhough I would like to think that errno will be reset by the read call and avoid the infinite loop anyway. Committed revision 2494 Carlo -- Free Software Download: Index, Search Analyze Logs and other IT data in Real-Time with Splunk. Collect, index and harness all the fast moving IT data generated by your applications, servers and devices whether physical, virtual or in the cloud. Deliver compliance at lower cost and gain new business insights. http://p.sf.net/sfu/splunk-dev2dev ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] Dynamically resizable buffer for slurpfile()
On Wed, Feb 23, 2011 at 05:12:03PM -0800, Bernard Li wrote: I tested under EL5 and EL6 and it was't able to get past the initial buffer size. ?I believe what I did was: Correction. It works on EL6, but not on EL5: most likely the test is just giving inconsistent results, and that is why now works in EL5, while it didn't before. [CentOS 5.5 x86_64 with kernel 2.6.18-194.32.1.el5] read(3, 2.6.18-194.32.1., 16) = 16 read(3, , 16) = 0 [RHEL6b2 x86_64 with kernel 2.6.32-37.el6.x86_64] read(3, 2.6.32-37.el6.x8, 16) = 16 read(3, 6_64\n, 16) = 5 The issue may be specific to files in /proc/sys, because I tried reading /proc/stat on CentOS 5.5 and it worked fine. very unlikely, and considering that this is some code modification you made and that you only have, the problem is most likely in your code anyway (maybe even miscompiled) Carlo -- Free Software Download: Index, Search Analyze Logs and other IT data in Real-Time with Splunk. Collect, index and harness all the fast moving IT data generated by your applications, servers and devices whether physical, virtual or in the cloud. Deliver compliance at lower cost and gain new business insights. http://p.sf.net/sfu/splunk-dev2dev ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] Dynamically resizable buffer for slurpfile()
On Wed, Feb 23, 2011 at 09:42:56AM -0800, Bernard Li wrote: what second pass? ? dummy = proc_sys_kernel_osrelease; ? rval.int32 = slurpfile(/proc/sys/kernel/osrelease, dummy, ? ? ? ? ? ? ? ? ? ? ? ? ?MAX_G_STRING_SIZE); why would anyone call slurpfile in a loop anyway?, and slurpfile doesn't call itself recursively but just reads as much data as it can into the buffer provided (second parameter). Sorry I wasn't clear, I meant the goto read loop: 123 read: 124 read_len = read(fd, db, buflen); 125 if (read_len = 0) 126 { 127if (errno == EINTR) 128 goto read; 129err_ret(slurpfile() read() error on file %s, filename); 130close(fd); 131return SYNAPSE_FAILURE; 132 } this code is not relevant as it is only called when EINTR is received because a signal interrupts the read call (very unlikely) the second conditional after that code is used to continue reading the buffer after it is resized if that is possible and that works fine as shown by your tests 136if (read_len == buflen) 137 { 138 if (dynamic) { 139 dynamic += buflen; 140 db = realloc(*buffer, dynamic); 141 *buffer = db; 142 db = *buffer + dynamic - buflen; 143 goto read; 144 } else { 145 --read_len; 146 err_msg(slurpfile() read() buffer overflow on file %s, filename); 147 } 148 } When I straced the process, the first read() was able to read up to MAX_G_STRING, however, the second read() returns 0. However, if I read a regular file (not in /proc filesystem), it was able to read the rest of the string in the second pass just fine. this just sounds to strange, but was able to replicate it after a lot of guessing in a CentOS 5 VM (both 32bit and 64bit) as shown by : # strace -e read dd if=/proc/sys/kernel/osrelease bs=16 /dev/null read(0, 2.6.18-164.9.1.e, 16) = 16 read(0, , 16) = 0 so not a ganglia problem, and just a problem with the way you were trying to use slurpfile and the way that specific sysctl handler is implemented in that version of the kernel. makes sense anyway to not worry about partial reads from a value that is meant to be used whole anyway, but interestingly enough and as you reported later it is no longer working that way with newer kernels. Regarding this particular bug -- how should we fix this? There are currently two issues: 1) The OS release is truncated in the web frontend and that is to protect the gmond process against crashes 2) The warning slurpfile() read() buffer overflow on file /proc/sys/kernel/osrelease is displayed multiple times during RPM installation (possibly because gmond was called to generate conf files etc.) that was meant to be mostly informative, but the message might need to be reworked to be more effective. Can we potentially increase MAX_G_STRING or have proc_sys_kernel_osrelease buffer size resize dynamically? no Carlo -- Free Software Download: Index, Search Analyze Logs and other IT data in Real-Time with Splunk. Collect, index and harness all the fast moving IT data generated by your applications, servers and devices whether physical, virtual or in the cloud. Deliver compliance at lower cost and gain new business insights. http://p.sf.net/sfu/splunk-dev2dev ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
Re: [Ganglia-developers] Dynamically resizable buffer for slurpfile()
On Wed, Feb 23, 2011 at 05:12:03PM -0800, Bernard Li wrote: Hi Carlo: On Wed, Feb 23, 2011 at 9:42 AM, Bernard Li bern...@vanhpc.org wrote: I tested under EL5 and EL6 and it was't able to get past the initial buffer size. I believe what I did was: Correction. It works on EL6, but not on EL5: [CentOS 5.5 x86_64 with kernel 2.6.18-194.32.1.el5] read(3, 2.6.18-194.32.1., 16) = 16 read(3, , 16) = 0 [RHEL6b2 x86_64 with kernel 2.6.32-37.el6.x86_64] read(3, 2.6.32-37.el6.x8, 16) = 16 read(3, 6_64\n, 16) = 5 The issue may be specific to files in /proc/sys, because I tried reading /proc/stat on CentOS 5.5 and it worked fine. In any case the slurpfile resizable buffer doesn't really work :( slurpfile only resizes a buffer if it's NULL but this is the case only at the first call for a metric. char *bp=NULL; slurpfile(file, bp, 32); ...next polling interval... /* the buffer isn't resibale any more since it isn't NULL */ slurpfile(file, bp, 32); Kostas -- Free Software Download: Index, Search Analyze Logs and other IT data in Real-Time with Splunk. Collect, index and harness all the fast moving IT data generated by your applications, servers and devices whether physical, virtual or in the cloud. Deliver compliance at lower cost and gain new business insights. http://p.sf.net/sfu/splunk-dev2dev ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers
[Ganglia-developers] Dynamically resizable buffer for slurpfile()
Hi all: According to the bug: http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=298 the recent patch to make slurpfile's buffer resizable doesn't seem to be working. Upon second pass, we can't seem to be able to read any additional characters from the file. It would appear that the special files in the /proc file system are not seekable? The code does work with regular files. Any ideas? Cheers, Bernard -- Free Software Download: Index, Search Analyze Logs and other IT data in Real-Time with Splunk. Collect, index and harness all the fast moving IT data generated by your applications, servers and devices whether physical, virtual or in the cloud. Deliver compliance at lower cost and gain new business insights. http://p.sf.net/sfu/splunk-dev2dev ___ Ganglia-developers mailing list Ganglia-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ganglia-developers