> compile your prog with 'gcc -pg ...'. When you run the prog, a
> gmon.out file will be created, which can be processed by gprof to list
> called routines, and time spent in routines.
>
> $ gcc -g -pg src.c -o exe
> $ ./exe
> $ gprof exe gmon.out

Cool. I tried it but it busted on a "select". I will play with this more in the future.

I may have worked around the problem. I have a proprietary wrapper around datagrams as they pass through the box. I added 6 new fields to the wrapper as follows:

    struct timeval timeStampA;
    struct timeval timeStampB;
    struct timeval timeStampC;
    struct timeval timeStampD;
    struct timeval timeStampE;
    struct timeval timeStampF;

I loaded the fields with gettimeofday() after the recvfrom on entry and before the sendto on exit from each of the 3 daemons in my app. I found the D to E transfer in one direction was consistently taking about 1/2 minute.

After substituting code from a "fast" daemon into the "slow" daemon and gnashing my teeth in frustration for a while, I noticed that the fast daemon had a blocking select (wait-forever). The "slow" daemon used a polling select (no-wait). I changed the slow daemon's select to wait for 10000 usecs. Voila. Overall transfer time goes down dramatically.

I went through the entire code body and changed every polling select into a select that waits for 10000 usecs. The change is dramatic. Before the change, with LOTS of syslogging, I got delays across the 2 boxes of about 30 secs, and with almost no syslogging I got 0.9 sec delays. With the new change I get delays of 0.02 to 0.2 secs with LOTS of syslogging. These numbers are much more reasonable. (Great sigh of relief.)

This result does not seem consistent with how I read Stevens' explanation of no-wait select in section 5.6 of Unix Network Programming Vol 1. I am using a 2.2.14 kernel.

Mike
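
For anyone following along, here is a minimal sketch of the two pieces described above: taking the difference between two gettimeofday() stamps, and a select that waits up to 10000 usecs instead of polling. The names elapsed_usec and wait_readable are made up for illustration and are not the code from the daemons; the sketch also assumes a single descriptor is being watched for reading.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Microseconds elapsed from *start to *end (e.g. timeStampD to timeStampE).
 * Hypothetical helper, not part of the original wrapper. */
long elapsed_usec(const struct timeval *start, const struct timeval *end)
{
    return (long)(end->tv_sec - start->tv_sec) * 1000000L
         + (long)(end->tv_usec - start->tv_usec);
}

/* Wait until fd is readable or timeout_usec has passed.
 * Returns >0 if readable, 0 on timeout, -1 on error (errno set by select).
 * timeout_usec == 0 gives the polling (no-wait) behaviour.
 * Hypothetical helper; assumes only one descriptor is of interest. */
int wait_readable(int fd, long timeout_usec)
{
    fd_set readfds;
    struct timeval tv;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    /* Fill in the timeout on every call: Linux select() may modify it. */
    tv.tv_sec = timeout_usec / 1000000L;
    tv.tv_usec = timeout_usec % 1000000L;

    return select(fd + 1, &readfds, NULL, NULL, &tv);
}
```

A daemon loop would call wait_readable(sock, 10000) before recvfrom, and log elapsed_usec(&timeStampD, &timeStampE) after the fact. One plausible reading of the speedup, offered as a guess rather than a diagnosis: a no-wait select called in a tight loop never sleeps, so the polling daemon can eat the CPU that the other daemons and syslogd need, while a short timeout lets the process give up the processor between checks.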
