Re: readv() questions

2006-05-09 Thread Mark Pizzolato

On Tuesday, May 09, 2006 12:44 AM clayne wrote:


[...]

My actual readv() wrapping code is very basic and standard, so I don't think
it's doing anything evil or causing a problem:

    size_t n_recv_iov(int s, const struct iovec *v, size_t c, int tout)
    {
        size_t  br;
        int res;
        struct timeval  to;
        fd_set  fds, fds_m;

        FD_ZERO(&fds_m);
        FD_SET(s, &fds_m);

        while (1) {
            fds = fds_m;
            to.tv_sec = tout;
            to.tv_usec = 0;

            if ((br = readv(s, v, c)) == (size_t)-1) {
                switch (errno) {
                case EWOULDBLOCK:
                case EINTR:
                    break;
                default:
                    perror("readv");
                    return -1;
                }
            } else {
                break;
            }

            if ((res = select(s + 1, &fds, NULL, NULL, &to)) == 0)
                return -1; /* timeout */
            else if (res == -1) {
                perror("select");
                return -1; /* never happen */
            }
        }

        return br;
    }

And my call to it is basic as well:

    IOV_SET(&packet[0], &byte_tl, sizeof(byte_tl));
    IOV_SET(&packet[1], &byte_vl, sizeof(byte_vl));
    IOV_SET(&packet[2], &byte_flags, sizeof(byte_flags));
    IOV_SET(&packet[3], &nbo_s, sizeof(nbo_s));
    IOV_SET(&packet[4], &nbo_t_onl, sizeof(nbo_t_onl));
    IOV_SET(&packet[5], &nbo_t_ofl, sizeof(nbo_t_ofl));

    for (error = 0; !error; ) {
        error = 1;

        if ((hl = n_recv_iov(s, packet, NE(packet), 60)) == (size_t)-1)
            break;

        assert(byte_vl < sizeof(byte_var));

        if ((vl = n_recv(s, byte_var, byte_vl, 60)) == (size_t)-1)
            break;
        if (hl == 0 || vl == 0)
            break;

        error = 0;

        /* process_data(); */
    }

Sorry for the ultra mail, but I know for a fact that readv() on cygwin is
doing bad things when faced with a lot of data to read from the wire. Any
insights?


Well, to me this looks like a variation on the classic error made when
coding applications which use TCP: specifically, assuming that there is a
1<->1 correspondence between sends (write, writev, etc.) on the sending side
and receives (read, readv, etc.) on the receiving side.  TCP makes no such
guarantee.  It only guarantees that the bytes put in on the sending side of
the connection will come out in the same order on the receiving side.  There
is no guarantee about the size of the respective reads of the data
delivered.  If you are expecting to receive a data element of a certain
size, the burden is yours to make sure you get as much data as you expect,
and to continue reading until you have all of it.


Your code does not seem to do anything to validate that the length of the
data returned by readv is indeed what you expected.
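
To illustrate the "continue reading" part, here is a minimal sketch of a loop
that keeps calling readv() until every buffer is full.  It assumes a blocking
descriptor; readv_full is a hypothetical helper name, not something from the
code quoted above, and a non-blocking socket would additionally need the
select() handling the original wrapper already has.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (an assumption, not from the original post):
 * call readv() repeatedly until all buffers in iov[] are full.
 * Returns the total number of bytes read; a total shorter than the
 * sum of the buffer lengths means the peer closed the connection.
 * Returns -1 on error.  The iov array is modified as it advances. */
ssize_t readv_full(int fd, struct iovec *iov, int iovcnt)
{
    ssize_t total = 0;

    while (iovcnt > 0) {
        ssize_t n = readv(fd, iov, iovcnt);
        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted, just retry */
            return -1;
        }
        if (n == 0)
            break;                  /* end of stream */
        total += n;

        /* Step past every buffer this call filled completely. */
        size_t left = (size_t)n;
        while (iovcnt > 0 && left >= iov->iov_len) {
            left -= iov->iov_len;
            iov++;
            iovcnt--;
        }
        /* Adjust a partially filled buffer so the next call resumes in it. */
        if (iovcnt > 0) {
            iov->iov_base = (char *)iov->iov_base + left;
            iov->iov_len -= left;
        }
    }
    return total;
}
```

The caller would then compare the return value against the total header
size before trusting any of the fields, since a short count means the
header is incomplete.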


- Mark Pizzolato 




--
Unsubscribe info:  http://cygwin.com/ml/#unsubscribe-simple
Problem reports:   http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ:   http://cygwin.com/faq/



Re: Multi Threaded programs deadlock doing simple I/O operations

2005-06-19 Thread Mark Pizzolato

On Sunday, June 12, 2005 at 5:37 PM, Mark Pizzolato wrote:

On Friday, June 10, 2005 at 3:44 PM, Mark Pizzolato wrote:

> On Thursday, June 09, 2005 at 6:12 PM, Mark Pizzolato wrote:
>> On Thursday, June 09, 2005 at 3:35 PM, Christopher Faylor wrote:
>> > On Wed, Jun 08, 2005 at 05:43:59PM -0700, Mark Pizzolato wrote:
>> > >There is a serious problem for multi threaded programs doing simple I/O
>> > >operations in cygwin (open, dup, fdopen, fclose, and close).
>> > >
>> > >The attached 81 line test program clearly demonstrates the issue (by
>> > >hanging and no longer consuming CPU or performing any I/O operations).
>> >
>> > Thanks for the relatively small test case.  That was enough to track the
>> > problem down.  I'm generating a new snapshot with a fix for this
>> > problem.
>>
>> The snapshot looks good!
>>
>> This fixes the stability problems with clamav's clamd that I've been chasing
>> for a long time.
>
> Some more follow up here...I'm running with the 20050609 snapshot dll.
>
> clamav's clamd now runs better than it has ever for me on cygwin.
>
>   until "it doesn't",
>
> once it starts to run poorly it won't run cleanly again until I reboot the system
> (I haven't actually tried after merely exiting all processes ..)


Well, I spoke too soon here.  There may be some interaction with many
recently closed tcp sessions sitting in TIME_WAIT.  I'm not sure, but
after some time, I can restart and experience apparently good behavior and
then things get "poor" as described.


If I run with the 20050607 snapshot, the new "poor" behavior doesn't
happen, while the test program I provided earlier in this thread hangs as
described. So, the fix to the original problem and the new "poor" behavior
are clearly related to changes between the 20050607 and the 20050609
snapshots.



> To be more specific about the "poor" behavior:
>
> - pthread_mutex_unlock fails leaving errno with a value of 90.  This is
> in a place where there is only one path through about a dozen lines of
> code and the mutex is definitely locked.  There may have been a call to
> pthread_create, and a definite call to pthread_cond_signal.
> - once the above error happens, calls (by the same thread) to accept()
> fail using a file descriptor which we've been successfully using all
> along and only close when the program exits.
>
> so some change introduced recently (since 1.5.17-1), and possibly in
> 20050609 fixes the dup() issue but now mutex operations are failing in
> strange ways.
>
> Sorry not to have a simple isolated test case for this.  The good news
> is that once it breaks it won't run correctly again until a reboot.


I'm working on a test program to recreate this behavior.


Well...  The problem wasn't in cygwin.

As it happens in clamav's clamd there were several pthread_mutex_t objects
which weren't initialized to reasonable values (i.e. left to be zero instead
of PTHREAD_MUTEX_INITIALIZER).  Calls to pthread_mutex_lock and
pthread_mutex_unlock on the uninitialized objects, depending on timing and
sequence, apparently confused some aspect of mutex processing, causing
other calls to pthread_mutex_lock and pthread_mutex_unlock to fail in
strange ways.
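
For reference, a minimal sketch of the two usual ways to initialize a mutex
properly.  The names (counter_lock, struct job_queue and its functions) are
made up for illustration and are not taken from clamd or from the patches.
Note that the pthread_mutex_* calls report failure through their return
value, so that is what the sketch checks.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Statically allocated mutex: initialize it with the macro. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter;

/* Returns the next counter value under the statically initialized lock. */
int counter_next(void)
{
    int value;

    pthread_mutex_lock(&counter_lock);
    value = ++counter;
    pthread_mutex_unlock(&counter_lock);
    return value;
}

/* Hypothetical structure with an embedded mutex.  Such a mutex must be
 * initialized at run time before its first use. */
struct job_queue {
    pthread_mutex_t lock;
    int pending;
};

/* Returns 0 on success, otherwise the error number from pthread_mutex_init. */
int job_queue_init(struct job_queue *q)
{
    q->pending = 0;
    return pthread_mutex_init(&q->lock, NULL);
}

/* Adds one job; returns 0 on success, otherwise a pthread error number. */
int job_queue_add(struct job_queue *q)
{
    int rc = pthread_mutex_lock(&q->lock);
    if (rc != 0) {
        fprintf(stderr, "pthread_mutex_lock: %s\n", strerror(rc));
        return rc;
    }
    q->pending++;
    rc = pthread_mutex_unlock(&q->lock);
    if (rc != 0)
        fprintf(stderr, "pthread_mutex_unlock: %s\n", strerror(rc));
    return rc;
}
```

Checking the return codes this way tends to surface an uninitialized mutex
close to where it is first used, rather than as an unrelated failure later.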

Appropriate patches have been submitted to the clamav team.

- Mark Pizzolato 






Re: Multi Threaded programs deadlock doing simple I/O operations

2005-06-12 Thread Mark Pizzolato

On Friday, June 10, 2005 at 3:44 PM, Mark Pizzolato wrote:

> On Thursday, June 09, 2005 at 6:12 PM, Mark Pizzolato wrote:
>> On Thursday, June 09, 2005 at 3:35 PM, Christopher Faylor wrote:
>> > On Wed, Jun 08, 2005 at 05:43:59PM -0700, Mark Pizzolato wrote:
>> > >There is a serious problem for multi threaded programs doing simple I/O
>> > >operations in cygwin (open, dup, fdopen, fclose, and close).
>> > >
>> > >The attached 81 line test program clearly demonstrates the issue (by
>> > >hanging and no longer consuming CPU or performing any I/O operations).
>> >
>> > Thanks for the relatively small test case.  That was enough to track the
>> > problem down.  I'm generating a new snapshot with a fix for this
>> > problem.
>>
>> The snapshot looks good!
>>
>> This fixes the stability problems with clamav's clamd that I've been chasing
>> for a long time.
>
> Some more follow up here...I'm running with the 20050609 snapshot dll.
>
> clamav's clamd now runs better than it has ever for me on cygwin.
>
>   until "it doesn't",
>
> once it starts to run poorly it won't run cleanly again until I reboot the system
> (I haven't actually tried after merely exiting all processes ..)


Well, I spoke too soon here.  There may be some interaction with many
recently closed tcp sessions sitting in TIME_WAIT.  I'm not sure, but after
some time, I can restart and experience apparently good behavior and then
things get "poor" as described.


If I run with the 20050607 snapshot, the new "poor" behavior doesn't happen,
while the test program I provided earlier in this thread hangs as described.
So, the fix to the original problem and the new "poor" behavior are clearly
related to changes between the 20050607 and the 20050609 snapshots.



> To be more specific about the "poor" behavior:
>
> - pthread_mutex_unlock fails leaving errno with a value of 90.  This is
> in a place where there is only one path through about a dozen lines of
> code and the mutex is definitely locked.  There may have been a call to
> pthread_create, and a definite call to pthread_cond_signal.
> - once the above error happens, calls (by the same thread) to accept()
> fail using a file descriptor which we've been successfully using all
> along and only close when the program exits.
>
> so some change introduced recently (since 1.5.17-1), and possibly in
> 20050609 fixes the dup() issue but now mutex operations are failing in
> strange ways.
>
> Sorry not to have a simple isolated test case for this.  The good news
> is that once it breaks it won't run correctly again until a reboot.


I'm working on a test program to recreate this behavior.

- Mark Pizzolato 






Re: Multi Threaded programs deadlock doing simple I/O operations

2005-06-10 Thread Mark Pizzolato

On Thursday, June 09, 2005 at 6:12 PM, Mark Pizzolato wrote:

On Thursday, June 09, 2005 at 3:35 PM, Christopher Faylor wrote:
> On Wed, Jun 08, 2005 at 05:43:59PM -0700, Mark Pizzolato wrote:
> >There is a serious problem for multi threaded programs doing simple I/O
> >operations in cygwin (open, dup, fdopen, fclose, and close).
> >
> >The attached 81 line test program clearly demonstrates the issue (by
> >hanging and no longer consuming CPU or performing any I/O operations).
>
> Thanks for the relatively small test case.  That was enough to track the
> problem down.  I'm generating a new snapshot with a fix for this
> problem.

The snapshot looks good!

This fixes the stability problems with clamav's clamd that I've been chasing
for a long time.


Some more follow up here...I'm running with the 20050609 snapshot dll.

clamav's clamd now runs better than it has ever for me on cygwin.

  until "it doesn't",

once it starts to run poorly it won't run cleanly again until I reboot the system
(I haven't actually tried after merely exiting all processes ..)

To be more specific about the "poor" behavior:


- pthread_mutex_unlock fails leaving errno with a value of 90.  This is in
a place where there is only one path through about a dozen lines of code and
the mutex is definitely locked.  There may have been a call to
pthread_create, and a definite call to pthread_cond_signal.
- once the above error happens, calls (by the same thread) to accept() fail
using a file descriptor which we've been successfully using all along and
only close when the program exits.


so some change introduced recently (since 1.5.17-1), and possibly in
20050609 fixes the dup() issue but now mutex operations are failing in
strange ways.


Sorry not to have a simple isolated test case for this.  The good news is
that once it breaks it won't run correctly again until a reboot.


Ideas?

Thanks.

- Mark Pizzolato 






Re: Multi Threaded programs deadlock doing simple I/O operations

2005-06-09 Thread Mark Pizzolato

On Thursday, June 09, 2005 at 3:35 PM, Christopher Faylor wrote:

On Wed, Jun 08, 2005 at 05:43:59PM -0700, Mark Pizzolato wrote:
>There is a serious problem for multi threaded programs doing simple I/O
>operations in cygwin (open, dup, fdopen, fclose, and close).
>
>The attached 81 line test program clearly demonstrates the issue (by
>hanging and no longer consuming CPU or performing any I/O operations).

Thanks for the relatively small test case.  That was enough to track the
problem down.  I'm generating a new snapshot with a fix for this
problem.


The snapshot looks good!

This fixes the stability problems with clamav's clamd that I've been chasing 
for a long time.


Thanks.

- Mark Pizzolato 






Multi Threaded programs deadlock doing simple I/O operations

2005-06-08 Thread Mark Pizzolato
There is a serious problem for multi threaded programs doing simple I/O 
operations in cygwin (open, dup, fdopen, fclose, and close).


The attached 81 line test program clearly demonstrates the issue (by hanging 
and no longer consuming CPU or performing any I/O operations).


I'm sure that anyone who ever encountered a strange hang in any program
running under cygwin would appreciate a fix for this issue.


- Mark Pizzolato

#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

pthread_mutex_t log_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Serialize log output from all threads through one mutex. */
void
logit(const char *fmt, ...) {
    va_list args;
    char buf[1024];
    int bytes;

    buf[sizeof(buf)-1] = '\0';
    va_start(args, fmt);
    bytes = vsnprintf(buf, sizeof(buf)-1, fmt, args);
    va_end(args);
    pthread_mutex_lock(&log_mutex);
    printf("%d:", pthread_self());
    printf("%s", buf);
    pthread_mutex_unlock(&log_mutex);
}

struct TestIoInfo {
    int Iterations;
    int Progress;
};

/* Each thread repeatedly runs open, dup, fdopen, fclose, close on its own file. */
void *
TestIoThread (void *arg) {
    struct TestIoInfo *t = (struct TestIoInfo *)arg;
    int i, j;
    int fd, fdd;
    char FileName[255];
    FILE *f;

    logit("IO Thread %d starting...\n", pthread_self());
    snprintf(FileName, sizeof(FileName), "/tmp/TestIoThread-%d-%x",
             getpid(), pthread_self());
    sleep(1);
    for (j=0; j<t->Iterations; ++j) {
        if ((fd = open(FileName, O_RDWR|O_CREAT|O_TRUNC|O_BINARY,
                       S_IRWXU)) < 0) {
            logit("Error Opening File: %s - %d\n", FileName, errno);
            return NULL;
        }
        fdd = dup(fd);
        if ((f = fdopen(fdd, "rb")) == NULL) {
            logit("Can't open descriptor %d - %d\n", fd, errno);
            return NULL;
        }
        fclose(f);
        close(fd);
        if (0 == (j%t->Progress)) {
            logit("IO Thread %d - %d\n", pthread_self(), j);
        }
    }
    unlink(FileName);
    logit("IO Thread %d done.\n", pthread_self());
    return NULL;
}

main (int argc, char ** argv) {
    int threadcount = 4;
    int progress = 1;
    pthread_t tid[10];
    int i;
    struct TestIoInfo IoInfo;

    logit("Testing with %d concurrent threads\n", threadcount);
    logit("Progress indicated every %d operations...\n", progress);
    IoInfo.Iterations = 200;
    IoInfo.Progress = progress;
    for (i=0; i [...]

gdb threading troubles

2005-06-08 Thread Mark Pizzolato
The attached example test program runs to completion when run directly, but 
spins infinitely when run under gdb.


I'm compiling with:

gcc -g -O0 mutexttest.c -o mutexttest

running under:
   cygwin1.5.17-1
   gdb 20041228-3 

#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

pthread_mutex_t log_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Serialize log output from all threads through one mutex. */
void
logit(const char *fmt, ...) {
    va_list args;
    char buf[1024];
    int bytes;

    buf[sizeof(buf)-1] = '\0';
    va_start(args, fmt);
    bytes = vsnprintf(buf, sizeof(buf)-1, fmt, args);
    va_end(args);
    pthread_mutex_lock(&log_mutex);
    printf("%s", buf);
    pthread_mutex_unlock(&log_mutex);
}

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

struct TestInfo {
    void (*Lock)(void);
    void (*Unlock)(void);
    int Iterations;
    int Data;
    int Increment;
};

void
NoLock(void) {
}

void
NoUnlock(void) {
}

void
PthreadLock(void) {
    pthread_mutex_lock(&mutex);
}

void
PthreadUnlock(void) {
    pthread_mutex_unlock(&mutex);
}

/* Each thread adds Increment to the shared Data under the chosen lock routines. */
void *
TestThread (void *arg) {
    struct TestInfo *t = (struct TestInfo *)arg;
    int i;
    int *pData = &t->Data;

    logit("Thread %d starting...\n", pthread_self());
    sleep(1);
    for (i=0; i<t->Iterations; ++i) {
        int tmp;
        int trand;
        t->Lock();
        srand(46);
        trand = rand();
        srand(46);
        tmp = t->Data;
        tmp = tmp + trand - rand();
        tmp = tmp + t->Increment;
        t->Data = tmp;
        t->Unlock();
    }
    logit("Thread %d done.\n", pthread_self());
    return NULL;
}

main (int argc, char ** argv) {
    int threadcount = 10;
    pthread_t tid[10];
    int i;
    struct TestInfo Info;

    Info.Iterations = 50;
    Info.Increment = 3;

    Info.Data = 0;
    Info.Unlock = &NoUnlock;
    Info.Lock = &NoLock;
    for (i=0; i [...]

strace data seems to show hang during socket close

2005-06-02 Thread Mark Pizzolato
[...] corresponds to a kill -9 3772 I did some hours after the above output
was produced.


-39462844 672036353 [sig] clamd 3772 sigpacket::process: signal 9 processing
 278 672036631 [sig] clamd 3772 sigpacket::process: signal 9, about to call do_exit
 409 672037040 [sig] clamd 3772 signal_exit: about to call do_exit (9)
 130 672037170 [sig] clamd 3772 do_exit: do_exit (9), exit_state 0
36124 672073294 [sig] clamd 3772 void: 0x0 = signal (20, 0x1)
 199 672073493 [sig] clamd 3772 void: 0x405170 = signal (1, 0x1)
 119 672073612 [sig] clamd 3772 void: 0x405170 = signal (2, 0x1)
 111 672073723 [sig] clamd 3772 void: 0x0 = signal (3, 0x1)
 188 672073911 [sig] clamd 3772 fhandler_console::close: decremented open_fhs, now 3
 157 672074068 [sig] clamd 3772 fhandler_console::close: decremented open_fhs, now 2
 150 672074218 [sig] clamd 3772 fhandler_console::close: decremented open_fhs, now 1
 662 672074880 [sig] clamd 3772 fhandler_base::close: closing '/var/log/clamd.log' handle 0x33C
 680 672075560 [sig] clamd 3772 fhandler_socket::close: 0 = fhandler_socket::close()
 423 672075983 [sig] clamd 3772 fhandler_socket::close: 0 = fhandler_socket::close()
 305 672076288 [sig] clamd 3772 sigproc_terminate: entering
 147 672076435 [sig] clamd 3772 proc_terminate: n

Thanks for any comments, observations or advice.

- Mark Pizzolato 






Possible lack of thread safety in dup,fdopen

2005-03-07 Thread Mark Pizzolato
I've been using clamav's clamd under cygwin and today noticed an issue.
Clamd is a multi-threaded program.
The following code sequence encountered an error today (ERROR calling fdopen 
on fd 11):

    int i, fd;
    FILE *f, *tmp;

    tmp = fopen("somefile", "wb+");
    if (NULL == tmp) return;
    fd = fileno(tmp);
    { write some stuff to fd }
    lseek(fd, 0, SEEK_SET);
    i = dup(fd);
    if ((f = fdopen(i, "rb")) == NULL) {
        fprintf(stderr, "ERROR calling fdopen on  fd %d", i);

This is happening in one thread while other threads are merrily opening and
closing files and sockets.

I've got some log output suggesting that fd 11 might have also been used by
another thread at "around" the same time.  It would seem that this could
only happen if something lost track of the bookkeeping for fd's or there was
a race managing that bookkeeping.

I tried to look at the code for fopen(),dup(), and fdopen() myself before 
reporting this, but I can't find the implementations of these system calls 
in the source package for cygwin-1.5.13-1

Can someone point me to where I can look at the source code?
Thanks.
- Mark Pizzolato





Re: pthreads leaks handles and threads when threads use sockets

2005-01-29 Thread Mark Pizzolato
On Saturday, January 29, 2005 at 12:36 PM, Christopher Faylor wrote:

On Fri, Jan 28, 2005 at 09:09:31AM -0800, Mark Pizzolato wrote:
>I've been using clamav's clamd under cygwin and noticed that over time the
>handle count as viewed with TaskManager seems to grow to arbitrary values.

This should be fixed in the latest snapshot.  Corinna tracked this down to
the offending line of code and I made a change which should fix the behavior.

Thanks for the test case.  It helped a lot in tracking this problem down.


Thanks for the fix to the handle leakage. It works for me cleanly now.

Any clue as to why there seems to be some bounded thread leakage (i.e. extra
threads are created and persist for each thread in the test case which uses
sockets concurrently)?  Bounded leakage we can live with, but I'm still
curious.

- Mark Pizzolato 



Re: pthreads leaks handles and threads when threads use sockets

2005-01-29 Thread Mark Pizzolato
Hi Reini,
Reini Urban wrote:

Mark Pizzolato wrote:
> I've been using clamav's clamd under cygwin and noticed that over time
> the handle count as viewed with TaskManager seems to grow to arbitrary
> values. I used clamd's option IdleTimeout set to 600 seconds which
> dramatically reduced the growth rate of the Handle Count. Of course
> clamd has many things going on that could contribute to handle leakage,
> so I tried to write a simple program to demonstrate the problem.

Thanks a lot! Maybe we should restart the two daemons daily or weekly?
I will change the default IdleTimeout to 600 secs with the upcoming
clamav-0.81 release. Which fixes the freshclam proxy problem and some
OLE issues.


Merely setting IdleTimeout to 600 is currently insufficient due to a bug
which only uses the IdleTimeout parameter for the initial value.  After
the AV Database is reloaded, the idle timeout is hard-set to 30 seconds.  The
attached patch (to 0.80 or 0.81) fixes this issue.  This patch has been
submitted on the clamav-devel list.

The right choice for the IdleTimeout is a value which is larger than the
largest gap between messages that arrive on your system.  This is somewhat
complicated by multiple connections arriving concurrently, which is handled
by MaxThreads.  MaxThreads has a default value of 10.  This would be fine
for most systems; however, libclamav uses tmpfile() internally, which is NOT
threadsafe (using newlib's tmpfile()) on any system which returns the same
value for getpid() for each thread in a process (i.e. it works fine on Linux
since getpid() on Linux returns a unique value for all threads on the
system).  I've submitted changes which address this to the clamav
folks (avoiding tmpfile()), but they have not been accepted as yet.  To
avoid this issue (and avoid clamd producing "ERROR: ScanStream: Can't create
temporary file." messages), setting MaxThreads to 1 should work, but will
probably affect the behavior of client software that talks to it (possibly
causing deadlocks).

Do you have any insight into the underlying socket issues in threaded
programs?  Addressing them would clearly help clamd and every other
multi-threaded program, which may not even know it has these issues.

- Mark Pizzolato 


idletimeout.patch
Description: Binary data

pthreads leaks handles and threads when threads use sockets

2005-01-28 Thread Mark Pizzolato
I've been using clamav's clamd under cygwin and noticed that over time the
handle count as viewed with TaskManager seems to grow to arbitrary values.
I used clamd's option IdleTimeout set to 600 seconds which dramatically
reduced the growth rate of the Handle Count.  Of course clamd has many
things going on that could contribute to handle leakage, so I tried to write
a simple program to demonstrate the problem.

The attached program demonstrates the problem when sockets are used and that 
things look pretty clean when they are not.

There  seems to be both a thread leakage issue and a separate handle leakage 
issue.

 Invoking the program as:
 threadtest -sockets 0
creates groups of 5 threads simultaneously.  Each thread merely prints
something and sleeps, prints something else and exits.  This is repeated 10
times, displaying the process handle count between each iteration.  While
running and watching with Task Manager, the process thread count seems to
start at 2 and reach 7 at times and then return to 2.  The handle count
grows during the first iteration but stays flat thereafter.

 Invoking the program as:
 threadtest -sockets 3
creates groups of 5 threads simultaneously.  Each thread merely prints
something and sleeps, connects a socket to the main thread, passes a little
data and closes the socket.  This socket business is repeated 3 times, after
which the thread prints something else and exits.  This is repeated 10 times,
displaying the process handle count between each iteration.  While running
and watching with Task Manager, the process thread count seems to start at
2 and reach 14 during each iteration and drops back to 9 between
iterations.  The handle count grows significantly during the first iteration
but seems to grow by 10 or 11 between each subsequent iteration.

The -sockets 3 argument controls the number of sockets each thread creates
during its life.  The amount of thread and handle leakage seems to be
independent of the number of sockets the thread uses during its lifetime (as
long as the number of sockets used is 1 or greater).

The number of threads created simultaneously can be controlled by
specifying -threads n as command arguments.  The number of threads leaked
seems to be directly related to the number of threads using sockets
concurrently.  Running the program with -sockets 3 and -threads 10 causes
the thread count to jump to 24 during each iteration and drop back to 14
between iterations, while the handle count seems to increase by 10 or 11
between each iteration, identical to the case described in the previous
paragraph.

I hope this test can help someone familiar enough with cygwin internals to
help get this problem under control.

- Mark Pizzolato 
#include <windows.h>

typedef enum _PROCESSINFOCLASS {
    ProcessHandleCount = 20,
} PROCESSINFOCLASS;
typedef LONG NTSTATUS;

/* Ask ntdll for the number of handles currently open in this process. */
int
GetHandleCount()
{
    static NTSTATUS
    (*lpNtQueryInformationProcess)(
        HANDLE           ProcessHandle,
        PROCESSINFOCLASS ProcessInformationClass,
        PVOID            ProcessInformation,
        ULONG            ProcessInformationLength,
        PULONG           ReturnLength ) = NULL;
    static int bInit = 0;
    int HandleCount = 0;

    /* Look the entry point up once; it stays NULL if it is unavailable. */
    if (FALSE == bInit) {
        lpNtQueryInformationProcess = (NTSTATUS ( *)(HANDLE, PROCESSINFOCLASS, PVOID,
            ULONG, PULONG))GetProcAddress(GetModuleHandle("ntdll.dll"),
            "NtQueryInformationProcess");
        bInit = 1;
    }
    if (NULL != lpNtQueryInformationProcess) {
        lpNtQueryInformationProcess( GetCurrentProcess(),
            ProcessHandleCount, &HandleCount, sizeof(HandleCount), NULL );
    }
    return HandleCount;
}

#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
pthread_mutex_t mutex;
pthread_cond_t cond;
pthread_attr_t attr;
struct sockaddr_in server;
int sockets=2;
int threadgroup=5;
void *thread_code(void *arg)
{
int threadnum = (int) arg;
int i;
int s;
char buf[64];
pthread_mutex_lock(&mutex);
printf("Thread %d running...\n", threadnum);
pthread_mutex_unlock(&mutex);
for (i=0; i [...]

main(int argc, char **argv)
{
int i, j, s;
pthread_t thr_id;
int true = 1;
int HandleCount;
	while (--argc>0) {
		++argv;
		if (!strcmp("-sockets", *argv)) {
			++argv; --argc;
			sockets = atoi(*argv);
			continue;
		}
		if (!strcmp("-threads", *argv)) {
			++argv; --argc;
			threadgroup = atoi(*argv);
			continue;
		}
		printf("Unknown argument: %s\n", *argv);
		printf("Usage: threadtest -sockets n -threads n\n");
		printf("where:\n");
		printf(" -sockets   specifies the number of sockets that each thread should\n");
		printf("use during its life.  Interesting values are 0

Re: [ANNOUNCEMENT] Updated: clamav-0.80-1

2004-10-26 Thread Mark Pizzolato
I have updated the version of clamav on cygwin.com to 0.80-1.
This has now a shared version of the library and several updates.
Clam AntiVirus - GPL anti-virus toolkit
This distribution was built without the Windows UI.
You might want to use clamavwin (python wxWindows) instead.
See /usr/share/doc/Cygwin/clamav-0.80-1.README


I'm trying to track down a potential bug in either clamav or cygwin which
appears while running clamd.  In order to find what and where, I tried to
install the source package for this release of clamav to do some debugging.

There seems to be a bug in the source package for this updated clamav
release.

This becomes obvious when invoking the clamav-0-80-1.sh script with the
argument prep or all.

The source package has a patch file named clamv-0.80-1.patch.  While this
patch is being applied, the patches to the first few files go OK, while
the remaining patches seem to be 1) reverse patches and 2) done with a diff
specifying one directory level too deep.  So, I edit the patch file,
globally replacing /clamav-0.80/ with /.  After doing this, things get
further, but as I mentioned there seems to be a reverse patch going on, so
patch prompts to "Assume -R" for several files, but not all of the patches
can be applied anyway.  Then things go on to get worse.  This should
probably be addressed by someone who created the patches in the package.

- Mark Pizzolato
