Re: Process hanging on 6.0-STABLE

2006-03-30 Thread David Xu
On Friday 31 March 2006 08:38, Daniel O'Connor wrote:
> On Wednesday 22 March 2006 23:49, Daniel O'Connor wrote:
> > On Wednesday 22 March 2006 18:41, David Xu wrote:
> > > > The problem is that every now and then the process gets stuck and
> > > > becomes unkillable just after forking, ie..
> > >
> > > Are you using pthreads ?
> >
> > Nope.
> 
> Hmm I just found I had an FD leak - I am not sure if it's related.
> 
> Interestingly I had ~1046 FDs open and FD_SET wasn't working properly, eg...
> fd_set  fds;
> int   fd;
> 
> fd = 1046;
> FD_ZERO(&fds);
> FD_SET(fd, &fds);
> 
> Results in fds being empty :(
> 
> Does anyone know if that is expected behaviour? And whether this problem could
> cause the second one (eg tickle a bug).
> 
> Thanks.
> 
> -- 
> Daniel O'Connor software and network engineer
> for Genesis Software - http://www.gsoft.com.au
> "The nice thing about standards is that there
> are so many of them to choose from."
>   -- Andrew Tanenbaum
> GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C
> 
In /sys/sys/select.h :
#ifndef FD_SETSIZE
#define FD_SETSIZE  1024U
#endif

so you should define FD_SETSIZE to a larger value, before including the
system headers, if you use any fd numbered 1024 or higher.
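
A minimal sketch of that suggestion, assuming FreeBSD's <sys/select.h> as
quoted above. The value 4096 and the helper name checked_fd_set are
hypothetical and only for illustration; other C libraries may ignore a
user-supplied FD_SETSIZE, which is why the override is guarded.

```c
/*
 * On FreeBSD, <sys/select.h> only defines FD_SETSIZE when it is not already
 * defined (see the #ifndef quoted above), so the override must come before
 * any system header. Other C libraries may not honour it, hence the guard.
 */
#ifdef __FreeBSD__
#define FD_SETSIZE 4096U /* assumed value: anything above the highest fd in use */
#endif

#include <sys/select.h>

/*
 * Hypothetical helper: add fd to the set only if it fits in an fd_set.
 * Returns 0 on success, -1 if fd is negative or >= FD_SETSIZE.
 */
int checked_fd_set(int fd, fd_set *set)
{
    if (fd < 0 || (unsigned int)fd >= (unsigned int)FD_SETSIZE)
        return -1;
    FD_SET(fd, set);
    return 0;
}
```

With the default FD_SETSIZE of 1024, checked_fd_set(1046, &fds) returns -1
rather than passing an out-of-range descriptor to FD_SET, whose behaviour
is undefined in that case.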

Regards,
David Xu
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Process hanging on 6.0-STABLE

2006-03-30 Thread Daniel O'Connor
On Wednesday 22 March 2006 23:49, Daniel O'Connor wrote:
> On Wednesday 22 March 2006 18:41, David Xu wrote:
> > > The problem is that every now and then the process gets stuck and
> > > becomes unkillable just after forking, ie..
> >
> > Are you using pthreads ?
>
> Nope.

Hmm I just found I had an FD leak - I am not sure if it's related.

Interestingly I had ~1046 FDs open and FD_SET wasn't working properly, eg...
fd_set  fds;
int fd;

fd = 1046;
FD_ZERO(&fds);
FD_SET(fd, &fds);

Results in fds being empty :(

Does anyone know if that is expected behaviour? And whether this problem could 
cause the second one (eg tickle a bug).

Thanks.

-- 
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C




Re: Process hanging on 6.0-STABLE

2006-03-22 Thread Daniel O'Connor
On Wednesday 22 March 2006 18:41, David Xu wrote:
> > The problem is that every now and then the process gets stuck and becomes
> >  unkillable just after forking, ie..
>
> Are you using pthreads ?

Nope.

-- 
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C




Re: Process hanging on 6.0-STABLE

2006-03-22 Thread David Xu
On Wednesday 22 March 2006 15:25, Daniel O'Connor wrote:
> Hi,
> I work for a small company that makes radar systems for research 
> organisations and we use FreeBSD on the PCs for data acquisition and 
> processing. We have recently shifted to FreeBSD6/amd64 and one machine in 
> particular is exhibiting a strange problem.
> 
> The acquisition process is a Tcl interpreter with a largish chunk of C code
>  which talks to the hardware (via RS485 and a custom PCI card). Once the 
> system is set up it streams data back via the PCI card and runs it through
>  various data processors (eg dump raw data to disk, FFT, winds, etc..). 
> 
> The actual forking of processes is handled in Tcl and the C code only gets
>  involved to write the data out (to an FD the Tcl layer keeps).
> 
> The problem is that every now and then the process gets stuck and becomes
>  unkillable just after forking, ie..
Are you using pthreads ?

David Xu


Process hanging on 6.0-STABLE

2006-03-21 Thread Daniel O'Connor
Hi,
I work for a small company that makes radar systems for research 
organisations and we use FreeBSD on the PCs for data acquisition and 
processing. We have recently shifted to FreeBSD6/amd64 and one machine in 
particular is exhibiting a strange problem.

The acquisition process is a Tcl interpreter with a largish chunk of C code
which talks to the hardware (via RS485 and a custom PCI card). Once the
system is set up it streams data back via the PCI card and runs it through
various data processors (eg dump raw data to disk, FFT, winds, etc.).

The actual forking of processes is handled in Tcl and the C code only gets
involved to write the data out (to an FD the Tcl layer keeps).

The problem is that every now and then the process gets stuck and becomes
unkillable just after forking, ie..
eureka:~>ps -axl | grep Reco
1   881 1  12  -8 -5 21716 15984 piperd Igdb $GSHOME/libexec/Recorder
...
(gdb) attach 881
...
(gdb) bt
#0  0x0008009c395c in read () from /lib/libc.so.6
#1  0x00080072f77f in TclpCreateProcess () from /usr/local/lib/libtcl84.so.1
#2  0x000800717d25 in TclCreatePipeline () from /usr/local/lib/libtcl84.so.1
#3  0x0008007186d0 in Tcl_OpenCommandChannel () from /usr/local/lib/libtcl84.so.1
#4  0x000800704af8 in Tcl_ExecObjCmd () from /usr/local/lib/libtcl84.so.1
...
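
For context, the parent's read() inside TclpCreateProcess and its "piperd"
wait state are consistent with the common fork/exec idiom sketched below:
the parent blocks on a close-on-exec pipe until the child either execs
(EOF) or reports an error. This is an illustration of the idiom, not Tcl's
actual source; the helper name spawn_checked is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork and exec argv[0] (searched in PATH).
 * Returns the child's pid once exec has succeeded, or -1 with errno set
 * if pipe, fork or exec failed.
 */
pid_t spawn_checked(char *const argv[])
{
    int fds[2];
    int err = 0;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;
    /* The write end is closed automatically by a successful exec. */
    if (fcntl(fds[1], F_SETFD, FD_CLOEXEC) == -1) {
        err = errno;
        close(fds[0]);
        close(fds[1]);
        errno = err;
        return -1;
    }

    pid = fork();
    if (pid == -1) {
        err = errno;
        close(fds[0]);
        close(fds[1]);
        errno = err;
        return -1;
    }
    if (pid == 0) {
        /* Child: exec, and report errno through the pipe if that fails. */
        close(fds[0]);
        execvp(argv[0], argv);
        err = errno;
        if (write(fds[1], &err, sizeof err) == -1) {
            /* Nothing more the child can do about it. */
        }
        _exit(127);
    }

    /* Parent: block here until the child execs (EOF) or reports an error. */
    close(fds[1]);
    do {
        n = read(fds[0], &err, sizeof err);
    } while (n == -1 && errno == EINTR);
    close(fds[0]);

    if (n == 0)
        return pid; /* EOF: exec succeeded */

    /* exec failed (or the read itself failed): reap the child. */
    waitpid(pid, NULL, 0);
    errno = (n == (ssize_t)sizeof err) ? err : EIO;
    return -1;
}
```

In this pattern, a child that stalls between fork and exec leaves the
parent waiting in read() on the pipe indefinitely.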

However the newly made one..
(gdb) attach 80154
Attaching to program: /usr/home/radar/skiymet/libexec/Recorder, process 80154
ptrace: Resource temporarily unavailable.

The original is killable..
eureka:~>kill 881
eureka:~>kill 881
881: No such process

But the new one is not..
eureka:~>kill 80154
eureka:~>kill 80154
eureka:~>kill -9 80154
eureka:~>kill -9 80154

I can fstat the new process and it shows a slew of open FDs (presumably
inherited from the old process), but I can't ktrace it..
eureka:~>ktrace -f 80154.ktr -p 80154
ktrace: 80154.ktr: Operation not permitted
eureka:~>sudo ktrace -f 80154.ktr -p 80154
ktrace: 80154.ktr: Operation not permitted

Or get a memory map..
eureka:~>dd if=/proc/80154/map bs=64k
dd: /proc/80154/map: Resource temporarily unavailable
0+0 records in
0+0 records out
0 bytes transferred in 0.96 secs (0 bytes/sec)

Unfortunately the machine is at a very remote location and I have not
been able to replicate it locally (and I can't run, say, memtest remotely
either).

The custom PCI card has a driver which may be the cause of the problems
but it does not appear to be involved from what I can see.

Does anyone have any suggestions? The version of FreeBSD is a little 
after 6.0-RELEASE but not much.

-- 
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C

