Hi,

It is a race condition. When I add debugging code, the problem goes away. After some time, the problem starts again.
By the look of the logs and strace, it seems that the noop service ends too fast and the event loop waits for a process that has already terminated.

Kind regards
Jose M Calhariz

On Fri, Jul 10, 2020 at 02:42:38PM +0100, Jose M Calhariz wrote:
> Hi,
>
> I have done more research on this problem. More information about it
> inline.
>
> On Tue, Jun 30, 2020 at 12:18:55PM +0100, Jose M Calhariz wrote:
> > On Mon, Jun 29, 2020 at 11:06:46AM -0600, Charles Curley wrote:
> > > On Mon, 29 Jun 2020 16:36:33 +0100
> > > Jose M Calhariz <jose.calha...@tecnico.ulisboa.pt> wrote:
> > >
> > > > On my main amanda installation I have a client that times out
> > > > when doing backups. I have researched and checked the most common
> > > > problems. In the end I have found that:
> > > >
> > > > - "amcheck Config -c client" gives a 30 second timeout.
> > >
> > > I do not see that. I checked two clients, one AMD64, the other i386.
> >
> > On my main amanda installation I have dozens of Debian clients.
> > Only one client is failing and I do not understand why.
>
> Now I have two clients with problems, but the main problems are different.
>
> > > >
> > > > - I do not have any clue in the logs on the client.
> > > >
> > > > - Running the command by hand on the client I get a segmentation fault.
> > > >
> > > > /usr/lib/amanda/amandad -auth=ssh amdump
> > > > Segmentation fault
> > >
> > > I see that, on both machines.
> > >
> >
> > OK, so the problem is something else. More research to do. Thank you.
> > I will post here when I find more info.
>
> Last night I did more investigation. When I run "amcheck Conf
> -c client" it gives a 30 second timeout. I increased this timeout
> some time ago because of another problem. So YMMV.
>
> On the client, amandad runs successfully but it never tries to run
> selfcheck and does not give an error. The proof is that I have no logs
> in /var/log/amanda/client/Conf. I made some changes, increased
> the client debug logs and some other things, and selfcheck started
> to run. Today selfcheck is not running again.
>
> Does anyone know where in the code of amandad selfcheck is launched, so
> I can quickly find the place and possibly add more debugging code?
>
> > > >
> > > > - Running the command inside gdb I see a NULL pointer.
> > > >
> > > > gdb /usr/lib/amanda/amandad
> > > > GNU gdb (Debian 8.2.1-2+b3) 8.2.1
> > >
> > > I do not have symbols installed, so of course I can't examine the
> > > variables. I do see a segmentation fault:
> > >
> > > ...
> > > (gdb) run -auth=ssh amdump
> > > Starting program: /usr/lib/amanda/amandad -auth=ssh amdump
> > > [Thread debugging using libthread_db enabled]
> > > Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> > >
> > > Program received signal SIGSEGV, Segmentation fault.
> > > 0x00007ffff7f3a687 in stream_sendpkt () from
> > > /usr/lib/x86_64-linux-gnu/amanda/libamanda-3.5.1.so
> > > (gdb)
> > > ...
> > >
> > > I have no idea what is going on here, not having the source in front of
> > > me. But I wonder if this is because amanda is trying to use an SSH
> > > connection that isn't there?
> > >
> > > The first time, I SSHed in as root, did an su - to backup, then ran
> > > gdb. To test my hypothesis above, I went to my amanda server, did an
> > > su - to backup, then SSHed to the client. This time I did not get a
> > > seg fault, and used Ctrl-C to end the process:
> > >
> > > (gdb) run -auth=ssh amdump
> > > Starting program: /usr/lib/amanda/amandad -auth=ssh amdump
> > > [Thread debugging using libthread_db enabled]
> > > Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> > > ^C
> > > Program received signal SIGINT, Interrupt.
> > > 0x00007ffff76a27e4 in __GI___poll (fds=0x55555559e3d0, nfds=2,
> > > timeout=30000) at ../sysdeps/unix/sysv/linux/poll.c:29
> > > 29 ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
> > > (gdb)
> > >
> > > This is with the bog standard Buster version of amanda:
> > >
> > > root@dzur:~# pre amanda
> > > amanda-client 1:3.5.1-2+b2 amd64
> > > amanda-common 1:3.5.1-2+b2 amd64
> > > root@dzur:~#
> > >
> >
> > Kind regards
> > Jose M Calhariz
> >
>
> Kind regards
> Jose M Calhariz
>
> --

--
When one does not want to... the other insists.
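
For illustration only, the kind of race suspected at the top of this message (a child process that exits before the parent starts waiting for it) can be sketched in Python. This is not amandad's code: the function name `run_short_lived_child`, the delay, and the trivial child command are all assumptions made for the example. It only shows the general pattern of checking whether the child has already terminated before blocking on it.

```python
import subprocess
import sys
import time


def run_short_lived_child(startup_delay=0.5, wait_timeout=30.0):
    """Start a child that exits at once, then begin watching it late.

    Returns (already_exited, exit_code).
    """
    # Hypothetical stand-in for a service that finishes almost immediately.
    child = subprocess.Popen([sys.executable, "-c", "pass"])

    # Simulate the parent doing other work before it starts watching the child.
    time.sleep(startup_delay)

    # poll() returns None while the child runs, or its exit code once it has
    # terminated, so an exit that happened during the delay is not missed.
    already_exited = child.poll() is not None

    if not already_exited:
        # Only block when the child is really still running. The default
        # timeout mirrors the 30 second timeout mentioned in the thread.
        child.wait(timeout=wait_timeout)

    return already_exited, child.returncode


if __name__ == "__main__":
    exited_early, code = run_short_lived_child()
    print("child exited before the wait began:", exited_early)
    print("exit code:", code)
```

If the real event loop only learns about the exit through a notification that fired before it started listening, it would sit until the timeout expires, which would be consistent with the 30 second timeout reported above. That remains a hypothesis until it is checked against the amandad source.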