On Fri, Jun 26, 2020 at 7:35 PM sven falempin wrote:
> On Fri, Jun 26, 2020 at 5:22 PM Stuart Henderson wrote:
>
>> On 2020/06/26 15:30, sven falempin wrote:
>> > behavior confirmed on current.
>> >
>> > Once the process stalls, ( could be anything writing to the vnconfig disk,
>> > cp , umount )
>> > a few other calls like df , or ps, etc may hang, never the same
>> > sp or mp kernel, reproduced on today's snapshots.
>>
>> vnconfig is used as part of "make release", many builds are done every
>> week using this so it's not a general problem with vnconfig.
>>
>> Can you show some commands or a script to trigger the behaviour?
>>
>
> the perl script use the system to call :
>
> vnconfig.
> mount.
> umount. <- saw hanged
> cp.<- saw hanged
> tar.<- saw hanged
> svn up.<- saw hanged
> and dd.
> newfs.
>
> really nothing fancy, only stuff writing to disk got stuck.
>
> At one point it does a chroot but it never hangs near that , most of the
> time it hangs before.
>
> The script has been used like 1000 times on 6.0 and maybe twice more on
> 6.4.
>
> I have absolutely no idea what the 'needbuf' of top is .
>
> the script hangs at random position , always writing into vnconfig.
>
> I have no idea how to reproduce outside the perl script , so maybe it is
> related
> to some devious perl stdin/stdout buffer .
>
> Nevertheless there's like a 5% chance that's the script will work( slowly )
>
> Most of the system call are inside a routine to log
>
> sub debug_system {
>     $logger->debug('running: '.join(' ', @_));
>     return system(@_);
> }
>
> so i can easily put things inside to try to understand the issue.
>
> It is really a strange behavior, and the device must be shut down
> electrically.
> Something really odd, i run syslogd on a buffer, and syslogc buffer is
> stuck too
> when the device stuck (but it supposed to be mostly already allocated
> memory ).
>
> It's really like the vm does not want to give anymore bucket (<- i
> don't know what i m talking about here,
> but i looks like that anything that doesn't malloc is ok , computer reply
> to ping , can do a few things for a while , and then complete
> hang )
>
> I ran the 6.7 release on a VM somewhere and another device with many perl
> script and they work.
>
> Only this fails 95% of the time and is VERY VERY slow when ok.
> compared to what i saw in /usr/src the vnconfig is big , ( forgot to copy
> df -h ),
> like 2GB
>

I put ktrace in front of the perl system call, and I was able to recover
an 800MB trace:

$ kdump -f ./trace.out | tail -20
kdump: realloc: Cannot allocate memory
25955 UNKNOWN(1634890859)
72466 ▒▒▒ CALL syscall()

Could that be of some use?

--
--
---------------------------------------------------------------------------------------------------------------------
Knowing is not enough; we must apply. Willing is not enough; we must do
