On 2020-06-27 19:29:31, Bob Beck <[email protected]> wrote:
>
> No.
>
> I know *exactly* what needbuf is, but to attempt to diagnose what your
> problem is we need exact details. Especially:
>
> 1) The configuration of your system, including all the details of the
> filesystems you have mounted, all options used, etc.
>
> 2) The script you are using to generate the problem (not a paraphrasing of
> what you think the script does), and what filesystems it is using.
>
Not the OP, but this problem sounds almost exactly like the bug I
reported last year.
There is a detailed list of steps I used to reproduce the bug in
the following bug report.
https://marc.info/?l=openbsd-bugs&m=156412299418191
I was even able to bisect and identify the commit which first
caused the breakage for me.
---8<---
CVSROOT: /cvs
Module name: src
Changes by: [email protected] 2019/05/08 06:40:57
Modified files:
sys/kern : vfs_bio.c vfs_biomem.c
Log message:
Modify the buffer cache to always flip recovered DMA buffers high.
This also modifies the backoff logic to only back off what is requested
and not a "minimum" amount. Tested by me, benno@, tedu@ and a ports build
by naddy@.
ok tedu@
---8<---
However, I have since migrated away from using vnd(4)s since I was
able to find other solutions that worked for my use cases. So I
may not be able to provide much additional information other than
what is contained in the above bug report.
--
Bryan
>
>
> On Sat, Jun 27, 2020 at 08:09:18PM -0400, sven falempin wrote:
> > On Fri, Jun 26, 2020 at 7:35 PM sven falempin <[email protected]>
> > wrote:
> >
> > >
> > >
> > > On Fri, Jun 26, 2020 at 5:22 PM Stuart Henderson <[email protected]>
> > > wrote:
> > >
> > >> On 2020/06/26 15:30, sven falempin wrote:
> > >> > behavior confirmed on current.
> > >> >
> > >> > Once the process stalls (it could be anything writing to the
> > >> > vnconfig disk: cp, umount), a few other calls like df or ps may
> > >> > hang, never the same ones. This happens on both sp and mp kernels
> > >> > and is reproduced on today's snapshots.
> > >>
> > >> vnconfig is used as part of "make release", many builds are done every
> > >> week using this so it's not a general problem with vnconfig.
> > >>
> > >> Can you show some commands or a script to trigger the behaviour?
> > >>
> > >
> > > The Perl script uses system() to call:
> > >
> > > vnconfig
> > > mount
> > > umount <- seen hanging
> > > cp <- seen hanging
> > > tar <- seen hanging
> > > svn up <- seen hanging
> > > dd
> > > newfs
> > >
> > > Really nothing fancy; only the things writing to disk got stuck.
> > >
> > > At one point it does a chroot, but it never hangs near that; most of
> > > the time it hangs before.
> > >
> > > The script has been used like 1000 times on 6.0 and maybe twice more on
> > > 6.4.
> > >
> > > I have absolutely no idea what the 'needbuf' shown by top is.
> > >
> > > The script hangs at a random position, always while writing to the
> > > vnconfig disk.
> > >
> > > I have no idea how to reproduce this outside the Perl script, so maybe
> > > it is related to some devious Perl stdin/stdout buffering.
> > >
> > > Nevertheless, there is about a 5% chance that the script will work
> > > (slowly).
> > >
> > > Most of the system calls go through a logging routine:
> > >
> > > sub debug_system {
> > > $logger->debug('running: '.join(' ', @_));
> > > return system(@_);
> > > }
> > >
> > > so I can easily put things inside it to try to understand the issue.
> > >
> > > It is really strange behavior, and the device has to be power-cycled.
> > > Something really odd: I run syslogd with a memory buffer, and reading
> > > that buffer with syslogc is stuck too when the device is stuck (but it
> > > is supposed to be mostly already-allocated memory).
> > >
> > > It is really as if the vm does not want to hand out any more buckets
> > > (I don't know what I am talking about here, but it looks like anything
> > > that doesn't malloc is OK: the computer replies to ping and can do a
> > > few things for a while, and then hangs completely).
> > >
> > > I ran the 6.7 release on a VM somewhere and on another device with many
> > > Perl scripts, and they work.
> > >
> > > Only this one fails 95% of the time and is VERY VERY slow when it works.
> > > Compared to what I saw in /usr/src, the vnconfig disk is big (I forgot
> > > to copy df -h), like 2GB.
> > >
> >
> >
> > I put ktrace in front of the Perl system() call, and I was able to
> > recover an 800MB trace:
> >
> > $ kdump -f ./trace.out | tail -20
> > kdump: realloc: Cannot allocate memory
> > 25955 UNKNOWN(1634890859)
> > 72466 ????????? CALL syscall()
> >
> >
> > Could that be of some use?
> >
> >
> > --
> > Knowing is not enough; we must apply. Willing is not enough; we must do
>
>