amdgpu module hangs

2024-04-26 Thread Reinoud Zandijk
Dear folks,

I've updated my tree and built myself a new kernel, but it won't boot anymore:
it hangs while loading the amdgpu module in the bootloader. It is the same
-current version that worked before, but something must have changed.

Anyone experiencing the same?

With regards,
Reinoud



Re: nvmm users - experience

2023-05-24 Thread Reinoud Zandijk
Hi,

On Sun, May 21, 2023 at 02:01:22PM +, Mathew, Cherry G.* wrote:
> I'm wondering if there are any nvmm(4) users out there - I'd like to
> understand what your user experience is - especially for multiple VMs
> running simultaneously.
> 
> Specifically, I'd like to understand if nvmm based qemu VMs have
> interactive "jitter" or any such scheduling related effects.
> 
> I tried 10.0_BETA with nvmm and it was unusable for 3 guests that I
> migrated from XEN, so I had to fall back.
> 
> Just looking for experiences from any users of nvmm(4) on NetBSD (any
> version, including -current is fine).

How many vCPUs does each instance have, and how many CPUs does your machine
have? It could be (just a hunch from reading the code, since I'm not that deep
into NVMM!) that more than one Qemu has its vCPUs active and all host CPUs are
busy. That would mean the other Qemu processes do not get to run (for I/O etc.)
until a tick interrupt or a userland event like an I/O read or page fault comes
by and a vCPU run is aborted. That event is handled in the kernel, control is
given back to the nvmm userland part, and the Qemu handler might then interrupt
the other vCPUs to trigger their virtual interrupt handling. Note that kernel
preemption is switched off while a vCPU is running.

This could explain the irregular I/O performance you see, as the Qemu I/O
threads are not scheduled often enough.

A solution might be for the nvmm code in *_vcpu_run() to set a timer interrupt
with a dummy handler on its CPU. On the forced VM return it could then check the
number of host CPUs not currently running NVMM vCPUs; if that number is lower
than the number of virtual machines declared and running, it yields by returning
to userland, so that at least one CPU is left free for each of the Qemus.
Otherwise it can continue as before without making the expensive switch back to
userland.
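
Roughly this, as an untested sketch (the helper names are made up for
illustration and do not exist in NVMM):

    /* sketch only: both helpers below are hypothetical, not existing NVMM code */
    static bool
    nvmm_should_yield_to_userland(void)
    {
        unsigned int idle = nvmm_count_idle_hostcpus(); /* host CPUs not running a vCPU */
        unsigned int vms  = nvmm_count_running_vms();   /* VMs declared and running */

        /* keep at least one host CPU free per VM for its Qemu I/O threads */
        return idle < vms;
    }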

Ideally we ought to ask Maxime Villard, who wrote the code, and check/coordinate
with DragonFlyBSD, but he isn't involved there anymore either, IIRC.

So, I might be barking up the wrong tree but this might be it.

With regards,
Reinoud



Re: nvmm users - experience

2023-05-24 Thread Reinoud Zandijk
Hi,

On Sun, May 21, 2023 at 02:01:22PM +, Mathew, Cherry G.* wrote:
> I'm wondering if there are any nvmm(4) users out there - I'd like to
> understand what your user experience is - especially for multiple VMs
> running simultaneously.
> 
> Specifically, I'd like to understand if nvmm based qemu VMs have
> interactive "jitter" or any such scheduling related effects.
> 
> I tried 10.0_BETA with nvmm and it was unusable for 3 guests that I
> migrated from XEN, so I had to fall back.
> 
> Just looking for experiences from any users of nvmm(4) on NetBSD (any
> version, including -current is fine).

I'm a regular user of nvmm(4), but normally I only run one nvmm/qemu instance at
a time. I think your assessment is correct in that it's scheduling-related;
that would explain both the jitter and the irregular behaviour.

AFAIK in nvmm(4) the guest code runs until it hits a page fault, a read/write of
a memory-mapped device or some other event, and only then returns to Qemu. There
is AFAIK no timer involved, and as long as the scheduled CPU doesn't get an
interrupt or hit a page fault it just continues on. Guest code that spins will
thus ignore generated virtual IRQs until some random event occurs or the NetBSD
scheduler decides it is time to yield this CPU. This can be considered a bug :)

I don't see references to timers being used in sys/dev/nvmm/ nor are CPU
specific timer/preemption support bits set.
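
Schematically the run loop then behaves like this (a generic sketch with made-up
names, not the actual libnvmm or Qemu code):

    for (;;) {
        run_vcpu_until_exit(vcpu);   /* returns only on page fault, MMIO, I/O, ... */
        handle_exit_in_qemu(vcpu);   /* no periodic timer exit in between */
    }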

With regards,
Reinoud



Qemu dropping 32 bit hosts? [peter.mayd...@linaro.org: Re: [RFC PATCH] docs/about/deprecated: Deprecate 32-bit host systems]

2023-01-30 Thread Reinoud Zandijk
Hi folks,

The Qemu folks are talking about possibly dropping 32-bit HOST support for Qemu,
so my question is: are there any users of a 32-bit system who run Qemu on it?

If there is enough response I can forward the replies to the Qemu folks;
otherwise we might indeed see 32-bit host support dropped/deprecated...

With regards,
Reinoud

- Forwarded message from Peter Maydell  -

Date: Mon, 30 Jan 2023 11:47:02 +
Subject: Re: [RFC PATCH] docs/about/deprecated: Deprecate 32-bit host systems

On Mon, 30 Jan 2023 at 11:44, Thomas Huth  wrote:
>
> Testing 32-bit host OS support takes a lot of precious time during the QEMU
> continuous integration tests, and considering that many OS vendors stopped
> shipping 32-bit variants of their OS distributions and most hardware from
> the past >10 years is capable of 64-bit

True for x86, not necessarily true for other architectures.
Are you proposing to deprecate x86 32-bit, or all 32-bit?
I'm not entirely sure about whether we're yet at a point where
I'd want to deprecate-and-drop 32-bit arm host support.

thanks
-- PMM

- End forwarded message -


Re: Branching for netbsd-10 next week

2022-12-10 Thread Reinoud Zandijk
Hi,

related to this, I think we should really iron out all the installation issues
that plagued NetBSD before and were scorned on, say, Slashdot: i.e. provide easy
install/live images with a GUI installed, with optional extra variants (say, a
complete xfce4 one with Firefox etc.), and offer complete installs, like what
the live CD runs, as an option in sysinst.

Separate images could be made for VMs and for the cloud instances complete
with documentation and sensible default setup.

Also kind of important: good defaults for the shell logins, so no one sees ^T
etc. when line editing. That might be fixed nowadays, but some of my settings
migrated here from earlier installs dating from 1.4, and at times I just start
tcsh or so just to get line editing working.

Just my $0.02 :)

Reinoud



Re: Virtio Viocon driver - possible to backport from OpenBSD?

2022-08-06 Thread Reinoud Zandijk
On Thu, Aug 04, 2022 at 09:53:25PM +0200, Matthias Petermann wrote:
> https://man.openbsd.org/virtio.4
> 
> the OpenBSD virtio driver has its origin in NetBSD. Viocon Support was added
> later and not ported back yet. I am wondering how much effort it would take
> to merge it from
> 
> https://cvsweb.openbsd.org/src/sys/dev/pv/viocon.c?rev=1.8&content-type=text/x-cvsweb-markup
> 
> This would help to run netbsd on qemu without VGA Emulation which seems to
> become the default for some cloud environments. 

I always use a `serial' console for my Qemu hacking, but if some cloud
environments prefer viocon it seems like a sound idea. AFAIK it's not that hard;
it was on my TODO list when I worked on virtio, but I got sidetracked by other
work.
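
For reference, what I do is roughly this (flags from memory, so double-check
them):

    qemu-system-x86_64 -m 1024 -accel nvmm -drive file=wd0.img,format=raw -nographic

with the guest console pointed at com0, e.g. with `consdev com0' in boot.cfg.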

I am currently working on something completely different but might take a
peek but feel free to try :)

With regards,
Reinoud



Re: x86 console size

2022-06-07 Thread Reinoud Zandijk
Hi :)

On Sun, Jun 05, 2022 at 11:52:51AM +, RVP wrote:
> On Sun, 5 Jun 2022, Reinoud Zandijk wrote:
> 
> > Could switching to the big fonts be an option?
> > 
> 
> With both fonts compiled in you can switch between them
> using:
> 
> wsconsctl -dw font='Boldface 16x32'
> wsconsctl -dw font=Boldface

These work quite well! But how do I configure this permanently? When and where
is this supposed to be set?
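
My guess, and it is untested: since the font gets reset somewhere during boot,
running the wsconsctl command for the font you prefer at boot time, e.g. from
/etc/rc.local, should at least make it stick:

    wsconsctl -dw font=Boldface

but if there is a proper wscons.conf knob for this I'd rather use that.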

Thanks,
Reinoud



x86 console size

2022-06-05 Thread Reinoud Zandijk
Dear folks,

for some time now I have noticed that the initial graphical console has small
letters, but it is later reset to big letters again. Is this intentional? How
can I keep the smaller font? The 80x25 (or x31?) console is quite huge on this
monitor and I really liked the smaller fonts!

Could switching to the big fonts be an option?

With regards,
Reinoud



Re: Branching for NetBSD 10

2022-06-03 Thread Reinoud Zandijk
On Fri, Jun 03, 2022 at 03:32:50PM +0200, Martin Husemann wrote:
> On Mon, May 02, 2022 at 04:22:56PM +0200, Martin Husemann wrote:
> > We are planning to branch netbsd-10 in about a week from now.
> 
> As you may have noticed, this did not happen.
> Those who followed the NetBSD Foundation's annual general meeting saw
> that we discussed a new plan that wanted to branch early two weeks ago -
> and that also did not happen.
> 
> The major issue that came up late and that we are now trying to resolve
> properly before the branch (so we can document what people testing the
> branch need to know about it) is the handling of extended attributes
> on FFS file systems.

Well, I'd like to add another point: fixed i915 DRM support! There are quite a
few machines featuring these chips, and the only result they get is a black
screen or even a locked-up machine, where it used to work fine. People won't be
expecting this.

Reinoud



Re: Status of NetBSD virtualization roadmap - support jails like features?

2022-04-16 Thread Reinoud Zandijk
Hi,

On Sat, Apr 16, 2022 at 06:07:46AM +0200, Matthias Petermann wrote:
> > 1) nvmm seems to work well in netbsd (I haven't run it yet) and there has
> > been bug fixing.
> > 
> > 2) code flows between BSDs a lot, in many directions.
> > 
> > 3) You could run diff to see what's different and why.

It's basically adjusting the code to work on DragonFly; the repo that m00nbsd
refers to already has the backport for NetBSD in it (at first glance). They
claim in the commit that it works on both, so why isn't the developer committing
it back or PR'ing the code? It was ported by Aaron LI (aly@) with help from Matt
Dillon and Maxime, according to the DragonFly BSD NVMM page
(https://www.dragonflybsd.org/docs/docs/howtos/nvmm/).

> > Probably after your message (which I view as helpful) someone(tm) will
> > look at the diff.  But if you are inclined to do  that and post some
> > comments, that's probably useful.

From what I saw at first glance it is mostly making the code more
OS-independent, plus some minor glue, and NetBSD compat is provided. No
functional change at first glance.

> Thank you very much for your points. I will indeed do the diff asap out of
> interest, although I can't promise that anything can be derived from it -
> I'm not a kernel developer, let alone know anything about virtualization
> beyond the administrator level ;-)
> 
> But from a roadmap point of view, I see it as a good sign that nvmm gets bug
> fixes and is described quite comprehensively in the NetBSD guide.

Indeed! It needs to be updated in NetBSD ASAP so the two don't drift and changes
are easily picked up. Our own few modifications could be PR'd to them too.

With regards,
Reinoud



make(1) enhancements (was Re: Automated report: NetBSD-current/i386 build failure)

2021-07-22 Thread Reinoud Zandijk
On Mon, Jul 19, 2021 at 08:13:19AM +0300, Andreas Gustafsson wrote:
> David Holland wrote:
> > On Mon, Jul 19, 2021 at 10:32:20AM +0900, Rin Okuyama wrote:
> >  > Logs below are usually more helpful.
> > 
> > Right... I wonder what happened to bracket's error-matching script; it
> > usually does better than that.
> 
> There are multiple causes, but a major one is that since babylon5 was
> upgraded to a new server with more cores, the builds have more
> parallelism, which causes make(1) to print more output from the other
> parallel jobs after the actual error message, and bracket isn't
> looking far enough back in the log.  I have a fix in testing on my own
> testbed but still need to deploy it on babylon5.

I think I've expressed this idea before, but can't we enhance our make(1) to
record all output for each make target? It could discard the logs of the targets
that succeeded and print the one(s) that failed when reporting the compilation
error. Then at least all the output of the failed targets is collected and
readable.

If this was already shot down once before, please ignore it. I'm not that
familiar with the make(1) internals.

Reinoud





Re: netbsd update expanded kernel with netbsdIleNEk file name

2021-07-22 Thread Reinoud Zandijk
On Sun, Jul 18, 2021 at 10:22:50PM +0300, Andrius V wrote:
> Today I upgraded my current setup from the latest nycdn image (amd64)
> using sysinst (update flow) and I was surprised that netbsd kernel was
> copied to file with unusual /netbsdIleNEk name instead of /netbsd.
> Thus, I needed to recopy it manually to netbsd. Unfortunately, I
> didn't have time to do it twice, because of that I am not sure if it is a
> bug in sysinst or just a one-time weird situation. Is anyone
> experiencing the same with the latest image?

Is this still present in the newer versions?

Reinoud





Re: GCC 10 available for testing etc. in -current.

2021-04-19 Thread Reinoud Zandijk
On Mon, Apr 19, 2021 at 04:41:32AM -, Michael van Elst wrote:
> ll...@must-have-coffee.gen.nz (Lloyd Parkes) writes:
> 
> >On 17/04/21 6:30 pm, Lloyd Parkes wrote:
> >> I am using the Mercurial repository at https://anonhg.NetBSD.org/src 
> >> for fetching the source code because it's nice and quick
> 
> >I've been running CVS for more than two hours now, and it has terminated 
> >with a broken connection 10 (make that 11) times so far.
> 
> 
> The funny thing is that it works the opposite way for me. CVS checkout
> works without problems and Mercurial checkouts almost always time out
> or aren't successful.
> 
> Should tell you that the software is only a small part of the problem.

Same for me; I've never had trouble with CVS trees and they always just work
and update fine.

Hg, on the other hand: I had to delete and re-checkout my hg tree *again*. I had
interrupted hg during a merge and, oh boy, it was completely shot and thought I
had tons of local changes that all conflicted, a whopping 500+ files or so, so I
resorted to just nuking the tree and checking it out again. This never happened
to my CVS tree.

So, no, hg is not mature enough yet to switch over to and don't get me started
on git!

Reinoud





Re: NVMM failure

2021-03-31 Thread Reinoud Zandijk
Hi,

On Mon, Mar 29, 2021 at 10:57:36AM +0100, Chavdar Ivanov wrote:
> One of the patches needs patching... I thought nvmm was upstreamed and
> we didn't need the patches for it?
> 
> Index: patch-target_i386_nvmm_all.c
> ===
> RCS file: 
> /cvsroot/pkgsrc/emulators/qemu/patches/patch-target_i386_nvmm_all.c,v

Well, the last upstreaming of NVMM bounced and got sidetracked; the enhanced
version of NVMM addresses some of the issues found. I'm currently in the process
of re-upstreaming it. Could you please try out Qemu 5.2.0nb5 from emulators/qemu
in pkgsrc-current? It can then be pulled up to the latest pkgsrc branch.

With regards,
Reinoud



Re: panic: _bus_virt_to_bus for vioif on GCE with GENERIC kernel

2021-02-01 Thread Reinoud Zandijk
Hi Paul,

On Mon, Feb 01, 2021 at 06:46:17PM +1100, Paul Ripke wrote:
> On Mon, Feb 01, 2021 at 04:18:17PM +1100, Paul Ripke wrote:
> > However, forcing the full size virtio_net_hdr results in a working kernel!
...
> > Does that give any hints?

I'll double check all header size dependent code again. This is very odd but
good to know it makes a difference.

> Major correction: that patch results in a *booting* kernel, but without a
> working NIC. I forgot I was logged on via the serial console...

That's not surprising, since the header lengths are wrong :)

> > > Legacy support has to be disabled in the hypervisor (like GCE) as it 
> > > needs to
> > > pass a different PCI product number. In Qemu its a property of each 
> > > virtio PCI
> > > device but in GCE it might be global.
> > 
> > Ah, I had wondered if that was the case. I haven't seen anything in the GCE
> > configs to control this; Googling for answers is also made awkward given
> > the ambiguous "PCI" acronym.

It's a wonder you got that far :) From what I read in the Google Compute Engine
docs it's far from trivial to set one up. It looks like they wanted to create a
Swiss army knife that can do everything in one tool.

Reinoud



Re: panic: _bus_virt_to_bus for vioif on GCE with GENERIC kernel

2021-01-31 Thread Reinoud Zandijk
Dear Paul,

On Sat, Jan 30, 2021 at 10:32:13PM +1100, Paul Ripke wrote:
> On Sat, Jan 30, 2021 at 12:37:31AM +0100, Reinoud Zandijk wrote:
> > On Thu, Jan 28, 2021 at 11:56:30PM +1100, Paul Ripke wrote:
> > > Just tried running a newly built kernel on a GCE instance, and ran into
> > > this panic. The previously running kernel is 9.99.73 from back around
> > > October last year.

> Confirmed that a kernel built immediately prior to the following commit
> works,
> and fails after this commit:
> https://github.com/NetBSD/src/commit/7bca0bcf21c9b3465a6ee4eef6c01be32c9de1eb

That's good to know; I found a bug in memory allocation that might explain your
panic and committed a fix for it. Could you please try out -current and see if
the problem persists?

> > Could you A) test with virtio v1 PCI devices? ie without legacy and if
> > that
> > fails too could you B) test with src/sys/dev/pci/if_vioif.c:832 commented
> > out
> > and see if that makes a difference? That's a new virtio 1.0 feature that
> > was
> > apparently negotiated and should work in transitional devices and should
> > not
> > be accepted in older. It could be that CGE is making a mistake there but
> > negotiating EVENT_IDX shifts registers so has a big impact if it goes
> > wrong.
> 
> A) Erm, how? Read thru some of the source and saw mentions of v1.0 vs v0.9,
> but didn't see a way of just disabling legacy support

Legacy support has to be disabled in the hypervisor (like GCE) as it needs to
pass a different PCI product number. In Qemu it's a property of each virtio PCI
device, but in GCE it might be global.

With regards,
Reinoud





Re: panic: _bus_virt_to_bus for vioif on GCE with GENERIC kernel

2021-01-29 Thread Reinoud Zandijk
On Thu, Jan 28, 2021 at 11:56:30PM +1100, Paul Ripke wrote:
> Just tried running a newly built kernel on a GCE instance, and ran into
> this panic. The previously running kernel is 9.99.73 from back around
> October last year.
> 
> Anyone else tried booting -current on GCE recently? My suspicion is
> the VirtIO changes committed around Jan 20. I'll sync back prior to
> those and retry, if nobody else beats me to it.

Not on GCE no. Have you tried the earlier version?

> [   1.0303647] piixpm0: SMBus disabled
> [   1.0303647] virtio0 at pci0 dev 3 function 0
> [   1.0303647] virtio0: SCSI device (rev. 0x00)
> [   1.0303647] vioscsi0 at virtio0: features: 0
> [   1.0303647] vioscsi0: cmd_per_lun 8 qsize 8192 seg_max 64 max_target 253 
> max_lun 1
> [   1.0303647] virtio0: config interrupting at msix0 vec 0
> [   1.0303647] virtio0: queues interrupting at msix0 vec 1
> [   1.0303647] scsibus0 at vioscsi0: 16 targets, 1 lun per target
> [   1.0303647] virtio1 at pci0 dev 4 function 0
> [   1.0303647] virtio1: network device (rev. 0x00)
> [   1.0303647] vioif0 at virtio1: features: 
> 0x20030020

Could you A) test with virtio v1 PCI devices, i.e. without legacy, and if that
fails too, B) test with src/sys/dev/pci/if_vioif.c:832 commented out, and see if
that makes a difference? That is a new virtio 1.0 feature (EVENT_IDX) that was
apparently negotiated; it should work in transitional devices and should not be
accepted by older ones. It could be that GCE is making a mistake there, but
negotiating EVENT_IDX shifts registers, so it has a big impact if it goes wrong.

If commenting out the EVENT_IDX works, it is easily solvable by disabling it
on PCI v0.9 attachments to work around GCE.
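
What I mean is something along these lines in the feature negotiation (sketch
only; I'd have to look up the exact macro name in virtioreg.h):

    /* don't negotiate EVENT_IDX on legacy (v0.9) PCI attachments */
    if (attached_via_legacy_09)                   /* hypothetical condition */
        features &= ~VIRTIO_F_RING_EVENT_IDX;     /* macro name from memory */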

Qemu 5.1.0 does work fine with the new kernel:

[   1.0189426] virtio5 at pci0 dev 8 function 0
[   1.0189426] virtio5: network device (rev. 0x00)
[   1.0189426] vioif0 at virtio5: features: 
0x31070020
[   1.0189426] vioif0: Ethernet address 52:54:00:12:34:56
[   1.0189426] virtio5: config interrupting at msix2 vec 0
[   1.0189426] virtio5: queues interrupting at msix2 vec 1
[   1.0189426] isa0 at pcib0

Interestingly, INDIRECT_DESC and NOTIFY_ON_EMPTY (v0.9) are not negotiated with
GCE in your config; EVENT_IDX is the successor of NOTIFY_ON_EMPTY and should
work fine on its own, since it is always alone in v1.0.

What is strange is that INDIRECT_DESC is not being negotiated. I haven't touched
that code at all and Qemu always offers it. Is this also the case with older
kernels?

Thanks in advance,
Reinoud





Re: virtio scsi under VirtualBox

2021-01-22 Thread Reinoud Zandijk
On Fri, Jan 22, 2021 at 02:23:46PM +, Chavdar Ivanov wrote:
> After the latest virtio commits I no longer get the panic; the
> virtio-scsi device is recognized and the bus is created; however, a
> disk attached to it is not seen at all.

That's interesting, since nothing changed that ought to have had any influence;
the only change, for virtio PCI v1.0's i386 compat, was that the
bus_space_write_8() was split into two bus_space_write_4()'s, as the spec
allows. Since amd64 can write at any alignment, it's odd that the write_8 would
fail, unless it is a VirtualBox emulation quirk that doesn't expect an 8-byte
register to be written in one go. Very strange; I'll see if I can look into that.

But the discs are there in the dmesg! See

sd0 at scsibus1 target 0 lun 0:  disk fixed
sd0: fabricating a geometry 
sd0: 16384 MB, 16384 cyl, 64 head, 32 sec, 512 bytes/sect x 33554432 sectors
sd0: fabricating a geometry
sd1 at scsibus1 target 1 lun 0:  disk fixed
sd1: fabricating a geometry
sd1: 16384 MB, 16384 cyl, 64 head, 32 sec, 512 bytes/sect x 33554432 sectors
sd1: fabricating a geometry
sd2 at scsibus1 target 2 lun 0:  disk fixed
sd2: fabricating a geometry
sd2: 32768 MB, 32768 cyl, 64 head, 32 sec, 512 bytes/sect x 67108864 sectors
sd2: fabricating a geometry

So you can just access them using `disklabel', fsck them, mount them, etc. Since
they don't spawn dk* devices I presume they are not GPT.

Please let me know if anything is wrong?

Reinoud



Re: virtio scsi under VirtualBox

2021-01-21 Thread Reinoud Zandijk
On Thu, Jan 21, 2021 at 11:45:25AM +, Chavdar Ivanov wrote:
> On Sat, 21 Nov 2020 at 16:37, Chavdar Ivanov  wrote:
> I see there were a few recent commits around sys/dev/pci/virtio*. Just
> to mention that the presence of a virtio scsi device in a -current
> (from yesterday) vm under VirtualBox 6.1.16 now results in a Guru
> Meditation Error:
> ..
> 00:00:16.505215 [ 1.0443388] virtio1: SCSI device (rev. 0x01)
> 00:00:16.505234 [ 1.0443388] vioscsi0 at virtio1: features: 0x1

It doesn't report that the feature negotiation goes wrong before printing its
features, so the host is accepting v1 but nothing else and that should be OK.

> 00:00:16.505323 virtio-scsi#0: virtio-scsci
> numTargets=1!!
> 00:00:16.505391 emR3Debug: rc=VERR_INVALID_PARAMETER

Clumsy error reporting of VirtualBox! What parameter, what call etc.

Could you please test the code with vioscsi revision 0, i.e. with the `legacy'
setting, and/or with the 'Virtio SCSI Single' setting if applicable?
Also, please boot the kernel with `netbsd -vx'.

Is there a way to get a better error message?

With regards,
Reinoud



Re: Automated report: NetBSD-current/i386 build failure

2021-01-21 Thread Reinoud Zandijk
I'd like to fix this ASAP but what is the correct way of dealing with this? Is
this an i386 failure or should code just not use bus_space_read_8() or
bus_space_write_8() ?

In the VirtIO case, it doesn't have to be written atomically though.

What I could do is define a bus_space_write_8() function that gets compiled
for i386 only but that's a hack.
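
To illustrate the hack (untested sketch; it assumes the device accepts the
64-bit register as two 32-bit halves, low half first, which virtio 1.0 allows):

    /* i386-only illustration: emulate a 64-bit write with two 32-bit writes */
    static inline void
    virtio_pci_bus_space_write_8(bus_space_tag_t iot, bus_space_handle_t ioh,
        bus_size_t off, uint64_t val)
    {
        bus_space_write_4(iot, ioh, off, (uint32_t)val);
        bus_space_write_4(iot, ioh, off + 4, (uint32_t)(val >> 32));
    }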

Reinoud

On Thu, Jan 21, 2021 at 07:59:03PM +0700, Robert Elz wrote:
> Date:Wed, 20 Jan 2021 21:17:40 + (UTC)
> From:NetBSD Test Fixture 
> Message-ID:  <161117746032.12857.1128493575446...@babylon5.netbsd.org>
> 
>   | This is an automatically generated notice of a NetBSD-current/i386
>   | build failure.
>   |
>   | The failure occurred on babylon5.netbsd.org, a NetBSD/amd64 host,
>   | using sources from CVS date 2021.01.20.19.46.48.
> 
> The problem that caused that particular failure is fixed (martin@
> fixed the immediate problem, then I fixed a much older bug (2019 vintage)
> that made martin's (correct) fix fail.
> 
> But these commits are still responsible for the build (still) failing
> 
>   | The following commits were made between the last successful build and
>   | the failed build:
>   |
>   | 2021.01.20.19.46.48 reinoud src/sys/dev/acpi/virtio_acpi.c,v 1.5
>   | 2021.01.20.19.46.48 reinoud src/sys/dev/fdt/virtio_mmio_fdt.c,v 1.5
>   | 2021.01.20.19.46.48 reinoud src/sys/dev/pci/if_vioif.c,v 1.66
>   | 2021.01.20.19.46.48 reinoud src/sys/dev/pci/ld_virtio.c,v 1.29
>   | 2021.01.20.19.46.48 reinoud src/sys/dev/pci/vio9p.c,v 1.3
>   | 2021.01.20.19.46.48 reinoud src/sys/dev/pci/viomb.c,v 1.12
>   | 2021.01.20.19.46.48 reinoud src/sys/dev/pci/viornd.c,v 1.14
>   | 2021.01.20.19.46.48 reinoud src/sys/dev/pci/vioscsi.c,v 1.25
>   | 2021.01.20.19.46.48 reinoud src/sys/dev/pci/virtio.c,v 1.43
>   | 2021.01.20.19.46.48 reinoud src/sys/dev/pci/virtio_pci.c,v 1.15
>   | 2021.01.20.19.46.48 reinoud src/sys/dev/pci/virtio_pcireg.h,v 1.1
>   | 2021.01.20.19.46.48 reinoud src/sys/dev/pci/virtioreg.h,v 1.7
>   | 2021.01.20.19.46.48 reinoud src/sys/dev/pci/virtiovar.h,v 1.17
>   | 2021.01.20.19.46.48 reinoud src/sys/dev/virtio/virtio_mmio.c,v 1.4
> 
> The most recent log is at:
> 
> http://releng.netbsd.org/b5reports/i386/2021/2021.01.21.09.50.37/build.log.tail
> 
> and contains the following error messages (all one underlying issue)
> 
> /tmp/build/2021.01.21.09.50.37-i386/tools/bin/i486--netbsdelf-ld: 
> virtio_pci.o: in function `virtio_pci_setup_queue_10':
> virtio_pci.c:(.text+0x1285): undefined reference to `bus_space_write_8'
> /tmp/build/2021.01.21.09.50.37-i386/tools/bin/i486--netbsdelf-ld: 
> virtio_pci.c:(.text+0x12a9): undefined reference to `bus_space_write_8'
> /tmp/build/2021.01.21.09.50.37-i386/tools/bin/i486--netbsdelf-ld: 
> virtio_pci.c:(.text+0x12cd): undefined reference to `bus_space_write_8'
> /tmp/build/2021.01.21.09.50.37-i386/tools/bin/i486--netbsdelf-ld: 
> virtio_pci.c:(.text+0x132e): undefined reference to `bus_space_write_8'
> /tmp/build/2021.01.21.09.50.37-i386/tools/bin/i486--netbsdelf-ld: 
> virtio_pci.c:(.text+0x1357): undefined reference to `bus_space_write_8'
> /tmp/build/2021.01.21.09.50.37-i386/tools/bin/i486--netbsdelf-ld: 
> virtio_pci.o:virtio_pci.c:(.text+0x1380): more undefined references to 
> `bus_space_write_8' follow
> 
> This is as far as I can go with this one, I don't know whether i386 is
> supposed to have a bus_space_write_8() function or not (and if not, what
> should be used instead) or whether it exists, but for some reason isn't
> being found (the error is detected in the INSTALL kernel build), or
> something else.
> 
> kre


Re: Unusable system during dd to USB block device

2020-12-30 Thread Reinoud Zandijk
On Tue, Dec 29, 2020 at 10:02:44AM +, Thomas Mueller wrote:
> > On Tue, 29 Dec 2020, Thomas Mueller wrote:
> > > I tried to dd to /dev/sd2 (or whatever the number was), and it looked
> > > good on the screen, no error message, but nothing happened on the USB 
> > > stick.
> 
> > 3 possibilities here:
> 
> > a) You wrote to the wrong device.
> > b) You now have a copy of the input file in /dev
> > c) _Very_ exotic HW issue. (You write to the device; the device says "write 
> > OK"
> > but, it doesn't actually write anything. I've known this to happen on 
> > some
> > models of eMMC/SDHC cards.)

I suspect b) :) It has happened to me before too. Just check your /dev with `ls
-altr'; it should show up easily.

As for the device name, `dmesg | tail` will tell you which drive was added; then
use the raw partition, like /dev/rsd2 or similar.
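
For example, something along these lines (on amd64 the whole-disk raw partition
is normally the `d' one; the image name here is just a placeholder):

    dd if=install-image.img of=/dev/rsd2d bs=1m progress=1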

Reinoud



Re: recent sysinst UX changes

2020-11-09 Thread Reinoud Zandijk
On Mon, Nov 09, 2020 at 10:10:56AM +, nia wrote:
> i run into it on real hardware, thinkpad t60.
> 
> my preference is:
> 
> - when booting in a VM, if there is no RNG device attached,
>   the system should print a warning with instructions on how
>   to attach the device.

In practice this means running in Qemu, I guess? For all machines there is the
possibility of a virtio-rng device as per the spec (is there another option?),
as mentioned in the virtio bounty on tech-kern@. For x86_64 aka AMD64 the
situation can be a lot easier.
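
Attaching a virtio-rng device to a Qemu guest is something like this (options
from memory, so please double-check):

    -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0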

When running Qemu on a recent host using NVMM, the RDRAND instruction is not
trapped and will use the host's entropy. I've checked this by installing from
the 9.99.75 installation CD onto a hard disc: at no time was I asked for
entropy. So apparently, when using qemu+nvmm, new installs automatically get
good entropy to start with.

Reinoud



Oddball small memory qemu images and networking

2020-11-09 Thread Reinoud Zandijk
Hi folks,

On Mon, May 18, 2020 at 09:41:17PM +, Andrew Doran wrote:
> Finally got around to trying this.  Having beaten on it for a while with
> real hardware I don't see any problem with swapping over NFS on 9.99.63.
> 
> On Sat, May 02, 2020 at 12:06:48PM +1000, Paul Ripke wrote:
> 
> > I have a qemu guest for experimenting with -current, 1 CPU & 64MiB RAM.
> 
> 64 megs, I'm surprised it makes to the login prompt.

I can get a prompt and compile some stuff locally in 48 MiB, but no network
until it has 80 MiB.

As a challenge I tried a bog-standard GENERIC 9.99.69/amd64 on qemu with small
amounts of memory, using a `harddisc' that includes swap space, and increased
the memory size in steps of 2 MiB.
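
The invocation was something along these lines, with -m and -smp varied per run
(a sketch, not the exact command I used):

    qemu-system-x86_64 -m 44 -smp 1 -accel nvmm -drive file=wd0.img,format=raw -nographic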

With one CPU
40 MiB boots but fsck_ffs killed due to out of swap; swap is later?
42 MiB booted to multiuser, mount_ffs failing
44 MiB booted to multiuser but is bog slow due to swap, no dhcpcd
46 MiB booted to multiuser fine, usable, no dhcpcd
...
66 MiB booted to multiuser fine, dhcpcd sometimes works
...
80 MiB booted to multiuser fine, dhcpcd works reliable


With 2 CPUs :
44 MiB doesn't boot and creates a panic in pmap_get_physpage()
46 MiB booted to multiuser but slowish due to swapping
48 MiB booted to multiuser fine, more usable, no dhcpcd
...
54 MiB booted to multiuser fine, dhcpcd sporadically works
...
60 MiB booted to multiuser fine, dhcpcd sometimes works
...
70 MiB booted to multiuser fine, dhcpcd mostly works
...
80 MiB booted to multiuser fine, dhcpcd works reliable


Network buffers are thus limiting usability on such oddly small memory systems
even though swap is available; the kernel can't allocate memory for SIOCAIFADDR
when configuring the interface manually. It's strange that it sometimes succeeds
though!

Not useless though: even without network, a 1-CPU machine with 48 MiB can
already compile locally on FFS.

Note that on very low memory (60MiB) I can compile programs hosted over NFS
(when dhcpcd works) though SIGINFO can print odd stuff like:

[ 174.2806880] load: 0.48  cmd: cc1 3164 [0x693f93fb] 0.00u 267344117007.14s
1% 9572k

When compiling with more memory, say 80 MiB, it prints normal times.

Hope this was of help,
Reinoud



Re: benchmark results on ryzen 3950x with netbsd-9, -current, and -current (no DIAGNOSTIC)

2020-11-09 Thread Reinoud Zandijk
On Sat, Nov 07, 2020 at 06:12:34PM +1100, matthew green wrote:
> this is an update on my previous testing.  i've excluded
> amd64 release builds from this set, takes too long ;)

Indeed remarkable! I assume you built the same source tree on the various
versions? Did you use the benchmarking stuff that was done during the SoC, or
did you do it `manually' ;)

I am curious as to what is AMD64-specific, i.e. how the speeds compare on, say,
a big Sparc64 or a beefy AArch64 system in the cloud. My guess is that having
less memory is going to haunt them, since harddiscs and the like are going to
hinder good performance more than code enhancements help.

Thanks a lot for testing!

Reinoud



Update, its on NFS only (Re: ffs corruption on 9.66.69 on qemu)

2020-10-12 Thread Reinoud Zandijk
Hi folks,

On Sun, Oct 11, 2020 at 04:10:00PM +0200, Reinoud Zandijk wrote:
> qemu-system-x86_64 -m 4096 -accel nvmm -smp cpus=2 -drive \
>   file=work/wd0.img,format=raw -nographic -gdb tcp::1234 -net nic -net \
>   tap,ifname=tap0,script=no
> 
> Now, is this a qemu-related problem? I am a bit hesitant to try the -current
> kernel out on real hardware, but I've run a couple of minor versions on real
> hardware without any issue so far.

I've tried out different combinations, and using NVMM or not doesn't make any
difference. Using it with WAPBL or without doesn't matter either.

The image that got corrupted was served over NFS from another NetBSD-9 amd64
machine. When I tried the image locally on FFS the problem disappeared! The
image was modified correctly and got a clean bill of health.

When I watch the file being opened, the only flags given are O_RDWR | O_CLOEXEC.
Could this interact with NFS in a different way than with FFS?

Any ideas?
Reinoud





qemu-nvmm cpu accounting problem

2020-10-11 Thread Reinoud Zandijk
Hi Maxime, hi Andrew,

I'm currently playing with qemu/nvmm on this Intel machine, and when checking
whether it really uses both CPUs I noticed that `top' shows the
qemu-system-x86_64 process at 150%+ CPU time, but hardly any of it is registered
as user time: it is almost all accounted as system time. When compiling on both
virtual processors in qemu I can get 70+% CPU time in systat, but only 1.5% user
and the rest all system time.

I'm not sure how it is measured, but something is off for sure... or is this by
design?

With regards,
Reinoud Zandijk





Re: Daily packages for NetBSD/amd64 current

2020-07-26 Thread Reinoud Zandijk
Hi Jonathan,

On Sat, Jul 25, 2020 at 10:52:48PM +0100, Jonathan Perkin wrote:
> I needed a NetBSD-current system for testing pkgin changes, and figured I
> may as well also set it up for daily package builds.
> 
> So if anybody would like a repository for the latest packages then head over
> to https://pkgsrc.joyent.com/install-on-netbsd/ to install the bootstrap
> kit.

The bootstrap part is a bit confusing to me; why don't you just provide a
directory suitable for pkgin's /usr/pkg/etc/pkgin/repositories.conf? Is that to
explicitly allow only signed packages?
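
i.e. something one could drop straight into repositories.conf, one repository
URL per line, along the lines of (URL invented here just to show the format):

    https://pkg.example.org/packages/NetBSD/amd64/current/All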

With regards,
Reinoud



NiLFS changes and usage inquiry

2020-03-09 Thread Reinoud Zandijk
Dear folks,

I'm currently looking at getting NiLFS write support going. Since the Linux
adoption of NiLFS seems to be waning, I'm contemplating an incompatible change
to make it more efficient and less error-prone without losing its continuous
snapshot feature, which is IMHO one of the major advantages of NiLFS. One could
call it NiLFS3, though that is just a working title. Compatibility with NiLFS2
could be kept, but at a complexity cost that might not be justifiable.

My question now is: would you support such a move, or would you rather keep it
compatible with the NiLFS2 implementation in Linux? I don't know how many of you
use this read compatibility right now or would like to keep it. Since disc or
device exchange is getting rarer these days due to fast interconnects, it might
be less relevant to keep the compatibility, given that Linux also shares FAT,
NTFS, NFS and ext2(3?4?) file systems. That said, I'm trying to get one of the
original developers of NiLFS2 on Linux to consider porting the changes into the
Linux version too, but I haven't had a reply yet.

Please do share your views with me; I'd really like feature requests too!

With regards,
Reinoud Zandijk




Re: daily CVS update output

2018-07-17 Thread Reinoud Zandijk
On Mon, Jul 16, 2018 at 03:11:06AM +, NetBSD source update wrote:
> cvs update: `src/share/man/man4/ipkdb.4' is no longer in the repository
...

I must have missed a discussion, but why has IPKDB been removed? Even if it was
dysfunctional, it would be better to repair it, or at least make it work in Qemu.

Reinoud



make(1) cleanup patch

2018-07-11 Thread Reinoud Zandijk
Dear folks,

in my attempt to clean up make(1) I stumbled on the following duplicate code:
it was superseded by the cached_stat() addition, but the old code was not
removed.

I ran the unit tests and all output is identical, and all ATF tests still pass.

If there are no objections I'd like to commit these patches,
Reinoud



Index: dir.c
===
RCS file: /cvsroot/src/usr.bin/make/dir.c,v
retrieving revision 1.71
diff -u -p -r1.71 dir.c
--- dir.c   16 Apr 2017 21:14:47 -0000  1.71
+++ dir.c   11 Jul 2018 18:47:44 -0000
@@ -268,15 +268,6 @@ struct cache_st {
 };
 
 /* minimize changes below */
-static time_t
-Hash_GetTimeValue(Hash_Entry *entry)
-{
-struct cache_st *cst;
-
-cst = entry->clientPtr;
-return cst->mtime;
-}
-
 #define CST_LSTAT 1
 #define CST_UPDATE 2
 
@@ -1134,7 +1125,6 @@ Dir_FindFile(const char *name, Lst path)
 Boolean  hasLastDot = FALSE;   /* true we should search dot last */
 Boolean  hasSlash; /* true if 'name' contains a / */
struct stat  stb;  /* Buffer for stat, if necessary */
-Hash_Entry   *entry;   /* Entry for mtimes table */
 const char   *trailing_dot = ".";
 
 /*
@@ -1395,13 +1385,7 @@ Dir_FindFile(const char *name, Lst path)
 }
 
 bigmisses += 1;
-entry = Hash_FindEntry(&mtimes, name);
-if (entry != NULL) {
-   if (DEBUG(DIR)) {
-   fprintf(debug_file, "   got it (in mtime cache)\n");
-   }
-   return(bmake_strdup(name));
-} else if (cached_stat(name, &stb) == 0) {
+if (cached_stat(name, &stb) == 0) {
 if (DEBUG(DIR)) {
 fprintf(debug_file, "   Caching %s for %s\n", Targ_FmtTime(stb.st_mtime),
 name);
@@ -1518,7 +1502,6 @@ Dir_MTime(GNode *gn, Boolean recheck)
 {
 char  *fullName;  /* the full pathname of name */
 struct stat  stb;/* buffer for finding the mod time */
-Hash_Entry   *entry;
 
 if (gn->type & OP_ARCHV) {
return Arch_MTime(gn);
@@ -1569,17 +1552,7 @@ Dir_MTime(GNode *gn, Boolean recheck)
fullName = bmake_strdup(gn->name);
 }
 
-if (!recheck)
-   entry = Hash_FindEntry(&mtimes, fullName);
-else
-   entry = NULL;
-if (entry != NULL) {
-   stb.st_mtime = Hash_GetTimeValue(entry);
-   if (DEBUG(DIR)) {
-   fprintf(debug_file, "Using cached time %s for %s\n",
-   Targ_FmtTime(stb.st_mtime), fullName);
-   }
-} else if (cached_stats(&mtimes, fullName, &stb, recheck ? CST_UPDATE : 0) < 0) {
+if (cached_stats(&mtimes, fullName, &stb, recheck ? CST_UPDATE : 0) < 0) {
if (gn->type & OP_MEMBER) {
if (fullName != gn->path)
free(fullName);




Re: ./build.sh -k feature request (was Re: GENERIC Kernel Build errors with -fsanitize=undefined option enabled)

2018-06-25 Thread Reinoud Zandijk
On Mon, Jun 25, 2018 at 10:29:30PM +0200, Kamil Rytarowski wrote:
> I'm usually going the other direction. If I hit a problem and an issue
> is not visible from a multi-job build (multiple processes print onto the
> same screen concurrently), I go for -j1. If this is still not enough I
> change the verbosity level and check commands that caused a problem.
> 
> The -k option wouldn't help me.
> 
> On the other hand, can you just specify -V MAKEFLAGS="-k" in your local
> build?
> 

AFAIK, the MAKEFLAGS are cleared on start; I remember having tried something
like this before.

Well, it can also help to just build everything with, say, -k -j10 and then
inspect the failures again with -k -j1.

Reinoud



Re: ./build.sh -k feature request (was Re: GENERIC Kernel Build errors with -fsanitize=undefined option enabled)

2018-06-25 Thread Reinoud Zandijk
the patch:

Index: build.sh
===
RCS file: /cvsroot/src/build.sh,v
retrieving revision 1.327
diff -u -p -r1.327 build.sh
--- build.sh    2 May 2018 07:34:44 -0000   1.327
+++ build.sh    25 Jun 2018 18:26:34 -0000
@@ -1027,7 +1027,7 @@ usage()
cat <<_usage_
 
 Usage: ${progname} [-EhnoPRrUuxy] [-a arch] [-B buildid] [-C cdextras]
-[-D dest] [-j njob] [-M obj] [-m mach] [-N noisy]
+[-D dest] [-j njob] [-k] [-M obj] [-m mach] [-N noisy]
 [-O obj] [-R release] [-S seed] [-T tools]
 [-V var=[value]] [-w wrapper] [-X x11src] [-Y extsrcsrc]
 [-Z var]
@@ -1084,6 +1084,9 @@ Usage: ${progname} [-EhnoPRrUuxy] [-a ar
            Should not be used without expert knowledge of the build system.
 -h         Print this help message.
 -j njob    Run up to njob jobs in parallel; see make(1) -j.
+-k         Continue processing after errors are encountered, but only on
+           those targets that do not depend on the target whose creation
+           caused the error.
 -M obj     Set obj root directory to obj; sets MAKEOBJDIRPREFIX.
            Unsets MAKEOBJDIR.
 -m mach    Set MACHINE to mach.  Some mach values are actually
@@ -1128,7 +1131,7 @@ _usage_
 
 parseoptions()
 {
-   opts='a:B:C:D:Ehj:M:m:N:nO:oPR:rS:T:UuV:w:X:xY:yZ:'
+   opts='a:B:C:D:Ehj:kM:m:N:nO:oPR:rS:T:UuV:w:X:xY:yZ:'
opt_a=false
opt_m=false
 
@@ -1188,6 +1191,10 @@ parseoptions()
parallel="-j ${OPTARG}"
;;
 
+   -k)
+   MAKEFLAGS="-k ${MAKEFLAGS}"
+   ;;
+
-M)
eval ${optargcmd}; resolvepath OPTARG
case "${OPTARG}" in





Re: ./build.sh -k feature request (was Re: GENERIC Kernel Build errors with -fsanitize=undefined option enabled)

2018-06-25 Thread Reinoud Zandijk
On Sun, Jun 24, 2018 at 10:01:42PM +0200, Reinoud Zandijk wrote:
> On Wed, May 30, 2018 at 07:11:19PM +0800, Harry Pantazis wrote:
> > Continuing..
> > 
> > The first errors are located in
> > src/sys/external/bsd/drm2/dist/drm/i915/intel_ddi.c and are specific to
> > the switch statement concerning that the case flags are not reducing
> > directly to integer constants.
> 
> I'd like to request a -k flag to ./build.sh that as with a normal make(1)
> continues to build as much as possible. This will result in reporting all
> errors in one go without needing the 1st to be resolved before the 2nd shows
> up!

The attached patch will do this; any objections against me committing it? It
allows everything that is buildable to be built, and the failing files can then
be compiled later, once patched, with the -u option.

With regards,
Reinoud





showstopper, libGL error with (at least) radeon in -current

2015-10-30 Thread Reinoud Zandijk
Hi folks,

I've just basically re-installed my NetBSD/amd64 with -current from the 26th of
October, and I'm running into libGL init issues:

> glxgears 
libGL error: unable to load driver: r600_dri.so
libGL error: driver pointer missing
libGL error: failed to load driver: r600
libGL error: unable to load driver: swrast_dri.so
libGL error: failed to load driver: swrast
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
...

Looking at the ktrace I don't see any obvious issues nor am I seeing any
missing dependencies. It runs on:

radeon0 at pci2 dev 0 function 0: vendor 1002 product 9440 (rev. 0x00)
drm: initializing kernel modesetting (RV770 0x1002:0x9440 0x174B:0x114A).

Any ideas?

With regards,
Reinoud





Re: blacklistd is now available for current (comments?)

2015-07-12 Thread Reinoud Zandijk
Hi Christos,

Thanks for your blacklistd; it's so much more lightweight than the others I've
seen in pkgsrc, and it really frees up my small NAS. I've installed the -current
version as in the tree.

There are a few oddities though, and maybe you could enlighten me on those.

First of all, your name is still in a custom rule in the default installed
blacklistd.conf. I'd say just comment it out :)

More importantly, blacklistctl can only dump rules; it doesn't have commands for
adding or removing rules manually. So when I had to manually allow a machine, my
only option was to truncate the db file and restart blacklistd. I later learned
that blacklistd also has a -f option to do this, but it's a bit odd that there
isn't, say, a `blacklistctl allow host port' that reverses a decision it made.

`blacklistctl dump' without the '-a' doesn't show anything even when there are
machines blacklisted with timeouts.

With regards,
Reinoud





pkg_add fails to install earmv5 packages on full earmv5 machine

2015-07-12 Thread Reinoud Zandijk
Hi folks,

I'm experiencing problems on evbarm when I try to install a package, in this
case pkg_install itself.

I've compiled the kernel and its userland using ./build.sh -mevbarmv5-el ...
and it's running fine. When I try to install a package, however, I get:

=== Install binary package of pkg_install-20150508
pkg_add: Warning: package `pkg_install-20150508' was built for a platform:
pkg_add: NetBSD/earmv5 7.99.19 (pkg) vs. NetBSD/earm 7.99.19 (this host)
pkg_add: 1 package addition failed

It looks like pkg_add is getting the wrong architecture value even though
sysctl reports hw.machine_arch = earmv5

If I force the install with pkg_add -f I get the same behaviour with the newly
installed /usr/pkg/sbin/pkg_add. There is still the default path issue of
picking /usr/sbin first, but that's another discussion that ought to be resolved!

I learned that there is a patch somewhere that also takes the various supersets
into account?

With regards,
Reinoud


