Re: Testing requested for the next version of GNU Mach
Richard Braun, on Tue 15 Mar 2016 00:40:40 +0100, wrote:
> On Tue, Mar 15, 2016 at 12:32:43AM +0100, Samuel Thibault wrote:
> > Richard Braun, on Mon 14 Mar 2016 23:37:33 +0100, wrote:
> > > On Mon, Mar 14, 2016 at 06:27:24PM -0400, David Michael wrote:
> > > > After poking around a bit more, it seems that the space is eaten by
> > > > debug info. Appending -g0 to CFLAGS allowed gnumach to boot
> > > > successfully and resulted in this:
> > > >
> > > > vm_page: DMA: pages: 4080 (15M), free: 2053 (8M)
> > > >
> > > > The difference may be because GRUB apparently passes the multiboot ELF
> > > > info, while QEMU does not. I'll just make sure to disable debug flags
> > > > when building gnumach for the time being.
> > >
> > > Yes that makes sense.
> >
> > Uh. Should we perhaps move the kernel out of the 16MiB area? I don't
> > think we have any such requirement here, AIUI we are just putting it
> > where it poses the least fragmentation issues, but the scarce DMA
> > resource is even more important.
>
> That's what Linux does. It should be configurable though.

I ended up needing it, so I did it.

Samuel
Re: Testing requested for the next version of GNU Mach
On Tue, Mar 15, 2016 at 12:32:43AM +0100, Samuel Thibault wrote:
> Richard Braun, on Mon 14 Mar 2016 23:37:33 +0100, wrote:
> > On Mon, Mar 14, 2016 at 06:27:24PM -0400, David Michael wrote:
> > > After poking around a bit more, it seems that the space is eaten by
> > > debug info. Appending -g0 to CFLAGS allowed gnumach to boot
> > > successfully and resulted in this:
> > >
> > > vm_page: DMA: pages: 4080 (15M), free: 2053 (8M)
> > >
> > > The difference may be because GRUB apparently passes the multiboot ELF
> > > info, while QEMU does not. I'll just make sure to disable debug flags
> > > when building gnumach for the time being.
> >
> > Yes that makes sense.
>
> Uh. Should we perhaps move the kernel out of the 16MiB area? I don't
> think we have any such requirement here, AIUI we are just putting it
> where it poses the least fragmentation issues, but the scarce DMA
> resource is even more important.

That's what Linux does. It should be configurable though.

--
Richard Braun
Re: Testing requested for the next version of GNU Mach
Richard Braun, on Mon 14 Mar 2016 23:37:33 +0100, wrote:
> On Mon, Mar 14, 2016 at 06:27:24PM -0400, David Michael wrote:
> > After poking around a bit more, it seems that the space is eaten by
> > debug info. Appending -g0 to CFLAGS allowed gnumach to boot
> > successfully and resulted in this:
> >
> > vm_page: DMA: pages: 4080 (15M), free: 2053 (8M)
> >
> > The difference may be because GRUB apparently passes the multiboot ELF
> > info, while QEMU does not. I'll just make sure to disable debug flags
> > when building gnumach for the time being.
>
> Yes that makes sense.

Uh. Should we perhaps move the kernel out of the 16MiB area? I don't
think we have any such requirement here, AIUI we are just putting it
where it poses the least fragmentation issues, but the scarce DMA
resource is even more important.

Samuel
Re: Testing requested for the next version of GNU Mach
On Mon, Mar 14, 2016 at 06:27:24PM -0400, David Michael wrote:
> > The realtek board shouldn't be working without DDE... Something looks
> > wrong in your test setup.
>
> I've been using it almost exclusively for the last three or four years
> without DDE. It works fine for me until I try to change the IP
> address with fsysopts, then kernel panics ensue.

If you really built the kernel with --disable-rtl8139, it shouldn't be
working.

> After poking around a bit more, it seems that the space is eaten by
> debug info. Appending -g0 to CFLAGS allowed gnumach to boot
> successfully and resulted in this:
>
> vm_page: DMA: pages: 4080 (15M), free: 2053 (8M)
>
> The difference may be because GRUB apparently passes the multiboot ELF
> info, while QEMU does not. I'll just make sure to disable debug flags
> when building gnumach for the time being.

Yes that makes sense.

--
Richard Braun
Re: Testing requested for the next version of GNU Mach
On Sun, Mar 13, 2016 at 9:06 AM, Richard Braun wrote:
> On Fri, Mar 11, 2016 at 05:38:06PM -0500, David Michael wrote:
>> I didn't get a chance to try with Debian yet, but after looking a bit
>> more, the failure I'm getting is from linux_kmem_init() allocating
>> memory. It panics in vm_page_grab_contig() when allocating the 29th
>> (i=28) chunk.
>
> This means the allocator can't allocate 64k contiguous physical memory
> in the first 16M.
>
>> I tried reducing MEM_CHUNKS from 32 to 24, but it then panicked when
>> linux_init() tried to allocate more memory immediately after. After
>
> That doesn't make sense. Could you report the panic message ?

It was "panic: vm_page_grab_contig" in all cases.

>> reducing it to 16, gnumach booted fine, and everything worked properly
>> (including rtl8139 devices).
>
> The realtek board shouldn't be working without DDE... Something looks
> wrong in your test setup.

I've been using it almost exclusively for the last three or four years
without DDE. It works fine for me until I try to change the IP address
with fsysopts, then kernel panics ensue. I've switched my QEMU commands
to -device pcnet just in case.

>> The panics don't seem to be affected by configure options or device
>> drivers used. This is with the latest mostly pristine upstream code
>> from gnumach and GRUB, but I will try to see what Debian is doing
>> differently over the weekend.
>
> This is what I have when I boot :
> vm_page: DMA: pages: 4080 (15M), free: 1265 (4M)
>
> The amount of free memory is already quite low for some reason, but on
> your machine, it's even lower :
> vm_page: DMA: pages: 4080 (15M), free: 451 (1M)
>
> This explains why your test succeeds with 16 chunks (16*64k = 1M). But
> we have to understand why you have so little memory in the first place.
> It could be that newer versions of GRUB place more boot data, or
> align them differently, and that we need to do a better job at freeing
> boot data. I'm also not sure why my changes affect that...
>
> What's your version of GRUB (please don't say "latest", always state
> the IDs explicitly) ?

I've encountered the failure with both GRUB beta2 and beta3. They're
from last month and over two years ago, so I don't think the change was
too recent.

After poking around a bit more, it seems that the space is eaten by
debug info. Appending -g0 to CFLAGS allowed gnumach to boot
successfully and resulted in this:

vm_page: DMA: pages: 4080 (15M), free: 2053 (8M)

The difference may be because GRUB apparently passes the multiboot ELF
info, while QEMU does not. I'll just make sure to disable debug flags
when building gnumach for the time being.

Thanks.

David
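To put a number on the effect of -g0 described in the message above, the two free-page counts it reports can be compared directly. This is only a rough sketch: the 4 KiB page size is an assumption (typical for i386, not stated in the thread), and attributing the whole difference to debug info follows David's guess rather than any measurement.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages (i386), not stated in the thread

free_with_debug = 451      # "free: 451" reported when built with debug info
free_without_debug = 2053  # "free: 2053" reported after appending -g0

# Extra DMA pages available once the debug info is gone.
extra_pages = free_without_debug - free_with_debug
extra_kib = extra_pages * PAGE_SIZE // 1024

print(f"{extra_pages} more free DMA pages ({extra_kib} KiB)")
```

Under those assumptions, a bit more than 6 MiB of the 16 MiB DMA segment is recovered.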
Re: Testing requested for the next version of GNU Mach
On Fri, Mar 11, 2016 at 05:38:06PM -0500, David Michael wrote:
> I didn't get a chance to try with Debian yet, but after looking a bit
> more, the failure I'm getting is from linux_kmem_init() allocating
> memory. It panics in vm_page_grab_contig() when allocating the 29th
> (i=28) chunk.

This means the allocator can't allocate 64k contiguous physical memory
in the first 16M.

> I tried reducing MEM_CHUNKS from 32 to 24, but it then panicked when
> linux_init() tried to allocate more memory immediately after. After

That doesn't make sense. Could you report the panic message ?

> reducing it to 16, gnumach booted fine, and everything worked properly
> (including rtl8139 devices).

The realtek board shouldn't be working without DDE... Something looks
wrong in your test setup.

> The panics don't seem to be affected by configure options or device
> drivers used. This is with the latest mostly pristine upstream code
> from gnumach and GRUB, but I will try to see what Debian is doing
> differently over the weekend.

This is what I have when I boot :
vm_page: DMA: pages: 4080 (15M), free: 1265 (4M)

The amount of free memory is already quite low for some reason, but on
your machine, it's even lower :
vm_page: DMA: pages: 4080 (15M), free: 451 (1M)

This explains why your test succeeds with 16 chunks (16*64k = 1M). But
we have to understand why you have so little memory in the first place.
It could be that newer versions of GRUB place more boot data, or
align them differently, and that we need to do a better job at freeing
boot data. I'm also not sure why my changes affect that...

What's your version of GRUB (please don't say "latest", always state
the IDs explicitly) ?

--
Richard Braun
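The chunk arithmetic in the message above can be checked with a few lines. This is a sketch under stated assumptions: 4 KiB pages (typical for i386, not stated in the thread), and free pages treated as perfectly contiguous, which gives an upper bound only. The helper name max_chunks is made up for illustration and is not a gnumach function.

```python
PAGE_SIZE = 4 * 1024    # assumption: 4 KiB pages (i386)
CHUNK_SIZE = 64 * 1024  # 64k contiguous chunks, as stated in the message above

def max_chunks(free_pages):
    """Upper bound on 64k chunks, assuming the free pages are fully contiguous."""
    pages_per_chunk = CHUNK_SIZE // PAGE_SIZE
    return free_pages // pages_per_chunk

# Free DMA page counts quoted in the thread.
for free_pages in (451, 1265):
    print(f"free: {free_pages} pages -> at most {max_chunks(free_pages)} chunks")
```

With 451 free pages the bound is 28 chunks, which would be consistent with the panic on the 29th (i=28) allocation. Fragmentation and the later linux_init() allocation are not modelled here.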
Re: Testing requested for the next version of GNU Mach
David Michael, on Fri 11 Mar 2016 17:38:06 -0500, wrote:
> but I will try to see what Debian is doing differently over the
> weekend.

Debian simply doesn't use those drivers any more, but netdde instead.

Samuel
Re: Testing requested for the next version of GNU Mach
On Tue, Mar 08, 2016 at 12:15:25PM -0500, David Michael wrote:
> I've just removed linux/src/drivers/net/rtl8139.c and dropped all
> references to rtl8139 from the following files

Please use --disable-rtl8139 instead, and only use GRUB as configured by
Debian so that we restrict the variables of our tests.

--
Richard Braun
Re: Testing requested for the next version of GNU Mach
On Tue, Mar 8, 2016 at 10:00 AM, David Michael wrote:
> On Tue, Mar 8, 2016 at 5:06 AM, Richard Braun wrote:
>> In any case, this isn't a regression caused by my work, and I don't
>> intend to fix in-kernel drivers, in particular when we have a good
>> user space replacement. As a result, I suggest we remove the rtl8139
>> driver from the kernel.
>
> That's probably for the best. Even when it would boot with that
> driver, changing the IP address with fsysopts would cause a kernel
> panic.
>
> I've changed it to -device ne2k_pci, and it seems to work now.

Sorry, I accidentally regenerated my GRUB configuration pointed at an
old build, confusing my results. When gnumach is built with the rtl8139
driver removed, it has the same behavior. It fails to boot using GRUB,
but succeeds using QEMU's multiboot options.

I've just removed linux/src/drivers/net/rtl8139.c and dropped all
references to rtl8139 from the following files:

doc/mach.texi
linux/Makefrag.am
linux/configfrag.ac
linux/dev/drivers/net/Space.c
linux/src/drivers/net/Space.c

So unless rtl8139 bits are still hidden somewhere (grep only finds some
PCI ID definitions), it looks like there may be a different problem.

David
Re: Testing requested for the next version of GNU Mach
On Tue, Mar 8, 2016 at 5:06 AM, Richard Braun wrote:
> In any case, this isn't a regression caused by my work, and I don't
> intend to fix in-kernel drivers, in particular when we have a good
> user space replacement. As a result, I suggest we remove the rtl8139
> driver from the kernel.

That's probably for the best. Even when it would boot with that driver,
changing the IP address with fsysopts would cause a kernel panic.

I've changed it to -device ne2k_pci, and it seems to work now.

Thanks.

David
Re: Testing requested for the next version of GNU Mach
On Mon, Mar 07, 2016 at 11:01:47PM -0500, David Michael wrote:
> Yes, this was when booting gnumach in QEMU with -device rtl8139 for a
> net device.

Here is the situation :

At some point in 2006 (but apparently merged in 2009), netdrivers, the
network drivers written by Donald Becker, were updated to version 3.5.
For some reason, rtl8139 has this bug that we apparently didn't have in
the past. Since then, we've been using netdde, more recent Linux drivers
running in user space, by default as part of the Debian packages.

I can clearly reproduce the bug, which seems specific to this particular
driver. However I'm completely unable to find any netdrivers code except
our own. The driver was replaced with 8139too in Linux around 2.2/2.3.

In any case, this isn't a regression caused by my work, and I don't
intend to fix in-kernel drivers, in particular when we have a good
user space replacement. As a result, I suggest we remove the rtl8139
driver from the kernel.

--
Richard Braun
Re: Testing requested for the next version of GNU Mach
On Mon, Mar 7, 2016 at 8:56 PM, Richard Braun wrote:
> On Tue, Mar 08, 2016 at 12:00:03AM +0100, Richard Braun wrote:
>> On Sun, Feb 28, 2016 at 03:50:06PM -0500, David Michael wrote:
>> > Yes, that did it. The latest gnumach can be booted with GRUB when
>> > those options are disabled.
>>
>> It seems we've been having this bug for a long time, but no one is using
>> the in-kernel drivers for these boards any more, since the Debian
>> packages don't include them, only the userspace DDE ones.
>
> Can you confirm that your board is a realtek 8139 (or compatible) ?

Yes, this was when booting gnumach in QEMU with -device rtl8139 for a
net device.

David
Re: Testing requested for the next version of GNU Mach
On Tue, Mar 08, 2016 at 12:00:03AM +0100, Richard Braun wrote:
> On Sun, Feb 28, 2016 at 03:50:06PM -0500, David Michael wrote:
> > Yes, that did it. The latest gnumach can be booted with GRUB when
> > those options are disabled.
>
> It seems we've been having this bug for a long time, but no one is using
> the in-kernel drivers for these boards any more, since the Debian
> packages don't include them, only the userspace DDE ones.

Can you confirm that your board is a realtek 8139 (or compatible) ?

--
Richard Braun
Re: Testing requested for the next version of GNU Mach
On Sun, Feb 28, 2016 at 03:50:06PM -0500, David Michael wrote:
> On Sun, Feb 28, 2016 at 3:37 PM, Richard Braun wrote:
> > On Sun, Feb 28, 2016 at 03:27:50PM -0500, David Michael wrote:
> >> The same GRUB has no problem booting older gnumach (bee3f0) or Linux.
> >> Are you aware of any patches required by GRUB to boot the X15
> >> multiboot/biosmem bits?
> >
> > Thanks for your report.
> >
> > It's not related to GRUB at all apparently. My guess is something went
> > wrong in the Linux glue code for in-kernel drivers. Could you build
> > gnumach with --disable-net-group --disable-pcmcia-group
> > --disable-wireless-group and try again please ?
>
> Yes, that did it. The latest gnumach can be booted with GRUB when
> those options are disabled.

It seems we've been having this bug for a long time, but no one is using
the in-kernel drivers for these boards any more, since the Debian
packages don't include them, only the userspace DDE ones.

--
Richard Braun
Re: Testing requested for the next version of GNU Mach
On Sun, Feb 28, 2016 at 03:50:06PM -0500, David Michael wrote:
> > It's not related to GRUB at all apparently. My guess is something went
> > wrong in the Linux glue code for in-kernel drivers. Could you build
> > gnumach with --disable-net-group --disable-pcmcia-group
> > --disable-wireless-group and try again please ?
>
> Yes, that did it. The latest gnumach can be booted with GRUB when
> those options are disabled.

Thanks again for testing this. We've had several reports pointing in
that direction. I'll try to fix this soon and update the branch.

--
Richard Braun
Re: Testing requested for the next version of GNU Mach
On Sun, Feb 28, 2016 at 3:37 PM, Richard Braun wrote:
> On Sun, Feb 28, 2016 at 03:27:50PM -0500, David Michael wrote:
>> The same GRUB has no problem booting older gnumach (bee3f0) or Linux.
>> Are you aware of any patches required by GRUB to boot the X15
>> multiboot/biosmem bits?
>
> Thanks for your report.
>
> It's not related to GRUB at all apparently. My guess is something went
> wrong in the Linux glue code for in-kernel drivers. Could you build
> gnumach with --disable-net-group --disable-pcmcia-group
> --disable-wireless-group and try again please ?

Yes, that did it. The latest gnumach can be booted with GRUB when those
options are disabled.

Thanks.

David
Re: Testing requested for the next version of GNU Mach
On Sun, Feb 28, 2016 at 03:27:50PM -0500, David Michael wrote:
> The same GRUB has no problem booting older gnumach (bee3f0) or Linux.
> Are you aware of any patches required by GRUB to boot the X15
> multiboot/biosmem bits?

Thanks for your report.

It's not related to GRUB at all apparently. My guess is something went
wrong in the Linux glue code for in-kernel drivers. Could you build
gnumach with --disable-net-group --disable-pcmcia-group
--disable-wireless-group and try again please ?

--
Richard Braun
Re: Testing requested for the next version of GNU Mach
Hi,

I haven't been able to boot gnumach with upstream GRUB since this X15
code was imported. It does boot successfully with QEMU's multiboot
arguments, and it seems fine when it's running.

When booting the latest gnumach (689810) with GRUB (both old versions
and today's beta3 release) built for the i386-pc platform, the following
is displayed for a fraction of a second before the system reboots:

GNU Mach 1.6
ELF section header table at c0010278
biosmem: physical memory map:
biosmem: 00:09fc00, available
biosmem: 09fc00:0a, reserved
biosmem: 0f:10, reserved
biosmem: 10:003ffe, available
biosmem: 003ffe:004000, reserved
biosmem: 00feffc000:00ff00, reserved
biosmem: 00fffc:01, reserved
biosmem: heap: e3d000-3a00
Loaded ELF symbol table for mach (6033 symbols)
vm_page: page table size: 262096 entries (17408k)
vm_page: DMA: pages: 4080 (15M), free: 451 (1M)
vm_page: DIRECTMAP: pages: 233472 (912M), free: 226345 (884M)
vm_page: HIGHMEM: pages: 24544 (95M), free: 24544 (95M)
unexpected ACK from keyboard

The same GRUB has no problem booting older gnumach (bee3f0) or Linux.
Are you aware of any patches required by GRUB to boot the X15
multiboot/biosmem bits?

Thanks.

David
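For anyone comparing boot logs like the one above, here is a small sketch that pulls the per-segment page counts out of the vm_page lines and converts them to sizes. The 4 KiB page size is an assumption (typical for i386), and the function names are made up for illustration. The rounding-down matches the figures printed in this particular log, but the kernel's actual formatting code is not shown in the thread.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages (i386)

# The three segment lines copied from the boot log in the message above.
LOG = """\
vm_page: DMA: pages: 4080 (15M), free: 451 (1M)
vm_page: DIRECTMAP: pages: 233472 (912M), free: 226345 (884M)
vm_page: HIGHMEM: pages: 24544 (95M), free: 24544 (95M)
"""

LINE_RE = re.compile(
    r"vm_page: (\w+): pages: (\d+) \((\d+)M\), free: (\d+) \((\d+)M\)"
)

def parse_segments(log):
    """Return {segment name: (total_pages, free_pages)} from vm_page lines."""
    segments = {}
    for match in LINE_RE.finditer(log):
        name, pages, _, free, _ = match.groups()
        segments[name] = (int(pages), int(free))
    return segments

def pages_to_mib(pages):
    """Whole MiB covered by `pages`, rounded down."""
    return pages * PAGE_SIZE // (1024 * 1024)

segments = parse_segments(LOG)
for name, (pages, free) in segments.items():
    print(f"{name}: {pages_to_mib(pages)}M total, {pages_to_mib(free)}M free")
```

The three segment totals add up to the 262096 entries reported on the "page table size" line.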