Re: Testing requested for the next version of GNU Mach

2016-03-18 Thread Samuel Thibault
Richard Braun, on Tue 15 Mar 2016 00:40:40 +0100, wrote:
> On Tue, Mar 15, 2016 at 12:32:43AM +0100, Samuel Thibault wrote:
> > Richard Braun, on Mon 14 Mar 2016 23:37:33 +0100, wrote:
> > > On Mon, Mar 14, 2016 at 06:27:24PM -0400, David Michael wrote:
> > > > After poking around a bit more, it seems that the space is eaten by
> > > > debug info.  Appending -g0 to CFLAGS allowed gnumach to boot
> > > > successfully and resulted in this:
> > > > 
> > > > vm_page: DMA: pages: 4080 (15M), free: 2053 (8M)
> > > > 
> > > > The difference may be because GRUB apparently passes the multiboot ELF
> > > > info, while QEMU does not.  I'll just make sure to disable debug flags
> > > > when building gnumach for the time being.
> > > 
> > > Yes that makes sense.
> > 
> > Uh.  Should we perhaps move the kernel out of the 16MiB area?  I don't
> > think we have any such requirement here, AIUI we are just putting it
> > where it poses the least fragmentation issues, but the scarce DMA
> > resource is even more important.
> 
> That's what Linux does. It should be configurable though.

I ended up needing it, so I did it.

Samuel



Re: Testing requested for the next version of GNU Mach

2016-03-14 Thread Richard Braun
On Tue, Mar 15, 2016 at 12:32:43AM +0100, Samuel Thibault wrote:
> Richard Braun, on Mon 14 Mar 2016 23:37:33 +0100, wrote:
> > On Mon, Mar 14, 2016 at 06:27:24PM -0400, David Michael wrote:
> > > After poking around a bit more, it seems that the space is eaten by
> > > debug info.  Appending -g0 to CFLAGS allowed gnumach to boot
> > > successfully and resulted in this:
> > > 
> > > vm_page: DMA: pages: 4080 (15M), free: 2053 (8M)
> > > 
> > > The difference may be because GRUB apparently passes the multiboot ELF
> > > info, while QEMU does not.  I'll just make sure to disable debug flags
> > > when building gnumach for the time being.
> > 
> > Yes that makes sense.
> 
> Uh.  Should we perhaps move the kernel out of the 16MiB area?  I don't
> think we have any such requirement here, AIUI we are just putting it
> where it poses the least fragmentation issues, but the scarce DMA
> resource is even more important.

That's what Linux does. It should be configurable though.

-- 
Richard Braun



Re: Testing requested for the next version of GNU Mach

2016-03-14 Thread Samuel Thibault
Richard Braun, on Mon 14 Mar 2016 23:37:33 +0100, wrote:
> On Mon, Mar 14, 2016 at 06:27:24PM -0400, David Michael wrote:
> > After poking around a bit more, it seems that the space is eaten by
> > debug info.  Appending -g0 to CFLAGS allowed gnumach to boot
> > successfully and resulted in this:
> > 
> > vm_page: DMA: pages: 4080 (15M), free: 2053 (8M)
> > 
> > The difference may be because GRUB apparently passes the multiboot ELF
> > info, while QEMU does not.  I'll just make sure to disable debug flags
> > when building gnumach for the time being.
> 
> Yes that makes sense.

Uh.  Should we perhaps move the kernel out of the 16MiB area?  I don't
think we have any such requirement here, AIUI we are just putting it
where it poses the least fragmentation issues, but the scarce DMA
resource is even more important.

Samuel



Re: Testing requested for the next version of GNU Mach

2016-03-14 Thread Richard Braun
On Mon, Mar 14, 2016 at 06:27:24PM -0400, David Michael wrote:
> > The realtek board shouldn't be working without DDE... Something looks
> > wrong in your test setup.
> 
> I've been using it almost exclusively for the last three or four years
> without DDE.  It works fine for me until I try to change the IP
> address with fsysopts, then kernel panics ensue.

If you really built the kernel with --disable-rtl8139, it shouldn't be
working.

> After poking around a bit more, it seems that the space is eaten by
> debug info.  Appending -g0 to CFLAGS allowed gnumach to boot
> successfully and resulted in this:
> 
> vm_page: DMA: pages: 4080 (15M), free: 2053 (8M)
> 
> The difference may be because GRUB apparently passes the multiboot ELF
> info, while QEMU does not.  I'll just make sure to disable debug flags
> when building gnumach for the time being.

Yes that makes sense.

-- 
Richard Braun



Re: Testing requested for the next version of GNU Mach

2016-03-14 Thread David Michael
On Sun, Mar 13, 2016 at 9:06 AM, Richard Braun wrote:
> On Fri, Mar 11, 2016 at 05:38:06PM -0500, David Michael wrote:
>> I didn't get a chance to try with Debian yet, but after looking a bit
>> more, the failure I'm getting is from linux_kmem_init() allocating
>> memory.  It panics in vm_page_grab_contig() when allocating the 29th
>> (i=28) chunk.
>
> This means the allocator can't allocate 64k contiguous physical memory
> in the first 16M.
>
>> I tried reducing MEM_CHUNKS from 32 to 24, but it then panicked when
>> linux_init() tried to allocate more memory immediately after.  After
>
> That doesn't make sense. Could you report the panic message ?

It was "panic: vm_page_grab_contig" in all cases.

>> reducing it to 16, gnumach booted fine, and everything worked properly
>> (including rtl8139 devices).
>
> The realtek board shouldn't be working without DDE... Something looks
> wrong in your test setup.

I've been using it almost exclusively for the last three or four years
without DDE.  It works fine for me until I try to change the IP
address with fsysopts, then kernel panics ensue.

I've switched my QEMU commands to -device pcnet just in case.

>> The panics don't seem to be affected by configure options or device
>> drivers used.  This is with the latest mostly pristine upstream code
>> from gnumach and GRUB, but I will try to see what Debian is doing
>> differently over the weekend.
>
> This is what I have when I boot :
> vm_page: DMA: pages: 4080 (15M), free: 1265 (4M)
>
> The amount of free memory is already quite low for some reason, but on
> your machine, it's even lower :
> vm_page: DMA: pages: 4080 (15M), free: 451 (1M)
>
> This explains why your test succeeds with 16 chunks (16*64k = 1M). But
> we have to understand why you have so little memory in the first place.
> It could be that newer versions of GRUB place more boot data, or
> align them differently, and that we need to do a better job at freeing
> boot data. I'm also not sure why my changes affect that...
>
> What's your version of GRUB (please don't say "latest", always state
> the IDs explicitly) ?

I've encountered the failure with both GRUB beta2 and beta3.  They're
from last month and over two years ago, so I don't think the change
was too recent.

After poking around a bit more, it seems that the space is eaten by
debug info.  Appending -g0 to CFLAGS allowed gnumach to boot
successfully and resulted in this:

vm_page: DMA: pages: 4080 (15M), free: 2053 (8M)

The difference may be because GRUB apparently passes the multiboot ELF
info, while QEMU does not.  I'll just make sure to disable debug flags
when building gnumach for the time being.
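
As a rough way to quantify the difference, the two DMA lines reported in
this thread can be compared directly.  This is a minimal Python sketch,
assuming 4 KiB pages; free_pages is a hypothetical helper written for
this illustration, and the regular expression only targets the log
format shown above.

```python
import re

PAGE_KIB = 4  # assumed page size in KiB (i386)

# Matches lines such as "vm_page: DMA: pages: 4080 (15M), free: 451 (1M)".
FREE_RE = re.compile(r"vm_page: \w+: pages: \d+ \(\d+M\), free: (\d+) \(\d+M\)")

def free_pages(line):
    """Return the free page count from a vm_page boot line."""
    match = FREE_RE.search(line)
    if match is None:
        raise ValueError("not a vm_page line: " + line)
    return int(match.group(1))

# Values copied from the boot logs quoted in this thread.
with_debug = free_pages("vm_page: DMA: pages: 4080 (15M), free: 451 (1M)")
without_debug = free_pages("vm_page: DMA: pages: 4080 (15M), free: 2053 (8M)")

delta_pages = without_debug - with_debug
delta_kib = delta_pages * PAGE_KIB
print(delta_pages, "pages,", delta_kib, "KiB")  # 1602 pages, 6408 KiB
```

Under that assumption, building with -g0 frees a bit over 6M of the
first 16M in this particular setup; any other difference between the two
builds is not accounted for.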

Thanks.

David



Re: Testing requested for the next version of GNU Mach

2016-03-13 Thread Richard Braun
On Fri, Mar 11, 2016 at 05:38:06PM -0500, David Michael wrote:
> I didn't get a chance to try with Debian yet, but after looking a bit
> more, the failure I'm getting is from linux_kmem_init() allocating
> memory.  It panics in vm_page_grab_contig() when allocating the 29th
> (i=28) chunk.

This means the allocator can't allocate 64k contiguous physical memory
in the first 16M.

> I tried reducing MEM_CHUNKS from 32 to 24, but it then panicked when
> linux_init() tried to allocate more memory immediately after.  After

That doesn't make sense. Could you report the panic message ?

> reducing it to 16, gnumach booted fine, and everything worked properly
> (including rtl8139 devices).

The realtek board shouldn't be working without DDE... Something looks
wrong in your test setup.

> The panics don't seem to be affected by configure options or device
> drivers used.  This is with the latest mostly pristine upstream code
> from gnumach and GRUB, but I will try to see what Debian is doing
> differently over the weekend.

This is what I have when I boot :
vm_page: DMA: pages: 4080 (15M), free: 1265 (4M)

The amount of free memory is already quite low for some reason, but on
your machine, it's even lower :
vm_page: DMA: pages: 4080 (15M), free: 451 (1M)

This explains why your test succeeds with 16 chunks (16*64k = 1M). But
we have to understand why you have so little memory in the first place.
It could be that newer versions of GRUB place more boot data, or
align them differently, and that we need to do a better job at freeing
boot data. I'm also not sure why my changes affect that...
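
The figures quoted in this message can be cross-checked with a little
arithmetic.  The sketch below is Python rather than anything from
gnumach; it assumes 4 KiB pages (consistent with 4080 pages being
printed as 15M) and ignores fragmentation, so it only gives an upper
bound.  max_chunks is a hypothetical helper, not a kernel function.

```python
PAGE_SIZE = 4096                           # assumed i386 page size in bytes
CHUNK_SIZE = 64 * 1024                     # 64k per chunk, as stated above
PAGES_PER_CHUNK = CHUNK_SIZE // PAGE_SIZE  # 16 pages per chunk

def max_chunks(free_pages):
    """Upper bound on 64k chunks that fit in free_pages, ignoring fragmentation."""
    return free_pages // PAGES_PER_CHUNK

# Free DMA page counts taken from the two boot logs quoted in this message.
for label, free in (("451 free pages", 451), ("1265 free pages", 1265)):
    print(label, "->", max_chunks(free), "chunks at most")
# 451 free pages -> 28 chunks at most
# 1265 free pages -> 79 chunks at most
```

Under those assumptions, 451 free pages leave room for at most 28 chunks
of 64k, which would be consistent with the panic on the 29th (i=28)
allocation, while 16 chunks need only 256 pages.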

What's your version of GRUB (please don't say "latest", always state
the IDs explicitly) ?

-- 
Richard Braun



Re: Testing requested for the next version of GNU Mach

2016-03-11 Thread Samuel Thibault
David Michael, on Fri 11 Mar 2016 17:38:06 -0500, wrote:
> but I will try to see what Debian is doing differently over the
> weekend.

Debian simply doesn't use those drivers any more, but netdde instead.

Samuel



Re: Testing requested for the next version of GNU Mach

2016-03-09 Thread Richard Braun
On Tue, Mar 08, 2016 at 12:15:25PM -0500, David Michael wrote:
> I've just removed linux/src/drivers/net/rtl8139.c and dropped all
> references to rtl8139 from the following files

Please use --disable-rtl8139 instead, and only use GRUB as configured
by Debian so that we restrict the variables of our tests.

-- 
Richard Braun



Re: Testing requested for the next version of GNU Mach

2016-03-08 Thread David Michael
On Tue, Mar 8, 2016 at 10:00 AM, David Michael wrote:
> On Tue, Mar 8, 2016 at 5:06 AM, Richard Braun wrote:
>> In any case, this isn't a regression caused by my work, and I don't
>> intend to fix in-kernel drivers, in particular when we have a good
>> user space replacement. As a result, I suggest we remove the rtl8139
>> driver from the kernel.
>
> That's probably for the best.  Even when it would boot with that
> driver, changing the IP address with fsysopts would cause a kernel
> panic.
>
> I've changed it to -device ne2k_pci, and it seems to work now.

Sorry, I accidentally regenerated my GRUB configuration pointed at an
old build, confusing my results.

When gnumach is built with the rtl8139 driver removed, it has the same
behavior.  It fails to boot using GRUB, but succeeds using QEMU's
multiboot options.

I've just removed linux/src/drivers/net/rtl8139.c and dropped all
references to rtl8139 from the following files:

doc/mach.texi
linux/Makefrag.am
linux/configfrag.ac
linux/dev/drivers/net/Space.c
linux/src/drivers/net/Space.c

So unless rtl8139 bits are still hidden somewhere (grep only finds
some PCI ID definitions), it looks like there may be a different
problem.

David



Re: Testing requested for the next version of GNU Mach

2016-03-08 Thread David Michael
On Tue, Mar 8, 2016 at 5:06 AM, Richard Braun wrote:
> In any case, this isn't a regression caused by my work, and I don't
> intend to fix in-kernel drivers, in particular when we have a good
> user space replacement. As a result, I suggest we remove the rtl8139
> driver from the kernel.

That's probably for the best.  Even when it would boot with that
driver, changing the IP address with fsysopts would cause a kernel
panic.

I've changed it to -device ne2k_pci, and it seems to work now.

Thanks.

David



Re: Testing requested for the next version of GNU Mach

2016-03-08 Thread Richard Braun
On Mon, Mar 07, 2016 at 11:01:47PM -0500, David Michael wrote:
> Yes, this was when booting gnumach in QEMU with -device rtl8139 for a
> net device.

Here is the situation :

At some point in 2006 (but apparently merged in 2009), netdrivers,
the network drivers written by Donald Becker, were updated to version
3.5. For some reason, rtl8139 has this bug that we apparently didn't
have in the past. Since then, we've been using netdde, more recent
Linux drivers running in user space, by default as part of the Debian
packages.

I can clearly reproduce the bug, which seems specific to this particular
driver. However I'm completely unable to find any netdrivers code except
our own. The driver was replaced with 8139too in Linux around 2.2/2.3.

In any case, this isn't a regression caused by my work, and I don't
intend to fix in-kernel drivers, in particular when we have a good
user space replacement. As a result, I suggest we remove the rtl8139
driver from the kernel.

-- 
Richard Braun



Re: Testing requested for the next version of GNU Mach

2016-03-07 Thread David Michael
On Mon, Mar 7, 2016 at 8:56 PM, Richard Braun wrote:
> On Tue, Mar 08, 2016 at 12:00:03AM +0100, Richard Braun wrote:
>> On Sun, Feb 28, 2016 at 03:50:06PM -0500, David Michael wrote:
>> > Yes, that did it.  The latest gnumach can be booted with GRUB when
>> > those options are disabled.
>>
>> It seems we've been having this bug for a long time, but no one is using
>> the in-kernel drivers for these boards any more, since the Debian
>> packages don't include them, only the userspace DDE ones.
>
> Can you confirm that your board is a realtek 8139 (or compatible) ?

Yes, this was when booting gnumach in QEMU with -device rtl8139 for a
net device.

David



Re: Testing requested for the next version of GNU Mach

2016-03-07 Thread Richard Braun
On Tue, Mar 08, 2016 at 12:00:03AM +0100, Richard Braun wrote:
> On Sun, Feb 28, 2016 at 03:50:06PM -0500, David Michael wrote:
> > Yes, that did it.  The latest gnumach can be booted with GRUB when
> > those options are disabled.
> 
> It seems we've been having this bug for a long time, but no one is using
> the in-kernel drivers for these boards any more, since the Debian
> packages don't include them, only the userspace DDE ones.

Can you confirm that your board is a realtek 8139 (or compatible) ?

-- 
Richard Braun



Re: Testing requested for the next version of GNU Mach

2016-03-07 Thread Richard Braun
On Sun, Feb 28, 2016 at 03:50:06PM -0500, David Michael wrote:
> On Sun, Feb 28, 2016 at 3:37 PM, Richard Braun wrote:
> > On Sun, Feb 28, 2016 at 03:27:50PM -0500, David Michael wrote:
> >> The same GRUB has no problem booting older gnumach (bee3f0) or Linux.
> >> Are you aware of any patches required by GRUB to boot the X15
> >> multiboot/biosmem bits?
> >
> > Thanks for your report.
> >
> > It's not related to GRUB at all apparently. My guess is something went
> > wrong in the Linux glue code for in-kernel drivers. Could you build
> > gnumach with --disable-net-group --disable-pcmcia-group
> > --disable-wireless-group and try again please ?
> 
> Yes, that did it.  The latest gnumach can be booted with GRUB when
> those options are disabled.

It seems we've been having this bug for a long time, but no one is using
the in-kernel drivers for these boards any more, since the Debian
packages don't include them, only the userspace DDE ones.

-- 
Richard Braun



Re: Testing requested for the next version of GNU Mach

2016-02-29 Thread Richard Braun
On Sun, Feb 28, 2016 at 03:50:06PM -0500, David Michael wrote:
> > It's not related to GRUB at all apparently. My guess is something went
> > wrong in the Linux glue code for in-kernel drivers. Could you build
> > gnumach with --disable-net-group --disable-pcmcia-group
> > --disable-wireless-group and try again please ?
> 
> Yes, that did it.  The latest gnumach can be booted with GRUB when
> those options are disabled.

Thanks again for testing this. We've had several reports pointing in
that direction. I'll try to fix this soon and update the branch.

-- 
Richard Braun



Re: Testing requested for the next version of GNU Mach

2016-02-28 Thread David Michael
On Sun, Feb 28, 2016 at 3:37 PM, Richard Braun wrote:
> On Sun, Feb 28, 2016 at 03:27:50PM -0500, David Michael wrote:
>> The same GRUB has no problem booting older gnumach (bee3f0) or Linux.
>> Are you aware of any patches required by GRUB to boot the X15
>> multiboot/biosmem bits?
>
> Thanks for your report.
>
> It's not related to GRUB at all apparently. My guess is something went
> wrong in the Linux glue code for in-kernel drivers. Could you build
> gnumach with --disable-net-group --disable-pcmcia-group
> --disable-wireless-group and try again please ?

Yes, that did it.  The latest gnumach can be booted with GRUB when
those options are disabled.

Thanks.

David



Re: Testing requested for the next version of GNU Mach

2016-02-28 Thread Richard Braun
On Sun, Feb 28, 2016 at 03:27:50PM -0500, David Michael wrote:
> The same GRUB has no problem booting older gnumach (bee3f0) or Linux.
> Are you aware of any patches required by GRUB to boot the X15
> multiboot/biosmem bits?

Thanks for your report.

It's not related to GRUB at all apparently. My guess is something went
wrong in the Linux glue code for in-kernel drivers. Could you build
gnumach with --disable-net-group --disable-pcmcia-group
--disable-wireless-group and try again please ?

-- 
Richard Braun



Re: Testing requested for the next version of GNU Mach

2016-02-28 Thread David Michael
Hi,

I haven't been able to boot gnumach with upstream GRUB since this X15
code was imported.  It does boot successfully with QEMU's multiboot
arguments, and it seems fine when it's running.

When booting the latest gnumach (689810) with GRUB (both old versions
and today's beta3 release) built for the i386-pc platform, the
following is displayed for a fraction of a second before the system
reboots:

GNU Mach 1.6
ELF section header table at c0010278
biosmem: physical memory map:
biosmem: 00:09fc00, available
biosmem: 09fc00:0a, reserved
biosmem: 0f:10, reserved
biosmem: 10:003ffe, available
biosmem: 003ffe:004000, reserved
biosmem: 00feffc000:00ff00, reserved
biosmem: 00fffc:01, reserved
biosmem: heap: e3d000-3a00
Loaded ELF symbol table for mach (6033 symbols)
vm_page: page table size: 262096 entries (17408k)
vm_page: DMA: pages: 4080 (15M), free: 451 (1M)
vm_page: DIRECTMAP: pages: 233472 (912M), free: 226345 (884M)
vm_page: HIGHMEM: pages: 24544 (95M), free: 24544 (95M)
unexpected ACK from keyboard

The same GRUB has no problem booting older gnumach (bee3f0) or Linux.
Are you aware of any patches required by GRUB to boot the X15
multiboot/biosmem bits?

Thanks.

David