Re: incorrect usleep/select delays with HZ > 2500

2009-09-06 Thread Peter Wemm
On Sun, Sep 6, 2009 at 8:51 AM, Luigi Rizzo wrote:
> (this problem seems to affect both current and -stable,
> so let's see if here i have better luck)
>
> I just noticed [Note 1,2] that when setting HZ > 2500 (even if it is
> an exact divisor of the APIC/CPU clock) there is a significant
> drift between the delays generated by usleep()/select() and those
> computed by gettimeofday().  In other words, the error grows with
> the amount of delay requested.
>
> To show the problem, try this function
>
>        #include <sys/time.h>
>        #include <unistd.h>
>
>        int f(int wait_time) {  // wait_time in usec
>                struct timeval start, end, x;
>                gettimeofday(&start, NULL);
>                usleep(wait_time);      // or try select
>                gettimeofday(&end, NULL);
>                timersub(&end, &start, &x);
>                return x.tv_usec + 1000000*x.tv_sec - wait_time;
>        }
>
> for various HZ (kern.hz= in /boot/loader.conf) and wait times.
> Ideally, we would expect the timings to be in error by something
> between 0 and 1 (or 2) ticks, irrespective of the value of wait_time.
> In fact, this is what you see with HZ=1000, 2000 and 2500.
> But larger values of HZ (e.g. 4000, 5000, 10k, 40k) create
> a drift of 0.5% and above (e.g. with HZ=5000, a 1-second delay
> lasts 1.0064s and a 10s delay lasts 10.062s); with HZ=10k the
> error becomes 1%, and at HZ=40k the error becomes even bigger.

Technically, it isn't even an error because the sleeps are defined as
'at least' the value specified.  If you're looking for real-time-OS
level performance, you probably need to look at one.

> Note that with the fixes described below, even HZ=40k works perfectly well.
>
> Turns out that the error has three components (described with
> possible fixes):
>
> 1.  CAUSE: Errors in the measurement of the TSC (and APIC) frequencies,
>        see [Note 3] for more details. This is responsible for the drift.
>    FIX: It can be removed by rounding the measurement to the closest
>        nominal values (e.g. my APIC runs at 100 MHz; we can use a
>        table of supported values). Otherwise, see [Note 4]
>    PROBLEM: is this general enough ?
>
> 2.  CAUSE: Use of approximate kernel time functions (getnanotime/getmicrotime)
>        in nanosleep() and select(). This imposes an error of max(1tick, 1ms)
>        in the computation of delays, irrespective of HZ values.
>        BTW For reasons I don't understand this seems to affect
>        nanosleep() more than select().
>    FIX: It can be reduced to just 1 tick by making kern.timecounter.tick
>        writable and letting the user set it to 1 if high precision is required.
>    PROBLEM: none that i see.
>
> 3.  CAUSE: an error in tvtohz(), reported long ago in
>        http://www.dragonflybsd.org/presentations/nanosleep/
>        which causes a systematic error of an extra tick in the computation
>        of the sleep times.
>    FIX: the above link also contains a proposed fix (which in fact
>        reverts a bug introduced in some old commit on FreeBSD)
>    PROBLEM: none that i see.

This change, as-is, is extremely dangerous.  tsleep/msleep() use a
value of 0 meaning 'forever'.  Changing tvtohz() so that it can now
return 0 for a non-zero tv is setting land-mines all over the place.
There's something like 27 callers of tvtohz() in sys/kern alone, some
of which are used to supply tsleep/msleep() timeouts.  Note that the
dragonflybsd patch above only adds the 'returns 0' check to one single
caller.  You either need to patch all callers of tvtohz() since you've
changed the semantics, or add an 'if (ticks == 0) ticks = 1' check (or
checks) in the appropriate places inside tvtohz().

If you don't do it, then you end up with callers of random functions
with very small timeouts instead finding themselves sleeping forever.
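
A minimal user-space sketch of the second option, for illustration only. The
helper name tv_to_ticks_min1() is hypothetical and this is not the kernel's
tvtohz() arithmetic; it only shows the idea of rounding up to whole ticks and
never handing 0 to a sleep primitive for a non-zero timeval. It assumes
0 < hz <= 1000000.

```c
#include <sys/time.h>

/*
 * Hypothetical illustration: convert a timeval to clock ticks for a given
 * hz, rounding up.  Any non-zero interval yields at least 1 tick, because
 * 0 means 'forever' to tsleep/msleep.  Assumes 0 < hz <= 1000000.
 */
long
tv_to_ticks_min1(const struct timeval *tv, long hz)
{
        long long usec_per_tick = 1000000LL / hz;
        long long usec = (long long)tv->tv_sec * 1000000LL + tv->tv_usec;
        long long ticks;

        if (usec <= 0)
                return (0);     /* a zero timeval stays zero */
        ticks = (usec + usec_per_tick - 1) / usec_per_tick;    /* round up */
        if (ticks == 0)         /* redundant after rounding up, but this is */
                ticks = 1;      /* the explicit guard discussed above */
        return ((long)ticks);
}
```

With hz=1000, a 1 usec timeval maps to 1 tick rather than 0, so a caller
passing the result on as a timeout cannot end up sleeping forever by accident.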

> Applying these three fixes i was able to run a kernel with HZ=40k
> and see timing errors within 80-90us even with ten second delays.
> This would put us on par with Linux [Note 5].
> This is a significant improvement over the current situation
> and the reason why I would like to explore the possibility of applying
> some of these fixes.
>
> I know there are open problems -- e.g. when the timer source used
> by gettimeofday() gets corrected by ntp or other things, hardclock()
> still ticks at the same rate so you'll see a drift if you don't apply
> corrections there as well. Similarly, if HZ is not an exact
> divisor of the clock source used by gettimeofday(), i suppose
> errors will accumulate as well. However, fixing these other
> sources of drift seems reasonably orthogonal at least to #2 and #3
> above, and a lot more difficult, so we could at least start from
> these simple fixes.
>
>
> Would anyone be interested in reproducing the experiment (test program
> attached -- run it with 'lat -p -i wait_time_in_microseconds')
> and trying to explain to me what changes the system's behaviour above HZ=2500 ?
>
>        cheers
>        luigi
>
> Notes:
>
> [Note 1] I have some interest in running machines with high HZ va

Re: panic: vm_phys_paddr_to_vm_page: paddr 0xf8000 is not in any segment

2009-09-06 Thread Kostik Belousov
On Sun, Sep 06, 2009 at 10:00:26AM -0700, David Wolfskill wrote:
> On Sun, Sep 06, 2009 at 07:09:55PM +0300, Kostik Belousov wrote:
> > On Sun, Sep 06, 2009 at 08:44:54AM -0700, David Wolfskill wrote:
> > > First got this on my laptop (but not my headless build machine) --
> > > each of which is i386 -- yesterday, at r196858; after reverting to
> > > r196827 (from Thursday), then rebuilding stable/7 at r196886, it
> > > recurred.
> > Please try r196894.
> 
> Hand-applied; rebuilt.  On reboot, no panic -- thanks!  :-)
> 
> [The bad news is that I did get the apparent hang at xdm(8) start-up.
> Three out of three tries :-{]

Could you please get more details? I assume that the hang actually
occurred during the X server startup.

Does the machine respond to pings?
If yes, can you ssh into it?

Also, you might try installing sysutils/dmidecode and running
it, to verify that /dev/mem works.




Re: panic: vm_phys_paddr_to_vm_page: paddr 0xf8000 is not in any segment

2009-09-06 Thread David Wolfskill
On Sun, Sep 06, 2009 at 07:09:55PM +0300, Kostik Belousov wrote:
> On Sun, Sep 06, 2009 at 08:44:54AM -0700, David Wolfskill wrote:
> > First got this on my laptop (but not my headless build machine) --
> > each of which is i386 -- yesterday, at r196858; after reverting to
> > r196827 (from Thursday), then rebuilding stable/7 at r196886, it
> > recurred.
> Please try r196894.

Hand-applied; rebuilt.  On reboot, no panic -- thanks!  :-)

[The bad news is that I did get the apparent hang at xdm(8) start-up.
Three out of three tries :-{]

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: panic: vm_phys_paddr_to_vm_page: paddr 0xf8000 is not in any segment

2009-09-06 Thread A.J. "Fonz" van Werven
David Wolfskill wrote:

> It appears to be happening when xdm(1) gets started.

I'm getting the same when starting X manually (startx).

> I welcome clues.

A patch has been submitted to this list less than an hour ago and I've
already seen the SVN commit as well. I'm currently rebuilding, you might
want to try the same. Let's hope this fixes it.

Regards,

Alphons

-- 
All right, that does it Bill [Donahue]. I'm pretty sure that killing
Jesus is not very Christian.
 -- Pope Benedict XVI, Southpark season 11 episode 5
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Panic in recent 7.2-Stable

2009-09-06 Thread A.J. "Fonz" van Werven
Kostik Belousov wrote:

> I expect that the following patch, that is the partial merge of r194459,
> would fix it. It patches sys/vm/vm_phys.c.
> 
> Index: vm_phys.c
> ===
> --- vm_phys.c (revision 194458)
> +++ vm_phys.c (revision 194459)
> @@ -382,8 +382,7 @@
>   if (pa >= seg->start && pa < seg->end)
>   return (&seg->first_page[atop(pa - seg->start)]);
>   }
> - panic("vm_phys_paddr_to_vm_page: paddr %#jx is not in any segment",
> - (uintmax_t)pa);
> + return (NULL);
>  }
>  
>  /*

Hi,

A quick grep on the file in question revealed that there are two
functions that may panic() with "page not in any segment": the
vm_phys_paddr_to_vm_page() being patched and also the next function
vm_phys_paddr_to_segind(). I'm not exactly current with the memory
management code so this may be a very stupid question, but I'll ask it
anyway: don't both functions need to be patched?

My apologies if I'm way off the mark here, but I'm just trying to help.

Regards,

Alphons

-- 
All right, that does it Bill [Donahue]. I'm pretty sure that killing
Jesus is not very Christian.
 -- Pope Benedict XVI, Southpark season 11 episode 5


Re: panic: vm_phys_paddr_to_vm_page: paddr 0xf8000 is not in any segment

2009-09-06 Thread Kostik Belousov
On Sun, Sep 06, 2009 at 08:44:54AM -0700, David Wolfskill wrote:
> First got this on my laptop (but not my headless build machine) --
> each of which is i386 -- yesterday, at r196858; after reverting to
> r196827 (from Thursday), then rebuilding stable/7 at r196886, it
> recurred.
Please try r196894.




Re: Panic in recent 7.2-Stable

2009-09-06 Thread Thierry Herbelot
On Sunday 06 September 2009, A.J. "Fonz" van Werven wrote:
> Kostik Belousov wrote:
> > I expect that the following patch, that is the partial merge of r194459,
> > would fix it. It patches sys/vm/vm_phys.c.
> >
> > Index: vm_phys.c
> > ===
> > --- vm_phys.c   (revision 194458)
> > +++ vm_phys.c   (revision 194459)
> > @@ -382,8 +382,7 @@
> > if (pa >= seg->start && pa < seg->end)
> > return (&seg->first_page[atop(pa - seg->start)]);
> > }
> > -   panic("vm_phys_paddr_to_vm_page: paddr %#jx is not in any segment",
> > -   (uintmax_t)pa);
> > +   return (NULL);
> >  }
> >
> >  /*
>
> Hi,
>
> A quick grep on the file in question revealed that there are two
> functions that may panic() with "page not in any segment": the
> vm_phys_paddr_to_vm_page() being patched and also the next function
> vm_phys_paddr_to_segind(). I'm not exactly current with the memory
> management code so this may be a very stupid question, but I'll ask it
> anyway: don't both functions need to be patched?
>
> My apologies if I'm way off the mark here, but I'm just trying to help.

You are right: it seems the VM handling has been recently updated, and
maybe even "those who know" have not reviewed/updated all panic
conditions (removing the panic in vm_phys_paddr_to_vm_page at least allows
correct operation of a -Stable kernel, as under -Current).

TfH
>
> Regards,
>
> Alphons




panic: vm_phys_paddr_to_vm_page: paddr 0xf8000 is not in any segment

2009-09-06 Thread David Wolfskill
First got this on my laptop (but not my headless build machine) --
each of which is i386 -- yesterday, at r196858; after reverting to
r196827 (from Thursday), then rebuilding stable/7 at r196886, it
recurred.

It appears to be happening when xdm(1) gets started, which is pretty
late in the transition to multi-user mode.

One oddity of which to be aware: all ports (save for misc/compat6x)
are built and installed while running stable/6.  (I track stable/6,
stable/7, and head, as well as track ports, daily, on both the build
machine and the laptop.  As I try to have some time to actually use
the laptop, rather than merely building stuff on it, I don't try
to update the ports collection daily for each of the 3 versions of
the OS I run.  And as the laptop is "user-facing," it tends to have
a lot (863, at last count) of ports installed.)  misc/compat6x was
installed and is updated under stable/7; it is presently at
compat6x-i386-6.4.604000.200810_3 -- updated Sep  4 06:03:18 2009.

For the past couple of weeks (until yesterday), I noticed that
during the attempt to start xdm(1), the laptop (when running stable/7)
would sometimes lock up (i.e., keyboard apparently non-functional;
mouse non-functional; only thing I could find to make any progress
was a power cycle, then booting single-user & issuing "fsck -p && exit").
Since I wasn't able to get any information, I didn't mention
it here previously, but now at least I have an apparently consistent
panic -- but only when running stable/7.

I have no problems running xdm(1) under stable/6 (not that that's
a surprise), but I also have no problems running xdm(1) under head.

I've copied the crashinfo(8) information to a file visible to my
Web server; it may be viewed at
.  I'll paste
the uname info & backtrace here, but for more details, please see that
page.  (Of course, if the details you seek aren't in the crashinfo(8)
output, please just let me know what you seek)

FreeBSD localhost 7.2-STABLE FreeBSD 7.2-STABLE #935 r196886: Sun Sep  6 
05:35:04 PDT 2009 
r...@g1-69.catwhisker.org:/common/S3/obj/usr/src/sys/CANARY  i386

#0  doadump () at pcpu.h:196
196 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0  doadump () at pcpu.h:196
#1  0xc049a979 in db_fncall (dummy1=1, dummy2=0, dummy3=-1060239008, 
dummy4=0xc3b6986c "\...@âÃ") at /usr/src/sys/ddb/db_command.c:516
#2  0xc049aefc in db_command (last_cmdp=0xc0c95694, cmd_table=0x0, dopager=1)
at /usr/src/sys/ddb/db_command.c:413
#3  0xc049b00a in db_command_loop () at /usr/src/sys/ddb/db_command.c:466
#4  0xc049cabd in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:228
#5  0xc0812406 in kdb_trap (type=3, code=0, tf=0xc3b69a14)
at /usr/src/sys/kern/subr_kdb.c:524
#6  0xc0af205b in trap (frame=0xc3b69a14) at /usr/src/sys/i386/i386/trap.c:692
#7  0xc0ad5d4b in calltrap () at /usr/src/sys/i386/i386/exception.s:166
#8  0xc081258a in kdb_enter_why (why=0xc0b93e11 "panic", 
msg=0xc0b93e11 "panic") at cpufunc.h:60
#9  0xc07e55b6 in panic (
fmt=0xc0bb024d "vm_phys_paddr_to_vm_page: paddr %#jx is not in any 
segment") at /usr/src/sys/kern/kern_shutdown.c:557
#10 0xc0a504bd in vm_phys_paddr_to_vm_page (pa=1015808)
at /usr/src/sys/vm/vm_phys.c:385
#11 0xc0a2ec21 in dev_pager_getpages (object=0xc4d29000, m=0xc3b69c04, 
count=1, reqpage=0) at /usr/src/sys/vm/device_pager.c:240
#12 0xc0a3ae90 in vm_fault (map=0xc4d0d000, vaddr=676900864, 
fault_type=1 '\001', fault_flags=Variable "fault_flags" is not available.
) at vm_pager.h:130
#13 0xc0af13bb in trap_pfault (frame=0xc3b69d38, usermode=1, eva=676904576)
at /usr/src/sys/i386/i386/trap.c:833
#14 0xc0af1d27 in trap (frame=0xc3b69d38) at /usr/src/sys/i386/i386/trap.c:399
#15 0xc0ad5d4b in calltrap () at /usr/src/sys/i386/i386/exception.s:166
#16 0x285599c1 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) 

Given that last ("Previous frame inner to this frame (corrupt
stack?)"), I'm not at all certain that the backtrace (or the dump)
will be all that useful.  And because of my odd configuration, this
may not be of sufficient interest to merit much expenditure of
anyone else's time.

I'm quite willing to experiment, try patches, or whatnot.  I have
local mirrors of the CVS & SVN repositories handy.  I'm not much
of a kernel hacker per se, but I am fairly comfortable hacking
sources in general.

I welcome clues.

Thanks.

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: Panic in recent 7.2-Stable

2009-09-06 Thread Kostik Belousov
On Sun, Sep 06, 2009 at 05:37:52PM +0200, A.J. Fonz van Werven wrote:
> Kostik Belousov wrote:
> 
> > I expect that the following patch, that is the partial merge of r194459,
> > would fix it. It patches sys/vm/vm_phys.c.
> > 
> > Index: vm_phys.c
> > ===
> > --- vm_phys.c   (revision 194458)
> > +++ vm_phys.c   (revision 194459)
> > @@ -382,8 +382,7 @@
> > if (pa >= seg->start && pa < seg->end)
> > return (&seg->first_page[atop(pa - seg->start)]);
> > }
> > -   panic("vm_phys_paddr_to_vm_page: paddr %#jx is not in any segment",
> > -   (uintmax_t)pa);
> > +   return (NULL);
> >  }
> >  
> >  /*
> 
> Hi,
> 
> A quick grep on the file in question revealed that there are two
> functions that may panic() with "page not in any segment": the
> vm_phys_paddr_to_vm_page() being patched and also the next function
> vm_phys_paddr_to_segind(). I'm not exactly current with the memory
> management code so this may be a very stupid question, but I'll ask it
> anyway: don't both functions need to be patched?

vm_phys_paddr_to_segind is used during vm bootstrap, the call sequence
is vm_page_startup->vm_phys_add_page->vm_phys_paddr_to_segind.

vm_page_startup calls vm_phys_add_page only for pages
that should not cause the mentioned panic in vm_phys_paddr_to_segind,
since it iterates over the pages of the segments created by
vm_phys_create_seg() in vm_phys_init().




incorrect usleep/select delays with HZ > 2500

2009-09-06 Thread Luigi Rizzo
(this problem seems to affect both current and -stable,
so let's see if here i have better luck)

I just noticed [Note 1,2] that when setting HZ > 2500 (even if it is
an exact divisor of the APIC/CPU clock) there is a significant
drift between the delays generated by usleep()/select() and those
computed by gettimeofday().  In other words, the error grows with
the amount of delay requested.

To show the problem, try this function

#include <sys/time.h>
#include <unistd.h>

int f(int wait_time) {  // wait_time in usec
        struct timeval start, end, x;
        gettimeofday(&start, NULL);
        usleep(wait_time);      // or try select
        gettimeofday(&end, NULL);
        timersub(&end, &start, &x);
        return x.tv_usec + 1000000*x.tv_sec - wait_time;
}

for various HZ (kern.hz= in /boot/loader.conf) and wait times.
Ideally, we would expect the timings to be in error by something
between 0 and 1 (or 2) ticks, irrespective of the value of wait_time.
In fact, this is what you see with HZ=1000, 2000 and 2500.
But larger values of HZ (e.g. 4000, 5000, 10k, 40k) create
a drift of 0.5% and above (e.g. with HZ=5000, a 1-second delay
lasts 1.0064s and a 10s delay lasts 10.062s); with HZ=10k the
error becomes 1%, and at HZ=40k the error becomes even bigger.
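
For the "or try select" case, a sketch of the same measurement using select()
as a pure timeout. The name f_select() is made up for this example, and the
elapsed time is computed by hand (equivalent to timersub()) so the sketch does
not depend on that macro being visible. An interrupted select() would return
early; that is not handled here.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Sleep for wait_time microseconds using select() with no descriptors,
 * and return the overshoot in microseconds as seen by gettimeofday().
 */
long
f_select(long wait_time)
{
        struct timeval start, end, tmo;
        long elapsed;

        tmo.tv_sec = wait_time / 1000000;
        tmo.tv_usec = wait_time % 1000000;
        gettimeofday(&start, NULL);
        select(0, NULL, NULL, NULL, &tmo);      /* timeout only, no fds */
        gettimeofday(&end, NULL);
        elapsed = (long)(end.tv_sec - start.tv_sec) * 1000000L +
            (long)(end.tv_usec - start.tv_usec);
        return (elapsed - wait_time);
}
```

Calling f_select() over a range of wait times and plotting the result against
the request should show whether the error stays within a tick or two, or grows
with the delay.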

Note that with the fixes described below, even HZ=40k works perfectly well.

Turns out that the error has three components (described with
possible fixes):

1.  CAUSE: Errors in the measurement of the TSC (and APIC) frequencies,
       see [Note 3] for more details. This is responsible for the drift.
    FIX: It can be removed by rounding the measurement to the closest
       nominal values (e.g. my APIC runs at 100 MHz; we can use a
       table of supported values). Otherwise, see [Note 4]
    PROBLEM: is this general enough ?

2.  CAUSE: Use of approximate kernel time functions (getnanotime/getmicrotime)
       in nanosleep() and select(). This imposes an error of max(1tick, 1ms)
       in the computation of delays, irrespective of HZ values.
       BTW For reasons I don't understand this seems to affect
       nanosleep() more than select().
    FIX: It can be reduced to just 1 tick by making kern.timecounter.tick
       writable and letting the user set it to 1 if high precision is required.
    PROBLEM: none that i see.

3.  CAUSE: an error in tvtohz(), reported long ago in
       http://www.dragonflybsd.org/presentations/nanosleep/
       which causes a systematic error of an extra tick in the computation
       of the sleep times.
    FIX: the above link also contains a proposed fix (which in fact
       reverts a bug introduced in some old commit on FreeBSD)
    PROBLEM: none that i see.
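
As a rough way to separate the proportional drift (cause 1) from the fixed
per-call offsets (causes 2 and 3), one can fit a line through two measurements.
The sketch below is only an illustration; struct fit and two_point_fit() are
names invented here, and with just two rounded data points the result is
indicative at best.

```c
/* Two-point linear fit: measured = slope * requested + offset. */
struct fit {
        double slope;   /* 1.0 means no drift */
        double offset;  /* fixed per-call error, in seconds */
};

struct fit
two_point_fit(double req1, double meas1, double req2, double meas2)
{
        struct fit r;

        r.slope = (meas2 - meas1) / (req2 - req1);
        r.offset = meas1 - r.slope * req1;
        return (r);
}
```

Fed with the HZ=5000 figures quoted earlier (1 s measured as 1.0064 s, 10 s
measured as 10.062 s), this gives a slope of roughly 1.0062, i.e. about 0.6%
drift, and an offset of roughly 0.0002 s, which is on the order of one 200 us
tick at that HZ.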

Applying these three fixes i was able to run a kernel with HZ=40k
and see timing errors within 80-90us even with ten second delays.
This would put us on par with Linux [Note 5].
This is a significant improvement over the current situation
and the reason why I would like to explore the possibility of applying
some of these fixes.

I know there are open problems -- e.g. when the timer source used
by gettimeofday() gets corrected by ntp or other things, hardclock()
still ticks at the same rate so you'll see a drift if you don't apply
corrections there as well. Similarly, if HZ is not an exact
divisor of the clock source used by gettimeofday(), i suppose
errors will accumulate as well. However, fixing these other
sources of drift seems reasonably orthogonal at least to #2 and #3
above, and a lot more difficult, so we could at least start from
these simple fixes.


Would anyone be interested in reproducing the experiment (test program
attached -- run it with 'lat -p -i wait_time_in_microseconds')
and trying to explain to me what changes the system's behaviour above HZ=2500 ?

cheers
luigi

Notes:

[Note 1] I have some interest in running machines with high HZ values
because this gives better precision to dummynet and various
other tasks with soft timing constraints.

[Note 2] I have seen the same phenomenon on the following platforms:
RELENG_8/amd64 with AMD BE-2400 dual core cpu
RELENG_7/i386 with AMD BE-2400 dual core cpu
RELENG_7/i386 with Intel Centrino single core (Dell X1 Laptop)


[Note 3] the TSC frequency is computed reading the tsc around a
call to DELAY(100) and assuming that the i8254 runs
at the nominal rate, 1.193182 MHz.
From tests I have made, the measurement in init_TSC() returns
a large error when HZ is large, whereas repeating the measurement
at a later time returns a much more reliable value.
As an example, see the following:

Sep  6 14:21:59 lr kernel: TSC clock: at init_TSC 2323045616 Hz
Sep  6 14:21:59 lr kernel: 
Features=0x178bfbff
Sep  6 14:21:59 lr kernel: AMD 
Features=0xea500800
Sep  6 14:21:59 lr kernel: TSC: P-state invariant
Sep  6 14:21:59 lr kernel: TSC clock: at cpu_startup_end 2323056910 Hz
Sep  6 14:21:59 lr kerne

Re: Panic in recent 7.2-Stable

2009-09-06 Thread Thierry Herbelot
On Sunday 06 September 2009, Kostik Belousov wrote:
>
> I expect that the following patch, that is the partial merge of r194459,
> would fix it. It patches sys/vm/vm_phys.c.
>
> Index: vm_phys.c
> ===
> --- vm_phys.c (revision 194458)
> +++ vm_phys.c (revision 194459)
> @@ -382,8 +382,7 @@
>   if (pa >= seg->start && pa < seg->end)
>   return (&seg->first_page[atop(pa - seg->start)]);
>   }
> - panic("vm_phys_paddr_to_vm_page: paddr %#jx is not in any segment",
> - (uintmax_t)pa);
> + return (NULL);

This does indeed seem to be the missing part: I should have looked in -current.

Thanks

TfH
>  }
>
>  /*




Re: Build kernel failure

2009-09-06 Thread Mikael Bak
Mikael Bak wrote:
> -options SMP
> -device apic
> -device eisa
> 

Responding to myself.
I reactivated the above options, and now I get a different error:

/usr/src/sys/dev/aic7xxx/aic7xxx.c:7896: internal compiler error: in
output_constructor, at varasm.c:4311
[snip]
*** Error code 1

I installed kernel source using sysinstall using an FTP mirror as
described here:
http://www.freebsd.org/doc/en/books/handbook/kernelconfig-building.html

Did I perhaps miss installing something?

TIA,
Mikael



Build kernel failure

2009-09-06 Thread Mikael Bak
Hi,

I have installed FreeBSD 7.2 on an old laptop. It's only for fun and
learning.

I wanted to make my console use full 1024x768, so I followed this howto:

http://kimklai.blogspot.com/2007/05/howto-freebsd-console-framebuffer.html

# make buildkernel KERNCONF=GENERICVESA
After quite a long time, this gives me the following error message:

make: don't know how to make /usr/src/sys/sys/gdefs.h. Stop
*** Error code 2

Changes to GENERIC:

+options VESA
+options SC_PIXEL_MODE

-options SMP
-device apic
-device eisa

I would like to disable things I don't need, because the laptop is quite
old and slow and doesn't have much RAM.

The laptop is an old Dell Latitude CPt C-series, 400MHz PII, 256 MB RAM,
6.5GB HDD.

If I succeed I will try to disable other things in the kernel to optimize
even more. But perhaps I managed to disable something vital?

TIA,
Mikael


Re: Panic in recent 7.2-Stable

2009-09-06 Thread Kostik Belousov
On Sun, Sep 06, 2009 at 11:02:39AM +0200, Thierry Herbelot wrote:
> Hello,
> 
> I'm having a panic with the latest kernel build of my -Stable file server 
> (sources cvsupped around yesterday evening, CEST). The panic happens soon 
> after entering multi-user :
> 
> panic: vm_phys_paddr_to_vm_page: paddr 0xf is not in any segment
> KDB: enter: panic
> [thread pid 1005 tid 100154 ]
> Stopped at  kdb_enter_why+0x3a: movl$0,kdb_why
> db> where
> Tracing pid 1005 tid 100154 td 0x8ecad480
> kdb_enter_why(80ba731f,80ba731f,80bc1ad6,fb301a94,fb301a94,...) at 
> kdb_enter_why+0x3a
> panic(80bc1ad6,f,0,8ecad480,0,...) at panic+0xd1
> vm_phys_paddr_to_vm_page(f,f,fb301ad8,1,80a36a78,...) at 
> vm_phys_paddr_to_vm_page+0x4d
> dev_pager_getpages(92b8d980,fb301c04,1,0,fb301bcc,...) at 
> dev_pager_getpages+0xe1
> vm_fault(89267bfc,33d9,1,0,89b0b50c,...) at vm_fault+0x1020
> trap_pfault(202,7,8583b900,80cd9800,89b0eb00,...) at trap_pfault+0x15b
> trap(fb301d38) at trap+0x247
> 
> An excerpt of the dmesg is :
> 
> Copyright (c) 1992-2009 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 7.2-STABLE #35: Sun Sep  6 10:04:40 CEST 2009
> x...@yyy:/usr/obj/usr/src/sys/GENERIC
> Preloaded elf kernel "/boot/kernel/kernel" at 0x8101a000.
> Preloaded elf module "/boot/kernel/zfs.ko" at 0x8101a188.
> Preloaded elf module "/boot/kernel/opensolaris.ko" at 0x8101a230.
> Preloaded elf module "/boot/kernel/snd_cmi.ko" at 0x8101a2e0.
> Preloaded elf module "/boot/kernel/sound.ko" at 0x8101a38c.
> Preloaded /boot/zfs/zpool.cache "/boot/zfs/zpool.cache" at 0x8101a438.
> Preloaded elf module "/boot/kernel/acpi.ko" at 0x8101a490.
> 
> The previous kernel is older (around 22 august) and works as expected.
> 
> Some idents for the panic kernel are : (ie after SVN rev 196838)
> $FreeBSD: src/sys/i386/i386/pmap.c,v 1.594.2.20 2009/09/04 19:59:32 jhb Exp $
> $FreeBSD: src/sys/kern/kern_mbuf.c,v 1.32.2.6 2009/09/04 19:59:32 jhb Exp $
> $FreeBSD: src/sys/vm/device_pager.c,v 1.84.2.3 2009/09/04 19:59:32 jhb Exp $
> $FreeBSD: src/sys/vm/vm_object.c,v 1.385.2.7 2009/09/04 19:59:32 jhb Exp $
> $FreeBSD: src/sys/vm/vm_page.c,v 1.357.2.10 2009/09/04 19:59:32 jhb Exp $
> $FreeBSD: src/sys/vm/vm_phys.c,v 1.4.2.2 2009/09/04 19:59:32 jhb Exp $

I expect that the following patch, that is the partial merge of r194459,
would fix it. It patches sys/vm/vm_phys.c.

Index: vm_phys.c
===
--- vm_phys.c   (revision 194458)
+++ vm_phys.c   (revision 194459)
@@ -382,8 +382,7 @@
if (pa >= seg->start && pa < seg->end)
return (&seg->first_page[atop(pa - seg->start)]);
}
-   panic("vm_phys_paddr_to_vm_page: paddr %#jx is not in any segment",
-   (uintmax_t)pa);
+   return (NULL);
 }
 
 /*
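
To picture what the patch changes, here is a simplified user-space sketch of a
segment lookup that returns NULL instead of panicking. The struct page and
struct seg types, the paddr_to_page() name and the 4 KB page shift are
stand-ins assumed for the example; this is not the kernel source.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's page and segment types. */
struct page { int dummy; };
struct seg {
        unsigned long start;    /* first physical address in the segment */
        unsigned long end;      /* one past the last address */
        struct page *first_page;
};

#define PAGE_SHIFT_SKETCH 12    /* assumes 4 KB pages */

/*
 * Return the page backing physical address pa, or NULL when pa lies outside
 * every segment, instead of panicking.
 */
struct page *
paddr_to_page(const struct seg *segs, int nsegs, unsigned long pa)
{
        int i;

        for (i = 0; i < nsegs; i++) {
                if (pa >= segs[i].start && pa < segs[i].end)
                        return (&segs[i].first_page[(pa - segs[i].start) >>
                            PAGE_SHIFT_SKETCH]);
        }
        return (NULL);
}
```

With this shape, every caller has to check for NULL and handle an address that
is not backed by a managed page.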




Re: Panic in recent 7.2-Stable

2009-09-06 Thread Dmitrij Tejblum

Thierry Herbelot wrote:

> Hello,
>
> I'm having a panic with the latest kernel build of my -Stable file server
> (sources cvsupped around yesterday evening, CEST). The panic happens soon
> after entering multi-user :
>
> panic: vm_phys_paddr_to_vm_page: paddr 0xf is not in any segment
> KDB: enter: panic
> [thread pid 1005 tid 100154 ]
> Stopped at  kdb_enter_why+0x3a: movl$0,kdb_why
> db> where
> Tracing pid 1005 tid 100154 td 0x8ecad480
> kdb_enter_why(80ba731f,80ba731f,80bc1ad6,fb301a94,fb301a94,...) at
> kdb_enter_why+0x3a
> panic(80bc1ad6,f,0,8ecad480,0,...) at panic+0xd1
> vm_phys_paddr_to_vm_page(f,f,fb301ad8,1,80a36a78,...) at
> vm_phys_paddr_to_vm_page+0x4d
> dev_pager_getpages(92b8d980,fb301c04,1,0,fb301bcc,...) at
> dev_pager_getpages+0xe1
> vm_fault(89267bfc,33d9,1,0,89b0b50c,...) at vm_fault+0x1020
> trap_pfault(202,7,8583b900,80cd9800,89b0eb00,...) at trap_pfault+0x15b
> trap(fb301d38) at trap+0x247



Similar panic here. I believe the panic was introduced in SVN revision 196838.

For us, the panic is caused by the `dmidecode' program. The dmidecode
program mmaps /dev/mem at offset 0xf and reads on...




Panic in recent 7.2-Stable

2009-09-06 Thread Thierry Herbelot
Hello,

I'm having a panic with the latest kernel build of my -Stable file server 
(sources cvsupped around yesterday evening, CEST). The panic happens soon 
after entering multi-user :

panic: vm_phys_paddr_to_vm_page: paddr 0xf is not in any segment
KDB: enter: panic
[thread pid 1005 tid 100154 ]
Stopped at  kdb_enter_why+0x3a: movl$0,kdb_why
db> where
Tracing pid 1005 tid 100154 td 0x8ecad480
kdb_enter_why(80ba731f,80ba731f,80bc1ad6,fb301a94,fb301a94,...) at 
kdb_enter_why+0x3a
panic(80bc1ad6,f,0,8ecad480,0,...) at panic+0xd1
vm_phys_paddr_to_vm_page(f,f,fb301ad8,1,80a36a78,...) at 
vm_phys_paddr_to_vm_page+0x4d
dev_pager_getpages(92b8d980,fb301c04,1,0,fb301bcc,...) at 
dev_pager_getpages+0xe1
vm_fault(89267bfc,33d9,1,0,89b0b50c,...) at vm_fault+0x1020
trap_pfault(202,7,8583b900,80cd9800,89b0eb00,...) at trap_pfault+0x15b
trap(fb301d38) at trap+0x247

An excerpt of the dmesg is :

Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.2-STABLE #35: Sun Sep  6 10:04:40 CEST 2009
x...@yyy:/usr/obj/usr/src/sys/GENERIC
Preloaded elf kernel "/boot/kernel/kernel" at 0x8101a000.
Preloaded elf module "/boot/kernel/zfs.ko" at 0x8101a188.
Preloaded elf module "/boot/kernel/opensolaris.ko" at 0x8101a230.
Preloaded elf module "/boot/kernel/snd_cmi.ko" at 0x8101a2e0.
Preloaded elf module "/boot/kernel/sound.ko" at 0x8101a38c.
Preloaded /boot/zfs/zpool.cache "/boot/zfs/zpool.cache" at 0x8101a438.
Preloaded elf module "/boot/kernel/acpi.ko" at 0x8101a490.

The previous kernel is older (around 22 august) and works as expected.

Some idents for the panic kernel are : (ie after SVN rev 196838)
$FreeBSD: src/sys/i386/i386/pmap.c,v 1.594.2.20 2009/09/04 19:59:32 jhb Exp $
$FreeBSD: src/sys/kern/kern_mbuf.c,v 1.32.2.6 2009/09/04 19:59:32 jhb Exp $
$FreeBSD: src/sys/vm/device_pager.c,v 1.84.2.3 2009/09/04 19:59:32 jhb Exp $
$FreeBSD: src/sys/vm/vm_object.c,v 1.385.2.7 2009/09/04 19:59:32 jhb Exp $
$FreeBSD: src/sys/vm/vm_page.c,v 1.357.2.10 2009/09/04 19:59:32 jhb Exp $
$FreeBSD: src/sys/vm/vm_phys.c,v 1.4.2.2 2009/09/04 19:59:32 jhb Exp $

Cheers

TfH