Re: [Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-27 Thread Mike Stilson

On Fri, Jul 26, 2002 at 11:18:13AM -0700, Mark Vojkovich wrote:
>On Fri, 26 Jul 2002, Luca Olivetti wrote:
>
>> Mike Stilson wrote:
>> 
>> | BTW: Since I stepped back to 2314 it's been nice and stable (Only 3 days
>> | so far but better than the 10 hours I could manage with the latest
>> | driver.)
>> 
>> I'm using this release now and while it seems better, it's still not
>> totally reliable: it just killed X while using xvideo (this is probably
>> the reason I've been upgrading nvidia drivers, to see if problems went
>> away), but it seems it isn't screwing up anything else. Let's see how it
>> works and if nvidia do care about fixing its driver (which I doubt).
>> 
>
>   Don't use AGPGART.  Use NVGART instead.  It's not clear that this
>is NVIDIA's bug, and even if it were, it would definitely be in the open
>source part of the driver.
>

I have option nvagp 0 in my config file.  This didn't get changed
at all.  The only change when my problems started was the upgrade to
the latest NVIDIA_kernel and NVIDIA_GLX.  If I remember correctly, the
longest it was stable with the latest one was 2 days, and I wasn't even
stressing it with anything close to xvideo.  It would just decide to
oops when it was doing basically nothing but blinking a cursor in an
xterm.

Changing back to my original 2314 RPMs, without even touching the
config file, has done away with all the problems.

-mike

-- 
Windows has detected that you have moved your mouse.
Your system must now be restarted for the changes to take effect.
- Unknown
___
Xpert mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/xpert



Re: [Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-26 Thread Luca Olivetti

Mark Vojkovich wrote:

>Don't use AGPGART.  Use NVGART instead.  It's not clear that this
> is NVIDIA's bug, and even if it were, it would definitely be in the open
> source part of the driver.

OK, I assumed that AGPGART was the preferred way since it's the default
configuration.  I'll try with NVGART.

BTW: with 2314 I just had to reboot to reset the graphics card; there was
no way to start X (text mode was fine):


Jul 26 20:26:36 pippo kernel: Unable to handle kernel NULL pointer dereference at virtual address
Jul 26 20:26:36 pippo kernel:  printing eip:
Jul 26 20:26:36 pippo kernel: d2a363ea
Jul 26 20:26:36 pippo kernel: *pde =
Jul 26 20:26:36 pippo kernel: Oops:
Jul 26 20:26:36 pippo kernel: CPU:0
Jul 26 20:26:36 pippo gdm[19124]: gdm_slave_xioerror_handler: Fatal X error - Restarting :0
Jul 26 20:26:36 pippo kernel: EIP: 0010:[usbcore:usb_devfs_handle_Re9c5f87f+1023350/240943286]Tainted: P
Jul 26 20:26:36 pippo kernel: EIP:0010:[]Tainted: P
Jul 26 20:26:36 pippo kernel: EFLAGS: 00013056
Jul 26 20:26:36 pippo kernel: eax:    ebx:    ecx:    edx: c5805944
Jul 26 20:26:36 pippo kernel: esi: d2af8004   edi:    ebp: cc53bae0   esp: cc53bacc
Jul 26 20:26:36 pippo kernel: ds: 0018   es: 0018   ss: 0018
Jul 26 20:26:36 pippo kernel: Process X (pid: 19125, stackpage=cc53b000)
Jul 26 20:26:36 pippo kernel: Stack:  d2af8004 0010 4000  cc53bb08 d2a3689d d2af8004
Jul 26 20:26:36 pippo kernel:0010 cc53bb00 cbb226a0 cc53bc34 d2ad8300  c5805944 cc53bb48
Jul 26 20:26:36 pippo kernel:d2a2a816 d2af8004 cc53bc1c 0010 0041 cc53bbcc  0028
Jul 26 20:26:36 pippo kernel: Call Trace: [usbcore:usb_devfs_handle_Re9c5f87f+1024553/240942083] [usbcore:usb_devfs_handle_Re9c5f87f+1686668/240279968] [usbcore:usb_devfs_handle_Re9c5f87f+975266/240991370] [usbcore:usb_devfs_handle_Re9c5f87f+994093/240972543] [usbcore:usb_devfs_handle_Re9c5f87f+984681/240981955]
Jul 26 20:26:36 pippo kernel: Call Trace: [] [] [] [] []
Jul 26 20:26:36 pippo kernel: [usbcore:usb_devfs_handle_Re9c5f87f+1686668/240279968] [usbcore:usb_devfs_handle_Re9c5f87f+1004615/240962021] [usbcore:usb_devfs_handle_Re9c5f87f+1004133/240962503] [usbcore:usb_devfs_handle_Re9c5f87f+990952/240975684] [usbcore:usb_devfs_handle_Re9c5f87f+1179697/240786939] [usbcore:usb_devfs_handle_Re9c5f87f+1179697/240786939]
Jul 26 20:26:36 pippo kernel:[] [] [] [] [] []
Jul 26 20:26:36 pippo kernel: [usbcore:usb_devfs_handle_Re9c5f87f+1179697/240786939] [usbcore:usb_devfs_handle_Re9c5f87f+1179775/240786861] [usbcore:usb_devfs_handle_Re9c5f87f+1179668/240786968] [usbcore:usb_devfs_handle_Re9c5f87f+1342376/240624260] [usbcore:usb_devfs_handle_Re9c5f87f+1180269/240786367] [usbcore:usb_devfs_handle_Re9c5f87f+552466/241414170]
Jul 26 20:26:36 pippo kernel:[] [] [] [] [] []
Jul 26 20:26:36 pippo kernel: [usbcore:usb_devfs_handle_Re9c5f87f+1434146/240532490] [update_process_times+32/176] [update_wall_time+11/64] [timer_bh+36/592] [bh_action+27/80] [tasklet_hi_action+80/128]
Jul 26 20:26:36 pippo kernel:[] [] [] [] [] []
Jul 26 20:26:36 pippo kernel:[do_softirq+83/160] [af_packet:__insmod_af_packet_O/lib/modules/2.4.18-8.1mdk/kernel/net/p+-839215/96] [af_packet:__insmod_af_packet_O/lib/modules/2.4.18-8.1mdk/kernel/net/p+-839135/96] [af_packet:__insmod_af_packet_O/lib/modules/2.4.18-8.1mdk/kernel/net/p+-903534/96] [af_packet:__insmod_af_packet_O/lib/modules/2.4.18-8.1mdk/kernel/net/p+-903534/96] [af_packet:__insmod_af_packet_O/lib/modules/2.4.18-8.1mdk/kernel/net/p+-865454/96]
Jul 26 20:26:36 pippo kernel:[] [] [] [] [] []
Jul 26 20:26:36 pippo kernel: [af_packet:__insmod_af_packet_O/lib/modules/2.4.18-8.1mdk/kernel/net/p+-817878/96] [usbcore:usb_devfs_handle_Re9c5f87f+971414/240995222] [usbcore:usb_devfs_handle_Re9c5f87f+1686668/240279968] [af_packet:__insmod_af_packet_O/lib/modules/2.4.18-8.1mdk/kernel/net/p+-835200/96] [sys_ioctl+583/608] [system_call+51/64]
Jul 26 20:26:36 pippo kernel:[] [] [] [] [] []
Jul 26 20:26:36 pippo kernel:
Jul 26 20:26:36 pippo kernel: Code: 80 3c 38 00 75 08 83 c3 07 e9 88 00 00 00 89 d8 c1 e8 03 8b




-- 
Luca Olivetti
Note.- This message reached you today, it may not tomorrow if you
are using MAPS services. They arbitrarily include in their lists
IP addresses not related in any way to spam, and in so doing are
disrupting Internet connectivity.  Please stop supporting them.
See http://slashdot.org/article.pl?sid=01/05/21/1944247





Re: [Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-26 Thread Luca Olivetti

Mike Stilson wrote:

| BTW: Since I stepped back to 2314 it's been nice and stable (Only 3 days
| so far but better than the 10 hours I could manage with the latest
| driver.)

I'm using this release now and while it seems better, it's still not
totally reliable: it just killed X while using xvideo (this is probably
the reason I've been upgrading nvidia drivers, to see if problems went
away), but it seems it isn't screwing up anything else. Let's see how it
works and if nvidia do care about fixing its driver (which I doubt).

Bye
-- 
Luca Olivetti



Re: [Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-26 Thread Mark Vojkovich

On Fri, 26 Jul 2002, Luca Olivetti wrote:

> Mike Stilson wrote:
> 
> | BTW: Since I stepped back to 2314 it's been nice and stable (Only 3 days
> | so far but better than the 10 hours I could manage with the latest
> | driver.)
> 
> I'm using this release now and while it seems better, it's still not
> totally reliable: it just killed X while using xvideo (this is probably
> the reason I've been upgrading nvidia drivers, to see if problems went
> away), but it seems it isn't screwing up anything else. Let's see how it
> works and if nvidia do care about fixing its driver (which I doubt).
> 

   Don't use AGPGART.  Use NVGART instead.  It's not clear that this
is NVIDIA's bug, and even if it were, it would definitely be in the open
source part of the driver.


Mark.




Re: [Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-23 Thread Mike Stilson

On Tue, Jul 23, 2002 at 01:36:39PM +0200, Charl P. Botha wrote:
>On Tue, Jul 23, 2002 at 07:16:50AM -0400, Mike Stilson wrote:
>> Just to follow up after giving a little more thought...
>> The kernel module doesn't taint it since I'm recompiling it from source.
>> the NVdriver.o module (although it doesn't specifically have a
>> MODULE_LICENSE("GPL")) also doesn't have any conflicting or proprietary
>> license.  Now my understanding might be wrong, but it SHOULDN'T be
>> tainting the kernel.  Only the X driver (nvidia_drv.o) is closed source.
>
>The kernel module *does* taint it.  You're only building three of the
>objects and then linking with "Module-nvkernel", which is binary and which
>is where most of the juicy bits are.

OK, that being the case, any idea why it's not tainted?  The
module stays loaded.




Re: [Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-23 Thread Luca Olivetti

Mike Stilson wrote:


| Just to follow up after giving a little more thought...
| The kernel module doesn't taint it since I'm recompiling it from source.

No, you're just recompiling some glue code, the real module is binary only.

| the NVdriver.o module (although it doesn't specifically have a
| MODULE_LICENSE("GPL")) also doesn't have any conflicting or proprietary
| license.  Now my understanding might be wrong, but it SHOULDN'T be
| tainting the kernel.  Only the X driver (nvidia_drv.o) is closed source.

Yes, it should.  In fact, I think the kernel developers introduced the
"Tainted" flag specifically because of the nvidia module; see this message
from Alan Cox, http://old.lwn.net/2001/0906/a/ac-license.php3, where he
states:

"Unfortunately I get so many bug reports caused by the nvidia modules
and people lying when asked if they have them loaded that some kind of
action has to occur, otherwise I'm going to have to stop reading bug
reports from anyone I don't know personally."


Bye
-- 
Luca Olivetti



Re: [Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-23 Thread Charl P. Botha

On Tue, Jul 23, 2002 at 07:16:50AM -0400, Mike Stilson wrote:
> Just to follow up after giving a little more thought...
> The kernel module doesn't taint it since I'm recompiling it from source.
> the NVdriver.o module (although it doesn't specifically have a
> MODULE_LICENSE("GPL")) also doesn't have any conflicting or proprietary
> license.  Now my understanding might be wrong, but it SHOULDN'T be
> tainting the kernel.  Only the X driver (nvidia_drv.o) is closed source.

The kernel module *does* taint it.  You're only building three of the
objects and then linking with "Module-nvkernel", which is binary and which
is where most of the juicy bits are.

-- 
charl p. botha http://cpbotha.net/ http://visualisation.tudelft.nl/



Re: [Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-23 Thread Mike Stilson

On Mon, Jul 22, 2002 at 11:25:35PM +0200, Luca Olivetti wrote:
>Mike Stilson wrote:
>Are you sure the nvidia module was loaded at the time of this crash?
>It would taint the kernel if loaded:
>
>Jul  7 20:20:54 pippo kernel: kernel BUG at page_alloc.c:216!
>Jul  7 20:20:54 pippo kernel: invalid operand: 
>Jul  7 20:20:54 pippo kernel: CPU:0
>Jul  7 20:20:54 pippo kernel: EIP:0010:[rmqueue+479/544]Tainted: P
>Jul  7 20:20:54 pippo kernel: EIP:0010:[]Tainted: P

Just to follow up after giving a little more thought...
The kernel module doesn't taint it since I'm recompiling it from source.
the NVdriver.o module (although it doesn't specifically have a
MODULE_LICENSE("GPL")) also doesn't have any conflicting or proprietary
license.  Now my understanding might be wrong, but it SHOULDN'T be
tainting the kernel.  Only the X driver (nvidia_drv.o) is closed source.

Perhaps you have something else tainting it?

-me




Re: [Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-22 Thread Mike Stilson

On Mon, Jul 22, 2002 at 11:25:35PM +0200, Luca Olivetti wrote:
>Mike Stilson wrote:
>
>| FWIW I had the same issues.
>| The problem I had was that it would crash kswapd with the following
>
>In my case it was either kdeinit, xmessage, kswapd, startkde,...

99% of the time it would crash kswapd, more or less causing a cascade
from there.  Usually one or two apps running on the desktop would oops,
then X itself.  Sometimes it would prevent you from switching to a vc
(just a garbled screen) but would let you return to the x desktop fine.
Other times it took a hard reset to do anything.

>
>| kernel: kernel BUG at page_alloc.c:82!
>| kernel: invalid operand: 
>| kernel: CPU:0
>| kernel: EIP:0010:[__free_pages_ok+66/688]Not tainted
>| kernel: EIP:0010:[]Not tainted
>
>Are you sure the nvidia module was loaded at the time of this crash?
>It would taint the kernel if loaded:
>
>Jul  7 20:20:54 pippo kernel: kernel BUG at page_alloc.c:216!
>Jul  7 20:20:54 pippo kernel: invalid operand: 
>Jul  7 20:20:54 pippo kernel: CPU:0
>Jul  7 20:20:54 pippo kernel: EIP:0010:[rmqueue+479/544]Tainted: P
>Jul  7 20:20:54 pippo kernel: EIP:0010:[]Tainted: P

Positive.  I've wondered about that myself several times, so I made
quite sure to check whenever possible.  Unless it was doing something
VERY strange, it most definitely was loaded for all of them.

In fact, several other (unrelated) oopses that happened while the
nvdriver was loaded weren't tainted either.


BTW: Since I stepped back to 2314 it's been nice and stable (Only 3 days
so far but better than the 10 hours I could manage with the latest
driver.)

-me




Re: [Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-22 Thread Luca Olivetti

Mike Stilson wrote:

| FWIW I had the same issues.
| The problem I had was that it would crash kswapd with the following

In my case it was either kdeinit, xmessage, kswapd, startkde,...

| kernel: kernel BUG at page_alloc.c:82!
| kernel: invalid operand: 
| kernel: CPU:0
| kernel: EIP:0010:[__free_pages_ok+66/688]Not tainted
| kernel: EIP:0010:[]Not tainted

Are you sure the nvidia module was loaded at the time of this crash?
It would taint the kernel if loaded:

Jul  7 20:20:54 pippo kernel: kernel BUG at page_alloc.c:216!
Jul  7 20:20:54 pippo kernel: invalid operand: 
Jul  7 20:20:54 pippo kernel: CPU:0
Jul  7 20:20:54 pippo kernel: EIP:0010:[rmqueue+479/544]Tainted: P
Jul  7 20:20:54 pippo kernel: EIP:0010:[]Tainted: P
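The difference between the two reports is easy to miss when skimming a long log. As a minimal sketch (the function name and the regular expressions are illustrative assumptions, checked only against the 2.4-style oops lines quoted in this thread), the BUG location and the taint state can be pulled out of a log like this. The "P" flag marks a loaded module without a GPL-compatible license.

```python
import re

# Assumed patterns, based only on the oops lines quoted in this thread.
BUG_RE = re.compile(r"kernel BUG at (?P<file>[\w./-]+):(?P<line>\d+)!")
TAINT_RE = re.compile(r"(?P<state>Not tainted|Tainted: (?P<flags>\S+))")


def summarize_oops(lines):
    """Return the BUG file, BUG line and taint state found in oops log lines."""
    summary = {"file": None, "line": None, "tainted": None, "flags": None}
    for text in lines:
        bug = BUG_RE.search(text)
        if bug and summary["file"] is None:
            summary["file"] = bug.group("file")
            summary["line"] = int(bug.group("line"))
        taint = TAINT_RE.search(text)
        if taint and summary["tainted"] is None:
            # "flags" is only captured by the "Tainted: ..." alternative.
            summary["tainted"] = taint.group("flags") is not None
            summary["flags"] = taint.group("flags")
    return summary


if __name__ == "__main__":
    report = [
        "kernel: kernel BUG at page_alloc.c:82!",
        "kernel: EIP:0010:[__free_pages_ok+66/688]Not tainted",
    ]
    print(summarize_oops(report))
```

Only the first BUG line and the first taint marker are kept, so feeding it one oops at a time gives the cleanest result.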

Bye
-- 
Luca Olivetti



Re: [Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-22 Thread Luca Olivetti

krjw wrote:


| The issue I've seen is reproducible like so:
| 1) boot linux box
| 2) enable use of kernel AGPGART (not internal nvidia AGPGART code) in
| XF86Config (load agpgart kernel module if needed)
| 3) Start X
| 4) Leave X
| 5) enable use of internal nvidia AGPGART code in XF86Config
| 6) Start X
| 7) choke.  have to log in from remote box to reboot.
|
| (2) and (5) imply changing the value of the option NvAgp in XF86Config.

Nope.  I remember trying both the nvidia internal agp support and
agpgart, but I think I always rebooted between changes.  The
"mem=nopentium" option, albeit not an optimal solution, cured the
lockups, but my sessions still ended "spontaneously" (especially when
using xvideo).  Now, the problem I have with the latest nvidia driver is
much more serious, and it seems I'm not the only one experiencing it
(Mike Stilson, in another reply, confirms the same problem with a
different kernel, different compiler, different distribution and no agp).

| Why it chokes, I don't know.  Maybe it has something to do with the AMD
| AGP issue you quoted and the "inadvertant trashing of AGP memory".
| Does anyone else know more about this[1]?  Perhaps the NV guys on the
| list?  How does this problem manifest itself?  Segfaults?  System
| lockups?

This particular problem ("solved" with the mem=nopentium option)
manifested itself as an X lockup: the machine was still running fine
and nothing was registered in the logs, but only a reboot recovered the
graphics subsystem.  I don't know if there were other problems; I
didn't notice any.
The current problem, the one that leaves the message "kernel BUG at
page_alloc.c:xxx", is more subtle: everything seems to be running fine
but then something strange happens, e.g. I cannot su, or I cannot log
in, or I cannot list the contents of a directory (the ls locks up and
is unkillable).  When that happens, I go looking in the logs and there
it is, a "kernel BUG" some time (1 minute, 1 hour, 5 hours...) before
the strangeness began.
"shutdown -r" doesn't work (it locks up halfway through, and on occasion
it never starts), so a reset is the only option.
When kernel memory is corrupted, anything can happen.

Bye
-- 
Luca Olivetti



Re: [Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-22 Thread Mike Stilson

On Mon, Jul 22, 2002 at 03:16:40PM +0200, Luca Olivetti wrote:
>Anyway, I was having a lot of these messages (kernel BUG at
>page_alloc.c) with the current version of nvidia driver (1.0-2960) and
>afterwards the system is unpredictable (possible hard locks, file system
>corruption, anything).
>At first I didn't relate this to the nvidia driver, but I've looked at
>the kernel mailing list and everybody says that the problem is NVIDIA
>related (I tried to contact nvidia but it seems that
>[EMAIL PROTECTED] is redirected to /dev/null).
It is related, and yeah, I'm pretty sure they dump it.  I haven't heard
anything back since reporting the problem about a week after their
latest drivers became available.  (It took that long to isolate their
driver as the culprit.  The oopses were unpredictable and I still
haven't found the actual cause.)

>I'm beginning to think that it is true, because, tired of all the
>problems with this driver (earlier versions just froze X but this one is
>more serious), I switched to the xfree "nv" driver and the system has
>been rock solid since then, the only problem is that I cannot play
>tuxracer anymore ;-)
>I'm running Mandrake 8.2 with the updated kernel
Redhat (mixed up versions, I handle my own tarballs) with 2.4.18.
I haven't gone back to the nv driver, though.  I tried it to start with
and had some problems I don't remember specifically at the moment.

FWIW I had the same issues.
The problem I had was that it would crash kswapd with the following:

kernel: kernel BUG at page_alloc.c:82!
kernel: invalid operand: 
kernel: CPU:0
kernel: EIP:0010:[__free_pages_ok+66/688]Not tainted
kernel: EIP:0010:[]Not tainted
kernel: EFLAGS: 00010286
kernel: eax: 001f   ebx: c10dfd9c   ecx: c0286fa0   edx: 0001332c
kernel: esi: c10dfd80   edi:    ebp:    esp: c3ec5f20
kernel: ds: 0018   es: 0018   ss: 0018
kernel: Process kswapd (pid: 4, stackpage=c3ec5000)
kernel: Stack: c0239c40 0052 c10dfd9c c10dfd80  001c c10dfd80 01d0 
kernel:c10dfd9c c10dfd80 c012a072 c012b077 c012a0ab 0020 01d0 0020 
kernel:0006 c3ec4000 008a 057c 01d0 c0288308 c012a2c0 0006 
kernel: Call Trace: [shrink_cache+506/780] [__free_pages+27/28] [shrink_cache+563/780] [shrink_caches+88/136] [try_to_free_pages+60/92]
kernel: Call Trace: [] [] [] [] []
kernel:[kswapd_balance_pgdat+67/140] [kswapd_balance+18/40] [kswapd+153/188] [kernel_thread+40/56]
kernel:[] [] [] []
kernel: 
kernel: Code: 0f 0b 83 c4 08 89 f0 2b 05 6c 3f 2f c0 c1 f8 06 3b 05 60 3f 

(ksymoops output)
>>EIP; c012a7e2 <__free_pages_ok+42/2b0>   <=
Trace; c012a072
Trace; c012b077 <__free_pages+1b/1c>
Trace; c012a0ab
Trace; c012a2c0
Trace; c012a32c
Trace; c012a3c3
Trace; c012a41e
Trace; c012a52d
Trace; c0105478
Code;  c012a7e2 <__free_pages_ok+42/2b0>
 <_EIP>:
Code;  c012a7e2 <__free_pages_ok+42/2b0>   <=
   0:   0f 0b                     ud2a      <=
Code;  c012a7e4 <__free_pages_ok+44/2b0>
   2:   83 c4 08                  add    $0x8,%esp
Code;  c012a7e7 <__free_pages_ok+47/2b0>
   5:   89 f0                     mov    %esi,%eax
Code;  c012a7e9 <__free_pages_ok+49/2b0>
   7:   2b 05 6c 3f 2f c0         sub    0xc02f3f6c,%eax
Code;  c012a7ef <__free_pages_ok+4f/2b0>
   d:   c1 f8 06                  sar    $0x6,%eax
Code;  c012a7f2 <__free_pages_ok+52/2b0>
  10:   3b 05 60 3f 00 00         cmp    0x3f60,%eax
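In the disassembly above, the leading bytes 0f 0b decode to ud2 (ksymoops prints it as ud2a), a deliberately invalid opcode. That is consistent with a BUG() check firing, since the oops is reported as "invalid operand" right after the "kernel BUG at" line. A minimal sketch for telling such a trap apart from an ordinary fault by looking at the Code: line follows; the helper names are hypothetical, and it assumes the plain hex format shown in this thread (no angle-bracket markers around the faulting byte).

```python
def code_bytes(code_line):
    """Return the raw bytes listed after 'Code:' in an oops line."""
    _, _, tail = code_line.partition("Code:")
    return bytes(int(token, 16) for token in tail.split())


def starts_with_ud2(code_line):
    """True if the faulting instruction is ud2 (opcode 0f 0b)."""
    return code_bytes(code_line)[:2] == b"\x0f\x0b"


if __name__ == "__main__":
    bug_trap = "kernel: Code: 0f 0b 83 c4 08 89 f0 2b 05 6c 3f 2f c0 c1 f8 06 3b 05 60 3f"
    null_deref = "kernel: Code: 80 3c 38 00 75 08 83 c3 07 e9 88 00 00 00 89 d8 c1 e8 03 8b"
    print(starts_with_ud2(bug_trap), starts_with_ud2(null_deref))
```

The second sample is the Code: line from the NULL pointer oops reported elsewhere in this thread, which does not start with ud2.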


This vanished when I backed down to NVIDIA_kernel-2314.
No big loss to me, except that I miss DigitalVibrance.

If only I could grab the source I'd backport it, but alas my asm
(especially reading gdb disassembly of it) is pretty rusty.

>
>$ cat /proc/driver/nvidia/cards/0
>Model:   GeForce2 MX/MX 400
>IRQ: 10
>Video BIOS:  03.11.00.18
>Card Type:   AGP
>
>$ cat /proc/driver/nvidia/agp/card
>Fast Writes: Supported
>SBA: Not Supported
>AGP Rates:   4x 2x 1x
>Registers:   0x1f17:0x1f000104
>
>$ cat /proc/driver/nvidia/agp/host-bridge
>Host Bridge: Via Apollo Pro KT133
>Fast Writes: Not Supported
>SBA: Supported
>AGP Rates:   4x 2x 1x
>Registers:   0x1f000207:0x0104
>
>$ cat /proc/driver/nvidia/agp/status
>Status:  Enabled
>Driver:  AGPGART
>AGP Rate:4x
>Fast Writes: Disabled
>SBA: Disabled
>
>$ cat /proc/driver/nvidia/version
>NVRM version: NVIDIA NVdriver Kernel Module  1.0-2960  Tue May 14 07:41:42 PDT 2002
>GCC version:  gcc version 2.96 2731 (Mandrake Linux 8.2 2.96-0.76mdk)

$ find /proc/nv -type f -exec cat {} ';'
- Driver Info - 
NVRM Version: NVIDIA NVdriver Kernel Module  1.0.2314  Fri Nov 30 19:33:20 PST 2001
Compiled with: gcc version 2.95.3 20010315 (release)
-- Card Info --
Model:GeForce2 MX/MX 400
IRQ:  5
Video BIOS:   03.11.01.37
-- PCI Only ---


Re: [Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-22 Thread krjw

On 2002-07-22 at 16:34 +0200, Luca Olivetti uttered:

|
| | The only real issue I've seen thus far is related to agpgart but this
| | probably has nothing to do with your problem.
|
| well, it could be, since the "nv" driver doesn't use agp
|

The issue I've seen is reproducible like so:
1) boot linux box
2) enable use of kernel AGPGART (not internal nvidia AGPGART code) in
XF86Config (load agpgart kernel module if needed)
3) Start X
4) Leave X
5) enable use of internal nvidia AGPGART code in XF86Config
6) Start X
7) choke.  have to log in from remote box to reboot.

(2) and (5) imply changing the value of the option NvAgp in XF86Config.
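To make steps (2) and (5) concrete, here is a minimal sketch of flipping the NvAgp value in the text of an XF86Config. The value meanings in the table are recalled from the NVIDIA driver README and should be treated as an assumption to check against the README shipped with your driver version; the function name and the sample Device section are hypothetical.

```python
import re

# Assumed meanings, recalled from the NVIDIA README; verify against your copy.
NVAGP_MODES = {
    0: "AGP disabled",
    1: "NVIDIA internal AGP support",
    2: "kernel AGPGART",
    3: "try AGPGART first, then NVIDIA internal AGP",
}

# Matches an existing line such as:  Option "NvAgp" "2"
OPTION_RE = re.compile(
    r'^([ \t]*Option[ \t]+"NvAgp"[ \t]+)"\d+"',
    re.IGNORECASE | re.MULTILINE,
)


def set_nvagp(config_text, mode):
    """Return config_text with every existing NvAgp option set to mode."""
    if mode not in NVAGP_MODES:
        raise ValueError(f"unknown NvAgp mode: {mode}")
    new_text, count = OPTION_RE.subn(
        lambda m: f'{m.group(1)}"{mode}"', config_text
    )
    if count == 0:
        raise LookupError("no NvAgp option found; add one to the Device section")
    return new_text


if __name__ == "__main__":
    sample = (
        'Section "Device"\n'
        '    Identifier "nv0"\n'
        '    Driver     "nvidia"\n'
        '    Option     "NvAgp" "2"\n'
        'EndSection\n'
    )
    print(set_nvagp(sample, 1))
```

The sketch only rewrites an option that is already present, so it cannot silently add a setting to the wrong section.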

Why it chokes, I don't know.  Maybe it has something to do with the AMD
AGP issue you quoted and the "inadvertant trashing of AGP memory".
Does anyone else know more about this[1]?  Perhaps the NV guys on the
list?  How does this problem manifest itself?  Segfaults?  System
lockups?

Ciao.
krjw.

[1] http://www.geocrawler.com/lists/3/Linux/35/175/7626960/






Re: [Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-22 Thread Luca Olivetti

krjw wrote:

| Ack.  Makes you wish you could poke around in the source, eh?

Well, not really, I wouldn't understand much of it. Other more
knowledgeable people could, though.


|  Anyway,
| does this only occur when you attempt to do 3D things?  Or does it
| happen in 2D land as well?  I've been using the nvidia binary drivers
| with the geforce3 for eons with no persistent life-threatening problems.

Even with 2D only.

| The only real issue I've seen thus far is related to agpgart but this
| probably has nothing to do with your problem.

well, it could be, since the "nv" driver doesn't use agp


| | At first I didn't relate this to the nvidia driver, but I've looked at
| | the kernel mailing list and everybody says that the problem is NVIDIA
| | related (I tried to contact nvidia but it seems that
| | [EMAIL PROTECTED] is redirected to /dev/null).
|
| /dev/null eh?  Hmm maybe so today but usually they respond within a day
| or two.  Most often they don't bother to read your message in entirety
| but they do (try to) respond.

I didn't get anything, not even an automated reply.


[...]


| Perhaps Mandrake tweaked the kernel a bit and broke something such that
| it doesn't get along with the nvidia code (or vice versa)?  Did you try
| or consider trying vanilla kernel source, i.e., from kernel.org?

No, I thought that the Mandrake-patched kernel was fine (probably "better
than the real thing" ;-), since nvidia provides a package for it.

|
| | The kernel is booted with the "mem=nopentium" parameter to workaround
| | the known athlon bug (or linux bug wrt athlon) with agp.
| | XFree is 4.2.0, straight from Mandrake RPMS.
|
| Call me ignorant but what exactly is the nopentium option for?

I don't remember where I got it originally, but you can find a
description of the problem here:
http://www.gentoo.org/news/20020123-amd-news.html

Bye
-- 
Luca Olivetti



Re: [Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-22 Thread krjw

On 2002-07-22 at 15:16 +0200, Luca Olivetti uttered:

[snip]
| Anyway, I was having a lot of these messages (kernel BUG at
| page_alloc.c) with the current version of nvidia driver (1.0-2960) and
| afterwards the system is unpredictable (possible hard locks, file system
| corruption, anything).

Ack.  Makes you wish you could poke around in the source, eh?  Anyway,
does this only occur when you attempt to do 3D things?  Or does it
happen in 2D land as well?  I've been using the nvidia binary drivers
with the geforce3 for eons with no persistent life-threatening problems.
The only real issue I've seen thus far is related to agpgart but this
probably has nothing to do with your problem.

| At first I didn't relate this to the nvidia driver, but I've looked at
| the kernel mailing list and everybody says that the problem is NVIDIA
| related (I tried to contact nvidia but it seems that
| [EMAIL PROTECTED] is redirected to /dev/null).

/dev/null eh?  Hmm maybe so today but usually they respond within a day
or two.  Most often they don't bother to read your message in entirety
but they do (try to) respond.

| I'm beginning to think that it is true, because, tired of all the
| problems with this driver (earlier versions just froze X but this one is
| more serious), I switched to the xfree "nv" driver and the system has
| been rock solid since then, the only problem is that I cannot play
| tuxracer anymore ;-)
| I'm running Mandrake 8.2 with the updated kernel
| kernel-2.4.18.8.1mdk-1-3mdk and the NVIDIA_kernel built from the source
| rpm, but I had the same problem with the original kernel
| kernel-2.4.18.6mdk-1-1mdk and the corresponding rpm package downloaded
| from www.nvidia.com.

Perhaps Mandrake tweaked the kernel a bit and broke something such that
it doesn't get along with the nvidia code (or vice versa)?  Did you try
or consider trying vanilla kernel source, i.e., from kernel.org?

| The kernel is booted with the "mem=nopentium" parameter to workaround
| the known athlon bug (or linux bug wrt athlon) with agp.
| XFree is 4.2.0, straight from Mandrake RPMS.

Call me ignorant but what exactly is the nopentium option for?



Cheers.
krjw.




[Xpert]nvidia binary driver: kernel BUG at page_alloc.c

2002-07-22 Thread Luca Olivetti

Hi,
apologies in advance if this isn't the right mailing list; I saw earlier
messages regarding the nvidia binary driver, so this should be OK.

Anyway, I was having a lot of these messages (kernel BUG at
page_alloc.c) with the current version of nvidia driver (1.0-2960) and
afterwards the system is unpredictable (possible hard locks, file system
corruption, anything).
At first I didn't relate this to the nvidia driver, but I've looked at
the kernel mailing list and everybody says that the problem is NVIDIA
related (I tried to contact nvidia but it seems that
[EMAIL PROTECTED] is redirected to /dev/null).
I'm beginning to think that it is true, because, tired of all the
problems with this driver (earlier versions just froze X but this one is
more serious), I switched to the xfree "nv" driver and the system has
been rock solid since then, the only problem is that I cannot play
tuxracer anymore ;-)
I'm running Mandrake 8.2 with the updated kernel
kernel-2.4.18.8.1mdk-1-3mdk and the NVIDIA_kernel built from the source
rpm, but I had the same problem with the original kernel
kernel-2.4.18.6mdk-1-1mdk and the corresponding rpm package downloaded
from www.nvidia.com.
The kernel is booted with the "mem=nopentium" parameter to workaround
the known athlon bug (or linux bug wrt athlon) with agp.
XFree is 4.2.0, straight from Mandrake RPMS.

$ rpm -qa | grep XFree
XFree86-libs-4.2.0-10mdk
XFree86-100dpi-fonts-4.2.0-10mdk
XFree86-4.2.0-10mdk
XFree86-server-4.2.0-10mdk
XFree86-devel-4.2.0-10mdk
XFree86-75dpi-fonts-4.2.0-10mdk
XFree86-xfs-4.2.0-10mdk
XFree86-compat-libs-4.1.0-2mdk


$ cat /proc/driver/nvidia/cards/0
Model:   GeForce2 MX/MX 400
IRQ: 10
Video BIOS:  03.11.00.18
Card Type:   AGP

$ cat /proc/driver/nvidia/agp/card
Fast Writes: Supported
SBA: Not Supported
AGP Rates:   4x 2x 1x
Registers:   0x1f17:0x1f000104

$ cat /proc/driver/nvidia/agp/host-bridge
Host Bridge: Via Apollo Pro KT133
Fast Writes: Not Supported
SBA: Supported
AGP Rates:   4x 2x 1x
Registers:   0x1f000207:0x0104

$ cat /proc/driver/nvidia/agp/status
Status:  Enabled
Driver:  AGPGART
AGP Rate:4x
Fast Writes: Disabled
SBA: Disabled

$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA NVdriver Kernel Module  1.0-2960  Tue May 14 07:41:42 PDT 2002
GCC version:  gcc version 2.96 2731 (Mandrake Linux 8.2 2.96-0.76mdk)
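Since which AGP backend is active matters when comparing reports, a minimal sketch of reading it out of the "Key: value" text printed under /proc/driver/nvidia may help. The helper names are hypothetical, and the parsing assumes only the layout of the status output pasted above.

```python
def parse_proc_fields(text):
    """Split 'Key: value' lines into a dict, skipping lines without a colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def agp_backend(status_text):
    """Return the reported AGP driver name, or None if AGP is not enabled."""
    fields = parse_proc_fields(status_text)
    if fields.get("Status") != "Enabled":
        return None
    return fields.get("Driver")


if __name__ == "__main__":
    # Text copied from the agp/status output above.
    sample = (
        "Status:  Enabled\n"
        "Driver:  AGPGART\n"
        "AGP Rate:4x\n"
        "Fast Writes: Disabled\n"
        "SBA: Disabled\n"
    )
    print(agp_backend(sample))  # prints AGPGART
```

On a live system the same function can be fed the contents of /proc/driver/nvidia/agp/status instead of the sample string.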



-- 
Luca Olivetti