Re: [AGPGART] intel_agp: use table for device probe

2007-06-18 Thread Carlo Wood
On Mon, Jun 18, 2007 at 10:37:26AM +0800, Wang Zhenyu wrote:
> Carlo, I've just built latest kernel git tree on a Dell 965G box and
> have a NV card plugged-in. It boots fine.

It would be nice if you could test it with the exact same
hardware ... I am pretty sure it should be reproducible then *cough*

> Linux agpgart interface v0.102 (c) Dave Jones
> agpgart: Detected an Intel 965G Chipset.
> agpgart: AGP aperture is 256M @ 0x0
> 
> I don't know why it hangs your machine when loading this module, it should
> just not bother anything. But from your last "modprobe: ..." line, it seems
> there's really badness somewhere, do you have serial console to see more
> in the message?

The reason I react a bit late to your post is because in the meantime
I set up a serial console and figured out how to boot that way... and
captured the boot messages.

I will add that to one of the other posts, because as a result of
what they posted there I added agp=off - and that (indeed) makes it
possible for me to boot. Please see my next post thus.

-- 
Carlo Wood <[EMAIL PROTECTED]>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [AGPGART] intel_agp: use table for device probe

2007-06-18 Thread Dave Jones
On Mon, Jun 18, 2007 at 01:42:13PM -0400, Chuck Ebbert wrote:
 > On 06/17/2007 10:37 PM, Wang Zhenyu wrote:
 > > On 2007.06.18 03:56:36 +, Carlo Wood wrote:
 > >> On Mon, Jun 18, 2007 at 10:57:38AM +1000, Dave Airlie wrote:
 > >>>> Right now, I'm at a loss to explain the corruption, so it's
 > >>>> difficult to suggest what to try.
 > >>> The thing is here, this is PCIE, so if there is a GPU plugged into the
 > >>> PCIE 16x slot in theory the main onboard graphics should disable, AGP
 > >>> code is used to control the GART for the onboard chip, in this case a
 > >>> plugged in card will  not use AGP, I wonder have Intel tested with a
 > >>> pcie card in place...
 > > 
 > > Agree. We seem to always enable AGP even IGD is disabled or not exists,
 > > other card should not depend on this module ever.
 > > 
 > >> That is Chinese for me :/.
 > >> Do you want me to try something?
 > > 
 > > Carlo, I've just built latest kernel git tree on a Dell 965G box and
 > > have a NV card plugged-in. It boots fine.
 > > 
 > > Linux agpgart interface v0.102 (c) Dave Jones
 > > agpgart: Detected an Intel 965G Chipset.
 > > agpgart: AGP aperture is 256M @ 0x0
 > > 
 > > I don't know why it hangs your machine when loading this module, it should
 > > just not bother anything. But from your last "modprobe: ..." line, it seems
 > > there's really badness somewhere, do you have serial console to see more
 > > in the message?
 > 
 > There are also these bug reports:
 > 
 > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=229913
 > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=242101

good find.

Looking through intel-agp brings up a wart that I've been 
complaining about for a while.

Start looking for functions with '965' in them.
The first one you hit is the wonderfully named
'intel_i830_init_gtt_entries'.
Two thirds of that function no longer have anything to
do with an i830.  Instead of adding an intel_i965_init_gtt_entries,
this thing has grown into a monster dealing with three different
generations of hardware.

This results in tables like this..

static const struct agp_bridge_driver intel_i965_driver = {
   .owner                 = THIS_MODULE,
   .aperture_sizes        = intel_i830_sizes,
   .size_type             = FIXED_APER_SIZE,
   .num_aperture_sizes    = 4,
   .needs_scratch_page    = TRUE,
   .configure             = intel_i915_configure,
   .fetch_size            = intel_i9xx_fetch_size,
   .cleanup               = intel_i915_cleanup,
   .tlb_flush             = intel_i810_tlbflush,
   .mask_memory           = intel_i965_mask_memory,
   .masks                 = intel_i810_masks,
   .agp_enable            = intel_i810_agp_enable,
   .cache_flush           = global_cache_flush,
   .create_gatt_table     = intel_i965_create_gatt_table,
   .free_gatt_table       = intel_i830_free_gatt_table,
   .insert_memory         = intel_i915_insert_entries,
   .remove_memory         = intel_i915_remove_entries,
   .alloc_by_type         = intel_i830_alloc_by_type,
   .free_by_type          = intel_i810_free_by_type,
   .agp_alloc_page        = agp_generic_alloc_page,
   .agp_destroy_page      = agp_generic_destroy_page,
   .agp_type_to_mask_type = intel_i830_type_to_mask_type,
};

So we use bits and pieces from 810, 830, 915, and throw in
some new 965 routines too.  Why is this a mess?
Because it's non-trivial to just look at this table
and spot bugs like "wait, that 810 should be using 915"
without lots of staring at data sheets.
Additionally, each time we twist these routines to cope with
an additional chipset, we risk breaking previous generations.

Having functions do ONE thing is a good thing, even if
it means having 15 of them that look similar.
The alternative of a single function that becomes a nest
of if's & switches is just horrible.
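
To make that trade-off concrete, here is a minimal sketch in plain userspace C. Everything in it is hypothetical: the enum, the struct bridge_ops table, the function names, and the reserved-size numbers are made up for illustration and are not taken from intel-agp.c or any data sheet. It computes the same value two ways, once through a single function that switches on the hardware generation, and once through one small function per generation picked out of a table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hardware generations (illustration only). */
enum gen { GEN_I830, GEN_I915, GEN_I965, GEN_COUNT };

/* Made-up amounts of stolen memory (in KB) reserved per generation. */
#define I830_RESERVED_KB 132u
#define I915_RESERVED_KB 260u
#define I965_RESERVED_KB 516u

/* One GTT entry maps one 4 KB page. */
static unsigned int kb_to_entries(unsigned int kb)
{
	return kb / 4u;
}

/* Style 1: a single routine that grows a branch for every new generation. */
unsigned int combined_init_gtt_entries(enum gen g, unsigned int stolen_kb)
{
	switch (g) {
	case GEN_I830:
		return kb_to_entries(stolen_kb - I830_RESERVED_KB);
	case GEN_I915:
		return kb_to_entries(stolen_kb - I915_RESERVED_KB);
	case GEN_I965:
		return kb_to_entries(stolen_kb - I965_RESERVED_KB);
	default:
		return 0;
	}
}

/* Style 2: one small function per generation, each doing ONE thing. */
static unsigned int i830_init_gtt_entries(unsigned int stolen_kb)
{
	return kb_to_entries(stolen_kb - I830_RESERVED_KB);
}

static unsigned int i915_init_gtt_entries(unsigned int stolen_kb)
{
	return kb_to_entries(stolen_kb - I915_RESERVED_KB);
}

static unsigned int i965_init_gtt_entries(unsigned int stolen_kb)
{
	return kb_to_entries(stolen_kb - I965_RESERVED_KB);
}

/* Hypothetical per-generation ops table: every row names its own routine. */
struct bridge_ops {
	const char *name;
	unsigned int (*init_gtt_entries)(unsigned int stolen_kb);
};

static const struct bridge_ops bridge_ops_table[GEN_COUNT] = {
	[GEN_I830] = { "i830", i830_init_gtt_entries },
	[GEN_I915] = { "i915", i915_init_gtt_entries },
	[GEN_I965] = { "i965", i965_init_gtt_entries },
};

/* Dispatch through the table instead of branching inside one function. */
unsigned int table_init_gtt_entries(enum gen g, unsigned int stolen_kb)
{
	assert((int)g >= 0 && (int)g < GEN_COUNT);
	return bridge_ops_table[g].init_gtt_entries(stolen_kb);
}
```

Both entry points return the same numbers. The difference shows up when a fourth generation arrives: in the second style that means one new function and one new table row, without touching any code that older hardware depends on.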

It could be that all of the above is actually pointing to
the correct routines.  It could also be that the codepaths
in those routines, as twisty as they are, are fine, and this
is just some normal bug, but hunting for it becomes a lot
harder when the code is this baroque.

Dave

-- 
http://www.codemonkey.org.uk


Re: [AGPGART] intel_agp: use table for device probe

2007-06-18 Thread Chuck Ebbert
On 06/17/2007 10:37 PM, Wang Zhenyu wrote:
> On 2007.06.18 03:56:36 +, Carlo Wood wrote:
>> On Mon, Jun 18, 2007 at 10:57:38AM +1000, Dave Airlie wrote:
>>>> Right now, I'm at a loss to explain the corruption, so it's
>>>> difficult to suggest what to try.
>>> The thing is here, this is PCIE, so if there is a GPU plugged into the
>>> PCIE 16x slot in theory the main onboard graphics should disable, AGP
>>> code is used to control the GART for the onboard chip, in this case a
>>> plugged in card will  not use AGP, I wonder have Intel tested with a
>>> pcie card in place...
> 
> Agree. We seem to always enable AGP even IGD is disabled or not exists,
> other card should not depend on this module ever.
> 
>> That is Chinese for me :/.
>> Do you want me to try something?
> 
> Carlo, I've just built latest kernel git tree on a Dell 965G box and
> have a NV card plugged-in. It boots fine.
> 
> Linux agpgart interface v0.102 (c) Dave Jones
> agpgart: Detected an Intel 965G Chipset.
> agpgart: AGP aperture is 256M @ 0x0
> 
> I don't know why it hangs your machine when loading this module, it should
> just not bother anything. But from your last "modprobe: ..." line, it seems
> there's really badness somewhere, do you have serial console to see more
> in the message?

There are also these bug reports:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=229913
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=242101


Re: [AGPGART] intel_agp: use table for device probe

2007-06-18 Thread Carlo Wood
On Mon, Jun 18, 2007 at 12:18:32PM +1000, Dave Airlie wrote:
> Well it was more for davej's benefit, in theory for your machine with
> a PCIE graphics card you don't need agpgart enabled at all granted it
> shouldn't screw up if it is..

I can't disable it though.

I am not THAT much interested in disabling it however as long as I
get help here to debug this. I am interested in finding out what is
causing the crash, can't stand it to have a bug on my machine without
knowing what it is ;)

-- 
Carlo Wood <[EMAIL PROTECTED]>


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Wang Zhenyu
On 2007.06.18 03:56:36 +, Carlo Wood wrote:
> On Mon, Jun 18, 2007 at 10:57:38AM +1000, Dave Airlie wrote:
> > >Right now, I'm at a loss to explain the corruption, so it's
> > >difficult to suggest what to try.
> > 
> > The thing is here, this is PCIE, so if there is a GPU plugged into the
> > PCIE 16x slot in theory the main onboard graphics should disable, AGP
> > code is used to control the GART for the onboard chip, in this case a
> > plugged in card will  not use AGP, I wonder have Intel tested with a
> > pcie card in place...

Agreed. We seem to always enable AGP even when the IGD is disabled or does
not exist; other cards should never depend on this module.

> 
> That is Chinese for me :/.
> Do you want me to try something?

Carlo, I've just built the latest kernel git tree on a Dell 965G box with
an NV card plugged in. It boots fine.

Linux agpgart interface v0.102 (c) Dave Jones
agpgart: Detected an Intel 965G Chipset.
agpgart: AGP aperture is 256M @ 0x0

I don't know why it hangs your machine when this module is loaded; it should
not disturb anything. But from your last "modprobe: ..." line, it seems
something really is going wrong. Do you have a serial console to capture
more of the messages?


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Dave Airlie

> > The thing is here, this is PCIE, so if there is a GPU plugged into the
> > PCIE 16x slot in theory the main onboard graphics should disable, AGP
> > code is used to control the GART for the onboard chip, in this case a
> > plugged in card will  not use AGP, I wonder have Intel tested with a
> > pcie card in place...
>
> That is Chinese for me :/.
> Do you want me to try something?

Well, it was more for davej's benefit. In theory, for your machine with
a PCIE graphics card you don't need agpgart enabled at all, granted it
shouldn't screw up if it is.

Dave.


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Carlo Wood
On Mon, Jun 18, 2007 at 10:57:38AM +1000, Dave Airlie wrote:
> >Right now, I'm at a loss to explain the corruption, so it's
> >difficult to suggest what to try.
> 
> The thing is here, this is PCIE, so if there is a GPU plugged into the
> PCIE 16x slot in theory the main onboard graphics should disable, AGP
> code is used to control the GART for the onboard chip, in this case a
> plugged in card will  not use AGP, I wonder have Intel tested with a
> pcie card in place...

That is Chinese for me :/.
Do you want me to try something?

-- 
Carlo Wood <[EMAIL PROTECTED]>


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Dave Airlie


> Right now, I'm at a loss to explain the corruption, so it's
> difficult to suggest what to try.

The thing here is that this is PCIE: if there is a GPU plugged into the
PCIE x16 slot, in theory the main onboard graphics should be disabled. The
AGP code is used to control the GART for the onboard chip, so in this case
a plugged-in card will not use AGP. I wonder whether Intel has tested with
a PCIE card in place...

Dave.


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Dave Jones
On Mon, Jun 18, 2007 at 02:06:43AM +0200, Carlo Wood wrote:
 > On Sun, Jun 17, 2007 at 06:49:18PM -0400, Dave Jones wrote:
 > > ok, then you must have CONFIG_AGP=y
 > 
 > I do - not voluntary however. For some mysterious reason I am
 > unable to set it to n or m.

You likely have CONFIG_IOMMU set. This makes AGP non-optional.

 > I am also not able to set CONFIG_AGP_INTEL
 > to n, only to y or m.
 > (using make 'menuconfig' for example).

Can't explain that one.

 > >  > Hence, in the case that the kernel works, intel-agp is loaded
 > >  > WITHOUT printing this Detected line
 > > 
 > > That doesn't make much sense.  The hardware doesn't change between
 > > a working & not-working kernel, and somehow the PCI probing fails.
 > > Hmm, do you have CONFIG_EDAC set ?
 > 
 > CONFIG_EDAC=m
 > 
 > $ lsmod | grep edac

ok, red herring.

 > > There's an outstanding bug (well, lack of feature) , where it claims
 > > the PCI device before AGP gets a chance to.
 > > This is unrelated to your hang however, but would at least explain
 > > the inconsistent probing.
 > > 
 > >  > Perhaps my "solution" is to remove this module completely?
 > >  > I don't seem to need it.
 > > 
 > > It's needed only for 3d,
 > 
 > Nope - I just played UT2004 without problems, and without intel-agp loaded.

Using the nvidia driver? It has its own built-in AGP support which will
get used if the kernel AGP support is missing.

 > > but it'd be good to figure out why its so
 > > broken on your system, even if you don't need it.
 > 
 > Please tell me what to do / try.
 > I'm an experienced coder - but I never really played with the
 > kernel before - so you'll have to spell out how to turn on debugging etc.

Right now, I'm at a loss to explain the corruption, so it's
difficult to suggest what to try.

Dave

-- 
http://www.codemonkey.org.uk


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Carlo Wood
On Sun, Jun 17, 2007 at 06:49:18PM -0400, Dave Jones wrote:
> ok, then you must have CONFIG_AGP=y

I do, though not voluntarily. For some mysterious reason I am
unable to set it to n or m. I am also not able to set CONFIG_AGP_INTEL
to n, only to y or m
(using 'make menuconfig', for example).

>  > $ find /lib/modules -name 'agpgart.ko'
>  > $ find /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64 -name '*agp*.ko'
>  > /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/sis-agp.ko
>  > /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/intel-agp.ko
>  > /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/via-agp.ko
>  > 
>  > 2)
>  > 
>  > $ strings /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/intel-agp.ko | grep 'agpgart: Detected an Intel'
>  > <6>agpgart: Detected an Intel %s Chipset.
>  > 
>  > Hence, in the case that the kernel works, intel-agp is loaded
>  > WITHOUT printing this Detected line
> 
> That doesn't make much sense.  The hardware doesn't change between
> a working & not-working kernel, and somehow the PCI probing fails.
> Hmm, do you have CONFIG_EDAC set ?

CONFIG_EDAC=m

$ lsmod | grep edac
$

> There's an outstanding bug (well, lack of feature) , where it claims
> the PCI device before AGP gets a chance to.
> This is unrelated to your hang however, but would at least explain
> the inconsistent probing.
> 
>  > Perhaps my "solution" is to remove this module completely?
>  > I don't seem to need it.
> 
> It's needed only for 3d,

Nope - I just played UT2004 without problems, and without intel-agp loaded.

> but it'd be good to figure out why its so
> broken on your system, even if you don't need it.

Please tell me what to do / try.
I'm an experienced coder - but I never really played with the
kernel before - so you'll have to spell out how to turn on debugging etc.

-- 
Carlo Wood <[EMAIL PROTECTED]>


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Dave Jones
On Mon, Jun 18, 2007 at 12:36:49AM +0200, Carlo Wood wrote:
 > On Sun, Jun 17, 2007 at 05:33:55PM -0400, Dave Jones wrote:
 > > intel-agp
 > > 
 > > Though by the looks of things, with the working kernel, you don't have
 > > it loaded (it's dependant upon the 'agpgart' module, which prints the
 > > "Detected" line that was missing).
 > 
 > You are wrong.
 > I don't have any agpgart module at all, in any of the kernels that I compiled.

ok, then you must have CONFIG_AGP=y

 > $ find /lib/modules -name 'agpgart.ko'
 > $ find /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64 -name '*agp*.ko'
 > /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/sis-agp.ko
 > /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/intel-agp.ko
 > /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/via-agp.ko
 > 
 > 2)
 > 
 > $ strings /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/intel-agp.ko | grep 'agpgart: Detected an Intel'
 > <6>agpgart: Detected an Intel %s Chipset.
 > 
 > Hence, in the case that the kernel works, intel-agp is loaded
 > WITHOUT printing this Detected line

That doesn't make much sense.  The hardware doesn't change between
a working & not-working kernel, and somehow the PCI probing fails.
Hmm, do you have CONFIG_EDAC set?
There's an outstanding bug (well, lack of feature) where EDAC claims
the PCI device before AGP gets a chance to.
This is unrelated to your hang, however, but would at least explain
the inconsistent probing.

 > Perhaps my "solution" is to remove this module completely?
 > I don't seem to need it.

It's needed only for 3d, but it'd be good to figure out why it's so
broken on your system, even if you don't need it.

Dave

-- 
http://www.codemonkey.org.uk


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Carlo Wood
On Sun, Jun 17, 2007 at 05:33:55PM -0400, Dave Jones wrote:
> intel-agp
> 
> Though by the looks of things, with the working kernel, you don't have
> it loaded (it's dependant upon the 'agpgart' module, which prints the
> "Detected" line that was missing).

You are wrong.

1)

I don't have any agpgart module at all, in any of the kernels that I compiled.

$ find /lib/modules -name 'agpgart.ko'
$ find /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64 -name '*agp*.ko'
/lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/sis-agp.ko
/lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/intel-agp.ko
/lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/via-agp.ko

2)

$ strings /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/intel-agp.ko | grep 'agpgart: Detected an Intel'
<6>agpgart: Detected an Intel %s Chipset.

Hence, in the case that the kernel works, intel-agp is loaded
WITHOUT printing this Detected line - while when it doesn't work,
it is loaded while printing this line.

Perhaps my "solution" is to remove this module completely?
I don't seem to need it.

-- 
Carlo Wood <[EMAIL PROTECTED]>


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Carlo Wood
On Sun, Jun 17, 2007 at 05:33:55PM -0400, Dave Jones wrote:
>  > What would the name be of such module?
> 
> intel-agp
> 
> Though by the looks of things, with the working kernel, you don't have
> it loaded (it's dependant upon the 'agpgart' module, which prints the
> "Detected" line that was missing).

I can rmmod intel-agp without problems.
The agpgart module isn't loaded.

-- 
Carlo Wood <[EMAIL PROTECTED]>


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Carlo Wood
On Sun, Jun 17, 2007 at 05:33:55PM -0400, Dave Jones wrote:
>  > What would the name be of such module?
> 
> intel-agp
> 
> Though by the looks of things, with the working kernel, you don't have
> it loaded (it's dependant upon the 'agpgart' module, which prints the
> "Detected" line that was missing).

hikaru:~>lsmod | grep agp
intel_agp  31776  0

It's loaded right now... that is with 2.6.22-rc4-+something (one of the
working ones).  I don't know WHEN it was loaded though. Probably
not at the same time as the others. 

And

hikaru:~>dmesg | grep agpgart
Linux agpgart interface v0.102 (c) Dave Jones

No 'Detected' line :/

Also, I noted that this 'agpgart: Detected..' line is always(?) printed
right after (often even without THAT line having printed a newline yet!)
something like:

udev: Waiting for /dev to be fully populated...agpgart: Detected an Intel 965G Chipset.

And this 'udev: Waiting for /dev to be fully populated' (sorry, didn't
write down the exact phrase, this is from memory) is also not printed
by kernels that work.

-- 
Carlo Wood <[EMAIL PROTECTED]>

PS: I asked you to move to the other thread - it's a bit annoying to have
two threads about this now.



Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Dave Jones
On Sun, Jun 17, 2007 at 11:13:38PM +0200, Carlo Wood wrote:
 > On Sun, Jun 17, 2007 at 04:49:04PM -0400, Dave Jones wrote:
 > > That's pretty bad corruption indeed.  What I'm puzzling over though
 > > is why other 965G users aren't seeing the same thing.
 > > My own 965G seems to be fine, though that's using Intel graphics
 > > instead of nvidia.
 > > 
 > > (Just to rule it out, I'm assuming at this stage in boot that the
 > >  nvidia driver module has never been loaded?)
 > 
 > Doesn't even exist for that kernel. I only compiled a new
 > nvidia driver module for one kernel - the one that I am using
 > on a daily basis.
 > 
 > > And if you never load the agpgart modules, you never see lockups?
 > 
 > What would the name be of such module?

intel-agp

Though by the looks of things, with the working kernel, you don't have
it loaded (it's dependent upon the 'agpgart' module, which prints the
"Detected" line that was missing).

Dave

-- 
http://www.codemonkey.org.uk


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Carlo Wood
On Sun, Jun 17, 2007 at 04:49:04PM -0400, Dave Jones wrote:
> That's pretty bad corruption indeed.  What I'm puzzling over though
> is why other 965G users aren't seeing the same thing.
> My own 965G seems to be fine, though that's using Intel graphics
> instead of nvidia.
> 
> (Just to rule it out, I'm assuming at this stage in boot that the
>  nvidia driver module has never been loaded?)

Doesn't even exist for that kernel. I only compiled a new
nvidia driver module for one kernel - the one that I am using
on a daily basis.

> And if you never load the agpgart modules, you never see lockups?

What would the name be of such module?

In fact, I think that when the kernel does NOT lockup, it
doesn't print this "agpgart: Detected.." line either.

Ie, the dmesg of cf68676222e54cd0a31efd968da00e65f9a0963f
which boots fine, gives:

$ grep Detected dmesg-cf686
time.c: Detected 2666.669 MHz processor.
Detected 16.666 MHz APIC timer.

$ grep agpgart dmesg-cf686
Linux agpgart interface v0.102 (c) Dave Jones


Does that give an indication of what you want me to test/try?

-- 
Carlo Wood <[EMAIL PROTECTED]>


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Dave Jones
On Sun, Jun 17, 2007 at 09:59:01PM +0200, Carlo Wood wrote:
 > On Sun, Jun 17, 2007 at 03:07:14PM -0400, Dave Jones wrote:
 > > Sometimes things fall through the cracks..
 > > I haven't heard any similar problems, which makes it somewhat odd.
 > 
 > Ok. Well, the lockup is all too real here :p
 > 
 > I suspect a memory corruption going on, as under certain circumstances
 > there is printed more after the "agpgart: Detected an Intel 965G
 > Chipset." -- that is, the kernel *sometimes* doesn't hang silently,
 > but either reboots by itself (hard reset) or prints something
 > that looks like a total crash to me. Often, this is prefixed by
 > the name of running process. I am sorry, I never wrote any of that
 > down. Only the last time, it then went like:
 > 
 > agpgart: Detected an Intel 965G Chipset.
 > modprobe: Corrupt page table at address 20
 > PGD 178c6c067 PUD 201000c0049c04f BAD

That's pretty bad corruption indeed.  What I'm puzzling over though
is why other 965G users aren't seeing the same thing.
My own 965G seems to be fine, though that's using Intel graphics
instead of nvidia.

(Just to rule it out, I'm assuming at this stage in boot that the
 nvidia driver module has never been loaded?)

And if you never load the agpgart modules, you never see lockups?

Right now, I'm at a loss to explain this.

Dave

-- 
http://www.codemonkey.org.uk


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Carlo Wood
On Sun, Jun 17, 2007 at 03:07:14PM -0400, Dave Jones wrote:
> Sometimes things fall through the cracks..
> I haven't heard any similar problems, which makes it somewhat odd.

Ok. Well, the lockup is all too real here :p

I suspect a memory corruption going on, as under certain circumstances
there is printed more after the "agpgart: Detected an Intel 965G
Chipset." -- that is, the kernel *sometimes* doesn't hang silently,
but either reboots by itself (hard reset) or prints something
that looks like a total crash to me. Often, this is prefixed by
the name of running process. I am sorry, I never wrote any of that
down. Only the last time, it then went like:

agpgart: Detected an Intel 965G Chipset.
modprobe: Corrupt page table at address 20
PGD 178c6c067 PUD 201000c0049c04f BAD

This very same kernel (I booted it several times to see how
reproducible things were) always prints "agpgart: Detected an Intel 965G
Chipset." but then either hangs (two times), prints the above
and hangs (one time) or hard resets (one time).

Then I also booted the other kernels several times, and kernels
that booted fine ALWAYS boot fine - while kernels that locked
up (that I tested by booting them three times) only printed
the agpgart line and locked up three times.

>  > The patch causes my machine to lock up and/or crash
>  > in various ways (depending on the exact version of the kernel),
>  > very shortly after printing: agpgart: Detected an Intel 965G Chipset.
>  > 
>  > The first kernel version that stops doing that is 2.6.22-rc5.
>  > I used git bisect once more to find the patch that fixes this bug:
>  > I am mailing this mostly because the comment doesn't seem to indicate
>  > that the author is aware that this patch fixes an existing regression
>  > (it worked fine for me with 2.6.18).
> 
> Indeed, it was unknown to me too, this should have just been
> a clean-up.

I hope you saw my later post too? I sent it to you too - but also to the
list, so I don't know if you'll notice it.

The Subject of that post is: 2.6.22-rc5 regression

I concluded too soon that 2.6.22-rc5 was working :(.
I just really tested it and it doesn't! Same error.

Please have a look at that other post of mine and let's continue this
thread there.

> Out of curiosity, I'd like to see your lspci
> (not -v or anything, just run with no args)

I'll add that to that other thread.

-- 
Carlo Wood <[EMAIL PROTECTED]>


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Dave Jones
On Sun, Jun 17, 2007 at 06:22:35PM +0200, Carlo Wood wrote:

 > > Hi Dave, I have an amd64 box for which every kernel
 > > after 2.6.18 hangs during boot, so I have no dmesg :(
 > > 
 > > I used git bisect to find out where the problem patch is,
 > > and assuming bisect works (imho it jumped between versions
 > > very weirdly: closing in on the last 6 revisions, it
 > > jumped from 2.6.18 to 2.6.18-rc2 to 2.6.18-rc6), the problem
 > > is a patch in intel-agp.c, where support for the intel 965G
 > > is added (which I have).
 > Dave, I have no idea why you never replied to this -- don't
 > you care that kernels 2.6.19 through 2.6.21 lockup on boot? --

Sometimes things fall through the cracks..
I haven't heard any similar problems, which makes it somewhat odd.

 > The patch causes my machine to lock up and/or crash
 > in various ways (depending on the exact version of the kernel),
 > very shortly after printing: agpgart: Detected an Intel 965G Chipset.
 > 
 > The first kernel version that stops doing that is 2.6.22-rc5.
 > I used git bisect once more to find the patch that fixes this bug:
 > I am mailing this mostly because the comment doesn't seem to indicate
 > that the author is aware that this patch fixes an existing regression
 > (it worked fine for me with 2.6.18).

Indeed, it was unknown to me too, this should have just been
a clean-up.

 > If anyone wants to know more (like what hardware I'm using), please
 > show me that you're actually alive / reading my mails.

Out of curiosity, I'd like to see your lspci
(not -v or anything, just run with no args)

Dave


-- 
http://www.codemonkey.org.uk
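
The bisect procedure described in this message can be tried out on a throwaway repository before running it on a kernel tree. The sketch below is an illustration only: the ten-commit repository, the boot_result file and the bisect_demo function are made up for the example, and grep stands in for the manual "does this kernel boot?" check.

```shell
#!/bin/sh
# Illustration only: build a throwaway repository in which a hypothetical
# regression lands in commit 7, then let git bisect find it.
bisect_demo() (
    set -e
    repo=$(mktemp -d)
    cd "$repo"
    git init -q .
    git config user.email "demo@example.invalid"
    git config user.name "Bisect Demo"

    i=1
    while [ "$i" -le 10 ]; do
        # Commits 1-6 "boot", commits 7-10 "hang" (made-up stand-in for a real test).
        if [ "$i" -ge 7 ]; then echo hangs > boot_result; else echo boots > boot_result; fi
        echo "$i" > revision          # guarantees every commit changes something
        git add boot_result revision
        git commit -q -m "commit $i"
        i=$((i + 1))
    done

    good=$(git rev-list --max-parents=0 HEAD)     # the first commit is known good
    git bisect start HEAD "$good" >/dev/null 2>&1 # HEAD is known bad
    # Exit status 0 marks a revision good, 1 marks it bad.
    git bisect run grep -q boots boot_result >/dev/null 2>&1

    # refs/bisect/bad now points at the first bad commit.
    git log -1 --format=%s refs/bisect/bad

    cd /
    rm -rf "$repo"
)

# Usage: bisect_demo
```

On a real kernel tree the steps are the same, except that each round means building and booting the checked-out revision and then answering with "git bisect good" or "git bisect bad" by hand. Bisect walks the commit graph rather than the release order, so landing on revisions that report an older -rc version is expected when a merged branch was started from an older tree.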


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Carlo Wood
On Sun, Jun 17, 2007 at 05:33:55PM -0400, Dave Jones wrote:
>  > What would the name be of such module?
> 
> intel-agp
> 
> Though by the looks of things, with the working kernel, you don't have
> it loaded (it's dependent upon the 'agpgart' module, which prints the
> "Detected" line that was missing).

I can rmmod intel-agp without problems.
The agpgart module isn't loaded.

-- 
Carlo Wood <[EMAIL PROTECTED]>


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Carlo Wood
On Sun, Jun 17, 2007 at 05:33:55PM -0400, Dave Jones wrote:
> intel-agp
> 
> Though by the looks of things, with the working kernel, you don't have
> it loaded (it's dependent upon the 'agpgart' module, which prints the
> "Detected" line that was missing).

You are wrong.

1)

I don't have any agpgart module at all, in any of the kernels that I compiled.

$ find /lib/modules -name 'agpgart.ko'
$ find /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64 -name '*agp*.ko'
/lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/sis-agp.ko
/lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/intel-agp.ko
/lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/via-agp.ko

2)

$ strings /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/intel-agp.ko | grep 'agpgart: Detected an Intel'
<6>agpgart: Detected an Intel %s Chipset.

Hence, in the case that the kernel works, intel-agp is loaded
WITHOUT printing this Detected line - while when it doesn't work,
it is loaded while printing this line.

Perhaps my solution is to remove this module completely?
I don't seem to need it.
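
The two observations above (is intel_agp loaded, and did it print its "Detected" line) can be checked together for any boot. The helper below is a sketch only: the agp_status name and the idea of saving the lsmod and dmesg output to files first are assumptions made for the example, not something from this thread.

```shell
#!/bin/sh
# Hypothetical helper: report whether intel_agp is loaded and whether the
# probe message shows up in the kernel log.
# $1 = file holding saved "lsmod" output, $2 = file holding saved "dmesg" output
agp_status() {
    if grep -q '^intel_agp[[:space:]]' "$1"; then
        loaded=yes
    else
        loaded=no
    fi
    if grep -q 'agpgart: Detected an Intel' "$2"; then
        probed=yes
    else
        probed=no
    fi
    echo "loaded=$loaded probed=$probed"
}

# Usage:
#   lsmod > lsmod.txt; dmesg > dmesg.txt
#   agp_status lsmod.txt dmesg.txt
```

Working from saved files also makes it possible to compare a serial-console capture of a hanging boot against the log of a kernel that boots.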

-- 
Carlo Wood <[EMAIL PROTECTED]>


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Dave Jones
On Mon, Jun 18, 2007 at 12:36:49AM +0200, Carlo Wood wrote:
 > On Sun, Jun 17, 2007 at 05:33:55PM -0400, Dave Jones wrote:
 > > intel-agp
 > > 
 > > Though by the looks of things, with the working kernel, you don't have
 > > it loaded (it's dependent upon the 'agpgart' module, which prints the
 > > "Detected" line that was missing).
 > 
 > You are wrong.
 > I don't have any agpgart module at all, in any of the kernels that I
 > compiled.

ok, then you must have CONFIG_AGP=y

 > $ find /lib/modules -name 'agpgart.ko'
 > $ find /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64 -name '*agp*.ko'
 > /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/sis-agp.ko
 > /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/intel-agp.ko
 > /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/via-agp.ko
 > 
 > 2)
 > 
 > $ strings /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/intel-agp.ko | grep 'agpgart: Detected an Intel'
 > <6>agpgart: Detected an Intel %s Chipset.
 > 
 > Hence, in the case that the kernel works, intel-agp is loaded
 > WITHOUT printing this Detected line

That doesn't make much sense.  The hardware doesn't change between
a working and a not-working kernel, and somehow the PCI probing fails.
Hmm, do you have CONFIG_EDAC set?
There's an outstanding bug (well, lack of feature), where it claims
the PCI device before AGP gets a chance to.
This is unrelated to your hang however, but would at least explain
the inconsistent probing.

 > Perhaps my solution is to remove this module completely?
 > I don't seem to need it.

It's needed only for 3d, but it'd be good to figure out why it's so
broken on your system, even if you don't need it.

Dave

-- 
http://www.codemonkey.org.uk


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Carlo Wood
On Sun, Jun 17, 2007 at 06:49:18PM -0400, Dave Jones wrote:
> ok, then you must have CONFIG_AGP=y

I do - not voluntarily, however. For some mysterious reason I am
unable to set it to n or m. I am also not able to set CONFIG_AGP_INTEL
to n, only to y or m.
(using make 'menuconfig' for example).

>  > $ find /lib/modules -name 'agpgart.ko'
>  > $ find /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64 -name '*agp*.ko'
>  > /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/sis-agp.ko
>  > /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/intel-agp.ko
>  > /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/via-agp.ko
>  > 
>  > 2)
>  > 
>  > $ strings /lib/modules/2.6.22-rc5-master-188e1f81ba31af1b65a2f3611df4c670b092bbac-amd64/kernel/drivers/char/agp/intel-agp.ko | grep 'agpgart: Detected an Intel'
>  > <6>agpgart: Detected an Intel %s Chipset.
>  > 
>  > Hence, in the case that the kernel works, intel-agp is loaded
>  > WITHOUT printing this Detected line
> 
> That doesn't make much sense.  The hardware doesn't change between
> a working and a not-working kernel, and somehow the PCI probing fails.
> Hmm, do you have CONFIG_EDAC set?

CONFIG_EDAC=m

$ lsmod | grep edac
$

> There's an outstanding bug (well, lack of feature), where it claims
> the PCI device before AGP gets a chance to.
> This is unrelated to your hang however, but would at least explain
> the inconsistent probing.
> 
>  > Perhaps my solution is to remove this module completely?
>  > I don't seem to need it.
> 
> It's needed only for 3d,

Nope - I just played UT2004 without problems, and without intel-agp loaded.

> but it'd be good to figure out why it's so
> broken on your system, even if you don't need it.

Please tell me what to do / try.
I'm an experienced coder - but I never really played with the
kernel before - so you'll have to spell out how to turn on debugging etc.

-- 
Carlo Wood <[EMAIL PROTECTED]>


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Dave Jones
On Mon, Jun 18, 2007 at 02:06:43AM +0200, Carlo Wood wrote:
 > On Sun, Jun 17, 2007 at 06:49:18PM -0400, Dave Jones wrote:
 > > ok, then you must have CONFIG_AGP=y
 > 
 > I do - not voluntarily, however. For some mysterious reason I am
 > unable to set it to n or m.

You likely have CONFIG_IOMMU set. This makes AGP non-optional.
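
To see how these options ended up in a particular build, the kernel's .config can be inspected directly. The helper below is a sketch under assumptions: config_value is a made-up name, and it expects the path to a .config file as written by "make menuconfig".

```shell
#!/bin/sh
# Hypothetical helper: print how a symbol is set in a kernel .config file.
# $1 = path to the .config, $2 = symbol name without the CONFIG_ prefix
config_value() {
    line=$(grep "^CONFIG_$2=" "$1" 2>/dev/null | head -n 1)
    if [ -n "$line" ]; then
        echo "${line#*=}"                 # y, m, or an explicit value
    elif grep -q "^# CONFIG_$2 is not set" "$1" 2>/dev/null; then
        echo n
    else
        echo missing                      # symbol not offered in this configuration
    fi
}

# Usage, from the top of the kernel build directory:
#   for sym in IOMMU AGP AGP_INTEL EDAC; do
#       echo "$sym=$(config_value .config "$sym")"
#   done
```

When one option forces another on, menuconfig does not let the forced option be switched off, which matches what is described for CONFIG_AGP here.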

 > I am also not able to set CONFIG_AGP_INTEL
 > to n, only to y or m.
 > (using make 'menuconfig' for example).

Can't explain that one.

 > > > Hence, in the case that the kernel works, intel-agp is loaded
 > > > WITHOUT printing this Detected line
 > > 
 > > That doesn't make much sense.  The hardware doesn't change between
 > > a working and a not-working kernel, and somehow the PCI probing fails.
 > > Hmm, do you have CONFIG_EDAC set?
 > 
 > CONFIG_EDAC=m
 > 
 > $ lsmod | grep edac

ok, red herring.

 > > There's an outstanding bug (well, lack of feature), where it claims
 > > the PCI device before AGP gets a chance to.
 > > This is unrelated to your hang however, but would at least explain
 > > the inconsistent probing.
 > > 
 > > > Perhaps my solution is to remove this module completely?
 > > > I don't seem to need it.
 > > 
 > > It's needed only for 3d,
 > 
 > Nope - I just played UT2004 without problems, and without intel-agp loaded.

Using the nvidia driver? It has its own built-in AGP support which will
get used if the kernel AGP support is missing.

 > > but it'd be good to figure out why it's so
 > > broken on your system, even if you don't need it.
 > 
 > Please tell me what to do / try.
 > I'm an experienced coder - but I never really played with the
 > kernel before - so you'll have to spell out how to turn on debugging etc.

Right now, I'm at a loss to explain the corruption, so it's
difficult to suggest what to try.

Dave

-- 
http://www.codemonkey.org.uk


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Dave Airlie


> Right now, I'm at a loss to explain the corruption, so it's
> difficult to suggest what to try.


The thing is here, this is PCIE, so if there is a GPU plugged into the
PCIE 16x slot in theory the main onboard graphics should disable, AGP
code is used to control the GART for the onboard chip, in this case a
plugged in card will  not use AGP, I wonder have Intel tested with a
pcie card in place...

Dave.


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Carlo Wood
On Mon, Jun 18, 2007 at 10:57:38AM +1000, Dave Airlie wrote:
> > Right now, I'm at a loss to explain the corruption, so it's
> > difficult to suggest what to try.
> 
> The thing is here, this is PCIE, so if there is a GPU plugged into the
> PCIE 16x slot in theory the main onboard graphics should disable, AGP
> code is used to control the GART for the onboard chip, in this case a
> plugged in card will  not use AGP, I wonder have Intel tested with a
> pcie card in place...

That is Chinese for me :/.
Do you want me to try something?

-- 
Carlo Wood <[EMAIL PROTECTED]>


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Dave Airlie


>> The thing is here, this is PCIE, so if there is a GPU plugged into the
>> PCIE 16x slot in theory the main onboard graphics should disable, AGP
>> code is used to control the GART for the onboard chip, in this case a
>> plugged in card will  not use AGP, I wonder have Intel tested with a
>> pcie card in place...
>
> That is Chinese for me :/.
> Do you want me to try something?



Well, it was more for davej's benefit. In theory, for your machine with
a PCIE graphics card you don't need agpgart enabled at all; granted, it
shouldn't screw up if it is.

Dave.


Re: [AGPGART] intel_agp: use table for device probe

2007-06-17 Thread Wang Zhenyu
On 2007.06.18 03:56:36 +, Carlo Wood wrote:
> On Mon, Jun 18, 2007 at 10:57:38AM +1000, Dave Airlie wrote:
> > > Right now, I'm at a loss to explain the corruption, so it's
> > > difficult to suggest what to try.
> > 
> > The thing is here, this is PCIE, so if there is a GPU plugged into the
> > PCIE 16x slot in theory the main onboard graphics should disable, AGP
> > code is used to control the GART for the onboard chip, in this case a
> > plugged in card will  not use AGP, I wonder have Intel tested with a
> > pcie card in place...

Agreed. We seem to always enable AGP even if the IGD is disabled or does
not exist; other cards should never depend on this module.

 
 That is Chinese for me :/.
 Do you want me to try something?

Carlo, I've just built latest kernel git tree on a Dell 965G box and
have a NV card plugged-in. It boots fine.

Linux agpgart interface v0.102 (c) Dave Jones
agpgart: Detected an Intel 965G Chipset.
agpgart: AGP aperture is 256M @ 0x0

I don't know why it hangs your machine when loading this module, it should
just not bother anything. But from your last "modprobe: ..." line, it seems
there's really badness somewhere, do you have serial console to see more
in the message?
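
For reference, a serial console is normally enabled through kernel boot parameters. The fragment below is a sketch that assumes the first serial port (ttyS0) at 115200 baud with a null-modem cable to a second machine; the port and speed are assumptions, not details from this thread.

```text
console=ttyS0,115200n8 console=tty0
```

Kernel messages go to every console listed, so the screen keeps working while the second machine captures everything printed up to the hang.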