Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-06 Thread Felipe Contreras
On Thu, Feb 4, 2010 at 11:23 PM, Andrew Morton
a...@linux-foundation.org wrote:
 On Thu, 4 Feb 2010 22:05:59 +0100
 Ingo Molnar mi...@elte.hu wrote:
 Regressions are not limited to 'same config' kernels, last i checked. If that
 has changed (or if i'm misunderstanding it) then it would be nice to hear a
 clarification about that from Linus.

 The way i understand it is that there are narrow exceptions from the
 regression rules, such as completely new drivers for which there can be no
 prior expectation of stability by users. (but for even them we are generally
 on the safer side to list bugs in them as regressions as well - especially if
 we expect many users to enable it.)

 AFAIK there's no exception for new sub-features of existing facilities or
 drivers, even if it's default-disabled.

 This issue materially affects quite a few bugs i'm handling as a maintainer.
 Many of them are under default-off config options - most new aspects to
 existing code are introduced in such a way. It would remove quite a bit of
 urgent-workload from my workflow if i could strike them from Rafael's list
 and could deprioritize them as plain bugs, to be fixed as time permits.

 IMHO it would be rather counter-productive to kernel quality if we did that
 kind of regression-lawyering though.

 Yes, it's mainly semantics.

 From the user's point of view

 kernel N: boots, works, plays nethack
 kernel N+1: goes splat

 That kernel regressed for that user.  He'll shrug and will go back to
 kernel N and we lost an N+1 tester.  And the distros who ship N+1 get a
 lot of hack work to do.

 If the feature is this buggy, it was wrong to make it accessible in Kconfig.

That's why some features are marked as _experimental_ and disabled by
default. If the feature is not marked as such, then yeah, I would
consider it a regression.

-- 
Felipe Contreras

--
The Planet: dedicated and managed hosting, cloud storage, colocation
Stay online with enterprise data centers and the best network in the business
Choose flexible plans and management services without long-term contracts
Personal 24x7 support from experience hosting pros just a phone call away.
http://p.sf.net/sfu/theplanet-com
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-05 Thread Dave Airlie

 If it now does not boot up if all its sub-options are enabled, even if some 
 of those sub-options are new, does that count as a driver regression? Sure it 
 does to me ...

But it doesn't to anyone else under any reasonable meaning of the word 
regression. 

The config option states
Choose this option if you want kernel modesetting enabled by default,
  and you have a new enough userspace to support this. Running old
  userspaces with this enabled will cause pain.

"Will cause pain" sounds painful to me, I can make it seem much worse if 
you'd like.

Dave.



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-05 Thread Ingo Molnar

* Dave Airlie airl...@linux.ie wrote:

 
  If it now does not boot up if all its sub-options are enabled, even if some 
  of those sub-options are new, does that count as a driver regression? Sure 
  it does to me ...
 
 But it doesn't to anyone else under any reasonable meaning of the word 
 regression.

There are reactions in this thread that contradict your 'anyone else' point.

 The config option states
 Choose this option if you want kernel modesetting enabled by default,
   and you have a new enough userspace to support this. Running old
   userspaces with this enabled will cause pain.
 
 "Will cause pain" sounds painful to me, I can make it seem much worse if 
 you'd like.

Except you are missing that the hang (and the first crash as well) happens on 
brand-new user-space just as much - not just on 'old userspaces'.

The bugs i've triggered are independent of any user-space component - it 
happens with a fresh distro just as much.

As i suggested before, at least the text should be updated to include what 
has been written about CONFIG_DRM_RADEON_KMS in this thread before:

  This is a completely new driver.  It's only part of the existing drm for 
  compatibility reasons.  It requires an entirely different graphics stack 
  above it and works very differently from the old drm stack.

Thanks,

Ingo



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-05 Thread Dave Airlie
On Fri, Feb 5, 2010 at 7:00 PM, Ingo Molnar mi...@elte.hu wrote:

 * Dave Airlie airl...@linux.ie wrote:


  If it now does not boot up if all its sub-options are enabled, even if some
  of those sub-options are new, does that count as a driver regression? Sure
  it does to me ...

 But it doesn't to anyone else under any reasonable meaning of the word
 regression.

 There are reactions in this thread that contradict your 'anyone else' point.

 The config option states
 Choose this option if you want kernel modesetting enabled by default,
           and you have a new enough userspace to support this. Running old
           userspaces with this enabled will cause pain.

 "Will cause pain" sounds painful to me, I can make it seem much worse if
 you'd like.

 Except you are missing that the hang (and the first crash as well) happens on
 brand-new user-space just as much - not just on 'old userspaces'.

 The bugs i've triggered are independent of any user-space component - it
 happens with a fresh distro just as much.

 As i suggested before, at least the text should be updated to include what
 has been written about CONFIG_DRM_RADEON_KMS in this thread before:

   This is a completely new driver.  It's only part of the existing drm for
   compatibility reasons.  It requires an entirely different graphics stack
   above it and works very differently from the old drm stack.


Okay I've attached a patch with a revised Kconfig in it.

Does this sound more like reality?

Dave.


0001-drm-radeon-kms-change-Kconfig-text-to-reflect-the-ne.patch
Description: Binary data


Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-05 Thread Ingo Molnar

* Dave Airlie airl...@gmail.com wrote:

 On Fri, Feb 5, 2010 at 7:00 PM, Ingo Molnar mi...@elte.hu wrote:
 
  * Dave Airlie airl...@linux.ie wrote:
 
 
   If it now does not boot up if all its sub-options are enabled, even if
   some of those sub-options are new, does that count as a driver regression?
   Sure it does to me ...
 
  But it doesn't to anyone else under any reasonable meaning of the word
  regression.
 
  There are reactions in this thread that contradict your 'anyone else' point.
 
  The config option states
  Choose this option if you want kernel modesetting enabled by default,
            and you have a new enough userspace to support this. Running old
            userspaces with this enabled will cause pain.
 
  "Will cause pain" sounds painful to me, I can make it seem much worse if
  you'd like.
 
  Except you are missing that the hang (and the first crash as well) happens
  on brand-new user-space just as much - not just on 'old userspaces'.
 
  The bugs i've triggered are independent of any user-space component - it
  happens with a fresh distro just as much.
 
  As i suggested before, at least the text should be updated to include what
  has been written about CONFIG_DRM_RADEON_KMS in this thread before:
 
    This is a completely new driver.  It's only part of the existing drm for
    compatibility reasons.  It requires an entirely different graphics stack
    above it and works very differently from the old drm stack.
 
 
 Okay I've attached a patch with a revised Kconfig in it.
 
 Does this sound more like reality?

Looks perfect to me!

Acked-by: Ingo Molnar mi...@elte.hu

Ingo



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Ingo Molnar

* Matthew Garrett mj...@srcf.ucam.org wrote:

 On Thu, Feb 04, 2010 at 08:17:05AM +0100, Ingo Molnar wrote:
 
  btw., i just found another bug activated via this same commit, a boot 
  hang after DRM init:
 
 The commit in question didn't cause the hang, so reverting it isn't the 
 appropriate fix.

Well, once i applied the revert i got no more hangs or crashes today, in lots 
of bootups. This is fully repeatable - if i re-apply that commit with the 
config i sent the hang happens again.

Ingo



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Linus Torvalds


On Thu, 4 Feb 2010, Ingo Molnar wrote:
 
 Well, once i applied the revert i got no more hangs or crashes today, in lots 
 of bootups. This is fully repeatable - if i re-apply that commit with the 
 config i sent the hang happens again.

But that's just because when it was in staging, you'd never enable it, 
right? Because your config generator always turns off staging stuff. IOW, 
it's not the code - it's more an interaction with how your test-bed works.

Linus



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Matthew Garrett
On Thu, Feb 04, 2010 at 08:17:05AM +0100, Ingo Molnar wrote:

 btw., i just found another bug activated via this same commit, a boot hang 
 after DRM init:

The commit in question didn't cause the hang, so reverting it isn't the 
appropriate fix.

-- 
Matthew Garrett | mj...@srcf.ucam.org



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Ingo Molnar

* Linus Torvalds torva...@linux-foundation.org wrote:

 On Thu, 4 Feb 2010, Ingo Molnar wrote:
 
  Well, once i applied the revert i got no more hangs or crashes today, in 
  lots of bootups. This is fully repeatable - if i re-apply that commit 
  with the config i sent the hang happens again.
 
 But that's just because when it was in staging, you'd never enable it, 
 right? Because your config generator always turns off staging stuff. IOW, 
 it's not the code - it's more an interaction with how your test-bed works.

Correct. Staging drivers can crash and there's no guarantee of quick 
regression fixes (although fixes are generally quick so this isnt a 
complaint), so i exclude them from boot testing.

Anyway, i've essentially moved it back to staging as far as my testing goes, 
that solves the problem too.

Thanks,

Ingo



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Alex Deucher
On Thu, Feb 4, 2010 at 1:12 PM, Ingo Molnar mi...@elte.hu wrote:

 * Matthew Garrett mj...@srcf.ucam.org wrote:

 On Thu, Feb 04, 2010 at 06:54:45PM +0100, Ingo Molnar wrote:

  But you could claim that it's not a regression because 1) technically the
  code got introduced in drivers/staging/, and staging drivers are not on
  the regression list 2) the Kconfig value is default-off so it can only
  harm those who got lured by a new Kconfig value popping up in -rc7 in a
  well working driver they already have enabled.
 
  So the moving of driver functionality from drivers/staging/ to drivers/
  is a grey area it appears. Wouldnt it have been better to do this in the
  next merge window, as all other drivers do? It's not new hardware
  enablement either, it's feature enablement for an existing driver.

 The reason the option was in staging (as has been mentioned before) was
 because the ABI wasn't felt to be stable enough. Upstream is now willing to
 commit to that stability, so now seems as good a time to move it as any.
 There's no code change and there's no default configuration change, so I
 really can't see any way that it can be classed as a regression.

 But that argument in essence renders the regression policy meaningless for
 such code: just about any new driver feature under the sun could be shaped as
 a Kconfig option, introduced via a drivers/staging Kconfig entry, and then
 activated via a twoliner commit in a later -rc.

 IMHO the point of tracking regressions is to reduce the bugginess of the
 kernel and thus to help users, not to give ground for legalistic arguments.

 There _are_ common-sense exceptions from the regression rules, such as the
 introduction of a new piece of hardware that was previously unsupported
 (hence there's no expectation of stability) - but the tweaking of an
 existing, widely used driver (even if the new option is default-off) hardly
 seems to qualify for that.


This is a completely new driver.  It's only part of the existing drm
for compatibility reasons.  It requires an entirely different graphics
stack above it and works very differently from the old drm stack.

Alex

 I dont mind making useful exceptions from rules, as long as we are honest
 about having done it.

 Anyway, i've bisected it back to that Kconfig change and i am able to work
 the crashes around by reverting that, so my immediate problems are solved.

 Thanks,

        Ingo



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Matthew Garrett
On Thu, Feb 04, 2010 at 06:08:26PM +0100, Ingo Molnar wrote:

 Well, once i applied the revert i got no more hangs or crashes today, in lots 
 of bootups. This is fully repeatable - if i re-apply that commit with the 
 config i sent the hang happens again.

If you leave the commit applied, use that config and then enable Radeon 
KMS under staging, do you get the hang?

-- 
Matthew Garrett | mj...@srcf.ucam.org



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Matthew Garrett
On Thu, Feb 04, 2010 at 07:12:18PM +0100, Ingo Molnar wrote:
 
 * Matthew Garrett mj...@srcf.ucam.org wrote:
  The reason the option was in staging (as has been mentioned before) was 
  because the ABI wasn't felt to be stable enough. Upstream is now willing to 
  commit to that stability, so now seems as good a time to move it as any. 
  There's no code change and there's no default configuration change, so I 
  really can't see any way that it can be classed as a regression.
 
 But that argument in essence renders the regression policy meaningless for 
 such code: just about any new driver feature under the sun could be shaped as 
 a Kconfig option, introduced via a drivers/staging Kconfig entry, and then 
 activated via a twoliner commit in a later -rc.

Before this patch, CONFIG_DRM_RADEON_KMS=y would crash your system on 
boot. After this patch, CONFIG_DRM_RADEON_KMS=y still crashes your 
system. There's certainly the argument that this means it's premature to 
make that change, but given that the same configuration behaves in the 
same way, it's clearly not a regression.

-- 
Matthew Garrett | mj...@srcf.ucam.org



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Ingo Molnar

* Matthew Garrett mj...@srcf.ucam.org wrote:

 On Thu, Feb 04, 2010 at 07:12:18PM +0100, Ingo Molnar wrote:
  
  * Matthew Garrett mj...@srcf.ucam.org wrote:
   The reason the option was in staging (as has been mentioned before) was
   because the ABI wasn't felt to be stable enough. Upstream is now willing
   to commit to that stability, so now seems as good a time to move it as
   any. There's no code change and there's no default configuration change,
   so I really can't see any way that it can be classed as a regression.
  
  But that argument in essence renders the regression policy meaningless for
  such code: just about any new driver feature under the sun could be shaped
  as a Kconfig option, introduced via a drivers/staging Kconfig entry, and
  then activated via a twoliner commit in a later -rc.
 
 Before this patch, CONFIG_DRM_RADEON_KMS=y would crash your system on boot. 
 [...]

Hm, in what way does that observation address the concerns i've outlined?

Before this patch i could enable CONFIG_DRM_RADEON_KMS=y only if i enabled 
CONFIG_STAGING, which i dont, because doing so would taint my kernel with 
TAINT_CRAP, and the kernel log would contain:

 %s: module is from the staging directory, the quality is unknown, you have 
been warned.,

 [...] After this patch, CONFIG_DRM_RADEON_KMS=y still crashes your system. 
 [...]

After this patch i suddenly get a new body of code with a default-off option 
that would only show up before if i had CONFIG_STAGING=y enabled before.

Do you see my argument why any user who is hit by this would categorize this 
as a kernel regression in an existing driver?

Moving driver functionality from drivers/staging/ to drivers/ might be 
justified, it might be pragmatic, but you dont try to justify it and you dont 
try to outline the pragmatic reasons - from all i can see you seem to argue 
that this is all perfectly fine in late -rc's, which has me worried somewhat.

[ And if that is really fine i'd like to hear Linus's amen on that as well, 
  because i'm sure others would like to use that mechanism too to enable
  new functionality in late -rc's. ]

Thanks,

Ingo



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Ingo Molnar

* Matthew Garrett mj...@srcf.ucam.org wrote:

 On Thu, Feb 04, 2010 at 07:56:03PM +0100, Ingo Molnar wrote:
 
  Do you see my argument why any user who is hit by this would categorize 
  this as a kernel regression in an existing driver?
 
 No. If a user changes configuration and gets a hang, that's a bug but not a 
 regression.

Only if it's some brand-new driver or a brand-new kernel feature for which 
no-one can have any prior expectations of stability. Especially if it's added 
in the merge window when many new drivers are added.

But isnt it a regression to a user if it's shipped in -rc7 appearing as a new 
sub-option of an existing driver?

I'd wager that most main-street Linux users would consider that a regression.
 
As i see it, you are trying to have it both ways: claim it's a new 
driver when it comes to handling regressions, but also try to have the 
benefits (and adoption flux) of an old driver when it comes to facing it to 
users.

Some info about that in the Kconfig would be helpful IMO - so that people are 
less surprised if it happens to break while the radeon driver worked fine for 
them before.

Thanks,

Ingo



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Linus Torvalds


On Thu, 4 Feb 2010, Alex Deucher wrote:
 
 And if it crashes, he'll report a bug and we'll fix it.

Ok, you have a bug-report. See earlier in the thread:

 btw., i just found another bug activated via this same commit, a boot hang 
 after DRM init:

 [9.858352] [drm] Connector 1:
 [9.861417] [drm]   DVI-I
 [9.864031] [drm]   HPD1
 [9.866562] [drm]   DDC: 0x64 0x64 0x64 0x64 0x64 0x64 0x64 0x64
 [9.872579] [drm]   Encoders:
 [9.875540] [drm] CRT2: INTERNAL_DAC2
 [9.879541] [drm] DFP1: INTERNAL_TMDS1
 [9.883646] [drm] Connector 2:
 [9.886695] [drm]   S-video
 [9.889483] [drm]   Encoders:
 [9.892463] [drm] TV1: INTERNAL_DAC2
 [9.896392] i2c i2c-0: master_xfer[0] W, addr=0x50, len=1
 [9.901796] i2c i2c-0: master_xfer[1] R, addr=0x50, len=128
 [9.909246] i2c i2c-0: NAK from device addr 0x50 msg #0
 [9.914564] i2c i2c-1: master_xfer[0] W, addr=0x50, len=1
 [9.919957] i2c i2c-1: master_xfer[1] R, addr=0x50, len=128
 [9.927413] i2c i2c-1: NAK from device addr 0x50 msg #0

So can we get it fixed, please?

Linus



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Jerome Glisse
On Thu, Feb 04, 2010 at 08:19:35PM +0100, Ingo Molnar wrote:
 
 * Matthew Garrett mj...@srcf.ucam.org wrote:
 
  On Thu, Feb 04, 2010 at 07:56:03PM +0100, Ingo Molnar wrote:
  
   Do you see my argument why any user who is hit by this would categorize 
   this as a kernel regression in an existing driver?
  
  No. If a user changes configuration and gets a hang, that's a bug but not a 
  regression.
 
 Only if it's some brand-new driver or a brand-new kernel feature for which 
 no-one can have any prior expectations of stability. Especially if it's added 
 in the merge window when many new drivers are added.
 
 But isnt it a regression to a user if it's shipped in -rc7 appearing as a new 
 sub-option of an existing driver?
 
 I'd wager that most main-street Linux users would consider that a regression.
  
 As i see it, you are trying to have it both ways: claim it's a new 
 driver when it comes to handling regressions, but also try to have the 
 benefits (and adoption flux) of an old driver when it comes to facing it to 
 users.

We have been treating KMS regressions as regressions. I have fixed numerous
regressions since it was first merged as a staging driver, and I keep doing so;
I try to be as responsive as I can. I am sorry you have had a bad experience
with it. I just wanted to add that we planned to move KMS out of staging in
2.6.33 a long time ago, and yes, maybe we should have done it earlier, but no
matter when we make the change you will still face this bug until we fix it.

On the issue-fixing front, one question: do you also enable radeonfb? If so,
then it's likely the root cause of this bug. I think Kconfig should forbid
having both radeon KMS and radeonfb, but I am not sure how allyesconfig behaves
with respect to such a constraint.
 
 Some info about that in the Kconfig would be helpful IMO - so that people are 
 less surprised if it happens to break while the radeon driver worked fine for 
 them before.
 
 Thanks,
 
   Ingo

I think make menuconfig has a more explicit message iirc.

Cheers,
Jerome



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Ingo Molnar

* Alex Deucher alexdeuc...@gmail.com wrote:

 On Thu, Feb 4, 2010 at 2:06 PM, Ingo Molnar mi...@elte.hu wrote:
 
  * Alex Deucher alexdeuc...@gmail.com wrote:
 
  On Thu, Feb 4, 2010 at 1:12 PM, Ingo Molnar mi...@elte.hu wrote:
  
   * Matthew Garrett mj...@srcf.ucam.org wrote:
  
   On Thu, Feb 04, 2010 at 06:54:45PM +0100, Ingo Molnar wrote:
  
    But you could claim that it's not a regression because 1) technically the
    code got introduced in drivers/staging/, and staging drivers are not on
    the regression list 2) the Kconfig value is default-off so it can only
    harm those who got lured by a new Kconfig value popping up in -rc7 in a
    well working driver they already have enabled.
   
    So the moving of driver functionality from drivers/staging/ to drivers/
    is a grey area it appears. Wouldnt it have been better to do this in the
    next merge window, as all other drivers do? It's not new hardware
    enablement either, it's feature enablement for an existing driver.
  
   The reason the option was in staging (as has been mentioned before) was
   because the ABI wasn't felt to be stable enough. Upstream is now willing to
   commit to that stability, so now seems as good a time to move it as any.
   There's no code change and there's no default configuration change, so I
   really can't see any way that it can be classed as a regression.
  
   But that argument in essence renders the regression policy meaningless for
   such code: just about any new driver feature under the sun could be shaped
   as a Kconfig option, introduced via a drivers/staging Kconfig entry, and
   then activated via a twoliner commit in a later -rc.
  
   IMHO the point of tracking regressions is to reduce the bugginess of the
   kernel and thus to help users, not to give ground for legalistic arguments.
  
   There _are_ common-sense exceptions from the regression rules, such as the
   introduction of a new piece of hardware that was previously unsupported
   (hence there's no expectation of stability) - but the tweaking of an
   existing, widely used driver (even if the new option is default-off) hardly
   seems to qualify for that.
  
 
  This is a completely new driver.  It's only part of the existing drm for
  compatibility reasons.  It requires an entirely different graphics stack
  above it and works very differently from the old drm stack.
 
  Will the user know? IMHO what matters in the end is user expectation.
 
  Lets walk through what a current kernel tester of the drm/radeon driver sees
  when he types 'make oldconfig' after installing the (to-be-released) .33-rc7
  kernel. Firstly, the user with a brand-new distro already has this enabled:
 
   CONFIG_DRM_RADEON=y
 
  and knows the driver, and it performs adequately. Then in -rc7 he gets a new
  option:
 
   ATI Radeon (DRM_RADEON) [Y/n/?] y
     Enable modesetting on radeon by default (DRM_RADEON_KMS) [N/y/?] (NEW)
 
  The user might easily go: Hey this is a driver i already have, and there's a
  new sub-option for this well-working driver. Sure, enable it, these kernel
  folks know what they are doing and i rarely see any crashes past -rc2
  kernels.
 
  Does this new option tell him what you just told me, that:
 
    This is a completely new driver. It's only part of the existing drm for
    compatibility reasons. It requires an entirely different graphics stack
    above it and works very differently from the old drm stack.
 
  ?
 
  It doesn't. Even if he types '?', it tells him:
 
   CONFIG_DRM_RADEON_KMS:
 
   Choose this option if you want kernel modesetting enabled by default,
   and you have a new enough userspace to support this. Running old
   userspaces with this enabled will cause pain.
 
  The user will likely go: cool, I have a fresh distro with recent Xorg, let's
  try it.
 
 
 And if it crashes, he'll report a bug and we'll fix it.

Nobody has reacted to my related boot hang bugreport yet - and it's detailed 
and fully reproducible (so i can test any proposed fixes as well in short 
order). I.e. my limited testing has triggered two separate bugs in the same 
driver - and this will show up in -rc7.

It might be all OK and no-one else will see trouble. Or past patterns might 
repeat themselves and i might simply be an early bird for trouble to come.

My (oft repeated) point is that adding new sub-features to existing drivers 
is not what we do in late -rc's: there's simply not enough time to shake out 
bugs/regressions in them.

We introduce new functionality to existing drivers in the merge window - in 
the two weeks following a stable kernel's release.

In late -rc's we only try to fix regressions. Sometimes we make exceptions 
for pragmatic reasons, but then we are straightforward about those reasons 
and try to warn users about our zeal to help them with cool, new, 
not-to-be-missed GPU functionality ;-)

Thanks,

Ingo


Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Dave Airlie
On Fri, Feb 5, 2010 at 5:24 AM, Linus Torvalds
torva...@linux-foundation.org wrote:


 On Thu, 4 Feb 2010, Alex Deucher wrote:

 And if it crashes, he'll report a bug and we'll fix it.

 Ok, you have a bug-report. See earlier in the thread:

 btw., i just found another bug activated via this same commit, a boot hang
 after DRM init:

 [    9.858352] [drm] Connector 1:
 [    9.861417] [drm]   DVI-I
 [    9.864031] [drm]   HPD1
 [    9.866562] [drm]   DDC: 0x64 0x64 0x64 0x64 0x64 0x64 0x64 0x64
 [    9.872579] [drm]   Encoders:
 [    9.875540] [drm]     CRT2: INTERNAL_DAC2
 [    9.879541] [drm]     DFP1: INTERNAL_TMDS1
 [    9.883646] [drm] Connector 2:
 [    9.886695] [drm]   S-video
 [    9.889483] [drm]   Encoders:
 [    9.892463] [drm]     TV1: INTERNAL_DAC2
 [    9.896392] i2c i2c-0: master_xfer[0] W, addr=0x50, len=1
 [    9.901796] i2c i2c-0: master_xfer[1] R, addr=0x50, len=128
 [    9.909246] i2c i2c-0: NAK from device addr 0x50 msg #0
 [    9.914564] i2c i2c-1: master_xfer[0] W, addr=0x50, len=1
 [    9.919957] i2c i2c-1: master_xfer[1] R, addr=0x50, len=128
 [    9.927413] i2c i2c-1: NAK from device addr 0x50 msg #0

 So can we get it fixed, please?

Ingo,

got the full dmesg? and lspci -vvnn

the bug reporting needs work, this snippet hasn't the useful info in it.

Dave.


                Linus



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Ingo Molnar

* Jesse Barnes jbar...@virtuousgeek.org wrote:

 On Thu, 4 Feb 2010 20:32:32 +0100
 Ingo Molnar mi...@elte.hu wrote:
  Nobody has reacted to my related boot hang bugreport yet - and it's
  detailed and fully reproducible (so i can test any proposed fixes as
  well in short order). I.e. my limited testing has triggered two
  separate bugs in the same driver - and this will show up in -rc7.
  
  It might be all OK and no-one else will see trouble. Or past patterns
  might repeat themselves and i might simply be an early bird for
  trouble to come.
  
  My (oft repeated) point is that adding new sub-features to existing
  drivers is not what we do in late -rc's: there's simply not enough
  time to shake out bugs/regressions in them.
  
  We introduce new functionality to existing drivers in the merge
  window - in the two weeks following a stable kernel's release.
 
 This is the .config issue right?  It doesn't sound like the bug is new, 
 you're just seeing it now because of the way you run tests.  It shouldn't 
 affect any more or fewer users than it did before, and reverting the move 
 radeon KMS out of staging won't fix the bug at all or prevent anyone from 
 seeing it.  People using KMS will still use KMS and people without it 
 won't, [...]

I think you are missing my point. My point is very simple: existing non-KMS 
users of CONFIG_DRM_RADEON=y (a pre-existing driver) might turn on the new 
sub-feature (CONFIG_DRM_RADEON_KMS=y), in the expectation that this is a safe 
addition to his currently well-working driver.

( I have to confess i do that all the time for drivers that work well for me, 
  and if it pops up in a late -rc i sure expect it to be safe to enable. I 
  dont even read the help text most of the time - if the single-line summary 
  sounds useful i enable it. Especially if the Kconfig help entry says it's 
  safe with a new distro, it's not CONFIG_EXPERIMENTAL, it's not marked 
  CONFIG_BROKEN, it's not in CONFIG_STAGING, etc. )
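
To make that scenario concrete, here is a minimal sketch of how one could check whether a given .config has the new sub-option turned on while staging is off. The helper name `enabled_options` and the sample config text are hypothetical, made up for illustration; only the `CONFIG_X=y` / `# CONFIG_X is not set` line format is taken from how .config files are written.

```python
import re

def enabled_options(config_text):
    """Return the set of CONFIG_* symbols set to 'y' or 'm' in .config text."""
    pattern = re.compile(r"^(CONFIG_[A-Za-z0-9_]+)=([ym])$")
    found = set()
    for line in config_text.splitlines():
        match = pattern.match(line.strip())
        if match:
            found.add(match.group(1))
    return found

# Hypothetical .config fragment for the scenario described above.
sample_config = """\
CONFIG_DRM=y
CONFIG_DRM_RADEON=y
CONFIG_DRM_RADEON_KMS=y
# CONFIG_STAGING is not set
"""

opts = enabled_options(sample_config)

# True when KMS is enabled even though the user never opted into staging.
kms_outside_staging = (
    "CONFIG_DRM_RADEON_KMS" in opts and "CONFIG_STAGING" not in opts
)
print(kms_outside_staging)
```

With the sample fragment, the user has KMS enabled without ever having said yes to CONFIG_STAGING, which is exactly the case being argued about.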

That action might hang or crash his kernel, and if that user then reports:

   "Hey, -rc7 just hung on me after enabling this new .config option it
    offered for the radeon driver i am using, please add this to the list of
    regressions."

is this really the right kind of reply:

   "Since we moved it from drivers/staging/ to drivers/ this hang you are
    seeing is technically not a regression, we might or might not fix it."

?

I doubt the user would be overly enthusiastic about that kind of reply ;-)

Guys, you should really _think_ about it a minute and realize what the 
purpose of a regression policy is.

It's not to be a PITA to subsystem maintainers, it's not an annoyance just to 
keep you from doing cool stuff. It's not something which you should try to 
lawyer your way out of via an as narrow interpretation as you can.

A regression policy is something that generally helps the quality of Linux, 
so it's worth interpreting broadly and generously in spirit not just in 
letter. If there's a single most prominent complaint i hear about the 
upstream kernel, it is that it breaks too often. (right after 'it doesnt support 
my graphics hardware' - so i sure can relate to the pragmatic reasons of 
pushing KMS strongly!)

If i run into a crash and a hang, you can bet that others will as well.

Ingo



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Jesse Barnes
On Thu, 4 Feb 2010 20:32:32 +0100
Ingo Molnar mi...@elte.hu wrote:
 Nobody has reacted to my related boot hang bugreport yet - and it's
 detailed and fully reproducible (so i can test any proposed fixes as
 well in short order). I.e. my limited testing has triggered two
 separate bugs in the same driver - and this will show up in -rc7.
 
 It might be all OK and no-one else will see trouble. Or past patterns
 might repeat themselves and i might simply be an early bird for
 trouble to come.
 
 My (oft repeated) point is that adding new sub-features to existing
 drivers is not what we do in late -rc's: there's simply not enough
 time to shake out bugs/regressions in them.
 
 We introduce new functionality to existing drivers in the merge
 window - in the two weeks following a stable kernel's release.

This is the .config issue right?  It doesn't sound like the bug is new,
you're just seeing it now because of the way you run tests.  It
shouldn't affect any more or fewer users than it did before, and
reverting the move radeon KMS out of staging won't fix the bug at all
or prevent anyone from seeing it.  People using KMS will still use KMS
and people without it won't, because no one actually uses allyes
configs, and the option defaults to N anyway.

 In late -rc's we only try to fix regressions. Sometimes we make
 exceptions for pragmatic reasons, but then we are straightforward
 about those reasons and try to warn users about our zeal to help them
 with cool, new, not-to-be-missed GPU functionality ;-)

Agree, I just don't think this is a regression or an exception.

-- 
Jesse Barnes, Intel Open Source Technology Center



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Ingo Molnar

* Jerome Glisse gli...@freedesktop.org wrote:

 On Thu, Feb 04, 2010 at 08:19:35PM +0100, Ingo Molnar wrote:
  
  * Matthew Garrett mj...@srcf.ucam.org wrote:
  
   On Thu, Feb 04, 2010 at 07:56:03PM +0100, Ingo Molnar wrote:
   
    Do you see my argument why any user who is hit by this would categorize
    this as a kernel regression in an existing driver?
   
   No. If a user changes configuration and gets a hang, that's a bug but not
   a regression.
  
  Only if it's some brand-new driver or a brand-new kernel feature for which
  no-one can have any prior expectations of stability. Especially if it's
  added in the merge window when many new drivers are added.
  
  But isnt it a regression to a user if it's shipped in -rc7 appearing as a
  new sub-option of an existing driver?
  
  I'd wager that most main-street Linux users would consider that a
  regression.
   
  As i see it, you are trying to have it both ways: claim it's a new
  driver when it comes to handling regressions, but also try to have the
  benefits (and adoption flux) of an old driver when it comes to facing it to
  users.
 
 We have been treating KMS regression as regression, [...]

Great!

 [...] i fixed numerous regressions since it was first merged as a staging 
 driver, and i keep doing so, i try to be as reactive as i can. I am 
 sorry you have a bad experience about it. I just wanted to add that we 
 planned to move KMS out of staging in 2.6.33 a long time ago and yes maybe we 
 should have done it earlier, but no matter when we do the change you will 
 still face this bug until we fix it.

I dont think you'll ever see me complain about a bug. I dont, and i introduce 
far too many of them to have any moral basis for complaint in any case ;-)

I only questioned the validity of this initial reaction by Dave Airlie:

 | Its not enabled by default so reverting this doesn't make much sense.
 |
 | We can just treat this as a normal driver bugreport.

 So on the issue-fixing front, one question: do you also enable radeonfb? If 
 so then it's likely the root issue of this bug. I think kconfig should 
 forbid having both radeon kms + radeonfb but i am not sure how allyesconfig 
 behaves with respect to such a constraint.

Please see the bug report i made in this thread, under the following subject:

  hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

It has all that info and more. (i've bounced it to you privately as well)

Ingo



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Ingo Molnar

* Jesse Barnes jbar...@virtuousgeek.org wrote:

[...]
  That action might hang or crash his kernel, and if that user then
  reports:
  
     "Hey, -rc7 just hung on me after enabling this new .config option
      it offered for the radeon driver i am using, please add this to the
      list of regressions."
  
  is this really the right kind of reply:
  
     "Since we moved it from drivers/staging/ to drivers/ this hang you
      are seeing is technically not a regression, we might or might not fix
      it."
  
  ?
  
  I doubt the user would be overly enthusiastic about that kind of
  reply ;-)
 
 Whether or not it's a regression is mostly irrelevant, it's a real bug and 
 the radeon guys are working on fixing it. [...]

Fortunately it's being worked on.

I beg to differ with your argument about it not mattering whether a bug is 
categorized as a regression: Rafael's regression list is far more prominent 
and the bugs listed there get fixed with a high likelihood.

Note that there's clear evidence of that in this very thread: the hang bug 
was ignored as a plain DRM non-regression bugreport, _despite_ the prior 
scrutiny in the thread, up to the moment Linus pointed it out and turned it 
into a de-facto regression ...

There's also another purpose of categorizing bugs as regressions: tester 
timeliness. We tend to treat bugs as 'plain bugs' when they are reported too 
late after a few kernel releases of the bug having been in the wild. We do 
this to encourage testers to test earlier -rc's as well, as there's a real 
tangible benefit of the 'we dont do regressions' policy: bugs get fixed and 
testers feel involved, and it's also the stage where we _can_ fix bugs with a 
lower cost to all parties involved.

But what 'timeliness of testing' can there be if new features are added in a 
late -rc and bugs are explicitly categorized as 'not a regression'?

Ingo



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Ingo Molnar

* Matthew Garrett mj...@srcf.ucam.org wrote:

 On Thu, Feb 04, 2010 at 09:22:54PM +0100, Ingo Molnar wrote:
 
     "Hey, -rc7 just hung on me after enabling this new .config option it
      offered for the radeon driver i am using, please add this to the list
      of regressions."
 
 If the same configuration options hang on both an old kernel and a new 
 kernel, how is that in any plausible way a regression? What's regressed?

Regressions are not limited to 'same config' kernels, last i checked. If that 
has changed (or if i'm misunderstanding it) then it would be nice to hear a 
clarification about that from Linus.

The way i understand it is that there are narrow exceptions from the 
regression rules, such as completely new drivers for which there can be no 
prior expectation of stability by users. (but for even them we are generally 
on the safer side to list bugs in them as regressions as well - especially if 
we expect many users to enable it.)

AFAIK there's no exception for new sub-features of existing facilities or 
drivers, even if it's default-disabled.

This issue materially affects quite a few bugs i'm handling as a maintainer. 
Many of them are under default-off config options - most new aspects to 
existing code are introduced in such a way. It would remove quite a bit of 
urgent-workload from my workflow if i could strike them from Rafael's list 
and could deprioritize them as plain bugs, to be fixed as time permits.

IMHO it would be rather counter-productive to kernel quality if we did that 
kind of regression-lawyering though.

Ingo



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Andrew Morton
On Thu, 4 Feb 2010 22:05:59 +0100
Ingo Molnar mi...@elte.hu wrote:

 
 * Matthew Garrett mj...@srcf.ucam.org wrote:
 
  On Thu, Feb 04, 2010 at 09:22:54PM +0100, Ingo Molnar wrote:
  
      "Hey, -rc7 just hung on me after enabling this new .config option it
       offered for the radeon driver i am using, please add this to the list
       of regressions."
  
  If the same configuration options hang on both an old kernel and a new 
  kernel, how is that in any plausible way a regression? What's regressed?
 
 Regressions are not limited to 'same config' kernels, last i checked. If that 
 has changed (or if i'm misunderstanding it) then it would be nice to hear a 
 clarification about that from Linus.
 
 The way i understand it is that there are narrow exceptions from the 
 regression rules, such as completely new drivers for which there can be no 
 prior expectation of stability by users. (but for even them we are generally 
 on the safer side to list bugs in them as regressions as well - especially if 
 we expect many users to enable it.)
 
 AFAIK there's no exception for new sub-features of existing facilities or 
 drivers, even if it's default-disabled.
 
 This issue materially affects quite a few bugs i'm handling as a maintainer. 
 Many of them are under default-off config options - most new aspects to 
 existing code are introduced in such a way. It would remove quite a bit of 
 urgent-workload from my workflow if i could strike them from Rafael's list 
 and could deprioritize them as plain bugs, to be fixed as time permits.
 
 IMHO it would be rather counter-productive to kernel quality if we did that 
 kind of regression-lawyering though.
 

Yes, it's mainly semantics.

From the user's point of view

kernel N: boots, works, plays nethack
kernel N+1: goes splat

That kernel regressed for that user.  He'll shrug and will go back to
kernel N and we lost an N+1 tester.  And the distros who ship N+1 get a
lot of hack work to do.

If the feature is this buggy, it was wrong to make it accessible in Kconfig.


Anyway.  The number of DRI regressions which have come in over the past
few weeks is really quite extraordinary.  We're now showing 31 open
DRI regressions in bugzilla, but a lot of those are presumably
defunct.

It's been bad ever since the KMS stuff went in.  That's understandable
given the magnitude of the change, I guess, but the wheels really seem
to have fallen off in 2.6.32 and 2.6.33.




Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Dave Airlie
On Fri, Feb 5, 2010 at 7:23 AM, Andrew Morton a...@linux-foundation.org wrote:
 On Thu, 4 Feb 2010 22:05:59 +0100
 Ingo Molnar mi...@elte.hu wrote:


 * Matthew Garrett mj...@srcf.ucam.org wrote:

  On Thu, Feb 04, 2010 at 09:22:54PM +0100, Ingo Molnar wrote:
 
      "Hey, -rc7 just hung on me after enabling this new .config option it
       offered for the radeon driver i am using, please add this to the list
       of regressions."
 
  If the same configuration options hang on both an old kernel and a new
  kernel, how is that in any plausible way a regression? What's regressed?

 Regressions are not limited to 'same config' kernels, last i checked. If that
 has changed (or if i'm misunderstanding it) then it would be nice to hear a
 clarification about that from Linus.

 The way i understand it is that there are narrow exceptions from the
 regression rules, such as completely new drivers for which there can be no
 prior expectation of stability by users. (but for even them we are generally
 on the safer side to list bugs in them as regressions as well - especially if
 we expect many users to enable it.)

 AFAIK there's no exception for new sub-features of existing facilities or
 drivers, even if it's default-disabled.

 This issue materially affects quite a few bugs i'm handling as a maintainer.
 Many of them are under default-off config options - most new aspects to
 existing code are introduced in such a way. It would remove quite a bit of
 urgent-workload from my workflow if i could strike them from Rafael's list
 and could deprioritize them as plain bugs, to be fixed as time permits.

 IMHO it would be rather counter-productive to kernel quality if we did that
 kind of regression-lawyering though.


 Yes, it's mainly semantics.

 From the user's point of view

 kernel N: boots, works, plays nethack
 kernel N+1: goes splat

 That kernel regressed for that user.  He'll shrug and will go back to
 kernel N and we lost an N+1 tester.  And the distros who ship N+1 get a
 lot of hack work to do.

If they used the same .config and it breaks, then it's a regression;
if not, it's not. Both the Intel and radeon KMS enable options are also
quite clear on the fact that they'll break your userspace, so I'd hope
people are reading them.


 If the feature is this buggy, it was wrong to make it accessible in Kconfig.

The bug was identified after we enabled the option, we have no record
of a similar problem occurring in Fedora or Ubuntu bug trackers, and my
future sight is broken.


 Anyway.  The number of DRI regressions which have come in over the past
 few weeks is really quite extraordinary.  We're now showing 31 open
 DRI regressions in bugzilla, but a lot of those are presumably
 defunct.

 It's been bad ever since the KMS stuff went in.  That's understandable
 given the magnitude of the change, I guess, but the wheels really seem
 to have fallen off in 2.6.32 and 2.6.33.


It's not surprising. Also, Intel vs Radeon KMS is a big distinction: the
core KMS code hasn't seen much in the way of problems, it's driver related.

The problem is the kernel is now exposed to the sort of things we've had
in userspace for years: graphics drivers are hard. Add the interactions
with ACPI, crazy BIOS writers, SMMs, suspend/resume, power management and
it just is really really messy.

I know in the Intel driver we've been backing out a lot of the new
features as soon as we can identify if the hw or sw is at fault and I've
been pushing the Intel guys to keep on top of the regression list better,
hopefully they are doing so.

Also things like the idr change that just bounced in/out broke all of
the GEM drivers, along with AGP changes in the x86 tree that broke shit.

Dave.



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Matthew Garrett
On Thu, Feb 04, 2010 at 09:22:54PM +0100, Ingo Molnar wrote:

    "Hey, -rc7 just hung on me after enabling this new .config option it
     offered for the radeon driver i am using, please add this to the list of
     regressions."

If the same configuration options hang on both an old kernel and a new 
kernel, how is that in any plausible way a regression? What's regressed?

-- 
Matthew Garrett | mj...@srcf.ucam.org



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Matthew Garrett
On Thu, Feb 04, 2010 at 10:05:59PM +0100, Ingo Molnar wrote:

 Regressions are not limited to 'same config' kernels, last i checked. If that 
 has changed (or if i'm misunderstanding it) then it would be nice to hear a 
 clarification about that from Linus.

If an option has *never* worked on a given configuration, then it's not 
a regression. That's not a matter of taste, it's a matter of language. 
We prioritise regressions because they mean that someone's previously 
working configuration no longer works. Tying the word regression to 
other bugs just to get someone to look at them faster is 
counterproductive.
 
-- 
Matthew Garrett | mj...@srcf.ucam.org



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread david
On Thu, 4 Feb 2010, Ingo Molnar wrote:

 * Jesse Barnes jbar...@virtuousgeek.org wrote:

 On Thu, 4 Feb 2010 20:32:32 +0100
 Ingo Molnar mi...@elte.hu wrote:
 Nobody has reacted to my related boot hang bugreport yet - and it's
 detailed and fully reproducible (so i can test any proposed fixes as
 well in short order). I.e. my limited testing has triggered two
 separate bugs in the same driver - and this will show up in -rc7.

 It might be all OK and no-one else will see trouble. Or past patterns
 might repeat themselves and i might simply be an early bird for
 trouble to come.

 My (oft repeated) point is that adding new sub-features to existing
 drivers is not what we do in late -rc's: there's simply not enough
 time to shake out bugs/regressions in them.

 We introduce new functionality to existing drivers in the merge
 window - in the two weeks following a stable kernel's release.

 This is the .config issue right?  It doesn't sound like the bug is new,
 you're just seeing it now because of the way you run tests.  It shouldn't
 affect any more or fewer users than it did before, and reverting the move
 radeon KMS out of staging won't fix the bug at all or prevent anyone from
 seeing it.  People using KMS will still use KMS and people without it
 won't, [...]

 I think you are missing my point. My point is very simple: existing non-KMS
 users of CONFIG_DRM_RADEON=y (a pre-existing driver) might turn on the new
 sub-feature (CONFIG_DRM_RADEON_KMS=y), in the expectation that this is a safe
 addition to his currently well-working driver.

 ( I have to confess i do that all the time for drivers that work well for me,
  and if it pops up in a late -rc i sure expect it to be safe to enable. I
  dont even read the help text most of the time - if the single-line summary
  sounds useful i enable it. Especially if the Kconfig help entry says it's
  safe with a new distro, it's not CONFIG_EXPERIMENTAL, it's not marked
  CONFIG_BROKEN, it's not in CONFIG_STAGING, etc. )

forget the people testing the rc kernels, what about people who will see 
this in the released kernel?

David Lang



Re: hung bootup with drm/radeon/kms: move radeon KMS on/off switch out of staging.

2010-02-04 Thread Ingo Molnar

* Matthew Garrett mj...@srcf.ucam.org wrote:

 On Thu, Feb 04, 2010 at 10:05:59PM +0100, Ingo Molnar wrote:
 
  Regressions are not limited to 'same config' kernels, last i checked. If
  that has changed (or if i'm misunderstanding it) then it would be nice to
  hear a clarification about that from Linus.
 
 If an option has *never* worked on a given configuration, then it's not a 
 regression. [...]

The *main driver* (CONFIG_DRM_RADEON=y, as far as the user is concerned) is 
many years old and it certainly worked just fine for tens of thousands of 
test iterations on this box, up until that commit.

That box alone has done in excess of half a million boot iterations in the 
past 2+ years. About 28% of the tests had CONFIG_DRM_RADEON=y, so the number 
of successful bootups with CONFIG_DRM_RADEON=y is in excess of one hundred 
thousand. There was not a single failed bootup in those two years due to the 
CONFIG_DRM_RADEON driver that i can remember.
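
As a rough cross-check of that arithmetic (a sketch only: the inputs are the approximate lower-bound figures quoted above, and the variable names are made up for illustration):

```python
# Approximate figures from the paragraph above; both are rough lower bounds.
total_boots = 500_000    # "in excess of half a million boot iterations"
radeon_share = 0.28      # "about 28% of the tests had CONFIG_DRM_RADEON=y"

# Estimated number of boots that had the radeon driver enabled.
radeon_boots = round(total_boots * radeon_share)

print(radeon_boots)            # 140000
print(radeon_boots > 100_000)  # True
```

So "in excess of one hundred thousand" is, if anything, a conservative reading of those two numbers.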

If it now does not boot up if all its sub-options are enabled, even if some 
of those sub-options are new, does that count as a driver regression? Sure it 
does to me ...

IMHO you are trying to put a narrow technical distinction into it which does 
not exist for users. You argue that it's a new default-off sub-option of an 
existing driver, hence it cannot be a regression. The option shows up as a 
sub-option to an existing driver in make oldconfig, with a fairly innocuous 
sounding help text, so to a user it certainly looks as if it's one unit and 
that it is the radeon driver that regressed.

*If* we make a driver feature available in a popular driver and make it 
Kconfig visible without obvious warnings (i.e. lure the user to enable it 
with the notion that it's just one coherent unit with a trusted, existing 
driver), we should also hold up the other side of the deal and treat the 
*bugs* as a coherent unit as well.

I.e. treat the driver as a coherent whole not only when it's convenient to 
us, but also when it's somewhat inconvenient to do. We cannot have it both 
ways.

Ingo
