Re: Plain 2.4.5 VM... (and 2.4.5-ac3)

2001-05-31 Thread Benjamin Redelings I

Vincent Stemen wrote:
 > The problem is, that's not true.  These problems are not slipping
 > through because of lack of testers.
Just to add some sanity to this thread, I have been using the 2.4.x
kernels ever since they came out, on my personal workstation and on some
workstations that I administer for fellow students in my department
here at UCLA.  They have basically worked fine for me.  They are not
perfect, but many of the 2.4.x releases have been a big improvement over
the 2.2.x releases.  For one, 2.4.x can actually tell which pages are
not used and swap out unused daemons, which helps a lot on a 64MB box :)

-BenR
-- 
Einstein did not prove that everything is relative.
Einstein explained how the speed of light could be constant.
Benjamin Redelings I  <>< http://www.bol.ucla.edu/~bredelin/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Plain 2.4.5 VM... (and 2.4.5-ac3)

2001-05-30 Thread Mike Galbraith

On Wed, 30 May 2001, Vincent Stemen wrote:

> On Wednesday 30 May 2001 15:17, Mike Galbraith wrote:
> > On Wed, 30 May 2001, Vincent Stemen wrote:
> > > On Wednesday 30 May 2001 01:02, Mike Galbraith wrote:
> > > > On Tue, 29 May 2001, Vincent Stemen wrote:
> > > > > On Tuesday 29 May 2001 15:16, Alan Cox wrote:
> > > > > > > a reasonably stable release until 2.2.12.  I do not understand
> > > > > > > why code with such serious reproducible problems is being
> > > > > > > introduced into the even numbered kernels.  What happened to
> > > > > > > the plan to use only the
> > > > > >
> > > > > > Who said it was introduced ?? It was more 'lurking' than
> > > > > > introduced. And unfortunately nobody really pinned it down in
> > > > > > 2.4test.
> > > > >
> > > > > I fail to see the distinction.  First of all, why was it ever
> > > > > released as 2.4-test?  That question should probably be directed at
> > > > > Linus.  If it is not fully tested, then it should be released it as
> > > > > an odd number.  If it already existed in the odd numbered
> > > > > development kernel and was known, then it should have never been
> > > > > released as a production kernel until it was resolved.  Otherwise,
> > > > > it completely defeats the purpose of having the even/odd numbering
> > > > > system.
> > > > >
> > > > > I do not expect bugs to never slip through to production kernels,
> > > > > but known bugs that are not trivial should not, and serious bugs
> > > > > like these VM problems especially should not.
> > > >
> > > > And you can help prevent them from slipping through by signing up as
> > > > a shake and bake tester.  Indeed, you can make your expectations
> > > > reality absolutely free of charge,  and or compensation
> > > >  what a bargain!
> > > >
> > > > X ___ ;-)
> > > >
> > > > -Mike
> > >
> > > The problem is, that's not true.  These problems are not slipping
> > > through because of lack of testers.  As Alan said, the VM problem has
> >
> > Sorry, that's a copout.  You (we) had many chances to notice.  Don't
> > push the problems back onto developers.. it's our problem.
> >
>
> How is that a copout?  The problem was noticed.  I am only suggesting
> that we not be in such a hurry to put code in the production kernels
> until we are pretty sure it works well enough, and that we release
> major production versions more often so that they do not contain 2 or
> 3 years worth of new code making it so hard to debug.  We probably
> should have had 2 or 3 code freezes and production releases since
> 2.2.x.  As I mentioned in a previous posting, this way we do not have
> to run a 2 or 3 year old kernel in order to have reasonable stability.

I don't think you or I can do a better job of release management than
Linus and friends, so there's no point in us discussing it.  If you
want to tell Linus, Alan et al how to do it 'right', you go do that.

-Mike




Re: Plain 2.4.5 VM... (and 2.4.5-ac3)

2001-05-30 Thread Vincent Stemen

On Wednesday 30 May 2001 15:17, Mike Galbraith wrote:
> On Wed, 30 May 2001, Vincent Stemen wrote:
> > On Wednesday 30 May 2001 01:02, Mike Galbraith wrote:
> > > On Tue, 29 May 2001, Vincent Stemen wrote:
> > > > On Tuesday 29 May 2001 15:16, Alan Cox wrote:
> > > > > > a reasonably stable release until 2.2.12.  I do not understand
> > > > > > why code with such serious reproducible problems is being
> > > > > > introduced into the even numbered kernels.  What happened to
> > > > > > the plan to use only the
> > > > >
> > > > > Who said it was introduced ?? It was more 'lurking' than
> > > > > introduced. And unfortunately nobody really pinned it down in
> > > > > 2.4test.
> > > >
> > > > I fail to see the distinction.  First of all, why was it ever
> > > > released as 2.4-test?  That question should probably be directed at
> > > > Linus.  If it is not fully tested, then it should be released it as
> > > > an odd number.  If it already existed in the odd numbered
> > > > development kernel and was known, then it should have never been
> > > > released as a production kernel until it was resolved.  Otherwise,
> > > > it completely defeats the purpose of having the even/odd numbering
> > > > system.
> > > >
> > > > I do not expect bugs to never slip through to production kernels,
> > > > but known bugs that are not trivial should not, and serious bugs
> > > > like these VM problems especially should not.
> > >
> > > And you can help prevent them from slipping through by signing up as
> > > a shake and bake tester.  Indeed, you can make your expectations
> > > reality absolutely free of charge,  and or compensation
> > >  what a bargain!
> > >
> > > X ___ ;-)
> > >
> > >   -Mike
> >
> > The problem is, that's not true.  These problems are not slipping
> > through because of lack of testers.  As Alan said, the VM problem has
>
> Sorry, that's a copout.  You (we) had many chances to notice.  Don't
> push the problems back onto developers.. it's our problem.
>

How is that a copout?  The problem was noticed.  I am only suggesting
that we not be in such a hurry to put code in the production kernels
until we are pretty sure it works well enough, and that we release
major production versions more often so that they do not contain 2 or
3 years' worth of new code, which makes them so hard to debug.  We probably
should have had 2 or 3 code freezes and production releases since
2.2.x.  As I mentioned in a previous posting, this way we do not have
to run a 2 or 3 year old kernel in order to have reasonable stability.

> > Here are some of the problems I see:
> >
> > There was far to long of a stretch with to much code dumped into both
> > the 2.2 and 2.4 kernels before release.  There needs to be a smaller
> > number changes between major releases so that they can be more
> > thoroughly tested and debugged.  In the race to get it out there they
> > are making the same mistakes as Microsoft, releasing production
> > kernels with known serious bugs because it is taking to long and they
> > want to move on forward.  I enjoy criticizing Microsoft so much for
> > the same thing that I do not want to have to stop in order to not
> > sound hypocritical :-).  The Linux community has built a lot of it's
> > reputation on not making these mistakes.  Please lets try not to
> > destroy that.
> >
> > They are disregarding the even/odd versioning system.
> > For example:
> > There was a new 8139too driver added to the the 2.4.5 (I think) kernel
> > which Alan Cox took back out and reverted to the old one in his
> > 2.4.5-ac? versions because it is apparently causing lockups.
> > Shouldn't this new driver have been released in a 2.5.x development
> > kernel and proven there before replacing the one in the production
> > kernel?  I haven't even seen a 2.5.x kernel released yet.
> >
> > Based on Linus's original very good plan for even/odd numbers, there
> > should not have been 2.4.0-test? kernels either.  This was another
> > example of the rush to increment to 2.4 long before it was ready.
> > There was a long stretch of test kernels and and now we are all the
> > way to 2.4.5 and it is still not stable.  We are repeating the 2.2.x
> > process all over again.  It should have been 2.3.x until the
> > production release was ready.  If they needed to distinguish a code
> > freeze for final testing, it could be done with a 4th version
> > component (2.3.xx.xx), where the 4 component is incremented for final
> > bug fixes.
>
> Sorry, I disagree with every last bit.  Either you accept a situation
> or you try to do something about it.
>
>   -Mike

I am spending a lot of time testing new kernels, reporting bugs and
offering suggestions that I think may improve on the stability of
production kernels.  Is this not considered doing something about it?
It is necessary to point out where one sees a problem in order to
offer possible solutions for improvement. 



- Vincent 


Re: Plain 2.4.5 VM... (and 2.4.5-ac3)

2001-05-30 Thread Vincent Stemen

On Wednesday 30 May 2001 15:30, Rik van Riel wrote:
> On Wed, 30 May 2001, Vincent Stemen wrote:
> > The problem is, that's not true.  These problems are not slipping
> > through because of lack of testers.  As Alan said, the VM problem has
> > been lurking, which means that it was known already.
>
> Fully agreed, it went through because of a lack of hours
> per day and the fact that the priority of developers was
> elsewhere.
>
> For me, for example, the priorities have mostly been with
> bugs that bothered me or that bothered Conectiva's customers.
>
> If you _really_ feel this strongly about the bug, you could
> either try to increase the number of hours a day for all of

I sure wish I could :-).

> us or you could talk to my boss about hiring me as a consultant
> to fix the problem for you on an emergency basis :)
> The other two alternatives would be either waiting until
> somebody gets around to fixing the bug or sending in a patch
> yourself.
>
> Trying to piss off developers has adverse effect on all four
> of the methods above :)
>

Why should my comments piss anybody off?  I am just trying to point
out a problem, as I see it, and offer suggestions for improvement.
Other developers will either agree with me or they won't.
Contributions are not made only through writing code.  I contribute
through code, bug reports, ideas, and suggestions.  I would love to
dive in and try to help fix some of the kernel problems but my hands
are just too full right now.

My comments are not meant to rush anybody and I am not criticizing how
long it is taking.  I know everybody is doing everything they can just
like I am, and they are doing a terrific job.  I am just suggesting a
modification to the way the kernels are distributed that is more like
the early versions that I hoped would allow us to maintain a stable
kernel for distributions and production machines.

- Vincent Stemen




Re: Plain 2.4.5 VM... (and 2.4.5-ac3)

2001-05-30 Thread Vincent Stemen

Ronald Bultje writes:
 > On 30 May 2001 14:58:57 -0500, Vincent Stemen wrote:
 > > There was a new 8139too driver added to the the 2.4.5 (I think) kernel
 > > which Alan Cox took back out and reverted to the old one in his
 > > 2.4.5-ac? versions because it is apparently causing lockups.
 > > Shouldn't this new driver have been released in a 2.5.x development
 > > kernel and proven there before replacing the one in the production
 > > kernel?  I haven't even seen a 2.5.x kernel released yet.
 > 
 > If every driver has to go thorugh the complete development cycle (of 2+
 > years), I'm sure very little driver writers will be as motivated as they
 > are now - it takes ages before they see their efforts "rewarded" with a
 > place in the kernel.
 > The ideal case is that odd-numbered kernels are "for testing" and
 > even-numbered kernels are stable. However, this is only theory. In
 > practice, you can't rule out all bugs. And you can't test all things for
 > all cases and every test case, the linux community doesn't have the
 > manpower for that. And to prevent a complete driver development cycle
 > taking 2+ years, you have to compromise.
 > 
 > If you would take 2+ years for a single driver development cycle, nobody
 > would be interested in linux since the new devices would only be
 > supported by a stable kernel two years after their release. See the
 > point? To prevent that, you need to compromise. and thus, sometimes, you
 > have some crashes.

I agree with everything you say up till this point, but you are
arguing against a point I never made.  First of all, bugs like the
8139too lockup were found within the first day or two of release in the
2.4.3 kernel.  Also, most show stopper bugs such as the VM problems
are found fairly quickly.  Even if it takes a long time to figure out
how to fix them, I do not think they should be pushed on through into
production kernels until they are fixed.  I already said that I expect
some minor bugs to slip through.  However, if
they are minor, they can usually be fixed quickly once they are
discovered and it is no big deal if they make it into a production
kernel.

 > That's why there's still 2.2.x - that's purely stable
 > and won't crash as fast as 2.4.x, but misses the "newest
 > cutting-edge-technology device support" and "newest technology" (like
 > new SMP handling , ReiserFS, etc... But it *is* stable.
 > 

The reason I suggested more frequent major production releases is so
that you don't have to go back to a 2 or 3 year old kernel and lose
out on years' worth of new features to have any stability.  One show
stopper bug like the VM problems would not be as much of a problem if
there was a stable production kernel that we could run that was only 4
or 6 months old.

 > > Based on Linus's original very good plan for even/odd numbers, there
 > > should not have been 2.4.0-test? kernels either.  This was another
 > > example of the rush to increment to 2.4 long before it was ready.
 > > There was a long stretch of test kernels and and now we are all the
 > > way to 2.4.5 and it is still not stable.  We are repeating the 2.2.x
 > > process all over again.
 > 
 > Wrong again.
 > 2.3.x is for development, adding new things, testing, adding, testing,
 > changing, testing, etc.

Which is the same point I made.

 > 2.4-test is for testing only, it's some sort of feature freeze.

Agreed.  My only point here was that setting the version to 2.4-test
suggests that there are only minor bugs left to be solved before the
production release.  That is one of the reasons I made the
suggestion to keep it in the 2.3 range, since there were actually
serious VM problems still present at the production 2.4 release.

 > 2.4.x is for final/stable 2.4.
 > It's a standard *nix development cycle. That's how everyone does it.

My point exactly.

 > 
 > Regards,
 > 
 > Ronald Bultje




Re: Plain 2.4.5 VM... (and 2.4.5-ac3)

2001-05-30 Thread Rik van Riel

On Wed, 30 May 2001, Vincent Stemen wrote:

> The problem is, that's not true.  These problems are not slipping
> through because of lack of testers.  As Alan said, the VM problem has
> been lurking, which means that it was known already.

Fully agreed, it went through because of a lack of hours
per day and the fact that the priority of developers was
elsewhere.

For me, for example, the priorities have mostly been with
bugs that bothered me or that bothered Conectiva's customers.

If you _really_ feel this strongly about the bug, you could
either try to increase the number of hours a day for all of
us or you could talk to my boss about hiring me as a consultant
to fix the problem for you on an emergency basis :)

The other two alternatives would be either waiting until
somebody gets around to fixing the bug or sending in a patch
yourself.

Trying to piss off developers has adverse effect on all four
of the methods above :)

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)




Re: Plain 2.4.5 VM... (and 2.4.5-ac3)

2001-05-30 Thread Alan Cox

> There was a new 8139too driver added to the the 2.4.5 (I think) kernel
> which Alan Cox took back out and reverted to the old one in his
> 2.4.5-ac? versions because it is apparently causing lockups.
> Shouldn't this new driver have been released in a 2.5.x development
> kernel and proven there before replacing the one in the production
> kernel?  I haven't even seen a 2.5.x kernel released yet.

Nope. The 2.4.3 one is buggy too - but differently (and it turns out a 
little less) buggy. Welcome to software.




Re: Plain 2.4.5 VM... (and 2.4.5-ac3)

2001-05-30 Thread Vincent Stemen

On Wednesday 30 May 2001 01:02, Mike Galbraith wrote:
> On Tue, 29 May 2001, Vincent Stemen wrote:
> > On Tuesday 29 May 2001 15:16, Alan Cox wrote:
> > > > a reasonably stable release until 2.2.12.  I do not understand why
> > > > code with such serious reproducible problems is being introduced
> > > > into the even numbered kernels.  What happened to the plan to use
> > > > only the
> > >
> > > Who said it was introduced ?? It was more 'lurking' than introduced.
> > > And unfortunately nobody really pinned it down in 2.4test.
> >
> > I fail to see the distinction.  First of all, why was it ever released
> > as 2.4-test?  That question should probably be directed at Linus.  If
> > it is not fully tested, then it should be released it as an odd
> > number.  If it already existed in the odd numbered development kernel
> > and was known, then it should have never been released as a production
> > kernel until it was resolved.  Otherwise, it completely defeats the
> > purpose of having the even/odd numbering system.
> >
> > I do not expect bugs to never slip through to production kernels, but
> > known bugs that are not trivial should not, and serious bugs like
> > these VM problems especially should not.
>
> And you can help prevent them from slipping through by signing up as a
> shake and bake tester.  Indeed, you can make your expectations reality
> absolutely free of charge,  and or compensation 
> what a bargain!
>
> X ___ ;-)
>
>   -Mike

The problem is, that's not true.  These problems are not slipping
through because of lack of testers.  As Alan said, the VM problem has
been lurking, which means that it was known already.  We currently
have no development/production kernel distinction and I have not been
able to find even one stable 2.4.x version to run on our main
machines.  Reverting back to 2.2.x is a real pain because of all the
surrounding changes which will affect our initscripts and other system
configuration issues, such as Unix98 pty's, proc filesystem
differences, device numbering, etc.

I have the greatest respect and appreciation for Linus, Alan, and all
of the other kernel developers.  My comments are not meant to
criticize, but rather to point out some of the problems I see that are
making it so difficult to stabilize the kernel and encourage them to
steer back on track.

Here are some of the problems I see:

There was far too long a stretch with too much code dumped into both
the 2.2 and 2.4 kernels before release.  There needs to be a smaller
number of changes between major releases so that they can be more
thoroughly tested and debugged.  In the race to get it out there they
are making the same mistakes as Microsoft, releasing production
kernels with known serious bugs because it is taking too long and they
want to move on.  I enjoy criticizing Microsoft so much for
the same thing that I do not want to have to stop in order to not
sound hypocritical :-).  The Linux community has built a lot of its
reputation on not making these mistakes.  Please let's try not to
destroy that.

They are disregarding the even/odd versioning system.
For example:
There was a new 8139too driver added to the 2.4.5 (I think) kernel
which Alan Cox took back out and reverted to the old one in his
2.4.5-ac? versions because it is apparently causing lockups.
Shouldn't this new driver have been released in a 2.5.x development
kernel and proven there before replacing the one in the production
kernel?  I haven't even seen a 2.5.x kernel released yet.

Based on Linus's original very good plan for even/odd numbers, there
should not have been 2.4.0-test? kernels either.  This was another
example of the rush to increment to 2.4 long before it was ready.
There was a long stretch of test kernels, and now we are all the
way to 2.4.5 and it is still not stable.  We are repeating the 2.2.x
process all over again.  It should have been 2.3.x until the
production release was ready.  If they needed to distinguish a code
freeze for final testing, it could be done with a 4th version
component (2.3.xx.xx), where the 4th component is incremented for final
bug fixes.


- Vincent Stemen



Re: Plain 2.4.5 VM... (and 2.4.5-ac3)

2001-05-30 Thread Vincent Stemen

On Wednesday 30 May 2001 15:17, Mike Galbraith wrote:
 On Wed, 30 May 2001, Vincent Stemen wrote:
  On Wednesday 30 May 2001 01:02, Mike Galbraith wrote:
   On Tue, 29 May 2001, Vincent Stemen wrote:
On Tuesday 29 May 2001 15:16, Alan Cox wrote:
  a reasonably stable release until 2.2.12.  I do not understand
  why code with such serious reproducible problems is being
  introduced into the even numbered kernels.  What happened to
  the plan to use only the

 Who said it was introduced ?? It was more 'lurking' than
 introduced. And unfortunately nobody really pinned it down in
 2.4test.
   
I fail to see the distinction.  First of all, why was it ever
released as 2.4-test?  That question should probably be directed at
Linus.  If it is not fully tested, then it should be released it as
an odd number.  If it already existed in the odd numbered
development kernel and was known, then it should have never been
released as a production kernel until it was resolved.  Otherwise,
it completely defeats the purpose of having the even/odd numbering
system.
   
I do not expect bugs to never slip through to production kernels,
but known bugs that are not trivial should not, and serious bugs
like these VM problems especially should not.
  
   And you can help prevent them from slipping through by signing up as
   a shake and bake tester.  Indeed, you can make your expectations
   reality absolutely free of charge, microfont and or compensation
   /microfont what a bargain!
  
   X ___ ;-)
  
 -Mike
 
  The problem is, that's not true.  These problems are not slipping
  through because of lack of testers.  As Alan said, the VM problem has

 Sorry, that's a copout.  You (we) had many chances to notice.  Don't
 push the problems back onto developers.. it's our problem.


How is that a copout?  The problem was noticed.  I am only suggesting
that we not be in such a hurry to put code in the production kernels
until we are pretty sure it works well enough, and that we release
major production versions more often so that they do not contain 2 or
3 years worth of new code making it so hard to debug.  We probably
should have had 2 or 3 code freezes and production releases since
2.2.x.  As I mentioned in a previous posting, this way we do not have
to run a 2 or 3 year old kernel in order to have reasonable stability.

  Here are some of the problems I see:
 
  There was far to long of a stretch with to much code dumped into both
  the 2.2 and 2.4 kernels before release.  There needs to be a smaller
  number changes between major releases so that they can be more
  thoroughly tested and debugged.  In the race to get it out there they
  are making the same mistakes as Microsoft, releasing production
  kernels with known serious bugs because it is taking to long and they
  want to move on forward.  I enjoy criticizing Microsoft so much for
  the same thing that I do not want to have to stop in order to not
  sound hypocritical :-).  The Linux community has built a lot of it's
  reputation on not making these mistakes.  Please lets try not to
  destroy that.
 
  They are disregarding the even/odd versioning system.
  For example:
  There was a new 8139too driver added to the 2.4.5 (I think) kernel
  which Alan Cox took back out and reverted to the old one in his
  2.4.5-ac? versions because it is apparently causing lockups.
  Shouldn't this new driver have been released in a 2.5.x development
  kernel and proven there before replacing the one in the production
  kernel?  I haven't even seen a 2.5.x kernel released yet.
 
  Based on Linus's original very good plan for even/odd numbers, there
  should not have been 2.4.0-test? kernels either.  This was another
  example of the rush to increment to 2.4 long before it was ready.
  There was a long stretch of test kernels and now we are all the
  way to 2.4.5 and it is still not stable.  We are repeating the 2.2.x
  process all over again.  It should have been 2.3.x until the
  production release was ready.  If they needed to distinguish a code
  freeze for final testing, it could be done with a 4th version
  component (2.3.xx.xx), where the 4th component is incremented for final
  bug fixes.

 Sorry, I disagree with every last bit.  Either you accept a situation
 or you try to do something about it.

   -Mike

I am spending a lot of time testing new kernels, reporting bugs and
offering suggestions that I think may improve on the stability of
production kernels.  Is this not considered doing something about it?
It is necessary to point out where one sees a problem in order to
offer possible solutions for improvement. 



- Vincent 




Re: Plain 2.4.5 VM... (and 2.4.5-ac3)

2001-05-30 Thread Rik van Riel

On Wed, 30 May 2001, Vincent Stemen wrote:

> The problem is, that's not true.  These problems are not slipping
> through because of lack of testers.  As Alan said, the VM problem has
> been lurking, which means that it was known already.

Fully agreed, it went through because of a lack of hours
per day and the fact that the priority of developers was
elsewhere.

For me, for example, the priorities have mostly been with
bugs that bothered me or that bothered Conectiva's customers.

If you _really_ feel this strongly about the bug, you could
either try to increase the number of hours a day for all of
us or you could talk to my boss about hiring me as a consultant
to fix the problem for you on an emergency basis :)

The other two alternatives would be either waiting until
somebody gets around to fixing the bug or sending in a patch
yourself.

Trying to piss off developers has an adverse effect on all four
of the methods above :)

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)




Re: Plain 2.4.5 VM... (and 2.4.5-ac3)

2001-05-30 Thread Vincent Stemen

On Wednesday 30 May 2001 01:02, Mike Galbraith wrote:
> On Tue, 29 May 2001, Vincent Stemen wrote:
> > On Tuesday 29 May 2001 15:16, Alan Cox wrote:
> > > > a reasonably stable release until 2.2.12.  I do not understand why
> > > > code with such serious reproducible problems is being introduced
> > > > into the even numbered kernels.  What happened to the plan to use
> > > > only the
> > >
> > > Who said it was introduced ?? It was more 'lurking' than introduced.
> > > And unfortunately nobody really pinned it down in 2.4test.
> >
> > I fail to see the distinction.  First of all, why was it ever released
> > as 2.4-test?  That question should probably be directed at Linus.  If
> > it is not fully tested, then it should be released as an odd
> > number.  If it already existed in the odd numbered development kernel
> > and was known, then it should have never been released as a production
> > kernel until it was resolved.  Otherwise, it completely defeats the
> > purpose of having the even/odd numbering system.
> >
> > I do not expect bugs to never slip through to production kernels, but
> > known bugs that are not trivial should not, and serious bugs like
> > these VM problems especially should not.
>
> And you can help prevent them from slipping through by signing up as a
> shake and bake tester.  Indeed, you can make your expectations reality
> absolutely free of charge, <microfont>and or compensation</microfont>
> what a bargain!
>
> X ___ ;-)
>
>    -Mike

The problem is, that's not true.  These problems are not slipping
through because of lack of testers.  As Alan said, the VM problem has
been lurking, which means that it was known already.  We currently
have no development/production kernel distinction and I have not been
able to find even one stable 2.4.x version to run on our main
machines.  Reverting back to 2.2.x is a real pain because of all the
surrounding changes which will affect our initscripts and other system
configuration issues, such as Unix98 pty's, proc filesystem
differences, device numbering, etc.

I have the greatest respect and appreciation for Linus, Alan, and all
of the other kernel developers.  My comments are not meant to
criticize, but rather to point out some of the problems I see that are
making it so difficult to stabilize the kernel and encourage them to
steer back on track.

Here are some of the problems I see:

There was far too long a stretch with too much code dumped into both
the 2.2 and 2.4 kernels before release.  There needs to be a smaller
number of changes between major releases so that they can be more
thoroughly tested and debugged.  In the race to get it out there they
are making the same mistakes as Microsoft, releasing production
kernels with known serious bugs because it is taking too long and they
want to move on forward.  I enjoy criticizing Microsoft so much for
the same thing that I do not want to have to stop in order to not
sound hypocritical :-).  The Linux community has built a lot of its
reputation on not making these mistakes.  Please let's try not to
destroy that.

They are disregarding the even/odd versioning system.
For example:
There was a new 8139too driver added to the 2.4.5 (I think) kernel
which Alan Cox took back out and reverted to the old one in his
2.4.5-ac? versions because it is apparently causing lockups.
Shouldn't this new driver have been released in a 2.5.x development
kernel and proven there before replacing the one in the production
kernel?  I haven't even seen a 2.5.x kernel released yet.

Based on Linus's original very good plan for even/odd numbers, there
should not have been 2.4.0-test? kernels either.  This was another
example of the rush to increment to 2.4 long before it was ready.
There was a long stretch of test kernels and now we are all the
way to 2.4.5 and it is still not stable.  We are repeating the 2.2.x
process all over again.  It should have been 2.3.x until the
production release was ready.  If they needed to distinguish a code
freeze for final testing, it could be done with a 4th version
component (2.3.xx.xx), where the 4th component is incremented for final
bug fixes.
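
To make the even/odd convention and the proposed 4th component concrete, here is a minimal sketch in Python.  It is only an illustration under stated assumptions: the function name classify, the returned field names, and the treatment of suffixes such as -ac3 or -test9 are hypothetical and are not taken from any kernel tool.

```python
import re

# major.minor.patch, an optional 4th numeric component (the proposal above),
# and an optional suffix such as "-ac3" or "-test9".
_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:\.(\d+))?(-\S+)?$")


def classify(version):
    """Split a kernel version string and label it by minor-number parity.

    Assumption for illustration: an even minor number means a production
    series and an odd minor number means a development series.
    """
    match = _VERSION.match(version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    # Hypothetical 4th component: bug-fix level during a code freeze.
    freeze_fix = int(match.group(4)) if match.group(4) else None
    series = "production" if minor % 2 == 0 else "development"
    return {
        "major": major,
        "minor": minor,
        "patch": patch,
        "freeze_fix": freeze_fix,
        "suffix": match.group(5) or "",
        "series": series,
    }


if __name__ == "__main__":
    for v in ("2.4.5", "2.4.5-ac3", "2.4.0-test9", "2.3.99.4"):
        info = classify(v)
        print(v, "->", info["series"], info["freeze_fix"], info["suffix"])
```

Under this reading, 2.4.0-test9 is labelled production purely because of its even minor number, which is the objection raised above, while a freeze release such as 2.3.99.4 would stay in the development series until it is declared ready.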


- Vincent Stemen



Re: Plain 2.4.5 VM... (and 2.4.5-ac3)

2001-05-30 Thread Alan Cox

> There was a new 8139too driver added to the 2.4.5 (I think) kernel
> which Alan Cox took back out and reverted to the old one in his
> 2.4.5-ac? versions because it is apparently causing lockups.
> Shouldn't this new driver have been released in a 2.5.x development
> kernel and proven there before replacing the one in the production
> kernel?  I haven't even seen a 2.5.x kernel released yet.

Nope. The 2.4.3 one is buggy too - but differently (and it turns out a 
little less) buggy. Welcome to software.




Re: Plain 2.4.5 VM... (and 2.4.5-ac3)

2001-05-29 Thread Mike Galbraith

On Tue, 29 May 2001, Vincent Stemen wrote:

> On Tuesday 29 May 2001 15:16, Alan Cox wrote:
> > > a reasonably stable release until 2.2.12.  I do not understand why
> > > code with such serious reproducible problems is being introduced into
> > > the even numbered kernels.  What happened to the plan to use only the
> >
> > Who said it was introduced ?? It was more 'lurking' than introduced. And
> > unfortunately nobody really pinned it down in 2.4test.
> >
>
> I fail to see the distinction.  First of all, why was it ever released
> as 2.4-test?  That question should probably be directed at Linus.  If
> it is not fully tested, then it should be released as an odd
> number.  If it already existed in the odd numbered development kernel
> and was known, then it should have never been released as a production
> kernel until it was resolved.  Otherwise, it completely defeats the
> purpose of having the even/odd numbering system.
>
> I do not expect bugs to never slip through to production kernels, but
> known bugs that are not trivial should not, and serious bugs like
> these VM problems especially should not.

And you can help prevent them from slipping through by signing up as a
shake and bake tester.  Indeed, you can make your expectations reality
absolutely free of charge, <microfont>and or compensation</microfont>
what a bargain!

X ___ ;-)

-Mike




Re: Plain 2.4.5 VM... (and 2.4.5-ac3)

2001-05-29 Thread Alan Cox

> a reasonably stable release until 2.2.12.  I do not understand why
> code with such serious reproducible problems is being introduced into
> the even numbered kernels.  What happened to the plan to use only the

Who said it was introduced ?? It was more 'lurking' than introduced. And 
unfortunately nobody really pinned it down in 2.4test.

> By the way,  The 2.4.5-ac3 kernel still fills swap and runs out of
> memory during my morning NFS incremental backup.  I got this message
> in the syslog.

2.4.5-ac doesn't do some of the write throttling. That's one thing I'm still
working out. Linus's 2.4.5 does write throttling but I'm not convinced it's done
the right way.

> completely full.  By that time the memory was in a reasonable state
> but the swap space is still never being released.

It won't be; it's copies of memory already in apps. Linus said 2.4.0 would need
more swap than RAM when he put out 2.4.0.


Alan




Re: Plain 2.4.5 VM... (and 2.4.5-ac3)

2001-05-29 Thread Vincent Stemen

On Tuesday 29 May 2001 10:37, elko wrote:
> On Tuesday 29 May 2001 11:10, Alan Cox wrote:
> > > It's not a bug.  It's a feature.  It only breaks systems that are run
> > > with "too little" swap, and the only difference from 2.2 till now is,
> > > that the definition of "too little" changed.
> >
> > its a giant bug. Or do you want to add 128Gb of unused swap to a full
> > kitted out Xeon box - or 512Gb to a big athlon ???
>
> this bug is biting me too and I do NOT like it !
>
> if it's a *giant* bug, then why is LK-2.4 called a *stable* kernel ??

This has been my complaint ever since the 2.2.0 kernel.  I did not see
a reasonably stable release until 2.2.12.  I do not understand why
code with such serious reproducible problems is being introduced into
the even numbered kernels.  What happened to the plan to use only the
odd numbered kernels for debugging and refinement of the code?  I
never said anything because I thought the kernel developers would
eventually get back on track after the mistakes of the 2.2.x kernels
but it has been years now and it still has not happened.  I do not
wish to sound unappreciative to those that have put so much wonderful
work into the Linux kernel but this is the same thing we criticize
Microsoft for: putting out production code that obviously is not
ready.  Please let's not earn the same reputation as such commercial
companies.  

By the way,  The 2.4.5-ac3 kernel still fills swap and runs out of
memory during my morning NFS incremental backup.  I got this message
in the syslog.

May 29 06:39:15 (none) kernel: Out of Memory: Killed process 23502 (xteevee).

For some reason xteevee is commonly the process that gets killed.  My
understanding is that it is part of Xscreensaver, but it was during my
backup.

This was the output of 'free' after I got up and found the swap
completely full.  By that time the memory was in a reasonable state
but the swap space is still never being released.

             total       used       free     shared    buffers     cached
Mem:        255960     220668      35292        292     110960      80124
-/+ buffers/cache:      29584     226376
Swap:        40124      40112         12
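
As a cross-check on the figures above, the following minimal Python sketch recomputes the "-/+ buffers/cache" line and the free swap from the Mem and Swap lines.  It assumes the Mem line reads total 255960, used 220668, free 35292, shared 292, buffers 110960, cached 80124, and that the values are in kilobytes; the function name buffers_cache_line is hypothetical.

```python
def buffers_cache_line(used, free, buffers, cached):
    """Recompute the '-/+ buffers/cache' row of free(1) output.

    used_by_apps: memory in use once buffers and page cache are excluded.
    free_for_apps: memory that is free or held only as buffers/cache.
    """
    used_by_apps = used - buffers - cached
    free_for_apps = free + buffers + cached
    return used_by_apps, free_for_apps


# Figures taken from the report above (assumed to be in kilobytes).
mem = {"used": 220668, "free": 35292, "buffers": 110960, "cached": 80124}
swap_total, swap_used = 40124, 40112

apps_used, apps_free = buffers_cache_line(**mem)
swap_free = swap_total - swap_used

print(apps_used, apps_free)  # 29584 226376
print(swap_free)             # 12
```

The recomputed values match the report: applications hold only about 29 MB of RAM, while all but 12 KB of the roughly 39 MB of swap is in use.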


Configuration
-------------
AMD K6-2/450
256Mb RAM
2.4.5-ac3 Kernel compiled with egcs-1.1.2.



