from:"Marc Lehmann"

Re: [f2fs-dev] [PATCH] f2fs: give RO message when recovering superblock

2016-03-23 Thread Marc Lehmann

On Wed, Mar 23, 2016 at 01:38:19PM -0700, Jaegeuk Kim <jaeg...@kernel.org> 
wrote:
> When one of superblocks is missing, f2fs recovers it with the valid one.
> But, even if f2fs is mounted as RO, we'd better notify that too.

(I have written this in my other mail, but in case you didn't see it, because
it wasn't directly sent to you, I replied directly).

Basically all other filesystems treat "ro" as nothing more than a vfs
flag - the mounted volume will be readonly, but they will happily write to
the volume for recovery or integrity purposes. This has been extensively
discussed on lkml in the past and it was decided that overloading "ro" to
have two different meanings is bad.

If f2fs wants to suppress writes, it should use the norecovery option to
decide, not the ro option. This is the behaviour that other filesystems
follow (at least extN, xfs).

Unless f2fs has a very good reason (which I don't think it has), it should
behave like the other filesystems, and treat "ro" merely as a vfs flag to
suppress writing.

There is a third reason to not change the meaning: typically, the root fs
is mounted ro first and later rw. Therefore f2fs must make sure to have
full integrity on a ro mount, even if that means writing to the backing
store. It isn't acceptable to make ro mounts fail when rw mounts would
work, for example, when upgrading the kernel and rebooting.
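To make the distinction between the two options concrete, here is a minimal
sketch in C using mount(2). It assumes the filesystem accepts a "norecovery"
data option (ext4 and xfs do); the helper names ro_mount_data and
mount_readonly are made up for illustration and are not part of any kernel or
libc API.

```c
#include <stddef.h>
#include <sys/mount.h>

/* Hypothetical helper: filesystem-specific option string for a read-only
 * mount. "norecovery" asks the filesystem not to replay its journal/log;
 * NULL leaves the filesystem free to write for recovery purposes. */
const char *ro_mount_data(int skip_recovery)
{
    return skip_recovery ? "norecovery" : NULL;
}

/* Hypothetical helper: MS_RDONLY is the vfs-level "ro" flag. Whether the
 * backing store may still be written for recovery is decided separately
 * by the data string. Returns 0 on success, -1 on error (errno is set). */
int mount_readonly(const char *dev, const char *dir, const char *fstype,
                   int skip_recovery)
{
    return mount(dev, dir, fstype, MS_RDONLY, ro_mount_data(skip_recovery));
}
```

Calling mount_readonly(dev, dir, "ext4", 0) corresponds to a plain "-o ro"
mount, and passing 1 as the last argument corresponds to "-o ro,norecovery".
Both need root privileges.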

-- 
The choice of a   Deliantra, the free code+content MORPG
  -==- _GNU_  http://www.deliantra.net
  ==-- _   generation
  ---==---(_)__  __   __  Marc Lehmann
  --==---/ / _ \/ // /\ \/ /  schm...@schmorp.de
  -=/_/_//_/\_,_/ /_/\_\

Re: epoll design problems with common fork/exec patterns

2007-10-27 Thread Marc Lehmann

On Sat, Oct 27, 2007 at 12:23:52PM +0200, Eric Dumazet <[EMAIL PROTECTED]> 
wrote:
> >Q6 Will the close of an fd cause it to be removed from all epoll 
> >sets automatically?
> >A6 Yes.
> 
> Answer : epoll documentation cannot explain the full semantic of file 

epoll documentation easily can. There is nothing keeping it from doing so.
Don't make silly arguments like that.

> Or should, since you had problems

You are again implying I lack understanding. That is, however, not true.
I don't see the point in being insulted by you, so I won't continue
talking to you :(

> The 'close' of a file is not close(fd) :)

Good that you understand that.

That is one of my problems, as the manpage talks about closing of the fd,
but there are multiple ways to do that, and some are not handled the same
way.

> epoll has to deal with files, but documentation is a User side 
> documentation, so has to use 'file descriptors'.

There is obviously no need for documentation to do that, contrary to your
claim. The manpages for e.g. dup or the official sus manpages manage to
document it (mostly) correctly, so your claim that documentation must use
file descriptors when the underlying file structure is meant is disproven.

> fork() is acting sort of dup() , as it increases all file refcounts.
> 
> You have problems about close()/dup()/fork()/... file descriptors semantic, 
> which is handled by a layer independent from epoll stuff.

No, I have no problem with dup at all.

I have a problem with the fact that explicitly closing file descriptors in
the child stops events for those files from being reported in the parent.

I am sorry, but I explained this very clearly a number of times, and for some
reason, apart from accusing me of not understanding files and file
descriptors or (clear enough) documentation, you ignore that and instead
hammer on other problems.

To me, it seems you are not the one who understands.

-- 
The choice of a   Deliantra, the free code+content MORPG
  -==- _GNU_  http://www.deliantra.net
  ==-- _   generation
  ---==---(_)__  __   __  Marc Lehmann
  --==---/ / _ \/ // /\ \/ /  [EMAIL PROTECTED]
  -=/_/_//_/\_,_/ /_/\_\
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: epoll design problems with common fork/exec patterns

2007-10-27 Thread Marc Lehmann

On Sat, Oct 27, 2007 at 11:22:25AM +0200, Eric Dumazet <[EMAIL PROTECTED]> 
wrote:
> >Well, it behaves like documented, which is the problem. You admit you
> >don't understand the problem or the documentation, so again, no need to
> >insult me.
> 
> Hum... I will update my english vocabulary and mark "missed" as an insult.

Well, ignoring my arguments by claiming I lack understanding is an insult,
as you didn't take my arguments at face value but dismissed them by
attacking my person.

> I have no problem with epoll nor its documentation.

That's fine for you. But I do have a problem, at least with epoll, as the
documented and observed behaviour makes epoll unusable as a general event
loop replacement.

> It doesnt on every kernels I had played with. And I played with *lot* of 
> kernels you know.

No, I don't know that. And so far you only said you used fork+exec, not
close in between, so maybe the playing you did was not related to this
problem?

I also played with a lot of kernels, but for epoll specifically, I played
with 2.6.21-2-amd64 and 2.6.22-1-amd64, both from debian unstable with no
customisations.

> If such a bug exists on your kernel, please fill a complete bug report, 
> giving details.

As this behaviour is clearly documented in the epoll manpage, why do you
think it is a bug? I think it's fairly bad, but at least it's documented as
the behaviour it should be:

Q6 Will the close of an fd cause it to be removed from all epoll sets 
automatically?
A6 Yes.

As such, filing a bug report for behaviour which isn't in fact a bug would
be counterproductive. My goal in my mail was to find out if there are
workarounds for this peculiar behaviour (or to inspire discussion on this
behaviour).

Of course, one can create big programs using epoll to their advantage. I
never claimed otherwise. But as a general event loop replacement (i.e.
outside of controlled environments), epoll does not currently qualify,
as I would have to control an awful lot of code (think of a perl module
interfacing to epoll: you would now have to control all third-party
modules that might interfere with fork+close+exec. This is very common in
scripting languages).

-- 
The choice of a   Deliantra, the free code+content MORPG
  -==- _GNU_  http://www.deliantra.net
  ==-- _   generation
  ---==---(_)__  __   __  Marc Lehmann
  --==---/ / _ \/ // /\ \/ /  [EMAIL PROTECTED]
  -=/_/_//_/\_,_/ /_/\_\

Re: epoll design problems with common fork/exec patterns

2007-10-27 Thread Marc Lehmann

On Sat, Oct 27, 2007 at 10:23:17AM +0200, Eric Dumazet <[EMAIL PROTECTED]> 
wrote:
> >  In this case, the parent process works fine until the child closes fds,
> >  after which the fds become unarmed in the parent too. This works as
> 
> I have no idea what exact problem you have. 

Well, I explained it rather succinctly, I think. If you tell me what's
unclear, I can explain...

> But if the child closes some 
> file descriptor that were 'cloned' at fork() time, this only decrements a 
> refcount, and definitely should not close it for the 'parent'.

It doesn't. It removes it from the epoll set, though, so the parent will not
receive events for that fd anymore.

> I have some apps that are happily using epoll() and fork()/exec() and have 

The problem I described is fork/close/exec. close being the explicit
syscall.

> no problem at all. I usually use O_CLOEXEC so that all close() are done at 
> exec() time without having to do it in a loop. epoll continues to work as 
> expected in the parent process.

This is because epoll doesn't behave as documented: it removes the fd
from the parent's epoll set only on an explicit close() syscall, not on an
implicit close from exec.
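For reference, the close-on-exec approach mentioned in the quoted text can be
sketched with fcntl(), which also works on kernels that predate O_CLOEXEC. The
helper names set_cloexec and is_cloexec are made up for this example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Marks fd close-on-exec, so that exec() closes it implicitly and no
 * explicit close() loop is needed in the child.
 * Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}

/* Returns 1 if fd is marked close-on-exec, 0 if not, -1 on error. */
int is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) != 0;
}
```

This only helps for code that one controls: a third-party library that forks
and then closes descriptors in a loop is not affected by the flag.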

> >fd sets. This would explain the behaviour above. Unfortunately (or
> >fortunately?) this is not what happens: when the fds are being closed by
> >exec or exit, the fds do not get removed from the epoll set.
> 
> at exec() (granted CLOEXEC is asserted) or exit() time, only the refcount 
> of each file is decremented. Only if their refcount becomes NULL, files are 
> then removed from epoll set.

Yes. But that's obviously not the only way to close fds.

> >Is epoll really designed to be so incompatible with the most common fork
> >patterns? Shouldn't epoll do refcounting, as is commonly done under
> >Unix? As the fd space is not shared between processes, why does epoll
> >try? Shouldn't the epoll information be copied just like the fd table
> >itself, memory, and other resources?
> 
> Too many questions here, showing lack of understanding.

You already said you don't understand the problem. No need to get insulting :(

> epoll definitly is not useless. It is used on major and critical apps.
> You certainly missed something.

Well, it behaves as documented, which is the problem. You admit you
don't understand the problem or the documentation, so again, no need to
insult me.

> Please provide some code to illustrate one exact problem you have.

   // assume there is an open epoll set that listens for events on fd 5
   if (fork () == 0)
     {
       close (5);
       // fd 5 is now removed from the epoll set of the parent.
       _exit (0);
     }
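The fragment above can be expanded into a self-contained check that can be run
on a given kernel. This is a sketch and not code from the original report: the
pipe stands in for "fd 5", and the function name events_after_child_close is
made up. It reports how many events the parent still receives after a forked
child has explicitly closed its copy of the watched descriptor.

```c
#include <sys/epoll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the number of events (0 or 1) the parent sees on a pipe after a
 * forked child has closed its copy of the read end, or -1 on error. */
int events_after_child_close(void)
{
    int pfd[2];
    if (pipe(pfd) < 0)
        return -1;

    int epfd = epoll_create(1);        /* size is only a hint, must be > 0 */
    if (epfd < 0)
        return -1;

    struct epoll_event ev = { 0 };
    ev.events = EPOLLIN;
    ev.data.fd = pfd[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pfd[0], &ev) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        close(pfd[0]);                 /* explicit close in the child */
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) < 0)     /* the child's close has happened now */
        return -1;

    if (write(pfd[1], "x", 1) != 1)    /* make the read end readable */
        return -1;

    struct epoll_event out;
    int n = epoll_wait(epfd, &out, 1, 1000);

    close(epfd);
    close(pfd[0]);
    close(pfd[1]);
    return n;
}
```

A return value of 0 would match the behaviour described in this mail. On
kernels that follow the current epoll(7) wording, where a registration goes
away only once the last descriptor referring to the open file description is
closed, the function returns 1.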

-- 
The choice of a
  -==- _GNU_
  ==-- _   generation Marc Lehmann
  ---==---(_)__  __   __  [EMAIL PROTECTED]
  --==---/ / _ \/ // /\ \/ /  http://schmorp.de/
  -=/_/_//_/\_,_/ /_/\_\  XX11-RIPE

epoll design problems with common fork/exec patterns

2007-10-27 Thread Marc Lehmann

Hi!

I ran into what I see as unsolvable problems that make epoll useless as a
generic event mechanism.

I recently switched to libevent as event loop, and found that my programs
work fine when it is using select or poll, but work erratically or halt
when using epoll.

The reason, as I found out, is the peculiar behaviour of epoll over fork.
It doesn't work as documented, and even if it did, it would usually make
the use of third-party libraries that fork impossible.

Here are two scenarios where it screws up:

- some library forks, explicitly closes all fds it doesn't need, and execs
  another program (which is common behaviour).

  In this case, the parent process works fine until the child closes fds,
  after which the fds become unarmed in the parent too. This works as
  documented, but since libraries expect this to work without affecting the
  parent, this puts a new and incompatible strain on what libraries can do,
  which in turn makes epoll unsuitable in cases where you don't control all
  your code.

- I have a library that emulates asynchronous I/O with a thread pool, and
  uses a pipe for event notification. That library registers a fork handler
  that closes the pipe in the child and recreates it, so the child could
  continue doing AIO (as could the parent).

  This, too, screws up notifications for the parent.

Now, the epoll manpage says that closing a fd will remove it from all
fd sets. This would explain the behaviour above. Unfortunately (or
fortunately?) this is not what happens: when the fds are being closed by
exec or exit, the fds do not get removed from the epoll set.

This behaviour strikes me as extremely illogical. On the one hand, one
cannot normally share the epoll fd between processes, but on fork,
you can, even though it makes no sense (the child has a different fd
"namespace" than the parent) and actually works on (then) unrelated fds in
the other process.

It also strikes me as weird that the order of closing fds should make so much
of a difference: if the epoll fd is closed first in the child, the other
fds will survive in the parent; if it's closed last, they don't. Makes no
sense to me.
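The sharing of the epoll fd across fork can be demonstrated directly. The
following is a sketch of one mechanism only, not a diagnosis of what libevent
does: the child inherits a descriptor for the same epoll instance, so an
EPOLL_CTL_DEL issued in the child also removes the registration the parent
relies on. The function name events_after_child_del is made up.

```c
#include <sys/epoll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the number of events (0 or 1) the parent sees on a pipe after a
 * forked child has removed the pipe from the inherited epoll set, or -1 on
 * error. */
int events_after_child_del(void)
{
    int pfd[2];
    if (pipe(pfd) < 0)
        return -1;

    int epfd = epoll_create(1);        /* size is only a hint, must be > 0 */
    if (epfd < 0)
        return -1;

    struct epoll_event ev = { 0 };
    ev.events = EPOLLIN;
    ev.data.fd = pfd[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pfd[0], &ev) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* epfd in the child refers to the parent's epoll instance. */
        int rc = epoll_ctl(epfd, EPOLL_CTL_DEL, pfd[0], &ev);
        _exit(rc == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    if (write(pfd[1], "x", 1) != 1)    /* make the read end readable */
        return -1;

    struct epoll_event out;
    int n = epoll_wait(epfd, &out, 1, 200);  /* 200 ms timeout */

    close(epfd);
    close(pfd[0]);
    close(pfd[1]);
    return n;
}
```

Here the parent no longer receives the event even though it never touched its
own epoll set. Any event library that cleans up its registrations in a forked
child without first creating a new epoll fd would have this effect.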

Now, the problem I see is not that it makes no sense to me - that's clearly
my problem. The problem I see is that there is no way to avoid the
associated problems except by patching all code that would ever use fork,
even if it never has heard anything about epoll yet. This is extremely
nonlocal action at a distance, as this affects a lot of code not even the
author might be aware of (fork is rather common).

To illustrate, here are some workarounds I thought about:

- rearming all fds after fork: doesn't work, as the fds get removed
  asynchronously so I would have to wait for the child to do it.
- closing the epoll fd after fork: doesn't work unless I control
  the fork. I can install a handler to be called using pthreads, but
  that won't help as other handlers might be called first (as in the case of
  the aio library above), screwing me.
- closing and recreating the epoll fd before the fork: isn't supported even
  remotely by libevent or similar event loops, and would not help either
  as I cannot control the calls to fork.

Is epoll really designed to be so incompatible with the most common fork
patterns? Shouldn't epoll do refcounting, as is commonly done under
Unix? As the fd space is not shared between processes, why does epoll
try? Shouldn't the epoll information be copied just like the fd table
itself, memory, and other resources?

As it looks now, epoll looks useless except in the most controlled
environments, as it doesn't duplicate state on fork as is done with the
other fd-related resources (as opposed to the underlying files, which are
properly shared).

-- 
The choice of a 
  -==- _GNU_  Deliantra, the free in data+content MORPG
  ==-- _   generation
  ---==---(_)__  __   __  http://www.deliantra.net/
  --==---/ / _ \/ // /\ \/ /
  -=/_/_//_/\_,_/ /_/\_\

Re: masquerading failure for at least icmp and tcp+sack on amd64

2005-09-07 Thread Marc Lehmann

On Tue, Sep 06, 2005 at 07:29:30PM +0200, Marc Lehmann <[EMAIL PROTECTED]> 
wrote:
> Weird observation 2:
> 
> Some sites could be connected to with TCP. It turned out that those
> sites did not support TCP SACK. Indeed, turning off SACK either on the
> remote side of a connection or on the originator side resulted in working
> masquerading:

Sorry for the F'up, but this turned out to be slightly untrue: turning off
SACK makes the syn handshake happen, but some packets further down the stream
the masquerading router sends a RST again.
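For anyone wanting to repeat the SACK test on the originator side: on Linux the
setting lives in /proc/sys/net/ipv4/tcp_sack (the same knob as the
net.ipv4.tcp_sack sysctl). A minimal sketch in C follows; the function names
read_tcp_sack and write_tcp_sack are made up for this example, and writing
needs root.

```c
#include <stdio.h>

#define TCP_SACK_PATH "/proc/sys/net/ipv4/tcp_sack"

/* Returns 1 if SACK is enabled, 0 if disabled, -1 if it cannot be read. */
int read_tcp_sack(void)
{
    FILE *f = fopen(TCP_SACK_PATH, "r");
    if (!f)
        return -1;

    int value = 0;
    int ok = fscanf(f, "%d", &value) == 1;
    fclose(f);
    return ok ? (value != 0) : -1;
}

/* Enables (non-zero) or disables (zero) SACK for new connections.
 * Returns 0 on success, -1 on error (typically missing privileges). */
int write_tcp_sack(int enable)
{
    FILE *f = fopen(TCP_SACK_PATH, "w");
    if (!f)
        return -1;

    int ok = fprintf(f, "%d\n", enable ? 1 : 0) > 0;
    if (fclose(f) != 0)                /* the write may only fail at flush */
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling write_tcp_sack(0) before opening the test connection corresponds to
"turning off SACK on the originator side". Connections that are already
established keep whatever was negotiated at handshake time.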

> Kernels that don't work:
> 
>2.6.13-rc7 (compiled with gcc-3.4 and 4.0.2 debian), 2.6.13 (gcc-4.02)
> 

I forgot to mention that the kernels that don't work are for amd64. In
the meantime, I also tried out 2.6.11 (as I had some troubles with
2.6.12..2.6.13-rc7 on other amd64 machines), with the same result (reply
packets are ignored/rejected).

-- 
The choice of a
  -==- _GNU_
  ----==-- _   generation Marc Lehmann
  ---==---(_)__  __   __  [EMAIL PROTECTED]
  --==---/ / _ \/ // /\ \/ /  http://schmorp.de/
  -=/_/_//_/\_,_/ /_/\_\  XX11-RIPE

masquerading failure for at least icmp and tcp+sack on amd64

2005-09-06 Thread Marc Lehmann

Hi!

I recently upgraded a 32 bit machine to a new amd64 board+cpu. I took the
same kernel (2.6.13-rc7) and just recompiled it for 64 bit, plus upgraded
userspace to 64 bit.

Firewall config stayed the same.

Problem: neither ping nor tcp was being masqueraded properly. I created
the following test-set-up:

   iptables -t mangle -F
   iptables -t filter -F
   iptables -t nat -F
   iptables -t nat -A POSTROUTING -p all -s 10.0.0.0/8 -d \! 10.0.0.0/8 -j MASQUERADE

i.e. the above masquerade rule should be the only firewall rule, and all
rules should have policy ACCEPT.

The effect was that tcp packets and icmp packets coming from 10.0.0.1 on
interface eth0 were properly masqueraded on the outgoing "inet" interface
(ppp0 renamed):

eth0:
   19:17:24.364351 IP 10.0.0.1.44320 > 129.13.162.95.80: S 3745828676:3745828676(0) win 5840 <mss 1460,nop,nop,sackOK>

inet:
   19:17:24.364505 IP 84.56.237.68.44320 > 129.13.162.95.80: S 3745828676:3745828676(0) win 5840 <mss 1452,nop,nop,sackOK>
   19:17:24.378029 IP 129.13.162.95.80 > 84.56.237.68.44320: S 3777391404:3777391404(0) ack 3745828677 win 5840 <mss 1460,nop,nop,sackOK>
   19:17:24.378103 IP 84.56.237.68.44320 > 129.13.162.95.80: R 3745828677:3745828677(0) win 0

However, the reverse packets were rejected. ip_conntrack showed this:

   tcp  6 52 SYN_SENT src=10.0.0.1 dst=129.13.162.95 sport=44320 dport=80 [UNREPLIED] src=129.13.162.95 dst=84.56.237.68 sport=80 dport=44320 mark=0 use=1
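
To make the mismatch concrete, here is a minimal Python sketch (not part of the original report) that splits an ip_conntrack entry like the one above into its original and expected-reply tuples and checks whether a reply packet fits. The helper names `parse_conntrack_line` and `reply_matches` are made up for illustration, and the parser only assumes the field layout quoted above.

```python
import re

def parse_conntrack_line(line):
    """Split one ip_conntrack TCP entry into state, flags and two tuples.

    Assumes the layout quoted above: the first src=/dst=/sport=/dport=
    group is the original direction, the second is the expected reply.
    """
    fields = line.split()
    pairs = re.findall(r"\b(src|dst|sport|dport)=(\S+)", line)
    if len(pairs) != 8:
        raise ValueError("expected two complete address tuples")
    return {
        "proto": fields[0],
        "state": fields[3],
        "unreplied": "[UNREPLIED]" in fields,
        "original": dict(pairs[:4]),
        "reply": dict(pairs[4:]),
    }

def reply_matches(entry, src, sport, dst, dport):
    """True if a packet with these addresses fits the expected reply tuple."""
    expected = entry["reply"]
    return (expected["src"], expected["sport"],
            expected["dst"], expected["dport"]) == (src, str(sport), dst, str(dport))

line = ("tcp  6 52 SYN_SENT src=10.0.0.1 dst=129.13.162.95 sport=44320 dport=80 "
        "[UNREPLIED] src=129.13.162.95 dst=84.56.237.68 sport=80 dport=44320 "
        "mark=0 use=1")
entry = parse_conntrack_line(line)

# Addresses of the SYN/ACK that arrived on the inet interface in the dump.
print(entry["state"], entry["unreplied"],
      reply_matches(entry, "129.13.162.95", 80, "84.56.237.68", 44320))
```

The SYN/ACK seen on the wire fits the reply tuple conntrack is waiting for, yet the entry stays in SYN_SENT and [UNREPLIED]. That is the puzzling part of this report.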

ICMP echo requests were also masqueraded, but the replies were ignored.

Weird observation 1:

   ip route del default
   ip route add default via 10.0.0.17

This resulted in working masquerading, this time over device "vpn0", which is
a tuntap interface. Working means that outgoing packets were correctly
rewritten with source 10.0.0.5 (local address of vpn0) and replies were
correctly "un"-translated.

Weird observation 2:

Some sites could be connected to with TCP. It turned out that those
sites did not support TCP SACK. Indeed, turning off SACK either on the
remote side of a connection or on the originator side resulted in working
masquerading:

eth0:
   19:23:29.928470 IP 10.0.0.1.45611 > 129.13.162.95.80: S 4113365634:4113365634(0) win 5840 <mss 1460>
   19:23:29.942246 IP 129.13.162.95.80 > 10.0.0.1.45611: S 4161877683:4161877683(0) ack 4113365635 win 5840 <mss 1460>
   19:23:29.942313 IP 10.0.0.1.45611 > 129.13.162.95.80: . ack 1 win 5840

inet:
   19:23:29.928249 IP 84.56.237.68.45611 > 129.13.162.95.80: S 4113365634:4113365634(0) win 5840 <mss 1452>
   19:23:29.942199 IP 129.13.162.95.80 > 84.56.237.68.45611: S 4161877683:4161877683(0) ack 4113365635 win 5840 <mss 1460>
   19:23:29.942332 IP 84.56.237.68.45611 > 129.13.162.95.80: . ack 1 win 5840

However, ICMP still is not masqueraded.
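
The difference between the failing and the working connection shows up in the TCP options of the SYN segments. As a rough illustration (the helper names `syn_options` and `offers_sack` are hypothetical, and the two sample lines are the SYNs captured on the inet interface in this report), the options can be pulled out of tcpdump's text output like this:

```python
import re

def syn_options(dump_line):
    """Return the TCP options tcpdump printed between <...>, or [] if none."""
    match = re.search(r"<([^>]*)>", dump_line)
    return match.group(1).split(",") if match else []

def offers_sack(dump_line):
    """True if the segment carries the SACK-permitted option (sackOK)."""
    return "sackOK" in syn_options(dump_line)

# SYN of the connection that was reset (SACK enabled on the originator).
failing_syn = ("19:17:24.364505 IP 84.56.237.68.44320 > 129.13.162.95.80: S "
               "3745828676:3745828676(0) win 5840 <mss 1452,nop,nop,sackOK>")
# SYN of the connection that got through the handshake (SACK turned off).
working_syn = ("19:23:29.928249 IP 84.56.237.68.45611 > 129.13.162.95.80: S "
               "4113365634:4113365634(0) win 5840 <mss 1452>")

for name, line in (("failing", failing_syn), ("working", working_syn)):
    print(name, syn_options(line), "SACK offered:", offers_sack(line))
```

Apart from the ports and sequence numbers, the only difference between the two SYNs is the `nop,nop,sackOK` option, which is why SACK looked like the trigger.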

Kernels that worked:

   2.6.13-rc7, 2.6.12.5, 2.6.11 and lower, compiled for x86 with gcc-3.4

Kernels that don't work:

   2.6.13-rc7 (compiled with gcc-3.4 and 4.0.2 debian), 2.6.13 (gcc-4.0.2)

Kernel configuration was exactly the same for the 2.6.13-rc7 kernels,
modulo the cpu and architecture selections.

I have a somewhat nontrivial source routing set-up on that machine that I
could document more if that could be a possible reason for that problem. I
am confident that this is not a configuration error, as the configuration
worked basically unchanged since the 2.4 days, and I am confident it's not
an iptables setup problem either, as I can reproduce it with empty rules
except for the masquerading rule.

I did not mention UDP because I didn't test it, but it's likely that UDP
masquerading also fails.

Any idea what I could look at or try out to find out more about this
problem?

-- 
The choice of a
      -----==- _GNU_
  ==-- _   generation Marc Lehmann
  ---==---(_)__  __   __  [EMAIL PROTECTED]
  --==---/ / _ \/ // /\ \/ /  http://schmorp.de/
  -=/_/_//_/\_,_/ /_/\_\  XX11-RIPE

Re: Kernel BUG at "fs/exec.c":777

2005-08-21 Thread Marc Lehmann

On Sun, Aug 21, 2005 at 01:49:45AM -0700, Andrew Morton <[EMAIL PROTECTED]> 
wrote:
> Marc Lehmann <[EMAIL PROTECTED]> wrote:
> >
> > If wanted, I can probably reproduce
> > that without the nvidia kernel module loaded.
> > 
> 
> Yes, please do that, thanks.

Ooops, you are not Alexander Nyberg :) Sorry, to give my previous reply
more context: I had a conversation with Alexander Nyberg who wanted to
debug this problem this weekend, and I gave detailed instructions on how
to reproduce it (which is a bit awkward). I also wrote a script that
doesn't rely on X running, but triggers the bug much less often (in fact,
only twice for me so far), and then it seems only the first time after
reboot (which *could* be caused by the very different timing of the
stat()-threads due to the extra disk access).

Let's see what Alexander found out (if he found time). The problem does
not happen (or is not reproducible) with newer IO::AIO releases, as those
don't start threads in the child after the fork/before the exec.

-- 
The choice of a
  -==- _GNU_
  ==-- _   generation Marc Lehmann
  ---==---(_)__  __   __  [EMAIL PROTECTED]
  --==---/ / _ \/ // /\ \/ /  http://schmorp.de/
  -=/_/_//_/\_,_/ /_/\_\  XX11-RIPE

Re: Kernel BUG at "fs/exec.c":777

2005-08-21 Thread Marc Lehmann

On Sun, Aug 21, 2005 at 01:49:45AM -0700, Andrew Morton <[EMAIL PROTECTED]> 
wrote:
> Marc Lehmann <[EMAIL PROTECTED]> wrote:
> >
> > If wanted, I can probably reproduce
> > that without the nvidia kernel module loaded.
> > 
> 
> Yes, please do that, thanks.

I tried a few times, booting into text mode (so the nvidia module, which the
X server loads, was never loaded) and running the oops script. On the third
try I got the oops again, but not afterwards (I kept running it on the same
machine).

-- 
The choice of a
  -==- _GNU_
  ==-- _   generation Marc Lehmann
  ---==---(_)__  __   __  [EMAIL PROTECTED]
  --==---/ / _ \/ // /\ \/ /  http://schmorp.de/
  -=/_/_//_/\_,_/ /_/\_\  XX11-RIPE

Kernel BUG at "fs/exec.c":777

2005-08-17 Thread Marc Lehmann

(A courtesy CC: on replies would be appreciated, thanks)

Hi!

I get the above oops message (full details below) sometimes when running the
CVS version of "cv", a gtk+ image viewer.

I use kernel 2.6.12.5, but it occurred on 2.6.11 that I ran earlier, too.
Unfortunately, it only happens during interactive use (or at least my
simple test scripts were unable to reproduce the behaviour yet).

cv is a perl/gtk script that uses the IO::AIO module. That module starts a
number of threads that basically emulate asynchronous I/O.

The oops happens when cv starts to move files (which it does by fork+exec
of /bin/mv, which is where it oopses). IO::AIO has a pthread_atfork
handler that recreates the aio threads after the fork (but doesn't kill
the threads before the fork). The forked process then does an exec() and
rarely oopses.

So what happens is:

   pthread_create (4 or more times)
   fork           ("main" thread forks)
   pthread_create (4 or more times)
   exec           ("main" thread execs)
   ...            (very rarely oopses)

All in quick succession.
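
The same sequence can be sketched in a few lines of Python, whose threads are ordinary pthreads on Linux. This is only an illustration of the ordering, not the IO::AIO code and not the test script mentioned in this thread. The function names are made up, idle threads stand in for the aio workers, and it should not be expected to trigger the BUG on a kernel where the race is fixed.

```python
import os
import sys
import threading

def start_idle_threads(count, stop):
    """Start `count` threads that just wait, standing in for aio workers."""
    threads = [threading.Thread(target=stop.wait, daemon=True) for _ in range(count)]
    for thread in threads:
        thread.start()
    return threads

def fork_threads_exec(argv, workers=4):
    """Run the sequence: threads, fork, threads again in the child, exec.

    Returns the child's exit status as seen by the parent, or -1 if the
    child did not exit normally.
    """
    stop = threading.Event()
    start_idle_threads(workers, stop)                 # pthread_create (4 or more times)
    pid = os.fork()                                   # "main" thread forks
    if pid == 0:
        try:
            # Threads are recreated in the child, as the atfork handler did.
            start_idle_threads(workers, threading.Event())
            os.execv(argv[0], argv)                   # "main" thread execs
        finally:
            os._exit(127)                             # only reached if exec failed
    _, status = os.waitpid(pid, 0)
    stop.set()                                        # let the parent's idle threads finish
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1

if __name__ == "__main__":
    # Exec a trivial command; the report used /bin/mv at this point.
    print(fork_threads_exec([sys.executable, "-c", "pass"]))
```

The interesting window is between the second round of thread creation and the exec: the child is briefly a multi-threaded process again when it enters the exec path, which is where de_thread() runs.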

fs/exec.c:777 is:

   593 static inline int de_thread(struct task_struct *tsk)
       ...
   776         if (!thread_group_empty(current))
   777                 BUG();
   778         if (!thread_group_leader(current))
   779                 BUG();
   780         return 0;

2.6.11 oopsed at the same BUG().

The system is an SMP dual opteron in 64 bit mode with gcc-3.3 (I think)
compiled kernel and the nvidia kernel module loaded (but the program only
does X calls, no direct gl access). If wanted, I can probably reproduce
that without the nvidia kernel module loaded.

If any other info is required to fix that bug I'll happily try to find
out or test things.

Thanks!

The complete OOPS is:

--- [cut here ] - [please bite here ] -
Kernel BUG at "fs/exec.c":777
invalid operand:  [1] SMP 
CPU 0 
Modules linked in: nls_utf8 nls_cp850 vfat fat loop nvidia tg3 snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_seq_midi snd_seq_midi_event snd_seq snd_emu10k1 snd_seq_device snd_util_mem snd_hwdep w83627hf i2c_sensor i2c_isa amd64_agp 3w_9xxx
Pid: 11032, comm: cv Tainted: P  2.6.12.5
RIP: 0010:[8017e47b] 8017e47b{flush_old_exec+1531}
RSP: 0018:810002cddd28  EFLAGS: 00010202
RAX: 81001d064a90 RBX: 0001 RCX: 
RDX: 81001d064910 RSI: 81003e501680 RDI: 81003fec4e80
RBP: 81002555ccc0 R08: 805b1880 R09: 0002
R10:  R11: 810001e104e0 R12: ffb0
R13: 81002577a8c0 R14: 81002577b0c8 R15: 81003e501680
FS:  2b1a7e10() GS:80576780() knlGS:56a06500
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 03563600 CR3: 15335000 CR4: 06e0
Process cv (pid: 11032, threadinfo 810002cdc000, task 81001d064910)
Stack: 00010101 801a2eee 0080 81001295e480 
   0080 81002dfbd600 8100 81002dfbd600 
   1295e480 81000a4c9b40 
Call Trace: 801a2eee{dnotify_parent+46} 8019f4a7{load_elf_binary+1335}
   80157fd3{buffered_rmqueue+323} 8019ef70{load_elf_binary+0}
   8017ea3e{search_binary_handler+158} 8017ed82{do_execve+386}
   8010e72a{system_call+126} 8010d181{sys_execve+65}
   8010eb4a{stub_execve+106} 

Code: 0f 0b 05 ec 42 80 ff ff ff ff 09 03 65 48 8b 04 25 00 00 00 
RIP 8017e47b{flush_old_exec+1531} RSP 810002cddd28
 nfs warning: mount version older than kernel

-- 
The choice of a
  -==- _GNU_
      ==-- _   generation Marc Lehmann
  ---==---(_)__  __   __  [EMAIL PROTECTED]
  --==---/ / _ \/ // /\ \/ /  http://schmorp.de/
  -=/_/_//_/\_,_/ /_/\_\  XX11-RIPE

Re: critical bugs in md raid5 and ATA disk failure/recovery modes

2005-01-27 Thread Marc Lehmann

ation for the
drive.

What the drive does in many failures is simply tag the block as unreadable
(mostly because the checksum/ecc data does not match) and correct this on
write. Most drives will also check the surface and allocate a replacement
block automatically if required.

> of replacement blocks, and will eventually fail. That is why

Then the drive would be very buggy. If it runs out of replacement blocks it
will not suddenly fail, but only be unable to repair the block.

> Linux "forces" early replacement of the disk on any error - it is the
> safest thing to do.

That is certainly untrue. The safest thing to do would doubtlessly be to
make a warning that the disk needs to be replaced but still provide the
data as long as possible, instead of killing the device.

It would certainly make sense to not touch the disk in write mode, or, if
one is paranoid, in read mode, but right now the device is simply lost.

> > Of course, but that's supposed to be worked around by using a journaling
> > file system, right?
> 
> Nope, journaling is no magical fix for meta data corruption.

Metadata corruption of what? If you mean the raid device, then yes; if you
mean the filesystem, then no.

raid5 works by relying on the error detection of the underlying device. It
will suffer from the same kind of corruption that a normal device suffers,
i.e. if data gets corrupted silently it's gone. However, in other cases
(loud error reporting), the raid device will not corrupt data, as it can
always know which data is there and which isn't, just as with a normal
disk.

What raid provides is just more redundant data in normal operation - it
doesn't suffer from silent data corruption more than a normal disk.

-- 
The choice of a
  -==- _GNU_
  ==-- _   generation Marc Lehmann
  ---==---(_)__  __   __  [EMAIL PROTECTED]
  --==---/ / _ \/ // /\ \/ /  http://schmorp.de/
  -=/_/_//_/\_,_/ /_/\_\  XX11-RIPE

Re: critical bugs in md raid5

2005-01-26 Thread Marc Lehmann

On Thu, Jan 27, 2005 at 06:11:34AM +0100, Andi Kleen <[EMAIL PROTECTED]> wrote:
> Marc Lehmann <[EMAIL PROTECTED]> writes:
> > The summary seems to be that the linux raid driver only protects your data
> > as long as all disks are fine and the machine never crashes.
> 
> "as long as the machine never crashes". That's correct. If you think
> about how RAID 5 works there is no way around it. When a write to 

I disagree. When not working in degraded mode, it's absolutely reasonable
to e.g. use only the non-parity data. A crash with raid5 is in no way
different to a crash without raid5 then: either the old data is on the
disk, the new data is on the disk, or you had some catastrophic disk event
and no data is on the disk.

The case I reported was not a catastrophic failure: either the old or new
data was on the disk, and the filesystem journaling (which is ext3) will
take care of it. Even if the parity information is not in sync, either old or
new data is on the disk.

> a single stripe is interrupted (machine crash) and you lose a disk
> during the recovery a lot of data (even unrelated to the data just written)
> is lost.

This is not what I described. In fact, I haven't lost any data, despite
having had a number of such problems (I did verify that afterwards, and
found no differences. Maybe this is luck, but it seems to happen in the
majority of cases, and I had a similar problem at least 5 or 6 times
because I didn't encounter the bug I reported).

> But that's nothing inherent in Linux RAID5. It's a generic problem.
> Pretty much all Software RAID5 implementations have it.

Indeed, but I think linux' behaviour is especially poor. For example, the
renumbering of the devices or the strange rebuild-restart behaviour (which
is definitely a bug) will make recovery unnecessarily complicated.

> RAID-1 helps a bit, because you either get the old or the new data,
> but not some corruption.

You don't get any magical corruption with RAID5 either... the data contents
will either be old, or new. The difference is that you cannot trust parity.
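
A toy model may help separate the two positions here. The following Python sketch is an assumption-laden illustration, not how md is implemented: it treats one stripe of a 3-disk RAID5 as two data blocks plus their XOR parity, interrupts a write before the parity is updated, and then compares a direct read with a rebuild from the stale parity.

```python
from functools import reduce

def xor_blocks(*blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a 3-disk RAID5: two data blocks plus their parity.
old_a, old_b = b"AAAA", b"BBBB"
parity = xor_blocks(old_a, old_b)

# A write replaces block A, but the machine crashes before parity is updated.
new_a = b"aaaa"
disk_a, disk_b, disk_p = new_a, old_b, parity    # parity is now stale

# Non-degraded read: data blocks are read directly, parity is not consulted,
# so every block holds either its old or its new contents.
assert disk_a in (old_a, new_a) and disk_b == old_b

# Degraded read: disk B is lost and must be rebuilt from A and the stale parity.
rebuilt_b = xor_blocks(disk_a, disk_p)
print("direct read of B:", disk_b)
print("rebuilt B       :", rebuilt_b,
      "(matches)" if rebuilt_b == old_b else "(garbage)")
```

As long as all disks are present, the stale parity is harmless and can simply be recomputed from the data. Only when a disk is lost before that resync does the stale parity turn into wrong data, and then it affects block B, which was never written.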

> In practice even old data can be a big
> problem though (e.g. when file system metadata is affected)

Of course, but that's supposed to be worked around by using a journaling
file system, right?

> Morale: if you really care about your data backup very often and
> use RAID-1 or get an expensive hardware RAID with battery backup
> (all the cheap "hardware RAIDs" are equally useless for this) 

Yes, I have been thinking of that for some time now, but always had a problem
because the affordable ones have low performance. But given linux'
effective slower-than-a-single-disk performance it shouldn't be hard to
beat nowadays.

There is, however, at least the resyncing with only 4 out of 5 disks, which
is doubtlessly a bug somewhere.

-- 
        The choice of a
  -==- _GNU_
  ==-- _   generation Marc Lehmann
  ---==---(_)__  __   __  [EMAIL PROTECTED]
  --==---/ / _ \/ // /\ \/ /  http://schmorp.de/
  -=/_/_//_/\_,_/ /_/\_\  XX11-RIPE

critical bugs in md raid5

2005-01-26 Thread Marc Lehmann

 never read more than about 25-35MB/s top, which is much
less than the speed of a single disk - dd'ing from a single disk gives a
speed of >50MB/s, and dd'ing from, say, 4 or 5 disks gives me well over
200MB/s).

Of course, this last issue is not critical at all - I have been living with
this problem since the 2.4 days :)

Thanks for all the good work that already went into linux, though!

Hope this helps,

-- 
The choice of a
  -==- _GNU_
  ==-- _   generation Marc Lehmann
  ---==---(_)__  __   __  [EMAIL PROTECTED]
  --==---/ / _ \/ // /\ \/ /  http://schmorp.de/
  -=/_/_//_/\_,_/ /_/\_\  XX11-RIPE

Re: VIA's Southbridge bug: Latest (pseudo-)patch

2001-06-06 Thread Marc Lehmann


On Sun, Jun 03, 2001 at 11:10:02PM +0100, Adrian Cox <[EMAIL PROTECTED]> wrote:
> > data corruption was easily detectable, one couldn't even write 500megs
> > without altered bytes).
> 
> 
> Wrong way round. You're right that the pci master is supposed to handle 
> delayed transactions, but during data transfer the pdc is the pci master 
> and the northbridge is the PCI target.

Ok, so it could be the promise controller (the controller, however, worked
for a long time in another board with no via chipset and pci delayed
transactions enabled, so I guess it is not only dependent on the promise
controller).

And this means that there is no automatic workaround, since not all
systems seem to have this problem.

I *do* hate silent data corruption :()

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED]  |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: VIA's Southbridge bug: Latest (pseudo-)patch

2001-06-01 Thread Marc Lehmann

On Fri, Jun 01, 2001 at 11:28:48AM -0400, Jeff Garzik <[EMAIL PROTECTED]> wrote:
> Once you get into the area of flushing data (or not flushing, which is
> what delayed txn would imply), it is entirely possible that the driver
> simply does not support what occurs when the PCI Delay Txn option is
> set.

Aren't PCI delayed transactions supposed to be handled by the pci master
(e.g. my northbridge), not by the (software) driver for my pdc? I would
also be surprised if my pdc actually used that feature, not to speak of
the fact that the promise + harddisk worked fine in another computer (the
data corruption was easily detectable, one couldn't even write 500megs
without altered bytes).

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __ ____  __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED]  |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: VIA's Southbridge bug: Latest (pseudo-)patch

2001-06-01 Thread Marc Lehmann

On Sat, May 19, 2001 at 11:07:21AM +0200, Axel Thimm <[EMAIL PROTECTED]> 
wrote:
> if( KT133A || KT133 || KX133 ) {
>   if( Mainboard=="Epox 8KTA-3(+)" && BIOS>="8kt31417" )
> return 0; /* EPOX already fixed it their way. */
> #ifdef NEW_PATCH
>   Offset 76: Set bit5=0 and bit4=1 ("every PCI master grand")
> #else /* this is already part of 2.4.4 */
>   Offset 70: Set bit1=0 ("PCI Delay Transaction = 0")

One thing I found out using trial and error is that setting "PCI Delay
Transaction" to enabled causes data corruption on WRITE to my ide drives
connected to a Promise Ultra 100 PCI controller (I didn't get any
corruption on the devices connected to the via ide interface, presumably
because my bios already had the right fix).

So, while the "every pci master grant" setting apparently fixes the internal
via ide interface corruption, the PCI Delay Transaction option also must be
buggy (or my promise controller is) and causes data corruption at least
with an additional promise ultra 100.

board: asus cuv4x-d (Apollo MVP3 AGP + via686b southbridge)
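
For reference, the two register changes named in the quoted pseudo-patch
come down to simple bit masks. The sketch below is in Python and works on
plain integers only. It does not read or write PCI configuration space,
the function names are made up for illustration, and the meaning of the
bits is taken solely from the pseudo-patch quoted above.

```python
def clear_bit(value, bit):
    """Return the 8-bit register value with the given bit cleared."""
    return value & ~(1 << bit) & 0xFF

def set_bit(value, bit):
    """Return the 8-bit register value with the given bit set."""
    return (value | (1 << bit)) & 0xFF

def new_patch_offset_76(reg):
    """Offset 76: bit5=0 and bit4=1, as listed in the pseudo-patch."""
    return set_bit(clear_bit(reg, 5), 4)

def old_patch_offset_70(reg):
    """Offset 70: bit1=0, i.e. "PCI Delay Transaction = 0"."""
    return clear_bit(reg, 1)

print(f"{new_patch_offset_76(0x20):#04x}")  # 0x10
print(f"{old_patch_offset_70(0xFF):#04x}")  # 0xfd
```

All other bits of each register are left untouched, which is the reason
for masking rather than writing a fixed value.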

-- 
  -==- |
  ==-- _       |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED]  |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: 2.4.2 + aic7xxx still broken

2001-03-10 Thread Marc Lehmann

On Wed, Feb 28, 2001 at 02:07:30PM +0100, Igor Mozetic <[EMAIL PROTECTED]> wrote:
> 2.4.2 + stock aic7xxx:
> --
> ...
> SCSI host 0 channel 0 reset (pid 0) timed out - trying harder

Interestingly, I have exactly the same problems when booting my smp kernel
with either maxcpus=1, nosmp or the second cpu removed but NOT when the
kernel boots with two cpus (it works *perfectly*)

Unless maxcpus=1 switches off apic (it doesn't) this doesn't look like an
IRQ problem, as the bios has no idea of the maxcpus=1 option.

One thing that puzzles me is why the new driver looks for db_185.h in
/usr/include/db, which seems to be a rather nonstandard position for that
header (none of my slackware or home-grown boxes have that directory, all
of them have the db_185.h file in /usr/include, which is the standard
location I'd think since glibc-2.1 installed it there).

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __ ____  __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED]  |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: linux swap freeze STILL in 2.4.x

2001-02-26 Thread Marc Lehmann


On Mon, Feb 26, 2001 at 08:11:55AM +0100, Mike Galbraith <[EMAIL PROTECTED]> wrote:
> Hmm.. I remember having this problem and it was a problem with strace.

Well, I obviously strace'd it to find out why I get a memory fault without
one (I would be happy if it worked without strace ;->)

> Anyway, it works fine here with virgin 2.4.2, so it seems unlikely it's
> a kernel problem.

> 259   execve("/sbin/losetup", ["losetup", "/dev/loop0", "/dev/hda5"], [/* 47 vars */]) = 0

The -e switch is causing the memory fault and subsequent breakage:

743   open("/dev/hdd", O_RDWR)  = 4
743   open("/dev/loop0", O_RDWR)= 5
743   mlockall(0x3, 0x804c272)  = 0
743   ioctl(5, LOOP_SET_FD, 0x4)= -1 ENOSYS (Function not implemented)
743   ioctl(5, LOOP_SET_FD, 0x4)= 0
743   ioctl(5, LOOP_SET_STATUS, 0xb5d8) = -1 ENOSYS (Function not implemented)
743   ioctl(5, LOOP_SET_STATUS, 0xb5d8) = -1 ENOSYS (Function not implemented)
743   ioctl(5, LOOP_SET_STATUS, 0xb5d8) = -1 ENOSYS (Function not implemented)
743   ioctl(5, LOOP_SET_STATUS, 0xb5d8) = -1 ENOSYS (Function not implemented)
743   ioctl(5, LOOP_SET_STATUS <unfinished ...>
743   +++ killed by SIGSEGV +++

(which is a strange strace anyway...)

However, I just need to wait until there is a new crypto patch (and, if
not, I'll eventually have to hack it myself to get my data. After all it's
source... ...)

-- 
  -==- |
  ==-- _       |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED]  |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: linux swap freeze STILL in 2.4.x

2001-02-25 Thread Marc Lehmann


Oh, and one last thing I forgot: loop devices. Since 2.4.1 (the first
version I used) through 2.4.2 and 2.4.2ac3 I only get:

cerebro:~# strace -f -o x losetup -e rc6 /dev/loop0 /dev/hdd
Memory Fault

And then no access to the loop device works anymore (clearly this is after
the 2.4.0.something crypto-patch applied, so this is probably not a 2.4.2
issue anyway since there is no 2.4.2 crypto patch).

Happy Hacking ;)

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED]  |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: linux swap freeze STILL in 2.4.x

2001-02-25 Thread Marc Lehmann

On Sun, Feb 25, 2001 at 05:58:32PM +0100, Mike Galbraith <[EMAIL PROTECTED]> wrote:
> Signal delivery during oomest does not work (last time I tested).
> Andrea fixed this once.. long time ~problem.

Hmm, here is something that is new: Just now, the machine gets VERY very
sluggish and swaps:

             total       used       free     shared    buffers     cached
Mem:        255296     253708       1588          0      29808     183020
-/+ buffers/cache:      40880     214416
Swap:            2          2          0

Now, there is plenty of free memory (200megs!) but no swap space and the
kernel keeps swapping. The only interesting processes here are:

  PID TTY      STAT   TIME  MAJFL   TRS   DRS        RSS   %MEM COMMAND
  112 ?        S      0:00    742  1366 38921       3460    1.3 /opt/mysql//libexec/mysqld --basedir=/opt/mysql/ --datadir=/var/mysql --user=root --pid-
  205 ?        S      2:28  12335  1444 27167 4294966180 6728.9 /usr/bin/X11/X :0 -audit 1 -auth /etc/cfg/Xauthority -a 2 -once -t 5 vt02 -defer
  421 pts/13   TN     1:00    804   707 31552      17444    6.8 /usr/bin/perl ./summarize
  376 pts/10   R      7:07    269   129 22614       1852    0.7 rsync -av . doom cerebro-root/. --delete

when I SIGSTOP the summarize script (which uses mysql very intensively)
the system starts to work again but the memory situation does not
improve. The RSS size of X puzzles me a bit, but this was always the case
under 2.4.2 and 2.4.2ac3 (and maybe before) and didn't cause a problem
before.
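
The numbers above can be cross-checked with a few lines of Python. This
is only a sketch: the helper names are made up, the first function
re-derives the "-/+ buffers/cache" line from the Mem: line, and the second
shows one possible reading of the odd RSS value for X, namely a small
negative number printed as an unsigned 32-bit integer. Nothing in the
thread confirms that reading.

```python
def buffers_cache_line(used, free, buffers, cached):
    """Derive the "-/+ buffers/cache" figures from the Mem: line of free."""
    used_without_cache = used - buffers - cached
    free_with_cache = free + buffers + cached
    return used_without_cache, free_with_cache

def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed one."""
    return value - 2**32 if value >= 2**31 else value

# Values copied from the free output above (in KB).
print(buffers_cache_line(used=253708, free=1588, buffers=29808, cached=183020))
# (40880, 214416)

# RSS reported for the X server in the ps listing.
print(as_signed32(4294966180))  # -1116
```

The first result matches the line free printed, so the "200megs" figure
is buffers and cache that could be reclaimed rather than unused memory.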

Another bug I found is that initializing md on the kernel commandline in
the wrong order (first md1 then md0) keeps the kernel from mounting md0
as root-device. Another problem is that, when I "startraid /dev/md1" (a
two-partition, striped raid without persistent superblock) I get strange
errors in /var/log/kernel (if anybody asks I'll provide them) but it works
fine when I use md=x on the kernel commandline. It's not a configuration
problem since I got the same strange problems with the mdstart I used
successfully under 2.1 and 2.2.

Another nitpick is kernel-pcmcia: For some unexplainable reason, the
kernel SWITCHES OFF POWER to the pcmcia slots BEFORE notifying apmd, which
then tries to save important data and locks (not the machine, just the
script) since the network is suddenly dead although interface etc.. all
still exist. Under the pcmcia-cs package one could work around this bug by
specifying do_apm=0 for the pcmcia_core module, which has no effect under
2.4.

So I do keep asking myself: does anybody actually use 2.4 on production
machines? ;-> (Historically, it seems that my machines tend to freeze
easily because of sudden OOM and/or reiserfs ;)

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __ ____  __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED]  |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: linux swap freeze STILL in 2.4.x

2001-02-25 Thread Marc Lehmann


On Sun, Feb 25, 2001 at 05:58:32PM +0100, Mike Galbraith <[EMAIL PROTECTED]> wrote:
> > Usually I swapon ./swap some 512MB swapfile, but today I forgot it. When the
> > machine started to get sluggish I sent the process a -STOP signal.
> 
> Signal delivery during oomest does not work (last time I tested).
> Andrea fixed this once.. long time ~problem.

Well, the signal delivery seemed to have worked fine - the machine
was quite usable (it swapped a lot, but the system was never unusable
for longer than a second or so). The problem started when I did the
swapon. Well, it didn't start, the system just froze.

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED]  |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

linux swap freeze STILL in 2.4.x

2001-02-25 Thread Marc Lehmann


It seems linux-2.4 still freezes on out-of-memory situations:

I was using 2.4.2-ac3 SMP and had a fairly large background job that takes
hundreds of megabytes of memory, much more than I have:

Mem:255296  81836 173460  0  10324  30608
Swap:2  0  2

Usually I swapon ./swap some 512MB swapfile, but today I forgot it. When the
machine started to get sluggish I sent the process a -STOP signal.

Swap:2  2  0

O.k., I had about 12MB of main memory free (in the -/+ buffers line of
free) and the machine was sluggish but workable for about five minutes. At
the instant I did a swapon ./swap the machine froze hard (no sysrq, no
ping etc...)

I thought these complete freezes on OOM-situations had been fixed in
2.4.x? Do I have to watch out for Andrea's fix-2.4-oom patches?

;)

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED]  |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

major security bug in reiserfs (may affect SuSE Linux)

2001-01-09 Thread Marc Lehmann


We are still investigating, but there seems to be a major security problem
in at least some versions of reiserfs. Since reiserfs is shipped with
newer versions of SuSE Linux and the problem is too easy to reproduce and
VERY dangerous I think alerting people to this problem is in order.

We have tested and verified this problem on a number of different systems
and kernels 2.2.17/2.2.8 with reiserfs-3.5.28 and probably other versions.

Basically, you do:

mkdir "$(perl -e 'print "x" x 768')"

I.e. create a directory with a very long name. The name itself doesn't
seem to be of relevance (we found this out by doing mkdir "$(cat /etc/hosts)"
for other tests). This works.  The next ls (or echo *) command will segfault
and the kernel oopses. All following accesses to the volume in question will
oops and hang the process, even after a reboot.

reiserfsck (the filesystem check program) does _NOT_ detect or solve this
problem:

Replaying journal..ok
Checking S+tree..ok
Comparing bitmaps..ok

But fortunately, rmdir <filename> works and seems to leave the filesystem
undamaged.

Since a kernel oops results (see below), this indicates a buffer overrun
(the kernel jumps to address 78787878, which is "xxxx") inside the kernel,
which is of course very nasty (think ftp-upload!) and certainly gives you
root access from anywhere, even from inside a chrooted environment. We
didn't pursue this further.
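
The link between the fault address and the directory name is easy to
check. A minimal sketch in Python, using only the address from the oops
below and the name from the mkdir command above:

```python
# Fault address from the oops, as a hex string.
fault_address = "78787878"

# Each byte 0x78 is the ASCII code for the letter "x".
decoded = bytes.fromhex(fault_address).decode("ascii")
print(decoded)  # xxxx

# The directory name from the mkdir example: 768 times the letter "x".
name = "x" * 768
print(decoded in name)  # True
```

So the address the kernel tried to use consists of bytes taken from the
directory name, which is what points to an overrun rather than a random
crash.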

The best workaround at this time seems to be to uninstall reiserfs
completely or not allow any user access (even indirect) to these volumes.
While this individual bug might be easy to fix, we believe that other,
similar bugs should be easy to find so reiserfs should not be trusted (it
shouldn't be trusted to full user access for other reasons anyway, but it
is still widely used).

Unable to handle kernel paging request at virtual address 78787878
current->tss.cr3 = 0d074000, %cr3 = 0d074000
*pde = 
Oops: 0002
CPU:0
EIP:    0010:[<c013f875>]
EFLAGS: 00010282
eax:    ebx: bfffe78c   ecx:    edx: bfffe78c
esi: ccbddd62   edi: 78787878   ebp: 0300   esp: ccbddd3c
ds: 0018   es: 0018   ss: 0018
Process bash (pid: 292, process nr: 54, stackpage=ccbdd000)
Stack: c013f66a ccbddf6c cd10 ccbddd62 030c c0136d49 0700 2013 
   1000 7878030c 78787878 78787878 78787878 78787878 78787878 78787878 
   78787878 78787878 78787878 78787878 78787878 78787878 78787878 78787878 
Call Trace: [<c013f66a>] [<c0136d49>]
Code: 89 1f 8b 44 24 18 29 47 08 31 c0 5b 5e 5f 5d 81 c4 2c 01 00 


-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED] |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

major security bug in reiserfs (may affect SuSE Linux)

2001-01-09 Thread Marc Lehmann


We are still investigating, but there seems to be a major security problem
in at least some versions of reiserfs. Since reiserfs is shipped with
newer versions of SuSE Linux and the problem is too easy to reproduce and
VERY dangerous I think alerting people to this problem is in order.

We have tested and verified this problem on a number of different systems
and kernels 2.2.17/2.2.8 with reiserfs-3.5.28 and probably other versions.

Basically, you do:

mkdir "$(perl -e 'print "x" x 768')"

I.e. create a very long directory. The name doesn't seem to be of
relevance (we found this out by doing mkdir "$(cat /etc/hosts)" for other
tests). This works.  The next ls (or echo *) command will segfault and the
kernel oopses. all following accesses to the volume in question will oops
and hang the process, even afetr a reboot.

reiserfsck (the filesystem check program) does _NOT_ detect or solve this
problem:

Replaying journal..ok
Checking S+tree..ok
Comparing bitmaps..ok

But fortunately, rmdir filename works and seems to leave the filesystem
undamaged.

Since a kernel oops results (see below), this indicates a buffer overrun
(the kernel jumps to address 78787878, which is "") inside the kernel,
which is of course very nasty (think ftp-upload!) and certainly gives you
root access from anywhere, even from inside a chrooted environment. We
didn't pursue this further.

The best workaround at this time seems to be to uninstall reiserfs
completely or not allow any user access (even indirect) to these volumes.
While this individual bug might be easy to fix, we believe that other,
similar bugs should be easy to find so reiserfs should not be trusted (it
shouldn't be trusted to full user access for other reasons anyway, but it
is still widely used).

Unable to handle kernel paging request at virtual address 78787878
current-tss.cr3 = 0d074000, %cr3 = 0d074000
*pde = 
Oops: 0002
CPU:0
EIP:0010:[c013f875]
EFLAGS: 00010282
eax:    ebx: bfffe78c   ecx:    edx: bfffe78c
esi: ccbddd62   edi: 78787878   ebp: 0300   esp: ccbddd3c
ds: 0018   es: 0018   ss: 0018
Process bash (pid: 292, process nr: 54, stackpage=ccbdd000)
Stack: c013f66a ccbddf6c cd10 ccbddd62 030c c0136d49 0700 2013 
   1000 7878030c 78787878 78787878 78787878 78787878 78787878 78787878 
   78787878 78787878 78787878 78787878 78787878 78787878 78787878 78787878 
Call Trace: [c013f66a] [c0136d49] 
Code: 89 1f 8b 44 24 18 29 47 08 31 c0 5b 5e 5f 5d 81 c4 2c 01 00 



Re: `rmdir .` doesn't work in 2.4

2001-01-08 Thread Marc Lehmann


On Tue, Jan 09, 2001 at 02:55:15AM +0100, Andrea Arcangeli <[EMAIL PROTECTED]> wrote:
> > [wakko@:/home/wakko/test] rmdir "`pwd`"
> > rmdir: /home/wakko/test: Invalid argument
> 
> Some other OS with a yet different retval? :)

It can be much worse (irix-6.5.4):

   bash# mkdir x; cd x; rmdir "`pwd`"
   /x: Can't remove current directory or ..

Here the error message makes sense - but is totally wrong in this case :(

And here is linux-2.2.18:

   cerebro:~# mkdir x; cd x;rmdir "`pwd`"
   cerebro:~/x# ls -la
   total 6
   drwxr-x---   0 root root   35 Jan  9 05:54 .
   drwx--  69 root root 5372 Jan  9 05:54 ..
   cerebro:~/x# cd
   cerebro:~# ls -la x
   ls: x: No such file or directory

So, no, linux certainly does NOT remove "." ;)
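
For comparison, the literal `rmdir .` case from the subject line can be probed with a small Python sketch. The helper name `try_rmdir_dot` is made up for this illustration, and the result depends on the operating system; on Linux, rmdir(2) documents EINVAL for a path whose last component is ".". Note that this is a different call from the rmdir "`pwd`" shown above, which passes the full path.

```python
import errno
import os
import tempfile


def try_rmdir_dot():
    """Create a scratch directory, enter it, and call rmdir(".").

    Returns the symbolic errno name of the failure, or None if the
    call unexpectedly succeeded.
    """
    start = os.getcwd()
    scratch = tempfile.mkdtemp()
    try:
        os.chdir(scratch)
        try:
            os.rmdir(".")
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
        return None
    finally:
        # Leave the scratch directory and clean it up if it still exists.
        os.chdir(start)
        if os.path.isdir(scratch):
            os.rmdir(scratch)


result = try_rmdir_dot()
print(result)
```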


Re: ramfs problem... (unlink of sparse file in "D" state)

2001-01-08 Thread Marc Lehmann


On Mon, Jan 08, 2001 at 01:33:50PM -0500, Alexander Viro <[EMAIL PROTECTED]> wrote:
> And prefix would be what? "/"? Besides, I said that you don't have
> read permissions on /foo, not search ones.

You do not need read permissions on /foo to make pathconf on it. This
makes sense: you are not reading the directory...
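
A minimal sketch of that point, assuming a Unix system where Python exposes `os.pathconf` with the `PC_NAME_MAX` name: the directory below is given search permission only (no read permission), and the pathconf query is still expected to answer. The function name and the scratch directory are made up for the illustration.

```python
import os
import stat
import tempfile


def name_max_without_read_permission():
    """Query NAME_MAX on a directory that has search but no read permission."""
    parent = tempfile.mkdtemp()
    target = os.path.join(parent, "foo")
    os.mkdir(target)
    try:
        os.chmod(target, stat.S_IXUSR)  # --x------: search only, no read
        return os.pathconf(target, "PC_NAME_MAX")
    finally:
        # Restore permissions so the scratch directories can be removed.
        os.chmod(target, stat.S_IRWXU)
        os.rmdir(target)
        os.rmdir(parent)


limit = name_max_without_read_permission()
print(limit)
```

The value printed depends on the filesystem; when run as root the permission bits are bypassed anyway, so the sketch only demonstrates the point for an ordinary user.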


Re: Journaling: Surviving or allowing unclean shutdown?

2001-01-06 Thread Marc Lehmann


On Sat, Jan 06, 2001 at 03:35:02PM -0500, Chris Mason <[EMAIL PROTECTED]> wrote:
> > Nobody with working brain would read it completely into memory.

Instead everybody with a working brain would introduce another hashing
layer for every block access? I don't think the reiserfs code (e.g.) would
cope with yet another complication in the code ;)


Re: Journaling: Surviving or allowing unclean shutdown?

2001-01-06 Thread Marc Lehmann

On Fri, Jan 05, 2001 at 11:58:56AM +, David Woodhouse <[EMAIL PROTECTED]> wrote:
> You mount it read-only, recover as much as possible from it, and bin it.
> 
> You _don't_ want the fs code to ignore your explicit instructions not to
> write to the medium, and to destroy whatever data were left.

The problem is: where did you give the explicit instruction? Just that you
define "read-only" as "the medium should not be written" does not mean
everybody else thinks the same.

Actually, I regard "ro" mainly as a "hey kernel, I won't handle writes
now, so please don't try it", like for cd-roms or other non-writable
media, and as a "please, filesystem, stay in a clean state".

That ro means "the medium is never written" is an assumption that does not
hold for most disks anyway and is, in the case of journaling filesystems,
often impossible to implement. You simply can't salvage data without a log
replay. Sure, you can do virtual log replays, but for example the reiserfs
log is currently 32MB. Pinning down that much memory for a virtual log
replay is not possible on low-memory machines.
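
To make the memory argument concrete, here is a rough Python sketch of what a "virtual" log replay amounts to. It is an illustration only and not how reiserfs or any other filesystem implements it: the class name, the 4096-byte block size and the journal layout (a list of block number and data pairs) are all assumptions. The journal contents are kept in an in-memory overlay, the device is never written, and every read has to consult the overlay first.

```python
BLOCK_SIZE = 4096  # assumed block size for this sketch


class VirtualReplay:
    """Read-only view of a device with journal blocks overlaid in memory."""

    def __init__(self, device_blocks, journal):
        self._device = device_blocks  # never modified
        self._overlay = {}
        for block_no, data in journal:
            # Later journal entries for the same block replace earlier ones.
            self._overlay[block_no] = data

    def read(self, block_no):
        # Every block access needs this extra lookup.
        return self._overlay.get(block_no, self._device[block_no])

    def pinned_bytes(self):
        # Memory that must stay resident for as long as the volume is mounted.
        return sum(len(data) for data in self._overlay.values())


device = [bytes(BLOCK_SIZE) for _ in range(8)]
journal = [
    (2, b"a" * BLOCK_SIZE),
    (5, b"b" * BLOCK_SIZE),
    (2, b"c" * BLOCK_SIZE),
]
view = VirtualReplay(device, journal)
print(view.read(2)[:1], view.pinned_bytes())
```

In this model the pinned memory grows with the number of distinct blocks in the log, which is where the concern about a 32MB log on a low-memory machine comes from.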

So the first thing would be to precisely define the meaning of the "ro"
flag.  Before this has happened it is absolutely senseless to argue about
what it means, as it doesn't mean anything at the moment, except (man mount):

  ro Mount the file system read-only.

Which it does even with journaling filesystems...


time function problems with 2.2.18 / hang

2000-12-22 Thread Marc Lehmann


I have an error that occurs after upgrading from 2.2.18pre23 to 2.2.18 +
vm-global-7 patch.

Apart from enhanced stability in low-memory cases (hey, it doesn't
freeze ten times a day ;), I have the problem that once every few days,
preferably under high load, X behaves strangely (window manager shows no
reaction, mouse works OR mouse cursor stops moving OR wm works, mouse works
but rxvts stop working etc.)

When this happens I can still log in via the network and run commands, but
every command that waits (select(0,0,0,xxx) or nanosleep) just hangs:

cerebro:~# strace -f sleep 1
...
nanosleep({1, 0}, 

Also, when I beep the terminal it starts beeping but never stops, so it
seems the timer system inside the kernel is somehow wrecked in this state.
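
The kind of timed wait that hangs in this state can be probed with a short Python sketch (an illustration added here, not something from the original report; the helper name `timed_wait` is made up). It performs the same select-with-timeout-and-no-descriptors call and measures how long it actually took. It assumes a Unix system, where `select.select` accepts three empty lists.

```python
import select
import time


def timed_wait(seconds):
    """Wait via select() with no descriptors and return the elapsed time."""
    start = time.monotonic()
    select.select([], [], [], seconds)
    return time.monotonic() - start


elapsed = timed_wait(0.05)
print(f"{elapsed:.3f}")
```

On a healthy kernel the call returns after roughly the requested time; in the state described above it would presumably never return, just like the nanosleep in the strace output.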

Doing while :;do kill -CONT -1;done lets me do some things, like running top
or killing and restarting X (very slowly ;).

That is the strangest thing I ever saw in a release kernel ;)


Re: recursive exports && linux nfs

2000-12-17 Thread Marc Lehmann

On Fri, Dec 15, 2000 at 11:54:46PM +0100, Pavel Machek <[EMAIL PROTECTED]> wrote:
> > 2) using: I can do cd /nfs/fs, but the directory is always empty, and when I
> >try to step into a subdirectory I always get "No such file or directory".
> > 
> > Thanks a lot for any insights, even if this means "this is not supported"
> > ;)
> 
> This can't be supported, afaict, because nfs handles have limited
> size.

Ehrm, did you really read my mail? Most people told me something like
"recursive exports are not supported" (actually, they are and they work),
and it seems nobody really read what I wrote :(

My problem is that autofs doesn't work. Example:

/   reiserfs
/fs autofs
/fs/big ext2

When I exportfs /, /fs AND /fs/big then I can mount /fs on another box,
but it is always empty, even if something (e.g. /fs/big) is mounted and
can be accessed fine the whole time. Automounting doesn't work, either, of
course.

Another (less grave) problem is that exportfs (and/or rpc.nfsd) require
network access and access to the volume, so they a) mount all automounted
directories (VERY expensive) and b) require network access (making all
clients NOT survive a reboot).


recursive exports && linux nfs

2000-12-12 Thread Marc Lehmann


Hi ;)

I am trying to export the whole filesystem hierarchy on one of my servers
(this includes /fs, which is an automounted directory using autofs).

Now I have two problems:

1) exporting: exportfs does not really export filesystems that are
   not present when exportfs is being called (some of my filesystems
   are only available temporarily). Also, exportfs of course forces the mount
   of all filesystems that are mountable, which can take considerable time.

2) using: I can do cd /nfs/fs, but the directory is always empty, and when I
   try to step into a subdirectory I always get "No such file or directory".

I am using linux-2.2.18, nfsv3 + nfs-utils-0.2.1.

Thanks a lot for any insights, even if this means "this is not supported"
;)


reordering pci interrupts?

2000-11-18 Thread Marc Lehmann


I have a motherboard with a broken bios that is unable to set interrupts
correctly, i.e. it initializes the devices correctly but swaps the
interrupts for slot1/slot3 and slot2/slot4.

Now, is there a way to forcefully re-order the pci-interrupts? I do not
have an io-apic (thus no pirq=xxx), and I tried to poke the interrupt
values directly into /proc/bus/pci/*/*, but the kernel has its own idea.
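
For reference, the values being poked live in the standard PCI configuration header: the Interrupt Line register is at offset 0x3C and the Interrupt Pin register at offset 0x3D. The Python sketch below only parses those two bytes from a header image; the function name and the synthetic header are made up for the illustration, and nothing here reads or writes a real device.

```python
INTERRUPT_LINE_OFFSET = 0x3C  # IRQ number as recorded by the firmware
INTERRUPT_PIN_OFFSET = 0x3D   # 0 = none, 1..4 = INTA#..INTD#

PIN_NAMES = {0: None, 1: "INTA#", 2: "INTB#", 3: "INTC#", 4: "INTD#"}


def read_interrupt_fields(config: bytes):
    """Return (interrupt line, pin name) from a PCI configuration header."""
    if len(config) < 0x40:
        raise ValueError("need at least the 64-byte standard header")
    line = config[INTERRUPT_LINE_OFFSET]
    pin = PIN_NAMES.get(config[INTERRUPT_PIN_OFFSET], "invalid")
    return line, pin


# Synthetic header for illustration: IRQ 11 on pin INTA#.
header = bytearray(64)
header[INTERRUPT_LINE_OFFSET] = 11
header[INTERRUPT_PIN_OFFSET] = 1
fields = read_interrupt_fields(bytes(header))
print(fields)
```

The Interrupt Line register is informational: it records a routing decision made elsewhere and the device itself does not act on it, which would fit the observation that writing a new value does not change which interrupt the kernel uses.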

Thanks a lot for any info (I guess I'll just patch the kernel).


routing problems with 2.2

2000-11-14 Thread Marc Lehmann


The Problem:

the command "telnet 212.172.23.17 80", done from a machine outside my
network generates syn requests on the device tun2 on my machine (a tunnel
device using vtun). tcpdump tun2:

00:04:55.066516 12.4.218.41.4624 > 212.172.23.17.80: S 219810852:219810852(0) win 16384 <mss 1460,nop,wscale 0,nop,nop,timestamp[|tcp]> (DF) [tos 0x10]
00:04:55.119757 129.13.162.254 > 212.172.23.17: icmp: host 12.4.218.41 unreachable - admin prohibited filter

(the second packet is due to the misrouting of the return packet on the
interface tun1, which hits some firewall):

00:04:55.066779 212.172.23.17.80 > 12.4.218.41.4624: S 437426418:437426418(0) ack 219810853 win 15510 <mss 1410,nop,nop,timestamp 7186830[|tcp]> (DF)
00:04:58.100986 212.172.23.17.80 > 12.4.218.41.4624: S 437426418:437426418(0) ack 219810853 win 15510 <mss 1410,nop,nop,timestamp 7187134[|tcp]> (DF)

The problem is that everything works fine at first, but after some time
after starting the network tunnels (between 5 minutes and a few days!)
packets received on one interface get sent out on another one, generally the
wrong one.

ifconfig down/up of the device usually works (it happens between tun1/tun2,
tun2/ippp0 and even ippp0 and eth1, for example).

Does anybody have an idea what's going wrong here, and how to fix
this? Thanks a lot in advance, I'd be happy to provide more info.

My config:

linux-2.2.17 with most advanced router functions enabled (I can send my
.config if necessary).

doom:~# ip rule list
0:  from all lookup local 
32766:  from all lookup main 
32767:  from all lookup default 

doom:~# ip route list table local
local 10.0.0.5 dev eth0  proto kernel  scope host  src 10.0.0.5 
local 10.0.0.5 dev eth1  proto kernel  scope host  src 10.0.0.5 
broadcast 127.255.255.255 dev lo  proto kernel  scope link  src 127.0.0.1 
broadcast 193.0.0.0 dev ippp0  proto kernel  scope link  src 62.224.169.116 
local 62.224.169.116 dev ippp0  proto kernel  scope host  src 62.224.169.116 
broadcast 10.255.255.255 dev eth0  proto kernel  scope link  src 10.0.0.5 
broadcast 10.255.255.255 dev eth1  proto kernel  scope link  src 10.0.0.5 
broadcast 193.255.255.255 dev ippp0  proto kernel  scope link  src 62.224.169.116 
broadcast 127.0.0.0 dev lo  proto kernel  scope link  src 127.0.0.1 
local 127.0.0.1 dev lo  proto kernel  scope host  src 127.0.0.1 
local 129.13.162.92 dev tun1  proto kernel  scope host  src 129.13.162.92 
local 127.0.0.0/8 dev lo  proto kernel  scope host  src 127.0.0.1 

doom:~# ip route list table main 
192.168.255.202 dev tun1  proto kernel  scope link  src 129.13.162.92 
10.0.0.1 dev eth0  scope link 
212.172.23.18 via 10.0.0.1 dev eth0 
192.168.254.1 dev tun2  proto kernel  scope link  src 212.172.23.17 
10.0.0.2 dev eth1  scope link 
129.13.162.8 dev ippp0  scope link 
10.0.0.9 dev eth1  scope link 
129.13.162.93 via 10.0.0.1 dev eth0 
172.16.0.0/12 dev tun1  scope link 
193.0.0.0/8 dev ippp0  proto kernel  scope link  src 62.224.169.116 
default dev ippp0  scope link 
default via 193.158.133.205 dev ippp0 

doom:~# ip route list table default
[empty]

doom:~# ip link list
1: lo: <LOOPBACK,UP> mtu 3924 qdisc noqueue 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ippp0: <POINTOPOINT,NOARP,UP> mtu 1500 qdisc pfifo_fast qlen 30
    link/ppp 
3: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:e0:7d:03:38:73 brd ff:ff:ff:ff:ff:ff
4: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:e0:7d:03:38:68 brd ff:ff:ff:ff:ff:ff
29: tun1: <POINTOPOINT,MULTICAST,NOARP,UP> mtu 1450 qdisc pfifo_fast qlen 10
    link/ppp 
30: tun2: <POINTOPOINT,MULTICAST,NOARP,UP> mtu 1450 qdisc pfifo_fast qlen 10
    link/ppp 

doom:~# ip address list
1: lo: <LOOPBACK,UP> mtu 3924 qdisc noqueue 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
2: ippp0: <POINTOPOINT,NOARP,UP> mtu 1500 qdisc pfifo_fast qlen 30
    link/ppp 
    inet 62.224.169.116 peer 193.158.133.205/8 scope global ippp0
3: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:e0:7d:03:38:73 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.5/32 brd 10.255.255.255 scope global eth0
4: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:e0:7d:03:38:68 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.5/32 brd 10.255.255.255 scope global eth1
29: tun1: <POINTOPOINT,MULTICAST,NOARP,UP> mtu 1450 qdisc pfifo_fast qlen 10
    link/ppp 
    inet 129.13.162.92 peer 192.168.255.202/32 scope global tun1
30: tun2: <POINTOPOINT,MULTICAST,NOARP,UP> mtu 1450 qdisc pfifo_fast qlen 10
    link/ppp 
    inet 212.172.23.17 peer 192.168.254.1/32 scope global tun2
    inet 212.172.23.21/32 scope global tun2
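
As a cross-check of the configuration above, the following Python sketch does a plain longest-prefix lookup against the entries of table main as listed. It is a simplified model added for illustration and not a description of what the 2.2 kernel does: it ignores the routing cache, the policy rules, source addresses and gateways, and it collapses the two default routes into one entry because both point at ippp0. The names `MAIN_TABLE` and `lookup` are made up for the sketch.

```python
import ipaddress

# (prefix, output device) pairs copied from "ip route list table main".
MAIN_TABLE = [
    ("192.168.255.202/32", "tun1"),
    ("10.0.0.1/32", "eth0"),
    ("212.172.23.18/32", "eth0"),
    ("192.168.254.1/32", "tun2"),
    ("10.0.0.2/32", "eth1"),
    ("129.13.162.8/32", "ippp0"),
    ("10.0.0.9/32", "eth1"),
    ("129.13.162.93/32", "eth0"),
    ("172.16.0.0/12", "tun1"),
    ("193.0.0.0/8", "ippp0"),
    ("0.0.0.0/0", "ippp0"),  # both default routes use ippp0
]


def lookup(dest):
    """Return the device of the most specific matching route, or None."""
    addr = ipaddress.ip_address(dest)
    best = None
    for prefix, dev in MAIN_TABLE:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, dev)
    return best[1] if best else None


print(lookup("12.4.218.41"))
```

By this table alone, the address of the outside machine (12.4.218.41) matches only the default routes on ippp0, so neither tun1 nor tun2 is selected for the reply by the entries shown.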





Re: Dual XEON - >>SLOW<< on SMP

2000-11-13 Thread Marc Lehmann

On Sun, Nov 12, 2000 at 11:22:02PM -0700, "Jeff V. Merkey" <[EMAIL PROTECTED]> wrote:
> I can go and get the text from our discussion, and I distinctly remember
> your answer to this question on PII and you said "lots".  This was also a 

Well, my mail certainly contained the words "lot" (not "lots") and "PII",
but certainly not in the same sentence and certainly not referring to each
other and certainly not in reference to syscalls, and I am totally puzzled
as to why you keep claiming this in public (you can't even quote my name
correctly).

Could you please stop lying and hopefully apologize for abusing my name in
public for claiming wrong things I never said and abstain from doing so in
the future?

And please keep this off-list from now on.


Re: Dual XEON - >>SLOW<< on SMP

2000-11-11 Thread Marc Lehmann

On Tue, Nov 07, 2000 at 04:03:25PM -0700, "Jeff V. Merkey" <[EMAIL PROTECTED]> 
wrote:
> 
> Marc Lehman verified that PII systems will generate tons of AGIs with
> gcc. 

It is a bit late (just came back from the systems'00 fair), but Jeff
Merkey just acknowledged that he did indeed mean me by "Marc Lehman". I
have no idea why he wrote such a thing, since I never mentioned anything
like that, nor did I verify anything like this (and the sentence
doesn't make much sense, either).

Jeff, I never said such a thing and I would appreciate it if you didn't put
your words into my mouth.

*puzzled*


Re: non-gcc linux?

2000-11-05 Thread Marc Lehmann

On Sun, Nov 05, 2000 at 04:05:05PM -0700, Tim Riker <[EMAIL PROTECTED]> wrote:
> > Which can not and will not happen.
> 
> I understand "will not", but "can not"? There is nothing stopping

As I explained three lines below the mail, if you care to read.

> would include copyrights assigned to FSF and other parties. Let's say
> this happens and a new sgigcc source base is created. Presumably then

We recently saw that creating a new, probably incompatible compiler is a
very bad thing. If SGI split the compiler, that would be a problem
for the community at large.

> any defense of gcc code could be met with the argument that the code
> used came from sgigcc

YANAL and IANAL, but to defend code you must own it or have authored it.
Since the FSF would, in your example, neither own the code nor be the
author of it they couldn't defend that version of gcc.

> This being the case what has the FSF gained by

Well, simply this is _not_ the case ;)

> In short, I do not see any enforceable advantages to the current FSF

You don't. Lawyers do (certainly the FSF lawyer does), and probably the
law does, also ;)

> Statements above are my own, and I am not a lawyer.

Yepp.


Re: non-gcc linux?

2000-11-05 Thread Marc Lehmann

On Sun, Nov 05, 2000 at 04:06:37PM -0500, Jakub Jelinek <[EMAIL PROTECTED]> wrote:
> That's hard to do, because the whole gcc has copyright assigned to FSF,
> which means that either gcc steering committee would have to make an
> exception from this

Which can not and will not happen.

> for SGI, or SGI would have to be willing to assign some code to FSF.

Which is the standard procedure that the FSF requires for all its
programs to be able to defend them - incorporating non-assigned code into
gcc creates some intractable problems (i.e. makes it impossible) if the
FSF ever wanted to go to court to defend the freedom of gcc.


Re: select() bug

2000-11-02 Thread Marc Lehmann


On Thu, Nov 02, 2000 at 11:55:52PM +, Alan Cox <[EMAIL PROTECTED]> wrote:
> > - If I'm correct that pipes have a 4K kernel buffer, then writing 1
> > byte shouldn't cause this situation, as the buffer is well more than
> > half empty.  Is this still a bug?
> 
> The pipe code uses totally full/empty. Im not sure why that was chosen

Just a quick guess: maybe because of the POSIX atomicity guarantees (if
select returned, write might have to block which is not what is expected),
and maybe this limitation was used not only on write but also on read
(although it's not necessary on the read side, AFAIK).
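
The interaction guessed at above can be poked at from user space. What
follows is only a minimal sketch, not the kernel's logic: it assumes a
Unix-like system and Python 3, and the helper name readiness() is made up
for illustration. It polls a fresh pipe with select(), writes one byte, and
polls again. POSIX guarantees that writes of at most PIPE_BUF bytes to a
pipe are atomic, which is the guarantee referred to above.

```python
import os
import select


def readiness(rfd, wfd):
    """Poll once (timeout 0) and return (read end readable, write end writable)."""
    readable, writable, _ = select.select([rfd], [wfd], [], 0)
    return (rfd in readable, wfd in writable)


rfd, wfd = os.pipe()
try:
    # Empty pipe: nothing to read, and there is room to write.
    before = readiness(rfd, wfd)
    # Write a single byte, far below PIPE_BUF.
    os.write(wfd, b"x")
    # The read end is readable now; whether the write end is still reported
    # writable is exactly the kernel-dependent behaviour discussed above.
    after = readiness(rfd, wfd)
finally:
    os.close(rfd)
    os.close(wfd)

print("PIPE_BUF:", select.PIPE_BUF)
print("empty pipe   (readable, writable):", before)
print("after 1 byte (readable, writable):", after)
```

The second element of the last tuple is the interesting one: a kernel that
uses the "totally full/empty" rule described in the quoted text would
report the write end as not writable after a single byte, while a kernel
that checks for remaining free space would still report it writable.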


Re: What is up with Redhat 7.0?

2000-10-03 Thread Marc Lehmann

On Tue, Oct 03, 2000 at 01:27:36PM +0200, Jes Sorensen <[EMAIL PROTECTED]> wrote:
> Doesn't do much good if one of the compilers generates bogus output,
> but obviously you never had to deal with the bug reports coming out of
> distributors shipping $#@%$# pgcc as their default compiler.

I did, but of course not with all such distributions and bug reports.

> Looks to me like Alan's plonk was very appropriate here.

No, what Alan did showed bad taste, or bad mood, or whatever. This
discussion simply does not belong here and has nothing to do with the
now-off-topic discussion about binary incompatibility.

As such, what Alan did was a cheap trick to try to draw attention away
from the real problem. He didn't succeed, of course, and I only accuse him
of a temporary bad mood, which I can certainly live with ;)

On Tue, Oct 03, 2000 at 01:38:01PM +0200, Jes Sorensen <[EMAIL PROTECTED]> wrote:
> release? Maybe you should stop insulting the people who are actually
> doing the Free Software work

Like myself??

> who just happens to be paid by Red Hat.

Only a very small part, actually. That means that everybody should play
well together, rather than trying to force non-standards onto others.

> glibc-2.2 was put out as a release candidate. gcc on the other hand I
> don't expect to see being released anytime soon enough for it to make
> sense (I might be wrong),

FYI: gcc has already been "released" for quite some time.

> binary compat problems, so far nobody has even been able to agree on
> the naming scheme of the shared libstdc++ package, we just have to
> wait for 3.0.

Unfortunately some company couldn't wait. The higher numbers probably...


Re: What is up with Redhat 7.0?

2000-10-02 Thread Marc Lehmann


On Sun, Oct 01, 2000 at 09:33:31PM -0400, Horst von Brand 
<[EMAIL PROTECTED]> wrote:
> > many others.
> 
> What makes Debian's package management "reasonable" where others aren't?

This *really* doesn't belong on linux-kernel.


Re: compiler explodes on pegasus driver in 2.4.0-test8

2000-10-01 Thread Marc Lehmann


On Sun, Oct 01, 2000 at 07:01:45PM -0400, Robert Dale <[EMAIL PROTECTED]> 
wrote:
> gcc -D__KERNEL__ -I/usr/src/linux-2.4.0-test8/include -Wall -Wstrict-prototypes -O2 
>-fomit-frame-pointer -pipe   -march=i686 -fno-strict-aliasing -DMODULE -DMODVERSIONS 
>-include /usr/src/linux-2.4.0-test8/include/linux/modversions.h   -c -o pegasus.o 
>pegasus.c
> ../../gcc/function.c:2392: Internal compiler error in function fixup_memory_subreg
> cpp: output pipe has been closed

This is a compiler bug. Better try gcc-2.95.2 (or 2.7.2.3)


Re: Disk priorities...

2000-10-01 Thread Marc Lehmann


On Sun, Oct 01, 2000 at 03:58:55PM -0700, LA Walsh <[EMAIL PROTECTED]> wrote:
>   Specifically, I'm talking about 'nice'd "down" processes -- things

Well, it is difficult to implement (network bandwidth limiting or I/O
latency, for example), but asking for it once a year might make it a reality.

OS/2 had a lot of these things in its scheduler, but, according to
subjective reports from a lot of people, it didn't seem to work very well
(it slowed down the scheduler considerably without ever working great).


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann

On Mon, Oct 02, 2000 at 12:19:03AM +0200, Martin Dalecki 
<[EMAIL PROTECTED]> wrote:
> > > on redhat need redhat versions of the development toolchain / runtime
> > > environment to use them :(
> 
> Ever tried to recompile SuSE apache from the src.rpm they provide?

We are talking binaries here, but anyway, what you say is easy to do:
nobody *forces* you to apply their patches or even to use their
source code. Go and fetch the official apache, it will run just fine.

> THAT is OFFENDING! Not just the fact whatever who want's to be

True, it is offensive in some sense, but this is not specific to suse and,
while maybe worthwhile on a "bash all distributions" list (or even
here ;), it is not the actual point, which is binary incompatibility caused
by forked versions that bring no benefit.


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann


On Sun, Oct 01, 2000 at 05:18:22PM -0400, Horst von Brand 
<[EMAIL PROTECTED]> wrote:
> And a "deliberate decision" by a "bunch of guys" (which by some freak
> accident of fate just so happens includes several of the lead people on the
> involved software projects) can't ever be right, or even just be a honest
> mistake. N, it _has_ to be sabotage, planned and executed by His
> Evilness Himself.

Now that'd be an interesting new idea ;) Anyway, no, there is no conspiracy
theory, just a lot of very bad actions by some company in a row that add
a lot of extra, unnecessary work and confusion for the free software
community.


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann

On Mon, Oct 02, 2000 at 12:07:11AM +0100, Alan Cox <[EMAIL PROTECTED]> wrote:
> > Why do you keep ignoring this point?
> 
> I don't see your point except as 'never change anything'.

Hmm... there is some misunderstanding here, see:

> I got bored of libc2 a while back. I prefer change

Now, what would you think if you developed libc2 and were about to go
to libc3, and then some company took libc2, made their own libc3 which is
incompatible with the libc3 that was publicly announced some time ago,
put *your* address into the bug-report address of *their* libc3, told the
public nothing about the highly experimental aspect of their libc3 (that
will certainly not be compatible with the "official" libc3) etc. etc.?

I certainly am not "never change anything", I wouldn't have tried to patch
that pgcc thingy if I were. I am against mindless forking without stating
this, though, even if allowed by the license.


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann


On Mon, Oct 02, 2000 at 12:41:11AM +0200, Igmar Palsenberg <[EMAIL PROTECTED]> wrote:
> > on redhat need redhat versions of the development toolchain / runtime
> > environment to use them :(
> 
> And you say that programs developed on for example SuSE don't need a SuSE
> enviroment ??

I said that, say that, and it's still true, yes ;) It's also true with the
majority of other distributions not cited so far: debian (which has the
advantage of a reasonable package management), slackware, stampede and
many others.


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann

On Sun, Oct 01, 2000 at 10:36:00PM +0100, Alan Cox <[EMAIL PROTECTED]> wrote:
> > One never needed suse's or redhat's glibc to run binaries created on their
> > platforms. Likewise one never needed their libstdc++ or their toolchain,
> 
> You regularly did. Even with libc5 there were two semi incompatible sets
> of X libraries (with/without pthreads) and some other problems. Thats why we
> need the LSB work

You *keep* ignoring the point. Please, Alan, the point is that all these
libraries were not forked redhat-only versions. You keep citing irrelevant
facts about library incompatibilities, but the fact is that all these
came from the official sources and were compatible with the official
versions. Even egcs made a large effort to become gcc compatible.

Why do you keep ignoring this point?


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann


On Sun, Oct 01, 2000 at 04:39:06PM -0400, Horst von Brand 
<[EMAIL PROTECTED]> wrote:
> > I wouldn't mind, either, if this didn't mean that programs compiled
> > on redhat need redhat versions of the development toolchain / runtime
> > environment to use them :(
> 
> Has happened on and off with each distribution I've ever played with. The
> point being?

That what you say is simply not true, so what's _your_ point in claiming
this?

One never needed suse's or redhat's glibc to run binaries created on their
platforms. Likewise one never needed their libstdc++ or their toolchain,
the official ones (released by the official maintainers) always were
enough.


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann


On Sun, Oct 01, 2000 at 09:18:36PM +0200, Martin Dalecki 
<[EMAIL PROTECTED]> wrote:
> C++ ABI breaking: SuSE managed to break the VShop application in an
> entierly insane way between releases 6.1 and 6.2 - they stiupid did
> recompile the libstdc++ with a new compiler and didn't even
> bother to increment the binary version of this library
> At RedHat at least they know what they are changing...

Obviously redhat did and does a lot of similar braindamage, which could be
called "bugs" (no version of perl on redhat CDs really worked correctly,
for example).

Again, the choice redhat made cannot be construed as a mistake by
some guy or a group of guys. It was a deliberate decision.


Re: To Matti

2000-10-01 Thread Marc Lehmann


On Sat, Sep 30, 2000 at 11:20:42PM +0200, Marc Lehmann <[EMAIL PROTECTED]> wrote:
> Just FYI; I tried to reply to your mail (you know the topic) but your

Thanks for your reply. O.k. in short: I didn't agree back when you sent
the message, but then the thread had more on-topic content, so basically
do as you think is best, but think about any political implications as
happens in every case. Killing threads rarely has good results IMHO as
compared to other methods.


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann


On Sun, Oct 01, 2000 at 06:06:52PM +0200, Marc Lehmann <[EMAIL PROTECTED]> wrote:
> > owning Cygnus) is purest garbage. The whole *point* of the Steering
> > Committee is to prevent any single interest from gaining control of
> 
> BTW, AFAIK gcc is the only large free software project that has an

"AFAIK" has a very low information content. Alan just informed me that the
gnome project has a similar anti-takeover-rule (trying to avoid a mail
flood here ;)


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann

On Sun, Oct 01, 2000 at 04:13:25PM +0100, Nix <[EMAIL PROTECTED]> wrote:
> > (Froget about the "committe" stuff...)
> 
> Marc will probably agree here that this (except for the bit about RH
> owning Cygnus) is purest garbage. The whole *point* of the Steering
> Committee is to prevent any single interest from gaining control of

BTW, AFAIK gcc is the only large free software project that has an
explicit rule that (quote):

   * No single organization is allowed to have 50% or more of the votes.
     [This includes groups of developers from the same company or a
     university]

The cygnus/redhat merger was indeed a point where this rule had to be
checked, fortunately even redhat+cygnus is well below the 50% mark.

But even if it were true, it isn't good.

> It is up to the release manager (following the release criteria) to
> release GCC. It is not up to RedHat. But they can, if they want, ship an
> unreleased GCC.

Yes, they can do whatever they are allowed by the license, of course. The
question is whether it's right, or what the consequences are.


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann


On Sun, Oct 01, 2000 at 03:27:41PM +0200, Martin Dalecki 
<[EMAIL PROTECTED]> wrote:
> Get real: RedHat owns cygnus and cygnus owns GCC so what do you complain
> about? It's up to them to decide which compiler is stable or which

Now that's the problem. Claiming that redhat owns gcc (which is owned by
the FSF) is one of the major points in this discussion. I am sure you just
made a joke, but I miss the smileys...

> And then there is [EMAIL PROTECTED] - so wht's up with the glibc?

The same, see above :( Go through the changelog and you will see that
drepper is by far not the only coder. Hey, I even see @suse in there. A
lot! So what's up with glibc? Did you fall for some company's marketing
droids? Surely you didn't...

> I can understand redhat somehow. There are good reasons for them to take
> even CVS snaps and ship them instead of *very* outdated so called stable
> versions.

I wouldn't mind, either, if this didn't mean that programs compiled
on redhat need redhat versions of the development toolchain / runtime
environment to use them :(


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann


On Sun, Oct 01, 2000 at 01:50:44PM +0300, Matti Aarnio <[EMAIL PROTECTED]> 
wrote:
>   Aside of that pre-processor noice I don't know if 2.96 is really

Please keep in mind that there is no such definite thing as
gcc-2.96. There is the redhat version (with unknown changes to the
snapshot it is based on) and countless fsf snapshots of 2.96.

They act similarly, but not the same, complicating any discussion about
it.


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann


On Sun, Oct 01, 2000 at 04:13:25PM +0100, Nix [EMAIL PROTECTED] wrote:
> > (Forget about the "committee" stuff...)
>
> Marc will probably agree here that this (except for the bit about RH
> owning Cygnus) is purest garbage. The whole *point* of the Steering
> Committee is to prevent any single interest from gaining control of

BTW, AFAIK gcc is the only large free software project that has an
explicit rule that (quote):

   * No single organization is allowed to have 50% or more of the votes.
 [This includes groups of developers from the same company or a
 university]

The cygnus/redhat merger was indeed a point where this rule had to be
checked, fortunately even redhat+cygnus is well below the 50% mark.

But even if it were true, it isn't good.

> It is up to the release manager (following the release criteria) to
> release GCC. It is not up to RedHat. But they can, if they want, ship an
> unreleased GCC.

Yes, they can do whatever they are allowed by the license, of course. The
question is whether it's right, or what the consequences are.


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann


On Sun, Oct 01, 2000 at 06:06:52PM +0200, Marc Lehmann [EMAIL PROTECTED] wrote:
> > owning Cygnus) is purest garbage. The whole *point* of the Steering
> > Committee is to prevent any single interest from gaining control of
>
> BTW, AFAIK gcc is the only large free software project that has an

"AFAIK" has a very low information content. Alan just informed me that the
gnome project has a similar anti-takeover-rule (trying to avoid a mail
flood here ;)


Re: To Matti

2000-10-01 Thread Marc Lehmann


On Sat, Sep 30, 2000 at 11:20:42PM +0200, Marc Lehmann [EMAIL PROTECTED] wrote:
> Just FYI; I tried to reply to your mail (you know the topic) but your

Thanks for your reply. O.k. in short: I didn't agree back when you sent
the message, but then the thread had more on-topic content, so basically
do as you think is best, but think about any political implications as
happens in every case. Killing threads rarely has good results IMHO as
compared to other methods.


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann


On Sun, Oct 01, 2000 at 09:18:36PM +0200, Martin Dalecki 
[EMAIL PROTECTED] wrote:
> C++ ABI breaking: SuSE managed to break the VShop application in an
> entirely insane way between releases 6.1 and 6.2 - they stupidly did
> recompile the libstdc++ with a new compiler and didn't even
> bother to increment the binary version of this library
> At RedHat at least they know what they are changing...

Obviously redhat did and does a lot of similar braindamage, which could be
called "bugs" (no version of perl on redhat cd's really worked correctly
for example).

Again, the choice redhat made cannot be construed as being some mistake by
some guy or a group of guys. It was a deliberate decision.


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann


On Sun, Oct 01, 2000 at 04:39:06PM -0400, Horst von Brand 
[EMAIL PROTECTED] wrote:
> > I wouldn't mind, either, if this didn't mean that programs compiled
> > on redhat need redhat versions of the development toolchain / runtime
> > environment to use them :(
>
> Has happened on and off with each distribution I've ever played with. The
> point being?

That what you say is simply not true, so what's _your_ point in claiming
this?

One never needed suse's or redhat's glibc to run binaries created on their
platforms. Likewise one never needed their libstdc++ or their toolchain,
the official ones (released by the official maintainers) always were
enough.


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann


On Mon, Oct 02, 2000 at 12:41:11AM +0200, Igmar Palsenberg [EMAIL PROTECTED] wrote:
> > on redhat need redhat versions of the development toolchain / runtime
> > environment to use them :(
>
> And you say that programs developed on for example SuSE don't need a SuSE
> environment ??

I said that, say that, and it's still true, yes ;) It's also true with the
majority of other distributions not cited so far: debian (which has the
advantage of a reasonable package management), slackware, stampede and
many others.


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann


On Sun, Oct 01, 2000 at 10:36:00PM +0100, Alan Cox [EMAIL PROTECTED] wrote:
> > One never needed suse's or redhat's glibc to run binaries created on their
> > platforms. Likewise one never needed their libstdc++ or their toolchain,
>
> You regularly did. Even with libc5 there were two semi incompatible sets
> of X libraries (with/without pthreads) and some other problems. That's why we
> need the LSB work

You *keep* ignoring the point. Please, Alan, the point is that all these
libraries were not forked redhat-only versions. You keep citing irrelevant
facts about library incompatibilities, but the fact is that all these
came from the official sources and were compatible with the official
versions. Even egcs made a large effort to become gcc compatible.

Why do you keep ignoring this point?


Re: What is up with Redhat 7.0?

2000-10-01 Thread Marc Lehmann


On Mon, Oct 02, 2000 at 12:07:11AM +0100, Alan Cox [EMAIL PROTECTED] wrote:
> > Why do you keep ignoring this point?
>
> I don't see your point except as 'never change anything'.

Hmm... there is some misunderstanding here, see:

> I got bored of libc2 a while back. I prefer change

Now, what would you think if you developed libc2 and were about to go
to libc3, and then some company took libc2, made their own libc3 which is
incompatible with the libc3 that was publicly announced some time ago,
put *your* address into the bug-report address of *their* libc3, told the
public nothing about the highly experimental aspect of their libc3 (that
will certainly not be compatible with the "official" libc3), etc.?

I certainly am not "never change anything", I wouldn't have tried to patch
that pgcc thingy if I were. I am against mindless forking without stating
this, though, even if allowed by the license.

