Re: [dm-devel] [PATCH 0/3] add resync speed control for dm-raid1
On 2012-12-10T13:21:23, NeilBrown wrote:

> The problem with this approach is that it slows down resync even when there
> is no other IO happening.
> If that is deemed to be acceptable, then the patch set seems fine, though I
> would probably make the default a lot higher so as not to change current
> default behaviour for anyone.

I agree with the latter part.

The difficulty is that our primary use case here is preventing IO starvation while cluster raid is resyncing; and we don't know the IO load on other nodes, or what other LVs might inflict on the same backend store / PV.

Hence, a static limit is probably the easiest way to start. I agree that a more dynamic approach would be desirable, but that appears to be very complex to get right.

Thanks,
    Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
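To make the trade-off in this exchange concrete, here is a minimal user-space sketch of what a static limit means. It is not the kernel code from the patch set; `StaticThrottle`, `max_bytes_per_sec` and the injectable `clock`/`sleep` parameters are hypothetical names chosen for illustration.

```python
import time


class StaticThrottle:
    """Cap throughput at a fixed rate, regardless of other load (hypothetical sketch)."""

    def __init__(self, max_bytes_per_sec, clock=time.monotonic, sleep=time.sleep):
        if max_bytes_per_sec <= 0:
            raise ValueError("max_bytes_per_sec must be positive")
        self.rate = max_bytes_per_sec
        self.clock = clock
        self.sleep = sleep
        self.start = clock()
        self.done = 0  # bytes resynced so far

    def account(self, nbytes):
        """Record nbytes of resync IO; sleep if we are ahead of the allowed rate.

        Returns the number of seconds slept (0.0 if no sleep was needed).
        """
        self.done += nbytes
        # Earliest point in time at which `done` bytes may have been written.
        earliest = self.start + self.done / self.rate
        delay = earliest - self.clock()
        if delay > 0:
            self.sleep(delay)
            return delay
        return 0.0
```

Note that the cap applies whether or not anything else is using the device, which is exactly the objection quoted above. A dynamic approach would have to feed some measure of competing IO into `account()`, and in the cluster case that information is not available locally.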
Re: [dm-devel] [PATCH 2/3] dm raid1: add interface to set resync speed
On 2012-11-22T14:27:52, Guangliang Zhao wrote:

Hi Guangliang,

thanks for adding this. I think this approach is a good direction to take; just one piece of feedback:

> Add ioctl to control resync speed, userspace tool
> is dmsetup message, message format is:
> dmsetup message $device 0 "set $speed"
> e.g.
> dmsetup message /dev/dm-2 "set 12345"

I think this should be "set-max-resync-rate" or something similar; "set" is very generic and not very extensible going forward, should the need arise.

Regards,
    Lars
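The extensibility argument can be sketched as a small dispatcher keyed on explicit message names. This is a user-space illustration only, not the dm-raid1 message handler; `parse_message`, `HANDLERS` and the returned dictionary key are hypothetical, and "set-max-resync-rate" is merely the name suggested in the reply above.

```python
def _parse_rate(args):
    # Exactly one non-negative integer argument is expected.
    if len(args) != 1 or not (args[0].isascii() and args[0].isdigit()):
        raise ValueError("usage: set-max-resync-rate <rate>")
    return {"max_resync_rate": int(args[0])}


# Explicit command names leave room for further messages later on.
HANDLERS = {
    "set-max-resync-rate": _parse_rate,
}


def parse_message(text):
    """Split a message string and dispatch on its first word."""
    words = text.split()
    if not words:
        raise ValueError("empty message")
    handler = HANDLERS.get(words[0])
    if handler is None:
        raise ValueError("unknown message: " + words[0])
    return handler(words[1:])
```

With a specific name per setting, adding a second tunable later is one more entry in the table; a bare "set" would have to be overloaded or renamed.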
Re: [PATCH] hpwdt: Fix kdump issue in hpwdt
On 2012-08-27T12:52:24, Toshi Kani wrote:

> kdump can be interrupted by watchdog timer when the timer is left
> activated on the crash kernel. Changed the hpwdt driver to disable
> watchdog timer at boot-time. This assures that watchdog timer is
> disabled until /dev/watchdog is opened, and prevents watchdog timer
> to be left running on the crash kernel.

How does this protect against the system hanging again in the crash kernel, or against hardware caches possibly flushing more data to shared storage?

(I'm asking from the perspective of the hpwdt being used as a fencing mechanism in a cluster setting.)

Or is the argument that it's "very unlikely" that a system in such a state would not make it far enough into the crash kernel?

Regards,
    Lars
Re: [AppArmor 39/45] AppArmor: Profile loading and manipulation, pathname matching
On 2007-06-25T17:14:11, Pavel Machek wrote:

> Actually, I surprised Lars a lot by telling him ln /etc/shadow /tmp/
> allows any user to make AA ineffective on large part of systems -- in
> internal discussion. (It is not actually a _bug_, but it is certainly
> unexpected).

Pavel, no, you did not. You _did_ surprise me by misquoting me so badly, though.

I agreed that actions by unmediated processes can interfere with mediated processes. That is a given. So you do not give them free access to a world-writable directory.

Regards,
    Lars

-- 
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
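The `ln` example in the quote relies on a hard link giving one file a second pathname. The following sketch reproduces that effect in a temporary directory instead of touching /etc/shadow; the directory and file names are made up for illustration.

```python
import os
import tempfile


def same_file_two_names():
    """Create a file, hard-link it under a second path, and compare both names."""
    with tempfile.TemporaryDirectory() as base:
        protected = os.path.join(base, "protected")
        public = os.path.join(base, "public")
        os.mkdir(protected)
        os.mkdir(public)

        original = os.path.join(protected, "secret")
        alias = os.path.join(public, "secret")
        with open(original, "w") as f:
            f.write("data")

        # Roughly what `ln protected/secret public/` does.
        os.link(original, alias)

        # Read the content back through the second name.
        with open(alias) as f:
            content = f.read()
        return os.path.samefile(original, alias), content
```

Both names refer to the same file, so a check that looks only at the pathname can reach different decisions for identical data. That is why the reply insists that whoever can create such links must not have free access to a directory the confined process may read.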
Re: [AppArmor 39/45] AppArmor: Profile loading and manipulation, pathname matching
On 2007-06-22T08:41:51, Stephen Smalley wrote:

> The issue arises even for a collection of collaborating confined
> processes with different profiles, and the collaboration may be
> intentional or unintentional (in the latter case, one of the confined
> processes may be taking advantage of known behavior of another process
> and shared access by both to some resource in order to induce a
> particular behavior in that process).

Point taken; the point remains that you need at least several (intentionally or not) cooperating processes. The chances of this are significantly lower than those of a single-process exploit.

> And remember that confinement isn't just about limiting untrusted
> processes but also about protecting trusted processes; limiting the
> inputs and outputs of a trusted process can be just as important to
> preventing exploitation.

True. It'd appear that if you want that, you'd specify the AA profile so that it doesn't include directories/files writable by untrusted processes.

> Sorry, do you mean the "server" as in the "server system" or as in the
> "server daemon"? For the former, I'd agree - we would want SELinux
> policy applied on the server as well as the client to ensure that the
> data is being protected consistently throughout and that the server is
> not misrepresenting the security guarantees expected by the clients.
> Providing an illusion of confinement on each client without any
> corresponding protection on the server system would be very prone to
> bypass. For the latter, the kernel can only truly confine application
> code, not in-kernel threads, although we can subject the in-kernel nfsd
> to permission checking as a robustness check. We've always noted that
> SELinux does depend on the correctness of the kernel.

Oh, you're saying that this threat is out-of-scope? ;-)

> Every time we've noted an issue with AA, the answer has been that it is
> out of scope. Yet the public documentation for AA misrepresents the
> situation and its comparisons with SELinux conveniently ignore its
> limitations.

I'm sorry. Again, I'm not responsible for marketing comparisons made by anyone else, nor do I think they should apply to this discussion, where we're discussing the merits of what AA actually _does_; not what someone's marketing claims it does - otherwise I'll go dig out marketing claims about SELinux too ;-)

And, coming at it from that direction, I feel it does something useful.

Note that here we've already strayed from the focus of the discussion; we're no longer arguing "the implementation is ugly/broken", but you're claiming "doesn't do what I need" - which I'm not disagreeing with. It doesn't do what you want. Which is why you have SELinux, and it's going to stay. Fine.

If we assume that the users who run it do have a need / use case for it which they can't solve differently, we should really get back to the discussion of how those needs can be met or provided by Linux in a feasible way.

> > Your use case mandates complete system-wide mediation, because you want
> > full data flow analysis. Mine doesn't.
> Then yours isn't mandatory access control, nor is it confinement.

I apologize for not using the word "confinement" in the way you expect it to be used. I certainly don't want to imply it does things it doesn't. Keep in mind I'm not a native speaker, so nuances do get lost sometimes; nor do I have long years of experience in the security field. Thanks for clearing this up.

So agreed, it is not confinement nor MAC. Would it be more appropriate if I used the word "restricts" or "constrains"?

Regards,
    Lars
Re: [AppArmor 39/45] AppArmor: Profile loading and manipulation, pathname matching
On 2007-06-22T07:53:47, Stephen Smalley wrote:

> > No the "incomplete" mediation does not flow from the design. We have
> > deliberately focused on doing the necessary modifications for pathname
> > based mediation. The IPC and network mediation are a wip.
> The fact that you have to go back to the drawing board for them is that
> you didn't get the abstraction right in the first place.

That's an interesting claim, however I don't think it holds. AA was designed to mediate file access in a form which is intuitive to admins.

It's to be expected that it doesn't directly apply to mediating other forms of access.

> I think we must have different understandings of the words "generalize"
> and "analyzable". Look, if I want to be able to state properties about
> data flow in the system for confidentiality or integrity goals (my
> secret data can never leak to unauthorized entities, my critical data
> can never be corrupted/tainted by unauthorized entities - directly or
> indirectly),

I seem to think that this is not what AA is trying to do, so evaluating it in that context doesn't seem useful. It's like saying a screwdriver isn't a hammer, so it is useless because you have a nail.

Regards,
    Lars
Re: [AppArmor 39/45] AppArmor: Profile loading and manipulation, pathname matching
On 2007-06-22T07:19:39, Stephen Smalley wrote:

> > > Or can access the data under a different path to which their profile
> > > does give them access, whether in its final destination or in some
> > > temporary file processed along the way.
> > Well, yes. That is intentional.
> >
> > Your point is?
>
> It may very well be unintentional access, especially when taking into
> account wildcards in profiles and user-writable directories.

Again, you're saying that AA is not confining unconfined processes. That's a given. If unconfined processes assist confined processes in breaching their confinement, yes, that is not mediated.

You're basically saying that anything but system-wide mandatory access control is pointless.

If you want to go down that route, what is your reply to me saying that SELinux cannot mediate NFS mounts - if the server is not confined using SELinux as well? The argument is really, really moot and pointless. Yes, unconfined actions can affect confined processes.

That's generally true for _any_ security system.

> > That is an interesting argument, but not what we're discussing here.
> > We're arguing filesystem access mediation.
> IOW, anything that AA cannot protect against is "out of scope". An easy
> escape from any criticism.

I'm quite sure that this reply is not as AA-specific as you try to make it appear.

> > Yes. Your use case is different than mine.
> My use case is being able to protect data reliably. Yours?

I want to restrict certain possibly untrusted applications and network-facing services from accessing certain file patterns, because as a user and admin, that's the mindset I'm used to. I might be interested in mediating other channels too, but the files are what I really care about. I'm inclined to trust the other processes.

Your use case mandates complete system-wide mediation, because you want full data flow analysis. Mine doesn't.

Regards,
    Lars
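The use case described in this message, restricting an application to certain file patterns, can be illustrated with a small whitelist check. This is a sketch only: it is not AppArmor's profile syntax or matcher, `PROFILE` and `allowed` are hypothetical, and the paths are invented. In particular, Python's `fnmatchcase` lets `*` match `/` as well, which need not agree with how a real profile language treats wildcards.

```python
from fnmatch import fnmatchcase

# Hypothetical profile: path pattern -> granted access modes.
PROFILE = {
    "/srv/www/*": "r",
    "/var/log/httpd/*": "rw",
}


def allowed(profile, path, mode):
    """Return True if some pattern matching `path` grants every requested mode."""
    for pattern, granted in profile.items():
        if fnmatchcase(path, pattern) and set(mode) <= set(granted):
            return True
    return False  # whitelist: anything not listed is denied
```

A request such as `allowed(PROFILE, "/home/lars/Mail/inbox", "r")` is denied because no pattern covers it. The quoted concern about wildcards and user-writable directories also shows up here: any file that ends up below `/srv/www/`, by whatever means, becomes readable under this profile.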
Re: [AppArmor 39/45] AppArmor: Profile loading and manipulation, pathname matching
On 2007-06-21T23:45:36, Joshua Brindle wrote:

> >remember, the policies define a white-list
>
> Except for unconfined processes.

The argument that AA doesn't mediate what it is not configured to mediate is correct, yes, but I don't think that's a valid _design_ issue with AA.

> Or through IPC or the network, that is the point, filesystem only
> coverage doesn't cut it; there is no way to say the browser can't access
> the users mail in AA, and there never will be.

We have a variety of filtering mechanisms which are specific to a domain. iptables filters networking only; file permissions filter file access only. This argument is not really strong.

If you're now arguing the "spirit of Unix", I can turn your argument around too: the Unix spirit is to have smallish dedicated tools. If AA is dedicated to mediating file access, isn't that nice!

AA _could_ be extended to mediate network access and IPC (and this is WIP). If we had tcpfs and ipcfs - you know, everything is a filesystem, the Linux spirit! ;-) - AA could mediate them as well.

However, we're discussing the way it mediates file accesses here, for which it appears useful and capable of functionality which SELinux's approach cannot provide.

Regards,
    Lars
Re: [AppArmor 39/45] AppArmor: Profile loading and manipulation, pathname matching
On 2007-06-21T20:16:25, Joshua Brindle wrote:

> not. One need only look at the wonderful marketing literature for AA to
> see what you are telling people it can do, and your above statement
> isn't consistent with that, sorry.

I'm sorry. I don't work in marketing.
Re: [AppArmor 39/45] AppArmor: Profile loading and manipulation, pathname matching
On 2007-06-21T16:59:54, Stephen Smalley wrote:

> Or can access the data under a different path to which their profile
> does give them access, whether in its final destination or in some
> temporary file processed along the way.

Well, yes. That is intentional.

Your point is?

> The emphasis on never modifying applications for security in AA likewise
> has an adverse impact here, as you will ultimately have to deal with
> application mediation of access to their own objects and operations not
> directly visible to the kernel (as we have already done in SELinux for
> D-BUS and others and are doing for X). Otherwise, your "protection" of
> desktop applications is easily subverted.

That is an interesting argument, but not what we're discussing here. We're arguing filesystem access mediation.

> Um, no. It might not be able to directly open files via that path, but
> showing that it can never read or write your mail is a rather different
> matter.

Yes. Your use case is different than mine.

Regards,
    Lars
Re: [AppArmor 39/45] AppArmor: Profile loading and manipulation, pathname matching
On 2007-06-21T22:07:40, Pavel Machek wrote:

> > AA is supposed to allow valid access patterns, so for non-buggy apps +
> > policies, the rename will be fine and does not change the (observed)
> > permissions.
> That still breaks POSIX, right? Hopefully it will not break any apps,
> but...

No, it does not break POSIX. Unless, of course, there's a bug in the policy or in the program. Bugs are generally not covered by POSIX, for some strange reason. (The argument that POSIX codifies implementation bugs in Unix(tm) implementations of the time notwithstanding.)

> > A veto is not a technical argument. All technical arguments (except for
> > "path name is ugly, yuk yuk!") have been addressed, have they not?
> There still is "it does not work with long pathnames".
>
> Plus IIRC we have something like "AA has to allocate path-sized
> buffers along every syscall".

That is an implementation bug though. I'm sure we have other bugs in the kernel too - this isn't a design flaw.

(If people are allowed to thin-air solutions for implementing AA on top of SELinux, I can thin-air that this can be solved by reverse-matching the dentry tree against the policy as the path is traversed and constructed, requiring a constant-sized buffer.)

Regards,
    Lars
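The parenthetical idea of reverse-matching can be sketched in user space for a deliberately restricted pattern language. Everything here is an assumption made for illustration: `Dentry` is a toy stand-in for the kernel structure, a pattern is a tuple of path components listed root-first, and `*` matches exactly one component. Whether the idea carries over to full profile globbing is not something this sketch shows.

```python
class Dentry:
    """Toy directory entry: a name plus a link to its parent (None for the root)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def matches_reverse(dentry, pattern):
    """Match from the leaf towards the root without building the path string.

    `pattern` is a tuple of components, root-first; "*" matches any single
    component. Only an index into the pattern is kept as state.
    """
    i = len(pattern) - 1
    node = dentry
    while node.parent is not None:
        if i < 0:
            return False  # the path is deeper than the pattern
        if pattern[i] != "*" and pattern[i] != node.name:
            return False
        i -= 1
        node = node.parent
    return i < 0  # every pattern component must have been consumed
```

The state carried along the walk is one integer and one pointer, independent of the path length, which is the property the message is after.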
Re: [AppArmor 39/45] AppArmor: Profile loading and manipulation, pathname matching
On 2007-06-21T15:42:28, James Morris wrote:

> > A veto is not a technical argument. All technical arguments (except for
> > "path name is ugly, yuk yuk!") have been addressed, have they not?
> AppArmor doesn't actually provide confinement, because it only operates on
> filesystem objects.
>
> What you define in AppArmor policy does _not_ reflect the actual
> confinement properties of the policy. Applications can simply use other
> mechanisms to access objects, and the policy is effectively meaningless.

Only if they have access to another process which provides them with that data.

And now, yes, I know AA doesn't mediate IPC or networking (yet), but that's a missing feature, not broken by design.

> You might define this as a non-technical issue, but the fact that AppArmor
> simply does not and can not work is a fairly significant consideration, I
> would imagine.

If I restrict my Mozilla to not access my on-disk mail folder, it can't get there. (Barring bugs in programs which Mozilla is allowed to run unconfined, sure.)

If the argument is that AA provides somewhat different semantics - and for some use cases "weaker" ones - than SELinux, that is undoubtedly true. However, those are also the very differences which set AA's model apart from SELinux's, so it appears to be a trade-off best left to the admin / user, who can choose what fits their needs best.

Regards,
    Lars
Re: [AppArmor 39/45] AppArmor: Profile loading and manipulation, pathname matching
On 2007-06-21T12:30:08, [EMAIL PROTECTED] wrote:

> well, if you _really_ want people who are interested in this to do weekly
> "why isn't it merged yet you $%#$%# developers" threads that can be
> arranged.
>
> the people who want this have been trying to be patient and let the system
> work. if it takes people being pests to get something implemented it can
> be done, but I don't think other people on the list will appriciate this.

Please. We're so not going down _that_ route.
Re: [AppArmor 39/45] AppArmor: Profile loading and manipulation, pathname matching
On 2007-06-21T20:33:11, Pavel Machek wrote:

> inconvenient, yes, insecure, no.

Well, only if you use the most restrictive permissions. And then you'll suddenly hit failure cases which you didn't expect to, which can possibly cause another exploit to become visible.

> I believe AA breaks POSIX, already. rename() is not expected to change
> permissions on target, nor is link link. And yes, both of these make
> AA insecure.

AA is supposed to allow valid access patterns, so for non-buggy apps + policies, the rename will be fine and does not change the (observed) permissions.

The time window in the rename+relabel approach however introduces a slot where permissions are not consistent. This is a different case.

> > You _must_ be kidding. The cure is worse than the problem.
> Possibly.

Yes.

> > If that is the only way to implement AA on top of SELinux - and so far,
> > noone has made a better suggestion - I'm convinced that AA has technical
> > merit: it does something the on-disk label based approach cannot handle,
> > and for which there is demand.
> What demand? SELinux is superior to AA, and there was very little
> demand for AA. Compare demand for reiser4 or suspend2 with demand for
> AA.

SELinux is superior to AA for a certain scenario of use cases; as we can see here, it is not superior to AA for _all_ use cases.

> > The code has improved, and continues to improve, to meet all the coding
> > style feedback except the bits which are essential to AA's function
> Which are exactly the bits Christoph Hellwig and Al Viro
> vetoed. http://www.uwsg.iu.edu/hypermail/linux/kernel/0706.1/2587.html
> . I believe it takes more than "2 users want it" to overcome veto of
> VFS maintainer.

A veto is not a technical argument. All technical arguments (except for "path name is ugly, yuk yuk!") have been addressed, have they not?

Regards,
    Lars
Re: [AppArmor 39/45] AppArmor: Profile loading and manipulation, pathname matching
I've caught up on this thread with growing disbelief while reading the mails, so much so that I've found it hard to decide where to reply.

So people are claiming that AA is ugly, because it introduces pathnames and possibly a regex interpreter. Ok, taste differs. We've got many different flavours of filesystems in the kernel because of that.

However, the suggested cure makes me cringe. You're saying that relabeling file(s) from user-space after a rename is a possible solution.

This breaks POSIX - renames must be atomic.

It is possibly insecure; if this is fixed by making a rename automatically default to restrictive permissions, it'll be even more inconvenient.

It will break applications which expect to be able to access the file(s) immediately after a rename.

It is slow, and can possibly cause a lot of disk access, possibly over NFS or via slow disks. And it goes through user-space - which could block and introduce all sorts of memory deadlocks (compared to that deadlock, a regex is harmless).

(I also wonder how you propose to relabel files on a r/o mount if the policy changes, btw; or if the NFS mount is made available on several nodes w/different permissions.)

AA only enforces user-space defined policy - the argument that policy doesn't belong in the kernel is bull.

Adding a wrapper to glibc to block until relabeling is complete? "Let's first do the implementation and later worry about performance."? "The timing window is negligible."? "30 minutes during installation does not seem silly."?

You _must_ be kidding. The cure is worse than the problem.

If that is the only way to implement AA on top of SELinux - and so far, no one has made a better suggestion - I'm convinced that AA has technical merit: it does something the on-disk label based approach cannot handle, and for which there is demand.

The code has improved, and continues to improve, to meet all the coding style feedback except the bits which are essential to AA's function (like the pathname lookup and the regex parser; though I'm sure that in particular the latter one could be swapped for a less complex matcher as well). It certainly isn't worse than many other areas of the kernel.

You're pointing to each other's opposition to the features - that, my dear gentlemen, is a circular argument. One of you could readily break the chain.

This is trying to get rid of AA for the sake of it, masquerading as technical reasons. At least fucking admit it. Don't lie. This is distasteful.

Regards,
    Lars
Re: [AppArmor 39/45] AppArmor: Profile loading and manipulation, pathname matching
On 2007-06-10T23:05:47, Pavel Machek wrote:

> But you have that regex in _user_ space, in a place where policy
> is loaded into kernel.
>
> AA has regex parser in _kernel_ space, which is very wrong.

That regex parser only applies user-defined policy. The logical connection between your two points doesn't exist.

Regards,
    Lars
Re: GFS, what's remaining
On 2005-09-03T01:57:31, Daniel Phillips wrote:

> The only current users of dlms are cluster filesystems. There are zero users
> of the userspace dlm api.

That is incorrect, and you're contradicting yourself here:

> What does have to be resolved is a common API for node management. It is not
> just cluster filesystems and their lock managers that have to interface to
> node management. Below the filesystem layer, cluster block devices and
> cluster volume management need to be coordinated by the same system, and
> above the filesystem layer, applications also need to be hooked into it.
> This work is, in a word, incomplete.

The Cluster Volume Management of LVM2 for example _does_ use simple cluster-wide locks, and some OCFS2 scripts, I seem to recall, do too.

(EVMS2 in cluster-mode uses a verrry simple locking scheme which is basically operated by the failover software and thus uses a different model.)

Sincerely,
    Lars Marowsky-Brée

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
"Ignorance more frequently begets confidence than does knowledge" -- Charles Darwin
Re: GFS, what's remaining
On 2005-09-03T09:27:41, Bernd Eckenfels wrote:

> Oh thats interesting, I never thought about putting data files (tablespaces)
> in a clustered file system. Does that mean you can run supported RAC on
> shared ocfs2 files and anybody is using that?

That is the whole point why OCFS exists ;-)

> Do you see this go away with ASM?

No. Beyond the table spaces, there's also ORACLE_HOME; a cluster benefits in several aspects from a general-purpose SAN-backed CFS.

Sincerely,
    Lars Marowsky-Brée
Re: GFS, what's remaining
On 2005-09-01T16:28:30, Alan Cox wrote:

> Competition will decide if OCFS or GFS is better, or indeed if someone
> comes along with another contender that is better still. And competition
> will probably get the answer right.

Competition will come up with the same situation as with reiserfs and ext3 and XFS, namely that they'll all be maintained going forward because of, uhm, political constraints ;-)

But then, as long as they _are_ maintained and play along nicely with each other (which, btw, is needed already so that at least data can be migrated...), I don't really see a problem with having two or three.

> The only thing that is important is we don't end up with each cluster fs
> wanting different core VFS interfaces added.

Indeed.

Sincerely,
    Lars Marowsky-Brée
Re: [PATCH 00/14] GFS
On 2005-08-10T12:05:11, Christoph Hellwig wrote:

> > What would a syntax look like which in your opinion does not remove
> > totally valid symlink targets for magic mushroom bullshit? Prefix with
> > // (which, according to POSIX, allows for implementation-defined
> > behaviour)? Something else, not allowed in a regular pathname?
> None. just don't do it. Use bindmount, they're cheap and have sane
> defined semtantics.

So for every directory hierarchy on a shared filesystem, each user needs to have the complete list of bindmounts needed, and automatically resync that across all nodes when a new one is added or removed? And then have that executed by root, because a regular user can't?

Sure. Very cheap and sane. I'm buying.

Sincerely,
    Lars Marowsky-Brée
Re: [PATCH 00/14] GFS
On 2005-08-10T11:54:50, Christoph Hellwig <[EMAIL PROTECTED]> wrote:

> It works now. Unlike context link which steal totally valid symlink
> targets for magic mushroom bullshit.

Right, that is a valid concern. Avoiding context-dependent symlinks entirely certainly is one possible path around this.

But let's, just for the sake of this discussion, continue down the other path for a bit, to explore the options available for implementing CPS which don't result in shivers running down the spine, because I believe CPS do have some applications in which bind mounts are not entirely adequate replacements.

(Unless, of course, you want a bind mount for each home directory which might include architecture-specific subdirectories, or for every host-specific configuration file.)

What would a syntax look like which in your opinion does not remove totally valid symlink targets for magic mushroom bullshit? Prefix with // (which, according to POSIX, allows for implementation-defined behaviour)? Something else, not allowed in a regular pathname?

If we can't find an acceptable way of implementing them, maybe it's time to grab some magic mushrooms and come up with a new approach, then ;-)

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
Re: [PATCH 00/14] GFS
On 2005-08-10T11:32:56, Christoph Hellwig <[EMAIL PROTECTED]> wrote:

> > Would a generic implementation of that higher up in the VFS be more
> > acceptable?
> No. Use mount --bind

That's a working and less complex alternative for up to how many places at once? That works for non-root users how...?

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
Re: [PATCH 00/14] GFS
On 2005-08-10T08:03:09, Christoph Hellwig <[EMAIL PROTECTED]> wrote:

> > Kindly lose the "Context Dependent Pathname" crap.
> Same for ocfs2.

Would a generic implementation of that higher up in the VFS be more acceptable?

It's not like context-dependent symlinks are an arbitrary feature; rather, they are very useful in practice.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
Re: [PATCH 00/14] GFS
On 2005-08-03T11:56:18, David Teigland <[EMAIL PROTECTED]> wrote:

> > * Why use your own journalling layer and not say ... jbd ?
> Here's an analysis of three approaches to cluster-fs journaling and their
> pros/cons (including using jbd): http://tinyurl.com/7sbqq

Very instructive read, thanks for the link.
Re: Power consumption HZ100, HZ250, HZ1000: new numbers
On 2005-08-02T10:52:00, Lee Revell <[EMAIL PROTECTED]> wrote:

> > Power consumption matters to server, desktop, and laptop.
> >
> > Assuming this is a laptop issue is wildly incorrect.
>
> I would think you'd get the best power/performance ratio from a desktop
> by just having it suspend after 5 or 10 minutes of idle time.
>
> Oh well, I'll shut up, I've already demonstrated a complete ignorance of
> new hardware.

Lee, that is a very impressive statement to make in public, and not many people are capable of admitting that they were wrong, or that their focus has been too narrow.

It's great to see this discussion has proven instructive to you.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
Re: Power consumption HZ100, HZ250, HZ1000: new numbers
On 2005-08-02T10:02:59, Lee Revell <[EMAIL PROTECTED]> wrote:

> > Maybe new desktop systems - but what about the tens of millions of old
> > systems that don't.
> Does anyone really give a shit about saving power on the desktop anyway?
> This is basically a laptop issue.

Desktops? Screw desktops. (Unless of course you're one of those environmentally friendly guys, but then you probably are simply too cheap to buy an SUV too!)

But rather think "data center". Those guys want maximum cycles per watt. One way of getting there is using fewer watts when we don't use all cycles. This brings down power consumption, which directly brings down heat production, which brings down A/C needs.

Everyone wants to save power.
Re: [Clusters_sig] RE: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
On 2005-07-20T11:39:38, Joel Becker <[EMAIL PROTECTED]> wrote:

> 	In turn, let me clarify a little where configfs fits in to
> things. Configfs is merely a convenient and transparent method to
> communicate configuration to kernel objects. It's not a place for
> uevents, for netlink sockets, or for fancy communication. It allows
> userspace to create an in-kernel object and set/get values on that
> object. It also allows userspace and kernelspace to share the same
> representation of that object and its values.
> 	For more complex interaction, sysfs and procfs are often more
> appropriate. While you might "configure" all known nodes in configfs,
> the node up/down state might live in sysfs. A netlink socket for
> up/down events might live in procfs. And so on.

Right. Thanks for the clarification and elaboration, for I am sure not entirely clear as to how all these mechanisms relate in detail and what is appropriate just where, and when to use something more classic like ioctl etc... ;-)

FWIW, we didn't mean to get uevents out via configfs of course.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
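The "create an in-kernel object and set/get values" interaction Joel describes maps onto ordinary filesystem calls from userspace: mkdir() to create the object, then reads and writes on its attribute files. The following is only a sketch under stated assumptions: the helper names (cfg_create, cfg_set, cfg_get), the object name and the attribute name are hypothetical and not taken from any posted patch, and on a real configfs mount the attribute files are created by the kernel, not by the writer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Build "<base>/<obj>" or "<base>/<obj>/<attr>"; 0 on success, -1 if too long. */
static int cfg_path(char *buf, size_t n, const char *base,
		    const char *obj, const char *attr)
{
	int len = attr ? snprintf(buf, n, "%s/%s/%s", base, obj, attr)
		       : snprintf(buf, n, "%s/%s", base, obj);

	return (len < 0 || (size_t)len >= n) ? -1 : 0;
}

/* "Create an object": in configfs this is a plain mkdir() under the mount. */
static int cfg_create(const char *base, const char *obj)
{
	char path[512];

	assert(base != NULL && obj != NULL);
	if (cfg_path(path, sizeof(path), base, obj, NULL) != 0)
		return -1;
	if (mkdir(path, 0755) != 0 && errno != EEXIST)
		return -1;
	return 0;
}

/* "Set a value": write the string into the attribute file. */
static int cfg_set(const char *base, const char *obj, const char *attr,
		   const char *value)
{
	char path[512];
	FILE *f;
	int rc;

	if (cfg_path(path, sizeof(path), base, obj, attr) != 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	rc = (fputs(value, f) == EOF) ? -1 : 0;
	if (fclose(f) != 0)
		rc = -1;
	return rc;
}

/* "Get a value": read the first line of the attribute file into out. */
static int cfg_get(const char *base, const char *obj, const char *attr,
		   char *out, size_t n)
{
	char path[512];
	FILE *f;
	int rc = 0;

	assert(out != NULL && n > 0);
	if (cfg_path(path, sizeof(path), base, obj, attr) != 0)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(out, (int)n, f))
		rc = -1;
	else
		out[strcspn(out, "\n")] = '\0';	/* strip a trailing newline */
	fclose(f);
	return rc;
}
```

Pointed at a scratch directory, the helpers behave like a key/value store; pointed at a configfs mount, the same calls would reach whatever kernel object backs the directory.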
Re: [Clusters_sig] RE: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
On 2005-07-20T09:55:31, "Walker, Bruce J (HP-Labs)" <[EMAIL PROTECTED]> wrote:

> Like Lars, I too was under the wrong impression about this configfs
> "nodemanager" kernel component. Our discussions in the cluster
> meeting Monday and Tuesday were assuming it was a general service that
> other kernel components could/would utilize and possibly also
> something that could send uevents to non-kernel components wanting a
> std. way to see membership information/events.

Let me clarify that this was something we briefly touched on in Walldorf: the node manager would (re-)export the current data via sysfs (which would result in uevents being sent, too). It is not something we dreamed up just Monday ;-)

> As to kernel components without corresponding user-level "managers",
> look no farther than OpenSSI. Our hope was that we could adapt to a
> user-land membership service and this interface thru configfs would
> drive all our kernel subsystems.

Well, node manager still can provide you the input as to which nodes are configured, which in a way translates to "membership". The thing it doesn't seem to provide yet is the suspend/modify/resume cycle which, for example, the RHAT DLM seems to require.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
On 2005-07-20T11:35:46, David Teigland <[EMAIL PROTECTED]> wrote:

> > Also, eventually we obviously need to have state for the nodes - up/down
> > et cetera. I think the node manager also ought to track this.
> We don't have a need for that information yet; I'm hoping we won't ever
> need it in the kernel, but we'll see.

Hm, I'm thinking a service might have a good reason to want to know the possible list of nodes as opposed to the currently active membership; though the DLM as the service in question right now does not appear to need such.

But, see below.

> There are at least two ways to handle this:
>
> 1. Pass cluster events and data into the kernel (this sounds like what
> you're talking about above), notify the affected kernel components, each
> kernel component takes the cluster data and does whatever it needs to with
> it (internal adjustments, recovery, etc).
>
> 2. Each kernel component "foo-kernel" has an associated user space
> component "foo-user". Cluster events (from userland clustering
> infrastructure) are passed to foo-user -- not into the kernel. foo-user
> determines what the specific consequences are for foo-kernel. foo-user
> then manipulates foo-kernel accordingly, through user/kernel hooks (sysfs,
> configfs, etc). These control hooks would largely be specific to foo.
>
> We're following option 2 with the dlm and gfs and have been for quite a
> while, which means we don't need 1. I think ocfs2 is moving that way,
> too. Someone could still try 1, of course, but it would be of no use or
> interest to me. I'm not aware of any actual projects pushing forward with
> something like 1, so the persistent reference to it is somewhat baffling.

Right. I thought that the node manager changes for generalizing it were pushing in sort-of direction 1. Thanks for clearing this up.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
On 2005-07-18T14:15:53, David Teigland <[EMAIL PROTECTED]> wrote:

> Some of the comments about the dlm concerned how it's configured (from
> user space.) In particular, there was interest in seeing the dlm and
> ocfs2 use common methods for their configuration.
>
> The first area I'm looking at is how we get addresses/ids of other nodes.
> Currently, the dlm uses an ioctl on a misc device and ocfs2 uses a
> separate kernel module called "ocfs2_nodemanager" that's based on
> configfs.
>
> I've taken a stab at generalizing ocfs2_nodemanager so the dlm could use
> it (removing ocfs-specific stuff). It still needs some work, but I'd like
> to know if this appeals to the ocfs group and to others who were
> interested in seeing some similarity in dlm/ocfs configuration.

Hi Dave, I finally found time to read through this.

Yes, I most definitely like where this is going!

> +/* TODO:
> +   - generic addresses (IPV4/6)
> +   - multiple addresses per node

The nodeid, I thought, was relative to a given DLM namespace, no? This concept seems to be missing here, or are you suggesting the nodeid to be global across namespaces?

Also, eventually we obviously need to have state for the nodes - up/down et cetera. I think the node manager also ought to track this.

How would kernel components use this and be notified about changes to the configuration / membership state?

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
Re: [PATCH] Provide better printk() support for SMP machines [try #2]
On 2005-07-08T13:36:12, David Howells <[EMAIL PROTECTED]> wrote:

> The attached patch prevents oopses interleaving with characters from other
> printks on other CPUs by only breaking the lock if the oops is happening on
> the machine holding the lock.
>
> It might be better if the oops generator got the lock and then called an inner
> vprintk routine that assumed the caller holds the lock, thus making oops
> reports "atomic".
>
> Signed-Off-By: David Howells <[EMAIL PROTECTED]>

After some discussion on IRC (me asking dumb questions) and reviewing the code, I'm in infinite favour of this patch. It clearly is a step in a mucho desirable direction.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
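The rule David describes (break the log lock only when the oops happens on the CPU that already holds it) reduces to a two-condition test. This is a minimal userspace sketch of that decision only, assuming the lock owner is tracked as a CPU number; the function name, the NO_OWNER constant and the parameters are hypothetical and do not come from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_OWNER (-1)	/* hypothetical marker: nobody holds the log lock */

/*
 * Decide whether the current CPU may forcibly re-initialise the log lock.
 * Breaking it is only safe when this CPU is the holder: waiting would
 * deadlock against ourselves. If another CPU holds it, we wait instead,
 * so its output is not interleaved with the oops report.
 */
static bool may_break_log_lock(bool oops_in_progress, int lock_owner_cpu,
			       int this_cpu)
{
	assert(this_cpu >= 0);
	return oops_in_progress && lock_owner_cpu == this_cpu;
}
```

An oops on CPU 1 while CPU 0 holds the lock therefore yields false, and the oopsing CPU queues up behind the holder.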
Re: A "new driver model" and EXPORT_SYMBOL_GPL question
On 2005-07-05T07:09:47, "Richard B. Johnson" <[EMAIL PROTECTED]> wrote:

> This problem will continue. Eventually there will be no general
> exported symbols. The apparent idea is to prevent the use of the
> kernel in proprietary systems.

... with proprietary kernel extensions. There's a difference.

> Not to worry. The tools provided with a typical Linux distribution
> are capable of resolving those symbols. You can make a script
> that `greps` System.map for the correct offsets of those symbols.
> You can use those offsets in a linker script.

If it wasn't you, I'd be assuming you'd be joking when suggesting to subvert the clear wishes of and licensing granted by the copyright holders and authors.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
Re: [patch 1/1] nbd: Don't create all MAX_NBD devices by default all the time
On 2005-04-15T00:56:35, Domen Puncer <[EMAIL PROTECTED]> wrote:

> This is permissions in sysfs (or 0 if no file is to be created).

Duh. Should have caught that. Try this one.

Index: linux-2.6.11/drivers/block/nbd.c
===
--- linux-2.6.11.orig/drivers/block/nbd.c	2005-03-02 08:37:50.0 +0100
+++ linux-2.6.11/drivers/block/nbd.c	2005-04-15 09:36:10.374854551 +0200
@@ -78,6 +78,7 @@
 #define DBG_RX          0x0200
 #define DBG_TX          0x0400
 static unsigned int debugflags;
+static unsigned int nbds_max = 16;
 #endif /* NDEBUG */
 
 static struct nbd_device nbd_dev[MAX_NBD];
@@ -647,7 +648,13 @@ static int __init nbd_init(void)
 		return -EIO;
 	}
 
-	for (i = 0; i < MAX_NBD; i++) {
+	if (nbds_max > MAX_NBD) {
+		printk(KERN_CRIT "nbd: cannot allocate more than %u nbds; %u requested.\n", MAX_NBD,
+				nbds_max);
+		return -EINVAL;
+	}
+
+	for (i = 0; i < nbds_max; i++) {
 		struct gendisk *disk = alloc_disk(1);
 		if (!disk)
 			goto out;
@@ -673,7 +680,7 @@ static int __init nbd_init(void)
 	dprintk(DBG_INIT, "nbd: debugflags=0x%x\n", debugflags);
 
 	devfs_mk_dir("nbd");
-	for (i = 0; i < MAX_NBD; i++) {
+	for (i = 0; i < nbds_max; i++) {
 		struct gendisk *disk = nbd_dev[i].disk;
 		nbd_dev[i].file = NULL;
 		nbd_dev[i].magic = LO_MAGIC;
@@ -706,8 +713,9 @@ out:
 static void __exit nbd_cleanup(void)
 {
 	int i;
-	for (i = 0; i < MAX_NBD; i++) {
+	for (i = 0; i < nbds_max; i++) {
 		struct gendisk *disk = nbd_dev[i].disk;
+		nbd_dev[i].magic = 0;
 		if (disk) {
 			del_gendisk(disk);
 			blk_cleanup_queue(disk->queue);
@@ -725,6 +733,8 @@ module_exit(nbd_cleanup);
 MODULE_DESCRIPTION("Network Block Device");
 MODULE_LICENSE("GPL");
 
+module_param(nbds_max, int, 0444);
+MODULE_PARM_DESC(nbds_max, "How many network block devices to initialize.");
 #ifndef NDEBUG
 module_param(debugflags, int, 0644);
 MODULE_PARM_DESC(debugflags, "flags for controlling debug output");

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
[patch 1/1] nbd: Don't create all MAX_NBD devices by default all the time
From: Lars Marowsky-Bree <[EMAIL PROTECTED]>

This patch adds the "nbds_max" parameter to the nbd kernel module, which limits the number of nbds allocated. Previously, all 128 entries were always allocated unconditionally, which wasted resources and needlessly flooded the hotplug system with events. (Defaults to 16 now.)

Signed-off-by: Lars Marowsky-Bree <[EMAIL PROTECTED]>

---
 nbd.c |   16 +++++++++++++---
 1 files changed, 13 insertions(+), 3 deletions(-)

Index: linux-2.6.11/drivers/block/nbd.c
===
--- linux-2.6.11.orig/drivers/block/nbd.c	2005-03-02 08:37:50.0 +0100
+++ linux-2.6.11/drivers/block/nbd.c	2005-04-14 13:08:40.100896527 +0200
@@ -78,6 +78,7 @@
 #define DBG_RX          0x0200
 #define DBG_TX          0x0400
 static unsigned int debugflags;
+static unsigned int nbds_max;
 #endif /* NDEBUG */
 
 static struct nbd_device nbd_dev[MAX_NBD];
@@ -647,7 +648,13 @@ static int __init nbd_init(void)
 		return -EIO;
 	}
 
-	for (i = 0; i < MAX_NBD; i++) {
+	if (nbds_max > MAX_NBD) {
+		printk(KERN_CRIT "nbd: cannot allocate more than %u nbds; %u requested.\n", MAX_NBD,
+				nbds_max);
+		return -EINVAL;
+	}
+
+	for (i = 0; i < nbds_max; i++) {
 		struct gendisk *disk = alloc_disk(1);
 		if (!disk)
 			goto out;
@@ -673,7 +680,7 @@ static int __init nbd_init(void)
 	dprintk(DBG_INIT, "nbd: debugflags=0x%x\n", debugflags);
 
 	devfs_mk_dir("nbd");
-	for (i = 0; i < MAX_NBD; i++) {
+	for (i = 0; i < nbds_max; i++) {
 		struct gendisk *disk = nbd_dev[i].disk;
 		nbd_dev[i].file = NULL;
 		nbd_dev[i].magic = LO_MAGIC;
@@ -706,8 +713,9 @@ out:
 static void __exit nbd_cleanup(void)
 {
 	int i;
-	for (i = 0; i < MAX_NBD; i++) {
+	for (i = 0; i < nbds_max; i++) {
 		struct gendisk *disk = nbd_dev[i].disk;
+		nbd_dev[i].magic = 0;
 		if (disk) {
 			del_gendisk(disk);
 			blk_cleanup_queue(disk->queue);
@@ -725,6 +733,8 @@ module_exit(nbd_cleanup);
 MODULE_DESCRIPTION("Network Block Device");
 MODULE_LICENSE("GPL");
 
+module_param(nbds_max, int, 16);
+MODULE_PARM_DESC(nbds_max, "How many network block devices to initialize.");
 #ifndef NDEBUG
 module_param(debugflags, int, 0644);
 MODULE_PARM_DESC(debugflags, "flags for controlling debug output");
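The bounds check this patch adds to nbd_init() can be exercised outside the kernel. The snippet below is a plain userspace sketch, not module code: nbd_check_count() is a hypothetical helper, MAX_NBD is set to the 128 mentioned in the description, and fprintf() stands in for printk().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define MAX_NBD 128	/* size of the static device table, per the description */

/*
 * Mirror of the check added to nbd_init(): refuse a requested device count
 * that does not fit the statically sized table instead of overrunning it.
 * Returns 0 if the count is acceptable, -EINVAL otherwise.
 */
static int nbd_check_count(unsigned int requested, unsigned int max)
{
	assert(max > 0);
	if (requested > max) {
		fprintf(stderr,
			"nbd: cannot allocate more than %u nbds; %u requested.\n",
			max, requested);
		return -EINVAL;
	}
	return 0;
}
```

Note that the limit itself is inclusive: requesting exactly MAX_NBD devices is accepted, and only MAX_NBD + 1 and above is rejected.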
Re: Exploit in 2.6 kernels
On 2005-04-13T08:59:21, Lennart Sorensen <[EMAIL PROTECTED]> wrote:

> It is becoming harder and harder to find supported cards it seems.
> Finding a card with decent 2D drivers for X can still be done, but 3D is
> just not really an option it seems. Even 2D seems to be a problem on
> many cards if you don't use a binary only driver.

You are confusing the cause with the symptom.
Re: [ANNOUNCE 0/6] Open-iSCSI High-Performance Initiator for Linux
On 2005-03-09T18:36:37, Alex Aizman <[EMAIL PROTECTED]> wrote:

> Heartbeat is good for reliability, etc. WRT "getting paged-out" -
> non-deterministic (things depend on time), right?

Right, if we didn't get scheduled often enough for us to send our heartbeat messages to the other peers, they'll evict us from the cluster and fence us, causing a service disruption.

With all these protections in place though, we can run at roughly 50ms heartbeat intervals from user-space, reliably, which allows us a node dead timer of ~200ms. I think that's pretty damn good.

(Of course, realistically, even for subsecond fail-over, 200ms keep alives are sufficient, and 50ms would be quite extreme. But, it works.)

> >That works well in our current development series, and if you want to
> >share code, you can either rip it off (Open Source, we love ya ;) or we
> >can spin off these parts into a sub-package for you to depend on...
> If it's not a big deal :-) let's do the "sub-package" option.

I've brought this up on the linux-ha-dev list. When do you need this?

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
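The relation between the heartbeat interval and the node dead timer quoted above can be made concrete with a small timestamp comparison. This is only an illustrative sketch using the 50ms and 200ms figures from the message; struct peer and both function names are hypothetical and are not heartbeat's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HEARTBEAT_INTERVAL_MS 50	/* send interval quoted in the message */
#define DEADTIME_MS           200	/* node dead timer quoted in the message */

struct peer {
	uint64_t last_seen_ms;	/* monotonic time of the last heartbeat received */
};

/* Record that a heartbeat from this peer arrived at now_ms. */
static void peer_heartbeat(struct peer *p, uint64_t now_ms)
{
	p->last_seen_ms = now_ms;
}

/* A peer is declared dead once DEADTIME_MS passes without any heartbeat. */
static bool peer_is_dead(const struct peer *p, uint64_t now_ms)
{
	assert(now_ms >= p->last_seen_ms);
	return now_ms - p->last_seen_ms >= DEADTIME_MS;
}
```

With these numbers a peer survives three consecutive lost heartbeats and is evicted when the fourth fails to arrive, which is why the sender must be scheduled reliably within each 50ms window.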
Re: [ANNOUNCE 0/6] Open-iSCSI High-Performance Initiator for Linux
On 2005-03-08T22:25:29, Alex Aizman <[EMAIL PROTECTED]> wrote:

> There's (or at least was up until today) an ongoing discussion on our
> mailing list at http://groups-beta.google.com/group/open-iscsi. The
> short and long of it: the problem can be solved, and it will. Couple
> simple things we already do: mlockall() to keep the daemon un-swapped,
> and also looking into potential dependency created by syslog (there's
> one for 2.4 kernel, not sure if this is an issue for 2.6).

BTW, to get around the very same issues, heartbeat does much the same: lock itself into memory, reserve a couple of pages more to spare on stack & heap, run at soft-realtime priority.

syslog(), however, sucks.

We went down the path of using our non-blocking IPC library to have all our various components log to ha_logd, which then logs to syslog() or writes to disk or wherever.

That works well in our current development series, and if you want to share code, you can either rip it off (Open Source, we love ya ;) or we can spin off these parts into a sub-package for you to depend on...

> The sfnet is a learning experience; it is by no means a proof that it
> cannot be done.

I'd also argue that it MUST be done, because the current way of "Oh, it's somehow related to block stuff, must be in kernel" leads down to hell. We better figure out good ways around it ;-)

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
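The three protections listed above (lock into memory, pre-touch some spare stack, run at soft-realtime priority) correspond to standard POSIX calls. The sketch below is an assumption about how a daemon might combine them, not heartbeat's or open-iscsi's actual code: pin_daemon(), prefault_stack(), the PIN_* flags and the 64 KiB reserve are all hypothetical, and both mlockall() and sched_setscheduler() normally need privileges, so failures are reported rather than treated as fatal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define STACK_RESERVE (64 * 1024)	/* hypothetical amount of stack to pre-touch */

enum { PIN_MEMLOCK = 1, PIN_REALTIME = 2 };

/* Touch one byte per page so the reserve is faulted in (and locked) up front. */
static void prefault_stack(void)
{
	volatile char reserve[STACK_RESERVE];
	long page = sysconf(_SC_PAGESIZE);
	size_t step = page > 0 ? (size_t)page : 4096;
	size_t i;

	for (i = 0; i < sizeof(reserve); i += step)
		reserve[i] = 0;
}

/*
 * Try to make the calling process immune to paging and scheduling delays.
 * Returns a bitmask of the PIN_* steps that succeeded.
 */
static int pin_daemon(void)
{
	struct sched_param sp;
	int done = 0;

	/* Lock current and future mappings so the daemon is never swapped out. */
	if (mlockall(MCL_CURRENT | MCL_FUTURE) == 0)
		done |= PIN_MEMLOCK;

	prefault_stack();

	/* Lowest round-robin realtime priority: ahead of all normal tasks. */
	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = sched_get_priority_min(SCHED_RR);
	if (sched_setscheduler(0, SCHED_RR, &sp) == 0)
		done |= PIN_REALTIME;

	assert((done & ~(PIN_MEMLOCK | PIN_REALTIME)) == 0);
	return done;
}
```

A caller would typically log which steps failed and carry on, since an unprivileged test run will usually get neither.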
Re: [MC] [CHECKER] Do ext2, jfs and reiserfs respect mount -o sync/dirsync option?
On 2005-03-04T01:44:06, Junfeng Yang <[EMAIL PROTECTED]> wrote:

> > That would be a bug. Please send the e2fsck output.
>
> Here is the trace
>
> 1. file system is made with sbin/mkfs.ext2 -F -b 1024 /dev/hda9 60
> and mounted with -o sync,dirsync
>
> 1. operations FiSC did:
>
> creat(/mnt/sbd0/0001)
> write(/mnt/sbd0/0001)
> rename(/mnt/sbd0/0001, /mnt/sbd0/0002)
> mkdir(/mnt/sbd0/0003)
>
> 2. FiSC "crashed" the test machine after mkdir returns. Crashed
> disk image can be downloaded at: http://fisc.stanford.edu/bug2/crash.img.bz2

I've run into similar issues. For example, a "touch foo" also isn't synchronous with -o sync, but stays entirely in the cache.

Andrea tells me this is expected behaviour, so I've given up on this one...

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
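An application that cannot rely on -o sync,dirsync covering metadata can request durability itself. The following is a workaround sketch, not something proposed in the thread: it creates a file, then calls fsync() on both the file and its containing directory so the new directory entry is flushed as well. durable_create() is a hypothetical helper, and partial writes are simply treated as errors.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create `name` inside directory `dir`, write `data` to it, then flush
 * the file and the directory. Returns 0 on success, -1 on any error.
 */
static int durable_create(const char *dir, const char *name, const char *data)
{
	size_t len;
	int dfd, fd;
	int rc = -1;

	assert(dir != NULL && name != NULL && data != NULL);
	len = strlen(data);

	dfd = open(dir, O_RDONLY | O_DIRECTORY);
	if (dfd < 0)
		return -1;

	fd = openat(dfd, name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		goto out_dir;

	if (write(fd, data, len) != (ssize_t)len)
		goto out_file;
	if (fsync(fd) != 0)		/* file contents and inode */
		goto out_file;
	if (fsync(dfd) != 0)		/* the directory entry itself */
		goto out_file;
	rc = 0;

out_file:
	close(fd);
out_dir:
	close(dfd);
	return rc;
}
```

The second fsync() is the part that corresponds to the "touch foo" observation: without it the file's data may be on disk while its name is still only in the cache.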
Re: RFD: Kernel release numbering
On 2005-03-02T15:23:49, Greg KH <[EMAIL PROTECTED]> wrote:

> > This could be improved: _All_ new features have to go through -mm first
> > for a period (of whatever length) / one cycle. 2.6.x only directly picks
> > up "obvious" bugfixes, and a select set of features which have ripened
> > in -mm. 2.6.x-pre releases would then basically "only" clean up
> > integration bugs.
>
> This is the way things work today already. The only exception being the
> networking code, but hey, networking's always been special :)

That's exactly what I'm saying. It already works this way, w/o misleading the users and attaching confusing meaning to minor revision numbers.

It could do with being more clearly advertised and stuck to, but that's about it.

Don't confuse more. People are already way too confused on their own.

If anything, if we expect a release to have ... interesting side-effects, Linus will find just the funny words for the release notes to say so ;-)

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
Re: RFD: Kernel release numbering
On 2005-03-02T14:21:38, Linus Torvalds <[EMAIL PROTECTED]> wrote:

> We'd still do the -rcX candidates as we go along in either case, so as a
> user you wouldn't even _need_ to know, but the numbering would be a rough
> guide to intentions. Ie I'd expect that distributions would always try to
> base their stuff off a 2.6. release.

If the users wouldn't even have to know, why do it? Who will benefit from this, then?

I think a better approach, and one which is already working out well in practice, is to put "more intrusive" features into -mm first, and only migrate them into 2.6.x when they have 'stabilized'.

This could be improved: _All_ new features have to go through -mm first for a period (of whatever length) / one cycle. 2.6.x only directly picks up "obvious" bugfixes, and a select set of features which have ripened in -mm. 2.6.x-pre releases would then basically "only" clean up integration bugs.

-mm would be the 'feature tree'. Of course, features which have matured in other eligible trees might also work; the key point is the two-stage approach, and it doesn't matter whether the chaos stage has one or three trees, as long as it's not more than that.

I think that would be a natural extension of how things already work and just tightens the process some.

(From a vendor perspective, this would mean we'd be safe picking up any 2.6.x tree + select choices from x+1-pre plus whatever we are force fed by those who pay.)

If one wanted to get fancy, which I'm throwing in just to make everybody lose the core point of the argument: One could associate "points" with each feature / patch in -mm, based on an estimation of how intrusive/well-tested/dangerous/heavenly that patch is, and mandate that only 42 points per 2.6.x release are allowed.

Of course, one could also apply common sense. But, that's not as silly. Or way more so, but less amusing than voting wars.

> Comments?

The numbering scheme is more confusing and unclear, and complexity is the enemy of reliability.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
Re: [PATCH] device-mapper: multipath hardware handler for EMC
On 2005-02-11T19:58:41, Christoph Hellwig <[EMAIL PROTECTED]> wrote:

> > +/* Code borrowed from dm-lsi-rdac by Mike Christie */
>
> Any reason that module isn't submitted?

No idea why.

> > +	bio->bi_bdev = path->dev->bdev;
> > +	bio->bi_sector = 0;
> > +	bio->bi_private = path;
> > +	bio->bi_end_io = emc_endio;
> > +
> > +	page = alloc_page(GFP_ATOMIC);
> > +	if (!page) {
> > +		DMERR("dm-emc: get_failover_bio: alloc_page() failed.");
> > +		bio_put(bio);
> > +		return NULL;
> > +	}
> > +
> > +	if (bio_add_page(bio, page, data_size, 0) != data_size) {
> > +		DMERR("dm-emc: get_failover_bio: alloc_page() failed.");
> > +		__free_page(page);
> > +		bio_put(bio);
> > +		return NULL;
> > +	}
> > +
> > +	return bio;
>
> this would benefit from goto unwinding.

OK.

> > +	if (h->short_trespass) {
> > +		memcpy(page22, short_trespass_pg, data_size);
> > +	} else {
> > +		memcpy(page22, long_trespass_pg, data_size);
> > +	}
>
>	memcpy(page22, h->short_trespass ?
>		short_trespass_pg : long_trespass_pg, data_size);
>
> ?

Yes, I first did some other things there than just copying the commands
around, it can surely benefit from cleanup.

> > +static struct emc_handler *alloc_emc_handler(void)
> > +{
> > +	struct emc_handler *h = kmalloc(sizeof(*h), GFP_KERNEL);
> > +
> > +	if (h) {
> > +		h->lock = SPIN_LOCK_UNLOCKED;
> > +	}
>
>	if (h)
>		spin_lock_init(&h->lock);

Came in via the copy, good catch.

> > +static unsigned emc_err(struct hw_handler *hwh, struct bio *bio)
> > +{
> > +	/* FIXME: Patch from axboe still missing */
>
> it's in -mm now afaik??

No, it's not. That's the request sense keys, but here we're dealing with
the bio.

> > +#if 0
> > +	int sense;
> > +
> > +	if (bio->bi_error & BIO_SENSE) {
> > +		sense = bio->bi_error & 0xff; /* sense key / asc / ascq */
> > +
> > +		if (sense == 0x020403) {
>
> please use the sense handling helpers from Doug Gilbert so you can handle
> the descriptor sense format aswell. (And make the code a lot clear).

I'll go look them up.

> Also please try to use constants instead of magic numbers.

Noted. I'll clean this part up when I actually have sense keys to try, so
far this was mostly about getting that tiny bit of logic in.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
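The "goto unwinding" remark in the review above refers to releasing resources in reverse order through a single chain of labels instead of repeating the cleanup in every error branch. The following is a minimal userspace sketch of that pattern, not the dm-emc code: `struct request`, `make_request()`, `free_request()` and the `fail_step` failure-injection parameter are all hypothetical names chosen for illustration, with `header` and `payload` standing in for the bio and the page.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the bio + page pair in the patch. */
struct request {
    char *header;        /* plays the role of the bio */
    char *payload;       /* plays the role of the page */
    size_t payload_len;
};

/*
 * Build a request. On any failure, release what was acquired so far in
 * reverse order through one chain of labels and return NULL.
 * `fail_step` is a failure-injection hook for illustration only:
 * 0 = no failure, 1 = header allocation fails, 2 = payload allocation fails.
 */
struct request *make_request(const char *data, size_t len, int fail_step)
{
    struct request *req = calloc(1, sizeof(*req));
    if (!req)
        goto out;

    req->header = (fail_step == 1) ? NULL : malloc(16);
    if (!req->header)
        goto free_req;

    req->payload = (fail_step == 2) ? NULL : malloc(len);
    if (!req->payload)
        goto free_header;

    memcpy(req->payload, data, len);
    req->payload_len = len;
    return req;

free_header:
    free(req->header);
free_req:
    free(req);
out:
    return NULL;
}

/* Release a request built by make_request(); NULL is accepted. */
void free_request(struct request *req)
{
    if (!req)
        return;
    free(req->payload);
    free(req->header);
    free(req);
}
```

Each new acquisition adds one label at the bottom rather than one more copy of the cleanup sequence, which is what makes a copy-and-paste slip in the error branches less likely.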
Re: 2.6.11-rc1-bk9 crash in mdadm
On 2005-01-21T17:12:30, Jan Kasprzak <[EMAIL PROTECTED]> wrote:

> Just FWIW, I've got the following crash when trying to boot a 2.6.11-rc1-bk9
> kernel on my dual opteron Fedora Core 3 box. I will try -bk8 now.

Attached is a likely candidate for a fix. (It's been discussed on
linux-raid already.)

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business

From: Jens Axboe <[EMAIL PROTECTED]>
Subject: Fix md using bio on stack with bio clones
Patch-mainline:
References: 49931

If md resides on top of a driver using bio_clone() (such as dm), it will
oops the kernel due to md submitting a botched bio that has a veclist but
doesn't have bio->bi_max_vecs set.

Signed-off-by: Jens Axboe <[EMAIL PROTECTED]>

= drivers/md/md.c 1.231 vs edited =
--- 1.231/drivers/md/md.c	2004-12-01 09:13:51 +01:00
+++ edited/drivers/md/md.c	2005-01-19 13:23:30 +01:00
@@ -332,29 +332,26 @@
 static int sync_page_io(struct block_device *bdev, sector_t sector, int size,
 		   struct page *page, int rw)
 {
-	struct bio bio;
-	struct bio_vec vec;
+	struct bio *bio = bio_alloc(GFP_KERNEL, 1);
 	struct completion event;
+	int ret;
+
+	bio_get(bio);
 
 	rw |= (1 << BIO_RW_SYNC);
 
-	bio_init(&bio);
-	bio.bi_io_vec = &vec;
-	vec.bv_page = page;
-	vec.bv_len = size;
-	vec.bv_offset = 0;
-	bio.bi_vcnt = 1;
-	bio.bi_idx = 0;
-	bio.bi_size = size;
-	bio.bi_bdev = bdev;
-	bio.bi_sector = sector;
+	bio->bi_bdev = bdev;
+	bio->bi_sector = sector;
+	bio_add_page(bio, page, size, 0);
 	init_completion(&event);
-	bio.bi_private = &event;
-	bio.bi_end_io = bi_complete;
-	submit_bio(rw, &bio);
+	bio->bi_private = &event;
+	bio->bi_end_io = bi_complete;
+	submit_bio(rw, bio);
 	wait_for_completion(&event);
 
-	return test_bit(BIO_UPTODATE, &bio.bi_flags);
+	ret = test_bit(BIO_UPTODATE, &bio->bi_flags);
+	bio_put(bio);
+	return ret;
 }
 
 static int read_disk_sb(mdk_rdev_t * rdev)
Re: raid 1 - automatic 'repair' possible?
On 2005-01-18T22:18:01, "Kiniger, Karl (GE Healthcare)" <[EMAIL PROTECTED]> wrote:

> idea for enhancement of software raid 1:
>
> every time the raid determines that a sector cannot
> be read it could at least try to overwrite the bad area
> with good data from the other disk.

The idea is good and I'm sure we'd love to get a patch ;-)

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
Re: using gdb to debug kernel
On 2001-06-26T15:31:04, "SATHISH.J" <[EMAIL PROTECTED]> said:

> I would like to know how I can use gdb to debug some function in the
> kernel. Please help me out with this detail.

The easiest way would be user-mode-linux, hosted on sourceforge.net.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: [PATCH] drivers/net/others
On 2001-05-24T10:45:25, Tobias Ringstrom <[EMAIL PROTECTED]> said:

> >  	if (!printed_version++)
> > -		printk(version);
> > +		printk("%s", version);
> >
> >  	DMFE_DBUG(0, "dmfe_init_one()", 0);
>
> Could you please explain the purpose of this change? To me it looks less
> efficient in both performance and memory usage.

Potentially, version might contain characters (such as '%') which printk
would interpret as format directives when the string is passed as the
format argument; the change above prevents that. Paranoia always helps ;-)

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
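The same reasoning applies to any printf-style function, so it can be sketched in userspace with `snprintf()` instead of `printk()`. The helper name `format_literal()` below is hypothetical; the point is only that the untrusted or arbitrary string is passed as an argument to a fixed `"%s"` format and never as the format itself.

```c
#include <stdio.h>

/*
 * Copy an arbitrary string into `out` without letting it act as a
 * format string. Passing `msg` as the format itself would make
 * snprintf() interpret any '%' sequences it happens to contain.
 * Returns what snprintf() returns: the length the full output needs.
 */
int format_literal(char *out, size_t outlen, const char *msg)
{
    return snprintf(out, outlen, "%s", msg);
}
```

With this form a string such as "100%s done" comes out byte for byte, whereas using it directly as the format would make the function look for an argument that was never supplied.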
Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
On 2001-05-19T16:25:47, Daniel Phillips <[EMAIL PROTECTED]> said:

> How about:
>
>   # mkpart /dev/sda /dev/mypartition -o size=1024k,type=swap
>   # ls /dev/mypartition
>   base  size  device  type
>   # cat /dev/mypartition/size
>   1048576
>   # cat /dev/mypartition/device
>   /dev/sda
>   # mke2fs /dev/mypartition

Ek. You want to run mke2fs on a _directory_? If anything,
/dev/mypartition/realdev

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: Storage - redundant path failover / failback - quo vadis linux?
On 2001-05-16T08:34:00, Christoph Biardzki <[EMAIL PROTECTED]> said:

> I was investigating redundant path failover with FibreChannel disk devices
> during the last weeks. The idea is to use a second, redundant path to a
> storage device when the first one fails. Ideally one could also implement
> load balancing with these paths.
>
> The problem is really important when using linux for mission-critical
> applications which require large amounts of external storage.

Yes.

Device handling under Linux in the face of HA generally faces some annoying
issues - the one you mention is actually the least of it ;-)

Error handling and reporting is the most annoying one to me - there is no
good way to find out whether a device has had an error. And even if the
kernel logs a read error on device sda1 - great, what LVM volumes are
affected?

But on to your question... ;-)

> - The "T3"-Patch for 2.2-Kernels which patches the sd-Driver and the
> Qlogic-FC-HBA-Driver. When you pull an FC-Cable on a host equipped with two
> HBAs the failover is almost immediate and an automatic failback (after
> "repairing") is possible

I actually like this one best, if it was forward ported to 2.4.

> The low-level-approach of the "T3"-patch requires changes to the
> scsi-drivers and the hardware-drivers but provides optimal communication
> between the driver and the hardware

The changes required for the hardware drivers are rather minimal.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: [patch] 2.4 add suffix for uname -r
On 2001-05-06T01:36:05, Mike Castle <[EMAIL PROTECTED]> said:

> On Sun, May 06, 2001 at 10:12:17AM +0200, Lars Marowsky-Bree wrote:
> > You assign a new EXTRAVERSION to the new kernel you are building, and keep the
> > old kernel at the old name.
>
> Except that some patches (ie, RAID, -ac) use EXTRAVERSION.

So? You can still set EXTRAVERSION to anything you like.

It is just an identifier for the admin, and not used for anything inside
the kernel.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: [patch] 2.4 add suffix for uname -r
On 2001-05-06T17:45:06, Keith Owens <[EMAIL PROTECTED]> said:

> You already have a working kernel which you want to rename to use as a
> backup version. Changing EXTRAVERSION and recompiling builds a new
> kernel and adds uncertainty about whether the kernel still works - did
> you change anything else before recompiling?

You assign a new EXTRAVERSION to the new kernel you are building, and keep
the old kernel at the old name.

Problem solved.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: Plans for 2.5
On 2001-03-31T09:36:33, James Lewis Nance <[EMAIL PROTECTED]> said:

> > > 4) What is the time frame of releasing 2.5.x-final (or 2.6.x) ?
> > wow that's jumping the gun a bit.
> But its easy to answer. It will come out about 1 year after whatever
> target date we initially set :-)

Sorry, s/we initially set/we assume at any given time/.

(Recursion, noun: see recursion)

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: Patch submissions
On 2001-03-06T16:56:32, Alan Cox <[EMAIL PROTECTED]> said:

> I'm getting a notable increase in people sending me patches that do major
> things and should be 2.5 stuff. Please if you want to rewrite the VM completely,
> redesign the scsi layer and the like wait until 2.5.

When will 2.5 be forked?

If anyone wants to redesign the SCSI layer, by all means, DO NOT STOP HIM! ;-)

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
2.4 vs 2.2 performance under load comparison
Good morning,

I did a comparison between 2.4 and 2.2.18 (+ Andrea's patches), using the
respective latest SuSE kernels, but the results should apply to the versions
in general.

Situation: SAP R/3 + SAP DB + benchmark driver running on a single node
4 CPU SMP machine, tuned down to 1GB of RAM.

Running the SAP benchmark with 75 users on 2.2 yields for the first
benchmark run:

 - 7018ms average response time
 - 2967s CPU time in 1136s elapsed time
 - ~500MB swap allocated
 - ~1500 pages paged in/s, 268 pages/out/s on average

Running the same benchmark on 2.4:

 - ~700ms average response time
 - 1884s CPU time in 669s elapsed time
 - ~500MB swap allocated
 - ~50 pages paged in, ~212 pages paged out per second on average

Running the same benchmark the second time on both machines to get them
warmed up, 2.2 stays in approximately the same range, while 2.4 gets even
_better_, dropping down to ~350ms response time and ~20 pages in/out.

This is a rather amazing improvement in swapping performance.

Rik, it's time for you to break it again *g*

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
    SuSE Linux AG at the SAP LinuxLab - [EMAIL PROTECTED]

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
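To put the first-run figures above side by side: the response time improves by roughly a factor of 10 (7018ms against ~700ms), the elapsed time by roughly 1.7 (1136s against 669s), and of the 4 CPUs about 2.6 were kept busy on 2.2 against about 2.8 on 2.4. The small sketch below only reproduces that arithmetic; the function names `ratio()` and `busy_cpus()` are made up for illustration, and the ~700ms input is itself an approximate figure from the mail.

```c
/* Ratio of a 2.2 figure to the matching 2.4 figure (values > 1 favour 2.4). */
double ratio(double value_22, double value_24)
{
    return value_22 / value_24;
}

/* Average number of CPUs kept busy: CPU seconds spent per elapsed second. */
double busy_cpus(double cpu_seconds, double elapsed_seconds)
{
    return cpu_seconds / elapsed_seconds;
}
```

For example, `ratio(7018.0, 700.0)` gives the response-time improvement and `busy_cpus(2967.0, 1136.0)` the average CPU usage of the 2.2 run.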
Re: lkml subject line
On 2001-02-12T11:56:00, Mike Harrold <[EMAIL PROTECTED]> said:

> Maybe I don't *want* the LKML messages in a separate folder.
> Maybe I just want to identify them at a pinch in my inbox?

You can use procmail to modify the subject line of incoming mail too.

> Maybe my employer doesn't allow me to install additional software anyway?

Those would all be your problems and I would suggest using a different
account for mail then.

This discussion happens on every mailing list occasionally, and it is just a
generally bad idea, period. Especially for a list which is as often
crossposted to as lk.

Can we now move on?

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: [PATCH] Hot swap CPU support for 2.4.1
On 2001-02-05T15:00:40, Rusty Russell <[EMAIL PROTECTED]> said:

> I did the infrastructure, Anton did the bugfinding and PPC support,
> aka. the hard stuff. Other architectures need to implement
> __cpu_disable, __cpu_die and __cpu_up for them to work. Volunteers
> appreciated.

Rusty, what would be needed to "hot-add" CPUs?

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
    SuSE Linux AG at the SAP LinuxLab - [EMAIL PROTECTED]

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: RE: hotmail not dealing with ECN
On 2001-01-26T16:04:03, "Randal, Phil" <[EMAIL PROTECTED]> said:

> We may be right, "they" may be wrong, but in the real world
> arrogance rarely wins anyone friends.

So you also turn off PMTU discovery and just set the MTU to 200 bytes
because broken firewalls may drop ICMP?

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: hotmail not dealing with ECN
On 2001-01-26T15:08:21, James Sutherland <[EMAIL PROTECTED]> said:

> Obviously. The connection is now dead. However, trying to make a new
> connection with different settings is perfectly reasonable.

No.

If connect() suddenly did two connection attempts instead of one, just how
many timeouts might that break?

> Why? The connection is dead, but there is nothing to prevent attempting
> another connection.

Right. And that's why connect() returns an error and retries are handled in
userspace.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: hotmail not dealing with ECN
On 2001-01-26T06:39:36, "David S. Miller" <[EMAIL PROTECTED]> said:

> The RST frame does not indicate why it happened, so you may not intuit
> the reason, "retry" the connection, or anything else like that. It
> means connection failed, and we must return error from connect().
>
> Nothing else is acceptable.

That would mean that the people worried about this should write a
wrapper-library for the connect() call, and maybe add a no_ecn flag to a
socket, and leave the kernel alone.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: hotmail not dealing with ECN
On 2001-01-26T13:44:53, James Sutherland <[EMAIL PROTECTED]> said:

> > > A delayed retry without ECN might be a good compromise...
> > _NO!_
> Why? As it stands, I have ECN disabled. It's staying disabled until I know
> it won't degrade my Net access.

First, you are ignoring a TCP_RST, which means "stop trying".

You would have to retry a connection with a new source port. How do you
handle cases where the application explicitly bound the socket to a specific
source port / source IP?

Caching whether the site is able to speak ECN or not is also suboptimal if
the local site is opening lots of outgoing connections, like a proxy server.
(Of course, memory has gotten cheap)

_And_ it is solving the problem on the wrong end.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: hotmail not dealing with ECN
On 2001-01-26T11:40:36, James Sutherland <[EMAIL PROTECTED]> said:

> A delayed retry without ECN might be a good compromise...

_NO!_

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
cpqfc in 2.4 ?
cpqfc isn't built for 2.4 yet. Is there any specific reason for this?

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: shmfs behaviour
On 2001-01-12T11:10:39, "J . A . Magallon" <[EMAIL PROTECTED]> said:

> A couple of questions about shm filesystem:
> - Time ago I remember you could see some dot files inside the /dev/shm
> filesystem (then, even it was mounted in /var/shm...). Now it shows nothing.
> Is it the supposed behaviour ?

AFAIK yes.

> - By accident (switching between 2.2 and 2.4), i left the shm fs 'commented'
> (with a fs type of 'ignore'). Kernel 2.4 looked working good. What is
> /dev/shm for exactly ? Because it looks like I can live without it...

No. You will need it for POSIX shared memory.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: khttpd beaten by boa
On 2001-01-11T22:20:56, Christoph Lameter <[EMAIL PROTECTED]> said:

> Then we decided to switch persistent connection off... But boa still wins.
>
> What is wrong here? I would expect transfer rates of 3-4 megabytes over a
> localhost interface. The file is certainly in some kind of cache.

This just goes to show that khttpd is unnecessary kernel bloat and can be
"just as well" handled by a userspace application, minus some very special
cases which do not justify its inclusion into the main kernel.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: Monitoring filesystems / blockdevice for errors
On 2000-12-17T13:23:52, Mark Hahn <[EMAIL PROTECTED]> said:

> > Short of parsing syslog messages, which isn't particularly great.
> what's wrong with it?

Because it means having to know about all potential messages the filesystems
might dump out.

> reinventing /proc/kmsg and klogd would be tre gross.

Well, only one process can read kmsg and get notified about new messages at
any time, so that makes the monitoring depend on klogd/syslogd working, which
given a write error by syslog might not be the case...

> > I don't have a real idea how this could be added, short of adding a field to
> > /proc/partitions (error count) or something similar.
> for reporting errors, that might be OK, but it's not a particularly nice
> _notification_ mechanism...

Well, yes.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Monitoring filesystems / blockdevice for errors
Good morning,

currently, there is no way for an external application to monitor whether a
filesystem or underlying block device has hit an error condition - internal
inconsistency, read or write error, whatever.

Short of parsing syslog messages, which isn't particularly great.

This is necessary for server monitoring in general.

I don't have a real idea how this could be added, short of adding a field to
/proc/partitions (error count) or something similar.

Comments?

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
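To make the /proc/partitions idea a little more concrete, here is a sketch of how a monitoring tool might read such a table if it carried an error count. Everything about the extra column is an assumption taken from the proposal in this mail: the field layout "major minor #blocks name errors", the struct `part_status` and the function `parse_part_line()` are hypothetical and do not describe an existing kernel interface.

```c
#include <stdio.h>

/* One row of a /proc/partitions-style table with a hypothetical extra column. */
struct part_status {
    unsigned int major;
    unsigned int minor;
    unsigned long long blocks;
    char name[32];
    unsigned long errors;   /* hypothetical field: the proposed error count */
};

/*
 * Parse one line of the assumed format "major minor #blocks name errors".
 * Returns 0 on success, -1 if the line does not carry all five fields
 * (for example the header line or an empty line).
 */
int parse_part_line(const char *line, struct part_status *st)
{
    int fields = sscanf(line, "%u %u %llu %31s %lu",
                        &st->major, &st->minor, &st->blocks,
                        st->name, &st->errors);
    return fields == 5 ? 0 : -1;
}
```

A poller built on this would still have to re-read the file periodically and compare counts, which is the weakness pointed out in the follow-up: a counter reports errors but does not notify anyone.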
Re: Linux 2.2.18 release notes
On 2000-12-12T12:26:26, Alan Cox <[EMAIL PROTECTED]> said:

> > And thus it follows that 2.2.18 is the least buggy kernel ever, since
> > it has gotten the most bug fixes.
> >
> > Right? (:
> Hopefully not.

I _do_ hope that 2.2.18 is the least buggy kernel ever... Why do you hope
otherwise? ;-)

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: Advanced Linux Kernel/Enterprise Linux Kernel
On 2000-11-13T13:56:16, Josue Emmanuel Amaro <[EMAIL PROTECTED]> said:

Good morning Josue, I hope your certification matrix hasn't driven you mad
yet ;-)

> While I do not think it would be productive to enter a discussion whether
> there is a need to fork the kernel to add features that would be beneficial
> to mission/business critical applications, I am curious as to what are the
> features that people consider important to have.

This is in fact the valuable subpart of the discussion.

Working for SuSE on High Availability, especially in the "enterprise"
segment: Here, referring to systems running databases (mostly Oracle,
surprise), ERP-Systems, but also providing services (NFS, Samba, firewalls)
in such an environment.

I personally need features which allow me to keep on running, shut down as
gracefully as possible if an error occurs, and if an error occurred, diagnose
it out in the field.

This means: ECC memory, hotpluggable everything, proper error handling and
reporting in the kernel. Yes, christmas and easter do occur on the same day
in the real world, unfortunately. This can best be summarised as
"robustness".

If an error occurred, I need to be able to fully diagnose it without having
to reproduce it - no, I do not wish to reproduce the error by crashing my
critical server on purpose, nor is "The error appears to have gone away, we
have no clue what it was" an acceptable answer. (kdb, LKCD, Oopsing to the
network etc: And they must be part of the default kernel as far as possible,
so they stay in sync and get widespread testing)

But also scalability: 2TB is a problem for me in some cases, 32bit just
don't cut it all the time - but I need to circumvent the storage problem even
on a 32bit system. And adding disks to the system while running is desirable.

Cluster awareness, again mostly referring to storage: Yes, there is more than
one system accessing my SCSI bus, my FCAL RAID, and the error handling should
be architected in a way that they do not start reset wars. The LVM should
safeguard against multiple nodes changing the metadata. (Ok, this can be
solved in userspace too)

LVM must be transactional, so a crash on a node doesn't corrupt the data.

Basically, the talks in Miami (The Second Annual Linux Storage Management
Workshop) gave a great overview of everything I need.

And: I need all of this as Open Source. Period. No binary kernel modules do
me any good and I will pointedly ignore them.

Oh, and by the way - if any hot kernel hacker, not yet working on this full
time feels inspired to make this happen, contact me. Or any other Linux
company, as long as the job gets done. We'll be glad to make you a fulltime
kernel slave^Whacker! ;-)

> Another problem is how people define Enterprise Systems. Many base it on the
> definitions that go back to S390 systems, others in the context of the 24/7
> nature of the internet. That would also be a healthy discussion to have.

_ 24/7 * 99.99% mission/business critical services with "medium to high"
load.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
    Development HA

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: [ANNOUNCE] Generalised Kernel Hooks Interface (GKHI)
On 2000-11-10T19:12:29, "Theodore Y. Ts'o" <[EMAIL PROTECTED]> said:

> Great! Are you thinking about putting the crash dumper and the raw
> write disk routines in a separate text section, so they can be located
> in pages which are write-protected from accidental modification in case
> some kernel code goes wild? (Who me? Paranoid? :-)

I would also suggest to have a little checksum over the relevant pages, to
verify that the code is still correct and we are not going to crashdump all
over our valuable data...

And I am still very fond of the idea of crash dumping to a network server ;-)

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
    Development HA

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
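The checksum idea above can be sketched in plain C: record a checksum of the protected region while the system is known to be healthy, and recompute it before the dump code is trusted. This is a userspace illustration under assumptions, not kernel code. The names `struct guarded_region`, `guard_region()` and `region_intact()` are hypothetical, and Adler-32 is used here only because it is short to write down; the mail does not say which checksum was intended.

```c
#include <stddef.h>
#include <stdint.h>

/* Adler-32 over a byte range: two running sums modulo 65521. */
uint32_t region_checksum(const void *start, size_t len)
{
    const unsigned char *p = start;
    uint32_t a = 1, b = 0;

    for (size_t i = 0; i < len; i++) {
        a = (a + p[i]) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}

/* A memory range together with the checksum recorded for it. */
struct guarded_region {
    const void *start;
    size_t len;
    uint32_t expected;
};

/* Record the checksum while the region is known to be good. */
void guard_region(struct guarded_region *g, const void *start, size_t len)
{
    g->start = start;
    g->len = len;
    g->expected = region_checksum(start, len);
}

/* Returns 1 if the region still matches its recorded checksum, 0 otherwise. */
int region_intact(const struct guarded_region *g)
{
    return region_checksum(g->start, g->len) == g->expected;
}
```

A crash path would call something like `region_intact()` first and refuse to write to disk if it returns 0. A simple checksum like this catches stray writes, not deliberate tampering.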
Re: [ANNOUNCE] Generalised Kernel Hooks Interface (GKHI)
On 2000-11-09T07:20:27, Michael Rothwell <[EMAIL PROTECTED]> said:

> > I understand that the one size fits all approach has some limitations
> > if you want to run on PDAs up to big iron. But a framework to overload
> > core kernel functions with modules smells a lot of binary only, closed
> > source, vendor specific Linux on high end machines.
>
> Since Linux is GPL, how would you stop this?

Christoph / SAP is in a rather good position to stop that being supported by
vendors...

> Same as before -- freedom and low cost. The primary advantage of Linux
> over other OSes is the GPL.

And that is why that has to govern the kernel and its modules as far as
possible.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
    Development HA

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: [ANNOUNCE] Generalised Kernel Hooks Interface (GKHI)
On 2000-11-09T07:25:52, Michael Rothwell <[EMAIL PROTECTED]> said:

> Why? I think the IBM GKHI code would be of tremendous value. It would
> make the kernel much more flexible, and for users, much more friendly.
> No more patch-and-recompile to add a filesystem or whatever. There's no
> reason to hamstring their efforts because of the possibility of binary
> modules. The GPL allows that, right? So any developer of binary-only
> extensions using the GKHI would not be breaking the license agreement, I
> don't think. There's lots of binary modules right now -- VMWare, Aureal
> sound card drivers, etc.

And we already refuse to support those kernels - your point being?

Making this "commonplace" is a nightmare. Go away with that.

> I understand and agree with your desire for full source for everything,
> but I disagree that we should artificially limit people's ability to use
> Linux to solve their problems.

I want their solving of their problems not to create problems for me though.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
    Development HA

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: What is up with Redhat 7.0?
On 2000-10-02T21:40:59, [EMAIL PROTECTED] said:

> So the other distributions end up having to take the same arbitrary
> snapshot as what RH chose, which from the outside seems like it's done
> completely outside of the package author/maintainer's control. (i.e.,
> Why didn't the package maintainer issue a formal release, if they really
> thought it was the best thing for RedHat to be using --- especially when
> the package maintainer in many cases is employed by Red Hat?)

Horrors, the release of the product might have had to wait until the formal
release was done!

There is also that a) it looks like a split in the Linux "community" if the
others chose to ship the official glibc/gcc (with incompatibilities), or b)
it looks like Red Hat was leading the "pack" again and everyone else had to
follow because of it.

> Certainly the LSB will hopefully solve many of these problems.
> Unfortunately, the LSB isn't ready yet. Getting more people to help
> work on the LSB would be a big help on that score.

This was a huge setback for the LSB. The LSB's job definitely is to specify
the ABI, but why do we need an LSB if vendors just ship whatever they want?

> (*) I note Ulrich has yet to make a public statement guaranteeing that
> there won't be any ABI compatibility problems between RH 7.0 and glibc
> 2.2. I am still hoping there won't be any (knock on wood)

Most definitely. I do hope for everyone that glibc 2.2 will be compatible.

Because if there are any incompatibilities, we'll either see big ISVs
pressing everyone else into using / providing a compatible glibc (if that can
be easily done) or the other way around. Both of which have their own
political issues.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
    Development HA

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: RE: Soft-Updates for Linux ?
On 2000-10-01T11:50:10, Ernesto Vargas <[EMAIL PROTECTED]> said:

> What of those journalled file systems are more prominent to success 2.5.

ext3 is stable on my laptop.

reiserfs is stable at SuSE on a 250 GB RAID with 2.2 million files.

XFS has IMHO the best chance to surpass both in server environments, and GFS
is also a very good candidate for (some) clusters.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
    Development HA

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: Availability of kdb
On 2000-09-11T18:11:11, Jamie Lokier <[EMAIL PROTECTED]> said:

> I still don't see how processor traces will tell me what ordering
> guarantees I can rely on across the range of processors.

It will tell you when your assumptions were wrong.

Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>
    Development HA

--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
Re: Availability of kdb
On 2000-09-06T12:52:29, Linus Torvalds <[EMAIL PROTECTED]> said: I do agree with your assessment. Except for a single point: > And quite frankly, for most of the real problems (as opposed to the stupid > bugs - of which there are many, as the latest crap with "truncate()" has > shown us) a debugger doesn't much help. And the real problems are what I > worry about. The rest is just details. It will get fixed eventually. I want these to get fixed _soon_. Not later. I want fixing these to be made as easy as possible, because they are annoying and don't have to be, because the "simple" problems are more easily spotted with a debugger. If a real problem gets spotted more easily and fixed correctly, all the better. If some people are able to better understand the code and what it does with a debugger, all the better. The community peer review will weed out incorrect patches. I doubt there will be too many of them or that people will suddenly start patch bombing l-k with crap because a debugger has been added. Hell, if a debugger only supplies a single clue to make people go "Duh, I was looking in the entirely wrong direction, better get to understand how the networking code works instead of hunting it in the VM layer", it is well worth it. I don't need quick and shallow fixes any more than you do. I want the code to stabilize, and do The Right Thing(tm). For all "fixes" which lead to clumsy, harder to understand code - see figure 1. For those fixes which can be spotted because we can easily get a consistent view of what happened on the system at the time it crashed - no more "What processes were you running, try disabling and see if it persists" -, debugger output which makes you go "Duh, I have a typo there", I _do_ care for a debugger. > I do realize that others disagree. And I'm not your Mom. You can use a > kernel debugger if you want to, and I won't give you the cold shoulder > because you have "sullied" yourself. 
But I'm not going to help you use > one, and I would frankly prefer people not to use kernel debuggers that > much. So I don't make it part of the standard distribution, and if the > existing debuggers aren't very well known I won't shed a tear over it. > > Because I'm a bastard, and proud of it! > > Linus This makes for a good Think Geek shirt, and I will personally buy one and try to get it signed by Alan. But for trying to shepherd a project the size of Linux, I do think it is the wrong attitude if taken to extremes. Not trying to make it easier for the masses to provide better QA data, or even fix the stupid errors themselves, so that the core team can focus on the real hard bugs and design issues, I call that wrong. Sincerely, Lars Marowsky-Brée <[EMAIL PROTECTED]> Development HA -- Perfection is our goal, excellence will be tolerated. -- J. Yahl