Re: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
On Thursday 21 July 2005 02:55, Walker, Bruce J (HP-Labs) wrote:
> Like Lars, I too was under the wrong impression about this configfs
> "nodemanager" kernel component.  Our discussions in the cluster meeting
> Monday and Tuesday were assuming it was a general service that other
> kernel components could/would utilize and possibly also something that
> could send uevents to non-kernel components wanting a std. way to see
> membership information/events.
>
> As to kernel components without corresponding user-level "managers", look
> no farther than OpenSSI.  Our hope was that we could adapt to a user-land
> membership service and this interface thru configfs would drive all our
> kernel subsystems.

Guys, it is absolutely stupid to rely on a virtual filesystem for
userspace/kernel communication for any events that might have to be
transmitted inside the block IO path.  This includes, among other things,
membership events.  Inserting a virtual filesystem into this path does
nothing but add long call chains and new, hard-to-characterize memory
usage.  There are already tried-and-true interfaces that are designed to do
this kind of job efficiently and with quantifiable resource requirements:
sockets (UNIX domain or netlink) and ioctls.

If you want to layer a virtual filesystem on top as a user-friendly way to
present the current cluster configuration, or as a way to provide some
administrator knobs, then fine; virtual filesystems are good for this kind
of thing.  But please do not try to insinuate that bloat into the block IO
path.

Regards,

Daniel
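As an illustration of the socket-based delivery Daniel prefers, a minimal
userspace sketch follows.  It is not the interface of any existing
membership manager; the socket path and the one-line-per-event format are
assumptions made only for the example.

/*
 * Illustrative only: a minimal userspace listener that receives
 * membership events over a UNIX domain socket.  The socket path and
 * the one-line-per-event text format are assumed for this sketch.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

int main(void)
{
	struct sockaddr_un addr;
	char buf[256];
	ssize_t n;
	int fd;

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return 1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strncpy(addr.sun_path, "/var/run/membership.sock",	/* assumed path */
		sizeof(addr.sun_path) - 1);

	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		return 1;

	/* each read returns e.g. "node 3 up\n" or "node 3 down\n" (assumed) */
	while ((n = read(fd, buf, sizeof(buf) - 1)) > 0) {
		buf[n] = '\0';
		printf("membership event: %s", buf);
	}
	close(fd);
	return 0;
}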
Re: [Clusters_sig] RE: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
On 2005-07-20T11:39:38, Joel Becker <[EMAIL PROTECTED]> wrote:

> 	In turn, let me clarify a little where configfs fits in to
> things.  Configfs is merely a convenient and transparent method to
> communicate configuration to kernel objects.  It's not a place for
> uevents, for netlink sockets, or for fancy communication.  It allows
> userspace to create an in-kernel object and set/get values on that
> object.  It also allows userspace and kernelspace to share the same
> representation of that object and its values.
> 	For more complex interaction, sysfs and procfs are often more
> appropriate.  While you might "configure" all known nodes in configfs,
> the node up/down state might live in sysfs.  A netlink socket for
> up/down events might live in procfs.  And so on.

Right.  Thanks for the clarification and elaboration, for I am sure I am
not entirely clear as to how all these mechanisms relate in detail, what
is appropriate just where, and when to use something more classic like
ioctl etc... ;-)

FWIW, we didn't mean to get uevents out via configfs, of course.


Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business   -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"
Re: [Clusters_sig] RE: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
On Wed, Jul 20, 2005 at 08:09:18PM +0200, Lars Marowsky-Bree wrote:
> On 2005-07-20T09:55:31, "Walker, Bruce J (HP-Labs)" <[EMAIL PROTECTED]> wrote:
> > Like Lars, I too was under the wrong impression about this configfs
> > "nodemanager" kernel component.  Our discussions in the cluster
> > meeting Monday and Tuesday were assuming it was a general service that
> > other kernel components could/would utilize and possibly also
> > something that could send uevents to non-kernel components wanting a
> > std. way to see membership information/events.
>
> Let me clarify that this was something we briefly touched on in
> Walldorf: The node manager would (re-)export the current data via sysfs
> (which would result in uevents being sent, too), and not something we
> dreamed up just Monday ;-)

	In turn, let me clarify a little where configfs fits in to
things.  Configfs is merely a convenient and transparent method to
communicate configuration to kernel objects.  It's not a place for
uevents, for netlink sockets, or for fancy communication.  It allows
userspace to create an in-kernel object and set/get values on that
object.  It also allows userspace and kernelspace to share the same
representation of that object and its values.
	For more complex interaction, sysfs and procfs are often more
appropriate.  While you might "configure" all known nodes in configfs,
the node up/down state might live in sysfs.  A netlink socket for
up/down events might live in procfs.  And so on.

Joel

-- 
"But all my words come back to me
 In shades of mediocrity.
 Like emptiness in harmony
 I need someone to comfort me."

Joel Becker
Senior Member of Technical Staff
Oracle
E-mail: [EMAIL PROTECTED]
Phone: (650) 506-8127
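To illustrate the split Joel describes -- configuration written through
configfs, current state read back from sysfs -- here is a small sketch of
the read side.  The sysfs layout and the attribute name are assumptions
made for the example, not something any of the components discussed here
currently exports.

/*
 * Illustrative only: how a userspace tool might query per-node state if
 * it were exported through sysfs, as Joel suggests.  The path layout
 * ("/sys/cluster/nodes/<id>/state") and the "up"/"down" contents are
 * assumed for this sketch.
 */
#include <stdio.h>

int node_is_up(int nodeid)
{
	char path[128], state[16];
	FILE *f;

	snprintf(path, sizeof(path), "/sys/cluster/nodes/%d/state", nodeid);
	f = fopen(path, "r");
	if (!f)
		return -1;	/* node not configured */
	if (!fgets(state, sizeof(state), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return state[0] == 'u';	/* "up" */
}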
Re: [Clusters_sig] RE: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
On 2005-07-20T09:55:31, "Walker, Bruce J (HP-Labs)" <[EMAIL PROTECTED]> wrote: > Like Lars, I too was under the wrong impression about this configfs > "nodemanager" kernel component. Our discussions in the cluster > meeting Monday and Tuesday were assuming it was a general service that > other kernel components could/would utilize and possibly also > something that could send uevents to non-kernel components wanting a > std. way to see membership information/events. Let me clarify that this was something we briefly touched on in Walldorf: The node manager would (re-)export the current data via sysfs (which would result in uevents being sent, too), and not something we dreamed up just Monday ;-) > As to kernel components without corresponding user-level "managers", > look no farther than OpenSSI. Our hope was that we could adapt to a > user-land membership service and this interface thru configfs would > drive all our kernel subsystems. Well, node manager still can provide you the input as to which nodes are configured, which in a way translates to "membership". The thing it doesn't seem to provide yet is the supsend/modify/resume cycle which for example the RHAT DLM seems to require. Sincerely, Lars Marowsky-Brée <[EMAIL PROTECTED]> -- High Availability & Clustering SUSE Labs, Research and Development SUSE LINUX Products GmbH - A Novell Business -- Charles Darwin "Ignorance more frequently begets confidence than does knowledge" - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
Like Lars, I too was under the wrong impression about this configfs
"nodemanager" kernel component.  Our discussions in the cluster meeting
Monday and Tuesday were assuming it was a general service that other
kernel components could/would utilize and possibly also something that
could send uevents to non-kernel components wanting a std. way to see
membership information/events.

As to kernel components without corresponding user-level "managers", look
no farther than OpenSSI.  Our hope was that we could adapt to a user-land
membership service and this interface thru configfs would drive all our
kernel subsystems.

Bruce Walker
OpenSSI Cluster project

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Lars Marowsky-Bree
Sent: Wednesday, July 20, 2005 9:27 AM
To: David Teigland
Cc: [EMAIL PROTECTED]; linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
Subject: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm

On 2005-07-20T11:35:46, David Teigland <[EMAIL PROTECTED]> wrote:

> > Also, eventually we obviously need to have state for the nodes -
> > up/down et cetera.  I think the node manager also ought to track this.
> We don't have a need for that information yet; I'm hoping we won't
> ever need it in the kernel, but we'll see.

Hm, I'm thinking a service might have a good reason to want to know the
possible list of nodes as opposed to the currently active membership;
though the DLM as the service in question right now does not appear to
need such.  But, see below.

> There are at least two ways to handle this:
>
> 1. Pass cluster events and data into the kernel (this sounds like what
> you're talking about above), notify the affected kernel components,
> each kernel component takes the cluster data and does whatever it
> needs to with it (internal adjustments, recovery, etc).
>
> 2. Each kernel component "foo-kernel" has an associated user space
> component "foo-user".  Cluster events (from userland clustering
> infrastructure) are passed to foo-user -- not into the kernel.
> foo-user determines what the specific consequences are for foo-kernel.
> foo-user then manipulates foo-kernel accordingly, through user/kernel
> hooks (sysfs, configfs, etc).  These control hooks would largely be
> specific to foo.
>
> We're following option 2 with the dlm and gfs and have been for quite
> a while, which means we don't need 1.  I think ocfs2 is moving that
> way, too.  Someone could still try 1, of course, but it would be of no
> use or interest to me.  I'm not aware of any actual projects pushing
> forward with something like 1, so the persistent reference to it is
> somewhat baffling.

Right.  I thought that the node manager changes for generalizing it were
pushing in sort-of direction 1.  Thanks for clearing this up.


Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business   -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"
Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
On 2005-07-20T11:35:46, David Teigland <[EMAIL PROTECTED]> wrote:

> > Also, eventually we obviously need to have state for the nodes - up/down
> > et cetera.  I think the node manager also ought to track this.
> We don't have a need for that information yet; I'm hoping we won't ever
> need it in the kernel, but we'll see.

Hm, I'm thinking a service might have a good reason to want to know the
possible list of nodes as opposed to the currently active membership;
though the DLM as the service in question right now does not appear to
need such.  But, see below.

> There are at least two ways to handle this:
>
> 1. Pass cluster events and data into the kernel (this sounds like what
> you're talking about above), notify the affected kernel components, each
> kernel component takes the cluster data and does whatever it needs to with
> it (internal adjustments, recovery, etc).
>
> 2. Each kernel component "foo-kernel" has an associated user space
> component "foo-user".  Cluster events (from userland clustering
> infrastructure) are passed to foo-user -- not into the kernel.  foo-user
> determines what the specific consequences are for foo-kernel.  foo-user
> then manipulates foo-kernel accordingly, through user/kernel hooks (sysfs,
> configfs, etc).  These control hooks would largely be specific to foo.
>
> We're following option 2 with the dlm and gfs and have been for quite a
> while, which means we don't need 1.  I think ocfs2 is moving that way,
> too.  Someone could still try 1, of course, but it would be of no use or
> interest to me.  I'm not aware of any actual projects pushing forward with
> something like 1, so the persistent reference to it is somewhat baffling.

Right.  I thought that the node manager changes for generalizing it were
pushing in sort-of direction 1.  Thanks for clearing this up.


Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business   -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"
Re: [Linux-cluster] [RFC] nodemanager, ocfs2, dlm
On Tue, Jul 19, 2005 at 05:48:26PM -0700, Mark Fasheh wrote:

> For OCFS2 that would mean that an ocfs2_nodemanager would still exist,
> but as a much smaller module sitting on top of 'nodemanager'.

Yep, factoring out the common bits.

> So no port attribute.  The OCFS2 network code normally takes port from the
> node manager in order to determine how to talk to a given node.  We'll have
> to figure out how to resolve that.  The easiest would be to add 'port' back,
> but I think that might be problematic if we have multiple cluster network
> infrastructures as we do today.

The port is specific to the component using it (ocfs2, dlm, etc), so
defining port as a node property doesn't make sense if nodemanager is
providing node info to multiple components.

> Another way to handle this would be to have userspace symlink to the node
> items as an attribute on an ocfs2_tcp item.  We could store 'port' as a
> second attribute.  This would have the added benefit of pinning node
> information while OCFS2 uses it.

I expect each component will probably use another per-node configfs object
for component-specific attributes, using the common bits from the
nodemanager object.

> > +	char			nd_name[NODEMANAGER_MAX_NAME_LEN+1];
> An accessor function for this would be nice for pretty prints - maybe
> strcpy into a passed string.

ok

> > +	int			nd_nodeid;
> This definitely won't work with OCFS2... Nodeid (what used to be called
> node_num) needs to be unsigned.  Otherwise this will break all our nodemap
> stuff which uses a bitmap to represent cluster state.

ok

> > +	struct list_head	nd_status_list;
> What are these two for? They don't seem to be referenced elsewhere...

Missed ripping them out with the other ocfs-specific stuff.

> > +	if (!tmp && cluster->cl_has_local &&
> > +	    cluster->cl_local_node == node->nd_nodeid) {
> > +		cluster->cl_local_node = 0;
> I think we might want to be setting cl_local_node to NODEMANAGER_MAX_NODES
> here.  It seems that ocfs2_nodemanager also does this so we might have just
> caught a bug you inherited :)

yep

> You removed o2nm_configured_node_map but we need some sort of method for
> enumerating over the set of configured nodes.
>
> Also we need a method for querying the existence of a node.
> The OCFS2 code usually uses o2nm_get_node_by_num(..) != NULL for this but a
> simple boolean api call would be cleaner and would avoid exposing the node
> structure.

Right, those should be on the TODO.

Thanks,
Dave
Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
On Tue, Jul 19, 2005 at 05:52:14PM +0200, Lars Marowsky-Bree wrote:

> The nodeid, I thought, was relative to a given DLM namespace, no?  This
> concept seems to be missing here, or are you suggesting the nodeid to be
> global across namespaces?

I'm not sure I understand what you mean.  A node would have the same
nodeid across different dlm locking-domains, assuming, of course, those
dlm domains were in the context of the same cluster.  The dlm only uses
nodemanager to look up node addresses, though.

> Also, eventually we obviously need to have state for the nodes - up/down
> et cetera.  I think the node manager also ought to track this.

We don't have a need for that information yet; I'm hoping we won't ever
need it in the kernel, but we'll see.

> How would kernel components use this and be notified about changes to
> the configuration / membership state?

"Nodemanager" is perhaps a poor name; at the moment its only substantial
purpose is to communicate node address/id associations in a way that's
independent of a specific driver or fs.

Changes to cluster configuration/membership happen in user space, of
course.  Those general events will have specific consequences for a given
component (fs, lock manager, etc).  These consequences vary quite widely
depending on the component you're looking at.

There are at least two ways to handle this:

1. Pass cluster events and data into the kernel (this sounds like what
you're talking about above), notify the affected kernel components, each
kernel component takes the cluster data and does whatever it needs to with
it (internal adjustments, recovery, etc).

2. Each kernel component "foo-kernel" has an associated user space
component "foo-user".  Cluster events (from userland clustering
infrastructure) are passed to foo-user -- not into the kernel.  foo-user
determines what the specific consequences are for foo-kernel.  foo-user
then manipulates foo-kernel accordingly, through user/kernel hooks (sysfs,
configfs, etc).  These control hooks would largely be specific to foo.

We're following option 2 with the dlm and gfs and have been for quite a
while, which means we don't need 1.  I think ocfs2 is moving that way,
too.  Someone could still try 1, of course, but it would be of no use or
interest to me.  I'm not aware of any actual projects pushing forward with
something like 1, so the persistent reference to it is somewhat baffling.

Dave
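A minimal sketch of the "option 2" shape David describes may help: a
userspace helper (foo-user) receives a generic membership event from the
userland clustering infrastructure and translates it into writes to
component-specific kernel hooks.  The paths and helper names below are
hypothetical, not the actual dlm, gfs or ocfs2 control files.

/*
 * Illustrative only: the userspace side ("foo-user") of option 2.
 * Event delivery, hook paths and the add/remove semantics are all
 * assumptions made for this sketch.
 */
#include <stdio.h>

enum ev_type { NODE_UP, NODE_DOWN };

struct member_event {
	enum ev_type type;
	unsigned int nodeid;
};

/* hypothetical kernel hook: one file per operation under configfs/sysfs */
static int write_hook(const char *path, unsigned int nodeid)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	fprintf(f, "%u\n", nodeid);
	return fclose(f);
}

/* foo-user decides what the cluster event means for foo-kernel */
int foo_user_handle(const struct member_event *ev)
{
	switch (ev->type) {
	case NODE_UP:
		return write_hook("/config/foo/add_node", ev->nodeid);
	case NODE_DOWN:
		/* e.g. tell foo-kernel to start recovery for the dead node */
		return write_hook("/config/foo/remove_node", ev->nodeid);
	}
	return -1;
}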
Re: [Linux-cluster] [RFC] nodemanager, ocfs2, dlm
Hi David,

On Mon, Jul 18, 2005 at 02:15:53PM +0800, David Teigland wrote:
> Some of the comments about the dlm concerned how it's configured (from
> user space.)  In particular, there was interest in seeing the dlm and
> ocfs2 use common methods for their configuration.
>
> The first area I'm looking at is how we get addresses/ids of other nodes.

Right.  So this doesn't take into account other parts of node management
(communication, heartbeat, etc).  OCFS2 and dlm would still be handling
that stuff on their own for now.  For OCFS2 that would mean that an
ocfs2_nodemanager would still exist, but as a much smaller module sitting
on top of 'nodemanager'.

> I've taken a stab at generalizing ocfs2_nodemanager so the dlm could use
> it (removing ocfs-specific stuff).  It still needs some work, but I'd like
> to know if this appeals to the ocfs group and to others who were
> interested in seeing some similarity in dlm/ocfs configuration.

While I agree that some things look like they still need a bit of work, I
like the direction this is taking - thanks for getting this ball rolling.
My questions and comments below:

> +enum {
> +	NM_NODE_ATTR_NODEID = 0,
> +	NM_NODE_ATTR_ADDRESS,
> +	NM_NODE_ATTR_LOCAL,
> +};

So no port attribute.  The OCFS2 network code normally takes port from the
node manager in order to determine how to talk to a given node.  We'll have
to figure out how to resolve that.  The easiest would be to add 'port' back,
but I think that might be problematic if we have multiple cluster network
infrastructures as we do today.

Another way to handle this would be to have userspace symlink to the node
items as an attribute on an ocfs2_tcp item.  We could store 'port' as a
second attribute.  This would have the added benefit of pinning node
information while OCFS2 uses it.

> +struct node {
> +	spinlock_t		nd_lock;
> +	struct config_item	nd_item;
> +	char			nd_name[NODEMANAGER_MAX_NAME_LEN+1];

An accessor function for this would be nice for pretty prints - maybe
strcpy into a passed string.

> +	int			nd_nodeid;

This definitely won't work with OCFS2... Nodeid (what used to be called
node_num) needs to be unsigned.  Otherwise this will break all our nodemap
stuff which uses a bitmap to represent cluster state.

> +	u32			nd_ipv4_address;
> +	struct rb_node		nd_ip_node;
> +	int			nd_local;
> +	unsigned long		nd_set_attributes;
> +	struct idr		nd_status_idr;
> +	struct list_head	nd_status_list;

What are these two for? They don't seem to be referenced elsewhere...

> +static ssize_t node_local_write(struct node *node, const char *page,
> +				size_t count)
> +{
> +	struct cluster *cluster = node_to_cluster(node);
> +	unsigned long tmp;
> +	char *p = (char *)page;
> +
> +	tmp = simple_strtoul(p, &p, 0);
> +	if (!p || (*p && (*p != '\n')))
> +		return -EINVAL;
> +
> +	tmp = !!tmp; /* boolean of whether this node wants to be local */
> +
> +	/* the only failure case is trying to set a new local node
> +	 * when a different one is already set */
> +
> +	if (tmp && tmp == cluster->cl_has_local &&
> +	    cluster->cl_local_node != node->nd_nodeid)
> +		return -EBUSY;
> +
> +	if (!tmp && cluster->cl_has_local &&
> +	    cluster->cl_local_node == node->nd_nodeid) {
> +		cluster->cl_local_node = 0;

I think we might want to be setting cl_local_node to NODEMANAGER_MAX_NODES
here.  It seems that ocfs2_nodemanager also does this so we might have just
caught a bug you inherited :)

You removed o2nm_configured_node_map but we need some sort of method for
enumerating over the set of configured nodes.

Also we need a method for querying the existence of a node.
The OCFS2 code usually uses o2nm_get_node_by_num(..) != NULL for this but a
simple boolean api call would be cleaner and would avoid exposing the node
structure.

> diff -urN a/drivers/nodemanager/nodemanager.h b/drivers/nodemanager/nodemanager.h
> --- a/drivers/nodemanager/nodemanager.h	1970-01-01 07:30:00.000000000 +0730
> +++ b/drivers/nodemanager/nodemanager.h	2005-07-18 13:41:35.377583200 +0800
> @@ -0,0 +1,37 @@
> +/*
> + * nodemanager.h
> + *
> + * Copyright (C) 2004 Oracle.  All rights reserved.
> + * Copyright (C) 2005 Red Hat, Inc.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public
> + * License along with this program; if not, write to the
> + * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
> + * Boston, MA 021110-1307, USA.
> + *
> + */
> +
> +#ifndef NODEMANAGER_H
> +#define NODEMANAGER_H
> +
> +#define NODEMANAGER_MAX_NODES		255
> +#define NODEMANAGER_INVALID_NODE_NUM	255
> +#define
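The boolean existence check Mark asks for could be a thin wrapper over a
lookup-by-number call.  A sketch follows; nm_get_node_by_num() and
nm_node_put() are hypothetical stand-ins for whatever accessors the
generic nodemanager eventually exports, not functions in the patch above.

/*
 * Illustrative only: the "simple boolean api call" Mark suggests.
 * Both helpers called here are hypothetical.
 */
int nm_node_exists(unsigned int nodeid)
{
	struct node *node = nm_get_node_by_num(nodeid);	/* hypothetical lookup */

	if (!node)
		return 0;
	nm_node_put(node);	/* hypothetical: drop the reference taken above */
	return 1;
}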
Re: [Linux-cluster] [RFC] nodemanager, ocfs2, dlm
On Monday 18 July 2005 16:15, David Teigland wrote:
> I've taken a stab at generalizing ocfs2_nodemanager so the dlm could use
> it (removing ocfs-specific stuff).  It still needs some work, but I'd
> like to know if this appeals to the ocfs group and to others who were
> interested in seeing some similarity in dlm/ocfs configuration.

Let me get this straight.  The proposal is to expose cluster membership as
a virtual filesystem and use that as the primary membership interface?  So
that, e.g., a server on the cluster does a getdents to find out what nodes
are in the cluster, or uses inotify to learn about membership changes,
instead of subscribing for and receiving membership events directly from
the cluster membership manager?

Or what is this about: just providing a nice, friendly view of the cluster
to the administrator, not intended to be used by cluster infrastructure
components?

Regards,

Daniel
Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
On 2005-07-18T14:15:53, David Teigland <[EMAIL PROTECTED]> wrote:

> Some of the comments about the dlm concerned how it's configured (from
> user space.)  In particular, there was interest in seeing the dlm and
> ocfs2 use common methods for their configuration.
>
> The first area I'm looking at is how we get addresses/ids of other nodes.
> Currently, the dlm uses an ioctl on a misc device and ocfs2 uses a
> separate kernel module called "ocfs2_nodemanager" that's based on
> configfs.
>
> I've taken a stab at generalizing ocfs2_nodemanager so the dlm could use
> it (removing ocfs-specific stuff).  It still needs some work, but I'd like
> to know if this appeals to the ocfs group and to others who were
> interested in seeing some similarity in dlm/ocfs configuration.

Hi Dave,

I finally found time to read through this.  Yes, I most definitely like
where this is going!

> +/* TODO:
> +   - generic addresses (IPV4/6)
> +   - multiple addresses per node

The nodeid, I thought, was relative to a given DLM namespace, no?  This
concept seems to be missing here, or are you suggesting the nodeid to be
global across namespaces?

Also, eventually we obviously need to have state for the nodes - up/down
et cetera.  I think the node manager also ought to track this.

How would kernel components use this and be notified about changes to
the configuration / membership state?


Sincerely,
    Lars Marowsky-Brée <[EMAIL PROTECTED]>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business   -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"
[RFC] nodemanager, ocfs2, dlm
Some of the comments about the dlm concerned how it's configured (from
user space.)  In particular, there was interest in seeing the dlm and
ocfs2 use common methods for their configuration.

The first area I'm looking at is how we get addresses/ids of other nodes.
Currently, the dlm uses an ioctl on a misc device and ocfs2 uses a
separate kernel module called "ocfs2_nodemanager" that's based on
configfs.

I've taken a stab at generalizing ocfs2_nodemanager so the dlm could use
it (removing ocfs-specific stuff).  It still needs some work, but I'd like
to know if this appeals to the ocfs group and to others who were
interested in seeing some similarity in dlm/ocfs configuration.

Thanks,
Dave

diff -urN a/drivers/Kconfig b/drivers/Kconfig
--- a/drivers/Kconfig	2005-07-18 13:40:31.011368352 +0800
+++ b/drivers/Kconfig	2005-07-18 13:46:17.661669496 +0800
@@ -68,4 +68,6 @@
 source "drivers/dlm/Kconfig"
 
+source "drivers/nodemanager/Kconfig"
+
 endmenu
diff -urN a/drivers/Makefile b/drivers/Makefile
--- a/drivers/Makefile	2005-07-18 13:40:31.015367744 +0800
+++ b/drivers/Makefile	2005-07-18 13:46:06.846313680 +0800
@@ -70,3 +70,4 @@
 obj-y				+= firmware/
 obj-$(CONFIG_CRYPTO)		+= crypto/
 obj-$(CONFIG_DLM)		+= dlm/
+obj-$(CONFIG_NODEMANAGER)	+= nodemanager/
diff -urN a/drivers/nodemanager/Kconfig b/drivers/nodemanager/Kconfig
--- a/drivers/nodemanager/Kconfig	1970-01-01 07:30:00.000000000 +0730
+++ b/drivers/nodemanager/Kconfig	2005-07-18 13:52:16.449125512 +0800
@@ -0,0 +1,9 @@
+menu "Node Manager"
+
+config NODEMANAGER
+	tristate "Node Manager"
+	help
+	  Node addresses and IDs are provided from user space and made
+	  available to kernel components from this module.
+
+endmenu
diff -urN a/drivers/nodemanager/Makefile b/drivers/nodemanager/Makefile
--- a/drivers/nodemanager/Makefile	1970-01-01 07:30:00.000000000 +0730
+++ b/drivers/nodemanager/Makefile	2005-07-18 13:45:52.620476336 +0800
@@ -0,0 +1,3 @@
+obj-$(CONFIG_NODEMANAGER) += nodemanager.o
+
+nodemanager-y := nodemanager.o
diff -urN a/drivers/nodemanager/nodemanager.c b/drivers/nodemanager/nodemanager.c
--- a/drivers/nodemanager/nodemanager.c	1970-01-01 07:30:00.000000000 +0730
+++ b/drivers/nodemanager/nodemanager.c	2005-07-18 13:55:17.043670968 +0800
@@ -0,0 +1,655 @@
+/*
+ * nodemanager.c
+ *
+ * Copyright (C) 2004, 2005 Oracle.  All rights reserved.
+ * Copyright (C) 2005 Red Hat, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+/* TODO:
+   - generic addresses (IPV4/6)
+   - multiple addresses per node
+   - more than 255 nodes (no static MAXNODE array)
+   - function to get a list of all nodes
+*/
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/idr.h>
+#include <linux/configfs.h>
+
+#include "nodemanager.h"
+
+enum {
+	NM_NODE_ATTR_NODEID = 0,
+	NM_NODE_ATTR_ADDRESS,
+	NM_NODE_ATTR_LOCAL,
+};
+
+struct clusters;
+struct cluster;
+struct nodes;
+struct node;
+
+static ssize_t node_nodeid_read(struct node *, char *);
+static ssize_t node_nodeid_write(struct node *, const char *, size_t);
+static ssize_t node_ipv4_address_read(struct node *, char *);
+static ssize_t node_ipv4_address_write(struct node *, const char *, size_t);
+static ssize_t node_local_read(struct node *, char *);
+static ssize_t node_local_write(struct node *, const char *, size_t);
+
+static struct config_item *make_node(struct config_group *, const char *);
+static void drop_node(struct config_group *, struct config_item *);
+static void release_node(struct config_item *);
+static struct config_group *make_cluster(struct config_group *, const char *);
+static void drop_cluster(struct config_group *, struct config_item *);
+static void release_cluster(struct config_item *);
+
+static ssize_t show_node(struct config_item *, struct configfs_attribute *,
+			 char *);
+static ssize_t store_node(struct config_item *, struct configfs_attribute *,
+			  const char *, size_t);
+
+
+struct node_attribute {
+	struct configfs_attribute attr;
+	ssize_t (*show)(struct node *, char *);
+	ssize_t (*store)(struct node *, const char *, size_t);
+};
+
+static struct node_attribute node_attr_nodeid = {
+	.attr	= { .ca_owner
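For reference, a sketch of how userspace might populate the proposed
configfs tree.  The configfs mount point and directory layout are
assumptions (the patch above is truncated before the group definitions),
and the attribute file names nodeid, ipv4_address and local are inferred
from the attribute handler names in the patch, so they may differ.

/*
 * Illustrative only: configuring one node through the proposed
 * nodemanager configfs interface.  All paths here are assumed.
 */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

static int set_attr(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	fprintf(f, "%s\n", val);
	return fclose(f);
}

int main(void)
{
	/* create a cluster group and one node item under it (assumed layout) */
	mkdir("/config/nodemanager/mycluster", 0755);
	mkdir("/config/nodemanager/mycluster/node1", 0755);

	/* set the per-node attributes suggested by the patch's handlers */
	set_attr("/config/nodemanager/mycluster/node1/nodeid", "1");
	set_attr("/config/nodemanager/mycluster/node1/ipv4_address", "192.168.0.1");
	set_attr("/config/nodemanager/mycluster/node1/local", "1");
	return 0;
}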