Re: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm

2005-07-21 Thread Daniel Phillips
On Thursday 21 July 2005 02:55, Walker, Bruce J (HP-Labs) wrote:
> Like Lars, I too was under the wrong impression about this configfs
> "nodemanager" kernel component.  Our discussions in the cluster meeting
> Monday and Tuesday were assuming it was a general service that other
> kernel components could/would utilize and possibly also something that
> could send uevents to non-kernel components wanting a std. way to see
> membership information/events.
>
> As to kernel components without corresponding user-level "managers", look
> no further than OpenSSI.  Our hope was that we could adapt to a user-land
> membership service and that this interface through configfs would drive all
> our kernel subsystems.

Guys, it is absolutely stupid to rely on a virtual filesystem for 
userspace/kernel communication for any events that might have to be 
transmitted inside the block IO path.  This includes, among other things, 
membership events.  Inserting a virtual filesystem into this path does 
nothing but add long call chains and new, hard-to-characterize memory 
usage.

There are already tried-and-true interfaces that are designed to do this 
kind of job efficiently and with quantifiable resource requirements: 
sockets (UNIX domain or netlink) and ioctls.  If you want to layer a 
virtual filesystem on top as a user-friendly way to present the current cluster 
configuration or as a way to provide some administrator knobs, then fine, 
virtual filesystems are good for this kind of thing.  But please do not try 
to insinuate that bloat into the block IO path.
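
(Purely for illustration: a userspace consumer of such a socket interface
might look roughly like the sketch below.  The socket path and the one-line
event format are invented; the point is only that the memory and call-chain
cost of this path is trivial to account for.)

/* Illustrative only: read newline-delimited membership events
 * ("up <nodeid>" / "down <nodeid>") from a hypothetical cluster
 * manager socket.  Path and format are made up. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

#define MEMB_SOCK "/var/run/cluster/membership"   /* hypothetical */

int main(void)
{
    struct sockaddr_un sun;
    char buf[256];
    ssize_t n;
    int fd;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return 1;

    memset(&sun, 0, sizeof(sun));
    sun.sun_family = AF_UNIX;
    strncpy(sun.sun_path, MEMB_SOCK, sizeof(sun.sun_path) - 1);
    if (connect(fd, (struct sockaddr *)&sun, sizeof(sun)) < 0)
        return 1;

    /* Block on the socket; each event costs one syscall and one small,
     * fixed buffer -- nothing hard to characterize. */
    while ((n = read(fd, buf, sizeof(buf) - 1)) > 0) {
        buf[n] = '\0';
        printf("membership event: %s", buf);
    }
    close(fd);
    return 0;
}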

Regards,

Daniel


Re: [Clusters_sig] RE: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm

2005-07-21 Thread Lars Marowsky-Bree
On 2005-07-20T11:39:38, Joel Becker <[EMAIL PROTECTED]> wrote:

>   In turn, let me clarify a little where configfs fits into
> things.  Configfs is merely a convenient and transparent method to
> communicate configuration to kernel objects.  It's not a place for
> uevents, for netlink sockets, or for fancy communication.  It allows
> userspace to create an in-kernel object and set/get values on that
> object.  It also allows userspace and kernelspace to share the same
> representation of that object and its values.
>   For more complex interaction, sysfs and procfs are often more
> appropriate.  While you might "configure" all known nodes in configfs,
> the node up/down state might live in sysfs.  A netlink socket for
> up/down events might live in procfs.  And so on.

Right. Thanks for the clarification and elaboration, for I am certainly
not entirely clear on how all these mechanisms relate in detail, what is
appropriate where, and when to use something more classic like ioctl
etc... ;-)

FWIW, we didn't mean to get uevents out via configfs of course.


Sincerely,
Lars Marowsky-Brée <[EMAIL PROTECTED]>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"



Re: [Clusters_sig] RE: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm

2005-07-20 Thread Joel Becker
On Wed, Jul 20, 2005 at 08:09:18PM +0200, Lars Marowsky-Bree wrote:
> On 2005-07-20T09:55:31, "Walker, Bruce J (HP-Labs)" <[EMAIL PROTECTED]> wrote:
> 
> > Like Lars, I too was under the wrong impression about this configfs
> > "nodemanager" kernel component.  Our discussions in the cluster
> > meeting Monday and Tuesday were assuming it was a general service that
> > other kernel components could/would utilize and possibly also
> > something that could send uevents to non-kernel components wanting a
> > std. way to see membership information/events.
> 
> Let me clarify that this was something we briefly touched on in
> Walldorf: The node manager would (re-)export the current data via sysfs
> (which would result in uevents being sent, too), and not something we
> dreamed up just Monday ;-)

In turn, let me clarify a little where configfs fits into
things.  Configfs is merely a convenient and transparent method to
communicate configuration to kernel objects.  It's not a place for
uevents, for netlink sockets, or for fancy communication.  It allows
userspace to create an in-kernel object and set/get values on that
object.  It also allows userspace and kernelspace to share the same
representation of that object and its values.
For more complex interaction, sysfs and procfs are often more
appropriate.  While you might "configure" all known nodes in configfs,
the node up/down state might live in sysfs.  A netlink socket for
up/down events might live in procfs.  And so on.
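
(A purely illustrative sketch of that model, with hypothetical directory and
attribute names loosely patterned on the nodemanager layout: userspace
creates the object with mkdir() and sets its values with plain writes, and
the kernel side sees the same object.)

/* Illustrative only: configure one node through a configfs tree.
 * The mount point and attribute names here are hypothetical, and the
 * parent directories are assumed to exist already. */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

#define NODE_DIR "/config/cluster/mycluster/node/node7"   /* hypothetical */

static int set_attr(const char *attr, const char *val)
{
    char path[256];
    int fd, rc;

    snprintf(path, sizeof(path), NODE_DIR "/%s", attr);
    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    rc = write(fd, val, strlen(val)) < 0 ? -1 : 0;
    close(fd);
    return rc;
}

int main(void)
{
    /* mkdir() asks the kernel to instantiate the object ... */
    if (mkdir(NODE_DIR, 0755) < 0)
        return 1;
    /* ... and writes to its attribute files set the values that the
     * kernel object exposes; both sides share one representation. */
    if (set_attr("nodeid", "7") || set_attr("ipv4_address", "192.168.0.7"))
        return 1;
    return 0;
}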

Joel

-- 

"But all my words come back to me
 In shades of mediocrity.
 Like emptiness in harmony
 I need someone to comfort me."

Joel Becker
Senior Member of Technical Staff
Oracle
E-mail: [EMAIL PROTECTED]
Phone: (650) 506-8127


Re: [Clusters_sig] RE: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm

2005-07-20 Thread Lars Marowsky-Bree
On 2005-07-20T09:55:31, "Walker, Bruce J (HP-Labs)" <[EMAIL PROTECTED]> wrote:

> Like Lars, I too was under the wrong impression about this configfs
> "nodemanager" kernel component.  Our discussions in the cluster
> meeting Monday and Tuesday were assuming it was a general service that
> other kernel components could/would utilize and possibly also
> something that could send uevents to non-kernel components wanting a
> std. way to see membership information/events.

Let me clarify that this was something we briefly touched on in
Walldorf: The node manager would (re-)export the current data via sysfs
(which would result in uevents being sent, too), and not something we
dreamed up just Monday ;-)
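
(For illustration: a non-kernel component could pick up such uevents with a
plain NETLINK_KOBJECT_UEVENT socket, roughly as sketched below.  The
"nodemanager" subsystem string to filter on is hypothetical and depends on
how the kobjects end up being registered.)

/* Illustrative only: listen for kobject uevents and print them. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

int main(void)
{
    struct sockaddr_nl nl;
    char buf[4096];
    char *p;
    ssize_t n;
    int fd;

    memset(&nl, 0, sizeof(nl));
    nl.nl_family = AF_NETLINK;
    nl.nl_groups = 1;        /* kobject uevent multicast group */

    fd = socket(AF_NETLINK, SOCK_DGRAM, NETLINK_KOBJECT_UEVENT);
    if (fd < 0 || bind(fd, (struct sockaddr *)&nl, sizeof(nl)) < 0)
        return 1;

    for (;;) {
        /* One datagram per event: "action@devpath\0KEY=VALUE\0..." */
        n = recv(fd, buf, sizeof(buf) - 1, 0);
        if (n <= 0)
            break;
        buf[n] = '\0';
        printf("%s\n", buf);                 /* e.g. "change@/..." */
        for (p = buf; p < buf + n; p += strlen(p) + 1)
            if (!strncmp(p, "SUBSYSTEM=", 10))
                printf("  %s\n", p);         /* filter on this string */
    }
    close(fd);
    return 0;
}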

> As to kernel components without corresponding user-level "managers",
> look no further than OpenSSI.  Our hope was that we could adapt to a
> user-land membership service and that this interface through configfs
> would drive all our kernel subsystems.

Well, the node manager can still provide the input as to which nodes are
configured, which in a way translates to "membership". The thing it
doesn't seem to provide yet is the suspend/modify/resume cycle which, for
example, the RHAT DLM seems to require.


Sincerely,
Lars Marowsky-Brée <[EMAIL PROTECTED]>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"



RE: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm

2005-07-20 Thread Walker, Bruce J (HP-Labs)
Like Lars, I too was under the wrong impression about this configfs 
"nodemanager" kernel component.  Our discussions in the cluster meeting Monday 
and Tuesday were assuming it was a general service that other kernel components 
could/would utilize and possibly also something that could send uevents to 
non-kernel components wanting a std. way to see membership information/events.

As to kernel components without corresponding user-level "managers", look no 
further than OpenSSI.  Our hope was that we could adapt to a user-land 
membership service and that this interface through configfs would drive all our 
kernel subsystems.

Bruce Walker
OpenSSI Cluster project
 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Lars 
Marowsky-Bree
Sent: Wednesday, July 20, 2005 9:27 AM
To: David Teigland
Cc: [EMAIL PROTECTED]; linux-kernel@vger.kernel.org; [EMAIL PROTECTED]
Subject: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm

On 2005-07-20T11:35:46, David Teigland <[EMAIL PROTECTED]> wrote:

> > Also, eventually we obviously need to have state for the nodes - 
> > up/down et cetera. I think the node manager also ought to track this.
> We don't have a need for that information yet; I'm hoping we won't 
> ever need it in the kernel, but we'll see.

Hm, I'm thinking a service might have a good reason to want to know the 
list of possible nodes as opposed to the currently active membership; though 
the DLM, as the service in question right now, does not appear to need it.

But, see below.

> There are at least two ways to handle this:
> 
> 1. Pass cluster events and data into the kernel (this sounds like what 
> you're talking about above), notify the affected kernel components, 
> each kernel component takes the cluster data and does whatever it 
> needs to with it (internal adjustments, recovery, etc).
> 
> 2. Each kernel component "foo-kernel" has an associated user space 
> component "foo-user".  Cluster events (from userland clustering
> infrastructure) are passed to foo-user -- not into the kernel.  
> foo-user determines what the specific consequences are for foo-kernel.  
> foo-user then manipulates foo-kernel accordingly, through user/kernel 
> hooks (sysfs, configfs, etc).  These control hooks would largely be specific 
> to foo.
> 
> We're following option 2 with the dlm and gfs and have been for quite 
> a while, which means we don't need 1.  I think ocfs2 is moving that 
> way, too.  Someone could still try 1, of course, but it would be of no 
> use or interest to me.  I'm not aware of any actual projects pushing 
> forward with something like 1, so the persistent reference to it is somewhat 
> baffling.

Right. I thought that the node manager changes for generalizing it were 
pushing sort-of in direction 1. Thanks for clearing this up.



Sincerely,
Lars Marowsky-Brée <[EMAIL PROTECTED]>

--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"



Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm

2005-07-20 Thread Lars Marowsky-Bree
On 2005-07-20T11:35:46, David Teigland <[EMAIL PROTECTED]> wrote:

> > Also, eventually we obviously need to have state for the nodes - up/down
> > et cetera. I think the node manager also ought to track this.
> We don't have a need for that information yet; I'm hoping we won't ever
> need it in the kernel, but we'll see.

Hm, I'm thinking a service might have a good reason to want to know the
list of possible nodes as opposed to the currently active membership;
though the DLM, as the service in question right now, does not appear to
need it.

But, see below.

> There are at least two ways to handle this:
> 
> 1. Pass cluster events and data into the kernel (this sounds like what
> you're talking about above), notify the affected kernel components, each
> kernel component takes the cluster data and does whatever it needs to with
> it (internal adjustments, recovery, etc).
> 
> 2. Each kernel component "foo-kernel" has an associated user space
> component "foo-user".  Cluster events (from userland clustering
> infrastructure) are passed to foo-user -- not into the kernel.  foo-user
> determines what the specific consequences are for foo-kernel.  foo-user
> then manipulates foo-kernel accordingly, through user/kernel hooks (sysfs,
> configfs, etc).  These control hooks would largely be specific to foo.
> 
> We're following option 2 with the dlm and gfs and have been for quite a
> while, which means we don't need 1.  I think ocfs2 is moving that way,
> too.  Someone could still try 1, of course, but it would be of no use or
> interest to me.  I'm not aware of any actual projects pushing forward with
> something like 1, so the persistent reference to it is somewhat baffling.

Right. I thought that the node manager changes for generalizing it were
pushing sort-of in direction 1. Thanks for clearing this up.



Sincerely,
Lars Marowsky-Brée <[EMAIL PROTECTED]>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"



Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm

2005-07-19 Thread David Teigland
On Tue, Jul 19, 2005 at 05:52:14PM +0200, Lars Marowsky-Bree wrote:

> The nodeid, I thought, was relative to a given DLM namespace, no? This
> concept seems to be missing here, or are you suggesting the nodeid to be
> global across namespaces?

I'm not sure I understand what you mean.  A node would have the same
nodeid across different dlm locking-domains, assuming, of course, those
dlm domains were in the context of the same cluster.  The dlm only uses
nodemanager to look up node addresses, though.

> Also, eventually we obviously need to have state for the nodes - up/down
> et cetera. I think the node manager also ought to track this.

We don't have a need for that information yet; I'm hoping we won't ever
need it in the kernel, but we'll see.

> How would kernel components use this and be notified about changes to
> the configuration / membership state?

"Nodemanager" is perhaps a poor name; at the moment its only substantial
purpose is to communicate node address/id associations in a way that's
independent of a specific driver or fs.
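
(Conceptually, the data being communicated is little more than an
(id, address) pair per node -- something like the illustrative sketch
below, which is not the actual structure or API from the patch.)

/* Illustrative only: the shape of what nodemanager communicates and
 * what a consumer like the dlm asks it for. */
#include <netinet/in.h>

struct example_node {
    int                nodeid;   /* stable id, same across dlm domains  */
    struct sockaddr_in addr;     /* interconnect address (IPv4 for now) */
};

/* A lock manager would only need a lookup along these lines: */
int example_nodeid_to_addr(int nodeid, struct sockaddr_in *addr);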

Changes to cluster configuration/membership happen in user space, of
course.  Those general events will have specific consequences for a given
component (fs, lock manager, etc).  These consequences vary quite widely
depending on the component you're looking at.

There are at least two ways to handle this:

1. Pass cluster events and data into the kernel (this sounds like what
you're talking about above), notify the affected kernel components, each
kernel component takes the cluster data and does whatever it needs to with
it (internal adjustments, recovery, etc).

2. Each kernel component "foo-kernel" has an associated user space
component "foo-user".  Cluster events (from userland clustering
infrastructure) are passed to foo-user -- not into the kernel.  foo-user
determines what the specific consequences are for foo-kernel.  foo-user
then manipulates foo-kernel accordingly, through user/kernel hooks (sysfs,
configfs, etc).  These control hooks would largely be specific to foo.

We're following option 2 with the dlm and gfs and have been for quite a
while, which means we don't need 1.  I think ocfs2 is moving that way,
too.  Someone could still try 1, of course, but it would be of no use or
interest to me.  I'm not aware of any actual projects pushing forward with
something like 1, so the persistent reference to it is somewhat baffling.
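
(To make option 2 concrete -- all names and paths below are invented for
illustration -- foo-user receives a cluster event from the userland
infrastructure and turns it into a write against foo-kernel's own control
hook, rather than pushing a generic cluster event into the kernel.)

/* Illustrative only: the option-2 pattern.  "foo" and its control file
 * are hypothetical; the point is that the event reaches the kernel only
 * as a foo-specific operation, never as a raw cluster event. */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

/* Called by foo-user when the userland cluster infrastructure reports
 * that a node has left the membership. */
static int foo_user_handle_node_down(int nodeid)
{
    char val[32];
    int fd, rc;

    /* foo-kernel exposes its own control hook (sysfs/configfs/...). */
    fd = open("/sys/fs/foo/control/remove_node", O_WRONLY);  /* hypothetical */
    if (fd < 0)
        return -1;

    snprintf(val, sizeof(val), "%d\n", nodeid);
    rc = write(fd, val, strlen(val)) < 0 ? -1 : 0;
    close(fd);
    return rc;
}

int main(void)
{
    /* In reality this would be driven by the membership daemon. */
    return foo_user_handle_node_down(7) ? 1 : 0;
}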

Dave



Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm

2005-07-19 Thread Lars Marowsky-Bree
On 2005-07-18T14:15:53, David Teigland <[EMAIL PROTECTED]> wrote:

> Some of the comments about the dlm concerned how it's configured (from
> user space.)  In particular, there was interest in seeing the dlm and
> ocfs2 use common methods for their configuration.
> 
> The first area I'm looking at is how we get addresses/ids of other nodes.
> Currently, the dlm uses an ioctl on a misc device and ocfs2 uses a
> separate kernel module called "ocfs2_nodemanager" that's based on
> configfs.
> 
> I've taken a stab at generalizing ocfs2_nodemanager so the dlm could use
> it (removing ocfs-specific stuff).  It still needs some work, but I'd like
> to know if this appeals to the ocfs group and to others who were
> interested in seeing some similarity in dlm/ocfs configuration.

Hi Dave, I finally found time to read through this.

Yes, I most definitely like where this is going!

> +/* TODO:
> +   - generic addresses (IPV4/6)
> +   - multiple addresses per node

The nodeid, I thought, was relative to a given DLM namespace, no? This
concept seems to be missing here, or are you suggesting the nodeid to be
global across namespaces?

Also, eventually we obviously need to have state for the nodes - up/down
et cetera. I think the node manager also ought to track this.

How would kernel components use this and be notified about changes to
the configuration / membership state?


Sincerely,
Lars Marowsky-Brée <[EMAIL PROTECTED]>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"


