date:20111020

Extended attributes Linux interface

2011-10-20 Thread Matthew Mondor

Hello,

There were previously discussions, started by Emmanuel, concerning the
extended attributes, including on the various available APIs and which
to support etc.

At the time I read them I was catching up with a lot of mail and had
written down a small note about a potential security implication that
crossed my mind if we used the Linux interface.  Perhaps someone can
(dis)confirm:

Strings are used instead of IDs to distinguish the class of an extended
attribute, i.e. "system" etc.  My question is then: must those be
limited to ASCII or can they support arbitrary bytes, or UTF-8?

If Unicode strings are possible, I think that it'd be possible for a
string to look like "system" but to actually be something else to an
auditing administrator, unless all tools clearly showed those non-ASCII
bytes in an escaped format.

Of course, if the kernel wanted to match "system", it wouldn't match
then, but the fact that it may _appear_ to be correct to an admin may
introduce a security issue if extended permissions were ever
implemented on top of that system.  Perhaps this problem could
also exist with the key names in case they're part of permission
descriptions?

Thanks,
-- 
Matt
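
A minimal sketch of the "escaped format" idea raised above, assuming a
display tool receives the attribute name as a NUL-terminated byte string.
The function name escape_attr_name() is hypothetical and not part of any
existing interface. Every byte outside printable ASCII (and the backslash
itself) is shown as \xHH, so a look-alike of "system" cannot pass for the
real one:

```c
#include <stddef.h>

/*
 * Copy `name` into `out`, replacing each byte outside printable ASCII
 * (and the backslash) with a four-character \xHH escape.
 * Returns the length written, excluding the terminating NUL,
 * or -1 if `out` is too small.
 */
static int
escape_attr_name(const char *name, char *out, size_t outsz)
{
    static const char hex[] = "0123456789abcdef";
    size_t o = 0;

    for (size_t i = 0; name[i] != '\0'; i++) {
        unsigned char c = (unsigned char)name[i];

        if (c >= 0x20 && c < 0x7f && c != '\\') {
            if (o + 1 >= outsz)     /* one byte plus the NUL */
                return -1;
            out[o++] = (char)c;
        } else {
            if (o + 4 >= outsz)     /* four bytes plus the NUL */
                return -1;
            out[o++] = '\\';
            out[o++] = 'x';
            out[o++] = hex[c >> 4];
            out[o++] = hex[c & 0x0f];
        }
    }
    if (o >= outsz)
        return -1;
    out[o] = '\0';
    return (int)o;
}
```

With this, a name starting with the Cyrillic letter U+0455 (UTF-8 bytes
0xd1 0x95) followed by "ystem" would be displayed as \xd1\x95ystem rather
than as something indistinguishable from "system".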

Re: fs-independent quotas

2011-10-20 Thread Daniel Hagerty

Ignatios Souvatzis  writes:

> On Wed, Oct 19, 2011 at 06:09:27PM +, David Holland wrote:
> > support to other filesystems (tempfs, perhaps v7fs) or even add other
> > filesystems that have or may have their own native quota handling
> > (zfs, Hammer, you name it). 
> 
> zfs - does it really have quota? 

Yes, it does, as of zfs filesystem V4.

http://hub.opensolaris.org/bin/view/Community+Group+zfs/faq#HCanIsetquotasonZFSfilesystems3F

Re: fs-independent quotas

2011-10-20 Thread Thor Lancelot Simon

On Thu, Oct 20, 2011 at 06:54:54PM +0200, Manuel Bouyer wrote:
> On Thu, Oct 20, 2011 at 04:39:21PM +, David Holland wrote:
> >  > We're talking a few MB of ram here, isn't it ? the kernel can certainly
> >  > allocate this without troubles (other subsystems do).
> > 
> > The proplib'd and XMLified complete dump for 50,000 users will
> > probably make a blob of between 10 and 20 MB. (Note: this is an
> > estimate; I haven't checked the size by trying it. It might be larger.
> > I'd be surprised if it were much smaller.)
> 
> I tested with a few tens of users; my estimate is about 35MB for 50k users.
> 
> > 
> > I don't see why it's desirable to manifest such large objects when
> > it's easily avoidable.
> 
> We don't agree on "easily". 

FYI:  I just went around, and around, and around on this with the
configuration framework for a proprietary kernel subsystem.  If you just
take the position that _any_ write to _any_ part of the data invalidates
all cursors it is not so bad.  The user application has to be coded to
deal with that, but it keeps the complexity out of the kernel.

Thor
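
A minimal sketch of the "any write invalidates all cursors" rule, assuming
a single generation counter for the whole data set. All names here
(qstore, qcursor and the functions) are hypothetical, and the choice of
ESTALE as the error is only an illustration:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical data store: one generation counter covers everything. */
struct qstore {
    uint64_t gen;       /* bumped on every write */
};

/* Hypothetical cursor, held by the user application. */
struct qcursor {
    uint64_t gen;       /* store generation when the cursor was opened */
    uint32_t pos;       /* next record to return */
};

static void
qstore_write(struct qstore *st)
{
    /* ... modify some record here ... */
    st->gen++;          /* invalidates every outstanding cursor at once */
}

static void
qcursor_open(const struct qstore *st, struct qcursor *cur)
{
    cur->gen = st->gen;
    cur->pos = 0;
}

/* Returns 0 if the cursor is still usable, ESTALE if it must be reopened. */
static int
qcursor_check(const struct qstore *st, const struct qcursor *cur)
{
    return cur->gen == st->gen ? 0 : ESTALE;
}
```

The store keeps no list of cursors; the application is the one that has
to notice the error and reopen its cursor.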

Re: fs-independent quotas

2011-10-20 Thread Manuel Bouyer

On Thu, Oct 20, 2011 at 09:41:25PM +0200, Manuel Bouyer wrote:
> > 
> > I don't see that you can do anything with an unmounted filesystem in
> > repquota. Unless the quota files for the filesystem are on a different
> > (and mounted) volume, it won't be able to read them, and it doesn't
> > have any code to mount the filesystem temporarily to do that.
> 
> Hum, you're right, it seems I broke this. I'll have a look at fixing
> it, it's a bug.

I just remembered, there's another reason why repquota reads the quota
file directly: I didn't implement the getall command for ffs-quota1.
It has nothing to do with XML or filesystem-independent code :)
It's just that I didn't know (and still don't know) how to properly read
a whole file, especially when it's sparse, from the kernel.
I could try a dqget for the 2^21-1 ids, but there's probably a better
way to do that.
I guess I could find the upper bound from the quota vnode's size.
Maybe that would be enough ...

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--
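
A minimal sketch of deriving the upper bound from the file size, assuming
the quota file is a plain array of fixed-size records in which the record
for id i starts at byte offset i * sizeof(record). The record layout
below is a hypothetical stand-in, not the real on-disk structure:

```c
#include <stdint.h>

/* Hypothetical fixed-size record; field names and sizes are assumptions. */
struct example_dqblk {
    uint32_t bhardlimit, bsoftlimit, curblocks;
    uint32_t ihardlimit, isoftlimit, curinodes;
    int32_t  btime, itime;
};

/*
 * Number of ids that can have a complete record in a file of the given
 * size. Ids at or above the returned value need not be queried. A
 * trailing partial record is ignored.
 */
static uint64_t
quota_id_upper_bound(uint64_t file_size)
{
    return file_size / sizeof(struct example_dqblk);
}
```

For a sparse file the size still reflects the highest id ever written, so
this bounds the loop even though many of the ids below it may be holes.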

Re: fs-independent quotas

2011-10-20 Thread Manuel Bouyer

On Thu, Oct 20, 2011 at 05:35:16PM +, David Holland wrote:
>  > I can't parse this, can you explain? The tools need to be aware of the
>  > format to do something useful with the data, don't they?
> 
> The tools can and should work with a filesystem-independent abstract
> schema. This should be independent of any filesystem's on-disk quota
> format, just as the  structures are independent of any
> filesystem's on-disk directory layout.

The current proplib-based schema is independent of the on-disk format
(as it's just another representation of the same set of data that you
proposed).

> 
>  > that's plain wrong. If it's quota1 you can use the quota1 code in
>  > sys/ufs/ufs (just as it would have done before quota2).
> 
> No, it is not wrong. It cannot use the quota1 code in ufs; the whole
> premise of the proposed lfs renovation is to unhook lfs from ufs. The
> ufs code is a big blob, not a library of components; you can't just
> use parts of it, or at least not easily.
> 
> I can copy the ufs quota1 structures and some of the ufs quota code,
> yes; but then I have struct lfs_dqblk, and I need to interface it to
> the rest of the system, and as things currently stand that forces me
> to clone all the ffs-quota1-specific quota code all over everywhere.

So, if I understand you properly, your lfs code won't use the
quota1 on-disk format but some new format based on a lfs_dqblk structure.
Then it's a brand new disk format, and the right thing to do is to use the
conversion functions from common/lib/libquota/ (as the ufs/quota1 and
ufs/quota2 code already do) and convert from there to your on-disk
format.

You can't claim a data representation isn't filesystem-independent because
it doesn't correspond to your on-disk representation. As it's
filesystem-independent it has (by definition) to be converted to every
on-disk representation.

> 
> The lfs/ufs split would have been committed ages ago if the quota
> system hadn't gotten in the way. This is why, last spring, when you
> were designing quota2, I was asking you to fix things above the FS to
> be FS-independent. But you didn't; instead it got worse. I tried at
> the time to explain the situation and the premises, and why the quota
> system should be FS-independent at and above the VFS level, but I got
> ignored and then sucked away by real life.

Well, I don't remember the details of that time but what I retained
is that you didn't like XML.
Now you're saying "I move lfs out of ufs and I can't use quota1 for lfs".
Yes, of course, as quota1 is tightly coupled to ufs, and my project was
not to make quota1 filesystem-independent - it was to add a new
on-disk quota for ffs with some better properties. You can't blame
me for not making quota1 (or even quota2) reusable outside of ufs when
my goal was to get a new on-disk format for ffs. That's just not the
same work.

Now, I don't think the current quota1 code is that much tied to ufs.
If you want to use the same dqblk for your on-disk format (but then
it's an on-disk format, you can't claim it's fs-independent), code can
certainly be reorganised to make it reusable outside of ufs. But that's
orthogonal to a filesystem-independent format representation.

> 
> Now I'm trying to fix it.
> 
>  > > Likewise, if I were to go add quota support to v7fs, or try to hook up
>  > > whatever quota support zfs has, or commit Hammer and try to get
>  > > whatever quota support *it* has working, or add ext2 quota support, or
>  > > write a new fs with quota support, or whatever, I'd have to make still
>  > > more copies of the logic to cope with all the different formats and
>  > > layouts.
>  > 
>  > Of course if you have a new on-disk format you need to do some conversion,
>  > whatever "filesystem independent" format you use.
>  > But I think you could still reuse sys/ufs/ufs/quota2_subr.c to do the
>  > conversion from plist to some binary representation.
> 
> I could cut and paste it, maybe. That's not particularly desirable.

Now that I understand where you want to go, it's not the right thing
to do. Use the code in common/lib/libquota and write conversion routines
for your filesystem. You can call it a 'cut-n-paste' from quota2_subr.c,
but as quota2_subr.c is about converting the filesystem-independent
data to the quota2 on-disk format, and you use a different on-disk
format, you can't blame it for not fitting your needs.

> 
>  > > This is not a good idea, not scalable, and not sensible, especially
>  > > when a filesystem-independent (read "format-independent" if you like)
>  > > interface is both perfectly possible and simpler.
>  > 
>  > I strongly believe the plist representation is format-independent.
>  > It has exactly the same information as what you propose.
> 
> Right now, I'm not sure if it is or not. I'm only sure that it's
> highly complicated

It's not more complicated than the table representation you proposed
(besides being XML-based, but that's all we have now).

> (unnecessarily so) and underdocumented. Mea

Re: fs-independent quotas

2011-10-20 Thread David Holland

On Thu, Oct 20, 2011 at 07:59:08PM +0200, Ignatios Souvatzis wrote:
 > >  > How would this fit in, if at all?
 > > 
 > > That's a good question. My first instinct is that like the other stuff
 > > zfs does that it does in its own semantically-incompatible way, it
 > > would require its own tools. But I guess the quota system could be
 > > made to report the limits if the sub-filesystems are specifically
 > > assigned to users somehow. Or something like that...
 > 
 > The problem is that sub-file-system per user, or one per workgroup and 
 > one subsubfs per user, are only special cases of what you can do. It's 
 > really a filesystem, mounted at some point in the tree, and can be used
 > to limit finer-grained than what you can express with user and group
 > quota, although it can emulate their functionality.

That's sort of what I suspected. I dunno, it probably won't work.

-- 
David A. Holland
dholl...@netbsd.org

Re: fs-independent quotas

2011-10-20 Thread Ignatios Souvatzis

On Thu, Oct 20, 2011 at 03:16:09PM +, David Holland wrote:
> On Thu, Oct 20, 2011 at 11:57:04AM +0200, Ignatios Souvatzis wrote:
>  > > support to other filesystems (tempfs, perhaps v7fs) or even add other
>  > > filesystems that have or may have their own native quota handling
>  > > (zfs, Hammer, you name it). 
>  > 
>  > zfs - does it really have quota? 
> 
> I don't know... but if not, there are plenty of other fses.
> 
>  > All the demos I've seen talk about sub-filesystem limits; you create
>  > per-user sub-filesystems if you want to emulate per-user quota.
>  > 
>  > (Correct me if I'm wrong.)
>  > 
>  > How would this fit in, if at all?
> 
> That's a good question. My first instinct is that like the other stuff
> zfs does that it does in its own semantically-incompatible way, it
> would require its own tools. But I guess the quota system could be
> made to report the limits if the sub-filesystems are specifically
> assigned to users somehow. Or something like that...

The problem is that sub-file-system per user, or one per workgroup and 
one subsubfs per user, are only special cases of what you can do. It's 
really a filesystem, mounted at some point in the tree, and can be used
to limit finer-grained than what you can express with user and group
quota, although it can emulate their functionality.

-is

Re: fs-independent quotas

2011-10-20 Thread David Holland

On Thu, Oct 20, 2011 at 12:56:17PM +0200, Manuel Bouyer wrote:
 > > > > So, a few months back we got a new improved quota format for FFS.
 > > > > Unfortunately, one of the side effects of this was to sprinkle
 > > > > specific knowledge of the new format through all the userlevel quota
 > > > > tools and quota support logic. To be fair, this was alongside the
 > > > > existing specific knowledge of the old quota format; nonetheless, it's
 > > > > messy and unscalable.
 > > > 
 > > > of course there's been changes to the tools, as there's a new format.
 > > 
 > > The tools ought to be format-independent.
 > 
 > I can't parse this, can you explain? The tools need to be aware of the
 > format to do something useful with the data, don't they?

The tools can and should work with a filesystem-independent abstract
schema. This should be independent of any filesystem's on-disk quota
format, just as the  structures are independent of any
filesystem's on-disk directory layout.

 > > > You'll have to explain this. lfs is some variant of ffs, I see no
 > > > reason why it couldn't use the new format.
 > > 
 > > It could use whatever format it wants. To the extent it currently
 > > supports quotas, I think it's limited to the old-style quotas, that
 > > is, quota1. But there's no way to plug it in without taking the
 > > fs-dependent code currently in all the tools and access pathway and
 > > making a third or perhaps a third and fourth copy of all the logic.
 > 
 > that's plain wrong. If it's quota1 you can use the quota1 code in
 > sys/ufs/ufs (just as it would have done before quota2).

No, it is not wrong. It cannot use the quota1 code in ufs; the whole
premise of the proposed lfs renovation is to unhook lfs from ufs. The
ufs code is a big blob, not a library of components; you can't just
use parts of it, or at least not easily.

I can copy the ufs quota1 structures and some of the ufs quota code,
yes; but then I have struct lfs_dqblk, and I need to interface it to
the rest of the system, and as things currently stand that forces me
to clone all the ffs-quota1-specific quota code all over everywhere.

The lfs/ufs split would have been committed ages ago if the quota
system hadn't gotten in the way. This is why, last spring, when you
were designing quota2, I was asking you to fix things above the FS to
be FS-independent. But you didn't; instead it got worse. I tried at
the time to explain the situation and the premises, and why the quota
system should be FS-independent at and above the VFS level, but I got
ignored and then sucked away by real life.

Now I'm trying to fix it.

 > > Likewise, if I were to go add quota support to v7fs, or try to hook up
 > > whatever quota support zfs has, or commit Hammer and try to get
 > > whatever quota support *it* has working, or add ext2 quota support, or
 > > write a new fs with quota support, or whatever, I'd have to make still
 > > more copies of the logic to cope with all the different formats and
 > > layouts.
 > 
 > Of course if you have a new on-disk format you need to do some conversion,
 > whatever "filesystem independent" format you use.
 > But I think you could still reuse sys/ufs/ufs/quota2_subr.c to do the
 > conversion from plist to some binary representation.

I could cut and paste it, maybe. That's not particularly desirable.

 > > This is not a good idea, not scalable, and not sensible, especially
 > > when a filesystem-independent (read "format-independent" if you like)
 > > interface is both perfectly possible and simpler.
 > 
 > I strongly believe the plist representation is format-independent.
 > It has exactly the same information as what you propose.

Right now, I'm not sure if it is or not. I'm only sure that it's
highly complicated (unnecessarily so) and underdocumented. Meanwhile,
you've also been arguing that the quota2 on-disk structures are
format-independent, so forgive me if I take this all with a grain of
salt.

 > >  > This is exactly the format described in quotactl(2).
 > > 
 > > No, what's described in quotactl(2) is something about commands and
 > > arguments... and while there is a substructure that looks something
 > > like this, the fact remains that it's a *sub*structure
 > 
 > Yes, but you still need a way to pass commands. You didn't talk about this.

No, because I had something like the old quotactl(2) in mind - an
ordinary call passing a filesystem identifier, a command code, and an
argument.

 > > and the schema
 > > is not tabular.
 > 
 > I don't understand what you mean here. There's a set of values associated
 > with an id, I can't see the difference with what you're proposing.

There's a complicated hierarchical structure of arrays and
maps/dictionaries, as opposed to a single flat table with columns.
Or, put another way, the schema I proposed is (I think) in third
normal form, and yours isn't.

Another way to put it is that your schema requires proplib to manage
it, with all the attendant complexity, whereas mine works perfectly

Re: fs-independent quotas

2011-10-20 Thread Manuel Bouyer

On Thu, Oct 20, 2011 at 04:39:21PM +, David Holland wrote:
>  > We're talking a few MB of ram here, isn't it ? the kernel can certainly
>  > allocate this without troubles (other subsystems do).
> 
> The proplib'd and XMLified complete dump for 50,000 users will
> probably make a blob of between 10 and 20 MB. (Note: this is an
> estimate; I haven't checked the size by trying it. It might be larger.
> I'd be surprised if it were much smaller.)

I tested with a few tens of users; my estimate is about 35MB for 50k users.

> 
> I don't see why it's desirable to manifest such large objects when
> it's easily avoidable.

We don't agree on "easily". 

> 
>  > > There are two design truisms for database stuff that apply here:
>  > > first, you always end up wanting cursors, and second, you always end
>  > > up wanting bulk get (and not just single get) from those cursors. So
>  > > it's usually a good idea to anticipate this and design it all in up
>  > > front.
>  > 
>  > Maybe ... I know that in the end I want the whole set of data and not
>  > just a part of it.
> 
> Yes, probably. The cursor API I've floated so far is not general
> enough to support much else. Although it could be made more general.
> 
>  > But if you believe it's needed this can easily be added to the
>  > existing quotactl(2) (it would just be a new command).
> 
> Yes, perhaps it could... but why? What's to be gained by using a
> baroque proplib encoding of what can otherwise be handled as an array
> of simple structs?

It's easily machine-parsable text. That's probably the reason why it's
used in other parts of the kernel too.

> 
> I remember asking this question when you first proposed the proplib
> interface last spring, and never really got a clear answer.

I see it as being the common format used for non-performance-critical
kernel/userland communication. It has been adopted by other kernel
subsystems, there's prior art there.

> 
>  > >  > > The reason to wrap the position in a cursor abstraction is to allow
>  > >  > > flexibility about how the position is represented.
>  > >  > 
>  > >  > But then the cursor would still be stored in userland ?
>  > > 
>  > > That's the idea, like reading a file with pread().
>  > > 
>  > > I think the kernel should know, or at least be able to know, how many
>  > > cursors are currently open; but I don't think there's any need to keep
>  > > the cursor state itself in the kernel.
>  > 
>  > So you want a quotaopen/quotaclose, with a file descriptor (or something
>  > similar) ?
> 
> The proposed API already has explicit open and close for cursors; what
> I'm saying is that this should be exposed to the kernel. (Open already
> has to be, to initialize the cursor position; close should be, so the
> filesystem can if necessary know if there are cursors open at any
> given time. Otherwise you can get into trouble; see for example nfsd
> and readdir.)

So you're close to having something like a file descriptor.

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--
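
The size estimates quoted in this exchange can be turned into a per-entry
cost with a little arithmetic. This is only a sketch: it takes the figure
of two quota entries per user from the thread (50,000 users giving
100,000 entries) and treats 1 MB as 1,000,000 bytes, which is an
assumption:

```c
#include <stdint.h>

/* From the thread: 50,000 users correspond to 100,000 quota entries. */
#define ENTRIES_PER_USER 2

/* Average encoded size of one quota entry for a blob of the given size. */
static uint64_t
bytes_per_entry(uint64_t blob_bytes, uint64_t users)
{
    return blob_bytes / (users * ENTRIES_PER_USER);
}
```

Under those assumptions, a 10 MB to 20 MB blob works out to 100 to 200
bytes per entry, and the 35 MB estimate to 350 bytes per entry.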

Re: fs-independent quotas

2011-10-20 Thread David Holland

On Thu, Oct 20, 2011 at 06:00:28PM +0200, Manuel Bouyer wrote:
 > >  > It's certainly less trouble to send back to userland the whole set of
 > >  > data - especially if what userland wants is the whole set of data
 > >  > (I can't see what a partial read of quota would be usefull for).
 > > 
 > > No, no it really isn't. Suppose there are, say, 50,000 users, so to
 > > send back the whole works you have to accumulate 100,000 quota entries
 > > in a gigantic blob... a machine with 50,000 users will have enough RAM
 > > for this but that doesn't mean that allocating a contiguous chunk of
 > > kernel memory that large is easy or desirable. Far better to read it
 > > out a couple hundred at a time.
 > 
 > We're talking a few MB of ram here, isn't it ? the kernel can certainly
 > allocate this without troubles (other subsystems do).

The proplib'd and XMLified complete dump for 50,000 users will
probably make a blob of between 10 and 20 MB. (Note: this is an
estimate; I haven't checked the size by trying it. It might be larger.
I'd be surprised if it were much smaller.)

I don't see why it's desirable to manifest such large objects when
it's easily avoidable.

 > > There are two design truisms for database stuff that apply here:
 > > first, you always end up wanting cursors, and second, you always end
 > > up wanting bulk get (and not just single get) from those cursors. So
 > > it's usually a good idea to anticipate this and design it all in up
 > > front.
 > 
 > Maybe ... I know that in the end I want the whole set of data and not
 > just a part of it.

Yes, probably. The cursor API I've floated so far is not general
enough to support much else. Although it could be made more general.

 > But if you believe it's needed this can easily be added to the
 > existing quotactl(2) (it would just be a new command).

Yes, perhaps it could... but why? What's to be gained by using a
baroque proplib encoding of what can otherwise be handled as an array
of simple structs?

I remember asking this question when you first proposed the proplib
interface last spring, and never really got a clear answer.

 > >  > > The reason to wrap the position in a cursor abstraction is to allow
 > >  > > flexibility about how the position is represented.
 > >  > 
 > >  > But then the cursor would still be stored in userland ?
 > > 
 > > That's the idea, like reading a file with pread().
 > > 
 > > I think the kernel should know, or at least be able to know, how many
 > > cursors are currently open; but I don't think there's any need to keep
 > > the cursor state itself in the kernel.
 > 
 > So you want a quotaopen/quotaclose, with a file descriptor (or something
 > similar) ?

The proposed API already has explicit open and close for cursors; what
I'm saying is that this should be exposed to the kernel. (Open already
has to be, to initialize the cursor position; close should be, so the
filesystem can if necessary know if there are cursors open at any
given time. Otherwise you can get into trouble; see for example nfsd
and readdir.)

-- 
David A. Holland
dholl...@netbsd.org
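
A minimal sketch of the split described above, assuming the kernel only
counts open cursors while the position itself stays in the caller's
memory. The structure and function names are hypothetical and do not
correspond to the proposed API:

```c
#include <stdint.h>

/* Hypothetical per-filesystem state: a count of cursors, nothing more. */
struct fs_quota_state {
    unsigned int open_cursors;
};

/* Hypothetical cursor; it lives entirely in the caller's memory. */
struct quota_cursor {
    uint32_t next_id;
};

static void
quota_cursor_open(struct fs_quota_state *fs, struct quota_cursor *cur)
{
    cur->next_id = 0;       /* the filesystem picks the initial position */
    fs->open_cursors++;
}

static void
quota_cursor_close(struct fs_quota_state *fs, struct quota_cursor *cur)
{
    (void)cur;              /* no kernel-side cursor state to release */
    fs->open_cursors--;
}

/* Lets the filesystem ask whether any cursor is open right now. */
static int
quota_cursors_active(const struct fs_quota_state *fs)
{
    return fs->open_cursors != 0;
}
```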

Re: fs-independent quotas

2011-10-20 Thread Manuel Bouyer

On Thu, Oct 20, 2011 at 03:47:26PM +, David Holland wrote:
> On Thu, Oct 20, 2011 at 05:23:14PM +0200, Manuel Bouyer wrote:
>  > > That's way more complicated than necessary. Think of it as like
>  > > VOP_READDIR - you get passed a position, you send back some number of
>  > > items, and update the position.
>  > 
>  > Depending on how the data are stored on disk, the notion of position
>  > (which also implies some ordering) can be difficult to handle,
>  > especially if the data we're reading can change between two calls,
>  > causing the position do become invalid.
> 
> ...yes, but this is just one of those things you have to cope with
> when doing filesystems. It's no different from readdir in that regard.
> 
>  > It's certainly less trouble to send back to userland the whole set of
>  > data - especially if what userland wants is the whole set of data
>  > (I can't see what a partial read of quota would be usefull for).
> 
> No, no it really isn't. Suppose there are, say, 50,000 users, so to
> send back the whole works you have to accumulate 100,000 quota entries
> in a gigantic blob... a machine with 50,000 users will have enough RAM
> for this but that doesn't mean that allocating a contiguous chunk of
> kernel memory that large is easy or desirable. Far better to read it
> out a couple hundred at a time.

We're talking about a few MB of RAM here, aren't we? The kernel can certainly
allocate this without trouble (other subsystems do).


> 
> There are two design truisms for database stuff that apply here:
> first, you always end up wanting cursors, and second, you always end
> up wanting bulk get (and not just single get) from those cursors. So
> it's usually a good idea to anticipate this and design it all in up
> front.

Maybe ... I know that in the end I want the whole set of data and not
just a part of it. But if you believe it's needed this can
easily be added to the existing quotactl(2) (it would just be a new command).

> 
>  > > The reason to wrap the position in a cursor abstraction is to allow
>  > > flexibility about how the position is represented.
>  > 
>  > But then the cursor would still be stored in userland ?
> 
> That's the idea, like reading a file with pread().
> 
> I think the kernel should know, or at least be able to know, how many
> cursors are currently open; but I don't think there's any need to keep
> the cursor state itself in the kernel.

So you want a quotaopen/quotaclose, with a file descriptor (or something
similar) ?

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--

Re: fs-independent quotas

2011-10-20 Thread David Holland

On Thu, Oct 20, 2011 at 05:23:14PM +0200, Manuel Bouyer wrote:
 > > That's way more complicated than necessary. Think of it as like
 > > VOP_READDIR - you get passed a position, you send back some number of
 > > items, and update the position.
 > 
 > Depending on how the data are stored on disk, the notion of position
 > (which also implies some ordering) can be difficult to handle,
 > especially if the data we're reading can change between two calls,
 > causing the position do become invalid.

...yes, but this is just one of those things you have to cope with
when doing filesystems. It's no different from readdir in that regard.

 > It's certainly less trouble to send back to userland the whole set of
 > data - especially if what userland wants is the whole set of data
 > (I can't see what a partial read of quota would be usefull for).

No, no it really isn't. Suppose there are, say, 50,000 users, so to
send back the whole works you have to accumulate 100,000 quota entries
in a gigantic blob... a machine with 50,000 users will have enough RAM
for this but that doesn't mean that allocating a contiguous chunk of
kernel memory that large is easy or desirable. Far better to read it
out a couple hundred at a time.

There are two design truisms for database stuff that apply here:
first, you always end up wanting cursors, and second, you always end
up wanting bulk get (and not just single get) from those cursors. So
it's usually a good idea to anticipate this and design it all in up
front.

 > > The reason to wrap the position in a cursor abstraction is to allow
 > > flexibility about how the position is represented.
 > 
 > But then the cursor would still be stored in userland ?

That's the idea, like reading a file with pread().

I think the kernel should know, or at least be able to know, how many
cursors are currently open; but I don't think there's any need to keep
the cursor state itself in the kernel.

-- 
David A. Holland
dholl...@netbsd.org

Re: fs-independent quotas

2011-10-20 Thread Manuel Bouyer

On Thu, Oct 20, 2011 at 03:08:16PM +, David Holland wrote:
> That's way more complicated than necessary. Think of it as like
> VOP_READDIR - you get passed a position, you send back some number of
> items, and update the position.

Depending on how the data are stored on disk, the notion of position
(which also implies some ordering) can be difficult to handle,
especially if the data we're reading can change between two calls,
causing the position to become invalid.

It's certainly less trouble to send back to userland the whole set of
data - especially if what userland wants is the whole set of data
(I can't see what a partial read of quota would be useful for).

> 
> If you want to take the trouble to guarantee strict transactional
> consistency, you can easily enough by checking generation numbers and
> failing with a particular errno if things have changed; but I don't
> think there's any real need for that level of strict consistency for
> quotas. Much less so than for readdir, at least, and we manage to cope
> with readdir the way it is.

I agree with this.

> 
> The reason to wrap the position in a cursor abstraction is to allow
> flexibility about how the position is represented.

But then the cursor would still be stored in userland ?

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--

Re: fs-independent quotas

2011-10-20 Thread David Holland

On Thu, Oct 20, 2011 at 11:57:04AM +0200, Ignatios Souvatzis wrote:
 > > support to other filesystems (tempfs, perhaps v7fs) or even add other
 > > filesystems that have or may have their own native quota handling
 > > (zfs, Hammer, you name it). 
 > 
 > zfs - does it really have quota? 

I don't know... but if not, there are plenty of other fses.

 > All the demos I've seen talk about sub-filesystem limits; you create
 > per-user sub-filesystems if you want to emulate per-user quota.
 > 
 > (Correct me if I'm wrong.)
 > 
 > How would this fit in, if at all?

That's a good question. My first instinct is that like the other stuff
zfs does that it does in its own semantically-incompatible way, it
would require its own tools. But I guess the quota system could be
made to report the limits if the sub-filesystems are specifically
assigned to users somehow. Or something like that...

-- 
David A. Holland
dholl...@netbsd.org

Re: fs-independent quotas

2011-10-20 Thread David Holland

On Thu, Oct 20, 2011 at 06:43:47AM +0200, Emmanuel Dreyfus wrote:
 > > It seems to me that quotas are fundamentally a special-purpose
 > > key/value store; that is, you look up quota information for a
 > > particular thing (the key) and get back the quota settings and current
 > > usage information (the value).
 > 
 > If you are going to add a generic key/value store mechanism for all
 > filesystems, you can consider fs-independent extended attributes as
 > well.

I am not adding a generic key/value store mechanism. I am representing
the quota data as a specific key/value store.

A generic key/value store mechanism for all filesystems would be a
very large, messy, and semantically nebulous project...

-- 
David A. Holland
dholl...@netbsd.org
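
A minimal sketch of quotas viewed as a specific key/value store, assuming
the key is an id type plus an id and the value holds limits and current
usage. The type and field names are hypothetical; the thread does not
spell out the actual schema:

```c
#include <stddef.h>
#include <stdint.h>

enum quota_idtype { QID_USER, QID_GROUP };

/* Key: which thing the quota applies to. */
struct quota_key {
    enum quota_idtype idtype;
    uint32_t id;
};

/* Value: quota settings and current usage. */
struct quota_val {
    uint64_t softlimit;
    uint64_t hardlimit;
    uint64_t usage;
};

struct quota_entry {
    struct quota_key key;
    struct quota_val val;
};

/* Linear lookup over a table; returns NULL if the key is absent. */
static const struct quota_val *
quota_lookup(const struct quota_entry *tab, size_t n, struct quota_key key)
{
    for (size_t i = 0; i < n; i++) {
        if (tab[i].key.idtype == key.idtype && tab[i].key.id == key.id)
            return &tab[i].val;
    }
    return NULL;
}
```

The key and value types are fixed by the quota schema, which is what
separates this from a generic store for arbitrary data.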

Re: fs-independent quotas

2011-10-20 Thread David Holland

On Thu, Oct 20, 2011 at 04:53:53PM +0200, Manuel Bouyer wrote:
 > > You don't need to track state in the kernel, you just need to keep a
 > > generation ID. Have the caller pass a "starting index" and
 > > "requested count" parameter, and have the kernel include "number of
 > > matches", "total matches", and "generation ID" in the response. Let
 > > the kernel limit the maximum number of matches to return per request
 > > if you like. If the generation ID changes while the caller is
 > > fetching records you simply restart the process.
 > 
 > I still don't see how it's going to work in details. Either
 > the kernel have to restart reading all quotas on each call to
 > return requested range, or it has to cache the previous read
 > of all quotas.
 > Once the kernel has read all the quotas, it can as well return
 > all the data to the caller, and let the caller deal with the
 > iteration.

That's way more complicated than necessary. Think of it as like
VOP_READDIR - you get passed a position, you send back some number of
items, and update the position.

If you want to take the trouble to guarantee strict transactional
consistency, you can easily enough by checking generation numbers and
failing with a particular errno if things have changed; but I don't
think there's any real need for that level of strict consistency for
quotas. Much less so than for readdir, at least, and we manage to cope
with readdir the way it is.

The reason to wrap the position in a cursor abstraction is to allow
flexibility about how the position is represented.
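
A readdir-style cursor along these lines might be sketched as follows.
This is only an illustration in Python, not kernel code: QuotaTable,
QuotaCursor, open_cursor and read are hypothetical names, and remembering
"the last key returned" is just one possible way to represent the
position, chosen to show that callers never need to know what is inside
the cursor.

```python
from bisect import bisect_right


class QuotaCursor:
    """Opaque position; callers pass it back without looking inside."""

    def __init__(self):
        self._last_key = None  # private representation of the position
        self.at_end = False


class QuotaTable:
    """Hypothetical quota table keyed by (class, id) tuples."""

    def __init__(self, entries):
        self._entries = dict(entries)

    def open_cursor(self):
        return QuotaCursor()

    def read(self, cursor, max_items):
        """Return up to max_items (key, value) pairs and advance the cursor."""
        # Sorting on every call keeps the sketch short; a real filesystem
        # would seek within its own on-disk structure instead.
        keys = sorted(self._entries)
        if cursor._last_key is None:
            start = 0
        else:
            start = bisect_right(keys, cursor._last_key)
        batch = keys[start:start + max_items]
        if batch:
            cursor._last_key = batch[-1]
        cursor.at_end = start + len(batch) >= len(keys)
        return [(k, self._entries[k]) for k in batch]


def read_all(table, batch_size=2):
    """Caller-side loop: keep reading until the cursor reports the end."""
    cursor = table.open_cursor()
    out = []
    while not cursor.at_end:
        out.extend(table.read(cursor, batch_size))
    return out
```

As with readdir, entries added or removed between calls may or may not
show up in the result; the sketch makes no attempt at strict consistency.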

-- 
David A. Holland
dholl...@netbsd.org

Re: fs-independent quotas

2011-10-20 Thread Manuel Bouyer

On Thu, Oct 20, 2011 at 10:48:16AM -0400, Jared McNeill wrote:
> Heyas Manuel --
> 
> You don't need to track state in the kernel, you just need to keep a
> generation ID. Have the caller pass a "starting index" and
> "requested count" parameter, and have the kernel include "number of
> matches", "total matches", and "generation ID" in the response. Let
> the kernel limit the maximum number of matches to return per request
> if you like. If the generation ID changes while the caller is
> fetching records you simply restart the process.

I still don't see how it's going to work in detail. Either
the kernel has to restart reading all quotas on each call to
return requested range, or it has to cache the previous read
of all quotas.
Once the kernel has read all the quotas, it can as well return
all the data to the caller, and let the caller deal with the
iteration.

-- 
Manuel Bouyer 
 NetBSD: 26 years of experience will always make the difference
--

proposed additions to sys/conf/std

2011-10-20 Thread Jonathan A. Kollasch


I propose adding

pseudo-device drvctl

and/or

options BUFQ_PRIOCSCAN

to src/sys/conf/std.

The reasons I even bring this up:
 - Many kernels are missing drvctl and thus do not support disk wedges
   (this is arguably due to a flaw in the design of disk wedges, but
   that's another bikeshed).
 - BUFQ_PRIOCSCAN is superior to BUFQ_DISKSORT, and in fact
   BUFQ_DISKSORT is actually inferior to BUFQ_FCFS in terms of
   interactive disk I/O responsiveness.  There are many kernels that
   default to BUFQ_DISKSORT due to not explicitly adding BUFQ_PRIOCSCAN.

The ominous line

# "it's commonly used" is NOT a good reason to enable options here.

has me a bit apprehensive.  However, pseudo-device cpuctl is there
already.  There are some options that are there for historical reasons,
so this is sort of a slippery slope.  Do we need a new config file for
standard-but-optional-options?

Jonathan Kollasch

Re: fs-independent quotas

2011-10-20 Thread Jared McNeill




On Thu, 20 Oct 2011, Manuel Bouyer wrote:
> > also it doesn't support cursors.
>
> This can easily be implemented in userland, without changes to the
> quotactl(2) interface. I have trouble seeing how this can be sanely
> implemented at the quotactl(2) level (I don't like the idea of the
> kernel keeping state about what a specific userland process is doing).


Heyas Manuel --

You don't need to track state in the kernel, you just need to keep a 
generation ID. Have the caller pass a "starting index" and "requested 
count" parameter, and have the kernel include "number of matches", "total 
matches", and "generation ID" in the response. Let the kernel limit 
the maximum number of matches to return per request if you like. If 
the generation ID changes while the caller is fetching records you simply 
restart the process.
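
A minimal sketch of that protocol, in Python for brevity: QuotaServer
stands in for the kernel side and fetch_all for the caller. All names,
the dict-shaped response, and the per-request cap of 2 are assumptions
made for the illustration; this is not the quotactl(2) interface.

```python
class QuotaServer:
    """Stand-in for the kernel side; keeps only a generation counter."""

    MAX_PER_REQUEST = 2  # assumed cap on matches returned per request

    def __init__(self, records):
        self._records = list(records)
        self.generation = 0

    def modify(self, records):
        """Any change to the quota records bumps the generation ID."""
        self._records = list(records)
        self.generation += 1

    def query(self, start, count):
        """Return one range of matches plus totals and the generation ID."""
        count = min(count, self.MAX_PER_REQUEST)
        matches = self._records[start:start + count]
        return {
            "matches": matches,
            "num_matches": len(matches),
            "total_matches": len(self._records),
            "generation": self.generation,
        }


def fetch_all(server, batch=16, max_restarts=10):
    """Caller side: fetch every record, restarting if the generation changes."""
    for _ in range(max_restarts):
        records, generation = [], None
        while True:
            resp = server.query(len(records), batch)
            if generation is None:
                generation = resp["generation"]
            elif resp["generation"] != generation:
                break  # records changed underneath us: start over
            records.extend(resp["matches"])
            if len(records) >= resp["total_matches"]:
                return records
    raise RuntimeError("quota records kept changing; giving up")
```

No per-process state lives on the server side here: each request carries
its own starting index, and the only thing the server remembers is the
generation counter.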


Cheers,
Jared

Re: fs-independent quotas

2011-10-20 Thread Manuel Bouyer

On Wed, Oct 19, 2011 at 10:20:23PM +, David Holland wrote:
> On Wed, Oct 19, 2011 at 09:22:02PM +0200, Manuel Bouyer wrote:
>  > > So, a few months back we got a new improved quota format for FFS.
>  > > Unfortunately, one of the side effects of this was to sprinkle
>  > > specific knowledge of the new format through all the userlevel quota
>  > > tools and quota support logic. To be fair, this was alongside the
>  > > existing specific knowledge of the old quota format; nonetheless, it's
>  > > messy and unscalable.
>  > 
>  > of course there's been changes to the tools, as there's a new format.
> 
> The tools ought to be format-independent.

I can't parse this; can you explain? The tools need to be aware of the
format to do something useful with the data, don't they?

> 
>  > > We may want to add more quota formats (e.g. the different and
>  > > incompatible new quota format FreeBSD added last year) or add quota
>  > > support to other filesystems (tempfs, perhaps v7fs) or even add other
>  > > filesystems that have or may have their own native quota handling
>  > > (zfs, Hammer, you name it). Also, my planned lfs-renovation is
>  > > currently hung up on the VFS-level quota interface, because I don't
>  > > want to rip out the existing maybe-partial support for quotas but
>  > > can't plug new code into the existing framework.
>  > 
>  > You'll have to explain this. lfs is some variant of ffs; I see no reason
>  > why it couldn't use the new format.
> 
> It could use whatever format it wants. To the extent it currently
> supports quotas, I think it's limited to the old-style quotas, that
> is, quota1. But there's no way to plug it in without taking the
> fs-dependent code currently in all the tools and access pathway and
> making a third or perhaps a third and fourth copy of all the logic.

that's plain wrong. If it's quota1 you can use the quota1 code in
sys/ufs/ufs (just as it would have done before quota2).

> 
> Likewise, if I were to go add quota support to v7fs, or try to hook up
> whatever quota support zfs has, or commit Hammer and try to get
> whatever quota support *it* has working, or add ext2 quota support, or
> write a new fs with quota support, or whatever, I'd have to make still
> more copies of the logic to cope with all the different formats and
> layouts.

Of course, if you have a new on-disk format you need to do some conversion,
whatever "filesystem-independent" format you use.
But I think you could still reuse sys/ufs/ufs/quota2_subr.c to do the
conversion from plist to some binary representation.

> 
> This is not a good idea, not scalable, and not sensible, especially
> when a filesystem-independent (read "format-independent" if you like)
> interface is both perfectly possible and simpler.

I strongly believe the plist representation is format-independent.
It carries exactly the same information as what you propose.

> 
>  > in fact the new format is fs-independent.
> 
> Yes, in the sense that one could add the format to other file systems;
> but no, in the sense that other file systems already have their own
> quota formats and we need to be able to interoperate.

You have to do some conversion, at the same level as with what you
propose.

> 
>  > But this is just what the current proplib format is! A set of tables
>  > with key/value pairs!
> 
> That's great, that'll make the changes I need to make that much
> easier. But it doesn't seem particularly familiar relative to the code
> I've been working on.

Or maybe you don't need to change it at all.

> 
>  > > the quota *type*
>  > > 
>  > >- the quota value is:
>  > > the configured hard limit
>  > > the configured soft limit
>  > > the configured grace period
>  > > the current usage
>  > > the current grace expiry time (if any)
>  > 
>  > This is exactly the format described in quotactl(2).
> 
> No, what's described in quotactl(2) is something about commands and
> arguments... and while there is a substructure that looks something
> like this, the fact remains that it's a *sub*structure

Yes, but you still need a way to pass commands. You didn't talk about this.

> and the schema
> is not tabular.

I don't understand what you mean here. There's a set of values associated
with an id; I can't see the difference from what you're proposing.

> 
>  > > The quota *class* is the thing the quota is imposed on; this is
>  > > currently either "user" or "group". There is no likely prospect of
>  > > additional quota classes appearing.
>  > 
>  > I don't think we should limit ourselves to these classes. I could see
>  > per-host or per-hostgroup quotas for networked filesystems for example.
> 
> I'm not limiting it to anything, but I'll believe in more quota
> classes when I see them. Per-host quotas (even if they make sense,
> which I question) aren't going to work very well with a 32-bit id, for
> example.

right, that's where a plist is a win.

> 
> Whereas, as I pointed out before, th

Re: fs-independent quotas

2011-10-20 Thread Ignatios Souvatzis

On Wed, Oct 19, 2011 at 06:09:27PM +, David Holland wrote:
> support to other filesystems (tempfs, perhaps v7fs) or even add other
> filesystems that have or may have their own native quota handling
> (zfs, Hammer, you name it). 

zfs - does it really have quota? 

All the demos I've seen talk about sub-filesystem limits; you create
per-user sub-filesystems if you want to emulate per-user quota.

(Correct me if I'm wrong.)

How would this fit in, if at all?

-is

Re: dtrace ioctls

2011-10-20 Thread David Laight

On Wed, Oct 19, 2011 at 10:22:08PM +, David Holland wrote:
> On Wed, Oct 19, 2011 at 10:01:33PM +0100, David Laight wrote:
>  > 
>  > Hmmm... the sun code is passing the structure by value 
> 
> Is it? The non-sun code appears to be calling an ioctl that's defined
> to take a pointer to a pointer to a structure. Or maybe I'm totally
> misreading ioccom.h?

Maybe I was asleep

David

-- 
David Laight: da...@l8s.co.uk
