Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
This information is Copyright 2009 Sun Microsystems
1. Introduction
    1.1. Project/Component Working Name:
         ZFS user/group quotas & space accounting
    1.2. Name of Document Author/Supplier:
         Author: Matthew Ahrens
    1.3. Date of This Document:
         30 March, 2009
4. Technical Description

ZFS user/group space accounting

A. SUMMARY

This case adds support to ZFS for user/group quotas & per-uid/gid space
tracking.

B. PROBLEM

Enterprise customers often want to know who is using space, based on
which uid and gid own each file.

Education customers often want to apply per-user quotas to hundreds of
thousands of users. In these situations, the number of users and/or
existing infrastructure prohibits using one filesystem per user and
setting filesystem-wide quotas.

C. PROPOSED SOLUTION

1. Overview

Each filesystem keeps track of how much space inside it is owned by each
user (uid) and group (gid). This is the amount of space "referenced",
so relationships between filesystems, descendents, clones, and snapshots
are ignored, and each tracks its "user used" and "group used"
independently. This is the same policy behind the "referenced",
"refquota", and "refreservation" properties. The amount of space
charged is the amount of space reported by struct stat's st_blocks and
du(1).

Both POSIX ids (uid & gid) and untranslated SIDs are supported (eg, when
sharing filesystems over SMB without a name service translation set up).

ZFS will now enforce quotas on the amount of space referenced by files
owned by particular users and groups. Enforcement may be delayed by
several seconds. In other words, users may go a bit over their quota
before the system notices that they are over quota and begins to refuse
additional writes with EDQUOT. This decision was made to get the
feature to market in a reasonable time, with a minimum of engineering
resources expended. The design and implementation do not preclude
implementing strict enforcement at a later date.

User space accounting and quotas "stick with" each dataset (snapshot,
filesystem, and clone). This means that user quotas (and space
accounting) are not inherited. They will be "copied" to a new snapshot,
and keep the values they had at the time the snapshot was taken.
Likewise, user quotas will be "copied" to a clone (from its origin
snapshot), and they will be copied with "zfs send" (even without -R).
(User accounting and quota information is not actually copied to
snapshots and clones, just referenced and copied-on-write like other
filesystem contents.)

User space accounting and quotas are reported by the new
userused@<user>, groupused@<group>, userquota@<user>, and
groupquota@<group> properties, and by the new "zfs userspace" and
"zfs groupspace" subcommands, which are detailed below.

2. Version Compatibility

To use these features, the pool must be upgraded to a new on-disk
version (15). Old filesystems must have their space accounting
information initialized by running "zfs userspace <fs>" or upgrading the
old filesystem to a new on-disk version (4). To set user quotas, the
pool and filesystem must both be upgraded.

3. Permissions

Setting or changing a user quota is an administrative action, subject to
the same privilege requirements as other zfs subcommands. There are new
"userquota" and "groupquota" permissions which can be granted with
"zfs allow", to allow those properties to be viewed and changed.

Unprivileged users can only view their own userquota and userused, and
the groupquota and groupused of any groups they belong to. The new
"userused" and "groupused" permissions can be granted with "zfs allow"
to permit users to view these properties.

The existing "version" permission (granted with "zfs allow") permits the
accounting information to be initialized by "zfs userspace".

4. New Properties

User/group space accounting information and quotas can be manipulated
with 4 new properties:

    zfs get userused@<user> <fs|snap>
    zfs get groupused@<group> <fs|snap>
    zfs get userquota@<user> <fs|snap>
    zfs get groupquota@<group> <fs|snap>

    zfs set userquota@<user>=<quota> <fs>
    zfs set groupquota@<group>=<quota> <fs>

The <user> or <group> is specified using one of the following forms:

    posix name (eg. ahrens)
    posix numeric id (eg. 126829)
    sid name (eg. ahrens@sun)
    sid numeric id (eg. S-1-12345-12423-125829)

For "zfs set", if a nonexistent name is specified, an error is
generated. Any numeric ID is permitted.

For "zfs get", if a nonexistent name is specified, "-" is printed for
the value, indicating that there is no value available (like
"zfs get nonexistent:userproperty").

As with filesystem quotas ("quota" and "refquota" properties), user
quotas can be set to a value larger than the available space. User
quotas can also be set to a value less than the amount of space used by
that user, effectively forcing that user to reduce their space
utilization.

These new properties are not printed by "zfs get all", since that could
generate a huge amount of output, which would not be very well
organized. The new "zfs userspace" subcommand should be used instead.

5. New Subcommands

Two new subcommands are added: "zfs userspace" and "zfs groupspace":

    zfs {user|group}space [-hniHp] [-o field[,...]] [-sS field] ...
        [-t type[,...]] <filesystem|snapshot>

Typical output is like this:

    TYPE        NAME        USED  QUOTA
    POSIX User  ahrens       14M     1G
    POSIX User  george     56.5M   none
    POSIX User  lling       258M   500M
    SMB User    marks@sun   103M   none

Option flags:

-h  Show help message and exit.

-n  Print numeric ID instead of user/group name (like "ls -n")

-i  Translate SID to POSIX ID. The POSIX ID may be ephemeral if no
    mapping exists. Normal POSIX interfaces (eg, stat(2), "ls -l")
    perform this translation, so the -i option allows the output from
    "zfs userspace" to be compared directly with those utilities.
    However, -i may lead to confusion if some files were created by a
    SMB user before a SMB -> POSIX name mapping was established. In
    that case some files are owned by the SMB entity and some by the
    POSIX entity. The -i flag will report that the POSIX entity has
    the total usage and quota for both entities.

-H  Do not print headers, use tab-delimited output (like
    "zfs list/get -H")

-p  Use exact (parsable) numeric output (like "zfs get -p")

-o field[,...]
    Print only the specified fields (like "zfs list/get -o"), from the
    following set: type,name,used,quota. The default is to print all
    fields.

-s field
    Sort output by this field (like "zfs list -s"). The -s (and -S)
    flag may be specified multiple times to sort first by one field,
    then by another. The default is "-s type -s name".

-S field
    Sort by this field in reverse order, see -s.

-t type[,...]
    Print only the specified types (like "zfs list -t"), from the
    following set: all,posixuser,smbuser,posixgroup,smbgroup.
    The default for "zfs userspace" is "-t posixuser,smbuser".
    The default for "zfs groupspace" is "-t posixgroup,smbgroup".
    This is the only difference between the two subcommands, and in
    fact "zfs userspace -t posixgroup" is perfectly valid.

6. Stability

This case requests patch/micro release binding. The new interfaces are
committed.

6. Resources and Schedule
    6.4. Steering Committee requested information
        6.4.1. Consolidation C-team Name:
               ON
    6.5. ARC review type: FastTrack
    6.6. ARC Exposure: open
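
Illustration (not part of the proposed interfaces): the -H and -p
options of "zfs userspace" are intended for scripted use. The sketch
below, in Python, shows one way a script might consume such output and
list the entities that are over quota. The sample rows, the Usage
type, and the parse_userspace() and over_quota() helpers are
hypothetical. The sketch assumes the default field order
(type,name,used,quota) and assumes that an unset quota is printed as
"none" or "-" under -p, which this case does not specify.

```python
from typing import List, NamedTuple, Optional

# Hypothetical sample of "zfs userspace -H -p <fs>" output: no header,
# tab-delimited, exact byte counts. All values are invented for
# illustration only.
SAMPLE = (
    "POSIX User\tahrens\t14680064\t1073741824\n"
    "POSIX User\tgeorge\t59244544\tnone\n"
    "POSIX User\tlling\t629145600\t524288000\n"
)


class Usage(NamedTuple):
    kind: str              # the "type" field, e.g. "POSIX User"
    name: str
    used: int              # bytes
    quota: Optional[int]   # bytes, or None when no quota is set


def parse_userspace(text: str) -> List[Usage]:
    """Parse tab-delimited rows in the order type,name,used,quota."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        kind, name, used, quota = line.split("\t")
        # Assumption: an unset quota appears as "none" or "-".
        parsed_quota = None if quota in ("none", "-") else int(quota)
        rows.append(Usage(kind, name, int(used), parsed_quota))
    return rows


def over_quota(rows: List[Usage]) -> List[Usage]:
    """Return entries whose usage exceeds a quota that is set."""
    return [r for r in rows if r.quota is not None and r.used > r.quota]


if __name__ == "__main__":
    for entry in over_quota(parse_userspace(SAMPLE)):
        print(entry.name, entry.used - entry.quota)
```

Because enforcement may lag by several seconds and a quota may be set
below current usage, an entry with used greater than quota is a
legitimate state rather than an error, which is why the sketch reports
such entries instead of rejecting them.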