I originally started testing a prototype for an enterprise file service
implementation on our campus using S10U4. Scalability in terms of file
system count was poor: past a couple of thousand file systems, operations
started taking far too long.

I had thought a number of improvements had been made since then to help
performance and scalability when a large number of file systems exist.
I've been testing with SXCE (b97), which presumably has all of the
enhancements (and potentially then some) that will be available in U6,
and I'm still seeing very poor scalability once more than a few thousand
filesystems are created.

My test install is on an X4500: two of its 1 TB disks hold a ZFS root
pool, 44 are configured as mirror pairs belonging to one zpool, and the
last two are hot spares.
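
The data pool layout amounts to something like this (device names made
up; the real c#t#d# numbers on the box differ):

vdevs=""
for t in 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21; do
    vdevs="$vdevs mirror c0t${t}d0 c1t${t}d0"
done
# 22 mirror pairs (44 disks) plus the two hot spares
zpool create export $vdevs spare c6t0d0 c6t1d0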

At about 5000 filesystems, it starts taking over 30 seconds to
create/delete additional filesystems.

At 7848, over a minute:

# time zfs create export/user/test

real    1m22.950s
user    1m12.268s
sys     0m10.184s

I did a little experiment with truss:

# truss -c zfs create export/user/test2

syscall               seconds   calls  errors
_exit                    .000       1
read                     .004     892
open                     .023      67       2
close                    .001      80
brk                      .006     653
getpid                   .037    8598
mount                    .006       1
sysi86                   .000       1
ioctl                 115.534 31303678    7920
execve                   .000       1
fcntl                    .000      18
openat                   .000       2
mkdir                    .000       1
getppriv                 .000       1
getprivimplinfo          .000       1
issetugid                .000       4
sigaction                .000       1
sigfillset               .000       1
getcontext               .000       1
setustack                .000       1
mmap                     .000      78
munmap                   .000      28
xstat                    .000      65      21
lxstat                   .000       1       1
getrlimit                .000       1
memcntl                  .000      16
sysconfig                .000       5
lwp_sigmask              .000       2
lwp_private              .000       1
llseek                   .084   15819
door_info                .000      13
door_call                .103    8391
schedctl                 .000       1
resolvepath              .000      19
getdents64               .000       4
stat64                   .000       3
fstat64                  .000      98
zone_getattr             .000       1
zone_lookup              .000       2
                     --------  ------   ----
sys totals:           115.804 31338551   7944
usr time:             107.174
elapsed:              897.670


The majority of the time seems to be spent in ioctl calls, specifically:

ioctl(16, MNTIOC_GETMNTENT, 0x08045A60)         = 0
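
Since truss -c lumps all ioctls together, and each MNTIOC_GETMNTENT
presumably just returns the next /etc/mnttab entry, 31 million of them
against ~7800 mounts would mean the mount table is being rescanned
thousands of times for a single create. A quick way to confirm that this
request code dominates would be something like this (the aggregation keys
come out as raw request codes rather than symbolic names; the dataset
name here is just an example):

# dtrace -n 'syscall::ioctl:entry /pid == $target/ { @[arg1] = count(); }' \
    -c 'zfs create export/user/test3'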

Interestingly, I tested creating six filesystems simultaneously, which
took a total of only three minutes rather than the roughly nine minutes
it would have taken to create them sequentially. I'm not sure how
parallelizable I can make an identity management provisioning system,
though.
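
The parallel test was nothing fancy, just background jobs along these
lines (dataset names made up):

for u in ptest1 ptest2 ptest3 ptest4 ptest5 ptest6; do
    zfs create export/user/$u &
done
wait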

Was I mistaken about the increased scalability that was supposed to be
available? Is there anything I could configure differently to improve
this performance? We are going to need about 30,000 filesystems to cover
our faculty, staff, students, and group project directories. We do have
five X4500s that will be allocated to the task, so about 6,000
filesystems per server. Depending on the time of the quarter, our
identity management system can create hundreds to thousands of accounts,
and when we purge accounts quarterly we typically delete 10,000 or so.
Currently those jobs take only 2-6 hours; with this level of performance
from ZFS they would take days, if not over a week :(.

Thanks for any suggestions. What is the internal recommendation for the
maximum number of file systems per server?


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [EMAIL PROTECTED]
California State Polytechnic University  |  Pomona CA 91768
