Paul B. Henson wrote:
> <snip>
>
> At about 5000 filesystems, it starts taking over 30 seconds to
> create/delete additional filesystems.
>
> At 7848, over a minute:
>
> # time zfs create export/user/test
>
> real    1m22.950s
> user    1m12.268s
> sys     0m10.184s
>
> I did a little experiment with truss:
>
> # truss -c zfs create export/user/test2
>
> syscall               seconds     calls  errors
> _exit                    .000         1
> read                     .004       892
> open                     .023        67       2
> close                    .001        80
> brk                      .006       653
> getpid                   .037      8598
> mount                    .006         1
> sysi86                   .000         1
> ioctl                 115.534  31303678    7920
> execve                   .000         1
> fcntl                    .000        18
> openat                   .000         2
> mkdir                    .000         1
> getppriv                 .000         1
> getprivimplinfo          .000         1
> issetugid                .000         4
> sigaction                .000         1
> sigfillset               .000         1
> getcontext               .000         1
> setustack                .000         1
> mmap                     .000        78
> munmap                   .000        28
> xstat                    .000        65      21
> lxstat                   .000         1       1
> getrlimit                .000         1
> memcntl                  .000        16
> sysconfig                .000         5
> lwp_sigmask              .000         2
> lwp_private              .000         1
> llseek                   .084     15819
> door_info                .000        13
> door_call                .103      8391
> schedctl                 .000         1
> resolvepath              .000        19
> getdents64               .000         4
> stat64                   .000         3
> fstat64                  .000        98
> zone_getattr             .000         1
> zone_lookup              .000         2
>                      --------  --------    ----
> sys totals:           115.804  31338551    7944
> usr time:             107.174
> elapsed:              897.670
>
> and it seems the majority of time is spent in ioctl calls, specifically:
>
> ioctl(16, MNTIOC_GETMNTENT, 0x08045A60) = 0
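[If you want to confirm it really is that one ioctl dominating, you can narrow truss to ioctl calls only and count the MNTIOC_GETMNTENT hits; the dataset name below is just a placeholder, and the count should grow roughly in step with the number of mounted filesystems:]

```shell
# Trace only ioctl syscalls during a single create and count how many
# are MNTIOC_GETMNTENT (truss writes its trace to stderr).
truss -t ioctl zfs create export/user/test3 2>&1 \
  | grep -c MNTIOC_GETMNTENT
```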
Yes, the implementation of the above ioctl walks the list of mounted
filesystems, 'vfslist' [in this case it walks 5000 nodes of a linked list
before the ioctl returns]. This in-kernel traversal of the filesystems is
what is taking the time.

> Interestingly, I tested creating 6 filesystems simultaneously, which took
> a total of only three minutes, rather than the 9 minutes had they been
> created sequentially. I'm not sure how parallelizable I can make an
> identity management provisioning system though.
>
> Was I mistaken about the increased scalability that was going to be
> available? Is there anything I could configure differently to improve
> this performance?

You could set 'zfs set mountpoint=none <pool-name>' and then create the
filesystems under the <pool-name>. [In my experiments the number of
ioctl's went down drastically.] You could then set a mountpoint for the
pool and issue a 'zfs mount -a'.

Pramod

> We are going to need about 30,000 filesystems to cover our faculty,
> staff, students, and group project directories. We do have 5 x4500's
> which will be allocated to the task, so about 6000 filesystems per.
> Depending on what time of the quarter it is, our identity management
> system can create hundreds up to thousands of accounts, and when we
> purge accounts quarterly we typically delete 10,000 or so. Currently
> those jobs only take 2-6 hours; with this level of performance from
> ZFS they would take days if not over a week :(.
>
> Thanks for any suggestions. What is the internal recommendation on
> maximum number of file systems per server?

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
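[The workaround Pramod describes can be sketched as the following session; the pool name "export" and the user dataset names are placeholders, and this of course needs a real pool to run against:]

```shell
# 1. Disable mounting on the pool root while mass-creating datasets.
zfs set mountpoint=none export

# 2. Bulk create. Children inherit mountpoint=none, so each create
#    skips the mount step (and the mnttab walk that goes with it).
for u in user001 user002 user003; do
    zfs create export/user/$u
done

# 3. Restore the mountpoint on the pool root; descendants inherit it.
zfs set mountpoint=/export export

# 4. Mount everything in a single pass.
zfs mount -a
```

Note step 4 is 'zfs mount -a' rather than the 'zpool mount -a' in the reply above; zpool(1M) has no mount subcommand.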
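[The ~3x speedup the poster saw from 6 simultaneous creates can be reproduced with plain shell job control; again the dataset names are placeholders, and too much concurrency may just contend on the same in-kernel mount list:]

```shell
# Launch a batch of creates in the background, then wait for all of
# them before moving on to the next batch.
for u in user001 user002 user003 user004 user005 user006; do
    zfs create export/user/$u &
done
wait
```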