I originally started testing a prototype for an enterprise file service implementation on our campus using S10U4. Scalability in terms of file system count was pretty bad; anything over a couple of thousand file systems and operations started taking far too long.
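For concreteness, the scaling tests here and below all boil down to a loop along these lines (a rough sketch; the pool and file system names are placeholders):

#!/bin/ksh
# Populate the pool one file system at a time, timing each create so
# the slowdown is visible as the file system count grows.
integer i=1
while (( i <= 5000 )); do
    time zfs create export/user/test$i
    (( i = i + 1 ))
done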
I had thought a number of improvements/enhancements had been made since then to improve performance and scalability when a large number of file systems exist. I've been testing with SXCE (b97), which presumably has all of the enhancements (and potentially then some) that will be available in U6, and I'm still seeing very poor scalability once more than a few thousand file systems are created.

My test install is on an x4500, with two of the 1 TB disks as a ZFS root pool, 44 configured as mirror pairs belonging to one zpool, and the last two as hot spares. At about 5,000 file systems, it starts taking over 30 seconds to create or delete an additional file system. At 7,848, over a minute:

# time zfs create export/user/test

real    1m22.950s
user    1m12.268s
sys     0m10.184s

I did a little experiment with truss:

# truss -c zfs create export/user/test2

syscall               seconds   calls  errors
_exit                    .000       1
read                     .004     892
open                     .023      67       2
close                    .001      80
brk                      .006     653
getpid                   .037    8598
mount                    .006       1
sysi86                   .000       1
ioctl                 115.534 31303678    7920
execve                   .000       1
fcntl                    .000      18
openat                   .000       2
mkdir                    .000       1
getppriv                 .000       1
getprivimplinfo          .000       1
issetugid                .000       4
sigaction                .000       1
sigfillset               .000       1
getcontext               .000       1
setustack                .000       1
mmap                     .000      78
munmap                   .000      28
xstat                    .000      65      21
lxstat                   .000       1       1
getrlimit                .000       1
memcntl                  .000      16
sysconfig                .000       5
lwp_sigmask              .000       2
lwp_private              .000       1
llseek                   .084   15819
door_info                .000      13
door_call                .103    8391
schedctl                 .000       1
resolvepath              .000      19
getdents64               .000       4
stat64                   .000       3
fstat64                  .000      98
zone_getattr             .000       1
zone_lookup              .000       2
                     --------  ------    ----
sys totals:           115.804 31338551    7944
usr time:             107.174
elapsed:              897.670

It seems the majority of the time is spent in ioctl calls, specifically:

ioctl(16, MNTIOC_GETMNTENT, 0x08045A60) = 0

The ioctl count, over 31 million for a single create against only about 7,800 mounted file systems, suggests /etc/mnttab is being rescanned over and over.

Interestingly, I tested creating six file systems simultaneously, which took a total of only three minutes, rather than the nine minutes they would have taken created sequentially (see the sketch at the end of this message). I'm not sure how parallelizable I can make an identity management provisioning system, though.

Was I mistaken about the increased scalability that was going to be available? Is there anything I could configure differently to improve this performance?

We are going to need about 30,000 file systems to cover our faculty, staff, students, and group project directories. We do have five x4500s which will be allocated to the task, so about 6,000 file systems per server. Depending on what time of the quarter it is, our identity management system can create hundreds to thousands of accounts, and when we purge accounts quarterly we typically delete 10,000 or so. Currently those jobs take only 2-6 hours; with this level of performance from ZFS they would take days, if not over a week :(.

Thanks for any suggestions. What is the internal recommendation on the maximum number of file systems per server?

--
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | [EMAIL PROTECTED]
California State Polytechnic University | Pomona CA 91768
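P.S. For reference, the parallel create comparison above was along these lines (again a rough sketch, with placeholder file system names):

#!/bin/ksh
# Kick off six "zfs create" operations in the background and time the
# whole batch; run sequentially, the same six creates took ~9 minutes,
# in parallel ~3 minutes.
time (
    for fs in test1 test2 test3 test4 test5 test6; do
        zfs create export/user/$fs &
    done
    wait
)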