On Aug 11, 2009, at 7:39 AM, Ed Spencer wrote:


On Tue, 2009-08-11 at 07:58, Alex Lam S.L. wrote:
At first glance, your production server's numbers are looking fairly
similar to the "small file workload" results of your development
server.

I thought you were saying that the development server has faster performance?

The development server was running only one cp -pr command.

The production mail server was running two concurrent backup jobs and of
course the mail system, with each job having the same throughput as if it
were the only job running. The single-threaded backup jobs do not conflict
with each other over performance.

Agree.

If we ran 20 concurrent backup jobs, overall performance would scale up
quite a bit (I would guess between 5 and 10 times the performance). I
just read Mike's post and will do some 'concurrency' testing.

Yes.
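
A quick way to test that is to launch several single-threaded copy jobs
in parallel and watch the aggregate throughput. A minimal sketch (the
filesystem and target paths here are made up):

   # one cp per filesystem, run in the background, then wait for all of them
   for fs in /pool/mail01 /pool/mail02 /pool/mail03 /pool/mail04; do
       cp -pr $fs /backup/$(basename $fs) &
   done
   wait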

Users are currently evenly distributed over 5 filesystems. (I previously
mentioned 7, but it's really 5 filesystems for users and 1 for system
data, totalling 6, plus one test filesystem.)

We back up 2 filesystems on Tuesday, 2 on Thursday, and 2 on Saturday.
We back up to disk and then clone to tape. Our backup people can only
handle doing 2 filesystems per night.

Creating more filesystems to increase the parallelism of our backup is
one solution, but it's a major redesign of the mail system.

Really?  I presume this is because of the way you originally
allocated accounts to file systems.  Creating file systems in ZFS is
easy, so could you explain in a new thread?
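For example, something like this creates a new dataset and caps it with a
quota (the pool and dataset names below are hypothetical):

   zfs create mailpool/mail
   zfs create mailpool/mail/users06
   zfs set quota=200G mailpool/mail/users06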

Adding a second server to halve the pool, and thereby halve the problem,
is another solution (and we would also create more filesystems at the
same time).

I'm not convinced this is a good idea. It is a lot of work based on
the assumption that the server is the bottleneck.

Moving the pool to an FC SAN or a JBOD may also increase performance.
(Removing the layers introduced by the appliance, thereby increasing
performance.)

Disagree.

I suspect that if we 'rsync' one of these filesystems to a second
server/pool, we would also see a performance increase equal to what we
see on the development server. (I don't know how zfs send and receive
work, so I don't know whether they would address this "Filesystem
Entropy" or specifically reorganize the files and directories.) However,
when we created a testfs filesystem in the zfs pool on the production
server and copied data to it, we saw the same performance as the other
filesystems in the same pool.

Directory walkers, like NetBackup or rsync, will not scale well as
the number of files increases.  It doesn't matter which file system you
use; the scalability will look more or less similar. For millions of files,
ZFS send/receive works much better.  More details are in my paper.
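
Send/receive is snapshot-based, so it never has to walk the directory
tree. The usual pattern looks roughly like this (pool, snapshot, and host
names are hypothetical):

   # first pass: full stream
   zfs snapshot mailpool/users01@mon
   zfs send mailpool/users01@mon | ssh backuphost zfs receive backup/users01
   # later passes: incremental streams between snapshots
   zfs snapshot mailpool/users01@tue
   zfs send -i mailpool/users01@mon mailpool/users01@tue | \
       ssh backuphost zfs receive backup/users01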

We will have to do something to address the problem. A combination of
what I just listed is our probable course of action. (Much testing will
have to be done to ensure our solution addresses the problem, because we
are not 100% sure what is causing the performance degradation.) I'm also
dealing with Network Appliance to see if there is anything we can do at
the filer end to increase performance, but I'm holding out little hope.

DNLC hit rate?
Also, is atime on?
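You can check both without much effort; something along these lines on
Solaris (the dataset name is hypothetical):

   # DNLC hit rate: look for "total name lookups (cache hits NN%)"
   vmstat -s | grep 'name lookups'
   # raw DNLC counters, if you want to compute the rate yourself
   kstat -n dnlcstats
   # is atime enabled on the dataset?
   zfs get atime mailpool/users01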


But please, don't miss the point I'm trying to make. ZFS would benefit
from a utility or a background process that reorganizes files and
directories in the pool to optimize performance: a utility to deal with
Filesystem Entropy. Currently a zfs pool will live as long as the disks
it is on, without reorganization. That can be a long, long time. Not to
mention that slowly expanding the pool over time contributes to the
issue.

This does not come "for free" in either performance or risk. It will
do nothing to solve the directory walker's problem.

NB, people who use UFS don't tend to see this because UFS can't
handle millions of files.
 -- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
