Priming the cache for ZFS should work, at least right after
boot: when freemem is large, any block you read will make it
into the cache.  Post boot, when memory is already primed
with something else (what?), it gets more difficult for both
UFS and ZFS to guess what to keep in their caches.

Did you try priming ZFS after boot?

So next, you seem to suffer because your sequential writes to
log files appear to displace the more useful DB files from
the ARC (I'd be interested to see if this still occurs after
you've primed the ZFS cache after boot).

Note that if your logfile write rate is huge (dd-like) then
ZFS cache management will suffer, but that is well on its way
to being fixed.  For a DS, though, I would think that the log
rate would be more reasonable and that your storage is able
to keep up.  That gives ZFS cache management a fighting
chance to keep the reused data in preference to the
sequential writes.

If the default behavior is not working for you, we'll need
to look at the ARC behavior in this case; I don't see why it
should not work out of the box.  But manual control will also
come in the form of this DIO-like feature:

        6429855  Need way to tell ZFS that caching is a lost cause

While we work on solving your problem out of the box, you
might also run a background process that keeps re-priming
the cache at a low I/O rate, as sketched below.  Not a great
workaround, but it should be effective.
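
For example, a rough sketch of such a re-priming loop (the
/db path, block size and sleep intervals below are only
placeholders, to be tuned so the extra I/O stays negligible):

        #!/bin/sh
        # Re-read the DB files periodically at a throttled rate so
        # the ARC keeps seeing them as recently used.
        while true
        do
                for f in `find /db -name '*.db3'`
                do
                        dd if="$f" of=/dev/null bs=128k 2>/dev/null
                        sleep 1         # throttle between files
                done
                sleep 300               # pause between passes
        done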


-r



Brad Diggs writes:
 > Hello Darren,
 > 
 > Please find responses in line below...
 > 
 > On Fri, 2008-02-08 at 10:52 +0000, Darren J Moffat wrote:
 > > Brad Diggs wrote:
 > > > I would like to use ZFS but with ZFS I cannot prime the cache
 > > > and I don't have the ability to control what is in the cache 
 > > > (e.g. like with the directio UFS option).
 > > 
 > > Why do you believe you need that at all ?  
 > 
 > My application is a directory server.  The #1 resource that
 > the directory needs to make maximum use of is RAM.  In order
 > to do that, I want to control every aspect of RAM
 > utilization, both to safely use as much RAM as possible AND
 > to avoid contention among the things trying to use it.
 > 
 > Let's consider the following example.  A customer has a
 > 50M entry directory.  The sum of the data (db3 files) is
 > approximately 60GB.  In addition, there is another 2GB for
 > the root filesystem, 30GB for the changelog, 1GB for the
 > transaction logs, and 10GB for the informational logs.
 > 
 > The system on which the directory server will run has only
 > 64GB of RAM.  It is configured with the following
 > partitions:
 > 
 >   FS      Used(GB)  Description
 >    /      2         root
 >    /db    60        directory data
 >    /logs  41        changelog, txn logs, and info logs
 >    swap   10        system swap
 > 
 > I prefer to keep the directory db cache and entry cache
 > relatively small, so the db cache is 2GB and the entry
 > cache is 100MB.  That leaves roughly 62GB of RAM for my 60GB
 > of directory data and for Solaris.  The only way to ensure
 > that the directory data (/db) is the only thing in the
 > filesystem cache is to set directio on / (root) and /logs,
 > as sketched below.
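 > 
 > For example (only a sketch; the device names below are
 > placeholders), the non-data filesystems can be mounted with
 > the forcedirectio option via /etc/vfstab:
 > 
 >   /dev/dsk/c0t0d0s5  /dev/rdsk/c0t0d0s5  /logs  ufs  2  yes  forcedirectio
 > 
 > or turned on for an already-mounted filesystem with:
 > 
 >   mount -F ufs -o remount,forcedirectio /logs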
 > 
 > > What do you do to "prime" the cache with UFS 
 > 
 > cd <ds_instance_dir>/db
 > # read every db3 file once so it lands in the filesystem cache
 > for i in `find . -name '*.db3'`
 > do
 >   dd if="${i}" of=/dev/null
 > done
 > 
 > > and what benefit do you think it is giving you ?
 > 
 > Priming the directory server data into the filesystem cache
 > reduces LDAP response time for any data already in that
 > cache.  It can mean the difference between a sub-millisecond
 > response time and one on the order of tens or hundreds of
 > milliseconds, depending on the underlying storage speed.
 > For telcos in particular, minimal response time is
 > paramount.
 > 
 > Another common scenario is benchmark bakeoffs against
 > another vendor's product.  If the data isn't pre-primed,
 > then LDAP response time and throughput are artificially
 > degraded until the data has been primed into either the
 > filesystem cache or the directory (db or entry) caches.
 > Priming via LDAP operations can take many hours or even
 > days, depending on the number of entries in the directory
 > server, whereas priming the same data via dd takes minutes
 > to hours, depending on the size of the files.
 > 
 > As you know, in benchmarking scenarios time is the most
 > limited resource we typically have, so priming via dd is
 > much preferred.
 > 
 > Lastly, in order to achieve optimal use of available RAM, we
 > use directio for the root (/) and other non-data filesystems.
 > This makes certain that the only data in the filesystem cache
 > is the directory data.
 > 
 > > Have you tried just using ZFS and found it doesn't perform as you need 
 > > or are you assuming it won't because it doesn't have directio ?
 > 
 > We have done extensive testing with ZFS and love it.  The three
 > areas lacking for our use cases are as follows:
 >  * No ability to control what is in the cache, e.g. no directio.
 >  * No absolute way to put an upper bound on the amount of RAM
 >    consumed by ZFS.  I know that the ARC has a control that
 >    seems to work well (see the sketch after this list), but
 >    the ARC is only part of ZFS's RAM consumption.
 >  * No ability to rapidly prime the ZFS cache with the data
 >    that I want in the cache.
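 > 
 > For reference, one such control is the zfs_arc_max tunable
 > in /etc/system; a rough sketch, with the 4GB value purely a
 > placeholder:
 > 
 >   * cap the ARC at 4GB (takes effect on reboot)
 >   set zfs:zfs_arc_max = 0x100000000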
 > 
 > I hope that helps explain where I am coming from!
 > 
 > Brad
 > 

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
