I'd like to take this opportunity to post some general information on what I
picked up recently while using filebench to generate some workloads for file
system performance. This will be spread across several replies to this topic.
The filebench binaries (for SPARC and x86), as well as filebench source, are
available on sourceforge (www.sourceforge.net/projects/filebench). This
discussion, at least for now, will assume Solaris (either SPARC or x86) when
referring to directory and file names. That is, full pathnames referencing the
location of filebench files and directories assume a Solaris installation.
Everything else, such as running filebench, creating profiles, etc., is OS
independent.
The filebench package I'm currently using is:
solaris10> pkginfo -l filebench
PKGINST: filebench
NAME: FileBench
CATEGORY: application
ARCH: sparc,i386
VERSION: 20 Jul 05
BASEDIR: /opt
VENDOR: Richard Mc Dougall
DESC: FileBench
PSTAMP: 1.64.5_s10_x86_sparc_PRERELEASE
INSTDATE: Aug 01 2005 14:32
EMAIL: [EMAIL PROTECTED]
STATUS: completely installed
FILES: 266 installed pathnames
6 linked files
28 directories
67 executables
2 setuid/setgid executables
33368 blocks used (approx)
First - a couple of things to be aware of regarding the state of filebench and
its documentation. Aside from the output of "/opt/filebench/bin/filebench -h"
(which is actually very useful - try it!), there are two text files in
filebench/docs, README and README.benchpoint. That's pretty much it for
filebench documentation right now. As we work on fixing that, we will use
this forum to share information and help close the documentation gap. We
encourage others to share what they have learned, and of course to post
questions here.
The README.benchpoint file may cause some confusion. Benchpoint, so far as I
can tell, is a generic name used to describe the filebench user environment
and/or wrapper script (I think...I was not privy to this work, so I may be
wrong). The important thing to note is that the README.benchpoint file refers
to directories in the filebench hierarchy that do not exist - specifically,
benchpoint/config. The current filebench distribution includes a config
directory under /opt/filebench (/opt/filebench/config), which is where the
profiles (.prof files) and function files (.func) reside. Also, there is a
benchpoint script in /opt/filebench/bin, which simply informs the user that
runbench should be used instead of benchpoint. So I think it's fair to assume
that we can remove references to benchpoint in filebench, and focus on what the
current distribution provides, and how to use it.
The top level filebench directory (/opt/filebench on Solaris) will contain 6
subdirectories.
bin - filebench command binaries.
config - preconfigured profiles (.prof) and function (.func) files, to get
things started.
docs - README and README.benchpoint. More to follow.
scripts - various system stat gathering scripts, and other scripted components
of the filebench framework.
workloads - configured flowop (.f) files, designed to simulate specific
workloads.
xanadu - A framework for post-processing results data, including system
statistics.
A few additional points need to be made on the workloads and xanadu
directories...
The .f files in the workloads directory are written in the 'f' language, a
workload modeling language developed as part of the filebench framework. It is
designed to take known attributes of a workload and represent them in a
descriptive language that is then used to generate load. The syntax, semantics
and grammar rules for writing custom .f scripts are not yet documented.
/opt/filebench/bin/filebench -h will output a summary of the 'f' language
definition. That, coupled with reading existing .f scripts, is sometimes
enough to begin creating your own custom workloads. But we still need to get
this documented.
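As a quick taste, here is a minimal sketch of a single-threaded sequential
read workload, pieced together from the 'f' language summary reproduced at the
end of this post. The file name, sizes, sleep time and iteration count are
arbitrary choices for illustration, so treat it as a sketch rather than
canonical:

#!/opt/filebench/bin/filebench -f
# Illustrative values throughout; only the constructs come from filebench -h.
set $dir=/tmp
define file name=seqfile,path=$dir/seqfile,size=100m,prealloc,reuse
define process name=seq-read,instances=1
{
thread name=seq-read-thread,memsize=10m,instances=1
{
flowop read name=seq-read1,filename=seqfile,iosize=1m,iters=1024
}
}
create files
create processes
stats clear
sleep 30
stats snap
quit

Every construct used here (define file, the process/thread nesting, the read
flowop, and the create/stats/sleep/quit commands) appears in the language
summary at the end of this post.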
The workloads in the workloads directory fall into one of two categories -
micro and macro. The micro workloads are those named filemicro_XXX.f. As their
name implies, these workloads represent lower-level file IO operations that
are typically a subset of what a macro workload does. The macro workloads,
such as oltp.f and varmail.f, use descriptive names to represent the type of
workload they are designed to generate.
Xanadu is a framework developed in the Performance & Availability Engineering
(PAE) group at Sun (the group Richard and I work for). It's a very slick set of
tools designed to grab loads of benchmark-generated output (including the
output of system stat tools like mpstat(1)), and provide summary tables and
graphs of the data. Additionally, Xanadu is designed to make it easy to compare
data from different benchmark runs. We will further explore using Xanadu in the
next post.
With that, the information below was created by Richard McDougall. It is a
cut-and-paste from Richard's internal (Sun internal) WEB site, and provides a
great jumpstart to using filebench.
Installing
The workspace is at /home/rmc/ws/filebench-gate; the gate log and a download
of the filebench package (Filebench) were linked from the original page.
# zcat 1.28_s10_x86_sparc.tar.Z |tar xvf -
...
# pkgadd -d 1.28_s10_x86_sparc
The following packages are available:
1 filebench FileBench
(sparc,i386) 15 Oct 04
Select package(s) you wish to process (or 'all' to process
all packages). (default: all) [?,??,q]: 1
Processing package instance <filebench> from <1.28_s10_x86_sparc>
## Processing package information.
## Processing system information.
10 package pathnames are already properly installed.
## Verifying disk space requirements.
## Checking for conflicts with packages already installed.
Installing FileBench as <filebench>
## Installing part 1 of 1.
/opt/filebench/README
/opt/filebench/README.benchpoint
/opt/filebench/bin/Comm.pl
/opt/filebench/bin/benchpoint
/opt/filebench/bin/i386/fastsu
/opt/filebench/bin/i386/filebench
/opt/filebench/bin/i386/gnuplot
/opt/filebench/bin/sparcv9/fastsu
/opt/filebench/bin/sparcv9/filebench
/opt/filebench/bin/sparcv9/gnuplot
/opt/filebench/runscripts/runem.sh
/opt/filebench/tools/collect_iostat
/opt/filebench/tools/collect_lockstat
/opt/filebench/tools/collect_vmstat
/opt/filebench/tools/filebench_plot
/opt/filebench/tools/filebench_summary
/opt/filebench/workloads/bringover.f
/opt/filebench/workloads/createfiles.f
/opt/filebench/workloads/deletefiles.f
/opt/filebench/workloads/fileserver.f
/opt/filebench/workloads/multistreamread.f
/opt/filebench/workloads/oltp.f
/opt/filebench/workloads/postmark.f
/opt/filebench/workloads/randomread.f
/opt/filebench/workloads/singlestreamread.f
/opt/filebench/workloads/varmail.f
/opt/filebench/workloads/webproxy.f
/opt/filebench/workloads/webserver.f
[ verifying class <none> ]
Installation of <filebench> was successful.
Getting Started Running Filebench
Example varmail run:
The workloads have been encapsulated in workload definition files
and the environment is now quite simple to drive.
There are simple workload personalities, which configure the type
of workload to simulate. An example is a /var/mail directory
simulation (like postmark):
$ /opt/filebench/bin/filebench
filebench> load varmail
8395: 3.898: Varmail personality successfully loaded
8395: 3.899: Usage: set $dir=<dir>
8395: 3.900: set $filesize=<size> defaults to 16384
8395: 3.900: set $nfiles=<value> defaults to 1000
8395: 3.901: set $dirwidth=<value> defaults to 20
8395: 3.901: set $nthreads=<value> defaults to 1
8395: 3.902: set $meaniosize=<value> defaults to 16384
8395: 3.902: run <runtime>
filebench> set $dir=/tmp
filebench> run 10
8395: 14.886: Fileset mailset: 1000 files, avg dir = 20, avg depth = 2.305865,
mbytes=15
8395: 15.301: Preallocated fileset mailset in 1 seconds
8395: 15.301: Starting 1 filereader instances
8396: 16.313: Starting 1 filereaderthread threads
8395: 19.323: Running for 10 seconds...
8395: 29.333: Stats period = 10s
8395: 29.347: IO Summary: 21272 iops 2126.0 iops/s, (1063/1063 r/w)
32.1mb/s, 338us cpu/op, 0.3ms latency
8395: 29.348: Shutting down processes
filebench> stats dump stats.varmail
filebench> quit
This run did 21272 logical operations over the 10 second run - about 2126 per
second - where a logical operation is e.g. an open, close, read or write.
$ more stats.varmail
Flowop totals:
closefile4 492ops/s 0.0mb/s 0.0ms/op 31us/op-cpu
readfile4 492ops/s 7.1mb/s 0.6ms/op 171us/op-cpu
openfile4 492ops/s 0.0mb/s 5.4ms/op 653us/op-cpu
closefile3 492ops/s 0.0mb/s 1.1ms/op 150us/op-cpu
appendfilerand3 492ops/s 7.7mb/s 0.0ms/op 55us/op-cpu
readfile3 492ops/s 7.2mb/s 0.6ms/op 163us/op-cpu
openfile3 492ops/s 0.0mb/s 5.1ms/op 623us/op-cpu
closefile2 492ops/s 0.0mb/s 0.9ms/op 136us/op-cpu
appendfilerand2 492ops/s 7.7mb/s 0.1ms/op 67us/op-cpu
createfile2 492ops/s 0.0mb/s 8.9ms/op 1151us/op-cpu
deletefile1 492ops/s 0.0mb/s 8.4ms/op 1007us/op-cpu
IO Summary: 649116 iops 5409.0 iops/s, 983/983 r/w 29.7mb/s,
1461us cpu/op
Running multiple workloads automatically
$ mkdir mybench
$ cd mybench
$ cp /opt/filebench/config/filemicro.prof mybench.prof
or
$ cp /opt/filebench/config/filemacro.prof mybench.prof
$ vi mybench.prof
set your stats dir to where you want the logs to go...
$ /opt/filebench/bin/benchpoint mybench
$ browse the index.html in the stats dir
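For reference, the .prof files are organized as a DEFAULTS section plus one
CONFIG stanza per workload run, along these lines. This sketch is from memory
of the shipped profiles and the attribute values are invented, so verify
against the real files in /opt/filebench/config before relying on it:

DEFAULTS {
        runtime = 120;
        dir = /tmp/filebench;
        stats = /tmp/mystats;
        filesystem = tmpfs;
        description = "mybench runs";
}

CONFIG randomread8k {
        function = generic;
        personality = randomread;
        filesize = 1g;
        iosize = 8k;
        nthreads = 1;
}

The stats entry in DEFAULTS is the "stats dir" mentioned above, and is where
the logs and the generated index.html land. Note also that in the current
package the benchpoint script just points you at runbench, so substitute
runbench in the invocation above (I'm assuming it takes the profile name the
same way).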
Application-Emulation Workloads Currently Available in Filebench package
varmail
A /var/mail NFS mail server emulation, following the workload of
postmark, but multi-threaded. The workload consists of a multi-threaded set of
open/read/close, open/append/close and delete operations in a single
directory.
fileserver
A file system workload, similar to SPECsfs. This workload performs a
sequence of creates, deletes, appends, reads, writes and attribute operations
on the file system. A configurable hierarchical directory structure is used for
the file set.
oltp
A database emulator. This workload performs transactions against a
filesystem using an I/O model from Oracle 9i. It tests the performance of
small random reads and writes, and is sensitive to the latency of moderate
(128k+) synchronous writes, as to the log file. By default it launches 200
reader processes, 10 processes for asynchronous writing, and a log writer. The
emulation includes the use of ISM shared memory, as per Oracle, Sybase, etc.,
which is critical to I/O efficiency (as_lock optimizations). See the example
run following this list.
dss
DSS Database (not yet in pkg)
webserver
A mix of open/read/close of multiple files in a directory tree, plus a
file append (to simulate the web log). 100 threads are used by default. 16k is
appended to the weblog for every 10 reads.
webproxy
A mix of create/write/close, open/read/close, delete of multiple files in
a directory tree, plus a file append (to simulate the proxy log). 100 threads
are used by default. 16k is appended to the log for every 10 read/writes.
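As with the varmail walkthrough above, any of these can be driven
interactively. For example, an oltp run applying the small-configuration
$filesize recommendation from later in this post (the /testfs path and the 60
second runtime are arbitrary choices for illustration):

$ /opt/filebench/bin/filebench
filebench> load oltp
filebench> set $dir=/testfs
filebench> set $filesize=1g
filebench> run 60
filebench> stats dump stats.oltp
filebench> quit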
Micro-benchmark Workloads Currently Available in Filebench package
copyfiles
A copy of a large directory tree. This workload creates a hierarchical
directory tree, then measures the rate at which files can be copied from the
source tree to a new tree. A single thread is used by default, although this is
configurable.
createfiles
Create a directory tree and populate it with files of specified sizes.
File sizes are chosen according to a gamma distribution with a shape parameter
of 1.5 and a mean size of 16k.
randomread
A multi-threaded read of a single large file, defaulting to 8k reads. A
single thread is used by default, although this is configurable via $nthreads
(see the example after this list).
randomwrite
A multi-threaded write of a single large file, defaulting to 8k writes. A
single thread is used by default, although configurable by $nthreads.
singlestreamread
A sequential read of a large file. 1MB reads are used by default.
singlestreamwrite
A sequential write of a large file. 1MB writes are used by default.
multistreamread
A sequential read of 4 large files, each with its own reader thread.
1MB reads are used by default.
multistreamwrite
A sequential write of 4 large files, each with its own writer thread.
1MB writes are used by default.
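Likewise for the micro workloads - here is a randomread run raising the thread
count via $nthreads, as the description above mentions (again, /testfs and the
thread count are illustrative):

$ /opt/filebench/bin/filebench
filebench> load randomread
filebench> set $dir=/testfs
filebench> set $nthreads=16
filebench> run 60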
Recommended parameters
Recommended defaults for small configurations (1 disk):
For fileserver, bringover, createfiles, deletefiles, varmail, webproxy,
webserver:
set $nfiles=50000
For randomread, singlestreamread, multistreamread, singlestreamwrite,
multistreamwrite:
set $filesize=1g
For oltp:
set $filesize=1g
Recommended defaults for large configurations (20+ disks):
For fileserver, bringover, createfiles, deletefiles, varmail, webproxy,
webserver:
set $nfiles=100000
For randomread, randomwrite:
set $filesize=5g
set $nthreads=256
For singlestreamread, multistreamread, singlestreamwrite, multistreamwrite:
set $filesize=5g
For oltp:
set $filesize=5g
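Putting the recommendations to work, a small-configuration fileserver run
would be set up like this interactively (the $dir value is just whatever
filesystem you want to exercise):

$ /opt/filebench/bin/filebench
filebench> load fileserver
filebench> set $dir=/testfs
filebench> set $nfiles=50000
filebench> run 60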
Workload Model Language Example (Subject to lots of change at the moment)
#!/home/rmc/ws/filebench-rmc/src/filebench/filebench -f
debug 1
define file
name=bigfile1,path=$datadir1/myrandfile,size=50g,prealloc,reuse,paralloc
define process name=rand-read,instances=1
{
thread name=rand-thread,memsize=10m,instances=100
{
flowop read name=rand-read1,filename=bigfile1,iosize=$iosize,random,directio
flowop eventlimit name=rand-rate
}
}
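Judging from the #! line above and the -f flag in it, a saved script can be
run either directly (once made executable) or by handing it to filebench
explicitly. The file name here is just an assumed example:

$ chmod +x rand-read.f
$ ./rand-read.f

or

$ /opt/filebench/bin/filebench -f rand-read.f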
Workload Model Language Spec (Subject to lots of change at the moment)
Usage:
filebench: interpret f script and generate file workload
Options:
[-h] Display verbose help
[-p] Disable opening /proc to set uacct to enable truss
'f' language definition:
Variables:
set $var = value
$var - regular variables
${var} - internal special variables
$(var) - environment variables
define file name=<name>,path=<pathname>,size=<size>
[,paralloc]
[,prealloc]
[,reuse]
define fileset name=<name>,path=<pathname>,entries=<number>
[,dirwidth=<width>]
[,dirgamma=<100-10000>] (Gamma * 1000)
[,sizegamma=<100-10000>] (Gamma * 1000)
[,prealloc=[percent]]
define process name=<name>[,instances=<count>]
{
thread ...
thread ...
thread ...
}
thread name=<name>[,instances=<count>]
{
flowop ...
flowop ...
flowop ...
}
flowop [aiowrite|write|read] name=<name>,
filename=<fileobjname>,
iosize=<size>
[,directio]
[,dsync]
[,iters=<count>]
[,random]
[,workingset=<size>]
flowop aiowait name=<name>,target=<aiowrite-flowop>
flowop sempost name=<name>,target=<semblock-flowop>,
value=<increment-to-post>
flowop semblock name=<name>,value=<decrement-to-receive>,
highwater=<inbound-queue-max>
flowop block name=<name>
flowop hog name=<name>,value=<number-of-mem-ops>
flowop wakeup name=<name>,target=<block-flowop>
flowop eventlimit name=<name>
flowop bwlimit name=<name>
flowop [readfile|writefile] name=<name>,
filename=<fileobjname|filesetname>
Commands:
eventgen rate=<rate>
create [files|processes]
stats [clear|snap]
stats command "shell command $var1,$var2..."
stats directory <directory>
sleep <sleep-value>
quit
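One pairing in this grammar worth calling out: the eventgen command sets a
global event rate, and a flowop eventlimit inside a thread throttles that
thread to the generated rate - which is how the rand-read example above would
be rate-limited. A sketch, reusing the bigfile1 definition from that example
(the 1000 ops/s rate is arbitrary, and the eventgen/eventlimit interaction is
my inference from the names and the earlier example, not from documentation):

eventgen rate=1000
define process name=rand-read,instances=1
{
thread name=rand-thread,memsize=10m,instances=1
{
flowop read name=rand-read1,filename=bigfile1,iosize=8k,random
flowop eventlimit name=rand-rate
}
}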