Keith M Wesolowski <[EMAIL PROTECTED]> wrote:

> On Fri, Jul 15, 2005 at 06:58:24PM -0700, Jake Hamby wrote:
>
> > I just did a quick performance test of bsdtar vs. gnutar, star, and Solaris
> > tar in extracting a large (704MB uncompressed) .tar.bz2 archive:
>
> In the case of compressed files, especially bzip2, decompression time
> dominates. You should consider testing uncompressed files as well if
> you want to help isolate tar from bzip2. You could also determine
> whether each tar's decompress-and-unpack option is faster or slower
> than piping it the output of bzcat. And you definitely need to
> discard the first run, since some extra time is needed the first time
> to read the file into memory from media (or you need to use a file
> that's many times the size of memory, but that may be impossible).
>
> I'm not saying your results are wrong, but quick and dirty benchmark
> results can be misleading.
Doing meaningful benchmarks for tar is really hard. In general, you are
right: you should never include the compression time, and you should not
use the timing results from the first run. But there is much more:

- CPU usage, and to some extent speed, depend on the tar format used.
  With the historic tar format and plain files, star needs about 25%
  less user CPU time than GNU tar. If you compare wall clock time,
  GNU tar and star are roughly the same speed when you write the
  output into a file.

- If you archive sparse files (files with holes) and the OS does not
  include support for backing up sparse files (as on Linux), then star
  is roughly 4x faster than GNU tar. If the OS includes such support,
  star may be up to 1000 times faster than GNU tar.

- If you write to a real tape device, things become different. A tape
  needs the data to be delivered at a constant speed. If the data from
  tar arrives more slowly than the tape drive needs it, the drive will
  constantly move the media forwards and backwards, resulting in low
  speed and media degradation. Star includes a FIFO that buffers data:
  it fills up while star can easily keep up with the required speed and
  drains when star hits parts of the filesystem that are slow to read.
  On a modern OS, star may easily be configured to use 128 MB of FIFO.
  This may cause star to be twice as fast as GNU tar.

- The remote tape implementation in GNU tar is much slower than the
  remote tape implementation in star. Star is about 4x faster than
  GNU tar when in remote mode.

- Star typically consumes a bit more system CPU time than other tar
  implementations because the fork() and the synchronizing of the two
  processes take some time.

- Star consumes significantly less user CPU time than other tar
  implementations when in create mode.
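The measurement advice above (discard the first run, and look at user and
system CPU time separately from wall clock time) can be sketched as a small
Python harness. This is only an illustration, not something any of the tar
implementations ship: the names time_command and benchmark are made up here,
the resource module restricts it to Unix-like systems, and the command to
measure is whatever you pass in. To keep decompression out of the numbers,
point it at an archive that is already uncompressed.

```python
import resource
import statistics
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once and return (wall, user, sys) seconds for the child."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    # Discard the command's stdout so terminal output does not skew timing.
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # RUSAGE_CHILDREN only covers children that have been waited for;
    # subprocess.run() waits, so the difference belongs to this run.
    return (wall,
            after.ru_utime - before.ru_utime,
            after.ru_stime - before.ru_stime)


def benchmark(argv, runs=5, discard=1):
    """Time argv `runs` times, drop the first `discard` runs, return medians."""
    if discard >= runs:
        raise ValueError("need at least one run after discarding warm-up runs")
    samples = [time_command(argv) for _ in range(runs)]
    kept = samples[discard:]  # the first run also pays for reading from media
    wall, user, system = zip(*kept)
    return {
        "wall": statistics.median(wall),
        "user": statistics.median(user),
        "sys": statistics.median(system),
    }


if __name__ == "__main__":
    # Hypothetical usage: python bench.py <command and arguments to measure>
    # Falls back to a trivial command when no arguments are given.
    command = sys.argv[1:] or [sys.executable, "-c", "pass"]
    for name, seconds in benchmark(command).items():
        print(f"{name:>4}: {seconds:.3f} s")
```

Reporting the three numbers separately matters for the points above: two
programs can show the same wall clock time while differing clearly in user
or system CPU time. Discarding the first run only removes the cost of the
initial read from media; it does not help if the archive is larger than
memory.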
As an example of how the format affects CPU time: if you have lots of
small files and archive them in POSIX.1-2001 tar format (usually called
pax), then star needs about three times as much user CPU time as when you
choose the POSIX.1-1988 format (also known as ustar). However, star in
POSIX.1-2001 mode still only needs about the same user CPU time as GNU tar
in POSIX.1-1988 mode.

A tar program that uses significantly less CPU time may help to keep the
tape streaming, as it does not take as many resources from the system as
another program would.

Jörg

-- 
 EMail:[EMAIL PROTECTED] (home) Jörg Schilling D-13353 Berlin
       [EMAIL PROTECTED] (uni)
       [EMAIL PROTECTED] (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
_______________________________________________
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org