Re: [zfs-discuss] Re: disk write cache, redux

2006-06-15 Thread Roch Bourbonnais - Performance Engineering
I'm puzzled by 2 things. Naively, I'd think a write cache should not help a throughput test, since the cache should fill up, after which you should still be throttled by the physical drain rate. You clearly show that it helps; does anyone know why/how a cache helps throughput? And the second thing...
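
For anyone wanting to reproduce the comparison, a minimal sketch (device name is illustrative; on Solaris the drive cache is toggled from format's expert mode):

    # Toggle the drive's write cache in format expert mode
    # (menu path: cache -> write_cache -> enable|disable),
    # then time a large sequential write; c1t0d0 is illustrative:
    format -e c1t0d0
    ptime dd if=/dev/zero of=/dev/rdsk/c1t0d0s0 bs=128k count=8192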

Re: [zfs-discuss] Re: 3510 configuration for ZFS

2006-06-02 Thread Roch Bourbonnais - Performance Engineering
Tao Chen writes: > Hello Robert, > > On 6/1/06, Robert Milkowski <[EMAIL PROTECTED]> wrote: > > Hello Anton, > > > > Thursday, June 1, 2006, 5:27:24 PM, you wrote: > > > > ABR> What about small random writes? Won't those also require reading > > ABR> from all disks in RAID-Z to read the
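
A back-of-envelope version of Anton's concern; the per-spindle figure below is an assumption for illustration, not from the thread:

    # One small logical read on RAID-Z touches every disk in the group,
    # so a 3+1 group delivers roughly one spindle's worth of random IOPS.
    # Illustrative figures, assuming ~200 random IOPS per spindle:
    echo "raidz 3+1 group   : ~200 random-read IOPS"
    echo "2 x 2-way mirrors : ~800 random-read IOPS (4 spindles x 200)"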

Re: [zfs-discuss] question about ZFS performance for webserving/java

2006-06-02 Thread Roch Bourbonnais - Performance Engineering
You propose ((2-way mirrored) x RAID-Z (3+1)). That gives you 3 data disks' worth of capacity, and you'd have to lose both disks in each of 2 mirrors (4 in total) to lose data. For the random read load you describe, I'd expect the per-device cache to work nicely; that is, file blocks stored at some given
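
If the 2-way mirroring is done below ZFS (say, in the array) and exported as four LUNs, the layout might be expressed like this; device names are illustrative:

    # Four LUNs, each already a 2-way mirror in the array,
    # combined into a RAID-Z (3+1) group:
    zpool create tank raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0
    zpool status tank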

Re: Re[4]: [zfs-discuss] Re: Big IOs overhead due to ZFS?

2006-06-01 Thread Roch Bourbonnais - Performance Engineering
Robert Milkowski writes: > > btw: just a quick thought - why not write one block on only 2 disks > (+ checksum on one disk) instead of spreading one fs block across N-1 > disks? That way zfs could read many fs blocks at the same time in case > of larger raid-z pools. That's what y

Re: [zfs-discuss] Re: [osol-discuss] Re: I wish Sun would open-source "QFS"... / was: Re: Re: Distributed File System for Solaris

2006-05-31 Thread Roch Bourbonnais - Performance Engineering
> I think ZFS should do fine in streaming mode also, though there are > currently some shortcomings, such as the mentioned 128K I/O size. It may eventually. The lack of direct I/O may also be an issue, since some of our systems don't have enough main memory bandwidth to support data be

Re: [zfs-discuss] Re: [osol-discuss] Re: I wish Sun would open-source "QFS"... / was: Re: Re: Distributed File System for Solaris

2006-05-31 Thread Roch Bourbonnais - Performance Engineering
Anton wrote: (For what it's worth, the current 128K-per-I/O policy of ZFS really hurts its performance for large writes. I imagine this would not be too difficult to fix if we allowed multiple 128K blocks to be allocated as a group.) I'm not taking a stance on this, but if I keep a co

Re: [zfs-discuss] 3510 configuration for ZFS

2006-05-31 Thread Roch Bourbonnais - Performance Engineering
Hi Grant, this may provide some guidance for your setup; it's somewhat theoretical (take it for what it's worth) but it spells out some of the tradeoffs in the RAID-Z vs Mirror battle: http://blogs.sun.com/roller/page/roch?entry=when_to_and_not_to As for serving NFS, the user e

Re: [zfs-discuss] hard drive write cache

2006-05-29 Thread Roch Bourbonnais - Performance Engineering
Chris Csanady writes: > On 5/26/06, Bart Smaalders <[EMAIL PROTECTED]> wrote: > > > > There are two failure modes associated with disk write caches: > > Failure modes aside, is there any benefit to a write cache when command > queueing is available? It seems that the primary advantage is i

Re: [zfs-discuss] Sequentiality & direct access to a file

2006-05-26 Thread Roch Bourbonnais - Performance Engineering
Scott Dickson writes: > How does (or does) ZFS maintain sequentiality of the blocks of a file? > If I mkfile on a clean UFS, I likely will get contiguous blocks for my > file, right? A customer I talked to recently has a desire to access you would get up to maxcontig worth of sequential b
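
To see the maxcontig in play on a given UFS, a quick sketch (slice name illustrative):

    # Inspect the UFS cluster size that governs sequential allocation:
    fstyp -v /dev/rdsk/c0t0d0s6 | grep maxcontig
    # It can be retuned afterwards (value in blocks, illustrative):
    tunefs -a 32 /dev/rdsk/c0t0d0s6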

Re: [zfs-discuss] ZFS and databases

2006-05-22 Thread Roch Bourbonnais - Performance Engineering
Cool, I'll try the tool; and for good measure, the data I posted was sequential access (from a logical point of view). As for the physical layout, I don't know; it's quite possible that ZFS has laid out all blocks sequentially on the physical side, so certainly this is not a good way

Re: [zfs-discuss] ZFS and databases

2006-05-22 Thread Roch Bourbonnais - Performance Engineering
Gregory Shaw writes: > Rich, correct me if I'm wrong, but here's the scenario I was thinking > of: > > - A large file is created. > - Over time, the file grows and shrinks. > > The anticipated layout on disk due to this is that extents are > allocated as the file changes. The extent

Re: Re[7]: [zfs-discuss] Re: Re: Due to 128KB limit in ZFS it can't saturate disks

2006-05-22 Thread Roch Bourbonnais - Performance Engineering
Robert says: Just to be sure - you did reconfigure the system to actually allow larger IO sizes? Sure enough, I messed up (I had no tuning to get the above data); so 1 MB was my max transfer size. Using 8MB I now see: Bytes Elapse of phys IO Size Sent 8 MB; 357
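
For reference, the kind of tuning that allows physical transfers beyond the default, as a sketch (the exact tunable depends on the disk driver):

    * /etc/system entries (reboot required); values illustrative:
    set maxphys=8388608
    * for fibre-channel disks driven by ssd:
    set ssd:ssd_max_xfer_size=8388608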

Re: Re[7]: [zfs-discuss] Re: Re: Due to 128KB limit in ZFS it can't saturate disks

2006-05-19 Thread Roch Bourbonnais - Performance Engineering
Robert Milkowski writes: > Hello Roch, > > Monday, May 15, 2006, 3:23:14 PM, you wrote: > > RBPE> The question put forth is whether the ZFS 128K blocksize is sufficient > RBPE> to saturate a regular disk. There is a great body of evidence that shows > RBPE> that the bigger the write sizes a

Re: [zfs-discuss] Re: Re[5]: Re: Re: Due to 128KB limit in ZFS it can'tsaturate disks

2006-05-16 Thread Roch Bourbonnais - Performance Engineering
Anton B. Rang writes: > One issue is what we mean by "saturation." It's easy to bring a disk to 100% busy. We need to keep this discussion in the context of a workload. Generally when people care about streaming throughput of a disk, it's because they are reading or writing a single large file

Re: [zfs-discuss] ZFS and databases

2006-05-15 Thread Roch Bourbonnais - Performance Engineering
Gregory Shaw writes: > I really like the below idea: > - the ability to defragment a file 'live'. > > I can see instances where that could be very useful. For instance, > if you have multiple LUNs (or spindles, whatever) using ZFS, you > could re-optimize large files to spre

Re: Re[5]: [zfs-discuss] Re: Re: Due to 128KB limit in ZFS it can't saturate disks

2006-05-15 Thread Roch Bourbonnais - Performance Engineering
The question put forth is whether the ZFS 128K blocksize is sufficient to saturate a regular disk. There is a great body of evidence showing that bigger write sizes, with a matching large FS cluster size, lead to more throughput. The counterpoint is that ZFS schedules its I/O like nothing

Re: [zfs-discuss] Re: ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Nicolas Williams writes: > On Fri, May 12, 2006 at 05:23:53PM +0200, Roch Bourbonnais - Performance > Engineering wrote: > > For read it is an interesting concept. Since > > > >Reading into cache > >Then copy into user space > >th

Re: Re[2]: [zfs-discuss] Re: Re: Due to 128KB limit in ZFS it can't saturate disks

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Robert Milkowski writes: > Hello Roch, > > Friday, May 12, 2006, 2:28:59 PM, you wrote: > > RBPE> Hi Robert, > > RBPE> Could you try 35 concurrent dd each issuing 128K I/O ? > RBPE> That would be closer to how ZFS would behave. > > You mean to UFS? > > ok, I did try and I get about

Re: [zfs-discuss] Re: ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Anton B. Rang writes: > >Were the benefits coming from extra concurrency (no > >single writer lock) or avoiding the extra copy to page cache or > >from too much readahead that is not used before pages need to > >be recycled. > > With QFS, a major benefit we see for databases and direct I/

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Franz Haberhauer writes: > > 'ZFS optimizes random writes versus potential sequential reads.' > > This remark focused on the allocation policy during writes, > not the readahead that occurs during reads. > Data that are rewritten randomly but in place in a sequential, > contiguous file (l

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Peter Rival writes: > Roch Bourbonnais - Performance Engineering wrote: > > Tao Chen writes: > > > On 5/12/06, Roch Bourbonnais - Performance Engineering > > > <[EMAIL PROTECTED]> wrote: > > > > > > > > From: Gregory Shaw <[

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
You could start with the ARC paper, Megiddo/Modha FAST'03 conference. ZFS uses a variation of that. It's an interesting read. -r Franz Haberhauer writes: > Gregory Shaw wrote On 05/11/06 21:15,: > > Regarding directio and quickio, is there a way with ZFS to skip the > > system buffer cache?

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
'ZFS optimizes random writes versus potential sequential reads.' Now I don't think the current readahead code is where we want it to be yet but, in the same way that enough concurrent 128K I/O can saturate a disk (I sure hope that Milkowski's data will confirm this, ot

Re: [zfs-discuss] Re: Re: Due to 128KB limit in ZFS it can't saturate disks

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Hi Robert, Could you try 35 concurrent dd each issuing 128K I/O ? That would be closer to how ZFS would behave. -r Robert Milkowski writes: > Well I have just tested UFS on the same disk. > > bash-3.00# newfs -v /dev/rdsk/c5t50E0119495A0d0s0 > newfs: construct a new file system /dev/rd
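
A sketch of the suggested experiment, reusing the device from the newfs above; counts and offsets are illustrative, with each stream writing its own region:

    #!/bin/sh
    # 35 concurrent dd writers, each issuing 128K I/Os:
    i=0
    while [ $i -lt 35 ]; do
        dd if=/dev/zero of=/dev/rdsk/c5t50E0119495A0d0s0 \
            bs=128k count=1024 oseek=`expr $i \* 1024` &
        i=`expr $i + 1`
    done
    wait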

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Tao Chen writes: > On 5/12/06, Roch Bourbonnais - Performance Engineering > <[EMAIL PROTECTED]> wrote: > > > > From: Gregory Shaw <[EMAIL PROTECTED]> > > Regarding directio and quickio, is there a way with ZFS to skip the > > system buffe

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Tao Chen writes: > On 5/11/06, Peter Rival <[EMAIL PROTECTED]> wrote: > > Richard Elling wrote: > > > Oracle will zero-fill the tablespace with 128kByte iops -- it is not > > > sparse. I've got a scar. Has this changed in the past few years? > > > > Multiple parallel tablespace creates is

Re: [zfs-discuss] ZFS and databases

2006-05-12 Thread Roch Bourbonnais - Performance Engineering
Jeff Bonwick writes: > > Are you saying that copy-on-write doesn't apply for mmap changes, but > > only file re-writes? I don't think that gels with anything else I > > know about ZFS. > > No, you're correct -- everything is copy-on-write. > Maybe the confusion comes from: mma

Re: [zfs-discuss] ZFS and databases

2006-05-11 Thread Roch Bourbonnais - Performance Engineering
From: Gregory Shaw <[EMAIL PROTECTED]> Sender: [EMAIL PROTECTED] To: Mike Gerdts <[EMAIL PROTECTED]> Cc: ZFS filesystem discussion list , [EMAIL PROTECTED] Subject: Re: [zfs-discuss] ZFS and databases Date: Thu, 11 May 2006 13:15:48 -0600 Regarding directio and quickio, is there

Re: [zfs-discuss] ZFS RAM requirements?

2006-05-11 Thread Roch Bourbonnais - Performance Engineering
the required memory pressure on ZFS. Sounds like another bug we'd need to track; -r Daniel Rock writes: > Roch Bourbonnais - Performance Engineering schrieb: > > As already noted, this need not be different from other FS > > but is still an interesting question. I

Re: [zfs-discuss] ZFS RAM requirements?

2006-05-11 Thread Roch Bourbonnais - Performance Engineering
Certainly something we'll have to tackle. How about a zpool memstat (or zpool -m iostat) variation that would report at least freemem and the amount of evictable cached data? Would that work for you? -r Philip Beevers writes: > Roch Bourbonnais - Performance Engineeri
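
Pending such a subcommand, freemem at least is already visible from kstat:

    # Current free memory, in pages:
    kstat -p unix:0:system_pages:freemem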

Re: [zfs-discuss] Re: [dtrace-discuss] Re: [nfs-discuss] Script to trace NFSv3 client operations

2006-05-11 Thread Roch Bourbonnais - Performance Engineering
1.033 sys 11.405 -r > > On 5/11/06, Roch Bourbonnais - Performance Engineering > <[EMAIL PROTECTED]> wrote: > > > > > > # ptime tar xf linux-2.2.22.tar > > ptime tar xf linux-2.2.22.tar > > > > real 50.292 > >

Re: [zfs-discuss] ZFS and databases

2006-05-11 Thread Roch Bourbonnais - Performance Engineering
- Description of why I don't need directio, quickio, or ODM. The 2 main benefits that came out of using directio were reducing memory consumption, by avoiding the page cache, AND bypassing the UFS single-writer behavior. ZFS does not have the single-writer lock. As for memory, the UFS code
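
For comparison, the UFS-side mechanism being referred to, as a sketch (device and mount point illustrative):

    # Mount-time: bypass the page cache for the whole filesystem:
    mount -F ufs -o forcedirectio /dev/dsk/c0t0d0s6 /db
    # Per-file, from the application, via directio(3C):
    #   directio(fd, DIRECTIO_ON);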

Re: [zfs-discuss] Re: [dtrace-discuss] Re: [nfs-discuss] Script to trace NFSv3 client operations

2006-05-11 Thread Roch Bourbonnais - Performance Engineering
# ptime tar xf linux-2.2.22.tar ptime tar xf linux-2.2.22.tar real 50.292 user 1.019 sys 11.417 # ptime tar xf linux-2.2.22.tar ptime tar xf linux-2.2.22.tar real 56.833 user 1.056 sys 11.581 # avg time waiting for async writes is

RE: [zfs-discuss] ZFS and databases

2006-05-11 Thread Roch Bourbonnais - Performance Engineering
Gehr, Chuck R writes: > One word of caution about random writes. From my experience, they are > not nearly as fast as sequential writes (like 10 to 20 times slower) > unless they are carefully aligned on the same boundary as the file > system record size. Otherwise, there is a heavy read pena
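
The usual way to get that alignment with ZFS is to set the dataset recordsize to the database block size before the data files are created; a sketch with illustrative names:

    # Match recordsize to an 8K database block so random updates
    # rewrite whole records instead of read-modify-writing larger ones:
    zfs create tank/db
    zfs set recordsize=8k tank/db
    zfs get recordsize tank/db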