...

> I personally believe that since most people will have hardware LUNs
> (with underlying RAID) and cache, it will be difficult to notice
> anything. Given that those hardware LUNs might be busy with their own
> wizardry ;) You will also have to minimize the effect of the database
> cache ...

By definition, once you've got the entire database in cache, none of this 
matters (though filling up the cache itself takes some added time if the table 
is fragmented).

Most real-world databases don't manage to fit entirely, or even mostly, in 
cache, because people aren't willing to dedicate that much RAM to running them.  
Instead, they either use a lot less RAM than the database size or share the 
system with other activity that competes for that RAM.

In other words, they use a cost-effective rather than a money-is-no-object 
configuration, but they'd still like to get the best performance they can from 
it.

> 
> It will be a tough assignment ... maybe someone has already done this?
> 
> Thinking about this (very abstract) ... does it really matter?
> 
> [8KB-a][8KB-b][8KB-c]
> 
> So what if 8KB-b gets updated and moved somewhere else? If the DB gets
> a request to read 8KB-a, it needs to do an I/O (eliminate all
> caching). If it gets a request to read 8KB-b, it needs to do an I/O.
> 
> Does it matter that b is somewhere else ...

Yes, with any competently-designed database.

> it still needs to go get it ... only in a very abstract world with
> read-ahead (both hardware or db) would 8KB-b be in cache after 8KB-a
> was read.

1.  If there's no other activity on the disk, then the disk's track cache will 
acquire the following data when the first block is read, because it has nothing 
better to do.  But if all the disks are just sitting around waiting for this 
table scan to get to them, then ZFS, if it has a sufficiently intelligent 
read-ahead mechanism, could help out a lot here as well:  the differences 
become greater when the system is busier.

2.  Even a moderately smart disk will detect a sequential access pattern if one 
exists, and may read ahead at least modestly once it has detected that pattern 
even if it *does* have other requests pending.

3.  But in any event, any competent database will explicitly issue prefetches 
when it knows (and it *does* know) that it is scanning a table sequentially - 
and will also have taken pains to ensure that the table data is laid out such 
that it can be scanned efficiently.  If it's using disks that support tagged 
command queuing, it may just issue a bunch of single-database-block requests at 
once and let the disk organize them so that they can all be satisfied by a 
single streaming access; with disks that don't support queuing, the database 
can instead issue a single large I/O request covering many database blocks, 
accomplishing the same thing as long as the table is in fact laid out 
contiguously on the medium (the database knows this if it's handling the layout 
directly, but when it's using a file system as an intermediary it usually can 
only hope that the file system has minimized file fragmentation).
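
To make that concrete, here's a rough user-space sketch (not taken from any 
particular engine) of the "explicit prefetch plus one large request" idea, 
assuming the table lives in an ordinary file of hypothetical 8KB blocks; 
DB_BLOCK_SIZE, PREFETCH_BLOCKS, and the use of posix_fadvise()/pread() are just 
stand-ins for whatever mechanisms a real database actually uses:

    /*
     * Sketch only:  scan a table file sequentially, prefetching ahead and
     * reading many hypothetical 8KB database blocks per I/O request.
     */
    #define _XOPEN_SOURCE 600       /* for posix_fadvise */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define DB_BLOCK_SIZE   8192    /* assumed database block size        */
    #define PREFETCH_BLOCKS 32      /* how many blocks to fetch per chunk */

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s tablefile\n", argv[0]);
            return 1;
        }

        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        size_t chunk = (size_t)DB_BLOCK_SIZE * PREFETCH_BLOCKS;
        char *buf = malloc(chunk);
        if (buf == NULL) { perror("malloc"); return 1; }

        off_t offset = 0;
        ssize_t got;

        for (;;) {
            /*
             * Explicit prefetch:  hint that we'll want the *next* chunk
             * soon, so the OS/file system can start reading it in while
             * we process the current one.
             */
            posix_fadvise(fd, offset + (off_t)chunk, (off_t)chunk,
                          POSIX_FADV_WILLNEED);

            /*
             * One large request covering many database blocks:  if the
             * table is laid out contiguously, this becomes a single
             * streaming access rather than a separate seek per 8KB block.
             */
            got = pread(fd, buf, chunk, offset);
            if (got <= 0)
                break;

            /* ... examine the got / DB_BLOCK_SIZE blocks in buf here ... */

            offset += got;
        }

        free(buf);
        close(fd);
        return 0;
    }

The point is simply that the scanner asks for data before it needs it and in 
large contiguous chunks, so a contiguous on-disk layout turns into streaming 
reads instead of a separate seek for every 8KB block.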

> 
> Hmmm... the only way is to get some data :) *hehe*

Data is good, as long as you successfully analyze what it actually means:  it 
either tends to confirm your understanding or to refine it.

- bill
 
 