Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-07-02 Thread Richard Elling
Christiaan Willemsen wrote:
> Hi Richard,
>
> Richard Elling wrote:
>> It should cost less than a RAID array...
>> Advertisement: Sun's low-end servers have 16 DIMM slots.
>
> Sadly, those are by far more expensive than what I have here from our 
> own server supplier...
>

ok, that pushed a button.  Let's see...  I just surfed the 4 major
server vendors for their online store offerings without logging in
(no discounts applied).  I was looking for a 64 GByte 1-2U rack
server with 8 internal disk drives.  Due to all of the vendors having
broken stores, in some form or another, it was difficult to actually
get an exact, orderable configuration, but I was able to come close.
Requirements: redundant power supplies, 8x 146 GByte 10k rpm
disks, 64 GBytes of RAM, 4 cores of some type, no OS (I'll use
OpenSolaris, thank you :-)).  All prices in USD.

IBM - no 1-2U product with 64 GByte memory capacity, the x3650
line has only 12 slots available until you get to the 4U servers.
Didn't make the first cut.  But for the record, if you want it at
48 GBytes, $10,748, and if you could add 16 GBytes more, it
would come in at around $12,450... not bad.

HP - DL380 G5 looks promising. Site had difficulty calculating
the price, but it cruised in at $23,996.

Dell - PowerEdge 2970 seemed to be the most inexpensive, at
first.  But once configured (the store showed configuration errors,
though that appears to be a bug in the error reporting itself): $21,825.

Sun - X4150 is actually 1U while the others are 2U.  Store
would only allow me to configure 60 GBytes -- one pair of
DIMMs was 2 GByte.  I'm sure I could get it fully populated
with a direct quote.  $12,645.

The way I see it, for your solution Sun offers the best value
by far.  But more importantly, it really helps to shop around
for these x64 boxes.  I was quite surprised to find Sun's price
to be nearly half that of HP and Dell...
 -- richard



Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-07-02 Thread Christiaan Willemsen
> Let ZFS deal with the redundancy part. I'm not
> counting "redundancy" offered by traditional RAID,
> as you can see from posts in this forum:
> 1. It doesn't work.
> 2. It bites when you least expect it to.
> 3. You can do nothing but resort to tapes and a LOT
> of aspirin when you get bitten.

Thanks, that's exactly what I was asking about.
 
 


Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-07-01 Thread Akhilesh Mritunjai
I feel I'm being misunderstood.

RAID - "Redundant" Array of Inexpensive Disks.

I meant to state that - Let ZFS deal with redundancy.

If you want to have an "AID", by all means have your "RAID" controller do all 
the kinds of striping/mirroring it can to help with throughput or ease of 
managing drives.

Let ZFS deal with the redundancy part. I'm not counting "redundancy" offered by 
traditional RAID, as you can see from posts in this forum:
1. It doesn't work.
2. It bites when you least expect it to.
3. You can do nothing but resort to tapes and a LOT of aspirin when you get 
bitten.

- Akhilesh
 
 


Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-07-01 Thread Mike Gerdts
On Mon, Jun 30, 2008 at 11:43 AM, Akhilesh Mritunjai
<[EMAIL PROTECTED]> wrote:
>> I'll probably be having 16 Seagate 15K5 SAS disks,
>> 150 GB each.  Two in HW raid1 for the OS, two in HW
>> raid 1 or 10 for the transaction log. The OS does not
>> need to be on ZFS, but could be.
>
> Whatever you do, DO NOT mix zfs and HW RAID.
>
> ZFS likes to handle redundancy all by itself. It's much smarter than any HW 
> RAID and does NOT like it when it detects data corruption it can't fix 
> (i.e. no replicas). HW RAIDs can't fix data corruption, and that leads to a 
> very unhappy ZFS.
>
> Let ZFS handle all redundancy.

If you are dealing with a high-end storage array[1] that does RAID-5,
you probably want to do RAID-5 on there, as well as mirroring with
ZFS.  This allows disk replacements to be done using only the internal
paths of the array.  If you push the rebuild of a 1 TB disk to the
server, it causes an unnecessary amount of traffic across shared[2]
components such as CHIPP processors[3], inter-switch-links, etc.
Mirroring then allows zfs to have the bits needed to self-heal.


1. Typically as physically large as the combined size of your fridge,
your mom's fridge, and those of your three best friends that are out
of college and have fridges significantly larger than a keg.
2. "Shared" as in one server's behavior can and may be somewhat likely
to affect the performance of another.
3. Assuming Hitachi
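
As a sketch of the arrangement described above (device names are purely
hypothetical; assume each device is a RAID-5 LUN presented by a
different array or controller):

   # mirror each array LUN against a LUN from the other array, so ZFS
   # holds the redundant copy it needs to self-heal
   zpool create dbpool mirror c4t0d0 c5t0d0 mirror c4t1d0 c5t1d0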


-- 
Mike Gerdts
http://mgerdts.blogspot.com/


Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-07-01 Thread Bob Friesenhahn
On Tue, 1 Jul 2008, Johan Hartzenberg wrote:
>
> Larger disks can put more data on the outer edge, where performance is
> better.

On the flip side, disks with a smaller form factor produce less 
vibration and are less sensitive to it so seeks stabilize faster with 
less chance of error.  The platters are also smaller so they can seek 
faster and more reliably.  Less heat is produced and less energy is 
consumed.  The 2.5" form factor is the better choice if large storage 
is not required.

> get pretty decent performance.  I read somewhere that ZFS automatically
> gives preference to the outer cylinders of a disk when selecting free
> blocks, but you could also restrict the ZFS pool to using only the outer,
> say, 20 GB of each disk by creating slices and adding those to the pool.

A more effective method would be to place a quota on the filesystem 
which assures that there will always be substantial free space in the 
pool.  Simply decide to not use a large part of the pool space.  With 
lots of free space in the pool, zfs won't have to look very hard for 
more free space to satisfy its COW requirements and it is more likely 
that the allocation is a good one (less fragmentation).
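
For example (pool and filesystem names hypothetical), on a 1 TByte pool
you want to keep roughly 20% empty:

   # cap the filesystem well below the pool size so ZFS always has
   # plenty of contiguous free space to allocate from
   zfs set quota=800G tank/db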

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/



Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-07-01 Thread Johan Hartzenberg
On Mon, Jun 30, 2008 at 10:17 AM, Christiaan Willemsen <
[EMAIL PROTECTED]> wrote:

> The question is: how can we maximize IO by using the best possible
> combination of hardware and ZFS RAID?
>
Here are some generic concepts that still hold true:

More disks can handle more IOs.

Larger disks can put more data on the outer edge, where performance is
better.

If you use disks much larger than your required data set, then head seek
movement will also be minimized.  (You can limit seeks further by forcing
the file system to live in a small slice of the disk, and you can control
where on the disk that slice is placed.)

Don't put all your disks on a single controller.  Just as more disks can
handle more IOs at a time, so can more controllers issue more instructions
at once.  On the other hand giving each disk a dedicated controller is a
waste because the controller will then be idle most of the time, waiting for
the disk to return results.

RAM, as mentioned before, is your friend.  ZFS will use it liberally.

You mentioned a 70 GB database, so: if you take, say, 10 x 146GB 15Krpm SAS
disks, set those up in a 4-disk stripe and add a mirror to each disk, you'll
get pretty decent performance.  I read somewhere that ZFS automatically
gives preference to the outer cylinders of a disk when selecting free
blocks, but you could also restrict the ZFS pool to using only the outer,
say, 20 GB of each disk by creating slices and adding those to the pool.
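
A sketch of such a stripe of mirrored pairs, using whole disks (device
names hypothetical):

   # four two-way mirrors; ZFS stripes writes across all
   # top-level vdevs automatically
   zpool create datapool mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0 \
       mirror c1t4d0 c1t5d0 mirror c1t6d0 c1t7d0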

Note that if you do use slices instead of whole disks, you need to manually
turn on disk write caching (format -e -> SCSI cache options), since ZFS only
enables the write cache automatically when it is given whole disks.

If you don't care about tracking file access times, turn it off. (zfs set
atime=off datapool)

Have you decided on a server model yet?  Storage subsystems?  HBAs?  The
specifics in your configuration will undoubtedly get lots of responses from
this list about how to tune each component!  Everything from memory
interleaving to spreading your HBAs across schizo chips.

However, much more important to your actual end result is your application
and DB setup, configuration, and how it is developed.  If the application
developers or the DBAs get it wrong, the system will always be a dog.


Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-07-01 Thread Richard Elling
Christiaan Willemsen wrote:
>> Why not go to 128-256 GBytes of RAM?  It isn't that
>> expensive and would
>> significantly help give you a "big performance boost"
>> ;-)
>> 
>
> Would be nice, but it's not that inexpensive, since we'd have to move up a
> class in server choice, which adds more cost on top of the extra memory
> itself.
>

It should cost less than a RAID array...
Advertisement: Sun's low-end servers have 16 DIMM slots.

Fast, inexpensive, reliable: pick 2.
 -- richard



Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-07-01 Thread Christiaan Willemsen
> Why not go to 128-256 GBytes of RAM?  It isn't that
> expensive and would
> significantly help give you a "big performance boost"
> ;-)

Would be nice, but it's not that inexpensive, since we'd have to move up a 
class in server choice, which adds more cost on top of the extra memory 
itself.

> The database transaction log should be relatively
> small, so I would
> look for two LUNs (disks), mirrored.  Similarly, the
> ZIL should be
> relatively small -- two LUNs (disks), mirrored.  You
> will want ZFS to
> manage the redundancy here, so think about mirroring
> at the
> ZFS level.  The actual size needed will be based on
> the transaction
> load which causes writes.  For ZIL sizing, we like to
> see something
> like 20 seconds worth of write workload.  In most
> cases, this will
> fit into the write cache of a decent array, so you
> may not have to
> burn an actual pair of disks in the backing store.
>  But since I don't
> know the array you're using, it will be difficult to be
> specific.

OK, so if the array cache is large enough, there is no actual need for a 
separate ZIL disk.

Another consideration could be the use of SSDs for all of this. You'd only 
need a few of these to get far better IO performance than the 16 SAS disks 
could ever deliver. Also, you'd probably not need a ZIL disk, nor a disk for 
the transaction log.

It will cost about the same, but will probably give better performance.
 
 


Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-06-30 Thread Bob Friesenhahn
On Mon, 30 Jun 2008, Richard Elling wrote:
>
> There is a general feeling that COW, as used by ZFS, will cause
> all sorts of badness for database scans.  Alas, there is a dearth of
> real-world data on any impacts (I'm anxiously awaiting...)

It seems like the primary badness from ZFS as pertains to databases is 
the fact that it checksums each block and prefers large blocks.  If 
the filesystem block size perfectly matches the database block size 
and blocks are perfectly aligned (database dependent), then 
performance should be pretty good.
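
If you do match them, the knob is the per-filesystem recordsize property,
set before the data is written (8K shown as an illustrative match for an
8K database block; the filesystem name is hypothetical):

   zfs set recordsize=8k tank/db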

If 8K of a 128K block needs to be updated, then the 128K block needs 
to be read, checksummed, updated, checksummed, allocated (for COW), 
and then written.  Clearly this cost is reduced if the amount of data 
involved is reduced.  But 8k blocks will increase the cost of any 
fragmentation for sequential access since then there may be an extra 
seek for every 8K rather than for every 128K.

An extent-based filesystem will also incur costs and may increase 
fragmentation.  There is a maximum limit on ZFS fragmentation which is 
determined by the blocksize.  It seems that load-shared mirrors will 
suffer least from fragmentation.

The DTrace Toolkit provides a script called 'iopattern' which is quite 
helpful to understand how much I/O is random vs sequential and the 
type/size of the I/Os.  Lots of random I/O while doing a sequential 
scan likely indicates fragmentation.
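
For example, from a copy of the DTrace Toolkit (the path and interval
are just illustrative):

   # ./iopattern 10     # report %random vs %sequential every 10 seconds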

> In this particular case, it would be cost effective to just buy a
> bunch of RAM and not worry too much about disk I/O during
> scans.  In the future, if you significantly outgrow the RAM, then

RAM is definitely your friend.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/



Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-06-30 Thread David Collier-Brown
David Collier-Brown wrote:
>>   ZFS copy-on-write results in tables' contents being spread across
>> the full width of their stripe, which is arguably a good thing
>> for transaction processing performance (or at least can be), but
>> makes sequential table-scan speed degrade.
>>  
>>   If you're doing sequential scans over large amounts of data
>> which isn't changing very rapidly, such as older segments, you
>> may want to re-sequentialize that data.

Richard Elling <[EMAIL PROTECTED]> wrote 
> There is a general feeling that COW, as used by ZFS, will cause
> all sorts of badness for database scans.  Alas, there is a dearth of
> real-world data on any impacts (I'm anxiously awaiting...)
> There are cases where this won't be a problem at all, but it will
> depend on how you use the data.

I quite agree: at some point, the experts on Oracle, MySQL and
PostgreSQL will get a clear understanding of how to get the best
performance from ZFS for random database I/O.  I'll be interested
to see what the behavior is for large, high-performance systems.
In the meantime...

> In this particular case, it would be cost effective to just buy a
> bunch of RAM and not worry too much about disk I/O during
> scans.  In the future, if you significantly outgrow the RAM, then
> there might be a case for a ZFS (L2ARC) cache LUN to smooth
> out the bumps.  You can probably defer that call until later.

... it's a Really Nice Thing that large memories only cost small 
dollars (;-))

--dave
-- 
David Collier-Brown| Always do right. This will gratify
Sun Microsystems, Toronto  | some people and astonish the rest
[EMAIL PROTECTED] |  -- Mark Twain
(905) 943-1983, cell: (647) 833-9377, (800) 555-9786 x56583
bridge: (877) 385-4099 code: 506 9191#


Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-06-30 Thread Richard Elling
Christiaan Willemsen wrote:
> I'm new to OpenSolaris and very new to ZFS. In the past we have always used 
> Linux for our database backends.
>
> So now we are looking for a new database server to give us a big performance 
> boost, and also the possibility for scalability.
>
> Our current database consists mainly of a huge table containing about 230 
> million records and a few (relatively) smaller tables (something like 13 
> million records and less). The main table is growing by about 800k records 
> every day, and the prognosis is that this number will increase significantly 
> in the near future.
>
> All of this is currently held in a Postgresql database with the largest 
> tables divided into segments to speed up performance. This all runs on a 
> linux machine with 4 GB of RAM and 4 10K SCSI disks in HW raid 10. The 
> complete database is about 70 Gb in size, and growing every day.
>
> We will soon need new hardware, and are also reviewing software needs.
>
> Besides a lot more RAM (16 or 32GB), the new machine will also get a much 
> larger disk array. We don't need the size, but we do need the IO it can 
> generate.  And what we also need is for it to be able to scale. When needs 
> grow, it should be possible to add more disks to be able to handle the extra IO.
>   

Why not go to 128-256 GBytes of RAM?  It isn't that expensive and would
significantly help give you a "big performance boost" ;-)

> And that is exactly where ZFS  comes in, at least as far as I read.
>
> The question is: how can we maximize IO by using the best possible 
> combination of hardware and ZFS RAID?
>   

Adding lots of RAM would allow you to minimize I/O -- generally
a good thing -- do less of those things that hurt.

Any modern machine should be able to generate decent I/O demand.
I don't think that the decision to choose RAID array redundancy
should be based purely on performance.  You might find more
differentiation in the RAS side of such designs.

> I'll probably be having 16 Seagate 15K5 SAS disks, 150 GB each.  Two in HW 
> raid1 for the OS, two in HW raid 1 or 10 for the transaction log. The OS does 
> not need to be on ZFS, but could be. 
>   

The database transaction log should be relatively small, so I would
look for two LUNs (disks), mirrored.  Similarly, the ZIL should be
relatively small -- two LUNs (disks), mirrored.  You will want ZFS to
manage the redundancy here, so think about mirroring at the
ZFS level.  The actual size needed will be based on the transaction
load which causes writes.  For ZIL sizing, we like to see something
like 20 seconds worth of write workload.  In most cases, this will
fit into the write cache of a decent array, so you may not have to
burn an actual pair of disks in the backing store.  But since I don't
know the array you're using, it will be difficult to be specific.
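
As a rough worked example (rates purely illustrative): a workload that
writes 50 MBytes/s of synchronous data would need about 50 x 20 = 1000
MBytes, i.e. roughly 1 GByte of ZIL.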

> So that leaves 10 or 12 disks to configure for the database. The question is 
> how to divide them to get the best IO performance by mixing the best of both 
> worlds.
>
> From what I read, mirroring and striping should get me better performance than 
> raidz or RAID5. But I guess you might give me some pointers on how to 
> distribute the disks. My biggest question is what I should leave to the HW 
> raid, and what to ZFS?
>   

Array-based RAID-5 implementations perform fairly well.
If the data is important and difficult to reload, then you should
consider using some sort of ZFS redundancy: mirror or copies.
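
For the copies option, that is a one-line setting (filesystem name
hypothetical):

   # keep two copies of every block, even on a single LUN
   zfs set copies=2 tank/db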

David Collier-Brown wrote:
>   This is a bit of a sidebar to the discussion about getting the 
> best performance for PostgreSQL from ZFS, but may affect
> you if you're doing sequential scans through the 70GB table
> or its segments.
>
>   ZFS copy-on-write results in tables' contents being spread across
> the full width of their stripe, which is arguably a good thing
> for transaction processing performance (or at least can be), but
> makes sequential table-scan speed degrade.
>  
>   If you're doing sequential scans over large amounts of data
> which isn't changing very rapidly, such as older segments, you
> may want to re-sequentialize that data.

There is a general feeling that COW, as used by ZFS, will cause
all sorts of badness for database scans.  Alas, there is a dearth of
real-world data on any impacts (I'm anxiously awaiting...)
There are cases where this won't be a problem at all, but it will
depend on how you use the data.

In this particular case, it would be cost effective to just buy a
bunch of RAM and not worry too much about disk I/O during
scans.  In the future, if you significantly outgrow the RAM, then
there might be a case for a ZFS (L2ARC) cache LUN to smooth
out the bumps.  You can probably defer that call until later.

Backups might be challenging, so ZFS snapshots might help
reduce that complexity.
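
For instance (filesystem and snapshot names hypothetical), a snapshot
gives you a nearly instantaneous point-in-time image to back up from:

   zfs snapshot tank/db@nightly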
 -- richard



Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-06-30 Thread Richard Elling
Akhilesh Mritunjai wrote:
>> I'll probably be having 16 Seagate 15K5 SAS disks,
>> 150 GB each.  Two in HW raid1 for the OS, two in HW
>> raid 1 or 10 for the transaction log. The OS does not
>> need to be on ZFS, but could be. 
>> 
>
> Whatever you do, DO NOT mix zfs and HW RAID.
>   

I disagree.  There are very good reasons to use RAID arrays
along with ZFS.  Each case may be slightly different, but there
is nothing wrong with enjoying the benefits of both.

> ZFS likes to handle redundancy all by itself. It's much smarter than any HW 
> RAID and does NOT like it when it detects data corruption it can't fix 
> (i.e. no replicas). HW RAIDs can't fix data corruption, and that leads to a 
> very unhappy ZFS.
>
> Let ZFS handle all redundancy.
>   

I disagree, but only slightly :-).  I'd say, let ZFS manage at
least one level of redundancy.

A reminder for the group: there are failures which can and
do occur in the data path between RAID arrays and hosts.
ZFS can detect these, but can only automatically correct
the errors when it is managing a level of redundancy.

 -- richard



Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-06-30 Thread Akhilesh Mritunjai
> I'll probably be having 16 Seagate 15K5 SAS disks,
> 150 GB each.  Two in HW raid1 for the OS, two in HW
> raid 1 or 10 for the transaction log. The OS does not
> need to be on ZFS, but could be. 

Whatever you do, DO NOT mix zfs and HW RAID.

ZFS likes to handle redundancy all by itself. It's much smarter than any HW 
RAID and does NOT like it when it detects data corruption it can't fix 
(i.e. no replicas). HW RAIDs can't fix data corruption, and that leads to a 
very unhappy ZFS.

Let ZFS handle all redundancy.
 
 


Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-06-30 Thread Sean Sprague
Christiaan,

> So right now, I'm not babbling about some ZFS tuning setting, but about the 
> advantages and disadvantages of using ZFS, hardware RAID, or a combination of 
> the two.

I never accused you of babbling.  I opened my response with "As ZFS 
tuning has already been suggested" and gave some very generic pointers.

Regards... Sean.


[zfs-discuss] Some basic questions about getting the best performance for database usage

2008-06-30 Thread David Collier-Brown
  This is a bit of a sidebar to the discussion about getting the 
best performance for PostgreSQL from ZFS, but may affect
you if you're doing sequential scans through the 70GB table
or its segments.

  ZFS copy-on-write results in tables' contents being spread across
the full width of their stripe, which is arguably a good thing
for transaction processing performance (or at least can be), but
makes sequential table-scan speed degrade.
 
  If you're doing sequential scans over large amounts of data
which isn't changing very rapidly, such as older segments, you
may want to re-sequentialize that data.

 I was talking to one of the Slony developers back when this
first came up, and he suggested a process to do this in PostgreSQL.

  He suggested doing a "cluster" operation, relative to a specific 
index, then dropping and recreating the index.  This results in the 
relation being rewritten in the order the index is sorted by, which
should defragment/linearize it.  Dropping and recreating
the index rewrites it sequentially, too.

  Neither he nor I know the cost if the relation has more than one
index: we speculate they should be dropped before the clustering
and recreated last.
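
A sketch of that sequence in SQL (table, index, and column names are
hypothetical; CLUSTER syntax as in PostgreSQL 8.x):

   DROP INDEX big_table_secondary_idx;    -- speculative: drop extra indexes first
   CLUSTER big_table_pkey ON big_table;   -- rewrite the table in index order
   CREATE INDEX big_table_secondary_idx   -- recreate the extra indexes last
       ON big_table (some_column);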

 --dave
-- 
David Collier-Brown| Always do right. This will gratify
Sun Microsystems, Toronto  | some people and astonish the rest
[EMAIL PROTECTED] |  -- Mark Twain
(905) 943-1983, cell: (647) 833-9377, (800) 555-9786 x56583
bridge: (877) 385-4099 code: 506 9191#


Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-06-30 Thread Christiaan Willemsen
Another thing: what about a separate disk (or disk set) for the ZIL?

Would it be worth sacrificing two SAS disks for two SSDs in RAID 1 
handling the ZIL?
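
For reference, a mirrored separate log device can be attached to an
existing pool like this (a sketch; device names hypothetical, and it
assumes an OpenSolaris build with separate-log support):

   # move the ZIL onto a mirrored pair of fast (e.g. SSD) devices
   zpool add datapool log mirror c3t0d0 c3t1d0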
 
 


Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-06-30 Thread Christiaan Willemsen
> Christiaan,
> 
> As ZFS tuning has already been suggested, remember:
> 
> a) Never tune unless you need to.
> b) Never tune unless you have an untuned benchmark
> set of figures to 
> compare against after the system has been tuned -
> especially in ZFS-land 
> which, whilst it may not be quite there, is designed
> to ultimately be 
> "self-tuning". Putting stuff hard into /etc/system
> might be 
> counter-productive to performance in the future
> (although hopefully by 
> that time, it will be blithely ignored).
> c) Never tune more than one parameter at one go.
> d) Understand as fully as possible the wider
> ramifications of any tuning 
> that you undertake.
> 
> If this is all "master of the bleedin' obvious" to
> you, then please 
> accept my humble apologies - it is often not the
> case...
> 
> Regards... Sean.

This is not about tuning, but about choosing the right configuration from the 
start. I can surely buy the stuff and spend days testing all kinds of 
scenarios to come to the best possible configuration. That's all great, but 
I'd like to start these tests with a bit of decent background, so I know what 
to expect.

So right now, I'm not babbling about some ZFS tuning setting, but about the 
advantages and disadvantages of using ZFS, hardware RAID, or a combination of 
the two.
 
 


Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-06-30 Thread Sean Sprague
Christiaan,

As ZFS tuning has already been suggested, remember:

a) Never tune unless you need to.
b) Never tune unless you have an untuned benchmark set of figures to 
compare against after the system has been tuned - especially in ZFS-land 
which, whilst it may not be quite there, is designed to ultimately be 
"self-tuning". Putting stuff hard into /etc/system might be 
counter-productive to performance in the future (although hopefully by 
that time, it will be blithely ignored).
c) Never tune more than one parameter at one go.
d) Understand as fully as possible the wider ramifications of any tuning 
that you undertake.

If this is all "master of the bleedin' obvious" to you, then please 
accept my humble apologies - it is often not the case...

Regards... Sean.





Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-06-30 Thread Nico Sabbi
On Monday 30 June 2008 11:14:10 James C. McPherson wrote:
> Christiaan Willemsen wrote:
> ...
>
> > And that is exactly where ZFS  comes in, at least as far as I
> > read.
> >
> > The question is: how can we maximize IO by using the best
> > possible combination of hardware and ZFS RAID?
>
> ...
>
> > From what I read, mirroring and striping should get me better
> > performance than raidz or RAID5. But I guess you might give me
> > some pointers on how to distribute the disks. My biggest question
> > is what I should leave to the HW raid, and what to ZFS?
>
> Hi Christiaan,
> If you haven't found it already, I highly recommend going
> through the information at these three urls:
>
>
> http://www.solarisinternals.com/wiki/index.php/ZFS_Configuration_Guide
> http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
> http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide
>
>
> I'll defer to Richard Elling and Roch Bourbonnais for specific
> suggestions based on your email - as far as I'm concerned they're
> the experts when it comes to ZFS tuning and database performance.
>
>
> James C. McPherson
> --

I want to save you some time and suffering: I had to add
set zfs:zfs_nocacheflush = 1
to /etc/system and reboot to cure the horrible slowness I experienced
with all of the MySQL storage engines on ZFS, especially InnoDB.
I had never seen a DB go so slow until that moment.  (Note that
disabling cache flushes this way is generally considered safe only
when the storage has a nonvolatile, battery-backed write cache.)


Re: [zfs-discuss] Some basic questions about getting the best performance for database usage

2008-06-30 Thread James C. McPherson
Christiaan Willemsen wrote:
...
> And that is exactly where ZFS  comes in, at least as far as I read.
> 
> The question is: how can we maximize IO by using the best possible
> combination of hardware and ZFS RAID?
...
> From what I read, mirroring and striping should get me better performance
> than raidz or RAID5. But I guess you might give me some pointers on how to
> distribute the disks. My biggest question is what I should leave to the HW
> raid, and what to ZFS?

Hi Christiaan,
If you haven't found it already, I highly recommend going
through the information at these three urls:


http://www.solarisinternals.com/wiki/index.php/ZFS_Configuration_Guide
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide


I'll defer to Richard Elling and Roch Bourbonnais for specific
suggestions based on your email - as far as I'm concerned they're
the experts when it comes to ZFS tuning and database performance.


James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog


[zfs-discuss] Some basic questions about getting the best performance for database usage

2008-06-30 Thread Christiaan Willemsen
I'm new to OpenSolaris and very new to ZFS. In the past we have always used 
Linux for our database backends.

So now we are looking for a new database server to give us a big performance 
boost, and also the possibility for scalability.

Our current database consists mainly of a huge table containing about 230 
million records and a few (relatively) smaller tables (something like 13 
million records and less). The main table is growing by about 800k records 
every day, and the prognosis is that this number will increase significantly in 
the near future.

All of this is currently held in a Postgresql database with the largest tables 
divided into segments to speed up performance. This all runs on a linux machine 
with 4 GB of RAM and 4 10K SCSI disks in HW raid 10. The complete database is 
about 70 Gb in size, and growing every day.

We will soon need new hardware, and are also reviewing software needs.

Besides a lot more RAM (16 or 32GB), the new machine will also get a much larger 
disk array. We don't need the size, but we do need the IO it can generate.  And 
what we also need is for it to be able to scale. When needs grow, it should be 
possible to add more disks to be able to handle the extra IO.

And that is exactly where ZFS  comes in, at least as far as I read.

The question is: how can we maximize IO by using the best possible combination 
of hardware and ZFS RAID?

I'll probably be having 16 Seagate 15K5 SAS disks, 150 GB each.  Two in HW 
raid1 for the OS, two in HW raid 1 or 10 for the transaction log. The OS does 
not need to be on ZFS, but could be. 

So that leaves 10 or 12 disks to configure for the database. The question is 
how to divide them to get the best IO performance by mixing the best of both 
worlds.

From what I read, mirroring and striping should get me better performance than 
raidz or RAID5. But I guess you might give me some pointers on how to distribute 
the disks. My biggest question is what I should leave to the HW raid, and what 
to ZFS?
 
 