David and list,

Some points to keep in mind in our discussion:

* Throughput can be measured both in IOPS and in MB/sec;
which one matters depends on the application, i.e.,
whether it is purely transactional or is just hauling
large loads of data.

Access time is normally a function of the number of I/O
operations the disk volume can support, the type of I/O
demand (the mix of random versus sequential requests, on
both reads and writes) and the physical location of the
data being accessed (outer tracks versus inner tracks).
Thus IOPS is very much relevant to our discussion:
statistically speaking, given a capacity of "n IOPS", if
x is the actual number of IOPS being serviced by the
volume, the lower the value of x, the lower the
probability that access times will be "out of whack".
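
To make the n-versus-x point concrete, here is a toy
sketch (my own illustration, not anything from the
thread or a vendor tool) that treats the volume as a
simple M/M/1 queue: as the offered IOPS x approaches the
capacity n, the expected access time blows up. The 5 ms
service time and the IOPS figures are assumptions, not
measurements.

# Toy model: expected access time vs. IOPS utilization,
# using the M/M/1 result T = S / (1 - U).
def expected_access_time_ms(capacity_iops, offered_iops,
                            service_time_ms=5.0):
    utilization = offered_iops / capacity_iops
    if utilization >= 1.0:
        # Demand exceeds supply: the queue grows without bound.
        return float("inf")
    return service_time_ms / (1.0 - utilization)

if __name__ == "__main__":
    capacity = 2560          # e.g. 10 disks x 256 IOPS each
    for offered in (500, 1500, 2300, 2500):
        t = expected_access_time_ms(capacity, offered)
        print(f"{offered:5d} IOPS -> ~{t:6.1f} ms")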

* There is disk technology available today that supports
multiple "actuators", which has a definite positive
impact on access times. Again, IOPS comes into play here
too.

* Almost all mirroring technologies that I have
encountered provide for the following three important
features:
  - Doubling of read IOPS as compared to a non-mirrored
environment (due to double the drives)
  - Intelligent servicing of read requests across the
mirrored members (some use a round-robin algorithm,
others a least-busy algorithm; see the sketch after this
list). This is the single most salient point that should
be highlighted in our discussion. I should have brought
this up in my original posting. Better late than never!
  - Redundancy. This feature is an obvious one, which
again is relevant to our discussion, because I don't
think any of us are going to just implement a "purely
striped volume" with NO mirroring.

So, assuming a mirrored volume (say a 4-way stripe) that
consumes a total of 8 drives, it can be argued (or even
proved) that placing objects in the volume based on IOPS
will suffice, regardless of whether the object type is
data, index, undo or temp. A rough budget check along
those lines is sketched below.
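
As a back-of-the-envelope check of that "place by IOPS"
argument, here is a small sketch. The per-disk rating
and the per-object demands are made-up assumptions, and
only reads are modelled, since writes have to land on
both plexes anyway.

# IOPS budget for a mirrored 4-way stripe (8 drives total).
PER_DISK_IOPS = 256      # assumed per-spindle rating
STRIPE_WIDTH  = 4        # 4-way stripe
MIRRORED      = True     # reads can be serviced by either plex

def volume_read_capacity():
    drives = STRIPE_WIDTH * (2 if MIRRORED else 1)
    return drives * PER_DISK_IOPS

def fits(demands):
    """True if the summed per-object read demand fits the volume."""
    total = sum(demands.values())
    capacity = volume_read_capacity()
    print(f"demand {total} IOPS vs. capacity {capacity} IOPS")
    return total <= capacity

if __name__ == "__main__":
    # The volume does not care what the object type is,
    # only how many IOPS it demands.
    print(fits({"data": 900, "index": 700, "undo": 200, "temp": 150}))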

This is because, in a mirrored volume, say on an index
scan, the I/O operation for the index block can be
serviced from one member of the mirrored volume and the
subsequent I/O operation for the data block can be
serviced from the other member. So the cost of seeking
across different parts of the same disk can be avoided.
We don't eliminate any seeks; it is just that the seeks
we do perform are much cheaper, as the amount of "head
movement" is minimized. The toy model below illustrates
the effect.
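
A toy illustration of that point: if interleaved index
and data reads are spread round-robin over two copies of
the data, each head stays close to the blocks it serves.
The track numbers below are invented purely for
illustration.

def total_seek_distance(requests, members=1):
    """Sum of head movement when the requested track numbers are
    spread round-robin over 'members' identical copies of the data."""
    heads = [0] * members
    moved = 0
    for i, track in enumerate(requests):
        m = i % members
        moved += abs(track - heads[m])
        heads[m] = track
    return moved

if __name__ == "__main__":
    # Interleaved index reads (low tracks) and data reads (high tracks).
    io_pattern = [100, 9000, 120, 9050, 140, 9100, 160, 9150]
    print("single copy :", total_seek_distance(io_pattern, members=1))
    print("mirrored    :", total_seek_distance(io_pattern, members=2))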

I think this provides pretty much the same effect as
your proposal of having 2 different volumes with 5
drives each, instead of 1 volume with 10 drives. Again,
I should have highlighted this originally in our
discussion of adding more drives to a volume. Hope that
clarifies things a bit.


Cheers,


Gaja
--- David Hau <[EMAIL PROTECTED]> wrote:
> Hans,
> 
> Your statement is true except in the case of a fast
> full-index scan. 
> But that's not my point.  What I'm trying to say is:
> 
> 1.  In scenarios where response time is important,
> for example when you 
> want to obtain the first n rows of a query result as
> quickly as 
> possible, then access time may be as important as
> throughput.
> 
> 2.  Adding disks to a striped array only improves
> throughput, not access 
> time.
> 
> 3.  Access time can be improved by parallel I/O
> execution on separate 
> disk arrays.
> 
> I'm trying to point out the difference between:
> 
> 1.  striping 10 disks into a single array
> 2.  striping 10 disks into two arrays of 5 disks
> each
> 
> In the first case, you get max throughput but
> because you only have one 
> array, you cannot improve access time by
> parallelizing disk access.
> 
> In the second case, you get half the throughput of
> the first case, but 
> if you can parallelize disk access to both disks,
> your access time or 
> response time to get the 1st row of the query result
> may be shorter.
> 
> Regards,
> Dave
> 
> 
> 
> 
> [EMAIL PROTECTED] wrote:
> > Dave,
> > 
> > during a 'db file sequential read', an index is
> _not_ accessed 
> > sequentially.
> > An index is not a sequential structure, so reading
> from an index in 
> > order will cause multiple seeks on the index
> itself. And we're talking 
> > single user here....
> > 
> > regards,
> > Hans
> > 
> > 
> > Reply-To: [EMAIL PROTECTED]
> > To: Multiple recipients of list ORACLE-L
> <[EMAIL PROTECTED]>
> > Date: Wed, 08 Oct 2003 16:29:25 -0800
> > 
> > Great responses ! Thanks very much ..
> > 
> > -----Original Message-----
> > Dave Hau
> > Sent: Wednesday, October 08, 2003 3:19 PM
> > To: Multiple recipients of list ORACLE-L
> > 
> > Hi Gaja,
> > 
> > I agree that throughput can always be improved by
> adding more drives to
> > the striped array.  However, this does not improve
> access time.  If you
> > have your tables and indexes on the same striped
> array, necessarily the
> > two I/O's have to be done sequentially, incurring
> two times access time
> > at a minimum.  However, if you separate the two
> into different arrays,
> > then you can access them in parallel, starting to
> get data from each
> > disk array in 1* access time.  This makes sense
> esp. in scenarios where
> > response time is more important than throughput,
> and also in use cases
> > where your access pattern is random rather than
> sequential.
> > 
> > So I feel that there's a tradeoff between access
> time and throughput.
> > If you have ten drives, and you stripe all of them
> into a single array
> > and put both your data and indexes onto this
> array, you get maximum
> > throughput but you're sacrificing access time for
> throughput.  However,
> > if you build two arrays each consisting of five
> drives, and put your
> > data and indexes onto each array, you get half of
> the previous
> > throughput, but you get better access time because
> now your data and
> > index access can be truly in parallel.
> > 
> > Regards,
> > Dave
> > 
> > 
> > 
> > 
> > 
> > [EMAIL PROTECTED] wrote:
> >  > Hi Hans/Vikas,
> >  >
> >  > I tend to agree that the old draconian rule
> that "thou
> >  > shalt always separate indexes from tables" may
> not
> >  > apply any more. We used to apply that principle
> in the
> >  > past when the number of available spindles was
> not
> >  > adequate. Seems like with 256G drives in the
> market,
> >  > we are being pushed back in time, in some
> way!!!
> >  >
> >  > The way I look at the problem is purely from an
> IOPS
> >  > perspective. For example, if each physical disk
> is
> >  > capable of 256 IOPS (ignore the cache
> configured here)
> >  > and you have 10 disks in your volume, then the
> total
> >  > I/O capacity on this volume is 2560 IOPS.
> Separation
> >  > of objects across multiple volumes may become
> an
> >  > issue, only when the demand for I/O outstrips
> the
> >  > supply (in this case 2560 IOPS).
> >  >
> >  > Even then, you can always add more drives to
> the
> >  > existing volume and restripe, i.e., adding 5
> more
> >  > drives to 10 drives increases the I/O capacity
> by 50%.
> >  > At the end of the day, the I/O sub-system does
> not
> >  > care, whether it is servicing a data segment,
> index
> >  > segment or undo segment.
> >  >
> >  > But, in certain environments, that I have dealt
> with,
> >  > there has been a need to separate heavily and
> >  > concurrently accessed objects (does not matter
> whether
> >  > these objects are all indexes or tables or
> both). This
> >  > may be true only for certain objects and
> certain
> >  > queries. So, please don't apply this in a
> blanket
> >  > fashion.
> >  >
> >  > Empirical data is always the best justification
> >  > mechanism for a configuration exercise such as
> this.
> >  > Plus, you may have partitioning and other
> requirements
> >  > such as parallelism that impact the placement
> and
> >  > availability of your data. This in turn will
> control
> >  > the number of logical volumes that need to be
> created.
> >  >
> >  > I think the idea and philosophy behind SAME is
> noble -
> >  > Use all available drives, so that you do not
> have
> >  > localized hot-spots. But the implementation of
> SAME
> >  > and how many volumes you need in your
> >  > environment, is a
> >  > function of your custom needs based on your
> system and
> >  > application demands. When you over-simplify
> something,
> >  > you lose the flexibility. The art factor here
> (which
> >  > requires some planning) is in achieving a
> balance
> 
=== message truncated ===


=====
Gaja Krishna Vaidyanatha
Principal Technical Product Manager, 
Application Performance Management, Veritas Corporation
E-mail : [EMAIL PROTECTED]  Phone: (650)-527-3180
Website: http://www.veritas.com
