Re: why compression costs additional I/O?

David Betten Wed, 27 Jan 2010 07:57:01 -0800

> generally all of it probably mean that using DFSORT for compressed
> datasets is not good idea.

I'm not sure I would agree with a general statement such as that.

First, there is a CPU overhead associated with compression, and it affects
ALL applications, not just sort.  The overhead is generally higher for
writes than for reads.  In some cases, that overhead can be offset by
reduced data transfer, but that depends on how well the data compresses.
You also need to look at how the data set is used.  If it's written once
but read many times, then you may get enough benefit on all those reads to
warrant the negative impact on the write.
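
A rough sketch of that write-once, read-many trade-off, in Python.  The
function name and the figures are hypothetical (arbitrary cost units, not
measurements); it only assumes that each read yields a fixed net saving
once its own decompression overhead is subtracted.

```python
import math

def reads_to_break_even(extra_write_cost, net_saving_per_read):
    """Smallest number of reads whose combined saving covers the extra write cost.

    Returns None when a read saves nothing, i.e. the write penalty is never repaid.
    """
    if net_saving_per_read <= 0:
        return None
    return math.ceil(extra_write_cost / net_saving_per_read)

# Hypothetical figures, chosen only to illustrate the arithmetic.
print(reads_to_break_even(2.0, 0.25))  # 8
print(reads_to_break_even(2.0, 0.0))   # None
```

If the data compresses poorly, the per-read saving shrinks and the
break-even point moves out, or never arrives.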

Second, a higher EXCP count does not necessarily mean more I/Os.  For
BSAM, buffers are used to store the blocks, which are then chained
together in single I/Os.  So the increase in I/Os is likely much smaller
than the increase in EXCPs.
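
To illustrate the chaining point, here is a minimal Python sketch.  It
assumes the reported count is a count of blocks and that a fixed number
of blocks is chained into each physical I/O; the chain length of 10 is
made up for the example, and the real value depends on buffering.

```python
def estimated_physical_ios(block_count, blocks_per_io):
    """Estimate physical I/Os when blocks_per_io blocks are chained per I/O."""
    if blocks_per_io < 1:
        raise ValueError("blocks_per_io must be at least 1")
    # Integer ceiling division: a partly filled chain still costs one I/O.
    return -(-block_count // blocks_per_io)

# 1462K is the count reported for the compressed run in this thread;
# the chain length is a hypothetical value.
print(estimated_physical_ios(1_462_000, 10))  # 146200
```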

Third, you may want to consider multiple stripes so that data is
transferred in parallel.  This won't reduce the number of I/Os, but it
would allow multiple I/Os to be done in parallel and reduce elapsed time.
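
An idealized Python sketch of that effect follows.  It assumes the I/Os
spread evenly across stripes and each takes the same time, which real
workloads will not match exactly; the names and numbers are hypothetical.

```python
def ios_per_stripe(total_ios, stripes):
    """I/Os handled by the busiest stripe when work is spread evenly."""
    return -(-total_ios // stripes)  # integer ceiling division

def idealized_elapsed(total_ios, stripes, seconds_per_io):
    """Elapsed I/O time if all stripes run fully in parallel."""
    return ios_per_stripe(total_ios, stripes) * seconds_per_io

total = 1000  # the total number of I/Os is the same in both cases
print(idealized_elapsed(total, 1, 0.5))  # 500.0
print(idealized_elapsed(total, 4, 0.5))  # 125.0
```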

I've never really considered compression as a means of improving
performance.  I've heard all the arguments about less data being
transferred, but in all my years of batch tuning I never really saw an
impact great enough to offset the CPU cost.  To me, compression is great
for avoiding out-of-space conditions and managing very large files.  When
performance is the sole concern, I've always recommended extended format
with multiple stripes but not compressed.  Of course, that requires that
you have the disk space available to support storing the large data sets!
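
For reference, the figures Pawel posted (quoted below) can be compared
with a few lines of Python.  The values are copied from his table; the
variable names are mine, and the units of TCB, SRB and elapsed time are
not stated in the thread, so only differences and ratios are computed.

```python
# Figures from the original post: STEP110 (no compression), STEP120 (compressed).
without = {"excp_k": 757, "tcb": 3.51, "srb": 0.70, "elapsed": 9.01}
with_comp = {"excp_k": 1462, "tcb": 3.62, "srb": 2.89, "elapsed": 10.45}

excp_ratio = with_comp["excp_k"] / without["excp_k"]
tcb_delta = with_comp["tcb"] - without["tcb"]
srb_delta = with_comp["srb"] - without["srb"]
elapsed_pct = (with_comp["elapsed"] - without["elapsed"]) / without["elapsed"] * 100

print(f"EXCP ratio:       {excp_ratio:.2f}")    # 1.93
print(f"TCB increase:     {tcb_delta:.2f}")     # 0.11
print(f"SRB increase:     {srb_delta:.2f}")     # 2.19
print(f"Elapsed increase: {elapsed_pct:.1f}%")  # 16.0%
```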

Have a nice day,
Dave Betten
DFSORT Development, Performance Lead
IBM Corporation
email:  bet...@us.ibm.com
DFSORT/MVSontheweb at http://www.ibm.com/storage/dfsort/

IBM Mainframe Discussion List <IBM-MAIN@bama.ua.edu> wrote on 01/27/2010
10:23:22 AM:

> Re: why compression costs additional I/O?
>
> Pawel Leszczynski
>
> to:
>
> IBM-MAIN
>
> 01/27/2010 10:26 AM
>
> Sent by:
>
> IBM Mainframe Discussion List <IBM-MAIN@bama.ua.edu>
>
> Please respond to IBM Mainframe Discussion List.
>
> Hi Yifat,
>
> Thanks for the answer - you are right! - I've checked the joblog:
>
> for compressed output:
>
>  0 SORTOUT  : BSAM USED
>
> but for non-compressed output:
>
> SORTOUT  : EXCP USED
>
> generally all of it probably mean that using DFSORT for compressed
> datasets is not good idea.
>
> Regards,
> Pawel
>
>
>
>
>
> On Wed, 27 Jan 2010 15:55:24 +0200, Yifat Oren <yi...@tmachine.com>
> wrote:
>
> >Hi Pawel,
> >
> >The reason is the sort product can not use the EXCP access method with
> >the compressed data set and instead chooses BSAM as the access method.
> >The EXCP access method usually reads or writes on a cylinder (or more)
> >boundary while BSAM, as its name suggests, reads or writes block by
> >block.
> >
> >Hope that helps,
> >Yifat Oren.
> >
> >-----Original Message-----
> >From: IBM Mainframe Discussion List [mailto:ibm-m...@bama.ua.edu]
> >On Behalf Of Pawel Leszczynski
> >Sent: Wednesday, January 27, 2010 12:56 PM
> >To: IBM-MAIN@bama.ua.edu
> >Subject: why compression costs additional I/O?
> >
> >Hello everybody,
> >Recently we have been reviewing our EndOfDay jobs, looking for
> >potential performance improvements (reducing CPU/elapsed time).
> >We have several jobs sorting big datasets where the output is
> >SMS-compressible (type: EXTENDED) datasets.
> >When we compare such sorting with sorting to non-compressible output,
> >we see this:
> >
> >                        EXCP    TCB    SRB  el.time
> >TESTXWP5  STEP110  00   757K   3.51    .70     9.01  <-- w/o compression
> >TESTXWP5  STEP120  00  1462K   3.62   2.89    10.45  <-- with compression
> >
> >We guess that the large SRB time in the second case goes to compression
> >(that we understand; we will probably drop compression altogether), but
> >we don't understand why EXCP is twice as large in the second case.
> >
> >Any ideas will be appreciated,
> >Regards,
> >Pawel Leszczynski
> >PKO BP SA
> >
> >----------------------------------------------------------------------
> >For IBM-MAIN subscribe / signoff / archive access instructions,
> >send email to lists...@bama.ua.edu with the message: GET IBM-MAIN INFO
> >Search the archives at http://bama.ua.edu/archives/ibm-main.html
