[EMAIL PROTECTED] wrote on 01/31/2006 09:15:09 PM:

> The file STC.HIST as a dynamic file takes up 4.3Gig of disk space.  It has
> around 944,000 records, a blocksize of 1024 but a modulo of 4,000,000+
> When I convert this to a static file, I can properly size it with a modulo
> of around 94,000 which takes up a mere 75Meg.
>
> I've tried changing split/merge loads from the default of 60/40 to 20/10.
> I've tried playing with the minimum modulo.
> I'm clueless on dynamic files and would love any insight.

For the large file in its dynamic form, is most of the space consumed by
the dat* or the over* files?  If the former, you may just be wasting
space.  If the latter, you have some file configuration issues to resolve.
What does GROUP.STAT show you?  Are the records distributed evenly in the
groups or does it vary greatly?
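
As a rough sanity check on where the space goes, here is a back-of-envelope
sketch in Python.  It assumes that the primary dat* portion occupies roughly
modulo times blocksize bytes, which is my simplification and not something
taken from the UniData documentation; the function name is made up for
illustration.

```python
def primary_bytes(modulo, blocksize):
    """Approximate size of the primary groups, assuming one block per group."""
    return modulo * blocksize


# Figures from the quoted post: modulo of about 4,000,000, blocksize 1024.
size = primary_bytes(4_000_000, 1024)
print(f"{size / 1e9:.2f} GB")
```

If that assumption holds, the modulo alone comes to roughly 4.1 GB, which is
most of the 4.3Gig reported and would point at the dat* files rather than
overflow.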

What do your key lengths look like?  Are they large or small?  Do they
vary greatly from one record to the next?  You may want to consider using
the KEYDATA option.  It doesn't usually work better than KEYONLY, but I
have seen it make a huge positive difference on a small percentage of
files.

You might also want to play around with changing the hash type.  0 usually
works best, but once in a while 1 will help you.  Also, you may actually
see things improve by increasing the split and merge loads.  I've seen
files that worked with 90/45, but that doesn't happen very often.  So much
depends on the characteristics of the data in the file.

I recently posted the following [very verbose] information, which may give
you some guidance.

===== COPY OF RECENT POSTING BEGINS HERE =====
The following is swiped from an old technical bulletin and is a good
starting point.  You should run guide with the -r option on the file and
use its output for the variables below.  However, there's nothing like
getting a small sample of the records in the file - maybe 1 percent, and
creating a small test file to play around with.  You can play with
CONFIGURE.FILE and memresize (AVOID REBUILD.FILE AT ALL COSTS) to find the
best parameters, then use those, with an increased modulo to size the real
file.

For what it's worth, I generally find that smaller split and merge load
numbers, such as 20/10, work better than larger ones.  Of course, that varies
from file to file, and there is no absolute rule.

==============================================

Formula for determining base modulo, block size, SPLIT_LOAD, and
MERGE_LOAD for UniData KEYONLY Dynamic Files


Note that the variables used are the same as the DICT items in
$UDTHOME/sys/D_UDT_GUIDE.  Any calculated values which are not attributes
in this dictionary appear in bold italic.

Considerations:

        The following does not take into account the Unix disk record (frame)
        size, so it is best to select a block size based on the number of
        items you'd like in a group.

        No one method will provide absolute results, but these calculations
        will minimize level one overflow caused by a high SPLIT_LOAD value.

        Type 0 works best for most Dynamic Files, but it is best to check a
        small sample via the GROUP.STAT command.

Step 1: Determine the blocksize.  (Use 4096 unless the items per group is
larger than 35 or less than 2)

A)      If MAXSIZ < 1K
        ITEMSIZE = 10 * MAXSIZ
B)      If 1K < MAXSIZ < 3K
        ITEMSIZE = 5 * MAXSIZ
C)      If MAXSIZ > 3K
        ITEMSIZE = 5 * (AVGSIZ + DEVSIZ)

Once you determine the item size, use it to determine the NEWBLOCKSIZE.

A)      ITEMSIZE < 1024;                NEWBLOCKSIZE = 1024
B)      1024 <= ITEMSIZE < 2048;        NEWBLOCKSIZE = 2048
C)      2048 <= ITEMSIZE < 4096;        NEWBLOCKSIZE = 4096
D)      4096 <= ITEMSIZE < 8192;        NEWBLOCKSIZE = 8192
E)      8192 <= ITEMSIZE < 16384;       NEWBLOCKSIZE = 16384
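
If it helps to see Step 1 as code, here is a minimal Python sketch of the two
tables above.  The function names are mine, and the handling of the exact
boundaries (MAXSIZ of exactly 1K or 3K, ITEMSIZE of 16384 or more) is an
assumption, since the bulletin does not spell those cases out.

```python
def item_size(maxsiz, avgsiz, devsiz):
    """First table of Step 1: estimate ITEMSIZE in bytes."""
    if maxsiz < 1024:
        return 10 * maxsiz
    if maxsiz < 3 * 1024:  # assumption: exactly 1K falls in this band
        return 5 * maxsiz
    return 5 * (avgsiz + devsiz)  # assumption: exactly 3K falls here


def new_block_size(itemsize):
    """Second table of Step 1: smallest listed block size above ITEMSIZE."""
    for size in (1024, 2048, 4096, 8192, 16384):
        if itemsize < size:
            return size
    return 16384  # assumption: the table stops at 16384, so cap there


# Hypothetical guide output: MAXSIZ 500, AVGSIZ 300, DEVSIZ 50.
print(new_block_size(item_size(500, 300, 50)))
```

Remember the note in Step 1: this is only a starting point, and 4096 is
usually the block size to use unless the items per group come out above 35 or
below 2.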

Step 2: Determine the actual number of items per group.

        ITEMS_PER_GROUP = (NEWBLOCKSIZE - 32) / AVGSIZ

Step 3: Determine the base modulo

        BASEMODULO = COUNT / ITEMS_PER_GROUP
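
Steps 2 and 3 are simple arithmetic; a small Python sketch follows.  The
function names are mine, rounding the modulo up to a whole number is my
assumption (the bulletin does not say how to round), and the AVGSIZ of 200 in
the example is a made-up value, not a figure from guide.

```python
import math


def items_per_group(newblocksize, avgsiz):
    """Step 2: usable block space (block size less 32 bytes) over AVGSIZ."""
    return (newblocksize - 32) / avgsiz


def base_modulo(count, ipg):
    """Step 3: COUNT over ITEMS_PER_GROUP, rounded up (assumed rounding)."""
    return math.ceil(count / ipg)


# 944,000 records as in the quoted post; AVGSIZ of 200 is hypothetical.
ipg = items_per_group(4096, 200)
print(ipg, base_modulo(944_000, ipg))
```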

Step 4: Determine SPLIT_LOAD

        SPLIT_LOAD = INT((((AVGKEY + 9) * ITEMS_PER_GROUP) / NEWBLOCKSIZE) * 100) + 1

        If SPLIT_LOAD is less than 10, then:    SPLIT_LOAD = 10

Step 5: Determine MERGE_LOAD

        MERGE_LOAD = SPLIT_LOAD / 2     ( Rounded up )
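
Steps 4 and 5 can be sketched the same way in Python.  The function names are
mine, and the input values in the example (AVGKEY of 12, 20.32 items per
group, block size 4096) are hypothetical, chosen only to exercise the formula.

```python
import math


def split_load(avgkey, items_per_group, newblocksize):
    """Step 4: percentage of the block taken by keys, plus 1, floor of 10."""
    load = int(((avgkey + 9) * items_per_group / newblocksize) * 100) + 1
    return max(load, 10)


def merge_load(split):
    """Step 5: half of SPLIT_LOAD, rounded up."""
    return math.ceil(split / 2)


# Hypothetical inputs: AVGKEY 12, ITEMS_PER_GROUP 20.32, NEWBLOCKSIZE 4096.
split = split_load(12, 20.32, 4096)
print(split, merge_load(split))
```

Treat the result as a first guess for CONFIGURE.FILE on a test copy, not as
a final answer; as noted earlier, the best values vary from file to file.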


Tim Snyder
Consulting I/T Specialist , U2 Professional Services
North American Lab Services
DB2 Information Management, IBM Software Group
717-545-6403
[EMAIL PROTECTED]
-------
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/
