Hi,

The various suggestions about setting the minimum modulus to reduce
overflow are all very well, but effectively you are turning a dynamic
file into a static one, complete with all the continual maintenance work
needed to keep the parameters in step with the data.

In most cases, the only parameter worth tuning is the group size, to try
to pack things nicely. Even this is often fine left alone, though getting
it to match the underlying o/s page size is helpful.

I missed the start of this thread but, unless you have a performance
problem or are seriously short of space, my recommendation would be to
leave the dynamic files to look after themselves.

A file without overflow is not necessarily the best solution. Winding the
split load down to 70% means that at least 30% of the file is dead space.
The implication is that the file is larger and will take more disk reads
to process sequentially from one end to the other.
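
To put rough numbers on that trade-off, here is a minimal Python sketch. It assumes the file settles at exactly its split load and ignores overflow and large records. The helper `groups_needed` and the data size are made up for illustration and are not anything UniVerse provides.

```python
def groups_needed(data_bytes: int, group_size: int, split_load_pct: int) -> int:
    """Groups required if every group is filled to split_load_pct percent.

    Hypothetical helper: a real dynamic file hovers between its merge and
    split loads, so treat the result as a rough estimate only.
    """
    usable_per_group_x100 = group_size * split_load_pct
    # Integer ceiling division avoids floating-point rounding surprises.
    return -((-data_bytes * 100) // usable_per_group_x100)


if __name__ == "__main__":
    data = 300_000_000   # illustrative: bytes of record data plus IDs
    group_size = 4096    # bytes per group (illustrative)
    for pct in (80, 70):
        groups = groups_needed(data, group_size, pct)
        print(f"{pct}% split load: {groups} groups, "
              f"{groups * group_size} bytes to read sequentially")
```

Under these assumptions the lower split load always needs more groups for the same data, which is the extra sequential read cost described above.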


Martin Phillips
Ladybridge Systems Ltd
17b Coldstream Lane, Hardingstone, Northampton NN4 6DB, England
+44 (0)1604-709200



-----Original Message-----
From: u2-users-boun...@listserver.u2ug.org 
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Chris Austin
Sent: 05 July 2012 15:19
To: u2-users@listserver.u2ug.org
Subject: Re: [U2] RESIZE - dynamic files


I was able to drop from 30% overflow to 12% by making 2 changes:

1) changed the split from 80% to 70% (that alone reduced overflow by 10%)
2) changed the MINIMUM.MODULUS to 118,681 (calculated this way ->
   [ (record data + id) * 1.1 * 1.42857 (70% split load) ] / 4096 )

My disk size only went up 8%.
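
The MINIMUM.MODULUS calculation in point 2 can be sketched in Python as follows. The function name and parameter names are hypothetical. The 1.1 factor and the division by the split load (1.42857 is 1/0.7) are read off the formula above, and the byte counts in the example are the record data and record ID sizes from the file listing in this message.

```python
import math


def minimum_modulus(record_bytes: int, id_bytes: int, group_size: int = 4096,
                    split_load: float = 0.70, overhead: float = 1.1) -> int:
    """Estimate a minimum modulus from data size, following the formula above.

    overhead   -- the 1.1 allowance used in the post (10% on top of the data)
    split_load -- fraction of each group that may fill before a split
    """
    bytes_to_hold = (record_bytes + id_bytes) * overhead / split_load
    # Round up: a fractional group still needs a whole group on disk.
    return math.ceil(bytes_to_hold / group_size)


if __name__ == "__main__":
    # Sizes taken from the GENACCTRN_POSTED listing in this message.
    print(minimum_modulus(287_789_178, 21_539_538))
```

With those inputs the result lands close to, though not exactly on, the 118,681 quoted. The sizes in the listing were captured after the change, so a small difference is not surprising.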

My file looks like this now:

File name ..................   GENACCTRN_POSTED
Pathname ...................   GENACCTRN_POSTED
File type ..................   DYNAMIC
File style and revision ....   32BIT Revision 12
Hashing Algorithm ..........   GENERAL
No. of groups (modulus) ....   118681 current ( minimum 118681, 140 empty,
                                            14431 overflowed, 778 badly )
Number of records ..........   1292377
Large record size ..........   3267 bytes
Number of large records ....   180
Group size .................   4096 bytes
Load factors ...............   70% (split), 50% (merge) and 63% (actual)
Total size .................   546869248 bytes
Total size of record data ..   287789178 bytes
Total size of record IDs ...   21539538 bytes
Unused space ...............   237532340 bytes
Total space for records ....   546861056 bytes
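
The headline figures can be checked straight from the listing. This sketch only copies the numbers above into Python variables (the variable names are chosen for illustration) and recomputes the overflow percentage and the unused space.

```python
# Figures copied from the GENACCTRN_POSTED listing above.
modulus = 118_681
overflowed = 14_431
badly_overflowed = 778
total_space = 546_861_056   # "Total space for records"
record_data = 287_789_178
record_ids = 21_539_538


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


overflow_pct = percent(overflowed, modulus)
badly_pct = percent(badly_overflowed, modulus)
# Space not occupied by record data or record IDs.
unused = total_space - record_data - record_ids
unused_pct = percent(unused, total_space)

if __name__ == "__main__":
    print(f"overflowed groups: {overflow_pct:.1f}%")
    print(f"badly overflowed:  {badly_pct:.1f}%")
    print(f"unused space:      {unused} bytes ({unused_pct:.1f}%)")
```

The unused figure computed this way matches the "Unused space" line, and the overflow percentage agrees with the 12% mentioned at the top of the message.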

Chris



> From: keith.john...@datacom.co.nz
> To: u2-users@listserver.u2ug.org
> Date: Wed, 4 Jul 2012 14:05:02 +1200
> Subject: Re: [U2] RESIZE - dynamic files
> 
> Doug may have had a key bounce in his input
> 
> > Let's do the math:
> >
> > 258687736 (Record Size)
> > 192283300 (Key Size)
> > ========
> 
> The key size is actually 19283300 in Chris' figures
> 
> Regarding 68,063 being less than the current modulus of 82,850.  I think the 
> answer may lie in the splitting process.
> 
> As I understand it, the first time a split occurs group 1 is split and its
> contents are split between new group 1 and new group 2. All the other
> groups effectively get 1 added to their number. The next split is group 3
> (which was 2) into 3 and 4 and so forth. A pointer is kept to say where
> the next split will take place and also to help sort out how to adjust
> the algorithm to identify which group matches a given key.
> 
> Based on this, if you started with 1000 groups, by the time you have split
> the 500th time you will have 1500 groups.  The first 1000 will be
> relatively empty, the last 500 will probably be overflowed, but not
> terribly badly.  By the time you get to the 1000th split, you will have
> 2000 groups and they will, one hopes, be quite reasonably spread with
> very little overflow.
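
The role of the split pointer can be sketched in Python. This is textbook linear hashing with 0-based group numbers, where group p splits into p and p + base. That numbering is an assumption for illustration and differs from the renumbering described above, and it may not match what UniVerse does internally. The names `group_for` and `modulus_after` are hypothetical.

```python
def group_for(hash_value: int, base: int, splits_done: int) -> int:
    """Return the 0-based group for a hash value under linear hashing.

    base        -- number of groups at the start of the current doubling round
    splits_done -- groups already split this round (the "next split" pointer)
    """
    group = hash_value % base
    if group < splits_done:
        # This group has already been split, so use the doubled modulus to
        # choose between the original group and its new partner.
        group = hash_value % (2 * base)
    return group


def modulus_after(base: int, splits_done: int) -> int:
    """Current number of groups: each split adds exactly one group."""
    return base + splits_done


if __name__ == "__main__":
    base, splits = 1000, 500
    print(modulus_after(base, splits))    # groups after 500 splits
    print(group_for(1250, base, splits))  # falls in an already-split group
    print(group_for(1750, base, splits))  # falls in a group not yet split
```

Starting from 1000 groups, 500 splits give 1500 groups and 1000 splits give 2000, at which point the round is complete and every key is placed using the doubled modulus.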
> 
> So I expect the average access times would drift up and down in a cycle.
> The cycle time would get longer as the file gets bigger but the worst
> time would be roughly the same each cycle.
> 
> Given the power of two introduced into the algorithm by the before/after
> the split thing, I wonder if there is such a need to start off with a
> prime?
> 
> Regards, Keith
> 
> PS I'm getting a bit Tony^H^H^H^Hverbose nowadays.
> 
> _______________________________________________
> U2-Users mailing list
> U2-Users@listserver.u2ug.org
> http://listserver.u2ug.org/mailman/listinfo/u2-users
                                          
