Chris, I am still wondering what is prompting you to continue using the larger group size.
I think that Martin and the UV documentation are correct in this case; you would be as well or better off with the defaults.

-Rick

On Jul 5, 2012, at 9:13 AM, "Martin Phillips" <martinphill...@ladybridge.com> wrote:

> Hi,
>
> The various suggestions about setting the minimum modulus to reduce overflow are all very well, but effectively you are turning a dynamic file into a static one, complete with all the continual maintenance work needed to keep the parameters in step with the data.
>
> In most cases, the only parameter that is worth tuning is the group size, to try to pack things nicely. Even this is often fine left alone, though getting it to match the underlying o/s page size is helpful.
>
> I missed the start of this thread but, unless you have a performance problem or are seriously short of space, my recommendation would be to leave the dynamic files to look after themselves.
>
> A file without overflow is not necessarily the best solution. Winding the split load down to 70% means that at least 30% of the file is dead space. The implication of this is that the file is larger and will take more disk reads to process sequentially from one end to the other.
>
>
> Martin Phillips
> Ladybridge Systems Ltd
> 17b Coldstream Lane, Hardingstone, Northampton NN4 6DB, England
> +44 (0)1604-709200
>
>
> -----Original Message-----
> From: u2-users-boun...@listserver.u2ug.org [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Chris Austin
> Sent: 05 July 2012 15:19
> To: u2-users@listserver.u2ug.org
> Subject: Re: [U2] RESIZE - dynamic files
>
>
> I was able to drop from 30% overflow to 12% by making 2 changes:
>
> 1) changed the split from 80% to 70% (that alone reduced overflow by 10%)
> 2) changed the MINIMUM.MODULUS to 118,681 (calculated this way -> [ (record data + id) * 1.1 * 1.42857 (70% split load) ] / 4096 )
>
> My disk size only went up 8%.
>
> My file looks like this now:
>
> File name .................. GENACCTRN_POSTED
> Pathname ................... GENACCTRN_POSTED
> File type .................. DYNAMIC
> File style and revision .... 32BIT Revision 12
> Hashing Algorithm .......... GENERAL
> No. of groups (modulus) .... 118681 current ( minimum 118681, 140 empty, 14431 overflowed, 778 badly )
> Number of records .......... 1292377
> Large record size .......... 3267 bytes
> Number of large records .... 180
> Group size ................. 4096 bytes
> Load factors ............... 70% (split), 50% (merge) and 63% (actual)
> Total size ................. 546869248 bytes
> Total size of record data .. 287789178 bytes
> Total size of record IDs ... 21539538 bytes
> Unused space ............... 237532340 bytes
> Total space for records .... 546861056 bytes
>
> Chris
>
>
>> From: keith.john...@datacom.co.nz
>> To: u2-users@listserver.u2ug.org
>> Date: Wed, 4 Jul 2012 14:05:02 +1200
>> Subject: Re: [U2] RESIZE - dynamic files
>>
>> Doug may have had a key bounce in his input
>>
>>> Let's do the math:
>>>
>>> 258687736 (Record Size)
>>> 192283300 (Key Size)
>>> ========
>>
>> The key size is actually 19283300 in Chris' figures
>>
>> Regarding 68,063 being less than the current modulus of 82,850, I think the answer may lie in the splitting process.
>>
>> As I understand it, the first time a split occurs group 1 is split and its contents are split between new group 1 and new group 2. All the other groups effectively get 1 added to their number. The next split is group 3 (which was 2) into 3 and 4 and so forth. A pointer is kept to say where the next split will take place and also to help sort out how to adjust the algorithm to identify which group matches a given key.
>>
>> Based on this, if you started with 1000 groups, by the time you have split the 500th time you will have 1500 groups. The first 1000 will be relatively empty, the last 500 will probably be overflowed, but not terribly badly. By the time you get to the 1000th split, you will have 2000 groups and they will, one hopes, be quite reasonably spread with very little overflow.
>>
>> So I expect the average access times would drift up and down in a cycle. The cycle time would get longer as the file gets bigger but the worst time would be roughly the same each cycle.
>>
>> Given the power of two introduced into the algorithm by the before/after the split thing, I wonder if there is such a need to start off with a prime?
>>
>> Regards, Keith
>>
>> PS I'm getting a bit Tony^H^H^H^Hverbose nowadays.
>>
>> _______________________________________________
>> U2-Users mailing list
>> U2-Users@listserver.u2ug.org
>> http://listserver.u2ug.org/mailman/listinfo/u2-users
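
As a rough check of the arithmetic in Chris's MINIMUM.MODULUS formula quoted in this thread, here is a minimal Python sketch. It plugs in the record-data and record-ID totals from his file listing. The function name and the parameter names (headroom, split_load, group_size) are my own illustrative labels, not UniVerse terms, and the byte counts may have been taken at a different moment than when Chris did his sum, so treat the result as an approximation only.

```python
import math

def estimate_minimum_modulus(data_bytes, id_bytes, split_load=0.70,
                             headroom=1.1, group_size=4096):
    """Estimate a minimum modulus using the formula quoted in the thread:
    (record data + ids) * headroom / split_load / group_size.

    Names and defaults are illustrative assumptions, not UniVerse parameters.
    """
    payload = (data_bytes + id_bytes) * headroom   # data plus a 10% allowance
    space_needed = payload / split_load            # 1 / 0.70 is about 1.42857
    return math.ceil(space_needed / group_size)    # whole groups, rounded up

# Figures taken from the file listing in the thread.
data_bytes = 287789178   # Total size of record data
id_bytes = 21539538      # Total size of record IDs

modulus = estimate_minimum_modulus(data_bytes, id_bytes)
print(modulus)
```

With the figures from the listing this comes out at roughly 118,674, within a few groups of the 118,681 Chris settled on. As Martin notes, a fixed minimum like this has to be re-derived as the data grows, which is the maintenance cost of tuning it by hand.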