Very interesting. I see why 4 and 5 are not in your favor. In the key samples I tested (sequential integers, integers with alpha suffixes, zero-padded, etc.), type 4 was consistently worse than type 18, though other types would still beat 18. Sometimes 12 and 13 were just downright "evil." Something to ponder.

Thanks for posting your method of testing.

What I do on live data, where running all the combos of HASH.AID isn't feasible (millions of records), is take a random sample of the file and copy it into something manageable. Then, using RESIZE and GROUP.STAT (since the records are usually very "lumpy"), I compare percent standard deviations to gauge how evenly the records are spread across groups.
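
As a rough illustration of the statistic (in Python rather than UniVerse, and with made-up group counts), the "percent std deviation" here is just the standard deviation of records-per-group expressed as a percentage of the mean, i.e. the coefficient of variation. Lower means a more even hash:

from statistics import mean, pstdev

def percent_std_dev(group_counts):
    """Std deviation of records-per-group, as a percent of the mean."""
    avg = mean(group_counts)
    return 100.0 * pstdev(group_counts) / avg

# Hypothetical per-group counts for two candidate resizes of the
# same sampled file:
candidate_a = [48, 52, 50, 49, 51, 50, 47, 53]  # even spread
candidate_b = [5, 120, 3, 140, 2, 110, 4, 16]   # "lumpy" spread

print(f"A: {percent_std_dev(candidate_a):.1f}%")  # small: good hash
print(f"B: {percent_std_dev(candidate_b):.1f}%")  # large: bad hash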

--

Regards,

Clif



On Aug 14, 2004, at 12:16, Rosenberg Ben wrote:

Using a sample of files with no very large records,
or using ID-only test files with a null @RECORD,
for each filename, do
   {
   CLEAR-FILE DATA HASH.AID.FILE
   for a sample of reasonable moduli, do
      {
      PHANTOM HASH.AID filename 2,18 mod sep
      (the 2,18 asks HASH.AID to try file types 2 through 18
      at that modulo and separation)
      }
   SORT HASH.AID.FILE BY-DSND LARGEST.GROUP
   to see the worst file types.
   }
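
For anyone without a UniVerse system handy, here is a rough Python sketch of what that sweep is doing. The two hash functions below are generic stand-ins (the actual type 2 through 18 algorithms are UniVerse internals), and all the names are mine:

# Stand-in hashes; UniVerse's real file-type algorithms differ.
def hash_sum(key, modulo):
    """Naive hash: sum of the key's byte values."""
    return sum(key.encode()) % modulo

def hash_poly(key, modulo):
    """Polynomial rolling hash over the key's bytes."""
    h = 0
    for b in key.encode():
        h = (h * 31 + b) % modulo
    return h

def largest_group(ids, hash_fn, modulo):
    """Bucket every ID and return the biggest group's record count."""
    counts = [0] * modulo
    for key in ids:
        counts[hash_fn(key, modulo)] += 1
    return max(counts)

sample_ids = [str(n) for n in range(1, 2001)]  # sequential integer keys
results = []
for name, fn in (("sum", hash_sum), ("poly", hash_poly)):
    for modulo in (97, 101, 128):              # a sample of moduli
        results.append((largest_group(sample_ids, fn, modulo), name, modulo))

# Like SORT HASH.AID.FILE BY-DSND LARGEST.GROUP: worst combos first.
for worst, name, modulo in sorted(results, reverse=True):
    print(f"{name:4s} mod {modulo:4d}: largest group = {worst}")

The point is the same as with HASH.AID proper: sweep the candidate (type, modulo) combinations over a representative key sample and let the largest-group figure flag the pathological ones.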