I believe PiOpen used a directory with two files in it, ‘&$0’ and ‘&$1’, corresponding to DATA.30 and OVER.30. If the numbers went up from there, I think they corresponded to alternate keys, i.e. ‘&$2’ and ‘&$3’ represented DATA.30 and OVER.30 for the first alternate key.
I do not think that PiOpen supported statically hashed files. (Pr1me Information did.) All of that is a few years ago.

Unidata uses dat001 and over001, with the number increasing to allow for very large files (I think).

-Rick

On Jul 4, 2012, at 10:51 AM, Wols Lists wrote:

> On 04/07/12 11:26, Brian Leach wrote:
>>> All the other groups effectively get 1 added to their number
>>
>> Not exactly.
>>
>> Sorry to those who already know this, but maybe it's time to go over linear
>> hashing in theory...
>>
>> Linear hashing was a system devised by Litwin, originally only for
>> in-memory lists. In fact there are some good implementations in C# that
>> provide better handling of Dictionary types. Applying it to a file system
>> adds some complexity, but it's basically the same theory.
>>
>> Let's start with a file that has 100 groups initially defined (that's 0
>> through 99). That is your minimum starting point, and it ensures the file
>> never shrinks below that, so it doesn't begin its life with loads of splits
>> right from the start as you populate the file. You would size this similarly
>> to the way you size a regular hashed file for your initial content: no point
>> making work for yourself (or the database).
>>
>> As data gets added, because the content is allocated unevenly, some of that
>> load will be in primary and some in overflow: that's just the way of the
>> world. No hashing is perfect. Unlike a static file, the overflow can't be
>> added to the end of the file as a linked list (* why nobody has done managed
>> overflow is beyond me); it has to sit in a separate file.
>
> I don't know what the definition of "badly overflowed" is, but assuming
> that a badly overflowed group has two blocks of overflow, then those
> file stats seem perfectly okay. As Brian has explained, the distribution
> of records is "lumpy", and as a percentage of the file there aren't many
> badly overflowed groups.
> You've got roughly 1/3 of the groups overflowed - with an 80% split load
> that doesn't seem at all out of order: on average each group is 80% full,
> so 1/3 of the groups being more than 100% full is fine.
>
> You've got (in thousands) one and a half groups badly overflowed out of
> eighty-three. That's less than two percent. That's nothing.
>
> As for why no-one has done managed overflow, I think there are various
> reasons. The first successful implementation (Prime INFORMATION) didn't
> need it. It used a peculiar type of file called a "Segmented Directory",
> and while I don't know for certain what PI did, I strongly suspect each
> group had its own normal file, so if a group overflowed, it just created
> a new block at the end of the file. Same with large records: it
> allocated a bunch of overflow blocks. This file structure was far more
> evident with PI-Open - at the OS level a dynamic file was an OS directory
> with lots of numbered files in it.
>
> The UV implementation of "one file for data, one file for overflow" may
> be unique to UV. I don't know. What little I know of UD tells me it's
> different, and others like QM could well be different again. I wouldn't
> actually be surprised if QM is like PI.
>
> Cheers,
> Wol
>
> _______________________________________________
> U2-Users mailing list
> U2-Users@listserver.u2ug.org
> http://listserver.u2ug.org/mailman/listinfo/u2-users
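For anyone following along who hasn't seen Litwin's scheme before, Brian's description above can be sketched as a small in-memory model. This is purely illustrative - the class name, the `min_groups` starting point, and the 80% split-load trigger are assumptions drawn from the discussion, not any U2 product's actual on-disk implementation:

```python
class LinearHashFile:
    """Toy in-memory model of linear hashing (Litwin), as described above.

    Groups split one at a time, in sequence, whenever the overall load
    factor passes a threshold - note the group that splits is generally
    NOT the group that just received the record.
    """

    def __init__(self, min_groups=100, load_threshold=0.8, group_size=4):
        self.min_groups = min_groups        # file never shrinks below this
        self.modulus = min_groups           # current base number of groups
        self.split_ptr = 0                  # next group due to be split
        self.group_size = group_size        # nominal records per group
        self.load_threshold = load_threshold
        self.groups = [[] for _ in range(min_groups)]
        self.count = 0

    def _group_for(self, key):
        h = hash(key)
        g = h % self.modulus
        if g < self.split_ptr:              # this group already split this round,
            g = h % (2 * self.modulus)      # so use the doubled modulus
        return g

    def insert(self, key, value):
        self.groups[self._group_for(key)].append((key, value))
        self.count += 1
        if self.count / (len(self.groups) * self.group_size) > self.load_threshold:
            self._split()

    def _split(self):
        old = self.split_ptr
        self.groups.append([])              # new group lands at old + modulus
        moving, self.groups[old] = self.groups[old], []
        for k, v in moving:                 # rehash the split group's records
            self.groups[hash(k) % (2 * self.modulus)].append((k, v))
        self.split_ptr += 1
        if self.split_ptr == self.modulus:  # round complete: double the modulus
            self.modulus *= 2
            self.split_ptr = 0

    def lookup(self, key):
        for k, v in self.groups[self._group_for(key)]:
            if k == key:
                return v
        return None
```

The point Brian makes falls out of the model: because splits march through the groups in order, an individual group can sit well over 100% full (in overflow) for a while before the split pointer reaches it, which is why some overflow is normal and expected in a healthy dynamic file.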