On Wednesday, July 29, 2015 at 10:38:56 AM UTC-4, Tom Breloff wrote:
> Scott: Is your number format a public (open source) specification? How
> does it differ from decimal floating point?
It was just a packed storage format (think serialize/deserialize in Julia) that I invented some 20+ years ago to store Mumps values efficiently. The format was documented by sales engineers, and customers certainly figured it out easily enough, even though I wanted it kept opaque (so that I could add things later).

It was trivial: a length byte, followed optionally by a type byte, and optionally further data. The length byte counted itself, so 0 was available as a special marker indicating that a longer length was stored (unsigned, LE) in the next two bytes. Later on, that was extended so that if the 2-byte length was 0, it was followed by a 4-byte unsigned length. A length byte of 1 indicated an undefined or null value.

Storing a 0 took only 2 bytes: a length of 2 and the marker for non-negative integer. -1 also took only 2 bytes: a length of 2 and the marker for negative integer. The *format* could handle arbitrary-length integers, although the code only supported 0-8 bytes after the type byte, so 64-bit max; values >= 2^63 or < -2^63 would get an error if read in. Other type bytes indicated a binary or 8-bit text string, UTF16LE Unicode values, a 1-byte power-of-10 scale followed by 0-8 bytes, and, later on, IEEE doubles.

No big deal, many people have come up with similar schemes, but I optimized it for space, which made it very useful for getting good performance by packing as much data as possible into the B+ tree nodes.
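To make the framing concrete, here is a rough Python sketch of the length header and the two integer types. Several details are assumptions made purely for illustration and are not given in the description: the type byte values (`TYPE_POS_INT`, `TYPE_NEG_INT`), the rule that the 2-byte and 4-byte lengths count only the bytes after the header, the little-endian byte order of integer data, and storing a negative integer as its bitwise complement (so that -1 needs no data bytes). The function names are hypothetical as well.

```python
import struct

# Hypothetical type markers; the real byte values are not given in the post.
TYPE_POS_INT = 0x04  # non-negative integer
TYPE_NEG_INT = 0x05  # negative integer


def pack_item(body: bytes) -> bytes:
    """Prefix body (type byte + data) with the variable-size length header."""
    total = len(body) + 1  # short form: the length byte counts itself
    if total <= 0xFF:
        return bytes([total]) + body  # an empty body gives length 1 (undefined)
    if len(body) <= 0xFFFF:
        # Assumed: extended lengths count only the bytes after the header.
        return b"\x00" + struct.pack("<H", len(body)) + body
    return b"\x00\x00\x00" + struct.pack("<I", len(body)) + body


def unpack_item(buf: bytes, pos: int = 0):
    """Return (body, next_pos); body is None for the undefined marker."""
    n = buf[pos]
    if n == 1:
        return None, pos + 1
    if n >= 2:
        return buf[pos + 1:pos + n], pos + n
    # Length byte 0: a 2-byte unsigned LE length follows.
    (size,) = struct.unpack_from("<H", buf, pos + 1)
    start = pos + 3
    if size == 0:
        # 2-byte length 0: a 4-byte unsigned LE length follows.
        (size,) = struct.unpack_from("<I", buf, pos + 3)
        start = pos + 7
    return buf[start:start + size], start + size


def encode_int(n: int) -> bytes:
    """Encode an integer using as few data bytes as possible."""
    if n >= 0:
        type_byte, magnitude = TYPE_POS_INT, n
    else:
        type_byte, magnitude = TYPE_NEG_INT, ~n  # assumed: -1 maps to 0
    data = magnitude.to_bytes((magnitude.bit_length() + 7) // 8, "little")
    return pack_item(bytes([type_byte]) + data)


def decode_int(item: bytes) -> int:
    """Decode an integer item, rejecting anything outside signed 64-bit."""
    body, _ = unpack_item(item)
    if body is None:
        raise ValueError("undefined value")
    type_byte, data = body[0], body[1:]
    if len(data) > 8:
        raise ValueError("more than 8 data bytes")
    magnitude = int.from_bytes(data, "little")
    if type_byte == TYPE_POS_INT:
        n = magnitude
    elif type_byte == TYPE_NEG_INT:
        n = ~magnitude
    else:
        raise ValueError("not an integer item")
    if not -2**63 <= n < 2**63:
        raise OverflowError("value does not fit in a signed 64-bit integer")
    return n
```

Under these assumptions, `encode_int(0)` and `encode_int(-1)` each produce exactly 2 bytes (a length of 2 plus the type marker), and `decode_int(encode_int(2**63))` raises `OverflowError` even though the value fits in 8 data bytes, which mirrors the gap between what the format could hold and what the code would read.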