Re: Module namespace for IEEE754 encoder/decoder

Timothe Litt Thu, 03 Jun 2021 08:59:11 -0700

> I would recommend against using the "top level" space, Number::Binary,
> as that would reserve a namespace which seems to cover a very broad
> topic, binary numbers, for just encoding/decoding. 
Fair point.


Number::Encode seems to be a hash generator for sort-of-randomish
sort-of-hashes; not well documented, and its current usefulness is
unclear.  Not sure I'd want to adopt that.

Binary does seem like a better name for something that converts many
number formats to and from bits.

How about Number::Binary::Encode->encoder, Number::Binary::Encode->decoder?

That takes only one name from Number::Binary and leaves the rest of
Number::Binary free for other uses, and avoids having to adopt
Number::Encode.  The specific formats use N::B::E::<formatname> for
their implementations.

So someone later could have N::B::Bitcount, N::B::Bitmask,
N::Binary::CountByElevens, or whatever comes to mind.

I already suggested sub-modules - putting formats directly in
N::B::Encode itself doesn't scale well, makes the module large (typical
users will, I expect, only need a small number of formats), harder to
maintain, likely doesn't conserve namespace if encodings become objects,
and makes it harder for other people to contribute new formats.
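
A rough sketch of the dispatch described above, under stated assumptions:
the format name "uint24be", its implementation, and the _load helper are
hypothetical and only illustrate the shape.  A real format would live in
its own file under Number/Binary/Encode/ and be require'd on demand; it
is defined inline here only so the example runs by itself.

```perl
use strict;
use warnings;

# Hypothetical format sub-module (normally its own .pm file).
package Number::Binary::Encode::uint24be {
    sub name { 'unsigned 24-bit integer, big-endian' }

    # Return closures, so any per-format constants are computed once.
    sub encoder { return sub { substr( pack( 'N', $_[0] ), 1 ) } }
    sub decoder { return sub { unpack( 'N', "\0" . $_[0] ) } }
}

# The one name taken from Number::Binary: a thin wrapper that maps a
# format name to a sub-module and hands back what that module provides.
package Number::Binary::Encode {
    use Carp qw(croak);

    sub _load {
        my ( $class, $format ) = @_;
        croak "bad format name '$format'" unless $format =~ /\A\w+\z/;
        my $module = "${class}::$format";

        # Load from disk only if the package is not already defined.
        unless ( $module->can('encoder') ) {
            ( my $file = "$module.pm" ) =~ s{::}{/}g;
            eval { require $file; 1 } or croak "unknown format '$format'";
        }
        return $module;
    }

    sub encoder { my ( $class, $format ) = @_; $class->_load($format)->encoder }
    sub decoder { my ( $class, $format ) = @_; $class->_load($format)->decoder }
    sub name    { my ( $class, $format ) = @_; $class->_load($format)->name }
}

package main;

my $enc   = Number::Binary::Encode->encoder('uint24be');
my $dec   = Number::Binary::Encode->decoder('uint24be');
my $bytes = $enc->(0x123456);

printf "%s: %d\n", Number::Binary::Encode->name('uint24be'), $dec->($bytes);
```

With that shape, adding a format means adding one file; neither the
wrapper nor the callers change.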


On 03-Jun-21 11:05, Diab Jerius wrote:
> The free-for-all that is CPAN namespaces is particularly hard to
> navigate, and I would be enthusiastic if we could create a hierarchy
> that makes it easy to understand the relationship between them.
>
> I would recommend against using the "top level" space, Number::Binary,
> as that would reserve a namespace which seems to cover a very broad
> topic, binary numbers, for just encoding/decoding.  It would also
> imply similar functionality for a parallel Number::Decimal, or some
> other numbering system.
>
> What about using the existing Number::Encode hierarchy, and naming the
> module Number::Encode::Binary?  It's not a bad fit for what you're
> doing, and since the precedent for Number::Encode is already set, it
> makes it easier to find it.  Write the API so that it is easy to
> incorporate additional formats, either via subclassing by other sub
> modules in the N::E::B namespace, or by adding them to the N::E::B
> module itself.
>
> Number::Encode is probably ripe for adoption, and could be repurposed
> for a top level "documentation of namespace" module, maintaining the
> existing code for any possible existing users, but relegating it to a
> historical (albeit working) footnote.
>
>
>
> On 6/3/21 9:39 AM, Timothe Litt wrote:
>>
>> Hmmm, Number::Binary doesn't seem to be taken, at least according to
>> MetaCPAN.
>>
>> What about $enc = Number::Binary->encoder("name"); $dec =
>> Number::Binary->decoder("name");?
>>
>> You could also make it easy to get inverse functions by accepting an
>> object; e.g. $dec = Number::Binary->decoder($enc);
>>
>> e.g. $e80 = Number::Binary->encoder("intel80")->encode(2.71828);
>>
>>     printf( "%s: %s\n", Number::Binary->name("intel80"),
>>         Number::Binary->decoder("intel80")->decode($e80) );
>>
>>     >> Intel extended floating point: 2.71828
>>
>> Supporting BigXXX for the native value seems important, not only for
>> the large floats, but some ints.
>>
>> E.g. Double precision integers (128 bit, or 64 bit on a 32-bit
>> platform - or 72 bit on a 36-bit platform)
>>
>> You're the expert; whether the encoders figure out whether bignum was
>> used in the caller, or the decoder constructors take a "use_big"
>> option, or the result of decode is specified as a possibly overloaded
>> object that can do math, or ... I leave to you.
>>
>> Besides finessing bignums, making the output of decode an object
>> allows for a path to info methods - besides name, sizes common to
>> all, maybe things like "smallest number of bits that are needed for
>> this value" (e.g. would this fit in a smaller format without loss of
>> precision?)  Not sure what the right list would be, but once there's
>> one, it will probably grow.
>>
>> Yes, small can get interesting too - e.g. saturating 8-bit bytes
>> packed in something bigger...
>>
>> Sounds like fun.  Hope this helps.  Good luck.
>>
>> On 03-Jun-21 08:52, Peter John Acklam wrote:
>>> Thanks for the feedback!
>>>
>>> I see your point regarding the Math:: namespace and agree that it
>>> isn't the best. Alas, Number::Encode is already taken. I suggested
>>> Number::Pack and Number::Unpack because they aren't taken and because
>>> of the module's similar functionality to pack() and unpack().
>>>
>>> I agree that there should be a simple wrapper and that the format
>>> should not be a part of the module name specified by the user. I also
>>> agree about not assuming floating point. Actually, one of the use
>>> cases is encoding/decoding unsigned 24 bit integers, which are used by
>>> ImageMagick when reading/writing PAM (portable anymap) images.
>>>
>>> There is also Data::IEEE754, but I think the Data:: namespace is too
>>> general. I will only be dealing with numbers.
>>>
>>> Peter
>>>
>>> tor. 3. jun. 2021 kl. 13:22 skrev Timothe Litt <tlhack...@cpan.org>:
>>>> I'd be a bit careful about assuming floating point - will someone
>>>> want to pack/unpack BCD? Or PDP-10 Gfloat (well, OK that's a
>>>> floating format)?  Or...
>>>>
>>>> I don't like Math:: - it implies that it does arithmetic (or
>>>> calculus, or statistics, or - more than a conversion).
>>>>
>>>> And I'd rather not have a format name encoded in the module that
>>>> the user calls.
>>>>
>>>> How about Number::Encode->new("name") & Number::Decode->new("name")?
>>>>
>>>> Let "name" get to a subclass, so other formats can be supported
>>>> just by adding a module - e.g. "Number::Encode::BCD" could be
>>>> require'd if *->new('bcd') called.  Obviously, you'd implement
>>>> IEEE754, Intel80, and whatever else...
>>>>
>>>> Define the API for the subclasses - encode(),decode(), perhaps some
>>>> info functions (e.g. a printable name, perhaps exponent and
>>>> fraction range/#bits, ...)
>>>>
>>>> Then someone who wants Number::Decode::VAX_DFLOAT just calls
>>>> Number::Decode->new('vax_dfloat') - after writing it.
>>>>
>>>> Some of these can get interesting if you want to decode and
>>>> actually do math - presumably you'll support Math::BigXxx / bignum?
>>>> (binary128, VAX H_Floating are, IIRC about 36 decimal digits)
>>>>
>>>> And some program that reads archived data can have a description
>>>> language that is simply "name"  "format" "byte offset" "length",
>>>> and not worry about what module handles what format.  In fact, such
>>>> a program might appreciate the trivial modules
>>>> Number::Encode::INTEGER32 (and perhaps the less obvious
>>>> Number::Encode::INTEGER32_ONESCOMPLEMENT)...
>>>>
>>>> I suspect there are better names for the format, but the idea is to
>>>> export a simple wrapper so the next format can be added by anyone,
>>>> and the callers don't have to know too much.
>>>>
>>>> FWIW.
>>>>
>>>>
>>>> On 03-Jun-21 06:23, Peter John Acklam wrote:
>>>>
>>>> I also plan to implement the 80 bit "extended precision" format, which
>>>> is not IEEE 754 compatible. Perhaps the best and simplest is
>>>> Number::Pack and Number::Unpack?
>>>>
>>>> Peter
>>>>
>>>> tor. 3. jun. 2021 kl. 11:43 skrev Peter John Acklam
>>>> <pjack...@gmail.com>:
>>>>
>>>> Hi
>>>>
>>>> I am working on two modules for encoding and decoding numbers as
>>>> per IEEE754. The pack() function can encode and decode the formats
>>>> binary32 (single precision) and binary64 (double precision). My
>>>> module can also handle binary128 (quad precision), binary16 (half
>>>> precision), bfloat16 (not an IEEE754 format, but it follows the
>>>> IEEE754 pattern), and a few other formats.
>>>>
>>>> My question is about the namespace. Is Math::IEEE754::Encoder (and
>>>> ...::Decoder) OK? Or is Number::IEEE754::Encoder better? Or any other?
>>>>
>>>> Here is an example showing how I use it:
>>>>
>>>> my $encoder = Math::IEEE754::Encoder -> new("binary16");
>>>> my $bytes = $encoder -> (3.14159265358979);  # = "\x42\x48"
>>>>
>>>> my $decoder = Math::IEEE754::Decoder -> new("binary16");
>>>> my $number = $decoder -> ($bytes);               # = 3.140625
>>>>
>>>> The reason for returning an anonymous function rather than
>>>> implementing the function directly, is speed. There are some
>>>> constants involved, and I don't want to compute them for each
>>>> function call.
>>>>
>>>> Cheers,
>>>> Peter John Acklam (PJACKLAM)
>
