On Wednesday, 30 May 2012 at 19:43:32 UTC, Dmitry Olshansky wrote:
On 30.05.2012 23:25, Era Scarecrow wrote:
That overhead is for concatenation ? Did it ever occurred to
you that arrays do sometimes reallocate on append. And of
course a ~ b should produce new copy (at least semantically).
Why bit array shouldn't follow the same design?
Yes, a ~ b of course creates a new one. a ~= true or a ~= b is
different. To do slices I did a simple start/end position pair
and it keeps the current array (Not too unlike a regular array).
When you slice it just adjusts the starting/ending location, and
'reserved' gets truncated based on the slice. Course I might be
able to drop that to a single bit meaning it was a slice or the
original, but that still has to come from somewhere, due to
alignment it would likely still be a 32/64bit size_t, unless I
have other stuff to go with it.
I fail to see how the whole 4 byte word has to be wasted.
Obviously you can denote 1 byte for small length. Or maximum
2bytes in both cases.
Truthfully only 2 bytes are wasted for the compact version (As a
flag), and 2 bytes are used for the start/ending locations.
On 64bit systems, you'll have 32bytes, 24 avaliable, so 192
bits. or 40bytes with 32, giving 256 bits. I will the current
to at least match the regular/normal one equally so some space
isn't lost, but it won't be by much.
Same thought where these 8 bytes go ? Why would I pay 8 bytes
for possible concatenation if I happen to use fixed sizes or
resize them rarely? Pay as you go is the most sane principle.
(built-in arrays do have append cache if you didn't know so
they already reserve space in advance(!))
But unfortunately bit arrays aren't normal arrays. If you did
exactly the amount to fill the size_t types, then you won't have
extra space and it would have to be resized to concatenate
(assuming it could be). I can also only do what makes sense to
me. If I don't have a 'reserved' somehow, every concatenation
could require a resize/copy to ensure the slices don't overlap.
So a ~= true; a ~= true; would do 2 forced resizes/copies, which
is silly but would be required.
If you drop it down to a simple pointer (like the original),
than slices
and extra features and optimizations go away.
Slice could be another type BTW. That type should be implicitly
convertible to BitArray on assignment. Thus slice would have
start/end pair of ulongs while array itself wouldn't.
Sounds good?
Sounds workable... Although it increases the complexity a bit,
and even more functions to make just to cover the compatibility
and differences. Since the BitArray is a struct rather than a
class, if it were a class, then it would be easier to do some of
this, but the hidden overhead then becomes greater than the
struct overhead. Then there's also if you decide you want a
uneven array not divisible by size_t bits, you'd get stuck with a
slice anyways or the length would be wrong all the time.
a.length = 10;
assert(a == 10); //breaks! Length is 32/64
a.length = 32;
assert(a == 32); //okay on 32bit machines
a ~= true;
assert(a == 33); //Breaks! length is now 64/128
Choices choices.... Simpler seems
Having it implicitly convertible means any time you would want
to do a bitarray and a slice, the slice needs to reallocate so
it's beginning offset is 0 and not possible 0 - size_t*8, or
making a slice actually makes a unique copy each time, which is
it's own overhead but makes the structure smaller, or making a
slice version of a BitArray, but once again doubles complexity
and requires helper versions for all of them. The simple bitarray
suddenly is becoming very large and complex just to save a few
bytes.
When I did C programming before I was always anal about space
which I find now is rather silly. I would do something like
struct {
unsigned int a : 5;
unsigned int b : 5;
unsigned int c : 10;
unsigned int d : 2;
}
etc etc. Any time my requirements changed although I needed to
change a few bits, having the size difference between 4 bytes and
32 bytes depending on how many times the object is made seems
small and silly. I have gigs of memory free and only only a
handful of the objects made at any given time makes it seem silly.
Maybe I have it all wrong, or maybe there's much more to worry
about than originally thought.