On 29-Jul-12 03:16, Era Scarecrow wrote:
On Saturday, 28 July 2012 at 23:02:17 UTC, Dmitry Olshansky wrote:
On 29-Jul-12 02:31, Era Scarecrow wrote:
On Saturday, 28 July 2012 at 22:19:05 UTC, Dmitry Olshansky wrote:
Mhm? Not getting you it at all. What's locking it ?

What's wrong with:
struct BitArray
{

union {
   struct {
       size_t[] data;
       ulong length;    //in bits

//length in words would be just
// (length + size_t.sizeof*4) / // (size_t.sizeof*8);
// as it's a power of 2 it should get optimized

   }
   size_t[2+ulong.sizeof/size_t.sizeof] compact;

// and say 1 bit is used for isCompact would be great if it could use
// the least significant bit of pointer from data[], dunno if it's
easy enough

I'd rather not rely on the pointer at all as a reference for if it's
compact.

No problem, I redesigned a bit:

struct BitArray
{

union {
    struct {
        ulong length;    //in bits, the top bit used as isCompact
        size_t[] data;
    }
    struct{
    ushort smallLength; //here as well
    ubyte[size_t.sizeof*2+ulong.sizeof-2] compact;
    }
}

Up 2^63 of bits would occupy 2^60 of bytes should be good enough for
the years to come.
 I'm not seeing it as a problem as at very least processors have to
actually be able to address more then 42 (or was it 48?) bits that
they can now. (see AMD or Intel or ARM datasheets on 64 bit chips)

2^63, or drop 8 so it takes up no more than 2^55 bytes of memory. Yes
that's quite a large chunk.

  Curiously enough the current structure is already smaller than the
proposed unions.

OK. Now back to the biggest question:
Slices use references (sometimes) to previous bitarray. Since it's a slice (like an array) this could be expected.

The "sometimes" part is not god enough! The problem is that small string optimization (or small array optimization) doesn't work with reference semantics. You may dig up discussion on proposal for small string optimization for char[] and string that was shot down on these grounds. Bulit-in like semantics call for simple ptr/ptr+length struct and you can't get that can you?.

E.g.

void mess_with(BitArray ba)
{
ba ~= true;
ba ~= false;
 //does it change anything outside this function? *maybe*
ba[0] ^= true;
ba[1] ^= true;
auto x = ba; //even if used ref BitArray ba, x *maybe* references ba
//is x now a reference or a copy? *maybe*

}

Thus BitArray either becomes value type (that may come as unexpected, see your option c/d). Or it becomes full reference type as in option a (though it sucks). Sorry but b is no go, as it makes code unpredictable I'd rather take my idea with separate ranges then b.

Solutions:
a) Drop the union, make all functions @safe (compared to @trusted), and they are all ref/slices b) Leave as sometimes ref, sometimes not; Use .dup when you NEED to ensure different unique copies. c) Apply COW (Copy-On-Write) where a known slice type (canExpand is false) when a change would apply, it replaces the present slice into it's own unique array. d) All slices become new dup'ed allocated arrays (Most expensive of them all)

Because of d, I proposed that slices be separate type that is always a reference to original. Then only coping arrays themselves involves dup.


Begin takes extra ulong I'm not sure it is needed in the container
itself.

  No it doesn't. bitfields are used to handle the beginning/ending
offsets, equaling a total of 14 bits. It's the most confusing part
(aside from [offset + sliceoffset + i[ formulas). canExpand and
isCompact fill the other two bits.

OK, finally I think I understand how your current version works. It's clever but leaves you in half-way value-reference state.

To point out few problems with unification of slice type and BitArray:

- Every time a small slice is taken say [1..30] it becomes a small array(?) thus it is a value type and doesn't affect original array

- No idea if a = b make a copy will make programmers nervous, esp if one is writing a library function as in this case there is no assumption that fits all

- Same thing but more, does this work?
        auto c = ba; //if ba is compact
        ba ~= true;
        assert(c.length = ba.length-1);
//c should remain with the same length unaffected (at least like built in arrays)

--
Dmitry Olshansky

Reply via email to