Sometimes I find bugs talking to myself so .. > Maybe its the collection of constructor things ... they use a weird format now > to optimise space. EG option type None | Some t is a pointer to t or NULL > for NONE .. I think .. not a variant record.
To improve the performance of union (sum) types, src/compiler/flx_cpp_backend/flx_vrep.ml uses this type: type variant_rep = VR_self | VR_int | VR_packed | VR_uctor This says a union is represented by an argument value, and integer a pointer with an integer packing the low 2 bits, or a standard union constructor (which is a pair consisting of an integer and a pointer). VR_self is supposed to be used for a type like union single = | Single of int; i.e. where there's just one constructor: in that case case is represented by the argument (here, an int). This representation is not currently used because I had trouble getting the C++ typing right :) VR_int is used in a case like: enum X {a,b,c}; where no constructor has an argument, it's just the case index (zero origin). VR_packed is used when there are at most 4 constructors. In this case a pointer to the heap allocated argument is used, or NULL if there's no argument, with the case index bitwise or-ed into the pointer. Since there are at most for cases the index is at most 2 bits. Since pointers into the heap are always 8 byte aligned on a 32 bit machine the low 3 bits of all heap pointers are always zero: 4 bits on a 64 bit machine due to 16 byte alignment. [So actually we could handle 8 cases on a 32 bit machine and 16 on a 64 bit machine .. but the current rep will work even on a 16 bit machine] These macros are used to manipulate the parts, they're generated in lib/rtl/flx_rtl_config.hpp: #define FLX_RAWADDRESS unsigned long #define FLX_MAX_ALIGN 16 // get variant index code and pointer from packed variant rep #define FLX_VP(x) ((void*)((FLX_RAWADDRESS)(x) & ~(FLX_RAWADDRESS)0x03)) #define FLX_VI(x) ((int)((FLX_RAWADDRESS)(x) & (FLX_RAWADDRESS)0x03)) // make a packed variant rep from index code and pointer #define FLX_VR(i,p) ((void*)((FLX_RAWADDRESS)(p)|(FLX_RAWADDRESS)(i))) As an example, in lib/std/list.flx we see: union list[T] = | Empty | Cons of T * list[T]; proc splice[T] : &list[T] * list[T] = """ { // list splice struct node_t { ?1 elt; void *tail; }; void **p = $1; while(*p) p = &((node_t*)FLX_VP(*p))->tail; *p = $2; } """ ; The option type None | Some of T should also benefit from the packed representation. As an aside, the Ocaml code calculating the representation is suboptimal: let cal_variant_rep bsym_table t = let n = cal_variant_cases bsym_table t in let z = cal_variant_maxarg bsym_table t in match n,z with | -1,_ -> assert false (* Remove this case temporarily because it is a bit tricky to implement *) (* | 1,_ -> VR_self *) (* only one case do drop variant *) | _,0 -> VR_int (* no arguments, just use an int *) | k,_ when k <= 4 -> VR_packed (* At most 4 cases, encode caseno in point low bits *) | _,_ -> VR_uctor (* Standard Uctor *) Here variant_maxarg makes a valiant attempt to calculate the argument size. The idea is if any arguments are a suitably aligned pointers, we can use that pointer directly with the low bits holding the case index, instead of heap allocating it and using the pointer to the heap. That calculation is done but ignore in the representation calculation (along with VR_self) because it requires a lot more work (and testing) to ensure the interpretation is correct everywhere. NOW to the point of all this!!!! The garbage collector doesn't know anything about this! Strangely .. it should still work! The reason is: when a possible pointer is fetched, it is looked up using JudyLLast in the map from allocated pointers to shapes. This finds the equal or lower value allocated pointer. We then check the shape object for the length of the shape and add this to the actual pointer found to find a high bound for the object, and check the pointer candidate to see if it is interior to the object. Well .. if you consider a pointer p | 0x3 which finds p, then if the length is at least 4 then p | 0x3 is interior to the object. Interesting .. this is true for all data types other than "short" or "char".. so actually there's a bug UNLESS the allocator ensures all allocations are for at least 4 bytes ... if(amt & 1) ++amt; // round up to even number In fact it ensures at least 2. Except for malloc(0) of course .. Note however that the SHAPE object might be smaller... The bottom line: its a bug BUT I don't believe it is causing a problem in any of my code that's crashing... ;( -- john skaller skal...@users.sourceforge.net ------------------------------------------------------------------------------ RSA(R) Conference 2012 Save $700 by Nov 18 Register now http://p.sf.net/sfu/rsa-sfdev2dev1 _______________________________________________ Felix-language mailing list Felix-language@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/felix-language