Re: [Pharo-project] [Vm-dev] Re: Plan/discussion/communication around new object format
Some extra ideas.

1. Avoiding the extra header for big objects.

I am not sure about this, but still... According to Eliot's design:

8: slot size (255 => extra header word with large size)

What if we extend the size field to 16 bits (so up to 65536 slots in total) and add a single flag that says how to calculate the object size:

flag(0) object size = (size field) * 8
flag(1) object size = 2^(size field)

This means that past 2^16 (or however many bits we dedicate to the size field in the header) all object sizes will be powers of two. Since most objects will fit under 2^16, we don't lose much. For big arrays, we could have a special collection/array that stores the exact size in its inst var (and we don't even need to care in the case of Sets/Dicts/OrderedCollections). We can also make it transparent:

Array class>>new: size
    size > (max exact size) ifTrue: [ ^ ArrayWithBigSizeWhatever new: size ]

Of course, care must be taken with those variable classes that can potentially hold large amounts of bytes (like Bitmap). But I think code can be quickly adapted to this feature of the VM, which will simply fail the #new: primitive if the size is not a power of two for sizes greater than the max "exact size" that fits into the size field of the header.

2. Slot for arbitrary properties.

If you read carefully, Eliot said that to make lazy become work it is necessary to always have some extra space per object, even if the object doesn't have any fields:

<>

So this fits quite well with the idea of having a slot for dynamic properties per object. What if, instead of "extending the object" when it requires an extra properties slot, we just reserve the slot for properties at the very beginning:

[ header ] [ properties slot ] ... rest of data ..

So any object will have that slot. And in the case of lazy become, we can use that slot to hold the forwarding pointer. Voila.

3. From 2, we go straight back to the hash...
The VM doesn't need to know about such a thing as an object's hash. It has no semantic load inside the VM; the VM just answers those bits through a single primitive. So why is it a kind of enforced, inherent property of all objects in the system? And why does nobody ask: if we have that one, why could we not have more than one, or as many as we want? This is my central question around the idea of having per-object properties.

Once the VM guarantees that any object can have at least one slot for storing an object reference (the property slot), the VM no longer needs to care about identity hash, because it can be implemented completely on the language side. But most of all, we are NO longer limited in how big or small hash values are, which converts directly into bonuses: fewer hash collisions > more performance. Want a 64-bit hash? 128-bit? Whatever you desire:

Object>>identityHash
    ^ self propertiesAt: #hash ifAbsentPut: [ HashGenerator newHashValue ]

And once we have per-object properties and lazy become, things like Magma will get HUGE benefits straight out of the box. Because look: lazy become and immutability address many problems related to OODB implementation (I barely see other use cases where immutability would be as useful as in the case of an OODB). So for me it is logical to take this last step: by adding arbitrary properties, an OODB can now store the ID there.

--
Best regards,
Igor Stasenko.
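
The size encoding from point 1 of the message above can be illustrated with a small Python sketch. This is not VM code: the function names are made up for illustration, the 8-byte slot is an assumption (a 64-bit VM), and reading flag(1) as a byte size follows the formula as written in the message.

```python
SIZE_FIELD_BITS = 16                          # width proposed in the message
MAX_EXACT_SLOTS = (1 << SIZE_FIELD_BITS) - 1  # largest count stored exactly
BYTES_PER_SLOT = 8                            # assumption: 64-bit slots


def object_size_in_bytes(flag, size_field):
    """Decode the two cases given in the message."""
    if flag == 0:
        return size_field * BYTES_PER_SLOT    # flag(0): (size field) * 8
    return 1 << size_field                    # flag(1): 2^(size field)


def encode_slot_count(slots):
    """Return (flag, size_field), or raise ValueError where #new: would fail."""
    if 0 <= slots <= MAX_EXACT_SLOTS:
        return 0, slots
    size_in_bytes = slots * BYTES_PER_SLOT
    # Past the exact range only power-of-two sizes are representable.
    if slots < 0 or size_in_bytes & (size_in_bytes - 1):
        raise ValueError("sizes past the exact range must be a power of two")
    return 1, size_in_bytes.bit_length() - 1
```

In this model a request for 100000 slots is rejected, which is the case where the message suggests falling back to a language-side collection that remembers its exact size.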
Re: [Pharo-project] [Vm-dev] Re: Plan/discussion/communication around new object format
On Wed, 30 May 2012, Igor Stasenko wrote:

> Here are a couple (2) of my highly valuable cents :)
>
> 2^20 for classes? That might be fine (or even overkill) for Smalltalk systems, but it could be quite limiting for someone who would like to experiment with implementing prototype-based frameworks, where every object is a "class" by itself.

I think it's more important to have a fast Smalltalk VM than one that is slower but might fit a concrete experiment which might happen sometime and would get some performance boost from the implementation.

> ---
> 8: slot size (255 => extra header word with large size)
> 3: odd bytes/fixed fields (odd bytes for non-pointer, fixed fields for pointer, 7 => # fixed fields is in the class)
> 4 bits: format (pointers, indexable, bytes/shorts/longs/doubles indexable, compiled method, ephemerons, weak, etc)
> 1: immutability
> 3: GC 2 mark bits. 1 forwarded bit
> 20: identity hash
> 20: class index
> ---
>
> What takes most of the space in the object header? Right - the hash!
>
> Now, since we will have lazy become, I am back to my idea of having extra & arbitrary properties per object. In a nutshell, the idea is to not store the hash in the object header, but instead to use just a single 'hash present' bit. When the identity hash of an object is requested (via the corresponding primitive), the implementation checks whether 'hash present' is set. If it is not, we do a 'lazy become' of the existing object to the same object copied into another place, but with the hash bit set and with an extra 64-bit field where the hash value can be stored.
>
> So, when you request an identity hash for an object which doesn't have one, the object of the form:
>
> [header][...data..]
>
> is copied to a new memory region with the new layout:
>
> [header][hash bits][...data..]
>
> and the old object is, of course, 'corpsed' into a forwarding pointer to the new location.

The weak point of this idea is that you might run out of memory during the allocation of the new object if you ask for the identity hash of a larger object or of many smaller objects at once.

> The next step is going from holding just a hash to having an arbitrary & dynamic number of extra fields per object. In the same way, we use 1 extra bit, indicating that the object has extra properties. And when the object doesn't have them, we lazy-become it from being
>
> [header][...data..] or [header][hash bits][...data..]
>
> to:
>
> [header][hash bits][oop][...data..]
>
> where 'oop' can be anything: an instance of Array/Dictionary (depending on how the language side decides to store the extra properties of an object).
>
> This, for instance, would allow us to store extra properties for special object formats like variable bytes or compiled methods, which don't have instance variables. Needless to say how useful it is to be able to attach extra properties per object without changing the object's class.
>
> And, of course, the freed 18 bits (20 - 2) in the header can be allocated for other purposes. (Stef, how many bits do you need for experiments? ;)
>
> About the immediates zoo.
>
> Keep in mind that the more immediates we have, the more complex the implementation tends to be. I would keep just 2 data types:
>
> - integers
> - floats
>
> and a third, special 'arbitrary' immediate, which is seen by the VM as a 60-bit value. The interpretation of this value depends on a lookup in a range table, where the developer specifies the correspondence between a value interval and a class:
>
> [min .. max] -> class
>
> Intervals, of course, cannot overlap. Determining the class of such an immediate might be slower, O(log2(n)) at best (where n is the size of the range table), but on the other hand, how many different kinds of immediates can you fit into a 60-bit value? Right, it is 2^60. Much more than the proposed 8, isn't it? :)
>
> And this extra cost can be mitigated completely by the inline cache:
>
> - in the case of a regular reference, you must fetch the object's class and then compare it with the one stored in the cache.
> - in the case of an immediate reference, you compare the immediate value with the min and max stored in the cache fields. If the value is in range, you have a cache hit and are free to proceed.
>
> So it's just 1 extra comparison compared to the 'classic' inline cache.
>
> And, after thinking about how the inline cache is organized, you can now scratch my first paragraph related to immediates! We really don't need to discriminate between small integers/floats/the rest!! They could also be nothing more than one of the range(s) defined in our zoo of 'special' immediates!
>
> So, in the end we will have just two kinds of references:
>
> - zero bit == 0 -- an object pointer
> - zero bit == 1 -- an immediate
>
> Voila! We can have a real zoo of immediates, and a simple implementation to support them. Not to mention that the range table is provided by the language side, so we're free to rearrange the ranges at any moment. And of course, this doesn't mean that the VM could not reserve some of the ranges for its own 'contracted' immediates, like Characters, and even class references, for example.
>
> Think about it :)

I like the idea, but I'm not sure how useful it will be in practice. I'd also add characters as third data type. String/Chara
Re: [Pharo-project] [Vm-dev] Re: Plan/discussion/communication around new object format
Here are a couple (2) of my highly valuable cents :)

2^20 for classes? That might be fine (or even overkill) for Smalltalk systems, but it could be quite limiting for someone who would like to experiment with implementing prototype-based frameworks, where every object is a "class" by itself.

---
8: slot size (255 => extra header word with large size)
3: odd bytes/fixed fields (odd bytes for non-pointer, fixed fields for pointer, 7 => # fixed fields is in the class)
4 bits: format (pointers, indexable, bytes/shorts/longs/doubles indexable, compiled method, ephemerons, weak, etc)
1: immutability
3: GC 2 mark bits. 1 forwarded bit
20: identity hash
20: class index
---

What takes most of the space in the object header? Right - the hash!

Now, since we will have lazy become, I am back to my idea of having extra & arbitrary properties per object. In a nutshell, the idea is to not store the hash in the object header, but instead to use just a single 'hash present' bit. When the identity hash of an object is requested (via the corresponding primitive), the implementation checks whether 'hash present' is set. If it is not, we do a 'lazy become' of the existing object to the same object copied into another place, but with the hash bit set and with an extra 64-bit field where the hash value can be stored.

So, when you request an identity hash for an object which doesn't have one, the object of the form:

[header][...data..]

is copied to a new memory region with the new layout:

[header][hash bits][...data..]

and the old object is, of course, 'corpsed' into a forwarding pointer to the new location.

The next step is going from holding just a hash to having an arbitrary & dynamic number of extra fields per object. In the same way, we use 1 extra bit, indicating that the object has extra properties. And when the object doesn't have them, we lazy-become it from being

[header][...data..] or [header][hash bits][...data..]

to:

[header][hash bits][oop][...data..]
where 'oop' can be anything: an instance of Array/Dictionary (depending on how the language side decides to store the extra properties of an object).

This, for instance, would allow us to store extra properties for special object formats like variable bytes or compiled methods, which don't have instance variables. Needless to say how useful it is to be able to attach extra properties per object without changing the object's class.

And, of course, the freed 18 bits (20 - 2) in the header can be allocated for other purposes. (Stef, how many bits do you need for experiments? ;)

About the immediates zoo.

Keep in mind that the more immediates we have, the more complex the implementation tends to be. I would keep just 2 data types:

- integers
- floats

and a third, special 'arbitrary' immediate, which is seen by the VM as a 60-bit value. The interpretation of this value depends on a lookup in a range table, where the developer specifies the correspondence between a value interval and a class:

[min .. max] -> class

Intervals, of course, cannot overlap. Determining the class of such an immediate might be slower, O(log2(n)) at best (where n is the size of the range table), but on the other hand, how many different kinds of immediates can you fit into a 60-bit value? Right, it is 2^60. Much more than the proposed 8, isn't it? :)

And this extra cost can be mitigated completely by the inline cache:

- in the case of a regular reference, you must fetch the object's class and then compare it with the one stored in the cache.
- in the case of an immediate reference, you compare the immediate value with the min and max stored in the cache fields. If the value is in range, you have a cache hit and are free to proceed.

So it's just 1 extra comparison compared to the 'classic' inline cache.

And, after thinking about how the inline cache is organized, you can now scratch my first paragraph related to immediates! We really don't need to discriminate between small integers/floats/the rest!! They could also be nothing more than one of the range(s) defined in our zoo of 'special' immediates!
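
The range-table lookup and the range-based inline-cache check described above can be sketched in Python. This is only a toy model under stated assumptions: `RangeTable`, `InlineCacheEntry`, and the example class names are hypothetical, and a sorted list with binary search is just one way to get the O(log2(n)) lookup the message mentions.

```python
from bisect import bisect_right


class RangeTable:
    """Maps non-overlapping inclusive intervals [lo .. hi] to a class name."""

    def __init__(self, entries):
        # entries: iterable of (lo, hi, class_name) tuples
        self._entries = sorted(entries)
        pairs = zip(self._entries, self._entries[1:])
        for (_, prev_hi, _), (next_lo, _, _) in pairs:
            if next_lo <= prev_hi:
                raise ValueError("ranges overlap")
        self._lows = [lo for lo, _, _ in self._entries]

    def lookup(self, value):
        """Binary search: return the (lo, hi, class_name) entry or None."""
        i = bisect_right(self._lows, value) - 1
        if i >= 0 and value <= self._entries[i][1]:
            return self._entries[i]
        return None


class InlineCacheEntry:
    """Send-site cache for an immediate receiver: stores min and max."""

    def __init__(self, lo, hi, target):
        self.lo, self.hi, self.target = lo, hi, target

    def hit(self, value):
        # Two comparisons, one more than a class-identity check.
        return self.lo <= value <= self.hi
```

A send site would call `lookup` only on a cache miss, then remember the entry's bounds so that later receivers in the same range are accepted by `hit` without touching the table.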
So, in the end we will have just two kinds of references:

- zero bit == 0 -- an object pointer
- zero bit == 1 -- an immediate

Voila! We can have a real zoo of immediates, and a simple implementation to support them. Not to mention that the range table is provided by the language side, so we're free to rearrange the ranges at any moment. And of course, this doesn't mean that the VM could not reserve some of the ranges for its own 'contracted' immediates, like Characters, and even class references, for example.

Think about it :)
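
The 'hash present' bit proposed earlier in this message can be modelled with a toy heap in Python. Everything here is an assumption made for illustration: the `ToyHeap` class, the dictionary-based object records, and the counter used as a hash generator are not part of any real VM. The sketch only shows the control flow: an object without a hash is copied to a new location with the hash attached, and the old copy is left behind as a forwarding pointer.

```python
import itertools


class ToyHeap:
    """Toy model: addresses map to object records held in a dict."""

    def __init__(self):
        self.objects = {}
        self._addresses = itertools.count(0x1000, 0x10)
        self._hashes = itertools.count(1)   # stand-in hash generator

    def allocate(self, data, hash_value=None):
        address = next(self._addresses)
        self.objects[address] = {
            "hash_present": hash_value is not None,
            "hash": hash_value,
            "data": list(data),
            "forward": None,
        }
        return address

    def follow(self, address):
        """Chase forwarding pointers to the live copy."""
        while self.objects[address]["forward"] is not None:
            address = self.objects[address]["forward"]
        return address

    def identity_hash(self, address):
        address = self.follow(address)
        obj = self.objects[address]
        if not obj["hash_present"]:
            # Lazy become: copy with a hash field, then corpse the old copy.
            new_address = self.allocate(obj["data"], next(self._hashes))
            obj["forward"] = new_address
            obj["data"] = None
            obj = self.objects[new_address]
        return obj["hash"]
```

Note that the first `identity_hash` call allocates a second copy of the object, which is the memory cost raised elsewhere in the thread.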
Re: [Pharo-project] [Vm-dev] Re: Plan/discussion/communication around new object format
On Wed, May 30, 2012 at 1:53 PM, Mariano Martinez Peck <marianop...@gmail.com> wrote:

> On Wed, May 30, 2012 at 10:22 PM, Eliot Miranda wrote:
>
>> On Wed, May 30, 2012 at 12:59 PM, Stéphane Ducasse <stephane.duca...@inria.fr> wrote:
>>
>>> I would like to be sure that we can have
>>>    - bit for immutable objects
>>>    - bits for experimenting.
>>
>> There will be quite a few. And one will be able to steal bits from the class field if one needs fewer classes. I'm not absolutely sure of the layout yet. But for example
>>
>> 8: slot size (255 => extra header word with large size)
>> 3: odd bytes/fixed fields (odd bytes for non-pointer, fixed fields for pointer, 7 => # fixed fields is in the class)
>> 4 bits: format (pointers, indexable, bytes/shorts/longs/doubles indexable, compiled method, ephemerons, weak, etc)
>> 1: immutability
>> 3: GC 2 mark bits. 1 forwarded bit
>> 20: identity hash
>
> and we can make it lazy, that is, we compute it not at instantiation time but rather the first time the primitive is called.
>
>> 20: class index
>
> This would probably work for a while. I think that it would be good to leave an "open door" so that in the future we can just add one more word for a class pointer.

Turns out that's not such a simple change. Class indices have two advantages. One is that they're more compact (2^20 classes is still a lot of classes). The other is that they're constant, which has two main benefits. First, in method caches and in-line caches the class field holds an index and hence doesn't need to be updated by the GC. The GC no longer has to visit send sites. Second, because they're constants, both checking for well-known classes and instantiating well-known classes can be done without going to the specialObjectsArray. One just uses the constant. Now undoing these optimizations to open a back door is not trivial. So best accept the benefits and exploit them to the maximum.

>> still leaves 5 bits unused. And stealing 4 bits each from class index still leaves 64k classes. So this format is simple and provides lots of unused bits. The format field is a great idea as it combines a number of orthogonal properties in very few bits. I don't want to include odd bytes in format because I think a separate field that holds odd bytes and fixed fields is better use of space. But we can gather statistics before deciding.
Re: [Pharo-project] [Vm-dev] Re: Plan/discussion/communication around new object format
On Wed, May 30, 2012 at 1:57 PM, Stefan Marr wrote:

> Hi Eliot:
>
> From my experience with the RoarVM, it seems to be a rather simple exercise to enable the VM to support a custom 'pre-header' for objects. That is, a constant offset in the memory that comes from the allocator, and is normally ignored by the GC.

That's a great idea. So for experimental use one simply throws a whole word at the problem and forgets about the issue. Thanks, Stefan. That leaves me free to focus on something fast and compact that is still flexible. Great!

> That allows me to do all kinds of stuff. Of course at the cost of a word per object, and at the cost of recompiling the VM. But that should be a reasonable price to pay for someone doing research on these kinds of things.
>
> Sometimes a few bits are just not enough, and such a pre-header gives much, much more flexibility. For the people interested in that, I could dig out the details (I think I did that already once on this list).
>
> Best regards
> Stefan
>
> On 30 May 2012, at 22:22, Eliot Miranda wrote:
>
> > On Wed, May 30, 2012 at 12:59 PM, Stéphane Ducasse <stephane.duca...@inria.fr> wrote:
> > I would like to be sure that we can have
> >    - bit for immutable objects
> >    - bits for experimenting.
> >
> > There will be quite a few. And one will be able to steal bits from the class field if one needs fewer classes. I'm not absolutely sure of the layout yet. But for example
> >
> > 8: slot size (255 => extra header word with large size)
> > 3: odd bytes/fixed fields (odd bytes for non-pointer, fixed fields for pointer, 7 => # fixed fields is in the class)
> > 4 bits: format (pointers, indexable, bytes/shorts/longs/doubles indexable, compiled method, ephemerons, weak, etc)
> > 1: immutability
> > 3: GC 2 mark bits. 1 forwarded bit
> > 20: identity hash
> > 20: class index
> >
> > still leaves 5 bits unused. And stealing 4 bits each from class index still leaves 64k classes. So this format is simple and provides lots of unused bits. The format field is a great idea as it combines a number of orthogonal properties in very few bits. I don't want to include odd bytes in format because I think a separate field that holds odd bytes and fixed fields is better use of space. But we can gather statistics before deciding.
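
The arithmetic behind the header layout quoted in this thread can be checked with a short Python sketch. The field widths come from the thread; the field names, the bit order (lowest bits first), and the `pack`/`unpack` helpers are assumptions made purely for illustration, since the thread gives widths but no bit positions.

```python
HEADER_BITS = 64

# (name, width in bits), in the order the fields are listed in the thread.
FIELDS = [
    ("slot_size", 8),
    ("odd_bytes_or_fixed_fields", 3),
    ("format", 4),
    ("immutability", 1),
    ("gc", 3),
    ("identity_hash", 20),
    ("class_index", 20),
]


def layout(fields):
    """Map each field name to (shift, width); also return the bits used."""
    shifts, shift = {}, 0
    for name, width in fields:
        shifts[name] = (shift, width)
        shift += width
    return shifts, shift


LAYOUT, USED_BITS = layout(FIELDS)


def pack(values):
    """Build a header word from a {field name: value} dict."""
    header = 0
    for name, (shift, width) in LAYOUT.items():
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        header |= value << shift
    return header


def unpack(header, name):
    """Extract one field from a header word."""
    shift, width = LAYOUT[name]
    return (header >> shift) & ((1 << width) - 1)
```

The listed fields add up to 59 bits, which matches the 5 unused bits mentioned in the thread, and a class index reduced from 20 to 16 bits still addresses 2^16 = 65536 classes, the "64k classes" figure.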
Re: [Pharo-project] [Vm-dev] Re: Plan/discussion/communication around new object format
Hi Eliot:

From my experience with the RoarVM, it seems to be a rather simple exercise to enable the VM to support a custom 'pre-header' for objects. That is, a constant offset in the memory that comes from the allocator, and is normally ignored by the GC.

That allows me to do all kinds of stuff. Of course at the cost of a word per object, and at the cost of recompiling the VM. But that should be a reasonable price to pay for someone doing research on these kinds of things.

Sometimes a few bits are just not enough, and such a pre-header gives much, much more flexibility. For the people interested in that, I could dig out the details (I think I did that already once on this list).

Best regards
Stefan

On 30 May 2012, at 22:22, Eliot Miranda wrote:

> On Wed, May 30, 2012 at 12:59 PM, Stéphane Ducasse wrote:
> I would like to be sure that we can have
>    - bit for immutable objects
>    - bits for experimenting.
>
> There will be quite a few. And one will be able to steal bits from the class field if one needs fewer classes. I'm not absolutely sure of the layout yet. But for example
>
> 8: slot size (255 => extra header word with large size)
> 3: odd bytes/fixed fields (odd bytes for non-pointer, fixed fields for pointer, 7 => # fixed fields is in the class)
> 4 bits: format (pointers, indexable, bytes/shorts/longs/doubles indexable, compiled method, ephemerons, weak, etc)
> 1: immutability
> 3: GC 2 mark bits. 1 forwarded bit
> 20: identity hash
> 20: class index
>
> still leaves 5 bits unused. And stealing 4 bits each from class index still leaves 64k classes. So this format is simple and provides lots of unused bits. The format field is a great idea as it combines a number of orthogonal properties in very few bits. I don't want to include odd bytes in format because I think a separate field that holds odd bytes and fixed fields is better use of space. But we can gather statistics before deciding.
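
The pre-header idea from Stefan's message can be pictured with a toy word-addressed heap in Python. This is not RoarVM code: `WordHeap`, its methods, and the bump allocator are hypothetical, and the single-word pre-header is an assumption. The sketch only shows the key property: the allocator reserves space at a constant negative offset from the header, and ordinary object access never looks at it.

```python
PRE_HEADER_WORDS = 1   # build-time constant; changing it means rebuilding


class WordHeap:
    """Toy bump allocator over a list of words."""

    def __init__(self, size_in_words):
        self.words = [0] * size_in_words
        self.free = 0

    def allocate(self, header, slot_count):
        total = PRE_HEADER_WORDS + 1 + slot_count
        if self.free + total > len(self.words):
            raise MemoryError("toy heap exhausted")
        # The object pointer addresses the header, not the pre-header.
        oop = self.free + PRE_HEADER_WORDS
        self.words[oop] = header
        self.free += total
        return oop

    def pre_header(self, oop):
        return self.words[oop - PRE_HEADER_WORDS]

    def set_pre_header(self, oop, value):
        self.words[oop - PRE_HEADER_WORDS] = value

    def slot(self, oop, index):
        return self.words[oop + 1 + index]

    def set_slot(self, oop, index, value):
        self.words[oop + 1 + index] = value
```

Each object costs one extra word in this model, which is the price the message describes for experiments that need more than a few header bits.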
Re: [Pharo-project] [Vm-dev] Re: Plan/discussion/communication around new object format
On Wed, May 30, 2012 at 10:22 PM, Eliot Miranda wrote:

> On Wed, May 30, 2012 at 12:59 PM, Stéphane Ducasse <stephane.duca...@inria.fr> wrote:
>
>> I would like to be sure that we can have
>>    - bit for immutable objects
>>    - bits for experimenting.
>
> There will be quite a few. And one will be able to steal bits from the class field if one needs fewer classes. I'm not absolutely sure of the layout yet. But for example
>
> 8: slot size (255 => extra header word with large size)
> 3: odd bytes/fixed fields (odd bytes for non-pointer, fixed fields for pointer, 7 => # fixed fields is in the class)
> 4 bits: format (pointers, indexable, bytes/shorts/longs/doubles indexable, compiled method, ephemerons, weak, etc)
> 1: immutability
> 3: GC 2 mark bits. 1 forwarded bit
> 20: identity hash

and we can make it lazy, that is, we compute it not at instantiation time but rather the first time the primitive is called.

> 20: class index

This would probably work for a while. I think that it would be good to leave an "open door" so that in the future we can just add one more word for a class pointer.

> still leaves 5 bits unused. And stealing 4 bits each from class index still leaves 64k classes. So this format is simple and provides lots of unused bits. The format field is a great idea as it combines a number of orthogonal properties in very few bits. I don't want to include odd bytes in format because I think a separate field that holds odd bytes and fixed fields is better use of space. But we can gather statistics before deciding.
>
>> Stef
>>
>> On May 30, 2012, at 8:48 AM, Stéphane Ducasse wrote:
>>
>> > Hi guys
>> >
>> > Here is an important topic I would like to see discussed so that we see how we can improve and join forces.
>> > Maybe a mail discussion then a wiki for the summary would be good.
>> >
>> > stef
>> >
>> > Begin forwarded message:
>> >
>> >> From: Eliot Miranda
>> >> Subject: Re: Plan/discussion/communication around new object format
>> >> Date: May 27, 2012 10:49:54 PM GMT+02:00
>> >> To: Stéphane Ducasse
>> >>
>> >> On Sat, May 26, 2012 at 1:46 AM, Stéphane Ducasse <stephane.duca...@inria.fr> wrote:
>> >> Hi eliot
>> >>
>> >> do you have a description of the new object format you want to introduce?
>> >>
>> >> The design is in the class comment of CogMemoryManager in the Cog VMMaker packages.
>> >>
>> >> Then what is your schedule?
>> >>
>> >> This is difficult. I have made a small start and should be able to spend time on it starting soon. I want to have it finished by early next year. But it depends on work schedules etc.
>> >>
>> >> I would like to see if we can allocate igor/esteban time before we run out of money to help on that important topic.
>> >> Now the solution is unclear and I did not see any document where we can evaluate and plan how we can help. So do you want help on that topic? Then how can people contribute if any?
>> >>
>> >> The first thing to do is to read the design document, to see if the Pharo community thinks it is the right direction, and to review it, spot deficiencies etc. So please get those interested to read the class comment of CogMemoryManager in the latest VMMaker.oscog.
>> >>
>> >> Here's the current version of it:
>> >>
>> >> CogMemoryManager is currently a place-holder for the design of the new Cog VM's object representation and garbage collector. The goals for the GC are
>> >>
>> >> - efficient object representation a la Eliot Miranda's VisualWorks 64-bit object representation that uses a 64-bit header, eliminating direct class references so that all objects refer to their classes indirectly. Instead the header contains a constant class index, in a field smaller than a full pointer. These class indices are used in inline and first-level method caches, hence they do not have to be updated on GC (although they do have to be traced to be able to GC classes). Classes are held in a sparse weak table. The class table needs only to be indexed by an instance's class index in class hierarchy search, in the class primitive, and in tracing live objects in the heap. The additional header space is allocated to a much expanded identity hash field, reducing hash efficiency problems in identity collections due to the extremely small (11 bit) hash field in the old Squeak GC. The identity hash field is also a key element of the class index scheme. A class's identity hash is its index into the class table, so to create an instance of a class one merely copies its identity hash into the class index field of the new instance. This implies that when classes gain their identity hash they are entered into the class table and their identity hash is that of a previously unused index in the table. It also implies that there is a maximum number of classes in the tab