Re: TaggedArrays (Proposal)

Jim Laskey Mon, 02 Jul 2012 07:07:18 -0700

On 2012-07-02, at 10:57 AM, Rémi Forax wrote:

> On 07/02/2012 03:05 PM, Jim Laskey wrote:
>> During a week in the rarefied air of Stockholm back in May, a 
>> sleepless night got me thinking.  The day after that, the thinking 
>> became a reality.  I've been sitting on the code since, not sure what 
>> to do next.  So..., why not start the month leading up to the JVMLS 
>> with a discussion about dynamic values.
>> 
>> Every jLanguage developer knows that primitive boxing is the enemy. 
>> Even more so for untyped languages.  We need a way to interleave 
>> primitive types with references.
>> 
>> Tagged values (value types) for dynamic languages have been approached 
>> from a dozen different angles over the history of Java.  However, no 
>> one seems to be satisfied with any of the proposals so far.  Either 
>> the implementation is too limiting for the language developer or too 
>> complex to implement.
>> 
>> Most recently, John (Rose) proposed hiding value tagging in the JVM 
>> via the Integer/Long/Float/Double.valueof methods.  I saw a few issues 
>> with this proposal.  First, the implementation works differently on 32 
>> bit and 64 bit platforms (only half a solution on each).  Secondly, 
>> control of the tag bits is hidden such that it doesn't give a language 
>> implementor any leeway on bit usage.  Finally, it will take a long 
>> time for it to get introduced into the JVM.  The implementation is 
>> complex, scattered all over the VM and will lead to a significant 
>> multiplier for testing coverage.
> 
> but it will also help Java perf.


But of course. :-)


> 
>> 
>> It occurred to me on that sleepless Monday night, that the solution 
>> for most dynamic languages could be so much simpler.  First, we have 
>> to look at what it is we really need.  Ultimately it's about boxing. 
>> We want to avoid allocating memory whenever we need to store a 
>> primitive value in an object.  Concerning ourselves with passing 
>> around tagged values in registers and storing in stack frames is all 
>> red herring.  All that is needed is a mechanism for storing tagged 
>> values (or compressed values) in a no-type slot of a generic object. 
>> Thinking about it in these terms isolates all issues to a single 
>> array-like class, and thus simplifies implementation and simplifies 
>> testing.  Instances of this class can be used as objects, as stack 
>> frames and even full stacks.  A good percentage of a dynamic language 
>> needs are covered.
> 
> using it as a stack frames will require a pretty good escape analysis if 
> you want same perf as the native stack
> or is there a trick somewhere ?
> But given that there is a trick to avoid boxing for local variables (see 
> my talk at next JVM Summit),
> having an array like this just for storing fields is enough to pull its 
> weight.

I was thinking in terms of languages that use a side stack in conjunction with 
the system stack.

> 
> 
>> 
>> So, Rickard Bäckman (also of Oracle) and I defined an API and 
>> implemented (in HotSpot) an interface called TaggedArray. 
>> Conceptional, TaggedArray is a fixed array of no-type slots (64-bit), 
>> where each slot can contain either a reference or a tagged long value 
>> (least significant bit set.)  Internally, TaggedArray class's doOop 
>> method knows that it should skip any 64-bit value with the least 
>> significant bit set.  How the language developer uses the other 63 
>> bits is up to them.  References are just addresses.  On 32 bit 
>> machines, the address (or packed address) is stored in the high 
>> 32-bits (user has no access)  So there is no interference with the tag 
>> bit.
>> 
>> We supply four implementations of the API.  1) is a naive two parallel 
>> arrays (one Object[], one long[]) implementation for platforms not 
>> supporting TaggedArrays (and JDK 1.7), 2) an optimized version of 1) 
>> that allocates each array on demand, 3) a JNI implementation 
>> (minimally needed for the interpreter) that uses the native 
>> implementation and 4) the native implementation that is recognized by 
>> both the C1/C2 compilers (effort only partially completed.)  In 
>> general, the implementation choice is transparent to the user (optimal 
>> choice.)
> 
> Being able to subclass it in order to add fixed field like a metaclass 
> field, i.e a field that
> is always a reference, would be cool too.

Not so easy - the end of the tagged array object is the array area.  Adding 
fields before that area in subclasses complicates determining the base of the 
array.  This would also affect performance (fixed vs variable.)  Reserving a 
slot and doing a getReference(i) on that slot would only cost you the bit test.

> 
> About the API, the two method set should be setValue()/setReference().

Sure.

> I think that getValue()/setValue() should return the long with the bit 
> set because
> If i want to execute x + 1, I can convert it to x + 2 at compile time 
> thus avoid the shifts at runtime.

It does return the bit.  That is the point.  The language developer has 
complete control over bits and can do what ever tricks they choose.

-- Jim

> 
>> 
>> I've enclosed a JavaDoc and the roughed out source.  For discussion. 
>> Fire away.
>> 
>> Cheers,
>> 
>> -- Jim
> 
> cheers,
> Rémi
> 
> _______________________________________________
> mlvm-dev mailing list
> mlvm-dev@openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev

_______________________________________________
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev

Re: TaggedArrays (Proposal)

Reply via email to