@Kris, I was assuming that the tag would be sufficient for the JVM since 'real' references would be aligned and hence naturally not tagged. But I don't know enough about the JVM and hence you could well be correct. -- Howard.
On 3 July 2012 15:40, Krystal Mok <rednaxel...@gmail.com> wrote: > On Tue, Jul 3, 2012 at 1:28 PM, Howard Lovatt <howard.lov...@gmail.com>wrote: > >> I like the idea and something along these lines would be a great addition >> to the standard library, which I will come back to as a PS. >> >> In com.sun.misc.Unsafe there are already getLong(Object, int) and >> setLong(Object, int, long) methods and the same for Object. No doubt if we >> used getLong and setLong as they stand they would fail due to type checking >> if used to access references to Objects in an Object[]. However if similar >> methods were added that circumvented the type checking, >> getLongOrReference(Object, int) and setLongOrReference(Object, int, long), >> then you could write: >> >> private Object[] vsAndRs; >> >> ... >> >> @Override public long getValue(final int index) { >> checkIndex(pos); >> final long value = Unsafe.getLongOrReference(vsAndRs, index); >> if ( (value & TaggedArrays.TAG) != 0 ) { >> return value; >> } >> throw new TaggedArrayException("value at index =" + index + " is not >> tagged"); >> } >> >> @Override public Object getReference(final int index) { >> if ( isReference(index) ) { >> return vsAndRs[index]; >> } >> throw new TaggedArrayException("reference at index =" + index +" is >> tagged"); >> } >> >> .. >> >> Which would save half the space and still be understandable Java (i.e. >> not a collection of native calls). >> >> It's not just about type checking at Java level, but also something deep > within the VM: the need to distinguish between a plain value and a > reference (pointer), which is fundamental to a exact/accurate garbage > collector. > Jim's idea includes modifying the doOop() function in the VM, which is > exactly where HotSpot handles reference fields within an object. > > If you put random values into a place where HotSpot (or other JVM impls > using exact GC) think it's a reference, then something bad may happen later > (e.g. during GC), that the VM think it had found a corrupted reference. > > - Kris > > >> Great work, >> >> -- Howard. >> >> PS A few random thoughts for the interface, assuming that something like >> this makes it into the JDK then the following additions might be nice: >> >> 1. Fills that accept a lambda that receives the index as its argument >> (filling function as opposed to filling constant). >> 2. A second interface and second factory that define tagged arrays for >> long and double in addition to Objects, like the example code given. >> 3. TaggedArray32, primarily for phones etc., with the extended interface, >> point 2 above, for TaggedArray32 accepting Objects, ints, and floats. On a >> 64 bit system TaggedArray32 would be the same as TaggedArray, but for 32 >> bit systems it would save space. >> >> >> On 2 July 2012 23:05, Jim Laskey <jlas...@me.com> wrote: >> >>> During a week in the rarefied air of Stockholm back in May, a sleepless >>> night got me thinking. The day after that, the thinking became a reality. >>> I've been sitting on the code since, not sure what to do next. So..., why >>> not start the month leading up to the JVMLS with a discussion about dynamic >>> values. >>> >>> Every jLanguage developer knows that primitive boxing is the enemy. >>> Even more so for untyped languages. We need a way to interleave primitive >>> types with references. >>> >>> Tagged values (value types) for dynamic languages have been approached >>> from a dozen different angles over the history of Java. However, no one >>> seems to be satisfied with any of the proposals so far. Either the >>> implementation is too limiting for the language developer or too complex to >>> implement. >>> >>> Most recently, John (Rose) proposed hiding value tagging in the JVM via >>> the Integer/Long/Float/Double.valueof methods. I saw a few issues with >>> this proposal. First, the implementation works differently on 32 bit and >>> 64 bit platforms (only half a solution on each). Secondly, control of the >>> tag bits is hidden such that it doesn't give a language implementor any >>> leeway on bit usage. Finally, it will take a long time for it to get >>> introduced into the JVM. The implementation is complex, scattered all over >>> the VM and will lead to a significant multiplier for testing coverage. >>> >>> It occurred to me on that sleepless Monday night, that the solution for >>> most dynamic languages could be so much simpler. First, we have to look at >>> what it is we really need. Ultimately it's about boxing. We want to avoid >>> allocating memory whenever we need to store a primitive value in an object. >>> Concerning ourselves with passing around tagged values in registers and >>> storing in stack frames is all red herring. All that is needed is a >>> mechanism for storing tagged values (or compressed values) in a no-type >>> slot of a generic object. Thinking about it in these terms isolates all >>> issues to a single array-like class, and thus simplifies implementation and >>> simplifies testing. Instances of this class can be used as objects, as >>> stack frames and even full stacks. A good percentage of a dynamic language >>> needs are covered. >>> >>> So, Rickard Bäckman (also of Oracle) and I defined an API and >>> implemented (in HotSpot) an interface called TaggedArray. Conceptional, >>> TaggedArray is a fixed array of no-type slots (64-bit), where each slot can >>> contain either a reference or a tagged long value (least significant bit >>> set.) Internally, TaggedArray class's doOop method knows that it >>> should skip any 64-bit value with the least significant bit set. How >>> the language developer uses the other 63 bits is up to them. References >>> are just addresses. On 32 bit machines, the address (or packed address) is >>> stored in the high 32-bits (user has no access) So there is no >>> interference with the tag bit. >>> >>> We supply four implementations of the API. 1) is a naive two parallel >>> arrays (one Object[], one long[]) implementation for platforms not >>> supporting TaggedArrays (and JDK 1.7), 2) an optimized version of 1) that >>> allocates each array on demand, 3) a JNI implementation (minimally needed >>> for the interpreter) that uses the native implementation and 4) the native >>> implementation that is recognized by both the C1/C2 compilers (effort only >>> partially completed.) In general, the implementation choice is transparent >>> to the user (optimal choice.) >>> >>> I've enclosed a JavaDoc and the roughed out source. For discussion. >>> Fire away. >>> >>> Cheers, >>> >>> -- Jim >>> >>> >>> >>> >>> >>> _______________________________________________ >>> mlvm-dev mailing list >>> mlvm-dev@openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >>> >>> >> >> >> -- >> -- Howard. >> >> >> _______________________________________________ >> mlvm-dev mailing list >> mlvm-dev@openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev >> >> > > _______________________________________________ > mlvm-dev mailing list > mlvm-dev@openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev > > -- -- Howard.
_______________________________________________ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev