Sorry for the high frequency of messages.
This one is even better: powerCacheIndex need not hold Nodes, but
BigIntegers instead, so there is no indirection on the fast path:
    private static class Node {
        final BigInteger value;
        Node next;

        Node(BigInteger value) { this.value = value; }
    }

    private static final Node[] powerCache;
    private static final BigInteger[][] powerCacheIndex;
    private static final int POWER_CACHE_LINE_CHUNK = 16;

    static {
        powerCache = new Node[Character.MAX_RADIX + 1];
        for (int i = Character.MIN_RADIX; i <= Character.MAX_RADIX; i++) {
            powerCache[i] = new Node(BigInteger.valueOf(i));
        }
        powerCacheIndex = new BigInteger[Character.MAX_RADIX + 1][];
    }
    private static BigInteger getRadixConversionCache(int radix, int exponent) {
        BigInteger[] cacheLine = powerCacheIndex[radix];
        if (cacheLine != null && exponent < cacheLine.length) { // cache line is long enough
            BigInteger value = cacheLine[exponent];
            if (value != null) {
                return value;
            }
            return fillCacheLine(cacheLine, powerCache[radix], exponent);
        } else { // we need to extend / create cache line
            cacheLine = new BigInteger[(exponent / POWER_CACHE_LINE_CHUNK + 1) * POWER_CACHE_LINE_CHUNK];
            BigInteger result = fillCacheLine(cacheLine, powerCache[radix], exponent);
            powerCacheIndex[radix] = cacheLine; // install new line
            return result;
        }
    }
    private static BigInteger fillCacheLine(BigInteger[] cacheLine, Node node, int exponent) {
        cacheLine[0] = node.value;
        for (int i = 1; i <= exponent; i++) {
            synchronized (node) {
                if (node.next == null) {
                    node.next = new Node(node.value.pow(2));
                }
            }
            node = node.next;
            cacheLine[i] = node.value;
        }
        return node.value;
    }
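For reference, the same pattern can be sketched as a self-contained class outside of BigInteger (class and field names here are mine, and value.pow(2) stands in for the package-private square()); entry i of a cache line holds radix^(2^i), built by repeated squaring along the shared Node chain:

```java
import java.math.BigInteger;

// Hypothetical standalone sketch of the chain-plus-index pattern above.
public class RadixPowerCacheSketch {
    private static class Node {
        final BigInteger value;
        Node next;
        Node(BigInteger value) { this.value = value; }
    }

    private static final Node[] chainHeads = new Node[Character.MAX_RADIX + 1];
    private static final BigInteger[][] index = new BigInteger[Character.MAX_RADIX + 1][];
    private static final int CHUNK = 16;

    static {
        for (int i = Character.MIN_RADIX; i <= Character.MAX_RADIX; i++) {
            chainHeads[i] = new Node(BigInteger.valueOf(i));
        }
    }

    // Returns radix^(2^exponent), caching all intermediate squares.
    static BigInteger get(int radix, int exponent) {
        BigInteger[] line = index[radix]; // plain read; the index is only a hint
        if (line != null && exponent < line.length) {
            BigInteger v = line[exponent]; // fast path: no Node indirection
            if (v != null) return v;
            return fill(line, chainHeads[radix], exponent);
        }
        line = new BigInteger[(exponent / CHUNK + 1) * CHUNK];
        BigInteger result = fill(line, chainHeads[radix], exponent);
        index[radix] = line; // racy install is benign: the chain holds the truth
        return result;
    }

    private static BigInteger fill(BigInteger[] line, Node node, int exponent) {
        line[0] = node.value;
        for (int i = 1; i <= exponent; i++) {
            synchronized (node) { // previous node guards creation of the next
                if (node.next == null) {
                    node.next = new Node(node.value.pow(2));
                }
            }
            node = node.next;
            line[i] = node.value;
        }
        return node.value;
    }
}
```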
Regards, Peter
On 06/27/2013 10:51 AM, Peter Levart wrote:
Hi,
Expanding on the idea of building a single growing linked list of values
and treating the array just as an index for quick access, which can be
re-created at any time without synchronization or volatile
publication, here's the attempt:
    private static class Node {
        final BigInteger value;
        Node next;

        Node(BigInteger value) { this.value = value; }
    }

    private static final Node[] powerCache;
    private static final Node[][] powerCacheIndex;
    private static final int POWER_CACHE_LINE_CHUNK = 16;

    static {
        powerCache = new Node[Character.MAX_RADIX + 1];
        for (int i = Character.MIN_RADIX; i <= Character.MAX_RADIX; i++) {
            powerCache[i] = new Node(BigInteger.valueOf(i));
        }
        powerCacheIndex = new Node[Character.MAX_RADIX + 1][];
    }
    private static BigInteger getRadixConversionCache(int radix, int exponent) {
        Node[] cacheLine = powerCacheIndex[radix];
        if (cacheLine != null && exponent < cacheLine.length) { // cache line is long enough
            Node node = cacheLine[exponent];
            if (node != null) {
                return node.value;
            }
            return fillCacheLine(cacheLine, powerCache[radix], exponent);
        } else { // we need to extend / create cache line
            cacheLine = new Node[(exponent / POWER_CACHE_LINE_CHUNK + 1) * POWER_CACHE_LINE_CHUNK];
            BigInteger result = fillCacheLine(cacheLine, powerCache[radix], exponent);
            powerCacheIndex[radix] = cacheLine; // install new line
            return result;
        }
    }
    private static BigInteger fillCacheLine(Node[] cacheLine, Node node0, int exponent) {
        Node node = cacheLine[0] = node0;
        for (int i = 1; i <= exponent; i++) {
            Node nextNode;
            synchronized (node) {
                nextNode = node.next;
                if (nextNode == null) {
                    nextNode = new Node(node.value.square());
                    node.next = nextNode;
                }
            }
            cacheLine[i] = node = nextNode;
        }
        return node.value;
    }
What do you think?
Regards, Peter
On 06/27/2013 09:27 AM, Peter Levart wrote:
On 06/27/2013 08:37 AM, Peter Levart wrote:
Hi,
This version seems correct. Maybe just replace the double volatile read
at the length re-check with a single one:
    private static BigInteger getRadixConversionCache(int radix, int exponent) {
        BigInteger[] cacheLine = powerCache[radix]; // volatile read
        if (exponent < cacheLine.length)
            return cacheLine[exponent];
        int oldLength = cacheLine.length;
        cacheLine = Arrays.copyOf(cacheLine, exponent + 1);
        for (int i = oldLength; i <= exponent; i++)
            cacheLine[i] = cacheLine[i - 1].pow(2);
        BigInteger[][] pc = powerCache; // volatile read again
        if (exponent >= pc[radix].length) {
            pc = pc.clone();
            pc[radix] = cacheLine;
            powerCache = pc; // volatile write, publish
        }
        return cacheLine[exponent];
    }
I like this, since it tries to avoid overwriting larger cache lines
with smaller ones when nonexistent exponents are requested
concurrently for the same radix. This is good, since computation for
larger exponents takes quadratically longer (as Alan Eliasen
points out), so the likelihood of concurrent threads stomping on each
other increases. Re-reading and cloning the powerCache reference at the end
of the computation also takes care of cache lines for other radixes that
might have expanded while the computation was taking place. So the
only remaining overhead is when concurrent threads are uselessly
computing the same results at the same time. To avoid this, locking and
waiting would have to be introduced, which would *complicate things*.
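The clone-and-publish step being discussed can be sketched in isolation (names and the toy "expensive computation" are mine, not from BigInteger): a volatile outer array whose lines are treated as immutable; a writer extends a private copy of one line, then re-reads the volatile field once, and only installs its line if no concurrent thread already published a longer one.

```java
import java.util.Arrays;

// Hypothetical standalone sketch of the clone-and-publish pattern:
// the outer array is volatile, its element arrays are never mutated in place.
public class CowPublishSketch {
    private static volatile long[][] cache = new long[8][];

    static { Arrays.fill(cache, new long[] { 1L }); }

    // Entry i of each line holds 2^i (a stand-in for an expensive value).
    static long get(int slot, int i) {
        long[] line = cache[slot];            // volatile read
        if (i < line.length) return line[i];
        int old = line.length;
        line = Arrays.copyOf(line, i + 1);
        for (int k = old; k <= i; k++)
            line[k] = line[k - 1] * 2;        // redundant concurrent work is tolerated
        long[][] pc = cache;                  // single volatile re-read
        if (i >= pc[slot].length) {           // install only if still the longest line
            pc = pc.clone();
            pc[slot] = line;
            cache = pc;                       // volatile write, publish
        }
        return line[i];
    }
}
```

The single re-read through the local `pc` mirrors the suggestion above: the length check and the clone see the same snapshot of the outer array.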
Regards, Peter
On the other hand, it doesn't complicate things too much. So if this
extra CPU time proves to be a problem, here's a variation of the above
code which prevents multiple threads from calculating the same result
at the same time, by serializing computation with precise granularity:
    private static class Node {
        final BigInteger value;
        Node next;

        Node(BigInteger value) { this.value = value; }
    }

    private static volatile Node[][] powerCache;

    static {
        powerCache = new Node[Character.MAX_RADIX + 1][];
        for (int i = Character.MIN_RADIX; i <= Character.MAX_RADIX; i++) {
            powerCache[i] = new Node[] { new Node(BigInteger.valueOf(i)) };
        }
    }
    private static BigInteger getRadixConversionCache(int radix, int exponent) {
        Node[] cacheLine = powerCache[radix]; // volatile read
        if (exponent < cacheLine.length)
            return cacheLine[exponent].value;
        int oldLength = cacheLine.length;
        cacheLine = Arrays.copyOf(cacheLine, exponent + 1);
        Node prevNode = cacheLine[oldLength - 1];
        for (int i = oldLength; i <= exponent; i++) {
            Node node;
            synchronized (prevNode) {
                node = prevNode.next;
                if (node == null) {
                    node = new Node(prevNode.value.pow(2));
                    prevNode.next = node;
                }
            }
            cacheLine[i] = prevNode = node;
        }
        Node[][] pc = powerCache; // volatile read again
        if (exponent >= pc[radix].length) {
            pc = pc.clone();
            pc[radix] = cacheLine;
            powerCache = pc; // volatile write, publish
        }
        return cacheLine[exponent].value;
    }
This variation differs only in a subtlety. It wraps each BigInteger
in a Node, which has a link to the next node in the chain. Computation of
the next node is serialized using the previous node as a lock. So although
Nodes can end up in several arrays (which are used as an index for quick
access), there is only a single growing chain of Nodes per radix and
each Node is computed by a single thread.
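The "each Node is computed by a single thread" property can be demonstrated on its own (hypothetical names; a counter stands in for the quadratic squaring cost): node i+1 is derived from node i under node i's monitor, so however many threads race down the chain, each link is created exactly once.

```java
import java.math.BigInteger;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the per-node locking chain: entry i holds 3^(2^i),
// and the counter records how many "expensive" computations actually ran.
public class ChainMemoSketch {
    static final AtomicInteger computations = new AtomicInteger();

    static final class Node {
        final BigInteger value;
        Node next;
        Node(BigInteger value) { this.value = value; }
    }

    static final Node head = new Node(BigInteger.valueOf(3));

    // Walk (and lazily extend) the chain up to the given exponent.
    static BigInteger get(int exponent) {
        Node node = head;
        for (int i = 1; i <= exponent; i++) {
            Node next;
            synchronized (node) {                   // previous node is the lock
                next = node.next;
                if (next == null) {
                    computations.incrementAndGet();
                    next = new Node(node.value.pow(2)); // the "expensive" step
                    node.next = next;
                }
            }
            node = next;
        }
        return node.value;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] ts = new Thread[4];
        for (int t = 0; t < ts.length; t++) {
            ts[t] = new Thread(() -> get(6));
            ts[t].start();
        }
        for (Thread t : ts) t.join();
        // four racing threads, but each of the 6 links was built exactly once
        System.out.println(computations.get()); // prints 6
    }
}
```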
Regards, Peter
On 06/26/2013 08:13 PM, Brian Burkhalter wrote:
So do we have consensus on this version?
Thanks for the lively "conversation."
Brian
On Jun 26, 2013, at 12:05 AM, Aleksey Shipilev wrote:
Yes, like that.
-Aleksey
On 26.06.2013, at 10:53, Dmitry Nadezhin
<dmitry.nadez...@gmail.com> wrote:
We could check for the existing cacheLine.length right before
installing the new one

Do you mean something like this?
    BigInteger getRadixConversionCache(int radix, int exponent) {
        BigInteger[] cacheLine = powerCache[radix]; // volatile read
        if (exponent < cacheLine.length)
            return cacheLine[exponent];
        int oldLength = cacheLine.length;
        cacheLine = Arrays.copyOf(cacheLine, exponent + 1);
        for (int i = oldLength; i < exponent + 1; i++)
            cacheLine[i] = cacheLine[i - 1].square();
        if (exponent >= powerCache[radix].length) { // volatile read again
            BigInteger[][] pc = powerCache.clone();
            pc[radix] = cacheLine;
            powerCache = pc; // volatile write, publish
        }
        return cacheLine[exponent];
    }