Re: Slightly faster java.util.Arrays.byteSort(byte[])

2019-06-16 Thread Pavel Rappo
Hi Rodion,

A more appropriate place for your email would be the core-libs-dev mailing list,
so CC'ing that list.

> On 14 Jun 2019, at 16:34, Rodion Efremov  wrote:
> 
> Good evening!
> 
> I managed to improve the JDK 8 java.util.Arrays.sort(byte[])
> performance-wise [1]. The (warmed up) demonstration program produces more
> or less optimistic results on arrays of 1e8 bytes:
> 
> seed = 1560526264738
> java.util.Arrays.sort(byte[]) in 87.643701 milliseconds.
> java.util.Arrays.parallelSort(byte[]) in 301.329701 milliseconds.
> net.coderodde.Arrays.sort(byte[]) in 62.0763 milliseconds.
> Algorithms agree: true
> 
> I would like to hear any comments on how to make it eligible for inclusion
> in JDK.
> 
> Best regards,
> Rodion E.
> 
> References:
> [1] https://gist.github.com/coderodde/493407bc1c57352b53c2aa18b5c9a7a8



Re: Slightly faster java.util.Arrays.byteSort(byte[])

2019-06-16 Thread Pavel Rappo
I'm not an expert, however, while your OCA (see below) is being processed I
would recommend to provide more comprehensive stats. Different lengths of an
array, different flavours of data: sorted, sorted in the reverse order, random,
typical expected case(s), etc.

It seems that this particular functionality ( sort(byte[] ) hasn't changed since
the JDK 8. However, you should probably add the current JDK to your comparison.

One necessary step towards making this eligible for inclusion in the JDK would
be to sign the OCA

https://www.oracle.com/technetwork/community/oca-486395.html

Keep in mind that it is not by any means a guarantee that your change will be
included. Once the OCA has been signed and processed, the code then can be
discussed and evaluated by experts.

-Pavel

> On 14 Jun 2019, at 16:34, Rodion Efremov  wrote:
> 
> Good evening!
> 
> I managed to improve the JDK 8 java.util.Arrays.sort(byte[])
> performance-wise [1]. The (warmed up) demonstration program produces more
> or less optimistic results on arrays of 1e8 bytes:
> 
> seed = 1560526264738
> java.util.Arrays.sort(byte[]) in 87.643701 milliseconds.
> java.util.Arrays.parallelSort(byte[]) in 301.329701 milliseconds.
> net.coderodde.Arrays.sort(byte[]) in 62.0763 milliseconds.
> Algorithms agree: true
> 
> I would like to hear any comments on how to make it eligible for inclusion
> in JDK.
> 
> Best regards,
> Rodion E.
> 
> References:
> [1] https://gist.github.com/coderodde/493407bc1c57352b53c2aa18b5c9a7a8




Re: Slightly faster java.util.Arrays.byteSort(byte[])

2019-06-17 Thread Vladimir Yaroslavskiy

>I'm not an expert, however, while your OCA (see below) is being processed I
>would recommend to provide more comprehensive stats. Different lengths of an
>array, different flavours of data: sorted, sorted in the reverse order, random,
>typical expected case(s), etc.
>
>It seems that this particular functionality ( sort(byte[] ) hasn't changed 
>since
>the JDK 8. However, you should probably add the current JDK to your comparison.
>
>One necessary step towards making this eligible for inclusion in the JDK would
>be to sign the OCA
>
> https://www.oracle.com/technetwork/community/oca-486395.html
>
>Keep in mind that it is not by any means a guarantee that your change will be
>included. Once the OCA has been signed and processed, the code then can be
>discussed and evaluated by experts.
>
>-Pavel
>
>> On 14 Jun 2019, at 16:34, Rodion Efremov < codero...@gmail.com > wrote:
>> 
>> Good evening!
>> 
>> I managed to improve the JDK 8 java.util.Arrays.sort(byte[])
>> performance-wise [1]. The (warmed up) demonstration program produces more
>> or less optimistic results on arrays of 1e8 bytes:
>> 
>> seed = 1560526264738
>> java.util.Arrays.sort(byte[]) in 87.643701 milliseconds.
>> java.util.Arrays.parallelSort(byte[]) in 301.329701 milliseconds.
>> net.coderodde.Arrays.sort(byte[]) in 62.0763 milliseconds.
>> Algorithms agree: true
>> 
>> I would like to hear any comments on how to make it eligible for inclusion
>> in JDK.
>> 
>> Best regards,
>> Rodion E.
>> 
>> References:
>> [1] https://gist.github.com/coderodde/493407bc1c57352b53c2aa18b5c9a7a8
Hi Rodion,
I'm working on the new version of Arrays.sort() / Arrays.parallelSort() which 
is under review.
I will look on your version and provide my comments.
Thank you,
Vladimir

Re: Slightly faster java.util.Arrays.byteSort(byte[])

2019-06-18 Thread Vladimir Yaroslavskiy
Hi Rodion,

I looked at your byte sorting and compared it with the version which is under
review now. Here is the summary of my results:

1. net.coderodde.Arrays.sort(byte[]) works slower on small data (len < ~70).
The reason is that insertion sort is invoked on small arrays where counting
sort is not fast.

2. On other data net.coderodde.Arrays.sort shows the same time.

In general your code is not faster to be integrated, but many thanks
for pointing to non-perfect implementation of existing byte sorting.

I provide here the new implementation of byte sorting. Please look at it
and let me know if you have other ideas how to improve it.

/**
 * Min size of a byte array to use counting sort.
 */
private static final int MIN_BYTE_COUNTING_SORT_SIZE = 64;

/**
 * Sorts the specified range of the array using
 * counting sort or insertion sort.
 *
 * @param a the array to be sorted
 * @param low the index of the first element, inclusive, to be sorted
 * @param high the index of the last element, exclusive, to be sorted
 */
static void sort(byte[] a, int low, int high) {
if (high - low > MIN_BYTE_COUNTING_SORT_SIZE) {
countingSort(a, low, high);
} else {
insertionSort(a, low, high);
}
}

/**
 * Sorts the specified range of the array using insertion sort.
 *
 * @param a the array to be sorted
 * @param low the index of the first element, inclusive, to be sorted
 * @param high the index of the last element, exclusive, to be sorted
 */
private static void insertionSort(byte[] a, int low, int high) {
for (int i, k = low; ++k < high; ) {
byte ai = a[i = k];

if (ai < a[i - 1]) {
while (--i >= low && ai < a[i]) {
a[i + 1] = a[i];
}
a[i + 1] = ai;
}
}
}

/**
 * The number of distinct byte values.
 */
private static final int NUM_BYTE_VALUES = 1 << 8;

/**
 * Max index of byte counter.
 */
private static final int MAX_BYTE_INDEX = Byte.MAX_VALUE + NUM_BYTE_VALUES + 1;

/**
 * Sorts the specified range of the array using counting sort.
 *
 * @param a the array to be sorted
 * @param low the index of the first element, inclusive, to be sorted
 * @param high the index of the last element, exclusive, to be sorted
 */
private static void countingSort(byte[] a, int low, int high) {
int[] count = new int[NUM_BYTE_VALUES];

/*
 * Compute a histogram with the number of each values.
 */
for (int i = high; i > low; ++count[a[--i] & 0xFF]);

/*
 * Place values on their final positions.
 */
for (int k = high, i = MAX_BYTE_INDEX; --i > Byte.MAX_VALUE; ) {
int value = i & 0xFF;
for (low = k - count[value]; k > low; a[--k] = (byte) value);
}
}

Thank you,
Vladimir

>>I'm not an expert, however, while your OCA (see below) is being processed I
>>would recommend to provide more comprehensive stats. Different lengths of an
>>array, different flavours of data: sorted, sorted in the reverse order, 
>>random,
>>typical expected case(s), etc.
>>
>>It seems that this particular functionality ( sort(byte[] ) hasn't changed 
>>since
>>the JDK 8. However, you should probably add the current JDK to your 
>>comparison.
>>
>>One necessary step towards making this eligible for inclusion in the JDK would
>>be to sign the OCA
>>
>>  https://www.oracle.com/technetwork/community/oca-486395.html
>>
>>Keep in mind that it is not by any means a guarantee that your change will be
>>included. Once the OCA has been signed and processed, the code then can be
>>discussed and evaluated by experts.
>>
>>-Pavel
>>
>>> On 14 Jun 2019, at 16:34, Rodion Efremov <  codero...@gmail.com > wrote:
>>> 
>>> Good evening!
>>> 
>>> I managed to improve the JDK 8 java.util.Arrays.sort(byte[])
>>> performance-wise [1]. The (warmed up) demonstration program produces more
>>> or less optimistic results on arrays of 1e8 bytes:
>>> 
>>> seed = 1560526264738
>>> java.util.Arrays.sort(byte[]) in  87.643701 milliseconds.
>>> java.util.Arrays.parallelSort(byte[]) in 301.329701 milliseconds.
>>> net.coderodde.Arrays.sort(byte[]) in 62.0763 milliseconds.
>>> Algorithms agree: true
>>> 
>>> I would like to hear any comments on how to make it eligible for inclusion
>>> in JDK.
>>> 
>>> Best regards,
>>> Rodion E.
>>> 
>>> References:
>>> [1]  https://gist.github.com/coderodde/493407bc1c57352b53c2aa18b5c9a7a8
>Hi Rodion,
>I'm working on the new version of Arrays.sort() / Arrays.parallelSort() which 
>is under review.
>I will look on your version and provide my comments.
>Thank you,
>Vladimir